Thursday, 16 March 2000
Python and XML are two of my favorite technical things, so I was surprised to realize I hadn't yet introduced Pyxie to the Monkeyfist audience. SAX and DOM are the dominant XML APIs, and it's not clear whether Pyxie is meant to spread beyond Python and be implemented in other languages. But Python is such a nice language that it's equally unclear whether being Python-only will keep Pyxie from being widely and cleverly used.
Pyxie contains both a parsing API with tree and stream-like modes, and a line-oriented textual representation of XML data, called PYX. If you're familiar with ESIS from SGML, then PYX will be familiar too. PYX emits each XML object -- whether data, processing instruction, attribute or element -- on a line, the first character of which is an XML object type indicator:
"(" signals a start-tag
")" signals an end-tag
"A" signals an attribute
"-" signals data
"?" signals a processing instruction
And that's just about it; from that notation Pyxie builds tree and stream-like APIs. As you can see, one of Pyxie's charms is its simplicity.
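To make the notation concrete, here is a minimal sketch of my own. It is not Pyxie's code or its API, just a handler for Python's standard xml.sax module that produces PYX-style lines. The names PyxWriter and xml_to_pyx are hypothetical, and both the attribute layout (name, a space, then the value) and the backslash escaping of newlines and tabs are my assumptions about the format rather than anything spelled out above.

```python
import xml.sax
from xml.sax.handler import ContentHandler


def escape(text):
    # Assumption: keep every event on one line by escaping
    # backslashes, newlines and tabs.
    return text.replace("\\", "\\\\").replace("\n", "\\n").replace("\t", "\\t")


class PyxWriter(ContentHandler):
    """Hypothetical SAX handler that collects PYX-style lines."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._text = []  # character data may arrive in several chunks

    def _flush_text(self):
        # Emit buffered character data as a single "-" line.
        if self._text:
            self.lines.append("-" + escape("".join(self._text)))
            self._text = []

    def startElement(self, name, attrs):
        self._flush_text()
        self.lines.append("(" + name)
        for attr_name in attrs.getNames():
            value = escape(attrs.getValue(attr_name))
            self.lines.append("A%s %s" % (attr_name, value))

    def endElement(self, name):
        self._flush_text()
        self.lines.append(")" + name)

    def characters(self, content):
        self._text.append(content)

    def processingInstruction(self, target, data):
        self._flush_text()
        self.lines.append("?%s %s" % (target, data))


def xml_to_pyx(xml_text):
    """Parse an XML string and return its PYX-style lines as a list."""
    handler = PyxWriter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.lines
```

Fed the string `<doc><?go now?><a href="x">hi</a></doc>`, xml_to_pyx returns seven lines, in order: `(doc`, `?go now`, `(a`, `Ahref x`, `-hi`, `)a`, `)doc`.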
One thing of particular interest is the way Pyxie adapts XML-encoded data to the Unix style of getting work done: command-line filters and little tools. You can have Pyxie emit a textual stream on STDOUT, and from there you can use the whole slew of Unix command-line tools to do interesting stuff: sed, awk, Perl, grep, wc, etc.
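Because every event sits on its own line with its type in the first column, a filter over PYX needs almost no code. As a rough sketch, assuming some tool has already written PYX to a pipe, this hypothetical count_elements function tallies how often each element name opens, much as piping through grep for lines that begin with "(" and then counting would:

```python
from collections import Counter


def count_elements(pyx_lines):
    """Count start-tags per element name in an iterable of PYX lines."""
    counts = Counter()
    for line in pyx_lines:
        # A start-tag line is "(" followed by the element name.
        # Data lines begin with "-", so a "(" inside text never matches.
        if line.startswith("("):
            counts[line[1:].rstrip("\n")] += 1
    return counts
```

To use it as a filter, call count_elements(sys.stdin) in a small script and print the result; the function accepts any iterable of lines, so it works just as well on a list or an open file.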
While it's not going to replace either SAX or DOM, Pyxie is apparently robust enough for its creator, Sean McGrath, to use it for some very serious XML development work for the Irish Parliament.
If you are a Pythoneer, or are new to XML programming, Pyxie is a great place to start. You may find that for lots of XML jobs -- especially ones in a Unix little tools setting -- it's all you'll need.
This is Pyxie: Python and XML <http://monkeyfist.com/articles/345>