Python and XML are two of my favorite technical things, so I
was surprised to realize I hadn't yet introduced Pyxie to the Monkeyfist
audience. SAX and DOM are the dominant XML APIs, and it's not
clear whether Pyxie is meant to move beyond Python and be
implemented in other languages. Then again, since Python is
such a nice language, it's also not clear that being
Python-only will prevent Pyxie from being widely and cleverly
used.
Pyxie contains both a parsing API with tree and stream-like
modes, and a line-oriented textual representation of XML data,
called PYX. If you're familiar with ESIS from SGML, then PYX
will be familiar too. PYX puts each XML object -- whether
data, processing instruction, attribute or element -- on its
own line, the first character of which is an XML object type
indicator:
"(" signals a start-tag
")" signals an end-tag
"A" signals an attribute
"-" signals data
"?" signals a processing instruction
And that's just about it; from that notation Pyxie builds tree
and stream-like APIs. As you can see, one of Pyxie's charms is
its simplicity.
One thing of particular interest is the way Pyxie adapts
XML-encoded data to the Unix style of getting work done: the
command line, filters, little tools. You can have
Pyxie emit a textual stream on STDOUT, and from there
you can use the whole slew of Unix command-line tools to do
interesting stuff: sed, awk, Perl, grep, wc, etc.
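Because every object sits on its own line, a filter over PYX can be
tiny. The following is a rough sketch of one, assuming PYX lines
arrive on standard input; count_elements and main are hypothetical
names for this example and are not part of Pyxie.

```python
import sys
from collections import Counter


def count_elements(pyx_lines):
    """Count start-tags by element name in an iterable of PYX lines."""
    counts = Counter()
    for line in pyx_lines:
        # Only lines whose first character is "(" are start-tags;
        # data lines begin with "-" even if the text contains "(".
        if line.startswith("("):
            counts[line[1:].rstrip("\n")] += 1
    return counts


def main(stream=sys.stdin, out=sys.stdout):
    # Print "count<TAB>name", most frequent element first.
    for name, n in count_elements(stream).most_common():
        out.write(f"{n}\t{name}\n")
```

Call main() at the bottom of a script and it becomes a filter you can
pipe PYX into. The same job in classic little tools style would be
roughly grep '^(' piped through sort and uniq -c.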
While it's not going to replace either SAX or DOM, Pyxie is
apparently robust enough for its creator, Sean McGrath, to use
it for some very serious XML development work for the Irish
Parliament.
If you are a Pythoneer, or are new to XML programming, Pyxie
is a great place to start. You may find that for lots of XML
jobs -- especially ones in a Unix little tools setting -- it's
all you'll need.