[XML-SIG] Tim Bray on XML

Uche Ogbuji uche.ogbuji@fourthought.com
Tue, 18 Mar 2003 12:01:00 -0700

> In case you missed today's post on Slashdot ...
> 	XML Is Too Hard For Programmers
> 	by Tim Bray
> http://www.tbray.org/ongoing/When/200x/2003/03/16/XML-Prog
> So, how does he parse xml?  Using regexp!

This is a pretty big exaggeration of Tim's point, I think.  I didn't read his 
blog carefully when he posted it, but he's often said on XML-DEV that he used 
regexp and Perl for some quick and dirty tasks.  And who doesn't?  Who hasn't 

fgrep "Bucciarelli" my-contact-list.xml ?

I hardly think there's any deep insight there for developers.  Yes, Tim 
sometimes goes beyond that, but he's always been very careful to point out 
that when it's time to do full, general XML parsing, he uses a proper parser, 
just like anyone else.
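
To make the distinction concrete, here is a minimal sketch in Python.  The 
contact list and its element names are invented for illustration, and 
ElementTree is simply one convenient parser, not anything Tim has said he uses.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical contact list; element names are made up for this example.
DOC = """<contacts>
  <contact><name>Bucciarelli</name><note>AT&amp;T account</note></contact>
  <contact><name>Smith</name><note><![CDATA[see <Bucciarelli>]]></note></contact>
</contacts>"""

# Quick and dirty: good enough to ask "does this string appear anywhere?"
# Note that it also matches the mention inside Smith's CDATA note.
grep_hits = [line for line in DOC.splitlines() if re.search(r"Bucciarelli", line)]

# Full parsing: entity references and CDATA are resolved, and the query
# follows the document structure rather than the raw text.
root = ET.fromstring(DOC)
names = [c.findtext("name") for c in root.findall("contact")]
notes = [c.findtext("note") for c in root.findall("contact")]
```

The regexp finds both lines, which is fine for a quick lookup; only the parser 
can tell a contact's name from a passing mention, and only the parser turns 
`&amp;` back into `&`.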

> Other points: he finds callback (SAX, I guess) painful, and suggests creating 
> a new idiom for stream processing of XML, one that "abstracts away all the 
> XML syntax weirdness, ignoring line-breaks, attribute orders, choice of 
> quotemarks and so on."

Of course SAX already does all the above except for line breaks, which I don't 
think should be silently canonicalized by a parser, anyway.
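
A small sketch with Python's xml.sax shows this.  The `Recorder` class and the 
`events_for` helper are names made up for the example; the sample documents 
are likewise invented.

```python
import xml.sax
from xml.sax.handler import ContentHandler

class Recorder(ContentHandler):
    """Collect SAX events as tuples so that two parses can be compared."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Attributes arrive as a mapping; sort them so that the comparison
        # below does not depend on the order they had in the source text.
        self.events.append(("start", name, sorted(attrs.items())))

    def characters(self, content):
        self.events.append(("chars", content))

    def endElement(self, name):
        self.events.append(("end", name))

def events_for(text):
    handler = Recorder()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.events

# Same content, different attribute order and different quote characters.
a = events_for('<item id="1" lang=\'en\'>hi</item>')
b = events_for("<item lang=\"en\" id='1'>hi</item>")
same = (a == b)

# Line breaks in character data are reported, not normalized away.
c = events_for("<item>one\ntwo</item>")
text = "".join(e[1] for e in c if e[0] == "chars")
```

Here `same` is true, since the handler never sees which quote character was 
used, while `text` still contains the newline between "one" and "two".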

Yes, SAX is a bit of a pain for anyone who thinks state diagrams are unfunny 
cartoons.  Luckily there are stream-like alternatives: XMLReader in libxml2 
and .NET, pulldom in Python (though the latter has some warts).
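
For the Python case, a minimal pulldom sketch looks something like the 
following.  The order document and the `skus` function are invented for 
illustration; the point is only that the caller pulls events in a loop instead 
of tracking state across callbacks.

```python
from xml.dom import pulldom

# Hypothetical input; element and attribute names are made up.
DOC = ("<orders>"
       "<order id='1'><sku>A-7</sku></order>"
       "<order id='2'><sku>B-3</sku></order>"
       "</orders>")

def skus(text):
    """Yield (order id, sku) pairs by pulling events from the stream."""
    events = pulldom.parseString(text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "order":
            # Build a small DOM for just this element's subtree.
            events.expandNode(node)
            sku = node.getElementsByTagName("sku")[0]
            # Join child text nodes in case the parser split the text.
            value = "".join(child.data for child in sku.childNodes)
            yield node.getAttribute("id"), value

result = list(skus(DOC))
```

Only one `order` subtree is expanded into DOM nodes at a time, so the approach 
stays stream-like even for large inputs.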

Anyway, I've been rolling my eyes at all the doomsday placards being paraded 
around lately.  I take what you might call the elitist position: if XML is too 
hard for a programmer, he'd perhaps best get out of the profession.



Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com