XML overuse? (was Re: Python to XML to Python conversion)

Mike C. Fletcher mcfletch at rogers.com
Fri Jul 12 12:56:20 EDT 2002


Oh goody, a religious war :)

I've been (peripherally) involved in one of those "go to XML for XML's 
sake" projects.  We managed to take a readable, readily-parsed, UTF-8 
format which dozens of pieces of software could read and write, which 
could be readily and reliably edited with a plain-text editor, and which 
was compact enough for use as a web-publishing data format, and turn 
that ISO standard into a format that no one would consider writing or 
editing by hand (ridiculously voluminous), no one would download over 
the internet (same), no software could readily read or write (save a few 
sample implementations produced by the conversion teams), and which made 
data validation a hairier step for the programmes dealing with the new 
format.

Months (years) of effort across dozens of companies, effort that should 
have gone elsewhere, was spent dismantling and re-branding a format which had 
been designed and evolved over years to do _exactly_ what it needed to 
do as an interchange format for 3D data.  There were conferences where 
the big question wasn't "how do we make this stuff better for our 
users", but "do we encode attributes as tags or tag-attributes?", and 
"how do we encode the data types reliably so that editors know what they 
are, but the resulting file doesn't look like trash?"
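
[To make the tags-versus-attributes question concrete, here is a minimal 
Python sketch.  The "Sphere" node and "radius" field are made-up names 
for illustration, not taken from either version of the format, and the 
standard library's xml.etree.ElementTree is just one convenient way to 
show the two encodings side by side.]

```python
import xml.etree.ElementTree as ET

# Hypothetical node with one numeric field; names are illustrative only.

# Option 1: encode the field as an XML attribute.
as_attribute = ET.Element("Sphere", {"radius": "2.5"})

# Option 2: encode the field as a child element (a tag).
as_tag = ET.Element("Sphere")
ET.SubElement(as_tag, "radius").text = "2.5"

attr_xml = ET.tostring(as_attribute, encoding="unicode")
tag_xml = ET.tostring(as_tag, encoding="unicode")
print(attr_xml)
print(tag_xml)

# Either way the value comes back as text; the reading code has to know
# from somewhere else that "radius" is meant to be a float.
radius_from_attr = float(ET.fromstring(attr_xml).get("radius"))
radius_from_tag = float(ET.fromstring(tag_xml).findtext("radius"))
```

[Both encodings carry the same value, the tag form is simply longer, and 
neither one says anything about the data type on its own.]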

Why?  Because a group of corporations had decided:

    1) this format was to be a "web" format (instead of a 3D interchange 
and VR format)
    2) all things "web" must be blessed by being XML
    3) current tools, programmes and users don't matter, because once we 
are XML-based, everything will just magically work.

That, to me, is XML done totally wrong, and projects like it are why 
people get allergic reactions.

The problem with cultivating an allergic reaction is that I _love_ SGML 
(which I started with way back when Paul and I worked with Professor 
Beam at Waterloo), and XML by extension, as a _textual_ markup 
language.  It's good for taking a stream of characters and stating what 
type of text each bit of that stream is.  It's even better for 
describing truly hierarchically structured texts, such as those seen in 
textbooks, manuals and the like (it's not great for poetry or artistic 
works of many types).

I also prefer XML for use in comp-sci problems where there's no 
readily-available, superior format.  Configuration files are fine in 
XML (if people prefer), as long as I can sit down and edit one from 
scratch without _requiring_ an XML editor.  That is hard in many 
examples of XML I've seen, because of the hoops being jumped through to 
shoe-horn data-typing into a _textual_ markup language.

[As a note, I've not found a decent XML editor along the lines of 
InContext's SGML editor, with useful support for editing text with the 
hierarchy visible, but not interfering with navigation, along with 
intelligent split/join/surround/un-surround/merge/
paste-hierarchical/paste-flat/create entity/use entity/etceteras 
facilities.]

I don't even mind XML being used for SOAP-like systems: let them eat 
cake.  If they're only going to do it a few times a minute and they 
don't mind wasting some bandwidth, why should I care?  In the absence of 
a better format, yes, go with XML.  If you're worried about long-term 
storage, I'll consider an argument for XML.  Even for real-time work, if 
the existing format is only slightly better and we don't have a lot of 
code depending on it, sure, go ahead.  If you're downgrading service to 
your customers just to "be XML", then you've screwed up.


Mastery of force is not the ability to marshal great force in all 
situations, but to know where, how and when to apply minimal force to 
achieve maximal benefit.
Mike


François Pinard wrote:
...
>>but it's readable and fairly obvious in meaning.
> 
> 
> The original non-XML format is also pretty readable and obvious in meaning.
> Surely, there are advantages to XML, but at first glance here, it seems we
> gain nothing but verbosity and monstrosity.  In my opinion, the advantages
> have to be pretty real to justify such a change.  We should not go XML
> for the only sake of going XML.
...





