XML overuse? (was Re: Python to XML to Python conversion)
Mike C. Fletcher
mcfletch at rogers.com
Fri Jul 12 12:56:20 EDT 2002
Oh goody, a religious war :)
I've been (peripherally) involved in one of those --go to XML for XML's sake--
projects. We managed to take a readable, readily-parsed, UTF-8 format
which dozens of pieces of software could read and write, which could be
readily and reliably edited with a plain-text editor, and was compact
enough for use as a web-publishing data format, and turned that ISO
standard into a format that no one would consider writing or editing by
hand (ridiculously voluminous), no one would download over the internet
(same), no software could read or write readily (save a few sample
implementations produced by the conversion teams), and which made data
validation a hairier step for the programmes dealing with the new format.
Months (years) of effort that should have gone elsewhere, across dozens
of companies, was spent dismantling and re-branding a format which had
been designed and evolved over years to do _exactly_ what it needed to
do as an interchange format for 3D data. There were conferences where
the big question wasn't "how do we make this stuff better for our
users", but "do we encode attributes as tags or tag-attributes?", and
"how do we encode the data types reliably so that editors know what they
are, but the resulting file doesn't look like trash?"
Why? Because a group of corporations had decided:
1) this format was to be a "web" format (instead of a 3D interchange
and VR format)
2) all things "web" must be blessed by being XML
3) current tools, programmes and users don't matter, because once we
are XML-based, everything will just magically work.
That, to me, is XML done totally wrong, and projects like it are why
people get allergic reactions.
The problem with cultivating an allergic reaction is that I _love_ SGML
(which I started with way back when Paul and I worked with Professor
Beam at Waterloo). I _love_ SGML (and XML by extension) as a _textual_
markup language. It's good for taking a stream of characters and
stating what type of text each bit of that stream is. It's even better
for describing truly hierarchically structured texts, such as seen in
textbooks, manuals and the like (it's not great for poetry or artistic
works of many types).
I also prefer XML for use in comp-sci problems where there's no
superior format readily available (configuration files
are fine in XML (if people prefer), as long as I can sit down and edit
one from scratch without _requiring_ an XML editor (which is hard in
many examples of XML I've seen, because of the hoops being jumped
through to shoe-horn data-typing into a _textual_ markup language)).
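To make the data-typing point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. The element and attribute names (light, intensity, on) and the CONVERTERS table are made up for illustration; they are not taken from any real format. It shows the same two values encoded as child tags and as tag-attributes: either way the parser hands back plain strings, and the types have to be supplied from outside the markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical config fragment: the same data encoded two ways.
AS_ELEMENTS = "<light><intensity>0.8</intensity><on>true</on></light>"
AS_ATTRIBUTES = '<light intensity="0.8" on="true"/>'


def read_elements(text):
    """Read values stored as child elements; every value is a string."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


def read_attributes(text):
    """Read values stored as attributes; every value is still a string."""
    return dict(ET.fromstring(text).attrib)


# The markup itself says nothing about types, so the application (or a
# schema) must supply them. This table is an assumption for the example.
CONVERTERS = {"intensity": float, "on": lambda s: s == "true"}


def typed(raw):
    """Convert the raw string values using the out-of-band type table."""
    return {key: CONVERTERS[key](value) for key, value in raw.items()}


elements = read_elements(AS_ELEMENTS)
attributes = read_attributes(AS_ATTRIBUTES)
print(elements == attributes)  # both encodings carry the same strings
print(typed(attributes))
```

Both encodings are still easy to type by hand here; the trouble starts when the type information itself gets pushed into the markup as extra tags or attributes.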
[As a note, I've not found a decent XML editor along the lines of
InContext's SGML editor, with useful support for editing text with the
hierarchy visible, but not interfering with navigation, along with
intelligent split/join/surround/un-surround/merge/
paste-hierarchical/paste-flat/create entity/use entity/etcetera
facilities.]
I don't even mind XML being used for SOAP-like systems: "let them eat
cake; if they're only going to do it a few times a minute and they don't
mind wasting some bandwidth, why should I care?" In the absence of a
better format, yes, go with XML. If you're worried about long-term
storage, I'll consider an argument for XML. For real-time work, with an
only slightly better format, sure, if we don't have a lot of code
depending on that format, go ahead. If you're downgrading service to
your customers just to "be XML", then you've screwed up.
Mastery of force is not the ability to marshal great force in all
situations, but to know where, how and when to apply minimal force to
achieve maximal benefit.
Mike
François Pinard wrote:
...
>>but it's readable and fairly obvious in meaning.
>
>
> The original non-XML format is also pretty readable and obvious in meaning.
> Surely, there are advantages to XML, but at first glance here, it seems we
> gain nothing but verbosity and monstrosity. In my opinion, the advantages
> have to be pretty real to justify such a change. We should not go XML
> for the only sake of going XML.
...