[Tutor] XML: Expletive Deleted (OT)

Carroll, Barry Barry.Carroll at psc.com
Mon Jun 12 18:55:53 CEST 2006


Alan, Ralph, et al:

This is a little off-topic, I guess, being not directly related to
Python.  Oh, well.  Here are a couple of personal opinions and a
question about XML.

> -----Original Message-----
> Date: Sun, 11 Jun 2006 08:55:17 +0100
> From: "Alan Gauld" <alan.gauld at freenet.co.uk>
> Subject: Re: [Tutor] Expletive Deleted
> To: "Ralph H. Stoos Jr." <rstoos at rochester.rr.com>,
> 	<Tutor at python.org>
> Message-ID: <001d01c68d2c$62077090$0301a8c0 at XPpro>
> Content-Type: text/plain; format=flowed; charset="iso-8859-1";
> 	reply-type=original
> 
> > I think XML is a tool that allows non-programmers to look at
> > structured data and have it in a human-readable form that gives us
> > a chance of understanding that structure.
> 
> That's not a great reason to choose a file format IMHO.
> Tools can be written to display data in a readable format.
> For example SQL can be used to view the data in a database.
> File formats should be designed to store data, compactly
> and with easy access.

One reason for choosing a human-readable format is the desire to
visually confirm the correctness of the stored data and format.  This
can be invaluable when troubleshooting a bug involving stored data.  If
there is a tool between the user and the data, one must then rely upon
the correctness of the tool to determine the correctness of the data.
In a case like this, nothing beats the evidence of one's eyes, IMHO.  

In their book, "The Pragmatic Programmer: From Journeyman to Master"
(Addison-Wesley Professional), Andrew Hunt and David Thomas give another
reason for storing data in human-readable form:

    The problem with most binary formats is that the context necessary 
    to understand the data is separate from the data itself. You are 
    artificially divorcing the data from its meaning. The data may 
    as well be encrypted; it is absolutely meaningless without the 
    application logic to parse it. With plain text, however, you can 
    achieve a self-describing data stream that is independent of the 
    application that created it.

        Tip 20

            Keep Knowledge in Plain Text

> > The other strength that I can see is this:  Once data is in this
> > format, and a tool has been written to parse it, data can be added
> > to the structure (more elements) and the original tool will not be
> > broken by this.  Whatever it is parsed for is found and the extra is
> > ignored.
> 
> But this is a very real plus point for XML.
> And this IMHO is the biggest single reason for using it, if you have
> data where the very structure itself is changing yet the same file
> has to be readable by old and new clients then XML is a good choice.

No argument there.  
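
A minimal sketch of that point, assuming Python's standard
xml.etree.ElementTree module.  The element names and sample records are
made up for illustration; they are not from Ralph's or Alan's data.  The
"old" reader asks only for the elements it knows about, so a record with
an extra element parses to the same result:

```python
import xml.etree.ElementTree as ET

# Hypothetical records; the element names are invented for this sketch.
OLD_DOC = "<order><id>17</id><item>widget</item></order>"
NEW_DOC = ("<order><id>17</id><item>widget</item>"
           "<priority>rush</priority></order>")  # a newer writer added <priority>

def read_order(xml_text):
    """Old-style reader: extracts only the elements it was written for."""
    root = ET.fromstring(xml_text)
    return {"id": int(root.findtext("id")), "item": root.findtext("item")}

old_result = read_order(OLD_DOC)
new_result = read_order(NEW_DOC)

# The unknown <priority> element is never requested, so it is ignored.
print(old_result == new_result)
```

A CSV reader that indexes columns by position would not necessarily
survive the same kind of change, which is the contrast being drawn.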

> > Without a doubt, the overhead XML adds over, say, something as
> > simple as CSV is considerable, and XML would appear to be rather
> > harder to work with in things like Python and Perl.
> 
> Considerable is an understatement; it's literally up to 10 or 20 times
> more space, and that means bandwidth and CPU resource to
> process it.
> 
> Using XML as a storage medium - a file - is not too bad, you suck
> it up, process it and forget the file. My big gripe is that people
> are increasingly trying to use XML as the payload in comms systems,
> sending XML messages around. This is crazy! The extra cost of the
> network and hardware needed to process that kind of architecture
> is usually far higher than the minimal savings it gives in developer
> time.
> [As an example I recently had to uplift the bandwidth of the
> intranet pipe in one of our buildings from 4Mb to a full ATM pipe
> of 34Mb just to accommodate a system 'upgrade' that now used XML.
> That raised the network operations cost of that one building
> from $10k per year to over $100k! - The software upgrade by
> contrast was only a one-off cost of $10k]

This is an example of the resource balancing act that computer people
have been faced with since the beginning.  The most scarce/expensive
resource dictates the program's/system's design.  In Alan's example high
speed bandwidth is the limiting resource.  A data transmission method
that fails to minimize use of that resource is therefore a bad solution.
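
For anyone who wants to measure the overhead on their own data rather
than take a figure on faith, here is a rough sketch.  The field names
and rows are invented, and the XML layout (one child element per field)
is just one plausible encoding, so the ratio it prints says nothing
about Alan's system; it only shows how to make the comparison:

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample data; field names are illustrative only.
FIELDS = ["id", "name", "qty"]
ROWS = [(i, "part%d" % i, i * 3) for i in range(100)]

def as_csv(rows):
    """Serialize the rows as CSV with a header line; returns bytes."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(FIELDS)
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")

def as_xml(rows):
    """Serialize the rows as XML, one child element per field; returns bytes."""
    root = ET.Element("rows")
    for row in rows:
        rec = ET.SubElement(root, "row")
        for field, value in zip(FIELDS, row):
            ET.SubElement(rec, field).text = str(value)
    return ET.tostring(root, encoding="utf-8")

csv_bytes = as_csv(ROWS)
xml_bytes = as_xml(ROWS)

# Print both sizes and the XML/CSV ratio for this particular data set.
print(len(csv_bytes), len(xml_bytes),
      round(len(xml_bytes) / len(csv_bytes), 1))
```

The ratio depends heavily on tag-name length and on how short the values
are, so it is worth running against representative records.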


Python itself is a result of this balancing act.  Interpreted languages
like Basic were invented to overcome the disadvantages of writing
programs in machine-readable, human-unfriendly formats.  Compiled
languages like C were invented to overcome the slow execution speed of
interpreted programs.  As processor speeds increased and execution times
dropped, interpreted languages like Python once again became viable for
large scale programs.

> > So, I think XML has its place but I will not fault anyone for
> > trying to make it easier to get code to work.
> 
> Absolutely agree with that. Just be careful how you use it and
> think of the real cost impact you may be having if it's your choice.
> Your customers will thank you.

So here's my off-topic question: Ajax is being touted as the 'best-known
method' (BKM) for making dynamic browser-based applications, and XML is
the BKM for transferring data in Ajax land.  If XML is a bad idea for
network data-transfer, what medium should be used instead?

> Alan Gauld
> Author of the Learn to Program web site
> http://www.freenetpages.co.uk/hp/alan.gauld

Regards,
 
Barry
barry.carroll at psc.com
541-302-1107
________________________
We who cut mere stones must always be envisioning cathedrals.

-Quarry worker's creed



