[Tutor] Trying to parse a HUGE(1gb) xml file in python

Alan Gauld alan.gauld at btinternet.com
Tue Dec 21 11:03:19 CET 2010


"David Hutto" <smokefloat at gmail.com> wrote

> XML stands for eXtensible Markup Language.
> XML is designed to transport and store data.
>
> Then what other file medium would you suggest as the tagging means.

See my other post, but there are many alternatives that are orders
of magnitude more efficient. XML is one of the most inefficient
data transport mechanisms ever invented; its main redeeming
feature is its human readability.

> You have a file with tags, you can't parse and store the data in any
> file anymore than the next, right?

Wrong. Even CSV files can be parsed more efficiently than XML
(although they are very limited in the data structures they can
represent).
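
As a rough illustration of the CSV point, here is a minimal sketch that
writes the same records as XML and as CSV and compares the sizes. The
record layout (id, name, value), the tag names and the helper names
to_xml/to_csv are all made up for the example; only the standard
library is used.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data: 1000 (id, name, value) records.
records = [(i, "item%d" % i, i * 1.5) for i in range(1000)]

def to_xml(rows):
    """Serialise rows as one <record> element per row (made-up tag names)."""
    root = ET.Element("records")
    for rid, name, value in rows:
        rec = ET.SubElement(root, "record")
        ET.SubElement(rec, "id").text = str(rid)
        ET.SubElement(rec, "name").text = name
        ET.SubElement(rec, "value").text = str(value)
    return ET.tostring(root)

def to_csv(rows):
    """Serialise rows as plain CSV, one line per row, no header."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue().encode("ascii")

xml_bytes = to_xml(records)
csv_bytes = to_csv(records)

# Every XML record repeats its tag names; CSV only pays for separators.
print("XML bytes:", len(xml_bytes))
print("CSV bytes:", len(csv_bytes))
```

The flat CSV rows cannot express nesting or optional fields, which is
the limitation mentioned above.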

But binary formats, like those defined with an IDL or ASN.1, can be
parsed very efficiently and, because they are binary, store
(and therefore transmit) their data much more efficiently too.
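
To give a feel for the difference, here is a small sketch that uses
Python's struct module as a stand-in for a binary format. It is not
ASN.1 or an IDL-generated encoding; the fixed record layout (a 32-bit
unsigned id plus a 64-bit float) and the XML tag names are assumptions
made up for the example.

```python
import struct

# Hypothetical fixed layout: little-endian uint32 id + float64 value,
# no padding, so every record is exactly 12 bytes.
RECORD = struct.Struct("<Id")

records = [(i, i * 1.5) for i in range(1000)]

# Binary encoding: concatenate the packed records.
binary = b"".join(RECORD.pack(rid, value) for rid, value in records)

# Equivalent hand-built XML text for the same data (made-up tag names).
xml_text = "".join(
    "<record><id>%d</id><value>%r</value></record>" % (rid, value)
    for rid, value in records
).encode("ascii")

# Decoding is a fixed-size unpack per record: no tokenising, no tag matching.
decoded = list(RECORD.iter_unpack(binary))

print("binary bytes:", len(binary))
print("XML bytes:   ", len(xml_text))
```

The price is that the reader must already know the layout, which is
what an IDL or ASN.1 definition provides in a real system.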

HTH,

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/



