[Tutor] Trying to parse a HUGE(1gb) xml file in python
Steven D'Aprano
steve at pearwood.info
Tue Dec 21 14:54:51 CET 2010
Alan Gauld wrote:
> XML is a self-describing data format. It is usually used for files
> but can be used in data streams or in-memory strings.
>
> Its natural competitors are TLV (Tag, Length, Value) and
> CSV (Comma Separated Value) files but neither is as rich
> in structure. Alternative options include ASN.1, Edifact and
> IDL but these are not self-describing(*) (although they are all
> more compact and faster to parse, but only IDL is free).
I would have thought that both JSON and YAML are competitors to XML,
although of course it depends on exactly what you are using XML for. For
example, GNOME uses XML files extensively for its poor-man's Registry,
which is a shame as (in my opinion) simple Windows-style INI files or
Unix/Linux-style config files would be a far better and more natural choice.
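
To make the comparison concrete, here is a minimal sketch that stores the
same two settings as XML, INI and JSON and reads each back with the Python
standard library. The section name, keys and XML layout are invented for
illustration; they are not GNOME's actual schema.

```python
import configparser
import json
import xml.etree.ElementTree as ET

# Hypothetical settings, invented for this example only.
XML_TEXT = """<config>
  <section name="desktop">
    <entry key="theme" value="dark"/>
    <entry key="font_size" value="11"/>
  </section>
</config>"""

INI_TEXT = """[desktop]
theme = dark
font_size = 11
"""

JSON_TEXT = '{"desktop": {"theme": "dark", "font_size": "11"}}'


def from_xml(text):
    """Return {section: {key: value}} from the made-up XML layout above."""
    root = ET.fromstring(text)
    return {
        section.get("name"): {
            entry.get("key"): entry.get("value")
            for entry in section.findall("entry")
        }
        for section in root.findall("section")
    }


def from_ini(text):
    """Return {section: {key: value}} from INI text; values stay strings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


def from_json(text):
    """Return the nested dict encoded in the JSON text."""
    return json.loads(text)


if __name__ == "__main__":
    for loader, text in ((from_xml, XML_TEXT),
                         (from_ini, INI_TEXT),
                         (from_json, JSON_TEXT)):
        print(loader.__name__, loader(text))
```

All three loaders produce the same nested dict here. For flat key/value
settings like these, the INI text is the shortest of the three and the
easiest to edit by hand.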
Basically, people shouldn't make the mistake of thinking that because
XML is text-based it is meant as a human-readable (let alone
human-editable) format. It's not. It's a machine format that happens to
be *just barely* human-readable and -editable in simple cases due to
using ASCII text.
--
Steven