[Tutor] Trying to parse a HUGE(1gb) xml file in python
Stefan Behnel
stefan_ml at behnel.de
Tue Dec 21 12:59:58 CET 2010
David Hutto, 21.12.2010 12:45:
>> If file a.xml has simple tagged xml like <a>, and file b.config has
>> tags that represent the a.xml (i.e. <a> = <antonym>) as greater tags,
>> does this pattern optimize the process by limiting the size of the
>> tags to be parsed in the xml, then converting those simpler tags that
>> are found to the b.config values for the simple <a-z> simple format?
>
> In other words I'm lazy and asking for the experiment to be performed
> for me(or, more importantly, if it has been), but since I'm not new to
> this, if no one has a specific case, I'll timeit when I get to it.
I'm still not sure I understand what you are trying to describe here, but I
think you want to look into the Wikipedia articles on indexing, hashing and
compression.
http://en.wikipedia.org/wiki/Index_%28database%29
http://en.wikipedia.org/wiki/Index_%28information_technology%29
http://en.wikipedia.org/wiki/Hash_function
http://en.wikipedia.org/wiki/Data_compression
Terms like "indirection" and "mapping" also come to my mind when I try to
make sense out of your hints.
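
As a rough illustration of the "mapping" idea, here is a minimal sketch that
streams through an XML document and translates short tag names into longer
ones through a dict lookup. The TAG_NAMES table, the sample document and the
iter_expanded() helper are all made up for this example (the table stands in
for whatever b.config would contain). It says nothing about whether shorter
tags actually parse faster; that would still need to be timed.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping, standing in for the contents of b.config.
TAG_NAMES = {"a": "antonym", "s": "synonym"}

def iter_expanded(source, tag_names):
    """Yield (long_name, text) pairs while streaming through the XML."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end":
            if elem.tag in tag_names:
                # dict lookup (hashing) maps the short tag to its long name
                yield tag_names[elem.tag], elem.text
            # detach finished children from the root so memory stays bounded
            root.clear()

# Tiny in-memory sample; a real run would pass a file name or file object.
sample = io.BytesIO(b"<root><a>cold</a><s>warm</s><x>skip</x></root>")
pairs = list(iter_expanded(sample, TAG_NAMES))
```

Because iterparse() hands over elements as they are completed, the whole
1 GB tree never has to sit in memory at once, and the tag translation costs
one dict lookup per element.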
Stefan