XML parsing: SAX/expat & yield
__peter__ at web.de
Wed Aug 4 19:22:54 CEST 2010
> I want to write code that parses a file that is far bigger than
> the amount of memory I can count on. Therefore, I want to stay as
> far away as possible from anything that produces a memory-resident
> DOM tree.
> The top-level structure of this XML is very simple: it's just a
> very long list of "records". All the complexity of the data is at
> the level of the individual records, but these records are tiny in
> size (relative to the size of the entire file).
> So the ideal would be a "parser-iterator", which parses just enough
> of the file to "yield" (in the generator sense) the next record,
> thereby returning control to the caller; the caller can process
> the record, delete it from memory, and return control to the
> parser-iterator; once the parser-iterator regains control, it
> resumes parsing where it left off.