Python parsing XML file problem with SAX

Aahz aahz at
Tue Aug 10 01:20:33 CEST 2010

In article <mailman.1860.1281375095.1673.python-list at>,
Stefan Behnel  <stefan_ml at> wrote:
>Aahz, 09.08.2010 18:52:
>> In article <mailman.1250.1280314148.1673.python-list at>,
>> Stefan Behnel wrote:
>>> First of all: don't use SAX. Use ElementTree's iterparse() function. That
>>> will shrink your code down to a simple loop in a few lines.
>> Unless I'm missing something, that only helps if the final tree fits into
>> memory.  What do you suggest other than SAX if your XML file may be
>> hundreds of megabytes?
>Well, what about using ElementTree's iterparse() function in that case? 
>That's what it's good at, and its cElementTree version is extremely fast.

The docs say, "Parses an XML section into an element tree incrementally".
Sure sounds like it retains the entire parsed tree in RAM.  Not good.
Again, how do you parse an XML file larger than your available memory
using something other than SAX?
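
For reference, the pattern usually meant by the iterparse() suggestion looks
roughly like the sketch below. It only keeps memory bounded if each element
is discarded once it has been handled; without the clear() call the whole
tree does accumulate. The function name count_records, the "record" tag, and
the sample document are made up for illustration and do not come from this
thread.

```python
import io
import xml.etree.ElementTree as ET


def count_records(source, tag="record"):
    """Stream over `source`, handle each <record>, then throw it away."""
    count = 0
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            count += 1    # stand-in for the real per-record work
            root.clear()  # drop finished children so the tree stays small
    return count


# Hypothetical sample input; a real file path or file object works the same way.
sample = io.BytesIO(b"<log><record id='1'/><record id='2'/><record id='3'/></log>")
print(count_records(sample))
```

This assumes the records are direct children of the root and that nothing
needs to look back at an earlier record once it has been processed.
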
Aahz (aahz at           <*>

"...if I were on life-support, I'd rather have it run by a Gameboy than a
Windows box."  --Cliff Wells
