10GB XML Blows out Memory, Suggestions?
Fredrik Lundh
fredrik at pythonware.com
Thu Jun 8 09:50:49 EDT 2006
fuzzylollipop wrote:
> SAX style or a pull-parser has to be used when the data is "large" or
> when you don't really need to process every element and attribute.
>
> This problem looks like it is just a data export / import problem. In
> that case you will either have to use a sax style parser and parse the
> 10GB file. Or as I suggested in another reply, export the data in
> smaller chunks
or use a parser that can do the chunking for you, on the way in...
in Python, incremental parsers like cET's iterparse and the one in Amara
give you *better* performance than SAX (including "raw" pyexpat) in
many cases, and offer a much simpler programming model.
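
a minimal sketch of the iterparse approach, using the standard library's
xml.etree.ElementTree (the module cElementTree was later folded into). the
"record" tag, the iter_records helper and the file name in the usage note are
made up for illustration, and the sketch assumes the records are direct
children of the root element:

```python
import io
import xml.etree.ElementTree as ET  # stdlib successor to (c)ElementTree


def iter_records(source, tag="record"):
    """Yield (attributes, text) for each <tag> element in the stream.

    Assumes the <tag> elements are direct children of the root, so
    clearing the root after each one keeps memory use roughly constant.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            # the element is complete at its "end" event
            yield dict(elem.attrib), (elem.text or "").strip()
            # drop everything parsed so far so the tree never grows
            root.clear()


if __name__ == "__main__":
    sample = io.BytesIO(
        b"<export><record id='1'>a</record><record id='2'>b</record></export>"
    )
    for attrs, text in iter_records(sample):
        print(attrs, text)
```

for a real export you would pass a file name or an open binary file instead of
the BytesIO object, e.g. iter_records("export.xml"). the parser then reads the
file in pieces and hands you one finished element at a time.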
</F>