[Chicago] Out of Memory: Killed Process: on CentOS

Cosmin Stejerean cosmin at offbytwo.com
Mon Apr 27 20:31:53 CEST 2009


On Mon, Apr 27, 2009 at 11:37 AM, Brian Ray <bray at sent.com> wrote:

>
> On Apr 27, 2009, at 11:22 AM, Cosmin Stejerean wrote:
>
>
>> How huge are the XML documents you are trying to parse?
>>
>
>
> Average size of a record is around 7.3 KB.  The testing engineer is playing
> with the idea of processing up to 100,000 records at a time.  That is 714.9 MB.
>

If you're going to handle arbitrarily large XML documents that contain
independent data records, you should definitely use SAX or iterparse from
cElementTree. Both parse the input incrementally instead of building the
whole tree first. Also make sure you never hold the entire XML document in
memory as a string, and that you don't accumulate all of the resulting
records before inserting them into the database.
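
Something along these lines is what I have in mind. It is only a rough
sketch, not tested against your data: the "record" tag name, the batch size
and the iter_record_batches helper are all made up for illustration, and it
assumes each record is a direct child of the root element with simple text
fields. On current Pythons iterparse lives in xml.etree.ElementTree; on 2.x
you would import cElementTree instead.

```python
import io
import xml.etree.ElementTree as ET  # on Python 2.x: xml.etree.cElementTree


def iter_record_batches(source, record_tag="record", batch_size=1000):
    """Yield lists of at most batch_size dicts, one dict per record element.

    Assumes (hypothetically) that records are direct children of the root
    and that each field is a child element containing only text.
    """
    batch = []
    # Request "start" events too, so the first event hands us the root.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            batch.append({child.tag: child.text for child in elem})
            # Drop already-parsed records from the tree so memory stays flat.
            root.clear()
            if len(batch) >= batch_size:
                yield batch
                batch = []
    if batch:
        yield batch  # final, possibly short, batch


if __name__ == "__main__":
    # Tiny in-memory document standing in for a large file on disk.
    demo = io.BytesIO(
        b"<records>"
        b"<record><id>1</id><name>a</name></record>"
        b"<record><id>2</id><name>b</name></record>"
        b"<record><id>3</id><name>c</name></record>"
        b"</records>"
    )
    for rows in iter_record_batches(demo, batch_size=2):
        print(len(rows), rows)
```

In real use you would pass an open file (or a filename) rather than a
BytesIO, and hand each batch to your database layer, for example with a
DB-API cursor's executemany, before pulling the next one. That way only one
batch of records is in memory at any time.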

-- 
Cosmin Stejerean
http://offbytwo.com