[Chicago] Out of Memory: Killed Process: on CentOS
Cosmin Stejerean
cosmin at offbytwo.com
Mon Apr 27 20:31:53 CEST 2009
On Mon, Apr 27, 2009 at 11:37 AM, Brian Ray <bray at sent.com> wrote:
>
> On Apr 27, 2009, at 11:22 AM, Cosmin Stejerean wrote:
>
>
>> How huge are the XML documents you are trying to parse?
>>
>
>
> Average size of a record is around 7.3 KB. The testing engineer is playing
> with the idea of processing up to 100,000 records at a time. That is 714.9 MB.
>
If you're going to handle arbitrarily sized XML documents that contain
independent data records, you should definitely use SAX or iterparse from
cElementTree. Also make sure you never hold the whole XML document in memory
as a single string, and never accumulate all of the resulting records before
inserting them into the database. Parse and insert incrementally instead.
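
A minimal sketch of what I mean, in case it helps. I don't know your actual
schema, so the <records>/<record> layout, the flat child fields, and the names
iter_records and batches are all made up for illustration. It uses
xml.etree.ElementTree, which provides the same iterparse as cElementTree (on
current Python versions the C implementation is picked up automatically).

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, tag="record"):
    """Yield one dict per <record> element without building the whole tree.

    `tag` and the flat field layout are assumptions about the input format.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield {child.tag: child.text for child in elem}
            # Detach processed records from the root so they can be freed.
            root.clear()


def batches(records, size=1000):
    """Group an iterable of records into lists of at most `size` items."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:
        yield batch


# Tiny in-memory demo; a real run would pass a filename or an open file.
sample = io.BytesIO(
    b"<records>"
    b"<record><id>1</id><name>a</name></record>"
    b"<record><id>2</id><name>b</name></record>"
    b"<record><id>3</id><name>c</name></record>"
    b"</records>"
)
for batch in batches(iter_records(sample), size=2):
    # This is where one database insert and commit per batch would go.
    print(len(batch), "records ready to insert")
```

The point is that only one batch of records is alive at any time, so memory
use depends on the batch size rather than on the size of the input file.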
--
Cosmin Stejerean
http://offbytwo.com