[Expat-discuss] large XML files in python 2.5 (expat 2.0.0, XML_LARGE_SIZE)

Robert Hancock robert.hancock1 at virgin.net
Sun Apr 1 23:38:06 CEST 2007


Hi,

I'm processing some very large (>1TB) XML files with python and expat 
(python 2.5 to get expat 2.0.0).

The parser.CurrentByteIndex attribute is useful to me for statistics and 
debugging, but in the default Python build it seems to be limited to 
2**31 bytes (signed 32-bit int?): I see the index wrap as I work through 
the file.
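
To illustrate the wrapping I mean, here is a minimal sketch of the 
arithmetic, assuming the index really is stored in a signed 32-bit 
integer (that is only my guess). The helper name wrap_int32 is made up 
for the illustration and is not part of expat or Python.

```python
def wrap_int32(n):
    """Return n as it would appear if stored in a signed 32-bit integer.

    Hypothetical helper: it only models two's-complement wrap-around,
    it does not call into expat.
    """
    return ((n + 2**31) % 2**32) - 2**31


# Offsets below 2**31 are unchanged; the first byte past the limit
# shows up as a large negative number.
for true_offset in (2**31 - 1, 2**31, 3 * 2**30):
    print(true_offset, "->", wrap_int32(true_offset))
```

If the guess is right, the reported index should go negative just past 
the 2 GiB mark and only match the true offset again after a further 4 GiB.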

I have tried rebuilding python2.5 from source (platform: Linux x86) but 
can't seem to get the XML_LARGE_SIZE option to have any effect. I ran
./configure CFLAGS=-DXML_LARGE_SIZE CPPFLAGS=-DXML_LARGE_SIZE
and rebuilt, but I still see the same wrapping behaviour.

Are there notes anywhere on how to enable this? Any way to test it 
within Python?
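
The kind of test I have in mind is sketched below: feed the parser 
generated chunks, keep my own byte count in Python, and compare it with 
parser.CurrentByteIndex at the start of each element. This is only a 
sketch. The function name find_offset_mismatch and the synthetic <a> 
elements are made up for the example, and it is written against the 
xml.parsers.expat module of a current Python 3 (the b"" literals would 
need changing for 2.5).

```python
import xml.parsers.expat


def find_offset_mismatch(num_chunks, payload_size):
    """Return (expected, reported) for the first wrong offset, or None.

    Hypothetical test helper: feeds num_chunks synthetic <a> elements,
    each carrying payload_size bytes of text, and checks the byte index
    reported at the start of every <a> against a count kept in Python.
    """
    parser = xml.parsers.expat.ParserCreate()
    state = {"expected": 0, "mismatch": None}

    def on_start(name, attrs):
        # Inside a handler, CurrentByteIndex is the offset of this event.
        reported = parser.CurrentByteIndex
        if (name == "a" and state["mismatch"] is None
                and reported != state["expected"]):
            state["mismatch"] = (state["expected"], reported)

    parser.StartElementHandler = on_start

    header = b"<root>"
    parser.Parse(header, False)
    fed = len(header)

    chunk = b"<a>" + b"x" * payload_size + b"</a>"
    for _ in range(num_chunks):
        state["expected"] = fed  # each chunk starts with its <a> tag
        parser.Parse(chunk, False)
        fed += len(chunk)

    parser.Parse(b"</root>", True)
    return state["mismatch"]


# Small smoke test; a real check needs the total to pass 2**31 bytes,
# e.g. payload_size=2**20 with a little over 2048 chunks.
print(find_offset_mismatch(5, 16))
```

On a build where the index is wide enough I would expect None even for 
a run that passes 2**31 bytes; on a build that wraps I would expect the 
first mismatch to appear just past that point.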

Thanks in advance for any help,

Robert Hancock

