[Tracker-discuss] [issue223] xml.etree.ElementTree does not read xml-text over page boundaries

roland rehmnert metatracker at psf.upfronthosting.co.za
Fri Oct 10 10:49:48 CEST 2008

New submission from roland rehmnert <roland.rehmnert at ericsson.com>:

XML text fields are not read properly when they are encountered in a 'start' event.

During a 'start' event, elem.text returns None if the text string crosses a page
boundary of the file. (The page size is platform dependent; a typical value is
8K, i.e. 8192 bytes.)

This line causes an error if the page size is 8192:
<a>this is a text where X has position 8192 in the file</a>

In most cases this erroneous behaviour can be avoided, because elem.text always
returns the proper value at the 'end' event.
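The workaround above can be sketched roughly as follows. This is only an
illustration, not the submitted bug.py: the helper name texts_at_end and the
in-memory document are made up for the example, and the 8192-byte mark is the
typical value mentioned above, not a guaranteed buffer size.

```python
import io
import xml.etree.ElementTree as ET


def texts_at_end(source):
    """Collect (tag, text) pairs, reading elem.text only on 'end' events."""
    results = []
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            # Only the tag and attributes are reliable here; elem.text may
            # still be None if the parser has not consumed the text yet.
            continue
        # At 'end' the element is fully populated, so the text is complete.
        results.append((elem.tag, elem.text))
    return results


# Hypothetical input: the padding places the text of <a> across byte 8192.
message = "this is a text that crosses the boundary"
doc = "<root><pad>" + "x" * 8160 + "</pad><a>" + message + "</a></root>"
pairs = texts_at_end(io.BytesIO(doc.encode("utf-8")))
```

Here pairs holds one entry per closed element, and the entry for <a> carries
the whole text regardless of where the read boundary happened to fall.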

Two files are submitted:
bug.py: An excerpted file that produces the error with the submitted xml file.
bug.xml: An xml file, a little more than 8200 bytes. If the page size is
greater than 8K, the file should be enlarged. What matters is that the text
crosses the page boundary. Tags, attributes and attribute values that cross the
boundary are OK.
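For a different page size, a file of the same shape as bug.xml could be
generated along these lines. This is a sketch under assumptions: the function
name build_boundary_xml, the element names and the padding scheme are invented
for illustration and are not taken from the attached file.

```python
import xml.etree.ElementTree as ET


def build_boundary_xml(boundary=8192,
                       text="this is a text where X marks the boundary"):
    """Return XML bytes in which `text` straddles byte offset `boundary`."""
    head = b"<root><pad>"
    tail = b"</pad><a>"
    payload = text.encode("ascii")
    # Start the text half its length before the boundary so it crosses it.
    text_start = boundary - len(payload) // 2
    fill = text_start - len(head) - len(tail)
    if fill < 0:
        raise ValueError("boundary too small for the fixed markup")
    return head + b"x" * fill + tail + payload + b"</a></root>"


data = build_boundary_xml()
start = data.index(b"<a>") + len(b"<a>")  # offset of the first text byte
end = data.index(b"</a>")                 # offset just past the last text byte
```

Passing a larger boundary (for example 16384) moves the text accordingly, which
is what "the file should be enlarged" amounts to.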

I might have misunderstood the documentation of etree, because there are
situations that I have not tested.

messages: 1110
nosy: roland
priority: bug
status: unread
title: xml.etree.ElementTree does not read xml-text over page boundaries

PSF Meta Tracker <metatracker at psf.upfronthosting.co.za>
