[XML-SIG] Re: tabs inside attribute values removed
Luke Kenneth Casson Leighton
lkcl@samba-tng.org
Tue, 13 Mar 2001 00:15:46 +1100
On Fri, 9 Mar 2001, Jeremy Kloth wrote:
>
>
> > i am having to pre-process all text, substituting
> > &#9; for "\t" as a work-around for this problem.
> >
> > if this is not performed, then all tabs inside
> > attribute's values, e.g.
> > <node attr="value\tsep\tby\ttabs"/>, are turned into
> > spaces.
>
> Using PyXML 0.6.4, I didn't see this behavior.
>
> from xml.dom.ext.reader import Sax2
> doc = Sax2.FromXml('<element attr="a&#9;tab"/>')
> attr = doc.documentElement.attributes.item(0)
> print repr(attr.value)
> 'a\011tab'
it's the other way round [and this was with 0.6.2]
doc = Sax2.FromXml('<element attr="a\011tab"/>')
attr = doc.documentElement.attributes['','attr'].value
and should i be using doc.documentElement.attributes['ns','name'].value,
is that okay?
[ just checked this]
it still doesn't work, and it still doesn't work with 0.6.4.
so, yes: i have to pre-process all text, substituting \t with &#9; which
is _not_ something i want to have to leave in the code, long-term, as you
might imagine!
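
for reference, a minimal sketch of the difference and of the work-around.
XML 1.0 attribute-value normalization turns a literal tab inside an
attribute value into a space, while the character reference &#9; comes
through as a real tab. assumptions: this uses the standard-library
xml.dom.minidom on a current python rather than the PyXML Sax2 reader
from this thread, and parse_attr / escape_attr are hypothetical helper
names made up for the illustration.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape


def parse_attr(xml_text):
    """Return the value of 'attr' on the root element (hypothetical helper)."""
    doc = minidom.parseString(xml_text)
    return doc.documentElement.getAttribute("attr")


def escape_attr(value):
    """Escape text for a double-quoted attribute, writing tabs as &#9;."""
    # escape() handles &, < and > first, then applies the extra mappings,
    # so the '&' in '&#9;' is not escaped a second time.
    return escape(value, {'"': "&quot;", "\t": "&#9;"})


# a raw tab character inside the attribute value
literal = parse_attr('<element attr="a\ttab"/>')
# the same value written with a character reference
reference = parse_attr('<element attr="a&#9;tab"/>')
# the work-around: escape the value before building the document
rebuilt = parse_attr('<element attr="%s"/>' % escape_attr("a\ttab"))

print(repr(literal))    # 'a tab'
print(repr(reference))  # 'a\ttab'
print(repr(rebuilt))    # 'a\ttab'
```

the escaping is applied to the attribute value only, not to the whole
document, because a &#9; between a tag name and an attribute name is not
well-formed.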
some of the documents i am parsing are over 2.5mb in size, and other
people may find larger uses (see http://sourceforge.net/projects/pyxsmqll)
yes, i know: i need to move to a Sax model not a DOM one. first
implementation, and all that :)
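
a rough sketch of what the Sax route could look like, so that attribute
values are seen one element at a time instead of holding a whole DOM
tree in memory. assumptions: this uses the standard-library xml.sax on a
current python, not the PyXML classes, and AttrCollector and the sample
document are invented for the illustration.

```python
import xml.sax


class AttrCollector(xml.sax.ContentHandler):
    """Collect (element, attribute, value) triples as elements stream past."""

    def __init__(self):
        super().__init__()
        self.values = []

    def startElement(self, name, attrs):
        # attrs only describes the element currently being opened
        for key in attrs.getNames():
            self.values.append((name, key, attrs.getValue(key)))


# hypothetical sample input; a real document would be parsed from a file
sample = b'<root><node attr="a&#9;b"/><node attr="c"/></root>'

collector = AttrCollector()
xml.sax.parseString(sample, collector)

for element, key, value in collector.values:
    print(element, key, repr(value))
```

the same attribute-value normalization applies here, so tabs still have
to be written as &#9; in the source to survive.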
all best,
luke
----- Luke Kenneth Casson Leighton <lkcl@samba-tng.org> -----
"i want a world of dreams, run by near-sighted visionaries"
"good. that's them sorted out. now, on _this_ world..."