Regular Expressions: Can't quite figure this problem out
rcdailey at gmail.com
Wed Sep 26 03:14:34 CEST 2007
On 9/25/07, J. Cliff Dyer <jcd at sdf.lonestar.org> wrote:
> Robert Dailey top posted:
> > Hmm, ElementTree.tostring() also adds a space between the last
> > character of the element name and the />. Not sure why it is doing
> > this.
> > Something like <root/> will become <root /> after the tostring().
> The space was common practice in pseudo-XHTML code when people still
> had to routinely support browsers like Netscape 4, which had no clue
> about XML. It turns a uniquely XML construct into valid HTML: the
> space makes unaware parsers treat the / as the next attribute. Being
> an attribute with unknown meaning, it is ignored by standard
> practice, and hence the tag is parsed properly by both XHTML parsers
> and plain HTML parsers. I guess the practice just caught on beyond
> the XHTML world.
> I don't know if there's a flag to get rid of it, but you can always
> dig into the code....
Right now I just run a trivial regular expression on the result of
tostring() to remove the spaces.
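
For anyone wanting to do the same, here is a minimal sketch of that kind
of post-processing. It is not the exact expression I use; the helper name
strip_self_closing_space and the pattern are only illustrative, and
encoding="unicode" assumes Python 3, where it makes tostring() return a
str. It also assumes the document has no comments or processing
instructions containing " />", since those are written out unescaped and
would be touched by the substitution as well.

```python
import re
import xml.etree.ElementTree as ET

# Whitespace immediately before the "/>" that closes an empty element.
_SELF_CLOSING_SPACE = re.compile(r"\s+/>")


def strip_self_closing_space(xml_text):
    """Return xml_text with the whitespace before every '/>' removed.

    Hypothetical helper for illustration only. ElementTree escapes '>'
    in text and attribute values, so in its serialized output '/>'
    normally appears only at the end of empty-element tags.
    """
    return _SELF_CLOSING_SPACE.sub("/>", xml_text)


root = ET.Element("root")
ET.SubElement(root, "child", name="a")

# encoding="unicode" (Python 3) yields a str instead of bytes.
xml_text = ET.tostring(root, encoding="unicode")
compact = strip_self_closing_space(xml_text)
print(compact)
```

This only changes how the result is spelled; <root/> and <root /> are
equivalent to any XML parser.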