[ANN] pyxser-0.2r --- Python XML Serialization
stefan_ml at behnel.de
Sun Apr 19 21:08:50 CEST 2009
Daniel Molina Wegener wrote:
> Stefan Behnel <stefan_ml at behnel.de>
> on Sunday 19 April 2009 02:25
> wrote in comp.lang.python:
>> Daniel Molina Wegener wrote:
>>> * Every serialization is made into unicode objects.
>> Hmm, does that mean that when I serialise, I get a unicode object back?
>> What about the XML declaration? How can a user create well-formed XML from
>> your output? Or is that not the intention?
> Yes, if you serialize an object you get an XML string as a
> unicode object, since unicode objects support UTF-8 and
> some other encodings.
That's not what I meant. I was wondering why you chose to use a unicode
string instead of a byte string (which XML is defined for). If your only
intention is to deserialise the unicode string into a tree, that may be
acceptable. However, as soon as you start writing the data to a file or
through a network pipe, or pass it to an XML parser, you'd better make it
well-formed XML. So you either need to encode it as UTF-8 (for which you do
not need a declaration), or you will need to encode it in a different byte
encoding, and then prepend a declaration yourself. In any case, this is a
lot more overhead (and more cumbersome for users) than writing out a correctly
serialised byte string directly.
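
To illustrate the two options described above, here is a minimal sketch. It
does not use pyxser; the `xml_text` value is a hypothetical stand-in for
whatever unicode string a serialiser returns, and the parsing is done with the
standard library's `xml.etree.ElementTree` purely to check the result.

```python
import xml.etree.ElementTree as ET

# Hypothetical serialiser output: an XML document held as a text (unicode)
# string, without an XML declaration.
xml_text = '<person><name>Zo\u00eb</name></person>'

# Option 1: encode as UTF-8. No declaration is needed, because a parser
# assumes UTF-8 when there is neither a declaration nor a byte order mark.
utf8_bytes = xml_text.encode('utf-8')

# Option 2: encode in a different byte encoding and prepend a declaration
# that names that encoding, so the parser can decode the bytes correctly.
latin1_bytes = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    + xml_text.encode('iso-8859-1')
)

# Both byte strings are well-formed XML and parse back to the same content.
name_utf8 = ET.fromstring(utf8_bytes).findtext('name')
name_latin1 = ET.fromstring(latin1_bytes).findtext('name')
```

Either way the user has to do the encoding step by hand, which is the extra
work a serialiser could avoid by returning bytes in the first place.
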
You seemed to be very interested in good performance, so I don't quite
understand why you want to require an additional step with a relatively
high performance impact that only makes it harder for users to use the tool.