iterparse and unicode
Fredrik Lundh
fredrik at pythonware.com
Wed Aug 27 05:42:29 EDT 2008
George Sakkis wrote:
>> if you meant to write "encode", you can indeed safely do
>> [s.encode('utf8') for s in strings] as long as all strings are returned
>> by an ET implementation.
>
> I was replying to the general assertion that "in 2.x ASCII byte
> strings and unicode strings are compatible", not specifically about
> the strings returned by ET.

that assertion was made in the context of ET. having to unilaterally
change the topic to "win" an argument is pretty lame.

and if you really meant to write "decode", you picked a rather stupid
example to support your complaint about ET not returning Unicode -- your
example does work fine for byte strings (whether they contain pure ASCII
or not), but doesn't work at all for arbitrary Unicode strings, because
decoding things that are already decoded makes very little sense (which
explains why the decode method was removed from Unicode strings in 3.0):

    >>> "hello".decode("utf-8")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'str' object has no attribute 'decode'

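for reference, here is a minimal sketch of the distinction in 3.x terms, where str holds Unicode text and bytes holds encoded data. the variable names (text, data, strings) are made up for illustration, and the snippet assumes a Python 3 interpreter:

```python
# Python 3 sketch: str is Unicode text, bytes is encoded data.
# All names below are illustrative, not taken from the thread.

text = "caf\u00e9"              # a Unicode string (str)
data = text.encode("utf-8")     # encoding goes from str to bytes
assert isinstance(data, bytes)
assert data == b"caf\xc3\xa9"

# decoding only makes sense for encoded data: bytes to str
assert data.decode("utf-8") == text

# a Unicode string has nothing left to decode, so str has no decode method
has_decode = hasattr(text, "decode")

# encoding every string in a list, in the style of the quoted list comprehension
strings = ["hello", text]
encoded = [s.encode("utf-8") for s in strings]
```

in 2.x the roles are played by unicode (text) and str (bytes), which is why the same names mean different things depending on the version.
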
are you sure you understand the distinction between Unicode strings and
encoded strings?
</F>