Python3.3 str() bug?
rosuav at gmail.com
Fri Nov 9 13:22:04 CET 2012
On Fri, Nov 9, 2012 at 10:08 PM, Helmut Jarausch
<jarausch at igpm.rwth-aachen.de> wrote:
> For me it's not funny, at all.
His description "funny" referred to the fact that you described this
as a bug. This is a heavily used, mature language; bugs as fundamental
as the one you imply are unlikely to exist. There will be consequences
of design decisions, but usually not outright bugs. Extraordinary
claims require extraordinary evidence.
> Whenever Python3 encounters a bytestring it needs an encoding to convert it to
> a string. If I feed a list of bytestrings or a list of list of bytestrings to
> 'str' , etc, it should use the encoding for each bytestring component of the
> given data structure.
> How can I convert a data structure of arbitrarily complex nature, which contains
> bytestrings somewhere, to a string?
Okay, now we're getting somewhere.
What you really should be doing is not transforming the whole
structure, but explicitly transforming each part inside it. I
recommend you stop fighting the language and start thinking about your
data as either *bytes* or *characters* and using the appropriate data
types (bytes or str) everywhere. You'll then find that it makes
perfect sense to explicitly translate (en/decode) from one to another,
but it doesn't make sense to encode a list in UTF-8 or decode a
dictionary from Latin-1.
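To make the bytes/characters split concrete, here is a minimal sketch
in Python 3. The sample data and the choice of UTF-8 and Latin-1 are
assumptions for illustration only; substitute whatever encoding your
input really uses.

```python
# Bytes as they might arrive from a file or socket (sample data, assumed UTF-8).
raw = b"Stra\xc3\x9fe"

# Decode once, at the boundary, naming the encoding explicitly.
text = raw.decode("utf-8")

# Encode again only when the data leaves the program.
latin1 = text.encode("latin-1")

# str() on bytes without an encoding does not decode; it returns the repr.
as_repr = str(raw)

# Passing an encoding to str() does decode.
decoded_via_str = str(raw, "utf-8")
```

Here as_repr comes out as a string that starts with b' because it is
the repr of the bytes object. Calling str() on a list that contains
bytes gives the same kind of result for each element.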
> This problem has arisen while converting a working Python2 script to Python3.3.
> Since Python2 doesn't have bytestrings it just works.
Actually, Python 2 does have bytestrings; it just calls them "str".
And there's a Unicode string type, called "unicode", which is (more or
less) the thing that Python 3 calls "str".
You may be able to do some kind of recursive cast that, in one sweep
of your data structure, encodes all str objects into bytes using a
given encoding (or the reverse thereof). But I don't think this is the
best way to do things.
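For what it's worth, such a recursive pass might look roughly like the
sketch below. The name decode_nested is hypothetical (nothing in the
standard library is called that), and it assumes that every bytes
object in the structure uses the same encoding and that the containers
are plain dicts, lists, tuples, sets and frozensets. Subclasses such as
namedtuples would need extra handling.

```python
def decode_nested(obj, encoding="utf-8"):
    """Return a copy of obj with every bytes object decoded to str."""
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        # Decode both keys and values.
        return {
            decode_nested(key, encoding): decode_nested(value, encoding)
            for key, value in obj.items()
        }
    if isinstance(obj, (list, tuple, set, frozenset)):
        # Rebuild the same container type from the converted items.
        return type(obj)(decode_nested(item, encoding) for item in obj)
    # Anything else (int, str, None, ...) is returned unchanged.
    return obj


data = [b"abc", [b"d\xc3\xa9", 1], {b"k": (b"v", None)}]
converted = decode_nested(data)
```

After that, str(converted) contains ordinary strings and no b'...'
literals. The encoding still has to be known and the same everywhere,
which is one reason decoding at the point where the bytes enter the
program is usually the cleaner option.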