[Python-Dev] unicode/string asymmetries

M.-A. Lemburg mal@lemburg.com
Thu, 10 Jan 2002 12:14:31 +0100


Thomas Heller wrote:
>
> > > How can I do the equivalent of
> > >   u"some string"
> > > in terms of
> > >   unicode("some string", encoding)
> For example the copyright symbol "©" (repr("©") gives "\xa9").
> Now I want to convert this string to unicode.
> u"©" works fine, unicode(variable) gives an ASCII decoding error.

u"something" maps to unicode("something", "latin-1"). This is because
Unicode literals in Python are interpreted as being Latin-1.
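
As a rough illustration of the difference described above, here is a
minimal sketch. It is written in current Python 3 spelling
(bytes.decode) so that it runs today, which is an assumption on top of
the original discussion; on the Python of this thread the equivalent
call would be unicode(variable, "latin-1"). The names raw, text and
ascii_ok are made up for the example.

```python
# Sketch only: 'raw' stands in for the 8-bit string from the question.
raw = b"\xa9"  # one byte, the copyright sign in Latin-1

# Decoding with an explicit Latin-1 codec maps byte 0xA9 to U+00A9.
text = raw.decode("latin-1")

# Decoding as ASCII fails, because 0xA9 is outside the 7-bit range.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

Naming the codec explicitly is what avoids the ASCII decoding error
reported in the question.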

See the source code encoding PEP (0263) for details on what could be
done to make this user-configurable.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/