Unicode and dictionaries
Ben Finney
ben+python at benfinney.id.au
Sat Jan 16 22:06:01 EST 2010
Carl Banks <pavlovevidence at gmail.com> writes:
> On Jan 16, 3:56 pm, Ben Finney <ben+pyt... at benfinney.id.au> wrote:
> > gizli <mehm... at gmail.com> writes:
> > > >>> test_dict = {u'öğe':1}
> > > >>> u'öğe' in test_dict.keys()
> > > True
> > > >>> 'öğe' in test_dict.keys()
> > > True
> >
> > I would call this a bug. The two objects are different, so the latter
> > expression should return ‘False’.
>
> Except the two objects are not different if default encoding is utf-8.

They are different, because a Unicode object is *not* encoded in any
character encoding, whereas the byte string object is.

The source code shows a Unicode *literal* represented in some encoding;
but, just like the source code sequence ‘1.0’ results in an
floating-point object, the source code sequence ‘u'öğe'’ results in a
Unicode object. Neither the floating-point object nor the Unicode object
has a character encoding, even though their representations in source
code did have one.
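
A minimal sketch of that distinction, assuming Python 3 (where text and
byte strings never compare equal) and a UTF-8 encoded source file. The
choice of ISO-8859-9 as a second encoding is only illustrative; it is not
taken from the thread.

```python
# Assumes Python 3 and a UTF-8 encoded source file.
text = 'öğe'  # a Unicode (text) object: three code points, no encoding

# Encoding is an explicit step, and the result depends on the codec chosen.
utf8_bytes = text.encode('utf-8')         # 5 bytes: ö and ğ take two bytes each
latin5_bytes = text.encode('iso-8859-9')  # 3 bytes: one byte per character

print(len(text), len(utf8_bytes), len(latin5_bytes))  # 3 5 3

# The byte strings differ from each other, yet both decode to the same text.
print(utf8_bytes == latin5_bytes)                 # False
print(utf8_bytes.decode('utf-8') == text)         # True
print(latin5_bytes.decode('iso-8859-9') == text)  # True

# A byte string is therefore not a key in a dictionary keyed by text.
test_dict = {text: 1}
print(text in test_dict)        # True
print(utf8_bytes in test_dict)  # False
```

Under Python 2 the last comparison can behave differently, because byte
strings are implicitly decoded using the default encoding when compared
with Unicode objects; that implicit step is what the thread is about.
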
The Effbot explains it <URL:http://effbot.org/zone/unicode-objects.htm>
in more detail.
--
\ “[W]hoever is able to make you absurd is able to make you |
`\ unjust.” —Voltaire |
_o__) |
Ben Finney