Re: 'äÄöÖüÜ' in Unicode (utf-8)

Chris Angelico rosuav at gmail.com
Thu Mar 31 12:59:32 EDT 2022


On Fri, 1 Apr 2022 at 03:45, Dennis Lee Bieber <wlfraed at ix.netcom.com> wrote:
>
> On Thu, 31 Mar 2022 00:36:10 -0700 (PDT), moi <wxjmfauth at gmail.com>
> declaimed the following:
>
> >>>> 'äÄöÖüÜ'.encode('utf-8')
> >b'\xc3\xa4\xc3\x84\xc3\xb6\xc3\x96\xc3\xbc\xc3\x9c'
> >>>> len('äÄöÖüÜ'.encode('utf-8'))
> >12
> >>>>
> >>>> ?
>
>         Is there a question in there somewhere?
>
>         Crystal ball is hazy...
>
>         However... Note that once you encode the Unicode literal, you have a
> BYTE string. There are 12 bytes in that binary -- it is NOT considered
> Unicode at that point (only when you decode it with the same CODEC will it
> be Unicode).
>
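
The str/bytes distinction described in the quoted text can be checked in a
few lines. This is only a sketch: the variable names are made up for
illustration, and the latin-1 decode is an added example of what a
mismatched codec does, not something from the original post.

```python
# The str holds 6 code points; its UTF-8 encoding holds 12 bytes,
# because each of these characters needs two bytes in UTF-8.
text = 'äÄöÖüÜ'
encoded = text.encode('utf-8')

print(type(text).__name__, len(text))        # str, counted in code points
print(type(encoded).__name__, len(encoded))  # bytes, counted in bytes

# Decoding with the same codec recovers the original text.
decoded = encoded.decode('utf-8')
print(decoded == text)

# Decoding with a different codec (latin-1 here, as an assumption for
# illustration) does not raise, but maps every byte to one character,
# so the result is 12 characters of mojibake instead of the original 6.
mojibake = encoded.decode('latin-1')
print(len(mojibake), mojibake == text)
```

So len() is answering two different questions depending on whether it is
handed a str or a bytes object.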

That's jmf. Ignore him. He knows nothing about Unicode and is
determined to make everyone aware of that fact.

He got blocked from the mailing list ages ago, and I don't think
anyone's regretted it.

ChrisA

