[Python-3000] Pre-PEP: Easy Text File Decoding

"Martin v. Löwis" martin at v.loewis.de
Sat Oct 14 20:54:06 CEST 2006


Marcin 'Qrczak' Kowalczyk wrote:
> It changes the interpretation of some filenames which are valid UTF-8
> (or generally of texts known to not contain '\0'). My hack is a pure
> extension since U+0000 can't be produced by standard UTF-8.

That's not true. See RFC 2279:

# Character values from 0000 0000 to 0000 007F (US-ASCII repertoire)
# correspond to octets 00 to 7F (7 bit US-ASCII values).

So U+0000 is represented by the octet 00.
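
This is easy to check. A minimal sketch, assuming Python 3 and its built-in "utf-8" codec (the variable names are only for illustration and do not come from the RFC):

```python
# Encode U+0000 with the standard UTF-8 codec: the result is the octet 00.
encoded = "\u0000".encode("utf-8")

# Decoding the octet 00 gives back U+0000, so it is valid UTF-8 input.
decoded = b"\x00".decode("utf-8")

# Every code point in the US-ASCII range maps to the octet of the same value.
ascii_is_identity = all(
    chr(cp).encode("utf-8") == bytes([cp]) for cp in range(0x80)
)

print(encoded, decoded == "\u0000", ascii_is_identity)
```

So a decoder that sees the octet 00 in otherwise valid UTF-8 has to produce U+0000, which means U+0000 is not free to be reused as an escape marker.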

Regards,
Martin

