[Python-3000] [Python-Dev] Filename as byte string in python 2.6 or 3.0?
Marcin 'Qrczak' Kowalczyk
qrczak at knm.org.pl
Tue Sep 30 21:46:36 CEST 2008
2008/9/30 Marcin 'Qrczak' Kowalczyk <qrczak at knm.org.pl>:
> I've experimentally implemented (not for Python) a different escaping
> scheme with a similar goal as UTF-8b: undecodable bytes are prefixed
> with U+0000 instead of being converted to unpaired surrogates, and
> '\x00' decodes as U+0000 U+0000.
This was not my idea: Mono did that first.
http://go-mono.com/docs/index.aspx?link=T%3AMono.Unix.UnixEncoding
"In short, it's a Glorious Hack. Rejoice. Or something."
Note that many people, including those on the Unicode list, consider
this evil because they view it as a non-standard modification of
UTF-8. I am undecided on how evil it is.
(My implementation differs from Mono's in how strict it is about which
Unicode sequences can be encoded: Mono encodes all of them and mine
does not. On the other hand, mine is a bijection and Mono's is not.
Both implementations decode all byte sequences, of course.)
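
To make the scheme concrete, here is a minimal Python sketch. It is
neither the implementation described above nor Mono's. The names ESC,
decode and encode are hypothetical, and representing an undecodable byte
as U+0000 followed by the code point equal to the byte value is an
assumption. The round-trip check at the end of encode is just one simple
way to get the strict, bijective behaviour; it is not taken from either
implementation.

```python
ESC = "\x00"  # hypothetical name for the U+0000 escape prefix


def decode(data: bytes) -> str:
    """Decode any byte string; never fails."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b == 0:
            # A real NUL byte is doubled so it cannot be mistaken for an escape.
            out.append(ESC + ESC)
            i += 1
        elif b < 0x80:
            out.append(chr(b))
            i += 1
        else:
            # Try the shortest valid multi-byte UTF-8 sequence starting here.
            for length in (2, 3, 4):
                try:
                    ch = data[i:i + length].decode("utf-8")
                except UnicodeDecodeError:
                    continue
                out.append(ch)
                i += length
                break
            else:
                # Undecodable byte: U+0000 followed by the byte value (assumed).
                out.append(ESC + chr(b))
                i += 1
    return "".join(out)


def encode(text: str) -> bytes:
    """Strict encoder: accepts only strings that decode() can produce."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == ESC:
            if i + 1 >= len(text):
                raise ValueError("dangling U+0000 escape")
            code = ord(text[i + 1])
            if code != 0 and not 0x80 <= code <= 0xFF:
                raise ValueError("invalid escaped value")
            out.append(code)
            i += 2
        else:
            out += ch.encode("utf-8")  # rejects lone surrogates
            i += 1
    result = bytes(out)
    # Reject inputs whose escaped bytes would combine into valid UTF-8,
    # since those would not survive a round trip.
    if decode(result) != text:
        raise ValueError("string is not in the image of decode()")
    return result
```

With this sketch, encode(decode(b)) == b holds for every byte string b,
while a string such as U+0000 U+00C3 U+0000 U+00A9 is rejected by encode
because its bytes would decode as a single valid character instead.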
--
Marcin Kowalczyk
qrczak at knm.org.pl
http://qrnik.knm.org.pl/~qrczak/