[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
Toshio Kuratomi
a.badger at gmail.com
Fri Apr 24 23:26:12 CEST 2009
Glenn Linderman wrote:
> On approximately 4/24/2009 11:40 AM, came the following characters from
> And so my encoding (1) doesn't alter the data stream for any valid
> Windows file name, and where the naivest of users reside (2) doesn't
> alter the data stream for any Posix file name that was encoded as UTF-8
> sequences and doesn't contain ? characters in the file name [I perceive
> the use of ? in file names to be rare on Posix, because of experience,
> and because of the other problems caused by such use] (3) doesn't
> introduce data puns within applications that are correctly coded to know
> the encoding occurs. The encoding technique in the PEP not only can
> produce data puns, thus not being reversible, it provides no reliable
> mechanism to know that this has occurred.
>
Uhm... Not arguing with your goals, but '?' is unfortunately reasonably
easy to get into a filename. For instance, I've had to download a lot
of scratch-built packages from our buildsystem recently. Scratch builds
have URLs with query strings in them, so::

    wget 'http://koji.fedoraproject.org/koji/getfile?taskID=1318059&name=monodevelop-debugger-gdb-2.0-1.1.i586.rpm'

Which results in the filename:

    getfile?taskID=1318059&name=monodevelop-debugger-gdb-2.0-1.1.i586.rpm
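
To illustrate how the '?' ends up on disk, here is a rough Python sketch of the naming rule at work. It is only an approximation: wget's real logic has more cases (options that restrict file names, server-supplied names, and so on), and the helper name wget_style_filename is made up for this example.

```python
from urllib.parse import urlsplit


def wget_style_filename(url: str) -> str:
    """Approximate a default download name: last path component plus query.

    Hypothetical helper for illustration only; it does not reproduce
    every rule wget applies when choosing a local file name.
    """
    parts = urlsplit(url)
    # Everything after the final '/' in the path, e.g. "getfile".
    name = parts.path.rsplit("/", 1)[-1]
    # The query string is kept verbatim, which is where '?' comes from.
    if parts.query:
        name += "?" + parts.query
    # Fallback name (assumed here) for URLs that end in '/'.
    return name or "index.html"


url = (
    "http://koji.fedoraproject.org/koji/getfile"
    "?taskID=1318059&name=monodevelop-debugger-gdb-2.0-1.1.i586.rpm"
)
filename = wget_style_filename(url)
print(filename)
print("?" in filename)
```

So any scheme that treats '?' as an escape character would have to cope with names like this one, which are perfectly ordinary on a POSIX filesystem.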
-Toshio