[Python-3000] [Python-Dev] Filename as byte string in python 2.6 or 3.0?
Adam Olsen
rhamph at gmail.com
Tue Sep 30 01:23:52 CEST 2008
On Mon, Sep 29, 2008 at 5:14 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Adam Olsen wrote:
>> There's no solution except to not
>> decode, and 8859-1 is the way to do that.
>
> I think you need to elaborate on that. What does ISO-8859-1 have to do
> with a Python datatype in this context: which datatype, and what
> algorithm on it are you specifically referring to?
>
> When I do (in 2.x)
>
> py> "foo".decode("iso-8859-1")
> u'foo'
>
> ISTM that 8859-1 is all about decoding, so I don't understand why
> you say it is a way not to decode.
8859-1 has no invalid bytes and is a 1-to-1 mapping: every byte value
maps to the code point with the same number. If you have an API that
always returns unicode but accepts an encoding, you can pass it 8859-1,
then reencode the result using 8859-1 to get back the original bytes.
An ugly hack, but more correct than UTF-8b or any similar attempt to
do "unicode but not quite unicode"; either it's lossy, or it's not
unicode. There's no in between.
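
The round trip described above can be sketched as follows. This is only a minimal illustration, written in Python 3 syntax, where byte strings and unicode strings are separate types. The sample filename and the variable names are made up for the example and do not come from any real API.

```python
# Sketch: ISO-8859-1 used as a lossless bytes <-> unicode passthrough.
# All names below are illustrative assumptions, not part of any real API.

# Every possible byte value, 0x00 through 0xFF.
raw = bytes(range(256))

# Decoding as ISO-8859-1 never fails: each byte becomes the code point
# with the same numeric value.
text = raw.decode("iso-8859-1")
assert [ord(c) for c in text] == list(range(256))

# Re-encoding with the same codec gives back exactly the original bytes.
roundtrip = text.encode("iso-8859-1")
assert roundtrip == raw

# A hypothetical filename that is not valid UTF-8 (a lone 0xE9 byte).
name = b"caf\xe9.txt"

try:
    name.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False  # a strict UTF-8 decode rejects this name

# The ISO-8859-1 detour carries the bytes through a unicode-only API.
as_text = name.decode("iso-8859-1")
recovered = as_text.encode("iso-8859-1")
assert recovered == name
```

Note that `as_text` is only a container for the bytes here: it is correct text only if the filename really was written in ISO-8859-1, which is why this is a hack rather than real decoding.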
--
Adam Olsen, aka Rhamphoryncus