[Python-3000] [Python-Dev] Filename as byte string in python 2.6 or 3.0?

Adam Olsen rhamph at gmail.com
Mon Sep 29 09:32:48 CEST 2008


On Sun, Sep 28, 2008 at 10:43 PM, James Y Knight <foom at fuhm.net> wrote:
> [1] UTF-8b has a similar property to 8859-1, in that all byte strings can be
> successfully round-tripped. It's not currently implemented in python core,
> but it's a pretty trivial encoding, and is available under the BSD license,
> see below.

UTF-8b doesn't work as intended.  It produces an invalid Unicode
object (one containing lone, garbage surrogates) that cannot be used
with external APIs or libraries that require valid Unicode.  If you
don't need Unicode, then your code should say so explicitly, and
8859-1 is ideal for that.
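
To make the difference concrete, here is a rough sketch, not the BSD
implementation James refers to.  It assumes a later Python 3, where the
"surrogateescape" error handler (PEP 383) implements the same idea as
UTF-8b.  The filename bytes are made up for illustration.

```python
# Hypothetical filename bytes that are not valid UTF-8.
raw = b"caf\xe9-\xff.txt"

# ISO 8859-1 maps every byte 0x00-0xFF to the code point with the same
# value, so any byte string decodes and round-trips losslessly.
as_latin1 = raw.decode("latin-1")
latin1_roundtrip = as_latin1.encode("latin-1") == raw

# The UTF-8b approach (here via surrogateescape) also round-trips, but
# each undecodable byte becomes a lone surrogate in U+DC80..U+DCFF.
as_utf8b = raw.decode("utf-8", "surrogateescape")
utf8b_roundtrip = as_utf8b.encode("utf-8", "surrogateescape") == raw

# A consumer that requires valid Unicode (modelled here by a strict
# UTF-8 encode) rejects the string because of those lone surrogates.
try:
    as_utf8b.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

print(latin1_roundtrip, utf8b_roundtrip, strict_ok)
```

Both decodings get the original bytes back, but only the 8859-1 string
is made entirely of valid code points.  The price is that its text is
not meaningful as characters, which is why the code should say so.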


-- 
Adam Olsen, aka Rhamphoryncus

