[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
rhamph at gmail.com
Wed Feb 15 06:11:49 CET 2006
On 2/14/06, Guido van Rossum <guido at python.org> wrote:
> On 2/13/06, Adam Olsen <rhamph at gmail.com> wrote:
> > If I understand correctly there are three main candidates:
> > 1. Direct copying to str in 2.x, pretending it's latin-1 in unicode in 3.x
> I'm not sure what you mean, but I'm guessing you're thinking that the
> repr() of a bytes object created from bytes('abc\xf0') would be
> under this rule. What's so bad about that?
> > 2. Direct copying to str/unicode if it's only ascii values, switching
> > to a list of hex literals if there's any non-ascii values
> That works for me too. But why hex literals? As MvL stated, a list of
> decimals would be just as useful.
PEBKAC. Yeah, decimals are simpler and shorter even.
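For concreteness, here is a rough sketch of what option 2 might look like. It is written against modern Python 3 rather than anything that existed at the time of this thread, and the function names `repr_option_2` and `hex_variant` are made up for illustration; neither is a proposed API.

```python
def repr_option_2(data: bytes) -> str:
    """Hypothetical repr for option 2 (illustrative name, not a real API).

    Pure-ASCII data is copied straight to text; anything else falls
    back to a list of decimal byte values.
    """
    if all(b < 128 for b in data):
        return data.decode("ascii")
    return repr(list(data))


def hex_variant(data: bytes) -> str:
    """The same fallback spelled with hex literals, for comparison."""
    return "[" + ", ".join(hex(b) for b in data) + "]"


ascii_case = repr_option_2(b"abc")
decimal_case = repr_option_2(b"abc\xf0")
hex_case = hex_variant(b"abc\xf0")

# The decimal form comes out shorter than the hex form for the same data.
print(ascii_case)
print(decimal_case)
print(hex_case)
```

Under these assumptions the decimal list is the shorter of the two fallbacks, which matches the point above.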
> > 3. b"foo" literal with ascii for all ascii characters (other than \
> > and "), \xFF for individual characters that aren't ascii
> > Given the choice I prefer the third option, with the second option as
> > my runner up. The first option just screams "silent errors" to me.
> The 3rd is out of the running for many reasons.
> I'm not sure I understand your "silent errors" fear; can you elaborate?
I think it's that someone will create a unicode object with real
latin-1 characters and it'll get passed through without errors, because
the code assumes it's 8bit-as-latin-1. If they had put in other unicode
characters, they would have gotten an exception instead.
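A minimal sketch of that worry, assuming Python 3 semantics where `str` is unicode and the conversion is an explicit latin-1 encode. The helper name `encode_as_latin1` is hypothetical, and the actual proposals in this thread may behave differently.

```python
def encode_as_latin1(text: str) -> bytes:
    """Hypothetical helper: treat text as 8bit-as-latin-1 data."""
    return text.encode("latin-1")


# A genuine latin-1 character (e-acute, U+00E9) passes through silently,
# even if the caller never meant it to be treated as raw 8-bit data.
passed_through = encode_as_latin1("caf\u00e9")

# A character outside latin-1 (euro sign, U+20AC) fails loudly instead.
try:
    encode_as_latin1("\u20ac")
    raised = False
except UnicodeEncodeError:
    raised = True

print(passed_through, raised)
```

The first call succeeds without complaint, so a mistake involving real latin-1 text would go unnoticed, while the second is caught immediately.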
However, at this point all the posts on latin-1 encoding/decoding have
become so muddled in my mind that I don't know what they're
suggesting. I think I'll wait for the PEP to clear that up.
Adam Olsen, aka Rhamphoryncus