[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

Guido van Rossum guido at python.org
Tue Feb 14 00:10:50 CET 2006


On 2/13/06, M.-A. Lemburg <mal at egenix.com> wrote:
> Guido van Rossum wrote:
> > It'd be cruel and unusual punishment though to have to write
> >
> >   bytes("abc", "Latin-1")
> >
> > I propose that the default encoding (for basestring instances) ought
> > to be "ascii" just like everywhere else. (Meaning, it should really be
> > the system default encoding, which defaults to "ascii" and is
> > intentionally hard to change.)
>
> We're talking about Py3k here: "abc" will be a Unicode string,
> so why restrict the conversion to 7 bits when you can have 8 bits
> without any conversion problems ?

As Phillip guessed, I was indeed thinking about introducing bytes()
sooner than that, perhaps even in 2.5 (though I don't want anything
rushed).

Even in Py3k though, the encoding issue stands -- what if the file
encoding is Unicode? Then using Latin-1 to encode bytes by default
might not be what the user expected. Or what if the file encoding is
something totally different? (Cyrillic, Greek, Japanese, Klingon.)
Any default other than ASCII isn't going to work as expected. ASCII
isn't going to work as expected either, but it will complain loudly
(by throwing a UnicodeError) whenever you try it, rather than causing
subtle bugs later.
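
A minimal sketch of that trade-off, assuming Python 3 semantics as they
eventually shipped (str is Unicode, bytes is a separate type), not the
bytes() constructor still under discussion here. The sample string and
the variable names are purely illustrative:

```python
# Illustrative only: Python 3 semantics, not the 2006 proposal.
text = "caf\u00e9"  # contains one non-ASCII character (e with acute accent)

# ASCII complains loudly as soon as it meets a character it cannot encode.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeError:
    ascii_failed = True

# Latin-1 and UTF-8 both succeed, but they produce different bytes.
latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")

# Guessing the wrong 8-bit encoding later raises nothing; it just
# yields the wrong text (two characters where there was one).
mojibake = utf8_bytes.decode("latin-1")
```

The ASCII attempt fails at the point of the mistake, while the Latin-1
decode of UTF-8 data succeeds quietly and hands back corrupted text.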

> While we're at it: I'd suggest that we remove the auto-conversion
> from bytes to Unicode in Py3k and the default encoding along with
> it.

I'm not sure which auto-conversion you're talking about, since there
is no bytes type yet. If you're talking about the auto-conversion from
str to unicode: the bytes type should not be assumed to have *any*
properties that the current str type has, and that includes
auto-conversion.
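
For illustration, this is roughly how "no auto-conversion" looks in
Python 3 as it was later released; it is a sketch of the idea, not of
any design settled in this thread, and the names are hypothetical:

```python
# Illustrative only: Python 3 behaviour, where bytes and str never mix implicitly.
data = b"abc"
text = "abc"

# Mixing the two types is an error rather than a silent conversion.
try:
    data + text
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Conversions must be spelled out, with an explicit encoding each way.
decoded = data.decode("ascii")
encoded = text.encode("ascii")
```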

> In Py3k the standard lib will have to be Unicode compatible
> anyway and string parser markers like "s#" will have to go away
> as well, so there's not much need for this anymore.
>
> (Maybe a bit radical, but I guess that's what Py3k is meant for.)

Right.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

