[Python-ideas] Strings can sometimes convert to bytes without an encoding
Ethan Furman
ethan at stoneleaf.us
Wed Jun 15 11:27:04 EDT 2016
On 06/14/2016 04:46 PM, Franklin? Lee wrote:
> On Tue, Jun 14, 2016 at 7:26 PM, Guido van Rossum wrote:
>> -1. Such a check for the contents of the string sounds exactly like the
>> Python 2 behavior we are trying to get away [from].
>
> But isn't it really just converting back and forth between two
> representations of the same thing? A str with char width 1 is
> conceptually an ASCII string; you're just changing how it's exposed to
> the program.
The main reason Python 3 is not Python 2 is because text is text and
bytes are bytes and there will be no more automagic encoding/decoding
betwixt the two.
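
A minimal sketch of what that separation looks like in practice on Python 3 (the variable names are only illustrative): mixing the two types raises, and every conversion has to be spelled out.

```python
text = "spam"    # str: a sequence of Unicode code points
data = b"spam"   # bytes: a sequence of integers in range(256)

# Mixing the two types raises instead of silently converting.
try:
    text + data
    mixing_failed = False
except TypeError:
    mixing_failed = True

# Constructing bytes from a str without naming an encoding also raises.
try:
    bytes(text)
    bare_bytes_failed = False
except TypeError:
    bare_bytes_failed = True

# The conversion has to be asked for explicitly, in either direction.
encoded = text.encode("ascii")
decoded = data.decode("ascii")
```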
On 06/15/2016 01:55 AM, Franklin? Lee wrote:
> UTF-8 is a default encoding for str.encode and bytes.decode. Latin-1
> is the internal encoding in CPython whenever possible, and
> PyASCIIObject is an internal struct in Python 3. It is not exactly
> alien to Python to choose ASCII as a default. If it is a bad idea, it
> is not original to me.
- CPython is not the only Python
- Latin-1 is an implementation detail, not a language guarantee
- PyASCIIObject is (probably) a name left over from Python 2 (massive
renames of various structures are usually needless code churn)
- it may not have been a bad idea when Python was created, but it is a
bad idea now
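
To make the objection to a content-dependent conversion concrete, here is a rough sketch. The helper implicit_ascii_bytes is hypothetical; it only stands in for the proposed behavior and is not an existing or planned API. The same call succeeds or fails depending purely on what the string happens to contain, which is the Python 2 failure mode.

```python
def implicit_ascii_bytes(text):
    """Hypothetical stand-in for the proposal: convert only if all-ASCII."""
    return text.encode("ascii")

results = {}
for sample in ("cafe", "caf\u00e9"):
    try:
        results[sample] = implicit_ascii_bytes(sample)
    except UnicodeEncodeError:
        # Works in testing with ASCII data, blows up on real-world input.
        results[sample] = None

# With an explicit encoding the outcome does not depend on the contents.
explicit = "caf\u00e9".encode("utf-8")
```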
Please put your energy elsewhere because this particular behavior is not
going to change.
--
~Ethan~