differences between 22 and python 23

Bengt Richter bokr at oz.net
Thu Dec 4 11:33:33 CET 2003

On 04 Dec 2003 08:28:16 +0100, martin at v.loewis.de (Martin v. Löwis) wrote:

>"Mike C. Fletcher" <mcfletch at rogers.com> writes:
>> AFAIK, that's the plan.  IIRC, rationale was that there would be some
>> other type for 8-bit data, while all "normal" strings would become
>> Unicode strings.  
>No. <type 'str'> will remain a byte string type for any foreseeable
>future. The only change that is likely to happen is this: To denote
>bytes > 128 in source code, you will need to use escape codes. 
Anyone considered extending the hex escape with delimiters to make
long runs more dense? E.g.,

   'ab\x00\x01\x02\x03cd'

being spellable as


   'ab' x'00010203' 'cd'
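
For comparison, a minimal sketch of how the dense spelling could be emulated with what the standard library already offers. The helper name hex_run is made up for illustration and only stands in for the proposed x'...' literal; binascii.unhexlify is the real call doing the work. The b'' prefix assumes an interpreter that has a byte string literal syntax.

```python
import binascii


def hex_run(digits):
    """Hypothetical stand-in for the proposed x'...' literal.

    Takes a string of hex digit pairs and returns the bytes they denote.
    """
    return binascii.unhexlify(digits)


# The long spelling, one \x escape per byte.
escaped = b'ab\x00\x01\x02\x03cd'

# The dense spelling, emulated by concatenating around the hex run.
dense = b'ab' + hex_run('00010203') + b'cd'
```

Both names end up bound to the same eight bytes, so the idea is purely a notational shortcut and not a new kind of value.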


>A change that might happen in the future is this: A string literal
>does not create an instance of <type 'str'>, but an instance of <type
>'unicode'>. However, IMO, this should only happen after a syntax for
>byte string literals has been introduced.
Still, the actual characters used in the _source_ representation will have to
be whatever the -*- xxx -*- thing says, right? -- including the characters
in the source representation of a string that might wind up utf-8 internally?
(so you could have several modules whose sources are encoded differently and
have the run time see a single unified internal representation of utf-8?
Or wchar/utf-16le?)
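
A rough sketch of the scenario being asked about, assuming an interpreter where plain string literals produce Unicode objects (the "might happen in the future" case above). Two module sources with different coding declarations spell the same name with different bytes, yet the run time sees one and the same string. The helper load_name and the two sample sources are invented for illustration; compile and exec are the real built-ins.

```python
def load_name(source_bytes):
    """Compile raw module source and return what it binds to 'name'.

    compile() reads the -*- coding -*- line to decide how to decode
    the rest of the bytes before parsing them.
    """
    namespace = {}
    exec(compile(source_bytes, '<sketch>', 'exec'), namespace)
    return namespace['name']


# Same text, two source encodings: o-umlaut is one byte (0xF6) in
# iso-8859-15 and two bytes (0xC3 0xB6) in utf-8.
latin_module = b'# -*- coding: iso-8859-15 -*-\nname = "L\xf6wis"\n'
utf8_module = b'# -*- coding: utf-8 -*-\nname = "L\xc3\xb6wis"\n'

from_latin = load_name(latin_module)
from_utf8 = load_name(utf8_module)
```

The source bytes differ, but from_latin and from_utf8 compare equal, which is the "single unified internal representation" in the question. What that representation is underneath (utf-8, utf-16, something else) is not visible from this level.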

>> >I'm very unimpressed with this decision if that's the case.
>> >
>> Doesn't make me ecstatic, either, as I like the simple 8-bit-clean
>> string type.  But maybe we'll luck out and it will turn out that I'm
>> all wet on this one :) .
>The byte string type is not going away. It is a useful type, e.g. when
>reading or writing to or from a byte stream.
Is this moving towards a single 8-bit str base type with various
encoding-specifying subtypes?
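
To make the question concrete, a purely hypothetical sketch of what an encoding-specifying subtype of the byte string type might look like. Nothing here is an existing or planned API; the class name EncodedBytes, its encoding attribute, and its text method are all invented for illustration, and the code assumes a byte string type named bytes.

```python
class EncodedBytes(bytes):
    """Hypothetical: an 8-bit string that remembers its own encoding."""

    def __new__(cls, data, encoding='ascii'):
        # bytes is immutable, so the payload is set in __new__.
        obj = super().__new__(cls, data)
        obj.encoding = encoding
        return obj

    def text(self):
        # Decode using the encoding this instance was tagged with.
        return self.decode(self.encoding)


latin = EncodedBytes(b'L\xf6wis', 'iso-8859-15')
utf8 = EncodedBytes(b'L\xc3\xb6wis', 'utf-8')
```

Both objects still behave as plain byte strings (different bytes, different lengths), but latin.text() and utf8.text() give the same Unicode result.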

Bengt Richter
