[Python-Dev] Byte string class hierarchy
"Martin v. Löwis"
martin at v.loewis.de
Thu Aug 19 00:38:31 CEST 2004
Jack Jansen wrote:
> genericbytes
>     mutablebytes
>     bytes
>         genericstring
>             string
>             unicode
I think this hierarchy is wrong. unicode is not a specialization of
genericbytes: a unicode string is made out of characters, not out
of bytes.
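A minimal sketch of the character/byte distinction, assuming present-day Python 3, where str plays the role of unicode and bytes is a separate type. That mapping is an assumption made for illustration; it is not part of the hierarchy quoted above.

```python
# Assumption: Python 3, where str holds characters and bytes holds bytes.
text = "L\u00f6wis"                # five characters: L, o-umlaut, w, i, s
encoded = text.encode("utf-8")     # the same text serialized as UTF-8

char_count = len(text)             # counts characters
byte_count = len(encoded)          # counts bytes; the umlaut takes two in UTF-8

first_char = text[0]               # indexing a str yields a one-character str
first_byte = encoded[0]            # indexing bytes yields an integer
```

The two lengths differ, and the element types differ, which is why treating one type as a specialization of the other is awkward.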
> The basic type for all bytes, buffers and strings is genericbytes. This
> abstract base type is neither mutable nor immutable, and has the
> interface that all of the types would share. Mutablebytes adds slice
> assignment and such. Bytes, on the other hand, adds hashing.
There is a debate on whether such a type is really useful. Why do you
need hashing on bytes?
> genericstring is the magic stuff that's there already that makes unicode
> and string interoperable for hashing and dict keys and such.
Interoperability, in Python, does not necessarily involve a common base
type.
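As a small illustration of that point, int and float interoperate as dict keys although they share no base class other than object. This sketch assumes Python 3 and is an analogy, not a statement about how string and unicode are implemented.

```python
# int and float are unrelated classes apart from the universal base, object.
common_bases = set(int.__mro__) & set(float.__mro__)

one_int = 1
one_float = 1.0

# Equal values hash equally, so either can look up a key stored as the other.
same_hash = hash(one_int) == hash(one_float)
table = {one_int: "found"}
lookup = table[one_float]
```

Interoperability here comes from agreeing on equality and hashing, not from inheritance.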
> Casting to a basetype is always free and doesn't copy anything
And, of course, there is no casting at all in Python.
> Operations like concatenation return the most specialised class.
Assuming the hierarchy at the top of your message, what does that mean?
Suppose I want to concatenate unicode and string: which of them is
more specialized?
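The ambiguity can be sketched with hypothetical classes that mirror the names in the quoted hierarchy. The classes and the more_specialized helper are invented for illustration, and reading "most specialised" as "the deeper subclass" is an assumption.

```python
# Hypothetical stand-ins for the types named in the quoted hierarchy.
class GenericString:
    pass


class String(GenericString):
    pass


class Unicode(GenericString):
    pass


def more_specialized(a, b):
    """Return the more derived of two classes, or None if neither derives from the other."""
    if issubclass(a, b):
        return a
    if issubclass(b, a):
        return b
    return None


parent_child = more_specialized(String, GenericString)   # well defined
siblings = more_specialized(String, Unicode)             # no answer
```

For a class and its base the rule gives an answer; for two siblings it gives none, so the concatenation rule would need an extra tie-breaking convention.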
> Read() is guaranteed only to return genericbytes, but if you open a file
> in textmode they'll returns strings, and we should add the ability to
> open files for unicode and probably mutablebytes too.
I think Guido's proposal is that read(), in text mode, returns Unicode
strings, and (probably) that there is no string type in Python anymore.
read() on binary files would return a mutable byte array.
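For comparison, a sketch of how this split looks in present-day Python 3. This reflects the language as it exists now and is not necessarily the proposal described above: text mode returns str, while binary mode returns an immutable bytes object, and a mutable bytearray has to be requested explicitly.

```python
import os
import tempfile

# Write a small UTF-8 file, then read it back in text and binary mode.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("L\u00f6wis")

    with open(path, "r", encoding="utf-8") as f:
        text_data = f.read()        # str: decoded characters

    with open(path, "rb") as f:
        binary_data = f.read()      # bytes: raw, immutable

# A mutable copy must be made explicitly.
mutable_data = bytearray(binary_data)
mutable_data[0] = ord("l")
```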
Regards,
Martin