[Python-3000] Should PyString (new bytes type) accept strings with encoding?

Christian Heimes lists at cheimes.de
Mon Oct 15 18:33:48 CEST 2007


I'm working on the renaming of str8 -> bytes and bytes -> buffer.
PyBytes (old bytes, new buffer) can take a string together with an
encoding and an optional errors argument:


>>> bytes(source="abc", encoding="ascii", errors="replace")
b'abc'
>>> str(b"abc", encoding="ascii")
'abc'

IMO this should work, but currently it fails:
>>> str8("abc", encoding="ascii")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'encoding' is an invalid keyword argument for this function

And this should fail with a TypeError, but currently it succeeds:
>>> str8("abc")
b'abc'


PyString's constructor doesn't take strings (PyUnicode). I'd like to add
support for strings to it. That makes the API of str, bytes and buffer
consistent and fixes a *lot* of broken code and tests.
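
To make the proposed rules concrete, here is a rough pure-Python sketch of
how the constructor could behave. The helper name make_bytes and the error
messages are hypothetical and only illustrate the idea; this is not the
actual C implementation.

```python
def make_bytes(source=b"", encoding=None, errors=None):
    """Hypothetical helper that mimics the proposed constructor rules."""
    if isinstance(source, str):
        # A text string is only accepted together with an explicit encoding.
        if encoding is None:
            raise TypeError("string argument without an encoding")
        return source.encode(encoding, errors or "strict")
    # encoding/errors make no sense for non-string sources.
    if encoding is not None or errors is not None:
        raise TypeError("encoding or errors without a string argument")
    return bytes(source)
```

With this sketch, make_bytes("abc", encoding="ascii") returns b'abc', while
make_bytes("abc") raises a TypeError instead of silently guessing an encoding.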

Are you confused by the name changes? I'm sometimes confused too, so I
made a table:

C name    | old   | new    | repr
----------+-------+--------+------------
PyUnicode | str   | -      | ''
PyString  | str8  | bytes  | b''
PyBytes   | bytes | buffer | buffer(b'')

Christian

