On 12 Jan 2014 03:29, "Ethan Furman" <ethan@stoneleaf.us> wrote:
On 01/11/2014 12:43 AM, Nick Coghlan wrote:
In particular, the bytes type is, and always will be, designed for pure binary manipulation [...]
I apologize for being blunt, but this is a lie.
Let's take a look at the methods defined by bytes:
>>> dir(b'')
['__add__', '__class__', '__contains__', '__delattr__', '__dir__',
 '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__',
 '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__',
 '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__',
 '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__',
 '__setattr__', '__sizeof__', '__str__', '__subclasshook__',
 'capitalize', 'center', 'count', 'decode', 'endswith', 'expandtabs',
 'find', 'fromhex', 'index', 'isalnum', 'isalpha', 'isdigit',
 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower',
 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex',
 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines',
 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper',
 'zfill']
Are you really going to insist that expandtabs, isalnum, isalpha, isdigit, islower, isspace, istitle, isupper, ljust, lower, lstrip, rjust, splitlines, swapcase, title, upper, and zfill are pure binary manipulation methods?

Do you think I don't know that? However, those are all *in-place* modifications. Yes, they assume ASCII compatible formats, but they're a far cry from encouraging combination of data from potentially different sources. I'm also on record as considering this a design decision I regret, precisely because it has resulted in experienced Python 2 developers failing to understand that the Python 3 text model is *different* and they may need to create a new type.
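To make the ASCII assumption concrete, here is a minimal Python 3 sketch. The sample values are illustrative and do not come from the thread. It suggests that these methods only act on byte values in the ASCII range and pass every other byte through untouched, whatever encoding produced it:

```python
# Illustrative values only; none of these appear in the original thread.
ascii_data = b"hello 123"
utf8_data = "café".encode("utf-8")      # b'caf\xc3\xa9'
latin1_data = "café".encode("latin-1")  # b'caf\xe9'

# ASCII letters are case-mapped; the result is a new bytes object.
upper_ascii = ascii_data.upper()

# Bytes >= 0x80 are left unchanged, regardless of the source encoding.
upper_utf8 = utf8_data.upper()
upper_latin1 = latin1_data.upper()

# The predicates likewise recognise only ASCII characters.
digits_ok = b"0123".isdigit()
alpha_non_ascii = latin1_data.isalpha()  # 0xe9 is not an ASCII letter

print(upper_ascii, upper_utf8, upper_latin1, digits_ok, alpha_non_ascii)
```

The same `isalpha()` call on the str `"café"` returns True, which is one small place where the two models visibly diverge.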
Let's take a look at the repr of bytes:
>>> bytes([48, 49, 50, 51])
b'0123'
Wow, that sure doesn't look like binary data!
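For context on the repr, a short Python 3 sketch (values other than the `[48, 49, 50, 51]` example are illustrative assumptions): the ASCII rendering is only a display convention, while indexing and iteration still yield integers, and bytes outside the printable ASCII range are shown as hex escapes.

```python
data = bytes([48, 49, 50, 51])

# The repr shows printable ASCII bytes as characters...
shown = repr(data)

# ...but the elements themselves are integers, not 1-character strings.
first = data[0]
as_ints = list(data)

# Slicing, unlike indexing, returns a bytes object.
head = data[:1]

# Non-printable and non-ASCII bytes fall back to \x escapes in the repr.
mixed = repr(bytes([0, 255, 48]))

print(shown, first, as_ints, head, mixed)
```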
Py3 did not go from three text models to two, it went to one good one (unicode strings) and one broken one (bytes). If the aim was indeed for pure binary manipulation, we failed. We left in bunches of methods which can *only* be interpreted as supporting ASCII manipulation.

No, no, no. We made some concessions in the design of the bytes type to *ease* development and debugging of ASCII compatible protocols *where we believed we could do so without compromising the underlying text model changes*. Many experienced Python 2 developers are now suffering one of the worst cases of paradigm lock I have ever seen as they keep trying to make the Python 3 text model the same as the Python 2 one instead of actually learning how Python 3 works and recognising that they may actually need to create a new type for their use case and then potentially seek core dev assistance if that type reveals new interoperability bugs in the core types (or encounters old ones).
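One way to see the text model difference being argued over is the following Python 3 sketch. The header name and value are hypothetical, chosen only for illustration. Python 3 refuses to combine bytes and str implicitly, so data from different sources has to be encoded or decoded explicitly:

```python
# Hypothetical protocol fragment, used only for illustration.
header = b"Content-Length: "
length = 42

# Mixing bytes and str is rejected rather than silently coerced.
error = None
try:
    header + str(length)
except TypeError as exc:
    error = exc

# The combination works once the text is explicitly encoded.
explicit = header + str(length).encode("ascii")

# Bytes and str never compare equal, even with identical ASCII content.
same = b"abc" == "abc"

print(type(error).__name__, explicit, same)
```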
Due to backwards compatibility we cannot now finish yanking those out, so either we live with a half-dead class screaming "I want to be ASCII! I want to be ASCII!" or add back the missing functionality.

No, we don't - we treat the core bytes type as PEP 460 does, by adding a *new* feature proposed by a couple of people writing native Python 3 libraries like asyncio that makes binary formats easier to deal with without carrying forward even *more* broken assumptions from the Python 2 text model. (Remember, I'm in favour of Antoine's updated PEP, because it's a real spec for a new feature, rather than yet another proposal to bolt on even more text specific formatting features from someone that has never bothered to understand the reasons for the differences between the two versions). People that want a full hybrid type back can then pursue the custom extension type approach.

Cheers,
Nick.
--
~Ethan~
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev