On Mar 11, 2020, at 03:07, Steven D'Aprano <steve@pearwood.info> wrote:
> But bytes are useful for more than just file names!
The paradigm example of this is HTTP. It’s mostly people working on HTTP clients, servers, middleware, and apps who pushed for the bytes methods in Python 3.x. IIRC, the PEP for bytes.__mod__ (461) had links to a lot of the discussion and history. It’s probably not an exaggeration to say that if you couldn’t parse HTTP headers as bytes with split, strip, etc. (and maybe bytes regexes as well), the entire 3.x transition would have gone a lot worse. And Python itself has been doing something similar (if simpler) since the early 2.x days to find the source file’s encoding declaration.

Now that we have surrogateescape, maybe you could go back and redo all that code with str methods, but it would be less efficient, harder rather than easier to follow, just as easy to get wrong, and harder to debug. (I recently tried something similar with a parser for the rigid-language text chunks in a binary chunked file format.)

That being said, none of this means the new methods necessarily have to be added to bytes. I think the bar is higher for bytes than for str. Writing your own split is a daunting task for a novice, and easy to get wrong, so bytes.split makes sense. But a prefix stripper is more a convenience than a must-have, and it might well be convenient enough for str but not quite enough for bytes. This thread has demonstrated people reinventing this wheel over and over in the wild for str, but has anyone found examples of people doing it for bytes?
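
For concreteness, here is a rough sketch of the kind of bytes-level code I mean. It is not taken from any real HTTP library; parse_headers and strip_prefix are made-up names for illustration, and the parser deliberately ignores things a real one has to handle (duplicate headers, obsolete line folding, malformed input).

```python
def parse_headers(raw: bytes) -> dict:
    """Split a raw HTTP header block into {lowercased name: value}.

    Hypothetical helper for illustration only, not a complete parser.
    """
    headers = {}
    # The header block ends at the first blank line; lines are CRLF-separated.
    head, _, _ = raw.partition(b"\r\n\r\n")
    for line in head.split(b"\r\n")[1:]:  # skip the request/status line
        name, sep, value = line.partition(b":")
        if not sep:
            continue  # no colon; a real parser would reject this line
        headers[name.strip().lower()] = value.strip()
    return headers


def strip_prefix(data: bytes, prefix: bytes) -> bytes:
    """Hand-rolled prefix stripper (hypothetical name), the wheel in question."""
    if prefix and data.startswith(prefix):
        return data[len(prefix):]
    return data


raw = (
    b"GET / HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Authorization: Bearer abc123\r\n"
    b"\r\n"
)
headers = parse_headers(raw)
token = strip_prefix(headers[b"authorization"], b"Bearer ")
```

Everything here stays in bytes from start to finish, with no decode step and no surrogateescape round trip. The strip_prefix half is the part whose value for bytes is in question: it is three lines, but it is also exactly the sort of thing people get subtly wrong with lstrip.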