On Sep 17, 2014, at 21:21, "Stephen J. Turnbull" firstname.lastname@example.org wrote:
Andrew Barnert writes:
The possibility of confusion might be increased if some of the options to bytes look like they should work for str. People will ask, "I can chunk bytes into groups of 4 with /4, why can't I do that with characters when the rest of the format specifier is the same?"
Isn't the answer to that kind of question "because you haven't written the PEP yet?"
Or "Repeat after me, 'bytes are not str' ... Very good, now do a set of 100 before each meal for a week."
As long as you don't ask for a set of 100 bytearrays, because they're not hashable.
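(For anyone who missed the joke, here is a minimal sketch of the hashability difference. The variable names are made up for illustration.)

```python
# bytes is immutable and hashable, so it can be a set member.
chunks = {b"spam", b"eggs"}

# bytearray is mutable, so it defines no hash and cannot go in a set.
try:
    {bytearray(b"spam")}
    bytearray_is_hashable = True
except TypeError:
    bytearray_is_hashable = False

print(bytearray_is_hashable)
```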
After all, there are things you can do with integer or float formats that you can't do with str and vice versa.
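A rough illustration of that point, using only the built-in format() and the existing format-spec mini-language (nothing here is part of the proposal under discussion, and the variable names are just for the example):

```python
# Numeric-only presentation types and options.
as_hex = format(255, "x")          # hexadecimal is defined for int
grouped = format(1234567, ",")     # thousands separator is numeric-only
fixed = format(3.14159, ".2f")     # fixed-point is defined for float

# For str, a precision means "truncate to this many characters".
truncated = format("spam and eggs", ".4")

# The same specs are rejected by the other type.
try:
    format("255", "x")
    str_accepts_hex = True
except ValueError:
    str_accepts_hex = False

try:
    format(255, ".4")
    int_accepts_precision = True
except ValueError:
    int_accepts_precision = False
```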
bytes are indeed very similar to str as streams of code units (octets vs. characters), but the specific usages for human-oriented text (including such unnatural languages as C and Perl) require some differences in semantics. The sooner people get comfortable with that, the better, of course, but I don't think the language should be prevented from evolving because many people are going to take a while to get the difference and its importance.
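A small sketch of where the parallel holds and where it breaks, assuming UTF-8 as the encoding (the sample string is arbitrary):

```python
text = "na\u00efve"                # "naive" with a diaeresis: 5 characters
data = text.encode("utf-8")        # the same text as 6 octets

char_count = len(text)
octet_count = len(data)

# Indexing str yields a one-character str; indexing bytes yields an int.
first_char = text[0]
first_octet = data[0]

# str.upper() knows about non-ASCII letters; bytes.upper() only touches
# ASCII octets and leaves the two-octet UTF-8 sequence alone.
upper_text = text.upper()
upper_data = data.upper()
```

Both types offer the same sequence-style API here, but the results only line up for pure ASCII content.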
I think we agree on all of that.
(By the way, is there a word for that Unicode ignorance and confusion? Something like "illiteracy" and "innumeracy", but probably spelled with a non-BMP character, maybe U+1F4A9?)
My point is that, given a choice between two APIs, one which reinforces the illusion that bytes are text and one which doesn't, the latter gets points. (And similarly for format vs. printf.)
Of course on the other hand, when str and bytes really _are_ perfect parallels in some way, making them gratuitously inconsistent just adds more things to learn and memorize.
At this point, I'm not sure whether that adds up to an argument for Nick's less-str-like version of his original proposal or against it, but I'm pretty sure it's a good argument for one or the other...
(Of course eventually they want to do something where the format isn't identical to printf, and many of them seem to go to StackOverflow or IRC and complain that there's a "bug in str.format" instead of just glancing at the docs, so maybe making them learn early isn't such a bad thing...)
Obviously, given the snotty remark above, I sympathize. But I doubt it's really going to help that. It's just going to give them one more thing to complain about.<wink/>
Yes, people can be amazingly good at avoiding learning.