
On Apr 17, 2020, at 01:58, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Apr 16, 2020 at 09:21:05PM -0700, Andrew Barnert via Python-ideas wrote:
But I don’t see why that rules out the “bare colon” form that I and someone else apparently both proposed in separate subthreads of this thread:
    { :a, "b": x, :c }
as shorthand for:
    { "a": a, "b": x, "c": c }
I did a double-take reading that, because I visually parsed it as:
{ :a, "b": x, :c }
and couldn't work out what was going on.
After saving this draft, closing the email, then reopening it, I read the proposed dict the same way. So I don't think it was just a momentary glitch.
I honestly think, as you suggested at the end, that this may be just you. You’ve had similar reactions to other syntax that nobody else replicated, and I think that’s happening again here.
At least one other person suggested the exact same syntax independently in this set of threads. Multiple people have responded to it, positively or negatively, without any of them having any trouble reading it as intended.
As one of the other Steves pointed out, everyone who writes Lisp deals with basically the same syntax (albeit for an only vaguely related, pretty different feature) and nobody has a problem with it there, and it’s not because most Lisp programmers are native speakers of Arabic or other RTL languages. And a wide variety of languages with very different syntax from Lisp (Ruby is closer to Python than to Lisp; Smalltalk is radically different from both Lisp and Python; etc.) borrowed its symbol syntax (sometimes changing the prefix character), and nobody has any trouble reading it in those languages either. And there’s similar syntax for all kinds of less similar things in all kinds of languages.
Because really, this is just prefix syntax, not infix syntax with a piece missing, and people don’t have any problem with prefixes. If anything, prefixes are more common than suffixes in programming languages and math notation and other things designed by western people with LTR scripts (and prepositions and subject-first normal sentence order and whatever else might be relevant).
I don’t disbelieve that you parsed this weirdly. Some people do read things idiosyncratically. When C had a proposal for designated initializer syntax, there was one person on the committee who just couldn’t see this:
    struct Spam spam = { .eggs = 2, .cheese = 3, .ham = 1 };
… without trying to figure out how you can take the `cheese` member of a `2,`. Even when it was split onto multiple lines:
    struct Spam spam = {
        .eggs = 2,
        .cheese = 3,
        .ham = 1
    };
… his brain still insisted that the dot must be an infix operator, not a prefix one. Pointing out that you can parse `-4` even though minus is normally an infix operator doesn’t help; understanding something intellectually doesn’t rewrite your mental parse rules.
If most C programmers had read it the same way he did, the feature would be unusable. Nobody would argue “we can teach the whole world to get used to it, and who cares if C is unreadable until we do?” But it turned out that nobody else had any problem reading it as intended, the feature was added in C99, and people love it. (The Python extending docs were even rewritten to strongly encourage people to use it.)
And the same is true here. If a lot of people can’t see `:spam` as meaning the value of `spam` keyed by the name `spam`, because their brain instead keeps looking for something on the left to attach it to, then the feature is unusable and I’d drop it. But if it’s just one guy with idiosyncratic internal parse rules, and everyone else reads it without even stumbling, I don’t think that should stop an otherwise useful proposal. And I don’t see anyone else stumbling.
(Of course there are other problems with it, the biggest one being that I’m not sure any change is needed at all. And there are legitimate differences on whether, if a change is needed, it should be a change to call syntax or **unpacking syntax or dict display syntax. And so on. I’m not sure I’m +1 on the proposal myself.)
As little as I like the original proposal, and as unconvinced as I am that it is necessary, I think it is better to have the explicit token (the key/parameter name) on the left, and the implicit token (blank) on the right:
key=
I suspect it may be because we read left-to-right in English and Python, so having the implicit blank come first is mentally like running into a pothole at high speed :-)
What examples can you think of (in English, Python, other popular languages, math notation, whatever) where there’s an infix-operator-like thing and the right token is elided and inferred implicitly? I’m sure there are some, but nothing comes to mind immediately. Meanwhile, I can think of lots of examples of the opposite, the token on the left being elided and inferred. You can read `-2` and infer “2 less than 0”, but you can’t write `2-` and expect anyone to infer “0 less than 2”. There’s the C designated initializer syntax, and BASIC `With` statements, and a variety of similar features across other languages, where you can infer what’s to the left of the dot. In fact, in Python, you can write `from .spam import eggs` with “this package” implied as the thing to the left of the dot (and you can’t `import spam.` with “all the publics” or any other meaning; you have to rewrite the whole statement as `from spam import *` if you want that), and nobody has any problem inferring the intended meaning.
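To make the prefix point concrete, here is a small sketch that asks Python's own parser (via the standard `ast` module) how it treats these forms. It assumes CPython 3.8 or later; `spam` and `eggs` are placeholder names, and nothing is actually imported, only parsed. Note that relative imports are spelled with the `from` form.

```python
import ast


def parses(src):
    """Return True if src is syntactically valid Python."""
    try:
        ast.parse(src)
    except SyntaxError:
        return False
    return True


# Prefix minus is a unary operation: there is no left operand at all.
neg = ast.parse("-2", mode="eval").body
print(type(neg).__name__, type(neg.op).__name__)  # UnaryOp USub

# A relative import: the package to the left of the dot is implied.
rel = ast.parse("from .spam import eggs").body[0]
print(rel.module, rel.level)  # spam 1

# The suffix forms have no meaning to the parser.
print(parses("2-"))            # False
print(parses("import spam."))  # False
```

So the left-elided forms already exist in the grammar, while the right-elided ones are rejected outright.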
There’s no problem for the parser. Make a trivial change to the grammar to add a `":" identifier` alternative to dict display items, and nothing becomes ambiguous.
Except to the human reader, which is the only ambiguity that *really* matters when you get down to it.
Why do you do this? You snipped out the very next sentence, which was about human readers, and took this out of context so you could imply that human readers weren’t taken into account, just so you could disagree with not taking them into account. And you don’t even really believe this. Obviously *both* ambiguities matter. Something that reads wonderfully but that nobody could write a parser for would be no more useful than something that’s easy to parse but that no human can understand. There are already enough issues to discuss; fabricating issues that aren’t there just to be contrary really isn’t necessary.
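As a rough illustration of the parser-side claim above (that a `":" identifier` item in a dict display makes nothing ambiguous), here is a minimal sketch of the rewrite as a token-level preprocessor. This is not the grammar change itself and not anything from CPython: `desugar` is a hypothetical helper, it only handles plain names, it ignores f-string replacement fields, and it relies on `tokenize` not checking syntax. The idea is that, as far as I can tell, a colon directly after `{` or `,` with a brace as the innermost open bracket cannot start anything valid today, while the same pattern inside square brackets is a slice and must be left alone.

```python
import io
import tokenize

OPENERS = {"(", "[", "{"}
CLOSERS = {")", "]", "}"}
SKIP = {tokenize.NL, tokenize.COMMENT}


def desugar(src):
    """Rewrite `:name` items inside {...} to `'name': name` (hypothetical)."""
    toks = list(tokenize.generate_tokens(io.StringIO(src).readline))
    out = []      # (type, string) pairs for untokenize
    stack = []    # currently open brackets, innermost last
    prev = None   # string of the previous significant token
    i = 0
    while i < len(toks):
        typ, s = toks[i][:2]
        nxt = toks[i + 1] if i + 1 < len(toks) else None
        if (typ == tokenize.OP and s == ":"
                and stack and stack[-1] == "{"
                and prev in ("{", ",")
                and nxt is not None and nxt.type == tokenize.NAME):
            # Bare-colon item: emit the name as a string key, then the name.
            out.append((tokenize.STRING, repr(nxt.string)))
            out.append((tokenize.OP, ":"))
            out.append((tokenize.NAME, nxt.string))
            prev = nxt.string
            i += 2
            continue
        if typ == tokenize.OP:
            if s in OPENERS:
                stack.append(s)
            elif s in CLOSERS and stack:
                stack.pop()
        out.append((typ, s))
        if typ not in SKIP:
            prev = s
        i += 1
    return tokenize.untokenize(out)


names = {"a": 1, "x": 2, "c": 3}
print(eval(desugar('{ :a, "b": x, :c }'), names))
```

The rewritten source evaluates to the same dict as `{ "a": a, "b": x, "c": c }`, and a slice such as `m[1, :n]` passes through without gaining a string key. None of this says anything about the human-readability question, which is a separate argument.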