Hi,

What do you think about adding this to Python:

    'whatever a long string' in x or in y

I've often wished for this because the current way is quite verbose:

    'whatever a long string' in x or 'whatever a long string' in y

We might add "and in" while we're at it.

Ram.
On 02/12/2014 02:12 PM, Ram Rachum wrote:
What do you think about adding this to Python:
'whatever a long string' in x or in y
I've often wished for this because the current way is quite verbose:
'whatever a long string' in x or 'whatever a long string' in y
Except you'd never actually do that, you'd just put the long string in a variable. Or the other option:

    any('whatever a long string' in i for i in [x, y])

(And 'all' similarly covers the 'and' case.)

Carl
On Feb 12, 2014, at 15:55, Carl Meyer
On 02/12/2014 02:12 PM, Ram Rachum wrote:
What do you think about adding this to Python:
'whatever a long string' in x or in y
I've often wished for this because the current way is quite verbose:
'whatever a long string' in x or 'whatever a long string' in y
For future reference, using expression_with_side_effects() instead of 'whatever a long string' would be a more compelling use case. With a long string, it's inconvenient and ugly to repeat it; with an expression with side effects, it's all that plus incorrect to boot. But of course the same solutions still work.
Except you'd never actually do that, you'd just put the long string in a variable. Or the other option:
any('whatever a long string' in i for i in [x, y])
Or, if you're doing this so often that even this is too verbose:

    def in_any(element, *containers):
        return any(element in container for container in containers)

    in_any('whatever a long string', x, y, z, w)

If you're not doing it often enough for in_any to become familiar, then you didn't have a problem to solve in the first place. :)
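A minimal sketch of the side-effects point made above, assuming a hypothetical `expression_with_side_effects()` that records each call. It compares the written-out form with the `in_any` helper; the containers `x` and `y` are made up for illustration.

```python
calls = []

def expression_with_side_effects():
    """Hypothetical stand-in: records each call so evaluations can be counted."""
    calls.append(1)
    return "needle"

def in_any(element, *containers):
    return any(element in container for container in containers)

x, y = "hay", "hay needle hay"

# Written out by hand, the expression is evaluated once per containment test.
repeated = expression_with_side_effects() in x or expression_with_side_effects() in y
calls_repeated = len(calls)

calls.clear()

# Passed as an argument, it is evaluated exactly once before in_any runs.
once = in_any(expression_with_side_effects(), x, y)
calls_once = len(calls)
```

Both forms give the same answer here, but only the helper keeps the side effect to a single evaluation.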
Ram Rachum
I've often wished for this because the current way is quite verbose:
'whatever a long string' in x or 'whatever a long string' in y
No need to repeat the literal:

    >>> x = 'foo'
    >>> y = 'bar whatever a long string bar'
    >>> z = 'baz'
    >>> any('whatever a long string' in item for item in [x, y, z])
    True

Ben Finney
On Wed, Feb 12, 2014 at 01:12:17PM -0800, Ram Rachum wrote:
Hi,
What do you think about adding this to Python:
'whatever a long string' in x or in y
I like it. In natural language, people often say things like:

    my keys are in the car or in my pocket

which fools them into writing:

    keys in car or pocket

which does the wrong thing. Chained "in" comparisons are a natural extension to Python's already natural language-like syntax. Python already has other chained comparisons. Being able to write:

    keys in car or in pocket

feels natural and right to me. (We can't *quite* match the human idiom where the second "in" is left out, but one can't have everything.)

This is particularly useful when there are side-effects involved:

    something_with_side_effects() in this and in that or in other

I'm not usually one for introducing syntax just to avoid a temporary variable or extra line:

    temp = something_with_side_effects()
    temp in this and temp in that or temp in other

but I think that chained comparisons are one of Python's best syntactic features, and this just extends it to "in".

The only two concerns I have are:

- given the restrictions on the parser, is this even possible? and

- the difference between "x in y and z" and "x in y and in z" is quite subtle, and hence may be an unfortunately common source of errors.

So a tentative +1 on the idea.

-- Steven
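To make the "wrong thing" concrete, here is a small sketch in today's Python. The lists and the item `'phone'` are invented for illustration; the point is only that `item in car or pocket` groups as `(item in car) or pocket`.

```python
car = ["satchel", "jacket"]
pocket = ["hankie", "keys"]

# 'phone' is in neither container, yet the short form is truthy:
# it evaluates to the list `pocket` itself, not to a containment result.
fooled = "phone" in car or pocket

# The fully written-out test gives the intended answer.
intended = "phone" in car or "phone" in pocket
```

The short form never tests membership in `pocket` at all, which is the subtle error the message warns about.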
On Thu, Feb 13, 2014 at 9:57 PM, Steven D'Aprano
- given the restrictions on the parser, is this even possible? and
Generalizing the syntax, I'd see this as:

    operand1 binary-op1 operand2 {and|or} binary-op2 operand3

which implicitly places the value (not the code) of operand1 between and/or and binary-op2. Since all binary operators have higher precedence than either and or or (all that's lower is lambda and if/else), this notation is currently guaranteed to fail... except in two cases, namely + and -, which exist in unary form as well. So if + and - are excluded, it should be unambiguous. Whether or not the parser can actually handle it is a question for someone who knows what he's talking about, though :)

The way I see it, there should ideally be no syntactic rule against using different operators on the two sides:

    input("> ") in legalcommands and not in forbiddencommands
    value > 20 or in {3,5,7,11}

even though it would allow insanity:

    if 5 < int(input("Enter a number: ")) or < int(input("Greater than five please: ")) or < int(input("Come on now! ")) or == print("Bah, I give up."):
        print("Thank you.")

In this way, it's like chained comparisons:

    1 < x <= 10

which don't mind mixing and matching.

ChrisA
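The claim that the notation currently fails except for `+` and `-` can be checked with the standard `ast` module. This is a sketch against current CPython behaviour; the helper name `parses` is made up here.

```python
import ast

def parses(source):
    """Return True if `source` is a valid Python expression today."""
    try:
        ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    return True

# The proposed forms are syntax errors in current Python...
proposed_in = parses("a in b or in c")
proposed_lt = parses("a < b or < c")

# ...but + and - already have a meaning after and/or, as unary operators.
unary_minus = parses("a < b or - c")
unary_plus = parses("a < b or + c")
```

So `a < b or - c` already means `(a < b) or (-c)`, which is why those two operators would have to be excluded.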
On 2014-02-13 11:27, Chris Angelico wrote:
On Thu, Feb 13, 2014 at 9:57 PM, Steven D'Aprano wrote:
- given the restrictions on the parser, is this even possible? and
Generalizing the syntax, I'd see this as:
operand1 binary-op1 operand2 {and|or} binary-op2 operand3
which implicitly places the value (not the code) of operand1 between and/or and binary-op2. Since all binary operators have higher precedence than either and or or (all that's lower is lambda and if/else), this notation is currently guaranteed to fail... except in two cases, namely + and -, which exist in unary form as well. So if + and - are excluded, it should be unambiguous. Whether or not the parser can actually handle it is a question for someone who knows what he's talking about, though :)
The way I see it, there should ideally be no syntactic rule against using different operators on the two sides:
    input("> ") in legalcommands and not in forbiddencommands
    value > 20 or in {3,5,7,11}
even though it would allow insanity:
    if 5 < int(input("Enter a number: ")) or < int(input("Greater than five please: ")) or < int(input("Come on now! ")) or == print("Bah, I give up."):
        print("Thank you.")
In this way, it's like chained comparisons:
1 < x <= 10
which don't mind mixing and matching.
I think it would require 3-token lookahead (there would be new binary operators such as "and in" and "and not in"). It's certainly achievable.
I think we're talking about the wrong thing by focusing on whether we want
"or in" or "or" or other finicky syntax things. In terms of whether we can
express the logic the OP wanted, we already can:
    >>> a = {1, 2, 3}
    >>> b = {4, 5, 6}
    >>> c = 5
    >>> c in a | b
    True
Perfectly concisely, with the exact semantics we want, e.g. when we want to
chain more things to the *|* clause, or we want to include *&* clauses, or
other things.
Maybe we want to extend *|* to other sequences which *in* works on. Maybe
we could make *|* chaining on non-sets lazy so we don't build unnecessary
data structures. Maybe we want to add a fast-path in the interpreter to
speed up this special case. Maybe we want to overload *or* on sets and seqs
to behave similarly to how *|* does.
On the other hand, I think introducing new syntax for something *that
already works with almost the same syntax and exactly the desired
semantics* would
be a terrible idea.
On Thu, Feb 13, 2014 at 10:34 AM, MRAB
On 2014-02-13 17:04, אלעזר wrote:
Another issue: people will expect

    for i in a and not in b:
        print(i)

to work.
Do they currently expect:
    for i in a and i not in b:
        print(i)
to work?
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On Fri, Feb 14, 2014 at 5:51 AM, Haoyi Li
I think we're talking about the wrong thing by focusing on whether we want "or in" or "or" or other finicky syntax things. In terms of whether we can express the logic the OP wanted, we already can:

    >>> a = {1, 2, 3}
    >>> b = {4, 5, 6}
    >>> c = 5
    >>> c in a | b
    True
... Maybe we want to extend | to other sequences which in works on.
Most definitely. The 'in' operator is far broader than the set() type. Notably:

    "somestring" in "wholesome" or in "string of words"

is not the same as concatenating the two strings and checking if the target string is in the result.

ChrisA
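The strings in that example were chosen so the two readings disagree. A quick check in current Python, with the proposed form written out in full:

```python
target = "somestring"
first, second = "wholesome", "string of words"

# What the proposed `target in first or in second` would mean:
separately = target in first or target in second

# Concatenating first creates a match that spans the join:
# "wholesome" + "string of words" == "wholesomestring of words"
concatenated = target in first + second
```

Neither string contains the target on its own, but the concatenation does, so a union-by-concatenation gives a false positive.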
"somestring" in "wholesome" or in "string of words"
is not the same as concatenating the two strings and checking if the target string is in the result.
That's true, but I strongly suspect we can get what we want via plain old
    "somestring" in "wholesome" | "string of words"
syntax by overloading the *|* operator to return some data structure that
does what we want with *in*.
In general, I feel like implementing cool stuff as libraries is superior to
implementing cool stuff as hardcoded semantics in the interpreter. This
seems like a case where implementing the semantics we want via *|* seems
not just feasible, but trivial.
On Thu, Feb 13, 2014 at 11:35 AM, Chris Angelico
On Fri, Feb 14, 2014 at 5:51 AM, Haoyi Li wrote:
I think we're talking about the wrong thing by focusing on whether we want "or in" or "or" or other finicky syntax things. In terms of whether we can express the logic the OP wanted, we already can:

    >>> a = {1, 2, 3}
    >>> b = {4, 5, 6}
    >>> c = 5
    >>> c in a | b
    True
... Maybe we want to extend | to other sequences which in works on.
Most definitely. The 'in' operator is far broader than the set() type. Notably:
"somestring" in "wholesome" or in "string of words"
is not the same as concatenating the two strings and checking if the target string is in the result.
ChrisA
On Fri, Feb 14, 2014 at 6:57 AM, Haoyi Li
"somestring" in "wholesome" or in "string of words"
is not the same as concatenating the two strings and checking if the target string is in the result.
That's true, but I strongly suspect we can get what we want via plain old
"somestring" in "wholesome" | "string of words"
syntax by overloading the | operator to return some data structure that does what we want with in.
That requires that everything implement some kind of pipe union type. It's putting things backwards. Why should the str type have to be able to make a union of itself and everything else? It's like the reasoning behind having ''.join(seq) rather than seq.join(''). Compare:

    "foo" in "asdfoobar"
    "foo" in {"asd","foo","bar"}
    "foo" in "asdfoobar" or "foo" in {"asd","foo","bar"}
Okay, now how are you going to deduplicate the last line?
"foo" in "asdfoobar" | {"asd","foo","bar"}
The only way this would work is by having some universal "this-or-that" structure that doesn't care about its data types. And once you have that, you may as well make it a language construct and have lazy evaluation semantics:
cmd not in forbiddens and in fetch_command_list()
If the forbiddens set is small and kept locally (as a blacklist should be), tripping that check should be able to short-circuit the calling of fetch_command_list, just as the fully written out version would:
cmd not in forbiddens and cmd in fetch_command_list()
There's fundamentally no way to implement that with the | operator.
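A sketch of the short-circuit behaviour described above, using the fully written-out form that works today. `fetch_command_list` here is a hypothetical stand-in that records its calls, and the command names are invented.

```python
fetch_calls = []

def fetch_command_list():
    """Hypothetical expensive lookup; records each call."""
    fetch_calls.append(1)
    return {"ls", "cd", "rm"}

forbiddens = {"rm"}

def allowed(cmd):
    # `and` only evaluates its right operand if the left one is true.
    return cmd not in forbiddens and cmd in fetch_command_list()

blocked = allowed("rm")
calls_after_blocked = len(fetch_calls)

permitted = allowed("ls")
calls_after_permitted = len(fetch_calls)
```

A blacklisted command never triggers the fetch. Any form that takes `fetch_command_list()` as an operand of `|` has to call it before `|` runs.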
I like the proposal, largely for its parallel with the chained
comparison operators (1
No language change required:

    class either(object):

        def __init__(self, *args):
            self.args = args

        def __contains__(self, x):
            for a in self.args:
                if x in a:
                    return True
            return False

    pocket = ['hankie', 'pocketknife', 'keys']
    car = ['satchel', 'jacket', 'teddybear']

    if 'keys' in either(car, pocket):
        print("Found them")
    else:
        print("Lost them")

-- Greg
That's nice, but having to import a definition like that would be quite
cumbersome.
Also, this doesn't allow lazy evaluation.
On Fri, Feb 14, 2014 at 1:46 AM, Greg Ewing
No language change required:
    class either(object):

        def __init__(self, *args):
            self.args = args

        def __contains__(self, x):
            for a in self.args:
                if x in a:
                    return True
            return False

    pocket = ['hankie', 'pocketknife', 'keys']
    car = ['satchel', 'jacket', 'teddybear']

    if 'keys' in either(car, pocket):
        print("Found them")
    else:
        print("Lost them")
-- Greg
On 02/13/2014 03:49 PM, Ram Rachum wrote:
On Fri, Feb 14, 2014 at 1:46 AM, Greg Ewing wrote:
    class either(object):

        def __init__(self, *args):
            self.args = args

        def __contains__(self, x):
            for a in self.args:
                if x in a:
                    return True
            return False
That's nice, but having to import a definition like that would be quite cumbersome.
What? Importing is cumbersome? Since when?
Also, this doesn't allow lazy evaluation.
Certainly it does. If the target is in the first thing it returns True at that point. -- ~Ethan~
On Thu, Feb 13, 2014 at 04:34:12PM -0800, Ethan Furman wrote:
On 02/13/2014 03:49 PM, Ram Rachum wrote:
On Fri, Feb 14, 2014 at 1:46 AM, Greg Ewing wrote:
    class either(object):

        def __init__(self, *args):
            self.args = args

        def __contains__(self, x):
            for a in self.args:
                if x in a:
                    return True
            return False
[...] Also, this doesn't allow lazy evaluation.
Certainly it does. If the target is in the first thing it returns True at that point.
No, Ram is correct. either() short circuits the `x in a` tests, but it evaluates the individual args eagerly, not lazily. For simplicity, let's just talk about two elements, rather than arbitrary numbers, and compare lazy and non-lazy evaluation. Contrast how Python's short-circuiting `and` works compared to this not-quite equivalent:

    x and 1/x  # 1/x is only evaluated if x is not zero

    def and_(a, b):
        if a:
            return b
        return a

    and_(x, 1/x)  # 1/x is evaluated before and_ even sees it

Because Python's short-circuiting operators are syntax, they can avoid evaluating the operand they don't need. The same doesn't apply to Greg's "either" class: you have to evaluate all the terms first. It can short-circuit calling the __contains__ methods, but that's all.

-- Steven
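The contrast above can be run directly with `x = 0`. This sketch only adds a `try`/`except` around the function call so the difference is observable.

```python
def and_(a, b):
    if a:
        return b
    return a

x = 0

# The `and` operator never evaluates 1 / x, because x is falsy.
lazy_result = x and 1 / x

# The function call must evaluate both arguments before and_ runs,
# so the division by zero happens regardless of what and_ would do.
try:
    and_(x, 1 / x)
    eager_failed = False
except ZeroDivisionError:
    eager_failed = True
```

The same reasoning applies to `either(a, b)`: both arguments are evaluated before `__contains__` is ever consulted.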
On 02/14/2014 04:15 AM, Steven D'Aprano wrote:
On Thu, Feb 13, 2014 at 04:34:12PM -0800, Ethan Furman wrote:
On 02/13/2014 03:49 PM, Ram Rachum wrote:
On Fri, Feb 14, 2014 at 1:46 AM, Greg Ewing wrote:
    class either(object):

        def __init__(self, *args):
            self.args = args

        def __contains__(self, x):
            for a in self.args:
                if x in a:
                    return True
            return False
[...] Also, this doesn't allow lazy evaluation.
Certainly it does. If the target is in the first thing it returns True at that point.
No, Ram is correct. either() short circuits the `x in a` tests, but it evaluates the individual args eagerly, not lazily.
Ah, thanks. Ram, my apologies. -- ~Ethan~
On Fri, Feb 14, 2014 at 01:49:20AM +0200, Ram Rachum wrote about a custom "either" class:
That's nice, but having to import a definition like that would be quite cumbersome.
No more cumbersome than any other import. We don't insist that every itertools function or maths routine be a built-in, and people manage :-) The barrier to a new built-in is higher than the barrier to a new module or function in a module. I'm not aware of any concrete examples where this has happened, but in principle, a really popular function might be moved from a module to built-ins.
Also, this doesn't allow lazy evaluation.
That is a bigger problem with the suggestion. -- Steven
On Thu, Feb 13, 2014 at 10:51:22AM -0800, Haoyi Li wrote:
I think we're talking about the wrong thing by focusing on whether we want "or in" or "or" or other finicky syntax things. In terms of whether we can express the logic the OP wanted, we already can:

    >>> a = {1, 2, 3}
    >>> b = {4, 5, 6}
    >>> c = 5
    >>> c in a | b
    True
Perfectly concisely, with the exact semantics we want,
But they are not the same semantics! They are *quite different* semantics. You may have led yourself astray because the result of:

    c in a or c in b

happens to give the same result as:

    c in a|b

for the specific values of a, b, c given. But they actually perform semantically different things: the first one lazily performs two separate containment tests, while the second eagerly calculates the union of two sets a|b and then performs a single non-lazy containment test.

Because the first form is lazy, this succeeds:

    a = {1, 2, 3}
    b = None
    c = 2
    c in a or c in b

(admittedly it succeeds by accident) while your eager version needs to be re-written as:

    c in (a|b if b is not None else a)

in order to avoid failure.

Another problem: since the __or__ operator can be over-ridden, you cannot afford to assume that a|b will always be a union. Or it might have side-effects. __contains__ also can be over-ridden, and might also have side-effects, but in general they will be *different* side-effects.

-- Steven
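The `b = None` case can be checked as written. This sketch only wraps the eager form in `try`/`except` to capture the failure.

```python
a = {1, 2, 3}
b = None
c = 2

# Lazy form: `c in b` is never evaluated because `c in a` is already True.
lazy = c in a or c in b

# Eager form: a | b is computed first, and set | None is not supported.
try:
    c in a | b
    eager_error = None
except TypeError as exc:
    eager_error = type(exc).__name__

# The guarded rewrite from the message avoids the failure.
guarded = c in (a | b if b is not None else a)
```

With `c = 5` the lazy form would fail too, which is why it is described as succeeding by accident.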
Ah, I did not think about the laziness. That indeed is a pain since we
can't create our own custom lazy operators/methods.
I would say the correct answer is that we should let people define their
own lazy operators, and then define *or_in* or whatever as a lazy
operator/method, but I'm sure others would disagree.
On Thu, Feb 13, 2014 at 1:11 PM, Steven D'Aprano
On Thu, Feb 13, 2014 at 10:51:22AM -0800, Haoyi Li wrote:
I think we're talking about the wrong thing by focusing on whether we want "or in" or "or" or other finicky syntax things. In terms of whether we can express the logic the OP wanted, we already can:

    >>> a = {1, 2, 3}
    >>> b = {4, 5, 6}
    >>> c = 5
    >>> c in a | b
    True
Perfectly concisely, with the exact semantics we want,
But they are not the same semantics! They are *quite different* semantics. You may have led yourself astray because the result of:
c in a or c in b
happens to give the same result as:
c in a|b
for the specific values of a, b, c given. But they actually perform semantically different things: the first one lazily performs two separate containment tests, while the second eagerly calculates the union of two sets a|b and then performs a single non-lazy containment test.
Because the first form is lazy, this succeeds:
    a = {1, 2, 3}
    b = None
    c = 2
    c in a or c in b
(admittedly it succeeds by accident) while your eager version needs to be re-written as:
c in (a|b if b is not None else a)
in order to avoid failure.
Another problem: since the __or__ operator can be over-ridden, you cannot afford to assume that a|b will always be a union. Or it might have side-effects. __contains__ also can be over-ridden, and might also have side-effects, but in general they will be *different* side-effects.
-- Steven
On Feb 13, 2014, at 13:20, Haoyi Li
Ah, I did not think about the laziness. That indeed is a pain since we can't create our own custom lazy operators/methods.
I would say the correct answer is that we should let people define their own lazy operators, and then define or_in or whatever as a lazy operator/method, but I'm sure others would disagree.
Well, first you need to come up with a way to allow people to define new operators in the first place. Because the parser can't know what you're going to define at runtime, you can't really do fancy things like specifying precedence or association direction, and they all have to fit some easily detectable pattern. So, let's pick a straw man: a `foo` b always means foo(a, b), and has higher precedence than all other binary operators, and left associates. As long as foo has to be an identifier, not an arbitrary expression, this would be pretty simple to add to the grammar. To add short circuiting, just do this: a ``foo`` b is the same thing as foo(lambda: a, lambda: b). It has the same precedence as single-backtick operators and left associates.
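The double-backtick straw man is not valid Python, but its desugared form, `foo(lambda: a, lambda: b)`, can be written by hand today. A minimal sketch, assuming hypothetical names `lazy_or` and `note`:

```python
evaluated = []

def note(label, value):
    """Hypothetical helper: record that an operand was evaluated."""
    evaluated.append(label)
    return value

def lazy_or(left, right):
    # Both operands arrive as zero-argument callables (thunks),
    # so the right one is only evaluated if the left one is falsy.
    value = left()
    return value if value else right()

# Hand-written equivalent of the hypothetical:  a ``lazy_or`` b
result = lazy_or(lambda: note("a", True), lambda: note("b", 1 / 0))
```

Because `1 / 0` sits inside a lambda that is never called, no exception is raised; that is the short-circuiting a plain function call cannot provide.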
On Thu, Feb 13, 2014 at 10:27:19PM +1100, Chris Angelico wrote:
Generalizing the syntax, I'd see this as:
    operand1 binary-op1 operand2 {and|or} binary-op2 operand3

[...] The way I see it, there should ideally be no syntactic rule against using different operators on the two sides:

    input("> ") in legalcommands and not in forbiddencommands
    value > 20 or in {3,5,7,11}
I would rather be conservative about adding new syntax in this way. It's easier to generalise later than to make it less general. I don't mind chaining "in" or "not in", I mind a bit that one might extend that to other comparisons, and I *really strongly dislike* that one might write something like this without the compiler complaining:

    x + 23 or * 2

Although we might declare that this is to be understood as "x+23 or x*2", I think it's ugly and weird and doesn't read like executable pseudo-code. To me, it feels almost Forth-like (and I like Forth, as Forth, just not in Python code).

-- Steven
On Fri, Feb 14, 2014 at 9:20 PM, Steven D'Aprano
I would rather be conservative about adding new syntax in this way. It's easier to generalise later than to make it less general. I don't mind chaining "in" or "not in", I mind a bit that one might extend that to other comparisons, and I *really strongly dislike* that one might write something like this without the compiler complaining:
x + 23 or * 2
Maybe, but I find it easier to explain if it's simply "and/or followed by a binary operator" rather than specifically a function of "[not] in". Maybe require that it be only comparison operators? That excludes the example you give (which I agree is insane), and also prevents the ambiguity of + and - in their unary forms, but would allow this:

    inside = 1 < x < 5
    outside = x <= 1 or >= 5

In each case, x is written only once. We already have chained comparisons which implicitly require both conditions; this would allow an either-or without negating all conditions and putting a big fat "not" around the outside of it.

ChrisA
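For comparison, the two spellings available today for the `outside` test are the written-out `or` and the negated chain. A small sketch checking that they agree on a range of integers (the function names are invented; values such as NaN that are not totally ordered would not satisfy the equivalence):

```python
def inside(x):
    return 1 < x < 5

def outside_written_out(x):
    # What the proposed `x <= 1 or >= 5` would expand to.
    return x <= 1 or x >= 5

def outside_negated(x):
    # The "big fat not" alternative around the chained comparison.
    return not (1 < x < 5)

agree = all(outside_written_out(x) == outside_negated(x) for x in range(-3, 10))
```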
On 13 February 2014 20:57, Steven D'Aprano
- given the restrictions on the parser, is this even possible? and
I haven't worked through an actual candidate grammar (so I could potentially be wrong) but I'm pretty sure it requires more lookahead than Guido is willing to allow for the language definition.

At the point we hit:

    X in Y and

we're going to want to process "X in Y" as an expression, and then start looking at compiling the RHS of the "and".

This proposal would mean that when the "in" shows up after the "and", we need to backtrack, lift "X" out as its own expression, and then reinsert it as the LHS of *multiple* containment tests.

It's worth noting that this deliberate "no more than one token lookahead" design constraint in the language syntax isn't just for the benefit of compiler writers: it's there for the benefit of human readers as well.

In linguistics, there's a concept called "garden path" sentences - these are sentences where the first part is a grammatically coherent sentence, but by *adding more words to the end*, you change the meaning of words that appeared earlier. This is a jarring experience for readers.

This is one of the classic examples:

    The horse raced past the barn fell.

That's a grammatical English sentence. If you're yelling at your computer telling me that there's no way something that awkward can be grammatically correct, you're not alone in feeling that way, but the awkwardness isn't due to bad grammar.

The reason it feels bad is that the first six words form a sentence in their own right:

    The horse raced past the barn.

This is the sentence our brains typically start constructing as we read the seven word version, but then we get to the extra word "fell", and the original grammar structure falls apart - the previous sentence was already complete, and has no room for the extra word.

So our brain has to back track and come up with this alternate parsing:

    The horse [that was] raced past the barn fell.

The extra word at the end reaches back to change the entire structure of the sentence, and effectively changes the meaning of "raced" as well.

The "only one token lookahead" rule in Python's syntax design helps to make it more difficult to write "garden path expressions" where information presented late in the expression forces you to go back and reevaluate information presented earlier in the expression. (You can still write them - there'll just be some marker earlier on in the expression to suggest that trickery may be afoot).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Nick Coghlan wrote:
On 13 February 2014 20:57, Steven D'Aprano wrote:
- given the restrictions on the parser, is this even possible? and
I haven't worked through an actual candidate grammar (so I could potentially be wrong) but I'm pretty sure it requires more lookahead than Guido is willing to allow for the language definition.
At the point we hit:
X in Y and
we're going to want to process "X in Y" as an expression, and then start looking at compiling the RHS of the "and".
This proposal would mean that when the "in" shows up after the "and", we need to backtrack, lift "X" out as its own expression, and then reinsert it as the LHS of *multiple* containment tests.
It's worth noting that this deliberate "no more than one token lookahead" design constraint in the language syntax isn't just for the benefit of compiler writers: it's there for the benefit of human readers as well.
In linguistics, there's a concept called "garden path" sentences - these are sentences where the first part is a grammatically coherent sentence, but by *adding more words to the end*, you change the meaning of words that appeared earlier.
Are you trying to say "X in Y and" forms a complete expression?? And couldn't the lexer recognize "and in" and cognates as single tokens anyway, like it apparently does presently for "is not" ?
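As a side note on the "is not" question, the standard `tokenize` module can show how such operators come out of the tokenizer. A small sketch, assuming the pure-Python `tokenize` module reflects what matters here; the helper name `name_tokens` is made up:

```python
import io
import tokenize

def name_tokens(source):
    """Return the strings of all NAME tokens in `source`."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.NAME]

is_not_tokens = name_tokens("a is not b\n")
not_in_tokens = name_tokens("a not in b\n")
```

In this module, `is` and `not` come out as two separate NAME tokens, so it appears to be the grammar rather than the tokenizer that treats `is not` and `not in` as single operators.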
On 13 Feb 2014 23:30, "Boris Borcic"
Nick Coghlan wrote:
On 13 February 2014 20:57, Steven D'Aprano
- given the restrictions on the parser, is this even possible? and
I haven't worked through an actual candidate grammar (so I could potentially be wrong) but I'm pretty sure it requires more lookahead than Guido is willing to allow for the language definition.
At the point we hit:
X in Y and
we're going to want to process "X in Y" as an expression, and then start looking at compiling the RHS of the "and".
This proposal would mean that when the "in" shows up after the "and", we need to backtrack, lift "X" out as its own expression, and then reinsert it as the LHS of *multiple* containment tests.
It's worth noting that this deliberate "no more than one token lookahead" design constraint in the language syntax isn't just for the benefit of compiler writers: it's there for the benefit of human readers as well.
In linguistics, there's a concept called "garden path" sentences - these are sentences where the first part is a grammatically coherent sentence, but by *adding more words to the end*, you change the meaning of words that appeared earlier.
Are you trying to say "X in Y and" forms a complete expression?? And couldn't the lexer recognize "and in" and cognates as single tokens anyway, like it apparently does presently for "is not" ?

Yes, there are ways around the technical limitation - the point of the rest of the post was to make it clear that was largely irrelevant, because it would still be too hard to read.

Cheers,
Nick.
Nick Coghlan wrote:
In linguistics, there's a concept called "garden path" sentences - these are sentences where the first part is a grammatically coherent sentence, but by *adding more words to the end*, you change the meaning of words that appeared earlier.
Are you trying to say "X in Y and" forms a complete expression?? And couldn't the lexer recognize "and in" and cognates as single tokens anyway, like it apparently does presently for "is not" ?
Yes, there are ways around the technical limitation - the point of the rest of the post was to make it clear that was largely irrelevant, because it would still be too hard to read.
And my point was that your argument was rather unconvincing, for a variety of reasons going from the difference between indefinite lookahead and (rectifiable) two token lookahead, to the rarity of actual garden path sentences that's evidenced by the fact that the single example you cite is many decades old.

Cheers,
Boris Borcic
On 14 Feb 2014 00:39, "Boris Borcic"
Nick Coghlan wrote:
In linguistics, there's a concept called "garden path" sentences - these are sentences where the first part is a grammatically coherent sentence, but by *adding more words to the end*, you change the meaning of words that appeared earlier.
Are you trying to say "X in Y and" forms a complete expression?? And
couldn't the lexer recognize "and in"
and cognates as single tokens anyway, like it apparently does presently for "is not" ?
Yes, there are ways around the technical limitation - the point of the rest of the post was to make it clear that was largely irrelevant, because it would still be too hard to read.
And my point was that your argument was rather unconvincing, for a variety of reasons going from the difference between indefinite lookahead and (rectifiable) two token lookahead, to the rarity of actual garden path sentences that's evidenced by the fact that the single example you cite is many decades old.
Huh? Garden path sentences are rare because they're hard to understand - people instinctively realise that and rephrase them as something less awkward. The example I used is the one that was used to explain the concept to me because it's short and to the point.

The point of the elaboration in the post was to make it clear that we *know* it is technically possible to work around the "only one token lookahead" limitation, but that as a general principle of the language design *we won't*. It's not an accident of the implementation, it's a deliberate design choice intended to benefit both human readers and the creators of automated tools.

This is a constraint that people *need* to take into account if they wish to make successful syntax change proposals for Python. A suggestion like this, which would require defining two or three word tokens to meet the letter of the guideline while still breaking its spirit, simply isn't going to happen (especially when it doesn't provide a significant increase in expressiveness).

Now, if you want to argue with me about whether or not it's a good rule, that's not an argument I'm interested in having - you can certainly design usable languages that don't follow it, but I'm pointing out an existing design principle for Python, and providing the general rationale for it (i.e. simplicity), not trying to make the case that *all* languages should be designed that way.

Regards,
Nick.
Cheers, Boris Borcic
On 14 February 2014 06:55, Nick Coghlan
A suggestion like this, which would require defining two or three word tokens to meet the letter of the guideline while still breaking its spirit, simply isn't going to happen (especially when it doesn't provide a significant increase in expressiveness).
OK, I realised this proposal is actually closer to chained comparison operators than I initially thought, and multiword tokens do indeed make it technically feasible. However, that's still just a hack that meets the letter of the guideline while still failing to abide by the spirit of it.

(Disclaimer: all parsing descriptions below are intended to be illustrative, and do not necessarily reflect how the CPython compiler actually works. In particular, "compiler" is used as a shorthand to refer to arbitrary parts of the toolchain.)

First, let's look at the existing multi-word tokens.

"is" vs "is not" is relatively simple: they're both binary operators. In the following expressions:

    LHS is RHS
    LHS is not RHS

the extra "not" after "is" changes the comparison *operator*, but it doesn't need to reach back and alter the interpretation of the LHS expression itself.

"not" vs "not in" is a little different: that relies on the fact that failing to follow a "not" in this position with "in" is a SyntaxError:

    LHS not RHS  # Illegal
    LHS not in RHS

So again, the addition of the "in" doesn't affect the interpretation of the LHS in any way.

Contrast that with the proposal in this thread:

    LHS or in RHS
    LHS and in RHS
    LHS or not in RHS
    LHS and not in RHS

In all of these cases, the "or op"/"and op" alters the way the *LHS* is processed.

The closest parallel we have to that is chained comparison operators, where the presence of "op2" alters the processing of "X op1 Y":

    X op1 Y op2 Z

All comparison operators are designed such that when the compiler is about to resolve "X op1 Y", it looks ahead at the next token, and if it sees another comparison operator, starts building an "and" construct instead:

    X op1 Y and Y op2 Z

A *generalisation* of the current proposal, to work with arbitrary comparison operators, clearly requires two token lookahead in order to see "op2" after the logical operator:

    X op1 Y or op2 Z
    X op1 Y and op2 Z

To be reparsed as:

    X op1 Y or X op2 Z
    X op1 Y and X op2 Z

And allowing constructs like:

    if x == 1 or == 2 or in range(10, 20):
        # Do stuff

This is why making "or in" (etc) multiword tokens would still break the spirit of the "only one token lookahead" guideline - the proposal put forward is actually perfectly coherent for arbitrary comparison operators, it just requires two token lookahead in order to see both the logical operator *and* the comparison operator after it before deciding how to resolve the processing of the LHS expression.

I'm actually more amenable to the generalised proposal than I was to the original more limited idea, but anyone that wants to pursue it further really needs to appreciate how deeply ingrained that "only one token lookahead" guideline is for most of the core development team.

I don't actually believe even the more general proposal adds enough expressiveness to convince Guido to change the lookahead restriction, but I'm only -0 on that, whereas I was -1 on the containment-tests-only special cased version.

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
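As a rough illustration of the points above (and, like the description itself, not a claim about how the CPython parser works internally), the standard `ast` module can show that "is not" and "not in" each come out as a single comparison operator, that a chained comparison is one node carrying several operators with the middle operand evaluated once, and that the generalised "or ==" form is rejected today. The helper names `operator_names`, `parses` and `evaluate_chain` are made up for this sketch.

```python
import ast


def operator_names(source):
    """Return the comparison operator names of a single comparison expression."""
    node = ast.parse(source, mode="eval").body
    return [type(op).__name__ for op in node.ops]


def parses(source):
    """Return True if `source` is a syntactically valid expression."""
    try:
        ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    return True


def evaluate_chain():
    """Run 1 < middle() < 10 and count how often middle() is called."""
    calls = []

    def middle():
        calls.append(1)
        return 5

    result = 1 < middle() < 10
    return result, len(calls)


print(operator_names("LHS is not RHS"))   # a single operator node
print(operator_names("LHS not in RHS"))   # a single operator node
print(operator_names("X < Y <= Z"))       # one comparison, two operators
print(parses("LHS not RHS"))              # "not" without "in" here is rejected
print(parses("x == 1 or == 2"))           # the generalised form is not valid today
print(evaluate_chain())                   # the middle operand runs only once
```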
Boris Borcic wrote:
Are you trying to say "X in Y and" forms a complete expression?
No, I think he's saying that "X in Y" forms a complete expression, but only if it's *not* followed by "or in". If it is, you need to back up and re-interpret it as "X in <something more complicated>". A syntax along the lines of X in either Y or Z would avoid the problem. But we don't have a good candidate for 'either' that's already a keyword. -- Greg
On 02/13/2014 04:04 PM, Greg Ewing wrote: [snip]
A syntax along the lines of
X in either Y or Z
would avoid the problem. But we don't have a good candidate for 'either' that's already a keyword.
This makes me think of

    X in any(Y, Z)

which is equally concise, and I think clearer and more readable (for garden-path reasons) than

    X in Y or in Z

It's also technically not ambiguous, since currently any() only takes one argument, and "X in any(Y)" is useless. Still a confusing overload of the meaning of any().

Implementation would be a bit ugly; it would need to return an object that implements every operator magic method, and "distributes" the operation across the objects passed to it. Equivalent could be done for all().

Or different names could be chosen to avoid the overloads, though I'm not sure what those names would be.

Probably there be dragons, but it might be fun to experiment with (and doesn't require changes to the language, only new/modified builtins, or even just library additions).

Carl
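A minimal sketch of what such a "distributing" object might look like, as a library-level experiment rather than a change to the builtin any(). The class name `AnyOf` is hypothetical, and only two of the many operator methods mentioned above are shown (`__contains__` and `__eq__`); a real version would need the rest.

```python
class AnyOf:
    """Hypothetical helper that distributes a test across several objects."""

    def __init__(self, *candidates):
        self.candidates = candidates

    def __contains__(self, item):
        # `item in AnyOf(Y, Z)` is true if item is in Y or in Z.
        return any(item in candidate for candidate in self.candidates)

    def __eq__(self, other):
        # `X == AnyOf(Y, Z)` is true if X equals Y or equals Z.
        return any(other == candidate for candidate in self.candidates)

    # Defining __eq__ removes the default hash; stated explicitly here.
    __hash__ = None


print("b" in AnyOf("abc", "xyz"))
print("q" in AnyOf("abc", "xyz"))
print(3 == AnyOf(1, 2, 3))
```

Note that the arguments to `AnyOf(...)` are all evaluated before any test runs; only the tests themselves short-circuit.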
The issue I see with "X in any(Y, Z)" is that it's unclear to the reader
how it differs from "X in (Y,Z)".
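For reference, a small example of how the two readings differ in today's Python; the sample values are invented for illustration.

```python
Y = "spam and eggs"
Z = "ham"
X = "eggs"

in_tuple = X in (Y, Z)        # is X equal to Y or equal to Z?
in_either = X in Y or X in Z  # is X contained in Y or contained in Z?

print(in_tuple, in_either)
```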
X in any(Y, Z)
It also doesn't have the desired laziness, unless we special-case *any* in the interpreter, or provide a more generic mechanism for lazy parameters without writing *lambda:* everywhere.
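To make the laziness concern concrete, here is a sketch comparing an eager helper with a lambda-based one. `in_any` follows the helper suggested earlier in the thread; `in_any_lazy` and `load` are hypothetical names introduced only for this example.

```python
def in_any(item, *containers):
    """Eager variant: every container argument is built before the call."""
    return any(item in container for container in containers)


def in_any_lazy(item, *container_factories):
    """Lazy variant: each factory is called only if earlier tests failed."""
    return any(item in factory() for factory in container_factories)


evaluated = []


def load(name, contents):
    """Stand-in for an expensive expression; records that it ran."""
    evaluated.append(name)
    return contents


eager_result = in_any("a", load("first", "abc"), load("second", "xyz"))
eager_evaluated = list(evaluated)

evaluated.clear()
lazy_result = in_any_lazy(
    "a",
    lambda: load("first", "abc"),
    lambda: load("second", "xyz"),
)
lazy_evaluated = list(evaluated)

print(eager_result, eager_evaluated)
print(lazy_result, lazy_evaluated)
```

The plain `X in Y or X in Z` spelling gets the lazy behaviour for free, which is what a function-call spelling gives up unless the caller wraps each argument.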
On 02/13/2014 03:32 PM, Haoyi Li wrote:
Carl Meyer wrote:
This makes me think of
X in any(Y, Z)
It also doesn't have the desired laziness, unless we special-case *any* in the interpreter, or provide a more generic mechanism for lazy parameters without writing *lambda:* everywhere.
Well, currently any() and all() return True/False. To overload like this would require returning an object with a __contains__ like Greg's either() (which is lazy), and a __bool__ that does what any() and all() currently do. I can see it being confusing, and it also doesn't get the chained comparisons like "or in" and friends do. -- ~Ethan~
On Fri, Feb 14, 2014 at 10:26 AM, Amber Yust
The issue I see with "X in any(Y, Z)" is that it's unclear to the reader how it differs from "X in (Y,Z)".
    X in (Y, Z) <-> X == any(Y, Z)

This equivalence is seen in SQL, more or less. I wouldn't want to encourage it, though.

ChrisA
13.02.14 12:57, Steven D'Aprano wrote:
I like it. In natural language, people often say things like:
my keys are in the car or in my pocket
which fools them into writing:
keys in car or pocket
which does the wrong thing. Chained "in" comparisons is a natural extension to Python's already natural language-like syntax. Python already has other chained comparisons. Being able to write:
keys in car or in pocket
feels natural and right to me. (We can't *quite* match the human idiom where the second "in" is left out, but one can't have everything.)
You forgot keyword "are".

    keys are in car or in pocket
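For readers unsure why "keys in car or pocket" does the wrong thing, a short sketch with invented sample data: the expression groups as "(keys in car) or pocket", so it is truthy whenever pocket is non-empty, whether or not the keys are there.

```python
import ast

car = ["map"]
pocket = ["wallet"]
keys = "keys"

wrong = keys in car or pocket          # groups as (keys in car) or pocket
right = keys in car or keys in pocket  # the intended test

print(wrong)   # the pocket list itself, which is truthy
print(right)

# The top-level node is a boolean "or", not a comparison.
print(type(ast.parse("keys in car or pocket", mode="eval").body).__name__)
```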
2014-02-13 17:04 GMT+02:00 Serhiy Storchaka
13.02.14 12:57, Steven D'Aprano wrote:
I like it. In natural language, people often say things like:
my keys are in the car or in my pocket
which fools them into writing:
keys in car or pocket
which does the wrong thing. Chained "in" comparisons is a natural extension to Python's already natural language-like syntax. Python already has other chained comparisons. Being able to write:
keys in car or in pocket
feels natural and right to me. (We can't *quite* match the human idiom where the second "in" is left out, but one can't have everything.)
You forgot keyword "are".
keys are in car or in pocket
And what about

    keys are in car or in pocket and not in purse

It would be too complicated.
participants (16)
- Amber Yust
- Andrew Barnert
- Ben Finney
- Boris Borcic
- Carl Meyer
- Chris Angelico
- Ethan Furman
- Greg Ewing
- Haoyi Li
- MRAB
- Nick Coghlan
- Ram Rachum
- Ram Rachum
- Serhiy Storchaka
- Steven D'Aprano
- אלעזר