On Fri, 17 Jul 2020 at 12:26, <emmanuel.coirier@caissedesdepots.fr> wrote:
Hello everyone,

I'm sorry if my proposal has already been made, or even withdrawn,
but I think that capture variables shouldn't be as implicit as they
are now. I didn't see any mention of capture variable patterns in
the rejected ideas. So here is my idea:

I've looked at the PEP very quickly, jumping to the examples to get
a taste and an idea of what was going on here. I saw a new kind of
control structure based on structural pattern matching (patterns based
on classes or compositions of classes, to make it short). A very good
idea, emphasized by Tobias Kohn ("Another take on PEP 622"), is that
pattern matching eases the writing of code based on matching such
structures, and that capturing values stored inside these
structures at match time really eases the writing of the code
associated with this match.

But... looking at the examples, it wasn't very obvious that some
variables were capture variables and some others were matching ones.
I then read in detail some of the rules about how to tell which
variables are captured. But I'm not sure everybody will do this...

The Zen of Python tells us that "explicit is better than implicit". I
know this is not a rule set in stone, but I think it applies very well
here.

Guido said:
> We’re really looking
> for a solution that tells you when you’re looking at an individual
> case which variables are captured and which are used for
> load-and-compare.
>
> Marking up the capture variables with some sigil (e.g. $x or x?)
> or other markup (e.g. backticks or <x>) makes this common case ugly
> and inconsistent: it’s unpleasant to see for example
>
>     case %x, %y:
>         print(x, y)

Guido talks about a "sigil", which seems to be a meaningless mark that
is only there to help the parser understand what the developer wrote.

I propose that this "sigil" be the assignment mark: "=". Look:

    z = 42
    match pt:
        case x=, y=, z:
            print(x, y, "z == 42")
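
To make the intended semantics concrete, here is a rough plain-Python
sketch of how this case could be read, without any new syntax. It is
only an approximation: the helper name `match_xy_z` is made up for
illustration, and the sequence test is a simplification of what the PEP
specifies.

```python
from collections.abc import Sequence

z = 42


def match_xy_z(pt):
    """Emulate the proposed `case x=, y=, z:` without the match statement."""
    # Sequence pattern: a non-string sequence of exactly three items.
    is_seq = isinstance(pt, Sequence) and not isinstance(pt, (str, bytes))
    if is_seq and len(pt) == 3:
        # `x=` and `y=` are captures: plain assignments from the subject.
        x, y, third = pt
        # `z` carries no mark, so it is loaded and compared, never rebound.
        if third == z:
            return (x, y, "z == 42")
    return None
```

Under the PEP's current implicit rule, a bare `z` in that position
would be a capture as well, so the third item would be bound to `z`
rather than compared with 42.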

Or this one:

    def make_point_3d(pt):
        match pt:
            case (x=, y=):
                return Point3d(x, y, 0)
            case (x=, y=, z=):
                return Point3d(x, y, z)
            case Point2d(x=, y=):
                return Point3d(x, y, 0)
            case Point3d(_, _, _):
                return pt
            case _:
                raise TypeError("not a point we support")
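
For comparison, here is a sketch of what this function could desugar
to in today's Python, with every capture written as an ordinary
assignment. The `Point2d` and `Point3d` dataclasses are assumptions
(the example does not define them), and testing for `tuple` only is a
simplification of the PEP's sequence matching.

```python
from dataclasses import dataclass


@dataclass
class Point2d:  # hypothetical definition, for illustration only
    x: float
    y: float


@dataclass
class Point3d:  # hypothetical definition, for illustration only
    x: float
    y: float
    z: float


def make_point_3d(pt):
    # case (x=, y=): a two-item tuple, both items captured
    if isinstance(pt, tuple) and len(pt) == 2:
        x, y = pt
        return Point3d(x, y, 0)
    # case (x=, y=, z=): a three-item tuple, all items captured
    if isinstance(pt, tuple) and len(pt) == 3:
        x, y, z = pt
        return Point3d(x, y, z)
    # case Point2d(x=, y=): class check, then capture of two attributes
    if isinstance(pt, Point2d):
        x, y = pt.x, pt.y
        return Point3d(x, y, 0)
    # case Point3d(_, _, _): class check only, nothing captured
    if isinstance(pt, Point3d):
        return pt
    # case _: anything else
    raise TypeError("not a point we support")
```

Every `name=` in the proposed patterns corresponds to exactly one
assignment target here, which is the reading the "=" mark is meant to
suggest.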


I kind of agree it is nicer to be more explicit, but somehow x= looks ugly. It occurred to me (and, again, apologies if this has already been mentioned) that we might use the `as` keyword here.

The example above would become: 

    def make_point_3d(pt):
        match pt:
            case (as x, as y):
                return Point3d(x, y, 0)
            case (as x, as y, as z):
                return Point3d(x, y, z)
            case Point2d(as x, as y):
                return Point3d(x, y, 0)
            case Point3d(_, _, _):
                return pt
            case _:
                raise TypeError("not a point we support")

If having "as x" as a standalone expression without anything to the left of "as" causes confusion, we could instead mandate the use of _ thus:

            case (_ as x, _ as y):
                return Point3d(x, y, 0)
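
For what it's worth, `as` already introduces a name binding in several
existing constructs, which is part of why it could read naturally as a
capture marker. A small sketch using only the standard library (the
variable names are just for illustration):

```python
import io
import collections.abc as abc  # import ... as: binds the module to `abc`

# with ... as: binds the context manager's result to `stream`
with io.StringIO("1,2") as stream:
    text = stream.read()

# except ... as: binds the caught exception to `err`
try:
    int("not a number")
except ValueError as err:
    message = str(err)
```

In all three forms something stands to the left of `as`, which is the
precedent behind the `_ as x` spelling.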

On the need to be explicit:

Simple case blocks will perhaps be a bit longer to write, but will
not be harder to read, since they stay in the "simple case blocks"
family.

More complex cases will be harder to write, but the explicit markup
will help the reader understand what will be captured and where, and
what will be looked-and-matched, using already known rules:
looked-and-matched expressions will be evaluated as usual, then
compared with the match term, and captured expressions will be treated
as l-values (which are much more restrictive than arbitrary
expressions).
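
As a quick illustration of how restrictive l-values already are, the
compiler can be asked directly whether something may appear on the left
of "=". This is only a sketch: the helper name `is_assignable` is made
up, and it simply tries to compile `<target> = None`.

```python
def is_assignable(target: str) -> bool:
    """Return True if `target` compiles as the left-hand side of an assignment."""
    try:
        # Compile only; nothing is executed, so undefined names are fine.
        compile(f"{target} = None", "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


# Names, attributes, subscripts and tuples of those are valid targets.
valid = [t for t in ("x", "obj.attr", "items[0]", "(a, b)") if is_assignable(t)]

# Calls, arithmetic and literals are ordinary expressions, not targets.
invalid = [t for t in ("f()", "x + 1", "42") if not is_assignable(t)]
```

So a reader who sees `something=` in a pattern can already rule out
most expressions without learning any new rule.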

Moreover, explicitly introducing a difference between "capture" and
"look-and-match" will help newcomers understand what a match is about
without having to look at a PEP or other normative document.

Remember that code has to be readable, because it will be read much
more often than written. The reader has to understand quickly, though
not in detail, what will happen. Being explicit relieves them of the
task of concentrating on this point.

Also remember that Python has to be taught, and that everything that
is implicit in the code has to be made explicit when teaching. And the
longer you spend teaching microdetails like the difference between
"capture" and "look-and-match", the less your audience will be prone
to listen to you.

On the drawback of being explicit:

Adding a mark to every captured variable can and will be annoying.
More annoying than not adding it. It's obvious. But we don't expect to
have more than a handful of captured variables per case. Otherwise the
case is perhaps too complex, not to say too complicated.

Using a carefully chosen mark can alleviate this drawback. Using
an already accepted and commonly used symbol will surely help.

I know that Python will diverge from other languages on this point. Yes,
but Python has already diverged from other languages, and the community
sees that as being for the better. For example: the conditional
expression, aka the "ternary operator", aka "x = blabla if plop else bla".

And I would find it awkward to have to explain that "captured
variables" look like simple expressions "but are not" because that's
how things are written in functional languages. I'm not sure it will
convince anybody who isn't already familiar with pattern matching in
functional languages (which is a big topic by itself).

On the syntax:

The "=" character is well known, easy to find on a keyboard,
and already carries the semantics of "putting a value in a variable".

So this mark is not a random sigil, but the way we write
assignments, with an l-value, a "=" character, and an
r-value. The r-value is supplied by the match and is omitted here.

And even if only a few people know what "l-value" means, everybody
knows what may be put on the left side of an "=".

Moreover, this kind of syntax already exists in the Python world, when
using default values for function parameters:

    def make_point(x=0, y=0):
        return x, y

Here the r-value is present. The l-value still has well-defined
semantics that are easy to learn and understand without having
to read a book on Python semantics or any PEPs. And these are still
the same semantics of "putting a value in a variable".

And see how there is "x=" in the definition, but just "x" in the body
of the function. Like in the case block.
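
A short usage sketch of that analogy, with `inspect` from the standard
library used only to show which value is attached to each `name=` in
the signature (the `defaults` variable is illustrative):

```python
import inspect


def make_point(x=0, y=0):
    # `x=` and `y=` appear only in the signature; the body uses bare names.
    return x, y


# The value that each `name=` in the signature puts into the variable
# when the caller supplies nothing.
defaults = {
    name: param.default
    for name, param in inspect.signature(make_point).parameters.items()
}

origin = make_point()      # both defaults used
shifted = make_point(1)    # x supplied by the caller, y from its default
```

In the proposed patterns the roles would be similar, except that the
value comes from the matched subject instead of a default.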

That's why I think it will add value to be explicit about captured
variables, and that choosing a meaningful mark can clarify many
implicit, and hard to understand, patterns.

Emmanuel
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/RDEKWUZ657BE5KSYVF7IF2N47XRQ5DEV/


--
Gustavo J. A. M. Carneiro
Gambit Research
"The universe is always one step beyond logic." -- Frank Herbert