From gnewsg at gmail.com  Tue Dec  4 19:46:15 2007
From: gnewsg at gmail.com (Giampaolo Rodola')
Date: Tue, 4 Dec 2007 10:46:15 -0800 (PST)
Subject: [Python-ideas] Symlink chain resolver module
Message-ID: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>

Hi there,
I thought it would be good to share this code I used in a
project of mine.
I thought it could eventually look good incorporated into the Python
stdlib, in the os or os.path modules.
Otherwise I'm merely offering it as a community service to anyone who
might be interested.


-- Giampaolo


#!/usr/bin/env python
# linkchainresolver.py

import os, errno

def resolvelinkchain(path):
    """Resolve a chain of symbolic links by returning the absolute
path
    name of the final target.

    Raise os.error exception in case of circular link.
    Do not raise exception if the symlink is broken (pointing to a
    non-existent path).
    Return a normalized absolutized version of the pathname if it is
    not a symbolic link.

    Examples:

    >>> resolvelinkchain('validlink')
    '/abstarget'
    >>> resolvelinkchain('brokenlink')  # resolved anyway
    '/abstarget'
    >>> resolvelinkchain('circularlink')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "module.py", line 19, in resolvelinkchain
        os.stat(path)
    OSError: [Errno 40] Too many levels of symbolic links: '3'
    >>>
    """
    try:
        os.stat(path)
    except os.error, err:
        # do not raise an exception in case of a broken symlink;
        # we want to know the final target anyway
        if err.errno != errno.ENOENT:
            raise
    p = os.path.abspath(path)
    while os.path.islink(p):
        target = os.readlink(p)
        if not os.path.isabs(target):
            # a relative target is resolved against the link's directory
            target = os.path.join(os.path.dirname(p), target)
        p = target
    return os.path.normpath(p)


From guido at python.org  Tue Dec  4 20:06:03 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Dec 2007 11:06:03 -0800
Subject: [Python-ideas] Symlink chain resolver module
In-Reply-To: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>
References: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>
Message-ID: <ca471dc20712041106s3b609ca8r6778358e53603f7d@mail.gmail.com>

Isn't this pretty close to os.path.realpath()?

On Dec 4, 2007 10:46 AM, Giampaolo Rodola' <gnewsg at gmail.com> wrote:
> Hi there,
> I thought it would have been good sharing this code I used in a
> project of mine.
> I thought it could eventually look good incorporated in Python stdlib
> through os or os.path modules.
> Otherwise I'm merely offering it as a community service to anyone who
> might be interested.
>
>
> -- Giampaolo
>
>
> #!/usr/bin/env python
> # linkchainresolver.py
>
> import os, sys, errno
>
> def resolvelinkchain(path):
>     """Resolve a chain of symbolic links by returning the absolute
> path
>     name of the final target.
>
>     Raise os.error exception in case of circular link.
>     Do not raise exception if the symlink is broken (pointing to a
>     non-existent path).
>     Return a normalized absolutized version of the pathname if it is
>     not a symbolic link.
>
>     Examples:
>
>     >>> resolvelinkchain('validlink')
>     /abstarget
>     >>> resolvelinkchain('brokenlink') # resolved anyway
>     /abstarget
>     >>> resolvelinkchain('circularlink')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>       File "module.py", line 19, in resolvelinkchain
>         os.stat(path)
>     OSError: [Errno 40] Too many levels of symbolic links: '3'
>     >>>
>     """
>     try:
>         os.stat(path)
>     except os.error, err:
>         # do not raise exception in case of broken symlink;
>         # we want to know the final target anyway
>         if err.errno == errno.ENOENT:
>             pass
>         else:
>             raise
>     if not os.path.isabs(path):
>         basedir = os.path.dirname(os.path.abspath(path))
>     else:
>         basedir = os.path.dirname(path)
>     p = path
>     while os.path.islink(p):
>         p = os.readlink(p)
>         if not os.path.isabs(p):
>             p = os.path.join(basedir, p)
>             basedir = os.path.dirname(p)
>     return os.path.join(basedir, p)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From gnewsg at gmail.com  Tue Dec  4 20:25:32 2007
From: gnewsg at gmail.com (Giampaolo Rodola')
Date: Tue, 4 Dec 2007 11:25:32 -0800 (PST)
Subject: [Python-ideas] Symlink chain resolver module
In-Reply-To: <ca471dc20712041106s3b609ca8r6778358e53603f7d@mail.gmail.com>
References: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>
	<ca471dc20712041106s3b609ca8r6778358e53603f7d@mail.gmail.com>
Message-ID: <77980755-1253-4fd5-aef2-ce59ac24ff22@w56g2000hsf.googlegroups.com>

On 4 Dic, 20:06, "Guido van Rossum" <gu... at python.org> wrote:
> Isn't this pretty close to os.path.realpath()?
>
> On Dec 4, 2007 10:46 AM, Giampaolo Rodola' <gne... at gmail.com> wrote:
>
>
>
>
>
> > Hi there,
> > I thought it would have been good sharing this code I used in a
> > project of mine.
> > I thought it could eventually look good incorporated in Python stdlib
> > through os or os.path modules.
> > Otherwise I'm merely offering it as a community service to anyone who
> > might be interested.
>
> > -- Giampaolo
>
> > #!/usr/bin/env python
> > # linkchainresolver.py
>
> > import os, sys, errno
>
> > def resolvelinkchain(path):
> >     """Resolve a chain of symbolic links by returning the absolute
> > path
> >     name of the final target.
>
> >     Raise os.error exception in case of circular link.
> >     Do not raise exception if the symlink is broken (pointing to a
> >     non-existent path).
> >     Return a normalized absolutized version of the pathname if it is
> >     not a symbolic link.
>
> >     Examples:
>
> >     >>> resolvelinkchain('validlink')
> >     /abstarget
> >     >>> resolvelinkchain('brokenlink') # resolved anyway
> >     /abstarget
> >     >>> resolvelinkchain('circularlink')
> >     Traceback (most recent call last):
> >       File "<stdin>", line 1, in <module>
> >       File "module.py", line 19, in resolvelinkchain
> >         os.stat(path)
> >     OSError: [Errno 40] Too many levels of symbolic links: '3'
>
> >     """
> >     try:
> >         os.stat(path)
> >     except os.error, err:
> >         # do not raise exception in case of broken symlink;
> >         # we want to know the final target anyway
> >         if err.errno == errno.ENOENT:
> >             pass
> >         else:
> >             raise
> >     if not os.path.isabs(path):
> >         basedir = os.path.dirname(os.path.abspath(path))
> >     else:
> >         basedir = os.path.dirname(path)
> >     p = path
> >     while os.path.islink(p):
> >         p = os.readlink(p)
> >         if not os.path.isabs(p):
> >             p = os.path.join(basedir, p)
> >             basedir = os.path.dirname(p)
> >     return os.path.join(basedir, p)
> > _______________________________________________
> > Python-ideas mailing list
> > Python-id... at python.org
> >http://mail.python.org/mailman/listinfo/python-ideas
>
> --
> --Guido van Rossum (home page:http://www.python.org/~guido/)
> _______________________________________________
> Python-ideas mailing list
> Python-id... at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

Are you trying to tell me that I've wasted the whole evening when such
a thing was already available in os.path? :-)
Wait a moment. I'm going to check what os.path.realpath() does.
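
For anyone who wants to run the same check, a quick sketch (the
temp-directory and link names here are illustrative; it needs a POSIX
filesystem that supports symlinks) shows realpath() following a whole
chain:

```python
import os
import tempfile

# build a two-step chain: link2 -> link1 -> target
d = tempfile.mkdtemp()
target = os.path.join(d, "target")
open(target, "w").close()
link1 = os.path.join(d, "link1")
link2 = os.path.join(d, "link2")
os.symlink(target, link1)        # link1 -> target
os.symlink(link1, link2)         # link2 -> link1 -> target

# realpath() resolves the whole chain down to the final target
print(os.path.realpath(link2))
```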


From gnewsg at gmail.com  Tue Dec  4 20:53:46 2007
From: gnewsg at gmail.com (Giampaolo Rodola')
Date: Tue, 4 Dec 2007 11:53:46 -0800 (PST)
Subject: [Python-ideas] Symlink chain resolver module
In-Reply-To: <77980755-1253-4fd5-aef2-ce59ac24ff22@w56g2000hsf.googlegroups.com>
References: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>
	<ca471dc20712041106s3b609ca8r6778358e53603f7d@mail.gmail.com> 
	<77980755-1253-4fd5-aef2-ce59ac24ff22@w56g2000hsf.googlegroups.com>
Message-ID: <809f3810-b059-4be7-89b7-329e775c4813@a35g2000prf.googlegroups.com>

Ok, it's been a wasted evening. :-)
Sorry.

(I'm feeling so sad...)


From aahz at pythoncraft.com  Tue Dec  4 21:00:47 2007
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 4 Dec 2007 12:00:47 -0800
Subject: [Python-ideas] Symlink chain resolver module
In-Reply-To: <809f3810-b059-4be7-89b7-329e775c4813@a35g2000prf.googlegroups.com>
References: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>
	<ca471dc20712041106s3b609ca8r6778358e53603f7d@mail.gmail.com>
	<77980755-1253-4fd5-aef2-ce59ac24ff22@w56g2000hsf.googlegroups.com>
	<809f3810-b059-4be7-89b7-329e775c4813@a35g2000prf.googlegroups.com>
Message-ID: <20071204200047.GA12925@panix.com>

On Tue, Dec 04, 2007, Giampaolo Rodola' wrote:
>
> Ok, it's been a wasted evening. :-)
> Sorry.
> 
> (I'm feeling so sad...)

Another victim of Guido's Time Machine.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Typing is cheap.  Thinking is expensive."  --Roy Smith


From guido at python.org  Tue Dec  4 22:24:12 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Dec 2007 13:24:12 -0800
Subject: [Python-ideas] Symlink chain resolver module
In-Reply-To: <809f3810-b059-4be7-89b7-329e775c4813@a35g2000prf.googlegroups.com>
References: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>
	<ca471dc20712041106s3b609ca8r6778358e53603f7d@mail.gmail.com>
	<77980755-1253-4fd5-aef2-ce59ac24ff22@w56g2000hsf.googlegroups.com>
	<809f3810-b059-4be7-89b7-329e775c4813@a35g2000prf.googlegroups.com>
Message-ID: <ca471dc20712041324t530ab4c4o44b1304c6ccb4dd0@mail.gmail.com>

On Dec 4, 2007 11:53 AM, Giampaolo Rodola' <gnewsg at gmail.com> wrote:
> Ok, it's been a wasted evening. :-)
> Sorry.
>
> (I'm feeling so sad...)

Don't fret. It's not been wasted. You learned a couple of things: you
learned how to resolve symlinks recursively and safely, you probably
improved your Python debugging skills, *and* you learned to do
research before rolling up your sleeves. That's three valuable
lessons!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From arno at marooned.org.uk  Wed Dec 12 21:10:13 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Wed, 12 Dec 2007 20:10:13 +0000
Subject: [Python-ideas] free variables in generator expressions
Message-ID: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>

In the generator expression

    (x+1 for x in L)

the name 'L' is not local to the expression (as opposed to 'x'), I
will call such a name a 'free name', as I am not aware of an existing
terminology.

The following is about how to treat 'free names' inside generator
expressions.  I argue that these names should be bound to their values
when the generator expression is created, and the rest of this email
tries to provide arguments why this may be desirable.

I am fully aware that I tend to think about things in a very skewed
manner though, so I would be grateful for any rebuttal.

Recently I tried to implement a 'sieve' algorithm using generator
expressions instead of lists.  It wasn't the sieve of Eratosthenes but
I will use that as a simpler example (all of the code shown below is
python 2.5).  Here is one way to implement the algorithm using list
comprehensions (tuples could also be used as the mutability of lists
is not used):

def list_sieve(n):
    "Generate all primes less than n"
    P =  range(2, n)
    while True:
        # Yield the first element of P and sieve out its multiples
        for p in P:
            yield p
            P = [x for x in P if x % p]
            break
        else:
            # If P was empty then we're done
            return

 >>> list(list_sieve(20))
[2, 3, 5, 7, 11, 13, 17, 19]

So that's ok.  Then it occurred to me that I didn't need to keep
creating all these lists; so I decided, without further thinking, to
switch to generator expressions, as they are supposed to abstract the
notion of iterable object (and in the function above I'm only using
the 'iterable' properties of lists - any other iterable should do).
So I came up with this:

def incorrect_gen_sieve(n):
    "Unsuccessfully attempt to generate all primes less than n"
    # Change range to xrange
    P =  xrange(2, n)
    while True:
        for p in P:
            yield p
            # Change list comprehension to generator expression
            P = (x for x in P if x % p)
            break
        else:
            return

 >>> list(incorrect_gen_sieve(20))
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

Ouch.  Looking back at the code I realised that this was due to the
fact that the names 'p' and 'P' in the generator expressions are not
local to it, so subsequent binding of these names will affect the
expression.  I can't really analyse further than this; my head starts
spinning if I try :).  So I wrapped the generator expression in a
lambda function to make sure that the names p and P inside it were
bound to what I intended them to be:

def gen_sieve(n):
    "Generate all primes less than n, but without tables!"
    P =  xrange(2, n)
    while True:
        for p in P:
            yield p
            # Make sure that p and P are what they should be
            P = (lambda P, p: (x for x in P if x % p))(P, p)
            break
        else:
            return

 >>> list(gen_sieve(20))
[2, 3, 5, 7, 11, 13, 17, 19]

That's better.  I like the idea of a sieve of Eratosthenes without any
tables, and it's achievable in Python quite easily.  The only problem
is the one that I mentioned above, which boils down to:

    In a generator expression, free names can be bound to a new object
    between the time when they are defined and when they are used,
    thus changing the value of the expression.

But I think that the behaviour of generator expressions would be more
controllable and closer to that of 'real sequences' if the free names
they contain were implicitly frozen when the generator expression is
created.

So I am proposing that for example:

   (f(x) for x in L)

be equivalent to:

   (lambda f, L: (f(x) for x in L))(f, L)

In most cases this would make generator expressions behave more like
list comprehensions.  You would be able to read the generator
expression and think "that's what it does" more reliably.  Of course
if a free name is bound to a mutable object, then there is always the
chance that this object will be mutated between the creation of the
generator expression and its use.
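
The late binding that motivates this fits in a few lines (a sketch in
modern Python syntax; the behaviour matches the 2.5 examples above):

```python
# A free name in a generator expression is looked up when the
# generator runs, not when it is created:
f = lambda x: x + 1
gen = (f(x) for x in [1, 2, 3])
f = lambda x: x * 10          # rebind f before the generator runs
late = list(gen)              # the new f is used: [10, 20, 30]

# wrapping in a lambda freezes f, as proposed above:
f = lambda x: x + 1
gen = (lambda f: (f(x) for x in [1, 2, 3]))(f)
f = lambda x: x * 10
frozen = list(gen)            # the original f is used: [2, 3, 4]

print(late, frozen)
```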

Lastly, instead of using a generator expression I could have
written:

from itertools import ifilter
from functools import partial

def tools_sieve(n):
    "Generate all primes less than n"
    P =  xrange(2, n)
    while True:
        for p in P:
            yield p
            P = ifilter(partial(int.__rmod__, p), P)
            break
        else:
            return

 >>> list(tools_sieve(20))
[2, 3, 5, 7, 11, 13, 17, 19]

It obviously works as P and p are 'frozen' when ifilter and partial
are called.  If

    (f(x) for x in L if g(x))

is to be the moral equivalent of

    imap(f, ifilter(g, L))

Then my proposal makes even more sense.

-- 
Arnaud




From brett at python.org  Wed Dec 12 21:56:25 2007
From: brett at python.org (Brett Cannon)
Date: Wed, 12 Dec 2007 12:56:25 -0800
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
Message-ID: <bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>

On Dec 12, 2007 12:10 PM, Arnaud Delobelle <arno at marooned.org.uk> wrote:
> In the generator expression
>
>     (x+1 for x in L)
>
> the name 'L' is not local to the expression (as opposed to 'x'), I
> will call such a name a 'free name', as I am not aware of an existing
> terminology.
>
> The following is about how to treat 'free names' inside generator
> expressions.  I argue that these names should be bound to their values
> when the generator expression is created, and the rest of this email
> tries to provide arguments why this may be desirable.
>

Calling it a free variable is the right term (at least according to
Python and the functional programmers of the world).

As for what you are asking for, I do believe it came up during the
discussion of when genexps were added to the language.  I honestly
don't remember the reasoning as to why we didn't do it this way, but I
am willing to guess it has something to do with simplicity and purity
of what genexps are meant to be.

Consider what your genexp, ``(x for x in P if x % p)``, really is::

  def _genexp():
      for x in P:
        if x % p:
          yield x

But what you are after is::

  def _genexp():
    _P = P
    _p = p
    for x in _P:
      if x % _p:
        yield x

The former maps to what you see in the genexp much more literally than
the latter.  And if you want the latter, just define a function like
the above but have it take P and p as arguments and then you get your
generator just the way you want it.

Genexps (as with listcomps) are just really simple syntactic sugar.
And Python tends to skew from hiding details like capturing variables
implicitly in some piece of syntactic sugar.  In my opinion it is
better to be more explicit with what you want the generator to do and
just write out a generator function.
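
Concretely, that suggestion looks like this (a sketch; the function
name is illustrative):

```python
# An explicit generator function takes P and p as arguments, so they
# are bound at call time and later rebinding cannot affect it:
def filtered(P, p):
    for x in P:
        if x % p:
            yield x

P = range(2, 10)
p = 2
gen = filtered(P, p)
p = 7                    # no effect: p was captured as an argument
print(list(gen))         # [3, 5, 7, 9]
```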

-Brett


From g.brandl at gmx.net  Wed Dec 12 22:56:35 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 12 Dec 2007 22:56:35 +0100
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
	<bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>
Message-ID: <fjplat$6r5$1@ger.gmane.org>

Brett Cannon schrieb:
> On Dec 12, 2007 12:10 PM, Arnaud Delobelle <arno at marooned.org.uk> wrote:
>> In the generator expression
>>
>>     (x+1 for x in L)
>>
>> the name 'L' is not local to the expression (as opposed to 'x'), I
>> will call such a name a 'free name', as I am not aware of an existing
>> terminology.
>>
>> The following is about how to treat 'free names' inside generator
>> expressions.  I argue that these names should be bound to their values
>> when the generator expression is created, and the rest of this email
>> tries to provide arguments why this may be desirable.
>>
> 
> Calling it a free variable is the right term (at least according to
> Python and the functional programmers of the world).
> 
> As for what you are asking for, I do believe it came up during the
> discussion of when genexps were added to the language.  I honestly
> don't remember the reasoning as to why we didn't do it this way, but I
> am willing to guess it has something to do with simplicity and purity
> of what genexps are meant to be.
> 
> Consider what your genexp, ``(x for x in P if x % p)``, really is::
> 
>   def _genexp():
>       for x in P:
>         if x % p:
>           yield x

Actually it is

def _genexp(P):
    for x in P:
        if x % p:
            yield x

IOW, the outermost iterator is not a free variable, but is passed to
the invisible function object.

Georg



From arno at marooned.org.uk  Thu Dec 13 00:01:42 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Wed, 12 Dec 2007 23:01:42 +0000
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <fjplat$6r5$1@ger.gmane.org>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
	<bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>
	<fjplat$6r5$1@ger.gmane.org>
Message-ID: <72EFBFDC-DD0B-40AB-B042-CE35E220762C@marooned.org.uk>


On 12 Dec 2007, at 21:56, Georg Brandl wrote:

> Brett Cannon schrieb:
[...]
>
>>
>> Consider what your genexp, ``(x for x in P if x % p)``, really is::
>>
>>  def _genexp():
>>      for x in P:
>>        if x % p:
>>          yield x
>
> Actually it is
>
> def _genexp(P):
>    for x in P:
>        if x % p:
>            yield x
>
> IOW, the outmost iterator is not a free variable, but passed to the
> invisible function object.


I see. 'P' gets frozen but not 'p', so I should be able to write:

def gen2_sieve(n):
     "Generate all primes less than n"
     P =  xrange(2, n)
     while True:
         for p in P:
             yield p
             P = (lambda p: (x for x in P if x % p))(p)
             break
         else:
             return

 >>> list(gen2_sieve(20))
[2, 3, 5, 7, 11, 13, 17, 19]

It seems to work.  Ok then in the same vein, I imagine that

     (x + y for x in A for y in B)

becomes:

def _genexp(A):
     for x in A:
         for y in B:
             yield x + y

Let's test this (python 2.5):

 >>> A = '12'
 >>> B = 'ab'
 >>> gen = (x + y for x in A for y in B)
 >>> A = '34'
 >>> B = 'cd'
 >>> list(gen)
['1c', '1d', '2c', '2d']

So in the generator expression, A remains bound to the string '12'
but B gets rebound to 'cd'.  This may make the implementation of
generator expressions more straightforward, but from the point of view
of a user of the language it seems rather arbitrary. What makes A so
special as opposed to B?  Ok it belongs to the outermost loop, but
conceptually in the example above there is no outermost loop.

At the moment I still think it makes more sense for a generator
expression to produce, as far as possible, the same sequence as the
corresponding list comprehension would have, i.e.:
l = [f(x, y) for x in A for y in B(x) if g(x, y)]
gen = (f(x, y) for x in A for y in B(x) if g(x, y))
<code, maybe binding A, B, f, g to new objects>
assert list(gen) == l

to work as much as possible.

Perhaps I should go and see how generator expressions are generated in
the python source code.

-- 
Arnaud




From g.brandl at gmx.net  Thu Dec 13 00:41:22 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 13 Dec 2007 00:41:22 +0100
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <72EFBFDC-DD0B-40AB-B042-CE35E220762C@marooned.org.uk>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>	<bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>	<fjplat$6r5$1@ger.gmane.org>
	<72EFBFDC-DD0B-40AB-B042-CE35E220762C@marooned.org.uk>
Message-ID: <fjprfb$sjr$1@ger.gmane.org>

Arnaud Delobelle schrieb:

> Let's test this (python 2.5):
> 
>  >>> A = '12'
>  >>> B = 'ab'
>  >>> gen = (x + y for x in A for y in B)
>  >>> A = '34'
>  >>> B = 'cd'
>  >>> list(gen)
> ['1c', '1d', '2c', '2d']
> 
> So in the generator expression, A is remains bound to the string '12'
> but B gets rebound to 'cd'.  This may make the implementation of
> generator expressions more straighforward, but from the point of view
> of a user of the language it seems rather arbitrary. What makes A so
> special as opposed to B?  Ok it belongs to the outermost loop, but
> conceptually in the example above there is no outermost loop.

Well, B might depend on A so it can't be evaluated in the outer context
at the time the genexp "function" is called. It has to be evaluated
inside the "function".

Georg



From greg.ewing at canterbury.ac.nz  Thu Dec 13 00:36:19 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Dec 2007 12:36:19 +1300
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
Message-ID: <47607073.1030307@canterbury.ac.nz>

Arnaud Delobelle wrote:

> The following is about how to treat 'free names' inside generator
> expressions.  I argue that these names should be bound to their values
> when the generator expression is created,

My opinion is that this is the wrong place to attack the
problem. Note that it's not only generator expressions that
have this issue -- the same thing can trip you up with
lambdas as well.

It's instructive to consider why other languages with
lexical scoping and first-class functions don't seem to
have this problem to the same extent. In Scheme, for
example, the reason is that its looping constructs
usually create a new binding for the loop variable
on each iteration, instead of re-using the same one.
So if you return a lambda from inside the loop, each
one lives in a different lexical environment and
sees a different value for the loop variable.

If Python's for-loop did the same thing, I suspect that
this problem would turn up much less frequently.

In CPython, there's a straightforward way to implement
this: if the variable is used in an inner function, and
is therefore in a cell, create a new cell each time
round the loop instead of replacing the contents of the
existing one. (If the variable isn't in a cell, there's
no need to change anything.)

Note that this wouldn't interfere with using the loop
variable after the loop has finished -- the variable
is still visible to the whole function, and the following
code just sees whatever is in the last created cell.
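
The lambda version of the pitfall, and the usual default-argument
workaround, for comparison (a sketch in modern Python syntax):

```python
# All three lambdas close over the same binding of i, so they all see
# its final value:
funcs = [lambda: i for i in range(3)]
shared = [f() for f in funcs]        # [2, 2, 2]: one shared cell

# binding i as a default argument captures its value per iteration:
funcs = [lambda i=i: i for i in range(3)]
captured = [f() for f in funcs]      # [0, 1, 2]: one value per lambda

print(shared, captured)
```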

--
Greg


From arno at marooned.org.uk  Thu Dec 13 08:08:53 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Thu, 13 Dec 2007 07:08:53 +0000
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <fjprfb$sjr$1@ger.gmane.org>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>	<bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>	<fjplat$6r5$1@ger.gmane.org>
	<72EFBFDC-DD0B-40AB-B042-CE35E220762C@marooned.org.uk>
	<fjprfb$sjr$1@ger.gmane.org>
Message-ID: <24DF1DCA-A737-4E6E-8A73-7F1EFA605402@marooned.org.uk>


On 12 Dec 2007, at 23:41, Georg Brandl wrote:

> Arnaud Delobelle schrieb:
>
>> Let's test this (python 2.5):
>>
>>>>> A = '12'
>>>>> B = 'ab'
>>>>> gen = (x + y for x in A for y in B)
>>>>> A = '34'
>>>>> B = 'cd'
>>>>> list(gen)
>> ['1c', '1d', '2c', '2d']
>>
>> So in the generator expression, A is remains bound to the string '12'
>> but B gets rebound to 'cd'.  This may make the implementation of
>> generator expressions more straighforward, but from the point of view
>> of a user of the language it seems rather arbitrary. What makes A so
>> special as opposed to B?  Ok it belongs to the outermost loop, but
>> conceptually in the example above there is no outermost loop.
>
> Well, B might depend on A so it can't be evaluated in the outer  
> context
> at the time the genexp "function" is called. It has to be evaluated
> inside the "function".

You're right.  I expressed myself badly: I was not talking about
evaluation but about binding.  I was saying that since the name A
stays bound to the object it was bound to when the generator
expression was created, the same should happen with B.

-- 
Arnaud




From ntoronto at cs.byu.edu  Fri Dec 14 07:40:15 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Thu, 13 Dec 2007 23:40:15 -0700
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
	(probably)
Message-ID: <4762254F.8030607@cs.byu.edu>

I apologize if this is well-known, but it's new to me. It may be 
something to keep in mind when we occasionally dabble in generic 
functions or multimethods. Hopefully there's a way around it.

While kicking around ideas for multiple dispatch, I came across one of 
Perl's implementations, and it reminded me of something Larry Wall said 
recently in his State of the Onion address about how types get together 
and democratically select a function to dispatch to. I immediately 
thought of Arrow's Theorem.

Arrow's Theorem says that you can't (final, period, that's it) make a 
"social choice function" that's fair. A social choice function takes 
preferences as inputs and outputs a single preference ordering. (You 
could think of types as casting preference votes for functions, for 
example, by stating their "distance" from the exact type the function 
expects. It doesn't matter exactly how they vote, just that they do it.) 
"Fair" means the choice function meets these axioms:

1. The output preference has to be a total order. This is important for 
choosing a president or multiple dispatch, since without it you can't 
choose a "winner".

2. There has to be more than one agent (type, here) and more than two 
choices. (IOW, with two choices or one agent it's possible to be fair.)

3. If a new choice appears, it can't affect the final pairwise ordering 
of two already-existing choices. (Called "independence of irrelevant 
alternatives". It's one place most voting systems fail. Bush vs. Clinton 
vs. Perot is a classic example.)

4. An agent promoting a choice can't demote it in the final pairwise 
ordering. (Called "positive association". It's another place most voting 
systems fail.)

5. There has to be some way to get any particular pairwise ordering. 
(Called "citizen's sovereignty". Notice this doesn't say anything about 
a complete ordering.)

6. No single agent (or type) can control a final pairwise ordering 
independent of every other agent. (Called "non-dictatorship".)


Notice what it doesn't say, since some of these are very loose 
requirements. In particular, it says nothing about relative strengths of 
votes.

What does this mean for multiple dispatch?

It seems to explain why so many different kinds of dispatch algorithms 
have been invented: they can't be made "fair" by those axioms, which all 
seem to describe desirable dispatch behaviors. No matter how it's done, 
either the outcome is intransitive (violates #1), adding a seemingly 
unrelated function flips the top two in some cases (violates #3), using 
a more specific type can cause a strange demotion (violates #4), or 
there's no way to get some specific function as winner (violates #5). 
Single dispatch gets around this problem by violating #6 - that is, the 
first type is the dictator.

Am I missing something here? I want to be, since multiple dispatch 
sounds really nice.

Neil



From arno at marooned.org.uk  Fri Dec 14 09:20:20 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Fri, 14 Dec 2007 08:20:20 -0000 (GMT)
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
 (probably)
Message-ID: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>

(Sorry I can't set In-Reply-To; I've lost the original message)

Neil Toronto ntoronto at cs.byu.edu:

> While kicking around ideas for multiple dispatch, I came across one of
> Perl's implementations, and it reminded me of something Larry Wall said
> recently in his State of the Onion address about how types get together
> and democratically select a function to dispatch to. I immediately
> thought of Arrow's Theorem.

Maybe I'm wrong, but I don't think this theorem applies in the case of
multiple dispatch.  The type of the argument sequence as a whole chooses
the best specialisation; it's not the case that each argument expresses a
preference.  The set of choices is the set S of signatures of
specialisations of the function (which is partially ordered), and the
'committee' is made of one entity only: the function call's signature s. 
To choose the relevant specialisation, look at:

{ t \in S | s <= t and \forall u \in S (s < u <= t) \implies  u = t }

If this set is a singleton { t } then the specialization with signature t
is the best fit for s, otherwise there is no best fit.
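
In Python terms, a sketch of that selection rule (the names `le` and
`best_fit` are illustrative; s <= t is read pointwise via issubclass,
"s is at least as specific as t"):

```python
def le(s, t):
    # s <= t: each argument type in s is a subclass of the one in t
    return len(s) == len(t) and all(issubclass(a, b) for a, b in zip(s, t))

def best_fit(s, signatures):
    # candidates t with s <= t; keep only the minimal ones above s
    candidates = [t for t in signatures if le(s, t)]
    minimal = [t for t in candidates
               if not any(le(u, t) and u != t for u in candidates)]
    # dispatch succeeds only if exactly one minimal signature remains
    return minimal[0] if len(minimal) == 1 else None

sigs = [(object, object), (int, object), (object, int)]
print(best_fit((int, str), sigs))   # unique best fit: (int, object)
print(best_fit((int, int), sigs))   # None: two incomparable minimal fits
```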

-- 
Arnaud




From aaron.watters at gmail.com  Fri Dec 14 16:31:27 2007
From: aaron.watters at gmail.com (Aaron Watters)
Date: Fri, 14 Dec 2007 10:31:27 -0500
Subject: [Python-ideas] democratic multiple dispatch and type generality
	partial orders
Message-ID: <fc13a6500712140731i6612d9b2xd8bbc8ef5fdbb511@mail.gmail.com>

From: "Arnaud Delobelle" <arno at marooned.org.uk>

>
> { t \in S | s <= t and \forall u \in S (s < u <= t) \implies  u = t }


Sorry if I'm late to this discussion, but the hard part here
is defining the partial order <=, it seems to me.
I've tried to think about this at times and I keep getting
entangled in some highly recursive graph matching scheme.
I'm sure the Lisp community has done something with this,
anyone have a reference or something?

For example:

def myFunction(x, y, L):
      z = x+y
      L.append(z)
      return z

What is (or should be) the type of myFunction, x, y, and L?
Well, let's see. type(x) has an addition function that works
with type(y), or maybe type(y) has a radd function that works
with type(x)... or I think Python might try something
involving coercion...? In any case the result type(z) is acceptable to
the append function of type(L)... then I guess myFunction is of type
   type(x) x type(y) x type(L) --> type(z)...

I'm lost.

But that's only the start!  Then you have to figure out efficient
ways of calculating these type annotations and looking up "minimal"
matching types.

Something tells me that this might be a Turing complete
problem -- but that doesn't mean you can't come up with
a reasonable and useful weak approximation.  Please inform
if I'm going over old ground or otherwise am missing something.

Thanks,  -- Aaron Watters
===
http://www.xfeedme.com/nucular/pydistro.py/go?FREETEXT=melting

From stephen at xemacs.org  Fri Dec 14 20:46:17 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 15 Dec 2007 04:46:17 +0900
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
 (probably)
In-Reply-To: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
Message-ID: <873au56ovq.fsf@uwakimon.sk.tsukuba.ac.jp>

Arnaud Delobelle writes:
 > (Sorry I can't in-reply-to I've lost the original message)
 > 
 > Neil Toronto ntoronto at cs.byu.edu:
 > 
 > > While kicking around ideas for multiple dispatch, I came across one of
 > > Perl's implementations, and it reminded me of something Larry Wall said
 > > recently in his State of the Onion address about how types get together
 > > and democratically select a function to dispatch to. I immediately
 > > thought of Arrow's Theorem.
 > 
 > Maybe I'm wrong, but I don't think this theorem applies in the case of
 > multiple dispatch.  The type of the argument sequence as a whole chooses
 > the best specialisation; it's not the case that each argument expresses a
 > preference.

I don't think it applies but I think a detailed analysis sheds light
on why perfect multiple dispatch is impossible.

The second part is true, individual arguments are not expressing
preferences in general (though they could, for example a generic
concatenation function could take an optional return_type argument so
that the appropriate type of sequence is constructed from scratch,
rather than sometimes requiring a possibly expensive conversion).  The
type is not doing the choosing, though.

The problem here is that what is really doing the choosing is the
application calling the function.  You could imagine a module dwim
which contains exactly one API: dwim(), which of course does the
computation needed by the application at that point.  dwim's
implementation would presumably dispatch to specializations.

Now since all dwim has is the type signature (this includes any
subtypes deducible from the value, such as "nonnegative integer", so
it's actually quite a lot of information), dwim can't work.  E.g.,
search("de","dent") and append("de","dent") have the same signature
(and a lisper might even return the substring whose head matches the
pattern, so the return type might not disambiguate).

While this is an example requiring so much generality as to be
bizarre, I think it's easy to imagine situations where applications
will disagree about the exact semantics that some generic function
should have.  The generic concatenation function is one, and an
individual application might even want to have the dispatch done
differently in different parts of the program.

In sum, the problem is the real "voters" (applications) are
represented by rather limited information (argument signatures) to the
dispatcher.

So the real goal here seems to be to come up with rules that
*programmers* can keep in their heads that (a) are flexible enough to
cover lots of common situations and (b) simple enough so that any
programmer good enough to be given dog food can remember both the
covered situations and the exceptions.

But that, in turn, requires the "situations" to be easy to factor the
"correct" way.  Saunders Mac Lane makes a comment about the analysis
of certain spaces in mathematics, where the special case that provides
all the intuition wasn't defined for decades after the general case.
Had it been done the other way around, the important theorems would
have been proved within a couple of years, and grad students could
have worked out the details within the decade.

So I don't think that this can be worked out by defining multiple
dispatch for Python; you have to define Python for multiple dispatch.
Maybe it's already pretty close?!

This-is-a-job-for-the-time-machine-ly y'rs,



From ntoronto at cs.byu.edu  Sat Dec 15 01:28:32 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Fri, 14 Dec 2007 17:28:32 -0700
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
	(probably)
In-Reply-To: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
Message-ID: <47631FB0.3080101@cs.byu.edu>

Arnaud Delobelle wrote:
> Neil Toronto ntoronto at cs.byu.edu:
> 
>> While kicking around ideas for multiple dispatch, I came across one of
>> Perl's implementations, and it reminded me of something Larry Wall said
>> recently in his State of the Onion address about how types get together
>> and democratically select a function to dispatch to. I immediately
>> thought of Arrow's Theorem.
> 
> Maybe I'm wrong, but I don't think this theorem applies in the case of
> multiple dispatch.  The type of the argument sequence as a whole chooses
> the best specialisation; it's not the case that each argument expresses a
> preference.  The set of choices is the set S of signatures of
> specialisations of the function (which is partially ordered), and the
> 'committee' is made of one entity only: the function call's signature s. 
> To choose the relevant specialisation, look at:
> 
> { t \in S | s <= t and \forall u \in S (s < u <= t) \implies  u = t }
> 
> If this set is a singleton { t } then the specialization with signature t
> is the best fit for s, otherwise there is no best fit.

Let's call this set T. In English, it's the set of t's (function 
signatures) whose types aren't more specific than s's - but only the 
most specific ones according to the partial general-to-specific 
ordering. Most multiple dispatch algorithms agree that T has the right 
function in it.

Some kind of conflict resolution seems necessary. The number of ways to 
have "no best fit" is exponential in the number of function arguments, 
so punting with a raised exception doesn't seem useful. It's in the 
conflict resolution - when the set isn't a singleton - that multiple 
dispatch algorithms are most different.

Conflict resolution puts us squarely into Arrow's wasteland. Whether 
it's formulated as a social choice function or not, I think it's 
equivalent to one.

In a social choice reformulation of multiple dispatch conflict 
resolution, T are the candidate outcomes. (This much is obvious.) The 
types are the agents: each prefers functions in which the type in its 
position in the signature is most specific. Again, it's a reformulation, 
but I believe it's equivalent.

The application implements a social choice function: it picks a winner 
function based on types' "preferences". Arrow's Theorem doesn't care 
exactly how it does this - whether it has types vote or sums distances 
from whatever - it just says that however it does this, the result can't 
always be "fair". And "fair" doesn't even mean one type can't have more 
say than another by, for example, narrowing the subset first based on 
its own preferences. It only means that things don't act 
counterintuitively in general, that there's *some* means to every 
outcome, and that one agent (type) doesn't control everything.

Maybe, as not-useful as it looks, punting with a raised exception is the 
only way out.

Neil


From ntoronto at cs.byu.edu  Sat Dec 15 09:07:14 2007
From: ntoronto at cs.byu.edu (ntoronto at cs.byu.edu)
Date: Sat, 15 Dec 2007 01:07:14 -0700 (MST)
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
 (probably)
In-Reply-To: <47631FB0.3080101@cs.byu.edu>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
	<47631FB0.3080101@cs.byu.edu>
Message-ID: <32039.10.7.75.26.1197706034.squirrel@mail.cs.byu.edu>

I wrote:
> The application implements a social choice function: it picks a winner
> function based on types' "preferences". Arrow's Theorem doesn't care
> exactly how it does this - whether it has types vote or sums distances
> from whatever - it just says that however it does this, the result can't
> always be "fair". And "fair" doesn't even mean one type can't have more
> say than another by, for example, narrowing the subset first based on
> its own preferences. It only means that things don't act
> counterintuitively in general, that there's *some* means to every
> outcome, and that one agent (type) doesn't control everything.

Sorry to self-reply, but I was just reading this:

    http://dev.perl.org/perl6/rfc/256.html

"""
1. 2. 3. <Exact matches>

4. Otherwise, the dispatch mechanism examines each viable target and
computes its inheritance distance from the actual set of arguments. The
inheritance distance from a single argument to the corresponding parameter
is the number of inheritance steps between their respective classes
(working up the tree from argument to parameter). If there's no
inheritance path between them, the distance is infinite. The inheritance
distance for a set of arguments is just the sum of their individual
inheritance distances.

5. The dispatch mechanism then chooses the viable target with the smallest
inheritance distance as the actual target. If more than one viable target
has the same smallest distance, the call is ambiguous. In that case, the
dispatch process fails and an exception is thrown (but see "Handling
dispatch failure" below). If there's only a single actual target, its
identity is cached (to accelerate subsequent dispatching), and then the
actual target is invoked.
"""

This is almost precisely the Borda protocol:

    http://en.wikipedia.org/wiki/Borda_count

Borda does not satisfy independence of irrelevant alternatives. HOWEVER,
the proposed dispatch mechanism does.

Why? The agents are voting a kind of utility, not a preference. With the
addition or removal of a function, the sum of votes for any other function
will not change. AFAIK, that's the only Arrow-ish problem with Borda, so
this proposed dispatch mechanism doesn't have the problems I was
expecting. Big stink for nothing, I guess.

I imagine that making next-method calls behave is a nightmare, though. The
more I read about multiple dispatch, the less I like it.

Neil



From stephen at xemacs.org  Sat Dec 15 23:08:43 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 16 Dec 2007 07:08:43 +0900
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
 (probably)
In-Reply-To: <32039.10.7.75.26.1197706034.squirrel@mail.cs.byu.edu>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
	<47631FB0.3080101@cs.byu.edu>
	<32039.10.7.75.26.1197706034.squirrel@mail.cs.byu.edu>
Message-ID: <87zlwbtxuc.fsf@uwakimon.sk.tsukuba.ac.jp>

ntoronto at cs.byu.edu writes:

 > This is almost precisely the Borda protocol:
 > 
 >     http://en.wikipedia.org/wiki/Borda_count

No, it is not.  The Borda rule requires a linear order, whereas this
is a partial order, and complete indifference is possible.

 > Why? The agents are voting a kind of utility, not a preference. With the
 > addition or removal of a function, the sum of votes for any other function
 > will not change.

And this utilitarian rule has its own big problems.

For one, you could argue that it violates the Arrow condition of
non-dictatorship because *somebody* has to choose the weights.  In
particular, weighting the number of levels by one doesn't make a lot
of sense: some developers prefer a shallow style with an abstract
class and a lot of concrete derivatives, others prefer a hierarchy of
abstract classes with several levels before arriving at the concrete
implementations.  I think it would be a bad thing if devotees of the
latter style were discouraged because their users found the
convenience of automatic dispatch more important than the (usually
invisible) internal type hierarchy.

Also, my intuition doesn't rule out the possibility that self *should*
be a dictator if it "expresses a preference" -- the Arrow condition of
non-dictatorship might not even be appropriate.

 > I imagine that making next-method calls behave is a nightmare, though. The
 > more I read about multiple dispatch, the less I like it.

It's a hard problem.



From jan.kanis at phil.uu.nl  Sat Dec 15 23:41:12 2007
From: jan.kanis at phil.uu.nl (Jan Kanis)
Date: Sat, 15 Dec 2007 23:41:12 +0100
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <24DF1DCA-A737-4E6E-8A73-7F1EFA605402@marooned.org.uk>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
	<bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>
	<fjplat$6r5$1@ger.gmane.org>
	<72EFBFDC-DD0B-40AB-B042-CE35E220762C@marooned.org.uk>
	<fjprfb$sjr$1@ger.gmane.org>
	<24DF1DCA-A737-4E6E-8A73-7F1EFA605402@marooned.org.uk>
Message-ID: <op.t3eboqtqd64u53@jan-laptop.netwerk>

On Thu, 13 Dec 2007 08:08:53 +0100, Arnaud Delobelle  
<arno at marooned.org.uk> wrote:

>
> On 12 Dec 2007, at 23:41, Georg Brandl wrote:
>
>> Arnaud Delobelle schrieb:
>>
>>> Let's test this (python 2.5):
>>>
>>> >>> A = '12'
>>> >>> B = 'ab'
>>> >>> gen = (x + y for x in A for y in B)
>>> >>> A = '34'
>>> >>> B = 'cd'
>>> >>> list(gen)
>>>  ['1c', '1d', '2c', '2d']
>>>
>>> So in the generator expression, A remains bound to the string '12'
>>> but B gets rebound to 'cd'.  This may make the implementation of
>>> generator expressions more straightforward, but from the point of view
>>> of a user of the language it seems rather arbitrary. What makes A so
>>> special as opposed to B?  Ok it belongs to the outermost loop, but
>>> conceptually in the example above there is no outermost loop.
>>
>> Well, B might depend on A so it can't be evaluated in the outer
>> context
>> at the time the genexp "function" is called. It has to be evaluated
>> inside the "function".
>
> You're right. I expressed myself badly: I was not talking about
> evaluation but binding.  I was saying that if the name A is bound to
> the object that A is bound to when the generator expression is
> created, then the same should happen with B.
>

I think what Georg meant was this (I intended to reply this to your  
earlier mail of Thursday AM, but Georg beat me to it):

The reason for not binding B when the genexp is defined is so you can do  
this:

  >>> A = [[1, 2], [3, 4]]
  >>> gen = (x for b in A for x in b)
  >>> list(gen)
  [1, 2, 3, 4]

Here, b can't be bound to something at generator definition time because  
the 'something' may not exist yet. (It does actually in this example, but  
you get the point.) So, only the first (outer loop) iterable is bound  
immediately.

Whether a variable is rebound within the expression could of course be  
decided at compile time, so all free variables could be bound immediately.  
I think that would be an improvement, but it requires the compiler to be a  
bit smarter. Unfortunately, it seems to be pythonic to bind variables at  
moments I disagree with :), like function default arguments (bound at  
definition instead of call) and loop counters (rebound every iteration  
instead of every iteration having its own scope).
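For instance, the loop-counter binding shows up in the classic closure
gotcha:

```python
# each lambda closes over the name i, which is rebound every iteration,
# so all three see the final value
fs = [lambda: i for i in range(3)]
[f() for f in fs]          # [2, 2, 2]

# a default argument is bound at definition time, so each lambda
# captures the value i had at that iteration
gs = [lambda i=i: i for i in range(3)]
[g() for g in gs]          # [0, 1, 2]
```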

And, while I'm writing this:

On Thu, 13 Dec 2007 00:01:42 +0100, Arnaud Delobelle  
<arno at marooned.org.uk> wrote:
> l = [f(x, y) for x in A for y in B(x) if g(x, y)]
> g = [f(x, y) for x in A for y in B(x) if g(x, y)]
> <code, maybe binding A, B, f, g to new objects>
> assert list(g) == l

I suppose this should have been

g = (f(x, y) for x in A for y in B(x) if g(x, y))


Jan


From arno at marooned.org.uk  Sun Dec 16 00:31:19 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Sat, 15 Dec 2007 23:31:19 +0000
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <op.t3eboqtqd64u53@jan-laptop.netwerk>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
	<bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>
	<fjplat$6r5$1@ger.gmane.org>
	<72EFBFDC-DD0B-40AB-B042-CE35E220762C@marooned.org.uk>
	<fjprfb$sjr$1@ger.gmane.org>
	<24DF1DCA-A737-4E6E-8A73-7F1EFA605402@marooned.org.uk>
	<op.t3eboqtqd64u53@jan-laptop.netwerk>
Message-ID: <47FB762F-A4F4-4522-BE07-BAF103EC0ED0@marooned.org.uk>


On 15 Dec 2007, at 22:41, Jan Kanis wrote:

> On Thu, 13 Dec 2007 08:08:53 +0100, Arnaud Delobelle <arno at marooned.org.uk 
> > wrote:
>
>>
>> On 12 Dec 2007, at 23:41, Georg Brandl wrote:
>>
>>> Arnaud Delobelle schrieb:
>>>
>>>> Let's test this (python 2.5):
>>>>
>>>> >>> A = '12'
>>>> >>> B = 'ab'
>>>> >>> gen = (x + y for x in A for y in B)
>>>> >>> A = '34'
>>>> >>> B = 'cd'
>>>> >>> list(gen)
>>>> ['1c', '1d', '2c', '2d']
>>>>
>>>> So in the generator expression, A remains bound to the string
>>>> '12'
>>>> but B gets rebound to 'cd'.  This may make the implementation of
>>>> generator expressions more straightforward, but from the point of
>>>> view
>>>> of a user of the language it seems rather arbitrary. What makes A  
>>>> so
>>>> special as opposed to B?  Ok it belongs to the outermost loop, but
>>>> conceptually in the example above there is no outermost loop.
>>>
>>> Well, B might depend on A so it can't be evaluated in the outer
>>> context
>>> at the time the genexp "function" is called. It has to be evaluated
>>> inside the "function".
>>
>> You're right. I expressed myself badly: I was not talking about
>> evaluation but binding.  I was saying that if the name A is bound to
>> the object that A is bound to when the generator expression is
>> created, then the same should happen with B.
>>
>
> I think what Georg meant was this (I intended to reply this to your  
> earlier mail of Thursday AM, but Georg beat me to it):
>
> The reason for not binding B when the genexp is defined is so you  
> can do this:
>
> >>> A = [[1, 2], [3, 4]]
> >>> gen = (x for b in A for x in b)
> >>> list(gen)
> [1, 2, 3, 4]
>
> Here, b can't be bound to something at generator definition time  
> because the 'something' may not exist yet. (It does actually in this  
> example, but you get the point.) So, only the first (outer loop)  
> iterable is bound immediately.
>

In your example, b is not free of course.

> Whether a variable is rebound within the expression could of course  
> be decided at compile time, so all free variables could be bound  
> immediately. I think that would be an improvement, but it requires  
> the compiler to be a bit smarter.
>
This is what I was advocating.  As it is decided at compile time
which variables are free, it may only be a small extra step to
add a bit of code saying that they must be bound at the creation
of the generator expression.  Or, to continue with the _genexp
function mentioned in previous posts, for:

(f(x) for b in a for x in b)

to be translated as

def _genexp(f, A):
     for b in A:
         for x in b:
             yield f(x)

as A and f are free but not b and x.  Then

gen = (f(x) for b in A for x in b)

would be translated as

gen = _genexp(f, A)

I imagine this wouldn't be too hard, but I am not familiar with
the specifics of python code compilation...
Moreover this behaviour ('freezing' all free variables at the
creation of the generator expression) is well defined and easy
to reason on I think.  I haven't yet had the time to see how
generator expressions are created, but I'd like to have a look,
although I suspect I will have to learn a lot more besides in
order to understand it.

[...]
> And, while I'm writing this:
>
> On Thu, 13 Dec 2007 00:01:42 +0100, Arnaud Delobelle <arno at marooned.org.uk 
> > wrote:
>> l = [f(x, y) for x in A for y in B(x) if g(x, y)]
>> g = [f(x, y) for x in A for y in B(x) if g(x, y)]
>> <code, maybe binding A, B, f, g to new objects>
>> assert list(g) == l
>
> I suppose this should have been
>
> g = (f(x, y) for x in A for y in B(x) if g(x, y))

Yes!  Sorry about that.  In fact, I should also have called the
generator expression something other than 'g', as it is already
the name of a function (g(x, y)) :|

-- 
Arnaud




From ntoronto at cs.byu.edu  Mon Dec 17 06:17:57 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Sun, 16 Dec 2007 22:17:57 -0700
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
	(probably)
In-Reply-To: <87zlwbtxuc.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>	<47631FB0.3080101@cs.byu.edu>	<32039.10.7.75.26.1197706034.squirrel@mail.cs.byu.edu>
	<87zlwbtxuc.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <47660685.8020903@cs.byu.edu>

Stephen J. Turnbull wrote:
> ntoronto at cs.byu.edu writes:
>  > Why? The agents are voting a kind of utility, not a preference. With the
>  > addition or removal of a function, the sum of votes for any other function
>  > will not change.
> 
> And this utilitarian rule has its own big problems.
> 
> For one, you could argue that it violates the Arrow condition of
> non-dictatorship because *somebody* has to choose the weights.  In
> particular, weighting the number of levels by one doesn't make a lot
> of sense: some developers prefer a shallow style with an abstract
> class and a lot of concrete derivatives, others prefer a hierarchy of
> abstract classes with several levels before arriving at the concrete
> implementations.  I think it would be a bad thing if devotees of the
> latter style were discouraged because their users found the
> convenience of automatic dispatch more important than the (usually
> invisible) internal type hierarchy.

Good point. Further, what if it's more important for one type to be 
treated as specifically as possible than for another?

In a multi-agent setting, we'd call this the problem of combining 
utilities. In general, there's not a good way to do it - utilities (or 
these pseudo-utilities) are unique only up to a linear (or positive 
affine) transform. Assuming it's possible to pick a zero point you can 
normally shift and multiply them, but here the only obvious zero point 
(minimum distance over all matching functions) would net you a lot of ties.

Here's another fun idea: a lesser-known theorem called 
Gibbard-Satterthwaite says you can't construct a fair voting system in 
which the best response for each agent is to always tell the truth. It's 
strange to think of types *lying* about their distances from each other, 
but I'm sure users will end up constructing whacked-out hierarchies in 
an attempt to game the system and swing the vote toward specific functions.

At that point, dispatch may as well be completely manual.

Maybe the voting reformulation wasn't such a bad idea after all. At the 
very least, I've decided I really don't like Perl's implementation. :D

> Also, my intuition doesn't rule out the possibility that self *should*
> be a dictator if it "expresses a preference" -- the Arrow condition of
> non-dictatorship might not even be appropriate.

I'm not completely sure of this (the docs I've read have been pretty 
vague) but I think CLOS does that. One way to look at the problem is 
finding a total order in some D1xD2x...xDn space, where Di is a type's 
distance from matching function signature types. CLOS imposes a 
lexicographical order on this space, so self is a dictator. Type 2 gets 
to be a dictator if self is satisfied, and so on. Somehow they make it 
work with multiple inheritance. Of all the approaches I've seen I like 
this best, though it's still really complex.
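With made-up per-argument distance vectors, that lexicographic order is
just Python's tuple comparison:

```python
# hypothetical per-argument inheritance distances for two candidate
# methods; position 0 is self, which dominates
candidates = [
    ((0, 2), 'method_1'),   # exact match on self, looser on the 2nd arg
    ((1, 0), 'method_2'),   # looser on self, exact on the 2nd arg
]
# tuples compare lexicographically, so self acts as a dictator
winner = min(candidates)[1]   # 'method_1'
```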

Python's __op__/__rop__ is a rather clever solution to the problem in 
binary form. Do you know if there's a generalization of this? CLOS is 
almost it, but Python lets the class containing __op__ or __rop__ own 
the operation after it finds one that implements it.
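For reference, the binary protocol in miniature (Meters is a made-up
type):

```python
class Meters(object):
    def __init__(self, n):
        self.n = n
    def __add__(self, other):
        # decline pairings we don't understand; Python then tries
        # the other operand's __radd__
        if isinstance(other, (int, float)):
            return Meters(self.n + other)
        return NotImplemented
    def __radd__(self, other):
        # reached when the left operand's __add__ returned NotImplemented
        if isinstance(other, (int, float)):
            return Meters(other + self.n)
        return NotImplemented

(Meters(2) + 1).n   # 3, via Meters.__add__
(1 + Meters(2)).n   # 3: int.__add__ declines, so Meters.__radd__ owns it
```

(And if the right operand's class is a subclass of the left's, Python
tries its __radd__ first, which is how subclasses get to override.)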

Neil


From stephen at xemacs.org  Mon Dec 17 08:27:55 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 17 Dec 2007 16:27:55 +0900
Subject: [Python-ideas] Democratic multiple dispatch doomed to
	fail	(probably)
In-Reply-To: <47660685.8020903@cs.byu.edu>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
	<47631FB0.3080101@cs.byu.edu>
	<32039.10.7.75.26.1197706034.squirrel@mail.cs.byu.edu>
	<87zlwbtxuc.fsf@uwakimon.sk.tsukuba.ac.jp>
	<47660685.8020903@cs.byu.edu>
Message-ID: <87abo9g4qs.fsf@uwakimon.sk.tsukuba.ac.jp>

Neil Toronto writes:

 > Python's __op__/__rop__ is a rather clever solution to the problem in 
 > binary form. Do you know if there's a generalization of this? CLOS is 
 > almost it, but Python lets the class containing __op__ or __rop__ own 
 > the operation after it finds one that implements it.

Sorry, no.  I'm a professional economist (my early work was related to
Gibbard-Satterthwaite, thus my interest here) and wannabe developer,
who happens to have enjoyed the occasional guidance of Ken Arrow a
couple decades ago.  If you think there's something to the application
of multiobjective optimization to this problem, I'd love to take a
hack at it ... no classes to teach winter term, I've got some time to
bone up on the multidispatch problem itself.



From ntoronto at cs.byu.edu  Mon Dec 17 20:43:21 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Mon, 17 Dec 2007 12:43:21 -0700
Subject: [Python-ideas] Democratic multiple dispatch doomed to
	fail	(probably)
In-Reply-To: <87abo9g4qs.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>	<47631FB0.3080101@cs.byu.edu>	<32039.10.7.75.26.1197706034.squirrel@mail.cs.byu.edu>	<87zlwbtxuc.fsf@uwakimon.sk.tsukuba.ac.jp>	<47660685.8020903@cs.byu.edu>
	<87abo9g4qs.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4766D159.9060005@cs.byu.edu>

Stephen J. Turnbull wrote:
> Neil Toronto writes:
> 
>  > Python's __op__/__rop__ is a rather clever solution to the problem in 
>  > binary form. Do you know if there's a generalization of this? CLOS is 
>  > almost it, but Python lets the class containing __op__ or __rop__ own 
>  > the operation after it finds one that implements it.
> 
> Sorry, no.  I'm a professional economist (my early work was related to
> Gibbard-Satterthwaite, thus my interest here) and wannabe developer,
> who happens to have enjoyed the occasional guidance of Ken Arrow a
> couple decades ago.

Excellent.

> If you think there's something to the application
> of multiobjective optimization to this problem, I'd love to take a
> hack at it ... no classes to teach winter term, I've got some time to
> bone up on the multidispatch problem itself.

For finding the first matching function, you could say it's picking some 
"best fit" from the Pareto optimal front. AFAIK all ways of doing that 
suck somehow. :) After that, the problem is calling a next-method, which 
requires imposing a total order on the entire space of fitting 
functions. (This had better define the "best fit" as well.)

The big hurdles to multiple dispatch I see are:

1. It has to "make sense" somehow - maybe by satisfying some set of 
intuitive axioms.
2. It seems unlikely a majority will agree on the axioms. (Yay! Another 
social choice problem!)
3. It has to be simple enough that it's easy to predict what a method or 
next-method call does.

Because this seems nigh-impossible (particularly #3), I'm starting to 
look elsewhere for solutions to "the problem". Speaking of which, it's 
really easy to forget the two primary use-cases that motivate multiple 
dispatch in the first place:

1. Interop with other types. For instance, I define a BarNumber type, 
and I want fooing of BarNumber with anything else to return a new 
BarNumber. Or perhaps I need some class I currently would have no 
control over to treat my custom type specially or apply a transformation 
before doing what it usually does. CLOS (Common Lisp Object System) 
handles this. Python's __op__/__rop__ does for a finite set of binary 
operators.

2. True dynamic overloading so we don't have to do type checks at the 
heads of our functions. CLOS handles this.

#1 can always be done with clever delegation. For example, numpy's 
arrays do it halfway by applying __array__ to a function argument before 
operating on it. If they also applied, say, a type's fromarray 
classmethod upon return it'd provide a full solution.
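A pure-Python sketch of that round trip (the protocol names __to_list__
and from_list are invented here, not numpy's):

```python
class Grid(object):
    """Toy array-like wrapper; the conversion protocol is hypothetical."""
    def __init__(self, data):
        self.data = list(data)
    def __to_list__(self):            # conversion in, like __array__
        return self.data
    @classmethod
    def from_list(cls, data):         # conversion back out
        return cls(data)

def doubled(seq):
    # convert the argument in, operate, and convert the result back
    # to the argument's own type when it knows how
    data = seq.__to_list__() if hasattr(seq, '__to_list__') else list(seq)
    result = [2 * x for x in data]
    fromlist = getattr(type(seq), 'from_list', None)
    return fromlist(result) if fromlist else result
```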

But as with numpy's arrays, users don't have enough control to make it 
work exactly as they want if the only solution is delegation. The class 
creator has to anticipate how his class will be used and risk 
over-generalizing it. In the worst cases, you'd end up with a crazy 
proliferation of types like FooLoggable and FooAppendable and methods 
like __foo_loggable__ and __foo_appendable__...

The most general way to do #2 is the way it's done now. Sure it's 
Turing-complete, but it'd be nice to be able to handle the common cases 
automatically.

PJE has a lot of further reading here:

   http://mail.python.org/pipermail/python-3000/2006-November/004421.html

Neil


From ryan.freckleton at gmail.com  Wed Dec 26 00:33:31 2007
From: ryan.freckleton at gmail.com (Ryan Freckleton)
Date: Tue, 25 Dec 2007 16:33:31 -0700
Subject: [Python-ideas] Anonymous functions for decorators
Message-ID: <318072440712251533u4eadbe79w7af11582f82f3313@mail.gmail.com>

I've been working with decorators to make a design by contract
library. This has gotten me thinking about decorators and anonymous
functions.

The two use cases I'll target are callbacks and decorating descriptors,
e.g. (from the Twisted tutorial):

class FingerProtocol(basic.LineReceiver):
    def lineReceived(self, user):
        self.factory.getUser(user
        ).addErrback(lambda _: "Internal error in server"
        ).addCallback(lambda m:
                      (self.transport.write(m+"\r\n"),
                       self.transport.loseConnection()))

currently this can be modified to (ab)use a decorator:
    ...
    def lineReceived(self, user):
        @self.factory.getUser(user).addErrback
        def temp(_):
            return "Internal error in server"
    ...

This is an abuse of decorators because it doesn't actually decorate the
function temp; it registers it as a callback for the class FingerProtocol.

And here is a use of decorators which I mimic for my design by contract
library.  (Guido recently posted a patch on python-dev that would allow
property to be used in the following way):

class C(object):
    x = property(doc="x coordinate")

    @x.setter
    def x(self, x):
        self.__x = x

    @x.getter
    def x(self):
        return self.__x


There was some concern voiced on the list about repeating the variable
name both in the decorator and the function definition.

In other languages, these use cases are handled by anonymous functions
or blocks. Python also has an anonymous function, lambda, but it's
limited to a single expression. There are parsing and readability
issues with making lambda multi-line.

My proposal would be to use a special syntax, perhaps a keyword, to
allow decorators to take an anonymous function.

    @self.factory.getUser(user).addCallback
    def temp(_):
        return "Internal error in server"

    would become:

    @self.factory.getUser(user).addCallback
    do (_):
        return "Internal error in server"

    and:

    @x.setter
    def x(self, x):
        self.__x = x

    would become:
    @x.setter
    do (self, x):
        self.__x = x

In general:

@DECORATOR
do (..args...):
    ..function

would be the same as:

def temp(..args..):
    ..function
DECORATOR(temp)
del temp
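That three-step desugaring can be checked today with a named throwaway
function; the register decorator below is a made-up example, not part of
the proposal:

```python
registry = []

def register(func):
    # A registration-style decorator: it stores func and returns nothing.
    registry.append(func)

# The proposed 'do' block would be equivalent to this dance:
def temp(x):
    return x * 2
register(temp)
del temp

# Only the registry holds the function now; the name 'temp' is gone.
print(registry[0](21))  # -> 42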


The decorators should be stackable, as well so:

@DEC2
@DEC1
do (..args..):
    ..function

would be the same as:

def temp(..args..):
    ..function
DEC2(DEC1(temp))
del temp
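The stacked form composes in the usual bottom-up order; a sketch with two
hypothetical tracing decorators:

```python
log = []

def DEC1(func):
    # Innermost decorator runs first.
    log.append("DEC1 saw %s" % func.__name__)
    return func

def DEC2(func):
    log.append("DEC2 saw %s" % func.__name__)
    return func

def temp():
    return "body"
result = DEC2(DEC1(temp))
del temp

print(log)       # -> ['DEC1 saw temp', 'DEC2 saw temp']
print(result())  # -> body
```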

Other uses/abuses

Perhaps this new 'do' block could be used to implement things like the
following:

@repeatUntilSuccess(times=10)
do ():
    print "Attempting to connect."
    socket.connect()

with repeatUntilSuccess implemented as:

def repeatUntilSuccess(times=None):
    if times is None:
        def repeater(func):
            while True:
                try:
                    func()
                    break
                except Exception:
                    continue
    else:
        def repeater(func):
            for i in range(times):
                try:
                    func()
                    break
                except Exception:
                    continue
    return repeater
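A quick check of the finite-retry branch, using a named function in place of
the proposed 'do' block and a simulated flaky connect (this condenses the two
branches above into one loop):

```python
attempts = []

def repeatUntilSuccess(times=None):
    # Condensed variant of the repeatUntilSuccess above: the returned
    # "decorator" runs func immediately, retrying on any exception.
    def repeater(func):
        count = 0
        while times is None or count < times:
            try:
                func()
                break
            except Exception:
                count += 1
    return repeater

@repeatUntilSuccess(times=10)
def connect():
    attempts.append(1)
    if len(attempts) < 3:
        raise IOError("connection refused")

print(len(attempts))  # -> 3: two failures, then success
print(connect)        # -> None: repeater returned nothing
```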


@fork
do ():
    time.sleep(10)
    print "This is done in another thread."
    thing.dostuff()


with fork implemented as:

def fork(func):
    t = threading.Thread(target=func)
    t.start()
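As written, fork discards the Thread, so the caller can't join it; a variant
that returns the thread (an adjustment of mine, not part of the original)
makes the behavior observable:

```python
import threading

def fork(func):
    # Variant of the fork() above that returns the Thread so callers can join.
    t = threading.Thread(target=func)
    t.start()
    return t

results = []

@fork
def worker():
    results.append("done in another thread")

worker.join()   # 'worker' is now the Thread object, not a function
print(results)  # -> ['done in another thread']
```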

Open Issues:
~~~~~~~~~~~~
Doing a Google search shows that the word 'do' is occasionally used as a
function or method name; the same holds for the synonyms act and perform.
Reusing def would probably not be a good idea, because it could easily be
confused with a normal decorator and function.

This is still somewhat of an abuse of the decorator syntax: it isn't
adding annotations to the anonymous function, so it isn't really
'decorating' it. I do think it's better than using a function that
returns None as a decorator.

While scanning source code, a 'do' block may be easily confused with a
normal def. I'm not really sure what would work better; suggestions are
welcome. Perhaps I'm going down the wrong path by reusing decorator
syntax.

Other possible keywords instead of 'do': anonymous, function, act, perform.

I first thought of reusing the keyword lambda, but I think the
semantics of lambda are too different to be used in this case.
Similarly, the keyword with has different semantics, as well.

Thanks,

-- 
=====
--Ryan E. Freckleton


From adam at atlas.st  Wed Dec 26 01:03:05 2007
From: adam at atlas.st (Adam Atlas)
Date: Tue, 25 Dec 2007 19:03:05 -0500
Subject: [Python-ideas] Anonymous functions for decorators
In-Reply-To: <318072440712251533u4eadbe79w7af11582f82f3313@mail.gmail.com>
References: <318072440712251533u4eadbe79w7af11582f82f3313@mail.gmail.com>
Message-ID: <6F7CB91A-F6F8-4AE2-A76D-1E5237CBFA2F@atlas.st>

I don't think def temp(...) is so terrible. Perhaps it's slightly  
inelegant, but adding a new keyword just so decorator syntax can be  
used in a decidedly non-decoratorish way to save a small amount of  
typing doesn't seem like something that would be accepted. If we were  
adding anonymous functions, I should think we'd make them more  
powerful while we're at it; it would at least be good to have them be  
first-class expressions, so you could do:

DECORATOR(def (...args...): ...function...) #or whatever the syntax  
for anonymous functions would end up being

and all the other useful things that come with anonymous functions and  
closures.

I think I suggested a while ago that we allow that syntax -- def  
(args): ...statements... -- to be used as an expression, and also  
allow it to span multiple lines like a normal Python function  
definition, but I was told that this would complicate parsing of  
indentation. I'm not sure why that is, but I'm not familiar with the  
internals of Python's parser.

In any case, you can currently do
@DECORATOR
def temp(...args...):
     ...function...

(as you are already doing in your Twisted example) and all you'll end  
up with is a name called "temp" bound to None. Not too bad. I really  
don't think inhibiting the assignment of some Nones is worth a new  
keyword and such.
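The "bound to None" outcome is easy to demonstrate with a registration
decorator that returns nothing (register here is a made-up example):

```python
callbacks = []

def register(func):
    # Returns None, so the decorated name ends up bound to None.
    callbacks.append(func)

@register
def temp():
    return "hello"

print(temp)            # -> None
print(callbacks[0]())  # -> hello
```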


On 25 Dec 2007, at 18:33, Ryan Freckleton wrote:
> In general:
>
> @DECORATOR
> do (..args...):
>    ..function
>
> would be the same as:
>
> def temp(..args..):
>    ..function
> DECORATOR(temp)
> del temp