From gnewsg at gmail.com  Tue Dec  4 19:46:15 2007
From: gnewsg at gmail.com (Giampaolo Rodola')
Date: Tue, 4 Dec 2007 10:46:15 -0800 (PST)
Subject: [Python-ideas] Symlink chain resolver module
Message-ID: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>

Hi there,
I thought it would be good to share this code I used in a
project of mine.
I thought it could eventually look good incorporated into the Python
stdlib, in the os or os.path modules.
Otherwise I'm merely offering it as a community service to anyone who
might be interested.


-- Giampaolo


#!/usr/bin/env python
# linkchainresolver.py

import os, errno

def resolvelinkchain(path):
    """Resolve a chain of symbolic links by returning the absolute
path
    name of the final target.

    Raise os.error exception in case of circular link.
    Do not raise exception if the symlink is broken (pointing to a
    non-existent path).
    Return a normalized absolutized version of the pathname if it is
    not a symbolic link.

    Examples:

    >>> resolvelinkchain('validlink')
    '/abstarget'
    >>> resolvelinkchain('brokenlink')  # resolved anyway
    '/abstarget'
    >>> resolvelinkchain('circularlink')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "module.py", line 19, in resolvelinkchain
        os.stat(path)
    OSError: [Errno 40] Too many levels of symbolic links: '3'
    >>>
    """
    try:
        os.stat(path)
    except os.error, err:
        # do not raise an exception in case of a broken symlink;
        # we want to know the final target anyway
        if err.errno != errno.ENOENT:
            raise
    p = os.path.abspath(path)
    while os.path.islink(p):
        target = os.readlink(p)
        if not os.path.isabs(target):
            # a relative target is resolved against the link's directory
            target = os.path.join(os.path.dirname(p), target)
        p = target
    return os.path.normpath(p)


From guido at python.org  Tue Dec  4 20:06:03 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Dec 2007 11:06:03 -0800
Subject: [Python-ideas] Symlink chain resolver module
In-Reply-To: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>
References: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>
Message-ID: <ca471dc20712041106s3b609ca8r6778358e53603f7d@mail.gmail.com>

Isn't this pretty close to os.path.realpath()?

On Dec 4, 2007 10:46 AM, Giampaolo Rodola' <gnewsg at gmail.com> wrote:
> Hi there,
> I thought it would have been good sharing this code I used in a
> project of mine.
> I thought it could eventually look good incorporated in Python stdlib
> through os or os.path modules.
> Otherwise I'm merely offering it as a community service to anyone who
> might be interested.
>
>
> -- Giampaolo
>
>
> #!/usr/bin/env python
> # linkchainresolver.py
>
> import os, sys, errno
>
> def resolvelinkchain(path):
>     """Resolve a chain of symbolic links by returning the absolute
> path
>     name of the final target.
>
>     Raise os.error exception in case of circular link.
>     Do not raise exception if the symlink is broken (pointing to a
>     non-existent path).
>     Return a normalized absolutized version of the pathname if it is
>     not a symbolic link.
>
>     Examples:
>
>     >>> resolvelinkchain('validlink')
>     /abstarget
>     >>> resolvelinkchain('brokenlink') # resolved anyway
>     /abstarget
>     >>> resolvelinkchain('circularlink')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>       File "module.py", line 19, in resolvelinkchain
>         os.stat(path)
>     OSError: [Errno 40] Too many levels of symbolic links: '3'
>     >>>
>     """
>     try:
>         os.stat(path)
>     except os.error, err:
>         # do not raise exception in case of broken symlink;
>         # we want to know the final target anyway
>         if err.errno == errno.ENOENT:
>             pass
>         else:
>             raise
>     if not os.path.isabs(path):
>         basedir = os.path.dirname(os.path.abspath(path))
>     else:
>         basedir = os.path.dirname(path)
>     p = path
>     while os.path.islink(p):
>         p = os.readlink(p)
>         if not os.path.isabs(p):
>             p = os.path.join(basedir, p)
>             basedir = os.path.dirname(p)
>     return os.path.join(basedir, p)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From gnewsg at gmail.com  Tue Dec  4 20:25:32 2007
From: gnewsg at gmail.com (Giampaolo Rodola')
Date: Tue, 4 Dec 2007 11:25:32 -0800 (PST)
Subject: [Python-ideas] Symlink chain resolver module
In-Reply-To: <ca471dc20712041106s3b609ca8r6778358e53603f7d@mail.gmail.com>
References: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>
	<ca471dc20712041106s3b609ca8r6778358e53603f7d@mail.gmail.com>
Message-ID: <77980755-1253-4fd5-aef2-ce59ac24ff22@w56g2000hsf.googlegroups.com>

On 4 Dic, 20:06, "Guido van Rossum" <gu... at python.org> wrote:
> Isn't this pretty close to os.path.realpath()?
>
> On Dec 4, 2007 10:46 AM, Giampaolo Rodola' <gne... at gmail.com> wrote:
>
>
>
>
>
> > Hi there,
> > I thought it would have been good sharing this code I used in a
> > project of mine.
> > I thought it could eventually look good incorporated in Python stdlib
> > through os or os.path modules.
> > Otherwise I'm merely offering it as a community service to anyone who
> > might be interested.
>
> > -- Giampaolo
>
> > #!/usr/bin/env python
> > # linkchainresolver.py
>
> > import os, sys, errno
>
> > def resolvelinkchain(path):
> >     """Resolve a chain of symbolic links by returning the absolute
> > path
> >     name of the final target.
>
> >     Raise os.error exception in case of circular link.
> >     Do not raise exception if the symlink is broken (pointing to a
> >     non-existent path).
> >     Return a normalized absolutized version of the pathname if it is
> >     not a symbolic link.
>
> >     Examples:
>
> >     >>> resolvelinkchain('validlink')
> >     /abstarget
> >     >>> resolvelinkchain('brokenlink') # resolved anyway
> >     /abstarget
> >     >>> resolvelinkchain('circularlink')
> >     Traceback (most recent call last):
> >       File "<stdin>", line 1, in <module>
> >       File "module.py", line 19, in resolvelinkchain
> >         os.stat(path)
> >     OSError: [Errno 40] Too many levels of symbolic links: '3'
>
> >     """
> >     try:
> >         os.stat(path)
> >     except os.error, err:
> >         # do not raise exception in case of broken symlink;
> >         # we want to know the final target anyway
> >         if err.errno == errno.ENOENT:
> >             pass
> >         else:
> >             raise
> >     if not os.path.isabs(path):
> >         basedir = os.path.dirname(os.path.abspath(path))
> >     else:
> >         basedir = os.path.dirname(path)
> >     p = path
> >     while os.path.islink(p):
> >         p = os.readlink(p)
> >         if not os.path.isabs(p):
> >             p = os.path.join(basedir, p)
> >             basedir = os.path.dirname(p)
> >     return os.path.join(basedir, p)
> > _______________________________________________
> > Python-ideas mailing list
> > Python-id... at python.org
> >http://mail.python.org/mailman/listinfo/python-ideas
>
> --
> --Guido van Rossum (home page:http://www.python.org/~guido/)
> _______________________________________________
> Python-ideas mailing list
> Python-id... at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

Are you trying to tell me that I've wasted the whole evening when such
a thing was already available in os.path? :-)
Wait a moment. I'm going to check what os.path.realpath() does.
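
For anyone who wants to run the same check, a quick sketch (the
temp-directory and link names here are illustrative; it needs a POSIX
filesystem that supports symlinks) shows realpath() following a whole
chain:

```python
import os
import tempfile

# build a two-step chain: link2 -> link1 -> target
d = tempfile.mkdtemp()
target = os.path.join(d, "target")
open(target, "w").close()
link1 = os.path.join(d, "link1")
link2 = os.path.join(d, "link2")
os.symlink(target, link1)        # link1 -> target
os.symlink(link1, link2)         # link2 -> link1 -> target

# realpath() resolves the whole chain down to the final target
print(os.path.realpath(link2))
```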


From gnewsg at gmail.com  Tue Dec  4 20:53:46 2007
From: gnewsg at gmail.com (Giampaolo Rodola')
Date: Tue, 4 Dec 2007 11:53:46 -0800 (PST)
Subject: [Python-ideas] Symlink chain resolver module
In-Reply-To: <77980755-1253-4fd5-aef2-ce59ac24ff22@w56g2000hsf.googlegroups.com>
References: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>
	<ca471dc20712041106s3b609ca8r6778358e53603f7d@mail.gmail.com> 
	<77980755-1253-4fd5-aef2-ce59ac24ff22@w56g2000hsf.googlegroups.com>
Message-ID: <809f3810-b059-4be7-89b7-329e775c4813@a35g2000prf.googlegroups.com>

Ok, it's been a wasted evening. :-)
Sorry.

(I'm feeling so sad...)


From aahz at pythoncraft.com  Tue Dec  4 21:00:47 2007
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 4 Dec 2007 12:00:47 -0800
Subject: [Python-ideas] Symlink chain resolver module
In-Reply-To: <809f3810-b059-4be7-89b7-329e775c4813@a35g2000prf.googlegroups.com>
References: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>
	<ca471dc20712041106s3b609ca8r6778358e53603f7d@mail.gmail.com>
	<77980755-1253-4fd5-aef2-ce59ac24ff22@w56g2000hsf.googlegroups.com>
	<809f3810-b059-4be7-89b7-329e775c4813@a35g2000prf.googlegroups.com>
Message-ID: <20071204200047.GA12925@panix.com>

On Tue, Dec 04, 2007, Giampaolo Rodola' wrote:
>
> Ok, it's been a wasted evening. :-)
> Sorry.
> 
> (I'm feeling so sad...)

Another victim of Guido's Time Machine.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Typing is cheap.  Thinking is expensive."  --Roy Smith


From guido at python.org  Tue Dec  4 22:24:12 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Dec 2007 13:24:12 -0800
Subject: [Python-ideas] Symlink chain resolver module
In-Reply-To: <809f3810-b059-4be7-89b7-329e775c4813@a35g2000prf.googlegroups.com>
References: <0b29f85f-cf67-4043-9da3-2406d56615c1@l16g2000hsf.googlegroups.com>
	<ca471dc20712041106s3b609ca8r6778358e53603f7d@mail.gmail.com>
	<77980755-1253-4fd5-aef2-ce59ac24ff22@w56g2000hsf.googlegroups.com>
	<809f3810-b059-4be7-89b7-329e775c4813@a35g2000prf.googlegroups.com>
Message-ID: <ca471dc20712041324t530ab4c4o44b1304c6ccb4dd0@mail.gmail.com>

On Dec 4, 2007 11:53 AM, Giampaolo Rodola' <gnewsg at gmail.com> wrote:
> Ok, it's been a wasted evening. :-)
> Sorry.
>
> (I'm feeling so sad...)

Don't fret. It's not been wasted. You learned a couple of things: you
learned how to resolve symlinks recursively and safely, you probably
improved your Python debugging skills, *and* you learned to do
research before rolling up your sleeves. That's three valuable
lessons!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From arno at marooned.org.uk  Wed Dec 12 21:10:13 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Wed, 12 Dec 2007 20:10:13 +0000
Subject: [Python-ideas] free variables in generator expressions
Message-ID: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>

In the generator expression

    (x+1 for x in L)

the name 'L' is not local to the expression (as opposed to 'x'), I
will call such a name a 'free name', as I am not aware of an existing
terminology.

The following is about how to treat 'free names' inside generator
expressions.  I argue that these names should be bound to their values
when the generator expression is created, and the rest of this email
tries to provide arguments why this may be desirable.

I am fully aware that I tend to think about things in a very skewed
manner though, so I would be grateful for any rebuttal.

Recently I tried to implement a 'sieve' algorithm using generator
expressions instead of lists.  It wasn't the sieve of Eratosthenes but
I will use that as a simpler example (all of the code shown below is
python 2.5).  Here is one way to implement the algorithm using list
comprehensions (tuples could also be used as the mutability of lists
is not used):

def list_sieve(n):
    "Generate all primes less than n"
    P =  range(2, n)
    while True:
        # Yield the first element of P and sieve out its multiples
        for p in P:
            yield p
            P = [x for x in P if x % p]
            break
        else:
            # If P was empty then we're done
            return

 >>> list(list_sieve(20))
[2, 3, 5, 7, 11, 13, 17, 19]

So that's ok.  Then it occurred to me that I didn't need to keep
creating all these lists; so I decided, without further thinking, to
switch to generator expressions, as they are supposed to abstract the
notion of iterable object (and in the function above I'm only using
the 'iterable' properties of lists - any other iterable should do).
So I came up with this:

def incorrect_gen_sieve(n):
    "Unsuccessfully attempt to generate all primes less than n"
    # Change range to xrange
    P =  xrange(2, n)
    while True:
        for p in P:
            yield p
            # Change list comprehension to generator expression
            P = (x for x in P if x % p)
            break
        else:
            return

 >>> list(incorrect_gen_sieve(20))
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

Ouch.  Looking back at the code I realised that this was due to the
fact that the names 'p' and 'P' in the generator expressions are not
local to it, so subsequent binding of these names will affect the
expression.  I can't really analyse further than this; my head starts
spinning if I try :).  So I wrapped the generator expression in a
lambda function to make sure that the names p and P inside it were
bound to what I intended them to be:

def gen_sieve(n):
    "Generate all primes less than n, but without tables!"
    P =  xrange(2, n)
    while True:
        for p in P:
            yield p
            # Make sure that p and P are what they should be
            P = (lambda P, p: (x for x in P if x % p))(P, p)
            break
        else:
            return

 >>> list(gen_sieve(20))
[2, 3, 5, 7, 11, 13, 17, 19]

That's better.  I like the idea of a sieve of Eratosthenes without any
tables, and it's achievable in Python quite easily.  The only problem
is the one that I mentioned above, which boils down to:

    In a generator expression, free names can be bound to a new object
    between the time when they are defined and when they are used,
    thus changing the value of the expression.

But I think that the behaviour of generator expressions would be more
controllable and closer to that of 'real sequences' if the free names
they contain were implicitly frozen when the generator expression is
created.

So I am proposing that for example:

   (f(x) for x in L)

be equivalent to:

   (lambda f, L: (f(x) for x in L))(f, L)

In most cases this would make generator expressions behave more like
list comprehensions.  You would be able to read the generator
expression and think "that's what it does" more reliably.  Of course
if a free name is bound to a mutable object, then there is always the
chance that this object will be mutated between the creation of the
generator expression and its use.
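
The late binding that motivates this fits in a few lines (a sketch in
modern Python syntax; the behaviour matches the 2.5 examples above):

```python
# A free name in a generator expression is looked up when the
# generator runs, not when it is created:
f = lambda x: x + 1
gen = (f(x) for x in [1, 2, 3])
f = lambda x: x * 10          # rebind f before the generator runs
late = list(gen)              # the new f is used: [10, 20, 30]

# wrapping in a lambda freezes f, as proposed above:
f = lambda x: x + 1
gen = (lambda f: (f(x) for x in [1, 2, 3]))(f)
f = lambda x: x * 10
frozen = list(gen)            # the original f is used: [2, 3, 4]

print(late, frozen)
```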

Lastly, instead of using a generator expression I could have
written:

from itertools import ifilter
from functools import partial

def tools_sieve(n):
    "Generate all primes less than n"
    P =  xrange(2, n)
    while True:
        for p in P:
            yield p
            P = ifilter(partial(int.__rmod__, p), P)
            break
        else:
            return

 >>> list(tools_sieve(20))
[2, 3, 5, 7, 11, 13, 17, 19]

It obviously works as P and p are 'frozen' when ifilter and partial
are called.  If

    (f(x) for x in L if g(x))

is to be the moral equivalent of

    imap(f, ifilter(g, L))

Then my proposal makes even more sense.

-- 
Arnaud




From brett at python.org  Wed Dec 12 21:56:25 2007
From: brett at python.org (Brett Cannon)
Date: Wed, 12 Dec 2007 12:56:25 -0800
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
Message-ID: <bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>

On Dec 12, 2007 12:10 PM, Arnaud Delobelle <arno at marooned.org.uk> wrote:
> In the generator expression
>
>     (x+1 for x in L)
>
> the name 'L' is not local to the expression (as opposed to 'x'), I
> will call such a name a 'free name', as I am not aware of an existing
> terminology.
>
> The following is about how to treat 'free names' inside generator
> expressions.  I argue that these names should be bound to their values
> when the generator expression is created, and the rest of this email
> tries to provide arguments why this may be desirable.
>

Calling it a free variable is the right term (at least according to
Python and the functional programmers of the world).

As for what you are asking for, I do believe it came up during the
discussion of when genexps were added to the language.  I honestly
don't remember the reasoning as to why we didn't do it this way, but I
am willing to guess it has something to do with simplicity and purity
of what genexps are meant to be.

Consider what your genexp, ``(x for x in P if x % p)``, really is::

  def _genexp():
      for x in P:
        if x % p:
          yield x

But what you are after is::

  def _genexp():
    _P = P
    _p = p
    for x in _P:
      if x % _p:
        yield x

The former maps to what you see in the genexp much more literally than
the latter.  And if you want the latter, just define a function like
the above but have it take P and p as arguments and then you get your
generator just the way you want it.

Genexps (as with listcomps) are just really simple syntactic sugar.
And Python tends to skew from hiding details like capturing variables
implicitly in some piece of syntactic sugar.  In my opinion it is
better to be more explicit with what you want the generator to do and
just write out a generator function.
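
Concretely, that suggestion looks like this (a sketch; the function
name is illustrative):

```python
# An explicit generator function takes P and p as arguments, so they
# are bound at call time and later rebinding cannot affect it:
def filtered(P, p):
    for x in P:
        if x % p:
            yield x

P = range(2, 10)
p = 2
gen = filtered(P, p)
p = 7                    # no effect: p was captured as an argument
print(list(gen))         # [3, 5, 7, 9]
```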

-Brett


From g.brandl at gmx.net  Wed Dec 12 22:56:35 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 12 Dec 2007 22:56:35 +0100
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
	<bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>
Message-ID: <fjplat$6r5$1@ger.gmane.org>

Brett Cannon schrieb:
> On Dec 12, 2007 12:10 PM, Arnaud Delobelle <arno at marooned.org.uk> wrote:
>> In the generator expression
>>
>>     (x+1 for x in L)
>>
>> the name 'L' is not local to the expression (as opposed to 'x'), I
>> will call such a name a 'free name', as I am not aware of an existing
>> terminology.
>>
>> The following is about how to treat 'free names' inside generator
>> expressions.  I argue that these names should be bound to their values
>> when the generator expression is created, and the rest of this email
>> tries to provide arguments why this may be desirable.
>>
> 
> Calling it a free variable is the right term (at least according to
> Python and the functional programmers of the world).
> 
> As for what you are asking for, I do believe it came up during the
> discussion of when genexps were added to the language.  I honestly
> don't remember the reasoning as to why we didn't do it this way, but I
> am willing to guess it has something to do with simplicity and purity
> of what genexps are meant to be.
> 
> Consider what your genexp, ``(x for x in P if x % p)``, really is::
> 
>   def _genexp():
>       for x in P:
>         if x % p:
>           yield x

Actually it is

def _genexp(P):
    for x in P:
        if x % p:
            yield x

IOW, the outermost iterator is not a free variable, but is passed to
the invisible function object.

Georg



From arno at marooned.org.uk  Thu Dec 13 00:01:42 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Wed, 12 Dec 2007 23:01:42 +0000
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <fjplat$6r5$1@ger.gmane.org>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
	<bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>
	<fjplat$6r5$1@ger.gmane.org>
Message-ID: <72EFBFDC-DD0B-40AB-B042-CE35E220762C@marooned.org.uk>


On 12 Dec 2007, at 21:56, Georg Brandl wrote:

> Brett Cannon schrieb:
[...]
>
>>
>> Consider what your genexp, ``(x for x in P if x % p)``, really is::
>>
>>  def _genexp():
>>      for x in P:
>>        if x % p:
>>          yield x
>
> Actually it is
>
> def _genexp(P):
>    for x in P:
>        if x % p:
>            yield x
>
> IOW, the outmost iterator is not a free variable, but passed to the
> invisible function object.


I see. 'P' gets frozen but not 'p', so I should be able to write:

def gen2_sieve(n):
     "Generate all primes less than n"
     P =  xrange(2, n)
     while True:
         for p in P:
             yield p
             P = (lambda p: (x for x in P if x % p))(p)
             break
         else:
             return

 >>> list(gen2_sieve(20))
[2, 3, 5, 7, 11, 13, 17, 19]

It seems to work.  Ok then in the same vein, I imagine that

     (x + y for x in A for y in B)

becomes:

def _genexp(A):
     for x in A:
         for y in B:
             yield x + y

Let's test this (python 2.5):

 >>> A = '12'
 >>> B = 'ab'
 >>> gen = (x + y for x in A for y in B)
 >>> A = '34'
 >>> B = 'cd'
 >>> list(gen)
['1c', '1d', '2c', '2d']

So in the generator expression, A remains bound to the string '12'
but B gets rebound to 'cd'.  This may make the implementation of
generator expressions more straightforward, but from the point of view
of a user of the language it seems rather arbitrary. What makes A so
special as opposed to B?  Ok it belongs to the outermost loop, but
conceptually in the example above there is no outermost loop.

At the moment I still think it makes more sense for a generator
expression to produce, as far as possible, the same sequence as the
corresponding list comprehension would have, i.e.:
l = [f(x, y) for x in A for y in B(x) if g(x, y)]
gen = (f(x, y) for x in A for y in B(x) if g(x, y))
<code, maybe binding A, B, f, g to new objects>
assert list(gen) == l

to work as much as possible.

Perhaps I should go and see how generator expressions are generated in
the python source code.

-- 
Arnaud




From g.brandl at gmx.net  Thu Dec 13 00:41:22 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 13 Dec 2007 00:41:22 +0100
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <72EFBFDC-DD0B-40AB-B042-CE35E220762C@marooned.org.uk>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>	<bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>	<fjplat$6r5$1@ger.gmane.org>
	<72EFBFDC-DD0B-40AB-B042-CE35E220762C@marooned.org.uk>
Message-ID: <fjprfb$sjr$1@ger.gmane.org>

Arnaud Delobelle schrieb:

> Let's test this (python 2.5):
> 
>  >>> A = '12'
>  >>> B = 'ab'
>  >>> gen = (x + y for x in A for y in B)
>  >>> A = '34'
>  >>> B = 'cd'
>  >>> list(gen)
> ['1c', '1d', '2c', '2d']
> 
> So in the generator expression, A is remains bound to the string '12'
> but B gets rebound to 'cd'.  This may make the implementation of
> generator expressions more straighforward, but from the point of view
> of a user of the language it seems rather arbitrary. What makes A so
> special as opposed to B?  Ok it belongs to the outermost loop, but
> conceptually in the example above there is no outermost loop.

Well, B might depend on A so it can't be evaluated in the outer context
at the time the genexp "function" is called. It has to be evaluated
inside the "function".

Georg



From greg.ewing at canterbury.ac.nz  Thu Dec 13 00:36:19 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Dec 2007 12:36:19 +1300
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
Message-ID: <47607073.1030307@canterbury.ac.nz>

Arnaud Delobelle wrote:

> The following is about how to treat 'free names' inside generator
> expressions.  I argue that these names should be bound to their values
> when the generator expression is created,

My opinion is that this is the wrong place to attack the
problem. Note that it's not only generator expressions that
have this issue -- the same thing can trip you up with
lambdas as well.

It's instructive to consider why other languages with
lexical scoping and first-class functions don't seem to
have this problem to the same extent. In Scheme, for
example, the reason is that its looping constructs
usually create a new binding for the loop variable
on each iteration, instead of re-using the same one.
So if you return a lambda from inside the loop, each
one lives in a different lexical environment and
sees a different value for the loop variable.

If Python's for-loop did the same thing, I suspect that
this problem would turn up much less frequently.

In CPython, there's a straightforward way to implement
this: if the variable is used in an inner function, and
is therefore in a cell, create a new cell each time
round the loop instead of replacing the contents of the
existing one. (If the variable isn't in a cell, there's
no need to change anything.)

Note that this wouldn't interfere with using the loop
variable after the loop has finished -- the variable
is still visible to the whole function, and the following
code just sees whatever is in the last created cell.
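
The lambda version of the pitfall, and the usual default-argument
workaround, for comparison (a sketch in modern Python syntax):

```python
# All three lambdas close over the same binding of i, so they all see
# its final value:
funcs = [lambda: i for i in range(3)]
shared = [f() for f in funcs]        # [2, 2, 2]: one shared cell

# binding i as a default argument captures its value per iteration:
funcs = [lambda i=i: i for i in range(3)]
captured = [f() for f in funcs]      # [0, 1, 2]: one value per lambda

print(shared, captured)
```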

--
Greg


From arno at marooned.org.uk  Thu Dec 13 08:08:53 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Thu, 13 Dec 2007 07:08:53 +0000
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <fjprfb$sjr$1@ger.gmane.org>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>	<bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>	<fjplat$6r5$1@ger.gmane.org>
	<72EFBFDC-DD0B-40AB-B042-CE35E220762C@marooned.org.uk>
	<fjprfb$sjr$1@ger.gmane.org>
Message-ID: <24DF1DCA-A737-4E6E-8A73-7F1EFA605402@marooned.org.uk>


On 12 Dec 2007, at 23:41, Georg Brandl wrote:

> Arnaud Delobelle schrieb:
>
>> Let's test this (python 2.5):
>>
>>>>> A = '12'
>>>>> B = 'ab'
>>>>> gen = (x + y for x in A for y in B)
>>>>> A = '34'
>>>>> B = 'cd'
>>>>> list(gen)
>> ['1c', '1d', '2c', '2d']
>>
>> So in the generator expression, A is remains bound to the string '12'
>> but B gets rebound to 'cd'.  This may make the implementation of
>> generator expressions more straighforward, but from the point of view
>> of a user of the language it seems rather arbitrary. What makes A so
>> special as opposed to B?  Ok it belongs to the outermost loop, but
>> conceptually in the example above there is no outermost loop.
>
> Well, B might depend on A so it can't be evaluated in the outer  
> context
> at the time the genexp "function" is called. It has to be evaluated
> inside the "function".

You're right.  I expressed myself badly: I was not talking about
evaluation but about binding.  I was saying that since the name A
stays bound to the object it was bound to when the generator
expression was created, the same should happen with B.

-- 
Arnaud




From ntoronto at cs.byu.edu  Fri Dec 14 07:40:15 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Thu, 13 Dec 2007 23:40:15 -0700
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
	(probably)
Message-ID: <4762254F.8030607@cs.byu.edu>

I apologize if this is well-known, but it's new to me. It may be 
something to keep in mind when we occasionally dabble in generic 
functions or multimethods. Hopefully there's a way around it.

While kicking around ideas for multiple dispatch, I came across one of 
Perl's implementations, and it reminded me of something Larry Wall said 
recently in his State of the Onion address about how types get together 
and democratically select a function to dispatch to. I immediately 
thought of Arrow's Theorem.

Arrow's Theorem says that you can't (final, period, that's it) make a 
"social choice function" that's fair. A social choice function takes 
preferences as inputs and outputs a single preference ordering. (You 
could think of types as casting preference votes for functions, for 
example, by stating their "distance" from the exact type the function 
expects. It doesn't matter exactly how they vote, just that they do it.) 
"Fair" means the choice function meets these axioms:

1. The output preference has to be a total order. This is important for 
choosing a president or multiple dispatch, since without it you can't 
choose a "winner".

2. There has to be more than one agent (type, here) and more than two 
choices. (IOW, with two choices or one agent it's possible to be fair.)

3. If a new choice appears, it can't affect the final pairwise ordering 
of two already-existing choices. (Called "independence of irrelevant 
alternatives". It's one place most voting systems fail. Bush vs. Clinton 
vs. Perot is a classic example.)

4. An agent promoting a choice can't demote it in the final pairwise 
ordering. (Called "positive association". It's another place most voting 
systems fail.)

5. There has to be some way to get any particular pairwise ordering. 
(Called "citizen's sovereignty". Notice this doesn't say anything about 
a complete ordering.)

6. No single agent (or type) can control a final pairwise ordering 
independent of every other agent. (Called "non-dictatorship".)


Notice what it doesn't say, since some of these are very loose 
requirements. In particular, it says nothing about relative strengths of 
votes.

What does this mean for multiple dispatch?

It seems to explain why so many different kinds of dispatch algorithms 
have been invented: they can't be made "fair" by those axioms, which all 
seem to describe desirable dispatch behaviors. No matter how it's done, 
either the outcome is intransitive (violates #1), adding a seemingly 
unrelated function flips the top two in some cases (violates #3), using 
a more specific type can cause a strange demotion (violates #4), or 
there's no way to get some specific function as winner (violates #5). 
Single dispatch gets around this problem by violating #6 - that is, the 
first type is the dictator.

Am I missing something here? I want to be, since multiple dispatch 
sounds really nice.

Neil



From arno at marooned.org.uk  Fri Dec 14 09:20:20 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Fri, 14 Dec 2007 08:20:20 -0000 (GMT)
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
 (probably)
Message-ID: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>

(Sorry I can't set In-Reply-To; I've lost the original message)

Neil Toronto ntoronto at cs.byu.edu:

> While kicking around ideas for multiple dispatch, I came across one of
> Perl's implementations, and it reminded me of something Larry Wall said
> recently in his State of the Onion address about how types get together
> and democratically select a function to dispatch to. I immediately
> thought of Arrow's Theorem.

Maybe I'm wrong, but I don't think this theorem applies in the case of
multiple dispatch.  The type of the argument sequence as a whole chooses
the best specialisation; it's not the case that each argument expresses a
preference.  The set of choices is the set S of signatures of
specialisations of the function (which is partially ordered), and the
'committee' is made of one entity only: the function call's signature s. 
To choose the relevant specialisation, look at:

{ t \in S | s <= t and \forall u \in S (s < u <= t) \implies  u = t }

If this set is a singleton { t } then the specialization with signature t
is the best fit for s, otherwise there is no best fit.
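
In Python terms, a sketch of that selection rule (the names `le` and
`best_fit` are illustrative; s <= t is read pointwise via issubclass,
"s is at least as specific as t"):

```python
def le(s, t):
    # s <= t: each argument type in s is a subclass of the one in t
    return len(s) == len(t) and all(issubclass(a, b) for a, b in zip(s, t))

def best_fit(s, signatures):
    # candidates t with s <= t; keep only the minimal ones above s
    candidates = [t for t in signatures if le(s, t)]
    minimal = [t for t in candidates
               if not any(le(u, t) and u != t for u in candidates)]
    # dispatch succeeds only if exactly one minimal signature remains
    return minimal[0] if len(minimal) == 1 else None

sigs = [(object, object), (int, object), (object, int)]
print(best_fit((int, str), sigs))   # unique best fit: (int, object)
print(best_fit((int, int), sigs))   # None: two incomparable minimal fits
```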

-- 
Arnaud




From aaron.watters at gmail.com  Fri Dec 14 16:31:27 2007
From: aaron.watters at gmail.com (Aaron Watters)
Date: Fri, 14 Dec 2007 10:31:27 -0500
Subject: [Python-ideas] democratic multiple dispatch and type generality
	partial orders
Message-ID: <fc13a6500712140731i6612d9b2xd8bbc8ef5fdbb511@mail.gmail.com>

From: "Arnaud Delobelle" <arno at marooned.org.uk>

>
> { t \in S | s <= t and \forall u \in S (s < u <= t) \implies  u = t }


Sorry if I'm late to this discussion, but the hard part here
is defining the partial order <=, it seems to me.
I've tried to think about this at times and I keep getting
entangled in some highly recursive graph matching scheme.
I'm sure the Lisp community has done something with this,
anyone have a reference or something?

For example:

def myFunction(x, y, L):
      z = x+y
      L.append(z)
      return z

What is (or should be) the type of myFunction, x, y, and L?
Well, let's see. type(x) has an addition function that works
with type(y), or maybe type(y) has a radd function that works
with type(x)... or I think Python might try something
involving coercion...? In any case the result type(z) is acceptable to
the append function of type(L)... then I guess myFunction is of type
   type(x) x type(y) x type(L) --> type(z)...

I'm lost.

But that's only the start!  Then you have to figure out efficient
ways of calculating these type annotations and looking up "minimal"
matching types.

Something tells me that this might be a Turing complete
problem -- but that doesn't mean you can't come up with
a reasonable and useful weak approximation.  Please inform
if I'm going over old ground or otherwise am missing something.

Thanks,  -- Aaron Watters
===
http://www.xfeedme.com/nucular/pydistro.py/go?FREETEXT=melting

From stephen at xemacs.org  Fri Dec 14 20:46:17 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 15 Dec 2007 04:46:17 +0900
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
 (probably)
In-Reply-To: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
Message-ID: <873au56ovq.fsf@uwakimon.sk.tsukuba.ac.jp>

Arnaud Delobelle writes:
 > (Sorry I can't in-reply-to I've lost the original message)
 > 
 > Neil Toronto ntoronto at cs.byu.edu:
 > 
 > > While kicking around ideas for multiple dispatch, I came across one of
 > > Perl's implementations, and it reminded me of something Larry Wall said
 > > recently in his State of the Onion address about how types get together
 > > and democratically select a function to dispatch to. I immediately
 > > thought of Arrow's Theorem.
 > 
 > Maybe I'm wrong, but I don't think this theorem applies in the case of
 > multiple dispatch.  The type of the argument sequence as a whole chooses
 > the best specialisation; it's not the case that each argument expresses a
 > preference.

I don't think it applies but I think a detailed analysis sheds light
on why perfect multiple dispatch is impossible.

The second part is true, individual arguments are not expressing
preferences in general (though they could, for example a generic
concatenation function could take an optional return_type argument so
that the appropriate type of sequence is constructed from scratch,
rather than sometimes requiring a possibly expensive conversion).  The
type is not doing the choosing, though.

The problem here is that what is really doing the choosing is the
application calling the function.  You could imagine a module dwim
which contains exactly one API: dwim(), which of course does the
computation needed by the application at that point.  dwim's
implementation would presumably dispatch to specializations.

Now since all dwim has is the type signature (this includes any
subtypes deducible from the value, such as "nonnegative integer", so
it's actually quite a lot of information), dwim can't work.  E.g.,
search("de","dent") and append("de","dent") have the same signature
(and a lisper might even return the substring whose head matches the
pattern, so the return type might not disambiguate).

While this is an example requiring so much generality as to be
bizarre, I think it's easy to imagine situations where applications
will disagree about the exact semantics that some generic function
should have.  The generic concatenation function is one, and an
individual application might even want to have the dispatch done
differently in different parts of the program.

In sum, the problem is the real "voters" (applications) are
represented by rather limited information (argument signatures) to the
dispatcher.

So the real goal here seems to be to come up with rules that
*programmers* can keep in their heads that (a) are flexible enough to
cover lots of common situations and (b) simple enough so that any
programmer good enough to be given dog food can remember both the
covered situations and the exceptions.

But that, in turn, requires the "situations" to be easy to factor the
"correct" way.  Saunders Mac Lane makes a comment about the analysis
of certain spaces in mathematics, where the special case that provides
all the intuition wasn't defined for decades after the general case.
Had it been done the other way around, the important theorems would
have been proved within a couple of years, and grad students could
have worked out the details within the decade.

So I don't think that this can be worked out by defining multiple
dispatch for Python; you have to define Python for multiple dispatch.
Maybe it's already pretty close?!

This-is-a-job-for-the-time-machine-ly y'rs,



From ntoronto at cs.byu.edu  Sat Dec 15 01:28:32 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Fri, 14 Dec 2007 17:28:32 -0700
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
	(probably)
In-Reply-To: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
Message-ID: <47631FB0.3080101@cs.byu.edu>

Arnaud Delobelle wrote:
> Neil Toronto ntoronto at cs.byu.edu:
> 
>> While kicking around ideas for multiple dispatch, I came across one of
>> Perl's implementations, and it reminded me of something Larry Wall said
>> recently in his State of the Onion address about how types get together
>> and democratically select a function to dispatch to. I immediately
>> thought of Arrow's Theorem.
> 
> Maybe I'm wrong, but I don't think this theorem applies in the case of
> multiple dispatch.  The type of the argument sequence as a whole chooses
> the best specialisation; it's not the case that each argument expresses a
> preference.  The set of choices is the set S of signatures of
> specialisations of the function (which is partially ordered), and the
> 'committee' is made of one entity only: the function call's signature s. 
> To choose the relevant specialisation, look at:
> 
> { t \in S | s <= t and \forall u \in S (s < u <= t) \implies  u = t }
> 
> If this set is a singleton { t } then the specialization with signature t
> is the best fit for s, otherwise there is no best fit.

Let's call this set T. In English, it's the set of t's (function 
signatures) whose types aren't more specific than s's - but only the 
most specific ones according to the partial general-to-specific 
ordering. Most multiple dispatch algorithms agree that T has the right 
function in it.

Some kind of conflict resolution seems necessary. The number of ways to 
have "no best fit" is exponential in the number of function arguments, 
so punting with a raised exception doesn't seem useful. It's in the 
conflict resolution - when the set isn't a singleton - that multiple 
dispatch algorithms are most different.

Conflict resolution puts us squarely into Arrow's wasteland. Whether 
it's formulated as a social choice function or not, I think it's 
equivalent to one.

In a social choice reformulation of multiple dispatch conflict 
resolution, T are the candidate outcomes. (This much is obvious.) The 
types are the agents: each prefers functions in which the type in its 
position in the signature is most specific. Again, it's a reformulation, 
but I believe it's equivalent.

The application implements a social choice function: it picks a winner 
function based on types' "preferences". Arrow's Theorem doesn't care 
exactly how it does this - whether it has types vote or sums distances 
from whatever - it just says that however it does this, the result can't 
always be "fair". And "fair" doesn't even mean one type can't have more 
say than another by, for example, narrowing the subset first based on 
its own preferences. It only means that things don't act 
counterintuitively in general, that there's *some* means to every 
outcome, and that one agent (type) doesn't control everything.

Maybe, as not-useful as it looks, punting with a raised exception is the 
only way out.

Neil


From ntoronto at cs.byu.edu  Sat Dec 15 09:07:14 2007
From: ntoronto at cs.byu.edu (ntoronto at cs.byu.edu)
Date: Sat, 15 Dec 2007 01:07:14 -0700 (MST)
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
 (probably)
In-Reply-To: <47631FB0.3080101@cs.byu.edu>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
	<47631FB0.3080101@cs.byu.edu>
Message-ID: <32039.10.7.75.26.1197706034.squirrel@mail.cs.byu.edu>

I wrote:
> The application implements a social choice function: it picks a winner
> function based on types' "preferences". Arrow's Theorem doesn't care
> exactly how it does this - whether it has types vote or sums distances
> from whatever - it just says that however it does this, the result can't
> always be "fair". And "fair" doesn't even mean one type can't have more
> say than another by, for example, narrowing the subset first based on
> its own preferences. It only means that things don't act
> counterintuitively in general, that there's *some* means to every
> outcome, and that one agent (type) doesn't control everything.

Sorry to self-reply, but I was just reading this:

    http://dev.perl.org/perl6/rfc/256.html

"""
1. 2. 3. <Exact matches>

4. Otherwise, the dispatch mechanism examines each viable target and
computes its inheritance distance from the actual set of arguments. The
inheritance distance from a single argument to the corresponding parameter
is the number of inheritance steps between their respective classes
(working up the tree from argument to parameter). If there's no
inheritance path between them, the distance is infinite. The inheritance
distance for a set of arguments is just the sum of their individual
inheritance distances.

5. The dispatch mechanism then chooses the viable target with the smallest
inheritance distance as the actual target. If more than one viable target
has the same smallest distance, the call is ambiguous. In that case, the
dispatch process fails and an exception is thrown (but see "Handling
dispatch failure" below). If there's only a single actual target, its
identity is cached (to accelerate subsequent dispatching), and then the
actual target is invoked.
"""

This is almost precisely the Borda protocol:

    http://en.wikipedia.org/wiki/Borda_count

Borda does not satisfy independence of irrelevant alternatives. HOWEVER,
the proposed dispatch mechanism does.

Why? The agents are voting a kind of utility, not a preference. With the
addition or removal of a function, the sum of votes for any other function
will not change. AFAIK, that's the only Arrow-ish problem with Borda, so
this proposed dispatch mechanism doesn't have the problems I was
expecting. Big stink for nothing, I guess.

I imagine that making next-method calls behave is a nightmare, though. The
more I read about multiple dispatch, the less I like it.

Neil



From stephen at xemacs.org  Sat Dec 15 23:08:43 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 16 Dec 2007 07:08:43 +0900
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
 (probably)
In-Reply-To: <32039.10.7.75.26.1197706034.squirrel@mail.cs.byu.edu>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
	<47631FB0.3080101@cs.byu.edu>
	<32039.10.7.75.26.1197706034.squirrel@mail.cs.byu.edu>
Message-ID: <87zlwbtxuc.fsf@uwakimon.sk.tsukuba.ac.jp>

ntoronto at cs.byu.edu writes:

 > This is almost precisely the Borda protocol:
 > 
 >     http://en.wikipedia.org/wiki/Borda_count

No, it is not.  The Borda rule requires a linear order, whereas this
is a partial order, and complete indifference is possible.

 > Why? The agents are voting a kind of utility, not a preference. With the
 > addition or removal of a function, the sum of votes for any other function
 > will not change.

And this utilitarian rule has its own big problems.

For one, you could argue that it violates the Arrow condition of
non-dictatorship because *somebody* has to choose the weights.  In
particular, weighting the number of levels by one doesn't make a lot
of sense: some developers prefer a shallow style with an abstract
class and a lot of concrete derivatives, others prefer a hierarchy of
abstract classes with several levels before arriving at the concrete
implementations.  I think it would be a bad thing if devotees of the
latter style were discouraged because their users found the
convenience of automatic dispatch more important than the (usually
invisible) internal type hierarchy.

Also, my intuition doesn't rule out the possibility that self *should*
be a dictator if it "expresses a preference" -- the Arrow condition of
non-dictatorship might not even be appropriate.

 > I imagine that making next-method calls behave is a nightmare, though. The
 > more I read about multiple dispatch, the less I like it.

It's a hard problem.



From jan.kanis at phil.uu.nl  Sat Dec 15 23:41:12 2007
From: jan.kanis at phil.uu.nl (Jan Kanis)
Date: Sat, 15 Dec 2007 23:41:12 +0100
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <24DF1DCA-A737-4E6E-8A73-7F1EFA605402@marooned.org.uk>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
	<bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>
	<fjplat$6r5$1@ger.gmane.org>
	<72EFBFDC-DD0B-40AB-B042-CE35E220762C@marooned.org.uk>
	<fjprfb$sjr$1@ger.gmane.org>
	<24DF1DCA-A737-4E6E-8A73-7F1EFA605402@marooned.org.uk>
Message-ID: <op.t3eboqtqd64u53@jan-laptop.netwerk>

On Thu, 13 Dec 2007 08:08:53 +0100, Arnaud Delobelle  
<arno at marooned.org.uk> wrote:

>
> On 12 Dec 2007, at 23:41, Georg Brandl wrote:
>
>> Arnaud Delobelle schrieb:
>>
>>> Let's test this (python 2.5):
>>>
>>> >>> A = '12'
>>> >>> B = 'ab'
>>> >>> gen = (x + y for x in A for y in B)
>>> >>> A = '34'
>>> >>> B = 'cd'
>>> >>> list(gen)
>>>  ['1c', '1d', '2c', '2d']
>>>
>>> So in the generator expression, A remains bound to the string '12'
>>> but B gets rebound to 'cd'.  This may make the implementation of
>>> generator expressions more straightforward, but from the point of view
>>> of a user of the language it seems rather arbitrary. What makes A so
>>> special as opposed to B?  Ok it belongs to the outermost loop, but
>>> conceptually in the example above there is no outermost loop.
>>
>> Well, B might depend on A so it can't be evaluated in the outer
>> context
>> at the time the genexp "function" is called. It has to be evaluated
>> inside the "function".
>
> You're right. I expressed myself badly: I was not talking about
> evaluation but binding.  I was saying that if the name A is bound to
> the object that A is bound to when the generator expression is
> created, then the same should happen with B.
>

I think what Georg meant was this (I intended to reply this to your  
earlier mail of Thursday AM, but Georg beat me to it):

The reason for not binding B when the genexp is defined is so you can do  
this:

  >>> A = [[1, 2], [3, 4]]
  >>> gen = (x for b in A for x in b)
  >>> list(gen)
  [1, 2, 3, 4]

Here, b can't be bound to something at generator definition time because  
the 'something' may not exist yet. (It does actually in this example, but  
you get the point.) So, only the first (outer loop) iterable is bound  
immediately.

Whether a variable is rebound within the expression could of course be  
decided at compile time, so all free variables could be bound immediately.  
I think that would be an improvement, but it requires the compiler to be a  
bit smarter. Unfortunately, it seems to be pythonic to bind variables at  
moments I disagree with :), like function default arguments (bound at  
definition instead of call) and loop counters (rebound every iteration  
instead of every iteration having its own scope).
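For instance, the loop-counter binding shows up in the classic closure
gotcha:

```python
# each lambda closes over the name i, which is rebound every iteration,
# so all three see the final value
fs = [lambda: i for i in range(3)]
[f() for f in fs]          # [2, 2, 2]

# a default argument is bound at definition time, so each lambda
# captures the value i had at that iteration
gs = [lambda i=i: i for i in range(3)]
[g() for g in gs]          # [0, 1, 2]
```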

And, while I'm writing this:

On Thu, 13 Dec 2007 00:01:42 +0100, Arnaud Delobelle  
<arno at marooned.org.uk> wrote:
> l = [f(x, y) for x in A for y in B(x) if g(x, y)]
> g = [f(x, y) for x in A for y in B(x) if g(x, y)]
> <code, maybe binding A, B, f, g to new objects>
> assert list(g) == l

I suppose this should have been

g = (f(x, y) for x in A for y in B(x) if g(x, y))


Jan


From arno at marooned.org.uk  Sun Dec 16 00:31:19 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Sat, 15 Dec 2007 23:31:19 +0000
Subject: [Python-ideas] free variables in generator expressions
In-Reply-To: <op.t3eboqtqd64u53@jan-laptop.netwerk>
References: <C8574052-A9EA-42B5-832D-677A686163A0@marooned.org.uk>
	<bbaeab100712121256j48f3fb2fg340df699b5c80b5f@mail.gmail.com>
	<fjplat$6r5$1@ger.gmane.org>
	<72EFBFDC-DD0B-40AB-B042-CE35E220762C@marooned.org.uk>
	<fjprfb$sjr$1@ger.gmane.org>
	<24DF1DCA-A737-4E6E-8A73-7F1EFA605402@marooned.org.uk>
	<op.t3eboqtqd64u53@jan-laptop.netwerk>
Message-ID: <47FB762F-A4F4-4522-BE07-BAF103EC0ED0@marooned.org.uk>


On 15 Dec 2007, at 22:41, Jan Kanis wrote:

> On Thu, 13 Dec 2007 08:08:53 +0100, Arnaud Delobelle <arno at marooned.org.uk 
> > wrote:
>
>>
>> On 12 Dec 2007, at 23:41, Georg Brandl wrote:
>>
>>> Arnaud Delobelle schrieb:
>>>
>>>> Let's test this (python 2.5):
>>>>
>>>> >>> A = '12'
>>>> >>> B = 'ab'
>>>> >>> gen = (x + y for x in A for y in B)
>>>> >>> A = '34'
>>>> >>> B = 'cd'
>>>> >>> list(gen)
>>>> ['1c', '1d', '2c', '2d']
>>>>
>>>> So in the generator expression, A remains bound to the string
>>>> '12'
>>>> but B gets rebound to 'cd'.  This may make the implementation of
>>>> generator expressions more straightforward, but from the point of
>>>> view
>>>> of a user of the language it seems rather arbitrary. What makes A  
>>>> so
>>>> special as opposed to B?  Ok it belongs to the outermost loop, but
>>>> conceptually in the example above there is no outermost loop.
>>>
>>> Well, B might depend on A so it can't be evaluated in the outer
>>> context
>>> at the time the genexp "function" is called. It has to be evaluated
>>> inside the "function".
>>
>> You're right. I expressed myself badly: I was not talking about
>> evaluation but binding.  I was saying that if the name A is bound to
>> the object that A is bound to when the generator expression is
>> created, then the same should happen with B.
>>
>
> I think what Georg meant was this (I intended to reply this to your  
> earlier mail of Thursday AM, but Georg beat me to it):
>
> The reason for not binding B when the genexp is defined is so you  
> can do this:
>
> >>> A = [[1, 2], [3, 4]]
> >>> gen = (x for b in A for x in b)
> >>> list(gen)
> [1, 2, 3, 4]
>
> Here, b can't be bound to something at generator definition time  
> because the 'something' may not exist yet. (It does actually in this  
> example, but you get the point.) So, only the first (outer loop)  
> iterable is bound immediately.
>

In your example, b is not free of course.

> Whether a variable is rebound within the expression could of course  
> be decided at compile time, so all free variables could be bound  
> immediately. I think that would be an improvement, but it requires  
> the compiler to be a bit smarter.
>
This is what I was advocating.  As it is decided at compile time
which variables are free, it may only be a small extra step to
add a bit of code saying that they must be bound at the creation
of the generator expression.  Or, to continue with the _genexp
function mentioned in previous posts, for:

(f(x) for b in a for x in b)

to be translated as

def _genexp(f, A):
     for b in A:
         for x in b:
             yield f(x)

as A and f are free but not b and x.  Then

gen = (f(x) for b in A for x in b)

would be translated as

gen = _genexp(f, A)

I imagine this wouldn't be too hard, but I am not familiar with
the specifics of python code compilation...
Moreover this behaviour ('freezing' all free variables at the
creation of the generator expression) is well defined and easy
to reason on I think.  I haven't yet had the time to see how
generator expressions are created, but I'd like to have a look,
although I suspect I will have to learn a lot more besides in
order to understand it.

[...]
> And, while I'm writing this:
>
> On Thu, 13 Dec 2007 00:01:42 +0100, Arnaud Delobelle <arno at marooned.org.uk 
> > wrote:
>> l = [f(x, y) for x in A for y in B(x) if g(x, y)]
>> g = [f(x, y) for x in A for y in B(x) if g(x, y)]
>> <code, maybe binding A, B, f, g to new objects>
>> assert list(g) == l
>
> I suppose this should have been
>
> g = (f(x, y) for x in A for y in B(x) if g(x, y))

Yes!  Sorry about that.  In fact, I should also have called the
generator expression something other than 'g', as it is already
the name of a function (g(x, y)) :|

-- 
Arnaud




From ntoronto at cs.byu.edu  Mon Dec 17 06:17:57 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Sun, 16 Dec 2007 22:17:57 -0700
Subject: [Python-ideas] Democratic multiple dispatch doomed to fail
	(probably)
In-Reply-To: <87zlwbtxuc.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>	<47631FB0.3080101@cs.byu.edu>	<32039.10.7.75.26.1197706034.squirrel@mail.cs.byu.edu>
	<87zlwbtxuc.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <47660685.8020903@cs.byu.edu>

Stephen J. Turnbull wrote:
> ntoronto at cs.byu.edu writes:
>  > Why? The agents are voting a kind of utility, not a preference. With the
>  > addition or removal of a function, the sum of votes for any other function
>  > will not change.
> 
> And this utilitarian rule has its own big problems.
> 
> For one, you could argue that it violates the Arrow condition of
> non-dictatorship because *somebody* has to choose the weights.  In
> particular, weighting the number of levels by one doesn't make a lot
> of sense: some developers prefer a shallow style with an abstract
> class and a lot of concrete derivatives, others prefer a hierarchy of
> abstract classes with several levels before arriving at the concrete
> implementations.  I think it would be a bad thing if devotees of the
> latter style were discouraged because their users found the
> convenience of automatic dispatch more important than the (usually
> invisible) internal type hierarchy.

Good point. Further, what if it's more important for one type to be 
treated as specifically as possible than for another?

In a multi-agent setting, we'd call this the problem of combining 
utilities. In general, there's not a good way to do it - utilities (or 
these pseudo-utilities) are unique only up to a linear (or positive 
affine) transform. Assuming it's possible to pick a zero point you can 
normally shift and multiply them, but here the only obvious zero point 
(minimum distance over all matching functions) would net you a lot of ties.

Here's another fun idea: a lesser-known theorem called 
Gibbard-Satterthwaite says you can't construct a fair voting system in 
which the best response for each agent is to always tell the truth. It's 
strange to think of types *lying* about their distances from each other, 
but I'm sure users will end up constructing whacked-out hierarchies in 
an attempt to game the system and swing the vote toward specific functions.

At that point, dispatch may as well be completely manual.

Maybe the voting reformulation wasn't such a bad idea after all. At the 
very least, I've decided I really don't like Perl's implementation. :D

> Also, my intuition doesn't rule out the possibility that self *should*
> be a dictator if it "expresses a preference" -- the Arrow condition of
> non-dictatorship might not even be appropriate.

I'm not completely sure of this (the docs I've read have been pretty 
vague) but I think CLOS does that. One way to look at the problem is 
finding a total order in some D1xD2x...xDn space, where Di is a type's 
distance from matching function signature types. CLOS imposes a 
lexicographical order on this space, so self is a dictator. Type 2 gets 
to be a dictator if self is satisfied, and so on. Somehow they make it 
work with multiple inheritance. Of all the approaches I've seen I like 
this best, though it's still really complex.
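With made-up per-argument distance vectors, that lexicographic order is
just Python's tuple comparison:

```python
# hypothetical per-argument inheritance distances for two candidate
# methods; position 0 is self, which dominates
candidates = [
    ((0, 2), 'method_1'),   # exact match on self, looser on the 2nd arg
    ((1, 0), 'method_2'),   # looser on self, exact on the 2nd arg
]
# tuples compare lexicographically, so self acts as a dictator
winner = min(candidates)[1]   # 'method_1'
```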

Python's __op__/__rop__ is a rather clever solution to the problem in 
binary form. Do you know if there's a generalization of this? CLOS is 
almost it, but Python lets the class containing __op__ or __rop__ own 
the operation after it finds one that implements it.
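For reference, the binary protocol in miniature (Meters is a made-up
type):

```python
class Meters(object):
    def __init__(self, n):
        self.n = n
    def __add__(self, other):
        # decline pairings we don't understand; Python then tries
        # the other operand's __radd__
        if isinstance(other, (int, float)):
            return Meters(self.n + other)
        return NotImplemented
    def __radd__(self, other):
        # reached when the left operand's __add__ returned NotImplemented
        if isinstance(other, (int, float)):
            return Meters(other + self.n)
        return NotImplemented

(Meters(2) + 1).n   # 3, via Meters.__add__
(1 + Meters(2)).n   # 3: int.__add__ declines, so Meters.__radd__ owns it
```

(And if the right operand's class is a subclass of the left's, Python
tries its __radd__ first, which is how subclasses get to override.)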

Neil


From stephen at xemacs.org  Mon Dec 17 08:27:55 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 17 Dec 2007 16:27:55 +0900
Subject: [Python-ideas] Democratic multiple dispatch doomed to
	fail	(probably)
In-Reply-To: <47660685.8020903@cs.byu.edu>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>
	<47631FB0.3080101@cs.byu.edu>
	<32039.10.7.75.26.1197706034.squirrel@mail.cs.byu.edu>
	<87zlwbtxuc.fsf@uwakimon.sk.tsukuba.ac.jp>
	<47660685.8020903@cs.byu.edu>
Message-ID: <87abo9g4qs.fsf@uwakimon.sk.tsukuba.ac.jp>

Neil Toronto writes:

 > Python's __op__/__rop__ is a rather clever solution to the problem in 
 > binary form. Do you know if there's a generalization of this? CLOS is 
 > almost it, but Python lets the class containing __op__ or __rop__ own 
 > the operation after it finds one that implements it.

Sorry, no.  I'm a professional economist (my early work was related to
Gibbard-Satterthwaite, thus my interest here) and wannabe developer,
who happens to have enjoyed the occasional guidance of Ken Arrow a
couple decades ago.  If you think there's something to the application
of multiobjective optimization to this problem, I'd love to take a
hack at it ... no classes to teach winter term, I've got some time to
bone up on the multidispatch problem itself.



From ntoronto at cs.byu.edu  Mon Dec 17 20:43:21 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Mon, 17 Dec 2007 12:43:21 -0700
Subject: [Python-ideas] Democratic multiple dispatch doomed to
	fail	(probably)
In-Reply-To: <87abo9g4qs.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <60914.217.180.32.12.1197620420.squirrel@www.marooned.org.uk>	<47631FB0.3080101@cs.byu.edu>	<32039.10.7.75.26.1197706034.squirrel@mail.cs.byu.edu>	<87zlwbtxuc.fsf@uwakimon.sk.tsukuba.ac.jp>	<47660685.8020903@cs.byu.edu>
	<87abo9g4qs.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4766D159.9060005@cs.byu.edu>

Stephen J. Turnbull wrote:
> Neil Toronto writes:
> 
>  > Python's __op__/__rop__ is a rather clever solution to the problem in 
>  > binary form. Do you know if there's a generalization of this? CLOS is 
>  > almost it, but Python lets the class containing __op__ or __rop__ own 
>  > the operation after it finds one that implements it.
> 
> Sorry, no.  I'm a professional economist (my early work was related to
> Gibbard-Satterthwaite, thus my interest here) and wannabe developer,
> who happens to have enjoyed the occasional guidance of Ken Arrow a
> couple decades ago.

Excellent.

> If you think there's something to the application
> of multiobjective optimization to this problem, I'd love to take a
> hack at it ... no classes to teach winter term, I've got some time to
> bone up on the multidispatch problem itself.

For finding the first matching function, you could say it's picking some 
"best fit" from the Pareto optimal front. AFAIK all ways of doing that 
suck somehow. :) After that, the problem is calling a next-method, which 
requires imposing a total order on the entire space of fitting 
functions. (This had better define the "best fit" as well.)

The big hurdles to multiple dispatch I see are:

1. It has to "make sense" somehow - maybe by satisfying some set of 
intuitive axioms.
2. It seems unlikely a majority will agree on the axioms. (Yay! Another 
social choice problem!)
3. It has to be simple enough that it's easy to predict what a method or 
next-method call does.

Because this seems nigh-impossible (particularly #3), I'm starting to 
look elsewhere for solutions to "the problem". Speaking of which, it's 
really easy to forget the two primary use-cases that motivate multiple 
dispatch in the first place:

1. Interop with other types. For instance, I define a BarNumber type, 
and I want fooing of BarNumber with anything else to return a new 
BarNumber. Or perhaps I need some class I currently would have no 
control over to treat my custom type specially or apply a transformation 
before doing what it usually does. CLOS (Common Lisp Object System) 
handles this. Python's __op__/__rop__ does for a finite set of binary 
operators.

2. True dynamic overloading so we don't have to do type checks at the 
heads of our functions. CLOS handles this.

#1 can always be done with clever delegation. For example, numpy's 
arrays do it halfway by applying __array__ to a function argument before 
operating on it. If they also applied, say, a type's fromarray 
classmethod upon return it'd provide a full solution.
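A pure-Python sketch of that round trip (the protocol names __to_list__
and from_list are invented here, not numpy's):

```python
class Grid(object):
    """Toy array-like wrapper; the conversion protocol is hypothetical."""
    def __init__(self, data):
        self.data = list(data)
    def __to_list__(self):            # conversion in, like __array__
        return self.data
    @classmethod
    def from_list(cls, data):         # conversion back out
        return cls(data)

def doubled(seq):
    # convert the argument in, operate, and convert the result back
    # to the argument's own type when it knows how
    data = seq.__to_list__() if hasattr(seq, '__to_list__') else list(seq)
    result = [2 * x for x in data]
    fromlist = getattr(type(seq), 'from_list', None)
    return fromlist(result) if fromlist else result
```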

But as with numpy's arrays, users don't have enough control to make it 
work exactly as they want if the only solution is delegation. The class 
creator has to anticipate how his class will be used and risk 
over-generalizing it. In the worst cases, you'd end up with a crazy 
proliferation of types like FooLoggable and FooAppendable and methods 
like __foo_loggable__ and __foo_appendable__...

The most general way to do #2 is the way it's done now. Sure it's 
Turing-complete, but it'd be nice to be able to handle the common cases 
automatically.

PJE has a lot of further reading here:

   http://mail.python.org/pipermail/python-3000/2006-November/004421.html

Neil


From ryan.freckleton at gmail.com  Wed Dec 26 00:33:31 2007
From: ryan.freckleton at gmail.com (Ryan Freckleton)
Date: Tue, 25 Dec 2007 16:33:31 -0700
Subject: [Python-ideas] Anonymous functions for decorators
Message-ID: <318072440712251533u4eadbe79w7af11582f82f3313@mail.gmail.com>

I've been working with decorators to make a design by contract
library. This has gotten me thinking about decorators and anonymous
functions.

The two use cases I'll target are callbacks and decorating descriptors,
e.g. (from the Twisted tutorial):

class FingerProtocol(basic.LineReceiver):
    def lineReceived(self, user):
        self.factory.getUser(user
        ).addErrback(lambda _: "Internal error in server"
        ).addCallback(lambda m:
                      (self.transport.write(m+"\r\n"),
                       self.transport.loseConnection()))

currently this can be modified to (ab)use a decorator:
    ...
    def lineReceived(self, user):
        @self.factory.getUser(user).addErrback
        def temp(_):
            return "Internal error in server"
    ...

This is an abuse of decorators because it doesn't actually decorate the
function temp; it registers it as a callback for the class FingerProtocol.

And here is a use of decorators which I mimic for my design by contract
library.  (Guido recently posted a patch on python-dev that would allow
property to be used in the following way):

class C(object):
    x = property(doc="x coordinate")

    @x.setter
    def x(self, x):
        self.__x = x

    @x.getter
    def x(self):
        return self.__x


There was some concern voiced on the list about repeating the variable
name both in the decorator and the function definition.

In other languages, these use cases are handled by anonymous functions
or blocks. Python also has an anonymous function, lambda, but it's
limited to a single expression. There are parsing and readability
issues with making lambda multi-line.

My proposal would be to use a special syntax, perhaps a keyword, to
allow decorators to take an anonymous function.

    @self.factory.getUser(user).addCallback
    def temp(_):
        return "Internal error in server"

    would become:

    @self.factory.getUser(user).addCallback
    do (_):
        return "Internal error in server"

    and:

    @x.setter
    def x(self, x):
        self.__x = x

    would become:
    @x.setter
    do (self, x):
        self.__x = x

In general:

@DECORATOR
do (..args...):
    ..function

would be the same as:

def temp(..args..):
    ..function
DECORATOR(temp)
del temp
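That three-step desugaring can be checked today with a named throwaway
function; the register decorator below is a made-up example, not part of
the proposal:

```python
registry = []

def register(func):
    # A registration-style decorator: it stores func and returns nothing.
    registry.append(func)

# The proposed 'do' block would be equivalent to this dance:
def temp(x):
    return x * 2
register(temp)
del temp

# Only the registry holds the function now; the name 'temp' is gone.
print(registry[0](21))  # -> 42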


The decorators should be stackable, as well so:

@DEC2
@DEC1
do (..args..):
    ..function

would be the same as:

def temp(..args..):
    ..function
DEC2(DEC1(temp))
del temp
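The stacked form composes in the usual bottom-up order; a sketch with two
hypothetical tracing decorators:

```python
log = []

def DEC1(func):
    # Innermost decorator runs first.
    log.append("DEC1 saw %s" % func.__name__)
    return func

def DEC2(func):
    log.append("DEC2 saw %s" % func.__name__)
    return func

def temp():
    return "body"
result = DEC2(DEC1(temp))
del temp

print(log)       # -> ['DEC1 saw temp', 'DEC2 saw temp']
print(result())  # -> body
```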

Other uses/abuses

Perhaps this new 'do' block could be used to implement things like the
following:

@repeatUntilSuccess(times=10)
do ():
    print "Attempting to connect."
    socket.connect()

with repeatUntilSuccess implemented as:

def repeatUntilSuccess(times=None):
    if times is None:
        def repeater(func):
            while True:
                try:
                    func()
                    break
                except Exception:
                    continue
    else:
        def repeater(func):
            for i in range(times):
                try:
                    func()
                    break
                except Exception:
                    continue
    return repeater
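A quick check of the finite-retry branch, using a named function in place of
the proposed 'do' block and a simulated flaky connect (this condenses the two
branches above into one loop):

```python
attempts = []

def repeatUntilSuccess(times=None):
    # Condensed variant of the repeatUntilSuccess above: the returned
    # "decorator" runs func immediately, retrying on any exception.
    def repeater(func):
        count = 0
        while times is None or count < times:
            try:
                func()
                break
            except Exception:
                count += 1
    return repeater

@repeatUntilSuccess(times=10)
def connect():
    attempts.append(1)
    if len(attempts) < 3:
        raise IOError("connection refused")

print(len(attempts))  # -> 3: two failures, then success
print(connect)        # -> None: repeater returned nothing
```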


@fork
do ():
    time.sleep(10)
    print "This is done in another thread."
    thing.dostuff()


with fork implemented as:

def fork(func):
    t = threading.Thread(target=func)
    t.start()
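As written, fork discards the Thread, so the caller can't join it; a variant
that returns the thread (an adjustment of mine, not part of the original)
makes the behavior observable:

```python
import threading

def fork(func):
    # Variant of the fork() above that returns the Thread so callers can join.
    t = threading.Thread(target=func)
    t.start()
    return t

results = []

@fork
def worker():
    results.append("done in another thread")

worker.join()   # 'worker' is now the Thread object, not a function
print(results)  # -> ['done in another thread']
```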

Open Issues:
~~~~~~~~~~~~
Doing a Google search shows that the word 'do' is occasionally used as a
function or method name; the same holds for the synonyms act and perform.
Reusing def would probably not be a good idea, because it could easily be
confused with a normal decorator and function.

This is still somewhat of an abuse of the decorator syntax: it isn't
adding annotations to the anonymous function, so it isn't really
'decorating' it. I do think it's better than using a function that
returns None as a decorator.

While scanning source code, a 'do' block may be easily confused with a
normal def. I'm not really sure what would work better; suggestions are
welcome. Perhaps I'm going down the wrong path by reusing decorator
syntax.

Other possible keywords instead of 'do': anonymous, function, act, perform.

I first thought of reusing the keyword lambda, but I think the
semantics of lambda are too different to be used in this case.
Similarly, the keyword with has different semantics, as well.

Thanks,

-- 
=====
--Ryan E. Freckleton


From adam at atlas.st  Wed Dec 26 01:03:05 2007
From: adam at atlas.st (Adam Atlas)
Date: Tue, 25 Dec 2007 19:03:05 -0500
Subject: [Python-ideas] Anonymous functions for decorators
In-Reply-To: <318072440712251533u4eadbe79w7af11582f82f3313@mail.gmail.com>
References: <318072440712251533u4eadbe79w7af11582f82f3313@mail.gmail.com>
Message-ID: <6F7CB91A-F6F8-4AE2-A76D-1E5237CBFA2F@atlas.st>

I don't think def temp(...) is so terrible. Perhaps it's slightly  
inelegant, but adding a new keyword just so decorator syntax can be  
used in a decidedly non-decoratorish way to save a small amount of  
typing doesn't seem like something that would be accepted. If we were  
adding anonymous functions, I should think we'd make them more  
powerful while we're at it; it would at least be good to have them be  
first-class expressions, so you could do:

DECORATOR(def (...args...): ...function...) #or whatever the syntax  
for anonymous functions would end up being

and all the other useful things that come with anonymous functions and  
closures.

I think I suggested a while ago that we allow that syntax -- def  
(args): ...statements... -- to be used as an expression, and also  
allow it to span multiple lines like a normal Python function  
definition, but I was told that this would complicate parsing of  
indentation. I'm not sure why that is, but I'm not familiar with the  
internals of Python's parser.

In any case, you can currently do
@DECORATOR
def temp(...args...):
     ...function...

(as you are already doing in your Twisted example) and all you'll end  
up with is a name called "temp" bound to None. Not too bad. I really  
don't think inhibiting the assignment of some Nones is worth a new  
keyword and such.
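The "bound to None" outcome is easy to demonstrate with a registration
decorator that returns nothing (register here is a made-up example):

```python
callbacks = []

def register(func):
    # Returns None, so the decorated name ends up bound to None.
    callbacks.append(func)

@register
def temp():
    return "hello"

print(temp)            # -> None
print(callbacks[0]())  # -> hello
```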


On 25 Dec 2007, at 18:33, Ryan Freckleton wrote:
> In general:
>
> @DECORATOR
> do (..args...):
>    ..function
>
> would be the same as:
>
> def temp(..args..):
>    ..function
> DECORATOR(temp)
> del temp