
At 04:18 PM 11/30/03 -0800, Guido van Rossum wrote:
The chaining part, or the idea at all? For the idea in general, I was just proposing a more explicit form of the last API proposal. For the chaining part, well, my use case is the same as the old Zope query library: being able to compose operators to craft OO queries from a high level description. No reason that needs to go in the standard library, but as long as we were dreaming, I figured I might help implement it if it solved enough problems for me. :) (Without the chaining part, I don't really care if there's a standard library 'extract()' or not, since I'll still need to write a chaining one sooner or later.)
Yes. Really the whole extract thing isn't that useful, except to get extra speed over using 'lambda x: x.foo' or whatever, which is what I'd probably use in any code that wasn't composing functions or compiling an OO query language. :)

The chaining part, or the idea at all?
Sorry, I didn't even see the chaining idea, so it was the keyword attribute instead of two different functions I disliked. The chaining idea seems unnecessary; you can do this with a general currying facility. (I'm not so keen on copying too many such ideas from Zope into Python -- together they feel more like a bag of clever tricks than like a well-thought-out language.)

Anyway, it's all moot -- Raymond just added operator.itemgetter and operator.attrgetter.

--Guido van Rossum (home page: http://www.python.org/~guido/)
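For readers who have not met the two functions named here, the following is a minimal sketch of how operator.attrgetter and operator.itemgetter stand in for the small lambdas discussed in this thread. It is written in present-day Python 3 syntax; the Student class and the sample data are assumptions made up for illustration, not something from the original messages.

```python
from operator import attrgetter, itemgetter


class Student:
    """Hypothetical record type, used only for this illustration."""

    def __init__(self, name, score):
        self.name = name
        self.score = score


students = [Student("ann", 72), Student("bob", 91), Student("cy", 85)]
pairs = [("cat", 10), ("dog", 90), ("goldfish", 0.1)]

# attrgetter("score") behaves like: lambda ob: ob.score
get_score = attrgetter("score")
by_score = sorted(students, key=get_score)

# itemgetter(1) behaves like: lambda seq: seq[1]
get_weight = itemgetter(1)
by_weight = sorted(pairs, key=get_weight)

print([s.name for s in by_score])   # ['ann', 'cy', 'bob']
print([p[0] for p in by_weight])    # ['goldfish', 'cat', 'dog']
```

Note that the attribute name is passed as a string, which is the point Guido objects to later in the thread.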

[Phillip J. Eby]
[Thomas Heller]
Hm, couldn't "lambda ob: ob.foo.bar" return exactly the same thing as "extract(extract(attr='foo'), attr='bar')"? In other words: return specialized C implemented functions for simple lambda expressions?

I agree with Thomas - rather than adding yet more specialised functions, it would seem more sensible to optimize lambda - probably via special cases like this.

Paul.
--
This signature intentionally left blank

Paul Moore <pf_moore@yahoo.co.uk> writes:
One question that remains is: do a handful of these specialized functions make it possible to replace the remaining uses of lambda completely? Looking at parts of my codebase, nearly all uses of lambda are 'lambda self: self.someattr'. The remaining occurrences have not yet been ported to the idioms of newer Pythons.

Thomas

Aha. Very interesting ideas both!

In the past, we had a similar issue with exec/eval. We looked at the most frequent uses of these, and found that getting an attribute with a computed name was the most common, so we added getattr (and setattr and delattr). Importing a module with a computed name was also quite common, and now we have __import__. So now exec/eval are typically only used when we *really* want to run code provided by an end user. (Exception: I often use eval() to parse literals when I know it is a literal but it can have several types, e.g. string, int, float. Maybe there's a restricted form of eval that could be used for this too?)

So again, here we have a mechanism that's rather generic (lambda) which is frequently used in a few stylized patterns (to extract an attribute or field). So Raymond's new functions attrgetter and itemgetter (whose names I cannot seem to remember :-) take care of these.

But, at least for attrgetter, I am slightly unhappy with the outcome, because the attribute name is now expressed as a string literal rather than using attribute notation. This makes it harder to write automated tools that check or optimize code. (For itemgetter it doesn't really matter, since the index is a literal either way.)

So, while I'm not particularly keen on lambda, I'm not that keen on attrgetter either. But what could be better? All I can think of are slightly shorter but even more crippled forms of lambda; for example, we could invent a new keyword XXX so that the expression (XXX.foo) is equivalent to (lambda self: self.foo). This isn't very attractive.

Maybe the idea of recognizing some special forms of lambda and implementing them more efficiently indeed makes more sense!

Hm, I see no end to this rambling, but I've got to go, so I'll just stop now...

--Guido van Rossum (home page: http://www.python.org/~guido/)
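On the aside about a restricted form of eval for parsing literals: later Python releases provide ast.literal_eval in the standard library, which did not exist when this message was written. The sketch below, in Python 3 syntax, shows it handling literals of several types; the sample strings are made up for illustration.

```python
import ast

# ast.literal_eval accepts only literal syntax (numbers, strings, tuples,
# lists, dicts, sets, booleans, None); it does not call functions.
samples = ["42", "3.14", "'spam'", "[1, 2, 3]"]
parsed = [ast.literal_eval(text) for text in samples]
for value in parsed:
    print(type(value).__name__, value)

# An expression that is not a plain literal is rejected, not executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print("non-literal rejected:", rejected)
```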

Guido van Rossum <guido@python.org> writes:
Doesn't have to be a keyword... I implemented something like this years ago and then ditched it when list comps appeared. It would let you do things like
    >>> map(X + 1, range(2))
    [1, 2, 3]

too, IIRC.

Cheers, mwh

What was your notation like? Did you actually use 'X'? --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum <guido@python.org> writes:
Um, I think so. I defined it in my $PYTHONSTARTUP file as an instance of a class _X, or something like that.

Cheers, mwh
--
In many ways, it's a dull language, borrowing solid old concepts from many other languages & styles: boring syntax, unsurprising semantics, few automatic coercions, etc etc. But that's one of the things I like about it. -- Tim Peters, 16 Sep 93

Michael Hudson <mwh@python.net> writes:
Something like this?

    class Adder:
        def __init__(self, number):
            self._number = number
        def __call__(self, arg):
            return arg + self._number

    class X:
        def __add__(self, number):
            return Adder(number)

    X = X()

    print map(X + 1, range(2))
[1, 2, 3]
(Although the above only prints [1, 2] ;-)

Ah, of course. Nice. This can be extended to __getattr__ and __getitem__; unfortunately __call__ would be ambiguous. It could probably be made quite fast with a C implementation.

Now the question remains, would it be better to hide this and simply use it under the hood as an alternative way of generating code for lambda, or should it be some sort of standard library module, to be invoked explicitly? An argument in favor of the latter is that it would solve the semantic differences with lambda when free variables are involved: obviously X+q would evaluate q only once, while (lambda X: X+q) evaluates q on each invocation. Remember that for generator expressions we've made the decision that (X+q for X in seq) should evaluate q only once.

--Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum <guido@python.org> writes:
The latter has another advantage (or is this a disadvantage of the former?): You can invoke lambda x: x.something with a keyword arg, which would not be possible with a C implemented function, I assume. lambda expressions are often used to implement gui callbacks, and they are sometimes invoked this way. So the former would introduce incompatibilities. Thomas
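A small sketch of the incompatibility described here, in Python 3 syntax. It uses today's operator.attrgetter as an example of a C-implemented getter; the object ob and the attribute name are made up for illustration.

```python
from operator import attrgetter
from types import SimpleNamespace

ob = SimpleNamespace(something="value")

as_lambda = lambda x: x.something
as_getter = attrgetter("something")

# A lambda's parameter has a name, so a caller may pass it by keyword.
print(as_lambda(x=ob))

# attrgetter objects accept their argument positionally only.
print(as_getter(ob))
try:
    as_getter(x=ob)
    keyword_ok = True
except TypeError:
    keyword_ok = False
print("attrgetter accepts x= keyword:", keyword_ok)
```

A callback framework that invokes its callable with a keyword argument would therefore work with the lambda but fail with the getter.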

Thomas Heller <theller@python.net>:
If you're asking whether a C function can have a keyword argument whose name is determined at runtime, I think that could be arranged -- it would just be a matter of concocting the appropriate arguments to PyArg_ParseTupleAndKeywords dynamically.

As to a user-friendly syntax for invoking it, maybe something like

    arg.x.something

Greg Ewing, Computer Science Dept, University of Canterbury, Christchurch, New Zealand
greg@cosc.canterbury.ac.nz
A citizen of NewZealandCorp, a wholly-owned subsidiary of USA Inc.

Guido van Rossum <guido@python.org> writes:
Yes. I think I used .c() for that. IIRC, I also complicated things to the extent that there was a special object, say N, and you could write

    >>> map(X.split(N), ['a a', 'b b\nb'], [None, '\n'])
    [['a', 'a'], ['b b', 'b']]
This might be going a bit far...
It could probably be made quite fast with a C implementation.
Would be tedious to write though :-)
I am not a fan of the idea of compiling certain kinds of lambdas differently.

Cheers, mwh
--
While preceding your entrance with a grenade is a good tactic in Quake, it can lead to problems if attempted at work. -- C Hacking -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html

[Thomas Heller]
Looking at parts of my codebase nearly all uses of lambda are 'lambda self: self.someattr'.
Yes, they are everywhere. Getting rid of those lambdas was part of the attraction for attrgetter().
I don't know if you like this, but there is a way to change the interface to attrgetter() so that the dot notation can be used instead of a string. It produces the same result and is neater, but I find it somewhat harder to explain:

    import operator

    class NewAttrGetter(object):
        def __getattribute__(self, name):
            return operator.attrgetter(name)

    newattrgetter = NewAttrGetter()

    class A:
        pass

    a = A()
    a.score = 10

    getscore = operator.attrgetter('score')
    print getscore(a)

    getscore = newattrgetter.score
    print getscore(a)

new-style-classes-rock-ly yours,

Raymond Hettinger

[Guido]
[Me, with mildly wacky idea]
Carried to the limit, the idea turns into something that is either sublime or severely bonkers. The good points are that Guido gets his dotted access and I get to trade in the two ugly names for a single beautiful "extract". And there's no performance cost; the inner loop is the same. The downside is I still don't know how to explain it (AFAICT, super() is the closest thing to it):

    import operator

    class ExtractorClass(object):
        def __getattribute__(self, attr):
            return operator.attrgetter(attr)
        def __getitem__(self, key):
            return operator.itemgetter(key)

    extract = ExtractorClass()

    class A:
        pass

    a = A()
    a.score = 10

    getscore = extract.score        # Houston, we have dotted access
    print getscore(a)

    b = [10, 20, 30, 40]
    getsecond = extract[1]          # and, we have the bracketed lookups
    print getsecond(b)

    animal_weights = [('cat', 10), ('dog', 90), ('human', 150),
                      ('goldfish', 0.1), ('unladen_sparrow', 3)]
    list.sorted(animal_weights, key=extract[1])
    list.sorted(students, key=extract.score)

So, now we have a weird bird that is faster and better looking than lambda.

Raymond
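For anyone who wants to try the idea on a current interpreter, here is a minimal Python 3 rendering of the class posted above. It is a sketch, not part of the operator module; the builtin sorted() is used as a stand-in for the list.sorted() spelling in the post, and the dictionary example is an added illustration.

```python
import operator


class ExtractorClass:
    """Python 3 rendering of the posted idea; not a standard library API."""

    def __getattribute__(self, attr):
        # extract.score  ->  operator.attrgetter('score')
        return operator.attrgetter(attr)

    def __getitem__(self, key):
        # extract[1]  ->  operator.itemgetter(1)
        return operator.itemgetter(key)


extract = ExtractorClass()

animal_weights = [("cat", 10), ("dog", 90), ("human", 150),
                  ("goldfish", 0.1), ("unladen_sparrow", 3)]

# Sort by the second field of each tuple.
by_weight = sorted(animal_weights, key=extract[1])
print([name for name, _ in by_weight])
# ['goldfish', 'unladen_sparrow', 'cat', 'dog', 'human']

# Item access is not limited to integer indexes; dictionary keys work too.
get_name = extract["name"]
print(get_name({"name": "cat", "weight": 10}))   # cat
```

The bracketed form works because Python looks up __getitem__ on the type, so it is not intercepted by the __getattribute__ override.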

Raymond Hettinger wrote:
What you end up with is still a unary operator, so it could still live in the operator module, too. And I think what you posted would work for strings as dictionary keys, too - answering another of the objections to the original operator.extract.

Which leaves figuring out a concise explanation for what the hell it does (without using lambda in the examples, since I assume part of the aim here is to avoid explaining lambda to people who don't need it). . .

"operator.extract provides an interim target for an attribute or item access where the real target of the access is to be determined later. The access is made normally (i.e. dotted notation or indexing), with 'operator.extract' substituted where the target would normally be written. The actual access is carried out by calling the result returned by the expression operator.extract is part of with the real target as the first argument. E.g.

    y = operator.extract.foo

is equivalent to

    def y(x):
        return x.foo

    y = operator.extract[2]

is equivalent to

    def y(x):
        return x[2]
"

If you meant a concise explanation of _how_ it does what it does, then I'm not sure that can be done :)

And I still don't know if I should be admiring this or running away screaming!

Cheers, Nick.

Nick Coghlan <ncoghlan@iinet.net.au> wrote in news:3FCDCDA1.2040402@iinet.net.au:
You could even consider reinstating the ability to chain extract operations. (Although this may be even harder to explain). The version below allows attribute and item access to be chained and mixed freely:

    import operator

    class ExtractorClass(object):
        def __init__(self, fn=None, arg=None, parent=None):
            if fn is None:
                extract = []
            else:
                extract = [(fn, arg)]
            if parent is not None:
                extract = parent._accessors + extract
            self._accessors = extract

        def __getattribute__(self, attr):
            if attr == '_accessors':
                return object.__getattribute__(self, attr)
            return ExtractorClass(getattr, attr, self)

        def __getitem__(self, key):
            return ExtractorClass(operator.getitem, key, self)

        def __call__(self, obj):
            for fn, arg in self._accessors:
                obj = fn(obj, arg)
            return obj

    extract = ExtractorClass()

    class A:
        pass

    a = A()
    a.score = 10
    b = A()
    a.b = b
    b.score = 42

    getscore = extract.score        # Houston, we have dotted access
    print getscore(a)

    getsubscore = extract.b.score   # Chaining
    print getsubscore(a)

    b = [10, 20, 30, 40]
    getsecond = extract[1]          # and, we have the bracketed lookups
    print getsecond(b)

    a.b = b
    getbsecond = extract.b[1]       # Chain a mix of attributes and indexes.
    print getbsecond(a)

--
Duncan Booth duncan@rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?
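Guido's earlier remark that chaining can be handled by a general facility, rather than by the extractor itself, can be sketched as plain function composition. The helper name chain and the sample object are assumptions for illustration only. The sketch is Python 3 and also relies on the fact that, in current Python, operator.attrgetter accepts dotted names.

```python
from functools import reduce
from operator import attrgetter, itemgetter
from types import SimpleNamespace


def chain(*getters):
    """Hypothetical helper: apply each getter in turn, left to right."""
    def call(obj):
        return reduce(lambda value, getter: getter(value), getters, obj)
    return call


a = SimpleNamespace(score=10,
                    b=SimpleNamespace(score=42),
                    items=[10, 20, 30, 40])

# Attribute-only chains: attrgetter accepts dotted names in current Python.
get_subscore = attrgetter("b.score")
print(get_subscore(a))        # 42

# Mixed attribute/index chains via plain composition.
get_second_item = chain(attrgetter("items"), itemgetter(1))
print(get_second_item(a))     # 20
```

This gives up the dotted notation that motivated the extractor class in exchange for using only ordinary callables.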

participants (9)
- Duncan Booth
- Greg Ewing
- Guido van Rossum
- Michael Hudson
- Nick Coghlan
- Paul Moore
- Phillip J. Eby
- Raymond Hettinger
- Thomas Heller