From masklinn at masklinn.net  Mon Oct  4 10:34:33 2010
From: masklinn at masklinn.net (Masklinn)
Date: Mon, 4 Oct 2010 10:34:33 +0200
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
Message-ID: <7E929BBF-952E-4B86-BBDE-E7C8AD437337@masklinn.net>

On 2010-10-04, at 05:04 , Eviatar Bach wrote:
> Hello,
> 
> I have a proposal of making the range() function inclusive; that is,
> range(3) would generate 0, 1, 2, and 3, as opposed to 0, 1, and 2. Not only
> is it more intuitive, it also seems to be used often, with coders often
> writing range(0, example+1) to get the intended result. It would be easy to
> implement, and though significant, is not any more drastic than changing
> print to a function in Python 3. Of course, if this were done, slicing
> behaviour would have to be adjusted accordingly.
> 
> What are your thoughts?

Same as the others:
0. This is a discussion for python-ideas, I'm CCing that list
1. This is a major backwards compatibility breakage, and one which is entirely silent (`print` from keyword to function wasn't)
2. It loses not only well-known behavior but interesting properties as well (`range(n)` has exactly `n` elements. With your proposal, it has ``n+1``, breaking ``for i in range(5)`` as a way to iterate 5 times, as well as ``for i in range(len(collection))`` for cases where e.g. ``enumerate`` is not good enough or too slow)
3. As well as the relation between range and slices
4. I fail to see how it is more intuitive (let alone more practical, see previous points)
5. If you want an inclusive range, I'd recommend proposing a flag on `range` (e.g. ``inclusive=True``; see the sketch below) rather than such a drastic breakage of ``range``'s behavior. That, at least, might have a chance. Changing the existing default behavior of range most definitely doesn't.

I'd be -1 on your proposal, -0 on adding a flag to ``range`` (I can't recall the half-open ``range`` having bothered me recently, if ever)
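
For illustration, a minimal sketch of what such a flag could amount to
(hypothetical helper, not an actual API):

    def inclusive_range(start, stop=None, step=1):
        # Hypothetical: a range() whose stop value is included.
        if stop is None:
            start, stop = 0, start
        return range(start, stop + (1 if step > 0 else -1), step)

    >>> list(inclusive_range(3))
    [0, 1, 2, 3]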

From ncoghlan at gmail.com  Mon Oct  4 14:59:51 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 4 Oct 2010 22:59:51 +1000
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
Message-ID: <AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>

On Mon, Oct 4, 2010 at 5:27 PM, Xavier Morel <python-dev at masklinn.net> wrote:
> Same as the others:
> 0. This is a discussion for python-ideas, I'm CCing that list
> 1. This is a major backwards compatibility breakage, and one which is entirely silent (`print` from keyword to function wasn't)
> 2. It loses not only well-known behavior but interesting properties as well (`range(n)` has exactly `n` elements. With your proposal, it has ``n+1`` breaking ``for i in range(5)`` to iterate 5 times as well as ``for i in range(len(collection))`` for cases where e.g. ``enumerate`` is not good enough or too slow)
> 3. As well as the relation between range and slices
> 4. I fail to see how it is more intuitive (let alone more practical, see previous points)
> 5. If you want an inclusive range, I'd recommend proposing a flag on `range` (e.g. ``inclusive=True``) rather than such a drastic breakage of ``range``'s behavior. That, at least, might have a chance. Changing the existing default behavior of range most definitely doesn't.

A flag doesn't have any chance either - you spell inclusive ranges by
including a "+1" on the stop value.

Closed ranges actually do superficially appear more intuitive
(especially to new programmers) because we often use inclusive ranges
in ordinary speech ("10-15 people" allows 15 people, "ages 8-12"
includes 12-year-olds, "from A-Z" includes items starting with "Z").
However, there are some cases where we naturally use half-open ranges
as well (such as "between 10 and 12" excluding 12:01 to 12:59), or
explicitly invoke exclusive ranges as being easier to deal with (such
as the "under 13s", "under 19s", etc. naming schemes used for age
brackets in junior sports).

However, as soon as you move into the mathematical world (including
programming), closed ranges turn out to require constant adjustments
in the arithmetic, so it is far more natural to use half-open ranges
consistently.

Xavier noted the two most important properties of half-closed ranges
for Python: they match the definition of subtraction, such that
len(range(start, stop)) == (stop - start), and they match the
definition of slicing as being half-open.
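
For instance (illustrative interpreter session):

    >>> len(range(3, 10))
    7
    >>> 10 - 3
    7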

As to whether slicing itself being half-open is beneficial, the value
of that becomes clear once you start trying to manipulate ranges:
With half-open slices, the following is true: s == s[:i] + s[i:]
With inclusive slices (which would be needed to complement an
inclusive range), you would need either a -1 on the stop value of the
first slice, or a +1 on the start value of the second slice.
Similarly, if you know the length of the slice you want, you can grab
it via s[i:i+slice_len], while you'd need a -1 correction on the
stop value if slices were inclusive.
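
A quick illustrative session:

    >>> s = "abcdef"
    >>> all(s == s[:i] + s[i:] for i in range(len(s) + 1))
    True
    >>> s[2:2 + 3]   # a slice of known length 3
    'cde'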

There are other benefits to half-open ranges when it comes to
(approximately) continuous spectra like time values, floating point
numbers and lexically ordered strings. Being able to say things like
"10:00" <= x < "12:00", 10.0 <= x < 12.0, "a" <= x < "n" is much
clearer than trying to specify their closed range equivalents. While
that isn't specifically applicable to the range() builtin, it is
another factor in why it is important to drink the "half-open ranges
are your friend" Kool-Aid as a serious programmer.
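
For example (illustrative session):

    >>> x = "11:30"
    >>> "10:00" <= x < "12:00"
    True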

Cheers,
Nick.

P.S. many of the points above are just rephrased from
http://www.siliconbrain.com/ranges.htm, which is the first hit when
Googling "half-open ranges"

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From cmjohnson.mailinglist at gmail.com  Tue Oct  5 10:54:04 2010
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Mon, 4 Oct 2010 22:54:04 -1000
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
Message-ID: <AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>

Changing range would only make sense if lists were also changed to
start at 1 instead of 0, and that's never gonna happen. It's a
massively backwards incompatible change with no real offsetting
advantage.

Still, if you were designing a brand new language today, would you
have arrays/lists start at 0 or 1? (Or compromise and do .5?) I
personally lean towards 1, since I recall being frequently tripped up
by the first element in an array being a[0] way back when I first
learned C++ in the 20th century. But maybe this was because I had been
messed up by writing BASIC for loops from 1 to n before that? Is there
anyone with teaching experience here? Is this much of a problem for
young people learning Python (or any other zero-based indexing
language) as their first language?

What do you guys think? Now that simplifying pointer arithmetic isn't
such an important consideration, is it still better to do zero-based
indexing?

-- Carl Johnson


From masklinn at masklinn.net  Tue Oct  5 11:08:35 2010
From: masklinn at masklinn.net (Masklinn)
Date: Tue, 5 Oct 2010 11:08:35 +0200
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
Message-ID: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>


On 2010-10-05, at 10:54 , Carl M. Johnson wrote:

> Changing range would only make sense if lists were also changed to
> start at 1 instead of 0, and that's never gonna happen. It's a
> massively backwards incompatible change with no real offsetting
> advantage.
> 
> Still, if you were designing a brand new language today, would you
> have arrays/lists start at 0 or 1? (Or compromise and do .5?) I
> personally lean towards 1, since I recall being frequently tripped up
> by the first element in an array being a[0] way back when I first
> learn C++ in the 20th century. But maybe this was because I had been
> messed up by writing BASIC for loops from 1 to n before that? Is there
> anyone with teaching experience here? Is this much of a problem for
> young people learning Python (or any other zero-based indexing
> language) as their first language?
> 
> What do you guys think? Now that simplifying pointer arithmetic isn't
> such an important consideration, is it still better to do zero-based
> indexing?

I will refer to EWD 831[0], which talks about ranges and starting indexes without *once* referring to pointers.

Pointers are in fact entirely irrelevant to the discussion: FORTRAN and ALGOL 60, among many others, used 1-indexed collections. Some languages (Ada, I believe, though I am by no means certain) also allow for arbitrary starting indexes.

[0] http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF

From cmjohnson.mailinglist at gmail.com  Tue Oct  5 12:52:39 2010
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Tue, 5 Oct 2010 00:52:39 -1000
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
Message-ID: <AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>

I did some research before posting and saw that they talked about that
Dijkstra paper on C2's page about zero indexing, and honestly, I count
it as a point in favor of starting with 1. Dijkstra was a great
computer scientist but a terrible computer programmer (with the
exception of "Goto Considered Harmful" [a headline he didn't actually
give to his article]) in the sense that he understood how to do things
mathematically but not how to take into account the human factors in
such a way that one can get normal people to program well. His theory
that we should all be proving the correctness of our programs is, to my
way of thinking, a crank's theory. If regular people can't be trusted
to program, they certainly can't be trusted to write correctness
proofs, which is a harder task, not a simpler one. Moreover this
ignores all of the stuff that Paul Graham would eventually say about
the joys of exploratory programming, or to give an earlier reference,
the need to build one to throw away as Brooks said. Proving
correctness presumes that you know what you want to program before you
start programming it, which is only rarely the case, mostly in the
computer science classroom. So, I don't consider Dijkstra's expertise
to be worth relying on in matters of programming, as distinct from
matters of computer science.

In the particular case, the correct way to represent an integer
between 2 and 12 wouldn't be a, b, c, or d. It would be i in range(2,
12) (if we were creating a new language that was 1 indexed and range
was likewise adjusted), the list [1] would be range(1), and the empty
list would be range(0), so the whole issue could be neatly
sidestepped. :-)

As for l == l[:x] + l[x:y] + l[y:] where y > x, I think a case can be
made that it would be less confusing as l == l[:x] + l[x+1:y] +
l[y+1:], since you don't want to start again with x or y. You just
ended at x. When you pick up again, you want to start at x+1 and y+1
so that you don't get the x-th and y-th elements again. ;-)

Of course this is speculation on my part. Maybe students of
programming find 1-indexing just as confusing as 0-indexing. Any
pedagogues want to chime in?

-- Carl Johnson


From masklinn at masklinn.net  Tue Oct  5 13:05:43 2010
From: masklinn at masklinn.net (Masklinn)
Date: Tue, 5 Oct 2010 13:05:43 +0200
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
Message-ID: <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>

On 2010-10-05, at 12:52 , Carl M. Johnson wrote:
> In the particular case, the correct way to represent an integer
> between 2 and 12 wouldn't be a, b, c, or d. It would be i in range(2,
> 12)
You don't seem to realize that a, b, c and d are behaviors of languages, and that `range` can map to any of these 4 behaviors. The current `range` implements behavior `a`; the proposed one implements behavior `c`. a, b, c and d are simply descriptions of these behaviors in mathematical terms, so as not to rely on language-specific concepts.

> (if we were creating a new language that was 1 indexed and range
> was likewise adjusted), the list [1] would be range(1), and the empty
> list would be range(0), so the whole issue could be neatly
> sidestepped. :-)
I fail to see what gets sidestepped there. Ignored at best.

> As for l == l[:x] + l[x:y] + l[y:] where y > x, I think a case can be
> made that it would be less confusing as l == l[:x] + l[x+1:y] +
> l[y+1:], since you don't want to start again with x or y.
Why not?

> You just
> ended at x. When you pick up again, you want to start at x+1 and y+1
> so that you don't get the x-th and y-th elements again. ;-)
Yes indeed; as you demonstrate here, closed ranges greatly complicate the code one has to write compared to half-closed ranges.

From bborcic at gmail.com  Tue Oct  5 13:45:56 2010
From: bborcic at gmail.com (Boris Borcic)
Date: Tue, 05 Oct 2010 13:45:56 +0200
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
Message-ID: <i8f35l$1t0$1@dough.gmane.org>

Nick Coghlan wrote:
> [...] Being able to say things like
> "10:00" <= x < "12:00", 10.0 <= x < 12.0, "a" <= x < "n" are much
> clearer than trying to specify their closed range equivalents.

makes one wonder about syntax like :

for 10 <= x < 20 :
     blah(x)


Mh, I suppose with the rich comparison special methods, it's possible to turn
chained comparisons into range factories without introducing new syntax.
Something more like


for x in (10 <= step(1) < 20) :
     blah(x)
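
A rough sketch (hypothetical, just to show the mechanism) of one way
such a step() factory could work: each comparison records a bound and
returns the object itself, so the chained expression evaluates to an
iterable.

    class step:
        def __init__(self, size):
            self.size = size
            self.start = None
            self.stop = None
        def __ge__(self, other):
            # "10 <= s" falls back to s.__ge__(10) via reflection
            self.start = other
            return self
        def __lt__(self, other):
            # "s < 20" calls s.__lt__(20)
            self.stop = other
            return self
        def __iter__(self):
            return iter(range(self.start, self.stop, self.size))

    >>> list(10 <= step(3) < 20)
    [10, 13, 16, 19]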



From cmjohnson.mailinglist at gmail.com  Tue Oct  5 13:51:10 2010
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Tue, 5 Oct 2010 01:51:10 -1000
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
Message-ID: <AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>

On Tue, Oct 5, 2010 at 1:05 AM, Masklinn <masklinn at masklinn.net> wrote:

>> (if we were creating a new language that was 1 indexed and range
>> was likewise adjusted), the list [1] would be range(1), and the empty
>> list would be range(0), so the whole issue could be neatly
>> sidestepped. :-)
> I fail to see what gets sidestepped there. Ignored at best.

He was trying to be language-neutral by writing with < and <=, but
that's part of his problem. He's too much of a mathematician.
Rewriting things so that they don't use < or <= at all is the best way
to explain things to a non-math person. If you say "range(1, 5) gives
a range from 1 to 5" your explanation doesn't have to use < or <= at
all. This is unlike a C-like language where you would write int i=2;
i<12; i++. So the question of what mathematics "really" underlies it
can be sidestepped by using a language that many people know better
than the language of mathematics: the English language.

>> As for l == l[:x] + l[x:y] + l[y:] where y > x, I think a case can be
>> made that it would be less confusing as l == l[:x] + l[x+1:y] +
>> l[y+1:], since you don't want to start again with x or y.
> Why not?

Because (speaking naively) I already FizzBuzzed the x-th element
before. I don't want to double FizzBuzz it. So that means I should
start up again with the +1 element.

>> You just
>> ended at x. When you pick up again, you want to start at x+1 and y+1
>> so that you don't get the x-th and y-th elements again. ;-)
> Yes indeed, as you demonstrate here closed ranges greatly complexify the code one has to write compared to half-closed ranges.

Yup. TANSTAAFL. That's why we shouldn't actually bother to change
things: you lose on the backend what you gain on the frontend. I'm
just curious about whether starting programmers have a strong
preference for one or the other convention or whether both are
confusing.


From masklinn at masklinn.net  Tue Oct  5 14:10:46 2010
From: masklinn at masklinn.net (Masklinn)
Date: Tue, 5 Oct 2010 14:10:46 +0200
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
Message-ID: <107B0DE5-08EC-4933-8B02-32AE0CCE7BD2@masklinn.net>


On 2010-10-05, at 13:51 , Carl M. Johnson wrote:

> On Tue, Oct 5, 2010 at 1:05 AM, Masklinn <masklinn at masklinn.net> wrote:
> 
>>> (if we were creating a new language that was 1 indexed and range
>>> was likewise adjusted), the list [1] would be range(1), and the empty
>>> list would be range(0), so the whole issue could be neatly
>>> sidestepped. :-)
>> I fail to see what gets sidestepped there. Ignored at best.
> 
> He was trying to be language neutral by writing using < and <= but
> that's part of his problem. He's too much of a mathematician.
> Rewriting things so that they don't use < or <= at all is the best way
> to explain things to a non-math person. If you say "range(1, 5) gives
> a range from 1 to 5" your explanation doesn't have to use < or <= at
> all.
But again, you don't sidestep anything. "A range from 1 to 5" is ambiguous and can be understood as any of the 4 relations Dijkstra provides. So it's only a good way to explain it in that 0. it doesn't expose the reader to semi-mathematical notation anybody over 12 should be able to understand, and 1. it avoids any semblance of unambiguity and instead leaves all interpretation to the reader.

> This is unlike a C-like language where you would write int i=2;
> i<12; i++.
Uh what?

> So the question of what mathematics "really" underlies it
> can be sidestepped by using a language that many people know better
> than the language of mathematics: the English language.
No. As I said, it doesn't sidestep the issue but ignores it, replacing perfectly unambiguous notation with an utterly ambiguous description.



From fuzzyman at voidspace.org.uk  Tue Oct  5 15:07:41 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Tue, 5 Oct 2010 14:07:41 +0100
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
Message-ID: <AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>

On 5 October 2010 12:51, Carl M. Johnson <cmjohnson.mailinglist at gmail.com>wrote:

> [snip...]
>
> Yup. TANSTAAFL. That's why we shouldn't actually bother to change
> things: you lose on the backend what you gain on the frontend. I'm
> just curious about whether starting programmers have a strong
> preference for one or the other convention or whether both are
> confusing.
>

Both when teaching new programmers and when teaching programmers coming
from other languages, I've found them confused by the range behaviour, and
I usually end up having to apologise for it (a sure sign of a language wart).

It is *good* that range(5) produces 5 values (0 to 4) but *weird* that
range(3, 10) doesn't include the 10.

Changing it now would be *very* backwards incompatible of course. Python 4
perhaps?

All the best,

Michael Foord






-- 
http://www.voidspace.org.uk

From ctb at msu.edu  Tue Oct  5 15:13:56 2010
From: ctb at msu.edu (C. Titus Brown)
Date: Tue, 5 Oct 2010 06:13:56 -0700
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>
Message-ID: <20101005131356.GA21646@idyll.org>

On Tue, Oct 05, 2010 at 02:07:41PM +0100, Michael Foord wrote:
> On 5 October 2010 12:51, Carl M. Johnson <cmjohnson.mailinglist at gmail.com>wrote:
> 
> > [snip...]
> >
> > Yup. TANSTAAFL. That's why we shouldn't actually bother to change
> > things: you lose on the backend what you gain on the frontend. I'm
> > just curious about whether starting programmers have a strong
> > preference for one or the other convention or whether both are
> > confusing.
> 
> Both teaching new programmers and programmers coming from other languages
> I've found them confused by the range behaviour and usually end up having to
> apologise for it (a sure sign of a language wart).
> 
> It is *good* that range(5) produces 5 values (0 to 4) but *weird* that
> range(3, 10) doesn't include the 10.
> 
> Changing it now would be *very* backwards incompatible of course. Python 4
> perhaps?

Doesn't it make sense that 

len(range(5)) == 5

and

for i in range(5):
   ...

mimics the C/C++ behavior of

   for (i = 0; i < 5; i++) ...

?

--titus
-- 
C. Titus Brown, ctb at msu.edu


From fuzzyman at voidspace.org.uk  Tue Oct  5 15:16:20 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Tue, 05 Oct 2010 14:16:20 +0100
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <20101005131356.GA21646@idyll.org>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>
	<20101005131356.GA21646@idyll.org>
Message-ID: <4CAB2524.1010008@voidspace.org.uk>

  On 05/10/2010 14:13, C. Titus Brown wrote:
> On Tue, Oct 05, 2010 at 02:07:41PM +0100, Michael Foord wrote:
>> On 5 October 2010 12:51, Carl M. Johnson<cmjohnson.mailinglist at gmail.com>wrote:
>>
>>> [snip...]
>>>
>>> Yup. TANSTAAFL. That's why we shouldn't actually bother to change
>>> things: you lose on the backend what you gain on the frontend. I'm
>>> just curious about whether starting programmers have a strong
>>> preference for one or the other convention or whether both are
>>> confusing.
>> Both teaching new programmers and programmers coming from other languages
>> I've found them confused by the range behaviour and usually end up having to
>> apologise for it (a sure sign of a language wart).
>>
>> It is *good* that range(5) produces 5 values (0 to 4) but *weird* that
>> range(3, 10) doesn't include the 10.
>>
>> Changing it now would be *very* backwards incompatible of course. Python 4
>> perhaps?
> Doesn't it make sense that
>
> len(range(5)) == 5
>
> and
>
> for i in range(5):
>     ...
>
> mimics the C/C++ behavior of
>
>     for (i = 0; i < 5; i++) ...
>

Yes. That is why I said that the current behaviour of range for a single
input is *good*. Perhaps I should have been clearer; it is only the
behaviour of range(x, y) that I've found people new to Python confused by.

All the best,

Michael

> ?
>
> --titus


-- 
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.



From alexander.belopolsky at gmail.com  Tue Oct  5 16:33:14 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 5 Oct 2010 10:33:14 -0400
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <4CAB2524.1010008@voidspace.org.uk>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>
	<20101005131356.GA21646@idyll.org>
	<4CAB2524.1010008@voidspace.org.uk>
Message-ID: <AANLkTimJpWh1je9MzO276p6k+qn9Zg-bNuFAfL-f1bWf@mail.gmail.com>

On Tue, Oct 5, 2010 at 9:16 AM, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> ... Perhaps I should have been clearer; it is only the
> behaviour of range(x, y) that I've found people-new-to-python confused by.

Teach them about range(x, y, z) and once you cover negative z they
will stop complaining about range(x, y). :-)

At least you don't have to deal with range vs. xrange in 3.x anymore.
IMO, range([start,] stop[, step]) is one of the worst interfaces in
Python.  Is there any other function with an optional *first*
argument?  Why can't range(date(2010, 1, 1), date(2010, 2, 1),
timedelta(1)) be used to produce the days in January?  Why does
range(2**300) succeed, but len(range(2**300)) raise OverflowError?
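
For illustration, a minimal sketch of the kind of generic range that
would cover the date case (hypothetical helper, not an existing API):

    from datetime import date, timedelta

    def generic_range(start, stop, step):
        # Works for any type supporting + and <, e.g. dates.
        current = start
        while current < stop:
            yield current
            current += step

    january = list(generic_range(date(2010, 1, 1),
                                 date(2010, 2, 1), timedelta(1)))
    # 31 values: 2010-01-01 through 2010-01-31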

No, I don't think much can be done about it.  Py3k has already done
everything that was practical about improving range(..).


From masklinn at masklinn.net  Tue Oct  5 16:47:33 2010
From: masklinn at masklinn.net (Masklinn)
Date: Tue, 5 Oct 2010 16:47:33 +0200
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <AANLkTimJpWh1je9MzO276p6k+qn9Zg-bNuFAfL-f1bWf@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>
	<20101005131356.GA21646@idyll.org>
	<4CAB2524.1010008@voidspace.org.uk>
	<AANLkTimJpWh1je9MzO276p6k+qn9Zg-bNuFAfL-f1bWf@mail.gmail.com>
Message-ID: <B02D3715-7640-4D60-B82B-4E4AB0315DB0@masklinn.net>

On 2010-10-05, at 16:33 , Alexander Belopolsky wrote:
> On Tue, Oct 5, 2010 at 9:16 AM, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
>> ... Perhaps I should have been clearer; it is only the
>> behaviour of range(x, y) that I've found people-new-to-python confused by.
> Teach them about range(x, y, z) and once you cover negative z they
> will stop complaining about range(x, y). :-)
> 
> At least you don't have to deal with range vs. xrange in 3.x anymore.
> IMO, range([start,] stop[, step]) is one of the worst interfaces in
> python.  Is there any other function with an optional *first*
> argument?
Dict, kinda, though the other arguments are keywords so it probably doesn't count.

>  Why range(date(2010, 1, 1), date(2010, 2, 1), timedelta(1))
> cannot be used to produce days in January?
Likewise for range('a', 'e'). Range only working on integers is definitely annoying compared to the equivalent construct in Haskell for instance, or Ruby (though Ruby has the issue of indistinguishable half-closed and fully-closed ranges when using the operator version).

>  Why range(2**300)
> succeeds, but len(range(2**300)) raises OverflowError?
The former overflows in Python 2. It doesn't in Python 3, due to `range` being an iterable rather than a list.



From fuzzyman at voidspace.org.uk  Tue Oct  5 16:51:20 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Tue, 05 Oct 2010 15:51:20 +0100
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <AANLkTimJpWh1je9MzO276p6k+qn9Zg-bNuFAfL-f1bWf@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>	<20101005131356.GA21646@idyll.org>	<4CAB2524.1010008@voidspace.org.uk>
	<AANLkTimJpWh1je9MzO276p6k+qn9Zg-bNuFAfL-f1bWf@mail.gmail.com>
Message-ID: <4CAB3B68.4020304@voidspace.org.uk>

  On 05/10/2010 15:33, Alexander Belopolsky wrote:
> On Tue, Oct 5, 2010 at 9:16 AM, Michael Foord<fuzzyman at voidspace.org.uk>  wrote:
>> ... Perhaps I should have been clearer; it is only the
>> behaviour of range(x, y) that I've found people-new-to-python confused by.
> Teach them about range(x, y, z) and once you cover negative z they
> will stop complaining about range(x, y). :-)

Well, it probably doesn't help (for those coming to Python from 
languages other than C) that some languages do-the-right-thing with 
ranges. <0.5 wink>

$ irb
 >> (1..3).to_a
=> [1, 2, 3]

All the best,

Michael Foord




> At least you don't have to deal with range vs. xrange in 3.x anymore.
> IMO, range([start,] stop[, step]) is one of the worst interfaces in
> python.  Is there any other function with an optional *first*
> argument?  Why range(date(2010, 1, 1), date(2010, 2, 1), timedelta(1))
> cannot be used to produce days in January?  Why range(2**300)
> succeeds, but len(range(2**300)) raises OverflowError?
>
> No, I don't think much can be done about it.  Py3k has already done
> everything that was practical about improving range(..).


-- 
http://www.voidspace.org.uk/



From alexander.belopolsky at gmail.com  Tue Oct  5 17:02:12 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 5 Oct 2010 11:02:12 -0400
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <B02D3715-7640-4D60-B82B-4E4AB0315DB0@masklinn.net>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>
	<20101005131356.GA21646@idyll.org>
	<4CAB2524.1010008@voidspace.org.uk>
	<AANLkTimJpWh1je9MzO276p6k+qn9Zg-bNuFAfL-f1bWf@mail.gmail.com>
	<B02D3715-7640-4D60-B82B-4E4AB0315DB0@masklinn.net>
Message-ID: <AANLkTi==XAyi5z8YR_YwO89pTP+rFvFeMN_Hf6Uhs7JE@mail.gmail.com>

On Tue, Oct 5, 2010 at 10:47 AM, Masklinn <masklinn at masklinn.net> wrote:
..
>> ?Why range(2**300)
>> succeeds, but len(range(2**300)) raises OverflowError?
> The former overflows in Python 2. It doesn't in Python 3 due to `range` being an iterable not a list.

This particular wart is the subject of issue 2690.

http://bugs.python.org/issue2690


From masklinn at masklinn.net  Tue Oct  5 17:03:11 2010
From: masklinn at masklinn.net (Masklinn)
Date: Tue, 5 Oct 2010 17:03:11 +0200
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <4CAB3B68.4020304@voidspace.org.uk>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>	<20101005131356.GA21646@idyll.org>	<4CAB2524.1010008@voidspace.org.uk>
	<AANLkTimJpWh1je9MzO276p6k+qn9Zg-bNuFAfL-f1bWf@mail.gmail.com>
	<4CAB3B68.4020304@voidspace.org.uk>
Message-ID: <9FC12AF7-688B-40D5-A77F-4C2ED20FAA61@masklinn.net>

On 2010-10-05, at 16:51 , Michael Foord wrote:
> On 05/10/2010 15:33, Alexander Belopolsky wrote:
>> On Tue, Oct 5, 2010 at 9:16 AM, Michael Foord<fuzzyman at voidspace.org.uk>  wrote:
>>> ... Perhaps I should have been clearer; it is only the
>>> behaviour of range(x, y) that I've found people-new-to-python confused by.
>> Teach them about range(x, y, z) and once you cover negative z they
>> will stop complaining about range(x, y). :-)
> 
> Well, it probably doesn't help (for those coming to Python from languages other than C) that some languages do-the-right-thing with ranges. <0.5 wink>
> 
> $ irb
> >> (1..3).to_a
> => [1, 2, 3]
> 
> All the best,
True, likewise for Haskell:
Prelude> [0..5]
[0,1,2,3,4,5]

On the other hand (for Ruby),
>> (1...3).to_a
=> [1, 2]

Ruby is also a bit different in that ranges are generally used more for containment-testing (via when) and there is a separate Fixnum.upto for iteration.

From denis.spir at gmail.com  Tue Oct  5 21:23:27 2010
From: denis.spir at gmail.com (spir)
Date: Tue, 5 Oct 2010 21:23:27 +0200
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <i8f35l$1t0$1@dough.gmane.org>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<i8f35l$1t0$1@dough.gmane.org>
Message-ID: <20101005212327.7a8965ff@o>

On Tue, 05 Oct 2010 13:45:56 +0200
Boris Borcic <bborcic at gmail.com> wrote:

> Nick Coghlan wrote:
> > [...] Being able to say things like
> > "10:00" <= x < "12:00", 10.0 <= x < 12.0, "a" <= x < "n" are much
> > clearer than trying to specify their closed range equivalents.
> 
> makes one wonder about syntax like :
> 
> for 10 <= x < 20 :
>      blah(x)
> 
> 
> Mh, I suppose with rich comparisons special methods, it's possible to turn 
> chained comparisons into range factories without introducing new syntax. 
> Something more like
> 
> 
> for x in (10 <= step(1) < 20) :
>      blah(x)

About notation: even if I loved right-hand-half-open intervals, I would wonder about [a,b] denoting one. I guess 99.9% of programmers and novices (even pure amateurs) have learnt about intervals at school in math courses. Both notations I know of use [a,b] for closed intervals, while half-open ones are noted either [a,b[ or [a,b). Thus, for me, the present C/Python/etc. notation is at best misleading.
So, what about a hypothetical language directly using the unambiguous math notation, thus also letting programmers choose their preferred semantics (without fooling others)? End of war?

Denis
-- -- -- -- -- -- --
vit esse estrany ?

spir.wikidot.com



From python at mrabarnett.plus.com  Tue Oct  5 21:43:29 2010
From: python at mrabarnett.plus.com (MRAB)
Date: Tue, 05 Oct 2010 20:43:29 +0100
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <20101005212327.7a8965ff@o>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>	<i8f35l$1t0$1@dough.gmane.org>
	<20101005212327.7a8965ff@o>
Message-ID: <4CAB7FE1.2040902@mrabarnett.plus.com>

On 05/10/2010 20:23, spir wrote:
> On Tue, 05 Oct 2010 13:45:56 +0200
> Boris Borcic<bborcic at gmail.com>  wrote:
>
>> Nick Coghlan wrote:
>>> [...] Being able to say things like
>>> "10:00" <= x < "12:00", 10.0 <= x < 12.0, "a" <= x < "n" are much
>>> clearer than trying to specify their closed range equivalents.
>>
>> makes one wonder about syntax like :
>>
>> for 10 <= x < 20 :
>>       blah(x)
>>
>>
>> Mh, I suppose with rich comparisons special methods, it's possible to turn
>> chained comparisons into range factories without introducing new syntax.
>> Something more like
>>
>>
>> for x in (10 <= step(1) < 20) :
>>       blah(x)
>
> About notation, even if loved right-hand-half-open intervals, I would wonder about [a,b] noting it. I guess 99.9% of programmers and novices (even purely amateur) have learnt about intervals at school in math courses. Both notations I know of use [a,b] for closed intervals, while half-open ones are noted either [a,b[ or [a,b). Thus, for me, the present C/python/etc notation is at best misleading.
> So, what about a hypothetical language using directly math *unambiguous* notation, thus also letting programmers chose their preferred semantics (without fooling others)? End of war?
>
[Oops! Post sent to wrong list!]

Dijkstra came to his conclusion after seeing the results of students
using the programming language Mesa, which does support all 4 forms of
interval.


From tjreedy at udel.edu  Tue Oct  5 23:41:03 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 05 Oct 2010 17:41:03 -0400
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
Message-ID: <i8g61e$ams$1@dough.gmane.org>

On 10/5/2010 4:54 AM, Carl M. Johnson wrote:
> Changing range would only make sense if lists were also changed to
> start at 1 instead of 0, and that's never gonna happen. It's a
> massively backwards incompatible change with no real offsetting
> advantage.
>
> Still, if you were designing a brand new language today, would you
> have arrays/lists start at 0 or 1? (Or compromise and do .5?) I
> personally lean towards 1, since I recall being frequently tripped up
> by the first element in an array being a[0] way back when I first
> learn C++ in the 20th century. But maybe this was because I had been
> messed up by writing BASIC for loops from 1 to n before that? Is there
> anyone with teaching experience here? Is this much of a problem for
> young people learning Python (or any other zero-based indexing
> language) as their first language?
>
> What do you guys think? Now that simplifying pointer arithmetic isn't
> such an important consideration, is it still better to do zero-based
> indexing?

Sequences are often used as and can be viewed as tabular representations 
of functions for equally spaced inputs a+0*b, a+1*b, ..., a+i*b, .... In 
the simplest case, a==0 and b==1, so that the sequence directly maps 
counts 0,1,2,... to values. Without the 0 index, one must subtract 1 
from each index to have the same effect. Pointer arithmetic is an 
example of the utility of keeping the 0 term, but only one such example 
of many.

When one uses iterators instead of sequences, as is more common in
Python 3, there is no inherent index to worry about or argue over.

def inner_product(p, q):  # no equal/finite len() check!
    total = 0
    for a, b in zip(p, q):
        total += a * b
    return total

No index in sight.

-- 
Terry Jan Reedy



From greg.ewing at canterbury.ac.nz  Wed Oct  6 01:33:25 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 06 Oct 2010 12:33:25 +1300
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
Message-ID: <4CABB5C5.6000904@canterbury.ac.nz>

Carl M. Johnson wrote:
> I'm
> just curious about whether starting programmers have a strong
> preference for one or the other convention or whether both are
> confusing.

Starting programmers don't have enough experience to judge
which will be less confusing in the long run, so their
opinion shouldn't be given overriding weight when designing
a language intended for real-life use.

Speaking as an experienced programmer, I'm convinced that
Python has made the right choice. Not because Dijkstra or
any other authority says so, but because of my own personal
experiences.

-- 
Greg


From bruce at leapyear.org  Wed Oct  6 01:48:08 2010
From: bruce at leapyear.org (Bruce Leban)
Date: Tue, 5 Oct 2010 16:48:08 -0700
Subject: [Python-ideas] [Python-Dev] Inclusive Range
In-Reply-To: <4CABB5C5.6000904@canterbury.ac.nz>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
	<4CABB5C5.6000904@canterbury.ac.nz>
Message-ID: <AANLkTin03vZ2ookkRu7+_KFFj6KzYRso31ik8jQ-VWAc@mail.gmail.com>

With 1-based indexes, sometimes you have to add 1 and sometimes subtract 1
and sometimes neither. 0-based indexes avoid that problem.

Personally, I think changing any of this behavior has about the same
probability of success as adding bleen
<http://www.urbandictionary.com/define.php?term=bleen>.

--- Bruce
http://www.vroospeak.com
http://j.mp/gruyere-security



On Tue, Oct 5, 2010 at 4:33 PM, Greg Ewing <greg.ewing at canterbury.ac.nz>wrote:

> Carl M. Johnson wrote:
>
>> I'm
>> just curious about whether starting programmers have a strong
>> preference for one or the other convention or whether both are
>> confusing.
>>
>
> Starting programmers don't have enough experience to judge
> which will be less confusing in the long run, so their
> opinion shouldn't be given overriding weight when designing
> a language intended for real-life use.
>
> Speaking as an experienced programmer, I'm convinced that
> Python has made the right choice. Not because Dijkstra or
> any other authority says so, but because of my own personal
> experiences.
>
> --
> Greg
>

From jimjjewett at gmail.com  Wed Oct  6 15:18:08 2010
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 6 Oct 2010 09:18:08 -0400
Subject: [Python-ideas] Inclusive Range
In-Reply-To: <20101005131356.GA21646@idyll.org>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>
	<20101005131356.GA21646@idyll.org>
Message-ID: <AANLkTi=E7xRdqhxjeSuWw7s4kCErq8AE_ADJOp7fsZQE@mail.gmail.com>

On 10/5/10, C. Titus Brown <ctb at msu.edu> wrote:
> On Tue, Oct 05, 2010 at 02:07:41PM +0100, Michael Foord wrote:

>> It is *good* that range(5) produces 5 values (0 to 4)

If not for compatibility, the 5 values (1,2,3,4,5) would be even
better.  But even in a new language, changing the rest of the language
so that (1,2,3,4,5) was more useful might not be a win.

> Doesn't it make sense that ... for i in range(5):
> mimics the C/C++ behavior of    for (i = 0; i < 5; i++)

If not for assumed familiarity with C idioms, why shouldn't it instead match
    for (i=1; i<=5; i++)

-jJ


From ncoghlan at gmail.com  Wed Oct  6 15:58:48 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 6 Oct 2010 23:58:48 +1000
Subject: [Python-ideas] Inclusive Range
In-Reply-To: <AANLkTi=E7xRdqhxjeSuWw7s4kCErq8AE_ADJOp7fsZQE@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>
	<20101005131356.GA21646@idyll.org>
	<AANLkTi=E7xRdqhxjeSuWw7s4kCErq8AE_ADJOp7fsZQE@mail.gmail.com>
Message-ID: <AANLkTik5VMHSCzV2-bsT8TMoOz5ttigqasPxUDLXy4c3@mail.gmail.com>

On the more general topic of *teaching* 0-based indexing, the best
explanation I've seen is the one where 1-based indexing is explained
as referring directly to the items in the sequence, while 0-based
indexing numbers the implicit gaps between items and then returns the
item immediately after the identified gap. Slicing for 0-based
indexing can then be explained without needing to talk about half-open
ranges at all - you just grab everything between the two identified
gaps*.
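
For example (illustrative session; the gaps are numbered 0 through
len(s)):

    >>> s = "abcde"
    >>> s[1:3]           # everything between gap 1 and gap 3
    'bc'
    >>> s[:2] + s[2:]    # cutting at a gap loses nothing
    'abcde'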

I think the main point here is that these are not independent design
decisions - the behaviour of range() (or its equivalent), indexing,
slicing, enumeration and anything else related to sequences all comes
back to a single fundamental design choice of 1-based vs 0-based
indexing. Once you make that initial decision (regardless of the
merits either way), other decisions are going to flow from it as
consequences, and it isn't really something a language can ever
practically tinker with.

Cheers,
Nick.

*(unfortunately, it's a bit trickier to mesh that otherwise clear and
concise explanation cleanly with Python's definition of ranges and
slicing with negative step values, since those offset everything by
one, such that "list(reversed(range(1, 5, 1))) == list(range(4, 0,
-1))". If I was going to ask for a change to anything in Python's
indexing semantics, it would be for negative step values to create
ranges that were half-open at the beginning rather than the end, such
that reversing a slice just involved swapping the start value with the
stop value and negating the step value. As it is, you also have to
subtract one from both the start and stop value to get the original
range of values back. However, just like the idea of ranges starting
from 1 rather than 0, the idea of negative slices giving ranges
half-open at the start rather than the end is also doomed by
significant problems with backwards compatibility. For a new language,
you might be able to make the argument that the alternative behaviour
is a better design choice. For an existing one like Python, any
possible benefits are so nebulous as to not be worth the inevitable
hassle involved in changing the semantics)
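
For reference, the footnote's identity illustrated:

    >>> list(reversed(range(1, 5, 1)))
    [4, 3, 2, 1]
    >>> list(range(4, 0, -1))
    [4, 3, 2, 1]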

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From masklinn at masklinn.net  Wed Oct  6 16:36:35 2010
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 6 Oct 2010 16:36:35 +0200
Subject: [Python-ideas] Inclusive Range
In-Reply-To: <AANLkTik5VMHSCzV2-bsT8TMoOz5ttigqasPxUDLXy4c3@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>
	<20101005131356.GA21646@idyll.org>
	<AANLkTi=E7xRdqhxjeSuWw7s4kCErq8AE_ADJOp7fsZQE@mail.gmail.com>
	<AANLkTik5VMHSCzV2-bsT8TMoOz5ttigqasPxUDLXy4c3@mail.gmail.com>
Message-ID: <583BFEEC-09C8-4B8F-827C-43B4D6403F45@masklinn.net>

On 2010-10-06, at 15:58 , Nick Coghlan wrote:
> *(unfortunately, it's a bit trickier to mesh that otherwise clear and
> concise explanation cleanly with Python's definition of ranges and
> slicing with negative step values, since those offset everything by
> one, such that "list(reversed(range(1, 5, 1))) == list(range(4, 0,
> -1))".
I'm not sure about that at all: the index is still right before the item, which is why the last item is `-1` rather than `-0`. And everything flows from that again.
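
For instance:

    s = "abcde"
    s[-1]    # 'e' -- the index sits in the gap just before the last item
    s[-0]    # 'a' -- since -0 == 0, this is s[0], not the last item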



From rrr at ronadam.com  Wed Oct  6 21:21:03 2010
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 06 Oct 2010 14:21:03 -0500
Subject: [Python-ideas] improvements to slicing
In-Reply-To: <AANLkTik5VMHSCzV2-bsT8TMoOz5ttigqasPxUDLXy4c3@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>	<20101005131356.GA21646@idyll.org>	<AANLkTi=E7xRdqhxjeSuWw7s4kCErq8AE_ADJOp7fsZQE@mail.gmail.com>
	<AANLkTik5VMHSCzV2-bsT8TMoOz5ttigqasPxUDLXy4c3@mail.gmail.com>
Message-ID: <4CACCC1F.4010209@ronadam.com>


On 10/06/2010 08:58 AM, Nick Coghlan wrote:

> If I was going to ask for a change to anything in Python's
> indexing semantics, it would be for negative step values to create
> ranges that were half-open at the beginning rather than the end, such
> that reversing a slice just involved swapping the start value with the
> stop value and negating the step value.

Yes, negative slices are very tricky to get right.  They could use some 
attention I think.

> As it is, you also have to
> subtract one from both the start and stop value to get the original
> range of values back. However, just like the idea of ranges starting
> from 1 rather than 0, the idea of negative slices giving ranges
> half-open at the start rather than the end is also doomed by
> significant problems with backwards compatibility. For a new language,
> you might be able to make the argument that the alternative behaviour
> is a better design choice. For an existing one like Python, any
> possible benefits are so nebulous as to not be worth the inevitable
> hassle involved in changing the semantics)


We don't need to change the current range function/generator to add 
inclusive or closed ranges.  Just add a closed_range() function to the 
itertools or math module.

    [n for n in closed_range(-5, 5, 2)]  --> [-5, -3, -1, 1, 3, 5]
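
A rough sketch of what I mean (hypothetical, nothing like this exists
in the stdlib yet; integer arguments only, like range()):

    def closed_range(start, stop, step=1):
        # Shift the bound by one step-unit so the half-open builtin
        # range() includes the stop value when the step lands on it.
        if step > 0:
            return range(start, stop + 1, step)
        elif step < 0:
            return range(start, stop - 1, step)
        raise ValueError("closed_range() step must not be zero")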



I just noticed the __getslice__ method is no longer on sequences. (?)



My preference is for slicing to be based more on practical terms for 
manipulating sequences rather than be defined in a purely mathematical way.


1. Have the direction determined by the start and stop values rather 
than by the step value so that the following is true.

    "abcdefg"[start:stop:step] == "abcdefg"[start:stop][::step]


Reversing the slice can be done by simply swapping the start and stop.

Negating the step as well would give you ...

    "abcdefg"[start:stop:step] == "abcdefg"[stop:start:-step]


Negating the step would not always give you the reverse sequence for steps 
larger than 1, because the result may not contain the same values.

 >>> 'abcd'[::2]
'ac'
 >>> 'abcd'[::-2]
'db'

This is the current behavior and wouldn't change.


A positive step value would step from the left, and a negative step value 
would step from the right of the slice determined by start and stop.  This 
already works if you don't give stop and start values.

 >>> "abcdefg"[::2]
'aceg'
 >>> "abcdefg"[::-2]
'geca'


And these can be used in for loops or list comps.

 >>> [c for c in "abcdefg"[::2]]
['a', 'c', 'e', 'g']



If we could add a width value to slices we would be able to do this.

 >>> "abcdefg"[::2:2]
'abcdefg'

;-)


As unimpressive as that looked, when used in a for loop or list comp it 
would give us an easy and useful way to step through data.

    [cc for cc in "abcdefg"[::2:2]]  -->  ['ab', 'cd', 'ef', 'g']


You could also spell that as...

    list("abcdefg")[::2:2])  --> ['ab', 'cd', 'ef', 'g']



The problems start when you try to use actual index values to specify 
start and stop ranges.

You can't index the last element with an explicit stop value.

 >>> "abcdefg"[0:-1]
'abcdef'

 >>> "abcdefg"[0:-0]
''

But we can use "None" which is awkward and requires testing the stop value 
when the index is supplied by a variable.

 >>> 'abcdefg'[:None]
'abcdefg'

I'm not sure how to fix this one. We've been living with this for a long 
time so it's not like we need to fix it all at once.


Negative indexes can be confusing.

 >>> "abcdefg"[-5:5]
'cde'                         # OK
 >>> "abcdefg"[5:-5]
''                            # Expected 'edc' here, not ''.
 >>> "abcdefg"[5:-5:-1]
'fed'                         # Expected reverse of '' here,
                               # or 'cde',  not 'fed'.



With the suggested change we get...

 >>> "abcdefg"[-5:5]
'cde'                          # Stays the same.
 >>> "abcdefg"[5:-5]
'edc'                          # Swapping start and stop reverses it.
 >>> "abcdefg"[5:-5:-1]
'cde'                          # Negating the step, reverses it again.


I think these are easier to use than the current behavior.  It doesn't 
change slices using positive indexes and steps so maybe it's not so 
backward incompatible to sneak in.  ;-)

Ron


From rrr at ronadam.com  Thu Oct  7 01:05:01 2010
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 06 Oct 2010 18:05:01 -0500
Subject: [Python-ideas] A general purpose range iterator (Re:
 improvements to slicing)
In-Reply-To: <AANLkTimR6PNj2ZEG+DUMmJ-zFwDAEShwOKgHcscpdQrn@mail.gmail.com>
References: <AANLkTimR6PNj2ZEG+DUMmJ-zFwDAEShwOKgHcscpdQrn@mail.gmail.com>
Message-ID: <4CAD009D.6000805@ronadam.com>



On 10/06/2010 03:41 PM, Nick Coghlan wrote:
> On Thu, Oct 7, 2010 at 5:21 AM, Ron Adam<rrr at ronadam.com>  wrote:
>> I think these are easier to use than the current behavior.  It doesn't
>> change slices using positive indexes and steps so maybe it's not so backward
>> incompatible to sneak in.  ;-)
>
> I think that sound you just heard was thousands of SciPy users crying
> out in horror ;)

LOL, and my apologies to the SciPy users.

I did try to google to find routines where the stop and start indexes 
converge and result in an empty list, as Spir suggested, but my google-fu 
seems to be broken today.  Maybe someone can point me in the right google 
direction.


> Given a "do over", there a few things I would change about Python's
> range generation and extended slicing. Others would clearly change a
> few different things. Given the dual barriers of "rough consensus and
> running code", I don't think there are any *specific* changes that
> would make it through the gauntlet.

Yes, any changes would probably need to be done in a way that can be 
imported and live along side the current range and slice.



> The idea of a *generalised* range generator is an interesting one
> though. One that was simply:
>
> _d = object()
> def irange(start=_d, stop=_d, step=1, *, include_start=True,
> include_stop=False):
>      # Match signature of range while still allowing stop=val as the
> only keyword argument
>      if stop is _d:
>          start, stop = 0, start
>      elif start is _d:
>          start = 0
>      if include_start:
>          yield start
>      current = start
>      while 1:
>          current += step
>          if current >= stop:
>              break
>          yield current
>      if include_stop and current == stop:
>          yield stop
>
> Slower than builtin range() for the integer case, but works with
> arbitrary types (e.g. float, Decimal, datetime)

It wouldn't be that much slower if it returned another range object with the 
indexes adjusted.  But then it wouldn't be able to work with arbitrary types.


A wild idea that keeps nudging my neurons is that sequence iterators or 
index objects like slice objects could maybe be added before applying them 
to a final sequence.  Sort of an iter math.  It isn't as simple as just 
putting the iterators in a list and iterating the iterators in order, 
although that works for some things.

Ron


From denis.spir at gmail.com  Wed Oct  6 22:39:11 2010
From: denis.spir at gmail.com (spir)
Date: Wed, 6 Oct 2010 22:39:11 +0200
Subject: [Python-ideas] improvements to slicing
In-Reply-To: <4CACCC1F.4010209@ronadam.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>
	<20101005131356.GA21646@idyll.org>
	<AANLkTi=E7xRdqhxjeSuWw7s4kCErq8AE_ADJOp7fsZQE@mail.gmail.com>
	<AANLkTik5VMHSCzV2-bsT8TMoOz5ttigqasPxUDLXy4c3@mail.gmail.com>
	<4CACCC1F.4010209@ronadam.com>
Message-ID: <20101006223911.7fea02e1@o>

On Wed, 06 Oct 2010 14:21:03 -0500
Ron Adam <rrr at ronadam.com> wrote:

> 1. Have the direction determined by the start and stop values rather 
> than by the step value so that the following is true.
> 
>     "abcdefg"[start:stop:step] == "abcdefg"[start:stop][::step]

Please provide an example with current and proposed semantics.
If I understand correctly, this does not work in practice. When range bounds are variable (result from computation), the upper one can happen to be smaller than the lower one, and we just want the resulting sub-sequence to be empty. This is a normal and common use case, and this is good: (upper <= lower) ==> []
Else many routines would have to special-case (upper < lower).
Your proposal, again if I understand, would break these semantics, instead returning a sub-sequence in reverse order.
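
For example, with the current semantics:

    data = [10, 20, 30, 40]
    lo, hi = 3, 1        # bounds produced by some computation
    data[lo:hi]          # [] -- empty, no special-casing needed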


Denis
-- -- -- -- -- -- --
vit esse estrany ?

spir.wikidot.com



From raymond.hettinger at gmail.com  Wed Oct  6 22:35:39 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 6 Oct 2010 13:35:39 -0700
Subject: [Python-ideas] improvements to slicing
In-Reply-To: <4CACCC1F.4010209@ronadam.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>	<20101005131356.GA21646@idyll.org>	<AANLkTi=E7xRdqhxjeSuWw7s4kCErq8AE_ADJOp7fsZQE@mail.gmail.com>
	<AANLkTik5VMHSCzV2-bsT8TMoOz5ttigqasPxUDLXy4c3@mail.gmail.com>
	<4CACCC1F.4010209@ronadam.com>
Message-ID: <991280BF-31D6-4B2B-B591-1E7A1DAB467B@gmail.com>


On Oct 6, 2010, at 12:21 PM, Ron Adam wrote:
> We don't need to change the current range function/generator to add inclusive or closed ranges.  Just add a closed_range() function to the itertools or math module.
> 
>   [n for n in closed_range(-5, 5, 2)]  --> [-5, -3, -1, 1, 3, 5]

If I were a betting man, I would venture that you could post
a recipe for closed_range(), publicize it on various mailing
lists, mention it in talks, and find that it would almost never
get used.

There's nothing wrong with the idea, but the YAGNI factor
will be hard to overcome.  IMO, this would become cruft on 
the same day it gets added to the library.

OTOH for numerical applications, there is utility for a floating
point variant, something like linspace() in MATLAB.
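
Roughly like this, say (just a sketch, not MATLAB's or numpy's actual
semantics):

    def linspace(start, stop, num=50):
        # num evenly spaced samples, inclusive of both endpoints.
        if num == 1:
            return [start]
        step = (stop - start) / float(num - 1)
        return [start + i * step for i in range(num)]

    linspace(0.0, 1.0, 5)   # [0.0, 0.25, 0.5, 0.75, 1.0]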



Raymond

From denis.spir at gmail.com  Wed Oct  6 22:11:59 2010
From: denis.spir at gmail.com (spir)
Date: Wed, 6 Oct 2010 22:11:59 +0200
Subject: [Python-ideas] Inclusive Range
In-Reply-To: <AANLkTik5VMHSCzV2-bsT8TMoOz5ttigqasPxUDLXy4c3@mail.gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>
	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>
	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>
	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>
	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>
	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>
	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>
	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>
	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>
	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>
	<20101005131356.GA21646@idyll.org>
	<AANLkTi=E7xRdqhxjeSuWw7s4kCErq8AE_ADJOp7fsZQE@mail.gmail.com>
	<AANLkTik5VMHSCzV2-bsT8TMoOz5ttigqasPxUDLXy4c3@mail.gmail.com>
Message-ID: <20101006221159.73659a9b@o>

On Wed, 6 Oct 2010 23:58:48 +1000
Nick Coghlan <ncoghlan at gmail.com> wrote:

> On the more general topic of *teaching* 0-based indexing, the best explanation I've seen is the one where 1-based indexing is explained as referring directly to the items in the sequence, while 0-based indexing numbers the implicit gaps between items and then returns the item immediately after the identified gap. Slicing for 0-based indexing can then be explained without needing to talk about half-open ranges at all - you just grab everything between the two identified gaps*.

In my experience, the only explanation that makes sense for newcomers is that 1-based indexes are just ordinary ordinals like we use every day, while 0-based ones are _offsets_ measured from the start. It does not really help in practice (people make errors anyway), but at least they understand the logic so can reason when needed, namely to correct their errors.

> I think the main point here is that these are not independent design
> decisions - the behaviour of range() (or its equivalent), indexing,
> slicing, enumeration and anything else related to sequences all comes
> back to a single fundamental design choice of 1-based vs 0-based
> indexing.

I think there are languages with base 0 & closed range, or else base 1 & half-open range. Any convention works, practically. Also, the logic every supporter of the C convention cites, namely the famous text by EWD, reverses your argumentation: he shows the advantages of half-open intervals (according to his opinion), then that 0-based indexes fit better with this kind of interval (ditto).


Denis
-- -- -- -- -- -- --
vit esse estrany ?

spir.wikidot.com



From ncoghlan at gmail.com  Wed Oct  6 22:41:21 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 7 Oct 2010 06:41:21 +1000
Subject: [Python-ideas] A general purpose range iterator (Re: improvements
	to slicing)
Message-ID: <AANLkTimR6PNj2ZEG+DUMmJ-zFwDAEShwOKgHcscpdQrn@mail.gmail.com>

On Thu, Oct 7, 2010 at 5:21 AM, Ron Adam <rrr at ronadam.com> wrote:
> I think these are easier to use than the current behavior.  It doesn't
> change slices using positive indexes and steps so maybe it's not so backward
> incompatible to sneak in.  ;-)

I think that sound you just heard was thousands of SciPy users crying
out in horror ;)

Given a "do over", there a few things I would change about Python's
range generation and extended slicing. Others would clearly change a
few different things. Given the dual barriers of "rough consensus and
running code", I don't think there are any *specific* changes that
would make it through the gauntlet.

The idea of a *generalised* range generator is an interesting one
though. One that was simply:

_d = object()
def irange(start=_d, stop=_d, step=1, *, include_start=True,
include_stop=False):
    # Match signature of range while still allowing stop=val as the
only keyword argument
    if stop is _d:
        start, stop = 0, start
    elif start is _d:
        start = 0
    if include_start:
        yield start
    current = start
    while 1:
        current += step
        if current >= stop:
            break
        yield current
    if include_stop and current == stop:
        yield stop

Slower than builtin range() for the integer case, but works with
arbitrary types (e.g. float, Decimal, datetime)
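
For example (assuming Python 3, given the keyword-only arguments):

>>> list(irange(0, 5))
[0, 1, 2, 3, 4]
>>> list(irange(0, 5, include_stop=True))
[0, 1, 2, 3, 4, 5]
>>> list(irange(0.0, 1.0, 0.25, include_stop=True))
[0.0, 0.25, 0.5, 0.75, 1.0]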

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From andy at insectnation.org  Thu Oct  7 12:35:15 2010
From: andy at insectnation.org (Andy Buckley)
Date: Thu, 07 Oct 2010 12:35:15 +0200
Subject: [Python-ideas] improvements to slicing
In-Reply-To: <991280BF-31D6-4B2B-B591-1E7A1DAB467B@gmail.com>
References: <AANLkTim4q+YQ3wFjw9KND8W-_yCAsu9UVpET=t5U0b+E@mail.gmail.com>	<AANLkTimEtyzZudWTaQy7cXzLAE+_YuHB4NLHgfiKR+4j@mail.gmail.com>	<D910B5A9-EEFB-42D6-B615-6EC3978D7629@masklinn.net>	<AANLkTim6g2b7ztiowCgHb7h95GxMbK9sqerTVN7jQnQH@mail.gmail.com>	<AANLkTimKgYJYPMQ=BWBP0-4cRdhmLU+wYB1EahR+d2fU@mail.gmail.com>	<94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net>	<AANLkTin6K1d9OG7K6AZk+Ysk_FjVuK+b-+eqZFF-cQoQ@mail.gmail.com>	<076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net>	<AANLkTimxdDPQ=O6iORcCHhVrVbMFgnq65zcHJ9nwU_UU@mail.gmail.com>	<AANLkTimqFyrOb3RXhkm8HsBB-OY2LoSMkX6GFtuABNrZ@mail.gmail.com>	<20101005131356.GA21646@idyll.org>	<AANLkTi=E7xRdqhxjeSuWw7s4kCErq8AE_ADJOp7fsZQE@mail.gmail.com>	<AANLkTik5VMHSCzV2-bsT8TMoOz5ttigqasPxUDLXy4c3@mail.gmail.com>	<4CACCC1F.4010209@ronadam.com>
	<991280BF-31D6-4B2B-B591-1E7A1DAB467B@gmail.com>
Message-ID: <4CADA263.1020309@insectnation.org>

On 06/10/10 22:35, Raymond Hettinger wrote:
> 
> On Oct 6, 2010, at 12:21 PM, Ron Adam wrote:
>> We don't need to change the current range function/generator to add inclusive or closed ranges.  Just add a closed_range() function to the itertools or math module.
>>
>>   [n for n in closed_range(-5, 5, 2)]  --> [-5, -3, -1, 1, 3, 5]
> 
> If I were a betting man, I would venture that you could post
> a recipe for closed_range(), publicize it on various mailing
> lists, mention it in talks, and find that it would almost never
> get used.
> 
> There's nothing wrong with the idea, but the YAGNI factor
> will be hard to overcome.  IMO, this would become cruft on 
> the same day it gets added to the library.

There are plenty of places in my code where I would find such a thing
useful, though... usually where I'm working with pre-determined integer
codes (one very specific use-case: elementary particle ID codes, which
are integers constructed from quantum number values) and it's simply
more elegant and intuitive to specify a range whose requested upper
bound is a valid code rather than valid_code+1.

IMHO, an extra keyword on range/xrange would allow writing nicer code
where applicable, without crufting up the library with whole extra
functions. Depends on what you consider more crufty, I suppose, but I
agree that ~no-one is going to find and import a new range function.
numpy.linspace uses "endpoint" as the name for such a keyword:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html#numpy.linspace
but again no-one wants to depend on numpy *just* to get that functionality!

So how about
  range(start, realend, endpoint=True)
  xrange(start, realend, endpoint=True)
with endpoint=False as default? No backward compatibility or performance
issues to my (admittedly inexpert) eye.

Andy



From steve at pearwood.info  Mon Oct 11 01:17:54 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 11 Oct 2010 10:17:54 +1100
Subject: [Python-ideas] [Python-Dev] minmax() function returning
	(minimum, maximum) tuple of a sequence
In-Reply-To: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
Message-ID: <201010111017.56101.steve@pearwood.info>

On Mon, 11 Oct 2010 05:57:21 am Paul McGuire wrote:
> Just as an exercise, I wanted to try my hand at adding a function to
> the compiled Python C code.  An interesting optimization that I read
> about (where? don't recall) finds the minimum and maximum elements of
> a sequence in a single pass, with a 25% reduction in number of
> comparison operations:
> - the sequence elements are read in pairs 
> - each pair is compared to find smaller/greater
> - the smaller is compared to current min
> - the greater is compared to current max
>
> So each pair is applied to the running min/max values using 3
> comparisons, vs. 4 that would be required if both were compared to
> both min and max.
>
> This feels somewhat similar to how divmod returns both quotient and
> remainder of a single division operation.
>
> This would be potentially interesting for those cases where min and
> max are invoked on the same sequence one after the other, and
> especially so if the sequence elements were objects with expensive
> comparison operations.

Perhaps more importantly, it is ideal for the use-case where you have an 
iterator. You can't call min() and then max(), as min() consumes the 
iterator leaving nothing for max(). It may be undesirable to convert 
the iterator to a list first -- it may be that the number of items in 
the data stream is too large to fit into memory all at once, but even 
if it is small, it means you're now walking the stream three times when 
one would do.

To my mind, minmax() is as obvious and as useful a built-in as divmod(), 
but if there is resistance to making such a function a built-in, 
perhaps it could go into itertools. (I would prefer it to keep the same 
signature as min() and max(), namely that it will take either a single 
iterable argument or multiple arguments.)

I've experimented with minmax() myself. Not surprisingly, the 
performance of a pure Python version doesn't even come close to the 
built-ins.
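
For reference, a sketch of the pairwise trick in pure Python (it only
accepts a single iterable, and it's illustrative rather than my
benchmarked code):

    def minmax(iterable):
        it = iter(iterable)
        try:
            lo = hi = next(it)
        except StopIteration:
            raise ValueError("minmax() arg is an empty sequence")
        for a in it:
            b = next(it, a)     # odd tail: pair the item with itself
            if a > b:
                a, b = b, a     # one comparison orders the pair
            if a < lo:
                lo = a          # smaller goes against the running min
            if b > hi:
                hi = b          # greater goes against the running max
        return lo, hi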

I'm +1 on the idea.

Presumably follow-ups should go to python-ideas.



-- 
Steven D'Aprano


From zac256 at gmail.com  Mon Oct 11 02:55:51 2010
From: zac256 at gmail.com (Zac Burns)
Date: Sun, 10 Oct 2010 17:55:51 -0700
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <201010111017.56101.steve@pearwood.info>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
Message-ID: <AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>

This could be generalized and placed into itertools if we create a function
(say, apply for lack of a better name at the moment) that takes in an
iterable and creates new iterables that each yield from the original
(avoiding the need for a list), holding only one item in memory. Then you could
pass whatever functions you wanted to run over the iterables and get the
results back in a tuple.

Eg:

itertools.apply(iterable, min, max) ~= (min(iterable), max(iterable))

This class that creates 'associated iterables' from an original iterable
where each new iterable has to be iterated over at the same time might also
be useful in other contexts and could be added to itertools as well.


Unfortunately this solution seems incompatible with the implementations with
for loops in min and max (EG: How do you switch functions at the right
time?) So it might take some tweaking.

--
Zachary Burns
(407)590-4814
Aim - Zac256FL



On Sun, Oct 10, 2010 at 4:17 PM, Steven D'Aprano <steve at pearwood.info>wrote:

> On Mon, 11 Oct 2010 05:57:21 am Paul McGuire wrote:
> > Just as an exercise, I wanted to try my hand at adding a function to
> > the compiled Python C code.  An interesting optimization that I read
> > about (where? don't recall) finds the minimum and maximum elements of
> > a sequence in a single pass, with a 25% reduction in number of
> > comparison operations:
> > - the sequence elements are read in pairs
> > - each pair is compared to find smaller/greater
> > - the smaller is compared to current min
> > - the greater is compared to current max
> >
> > So each pair is applied to the running min/max values using 3
> > comparisons, vs. 4 that would be required if both were compared to
> > both min and max.
> >
> > This feels somewhat similar to how divmod returns both quotient and
> > remainder of a single division operation.
> >
> > This would be potentially interesting for those cases where min and
> > max are invoked on the same sequence one after the other, and
> > especially so if the sequence elements were objects with expensive
> > comparison operations.
>
> Perhaps more importantly, it is ideal for the use-case where you have an
> iterator. You can't call min() and then max(), as min() consumes the
> iterator leaving nothing for max(). It may be undesirable to convert
> the iterator to a list first -- it may be that the number of items in
> the data stream is too large to fit into memory all at once, but even
> if it is small, it means you're now walking the stream three times when
> one would do.
>
> To my mind, minmax() is as obvious and as useful a built-in as divmod(),
> but if there is resistance to making such a function a built-in,
> perhaps it could go into itertools. (I would prefer it to keep the same
> signature as min() and max(), namely that it will take either a single
> iterable argument or multiple arguments.)
>
> I've experimented with minmax() myself. Not surprisingly, the
> performance of a pure Python version doesn't even come close to the
> built-ins.
>
> I'm +1 on the idea.
>
> Presumably follow-ups should go to python-ideas.
>
>
>
> --
> Steven D'Aprano
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From masklinn at masklinn.net  Mon Oct 11 07:50:14 2010
From: masklinn at masklinn.net (Masklinn)
Date: Mon, 11 Oct 2010 07:50:14 +0200
Subject: [Python-ideas] [Python-Dev] minmax() function returning
	(minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
Message-ID: <C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>

On 2010-10-11, at 02:55 , Zac Burns wrote:
> 
> Unfortunately this solution seems incompatible with the implementations with
> for loops in min and max (EG: How do you switch functions at the right
> time?) So it might take some tweaking.
As far as I know, there is no way to force lockstep iteration of arbitrary functions in Python. Though an argument could be made for adding coroutine capabilities to builtins and library functions taking iterables, I don't think that's on the books.

As a result, this function would devolve into something along the lines of

    from itertools import tee

    def apply(iterable, *funcs):
        return map(lambda c: c[0](c[1]), zip(funcs, tee(iterable, len(funcs))))

which would run out of memory on very long or nigh-infinite iterables due to tee memoizing all the content of the iterator.

From taleinat at gmail.com  Mon Oct 11 22:18:41 2010
From: taleinat at gmail.com (Tal Einat)
Date: Mon, 11 Oct 2010 22:18:41 +0200
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
Message-ID: <AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>

Masklinn wrote:
> On 2010-10-11, at 02:55 , Zac Burns wrote:
>>
>> Unfortunately this solution seems incompatable with the implementations with
>> for loops in min and max (EG: How do you switch functions at the right
>> time?) So it might take some tweaking.
> As far as I know, there is no way to force lockstep iteration of arbitrary functions in Python. Though an argument could be made for adding coroutine capabilities to builtins and library functions taking iterables, I don't think that's on the books.
>
> As a result, this function would devolve into something along the lines of
>
>     def apply(iterable, *funcs):
>         return map(lambda c: c[0](c[1]), zip(funcs, tee(iterable, len(funcs))))
>
> which would run out of memory on very long or nigh-infinite iterables due to tee memoizing all the content of the iterator.

We recently needed exactly this -- to do several running calculations
in parallel on an iterable. We avoided using co-routines and just
created a RunningCalc class with a simple interface, and implemented
various running calculations as sub-classes, e.g. min, max, average,
variance, n-largest. This isn't very fast, but since generating the
iterated values is computationally heavy, this is fast enough for our
uses.
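
To give the flavour of the interface (a from-memory sketch, not our
exact code):

    class RunningCalc(object):
        def feed(self, value):
            raise NotImplementedError

    class RunningMin(RunningCalc):
        def __init__(self):
            self.value = None
        def feed(self, value):
            if self.value is None or value < self.value:
                self.value = value

    def feed_all(iterable, *calcs):
        # A single pass over the iterable drives all calculations.
        for value in iterable:
            for calc in calcs:
                calc.feed(value)
        return calcs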

Having a standard method to do this in Python, with implementations
for common calculations in the stdlib, would have been nice.

I wouldn't mind trying to work up a PEP for this, if there is support
for the idea.

- Tal Einat


From p.f.moore at gmail.com  Tue Oct 12 17:51:03 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 12 Oct 2010 16:51:03 +0100
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
Message-ID: <AANLkTim4G1Be7QdYrgs2EGnc0F2u7GWUzUQZtB5M=gWd@mail.gmail.com>

On 11 October 2010 21:18, Tal Einat <taleinat at gmail.com> wrote:
> We recently needed exactly this -- to do several running calculations
> in parallel on an iterable. We avoided using co-routines and just
> created a RunningCalc class with a simple interface, and implemented
> various running calculations as sub-classes, e.g. min, max, average,
> variance, n-largest. This isn't very fast, but since generating the
> iterated values is computationally heavy, this is fast enough for our
> uses.
>
> Having a standard method to do this in Python, with implementations
> for common calculations in the stdlib, would have been nice.
>
> I wouldn't mind trying to work up a PEP for this, if there is support
> for the idea.

The "consumer" interface as described in
http://effbot.org/zone/consumer.htm sounds about right for this:

class Rmin(object):
    def __init__(self):
        self.running_min = None
    def feed(self, val):
        if self.running_min is None:
            self.running_min = val
        else:
            self.running_min = min(self.running_min, val)
    def close(self):
        pass

rmin = Rmin()
for val in iter:
    rmin.feed(val)
print rmin.running_min

Paul.


From taleinat at gmail.com  Tue Oct 12 21:41:00 2010
From: taleinat at gmail.com (Tal Einat)
Date: Tue, 12 Oct 2010 21:41:00 +0200
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTim4G1Be7QdYrgs2EGnc0F2u7GWUzUQZtB5M=gWd@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTim4G1Be7QdYrgs2EGnc0F2u7GWUzUQZtB5M=gWd@mail.gmail.com>
Message-ID: <AANLkTinruPgWoqnY+Bx3Ppg=efSvkgpOhvUAd7cUwR7o@mail.gmail.com>

Paul Moore wrote:
> On 11 October 2010 21:18, Tal Einat <taleinat at gmail.com> wrote:
>> We recently needed exactly this -- to do several running calculations
>> in parallel on an iterable. We avoided using co-routines and just
>> created a RunningCalc class with a simple interface, and implemented
>> various running calculations as sub-classes, e.g. min, max, average,
>> variance, n-largest. This isn't very fast, but since generating the
>> iterated values is computationally heavy, this is fast enough for our
>> uses.
>>
>> Having a standard method to do this in Python, with implementations
>> for common calculations in the stdlib, would have been nice.
>>
>> I wouldn't mind trying to work up a PEP for this, if there is support
>> for the idea.
>
> The "consumer" interface as described in
> http://effbot.org/zone/consumer.htm sounds about right for this:
>
> class Rmin(object):
>     def __init__(self):
>         self.running_min = None
>     def feed(self, val):
>         if self.running_min is None:
>             self.running_min = val
>         else:
>             self.running_min = min(self.running_min, val)
>     def close(self):
>         pass
>
> rmin = Rmin()
> for val in iter:
>     rmin.feed(val)
> print rmin.running_min

That's what I was thinking about too.

How about something along these lines?
http://pastebin.com/DReBL56T

I just worked that up now and would like some comments and
suggestions. It could either turn into a PEP or an external library,
depending on popularity here.

- Tal Einat


From p.f.moore at gmail.com  Tue Oct 12 22:33:01 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 12 Oct 2010 21:33:01 +0100
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTinruPgWoqnY+Bx3Ppg=efSvkgpOhvUAd7cUwR7o@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTim4G1Be7QdYrgs2EGnc0F2u7GWUzUQZtB5M=gWd@mail.gmail.com>
	<AANLkTinruPgWoqnY+Bx3Ppg=efSvkgpOhvUAd7cUwR7o@mail.gmail.com>
Message-ID: <AANLkTi=+TErZmKmNZ3GKp5+LWcKu=BPp44RF-78=UFch@mail.gmail.com>

On 12 October 2010 20:41, Tal Einat <taleinat at gmail.com> wrote:
> That's what I was thinking about too.
>
> How about something along these lines?
> http://pastebin.com/DReBL56T
>
> I just worked that up now and would like some comments and
> suggestions. It could either turn into a PEP or an external library,
> depending on popularity here.

Looks reasonable. I'd suspect it would be more appropriate as an
external library rather than going directly into the stdlib. Also,
when I've needed something like this in the past (for simulation code,
involving iterators with millions of entries) speed has been pretty
critical, so something pure-python like this might not have been
enough. Maybe it's something that would be appropriate for numpy?

But I like the idea in general. I don't see the need for the
RunningCalc base class (duck typing rules!) and I'd be tempted to add
dummy close methods, to conform to the published consumer protocol
(even though it's not a formal Python standard). I wouldn't
necessarily use the given apply function, either, but that's a matter
of taste (I would suggest you change the name, though, to avoid
reusing the old apply builtin's name, which was something entirely
different).

Paul


From dan at programmer-art.org  Wed Oct 13 22:04:28 2010
From: dan at programmer-art.org (Daniel G. Taylor)
Date: Wed, 13 Oct 2010 16:04:28 -0400
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
Message-ID: <4CB610CC.1070009@programmer-art.org>

  Hey,

I've recently been doing a lot of work with dates related to payment and 
statistics processing at work and have run into several annoyances with 
the built-in datetime, date, time, timedelta, etc classes, even when 
adding in relativedelta. They are awkward, non-intuitive and not at all 
Pythonic to me. Over the past year I've written up a library for making 
my life a bit easier and  figured I would post some information here to 
see what others think, and to gauge whether or not such a library might 
be PEP-worthy.

My original post about it was here:

http://programmer-art.org/articles/programming/pythonic-date

The github project page is here:

http://github.com/danielgtaylor/paodate

This is code that is and has been running in production environments for 
months but may still contain bugs. I have tried to include unit tests 
and ample documentation. I'd love to get some feedback and people's 
thoughts.

I would also love to hear what others find is difficult or missing from 
the built-in date and time handling.

Take care,

-- 
Daniel G. Taylor
http://programmer-art.org/



From phd at phd.pp.ru  Wed Oct 13 22:30:08 2010
From: phd at phd.pp.ru (Oleg Broytman)
Date: Thu, 14 Oct 2010 00:30:08 +0400
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
In-Reply-To: <4CB610CC.1070009@programmer-art.org>
References: <4CB610CC.1070009@programmer-art.org>
Message-ID: <20101013203008.GA27423@phd.pp.ru>

On Wed, Oct 13, 2010 at 04:04:28PM -0400, Daniel G. Taylor wrote:
> http://programmer-art.org/articles/programming/pythonic-date

   Have you ever tried mxDateTime? Do you consider it unpythonic?

Oleg.
-- 
     Oleg Broytman            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.


From mal at egenix.com  Wed Oct 13 22:42:27 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 13 Oct 2010 22:42:27 +0200
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
In-Reply-To: <4CB610CC.1070009@programmer-art.org>
References: <4CB610CC.1070009@programmer-art.org>
Message-ID: <4CB619B3.4050203@egenix.com>

Daniel G. Taylor wrote:
>  Hey,
> 
> I've recently been doing a lot of work with dates related to payment and
> statistics processing at work and have run into several annoyances with
> the built-in datetime, date, time, timedelta, etc classes, even when
> adding in relativedelta. They are awkward, non-intuitive and not at all
> Pythonic to me. Over the past year I've written up a library for making
> my life a bit easier and  figured I would post some information here to
> see what others think, and to gauge whether or not such a library might
> be PEP-worthy.
> 
> My original post about it was here:
> 
> http://programmer-art.org/articles/programming/pythonic-date
> 
> The github project page is here:
> 
> http://github.com/danielgtaylor/paodate
> 
> This is code that is and has been running in production environments for
> months but may still contain bugs. I have tried to include unit tests
> and ample documentation. I'd love to get some feedback and people's
> thoughts.
> 
> I would also love to hear what others find is difficult or missing from
> the built-in date and time handling.

mxDateTime implements most of these ideas:

http://www.egenix.com/products/python/mxBase/mxDateTime/

It's been in production use for more than 13 years now and
has proven to be very versatile in practice; YMMV, of course.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 13 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From dan at programmer-art.org  Wed Oct 13 22:59:38 2010
From: dan at programmer-art.org (Daniel G. Taylor)
Date: Wed, 13 Oct 2010 16:59:38 -0400
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
In-Reply-To: <4CB619B3.4050203@egenix.com>
References: <4CB610CC.1070009@programmer-art.org> <4CB619B3.4050203@egenix.com>
Message-ID: <4CB61DBA.9060801@programmer-art.org>

On 10/13/2010 04:42 PM, M.-A. Lemburg wrote:
> mxDateTime implements most of these ideas:
>
> http://www.egenix.com/products/python/mxBase/mxDateTime/
>
> It's been in production use for more than 13 years now and
> has proven to be very versatile in practice; YMMV, of course.

Hah, that is a very nice looking library. I wish I had looked into it 
before writing my own. Looks like it still doesn't allow write access to 
many properties in date or delta objects, but looks to have a lot of 
really useful stuff in it. I'll be taking a closer look shortly.

Any idea why this hasn't made it into Python's standard library while 
being around for 13 years? Seems like it would be extremely useful in 
the standard distribution.

Take care,
-- 
Daniel G. Taylor
http://programmer-art.org/


From alexander.belopolsky at gmail.com  Wed Oct 13 23:17:36 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 13 Oct 2010 17:17:36 -0400
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
In-Reply-To: <4CB610CC.1070009@programmer-art.org>
References: <4CB610CC.1070009@programmer-art.org>
Message-ID: <AANLkTinQjxXUpjA=VUbb3dW=_4rkssLRRU7h2QD50nnd@mail.gmail.com>

On Wed, Oct 13, 2010 at 4:04 PM, Daniel G. Taylor
<dan at programmer-art.org> wrote:
> ... and have run into several annoyances with the
> built-in datetime, date, time, timedelta, etc classes, even when adding in
> relativedelta. They are awkward, non-intuitive and not at all Pythonic to
> me.

There seems to be no shortage of blogosphere rants about how awkward
the python datetime module is, but once patches are posted on the tracker
to improve it, nobody seems to be interested in reviewing them.  It has
been suggested that the C implementation presented a high barrier to entry
for people to get involved in datetime module development.  This was
one of the reasons I pushed for including a pure python equivalent in
3.2.   Unfortunately, getting datetime.py into the SVN tree was not enough
to spark new interest in improving the module.  Maybe this will change
with datetime.py making it into a released version.

..
> My original post about it was here:
>
> http://programmer-art.org/articles/programming/pythonic-date
>

This post is severely lacking in detail, so I cannot tell how your
library solves your announced problems, but most of them seem to be
easy with datetime:

* Make it easy to make a Date from anything - a timestamp, date,
datetime, tuple, etc.

>>> from datetime import *
>>> datetime.utcfromtimestamp(0)
datetime.datetime(1970, 1, 1, 0, 0)
>>> datetime.utcfromtimestamp(0).date()
datetime.date(1970, 1, 1)

* Make it easy to turn a Date into anything

datetime.timetuple() will convert datetime to a tuple.  There is an
open ticket to simplify datetime to timestamp conversion

http://bugs.python.org/issue2736

but it is already easy enough:

>>> (datetime.now() - datetime(1970,1,1)).total_seconds()
1286989863.82536

* Make it easy and pythonic to add/subtract one or more days, weeks,
months, or years

monthdelta addition was discussed at http://bugs.python.org/issue5434,
but did not get enough interest.  The rest seems to be easy enough
with timedelta.

* Make it easy to get a tuple of the start and end of the month

Why would you want this?  Start of the month is easy: just date(year,
month, 1).  End of the month is often unnecessary because it is more
pythonic to work with semi-open ranges and use first of the next month
instead.
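
For example, a hypothetical helper along those lines:

    from datetime import date

    def month_range(year, month):
        # Semi-open [first of this month, first of next month).
        start = date(year, month, 1)
        if month == 12:
            end = date(year + 1, 1, 1)
        else:
            end = date(year, month + 1, 1)
        return start, end

    month_range(2010, 10)   # (date(2010, 10, 1), date(2010, 11, 1))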


From dan at programmer-art.org  Wed Oct 13 23:45:33 2010
From: dan at programmer-art.org (Daniel G. Taylor)
Date: Wed, 13 Oct 2010 17:45:33 -0400
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
In-Reply-To: <AANLkTinQjxXUpjA=VUbb3dW=_4rkssLRRU7h2QD50nnd@mail.gmail.com>
References: <4CB610CC.1070009@programmer-art.org>
	<AANLkTinQjxXUpjA=VUbb3dW=_4rkssLRRU7h2QD50nnd@mail.gmail.com>
Message-ID: <4CB6287D.8070101@programmer-art.org>

On 10/13/2010 05:17 PM, Alexander Belopolsky wrote:
> On Wed, Oct 13, 2010 at 4:04 PM, Daniel G. Taylor
> <dan at programmer-art.org>  wrote:
>> ... and have run into several annoyances with the
>> built-in datetime, date, time, timedelta, etc classes, even when adding in
>> relativedelta. They are awkward, non-intuitive and not at all Pythonic to
>> me.
>
> There seems to be no shortage of blogosphere rants about how awkward
> the python datetime module is, but once patches are posted on the tracker
> to improve it, nobody seems to be interested in reviewing them.  It has
> been suggested that the C implementation presented a high barrier to entry
> for people to get involved in datetime module development.  This was
> one of the reasons I pushed for including a pure python equivalent in
> 3.2.   Unfortunately, getting datetime.py into the SVN tree was not enough
> to spark new interest in improving the module.  Maybe this will change
> with datetime.py making it into a released version.

This at least sounds like some progress is being made, so that makes me 
happy. I'd be glad to work on stuff if I knew it has the potential to 
make a difference and be accepted upstream and if it doesn't require me 
rewriting every little thing in the module. I'm not really sure where to 
start as all I really want is a nice wrapper to make working with dates 
seem intuitive and friendly.

> ..
>> My original post about it was here:
>>
>> http://programmer-art.org/articles/programming/pythonic-date
>>
>
> This post is severely lacking in detail, so I cannot tell how your
> library solves your announced problems, but most of them seem to be
> easy with datetime:

Yeah sorry it was mostly just a frustrated rant and then the start of my 
wrapper implementation.

> * Make it easy to make a Date from anything - a timestamp, date,
> datetime, tuple, etc.
>
>>>> from datetime import *
>>>> datetime.utcfromtimestamp(0)
> datetime.datetime(1970, 1, 1, 0, 0)
>>>> datetime.utcfromtimestamp(0).date()
> datetime.date(1970, 1, 1)

Why does it not have this in the constructor? Where else in the standard 
lib does anything behave like this? My solution was to just dump 
whatever you want into the constructor and you get a Date object which 
can be converted to anything else via simple properties.

> * Make it easy to turn a Date into anything
>
> datetime.timetuple() will convert datetime to a tuple.  There is an
> open ticket to simplify datetime to timestamp conversion
>
> http://bugs.python.org/issue2736

I'll be happy when this is fixed :-)

> but it is already easy enough:
>
>>>> (datetime.now() - datetime(1970,1,1)).total_seconds()
> 1286989863.82536

This is new in Python 2.7, it seems; before, you had to calculate it by 
hand, which was annoying to me. Now this seems okay.

> * Make it easy and pythonic to add/subtract one or more days, weeks,
> months, or years
>
> monthdelta addition was discussed at http://bugs.python.org/issue5434,
> but did not get enough interest.  The rest seems to be easy enough
> with timedelta.

And that means yet another module I have to import with various 
functions I have to use to manipulate an object rather than methods of 
the object itself. This doesn't seem Pythonic to me...

> * Make it easy to get a tuple of the start and end of the month
>
> Why would you want this?  Start of the month is easy: just date(year,
> month, 1).  End of the month is often unnecessary because it is more
> pythonic to work with semi-open ranges and use first of the next month
> instead.

It's just for convenience really. For an example, I used it for querying 
a database for invoices in certain date ranges and for managing e.g. 
monthly recurring charges. It's just way more convenient and makes my 
code very easy to read where it counts - within the complex logic 
controlling when we charge credit cards. The less complex code there the 
better, because typos and bugs cost real money.

Even if the tuples returned contained e.g. the first day of this and 
next month instead of the last day of the month it's still useful to 
have these properties that return the tuples (at least to me), as it 
saves some manual work each time.

Take care,
-- 
Daniel G. Taylor
http://programmer-art.org/


From taleinat at gmail.com  Wed Oct 13 23:54:31 2010
From: taleinat at gmail.com (Tal Einat)
Date: Wed, 13 Oct 2010 23:54:31 +0200
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
Message-ID: <AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>

On Mon, Oct 11, 2010 at 10:18 PM, Tal Einat wrote:
> Masklinn wrote:
>> On 2010-10-11, at 02:55 , Zac Burns wrote:
>>>
>>> Unfortunately this solution seems incompatible with the implementations with
>>> for loops in min and max (EG: How do you switch functions at the right
>>> time?) So it might take some tweaking.
>> As far as I know, there is no way to force lockstep iteration of arbitrary functions in Python. Though an argument could be made for adding coroutine capabilities to builtins and library functions taking iterables, I don't think that's on the books.
>>
>> As a result, this function would devolve into something along the lines of
>>
>>     def apply(iterable, *funcs):
>>         return map(lambda c: c[0](c[1]), zip(funcs, tee(iterable, len(funcs))))
>>
>> which would run out of memory on very long or nigh-infinite iterables due to tee memoizing all the content of the iterator.
>
> We recently needed exactly this -- to do several running calculations
> in parallel on an iterable. We avoided using co-routines and just
> created a RunningCalc class with a simple interface, and implemented
> various running calculations as sub-classes, e.g. min, max, average,
> variance, n-largest. This isn't very fast, but since generating the
> iterated values is computationally heavy, this is fast enough for our
> uses.
>
> Having a standard method to do this in Python, with implementations
> for common calculations in the stdlib, would have been nice.
>
> I wouldn't mind trying to work up a PEP for this, if there is support
> for the idea.

After some thought, I've found a way to make running several "running
calculations" in parallel fast. Speed should be comparable to having
used the non-running variants.

The method is to give each running calculation "blocks" of values
instead of just one at a time. The apply_in_parallel(iterable,
block_size=1000, *running_calcs) function would get blocks of values
from the iterable and pass them to each running calculation
separately. So RunningMax would look something like this:

class RunningMax(RunningCalc):
    def __init__(self):
        self.max_value = None

    def feed(self, value):
        if self.max_value is None or value > self.max_value:
            self.max_value = value

    def feedMultiple(self, values):
        self.feed(max(values))

feedMultiple() would have a naive default implementation in the base class.
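
Roughly, a sketch of what I have in mind (none of this exists yet, and
I've made block_size keyword-only here so the calculations can be
passed positionally):

    from itertools import islice

    class RunningCalc(object):
        def feedMultiple(self, values):
            # Naive default; subclasses override it with a block-wise
            # fast path like RunningMax.feedMultiple above.
            for value in values:
                self.feed(value)

    def apply_in_parallel(iterable, *calcs, block_size=1000):
        it = iter(iterable)
        while True:
            block = list(islice(it, block_size))
            if not block:
                break
            for calc in calcs:
                calc.feedMultiple(block)
        return calcs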

Now this is non-trivial and can certainly be useful. Thoughts? Comments?

- Tal Einat


From dag.odenhall at gmail.com  Thu Oct 14 00:16:32 2010
From: dag.odenhall at gmail.com (Dag Odenhall)
Date: Thu, 14 Oct 2010 00:16:32 +0200
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
In-Reply-To: <4CB610CC.1070009@programmer-art.org>
References: <4CB610CC.1070009@programmer-art.org>
Message-ID: <1287008192.4178.9.camel@gumri>

On Wed, 2010-10-13 at 16:04 -0400, Daniel G. Taylor wrote:
> Hey,
> 
> I've recently been doing a lot of work with dates related to payment and 
> statistics processing at work and have run into several annoyances with 
> the built-in datetime, date, time, timedelta, etc classes, even when 
> adding in relativedelta. They are awkward, non-intuitive and not at all 
> Pythonic to me. Over the past year I've written up a library for making 
> my life a bit easier and  figured I would post some information here to 
> see what others think, and to gauge whether or not such a library might 
> be PEP-worthy.
> 
> My original post about it was here:
> 
> http://programmer-art.org/articles/programming/pythonic-date
> 
> The github project page is here:
> 
> http://github.com/danielgtaylor/paodate
> 
> This is code that is and has been running in production environments for 
> months but may still contain bugs. I have tried to include unit tests 
> and ample documentation. I'd love to get some feedback and people's 
> thoughts.
> 
> I would also love to hear what others find is difficult or missing from 
> the built-in date and time handling.
> 
> Take care,
> 

Not convinced your library is very Pythonic. Why a tuple attribute
instead of having date objects be iterable so you can do tuple(Date())?

How do the fancy formats deal with locales?

Is there support for ISO 8601? Should probably be the __str__.

+1 on the general idea, though.



From alexander.belopolsky at gmail.com  Thu Oct 14 00:52:52 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 13 Oct 2010 18:52:52 -0400
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
In-Reply-To: <4CB6287D.8070101@programmer-art.org>
References: <4CB610CC.1070009@programmer-art.org>
	<AANLkTinQjxXUpjA=VUbb3dW=_4rkssLRRU7h2QD50nnd@mail.gmail.com>
	<4CB6287D.8070101@programmer-art.org>
Message-ID: <AANLkTi=w+Fe7VRVLYQ=exQuWRPzQwtWCKZnVuu1XioYh@mail.gmail.com>

On Wed, Oct 13, 2010 at 5:45 PM, Daniel G. Taylor
<dan at programmer-art.org> wrote:
..
>> * Make it easy to make a Date from anything - a timestamp, date,
>> datetime, tuple, etc.
>>
>>>>> from datetime import *
>>>>> datetime.utcfromtimestamp(0)
>>
>> datetime.datetime(1970, 1, 1, 0, 0)
>>>>>
>>>>> datetime.utcfromtimestamp(0).date()
>>
>> datetime.date(1970, 1, 1)
>
> Why does it not have this in the constructor?

Because "explicit is better than implicit."

> Where else in the standard lib does anything behave like this?

float.fromhex is one example.

This said, if I was starting from scratch, I would make date/datetime
constructors take a single positional argument that could be a string
(interpreted as ISO timestamp), tuple (broken down components), or
another date/datetime object.  This would make date/datetime
constructors more similar to those of numeric types.  I would not add
datetime(int) or datetime(float), however, because numeric timestamps
are too ambiguous and not necessary for purely calendaric
calculations.


From ncoghlan at gmail.com  Thu Oct 14 01:14:55 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 14 Oct 2010 09:14:55 +1000
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
Message-ID: <AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>

On Thu, Oct 14, 2010 at 7:54 AM, Tal Einat <taleinat at gmail.com> wrote:
> class RunningMax(RunningCalc):
>     def __init__(self):
>         self.max_value = None
>
>     def feed(self, value):
>         if self.max_value is None or value > self.max_value:
>             self.max_value = value
>
>     def feedMultiple(self, values):
>         self.feed(max(values))
>
> feedMultiple() would have a naive default implementation in the base class.
>
> Now this is non-trivial and can certainly be useful. Thoughts? Comments?

Why use feed() rather than the existing generator send() API?

def runningmax(default_max=None):
    max_value = default_max
    while 1:
        value = max(yield max_value)
        if max_value is None or value > max_value:
            max_value = value

That said, I think this kind of thing requires too many additional
assumptions about how things are driven to make a particularly good
candidate for standard library inclusion without use in a PyPI library
first.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From taleinat at gmail.com  Thu Oct 14 02:13:05 2010
From: taleinat at gmail.com (Tal Einat)
Date: Thu, 14 Oct 2010 02:13:05 +0200
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>
Message-ID: <AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>

On Thu, Oct 14, 2010 at 1:14 AM, Nick Coghlan wrote:
> Why use feed() rather than the existing generator send() API?
>
> def runningmax(default_max=None):
>     max_value = default_max
>     while 1:
>         value = max(yield max_value)
>         if max_value is None or value > max_value:
>             max_value = value

I tried using generators for this and it came out very clumsy. For one
thing, using generators for this requires first calling next() once to
run the generator up to the first yield, which makes the user-facing
API very confusing. Generators also have to yield a value at every
iteration, which is unnecessary here. Finally, the feedMultiple
optimization is impossible with a generator-based implementation.
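
(The usual workaround for the priming issue is a "consumer" decorator
along the lines of the one in PEP 342:

def consumer(func):
    # Advance a new generator to its first yield so that send() can be
    # used immediately.
    def wrapper(*args, **kwds):
        gen = func(*args, **kwds)
        gen.next()  # next(gen) in Python 3
        return gen
    return wrapper

but that only papers over the first problem.)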

> That said, I think this kind of thing requires too many additional
> assumptions about how things are driven to make a particularly good
> candidate for standard library inclusion without use in a PyPI library
> first.

I'm not sure. "Rolling your own" for this isn't too difficult, so many
developers will prefer to do so rather than add another dependency
from PyPI. On the other hand, Python's standard library includes
various simple utilities that make relatively simple things easier,
standardized and well tested. Additionally, I think this fits in very
nicely with Python's embracing of iterators, and complements the
itertools library well.

While I'm at it, I'd like to mention that I am aiming at a single very
simple common usage pattern:

from RunningCalc import (apply_in_parallel, RunningCount,
                         RunningNLargest, RunningNSmallest)

count, largest10, smallest10 = apply_in_parallel(
    data, RunningCount(), RunningNLargest(10), RunningNSmallest(10))

Implementing new running calculation classes would also be very simple.
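
To make the intended API concrete, here is a minimal sketch (the .result
attribute and the keyword handling are placeholders, not a final design):

class RunningCalc(object):
    def feed(self, value):
        raise NotImplementedError

    def feedMultiple(self, values):
        # Naive default; subclasses may override it with something
        # faster, e.g. RunningMax can use self.feed(max(values)).
        for value in values:
            self.feed(value)

def apply_in_parallel(iterable, *running_calcs, **kwds):
    block_size = kwds.get('block_size', 1000)
    block = []
    for value in iterable:
        block.append(value)
        if len(block) == block_size:
            # Hand the block to every calculation, so the data is
            # generated only once.
            for calc in running_calcs:
                calc.feedMultiple(block)
            del block[:]
    if block:  # flush the final partial block
        for calc in running_calcs:
            calc.feedMultiple(block)
    return tuple(calc.result for calc in running_calcs)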

- Tal Einat


From birbag at gmail.com  Thu Oct 14 10:02:24 2010
From: birbag at gmail.com (Marco Mariani)
Date: Thu, 14 Oct 2010 10:02:24 +0200
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
In-Reply-To: <AANLkTinQjxXUpjA=VUbb3dW=_4rkssLRRU7h2QD50nnd@mail.gmail.com>
References: <4CB610CC.1070009@programmer-art.org>
	<AANLkTinQjxXUpjA=VUbb3dW=_4rkssLRRU7h2QD50nnd@mail.gmail.com>
Message-ID: <AANLkTi==PYSTAcW1yJYBSwTwt0-p5u4fYRBu6cqwtvGu@mail.gmail.com>

On 13 October 2010 23:17, Alexander Belopolsky <
alexander.belopolsky at gmail.com> wrote:


* Make it easy to get a tuple of the start and end of the month
>
> Why would you want this?  Start of the month is easy: just date(year,
> month, 1).  End of the month is often unnecessary because it is more
> pythonic to work with semi-open ranges and use first of the next month
> instead.
>

Except next month may well be in next year.. blah

And I don't care about pythonic ranges if I have to push the values through
a BETWEEN query in SQL.

import calendar
import datetime

end = datetime.date(year, month, calendar.monthrange(year, month)[1])
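
For example (monthrange() returns a (first_weekday, days_in_month) pair):

>>> import calendar, datetime
>>> year, month = 2010, 10
>>> datetime.date(year, month, calendar.monthrange(year, month)[1])
datetime.date(2010, 10, 31)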

From steve at pearwood.info  Thu Oct 14 13:23:31 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 14 Oct 2010 22:23:31 +1100
Subject: [Python-ideas] minmax() function returning (minimum,
	maximum) tuple of a sequence
In-Reply-To: <AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
Message-ID: <201010142223.32257.steve@pearwood.info>

On Thu, 14 Oct 2010 08:54:31 am you wrote:

> After some thought, I've found a way to make running several "running
> calculations" in parallel fast. Speed should be comparable to having
> used the non-running variants.

Speed "should be" comparable? Are you guessing or have you actually 
timed it?

And surely the point of all this extra coding is to make something run 
*faster*, not "comparable to", the sequential algorithm?


> The method is to give each running calculation "blocks" of values
> instead of just one at a time. The apply_in_parallel(iterable,
> block_size=1000, *running_calcs) function would get blocks of values
> from the iterable and pass them to each running calculation
> separately. So RunningMax would look something like this:
>
> class RunningMax(RunningCalc):
>     def __init__(self):
>         self.max_value = None
>
>     def feed(self, value):
>         if self.max_value is None or value > self.max_value:
>             self.max_value = value
>
>     def feedMultiple(self, values):
>         self.feed(max(values))

As I understand it, you have a large list of data, and you want to 
calculate a number of statistics on it. The naive algorithm does each 
calculation sequentially:

a = min(data)
b = max(data)
c = f(data)  # some other statistics
d = g(data)
...
x = z(data)

If the calculations take t1, t2, t3, ..., tx time, then the sequential 
calculation takes sum(t1, t2, ..., tx) plus a bit of overhead. If you 
do it in parallel, this should reduce the time to max(t1, t2, ..., tx) 
plus a bit of overhead, potentially a big saving.

But unless I've missed something significant, all you are actually doing 
is slicing data up into small pieces, then calling each function min, 
max, f, g, ..., z on each piece sequentially:

block = data[:size]
a = min(block)
b = max(block)
c = f(block)
...
block = data[size:2*size]
a = min(a, min(block))
b = max(b, max(block))
c = f(c, f(block))
...
block = data[2*size:3*size]
a = min(a, min(block))
b = max(b, max(block))
c = f(c, f(block))
...

Each function still runs sequentially, but you've increased the amount 
of overhead a lot: you're slicing and dicing the data, plus calling each 
function multiple times.

And since each "running calculation" class needs to be hand-written to 
suit the specifics of the calculation, that's a lot of extra work just 
to get something which I expect will run slower than the naive 
sequential algorithm.


I'm also distracted by the names, RunningMax and RunningCalc. RunningFoo 
doesn't mean "do Foo in parallel", it means to return intermediate 
calculations. For example, if I ran a function called RunningMax on 
this list:

[1, 2, 1, 5, 7, 3, 4, 6, 8, 6]

I would expect it to yield or return:

[1, 2, 5, 7, 8]



-- 
Steven D'Aprano


From taleinat at gmail.com  Thu Oct 14 14:05:25 2010
From: taleinat at gmail.com (Tal Einat)
Date: Thu, 14 Oct 2010 14:05:25 +0200
Subject: [Python-ideas] minmax() function returning (minimum,
 maximum) tuple of a sequence
In-Reply-To: <201010142223.32257.steve@pearwood.info>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
	<201010142223.32257.steve@pearwood.info>
Message-ID: <AANLkTi=VmM5ZJswJ1fEtN+xNbgvka==gL9CwNVt_0KcE@mail.gmail.com>

On Thu, Oct 14, 2010 at 1:23 PM, Steven D'Aprano wrote:
> On Thu, 14 Oct 2010 08:54:31 am you wrote:
>
>> After some thought, I've found a way to make running several "running
>> calculations" in parallel fast. Speed should be comparable to having
>> used the non-running variants.
>
> Speed "should be" comparable? Are you guessing or have you actually
> timed it?
>
> And surely the point of all this extra coding is to make something run
> *faster*, not "comparable to", the sequential algorithm?

The use-case I'm targeting is when you can't hold all of the data in
memory, and it is relatively "expensive" to generate it, e.g. a large
and complex database query. In this case just running the sequential
functions one at a time requires generating the data several times,
once per function. My goal is to facilitate running several
computations on a single iterator without keeping all of the data in
memory.

In almost all cases this will be slower than having run each of the
sequential functions one at a time, if it were possible to keep all of
the data in memory. The grouping optimization aims to reduce the
overhead.

- Tal Einat


From masklinn at masklinn.net  Thu Oct 14 14:06:09 2010
From: masklinn at masklinn.net (Masklinn)
Date: Thu, 14 Oct 2010 14:06:09 +0200
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
In-Reply-To: <AANLkTi==PYSTAcW1yJYBSwTwt0-p5u4fYRBu6cqwtvGu@mail.gmail.com>
References: <4CB610CC.1070009@programmer-art.org>
	<AANLkTinQjxXUpjA=VUbb3dW=_4rkssLRRU7h2QD50nnd@mail.gmail.com>
	<AANLkTi==PYSTAcW1yJYBSwTwt0-p5u4fYRBu6cqwtvGu@mail.gmail.com>
Message-ID: <C502DCBD-79E2-4CF7-9796-B2D122C216E6@masklinn.net>

On 2010-10-14, at 10:02 , Marco Mariani wrote:
> On 13 October 2010 23:17, Alexander Belopolsky <
> alexander.belopolsky at gmail.com> wrote:
> * Make it easy to get a tuple of the start and end of the month
>> 
>> Why would you want this?  Start of the month is easy: just date(year,
>> month, 1).  End of the month is often unnecessary because it is more
>> pythonic to work with semi-open ranges and use first of the next month
>> instead.
> 
> Except next month may well be in next year.. blah
> 
> And I don't care about pythonic ranges if I have to push the values through
> a BETWEEN query in SQL.
> 
> import calendar
> import datetime
> 
> end = datetime.date(year, month, calendar.monthrange(year, month)[1])

There's also dateutil, which exposes some ideas of mx.DateTime on top of the built-in datetime, including relativedelta.

As a result, you can get the last day of the current month by going backwards one day from the first day of next month:

>>> datetime.now().date() + relativedelta(months=+1, day=+1, days=-1)
datetime.date(2010, 10, 31)

Or (clearer order of operations):

>>> datetime.now().date() + relativedelta(months=+1, day=+1) + relativedelta(days=-1)
datetime.date(2010, 10, 31)

(note that in both cases the "+" sign is of course optional).

Parameters without an `s` postfix are absolute (day=1 sets the day of the current datetime to 1, similar to using .replace), parameters with an `s` are offsets (`days=+1` takes tomorrow).
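
A self-contained version of the example above (assuming the dateutil
package is installed):

from datetime import date
from dateutil.relativedelta import relativedelta

today = date(2010, 10, 14)
# Net effect: first day of the next month, minus one day.
last_of_month = today + relativedelta(months=+1, day=1, days=-1)
assert last_of_month == date(2010, 10, 31)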

From danielgtaylor at gmail.com  Thu Oct 14 20:51:05 2010
From: danielgtaylor at gmail.com (Daniel G. Taylor)
Date: Thu, 14 Oct 2010 14:51:05 -0400
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
In-Reply-To: <C502DCBD-79E2-4CF7-9796-B2D122C216E6@masklinn.net>
References: <4CB610CC.1070009@programmer-art.org>	<AANLkTinQjxXUpjA=VUbb3dW=_4rkssLRRU7h2QD50nnd@mail.gmail.com>	<AANLkTi==PYSTAcW1yJYBSwTwt0-p5u4fYRBu6cqwtvGu@mail.gmail.com>
	<C502DCBD-79E2-4CF7-9796-B2D122C216E6@masklinn.net>
Message-ID: <4CB75119.7000701@gmail.com>

On 10/14/2010 08:06 AM, Masklinn wrote:
> On 2010-10-14, at 10:02 , Marco Mariani wrote:
>> On 13 October 2010 23:17, Alexander Belopolsky<
>> alexander.belopolsky at gmail.com>  wrote:
>> * Make it easy to get a tuple of the start and end of the month
>>>
>>> Why would you want this?  Start of the month is easy: just date(year,
>>> month, 1).  End of the month is often unnecessary because it is more
>>> pythonic to work with semi-open ranges and use first of the next month
>>> instead.
>>
>> Except next month may well be in next year.. blah
>>
>> And I don't care about pythonic ranges if I have to push the values through
>> a BETWEEN query in SQL.
>>
>> import calendar
>> import datetime
>>
>> end = datetime.date(year, month, calendar.monthrange(year, month)[1])
>
> There's also dateutil, which exposes some ideas of mx.DateTime on top of the built-in datetime, including relativedelta.
>
> As a result, you can get the last day of the current month by going backwards one day from the first day of next month:
>
>>>> datetime.now().date() + relativedelta(months=+1, day=+1, days=-1)
> datetime.date(2010, 10, 31)
>
> Or (clearer order of operations):
>
>>>> datetime.now().date() + relativedelta(months=+1, day=+1) + relativedelta(days=-1)
> datetime.date(2010, 10, 31)
>
> (note that in both cases the "+" sign is of course optional).
>
> Parameters without an `s` postfix are absolute (day=1 sets the day of the current datetime to 1, similar to using .replace), parameters with an `s` are offsets (`days=+1` takes tomorrow).

FWIW my library does the same sort of stuff using relativedelta 
internally, just sugar coats it heavily ;-)

Take care,
-- 
Daniel G. Taylor
http://programmer-art.org/


From dan at programmer-art.org  Thu Oct 14 20:54:30 2010
From: dan at programmer-art.org (Daniel G. Taylor)
Date: Thu, 14 Oct 2010 14:54:30 -0400
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
In-Reply-To: <1287008192.4178.9.camel@gumri>
References: <4CB610CC.1070009@programmer-art.org>
	<1287008192.4178.9.camel@gumri>
Message-ID: <4CB751E6.20605@programmer-art.org>

On 10/13/2010 06:16 PM, Dag Odenhall wrote:
> Not convinced your library is very Pythonic. Why a tuple attribute
> instead of having date objects be iterable so you can do tuple(Date())?

How do you envision this working for days, weeks, months, years? E.g. 
getting the min/max Date objects for today, for next week, for this 
current month, etc.

I'm very open to ideas here; I just implemented what made sense to me at 
the time.

> How do the fancy formats deal with locales?

It internally uses datetime.strftime, so will behave however that 
behaves with regard to locales.

> Is there support for ISO 8601? Should probably be the __str__.

Not built-in other than supporting a strftime method. This is a good 
idea and I will probably add it.

> +1 on the general idea, though.

Thanks :-)

Take care,
-- 
Daniel G. Taylor
http://programmer-art.org/


From davejakeman at hotmail.com  Thu Oct 14 22:58:52 2010
From: davejakeman at hotmail.com (Dave Jakeman)
Date: Thu, 14 Oct 2010 20:58:52 +0000
Subject: [Python-ideas] String Subtraction
Message-ID: <COL116-W6328B814D046F505774296BE560@phx.gbl>


I'm new to Python and this is my first suggestion, so please bear with me:

I believe there is a simple but useful string operation missing from Python: subtraction.  This is best described by way of example:

>>> "Mr Nesbit has learnt the first lesson of not being seen." - "Nesbit "
'Mr has learnt the first lesson of not being seen.'
>>> "Can't have egg, bacon, spam and sausage without the spam." - " spam"
'Can't have egg, bacon, and sausage without the spam.'
>>> "I'll bite your legs off!" - "arms"
'I'll bite your legs off!'

If b and c were strings, then:

a = b - c

would be equivalent to:

if b.find(c) < 0:
    a = b
else:
    a = b[:b.find(c)] + b[b.find(c)+len(c):]

The operation would remove from the minuend the first occurrence (searching from left to right) of the subtrahend.  In the case of no match, the minuend would be returned unmodified.

To those unfamiliar with string subtraction, it might seem non-intuitive, but it's a useful programming construct.  Many things can be done with it and it's a good way to keep code simple.  I think it would be preferable to the current interpreter response:

>>> record = line - newline
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'str' and 'str'

As the interpreter currently checks for this attempted operation, it seems it would be straightforward to add the code needed to do something useful with it.  I don't think there would be backward compatibility issues, as this would be a new feature in place of a fatal error.

From lvh at laurensvh.be  Thu Oct 14 23:15:28 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Thu, 14 Oct 2010 23:15:28 +0200
Subject: [Python-ideas] String Subtraction
In-Reply-To: <COL116-W6328B814D046F505774296BE560@phx.gbl>
References: <COL116-W6328B814D046F505774296BE560@phx.gbl>
Message-ID: <AANLkTikr3tuFcaABg_Qd1JgLYMnsEyPwKsbjcdKtpTXi@mail.gmail.com>

People already do this with the s1.replace(s2, "") idiom. I'm not sure what
the added value is. Your equivalent implementation looks pretty strange and
complex: how is it different from str.replace with the empty string as
second argument?


cheers
lvh

From dougal85 at gmail.com  Thu Oct 14 23:16:07 2010
From: dougal85 at gmail.com (Dougal Matthews)
Date: Thu, 14 Oct 2010 22:16:07 +0100
Subject: [Python-ideas] String Subtraction
In-Reply-To: <COL116-W6328B814D046F505774296BE560@phx.gbl>
References: <COL116-W6328B814D046F505774296BE560@phx.gbl>
Message-ID: <AANLkTik+FwL7swZnXUXsEr8-0z8mk6Wsddk3MS5uw7j8@mail.gmail.com>

On 14 October 2010 21:58, Dave Jakeman <davejakeman at hotmail.com> wrote:

> If b and c were strings, then:
>
> a = b - c
>
> would be equivalent to:
>
> if b.find(c) < 0:
>     a = b
> else:
>     a = b[:b.find(c)] + b[b.find(c)+len(c):]
>

Or more simply...

a = b.replace(c, '')

Dougal

From mwm-keyword-python.b4bdba at mired.org  Thu Oct 14 23:23:11 2010
From: mwm-keyword-python.b4bdba at mired.org (Mike Meyer)
Date: Thu, 14 Oct 2010 17:23:11 -0400
Subject: [Python-ideas] String Subtraction
In-Reply-To: <COL116-W6328B814D046F505774296BE560@phx.gbl>
References: <COL116-W6328B814D046F505774296BE560@phx.gbl>
Message-ID: <20101014172311.13909a3d@bhuda.mired.org>

On Thu, 14 Oct 2010 20:58:52 +0000
Dave Jakeman <davejakeman at hotmail.com> wrote:
> I'm new to Python and this is my first suggestion, so please bear with me:
> 
> I believe there is a simple but useful string operation missing from Python: subtraction.  This is best described by way of example:
> 
> >>> "Mr Nesbit has learnt the first lesson of not being seen." - "Nesbit "
> 'Mr has learnt the first lesson of not being seen.'
> >>> "Can't have egg, bacon, spam and sausage without the spam." - " spam"
> 'Can't have egg, bacon, and sausage without the spam.'
> >>> "I'll bite your legs off!" - "arms"
> 'I'll bite your legs off!'
> 
> If b and c were strings, then:
> 
> a = b - c
> 
> would be equivalent to:

The existing construct a = b.replace(c, '', 1)

The problem isn't that it's non-intuitive (there's only one intuitive
interface, and it's got nothing to do with computers), it's that there
are a wealth of "intuitive" meanings. A case can be made that it
should mean the same as any of these:

a = b.replace(c, '')
a = b.replace(c, ' ', 1)
a = b.replace(c, ' ')

For that matter, it might also mean the same thing as any of these:

a = re.sub(r'\s*%s\s*' % c, '', b, 1)
a = re.sub(r'\s*%s\s*' % c, '', b)
a = re.sub(r'\s*%s\s*' % c, ' ', b, 1)
a = re.sub(r'\s*%s\s*' % c, ' ', b)
a = re.sub(r'%s\s*' % c, '', b, 1)
a = re.sub(r'%s\s*' % c, '', b)
a = re.sub(r'%s\s*' % c, ' ', b, 1)
a = re.sub(r'%s\s*' % c, ' ', b)
a = re.sub(r'\s*%s' % c, '', b, 1)
a = re.sub(r'\s*%s' % c, '', b)
a = re.sub(r'\s*%s' % c, ' ', b, 1)
a = re.sub(r'\s*%s' % c, ' ', b)

Unless you can make a clear case as to why exactly one of those cases
is different enough from the others to warrant a syntax all its own,
it's probably best to be explicit about the desired behavior.

     <mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From masklinn at masklinn.net  Thu Oct 14 23:49:25 2010
From: masklinn at masklinn.net (Masklinn)
Date: Thu, 14 Oct 2010 23:49:25 +0200
Subject: [Python-ideas] String Subtraction
In-Reply-To: <20101014172311.13909a3d@bhuda.mired.org>
References: <COL116-W6328B814D046F505774296BE560@phx.gbl>
	<20101014172311.13909a3d@bhuda.mired.org>
Message-ID: <8418E1CE-73CB-400C-912A-7709CD0B8018@masklinn.net>

On 2010-10-14, at 23:23 , Mike Meyer wrote:
> 
> The problem isn't that it's non-intuitive (there's only one intuitive
> interface, and it's got nothing to do with computers), it's that there
> are a wealth of "intuitive" meanings. A case can be made that it
> should mean the same as any of thise:

Still, from my experience with numbers I would expect `a + b - b == a`, even if the order in which these operations are applied is important and not irrelevant.

From neatnate at gmail.com  Fri Oct 15 00:01:42 2010
From: neatnate at gmail.com (Nathan Schneider)
Date: Thu, 14 Oct 2010 18:01:42 -0400
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
In-Reply-To: <4CB751E6.20605@programmer-art.org>
References: <4CB610CC.1070009@programmer-art.org>
	<1287008192.4178.9.camel@gumri> <4CB751E6.20605@programmer-art.org>
Message-ID: <AANLkTinoAX9mhQ0d7jA+-XymH61HfBngSgN-kF-2fa0y@mail.gmail.com>

I'm glad to see there's interest in solving this (seems I'm not alone
in seeing date/time support as the ugly stepchild of the Python
standard library).

For what it's worth, not too long ago I ended up writing a bunch of
convenience functions to instantiate and convert between existing
date/time representations (datetime objects, time tuples, timestamps,
and string representations). The result is here, in case anyone's
interested:

http://www.cs.cmu.edu/~nschneid/docs/temporal.py

Cheers,
Nathan

On Thu, Oct 14, 2010 at 2:54 PM, Daniel G. Taylor
<dan at programmer-art.org> wrote:
> On 10/13/2010 06:16 PM, Dag Odenhall wrote:
>>
>> Not convinced your library is very Pythonic. Why a tuple attribute
>> instead of having date objects be iterable so you can do tuple(Date())?
>
> How do you envision this working for days, weeks, months, years? E.g.
> getting the min/max Date objects for today, for next week, for this current
> month, etc.
>
> I'm very open to ideas here; I just implemented what made sense to me at the
> time.
>
>> How do the fancy formats deal with locales?
>
> It internally uses datetime.strftime, so will behave however that behaves
> with regard to locales.
>
>> Is there support for ISO 8601? Should probably be the __str__.
>
> Not built-in other than supporting a strftime method. This is a good idea
> and I will probably add it.
>
>> +1 on the general idea, though.
>
> Thanks :-)
>
> Take care,
> --
> Daniel G. Taylor
> http://programmer-art.org/


From mwm-keyword-python.b4bdba at mired.org  Fri Oct 15 00:13:26 2010
From: mwm-keyword-python.b4bdba at mired.org (Mike Meyer)
Date: Thu, 14 Oct 2010 18:13:26 -0400
Subject: [Python-ideas] String Subtraction
In-Reply-To: <8418E1CE-73CB-400C-912A-7709CD0B8018@masklinn.net>
References: <COL116-W6328B814D046F505774296BE560@phx.gbl>
	<20101014172311.13909a3d@bhuda.mired.org>
	<8418E1CE-73CB-400C-912A-7709CD0B8018@masklinn.net>
Message-ID: <20101014181326.38579cfc@bhuda.mired.org>

On Thu, 14 Oct 2010 23:49:25 +0200
Masklinn <masklinn at masklinn.net> wrote:

> On 2010-10-14, at 23:23 , Mike Meyer wrote:
> > 
> > The problem isn't that it's non-intuitive (there's only one intuitive
> > interface, and it's got nothing to do with computers), it's that there
> > are a wealth of "intuitive" meanings. A case can be made that it
> > should mean the same as any of these:
> 
> Still, from my experience with numbers I would expect `a + b - b == a`, even if the order in which these operations are applied is important and not irrelevant.

Well, if you use the standard left-to-right ordering, that equality
doesn't hold for the proposed meaning for string subtraction:

("xyzzy and " + "xyzzy") - "xyzzy" = " and xyzzy" != "xyzzy and "

It won't hold for any of the definitions I proposed either - not if a
contains a copy of b.

Come to think of it, it doesn't hold for the computer representation
of numbers, either.

   <mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From denis.spir at gmail.com  Fri Oct 15 00:16:18 2010
From: denis.spir at gmail.com (spir)
Date: Fri, 15 Oct 2010 00:16:18 +0200
Subject: [Python-ideas] String Subtraction
In-Reply-To: <20101014172311.13909a3d@bhuda.mired.org>
References: <COL116-W6328B814D046F505774296BE560@phx.gbl>
	<20101014172311.13909a3d@bhuda.mired.org>
Message-ID: <20101015001618.2c19634e@o>

On Thu, 14 Oct 2010 17:23:11 -0400
Mike Meyer <mwm-keyword-python.b4bdba at mired.org> wrote:

> The problem isn't that it's non-intuitive (there's only one intuitive
> interface, and it's got nothing to do with computers), it's that there
> are a wealth of "intuitive" meanings.

Maybe have string.erase(sub,n) be a "more intuitive" shortcut for string.replace(sub,'',n)?
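
Something like this trivial wrapper, just for illustration:

def erase(s, sub, n=-1):
    # Remove up to n occurrences of sub (all of them by default).
    return s.replace(sub, '', n)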

Denis
-- -- -- -- -- -- --
vit esse estrany ?

spir.wikidot.com



From bruce at leapyear.org  Fri Oct 15 01:45:18 2010
From: bruce at leapyear.org (Bruce Leban)
Date: Thu, 14 Oct 2010 16:45:18 -0700
Subject: [Python-ideas] String Subtraction
In-Reply-To: <20101015001618.2c19634e@o>
References: <COL116-W6328B814D046F505774296BE560@phx.gbl>
	<20101014172311.13909a3d@bhuda.mired.org>
	<20101015001618.2c19634e@o>
Message-ID: <AANLkTimCPUD3FLKE3jRzYGGSAU04nnc=asFzGYyZj4+z@mail.gmail.com>

Here's a useful function along these lines, which ideally would be
string.remove():

def remove(s, sub, maxremove=None, sep=None):
  """Removes instances of sub from the string.

  Args:
    s: The string to be modified.
    sub: The substring to be removed.
    maxremove: If specified, the maximum number of instances to be
        removed (starting from the left). If omitted, removes all instances.
    sep: Optionally, the separators to be removed. If the separator appears
        on both sides of a removed substring, one of the separators is
        removed.

  >>> remove('test,blah,blah,blah,this', 'blah')
  'test,,,,this'
  >>> remove('test,blah,blah,blah,this', 'blah', maxremove=2)
  'test,,,blah,this'
  >>> remove('test,blah,blah,blah,this', 'blah', sep=',')
  'test,this'
  >>> remove('test,blah,blah,blah,this', 'blah', maxremove=2, sep=',')
  'test,blah,this'
  >>> remove('foo(1)blah(2)blah(3)bar', 'blah', 1)
  'foo(1)(2)blah(3)bar'
  """

  processed = ''
  remaining = s
  while maxremove is None or maxremove > 0:
    parts = remaining.split(sub, 1)
    if len(parts) == 1:
      return processed + remaining
    processed += parts[0]
    remaining = parts[1]
    if sep and processed.endswith(sep) and remaining.startswith(sep):
      remaining = remaining[len(sep):]
    if maxremove is not None:
      maxremove -= 1
  return processed + remaining

--- Bruce
Latest blog post:
http://www.vroospeak.com/2010/10/today-we-are-all-chileans.html
Learn how hackers think: http://j.mp/gruyere-security



On Thu, Oct 14, 2010 at 3:16 PM, spir <denis.spir at gmail.com> wrote:

> On Thu, 14 Oct 2010 17:23:11 -0400
> Mike Meyer <mwm-keyword-python.b4bdba at mired.org> wrote:
>
> > The problem isn't that it's non-intuitive (there's only one intuitive
> > interface, and it's got nothing to do with computers), it's that there
> > are a wealth of "intuitive" meanings.
>
> Maybe have string.erase(sub,n) be a "more intuitive" shortcut for
> string.replace(sub,'',n)?
>
> Denis
> -- -- -- -- -- -- --
> vit esse estrany ?
>
> spir.wikidot.com
>

From ben+python at benfinney.id.au  Fri Oct 15 02:51:43 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Fri, 15 Oct 2010 11:51:43 +1100
Subject: [Python-ideas] Pythonic Dates, Times, and Deltas
References: <4CB610CC.1070009@programmer-art.org>
	<4CB619B3.4050203@egenix.com> <4CB61DBA.9060801@programmer-art.org>
Message-ID: <874ocouvkw.fsf@benfinney.id.au>

"Daniel G. Taylor"
<dan at programmer-art.org> writes:

> Any idea why this hasn't made it into Python's standard library while
> being around for 13 years? Seems like it would be extremely useful in
> the standard distribution.

One barrier is that its license terms
<URL:http://www.egenix.com/products/python/mxBase/eGenix.com-Public-License-1.1.0.pdf>
are incompatible with redistribution under the terms of the Python
license.

I'd love to see the mx code released under compatible license terms, but
am not optimistic.

-- 
 \      “He that would make his own liberty secure must guard even his |
  `\                             enemy from oppression.” —Thomas Paine |
_o__)                                                                  |
Ben Finney



From rrr at ronadam.com  Fri Oct 15 04:09:06 2010
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 14 Oct 2010 21:09:06 -0500
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>	<201010111017.56101.steve@pearwood.info>	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>
	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>
Message-ID: <4CB7B7C2.8090401@ronadam.com>



On 10/13/2010 07:13 PM, Tal Einat wrote:
> On Thu, Oct 14, 2010 at 1:14 AM, Nick Coghlan wrote:
>> Why use feed() rather than the existing generator send() API?
>>
>> def runningmax(default_max=None):
>>     max_value = default_max
>>     while 1:
>>         value = max(yield max_value)
>>         if max_value is None or value>  max_value:
>>             max_value = value
>
> I tried using generators for this and it came out very clumsy. For one
> thing, using generators for this requires first calling next() once to
> run the generator up to the first yield, which makes the user-facing
> API very confusing. Generators also have to yield a value at every
> iteration, which is unnecessary here. Finally, the feedMultiple
> optimization is impossible with a generator-based implementation.

Something I noticed about the min and max functions is that they treat
values and iterables slightly differently.

# This works
>>> min(1, 2)
1
>>> min([1, 2])
1


# The gotcha
>>> min([1])
1
>>> min(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable


So you need a function like the following to make it handle single values 
and single iterables the same.

def xmin(value):
    try:
        return min(value)
    except TypeError:
        return min([value])

Then you can do...

@consumer
def Running_Min(out_value=None):
    while 1:
        in_value = yield out_value
        if in_value is not None:
            if out_value is None:
                out_value = xmin(in_value)
            else:
                out_value = xmin(out_value, xmin(in_value))


Or for your class...

def xmax(value):
    try:
        return max(value)
    except TypeError:
        return max([value])

class RunningMax(RunningCalc):
    def __init__(self):
        self.max_value = None

    def feed(self, value):
        if value is not None:
            if self.max_value is None:
                self.max_value = xmax(value)
            else:
                self.max_value = xmax(self.max_value, xmax(value))


Now if they could handle None a bit better we might be able to get rid of 
the None checks too.  ;-)

Cheers,
    Ron


From guido at python.org  Fri Oct 15 04:14:02 2010
From: guido at python.org (Guido van Rossum)
Date: Thu, 14 Oct 2010 19:14:02 -0700
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <4CB7B7C2.8090401@ronadam.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>
	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>
	<4CB7B7C2.8090401@ronadam.com>
Message-ID: <AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>

Why would you ever want to write min(1)? (Or min(x) where x is not iterable.)

--Guido

On Thu, Oct 14, 2010 at 7:09 PM, Ron Adam <rrr at ronadam.com> wrote:
>
>
> On 10/13/2010 07:13 PM, Tal Einat wrote:
>>
>> On Thu, Oct 14, 2010 at 1:14 AM, Nick Coghlan wrote:
>>>
>>> Why use feed() rather than the existing generator send() API?
>>>
>>> def runningmax(default_max=None):
>>>     max_value = default_max
>>>     while 1:
>>>         value = max(yield max_value)
>>>         if max_value is None or value > max_value:
>>>             max_value = value
>>
>> I tried using generators for this and it came out very clumsy. For one
>> thing, using generators for this requires first calling next() once to
>> run the generator up to the first yield, which makes the user-facing
>> API very confusing. Generators also have to yield a value at every
>> iteration, which is unnecessary here. Finally, the feedMultiple
>> optimization is impossible with a generator-based implementation.
>
> Something I noticed about the min and max functions is that they treat
> values and iterables slightly differently.
>
> # This works
>>>> min(1, 2)
> 1
>>>> min([1, 2])
> 1
>
>
> # The gotcha
>>>> min([1])
> 1
>>>> min(1)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: 'int' object is not iterable
>
>
> So you need a function like the following to make it handle single values
> and single iterables the same.
>
> def xmin(value):
>     try:
>         return min(value)
>     except TypeError:
>         return min([value])
>
> Then you can do...
>
> @consumer
> def Running_Min(out_value=None):
>     while 1:
>         in_value = yield out_value
>         if in_value is not None:
>             if out_value is None:
>                 out_value = xmin(in_value)
>             else:
>                 out_value = xmin(out_value, xmin(in_value))
>
>
> Or for your class...
>
> def xmax(value):
>     try:
>         return max(value)
>     except TypeError:
>         return max([value])
>
> class RunningMax(RunningCalc):
>     def __init__(self):
>         self.max_value = None
>
>     def feed(self, value):
>         if value is not None:
>             if self.max_value is None:
>                 self.max_value = xmax(value)
>             else:
>                 self.max_value = xmax(self.max_value, xmax(value))
>
>
> Now if they could handle None a bit better we might be able to get rid of
> the None checks too.  ;-)
>
> Cheers,
>   Ron



-- 
--Guido van Rossum (python.org/~guido)


From taleinat at gmail.com  Fri Oct 15 05:05:43 2010
From: taleinat at gmail.com (Tal Einat)
Date: Fri, 15 Oct 2010 05:05:43 +0200
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <4CB7B7C2.8090401@ronadam.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>
	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>
	<4CB7B7C2.8090401@ronadam.com>
Message-ID: <AANLkTim27F-G4mw_O9tNKTQ4EZ3X_BHaQpmBT_0MJS06@mail.gmail.com>

On Fri, Oct 15, 2010 at 4:09 AM, Ron Adam wrote:
>
>
> On 10/13/2010 07:13 PM, Tal Einat wrote:
>>
>> On Thu, Oct 14, 2010 at 1:14 AM, Nick Coghlan wrote:
>>>
>>> Why use feed() rather than the existing generator send() API?
>>>
>>> def runningmax(default_max=None):
>>>     max_value = default_max
>>>     while 1:
>>>         value = max(yield max_value)
>>>         if max_value is None or value > max_value:
>>>             max_value = value
>>
>> I tried using generators for this and it came out very clumsy. For one
>> thing, using generators for this requires first calling next() once to
>> run the generator up to the first yield, which makes the user-facing
>> API very confusing. Generators also have to yield a value at every
>> iteration, which is unnecessary here. Finally, the feedMultiple
>> optimization is impossible with a generator-based implementation.
>
> Something I noticed about the min and max functions is that they treat
> values and iterables slightly differently.

Sorry, my bad. The max in "value = max(yield max_value)" was an error,
it should have been removed.

As Guido mentioned, there is never a reason to do max(value) where
value is not an iterable.

- Tal


From steve at pearwood.info  Fri Oct 15 05:12:03 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 15 Oct 2010 14:12:03 +1100
Subject: [Python-ideas] minmax() function returning (minimum,
	maximum) tuple of a sequence
In-Reply-To: <AANLkTi=VmM5ZJswJ1fEtN+xNbgvka==gL9CwNVt_0KcE@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010142223.32257.steve@pearwood.info>
	<AANLkTi=VmM5ZJswJ1fEtN+xNbgvka==gL9CwNVt_0KcE@mail.gmail.com>
Message-ID: <201010151412.03572.steve@pearwood.info>

On Thu, 14 Oct 2010 11:05:25 pm you wrote:
> On Thu, Oct 14, 2010 at 1:23 PM, Steven D'Aprano wrote:
> > On Thu, 14 Oct 2010 08:54:31 am you wrote:
> >> After some thought, I've found a way to make running several
> >> "running calculations" in parallel fast. Speed should be
> >> comparable to having used the non-running variants.
> >
> > Speed "should be" comparable? Are you guessing or have you actually
> > timed it?
> >
> > And surely the point of all this extra coding is to make something
> > run *faster*, not "comparable to", the sequential algorithm?
>
> The use-case I'm targeting is when you can't hold all of the data in
> memory, and it is relatively "expensive" to generate it, e.g. a large
> and complex database query. In this case just running the sequential
> functions one at a time requires generating the data several times,
> once per function. My goal is to facilitate running several
> computations on a single iterator without keeping all of the data in
> memory.

Okay, fair enough, but I think that's enough of a specialist need that 
it doesn't belong as a built-in or even in the standard library.

I suspect that, even for your application, a more sensible approach 
would be to write a single function to walk over the data once, doing 
all the calculations you need. E.g. if your data is numeric, and you 
need (say) the min, max, mean (average), standard deviation and 
standard error, rather than doing a separate pass for each function, 
you can do them all in a single pass:

import math
import sys

sum = 0
sum_sq = 0
count = 0
smallest = sys.maxint
biggest = -sys.maxint
for x in data:
    count += 1
    sum += x
    sum_sq += x**2
    smallest = min(smallest, x)
    biggest = max(biggest, x)
mean = sum/count
std_dev = math.sqrt((sum_sq - sum**2/count)/(count-1))
std_err = std_dev/math.sqrt(count)

That expression for the standard deviation is from memory, don't trust 
it, I've probably got it wrong!

Naturally, if you don't know what functions you need to call until 
runtime it will require a bit more cleverness. A general approach might 
be a functional approach based on reduce:

def multireduce(functions, initial_values, data):
    values = list(initial_values)
    for x in data:
        for i, func in enumerate(functions):
            values[i] = func(x, values[i])
    return values
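
For instance (an illustrative call, reusing sys.maxint from the snippet
above as the initial values):

data = [3, 1, 4, 1, 5]
smallest, biggest = multireduce([min, max], [sys.maxint, -sys.maxint], data)
# smallest == 1, biggest == 5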

The point is that if generating the data is costly, the best approach is 
to lazily generate the data once only, with the minimal overhead and 
maximum flexibility.


-- 
Steven D'Aprano


From digitalxero at gmail.com  Fri Oct 15 05:22:59 2010
From: digitalxero at gmail.com (Dj Gilcrease)
Date: Thu, 14 Oct 2010 23:22:59 -0400
Subject: [Python-ideas] String Subtraction
In-Reply-To: <AANLkTimCPUD3FLKE3jRzYGGSAU04nnc=asFzGYyZj4+z@mail.gmail.com>
References: <COL116-W6328B814D046F505774296BE560@phx.gbl>
	<20101014172311.13909a3d@bhuda.mired.org>
	<20101015001618.2c19634e@o>
	<AANLkTimCPUD3FLKE3jRzYGGSAU04nnc=asFzGYyZj4+z@mail.gmail.com>
Message-ID: <AANLkTikZUMkN8rMO-=+pcpeRwVnLw-Vu4J7zK_Gccc1A@mail.gmail.com>

On Thu, Oct 14, 2010 at 7:45 PM, Bruce Leban <bruce at leapyear.org> wrote:
> Here's a useful function along these lines, which ideally would be
> string.remove():
> def remove(s, sub, maxremove=None, sep=None):
> ??"""Removes instances of sub from the string.
> ??Args:
> ?? ?s: The string to be modified.
> ?? ?sub: The substring to be removed.
> ?? ?maxremove: If specified, the maximum number of instances to be
> ?? ? ? ?removed (starting from the left). If omitted, removes all instances.
> ?? ?sep: Optionally, the separators to be removed. If the separator appears
> ?? ? ? ?on both sides of a removed substring, one of the separators is
> removed.
> ??>>> remove('test,blah,blah,blah,this', 'blah')
> ??'test,,,,this'
> ??>>> remove('test,blah,blah,blah,this', 'blah', maxremove=2)
> ??'test,,,blah,this'
> ??>>> remove('test,blah,blah,blah,this', 'blah', sep=',')
> ??'test,this'
> ??>>> remove('test,blah,blah,blah,this', 'blah', maxremove=2, sep=',')
> ??'test,blah,this'
> ??>>> remove('foo(1)blah(2)blah(3)bar', 'blah', 1)
> ??'foo(1)(2)blah(3)bar'
> ??"""

Could be written as

def remove(string, sub, max_remove=-1, sep=None):
    if sep:
        sub = sub + sep
    return string.replace(sub, '', max_remove)

t = 'test,blah,blah,blah,this'
print(remove(t, 'blah'))
print(remove(t, 'blah', 2))
print(remove(t, 'blah', sep=','))
print(remove(t, 'blah', 2, ','))
print(remove('foo(1)blah(2)blah(3)bar', 'blah', 1))


Dj Gilcrease
 ____
( |     \  o    ()   |  o  |`|
  |      |      /`\_/|      | |   ,__   ,_,   ,_,   __,    ,   ,_,
_|      | |    /      |  |   |/   /      /   |   |_/  /    |   / \_|_/
(/\___/  |/  /(__,/  |_/|__/\___/    |_/|__/\__/|_/\,/  |__/
         /|
         \|


From bruce at leapyear.org  Fri Oct 15 06:40:00 2010
From: bruce at leapyear.org (Bruce Leban)
Date: Thu, 14 Oct 2010 21:40:00 -0700
Subject: [Python-ideas] String Subtraction
In-Reply-To: <AANLkTikZUMkN8rMO-=+pcpeRwVnLw-Vu4J7zK_Gccc1A@mail.gmail.com>
References: <COL116-W6328B814D046F505774296BE560@phx.gbl>
	<20101014172311.13909a3d@bhuda.mired.org>
	<20101015001618.2c19634e@o>
	<AANLkTimCPUD3FLKE3jRzYGGSAU04nnc=asFzGYyZj4+z@mail.gmail.com>
	<AANLkTikZUMkN8rMO-=+pcpeRwVnLw-Vu4J7zK_Gccc1A@mail.gmail.com>
Message-ID: <AANLkTi=m46VXsU7GKbAL7_E8HQfuJkn5YSeCF+ymtdxe@mail.gmail.com>

Your code operates differently for "test blah,this". My code produces "test
,this" while yours produces "test this". Eliding multiple separators is
perhaps more useful when sep=' ' but I used commas because they're easier to
see.

An alternative design removes one separator either before or after a removed
string (but not both). That would work better for an example like this:

>>> remove('The Illuminati fnord are everywhere fnord.', 'fnord', sep=' ')
'The Illuminati are everywhere.'
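
A sketch of that variant (untested; only the separator handling differs
from the function above):

def remove_one_sep(s, sub, maxremove=None, sep=None):
  processed = ''
  remaining = s
  while maxremove is None or maxremove > 0:
    parts = remaining.split(sub, 1)
    if len(parts) == 1:
      break
    processed += parts[0]
    remaining = parts[1]
    if sep:
      # Drop one separator: after the removed substring if present,
      # otherwise before it.
      if remaining.startswith(sep):
        remaining = remaining[len(sep):]
      elif processed.endswith(sep):
        processed = processed[:-len(sep)]
    if maxremove is not None:
      maxremove -= 1
  return processed + remaining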

Neither version of this may have sufficient utility to be added to the
standard library.

--- Bruce
http://www.vroospeak.com
http://j.mp/gruyere-security



On Thu, Oct 14, 2010 at 8:22 PM, Dj Gilcrease <digitalxero at gmail.com> wrote:

> On Thu, Oct 14, 2010 at 7:45 PM, Bruce Leban <bruce at leapyear.org> wrote:
> > Here's a useful function along these lines, which ideally would be
> > string.remove():
> > def remove(s, sub, maxremove=None, sep=None):
> >   """Removes instances of sub from the string.
> >   Args:
> >     s: The string to be modified.
> >     sub: The substring to be removed.
> >     maxremove: If specified, the maximum number of instances to be
> >         removed (starting from the left). If omitted, removes all
> instances.
> >     sep: Optionally, the separators to be removed. If the separator
> appears
> >         on both sides of a removed substring, one of the separators is
> > removed.
> >   >>> remove('test,blah,blah,blah,this', 'blah')
> >   'test,,,,this'
> >   >>> remove('test,blah,blah,blah,this', 'blah', maxremove=2)
> >   'test,,,blah,this'
> >   >>> remove('test,blah,blah,blah,this', 'blah', sep=',')
> >   'test,this'
> >   >>> remove('test,blah,blah,blah,this', 'blah', maxremove=2, sep=',')
> >   'test,blah,this'
> >   >>> remove('foo(1)blah(2)blah(3)bar', 'blah', 1)
> >   'foo(1)(2)blah(3)bar'
> >   """
>
> Could be written as
>
> def remove(string, sub, max_remove=-1, sep=None):
>    if sep:
>        sub = sub + sep
>    return string.replace(sub, '', max_remove)
>
> t = 'test,blah,blah,blah,this'
> print(remove(t, 'blah'))
> print(remove(t, 'blah', 2))
> print(remove(t, 'blah', sep=','))
> print(remove(t, 'blah', 2, ','))
> print(remove('foo(1)blah(2)blah(3)bar', 'blah', 1))
>
>
> Dj Gilcrease
>  ____
> ( |     \  o    ()   |  o  |`|
>   |      |      /`\_/|      | |   ,__   ,_,   ,_,   __,    ,   ,_,
> _|      | |    /      |  |   |/   /      /   |   |_/  /    |   / \_|_/
> (/\___/  |/  /(__,/  |_/|__/\___/    |_/|__/\__/|_/\,/  |__/
>          /|
>          \|
>

From taleinat at gmail.com  Fri Oct 15 17:36:19 2010
From: taleinat at gmail.com (Tal Einat)
Date: Fri, 15 Oct 2010 17:36:19 +0200
Subject: [Python-ideas] minmax() function returning (minimum,
 maximum) tuple of a sequence
In-Reply-To: <201010151412.03572.steve@pearwood.info>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010142223.32257.steve@pearwood.info>
	<AANLkTi=VmM5ZJswJ1fEtN+xNbgvka==gL9CwNVt_0KcE@mail.gmail.com>
	<201010151412.03572.steve@pearwood.info>
Message-ID: <AANLkTinRUFT65BbS7-uJdguP9A7oMxZXh-NDBvTEN+vV@mail.gmail.com>

On Fri, Oct 15, 2010 at 5:12 AM, Steven D'Aprano wrote:
> On Thu, 14 Oct 2010 11:05:25 pm you wrote:
>> The use-case I'm targeting is when you can't hold all of the data in
>> memory, and it is relatively "expensive" to generate it, e.g. a large
>> and complex database query. In this case just running the sequential
>> functions one at a time requires generating the data several times,
>> once per function. My goal is to facilitate running several
>> computations on a single iterator without keeping all of the data in
>> memory.
>
> Okay, fair enough, but I think that's enough of a specialist need that
> it doesn't belong as a built-in or even in the standard library.

I don't see this as a specialist need. This is relevant to any piece
of code which receives an iterator and doesn't know whether it is
feasible to keep all of its items in memory. The way I see it,
Python's embracing of iterators is what makes this commonly useful.

> I suspect that, even for your application, a more sensible approach
> would be to write a single function to walk over the data once, doing
> all the calculations you need. E.g. if your data is numeric, and you
> need (say) the min, max, mean (average), standard deviation and
> standard error, rather than doing a separate pass for each function,
> you can do them all in a single pass:
>
> sum = 0
> sum_sq = 0
> count = 0
> smallest = sys.maxint
> biggest = -sys.maxint
> for x in data:
>    count += 1
>    sum += x
>    sum_sq += x**2
>    smallest = min(smallest, x)
>    biggest = max(biggest, x)
> mean = sum/count
> std_dev = math.sqrt((sum_sq - sum**2/count)/(count-1))
> std_err = std_dev/math.sqrt(count)

What you suggest is that each programmer rolls his own code, which is
reasonable for tasks which are not very common and are easy enough to
implement. The problem is that in this case, the straightforward
solution you suggest has both efficiency and numerical stability
problems. These are actually quite tricky to understand and sort out.
In light of this, a standard implementation which avoids common
stumbling blocks and errors could have its place in the standard
library. IIRC these were the reasons for the inclusion of the bisect
module, for example.

Regarding the numerical stability issues, these don't arise just in
extreme edge-cases. Even a simple running average calculation for some
large numbers, or numbers whose average is near zero, can have
significant errors. Variance and standard deviation are even more
problematic in this respect.
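
For illustration, a numerically stable one-pass mean/variance can be
written with Welford's update (a sketch, not production code):

def running_mean_var(data):
    # Welford's algorithm: one pass, no large intermediate sums.
    count, mean, m2 = 0, 0.0, 0.0
    for x in data:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / (count - 1) if count > 1 else 0.0
    return mean, variance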

> Naturally, if you don't know what functions you need to call until
> runtime it will require a bit more cleverness. A general approach might
> be a functional approach based on reduce:
>
> def multireduce(functions, initial_values, data):
>     values = list(initial_values)
>     for x in data:
>         for i, func in enumerate(functions):
>             values[i] = func(x, values[i])
>     return values
>
> The point is that if generating the data is costly, the best approach is
> to lazily generate the data once only, with the minimal overhead and
> maximum flexibility.

This is precisely what I am suggesting! The only difference is that I
suggest using objects with a simple API instead of functions, to allow
more flexibility. Some things are hard to implement using just a
function as you suggest, and various optimizations are impossible.

- Tal Einat


From rrr at ronadam.com  Fri Oct 15 19:13:54 2010
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 15 Oct 2010 12:13:54 -0500
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>
	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>
	<4CB7B7C2.8090401@ronadam.com>
	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>
Message-ID: <4CB88BD2.4010901@ronadam.com>

My apologies, I clicked "reply" instead of "reply list" last night.

After thinking about this a bit more, it isn't a matter of never needing to 
do it.  The min and max functions wouldn't be able to compare a series of 
lists as individual values without a keyword switch to choose the specific 
behavior for a single item, i.e. a list of items versus an item that happens 
to be a list.  The examples below would not be able to compare sequences 
correctly.

Ron



On 10/14/2010 09:14 PM, Guido van Rossum wrote:
> Why would you ever want to write min(1)? (Or min(x) where x is not iterable.)

Basically, to allow easier duck typing without having to check whether x is 
an iterable.

This isn't a big deal or a must have.  It's just one solution to a problem 
presented here.  My own thought is that little tweaks like this may be 
helpful when using functions in indirect ways where it's nice not to have 
to do additional value, type, or attribute checking.


[Tal also says]
> As Guido mentioned, there is never a reason to do max(value) where
> value is not an iterable.

Well, you can always avoid doing it, but that doesn't mean it wouldn't be 
nice to have sometimes.  Take a look at the following three coroutines that 
do the same exact thing.  Which is easier to read and which would be 
considered the more Pythonic?


def xmin(*args, **kwds):
    # Allow min to work with a single non-iterable value.
    if len(args) == 1 and not hasattr(args[0], "__iter__"):
        return min(args, **kwds)
    else:
        return min(*args, **kwds)


# Accept values or chunks of values and keep a running minimum.

@consumer
def Running_Min(out_value=None):
    while 1:
        in_value = yield out_value
        if in_value is not None:
            if out_value is None:
                out_value = xmin(in_value)
            else:
                out_value = xmin(out_value, xmin(in_value))

@consumer
def Running_Min(out_value=None):
    while 1:
        in_value = yield out_value
        if in_value is not None:
            if not hasattr(in_value, "__iter__"):
                in_value = [in_value]
            if out_value is None:
                out_value = min(in_value)
            else:
                out_value = min(out_value, min(in_value))

@consumer
def Running_Min(out_value=None):
    while 1:
        in_value = yield out_value
        if in_value is not None:
            if not hasattr(in_value, "__iter__"):
                if out_value is None:
                    out_value = in_value
                else:
                    out_value = min(out_value, in_value)
            else:
                if out_value is None:
                    out_value = min(in_value)
                else:
                    out_value = min(out_value, min(in_value))



From g.brandl at gmx.net  Fri Oct 15 19:27:10 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 15 Oct 2010 19:27:10 +0200
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <4CB88BD2.4010901@ronadam.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>	<201010111017.56101.steve@pearwood.info>	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>	<4CB7B7C2.8090401@ronadam.com>	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>
	<4CB88BD2.4010901@ronadam.com>
Message-ID: <i9a2u9$q8k$1@dough.gmane.org>

On 15.10.2010 19:13, Ron Adam wrote:

> [Tal also says]
>> As Guido mentioned, there is never a reason to do max(value) where
>> value is not an iterable.
> 
> Well, you can always avoid doing it, but that doesn't mean it wouldn't be 
> nice to have sometimes.  Take a look at the following three coroutines that 
> do the same exact thing.  Which is easier to read and which would be 
> considered the more Pythonic?
> 
> 
> def xmin(*args, **kwds):
>     # Allow min to work with a single non-iterable value.
>     if len(args) == 1 and not hasattr(args[0], "__iter__"):
>         return min(args, **kwds)
>     else:
>         return min(*args, **kwds)

I don't understand this function.  Why wouldn't you simply always call

   return min(args, **kwds)

?

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From rrr at ronadam.com  Fri Oct 15 20:09:17 2010
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 15 Oct 2010 13:09:17 -0500
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <i9a2u9$q8k$1@dough.gmane.org>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>	<201010111017.56101.steve@pearwood.info>	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>	<4CB7B7C2.8090401@ronadam.com>	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>	<4CB88BD2.4010901@ronadam.com>
	<i9a2u9$q8k$1@dough.gmane.org>
Message-ID: <4CB898CD.6000207@ronadam.com>



On 10/15/2010 12:27 PM, Georg Brandl wrote:
> Am 15.10.2010 19:13, schrieb Ron Adam:
>
>> [Tal also says]
>>> As Guido mentioned, there is never a reason to do max(value) where
>>> value is not an iterable.
>>
>> Well, you can always avoid doing it, but that doesn't mean it wouldn't be
>> nice to have sometimes.  Take a look at the following three coroutines that
>> do the same exact thing.  Which is easier to read, and which would be
>> considered the more Pythonic?
>>
>>
>> def xmin(*args, **kwds):
>>       # Allow min to work with a single non-iterable value.
>>       if len(args) == 1 and not hasattr(args[0], "__iter__"):
>>           return min(args, **kwds)
>>       else:
>>           return min(*args, **kwds)
>
> I don't understand this function.  Why wouldn't you simply always call
>
>     return min(args, **kwds)
>
> ?

Because it would always interpret a list of values as a single item.

This function looks at args, and if it's a single value without an "__iter__" 
method, it passes it to min as min([value], **kwds) instead of min(value, 
**kwds).

Another way to do this would be to use a try-except...

     try:
          return min(*args, **kwds)
     except TypeError:
          return min(args, **kwds)

Ron


From arnodel at googlemail.com  Fri Oct 15 21:25:49 2010
From: arnodel at googlemail.com (Arnaud Delobelle)
Date: Fri, 15 Oct 2010 20:25:49 +0100
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <4CB898CD.6000207@ronadam.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>
	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>
	<4CB7B7C2.8090401@ronadam.com>
	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>
	<4CB88BD2.4010901@ronadam.com> <i9a2u9$q8k$1@dough.gmane.org>
	<4CB898CD.6000207@ronadam.com>
Message-ID: <AANLkTi=Qawvg8rUDmX6UUrTtKsuRrj1+j2R1K+pHeGTT@mail.gmail.com>

On 15 October 2010 19:09, Ron Adam <rrr at ronadam.com> wrote:
>
>
> On 10/15/2010 12:27 PM, Georg Brandl wrote:
>>
>> Am 15.10.2010 19:13, schrieb Ron Adam:
>>
>>> [Tal also says]
>>>>
>>>> As Guido mentioned, there is never a reason to do max(value) where
>>>> value is not an iterable.
>>>
>>> Well, you can always avoid doing it, but that doesn't mean it wouldn't be
>>> nice to have sometimes.  Take a look at the following three coroutines
>>> that
>>> do the same exact thing.  Which is easier to read, and which would be
>>> considered the more Pythonic?
>>>
>>>
>>> def xmin(*args, **kwds):
>>>      # Allow min to work with a single non-iterable value.
>>>      if len(args) == 1 and not hasattr(args[0], "__iter__"):
>>>          return min(args, **kwds)
>>>      else:
>>>          return min(*args, **kwds)
>>
>> I don't understand this function.  Why wouldn't you simply always call
>>
>>    return min(args, **kwds)
>>
>> ?
>
> Because it would always interpret a list of values as a single item.
>
> This function looks at args, and if it's a single value without an "__iter__"
> method, it passes it to min as min([value], **kwds) instead of min(value,
> **kwds).

But there are many iterable objects which are also comparable (hence
it makes sense to consider their min/max), for example strings.

So we get:

    xmin("foo", "bar", "baz") == "bar"
    xmin("foo", "bar") == "bar"

but:

   xmin("foo") == "f"

This will create havoc in your running min routine.

(Notice the same will hold for min() but at least you know that min(x)
considers x as an iterable and complains if it isn't)

--
Arnaud


From g.brandl at gmx.net  Fri Oct 15 21:25:27 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 15 Oct 2010 21:25:27 +0200
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <4CB898CD.6000207@ronadam.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>	<201010111017.56101.steve@pearwood.info>	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>	<4CB7B7C2.8090401@ronadam.com>	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>	<4CB88BD2.4010901@ronadam.com>	<i9a2u9$q8k$1@dough.gmane.org>
	<4CB898CD.6000207@ronadam.com>
Message-ID: <i9a9s1$qe4$1@dough.gmane.org>

Am 15.10.2010 20:09, schrieb Ron Adam:
>> I don't understand this function.  Why wouldn't you simply always call
>>
>>     return min(args, **kwds)
>>
>> ?
> 
> Because it would always interpret a list of values as a single item.
> 
> This function looks at args, and if it's a single value without an "__iter__" 
> method, it passes it to min as min([value], **kwds) instead of min(value, 
> **kwds).
> 
> Another way to do this would be to use a try-except...
> 
>      try:
>           return min(*args, **kwds)
>      except TypeError:
>           return min(args, **kwds)

And that's just gratuitous.  If you have the sequence of items to compare
already as an iterable, there is absolutely no need to unpack them using
*args.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From raymond.hettinger at gmail.com  Fri Oct 15 22:01:37 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Fri, 15 Oct 2010 13:01:37 -0700
Subject: [Python-ideas] Fwd:  stats module  Was: minmax() function ...
References: <1F769A68-3C17-482B-A252-7DB6BFF7F37B@gmail.com>
Message-ID: <379FDE7A-D612-4701-8E11-2E9A36EFF3CB@gmail.com>

Drat.  This should have gone to python-ideas.
Re-sending.

Begin forwarded message:

> From: Raymond Hettinger <raymond.hettinger at gmail.com>
> Date: October 15, 2010 1:00:16 PM PDT
> To: Python-Dev Dev <python-dev at python.org>
> Subject: Fwd: [Python-ideas] stats module Was: minmax() function ...
> 
> Hello guys.  If you don't mind, I would like to hijack your thread :-)
> 
> ISTM, that the minmax() idea is really just an optimization request.
> A single-pass minmax() is easily coded in simple, pure-python,
> so really the discussion is about how to remove the loop overhead
> (there isn't much you can do about the cost of the two compares
> which is where most of the time would be spent anyway).
> 
> My suggestion is to aim higher.   There is no reason a single pass
> couldn't also return min/max/len/sum and perhaps even other summary
> statistics like sum(x**2) so that you can compute standard deviation 
> and variance.
> 
> A few years ago, Guido and other python devvers supported a
> proposal I made to create a stats module, but I didn't have time
> to develop it.  The basic idea was that python's batteries should
> include most of the functionality available on advanced student
> calculators.  Another idea behind it was that we could invisibly
> do the right thing under the hood to help users avoid numerical
> problems (i.e. math.fsum(s)/len(s) is a more accurate way to
> compute an average because it doesn't lose precision when
> building-up the intermediate sums).
> 
> I think the creativity and energy of this group is much better directed
> at building a quality stats module (perhaps with some R-like capabilities).
> That would likely be a better use of energy than bike-shedding 
> about ways to speed-up a trivial piece of code that is ultimately
> constrained by the cost of the compares per item.
> 
> my-two-cents,
> 
> 
> Raymond
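
[A rough pure-Python sketch of the single-pass summary Raymond
describes; the function name and return shape are invented here, and the
textbook variance formula at the end is exactly the kind of numerically
naive code a real stats module would improve on:

def summarize(iterable):
    # One pass over the data: count, min, max, sum, and sum of squares.
    it = iter(iterable)
    try:
        x = next(it)
    except StopIteration:
        raise ValueError("summarize() arg is an empty iterable")
    n, lo, hi, total, total_sq = 1, x, x, x, x * x
    for x in it:
        if x < lo:
            lo = x
        elif x > hi:
            hi = x
        n += 1
        total += x
        total_sq += x * x
    mean = total / n
    variance = total_sq / n - mean * mean  # can lose precision badly
    return n, lo, hi, total, mean, variance

For example, summarize([2, 4, 4, 4, 5, 5, 7, 9]) gives a mean of 5.0 and
a (population) variance of 4.0.]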

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20101015/33f40ee2/attachment.html>

From rrr at ronadam.com  Fri Oct 15 22:00:53 2010
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 15 Oct 2010 15:00:53 -0500
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>	<201010111017.56101.steve@pearwood.info>	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>	<4CB7B7C2.8090401@ronadam.com>	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>	<4CB88BD2.4010901@ronadam.com>	<i9a2u9$q8k$1@dough.gmane.org>	<4CB898CD.6000207@ronadam.com>
	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>
Message-ID: <4CB8B2F5.2020507@ronadam.com>



On 10/15/2010 02:04 PM, Arnaud Delobelle wrote:

>> Because it would always interpret a list of values as a single item.
>>
>> This function looks at args, and if it's a single value without an "__iter__"
>> method, it passes it to min as min([value], **kwds) instead of min(value,
>> **kwds).
>
> But there are many iterable objects which are also comparable (hence
> it makes sense to consider their min/max), for example strings.
>
> So we get:
>
>       xmin("foo", "bar", "baz") == "bar"
>       xmin("foo", "bar") == "bar"
>
> but:
>
>      xmin("foo") == "f"
>
> This will create havoc in your running min routine.
>
> (Notice the same will hold for min() but at least you know that min(x)
> considers x as an iterable and complains if it isn't)

Yes

There doesn't seem to be a way to generalize min/max to handle all 
the cases without knowing the context.

So in a coroutine version of Tal's class, you would need to pass a hint 
along with the value.

Ron



From g.brandl at gmx.net  Fri Oct 15 22:52:43 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 15 Oct 2010 22:52:43 +0200
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <4CB8B2F5.2020507@ronadam.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>	<201010111017.56101.steve@pearwood.info>	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>	<4CB7B7C2.8090401@ronadam.com>	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>	<4CB88BD2.4010901@ronadam.com>	<i9a2u9$q8k$1@dough.gmane.org>	<4CB898CD.6000207@ronadam.com>	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>
	<4CB8B2F5.2020507@ronadam.com>
Message-ID: <i9aevm$hu7$1@dough.gmane.org>

Am 15.10.2010 22:00, schrieb Ron Adam:

>> (Notice the same will hold for min() but at least you know that min(x)
>> considers x as an iterable and complains if it isn't)
> 
> Yes
> 
> There doesn't seem to be a way to generalize min/max to handle all 
> the cases without knowing the context.

I give up.  You see an issue where there is none.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From masklinn at masklinn.net  Fri Oct 15 22:56:46 2010
From: masklinn at masklinn.net (Masklinn)
Date: Fri, 15 Oct 2010 22:56:46 +0200
Subject: [Python-ideas] Fwd:  stats module  Was: minmax() function ...
In-Reply-To: <379FDE7A-D612-4701-8E11-2E9A36EFF3CB@gmail.com>
References: <1F769A68-3C17-482B-A252-7DB6BFF7F37B@gmail.com>
	<379FDE7A-D612-4701-8E11-2E9A36EFF3CB@gmail.com>
Message-ID: <FB71C654-145D-41B7-B187-9A4A04000EEF@masklinn.net>

On 2010-10-15, at 22:01 , Raymond Hettinger wrote:
> Drat.  This should have gone to python-ideas.
> Re-sending.
> 
> Begin forwarded message:
> 
>> From: Raymond Hettinger <raymond.hettinger at gmail.com>
>> Date: October 15, 2010 1:00:16 PM PDT
>> To: Python-Dev Dev <python-dev at python.org>
>> Subject: Fwd: [Python-ideas] stats module Was: minmax() function ...
>> 
>> Hello guys.  If you don't mind, I would like to hijack your thread :-)
>> 
>> ISTM, that the minmax() idea is really just an optimization request.
>> A single-pass minmax() is easily coded in simple, pure-python,
>> so really the discussion is about how to remove the loop overhead
>> (there isn't much you can do about the cost of the two compares
>> which is where most of the time would be spent anyway).
>> 
>> My suggestion is to aim higher.   There is no reason a single pass
>> couldn't also return min/max/len/sum and perhaps even other summary
>> statistics like sum(x**2) so that you can compute standard deviation 
>> and variance.
>> 
>> A few years ago, Guido and other python devvers supported a
>> proposal I made to create a stats module, but I didn't have time
>> to develop it.  The basic idea was that python's batteries should
>> include most of the functionality available on advanced student
>> calculators.  Another idea behind it was that we could invisibly
>> do the right thing under the hood to help users avoid numerical
>> problems (i.e. math.fsum(s)/len(s) is a more accurate way to
>> compute an average because it doesn't lose precision when
>> building-up the intermediate sums).
>> 
>> I think the creativity and energy of this group is much better directed
>> at building a quality stats module (perhaps with some R-like capabilities).
>> That would likely be a better use of energy than bike-shedding 
>> about ways to speed-up a trivial piece of code that is ultimately
>> constrained by the cost of the compares per item.
>> 
>> my-two-cents,
>> 
>> 
>> Raymond

I think I'd still go with composable coroutines, the kind of thing dabeaz shows/promotes in his training sessions. Maybe with a higher-level interface making their usage easier, but they seem a perfect fit for tasks where you create arbitrary data pipes, including forks and joins.

As others mentioned, generator-based coroutines in Python have to be primed (by calling next() once on them), which is kind of a pain, but the decorator to "fix" that is easy enough to write.
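
[As an illustration, here is a small fork sketch in the dabeaz style,
built on the ``consumer`` decorator sketched earlier in this thread; all
names are invented for the example:

@consumer
def broadcast(targets):
    # Fork: push every incoming value to several downstream coroutines.
    while True:
        value = yield
        for target in targets:
            target.send(value)

@consumer
def collect(results, key):
    # Sink: accumulate each incoming value under the given key.
    while True:
        results.setdefault(key, []).append((yield))

results = {}
pipe = broadcast([collect(results, "a"), collect(results, "b")])
for value in [3, 1, 4]:
    pipe.send(value)
# results is now {"a": [3, 1, 4], "b": [3, 1, 4]}
]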

From steve at pearwood.info  Sat Oct 16 02:11:21 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 16 Oct 2010 11:11:21 +1100
Subject: [Python-ideas] stats module Was: minmax() function ...
Message-ID: <201010161111.21847.steve@pearwood.info>

Seconding Raymond's 'drat'. Resending to python-ideas.


On Sat, 16 Oct 2010 07:00:16 am Raymond Hettinger wrote:
> Hello guys.  If you don't mind, I would like to hijack your thread
> :-)

Please do :)


> A few years ago, Guido and other python devvers supported a
> proposal I made to create a stats module, but I didn't have time
> to develop it.
[...] 
> I think the creativity and energy of this group is much better
> directed at building a quality stats module (perhaps with some R-like
> capabilities).

+1

Are you still interested in working on it, or is this a subtle hint that 
somebody else should do so?


-- 
Steven D'Aprano


From raymond.hettinger at gmail.com  Sat Oct 16 02:33:02 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Fri, 15 Oct 2010 17:33:02 -0700
Subject: [Python-ideas] stats module Was: minmax() function ...
In-Reply-To: <201010161111.21847.steve@pearwood.info>
References: <201010161111.21847.steve@pearwood.info>
Message-ID: <622121A3-6A51-4735-A292-9F82502BB623@gmail.com>


On Oct 15, 2010, at 5:11 PM, Steven D'Aprano wrote:

>> A few years ago, Guido and other python devvers supported a
>> proposal I made to create a stats module, but I didn't have time
>> to develop it.
> [...] 
>> I think the creativity and energy of this group is much better
>> directed at building a quality stats module (perhaps with some R-like
>> capabilities).
> 
> +1
> 
> Are you still interested in working on it, or is this a subtle hint that 
> somebody else should do so?

Hmm, perhaps this would be less subtle:
HEY, WHY DON'T YOU GUYS GO TO WORK ON A STATS MODULE!

There, that should do it :-)


Raymond



From sunqiang at gmail.com  Sat Oct 16 02:49:15 2010
From: sunqiang at gmail.com (sunqiang)
Date: Sat, 16 Oct 2010 08:49:15 +0800
Subject: [Python-ideas] [Python-Dev] Fwd: stats module Was: minmax()
 function ...
In-Reply-To: <AANLkTikRrz+VJYR_VyC4ssEUg=R7e_8=xa6NrbbEy72H@mail.gmail.com>
References: <AANLkTi=Qawvg8rUDmX6UUrTtKsuRrj1+j2R1K+pHeGTT@mail.gmail.com>
	<1F769A68-3C17-482B-A252-7DB6BFF7F37B@gmail.com>
	<AANLkTikRrz+VJYR_VyC4ssEUg=R7e_8=xa6NrbbEy72H@mail.gmail.com>
Message-ID: <AANLkTinB0z7izv5WtJD40Yk9thgK1f-1tURmtm=9n8WZ@mail.gmail.com>

On Sat, Oct 16, 2010 at 8:05 AM, geremy condra <debatem1 at gmail.com> wrote:
> On Fri, Oct 15, 2010 at 1:00 PM, Raymond Hettinger
> <raymond.hettinger at gmail.com> wrote:
>> Hello guys.  If you don't mind, I would like to hijack your thread :-)
>>
>> ISTM, that the minmax() idea is really just an optimization request.
>> A single-pass minmax() is easily coded in simple, pure-python,
>> so really the discussion is about how to remove the loop overhead
>> (there isn't much you can do about the cost of the two compares
>> which is where most of the time would be spent anyway).
>>
>> My suggestion is to aim higher.   There is no reason a single pass
>> couldn't also return min/max/len/sum and perhaps even other summary
>> statistics like sum(x**2) so that you can compute standard deviation
>> and variance.
>
> +1 from me. Here's a normal cdf and chi squared cdf approximation I
> use for randomness testing. They may need to be refined for inclusion,
> but you're welcome to use them if you'd like.
>
> from math import sqrt, erf
>
> def normal_cdf(x, mu=0, sigma=1):
>     """Approximates the normal cumulative distribution"""
>     return (1/2) * (1 + erf((x-mu)/(sigma*sqrt(2))))
>
> def chi_squared_cdf(x, k):
>     """Approximates the cumulative chi-squared statistic with k degrees
>     of freedom (Wilson-Hilferty approximation)."""
>     numerator = ((x/k)**(1/3)) - (1 - (2/(9*k)))
>     denominator = (1/3) * sqrt(2/k)
>     return normal_cdf(numerator/denominator)
>
>> A few years ago, Guido and other python devvers supported a
>> proposal I made to create a stats module, but I didn't have time
>> to develop it.  The basic idea was that python's batteries should
>> include most of the functionality available on advanced student
>> calculators.  Another idea behind it was that we could invisibly
>> do the right thing under the hood to help users avoid numerical
>> problems (i.e. math.fsum(s)/len(s) is a more accurate way to
>> compute an average because it doesn't lose precision when
>> building-up the intermediate sums).
>
> Can you give some other examples? Sage does some of this and I
> frequently find it annoying, actually, but I'm not sure if you're
> referring to the same things there.
I saw a blog post[1] several months ago via reddit[2]; it may be
worth reading.
[1]: http://www.johndcook.com/blog/2010/06/07/math-library-functions-that-seem-unnecessary/
[2]: http://www.reddit.com/r/programming/comments/ccbja/math_library_functions_that_seem_unnecessary/

> Geremy Condra
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/sunqiang%40gmail.com
>


From rrr at ronadam.com  Sat Oct 16 07:31:36 2010
From: rrr at ronadam.com (Ron Adam)
Date: Sat, 16 Oct 2010 00:31:36 -0500
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <i9aevm$hu7$1@dough.gmane.org>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>	<201010111017.56101.steve@pearwood.info>	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>	<4CB7B7C2.8090401@ronadam.com>	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>	<4CB88BD2.4010901@ronadam.com>	<i9a2u9$q8k$1@dough.gmane.org>	<4CB898CD.6000207@ronadam.com>	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>	<4CB8B2F5.2020507@ronadam.com>
	<i9aevm$hu7$1@dough.gmane.org>
Message-ID: <4CB938B8.4050709@ronadam.com>


On 10/15/2010 03:52 PM, Georg Brandl wrote:
> Am 15.10.2010 22:00, schrieb Ron Adam:
>
>>> (Notice the same will hold for min() but at least you know that min(x)
>>> considers x as an iterable and complains if it isn't)
>>
>> Yes
>>
>> There doesn't seem to be a way to generalize min/max to handle all
>> the cases without knowing the context.
>
> I give up.  You see an issue where there is none.

Sorry for the delay, I was away for the day...

Thanks for trying, Georg -- it really wasn't an issue.  I was thinking about 
it from the point of view of whether it would be possible to make min and max 
easier to use in indirect ways.

As I found out, those functions depend on both the number of arguments and 
the context they are used in to do the right thing.  Change either and you 
may get unexpected results.

In the example where *args was used... I had left out the function def of 
min(*args, **kwds), where you would have seen that args was just unpacking 
the arguments, and not the list object being passed to min.  My mistake.

Cheers,
    Ron






From taleinat at gmail.com  Sat Oct 16 12:59:26 2010
From: taleinat at gmail.com (Tal Einat)
Date: Sat, 16 Oct 2010 12:59:26 +0200
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <201010111017.56101.steve@pearwood.info>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
Message-ID: <AANLkTi=TB3er03PEeR3kQvFgJDk4-eZs94JMBxSm3ETD@mail.gmail.com>

On Mon, Oct 11, 2010 at 1:17 AM, Steven D'Aprano wrote:
> On Mon, 11 Oct 2010 05:57:21 am Paul McGuire wrote:
>> Just as an exercise, I wanted to try my hand at adding a function to
>> the compiled Python C code.  An interesting optimization that I read
>> about (where? don't recall) finds the minimum and maximum elements of
>> a sequence in a single pass, with a 25% reduction in number of
>> comparison operations:
>> - the sequence elements are read in pairs
>> - each pair is compared to find smaller/greater
>> - the smaller is compared to current min
>> - the greater is compared to current max
>>
>> So each pair is applied to the running min/max values using 3
>> comparisons, vs. 4 that would be required if both were compared to
>> both min and max.
>>
>> This feels somewhat similar to how divmod returns both quotient and
>> remainder of a single division operation.
>>
>> This would be potentially interesting for those cases where min and
>> max are invoked on the same sequence one after the other, and
>> especially so if the sequence elements were objects with expensive
>> comparison operations.
>
> Perhaps more importantly, it is ideal for the use-case where you have an
> iterator. You can't call min() and then max(), as min() consumes the
> iterator leaving nothing for max(). It may be undesirable to convert
> the iterator to a list first -- it may be that the number of items in
> the data stream is too large to fit into memory all at once, but even
> if it is small, it means you're now walking the stream three times when
> one would do.
>
> To my mind, minmax() is as obvious and as useful a built-in as divmod(),
> but if there is resistance to making such a function a built-in,
> perhaps it could go into itertools. (I would prefer it to keep the same
> signature as min() and max(), namely that it will take either a single
> iterable argument or multiple arguments.)
>
> I've experimented with minmax() myself. Not surprisingly, the
> performance of a pure Python version doesn't even come close to the
> built-ins.
>
> I'm +1 on the idea.
>
> Presumably follow-ups should go to python-ideas.

The discussion that followed has digressed quite a bit, but
I'd like to mention that I'm +1 on having an efficient minmax()
function available.

- Tal


From jan.koprowski at gmail.com  Sun Oct 17 09:27:37 2010
From: jan.koprowski at gmail.com (Jan Koprowski)
Date: Sun, 17 Oct 2010 09:27:37 +0200
Subject: [Python-ideas] dict.hash - optimized per module
Message-ID: <AANLkTimcwveyJgYECa6ozGvb9K2_P4PJv_41-XzqOnno@mail.gmail.com>

Hi,

  My name is Jan and this is my first post on this group. So hello :)
  I'm very sorry if my idea is so naive as to be ridiculous, but I
believe it is worth asking.
  I just watched "The Mighty Dictionary" conference video from
Atlanta, delivered by Brandon Craig Rhodes.
  Afterwards I made a graph of __builtin__.__dict__ using dictinfo,
the library presented in the talk.
  When I saw a few collisions I thought, "Why doesn't this module have
its own hash function implementation that avoids collisions in this
set of names?". My second thought was, "Why doesn't each Python module
have its own internal hash function that produces no collisions within
the scope of its names?". Maybe my thoughts are silly, but wouldn't
this speed up Python a little? I'm aware this doesn't work for the
locals or globals dict, but it may be an improvement in places where
the set of names is constant or predictable, like the built-in Python
modules. What do you think?

Greetings from Poland,
-- 
><> Jan Koprowski


From steve at pearwood.info  Sun Oct 17 11:41:34 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 17 Oct 2010 20:41:34 +1100
Subject: [Python-ideas] dict.hash - optimized per module
In-Reply-To: <AANLkTimcwveyJgYECa6ozGvb9K2_P4PJv_41-XzqOnno@mail.gmail.com>
References: <AANLkTimcwveyJgYECa6ozGvb9K2_P4PJv_41-XzqOnno@mail.gmail.com>
Message-ID: <201010172041.34847.steve@pearwood.info>

On Sun, 17 Oct 2010 06:27:37 pm Jan Koprowski wrote:

>   Afterwards I made a graph of __builtin__.__dict__ using dictinfo,
> the library presented in the talk.
>   When I saw a few collisions I thought, "Why doesn't this module have
> its own hash function implementation that avoids collisions in this
> set of names?". 

Python 2.6 has 143 builtin names, and zero collisions:

>>> hashes = {}
>>> import __builtin__
>>> for name in __builtin__.__dict__:
...     h = hash(name)
...     if h in hashes: print "Collision for", name
...     L = hashes.setdefault(h, [])
...     L.append(name)
...
>>> len(hashes)
143
>>> filter(lambda x: len(x) > 1, hashes.values())
[]
>>> next(hashes.iteritems())
(29257728, ['bytearray'])



> My second thought was, "Why doesn't each 
> Python module have its own internal hash function that produces no
> collisions within the scope of its names?".

Firstly, the occasional collision doesn't matter much.

Secondly, your idea would mean that every module would need its own 
custom-made hash function. Writing good hash functions is hard. The 
Python hash function is very, very good. Expecting developers to 
produce *dozens* of hash functions equally as good is totally 
impractical.


-- 
Steven D'Aprano


From pyideas at rebertia.com  Sun Oct 17 11:52:27 2010
From: pyideas at rebertia.com (Chris Rebert)
Date: Sun, 17 Oct 2010 02:52:27 -0700
Subject: [Python-ideas] dict.hash - optimized per module
In-Reply-To: <201010172041.34847.steve@pearwood.info>
References: <AANLkTimcwveyJgYECa6ozGvb9K2_P4PJv_41-XzqOnno@mail.gmail.com>
	<201010172041.34847.steve@pearwood.info>
Message-ID: <AANLkTinqq9GSQzYHUuhMcB=B1tRxs=frfGU8k5gq9XN=@mail.gmail.com>

On Sun, Oct 17, 2010 at 2:41 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Sun, 17 Oct 2010 06:27:37 pm Jan Koprowski wrote:
>>   Afterwards I made a graph of __builtin__.__dict__ using dictinfo,
>> the library presented in the talk.
>>   When I saw a few collisions I thought, "Why doesn't this module have
>> its own hash function implementation that avoids collisions in this
>> set of names?".
<snip>
> Firstly, the occasional collision doesn't matter much.
>
> Secondly, your idea would mean that every module would need its own
> custom-made hash function. Writing good hash functions is hard. The
> Python hash function is very, very good. Expecting developers to
> produce *dozens* of hash functions equally as good is totally
> impractical.

Actually, there's already software to automatically generate such
functions; e.g. http://www.gnu.org/software/gperf/
Not that this makes the suggestion any more tractable though.

Cheers,
Chris


From solipsis at pitrou.net  Sun Oct 17 12:52:40 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 17 Oct 2010 12:52:40 +0200
Subject: [Python-ideas] dict.hash - optimized per module
References: <AANLkTimcwveyJgYECa6ozGvb9K2_P4PJv_41-XzqOnno@mail.gmail.com>
	<201010172041.34847.steve@pearwood.info>
Message-ID: <20101017125240.0ef893ee@pitrou.net>

On Sun, 17 Oct 2010 20:41:34 +1100
Steven D'Aprano <steve at pearwood.info> wrote:
> On Sun, 17 Oct 2010 06:27:37 pm Jan Koprowski wrote:
> 
> >   Afterwards I made a graph of __builtin__.__dict__ using dictinfo,
> > the library presented in the talk.
> >   When I saw a few collisions I thought, "Why doesn't this module have
> > its own hash function implementation that avoids collisions in this
> > set of names?". 
> 
> Python 2.6 has 143 builtin names, and zero collisions:

It depends what you call collisions. Collisions during bucket lookup, or
during hash value comparison (that is, after you selected a bucket)?

For the former, here is the calculation assuming an overallocation
factor of 4 (which, IIRC, is the one used in the dict implementation):

>>> import builtins
>>> d = builtins.__dict__
>>> m = len(d) * 4
>>> hashes = {}
>>> for name in d:
...   h = hash(name) % m
...   if h in hashes: print("Collision for", name)
...   hashes.setdefault(h, []).append(name)
... 
Collision for True
Collision for FutureWarning
Collision for license
Collision for KeyboardInterrupt
Collision for UserWarning
Collision for RuntimeError
Collision for MemoryError
Collision for Ellipsis
Collision for UnicodeError
Collision for Exception
Collision for tuple
Collision for delattr
Collision for setattr
Collision for ArithmeticError
Collision for property
Collision for KeyError
Collision for PendingDeprecationWarning
Collision for map
Collision for AssertionError
>>> len(d)
130
>>> len(hashes)
110


> > My second thought was, "Why doesn't each 
> > Python module have its own internal hash function that produces no
> > collisions within the scope of its names?".

The real answer here is that Python needs hash values to be
globally valid. Both for semantics (module dicts are regular dicts and
should be usable as such), and for efficiency (having a unique hash
function means the precalculated hash value can be stored for critical
types such as str).

Regards

Antoine.
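
[The caching Antoine mentions is easy to observe in CPython, where a str
object stores its hash after the first computation; the second call
below is effectively free no matter how long the string is:

import timeit

s = "x" * 10 ** 7
print(timeit.timeit(lambda: hash(s), number=1))  # hashes all 10**7 chars
print(timeit.timeit(lambda: hash(s), number=1))  # returns the cached value
]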




From steve at pearwood.info  Sun Oct 17 18:57:58 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 18 Oct 2010 03:57:58 +1100
Subject: [Python-ideas] stats module Was: minmax() function ...
In-Reply-To: <622121A3-6A51-4735-A292-9F82502BB623@gmail.com>
References: <201010161111.21847.steve@pearwood.info>
	<622121A3-6A51-4735-A292-9F82502BB623@gmail.com>
Message-ID: <201010180357.59264.steve@pearwood.info>

On Sat, 16 Oct 2010 11:33:02 am Raymond Hettinger wrote:
> > Are you still interested in working on it, or is this a subtle hint
> > that somebody else should do so?
>
> Hmm, perhaps this would be less subtle:
> HEY, WHY DON'T YOU GUYS GO TO WORK ON A STATS MODULE!


http://pypi.python.org/pypi/stats

It is not even close to production ready. It needs unit tests. The API 
should be considered unstable. There's no 3.x version yet. Obviously it 
has no real-world usage. But if anyone would like to contribute, 
critique or criticize, I welcome feedback or assistance, or even just 
encouragement.



-- 
Steven D'Aprano


From daniel at stutzbachenterprises.com  Sun Oct 17 19:05:33 2010
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Sun, 17 Oct 2010 12:05:33 -0500
Subject: [Python-ideas] stats module Was: minmax() function ...
In-Reply-To: <201010180357.59264.steve@pearwood.info>
References: <201010161111.21847.steve@pearwood.info>
	<622121A3-6A51-4735-A292-9F82502BB623@gmail.com>
	<201010180357.59264.steve@pearwood.info>
Message-ID: <AANLkTine74vMurh+xnPGjEEU00BV+ftKvZFUjdGUm=EV@mail.gmail.com>

On Sun, Oct 17, 2010 at 11:57 AM, Steven D'Aprano <steve at pearwood.info>wrote:

> It is not even close to production ready. It needs unit tests. The API
> should be considered unstable. There's no 3.x version yet. Obviously it
> has no real-world usage. But if anyone would like to contribute,
> critique or criticize, I welcome feedback or assistance, or even just
> encouragement.
>

Would you consider hosting it on BitBucket or GitHub?  It would make
collaboration easier.

-- 
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com/>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20101017/d2b64de5/attachment.html>

From steve at pearwood.info  Sun Oct 17 19:16:42 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 18 Oct 2010 04:16:42 +1100
Subject: [Python-ideas] stats module Was: minmax() function ...
In-Reply-To: <AANLkTine74vMurh+xnPGjEEU00BV+ftKvZFUjdGUm=EV@mail.gmail.com>
References: <201010161111.21847.steve@pearwood.info>
	<201010180357.59264.steve@pearwood.info>
	<AANLkTine74vMurh+xnPGjEEU00BV+ftKvZFUjdGUm=EV@mail.gmail.com>
Message-ID: <201010180416.43278.steve@pearwood.info>

On Mon, 18 Oct 2010 04:05:33 am Daniel Stutzbach wrote:
> On Sun, Oct 17, 2010 at 11:57 AM, Steven D'Aprano 
<steve at pearwood.info>wrote:
> > It is not even close to production ready. It needs unit tests. The
> > API should be considered unstable. There's no 3.x version yet.
> > Obviously it has no real-world usage. But if anyone would like to
> > contribute, critique or criticize, I welcome feedback or
> > assistance, or even just encouragement.
>
> Would you consider hosting it on BitBucket or GitHub?  It would make
> collaboration easier.

Yes I would.

I suppose if I ask people for their preferred hosting provider, I'll get 
30 different opinions and start a flame-war... 


-- 
Steven D'Aprano


From masklinn at masklinn.net  Sun Oct 17 19:33:21 2010
From: masklinn at masklinn.net (Masklinn)
Date: Sun, 17 Oct 2010 19:33:21 +0200
Subject: [Python-ideas] stats module Was: minmax() function ...
In-Reply-To: <201010180416.43278.steve@pearwood.info>
References: <201010161111.21847.steve@pearwood.info>
	<201010180357.59264.steve@pearwood.info>
	<AANLkTine74vMurh+xnPGjEEU00BV+ftKvZFUjdGUm=EV@mail.gmail.com>
	<201010180416.43278.steve@pearwood.info>
Message-ID: <932D6642-B025-43D5-A9B0-BBA53B777FC8@masklinn.net>

On 2010-10-17, at 19:16 , Steven D'Aprano wrote:
> On Mon, 18 Oct 2010 04:05:33 am Daniel Stutzbach wrote:
>> On Sun, Oct 17, 2010 at 11:57 AM, Steven D'Aprano 
> <steve at pearwood.info>wrote:
>>> It is not even close to production ready. It needs unit tests. The
>>> API should be considered unstable. There's no 3.x version yet.
>>> Obviously it has no real-world usage. But if anyone would like to
>>> contribute, critique or criticize, I welcome feedback or
>>> assistance, or even just encouragement.
>> Would you consider hosting it on BitBucket or GitHub?  It would make
>> collaboration easier.
> Yes I would.
> 
> I suppose if I ask people for their preferred hosting provider, I'll get 
> 30 different opinions and start a flame-war...
If you're a bit bored, you can always host on both at the same time via hg-git [0]. 99% of the population[1] should be happy enough if it's available through both git and mercurial.

[0] http://hg-git.github.com/
[1] yep, really, I didn't make that up at all.

From debatem1 at gmail.com  Sun Oct 17 19:36:28 2010
From: debatem1 at gmail.com (geremy condra)
Date: Sun, 17 Oct 2010 10:36:28 -0700
Subject: [Python-ideas] stats module Was: minmax() function ...
In-Reply-To: <201010180416.43278.steve@pearwood.info>
References: <201010161111.21847.steve@pearwood.info>
	<201010180357.59264.steve@pearwood.info>
	<AANLkTine74vMurh+xnPGjEEU00BV+ftKvZFUjdGUm=EV@mail.gmail.com>
	<201010180416.43278.steve@pearwood.info>
Message-ID: <AANLkTi=mF=2kLwPwyzMtYbSMTTQOrFt7aAZ4g57OqOZj@mail.gmail.com>

On Sun, Oct 17, 2010 at 10:16 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Mon, 18 Oct 2010 04:05:33 am Daniel Stutzbach wrote:
>> On Sun, Oct 17, 2010 at 11:57 AM, Steven D'Aprano
> <steve at pearwood.info>wrote:
>> > It is not even close to production ready. It needs unit tests. The
>> > API should be considered unstable. There's no 3.x version yet.
>> > Obviously it has no real-world usage. But if anyone would like to
>> > contribute, critique or criticize, I welcome feedback or
>> > assistance, or even just encouragement.
>>
>> Would you consider hosting it on BitBucket or GitHub?  It would make
>> collaboration easier.
>
> Yes I would.
>
> I suppose if I ask people for their preferred hosting provider, I'll get
> 30 different opinions and start a flame-war...

Like that's ever stopped you ;)

I've been working on this as well, and somehow wound up with a totally
different module. I'll have mine up someplace later tonight, but we
should consider merging them.

Geremy Condra


From tjreedy at udel.edu  Mon Oct 18 00:10:13 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 17 Oct 2010 18:10:13 -0400
Subject: [Python-ideas] dict.hash - optimized per module
In-Reply-To: <AANLkTimcwveyJgYECa6ozGvb9K2_P4PJv_41-XzqOnno@mail.gmail.com>
References: <AANLkTimcwveyJgYECa6ozGvb9K2_P4PJv_41-XzqOnno@mail.gmail.com>
Message-ID: <i9fs87$7nh$1@dough.gmane.org>

On 10/17/2010 3:27 AM, Jan Koprowski wrote:
> Hi,
>
>    My name is Jan and this is my first post on this group. So hello :)
>    I'm very sorry if my idea is so naive as to be ridiculous, but I
> believe it is worth asking.

Worth asking but not worth doing (or, in a sense, already done for 
function local namespaces).

As Antoine said, strings have their hash computed just once. Recomputing 
a namespace-dependent hash for each lookup would cost far more than 
the occasional collision.

For function local names, names are assigned an index at compile time so 
that runtime lookup is a super-quick index operation. If you want, call 
it perfect hashing with hashes computed once at compile time ;-).
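
[This is easy to see with the dis module: local names compile to
index-based LOAD_FAST/STORE_FAST opcodes, while globals and builtins go
through name-based LOAD_GLOBAL lookups.  A quick demonstration:

import dis

def f():
    x = 1            # STORE_FAST 0 (x): an array index, not a hash lookup
    return len([x])  # LOAD_GLOBAL (len), then LOAD_FAST 0 (x)

dis.dis(f)
]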

-- 
Terry Jan Reedy



From merwok at netwok.org  Mon Oct 18 14:45:14 2010
From: merwok at netwok.org (Éric Araujo)
Date: Mon, 18 Oct 2010 14:45:14 +0200
Subject: [Python-ideas] stats module Was: minmax() function ...
In-Reply-To: <201010180357.59264.steve@pearwood.info>
References: <201010161111.21847.steve@pearwood.info>	<622121A3-6A51-4735-A292-9F82502BB623@gmail.com>
	<201010180357.59264.steve@pearwood.info>
Message-ID: <4CBC415A.6040701@netwok.org>

[Sorry if this comes twice, connection errors here]

> http://pypi.python.org/pypi/stats

Isn't it a potential source of errors that the module name is so close
to that of stat?

Regards



From merwok at netwok.org  Sun Oct 17 19:30:32 2010
From: merwok at netwok.org (Éric Araujo)
Date: Sun, 17 Oct 2010 19:30:32 +0200
Subject: [Python-ideas] stats module Was: minmax() function ...
In-Reply-To: <201010180357.59264.steve@pearwood.info>
References: <201010161111.21847.steve@pearwood.info>	<622121A3-6A51-4735-A292-9F82502BB623@gmail.com>
	<201010180357.59264.steve@pearwood.info>
Message-ID: <4CBB32B8.6080605@netwok.org>

> http://pypi.python.org/pypi/stats

Isn't it a potential source of errors that the module name is so close
to that of stat?

Regards



From daniel at stutzbachenterprises.com  Mon Oct 18 22:48:35 2010
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Mon, 18 Oct 2010 15:48:35 -0500
Subject: [Python-ideas] stats module Was: minmax() function ...
In-Reply-To: <201010180416.43278.steve@pearwood.info>
References: <201010161111.21847.steve@pearwood.info>
	<201010180357.59264.steve@pearwood.info>
	<AANLkTine74vMurh+xnPGjEEU00BV+ftKvZFUjdGUm=EV@mail.gmail.com>
	<201010180416.43278.steve@pearwood.info>
Message-ID: <AANLkTimG7+zVMiS06KZARNhpzR9ZbG25J2LS4_oiKbdA@mail.gmail.com>

On Sun, Oct 17, 2010 at 12:16 PM, Steven D'Aprano <steve at pearwood.info>wrote:

> On Mon, 18 Oct 2010 04:05:33 am Daniel Stutzbach wrote:
> > Would you consider hosting it on BitBucket or GitHub?  It would make
> > collaboration easier.
>
> Yes I would.
>
> I suppose if I ask people for their preferred hosting provider, I'll get
> 30 different opinions and start a flame-war...


That's likely, yes.  Just make an executive decision.  :-)

-- 
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com/>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20101018/68325ca0/attachment.html>

From steve at pearwood.info  Mon Oct 18 23:57:09 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 19 Oct 2010 08:57:09 +1100
Subject: [Python-ideas] stats module Was: minmax() function ...
In-Reply-To: <4CBC415A.6040701@netwok.org>
References: <201010161111.21847.steve@pearwood.info>
	<201010180357.59264.steve@pearwood.info>
	<4CBC415A.6040701@netwok.org>
Message-ID: <201010190857.09731.steve@pearwood.info>

On Mon, 18 Oct 2010 11:45:14 pm Éric Araujo wrote:
> [Sorry if this comes twice, connection errors here]
>
> > http://pypi.python.org/pypi/stats
>
> Isn't it a potential source of errors that the module name is so
> close to that of stat?

The name is not set in stone.

Any name is likely to lead to potential errors -- it took me five years 
to stop writing "import maths", and I still never remember whether I 
want to import date or datetime. But it's not a critical error -- it's 
pretty obvious when you've imported the wrong module.


-- 
Steven D'Aprano


From pingebre at yahoo.com  Tue Oct 19 21:17:13 2010
From: pingebre at yahoo.com (Peter Ingebretson)
Date: Tue, 19 Oct 2010 12:17:13 -0700 (PDT)
Subject: [Python-ideas] Proposal for an enhanced reload mechanism
Message-ID: <944134.31586.qm@web34408.mail.mud.yahoo.com>

The builtin reload function is very useful for iterative development, but it is also limited.  Because references to types and functions in the old version of the module may persist after reloading, the builtin reload function is typically only useful in simple use cases.

This is a proposal (pre-PEP?) for an enhanced reloading mechanism especially designed for iterative development:

https://docs.google.com/document/pub?id=1GeVVC0pXTz1O6cK5mo-EaOJFqrL3PErO4okmHBlTeuw

The basic plan is to use the existing cycle-detecting GC to remap references from objects in the old module to equivalent objects in the new module.
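
[For a rough feel of what "remapping" means, here is a pure-Python
approximation using the gc module's existing introspection; the proposed
gc.remap would do this in C, atomically and for all container types:

import gc

def remap(old, new):
    # Replace references to *old* with *new* in the dicts and lists that
    # refer to it.  Illustrative only: it misses tuples, sets and dict
    # keys, which the real proposal would handle.
    for referrer in gc.get_referrers(old):
        if isinstance(referrer, dict):
            for key, value in list(referrer.items()):
                if value is old:
                    referrer[key] = new
        elif isinstance(referrer, list):
            for i, value in enumerate(referrer):
                if value is old:
                    referrer[i] = new
]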

I have a patch against the current 3.2 branch that adds the gc.remap function (and unit tests, etc...) but not any of the additional reloading functionality.  I have a separate prototype of the reloading module as well, but it only implements a portion of the proposal (one module at a time, and dicts/sets are not fixed up).

A few questions:

1) Does this approach seem reasonable?  Has anyone tried something similar and run into unsolvable problems?

2) Would there be interest in a PEP for enhanced reloading?  I would be happy to rewrite the proposal in PEP form if people think it would be worthwhile.

3) Should I submit my gc.remap patch to the issue tracker?  Because the change to visitproc modifies the ABI I would like to get that portion of the proposal in before 3.2 goes final.  Since the bulk of the change is adding one method to the gc module I was hoping it might be accepted without requiring a PEP.



      


From ckaynor at zindagigames.com  Tue Oct 19 21:53:46 2010
From: ckaynor at zindagigames.com (Chris Kaynor)
Date: Tue, 19 Oct 2010 12:53:46 -0700
Subject: [Python-ideas] Proposal for an enhanced reload mechanism
In-Reply-To: <944134.31586.qm@web34408.mail.mud.yahoo.com>
References: <944134.31586.qm@web34408.mail.mud.yahoo.com>
Message-ID: <AANLkTimn-93ysar+en+OmE9-xtS4shRvbY4_GofVvTdd@mail.gmail.com>

On Tue, Oct 19, 2010 at 12:17 PM, Peter Ingebretson <pingebre at yahoo.com>wrote:

> The builtin reload function is very useful for iterative development, but
> it is also limited.  Because references to types and functions in the old
> version of the module may persist after reloading, the builtin reload
> function is typically only useful in simple use cases.
>
> This is a proposal (pre-PEP?) for an enhanced reloading mechanism
> especially designed for iterative development:
>
>
> https://docs.google.com/document/pub?id=1GeVVC0pXTz1O6cK5mo-EaOJFqrL3PErO4okmHBlTeuw
>
> The basic plan is to use the existing cycle-detecting GC to remap
> references from objects in the old module to equivalent objects in the new
> module.
>
> I have a patch against the current 3.2 branch that adds the gc.remap
> function (and unit tests, etc...) but not any of the additional reloading
> functionality.  I have a separate prototype of the reloading module as well,
> but it only implements a portion of the proposal (one module at a time, and
> dicts/sets are not fixed up).
>
> A few questions:
>
> 1) Does this approach seem reasonable?  Has anyone tried something similar
> and run into unsolvable problems?
>
> 2) Would there be interest in a PEP for enhanced reloading?  I would be
> happy to rewrite the proposal in PEP form if people think it would be
> worthwhile.
>
> 3) Should I submit my gc.remap patch to the issue tracker?  Because the
> change to visitproc modifies the ABI I would like to get that portion of the
> proposal in before 3.2 goes final.  Since the bulk of the change is adding
> one method to the gc module I was hoping it might be accepted without
> requiring a PEP.
>
>
>
What happens if you change the __init__ or __new__ methods of an object or
if you change a class's metaclass? It seems like those types of changes
would be impossible to propagate to existing objects, and without
propagating them any changes to existing objects may (or are likely to?)
break the object.


>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20101019/2764451f/attachment.html>

From pingebre at yahoo.com  Tue Oct 19 22:25:49 2010
From: pingebre at yahoo.com (Peter Ingebretson)
Date: Tue, 19 Oct 2010 13:25:49 -0700 (PDT)
Subject: [Python-ideas] Proposal for an enhanced reload mechanism
In-Reply-To: <AANLkTimn-93ysar+en+OmE9-xtS4shRvbY4_GofVvTdd@mail.gmail.com>
Message-ID: <741796.36646.qm@web34402.mail.mud.yahoo.com>

--- On Tue, 10/19/10, Chris Kaynor <ckaynor at zindagigames.com> wrote:
> > This is a proposal (pre-PEP?) for an enhanced reloading mechanism
> > especially designed for iterative development:
> >
> > https://docs.google.com/document/pub?id=1GeVVC0pXTz1O6cK5mo-EaOJFqrL3PErO4okmHBlTeuw
> >
> > The basic plan is to use the existing cycle-detecting GC to remap
> > references from objects in the old module to equivalent objects in
> > the new module.
>
> What happens if you change the __init__ or __new__ methods of an object
> or if you change a class's metaclass? It seems like those types of
> changes would be impossible to propagate to existing objects, and without
> propagating them any changes to existing objects may (or are likely to?)
> break the object.

Yes, this is a limitation of the approach.  More generally, any logic that
has already run and would execute differently with the reloaded module has
the potential to break things.

Even with this limitation I think the approach is still valuable.  I spend
far less time modifying __new__ methods and metaclasses than I spend
changing the implementation and API of other class- and module-level methods.

The issue of old instances not having members that are added in a new
__init__ is problematic, but there are several workarounds such as
temporarily wrapping the new member in a property, or potentially the
@reloadable decorator alluded to in the doc.


      
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20101019/9947798e/attachment.html>

From pingebre at yahoo.com  Tue Oct 19 22:46:24 2010
From: pingebre at yahoo.com (Peter Ingebretson)
Date: Tue, 19 Oct 2010 13:46:24 -0700 (PDT)
Subject: [Python-ideas] Proposal for an enhanced reload mechanism
Message-ID: <718804.70750.qm@web34406.mail.mud.yahoo.com>

(Sorry, I sent an html-formatted email by accident)

--- On Tue, 10/19/10, Chris Kaynor <ckaynor at zindagigames.com> wrote:
> > This is a proposal (pre-PEP?) for an enhanced reloading mechanism
> > especially designed for iterative development:
> >
> > https://docs.google.com/document/pub?id=1GeVVC0pXTz1O6cK5mo-EaOJFqrL3PErO4okmHBlTeuw
> >
> > The basic plan is to use the existing cycle-detecting GC to remap
> > references from objects in the old module to equivalent objects in
> > the new module.
>
> What happens if you change the __init__ or __new__ methods of an object
> or if you change a class's metaclass? It seems like those types of
> changes would be impossible to propagate to existing objects, and without
> propagating them any changes to existing objects may (or are likely to?)
> break the object.

Yes, this is a limitation of the approach.  More generally, any logic that
has already run and would execute differently with the reloaded module has
the potential to break things.

Even with this limitation I think the approach is still valuable.  I spend 
far less time modifying __new__ methods and metaclasses than I spend 
changing the implementation and API of other class- and module-level methods. 

The issue of old instances not having members that are added in a new 
__init__ is problematic, but there are several workarounds such as 
temporarily wrapping the new member in a property, or potentially the 
@reloadable decorator alluded to in the doc.
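
[The property workaround might look something like this; the class and
attribute here are hypothetical, not code from the proposal:

class Widget:
    # 'cache' was added to __init__ in the new version of the module;
    # instances created before the reload never ran that code, so the
    # attribute is created lazily on first access instead.
    @property
    def cache(self):
        try:
            return self._cache
        except AttributeError:
            self._cache = {}
            return self._cache
]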



      


From ziade.tarek at gmail.com  Tue Oct 19 23:26:04 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Tue, 19 Oct 2010 23:26:04 +0200
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
Message-ID: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>

Hello

There's one feature I want to add in distutils2: the develop command
setuptools provides. Basically it adds a "link" file into
site-packages, and does some magic at startup to load the path that is
contained in the link file. The use case is to be able to have a
project added in the python path without installing it.

I am not a huge fan of adding files in site-packages for this though,
and the magic it supposes. I thought of another mechanism: a
persistent list of paths site.py would load.

So the idea is to have two files:
- a site.cfg at the python level, with a persistent list of paths
- a .local/site.cfg at the user level for user-defined paths.

Then distutils2 would add/remove paths in these files in its develop command.

These files could contain paths and possibly also sitedirs.

Does this sound crazy?

Tarek
-- 
Tarek Ziadé | http://ziade.org
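
[A minimal sketch of how the proposed mechanism might work; the file
format and names are assumptions, since nothing here exists yet.  A
site.cfg would just list one directory per line:

    # ~/.local/site.cfg
    /home/me/dev/myproject
    /home/me/dev/otherlib/src

and site.py would load it at startup with something like:

import os
import site

def load_site_cfg(path):
    # Add every existing directory listed in *path* to sys.path,
    # processing any .pth files it contains along the way.
    if not os.path.exists(path):
        return
    with open(path) as f:
        for line in f:
            directory = line.strip()
            if not directory or directory.startswith("#"):
                continue
            if os.path.isdir(directory):
                site.addsitedir(directory)

load_site_cfg(os.path.expanduser("~/.local/site.cfg"))
]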


From ianb at colorstudy.com  Wed Oct 20 00:03:14 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 19 Oct 2010 17:03:14 -0500
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
Message-ID: <AANLkTi=+=A53ph6=axOutuTGCpTam2bkuij6qoviPEgX@mail.gmail.com>

On Tue, Oct 19, 2010 at 4:26 PM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:

> Hello
>
> There's one feature I want to add in distutils2: the develop command
> setuptools provides. Basically it adds a "link" file into
> site-packages, and does some magic at startup to load the path that is
> contained in the link file. The use case is to be able to have a
> project added in the python path without installing it.
>

The link file is a red herring -- setuptools adds an entry to
easy-install.pth that points to the directory.  It would work equally
well to add a .pth file for the specific package (though .pth files append
to the path, so if you already have a package installed and then a .pth file
pointing to a development version, it won't work as expected, hence the
magic in easy-install.pth).
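
[For readers who haven't used them: a .pth file is just a text file
dropped into site-packages; each line naming a directory is appended to
sys.path, and lines starting with "import" are executed at startup --
the latter is the "magic" easy-install.pth relies on.  A hypothetical
example:

    # site-packages/myproject.pth
    /home/me/dev/myproject
]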

-- 
Ian Bicking  |  http://blog.ianbicking.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20101019/e1fa87ea/attachment.html>

From ziade.tarek at gmail.com  Wed Oct 20 00:12:16 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Wed, 20 Oct 2010 00:12:16 +0200
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTi=+=A53ph6=axOutuTGCpTam2bkuij6qoviPEgX@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTi=+=A53ph6=axOutuTGCpTam2bkuij6qoviPEgX@mail.gmail.com>
Message-ID: <AANLkTimu+-mWVhCy4Env_c2JG5F6euVioQs5OcEG1DST@mail.gmail.com>

On Wed, Oct 20, 2010 at 12:03 AM, Ian Bicking <ianb at colorstudy.com> wrote:
> On Tue, Oct 19, 2010 at 4:26 PM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
>>
>> Hello
>>
>> There's one feature I want to add in distutils2: the develop command
>> setuptools provides. Basically it adds a "link" file into
>> site-packages, and does some magic at startup to load the path that is
>> contained in the link file. The use case is to be able to have a
>> project added in the python path without installing it.
>
> The link file is a red herring -- setuptools adds an entry to
> easy-install.pth that points to the directory.  It would work equally
> well to add a .pth file for the specific package (though .pth files append
> to the path, so if you already have a package installed and then a .pth file
> pointing to a development version, it won't work as expected, hence the
> magic in easy-install.pth).

Yes, or a develop.pth file containing those paths, like Carl proposed
on IRC. A .cfg file is not really needed, indeed.

But we would need to have the metadata built and stored somewhere --
maybe a specific directory for them.


> --
> Ian Bicking  |  http://blog.ianbicking.org
>



-- 
Tarek Ziadé | http://ziade.org


From p.f.moore at gmail.com  Wed Oct 20 11:57:03 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 20 Oct 2010 10:57:03 +0100
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
Message-ID: <AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>

On 19 October 2010 22:26, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> There's one feature I want to add in distutils2: the develop command
> setuptools provides. Basically it adds a "link" file into
> site-packages, and does some magic at startup to load the path that is
> contained in the link file. The use case is to be able to have a
> project added in the python path without installing it.

Can you explain the requirement in more detail? I don't use the
setuptools develop command, so I don't have the background, but it
seems to me that what you're proposing can be done simply by adding
the relevant directory to PYTHONPATH. That's all I ever do when
developing (but my needs are pretty simple, so there may well be
subtle problems with that approach).

Paul


From ziade.tarek at gmail.com  Wed Oct 20 15:36:23 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Wed, 20 Oct 2010 15:36:23 +0200
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
Message-ID: <AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>

On Wed, Oct 20, 2010 at 11:57 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 19 October 2010 22:26, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
>> There's one feature I want to add in distutils2: the develop command
>> setuptools provides. Basically it adds a "link" file into
>> site-packages, and does some magic at startup to load the path that is
>> contained in the link file. The use case is to be able to have a
>> project added in the python path without installing it.
>
> Can you explain the requirement in more detail? I don't use the
> setuptools develop command, so I don't have the background, but it
> seems to me that what you're proposing can be done simply by adding
> the relevant directory to PYTHONPATH. That's all I ever do when
> developing (but my needs are pretty simple, so there may well be
> subtle problems with that approach).

Sorry, that was vague indeed.

It goes a little bit farther than that: the project packages and
modules have to be found on the path, but we also need to publish the
project metadata that would be installed in a normal installation, so
our browsing/query APIs can find the project.

So, if a project 'Boo' has two packages 'foo' and 'bar' and a module
'baz.py', we need those on the path but also the Boo.dist-info
directory that is created at installation time (see PEP 376).
Setuptools' metadata directory is called Boo.egg-info, and distutils 1
has written a file called Boo.egg-info since Python 2.5.

And since a Python project can publish several top-level directories,
all of them need to be added to the path. So adding the current dir
to PYTHONPATH will not work in every case, even if the metadata are
built and dropped there.

I am not sure what would be the best way to handle this; maybe having
these metadata built in place, then listing all the paths that need to
be included and writing them to a .pth file Distutils2 manages.

So:

0. have a distutils2.pth file installed with distutils2

Then, to add the project in the path:

1. build the project metadata in-place
2. get the project paths by listing its packages and directories (by
invoking a pseudo-install command)
3. inject these paths in distutils2.pth

To remove it:

1. get the project paths by listing its packages and directories
2. remove these paths from distutils2.pth
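
Roughly, in code -- a hypothetical sketch of the add/remove steps, with
no error handling; the paths argument stands in for what the
pseudo-install command would compute:

    import os
    from distutils.sysconfig import get_python_lib

    PTH = os.path.join(get_python_lib(), 'distutils2.pth')

    def add_project(paths):
        # step 3: inject the paths, skipping those already present
        current = set()
        if os.path.exists(PTH):
            with open(PTH) as f:
                current = set(line.strip() for line in f)
        with open(PTH, 'a') as f:
            for path in paths:
                if path not in current:
                    f.write(path + '\n')

    def remove_project(paths):
        # removal step 2: drop the paths, keep any other entries
        wanted = set(paths)
        with open(PTH) as f:
            lines = [line for line in f if line.strip() not in wanted]
        with open(PTH, 'w') as f:
            f.writelines(lines)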

Another problem I see is that any module or package that is not listed
by the project, and that would not be installed in site-packages, might
end up on the path -- but that's probably not a huge issue.

The goal is to be able to avoid re-installing a project you are
working on every time you make a change, just to try it. This is used
a lot, in particular with virtualenv.

So in any case, it turns out .pth files are a good way to do this, so
I guess this thread does not belong on python-ideas anymore.

Cross-posting to the D2 mailing list to move it there!

Tarek

-- 
Tarek Ziadé | http://ziade.org


From ncoghlan at gmail.com  Wed Oct 20 15:38:56 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 20 Oct 2010 23:38:56 +1000
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
Message-ID: <AANLkTik9+mESTaPPbySYgtLuhJrGRTdFQjS0JPpyLeCf@mail.gmail.com>

On Wed, Oct 20, 2010 at 7:57 PM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 19 October 2010 22:26, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
>> There's one feature I want to add in distutils2: the develop command
>> setuptools provides. Basically it adds a "link" file into
>> site-packages, and does some magic at startup to load the path that is
>> contained in the link file. The use case is to be able to have a
>> project added in the python path without installing it.
>
> Can you explain the requirement in more detail? I don't use the
> setuptools develop command, so I don't have the background, but it
> seems to me that what you're proposing can be done simply by adding
> the relevant directory to PYTHONPATH. That's all I ever do when
> developing (but my needs are pretty simple, so there may well be
> subtle problems with that approach).

A different idea along these lines that I've been pondering is an
actual -p path option for the interpreter command line, that allowed a
sequence of directories to be provided that would be prepended to
PYTHONPATH (and hence included in sys.path).

So if you're wanting to test two different versions of a module (from
a parent directory containing the two versions in separate
subdirectories):

python -p versionA run_tests.py
python -p versionB run_tests.py

For more permanent additions to sys.path, PYTHONPATH (possibly in
conjunction with virtualenv) is a reasonable answer. Zipfile and
directory execution covers execution of more complex applications
containing multiple files as if they were simple scripts.

The main piece I see missing from the puzzle is the ability to easily
switch back and forth between multiple versions of a support package
or library without mucking with persistent state like the environment
variables or the filesystem.
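
In effect, -p would just be a prepend done before the script runs. A
minimal sketch of the intended behaviour, written as a wrapper since
the option itself doesn't exist today:

    import os
    import runpy
    import sys

    def run_with_path(extra_dir, script):
        # approximate effect of "python -p extra_dir script"
        sys.path.insert(0, os.path.abspath(extra_dir))
        runpy.run_path(script, run_name='__main__')

    # usage sketch: run_with_path('versionA', 'run_tests.py')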

Cheers,
Nick.

-- 
Nick Coghlan  |  ncoghlan at gmail.com  |  Brisbane, Australia


From p.f.moore at gmail.com  Wed Oct 20 16:00:54 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 20 Oct 2010 15:00:54 +0100
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
	<AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>
Message-ID: <AANLkTinDJ_W+uNNbHMkfvtsYabrZ+tmoH3EkFFH6_9xs@mail.gmail.com>

On 20 October 2010 14:36, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> It goes a little bit farther than that: the project packages and
> modules have to be found on the path, but we also need to publish the
> project metadata that would be installed in a normal installation, so
> our browsing/query APIs can find the project.

Maybe I'm still missing something, but are you saying that the
metadata query APIs don't respect PYTHONPATH? Is there a reason why
they can't?

> So, if a project 'Boo' has two packages 'foo' and 'bar' and a module
> 'baz.py', we need those on the path but also the Boo.dist-info
> directory that is created at installation time (see PEP 376).
> Setuptools' metadata directory is called Boo.egg-info, and distutils 1
> has written a file called Boo.egg-info since Python 2.5.

... and I'd expect the dist-info directory to be located by searching
PYTHONPATH.

> And since a Python project can publish several top-level directories,
> all of them need to be added to the path. So adding the current dir
> to PYTHONPATH will not work in every case, even if the metadata are
> built and dropped there.

So, project Foo publishes packages bar and baz.

    MyDir
        Foo
            __init__.py
            bar
                __init__.py
            baz
                __init__.py
        Foo-N.M-pyx.y.dist-info

(Is that right? I'm rusty on the structure. That's how it looks in Python 2.7)

So the directory MyDir is on PYTHONPATH. Then Foo.bar and Foo.baz are
visible, and the dist-info file is on PYTHONPATH for introspection.
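
Introspection then only has to walk sys.path looking for the suffix --
a minimal sketch, assuming the PEP 376 naming; a real API would also
parse the METADATA file inside each directory:

    import os
    import sys

    def iter_distinfo_dirs():
        for entry in sys.path:
            if not os.path.isdir(entry):
                continue
            for name in os.listdir(entry):
                if name.endswith('.dist-info'):
                    yield os.path.join(entry, name)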

If you're saying that Foo *isn't* a package itself, so Foo/__init__.py
doesn't exist, and bar and baz should be visible unqualified, then I
begin to see your issue (although my first reaction is to say "don't
do that, then" :-)). But don't you then just need to search *parents*
of elements of PYTHONPATH as well for the metadata search? If that's
an issue, then doesn't that mean you've got other problems with how
people structure their directories? Actually, I suspect my picture
above is wrong, as I can't honestly see that mandating that the
dist-info file be a *sibling* (in an arbitrarily cluttered directory)
of the project directory is sensible...

But I'm probably not seeing the real issues here.

All I would say is, don't let the needs of more unusual configurations
over-complicate basic usage.

Paul.


From ziade.tarek at gmail.com  Wed Oct 20 16:27:21 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Wed, 20 Oct 2010 16:27:21 +0200
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTinDJ_W+uNNbHMkfvtsYabrZ+tmoH3EkFFH6_9xs@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
	<AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>
	<AANLkTinDJ_W+uNNbHMkfvtsYabrZ+tmoH3EkFFH6_9xs@mail.gmail.com>
Message-ID: <AANLkTinF4t7HtanVvF0-h=hzg83KOZLvs3N8KzpK+uB-@mail.gmail.com>

On Wed, Oct 20, 2010 at 4:00 PM, Paul Moore <p.f.moore at gmail.com> wrote:
...
>
> If you're saying that Foo *isn't* a package itself, so Foo/__init__.py
> doesn't exist, and bar and baz should be visible unqualified, then I
> begin to see your issue (although my first reaction is to say "don't
> do that, then" :-)). But don't you then just need to search *parents*
> of elements of PYTHONPATH as well for the metadata search? If that's
> an issue then doesn't that mean you've got other problems with how
> people structure their directories? Actually, I suspect my picture
> above is wrong, as I can't honestly see that mandating that the
> dist-info file be a *sibling* (in an arbitrarily cluttered directory)
> of the project directory, is sensible...

Yeah, that's the main issue: we can't make assumptions about how the
source tree looks in the project, so adding the root path will not work
all the time. Some people even have two separate root packages -- which
is not a good layout, but allowed. In Zope, I think the convention is
to use a src/ directory, so that's another level.

Since distutils1 and distutils2 will let you provide in their options
a list of packages and modules, I think that's the only sane way to get
a list of paths we can then add to sys.path.

>
> But I'm probably not seeing the real issues here.
>
> All I would say is, don't let the needs of more unusual configurations
> over-complicate basic usage.

The trouble is: adding the root of your project's source to PYTHONPATH
can give you something different from what you would get once the
project is installed in Python.  Now the question is: if 90% of the
projects out there would work by adding the root, then this might be
overkill. I am afraid it's way less, though...

Tarek

-- 
Tarek Ziadé | http://ziade.org


From ianb at colorstudy.com  Wed Oct 20 18:02:03 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 20 Oct 2010 11:02:03 -0500
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
	<AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>
Message-ID: <AANLkTi=N8J2ZFGZkvjd_kNWgFe1EayR02QVqjU+fF1E5@mail.gmail.com>

On Wed, Oct 20, 2010 at 8:36 AM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:

> So, if a project 'Boo' has two packages 'foo' and 'bar' and a module
> 'baz.py', we need those on the path but also the Boo.dist-info
> directory that is created at installation time (see PEP 376).
> Setuptools' metadata directory is called Boo.egg-info, and distutils 1
> has written a file called Boo.egg-info since Python 2.5.
>

So do it the same way as Setuptools -- setup.py egg_info writes the info to
the root of the packages (which might be src/ for some libraries) and when
that is added to the path, then the directory will be scanned and the
metadata found.  And setup.py develop calls egg_info.  Replace egg with dist
and it's all good, right?

-- 
Ian Bicking  |  http://blog.ianbicking.org

From ziade.tarek at gmail.com  Wed Oct 20 18:13:20 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Wed, 20 Oct 2010 18:13:20 +0200
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTi=N8J2ZFGZkvjd_kNWgFe1EayR02QVqjU+fF1E5@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
	<AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>
	<AANLkTi=N8J2ZFGZkvjd_kNWgFe1EayR02QVqjU+fF1E5@mail.gmail.com>
Message-ID: <AANLkTikXOomqy_oFnpC05Z=H=wErX=OpHwiZRmLd80dM@mail.gmail.com>

On Wed, Oct 20, 2010 at 6:02 PM, Ian Bicking <ianb at colorstudy.com> wrote:
> On Wed, Oct 20, 2010 at 8:36 AM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
>>
>> So, if a project 'Boo' has two packages 'foo' and 'bar' and a module
>> 'baz.py', we need those on the path but also the Boo.dist-info
>> directory that is created at installation time (see PEP 376).
>> Setuptools' metadata directory is called Boo.egg-info, and distutils 1
>> has written a file called Boo.egg-info since Python 2.5.
>
> So do it the same way as Setuptools -- setup.py egg_info writes the info to
> the root of the packages (which might be src/ for some libraries) and when
> that is added to the path, then the directory will be scanned and the
> metadata found.  And setup.py develop calls egg_info.  Replace egg with dist
> and it's all good, right?

Not quite, since packages can be located in other (and several) places
than directly there (see my answer to Paul).

So I am trying to write this options_to_paths() code to see how things
can work.
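
Something along these lines, as a hypothetical first cut, using the
packages/py_modules/package_dir options a setup configuration already
declares:

    import os

    def options_to_paths(root, packages=(), py_modules=(), package_dir=None):
        # derive the directories that must be on sys.path for the
        # declared packages and modules to be importable in-place
        package_dir = dict(package_dir or {})
        default = package_dir.get('', '')        # e.g. 'src' in Zope-style trees
        paths = set()
        for pkg in packages:
            top = pkg.split('.', 1)[0]
            base = package_dir.get(top, default) # where that top-level pkg lives
            paths.add(os.path.join(root, base) if base else root)
        if py_modules:
            paths.add(os.path.join(root, default) if default else root)
        return sorted(paths)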

> --
> Ian Bicking  |  http://blog.ianbicking.org
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>



-- 
Tarek Ziadé | http://ziade.org


From rrr at ronadam.com  Wed Oct 20 20:46:42 2010
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 20 Oct 2010 13:46:42 -0500
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTik9+mESTaPPbySYgtLuhJrGRTdFQjS0JPpyLeCf@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
	<AANLkTik9+mESTaPPbySYgtLuhJrGRTdFQjS0JPpyLeCf@mail.gmail.com>
Message-ID: <4CBF3912.5050009@ronadam.com>



On 10/20/2010 08:38 AM, Nick Coghlan wrote:
> On Wed, Oct 20, 2010 at 7:57 PM, Paul Moore <p.f.moore at gmail.com> wrote:
>> On 19 October 2010 22:26, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
>>> There's one feature I want to add in distutils2: the develop command
>>> setuptools provides. Basically it adds a "link" file into
>>> site-packages, and does some magic at startup to load the path that is
>>> contained in the link file. The use case is to be able to have a
>>> project added in the python path without installing it.
>>
>> Can you explain the requirement in more detail? I don't use the
>> setuptools develop command, so I don't have the background, but it
>> seems to me that what you're proposing can be done simply by adding
>> the relevant directory to PYTHONPATH. That's all I ever do when
>> developing (but my needs are pretty simple, so there may well be
>> subtle problems with that approach).
>
> A different idea along these lines that I've been pondering is an
> actual -p path option for the interpreter command line, that allowed a
> sequence of directories to be provided that would be prepended to
> PYTHONPATH (and hence included in sys.path).
>
> So if you're wanting to test two different versions of a module (from
> a parent directory containing the two versions in separate
> subdirectories):
>
> python -p versionA run_tests.py
> python -p versionB run_tests.py
>
> For more permanent additions to sys.path, PYTHONPATH (possibly in
> conjunction with virtualenv) is a reasonable answer. Zipfile and
> directory execution covers execution of more complex applications
> containing multiple files as if they were simple scripts.
>
> The main piece I see missing from the puzzle is the ability to easily
> switch back and forth between multiple versions of a support package
> or library without mucking with persistent state like the environment
> variables or the filesystem.

Yes, I don't like changing the system-wide environment variables and file
system options. It's too easy to break other things that depend on them.

How about adding the ability to use a .pth file from the current program 
directory?

Ron








From ianb at colorstudy.com  Wed Oct 20 21:20:40 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 20 Oct 2010 14:20:40 -0500
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTinF4t7HtanVvF0-h=hzg83KOZLvs3N8KzpK+uB-@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
	<AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>
	<AANLkTinDJ_W+uNNbHMkfvtsYabrZ+tmoH3EkFFH6_9xs@mail.gmail.com>
	<AANLkTinF4t7HtanVvF0-h=hzg83KOZLvs3N8KzpK+uB-@mail.gmail.com>
Message-ID: <AANLkTin2n6adYHTmwEAR-+VF8AXnhQzErnPsQ3kpvea8@mail.gmail.com>

On Wed, Oct 20, 2010 at 9:27 AM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:

> On Wed, Oct 20, 2010 at 4:00 PM, Paul Moore <p.f.moore at gmail.com> wrote:
> ...
> >
> > If you're saying that Foo *isn't* a package itself, so Foo/__init__.py
> > doesn't exist, and bar and baz should be visible unqualified, then I
> > begin to see your issue (although my first reaction is to say "don't
> > do that, then" :-)). But don't you then just need to search *parents*
> > of elements of PYTHONPATH as well for the metadata search? If that's
> > an issue then doesn't that mean you've got other problems with how
> > people structure their directories? Actually, I suspect my picture
> > above is wrong, as I can't honestly see that mandating that the
> > dist-info file be a *sibling* (in an arbitrarily cluttered directory)
> > of the project directory, is sensible...
>
> Yeah, that's the main issue: we can't make assumptions about how the
> source tree looks in the project, so adding the root path will not
> work all the time. Some people even have two separate root packages --
> which is not a good layout, but allowed. In Zope, I think the
> convention is to use a src/ directory, so that's another level.
>

Setuptools puts the files in the src/ directory in that case.  More
complicated layouts simply aren't supported, and generally no one
complains, as more complicated layouts are uncommon and a sign that
someone's head is somewhere very different from where it would be if
they were using setup.py develop.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From flub at devork.be  Thu Oct 21 01:35:37 2010
From: flub at devork.be (Floris Bruynooghe)
Date: Thu, 21 Oct 2010 00:35:37 +0100
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTinF4t7HtanVvF0-h=hzg83KOZLvs3N8KzpK+uB-@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
	<AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>
	<AANLkTinDJ_W+uNNbHMkfvtsYabrZ+tmoH3EkFFH6_9xs@mail.gmail.com>
	<AANLkTinF4t7HtanVvF0-h=hzg83KOZLvs3N8KzpK+uB-@mail.gmail.com>
Message-ID: <AANLkTikDrzg2ZNo6rRQqvxKo45FrtPFAR2pzkacX4phJ@mail.gmail.com>

[sorry, forgot to include the list address before]

Hi

On 20 October 2010 15:27, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> On Wed, Oct 20, 2010 at 4:00 PM, Paul Moore <p.f.moore at gmail.com> wrote:
>> But I'm probably not seeing the real issues here.
>>
>> All I would say is, don't let the needs of more unusual configurations
>> over-complicate basic usage.
>
> The trouble is: adding the root of your project's source to
> PYTHONPATH can give you something different from what you would get
> once the project is installed in Python.  Now the question is: if 90%
> of the projects out there would work by adding the root, then this
> might be overkill. I am afraid it's way less, though...

I've read your and Ian's responses and still don't understand what
setup.py develop brings to the party which can't be done with simple
PYTHONPATH.  Excuse me if I also completely misunderstand what develop
does, but it sounds like it's going to add an in-development version of
a project to a user's sys.path (at the front?) until it's undone
again somehow (is there a "setup.py undevelop"?).  This just seems
dangerous to me since it will affect all Python programs run by that
user.

If I understand correctly, this whole "develop" dance is for when you
have two inter-dependent packages in development at the same time.  If
manually setting PYTHONPATH correctly in this situation is too
complicated, then my feeling is there's nothing wrong with some sort of
helper which manipulates PYTHONPATH for you -- something that spawns a
new shell and sets the environment in it correctly.  But placing things
in files makes this permanent for the user and just seems the wrong
way to go to me.
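
Such a helper could be tiny and touch nothing on disk; a hypothetical
sketch (POSIX shell assumed, names made up):

    import os
    import subprocess

    def devshell(*project_dirs):
        # spawn a subshell with PYTHONPATH extended for the
        # in-development projects; nothing is written to disk
        env = dict(os.environ)
        extra = os.pathsep.join(
            os.path.abspath(os.path.expanduser(d)) for d in project_dirs)
        old = env.get('PYTHONPATH')
        env['PYTHONPATH'] = extra + (os.pathsep + old if old else '')
        subprocess.call([env.get('SHELL', '/bin/sh')], env=env)

    # usage sketch: devshell('~/src/projA', '~/src/projB')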

Again, apologies if I've misunderstood the problem.  But I too am
worried about too many complexities and "magic".  One of my main
issues with setuptools is that it tries to handle my Python
environment (sys.path) outside of normally expected Python mechanisms
by modifying various custom files.  I would hate to see distutils2
repeat this.

Regards
Floris


-- 
Debian GNU/Linux -- The Power of Freedom
www.debian.org | www.gnu.org | www.kernel.org


From ncoghlan at gmail.com  Thu Oct 21 04:32:32 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 21 Oct 2010 12:32:32 +1000
Subject: [Python-ideas] Add a command line option to adjust sys.path? (was
 Re: Add a site.cfg to keep a persistent list of paths)
Message-ID: <AANLkTi=C_nWCkVj16mdv6UVpu6jZBueNZS3FuLWScfNR@mail.gmail.com>

On Thu, Oct 21, 2010 at 4:46 AM, Ron Adam <rrr at ronadam.com> wrote:
>
>
> On 10/20/2010 08:38 AM, Nick Coghlan wrote:
>> A different idea along these lines that I've been pondering is an
>> actual -p path option for the interpreter command line, that allowed a
>> sequence of directories to be provided that would be prepended to
>> PYTHONPATH (and hence included in sys.path).
>>
>> So if you're wanting to test two different versions of a module (from
>> a parent directory containing the two versions in separate
>> subdirectories):
>>
>> python -p versionA run_tests.py
>> python -p versionB run_tests.py
>>
>> For more permanent additions to sys.path, PYTHONPATH (possibly in
>> conjunction with virtualenv) is a reasonable answer. Zipfile and
>> directory execution covers execution of more complex applications
>> containing multiple files as if they were simple scripts.
>>
>> The main piece I see missing from the puzzle is the ability to easily
>> switch back and forth between multiple versions of a support package
>> or library without mucking with persistent state like the environment
>> variables or the filesystem.
>
> Yes, I don't like changing the system wide environment variables and file
> system options. It's too easy to break other things that depend on them.
>
> How about adding the ability to use a .pth file from the current program
> directory?

A simple check to see if a supplied path was a directory or not would
let us do both with one new option:

-p  Specify a directory or a .pth file (see site module docs) to be
prepended to sys.path

distutils2 could then provide a way to generate an appropriate .pth
file instead of installing a distribution.
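
The check itself would be trivial; a sketch of how the option might be
handled, reusing the site module's existing .pth parsing:

    import os
    import site
    import sys

    def handle_p_option(arg):
        if os.path.isdir(arg):
            sys.path.insert(0, os.path.abspath(arg))
        else:
            marker = len(sys.path)
            # site.addpackage() applies the usual .pth rules ('#'
            # comments, executable 'import' lines, one directory per
            # line) but appends, so move the new entries to the front
            site.addpackage(os.path.dirname(arg) or os.curdir,
                            os.path.basename(arg), None)
            new = sys.path[marker:]
            del sys.path[marker:]
            sys.path[:0] = new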

Cheers,
Nick.

-- 
Nick Coghlan  |  ncoghlan at gmail.com  |  Brisbane, Australia


From rrr at ronadam.com  Thu Oct 21 06:36:01 2010
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 20 Oct 2010 23:36:01 -0500
Subject: [Python-ideas] Add a command line option to adjust sys.path?
 (was Re: Add a site.cfg to keep a persistent list of paths)
In-Reply-To: <AANLkTi=C_nWCkVj16mdv6UVpu6jZBueNZS3FuLWScfNR@mail.gmail.com>
References: <AANLkTi=C_nWCkVj16mdv6UVpu6jZBueNZS3FuLWScfNR@mail.gmail.com>
Message-ID: <4CBFC331.8020309@ronadam.com>



On 10/20/2010 09:32 PM, Nick Coghlan wrote:
> On Thu, Oct 21, 2010 at 4:46 AM, Ron Adam <rrr at ronadam.com> wrote:
>>
>>
>> On 10/20/2010 08:38 AM, Nick Coghlan wrote:
>>> A different idea along these lines that I've been pondering is an
>>> actual -p path option for the interpreter command line, that allowed a
>>> sequence of directories to be provided that would be prepended to
>>> PYTHONPATH (and hence included in sys.path).
>>>
>>> So if you're wanting to test two different versions of a module (from
>>> a parent directory containing the two versions in separate
>>> subdirectories):
>>>
>>> python -p versionA run_tests.py
>>> python -p versionB run_tests.py
>>>
>>> For more permanent additions to sys.path, PYTHONPATH (possibly in
>>> conjunction with virtualenv) is a reasonable answer. Zipfile and
>>> directory execution covers execution of more complex applications
>>> containing multiple files as if they were simple scripts.
>>>
>>> The main piece I see missing from the puzzle is the ability to easily
>>> switch back and forth between multiple versions of a support package
>>> or library without mucking with persistent state like the environment
>>> variables or the filesystem.
>>
>> Yes, I don't like changing the system wide environment variables and file
>> system options. It's too easy to break other things that depend on them.
>>
>> How about adding the ability to use a .pth file from the current program
>> directory?
>
> A simple check to see if a supplied path was a directory or not would
> let us do both with one new option:
>
> -p  Specify a directory or a .pth file (see site module docs) to be
> prepended to sys.path

Prepending would be great. ;-)

> distutils2 could then provide a way to generate an appropriate .pth
> file instead of installing a distribution.

Where would the .pth file be, and how would I run the application if I
don't know I need to specify a .pth file?  How would I know I need to
specify a .pth file?  (i.e., if I'm trying to figure out what is wrong
on someone else's computer.)

If you have a default .pth file in the same directory as the .py file
being run, then that would give a way to specify an alternative or
local library of modules and packages that is program-specific, without
doing anything special.  It would be included in the distribution files
as well, so distutils2 doesn't have to generate anything.

+1 on the -p option with .pth files also.

Can .pth files use environment variables?


Cheers,
    Ron





From ncoghlan at gmail.com  Thu Oct 21 08:43:15 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 21 Oct 2010 16:43:15 +1000
Subject: [Python-ideas] Add a command line option to adjust sys.path?
 (was Re: Add a site.cfg to keep a persistent list of paths)
In-Reply-To: <4CBFC331.8020309@ronadam.com>
References: <AANLkTi=C_nWCkVj16mdv6UVpu6jZBueNZS3FuLWScfNR@mail.gmail.com>
	<4CBFC331.8020309@ronadam.com>
Message-ID: <AANLkTikQAkQu9a5SaDy_hZB_625G64BdeBFRoGs3z=G7@mail.gmail.com>

On Thu, Oct 21, 2010 at 2:36 PM, Ron Adam <rrr at ronadam.com> wrote:
> Where would the .pth file be, and how would I run the application if I
> don't know I need to specify a .pth file?  How would I know I need to
> specify a .pth file?  (i.e., if I'm trying to figure out what is wrong
> on someone else's computer.)

This idea is only aimed at developers. To run an actual Python
application that needs additional modules, either install it properly
or put it in a zipfile or directory, put a __main__.py at the top
level and just run the zipfile/directory directly.

Cheers,
Nick.

-- 
Nick Coghlan  |  ncoghlan at gmail.com  |  Brisbane, Australia


From p.f.moore at gmail.com  Thu Oct 21 13:21:56 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 21 Oct 2010 12:21:56 +0100
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTikDrzg2ZNo6rRQqvxKo45FrtPFAR2pzkacX4phJ@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
	<AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>
	<AANLkTinDJ_W+uNNbHMkfvtsYabrZ+tmoH3EkFFH6_9xs@mail.gmail.com>
	<AANLkTinF4t7HtanVvF0-h=hzg83KOZLvs3N8KzpK+uB-@mail.gmail.com>
	<AANLkTikDrzg2ZNo6rRQqvxKo45FrtPFAR2pzkacX4phJ@mail.gmail.com>
Message-ID: <AANLkTimKhQ-edQkZTiB7_BnGhZAzeGAQ=CCEPQF7X6RQ@mail.gmail.com>

On 21 October 2010 00:35, Floris Bruynooghe <flub at devork.be> wrote:

> I've read your and Ian's responses and still don't understand what
> setup.py develop brings to the party which can't be done with simple
> PYTHONPATH.

I'm glad it's not just me!

> Again, apologies if I've misunderstood the problem.  But I too am
> worried about too many complexities and "magic".  One of my main
> issues with setuptools is that it tries to handle my Python
> environment (sys.path) outside of normally expected Python mechanisms
> by modifying various custom files.  I would hate to see distutils2
> repeat this.

I think that the key issue here is that PEP 376 introduces the idea of
a "distribution" -- a somewhat vaguely defined concept that can contain
one or more packages or modules. Distributions don't have a
well-defined directory structure, and don't participate properly in
Python's standard import mechanism (PEP 302, PYTHONPATH, all that
stuff). The distribution metadata (dist-info directory) is not
package-based, and so doesn't fit the model.

Suggestions:

1. PEP 376 clearly defines what a "distribution" (installed or
otherwise) is, in terms of directory structure, whether/how it
supports PEP 302-style non-filesystem access, etc. I don't see a reason
here why we can't mandate some structure, rather than leaving things
as a "free for all" like the current setuptools/ad-hoc approach.
2. Mechanisms for dealing with distributions are *only* discussed in
terms of the PEP 376 definitions, so we have a common understanding.

As a first cut, I'd say that a distribution is defined purely in terms
of its metadata (dist-info directory). On that basis, there should be
a definition of where dist-info directories are searched for; PEP 376
seems to state that this is only in site-packages ("This PEP proposes
an installation format inspired by one of the options in the
EggFormats standard, the one that uses a distinct directory located in
the site-packages directory."). And yet, this whole "develop"
discussion seems to be about locating dist-info directories
elsewhere.

Having said that, PEP 376 later states:

get_distributions() -> iterator of Distribution instances.
Provides an iterator that looks for .dist-info directories in sys.path
and returns Distribution instances for each one of them.

This implies dist-info directories are searched for in sys.path. OK,
fine. That's broader than just site-packages, but still well-defined
and acceptable. And that's where I get my expectations that
manipulating PYTHONPATH should work.

So what's this directory structure we're talking about, with Foo
containing two packages and Foo.dist-info being alongside Foo? Foo
itself isn't on PYTHONPATH, so why should Foo.dist-info be found at
all? Based on PEP 376, it's not meant to be found.

Maybe if this *is* a requirement, it needs a change to PEP 376, which
I guess means the PEP discussion and approval process needs to be gone
through again. I, for one, would be OK with that, as I remain
unconvinced that the complexity and confusion are worth it.

Paul.


From doug.hellmann at gmail.com  Thu Oct 21 14:14:46 2010
From: doug.hellmann at gmail.com (Doug Hellmann)
Date: Thu, 21 Oct 2010 08:14:46 -0400
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTimKhQ-edQkZTiB7_BnGhZAzeGAQ=CCEPQF7X6RQ@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
	<AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>
	<AANLkTinDJ_W+uNNbHMkfvtsYabrZ+tmoH3EkFFH6_9xs@mail.gmail.com>
	<AANLkTinF4t7HtanVvF0-h=hzg83KOZLvs3N8KzpK+uB-@mail.gmail.com>
	<AANLkTikDrzg2ZNo6rRQqvxKo45FrtPFAR2pzkacX4phJ@mail.gmail.com>
	<AANLkTimKhQ-edQkZTiB7_BnGhZAzeGAQ=CCEPQF7X6RQ@mail.gmail.com>
Message-ID: <884AE57C-ED90-4F10-8C26-32EE5E48B94A@gmail.com>


On Oct 21, 2010, at 7:21 AM, Paul Moore wrote:

> On 21 October 2010 00:35, Floris Bruynooghe <flub at devork.be> wrote:
> 
>> I've read your and Ian's responses and still don't understand what
>> setup.py develop brings to the party which can't be done with simple
>> PYTHONPATH.
> 
> I'm glad it's not just me!

Using develop does more than just modify the import path.

It also generates the metadata, such as entry points, and re-generates any console scripts defined by my setup.py so that they point to the version of code in the sandbox.  After I run develop, any Python process on the system using the same Python interpreter will run the code in my sandbox instead of the version "installed" in site-packages.  That includes any of the command line programs or plugins defined in my setup.py, and even applies to processes that don't run as my user.

I use these features every day, since our application depends on a few daemons that run as root (it's a system management app, so it needs root privileges to do almost anything interesting).
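
For concreteness, the kind of setup.py in play here -- the names are
made up, but entry_points/console_scripts are the setuptools features
described above:

    from setuptools import setup, find_packages

    setup(
        name='SysMgr',                  # hypothetical project
        version='0.1',
        packages=find_packages(),
        entry_points={
            'console_scripts': [
                # "setup.py develop" regenerates this script so it
                # imports from the sandbox instead of site-packages
                'sysmgr = sysmgr.cli:main',
            ],
            'sysmgr.plugins': [
                'disk = sysmgr.plugins.disk:DiskPlugin',
            ],
        },
    )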

Doug



From solipsis at pitrou.net  Thu Oct 21 14:17:50 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 21 Oct 2010 14:17:50 +0200
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
	<AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>
	<AANLkTinDJ_W+uNNbHMkfvtsYabrZ+tmoH3EkFFH6_9xs@mail.gmail.com>
	<AANLkTinF4t7HtanVvF0-h=hzg83KOZLvs3N8KzpK+uB-@mail.gmail.com>
	<AANLkTikDrzg2ZNo6rRQqvxKo45FrtPFAR2pzkacX4phJ@mail.gmail.com>
	<AANLkTimKhQ-edQkZTiB7_BnGhZAzeGAQ=CCEPQF7X6RQ@mail.gmail.com>
Message-ID: <20101021141750.43a0c78d@pitrou.net>

On Thu, 21 Oct 2010 12:21:56 +0100
Paul Moore <p.f.moore at gmail.com> wrote:
> On 21 October 2010 00:35, Floris Bruynooghe <flub at devork.be> wrote:
> 
> > I've read your and Ian's responses and still don't understand what
> > setup.py develop brings to the party which can't be done with simple
> > PYTHONPATH.
> 
> I'm glad it's not just me!

How does PYTHONPATH work with C extensions?
Besides, how do you manage your PYTHONPATH when you have multiple
packages in "develop" mode, depending on each other?

Regards

Antoine.




From benjamin at python.org  Thu Oct 21 16:06:24 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 21 Oct 2010 14:06:24 +0000 (UTC)
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
Message-ID: <loom.20101021T160508-402@post.gmane.org>

Raymond Hettinger <raymond.hettinger at ...> writes:

> One of the use cases for named tuples is to have them be automatically
> created from a SQL query or CSV header.  Sometimes (but not often),
> those can have a huge number of columns.  In Python 2.x, it worked
> just fine -- we had a test for a named tuple with 5000 fields.  In
> Python 3.x, there is a SyntaxError when there are more than 255
> fields.

I'm not sure why you think this is new. It's been true from at least 2.5 as far
as I can see.
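
A quick way to check the call-site restriction, which fails the same
way on 2.x and 3.x (compiling is enough to trigger it):

    # calling with more than 255 arguments is rejected at compile time
    src = "f(" + ", ".join("a%d" % i for i in range(300)) + ")"
    try:
        compile(src, "<test>", "exec")
    except SyntaxError as e:
        print(e)   # more than 255 arguments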






From ianb at colorstudy.com  Thu Oct 21 16:44:50 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 21 Oct 2010 09:44:50 -0500
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <AANLkTikDrzg2ZNo6rRQqvxKo45FrtPFAR2pzkacX4phJ@mail.gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
	<AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>
	<AANLkTinDJ_W+uNNbHMkfvtsYabrZ+tmoH3EkFFH6_9xs@mail.gmail.com>
	<AANLkTinF4t7HtanVvF0-h=hzg83KOZLvs3N8KzpK+uB-@mail.gmail.com>
	<AANLkTikDrzg2ZNo6rRQqvxKo45FrtPFAR2pzkacX4phJ@mail.gmail.com>
Message-ID: <AANLkTimiH2aVm2kGBEsa4D=QqmS+WT4jufBwy=596+zC@mail.gmail.com>

On Wed, Oct 20, 2010 at 6:35 PM, Floris Bruynooghe <flub at devork.be> wrote:

>
> On 20 October 2010 15:27, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> > On Wed, Oct 20, 2010 at 4:00 PM, Paul Moore <p.f.moore at gmail.com> wrote:
>
> >> But I'm probably not seeing the real issues here.
> >>
> >> All I would say is, don't let the needs of more unusual configurations
> >> over-complicate basic usage.
> >
> > The trouble is: adding the root of your project's source to
> > PYTHONPATH can give you something different from what you would get
> > once the project is installed in Python.  Now the question is: if
> > 90% of the projects out there would work by adding the root, then
> > this might be overkill. I am afraid it's way less, though...
>
> I've read your and Ian's responses and still don't understand what
> setup.py develop brings to the party which can't be done with simple
> PYTHONPATH.  Excuse me if I also completely misunderstand what develop
> does, but it sounds like it's going to add an in-development version of
> a project to a user's sys.path (at the front?) until it's undone
> again somehow (is there a "setup.py undevelop"?).


pip uninstall would unlink it (pip install -e calls setup.py develop as
well).  setup.py develop is persistent unlike PYTHONPATH.


>  This just seems
> dangerous to me since it will affect all Python programs run by that
> user.
>

Hence virtualenv, which solves your other concerns.


> If I understand correctly, this whole "develop" dance is for when you
> have two inter-dependent packages in development at the same time.  If
> manually setting PYTHONPATH correctly in this situation is too
> complicated, then my feeling is there's nothing wrong with some sort of
> helper which manipulates PYTHONPATH for you -- something that spawns a
> new shell and sets the environment in it correctly.  But placing things
> in files makes this permanent for the user and just seems the wrong
> way to go to me.
>
> Again, apologies if I've misunderstood the problem.  But I too am
> worried about too many complexities and "magic".  One of my main
> issues with setuptools is that it tries to handle my Python
> environment (sys.path) outside of normally expected Python mechanisms
> by modifying various custom files.  I would hate to see distutils2
> repeat this.
>

Note if you use pip, it uses setuptools in a way where only setup.py develop
uses .pth files, and otherwise the path is similar to how it is with
distutils alone (except with that extra metadata, as Doug mentions).

-- 
Ian Bicking  |  http://blog.ianbicking.org

From p.f.moore at gmail.com  Thu Oct 21 16:56:04 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 21 Oct 2010 15:56:04 +0100
Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths
In-Reply-To: <884AE57C-ED90-4F10-8C26-32EE5E48B94A@gmail.com>
References: <AANLkTinD=baUia_pj6zMVdLCH13HcjRAVuZW4C1zvtJQ@mail.gmail.com>
	<AANLkTinLzWHDpkZYJmH2NB=G9qP2OAiYJgKcCaMnd9OT@mail.gmail.com>
	<AANLkTimv2vE8AatCnD7PU-1Tn8iBjqaBniqKroVikmvJ@mail.gmail.com>
	<AANLkTinDJ_W+uNNbHMkfvtsYabrZ+tmoH3EkFFH6_9xs@mail.gmail.com>
	<AANLkTinF4t7HtanVvF0-h=hzg83KOZLvs3N8KzpK+uB-@mail.gmail.com>
	<AANLkTikDrzg2ZNo6rRQqvxKo45FrtPFAR2pzkacX4phJ@mail.gmail.com>
	<AANLkTimKhQ-edQkZTiB7_BnGhZAzeGAQ=CCEPQF7X6RQ@mail.gmail.com>
	<884AE57C-ED90-4F10-8C26-32EE5E48B94A@gmail.com>
Message-ID: <AANLkTi=qn2cqSTWPw2NCu0+7MXfw_rA29ZSdVDVeZg+1@mail.gmail.com>

On 21 October 2010 13:14, Doug Hellmann <doug.hellmann at gmail.com> wrote:
>
> On Oct 21, 2010, at 7:21 AM, Paul Moore wrote:
>
>> On 21 October 2010 00:35, Floris Bruynooghe <flub at devork.be> wrote:
>>
>>> I've read your and Ian's responses and still don't understand what
>>> setup.py develop brings to the party which can't be done with simple
>>> PYTHONPATH.
>>
>> I'm glad it's not just me!
>
> Using develop does more than just modify the import path.
>
> It also generates the metadata, such as entry points, and re-generates any console scripts defined by
> my setup.py so that they point to the version of code in the sandbox.  After I run develop, any Python
> process on the system using the same Python interpreter will run the code in my sandbox instead of the
> version "installed" in site-packages.  That includes any of the command line programs or plugins defined
> in my setup.py, and even applies to processes that don't run as my user.
>
> I use these features every day, since our application depends on a few daemons that run as root (it's a
> system management app, so it needs root privileges to do almost anything interesting).

Note - my understanding is that this discussion is about metadata
discovery for distutils2, *not* about setuptools' develop feature
(which AIUI does far more than is being proposed at the moment).

Specifically, I thought we were just talking about metadata here. As
far as this discussion goes, entry points and console scripts aren't
included. That's not to say they aren't useful, just that they are a
separate discussion.

In case it's not obvious, I'm a strong -1 on simply importing
setuptools functionality into distutils2 wholesale, without
discussion/review.

Paul.


From g.brandl at gmx.net  Thu Oct 21 17:07:42 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 21 Oct 2010 17:07:42 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <loom.20101021T160508-402@post.gmane.org>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
Message-ID: <i9pl0o$vsk$1@dough.gmane.org>

On 21.10.2010 16:06, Benjamin Peterson wrote:
> Raymond Hettinger <raymond.hettinger at ...> writes:
>
>> One of the use cases for named tuples is to have them be automatically
>> created from a SQL query or CSV header.  Sometimes (but not often),
>> those can have a huge number of columns.  In Python 2.x, it worked
>> just fine -- we had a test for a named tuple with 5000 fields.  In
>> Python 3.x, there is a SyntaxError when there are more than 255
>> fields.
>
> I'm not sure why you think this is new. It's been true from at least
> 2.5 as far as I can see.

You must be talking of a different restriction.  This snippet works fine in
2.7, but raises a SyntaxError in 3.1:

   exec("def f(" + ", ".join("a%d" % i for i in range(1000)) + "): pass")

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From mal at egenix.com  Thu Oct 21 17:41:12 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 21 Oct 2010 17:41:12 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <i9pl0o$vsk$1@dough.gmane.org>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
Message-ID: <4CC05F18.3060607@egenix.com>

Georg Brandl wrote:
> On 21.10.2010 16:06, Benjamin Peterson wrote:
>> Raymond Hettinger <raymond.hettinger at ...> writes:
>>
>>> One of the use cases for named tuples is to have them be automatically
>>> created from a SQL query or CSV header.  Sometimes (but not often),
>>> those can have a huge number of columns.  In Python 2.x, it worked
>>> just fine -- we had a test for a named tuple with 5000 fields.  In
>>> Python 3.x, there is a SyntaxError when there are more than 255
>>> fields.
>>
>> I'm not sure why you think this is new. It's been true from at least
>> 2.5 as far as I can see.
>
> You must be talking of a different restriction.  This snippet works fine in
> 2.7, but raises a SyntaxError in 3.1:
>
>    exec("def f(" + ", ".join("a%d" % i for i in range(1000)) + "): pass")

The AST code in 2.7 raises this error for function/method calls
only. In 3.2, it also raises the error for function/method
definitions.

Looking at the AST code, the limitation appears somewhat arbitrary.
There's no comment in the code suggesting a reason for the limit, and
it's still possible to pass in more arguments via *args and **kws --
but without the built-in argument checking.

Could someone provide some insight?

Note that it's not uncommon to have more than 255 possible
function/method arguments in generated code, e.g. in database
abstraction layers.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 21 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From alexander.belopolsky at gmail.com  Thu Oct 21 18:13:49 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 21 Oct 2010 12:13:49 -0400
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <4CC05F18.3060607@egenix.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org> <4CC05F18.3060607@egenix.com>
Message-ID: <AANLkTi=90RPo5SrC5bx2Ogw7ppqWubdDH-zBZNVVFRg0@mail.gmail.com>

On Thu, Oct 21, 2010 at 11:41 AM, M.-A. Lemburg <mal at egenix.com> wrote:
..
> Looking at the AST code, the limitation appears somewhat arbitrary.
> There's no comment in the code suggesting a reason for the limit and
> it's still possible to pass in more arguments via *args and **kws -
> but without the built-in argument checking.
>
> Could someone provide some insight?
>

My understanding is that the limitation comes from the bytecode
generation phase, not the AST.

See also Guido's http://bugs.python.org/issue1636#msg58760.

According to the Python manual section for opcodes,

CALL_FUNCTION(argc)

Calls a function. The low byte of argc indicates the number of
positional parameters, the high byte the number of keyword parameters.
On the stack, the opcode finds the keyword parameters first. For each
keyword argument, the value is on top of the key. Below the keyword
parameters, the positional parameters are on the stack, with the
right-most parameter on top. Below the parameters, the function object
to call is on the stack. Pops all function arguments, and the function
itself off the stack, and pushes the return value.

http://docs.python.org/dev/py3k/library/dis.html?highlight=opcode#opcode-CALL_FUNCTION
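
This is easy to see with dis -- a minimal illustration of the argc
packing, as the interpreters of this era disassemble it:

    import dis

    def f():
        g(1, 2, a=3)    # 2 positional + 1 keyword argument

    dis.dis(f)
    # the call compiles to "CALL_FUNCTION 258", where
    # 258 == 2 + 1*256: low byte = positional count,
    # high byte = keyword count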


From mal at egenix.com  Thu Oct 21 19:31:48 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 21 Oct 2010 19:31:48 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <AANLkTi=90RPo5SrC5bx2Ogw7ppqWubdDH-zBZNVVFRg0@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>	<loom.20101021T160508-402@post.gmane.org>	<i9pl0o$vsk$1@dough.gmane.org>
	<4CC05F18.3060607@egenix.com>
	<AANLkTi=90RPo5SrC5bx2Ogw7ppqWubdDH-zBZNVVFRg0@mail.gmail.com>
Message-ID: <4CC07904.6070100@egenix.com>

Alexander Belopolsky wrote:
> On Thu, Oct 21, 2010 at 11:41 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> ..
>> Looking at the AST code, the limitation appears somewhat arbitrary.
>> There's no comment in the code suggesting a reason for the limit and
>> it's still possible to pass in more arguments via *args and **kws -
>> but without the built-in argument checking.
>>
>> Could someone provide some insight?
>>
> 
> My understanding is that the limitation comes from bytecode generation
> phase, not AST.
> 
> See also Guido's http://bugs.python.org/issue1636#msg58760.
> 
> According to Python manual section for opcodes,
> 
> CALL_FUNCTION(argc)
> 
> Calls a function. The low byte of argc indicates the number of
> positional parameters, the high byte the number of keyword parameters.
> On the stack, the opcode finds the keyword parameters first. For each
> keyword argument, the value is on top of the key. Below the keyword
> parameters, the positional parameters are on the stack, with the
> right-most parameter on top. Below the parameters, the function object
> to call is on the stack. Pops all function arguments, and the function
> itself off the stack, and pushes the return value.
> 
> http://docs.python.org/dev/py3k/library/dis.html?highlight=opcode#opcode-CALL_FUNCTION

Thanks for the insight.

Even with the one byte each for positional and keyword arguments
imposed by the bytecode, the checks in ast.c are a bit too simple,
since they apply a limit to the sum of positional and keyword args,
whereas the bytecode and VM can deal with up to 255 positional and
255 keyword arguments.

    if (nposargs + nkwonlyargs > 255) {
        ast_error(n, "more than 255 arguments");
        return NULL;
    }

I think this should be:

    if (nposargs > 255) {
        ast_error(n, "more than 255 positional arguments");
        return NULL;
    }
    if (nkwonlyargs > 255) {
        ast_error(n, "more than 255 keyword arguments");
        return NULL;
    }

There's a patch somewhere that turns Python's VM into a 16- or
32-bit bytecode machine. Perhaps it's time to have a look at that
again.

Do other Python implementations have such limitations?

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 21 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From alexander.belopolsky at gmail.com  Thu Oct 21 19:36:48 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 21 Oct 2010 13:36:48 -0400
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <4CC07904.6070100@egenix.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org> <4CC05F18.3060607@egenix.com>
	<AANLkTi=90RPo5SrC5bx2Ogw7ppqWubdDH-zBZNVVFRg0@mail.gmail.com>
	<4CC07904.6070100@egenix.com>
Message-ID: <AANLkTimkZvj9QfYhFRp3h4maW8ZM5ZuVz6bp95AAdCOs@mail.gmail.com>

On Thu, Oct 21, 2010 at 1:31 PM, M.-A. Lemburg <mal at egenix.com> wrote:
..
> There's a patch somewhere that turns Python's VM into a 16 or
> 32-bit byte code machine. Perhaps it's time to have a look at that
> again.
>

This sounds like a reference to wpython:

http://code.google.com/p/wpython/


I hope the 255-argument limitation can be removed by simpler means.


From mal at egenix.com  Thu Oct 21 19:46:06 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 21 Oct 2010 19:46:06 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <AANLkTimkZvj9QfYhFRp3h4maW8ZM5ZuVz6bp95AAdCOs@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>	<loom.20101021T160508-402@post.gmane.org>	<i9pl0o$vsk$1@dough.gmane.org>
	<4CC05F18.3060607@egenix.com>	<AANLkTi=90RPo5SrC5bx2Ogw7ppqWubdDH-zBZNVVFRg0@mail.gmail.com>	<4CC07904.6070100@egenix.com>
	<AANLkTimkZvj9QfYhFRp3h4maW8ZM5ZuVz6bp95AAdCOs@mail.gmail.com>
Message-ID: <4CC07C5E.4060907@egenix.com>

Alexander Belopolsky wrote:
> On Thu, Oct 21, 2010 at 1:31 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> ..
>> There's a patch somewhere that turns Python's VM into a 16 or
>> 32-bit byte code machine. Perhaps it's time to have a look at that
>> again.
>>
> 
> This sounds like a reference to wpython:
> 
> http://code.google.com/p/wpython/

Indeed. That's what I was thinking of.

> I hope the 255-argument limitation can be removed by simpler means.

Probably, but why not take this as a chance to improve other
aspects of the CPython VM as well?

Here's a presentation by Cesare Di Mauro, the author of the
patch:

http://wpython.googlecode.com/files/Beyond%20Bytecode%20-%20A%20Wordcode-based%20Python.pdf

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 21 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From cesare.di.mauro at gmail.com  Thu Oct 21 19:56:57 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Thu, 21 Oct 2010 19:56:57 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <4CC07C5E.4060907@egenix.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org> <4CC05F18.3060607@egenix.com>
	<AANLkTi=90RPo5SrC5bx2Ogw7ppqWubdDH-zBZNVVFRg0@mail.gmail.com>
	<4CC07904.6070100@egenix.com>
	<AANLkTimkZvj9QfYhFRp3h4maW8ZM5ZuVz6bp95AAdCOs@mail.gmail.com>
	<4CC07C5E.4060907@egenix.com>
Message-ID: <AANLkTimfALQ-5LsFr-VYBT8e58FaK4d3rjkYx534nExN@mail.gmail.com>

Hi Marc

>> I hope the 255-argument limitation can be removed by simpler means.
>
> Probably, but why not take this as a chance to improve other
> aspects of the CPython VM as well?
>
> Here's a presentation by Cesare Di Mauro, the author of the
> patch:
>
>
> http://wpython.googlecode.com/files/Beyond%20Bytecode%20-%20A%20Wordcode-based%20Python.pdf
>
> --
> Marc-Andre Lemburg
> eGenix.com
>

This presentation was made for wpython 1.0 alpha, which was the first
release I made.

Last year I released the second (and last) version, wpython 1.1, which carries
several other changes and optimizations. You can find the new project here:
http://code.google.com/p/wpython2/ and the presentation here:
http://wpython2.googlecode.com/files/Cleanup%20and%20new%20optimizations%20in%20WPython%201.1.pdf

Cesare

From benjamin at python.org  Thu Oct 21 20:15:55 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 21 Oct 2010 18:15:55 +0000 (UTC)
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
Message-ID: <loom.20101021T201522-329@post.gmane.org>

Georg Brandl <g.brandl at ...> writes:

> You must be talking of a different restriction.

I assumed Raymond was talking about calling a function with > 255 args.






From g.brandl at gmx.net  Thu Oct 21 22:08:46 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 21 Oct 2010 22:08:46 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <loom.20101021T201522-329@post.gmane.org>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>	<loom.20101021T160508-402@post.gmane.org>	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
Message-ID: <i9q6l8$mv0$1@dough.gmane.org>

On 21.10.2010 20:15, Benjamin Peterson wrote:
> Georg Brandl <g.brandl at ...> writes:
> 
>> You must be talking of a different restriction.
> 
> I assumed Raymond was talking about calling a function with > 255 args.

And I assumed Raymond was talking about defining a function with > 255 args.
Whatever, both instances should be fixed.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From cesare.di.mauro at gmail.com  Fri Oct 22 09:18:01 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Fri, 22 Oct 2010 09:18:01 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <loom.20101021T201522-329@post.gmane.org>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
Message-ID: <AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>

2010/10/21 Benjamin Peterson <benjamin at python.org>

> Georg Brandl <g.brandl at ...> writes:
>
> > You must be talking of a different restriction.
>
> I assumed Raymond was talking about calling a function with > 255 args.
>

I think that having max 255 args and 255 kwargs is a good and reasonable
limit which we can live with, and helps the virtual machine implementation
(and implementors :P).

Python won't lose its "power" and "generality" if one VM (albeit the
"mainstream" / "official" one) has some limits.

We already have some other ones, such as max 65536 constants, names, globals
and locals. Another one is the maximum of 20 blocks per code object. Who thinks
that such limits must be removed?

I think that having more than 255 arguments for a function call is a very
rare case for which a workaround (maybe passing a tuple/list or a
dictionary) can be a better solution than having to introduce a brand new
opcode to handle it.
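
For illustration, the workaround could look like this (a sketch added for
clarity; query and query_from_mapping are hypothetical names):

# A call spelling out 300 keyword arguments is rejected at compile time,
# but unpacking a dict with ** is not subject to the same limit:
fields = {'field{}'.format(i): i for i in range(300)}
query(**fields)
# ...or pass the mapping itself as one ordinary argument:
query_from_mapping(fields)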

Changing the current opcode(s) is a very bad idea, since common cases will
slow down.

Cesare

From python at mrabarnett.plus.com  Fri Oct 22 19:44:08 2010
From: python at mrabarnett.plus.com (MRAB)
Date: Fri, 22 Oct 2010 18:44:08 +0100
Subject: [Python-ideas] New 3.x restriction on number of
	keyword	arguments
In-Reply-To: <AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>	<loom.20101021T160508-402@post.gmane.org>	<i9pl0o$vsk$1@dough.gmane.org>	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
Message-ID: <4CC1CD68.4020507@mrabarnett.plus.com>

On 22/10/2010 08:18, Cesare Di Mauro wrote:
> 2010/10/21 Benjamin Peterson <benjamin at python.org>
>
>     Georg Brandl <g.brandl at ...> writes:
>
>      > You must be talking of a different restriction.
>
>     I assumed Raymond was talking about calling a function with > 255 args.
>
>
> I think that having max 255 args and 255 kwargs is a good and reasonable
> limit which we can live with, and helps the virtual machine implementation
> (and implementors :P).
>
> Python won't lose its "power" and "generality" if one VM (albeit the
> "mainstream" / "official" one) has some limits.
>
> We already have some other ones, such as max 65536 constants, names,
> globals and locals. Another one is the maximum of 20 blocks per code
> object. Who thinks that such limits must be removed?
>
The BDFL thinks that 255 is too low.

> I think that having more than 255 arguments for a function call is a
> very rare case for which a workaround (maybe passing a tuple/list or a
> dictionary) can be a better solution than having to introduce a brand
> new opcode to handle it.
>
> Changing the current opcode(s) is a very bad idea, since common cases
> will slow down.
>


From solipsis at pitrou.net  Fri Oct 22 19:53:19 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 22 Oct 2010 19:53:19 +0200
Subject: [Python-ideas] New 3.x restriction on number of
	keyword	arguments
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1CD68.4020507@mrabarnett.plus.com>
Message-ID: <20101022195319.5a5043f9@pitrou.net>

On Fri, 22 Oct 2010 18:44:08 +0100
MRAB <python at mrabarnett.plus.com> wrote:
> On 22/10/2010 08:18, Cesare Di Mauro wrote:
> >
> > I think that having max 255 args and 255 kwargs is a good and reasonable
> > limit which we can live with, and helps the virtual machine implementation
> > (and implementors :P).
> >
> > Python won't lose its "power" and "generality" if one VM (albeit the
> > "mainstream" / "official" one) has some limits.
> >
> > We already have some other ones, such as max 65536 constants, names,
> > globals and locals. Another one is the maximum of 20 blocks per code
> > object. Who thinks that such limits must be removed?
> >
> The BDFL thinks that 255 is too low.

The BDFL can propose a patch :)

Cheers

Antoine.




From mal at egenix.com  Fri Oct 22 20:35:18 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 22 Oct 2010 20:35:18 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
Message-ID: <4CC1D966.2080007@egenix.com>

Cesare Di Mauro wrote:
> 2010/10/21 Benjamin Peterson <benjamin at python.org>
> 
>> Georg Brandl <g.brandl at ...> writes:
>>
>>> You must be talking of a different restriction.
>>
>> I assumed Raymond was talking about calling a function with > 255 args.
>>
> 
> I think that having max 255 args and 255 kwargs is a good and reasonable
> limit which we can live with, and helps the virtual machine implementation
> (and implementors :P).
>
> Python won't lose its "power" and "generality" if one VM (albeit the
> "mainstream" / "official" one) has some limits.
>
> We already have some other ones, such as max 65536 constants, names, globals
> and locals. Another one is the maximum of 20 blocks per code object. Who thinks
> that such limits must be removed?
>
> I think that having more than 255 arguments for a function call is a very
> rare case for which a workaround (maybe passing a tuple/list or a
> dictionary) can be a better solution than having to introduce a brand new
> opcode to handle it.

It's certainly rare when writing applications by hand, but such
limits can be reached with code generators wrapping external resources
such as database query rows, spreadsheet rows, sensor data input, etc.

We've had such a limit before (number of lines in a module) and that
was raised for the same reason.

> Changing the current opcode(s) is a very bad idea, since common cases will
> slow down.

I'm sure there are ways to avoid that, e.g. by using EXTENDED_ARG
for such cases.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 22 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From qrczak at knm.org.pl  Fri Oct 22 20:52:01 2010
From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk)
Date: Fri, 22 Oct 2010 20:52:01 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
Message-ID: <AANLkTi=Z8FaCXr=EArvzUY+s6soQcrY4eM_+Es8g_EKJ@mail.gmail.com>

2010/10/22 Cesare Di Mauro <cesare.di.mauro at gmail.com>:

> I think that having more than 255 arguments for a function call is a very
> rare case for which a workaround (maybe passing a tuple/list or a
> dictionary) can be a better solution than having to introduce a brand new
> opcode to handle it.

It does not need a new opcode. The bytecode can create an argument
tuple explicitly and pass it like it passes *args.
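
At the source level, that is roughly equivalent to the following sketch
(an illustration, not code from the thread):

# Build the argument tuple explicitly and make a *args call; BUILD_TUPLE
# takes its size from the oparg (extendable via EXTENDED_ARG), so the
# 255 limit does not apply here.
args = tuple(range(300))
print(*args)   # behaves like print(0, 1, ..., 299) written out by hand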

-- 
Marcin Kowalczyk


From cesare.di.mauro at gmail.com  Fri Oct 22 22:31:10 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Fri, 22 Oct 2010 22:31:10 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <4CC1D966.2080007@egenix.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
Message-ID: <AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>

2010/10/22 M.-A. Lemburg <mal at egenix.com>

> Cesare Di Mauro wrote:
> > I think that having more than 255 arguments for a function call is a very
> > rare case for which a workaround (maybe passing a tuple/list or a
> > dictionary) can be a better solution than having to introduce a brand new
> > opcode to handle it.
>
> It's certainly rare when writing applications by hand, but such
> limits can be reached with code generators wrapping external resources
> such as database query rows, spreadsheet rows, sensor data input, etc.
>
> We've had such a limit before (number of lines in a module) and that
> was raised for the same reason.
>
> > Changing the current opcode(s) is a very bad idea, since common cases
> will
> > slow down.
>
> I'm sure there are ways to avoid that, e.g. by using EXTENDED_ARG
> for such cases.
>
> --
> Marc-Andre Lemburg
> eGenix.com
>

I've patched Python 3.2 alpha 3 with a rough solution using EXTENDED_ARG for
CALL_FUNCTION* opcodes, raising the arguments and keywords limits to 65535
maximum. I hope it'll be enough. :)


In ast.c:

ast_for_arguments:
if (nposargs > 65535 || nkwonlyargs > 65535) {
    ast_error(n, "more than 65535 arguments");
    return NULL;
}

ast_for_call:
if (nargs + ngens > 65535 || nkeywords > 65535) {
    ast_error(n, "more than 65535 arguments");
    return NULL;
}


In compile.c:

opcode_stack_effect:
#define NARGS(o) (((o) & 0xff) + ((o) >> 8 & 0xff00) + \
                  2*(((o) >> 8 & 0xff) + ((o) >> 16 & 0xff00)))
     case CALL_FUNCTION:
         return -NARGS(oparg);
     case CALL_FUNCTION_VAR:
     case CALL_FUNCTION_KW:
         return -NARGS(oparg)-1;
     case CALL_FUNCTION_VAR_KW:
         return -NARGS(oparg)-2;
#undef NARGS
#define NARGS(o) (((o) % 256) + 2*(((o) / 256) % 256))
     case MAKE_FUNCTION:
         return -NARGS(oparg) - ((oparg >> 16) & 0xffff);
     case MAKE_CLOSURE:
         return -1 - NARGS(oparg) - ((oparg >> 16) & 0xffff);
#undef NARGS

compiler_call_helper:
int len;
int code = 0;

len = asdl_seq_LEN(args) + n;
n = len & 0xff | (len & 0xff00) << 8;
VISIT_SEQ(c, expr, args);
if (keywords) {
  VISIT_SEQ(c, keyword, keywords);
  len = asdl_seq_LEN(keywords);
  n |= (len & 0xff | (len & 0xff00) << 8) << 8;
}


In ceval.c:

PyEval_EvalFrameEx:
     TARGET_WITH_IMPL(CALL_FUNCTION_VAR, _call_function_var_kw)
     TARGET_WITH_IMPL(CALL_FUNCTION_KW, _call_function_var_kw)
     TARGET(CALL_FUNCTION_VAR_KW)
     _call_function_var_kw:
     {
         int na = oparg & 0xff | oparg >> 8 & 0xff00;
         int nk = (oparg & 0xff00 | oparg >> 8 & 0xff0000) >> 8;


call_function:
   int na = oparg & 0xff | oparg >> 8 & 0xff00;
   int nk = (oparg & 0xff00 | oparg >> 8 & 0xff0000) >> 8;
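
To make the packing easier to follow, here is the same arithmetic as a
Python sketch (added for illustration; it mirrors the expressions above):

def pack_counts(na, nk):
    # Each 16-bit half of the widened oparg holds one byte of na and one
    # byte of nk, as in compiler_call_helper above.
    n = na & 0xff | (na & 0xff00) << 8
    n |= (nk & 0xff | (nk & 0xff00) << 8) << 8
    return n

def unpack_counts(oparg):
    # The inverse operation, as in call_function above.
    na = oparg & 0xff | oparg >> 8 & 0xff00
    nk = (oparg & 0xff00 | oparg >> 8 & 0xff0000) >> 8
    return na, nk

# 500 positional and 500 keyword arguments give 16905460, the oparg that
# dis prints for CALL_FUNCTION in the example below.
assert pack_counts(500, 500) == 16905460
assert unpack_counts(16905460) == (500, 500)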


A quick example:

s = '''def f(*Args, **Keywords):
    print('Got', len(Args), 'arguments and', len(Keywords), 'keywords')

def g():
    f(''' + ', '.join(str(i) for i in range(500)) + ', ' + ', '.join(
    'k{}={}'.format(i, i) for i in range(500)) + ''')

g()
'''

c = compile(s, '<string>', 'exec')
eval(c)
from dis import dis
dis(g)


The output is:

Got 500 arguments and 500 keywords

5 0 LOAD_GLOBAL 0 (f)
3 LOAD_CONST 1 (0)
6 LOAD_CONST 2 (1)
[...]
1497 LOAD_CONST 499 (498)
1500 LOAD_CONST 500 (499)
1503 LOAD_CONST 501 ('k0')
1506 LOAD_CONST 1 (0)
1509 LOAD_CONST 502 ('k1')
1512 LOAD_CONST 2 (1)
[...]
4491 LOAD_CONST 999 ('k498')
4494 LOAD_CONST 499 (498)
4497 LOAD_CONST 1000 ('k499')
4500 LOAD_CONST 500 (499)
4503 EXTENDED_ARG 257
4506 CALL_FUNCTION 16905460
4509 POP_TOP
4510 LOAD_CONST 0 (None)
4513 RETURN_VALUE

The dis module seems to have a problem displaying the correct extended
value, but I have no time now to check and fix it.

Anyway, I'm still unconvinced of the need to raise the function def/call
limits.

Cesare

From cesare.di.mauro at gmail.com  Fri Oct 22 22:36:49 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Fri, 22 Oct 2010 22:36:49 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <AANLkTi=Z8FaCXr=EArvzUY+s6soQcrY4eM_+Es8g_EKJ@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<AANLkTi=Z8FaCXr=EArvzUY+s6soQcrY4eM_+Es8g_EKJ@mail.gmail.com>
Message-ID: <AANLkTikkbwaowXtC09hW0BjS_yU2Ekv_C3UHyn_o6iKO@mail.gmail.com>

2010/10/22 Marcin 'Qrczak' Kowalczyk <qrczak at knm.org.pl>

> 2010/10/22 Cesare Di Mauro <cesare.di.mauro at gmail.com>:
>
> > I think that having more than 255 arguments for a function call is a very
> > rare case for which a workaround (may be passing a tuple/list or a
> > dictionary) can be a better solution than having to introduce a brand new
> > opcode to handle it.
>
> It does not need a new opcode. The bytecode can create an argument
> tuple explicitly and pass it like it passes *args.
>
> --
> Marcin Kowalczyk
>

It'll be too slow. The current CALL_FUNCTION* opcodes use "packed" C ints,
not PyLongObject ints.

With a tuple you need (at least) to extract the PyLongs and convert them
to C ints before using them.

Cesare

From mal at egenix.com  Sat Oct 23 00:36:30 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 23 Oct 2010 00:36:30 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
Message-ID: <4CC211EE.1050308@egenix.com>

Cesare Di Mauro wrote:
> 2010/10/22 M.-A. Lemburg <mal at egenix.com>
> 
>> Cesare Di Mauro wrote:
>>> I think that having more than 255 arguments for a function call is a very
>>> rare case for which a workaround (maybe passing a tuple/list or a
>>> dictionary) can be a better solution than having to introduce a brand new
>>> opcode to handle it.
>>
>> It's certainly rare when writing applications by hand, but such
>> limits can be reached with code generators wrapping external resources
>> such as database query rows, spreadsheet rows, sensor data input, etc.
>>
>> We've had such a limit before (number of lines in a module) and that
>> was raised for the same reason.
>>
>>> Changing the current opcode(s) is a very bad idea, since common cases
>> will
>>> slow down.
>>
>> I'm sure there are ways to avoid that, e.g. by using EXTENDED_ARG
>> for such cases.
>>
>> --
>> Marc-Andre Lemburg
>> eGenix.com
>>
> 
> I've patched Python 3.2 alpha 3 with a rough solution using EXTENDED_ARG for
> CALL_FUNCTION* opcodes, raising the arguments and keywords limits to 65535
> maximum. I hope it'll be enough. :)

Sure, we don't have to raise it to 2**64 :-) Looks like a pretty simple fix,
indeed.

I wish we could get rid of all the byte shifting and div'ery
used in the byte compiler - I'm pretty sure that such operations
are rather slow nowadays compared to working with 16-bit or 32-bit
integers and dropping the notion of taking the word "byte"
in byte code literally.

> In ast.c:
> 
> ast_for_arguments:
> if (nposargs > 65535 || nkwonlyargs > 65535) {
>     ast_error(n, "more than 65535 arguments");
>     return NULL;
> }
> 
> ast_for_call:
> if (nargs + ngens > 65535 || nkeywords > 65535) {
>     ast_error(n, "more than 65535 arguments");
>     return NULL;
> }
> 
> 
> In compile.c:
> 
> opcode_stack_effect:
> #define NARGS(o) (((o) & 0xff) + ((o) >> 8 & 0xff00) + \
>                   2*(((o) >> 8 & 0xff) + ((o) >> 16 & 0xff00)))
>      case CALL_FUNCTION:
>          return -NARGS(oparg);
>      case CALL_FUNCTION_VAR:
>      case CALL_FUNCTION_KW:
>          return -NARGS(oparg)-1;
>      case CALL_FUNCTION_VAR_KW:
>          return -NARGS(oparg)-2;
> #undef NARGS
> #define NARGS(o) (((o) % 256) + 2*(((o) / 256) % 256))
>      case MAKE_FUNCTION:
>          return -NARGS(oparg) - ((oparg >> 16) & 0xffff);
>      case MAKE_CLOSURE:
>          return -1 - NARGS(oparg) - ((oparg >> 16) & 0xffff);
> #undef NARGS
> 
> compiler_call_helper:
> int len;
> int code = 0;
> 
> len = asdl_seq_LEN(args) + n;
> n = len & 0xff | (len & 0xff00) << 8;
> VISIT_SEQ(c, expr, args);
> if (keywords) {
>   VISIT_SEQ(c, keyword, keywords);
>   len = asdl_seq_LEN(keywords);
>   n |= (len & 0xff | (len & 0xff00) << 8) << 8;
> }
> 
> 
> In ceval.c:
> 
> PyEval_EvalFrameEx:
>      TARGET_WITH_IMPL(CALL_FUNCTION_VAR, _call_function_var_kw)
>      TARGET_WITH_IMPL(CALL_FUNCTION_KW, _call_function_var_kw)
>      TARGET(CALL_FUNCTION_VAR_KW)
>      _call_function_var_kw:
>      {
>          int na = oparg & 0xff | oparg >> 8 & 0xff00;
>          int nk = (oparg & 0xff00 | oparg >> 8 & 0xff0000) >> 8;
> 
> 
> call_function:
>    int na = oparg & 0xff | oparg >> 8 & 0xff00;
>    int nk = (oparg & 0xff00 | oparg >> 8 & 0xff0000) >> 8;
> 
> 
> A quick example:
> 
> s = '''def f(*Args, **Keywords):
>     print('Got', len(Args), 'arguments and', len(Keywords), 'keywords')
> 
> def g():
>     f(''' + ', '.join(str(i) for i in range(500)) + ', ' + ', '.join(
>     'k{}={}'.format(i, i) for i in range(500)) + ''')
> 
> g()
> '''
> 
> c = compile(s, '<string>', 'exec')
> eval(c)
> from dis import dis
> dis(g)
> 
> 
> The output is:
> 
> Got 500 arguments and 500 keywords
> 
> 5 0 LOAD_GLOBAL 0 (f)
> 3 LOAD_CONST 1 (0)
> 6 LOAD_CONST 2 (1)
> [...]
> 1497 LOAD_CONST 499 (498)
> 1500 LOAD_CONST 500 (499)
> 1503 LOAD_CONST 501 ('k0')
> 1506 LOAD_CONST 1 (0)
> 1509 LOAD_CONST 502 ('k1')
> 1512 LOAD_CONST 2 (1)
> [...]
> 4491 LOAD_CONST 999 ('k498')
> 4494 LOAD_CONST 499 (498)
> 4497 LOAD_CONST 1000 ('k499')
> 4500 LOAD_CONST 500 (499)
> 4503 EXTENDED_ARG 257
> 4506 CALL_FUNCTION 16905460
> 4509 POP_TOP
> 4510 LOAD_CONST 0 (None)
> 4513 RETURN_VALUE
> 
> The dis module seems to have a problem displaying the correct extended
> value, but I have no time now to check and fix it.
> 
> Anyway, I'm still unconvinced of the need to raise the function def/call
> limits.

It may seem strange to have functions, methods or object constructors
with more than 255 parameters, but as I said: when using code generators,
the generators don't care whether they use 100 or 300 parameters, even if
just 10 parameters are actually used later on. However, the user
will care a lot if the generators fail due to such limits and then become
unusable.

As example, take a database query method that exposes 3-4 parameters
for each query field. In more complex database schemas that you find
in e.g. data warehouse applications, it is not uncommon to have
100+ query fields or columns in a data table.

With the current
limit in function/call argument counts, such a model could not be
mapped directly to Python. Instead, you'd have to turn to solutions
based on other data structures that are not automatically checked
by Python when calling methods/functions.
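
A small concrete illustration of that failure mode (a sketch added here;
the error message is the one raised by the 3.x compiler):

# A generator that emits one keyword parameter per column breaks as soon
# as the schema exceeds 255 fields:
params = ", ".join("col{}=None".format(i) for i in range(300))
src = "def query({}):\n    pass\n".format(params)
compile(src, "<generated>", "exec")   # SyntaxError: more than 255 arguments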

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 22 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From solipsis at pitrou.net  Sat Oct 23 00:45:08 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 23 Oct 2010 00:45:08 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com>
Message-ID: <20101023004508.6a6c1373@pitrou.net>

On Sat, 23 Oct 2010 00:36:30 +0200
"M.-A. Lemburg" <mal at egenix.com> wrote:
> 
> It may seem strange to have functions, methods or object constructors
> with more than 255 parameters, but as I said: when using code generators,
> the generators don't care whether they use 100 or 300 parameters.

Why not make the code generators smarter?
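
For instance, a generator could emit a **fields signature and validate the
columns by hand (a sketch; KNOWN_COLUMNS is a made-up name, and this gives
up the automatic argument checking mentioned earlier in the thread):

KNOWN_COLUMNS = frozenset("col{}".format(i) for i in range(300))

def query(**fields):
    # Re-implement by hand what Python would otherwise check at call time.
    unknown = set(fields) - KNOWN_COLUMNS
    if unknown:
        raise TypeError("unexpected columns: " + ", ".join(sorted(unknown)))
    return fields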





From cesare.di.mauro at gmail.com  Sat Oct 23 08:07:48 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Sat, 23 Oct 2010 08:07:48 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <4CC211EE.1050308@egenix.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com>
Message-ID: <AANLkTi=kD_B0gqTzx5o_wYM0nhtbSCaGeSqgU-WwArmj@mail.gmail.com>

2010/10/23 M.-A. Lemburg <mal at egenix.com>

>
> I wish we could get rid of all the byte shifting and div'ery
> used in the byte compiler - I'm pretty sure that such operations
> are rather slow nowadays compared to working with 16-bit or 32-bit
> integers and dropping the notion of taking the word "byte"
> in byte code literally.
>

Unfortunately we can't remove such shift & masking operations, even on
non-byte(code) compilers/VMs.

In wpython I handle 16- or 32-bit opcodes (it works on multiples of 16-bit
words), but I have:
- specialized opcodes to call functions and procedures (functions which
trash the result) which handle the most common cases (84-85% on average
from the stats that I have collected from some projects and the standard
library); I have packed a 4-bit nargs and a 4-bit nkwargs into a single byte
in order to obtain a short (and fast) 16-bit opcode (see the sketch below);
- big endian systems still need to extract and "rotate" the bytes to get the
correct word(s) value.

So, even on word (and longword) representations, they are needed.

The good thing is that they can be handled fairly fast because oparg stays in
one register, and the na and nk vars read (and manipulate) it independently, so
a (common) out-of-order processor can do a good job, scheduling and
parallelizing such instructions, leaving a few final dependencies (when
recombining shift and/or mask partial results).
Some work can also be done reordering the instructions to enhance execution
on in-order processors.
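
The 4+4 packing from the first bullet, sketched in Python (an illustration;
which nibble holds which count is an assumption):

def pack44(nargs, nkwargs):
    # Both counts must fit in a nibble to qualify for the short 16-bit form.
    assert 0 <= nargs < 16 and 0 <= nkwargs < 16
    return nargs | (nkwargs << 4)

def unpack44(packed):
    return packed & 0x0f, packed >> 4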

> It may seem strange to have functions, methods or object constructors
> with more than 255 parameters, but as I said: when using code generators,
> the generators don't care whether they use 100 or 300 parameters, even if
> just 10 parameters are actually used later on. However, the user
> will care a lot if the generators fail due to such limits and then become
> unusable.
>
> As example, take a database query method that exposes 3-4 parameters
> for each query field. In more complex database schemas that you find
> in e.g. data warehouse applications, it is not uncommon to have
> 100+ query fields or columns in a data table.
>
> With the current
> limit in function/call argument counts, such a model could not be
> mapped directly to Python. Instead, you'd have to turn to solutions
> based on other data structures that are not automatically checked
> by Python when calling methods/functions.
>
> --
> Marc-Andre Lemburg
>

I understood the problem, but I don't know if this is the correct solution.
Anyway, now there's at least one solution. :)

Cesare

From scott+python-ideas at scottdial.com  Sat Oct 23 06:55:35 2010
From: scott+python-ideas at scottdial.com (Scott Dial)
Date: Sat, 23 Oct 2010 00:55:35 -0400
Subject: [Python-ideas] Add a command line option to adjust sys.path?
 (was Re: Add a site.cfg to keep a persistent list of paths)
In-Reply-To: <AANLkTikQAkQu9a5SaDy_hZB_625G64BdeBFRoGs3z=G7@mail.gmail.com>
References: <AANLkTi=C_nWCkVj16mdv6UVpu6jZBueNZS3FuLWScfNR@mail.gmail.com>	<4CBFC331.8020309@ronadam.com>
	<AANLkTikQAkQu9a5SaDy_hZB_625G64BdeBFRoGs3z=G7@mail.gmail.com>
Message-ID: <4CC26AC7.6060801@scottdial.com>

On 10/21/2010 2:43 AM, Nick Coghlan wrote:
> This idea is only aimed at developers. To run an actual Python
> application that needs additional modules, either install it properly
> or put it in a zipfile or directory, put a __main__.py at the top
> level and just run the zipfile/directory directly.

If this is only aimed at developers, then why isn't

PYTHONPATH="versionA:${PYTHONPATH}" python run_tests.py
PYTHONPATH="versionB:${PYTHONPATH}" python run_tests.py

completely and utterly sufficient for the job for those developers?

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu


From ncoghlan at gmail.com  Sat Oct 23 16:21:09 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 24 Oct 2010 00:21:09 +1000
Subject: [Python-ideas] Add a command line option to adjust sys.path?
 (was Re: Add a site.cfg to keep a persistent list of paths)
In-Reply-To: <4CC26AC7.6060801@scottdial.com>
References: <AANLkTi=C_nWCkVj16mdv6UVpu6jZBueNZS3FuLWScfNR@mail.gmail.com>
	<4CBFC331.8020309@ronadam.com>
	<AANLkTikQAkQu9a5SaDy_hZB_625G64BdeBFRoGs3z=G7@mail.gmail.com>
	<4CC26AC7.6060801@scottdial.com>
Message-ID: <AANLkTimcz-QepcQCMCw+hvJYgT8545o1riM_LxZCoY+y@mail.gmail.com>

On Sat, Oct 23, 2010 at 2:55 PM, Scott Dial
<scott+python-ideas at scottdial.com> wrote:
> On 10/21/2010 2:43 AM, Nick Coghlan wrote:
>> This idea is only aimed at developers. To run an actual Python
>> application that needs additional modules, either install it properly
>> or put it in a zipfile or directory, put a __main__.py at the top
>> level and just run the zipfile/directory directly.
>
> If this is only aimed at developers, then why isn't
>
> PYTHONPATH="versionA:${PYTHONPATH}" python run_tests.py
> PYTHONPATH="versionB:${PYTHONPATH}" python run_tests.py
>
> completely and utterly sufficient for the job for those developers?

Without the addition of the ability to supply a .pth file instead, I
would tend to agree with you. There's a reason I'd never actually made
the suggestion before, despite first thinking of it ages ago.
(Although, I'll also point out that your suggestion doesn't work on
Windows, which has its own idiosyncratic way of dealing with
environment variables).

The proposed command line switch would also be compatible with -E,
which is *not* the case for any approach based on modifying
PYTHONPATH.

Cheers,
Nick.

-- 
Nick Coghlan  |  ncoghlan at gmail.com  |  Brisbane, Australia


From rrr at ronadam.com  Sat Oct 23 20:32:15 2010
From: rrr at ronadam.com (Ron Adam)
Date: Sat, 23 Oct 2010 13:32:15 -0500
Subject: [Python-ideas] Add a command line option to adjust sys.path?
 (was Re: Add a site.cfg to keep a persistent list of paths)
In-Reply-To: <AANLkTimcz-QepcQCMCw+hvJYgT8545o1riM_LxZCoY+y@mail.gmail.com>
References: <AANLkTi=C_nWCkVj16mdv6UVpu6jZBueNZS3FuLWScfNR@mail.gmail.com>	<4CBFC331.8020309@ronadam.com>	<AANLkTikQAkQu9a5SaDy_hZB_625G64BdeBFRoGs3z=G7@mail.gmail.com>	<4CC26AC7.6060801@scottdial.com>
	<AANLkTimcz-QepcQCMCw+hvJYgT8545o1riM_LxZCoY+y@mail.gmail.com>
Message-ID: <4CC32A2F.6040408@ronadam.com>



On 10/23/2010 09:21 AM, Nick Coghlan wrote:
> On Sat, Oct 23, 2010 at 2:55 PM, Scott Dial
> <scott+python-ideas at scottdial.com>  wrote:
>> On 10/21/2010 2:43 AM, Nick Coghlan wrote:
>>> This idea is only aimed at developers. To run an actual Python
>>> application that needs additional modules, either install it properly
>>> or put it in a zipfile or directory, put a __main__.py at the top
>>> level and just run the zipfile/directory directly.
>>
>> If this is only aimed at developers, then why isn't
>>
>> PYTHONPATH="versionA:${PYTHONPATH}" python run_tests.py
>> PYTHONPATH="versionB:${PYTHONPATH}" python run_tests.py
>>
>> completely and utterly sufficient for the job for those developers?
>
> Without the addition of the ability to supply a .pth file instead, I
> would tend to agree with you. There's a reason I'd never actually made
> the suggestion before, despite first thinking of it ages ago.
> (Although, I'll also point out that your suggestion doesn't work on
> Windows, which has its own idiosyncratic way of dealing with
> environment variables).
>
> The proposed command line switch would also be compatible with -E,
> which is *not* the case for any approach based on modifying
> PYTHONPATH.

When you say "developers", do you mean developers of python, or developers 
with python?  I presumed the latter.

Ron










From scott+python-ideas at scottdial.com  Sat Oct 23 20:56:46 2010
From: scott+python-ideas at scottdial.com (Scott Dial)
Date: Sat, 23 Oct 2010 14:56:46 -0400
Subject: [Python-ideas] Add a command line option to adjust sys.path?
 (was Re: Add a site.cfg to keep a persistent list of paths)
In-Reply-To: <4CC32A2F.6040408@ronadam.com>
References: <AANLkTi=C_nWCkVj16mdv6UVpu6jZBueNZS3FuLWScfNR@mail.gmail.com>	<4CBFC331.8020309@ronadam.com>	<AANLkTikQAkQu9a5SaDy_hZB_625G64BdeBFRoGs3z=G7@mail.gmail.com>	<4CC26AC7.6060801@scottdial.com>
	<AANLkTimcz-QepcQCMCw+hvJYgT8545o1riM_LxZCoY+y@mail.gmail.com>
	<4CC32A2F.6040408@ronadam.com>
Message-ID: <4CC32FEE.3020708@scottdial.com>

On 10/23/2010 2:32 PM, Ron Adam wrote:
> When you say "developers", do you mean developers of python, or
> developers with python?  I presumed the latter.

I intended "developers" to mean anyone proficient with the use of python
as a tool or anyone who should be bothered to read "--help" to find out
about how to much with the path (e.g., PYTHONPATH).

> On 10/23/2010 09:21 AM, Nick Coghlan wrote:
>> The proposed command line switch would also be compatible with -E,
>> which is *not* the case for any approach based on modifying
>> PYTHONPATH.

Does anyone actually use -E? Is that a critical feature worth adding yet
another way to add something to sys.path for? I don't find "-p" to be a
confusing addition to the switch flag set, so I would say I am mostly a
-0 on adding another flag for this purpose unless it has serious
advantages over PYTHONPATH.

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu


From ianb at colorstudy.com  Sat Oct 23 21:32:37 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 23 Oct 2010 14:32:37 -0500
Subject: [Python-ideas] pythonv / Python path
Message-ID: <AANLkTim_pELZ0yJS-+kn2LdAN1OLXOXdtehDawQj6Rk2@mail.gmail.com>

The recent path discussion reminded me of a project I talked about with
Larry Hastings at the last PyCon about virtualenv and what could possibly be
included into Python.  Larry worked on a prototype that I was supposed to do
something with, and then I didn't, which is lame of me but deserves some
note:

http://bitbucket.org/larry/pythonv/src

It satisfies several requirements that I feel virtualenv accomplishes and a
lot of other systems do not; but it also has a few useful features
virtualenv doesn't have and is much simpler (mostly because it has a
compiled component, and changes the system site.py).

The features I think are important:

* Works with "environments", which is a set of paths and installed
components that work together (instead of just ad hoc single path extensions
like adding one entry to PYTHONPATH)
* Modifies sys.prefix, so all the existing installation tools respect the
new environment
* Works with #!, which basically means it needs its own per-environment
interpreter, as #! is so completely broken that it can't have any real
arguments (though it occurs to me that a magic comment could work)
* Doesn't use environmental variables (actually it uses them internally, but
not in a way that is exposed to developers) -- for instance, hg should not
be affected by whatever development you are doing just because it happens to
be written in Python

Anyway, I think what Larry did with pythonv accomplishes a lot of these
things, and probably some more constraints that I've forgotten about.  It
does have a more complicated/dangerous installation procedure than
virtualenv (but if it was part of Python proper that would be okay).

-- 
Ian Bicking  |  http://blog.ianbicking.org

From debatem1 at gmail.com  Sat Oct 23 22:14:18 2010
From: debatem1 at gmail.com (geremy condra)
Date: Sat, 23 Oct 2010 13:14:18 -0700
Subject: [Python-ideas] stats module Was: minmax() function ...
In-Reply-To: <201010180357.59264.steve@pearwood.info>
References: <201010161111.21847.steve@pearwood.info>
	<622121A3-6A51-4735-A292-9F82502BB623@gmail.com>
	<201010180357.59264.steve@pearwood.info>
Message-ID: <AANLkTim3UGEK+A1wn8+tQQ0+Hj3FuzF3ZNdQT6n0L0cy@mail.gmail.com>

On Sun, Oct 17, 2010 at 9:57 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Sat, 16 Oct 2010 11:33:02 am Raymond Hettinger wrote:
>> > Are you still interested in working on it, or is this a subtle hint
>> > that somebody else should do so?
>>
>> Hmm, perhaps this would be less subtle:
>> HEY, WHY DON'T YOU GUYS GO TO WORK ON A STATS MODULE!
>
>
> http://pypi.python.org/pypi/stats
>
> It is not even close to production ready. It needs unit tests. The API
> should be considered unstable. There's no 3.x version yet. Obviously it
> has no real-world usage. But if anyone would like to contribute,
> critique or criticize, I welcome feedback or assistance, or even just
> encouragement.

Just an update on this, there's a sprint planned to work on this this
coming Thursday at the University of Washington in Seattle. We'll also
be set up for people to join us remotely if anybody's interested.
Here's the link to the signup: http://goo.gl/PJn4

Geremy Condra


From brett at python.org  Sat Oct 23 23:27:25 2010
From: brett at python.org (Brett Cannon)
Date: Sat, 23 Oct 2010 14:27:25 -0700
Subject: [Python-ideas] pythonv / Python path
In-Reply-To: <AANLkTim_pELZ0yJS-+kn2LdAN1OLXOXdtehDawQj6Rk2@mail.gmail.com>
References: <AANLkTim_pELZ0yJS-+kn2LdAN1OLXOXdtehDawQj6Rk2@mail.gmail.com>
Message-ID: <AANLkTi=DqHPo_97_2z6TD8UWCp=VN5x9kZN1k6yb=6-A@mail.gmail.com>

Is this email meant to simply point out the existence of pythonv, or
to start a conversation about whether something should be tweaked in
Python so as to make pythonv/virtualenv easier to implement/use?

If it's the latter then let's have the conversation! This was brought
up at the PyCon US 2010 language summit and the consensus was that
modifying Python to make something like virtualenv or pythonv easier
to implement is completely acceptable and something worth doing.

On Sat, Oct 23, 2010 at 12:32, Ian Bicking <ianb at colorstudy.com> wrote:
> The recent path discussion reminded me of a project I talked about with
> Larry Hastings at the last PyCon about virtualenv and what could possibly be
> included into Python.  Larry worked on a prototype that I was supposed to do
> something with, and then I didn't, which is lame of me but deserves some
> note:
>
> http://bitbucket.org/larry/pythonv/src
>
> It satisfies several requirements that I feel virtualenv accomplishes and a
> lot of other systems do not; but it also has a few useful features
> virtualenv doesn't have and is much simpler (mostly because it has a
> compiled component, and changes the system site.py).
>
> The features I think are important:
>
> * Works with "environments", which is a set of paths and installed
> components that work together (instead of just ad hoc single path extensions
> like adding one entry to PYTHONPATH)
> * Modifies sys.prefix, so all the existing installation tools respect the
> new environment
> * Works with #!, which basically means it needs its own per-environment
> interpreter, as #! is so completely broken that it can't have any real
> arguments (though it occurs to me that a magic comment could work)
> * Doesn't use environmental variables (actually it uses them internally, but
> not in a way that is exposed to developers) -- for instance, hg should not
> be affected by whatever development you are doing just because it happens to
> be written in Python
>
> Anyway, I think what Larry did with pythonv accomplishes a lot of these
> things, and probably some more constraints that I've forgotten about.  It
> does have a more complicated/dangerous installation procedure than
> virtualenv (but if it was part of Python proper that would be okay).
>
> --
> Ian Bicking  |  http://blog.ianbicking.org
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>


From ianb at colorstudy.com  Sat Oct 23 23:33:38 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 23 Oct 2010 16:33:38 -0500
Subject: [Python-ideas] pythonv / Python path
In-Reply-To: <AANLkTi=DqHPo_97_2z6TD8UWCp=VN5x9kZN1k6yb=6-A@mail.gmail.com>
References: <AANLkTim_pELZ0yJS-+kn2LdAN1OLXOXdtehDawQj6Rk2@mail.gmail.com>
	<AANLkTi=DqHPo_97_2z6TD8UWCp=VN5x9kZN1k6yb=6-A@mail.gmail.com>
Message-ID: <AANLkTimJzp+ApPV8x3ApFfa_URDJKxmEt8mzBKt+c+_7@mail.gmail.com>

On Sat, Oct 23, 2010 at 4:27 PM, Brett Cannon <brett at python.org> wrote:

> Is this email meant to simply point out the existence of pythonv, or
> to start a conversation about whether something should be tweaked in
> Python so as to make pythonv/virtualenv easier to implement/use?
>

Both?  I have felt guilty for not following up on what Larry did, so this is
my other-people-should-think-about-this-too email.


> If it's the latter then let's have the conversation! This was brought
> up at the PyCon US 2010 language summit and the consensus was that
> modifying Python to make something like virtualenv or pythonv easier
> to implement is completely acceptable and something worth doing.
>

OK, sure!  Mostly it's about changing site.py.  The pythonv executable is
itself very simple, just a shim to make #! easier.  For Windows it would
have to be different (maybe similar to Setuptools' cmd.exe), but... I think
it's possible, and I'd just hope some Windows people would explore what
specifically is needed.

virtualenv has another feature, which isn't part of pythonv and needn't be
part of Python, which is to bootstrap installation tools.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From raymond.hettinger at gmail.com  Sun Oct 24 00:05:30 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sat, 23 Oct 2010 15:05:30 -0700
Subject: [Python-ideas] pythonv / Python path
In-Reply-To: <AANLkTi=DqHPo_97_2z6TD8UWCp=VN5x9kZN1k6yb=6-A@mail.gmail.com>
References: <AANLkTim_pELZ0yJS-+kn2LdAN1OLXOXdtehDawQj6Rk2@mail.gmail.com>
	<AANLkTi=DqHPo_97_2z6TD8UWCp=VN5x9kZN1k6yb=6-A@mail.gmail.com>
Message-ID: <92006216-1056-4B98-867A-41559A894B1B@gmail.com>


On Oct 23, 2010, at 2:27 PM, Brett Cannon wrote:

> Is this email meant to simply point out the existence of pythonv, or
> to start a conversation about whether something should be tweaked in
> Python so as to make pythonv/virtualenv easier to implement/use?
> 
> If it's the latter then let's have the conversation! This was brought
> up at the PyCon US 2010 language summit and the consensus was that
> modifying Python to make something like virtualenv or pythonv easier
> to implement is completely acceptable and something worth doing.

+1


Raymond


From brett at python.org  Sun Oct 24 00:11:44 2010
From: brett at python.org (Brett Cannon)
Date: Sat, 23 Oct 2010 15:11:44 -0700
Subject: [Python-ideas] pythonv / Python path
In-Reply-To: <AANLkTimJzp+ApPV8x3ApFfa_URDJKxmEt8mzBKt+c+_7@mail.gmail.com>
References: <AANLkTim_pELZ0yJS-+kn2LdAN1OLXOXdtehDawQj6Rk2@mail.gmail.com>
	<AANLkTi=DqHPo_97_2z6TD8UWCp=VN5x9kZN1k6yb=6-A@mail.gmail.com>
	<AANLkTimJzp+ApPV8x3ApFfa_URDJKxmEt8mzBKt+c+_7@mail.gmail.com>
Message-ID: <AANLkTikpxxagou0yhaTEcVVR0GDKHVKOZX0yLLzwx3pG@mail.gmail.com>

On Sat, Oct 23, 2010 at 14:33, Ian Bicking <ianb at colorstudy.com> wrote:
> On Sat, Oct 23, 2010 at 4:27 PM, Brett Cannon <brett at python.org> wrote:
>>
>> Is this email meant to simply point out the existence of pythonv, or
>> to start a conversation about whether something should be tweaked in
>> Python so as to make pythonv/virtualenv easier to implement/use?
>
> Both?  I have felt guilty for not following up on what Larry did, so this is
> my other-people-should-think-about-this-too email.
>
>>
>> If it's the latter then let's have the conversation! This was brought
>> up at the PyCon US 2010 language summit and the consensus was that
>> modifying Python to make something like virtualenv or pythonv easier
>> to implement is completely acceptable and something worth doing.
>
> OK, sure!  Mostly it's about changing site.py.

OK, what exactly needs to change?

> The pythonv executable is
> itself very simple, just a shim to make #! easier.

But that's in C, though, right? What exactly does it do? It would be
best if the shim can be in Python so that other VMs can work with it.

-Brett

> For Windows it would
> have to be different (maybe similar to Setuptools' cmd.exe), but... I think
> it's possible, and I'd just hope some Windows people would explore what
> specifically is needed.
>
> virtualenv has another feature, which isn't part of pythonv and needn't be
> part of Python, which is to bootstrap installation tools.
>
> --
> Ian Bicking  |  http://blog.ianbicking.org
>


From ianb at colorstudy.com  Sun Oct 24 00:16:52 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 23 Oct 2010 17:16:52 -0500
Subject: [Python-ideas] pythonv / Python path
In-Reply-To: <AANLkTikpxxagou0yhaTEcVVR0GDKHVKOZX0yLLzwx3pG@mail.gmail.com>
References: <AANLkTim_pELZ0yJS-+kn2LdAN1OLXOXdtehDawQj6Rk2@mail.gmail.com>
	<AANLkTi=DqHPo_97_2z6TD8UWCp=VN5x9kZN1k6yb=6-A@mail.gmail.com>
	<AANLkTimJzp+ApPV8x3ApFfa_URDJKxmEt8mzBKt+c+_7@mail.gmail.com>
	<AANLkTikpxxagou0yhaTEcVVR0GDKHVKOZX0yLLzwx3pG@mail.gmail.com>
Message-ID: <AANLkTimxz5AgB8J9jN7CaYYByTeLGkFsj5BAnS0_8Z0f@mail.gmail.com>

On Sat, Oct 23, 2010 at 5:11 PM, Brett Cannon <brett at python.org> wrote:

> On Sat, Oct 23, 2010 at 14:33, Ian Bicking <ianb at colorstudy.com> wrote:
> > On Sat, Oct 23, 2010 at 4:27 PM, Brett Cannon <brett at python.org> wrote:
> >>
> >> Is this email meant to simply point out the existence of pythonv, or
> >> to start a conversation about whether something should be tweaked in
> >> Python so as to make pythonv/virtualenv easier to implement/use?
> >
> > Both?  I have felt guilty for not following up on what Larry did, so this
> is
> > my other-people-should-think-about-this-too email.
> >
> >>
> >> If it's the latter then let's have the conversation! This was brought
> >> up at the PyCon US 2010 language summit and the consensus was that
> >> modifying Python to make something like virtualenv or pythonv easier
> >> to implement is completely acceptable and something worth doing.
> >
> > OK, sure!  Mostly it's about changing site.py.
>
> OK, what exactly needs to change?
>

Well, add a notion of "prefixes", where the system sys.prefix is one item,
but the environment location is the "active" sys.prefix.  Then most of the
site.py changes can follow logically from that (whatever you do for the one
prefix, do for all prefixes).  Then there's a matter of using an
environmental variable to add a new prefix (or multiple prefixes --
inheritable virtualenvs, if virtualenv allowed such a thing).  In the
pythonv implementation it sets that variable, and site.py deletes that
variable (it could be a command-line switch, that's just slightly hard to
implement -- but it's not intended as an inheritable attribute of the
execution environment like PYTHONPATH).
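
A minimal sketch of that prefix handling (illustration only; the variable
name PYTHONV_PREFIX is made up):

import os
import sys

PREFIXES = [sys.prefix]
# The environment-supplied prefix becomes the "active" one, consulted
# first; popping the variable keeps it from leaking to child processes.
_extra = os.environ.pop('PYTHONV_PREFIX', None)
if _extra:
    PREFIXES.insert(0, _extra)

for prefix in PREFIXES:
    sitedir = os.path.join(prefix, 'lib',
                           'python%d.%d' % sys.version_info[:2],
                           'site-packages')
    if os.path.isdir(sitedir) and sitedir not in sys.path:
        sys.path.insert(0, sitedir)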

> > The pythonv executable is
> > itself very simple, just a shim to make #! easier.
>
> But that's in C, though, right? What exactly does it do? It would be
> best if the shim can be in Python so that other VMs can work with it.
>

#! doesn't work with a Python target, otherwise it would be easy to
implement in Python.  #! is awful.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From greg.ewing at canterbury.ac.nz  Sun Oct 24 01:26:25 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 24 Oct 2010 12:26:25 +1300
Subject: [Python-ideas] New 3.x restriction on number of
	keyword	arguments
In-Reply-To: <AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
Message-ID: <4CC36F21.3040306@canterbury.ac.nz>

Cesare Di Mauro wrote:
> I think that having max 255 args and 255 kwargs is a good and reasonable 
> limit which we can live with, and helps the virtual machine implementation

Is there any corresponding limit to the number of arguments to
tuple and dict constructor? If not, the limit could perhaps be
circumvented without changing the VM by having the compiler
convert calls with large numbers of args into code that builds
an appropriate tuple and dict and makes a *args/**kwds call.

-- 
Greg


From greg.ewing at canterbury.ac.nz  Sun Oct 24 01:52:25 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 24 Oct 2010 12:52:25 +1300
Subject: [Python-ideas] New 3.x restriction on number of
	keyword	arguments
In-Reply-To: <4CC211EE.1050308@egenix.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com>
Message-ID: <4CC37539.6090900@canterbury.ac.nz>

M.-A. Lemburg wrote:

> I wish we could get rid of all the byte shifting and div'ery
> used in the byte compiler

I think it's there to take care of endianness issues.

-- 
Greg


From brett at python.org  Sun Oct 24 02:12:35 2010
From: brett at python.org (Brett Cannon)
Date: Sat, 23 Oct 2010 17:12:35 -0700
Subject: [Python-ideas] pythonv / Python path
In-Reply-To: <AANLkTimxz5AgB8J9jN7CaYYByTeLGkFsj5BAnS0_8Z0f@mail.gmail.com>
References: <AANLkTim_pELZ0yJS-+kn2LdAN1OLXOXdtehDawQj6Rk2@mail.gmail.com>
	<AANLkTi=DqHPo_97_2z6TD8UWCp=VN5x9kZN1k6yb=6-A@mail.gmail.com>
	<AANLkTimJzp+ApPV8x3ApFfa_URDJKxmEt8mzBKt+c+_7@mail.gmail.com>
	<AANLkTikpxxagou0yhaTEcVVR0GDKHVKOZX0yLLzwx3pG@mail.gmail.com>
	<AANLkTimxz5AgB8J9jN7CaYYByTeLGkFsj5BAnS0_8Z0f@mail.gmail.com>
Message-ID: <AANLkTik7hVGrTdoWm0Dw+S1XjwLhoJcC58N1b73NMrX6@mail.gmail.com>

On Sat, Oct 23, 2010 at 15:16, Ian Bicking <ianb at colorstudy.com> wrote:
> On Sat, Oct 23, 2010 at 5:11 PM, Brett Cannon <brett at python.org> wrote:
>>
>> On Sat, Oct 23, 2010 at 14:33, Ian Bicking <ianb at colorstudy.com> wrote:
>> > On Sat, Oct 23, 2010 at 4:27 PM, Brett Cannon <brett at python.org> wrote:
>> >>
>> >> Is this email meant to simply point out the existence of pythonv, or
>> >> to start a conversation about whether something should be tweaked in
>> >> Python so as to make pythonv/virtualenv easier to implement/use?
>> >
>> > Both?  I have felt guilty for not following up on what Larry did, so
>> > this is
>> > my other-people-should-think-about-this-too email.
>> >
>> >>
>> >> If it's the latter then let's have the conversation! This was brought
>> >> up at the PyCon US 2010 language summit and the consensus was that
>> >> modifying Python to make something like virtualenv or pythonv easier
>> >> to implement is completely acceptable and something worth doing.
>> >
>> > OK, sure!  Mostly it's about changing site.py.
>>
>> OK, what exactly needs to change?
>
> Well, add a notion of "prefixes", where the system sys.prefix is one item,
> but the environment location is the "active" sys.prefix.  Then most of the
> site.py changes can follow logically from that (whatever you do for the one
> prefix, do for all prefixes).  Then there's a matter of using an
> environmental variable to add a new prefix (or multiple prefixes --
> inheritable virtualenvs, if virtualenv allowed such a thing).  In the
> pythonv implementation it sets that variable, and site.py deletes that
> variable (it could be a command-line switch, that's just slightly hard to
> implement -- but it's not intended as an inheritable attribute of the
> execution environment like PYTHONPATH).

OK, so it sounds like site.py would just need a restructuring. That
sounds like just a technical challenge and not a
backwards-compatibility one. Am I right?

>
>> > The pythonv executable is
>> > itself very simple, just a shim to make #! easier.
>>
>> But that's in C, though, right? What exactly does it do? It would be
>> best if the shim can be in Python so that other VMs can work with it.
>
> #! doesn't work with a Python target, otherwise it would be easy to
> implement in Python.  #! is awful.

As in, a #! can't target a Python script that has been chmod'ed to be executable?


From cesare.di.mauro at gmail.com  Sun Oct 24 08:31:04 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Sun, 24 Oct 2010 08:31:04 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <4CC36F21.3040306@canterbury.ac.nz>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC36F21.3040306@canterbury.ac.nz>
Message-ID: <AANLkTikwa0XHrOe6wg02KcmAZY3qSoe6n++VhMjyQowB@mail.gmail.com>

2010/10/24 Greg Ewing <greg.ewing at canterbury.ac.nz>

> Cesare Di Mauro wrote:
>
>> I think that having max 255 args and 255 kwargs is a good and reasonable
>> limit which we can live with, and helps the virtual machine implementation
>>
>
> Is there any corresponding limit to the number of arguments to
>  tuple and dict constructor?


AFAIK there's no such limit. However, I'll use BUILD_TUPLE and BUILD_MAP
opcodes for that purpose, because they are faster.


> If not, the limit could perhaps be
> circumvented without changing the VM by having the compiler
> convert calls with large numbers of args into code that builds
> an appropriate tuple and dict and makes a *args/**kwds call.
>
> --
> Greg


I greatly prefer this solution, but it's a bit more complicated when there
are *args and/or **kwargs special arguments.

If we have > 255 args and *args is defined, we need to:
1) emit BUILD_TUPLE after pushing the regular arguments
2) emit LOAD_GLOBAL("tuple")
3) push *args
4) emit CALL_FUNCTION(1) to convert *args to a tuple
5) emit BINARY_ADD to append *args to the regular arguments
6) emit CALL_FUNCTION_VAR

If we have > 255 kwargs and **kwargs defined, we need to:
1) emit BUILD_MAP after pushing the regular keyword arguments
2) emit LOAD_ATTR("update")
3) push **kwargs
4) emit CALL_FUNCTION(1) to update the regular keyword arguments with the
ones in **kwargs
5) emit CALL_FUNCTION_KW

And, finally, combining all the above in the worst case.

But, as I said, I prefer this one to handle "complex" cases instead of
changing the VM and slowing down the common ones.
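
In source terms, the two sequences correspond roughly to this sketch
(added for illustration, with placeholder values and a placeholder callee):

# >255 positional arguments plus *args:
regular = tuple(range(300))           # 1) BUILD_TUPLE over the pushed args
merged = regular + tuple(range(5))    # 2-5) tuple(*args), then BINARY_ADD
print(*merged)                        # 6) CALL_FUNCTION_VAR

# >255 keyword arguments plus **kwargs:
def f(**kw):                          # placeholder callee
    return len(kw)

regular_kw = {'k{}'.format(i): i for i in range(300)}   # 1) BUILD_MAP
regular_kw.update(extra=1)            # 2-4) LOAD_ATTR "update" and call it
f(**regular_kw)                       # 5) CALL_FUNCTION_KW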

Cesare

From phd at phd.pp.ru  Sun Oct 24 17:40:05 2010
From: phd at phd.pp.ru (Oleg Broytman)
Date: Sun, 24 Oct 2010 19:40:05 +0400
Subject: [Python-ideas] pythonv / Python path
In-Reply-To: <AANLkTik7hVGrTdoWm0Dw+S1XjwLhoJcC58N1b73NMrX6@mail.gmail.com>
References: <AANLkTim_pELZ0yJS-+kn2LdAN1OLXOXdtehDawQj6Rk2@mail.gmail.com>
	<AANLkTi=DqHPo_97_2z6TD8UWCp=VN5x9kZN1k6yb=6-A@mail.gmail.com>
	<AANLkTimJzp+ApPV8x3ApFfa_URDJKxmEt8mzBKt+c+_7@mail.gmail.com>
	<AANLkTikpxxagou0yhaTEcVVR0GDKHVKOZX0yLLzwx3pG@mail.gmail.com>
	<AANLkTimxz5AgB8J9jN7CaYYByTeLGkFsj5BAnS0_8Z0f@mail.gmail.com>
	<AANLkTik7hVGrTdoWm0Dw+S1XjwLhoJcC58N1b73NMrX6@mail.gmail.com>
Message-ID: <20101024154005.GA21511@phd.pp.ru>

On Sat, Oct 23, 2010 at 05:12:35PM -0700, Brett Cannon wrote:
> On Sat, Oct 23, 2010 at 15:16, Ian Bicking <ianb at colorstudy.com> wrote:
> > #! doesn't work with a Python target, otherwise it would be easy to
> > implement in Python.  #! is awful.
> 
> As in a #! can't target a Python script that has been chmod'ed to be executable?

   It also handles parameters in strange ways. One can write

#!/usr/bin/python -O

   but not

#!/usr/bin/env python -O

   Some operating systems understand that, some ignore -O completely, but
most OSes interpret "python -O" as one parameter and emit a
"/usr/bin/env: python -O: No such file or directory" error.

Oleg.
-- 
     Oleg Broytman            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.


From lie.1296 at gmail.com  Sun Oct 24 23:59:16 2010
From: lie.1296 at gmail.com (Lie Ryan)
Date: Mon, 25 Oct 2010 08:59:16 +1100
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <4CB8B2F5.2020507@ronadam.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>	<201010111017.56101.steve@pearwood.info>	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>	<4CB7B7C2.8090401@ronadam.com>	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>	<4CB88BD2.4010901@ronadam.com>	<i9a2u9$q8k$1@dough.gmane.org>	<4CB898CD.6000207@ronadam.com>	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>
	<4CB8B2F5.2020507@ronadam.com>
Message-ID: <ia2d8l$7rr$1@dough.gmane.org>

On 10/16/10 07:00, Ron Adam wrote:
> 
> 
> On 10/15/2010 02:04 PM, Arnaud Delobelle wrote:
> 
>>> Because it would always interpret a list of values as a single item.
>>>
>>> This function looks at args and if its a single value without an
>>> "__iter__"
>>> method, it passes it to min as min([value], **kwds) instead of
>>> min(value,
>>> **kwds).
>>
>> But there are many iterable objects which are also comparable (hence
>> it makes sense to consider their min/max), for example strings.
>>
>> So we get:
>>
>>       xmin("foo", "bar", "baz") == "bar"
>>       xmin("foo", "bar") == "bar"
>>
>> but:
>>
>>      xmin("foo") == "f"
>>
>> This will create havoc in your running min routine.
>>
>> (Notice the same will hold for min() but at least you know that min(x)
>> considers x as an iterable and complains if it isn't)
> 
> Yes
> 
> There doesn't seem to be a way to generalize min/max in a way to handle
> all the cases without knowing the context.
> 
> So in a coroutine version of Tal's class, you would need to pass a hint
> along with the value.
> 
> Ron

There is a way, by using threading, and injecting a thread-safe tee into
max/min/otherFuncs (over half of the code is just for implementing
thread-safe tee). Using this, there is no need to make any modification
to min/max. I suppose it might be possible to convert this to using the
new coroutine proposal (though I haven't been following the proposal
closely enough).

The code is quite slow (and ugly), but it can handle large generators
(or infinite generators). The memory shouldn't grow if the functions in
*funcs take a more or less similar amount of time (which is true in the
case of max and min); if *funcs needs to include both very fast and very
slow functions at the same time, some more code can be added for
load-balancing by stalling faster threads' requests for more items until
the slower threads finish.

Pros:
- no modification to max/min

Cons:
- slow, since itertools.tee() is reimplemented in pure-python
- thread is uninterruptible


import threading, itertools

class counting_dict(dict):
    """ A thread-safe dict that allows its items to be accessed
        max_access times, after that the item is deleted.

        >>> d = counting_dict(2)
        >>> d['a'] = 'e'
        >>> d['a']
        'e'
        >>> d['a']
        'e'
        >>> d['a']
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "<stdin>", line 10, in __getitem__
        KeyError: 'a'
    """
    def __init__(self, max_access):
        self.max_access = max_access
    def __setitem__(self, key, item):
        super().__setitem__(key,
            [item, self.max_access, threading.Lock()]
        )
    def __getitem__(self, key):
        val = super().__getitem__(key)
        item, count, lock = val
        with lock:
            val[1] -= 1
            if val[1] == 0: del self[key]
        return item

def tee(iterable, n=2):
    """ like itertools.tee(), but thread-safe """
    def _tee():
        for i in itertools.count():
            try:
                yield cache[i]
            except KeyError:
                producer_next()
                yield cache[i]
    def produce(next):
        for i in itertools.count():
            cache[i] = next()
            yield
    produce.lock = threading.Lock()

    def producer_next():
        with produce.lock:
            next(producer); next(producer);
            next(producer); next(producer);

    cache = counting_dict(n)
    it = iter(iterable)
    producer = produce(it.__next__)
    return [_tee() for _ in range(n)]

def parallel_reduce(iterable, *funcs):
    class Worker(threading.Thread):
        def __init__(self, source, func):
            super().__init__()
            self.source = source
            self.func = func
        def run(self):
            self.result = self.func(self.source)

    sources = tee(iterable, len(funcs))
    threads = []
    for func, source in zip(funcs, sources):
        thread = Worker(source, func)
        thread.setDaemon(True)
        threads.append(thread)

    for thread in threads:
        thread.start()

    # this lets Ctrl+C work, it doesn't actually terminate
    # currently running threads
    for thread in threads:
        while thread.isAlive():
            thread.join(100)

    return tuple(thread.result for thread in threads)

# test code
import random, time
parallel_reduce([4, 6, 2, 3, 5, 7, 2, 3, 7, 8, 9, 6, 2, 10], min, max)
l = (random.randint(-1000000000, 1000000000) for _ in range(100000))
start = time.time(); parallel_reduce(l, min, min, max, min, max);
time.time() - start



From guido at python.org  Mon Oct 25 04:37:32 2010
From: guido at python.org (Guido van Rossum)
Date: Sun, 24 Oct 2010 19:37:32 -0700
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <ia2d8l$7rr$1@dough.gmane.org>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>
	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>
	<4CB7B7C2.8090401@ronadam.com>
	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>
	<4CB88BD2.4010901@ronadam.com> <i9a2u9$q8k$1@dough.gmane.org>
	<4CB898CD.6000207@ronadam.com>
	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>
	<4CB8B2F5.2020507@ronadam.com> <ia2d8l$7rr$1@dough.gmane.org>
Message-ID: <AANLkTi=yiaDbAfiTnV1yEk1L8UfCf4gHf3=3d9TXhvXw@mail.gmail.com>

On Sun, Oct 24, 2010 at 2:59 PM, Lie Ryan <lie.1296 at gmail.com> wrote:
> There is a way, by using threading, and injecting a thread-safe tee into
> max/min/otherFuncs (over half of the code is just for implementing
> thread-safe tee). Using this, there is no need to make any modification
> to min/max. I suppose it might be possible to convert this to using the
> new coroutine proposal (though I haven't been following the proposal
> closely enough).
>
> The code is quite slow (and ugly), but it can handle large generators
> (or infinite generators). The memory shouldn't grow if the functions in
> *funcs take a more or less similar amount of time (which is true in the
> case of max and min); if *funcs needs to include both very fast and very
> slow functions at the same time, some more code can be added for
> load-balancing by stalling faster threads' requests for more items until
> the slower threads finish.
>
> Pros:
> - no modification to max/min
>
> Cons:
> - slow, since itertools.tee() is reimplemented in pure-python
> - thread is uninterruptible
[snip]

This should not require threads.

Here's a bare-bones sketch using generators:

def reduce_collector(func):
    outcome = None
    while True:
        try:
            val = yield
        except GeneratorExit:
            break
        if outcome is None:
            outcome = val
        else:
            outcome = func(outcome, val)
    raise StopIteration(outcome)

def parallel_reduce(iterable, funcs):
    collectors = [reduce_collector(func) for func in funcs]
    values = [None for _ in collectors]
    for i, coll in enumerate(collectors):
        try:
            next(coll)
        except StopIteration as err:
            values[i] = err.args[0]
            collectors[i] = None
    for val in iterable:
        for i, coll in enumerate(collectors):
            if coll is not None:
                try:
                    coll.send(val)
                except StopIteration as err:
                    values[i] = err.args[0]
                    collectors[i] = None
    for i, coll in enumerate(collectors):
        if coll is not None:
            try:
                coll.throw(GeneratorExit)
            except StopIteration as err:
                values[i] = err.args[0]
    return values

def main():
    it = range(100)
    print(parallel_reduce(it, [min, max]))

if __name__ == '__main__':
    main()

-- 
--Guido van Rossum (python.org/~guido)


From jh at improva.dk  Mon Oct 25 12:19:14 2010
From: jh at improva.dk (Jacob Holm)
Date: Mon, 25 Oct 2010 12:19:14 +0200
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTi=yiaDbAfiTnV1yEk1L8UfCf4gHf3=3d9TXhvXw@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>	<201010111017.56101.steve@pearwood.info>	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>	<4CB7B7C2.8090401@ronadam.com>	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>	<4CB88BD2.4010901@ronadam.com>
	<i9a2u9$q8k$1@dough.gmane.org>	<4CB898CD.6000207@ronadam.com>	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>	<4CB8B2F5.2020507@ronadam.com>
	<ia2d8l$7rr$1@dough.gmane.org>
	<AANLkTi=yiaDbAfiTnV1yEk1L8UfCf4gHf3=3d9TXhvXw@mail.gmail.com>
Message-ID: <4CC559A2.8090305@improva.dk>

On 2010-10-25 04:37, Guido van Rossum wrote:
> This should not require threads.
> 
> Here's a bare-bones sketch using generators:
> 

If you don't care about allowing the funcs to raise StopIteration, this
can actually be simplified to:


def reduce_collector(func):
    try:
        outcome = yield
    except GeneratorExit:
        outcome = None
    else:
        while True:
            try:
                val = yield
            except GeneratorExit:
                break
            outcome = func(outcome, val)
    raise StopIteration(outcome)

def parallel_reduce(iterable, funcs):
    collectors = [reduce_collector(func) for func in funcs]
    values = [None for _ in collectors]
    for coll in collectors:
        next(coll)
    for val in iterable:
        for coll in collectors:
            coll.send(val)
    for i, coll in enumerate(collectors):
        try:
            coll.throw(GeneratorExit)
        except StopIteration as err:
            values[i] = err.args[0]
    return values


More interesting (to me at least) is that this is an excellent example
of why I would like to see a version of PEP380 where "close" on a
generator can return a value (AFAICT the version of PEP380 on
http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not
mention this possibility, or even link to the heated discussion we had
on python-ideas around march/april 2009).

Assuming that "close" on a reduce_collector generator instance returns
the value of the StopIteration raised by the "return" statements, we can
simplify the code even further:


def reduce_collector(func):
    try:
        outcome = yield
    except GeneratorExit:
        return None
    while True:
        try:
            val = yield
        except GeneratorExit:
            return outcome
        outcome = func(outcome, val)

def parallel_reduce(iterable, funcs):
    collectors = [reduce_collector(func) for func in funcs]
    for coll in collectors:
        next(coll)
    for val in iterable:
        for coll in collectors:
            coll.send(val)
    return [coll.close() for coll in collectors]


Yes, this is only saving a few lines, but I find it *much* more readable...



- Jacob



From denis.spir at gmail.com  Mon Oct 25 15:49:32 2010
From: denis.spir at gmail.com (spir)
Date: Mon, 25 Oct 2010 15:49:32 +0200
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
Message-ID: <20101025154932.06be2faf@o>

Hello,


A recommended idiom to construct a text from bits -- usually when the bits themselves are constructed by mapping on a sequence -- is to store the intermediate results and then only join() them all at once. Since I discovered this idiom I find myself constantly using it, to the point of having a func doing that in my python toolkit:

def textFromMap(seq , map=None , sep='' , ldelim='',rdelim=''):
    if (map is None):
        return "%s%s%s" %(ldelim , sep.join(str(e) for e in seq) , rdelim)
    else:
        return "%s%s%s" %(ldelim , sep.join(str(map(e)) for e in seq) , rdelim)

Example use:

class LispList(list):
    def __repr__(self):
        return textFromMap(self , repr , ' ' , '(',')')
print LispList([1, 2, 3])   # --> (1 2 3)

Is there any similar routine in Python? If yes, please inform me off list and excuse the noise. Else, I wonder whether such a routine would be useful as builtin, precisely since it is a common and recommended idiom. The issues with not having it, according to me, are that the expression is somewhat complicated and, more importantly, hardly tells the reader what it means & does -- even when "unfolded" into 2 or more lines of code:

    elements = (map(e) for e in seq)
    elementTexts = (str(e) for e in elements)
    content = sep.join(elementTexts)
    text = "%s%s%s" %(ldelim , content , rdelim)

There are 2 discussable choices in the func above:
* Unlike join(), it converts to str automagically.
* It takes optional delimiter parameters which complicate the interface (but are really handy for my common use cases :-)
Also, the map parameter is optional in case there is no mapping at all, which is more common if the func "string-ifies" itself.

If ever you find this proposal sensible, then what should be the routine's name?
And where to integrate it in the language? I think there are at least 3 options:
1. A simple func                textFromMap(seq, ...)
2. A static method of str       str.fromMap(seq, ...)
3. A method for iterables (1)   seq.textFromMap(...)
(I personally find the latter more correct semantically (2).)

What do you think?


Denis

(1) I don't know exactly what should be the top class, if any.
(2) I think the same about join: should be "seq.join(sep)" since for me the object on which the method applies is seq, not sep.
-- -- -- -- -- -- --
vit esse estrany ?

spir.wikidot.com



From masklinn at masklinn.net  Mon Oct 25 16:10:56 2010
From: masklinn at masklinn.net (Masklinn)
Date: Mon, 25 Oct 2010 16:10:56 +0200
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <20101025154932.06be2faf@o>
References: <20101025154932.06be2faf@o>
Message-ID: <6E25ADF7-129D-4396-A569-06CB2BA5D7C7@masklinn.net>

On 2010-10-25, at 15:49 , spir wrote:
> Hello,
> 
> 
> A recommended idiom to construct a text from bits -- usually when the bits themselves are constructed by mapping on a sequence -- is to store the intermediate results and then only join() them all at once. Since I discovered this idiom I find myself constantly using it, to the point of having a func doing that in my python toolkit:
> 
> def textFromMap(seq , map=None , sep='' , ldelim='',rdelim=''):
>    if (map is None):
>        return "%s%s%s" %(ldelim , sep.join(str(e) for e in seq) , rdelim)
>    else:
>        return "%s%s%s" %(ldelim , sep.join(str(map(e)) for e in seq) , rdelim)
> 
> Example use:
> 
> class LispList(list):
>    def __repr__(self):
>        return textFromMap(self , repr , ' ' , '(',')')
> print LispList([1, 2, 3])   # --> (1 2 3)
> 
> Is there any similar routine in Python? If yes, please inform me off list and excuse the noise. Else, I wonder whether such a routine would be useful as builtin, precisely since it is a common and recommended idiom. The issues with not having it, according to me, are that the expression is somewhat complicated and, more importantly, hardly tells the reader what it means & does -- even when "unfolded" into 2 or more lines of code:
> 
>    elements = (map(e) for e in seq)
>    elementTexts = (str(e) for e in elements)
>    content = sep.join(elementTexts)
>    text = "%s%s%s" %(ldelim , content , rdelim)
> 
I really am not sure you gain so much over the current `sep.join(str(map(e)) for e in seq)`, even with the addition of ldelim and rdelim which end up in argument soup/noise (5 arguments in the worst case is quite a lot).

The name is also strange, and hints at needing function composition more than a new builtin.

> 3. A method for iterables (1)   seq.textFromMap(...)
> (I personally find the latter more correct semantically (2).)
> 
> (2) I think the same about join: should be "seq.join(sep)" since for me the object on which the method applies is seq, not sep.
> 
This is also the choice of e.g. Ruby, but it has a severe limitation: Python doesn't have any `Iterable` type, yet `join` can be used with any iterable including generators or callable-iterators. Thus you cannot put it on the iterable or sequence, or you have to prepare some kind of iterable mixin. This issue might be solved/solvable via the new abstract base classes, but I'm not so sure about it (do you explicitly have to mix-in an abc to use its methods?).

In fact, Ruby 1.8 does have that limitation (though it's arguably not the worst limitation ever): `Array#join` exists but not `Enumerable#join`. They tried to add `Enumerable#join` in 1.9.1 (though a fairly strange, recursive version of it) then took it out then added it back again (or something, I got lost around there). And in any case since there is no requirement for enumerable collections to mix Enumerable in, you can have enumerable collections with no join support.

From guido at python.org  Mon Oct 25 17:13:19 2010
From: guido at python.org (Guido van Rossum)
Date: Mon, 25 Oct 2010 08:13:19 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
Message-ID: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>

[Changed subject]

> On 2010-10-25 04:37, Guido van Rossum wrote:
>> This should not require threads.
>>
>> Here's a bare-bones sketch using generators:
[...]

On Mon, Oct 25, 2010 at 3:19 AM, Jacob Holm <jh at improva.dk> wrote:
> If you don't care about allowing the funcs to raise StopIteration, this
> can actually be simplified to:
[...]

Indeed, I realized this after posting. :-) I had several other ideas
for improvements, e.g. being able to pass an initial value to the
reduce-like function or even being able to supply a reduce-like
function of one's own.
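
(For illustration, a sketch of the initial-value variant; the "initial"
parameter is hypothetical, not part of the code above:

    def reduce_collector(func, initial=None):
        outcome = initial
        while True:
            try:
                val = yield
            except GeneratorExit:
                break
            outcome = val if outcome is None else func(outcome, val)
        raise StopIteration(outcome)

parallel_reduce would be unchanged apart from passing the extra argument.)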

> More interesting (to me at least) is that this is an excellent example
> of why I would like to see a version of PEP380 where "close" on a
> generator can return a value (AFAICT the version of PEP380 on
> http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not
> mention this possibility, or even link to the heated discussion we had
> on python-ideas around march/april 2009).

Can you dig up the link here?

I recall that discussion but I don't recall a clear conclusion coming
from it -- just heated debate.

Based on my example I have to agree that returning a value from
close() would be nice. There is a little detail, how multiple
arguments to StopIteration should be interpreted, but that's not so
important if it's being raised by a return statement.

> Assuming that "close" on a reduce_collector generator instance returns
> the value of the StopIteration raised by the "return" statements, we can
> simplify the code even further:
>
>
> def reduce_collector(func):
>     try:
>         outcome = yield
>     except GeneratorExit:
>         return None
>     while True:
>         try:
>             val = yield
>         except GeneratorExit:
>             return outcome
>         outcome = func(outcome, val)
>
> def parallel_reduce(iterable, funcs):
>     collectors = [reduce_collector(func) for func in funcs]
>     for coll in collectors:
>         next(coll)
>     for val in iterable:
>         for coll in collectors:
>             coll.send(val)
>     return [coll.close() for coll in collectors]
>
>
> Yes, this is only saving a few lines, but I find it *much* more readable...

I totally agree that not having to call throw() and catch whatever it
bounces back is much nicer. (Now I wish there was a way to avoid the
"try..except GeneratorExit" construct in the generator, but I think I
should stop while I'm ahead. :-)

The interesting thing is that I've been dealing with generators used
as coroutines or tasks intensely on and off since July, and I haven't
had a single need for any of the three patterns that this example
happened to demonstrate:

- the need to "prime" the generator in a separate step
- throwing and catching GeneratorExit
- getting a value from close()

(I did have a lot of use for send(), throw(), and extracting a value
from StopIteration.)

In my context, generators are used to emulate concurrently running
tasks, and "yield" is always used to mean "block until this piece of
async I/O is complete, and wake me up with the result". This is
similar to the "classic" trampoline code found in PEP 342.
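
A bare-bones sketch of that style, for illustration; scheduler and fetch
below are hypothetical, and the raise StopIteration(value) idiom matches
the pre-PEP 380 code elsewhere in this thread:

    def scheduler(task, value=None):
        # Drive one task; each yield hands the scheduler a callable to
        # run, and the task is woken up with that callable's result.
        try:
            while True:
                request = task.send(value)   # the task "blocks" here
                value = request()            # do the "async" work
        except StopIteration as err:
            return err.args[0] if err.args else None

    def fetch():
        return [4, 6, 2]

    def task():
        data = yield fetch                   # wake me up with the result
        raise StopIteration(min(data))

    print(scheduler(task()))                 # prints 2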

In fact, when I wrote the example for this thread, I fumbled a bit
because the use of generators there is different than I had been using
them (though it was no doubt thanks to having worked with them
intensely that I came up with the example quickly).

So, it is clear that generators are extremely versatile, and PEP 380
deserves several good use cases to explain all the API subtleties.

BTW, while I have you, what do you think of Greg's "cofunctions" proposal?

-- 
--Guido van Rossum (python.org/~guido)


From lvh at laurensvh.be  Mon Oct 25 18:00:40 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Mon, 25 Oct 2010 18:00:40 +0200
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <20101025154932.06be2faf@o>
References: <20101025154932.06be2faf@o>
Message-ID: <AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>

Hm. I suppose the need for this would be slightly mitigated if I understood
why str.join does not try to convert the elements of the iterable it is
passed to strs (and analogously for unicode).

Does anyone know what the rationale for that is?


lvh
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20101025/31eff707/attachment.html>

From guido at python.org  Mon Oct 25 19:12:02 2010
From: guido at python.org (Guido van Rossum)
Date: Mon, 25 Oct 2010 10:12:02 -0700
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>
References: <20101025154932.06be2faf@o>
	<AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>
Message-ID: <AANLkTi=ZT8cu3D+shLgL8ze+9Ej8T6hgRLqKpgNscR8Z@mail.gmail.com>

On Mon, Oct 25, 2010 at 9:00 AM, Laurens Van Houtven <lvh at laurensvh.be> wrote:
> Hm. I suppose the need for this would be slightly mitigated if I understood
> why str.join does not try to convert the elements of the iterable it is
> passed to strs (and analogously for unicode).

> Does anyone know what the rationale for that is?

To detect buggy code.
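
(For instance -- illustrative only -- joining a list that accidentally
contains a non-string fails loudly instead of silently coercing:

    >>> ', '.join(['a', 'b', 3])
    Traceback (most recent call last):
      ...
    TypeError: sequence item 2: expected str instance, int found

If join() converted automatically, the bug would go unnoticed.)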

-- 
--Guido van Rossum (python.org/~guido)


From __peter__ at web.de  Mon Oct 25 19:10:56 2010
From: __peter__ at web.de (Peter Otten)
Date: Mon, 25 Oct 2010 19:10:56 +0200
Subject: [Python-ideas] [Python-Dev] minmax() function returning
	(minimum, maximum) tuple of a sequence
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>
	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>
	<4CB7B7C2.8090401@ronadam.com>
	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>
	<4CB88BD2.4010901@ronadam.com> <i9a2u9$q8k$1@dough.gmane.org>
	<4CB898CD.6000207@ronadam.com>
	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>
	<4CB8B2F5.2020507@ronadam.com> <ia2d8l$7rr$1@dough.gmane.org>
	<AANLkTi=yiaDbAfiTnV1yEk1L8UfCf4gHf3=3d9TXhvXw@mail.gmail.com>
Message-ID: <ia4dmg$7uc$1@dough.gmane.org>

Guido van Rossum wrote:

> On Sun, Oct 24, 2010 at 2:59 PM, Lie Ryan
> <lie.1296 at gmail.com> wrote:
>> There is a way, by using threading, and injecting a thread-safe tee into
>> max/min/otherFuncs (over half of the code is just for implementing
>> thread-safe tee). Using this, there is no need to make any modification
>> to min/max. I suppose it might be possible to convert this to using the
>> new coroutine proposal (though I haven't been following the proposal
>> closely enough).
>>
>> The code is quite slow (and ugly), but it can handle large generators
>> (or infinite generators). The memory shouldn't grow if the functions in
>> *funcs take a more or less similar amount of time (which is true in the
>> case of max and min); if *funcs needs to include both very fast and very
>> slow functions at the same time, some more code can be added for
>> load-balancing by stalling faster threads' requests for more items until
>> the slower threads finish.
>>
>> Pros:
>> - no modification to max/min
>>
>> Cons:
>> - slow, since itertools.tee() is reimplemented in pure-python
>> - thread is uninterruptible
> [snip]
> 
> This should not require threads.
> 
> Here's a bare-bones sketch using generators:

>             outcome = func(outcome, val)

I don't think the generator-based approach is equivalent to what Lie Ryan's 
threaded code does. You are calling max(a, b) 99 times while Lie calls 
max(items) once. 

Is it possible to calculate min(items) and max(items) simultaneously using 
generators? I don't see how...

Peter



From guido at python.org  Mon Oct 25 20:53:40 2010
From: guido at python.org (Guido van Rossum)
Date: Mon, 25 Oct 2010 11:53:40 -0700
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <ia4dmg$7uc$1@dough.gmane.org>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>
	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>
	<4CB7B7C2.8090401@ronadam.com>
	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>
	<4CB88BD2.4010901@ronadam.com> <i9a2u9$q8k$1@dough.gmane.org>
	<4CB898CD.6000207@ronadam.com>
	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>
	<4CB8B2F5.2020507@ronadam.com> <ia2d8l$7rr$1@dough.gmane.org>
	<AANLkTi=yiaDbAfiTnV1yEk1L8UfCf4gHf3=3d9TXhvXw@mail.gmail.com>
	<ia4dmg$7uc$1@dough.gmane.org>
Message-ID: <AANLkTikAnG0MhBegq+Zw3rL7ph4D_suuhsc9PNcKv9o6@mail.gmail.com>

> Guido van Rossum wrote:
[...]
>> This should not require threads.
>>
>> Here's a bare-bones sketch using generators:
[...]

On Mon, Oct 25, 2010 at 10:10 AM, Peter Otten <__peter__ at web.de> wrote:
> I don't think the generator-based approach is equivalent to what Lie Ryan's
> threaded code does. You are calling max(a, b) 99 times while Lie calls
> max(items) once.

True. Nevertheless, my point stands: you shouldn't have to use threads
to do such concurrent computations over a single-use iterable. Threads
are too slow, and since there is no I/O multiplexing they don't offer
any advantages.

> Is it possible to calculate min(items) and max(items) simultaneously using
> generators? I don't see how...

No, this is why the reduce-like approach is better for such cases.
Otherwise you keep trying to fit a square peg into a round hole.

-- 
--Guido van Rossum (python.org/~guido)


From tjreedy at udel.edu  Mon Oct 25 21:27:52 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 25 Oct 2010 15:27:52 -0400
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <20101025154932.06be2faf@o>
References: <20101025154932.06be2faf@o>
Message-ID: <ia4lnn$ioh$1@dough.gmane.org>

On 10/25/2010 9:49 AM, spir wrote:
> Hello,
>
>
> A recommended idiom to construct a text from bits -- usually when the
> bits themselves are constructed by mapping on a sequence -- is to
> store the intermediate results and then only join() them all at once.
> Since I discovered this idiom I find myself constantly using it, to the
> point of having a func doing that in my python toolkit:
>
> def textFromMap(seq , map=None , sep='' , ldelim='',rdelim=''):

'map' is a bad parameter name as it 1. reuses the builtin name and 2.
uses it for a parameter (the mapped function) of that builtin.

...
> (2) I think the same about join: should be "seq.join(sep)" since for
> me the object on which the method applies is seq, not sep.

The two parameters for the join function are a string and an iterable of 
strings. There is no 'iterable of strings' class, so that leaves the 
string class to attach it to as a method. (It once *was* just a function 
in the string module before it and other functions were so attached.) 
The fact that the function produces a string is another reason it should 
be a string method. Ditto for bytes and iterable of bytes.
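
For instance (illustrative only), join accepts any iterable of strings,
including a generator, and the bytes version mirrors it:

    '-'.join(c.upper() for c in 'abc')   # 'A-B-C'
    b'-'.join([b'a', b'b', b'c'])        # b'a-b-c'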

-- 
Terry Jan Reedy



From rrr at ronadam.com  Mon Oct 25 21:53:57 2010
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 25 Oct 2010 14:53:57 -0500
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
Message-ID: <4CC5E055.9010009@ronadam.com>



On 10/25/2010 10:13 AM, Guido van Rossum wrote:
> [Changed subject]
>
>> On 2010-10-25 04:37, Guido van Rossum wrote:
>>> This should not require threads.
>>>
>>> Here's a bare-bones sketch using generators:
> [...]
>
> On Mon, Oct 25, 2010 at 3:19 AM, Jacob Holm<jh at improva.dk>  wrote:
>> If you don't care about allowing the funcs to raise StopIteration, this
>> can actually be simplified to:
> [...]
>
> Indeed, I realized this after posting. :-) I had several other ideas
> for improvements, e.g. being able to pass an initial value to the
> reduce-like function or even being able to supply a reduce-like
> function of one's own.
>
>> More interesting (to me at least) is that this is an excellent example
>> of why I would like to see a version of PEP380 where "close" on a
>> generator can return a value (AFAICT the version of PEP380 on
>> http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not
>> mention this possibility, or even link to the heated discussion we had
>> on python-ideas around march/april 2009).
>
> Can you dig up the link here?
>
> I recall that discussion but I don't recall a clear conclusion coming
> from it -- just heated debate.
>
> Based on my example I have to agree that returning a value from
> close() would be nice. There is a little detail, how multiple
> arguments to StopIteration should be interpreted, but that's not so
> important if it's being raised by a return statement.
>
>> Assuming that "close" on a reduce_collector generator instance returns
>> the value of the StopIteration raised by the "return" statements, we can
>> simplify the code even further:
>>
>>
>> def reduce_collector(func):
>>     try:
>>         outcome = yield
>>     except GeneratorExit:
>>         return None
>>     while True:
>>         try:
>>             val = yield
>>         except GeneratorExit:
>>             return outcome
>>         outcome = func(outcome, val)
>>
>> def parallel_reduce(iterable, funcs):
>>     collectors = [reduce_collector(func) for func in funcs]
>>     for coll in collectors:
>>         next(coll)
>>     for val in iterable:
>>         for coll in collectors:
>>             coll.send(val)
>>     return [coll.close() for coll in collectors]
>>
>>
>> Yes, this is only saving a few lines, but I find it *much* more readable...
>
> I totally agree that not having to call throw() and catch whatever it
> bounces back is much nicer. (Now I wish there was a way to avoid the
> "try..except GeneratorExit" construct in the generator, but I think I
> should stop while I'm ahead. :-)


This is how my mind wants to write this.

@consumer
def reduce_collector(func):
     try:
         value = yield            # No value to yield here.
         while True:
             value = func((yield), value)        # or here.
     except YieldError:
         # next was called not send.
         yield = value

def parallel_reduce(iterable, funcs):
     collectors = [reduce_collector(func) for func in funcs]
     for v in iterable:
         for coll in collectors:
             coll.send(v)
     return [next(c) for c in collectors]


It nicely separates input and output parts of a co-function, which can be 
tricky to get right when you have to receive and send at the same yield.

Maybe in Python 4k?  Oh well. :-)


> The interesting thing is that I've been dealing with generators used
> as coroutines or tasks intensely on and off since July, and I haven't
> had a single need for any of the three patterns that this example
> happened to demonstrate:
>
> - the need to "prime" the generator in a separate step

Having a consumer decorator would be good.

def consumer(f):
     @wraps(f)
     def wrapper(*args, **kwds):
         coroutine = f(*args, **kwds)
         next(coroutine)
         return coroutine
     return wrapper


Or maybe it would be possible for Python to autostart a generator if it's
sent a value before it's started?  Currently you get an almost useless
TypeError.  The reason it's almost useless is that unless you are testing
for it right after you create the generator, you can't (easily) be sure
it didn't come from someplace inside the generator.

Ron











From rrr at ronadam.com  Mon Oct 25 22:08:06 2010
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 25 Oct 2010 15:08:06 -0500
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CC5E055.9010009@ronadam.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC5E055.9010009@ronadam.com>
Message-ID: <4CC5E3A6.1000303@ronadam.com>

Minor correction...

On 10/25/2010 02:53 PM, Ron Adam wrote:


> @consumer
> def reduce_collector(func):
>     try:
>         value = yield            # No value to yield here.
>         while True:
>             value = func((yield), value)        # or here.
>     except YieldError:
>         # next was called not send.
>         yield = value

This line should have been "yield value" not "yield = value".


> def parallel_reduce(iterable, funcs):
>     collectors = [reduce_collector(func) for func in funcs]
>     for v in iterable:
>         for coll in collectors:
>             coll.send(v)
>     return [next(c) for c in collectors]



From guido at python.org  Mon Oct 25 22:21:07 2010
From: guido at python.org (Guido van Rossum)
Date: Mon, 25 Oct 2010 13:21:07 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CC5E055.9010009@ronadam.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC5E055.9010009@ronadam.com>
Message-ID: <AANLkTimjRuhVQvnhESV-2YCZm8ssTjFctf2yCj5ZjVFJ@mail.gmail.com>

On Mon, Oct 25, 2010 at 12:53 PM, Ron Adam <rrr at ronadam.com> wrote:
> This is how my mind wants to write this.
>
> @consumer
> def reduce_collector(func):
>     try:
>         value = yield            # No value to yield here.
>         while True:
>             value = func((yield), value)        # or here.
>     except YieldError:

IIUC this works today if you substitute GeneratorExit and use
c.close() instead of next(c) below. (I don't recall why I split it out
into two different try/except blocks but it doesn't seem necessary.)

As for being able to distinguish next(c) from c.send(None), that's a
few language revisions too late. Perhaps more to the point, I don't
like that idea; it breaks the general treatment of things that return
None and throwing away values. (Long, long, long ago there were
situations where Python balked when you threw away a non-None value.
The feature was booed off the island and it's better this way.)

>         # next was called not send.
>         yield value

I object to overloading yield for both a *resumable* operation and
returning a (final) value; that's why PEP 380 will let you write
"return value". (Many alternatives were considered but we always come
back to the simple "return value".)

> def parallel_reduce(iterable, funcs):
>     collectors = [reduce_collector(func) for func in funcs]
>     for v in iterable:
>         for coll in collectors:
>             coll.send(v)
>     return [next(c) for c in collectors]

I really object to using next() for both getting the return value and
the next yielded value. Jacob's proposal to spell this as c.close()
sounds much better to me.

> It nicely separates input and output parts of a co-function, which can be
> tricky to get right when you have to receive and send at the same yield.

I don't think there was a problem with this in my code (or if there
was you didn't solve it).

> Maybe in Python 4k?  Oh well. :-)

Nah.

>> The interesting thing is that I've been dealing with generators used
>> as coroutines or tasks intensely on and off since July, and I haven't
>> had a single need for any of the three patterns that this example
>> happened to demonstrate:
>>
>> - the need to "prime" the generator in a separate step
>
> Having a consumer decorator would be good.
>
> def consumer(f):
>     @wraps(f)
>     def wrapper(*args, **kwds):
>         coroutine = f(*args, **kwds)
>         next(coroutine)
>         return coroutine
>     return wrapper

This was proposed during the PEP 380 discussions. I still don't like
it because I can easily imagine situations where sending an initial
None falls totally naturally out of the sending logic (as it does for
my async tasks use case), and it would be a shame if the generator's
declaration prevented this.

> Or maybe it would be possible for Python to autostart a generator if it's
> sent a value before it's started?  Currently you get an almost useless
> TypeError.  The reason it's almost useless is that unless you are testing
> for it right after you create the generator, you can't (easily) be sure
> it didn't come from someplace inside the generator.

I'd be okay with this raising a different exception (though for
compatibility it would have to subclass TypeError). I'd also be okay
with having a property on generator objects that lets you inspect the
state. There should really be three states: not yet started, started,
finished -- and of course "started and currently executing" but that
one is already exposed via g.gi_running.
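
A sketch of what inspecting that state could look like; for reference,
inspect.getgeneratorstate(), added in Python 3.2, exposes exactly these
states:

    import inspect

    def gen():
        yield 1

    g = gen()
    inspect.getgeneratorstate(g)   # 'GEN_CREATED' -- not yet started
    next(g)
    inspect.getgeneratorstate(g)   # 'GEN_SUSPENDED' -- started
    g.close()
    inspect.getgeneratorstate(g)   # 'GEN_CLOSED' -- finished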

Changing the behavior on .send(val) doesn't strike me as a good idea,
because the caller would be missing the first value yielded!

IOW I want to support this use case but not make it the central
driving use case for the API design.

-- 
--Guido van Rossum (python.org/~guido)


From raymond.hettinger at gmail.com  Mon Oct 25 23:11:26 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 25 Oct 2010 14:11:26 -0700
Subject: [Python-ideas] [Python-Dev] minmax() function returning
	(minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTikAnG0MhBegq+Zw3rL7ph4D_suuhsc9PNcKv9o6@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>
	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>
	<4CB7B7C2.8090401@ronadam.com>
	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>
	<4CB88BD2.4010901@ronadam.com> <i9a2u9$q8k$1@dough.gmane.org>
	<4CB898CD.6000207@ronadam.com>
	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>
	<4CB8B2F5.2020507@ronadam.com> <ia2d8l$7rr$1@dough.gmane.org>
	<AANLkTi=yiaDbAfiTnV1yEk1L8UfCf4gHf3=3d9TXhvXw@mail.gmail.com>
	<ia4dmg$7uc$1@dough.gmane.org>
	<AANLkTikAnG0MhBegq+Zw3rL7ph4D_suuhsc9PNcKv9o6@mail.gmail.com>
Message-ID: <5D61A18E-3BAE-43DA-B6A9-892BB7E925AB@gmail.com>


>> Is it possible to calculate min(items) and max(items) simultaneously using
>> generators? I don't see how...
> 
> No, this is why the reduce-like approach is better for such cases.
> Otherwise you keep trying to fit a square peg into a round hole.

Which, of course, is neither good for the peg, nor for the hole ;-)

no-square-pegs-in-round-holes-ly yours,


Raymond



From eric at trueblade.com  Tue Oct 26 01:44:11 2010
From: eric at trueblade.com (Eric Smith)
Date: Mon, 25 Oct 2010 19:44:11 -0400
Subject: [Python-ideas] New 3.x restriction on number of
	keyword	arguments
In-Reply-To: <20101023004508.6a6c1373@pitrou.net>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>	<loom.20101021T160508-402@post.gmane.org>	<i9pl0o$vsk$1@dough.gmane.org>	<loom.20101021T201522-329@post.gmane.org>	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>	<4CC1D966.2080007@egenix.com>	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>	<4CC211EE.1050308@egenix.com>
	<20101023004508.6a6c1373@pitrou.net>
Message-ID: <4CC6164B.5040201@trueblade.com>

On 10/22/2010 6:45 PM, Antoine Pitrou wrote:
> On Sat, 23 Oct 2010 00:36:30 +0200
> "M.-A. Lemburg"<mal at egenix.com>  wrote:
>>
>> It may seem strange to have functions, methods or object constructors
>> with more than 255 parameters, but as I said: when using code generators,
>> the generators don't care whether they use 100 or 300 parameters.
>
> Why not make the code generators smarter?

Because it makes more sense to fix it in one place than force the burden 
of coding around an arbitrary limit upon each such code generator.
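
For illustration, a sketch of the kind of workaround each generator
would otherwise have to emit (hypothetical function f):

    # A literal call with 300 keyword arguments is rejected by the
    # compiler ("more than 255 arguments"), so generated code must
    # build a dict and use **-unpacking instead:
    kwargs = {'field%d' % i: i for i in range(300)}
    f(**kwargs)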

Eric.


From jnoller at gmail.com  Tue Oct 26 02:56:48 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Mon, 25 Oct 2010 20:56:48 -0400
Subject: [Python-ideas] PyCon 2011 Reminder: Call for Proposals,
	Posters and Tutorials - us.pycon.org
Message-ID: <AANLkTi=B4WMXWc0QAa99oGvHKKKvJR-_22oy3dz3udyM@mail.gmail.com>

PyCon 2011 Reminder: Call for Proposals, Posters and Tutorials - us.pycon.org
===============================================

Well, it's October 25th! The leaves have turned and the deadline for submitting
main-conference talk proposals expires in 7 days (November 1st, 2010)!

We are currently accepting main conference talk proposals:
http://us.pycon.org/2011/speaker/proposals/

Tutorial Proposals:
http://us.pycon.org/2011/speaker/proposals/tutorials/

Poster Proposals:
http://us.pycon.org/2011/speaker/posters/cfp/

PyCon 2011 will be held March 9th through the 17th, 2011 in Atlanta, Georgia.
(Home of some of the best southern food you can possibly find on Earth!) The
PyCon conference days will be March 11-13, preceded by two tutorial days
(March 9-10), and followed by four days of development sprints (March 14-17).

We are also proud to announce that we have booked our first Keynote
speaker - Hilary Mason, her bio:

"Hilary is the lead scientist at bit.ly, where she is finding sense in vast
data sets. She is a former computer science professor with a background in
machine learning and data mining, has published numerous academic papers, and
regularly releases code on her personal site, http://www.hilarymason.com/.
She has discovered two new species, loves to bake cookies, and asks way too
many questions."

We're really looking forward to having her this year as a keynote speaker!

Remember, we've also added an "Extreme" talk track this year - no introduction,
no fluff - only the pure technical meat!

For more information on "Extreme Talks" see:

http://us.pycon.org/2011/speaker/extreme/

We look forward to seeing you in Atlanta!

Please also note - registration for PyCon 2011 will also be capped at a
maximum of 1,500 delegates, including speakers. When registration opens (soon),
you're going to want to make sure you register early! Speakers with accepted
talks will have a guaranteed slot.

We have published all registration prices online at:
http://us.pycon.org/2011/tickets/

Important Dates
November 1st, 2010: Talk proposals due.
December 15th, 2010: Acceptance emails sent.
January 19th, 2011: Early bird registration closes.
March 9-10th, 2011: Tutorial days at PyCon.
March 11-13th, 2011: PyCon main conference.
March 14-17th, 2011: PyCon sprints days.

Contact Emails:
Van Lindberg (Conference Chair) - van at python.org
Jesse Noller (Co-Chair) - jnoller at python.org
PyCon Organizers list: pycon-organizers at python.org


From rrr at ronadam.com  Tue Oct 26 03:01:29 2010
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 25 Oct 2010 20:01:29 -0500
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimjRuhVQvnhESV-2YCZm8ssTjFctf2yCj5ZjVFJ@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC5E055.9010009@ronadam.com>
	<AANLkTimjRuhVQvnhESV-2YCZm8ssTjFctf2yCj5ZjVFJ@mail.gmail.com>
Message-ID: <4CC62869.6090503@ronadam.com>



On 10/25/2010 03:21 PM, Guido van Rossum wrote:
> On Mon, Oct 25, 2010 at 12:53 PM, Ron Adam<rrr at ronadam.com>  wrote:
>> This is how my mind wants to write this.
>>
>> @consumer
>> def reduce_collector(func):
>>     try:
>>         value = yield            # No value to yield here.
>>         while True:
>>             value = func((yield), value)        # or here.
>>     except YieldError:
>
> IIUC this works today if you substitute GeneratorExit and use
> c.close() instead of next(c) below. (I don't recall why I split it out
> into two different try/except blocks but it doesn't seem necessary.

I tried it, c.close() doesn't work yet, but it does work with 
c.throw(GeneratorExit) :-)   But that still uses yield to get the value.

I used a different way of starting the generator that checks for a value 
being yielded.



class GeneratorStartError(TypeError): pass

def start(g):
     value = next(g)
     if value is not None:
         raise GeneratorStartError('started generator yielded a value')
     return g

def reduce_collector(func):
     value = None
     try:
         value = yield
         while True:
             value = func((yield), value)
     except GeneratorExit:
         yield value

def parallel_reduce(iterable, funcs):
     collectors = [start(reduce_collector(func)) for func in funcs]
     for v in iterable:
         for coll in collectors:
             coll.send(v)
     return [c.throw(GeneratorExit) for c in collectors]

def main():
     it = range(100)
     print(parallel_reduce(it, [min, max]))

if __name__ == '__main__':
     main()



> As for being able to distinguish next(c) from c.send(None), that's a
> few language revisions too late. Perhaps more to the point, I don't
> like that idea; it breaks the general treatment of things that return
> None and throwing away values. (Long, long, long ago there were
> situations where Python balked when you threw away a non-None value.
> The feature was boohed off the island and it's better this way.)

I'm not sure I follow the relationship you suggest.  No values would be 
thrown away.  Or did you mean that it should be ok to throw away values? I 
don't think it would prevent that either.

What the YieldError case really does is give the generator a bit more 
control.  As far as the calling routine that uses it is concerned, it just 
works.  What happened inside the generator is completely transparent to the
routine using the generator.  If the calling routine does see a YieldError, 
it means it probably was a bug.


>>         # next was called not send.
>>         yield value
>
> I object to overloading yield for both a *resumable* operation and
> returning a (final) value; that's why PEP 380 will let you write
> "return value". (Many alternatives were considered but we always come
> back to the simple "return value".)

That works for me.  I think a lot of people will find it easy to learn.


>> def parallel_reduce(iterable, funcs):
>>     collectors = [reduce_collector(func) for func in funcs]
>>     for v in iterable:
>>         for coll in collectors:
>>             coll.send(v)
>>     return [next(c) for c in collectors]
>
> I really object to using next() for both getting the return value and
> the next yielded value. Jacob's proposal to spell this as c.close()
> sounds much better to me.

If c.close also throws the GeneratorExit and returns a value, that would be 
cool. Thanks.

I take it that the objections have more to do with style and coding 
practices rather than what is possible.


>> It nicely separates input and output parts of a co-function, which can be
>> tricky to get right when you have to receive and send at the same yield.
>
> I don't think there was a problem with this in my code (or if there
> was you didn't solve it).

There wasn't in this code.  This is one of those areas where it can be 
really difficult to find the correct way to express a co-function that does 
both input and output, but not necessarily in a fixed order.

I begin almost any co-function with this at the top of the loop and later 
trim it up if parts of it aren't needed.

    out_value = None
    while True:
       in_value = yield out_value
       out_value = None
       ...
       # rest of loop to check in_value and modify out_value

As long as None isn't a valid data item, this works most of the time.


>> Maybe in Python 4k?  Oh well. :-)
>
> Nah.

I'm ok with that.

Ron


From guido at python.org  Tue Oct 26 03:34:40 2010
From: guido at python.org (Guido van Rossum)
Date: Mon, 25 Oct 2010 18:34:40 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CC62869.6090503@ronadam.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC5E055.9010009@ronadam.com>
	<AANLkTimjRuhVQvnhESV-2YCZm8ssTjFctf2yCj5ZjVFJ@mail.gmail.com>
	<4CC62869.6090503@ronadam.com>
Message-ID: <AANLkTincGuMQ7L-BicJb-udvo-DACb_XspScGsYdRK6s@mail.gmail.com>

On Mon, Oct 25, 2010 at 6:01 PM, Ron Adam <rrr at ronadam.com> wrote:
>
> On 10/25/2010 03:21 PM, Guido van Rossum wrote:
>>
>> On Mon, Oct 25, 2010 at 12:53 PM, Ron Adam <rrr at ronadam.com> wrote:
>>>
>>> This is how my mind wants to write this.
>>>
>>> @consumer
>>> def reduce_collector(func):
>>>     try:
>>>         value = yield            # No value to yield here.
>>>         while True:
>>>             value = func((yield), value)        # or here.
>>>     except YieldError:
>>
>> IIUC this works today if you substitute GeneratorExit and use
>> c.close() instead of next(c) below. (I don't recall why I split it out
>> into two different try/except blocks but it doesn't seem necessary.
>
> I tried it, c.close() doesn't work yet, but it does work with
> c.throw(GeneratorExit) :-)   But that still uses yield to get the value.

Yeah, sorry, I didn't mean to say that g.close() would return the
value, but that you can use GeneratorExit here. g.close() *does* throw
GeneratorExit (that's PEP 342); but it doesn't return the value yet. I
like adding that to PEP 380 though.

> I used a different way of starting the generator that checks for a value
> being yielded.
>
>
>
> class GeneratorStartError(TypeError): pass
>
> def start(g):
>     value = next(g)
>     if value is not None:
>         raise GeneratorStartError('started generator yielded a value')
>     return g

Whatever tickles your fancy. I just don't think this deserves a builtin.

> def reduce_collector(func):
>     value = None
>     try:
>         value = yield
>         while True:
>             value = func((yield), value)
>     except GeneratorExit:
>         yield value

Even today, I would much prefer using raise StopIteration(value) over
yield value (or yield Return(value)). Reusing yield to return a value
just looks wrong to me, there are too many ways to get confused (and
this area doesn't need more of that :-).

> def parallel_reduce(iterable, funcs):
>     collectors = [start(reduce_collector(func)) for func in funcs]
>     for v in iterable:
>         for coll in collectors:
>             coll.send(v)
>     return [c.throw(GeneratorExit) for c in collectors]
>
> def main():
>     it = range(100)
>     print(parallel_reduce(it, [min, max]))
>
> if __name__ == '__main__':
>     main()
>
>
>
>> As for being able to distinguish next(c) from c.send(None), that's a
>> few language revisions too late. Perhaps more to the point, I don't
>> like that idea; it breaks the general treatment of things that return
>> None and throwing away values. (Long, long, long ago there were
>> situations where Python balked when you threw away a non-None value.
>> The feature was booed off the island and it's better this way.)
>
> I'm not sure I follow the relationship you suggest.  No values would be
> thrown away.  Or did you mean that it should be ok to throw away values? I
> don't think it would prevent that either.

Well maybe I was misunderstanding your proposed YieldError. You didn't
really explain it -- you just used it and assumed everybody understood
what you meant. My assumption was that you meant for YieldError to be
raised if yield was used as an expression (not a statement) but next()
was called instead of send().

My response was that it's ugly to make a distinction between

  x = <expr>
  del x  # Or just not use x

and

  <expr>

But maybe I misunderstood what you meant.

> What the YieldError case really does is give the generator a bit more
> control.  As far as the calling routine that uses it is concerned, it just
> works.  What happened inside the generator is completely transparent to the
> routine using the generator.  If the calling routine does see a YieldError,
> it means it probably was a bug.

That sounds pretty close to the rules for GeneratorExit.

>>>         # next was called not send.
>>>         yield value
>>
>> I object to overloading yield for both a *resumable* operation and
>> returning a (final) value; that's why PEP 380 will let you write
>> "return value". (Many alternatives were considered but we always come
>> back to the simple "return value".)
>
> That works for me.  I think a lot of people will find it easy to learn.
>
>
>>> def parallel_reduce(iterable, funcs):
>>>     collectors = [reduce_collector(func) for func in funcs]
>>>     for v in iterable:
>>>         for coll in collectors:
>>>             coll.send(v)
>>>     return [next(c) for c in collectors]
>>
>> I really object to using next() for both getting the return value and
>> the next yielded value. Jacob's proposal to spell this as c.close()
>> sounds much better to me.
>
> If c.close also throws the GeneratorExit and returns a value, that would be
> cool. Thanks.

It does throw GeneratorExit (that's the whole reason for
GeneratorExit's existence :-).

> I take it that the objections have more to do with style and coding
> practices than with what is possible.

Yeah, it's my gut and that's hard to reason with but usually right.
(See also: http://www.amazon.com/How-We-Decide-Jonah-Lehrer/dp/0618620117
)

>>> It nicely separates input and output parts of a co-function, which can be
>>> tricky to get right when you have to receive and send at the same yield.
>>
>> I don't think there was a problem with this in my code (or if there
>> was you didn't solve it).
>
> There wasn't in this code.  This is one of those areas where it can be
> really difficult to find the correct way to express a co-function that does
> both input and output, but not necessarily in a fixed order.

Maybe for that one should use a "channel" abstraction, like Go (and
before it, CSP)? I noticed that Monocle
(http://github.com/saucelabs/monocle) has a demo of that in its
"experimental" module (but the example is kind of silly).

> I begin almost any co-function with this at the top of the loop and later
> trim it up if parts of it aren't needed.
>
>    out_value = None
>    while True:
>        in_value = yield out_value
>        out_value = None
>        ...
>        # rest of loop to check in_value and modify out_value
>
> As long as None isn't a valid data item, this works most of the time.
>
>
>>> Maybe in Python 4k?  Oh well. :-)
>>
>> Nah.
>
> I'm ok with that.
>
> Ron
>



-- 
--Guido van Rossum (python.org/~guido)


From rrr at ronadam.com  Tue Oct 26 03:36:13 2010
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 25 Oct 2010 20:36:13 -0500
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
Message-ID: <4CC6308D.9090201@ronadam.com>



On 10/25/2010 10:13 AM, Guido van Rossum wrote:

> BTW, while I have you, what do you think of Greg's "cofunctions" proposal?


Well, here's my .5 cents' worth.

I'm still undecided.

Because of the many optimizations Python has had in the last year 
speeding up attribute access (thanks, guys!), classes don't get penalized 
as much as they used to.  So I'd like to see some speed comparisons of 
classes vs co-functions.

I think classes are much easier to use and may not be as slow as some 
may think.
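
For instance, a rough class-based counterpart to the reduce_collector
generator elsewhere in this thread (my sketch, untimed) might be:

    class ReduceCollector:
        def __init__(self, func):
            self.func = func
            self.outcome = None
        def send(self, value):
            # fold each incoming value into the running outcome
            if self.outcome is None:
                self.outcome = value
            else:
                self.outcome = self.func(self.outcome, value)

    mins = ReduceCollector(min)
    for v in range(100):
        mins.send(v)
    print(mins.outcome)   # -> 0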

Ron


From jh at improva.dk  Tue Oct 26 03:35:33 2010
From: jh at improva.dk (Jacob Holm)
Date: Tue, 26 Oct 2010 03:35:33 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
Message-ID: <4CC63065.9040507@improva.dk>

On 2010-10-25 17:13, Guido van Rossum wrote:
> On Mon, Oct 25, 2010 at 3:19 AM, Jacob Holm <jh at improva.dk> wrote:
>> More interesting (to me at least) is that this is an excellent example
>> of why I would like to see a version of PEP380 where "close" on a
>> generator can return a value (AFAICT the version of PEP380 on
>> http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not
>> mention this possibility, or even link to the heated discussion we had
>> on python-ideas around march/april 2009).
> 
> Can you dig up the link here?
> 
> I recall that discussion but I don't recall a clear conclusion coming
> from it -- just heated debate.
> 


Well here is a recap of the end of the discussion about how to handle
generator return values and g.close().

  Greg's conclusion that g.close() should not return a value:
  http://mail.python.org/pipermail/python-ideas/2009-April/003959.html

  My reply (ordered list of ways to handle return values in generators):
  http://mail.python.org/pipermail/python-ideas/2009-April/003984.html

  Some arguments for storing the return value on the generator:
  http://mail.python.org/pipermail/python-ideas/2009-April/004008.html

  Some support for that idea from Nick:
  http://mail.python.org/pipermail/python-ideas/2009-April/004012.html

  You're not convinced by Greg's argument:
  http://mail.python.org/pipermail/python-ideas/2009-April/003985.html

  Greg arguing that using GeneratorExit this way is bad:
  http://mail.python.org/pipermail/python-ideas/2009-April/004001.html

  You add a new complete proposal including g.close() returning a value:
  http://mail.python.org/pipermail/python-ideas/2009-April/003944.html

  I point out some problems e.g. with the handling of return values:
  http://mail.python.org/pipermail/python-ideas/2009-April/003981.html

  Then the discussion goes on at length about the problems of using a
  coroutine decorator with yield-from.  At one point I am arguing for
  generators to keep a reference to the last value yielded:
  http://mail.python.org/pipermail/python-ideas/2009-April/004032.html

  And you reply that storing "unnatural" state on the generator or
  frame object is a bad idea:
  http://mail.python.org/pipermail/python-ideas/2009-April/004034.html

  From which I concluded that having g.close() return a value (the same
  on each successive call) would be a no-go:
  http://mail.python.org/pipermail/python-ideas/2009-April/004040.html

  Which you confirmed:
  http://mail.python.org/pipermail/python-ideas/2009-April/004041.html


The latest draft (#13) I have been able to find was announced in
http://mail.python.org/pipermail/python-ideas/2009-April/004189.html

And can be found at
http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/attachment-0001.txt

I had some later suggestions for how to change the expansion, see e.g.
http://mail.python.org/pipermail/python-ideas/2009-April/004195.html  (I
find that version easier to reason about even now 1½ years later)




> Based on my example I have to agree that returning a value from
> close() would be nice. There is a little detail, how multiple
> arguments to StopIteration should be interpreted, but that's not so
> important if it's being raised by a return statement.
> 

Right.  I would assume that the return value of g.close() if we ever got
one was to be taken from the first argument to the StopIteration.

What killed the proposal last time was the question of what should
happen when you call g.close() on an exhausted generator.  My preferred
solution was (and is) that the generator should save the value from the
terminating StopIteration (or None if it ended by some other means) and
that g.close() should return that value each time and g.next(), g.send()
and g.throw() should raise a StopIteration with the value.
Unless you have changed your position on storing the return value, that
solution is dead in the water.

For this use case we don't actually need to call close() on an exhausted
generator so perhaps there is *some* use in only returning a value when
the generator is actually running.

Here's a stupid idea... let g.close take an optional argument that it
can return if the generator is already exhausted and let it return the
value from the StopIteration otherwise.

def close(self, default=None):
    if self.gi_frame is None:
        return default
    try:
        self.throw(GeneratorExit)
    except StopIteration as e:
        return e.args[0]
    except GeneratorExit:
        return None
    else:
        raise RuntimeError('generator ignored GeneratorExit')
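
If generators grew such a method, the intended use would be roughly
(hypothetical, of course, since today's close() takes no argument):

    result = g.close(default=0)   # 0 if g was already exhausted,
                                  # the stored "return value" otherwise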



> I totally agree that not having to call throw() and catch whatever it
> bounces back is much nicer. (Now I wish there was a way to avoid the
> "try..except GeneratorExit" construct in the generator, but I think I
> should stop while I'm ahead. :-)
> 
> The interesting thing is that I've been dealing with generators used
> as coroutines or tasks intensely on and off since July, and I haven't
> had a single need for any of the three patterns that this example
> happened to demonstrate:
> 
> - the need to "prime" the generator in a separate step
> - throwing and catching GeneratorExit
> - getting a value from close()
> 
> (I did have a lot of use for send(), throw(), and extracting a value
> from StopIteration.)
> 

I think these things (at least priming and close()) are mostly an issue
when using coroutines from non-coroutines.  That means it is likely to
be common in small examples where you write the whole program, but less
common when you are writing small(ish) parts of a larger framework.

Throwing and catching GeneratorExit is not common, and according to some
shouldn't be used for this purpose at all.


> In my context, generators are used to emulate concurrently running
> tasks, and "yield" is always used to mean "block until this piece of
> async I/O is complete, and wake me up with the result". This is
> similar to the "classic" trampoline code found in PEP 342.
> 
> In fact, when I wrote the example for this thread, I fumbled a bit
> because the use of generators there is different than I had been using
> them (though it was no doubt thanks to having worked with them
> intensely that I came up with the example quickly).
> 

This sounds a lot like working in a "larger framework" to me. :)


> So, it is clear that generators are extremely versatile, and PEP 380
> deserves several good use cases to explain all the API subtleties.
> 

I like your example because it matches the way I would have used
generators to solve it.  OTOH, it is not hard to rewrite parallel_reduce
as a traditional function.  In fact, the result is a bit shorter and
quite a bit faster so it is not a good example of what you need
generators for.


> BTW, while I have you, what do you think of Greg's "cofunctions" proposal?
> 

I'll have to get back to you on that.

- Jacob


From guido at python.org  Tue Oct 26 05:14:34 2010
From: guido at python.org (Guido van Rossum)
Date: Mon, 25 Oct 2010 20:14:34 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CC63065.9040507@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
Message-ID: <AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>

On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm <jh at improva.dk> wrote:
> On 2010-10-25 17:13, Guido van Rossum wrote:
>> On Mon, Oct 25, 2010 at 3:19 AM, Jacob Holm <jh at improva.dk> wrote:
>>> More interesting (to me at least) is that this is an excellent example
>>> of why I would like to see a version of PEP380 where "close" on a
>>> generator can return a value (AFAICT the version of PEP380 on
>>> http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not
>>> mention this possibility, or even link to the heated discussion we had
>>> on python-ideas around march/april 2009).
>>
>> Can you dig up the link here?
>>
>> I recall that discussion but I don't recall a clear conclusion coming
>> from it -- just heated debate.
>>
>
>
> Well here is a recap of the end of the discussion about how to handle
> generator return values and g.close().

Thanks, very thorough!

>   Greg's conclusion that g.close() should not return a value:
>   http://mail.python.org/pipermail/python-ideas/2009-April/003959.html
>
>   My reply (ordered list of ways to handle return values in generators):
>   http://mail.python.org/pipermail/python-ideas/2009-April/003984.html
>
>   Some arguments for storing the return value on the generator:
>   http://mail.python.org/pipermail/python-ideas/2009-April/004008.html
>
>   Some support for that idea from Nick:
>   http://mail.python.org/pipermail/python-ideas/2009-April/004012.html
>
>   You're not convinced by Greg's argument:
>   http://mail.python.org/pipermail/python-ideas/2009-April/003985.html
>
>   Greg arguing that using GeneratorExit this way is bad:
>   http://mail.python.org/pipermail/python-ideas/2009-April/004001.html
>
>   You add a new complete proposal including g.close() returning a value:
>   http://mail.python.org/pipermail/python-ideas/2009-April/003944.html
>
>   I point out some problems e.g. with the handling of return values:
>   http://mail.python.org/pipermail/python-ideas/2009-April/003981.html
>
>   Then the discussion goes on at length about the problems of using a
>   coroutine decorator with yield-from.  At one point I am arguing for
>   generators to keep a reference to the last value yielded:
>   http://mail.python.org/pipermail/python-ideas/2009-April/004032.html
>
>   And you reply that storing "unnatural" state on the generator or
>   frame object is a bad idea:
>   http://mail.python.org/pipermail/python-ideas/2009-April/004034.html
>
>   From which I concluded that having g.close() return a value (the same
>   on each successive call) would be a no-go:
>   http://mail.python.org/pipermail/python-ideas/2009-April/004040.html
>
>   Which you confirmed:
>   http://mail.python.org/pipermail/python-ideas/2009-April/004041.html
>
>
> The latest draft (#13) I have been able to find was announced in
> http://mail.python.org/pipermail/python-ideas/2009-April/004189.html
>
> And can be found at
> http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/attachment-0001.txt

Hmm... It does look like the PEP editors dropped the ball on this one
(or maybe Greg didn't mail it directly to them). It doesn't seem there
are substantial differences from the published version at
http://www.python.org/dev/peps/pep-0380/ though, close() still doesn't
return a value.

> I had some later suggestions for how to change the expansion, see e.g.
> http://mail.python.org/pipermail/python-ideas/2009-April/004195.html  (I
> find that version easier to reason about even now 1½ years later)

Hopefully you & Greg can agree on a new draft. I'd like this to make
progress and I really want it to appear in 3.3. But I don't have the
time to do the editing and reviewing of the PEP.

>> Based on my example I have to agree that returning a value from
>> close() would be nice. There is a little detail, how multiple
>> arguments to StopIteration should be interpreted, but that's not so
>> important if it's being raised by a return statement.
>>
>
> Right.  I would assume that the return value of g.close() if we ever got
> one was to be taken from the first argument to the StopIteration.

That's a reasonable position. Monocle currently makes it so that using
yield Return(x, y, z) [which in my view should be spelled raise
Return(x, y, z)] is equivalent to return x, y, z, but there's no real
need if the latter syntax is actually supported.

> What killed the proposal last time was the question of what should
> happen when you call g.close() on an exhausted generator.  My preferred
> solution was (and is) that the generator should save the value from the
> terminating StopIteration (or None if it ended by some other means) and
> that g.close() should return that value each time and g.next(), g.send()
> and g.throw() should raise a StopIteration with the value.
> Unless you have changed your position on storing the return value, that
> solution is dead in the water.

I haven't changed my position. Closing a file twice doesn't do
anything the second time either.

> For this use case we don't actually need to call close() on an exhausted
> generator so perhaps there is *some* use in only returning a value when
> the generator is actually running.

:-)

> Here's a stupid idea... let g.close take an optional argument that it
> can return if the generator is already exhausted and let it return the
> value from the StopIteration otherwise.
>
> def close(self, default=None):
>     if self.gi_frame is None:
>         return default
>     try:
>         self.throw(GeneratorExit)
>     except StopIteration as e:
>         return e.args[0]
>     except GeneratorExit:
>         return None
>     else:
>         raise RuntimeError('generator ignored GeneratorExit')

You'll have to explain why None isn't sufficient.

>> I totally agree that not having to call throw() and catch whatever it
>> bounces back is much nicer. (Now I wish there was a way to avoid the
>> "try..except GeneratorExit" construct in the generator, but I think I
>> should stop while I'm ahead. :-)
>>
>> The interesting thing is that I've been dealing with generators used
>> as coroutines or tasks intensely on and off since July, and I haven't
>> had a single need for any of the three patterns that this example
>> happened to demonstrate:
>>
>> - the need to "prime" the generator in a separate step
>> - throwing and catching GeneratorExit
>> - getting a value from close()
>>
>> (I did have a lot of use for send(), throw(), and extracting a value
>> from StopIteration.)
>>
>
> I think these things (at least priming and close()) are mostly an issue
> when using coroutines from non-coroutines.  That means it is likely to
> be common in small examples where you write the whole program, but less
> common when you are writing small(ish) parts of a larger framework.
>
> Throwing and catching GeneratorExit is not common, and according to some
> shouldn't be used for this purpose at all.

Well, *throwing* it is close()'s job. And *catching* it ought to be
pretty rare. Maybe this idiom would be better:

def sum():
  total = 0
  try:
    while True:
      value = yield
      total += value
  finally:
    return total

>> In my context, generators are used to emulate concurrently running
>> tasks, and "yield" is always used to mean "block until this piece of
>> async I/O is complete, and wake me up with the result". This is
>> similar to the "classic" trampoline code found in PEP 342.
>>
>> In fact, when I wrote the example for this thread, I fumbled a bit
>> because the use of generators there is different than I had been using
>> them (though it was no doubt thanks to having worked with them
>> intensely that I came up with the example quickly).
>>
>
> This sounds a lot like working in a "larger framework" to me. :)

Possibly. I realize that I have code that looks something like this:

next_input = None
while ...not done yet...:
  output = gen.send(next_input)
  next_input = ...computed from output...  # many variations

which quite naturally computes next_input from output but it does
start out with an initial value of None for next_input in order to
prime the pump.
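
A concrete toy version of that shape (my example, not real task code)
might look like:

    def running_total():
        total = 0
        while True:
            value = yield total
            total += value

    gen = running_total()
    next_input = None                  # initial None primes the pump
    for value in [1, 2, 3]:
        output = gen.send(next_input)  # first call: output == 0
        next_input = value             # in general, computed from output
    print(gen.send(next_input))        # -> 6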

>> So, it is clear that generators are extremely versatile, and PEP 380
>> deserves several good use cases to explain all the API subtleties.
>>
>
> I like your example because it matches the way I would have used
> generators to solve it.  OTOH, it is not hard to rewrite parallel_reduce
> as a traditional function.  In fact, the result is a bit shorter and
> quite a bit faster so it is not a good example of what you need
> generators for.

I'm not sure I understand. Maybe you meant to rewrite it as a class?
There's some state that wouldn't have a good place to live without
either a class or a (generator) stackframe to survive.

>> BTW, while I have you, what do you think of Greg's "cofunctions" proposal?
>>
>
> I'll have to get back to you on that.
>
> - Jacob
>



-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Tue Oct 26 05:25:08 2010
From: guido at python.org (Guido van Rossum)
Date: Mon, 25 Oct 2010 20:25:08 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
Message-ID: <AANLkTi=zK0kXS-9n_P3EEpM_LJyRLgPqqDJ7+qJW6p87@mail.gmail.com>

By the way, here's how to emulate the value-returning-close() on a
generator, assuming the generator uses raise StopIteration(x) to mean
return x:

def gclose(gen):
  try:
    gen.throw(GeneratorExit)
  except StopIteration as err:
    if err.args:
      return err.args[0]
  except GeneratorExit:
    pass
  return None

I like this because it's fairly straightforward (except for the detail
of having to also catch GeneratorExit).

In fact it would be a really simple change to gen_close() in
genobject.c -- the only change needed there would be to return
err.args[0]. I like small evolutionary improvements to APIs.
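
For example, with a generator that uses the raise StopIteration(x)
convention (a sketch; later Python versions disallow raising
StopIteration inside a generator):

    def averager():
        total = count = 0
        try:
            while True:
                total += yield
                count += 1
        except GeneratorExit:
            raise StopIteration(total / count)

    g = averager()
    next(g)           # prime it
    g.send(10)
    g.send(20)
    print(gclose(g))  # -> 15.0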

-- 
--Guido van Rossum (python.org/~guido)


From rrr at ronadam.com  Tue Oct 26 08:57:32 2010
From: rrr at ronadam.com (Ron Adam)
Date: Tue, 26 Oct 2010 01:57:32 -0500
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTincGuMQ7L-BicJb-udvo-DACb_XspScGsYdRK6s@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC5E055.9010009@ronadam.com>
	<AANLkTimjRuhVQvnhESV-2YCZm8ssTjFctf2yCj5ZjVFJ@mail.gmail.com>
	<4CC62869.6090503@ronadam.com>
	<AANLkTincGuMQ7L-BicJb-udvo-DACb_XspScGsYdRK6s@mail.gmail.com>
Message-ID: <4CC67BDC.8010601@ronadam.com>



On 10/25/2010 08:34 PM, Guido van Rossum wrote:
> On Mon, Oct 25, 2010 at 6:01 PM, Ron Adam<rrr at ronadam.com>  wrote:
>>
>> On 10/25/2010 03:21 PM, Guido van Rossum wrote:
>>>
>>> On Mon, Oct 25, 2010 at 12:53 PM, Ron Adam<rrr at ronadam.com>    wrote:
>>>>
>>>> This is how my mind wants to write this.
>>>>
>>>> @consumer
>>>> def reduce_collector(func):
>>>>     try:
>>>>         value = yield            # No value to yield here.
>>>>         while True:
>>>>             value = func((yield), value)        # or here.
>>>>     except YieldError:

> Well maybe I was misunderstanding your proposed YieldError. You didn't
> really explain it -- you just used it and assumed everybody understood
> what you meant.

Sorry about that; it is too easy to think something is clear on these 
boards when in fact it isn't as clear as we (I, in this case) think it is.

hmmm ... I feel a bit embarrassed because I wasn't really trying to 
convince you to do this.  It's just what first came to mind when I asked 
myself, "if there were an easier way to write it, how would I do it?".  As 
you pointed out, it isn't that much different from the c.close() example 
Jacob gave.

To me, that is a nice indication that you (and Jacob and Greg) are on the 
right track.

I think YieldError is an interesting concept, but it requires too many 
changes to make it work.

( I just wish I could be of more help here.  :-/ )

Cheers,
    Ron








From __peter__ at web.de  Tue Oct 26 10:12:54 2010
From: __peter__ at web.de (Peter Otten)
Date: Tue, 26 Oct 2010 10:12:54 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
Message-ID: <ia62hj$ncn$1@dough.gmane.org>

Guido van Rossum wrote:

>> I like your example because it matches the way I would have used
>> generators to solve it.  OTOH, it is not hard to rewrite parallel_reduce
>> as a traditional function.  In fact, the result is a bit shorter and
>> quite a bit faster so it is not a good example of what you need
>> generators for.
> 
> I'm not sure I understand. Maybe you meant to rewrite it as a class?
> There's some state that wouldn't have a good place to live without
> either a class or a (generator) stackframe to survive.

How about

def parallel_reduce(items, funcs):
    items = iter(items)
    try:
        first = next(items)
    except StopIteration:
        raise TypeError
    accu = [first] * len(funcs)
    for b in items:
        accu = [f(a, b) for f, a in zip(funcs, accu)]
    return accu
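
For example:

    print(parallel_reduce(range(100), [min, max]))   # -> [0, 99]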

Peter



From solipsis at pitrou.net  Tue Oct 26 10:50:18 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 26 Oct 2010 10:50:18 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
 arguments
In-Reply-To: <4CC6164B.5040201@trueblade.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
Message-ID: <1288083018.3547.0.camel@localhost.localdomain>

On Monday, 25 October 2010 at 19:44 -0400, Eric Smith wrote:
> On 10/22/2010 6:45 PM, Antoine Pitrou wrote:
> > On Sat, 23 Oct 2010 00:36:30 +0200
> > "M.-A. Lemburg"<mal at egenix.com>  wrote:
> >>
> >> It may seem strange to have functions, methods or object constructors
> >> with more than 255 parameters, but as I said: when using code generators,
> >> the generators don't care whether they use 100 or 300 parameters.
> >
> > Why not make the code generators smarter?
> 
> Because it makes more sense to fix it in one place than force the burden 
> of coding around an arbitrary limit upon each such code generator.

Sure, but in the absence of anyone providing a patch for CPython, it is
still a possible resolution.

Regards

Antoine.




From mal at egenix.com  Tue Oct 26 11:31:41 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 26 Oct 2010 11:31:41 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <1288083018.3547.0.camel@localhost.localdomain>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
	<1288083018.3547.0.camel@localhost.localdomain>
Message-ID: <4CC69FFD.7080102@egenix.com>

Antoine Pitrou wrote:
> On Monday, 25 October 2010 at 19:44 -0400, Eric Smith wrote:
>> On 10/22/2010 6:45 PM, Antoine Pitrou wrote:
>>> On Sat, 23 Oct 2010 00:36:30 +0200
>>> "M.-A. Lemburg"<mal at egenix.com>  wrote:
>>>>
>>>> It may seem strange to have functions, methods or object constructors
>>>> with more than 255 parameters, but as I said: when using code generators,
>>>> the generators don't care whether they use 100 or 300 parameters.
>>>
>>> Why not make the code generators smarter?

I don't see a way to work around the
limitation without starting every single wrapper object's .__init__()
with a test routine that checks the parameters in Python - and that's
not really feasible since it would kill performance.

You'd also have to move all **kws parameters to locals in order to
emulate the normal Python parameter invocation of the method.

>> Because it makes more sense to fix it in one place than force the burden 
>> of coding around an arbitrary limit upon each such code generator.
> 
> Sure, but in the absence of anyone providing a patch for CPython, it is
> still a possible resolution.

Cesare already posted a patch based on using EXTENDED_ARG. Should we
reopen that old ticket or create a new one ?

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 26 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From solipsis at pitrou.net  Tue Oct 26 11:44:38 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 26 Oct 2010 11:44:38 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
 arguments
In-Reply-To: <4CC69FFD.7080102@egenix.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
	<1288083018.3547.0.camel@localhost.localdomain>
	<4CC69FFD.7080102@egenix.com>
Message-ID: <1288086278.3547.8.camel@localhost.localdomain>


> >>> Why not make the code generators smarter?
> 
> I don't see a way to work around the
> limitation without starting every single wrapper object's .__init__()
> with a test routine that checks the parameters in Python - and that's
> not really feasible since it would kill performance.

Have you considered that having 200 or 300 keyword arguments might
already kill performance? I don't think our function invocation code is
tuned for such a number.

> Cesare already posted a patch based on using EXTENDED_ARG. Should we
> reopen that old ticket or create a new one ?

Was there an old ticket open? I have only seen a piece of code on
python-ideas. Regardless, whether it's one or the other doesn't really
matter, as long as it's recorded somewhere :)

Regards

Antoine.




From cesare.di.mauro at gmail.com  Tue Oct 26 11:45:56 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Tue, 26 Oct 2010 11:45:56 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <4CC69FFD.7080102@egenix.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
	<1288083018.3547.0.camel@localhost.localdomain>
	<4CC69FFD.7080102@egenix.com>
Message-ID: <AANLkTi=UH+J2S9WtKRWv5bRhdVtPTWAEfVLtxwSaE-re@mail.gmail.com>

2010/10/26 M.-A. Lemburg <mal at egenix.com>

> Antoine Pitrou wrote:
> > Sure, but in the absence of anyone providing a patch for CPython, it is
> > still a possible resolution.
>
> Cesare already posted a patch based on using EXTENDED_ARG. Should we
> reopen that old ticket or create a new one ?
>
> --
> Marc-Andre Lemburg
>

I can provide another patch that will not use EXTENDED_ARG (no VM changes),
and uses *args and/or **kwargs function calls when there are more than 255
arguments or keyword arguments.

But I need some days.

If needed, I'll post it at most on this week-end.

Cesare

From ncoghlan at gmail.com  Tue Oct 26 11:51:52 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 26 Oct 2010 19:51:52 +1000
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>
References: <20101025154932.06be2faf@o>
	<AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>
Message-ID: <AANLkTin82B-QT46bP5rxD6ddyg2Y+0cxDF9STLxJhUhz@mail.gmail.com>

On Tue, Oct 26, 2010 at 2:00 AM, Laurens Van Houtven <lvh at laurensvh.be> wrote:
> Hm. I suppose the need for this would be slightly mitigated if I understood
> why str.join does not try to convert the elements of the iterable it is
> passed to strs (and analogously for unicode).
> Does anyone know what the rationale for that is?

To elaborate on Guido's answer, omitting automatic coercion makes it
fairly easy to coerce via str, repr or ascii (as appropriate), or else
to implicitly assert that all the inputs should be strings (or
buffers) already.

Once you put automatic coercion inside str.join, the last option
becomes comparatively hard to do.

Note that easy coercion in str.join is one of the use cases that
prompted us to keep map as a builtin though:

sep.join(map(str, seq))
sep.join(map(repr, seq))
sep.join(map(ascii, seq))
sep.join(seq)

The genexp equivalents are both slower and harder to read than the
simple map invocations.
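
That is, the genexp equivalents in question would be:

    sep.join(str(x) for x in seq)
    sep.join(repr(x) for x in seq)
    sep.join(ascii(x) for x in seq)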

To elaborate on Terry's answer as well - when join was the function
string.join, people often had trouble remembering whether the sequence or
the separator argument came first. With the str method, while some
people may find it odd to have the method invocation on the separator,
they typically don't forget the order once they learn it for the first
time.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From steve at pearwood.info  Tue Oct 26 12:10:59 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 26 Oct 2010 21:10:59 +1100
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTi=yiaDbAfiTnV1yEk1L8UfCf4gHf3=3d9TXhvXw@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>	<201010111017.56101.steve@pearwood.info>	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>	<4CB7B7C2.8090401@ronadam.com>	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>	<4CB88BD2.4010901@ronadam.com>
	<i9a2u9$q8k$1@dough.gmane.org>	<4CB898CD.6000207@ronadam.com>	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>	<4CB8B2F5.2020507@ronadam.com>
	<ia2d8l$7rr$1@dough.gmane.org>
	<AANLkTi=yiaDbAfiTnV1yEk1L8UfCf4gHf3=3d9TXhvXw@mail.gmail.com>
Message-ID: <4CC6A933.3080605@pearwood.info>

Guido van Rossum wrote:

> This should not require threads.
> 
> Here's a bare-bones sketch using generators:
> 
> def reduce_collector(func):
>     outcome = None
>     while True:
>         try:
>             val = yield
>         except GeneratorExit:
>             break
>         if outcome is None:
>             outcome = val
>         else:
>             outcome = func(outcome, val)
>     raise StopIteration(outcome)
> 
> def parallel_reduce(iterable, funcs):
>     collectors = [reduce_collector(func) for func in funcs]
>     values = [None for _ in collectors]
>     for i, coll in enumerate(collectors):
>         try:
>             next(coll)
>         except StopIteration as err:
>             values[i] = err.args[0]
>             collectors[i] = None
>     for val in iterable:
>         for i, coll in enumerate(collectors):
>             if coll is not None:
>                 try:
>                     coll.send(val)
>                 except StopIteration as err:
>                     values[i] = err.args[0]
>                     collectors[i] = None
>     for i, coll in enumerate(collectors):
>         if coll is not None:
>             try:
>                 coll.throw(GeneratorExit)
>             except StopIteration as err:
>                 values[i] = err.args[0]
>     return values


Perhaps I'm missing something, but to my mind, that's an awfully 
complicated solution for such a simple problem. Here's my attempt:

def multi_reduce(iterable, funcs):
    it = iter(iterable)
    collectors = [next(it)]*len(funcs)
    for x in it:
        for i, f in enumerate(funcs):
            collectors[i] = f(collectors[i], x)
    return collectors
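
For example:

    print(multi_reduce(range(100), [min, max]))   # -> [0, 99]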

I've called it multi_reduce rather than parallel_reduce, because it 
doesn't execute the functions in parallel. By my testing on Python 
3.1.1, multi_reduce is consistently ~30% faster than the generator-based 
solution for lists with 1000 - 10,000,000 items.

So what am I missing? What does your parallel_reduce give us that 
multi_reduce doesn't?



-- 
Steven



From ncoghlan at gmail.com  Tue Oct 26 12:36:08 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 26 Oct 2010 20:36:08 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
Message-ID: <AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>

On Tue, Oct 26, 2010 at 1:14 PM, Guido van Rossum <guido at python.org> wrote:
> On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm <jh at improva.dk> wrote:
>> Throwing and catching GeneratorExit is not common, and according to some
>> shouldn't be used for this purpose at all.
>
> Well, *throwing* it is close()'s job. And *catching* it ought to be
> pretty rare. Maybe this idiom would be better:
>
> def sum():
>   total = 0
>   try:
>     while True:
>       value = yield
>       total += value
>   finally:
>     return total

Rereading my previous post that Jacob linked, I'm still a little
uncomfortable with the idea of people deliberately catching
GeneratorExit to turn it into a normal value return to be reported by
close(). That said, I'm even less comfortable with the idea of
encouraging the moral equivalent of a bare except clause :)

I see two realistic options here:

1. Use GeneratorExit for this, have g.close() return a value and I
(and others that agree with me) just get the heck over it.

2. Add a new GeneratorReturn exception and a new g.finish() method
that follows the same basic algorithm Guido suggested, only with a
different exception type:

class GeneratorReturn(Exception):
    # Note: an ordinary exception, unlike GeneratorExit
    pass

def finish(gen):
    try:
        gen.throw(GeneratorReturn)
        raise RuntimeError("Generator ignored GeneratorReturn")
    except StopIteration as err:
        if err.args:
            return err.args[0]
    except GeneratorReturn:
        pass
    return None
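
For example, a generator cooperating with finish() might look like this
(a sketch, given the definitions above; it relies on raising
StopIteration inside the generator, which later Python versions disallow):

    def summer():
        total = 0
        try:
            while True:
                total += yield
        except GeneratorReturn:
            raise StopIteration(total)

    g = summer()
    next(g)            # prime it
    g.send(1)
    g.send(2)
    print(finish(g))   # -> 3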

(Why "finish" as the suggested name for the method? I'd prefer
"return", but that's a keyword and "return_" is somewhat ugly. Pairing
GeneratorReturn with finish() is my second choice, for the "OK, time
to wrap things up and complete your assigned task" connotations, as
compared to the "drop everything and clean up the mess" connotations
of GeneratorExit and close())

I'd personally be +1 on option 2 (since it addresses the immediate use
case while maintaining appropriate separation of concerns between
guaranteed resource cleanup and graceful completion of coroutines) and
-0 on option 1 (unsurprising, given my previously stated objections to
failing to maintain appropriate separation of concerns).

(I should note that this differs from the previous suggestion of a
GeneratorReturn exception in the context of PEP 380. Those suggestions
were to use it as a replacement for StopIteration when a generator
contained a return statement. The suggestion here is to instead use it
as a replacement for GeneratorExit in order to request
prompt-but-graceful completion of a generator rather than just bailing
out immediately).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From mal at egenix.com  Tue Oct 26 12:38:44 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 26 Oct 2010 12:38:44 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <AANLkTi=UH+J2S9WtKRWv5bRhdVtPTWAEfVLtxwSaE-re@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
	<1288083018.3547.0.camel@localhost.localdomain>
	<4CC69FFD.7080102@egenix.com>
	<AANLkTi=UH+J2S9WtKRWv5bRhdVtPTWAEfVLtxwSaE-re@mail.gmail.com>
Message-ID: <4CC6AFB4.3040003@egenix.com>

Cesare Di Mauro wrote:
> 2010/10/26 M.-A. Lemburg <mal at egenix.com>
> 
>> Antoine Pitrou wrote:
>>> Sure, but in the absence of anyone providing a patch for CPython, it is
>>> still a possible resolution.
>>
>> Cesare already posted a patch based on using EXTENDED_ARG. Should we
>> reopen that old ticket or create a new one ?
>>
>> --
>> Marc-Andre Lemburg
>>
> 
> I can provide another patch that will not use EXTENDED_ARG (no VM changes),
> and uses *args and/or **kwargs function calls when there are more than 255
> arguments or keyword arguments.
> 
> But I need some days.
> 
> If needed, I'll post it at most on this week-end.

You mean a version that pushes the *args tuple and **kws dict
on the stack and then uses those for calling the function/method ?

I think that would be a lot more efficient than pushing/popping
hundreds of parameters on/off the stack.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 26 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From cesare.di.mauro at gmail.com  Tue Oct 26 13:10:56 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Tue, 26 Oct 2010 13:10:56 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <4CC6AFB4.3040003@egenix.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
	<1288083018.3547.0.camel@localhost.localdomain>
	<4CC69FFD.7080102@egenix.com>
	<AANLkTi=UH+J2S9WtKRWv5bRhdVtPTWAEfVLtxwSaE-re@mail.gmail.com>
	<4CC6AFB4.3040003@egenix.com>
Message-ID: <AANLkTimZaYKWDx4GQiyhOai3Oji4pQNjHGX4MrazCmu0@mail.gmail.com>

2010/10/26 M.-A. Lemburg <mal at egenix.com>

> Cesare Di Mauro wrote:
>  > I can provide another patch that will not use EXTENDED_ARG (no VM
> changes),
> > and uses *args and/or **kwargs function calls when there are more than
> 255
> > arguments or keyword arguments.
> >
> > But I need some days.
> >
> > If needed, I'll post it at most on this week-end.
>
> You mean a version that pushes the *args tuple and **kws dict
> on the stack and then uses those for calling the function/method ?
>
> I think that would be a lot more efficient than pushing/popping
> hundreds of parameters on/off the stack.
>
> --
>  Marc-Andre Lemburg
>

I was referring to the solution (which I prefer) that I proposed in my
answer to Greg two days ago.

Unfortunately, the stack must be used whatever solution we choose.

Pushing the "final" tuple and/or dictionary is a possible optimization, but
we can use it only when we have a tuple or dict of constants; otherwise we
need to use the stack.

Good case:  f(1, 2, 3, a = 1, b = 2)
We can push the (1, 2, 3) tuple and the {'a' : 1, 'b' : 2} dict, then call
f with the CALL_FUNCTION_VAR_KW opcode, passing narg = nkarg = 0.

Worst case: f(1, x, 3, a = x, b = 2)
We can't push the tuple and dict as a whole, because they first need to be
built using the stack.

The good case is possible, and I have already done some work in wpython
collecting constants on parameter pushes (even partial constant sequences),
but some additional work must be done to recognize constants-only tuples /
dicts.

However, the worst case remains unresolved.
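
To illustrate the fallback itself (my sketch, not the actual patch): a
code generator hitting the 255-argument limit can build the arguments at
run time and pass them through the ** syntax, which uses the
CALL_FUNCTION_VAR_KW path:

    def f(**kwargs):
        return len(kwargs)

    # instead of a literal call with 300 keyword arguments:
    kwargs = dict(('a%d' % i, i) for i in range(300))
    print(f(**kwargs))   # -> 300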

Cesare

From bborcic at gmail.com  Tue Oct 26 14:04:49 2010
From: bborcic at gmail.com (Boris Borcic)
Date: Tue, 26 Oct 2010 14:04:49 +0200
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <AANLkTin82B-QT46bP5rxD6ddyg2Y+0cxDF9STLxJhUhz@mail.gmail.com>
References: <20101025154932.06be2faf@o>	<AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>
	<AANLkTin82B-QT46bP5rxD6ddyg2Y+0cxDF9STLxJhUhz@mail.gmail.com>
Message-ID: <ia6g52$ldv$1@dough.gmane.org>

Nick Coghlan wrote:
> With the str method, while some
> people may find it odd to have the method invocation on the separator,
> they typically don't forget the order once they learn it for the first
> time.

OTOH, it is a pain that join and split aren't *both* methods on the separator. 
Imho, 71% of what makes it strange for join to be a method on the separator is 
that split doesn't follow the same convention.

Cheers, BB




From jh at improva.dk  Tue Oct 26 14:22:11 2010
From: jh at improva.dk (Jacob Holm)
Date: Tue, 26 Oct 2010 14:22:11 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
Message-ID: <4CC6C7F3.6090405@improva.dk>

On 2010-10-26 05:14, Guido van Rossum wrote:
> On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm <jh at improva.dk> wrote:
>> On 2010-10-25 17:13, Guido van Rossum wrote:
>>> Can you dig up the link here?
>>
>> Well here is a recap of the end of the discussion about how to handle
>> generator return values and g.close().
> 
> Thanks, very thorough!

I had to read through it myself to remember what actually happened, and
thought you (and the rest of the world) might as well benefit from the
notes I made.



>> The latest draft (#13) I have been able to find was announced in
>> http://mail.python.org/pipermail/python-ideas/2009-April/004189.html
>>
>> And can be found at
>> http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/attachment-0001.txt
> 
> Hmm... It does look like the PEP editors dropped the ball on this one
> (or maybe Greg didn't mail it directly to them). It doesn't seem there
> are substantial differences with the published version at
> http://www.python.org/dev/peps/pep-0380/ though, close() still doesn't
> return a value.
> 

IIRC, there are a few minor semantic differences in how non-generators
are handled.  I haven't made a detailed comparison.


>> I had some later suggestions for how to change the expansion, see e.g.
>> http://mail.python.org/pipermail/python-ideas/2009-April/004195.html  (I
>> find that version easier to reason about even now 1½ years later)
> 
> Hopefully you & Greg can agree on a new draft. I'd like this to make
> progress and I really want it to appear in 3.3. But I don't have the
> time to do the editing and reviewing of the PEP.

IIRC, this was just a presentation issue - the two expansions were
supposed to be equivalent.  It might become relevant if we want to
change something in the definition, because we need a common base to
discuss from.  My version is (intended to be) simpler to reason about in
the sense that things that should be handled the same are only written once.


>> What killed the proposal last time was the question of what should
>> happen when you call g.close() on an exhausted generator.  My preferred
>> solution was (and is) that the generator should save the value from the
>> terminating StopIteration (or None if it ended by some other means) and
>> that g.close() should return that value each time and g.next(), g.send()
>> and g.throw() should raise a StopIteration with the value.
>> Unless you have changed your position on storing the return value, that
>> solution is dead in the water.
> 
> I haven't changed my position. Closing a file twice doesn't do
> anything the second time either.
> 

Ok


>> Here's a stupid idea... let g.close take an optional argument that it
>> can return if the generator is already exhausted and let it return the
>> value from the StopIteration otherwise.
>>
>> def close(self, default=None):
>>    if self.gi_frame is None:
>>        return default
>>    try:
>>        self.throw(GeneratorExit)
>>    except StopIteration as e:
>>        return e.args[0]
>>    except GeneratorExit:
>>        return None
>>    else:
>>        raise RuntimeError('generator ignored GeneratorExit')
> 
> You'll have to explain why None isn't sufficient.
> 

It is not really necessary, but seemed "cleaner" somehow.  Think of
"g.close(default)" as "get me the result if possible, and this default
otherwise".  Then think of dict.get()...

An even cleaner solution might be Nicks "g.finish()" proposal, which I
will comment on separately.



>> I think these things (at least priming and close()) are mostly an issue
>> when using coroutines from non-coroutines.  That means it is likely to
>> be common in small examples where you write the whole program, but less
>> common when you are writing small(ish) parts of a larger framework.
>>
>> Throwing and catching GeneratorExit is not common, and according to some
>> shouldn't be used for this purpose at all.
> 
> Well, *throwing* it is close()'s job. And *catching* it ought to be
> pretty rare. Maybe this idiom would be better:
> 
> def sum():
>   total = 0
>   try:
>     while True:
>       value = yield
>       total += value
>   finally:
>     return total
> 

This is essentially the same as a bare except.  I think there is general
agreement that that is a bad idea.



>>> So, it is clear that generators are extremely versatile, and PEP 380
>>> deserves several good use cases to explain all the API subtleties.
>>>
>>
>> I like your example because it matches the way I would have used
>> generators to solve it.  OTOH, it is not hard to rewrite parallel_reduce
>> as a traditional function.  In fact, the result is a bit shorter and
>> quite a bit faster so it is not a good example of what you need
>> generators for.
> 
> I'm not sure I understand. Maybe you meant to rewrite it as a class?
> There's some state that wouldn't have a good place to live without
> either a class or a (generator) stackframe to survive.
> 

See the reply by Peter Otten (and my reply to him).

You mentioned some possible extensions though.  At a guess, at least
some of these would benefit greatly from the use of generators.  Maybe
such an extension would be a better example?


- Jacob


From ncoghlan at gmail.com  Tue Oct 26 14:35:24 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 26 Oct 2010 22:35:24 +1000
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <ia6g52$ldv$1@dough.gmane.org>
References: <20101025154932.06be2faf@o>
	<AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>
	<AANLkTin82B-QT46bP5rxD6ddyg2Y+0cxDF9STLxJhUhz@mail.gmail.com>
	<ia6g52$ldv$1@dough.gmane.org>
Message-ID: <AANLkTimBk7_oSxg4_5JrW981wccL-bD4j8_nTwwQVOzz@mail.gmail.com>

On Tue, Oct 26, 2010 at 10:04 PM, Boris Borcic <bborcic at gmail.com> wrote:
> Nick Coghlan wrote:
>>
>> With the str method, while some
>> people may find it odd to have the method invocation on the separator,
>> they typically don't forget the order once they learn it for the first
>> time.
>
> OTOH, it is a pain that join and split aren't *both* methods on the
> separator. Imho, 71% of what makes it strange for join to be a method on the
> separator, is that split doesn't follow the same convention.

But split only makes sense for strings, not arbitrary sequences. It's
the other way around for join.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From alexander.belopolsky at gmail.com  Tue Oct 26 16:00:33 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 26 Oct 2010 10:00:33 -0400
Subject: [Python-ideas] Move Demo scripts under Lib
Message-ID: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>

I originally proposed this under the Demo and Tools cleanup issue [1].
  The idea was to create a new package "demo" in the standard library
which will host selected demo programs or modules that currently
reside in the Demo/ directory of the source distribution.  There are
several advantages to this approach:

1. Discoverability.  Currently, various distributions place demo
scripts in different places or do not include them at all.   There is no
easy way for an end user to discover them.  With a demo package, there
will be a natural place in the Python manual to document demo scripts
and users will be able to run them using the -m option.  IDEs will be able
to present demo source code and documentation consistently.

2. Test coverage.  One of the points raised in [1] was that Demo
scripts are not routinely tested.  While it is not strictly necessary
to move them under Lib to enable testing, doing so will put these
scripts on the same footing as the rest of the standard library
modules eliminating an unnecessary barrier to writing tests.

3. Quality/relevance.  Many scripts in Demo are very old and do not
reflect modern best practices.  By picking and choosing what goes to
Lib/demo, we can improve the demo collection without removing older
scripts that some may find useful.

One objection raised to this idea was that Demo scripts do not have
the same stability of the API and backward compatibility requirements
as the rest of the standard library.   I don't think this is a serious
issue.  As long as we don't start importing demo modules from other
stdlib modules, there is no impact on the stdlib itself from changing
demo APIs.  Users may be warned that their production programs should
not depend on the demo modules.  I think the word "demo" itself
suggests that.

What do you think?


[1] http://bugs.python.org/issue7962#msg111677


From mal at egenix.com  Tue Oct 26 16:15:47 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 26 Oct 2010 16:15:47 +0200
Subject: [Python-ideas] Move Demo scripts under Lib
In-Reply-To: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>
References: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>
Message-ID: <4CC6E293.7060309@egenix.com>

Alexander Belopolsky wrote:
> I originally proposed this under the Demo and Tools cleanup issue [1].
>   The idea was to create a new package "demo" in the standard library
> which will host selected demo programs or modules that currently
> reside in the Demo/ directory of the source distribution.  There are
> several advantages to this approach:
> 
> 1. Discoverability.  Currently, various distributions place demo
> scripts in different places or do not include them at all.   There is no
> easy way for an end user to discover them.  With a demo package, there
> will be a natural place in the Python manual to document demo scripts
> and users will be able to run them using the -m option.  IDEs will be able
> to present demo source code and documentation consistently.
> 
> 2. Test coverage.  One of the points raised in [1] was that Demo
> scripts are not routinely tested.  While it is not strictly necessary
> to move them under Lib to enable testing, doing so will put these
> scripts on the same footing as the rest of the standard library
> modules eliminating an unnecessary barrier to writing tests.
> 
> 3. Quality/relevance.  Many scripts in Demo are very old and do not
> reflect modern best practices.  By picking and choosing what goes to
> Lib/demo, we can improve the demo collection without removing older
> scripts that some may find useful.
> 
> One objection raised to this idea was that Demo scripts do not have
> the same stability of the API and backward compatibility requirements
> as the rest of the standard library.   I don't think this is a serious
> issue.  As long as we don't start importing demo modules from other
> stdlib modules, there is no impact on the stdlib itself from changing
> demo APIs.  Users may be warned that their production programs should
> not depend on the demo modules.  I think the word "demo" itself
> suggests that.
> 
> What do you think?

Calling a stdlib package "demo" or "example" is not a good idea,
since those are rather common package names in existing
applications.

I also don't really see the point in moving *scripts* to the stdlib.
The lib modules are usually not executable or meant for execution
and you'd normally expect scripts to be under .../bin rather than
.../lib.

Why don't you turn the ones you find useful into PyPI packages
to install separately?

> [1] http://bugs.python.org/issue7962#msg111677

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 26 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From bborcic at gmail.com  Tue Oct 26 16:32:10 2010
From: bborcic at gmail.com (Boris Borcic)
Date: Tue, 26 Oct 2010 16:32:10 +0200
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <AANLkTimBk7_oSxg4_5JrW981wccL-bD4j8_nTwwQVOzz@mail.gmail.com>
References: <20101025154932.06be2faf@o>	<AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>	<AANLkTin82B-QT46bP5rxD6ddyg2Y+0cxDF9STLxJhUhz@mail.gmail.com>	<ia6g52$ldv$1@dough.gmane.org>
	<AANLkTimBk7_oSxg4_5JrW981wccL-bD4j8_nTwwQVOzz@mail.gmail.com>
Message-ID: <ia6opa$1eq$1@dough.gmane.org>

Nick Coghlan wrote:

>>> With the str method, while some
>>> people may find it odd to have the method invocation on the separator,
>>> they typically don't forget the order once they learn it for the first
>>> time.
>>
>> OTOH, it is a pain that join and split aren't *both* methods on the
>> separator. Imho, 71% of what makes it strange for join to be a method on the
>> separator, is that split doesn't follow the same convention.
>
> But split only makes sense for strings, not arbitrary sequences. It's
> the other way around for join.


I don't feel your "the other way around" makes clear sense. The /split/ function 
depends on two string parameters, which allows a design choice as to which one 
should be the object when making it a method call. I have been burned more than 
once by internalizing that /join/ is a method on the separator, just to 
(re-)discover that such is *not* the case for the converse method /split/ - 
although it could be (and therefore should be, to minimize cognitive dissonance).

IOW, instead of whining that there is no way to make join a method on what "we 
should" think of as the prominent object (ie the sequence/iterator) and then 
half-heartedly promoting sep.join as the solution, let's take the sep.join idiom 
seriously, together with its implication that a core object role for a string is 
to act as a separator.

And let's then propagate that notion to a *coherent* definition of split that 
makes it as well a method on the separator.

Cheers, BB




From mal at egenix.com  Tue Oct 26 16:37:29 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 26 Oct 2010 16:37:29 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <AANLkTimZaYKWDx4GQiyhOai3Oji4pQNjHGX4MrazCmu0@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
	<1288083018.3547.0.camel@localhost.localdomain>
	<4CC69FFD.7080102@egenix.com>
	<AANLkTi=UH+J2S9WtKRWv5bRhdVtPTWAEfVLtxwSaE-re@mail.gmail.com>
	<4CC6AFB4.3040003@egenix.com>
	<AANLkTimZaYKWDx4GQiyhOai3Oji4pQNjHGX4MrazCmu0@mail.gmail.com>
Message-ID: <4CC6E7A9.4030205@egenix.com>

Cesare Di Mauro wrote:
> 2010/10/26 M.-A. Lemburg <mal at egenix.com>
> 
>> Cesare Di Mauro wrote:
>>  > I can provide another patch that will not use EXTENDED_ARG (no VM
>> changes),
>>> and uses *args and/or **kwargs function calls when there are more than
>> 255
>>> arguments or keyword arguments.
>>>
>>> But I need some days.
>>>
>>> If needed, I'll post it at most on this week-end.
>>
>> You mean a version that pushes the *args tuple and **kws dict
>> on the stack and then uses those for calling the function/method ?
>>
>> I think that would be a lot more efficient than pushing/popping
>> hundreds of parameters on/off the stack.
>>
>> --
>>  Marc-Andre Lemburg
>>
> 
> I was referring to the solution (which I prefer) that I proposed answering
> to Greg, two days ago.
> 
> Unfortunately, the stack must be used whatever the solution we will use.
> 
> Pushing the "final" tuple and/or dictionary is a possible optimization, but
> we can use it only when we have a tuple or dict of constants; otherwise we
> need to use the stack.
> 
> Good case:  f(1, 2, 3, a = 1, b = 2)
> We can push the (1, 2, 3) tuple and {'a' : 1, 'b' : 2}, then call f with
> the CALL_FUNCTION_VAR_KW opcode passing narg = nkarg = 0.
> 
> Worst case: f(1, x, 3, a = x, b = 2)
> We can't push the tuple and dict as a whole, because they first need to be
> built using the stack.
> 
> The good case is possible, and I have already done some work in wpython
> collecting constants on parameter pushes (even partial constant sequences),
> but some additional work must be done to recognize constants-only tuples /
> dicts.
> 
> However, the worst case remains unresolved.

I don't understand. What is the difference between pushing values
on the stack and building a tuple/dict and then pushing those on
the stack?

In your worst case example, the compiler would first build
a tuple/dict using the args already on the stack (BUILD_TUPLE,
BUILD_MAP) and then call the function with this tuple/dict
combination - you'd basically move the tuple/dict building to
the compiler rather than having the CALL* opcodes do this
internally.

It would essentially run:

f(*(1,x,3), **{'a':x, 'b':2})

and bypass the "max. number of opcode args" limit without
degrading performance, since BUILD_TUPLE et al. essentially
run the same code for building the call arguments as the
helpers for calling a function.
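
For illustration, this is roughly what the compiler would emit for the
worst case (a sketch; the exact opcodes vary with the CPython version,
and call_many is just an illustrative name):

import dis

def call_many(f, x):
    # The same call routed through the *args/**kwargs path; the
    # argument counts encoded in the call opcode are narg == nkarg == 0,
    # so the 255-argument limit never applies.
    return f(*(1, x, 3), **{'a': x, 'b': 2})

dis.dis(call_many)
# ... LOAD_FAST/LOAD_CONST, BUILD_TUPLE 3, BUILD_MAP ...,
# then a single CALL_FUNCTION_VAR_KW with oparg 0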

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 26 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From dirkjan at ochtman.nl  Tue Oct 26 16:33:38 2010
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Tue, 26 Oct 2010 16:33:38 +0200
Subject: [Python-ideas] Move Demo scripts under Lib
In-Reply-To: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>
References: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>
Message-ID: <AANLkTinx+762Wd_5C7WWs6EV35oWSZGbYme2Bbxshfvr@mail.gmail.com>

On Tue, Oct 26, 2010 at 16:00, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:
> What do you think?

After browsing through the Demo dir a bit, I came away thinking most
of these should just be removed from the repository. I think there's
enough demo material out there on the internet (for example in the
cookbook), a lot of it of higher quality than what we have in the Demo
dir right now. Maybe it makes sense to have a basic tkinter app to get
you started. And some of the smaller functions or classes could
possibly be used in the documentation. But as it is, it seems silly to
waste developer time on stuff that few people look at or make use of
(I'm assuming this from the fact that they have previously been
neglected).

Back to the original question: I don't think moving the Demo stuff to
the Lib dir is a good idea, simply because the Lib dir should contain
libraries, not applications or scripts. Writing a section for the
documentation seems a better way to solve the discoverability problem,
testing could be done even in the Demo dir (with more structure if
need be), and quality control could just as well be exercised in the
current location.

Cheers,

Dirkjan


From alexander.belopolsky at gmail.com  Tue Oct 26 16:50:57 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 26 Oct 2010 10:50:57 -0400
Subject: [Python-ideas] Move Demo scripts under Lib
In-Reply-To: <4CC6E293.7060309@egenix.com>
References: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>
	<4CC6E293.7060309@egenix.com>
Message-ID: <AANLkTin85kLyHqOw-GXTELPau-QoWC-vVy2tC+53AM3X@mail.gmail.com>

On Tue, Oct 26, 2010 at 10:15 AM, M.-A. Lemburg <mal at egenix.com> wrote:
..
> Calling a stdlib package "demo" or "example" is not a good idea,
> since those are rather common package names in existing
> applications.
>
Since proposed "demo" package is not intended to be imported by
applications, there is not a problem if they are shadowed by
application modules.  There are plenty of common names in the stdlib.
I don't remember this cited as a problem in the past.  For example,
recently added collections and subprocess modules were likely to
conflict with names used by applications.   The "test" package has
been installed with stdlib for ages and it often conflicts with user
test packages.

I believe applications commonly using a "demo" or "example" package is
an argument for rather than against my idea.  (What's good for users
is probably good for stdlib.)

Finally, if the name conflict is indeed an issue, it is not hard to
come up with a less common name: "pydemo", "python_examples", etc.

> I also don't really see the point in moving *scripts* to the stdlib.

I gave three reasons in my first post.  The first is specifically for
*scripts*: to be able to run them using python -m without having to
know an obscure path or pollute the system path.

> The lib modules are usually not executable or meant for execution
> and you'd normally expect scripts to be under .../bin rather than
> .../lib.

Most stdlib modules are in fact executable with python -m.  Just
grep for the 'if __name__ == "__main__":' line.  While most demo scripts
are self-contained programs, many are examples of how to write modules
or classes.  See for example Demo/classes.  Furthermore, while users
can run demos, presumably the main purpose of demos is to present the
source code in them.  I believe it is more natural to look for python
source code along PYTHONPATH than along PATH.
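
E.g., a rough, illustrative count from a source checkout (top-level
modules only):

$ grep -l 'if __name__ == ' Lib/*.py | wc -l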

I don't think any demo scripts are suitable to be installed under
.../bin.   In fact, the Red Hat distribution installs them under
/usr/lib/pythonX.Y/Demo.

> Why don't you turn the ones you find useful into PyPI packages
> to install separately?

That's a good way to make them *less* discoverable than they currently
are and make even fewer distributions include them by default.

BTW, what is the purpose of the "Demo" directory to begin with?  I
would naively assume that it is the place where new users would look
to get an idea of what they can do with python.  If this is the case,
it completely misses the target, because new users are unlikely to have
a source distribution or to look under /usr/lib/pythonX.Y/Demo or some
other system-specific place.


From jh at improva.dk  Tue Oct 26 16:44:31 2010
From: jh at improva.dk (Jacob Holm)
Date: Tue, 26 Oct 2010 16:44:31 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
Message-ID: <4CC6E94F.3090702@improva.dk>

On 2010-10-26 12:36, Nick Coghlan wrote:
> On Tue, Oct 26, 2010 at 1:14 PM, Guido van Rossum <guido at python.org> wrote:
>> Well, *throwing* it is close()'s job. And *catching* it ought to be
>> pretty rare. Maybe this idiom would be better:
>>
>> def sum():
>>  total = 0
>>  try:
>>    while True:
>>      value = yield
>>      total += value
>>  finally:
>>    return total
> 
> Rereading my previous post that Jacob linked, I'm still a little
> uncomfortable with the idea of people deliberately catching
> GeneratorExit to turn it into a normal value return to be reported by
> close(). That said, I'm even less comfortable with the idea of
> encouraging the moral equivalent of a bare except clause :)

What Nick said. :)


> I see two realistic options here:
> 
> 1. Use GeneratorExit for this, have g.close() return a value and I
> (and others that agree with me) just get the heck over it.
> 

This has the benefit of not needing an extra method/function and an
extra exception for this style of programming.  It still has the
refactoring problem I mention below.  That might be fixable in a similar
way though.  (Hmm, thinking about this gives me a strong sense of deja-vu.)


> 2. Add a new GeneratorReturn exception and a new g.finish() method
> that follows the same basic algorithm Guido suggested, only with a
> different exception type:
> 
> class GeneratorReturn(Exception): # Note: ordinary exception, unlike
> GeneratorExit
>   pass
> 
> def finish(gen):
>  try:
>    gen.throw(GeneratorReturn)
>    raise RuntimeError("Generator ignored GeneratorReturn")
>  except StopIteration as err:
>    if err.args:
>      return err.args[0]
>  except GeneratorReturn:
>    pass
>  return None
> 

I like this.  Having a separate function lets you explicitly request a
return value and making it fail loudly when called on an exhausted
generator feels just right given the prohibition against saving the
"True" return value anywhere.  Also, using a different exception lets
the generator distinguish between the "close" and "finish" cases, and
making it an ordinary exception makes it clear that it is *intended* to
be caught.  All good stuff.

I am not sure that returning None when finish() catches GeneratorReturn
is a good idea though.  If you call finish on a generator you expect it
to do something about it and return a value.  If the GeneratorReturn
escapes, it is a sign that the generator was not written to expect this
and so it is likely an error.  OTOH, I am not sure it always is, so maybe
allowing it is OK.  I just don't know.

How does it fit with the current PEP 380, and esp. the refactoring
principle?   It seems like we need to special-case the GeneratorReturn
exception somehow.  Perhaps like this:

[...]
  try:
      _s = yield _y
+ except GeneratorReturn as _e:
+     try:
+         _m = _i.finish
+     except AttributeError:
+         raise _e  # XXX RuntimeError?
+     raise YieldFromFinished(_m())
  except GeneratorExit as _e:
[...]

Where YieldFromFinished inherits from GeneratorReturn, and has a 'value'
attribute like the new StopIteration.

Without something like this a function that is written to work with
"finish" is unlikely to be refactorable.   With this, the trivial case
of perfect delegation can be written as:

def outer():
    try:
        return yield from inner()
    except YieldFromFinished as e:
        return e.value

and a slightly more complex case...

def outer2():
    try:
        a = yield from innerA()
    except YieldFromFinished as e:
        return e.value
    try:
        b = yield from innerB()
    except YieldFromFinished as e:
        return a+e.value
    return a+b

the "outer2" example shows why the special casing is needed.  If
outer2.finish() is called while outer2 is suspended in innerA, a
GeneratorReturn would be thrown directly into innerA.  Since innerA is
supposed to be expecting this, it returns a value immediately which
would then be the return value of the yield-from.  outer2 would then
erroneously continue to the "b = yield from innerB()" line, which unless
innerB immediately raised StopIteration would yield a value causing the
outer2.finish() to raise a RuntimeError...

We can avoid the extra YieldFromFinished exception if we let the new
GeneratorReturn exception grow a value attribute instead and use it for
both purposes.  But then the distinction between a GeneratorReturn that
is thrown in by "finish" (which has no associated value) and the
GeneratorReturn raised by the yield-from (which has) gets blurred a bit.

Another idea is to actually replace YieldFromFinished with StopIteration
or a GeneratorReturn inheriting from StopIteration.  That would mean we
could drop the first try-except block in each of the above example
generators because the "finished" result from the inner function is
returned directly anyway.  On the other hand, that could easily lead to
subtle bugs if you forget a try...except block that is actually needed,
like the second block in outer2.


A different way to handle this would be to change the PEP 380 expansion
as follows:

[...]
- except GeneratorExit as _e:
+ except (GeneratorReturn, GeneratorExit) as _e:
[...]

What this means is that only the outermost generator would see the
GeneratorReturn.  If the outermost generator is suspended using
yield-from and finish() is called, the inner generator is simply
closed and the GeneratorReturn re-raised.  This version is only really
useful for delegating to generators that *don't* return a value, but it
is simpler and at least it allows *some* use of yield-from with "finish".


> (Why "finish" as the suggested name for the method? I'd prefer
> "return", but that's a keyword and "return_" is somewhat ugly. Pairing
> GeneratorReturn with finish() is my second choice, for the "OK, time
> to wrap things up and complete your assigned task" connotations, as
> compared to the "drop everything and clean up the mess" connotations
> of GeneratorExit and close())

I like the names.  GeneratorFinish might work as well for the exception,
but I like GeneratorReturn better for its connection with "return".


> 
> I'd personally be +1 on option 2 (since it addresses the immediate use
> case while maintaining appropriate separation of concerns between
> guaranteed resource cleanup and graceful completion of coroutines) and
> -0 on option 1 (unsurprising, given my previously stated objections to
> failing to maintain appropriate separation of concerns).
> 

I agree the "finish" idea looks far better for generators without
yield-from.  It is unfortunate that extending it to work with yield-from
isn't prettier that it is though.



> (I should note that this differs from the previous suggestion of a
> GeneratorReturn exception in the context of PEP 380. Those suggestions
> were to use it as a replacement for StopIteration when a generator
> contained a return statement. The suggestion here is to instead use it
> as a replacement for GeneratorExit in order to request
> prompt-but-graceful completion of a generator rather than just bailing
> out immediately).

I agree the name fits this use better than the original.  Too bad some
of my suggestions above are starting to blur the line between
GeneratorReturn and StopIteration again.

- Jacob



From guido at python.org  Tue Oct 26 16:56:51 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 26 Oct 2010 07:56:51 -0700
Subject: [Python-ideas] Move Demo scripts under Lib
In-Reply-To: <AANLkTinx+762Wd_5C7WWs6EV35oWSZGbYme2Bbxshfvr@mail.gmail.com>
References: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>
	<AANLkTinx+762Wd_5C7WWs6EV35oWSZGbYme2Bbxshfvr@mail.gmail.com>
Message-ID: <AANLkTi=WZ8pv0zGqpH3gkY-wPLDV6Ljoy53nOUdXPEBV@mail.gmail.com>

On Tue, Oct 26, 2010 at 7:33 AM, Dirkjan Ochtman <dirkjan at ochtman.nl> wrote:
> On Tue, Oct 26, 2010 at 16:00, Alexander Belopolsky
> <alexander.belopolsky at gmail.com> wrote:
>> What do you think?
>
> After browsing through the Demo dir a bit, I came away thinking most
> of these should just be removed from the repository.

+1. Most of them are either quick hacks I once wrote and didn't know
where to put (dutree.py, repeat.py come to mind) or in some cases
contributed 3rd party code that was looking for a home. I think that
all of these ought to live somewhere else and I have no problem with
tossing out the entire Demo and Tools directories -- anything that's
not needed as part of the build should go. (Though a few things might
indeed be moved into the stdlib if they are useful enough.)

> I think there's
> enough demo material out there on the internet (for example in the
> cookbook), a lot of it of higher quality than what we have in the Demo
> dir right now. Maybe it makes sense to have a basic tkinter app to get
> you started. And some of the smaller functions or classes could
> possibly be used in the documentation. But as it is, it seems silly to
> waste developer time on stuff that few people look at or make use of
> (I'm assuming this from the fact that they have previously been
> neglected).

None of that belongs in the core distro any more.

> Back to the original question: I don't think moving the Demo stuff to
> the Lib dir is a good idea, simply because the Lib dir should contain
> libraries, not applications or scripts. Writing a section for the
> documentation seems a better way to solve the discoverability problem,
> testing could be done even in the Demo dir (with more structure if
> need be), and quality control could just as well be exercised in the
> current location.

If there are demos that are useful for testing, move them into Lib/test/.

-- 
--Guido van Rossum (python.org/~guido)


From alexander.belopolsky at gmail.com  Tue Oct 26 17:13:27 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 26 Oct 2010 11:13:27 -0400
Subject: [Python-ideas] Move Demo scripts under Lib
In-Reply-To: <AANLkTinx+762Wd_5C7WWs6EV35oWSZGbYme2Bbxshfvr@mail.gmail.com>
References: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>
	<AANLkTinx+762Wd_5C7WWs6EV35oWSZGbYme2Bbxshfvr@mail.gmail.com>
Message-ID: <AANLkTikwXtkemCJeSmCJsNGXsm=yZmsq5L++yz_9oYGS@mail.gmail.com>

On Tue, Oct 26, 2010 at 10:33 AM, Dirkjan Ochtman <dirkjan at ochtman.nl> wrote:
> On Tue, Oct 26, 2010 at 16:00, Alexander Belopolsky
> <alexander.belopolsky at gmail.com> wrote:
>> What do you think?
>
> After browsing through the Demo dir a bit, I came away thinking most
> of these should just be removed from the repository. I think there's
> enough demo material out there on the internet (for example in the
> cookbook), a lot of it of higher quality than what we have in the Demo
> dir right now. Maybe it makes sense to have a basic tkinter app to get
> you started.

The one demo that I want to find a better place for is Demo/turtle.
For the novice-oriented framework that turtle is, it is really a shame
to require

$ cd <whatever>/Demo/turtle
$ python turtleDemo.py

to run the demo.  I would much rather use

$ python -m demo.turtle

or

$ python -m turtle.demo

(the latter would require converting turtle.py into a package)

> And some of the smaller functions or classes could
> possibly be used in the documentation.

And most likely not be automatically tested, contributing to users'
confusion: "I copied it from the documentation and it does not work!"
See http://bugs.python.org/issue10029 .


> But as it is, it seems silly to
> waste developer time on stuff that few people look at or make use of
> (I'm assuming this from the fact that they have previously been
> neglected).
>
It is debatable which is the cause and which is the effect here.


> Back to the original question: I don't think moving the Demo stuff to
> the Lib dir is a good idea, simply because the Lib dir should contain
> libraries, not applications or scripts.

The introduction of the -m option has changed that, IMO.  For example, when I
work with recent versions of python, I always run pydoc as python -m
pydoc, because the pydoc script on the path may not correspond to the same
version of python that I use.  The trace, timeit, dis and probably
many other useful modules don't even have a corresponding script in
the standard distribution.
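
E.g. (illustrative commands; each uses the interpreter you invoke
rather than whatever script happens to be first on $PATH, and
myscript.py stands for any script of yours):

$ python -m pydoc -k turtle
$ python -m timeit "'-'.join(map(str, range(100)))"
$ python -m trace --count myscript.py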

> Writing a section for the
> documentation seems a better way to solve the discoverability problem,

What exactly should such a section say?  "In order to find demo
scripts, please unpack the source distribution and look under the Demo
directory"?

> testing could be done even in the Demo dir (with more structure if
> need be), and quality control could just as well be exercised in the
> current location.

This is a valid option, and if running Demo tests is added to the "make
test" target, it has a fighting chance to work.  However, if Demo tests
are organized differently from stdlib module tests, maintaining them will
be more difficult than it needs to be.


From guido at python.org  Tue Oct 26 17:18:44 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 26 Oct 2010 08:18:44 -0700
Subject: [Python-ideas] Move Demo scripts under Lib
In-Reply-To: <AANLkTikwXtkemCJeSmCJsNGXsm=yZmsq5L++yz_9oYGS@mail.gmail.com>
References: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>
	<AANLkTinx+762Wd_5C7WWs6EV35oWSZGbYme2Bbxshfvr@mail.gmail.com>
	<AANLkTikwXtkemCJeSmCJsNGXsm=yZmsq5L++yz_9oYGS@mail.gmail.com>
Message-ID: <AANLkTinGYEZUp4xHHAOJRoB_Q-ktdd33R-CWHTt7pvEz@mail.gmail.com>

On Tue, Oct 26, 2010 at 8:13 AM, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:
> The one demo that I want to find a better place for is Demo/turtle.

Sure, go for it. It is a special case because the turtle module is
also in the stdlib and these are intended for a particular novice
audience. Anything we can do to make it easier for those people to
get started is probably worth it. Ideally they could just double
click some file and the demo would fire up, with a command-line
alternative (for the geeks among them), e.g. "python -m turtledemo".

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Tue Oct 26 17:43:00 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 26 Oct 2010 08:43:00 -0700
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <4CC6A933.3080605@pearwood.info>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>
	<201010111017.56101.steve@pearwood.info>
	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>
	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>
	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>
	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>
	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>
	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>
	<4CB7B7C2.8090401@ronadam.com>
	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>
	<4CB88BD2.4010901@ronadam.com> <i9a2u9$q8k$1@dough.gmane.org>
	<4CB898CD.6000207@ronadam.com>
	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>
	<4CB8B2F5.2020507@ronadam.com> <ia2d8l$7rr$1@dough.gmane.org>
	<AANLkTi=yiaDbAfiTnV1yEk1L8UfCf4gHf3=3d9TXhvXw@mail.gmail.com>
	<4CC6A933.3080605@pearwood.info>
Message-ID: <AANLkTimrJDPwHLv4kGbnidbAv2AFr3YF7BCopMSmhBPu@mail.gmail.com>

On Tue, Oct 26, 2010 at 3:10 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> Perhaps I'm missing something, but to my mind, that's an awfully complicated
> solution for such a simple problem. Here's my attempt:
>
> def multi_reduce(iterable, funcs):
>     it = iter(iterable)
>     collectors = [next(it)]*len(funcs)
>     for x in it:
>         for i, f in enumerate(funcs):
>             collectors[i] = f(collectors[i], x)
>     return collectors
>
> I've called it multi_reduce rather than parallel_reduce, because it doesn't
> execute the functions in parallel. By my testing on Python 3.1.1,
> multi_reduce is consistently ~30% faster than the generator based solution
> for lists with 1000 - 10,000,000 items.
>
> So what am I missing? What does your parallel_reduce give us that
> multi_reduce doesn't?

You're right, the point I wanted to prove was that generators are
better than threads, but the code was based on emulating reduce(). The
generalization that I was aiming for was that it is convenient to
write a generator that does some kind of computation over a sequence
of items and returns a result at the end, and then have a driver that
feeds a single sequence to a bunch of such generators. This is more
powerful once you try to use reduce to compute e.g. the average of the
numbers fed to it -- of course you can do it using a function of
(state, value) but it is so much easier to write as a loop! (At least
for me -- if you do nothing but write Haskell all day I'm sure it
comes naturally. :-)

def avg():
  total = 0
  count = 0
  try:
    while True:
      value = yield
      total += value
      count += 1
  except GeneratorExit:
    raise StopIteration(total / count)

The essential boilerplate here is

  try:
    while True:
      value = yield
      <use value>
  except GeneratorExit:
    raise StopIteration(<compute result>)

No doubt functional aficionados will snub this, but in Python, this
should run much faster than the same thing written as a reduce-ready
function, due to the function overhead (which wasn't a problem in the
min/max example since those are built-ins).
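
To make the driver side concrete, here is a sketch of feeding one
sequence to such a consumer, emulating the proposed "close() returns a
value" semantics with an explicit throw() (drive is just an
illustrative name):

def drive(consumer_factory, values):
    g = consumer_factory()
    next(g)              # prime the generator
    for v in values:
        g.send(v)
    try:
        g.throw(GeneratorExit)
    except StopIteration as e:
        # the result the proposed g.close() would return
        return e.args[0]

# drive(avg, [1.0, 2.0, 6.0]) -> 3.0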

BTW This episode led me to better understand my objection against
reduce() as the universal hammer: my problem with writing avg() using
reduce is that the function one feeds into reduce is asymmetric -- its
first argument must be some state, e.g. a tuple (total, count), and
the second argument must be the next value. This is the moment that my
head reliably explodes -- even though it has no problem visualizing
reduce() using a *symmetric* function like +, min or max.

Also note that the reduce() based solution would have to have a
separate function to extract the desired result (total / count) from
the state (total, count), and for multi_reduce() you would have to
supply a separate list of functions for these or some other hacky
approach.

-- 
--Guido van Rossum (python.org/~guido)


From alexander.belopolsky at gmail.com  Tue Oct 26 17:49:49 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 26 Oct 2010 11:49:49 -0400
Subject: [Python-ideas] Move Demo scripts under Lib
In-Reply-To: <AANLkTinGYEZUp4xHHAOJRoB_Q-ktdd33R-CWHTt7pvEz@mail.gmail.com>
References: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>
	<AANLkTinx+762Wd_5C7WWs6EV35oWSZGbYme2Bbxshfvr@mail.gmail.com>
	<AANLkTikwXtkemCJeSmCJsNGXsm=yZmsq5L++yz_9oYGS@mail.gmail.com>
	<AANLkTinGYEZUp4xHHAOJRoB_Q-ktdd33R-CWHTt7pvEz@mail.gmail.com>
Message-ID: <AANLkTimODy6TOd2dy-+N0mWOKXo0Sgugu+c9thb82WjB@mail.gmail.com>

On Tue, Oct 26, 2010 at 11:18 AM, Guido van Rossum <guido at python.org> wrote:
> On Tue, Oct 26, 2010 at 8:13 AM, Alexander Belopolsky
> <alexander.belopolsky at gmail.com> wrote:
>> The one demo that I want to find a better place for is Demo/turtle.
>
> Sure, go for it. It is a special case because the turtle module is
> also in the stdlib and these are intended for a particular novice
> audience.

Please see http://bugs.python.org/issue10199 for further discussion.


From steve at pearwood.info  Tue Oct 26 18:09:23 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 27 Oct 2010 03:09:23 +1100
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
 rdelim='')
In-Reply-To: <ia6opa$1eq$1@dough.gmane.org>
References: <20101025154932.06be2faf@o>	<AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>	<AANLkTin82B-QT46bP5rxD6ddyg2Y+0cxDF9STLxJhUhz@mail.gmail.com>	<ia6g52$ldv$1@dough.gmane.org>	<AANLkTimBk7_oSxg4_5JrW981wccL-bD4j8_nTwwQVOzz@mail.gmail.com>
	<ia6opa$1eq$1@dough.gmane.org>
Message-ID: <4CC6FD33.4050305@pearwood.info>

Boris Borcic wrote:

> And let's then propagate that notion, to a *coherent* definition of 
> split that makes it as well a method on the separator.

Let's not.

Splitting is not something that you do on the separator, it's something you 
do on the source string. I'm sure you wouldn't expect this:

":".find("key:value")
=> 3

Nor should we expect this:

":".split("key:value")
=> ["key", "value"]


You perform a search *on* the source string, not the target substring. 
Likewise you split the source string, not the separator.


-- 
Steven



From masklinn at masklinn.net  Tue Oct 26 18:33:41 2010
From: masklinn at masklinn.net (Masklinn)
Date: Tue, 26 Oct 2010 18:33:41 +0200
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <4CC6FD33.4050305@pearwood.info>
References: <20101025154932.06be2faf@o>	<AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>	<AANLkTin82B-QT46bP5rxD6ddyg2Y+0cxDF9STLxJhUhz@mail.gmail.com>	<ia6g52$ldv$1@dough.gmane.org>	<AANLkTimBk7_oSxg4_5JrW981wccL-bD4j8_nTwwQVOzz@mail.gmail.com>
	<ia6opa$1eq$1@dough.gmane.org> <4CC6FD33.4050305@pearwood.info>
Message-ID: <5B4B42DF-25F5-4DC4-90B2-4AF5B7AF40D6@masklinn.net>

On 2010-10-26, at 18:09 , Steven D'Aprano wrote:
> Boris Borcic wrote:
>> And let's then propagate that notion, to a *coherent* definition of split that makes it as well a method on the separator.
> 
> Let's not.
> 
> Splitting is not something that you on the separator, it's something you do on the source string. I'm sure you wouldn't expect this:

As with joining, that's a completely arbitrary decision. Python's take is that you split on a source and join on a separator; most APIs I've seen so far agree on the former but not on the latter.

And Python has an API which splits on the separator anyway: re.RegexObject is not a value you can provide to str.split() as far as I know (whereas in Ruby String#split can take a string or a regex indifferently, so it's coherent in that it always splits on the source string, never on the separator).
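
For the record, the regex API already hangs split off the separator
object:

>>> import re
>>> re.compile(':').split('key:value')
['key', 'value']

so both conventions coexist in the stdlib today.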

From guido at python.org  Tue Oct 26 18:56:41 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 26 Oct 2010 09:56:41 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
Message-ID: <AANLkTi=EMCDcE2bH6ENJLjyxyP-ZqP5c2L7wOni6LC0G@mail.gmail.com>

On Tue, Oct 26, 2010 at 3:36 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Tue, Oct 26, 2010 at 1:14 PM, Guido van Rossum <guido at python.org> wrote:
>> On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm <jh at improva.dk> wrote:
>>> Throwing and catching GeneratorExit is not common, and according to some
>>> shouldn't be used for this purpose at all.
>>
>> Well, *throwing* it is close()'s job. And *catching* it ought to be
>> pretty rare. Maybe this idiom would be better:
>>
>> def sum():
>>  total = 0
>>  try:
>>    while True:
>>      value = yield
>>      total += value
>>  finally:
>>    return total

> Rereading my previous post that Jacob linked, I'm still a little
> uncomfortable with the idea of people deliberately catching
> GeneratorExit to turn it into a normal value return to be reported by
> close(). That said, I'm even less comfortable with the idea of
> encouraging the moral equivalent of a bare except clause :)

My bad. I should have stopped at "except GeneratorExit: return total".
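
Spelled out, that would be something like this (a sketch; "return total"
uses PEP 380's return-with-value, which today would be spelled "raise
StopIteration(total)"):

def sum():
    total = 0
    try:
        while True:
            value = yield
            total += value
    except GeneratorExit:
        return total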

> I see two realistic options here:
>
> 1. Use GeneratorExit for this, have g.close() return a value and I
> (and others that agree with me) just get the heck over it.

This is still my preferred option.

> 2. Add a new GeneratorReturn exception and a new g.finish() method
> that follows the same basic algorithm Guido suggested, only with a
> different exception type:
>
> class GeneratorReturn(Exception): # Note: ordinary exception, unlike
> GeneratorExit
>   pass
>
> def finish(gen):
>  try:
>    gen.throw(GeneratorReturn)
>    raise RuntimeError("Generator ignored GeneratorReturn")
>  except StopIteration as err:
>    if err.args:
>      return err.args[0]
>  except GeneratorReturn:
>    pass
>  return None

IMO there are already too many special exceptions and methods.

> (Why "finish" as the suggested name for the method? I'd prefer
> "return", but that's a keyword and "return_" is somewhat ugly. Pairing
> GeneratorReturn with finish() is my second choice, for the "OK, time
> to wrap things up and complete your assigned task" connotations, as
> compared to the "drop everything and clean up the mess" connotations
> of GeneratorExit and close())
>
> I'd personally be +1 on option 2 (since it addresses the immediate use
> case while maintaining appropriate separation of concerns between
> guaranteed resource cleanup and graceful completion of coroutines) and
> -0 on option 1 (unsurprising, given my previously stated objections to
> failing to maintain appropriate separation of concerns).

Hm, I guess I'm more in favor of minimal mechanism. The clincher for
me is pretty much that the extended g.close() semantics are a very
simple mod to the existing gen_close() function in genobject.c -- it
currently always returns None but could very easily be changed to
extract the return value from err.args when it catches StopIteration
(but not GeneratorExit).

It also looks like my proposal doesn't get in the way of anything --
if the generator doesn't catch GeneratorExit g.close() will return
None, and if the caller of g.close() doesn't expect a value, they can
just ignore it.

Finally note that this still looks like a relatively esoteric use
case: when using "var = yield from generator()" the return value
from the generator (written as "return X" and implemented as "raise
StopIteration(X)") will automatically be delivered to var, and there's
no need to call g.close(). In this case there is also no reason for
the generator to catch GeneratorExit -- that is purely needed for the
idiom of writing "inside-out iterators" using this pattern in the
generator (as I mentioned on the parent thread):

  try:
    while True:
      value = yield
      <use value>
  except GeneratorExit:
    raise StopIteration(<result>)  # Or "return <result>" in PEP 380 syntax

Now, if I may temporarily go into wild-and-crazy mode (this *is*
python-ideas after all :-), we could invent some ad-hoc syntax for
this pattern, e.g.:

  for value in yield:
    <use value>
  return <result>

IOW the special form:

  for <var> in yield:
    <body>

would translate into:

  try:
    while True:
      <var> = yield
      <body>
  except GeneratorExit:
    pass

If (and this is a big if) the
while-True-yield-inside-try-except-GeneratorExit pattern somehow
becomes popular we could reconsider this syntactic extension or some
variant. (I have to add that the syntactic ice is a bit thin here,
since "for <var> in (yield)" already has a meaning, and a totally
different one of course. A variant could be "for <var> from yield" or
some other abuse of keywords.)

But let me stop here before people think I've just volunteered my
retirement... :-)

> (I should note that this differs from the previous suggestion of a
> GeneratorReturn exception in the context of PEP 380. Those suggestions
> were to use it as a replacement for StopIteration when a generator
> contained a return statement. The suggestion here is to instead use it
> as a replacement for GeneratorExit in order to request
> prompt-but-graceful completion of a generator rather than just bailing
> out immediately).

Noted.

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Tue Oct 26 19:01:50 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 26 Oct 2010 10:01:50 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CC6C7F3.6090405@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<4CC6C7F3.6090405@improva.dk>
Message-ID: <AANLkTi=6pa84R1MAjprFJP2jiOddnQptfT-C28=1n-Df@mail.gmail.com>

On Tue, Oct 26, 2010 at 5:22 AM, Jacob Holm <jh at improva.dk> wrote:
[...]
>>> Here's a stupid idea... let g.close take an optional argument that it
>>> can return if the generator is already exhausted and let it return the
>>> value from the StopIteration otherwise.
>>>
>>> def close(self, default=None):
>>>     if self.gi_frame is None:
>>>         return default
>>>     try:
>>>         self.throw(GeneratorExit)
>>>     except StopIteration as e:
>>>         return e.args[0]
>>>     except GeneratorExit:
>>>         return None
>>>     else:
>>>         raise RuntimeError('generator ignored GeneratorExit')
>>
>> You'll have to explain why None isn't sufficient.

> It is not really necessary, but seemed "cleaner" somehow.  Think of
> "g.close(default)" as "get me the result if possible, and this default
> otherwise".  Then think of dict.get()...

Hm, I'd say there always is a result -- it just sometimes is None. I
really don't want to make distinctions between falling off the end of
the function, "return" without a value, "return None", "raise
StopIteration()", "raise StopIteration(None)", or even (in response to
a close() request) "raise GeneratorExit".

> You mentioned some possible extensions though.  At a guess, at least
> some of these would benefit greatly from the use of generators.  Maybe
> such an extension would be a better example?

Yes, see the avg() example I posted in the parent thread.

-- 
--Guido van Rossum (python.org/~guido)


From cesare.di.mauro at gmail.com  Tue Oct 26 19:22:32 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Tue, 26 Oct 2010 19:22:32 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <4CC6E7A9.4030205@egenix.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
	<1288083018.3547.0.camel@localhost.localdomain>
	<4CC69FFD.7080102@egenix.com>
	<AANLkTi=UH+J2S9WtKRWv5bRhdVtPTWAEfVLtxwSaE-re@mail.gmail.com>
	<4CC6AFB4.3040003@egenix.com>
	<AANLkTimZaYKWDx4GQiyhOai3Oji4pQNjHGX4MrazCmu0@mail.gmail.com>
	<4CC6E7A9.4030205@egenix.com>
Message-ID: <AANLkTi=SbbrZd+-iSj20rtGYBz6J+Tby9skGOFsczi48@mail.gmail.com>

2010/10/26 M.-A. Lemburg <mal at egenix.com>

> Cesare Di Mauro wrote:
> > 2010/10/26 M.-A. Lemburg <mal at egenix.com>
> >
> > I was referring to the solution (which I prefer) that I proposed
> > answering to Greg, two days ago.
> >
> > Unfortunately, the stack must be used whatever the solution we will use.
> >
> > Pushing the "final" tuple and/or dictionary is a possible optimization,
> > but we can use it only when we have a tuple or dict of constants;
> > otherwise we need to use the stack.
> >
> > Good case:  f(1, 2, 3, a = 1, b = 2)
> > We can push the (1, 2, 3) tuple and {'a' : 1, 'b' : 2}, then call f with
> > the CALL_FUNCTION_VAR_KW opcode passing narg = nkarg = 0.
> >
> > Worst case: f(1, x, 3, a = x, b = 2)
> > We can't push the tuple and dict as a whole, because they first need to
> > be built using the stack.
> >
> > The good case is possible, and I have already done some work in wpython
> > collecting constants on parameter pushes (even partial constant
> > sequences), but some additional work must be done to recognize
> > constants-only tuples / dicts.
> >
> > However, the worst case remains unresolved.
>
> I don't understand. What is the difference between pushing values
> on the stack and building a tuple/dict and then pushing those on
> the stack?
>
> In your worst case example, the compiler would first build
> a tuple/dict using the args already on the stack (BUILD_TUPLE,
> BUILD_MAP) and then call the function with this tuple/dict
> combination - you'd basically move the tuple/dict building to
> the compiler rather than having the CALL* opcodes do this
> internally.
>
> It would essentially run:
>
> f(*(1,x,3), **{'a':x, 'b':2})
>
> and bypass the "max. number of opcode args" limit without
> degrading performance, since BUILD_TUPLE et al. essentially
> run the same code for building the call arguments as the
> helpers for calling a function.
>
> --
> Marc-Andre Lemburg
>

Yes, the idea is to let the compiler emit proper code to build the
tuple/dict, instead of using the CALL_* opcodes to do it, in order to
bypass the current limits.

That is, if we don't want to change the current CALL_* behavior: keep the
common cases fast and introduce a slower (but working) path for the
uncommon ones.

Another solution would be to introduce a specific opcode, but I don't see
much point in that if the purpose is just to permit more than 255 arguments.

At this time I have no other ideas for solving this problem.

Please let me know if there's interest in a new patch implementing the
"compiler-based" solution.

Cesare

From guido at python.org  Tue Oct 26 19:33:53 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 26 Oct 2010 10:33:53 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CC6E94F.3090702@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
Message-ID: <AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>

On Tue, Oct 26, 2010 at 7:44 AM, Jacob Holm <jh at improva.dk> wrote:
[...]
> I like this.  Having a separate function lets you explicitly request a
> return value and making it fail loudly when called on an exhausted
> generator feels just right given the prohibition against saving the
> "True" return value anywhere.  Also, using a different exception lets
> the generator distinguish between the "close" and "finish" cases, and
> making it an ordinary exception makes it clear that it is *intended* to
> be caught.  All good stuff.

I don't know. There are places where failing loudly is the right thing
to do (1 + 'a'). But when it comes to return values Python takes a
pretty strong position that there's no difference between functions
and procedures, that "return", "return None" and falling off the end
all mean the same thing, and that it's totally fine to ignore a value
or to return a value that will be of no interest for most callers.

> I am not sure that returning None when finish() catches GeneratorReturn
> is a good idea though.  If you call finish on a generator you expect it
> to do something about it and return a value.  If the GeneratorReturn
> escapes, it is a sign that the generator was not written to expect this
> and so it is likely an error.  OTOH, I am not sure it always is, so maybe
> allowing it is OK.  I just don't know.
>
> How does it fit with the current PEP 380, and esp. the refactoring
> principle?   It seems like we need to special-case the GeneratorReturn
> exception somehow.  Perhaps like this:
>
> [...]
>   try:
>       _s = yield _y
> + except GeneratorReturn as _e:
> +     try:
> +         _m = _i.finish
> +     except AttributeError:
> +         raise _e  # XXX RuntimeError?
> +     raise YieldFromFinished(_m())
>   except GeneratorExit as _e:
> [...]
>
> Where YieldFromFinished inherits from GeneratorReturn, and has a 'value'
> attribute like the new StopIteration.
>
> Without something like this a function that is written to work with
> "finish" is unlikely to be refactorable.   With this, the trivial case
> of perfect delegation can be written as:
>
> def outer():
>     try:
>         return yield from inner()
>     except YieldFromFinished as e:
>         return e.value
>
> and a slightly more complex case...
>
> def outer2():
>     try:
>         a = yield from innerA()
>     except YieldFromFinished as e:
>         return e.value
>     try:
>         b = yield from innerB()
>     except YieldFromFinished as e:
>         return a+e.value
>     return a+b
>
> The "outer2" example shows why the special casing is needed.  If
> outer2.finish() is called while outer2 is suspended in innerA, a
> GeneratorReturn would be thrown directly into innerA.  Since innerA is
> supposed to be expecting this, it returns a value immediately which
> would then be the return value of the yield-from.  outer2 would then
> erroneously continue to the "b = yield from innerB()" line, which unless
> innerB immediately raised StopIteration would yield a value causing the
> outer2.finish() to raise a RuntimeError...
>
> We can avoid the extra YieldFromFinished exception if we let the new
> GeneratorReturn exception grow a value attribute instead and use it for
> both purposes.  But then the distinction between a GeneratorReturn that
> is thrown in by "finish" (which has no associated value) and the
> GeneratorReturn raised by the yield-from (which has) gets blurred a bit.
>
> Another idea is to actually replace YieldFromFinished with StopIteration
> or a GeneratorReturn inheriting from StopIteration.  That would mean we
> could drop the first try-except block in each of the above example
> generators because the "finished" result from the inner function is
> returned directly anyway.  On the other hand, that could easily lead to
> subtle bugs if you forget a try...except block that is actually needed,
> like the second block in outer2.

I'm afraid that all was too much to really reach my brain, which keeps
telling me "he's commenting on Nick's proposal which I've already
rejected".

> A different way to handle this would be to change the PEP 380 expansion
> as follows:
>
> [...]
> - except GeneratorExit as _e:
> + except (GeneratorReturn, GeneratorExit) as _e:
> [...]

That just strikes me as one more reason why a separate GeneratorReturn
is a bad idea.

In my ideal world, you almost never need to catch or raise
StopIteration; you don't raise GeneratorExit (that is close()'s job)
but you catch it to notice that your data source is finished, and then
you return a value. (And see my crazy idea in my previous post to get
rid of that too. :-)

> What this means is that only the outermost generator would see the
> GeneratorReturn.  If the outermost generator is suspended using
> yield-from and finish() is called, the inner generator is simply
> closed and the GeneratorReturn re-raised.  This version is only really
> useful for delegating to generators that *don't* return a value, but it
> is simpler and at least it allows *some* use of yield-from with "finish".
>
>
>> (Why "finish" as the suggested name for the method? I'd prefer
>> "return", but that's a keyword and "return_" is somewhat ugly. Pairing
>> GeneratorReturn with finish() is my second choice, for the "OK, time
>> to wrap things up and complete your assigned task" connotations, as
>> compared to the "drop everything and clean up the mess" connotations
>> of GeneratorExit and close())
>
> I like the names.  GeneratorFinish might work as well for the exception,
> but I like GeneratorReturn better for its connection with "return".
>
>
>>
>> I'd personally be +1 on option 2 (since it addresses the immediate use
>> case while maintaining appropriate separation of concerns between
>> guaranteed resource cleanup and graceful completion of coroutines) and
>> -0 on option 1 (unsurprising, given my previously stated objections to
>> failing to maintain appropriate separation of concerns).
>>
>
> I agree the "finish" idea looks far better for generators without
> yield-from. ?It is unfortunate that extending it to work with yield-from
> isn't prettier that it is though.
>
>
>
>> (I should note that this differs from the previous suggestion of a
>> GeneratorReturn exception in the context of PEP 380. Those suggestions
>> were to use it as a replacement for StopIteration when a generator
>> contained a return statement. The suggestion here is to instead use it
>> as a replacement for GeneratorExit in order to request
>> prompt-but-graceful completion of a generator rather than just bailing
>> out immediately).
>
> I agree the name fits this use better than the original.  Too bad some
> of my suggestions above are starting to blur the line between
> GeneratorReturn and StopIteration again.

So now I'm even more convinced that it's not worth it...

-- 
--Guido van Rossum (python.org/~guido)


From solipsis at pitrou.net  Tue Oct 26 19:44:53 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 26 Oct 2010 19:44:53 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
	<1288083018.3547.0.camel@localhost.localdomain>
	<4CC69FFD.7080102@egenix.com>
	<AANLkTi=UH+J2S9WtKRWv5bRhdVtPTWAEfVLtxwSaE-re@mail.gmail.com>
	<4CC6AFB4.3040003@egenix.com>
	<AANLkTimZaYKWDx4GQiyhOai3Oji4pQNjHGX4MrazCmu0@mail.gmail.com>
	<4CC6E7A9.4030205@egenix.com>
	<AANLkTi=SbbrZd+-iSj20rtGYBz6J+Tby9skGOFsczi48@mail.gmail.com>
Message-ID: <20101026194453.46d42bf9@pitrou.net>

On Tue, 26 Oct 2010 19:22:32 +0200
Cesare Di Mauro
<cesare.di.mauro at gmail.com> wrote:
> 
> At this time I have no other ideas to solve this problem.
> 
> Please, let me know if there's interest on a new patch to implement the
> "compiler-based" solution.

Have you timed the EXTENDED_ARG solution?

Regards

Antoine.




From tjreedy at udel.edu  Tue Oct 26 19:55:30 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 26 Oct 2010 13:55:30 -0400
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTimrJDPwHLv4kGbnidbAv2AFr3YF7BCopMSmhBPu@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>	<201010111017.56101.steve@pearwood.info>	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>	<4CB7B7C2.8090401@ronadam.com>	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>	<4CB88BD2.4010901@ronadam.com>
	<i9a2u9$q8k$1@dough.gmane.org>	<4CB898CD.6000207@ronadam.com>	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>	<4CB8B2F5.2020507@ronadam.com>
	<ia2d8l$7rr$1@dough.gmane.org>	<AANLkTi=yiaDbAfiTnV1yEk1L8UfCf4gHf3=3d9TXhvXw@mail.gmail.com>	<4CC6A933.3080605@pearwood.info>
	<AANLkTimrJDPwHLv4kGbnidbAv2AFr3YF7BCopMSmhBPu@mail.gmail.com>
Message-ID: <ia74mg$285$1@dough.gmane.org>

On 10/26/2010 11:43 AM, Guido van Rossum wrote:

> You're right, the point I wanted to prove was that generators are
> better than threads, but the code was based on emulating reduce(). The
> generalization that I was aiming for was that it is convenient to
> write a generator that does some kind of computation over a sequence
> of items and returns a result at the end, and then have a driver that
> feeds a single sequence to a bunch of such generators. This is more
> powerful once you try to use reduce to compute e.g. the average of the
> numbers fed to it -- of course you can do it using a function of
> (state, value) but it is so much easier to write as a loop! (At least
> for me -- if you do nothing but write Haskell all day I'm sure it
> comes naturally. :-)
>
> def avg():
>    total = 0
>    count = 0
>    try:
>      while True:
>        value = yield
>        total += value
>        count += 1
>    except GeneratorExit:
>      raise StopIteration(total / count)

The more traditional pull-or-grab (rather than push/receive) version is

def avg(it):
     total = 0
     count = 0
     for value in it:
         total += value
         count += 1
     return total/count

> The essential boilerplate here is
>
>    try:
>      while True:
>        value = yield
>        <use value>
>    except GeneratorExit:
>      raise StopIteration(<compute result>)

with corresponding boilerplate.

I can see that the receiving generator version would be handy when you 
do not really want to package the producer into an iterator (perhaps 
because items are needed for other purposes also) and want to send items 
to the averager as they are produced, from the point of production.
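
For concreteness, here is a minimal driver sketch for that push style. The
names are made up, and each consumer yields its running result at every
step instead of smuggling a final value out through StopIteration, which
keeps the sketch self-contained and runnable:

def running_avg():
    # Push-style consumer: the driver send()s items in; the current
    # average is handed back at each yield.
    total = count = 0
    result = None
    while True:
        value = yield result
        total += value
        count += 1
        result = total / count

def running_max():
    best = None
    while True:
        value = yield best
        best = value if best is None or value > best else value

def drive(source, consumers):
    # Feed a single sequence to a bunch of such generators.
    results = [None] * len(consumers)
    for c in consumers:
        next(c)  # prime each generator up to its first yield
    for item in source:
        for i, c in enumerate(consumers):
            results[i] = c.send(item)
    return results

print(drive([3, 1, 4, 1, 5], [running_avg(), running_max()]))
# [2.8, 5]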

> No doubt functional aficionados will snub this, but in Python, this
> should run much faster than the same thing written as a reduce-ready
> function, due to the function overhead (which wasn't a problem in the
> min/max example since those are built-ins).
>
> BTW This episode led me to better understand my objection against
> reduce() as the universal hammer: my problem with writing avg() using
> reduce is that the function one feeds into reduce is asymmetric -- its
> first argument must be some state, e.g. a tuple (total, count), and
> the second argument must be the next value.

Not hard:
def update(pair, item):
   return pair[0]+1, pair[1]+item

 > This is the moment that my
> head reliably explodes -- even though it has no problem visualizing
> reduce() using a *symmetric* function like +, min or max.
>
> Also note that the reduce() based solution would have to have a
> separate function to extract the desired result (total / count) from
> the state (total, count), and for multi_reduce() you would have to
> supply a separate list of functions for these or some other hacky
> approach.

Reduce is extremely important as a concept: any function of a sequence (or 
arbitrarily ordered collection) can be written as a post-processed 
reduction. In practice, at least for Python, it is better thought of as a 
wide-spread template pattern, such as the boilerplate above, than just 
as a function. This is partly because Python does not have general 
function expressions (and should not!) and also because Python has 
high function call overhead (because of its signature flexibility).
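
For example, the average discussed above is exactly such a post-processed
reduction: fold the items into (total, count) state with an asymmetric
step function, then extract the result at the end. A sketch using
functools.reduce:

from functools import reduce

def average(iterable):
    # Build the (total, count) state, then post-process it into
    # the final result.
    total, count = reduce(lambda s, x: (s[0] + x, s[1] + 1),
                          iterable, (0, 0))
    return total / count

print(average([1, 2, 3, 4]))  # 2.5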


-- 
Terry Jan Reedy



From bborcic at gmail.com  Tue Oct 26 19:59:17 2010
From: bborcic at gmail.com (Boris Borcic)
Date: Tue, 26 Oct 2010 19:59:17 +0200
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <4CC6FD33.4050305@pearwood.info>
References: <20101025154932.06be2faf@o>	<AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>	<AANLkTin82B-QT46bP5rxD6ddyg2Y+0cxDF9STLxJhUhz@mail.gmail.com>	<ia6g52$ldv$1@dough.gmane.org>	<AANLkTimBk7_oSxg4_5JrW981wccL-bD4j8_nTwwQVOzz@mail.gmail.com>	<ia6opa$1eq$1@dough.gmane.org>
	<4CC6FD33.4050305@pearwood.info>
Message-ID: <ia74tl$38r$1@dough.gmane.org>

Steven D'Aprano wrote:
> Boris Borcic wrote:
>
>> And let's then propagate that notion, to a *coherent* definition of
>> split that makes it as well a method on the separator.
>
> Let's not.
>
> Splitting is not something that you do on the separator, it's something you
> do on the source string. I'm sure you wouldn't expect this:
>
> ":".find("key:value")
> => 3

To be honest, my test for this type of question is how likely I am to find 
myself using the bound method outside of immediate method call syntax, and I'd 
say that a specialized callable that will find specific content in whatever 
future argument is more likely than the converse callable that will find 
occurrences of whatever future argument in a fixed string. YMMV

>
> Nor should we expect this:
>
> ":".split("key:value")
> => ["key", "value"]
>
>
> You perform a search *on* the source string, not the target substring.
> Likewise you split the source string, not the separator.

To me, this sounds like giving too much weight to English language intuition. 
What really counts is not how it gets said in good English, but rather: 
what's the variable/object/value that, in the context of the action, tends to be 
the most stable focus of attention? And remember that most speakers of English 
as a second language never become fully comfortable with English prepositions.

Cheers, BB



From brett at python.org  Tue Oct 26 20:01:42 2010
From: brett at python.org (Brett Cannon)
Date: Tue, 26 Oct 2010 11:01:42 -0700
Subject: [Python-ideas] Move Demo scripts under Lib
In-Reply-To: <AANLkTi=WZ8pv0zGqpH3gkY-wPLDV6Ljoy53nOUdXPEBV@mail.gmail.com>
References: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>
	<AANLkTinx+762Wd_5C7WWs6EV35oWSZGbYme2Bbxshfvr@mail.gmail.com>
	<AANLkTi=WZ8pv0zGqpH3gkY-wPLDV6Ljoy53nOUdXPEBV@mail.gmail.com>
Message-ID: <AANLkTinANY77zYnLYJw+aZAp=bEAfQMDm5B4Sv4qjEhE@mail.gmail.com>

On Tue, Oct 26, 2010 at 07:56, Guido van Rossum <guido at python.org> wrote:
> On Tue, Oct 26, 2010 at 7:33 AM, Dirkjan Ochtman <dirkjan at ochtman.nl> wrote:
>> On Tue, Oct 26, 2010 at 16:00, Alexander Belopolsky
>> <alexander.belopolsky at gmail.com> wrote:
>>> What do you think?
>>
>> After browsing through the Demo dir a bit, I came away thinking most
>> of these should just be removed from the repository.
>
> +1. Most of them are either quick hacks I once wrote and didn't know
> where to put (dutree.py, repeat.py come to mind) or in some cases
> contributed 3rd party code that was looking for a home. I think that
> all of these ought to live somewhere else and I have no problem with
> tossing out the entire Demo and Tools directories -- anything that's
> not needed as part of the build should go. (Though a few things might
> indeed be moved into the stdlib if they are useful enough.)

Just to toss in my +1, I have suggested doing this before and received
push-back in the form of "it isn't hurting anyone". But considering
how often the idea of trying to fix the directory comes up and how it
never happens, keeping the directories around is obviously wasting
people's time. So I say move the stuff needed as part of the build or dev
process (e.g., patchcheck is in the Tools directory) and then drop the
directory. We can give a deadline of some release like Python 3.2b1 or
b2 to move scripts people care about, and then simply do a mass
deletion just before cutting a release.

-Brett

>
>> I think there's
>> enough demo material out there on the internet (for example in the
>> cookbook), a lot of it of higher quality than what we have in the Demo
>> dir right now. Maybe it makes sense to have a basic tkinter app to get
>> you started. And some of the smaller functions or classes could
>> possibly be used in the documentation. But as it is, it seems silly to
>> waste developer time on stuff that few people look at or make use of
>> (I'm assuming this from the fact that they have previously been
>> neglected).
>
> None of that belongs in the core distro any more.
>
>> Back to the original question: I don't think moving the Demo stuff to
>> the Lib dir is a good idea, simply because the Lib dir should contain
>> libraries, not applications or scripts. Writing a section for the
>> documentation seems a better way to solve the discoverability problem,
>> testing could be done even in the Demo dir (with more structure if
>> need be), and quality control could just as well be exercised in the
>> current location.
>
> If there are demos that are useful for testing, move them into Lib/test/.
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>


From raymond.hettinger at gmail.com  Tue Oct 26 20:26:35 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Tue, 26 Oct 2010 11:26:35 -0700
Subject: [Python-ideas] Move Demo scripts under Lib
In-Reply-To: <AANLkTinANY77zYnLYJw+aZAp=bEAfQMDm5B4Sv4qjEhE@mail.gmail.com>
References: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>
	<AANLkTinx+762Wd_5C7WWs6EV35oWSZGbYme2Bbxshfvr@mail.gmail.com>
	<AANLkTi=WZ8pv0zGqpH3gkY-wPLDV6Ljoy53nOUdXPEBV@mail.gmail.com>
	<AANLkTinANY77zYnLYJw+aZAp=bEAfQMDm5B4Sv4qjEhE@mail.gmail.com>
Message-ID: <171F8E33-A726-4955-A8F3-C3CE3829974C@gmail.com>


>>> Back to the original question: I don't think moving the Demo stuff to
>>> the Lib dir is a good idea, simply because the Lib dir should contain
>>> libraries, not applications or scripts. Writing a section for the
>>> documentation seems a better way to solve the discoverability problem, ...

If any of the demos survive the purge, I agree that they
should have their own docs.  Otherwise, they might as
well be invisible.


Raymond




From cesare.di.mauro at gmail.com  Tue Oct 26 22:30:16 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Tue, 26 Oct 2010 22:30:16 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <20101026194453.46d42bf9@pitrou.net>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
	<1288083018.3547.0.camel@localhost.localdomain>
	<4CC69FFD.7080102@egenix.com>
	<AANLkTi=UH+J2S9WtKRWv5bRhdVtPTWAEfVLtxwSaE-re@mail.gmail.com>
	<4CC6AFB4.3040003@egenix.com>
	<AANLkTimZaYKWDx4GQiyhOai3Oji4pQNjHGX4MrazCmu0@mail.gmail.com>
	<4CC6E7A9.4030205@egenix.com>
	<AANLkTi=SbbrZd+-iSj20rtGYBz6J+Tby9skGOFsczi48@mail.gmail.com>
	<20101026194453.46d42bf9@pitrou.net>
Message-ID: <AANLkTikB7y_ekXJEzwdh5BfYc+FMzJuUGVnfG6eBCyB4@mail.gmail.com>

2010/10/26 Antoine Pitrou <solipsis at pitrou.net>

> On Tue, 26 Oct 2010 19:22:32 +0200
> Cesare Di Mauro
> <cesare.di.mauro at gmail.com> wrote:
> >
> > At this time I have no other ideas to solve this problem.
> >
> > Please, let me know if there's interest on a new patch to implement the
> > "compiler-based" solution.
>
> Have you timed the EXTENDED_ARG solution?
>
> Regards
>
> Antoine.


I made some timings a few minutes ago, and the results are unbelievable and
counter-intuitive on my machine (Athlon64 2800+ socket 754, 2GB DDR 400,
Windows 7 x64, Python 3.2a3 32 bits running at high priority):

python.exe -m timeit -r 1 -n 100000000 -s "def f(): pass" "f()"
Standard : 100000000 loops, best of 1: 0.348 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.341 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(x, y, z): pass" "f(1, 2,
3)"
Standard : 100000000 loops, best of 1: 0.452 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.451 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(a = 1, b = 2, c = 3): pass"
"f(a = 1, b = 2, c = 3)"
Standard : 100000000 loops, best of 1: 0.578 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.556 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(x, y, z, a = 1, b = 2, c =
3): pass" "f(1, 2, 3, a = 1, b = 2, c = 3)"
Standard : 100000000 loops, best of 1: 0.761 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.739 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args): pass" "f(1, 2, 3)"
Standard : 100000000 loops, best of 1: 0.511 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.508 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(**Keys): pass" "f(a = 1, b
= 2, c = 3)"
Standard : 100000000 loops, best of 1: 0.789 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.784 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args, **Keys): pass" "f(1,
2, 3, a = 1, b = 2, c = 3)"
Standard : 100000000 loops, best of 1: 1.01 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 1.01 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args, **Keys): pass" "f()"
Standard : 100000000 loops, best of 1: 0.393 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.41 usec per loop

I really can't explain it. Ouch!

Cesare

From solipsis at pitrou.net  Tue Oct 26 22:39:00 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 26 Oct 2010 22:39:00 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
 arguments
In-Reply-To: <AANLkTikB7y_ekXJEzwdh5BfYc+FMzJuUGVnfG6eBCyB4@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
	<1288083018.3547.0.camel@localhost.localdomain>
	<4CC69FFD.7080102@egenix.com>
	<AANLkTi=UH+J2S9WtKRWv5bRhdVtPTWAEfVLtxwSaE-re@mail.gmail.com>
	<4CC6AFB4.3040003@egenix.com>
	<AANLkTimZaYKWDx4GQiyhOai3Oji4pQNjHGX4MrazCmu0@mail.gmail.com>
	<4CC6E7A9.4030205@egenix.com>
	<AANLkTi=SbbrZd+-iSj20rtGYBz6J+Tby9skGOFsczi48@mail.gmail.com>
	<20101026194453.46d42bf9@pitrou.net>
	<AANLkTikB7y_ekXJEzwdh5BfYc+FMzJuUGVnfG6eBCyB4@mail.gmail.com>
Message-ID: <1288125540.3547.22.camel@localhost.localdomain>


> [snip lots of timeit results comparing unpatched and EXTENDED_ARG]
> 
> I really can't explain it. Ouch!

What do you mean exactly? There's no significant change at all.







From cesare.di.mauro at gmail.com  Tue Oct 26 22:58:52 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Tue, 26 Oct 2010 22:58:52 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <1288125540.3547.22.camel@localhost.localdomain>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
	<1288083018.3547.0.camel@localhost.localdomain>
	<4CC69FFD.7080102@egenix.com>
	<AANLkTi=UH+J2S9WtKRWv5bRhdVtPTWAEfVLtxwSaE-re@mail.gmail.com>
	<4CC6AFB4.3040003@egenix.com>
	<AANLkTimZaYKWDx4GQiyhOai3Oji4pQNjHGX4MrazCmu0@mail.gmail.com>
	<4CC6E7A9.4030205@egenix.com>
	<AANLkTi=SbbrZd+-iSj20rtGYBz6J+Tby9skGOFsczi48@mail.gmail.com>
	<20101026194453.46d42bf9@pitrou.net>
	<AANLkTikB7y_ekXJEzwdh5BfYc+FMzJuUGVnfG6eBCyB4@mail.gmail.com>
	<1288125540.3547.22.camel@localhost.localdomain>
Message-ID: <AANLkTikvoGV707Zbsgh-3pR+yi4cS7Aq0h5yLxr4=u26@mail.gmail.com>

2010/10/26 Antoine Pitrou <solipsis at pitrou.net>

>
> > [snip lots of timeit results comparing unpatched and EXTENDED_ARG]
> >
> > I really can't explain it. Ouch!
>
> What do you mean exactly? There's no significant change at all.
>

I cannot explain why the unpatched version was slower than the patched one
most of the time.

I find it silly and illogical.

Cesare

From solipsis at pitrou.net  Tue Oct 26 23:04:30 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 26 Oct 2010 23:04:30 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
 arguments
In-Reply-To: <AANLkTikvoGV707Zbsgh-3pR+yi4cS7Aq0h5yLxr4=u26@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
	<1288083018.3547.0.camel@localhost.localdomain>
	<4CC69FFD.7080102@egenix.com>
	<AANLkTi=UH+J2S9WtKRWv5bRhdVtPTWAEfVLtxwSaE-re@mail.gmail.com>
	<4CC6AFB4.3040003@egenix.com>
	<AANLkTimZaYKWDx4GQiyhOai3Oji4pQNjHGX4MrazCmu0@mail.gmail.com>
	<4CC6E7A9.4030205@egenix.com>
	<AANLkTi=SbbrZd+-iSj20rtGYBz6J+Tby9skGOFsczi48@mail.gmail.com>
	<20101026194453.46d42bf9@pitrou.net>
	<AANLkTikB7y_ekXJEzwdh5BfYc+FMzJuUGVnfG6eBCyB4@mail.gmail.com>
	<1288125540.3547.22.camel@localhost.localdomain>
	<AANLkTikvoGV707Zbsgh-3pR+yi4cS7Aq0h5yLxr4=u26@mail.gmail.com>
Message-ID: <1288127070.3547.25.camel@localhost.localdomain>


> I cannot explain why the unpatched version was slower than the patched
> one most of the time.

It just looks like measurement noise or, at worst, the side effect of
slightly different code generation by the compiler.
I don't think a ±1% variation on a desktop computer can be considered
significant.

(which means that the patch reaches its goal of not decreasing
performance, anyway :-))

Regards

Antoine.




From g.brandl at gmx.net  Tue Oct 26 23:04:47 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 26 Oct 2010 23:04:47 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <AANLkTikvoGV707Zbsgh-3pR+yi4cS7Aq0h5yLxr4=u26@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>	<loom.20101021T160508-402@post.gmane.org>	<i9pl0o$vsk$1@dough.gmane.org>	<loom.20101021T201522-329@post.gmane.org>	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>	<4CC1D966.2080007@egenix.com>	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>	<4CC211EE.1050308@egenix.com>
	<20101023004508.6a6c1373@pitrou.net>	<4CC6164B.5040201@trueblade.com>	<1288083018.3547.0.camel@localhost.localdomain>	<4CC69FFD.7080102@egenix.com>	<AANLkTi=UH+J2S9WtKRWv5bRhdVtPTWAEfVLtxwSaE-re@mail.gmail.com>	<4CC6AFB4.3040003@egenix.com>	<AANLkTimZaYKWDx4GQiyhOai3Oji4pQNjHGX4MrazCmu0@mail.gmail.com>	<4CC6E7A9.4030205@egenix.com>	<AANLkTi=SbbrZd+-iSj20rtGYBz6J+Tby9skGOFsczi48@mail.gmail.com>	<20101026194453.46d42bf9@pitrou.net>	<AANLkTikB7y_ekXJEzwdh5BfYc+FMzJuUGVnfG6eBCyB4@mail.gmail.com>	<1288125540.3547.22.camel@localhost.localdomain>
	<AANLkTikvoGV707Zbsgh-3pR+yi4cS7Aq0h5yLxr4=u26@mail.gmail.com>
Message-ID: <ia7fr4$p73$1@dough.gmane.org>

Am 26.10.2010 22:58, schrieb Cesare Di Mauro:
> 2010/10/26 Antoine Pitrou <solipsis at pitrou.net
> <mailto:solipsis at pitrou.net>>
> 
> 
>     > [snip lots of timeit results comparing unpatched and EXTENDED_ARG]
>     >
>     > I really can't explain it. Ouch!
> 
>     What do you mean exactly? There's no significant change at all.
> 
> 
> I cannot explain why the unpatched version was slower than the patched one most
> of the time.
> 
> I find it silly and illogical.

It rather seems that you're seeing statistics, and the impact of the
change is not measurable.  Nothing silly about it.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From g.brandl at gmx.net  Tue Oct 26 23:37:26 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 26 Oct 2010 23:37:26 +0200
Subject: [Python-ideas] Move Demo scripts under Lib
In-Reply-To: <AANLkTinANY77zYnLYJw+aZAp=bEAfQMDm5B4Sv4qjEhE@mail.gmail.com>
References: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>	<AANLkTinx+762Wd_5C7WWs6EV35oWSZGbYme2Bbxshfvr@mail.gmail.com>	<AANLkTi=WZ8pv0zGqpH3gkY-wPLDV6Ljoy53nOUdXPEBV@mail.gmail.com>
	<AANLkTinANY77zYnLYJw+aZAp=bEAfQMDm5B4Sv4qjEhE@mail.gmail.com>
Message-ID: <ia7hob$2c2$1@dough.gmane.org>

Am 26.10.2010 20:01, schrieb Brett Cannon:
> On Tue, Oct 26, 2010 at 07:56, Guido van Rossum <guido at python.org> wrote:
>> On Tue, Oct 26, 2010 at 7:33 AM, Dirkjan Ochtman <dirkjan at ochtman.nl> wrote:
>>> On Tue, Oct 26, 2010 at 16:00, Alexander Belopolsky
>>> <alexander.belopolsky at gmail.com> wrote:
>>>> What do you think?
>>>
>>> After browsing through the Demo dir a bit, I came away thinking most
>>> of these should just be removed from the repository.
>>
>> +1. Most of them are either quick hacks I once wrote and didn't know
>> where to put (dutree.py, repeat.py come to mind) or in some cases
>> contributed 3rd party code that was looking for a home. I think that
>> all of these ought to live somewhere else and I have no problem with
>> tossing out the entire Demo and Tools directories -- anything that's
>> not needed as part of the build should go. (Though a few things might
>> indeed be moved into the stdlib if they are useful enough.)
> 
> Just to toss in my +1, I have suggested doing this before and received
> push-back in the form of "it isn't hurting anyone". But considering
> how often the idea of trying to fix the directory comes up and how it
> never happens, keeping the directories around is obviously wasting
> people's time. So I say move the stuff needed as part of the build or dev
> process (e.g., patchcheck is in the Tools directory) and then drop the
> directory. We can give a deadline of some release like Python 3.2b1 or
> b2 to move scripts people care about, and then simply do a mass
> deletion just before cutting a release.

I've started a list of Demos and Tools here:

https://spreadsheets.google.com/ccc?key=0AherhJVUN_I2dFNQdjNPMFdnOHVpdERSdWxqaXBkWWc&hl=en&authkey=CMWEn84C

Please, feel free to complete and argue about fates.  I'd like the
corresponding actions taken by 3.2b1.

(One note about the fate "showcase": it might be nice to keep a minimal set
of demos from various topics as a kind of showcase what you can do with a
few lines of Python.)

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From mal at egenix.com  Tue Oct 26 23:46:38 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 26 Oct 2010 23:46:38 +0200
Subject: [Python-ideas] New 3.x restriction on number of keyword
	arguments
In-Reply-To: <AANLkTikB7y_ekXJEzwdh5BfYc+FMzJuUGVnfG6eBCyB4@mail.gmail.com>
References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com>
	<loom.20101021T160508-402@post.gmane.org>
	<i9pl0o$vsk$1@dough.gmane.org>
	<loom.20101021T201522-329@post.gmane.org>
	<AANLkTi=6ct2iGLFtwBeZkPqBiwc-QtKhe3n61bzpKYF0@mail.gmail.com>
	<4CC1D966.2080007@egenix.com>
	<AANLkTinkgpzZnh9qQ5f4SuCaWVmQ50iqzcwh+DZ6Z8aH@mail.gmail.com>
	<4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net>
	<4CC6164B.5040201@trueblade.com>
	<1288083018.3547.0.camel@localhost.localdomain>
	<4CC69FFD.7080102@egenix.com>
	<AANLkTi=UH+J2S9WtKRWv5bRhdVtPTWAEfVLtxwSaE-re@mail.gmail.com>
	<4CC6AFB4.3040003@egenix.com>
	<AANLkTimZaYKWDx4GQiyhOai3Oji4pQNjHGX4MrazCmu0@mail.gmail.com>
	<4CC6E7A9.4030205@egenix.com>
	<AANLkTi=SbbrZd+-iSj20rtGYBz6J+Tby9skGOFsczi48@mail.gmail.com>
	<20101026194453.46d42bf9@pitrou.net>
	<AANLkTikB7y_ekXJEzwdh5BfYc+FMzJuUGVnfG6eBCyB4@mail.gmail.com>
Message-ID: <4CC74C3E.8080909@egenix.com>

Cesare Di Mauro wrote:
> 2010/10/26 Antoine Pitrou <solipsis at pitrou.net>
> 
>> On Tue, 26 Oct 2010 19:22:32 +0200
>> Cesare Di Mauro
>> <cesare.di.mauro at gmail.com> wrote:
>>>
>>> At this time I have no other ideas to solve this problem.
>>>
>>> Please, let me know if there's interest on a new patch to implement the
>>> "compiler-based" solution.
>>
>> Have you timed the EXTENDED_ARG solution?
>>
>> Regards
>>
>> Antoine.
> 
> 
> I made some timings a few minutes ago, and the results are unbelievable and
> counter-intuitive on my machine (Athlon64 2800+ socket 754, 2GB DDR 400,
> Windows 7 x64, Python 3.2a3 32 bits running at high priority):
> 
> python.exe -m timeit -r 1 -n 100000000 -s "def f(): pass" "f()"
> Standard : 100000000 loops, best of 1: 0.348 usec per loop
> EXTENDED_ARG: 100000000 loops, best of 1: 0.341 usec per loop
> 
> python.exe -m timeit -r 1 -n 100000000 -s "def f(x, y, z): pass" "f(1, 2,
> 3)"
> Standard : 100000000 loops, best of 1: 0.452 usec per loop
> EXTENDED_ARG: 100000000 loops, best of 1: 0.451 usec per loop
> 
> python.exe -m timeit -r 1 -n 100000000 -s "def f(a = 1, b = 2, c = 3): pass"
> "f(a = 1, b = 2, c = 3)"
> Standard : 100000000 loops, best of 1: 0.578 usec per loop
> EXTENDED_ARG: 100000000 loops, best of 1: 0.556 usec per loop
> 
> python.exe -m timeit -r 1 -n 100000000 -s "def f(x, y, z, a = 1, b = 2, c =
> 3): pass" "f(1, 2, 3, a = 1, b = 2, c = 3)"
> Standard : 100000000 loops, best of 1: 0.761 usec per loop
> EXTENDED_ARG: 100000000 loops, best of 1: 0.739 usec per loop
> 
> python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args): pass" "f(1, 2, 3)"
> Standard : 100000000 loops, best of 1: 0.511 usec per loop
> EXTENDED_ARG: 100000000 loops, best of 1: 0.508 usec per loop
> 
> python.exe -m timeit -r 1 -n 100000000 -s "def f(**Keys): pass" "f(a = 1, b
> = 2, c = 3)"
> Standard : 100000000 loops, best of 1: 0.789 usec per loop
> EXTENDED_ARG: 100000000 loops, best of 1: 0.784 usec per loop
> 
> python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args, **Keys): pass" "f(1,
> 2, 3, a = 1, b = 2, c = 3)"
> Standard : 100000000 loops, best of 1: 1.01 usec per loop
> EXTENDED_ARG: 100000000 loops, best of 1: 1.01 usec per loop
> 
> python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args, **Keys): pass" "f()"
> Standard : 100000000 loops, best of 1: 0.393 usec per loop
> EXTENDED_ARG: 100000000 loops, best of 1: 0.41 usec per loop
> 
> I really can't explain it. Ouch!

Looks like a good solution to the problem - no performance
loss and a much higher limit on the number of arguments.
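
For anyone who wants to see the opcode at work: on a CPython recent enough
to have dis.get_instructions() and no 255-argument limit (3.7 and later,
as far as I know), a call with a few hundred keyword arguments compiles
with EXTENDED_ARG prefixes for the wide operands. A quick sketch, with f
as a made-up name:

import dis

# 300 keyword arguments push the constant table past 255 entries,
# so some operands need an EXTENDED_ARG prefix.
src = "f(" + ", ".join("k%d=%d" % (i, i) for i in range(300)) + ")"
code = compile(src, "<demo>", "eval")
ext = [i for i in dis.get_instructions(code) if i.opname == "EXTENDED_ARG"]
print(len(ext) > 0)  # True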

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 26 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From ncoghlan at gmail.com  Wed Oct 27 00:14:14 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 27 Oct 2010 08:14:14 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
Message-ID: <AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>

On Wed, Oct 27, 2010 at 3:33 AM, Guido van Rossum <guido at python.org> wrote:
> On Tue, Oct 26, 2010 at 7:44 AM, Jacob Holm <jh at improva.dk> wrote:
>> A different way to handle this would be to change the PEP 380 expansion
>> as follows:
>>
>> [...]
>> - except GeneratorExit as _e:
>> + except (GeneratorReturn, GeneratorExit) as _e:
>> [...]
>
> That just strikes me as one more reason why a separate GeneratorReturn
> is a bad idea.
>
> In my ideal world, you almost never need to catch or raise
> StopIteration; you don't raise GeneratorExit (that is close()'s job)
> but you catch it to notice that your data source is finished, and then
> you return a value. (And see my crazy idea in my previous post to get
> rid of that too. :-)

Jacob's "implications for PEP 380" exploration started to give me some
doubts, but I think there are actually some flaws in his argument.
Accordingly, I would like to make one more attempt at explaining why I
think throwing in a separate exception for this use case is valuable
(and *doesn't* require any changes to PEP 380).

As I see it, there's a bit of a disconnect between many PEP 380 use
cases and any mechanism or idiom which translates a thrown in
exception into an ordinary StopIteration. If you expect your thrown in
exception to always terminate the generator in some fashion, adopting
the latter idiom in your generator will make it potentially unsafe to
use in a "yield from" expression that isn't the very last yield
operation in any outer generator.

Consider the following:

def example(arg):
  try:
    yield arg
  except GeneratorExit:
    return "Closed"
  return "Finished"

def outer_ok1(arg):  # close() after next() returns "Closed"
  return (yield from example(arg))

def outer_ok2(arg): # close() after next() returns None
  yield from example(arg)

def outer_broken(arg): # close() after next() gives RuntimeError
  val = yield from example(arg)
  yield val

# All 3 cases: close() before next() returns None
# All 3 cases: close() after 2x next() returns None

Using close() to say "give me your return value" creates the risk of
hitting those runtime errors in a generator's __del__ method, and
exceptions in __del__ are always a bit ugly.

Keeping the "give me your return value" and "clean up your resources"
concerns separate by adding a new method and thrown exception means
that close() is less likely to unpredictably raise RuntimeError (and
when it does, will reliably indicate a genuine bug in a generator
somewhere that is suppressing GeneratorExit).

As far as PEP 380's semantics go, I think it should ignore the
existence of anything like GeneratorReturn completely. Either one of
the generators in the chain will catch the exception and turn it into
StopIteration, or they won't. If they convert it to StopIteration, and
they aren't the last generator in the chain, then maybe what actually
needs to happen at the outermost level is something like this:

class GeneratorReturn(Exception): pass

def finish(gen):
  try:
    gen.throw(GeneratorReturn) # Ask generator to wrap things up
  except StopIteration as err:
    if err.args:
      return err.args[0]
  except GeneratorReturn:
    pass
  else:
    # Asking nicely didn't work, so force resource cleanup
    # and treat the result as if the generator had already
    # been exhausted or hadn't started yet
    gen.close()
  return None

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From ncoghlan at gmail.com  Wed Oct 27 00:28:07 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 27 Oct 2010 08:28:07 +1000
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <AANLkTimQXfrgo0A-j_xCCr3c1+=yPgQX0zKkabDvhCnN@mail.gmail.com>
References: <20101025154932.06be2faf@o>
	<AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>
	<AANLkTin82B-QT46bP5rxD6ddyg2Y+0cxDF9STLxJhUhz@mail.gmail.com>
	<ia6g52$ldv$1@dough.gmane.org>
	<AANLkTimBk7_oSxg4_5JrW981wccL-bD4j8_nTwwQVOzz@mail.gmail.com>
	<AANLkTimQXfrgo0A-j_xCCr3c1+=yPgQX0zKkabDvhCnN@mail.gmail.com>
Message-ID: <AANLkTik8Q_qPANfDX_u9oqH7sFK28sqHLDTFrX7zKvPy@mail.gmail.com>

On Wed, Oct 27, 2010 at 12:24 AM, Boris Borcic <bborcic at gmail.com> wrote:
> On Tue, Oct 26, 2010 at 2:35 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> But split only makes sense for strings, not arbitrary sequences. It's
>> the other way around for join.
>
> I don't feel your "the other way around" makes clear sense.

Indeed, I realised my comment was slightly ambiguous some time after I
posted it. "the other way around" refers to the English sentence, not
to the Python parameter order (i.e. join makes sense for arbitrary
sequences, not just strings).

If you're looking for the relevant piece of the Zen here, it's
"practicality beats purity". string.join used to be used primarily as
a function, but people had trouble remembering the parameter order.
Locking it in as a str method on the separator made the argument order
easier to remember at the cost of making it somewhat unintuitive to
learn in the first place (making it a method of the sequence being
joined was not an option, since join accepts arbitrary iterables).
Absolutely nothing has changed in the intervening years to affect the
rationale of that decision, so you can rail against it all you want
(with some justification) but you aren't going to change it.
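
The practical payoff is easy to show, since the same separator method
accepts any iterable of strings:

>>> ", ".join(["a", "b", "c"])
'a, b, c'
>>> ", ".join(str(n) for n in range(3))
'0, 1, 2'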

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From kristjan at ccpgames.com  Wed Oct 27 06:02:11 2010
From: kristjan at ccpgames.com (Kristján Valur Jónsson)
Date: Wed, 27 Oct 2010 12:02:11 +0800
Subject: [Python-ideas] ExternalMemory
Message-ID: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB5E@exchcn.ccp.ad.local>

I sometimes find myself, when writing I/O code in C, wanting to pass memory that I have allocated internally and filled with data to Python without copying it into a string object.
To this end, I have locally created a (2.x) method called PyBuffer_FromMemoryAndDestructor(), which is the same as PyBuffer_FromMemory() except that it will call a provided destructor function with an optional arg, to release the memory address given, when it is no longer in use.

First of all, I'd futilely like to suggest this change for 2.x.  The existing PyBuffer_FromMemory() provides no lifetime management.
Second, the PyBuffer object doesn't support the new Py_buffer interface, so you can't really use this as is, e.g. by putting a memoryview around it.  This is a fixable bug, on the other hand.

Thirdly, in py3k I think the situation is different.  There you would (probably, correct me if I'm wrong) emulate the old PyBuffer_FromMemory with a combination of the new PyBuffer_FromContiguous and a PyMemoryView_FromBuffer().  But this also does not allow any lifetime management of the external memory.  So, for py3k, I'd actually like to extend the memoryview object, and provide something like PyMemoryView_FromExternal() that takes an optional pointer to a "void destructor(void *arg, void *ptr)" and a "void *arg", to be called when the buffer is released.
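
At the Python level, the buffer protocol already ties the exporter's
lifetime to the view; that is the kind of management that is missing for
externally allocated C memory. A small illustration:

import ctypes

buf = ctypes.create_string_buffer(b"abc")  # an object exporting a buffer
view = memoryview(buf)
del buf             # the view keeps the underlying memory alive
print(bytes(view))  # b'abc\x00'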

K

From lie.1296 at gmail.com  Wed Oct 27 07:07:39 2010
From: lie.1296 at gmail.com (Lie Ryan)
Date: Wed, 27 Oct 2010 16:07:39 +1100
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <AANLkTikAnG0MhBegq+Zw3rL7ph4D_suuhsc9PNcKv9o6@mail.gmail.com>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>	<201010111017.56101.steve@pearwood.info>	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>	<4CB7B7C2.8090401@ronadam.com>	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>	<4CB88BD2.4010901@ronadam.com>
	<i9a2u9$q8k$1@dough.gmane.org>	<4CB898CD.6000207@ronadam.com>	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>	<4CB8B2F5.2020507@ronadam.com>
	<ia2d8l$7rr$1@dough.gmane.org>	<AANLkTi=yiaDbAfiTnV1yEk1L8UfCf4gHf3=3d9TXhvXw@mail.gmail.com>	<ia4dmg$7uc$1@dough.gmane.org>
	<AANLkTikAnG0MhBegq+Zw3rL7ph4D_suuhsc9PNcKv9o6@mail.gmail.com>
Message-ID: <ia8f3o$5sn$1@dough.gmane.org>

On 10/26/10 05:53, Guido van Rossum wrote:
>> Guido van Rossum wrote:
> [...]
>>> This should not require threads.
>>>
>>> Here's a bare-bones sketch using generators:
> [...]
> 
> On Mon, Oct 25, 2010 at 10:10 AM, Peter Otten <__peter__ at web.de> wrote:
>> I don't think the generator-based approach is equivalent to what Lie Ryan's
>> threaded code does. You are calling max(a, b) 99 times while Lie calls
>> max(items) once.
> 
> True. Nevertheless, my point stays: you shouldn't have to use threads
> to do such concurrent computations over a single-use iterable. Threads
> are too slow and since there is no I/O multiplexing they don't offer
> advantages.
> 
>> Is it possible to calculate min(items) and max(items) simultaneously using
>> generators? I don't see how...
> 
> No, this is why the reduce-like approach is better for such cases.
> Otherwise you keep trying to fit a square peg into a round hold.

except that max(a, b) is an attempt to find a square hole to fit the square
peg, and the max([a]) attempt is trying to find a round peg to fit the
round hole.



From lie.1296 at gmail.com  Wed Oct 27 07:19:31 2010
From: lie.1296 at gmail.com (Lie Ryan)
Date: Wed, 27 Oct 2010 16:19:31 +1100
Subject: [Python-ideas] [Python-Dev] minmax() function returning
 (minimum, maximum) tuple of a sequence
In-Reply-To: <4CC6A933.3080605@pearwood.info>
References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG>	<201010111017.56101.steve@pearwood.info>	<AANLkTikGy=jFmAjHc+W=qn8KeBGe5PaY2-gYW4GrTH5E@mail.gmail.com>	<C11832C9-5EFC-48DA-91BE-D2D3906A6458@masklinn.net>	<AANLkTikpHH6kLw-iOUSGS4vTY7mJRkuv_zGF8tkK7rTB@mail.gmail.com>	<AANLkTinzkNj7CHJ=i1XJSQgxi_y=Ounc9CstF_FWgQBN@mail.gmail.com>	<AANLkTimsd2gMoLQ-gHTRUPBSZ29Ryta71ozvFaNZLmWV@mail.gmail.com>	<AANLkTimHt7ZJgHVq1OPS_813Sxe_ZNG2-Xb3kVCqEeeD@mail.gmail.com>	<4CB7B7C2.8090401@ronadam.com>	<AANLkTim3-fZxpXhCWBtrhX-AaVyB=e=UGqFzMnJx96uG@mail.gmail.com>	<4CB88BD2.4010901@ronadam.com>	<i9a2u9$q8k$1@dough.gmane.org>	<4CB898CD.6000207@ronadam.com>	<AANLkTinFMfWV6sX3E-VnDKFiJX2V=+bzAxvpX=x6_Rrx@mail.gmail.com>	<4CB8B2F5.2020507@ronadam.com>	<ia2d8l$7rr$1@dough.gmane.org>	<AANLkTi=yiaDbAfiTnV1yEk1L8UfCf4gHf3=3d9TXhvXw@mail.gmail.com>
	<4CC6A933.3080605@pearwood.info>
Message-ID: <ia8fq1$8an$1@dough.gmane.org>

On 10/26/10 21:10, Steven D'Aprano wrote:
> def multi_reduce(iterable, funcs):
>     it = iter(iterable)
>     collectors = [next(it)]*len(funcs)
>     for i, f in enumerate(funcs):
>         x = next(it)
>         collectors[i] = f(collectors[i], x)
>     return collectors
> 
> I've called it multi_reduce rather than parallel_reduce, because it
> doesn't execute the functions in parallel. By my testing on Python
> 3.1.1, multi_reduce is consistently ~30% faster than the generator based
> solution for lists with 1000 - 10,000,000 items.
> 
> So what am I missing? What does your parallel_reduce give us that
> multi_reduce doesn't?

The parallel_reduce() is specifically designed for functions with
the signature `func([object])` (a function that takes, as argument, a
list of objects). The idea is that, you can write your func()
iteratively, and parallel_reduce() will somehow handle splitting work
into multiple funcs, as if you tee() the iterator, but without caching
the whole iterable.

Maybe max and min is a bad example, as it happens to be the case that
max and min have the alternative signature `func(int, int)` which makes
it a better fit with the traditional reduce() approach (and as it
happens to be, parallel_reduce() seems to be a bad name as well, since
it's not related to the traditional reduce() in any way).

And I believe you do miss something:

>>> multi_reduce([1, 2, 3], [max, min])
[2, 1]
>>> parallel_reduce([1, 2, 3], [max, min])
[3, 1]


I don't think that my original attempt with threading is an ideal
solution either, as Guido stated, it's too complicated for such a simple
problem. cProfile even shows that 30% of its time is spent waiting
on acquiring locks. The ideal solution would probably require a way for
a function to interrupt its own execution (when the teed iterator is
exhausted, but there is still some item in iterable), let other part of
code continues (the iterator feeder, and other funcs), and then resume
where it was left off (which is why I think cofunction is probably the
way to go, assuming I understand correctly what cofunction is).



In diagram:

Initially, producer creates a few funcs, and feeds them a suspendable
teed-iterators:

                    +--------+   +------------+
 true iterator  +---| func_1 |---| iterator_1 |--[1, ...]
 [1, 2, ..]     |   +--------+   +------------+
     |          |
+**********+    |   +--------+   +------------+
* producer *----+---| func_2 |---| iterator_2 |--[1, ...]
+**********+    |   +--------+   +------------+
                |
                |   +--------+   +------------+
                +---| func_3 |---| iterator_3 |--[1, ...]
                    +--------+   +------------+



First, func_1 is executed, and iterator_1 produce item 1:

                    +********+   +************+
 true iterator  +---* func_1 *---* iterator_1 *--[*1*, ...]
 [*1*, 2, ..]   |   +********+   +************+
     |          |
+----------+    |   +--------+   +------------+
| producer |----+---| func_2 |---| iterator_2 |--[1, ...]
+----------+    |   +--------+   +------------+
                |
                |   +--------+   +------------+
                +---| func_3 |---| iterator_3 |--[1, ...]
                    +--------+   +------------+



then iterator_1 suspends execution, giving control back to producer:

                    +========+   +============+
 true iterator  +---| func_1 |---| iterator_1 |--[...]
 [*1*, 2, ..]   |   +========+   +============+
     |          |
+**********+    |   +--------+   +------------+
* producer *----+---| func_2 |---| iterator_2 |--[1, ...]
+**********+    |   +--------+   +------------+
                |
                |   +--------+   +------------+
                +---| func_3 |---| iterator_3 |--[1, ...]
                    +--------+   +------------+



Then, producer give execution to func_2:

                    +========+   +============+
 true iterator  +---| func_1 |---| iterator_1 |--[...]
 [*1*, 2, ..]   |   +========+   +============+
     |          |
+----------+    |   +********+   +************+
| producer |----+---* func_2 *---* iterator_2 *--[*1*, ...]
+----------+    |   +********+   +************+
                |
                |   +--------+   +------------+
                +---| func_3 |---| iterator_3 |--[1, ...]
                    +--------+   +------------+



func_2 processes item 1, then iterator_2 suspends and give control back
to producer:

                    +========+   +============+
 true iterator  +---| func_1 |---| iterator_1 |--[...]
 [*1*, 2, ..]   |   +========+   +============+
     |          |
+**********+    |   +========+   +============+
* producer *----+---| func_2 |---| iterator_2 |--[...]
+**********+    |   +========+   +============+
                |
                |   +--------+   +------------+
                +---| func_3 |---| iterator_3 |--[1, ...]
                    +--------+   +------------+



and now it's func_3's turn:

                    +========+   +============+
 true iterator  +---| func_1 |---| iterator_1 |--[...]
 [*1*, 2, ..]   |   +========+   +============+
     |          |
+----------+    |   +========+   +============+
| producer |----+---| func_2 |---| iterator_2 |--[...]
+----------+    |   +========+   +============+
                |
                |   +********+   +************+
                +---* func_3 *---* iterator_3 *--[*1*, ...]
                    +********+   +************+




func_3 processes item 1, then iterator_3 suspends and give control back
to producer:

                    +========+   +============+
 true iterator  +---| func_1 |---| iterator_1 |--[...]
 [*1*, 2, ..]   |   +========+   +============+
     |          |
+**********+    |   +========+   +============+
* producer *----+---| func_2 |---| iterator_2 |--[...]
+**********+    |   +========+   +============+
                |
                |   +========+   +============+
                +---| func_3 |---| iterator_3 |--[...]
                    +========+   +============+



all funcs have already consumed item 1, so the producer advances (next()s)
the "true iterator" and feeds the new item to the teed iterators.

                    +========+   +============+
 true iterator  +---| func_1 |---| iterator_1 |--[2, ...]
 [*2*, 3, ..]   |   +========+   +============+
     |          |
+**********+    |   +========+   +============+
* producer *----+---| func_2 |---| iterator_2 |--[2, ...]
+**********+    |   +========+   +============+
                |
                |   +========+   +============+
                +---| func_3 |---| iterator_3 |--[2, ...]
                    +========+   +============+




then producer resumes func_1, and it processes item 2:

                    +********+   +************+
 true iterator  +---* func_1 *---* iterator_1 *--[*2*, ...]
 [*2*, 3, ..]   |   +********+   +************+
     |          |
+----------+    |   +========+   +============+
| producer |----+---| func_2 |---| iterator_2 |--[2, ...]
+----------+    |   +========+   +============+
                |
                |   +========+   +============+
                +---| func_3 |---| iterator_3 |--[2, ...]
                    +========+   +============+

then the same thing happens to func_2 and func_3; and this repeats until
the "true iterator" is exhausted. When the true iterator is exhausted,
the producer signals iterator_1, iterator_2, and iterator_3 so they raise
StopIteration, causing func_1, func_2, and func_3 to return a result.
The producer then collects the results into a list and returns them to
its caller.



Basically, it is a form of cooperative multithreading where iterator_XXX
(instead of func_xxx) decides when to suspend the execution of func_XXX
(in this particular case, when its own cache is exhausted, but there is
still some item in the true iterator).

The advantage is that func_1, func_2, and func_3 can be written
iteratively (i.e. as func([object])), as opposed to the reduce-like
approach.  If performance is important, iterator_xxx can feed multiple
items to func_xxx before suspending. Also, it should require no locking,
as object sharing and suspension of execution are controlled by
iterator_xxx (instead of by nondeterministic preemptive threading).
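
Until something like cofunctions exists, the closest runnable
approximation I know of is still thread-based: roughly my earlier
attempt, but with the lockstep feeding made explicit through one bounded
queue per func. A sketch (the names are illustrative, not an existing
API):

import queue
import threading

def parallel_reduce(iterable, funcs):
    # Each func consumes its own generator fed from a 1-slot queue, so
    # the producer advances the true iterator one item at a time.
    qs = [queue.Queue(maxsize=1) for _ in funcs]
    results = [None] * len(funcs)
    DONE = object()

    def consume(i, f, q):
        def feed():
            while True:
                item = q.get()
                if item is DONE:
                    return
                yield item
        results[i] = f(feed())

    threads = [threading.Thread(target=consume, args=(i, f, q))
               for i, (f, q) in enumerate(zip(funcs, qs))]
    for t in threads:
        t.start()
    for item in iterable:
        for q in qs:
            q.put(item)
    for q in qs:
        q.put(DONE)
    for t in threads:
        t.join()
    return results

print(parallel_reduce([1, 2, 3], [max, min]))  # [3, 1]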



From denis.spir at gmail.com  Wed Oct 27 09:10:28 2010
From: denis.spir at gmail.com (spir)
Date: Wed, 27 Oct 2010 09:10:28 +0200
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
 rdelim='')
In-Reply-To: <4CC6FD33.4050305@pearwood.info>
References: <20101025154932.06be2faf@o>
	<AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>
	<AANLkTin82B-QT46bP5rxD6ddyg2Y+0cxDF9STLxJhUhz@mail.gmail.com>
	<ia6g52$ldv$1@dough.gmane.org>
	<AANLkTimBk7_oSxg4_5JrW981wccL-bD4j8_nTwwQVOzz@mail.gmail.com>
	<ia6opa$1eq$1@dough.gmane.org> <4CC6FD33.4050305@pearwood.info>
Message-ID: <20101027091028.11322756@o>

On Wed, 27 Oct 2010 03:09:23 +1100
Steven D'Aprano <steve at pearwood.info> wrote:

> Boris Borcic wrote:
> 
> > And let's then propagate that notion, to a *coherent* definition of 
> > split that makes it as well a method on the separator.
> 
> Let's not.
> 
> Splitting is not something that you do on the separator, it's something you 
> do on the source string. I'm sure you wouldn't expect this:
> 
> ":".find("key:value")
> => 3
> 
> Nor should we expect this:
> 
> ":".split("key:value")
> => ["key", "value"]
> 
> 
> You perform a search *on* the source string, not the target substring. 
> Likewise you split the source string, not the separator.

I completely share this view.
Also, when one needs to split on multiple seps, repetitive seps, or even more complex separation schemes, it makes even less sense to see split applying to the sep instead of to the string. Even less when splitting should remove empty parts generated by seps at both ends or by repeated seps. Note that that's precisely what split() without sep does:

>>> s = " some \t little   words  "
>>> s.split()
['some', 'little', 'words']
>>> s.split(' ')
['', 'some', '', 'little', '', '', 'words', '', '']

Finally, in any such case, join is _not_ a reverse function for split. split in the general case is not reversible because there is loss of information. Reversal is possible only with a pattern limited to a single sep, no (implicit) repetition, and keeping empty parts at the ends. It is very fine that Python's split semantics is defined the way it is, but one cannot think of split as reversible in general (*). 
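
For instance:

>>> s = " some \t little   words  "
>>> " ".join(s.split()) == s
False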

Denis

(*) Similar rule: one cannot rewrite original code from an AST: there is loss of information. One can only write code in a standard form that has same semantics (hopefully).
-- -- -- -- -- -- --
vit esse estrany ?

spir.wikidot.com



From jh at improva.dk  Wed Oct 27 09:57:16 2010
From: jh at improva.dk (Jacob Holm)
Date: Wed, 27 Oct 2010 09:57:16 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=6pa84R1MAjprFJP2jiOddnQptfT-C28=1n-Df@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<4CC6C7F3.6090405@improva.dk>
	<AANLkTi=6pa84R1MAjprFJP2jiOddnQptfT-C28=1n-Df@mail.gmail.com>
Message-ID: <4CC7DB5C.9060304@improva.dk>

On 2010-10-26 19:01, Guido van Rossum wrote:
> On Tue, Oct 26, 2010 at 5:22 AM, Jacob Holm <jh at improva.dk> wrote:
> [...]
>>>> Here's a stupid idea... let g.close take an optional argument that it
>>>> can return if the generator is already exhausted and let it return the
>>>> value from the StopIteration otherwise.
>>>>
>>>> def close(self, default=None):
>>>>    if self.gi_frame is None:
>>>>        return default
>>>>    try:
>>>>        self.throw(GeneratorExit)
>>>>    except StopIteration as e:
>>>>        return e.args[0]
>>>>    except GeneratorExit:
>>>>        return None
>>>>    else:
>>>>        raise RuntimeError('generator ignored GeneratorExit')
>>>
>>> You'll have to explain why None isn't sufficient.
> 
>> It is not really necessary, but seemed "cleaner" somehow.  Think of
>> "g.close(default)" as "get me the result if possible, and this default
>> otherwise".  Then think of dict.get()...
> 
> Hm, I'd say there always is a result -- it just sometimes is None. I
> really don't want to make distinctions between falling off the end of
> the function, "return" without a value, "return None", "raise
> StopIteration()", "raise StopIteration(None)", or even (in response to
> a close() request) "raise GeneratorExit".

None of these cover the distinction I am making.  I want to distinguish
between a non-exhausted and an exhausted generator.

When calling close on a non-exhausted generator, the generator decides
how to return by any one of the means you mentioned.  In this case you
are right that there is always a result.

When calling close on an exhausted generator, the generator has no
choice in the matter as the "true" return value was thrown away.  We
have to return *something*, but calling it the "result" of the generator
is stretching it too far.  Making it possible to return something other
than None in this case seems to be analogous to dict.get().

If we chose to use a different method (e.g. Nicks "finish") for getting
the "result", I would instead raise a RuntimeError when calling it on an
exhausted generator.  I.o.w., I would want it defined something like this:

def finish(self):
    if self.gi_frame is None:
        raise RuntimeError('generator already finished')
    try:
        self.throw(GeneratorExit)
    except StopIteration as e:
        return e.args[0]
    except GeneratorExit:
        return None # XXX debatable but unimportant to me
    else:
        raise RuntimeError('generator ignored GeneratorExit')

(possibly using a new GeneratorReturn exception instead)

You might argue for using a different exception for signaling the
exhausted case, e.g.:

class GeneratorFinishedError(Exception):
    """finish() called on exhaused generator."""

but that only really makes sense if you think calling finish without
knowing whether the generator is exhausted is a reasonable thing to do.

*If* that is the case, we should also consider adding a 'default'
argument to finish which (if provided) could be returned instead of
raising the exception (kind of like dict.pop).


- Jacob


From solipsis at pitrou.net  Wed Oct 27 12:48:26 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 27 Oct 2010 12:48:26 +0200
Subject: [Python-ideas] ExternalMemory
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB5E@exchcn.ccp.ad.local>
Message-ID: <20101027124826.7b925a8c@pitrou.net>

On Wed, 27 Oct 2010 12:02:11 +0800
Kristján Valur Jónsson
<kristjan at ccpgames.com> wrote:
> 
> First of all, I'd futilely  like to suggest this change for 2.x.  The existing
> PyBuffer_FromMemory() provides no lifetime management.

By "futilely" you mean you know it won't be accepted, since 2.x is in
bug fixes-only mode? :)

> So, for py3k, I'd actually like to extend the Memoryview object, and
> provide something like PyMemoryView_FromExternal() that takes an
> optional pointer to a "void destructor(void *arg, void *ptr)) and an
> (void *arg), to be called when the buffer is released.

Sounds reasonable to me.

Regards

Antoine.




From cmjohnson.mailinglist at gmail.com  Wed Oct 27 13:22:41 2010
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Wed, 27 Oct 2010 01:22:41 -1000
Subject: [Python-ideas]  textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <AANLkTik1q1Z=Pcpzgey0jkNp1Z2gjGHaz8Tc=eCiUB0d@mail.gmail.com>
References: <20101025154932.06be2faf@o>
	<AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>
	<AANLkTin82B-QT46bP5rxD6ddyg2Y+0cxDF9STLxJhUhz@mail.gmail.com>
	<ia6g52$ldv$1@dough.gmane.org>
	<AANLkTimBk7_oSxg4_5JrW981wccL-bD4j8_nTwwQVOzz@mail.gmail.com>
	<AANLkTimQXfrgo0A-j_xCCr3c1+=yPgQX0zKkabDvhCnN@mail.gmail.com>
	<AANLkTik8Q_qPANfDX_u9oqH7sFK28sqHLDTFrX7zKvPy@mail.gmail.com>
	<AANLkTik1q1Z=Pcpzgey0jkNp1Z2gjGHaz8Tc=eCiUB0d@mail.gmail.com>
Message-ID: <AANLkTikrjowUt9bB3bCB8u8mDEQNnwVr9d0ACSnQCY-m@mail.gmail.com>

The downside of flipping the object and parameter of split is that
there's no clear thing to translate "blah\tblah\nblah".split() ==>
['blah', 'blah', 'blah'] into. None.split(string) is crazy talk. Then
again, the case can be made that split() doesn't behave like the other
splits (it drops empty segments; it treats all whitespace the same),
so maybe it shouldn't have the same name as the normal kind of split.

I do think that it might be convenient to be able to do this:

commasplit = ', '.divide #If we're going to imagine this, we should
probably use a different name than "split"
list1 = commasplit(string1)
list2 = commasplit(string2)
...

The same way that one can do:

commajoin = ', '.join
string1 = commajoin(list1)
string2 = commajoin(list2)
...

But the convention is too old and the advantage is too slight to
bother with sort of bikeshedding now. Save it for when you design a
new language to replace Python. :-)

-- Carl


From bborcic at gmail.com  Wed Oct 27 14:17:44 2010
From: bborcic at gmail.com (Boris Borcic)
Date: Wed, 27 Oct 2010 14:17:44 +0200
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <AANLkTikrjowUt9bB3bCB8u8mDEQNnwVr9d0ACSnQCY-m@mail.gmail.com>
References: <20101025154932.06be2faf@o>	<AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>	<AANLkTin82B-QT46bP5rxD6ddyg2Y+0cxDF9STLxJhUhz@mail.gmail.com>	<ia6g52$ldv$1@dough.gmane.org>	<AANLkTimBk7_oSxg4_5JrW981wccL-bD4j8_nTwwQVOzz@mail.gmail.com>	<AANLkTimQXfrgo0A-j_xCCr3c1+=yPgQX0zKkabDvhCnN@mail.gmail.com>	<AANLkTik8Q_qPANfDX_u9oqH7sFK28sqHLDTFrX7zKvPy@mail.gmail.com>	<AANLkTik1q1Z=Pcpzgey0jkNp1Z2gjGHaz8Tc=eCiUB0d@mail.gmail.com>
	<AANLkTikrjowUt9bB3bCB8u8mDEQNnwVr9d0ACSnQCY-m@mail.gmail.com>
Message-ID: <ia959b$445$1@dough.gmane.org>

Carl M. Johnson wrote:

> The downside of flipping the object and parameter of split is that
> there's no clear thing to translate "blah\tblah\nblah".split() ==>
> ['blah', 'blah', 'blah'] into. None.split(string) is crazy talk.

''.join(seqofstr) is deemed better-looking than sum(seqofstr), isn't it? Imo, 
this entails an aesthetic canon in favor of ''.split in the above context. Note 
that currently s.split('') bombs, so there would be no functional behavior to save.
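
For the record, that's what the bomb looks like today:

>>> 'abc'.split('')
Traceback (most recent call last):
  ...
ValueError: empty separator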

> Then
> again, the case can be made that split() doesn't behave like the other
> splits (it drops empty segments; it treats all whitespace the same),
> so maybe it shouldn't have the same name as the normal kind of split.
>
> I do think that it might be convenient to be able to do this:
>
> commasplit = ', '.divide #If we're going to imagine this, we should
> probably use a different name than "split"
> list1 = commasplit(string1)
> list2 = commasplit(string2)
> ...
>
> The same way that one can do:
>
> commajoin = ', '.join
> string1 = commajoin(list1)
> string2 = commajoin(list2)

Yeah, that's a concrete rendition of my earlier point on bound methods.

> ...
>
> But the convention is too old and the advantage is too slight to
> bother with this sort of bikeshedding now. Save it for when you design a
> new language to replace Python. :-)

Thanks :) But if I were to redesign the snake, I guess I might contemplate

pieces = string/separator

to mean

pieces = string.split(separator)

:)
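
A toy subclass can already emulate it, for what it's worth (purely a
sketch; S is an invented name):

class S(str):
    def __truediv__(self, sep):
        # string / separator  ->  list of pieces
        return self.split(sep)

print(S('a, b, c') / ', ')   # ['a', 'b', 'c']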

Cheers, BB




From bborcic at gmail.com  Wed Oct 27 15:16:46 2010
From: bborcic at gmail.com (Boris Borcic)
Date: Wed, 27 Oct 2010 15:16:46 +0200
Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='',
	rdelim='')
In-Reply-To: <20101027091028.11322756@o>
References: <20101025154932.06be2faf@o>	<AANLkTinAceOSwCkvwMky6HB9D_FZodiJ63eazMy9jaRV@mail.gmail.com>	<AANLkTin82B-QT46bP5rxD6ddyg2Y+0cxDF9STLxJhUhz@mail.gmail.com>	<ia6g52$ldv$1@dough.gmane.org>	<AANLkTimBk7_oSxg4_5JrW981wccL-bD4j8_nTwwQVOzz@mail.gmail.com>	<ia6opa$1eq$1@dough.gmane.org>
	<4CC6FD33.4050305@pearwood.info> <20101027091028.11322756@o>
Message-ID: <ia98nv$kfr$1@dough.gmane.org>

spir wrote:
> On Wed, 27 Oct 2010 03:09:23 +1100
> Steven D'Aprano<steve at pearwood.info>  wrote:
>
>> Boris Borcic wrote:
>>
>>> And let's then propagate that notion, to a *coherent* definition of
>>> split that makes it as well a method on the separator.
>>
>> Let's not.
>>
>> Splitting is not something that you do on the separator, it's something you
>> do on the source string. I'm sure you wouldn't expect this:
>>
>> ":".find("key:value")
>> =>  3
>>
>> Nor should we expect this:
>>
>> ":".split("key:value")
>> =>  ["key", "value"]
>>
>>
>> You perform a search *on* the source string, not the target substring.
>> Likewise you split the source string, not the separator.
>
> I completely share this view.

Pack behavior! Where's the alpha male? :)

> Also, when one needs to split on multiple seps, repetitive seps, or even
> more complex separation schemes, it makes even less sense to see split
> applying on the sep, instead of on the string.

Now that's a mighty strange argument, unless you think of /split/ as some sort 
of multimethod. I didn't mean to deprive you of your preferred swiss army knife :)

Obviously the algorithm must change according to the sort of "separation
scheme". Isn't it then natural to anticipate that the dispatch is effected along
the lines of Python's native object orientation? Maybe, though, this is a case
of the user overstepping into the private coding business of language implementors.

But on the user's own coding side, the more complex the "separation scheme", the
more likely it is that code written to achieve it using /split/ applies
repeatedly to *changing* input "source string"s. Which in turn would justify
binding the action name /split/ more tightly to the relatively stable
"separation scheme" than to the relatively unstable "source string".

> Even less when splitting should remove empty parts generated by seps at both
> ends or repeated seps. Note that it's precisely what split() without sep does:
>
>>>> s = " some \t little   words  "
>>>> s.split()
> ['some', 'little', 'words']
>>>> s.split(' ')
> ['', 'some', '', 'little', '', '', 'words', '', '']

/split/ currently behaves as it does, sure. If it were bound to the
separator, s.split() could naturally be written ''.split(s) - so what's your
point? As I told Johnson, deeming ''.join(seqofstr) better-looking than
sum(seqofstr) entails promoting the same aesthetic sense in favor of ''.split...

>
> Finally, in any of such cases, join is _not_ a reverse function for split.
> split in the general case is not reversible because there is loss of information.
> It is possible only with a pattern limited to a single sep, no (implicit)
> repetition, and keeping empty parts at ends. Very fine that python's split semantics
> is so defined, one cannot think of split as reversible in general (*).


Now that's gratuitous pedantry! Note that given

f = sep.join
g = lambda t : t.split(sep)

it is true that

g(f(g(x)))==g(x)

and

f(g(f(y)))==f(y)

for whatever values of sep, x, and y do not provoke any exception. That
covers all natural use cases, with the notable exception of s.split(), iow
sep=None. It is clearly enough to justify calling /split/, as I did, the
"converse" of /join/ (note the order: sep.join applied first, which eliminates
sep=None as a use case).
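
A quick sanity check of those identities (a sketch; any sep, x and y
that don't raise will do):

sep = ', '
f = sep.join
g = lambda t: t.split(sep)

x = 'a, , b'            # an arbitrary string
y = ['a', '', 'b, c']   # an arbitrary list of strings
assert g(f(g(x))) == g(x)
assert f(g(f(y))) == f(y)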

And iirc, the mathematical notion that best fits the idea, is not that of

http://en.wikipedia.org/wiki/Inverse_function

but that of

http://en.wikipedia.org/wiki/Adjoint_functors

Cheers, BB



From rrr at ronadam.com  Wed Oct 27 17:01:00 2010
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 27 Oct 2010 10:01:00 -0500
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=zK0kXS-9n_P3EEpM_LJyRLgPqqDJ7+qJW6p87@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=zK0kXS-9n_P3EEpM_LJyRLgPqqDJ7+qJW6p87@mail.gmail.com>
Message-ID: <4CC83EAC.7010001@ronadam.com>



On 10/25/2010 10:25 PM, Guido van Rossum wrote:
> By the way, here's how to emulate the value-returning-close() on a
> generator, assuming the generator uses raise StopIteration(x) to mean
> return x:
>
> def gclose(gen):
>    try:
>      gen.throw(GeneratorExit)
>    except StopIteration, err:
>      if err.args:
>        return err.args[0]
>    except GeneratorExit:
>      pass
>    return None
>
> I like this because it's fairly straightforward (except for the detail
> of having to also catch GeneratorExit).
>
> In fact it would be a really simple change to gen_close() in
> genobject.c -- the only change needed there would be to return
> err.args[0]. I like small evolutionary improvements to APIs.

Here's an interesting idea...

It looks like a common case for consumer co-functions is that they need to be
started and then closed, so I'm wondering if we can make these work as context
managers?  That may be a way to reduce the need for the try/except blocks
inside the generators.

     with my_cofunction(args) as c:
        ... use c

Regards,
   Ron



From alexander.belopolsky at gmail.com  Wed Oct 27 18:05:58 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 27 Oct 2010 12:05:58 -0400
Subject: [Python-ideas] Move Demo scripts under Lib
In-Reply-To: <AANLkTimODy6TOd2dy-+N0mWOKXo0Sgugu+c9thb82WjB@mail.gmail.com>
References: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>
	<AANLkTinx+762Wd_5C7WWs6EV35oWSZGbYme2Bbxshfvr@mail.gmail.com>
	<AANLkTikwXtkemCJeSmCJsNGXsm=yZmsq5L++yz_9oYGS@mail.gmail.com>
	<AANLkTinGYEZUp4xHHAOJRoB_Q-ktdd33R-CWHTt7pvEz@mail.gmail.com>
	<AANLkTimODy6TOd2dy-+N0mWOKXo0Sgugu+c9thb82WjB@mail.gmail.com>
Message-ID: <AANLkTik2SJ4fA=0DGeQ_O5xznGnGq3N5wO=fYxyZ8H-4@mail.gmail.com>

I would like to report a conclusion reached on the tracker to a wider
audience before committing the changes.  The new home for Demo/turtle
is Lib/turtledemo.  (Lib/turtle/demo alternative received no support
and Lib/demo/turtle was not even in the running.)

If anyone is interested in reviewing the patch, please see
http://bugs.python.org/issue10199.  Note that I tried to limit changes
to what was necessary for running the demo script as python -m
turtledemo.  Running the scripts as unit tests and from the python prompt
will be subject of a separate issue.

On Tue, Oct 26, 2010 at 11:49 AM, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:
> On Tue, Oct 26, 2010 at 11:18 AM, Guido van Rossum <guido at python.org> wrote:
>> On Tue, Oct 26, 2010 at 8:13 AM, Alexander Belopolsky
>> <alexander.belopolsky at gmail.com> wrote:
>>> The one demo that I want to find a better place for is Demo/turtle.
>>
>> Sure, go for it. It is a special case because the turtle module is
>> also in the stdlib and these are intended for a particular novice
>> audience.
>
> Please see http://bugs.python.org/issue10199 for further discussion.
>


From jh at improva.dk  Wed Oct 27 18:53:07 2010
From: jh at improva.dk (Jacob Holm)
Date: Wed, 27 Oct 2010 18:53:07 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=EMCDcE2bH6ENJLjyxyP-ZqP5c2L7wOni6LC0G@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<AANLkTi=EMCDcE2bH6ENJLjyxyP-ZqP5c2L7wOni6LC0G@mail.gmail.com>
Message-ID: <4CC858F3.4000602@improva.dk>

On 2010-10-26 18:56, Guido van Rossum wrote:
> Now, if I may temporarily go into wild-and-crazy mode (this *is*
> python-ideas after all :-), we could invent some ad-hoc syntax for
> this pattern, e.g.:
> 
>   for value in yield:
>     <use value>
>   return <result>
> 
> IOW the special form:
> 
>   for <var> in yield:
>     <body>
> 
> would translate into:
> 
>   try:
>     while True:
>       <var> = yield
>       <body>
>   except GeneratorExit:
>     pass
> 
> If (and this is a big if) the
> while-True-yield-inside-try-except-GeneratorExit pattern somehow
> becomes popular we could reconsider this syntactic extension or some
> variant. (I have to add that the syntactic ice is a bit thin here,
> since "for <var> in (yield)" already has a meaning, and a totally
> different one of course. A variant could be "for <var> from yield" or
> some other abuse of keywords.)

Hmm.  This got me thinking.  One thing I'd really like to see in python
is something like the "channel" object from the go language
(http://golang.org/).

Based on PEP 380 or Gregs new cofunctions PEP (or perhaps even without
any of them) it is possible to write a trampoline-based implementation
of a channel object with "send" and "next" methods that work as
expected.  One thing that is *not* possible (I think) is to make that
object iterable.  Your wild idea above gave me a similar wild idea of my
own.  An extension to the cofunctions PEP that would make that possible.
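
For concreteness, here is a rough sketch of the kind of trampoline-based
channel I mean.  There are no cofunctions here; tasks yield
('send', ch, value) or ('recv', ch, None) requests to the scheduler, and
every name is invented for illustration:

from collections import deque

class Channel:
    def __init__(self):
        self.values = deque()    # sent but not yet received
        self.readers = deque()   # tasks blocked waiting for a value

class Scheduler:
    def __init__(self):
        self.ready = deque()     # (task, value-to-send-in) pairs

    def add(self, task, value=None):
        self.ready.append((task, value))

    def run(self):
        while self.ready:
            task, value = self.ready.popleft()
            try:
                op, ch, val = task.send(value)
            except StopIteration:
                continue
            if op == 'send':
                if ch.readers:
                    # hand the value straight to a blocked reader
                    self.add(ch.readers.popleft(), val)
                else:
                    ch.values.append(val)
                self.add(task)
            elif op == 'recv':
                if ch.values:
                    self.add(task, ch.values.popleft())
                else:
                    ch.readers.append(task)

def producer(ch):
    for i in range(3):
        yield ('send', ch, i)

def consumer(ch):
    while True:
        value = yield ('recv', ch, None)
        print('got', value)

sched = Scheduler()
ch = Channel()
sched.add(consumer(ch))
sched.add(producer(ch))
sched.run()   # prints got 0 / got 1 / got 2

Note that the consumer cannot simply write "for value in ch:", which is
exactly the gap the following proposal tries to close.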

1) Define a new "coiterator" protocol, consisting of a new special
method __conext__, and a new StopCoIteration exception that the regular
StopIteration inherits from.  __conext__ should be a generator that
yields as many times as necessary, then either raises StopCoIteration or
returns a result (possibly by raising a StopIteration with a value).
Add a new built-in "conext" cofunction that looks for a __conext__
method instead of a __next__ method.

2) Define a new "coiterable" protocol, consisting of a new special
method __coiter__.  __coiter__ is a regular function and should return
an object implementing the "coiterator" protocol.  Add a new built-in
"coiter" function that looks for a __coiter__ method instead of an
__iter__ method.   (We could also make this a cofunction but for now I
don't see the point).

3) Make sure that the for-loop in a cofunction:

   for val in coiterable:
      <block>
   else:
      <block>

expands as:

   _it = coiter(coiterable)
   while True:
       try:
           val = cocall conext(_it)
       except StopCoIteration:
           break
       <block>
   else:
       <block>

Which is exactly the same as in a normal function, except for the use of
"coiter" and "cocall conext" instead of "iter" and "next", and the use
of StopCoIteration instead of StopIteration.

3a) Alternatively define a new syntax for "coiterating" that expands as
in 3 and whose use is an alternative indicator that this is a cofunction.


All this to make it possible to write code like this:

def consumer(ch):
    for val in ch:
        cocall print(val) # XXX need a cocall somewhere

def producer(ch):
    for val in range(10):
        cocall ch.send(val)

def main():
    sched = scheduler()
    ch = channel()
    sched.add(consumer(ch))
    sched.add(producer(ch))
    sched.run()


Thoughts?

- Jacob


From rrr at ronadam.com  Wed Oct 27 18:18:58 2010
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 27 Oct 2010 11:18:58 -0500
Subject: [Python-ideas]  PEP 380 close and contextmanagers?
In-Reply-To: <4CC83EAC.7010001@ronadam.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=zK0kXS-9n_P3EEpM_LJyRLgPqqDJ7+qJW6p87@mail.gmail.com>
	<4CC83EAC.7010001@ronadam.com>
Message-ID: <4CC850F2.7010202@ronadam.com>



On 10/27/2010 10:01 AM, Ron Adam wrote:
>
> On 10/25/2010 10:25 PM, Guido van Rossum wrote:
>> By the way, here's how to emulate the value-returning-close() on a
>> generator, assuming the generator uses raise StopIteration(x) to mean
>> return x:
>>
>> def gclose(gen):
>>   try:
>>     gen.throw(GeneratorExit)
>>   except StopIteration, err:
>>     if err.args:
>>       return err.args[0]
>>   except GeneratorExit:
>>     pass
>>   return None
>>
>> I like this because it's fairly straightforward (except for the detail
>> of having to also catch GeneratorExit).
>>
>> In fact it would be a really simple change to gen_close() in
>> genobject.c -- the only change needed there would be to return
>> err.args[0]. I like small evolutionary improvements to APIs.
>
> Here's an interesting idea...
>
> It looks like a common case for consumer co-functions is they need to be
> started and then closed, so I'm wondering if we can make these work
> context managers? That may be a way to reduce the need for the
> try/except blocks inside the generators.

It looks like no context managers return values in the finally or __exit__
part of a context manager.  Is there a way to do that?

Here's a context manager version of the min/max with nested coroutines, but 
it doesn't return a value from close.


######
from contextlib import contextmanager


# New close function that enables returning a
# value.

def gclose(gen):
    try:
      gen.throw(GeneratorExit)
    except StopIteration as err:
      if err.args:
        return err.args[0]
    except GeneratorExit:
      pass
    return None


# Showing both the class and generator based
# context managers for comparison and to better
# see how these things may work.

class Consumer:
     def __init__(self, cofunc):
         next(cofunc)
         self.cofunc = cofunc
     def __enter__(self):
         return self.cofunc
     def __exit__(self, *exc_info):
         gclose(self.cofunc)

@contextmanager
def consumer(cofunc):
     next(cofunc)
     try:
         yield cofunc
     finally:
         gclose(cofunc)


class MultiConsumer:
     def __init__(self, cofuncs):
         for c in cofuncs:
             next(c)
         self.cofuncs = cofuncs
     def __enter__(self):
         return self.cofuncs
     def __exit__(self, *exc_info):
         for c in self.cofuncs:
             gclose(c)

@contextmanager
def multiconsumer(cofuncs):
     for c in cofuncs:
         next(c)
     try:
         yield cofuncs
     finally:
         for c in cofuncs:
             gclose(c)


# Min/max coroutine example split into
# nested coroutines for testing these ideas
# in a more complex situation that may arise
# when working with cofunctions and generators.

# Question:
#    How to rewrite this so close returns
#    a final value?

def reduce_i(f):
      i = yield
      while True:
          i = f(i, (yield i))

def reduce_it_to(funcs):
     with multiconsumer([reduce_i(f) for f in funcs]) as mc:
         values = None
         while True:
             i = yield values
             values = [c.send(i) for c in mc]

def main():
     with consumer(reduce_it_to([min, max])) as c:
         for i in range(100):
             value = c.send(i)
         print(value)


if __name__ == '__main__':
     main()




From guido at python.org  Wed Oct 27 20:38:49 2010
From: guido at python.org (Guido van Rossum)
Date: Wed, 27 Oct 2010 11:38:49 -0700
Subject: [Python-ideas] PEP 380 close and contextmanagers?
In-Reply-To: <4CC850F2.7010202@ronadam.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=zK0kXS-9n_P3EEpM_LJyRLgPqqDJ7+qJW6p87@mail.gmail.com>
	<4CC83EAC.7010001@ronadam.com> <4CC850F2.7010202@ronadam.com>
Message-ID: <AANLkTi=Cphssxk3icqwMv-bH+_CGmPBXtWm=VWVLM-gk@mail.gmail.com>

On Wed, Oct 27, 2010 at 9:18 AM, Ron Adam <rrr at ronadam.com> wrote:
>
>
> On 10/27/2010 10:01 AM, Ron Adam wrote:
> It looks like no context managers return values in the finally or __exit__
> part of a context manager.  Is there a way to do that?

How would that value be communicated to the code containing the with-clause?

> Here's a context manager version of the min/max with nested coroutines, but
> it doesn't return a value from close.
>
>
> ######
> from contextlib import contextmanager
>
>
> # New close function that enables returning a
> # value.
>
> def gclose(gen):
>   try:
>     gen.throw(GeneratorExit)
>   except StopIteration as err:
>     if err.args:
>       return err.args[0]
>   except GeneratorExit:
>     pass
>   return None
>
>
> # Showing both the class and generator based
> # context managers for comparison and to better
> # see how these things may work.
>
> class Consumer:
>     def __init__(self, cofunc):
>         next(cofunc)
>         self.cofunc = cofunc
>     def __enter__(self):
>         return self.cofunc
>     def __exit__(self, *exc_info):
>         gclose(self.cofunc)
>
> @contextmanager
> def consumer(cofunc):
>     next(cofunc)
>     try:
>         yield cofunc
>     finally:
>         gclose(cofunc)
>
>
> class MultiConsumer:
>     def __init__(self, cofuncs):
>         for c in cofuncs:
>             next(c)
>         self.cofuncs = cofuncs
>     def __enter__(self):
>         return self.cofuncs
>     def __exit__(self, *exc_info):
>         for c in self.cofuncs:
>             gclose(c)
>
> @contextmanager
> def multiconsumer(cofuncs):
>     for c in cofuncs:
>         next(c)
>     try:
>         yield cofuncs
>     finally:
>         for c in cofuncs:
>             gclose(c)

So far so good.

> # Min/max coroutine example split into
> # nested coroutines for testing these ideas
> # in a more complex situation that may arise
> # when working with cofunctions and generators.
>
> # Question:
> #    How to rewrite this so close returns
> #    a final value?

Change the function to catch GeneratorExit and when it catches that,
raise StopIteration(<returnvalue>).

> def reduce_i(f):
>     i = yield
>     while True:
>         i = f(i, (yield i))

Unfortunately from here on till the end of your example my brain exploded.

> def reduce_it_to(funcs):
>    with multiconsumer([reduce_i(f) for f in funcs]) as mc:
>        values = None
>        while True:
>            i = yield values
>            values = [c.send(i) for c in mc]

Maybe you could have picked a better name than 'i' for this variable...

> def main():
>    with consumer(reduce_it_to([min, max])) as c:
>        for i in range(100):
>            value = c.send(i)
>        print(value)

I sort of get what you are doing here but I think you left one
abstraction out. Something like this:

def blah(it, funcs):
  with consumer(reduce_it_to(funcs)) as c:
    for i in it:
      value = c.send(i)
    return value

def main():
  print(blah(range(100), [min, max]))

> if __name__ == '__main__':
>    main()

-- 
--Guido van Rossum (python.org/~guido)


From lie.1296 at gmail.com  Wed Oct 27 19:59:44 2010
From: lie.1296 at gmail.com (Lie Ryan)
Date: Thu, 28 Oct 2010 04:59:44 +1100
Subject: [Python-ideas] Move Demo scripts under Lib
In-Reply-To: <AANLkTikwXtkemCJeSmCJsNGXsm=yZmsq5L++yz_9oYGS@mail.gmail.com>
References: <AANLkTimjfdqyvk8T6TeqokL02pH3p5nd+Yd=aJrWS7AV@mail.gmail.com>	<AANLkTinx+762Wd_5C7WWs6EV35oWSZGbYme2Bbxshfvr@mail.gmail.com>
	<AANLkTikwXtkemCJeSmCJsNGXsm=yZmsq5L++yz_9oYGS@mail.gmail.com>
Message-ID: <ia9sbb$lkk$1@dough.gmane.org>

On 10/27/10 02:13, Alexander Belopolsky wrote:
> Introduction of -m option has changed that IMO.  For example, when I
> work with recent versions of python, I always run pydoc as python -m
> pydoc, because the pydoc script on the path may not correspond to the same
> version of python that I use. 

Shouldn't there be a pydoc2.6, pydoc3.1, and other pydocX.X that
correspond to each python version? Otherwise you should be able to
create an alias in your shell.



From jh at improva.dk  Wed Oct 27 22:22:09 2010
From: jh at improva.dk (Jacob Holm)
Date: Wed, 27 Oct 2010 22:22:09 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>	<4CC6E94F.3090702@improva.dk>	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
Message-ID: <4CC889F1.8010603@improva.dk>

On 2010-10-27 00:14, Nick Coghlan wrote:
> Jacob's "implications for PEP 380" exploration started to give me some
> doubts, but I think there are actually some flaws in his argument.

I'm not sure I made much of an argument.  I showed an example that
assumed the change I was suggesting and explained what the problem would
be without the change.  Let me try another example:

def filesum(fn):
    s = 0
    with open(fn) as fd:
        for line in fd:
            s += int(line)
            yield   # be cooperative..
    return s

def multifilesum():
    a = yield from filesum('fileA')
    b = yield from filesum('fileB')
    return a+b

def main():
    g = multifilesum()
    for i in range(10):
        try:
            next(g)
        except StopIteration as e:
            r = e.value
            break
    else:
        r = g.finish()

This tries to read at most 10 lines from 'fileA' + 'fileB' and return
their combined sum, interpreting each line as an integer.  It works fine if
there are at most 10 lines, but is broken if 'fileA' has more than 10
lines.  What's more, assuming the latest PEP 380 + your "finish" and no
other changes, I don't see a simple way of fixing it.
With my modification of your "finish" proposal you can add a few
try...except blocks to the code and it will "just work (tm)"...



> Accordingly, I would like to make one more attempt at explaining why I
> think throwing in a separate exception for this use case is valuable
> (and *doesn't* require any changes to PEP 380).
> 

I am convinced that it does, at least if you want it to be usable with
yield-from.  But the same goes for any version that uses GeneratorExit.


> As I see it, there's a bit of a disconnect between many PEP 380 use
> cases and any mechanism or idiom which translates a thrown in
> exception into an ordinary StopIteration. If you expect your thrown in
> exception to always terminate the generator in some fashion, adopting
> the latter idiom in your generator will make it potentially unsafe to
> use in a "yield from" expression that isn't the very last yield
> operation in any outer generator.
> 

Right.  This is the problem I'm trying to address by modifying the PEP
expansion.


> Consider the following:
> 
> def example(arg):
>   try:
>     yield arg
>   except GeneratorExit:
>     return "Closed"
>   return "Finished"
> 
> def outer_ok1(arg):  # close() after next() returns "Closed"
>   return yield from example(arg)
> 
> def outer_ok2(arg): # close() after next() returns None
>   yield from example(arg)
> 
> def outer_broken(arg): # close() after next() gives RuntimeError
>   val = yield from example(arg)
>   yield val
> 
> # All 3 cases: close() before next() returns None
> # All 3 cases: close() after 2x next() returns None
> 

Actually, AFAICT outer_broken will *not* give a RuntimeError on close()
after next().  This is due to the special-casing of GeneratorExit in PEP
380.  That special-casing is also the basis for both my suggested
modifications.

In fact, in all 3 cases close() after next() would give None because the
"inner" return value is discarded and the GeneratorExit reraised.  Only
when called directly would the inner "example" function return "Closed"
on close() after next().


> Using close() to say "give me your return value" creates the risk of
> hitting those runtime errors in a generator's __del__ method, 

Not really.  Returning a value from close with no other changes does not
change the risk of that happening.  Of course I *do* think other changes
are necessary, but then we'll need to look at those before concluding
they are a problem...


> and
> exceptions in __del__ are always a bit ugly.
> 

That they are.


> Keeping the "give me your return value" and "clean up your resources"
> concerns separate by adding a new method and thrown exception means
> that close() is less likely to unpredictably raise RuntimeError (and
> when it does, will reliably indicate a genuine bug in a generator
> somewhere that is suppressing GeneratorExit).
> 
> As far as PEP 380's semantics go, I think it should ignore the
> existence of anything like GeneratorReturn completely. Either one of
> the generators in the chain will catch the exception and turn it into
> StopIteration, or they won't. If they convert it to StopIteration, and
> they aren't the last generator in the chain, then maybe what actually
> needs to happen at the outermost level is something like this:
> 
> class GeneratorReturn(Exception): pass
> 
> def finish(gen):
>   try:
>     gen.throw(GeneratorReturn) # Ask generator to wrap things up
>   except StopIteration as err:
>     if err.args:
>       return err.args[0]
>   except GeneratorReturn:
>     pass
>   else:
>     # Asking nicely didn't work, so force resource cleanup
>     # and treat the result as if the generator had already
>     # been exhausted or hadn't started yet
>     gen.close()
>   return None
> 

This, I don't like.  If we have a distinct method for "finishing" a
generator and getting a return value, I want it to tell me if the return
value was arrived at in some other way.  Preferably with an exception,
as in:

def finish(self):
    if self.gi_frame is None:
        raise RuntimeError('finish() on exhausted/closed generator')
    try:
        self.throw(GeneratorReturn)
    except StopIteration as err:
        if err.args:
            return err.args[0]
    except GeneratorReturn:
        pass
    else:
        raise RuntimeError('generator ignored GeneratorReturn')
    return None

The point of "finish" as I see it is not the "closing" part, but the
"give me a result" part.

Anyway, I am (probably) not going to argue much further for this.  The
only new thing that is on the table here is the "finish" function, and
using a new exception.  The use of a new exception solves some of the
issues that you and Greg had earlier, but leaves the problem of using a
value-returning close/finish with yield-from. (And Guido doesn't like
it).  Since no one seems interested in even considering a change to the
PEP 380 expansion to fix this, I don't really see any more I can
contribute at this point.

- Jacob


From ncoghlan at gmail.com  Thu Oct 28 00:00:36 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 28 Oct 2010 08:00:36 +1000
Subject: [Python-ideas] PEP 380 close and contextmanagers?
In-Reply-To: <4CC850F2.7010202@ronadam.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=zK0kXS-9n_P3EEpM_LJyRLgPqqDJ7+qJW6p87@mail.gmail.com>
	<4CC83EAC.7010001@ronadam.com> <4CC850F2.7010202@ronadam.com>
Message-ID: <AANLkTimuwtQpXh-=HVjGB_vdhxtS_SHiKWW1wk=g7Scz@mail.gmail.com>

On Thu, Oct 28, 2010 at 2:18 AM, Ron Adam <rrr at ronadam.com> wrote:
>
> It looks like no context managers return values in the finally or __exit__
> part of a context manager.  Is there a way to do that?

The return value from __exit__ is used to decide whether or not to
suppress the exception (i.e. bool(__exit__()) == True will suppress
the exception that was passed in).

There are a few CMs in the test suite (test.support) that provide info
about things that happened during their with statement - they all use
the trick of returning a stateful object from __enter__, then
modifying the attributes of that object in __exit__. I seem to recall
the CM variants of unittest.TestCase.assertRaises* doing the same
thing (so you can poke and prod at the raised exception yourself).
warnings.catch_warnings also appends encountered warnings to a list
returned by __enter__ when record=True.
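
The basic shape of that trick is simply this (a sketch; the names are
invented):

class ExceptionRecorder:
    class _Record:
        exception = None

    def __enter__(self):
        # hand out a stateful object...
        self._record = self._Record()
        return self._record

    def __exit__(self, exc_type, exc_val, exc_tb):
        # ...and fill in its attributes on the way out
        self._record.exception = exc_val
        return True              # suppress the exception

with ExceptionRecorder() as rec:
    raise ValueError('boom')
print(rec.exception)             # the ValueError is available here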

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ncoghlan at gmail.com  Thu Oct 28 00:52:59 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 28 Oct 2010 08:52:59 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CC889F1.8010603@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
Message-ID: <AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>

On Thu, Oct 28, 2010 at 6:22 AM, Jacob Holm <jh at improva.dk> wrote:
> Actually, AFAICT outer_broken will *not* give a RuntimeError on close()
> after next().  This is due to the special-casing of GeneratorExit in PEP
> 380.  That special-casing is also the basis for both my suggested
> modifications.

Ah, you're quite right - I'd completely forgotten about the
GeneratorExit special-casing in the PEP 380 semantics, so I was
arguing from a faulty premise. With that error corrected, I can
happily withdraw my objection to idioms that convert GeneratorExit to
StopIteration (since any yield from expressions will reraise the
GeneratorExit in that case).

The "did-it-really-finish?" question can likely be answered by
slightly improving generator state introspection from the Python level
(as I believe Guido suggested earlier in the thread). That way close()
can keep the gist of its current semantics (return something if the
generator ends up in an inactive state, raise RuntimeError if it
yields another value), while frameworks can object to other unexpected
states if they want to.

As it turns out, the information on generator state is already there,
just not in a particularly user friendly format ("not started" =
"g.gi_frame is not None and g.gi_frame.f_lasti == -1", "terminated" =
"g.gi_frame is None").

So, without any modifications at all to the current incarnation of PEP
380, it is already possible to write:

def finish(gen):
   frame = gen.gi_frame
   if frame is None:
       raise RuntimeError('finish() on exhausted/closed generator')
   if frame.f_lasti == -1:
       raise RuntimeError('finish() on not yet started generator')
   try:
       gen.throw(GeneratorExit)
   except StopIteration as err:
       if err.args:
           return err.args[0]
       return None
   except GeneratorExit:
       pass
   else:
       raise RuntimeError('Generator ignored GeneratorExit')
   raise RuntimeError('Generator failed to return a value')

I think I'm finally starting to understand *your* question/concern
though. Given the current PEP 380 expansion, the above definition of
finish() and the following two generators:

def g_inner():
  yield
  return "Hello world!"

def g_outer():
  yield (yield from g_inner())

You would get the following result (as g_inner converts GeneratorExit
to StopIteration, then yield from propagates that up the stack):
>>> g = g_outer()
>>> next(g)
>>> finish(g)
"Hello world!"

Oops?

I'm wondering if this part of the PEP 380 expansion:
                    if _e is _x[1] or isinstance(_x[1], GeneratorExit):
                        raise

Should actually look like:
                    if _e is _x[1]:
                        raise
                    if isinstance(_x[1], GeneratorExit):
                        raise GeneratorExit(*_e.args)

Once that distinction is made, you can more easily write helper
functions and context managers that allow code to do the "right thing"
according to the needs of a particular framework or application.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ncoghlan at gmail.com  Thu Oct 28 00:54:30 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 28 Oct 2010 08:54:30 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
Message-ID: <AANLkTik-2+tHSk+E2CJR-67LJegPMn5yem5u3P-+xVwB@mail.gmail.com>

On Thu, Oct 28, 2010 at 8:52 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Thu, Oct 28, 2010 at 6:22 AM, Jacob Holm <jh at improva.dk> wrote:
>> Actually, AFAICT outer_broken will *not* give a RuntimeError on close()
>> after next().  This is due to the special-casing of GeneratorExit in PEP
>> 380.  That special-casing is also the basis for both my suggested
>> modifications.
>
> Ah, you're quite right - I'd completely forgotten about the
> GeneratorExit special-casing in the PEP 380 semantics, so I was
> arguing from a faulty premise. With that error corrected, I can
> happily withdraw my objection to idioms that convert GeneratorExit to
> StopIteration (since any yield from expressions will reraise the
> GeneratorExit in that case).

Correction: they'll reraise StopIteration with the current PEP
semantics, GeneratorExit with the proposed modification at the end of
my last message.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From guido at python.org  Thu Oct 28 01:46:30 2010
From: guido at python.org (Guido van Rossum)
Date: Wed, 27 Oct 2010 16:46:30 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
Message-ID: <AANLkTikYuhb=oE14xhZ6QMvR1qTrx0_h+5ApD7y7OZEL@mail.gmail.com>

Nick & Jacob,

Unfortunately other things are in need of my attention and I am
quickly lagging behind on this thread.

I'll try to respond to some issues without specific quoting.

If GeneratorReturn and finish() can be implemented in pure user code,
then I think it should be up to every (framework) developer to provide
their own API, using whatever constraints they chose. Without specific
use cases it's hard to reason about API design. Still, I think it is
reasonable to offer some basic behavior on the generator object, and I
still think that the best compromise here is to let g.close() extract
the return value from StopIteration if it catches it. If a framework
decides not to use this, fine. For a user working without a framework
this is still just a little nicer than having to figure out the
required logic yourself.

I am aware of four relevant states for generators. Here's how they
work (in current Python):

- initial state: execution is poised at the top of the function.
g.throw() always bounces back the exception. g.close() moves it to the
final state. g.next() starts it running. g.send() requires a None
argument and is then the same as g.next().

- running state: the frame is active. none of g.next(), g.send(),
g.throw() or g.close() work -- they all raise ValueError.

- suspended state: execution is suspended at a yield. g.close() raises
GeneratorExit and if the generator catches this it can do whatever it
pleases. If it then raises StopIteration or GeneratorExit, g.close()
is happy, if it raises another exception g.close() just passes that
through, if it yields a value g.close() complains and raises
RuntimeError().

- finished (exhausted) state: the generator has returned. g.close()
always returns None. g.throw() always bounces back the exception.
g.next() and g.send() always raise StopIteration.

I would be in favor of adding an introspection API to distinguish
these four states and I think it would be a fine thing to add to
Python 3.2 if anyone finds the time to produce a patch (Nick? You
showed what these boil down to.)
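
For instance, such a helper might boil down to something like this (a
rough sketch, using only the gi_running and gi_frame attributes that
already exist):

def getgeneratorstate(gen):
    if gen.gi_running:
        return 'running'     # currently executing
    if gen.gi_frame is None:
        return 'finished'    # exhausted or closed
    if gen.gi_frame.f_lasti == -1:
        return 'initial'     # not started yet
    return 'suspended'       # paused at a yield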

I note that in the initial state a generator has no choice in how to
respond because it hasn't yet had the opportunity to set up a
try/except, so in this state it acts pretty much the same as in the
exhausted state when receiving a throw() or close().

Regarding built-in syntax for Go-like channels, let's first see an
implementation in userland become successful *or* see that it's
impossible to write an efficient one before adding more to the
language.

Note that having a different expansion of a for-loop based on the
run-time value or type of the iterable cannot be done -- the expansion
can only vary based on the syntactic form.

There are a few different conventions for using generators and
yield-from; e.g. generators used as proper iterators with easy
refactoring; generators used as tasks where yield X is used for
blocking I/O operations; and generators used as "inverse generators"
as in the parallel_reduce() example that initiated this thread. I
don't particularly care about what kind of errors you get if a
generator written for one convention is accidentally used by another
convention, as long as it is made clear which convention is being used
in each case. Frameworks/libraries can and probably should develop
decorators to mark up the 2nd and 3rd conventions, but I don't think
the *language* needs to go out of its way to enforce proper usage.

-- 
--Guido van Rossum (python.org/~guido)


From rrr at ronadam.com  Thu Oct 28 02:00:52 2010
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 27 Oct 2010 19:00:52 -0500
Subject: [Python-ideas] PEP 380 close and contextmanagers?
In-Reply-To: <AANLkTi=Cphssxk3icqwMv-bH+_CGmPBXtWm=VWVLM-gk@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=zK0kXS-9n_P3EEpM_LJyRLgPqqDJ7+qJW6p87@mail.gmail.com>
	<4CC83EAC.7010001@ronadam.com> <4CC850F2.7010202@ronadam.com>
	<AANLkTi=Cphssxk3icqwMv-bH+_CGmPBXtWm=VWVLM-gk@mail.gmail.com>
Message-ID: <4CC8BD34.3090700@ronadam.com>


On 10/27/2010 01:38 PM, Guido van Rossum wrote:
> On Wed, Oct 27, 2010 at 9:18 AM, Ron Adam<rrr at ronadam.com>  wrote:
>>
>>
>> On 10/27/2010 10:01 AM, Ron Adam wrote:
>> It looks like no context managers return values in the finally or __exit__
>> part of a context manager.  Is there a way to do that?
>
> How would that value be communicated to the code containing the with-clause?

I think that was what I was trying to figure out also.

>> def reduce_i(f):
>>      i = yield
>>      while True:
>>          i = f(i, (yield i))
>
> Unfortunately from here on till the end of your example my brain exploded.

Mine did too, but I think it was a useful but strange experience. ;-)

It forced me to take a break and think about the problem from a different 
viewpoint.  Here's the conclusion I came to, but be forewarned, it's kind of
anti-climactic.  :-)


Using an exception to signal some bit of code is a way to reach over
a wall that also protects that bit of code.  This seems to be a more common
need when using coroutines, because it's more common to have some bits of
code indirectly direct some other bits of code.

Generators already have a nice .throw() method that will return the value 
at the next yield.  But we either have to choose an existing exception to 
throw, that has some other purpose, or make up a new one.  When it comes to 
making up new ones, lots of other programmers may each call it something else.

That isn't a big problem, but it may be nice if we had a standard exception
for saying... "Hey you, send me a total or subtotal!"  And that's all that
it does.  For now let's call it a ValueRequest exception.

ValueRequest makes sense if you are throwing an exception; I think
ValueReturn may make more sense if you are raising an exception.  Or maybe
there is something that reads well both ways?  These both fit very nicely
with ValueError, and it may make reading code easier if we make a
distinction between a request and a return.


Below is the previous example rewritten to do this. A ValueRequest doesn't
stop anything or force anything to close, so it won't ever interfere with,
confuse, or complicate code that uses other exceptions.  You can always
throw or catch one of these and raise something else if you need to.

Since throwing it into a generator doesn't stop the generator, the 
generator can put the try-except into a larger loop and loop back to get 
more values and catch another ValueRequest at some later point.  I feel 
that is a useful and handy thing to do.


So here's the example again.

The first version of this took advantage of yield's ability to send and get 
data at the same time to always send back an update (subtotal) to the 
parent routine.  That's nearly free since a yield always sends something 
back anyway. (None if you don't give it something else.)  But it's not 
always easy to do, or easy to understand if you do it.  I.e.,
brain-exploding stuff.

In this version, data only flows into the coroutine until a ValueRequest 
exception is thrown at it, at which point it then yields back a total.


*I can see where some routines may reverse the control, by throwing
ValueReturns from the inside out, rather than ValueRequests from the
outside in.  Is it useful to distinguish between the two or should there
be just one?

*Yes this can be made to work with gclose() and return, but I feel that is 
more restrictive, and more complex, than it needs to be.

*I still didn't figure out how to use the context managers to get rid of 
the try except. Oh well.  ;-)



from contextlib import contextmanager

class ValueRequest(Exception):
     pass

@contextmanager
def consumer(cofunc, result=True):
     next(cofunc)
     try:
         yield cofunc
     finally:
         cofunc.close()

@contextmanager
def multiconsumer(cofuncs, result=True):
     for c in cofuncs:
         next(c)
     try:
         yield cofuncs
     finally:
         for c in cofuncs:
             c.close()

# Min/max coroutine example split into
# nested coroutines for testing these ideas
# in a more complex situation that may arise
# when working with cofunctions and generators.

def reduce_item(f):
     try:
         x = yield
         while True:
             x = f(x, (yield))
     except ValueRequest:
         yield x

def reduce_group(funcs):
     with multiconsumer([reduce_item(f) for f in funcs]) as mc:
         try:
             while True:
                 x = yield
                 for c in mc:
                     c.send(x)
         except ValueRequest:
             yield [c.throw(ValueRequest) for c in mc]

def get_reductions(funcs, iterable):
     with consumer(reduce_group(funcs)) as c:
         for x in iterable:
             c.send(x)
         return c.throw(ValueRequest)

def main():
     funcs = [min, max]
     print(get_reductions(funcs, range(100)))
     s = "Python is fun for play, and great for work too."
     print(get_reductions(funcs, s))

if __name__ == '__main__':
     main()



From rrr at ronadam.com  Thu Oct 28 03:26:23 2010
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 27 Oct 2010 20:26:23 -0500
Subject: [Python-ideas] PEP 380 close and contextmanagers?
In-Reply-To: <AANLkTimuwtQpXh-=HVjGB_vdhxtS_SHiKWW1wk=g7Scz@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=zK0kXS-9n_P3EEpM_LJyRLgPqqDJ7+qJW6p87@mail.gmail.com>	<4CC83EAC.7010001@ronadam.com>	<4CC850F2.7010202@ronadam.com>
	<AANLkTimuwtQpXh-=HVjGB_vdhxtS_SHiKWW1wk=g7Scz@mail.gmail.com>
Message-ID: <4CC8D13F.4070203@ronadam.com>



On 10/27/2010 05:00 PM, Nick Coghlan wrote:
> On Thu, Oct 28, 2010 at 2:18 AM, Ron Adam<rrr at ronadam.com>  wrote:
>>
>> It looks like No context managers return values in the finally or __exit__
>> part of a context manager.  Is there way to do that?
>
> The return value from __exit__ is used to decide whether or not to
> suppress the exception (i.e. bool(__exit__()) == True will suppress
> the exception that was passed in).
>
> There are a few CMs in the test suite (test.support) that provide info
> about things that happened during their with statement - they all use
> the trick of returning a stateful object from __enter__, then
> modifying the attributes of that object in __exit__. I seem to recall
> the CM variants of unittest.TestCase.assertRaises* doing the same
> thing (so you can poke and prod at the raised exception yourself).
> warnings.catch_warnings also appends encountered warnings to a list
> returned by __enter__ when record=True.
>
> Cheers,
> Nick.

Thanks, I'll take a look.  If for nothing else it will help me understand 
it better.

BTW, the use case of the (min/max) examples doesn't fit that particular
need.  It turned out that just creating a custom exception and throwing it
into the coroutine is probably the best and simplest way to do it.

That's not to say that some of the other things Guido is thinking of won't
benefit from close() returning a value, but that particular example doesn't.

Cheers,
    Ron



From kristjan at ccpgames.com  Thu Oct 28 04:05:22 2010
From: kristjan at ccpgames.com (Kristján Valur Jónsson)
Date: Thu, 28 Oct 2010 10:05:22 +0800
Subject: [Python-ideas] ExternalMemory
In-Reply-To: <20101027124826.7b925a8c@pitrou.net>
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB5E@exchcn.ccp.ad.local>
	<20101027124826.7b925a8c@pitrou.net>
Message-ID: <2E034B571A5CE44E949B9FCC3B6D24EE5761FC52@exchcn.ccp.ad.local>

Looking closer at this, I don't think it is such a good idea.
The MemoryView is designed to be a wrapper around the Py_buffer interface.  Slicing it, for example, creates a new memoryview based on the same underlying object.  Having the MemoryView get its data from two different places would be very hacky, I think.
What is needed, I think, is a basic ExternalMemory C API object with a buffer interface that does what I describe.
This exists in 2.7 (the BufferObject) but with the shortcomings I mentioned.  But as far as I know, there is no similar object in py3k.
K

-----Original Message-----
From: python-ideas-bounces+kristjan=ccpgames.com at python.org [mailto:python-ideas-bounces+kristjan=ccpgames.com at python.org] On Behalf Of Antoine Pitrou
Sent: Wednesday, October 27, 2010 18:48
To: python-ideas at python.org
Subject: Re: [Python-ideas] ExternalMemory


> So, for py3k, I'd actually like to extend the Memoryview object, and 
> provide something like PyMemoryView_FromExternal() that takes an 
> optional pointer to a "void destructor(void *arg, void *ptr)" and an
> (void *arg), to be called when the buffer is released.

Sounds reasonable to me.

Regards

Antoine.


_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
http://mail.python.org/mailman/listinfo/python-ideas



From guido at python.org  Thu Oct 28 04:53:14 2010
From: guido at python.org (Guido van Rossum)
Date: Wed, 27 Oct 2010 19:53:14 -0700
Subject: [Python-ideas] PEP 380 close and contextmanagers?
In-Reply-To: <4CC8BD34.3090700@ronadam.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=zK0kXS-9n_P3EEpM_LJyRLgPqqDJ7+qJW6p87@mail.gmail.com>
	<4CC83EAC.7010001@ronadam.com> <4CC850F2.7010202@ronadam.com>
	<AANLkTi=Cphssxk3icqwMv-bH+_CGmPBXtWm=VWVLM-gk@mail.gmail.com>
	<4CC8BD34.3090700@ronadam.com>
Message-ID: <AANLkTikT4GSBBV4ffKyhOC+Cgw=0rmhGNHFojTPS+y+4@mail.gmail.com>

On Wed, Oct 27, 2010 at 5:00 PM, Ron Adam <rrr at ronadam.com> wrote:
> [...]

Hm... Certainly interesting. My own (equally anti-climactic :-)
conclusions would be:

- Tastes differ

- There is a point where yield gets overused

- I am not convinced that using reduce as a paradigm here is right

-- 
--Guido van Rossum (python.org/~guido)


From rrr at ronadam.com  Thu Oct 28 05:57:08 2010
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 27 Oct 2010 22:57:08 -0500
Subject: [Python-ideas] PEP 380 close and contextmanagers?
In-Reply-To: <AANLkTikT4GSBBV4ffKyhOC+Cgw=0rmhGNHFojTPS+y+4@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=zK0kXS-9n_P3EEpM_LJyRLgPqqDJ7+qJW6p87@mail.gmail.com>
	<4CC83EAC.7010001@ronadam.com> <4CC850F2.7010202@ronadam.com>
	<AANLkTi=Cphssxk3icqwMv-bH+_CGmPBXtWm=VWVLM-gk@mail.gmail.com>
	<4CC8BD34.3090700@ronadam.com>
	<AANLkTikT4GSBBV4ffKyhOC+Cgw=0rmhGNHFojTPS+y+4@mail.gmail.com>
Message-ID: <4CC8F494.5030601@ronadam.com>

On 10/27/2010 09:53 PM, Guido van Rossum wrote:

> Hm... Certainly interesting. My own (equally anti-climactic :-)
> conclusions would be:
>
> - Tastes differ
>
> - There is a point where yield gets overused
>
> - I am not convinced that using reduce as a paradigm here is right


I Agree. :-)

This was a contrived example for the purpose of testing an idea.  The 
concept being tested had nothing to do with reduce.  It had to do with the 
interface and control mechanisms.

Cheers,
    Ron





From offline at offby1.net  Thu Oct 28 08:32:43 2010
From: offline at offby1.net (Chris Rose)
Date: Thu, 28 Oct 2010 00:32:43 -0600
Subject: [Python-ideas] Ordered storage of keyword arguments
Message-ID: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>

I'd like to resurrect a discussion that went on a little over a year
ago [1] started by Michael Foord suggesting that it'd be nice if
keyword arguments' storage was implemented as an ordered dict as
opposed to the current unordered form.

I'm interested in picking this up for implementation, which presumably
will require moving the implementation of the existing ordereddict
class into the C library.

Are there any issues that this might cause in implementation on the
py3k development line?

[1] http://mail.python.org/pipermail/python-ideas/2009-April/004163.html
--
Chris R.
Not to be taken literally, internally, or seriously.


From mal at egenix.com  Thu Oct 28 10:13:09 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 28 Oct 2010 10:13:09 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
Message-ID: <4CC93095.3080704@egenix.com>

Chris Rose wrote:
> I'd like to resurrect a discussion that went on a little over a year
> ago [1] started by Michael Foord suggesting that it'd be nice if
> keyword arguments' storage was implemented as an ordered dict as
> opposed to the current unordered form.
> 
> I'm interested in picking this up for implementation, which presumably
> will require moving the implementation of the existing ordereddict
> class into the C library.
> 
> Are there any issues that this might cause in implementation on the
> py3k development line?
> 
> [1] http://mail.python.org/pipermail/python-ideas/2009-April/004163.html

Ordered dicts are a lot slower than normal dictionaries. I don't
think that we can make such a change unless we want to make
Python a lot slower at the same time.

If you only want to learn about the definition order of the
keywords you can use the inspect module.

>>> import inspect
>>> def f(a,b,c=1,d=2): pass
...
>>> inspect.getargspec(f)
(['a', 'b', 'c', 'd'], None, None, (1, 2))

I don't see much use in having the order of providing the
keyword arguments in a function call always available.
Perhaps there's a way to have this optionally, i.e. by
allowing odicts to be passed in as keyword argument dict ?!

Where I do see a real use is making the order of class
attribute and method definition accessible in Python
(without having to use meta-class hacks like e.g. Django's
ORM does).

It would then be much easier to use classes to represent
external resources that rely on order, e.g. database table
schemas or XML schemas.

Classes are created using a keyword-like dictionary as
well, so the situation is similar. The major difference
is that classes aren't created as often as functions are
called.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 28 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From jh at improva.dk  Thu Oct 28 10:24:28 2010
From: jh at improva.dk (Jacob Holm)
Date: Thu, 28 Oct 2010 10:24:28 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTikYuhb=oE14xhZ6QMvR1qTrx0_h+5ApD7y7OZEL@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<AANLkTikYuhb=oE14xhZ6QMvR1qTrx0_h+5ApD7y7OZEL@mail.gmail.com>
Message-ID: <4CC9333C.1030709@improva.dk>

On 2010-10-28 01:46, Guido van Rossum wrote:
> Nick & Jacob,
> 
> Unfortunately other things are in need of my attention and I am
> quickly lagging behind on this thread.
> 

Too bad, but understandable.  I'll try to be brief(er).


> I'll try to respond to some issues without specific quoting.
> 
> If GeneratorReturn and finish() can be implemented in pure user code,
> then I think it should be up to every (framework) developer to provide
> their own API, using whatever constraints they chose. Without specific
> use cases it's hard to reason about API design.

GeneratorReturn and finish *can* be implemented in pure user code, as
long as you accept that the premature return has to use some other
mechanism than "return" or StopIteration.
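
For what it's worth, here is a minimal sketch of that approach.  The
ReturnValue exception and the finish() helper are hypothetical names
used for illustration, not an agreed API:

class ReturnValue(Exception):
    # carries the "return value" out of the generator, since a plain
    # return/StopIteration would be swallowed by close()
    def __init__(self, value):
        self.value = value

def finish(gen):
    # close gen and extract the value it wrapped in ReturnValue
    try:
        gen.throw(GeneratorExit)
    except ReturnValue as e:
        return e.value
    except (GeneratorExit, StopIteration):
        raise RuntimeError('generator did not provide a value')
    raise RuntimeError('generator ignored GeneratorExit')

def averager():
    total = count = 0
    try:
        while True:
            total += yield
            count += 1
    except GeneratorExit:
        raise ReturnValue(total / count)

g = averager()
next(g)
g.send(10)
g.send(20)
print(finish(g))   # 15.0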



> Still, I think it is
> reasonable to offer some basic behavior on the generator object, and I
> still think that the best compromise here is to let g.close() extract
> the return value from StopIteration if it catches it. If a framework
> decides not to use this, fine. For a user working without a framework
> this is still just a little nicer than having to figure out the
> required logic yourself.
> 

This works only as long as you don't actually use yield-from, making it
a bit of a strange match to that PEP.  To get it to work *with*
yield-from you need the reraised GeneratorExit to include the return
value (possibly None) from the inner generator.  I seem to have
convinced Nick that the problem is real and that a modification to the
expansion might be needed/desirable.



> I am aware of four relevant states for generators. Here's how they
> work (in current Python):
> 
> - initial state: execution is poised at the top of the function.
> g.throw() always bounces back the exception. g.close() moves it to the
> final state. g.next() starts it running. g.send() requires a None
> argument and is then the same as g.next().
> 
> - running state: the frame is active. none of g.next(), g.send(),
> g.throw() or g.close() work -- they all raise ValueError.
> 
> - suspended state: execution is suspended at a yield. g.close() raises
> GeneratorExit and if the generator catches this it can do whatever it
> pleases. If it then raises StopIteration or GeneratorExit, g.close()
> is happy, if it raises another exception g.close() just passes that
> through, if it yields a value g.close() complains and raises
> RuntimeError().
> 
> - finished (exhausted) state: the generator has returned. g.close()
> always return None. g.throw() always bounces back the exception.
> g.next() and g.send() always raise StopIteration.
> 
> I would be in favor of adding an introspection API to distinguish
> these four states and I think it would be a fine thing to add to
> Python 3.2 if anyone finds the time to produce a patch (Nick? You
> showed what these boil down to.)
> 
> I note that in the initial state a generator has no choice in how to
> respond because it hasn't yet had the opportunity to set up a
> try/except, so in this state it acts pretty much the same as in the
> exhausted state when receiving a throw() or close().
> 

Yes, I forgot about this case in the versions of "finish" that I wrote.
 Nick showed a better version that handled it properly.


> Regarding built-in syntax for Go-like channels, let's first see an
> implementation in userland become successful *or* see that it's
> impossible to write an efficient one before adding more to the
> language.
> 

It is impossible in current Python to use a for-loop or generator
expression to loop over a Go-like channel without using threads for
everything.  (The only way to suspend the iteration is to suspend the
thread, and then whatever code is supposed to write to the channel must
be running in another thread.)  This is a shame, since the blocking
nature of channels otherwise makes them ideal for cooperative
multitasking.

Note, this restriction (no for-loop iteration without threads) does not
make channels useless in current python, just much less convenient to
work with.  That, unfortunately, makes it less likely that a userland
implementation will ever become successful.
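
To make the thread-based alternative concrete, here is a rough sketch
(Python 3 names).  The Channel class and its close() sentinel are made
up for illustration; real Go channels have more features:

import queue
import threading

class Channel:
    _DONE = object()   # sentinel marking the end of the stream

    def __init__(self, maxsize=1):
        self._q = queue.Queue(maxsize)

    def send(self, value):
        self._q.put(value)       # blocks while the channel is full

    def close(self):
        self._q.put(self._DONE)

    def __iter__(self):
        while True:
            value = self._q.get()   # blocks the *thread*, not a task
            if value is self._DONE:
                return
            yield value

ch = Channel()

def producer():
    for i in range(3):
        ch.send(i)
    ch.close()

threading.Thread(target=producer).start()
for value in ch:   # a plain for-loop works, at the cost of a thread
    print(value)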


> Note that having a different expansion of a for-loop based on the
> run-time value or type of the iterable cannot be done -- the expansion
> can only vary based on the syntactic form.
> 

The intent was to have a different expansion depending on the type of
function containing the for-loop (as in regular/cofunction).  I think I
made a few errors though, so the new expansion doesn't actually work
with regular iterables.  If I get around to fixing it I'll post the fix
in that thread.


> There are a few different conventions for using generators and
> yield-from; e.g. generators used as proper iterators with easy
> refactoring; generators used as tasks where yield X is used for
> blocking I/O operations; and generators used as "inverse generators"
> as in the parallel_reduce() example that initiated this thread. I
> don't particularly care about what kind of errors you get if a
> generator written for one convention is accidentally used by another
> convention, as long as it is made clear which convention is being used
> in each case. Frameworks/libraries can and probably should develop
> decorators to mark up the 2nd and 3rd conventions, but I don't think
> the *language* needs to go out of its way to enforce proper usage.
> 

Agreed, I think.

- Jacob


From pyideas at rebertia.com  Thu Oct 28 10:38:09 2010
From: pyideas at rebertia.com (Chris Rebert)
Date: Thu, 28 Oct 2010 01:38:09 -0700
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <4CC93095.3080704@egenix.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com>
Message-ID: <AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>

On Thu, Oct 28, 2010 at 1:13 AM, M.-A. Lemburg <mal at egenix.com> wrote:
<snip>
> I don't see much use in having the order of providing the
> keyword arguments in a function call always available.
> Perhaps there's a way to have this optionally, i.e. by
> allowing odicts to be passed in as keyword argument dict ?!
>
> Where I do see a real use is making the order of class
> attribute and method definition accessible in Python
> (without having to use meta-class hacks like e.g. Django's
> ORM does).

So, you want to make class bodies use (C-implemented) OrderedDicts by
default, thus rendering metaclass __prepare__() definitions for that
purpose ("meta-class hacks") superfluous?

Cheers,
Chris


From jh at improva.dk  Thu Oct 28 10:52:53 2010
From: jh at improva.dk (Jacob Holm)
Date: Thu, 28 Oct 2010 10:52:53 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>	<4CC6E94F.3090702@improva.dk>	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
Message-ID: <4CC939E5.5070700@improva.dk>

On 2010-10-28 00:52, Nick Coghlan wrote:
> On Thu, Oct 28, 2010 at 6:22 AM, Jacob Holm <jh at improva.dk> wrote:
>> Actually, AFAICT outer_broken will *not* give a RuntimeError on close()
>> after next().  This is due to the special-casing of GeneratorExit in PEP
>> 380.  That special-casing is also the basis for both my suggested
>> modifications.
> 
> Ah, you're quite right - I'd completely forgotten about the
> GeneratorExit special-casing in the PEP 380 semantics, so I was
> arguing from a faulty premise. With that error corrected, I can
> happily withdraw my objection to idioms that convert GeneratorExit to
> StopIteration (since any yield from expressions will reraise the
> GeneratorExit in that case).
> 

Looks like we are still not on exactly the same page though...  You seem
to be arguing from the version at
http://www.python.org/dev/peps/pep-0380, whereas I am looking at
http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/attachment-0001.txt,
which is newer.



> The "did-it-really-finish?" question can likely be answered by
> slightly improving generator state introspection from the Python level
> (as I believe Guido suggested earlier in the thread). That way close()
> can keep the gist of its current semantics (return something if the
> generator ends up in an inactive state, raise RuntimeError if it
> yields another value), while frameworks can object to other unexpected
> states if they want to.
> 
> As it turns out, the information on generator state is already there,
> just not in a particularly user friendly format ("not started" =
> "g.gi_frame is not None and g.gi_frame.f_lasti == -1", "terminated" =
> "g.gi_frame is None").
> 
> So, without any modifications at all to the current incarnation of PEP
> 380, it is already possible to write:
> 
> def finish(gen):
>    frame = gen.gi_frame
>    if frame is None:
>        raise RuntimeError('finish() on exhausted/closed generator')
>    if frame.f_lasti == -1:
>        raise RuntimeError('finish() on not yet started generator')
>    try:
>        gen.throw(GeneratorExit)
>    except StopIteration as err:
>        if err.args:
>            return err.args[0]
>        return None
>    except GeneratorExit:
>        pass
>    else:
>        raise RuntimeError('Generator ignored GeneratorExit')
>    raise RuntimeError('Generator failed to return a value')
> 

Yes.  I forgot about the "not yet started" case in my earlier versions.



> I think I'm finally starting to understand *your* question/concern
> though. Given the current PEP 380 expansion, the above definition of
> finish() and the following two generators:
> 
> def g_inner():
>   yield
>   return "Hello world!"
> 
> def g_outer():
>   yield (yield from g_inner())
> 
> You would get the following result (as g_inner converts GeneratorExit
> to StopIteration, then yield from propogates that up the stack):
>>>> g = g_outer()
>>>> next(g)
>>>> finish(g)
> "Hello world!"
> 
> Oops?
> 

Well.  Not with the newest expansion.  Not that the None you will get
from that one is any better.


> I'm wondering if this part of the PEP 380 expansion:
>                     if _e is _x[1] or isinstance(_x[1], GeneratorExit):
>                         raise
> 
> Should actually look like:
>                     if _e is _x[1]:
>                         raise
>                     if isinstance(_x[1], GeneratorExit):
>                         raise GeneratorExit(*_e.args)
> 

In the newer expansion, I would change:

            except GeneratorExit as _e:
                try:
                    _m = getattr(_i, 'close')
                except AttributeError:
                    pass
                else:
                    _m()
                raise _e

Into:

            except GeneratorExit as _e:
                try:
                    _m = getattr(_i, 'close')
                except AttributeError:
                    pass
                else:
                    raise GeneratorExit(_m())
                raise _e

(This can be cleaned up a bit, btw., by removing _e and using direct
attribute access instead of getattr, as sketched below.)
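
E.g. something like this (untested; note that the bare raise at the
end re-raises the GeneratorExit currently being handled):

            except GeneratorExit:
                try:
                    _close = _i.close
                except AttributeError:
                    pass
                else:
                    raise GeneratorExit(_close())
                raise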

> Once that distinction is made, you can more easily write helper
> functions and context managers that allow code to do the "right thing"
> according to the needs of a particular framework or application.
> 

Yes.  OTOH, I have argued for this change before with no luck.

- Jacob


From denis.spir at gmail.com  Thu Oct 28 11:19:36 2010
From: denis.spir at gmail.com (spir)
Date: Thu, 28 Oct 2010 11:19:36 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <4CC93095.3080704@egenix.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com>
Message-ID: <20101028111936.1629eade@o>

On Thu, 28 Oct 2010 10:13:09 +0200
"M.-A. Lemburg" <mal at egenix.com> wrote:

> Ordered dicts are a lot slower than normal dictionaries. I don't
> think that we can make such a change unless we want to make
> Python a lot slower at the same time.

Ruby has had ordered hashes since 1.9, with apparently no relevant performance loss -- in fact overall performance improved, due to improvements in other aspects of the language. See e.g. http://www.igvita.com/2009/02/04/ruby-19-internals-ordered-hash/
I have no idea how Python dicts are implemented, especially how entries are held in "buckets". The trick for Ruby is that buckets are actually linked lists: entries are list nodes with pointers allowing linear search inside the bucket. To preserve insertion order, all that is needed is to add a parallel pointer to each node, forming a second list that records insertion order. Iteration just follows this second sequence of pointers. (I find this solution rather elegant.)
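
As a toy illustration of that scheme in Python terms (chained buckets
plus a second linked list recording insertion order; note that
CPython's dict actually uses open addressing, not chaining):

class Entry:
    __slots__ = ('key', 'value', 'bucket_next', 'order_next')
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.bucket_next = None   # chain within one bucket
        self.order_next = None    # next entry in insertion order

class OrderedHash:
    def __init__(self, nbuckets=8):
        self.buckets = [None] * nbuckets   # no resizing in this toy
        self.head = self.tail = None

    def _find(self, key):
        e = self.buckets[hash(key) % len(self.buckets)]
        while e is not None and e.key != key:
            e = e.bucket_next
        return e

    def __setitem__(self, key, value):
        e = self._find(key)
        if e is not None:
            e.value = value
            return
        e = Entry(key, value)
        i = hash(key) % len(self.buckets)
        e.bucket_next, self.buckets[i] = self.buckets[i], e
        if self.tail is None:
            self.head = self.tail = e
        else:
            self.tail.order_next = e
            self.tail = e

    def __getitem__(self, key):
        e = self._find(key)
        if e is None:
            raise KeyError(key)
        return e.value

    def __iter__(self):   # follows insertion order, not bucket order
        e = self.head
        while e is not None:
            yield e.key
            e = e.order_next

h = OrderedHash()
h['b'] = 1; h['a'] = 2; h['c'] = 3
print(list(h))   # ['b', 'a', 'c']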

Denis
-- -- -- -- -- -- --
vit esse estrany ?

spir.wikidot.com



From mal at egenix.com  Thu Oct 28 11:27:27 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 28 Oct 2010 11:27:27 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com>
	<AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>
Message-ID: <4CC941FF.6070408@egenix.com>

Chris Rebert wrote:
> On Thu, Oct 28, 2010 at 1:13 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> <snip>
>> I don't see much use in having the order of providing the
>> keyword arguments in a function call always available.
>> Perhaps there's a way to have this optionally, i.e. by
>> allowing odicts to be passed in as keyword argument dict ?!
>>
>> Where I do see a real use is making the order of class
>> attribute and method definition accessible in Python
>> (without having to use meta-class hacks like e.g. Django's
>> ORM does).
> 
> So, you want to make class bodies use (C-implemented) OrderedDicts by
> default, thus rendering metaclass __prepare__() definitions for that
> purpose ("meta-class hacks") superfluous?

Yes.

The http://www.python.org/dev/peps/pep-3115/ has an interesting
paragraph:

     Another good suggestion was to simply use an ordered dict for all
     classes, and skip the whole 'custom dict' mechanism. This was based
     on the observation that most use cases for a custom dict were for
     the purposes of preserving order information. However, this idea has
     several drawbacks, first because it means that an ordered dict
     implementation would have to be added to the set of built-in types
     in Python, and second because it would impose a slight speed (and
     complexity) penalty on all class declarations. Later, several people
     came up with ideas for use cases for custom dictionaries other
     than preserving field orderings, so this idea was dropped.

Some comments:

An ordered dict in C could be optimized to keep the same
performance as the regular dict by only storing the insertion
index together with the dict item and not maintaining these
in a separate list.

Access to the order would be slower, but
it would make its use in timing critical parts of CPython
a lot more attractive. An ordered dict would then require more
memory, but not necessarily introduce a performance hit.
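
In pure-Python terms the idea would look something like this (a sketch
only; the real thing would of course live inside the C dict
implementation):

class IndexedDict:
    def __init__(self):
        self._d = {}          # key -> (insertion_index, value)
        self._counter = 0

    def __setitem__(self, key, value):
        if key in self._d:
            index = self._d[key][0]    # keep the original position
        else:
            index = self._counter
            self._counter += 1
        self._d[key] = (index, value)

    def __getitem__(self, key):
        return self._d[key][1]         # O(1), same as a plain dict

    def __iter__(self):                # order recovery costs a sort
        return iter(sorted(self._d, key=lambda k: self._d[k][0]))

d = IndexedDict()
d['x'] = 1; d['y'] = 2; d['x'] = 3
print([(k, d[k]) for k in d])   # [('x', 3), ('y', 2)]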

The quoted paragraph starts with "observation that most use
cases for a custom dict were for the purposes of preserving
order information" and ends with "several people came up
with ideas for use cases for custom dictionaries other
than preserving field orderings".

I'd call that a standard case of over-generalization - a
rather popular syndrome in Python-land :-) - but that left aside:
the first part is very true. Python has always tried
to make the most common use case simple, so asking programmers to
use a meta-class to be able to access the order of definitions
in a class definition isn't exactly what the normal Python
programmer would expect.

Named tuples and similar sequence/mapping hybrids could probably
also benefit from having the order of definition readily
available, either directly via an odict cls.__dict__ or
via a new attribute cls.__deforder__ which provides the order
information in the form of a tuple of cls.__dict__ keys.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 28 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From jh at improva.dk  Thu Oct 28 12:12:55 2010
From: jh at improva.dk (Jacob Holm)
Date: Thu, 28 Oct 2010 12:12:55 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CC858F3.4000602@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>	<AANLkTi=EMCDcE2bH6ENJLjyxyP-ZqP5c2L7wOni6LC0G@mail.gmail.com>
	<4CC858F3.4000602@improva.dk>
Message-ID: <4CC94CA7.2030001@improva.dk>

On 2010-10-27 18:53, Jacob Holm wrote:
> Hmm.  This got me thinking.  One thing I'd really like to see in python
> is something like the "channel" object from the go language
> (http://golang.org/).
> 
> Based on PEP 380 or Gregs new cofunctions PEP (or perhaps even without
> any of them) it is possible to write a trampoline-based implementation
> of a channel object with "send" and "next" methods that work as
> expected.  One thing that is *not* possible (I think) is to make that
> object iterable.  Your wild idea above gave me a similar wild idea of my
> own.  An extension to the cofunctions PEP that would make that possible.
> 

Seems like I screwed up the semantics of the standard for-loop in that
version.  Let me try again...

1) Add new exception StopCoIteration, inheriting from StandardError.
Change the regular StopIteration to inherit from the new exception
instead of directly from StandardError.  This ensures that code that
catches StopCoIteration also catches StopIteration, which I think is
what we want.

The new exception is needed because "cocall func()" can never raise the
regular StopIteration (or any subclass thereof).  This might actually be
an argument for using a different exception for returning a value from a
coroutine...

2) Allow __next__ on an object to be a cofunction.  Add a __cocall__ to
the built-in next(ob) that tries to use cocall to call ob.__next__.

def next__cocall__(ob, *args):
    if len(args)>1:
        raise TypeError
    try:
        _next = type(ob).__next__
    except AttributeError:
        raise TypeError
    try:
        return cocall _next(ob)
    except StopCoIteration:
        if args:
            return args[0]
        raise

2a) Optionally allow __iter__ on an object to be a cofunction.  Add a
__cocall__ to the builtin iter.

   class _func_iter(object):
       def __init__(self, callable, sentinel):
           self.callable = callable
           self.sentinel = sentinel
       def __next__(self):
           v = cocall self.callable()
           if v is self.sentinel:
               raise StopCoIteration
           return v

   def iter__cocall__(*args):
       try:
           ob, = args
       except ValueError:
           try:
               callable, sentinel = args
           except ValueError:
               raise TypeError
           return _func_iter(callable, sentinel)
       try:
           _iter = type(ob).__iter__
       except AttributeError:
           raise TypeError
       return cocall _iter(ob)

3) Change the for-loop in a cofunction:

   for val in iterable:
       <block>
   else:
       <block>

so it expands into:

   _it = cocall iter(iterable)
   while True:
       try:
           val = cocall next(_it)
       except StopCoIteration:
           break
       <block>
   else:
       <block>

which is exactly the normal expansion, but using cocall to call iter and
next, and catching StopCoIteration instead of StopIteration.

Since cocall falls back to using a regular call, this should work well
with all normal iterables.

3a)  Alternatively define a new syntax for "coiterating", e.g.

    cocall for val in iterable:
        <block>
    else:
        <block>



All this to make it possible to write a code like this:


def consumer(ch):
    cocall for val in ch:
        print(val)

def producer(ch):
    cocall for val in range(10):
        cocall ch.send(val)

def main():
    sched = scheduler()
    ch = channel()
    sched.add(consumer(ch))
    sched.add(producer(ch))
    sched.run()


Thoughts?

- Jacob


From solipsis at pitrou.net  Thu Oct 28 13:10:54 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 28 Oct 2010 13:10:54 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com> <20101028111936.1629eade@o>
Message-ID: <20101028131054.22e8ba25@pitrou.net>

On Thu, 28 Oct 2010 11:19:36 +0200
spir <denis.spir at gmail.com> wrote:
> On Thu, 28 Oct 2010 10:13:09 +0200
> "M.-A. Lemburg" <mal at egenix.com> wrote:
> 
> > Ordered dicts are a lot slower than normal dictionaries. I don't
> > think that we can make such a change unless we want to make
> > Python a lot slower at the same time.
> 
> Ruby has ordered hashes since 1.9 with apparently no relevant
> performance loss

Performance would probably not suffer on micro-benchmarks (with
everything fitting in the CPU's L1 cache), but making dicts bigger
(by 66%: 5 pointer-sized fields per hash entry instead of 3) could
be detrimental in real life workloads.

Regards

Antoine.




From mal at egenix.com  Thu Oct 28 13:52:35 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 28 Oct 2010 13:52:35 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <20101028131054.22e8ba25@pitrou.net>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com> <20101028111936.1629eade@o>
	<20101028131054.22e8ba25@pitrou.net>
Message-ID: <4CC96403.6030106@egenix.com>

Antoine Pitrou wrote:
> On Thu, 28 Oct 2010 11:19:36 +0200
> spir <denis.spir at gmail.com> wrote:
>> On Thu, 28 Oct 2010 10:13:09 +0200
>> "M.-A. Lemburg" <mal at egenix.com> wrote:
>>
>>> Ordered dicts are a lot slower than normal dictionaries. I don't
>>> think that we can make such a change unless we want to make
>>> Python a lot slower at the same time.
>>
>> Ruby has ordered hashes since 1.9 with apparently no relevant
>> performance loss
> 
> Performance would probably not suffer on micro-benchmarks (with
> everything fitting in the CPU's L1 cache), but making dicts bigger
> (by 66%: 5 pointer-sized fields per hash entry instead of 3) could
> be detrimental in real life workloads.

For function calls, yes. For class creation, I doubt that a few
extra bytes would make much difference in real life - classes typically
don't have thousands of methods or attributes :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 28 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From solipsis at pitrou.net  Thu Oct 28 14:10:07 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 28 Oct 2010 14:10:07 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <4CC96403.6030106@egenix.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com> <20101028111936.1629eade@o>
	<20101028131054.22e8ba25@pitrou.net>  <4CC96403.6030106@egenix.com>
Message-ID: <1288267807.3705.0.camel@localhost.localdomain>


> > Performance would probably not suffer on micro-benchmarks (with
> > everything fitting in the CPU's L1 cache), but making dicts bigger
> > (by 66%: 5 pointer-sized fields per hash entry instead of 3) could
> > be detrimental in real life workloads.
> 
> For function calls, yes. For class creation, I doubt that a few
> extra bytes would make much difference in real life - classes typically
> don't have thousands of methods or attributes :-)

Right. I was talking about the prospect of making dicts ordered by
default.

Regards

Antoine.




From ncoghlan at gmail.com  Thu Oct 28 14:18:25 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 28 Oct 2010 22:18:25 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CC939E5.5070700@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
Message-ID: <AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>

On Thu, Oct 28, 2010 at 6:52 PM, Jacob Holm <jh at improva.dk> wrote:
> On 2010-10-28 00:52, Nick Coghlan wrote:
>> On Thu, Oct 28, 2010 at 6:22 AM, Jacob Holm <jh at improva.dk> wrote:
>>> Actually, AFAICT outer_broken will *not* give a RuntimeError on close()
>>> after next().  This is due to the special-casing of GeneratorExit in PEP
>>> 380.  That special-casing is also the basis for both my suggested
>>> modifications.
>>
>> Ah, you're quite right - I'd completely forgotten about the
>> GeneratorExit special-casing in the PEP 380 semantics, so I was
>> arguing from a faulty premise. With that error corrected, I can
>> happily withdraw my objection to idioms that convert GeneratorExit to
>> StopIteration (since any yield from expressions will reraise the
>> GeneratorExit in that case).
>>
>
> Looks like we are still not on exactly the same page though...  You seem
> to be arguing from the version at
> http://www.python.org/dev/peps/pep-0380, whereas I am looking at
> http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/attachment-0001.txt,
> which is newer.

Ah, the comment earlier in the thread about the PEP not being up to
date with the last discussion makes more sense now...

Still, the revised expansion also does the right thing in the case
that was originally bothering me, and I agree with your suggested
tweak to that version. I've cc'ed Greg directly on this email - if he
wants, I can check in an updated version of the PEP to bring the
python.org version up to speed with the later discussions.

With that small change to the yield from expansion, as well as the
change to close to return the first argument to StopIteration (if any)
and None otherwise, I think PEP 380 will be in a much better position
to support user experimentation in this area.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ncoghlan at gmail.com  Thu Oct 28 14:44:31 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 28 Oct 2010 22:44:31 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTikYuhb=oE14xhZ6QMvR1qTrx0_h+5ApD7y7OZEL@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<AANLkTikYuhb=oE14xhZ6QMvR1qTrx0_h+5ApD7y7OZEL@mail.gmail.com>
Message-ID: <AANLkTinP8PpXgR3EHm2U2JeSGgC8-K_SByMqQ41qi1Oo@mail.gmail.com>

On Thu, Oct 28, 2010 at 9:46 AM, Guido van Rossum <guido at python.org> wrote:
> Nick & Jacob,
>
> Unfortunately other things are in need of my attention and I am
> quickly lagging behind on this thread.
>
> I'll try to respond to some issues without specific quoting.
>
> If GeneratorReturn and finish() can be implemented in pure user code,
> then I think it should be up to every (framework) developer to provide
> their own API, using whatever constraints they chose. Without specific
> use cases it's hard to reason about API design. Still, I think it is
> reasonable to offer some basic behavior on the generator object, and I
> still think that the best compromise here is to let g.close() extract
> the return value from StopIteration if it catches it. If a framework
> decides not to use this, fine. For a user working without a framework
> this is still just a little nicer than having to figure out the
> required logic yourself.

Yep, we've basically agreed on that as the way forward as well. We
have a small tweak to suggest for PEP 380 to avoid losing the return
value from inner close() calls, and I've cc'ed Greg directly on the
relevant message in order to move that idea forward (and bring the
python.org version of the PEP up to date with the last posted version
as well).

That should provide a solid foundation for experimentation in user
code in 3.3 without overcomplicating PEP 380 with stuff that will
probably end up being YAGNI.

> I would be in favor of adding an introspection API to distinguish
> these four states and I think it would be a fine thing to add to
> Python 3.2 if anyone finds the time to produce a patch (Nick? You
> showed what these boil down to.)

I've created a tracker issue proposing a simple
inspect.getgeneratorstate() function (issue 10220).
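
A sketch of what such a function could look like, based on the
gi_frame/gi_running checks quoted earlier in the thread (the state
names here are placeholders, not necessarily the final API):

GEN_CREATED   = 'GEN_CREATED'     # not yet started
GEN_RUNNING   = 'GEN_RUNNING'     # currently executing
GEN_SUSPENDED = 'GEN_SUSPENDED'   # paused at a yield
GEN_CLOSED    = 'GEN_CLOSED'      # exhausted or closed

def getgeneratorstate(gen):
    if gen.gi_running:
        return GEN_RUNNING
    if gen.gi_frame is None:
        return GEN_CLOSED
    if gen.gi_frame.f_lasti == -1:
        return GEN_CREATED
    return GEN_SUSPENDED

def g():
    yield

gen = g()
print(getgeneratorstate(gen))   # GEN_CREATED
next(gen)
print(getgeneratorstate(gen))   # GEN_SUSPENDED
gen.close()
print(getgeneratorstate(gen))   # GEN_CLOSED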

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From guido at python.org  Thu Oct 28 16:58:08 2010
From: guido at python.org (Guido van Rossum)
Date: Thu, 28 Oct 2010 07:58:08 -0700
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <1288267807.3705.0.camel@localhost.localdomain>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com> <20101028111936.1629eade@o>
	<20101028131054.22e8ba25@pitrou.net> <4CC96403.6030106@egenix.com>
	<1288267807.3705.0.camel@localhost.localdomain>
Message-ID: <AANLkTi=856DpeZ1nHT9YU0Na8O4B0EzALs3roENZHUA5@mail.gmail.com>

Let's see if someone can come up with an ordereddict implemented in C
first and then benchmark the hell out of it.

Once its performance is acceptable we can talk about using it for
keyword args, class dicts, or even make it the one and only dict
object -- but the latter would be a really high bar to pass.

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Thu Oct 28 17:04:55 2010
From: guido at python.org (Guido van Rossum)
Date: Thu, 28 Oct 2010 08:04:55 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTinP8PpXgR3EHm2U2JeSGgC8-K_SByMqQ41qi1Oo@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<AANLkTikYuhb=oE14xhZ6QMvR1qTrx0_h+5ApD7y7OZEL@mail.gmail.com>
	<AANLkTinP8PpXgR3EHm2U2JeSGgC8-K_SByMqQ41qi1Oo@mail.gmail.com>
Message-ID: <AANLkTi=35rn3GB-AbQmYMtBbNRTeKbfxB1ANJ2Hyn4Nm@mail.gmail.com>

On Thu, Oct 28, 2010 at 5:44 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Yep, we've basically agreed on that as the way forward as well. We
> have a small tweak to suggest for PEP 380 to avoid losing the return
> value from inner close() calls,

This is my "gclose()" function, right? Or is there more to it?

> and I've cc'ed Greg directly on the
> relevant message in order to move that idea forward (and bring the
> python.org version of the PEP up to date with the last posted version
> as well).

Greg's been remarkably quiet on this thread even though I cc'ed him
early on. Have you heard back from him yet?

> That should provide a solid foundation for experimentation in user
> code in 3.3 without overcomplicating PEP 380 with stuff that will
> probably end up being YAGNI.
>
>> I would be in favor of adding an introspection API to distinguish
>> these four states and I think it would be a fine thing to add to
>> Python 3.2 if anyone finds the time to produce a patch (Nick? You
>> showed what these boil down to.)
>
> I've created a tracker issue proposing a simple
> inspect.getgeneratorstate() function (issue 10220).

I added a little something to the issue.

-- 
--Guido van Rossum (python.org/~guido)


From denis.spir at gmail.com  Thu Oct 28 19:58:59 2010
From: denis.spir at gmail.com (spir)
Date: Thu, 28 Oct 2010 19:58:59 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <AANLkTi=856DpeZ1nHT9YU0Na8O4B0EzALs3roENZHUA5@mail.gmail.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com> <20101028111936.1629eade@o>
	<20101028131054.22e8ba25@pitrou.net> <4CC96403.6030106@egenix.com>
	<1288267807.3705.0.camel@localhost.localdomain>
	<AANLkTi=856DpeZ1nHT9YU0Na8O4B0EzALs3roENZHUA5@mail.gmail.com>
Message-ID: <20101028195859.738d22b8@o>

On Thu, 28 Oct 2010 07:58:08 -0700
Guido van Rossum <guido at python.org> wrote:

> Let's see if someone can come up with an ordereddict implemented in C
> first and then benchmark the hell out of it.
> 
> Once its performance is acceptable we can talk about using it for
> keyword args, class dicts, or even make it the one and only dict
> object -- but the latter would be a really high bar to pass.
> 

What does the current implementation use as buckets?


denis
-- -- -- -- -- -- --
vit esse estrany ?

spir.wikidot.com



From solipsis at pitrou.net  Thu Oct 28 20:06:05 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 28 Oct 2010 20:06:05 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com> <20101028111936.1629eade@o>
	<20101028131054.22e8ba25@pitrou.net> <4CC96403.6030106@egenix.com>
	<1288267807.3705.0.camel@localhost.localdomain>
	<AANLkTi=856DpeZ1nHT9YU0Na8O4B0EzALs3roENZHUA5@mail.gmail.com>
	<20101028195859.738d22b8@o>
Message-ID: <20101028200605.14e829e3@pitrou.net>

On Thu, 28 Oct 2010 19:58:59 +0200
spir <denis.spir at gmail.com> wrote:
> On Thu, 28 Oct 2010 07:58:08 -0700
> Guido van Rossum <guido at python.org> wrote:
> 
> > Let's see if someone can come up with an ordereddict implemented in C
> > first and then benchmark the hell out of it.
> > 
> > Once its performance is acceptable we can talk about using it for
> > keyword args, class dicts, or even make it the one and only dict
> > object -- but the latter would be a really high bar to pass.
> > 
> 
> What does the current implementation use as buckets?

It uses an open addressing strategy. Each dict entry holds three
pointer-sized fields: key object, value object, and cached hash value
of the key.
(set entries have only two fields, since they don't hold a value object)

You'll find details in Include/dictobject.h and Objects/dictobject.c.
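
For the flavour of it without reading the C, here is a toy pure-Python
model of that layout (linear probing only; CPython's actual probe
sequence is more elaborate, and tables are resized as they fill up):

SIZE = 8
table = [None] * SIZE    # each slot: None or (cached_hash, key, value)

def insert(key, value):
    h = hash(key)
    i = h % SIZE
    while table[i] is not None and table[i][1] != key:
        i = (i + 1) % SIZE          # probe the next slot on collision
    table[i] = (h, key, value)      # cache the hash with the entry

def lookup(key):
    h = hash(key)
    i = h % SIZE
    while table[i] is not None:
        slot_hash, slot_key, slot_value = table[i]
        if slot_hash == h and slot_key == key:  # cheap hash check first
            return slot_value
        i = (i + 1) % SIZE
    raise KeyError(key)

insert('a', 1)
insert('b', 2)
print(lookup('a'), lookup('b'))   # 1 2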

Regards

Antoine.




From raymond.hettinger at gmail.com  Thu Oct 28 20:10:24 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Thu, 28 Oct 2010 11:10:24 -0700
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <4CC96403.6030106@egenix.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com> <20101028111936.1629eade@o>
	<20101028131054.22e8ba25@pitrou.net> <4CC96403.6030106@egenix.com>
Message-ID: <74CCD6FE-E625-4CDB-B9EE-9DA6D30713EB@gmail.com>


On Oct 28, 2010, at 4:52 AM, M.-A. Lemburg wrote:

> Antoine Pitrou wrote:
>> On Thu, 28 Oct 2010 11:19:36 +0200
>> spir <denis.spir at gmail.com> wrote:
>>> On Thu, 28 Oct 2010 10:13:09 +0200
>>> "M.-A. Lemburg" <mal at egenix.com> wrote:
>>> 
>>>> Ordered dicts are a lot slower than normal dictionaries. I don't
>>>> think that we can make such a change unless we want to make
>>>> Python a lot slower at the same time.
>>> 
>>> Ruby has ordered hashes since 1.9 with apparently no relevant
>>> performance loss
>> 
>> Performance would probably not suffer on micro-benchmarks (with
>> everything fitting in the CPU's L1 cache), but making dicts bigger
>> (by 66%: 5 pointer-sized fields per hash entry instead of 3) could
>> be detrimental in real life workloads.
> 
> For function calls, yes. For class creation, I doubt that a few
> extra bytes would make much difference in real life - classes typically
> don't have thousands of methods or attributes :-)

Last year, I experimented with this design (changing the dict implementation
to be ordered by adding two fields for links).   The effects are:

* The expected 66% increase in size was unavoidable for large dicts.

* For smaller dicts the link fields used indices instead of pointers
   and those indices were smaller than the existing fields (i.e. 8 bits
   per entry for tables under 256 rows, 16 bits per entry for tables under
   65k rows).

* Iteration speed improved for smaller dicts because we don't have 
   to examine empty slots (we also get to eliminate the "search
   finger" hack).  For larger dicts, results were mixed (because of the
   loss of locality of access).


Raymond

 



From jimjjewett at gmail.com  Thu Oct 28 20:44:59 2010
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 28 Oct 2010 14:44:59 -0400
Subject: [Python-ideas] dict changes [was: Ordered storage of keyword
	arguments]
Message-ID: <AANLkTimysnRQXGn1f0TCc1VMB_T7q3EXpY43ecJK8B+k@mail.gmail.com>

On 10/28/10, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Thu, 28 Oct 2010 19:58:59 +0200
> spir <denis.spir at gmail.com> wrote:

>> What does the current implementation use as buckets?

> It uses an open addressing strategy. Each dict entry holds three
> pointer-sized fields: key object, value object, and cached hash value
> of the key.
> (set entries have only two fields, since they don't hold a value object)

Has anyone benchmarked not storing the hash value here?

For a string dict, that hash should already be available on the string
object itself, so it is redundant.  Keeping it obviously improves
cache locality, but ... it also makes the dict objects 50% larger, and
there is a chance that the strings themselves would already be in
cache anyhow.  And if strings were reliably interned, the comparison
check should normally just be a pointer compare -- possibly fast
enough that the "different hash" shortcut doesn't buy anything.
[caveats about still needing to go to the slower dict implementation
for string subclasses]

-jJ


From raymond.hettinger at gmail.com  Thu Oct 28 21:06:38 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Thu, 28 Oct 2010 12:06:38 -0700
Subject: [Python-ideas] dict changes [was: Ordered storage of keyword
	arguments]
In-Reply-To: <AANLkTimysnRQXGn1f0TCc1VMB_T7q3EXpY43ecJK8B+k@mail.gmail.com>
References: <AANLkTimysnRQXGn1f0TCc1VMB_T7q3EXpY43ecJK8B+k@mail.gmail.com>
Message-ID: <02DB84F9-2D62-4A57-BA78-728C4E3ED399@gmail.com>


On Oct 28, 2010, at 11:44 AM, Jim Jewett wrote:
> 
>> It uses an open addressing strategy. Each dict entry holds three
>> pointer-sized fields: key object, value object, and cached hash value
>> of the key.
>> (set entries have only two fields, since they don't hold a value object)
> 
> Has anyone benchmarked not storing the hash value here?

That would be a small disaster.  Either you call PyObject_Hash()
for every probe (adding function call overhead for int and str,
and adding tons of work for other types) or you can go directly
to Py_RichCompareBool() which is never fast.

I haven't timed this for dicts, but I did see major speed boosts
in the performance of set-to-set operations when the internally
stored hash was used instead of calling PyObject_Hash().

Raymond





From greg.ewing at canterbury.ac.nz  Thu Oct 28 22:14:46 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Oct 2010 09:14:46 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CC858F3.4000602@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<AANLkTi=EMCDcE2bH6ENJLjyxyP-ZqP5c2L7wOni6LC0G@mail.gmail.com>
	<4CC858F3.4000602@improva.dk>
Message-ID: <4CC9D9B6.8000005@canterbury.ac.nz>

Jacob Holm wrote:

> 1) Define a new "coiterator" protocol, consisting of a new special
> method __conext__, and a new StopCoIteration exception that the regular
> StopIteration inherits from.

I don't think it's necessary to have a new protocol. All
that's needed is to allow for the possibility of the
__next__ method of an iterator being a cofunction.

Under the current version of PEP 3152, with an explicit
"cocall" operation, this would require a new kind of
for-loop. Maybe using "cofor"?

However, my current thinking on cofunctions is that
cocalls should be implicit -- you declare a cofunction
with "codef", and any call made within it can potentially
be a cocall. In that case, there would be no need for new
syntax -- the existing for-loop could just do the right
thing when given an object whose __next__ method is a
cofunction.

Thinking about this has made me even more sure that
implicit cocalls are the way to go, because it means
that any other things we think of that need to take
cofunctions into account can be fixed without having
to introduce new syntax for each one.

-- 
Greg


From debatem1 at gmail.com  Thu Oct 28 22:17:22 2010
From: debatem1 at gmail.com (geremy condra)
Date: Thu, 28 Oct 2010 13:17:22 -0700
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <74CCD6FE-E625-4CDB-B9EE-9DA6D30713EB@gmail.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com> <20101028111936.1629eade@o>
	<20101028131054.22e8ba25@pitrou.net> <4CC96403.6030106@egenix.com>
	<74CCD6FE-E625-4CDB-B9EE-9DA6D30713EB@gmail.com>
Message-ID: <AANLkTikOt0-Xr8QJjrMBGquYUM-ePjWzTQNczsKGBsUD@mail.gmail.com>

On Thu, Oct 28, 2010 at 11:10 AM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
>
> On Oct 28, 2010, at 4:52 AM, M.-A. Lemburg wrote:
>
>> Antoine Pitrou wrote:
>>> On Thu, 28 Oct 2010 11:19:36 +0200
>>> spir <denis.spir at gmail.com> wrote:
>>>> On Thu, 28 Oct 2010 10:13:09 +0200
>>>> "M.-A. Lemburg" <mal at egenix.com> wrote:
>>>>
>>>>> Ordered dicts are a lot slower than normal dictionaries. I don't
>>>>> think that we can make such a change unless we want to make
>>>>> Python a lot slower at the same time.
>>>>
>>>> Ruby has ordered hashes since 1.9 with apparently no relevant
>>>> performance loss
>>>
>>> Performance would probably not suffer on micro-benchmarks (with
>>> everything fitting in the CPU's L1 cache), but making dicts bigger
>>> (by 66%: 5 pointer-sized fields per hash entry instead of 3) could
>>> be detrimental in real life workloads.
>>
>> For function calls, yes. For class creation, I doubt that a few
>> extra bytes would make much difference in real life - classes typically
>> don't have thousands of methods or attributes :-)
>
> Last year, I experimented with this design (changing the dict implementation
> to be ordered by adding two fields for links).   The effects are:
>
> * The expected 66% increase in size was unavoidable for large dicts.
>
> * For smaller dicts the link fields used indices instead of pointers
>   and those indices were smaller than the existing fields (i.e. 8 bits
>   per entry for tables under 256 rows, 16 bits per entry for tables under
>   65k rows).
>
> * Iteration speed improved for smaller dicts because we don't have
>   to examine empty slots (we also get to eliminate the "search
>   finger" hack).  For larger dicts, results were mixed (because of the
>   loss of locality of access).
>
>
> Raymond

Is this available somewhere? I'd like to play around with this for a bit.

Geremy Condra


From greg.ewing at canterbury.ac.nz  Thu Oct 28 23:22:39 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Oct 2010 10:22:39 +1300
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <4CC941FF.6070408@egenix.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com>
	<AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>
	<4CC941FF.6070408@egenix.com>
Message-ID: <4CC9E99F.5030805@canterbury.ac.nz>

M.-A. Lemburg wrote:
> Python has always tried
> to make the most common use case simple, so asking programmers to
> use a meta-class to be able to access the order of definitions
> in a class definition isn't exactly what the normal Python
> programmer would expect.

But needing to know the order of definitions in a class
is a very uncommon thing to want to do in the first
place.

-- 
Greg


From greg.ewing at canterbury.ac.nz  Thu Oct 28 23:37:17 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Oct 2010 10:37:17 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CC94CA7.2030001@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<AANLkTi=EMCDcE2bH6ENJLjyxyP-ZqP5c2L7wOni6LC0G@mail.gmail.com>
	<4CC858F3.4000602@improva.dk> <4CC94CA7.2030001@improva.dk>
Message-ID: <4CC9ED0D.1010800@canterbury.ac.nz>

Jacob Holm wrote:

> The new exception is needed because "cocall func()" can never raise the
> regular StopIteration (or any subclass thereof).

Botheration, I hadn't thought of that!

I'll have to think about this one. I still feel that it
shouldn't be necessary to define any new protocol -- one
ought to be able to simply write a __next__ cofunction that
looks like a normal one in all respects except that it's
defined with 'codef'.

Maybe a StopIteration raised inside a cofunction shouldn't
be synonymous with a return, but instead should be caught
and tunnelled around the yield-from via another exception.

-- 
Greg


From greg.ewing at canterbury.ac.nz  Fri Oct 29 00:03:45 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Oct 2010 11:03:45 +1300
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
Message-ID: <4CC9F341.8020404@canterbury.ac.nz>

Chris Rose wrote:
> I'd like to resurrect a discussion that went on a little over a year
> ago [1] started by Michael Foord suggesting that it'd be nice if
> keyword arguments' storage was implemented as an ordered dict as
> opposed to the current unordered form.

What's the use case for this? One of the reasons that keyword
arguments are useful is that you don't have to care what order
you write them in!

-- 
Greg


From solipsis at pitrou.net  Fri Oct 29 00:11:51 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 29 Oct 2010 00:11:51 +0200
Subject: [Python-ideas] dict changes [was: Ordered storage of keyword
	arguments]
In-Reply-To: <AANLkTimysnRQXGn1f0TCc1VMB_T7q3EXpY43ecJK8B+k@mail.gmail.com>
References: <AANLkTimysnRQXGn1f0TCc1VMB_T7q3EXpY43ecJK8B+k@mail.gmail.com>
Message-ID: <1288303911.3753.9.camel@localhost.localdomain>

On Thursday, 28 October 2010 at 14:44 -0400, Jim Jewett wrote:
> 
> For a string dict, that hash should already be available on the string
> object itself, so it is redundant.  Keeping it obviously improves
> cache locality, but ... it also makes the dict objects 50% larger, and
> there is a chance that the strings themselves would already be in
> cache anyhow.  And if strings were reliably interned, the comparison
> check should normally just be a pointer compare -- possibly fast
> enough that the "different hash" shortcut doesn't buy anything.
> [caveats about still needing to go to the slower dict implementation
> for string subclasses]

I've thought about that. The main annoyance is switching
transparently between the two implementations. But I think it would be
interesting to pursue that effort, since indeed dicts with interned keys
are the most common case of dicts in the average Python workload. Saving
1/3 of the memory size on these dicts would be worthwhile IMO.

(addressing itself would perhaps be a bit simpler, because of
multiplying by 8 or 16 instead of multiplying by 12 or 24. But I doubt
the difference would be noticeable)

Regards

Antoine.




From greg.ewing at canterbury.ac.nz  Fri Oct 29 00:43:19 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Oct 2010 11:43:19 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
Message-ID: <4CC9FC87.1040600@canterbury.ac.nz>

Nick Coghlan wrote:

> On Thu, Oct 28, 2010 at 6:52 PM, Jacob Holm <jh at improva.dk> wrote:

>>Looks like we are still not on exactly the same page though...  You seem
>>to be arguing from the version at
>>http://www.python.org/dev/peps/pep-0380, whereas I am looking at
>>http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/attachment-0001.txt,
>>which is newer.
> 
> Still, the revised expansion also does the right thing in the case
> that was originally bothering me,

That attachment is slightly older than my own current draft,
which is attached below. The differences in the expansion are
as follows (- is the version linked to above, + is my current
version):

@@ -141,20 +141,21 @@
                  _s = yield _y
              except GeneratorExit as _e:
                  try:
-                    _m = getattr(_i, 'close')
+                    _m = _i.close
                  except AttributeError:
                      pass
                  else:
                      _m()
                  raise _e
              except BaseException as _e:
+                _x = sys.exc_info()
                  try:
-                    _m = getattr(_i, 'throw')
+                    _m = _i.throw
                  except AttributeError:
                      raise _e
                  else:
                      try:
-                        _y = _m(*sys.exc_info())
+                        _y = _m(*_x)
                      except StopIteration as _e:
                          _r = _e.value
                          break

Does this version still address your concerns? If so, please
check it in as the latest version.

-- 
Greg

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: yield-from-rev14.txt
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20101029/f8a1a501/attachment.txt>

From jh at improva.dk  Fri Oct 29 02:45:00 2010
From: jh at improva.dk (Jacob Holm)
Date: Fri, 29 Oct 2010 02:45:00 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CC9D9B6.8000005@canterbury.ac.nz>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>	<AANLkTi=EMCDcE2bH6ENJLjyxyP-ZqP5c2L7wOni6LC0G@mail.gmail.com>	<4CC858F3.4000602@improva.dk>
	<4CC9D9B6.8000005@canterbury.ac.nz>
Message-ID: <4CCA190C.6050709@improva.dk>

On 2010-10-28 22:14, Greg Ewing wrote:
> Jacob Holm wrote:
> 
>> 1) Define a new "coiterator" protocol, consisting of a new special
>> method __conext__, and a new StopCoIteration exception that the regular
>> StopIteration inherits from.
> 
> I don't think it's necessary to have a new protocol. All
> that's needed is to allow for the possibility of the
> __next__ method of an iterator being a cofunction.
> 

That is more or less exactly what I did for my second version.  Turns
out to be less simple than that, because you need to make "next" work
as a cofunction as well, and there is a problem with raising
StopIteration from a cofunction.


> Under the current version of PEP 3152, with an explicit
> "cocall" operation, this would require a new kind of
> for-loop. Maybe using "cofor"?
> 
> However, my current thinking on cofunctions is that
> cocalls should be implicit -- you declare a cofunction
> with "codef", and any call made within it can potentially
> be a cocall. In that case, there would be no need for new
> syntax -- the existing for-loop could just do the right
> thing when given an object whose __next__ method is a
> cofunction.
> 
> Thinking about this has made me even more sure that
> implicit cocalls are the way to go, because it means
> that any other things we think of that need to take
> cofunctions into account can be fixed without having
> to introduce new syntax for each one.
> 

Yes.  Looking at a few examples using my toy implementation of Go
channels made me realise just how awkward it would be to have to mark
all cocall sites explicitly.

With implicit cocalls and a for-loop changed to work with a cofunction
__next__, working with channels can be made to look exactly like working
with generators.  For me, that would be a major selling point for the PEP.

- Jacob


From ncoghlan at gmail.com  Fri Oct 29 02:54:38 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 29 Oct 2010 10:54:38 +1000
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <4CC9F341.8020404@canterbury.ac.nz>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC9F341.8020404@canterbury.ac.nz>
Message-ID: <AANLkTineSurioMAdDnEB3PvUMEqnv=HiBgJiaUg0A9GK@mail.gmail.com>

On Fri, Oct 29, 2010 at 8:03 AM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> What's the use case for this? One of the reasons that keyword
> arguments are useful is that you don't have to care what order
> you write them in!

The use case is being able to interface naturally with any key-value
API where order matters.

For example:

# Create an ordered dictionary (WRONG!)
d = OrderedDictionary(a=1, b=2, c=3)  # Order is actually arbitrary due to unordered kw dict

Another example is an addition made to part of the json API (to accept
an iterable of key-value pairs) to work around exactly this problem.
Basically, if an API accepts an iterable of key-value pairs instead of
a dictionary, it's a case where ordered keyword dictionaries would
likely improve usability.
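
A sketch of the difference, using the existing collections.OrderedDict:

    from collections import OrderedDict

    # An iterable of pairs preserves the caller's order:
    d1 = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
    print(list(d1))   # ['a', 'b', 'c'], guaranteed

    # Keyword arguments arrive in a plain unordered dict, so the
    # insertion order OrderedDict sees is arbitrary:
    d2 = OrderedDict(a=1, b=2, c=3)
    print(list(d2))   # some permutation of ['a', 'b', 'c']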

That said, there are plenty of steps to be taken before the idea of
using ordered dictionaries implicitly anywhere in the interpreter can
even be seriously considered. Step 1 is to come up with a
C-accelerated version of collections.OrderedDictionary, step 2 is to
make it a builtin (odict?), step 3 is to consider using it for class
namespaces and/or for keyword arguments by default, then step 4 would
probably be to switch "dict=odict" and add a
collections.UnorderedDictionary interface to the old dict
implementation. The bar for progression (in terms of acceptable
impacts on speed and memory usage) would get higher with each step
along the path.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ncoghlan at gmail.com  Fri Oct 29 03:17:15 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 29 Oct 2010 11:17:15 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=35rn3GB-AbQmYMtBbNRTeKbfxB1ANJ2Hyn4Nm@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<AANLkTikYuhb=oE14xhZ6QMvR1qTrx0_h+5ApD7y7OZEL@mail.gmail.com>
	<AANLkTinP8PpXgR3EHm2U2JeSGgC8-K_SByMqQ41qi1Oo@mail.gmail.com>
	<AANLkTi=35rn3GB-AbQmYMtBbNRTeKbfxB1ANJ2Hyn4Nm@mail.gmail.com>
Message-ID: <AANLkTimk1epcdTnOzTTndo_VLBcuntjk4X5hz-tNgmhi@mail.gmail.com>

On Fri, Oct 29, 2010 at 1:04 AM, Guido van Rossum <guido at python.org> wrote:
> On Thu, Oct 28, 2010 at 5:44 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Yep, we've basically agreed on that as the way forward as well. We
>> have a small tweak to suggest for PEP 380 to avoid losing the return
>> value from inner close() calls,
>
> This is my "gclose()" function, right? Or is there more to it?

Yeah, the idea is your gclose(), plus one extra tweak to the expansion of
"yield from" to store the result of the inner close() call on a new
GeneratorExit instance.

To use a toy example:

  # Even this toy framework needs a little structure
  class EndSum(Exception): pass

  def gsum():
    # Sums sent values until EndSum or GeneratorExit are thrown in
    tally = 0
    try:
      while 1:
        tally += yield
    except (EndSum, GeneratorExit):
      pass
    return x

  def average_sums():
    # Advances to a new sum when EndSum is thrown in
    # Finishes the last sum and averages them all when GeneratorExit is thrown in
    sums = []
    try:
      while 1:
        sums.append(yield from gsum())
    except GeneratorExit as ex:
      # Our proposed expansion tweak is to enable the next line
      sums.append(ex.args[0])
    return sum(sums) / len(sums)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From guido at python.org  Fri Oct 29 04:21:27 2010
From: guido at python.org (Guido van Rossum)
Date: Thu, 28 Oct 2010 19:21:27 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimk1epcdTnOzTTndo_VLBcuntjk4X5hz-tNgmhi@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<AANLkTikYuhb=oE14xhZ6QMvR1qTrx0_h+5ApD7y7OZEL@mail.gmail.com>
	<AANLkTinP8PpXgR3EHm2U2JeSGgC8-K_SByMqQ41qi1Oo@mail.gmail.com>
	<AANLkTi=35rn3GB-AbQmYMtBbNRTeKbfxB1ANJ2Hyn4Nm@mail.gmail.com>
	<AANLkTimk1epcdTnOzTTndo_VLBcuntjk4X5hz-tNgmhi@mail.gmail.com>
Message-ID: <AANLkTimohwjRwwsuhpTOgF48oiFRjXfjJocYWYE2+ej0@mail.gmail.com>

On Thu, Oct 28, 2010 at 6:17 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Fri, Oct 29, 2010 at 1:04 AM, Guido van Rossum <guido at python.org> wrote:
>> On Thu, Oct 28, 2010 at 5:44 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>> Yep, we've basically agreed on that as the way forward as well. We
>>> have a small tweak to suggest for PEP 380 to avoid losing the return
>>> value from inner close() calls,
>>
>> This is my "gclose()" function, right? Or is there more to it?
>
> Yeah, the idea is your gclose(), plus one extra tweak to the expansion of
> "yield from" to store the result of the inner close() call on a new
> GeneratorExit instance.
>
> To use a toy example:
>
>   # Even this toy framework needs a little structure
>   class EndSum(Exception): pass
>
>   def gsum():
>     # Sums sent values until EndSum or GeneratorExit are thrown in
>     tally = 0
>     try:
>       while 1:
>         tally += yield
>     except (EndSum, GeneratorExit):
>       pass
>     return x

You meant return tally. Right?

>   def average_sums():
>     # Advances to a new sum when EndSum is thrown in
>     # Finishes the last sum and averages them all when GeneratorExit is thrown in
>     sums = []
>     try:
>       while 1:
>         sums.append(yield from gsum())
>     except GeneratorExit as ex:
>       # Our proposed expansion tweak is to enable the next line
>       sums.append(ex.args[0])
>     return sum(sums) / len(sums)

Hmmm... That looks pretty complicated. Wouldn't it be much more
straightforward if instead of

 value ... value EndSum value ... value EndSum value ... value GeneratorExit

the input sequence was required to be

 value ... value EndSum value ... value EndSum value ... value
*EndSum* GeneratorExit

?

Then gsum() wouldn't have to catch EndSum at all, and I don't think
the PEP would have to special-case GeneratorExit. average_sums() could
simply have

  except GeneratorExit:
    return sum(sums) / len(sums)

After all this is a fairly arbitrary protocol and the caller
presumably can do whatever is required of it. If there are values
between the last EndSum and the last GeneratorExit those will be
ignored -- that is a case of garbage in garbage out. If you really
wanted to catch that mistake there would be several ways to translate
it reliably into some other exception -- or log it, or whatever.

It is also defensible that a better design of the protocol would not
require throwing EndSum but sending some agreed-upon marker value.

-- 
--Guido van Rossum (python.org/~guido)


From cmjohnson.mailinglist at gmail.com  Fri Oct 29 05:17:14 2010
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Thu, 28 Oct 2010 17:17:14 -1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimohwjRwwsuhpTOgF48oiFRjXfjJocYWYE2+ej0@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<AANLkTikYuhb=oE14xhZ6QMvR1qTrx0_h+5ApD7y7OZEL@mail.gmail.com>
	<AANLkTinP8PpXgR3EHm2U2JeSGgC8-K_SByMqQ41qi1Oo@mail.gmail.com>
	<AANLkTi=35rn3GB-AbQmYMtBbNRTeKbfxB1ANJ2Hyn4Nm@mail.gmail.com>
	<AANLkTimk1epcdTnOzTTndo_VLBcuntjk4X5hz-tNgmhi@mail.gmail.com>
	<AANLkTimohwjRwwsuhpTOgF48oiFRjXfjJocYWYE2+ej0@mail.gmail.com>
Message-ID: <AANLkTikVyuPWsNmT9n4nsHUU6gFT8GBRJ_uKganP5iLB@mail.gmail.com>

On Thu, Oct 28, 2010 at 4:21 PM, Guido van Rossum <guido at python.org> wrote:
> On Thu, Oct 28, 2010 at 6:17 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> To use a toy example:
>>
>>   # Even this toy framework needs a little structure
>>   class EndSum(Exception): pass
>>
>>   def gsum():
>>     # Sums sent values until EndSum or GeneratorExit are thrown in
>>     tally = 0
>>     try:
>>       while 1:
>>         tally += yield
>>     except (EndSum, GeneratorExit):
>>       pass
>>     return x
>
> You meant return tally. Right?
>
>>   def average_sums():
>>     # Advances to a new sum when EndSum is thrown in
>>     # Finishes the last sum and averages them all when GeneratorExit is thrown in
>>     sums = []
>>     try:
>>       while 1:
>>         sums.append(yield from gsum())
>>     except GeneratorExit as ex:
>>       # Our proposed expansion tweak is to enable the next line
>>       sums.append(ex.args[0])
>>     return sum(sums) / len(sums)
>

This toy example is a little confusing to me because it has typos,
which is natural when one is writing a program without being able to
run it to debug it. So, I wrote a version of the accumulator/averager
that will work in Python 2.7 (and I think 3, but I didn't test it):


from contextlib import contextmanager

class ReturnValue(Exception): pass

def prime_pump(gen):
    def f(*args, **kwargs):
        g = gen(*args, **kwargs)
        next(g)
        return g
    return f

@prime_pump
def accumulator():
    total = 0
    length = 0
    try:
        while 1:
            value = yield
            total += value
            length += 1
            print(length, value, total)
    except GeneratorExit:
        r = ReturnValue()
        r.total = total
        r.length = length
        raise r

@contextmanager
def get_sum(it):
    try:
        it.close()
    except ReturnValue as r:
        yield r.total

@contextmanager
def get_average(it):
    try:
        it.close()
    except ReturnValue as r:
        yield r.total / r.length


def main():
    running_total = accumulator()
    sums = accumulator()
    running_total.send(6) #For example, whatever
    running_total.send(7)
    with get_sum(running_total) as first_sum:
        sums.send(first_sum)

    running_total = accumulator() #Zero it out

    running_total.send(2) #For example, whatever
    running_total.send(2)
    running_total.send(5)
    running_total.send(8)

    with get_sum(running_total) as second_sum:
        sums.send(second_sum)

    #Get the average of the sums
    with get_average(sums) as r:
        return r

main()

So, I guess the question I have is how will the proposed extensions to
the language make the above code prettier? One thing I can see is that
if it's possible to return from inside a generator, it can be more
straightforward to get the values out of the accumulator at the end:

    try:
        while 1:
            value = yield
            total += value
            length += 1
            print(length, value, total)
    except GeneratorExit:
        return total, length

With Guido's proposed "for item from yield" syntax, IIUC this can be
prettied up even more as:

    for value from yield:
        total += value
        length += 1
    return total, length

Are there other benefits to the proposed extensions? How will the call
sites be improved? I'm not sure how I would rewrite main() to be
prettier or clearer in light of the proposals.

Thanks,

-- Carl Johnson


From greg.ewing at canterbury.ac.nz  Fri Oct 29 06:26:30 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Oct 2010 17:26:30 +1300
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <AANLkTineSurioMAdDnEB3PvUMEqnv=HiBgJiaUg0A9GK@mail.gmail.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC9F341.8020404@canterbury.ac.nz>
	<AANLkTineSurioMAdDnEB3PvUMEqnv=HiBgJiaUg0A9GK@mail.gmail.com>
Message-ID: <4CCA4CF6.6020006@canterbury.ac.nz>

On 29/10/10 13:54, Nick Coghlan wrote:

> The use case is being able to interface naturally with any key-value
> API where order matters.
>
> For example:
>
> # Create an ordered dictionary (WRONG!)
> d = OrderedDictionary(a=1, b=2, c=3)  # Order is actually arbitrary due to unordered kw dict

I'd need convincing that the API wouldn't be better designed
to take something other than keyword arguments:

   d = OrderedDictionary(('a', 1), ('b', 2), ('c', 3))

and have it refuse to accept keyword arguments to prevent
accidents.
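
A minimal sketch of such a guard (a toy subclass for illustration, not
the real collections class):

   import collections

   class OrderedDictionary(collections.OrderedDict):
       # Refuse keyword arguments, since their order is not preserved
       def __init__(self, *pairs, **kwds):
           if kwds:
               raise TypeError("pass (key, value) pairs, not keywords")
           super().__init__(pairs)

   d = OrderedDictionary(('a', 1), ('b', 2), ('c', 3))
   print(list(d))   # ['a', 'b', 'c']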

-- 
Greg



From greg.ewing at canterbury.ac.nz  Fri Oct 29 06:35:20 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Oct 2010 17:35:20 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimohwjRwwsuhpTOgF48oiFRjXfjJocYWYE2+ej0@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<AANLkTikYuhb=oE14xhZ6QMvR1qTrx0_h+5ApD7y7OZEL@mail.gmail.com>
	<AANLkTinP8PpXgR3EHm2U2JeSGgC8-K_SByMqQ41qi1Oo@mail.gmail.com>
	<AANLkTi=35rn3GB-AbQmYMtBbNRTeKbfxB1ANJ2Hyn4Nm@mail.gmail.com>
	<AANLkTimk1epcdTnOzTTndo_VLBcuntjk4X5hz-tNgmhi@mail.gmail.com>
	<AANLkTimohwjRwwsuhpTOgF48oiFRjXfjJocYWYE2+ej0@mail.gmail.com>
Message-ID: <4CCA4F08.3020206@canterbury.ac.nz>

On 29/10/10 15:21, Guido van Rossum wrote:

>   value ... value EndSum value ... value EndSum value ... value
> *EndSum* GeneratorExit

Seems to me that anything requiring asking for intermediate values
while not stopping the computation entirely is going beyond what
can reasonably be supported with a generator. I wouldn't like to
see yield-from and/or the generator protocol made any more
complicated in order to allow such things.

-- 
Greg


From offline at offby1.net  Fri Oct 29 06:43:48 2010
From: offline at offby1.net (Chris Rose)
Date: Thu, 28 Oct 2010 22:43:48 -0600
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <4CCA4CF6.6020006@canterbury.ac.nz>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC9F341.8020404@canterbury.ac.nz>
	<AANLkTineSurioMAdDnEB3PvUMEqnv=HiBgJiaUg0A9GK@mail.gmail.com>
	<4CCA4CF6.6020006@canterbury.ac.nz>
Message-ID: <AANLkTimtsUBBXmoTPbjRo7x1OvVy0oqeq1FP9ygeYBr7@mail.gmail.com>

On Thu, Oct 28, 2010 at 10:26 PM, Greg Ewing
<greg.ewing at canterbury.ac.nz> wrote:
> On 29/10/10 13:54, Nick Coghlan wrote:
>
>> The use case is being able to interface naturally with any key-value
>> API where order matters.
>>
>> For example:
>>
>> # Create an ordered dictionary (WRONG!)
>> d = OrderedDictionary(a=1, b=2, c=3)  # Order is actually arbitrary due to unordered kw dict
>
> I'd need convincing that the API wouldn't be better designed
> to take something other than keyword arguments:
>
>   d = OrderedDictionary(('a', 1), ('b', 2), ('c', 3))
>
> and have it refuse to accept keyword arguments to prevent
> accidents.

I'm hard pressed to see how an ordered dict and a dict should be
expected to differ by such a degree; in every particular they behave
the same, except in the case of the OrderedDict you specify your
initial parameters in tuples? Eugh.

I'm not saying that it's a big enough gap to justify the amount of
work that so clearly needs to be done (and now that I've followed some
of the more in-depth comments here, as well as read over the
documentation in dictobject.c, I get a sense of how big a deal this
could end up being) but there's not a lot to be said for the current
weird behaviour of the ordered dict constructor.

-- 
Chris R.
======
Not to be taken literally, internally, or seriously.
Twitter: http://twitter.com/offby1


From rrr at ronadam.com  Fri Oct 29 06:25:46 2010
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 28 Oct 2010 23:25:46 -0500
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimk1epcdTnOzTTndo_VLBcuntjk4X5hz-tNgmhi@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>	<4CC6E94F.3090702@improva.dk>	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>	<AANLkTikYuhb=oE14xhZ6QMvR1qTrx0_h+5ApD7y7OZEL@mail.gmail.com>	<AANLkTinP8PpXgR3EHm2U2JeSGgC8-K_SByMqQ41qi1Oo@mail.gmail.com>	<AANLkTi=35rn3GB-AbQmYMtBbNRTeKbfxB1ANJ2Hyn4Nm@mail.gmail.com>
	<AANLkTimk1epcdTnOzTTndo_VLBcuntjk4X5hz-tNgmhi@mail.gmail.com>
Message-ID: <4CCA4CCA.1040809@ronadam.com>



On 10/28/2010 08:17 PM, Nick Coghlan wrote:
> On Fri, Oct 29, 2010 at 1:04 AM, Guido van Rossum<guido at python.org>  wrote:
>> On Thu, Oct 28, 2010 at 5:44 AM, Nick Coghlan<ncoghlan at gmail.com>  wrote:
>>> Yep, we've basically agreed on that as the way forward as well. We
>>> have a small tweak to suggest for PEP 380 to avoid losing the return
>>> value from inner close() calls,
>>
>> This is my "gclose()" function, right? Or is there more to it?
>
> Yeah, the idea is your gclose(), plus one extra tweak to the expansion of
> "yield from" to store the result of the inner close() call on a new
> GeneratorExit instance.
>
> To use a toy example:
>
>    # Even this toy framework needs a little structure
>    class EndSum(Exception): pass
>
>    def gsum():
>      # Sums sent values until EndSum or GeneratorExit are thrown in
>      tally = 0
>      try:
>        while 1:
>          tally += yield
>      except (EndSum, GeneratorExit):
>        pass
>      return x
>
>    def average_sums():
>      # Advances to a new sum when EndSum is thrown in
>      # Finishes the last sum and averages them all when GeneratorExit is thrown in
>      sums = []
>      try:
>        while 1:
>          sums.append(yield from gsum())
>      except GeneratorExit as ex:
>        # Our proposed expansion tweak is to enable the next line
>        sums.append(ex.args[0])
>      return sum(sums) / len(sums)

Nick, could you add a main() or calling routine?  I'm having trouble seeing 
the complete logic without that.

Cheers,
    Ron




From greg.ewing at canterbury.ac.nz  Fri Oct 29 07:25:56 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Oct 2010 18:25:56 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
Message-ID: <4CCA5AE4.7080403@canterbury.ac.nz>

Guido van Rossum wrote:

> I'd also like to convince you to change g.close() so that it captures
> and returns the return value from StopIteration if it has one.

Looking at this again, I find that I'm not really sure how
this impacts PEP 380. The current expansion specifies that
when a delegating generator is closed, the subgenerator's
close() method is called, any value it returns is ignored,
and GeneratorExit is re-raised.

If that close() call were to return a value, what do you
think should be done with it?

-- 
Greg


From stefan_ml at behnel.de  Fri Oct 29 07:45:16 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 29 Oct 2010 07:45:16 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <4CC9E99F.5030805@canterbury.ac.nz>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>	<4CC93095.3080704@egenix.com>	<AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>	<4CC941FF.6070408@egenix.com>
	<4CC9E99F.5030805@canterbury.ac.nz>
Message-ID: <iadn1d$gpq$1@dough.gmane.org>

Greg Ewing, 28.10.2010 23:22:
> M.-A. Lemburg wrote:
>> Python has always tried
>> to make the most common use case simple, so asking programmers to
>> use a meta-class to be able to access the order of definitions
>> in a class definition isn't exactly what the normal Python
>> programmer would expect.
>
> But needing to know the order of definitions in a class
> is a very uncommon thing to want to do in the first
> place.

Uncommon, sure, but there are use cases. A couple of Python-based DSLs
use classes as namespaces. Think of SOAP interface classes or database
table definitions. In these cases, users usually have the field/column
order in the back of their head when they write or read the code. So
it's actually a bit surprising and somewhat error-prone when the fields
show up in arbitrary (and unpredictable!) order at runtime. And even on
the same system, the order can change arbitrarily when new fields are
added.
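
For concreteness, a minimal sketch of the kind of workaround such DSLs
use today (the names are made up; the trick is a global creation
counter on the descriptor):

    import itertools

    class Column(object):
        _counter = itertools.count()
        def __init__(self):
            # Record creation order, since the class namespace
            # (a plain dict) won't remember it.
            self._index = next(self._counter)

    class Person(object):
        name = Column()
        age = Column()

    fields = sorted((k for k, v in vars(Person).items()
                     if isinstance(v, Column)),
                    key=lambda k: vars(Person)[k]._index)
    print(fields)   # ['name', 'age'], recovered despite the plain dict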

Stefan



From rrr at ronadam.com  Fri Oct 29 07:53:24 2010
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 29 Oct 2010 00:53:24 -0500
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimk1epcdTnOzTTndo_VLBcuntjk4X5hz-tNgmhi@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>	<4CC6E94F.3090702@improva.dk>	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>	<AANLkTikYuhb=oE14xhZ6QMvR1qTrx0_h+5ApD7y7OZEL@mail.gmail.com>	<AANLkTinP8PpXgR3EHm2U2JeSGgC8-K_SByMqQ41qi1Oo@mail.gmail.com>	<AANLkTi=35rn3GB-AbQmYMtBbNRTeKbfxB1ANJ2Hyn4Nm@mail.gmail.com>
	<AANLkTimk1epcdTnOzTTndo_VLBcuntjk4X5hz-tNgmhi@mail.gmail.com>
Message-ID: <4CCA6154.3030507@ronadam.com>



On 10/28/2010 08:17 PM, Nick Coghlan wrote:

>    def average_sums():
>      # Advances to a new sum when EndSum is thrown in
>      # Finishes the last sum and averages them all when GeneratorExit is thrown in
>      sums = []
>      try:
>        while 1:
>          sums.append(yield from gsum())

Wouldn't this need to be...

            gsum_ = gsum()
            next(gsum_)
            sums.append(yield from gsum_)

Or does the yield from allow send on a just started generator?

>      except GeneratorExit as ex:
>        # Our proposed expansion tweak is to enable the next line
>        sums.append(ex.args[0])
>      return sum(sums) / len(sums)


Ron


From stephen at xemacs.org  Fri Oct 29 08:28:07 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 29 Oct 2010 15:28:07 +0900
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <AANLkTimtsUBBXmoTPbjRo7x1OvVy0oqeq1FP9ygeYBr7@mail.gmail.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC9F341.8020404@canterbury.ac.nz>
	<AANLkTineSurioMAdDnEB3PvUMEqnv=HiBgJiaUg0A9GK@mail.gmail.com>
	<4CCA4CF6.6020006@canterbury.ac.nz>
	<AANLkTimtsUBBXmoTPbjRo7x1OvVy0oqeq1FP9ygeYBr7@mail.gmail.com>
Message-ID: <87bp6dqzrc.fsf@uwakimon.sk.tsukuba.ac.jp>

Chris Rose writes:

 > I'm hard pressed to see how an ordered dict and a dict should be
 > expected to differ by such a degree; in every particular they behave
 > the same, except in the case of the OrderedDict you specify your
 > initial parameters in tuples? Eugh.

But initializing a dict with "dict(a=1, b=2)" is purely an accidental
convenience based on the fact that a **kw argument is implemented as a
dict.  I find that syntax a bit disconcerting, actually, though it's
natural on reflection.
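
A two-line demonstration of that implementation detail:

    def f(**kw):
        return type(kw)

    print(f(a=1, b=2))   # <class 'dict'> -- keywords arrive as a plain dict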

If you think of odict as an (efficient) associative list (order is
primary function, random access via keys secondary), rather than an
ordered mapping (random access via keys is primary function, order
secondary) then the syntaxes

    ['a' : 1, 'b' : 2, 'c' : 3]    # create an odict

(surely that has been suggested before!) and

    def foo([**kw]):               # pass kw as an odict
        pass

are suggestive.  I don't know whether either would be parsable by
Python's parser, and I haven't thought about how the latter would deal
with positional or kw-only arguments.



From cmjohnson.mailinglist at gmail.com  Fri Oct 29 08:32:37 2010
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Thu, 28 Oct 2010 20:32:37 -1000
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <87bp6dqzrc.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC9F341.8020404@canterbury.ac.nz>
	<AANLkTineSurioMAdDnEB3PvUMEqnv=HiBgJiaUg0A9GK@mail.gmail.com>
	<4CCA4CF6.6020006@canterbury.ac.nz>
	<AANLkTimtsUBBXmoTPbjRo7x1OvVy0oqeq1FP9ygeYBr7@mail.gmail.com>
	<87bp6dqzrc.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTi=Qf6sT=1FNqUpoVQZO9fhfqR-5+pFEssaRC9Fi@mail.gmail.com>

On Thu, Oct 28, 2010 at 8:28 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:

> If you think of odict as an (efficient) associative list (order is
> primary function, random access via keys secondary), rather than an
> ordered mapping (random access via keys is primary function, order
> secondary) then the syntaxes
>
>    ['a' : 1, 'b' : 2, 'c' : 3]    # create an odict
>
> (surely that has been suggested before!) and

Yup:

http://mail.python.org/pipermail/python-ideas/2009-June/004924.html

GvR says "-100" :-O

Archivally-yrs,

-- Carl


From greg.ewing at canterbury.ac.nz  Fri Oct 29 09:18:05 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Oct 2010 20:18:05 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
Message-ID: <4CCA752D.5090904@canterbury.ac.nz>

I've been pondering the whole close()-returning-a-value
thing and I've convinced myself once again that it's a
bad idea.

Essentially the problem is that we're trying to make
the close() method, and consequently GeneratorExit,
serve two different and incompatible roles.

One role (the one it currently serves) is as an
emergency bail-out mechanism. In that role, when we
have a stack of generators delegating via yield-from,
we want things to behave as though the GeneratorExit
originates in the innermost one and propagates back
out of the entire stack. We don't want any of the
intermediate generators to catch it and turn it
into a StopIteration, because that would give the
next outer one the misleading impression that it's
business as usual, but it's not.

This is why PEP 380 currently specifies that, after
calling the close() method of the subgenerator,
GeneratorExit is unconditionally re-raised in the
delegating generator.

The proponents of close()-returning-a-value, however,
want GeneratorExit to serve another role: as a way
of signalling to a consuming generator (i.e. one that
is having values passed into it using send()) that
there are no more values left to pass in.

It seems to me that this is analogous to a function
reading values from a file, or getting them from an
iterator. The behaviour that's usually required in
the presence of delegation is quite different in those
cases.

Consider a function f1, that calls another function
f2, which loops reading from a file. When f2 reaches
the end of the file, this is a signal that it should
finish what it's doing and return a value to f1, which
then continues in its usual way.

Similarly, if f2 uses a for-loop to iterate over
something, when the iterator is exhausted, f2 continues
and returns normally.

I don't see how GeneratorExit can be made to fulfil
this role, i.e. as a "producer exhausted" signal,
without compromising its existing one. And if that
idea is dropped, the idea of close() returning a value
no longer has much motivation that I can see.

So how should "producer exhausted" be signalled, and
how should the result of a consumer generator be returned?

As for returning the result, I think it should be done
using the existing PEP 380 mechanism, i.e. the generator
executes a "return", consequently raising StopIteration
with the value. A delegating generator will then see
this as the result of a yield-from and continue normally.

As for the signalling mechanism, I think that's entirely
a matter for the producer and consumer to decide between
themselves. One way would be to send() in a sentinel value,
if there is a suitable out-of-band value available.
Another would be to throw() in some pre-arranged exception,
perhaps EOFError as a suggested convention.

If we look at files as an analogy, we see a similar range
of conventions. Most file reading operations return an empty
string or bytes object on EOF. Some, such as readline(),
raise an exception, because the empty element of the relevant
type is also a valid return value.

As an example, a consumer generator using None as a
sentinel value might look like this:

   def summer():
     tot = 0
     while 1:
       x = yield
       if x is None:
         break
       tot += x
     return tot

and a producer using it:

   s = summer()
   next(s)
   for x in values:
     s.send(x)
   try:
     s.send(None)
   except StopIteration as e:
     result = e.value

Having to catch StopIteration is a little tedious, but it
could easily be encapsulated in a helper function:

   def close_consumer(g, sentinel):
     try:
       g.send(sentinel)
     except StopIteration as e:
       return e.value

The helper function could also take care of another issue
that arises. What happens if a delegating consumer carries
on after a subconsumer has finished and yields again?

The analogous situation with files is trying to read from
a file that has already signalled EOF before. In that case,
the file simply signals EOF again. Similarly, calling
next() on an exhausted iterator raises StopIteration again.

So, if a "finished" consumer yields again, and we are using
a sentinel value, the yield should return the sentinel again.
We can get this behaviour by writing our helper function like
this:

   def close_consumer(g, sentinel):
     while 1:
       try:
         g.send(sentinel)
       except StopIteration as e:
         return e.value

So in summary, I think PEP 380 and current generator
semantics are fine as they stand with regard to the
behaviour of close(). Signalling the end of a stream of
values to a consumer generator can and should be handled
by convention, using existing facilities.

-- 
Greg


From masklinn at masklinn.net  Fri Oct 29 09:47:38 2010
From: masklinn at masklinn.net (Masklinn)
Date: Fri, 29 Oct 2010 09:47:38 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <4CCA4CF6.6020006@canterbury.ac.nz>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC9F341.8020404@canterbury.ac.nz>
	<AANLkTineSurioMAdDnEB3PvUMEqnv=HiBgJiaUg0A9GK@mail.gmail.com>
	<4CCA4CF6.6020006@canterbury.ac.nz>
Message-ID: <A34DCE6A-BBAA-47FB-B3F2-4A06CEA89E82@masklinn.net>


On 2010-10-29, at 06:26 , Greg Ewing wrote:

> On 29/10/10 13:54, Nick Coghlan wrote:
> 
>> The use case is being able to interface naturally with any key-value
>> API where order matters.
>> 
>> For example:
>> 
>> # Create an ordered dictionary (WRONG!)
>> d = OrderedDictionary(a=1, b=2, c=3) # Order is actually arbitrary due
>> to unordered kw dict
> 
> I'd need convincing that the API wouldn't be better designed
> to take something other than keyword arguments:
> 
>  d = OrderedDictionary(('a', 1), ('b', 2), ('c', 3))

Well, the fact that your version takes up nearly three times as many characters per item would be quite a ding against it, I think. It is verbose and quite alien-looking, and thus not quite ideal for interfacing with systems where ordered keyword arguments are commonplace (a Python interface to Cocoa, for instance, or to a Smalltalk-type system). Then again, you could counter that Smalltalk-type keyword arguments allow for repeated keys, whereas Python's don't, so that part of the interface is broken in any case.

Furthermore, it is downright painful to interface with on the other side of the equation, especially since Python has (as far as I know) no support for association lists: one must either manually walk the list for the right keys, or juggle two different structures at once (a dict for key:value access and a list for order).

From greg.ewing at canterbury.ac.nz  Fri Oct 29 09:47:54 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Oct 2010 20:47:54 +1300
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <AANLkTimtsUBBXmoTPbjRo7x1OvVy0oqeq1FP9ygeYBr7@mail.gmail.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC9F341.8020404@canterbury.ac.nz>
	<AANLkTineSurioMAdDnEB3PvUMEqnv=HiBgJiaUg0A9GK@mail.gmail.com>
	<4CCA4CF6.6020006@canterbury.ac.nz>
	<AANLkTimtsUBBXmoTPbjRo7x1OvVy0oqeq1FP9ygeYBr7@mail.gmail.com>
Message-ID: <4CCA7C2A.10005@canterbury.ac.nz>

Chris Rose wrote:

> I'm hard pressed to see how an ordered dict and a dict should be
> expected to differ by such a degree; in every particular they behave
> the same, except in the case of the OrderedDict you specify your
> initial parameters in tuples?

Well, you *can* specify them in tuples for an ordinary
dict if you want:

 >>> d = dict([('a', 1), ('b', 2)])
 >>> d
{'a': 1, 'b': 2}

The fact that keywords also work for an ordinary dict is
really just a lucky fluke that works in the special case
where the keys are identifier-like strings. In any other
case you have to use tuples anyway. Also, you get away
with it because dicts happen to be unordered. Expecting
your luck in this area to extend to other data types
is pushing things a bit, I think.

-- 
Greg


From greg.ewing at canterbury.ac.nz  Fri Oct 29 10:08:44 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Oct 2010 21:08:44 +1300
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <AANLkTi=Qf6sT=1FNqUpoVQZO9fhfqR-5+pFEssaRC9Fi@mail.gmail.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC9F341.8020404@canterbury.ac.nz>
	<AANLkTineSurioMAdDnEB3PvUMEqnv=HiBgJiaUg0A9GK@mail.gmail.com>
	<4CCA4CF6.6020006@canterbury.ac.nz>
	<AANLkTimtsUBBXmoTPbjRo7x1OvVy0oqeq1FP9ygeYBr7@mail.gmail.com>
	<87bp6dqzrc.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTi=Qf6sT=1FNqUpoVQZO9fhfqR-5+pFEssaRC9Fi@mail.gmail.com>
Message-ID: <4CCA810C.1070501@canterbury.ac.nz>

Carl M. Johnson wrote:
> On Thu, Oct 28, 2010 at 8:28 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> 
>>   ['a' : 1, 'b' : 2, 'c' : 3]    # create an odict
>>
>>(surely that has been suggested before!) and
 >
> GvR says "-100" :-O

That's a pity, because the next obvious step, for those
who don't want to give up their keywords, would be

   [a = 1, b = 2, c = 3]

:-)

-- 
Greg


From mal at egenix.com  Fri Oct 29 10:30:26 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 29 Oct 2010 10:30:26 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <4CC9E99F.5030805@canterbury.ac.nz>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com>
	<AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>
	<4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz>
Message-ID: <4CCA8622.5060405@egenix.com>

Greg Ewing wrote:
> M.-A. Lemburg wrote:
>> Python has always tried
>> to make the most common use case simple, so asking programmers to
>> use a meta-class to be able to access the order of definitions
>> in a class definition isn't exactly what the normal Python
>> programmer would expect.
> 
> But needing to know the order of definitions in a class
> is a very uncommon thing to want to do in the first
> place.

I've already pointed to a couple of existing use cases where the
authors had to play all sorts of tricks to access the order of
such definitions.

Since Python programs are executed sequentially (within the resp.
scope) in the order given in the source file, it is quite natural
to expect this order to be accessible somehow.

If it were easier to access this order, a lot of the extra magic
needed to map fixed order records to Python classes could go
away.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 29 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From solipsis at pitrou.net  Fri Oct 29 10:49:31 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 29 Oct 2010 10:49:31 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com>
	<AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>
	<4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz>
	<4CCA8622.5060405@egenix.com>
Message-ID: <20101029104931.665ee663@pitrou.net>

On Fri, 29 Oct 2010 10:30:26 +0200
"M.-A. Lemburg" <mal at egenix.com> wrote:
> Greg Ewing wrote:
> > M.-A. Lemburg wrote:
> >> Python has always tried
> >> to make the most common use case simple, so asking programmers to
> >> use a meta-class to be able to access the order of definitions
> >> in a class definition isn't exactly what the normal Python
> >> programmer would expect.
> > 
> > But needing to know the order of definitions in a class
> > is a very uncommon thing to want to do in the first
> > place.
> 
> I've already pointed to a couple of existing use cases where the
> authors had to play all sorts of tricks to access the order of
> such definitions.
> 
> Since Python programs are executed sequentially (within the resp.
> scope) in the order given in the source file, it is quite natural
> to expect this order to be accessible somehow.
> 
> If it were easier to access this order, a lot of the extra magic
> needed to map fixed order records to Python classes could go
> away.

Interestingly, this order is already accessible on the code object used
to build the class namespace:

>>> def f():
...   class C:
...     x = 5
...     def y(): pass
...     z = 6
... 
>>> code = f.__code__.co_consts[1]
>>> code.co_names
('__name__', '__module__', 'x', 'y', 'z')

Regards

Antoine.




From cmjohnson.mailinglist at gmail.com  Fri Oct 29 10:50:26 2010
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Thu, 28 Oct 2010 22:50:26 -1000
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <4CCA8622.5060405@egenix.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com>
	<AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>
	<4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz>
	<4CCA8622.5060405@egenix.com>
Message-ID: <AANLkTing7bfdupQXf1PYY9iGgK++mgx0rSNjdT-hRGGX@mail.gmail.com>

On Thu, Oct 28, 2010 at 10:30 PM, M.-A. Lemburg <mal at egenix.com> wrote:

> If it were easier to access this order, a lot of the extra magic
> needed to map fixed order records to Python classes could go
> away.

But, pending the creation of an odict of equal or greater speed, is
there any reason we can't just make do for now by having the
__prepare__ method of our relevant metaclasses return an odict? Sure,
in Python 2 people used to have to do crazy stack frame hacks and such
to preserve the ordering info for their ORMs, but now that we have
__prepare__, I don't think the need for a better solution is
particularly urgent. I agree it might be nice to have odicts
everywhere, but it wouldn't be so nice that we need to sacrifice
performance for it at the moment.
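
For anyone who hasn't used it yet, a minimal sketch of the __prepare__
approach (standard Python 3 machinery; '_definition_order' is just an
illustrative attribute name):

    from collections import OrderedDict

    class OrderedMeta(type):
        @classmethod
        def __prepare__(mcls, name, bases, **kwds):
            # This mapping is what the class body executes in, so an
            # OrderedDict records assignments in source order.
            return OrderedDict()

        def __new__(mcls, name, bases, namespace, **kwds):
            cls = super().__new__(mcls, name, bases, dict(namespace))
            # '_definition_order' is a made-up name for the example
            cls._definition_order = tuple(k for k in namespace
                                          if not k.startswith('__'))
            return cls

    class C(metaclass=OrderedMeta):
        x = 5
        def y(self): pass
        z = 6

    print(C._definition_order)   # ('x', 'y', 'z')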

Let someone forge the bell first, then we can talk about getting the
cat to wear it.


From mal at egenix.com  Fri Oct 29 10:51:54 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 29 Oct 2010 10:51:54 +0200
Subject: [Python-ideas] dict changes [was: Ordered storage of keyword
 arguments]
In-Reply-To: <1288303911.3753.9.camel@localhost.localdomain>
References: <AANLkTimysnRQXGn1f0TCc1VMB_T7q3EXpY43ecJK8B+k@mail.gmail.com>
	<1288303911.3753.9.camel@localhost.localdomain>
Message-ID: <4CCA8B2A.2060702@egenix.com>

Antoine Pitrou wrote:
> On Thursday 28 October 2010 at 14:44 -0400, Jim Jewett wrote:
>>
>> For a string dict, that hash should already be available on the string
>> object itself, so it is redundant.  Keeping it obviously improves
>> cache locality, but ... it also makes the dict objects 50% larger, and
>> there is a chance that the strings themselves would already be in
>> cache anyhow.  And if strings were reliably interned, the comparison
>> check should normally just be a pointer compare -- possibly fast
>> enough that the "different hash" shortcut doesn't buy anything.
>> [caveats about still needing to go to the slower dict implementation
>> for string subclasses]
> 
> I've thought about that. The main annoyance is switching
> transparently between the two implementations. But I think it would be
> interesting to pursue that effort, since indeed dicts with interned keys
> are the most common case of dicts in the average Python workload. Saving
> 1/3 of the memory size on these dicts would be worthwhile IMO.

Are you sure? In the age of GB RAM, runtime performance appears
to be more important than RAM usage. Moving the hash comparison out
of the dict would likely mean the cache-locality benefit no longer applies.

> (addressing itself would perhaps be a bit simpler, because of
> multiplying by 8 or 16 instead of multiplying by 12 or 24. But I doubt
> the difference would be noticeable)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 29 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From solipsis at pitrou.net  Fri Oct 29 11:06:22 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 29 Oct 2010 11:06:22 +0200
Subject: [Python-ideas] dict changes [was: Ordered storage of keyword
 arguments]
In-Reply-To: <4CCA8B2A.2060702@egenix.com>
References: <AANLkTimysnRQXGn1f0TCc1VMB_T7q3EXpY43ecJK8B+k@mail.gmail.com>
	<1288303911.3753.9.camel@localhost.localdomain>
	<4CCA8B2A.2060702@egenix.com>
Message-ID: <1288343183.3565.26.camel@localhost.localdomain>

On Friday 29 October 2010 at 10:51 +0200, M.-A. Lemburg wrote:
> > 
> > I've thought about that. The main annoyance is switching
> > transparently between the two implementations. But I think it would be
> > interesting to pursue that effort, since indeed dicts with interned keys
> > are the most common case of dicts in the average Python workload. Saving
> > 1/3 of the memory size on these dicts would be worthwhile IMO.
> 
> Are you sure? In the age of GB RAM, runtime performance appears
> to be more important than RAM usage.
> Moving the hash comparison out
> of the dict would likely mean the cache-locality benefit no longer applies.

Good point. It probably depends on the collision rate.
Also, a string key dict could be optimized for interned strings, in
which case the hash comparison is unnecessary.
(knowing whether the key is interned could be stored in e.g. the
low-order bit of the key pointer)
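
For reference, the interning behaviour such an optimization would build
on (intern moved to sys.intern in Python 3):

    import sys

    a = sys.intern(''.join(['fo', 'o']))
    b = sys.intern(''.join(['fo', 'o']))
    print(a is b)   # True: equal interned strings are one object, so a
                    # lookup can start with a plain pointer comparison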

Regards

Antoine.




From mal at egenix.com  Fri Oct 29 11:15:42 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 29 Oct 2010 11:15:42 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <20101029104931.665ee663@pitrou.net>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com>
	<AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>
	<4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz>
	<4CCA8622.5060405@egenix.com> <20101029104931.665ee663@pitrou.net>
Message-ID: <4CCA90BE.30206@egenix.com>

Antoine Pitrou wrote:
> On Fri, 29 Oct 2010 10:30:26 +0200
> "M.-A. Lemburg" <mal at egenix.com> wrote:
>> Greg Ewing wrote:
>>> M.-A. Lemburg wrote:
>>>> Python has always tried
>>>> to make the most common use case simple, so asking programmers to
>>>> use a meta-class to be able to access the order of definitions
>>>> in a class definition isn't exactly what the normal Python
>>>> programmer would expect.
>>>
>>> But needing to know the order of definitions in a class
>>> is a very uncommon thing to want to do in the first
>>> place.
>>
>> I've already pointed to a couple of existing use cases where the
>> authors had to play all sorts of tricks to access the order of
>> such definitions.
>>
>> Since Python programs are executed sequentially (within the resp.
>> scope) in the order given in the source file, it is quite natural
>> to expect this order to be accessible somehow.
>>
>> If it were easier to access this order, a lot of the extra magic
>> needed to map fixed order records to Python classes could go
>> away.
> 
> Interestingly, this order is already accessible on the code object used
> to build the class namespace:
> 
>>>> def f():
> ...   class C:
> ...     x = 5
> ...     def y(): pass
> ...     z = 6
> ... 
>>>> code = f.__code__.co_consts[1]
>>>> code.co_names
> ('__name__', '__module__', 'x', 'y', 'z')

Interesting indeed and I kind of expected that order to be
available somewhere via the compiler.

Can this be generalized to arbitrary classes?

Would other Python implementations be able to provide the
same information?

If so, we could add this order as .__deforder__ tuple to class
objects.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 29 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From solipsis at pitrou.net  Fri Oct 29 11:38:41 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 29 Oct 2010 11:38:41 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <4CCA90BE.30206@egenix.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com>
	<AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>
	<4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz>
	<4CCA8622.5060405@egenix.com> <20101029104931.665ee663@pitrou.net>
	<4CCA90BE.30206@egenix.com>
Message-ID: <1288345121.3565.48.camel@localhost.localdomain>

On Friday 29 October 2010 at 11:15 +0200, M.-A. Lemburg wrote:
> 
> Would other Python implementations be able to provide the
> same information?

Probably, yes.

> Can this be generalized to arbitrary classes?

In CPython, it could be done by modifying the default __build_class__
function (which always gets called regardless of metaclasses and other
stuff). Of course, it won't work if the metaclass forbids setting
attributes on the class object.

Here's a pure Python prototype:


import builtins

_old_build_class = builtins.__build_class__

def __build_class__(func, name, *bases, **kwds):
    cls = _old_build_class(func, name, *bases, **kwds)
    # Extract the code object used to create the class namespace
    co = func.__code__
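    # Note: co_names also lists names the class body merely references
    # (globals, builtins), hence the filter to names bound on the class.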
    cls.__deforder__ = tuple(n for n in co.co_names
                             if n in cls.__dict__)
    return cls

builtins.__build_class__ = __build_class__

class C: 
    y = 5
    z = staticmethod(len)
    def x():
        pass

print(C.__deforder__)




From jh at improva.dk  Fri Oct 29 12:13:16 2010
From: jh at improva.dk (Jacob Holm)
Date: Fri, 29 Oct 2010 12:13:16 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCA752D.5090904@canterbury.ac.nz>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>	<4CC6E94F.3090702@improva.dk>	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>	<4CC889F1.8010603@improva.dk>	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>	<4CC939E5.5070700@improva.dk>	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>	<4CC9FC87.1040600@canterbury.ac.nz>	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA752D.5090904@canterbury.ac.nz>
Message-ID: <4CCA9E3C.8000604@improva.dk>

On 2010-10-29 09:18, Greg Ewing wrote:
> I've been pondering the whole close()-returning-a-value
> thing and I've convinced myself once again that it's a bad
> idea.
> 

And I still believe we could have made it work.  However, I have been
doing my own thinking about the whole of PEP 380, PEP 3152, for-loop
co-iteration and so on.  And I think I have an idea that improves the
whole story.

The main thing to note is that the expression form of yield-from is
mostly intended to make it easier to have cofunctions that return a
value, and that there is a problem with reusing StopIteration for that
purpose.  Now that we have an actual PEP 3152, we could choose to move
the necessary support over there.  Here is one way to do that:


1)  Drop the current PEP 380 support for using "return <value>" inside a
generator.  That means no extended StopIteration and no expression form
of "yield from".  And since there are no return values, there is no
problem with how "close" should treat them.

2)  In PEP 3152, define "return <value>" in a cofunction to raise a new
IterationResult exception with the value.  (And treat falling off the
edge of the function or returning without a value as "return None")

3)  In PEP 3152, change the "cocall" expansion so that:

    <val> = cocall f(*args, **kwargs)

Expands to:

    try:
        yield from f.__cocall__(*args, **kwargs)
    except IterationResult as e:
        <val> = e.value
    else:
        raise StopIteration

(The same expansion would be used if cocalls are implicit of course).
This ensures that a cofunction can raise StopIteration just as a regular
function, which means we can extend the iterator protocol to support
cofunctions with only minor changes.




An interesting variation might be to keep the expression form of
yield-from, but change its semantics so that it returns the
StopIteration instance that was caught, instead of trying to extract a
value.  We could then add an IterationResult inheriting from
StopIteration and use it for "return <value>" in a generator.

That would make all current yield-from examples work with the minor
change that the old:

  <var> = yield from <expr>

would need to be written as

  <var> = (yield from <expr>).value

And would have the benefit that the PEP 3152 expansion could reraise the
actual StopIteration as in:

  e = yield from f.__cocall__(*args, **kwargs)
  if isinstance(e, IterationResult):
      <var> = e.value
  else:
      raise e

The idea of returning the exception takes some getting used to, but it
solves the problem with StopIteration and cofunctions, and I'm sure I
can find some interesting uses for it by itself.


Anyway....  Thoughts?



- Jacob


From steve at pearwood.info  Fri Oct 29 13:10:21 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 29 Oct 2010 22:10:21 +1100
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <AANLkTing7bfdupQXf1PYY9iGgK++mgx0rSNjdT-hRGGX@mail.gmail.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>	<4CC93095.3080704@egenix.com>	<AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>	<4CC941FF.6070408@egenix.com>
	<4CC9E99F.5030805@canterbury.ac.nz>	<4CCA8622.5060405@egenix.com>
	<AANLkTing7bfdupQXf1PYY9iGgK++mgx0rSNjdT-hRGGX@mail.gmail.com>
Message-ID: <4CCAAB9D.3060909@pearwood.info>

Carl M. Johnson wrote:

> But, pending the creation of an odict of equal or greater speed, is
> there any reason we can't just make do for now by having the
> __prepare__ method of our relevant metaclasses return an odict? 

+1

Given that the need to care about the order of keyword arguments is 
likely to be rare, I'd like to see some recipes and/or metaclass helpers
before changing the language.

Besides... moratorium.


-- 
Steven



From guido at python.org  Fri Oct 29 16:28:29 2010
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Oct 2010 07:28:29 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCA5AE4.7080403@canterbury.ac.nz>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
Message-ID: <AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>

On Thu, Oct 28, 2010 at 10:25 PM, Greg Ewing
<greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
>
>> I'd also like to convince you to change g.close() so that it captures
>> and returns the return value from StopIteration if it has one.
>
> Looking at this again, I find that I'm not really sure how
> this impacts PEP 380. The current expansion specifies that
> when a delegating generator is closed, the subgenerator's
> close() method is called, any value it returns is ignored,
> and GeneratorExit is re-raised.
>
> If that close() call were to return a value, what do you
> think should be done with it?

I went over that myself in detail and ended up deciding that for
"yield-from" nothing should be changed! The expansion in the PEP
remains the same.

But since this PEP also specifies "return value" it would be nice if
there was a convenient way to capture this value, and close seems to
be it. E.g.

def gen():
  total = 0
  try:
    while True:
      total += yield
  except GeneratorExit:
    return total

def main():
  g = gen()
  next(g)  # prime the generator before sending values
  for i in range(100):
    g.send(i)
  print(g.close())

This would print the total computed by gen().

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Fri Oct 29 21:13:18 2010
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Oct 2010 12:13:18 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCA752D.5090904@canterbury.ac.nz>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA752D.5090904@canterbury.ac.nz>
Message-ID: <AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>

On Fri, Oct 29, 2010 at 12:18 AM, Greg Ewing
<greg.ewing at canterbury.ac.nz> wrote:
> I've been pondering the whole close()-returning-a-value
> thing and I've convinced myself once again that it's a bad
> idea.
>
> Essentially the problem is that we're trying to make
> the close() method, and consequently GeneratorExit,
> serve two different and incompatible roles.
>
> One role (the one it currently serves) is as an
> emergency bail-out mechanism. In that role, when we
> have a stack of generators delegating via yield-from,
> we want things to behave as though the GeneratorExit
> originates in the innermost one and propagates back
> out of the entire stack. We don't want any of the
> intermediate generators to catch it and turn it
> into a StopIteration, because that would give the
> next outer one the misleading impression that it's
> business as usual, but it's not.

This seems to be the crux of your objection. But if I look carefully
at the expansion in the current version of PEP 380, I don't think this
problem actually happens: If the outer generator catches
GeneratorExit, it closes the inner generator (by calling its close
method, if it exists) and then re-raises the GeneratorExit:

            except GeneratorExit as _e:
                try:
                    _m = _i.close
                except AttributeError:
                    pass
                else:
                    _m()
                raise _e

I would leave this expansion alone even if g.close() was changed to
return the generator's return value.

Could it be that you are thinking of your accelerated implementation,
which IIRC has a shortcut whereby generator operations (next, send,
throw) on the outer generator are *directly* passed to the inner
generator when a yield-from is active?

It looks to me as if using g.close() to capture the return value of a
generator is not of much value when using yield-from, but it can be of
value for the simpler pattern that started this thread. Here's an
updated version:

def gclose(gen):  ## Not needed with PEP 380
  try:
    gen.throw(GeneratorExit)
  except StopIteration as err:
    return err.args[0] if err.args else None  # plain return carries no value
  except GeneratorExit:
    pass
  # Note: other exceptions are passed out untouched.
  return None

def summer():
  total = 0
  try:
    while True:
      total += yield
  except GeneratorExit:
    raise StopIteration(total)  ## return total

def maxer():
  highest = 0
  try:
    while True:
      value = yield
      highest = max(highest, value)
  except GeneratorExit:
    raise StopIteration(highest)  ## return highest

def map_to_multiple(it, funcs):
  gens = [func() for func in funcs]  # Create generators
  for gen in gens:
    next(gen)  # Prime generators
  for value in it:
    for gen in gens:
      gen.send(value)
  return [gclose(gen) for gen in gens]  ## [gen.close() for gen in gens]

def main():
  print(map_to_multiple(range(100), [summer, maxer]))

main()
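
(With these definitions this prints [4950, 99], the sum and the
maximum of range(100).)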

-- 
--Guido van Rossum (python.org/~guido)


From greg.ewing at canterbury.ac.nz  Sat Oct 30 01:03:21 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 30 Oct 2010 12:03:21 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCA9E3C.8000604@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA752D.5090904@canterbury.ac.nz> <4CCA9E3C.8000604@improva.dk>
Message-ID: <4CCB52B9.8010203@canterbury.ac.nz>

Jacob Holm wrote:

> The main thing to note is that the expression form of yield-from is
> mostly intended to make it easier to have cofunctions that return a
> value, and that there is a problem with reusing StopIteration for that
> purpose.

No, I don't think that's where the problem is, if I understand
correctly which problem you're referring to. The fact that cofunctions
as currently defined in PEP 3152 can't raise StopIteration and have it
propagate to the caller is a problem with the use of StopIteration
to signal the end of a cofunction. Whether the StopIteration carries
a value or not is irrelevant.

 > And since there are no return values, there is no
> problem with how "close" should treat them.

There's no problem with that *now*, because close() is currently
not defined as returning a value. A problem only arises if we
try to overload close() to mean "no more data to send in, give me
your result" as well as "bail out now and clean up". And as I
pointed out, there are other ways of signalling end of data that
work fine with things as they are.

> 2)  In PEP 3152, define "return <value>" in a cofunction to raise a new
> IterationResult exception with the value.

That would have to apply to *all* forms of return, not just ones
with a value.

>   <var> = (yield from <expr>).value
> 
> have the benefit that the PEP 3152 expansion could reraise the
> actual StopIteration as in:
> 
>   e = yield from f.__cocall__(*args, **kwargs)
>   if isinstance(e, IterationResult):
>       <var> = e.value
>   else:
>       raise e

There's another way to approach this: define cofunctions so that
'return' in one of its forms is the only way to raise an actual
StopIteration, and any explicitly raised StopIteration gets wrapped
in something else, such as CoStopIteration. The expansion would
then be

    try:
       result = yield from f.__cocall__(*args, **kwargs)
    except CoStopIteration as e:
       raise e.value

where e.value is the original StopIteration instance.

This would have the advantage of not requiring any change to
yield-from as it stands.

-- 
Greg


From greg.ewing at canterbury.ac.nz  Sat Oct 30 01:30:54 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 30 Oct 2010 12:30:54 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
Message-ID: <4CCB592E.7030507@canterbury.ac.nz>

Guido van Rossum wrote:

> I went over that myself in detail and ended up deciding that for
> "yield-from" nothing should be changed! The expansion in the PEP
> remains the same.

In that case, the proposal has nothing to do with PEP 380
and needn't be mentioned in it -- except perhaps to point
out that using it in the presence of yield-from may
not produce the expected result.

> But since this PEP also specifies "return value" it would be nice if
> there was a convenient way to capture this value,

As long as you're willing to accept that if the generator
you're closing is delegating using yield-from, the return
value from the inner generator will get lost.

To put it another way, if you design a generator to be
used in this way (i.e. its caller using close() to finish
it and get a value), you may find it awkward or impossible
to later refactor it in certain ways.

-- 
Greg


From greg.ewing at canterbury.ac.nz  Sat Oct 30 01:37:41 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 30 Oct 2010 12:37:41 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
Message-ID: <4CCB5AC5.8060907@canterbury.ac.nz>

Guido van Rossum wrote:

> But since this PEP also specifies "return value" it would be nice if
> there was a convenient way to capture this value, and close seems to
> be it.

Sorry, I missed that bit -- you're right, it does need to be
allowed for in PEP 380 if we're to do this. I'm still not
convinced that it isn't a wrongheaded idea, though. The fact
that it doesn't play well with yield-from gives off a very
bad smell to me.

It seems highly incongruous for the PEP to propose a feature
that's incompatible with the main idea of the whole thing.

-- 
Greg


From jh at improva.dk  Sat Oct 30 01:34:30 2010
From: jh at improva.dk (Jacob Holm)
Date: Sat, 30 Oct 2010 01:34:30 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCB52B9.8010203@canterbury.ac.nz>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>	<4CC6E94F.3090702@improva.dk>	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>	<4CC889F1.8010603@improva.dk>	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>	<4CC939E5.5070700@improva.dk>	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>	<4CC9FC87.1040600@canterbury.ac.nz>	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>	<4CCA752D.5090904@canterbury.ac.nz>
	<4CCA9E3C.8000604@improva.dk> <4CCB52B9.8010203@canterbury.ac.nz>
Message-ID: <4CCB5A06.60408@improva.dk>

On 2010-10-30 01:03, Greg Ewing wrote:
> Jacob Holm wrote:
> 
>> The main thing to note is that the expression form of yield-from is
>> mostly intended to make it easier to have cofunctions that return a
>> value, and that there is a problem with reusing StopIteration for that
>> purpose.
> 
> No, I don't think that's where the problem is, if I understand
> correctly which problem you're referring to. The fact that cofunctions
> as currently defined in PEP 3152 can't raise StopIteration and have it
> propagate to the caller is a problem with the use of StopIteration
> to signal the end of a cofunction.

Exactly.


> Whether the StopIteration carries
> a value or not is irrelevant.
> 

It is relevant if we later want to distinguish between "return" and
"raise StopIteration".



>> And since there are no return values, there is no
>> problem with how "close" should treat them.
> 
> There's no problem with that *now*, because close() is currently
> not defined as returning a value. A problem only arises if we
> try to overload close() to mean "no more data to send in, give me
> your result" as well as "bail out now and clean up". And as I
> pointed out, there are other ways of signalling end of data that
> work fine with things as they are.
> 

That is what I meant.  We were discussing whether to add a new feature
to PEP 380 inspired by having "return <value>" in generators.  If we
dropped "return <value>" from PEP 380 (with the intent of adding it to
PEP 3152 instead), the basis for the new feature would go away with it.
End of discussion...

AFAICT, adding these features in a consistent way is a lot easier in the
context of PEP 3152.


>> 2)  In PEP 3152, define "return <value>" in a cofunction to raise a new
>> IterationResult exception with the value.
> 
> That would have to apply to *all* forms of return, not just ones
> with a value.
> 

Of course.



>>   <var> = (yield from <expr>).value
>>
>> have the benefit that the PEP 3152 expansion could reraise the
>> actual StopIteration as in:
>>
>>   e = yield from f.__cocall__(*args, **kwargs)
>>   if isinstance(e, IterationResult):
>>       <var> = e.value
>>   else:
>>       raise e
> 
> There's another way to approach this: define cofunctions so that
> 'return' in one of its forms is the only way to raise an actual
> StopIteration, and any explicitly raised StopIteration gets wrapped
> in something else, such as CoStopIteration. The expansion would
> then be
> 
>    try:
>       result = yield from f.__cocall__(*args, **kwargs)
>    except CoStopIteration as e:
>       raise e.value
> 
> where e.value is the original StopIteration instance.
> 
> This would have the advantage of not requiring any change to
> yield-from as it stands.
> 

That's just ugly...  I realize it could work, but I think that makes
*both* PEPs more complex than necessary.

My suggestion is to cut/change some features from PEP 380 that are in
the way and then add them in a cleaner way to PEP 3152.   This should
simplify both PEPs, at the cost of reopening some of the earlier
discussions.


- Jacob


From guido at python.org  Sat Oct 30 01:45:28 2010
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Oct 2010 16:45:28 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCB592E.7030507@canterbury.ac.nz>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB592E.7030507@canterbury.ac.nz>
Message-ID: <AANLkTimt+wtM=Lb7NaPy5=5D7D5_-faD5LPTF+EOjac2@mail.gmail.com>

On Fri, Oct 29, 2010 at 4:30 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
>
>> I went over that myself in detail and ended up deciding that for
>> "yield-from" nothing should be changed! The expansion in the PEP
>> remains the same.
>
> In that case, the proposal has nothing to do with PEP 380
> and needn't be mentioned in it -- except perhaps to point
> out that using it in the presence of yield-from may
> not produce the expected result.

The connection is that it works well with returning values from
generators, which *is* specified in PEP 380. So I think this does
belong there.

>> But since this PEP also specifies "return value" it would be nice if
>> there was a convenient way to capture this value,
>
> As long as you're willing to accept that if the generator
> you're closing is delegating using yield-from, the return
> value from the inner generator will get lost.
>
> To put it another way, if you design a generator to be
> used in this way (i.e. its caller using close() to finish
> it and get a value), you may find it awkward or impossible
> to later refactor it in certain ways.

Only if after the refactoring the outer generator would need the
return value of the interrupted yield-from expression in order to
compute its return value. I think that's reasonable. (It might be
possible to tweak the yield-from expansion so that the return value is
assigned before GeneratorExit is re-raised, but that sounds fragile,
and doesn't always apply, e.g. if the return value is not assigned to
a local variable.)

-- 
--Guido van Rossum (python.org/~guido)


From jh at improva.dk  Sat Oct 30 01:46:23 2010
From: jh at improva.dk (Jacob Holm)
Date: Sat, 30 Oct 2010 01:46:23 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCB5AC5.8060907@canterbury.ac.nz>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>	<4CC6E94F.3090702@improva.dk>	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>	<4CC889F1.8010603@improva.dk>	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>	<4CC939E5.5070700@improva.dk>	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>	<4CC9FC87.1040600@canterbury.ac.nz>	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>	<4CCA5AE4.7080403@canterbury.ac.nz>	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz>
Message-ID: <4CCB5CCF.90504@improva.dk>

On 2010-10-30 01:37, Greg Ewing wrote:
> Guido van Rossum wrote:
> 
>> But since this PEP also specifies "return value" it would be nice if
>> there was a convenient way to capture this value, and close seems to
>> be it.
> 
> Sorry, I missed that bit -- you're right, it does need to be
> allowed for in PEP 380 if we're to do this. I'm still not
> convinced that it isn't a wrongheaded idea, though. The fact
> that it doesn't play well with yield-from gives off a very
> bad smell to me.
> 
> It seems highly incongruous for the PEP to propose a feature
> that's incompatible with the main idea of the whole thing.
> 

Which is exactly why I'm suggesting dropping "return value" from PEP 380
and then doing it *right* in PEP 3152, which has a much better rationale
for the "return value" feature anyway.

- Jacob


From ghazel at gmail.com  Sat Oct 30 01:50:53 2010
From: ghazel at gmail.com (ghazel at gmail.com)
Date: Fri, 29 Oct 2010 16:50:53 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCB5CCF.90504@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
Message-ID: <AANLkTikjHhB66bL5ExbXX343npFqS25iD4OkabD+k1nF@mail.gmail.com>

On Fri, Oct 29, 2010 at 4:46 PM, Jacob Holm <jh at improva.dk> wrote:
> On 2010-10-30 01:37, Greg Ewing wrote:
>> Guido van Rossum wrote:
>>
>>> But since this PEP also specifies "return value" it would be nice if
>>> there was a convenient way to capture this value, and close seems to
>>> be it.
>>
>> Sorry, I missed that bit -- you're right, it does need to be
>> allowed for in PEP 380 if we're to do this. I'm still not
>> convinced that it isn't a wrongheaded idea, though. The fact
>> that it doesn't play well with yield-from gives off a very
>> bad smell to me.
>>
>> It seems highly incongruous for the PEP to propose a feature
>> that's incompatible with the main idea of the whole thing.
>>
>
> Which is exactly why I'm suggesting dropping "return value" from PEP 380
> and then doing it *right* in PEP 3152, which has a much better rationale
> for the "return value" feature anyway.

Why not split "return value" for generators in to its own PEP? There
is currently a use case for it in frameworks which use generators for
coroutines, without any dependency on PEP 380 or PEP 3152.

-Greg


From guido at python.org  Sat Oct 30 01:54:36 2010
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Oct 2010 16:54:36 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCB5CCF.90504@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
Message-ID: <AANLkTi=dgSevRbg+OZVDDd0cK2x25kaivySZuw9X5sDq@mail.gmail.com>

On Fri, Oct 29, 2010 at 4:46 PM, Jacob Holm <jh at improva.dk> wrote:
> On 2010-10-30 01:37, Greg Ewing wrote:
>> Guido van Rossum wrote:
>>
>>> But since this PEP also specifies "return value" it would be nice if
>>> there was a convenient way to capture this value, and close seems to
>>> be it.
>>
>> Sorry, I missed that bit -- you're right, it does need to be
>> allowed for in PEP 380 if we're to do this. I'm still not
>> convinced that it isn't a wrongheaded idea, though. The fact
>> that it doesn't play well with yield-from gives off a very
>> bad smell to me.
>>
>> It seems highly incongruous for the PEP to propose a feature
>> that's incompatible with the main idea of the whole thing.

I don't think it is.

> Which is exactly why I'm suggesting dropping "return value" from PEP 380
> and then doing it *right* in PEP 3152, which has a much better rationale
> for the "return value" feature anyway.

Oh, but I still don't like that PEP, and it has a much higher
probability of failing completely. PEP 380 OTOH has my approval except
for minor quibbles like g.close().

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Sat Oct 30 01:55:39 2010
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Oct 2010 16:55:39 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTikjHhB66bL5ExbXX343npFqS25iD4OkabD+k1nF@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
	<AANLkTikjHhB66bL5ExbXX343npFqS25iD4OkabD+k1nF@mail.gmail.com>
Message-ID: <AANLkTi=QBFVgd2mz-+5Y7kpn1WDzN=X6B05f4TdYxWLu@mail.gmail.com>

On Fri, Oct 29, 2010 at 4:50 PM,  <ghazel at gmail.com> wrote:
> On Fri, Oct 29, 2010 at 4:46 PM, Jacob Holm <jh at improva.dk> wrote:
>> On 2010-10-30 01:37, Greg Ewing wrote:
>>> Guido van Rossum wrote:
>>>
>>>> But since this PEP also specifies "return value" it would be nice if
>>>> there was a convenient way to capture this value, and close seems to
>>>> be it.
>>>
>>> Sorry, I missed that bit -- you're right, it does need to be
>>> allowed for in PEP 380 if we're to do this. I'm still not
>>> convinced that it isn't a wrongheaded idea, though. The fact
>>> that it doesn't play well with yield-from gives off a very
>>> bad smell to me.
>>>
>>> It seems highly incongruous for the PEP to propose a feature
>>> that's incompatible with the main idea of the whole thing.
>>>
>>
>> Which is exactly why I'm suggesting dropping "return value" from PEP 380
>> and then doing it *right* in PEP 3152, which has a much better rationale
>> for the "return value" feature anyway.
>
> Why not split "return value" for generators in to its own PEP? There
> is currently a use case for it in frameworks which use generators for
> coroutines, without any dependency on PEP 380 or PEP 3152.

Either way it's not going in before Python 3.3... Aside from the
moratorium, 3.2 is also too close to release.

PS. Drop me a note to chat about Monocle.

-- 
--Guido van Rossum (python.org/~guido)


From jh at improva.dk  Sat Oct 30 01:57:59 2010
From: jh at improva.dk (Jacob Holm)
Date: Sat, 30 Oct 2010 01:57:59 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTikjHhB66bL5ExbXX343npFqS25iD4OkabD+k1nF@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
	<AANLkTikjHhB66bL5ExbXX343npFqS25iD4OkabD+k1nF@mail.gmail.com>
Message-ID: <4CCB5F87.3020107@improva.dk>

On 2010-10-30 01:50, ghazel at gmail.com wrote:
> On Fri, Oct 29, 2010 at 4:46 PM, Jacob Holm <jh at improva.dk> wrote:
>> Which is exactly why I'm suggesting dropping "return value" from PEP 380
>> and then doing it *right* in PEP 3152, which has a much better rationale
>> for the "return value" feature anyway.
> 
> Why not split "return value" for generators in to its own PEP? There
> is currently a use case for it in frameworks which use generators for
> coroutines, without any dependency on PEP 380 or PEP 3152.
> 

You could do that, but I think 3152 would need to depend on the new PEP
then.  It would be quite strange to define a new class of "functions"
and not have them able to return a value.


- Jacob


From jh at improva.dk  Sat Oct 30 02:10:11 2010
From: jh at improva.dk (Jacob Holm)
Date: Sat, 30 Oct 2010 02:10:11 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimt+wtM=Lb7NaPy5=5D7D5_-faD5LPTF+EOjac2@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>	<4CC6E94F.3090702@improva.dk>	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>	<4CC889F1.8010603@improva.dk>	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>	<4CC939E5.5070700@improva.dk>	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>	<4CC9FC87.1040600@canterbury.ac.nz>	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>	<4CCA5AE4.7080403@canterbury.ac.nz>	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>	<4CCB592E.7030507@canterbury.ac.nz>
	<AANLkTimt+wtM=Lb7NaPy5=5D7D5_-faD5LPTF+EOjac2@mail.gmail.com>
Message-ID: <4CCB6263.9030602@improva.dk>

On 2010-10-30 01:45, Guido van Rossum wrote:
> On Fri, Oct 29, 2010 at 4:30 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>> Guido van Rossum wrote:
>>> But since this PEP also specifies "return value" it would be nice if
>>> there was a convenient way to capture this value,
>>
>> As long as you're willing to accept that if the generator
>> you're closing is delegating using yield-from, the return
>> value from the inner generator will get lost.
>>
>> To put it another way, if you design a generator to be
>> used in this way (i.e. its caller using close() to finish
>> it and get a value), you may find it awkward or impossible
>> to later refactor it in certain ways.
> 
> Only if after the refactoring the outer generator would need the
> return value of the interrupted yield-from expression in order to
> compute its return value. I think that's reasonable. (It might be
> possible to tweak the yield-from expansion so that the return value is
> assigned before GeneratorExit is re-raised, but that sounds fragile,
> and doesn't always apply, e.g. if the return value is not assigned to
> a local variable.)
> 

I have earlier proposed a simple change that would at least make the
value available.  Instead of reraising the original GeneratorExit after
calling close on the subgenerator, you just raise a new GeneratorExit
with the returned value as its first argument.  Nick seemed in favor of
this idea.
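
In terms of the PEP 380 expansion, the GeneratorExit branch would then
look something like this (a sketch, assuming close() is also changed
to return the value, as discussed earlier in the thread):

        except GeneratorExit as _e:
            try:
                _m = _i.close
            except AttributeError:
                pass                        # nothing to close
            else:
                _e = GeneratorExit(_m())    # carry the return value along
            raise _e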

- Jacob


From cmjohnson.mailinglist at gmail.com  Sat Oct 30 02:19:33 2010
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Fri, 29 Oct 2010 14:19:33 -1000
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <4CCAAB9D.3060909@pearwood.info>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com>
	<AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>
	<4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz>
	<4CCA8622.5060405@egenix.com>
	<AANLkTing7bfdupQXf1PYY9iGgK++mgx0rSNjdT-hRGGX@mail.gmail.com>
	<4CCAAB9D.3060909@pearwood.info>
Message-ID: <AANLkTim0MwtPcuRWmR9_ja8XvSkCkkTg0sejDnH-0Y3v@mail.gmail.com>

On Fri, Oct 29, 2010 at 1:10 AM, Steven D'Aprano <steve at pearwood.info> wrote:

> Given that the need to care about the order of keyword arguments is likely
> to be rare, I'd like to see some recipes and/or metaclass helpers before
> changing the language.

The recipe is more or less directly in PEP 3115 (the PEP which
established the __prepare__ attribute). There's not a lot to it.

>>> from collections import OrderedDict
>>> class OrderedMetaclass(type):
...     @classmethod
...     def __prepare__(metacls, name, bases): # No keywords in this case
...         return OrderedDict()
...
...     def __new__(cls, name, bases, classdict):
...         result = type.__new__(cls, name, bases, dict(classdict))
...         result.odict = classdict
...         return result
...
>>>
>>> class OrderedClass(metaclass=OrderedMetaclass):
...     a = 1
...     z = 2
...     b = 3
...
>>> OrderedClass.odict
OrderedDict([('__module__', '__main__'), ('a', 1), ('z', 2), ('b', 3)])

Thinking about it a little more, if I were making an HTML tree type
metaclass though, I wouldn't want to use an OrderedDict anyway, since
it can't have duplicate elements, and I would want the interface to be
something like:

class body(Tree()):
    h1 = "Hello World!"
    p  = "Lorem ipsum."
    p  = "Dulce et decorum est."
    class div(Tree(id="content")):
        p = "Main thing"
    class div(Tree(id="footer")):
        p = "(C) 2010"

So, I'd probably end up making my own custom kind of dict that didn't
overwrite repeated names.
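
Something along these lines, perhaps (a rough sketch; the class name
is made up):

class KeepAllDict(dict):
    # A class namespace that records every binding in definition
    # order, repeated names included, while still supporting normal
    # lookup for the class machinery.
    def __init__(self):
        super().__init__()
        self.pairs = []

    def __setitem__(self, name, value):
        self.pairs.append((name, value))
        super().__setitem__(name, value)

A __prepare__ method returning one of these would see both "p"
assignments from the body example above in its pairs list.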

Of course, for an ORM, you don't want repeated field names, so an
OrderedDict would work.

Anyway, this just goes to show how limited the applicability of
switching to an odict in Python internals is.


From jh at improva.dk  Sat Oct 30 02:41:16 2010
From: jh at improva.dk (Jacob Holm)
Date: Sat, 30 Oct 2010 02:41:16 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=dgSevRbg+OZVDDd0cK2x25kaivySZuw9X5sDq@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
	<AANLkTi=dgSevRbg+OZVDDd0cK2x25kaivySZuw9X5sDq@mail.gmail.com>
Message-ID: <4CCB69AC.9020701@improva.dk>

On 2010-10-30 01:54, Guido van Rossum wrote:
> On Fri, Oct 29, 2010 at 4:46 PM, Jacob Holm <jh at improva.dk> wrote:
>> Which is exactly why I'm suggesting dropping "return value" from PEP 380
>> and then doing it *right* in PEP 3152, which has a much better rationale
>> for the "return value" feature anyway.
> 
> Oh, but I still don't like that PEP, and it has a much higher
> probability of failing completely. PEP 380 OTOH has my approval except
> for minor quibbles like g.close().
> 

I agree that PEP 3152 is far from perfect at this point, but I like the
basics.   The reason I am so concerned with the "return value" semantics
is that I see some problems we are having in PEP 3152 as indicating a
likely flaw/misfeature in PEP 380.  I would be much happier with both
PEPs if they didn't conflict in this way.

So much so that I would rather miss a few features in PEP 380 in the
*hope* of getting them right later with another PEP.  To quote the Zen:

  "never is often better than *right* now"

A PEP just for the "return value" shouldn't be too hard to add later if
PEP 3152 doesn't work out, and at that point we should have a better
idea about the best way of doing it.


- Jacob


From guido at python.org  Sat Oct 30 03:15:57 2010
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Oct 2010 18:15:57 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCB69AC.9020701@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
	<AANLkTi=dgSevRbg+OZVDDd0cK2x25kaivySZuw9X5sDq@mail.gmail.com>
	<4CCB69AC.9020701@improva.dk>
Message-ID: <AANLkTi=5zCeHwC7g0L5Qrue9+oC_Md5iMqoz4kph7vKA@mail.gmail.com>

On Fri, Oct 29, 2010 at 5:41 PM, Jacob Holm <jh at improva.dk> wrote:
> On 2010-10-30 01:54, Guido van Rossum wrote:
>> On Fri, Oct 29, 2010 at 4:46 PM, Jacob Holm <jh at improva.dk> wrote:
>>> Which is exactly why I'm suggesting dropping "return value" from PEP 380
>>> and then doing it *right* in PEP 3152, which has a much better rationale
>>> for the "return value" feature anyway.
>>
>> Oh, but I still don't like that PEP, and it has a much higher
>> probability of failing completely. PEP 380 OTOH has my approval except
>> for minor quibbles like g.close().

> I agree that PEP 3152 is far from perfect at this point, but I like the
> basics.

I thought the basics weren't even decided? Implicit definitions, or
implicit cocalls, terminology to be used, how to implement in Jython
or IronPython, probably more (I can't keep the ideas of that PEP in my
head so I end up blanking out on any discussion that mentions it).

I truly wish it was easier to experiment with syntax -- it would be so
much simpler if these PEPs could be accompanied by a library that
people can just import to use the new syntax (even if it's a C
extension) rather than by a patch to the core language.

The need to "get it right in one shot" is keeping back the ability to
experiment at any realistic scale, so all we see (on all sides) are
trivial examples that may highlight proposed features and anticipated
problems, but this is no way to gain experience with what the *real*
problems would be.

> The reason I am so concerned with the "return value" semantics
> is that I see some problems we are having in PEP 3152 as indicating a
likely flaw/misfeature in PEP 380.  I would be much happier with both
> PEPs if they didn't conflict in this way.

If there was a separate PEP specifying *just* returning a value from a
generator and how to get at that value using g.close(), without
yield-from, would those problems still exist? If not, that would be a
reason to move those out in a separate PEP. Assume such a PEP (call it
PEP X) existed, what would be the dependency tree? What would be the
conflicts? Would PEP 3152 make sense with PEP X but without (the rest
of) PEP 380?

> So much so, that I would rather miss a few features in PEP 380 in the
> *hope* of getting them right later with another PEP.

Can you be specific? Which features?

> To quote the Zen:
>
> ?"never is often better than *right* now"

Um, Python 3.3 can hardly be referred to as "*right* now".

There are plenty of arguments in the zen for PEP X, especially "If the
implementation is easy to explain, it may be a good idea." Both
returning a value from a generator and catching that value in
g.close() are really easy to implement and the implementation is easy
to explain. It's a small evolution from the current generator code.

> A PEP just for the "return value" shouldn't be too hard to add later if
> PEP 3152 doesn't work out, and at that point we should have a better
> idea about the best way of doing it.

It would a small miracle if PEP 3152 worked out. I'd much rather have
a solid fallback position now. I'm not pushing for rushing PEP X to
acceptance -- I'm just hoping we can write it now and discuss it on
its own merits without too much concern for PEP 3152 or even PEP 380,
although I personally still think that the interference with PEP 380
would minimal and not a reason for changing PEP X.

BTW I don't think I like piggybacking a return value on GeneratorExit.
Before you know it people will be writing except blocks catching
GeneratorExit intending to catch one coming from inside but
accidentally including a yield in the try block and catching one
coming from the outside. The nice thing about how GeneratorExit works
today is that you needn't worry about it coming from inside, since it
always comes from the outside *first*. This means that if you catch a
GeneratorExit, it is either one you threw into a generator yourself
(it just bounced back, meaning the generator didn't handle it at all),
or one that was thrown into you. But the pattern of catching
GeneratorExit and responding by returning a value is a reasonable
extension of the pattern of catching GeneratorExit and doing other
cleanup.
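
The pitfall would look something like this (a sketch, assuming the
piggybacked-value variant):

def inner():
    yield   # stand-in for a real subgenerator

def outer():
    try:
        yield from inner()
    except GeneratorExit as e:
        # Intended: pick up a value piggybacked on a GeneratorExit
        # coming from the inside.  Accidental: this same handler fires
        # when outer().close() is called from the outside, because the
        # try block contains a yield.
        return e.args[0] if e.args else None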

-- 
--Guido van Rossum (python.org/~guido)


From ncoghlan at gmail.com  Sat Oct 30 05:07:41 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 30 Oct 2010 13:07:41 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=5zCeHwC7g0L5Qrue9+oC_Md5iMqoz4kph7vKA@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
	<AANLkTi=dgSevRbg+OZVDDd0cK2x25kaivySZuw9X5sDq@mail.gmail.com>
	<4CCB69AC.9020701@improva.dk>
	<AANLkTi=5zCeHwC7g0L5Qrue9+oC_Md5iMqoz4kph7vKA@mail.gmail.com>
Message-ID: <AANLkTikJwWvb=1RSnC4XkeviowXnrvyALiCu_OYwgxxh@mail.gmail.com>

On Sat, Oct 30, 2010 at 11:15 AM, Guido van Rossum <guido at python.org> wrote:
> BTW I don't think I like piggybacking a return value on GeneratorExit.
> Before you know it people will be writing except blocks catching
> GeneratorExit intending to catch one coming from inside but
> accidentally including a yield in the try block and catching one
> coming from the outside. The nice thing about how GeneratorExit works
> today is that you needn't worry about it coming from inside, since it
> always comes from the outside *first*. This means that if you catch a
> GeneratorExit, it is either one you threw into a generator yourself
> (it just bounced back, meaning the generator didn't handle it at all),
> or one that was thrown into you. But the pattern of catching
> GeneratorExit and responding by returning a value is a reasonable
> extension of the pattern of catching GeneratorExit and doing other
> cleanup.

(TLDR version: I'm -1 on Guido's modified close() semantics if there
is no way to get the result out of a yield from expression that is
terminated by GeneratorExit, but I'm +1 if we tweak PEP 380 to make
the result available on the reraised GeneratorExit instance, thus
allowing framework authors to develop ways to correctly unwind a
generator stack in response to close())

Stepping back a bit, let's look at the ways a framework may "close" a
generator-based operation (or substep of a generator).

1. Send in a sentinel value (often None, but you could easily reuse
the exception types as sentinel values  as well)
2. Throw in GeneratorExit explicitly
3. Throw in StopIteration explicitly
4. Throw in a different specific exception
5. Call g.close()

Having close() return a value only helps with the last option, and
only if the coroutine is set up to work that way. Yield from also
isn't innately set up to unwind correctly in any of these cases,
without some form of framework based signalling from the inner
generator to indicate whether or not the outer generator should
continue or bail out.

Now, *if* close() were set up to return a value, then that second
point makes the idea less useful than it appears. To go back to the
simple summing example (not my
too-complicated-for-a-mailing-list-discussion version which I'm not
going to try to rehabilitate):

def gtally():
  count = tally = 0
  try:
    while 1:
      tally += yield
      count += 1
  except GeneratorExit:
    pass
  return count, tally

Fairly straightforward, but one of the promises of PEP 380 is that it
allows us to factor out some or all of a generator's internal logic
without affecting the externally visible semantics. So, let's do that:

  def gtally2():
    return (yield from gtally())

Unless the PEP 380 yield from expansion is changed, Guido's proposed
"close() returns the value on StopIteration" just broke this
equivalence for gtally2() - since the yield from expansion turns the
StopIteration back into a GeneratorExit, the return value of
gtally2().close() is always going to be None instead of the expected
(count, tally) 2-tuple. Since the value of the internal call to
close() is thrown away completely, there is absolutely nothing the
author of gtally2() can do to fix it (aside from not using yield from
at all). To me, if Guido's idea is adopted, this outcome is as
illogical and unacceptable as the following returning None:

  def sum2(seq):
    return sum(seq)

We already thrashed out long ago that the yield from handling of
GeneratorExit needs to work the way it does in order to serve its
primary purpose of releasing resources, so allowing the inner
StopIteration to propagate with the exception value attached is not an
option.

The question is whether or not there is a way to implement the
return-value-from-close() idiom in a way that *doesn't* completely
break the equivalence between gtally() and gtally2() above. I think
there is: store the prospective return-value on the GeneratorExit
instance and have the yield from expansion provide the most recent
return value as it unwinds the stack.

To avoid giving false impressions as to which level of the stack
return values are from, gtally2() would need to be implemented a bit
differently in order to *also* convert GeneratorExit to StopIteration:

  def gtally2():
    # The PEP 380 equivalent of a "tail call" if g.close() returns a value
    try:
      yield from gtally()
    except GeneratorExit as ex:
      return ex.value

Specific proposed additions/modifications to PEP 380:

1. The new "value" attribute is added to GeneratorExit as well as
StopIteration and is explicitly read/write

2. The semantics of the generator close method are modified to be:

  def close(self):
    try:
      self.throw(GeneratorExit)
    except StopIteration as ex:
      return ex.value
    except GeneratorExit:
      return None # Ignore the value, as it didn't come from
                  # the outermost generator
    raise RuntimeError("Generator ignored GeneratorExit")

3.  The GeneratorExit handling semantics for the yield from expansion
are modified to be:

        except GeneratorExit as _e:
            try:
                _m = _i.close
            except AttributeError:
                pass
            else:
                _e.value = _m() # Store close() result on the exception
            raise _e

With these modifications, a framework could then quite easily provide
a context manager to make the idiom a little more readable and hide
the fact that GeneratorExit is being caught at all:

from contextlib import contextmanager

class GenResult:
    def __init__(self): self.value = None

@contextmanager
def generator_return():
    result = GenResult()
    try:
      yield result  # expose the result holder to the with statement
    except GeneratorExit as ex:
      result.value = ex.value

def gtally():
  # The CM suppresses GeneratorExit, allowing us
  # to convert it to StopIteration
  count = tally = 0
  with generator_return():
    while 1:
      tally += yield
      count += 1
  return count, tally

def gtally2():
  # The CM *also* collects the value of any inner
  # yield from expression, allowing easier tail calls
  with generator_return() as result:
    yield from gtally()
  return result.value
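
With the modified close() from point 2 above, both versions could then
be driven the same way, something like (a sketch of the intended usage
under the proposed semantics):

def main():
    g = gtally2()
    next(g)             # prime the generator
    for x in range(10):
        g.send(x)
    print(g.close())    # -> (10, 45) with these changes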

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From greg.ewing at canterbury.ac.nz  Sat Oct 30 05:09:04 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 30 Oct 2010 16:09:04 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA752D.5090904@canterbury.ac.nz>
	<AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>
Message-ID: <4CCB8C50.2010404@canterbury.ac.nz>

Guido van Rossum wrote:

> This seems to be the crux of your objection. But if I look carefully
> at the expansion in the current version of PEP 380, I don't think this
> problem actually happens: If the outer generator catches
> GeneratorExit, it closes the inner generator (by calling its close
> method, if it exists) and then re-raises the GeneratorExit:

Yes, but if you want close() to cause the generator to finish
normally, you *don't* want that to happen. You would have to
surround the yield-from call with a try block to catch the
GeneratorExit, and even then you would lose the return value
from the inner generator, which you're probably going to
want.

> Could it be that you are thinking of your accelerated implementation,

No, not really. The same issues arise either way.

> It looks to me as if using g.close() to capture the return value of a
> generator is not of much value when using yield-from, but it can be of
> value for the simpler pattern that started this thread.

My concern is that this feature would encourage designing
generators with APIs that make it difficult to refactor the
implementation using yield-from later on. Simple things
don't always stay simple.

> def summer():
>   total = 0
>   try:
>     while True:
>       total += yield
>   except GeneratorExit:
>     raise StopIteration(total)  ## return total

I don't see how this gains you much. The generator is about
as complicated either way.

The only thing that's simpler is the final step of getting
the result, which in my version can be taken care of with
a fairly generic helper function that could be provided
by the stdlib.
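
For instance (a sketch, assuming end of data is signalled by sending
in a sentinel; the helper's name is invented):

def finish(gen, sentinel=None):
    # Send the end-of-data sentinel and return the value the
    # generator attaches to the StopIteration it raises in response.
    try:
        gen.send(sentinel)
    except StopIteration as e:
        return e.args[0] if e.args else None
    raise RuntimeError("generator did not finish")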

-- 
Greg


From guido at python.org  Sat Oct 30 05:11:24 2010
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Oct 2010 20:11:24 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CC63065.9040507@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
Message-ID: <AANLkTikamsk_e0aM_up8x2kYmhL0pfG3dDggYW6REBV5@mail.gmail.com>

On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm <jh at improva.dk> wrote:
> I had some later suggestions for how to change the expansion, see e.g.
> http://mail.python.org/pipermail/python-ideas/2009-April/004195.html (I
> find that version easier to reason about even now 1½ years later)

I like that style too. Here it is with some annotations.

     _i = iter(EXPR)
     _m, _a = next, (_i,)
     # _m is a function or a bound method;
     #  _a is a tuple of arguments to call _m with;
     # both are set to other values further down
     while 1:
         # Move the generator along
         try:
             _y = _m(*_a)
         except StopIteration as _e:
             _r = _e.value
             break

         # Yield _y and process what came back in
         try:
             _s = yield _y
         except GeneratorExit as _e:
             # Request to exit
             try:
                 # NOTE: This _m is unrelated to the other
                 _m = _i.close
             except AttributeError:
                 pass
             else:
                 _m()
             raise _e  # Always exit
         except BaseException as _e:
             # An exception was thrown in; pass it along
             _a = sys.exc_info()
             try:
                 _m = _i.throw
             except AttributeError:
                 # Can't throw it in; throw it back out
                 raise _e
         else:
             # A value was sent in; pass it along
             if _s is None:
                 _m, _a = next, (_i,)
             else:
                 _m, _a = _i.send, (_s,)

     RESULT = _r

I do note that this is a bit subtle; I don't like the reusing of _m
and it's hard to verify that _m and _a are set on every path that goes
back to the top of the loop.

-- 
--Guido van Rossum (python.org/~guido)


From greg.ewing at canterbury.ac.nz  Sat Oct 30 05:17:50 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 30 Oct 2010 16:17:50 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCB5A06.60408@improva.dk>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA752D.5090904@canterbury.ac.nz> <4CCA9E3C.8000604@improva.dk>
	<4CCB52B9.8010203@canterbury.ac.nz> <4CCB5A06.60408@improva.dk>
Message-ID: <4CCB8E5E.5080504@canterbury.ac.nz>

Jacob Holm wrote:

> It is relevant if we later want to distinguish between "return" and
> "raise StopIteration".

We want to distinguish between return *without* a
value and StopIteration too.

> My suggestion is to cut/change some features from PEP 380 that are in
> the way

But having StopIteration carry a value is *not* one of the
things that's in the way, as far as I can see.

-- 
Greg


From guido at python.org  Sat Oct 30 05:26:40 2010
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Oct 2010 20:26:40 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCB8C50.2010404@canterbury.ac.nz>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA752D.5090904@canterbury.ac.nz>
	<AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>
	<4CCB8C50.2010404@canterbury.ac.nz>
Message-ID: <AANLkTikr9FDhTLisoPfR+bkDZTR_KVYVk=3FBSBT5zJw@mail.gmail.com>

On Fri, Oct 29, 2010 at 8:09 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
>
>> This seems to be the crux of your objection. But if I look carefully
>> at the expansion in the current version of PEP 380, I don't think this
>> problem actually happens: If the outer generator catches
>> GeneratorExit, it closes the inner generator (by calling its close
>> method, if it exists) and then re-raises the GeneratorExit:
>
> Yes, but if you want close() to cause the generator to finish
> normally, you *don't* want that to happen. You would have to
> surround the yield-from call with a try block to catch the
> GeneratorExit,

Yeah, putting such a try-block around yield from works just as it
works around plain yield: it captures the GeneratorExit thrown in. As
a bonus, the inner generator is first closed, but the yield-from
expression which was interrupted is not completed; just like anything
else that raises an exception, execution of the code stops immediately
and resumes at the except block.

> and even then you would lose the return value
> from the inner generator, which you're probably going to
> want.

Really? Can you show a realistic use case? (There was Nick's
average-of-sums example but I think nobody liked it.)

>> Could it be that you are thinking of your accelerated implementation,
>
> No, not really. The same issues arise either way.

Ok.

>> It looks to me as if using g.close() to capture the return value of a
>> generator is not of much value when using yield-from, but it can be of
>> value for the simpler pattern that started this thread.
>
> My concern is that this feature would encourage designing
> generators with APIs that make it difficult to refactor the
> implementation using yield-from later on. Simple things
> don't always stay simple.

Yeah, but there is also YAGNI. We shouldn't plan every simple thing to
become complex; in fact we should expect most simple things to stay
simple. Otherwise you'd never use lists and dicts but start with
classes right away.

>> def summer():
>>   total = 0
>>   try:
>>     while True:
>>       total += yield
>>   except GeneratorExit:
>>     raise StopIteration(total)  ## return total
>
> I don't see how this gains you much. The generator is about
> as complicated either way.

I'm just concerned about the following:

> The only thing that's simpler is the final step of getting
> the result, which in my version can be taken care of with
> a fairly generic helper function that could be provided
> by the stdlib.

In my case too -- it would just be a method on the generator named close(). :-)

In addition I like merging use cases that have some overlap, if the
non-overlapping parts do not conflict. E.g. I believe the reason we
all ended up agreeing (at least last year :-) that returning a value
should be done through StopIteration was that this makes it so that
"return", "return None", "return <value>" and falling of the end of
the block are treated uniformly so that equivalences apply both ways.
In the case of close(), I *like* that the response to close() can be
either cleaning up or returning a value and that close() doesn't care
which of the two you do (and in fact it can't tell the difference).
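
To spell that out with a sketch (hypothetical, of course, since
close() returning a value is exactly what's being proposed here):

    def cleaner():
        try:
            while True:
                yield
        except GeneratorExit:
            pass          # just cleanup; close() would return None

    def returner():
        try:
            while True:
                yield
        except GeneratorExit:
            return 42     # close() would return 42

Either way the caller just writes g.close() and takes what comes back.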

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Sat Oct 30 05:47:04 2010
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Oct 2010 20:47:04 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTikJwWvb=1RSnC4XkeviowXnrvyALiCu_OYwgxxh@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
	<AANLkTi=dgSevRbg+OZVDDd0cK2x25kaivySZuw9X5sDq@mail.gmail.com>
	<4CCB69AC.9020701@improva.dk>
	<AANLkTi=5zCeHwC7g0L5Qrue9+oC_Md5iMqoz4kph7vKA@mail.gmail.com>
	<AANLkTikJwWvb=1RSnC4XkeviowXnrvyALiCu_OYwgxxh@mail.gmail.com>
Message-ID: <AANLkTik-gYZdUkeE7=W+oW58FGtcp+-_vz1HJ7Fnb7d7@mail.gmail.com>

On Fri, Oct 29, 2010 at 8:07 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Sat, Oct 30, 2010 at 11:15 AM, Guido van Rossum <guido at python.org> wrote:
>> BTW I don't think I like piggybacking a return value on GeneratorExit.
>> Before you know it people will be writing except blocks catching
>> GeneratorExit intending to catch one coming from inside but
>> accidentally including a yield in the try block and catching one
>> coming from the outside. The nice thing about how GeneratorExit works
>> today is that you needn't worry about it coming from inside, since it
>> always comes from the outside *first*. This means that if you catch a
>> GeneratorExit, it is either one you threw into a generator yourself
>> (it just bounced back, meaning the generator didn't handle it at all),
>> or one that was thrown into you. But the pattern of catching
>> GeneratorExit and responding by returning a value is a reasonable
>> extension of the pattern of catching GeneratorExit and doing other
>> cleanup.
>
> (TLDR version: I'm -1 on Guido's modified close() semantics if there
> is no way to get the result out of a yield from expression that is
> terminated by GeneratorExit, but I'm +1 if we tweak PEP 380 to make
> the result available on the reraised GeneratorExit instance, thus
> allowing framework authors to develop ways to correctly unwind a
> generator stack in response to close())
>
> Stepping back a bit, let's look at the ways a framework may "close" a
> generator-based operation (or substep of a generator).
>
> 1. Send in a sentinel value (often None, but you could easily reuse
> the exception types as sentinel values as well)
> 2. Throw in GeneratorExit explicitly
> 3. Throw in StopIteration explicitly

Throwing in StopIteration seems more unnatural than any other option.

> 4. Throw in a different specific exception
> 5. Call g.close()
>
> Having close() return a value only helps with the last option, and
> only if the coroutine is set up to work that way. Yield from also
> isn't innately set up to unwind correctly in any of these cases,
> without some form of framework based signalling from the inner
> generator to indicate whether or not the outer generator should
> continue or bail out.

Yeah, there is definitely some kind of convention needed here. A
framework or app can always choose not to use g.close() for this
purpose (heck, several current frameworks use yield to return a value)
and in some cases that's just the right thing. Just like in other flow
control situations you can often choose between sentinel values,
exceptions, or something else (e.g. flag variables that must be
explicitly tested).

> Now, *if* close() were set up to return a value, then that second
> point makes the idea less useful than it appears. To go back to the
> simple summing example (not my
> too-complicated-for-a-mailing-list-discussion version which I'm not
> going to try to rehabilitate):
>
> def gtally():
>   count = tally = 0
>   try:
>     while 1:
>       tally += yield
>       count += 1
>   except GeneratorExit:
>     pass
>   return count, tally

I like this example.

> Fairly straightforward, but one of the promises of PEP 380 is that it
> allows us to factor out some or all of a generator's internal logic
> without affecting the externally visible semantics. So, let's do that:
>
>   def gtally2():
>     return (yield from gtally())

And I find this a good starting point.

> Unless the PEP 380 yield from expansion is changed, Guido's proposed
> "close() returns the value on StopIteration" just broke this
> equivalence for gtally2() - since the yield from expansion turns the
> StopIteration back into a GeneratorExit, the return value of
> gtally2.close is always going to be None instead of the expected
> (count, tally) 2-tuple. Since the value of the internal call to
> close() is thrown away completely, there is absolute nothing the
> author of gtally2() can do to fix it (aside from not using yield from
> at all).

Right, they could do something based on the (imperfect) equivalency
between "yield from f()" and "for x in f(): yield x".

> To me, if Guido's idea is adopted, this outcome is as
> illogical and unacceptable as the following returning None:
>
>   def sum2(seq):
>     return sum(seq)

Maybe.

> We already thrashed out long ago that the yield from handling of
> GeneratorExit needs to work the way it does in order to serve its
> primary purpose of releasing resources, so allowing the inner
> StopIteration to propagate with the exception value attached is not an
> option.
>
> The question is whether or not there is a way to implement the
> return-value-from-close() idiom in a way that *doesn't* completely
> break the equivalence between gtally() and gtally2() above. I think
> there is: store the prospective return-value on the GeneratorExit
> instance and have the yield from expansion provide the most recent
> return value as it unwinds the stack.
>
> To avoid giving false impressions as to which level of the stack
> return values are from, gtally2() would need to be implemented a bit
> differently in order to *also* convert GeneratorExit to StopIteration:
>
>   def gtally2():
>     # The PEP 380 equivalent of a "tail call" if g.close() returns a value
>     try:
>       yield from gtally()
>     except GeneratorExit as ex:
>       return ex.value

Unfortunately this misses the goal of equivalency between gtally() and
your original gtally2() by a mile. Having to add extra except clauses
around each yield-from IMO defeats the purpose.

> Specific proposed additions/modifications to PEP 380:
>
> 1. The new "value" attribute is added to GeneratorExit as well as
> StopIteration and is explicitly read/write

I already posted an argument against this.

> 2. The semantics of the generator close method are modified to be:
>
> def close(self):
>   try:
>     self.throw(GeneratorExit)
>   except StopIteration as ex:
>     return ex.value
>   except GeneratorExit:
>     # Ignore the value, as it didn't come from the outermost generator
>     return None
>   raise RuntimeError("Generator ignored GeneratorExit")
>
> 3. The GeneratorExit handling semantics for the yield from expansion
> are modified to be:
>
>     except GeneratorExit as _e:
>         try:
>             _m = _i.close
>         except AttributeError:
>             pass
>         else:
>             _e.value = _m()  # Store close() result on the exception
>         raise _e
>
> With these modifications, a framework could then quite easily provide
> a context manager to make the idiom a little more readable and hide
> the fact that GeneratorExit is being caught at all:
>
> class GenResult():
>   def __init__(self): self.value = None
>
> @contextmanager  # from contextlib
> def generator_return():
>   result = GenResult()
>   try:
>     yield result  # hand the result holder to the "as" clause
>   except GeneratorExit as ex:
>     result.value = ex.value
>
> def gtally():
>   # The CM suppresses GeneratorExit, allowing us
>   # to convert it to StopIteration
>   count = tally = 0
>   with generator_return():
>     while 1:
>       tally += yield
>       count += 1
>   return count, tally
>
> def gtally2():
>   # The CM *also* collects the value of any inner
>   # yield from expression, allowing easier tail calls
>   with generator_return() as result:
>     yield from gtally()
>   return result.value

I agree that you've poked a hole in my proposal. If we can change the
expansion of yield-from to restore the equivalency between gtally()
and the simplest gtally2(), thereby restoring the original refactoring
principle, we might be able to save it. Otherwise I declare defeat.
Right now I am too tired to think of such an expansion, but I recall
trying my hand at one a few nights ago and realizing that I'd
introduced another problem. So this does not look too hopeful,
especially since I really don't like extending GeneratorExit for the
purpose.

-- 
--Guido van Rossum (python.org/~guido)


From ncoghlan at gmail.com  Sat Oct 30 05:48:30 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 30 Oct 2010 13:48:30 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTikr9FDhTLisoPfR+bkDZTR_KVYVk=3FBSBT5zJw@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA752D.5090904@canterbury.ac.nz>
	<AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>
	<4CCB8C50.2010404@canterbury.ac.nz>
	<AANLkTikr9FDhTLisoPfR+bkDZTR_KVYVk=3FBSBT5zJw@mail.gmail.com>
Message-ID: <AANLkTi=Vq3Ja3PHqypdqRpbxdTHB-L1DPcc7uYcmweqL@mail.gmail.com>

On Sat, Oct 30, 2010 at 1:26 PM, Guido van Rossum <guido at python.org> wrote:
> Really? Can you show a realistic use case? (There was Nick's
> average-of-sums example but I think nobody liked it.)

Yeah, I'm much happier with the tally example. It got rid of all the
irrelevant framework-y parts :)

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From ncoghlan at gmail.com  Sat Oct 30 06:05:20 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 30 Oct 2010 14:05:20 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTik-gYZdUkeE7=W+oW58FGtcp+-_vz1HJ7Fnb7d7@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
	<AANLkTi=dgSevRbg+OZVDDd0cK2x25kaivySZuw9X5sDq@mail.gmail.com>
	<4CCB69AC.9020701@improva.dk>
	<AANLkTi=5zCeHwC7g0L5Qrue9+oC_Md5iMqoz4kph7vKA@mail.gmail.com>
	<AANLkTikJwWvb=1RSnC4XkeviowXnrvyALiCu_OYwgxxh@mail.gmail.com>
	<AANLkTik-gYZdUkeE7=W+oW58FGtcp+-_vz1HJ7Fnb7d7@mail.gmail.com>
Message-ID: <AANLkTinGuj=PyuYnWY2L1aRT6nBqKoC2Er1M+urXWWc=@mail.gmail.com>

On Sat, Oct 30, 2010 at 1:47 PM, Guido van Rossum <guido at python.org> wrote:
> I agree that you've poked a hole in my proposal. If we can change the
> expansion of yield-from to restore the equivalency between gtally()
> and the simplest gtally2(), thereby restoring the original refactoring
> principle, we might be able to save it. Otherwise I declare defeat.
> Right now I am too tired to think of such an expansion, but I recall
> trying my hand at one a few nights ago and realizing that I'd
> introduced another problem. So this does not look too hopeful,
> especially since I really don't like extending GeneratorExit for the
> purpose.

I tried to make the original version work as well, but always ran into
one of two problems:
- breaking GeneratorExit for resource cleanup
- "leaking" inner return values so they looked like they came from the
outer function.

Here's a crazy idea though. What if gtally2() could be written as follows:

def gtally2():
  return from gtally()

If we make the generator tail call explicit, then the interpreter can
do the right thing (i.e. raise StopIteration-with-a-value instead of
reraising GeneratorExit) and we don't need to try to shoehorn two
different sets of semantics into the single yield-from construct.

To give some formal semantics to the new statement:

    # RETURN FROM semantics
    _i = iter(EXPR)
    _m, _a = next, (_i,)
    # _m is a function or a bound method;
    #  _a is a tuple of arguments to call _m with;
    # both are set to other values further down
    while 1:
        # Move the generator along
        # Unlike YIELD FROM, we allow StopIteration to
        # escape, since this is a tail call
        _y = _m(*_a)

        # Yield _y and process what came back in
        try:
            _s = yield _y
        except GeneratorExit as _e:
            # Request to exit
            try:
                # Don't reuse _m, since we're bailing out of the loop
                _close = _i.close
            except AttributeError:
                pass
            else:
                # Unlike YIELD FROM, we use StopIteration
                # to return the value of the inner close() call
                raise StopIteration(_close())
            # If there is no inner close() attribute, return None
            raise StopIteration
        except BaseException as _e:
            # An exception was thrown in; pass it along
            _a = sys.exc_info()
            try:
                _m = _i.throw
            except AttributeError:
                # Can't throw it in; throw it back out
                raise _e
        else:
            # A value was sent in; pass it along
            if _s is None:
                _m, _a = next, (_i,)
            else:
                _m, _a = _i.send, (_s,)
    # Unlike YIELD FROM, this is a statement, so there is no RESULT

Summary of the differences between return from and yield from:
- statement, not an expression
- an inner StopIteration is allowed to propagate
- a thrown in GeneratorExit is converted to StopIteration

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From ncoghlan at gmail.com  Sat Oct 30 06:26:48 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 30 Oct 2010 14:26:48 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTinGuj=PyuYnWY2L1aRT6nBqKoC2Er1M+urXWWc=@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
	<AANLkTi=dgSevRbg+OZVDDd0cK2x25kaivySZuw9X5sDq@mail.gmail.com>
	<4CCB69AC.9020701@improva.dk>
	<AANLkTi=5zCeHwC7g0L5Qrue9+oC_Md5iMqoz4kph7vKA@mail.gmail.com>
	<AANLkTikJwWvb=1RSnC4XkeviowXnrvyALiCu_OYwgxxh@mail.gmail.com>
	<AANLkTik-gYZdUkeE7=W+oW58FGtcp+-_vz1HJ7Fnb7d7@mail.gmail.com>
	<AANLkTinGuj=PyuYnWY2L1aRT6nBqKoC2Er1M+urXWWc=@mail.gmail.com>
Message-ID: <AANLkTi=ADF6Ck10gOK2dnqKYWgwuFcM2uYPnWWu4i3QF@mail.gmail.com>

On Sat, Oct 30, 2010 at 2:05 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> To give some formal semantics to the new statement:

Oops, those would be some-formal-but-incorrect semantics that made the
StopIteration exceptions visible in the current frame. Fixed below to
actually kill the current frame properly, letting the generator
instance take care of turning the return value into a StopIteration
exception.

   # RETURN FROM semantics
   _i = iter(EXPR)
   _m, _a = next, (_i,)
   # _m is a function or a bound method;
   #  _a is a tuple of arguments to call _m with;
   # both are set to other values further down
   while 1:
       # Move the generator along
       # Unlike YIELD FROM, we convert StopIteration
       # into an immediate return (since this is a tail call)
       try:
           _y = _m(*_a)
       except StopIteration as _e:
           return _e.value

       # Yield _y and process what came back in
       try:
           _s = yield _y
       except GeneratorExit as _e:
           # Request to exit
           try:
               # Don't reuse _m, since we're bailing out of the loop
               _close = _i.close
           except AttributeError:
               pass
           else:
               # Unlike YIELD FROM, we return the
               # value of the inner close() call
               return _close()
           # If there is no inner close() attribute,
           # we just return None
           return
       except BaseException as _e:
           # An exception was thrown in; pass it along
           _a = sys.exc_info()
           try:
               _m = _i.throw
           except AttributeError:
               # Can't throw it in; throw it back out
               raise _e
       else:
           # A value was sent in; pass it along
           if _s is None:
               _m, _a = next, (_i,)
           else:
               _m, _a = _i.send, (_s,)
   # Unlike YIELD FROM, this is a statement, so there is no RESULT

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From greg.ewing at canterbury.ac.nz  Sat Oct 30 08:12:54 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 30 Oct 2010 19:12:54 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=5zCeHwC7g0L5Qrue9+oC_Md5iMqoz4kph7vKA@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
	<AANLkTi=dgSevRbg+OZVDDd0cK2x25kaivySZuw9X5sDq@mail.gmail.com>
	<4CCB69AC.9020701@improva.dk>
	<AANLkTi=5zCeHwC7g0L5Qrue9+oC_Md5iMqoz4kph7vKA@mail.gmail.com>
Message-ID: <4CCBB766.3020104@canterbury.ac.nz>

Guido van Rossum wrote:

> I thought the basics weren't even decided?

I'll be posting a new version soon that ought to pin
things down more precisely, then we'll have something
to talk about.

> I truly wish it was easier to experiment with syntax -- it would be so
> much simpler if these PEPs could be accompanied by a library that
> people can just import to use the new syntax

Hmmm. Maybe if there were an option to use a parser and
compiler written in pure Python? It wouldn't be fast,
but it would be easier to hack experimental features into.

> If there was a separate PEP specifying *just* returning a value from a
> generator and how to get at that value using g.close(), without
> yield-from, would those problems still exist?

I don't think it's necessary to move the value-returning
part into another PEP, because it doesn't conflict with
anything. But close() returning that value could easily
be moved into another PEP that depended on PEP 380.

PEP 3152 would still depend on 380, not on the new PEP.

> Would PEP 3152 make sense with PEP X but without (the rest
> of) PEP 380?

For PEP 3152 to *not* depend on PEP 380, it would have
to duplicate almost all of PEP 380's content.

-- 
Greg



From greg.ewing at canterbury.ac.nz  Sat Oct 30 08:13:29 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 30 Oct 2010 19:13:29 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTikJwWvb=1RSnC4XkeviowXnrvyALiCu_OYwgxxh@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
	<AANLkTi=dgSevRbg+OZVDDd0cK2x25kaivySZuw9X5sDq@mail.gmail.com>
	<4CCB69AC.9020701@improva.dk>
	<AANLkTi=5zCeHwC7g0L5Qrue9+oC_Md5iMqoz4kph7vKA@mail.gmail.com>
	<AANLkTikJwWvb=1RSnC4XkeviowXnrvyALiCu_OYwgxxh@mail.gmail.com>
Message-ID: <4CCBB789.4030208@canterbury.ac.nz>

Nick Coghlan wrote:

> 1. Send in a sentinel value (often None, but you could easily reuse
> the exception types as sentinel values  as well)
> 2. Throw in GeneratorExit explicitly
> 3. Throw in StopIteration explicitly
> 4. Throw in a different specific exception
> 5. Call g.close()
> 
> Yield from also
> isn't innately set up to unwind correctly in any of these cases,

On the contrary, I think it works perfectly well with 1, and
also with 4 as long as the inner generator catches it in the
right place.

Note that you *don't* want to unwind in this situation, you
want to continue with normal processing, in the same way that
a function reading from a file continues with normal processing
when it reaches the end of the file.

> without some form of framework based signalling from the inner
> generator to indicate whether or not the outer generator should
> continue or bail out.

No such signalling is necessary -- all it needs to do is
return in the normal way.

Or to put it another way, from the yield-from statement's
point of view, the signal is that it raised StopIteration
and not GeneratorExit.

> To avoid giving false impressions as to which level of the stack
> return values are from, gtally2() would need to be implemented a bit
> differently in order to *also* convert GeneratorExit to StopIteration:
> 
>   def gtally2():
>     # The PEP 380 equivalent of a "tail call" if g.close() returns a value
>     try:
>       yield from gtally()
>     except GeneratorExit as ex:
>       return ex.value

Exactly, which I think is a horrible thing to have to do,
and I'm loathe to make any modification to PEP 380 to
support this kind of pattern.

> With these modifications, a framework could then quite easily provide
> a context manager to make the idiom a little more readable

Which would be papering over an awful mess that has no
need to exist in the first place, as long as you don't
insist on using technique no. 5.

-- 
Greg


From greg.ewing at canterbury.ac.nz  Sat Oct 30 08:16:44 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 30 Oct 2010 19:16:44 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTikr9FDhTLisoPfR+bkDZTR_KVYVk=3FBSBT5zJw@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA752D.5090904@canterbury.ac.nz>
	<AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>
	<4CCB8C50.2010404@canterbury.ac.nz>
	<AANLkTikr9FDhTLisoPfR+bkDZTR_KVYVk=3FBSBT5zJw@mail.gmail.com>
Message-ID: <4CCBB84C.2080809@canterbury.ac.nz>

Guido van Rossum wrote:
> On Fri, Oct 29, 2010 at 8:09 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
>>and even then you would lose the return value
>>from the inner generator, which you're probably going to
>>want.
> 
> Really? Can you show a realistic use case?

Here's an attempt:

   def variancer():
     # Compute variance of values sent in (details left
     # as an exercise)

   def stddever():
     # Compute standard deviation of values sent in
     v = yield from variancer()
     return sqrt(v)
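
(For concreteness, the exercise might be filled in like this -- a
sketch using Welford's online algorithm, with a None sentinel ending
the stream so that plain yield-from suffices:)

   def variancer(end=None):
     n = 0
     mean = m2 = 0.0
     while True:
       x = yield
       if x is end:
         return m2 / n    # population variance; n == 0 is an error
       n += 1
       delta = x - mean
       mean += delta / n
       m2 += delta * (x - mean)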

-- 
Greg


From greg.ewing at canterbury.ac.nz  Sat Oct 30 08:18:20 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 30 Oct 2010 19:18:20 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTinGuj=PyuYnWY2L1aRT6nBqKoC2Er1M+urXWWc=@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
	<AANLkTi=dgSevRbg+OZVDDd0cK2x25kaivySZuw9X5sDq@mail.gmail.com>
	<4CCB69AC.9020701@improva.dk>
	<AANLkTi=5zCeHwC7g0L5Qrue9+oC_Md5iMqoz4kph7vKA@mail.gmail.com>
	<AANLkTikJwWvb=1RSnC4XkeviowXnrvyALiCu_OYwgxxh@mail.gmail.com>
	<AANLkTik-gYZdUkeE7=W+oW58FGtcp+-_vz1HJ7Fnb7d7@mail.gmail.com>
	<AANLkTinGuj=PyuYnWY2L1aRT6nBqKoC2Er1M+urXWWc=@mail.gmail.com>
Message-ID: <4CCBB8AC.2010402@canterbury.ac.nz>

Nick Coghlan wrote:

> Here's a crazy idea though. What if gtally2() could be written as follows:
> 
> def gtally2():
>   return from gtally()

That seems like an excessively special case. Most of the time
you're going to want to do some processing on the value, not
just return it immediately.

-- 
Greg


From ncoghlan at gmail.com  Sat Oct 30 08:25:36 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 30 Oct 2010 16:25:36 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCBB789.4030208@canterbury.ac.nz>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
	<AANLkTi=dgSevRbg+OZVDDd0cK2x25kaivySZuw9X5sDq@mail.gmail.com>
	<4CCB69AC.9020701@improva.dk>
	<AANLkTi=5zCeHwC7g0L5Qrue9+oC_Md5iMqoz4kph7vKA@mail.gmail.com>
	<AANLkTikJwWvb=1RSnC4XkeviowXnrvyALiCu_OYwgxxh@mail.gmail.com>
	<4CCBB789.4030208@canterbury.ac.nz>
Message-ID: <AANLkTi=5T3=-iO1ELX_Yi34kfB94BGxJp5PibU_+8vAK@mail.gmail.com>

On Sat, Oct 30, 2010 at 4:13 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Nick Coghlan wrote:
>
>> 1. Send in a sentinel value (often None, but you could easily reuse
>> the exception types as sentinel values as well)
>> 2. Throw in GeneratorExit explicitly
>> 3. Throw in StopIteration explicitly
>> 4. Throw in a different specific exception
>> 5. Call g.close()
>>
>> Yield from also
>> isn't innately set up to unwind correctly in any of these cases,
>
> On the contrary, I think it works perfectly well with 1, and
> also with 4 as long as the inner generator catches it in the
> right place.

Yeah, I'd agree with that. Unwinding the stack correctly requires
cooperation from all of the intervening layers and the logic for that
is likely to be a little clumsy, but the issues are not insurmountable
(my own "return from" suggestion requires cooperation as well, since
the layers have to explicitly invoke the alternate semantics to
indicate that return values should be passed through).

"return from" would make more sense as its own PEP, with the construct
possibly given a meaning in ordinary functions as well (e.g. the
occasionally sought tail-call optimisation in recursive functions).

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From rrr at ronadam.com  Sat Oct 30 08:42:18 2010
From: rrr at ronadam.com (Ron Adam)
Date: Sat, 30 Oct 2010 01:42:18 -0500
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCB8C50.2010404@canterbury.ac.nz>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>	<4CC6E94F.3090702@improva.dk>	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>	<4CC889F1.8010603@improva.dk>	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>	<4CC939E5.5070700@improva.dk>	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>	<4CC9FC87.1040600@canterbury.ac.nz>	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>	<4CCA752D.5090904@canterbury.ac.nz>	<AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>
	<4CCB8C50.2010404@canterbury.ac.nz>
Message-ID: <4CCBBE4A.3040503@ronadam.com>



On 10/29/2010 10:09 PM, Greg Ewing wrote:
> Guido van Rossum wrote:
>
>> This seems to be the crux of your objection. But if I look carefully
>> at the expansion in the current version of PEP 380, I don't think this
>> problem actually happens: If the outer generator catches
>> GeneratorExit, it closes the inner generator (by calling its close
>> method, if it exists) and then re-raises the GeneratorExit:
>
> Yes, but if you want close() to cause the generator to finish
> normally, you *don't* want that to happen. You would have to
> surround the yield-from call with a try block to catch the
> GeneratorExit, and even then you would lose the return value
> from the inner generator, which you're probably going to
> want.


Ok, after thinking about this for a while, I think the "yield from" would 
be too limited if it could only be used for consumers that must run until 
the end. That rules out a whole lot of pipes, filters and other things that 
consume-some, emit-some, consume-some_more, and emit-some_more.


I think I figured out something that may be more flexible and isn't too 
complicated.

The trick is how to tell the "yield from" to stop delegating on a 
particular exception.  (And be explicit about it!)


     # Inside a generator or sub-generator.
     ...

     next(<my_gen>)    # works in this frame.

     yield from <my_gen> except <exception>  #Delegate until <exception>

     value = next(<my_gen>)    # works in this frame again.

     ...

The explicit "yield from .. except" is easier to understand.  It also 
avoids the close and return issues.  It should be easier to implement as 
well.  And it doesn't require any "special" framework in the parent 
generator or the delegated sub-generator to work.


Here's an example.

# I prefer to use a ValueRequest exception, but someone could use
# StopIteration or GeneratorExit, if it's useful for what they
# are doing.

class ValueRequest(Exception): pass


# A pretty standard generator that emits
# a total when an exception is thrown in.
# It doesn't need anything special in it
# so it can be delegated.

def gtally():
     count = tally = 0
     try:
         while 1:
             tally += yield
             count += 1
     except ValueRequest:
         yield count, tally


# An example of delegating until an Exception.
# The specified "exception" is not sent to the sub-generator.
# I think explicit is better than implicit here.

def gtally_averages():
     gt = gtally()
     next(gt)
     yield from gt except ValueRequest     #Catches exception
     count, tally = gt.throw(ValueRequest)    #Get tally
     yield tally / count


# This part also already works and has no new stuff in it.
# This part isn't aware of any delegating!

def main():
     gavg = gtally_averages()
     next(gavg)
     for x in range(100):
         gavg.send(x)
     print(gavg.throw(ValueRequest))

main()


It may be that a lot of pre-existing generators will already work with 
this. ;-)

You can still use "yield from <gen>" to delegate until <gen> ends.  You 
just won't get a value in the same frame <gen> was used in.  The parent may 
get it instead.  That may be useful in itself.

Note: you *can't* put the yield from inside a try-except and do the same 
thing.  The exception would go to the sub-generator instead.  Which is one 
of the messy things we are trying to avoid doing.

Cheers,
    Ron


From ncoghlan at gmail.com  Sat Oct 30 09:58:13 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 30 Oct 2010 17:58:13 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCBBE4A.3040503@ronadam.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA752D.5090904@canterbury.ac.nz>
	<AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>
	<4CCB8C50.2010404@canterbury.ac.nz> <4CCBBE4A.3040503@ronadam.com>
Message-ID: <AANLkTimw1t4YB-ZCVpWs-mEb117pnPS8EXiEH5eQH4_O@mail.gmail.com>

On Sat, Oct 30, 2010 at 4:42 PM, Ron Adam <rrr at ronadam.com> wrote:
> Ok, after thinking about this for a while, I think the "yield from" would be
> too limited if it could only be used for consumers that must run until the
> end. That rules out a whole lot of pipes, filters and other things that
> consume-some, emit-some, consume-some_more, and emit-some_more.

Indeed, the "stop-in-the-middle" aspect is tricky, but is the crux of
what we're struggling with here.

> I think I figured out something that may be more flexible and isn't too
> complicated.

Basically a way to use yield from, while declaring how to force the
end of iteration? Interesting idea.

However, I think sentinel values are likely a better way to handle
this in a pure PEP 380 context.

> Here's an example.

Modifying this example to use sentinel values rather than throwing in
exceptions actually makes it all fairly straightforward in a PEP 380
context. So maybe the moral of this whole thread is really "sentinel
values good, sentinel exceptions bad".

# Helper function to finish off a generator by sending a sentinel value
def finish(g, sentinel=None):
  try:
    g.send(sentinel)
  except StopIteration as ex:
    return ex.value

def gtally(end_tally=None):
  # Tallies numbers until sentinel is passed in
  count = tally = 0
  while 1:
    value = yield
    if value is end_tally:
      return count, tally
    count += 1
    tally += value

def gaverage(end_avg=None):
  count, tally = (yield from gtally(end_avg))
  return tally / count

def main():
  g = gaverage()
  next(g)
  for x in range(100):
    g.send(x)
  return finish(g)

Even more complex cases, like my sum-of-averages example (or any
equivalent multi-level construct) can be implemented without too much
hassle, so long as "finish current action" and "start next action" are
implemented as two separate steps so the outer layer has a chance to
communicate with the outside world before diving into the inner layer.
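
For instance, a two-level version might look like this (a sketch
reusing the finish() helper above -- the Ellipsis sentinel and the
driver protocol are just one possible convention):

def gsum_of_averages(end_avg=None, end_all=Ellipsis):
  total = 0
  while 1:
    # Finish current action: run one averaging pass
    total += (yield from gaverage(end_avg))
    # Start next action: the driver sends end_all to stop,
    # or anything else to begin another pass
    if (yield) is end_all:
      return total

def main2():
  g = gsum_of_averages()
  next(g)
  for i, batch in enumerate(([1.0, 2.0, 3.0], [10.0, 20.0])):
    if i:
      g.send("go")   # start the next averaging pass
    for x in batch:
      g.send(x)
    g.send(None)     # finish the current average
  return finish(g, Ellipsis)   # 2.0 + 15.0 = 17.0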

I think we've thrashed this out enough that I, for one, want to see
how PEP 380 peforms in the wild as it currently stands before we start
tinkering any further.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From jh at improva.dk  Sat Oct 30 10:58:57 2010
From: jh at improva.dk (Jacob Holm)
Date: Sat, 30 Oct 2010 10:58:57 +0200
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=5zCeHwC7g0L5Qrue9+oC_Md5iMqoz4kph7vKA@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA5AE4.7080403@canterbury.ac.nz>
	<AANLkTindHn6vteNuCtejotLmC3m_g1u=Jkp8SO-7PkLD@mail.gmail.com>
	<4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk>
	<AANLkTi=dgSevRbg+OZVDDd0cK2x25kaivySZuw9X5sDq@mail.gmail.com>
	<4CCB69AC.9020701@improva.dk>
	<AANLkTi=5zCeHwC7g0L5Qrue9+oC_Md5iMqoz4kph7vKA@mail.gmail.com>
Message-ID: <4CCBDE51.5040502@improva.dk>

On 2010-10-30 03:15, Guido van Rossum wrote:
> On Fri, Oct 29, 2010 at 5:41 PM, Jacob Holm <jh at improva.dk> wrote:
>> I agree that PEP 3152 is far from perfect at this point, but I like the
>> basics.
> 
> I thought the basics weren't even decided? Implicit definitions, or
> implicit cocalls, terminology to be used, how to implement in Jython
> or IronPython, probably more (I can't keep the ideas of that PEP in my
> head so I end up blanking out on any discussion that mentions it).
> 

The basics I am talking about are:

1) adding a new __cocall__ protocol with semantics described in terms of
generators

2) adding a simpler way to call the new functions, based on "yield from"


> I truly wish it was easier to experiment with syntax -- it would be so
> much simpler if these PEPs could be accompanied by a library that
> people can just import to use the new syntax (even if it's a C
> extension) rather than by a patch to the core language.
> 
> The need to "get it right in one shot" is keeping back the ability to
> experiment at any realistic scale, so all we see (on all sides) are
> trivial examples that may highlight proposed features and anticipated
> problems, but this is no way to gain experience with what the *real*
> problems would be.
> 

Right.


>> The reason I am so concerned with the "return value" semantics
>> is that I see some problems we are having in PEP 3152 as indicating a
>> likely flaw/misfeature in PEP 380.  I would be much happier with both
>> PEPs if they didn't conflict in this way.
> 
> If there was a separate PEP specifying *just* returning a value from a
> generator and how to get at that value using g.close(), without
> yield-from, would those problems still exist? If not, that would be a
> reason to move those out in a separate PEP. Assume such a PEP (call it
> PEP X) existed, what would be the dependency tree? What would be the
> conflicts? Would PEP 3152 make sense with PEP X but without (the rest
> of) PEP 380?
> 

If "return value" was moved from PEP 380 to PEP X, we should remove or
alter the expression form of "yield from".

I am currently in favor of changing "yield from" to return the
StopIteration instance that stopped the inner generator, because that
allows you to use different StopIteration subclasses in different
circumstances (e.g. exhausted, told to quit) and still let the calling
code know which one it was.   This is useful for PEP 3152 but I am sure
it has other uses as well.   It also means that PEP X return values are
still useful with yield-from, without modifications to PEP 380.  (But
slightly less convenient because you have to extract the value yourself).

In other words, with that change to the expression form of "yield from"
PEP 380 and PEP X could be completely independent and complementary.
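
As a sketch of what calling code could then do (hypothetical
semantics, obviously -- none of this runs today):

class ToldToQuit(StopIteration): pass

def inner():
    try:
        while True:
            yield
    except EOFError:
        raise ToldToQuit    # told to quit early
    # exhausting normally would raise plain StopIteration

def outer():
    # With the changed expression form, "yield from" evaluates to
    # the StopIteration instance that stopped the inner generator
    ex = yield from inner()
    return isinstance(ex, ToldToQuit)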

PEP 3152 would naturally depend on PEP X, but AFAICT it depends on PEP
380 only for ease of presentation.   With the proposed change to the
expression form of "yield from" there would be no conflicts either.
(The current conflict is really only with the use of the current "yield
from" in the presentation.  The desired semantics could be defined from
scratch.  We just really don't want to do that)





>> So much so, that I would rather miss a few features in PEP 380 in the
>> *hope* of getting them right later with another PEP.
> 
> Can you be specific? Which features?
> 

"return value" in generators, current expression form of "yield from".


>> To quote the Zen:
>>
>>  "never is often better than *right* now"
> 
> Um, Python 3.3 can hardly be referred to as "*right* now".
> 

True, but the "close to acceptance" state of PEP 380 means that changes
there have much more of a "now" feel than changes to other PEPs.


> There are plenty of arguments in the zen for PEP X, especially "If the
> implementation is easy to explain, it may be a good idea." Both
> returning a value from a generator and catching that value in
> g.close() are really easy to implement and the implementation is easy
> to explain. It's a small evolution from the current generator code.
> 
>> A PEP just for the "return value" shouldn't be too hard to add later if
>> PEP 3152 doesn't work out, and at that point we should have a better
>> idea about the best way of doing it.
> 
> It would a small miracle if PEP 3152 worked out. I'd much rather have
> a solid fallback position now. I'm not pushing for rushing PEP X to
> acceptance -- I'm just hoping we can write it now and discuss it on
> its own merits without too much concern for PEP 3152 or even PEP 380,
> although I personally still think that the interference with PEP 380
> would minimal and not a reason for changing PEP X.
> 

You are right, PEP X should be very small.  The main points in it would be:

1) Allow "return value" in a generator, making it raise (a subclass of)
StopIteration with value as the first argument.   (I am now in favor of
using a subclass, and treating a bare "return" as "return None". Working
with PEP 3152 made me realize that there are use cases for
distinguishing between a "return" and a "raise StopIteration")

2) Change g.close() to extract and return the value or add a new
g.finish() for that purpose.   (I'd prefer using "finish" and adding a
new exception for this instead of reusing GeneratorExit.  The new
method+exception would make it work in my modified PEP 380 without
further modification, reusing close+GeneratorExit would have the same
problems as we have now)
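
Roughly, for the second point (one possible spelling, sketched as a
function rather than a method):

class GeneratorFinish(Exception): pass   # the new exception

def finish(g):
    try:
        g.throw(GeneratorFinish)
    except StopIteration as ex:
        return ex.value
    except GeneratorFinish:
        return None    # generator didn't catch it; no value
    raise RuntimeError("generator ignored GeneratorFinish")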



- Jacob


From denis.spir at gmail.com  Sat Oct 30 11:02:51 2010
From: denis.spir at gmail.com (spir)
Date: Sat, 30 Oct 2010 11:02:51 +0200
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <AANLkTim0MwtPcuRWmR9_ja8XvSkCkkTg0sejDnH-0Y3v@mail.gmail.com>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>
	<4CC93095.3080704@egenix.com>
	<AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>
	<4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz>
	<4CCA8622.5060405@egenix.com>
	<AANLkTing7bfdupQXf1PYY9iGgK++mgx0rSNjdT-hRGGX@mail.gmail.com>
	<4CCAAB9D.3060909@pearwood.info>
	<AANLkTim0MwtPcuRWmR9_ja8XvSkCkkTg0sejDnH-0Y3v@mail.gmail.com>
Message-ID: <20101030110251.09e8083e@o>

On Fri, 29 Oct 2010 14:19:33 -1000
"Carl M. Johnson" <cmjohnson.mailinglist at gmail.com> wrote:

> Thinking about it a little more, if I were making an HTML tree type
> metaclass though, I wouldn't want to use an OrderedDict anyway, since
> it can't have duplicate elements, and I would want the interface to be
> something like:
> 
> class body(Tree()):
>     h1 = "Hello World!"
>     p  = "Lorem ipsum."
>     p  = "Dulce et decorum est."
>     class div(Tree(id="content")):
>         p = "Main thing"
>     class div(Tree(id="footer")):
>         p = "(C) 2010"
> 
> So, I'd probably end up making my own custom kind of dict that didn't
> overwrite repeated names.

Ah, but that's a completely different issue. You seem to be talking about an appropriate data structure to represent (the equivalent of) a parse tree, or rather an AST. In most grammars there are sequence patterns representing composite data, such as funcDef:(parameterList block), in which (1) order is not meaningful and (2) most often each "kind" of element happens only once (*). And there are repetitive patterns, such as block:statement*, in which element "kinds" also repeat.
Composite elements like func defs can be represented as dicts (ordered or not), but their meaning is really that of a "flexible record", a named tuple (**). It's _not_ a collection. The point is that they can be indexed by "kind" (i.e. by which pattern generated them).
Repetitive elements do not have this nice property; they must be represented as sequences of elements _holding_ their kind. For this reason, tree nodes often hold the element "kind" in addition to their actual data and some metadata.


Denis

(*) But that's not always true, eg addition:(addOperand '+' addOperand).

(**) I miss "free objects" in Python for this reason -- people often use dicts instead. I'd like to be able to write: "return (color:c, position:p)", where the defined object is an instance of Object directly, or maybe of Individual, meaning Object with a __dict__.
class Individual:
    def __init__ (self, **slots):
        self.__dict__ = slots
    def __repr__(self):
        return "(%s)" % \
            ' '.join("%s:%s" %(k,v) for (k,v) in self.__dict__.items())
print Individual(a=135, d=100)  # (a:135 d:100)
If only we had a literal notation for that :-) No more need of classes for singletons.
-- -- -- -- -- -- --
vit esse estrany ?

spir.wikidot.com



From steve at pearwood.info  Sat Oct 30 11:58:23 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 30 Oct 2010 20:58:23 +1100
Subject: [Python-ideas] Ordered storage of keyword arguments
In-Reply-To: <20101030110251.09e8083e@o>
References: <AANLkTimdfzEJzLxf4aEU9kYCz_Rde+T3jBESkcD0O9Yz@mail.gmail.com>	<4CC93095.3080704@egenix.com>	<AANLkTinJX82KnbAevSHXdgNEnxdjSu+Zvv4PqD5AD3bU@mail.gmail.com>	<4CC941FF.6070408@egenix.com>
	<4CC9E99F.5030805@canterbury.ac.nz>	<4CCA8622.5060405@egenix.com>	<AANLkTing7bfdupQXf1PYY9iGgK++mgx0rSNjdT-hRGGX@mail.gmail.com>	<4CCAAB9D.3060909@pearwood.info>	<AANLkTim0MwtPcuRWmR9_ja8XvSkCkkTg0sejDnH-0Y3v@mail.gmail.com>
	<20101030110251.09e8083e@o>
Message-ID: <4CCBEC3F.3090403@pearwood.info>

spir wrote:

> (**) I miss "free objects" in python for this reason -- people often use dicts instead. I'd like to be able to write: "return (color:c, position:p)", where the defined object is instance of Object directly, or maybe of Individual, meaning Object with a __dict__.
> class Individual:
>     def __init__ (self, **slots):
>         self.__dict__ = slots
>     def __repr__(self):
>         return "(%s)" % \
>             ' '.join("%s:%s" %(k,v) for (k,v) in self.__dict__.items())
> print Individual(a=135, d=100)  # (a:135 d:100)
> If only we had a literal notation for that :-) No more need of classes for singletons.


We almost do.

 >>> from collections import namedtuple as nt
 >>> nt("Individual", "a d")(a=135, d=100)
Individual(a=135, d=100)


Short enough to write in-line. Or you could do this:

 >>> Individual = nt("Individual", "a d")
 >>> Individual(99, 42)
Individual(a=99, d=42)


It shouldn't be hard to write a function similar to namedtuple that 
didn't require a declaration beforehand, but picked up the field names 
from the keyword-only arguments given:

 >>> from collections import record  # say
 >>> record(a=23, b=42)
record(a=23, b=42)


I leave that as an exercise to the readers :)
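
(For the record, here is one possible answer -- a sketch that sorts 
the field names, since **kwargs doesn't preserve call order, which is 
of course exactly what this thread is about:)

 >>> from collections import namedtuple
 >>> def record(**kwargs):
 ...     return namedtuple("record", sorted(kwargs))(**kwargs)
 ...
 >>> record(a=23, b=42)
record(a=23, b=42)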




-- 
Steven


From guido at python.org  Sat Oct 30 17:00:32 2010
From: guido at python.org (Guido van Rossum)
Date: Sat, 30 Oct 2010 08:00:32 -0700
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCBB84C.2080809@canterbury.ac.nz>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA752D.5090904@canterbury.ac.nz>
	<AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>
	<4CCB8C50.2010404@canterbury.ac.nz>
	<AANLkTikr9FDhTLisoPfR+bkDZTR_KVYVk=3FBSBT5zJw@mail.gmail.com>
	<4CCBB84C.2080809@canterbury.ac.nz>
Message-ID: <AANLkTi=YMM-xyFis1wiybWnNJXBJ0YxF2m=t2-H9CENx@mail.gmail.com>

On Fri, Oct 29, 2010 at 11:16 PM, Greg Ewing
<greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
>>
>> On Fri, Oct 29, 2010 at 8:09 PM, Greg Ewing <greg.ewing at canterbury.ac.nz>
>> wrote:
>>
>>> and even then you would lose the return value
>>> from the inner generator, which you're probably going to
>>> want.
>>
>> Really? Can you show a realistic use case?
>
> Here's an attempt:
>
>  def variancer():
>    # Compute variance of values sent in (details left
>    # as an exercise)
>
>  def stddever():
>    # Compute standard deviation of values sent in
>    v = yield from variancer()
>    return sqrt(v)

Good.
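
(For concreteness, one way Greg's exercise might be filled in -- a
sketch using Welford's online algorithm and the EOFError protocol
explored below; not code from the thread:)

def variancer():
  # Accumulate values sent in; throwing EOFError in ends the tally
  # and returns the population variance.  Assumes at least one value
  # was sent.
  n = mean = m2 = 0
  while True:
    try:
      x = yield
    except EOFError:
      return m2 / n
    n += 1
    delta = x - mean
    mean += delta / n
    m2 += delta * (x - mean)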

I have to get a crazy idea off my chest: maybe the collective hang-up
is that GeneratorExit must be special-cased. Let me explore a bit.

Take a binary tree node:

class Node:
  def __init__(self, label, left=None, right=None):
    self.label, self.left, self.right = label, left, right

And an inorder traversal function:

def inorder(node):
  if node:
    yield from inorder(node.left)
    yield node
    yield from inorder(node.right)
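
(A quick usage sketch, assuming the Node and inorder definitions above:)

root = Node("b", Node("a"), Node("c"))
print([n.label for n in inorder(root)])  # ['a', 'b', 'c']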

This is a nice example, different from gtally() and variance()/stddev(),
because of the recursion.

Now let's say we want to design a protocol whereby the consumer of the
nodes yielded by inorder() can ask the traversal to be stopped. With
the code above this is trivial, just call g.close() or throw any other
exception in. But now let's first modify inorder() to also return a
value computed from the nodes traversed so far. For simplicity I'll
use the count:

def inorder(node):
  if not node:
    return 0
  count = 0
  count += yield from inorder(node.left)
  yield node
  count += 1
  count += yield from inorder(node.right)
  return count

How would we stop this enumeration *and* receive a count of the nodes
already enumerated up to that point? Throwing in some exception is the
easiest approach. Let's say we throw EOFError. My first attempt has a
bug:

def inorder(node):
  if not node:
    return 0
  count = 0
  count += yield from inorder(node.left)  # Bug here
  try:
    count += 1
    yield node
  except EOFError:
    return count
  count += yield from inorder(node.right)
  return count

This has the fatal flaw of not responding promptly when the EOFError
is caught by the left subtree, since it returns normally and the
parent doesn't "see" the EOFError: on the way in it's thrown directly
into the first yield-from; on the way out there's no way to
distinguish between a regular return and an interrupted one.

A potential fix is to return two values: an interrupted flag and a
count. But this is pretty ugly (I'm not even going to show the code).

A different approach to fixing this is for the throwing code to keep
throwing EOFError until the generator stops yielding values:

def stop(g):
  while True:
    try:
      g.throw(EOFError)
    except StopIteration as err:
      return err.value

I'm not concerned with the situation where the generator is already
stopped; the EOFError will be bounced out, but that is the caller's
problem, as they shouldn't have attempted to stop an already-stopped
iterator. (Jacob is probably shaking his head here. :-)

This solution doesn't quite work though, because the count returned
will include the nodes that were yielded while the stack of generators
was winding down. My pragmatic solution for this is to change the
protocol so that stopping the generator means that the node yielded
last should not be included in the count. If you envision the caller
to be running a for-loop, think of calling stop() at the top of the
loop rather than at the bottom. (Jacob is now again wondering how
they'd get the count if the iterator runs till completion. :-) We can
do this by modifying inorder() to bump the count after yielding rather
than before:

  try:
    yield node
  except EOFError:
    return count
  count += 1

Now, to get back the semantics of getting the correct count
*including* the last node seen by the caller, we can modify stop() to
advance the generator by one more step:

def stop(g):
  try:
    next(g)
    while True:
      g.throw(EOFError)
  except StopIteration as err:
    return err.value

This works even if g was positioned after the last item to be yielded:
in that case next(g) raises StopIteration. It still doesn't work if we
use a for-loop to iterate through to the end (Jacob nods :-) but I say
they shouldn't be doing that, or they can write a little wrapper for
iter() that *does* save the return value from StopIteration. (Okay,
half of me says it would be fine to store it on the generator object.
:-)
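
(A sketch of such a wrapper -- an illustration only, with a made-up
name:)

class SavingIter:
  # Iterate a generator like a for-loop would, but stash the
  # StopIteration value on the wrapper as .value.
  def __init__(self, g):
    self.g = g
    self.value = None
  def __iter__(self):
    return self
  def __next__(self):
    try:
      return next(self.g)
    except StopIteration as err:
      self.value = err.value
      raise

After a for-loop over SavingIter(inorder(root)) runs to completion, the
wrapper's .value attribute holds the traversal's return value.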

[Dramatic pause]

[Drumroll]

What has this got to do with GeneratorExit and g.close()? I propose to
modify g.close() to keep throwing GeneratorExit until the generator
stops yielding values, and then capture the return value from
StopIteration if that is what was raised. The beauty is then that the
PEP 380 expansion can stop special-casing GeneratorExit: it just
treats it as every other exception. And stddev() above works! (If you
worry about infinite loops: you can get those anyway, by putting
"while: True" in an "except GeneratorExit" block. I don't see much
reason to worry more in this case.)

-- 
--Guido van Rossum (python.org/~guido)


From rrr at ronadam.com  Sat Oct 30 18:54:15 2010
From: rrr at ronadam.com (Ron Adam)
Date: Sat, 30 Oct 2010 11:54:15 -0500
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimw1t4YB-ZCVpWs-mEb117pnPS8EXiEH5eQH4_O@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>	<4CC6E94F.3090702@improva.dk>	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>	<4CC889F1.8010603@improva.dk>	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>	<4CC939E5.5070700@improva.dk>	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>	<4CC9FC87.1040600@canterbury.ac.nz>	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>	<4CCA752D.5090904@canterbury.ac.nz>	<AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>	<4CCB8C50.2010404@canterbury.ac.nz>	<4CCBBE4A.3040503@ronadam.com>
	<AANLkTimw1t4YB-ZCVpWs-mEb117pnPS8EXiEH5eQH4_O@mail.gmail.com>
Message-ID: <4CCC4DB7.9050604@ronadam.com>



On 10/30/2010 02:58 AM, Nick Coghlan wrote:
> On Sat, Oct 30, 2010 at 4:42 PM, Ron Adam<rrr at ronadam.com>  wrote:
>> Ok, after thinking about this for a while, I think the "yield from" would be
>> too limited if it could only be used for consumers that must run until the
>> end. That rules out a whole lot of pipes, filters and other things that
>> consume-some, emit-some, consume-some_more, and emit-some_more.
>
> Indeed, the "stop-in-the-middle" aspect is tricky, but is the crux of
> what we're struggling with here.
>
>> I think I figured out something that may be more flexible and insn't too
>> complicated.
>
> Basically a way to use yield from, while declaring how to force the
> end of iteration? Interesting idea.

Not iteration; iteration can continue.  It signals the end of delegation
and returns control to the generator that initiated the delegation.


> However, I think sentinel values are likely a better way to handle
> this in a pure PEP 380 context.

Sentinel values aren't always better because they require an extra 
comparison on each item.


>> Here's an example.
>
> Modifying this example to use sentinel values rather than throwing in
> exceptions actually makes it all fairly straightforward in a PEP 380
> context. So maybe the moral of this whole thread is really "sentinel
> values good, sentinel exceptions bad".
>
> # Helper function to finish off a generator by sending a sentinel value
> def finish(g, sentinel=None):
>    try:
>      g.send(sentinel)
>    except StopIteration as ex:
>      return ex.value
>
> def gtally(end_tally=None):
>    # Tallies numbers until sentinel is passed in
>    count = tally = 0

>    value = object()

Left over from earlier edit?

>    while 1:
>      value = yield
>      if value is end_tally:
>        return count, tally
>      count += 1
>      tally += value

The comparison is executed on every loop.  A try-except would be outside 
the loop.


> def gaverage(end_avg=None):
>    count, tally = (yield from gtally(end_avg))
>    return tally / count
>
> def main():
>    g = gaverage()
>    next(g)
>    for x in range(100):
>      g.send(x)
>    return finish(g)


Cheers,
    Ron


From greg.ewing at canterbury.ac.nz  Sun Oct 31 02:09:26 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 31 Oct 2010 13:09:26 +1300
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=YMM-xyFis1wiybWnNJXBJ0YxF2m=t2-H9CENx@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA752D.5090904@canterbury.ac.nz>
	<AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>
	<4CCB8C50.2010404@canterbury.ac.nz>
	<AANLkTikr9FDhTLisoPfR+bkDZTR_KVYVk=3FBSBT5zJw@mail.gmail.com>
	<4CCBB84C.2080809@canterbury.ac.nz>
	<AANLkTi=YMM-xyFis1wiybWnNJXBJ0YxF2m=t2-H9CENx@mail.gmail.com>
Message-ID: <4CCCB3B6.3010209@canterbury.ac.nz>

Guido van Rossum wrote:

> A different approach to fixing this is for the throwing code to keep
> throwing EOFError until the generator stops yielding values:

That's precisely what I would recommend.

> This solution doesn't quite work though, because the count returned
> will include the nodes that were yielded while the stack of generators
> was winding down.
>
> My pragmatic solution for this is to change the
> protocol so that stopping the generator means that the node yielded
> last should not be included in the count.

This whole example seems contrived to me, so it's hard to
say whether this is a good or bad solution.

> I propose to
> modify g.close() to keep throwing GeneratorExit until the generator
> stops yielding values, and then capture the return value from
> StopIteration if that is what was raised. The beauty is then that the
> PEP 380 expansion can stop special-casing GeneratorExit: it just
> treats it as every other exception.

This was actually suggested during the initial round of
discussion, and shot down -- if I remember correctly, on the
grounds that it could result in infinite loops. But if you're
no longer concerned about that, it's worth considering.

My concern is that this would be a fairly substantial change
to the intended semantics of close() -- it would no longer be
a way of aborting a generator and forcing it to clean up as
quickly as possible.

But maybe you don't mind losing that functionality?

-- 
Greg


From ncoghlan at gmail.com  Sun Oct 31 02:34:35 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 31 Oct 2010 10:34:35 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTi=YMM-xyFis1wiybWnNJXBJ0YxF2m=t2-H9CENx@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA752D.5090904@canterbury.ac.nz>
	<AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>
	<4CCB8C50.2010404@canterbury.ac.nz>
	<AANLkTikr9FDhTLisoPfR+bkDZTR_KVYVk=3FBSBT5zJw@mail.gmail.com>
	<4CCBB84C.2080809@canterbury.ac.nz>
	<AANLkTi=YMM-xyFis1wiybWnNJXBJ0YxF2m=t2-H9CENx@mail.gmail.com>
Message-ID: <AANLkTimEV2XdAA3g4hgFMTmGjNA-swMCyF_ko65aRkkx@mail.gmail.com>

On Sun, Oct 31, 2010 at 1:00 AM, Guido van Rossum <guido at python.org> wrote:
> What has this got to do with GeneratorExit and g.close()? I propose to
> modify g.close() to keep throwing GeneratorExit until the generator
> stops yielding values, and then capture the return value from
> StopIteration if that is what was raised. The beauty is then that the
> PEP 380 expansion can stop special-casing GeneratorExit: it just
> treats it as every other exception. And stddev() above works! (If you
> worry about infinite loops: you can get those anyway, by putting
> "while: True" in an "except GeneratorExit" block. I don't see much
> reason to worry more in this case.)

I'm back to liking your general idea, but wanting to use a new method
and exception for the task to keep the two sets of semantics
orthogonal :)

If we add a finish() method that corresponds to your stop() function,
and a GeneratorReturn exception as a peer to GeneratorExit:

class GeneratorReturn(BaseException): pass

def finish(self):
  if self.gi_frame is None:
    return self._result  # (or raise RuntimeError)
  try:
    next(self)
    while True:
      self.throw(GeneratorReturn)
  except StopIteration as ex:
    return ex.value


Then your node counter iterator (nice example, btw) would simply look like:

def count_nodes(node):
  if not node:
    return 0
  count = 0
  count += yield from count_nodes(node.left)
  try:
    yield node
  except GeneratorReturn:
    return count
  count += 1  # Only count nodes when next() is called in response
  count += yield from count_nodes(node.right)
  return count
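
(A hedged usage sketch: since generators can't actually grow a finish()
method, this assumes a module-level finish(g) helper along the lines of
the method above, plus the Node class from Guido's message:)

root = Node("b", Node("a"), Node("c"))
g = count_nodes(root)
next(g)           # -> the node labelled "a"
next(g)           # -> the node labelled "b"
print(finish(g))  # 2: only the nodes the caller actually consumed
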
Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ncoghlan at gmail.com  Sun Oct 31 02:35:46 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 31 Oct 2010 10:35:46 +1000
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <4CCC4DB7.9050604@ronadam.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>
	<4CC63065.9040507@improva.dk>
	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>
	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>
	<4CC6E94F.3090702@improva.dk>
	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>
	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>
	<4CC889F1.8010603@improva.dk>
	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>
	<4CC939E5.5070700@improva.dk>
	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>
	<4CC9FC87.1040600@canterbury.ac.nz>
	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>
	<4CCA752D.5090904@canterbury.ac.nz>
	<AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>
	<4CCB8C50.2010404@canterbury.ac.nz> <4CCBBE4A.3040503@ronadam.com>
	<AANLkTimw1t4YB-ZCVpWs-mEb117pnPS8EXiEH5eQH4_O@mail.gmail.com>
	<4CCC4DB7.9050604@ronadam.com>
Message-ID: <AANLkTimcnJk5Wi8t6-KHYHjWf=5TppmZvBb-n77JtQYx@mail.gmail.com>

On Sun, Oct 31, 2010 at 2:54 AM, Ron Adam <rrr at ronadam.com> wrote:
>> However, I think sentinel values are likely a better way to handle
>> this in a pure PEP 380 context.
>
> Sentinel values aren't always better because they require an extra comparison
> on each item.

Yep, Guido's example made me realise I was wrong on that front.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From rrr at ronadam.com  Sun Oct 31 08:26:11 2010
From: rrr at ronadam.com (Ron Adam)
Date: Sun, 31 Oct 2010 02:26:11 -0500
Subject: [Python-ideas] Possible PEP 380 tweak
In-Reply-To: <AANLkTimcnJk5Wi8t6-KHYHjWf=5TppmZvBb-n77JtQYx@mail.gmail.com>
References: <AANLkTin1FNkHLww=yyUa2bhQ8w76FuUurRsG-5a2AXQ7@mail.gmail.com>	<4CC63065.9040507@improva.dk>	<AANLkTi=O4i81MTD9p251uUV+8PHrOXFfPggeuyvWa7jQ@mail.gmail.com>	<AANLkTi=1X2e_d4OywtPNd8gf73-5epThejoD6F3KmOQ6@mail.gmail.com>	<4CC6E94F.3090702@improva.dk>	<AANLkTi=5HOYVUgzkKzk9eF4-DtPHfUV9afUKP1MiUcj2@mail.gmail.com>	<AANLkTinOTtZzmJR2Vmm59Up_h0Pg7DnHJacKmvS_HJ=-@mail.gmail.com>	<4CC889F1.8010603@improva.dk>	<AANLkTimMjjgoHDNU-v_xjnjQEPTONnRqQpskoRsdeJ5W@mail.gmail.com>	<4CC939E5.5070700@improva.dk>	<AANLkTimysMXEaZY0xf+6AKGfYA=V-7Cu9g8HQuC3WNRV@mail.gmail.com>	<4CC9FC87.1040600@canterbury.ac.nz>	<AANLkTimownK9cPWFJnQX5UnuXY9RU+xcZsFmBUX7u3zU@mail.gmail.com>	<4CCA752D.5090904@canterbury.ac.nz>	<AANLkTikYpO+11RX=gf6UeqEBwT+NNQPc6nQTtY=ZKUu_@mail.gmail.com>	<4CCB8C50.2010404@canterbury.ac.nz>	<4CCBBE4A.3040503@ronadam.com>	<AANLkTimw1t4YB-ZCVpWs-mEb117pnPS8EXiEH5eQH4_O@mail.gmail.com>	<4CCC4DB7.9050604@ronadam.com>
	<AANLkTimcnJk5Wi8t6-KHYHjWf=5TppmZvBb-n77JtQYx@mail.gmail.com>
Message-ID: <4CCD1A13.1070400@ronadam.com>



On 10/30/2010 07:35 PM, Nick Coghlan wrote:
> On Sun, Oct 31, 2010 at 2:54 AM, Ron Adam<rrr at ronadam.com>  wrote:
>>> However, I think sentinel values are likely a better way to handle
>>> this in a pure PEP 380 context.
>>
>> Sentinel values aren't always better because they require an extra comparison
>> on each item.
>
> Yep, Guido's example made me realise I was wrong on that front.

BTW: A sentinel could still work, and the 'except <exception>' could be 
optional.

The finish function isn't needed in this one.


def gtally(end_tally):
    # Tallies numbers until the sentinel is passed in
    count = tally = 0
    while 1:
        value = yield
        if value is end_tally:
            return count, tally   # totals travel back via StopIteration
        count += 1
        tally += value

def gaverage(end_avg):
    count, tally = yield from gtally(end_avg)
    yield tally / count

def main():
    g = gaverage(None)
    next(g)
    for x in range(100):
        g.send(x)
    return g.send(None)   # the send delivering the sentinel gets the average


Using sentinels is not always wrong either. The data may have natural sentinel 
values in it.  In those cases, value testing is what you want.

I would like to be able to do it both ways myself. :-)

Cheers,
    Ron






From andre.roberge at gmail.com  Sun Oct 31 17:55:36 2010
From: andre.roberge at gmail.com (Andre Roberge)
Date: Sun, 31 Oct 2010 13:55:36 -0300
Subject: [Python-ideas] Accepting "?" as a valid character for identifiers
Message-ID: <AANLkTinWr=5vuc66z_DT+W5LCPgOBW+oe7cB6f1vvmQ3@mail.gmail.com>

In some languages (e.g. Scheme, Ruby, etc.), the question mark character (?)
is a valid character for identifiers.  I find that using it well can improve
readability of programs written in those languages.

Python 3 now allows all kinds of Unicode characters in source code for
identifiers. This is fantastic when one wants to teach programming to
non-English speakers and have them use meaningful identifiers.

While Python 3 does not allow ?, it does allow characters like ʔ
(http://en.wikipedia.org/wiki/Glottal_stop_%28letter%29), which can be used
to good effect in writing valid identifiers, such as functions that return
either True or False, etc., thus improving (imo) readability.

Given that one can legally mimic ? in Python identifiers, and given that the
? symbol is not used for anything in Python, would it be possible to
consider allowing the use of ? as a valid character in an identifier?

André

From masklinn at masklinn.net  Sun Oct 31 18:39:58 2010
From: masklinn at masklinn.net (Masklinn)
Date: Sun, 31 Oct 2010 18:39:58 +0100
Subject: [Python-ideas] Accepting "?" as a valid character for
	identifiers
In-Reply-To: <AANLkTinWr=5vuc66z_DT+W5LCPgOBW+oe7cB6f1vvmQ3@mail.gmail.com>
References: <AANLkTinWr=5vuc66z_DT+W5LCPgOBW+oe7cB6f1vvmQ3@mail.gmail.com>
Message-ID: <A886BFB7-A616-4918-9E8D-FE46867DC8B7@masklinn.net>

On 2010-10-31, at 17:55 , Andre Roberge wrote:
> In some languages (e.g. Scheme, Ruby, etc.), the question mark character (?)
> is a valid character for identifiers.  I find that using it well can improve
> readability of programs written in those languages.
> 
> Python 3 now allows all kinds of Unicode characters in source code for
> identifiers. This is fantastic when one wants to teach programming to
> non-English speakers and have them use meaningful identifiers.
> 
> While Python 3 does not allow ?, it does allow characters like ʔ
> (http://en.wikipedia.org/wiki/Glottal_stop_%28letter%29), which can be used
> to good effect in writing valid identifiers, such as functions that return
> either True or False, etc., thus improving (imo) readability.
> 
> Given that one can legally mimic ? in Python identifiers, and given that the
> ? symbol is not used for anything in Python, would it be possible to
> consider allowing the use of ? as a valid character in an identifier?

Another interesting postfix along the same lines is "!" (for mutating methods).

From songofacandy at gmail.com  Sun Oct 31 18:48:12 2010
From: songofacandy at gmail.com (INADA Naoki)
Date: Mon, 1 Nov 2010 02:48:12 +0900
Subject: [Python-ideas] Accepting "?" as a valid character for
	identifiers
In-Reply-To: <A886BFB7-A616-4918-9E8D-FE46867DC8B7@masklinn.net>
References: <AANLkTinWr=5vuc66z_DT+W5LCPgOBW+oe7cB6f1vvmQ3@mail.gmail.com>
	<A886BFB7-A616-4918-9E8D-FE46867DC8B7@masklinn.net>
Message-ID: <AANLkTikJLsYpSjUkQmY5StoHAV_FdDqcxVHuk927+eLb@mail.gmail.com>

> Another interesting postfix along the same lines is "!" (for mutating methods).

bytearray is a mutable type, but its methods are designed for immutable
bytes. I think bytearray should provide in-place mutation methods, and '!'
is good for such methods. For example, bytearray.strip!() should be in-place.
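
(For comparison, the closest in-place spelling today is slice
assignment -- a small sketch:)

data = bytearray(b"  spam  ")
data[:] = data.strip()  # mutates data in place, though strip() still
                        # builds a temporary object first
print(data)             # bytearray(b'spam')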

-- 
INADA Naoki <songofacandy at gmail.com>


From g.brandl at gmx.net  Sun Oct 31 18:51:53 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 31 Oct 2010 17:51:53 +0000
Subject: [Python-ideas] Accepting "?" as a valid character for
	identifiers
In-Reply-To: <AANLkTinWr=5vuc66z_DT+W5LCPgOBW+oe7cB6f1vvmQ3@mail.gmail.com>
References: <AANLkTinWr=5vuc66z_DT+W5LCPgOBW+oe7cB6f1vvmQ3@mail.gmail.com>
Message-ID: <iakadn$kgu$1@dough.gmane.org>

On 31.10.2010 16:55, Andre Roberge wrote:
> In some languages (e.g. Scheme, Ruby, etc.), the question mark character (?) is
> a valid character for identifiers.  I find that using it well can improve
> readability of programs written in those languages. 
> 
> Python 3 now allows all kinds of Unicode characters in source code for
> identifiers. This is fantastic when one wants to teach programming to
> non-English speakers and have them use meaningful identifiers.
> 
> While Python 3 does not allow ?, it does allow characters like ʔ
> (http://en.wikipedia.org/wiki/Glottal_stop_%28letter%29), which can be used to
> good effect in writing valid identifiers, such as functions that return either
> True or False, etc., thus improving (imo) readability.

Really?

if number.even?():
    # do something

Since in Python function/method calls require parens -- as opposed to Ruby,
while in Scheme the parens are somewhere else -- this doesn't strike me as
more readable; on the contrary, it looks noisier.  The same goes for mutating
methods with a "!" suffix -- it looks just awkward followed by parens.

(Obvious objection: use a property. Obvious answer: pick a method with args.)

Another drawback of introducing such a convention this late in the design of
the language is that you can never have it applied consistently.  Changing
the builtin and stdlib instances alone would need hundreds of compatibility
aliases.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From stefan_ml at behnel.de  Sun Oct 31 20:10:37 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 31 Oct 2010 20:10:37 +0100
Subject: [Python-ideas] Accepting "?" as a valid character for
	identifiers
In-Reply-To: <iakadn$kgu$1@dough.gmane.org>
References: <AANLkTinWr=5vuc66z_DT+W5LCPgOBW+oe7cB6f1vvmQ3@mail.gmail.com>
	<iakadn$kgu$1@dough.gmane.org>
Message-ID: <iakeve$6mn$1@dough.gmane.org>

Georg Brandl, 31.10.2010 18:51:
> Am 31.10.2010 16:55, schrieb Andre Roberge:
>> In some languages (e.g. Scheme, Ruby, etc.), the question mark character (?) is
>> a valid character for identifiers.  I find that using it well can improve
>> readability of programs written in those languages.
>>
>> Python 3 now allows all kinds of Unicode characters in source code for
>> identifiers. This is fantastic when one wants to teach programming to
>> non-English speakers and have them use meaningful identifiers.
>>
>> While Python 3 does not allow ?, it does allow characters like ʔ
>> (http://en.wikipedia.org/wiki/Glottal_stop_%28letter%29), which can be used to
>> good effect in writing valid identifiers, such as functions that return either
>> True or False, etc., thus improving (imo) readability.
>
> Really?
>
> if number.even?():
>      # do something
>
> Since in Python function/method calls require parens -- as opposed to Ruby,
> while in Scheme the parens are somewhere else -- this doesn't strike me as
> more readable; on the contrary, it looks noisier.  The same goes for mutating
> methods with a "!" suffix -- it looks just awkward followed by parens.

Hmm, that reminds me. I think we should reconsider PEP 3117. There's still 
some value in it.

http://www.python.org/dev/peps/pep-3117/

Stefan



From greg.ewing at canterbury.ac.nz  Sun Oct 31 21:58:28 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 01 Nov 2010 09:58:28 +1300
Subject: [Python-ideas] Accepting "?" as a valid character for
	identifiers
In-Reply-To: <AANLkTinWr=5vuc66z_DT+W5LCPgOBW+oe7cB6f1vvmQ3@mail.gmail.com>
References: <AANLkTinWr=5vuc66z_DT+W5LCPgOBW+oe7cB6f1vvmQ3@mail.gmail.com>
Message-ID: <4CCDD874.9010006@canterbury.ac.nz>

Andre Roberge wrote:
> In some languages (e.g. Scheme, Ruby, etc.), the question mark character 
> (?) is a valid character for identifiers.  I find that using it well can 
> improve readability of programs written in those languages.

Opinions differ on that. I find that having punctuation mixed
in with identifiers makes the code *harder* to read. My wetware
parser makes a clear distinction between characters that can be
part of words and characters that separate words, and '?' falls
very much into the latter category for me.

Also, if we did this, it would preclude ever being able to use
the characters concerned as operators in the future.

-- 
Greg


From solipsis at pitrou.net  Sun Oct 31 22:07:22 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 31 Oct 2010 22:07:22 +0100
Subject: [Python-ideas] Accepting "?" as a valid character for
	identifiers
References: <AANLkTinWr=5vuc66z_DT+W5LCPgOBW+oe7cB6f1vvmQ3@mail.gmail.com>
Message-ID: <20101031220722.5063a7f9@pitrou.net>

On Sun, 31 Oct 2010 13:55:36 -0300
Andre Roberge <andre.roberge at gmail.com>
wrote:
> 
> While Python 3 does not allow ?, it does allow characters like ʔ
> (http://en.wikipedia.org/wiki/Glottal_stop_%28letter%29), which can be used
> to good effect in writing valid identifiers, such as functions that return
> either True or False, etc., thus improving (imo) readability.
> 
> Given that one can legally mimic ? in Python identifiers, and given that the
> ? symbol is not used for anything in Python, would it be possible to
> consider allowing the use of ? as a valid character in an identifier?

The fact that it looks like some other Unicode character is not really a
valid reason to allow it in identifiers.

Regards

Antoine.




From ben+python at benfinney.id.au  Sun Oct 31 23:51:55 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Mon, 01 Nov 2010 09:51:55 +1100
Subject: [Python-ideas] Accepting "?" as a valid character for
	identifiers
References: <AANLkTinWr=5vuc66z_DT+W5LCPgOBW+oe7cB6f1vvmQ3@mail.gmail.com>
	<iakadn$kgu$1@dough.gmane.org> <iakeve$6mn$1@dough.gmane.org>
Message-ID: <87hbg2kmb8.fsf@benfinney.id.au>

Stefan Behnel <stefan_ml at behnel.de>
writes:

> Hmm, that reminds me. I think we should reconsider PEP 3117. There's
> still some value in it.
>
> http://www.python.org/dev/peps/pep-3117/

I certainly got value out of reading it :-)

-- 
 \      "On the internet you simply can't outsource parenting." --Eliza |
  `\      Cussen, _Top 10 Internet Filter Lies_, The Punch, 2010-03-25 |
_o__)                                                                  |
Ben Finney



From ben+python at benfinney.id.au  Sun Oct 31 23:59:23 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Mon, 01 Nov 2010 09:59:23 +1100
Subject: [Python-ideas] Accepting "?" as a valid character for
	identifiers
References: <AANLkTinWr=5vuc66z_DT+W5LCPgOBW+oe7cB6f1vvmQ3@mail.gmail.com>
Message-ID: <87d3qqklys.fsf@benfinney.id.au>

Andre Roberge <andre.roberge at gmail.com>
writes:

> While Python 3 does not allow ?, it does allow characters like ʔ
> (http://en.wikipedia.org/wiki/Glottal_stop_%28letter%29), which can be used
> to good effect in writing valid identifiers, such as functions that return
> either True or False, etc., thus improving (imo) readability.

I consider "read-over-the-telephone-ability" to be an essential
component of "readability". Your identifiers containing unpronounceable
characters would kill that.

Unless you're going to argue that you are writing identifiers taken from
a natural language that allows unambiguous pronunciation of "ʔ" with the
same concision as other characters, of course.

I certainly don't want to be spelling out "U+0294 LATIN LETTER GLOTTAL
STOP" for a single character when I speak an identifier.

-- 
 \            "But it is permissible to make a judgment after you have |
  `\    examined the evidence. In some circles it is even encouraged." |
_o__)                  --Carl Sagan, _The Burden of Skepticism_, 1987 |
Ben Finney