From rrr at ronadam.com  Wed Sep  5 08:59:42 2007
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 05 Sep 2007 01:59:42 -0500
Subject: [Python-ideas] FInd first tuple argument for str.find and str.index
Message-ID: <46DE53DE.7070803@ronadam.com>


Could we add the ability for str.index and str.find to accept a tuple as the
first argument, returning the index of the first item found?

This is similar to how str.startswith and str.endswith already work.

  |  startswith(...)
  |      S.startswith(prefix[, start[, end]]) -> bool
  |
  |      Return True if S starts with the specified prefix, False otherwise.
  |      With optional start, test S beginning at that position.
  |      With optional end, stop comparing S at that position.
  |      prefix can also be a tuple of strings to try.


This would speed up cases of filtering and searching when more than one 
item is being searched for.  It would also simplify building iterators that 
filter and yield multiple items in order.
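
For illustration, the proposed behavior could be spelled in pure Python
like this (a rough sketch; 'find_first' is a hypothetical name, and the
real method would of course be implemented in C):

    def find_first(s, subs, start=0, end=None):
        """Return the lowest index in s[start:end] at which any string
        in subs occurs, or -1 if none of them occur (mirrors str.find)."""
        if end is None:
            end = len(s)
        best = -1
        for sub in subs:
            i = s.find(sub, start, end)
            if i != -1 and (best == -1 or i < best):
                best = i
        return best

    print(find_first("abc{def}ghi", ('{', '}')))   # -> 3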


A Google Code search suggests it's a generally useful operation.

http://www.google.com/codesearch?hl=en&lr=&q=%22findfirst%22+string&btnG=Search


(Searching for Python-specific code doesn't show much, because Python
doesn't have a findfirst function of any kind.)


Cheers,
    Ron



From guido at python.org  Wed Sep  5 17:05:43 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 5 Sep 2007 08:05:43 -0700
Subject: [Python-ideas] FInd first tuple argument for str.find and
	str.index
In-Reply-To: <46DE53DE.7070803@ronadam.com>
References: <46DE53DE.7070803@ronadam.com>
Message-ID: <ca471dc20709050805t357d9732r379ca57c510708c6@mail.gmail.com>

I was surprised to find that startswith and endswith support this, but
it does make sense. Adding a patch to 2.6 would cause it to be merged
into 3.0 soon enough.

On 9/4/07, Ron Adam <rrr at ronadam.com> wrote:
>
> Could we add the ability of str.index and str.find to accept a tuple as the
> first argument and return the index of the first item found in it.
>
> This is similar to how str.startswith and str.endswith already works.
>
>   |  startswith(...)
>   |      S.startswith(prefix[, start[, end]]) -> bool
>   |
>   |      Return True if S starts with the specified prefix, False otherwise.
>   |      With optional start, test S beginning at that position.
>   |      With optional end, stop comparing S at that position.
>   |      prefix can also be a tuple of strings to try.
>
>
> This would speed up cases of filtering and searching when more than one
> item is being searched for.  It would also simplify building iterators that
> filter and yield multiple items in order.
>
>
> A general google code search seems to show it's a generally useful thing to
> do.
>
> http://www.google.com/codesearch?hl=en&lr=&q=%22findfirst%22+string&btnG=Search
>
>
> (searching for python specific code doesn't show much because python
> doesn't have a findfirst function of any type.)
>
>
> Cheers,
>     Ron
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From terry at jon.es  Wed Sep  5 18:14:30 2007
From: terry at jon.es (Terry Jones)
Date: Wed, 5 Sep 2007 18:14:30 +0200
Subject: [Python-ideas] FInd first tuple argument for str.find
	and	str.index
In-Reply-To: Your message at 08:05:43 on Wednesday, 5 September 2007
References: <46DE53DE.7070803@ronadam.com>
	<ca471dc20709050805t357d9732r379ca57c510708c6@mail.gmail.com>
Message-ID: <18142.54758.609458.513647@terry-jones-computer.local>

>>>>> "Guido" == Guido van Rossum <guido at python.org> writes:
Guido> I was surprised to find that startswith and endswith support this,
Guido> but it does make sense. Adding a patch to 2.6 would cause it to be
Guido> merged into 3.0 soon enough.

Guido> On 9/4/07, Ron Adam <rrr at ronadam.com> wrote:
>> Could we add the ability of str.index and str.find to accept a tuple as the
>> first argument and return the index of the first item found in it.

Hi

If someone is going to head down this path, it might be better to implement
a more general algorithm and provide the above as a special case via an
argument.

There's a fast and beautiful algorithm due to Aho & Corasick (CACM, 1975)
that finds _all_ matches of a set of patterns. It runs in time that's
linear in max(sum of the lengths of the patterns to be matched, length of
to-be-matched text). The algorithm was the basis of fgrep.

To provide the above, a special case could have it return as soon as it
found a first match (i.e., of any pattern).

One general way to write it would be to have it return a dict of patterns
and result indices in the case that the pattern argument is a tuple.

So

  "Hey, look, look, look at these patterns".find(('look', 'for', 'these', 'patterns'))

might return

    {
      'look'     : [ 5, 11, 17 ],
      'for'      : [ ],  # or arguably [ -1 ],
      'these'    : [ 25 ],
      'patterns' : [ 31 ],
    }

OK, that's a bit of a departure from the normal behavior of find, but so is
passing a tuple of patterns. Alternatively, you could get back a tuple
of (tuples of) matching indices.

The ideal calling interface and result depend on what you need to do -
check whether a specific string matched? just know the first match offset? etc.

I don't know the best solution, but the algorithm rocks. Raymond - you'll
love it :-)

Terry


From terry at jon.es  Wed Sep  5 18:37:34 2007
From: terry at jon.es (Terry Jones)
Date: Wed, 5 Sep 2007 18:37:34 +0200
Subject: [Python-ideas] FInd first tuple argument for
	str.find	and	str.index
In-Reply-To: Your message at 18:14:30 on Wednesday, 5 September 2007
References: <46DE53DE.7070803@ronadam.com>
	<ca471dc20709050805t357d9732r379ca57c510708c6@mail.gmail.com>
	<18142.54758.609458.513647@terry-jones-computer.local>
Message-ID: <18142.56142.262681.951910@terry-jones-computer.local>

>>>>> "Terry" == Terry Jones <terry at jon.es> writes:
>>>>> "Guido" == Guido van Rossum <guido at python.org> writes:
Guido> I was surprised to find that startswith and endswith support this,
Guido> but it does make sense. Adding a patch to 2.6 would cause it to be
Guido> merged into 3.0 soon enough.

Guido> On 9/4/07, Ron Adam <rrr at ronadam.com> wrote:
>>> Could we add the ability of str.index and str.find to accept a tuple as the
>>> first argument and return the index of the first item found in it.

I should have added a few more comments.

If you're going to implement the original desired functionality and make it
run quickly, you're probably going to dream up something along the lines of
what Aho & Corasick did so beautifully.

It's tricky to get it right. As you walk the text string, several patterns
may be currently matching. But the next char you consider might cause one
or more of the current matches to fail, or a currently non-matching pattern
to begin to match. The A&C algorithm builds a trie with failure arcs, so
the matching is linear (both linear time to build the trie and the failure
arcs, and then linear to walk the trie with the text). It has accepting
states, so you know as soon as something matches, and can quit early.
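
To make that concrete, here is a rough pure-Python sketch of the
algorithm (illustrative only - a real implementation would be in C and
rather more careful; this one assumes non-empty, distinct patterns):

    from collections import deque

    def find_all(text, patterns):
        """Aho-Corasick sketch: {pattern: [match start indices]}."""
        # Build the trie: goto[s] maps a char to the next state, out[s]
        # lists the patterns accepted at state s, fail[s] is the failure arc.
        goto, fail, out = [{}], [0], [[]]
        for pat in patterns:
            s = 0
            for ch in pat:
                if ch not in goto[s]:
                    goto.append({})
                    fail.append(0)
                    out.append([])
                    goto[s][ch] = len(goto) - 1
                s = goto[s][ch]
            out[s].append(pat)
        # Breadth-first pass to fill in the failure arcs.
        pending = deque(goto[0].values())
        while pending:
            r = pending.popleft()
            for ch, u in goto[r].items():
                pending.append(u)
                f = fail[r]
                while f and ch not in goto[f]:
                    f = fail[f]
                fail[u] = goto[f].get(ch, 0)
                out[u] += out[fail[u]]   # inherit accepting states
        # Walk the text once; the whole thing is linear, as advertised.
        results = dict((pat, []) for pat in patterns)
        s = 0
        for i, ch in enumerate(text):
            while s and ch not in goto[s]:
                s = fail[s]
            s = goto[s].get(ch, 0)
            for pat in out[s]:
                results[pat].append(i - len(pat) + 1)
        return results

Running find_all("Hey, look, look, look at these patterns", ('look',
'for', 'these', 'patterns')) returns exactly the dict shown in my
previous message.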

If this is going to be implemented you may as well do it right the first time.

You could also return a dict in which (pattern) keys are absent if they
didn't match at all. Then it would be fast to tell which, if any, patterns
matched - no need to step through all passed patterns, just use
result.keys() to get them.

Terry


From grosser.meister.morti at gmx.net  Wed Sep  5 18:50:21 2007
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Wed, 05 Sep 2007 18:50:21 +0200
Subject: [Python-ideas] FInd first tuple argument for
	str.find	and	str.index
In-Reply-To: <18142.54758.609458.513647@terry-jones-computer.local>
References: <46DE53DE.7070803@ronadam.com>	<ca471dc20709050805t357d9732r379ca57c510708c6@mail.gmail.com>
	<18142.54758.609458.513647@terry-jones-computer.local>
Message-ID: <46DEDE4D.4010709@gmx.net>

Terry Jones wrote:
>>>>>> "Guido" == Guido van Rossum <guido at python.org> writes:
> Guido> I was surprised to find that startswith and endswith support this,
> Guido> but it does make sense. Adding a patch to 2.6 would cause it to be
> Guido> merged into 3.0 soon enough.
> 
> Guido> On 9/4/07, Ron Adam <rrr at ronadam.com> wrote:
>>> Could we add the ability of str.index and str.find to accept a tuple as the
>>> first argument and return the index of the first item found in it.
> 
> Hi
> 
> If someone is going to head down this path, it might be better to implement
> a more general algorithm and provide the above as a special case via an
> argument.
> 
> There's a fast and beautiful algorithm due to Aho & Corasick (CACM, 1975)
> that finds _all_ matches of a set of patterns. It runs in time that's
> linear in max(sum of the lengths of the patterns to be matched, length of
> to-be-matched text). The algorithm was the basis of fgrep.
> 
> To provide the above, a special case could have it return as soon as it
> found a first match (i.e., of any pattern).
> 
> One general way to write it would be to have it return a dict of patterns
> and result indices in the case that the pattern argument is a tuple.
> 
> So
> 
>   "Hey, look, look, look at these patterns".find(('look', 'for', 'these', 'patterns'))
> 
> might return
> 
>     {
>       'look'     : [ 5, 11, 17 ],
>       'for'      : [ ],  # or arguably [ -1 ],
>       'these'    : [ 25 ],
>       'patterns' : [ 31 ],
>     }
> 
> OK, that's a bit of a departure from the normal behavior of find, but so is
> passing a tuple of patterns. Alternately, you could also get back a tuple
> of (tuples of) matching indices.
> 
> The ideal calling interface and result depends on what you need to do -
> check if a specific string matched? Just know the first match offset, etc.
> 
> I don't know the best solution, but the algorithm rocks. Raymond - you'll
> love it :-)
> 
> Terry

I would expect such a method to return the index where one of the given strings was
found. Or maybe a tuple: (start, end) or a tuple: (start, searchstring).

	-panzi


From terry at jon.es  Wed Sep  5 19:01:23 2007
From: terry at jon.es (Terry Jones)
Date: Wed, 5 Sep 2007 19:01:23 +0200
Subject: [Python-ideas] FInd first tuple argument
	for	str.find	and	str.index
In-Reply-To: Your message at 18:50:21 on Wednesday, 5 September 2007
References: <46DE53DE.7070803@ronadam.com>
	<ca471dc20709050805t357d9732r379ca57c510708c6@mail.gmail.com>
	<18142.54758.609458.513647@terry-jones-computer.local>
	<46DEDE4D.4010709@gmx.net>
Message-ID: <18142.57571.484035.769749@terry-jones-computer.local>

>>>>> "Mathias" == Mathias Panzenb?ck <grosser.meister.morti at gmx.net> writes:
Mathias> I would expect such a method to return the index where one of the
Mathias> given strings was found. Or maybe a tuple: (start, end) or a
Mathias> tuple: (start, searchstring).

It could do something like that if you passed an argument telling it to
quit on the first match. But that makes the return type depend on the
passed arg, which I guess is not good. We'd already be doing that if we
returned a dict, but this would return either a tuple or a dict.

You could drop the dict idea altogether, but you need to consider what to
do if many (probably different) patterns match, all starting at the same
location in the string. For this reason alone I don't think returning a
(start, searchstring) tuple is sufficient.

Given that Aho & Corasick find everything you could want to know (all
matches of all patterns), and that they do it in linear time, it doesn't
seem right to throw this information away - especially after going to the
trouble of building and walking the trie.

Terry


From rrr at ronadam.com  Wed Sep  5 20:19:36 2007
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 05 Sep 2007 13:19:36 -0500
Subject: [Python-ideas] FInd first tuple
	argument	for	str.find	and	str.index
In-Reply-To: <18142.57571.484035.769749@terry-jones-computer.local>
References: <46DE53DE.7070803@ronadam.com>	<ca471dc20709050805t357d9732r379ca57c510708c6@mail.gmail.com>	<18142.54758.609458.513647@terry-jones-computer.local>	<46DEDE4D.4010709@gmx.net>
	<18142.57571.484035.769749@terry-jones-computer.local>
Message-ID: <46DEF338.10407@ronadam.com>



Terry Jones wrote:
>>>>>> "Mathias" == Mathias Panzenb?ck <grosser.meister.morti at gmx.net> writes:
> Mathias> I would expect such a method to return the index where one of the
> Mathias> given strings was found. Or maybe a tuple: (start, end) or a
> Mathias> tuple: (start, searchstring).
> 
> It could do something like that if you passed an argument telling it to
> quit on the first match. But that makes the return type depend on the
> passed arg, which I guess is not good. We'd already be doing that if we
> returned a dict, but this would return either a tuple or a dict.
> 
> You could drop the dict idea altogether, but you need to consider what to
> do if many (probably different) patterns match, all starting at the same
> location in the string. For this reason alone I don't think returning a
> (start searchstring) tuple is sufficient.

I was thinking of something a bit more lightweight.

For more complex stuff I think the 're' module already does pretty much 
what you are describing.  It may even already take advantage of the 
algorithms you referred to.  If not, that would be an important improvement 
to the re module. :-)
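
For instance, a single-pass scan over several fixed terms can already be
written with re today (a quick sketch):

    import re

    terms = ('{', '}')
    pattern = re.compile('|'.join(re.escape(t) for t in terms))
    for m in pattern.finditer("a {b {c} d} e"):
        print("%d %s" % (m.start(), m.group()))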

The use case I had in mind was finding starting and ending delimiters, and
avoiding the following kind of awkward code.  (This would work for finding
other things as well, of course.)

    start = 0
    while start < len(s):

        i1 = s.find('{', start)
        if i1 == -1:
            i1 = len(s)

        i2 = s.find('}', start)
        if i2 == -1:
            i2 = len(s)

        # etc... for as many search terms as you have...
        # or use a loop to locate each one.

        start = min(i1, i2)
        if start == len(s):
            break

        ...
        # do something with s[start], then advance start past
        # the match so the next find() makes progress
        ...

That works, but it has to go through the string once for each item.  Of
course I would use 're' for anything more complex than a few fixed-length
terms.

The above could be simplified greatly to the following, which would be much
quicker than what we have now and still not overly complex.

    start = 0
    while start < len(s):
        try:
            start = s.index(('{', '}'), start)
        except ValueError:
            break
        ...
        # do something with s[start], then advance start
        ...


> Given that Aho & Corasick find everything you could want to know (all
> matches of all patterns), and that they do it in linear time, it doesn't
> seem right to throw this information away - especially after going to the
> trouble of building and walking the trie.

Thanks for the reference, I'll look into it.  :-)

If the function returns something other than a simple index, then I think 
it will need to be a new function or method and not just an alteration of 
str.index and str.find.  In that case it may also need a PEP.

Cheers,
    Ron



From terry at jon.es  Wed Sep  5 21:55:44 2007
From: terry at jon.es (Terry Jones)
Date: Wed, 5 Sep 2007 21:55:44 +0200
Subject: [Python-ideas] FInd first
	tuple	argument	for	str.find	and	str.index
In-Reply-To: Your message at 13:19:36 on Wednesday, 5 September 2007
References: <46DE53DE.7070803@ronadam.com>
	<ca471dc20709050805t357d9732r379ca57c510708c6@mail.gmail.com>
	<18142.54758.609458.513647@terry-jones-computer.local>
	<46DEDE4D.4010709@gmx.net>
	<18142.57571.484035.769749@terry-jones-computer.local>
	<46DEF338.10407@ronadam.com>
Message-ID: <18143.2496.630054.862695@terry-jones-computer.local>

Hi Ron

>>>>> "Ron" == Ron Adam <rrr at ronadam.com> writes:
Ron> I was thinking of something a bit more light weight.

Ah, now we get to what you actually want to do :-)

Ron> For more complex stuff I think the 're' module already does pretty
Ron> much what you are describing.  It may even already take advantage of
Ron> the algorithms you referred to.  If not, that would be an important
Ron> improvement to the re module. :-)

Yes, that would make a good SoC project. But as you say it may already be
done that way.

Ron> The use case I had in mind was to find starting and ending delimiters.
Ron> And to avoid the following type of awkward code.  (This would work for
Ron> finding other things as well of course.)

    start = 0
    while start < len(s):

        i1 = s.find('{', start)
        if i1 == -1:
            i1 = len(s)

        i2 = s.find('}', start)
        if i2 == -1:
            i2 = len(s)

        # etc... for as many search terms as you have...
        # or use a loop to locate each one.

        start = min(i1, i2)
        if start == len(s):
            break

        ...
        # do something with s[start]
        ...

Ron> That works but it has to go through the string once for each item.

It's worse than that. _Each time around the loop_ it tests all candidates
against all of the remaining text. Imagine matching L patterns that were
each 'a' * M against a text of 'a' * N. You're going to do (roughly) O(N *
L * M) comparisons, using naive string matching.

You could completely drop patterns that have already returned -1. You can
short-circuit a loop if you ever got a 0 index back.  Sorry if this seems
picky - I guess you're just writing quick pseudo-code.
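
For instance, both tweaks rolled into a generator might look like this
(a sketch; 'iter_first_matches' is a made-up name, and I've generalized
"a 0 index" to a hit exactly at the current start):

    def iter_first_matches(s, terms, start=0):
        # Yield successive first-match positions across all terms.
        terms = list(terms)
        while terms and start < len(s):
            best = -1
            for t in list(terms):
                i = s.find(t, start)
                if i == -1:
                    terms.remove(t)   # this term can never match again
                elif i == start:
                    best = i          # no earlier hit is possible;
                    break             # short-circuit the remaining terms
                elif best == -1 or i < best:
                    best = i
            if best == -1:
                return
            yield best
            start = best + 1

    print(list(iter_first_matches("a {b {c} d} e", ('{', '}'))))
    # -> [2, 5, 7, 10]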

Ron> Of course I would use 're' for anything more complex than a few fixed
Ron> length terms.

Yes. It depends partly on what you want back. You could write a super-fast
iterator based on A&C that told you everything you need to know, with
guaranteed linear worst case behavior, but that seems like overkill here
(and it may be in re, as you say).

Ron> If the function returns something other than a simple index, then I
Ron> think it will need to be a new function or method and not just an
Ron> alteration of str.index and str.find.  In that case it may also need a
Ron> PEP.

Be my guest :-)

Regards,
Terry


From rrr at ronadam.com  Thu Sep  6 00:59:41 2007
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 05 Sep 2007 17:59:41 -0500
Subject: [Python-ideas] FInd first
	tuple	argument	for	str.find	and	str.index
In-Reply-To: <18143.2496.630054.862695@terry-jones-computer.local>
References: <46DE53DE.7070803@ronadam.com>	<ca471dc20709050805t357d9732r379ca57c510708c6@mail.gmail.com>	<18142.54758.609458.513647@terry-jones-computer.local>	<46DEDE4D.4010709@gmx.net>	<18142.57571.484035.769749@terry-jones-computer.local>	<46DEF338.10407@ronadam.com>
	<18143.2496.630054.862695@terry-jones-computer.local>
Message-ID: <46DF34DD.4000500@ronadam.com>



Terry Jones wrote:

 > It's worse than that. _Each time around the loop_ it tests all candidates
 > against all of the remaining text. Imagine matching L patterns that were
 > each 'a' * M against a text of 'a' * N. You're going to do (roughly) O(N *
 > L * M) comparisons, using naive string matching.

Yep, it's bad, which is why I don't want to do it this way.  Of course it's 
much better than ...

     for char in string:
         ... etc ...


 > You could completely drop patterns that have already returned -1. You can
 > short-circuit a loop if you ever got a 0 index back.  Sorry if this seems
 > picky - I guess you're just writing quick pseudo-code.

Well, it's a bit of both: pseudo-code, and from an actual problem I was
working on.  It was a first unoptimized version.  The first rule is to get
something that works, then make it fast after it's tested.  Right?  :-)

No, you aren't being any more picky than I am.  I didn't like that solution
either, which is why I suggested improving index and find just a bit.

And I'm not really keen on this next one, although it will probably work better.

(not tested)

length = len(s)
start = -1
cases = dict((term, -1) for term in terms)
while True:
    for term, i in cases.items():
        if i <= start:   # this term's last hit was consumed; search again
            i = s.find(term, start + 1)
            cases[term] = i if i > -1 else length
    start = min(cases.values())
    if start == length:
        break
    ...
    # do something with s[start]

Return some result we constructed or found along the way.

That's an improvement, but it's still way more work than I think the 
problem should need.  It's also complex enough that it's no longer obvious 
just what it's doing.


I still like the tuple versions of index and find better.  They're much
easier to read and understand.

    ...
    try:
        start = s.index(('{', '}'), start)
    except ValueError:
        break
    ...

    ...
    start = s.find(('{', '}'), start)
    if start == -1:
        break
    ...


 > Ron> Of course I would use 're' for anything more complex than a few fixed
 > Ron> length terms.
 >
 > Yes. It depends partly on what you want back. You could write a super-fast
 > iterator based on A&C that told you everything you need to know, with
 > guaranteed linear worst case behavior, but that seems like overkill here
 > (and it may be in re, as you say).

Regular expressions can be slower for small things because they have more
overhead.  They're not always the fastest choice.
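
That's easy to check for any particular case with timeit (a rough
sketch; the absolute numbers vary by version and platform):

    import timeit

    setup = "s = 'a' * 100 + '{'; import re; pat = re.compile('[{}]')"
    print(timeit.timeit("s.find('{')", setup=setup, number=100000))
    print(timeit.timeit("pat.search(s)", setup=setup, number=100000))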

And I've read that they are not good at parsing nested delimiters.

Cheers,
    Ron


 > Ron> If the function returns something other than a simple index, then I
 > Ron> think it will need to be a new function or method and not just an
 > Ron> alteration of str.index and str.find.  In that case it may also need a
 > Ron> PEP.
 >
 > Be my guest :-)
 >
 > Regards,
 > Terry


From rrr at ronadam.com  Thu Sep  6 13:58:40 2007
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 06 Sep 2007 06:58:40 -0500
Subject: [Python-ideas] FInd first tuple argument for str.find and
	str.index
In-Reply-To: <ca471dc20709050805t357d9732r379ca57c510708c6@mail.gmail.com>
References: <46DE53DE.7070803@ronadam.com>
	<ca471dc20709050805t357d9732r379ca57c510708c6@mail.gmail.com>
Message-ID: <46DFEB70.90201@ronadam.com>



Guido van Rossum wrote:
> I was surprised to find that startswith and endswith support this, but
> it does make sense. Adding a patch to 2.6 would cause it to be merged
> into 3.0 soon enough.


I'll give it a try, but it may take me a while.  If someone else who is
more familiar with the string and unicode objects wants to do this, that
would be good.

Cheers,
    Ron





> On 9/4/07, Ron Adam <rrr at ronadam.com> wrote:
>> Could we add the ability of str.index and str.find to accept a tuple as the
>> first argument and return the index of the first item found in it.
>>
>> This is similar to how str.startswith and str.endswith already works.
>>
>>   |  startswith(...)
>>   |      S.startswith(prefix[, start[, end]]) -> bool
>>   |
>>   |      Return True if S starts with the specified prefix, False otherwise.
>>   |      With optional start, test S beginning at that position.
>>   |      With optional end, stop comparing S at that position.
>>   |      prefix can also be a tuple of strings to try.
>>
>>
>> This would speed up cases of filtering and searching when more than one
>> item is being searched for.  It would also simplify building iterators that
>> filter and yield multiple items in order.
>>
>>
>> A general google code search seems to show it's a generally useful thing to
>> do.
>>
>> http://www.google.com/codesearch?hl=en&lr=&q=%22findfirst%22+string&btnG=Search
>>
>>
>> (searching for python specific code doesn't show much because python
>> doesn't have a findfirst function of any type.)
>>
>>
>> Cheers,
>>     Ron
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>>
> 
> 


From jim_hill_au-24 at yahoo.com.au  Sun Sep  9 09:24:47 2007
From: jim_hill_au-24 at yahoo.com.au (Jim Hill)
Date: Sun, 09 Sep 2007 17:24:47 +1000
Subject: [Python-ideas] loop, breakif, skip
Message-ID: <46E39FBF.9010703@yahoo.com.au>


These 4 proposals are somewhat inter-dependent,
so I include them in a single message.

They are simple ideas, childish almost.
Think of non-programmers writing simple scripts,
and children learning basic coding in grade school.

Hope I'm not wasting your time with nonsense.
(I'm not an advanced programmer, so can't be sure.)

------------

Proposal 1

Abolish 'continue' in loops, and use 'skip' instead.

In normal English 'continue' means 'carry on at the next line'.

In most programming languages 'continue' means
'jump back UP the page to the start of this loop'.

This is OK for programmers accustomed to C, but
I find it very counter-intuitive, even though
I know it means 'continue with the next iteration'.

On the other hand, 'skip', meaning
'skip the rest of this iteration',
feels more intuitive to me.

'continue' is too long, 'skip' is short.

One new keyword.
Breaks existing code.

------------

Proposal 2

An alternative to PEP 315, and a simpler way
to write while loops of various flavours.

part A

'loop:' is exactly equivalent to 'while True:'

part B

(1)
'*breakif <condition>'
is exactly equivalent to
'if <condition>: break'

(2)
'*skipif <condition>'
is exactly equivalent to
'if <condition>: skip'

(assuming 'skip' replaces 'continue')

The * is to make the word easier to find by eye.
Would some other character do it better?

Parts A and B together allow while loops to optionally
look something like this:

loop:
    <statements>
    *breakif <condition>
    <statements>
    *skipif <condition>
    <statements>


*breakif and *skipif can be used in for loops too, of course (see the
sketch below).

3 new keywords.
Existing code would not be affected, unless it was
already using loop, *breakif or *skipif as names.
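
For comparison, a sketch of what that shape corresponds to in today's
Python (the loop body is an arbitrary stand-in):

    import random

    while True:                # 'loop:'
        n = random.randrange(10)
        if n == 0:             # '*breakif n == 0'
            break
        if n % 2:              # '*skipif n % 2'
            continue
        print("even: %d" % n)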

------------

Proposal 3

Mainly for young students learning to program.

the keyword 'loop' can be placed in front of the keyword 'while'
the keyword 'loop' can be placed in front of the keyword 'for'
without changing the meaning of 'while' or 'for'.

It looks like this:

loop while <condition>:
    <statements>

loop for <iteration expression>:
    <statements>


Allows beginner students the satisfaction of thinking that
every kind of loop begins with the word 'loop',
which also makes learning a little easier.
(Later they will learn that 'loop' can be left out.)

Existing code would not be affected.

------------

Proposal 4

If 'continue' is not used in loops, it can have
a more meaningful role in switch/case blocks.

'continue' in a Python switch block would have a meaning
opposite to that of 'break' in a C switch block,
allowing you to do 'fall-through'.

Here 'continue' would have its intuitive meaning of
'carry on at the next line'.


switch <expression>:
    case <values>:
       <statements>
       [continue]
    case <values>:
       <statements>
       [continue]
    case <values>:
       <statements>


Existing code would not be affected,
as switch/case is not implemented yet.

------------

Jim Hill



From greg.ewing at canterbury.ac.nz  Sun Sep  9 10:21:06 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 09 Sep 2007 20:21:06 +1200
Subject: [Python-ideas] loop, breakif, skip
In-Reply-To: <46E39FBF.9010703@yahoo.com.au>
References: <46E39FBF.9010703@yahoo.com.au>
Message-ID: <46E3ACF2.9050808@canterbury.ac.nz>

Jim Hill wrote:
> Think of non-programmers writing simple scripts,
> and children learning basic coding in grade school.
>  
> Abolish 'continue' in loops, and use 'skip' instead.

I wouldn't recommend teaching beginning programmers
about continue at all, whatever it's called. It's
an unnecessary complication when learning the
fundamentals.

> 'loop:' is exactly equivalent to 'while True:'

> '*breakif <condition>'
> '*skipif <condition>'

These just look ugly and unpythonic.

> the keyword 'loop' can be placed in front of the keyword 'while'
> the keyword 'loop' can be placed in front of the keyword 'for'

'Loop' is a piece of programming jargon, not
something that would occur readily to someone thinking
in everyday terms. Python's way of phrasing its
loops is closer to natural English, therefore,
one would expect, easier for beginning programmers
to get the meaning of.

> Allows beginner students the satisfaction of thinking that
> every kind of loop begins with the word 'loop',

All two of them? I don't think that's a big enough
burden on the memory to be worth introducing
Another Way To Do It.

> 'continue' in a Python switch block would have a meaning
> opposite to that of 'break' in a C switch block,
> allowing you to do 'fall-through'.

I don't think that's any more intuitive than its
current meaning. The word 'continue' on its own
doesn't really say anything at all about *what*
to continue with, except in a context where you're
stopped in some way, which is not the case here.
So whatever meaning is chosen, it's something
that has to be learned.

--
Greg


From ksankar at doubleclix.net  Sun Sep 16 22:59:32 2007
From: ksankar at doubleclix.net (Krishna Sankar)
Date: Sun, 16 Sep 2007 13:59:32 -0700
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately massive
 (4 to 32 cores) multi-core architectures
Message-ID: <46ED9934.4070201@doubleclix.net>

PEP: xxxxxxxx
Title: Concurrency for moderately massive (4 to 32 cores) multi-core architectures
Version: $Revision$
Last-Modified: $Date$
Author: Krishna Sankar <ksankar (at) doubleclix.net>,
Status: Wandering ! (as in "Not all those who wander are lost ..." -J.R.R.Tolkien)
Type: Process
Content-Type: text/x-rst
Created: 15-Sep-2007

Abstract
--------
This proposal aims at leveraging multi-core capability as an embedded mechanism in Python. The question is not whether Python is slow or fast, but one of performance and control of parallelism/concurrency in a moderately massive parallelism world. The aim is 4 to 32 cores. The proposal advocates two mechanisms - one for task parallelism and another for data-intensive parallelism. Scientific computing and web 2.0 frameworks are the forefront users for this proposal; other applications would benefit as well.

Rationale
---------
Multi-core architectures need no introduction, and their ubiquity is evident. It is imperative that Python have one or more standard ways of leveraging multi-core architectures. OTOH, traditional thread-based concurrency and lock-based exclusion are becoming more and more difficult to program correctly.

First of all, the question is not whether py is slow or fast but the performance of a system written in py - which means the ability to leverage multi-core architectures, as well as control: control in terms of things like the ability to pin one process/task to a core, to pin one or more homogeneous tasks to specific cores, and to not wait on a global lock and similar primitives. (Before anybody jumps to a conclusion, this is not about the GIL by any means ;o))

Second, it is clear that we need a good solution (not THE solution) for moderately massive parallelism in multi-core architectures (i.e. 8-32 cores). Share-nothing might not be optimal; we need some form of memory sharing, not just copying all data via messages. Maybe functional programming based on the blackboard pattern would work - who knows.

I have seen saturated systems with only ~25% CPU utilization (on a 4-core system!). That is because we didn't leverage multiple cores and parallelism. So while py3k will not be slow, the lack of a cohesive multi-core strategy will show up in system performance and byte us later (pun intended!).

At least in my mind, this is not an exercise about exposing locks, mutexes or threads in Python. I do believe that the GIL will be refactored to finer granularity in the coming months (similar to the global locks in Linux) and most probably we will get microThreads et al. As we all know, architecture is constraining as well as liberating. The language primitives greatly influence how we think about a problem.

In the discussions, Guido is right in insisting on speed, and Bruce is right in asking for language constructs. Without pragmatic speed, folks won't use it; the same is the case without the required constructs. Both are barriers to adoption. We have an opportunity to offer a solution for multi-core architectures - let us seize it; we will rush in where angels fear to tread!

Programming Models
------------------
There are at least four possible paradigms:

A. Conventional threading model
B. Functional model, Erlang being the most appropriate
C. Some form of limited shared-memory model (message passing, but passing pointers; blackboard model)
D. Others, like Transactional Memory [2]

There is enough literature out there, so I do not plan to explain these here. (<KS> Do we need more explanation? </KS>)

Pragmatic proposal
------------------
May I suggest we embed two primitives in Python 3K:
A)	A functional-style share-nothing set of interfaces (and implementations thereof) - this provides the task parallelism/concurrency capability: "small messages, big computations", as Joe Armstrong calls it [3]
B)	A limited shared-memory-based model for data-intensive parallelism

Most probably this would be part of the stdlib. While Guido is almost right in saying that this is a (std)library problem, it is not fully so; we would need a few primitives from the underlying PVM substrate. Possibly one reason for Guido's position is the lack of clarity as to what needs to be changed and why. IMHO, just saying "take the GIL off" does not solve the problem either.

The Zen of Python parallelism
-----------------------------
I draw inspiration from the very timely article by James Reinders in DDJ [1]. It embodies what we should be doing, viz.:
1. Refactor the problem into parallel tasks. We cannot help it if the domain is sequential.
2. Program to abstractions; program chores, not cores. Writing correct programs using raw threads et al. is difficult. Let the underlying substrate decide how best to optimize.
3. Design for scale.
4. Have an option to turn concurrency off, for debugging.
5. Declarative parallelism-based mechanisms (?)

Related Efforts
---------------
The good news is there are at least 2 or 3 paradigms with implementations and rough benchmarks. Hopefully we can leverage the implementations and mature them into the stdlib (with the required primitives in the PVM):
Parallel python http://www.artima.com/weblogs/viewpost.jsp?thread=214303
http://cheeseshop.python.org/pypi/parallel
Processing http://cheeseshop.python.org/pypi/processing
http://code.google.com/p/papyros/

Discussions
-----------
There are at least four thread sets (pardon the pun!) I am aware of:
1. The GIL discussions in python-dev and Guido's blog on GIL http://www.artima.com/weblogs/viewpost.jsp?thread=214235
2. The py3k topics started by Bruce http://www.artima.com/weblogs/viewpost.jsp?thread=214112, response by Guido http://www.artima.com/weblogs/viewpost.jsp?thread=214325 and reply to reply by Bruce http://www.artima.com/weblogs/viewpost.jsp?thread=214480
3. Python and concurrency http://mail.python.org/pipermail/python-ideas/2007-March/000338.html

References
----------
[1] http://www.ddj.com/architect/201804248
[2] Transactional memory: http://acmqueue.com/modules.php?name=Content&pa=showpage&pid=444
[3] Programming Erlang, by Joe Armstrong




From ksankar at doubleclix.net  Sun Sep 16 23:34:10 2007
From: ksankar at doubleclix.net (Krishna Sankar)
Date: Sun, 16 Sep 2007 14:34:10 -0700
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
 massive (4 to 32 cores) multi-core architectures
In-Reply-To: <46ED9934.4070201@doubleclix.net>
References: <46ED9934.4070201@doubleclix.net>
Message-ID: <46EDA152.8040601@doubleclix.net>

Folks,
    For some reason (fat fingers ;o() I missed the introduction to the 
proposal. Here is the full mail (pardon me for the spam):

    As a follow-up to the py3k discussions started by Bruce and Guido, I 
pinged Brett and he suggested I submit an exploratory proposal. Would 
appreciate insights, wisdom, the good, the bad and the ugly.
 
A)    Does it make sense?
B)    Which application sets should we consider in designing the 
interfaces and implementations?
C)    In this proposal, parallelism and concurrency are used in an 
interchangeable fashion. Thoughts?
D)    Please suggest pertinent links, discussions and insights.
E)    I have kept the proposal to a minimum to start the discussions and 
to explore if this is the right thing to do. Collaboratively, as we 
zero in on one or two approaches, the idea is to expand it to a crisp 
and clear PEP. Need to do some more formatting as well.

------------------------------------------------------------------------------------------------------------
PEP: xxxxxxxx
Title: Concurrency for moderately massive (4 to 32 cores) multi-core 
architectures
Version: $Revision$
Last-Modified: $Date$
Author: Krishna Sankar <ksankar (at) doubleclix.net>,
Status: Wandering ! (as in "Not all those who wander are lost ..." 
-J.R.R.Tolkien)
Type: Process
Content-Type: text/x-rst
Created: 15-Sep-2007

Abstract
--------
This proposal aims at leveraging multi-core capability as an 
embedded mechanism in Python. The question is not whether Python is slow 
or fast, but one of performance and control of parallelism/concurrency 
in a moderately massive parallelism world. The aim is 4 to 32 cores. The 
proposal advocates two mechanisms - one for task parallelism and another 
for data-intensive parallelism. Scientific computing and web 2.0 
frameworks are the forefront users for this proposal; other applications 
would benefit as well.

Rationale
---------
Multi-core architectures need no introduction, and their ubiquity is 
evident. It is imperative that Python have one or more standard ways of 
leveraging multi-core architectures. OTOH, traditional thread-based 
concurrency and lock-based exclusion are becoming more and more 
difficult to program correctly.

First of all, the question is not whether py is slow or fast but the 
performance of a system written in py - which means the ability to 
leverage multi-core architectures, as well as control: control in terms 
of things like the ability to pin one process/task to a core, to pin one 
or more homogeneous tasks to specific cores, and to not wait on a global 
lock and similar primitives. (Before anybody jumps to a conclusion, this 
is not about the GIL by any means ;o))

Second, it is clear that we need a good solution (not THE solution) for 
moderately massive parallelism in multi-core architectures (i.e. 8-32 
cores). Share-nothing might not be optimal; we need some form of memory 
sharing, not just copying all data via messages. Maybe functional 
programming based on the blackboard pattern would work - who knows.

I have seen saturated systems with only ~25% CPU utilization (on a 
4-core system!). That is because we didn't leverage multiple cores and 
parallelism. So while py3k will not be slow, the lack of a cohesive 
multi-core strategy will show up in system performance and byte us 
later (pun intended!).

At least in my mind, this is not an exercise about exposing locks, 
mutexes or threads in Python. I do believe that the GIL will be 
refactored to finer granularity in the coming months (similar to the 
global locks in Linux) and most probably we will get microThreads et al. 
As we all know, architecture is constraining as well as liberating. The 
language primitives greatly influence how we think about a problem.

In the discussions, Guido is right in insisting on speed, and Bruce is 
right in asking for language constructs. Without pragmatic speed, folks 
won't use it; the same is the case without the required constructs. Both 
are barriers to adoption. We have an opportunity to offer a solution for 
multi-core architectures - let us seize it; we will rush in where 
angels fear to tread!

Programming Models
------------------
There are at least four possible paradigms:

A. Conventional threading model
B. Functional model, Erlang being the most appropriate
C. Some form of limited shared-memory model (message passing, but 
passing pointers; blackboard model)
D. Others, like Transactional Memory [2]

There is enough literature out there, so I do not plan to explain these 
here. (<KS> Do we need more explanation? </KS>)

Pragmatic proposal
------------------
May I suggest we embed two primitives in Python 3K:
A)    A functional-style share-nothing set of interfaces (and 
implementations thereof) - this provides the task parallelism/concurrency 
capability: "small messages, big computations", as Joe Armstrong calls it [3]
B)    A limited shared-memory-based model for data-intensive parallelism

Most probably this would be part of the stdlib. While Guido is almost 
right in saying that this is a (std)library problem, it is not fully so; 
we would need a few primitives from the underlying PVM substrate. Possibly 
one reason for Guido's position is the lack of clarity as to what needs 
to be changed and why. IMHO, just saying "take the GIL off" does not 
solve the problem either.

The Zen of Python parallelism
-----------------------------
I draw inspiration from the very timely article by James Reinders in DDJ 
[1]. It embodies what we should be doing, viz.:
1. Refactor the problem into parallel tasks. We cannot help it if the 
domain is sequential.
2. Program to abstractions; program chores, not cores. Writing correct 
programs using raw threads et al. is difficult. Let the underlying 
substrate decide how best to optimize.
3. Design for scale.
4. Have an option to turn concurrency off, for debugging.
5. Declarative parallelism-based mechanisms (?)

Related Efforts
---------------
The good news is there are at least 2 or 3 paradigms with 
implementations and rough benchmarks.
Parallel python http://www.artima.com/weblogs/viewpost.jsp?thread=214303
http://cheeseshop.python.org/pypi/parallel
Processing http://cheeseshop.python.org/pypi/processing
http://code.google.com/p/papyros/

Discussions
-----------
There are at least four thread sets (pardon the pun!) I am aware of:
1. The GIL discussions in python-dev and Guido's blog on GIL 
http://www.artima.com/weblogs/viewpost.jsp?thread=214235
2. The py3k topics started by Bruce 
http://www.artima.com/weblogs/viewpost.jsp?thread=214112, response by 
Guido http://www.artima.com/weblogs/viewpost.jsp?thread=214325 and reply 
to reply by Bruce http://www.artima.com/weblogs/viewpost.jsp?thread=214480
3. Python and concurrency 
http://mail.python.org/pipermail/python-ideas/2007-March/000338.html
 

References
----------
[1] http://www.ddj.com/architect/201804248
[2] Transactional memory: 
http://acmqueue.com/modules.php?name=Content&pa=showpage&pid=444
[3] Programming Erlang, by Joe Armstrong



From rhamph at gmail.com  Mon Sep 17 00:51:18 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Sun, 16 Sep 2007 16:51:18 -0600
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <46EDA152.8040601@doubleclix.net>
References: <46ED9934.4070201@doubleclix.net> <46EDA152.8040601@doubleclix.net>
Message-ID: <aac2c7cb0709161551m6749a203n44b27f4ca9834eea@mail.gmail.com>

On 9/16/07, Krishna Sankar <ksankar at doubleclix.net> wrote:
> Folks,
>     For some reason (fat fingers ;o() I missed the introduction to the
> proposal. Here is the full mail (pardon me for the spam):
>
>     As a follow-up to the py3k discussions started by Bruce and Guido, I
> pinged Brett and he suggested I submit an exploratory proposal. Would
> appreciate insights, wisdom, the good, the bad and the ugly.
>
> A)    Does it make sense ?
> B)    Which application sets should we consider in designing the
> interfaces and implementations
> C)    In this proposal, parallelism and concurrency are used in an
> interchangeable fashion. Thoughts ?
> D)    Please suggest pertinent links, discussions and insights.
> E)    I have kept the proposal to a minimum to start the discussions and
> to explore if this is the right thing to do. Collaboratively, as we
> zero-in on one or two approaches, the idea is to expand it to a crisp
> and clear PEP. Need to do some more formatting as well.

I've been exploring this problem for a while so I've got some pretty
strong opinions.  I guess we'll find out if my ideas pass muster. :)


------------------------------------------------------------------------------------------------------------
> PEP: xxxxxxxx
> Title: Concurrency for moderately massive (4 to 32 cores) multi-core
> architectures
> Version: $Revision$
> Last-Modified: $Date$
> Author: Krishna Sankar <ksankar (at) doubleclix.net>,
> Status: Wandering ! (as in "Not all those who wander are lost ..."
> -J.R.R.Tolkien)
> Type: Process
> Content-Type: text/x-rst
> Created: 15-Sep-2007
>
> Abstract
> --------
> This proposal aims at leveraging the multi-core capability as an
> embedded mechanism in python. It is not whether python is slow or fast,
> but of performance and control of parallelism/concurrency in a
> moderately massive parallelism world. The aim is 4 to 32 cores. The
> proposal advocates two mechanisms - one for task parallelism and another
> for data intensive parallelism. Scientific computing and web 2.0
> frameworks are the forefront users for this proposal. Other applications
> would benefit as well.

I'm not sure just what "data intensive" means.  I know some of it is
basically a variant on vectorization, but I think that can be done
much more easily in a library than in the language proper.

You're also missing distributed parallelism.  There's a large domain
in which you want failures to bring down only a single node, you're
willing to sacrifice shared state and consensus, and you're willing to
put in the extra effort to make it work.  That's not ideal for the core
of the language (where threading can do far better), but it's important
to keep in mind how they play off each other, and which use cases can
be better met by either.


>
> Rationale
> ---------
> Multicore architectures need no introductions and their ubiquity is
> evident. It is imperative that Python has one or more standard ways of
> leveraging multi-core architectures. OTOH, traditional thread based
> concurrency and lock based exclusions are becoming more and more
> difficult to program correctly.
>
> First of all, the question is not whether py is slow or fast but
> performance of a system written in py. Which means, ability to leverage
> multi-core architectures as well as control. Control in term of things
> like ability to pin one process/task to a core, ability to pin one or
> more homogeneous tasks to specific cores et al, as well as not wait for
> a global lock and similar primitives. (Before anybody jumps into a
> conclusion, this is not about GIL by any means ;o))

I'm not sure how relevant processor affinity (or prioritization, for
that matter) is.  I suspect only around 5% of users (if that) will
really need it.  It seems that you'd need some deep cooperation with
the OS to be really successful at it, and the support isn't there
today.  Of course support will likely improve as manycore becomes
common.


> Second, it is clear that we need a good solution (not THE solution) for
> moderately massive parallelism in multi-core architectures (i.e. 8-32
> cores). Share nothing might not be optimal; we need some form of memory
> sharing, not just copy all data via messages. May be functional
> programming based on the blackboard pattern would work, who knows.

I think share-nothing is clearly unacceptable.  Things like audio
processing, video processing, gaming, or GUIs need shared state.  If you
don't support it directly, you'll end up with a second-class form of
objects where the true content is shared indirectly and a handle gets
copied repeatedly.  You lose a great deal of dynamism doing that.


> I have seen systems saturated still having only ~25% of CPU utilization
> (in a 4 core system!). It is because we didn't leverage multi-cores and
> parallelism. So while py3k will not be slow, lack of a cohesive
> multi-core strategy will show up in system performance and byte us
> later(pun intended!).

This hints at a major compromise we can make.  So long as the
semantics are unchanged, we can offer a compile-time option to enable
scalable threading, even though it may have a fairly significant
amount of overhead.  For 1 to 4 cores you'll still want high
single-thread performance, but by the time you've got 8 cores it may
be an easy decision to switch to a scalable version instead.
Packagers could make this as easy as installing a different core
package.


> At least, in my mind, this is not an exercise about exposing locks and
> mutexes or threads in Python. I do believe that the GIL will be
> refactored to more granularity in the coming months (similar to the
> Global Locks in Linux) and most probably we will get microThreads et al.
> As we all know, architecture is constraining as well as liberating. The
> language primitives influence greatly how we think about a problem.

I've already got a patch/fork with the GIL refactored/removed, so I
agree that it'll change. ;)  I highly doubt we'll get any real
microthreads though: CPython is built on C, and getting away from the
C stack (and thus C's threads) is impractical.

I agree it's about primitives though.


> In the discussions, Guido is right in insisting on speed, and Bruce is
> right in asking for language constructs. Without pragmatic speed, folks
> won't use it; same is the case without the required constructs. Both are
> barriers to adoption. We have an opportunity to offer a solution for
> multi-core architectures and let us seize it - we will rush in where
> angels fear to tread!
>
> Programming Models
> ------------------
> There are at least 3 possible paradigms
>
> A. conventional threading model
> B. Functional model, Erlang being the most appropriate C. Some form of
> limited shared memory model (message passing but pass pointers,
> blackboard model) D. Others, like Transactional Memory [2]
>
> There is enough literature out there, so do not plan to explain these
> here. (<KS> Do we need more explanation? </KS>)

I'm not sure where my model fits in.  What I do is take all the
existing Python objects and give them a shareable/non-shareable
property.  If an object has some explicit semantics (such as a
thread-safe queue or an immutable int) then it's shareable; otherwise
it's non-shareable.  All the communication mechanisms (queues,
arguments to spawned threads, etc.) check this property, so it becomes
impossible to corrupt memory.
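
A toy sketch of that check (the whitelist and the names is_shareable
and spawn are invented for illustration; the real mechanism would live
inside the interpreter, not in a library):

    import threading
    try:
        import queue            # Python 3
    except ImportError:
        import Queue as queue   # Python 2

    def is_shareable(obj):
        # Immutable values are shareable, containers only if their
        # contents are; Queue stands in for "explicitly thread-safe".
        if obj is None or isinstance(obj, (int, float, complex, str, bytes)):
            return True
        if isinstance(obj, (tuple, frozenset)):
            return all(is_shareable(item) for item in obj)
        return isinstance(obj, queue.Queue)

    def spawn(fn, *args):
        # Refuse to start a thread whose arguments could corrupt memory.
        for arg in args:
            if not is_shareable(arg):
                raise TypeError("argument is not shareable: %r" % (arg,))
        t = threading.Thread(target=fn, args=args)
        t.start()
        return t

    q = queue.Queue()
    spawn(lambda out: out.put(42), q).join()   # fine: Queue is thread-safe
    print(q.get())                             # -> 42
    try:
        spawn(lambda d: d.clear(), {})         # dict is mutable: rejected
    except TypeError as exc:
        print(exc)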


> Pragmatic proposal
> ------------------
> May I suggest we embed two primitives in Python 3K:
> A)    A functional style share-nothing set of interfaces (and
> implementations thereof) - provides  the task parallelism/concurrency
> capability, "small messages, big computations" as Joe Armstrong calls it[3]
> B)    A limited shared memory based model for data intensive parallelism
>
> Most probably this would be part of stdlib. While Guido is almost right
> in saying that this is a (std)library problem, it is not fully so. We
> would need a few primitives from the underlying PVM substrate. Possibly
> one reason for Guido's position is the lack of clarity as to what needs
> to be changed and why. IMHO, just saying take GIL off does not solve the
> problem either.

I agree that it's *mostly* a stdlib problem.  The breadth of the
useful tools will simply be part of the library.  There are special
cases where language modifications are needed though.


> The Zen of Python parallelism
> -----------------------------
> I draw inspiration for the very timely article by James Reinders in DDJ
> [1]. It embodies what we should be doing viz.:
> 1. Refactor the problem into parallel tasks. We cannot help if the
> domain is sequential 2. Program to abstraction & program chores not
> cores. Writing correct program using raw threads et al is difficult. Let
> the underlying substrate decide how best to optimize 3. Design for scale
> 4. Have an option to turn concurrency off, for debugging 5. Declarative
> parallelism based mechanisms (?)

Point 4 is made moot by better debuggers.  I think that's more practical in
the long run.  Really, if you have producer and consumer threads, you
*can't* flick a switch to make them serial.


> Related Efforts
> ---------------
> The good news is there are at least 2 or 3 paradigms with
> implementations and rough benchmarks.
> Parallel python http://www.artima.com/weblogs/viewpost.jsp?thread=214303
> http://cheeseshop.python.org/pypi/parallel
> Processing http://cheeseshop.python.org/pypi/processing
> http://code.google.com/p/papyros/
>
> Discussions
> -----------
> There are at least four thread sets (pardon the pun !) I am aware of:
> 1. The GIL discussions in python-dev and Guido's blog on GIL
> http://www.artima.com/weblogs/viewpost.jsp?thread=214235
> 2. The py3k topics started by Bruce
> http://www.artima.com/weblogs/viewpost.jsp?thread=214112, response by
> Guide http://www.artima.com/weblogs/viewpost.jsp?thread=214325 and reply
> to reply by Bruce http://www.artima.com/weblogs/viewpost.jsp?thread=214480
> 3. Python and concurrency
> http://mail.python.org/pipermail/python-ideas/2007-March/000338.html
>
>
> References
> [1]http://www.ddj.com/architect/201804248
> [2]Transaction
> http://acmqueue.com/modules.php?name=Content&pa=showpage&pid=444
> [3]Programming Erlang by Joe Armstrong

I'd like to add a list of practical requirements a design must meet:
* It must be composable with traditional single-threaded programs or
libraries.  Small changes are acceptable; complete redesigns are not.
* It must be largely compatible with existing CPython code and
extensions.  The threading APIs will likely change, but replacing
Py_INCREF/Py_DECREF with something else is too much.
* It must be useful for a broad set of local problems, without
becoming burdened down with speciality features.  We don't need direct
support for distributed computing.
* It needs to be easy, reliable, and robust.  Uncaught exceptions
should gracefully abort the entire program with a stack trace,
deadlocks should be detected and broken, and corruption should be
impossible.

Open for debate:
* How much compatibility should be retained with existing concurrency
mechanisms?  If we're trying to propose something better then we
obviously want to replace them, not add "yet another library", but
transition is important too.  (I mean this question to broadly apply
to event-driven and threaded libraries alike.)

-- 
Adam Olsen, aka Rhamphoryncus


From ellisonbg.net at gmail.com  Wed Sep 19 03:31:00 2007
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Tue, 18 Sep 2007 21:31:00 -0400
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately massive
	(4 to 32 cores) multi-core architectures
Message-ID: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>

Thinking about how Python can better support parallelism and
concurrency is an important topic.  Here is how I see it:  if we don't
address the issue, the Python interpreter 5 or 10 years from now will
run at roughly the same speed as it does today.  This is because
single CPU cores are not getting much faster (power consumption is too
high).  Instead, most of the performance gains in hardware will be due
to increased hardware parallelism, which means multi/many core CPUs.

What to do about this pending crisis is a complicated issue.

There are (at least) two levels that are important:

1.  Language level features that make it possible to build
higher-level libraries/tools for parallelism.

2.  The high-level libraries/tools that most users and developers
would use to express parallelism.

I think it is absolutely critical that we worry about (1) before
jumping to (2).  So, some thoughts about (1).  Does Python itself need
to be changed to better enable people to write libraries for
expressing parallelism?

My answer to this is no.  The dominant languages for parallel
computing (C/C++/Fortran) don't really have any additional constructs
or features above Python in this respect.  Java has more
sophisticated support for threads.  Erlang has concurrency built into
its core.  But Python is not Erlang or Java.  As Twisted
demonstrates, Python as a language is plenty powerful enough to
express concurrency in an elegant way.  I am not saying that
parallelism and concurrency are easy or wonderful today in Python, just
that the language itself is not the problem.  We don't necessarily
need new language features; we simply need bright people to sit down
and think about the right way to express parallelism in Python and
then write libraries (maybe in the stdlib) that implement those ideas.

But there is a critical problem in CPython's implementation that
prevents people from really breaking new ground in this area with
Python.  It is the GIL, and here is why:

* For the platforms on which Python runs, threads are what the
hardware+OS people have given to us as the most fine grained way of
mapping parallelism onto hardware.  This is true, even if you have
philosophical or existential problems with threads.  With the
limitations of the GIL, we can't take advantage of what the hardware
gives us.

* A process based solution using message passing is simply not
suitable for many parallel algorithms that are communications bound.
The shared state of threads is needed in many cases, not because
sharing state is a "fantastic idea", but rather because it is fast.
This will only become more true as multicore CPUs gain more
sophisticated memory architectures with higher bandwidths.  Also, the
overhead of managing processes is much greater than with threads.
Many excellent fine-grained parallel approaches like Cilk would not be
possible with processes only.

* There are a number of powerful, high-level Python packages that
already exist (these have been named in the various threads) that
allow parallelism to be expressed.  All of these suffer from a GIL
related problem even though they are process based and use message
passing.  Regardless of whether you are using blocking/non-blocking
sockets/IPC, you can't run long running CPU bound code, because all
the network related stuff will stop.  You then think, "OK, I will run
the CPU intensive stuff in a different thread."  If the CPU intensive
code is just regular Python, you are fine, the Python interpreter will
switch between the network thread and the CPU intensive thread every
so often.  But the second you run extension code that doesn't release
the GIL, you are screwed.  The network thread will stall until the
extension code is done.  When it comes to implementing robust process
based parallelism using sockets, the last thing you can afford is to
have your networking black out like this, and in CPython it can't be
avoided.
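
To make the failure mode concrete, here is a minimal sketch of the
pattern; "some_extension" and its crunch() function are hypothetical
stand-ins for any C extension call that holds the GIL:

import threading

def network_loop(conn):
    # Pure-Python I/O: the interpreter can switch threads between
    # bytecodes, so this keeps running alongside Python-level CPU work.
    while True:
        data = conn.recv(4096)
        if not data:
            break

def cpu_work():
    import some_extension  # hypothetical C extension
    # If crunch() does not release the GIL, network_loop above cannot
    # execute a single bytecode until crunch() returns: the "blackout".
    some_extension.crunch()

# Wiring (sketch only):
#   threading.Thread(target=network_loop, args=(conn,)).start()
#   threading.Thread(target=cpu_work).start()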

<disclaimer>
I am not saying that threads are what everyone should be using to
express parallelism.  I am only saying that they are needed to
implement robust higher-level forms of parallelism on multicore
systems, regardless of whether the solution uses processes + threads
or threads alone.
</disclaimer>

Of the dozen or so "parallel Python" packages that currently exist,
they _all_ suffer from this problem (some hide it better than others
through clever tricks).  We can run but we can't hide.

Because of these things, I think the current "Exploratory PEP" is
entirely premature.  Let's figure out exactly what to do with the GIL
and _then_ think about the fun stuff.

Brian


From ksankar at doubleclix.net  Wed Sep 19 04:29:08 2007
From: ksankar at doubleclix.net (Krishna Sankar)
Date: Tue, 18 Sep 2007 19:29:08 -0700
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
 massive (4 to 32 cores) multi-core architectures
In-Reply-To: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
Message-ID: <46F08974.4010101@doubleclix.net>

Brian,
    Good points.

> We don't necessarily
> need new language features, we simply need bright people to sit down
>and think about the right way to express parallelism in Python and
> then write libraries (maybe in the stdlib) that implement those ideas.
<KS>
	Exactly. This PEP is about thinking through how to express parallelism in Python.
	Also, the GIL is a challenge in only one implementation (I know, it is an important implementation!).
	My assumption is that the GIL restriction will be removed one way or another, soon. (same reason; quoting Joe Louis)

	What we need to do is not to line them up (i.e., GIL removal and parallelism) serially, but to work on them simultaneously; that way they will leverage each other. Also, whatever paradigm(s) we zero in on can be implemented in other implementations anyway.

	Moreover, IMHO, we need not force the GIL issue just yet. I firmly believe that it will find a solution in its own time frame ...
</KS>
Cheers
<k/> 





From ellisonbg.net at gmail.com  Wed Sep 19 04:58:22 2007
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Tue, 18 Sep 2007 22:58:22 -0400
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <46F08974.4010101@doubleclix.net>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<46F08974.4010101@doubleclix.net>
Message-ID: <6ce0ac130709181958y6cc28a01yc22bb7bfa451c97e@mail.gmail.com>

> > We don't necessarily
> > need new language features, we simply need bright people to sit down
> >and think about the right way to express parallelism in Python and
> > then write libraries (maybe in the stdlib) that implement those ideas.
> <KS>
>         Exactly. This PEP is about thinking through how to express parallelism in Python.

While I too love to think about parallelism, until the limitations of
the GIL in CPython are fixed, all our grand thoughts will be dead ends
at some level.

>         Also, the GIL is a challenge in only one implementation (I know, it is an important implementation!).
>         My assumption is that the GIL restriction will be removed one way or another, soon. (same reason; quoting Joe Louis)

I am not that optimistic I guess.  Hopeful though.

>         What we need to do is not to line them up (i.e., GIL removal and parallelism) serially, but to work on them simultaneously; that way they will leverage each other. Also, whatever paradigm(s) we zero in on can be implemented in other implementations anyway.

If there were infinitely many people willing to work on this stuff,
then I agree, but I don't see even a dozen people hacking on the GIL.
And in my mind, once the limitations of the GIL are relaxed, the other
parallel stuff won't be very difficult given the work that people have
already done in this area.

>         Moreover, IMHO, we need not force the GIL issue just yet. I firmly believe that it will find a solution in its own time frame ...


From rhamph at gmail.com  Wed Sep 19 05:30:19 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Tue, 18 Sep 2007 21:30:19 -0600
Subject: [Python-ideas] Thread exceptions and interruption
Message-ID: <aac2c7cb0709182030n71fadd00lfb25c26491a668f@mail.gmail.com>

One of the core problems with threading is what to do with exceptions
and how to gracefully exit when one goes unhandled.  My approach is to
replace the independently spawned threads with "branches" off of your
main thread's call stack.

The standard example looks like this[1]:

def handle_client(conn, addr):
    with conn:
        ...

def accept_loop(server_conn):
    with branch() as clients:
        with server_conn:
            while True:
                clients.add(handle_client, *server_conn.accept())

The call stack will look something like this:

main - accept_loop - server_conn.accept
          |- handle_client
          \- handle_client

Here I use a with-statement[2] to create a branch point.  The branch
point collects any exceptions from its children and interrupts the
children when the first exception occurs.  Interruption is done
somewhat similarly to posix cancellation; participating functions
react to it.  However, I raise an Interrupted exception, which can
lead to much more graceful cleanup than posix cancellation. ;)

The __exit__ portion of branch's with-statement blocks until all child
threads have exited.  It then reraises the exception, if any, or wraps
them in MultipleError if several occurred.

The branch construct serves only simple needs.  It does not attempt to
limit the number of threads to the number of cores available, nor any
related tricks.  Those can be added as a separate tool (perhaps
wrapping branch.)
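
For concreteness, a rough sketch of the collection half of branch(),
built on ordinary threads (this is only an approximation: the real
construct also needs the interruption machinery described above, which
plain threading cannot express):

import threading

class MultipleError(Exception):
    """Wraps the exceptions of several failed children."""
    def __init__(self, errors):
        Exception.__init__(self, errors)
        self.errors = errors

class branch(object):
    def __init__(self):
        self._threads = []
        self._errors = []
        self._lock = threading.Lock()

    def add(self, func, *args):
        t = threading.Thread(target=self._run, args=(func,) + args)
        self._threads.append(t)
        t.start()

    def _run(self, func, *args):
        try:
            func(*args)
        except Exception as e:
            with self._lock:
                self._errors.append(e)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        # Block until every child has exited, then reraise their
        # exception, or wrap several of them in MultipleError.
        for t in self._threads:
            t.join()
        if len(self._errors) == 1:
            raise self._errors[0]
        if self._errors:
            raise MultipleError(self._errors)
        return False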

Thoughts?  Competing ideas?  Disagreement that it's a "core problem" at all? ;)


[1] I've previously (in private mostly) referred to the branch()
function as collate().  I've recently decided to rename it.

[2] Unfortunately, a with-statement lacks all the invariants that
would be desirable for the branch construct.  It also has no direct
way of handling generators-as-context-managers that themselves use
branches.

-- 
Adam Olsen, aka Rhamphoryncus


From aholkner at cs.rmit.edu.au  Wed Sep 19 06:40:09 2007
From: aholkner at cs.rmit.edu.au (Alex Holkner)
Date: Wed, 19 Sep 2007 14:40:09 +1000
Subject: [Python-ideas] Thread exceptions and interruption
In-Reply-To: <aac2c7cb0709182030n71fadd00lfb25c26491a668f@mail.gmail.com>
References: <aac2c7cb0709182030n71fadd00lfb25c26491a668f@mail.gmail.com>
Message-ID: <46F0A829.40908@cs.rmit.edu.au>

Adam Olsen wrote:

> Here I use a with-statement[2] to create a branch point.  The branch
> point collects any exceptions from its children and interrupts the
> children when the first exception occurs.  Interruption is done
> somewhat similarly to posix cancellation; participating functions
> react to it.  However, I raise an Interrupted exception, which can
> lead to much more graceful cleanup than posix cancellation. ;)

It sounds like you're proposing that a thread can be interrupted at any 
time.  The Java developers realised long ago that this is completely 
unworkable and deprecated their implementation:

http://java.sun.com/j2se/1.3/docs/guide/misc/threadPrimitiveDeprecation.html

Please disregard if I misunderstood your approach :-)

Alex.


From rhamph at gmail.com  Wed Sep 19 07:25:09 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Tue, 18 Sep 2007 23:25:09 -0600
Subject: [Python-ideas] Thread exceptions and interruption
In-Reply-To: <46F0A829.40908@cs.rmit.edu.au>
References: <aac2c7cb0709182030n71fadd00lfb25c26491a668f@mail.gmail.com>
	<46F0A829.40908@cs.rmit.edu.au>
Message-ID: <aac2c7cb0709182225g4e9d6ae2ucdc911e5022f1386@mail.gmail.com>

On 9/18/07, Alex Holkner <aholkner at cs.rmit.edu.au> wrote:
> Adam Olsen wrote:
>
> > Here I use a with-statement[2] to create a branch point.  The branch
> > point collects any exceptions from its children and interrupts the
> > children when the first exception occurs.  Interruption is done
> > somewhat similarly to posix cancellation; participating functions
> > react to it.  However, I raise an Interrupted exception, which can
> > lead to much more graceful cleanup than posix cancellation. ;)
>
> It sounds like you're proposing that a thread can be interrupted at any
> time.  The Java developers realised long ago that this is completely
> unworkable and deprecated their implementation:
>
> http://java.sun.com/j2se/1.3/docs/guide/misc/threadPrimitiveDeprecation.html
>
> Please disregard if I misunderstood your approach :-)

You misunderstood. :)  The key word was *participating* functions.
Normally this only includes things like file or socket reading.  A
CPU-bound busy loop may never get interrupted.

-- 
Adam Olsen, aka Rhamphoryncus


From guido at python.org  Wed Sep 19 18:24:04 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 19 Sep 2007 09:24:04 -0700
Subject: [Python-ideas] Thread exceptions and interruption
In-Reply-To: <aac2c7cb0709182030n71fadd00lfb25c26491a668f@mail.gmail.com>
References: <aac2c7cb0709182030n71fadd00lfb25c26491a668f@mail.gmail.com>
Message-ID: <ca471dc20709190924l4834052ck2db9ec746e2af14c@mail.gmail.com>

Regarding the issue of exceptions in threads, I indeed see it as a
non-issue. It's easy enough to develop a subclass of threading.Thread
which catches any exceptions raised by run(), and stores the exception
as an instance variable from which it can be retrieved after join()
succeeds.
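
For example, a minimal sketch of such a subclass (the names are
illustrative only):

import threading

class ExcThread(threading.Thread):
    def __init__(self, *args, **kwds):
        threading.Thread.__init__(self, *args, **kwds)
        self.exception = None

    def run(self):
        try:
            threading.Thread.run(self)
        except Exception as e:
            # Store the exception for the thread that calls join().
            self.exception = e

t = ExcThread(target=lambda: 1 / 0)
t.start()
t.join()
if t.exception is not None:
    raise t.exception  # or log it, retry, etc.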

Regarding the proposal of branching the call stack, it reminds me too
much of the problems one has when a fork()'ed child raises an
exception which ends up being handled by an exception handler higher
up in the parent's call stack (which has been faithfully copied into
the child process by fork()). That has proven a major problem, leading
to various warnings to always catch all exceptions and call os._exit()
upon problems. I realize you're not proposing exactly that. I also
admit I don't exactly understand how you plan to deal with the
situation where one thread raises an exception which the spawning
location fails to handle, while another thread is still running (but
may raise another exception later). Is the spawning thread unwound?
Then what's left to catch the second thread's exception? But all in
all it gives me the heebie-jeebies.

Finally, may I suggest that you're perhaps too much in love with the
with-statement?

--Guido



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From rhamph at gmail.com  Wed Sep 19 18:53:17 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Wed, 19 Sep 2007 10:53:17 -0600
Subject: [Python-ideas] Thread exceptions and interruption
In-Reply-To: <ca471dc20709190924l4834052ck2db9ec746e2af14c@mail.gmail.com>
References: <aac2c7cb0709182030n71fadd00lfb25c26491a668f@mail.gmail.com>
	<ca471dc20709190924l4834052ck2db9ec746e2af14c@mail.gmail.com>
Message-ID: <aac2c7cb0709190953r1ea9d7adh3a29e19fbba51fa5@mail.gmail.com>

On 9/19/07, Guido van Rossum <guido at python.org> wrote:
> Regarding the issue of exceptions in threads, I indeed see it as a
> non-issue. It's easy enough to develop a subclass of threading.Thread
> which catches any exceptions raised by run(), and stores the exception
> as an instance variable from which it can be retrieved after join()
> succeeds.
>
> Regarding the proposal of branching the call stack, it reminds me too
> much of the problems one has when a fork()'ed child raises an
> exception which ends up being handled by an exception handler higher
> up in the parent's call stack (which has been faithfully copied into
> the child process by fork()). That has proven a major problem, leading
> to various warnings to always catch all exceptions and call os._exit()
> upon problems. I realize you're not proposing exactly that. I also
> admit I don't exactly understand how you plan to deal with the
> situation where one thread raises an exception which the spawning
> location fails to handle, while another thread is still running (but
> may raise another exception later). Is the spawning thread unwound?
> Then what's left to catch the second thread's exception? But all in
> all it gives me the heebie-jeebies.

I don't see what you're getting at.  No stack copying is done so fork
is irrelevant and the spawning thread *always* blocks until all of
its child threads have exited.

Let's try a simpler example, without the main thread (which isn't
special for exception purposes):
(Make sure to look at it with a monospace font)

           / baz
foo - bar +- baz
           \ baz

bar encapsulates several threads.  It makes no sense to unravel the call
tree while a lower portion of it still exists, so it must wait.  If
there is a failure, bar will politely tell all 3 baz functions to
exit, but they probably won't listen (unless they're calling I/O).  If
necessary bar will wait forever.

foo never sees any of this.  It is completely hidden within bar.
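
In terms of the branch() construct from my earlier message (still
hypothetical), bar might look like this:

def baz():
    pass  # stand-in for real work; may raise

def bar():
    with branch() as children:
        for i in range(3):
            children.add(baz)
    # We only get here once all three baz threads have exited; any
    # child exception is reraised at this point, so foo above never
    # sees a half-unwound call tree.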


> Finally, may I suggest that you're perhaps too much in love with the
> with-statement?

I've a preference for writing as a library, rather than with new syntax. ;)


-- 
Adam Olsen, aka Rhamphoryncus


From guido at python.org  Wed Sep 19 19:16:06 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 19 Sep 2007 10:16:06 -0700
Subject: [Python-ideas] Thread exceptions and interruption
In-Reply-To: <aac2c7cb0709190953r1ea9d7adh3a29e19fbba51fa5@mail.gmail.com>
References: <aac2c7cb0709182030n71fadd00lfb25c26491a668f@mail.gmail.com>
	<ca471dc20709190924l4834052ck2db9ec746e2af14c@mail.gmail.com>
	<aac2c7cb0709190953r1ea9d7adh3a29e19fbba51fa5@mail.gmail.com>
Message-ID: <ca471dc20709191016x6e88ad4bxac9ec203f3c24393@mail.gmail.com>

On 9/19/07, Adam Olsen <rhamph at gmail.com> wrote:
> Let's try a simpler example, without the main thread (which isn't
> special for exception purposes):
> (Make sure to look at it with a monospace font)
>
>            / baz
> foo - bar +- baz
>            \ baz
>
> bar encapsulates several threads.  It makes no sense to unravel the call
> tree while a lower portion of it still exists, so it must wait.  If
> there is a failure, bar will politely tell all 3 baz functions to
> exit, but they probably won't listen (unless they're calling I/O).  If
> necessary bar will wait forever.
>
> foo never sees any of this.  It is completely hidden within bar.

So what happens if the first baz thread raises an exception that bar
isn't handling? I suppose it first waits until all baz threads are
done, but then the question is still open. Does it percolate up to
foo? What if two or more baz threads raise exceptions? How does foo
see these?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From rhamph at gmail.com  Wed Sep 19 19:37:29 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Wed, 19 Sep 2007 11:37:29 -0600
Subject: [Python-ideas] Thread exceptions and interruption
In-Reply-To: <ca471dc20709191016x6e88ad4bxac9ec203f3c24393@mail.gmail.com>
References: <aac2c7cb0709182030n71fadd00lfb25c26491a668f@mail.gmail.com>
	<ca471dc20709190924l4834052ck2db9ec746e2af14c@mail.gmail.com>
	<aac2c7cb0709190953r1ea9d7adh3a29e19fbba51fa5@mail.gmail.com>
	<ca471dc20709191016x6e88ad4bxac9ec203f3c24393@mail.gmail.com>
Message-ID: <aac2c7cb0709191037l5b57cee8g6f9ff28ffa427ee2@mail.gmail.com>

On 9/19/07, Guido van Rossum <guido at python.org> wrote:
> On 9/19/07, Adam Olsen <rhamph at gmail.com> wrote:
> > Let's try a simpler example, without the main thread (which isn't
> > special for exception purposes):
> > (Make sure to look at it with a monospace font)
> >
> >            / baz
> > foo - bar +- baz
> >            \ baz
> >
> > bar encapsulates several threads.  It makes no sense to unravel the call
> > tree while a lower portion of it still exists, so it must wait.  If
> > there is a failure, bar will politely tell all 3 baz functions to
> > exit, but they probably won't listen (unless they're calling I/O).  If
> > necessary bar will wait forever.
> >
> > foo never sees any of this.  It is completely hidden within bar.
>
> So what happens if the first baz thread raises an exception that bar
> isn't handling? I suppose it first waits until all baz threads are
> done, but then the question is still open. Does it percolate up to
> foo? What if two or more baz threads raise exceptions? How does foo
> see these?

bar itself doesn't see it until *after* they've all exited.  The
branch construct holds it until all child threads have exited.  There
is no way to get weird stack unwinding.

If multiple exceptions occur they get encapsulated in a MultipleError exception.

-- 
Adam Olsen, aka Rhamphoryncus


From aahz at pythoncraft.com  Wed Sep 19 21:33:11 2007
From: aahz at pythoncraft.com (Aahz)
Date: Wed, 19 Sep 2007 12:33:11 -0700
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
Message-ID: <20070919193311.GA6720@panix.com>

On Tue, Sep 18, 2007, Brian Granger wrote:
>
> What to do about this pending crisis is a complicated issue.

Oh, please.  Calling this a crisis only causes people to ignore you.
Take a look at this URL that Guido posted to the baypiggies list:

http://marknelson.us/2007/07/30/multicore-panic/
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The best way to get information on Usenet is not to ask a question, but
to post the wrong information.


From jimjjewett at gmail.com  Wed Sep 19 21:58:50 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 19 Sep 2007 15:58:50 -0400
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <6ce0ac130709181958y6cc28a01yc22bb7bfa451c97e@mail.gmail.com>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<46F08974.4010101@doubleclix.net>
	<6ce0ac130709181958y6cc28a01yc22bb7bfa451c97e@mail.gmail.com>
Message-ID: <fb6fbf560709191258m4bec8f44v561ac3309fc73466@mail.gmail.com>

On 9/18/07, Brian Granger <ellisonbg.net at gmail.com> wrote:

> If there were infinitely many people willing to work on this stuff,
> then I agree, but I don't see even a dozen people hacking on the GIL.

In part because many people don't believe it would be productive.

For threading to be useful in terms of parallel processing, most
memory access has to be read-only.  That isn't true today, largely
because of reference counts.
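
To illustrate in pure Python why "reading" is really writing under the
covers:

import sys

x = object()
# getrefcount reports one extra reference: its own argument.
print(sys.getrefcount(x))

def read_only(obj):
    # Binding the parameter and the alias below each bump ob_refcnt
    # in the object's header, and drop it again on return, so even
    # read-only access mutates the object's memory at the C level.
    alias = obj
    return alias is obj

print(read_only(x))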

There are ways around that, by using indirection, or delayed counts,
or multiple refcount buckets per object, or even just switching to a
tracing GC.

So far, no one has been able to make these changes without seriously
mangling the C API and/or slowing things down a lot.  The current
refcount mechanism is so lightweight that it isn't clear this would
even be possible.  (With 4 or more cores dedicated to just python, it
might be worth it anyhow -- but it isn't yet.)  So if you want the GIL
removed, you need to provide an existence proof that (CPython) memory
management can be handled efficiently without it.

-jJ


From rhamph at gmail.com  Wed Sep 19 22:10:42 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Wed, 19 Sep 2007 14:10:42 -0600
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <fb6fbf560709191258m4bec8f44v561ac3309fc73466@mail.gmail.com>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<46F08974.4010101@doubleclix.net>
	<6ce0ac130709181958y6cc28a01yc22bb7bfa451c97e@mail.gmail.com>
	<fb6fbf560709191258m4bec8f44v561ac3309fc73466@mail.gmail.com>
Message-ID: <aac2c7cb0709191310w6b63fd0fn8a2b5e5848de723b@mail.gmail.com>

On 9/19/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 9/18/07, Brian Granger <ellisonbg.net at gmail.com> wrote:
>
> > If there were infinitely many people willing to work on this stuff,
> > then I agree, but I don't see even a dozen people hacking on the GIL.
>
> In part because many people don't believe it would be productive.
>
> For threading to be useful in terms of parallel processing, most
> memory access has to be read-only.  That isn't true today, largely
> because of reference counts.
>
> There are ways around that, by using indirection, or delayed counts,
> or multiple refcount buckets per object, or even just switching to a
> tracing GC.
>
> So far, no one has been able to make these changes without seriously
> mangling the C API and/or slowing things down a lot.  The current
> refcount mechanism is so lightweight that it isn't clear this would
> even be possible.  (With 4 or more cores dedicated to just python, it
> might be worth it anyhow -- but it isn't yet.)  So if you want the GIL
> removed, you need to provide an existence proof that (CPython) memory
> management can be handled efficiently without it.

Is 60-65% of normal CPython "a lot"?

(I really should clean things up and post a patch...)

-- 
Adam Olsen, aka Rhamphoryncus


From rhamph at gmail.com  Wed Sep 19 22:42:06 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Wed, 19 Sep 2007 14:42:06 -0600
Subject: [Python-ideas] Thread exceptions and interruption
In-Reply-To: <ca471dc20709190924l4834052ck2db9ec746e2af14c@mail.gmail.com>
References: <aac2c7cb0709182030n71fadd00lfb25c26491a668f@mail.gmail.com>
	<ca471dc20709190924l4834052ck2db9ec746e2af14c@mail.gmail.com>
Message-ID: <aac2c7cb0709191342o3f3928c5s93c4961acd195a54@mail.gmail.com>

On 9/19/07, Guido van Rossum <guido at python.org> wrote:
> Regarding the issue of exceptions in threads, I indeed see it as a
> non-issue. It's easy enough to develop a subclass of threading.Thread
> which catches any exceptions raised by run(), and stores the exception
> as an instance variable from which it can be retrieved after join()
> succeeds.

Perhaps a better question then: do you think that correctly handling
errors is a significant part of what makes threads hard today?

My focus has always been on making "simultaneous activities" easier to
manage.  Removing the GIL is just a free bonus from making independent
tasks really be independent.

-- 
Adam Olsen, aka Rhamphoryncus


From guido at python.org  Wed Sep 19 22:58:20 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 19 Sep 2007 13:58:20 -0700
Subject: [Python-ideas] Thread exceptions and interruption
In-Reply-To: <aac2c7cb0709191342o3f3928c5s93c4961acd195a54@mail.gmail.com>
References: <aac2c7cb0709182030n71fadd00lfb25c26491a668f@mail.gmail.com>
	<ca471dc20709190924l4834052ck2db9ec746e2af14c@mail.gmail.com>
	<aac2c7cb0709191342o3f3928c5s93c4961acd195a54@mail.gmail.com>
Message-ID: <ca471dc20709191358k6d78d1c6mf62beec99d776b6c@mail.gmail.com>

On 9/19/07, Adam Olsen <rhamph at gmail.com> wrote:
> On 9/19/07, Guido van Rossum <guido at python.org> wrote:
> > Regarding the issue of exceptions in threads, I indeed see it as a
> > non-issue. It's easy enough to develop a subclass of threading.Thread
> > which catches any exceptions raised by run(), and stores the exception
> > as an instance variable from which it can be retrieved after join()
> > succeeds.
>
> Perhaps a better question then: do you think that correctly handling
> errors is a significant part of what makes threads hard today?

If you're talking about unhandled exceptions, no, that's absolutely a
non-issue. The real issues are race conditions, deadlocks, livelocks
etc.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From rhamph at gmail.com  Wed Sep 19 23:38:38 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Wed, 19 Sep 2007 15:38:38 -0600
Subject: [Python-ideas] Thread exceptions and interruption
In-Reply-To: <ca471dc20709191358k6d78d1c6mf62beec99d776b6c@mail.gmail.com>
References: <aac2c7cb0709182030n71fadd00lfb25c26491a668f@mail.gmail.com>
	<ca471dc20709190924l4834052ck2db9ec746e2af14c@mail.gmail.com>
	<aac2c7cb0709191342o3f3928c5s93c4961acd195a54@mail.gmail.com>
	<ca471dc20709191358k6d78d1c6mf62beec99d776b6c@mail.gmail.com>
Message-ID: <aac2c7cb0709191438h1b1ebfdap9f42c0b93535e0e2@mail.gmail.com>

On 9/19/07, Guido van Rossum <guido at python.org> wrote:
> On 9/19/07, Adam Olsen <rhamph at gmail.com> wrote:
> > On 9/19/07, Guido van Rossum <guido at python.org> wrote:
> > > Regarding the issue of exceptions in threads, I indeed see it as a
> > > non-issue. It's easy enough to develop a subclass of threading.Thread
> > > which catches any exceptions raised by run(), and stores the exception
> > > as an instance variable from which it can be retrieved after join()
> > > succeeds.
> >
> > Perhaps a better question then: do you think that correctly handling
> > errors is a significant part of what makes threads hard today?
>
> If you're talking about unhandled exceptions, no, that's absolutely a
> non-issue. The real issues are race conditions, deadlocks, livelocks
> etc.

I guess the bottom line here is that, since none of the proposed
solutions magically eliminate race conditions, deadlocks, livelocks,
etc, we'll need to try them in the field for quite some time before
it's clear whether the ways they make things better have any significant
effect in reducing the core problems.

In other words, I (and the other pundits) should implement our ideas
in a forked python, and not propose merging back until we've got a
large user base with a proven track record.  Even if that's not as
much fun. ;)

-- 
Adam Olsen, aka Rhamphoryncus


From guido at python.org  Wed Sep 19 23:59:18 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 19 Sep 2007 14:59:18 -0700
Subject: [Python-ideas] Thread exceptions and interruption
In-Reply-To: <aac2c7cb0709191438h1b1ebfdap9f42c0b93535e0e2@mail.gmail.com>
References: <aac2c7cb0709182030n71fadd00lfb25c26491a668f@mail.gmail.com>
	<ca471dc20709190924l4834052ck2db9ec746e2af14c@mail.gmail.com>
	<aac2c7cb0709191342o3f3928c5s93c4961acd195a54@mail.gmail.com>
	<ca471dc20709191358k6d78d1c6mf62beec99d776b6c@mail.gmail.com>
	<aac2c7cb0709191438h1b1ebfdap9f42c0b93535e0e2@mail.gmail.com>
Message-ID: <ca471dc20709191459r538fc921t103856a7cf197c38@mail.gmail.com>

On 9/19/07, Adam Olsen <rhamph at gmail.com> wrote:
> > If you're talking about unhandled exceptions, no, that's absolutely a
> > non-issue. The real issues are race conditions, deadlocks, livelocks
> > etc.
>
> I guess the bottom line here is that, since none of the proposed
> solutions magically eliminate race conditions, deadlocks, livelocks,
> etc, we'll need to try them in the field for quite some time before
> it's clear if the ways they do make things better have any significant
> effects in reducing the core problems.
>
> In other words, I (and the other pundits) should implement our ideas
> in a forked python, and not propose merging back until we've got a
> large user base with a proven track record.  Even if that's not as
> much fun. ;)

Agreed. Though race conditions become less of a problem if you don't
have fine-grained memory sharing (where you always hope you can get
away without a lock -- just Google for "double-checked locking" :-).
And deadlocks can be fought quite effectively by a runtime layer that
detects them, plus strategies for forcing lock acquisition order.
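
For instance, a minimal sketch of forcing an acquisition order (here
by id(), which is arbitrary but globally consistent within a process):

import threading

def acquire_all(*locks):
    # Always take locks in one canonical order, so two threads can
    # never hold the same pair in opposite orders -- the classic
    # recipe for deadlock.
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered

def release_all(ordered):
    for lock in reversed(ordered):
        lock.release()

a = threading.Lock()
b = threading.Lock()
held = acquire_all(a, b)
try:
    pass  # critical section touching both shared resources
finally:
    release_all(held)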

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From jimjjewett at gmail.com  Thu Sep 20 00:12:48 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 19 Sep 2007 18:12:48 -0400
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <aac2c7cb0709191310w6b63fd0fn8a2b5e5848de723b@mail.gmail.com>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<46F08974.4010101@doubleclix.net>
	<6ce0ac130709181958y6cc28a01yc22bb7bfa451c97e@mail.gmail.com>
	<fb6fbf560709191258m4bec8f44v561ac3309fc73466@mail.gmail.com>
	<aac2c7cb0709191310w6b63fd0fn8a2b5e5848de723b@mail.gmail.com>
Message-ID: <fb6fbf560709191512g2c2e56e4g14848e1775f28151@mail.gmail.com>

On 9/19/07, Adam Olsen <rhamph at gmail.com> wrote:
> On 9/19/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > On 9/18/07, Brian Granger <ellisonbg.net at gmail.com> wrote:

> > So far, no one has been able to make these changes without seriously
> > mangling the C API and/or slowing things down a lot.  The current
> > refcount mechanism is so lightweight that it isn't clear this would
> > even be possible.

> Is 60-65% of normal CPython "a lot"?

Yes, but I think it is still better than the last serious attempt, so
it would be worth posting patches anyhow.

-jJ


From rhamph at gmail.com  Thu Sep 20 00:27:04 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Wed, 19 Sep 2007 16:27:04 -0600
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <fb6fbf560709191512g2c2e56e4g14848e1775f28151@mail.gmail.com>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<46F08974.4010101@doubleclix.net>
	<6ce0ac130709181958y6cc28a01yc22bb7bfa451c97e@mail.gmail.com>
	<fb6fbf560709191258m4bec8f44v561ac3309fc73466@mail.gmail.com>
	<aac2c7cb0709191310w6b63fd0fn8a2b5e5848de723b@mail.gmail.com>
	<fb6fbf560709191512g2c2e56e4g14848e1775f28151@mail.gmail.com>
Message-ID: <aac2c7cb0709191527u643de9f6wfa04c7201726b472@mail.gmail.com>

On 9/19/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 9/19/07, Adam Olsen <rhamph at gmail.com> wrote:
> > On 9/19/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > > On 9/18/07, Brian Granger <ellisonbg.net at gmail.com> wrote:
>
> > > So far, no one has been able to make these changes without seriously
> > > mangling the C API and/or slowing things down a lot.  The current
> > > refcount mechanism is so lightweight that it isn't clear this would
> > > even be possible.
>
> > Is 60-65% of normal CPython "a lot"?
>
> Yes, but I think it is still better than the last serious attempt, so
> it would be worth posting patches anyhow.

It's not even comparable to the last serious attempt.  Even mere
atomic refcounting has *negative* scalability at two threads when
running pystones.  My approach has 95-100% scalability.

-- 
Adam Olsen, aka Rhamphoryncus


From ellisonbg.net at gmail.com  Thu Sep 20 03:56:22 2007
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Wed, 19 Sep 2007 21:56:22 -0400
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <20070919193311.GA6720@panix.com>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<20070919193311.GA6720@panix.com>
Message-ID: <6ce0ac130709191856v68c4f711s453197e0d37ff949@mail.gmail.com>

> > What to do about this pending crisis is a complicated issue.
>
> Oh, please.  Calling this a crisis only causes people to ignore you.

I apologize if this statement is a little exaggerated.  But, I do
think this is a really critical problem that is going to affect
certain groups of Python users and developers in adverse ways.
Perhaps I have not made a very strong case that it is a true "crisis"
though.

> Take a look at this URL that Guido posted to the baypiggies list:
>
> http://marknelson.us/2007/07/30/multicore-panic/

I actually agree with many of the comments made by the author.  The
quotes from John Dvorak and Wired are over the top.  The author makes
good points about operating systems' abilities to handle threading on
modest numbers of cores.  It does work, even today, and will continue
to work (for a while) as the number of cores increases.  But the main
point of the author is that operating systems DO have threading
support that works reasonably well.  Python is another story.  The
author even makes comments that provide a strong critique of Python's
(and Ruby's) threading capabilities:

<quote>
Modern programs tend to be moderately multithreaded, with individual
threads dedicated to the GUI, to user I/O, to socket I/O, and often to
computation. Multicore CPUs take advantage of this quite well. And we
don't need any new technology to make sure multi-threaded programs are
well-behaved - these techniques are pretty well understood, and in use
in most software you use today. Modern languages like Java support
threads and various concurrency issues right out of the box. C++
requires non-standard libraries, but all modern C++ environments worth
their salt deal with multithreading in a fairly sane way.
</quote>

Modern programs tend to be moderately multithreaded... except for
Python.  Modern languages like Java and C++ support threads and
concurrency, even if those capabilities aren't built in at a low level
(C++).  I don't think the same thing can be said about Python.  The
GIL in CPython does in fact prevent threading from being a general
solution for CPU bound parallelism.
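
A quick way to see this on any multi-core machine (a rough sketch; the
threaded version is no faster than the sequential one under CPython,
because the GIL serializes bytecode execution):

import time
import threading

def count(n):
    # Pure-Python, CPU-bound loop.
    while n:
        n -= 1

N = 10 ** 7

start = time.time()
count(N)
count(N)
print("sequential:  %.2fs" % (time.time() - start))

start = time.time()
t1 = threading.Thread(target=count, args=(N,))
t2 = threading.Thread(target=count, args=(N,))
t1.start(); t2.start()
t1.join(); t2.join()
print("two threads: %.2fs" % (time.time() - start))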

The author is wrong in one important respect:

<quote>
In this future view, by 2010 we should have the first eight-core
systems. In 2014, we're up to 32 cores. By 2017, we've reached an
incredible 128 core CPU on a desktop machine.
</quote>

I don't know where the author got this information, but it is way off.
Here are some currently available examples:

* Appro now offers a workstation with up to 4 quad core Opterons:

http://www.appro.com/product/workstationxtreme_opteron.asp

That is a 16 core system.  It is 2007, not 2012.

* Tilera offers a 64 core CPU that runs SMP Linux.

* SiCortex offers a low power 648 processor Linux system that is
organized into 6 core SMPs, each of which runs Linux.

This week I am at a Department of Defense conference on multicore computing:

http://www.ll.mit.edu/HPEC/2007/index.html

There is broad agreement from all corners that multi/manycore CPUs
will require new ways of expressing parallelism.  But in all of the
discussions about new approaches, people do assume that threads are a
low-level building block that can and should be used for building the
higher-level stuff.

Intel's Threaded Building Blocks is a perfect example.  While it is
implemented using threads, it provides much higher level abstractions
for developers to use in building applications that will scale well on
multicore systems.  I would _love_ to have such constructs in Python,
but that is simply not possible.  And process based solutions don't
provide a solution for the many algorithms that require fine grained
partitioning and fast data movement.

For those of us who do use Python for high performance computing,
these issues are critical.  In fact, anytime I deal with fine grained
parallel algorithms, I use C/C++.  I do end up wrapping my low level
C/C++ threaded code into Python, but this doesn't always map onto the
problem well.

The other group of people for whom this is a big issue are general
Python users who don't think they need parallelism.  Eventually even
these people will become frustrated that their Python code runs at the
same speed on a 1 core system as on a 128 core system.  To me this is a
significant problem.

Brian



From stephen at xemacs.org  Thu Sep 20 22:43:32 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 21 Sep 2007 05:43:32 +0900
Subject: [Python-ideas] Exploration PEP : Concurrency for
	moderately	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <6ce0ac130709191856v68c4f711s453197e0d37ff949@mail.gmail.com>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<20070919193311.GA6720@panix.com>
	<6ce0ac130709191856v68c4f711s453197e0d37ff949@mail.gmail.com>
Message-ID: <87r6kt2hl7.fsf@uwakimon.sk.tsukuba.ac.jp>

Brian Granger writes:

 > I apologize if this statement is a little exaggerated.  But, I do
 > think this is a really critical problem that is going to affect
 > certain groups of Python users and developers in adverse ways.
 > Perhaps I have not made a very strong case that it is a true "crisis"
 > though.

No, you're missing the point.  I don't see anybody denying that you
understand your own needs.  *You* may face a (true) crisis.  *The
Python community* does not perceive your crisis as its own.

Personally, I don't see why it should.  And I think you'd be much more
successful at selling this with a two-pronged approach of evangelizing
just how utterly cool it would be to have a totally-threading GIL-less
Python on the one hand, and recruiting some gung-ho grad students with
Google SoC projects (or *gasp* some of your DoE grant or VC money) on
the other.

Note that nobody has said anything to discourage this as a research
project.  Nothing like it's impossible, stupid, or YAGNI.  But Guido,
and other senior developers, are saying they're not going to devote
their resources to it as things currently stand (and one of those
resources is the attention of the folks who review PEPs).


From ksankar at doubleclix.net  Fri Sep 21 01:34:54 2007
From: ksankar at doubleclix.net (Krishna Sankar)
Date: Thu, 20 Sep 2007 16:34:54 -0700
Subject: [Python-ideas] Exploration PEP : Concurrency
 for	moderately	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <87r6kt2hl7.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>	<20070919193311.GA6720@panix.com>	<6ce0ac130709191856v68c4f711s453197e0d37ff949@mail.gmail.com>
	<87r6kt2hl7.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <46F3039E.5000101@doubleclix.net>

Stephen,

> project.  Nothing like it's impossible, stupid, or YAGNI.  But Guido,
> and other senior developers, are saying they're not going to devote
> their resources to it as things currently stand (and one of those
> resources is the attention of the folks who review PEPs).

<KS>
    I am not sure that is true. I think if we have a well thought out 
PEP that addresses parallelism, it would be looked into by the folks. It 
is just that there are too many different ways of doing it and, of 
course, the GIL doesn't help either. There is also a school of thought 
that we should keep the GIL and, as a result, will find another, better 
way of leveraging multi-core architectures.
</KS>

CHeers
<k/>   
   



From stephen at xemacs.org  Fri Sep 21 04:00:57 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 21 Sep 2007 11:00:57 +0900
Subject: [Python-ideas] Exploration PEP : Concurrency
 for	moderately	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <46F3039E.5000101@doubleclix.net>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<20070919193311.GA6720@panix.com>
	<6ce0ac130709191856v68c4f711s453197e0d37ff949@mail.gmail.com>
	<87r6kt2hl7.fsf@uwakimon.sk.tsukuba.ac.jp>
	<46F3039E.5000101@doubleclix.net>
Message-ID: <87k5qk22w6.fsf@uwakimon.sk.tsukuba.ac.jp>

Krishna Sankar writes:

 > > project.  Nothing like it's impossible, stupid, or YAGNI.  But Guido,
 > > and other senior developers, are saying they're not going to devote
 > > their resources to it as things currently stand (and one of those
 > > resources is the attention of the folks who review PEPs).
 > 
 > <KS>
 >     I am not sure that is true. I think if we have a well thought out 
 > PEP that addresses parallelism,

... and provides a plausible proof-of-concept implementation [Guido
said that to you explicitly] ...

 > it would be looked into by the folks.

True.  But AIUI without the implementation, it won't become a PEP.
And at present you don't have a plausible implementation with the GIL.

Now, it looks like Adam has a plausible implementation of removing the
GIL.  If that holds up under at least some common use-cases, I think
you'll see enthusiasm from some of the top developers, and acceptance
from most.  But I really think that has to come first.



From ellisonbg.net at gmail.com  Fri Sep 21 04:29:03 2007
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Thu, 20 Sep 2007 22:29:03 -0400
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <87r6kt2hl7.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<20070919193311.GA6720@panix.com>
	<6ce0ac130709191856v68c4f711s453197e0d37ff949@mail.gmail.com>
	<87r6kt2hl7.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <6ce0ac130709201929n226ea925ye9f0afd98bafb0f@mail.gmail.com>

>  > I apologize if this statement is a little exaggerated.  But, I do
>  > think this is a really critical problem that is going to affect
>  > certain groups of Python users and developers in adverse ways.
>  > Perhaps I have not made a very strong case that it is a true "crisis"
>  > though.
>
> No, you're missing the point.  I don't see anybody denying that you
> understand your own needs.  *You* may face a (true) crisis.  *The
> Python community* does not perceive your crisis as its own.

I do agree that there is a diversity of needs in the greater Python
community.  But the current discussion is oriented towards a subset of
Python users who *do* care about parallelism.  For this subset of
people, I do feel the issues are important.  I don't expect people who
don't need high performance and thus parallelism to feel the same way
that I do.

> Personally, I don't see why it should.  And I think you'd be much more
> successful at selling this with a two-pronged approach of evangelizing
> just how utterly cool it would be to have a totally-threading GIL-less
> Python on the one hand,

Now I am really regretting using the word "crisis," as your statement
implies that I am so negative about all this that I have lost sight of
the positive side of this discussion.  I do think it would be fantastic to
have a more threading capable Python.

> and recruiting some gung-ho grad students with
> Google SoC projects (or *gasp* some of your DoE grant or VC money) on
> the other.

*gasp*, I wasn't aware that I had grad students, DOE grants or VC
money.  A lot of things must have changed while I have been out of
town this week :)  While the company at which I work does have DOE
grants, work on the GIL is _far_ outside their scope of work.

One of the difficulties with the actual work of removing the GIL is
that it is difficult, grungy work that not many people are interested
in funding.  Most of the funding sources that are throwing money at
parallel computing and multicore are focused on languages other than
Python.  But, I could imagine that someone like IBM that seems to have
an interest in both Python and multicore CPUs would be interested in
sponsoring such an effort.  You would think that Google would also have
an interest in this.  To me it seems that the situation with the GIL will
remain the same until 1) someone with lots of free time and desire
steps up to the plate or 2) someone ponies up the $ to pay someone to
work on it.  Currently, I am not in the first of these situations.

> Note that nobody has said anything to discourage this as a research
> project.  Nothing like it's impossible, stupid, or YAGNI.  But Guido,
> and other senior developers, are saying they're not going to devote
> their resources to it as things currently stand (and one of those
> resources the attention of the folks who review PEPs).

That has been made clear.

Brian


From guido at python.org  Fri Sep 21 05:09:56 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 20 Sep 2007 20:09:56 -0700
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <6ce0ac130709201929n226ea925ye9f0afd98bafb0f@mail.gmail.com>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<20070919193311.GA6720@panix.com>
	<6ce0ac130709191856v68c4f711s453197e0d37ff949@mail.gmail.com>
	<87r6kt2hl7.fsf@uwakimon.sk.tsukuba.ac.jp>
	<6ce0ac130709201929n226ea925ye9f0afd98bafb0f@mail.gmail.com>
Message-ID: <ca471dc20709202009g269294cdo62193985f445ae4e@mail.gmail.com>

Can you all stop the meta-discussion and start discussing ideas for
parallel APIs please?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From ksankar at doubleclix.net  Fri Sep 21 05:48:20 2007
From: ksankar at doubleclix.net (Krishna Sankar)
Date: Thu, 20 Sep 2007 20:48:20 -0700
Subject: [Python-ideas] Exploration PEP : Concurrency
 for	moderately	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <87k5qk22w6.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>	<20070919193311.GA6720@panix.com>	<6ce0ac130709191856v68c4f711s453197e0d37ff949@mail.gmail.com>	<87r6kt2hl7.fsf@uwakimon.sk.tsukuba.ac.jp>	<46F3039E.5000101@doubleclix.net>
	<87k5qk22w6.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <46F33F04.9030003@doubleclix.net>

Stephen,
    Now I see where you are coming from, and you are right. Plausible is 
the operative word. It is not that Guido and the senior developers will 
not look at it at all, but that they will not look at it in the absence 
of a plausible implementation. This was evident in the blog discussions 
with Bruce as well. I was encouraged by Guido's reply. He is doing the 
right thing.

    Anyway, the point is, now we have a good understanding of the 
dynamics. There is a small community who would like to see this happen, 
and it is up to us to show it can be done. A few key things need to happen:

    a)   A good set of benchmarks on multi-core systems, extending 
excellent work (for example 
http://blogs.warwick.ac.uk/dwatkins/entry/benchmarking_parallel_python_1_2/).
         I have an 8 core machine (2 x 4 cores) and plan to run this 
benchmark as well as create others, like the Red-Black Tree (Guido had 
suggested that; I was thinking of it even before).
    b.1) A plausible POC with the GIL - pp, the actor pattern/agent 
paradigm, et al.
    b.2) A plausible POC, as and when the GIL is removed.
    c)   Figure out what support is needed from the PVM et al (for b.1 
and/or b.2).
    d)   PEP and onwards ...

    Most probably I am stating the obvious. My take is that lots of the 
work is already there. We need to converge and do the rest to work 
towards a PEP. I know that Guido and the senior developers will involve 
themselves at the appropriate time.

Cheers
<k/>
Stephen J. Turnbull wrote:
> Krishna Sankar writes:
>
>  > > project.  Nothing like it's impossible, stupid, or YAGNI.  But Guido,
>  > > and other senior developers, are saying they're not going to devote
>  > > their resources to it as things currently stand (and one of those
>  > > resources the attention of the folks who review PEPs).
>  > 
>  > <KS>
>  >     I am not sure that is true. I think if we have a well thought out 
>  > PEP that addresses parallelism,
>
> ... and provides a plausible proof-of-concept implementation [Guido
> said that to you explicitly] ...
>
>  > it would be looked into by the folks.
>
> True.  But AIUI without the implementation, it won't become a PEP.
> And at present you don't have a plausible implementation with the GIL.
>
> Now, it looks like Adam has a plausible implementation of removing the
> GIL.  If that holds up under at least some common use-cases, I think
> you'll see enthusiasm from some of the top developers, and acceptance
> from most.  But I really think that has to come first.
>
>
>



From mattknox_ca at hotmail.com  Fri Sep 21 05:01:12 2007
From: mattknox_ca at hotmail.com (Matt Knox)
Date: Fri, 21 Sep 2007 03:01:12 +0000 (UTC)
Subject: [Python-ideas]
	Exploration PEP : Concurrency for moderately massive (4 to 32
	cores) multi-core architectures
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<46F08974.4010101@doubleclix.net>
	<6ce0ac130709181958y6cc28a01yc22bb7bfa451c97e@mail.gmail.com>
	<fb6fbf560709191258m4bec8f44v561ac3309fc73466@mail.gmail.com>
	<aac2c7cb0709191310w6b63fd0fn8a2b5e5848de723b@mail.gmail.com>
Message-ID: <loom.20070921T045104-372@post.gmane.org>

> Is 60-65% of normal CPython "a lot"?
> 
> (I really should clean things up and post a patch...)
> 

Is your implementation based on Python 3000/3.0 or Python 2.x? Regardless, 
I'd love to see it. How would you describe the changes to the C API? Radical 
departure? Somewhat compatible? Mostly compatible? No changes?

- Matt



From rhamph at gmail.com  Fri Sep 21 07:33:02 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 20 Sep 2007 23:33:02 -0600
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <loom.20070921T045104-372@post.gmane.org>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<46F08974.4010101@doubleclix.net>
	<6ce0ac130709181958y6cc28a01yc22bb7bfa451c97e@mail.gmail.com>
	<fb6fbf560709191258m4bec8f44v561ac3309fc73466@mail.gmail.com>
	<aac2c7cb0709191310w6b63fd0fn8a2b5e5848de723b@mail.gmail.com>
	<loom.20070921T045104-372@post.gmane.org>
Message-ID: <aac2c7cb0709202233l608921a0g16573d2612f8d353@mail.gmail.com>

On 9/20/07, Matt Knox <mattknox_ca at hotmail.com> wrote:
> > Is 60-65% of normal CPython "a lot"?
> >
> > (I really should clean things up and post a patch...)
> >
>
> Is your implementation based on Python 3000/3.0 or Python 2.x? Regardless,
> I'd love to see it. How would you describe the changes to the C API? Radical
> departure? Somewhat compatible? Mostly compatible? No changes?

It's based off py3k.  I will post eventually (heh).  The C API is
mostly compatible (source-wise only), notably excepting the various
threading things.

-- 
Adam Olsen, aka Rhamphoryncus


From tjreedy at udel.edu  Fri Sep 21 23:39:54 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 21 Sep 2007 17:39:54 -0400
Subject: [Python-ideas] Exploration PEP : Concurrency for
	moderatelymassive (4 to 32 cores) multi-core architectures
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com><20070919193311.GA6720@panix.com><6ce0ac130709191856v68c4f711s453197e0d37ff949@mail.gmail.com><87r6kt2hl7.fsf@uwakimon.sk.tsukuba.ac.jp>
	<6ce0ac130709201929n226ea925ye9f0afd98bafb0f@mail.gmail.com>
Message-ID: <fd1dn2$tb0$1@sea.gmane.org>


"Brian Granger" <ellisonbg.net at gmail.com>

Brief responses to this and previous posts and threads:

You withdraw 'crisis'.  Fine. Let us move on.

Hobbling Python on the billion (1000 million) or so current and future 
single-core CPUs is not acceptable.  Let us move on.

Multiple cores can be used with separate processes.  I expect this is where 
Google is, but that is another discussion, except "Do not hold your breath" 
waiting for Google to publicly support multicore threading.

Improving multi-thread use of multiple cores -- without hobbling single 
core use -- would be good.  Obviously.  No need to argue the point.  Let us 
move on.

What is needed is persuasion of practicality by the means Guido has 
requested: concrete proposals and code.

Terry Jan Reedy





From greg.ewing at canterbury.ac.nz  Sat Sep 22 02:48:50 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 22 Sep 2007 12:48:50 +1200
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
 massive (4 to 32 cores) multi-core architectures
In-Reply-To: <6ce0ac130709201929n226ea925ye9f0afd98bafb0f@mail.gmail.com>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<20070919193311.GA6720@panix.com>
	<6ce0ac130709191856v68c4f711s453197e0d37ff949@mail.gmail.com>
	<87r6kt2hl7.fsf@uwakimon.sk.tsukuba.ac.jp>
	<6ce0ac130709201929n226ea925ye9f0afd98bafb0f@mail.gmail.com>
Message-ID: <46F46672.70105@canterbury.ac.nz>

Brian Granger wrote:
> One of the difficulties with the actual work of removing the GIL is
> that it is difficult, grungy work that not many people are interested
> in funding.

It's not just a matter of hard work -- so far, nobody
knows *how* to remove the GIL without a big loss in
efficiency. Somebody is going to have to have a flash
of insight, and money can't buy that.

--
Greg


From stephen at xemacs.org  Sat Sep 22 09:49:54 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 22 Sep 2007 16:49:54 +0900
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
 massive (4 to 32 cores) multi-core architectures
In-Reply-To: <46F46672.70105@canterbury.ac.nz>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>
	<20070919193311.GA6720@panix.com>
	<6ce0ac130709191856v68c4f711s453197e0d37ff949@mail.gmail.com>
	<87r6kt2hl7.fsf@uwakimon.sk.tsukuba.ac.jp>
	<6ce0ac130709201929n226ea925ye9f0afd98bafb0f@mail.gmail.com>
	<46F46672.70105@canterbury.ac.nz>
Message-ID: <876423yw9p.fsf@uwakimon.sk.tsukuba.ac.jp>

Greg Ewing writes:

 > Brian Granger wrote:
 > > One of the difficulties with the actual work of removing the GIL is
 > > that it is difficult, grungy work that not many people are interested
 > > in funding.
 > 
 > It's not just a matter of hard work -- so far, nobody
 > knows *how* to remove the GIL without a big loss in
 > efficiency. Somebody is going to have to have a flash
 > of insight, and money can't buy that.

They said that about the four-color theorem, too.<wink>


From ksankar at doubleclix.net  Wed Sep 26 06:10:59 2007
From: ksankar at doubleclix.net (Krishna Sankar)
Date: Tue, 25 Sep 2007 21:10:59 -0700
Subject: [Python-ideas] Python and Transactional memory
In-Reply-To: <876423yw9p.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <6ce0ac130709181831q415a1e0em87a680b68bd5cd9b@mail.gmail.com>	<20070919193311.GA6720@panix.com>	<6ce0ac130709191856v68c4f711s453197e0d37ff949@mail.gmail.com>	<87r6kt2hl7.fsf@uwakimon.sk.tsukuba.ac.jp>	<6ce0ac130709201929n226ea925ye9f0afd98bafb0f@mail.gmail.com>	<46F46672.70105@canterbury.ac.nz>
	<876423yw9p.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <46F9DBD3.6020009@doubleclix.net>

I would like pointers to any ongoing work on transactional memory 
constructs (like atomic blocks et al.) in Python. I also have a few 
questions for anyone who is working on this topic.
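
(For readers who haven't met these constructs: the usual proposal is an 
atomic block along the lines below. This sketch merely fakes the 
interface with one global lock -- a real transactional-memory 
implementation would track reads and writes and retry conflicting 
blocks -- and the atomic() name is illustrative, not an existing API.)

    from __future__ import with_statement   # needed on Python 2.5
    import threading
    from contextlib import contextmanager

    _lock = threading.Lock()

    @contextmanager
    def atomic():
        # Stand-in only: serialize the block with a global lock.
        _lock.acquire()
        try:
            yield
        finally:
            _lock.release()

    counter = 0

    def increment():
        global counter
        with atomic():   # all updates inside the block appear atomic
            counter += 1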

Cheers
<k/>


From grosser.meister.morti at gmx.net  Wed Sep 26 23:08:29 2007
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Wed, 26 Sep 2007 23:08:29 +0200
Subject: [Python-ideas] is in operator
Message-ID: <46FACA4D.8060800@gmx.net>

Sometimes I want to compare a "pointer" to more than one other. The "in" operator
would be handy, but it uses the "==" operator instead of the "is" operator. So an "is
in" operator would be nice. Though I don't know how easy it is for a newbie to see
what does what.

# This:
if x is in (a, b, c):
	...

# would be equivalent to this:
if x is a or x is b or x is c:
	...

# And of course there should be a "is not in" operator, too:
if x is not in (a, b, c):
	...

# this would be equivalent to this:
if x is not a and x is not b and x is not c:
	...


Hmmm, maybe a way to apply some kind of comparison between a value and several other
values would be better. But that already exists, so screw this msg:

if any(x is y for y in (a, b, c)):
	...

if all(x is not y for y in (a, b, c)):
	...


From terry at jon.es  Wed Sep 26 23:42:50 2007
From: terry at jon.es (Terry Jones)
Date: Wed, 26 Sep 2007 23:42:50 +0200
Subject: [Python-ideas] Calling a function of a list without accumulating
	results
Message-ID: <18170.53850.491682.600306@terry.local>

What's the most compact way to repeatedly call a function on a list without
accumulating the results?

While I can accumulate results via

    a = [f(x) for x in mylist]

or with a generator, there doesn't seem to be a way to do this without
accumulating the results. I guess I need to either use the above and ignore
the result, or use

    for x in mylist:
        f(x)

I run into this need quite frequently. If I write

    [f(x) for x in mylist]

with no assignment, will Python notice that I don't want the accumulated
results and silently toss them for me?

A possible syntax change would be to allow the unadorned

    f(x) for x in mylist

And raise an error if someone tries to assign to this.

Terry


From rhamph at gmail.com  Wed Sep 26 23:57:37 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Wed, 26 Sep 2007 15:57:37 -0600
Subject: [Python-ideas] Calling a function of a list without
	accumulating results
In-Reply-To: <18170.53850.491682.600306@terry.local>
References: <18170.53850.491682.600306@terry.local>
Message-ID: <aac2c7cb0709261457k5bf5d400y98655ea198a80144@mail.gmail.com>

On 9/26/07, Terry Jones <terry at jon.es> wrote:
> What's the most compact way to repeatedly call a function on a list without
> accumulating the results?
>
> While I can accumulate results via
>
>     a = [f(x) for x in mylist]
>
> or with a generator, there doesn't seem to be a way to do this without
> accumulating the results. I guess I need to either use the above and ignore
> the result, or use
>
>     for x in mylist:
>         f(x)

Just use this.  Simple, readable.  No need to get fancy.


-- 
Adam Olsen, aka Rhamphoryncus


From brett at python.org  Thu Sep 27 01:02:03 2007
From: brett at python.org (Brett Cannon)
Date: Wed, 26 Sep 2007 16:02:03 -0700
Subject: [Python-ideas] Calling a function of a list without
	accumulating results
In-Reply-To: <18170.53850.491682.600306@terry.local>
References: <18170.53850.491682.600306@terry.local>
Message-ID: <bbaeab100709261602l1ecfe02bh31d5f3ae13fbe60@mail.gmail.com>

On 9/26/07, Terry Jones <terry at jon.es> wrote:
> What's the most compact way to repeatedly call a function on a list without
> accumulating the results?
>
> While I can accumulate results via
>
>     a = [f(x) for x in mylist]
>
> or with a generator, there doesn't seem to be a way to do this without
> accumulating the results. I guess I need to either use the above and ignore
> the result, or use
>
>     for x in mylist:
>         f(x)
>
> I run into this need quite frequently. If I write
>
>     [f(x) for x in mylist]
>
> with no assignment, will Python notice that I don't want the accumulated
> results and silently toss them for me?
>

Only after the list is completely constructed.  List comprehensions
are literally 'for' loops with an append call to a method, so without
extending the peepholer to notice this case and strip out the list
creation and appending, it is not optimized.
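
(Roughly, the comprehension is executed as if you had written the
following; the list gets built whether or not you use it:)

    result = []
    for x in mylist:
        result.append(f(x))   # every value is accumulated regardless
    # 'result' is then simply discarded if the expression is unused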

> A possible syntax change would be to allow the unadorned
>
>     f(x) for x in mylist
>
> And raise an error if someone tries to assign to this.

Go with the 'for' loop as Adam suggested.  I just don't see this as
needing syntax support.

-Brett


From terry at jon.es  Thu Sep 27 01:20:06 2007
From: terry at jon.es (Terry Jones)
Date: Thu, 27 Sep 2007 01:20:06 +0200
Subject: [Python-ideas] Calling a function of a list without
	accumulating results
In-Reply-To: Your message at 16:02:03 on Wednesday, 26 September 2007
References: <18170.53850.491682.600306@terry.local>
	<bbaeab100709261602l1ecfe02bh31d5f3ae13fbe60@mail.gmail.com>
Message-ID: <18170.59686.345068.336674@terry.local>

Hi Brett & Adam

Thanks for the replies.

| Only after the list is completely constructed.  List comprehensions
| are literally 'for' loops with an append call to a method so without
| extending the peepholer to notice this case and strip out the list
| creation and appending it is not optimized.
| 
| > A possible syntax change would be to allow the unadorned
| >
| >     f(x) for x in mylist
| >
| > And raise an error if someone tries to assign to this.
| 
| Go with the 'for' loop as Adam suggested.  I just don't see this as
| needing syntax support.

I think there are two arguments in its favor.

The first is the same as one of the arguments for providing list
comprehensions and generator expressions - because it makes common
multi-line boilerplate much more concise.

There's a certain syntax that's allowed in [] and () to make list
comprehensions and generator expressions. I'm suggesting allowing exactly
the same thing, but with no explicit grouping wrapped around it.

The trivial case I posted isn't much of a win over the simple 2-line
alternative, but it's easy to go further:

    f(x, y) for x in myXlist for y in myYlist

instead of

    for x in myXlist:
        for y in myYlist:
            f(x, y)

and of course there are many more examples.

The second argument is one of consistency.  If list comprehensions are
regarded as more pythonic and the Right Way to code in Python, I'd make the
same argument for when you don't happen to want to keep the accumulated
results.  Why force programmers to use two coding styles in order to get
essentially the same thing done?

I think these are decent arguments. It's simply the full succinctness and
convenience of list comprehensions, without needing to accumulate results.

Thanks again for the replies. Changing the peepholer to notice when there's
no assignment to a list expression would also be nice. I'd look at it if I
had time..... :-)

Terry


From george.sakkis at gmail.com  Thu Sep 27 02:07:58 2007
From: george.sakkis at gmail.com (George Sakkis)
Date: Wed, 26 Sep 2007 20:07:58 -0400
Subject: [Python-ideas] is in operator
In-Reply-To: <46FACA4D.8060800@gmx.net>
References: <46FACA4D.8060800@gmx.net>
Message-ID: <91ad5bf80709261707u129a118w3768061b0338baba@mail.gmail.com>

On 9/26/07, Mathias Panzenböck <grosser.meister.morti at gmx.net> wrote:
>
> Sometimes I want to compare a "pointer" to more than one other. The "in"
> operator
> would be handy, but it uses the "==" operator instead of the "is"
> operator. So an "is
> in" operator would be nice. Though I don't know how easy it is for a
> newbie to see
> what does what.
>
> # This:
> if x is in (a, b, c):
>         ...
>
> # would be equivalent to this:
> if x is a or x is b or x is c:
>         ...
>
> # And of course there should be a "is not in" operator, too:
> if x is not in (a, b, c):
>         ...
>
> # this would be equivalent to this:
> if x is not a and x is not b and x is not c:
>         ...
>
>
> Hmmm, maybe a way to apply some kind of comparison between a value and
> more other
> values would be better. But that already exists, so screw this msg:
>
> if any(x is y for y in (a, b, c)):
>         ...
>
> if all(x is not y for y in (a, b, c)):



Or in a more obfuscated way:

import operator as op
from itertools import imap
from functools import partial

if any(imap(partial(op.is_,x), (a, b, c))):
        ...

if all(imap(partial(op.is_not,x), (a, b, c))):
     ...


George

From brett at python.org  Thu Sep 27 02:43:49 2007
From: brett at python.org (Brett Cannon)
Date: Wed, 26 Sep 2007 17:43:49 -0700
Subject: [Python-ideas] Calling a function of a list without
	accumulating results
In-Reply-To: <18170.59686.345068.336674@terry.local>
References: <18170.53850.491682.600306@terry.local>
	<bbaeab100709261602l1ecfe02bh31d5f3ae13fbe60@mail.gmail.com>
	<18170.59686.345068.336674@terry.local>
Message-ID: <bbaeab100709261743y4175c5cekbec05f1af6597423@mail.gmail.com>

On 9/26/07, Terry Jones <terry at jon.es> wrote:
> Hi Brett & Adam
>
> Thanks for the replies.
>
> | Only after the list is completely constructed.  List comprehensions
> | are literally 'for' loops with an append call to a method so without
> | extending the peepholer to notice this case and strip out the list
> | creation and appending it is not optimized.
> |
> | > A possible syntax change would be to allow the unadorned
> | >
> | >     f(x) for x in mylist
> | >
> | > And raise an error if someone tries to assign to this.
> |
> | Go with the 'for' loop as Adam suggested.  I just don't see this as
> | needing syntax support.
>
> I think there are two arguments in its favor.
>
> The first is the same as one of the arguments for providing list
> comprehensions and generator expressions - because it makes common
> multi-line boilerplate much more concise.
>

OK, the question is how common is this.  I personally don't come
across this idiom often enough to feel the need to avoid creating a
listcomp, genexp, or a 'for' loop.

> There's a certain syntax that's allowed in [] and () to make list
> comprehensions and generator expressions. I'm suggesting allowing exactly
> the same thing, but with no explicit grouping wrapped around it.
>

But are you sure Python's grammar could support it?  Parentheses are
needed for genexps in certain situations for disambiguation because of
Python's LL(1) grammar.

> The trivial case I posted isn't much of a win over the simple 2-line
> | > alternative, but it's easy to go further:
>
>     f(x, y) for x in myXlist for y in myYlist
>
> instead of
>
>     for x in myXlist:
>         for y in myYlist:
>             f(x, y)
>
> and of course there are many more examples.
>

Right, but the second one is so much easier to read and comprehend.

> The second argument is one of consistency.  If list comprehensions are
> regarded as more pythonic and the Right Way to code in Python, I'd make the
> same argument for when you don't happen to want to keep the accumulated
> results.  Why force programmers to use two coding styles in order to get
> essentially the same thing done?
>

I think "force" is rather strong wording for "choice".  The point is
the 'for' loop is the standard solution and there just happens to be a
shorthand for a common case.  You shouldn't view listcomps as being on
the same ground as a 'for' loop.

> I think these are decent arguments. It's simply the full succinctness and
> convenience of list comprehensions, without needing to accumulate results.
>

They are decent, but not enough to warrant adding special support in
my opinion.  Heck, I would vote to ditch listcomps for
``list(genexp)`` had genexps come first and have the options trimmed
down even more.

And if you are doing this to just toss out stuff you can do something like::

  for _ in (f(x) for x in anylist): pass

No accumulating list, one line, and you still get your genexp syntax.

Basically, unless you can go through the stdlib and find all the
instances of the pattern you want, to prove it is common enough to
warrant support, none of the core developers will probably go for
this.

-Brett


From rhamph at gmail.com  Thu Sep 27 03:25:56 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Wed, 26 Sep 2007 19:25:56 -0600
Subject: [Python-ideas] is in operator
In-Reply-To: <46FACA4D.8060800@gmx.net>
References: <46FACA4D.8060800@gmx.net>
Message-ID: <aac2c7cb0709261825o3077619bu5ef7df5ea88a68d5@mail.gmail.com>

On 9/26/07, Mathias Panzenböck <grosser.meister.morti at gmx.net> wrote:
> Sometimes I want to compare a "pointer" to more than one other. The "in" operator
> would be handy, but it uses the "==" operator instead of the "is" operator. So an "is
> in" operator would be nice. Though I don't know how easy it is for a newbie to see
> what does what.

There are many different ways you might want to do a comparison.  That's
why sorted() has a cmp=func argument.  A new API won't work though, as
dicts or sets need to know the hash in advance, and lists are O(n)
anyway (so there's little appropriate use).

To solve your problem you should be using a decorate/undecorate
pattern, possibly encapsulated into a custom container type.  There
doesn't appear to be any in the python cookbook (so it may be a very
rare need), but assuming you did use a container type your code might
be rewritten as such:

if x in idset([a, b, c]):

But decorating is almost as simple:

if id(x) in [id(a), id(b), id(c)]:

(Caveat: id(obj) assumes you have another reference to the obj, to
prevent the identity from being reused.)
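
(A minimal sketch of such an idset container, assuming all it needs to
support is membership testing, might look like this:)

    class idset(object):
        # Set-like container whose membership test uses object identity.
        def __init__(self, items):
            # Keep the objects alive so their id()s cannot be reused.
            self._items = list(items)
            self._ids = set(id(obj) for obj in self._items)

        def __contains__(self, obj):
            return id(obj) in self._ids

(With that, "x in idset([a, b, c])" gives the same answer as the chained
identity tests in the quoted message below.)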


> # This:
> if x is in (a, b, c):
>         ...
>
> # would be equivalent to this:
> if x is a or x is b or x is c:
>         ...
>
> # And of course there should be a "is not in" operator, too:
> if x is not in (a, b, c):
>         ...
>
> # this would be equivalent to this:
> if x is not a and x is not b and x is not c:
>         ...
>
>
> Hmmm, maybe a way to apply some kind of comparison between a value and more other
> values would be better. But that already exists, so screw this msg:
>
> if any(x is y for y in (a, b, c)):
>         ...
>
> if all(x is not y for y in (a, b, c)):
>         ...
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>


-- 
Adam Olsen, aka Rhamphoryncus


From terry at jon.es  Thu Sep 27 04:01:54 2007
From: terry at jon.es (Terry Jones)
Date: Thu, 27 Sep 2007 04:01:54 +0200
Subject: [Python-ideas] Calling a function of a list without
	accumulating results
In-Reply-To: Your message at 17:43:49 on Wednesday, 26 September 2007
References: <18170.53850.491682.600306@terry.local>
	<bbaeab100709261602l1ecfe02bh31d5f3ae13fbe60@mail.gmail.com>
	<18170.59686.345068.336674@terry.local>
	<bbaeab100709261743y4175c5cekbec05f1af6597423@mail.gmail.com>
Message-ID: <18171.3858.931686.291011@terry.local>

Hi Brett

| > The first is the same as one of the arguments for providing list
| > comprehensions and generator expressions - because it makes common
| > multi-line boilerplate much more concise.
| 
| OK, the question is how common is this.

I don't know. I use it maybe once every few weeks. There's a function to do
this in Common Lisp (mapc), not that that means anything much.

| But are you sure Python's grammar could support it?  Parentheses are
| needed for genexps in certain situations for disambiguation because of
| Python's LL(1) grammar.

I don't know the answer to this either. I imagine it's a matter of tacking
an optional "for ..." clause onto the end of an expression. The "for ..."
part can certainly be handled (is being handled already), so I think it
might not be too hard - supposing there is already a non-terminal for the
"for ..." clause.

| > The trivial case I posted isn't much of a win over the simple 2-line
| > alternative, but it's easy to go further:
| >
| >     f(x, y) for x in myXlist for y in myYlist
| >
| > instead of
| >
| >     for x in myXlist:
| >         for y in myYlist:
| >             f(x, y)
| >
| > and of course there are many more examples.
| 
| Right, but the second one is so much easier to read and comprehend.

I tend to agree, but the language supports the more concise form for both
list comprehension and genexps, so there must be a fair number of people
who thought it was a win to allow the compact form.

| I think "force" is rather strong wording for "choice".

OK, how about "lack of choice"?  :-)

Seriously (to take an example from the Python pocket ref page 24), you do
have the choice to write

    a = [x for x in range(5) if x % 2 == 0]

instead of

    a = []
    for x in range(5):
        if x % 2 == 0:
            a.append(x)

but you don't have a (simple) choice if you don't want to accumulate
results.  I'm merely saying that I think it would be cleaner and more
consistent to allow

    print(x) for x in range(5) if x % 2 == 0

instead of having the non-choice but to write something like

    for x in range(5):
        if x % 2 == 0:
            print x

Yes, I (and thank you for it) could now use your suggested for _ in ...:
pass trick, but that's not really the whole point, to me. If the language
can be made simpler and more consistent, I think that's generally a good
thing.

I know, I don't know anything about the implementation. But this is an
ideas list.

| Heck, I would vote to ditch listcomps for ``list(genexp)`` had genexps
| come first and have the options trimmed down even more.

Me too. But even if that eventuated, I'd _still_ propose allowing the
unadorned genexp, for the case where you don't want the results.

| Basically, unless you can go through the stdlib and find all the
| instances of the pattern you want to prove it is common enough to
| warrant support it none of the core developers will probably go for
| this.

Understood.

Thanks,
Terry


From greg.ewing at canterbury.ac.nz  Thu Sep 27 04:09:16 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 27 Sep 2007 14:09:16 +1200
Subject: [Python-ideas] Calling a function of a list without
 accumulating results
In-Reply-To: <18170.53850.491682.600306@terry.local>
References: <18170.53850.491682.600306@terry.local>
Message-ID: <46FB10CC.6010205@canterbury.ac.nz>

Terry Jones wrote:
> What's the most compact way to repeatedly call a function on a list without
> accumulating the results?

>     for x in mylist:
>         f(x)

That's it.

> If I write
> 
>     [f(x) for x in mylist]
> 
> with no assignment, will Python notice that I don't want the accumulated
> results and silently toss them for me?

No. And there would be no advantage in using LC syntax
even if it did. The generated bytecode would be essentially
identical.
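
(One way to check that claim, using the dis module:)

    import dis

    def with_listcomp(mylist, f):
        [f(x) for x in mylist]

    def with_loop(mylist, f):
        for x in mylist:
            f(x)

    dis.dis(with_listcomp)   # the same loop, plus list setup and appends
    dis.dis(with_loop)       # just the loop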

If you really must have it all on one line, you can write

   for x in mylist: f(x)

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+


From greg.ewing at canterbury.ac.nz  Thu Sep 27 04:46:36 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 27 Sep 2007 14:46:36 +1200
Subject: [Python-ideas] Calling a function of a list
 without	accumulating results
In-Reply-To: <18170.59686.345068.336674@terry.local>
References: <18170.53850.491682.600306@terry.local>
	<bbaeab100709261602l1ecfe02bh31d5f3ae13fbe60@mail.gmail.com>
	<18170.59686.345068.336674@terry.local>
Message-ID: <46FB198C.4070705@canterbury.ac.nz>

Terry Jones wrote:
> If list comprehensions are
> regarded as more pythonic and the Right Way to code in Python, I'd make the
> same argument for when you don't happen to want to keep the accumulated
> results.  Why force programmers to use two coding styles in order to get
> essentially the same thing done?


There isn't anything "more Pythonic" about the LC
syntax in itself. It's just a more compact alternative
for when you're constructing a list. It's not *un*-
Pythonic to *not* use it, even when you do want a
list. Nobody would fault you for not using one
when you could have.

The way things are, there is only one coding style
for when you don't want the results. You're suggesting
the addition of another one. That *would* be un-Pythonic.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+


From greg.ewing at canterbury.ac.nz  Thu Sep 27 05:02:00 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 27 Sep 2007 15:02:00 +1200
Subject: [Python-ideas] Calling a function of a list
 without	accumulating results
In-Reply-To: <18171.3858.931686.291011@terry.local>
References: <18170.53850.491682.600306@terry.local>
	<bbaeab100709261602l1ecfe02bh31d5f3ae13fbe60@mail.gmail.com>
	<18170.59686.345068.336674@terry.local>
	<bbaeab100709261743y4175c5cekbec05f1af6597423@mail.gmail.com>
	<18171.3858.931686.291011@terry.local>
Message-ID: <46FB1D28.5090002@canterbury.ac.nz>

Terry Jones wrote:
> the language supports the more concise form for both
> list comprehension and genexps, so there must be a fair number of people
> who thought it was a win to allow the compact form.

The compact form *is* considerably more compact when
you're constructing a list, as it saves the initialisation
and an append call. It also permits an optimisation by
extracting a bound method for the append.
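
(In hand-written form, that optimisation is the familiar idiom:)

    result = []
    append = result.append   # look up the bound method just once
    for x in mylist:
        append(f(x))         # avoids re-fetching result.append each pass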

When not constructing a list, however, there's no
significant difference in source length or runtime
efficiency. So the "more compact" form wouldn't be
any more compact, just different. It would be a
spurious Other Way To Do It.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+


From brett at python.org  Thu Sep 27 05:42:06 2007
From: brett at python.org (Brett Cannon)
Date: Wed, 26 Sep 2007 20:42:06 -0700
Subject: [Python-ideas] Calling a function of a list without
	accumulating results
In-Reply-To: <18171.3858.931686.291011@terry.local>
References: <18170.53850.491682.600306@terry.local>
	<bbaeab100709261602l1ecfe02bh31d5f3ae13fbe60@mail.gmail.com>
	<18170.59686.345068.336674@terry.local>
	<bbaeab100709261743y4175c5cekbec05f1af6597423@mail.gmail.com>
	<18171.3858.931686.291011@terry.local>
Message-ID: <bbaeab100709262042x5293194ftf23f2697b4ac6294@mail.gmail.com>

On 9/26/07, Terry Jones <terry at jon.es> wrote:
> Hi Brett
>
> | > The first is the same as one of the arguments for providing list
> | > comprehensions and generator expressions - because it makes common
> | > multi-line boilerplate much more concise.
> |
> | OK, the question is how common is this.
>
> I don't know. I use it maybe once every few weeks. There's a function to do
> this in Common Lisp (mapc), not that that means anything much.
>

You could try to get something into the stdlib that consumes a
generator and just tosses out the results.  That has a much lower
barrier of entry than new syntax.

> | But are you sure Python's grammar could support it?  Parentheses are
> | needed for genexps in certain situations for disambiguation because of
> | Python's LL(1) grammar.
>
> I don't know the answer to this either. I imagine it's a matter of tacking
> an optional "for ..." clause onto the end of an expression. The "for ..."
> part can certainly be handled (is being handled already), so I think it
> might not be too hard - supposing there is already a non-terminal for the
> "for ..." clause.
>
> | > The trivial case I posted isn't much of a win over the simple 2-line
> | > alternative, but it's easy to go further:
> | >
> | >     f(x, y) for x in myXlist for y in myYlist
> | >
> | > instead of
> | >
> | >     for x in myXlist:
> | >         for y in myYlist:
> | >             f(x, y)
> | >
> | > and of course there are many more examples.
> |
> | Right, but the second one is so much easier to read and comprehend.
>
> I tend to agree, but the language supports the more concise form for both
> list comprehension and genexps, so there must be a fair number of people
> who thought it was a win to allow the compact form.
>

Yes, but it was specifically for the list-building idiom.

> | I think "force" is rather strong wording for "choice".
>
> OK, how about "lack of choice"?  :-)
>
> Seriously (to take an example from the Python pocket ref page 24), you do
> have the choice to write
>
>     a = [x for x in range(5) if x % 2 == 0]
>
> instead of
>
>     a = []
>     for x in range(5):
>         if x % 2 == 0:
>             a.append(x)
>
> but you don't have a (simple) choice if you don't want to accumulate
> results.  I'm merely saying that I think it would be cleaner and more
> consistent to allow
>
>     print(x) for x in range(5) if x % 2 == 0
>
> instead of having the non-choice but to write something like
>
>     for x in range(5):
>         if x % 2 == 0:
>             print x
>
> Yes, I (and thank you for it) could now use your suggested for _ in ...:
> pass trick, but that's not really the whole point, to me. If the language
> can be made simpler and more consistent, I think that's generally a good
> thing.
>

But it could be said it is not simpler, as there is now a new type of
statement for what previously was always an expression (this would have
to be a statement, as the whole point of this idea is that there is no
return value).

> I know, I don't know anything about the implementation. But this is an
> ideas list.
>

Right, which is why this conversation has gone on without someone
flat-out saying it wasn't going to happen, like on python-dev.  =)  But
at some point the idea either needs to seem reasonable enough to try
to move forward, or to just be let go.  And at this point I have said
what it will take to move it to the next level, if you care enough; I
doubt any of the core developers will go for this enough to carry it
on their own.

> | Heck, I would vote to ditch listcomps for ``list(genexp)`` had genexps
> | come first and have the options trimmed down even more.
>
> Me too. But even if that eventuated, I'd _still_ propose allowing the
> unadorned genexp, for the case where you don't want the results.
>
> | Basically, unless you can go through the stdlib and find all the
> | instances of the pattern you want to prove it is common enough to
> | warrant support it none of the core developers will probably go for
> | this.
>
> Understood.
>
> Thanks,

Welcome.  Thanks for bringing the idea up!

-Brett


From rrr at ronadam.com  Thu Sep 27 06:54:21 2007
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 26 Sep 2007 23:54:21 -0500
Subject: [Python-ideas] Calling a function of a list
 without	accumulating results
In-Reply-To: <18171.3858.931686.291011@terry.local>
References: <18170.53850.491682.600306@terry.local>	<bbaeab100709261602l1ecfe02bh31d5f3ae13fbe60@mail.gmail.com>	<18170.59686.345068.336674@terry.local>	<bbaeab100709261743y4175c5cekbec05f1af6597423@mail.gmail.com>
	<18171.3858.931686.291011@terry.local>
Message-ID: <46FB377D.50903@ronadam.com>



Terry Jones wrote:

> OK, how about "lack of choice"?  :-)

There's always a choice... Not always a good one though.   ;-)


> but you don't have a (simple) choice if you don't want to accumulate
> results.  I'm merely saying that I think it would be cleaner and more
> consistent to allow
> 
>     print(x) for x in range(5) if x % 2 == 0
> 
> instead of having the non-choice but to write something like
> 
>     for x in range(5):
>         if x % 2 == 0:
>             print x
> 


 >>> a = list(range(10))
 >>> def pr(obj):
...     print obj
...
 >>> b = [1 for x in a if pr(x)]
0
1
2
3
4
5
6
7
8
9
 >>> b
[]


Or to be more specific to the above example...

 >>> b = [1 for x in range(5) if x % 2 == 0 and pr(x)]
0
2
4
 >>> b
[]


All of which are more complex than a simple for loop.


Cheers,
    Ron



From stephen at xemacs.org  Thu Sep 27 07:15:23 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 27 Sep 2007 14:15:23 +0900
Subject: [Python-ideas] Calling a function of a list
	without	accumulating results
In-Reply-To: <18170.59686.345068.336674@terry.local>
References: <18170.53850.491682.600306@terry.local>
	<bbaeab100709261602l1ecfe02bh31d5f3ae13fbe60@mail.gmail.com>
	<18170.59686.345068.336674@terry.local>
Message-ID: <87abr8lmdw.fsf@uwakimon.sk.tsukuba.ac.jp>

Terry Jones writes:
 > The trivial case I posted isn't much of a win over the simple 2-line
 > alternative, but it's easy to go further:
 > 
 >     f(x, y) for x in myXlist for y in myYlist

Excuse me?

 > instead of
 > 
 >     for x in myXlist:
 >         for y in myYlist:
 >             f(x, y)

Oh, is that what you meant?!<wink>

I think the second version is much more readable, and only a few
characters longer when typing.

 > The second argument is one of consistency.  If list comprehensions are
 > regarded as more pythonic and the Right Way to code in Python, I'd make the
 > same argument for when you don't happen to want to keep the accumulated
 > results.  Why force programmers to use two coding styles in order to get
 > essentially the same thing done?

Because it is essentially not the same thing.  Comprehension syntax is
justified precisely when you want to generate a list value for immediate
use, and all the other ways to generate that value force you to hide
what's being done in an assignment deep inside a thicket of syntax.
List comprehensions are Pythonic because they "look like" lists.
IMHO, anyway.

OTOH, in Python, control syntax always starts with a keyword.  A naked
comprehension just doesn't look like a control statement to me, it
still looks like an expression.  I don't know if that's un-Pythonic,
but I do like the multiline version better.

 > I think these are decent arguments. It's simply the full succinctness and
 > convenience of list comprehensions, without needing to accumulate results.

But succinctness and convenience aren't arguments for doing something
in Python as I understand it.  Lack of succinctness and convenience may
postpone acceptance of a PEP, or even kill it, of course.  But they've
never been sufficient for acceptance of a PEP that I've seen.


From grosser.meister.morti at gmx.net  Thu Sep 27 14:18:24 2007
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Thu, 27 Sep 2007 14:18:24 +0200
Subject: [Python-ideas] is in operator
In-Reply-To: <91ad5bf80709261707u129a118w3768061b0338baba@mail.gmail.com>
References: <46FACA4D.8060800@gmx.net>
	<91ad5bf80709261707u129a118w3768061b0338baba@mail.gmail.com>
Message-ID: <46FB9F90.2030607@gmx.net>

George Sakkis wrote:
>
> Or in a more obfuscated way:
>
> import operator as op
> from itertools import imap
> from functools import partial
>
> if any(imap(partial(op.is_,x), (a, b, c))):
>         ...
>
> if all(imap(partial(op.is_not,x), (a, b, c))):
>      ...
>
>
> George
>

Or in Haskell (assuming Haskell had "is" and "is not"):

if any (x is) [a, b, c] then ... else ...

if all (x is not) [a, b, c] then ... else ...

I'm not sure if "any" and "all" are the ones with 2 parameters (function and list) or
if that would be "or" and "and".


	-panzi


From arno at marooned.org.uk  Thu Sep 27 19:02:28 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Thu, 27 Sep 2007 18:02:28 +0100 (BST)
Subject: [Python-ideas] Calling a function of a list without
 accumulating results
In-Reply-To: <18170.53850.491682.600306@terry.local>
References: <18170.53850.491682.600306@terry.local>
Message-ID: <63519.82.46.172.40.1190912548.squirrel@marooned.org.uk>


On Wed, September 26, 2007 10:42 pm, Terry Jones wrote:
> What's the most compact way to repeatedly call a function on a list
> without
> accumulating the results?
>
> While I can accumulate results via
>
>     a = [f(x) for x in mylist]
>
> or with a generator, there doesn't seem to be a way to do this without
> accumulating the results. I guess I need to either use the above and
> ignore
> the result, or use
>
>     for x in mylist:
>         f(x)
>
> I run into this need quite frequently. If I write
>
>     [f(x) for x in mylist]
>
> with no assignment, will Python notice that I don't want the accumulated
> results and silently toss them for me?
>
> A possible syntax change would be to allow the unadorned
>
>     f(x) for x in mylist
>
> And raise an error if someone tries to assign to this.
>

If you want to do it like this, why not do it explicitly:

def exhaust(iterable):
   for i in iterable: pass

Then you can write:

exhaust(f(x) for x in mylist)

Done!
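
(For the nested case from earlier in the thread, that becomes:

    exhaust(f(x, y) for x in myXlist for y in myYlist)

giving the full genexp syntax without any language change.)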


-- 
Arnaud




From fdrake at acm.org  Thu Sep 27 20:05:48 2007
From: fdrake at acm.org (Fred Drake)
Date: Thu, 27 Sep 2007 14:05:48 -0400
Subject: [Python-ideas] Calling a function of a list without
	accumulating results
In-Reply-To: <63519.82.46.172.40.1190912548.squirrel@marooned.org.uk>
References: <18170.53850.491682.600306@terry.local>
	<63519.82.46.172.40.1190912548.squirrel@marooned.org.uk>
Message-ID: <279AD750-5307-4CDB-8AE8-E5252F7707E9@acm.org>

On Sep 27, 2007, at 1:02 PM, Arnaud Delobelle wrote:
> def exhaust(iterable):
>    for i in iterable: pass
>
> Then you can write:
>
> exhaust(f(x) for x in mylist)

Ooooh... I like this!  Anyone who needs such a construct can just  
write their own exhaust() function, too, since I see no reason to  
pollute the distribution with this.


   -Fred

-- 
Fred Drake   <fdrake at acm.org>





From lucio.torre at gmail.com  Thu Sep 27 20:06:12 2007
From: lucio.torre at gmail.com (Lucio Torre)
Date: Thu, 27 Sep 2007 15:06:12 -0300
Subject: [Python-ideas] Calling a function of a list without
	accumulating results
In-Reply-To: <63519.82.46.172.40.1190912548.squirrel@marooned.org.uk>
References: <18170.53850.491682.600306@terry.local>
	<63519.82.46.172.40.1190912548.squirrel@marooned.org.uk>
Message-ID: <999187ed0709271106vb3f879eo6b5ac9943d82cbe6@mail.gmail.com>

On 9/27/07, Arnaud Delobelle <arno at marooned.org.uk> wrote:
>
> >
> >     for x in mylist:
> >         f(x)
> >
> >
> >     [f(x) for x in mylist]
> >
>

Am I missing something, or is the one-liner syntax great for this?

for x in mylist: f(x)

lucio.

From cmaurer at slickedit.com  Thu Sep 27 23:34:11 2007
From: cmaurer at slickedit.com (Clark Maurer)
Date: Thu, 27 Sep 2007 17:34:11 -0400
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately massive
	(4 to 32 cores) multi-core architectures
Message-ID: <ECCC6E9907B4CD4A83260A191A91F20E014C8C1E@wampa.office.slickedit.com>

Hello,

 

I've been following this discussion.  My thoughts mostly reiterate what
has already been said.  There's no way to get rid of the GIL without
significantly affecting single-threaded performance.  IMO getting rid of
the GIL would require writing a mark-and-sweep algorithm.  To improve
performance you can do incremental (threaded) marking and detect page
faults so that modified pages can be rescanned for references.  The Boehm
garbage collector does this (I think), but Python would need something
much more custom.  This type of garbage collector is VERY hard to write.
Worse yet, the current implementation of Python would need a lot of
rewriting.

 

FYI: I tried using the Boehm collector in SlickEdit and it leaked memory
like crazy.  I never figured out why but I suspect it had to do with it
treating everything in memory as a potential pointer.

 

Ruby's mark-and-sweep garbage collector illustrates the loss in
single-threaded performance, and since it does its own thread
scheduling, the thread performance is bad too.

 

As Python stands right now, its performance is excellent for single
threading, the implementation is simple, it works well for the typical
Python user, and using processes at least gives a workaround.  I like
to be a perfectionist as much as the next guy, but the payback doesn't
warrant the level of effort.  Where's the easy button when you need
one? :-)

 

I thought you Python enthusiasts (especially Guido) might enjoy the
article I just posted on the SlickEdit blog.  I'm the CTO and founder of
SlickEdit.  I hate saying that because I'm a very humble guy, but I
thought you would want to know.  The article is called "Comparing Python
to Perl and Ruby"; go to http://blog.slickedit.com/.  I limited the
article to a simple grammar comparison because I wanted to keep the
article short.  Hope you enjoy it.

 

Guido, I have another article written which talks about Python as well
but I have not yet posted it.  If you give me an email address, I will
send it to you to look over before I post it.  Don't give me your email
address here.  Instead write to support at slickedit.com and let them know
that I requested your email address.

 

Cheers

Clark

 


From terry at jon.es  Thu Sep 27 23:48:47 2007
From: terry at jon.es (Terry Jones)
Date: Thu, 27 Sep 2007 23:48:47 +0200
Subject: [Python-ideas] Calling a function of a list
 without	accumulating results
In-Reply-To: Your message at 14:46:36 on Thursday, 27 September 2007
References: <18170.53850.491682.600306@terry.local>
	<bbaeab100709261602l1ecfe02bh31d5f3ae13fbe60@mail.gmail.com>
	<18170.59686.345068.336674@terry.local>
	<46FB198C.4070705@canterbury.ac.nz>
Message-ID: <18172.9535.939185.173926@terry.local>

Hi Greg

| The way things are, there is only one coding style for when you don't want
| the results. You're suggesting the addition of another one. That *would* be
| un-Pythonic.

But the same remark could be made about using a list and writing explicit
loops to accumulate results, and the later addition of list comprehensions.
Wasn't that un-Pythonic for the same reason?

Terry


From terry at jon.es  Thu Sep 27 23:56:38 2007
From: terry at jon.es (Terry Jones)
Date: Thu, 27 Sep 2007 23:56:38 +0200
Subject: [Python-ideas] Calling a function of a list
	without	accumulating results
In-Reply-To: Your message at 18:02:28 on Thursday, 27 September 2007
References: <18170.53850.491682.600306@terry.local>
	<63519.82.46.172.40.1190912548.squirrel@marooned.org.uk>
Message-ID: <18172.10006.586289.448207@terry.local>

Hi Arnaud

| If you want to do it like this, why not do it explicitly:
| 
| def exhaust(iterable):
|    for i in iterable: pass
| 
| Then you can write:
| 
| exhaust(f(x) for x in mylist)

Thanks - that's nice. It also gives me the generality I wanted, which was
the ability to use the full LC/genexp "for..." syntax, which I should have
emphasized more, including in the subject of the thread.

Terry


From tjreedy at udel.edu  Fri Sep 28 01:12:46 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 27 Sep 2007 19:12:46 -0400
Subject: [Python-ideas] Exploration PEP : Concurrency for moderately
	massive(4 to 32 cores) multi-core architectures
References: <ECCC6E9907B4CD4A83260A191A91F20E014C8C1E@wampa.office.slickedit.com>
Message-ID: <fdhdd5$1hv$1@sea.gmane.org>


"Clark Maurer" <cmaurer at slickedit.com> wrote in 
message 
news:ECCC6E9907B4CD4A83260A191A91F20E014C8C1E at wampa.office.slickedit.com...
Guido, I have another article written which talks about Python as well
but I have not yet posted it.  If you give me an email address, I will
send it to you to look over before I post it.  Don't give me your email
address here.  Instead write to 
support at slickedit.com and let them know
that I requested your email address.
==============
Cloak and dagger stuff is not necessary.  Guido, like most of us, uses a 
valid email in Python discussion groups (guido @ python.org recently).





From george.sakkis at gmail.com  Fri Sep 28 03:47:02 2007
From: george.sakkis at gmail.com (George Sakkis)
Date: Thu, 27 Sep 2007 21:47:02 -0400
Subject: [Python-ideas] Removing the del statement
Message-ID: <91ad5bf80709271847g726efcf7l919fddcce6332c53@mail.gmail.com>

I guess this has very few to zero chances of being considered, even for
Python 3, but this being python-ideas I guess it's ok to bring it up. IMO
the del statement is one of the relatively few constructs that stick out
like a sore thumb. For one thing, it is overloaded to mean three different
things:
1) del x: Remove x from the current namespace
2) del x[i]: Equivalent to x.__delitem__(i)
3) del x.a: Equivalent to x.__delattr__('a') and delattr(x,'a')
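
(For concreteness, here are the last two forms next to their method
equivalents:)

    d = {'a': 1}
    del d['a']          # same effect as d.__delitem__('a')

    class C(object):
        pass
    c = C()
    c.attr = 1
    del c.attr          # same effect as delattr(c, 'attr')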

Here I am mostly arguing for removing the last two; the first could also be
removed if/when Python gets block namespaces, but it is orthogonal to the
others. I don't see the point of complicating the lexer and the grammar with
an extra keyword and statement for something that is typically handled by a
method (my preference), or at least a generic (builtin) function like len().
The last case is especially superfluous given that there are both a special
method and a generic builtin (delattr) that do the same thing. Neither item
nor attribute deletion is so pervasive as to be granted special treatment
at the language level.

I wonder if this was considered and rejected in the Py3K discussions; PEP
3099 doesn't mention anything about it.

George

From adam at atlas.st  Fri Sep 28 03:58:11 2007
From: adam at atlas.st (Adam Atlas)
Date: Thu, 27 Sep 2007 21:58:11 -0400
Subject: [Python-ideas] Removing the del statement
In-Reply-To: <91ad5bf80709271847g726efcf7l919fddcce6332c53@mail.gmail.com>
References: <91ad5bf80709271847g726efcf7l919fddcce6332c53@mail.gmail.com>
Message-ID: <6F2A191C-E194-4F01-8150-496BC3973424@atlas.st>


On 27 Sep 2007, at 21:47, George Sakkis wrote:

> I guess this has very few to zero chances of being considered, even  
> for Python 3, but this being python-ideas I guess it's ok to bring  
> it up. IMO the del statement is one of the relatively few  
> constructs that stick out like a sore thumb. For one thing, it is  
> overloaded to mean three different things:
> 1) del x: Remove x from the current namespace
> 2) del x[i]: Equivalent to x.__delitem__(i)
> 3) del x.a: Equivalent to x.__delattr__('a') and delattr(x,'a')


I guess this has very few to zero chances of being considered, even  
for Python 3, but this being python-ideas I guess it's ok to bring it  
up. IMO the = statement is one of the relatively few constructs that  
stick out like a sore thumb. For one thing, it is overloaded to mean  
three different things:
1) x = : Assign x in the current namespace
2) x[i] = : Equivalent to x.__setitem__(i)
3) x.a = : Equivalent to x.__setattr__('a') and setattr(x,'a')


(Sorry for the slight sarcasm, but I hope you see my point. I don't  
see why the deletion statement should go while the perfectly  
complementary and nearly-identically-"overloaded" assignment  
statement should stay.)


From gsakkis at rutgers.edu  Fri Sep 28 04:13:11 2007
From: gsakkis at rutgers.edu (George Sakkis)
Date: Thu, 27 Sep 2007 22:13:11 -0400
Subject: [Python-ideas] Removing the del statement
In-Reply-To: <6F2A191C-E194-4F01-8150-496BC3973424@atlas.st>
References: <91ad5bf80709271847g726efcf7l919fddcce6332c53@mail.gmail.com>
	<6F2A191C-E194-4F01-8150-496BC3973424@atlas.st>
Message-ID: <91ad5bf80709271913i2b96e166yded2ed18ad4cb0c7@mail.gmail.com>

On 9/27/07, Adam Atlas <adam at atlas.st> wrote:
>
>
> On 27 Sep 2007, at 21:47, George Sakkis wrote:
>
> > I guess this has very few to zero chances of being considered, even
> > for Python 3, but this being python-ideas I guess it's ok to bring
> > it up. IMO the del statement is one of the relatively few
> > constructs that stick out like a sore thumb. For one thing, it is
> > overloaded to mean three different things:
> > 1) del x: Remove x from the current namespace
> > 2) del x[i]: Equivalent to x.__delitem__(i)
> > 3) del x.a: Equivalent to x.__delattr__('a') and delattr(x,'a')
>
>
> I guess this has very few to zero chances of being considered, even
> for Python 3, but this being python-ideas I guess it's ok to bring it
> up. IMO the = statement is one of the relatively few constructs that
> stick out like a sore thumb. For one thing, it is overloaded to mean
> three different things:
> 1) x = : Assign x in the current namespace
> 2) x[i] = : Equivalent to x.__setitem__(i)
> 3) x.a = : Equivalent to x.__setattr__('a') and setattr(x,'a')
>
>
> (Sorry for the slight sarcasm, but I hope you see my point. I don't
> see why the deletion statement should go while the perfectly
> complementary and nearly-identically-"overloaded" assignment
> statement should stay.)


Apples to oranges. I thought it would be obvious and that's why I didn't
mention it, but getitem/setitem and friends use almost universally known
punctuation; OTOH only Python, AFAIK, uses a keyword for a relatively
infrequent operation.

George

From bjourne at gmail.com  Fri Sep 28 12:13:10 2007
From: bjourne at gmail.com (BJörn Lindqvist)
Date: Fri, 28 Sep 2007 12:13:10 +0200
Subject: [Python-ideas] Removing the del statement
In-Reply-To: <91ad5bf80709271847g726efcf7l919fddcce6332c53@mail.gmail.com>
References: <91ad5bf80709271847g726efcf7l919fddcce6332c53@mail.gmail.com>
Message-ID: <740c3aec0709280313o7830cb5uaa389e653eb2c334@mail.gmail.com>

On 9/28/07, George Sakkis <george.sakkis at gmail.com> wrote:
> I wonder if this was considered and rejected in the Py3K discussions; PEP
> 3099 doesn't mention anything about it.

Yes. del (and especially __del___) has been discussed on and off on
the python-3000 list.

http://mail.python.org/pipermail/python-3000/2006-September/003855.html
http://mail.python.org/pipermail/python-3000/2007-May/007129.html
http://mail.python.org/pipermail/python-3000/2007-May/007683.html

I have used "del x" a few times to shorten the list of exported names
in modules which helps epydoc. Never found any use for del x[i] or del
x.a though.

-- 
mvh Björn


From guido at python.org  Fri Sep 28 16:42:34 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 28 Sep 2007 07:42:34 -0700
Subject: [Python-ideas] Removing the del statement
In-Reply-To: <740c3aec0709280313o7830cb5uaa389e653eb2c334@mail.gmail.com>
References: <91ad5bf80709271847g726efcf7l919fddcce6332c53@mail.gmail.com>
	<740c3aec0709280313o7830cb5uaa389e653eb2c334@mail.gmail.com>
Message-ID: <ca471dc20709280742n3fa8707asb682ac7168216533@mail.gmail.com>

On 9/28/07, BJörn Lindqvist <bjourne at gmail.com> wrote:
> I have used "del x" a few times to shorten the list of exported names
> in modules which helps epydoc. Never found any use for del x[i] or del
> x.a though.

You never use dictionaries?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From clarkksv at yahoo.com  Sat Sep 29 13:05:40 2007
From: clarkksv at yahoo.com (Joseph Maurer)
Date: Sat, 29 Sep 2007 04:05:40 -0700 (PDT)
Subject: [Python-ideas] Enhance reload
Message-ID: <844399.5421.qm@web58901.mail.re1.yahoo.com>

I'm still new to the technical details of Python, so correct me if I misunderstand the current capabilities.

I'd like to see the reload feature of Python enhanced so it can replace the methods for existing class instances, references to methods, and references to functions.

Here's the scenario. Let's say you want to use Python as a macro language. Currently, you can bind a Python function to a key or menu (better to do it by name and not by reference). That's what most apps need.  However, an advanced app like SlickEdit would have class instances for modeless dialogs (including tool windows) and other data structures. There are also callbacks, which would preferably be references to functions or methods. With the current implementation you would have to close and reopen dialogs. In other cases, you would need to exit SlickEdit and restart. While there will always be cases where this is necessary, I can tell you from experience that this is a great feature to have, since Slick-C does this.

I suspect that there are other scenarios that users would like this capability for.

Java and C# support something like this to a limited extent when you are debugging.

This capability could be a reload option. There could be cases where one might want the existing instances to use the old implementation. You wouldn't need to make this an option for me. There will always be cases where you have to restart because you made too many changes.


       


From steven.bethard at gmail.com  Sat Sep 29 15:16:49 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 29 Sep 2007 07:16:49 -0600
Subject: [Python-ideas] Enhance reload
In-Reply-To: <844399.5421.qm@web58901.mail.re1.yahoo.com>
References: <844399.5421.qm@web58901.mail.re1.yahoo.com>
Message-ID: <d11dcfba0709290616r4e7f5eb8o4243c0ab9144e720@mail.gmail.com>

On 9/29/07, Joseph Maurer <clarkksv at yahoo.com> wrote:
> I'd like to see the reload feature of Python enhanced so it
> can replace the methods for existing class instances,
> references to methods, and references to functions.

I'd be surprised if there's anyone out there who really doesn't want a
better reload(). The real question is who's going to figure out how to
implement it. ;-)

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From clarkksv at yahoo.com  Sat Sep 29 16:14:54 2007
From: clarkksv at yahoo.com (Joseph Maurer)
Date: Sat, 29 Sep 2007 07:14:54 -0700 (PDT)
Subject: [Python-ideas]  Enhance reload
Message-ID: <644262.85437.qm@web58907.mail.re1.yahoo.com>

I'm glad to hear it isn't a matter of whether it was useful or not.

The way I implemented this feature in Slick-C is with indirection. In Python terms, this means that a separate data structure that isn't reference counted holds the method/function object data.  The method/function object is changed to just contain a pointer to it. The data structure which holds all method/function data should probably be a non-reference counted dictionary.  When a function is deleted, its name remains in the dictionary but the entry needs to be changed to indicate that it is "null/invalid".  When a deleted function is called, an exception should be raised. Adding a function/method means replacing the data in the dictionary. This type of implementation is simple. There's an insignificant amount of overhead on a function/method call (i.e. instead of "func->data" you have "func=*pfunc; if ( func->isInvalid() ) throw exception; else func->data").

Technically this algorithm leaks memory since deleted functions/methods are never removed.  My response: who cares?  When the interpreter's cleanup-everything function is called, you simply deallocate everything in the hash table.
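
In pure Python the indirection might be sketched like this (hypothetical
names; a real version would live in the interpreter's function-call path):

    _registry = {}  # qualified name -> current function; survives redefinition

    class DeletedFunctionError(Exception):
        pass

    def replaceable(func):
        """Route calls through the registry so that redefining a function
        takes effect even for references captured before the reload."""
        name = '%s.%s' % (func.__module__, func.__name__)
        _registry[name] = func
        def proxy(*args, **kwargs):
            current = _registry.get(name)
            if current is None:  # the function was deleted
                raise DeletedFunctionError(name)
            return current(*args, **kwargs)
        proxy.__name__ = func.__name__
        return proxy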

Does anyone know what level of effort would be needed for something like this?
Is my proposed implementation a good one for Python?


       


From gsakkis at rutgers.edu  Sat Sep 29 17:01:56 2007
From: gsakkis at rutgers.edu (George Sakkis)
Date: Sat, 29 Sep 2007 11:01:56 -0400
Subject: [Python-ideas] Enhance reload
In-Reply-To: <644262.85437.qm@web58907.mail.re1.yahoo.com>
References: <644262.85437.qm@web58907.mail.re1.yahoo.com>
Message-ID: <91ad5bf80709290801n60907f3ci1cf49039e8c69b59@mail.gmail.com>

On 9/29/07, Joseph Maurer <clarkksv at yahoo.com> wrote:

> I'm glad to hear it isn't a matter of whether it was useful or not.
>
> The way I implemented this feature in Slick-C is with indirection. In
> Python terms, this means that a separate data structure that isn't reference
> counted holds the method/function object data.  The method/function object
> is changed to just contain a pointer to it. The data structure which holds
> all method/function data should probably be a non-reference counted
> dictionary.  When a function is deleted, its name remains in the dictionary
> but the entry needs to be changed to indicate that it is
> "null/invalid".  When a deleted function is called, an exception should be
> raised. Adding a function/method means replacing the data in the dictionary.
> This type of implementation is simple. There's an insignificant amount of
> overhead on a function/method call (i.e. instead of "func->data" you
> have  "func=*pfunc;if ( func->isInvalid() ) throw exception; else
> func->data" ).
>
> Technically this algorithm leaks memory since deleted functions/methods
> are never removed.  My response: who cares?  When the interpreter's
> cleanup-everything function is called, you simply deallocate everything
> in the hash table.
>
> Does anyone know what level of effort would be needed for something like
> this?
> Is my proposed implementation a good one for Python?


You may want to take a look at a relevant Cookbook recipe:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/160164

George

From tjreedy at udel.edu  Sat Sep 29 21:16:05 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 29 Sep 2007 15:16:05 -0400
Subject: [Python-ideas] Enhance reload
References: <844399.5421.qm@web58901.mail.re1.yahoo.com>
Message-ID: <fdm89m$32m$1@sea.gmane.org>


"Joseph Maurer" <clarkksv at yahoo.com> wrote in 
message news:844399.5421.qm at web58901.mail.re1.yahoo.com...
| I'd like to see the reload feature of Python enhanced so it can replace 
the methods for existing class instances, references to methods, and 
references to functions.

I think we could get farther by restricting our concern to replacing 
class attributes, so that existing class instances would use their new 
definitions.

As I understand it, the problem is this.  After somemod is imported, a 
second 'import somemod' simply binds 'somemod' to the existing module 
object, while 'reload(somemod)' replaces the module object with a new 
object with all new contents, and references to objects within the old 
module object remain as they are.

So I propose this.  'Reclass somemod' (by whatever syntax) would execute 
the corresponding code in a new namespace (dict).  But instead of making 
that dict the __dict__ attribute of a new module, reclass would match class 
names with the existing __dict__, and replace the class.__dict__ 
attributes, so that subsequent access to class attributes, including 
particularly methods, would get the new versions.  In other words, use the 
existing indirection involved in attribute access.  New classes could 
simply be added.  Deleted classes could be disabled, but this really 
requires a restart after editing files that reference such classes, so 
deleting classes should not be done for the restricted reload uses this 
idea is aimed at.
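
A rough sketch of the idea in Python itself (a hypothetical helper; 
old-style classes, read-only attributes and the like are glossed over):

    import sys

    def reclass(modname):
        """Re-execute a module's source and patch its existing classes in
        place, so existing instances see the new method definitions."""
        mod = sys.modules[modname]
        source = open(mod.__file__.rstrip('co')).read()  # .pyc/.pyo -> .py
        fresh = {}
        exec source in fresh
        for name, new_obj in fresh.items():
            if name.startswith('__'):
                continue  # skip __builtins__ and friends
            old_obj = getattr(mod, name, None)
            if isinstance(new_obj, type) and isinstance(old_obj, type):
                # keep the old class object; copy its new attributes over
                for attr, value in new_obj.__dict__.items():
                    if attr not in ('__dict__', '__weakref__', '__doc__'):
                        setattr(old_obj, attr, value)
            else:
                setattr(mod, name, new_obj)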

It would probably be possible to modify function objects (replace 
func_code, etc), but this is more difficult.  It is simpler, at least for a 
beginning, to require that functions be put within a class when reclassing 
is anticipated.
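
For what it's worth, func_code is already assignable today, which is the 
kernel of that approach:

    >>> def f(): return 1
    ...
    >>> def g(): return 2
    ...
    >>> f.func_code = g.func_code    # spelled __code__ in Py3k
    >>> f()
    2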

Terry Jan Reedy





From brett at python.org  Sat Sep 29 21:27:18 2007
From: brett at python.org (Brett Cannon)
Date: Sat, 29 Sep 2007 12:27:18 -0700
Subject: [Python-ideas] Enhance reload
In-Reply-To: <fdm89m$32m$1@sea.gmane.org>
References: <844399.5421.qm@web58901.mail.re1.yahoo.com>
	<fdm89m$32m$1@sea.gmane.org>
Message-ID: <bbaeab100709291227q19b10058u8c71406ac96dff69@mail.gmail.com>

On 9/29/07, Terry Reedy <tjreedy at udel.edu> wrote:
>
> "Joseph Maurer" <clarkksv at yahoo.com> wrote in
> message news:844399.5421.qm at web58901.mail.re1.yahoo.com...
> | I'd like to see the reload feature of Python enhanced so it can replace
> the methods for existing class instances, references to methods, and
> references to functions.
>
> I think we could get farther by restricting our concern to replacing
> class attributes, so that existing class instances would use their new
> definitions.
>
> As I understand it, the problem is this.  After somemod is imported, a
> second 'import somemod' simply binds 'somemod' to the existing module
> object, while 'reload(somemod)' replaces the module object with a new
> object with all new contents, and references to objects within the old
> module object remain as they are.
>

Actually, the way reload works is that it takes the module object from
sys.modules, and then re-initializes mod.__dict__ (this is why import
loaders must use any pre-existing module's __dict__ when loading).  So
the module object itself is not replaced; it's just that its __dict__
is mutated in-place.
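
You can see this at the interactive prompt (using a hypothetical module
named mymod):

    >>> import mymod
    >>> before = id(mymod)
    >>> reload(mymod)          # re-executes mymod's code
    <module 'mymod' from 'mymod.py'>
    >>> id(mymod) == before    # same module object, refreshed __dict__
    True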

-Brett


From greg.ewing at canterbury.ac.nz  Sun Sep 30 01:25:21 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 30 Sep 2007 11:25:21 +1200
Subject: [Python-ideas] Enhance reload
In-Reply-To: <644262.85437.qm@web58907.mail.re1.yahoo.com>
References: <644262.85437.qm@web58907.mail.re1.yahoo.com>
Message-ID: <46FEDEE1.1020209@canterbury.ac.nz>

Joseph Maurer wrote:
> 
> The way I implemented this feature in Slick-C is with indirection... 
>
> Is my proposed implementation a good one for Python?

It's nowhere near detailed enough to be able to tell. When
Steven said "figure out how to implement it", he meant
working out the details, not just coming up with a
high-level idea.

What you suggest sounds like it ought to be possible,
at first sight, since Python function objects are already
containers with a reference to another object that
holds the function's code.

The problem will be figuring out *when* you're redefining
a function, because the process of loading a module is a
very dynamic one in Python. Defining functions and classes
is done by executing code, not by statically analysing
declarations as a C compiler does.
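
For instance, which of these two functions exists is only decided when
the module's code actually runs:

    import random

    if random.random() < 0.5:
        def greet():
            return 'heads'
    else:
        def greet():
            return 'tails'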

--
Greg



From clarkksv at yahoo.com  Sun Sep 30 15:22:36 2007
From: clarkksv at yahoo.com (Joseph Maurer)
Date: Sun, 30 Sep 2007 06:22:36 -0700 (PDT)
Subject: [Python-ideas]  Enhance reload
Message-ID: <401200.41417.qm@web58907.mail.re1.yahoo.com>

Greg Ewing wrote:
> What you suggest sounds like it ought to be possible,
> at first sight, since Python function objects are already
> containers with a reference to another object that
> holds the function's code.

> The problem will be figuring out *when* you're redefining
> a function, because the process of loading a module is a
> very dynamic one in Python. Defining functions and classes
> is done by executing code, not by statically analysing
> declarations as a C compiler does.

My implementation is definitely a high-level sketch.

Greg,

These are exactly the kinds of issues I'm looking for. If there are more, let's see them.

Would temporarily marking the module with "replace" work? I would think that when a function is defined, it has access to the module (because it is adding to the module's dictionary), so it could check for the "replace" attribute.  I'm assuming a certain sequence of execution here, since the "replace" attribute would have to be removed after the function/method code was executed/loaded.  If anyone knows that this isn't the case, please shoot this down.

Another post I read proposed a Reclass feature that only worked for classes.  Given the macro language scenario, you definitely need functions too.





From clarkksv at yahoo.com  Sun Sep 30 16:57:09 2007
From: clarkksv at yahoo.com (Joseph Maurer)
Date: Sun, 30 Sep 2007 07:57:09 -0700 (PDT)
Subject: [Python-ideas]   Enhance reload
Message-ID: <925784.74407.qm@web58910.mail.re1.yahoo.com>

OK, my idea of a temporary "replace" attribute wrapped around a reload-like function is not a good one for Python, given that things can be added to modules dynamically at any time.

---------------------------------------
Here is my original high level design:

The way I implemented this feature in Slick-C is with indirection. In Python terms, this means that a separate data structure that isn't reference counted holds the method/function object data.  The method/function object is changed to just contain a pointer to it. The data structure which holds all method/function data should probably be a non-reference counted dictionary.  When a function is deleted, its name remains in the dictionary but the entry needs to be changed to indicate that it is "null/invalid".  When a deleted function is called, an exception should be raised. Adding a function/method means replacing the data in the dictionary. This type of implementation is simple. There's an insignificant amount of overhead on a function/method call (i.e. instead of "func->data" you have "func=*pfunc; if ( func->isInvalid() ) throw exception; else func->data").

Technically this algorithm leaks memory since deleted functions/methods are never removed.  My response: who cares?  When the interpreter's cleanup-everything function is called, you simply deallocate everything in the dictionary.
---------------------------------

Instead of a temporary "replace" attribute wrapped into a reload-like call, how about giving modules a user-settable "replace" attribute?  At any time, the user can set/reset this attribute.  This would specify how the user wants functions/methods processed: always added or always replaced. The "replace" attribute would likely need to be passed through to function objects, class objects, and method objects. For the macro language scenarios, I would just mark every module that got loaded with this attribute.

The proposed implementation I have given is intended to be very "file" oriented (which maps to a Python module).

Would this work in the current code base? 

I'm assuming the following:

When a function is added/executed, the module structure is accessible.
When a class is added (i.e. class myclass), the module structure is accessible.
When a method is added/executed, at least the class structure is accessible?

I hope you see where I'm going here.  The executed "class myclass" code which defines a new class can copy the module's "replace" attribute.  The executed "def myfunction" code which defines a new method can copy the class's "replace" attribute.

The function/method object structure could remain the same except for the addition of a new function/method pointer member.  

The additional code for a function call would look like this:

   // Did this function get defined in "replace" mode?
   if ( func->doReplace() ) {
       // For this one, use the indirect pointer and not the other member data.
       func = func->pfunc;
       if ( !func->isValid() ) {
           // Raise a Python exception (not a C++ one) and bail out, e.g.
           // PyErr_SetString(PyExc_ReferenceError, "function was deleted");
           return NULL;
       }
   }
   // Now do what we used to do

Given the OO nature of Python, a separate function/method type for a replaceable function/method could be defined, but I suspect it isn't worth the effort.  The above pseudo code is very efficient: "doReplace" would probably be just a boolean/int member, and the "isValid" call would be cheap as well.

One thing my proposed implementation does not cover is adding new data members to a class.  I think it is acceptable for this not to be handled.

Please shoot this high-level implementation down if it won't work in the current code base.

Also, what does everyone think about the idea of some sort of "replace" attribute for the module?  How should it get set?  "import module; module.replace=1". I'm probably showing a little lack of knowledge here. Teach me and I'll get it.


       


From jimjjewett at gmail.com  Sun Sep 30 18:46:38 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sun, 30 Sep 2007 12:46:38 -0400
Subject: [Python-ideas] Enhance reload
In-Reply-To: <844399.5421.qm@web58901.mail.re1.yahoo.com>
References: <844399.5421.qm@web58901.mail.re1.yahoo.com>
Message-ID: <fb6fbf560709300946m2bc3de8ctdbaff21c3b21ef4a@mail.gmail.com>

On 9/29/07, Joseph Maurer <clarkksv at yahoo.com> wrote:
> I'd like to see the reload feature of Python enhanced so
> it can replace the methods for existing class instances,
> references to methods, and references to functions.

Guido did some work on this (as xreload) for Py3K, but gave up for the moment.

In the meantime, you can sometimes get a bit closer with indirection.
Instead of

    from othermod import anobject  # won't notice if anobject is replaced later

use

    import othermod
    ...
    othermod.anobject    # always looks up the current anobject
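
For example (again with a hypothetical othermod):

    import othermod
    captured = othermod.anobject     # frozen at the current definition
    reload(othermod)                 # rebinds othermod.anobject
    captured is othermod.anobject    # -> False; the captured reference is stale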

That said, subclasses and existing instances generally don't use
indirection, because it is slower; replacing bound methods will almost
certainly require at least the moral equivalent of reloading the dialog
after reloading the module.

-jJ