From jason.orendorff at gmail.com  Thu Mar  1 23:46:55 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Thu, 1 Mar 2007 17:46:55 -0500
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
Message-ID: <bb8868b90703011446k6e19b3a2s4783665c9fd896c0@mail.gmail.com>

I wonder if anyone else finds the heapq and bisect APIs a little dated.

It seems like these things could be offered in a more intuitive and
featureful way by making them classes.  They could go in the
collections module:

  class priorityqueue:
      def __init__(self, elements=(), *,
          cmp=None, key=None, reversed=False)
      def add(self, element)
      def pop(self)  --> remove and return the min element
      def __iter__(self) --> (while self: yield self.pop())
      ...
      ... any other list methods that make sense
      ... for example, __len__ but not __getitem__
      ...

  class sortedlist:
      def __init__(self, elements=(), *,
          cmp=None, key=None, reversed=False)
      def add(self, element)  # insort

      # Methods involving searching are O(log N).
      def __contains__(self, element)
      def index(self, element)
      def remove(self, element)
      ...
      ... plus all the other read-only list methods,
      ... and the modifying list methods that make sense
      ...

The point is, most of the API comes from list and sort.  I think
they'd be easier to use/remember than what we have.

Once upon a time Tim mentioned (maybe semi-seriously) possibly adding
a Fibonacci heap container.  I think a plain old binary heap would be
good enough for starters.
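
For concreteness, here's a rough sketch of how the binary-heap version
could wrap heapq (the names and the key handling are just illustrative,
not a final design):

    import heapq

    class priorityqueue(object):
        def __init__(self, elements=(), key=None):
            # Sketch: key support by keeping (key(e), e) pairs on the heap.
            self._key = key
            self._heap = [self._wrap(e) for e in elements]
            heapq.heapify(self._heap)

        def _wrap(self, element):
            return (self._key(element), element) if self._key else element

        def add(self, element):
            heapq.heappush(self._heap, self._wrap(element))

        def pop(self):
            # Remove and return the min element.
            item = heapq.heappop(self._heap)
            return item[1] if self._key else item

        def __len__(self):
            return len(self._heap)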

I could do this.  Any interest?

-j


From rhamph at gmail.com  Fri Mar  2 01:37:37 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 1 Mar 2007 17:37:37 -0700
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To: <bb8868b90703011446k6e19b3a2s4783665c9fd896c0@mail.gmail.com>
References: <bb8868b90703011446k6e19b3a2s4783665c9fd896c0@mail.gmail.com>
Message-ID: <aac2c7cb0703011637x6bf666ai8052ff8b547c9a3b@mail.gmail.com>

On 3/1/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> I wonder if anyone else finds the heapq and bisect APIs a little dated.
>
> It seems like these things could be offered in a more intuitive and
> featureful way by making them classes.  They could go in the
> collections module:

I agree, at least for heapq.  The algorithm involves enough voodoo
that it's not obvious how to extend it, so it might as well be wrapped
in a class that hides as much as possible.

The same can't be said for bisect though.  All it does is replace del
a[bisect_left(a, x)] with a.remove(x).  A little cleaner, but no more
obvious.  And the cost is still O(n) since it involves moving all the
pointers after x.
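
For reference, the bisect-based idiom being discussed looks something like
this (a sketch; the deletion is the part that stays O(n)):

    import bisect

    def sorted_remove(a, x):
        # The binary search finds x in O(log n)...
        i = bisect.bisect_left(a, x)
        if i == len(a) or a[i] != x:
            raise ValueError('%r not in list' % (x,))
        # ...but deleting still shifts every pointer after x: O(n).
        del a[i]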


>
>   class priorityqueue:
Should inherit from object if this is going into 2.x.

>       def __init__(self, elements=(), *,
>           cmp=None, key=None, reversed=False)
>       def add(self, element)
>       def pop(self)  --> remove and return the min element
>       def __iter__(self) --> (while self: yield self.pop())
__iter__ shouldn't modify the container.

>       ...
>       ... any other list methods that make sense
>       ... for example, __len__ but not __getitem__
>       ...

-- 
Adam Olsen, aka Rhamphoryncus


From taleinat at gmail.com  Fri Mar  2 13:25:39 2007
From: taleinat at gmail.com (Tal Einat)
Date: Fri, 2 Mar 2007 14:25:39 +0200
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To: <aac2c7cb0703011637x6bf666ai8052ff8b547c9a3b@mail.gmail.com>
References: <bb8868b90703011446k6e19b3a2s4783665c9fd896c0@mail.gmail.com>
	<aac2c7cb0703011637x6bf666ai8052ff8b547c9a3b@mail.gmail.com>
Message-ID: <7afdee2f0703020425m4aa2c1do7a44755bf8ae93a3@mail.gmail.com>

On 3/2/07, Adam Olsen <rhamph at gmail.com> wrote:
>
> On 3/1/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> > I wonder if anyone else finds the heapq and bisect APIs a little dated.


Ooh, me, me!

> > It seems like these things could be offered in a more intuitive and
> > featureful way by making them classes.  They could go in the
> > collections module:


> I agree, at least for heapq.  The algorithm involves enough voodoo
> that it's not obvious how to extend it, so it might as well be wrapped
> in a class that hides as much as possible.


Sounds good to me. +1

> The same can't be said for bisect though.  All it does is replace del
> a[bisect_left(a, x)] with a.remove(x).  A little cleaner, but no more
> obvious.  And the cost is still O(n) since it involves moving all the
> pointers after x.


A sortedlist class seems like a useful abstraction to me: while using
an instance of such a class, you can be sure that the list remains sorted.
For example, you don't have to worry about it being changed outside of your
code and no longer being sorted.

Also, wouldn't a sortedlist class have an insert() method to replace
bisect.insort(), as well as different implementations of __contains__ and
index?
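
To make that concrete, a minimal sketch of how bisect could back those
methods (just an illustration of the idea, not a proposed implementation):

    import bisect

    class sortedlist(object):
        def __init__(self, elements=()):
            self._items = sorted(elements)

        def insert(self, element):
            # Replaces bisect.insort(); keeps the list sorted.
            bisect.insort(self._items, element)

        def index(self, element):
            # O(log N) search instead of list.index()'s linear scan.
            i = bisect.bisect_left(self._items, element)
            if i == len(self._items) or self._items[i] != element:
                raise ValueError('%r not in sortedlist' % (element,))
            return i

        def __contains__(self, element):
            i = bisect.bisect_left(self._items, element)
            return i < len(self._items) and self._items[i] == element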

From jimjjewett at gmail.com  Fri Mar  2 19:32:11 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 2 Mar 2007 13:32:11 -0500
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To: <bb8868b90703011446k6e19b3a2s4783665c9fd896c0@mail.gmail.com>
References: <bb8868b90703011446k6e19b3a2s4783665c9fd896c0@mail.gmail.com>
Message-ID: <fb6fbf560703021032y588c4f75q78272f473e6ff08d@mail.gmail.com>

On 3/1/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> I wonder if anyone else finds the heapq and bisect APIs a little dated.

I find them too much of a special case.

> ... They could go in the collections module:

This alone would be an improvement.

I don't disagree with the rest of your proposal, but for me, this is
the big one.

-jJ


From jimjjewett at gmail.com  Fri Mar  2 19:37:35 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 2 Mar 2007 13:37:35 -0500
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To: <aac2c7cb0703011637x6bf666ai8052ff8b547c9a3b@mail.gmail.com>
References: <bb8868b90703011446k6e19b3a2s4783665c9fd896c0@mail.gmail.com>
	<aac2c7cb0703011637x6bf666ai8052ff8b547c9a3b@mail.gmail.com>
Message-ID: <fb6fbf560703021037l752ae137uf9fe563ff062b93@mail.gmail.com>

On 3/1/07, Adam Olsen <rhamph at gmail.com> wrote:
> On 3/1/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:

> >   class priorityqueue:
> Should inherit from object if this is going into 2.x.

> >       def __init__(self, elements=(), *,
> >           cmp=None, key=None, reversed=False)
> >       def add(self, element)
> >       def pop(self)  --> remove and return the min element
> >       def __iter__(self) --> (while self: yield self.pop())
> __iter__ shouldn't modify the container.

generators do not need to be reiterable.

lists are reiterable, but I wouldn't expect a socket (treated as a file) to be.

For a queue that is already sorted, the natural use case is to take
the task and do it, explicitly adding it back to the end if need be.
I would expect an operation that *didn't* remove things from the queue
to have view in the name somewhere.

-jJ


From collinw at gmail.com  Fri Mar  2 19:45:57 2007
From: collinw at gmail.com (Collin Winter)
Date: Fri, 2 Mar 2007 12:45:57 -0600
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To: <fb6fbf560703021037l752ae137uf9fe563ff062b93@mail.gmail.com>
References: <bb8868b90703011446k6e19b3a2s4783665c9fd896c0@mail.gmail.com>
	<aac2c7cb0703011637x6bf666ai8052ff8b547c9a3b@mail.gmail.com>
	<fb6fbf560703021037l752ae137uf9fe563ff062b93@mail.gmail.com>
Message-ID: <43aa6ff70703021045y5e6a1d50re19deac6d4d88fce@mail.gmail.com>

On 3/2/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 3/1/07, Adam Olsen <rhamph at gmail.com> wrote:
> > On 3/1/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
>
> > >   class priorityqueue:
> > Should inherit from object if this is going into 2.x.
>
> > >       def __init__(self, elements=(), *,
> > >           cmp=None, key=None, reversed=False)
> > >       def add(self, element)
> > >       def pop(self)  --> remove and return the min element
> > >       def __iter__(self) --> (while self: yield self.pop())
> > __iter__ shouldn't modify the container.
>
> generators do not need to be reiterable.
>
> lists are reiterable, but I wouldn't expect a socket (treated as a file) to be.
>
> For a queue that is already sorted, the natural use case is to take
> the task and do it, explicitly adding it back to the end if need be.
> I would expect an operation that *didn't* remove things from the queue
> to have view in the name somewhere.

I would be incredibly surprised if

for x in queue:
    ....

destroyed the queue. __iter__ should be implemented non-destructively.

Collin Winter


From phd at phd.pp.ru  Fri Mar  2 20:05:24 2007
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Fri, 2 Mar 2007 22:05:24 +0300
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To: <43aa6ff70703021045y5e6a1d50re19deac6d4d88fce@mail.gmail.com>
References: <bb8868b90703011446k6e19b3a2s4783665c9fd896c0@mail.gmail.com>
	<aac2c7cb0703011637x6bf666ai8052ff8b547c9a3b@mail.gmail.com>
	<fb6fbf560703021037l752ae137uf9fe563ff062b93@mail.gmail.com>
	<43aa6ff70703021045y5e6a1d50re19deac6d4d88fce@mail.gmail.com>
Message-ID: <20070302190524.GC21602@phd.pp.ru>

On Fri, Mar 02, 2007 at 12:45:57PM -0600, Collin Winter wrote:
> I would be incredibly surprised if
> 
> for x in queue:
>     ....
> 
> destroyed the queue. __iter__ should be implemented non-destructively.

   Impossible for external streams such as pipes and sockets.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.


From rhamph at gmail.com  Fri Mar  2 20:40:32 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 2 Mar 2007 12:40:32 -0700
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To: <fb6fbf560703021037l752ae137uf9fe563ff062b93@mail.gmail.com>
References: <bb8868b90703011446k6e19b3a2s4783665c9fd896c0@mail.gmail.com>
	<aac2c7cb0703011637x6bf666ai8052ff8b547c9a3b@mail.gmail.com>
	<fb6fbf560703021037l752ae137uf9fe563ff062b93@mail.gmail.com>
Message-ID: <aac2c7cb0703021140jb77f8f1td9548b436d072a19@mail.gmail.com>

On 3/2/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 3/1/07, Adam Olsen <rhamph at gmail.com> wrote:
> > __iter__ shouldn't modify the container.
>
> generators do not need to be reiterable.
>
> lists are reiterable, but I wouldn't expect a socket (treated as a file) to be.
>
> For a queue that is already sorted, the natural use case is to take
> the task and do it, explicitly adding it back to the end if need be.
> I would expect an operation that *didn't* remove things from the queue
> to have view in the name somewhere.

A priorityqueue seems far more like a list than a socket though.  I'd
be very surprised if __iter__ was destructive.  If destructive is the
only option I would prefer it be an explicit method.  Especially since
it's common to modify and reiterate over a priority queue, which I
believe never happens for files or sockets.
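
Something like this is what I have in mind (a sketch only; the names are
illustrative):

    import heapq

    class priorityqueue(object):
        def __init__(self, elements=()):
            self._heap = list(elements)
            heapq.heapify(self._heap)

        def __iter__(self):
            # Non-destructive: walk a sorted copy, leaving the heap intact.
            return iter(sorted(self._heap))

        def consume(self):
            # The explicitly destructive alternative.
            while self._heap:
                yield heapq.heappop(self._heap)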

-- 
Adam Olsen, aka Rhamphoryncus


From rrr at ronadam.com  Fri Mar  2 20:57:21 2007
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 02 Mar 2007 13:57:21 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion
	wart?)
In-Reply-To: <45DF7A91.8020402@canterbury.ac.nz>
References: <erlclc$ehu$1@sea.gmane.org>
	<d06a5cd30702221708j475301cate0fb810de6de73e7@mail.gmail.com>
	<erlf32$pob$1@sea.gmane.org>
	<3d2ce8cb0702221739r2fbca4ffy4825d02361748e13@mail.gmail.com>
	<erlgsv$2g8$1@sea.gmane.org> <45DEC783.8040601@canterbury.ac.nz>
	<45DF0C84.8020403@ronadam.com> <45DF7A91.8020402@canterbury.ac.nz>
Message-ID: <45E881A1.3020902@ronadam.com>


It seems this thread has run its course on python-dev, but having been on 
the road for several thousand miles, and having given it some thought during 
the trip, I'll go ahead and post my own conclusions for the record.

I think this would be a Python 3.0 suggestion only.


Greg Ewing wrote:
> Ron Adam wrote:
> 
>> Ok, so what if...  instead of bool being a type, it be a keyword that 
>> is just a nicer way to spell 'not not'?
> 
> The main benefit of having a bool type is so that its
> values display as "True" and "False" rather than whatever
> surrogate values you happen to have used to represent
> true and false.

This wouldn't change.  A bool keyword would still return True or False 
"bool" objects just as 'not' currently does.  It would just complete the 
set of keywords so we have...

    and, or   -->  flow control boolean operators

    not, bool  -->  boolean operators

Where 'bool' could generate more specific byte code like 'not' already does 
in some situations.


> The fact that bool() can be used canonicalise a truth
> value is secondary -- the same thing could just as well
> be achieved by a plain function (as it was before bool
> became a type). It's an extremely rare thing to need
> to do in any case.

It seems to me it is the main use, with testing for bool types as the 
secondary use.  All other bool object uses return or compare already 
created bool objects.


> If any change is to be made to bool in 3.0, my preference
> would be to keep it but make it a separate type.

I think the bool type should still continue to be a subtype of int.

The reason for making it its own type is to limit bool's functionality to 
cases that are more specific, in order to avoid some errors.  But I feel 
those errors are not really that common, and having bool be a subclass of 
int increases the situations where bool is useful in a nice way, i.e. 
there are more benefits than consequences.


> > Are 'and' and 'or' going to be changed in 3.0 to return bool types too?
>
> I very much hope not! Their current behaviour is very
> useful, and I would hate to lose it.

I agree.  I only asked to be sure.



A little review:

On python-dev, the question of whether or not bool conversion was a wart 
was brought up because of an apparent inconsistency in the value returned 
in some cases.  It was pointed out that there was no inconsistency, and 
that the current behavior was correct and consistent, as well as being 
desirable.

However, there are some people who feel there is room for improvement where 
bool is concerned.  I think this unsettled/incomplete feeling is due to a 
small inconsistency in how bool is used (rather than what it does) in 
relation to the 'and', 'or' and 'not' keywords.

The 'and' and 'or' keywords return either the left or right values 
depending on their bool interpretation by comparison methods.  This is not 
a problem, and it is both the expected and desired behavior.

    * There are both implicit (or equivalent) and explicit bool operations 
in Python.

The 'not' keyword returns a bool by calling a number's __nonzero__ method, 
or a container's __len__ method, which in turn is used to return either a 
True or False bool object.
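
For example (a quick sketch; __nonzero__ takes precedence over __len__
when both are defined):

 >>> class Box(object):
 ...     def __init__(self, items):
 ...         self.items = items
 ...     def __len__(self):    # consulted for truth when no __nonzero__
 ...         return len(self.items)
 ...
 >>> not Box([])
 True
 >>> bool(Box([1, 2]))
 True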

There is no bool keyword equivalent except 'not not', which isn't the most 
readable way to do it.  This is because any object can be used as an 
implicit bool, though there are situations where returning an explicitly 
constructed bool (True or False) expresses a clearer intent.



Interpretation:

The "small" inconsistency (if any) here is the 'not' keyword vs the 
'bool()' constructor. They pretty much do the same thing yet work in 
modestly different ways.

I believe this is a matter of both how the bool objects True and False came 
into Python and a matter of practicality.  Boolean objects are used 
frequently in loop tests, and it is very good that they work as fast as 
possible.  (In the case of 'not', different byte code is generated depending 
on how it is used.)

The 'not' operator preceded bool; otherwise we might have had a not() 
function or type that returns a bool object in its place, or bool would 
have been a keyword working in much the same way as not does now.

It appears the use of bools is becoming integrated even more closely with 
the interpreter's byte code, so in my opinion it makes more sense to make 
bool a keyword that works in the same way as 'not', rather than make 'not' 
a constructor (subclassed from bool) that returns a bool object.

The only difficulty I see with a bool keyword is the bool type would need 
to be spelled 'Bool' rather than 'bool'.  In most cases this change looks 
like it would be backwards compatible as well.

    bool(expression)  <==>   bool expression


True and False could then become protected built-in objects like None:

    True = Bool(1)
    False = Bool(0)


The documentation on None says the following.

Changed in version 2.4: None became a constant and is now recognized by the 
compiler as a name for the built-in object None. Although it is not a 
keyword, you cannot assign a different object to it.


PEP 3100 only says...

    * None becomes a keyword [4] (What about True, False?)

The reference [4] lists compatibility with Python 2.3 as a reason for not 
protecting True and False, but the general consensus from the various 
threads I've examined seems to be to have them protected.

    http://mail.python.org/pipermail/python-dev/2004-July/046294.html


Making True and False protected objects, along with making bool a keyword, 
may have some performance benefits, since in many cases the compiler could 
then directly return the already constructed True and False objects instead 
of creating new ones.  [It may already do this under the covers in some 
cases(?)]  This may or may not affect the case where 'while 1:' is faster 
than 'while True:'.

It is also helpful to look at some byte code examples to compare how 'not', 
'not not' and bool() work in different situations, and to see how they might 
change if bool were a keyword and True and False became protected 
objects.  It may be that in some (common?) cases the C code for some byte 
codes and objects could return True and False references directly.
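
For instance, a quick comparison can be made with the dis module (a
sketch; run it to see the byte code your own interpreter generates):

    import dis

    def not_not(x):
        return not not x    # two UNARY_NOT opcodes, no name lookup

    def via_bool(x):
        return bool(x)      # a LOAD_GLOBAL of 'bool' plus a CALL_FUNCTION

    dis.dis(not_not)
    dis.dis(via_bool)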

In my opinion this really just finishes up a process of change that is 
already started and ties up the loose ends.  It does not change what 
bool(), True, False, and not do in any major way.  It may create 
opportunities for further optimizations.

Cheers,
    Ron




From jcarlson at uci.edu  Fri Mar  2 23:16:03 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 02 Mar 2007 14:16:03 -0800
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion
	wart?)
In-Reply-To: <45E881A1.3020902@ronadam.com>
References: <45DF7A91.8020402@canterbury.ac.nz> <45E881A1.3020902@ronadam.com>
Message-ID: <20070302141412.AEB0.JCARLSON@uci.edu>


Ron Adam <rrr at ronadam.com> wrote:
> Interpretation:
> 
> The "small" inconsistency (if any) here is the 'not' keyword vs the 
> 'bool()' constructor. They pretty much do the same thing yet work in 
> modestly different ways.

Maybe I'm missing something, but is there a place where the following is
true?

    (not not x) != bool(x)

I can't think of any that I've ever come across.

 - Josiah



From greg.ewing at canterbury.ac.nz  Fri Mar  2 22:37:41 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 03 Mar 2007 10:37:41 +1300
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion
	wart?)
In-Reply-To: <45E881A1.3020902@ronadam.com>
References: <erlclc$ehu$1@sea.gmane.org>
	<d06a5cd30702221708j475301cate0fb810de6de73e7@mail.gmail.com>
	<erlf32$pob$1@sea.gmane.org>
	<3d2ce8cb0702221739r2fbca4ffy4825d02361748e13@mail.gmail.com>
	<erlgsv$2g8$1@sea.gmane.org> <45DEC783.8040601@canterbury.ac.nz>
	<45DF0C84.8020403@ronadam.com> <45DF7A91.8020402@canterbury.ac.nz>
	<45E881A1.3020902@ronadam.com>
Message-ID: <45E89925.5050500@canterbury.ac.nz>

Ron Adam wrote:
> 
> Greg Ewing wrote:
> 
>> The fact that bool() can be used canonicalise a truth
>> value is secondary
> 
> It seems to me it is the main use, with testing for bool types as the 
> secondary use.

What I meant was that this usage is not reason enough
on its own for bool to be a type. Given that bool is
already a type, it's convenient for its constructor
to also be the canonicalising function. It's secondary
as a reason for making bool a type, not in the sense
of how it's normally used.

> having bool be a 
> subclass of int increases the situations where bool is useful in a nice 
> way.

I still think that most of those uses could be covered
by giving bool an __index__ method.
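
Roughly what I mean (a sketch; 'Bool' is a stand-in name, and __index__
has been available since Python 2.5):

    class Bool(object):
        # Not an int subclass, but still usable wherever an integer
        # index is needed, thanks to __index__.
        def __init__(self, value):
            self._value = 1 if value else 0
        def __index__(self):
            return self._value
        def __nonzero__(self):
            return self._value == 1
        def __repr__(self):
            return 'True' if self._value else 'False'

    seq = ['no', 'yes']
    print seq[Bool(42)]    # prints 'yes'; Bool(42) indexes as 1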

> Making True and False protected objects along with making bool a keyword 
> may have some performance benefits since in many cases the compiler 
> could then directly return the already constructed True and False object 
> instead of creating new one(s).  [it may already do this under the 
> covers in some cases(?)]

I believe that True and False are treated as singletons,
so this should always happen.

I don't see anywhere enough benefit to be worth making
a bool keyword.

Optimisation of access to True and False is probably
better addressed through a general mechanism for optimising
lookups of globals, although I wouldn't object if they
were treated as reserved identifiers a la None.

Summary of my position on this:

+0   making bool a separate type with __index__ in 3.0
+0   treating True and False like None
-1   bool keyword

--
Greg


From rrr at ronadam.com  Sat Mar  3 01:17:32 2007
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 02 Mar 2007 18:17:32 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion
 wart?)
In-Reply-To: <20070302141412.AEB0.JCARLSON@uci.edu>
References: <45DF7A91.8020402@canterbury.ac.nz> <45E881A1.3020902@ronadam.com>
	<20070302141412.AEB0.JCARLSON@uci.edu>
Message-ID: <45E8BE9C.9010600@ronadam.com>

Josiah Carlson wrote:
> Ron Adam <rrr at ronadam.com> wrote:
>> Interpretation:
>>
>> The "small" inconsistency (if any) here is the 'not' keyword vs the 
>> 'bool()' constructor. They pretty much do the same thing yet work in 
>> modestly different ways.
> 
> Maybe I'm missing something, but is there a place where the following is
> true?
> 
>     (not not x) != bool(x)
> 
> I can't think of any that I've ever come across.
> 
>  - Josiah

I don't think you are missing anything.  I did say it was a *small* 
inconsistency in how they are used in relation to 'and', 'or' and 'not'. 
ie.. a constructor vs a keyword in a similar situation.

And I also pointed out it doesn't change what they do, it has more to do 
with how they work underneath and that there may be some benefits in 
changing this.


 >>> def foo(x):
...   return (not not x) != bool(x)
...
 >>> dis.dis(foo)
   2           0 LOAD_FAST                0 (x)
               3 UNARY_NOT
               4 UNARY_NOT
               5 LOAD_GLOBAL              0 (bool)
               8 LOAD_FAST                0 (x)
              11 CALL_FUNCTION            1
              14 COMPARE_OP               3 (!=)
              17 RETURN_VALUE



In this case the lines...

               5 LOAD_GLOBAL              0 (bool)
               8 LOAD_FAST                0 (x)
              11 CALL_FUNCTION            1


Would be replaced by...

               5 LOAD_FAST                0 (x)
               6 UNARY_BOOL


The compiler could also replace UNARY_NOT, UNARY_NOT pairs with a single 
UNARY_BOOL.

There may be other situations where this could result in different, more 
efficient byte code as well.

_Ron



From collinw at gmail.com  Sat Mar  3 01:27:23 2007
From: collinw at gmail.com (Collin Winter)
Date: Fri, 2 Mar 2007 18:27:23 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion
	wart?)
In-Reply-To: <45E8BE9C.9010600@ronadam.com>
References: <45DF7A91.8020402@canterbury.ac.nz> <45E881A1.3020902@ronadam.com>
	<20070302141412.AEB0.JCARLSON@uci.edu>
	<45E8BE9C.9010600@ronadam.com>
Message-ID: <43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>

On 3/2/07, Ron Adam <rrr at ronadam.com> wrote:
> Josiah Carlson wrote:
> > Ron Adam <rrr at ronadam.com> wrote:
> >> Interpretation:
> >>
> >> The "small" inconsistency (if any) here is the 'not' keyword vs the
> >> 'bool()' constructor. They pretty much do the same thing yet work in
> >> modestly different ways.
> >
> > Maybe I'm missing something, but is there a place where the following is
> > true?
> >
> >     (not not x) != bool(x)
> >
> > I can't think of any that I've ever come across.
> >
> >  - Josiah
>
> I don't think you are missing anything.  I did say it was a *small*
> inconsistency in how they are used in relation to 'and', 'or' and 'not'.
> ie.. a constructor vs a keyword in a similar situation.

'and', 'or' and 'not' are operators (and hence keywords) because
making them functions is incredibly ugly: and(or(a, b), not(c)).

Changing "bool(x)" to "bool x" introduces a much larger inconsistency
between bool and the other built-in types.

> And I also pointed out it doesn't change what they do, it has more to do
> with how they work underneath and that there may be some benefits in
> changing this.

So the main reason is for optimization? Are you really calling bool()
frequently in inner-loop code? Is a profiler telling you that bool()
is a bottleneck in your applications?

Collin Winter


From rrr at ronadam.com  Sat Mar  3 02:42:34 2007
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 02 Mar 2007 19:42:34 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion
 wart?)
In-Reply-To: <43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
References: <45DF7A91.8020402@canterbury.ac.nz> <45E881A1.3020902@ronadam.com>	
	<20070302141412.AEB0.JCARLSON@uci.edu>
	<45E8BE9C.9010600@ronadam.com>
	<43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
Message-ID: <45E8D28A.1020009@ronadam.com>

Collin Winter wrote:
> On 3/2/07, Ron Adam <rrr at ronadam.com> wrote:
>> Josiah Carlson wrote:
>> > Ron Adam <rrr at ronadam.com> wrote:
>> >> Interpretation:
>> >>
>> >> The "small" inconsistency (if any) here is the 'not' keyword vs the
>> >> 'bool()' constructor. They pretty much do the same thing yet work in
>> >> modestly different ways.
>> >
>> > Maybe I'm missing something, but is there a place where the 
>> following is
>> > true?
>> >
>> >     (not not x) != bool(x)
>> >
>> > I can't think of any that I've ever come across.
>> >
>> >  - Josiah
>>
>> I don't think you are missing anything.  I did say it was a *small*
>> inconsistency in how they are used in relation to 'and', 'or' and 'not'.
>> ie.. a constructor vs a keyword in a similar situation.
> 
> 'and', 'or' and 'not' are operators (and hence keywords) because
> making them functions is incredibly ugly: and(or(a, b), not(c)).

I agree.

> Changing "bool(x)" to "bool x" introduces a much larger inconsistency
> between bool and the other built-in types.

A bool keyword would not be a built-in type but an operator just like 
'and', 'or', and 'not'.  So it would be even more consistent.

Bool() with a capital 'B' would be the built-in type, and only the very 
*tiny* naming inconsistency of the capital 'B' would exist in relation to 
'int' and the other built-in types.

So I think this adds more consistency than it does inconsistency.

The performance benefits are a nice additional bonus. And how much that is 
(if it is a deciding factor) would need to be determined.


Note: The Bool() type as a constructor would hardly ever be needed with the 
presence of a bool operator.  So the capital "B" probably won't be a major 
problem for anyone.  If it is, it could be called boolean() instead.


>> And I also pointed out it doesn't change what they do, it has more to do
>> with how they work underneath and that there may be some benefits in
>> changing this.
> 
> So the main reason is for optimization? Are you really calling bool()
> frequently in inner-loop code? Is a profiler telling you that bool()
> is a bottleneck in your applications?

I'm not using it that way myself, but you are correct that the idea needs 
to be tested.

When 'not' is used with 'if' and 'while', it results in byte code specific 
to that situation instead of UNARY_NOT.  A bool operator may also work that 
way in other similar situations, so it may be a greater benefit than 
expected.  It needs research, I admit.

And I do realize "if bool x:" and "while bool x:" would not be the 
preferred spelling, and would result in the same *exact* byte code as "if 
x:" and "while x:", I believe.


This was a response to the expressed desire to change bool that started on 
python-dev, i.e. a question and some follow-up messages suggesting other, 
possibly greater, changes in semantics than this.  So it appears there is 
some consensus that something could be changed, but just what 
wasn't precisely determined or agreed on.  After thinking on it some, this 
is the best I could come up with.  ;-)

Consider this as 1 part aesthetics, .5 part performance.  And yes, these 
changes are *minor*; I've already stated that in quite a few places now. 
It's the type of thing that could be changed in Python 3000 if desired, but 
doesn't need to be changed if it is felt it shouldn't be.

So I have no problem if it's ruled out on the grounds that 'there is not 
sufficient need', or on grounds that "it's incorrect" in some other way.

;-)

_Ron









From collinw at gmail.com  Sat Mar  3 03:08:05 2007
From: collinw at gmail.com (Collin Winter)
Date: Fri, 2 Mar 2007 20:08:05 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion
	wart?)
In-Reply-To: <45E8D28A.1020009@ronadam.com>
References: <45DF7A91.8020402@canterbury.ac.nz> <45E881A1.3020902@ronadam.com>
	<20070302141412.AEB0.JCARLSON@uci.edu>
	<45E8BE9C.9010600@ronadam.com>
	<43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
	<45E8D28A.1020009@ronadam.com>
Message-ID: <43aa6ff70703021808h7febf3fau400f0172f5edcc82@mail.gmail.com>

On 3/2/07, Ron Adam <rrr at ronadam.com> wrote:
> Collin Winter wrote:
> > On 3/2/07, Ron Adam <rrr at ronadam.com> wrote:
> >> Josiah Carlson wrote:
> >> > Ron Adam <rrr at ronadam.com> wrote:
> >> >> Interpretation:
> >> >>
> >> >> The "small" inconsistency (if any) here is the 'not' keyword vs the
> >> >> 'bool()' constructor. They pretty much do the same thing yet work in
> >> >> modestly different ways.
> >> >
> >> > Maybe I'm missing something, but is there a place where the
> >> following is
> >> > true?
> >> >
> >> >     (not not x) != bool(x)
> >> >
> >> > I can't think of any that I've ever come across.
> >> >
> >> >  - Josiah
> >>
> >> I don't think you are missing anything.  I did say it was a *small*
> >> inconsistency in how they are used in relation to 'and', 'or' and 'not'.
> >> ie.. a constructor vs a keyword in a similar situation.
> >
> > 'and', 'or' and 'not' are operators (and hence keywords) because
> > making them functions is incredibly ugly: and(or(a, b), not(c)).
>
> I agree.
>
> > Changing "bool(x)" to "bool x" introduces a much larger inconsistency
> > between bool and the other built-in types.
>
> A bool keyword would not be a built-in type but an operator just like
> 'and', 'or', and 'not'.  So it would be even more consistent.
[snip]
> Bool() with a capital 'B' would be the built-in type, and only the very
> *tiny* naming inconsistency of the capital 'B' would exist in relation to
> 'int' and the other built-in types.
>
> So I think this adds more consistency than it does inconsistency.

Added consistency:
- Things related to booleans are operators (bool, not, and, or).

Added inconsistency:
- The Bool type does not follow the same naming convention as int,
float, dict, list, tuple and set.
- There's now a keyword that has 99% of the same spelling, fulfills
*fewer* of the same use cases and has the *exact* same semantics as a
built-in constructor/function.


That's a bizarre trade-off. -1000.

Collin Winter


From jcarlson at uci.edu  Sat Mar  3 03:18:25 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 02 Mar 2007 18:18:25 -0800
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion
	wart?)
In-Reply-To: <45E8D28A.1020009@ronadam.com>
References: <43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
	<45E8D28A.1020009@ronadam.com>
Message-ID: <20070302181642.AEB6.JCARLSON@uci.edu>


Ron Adam <rrr at ronadam.com> wrote:
> So I have no problem if it's ruled out on the grounds that 'there is not 
> sufficient need'

Ok.  YAGNI.  Seriously.  The performance advantage in real code, I
guarantee, isn't measurable.  Also, the inconsistency is subjective, and
I've never heard anyone complain before.

 - Josiah



From rrr at ronadam.com  Sat Mar  3 04:19:33 2007
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 02 Mar 2007 21:19:33 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion
 wart?)
In-Reply-To: <43aa6ff70703021808h7febf3fau400f0172f5edcc82@mail.gmail.com>
References: <45DF7A91.8020402@canterbury.ac.nz> <45E881A1.3020902@ronadam.com>	
	<20070302141412.AEB0.JCARLSON@uci.edu>
	<45E8BE9C.9010600@ronadam.com>	
	<43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>	
	<45E8D28A.1020009@ronadam.com>
	<43aa6ff70703021808h7febf3fau400f0172f5edcc82@mail.gmail.com>
Message-ID: <45E8E945.1070107@ronadam.com>

Collin Winter wrote:
> On 3/2/07, Ron Adam <rrr at ronadam.com> wrote:
>> Collin Winter wrote:
>> > On 3/2/07, Ron Adam <rrr at ronadam.com> wrote:
>> >> Josiah Carlson wrote:
>> >> > Ron Adam <rrr at ronadam.com> wrote:
>> >> >> Interpretation:
>> >> >>
>> >> >> The "small" inconsistency (if any) here is the 'not' keyword vs the
>> >> >> 'bool()' constructor. They pretty much do the same thing yet 
>> work in
>> >> >> modestly different ways.
>> >> >
>> >> > Maybe I'm missing something, but is there a place where the
>> >> following is
>> >> > true?
>> >> >
>> >> >     (not not x) != bool(x)
>> >> >
>> >> > I can't think of any that I've ever come across.
>> >> >
>> >> >  - Josiah
>> >>
>> >> I don't think you are missing anything.  I did say it was a *small*
>> >> inconsistency in how they are used in relation to 'and', 'or' and 
>> 'not'.
>> >> ie.. a constructor vs a keyword in a similar situation.
>> >
>> > 'and', 'or' and 'not' are operators (and hence keywords) because
>> > making them functions is incredibly ugly: and(or(a, b), not(c)).
>>
>> I agree.
>>
>> > Changing "bool(x)" to "bool x" introduces a much larger inconsistency
>> > between bool and the other built-in types.
>>
>> A bool keyword would not be a built-in type but an operator just like
>> 'and', 'or', and 'not'.  So it would be even more consistent.
> [snip]
>> Bool() with a capital 'B' would be the built-in type, and only the very
>> *tiny* naming inconsistency of the capital 'B' would exist in relation to
>> 'int' and the other built-in types.
>>
>> So I think this adds more consistency than it does inconsistency.
> 
> Added consistency:
> - Things related to booleans are operators (bool, not, and, or).

Yes

> Added inconsistency:
> - The Bool type does not follow the same naming convention as int,
> float, dict, list, tuple and set.

Ok, so name the type boolean instead.


> - There's now a keyword that has 99% of the same spelling, 

It can be good that it has the same *exact* semantics and spelling.  That 
makes it easier to migrate code.

> fulfills *fewer* of the same use cases

The only difference I can think of is in testing for a bool type.  boolean 
would need to be used instead of bool.


> and has the *exact* same semantics as a built-in constructor/function.

This is not an uncommon thing in python.

    [1,2,3]     <==>  list((1,2,3))
    {'a':1}     <==>  dict(a=1)

Currently:
    not not n   <==>   bool(n)
    not n       <==>   bool(not n)

Alternative:
    bool n      <==>   boolean(n)
    not n       <==>   boolean(not n)


I suspect the biggest reason against this suggestion is it changes the 
status quo.

As to whether or not it is a keyword, there is just as strong an argument 
against 'not' not being a keyword as there is for 'bool' being a keyword. 
But the addition of new keywords is historically something to be avoided 
with Python.  In this case, I feel boolean types have become such a 
fundamental object in Python that this just may be doable.



So with the above changes you have...

   * bool becomes an operator
   * boolean() replaces bool() as a constructor.


Consistency:
   + 1    Things related to booleans are operators (bool, not, and, or).

Performance:
   + .5   Compiler is able to generate more specific byte code.
          (This is probably only a small benefit so it only gets +.5)

Keyword:
   - 1    it *is* a new keyword, which is something to be avoided.

Name:
   + 1    The bool operator is the same as the old constructor
          enabling easy migration.

   - 1    A different name 'boolean' must be used to test for boolean types.


And the total is...  + .5

And if you add in a -1 for anything that changes the status quo, simply 
because people just don't like changes in general...

It then becomes ...  -.5



Does that sound like a fair evaluation?

<shrug> I tried.  ;-)

_Ron







From rrr at ronadam.com  Sat Mar  3 04:38:54 2007
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 02 Mar 2007 21:38:54 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion
 wart?)
In-Reply-To: <20070302181642.AEB6.JCARLSON@uci.edu>
References: <43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
	<45E8D28A.1020009@ronadam.com>
	<20070302181642.AEB6.JCARLSON@uci.edu>
Message-ID: <45E8EDCE.7000000@ronadam.com>

Josiah Carlson wrote:
> Ron Adam <rrr at ronadam.com> wrote:
>> So I have no problem if it's ruled out on the grounds that 'there is not 
>> sufficient need'
> 
> Ok.  YAGNI.  Seriously.  The performance advantage in real code, I
> guarantee, isn't measurable.  Also, the inconsistancy is subjective, and
> I've never heard anyone complain before.
> 
>  - Josiah

You are right.  It's more cosmetic than actual.  I basically said that from 
the start.

Not as a direct complaint, no.  It turns up more as a misunderstanding of 
how 'and' and 'or' work, or a misunderstanding of why bool returns what it 
does instead of what someone thinks it should, as in the thread that started 
this last week.  This addresses that in a subtle, indirect way by redefining 
bool as an operator.  Whether or not it would actually improve the ease of 
understanding how Python uses and interacts with boolean types and 
operators, I'm really not sure.

As far as performance gain, I was thinking that in combination with making 
True and False constants, it might be greater.  But it may very well not 
be.  Maybe sometime in the near future I will try to test just how much of 
a difference it could make.  In any case, the making of True and False into 
constants is still an open Python 3000 issue.

This was more exploratory and since nobody else jumped in and took it up, 
it seems it would be un-popular as well.  But it's here for the record in 
any case.

Cheers,
   Ron


From rrr at ronadam.com  Sat Mar  3 08:14:42 2007
From: rrr at ronadam.com (Ron Adam)
Date: Sat, 03 Mar 2007 01:14:42 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion
 wart?)
In-Reply-To: <20070302181642.AEB6.JCARLSON@uci.edu>
References: <43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
	<45E8D28A.1020009@ronadam.com>
	<20070302181642.AEB6.JCARLSON@uci.edu>
Message-ID: <45E92062.5010306@ronadam.com>

Josiah Carlson wrote:
> Ron Adam <rrr at ronadam.com> wrote:
>> So I have no problem if it's ruled out on the grounds that 'there is not 
>> sufficient need'
> 
> Ok.  YAGNI.  Seriously.  The performance advantage in real code, I
> guarantee, isn't measurable. 

It is measurable, but the total average gain would not be substantial, 
simply because it is a relatively rare operation.


The most common use of bool() seems to be in returning a bool value.

    return bool(exp)



A few timeit tests...


The bool constructor:

$ python2.5 -m timeit -n 10000000 'bool(1)'
10000000 loops, best of 3: 0.465 usec per loop

$ python2.5 -m timeit -n 10000000 'bool("a")'
10000000 loops, best of 3: 0.479 usec per loop

$ python2.5 -m timeit -n 10000000 'bool([1])'
10000000 loops, best of 3: 0.697 usec per loop


A bool operator would have the same (or faster) speed as the 'not' operator:

$ python2.5 -m timeit -n 10000000 'not 1'
10000000 loops, best of 3: 0.165 usec per loop

$ python2.5 -m timeit -n 10000000 'not "a"'
10000000 loops, best of 3: 0.164 usec per loop

$ python2.5 -m timeit -n 10000000 'not [1]'
10000000 loops, best of 3: 0.369 usec per loop



The real gain depends on how and where it is used, of course.  In most 
cases 'not not x' would still be much faster than bool(x).  It's nearly as 
fast as a single 'not', but really isn't the most readable way to do it.

$ python2.5 -m timeit -n 10000000 'not not 1'
10000000 loops, best of 3: 0.18 usec per loop

$ python2.5 -m timeit -n 10000000 'not not "a"'
10000000 loops, best of 3: 0.185 usec per loop

$ python2.5 -m timeit -n 10000000 'not not [1]'
10000000 loops, best of 3: 0.381 usec per loop



The following may be an indication of how much speed difference may still 
be gained in other related situations such as flow control or comparisons 
where bool values are used implicitly.

$ python2.5 -m timeit -n 10000000 'if "a": pass'
10000000 loops, best of 3: 0.0729 usec per loop

$ python2.5 -m timeit -n 10000000 'if not "a": pass'
10000000 loops, best of 3: 0.17 usec per loop

$ python2.5 -m timeit -n 10000000 'if bool("a"): pass'
10000000 loops, best of 3: 0.486 usec per loop


In the case of a bool operator, it's not a major issue because of how 
rarely it's actually used.

Cheers,
   Ron







From armin.ronacher at active-4.com  Sat Mar  3 09:25:19 2007
From: armin.ronacher at active-4.com (Armin Ronacher)
Date: Sat, 3 Mar 2007 08:25:19 +0000 (UTC)
Subject: [Python-ideas] if with as
Message-ID: <loom.20070303T092453-381@post.gmane.org>

Hi all,

Many languages allow assignment expressions in if conditions (Perl, PHP, Ruby,
C, etc.).  I know that this was dismissed by Guido because it can lead to
mistakes, and I can support that.  However, in some situations it might be
required, for example when you have to nest if blocks with regular expressions::

    while pos < text_length:
        if match = name_re.match(text, pos):
            pos = match.end()
            do_something(match)
        elif match = digit_re.match(text, pos):
            pos = match.end()
            do_something(match)
        else:
            pos += 1

Well, but that would require an assignment.  Why not use the "as" keyword
introduced in Python 2.5 with the future import::

    while pos < text_length:
        if name_re.match(text, pos) as match:
            pos = match.end()
            do_something(match)
        elif digit_re.match(text, pos) as match:
            pos = match.end()
            do_something(match)
        else:
            pos += 1

Advantages: Still no assignment expression, no additional keyword, simple to
understand.

Regards,
Armin



From jcarlson at uci.edu  Sat Mar  3 10:40:01 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 03 Mar 2007 01:40:01 -0800
Subject: [Python-ideas] if with as
In-Reply-To: <loom.20070303T092453-381@post.gmane.org>
References: <loom.20070303T092453-381@post.gmane.org>
Message-ID: <20070303011935.AEBC.JCARLSON@uci.edu>


Armin Ronacher <armin.ronacher at active-4.com> wrote:
> 
> Hi all,
> 
> Many languages allow assignment expressions in if conditions (Perl, PHP, Ruby,
> C, etc.).  I know that this was dismissed by Guido because it can lead to
> mistakes, and I can support that.  However, in some situations it might be
> required, for example when you have to nest if blocks with regular expressions::
[snip]

You could convert the code you have offered into the following:

    while pos < text_length:
        match = name_re.match(text, pos) or \
                digit_re.match(text, pos) or \
                None
        if match:
            pos = match.end()
            do_something(match)
        else:
            pos += 1

Not only does it work today in any Python with the re module, it has the
same number of lines as what you provided and doesn't repeat itself, and it
can be shortened with a simple helper function.
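
For example, the helper might look something like this (hypothetical,
just to show the shape of it):

    def first_match(text, pos, regexes):
        # Return the first regex match at pos, or None.
        for regex in regexes:
            match = regex.match(text, pos)
            if match:
                return match
        return None

after which the assignment collapses to a single
"match = first_match(text, pos, (name_re, digit_re))".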

-1

Also, I think it would be confusing.  It is currently used in imports
and the with statement, but it was *necessary* to have some assignment
semantic in the with statement - there is no such necessity in if/elif (or
even while, which is the next step after if/elif).

 - Josiah



From greg.ewing at canterbury.ac.nz  Sat Mar  3 12:04:28 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 04 Mar 2007 00:04:28 +1300
Subject: [Python-ideas] if with as
In-Reply-To: <20070303011935.AEBC.JCARLSON@uci.edu>
References: <loom.20070303T092453-381@post.gmane.org>
	<20070303011935.AEBC.JCARLSON@uci.edu>
Message-ID: <45E9563C.8060002@canterbury.ac.nz>

Josiah Carlson wrote:

> You could convert the code you have offered into the following:
> 
>     while pos < text_length:
>         match = name_re.match(text, pos) or \
>                 digit_re.match(text, pos) or \
>                 None
>         if match:
>             pos = match.end()
>             do_something(match)
>         else:
>             pos += 1

I think the idea was that the do_something(match) could
be a different thing for each re.

--
Greg


From armin.ronacher at active-4.com  Sat Mar  3 12:13:14 2007
From: armin.ronacher at active-4.com (Armin Ronacher)
Date: Sat, 3 Mar 2007 11:13:14 +0000 (UTC)
Subject: [Python-ideas] if with as
References: <loom.20070303T092453-381@post.gmane.org>
	<20070303011935.AEBC.JCARLSON@uci.edu>
	<45E9563C.8060002@canterbury.ac.nz>
Message-ID: <loom.20070303T121245-305@post.gmane.org>

Greg Ewing <greg.ewing at ...> writes:

> 
> I think the idea was that the do_something(match) could
> be a different thing for each re.
> 
> --
> Greg
> 

Yes, it is.

Regards,
Armin



From talin at acm.org  Sat Mar  3 19:50:46 2007
From: talin at acm.org (Talin)
Date: Sat, 03 Mar 2007 10:50:46 -0800
Subject: [Python-ideas] if with as
In-Reply-To: <loom.20070303T092453-381@post.gmane.org>
References: <loom.20070303T092453-381@post.gmane.org>
Message-ID: <45E9C386.6020303@acm.org>

Armin Ronacher wrote:
> Hi all,
> 
> Many languages allow assignment expressions in if conditions (Perl, PHP, Ruby,
> C, etc.).  I know that this was dismissed by Guido because it can lead to
> mistakes, and I can support that.  However, in some situations it might be
> required, for example when you have to nest if blocks with regular expressions::
> 
>     while pos < text_length:
>         if match = name_re.match(text, pos):
>             pos = match.end()
>             do_something(match)
>         elif match = digit_re.match(text, pos):
>             pos = match.end()
>             do_something(match)
>         else:
>             pos += 1
> 
> Well, but that would require an assignment.  Why not use the "as" keyword
> introduced in Python 2.5 with the future import::
> 
>     while pos < text_length:
>         if name_re.match(text, pos) as match:
>             pos = match.end()
>             do_something(match)
>         elif digit_re.match(text, pos) as match:
>             pos = match.end()
>             do_something(match)
>         else:
>             pos += 1
> 
> Advantages: Still no assignment expression, no additional keyword, simple to
> understand.

Personally, I like it - it's an issue that I've brought up before, but 
your syntax is better. With the introduction of 2.5's "with A as B", and 
the new exception-handling syntax in Py3K 'except E as v' (and the 
already existing import syntax), it seems to me that we are, in fact, 
establishing a general rule that:

	<keyword> <expression> as <variable>:

...is a common syntactical pattern in Python, meaning "do something 
special with expression, and then as a side effect, assign that 
expression to the named variable for this block".

-- Talin


From guido at python.org  Sun Mar  4 00:56:35 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 3 Mar 2007 15:56:35 -0800
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion
	wart?)
In-Reply-To: <ca471dc20703031555u302b93d1m77cbfd8a1cdfc6b8@mail.gmail.com>
References: <43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
	<45E8D28A.1020009@ronadam.com> <20070302181642.AEB6.JCARLSON@uci.edu>
	<45E92062.5010306@ronadam.com>
	<ca471dc20703031555u302b93d1m77cbfd8a1cdfc6b8@mail.gmail.com>
Message-ID: <ca471dc20703031556p7dd75913wb50ec349dbef0f66@mail.gmail.com>

FWIW, there's zero chance that bool will change in Py3k. Well, unless
I get hit by a bus before it's released. It sounds like a typical
bikeshed color discussion, not worth the electrons killed to transmit
the posts.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From larry at hastings.org  Mon Mar  5 08:52:07 2007
From: larry at hastings.org (Larry Hastings)
Date: Sun, 04 Mar 2007 23:52:07 -0800
Subject: [Python-ideas] if with as
In-Reply-To: <45E9C386.6020303@acm.org>
References: <loom.20070303T092453-381@post.gmane.org>
	<45E9C386.6020303@acm.org>
Message-ID: <45EBCC27.7090008@hastings.org>

Talin wrote:
> Armin Ronacher wrote:
>   
>>         if name_re.match(text, pos) as match:
>>     
> Personally, I like it - [...] it seems to me that we are, in fact, 
> establishing a general rule that:
>
> 	<keyword> <expression> as <variable>:
>
> ...is a common syntactical pattern in Python, meaning "do something 
> special with expression, and then as a side effect, assign that 
> expression to the named variable for this block".
I like it too.  However: unlike "except x as e" and "with a as b", the 
indented blocks under if and else don't establish a new scope.  So it 
raises the question: what is the lifetime of the variable created by "if x 
as y" here?  If it magically goes away, even as you create new variables 
in the scope, that just seems a little too weird to me.  If it outlives 
the nested block under the if, that's weird too.

(Personally I'd prefer it if the blocks under if and else *did* 
establish a new scope, but I know that's never going to change.)

Cheers,

/larry/


From free.condiments at gmail.com  Mon Mar  5 20:37:06 2007
From: free.condiments at gmail.com (Sam)
Date: Mon, 5 Mar 2007 19:37:06 +0000
Subject: [Python-ideas] if with as
In-Reply-To: <45EBCC27.7090008@hastings.org>
References: <loom.20070303T092453-381@post.gmane.org>
	<45E9C386.6020303@acm.org> <45EBCC27.7090008@hastings.org>
Message-ID: <b1c02c610703051137p6f1e2692xae818351e8a3f44e@mail.gmail.com>

(apologies to Larry - the GMail interface is still confusing and arbitrary)

On 05/03/07, Larry Hastings <larry at hastings.org> wrote:
> I like it too.  However: unlike "except x as e" and "with a as b", the
> indented blocks under if and else don't establish a new scope.  So it
> raises the question: what is the lifetime of the variable created by "if x
> as y" here?  If it magically goes away, even as you create new variables
> in the scope, that just seems a little too weird to me.  If it outlives
> the nested block under the if, that's weird too.

I could be using a different definition of 'scope' than you, but:

>>> try:
       raise Exception("foo")
except Exception, e:
       pass
>>> print e
foo

seems to suggest that except blocks don't set up a new scope to me.

>>> with file('C:/foo.txt') as f:
       pass

>>> print f
<closed file 'C:/foo.txt', mode 'r' at 0x00B98800>

says the same thing about the with statement to me.

So the variable defined with an if <expr> as <name>: statement
outliving the end of the indented block would be absolutely no
surprise according to the current semantics of other statements.

--Sam


From larry at hastings.org  Mon Mar  5 23:19:40 2007
From: larry at hastings.org (Larry Hastings)
Date: Mon, 05 Mar 2007 14:19:40 -0800
Subject: [Python-ideas] if with as
In-Reply-To: <b1c02c610703051137p6f1e2692xae818351e8a3f44e@mail.gmail.com>
References: <loom.20070303T092453-381@post.gmane.org>	<45E9C386.6020303@acm.org>
	<45EBCC27.7090008@hastings.org>
	<b1c02c610703051137p6f1e2692xae818351e8a3f44e@mail.gmail.com>
Message-ID: <45EC977C.3020808@hastings.org>

Sam wrote:
> except blocks don't set up a new scope to me.
> [...second code example deleted...]
> says the same thing about the with statement to me.
>   
(What version of Python are you using, a 2.6 build?  In 2.5 release, a 
file() is not a context manager, so 'with file() as f:' doesn't work.  
And in py3k the "except E, N" syntax is gone.)

First off, you are absolutely right, the "with" and "except" statements 
do not open new scopes.  I was wrong about that.

But PEP 3110, "Catching Exceptions in Python 3000", describes "except E as N" 
this way:

    try:
        try_body
    except E as N:
        except_body
    ...

    gets translated to (in Python 2.5 terms)

    try:
        try_body
    except E, N:
        try:
            except_body
        finally:
            N = None
            del N
    ...

    http://www.python.org/dev/peps/pep-3110/

So the intent for "except E as N" is that N does *not* outlive the 
"except" block.  And indeed that's what happens in my month-old Py3k.

Also, in your example "with file(...) as f:", outside the "with" block 
"f" evaluated to "<closed file 'C:/foo.txt', mode 'r' at 0x00B98800>".  
The file was closed because "with file(...) as f:" wasn't simply 
assigning the result of the "with" expression to f; it assigns to f the 
result of calling __enter__() on the with expression's result.  It 
seems that post-Python 2.5 file.__enter__() returns the file.  This 
isn't a bad thing, but it's a little misleading as "with A as X" doesn't 
usually wind up with X containing the "result" of A.

So I can state: in Py3k, in both cases of "except E as N" and "with E as 
N", after the nested block has exited, N does not contain the direct 
result of E.


I guess I'm starting to disagree with sprinkling "as" into the syntax as 
the de facto inline assignment operator.  Quoting from PEP 343:

    So now the final hurdle was that the PEP 310 syntax:

        with VAR = EXPR:
            BLOCK1

    would be deceptive, since VAR does *not* receive the value of
    EXPR.  Borrowing from PEP 340, it was an easy step to:

        with EXPR as VAR:
             BLOCK1

    http://www.python.org/dev/peps/pep-0343/

Clearly the original intent with "as" was that it was specifically *not* 
direct assignment.  Python added a *new keyword*, specifically because 
this was not direct inline assignment.  So I assert a Python programmer 
should not see "as" and think "= with the two sides swapped".


For my final trick, I will attempt to channel Guido.  (Which is spooky, 
as he hasn't even Crossed Over To The Other Side.)  Python was 
originally written in C, the poster child for inline assignment 
operators, so he was obviously familiar with the syntactic construct.  
But Python does *not* support inline assignment everywhere.  AFAIK it 
specifically has support for it in one place: allowing multiple 
assignments at once.  This works:

    a = b = file(...)

But this does not:

    if a = expression:
        # hooray, expression worked
    else:
        # boo, expression failed

Since nobody likes to type, I'm sure people have requested inline 
assignment zillions of times before.  Since Python does not support it, 
I'm guessing Guido doesn't want it for some reason.  Perhaps it's to 
save programmers from the inevitable heartache of typing "if a = x" when 
they meant "if a == x" (or vice-versa); if so then that's a little 
surprising, considering the "consenting adults" design tenet, but at 
least "if E as N" would not have that problem.  If there was some other 
reason, then I bet "if E as N" doesn't address it.

Knock three times,


/larry/

From talin at acm.org  Mon Mar  5 23:40:55 2007
From: talin at acm.org (Talin)
Date: Mon, 05 Mar 2007 14:40:55 -0800
Subject: [Python-ideas] Javascript Destructuring Assignment
Message-ID: <45EC9C77.10603@acm.org>

As you probably know, the next version of Javascript (1.7) will have a 
number of ideas that have been borrowed from Python. In particular, the 
"tuple assignment" syntax will now be supported in Javascript, which 
will be a pleasant addition to the language.

However, I noticed that the Javascript version is, in some respects, a 
superset of the Python functionality.

If you are interested, you might have a look at this page:

    http://developer.mozilla.org/en/docs/New_in_JavaScript_1.7

Go to the section called "Destructuring assignment" to check out how the 
new syntax is going to work.

As an example of what I mean, the Javascript unpacking syntax will allow 
variables to be skipped:

   [a,,b] = [1,2,3]

In other words, a is assigned the value 1, the value 2 is thrown away, 
and b is assigned the value 3. In today's Python, this requires a dummy 
variable.
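
For comparison, the usual Python spelling with a conventional throwaway 
name (just the idiom the new syntax would replace):

    a, _, b = [1, 2, 3]
    print a, b    # 1 3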

I admit that this is not a particularly important feature; however, 
given that Javascript is being inspired by Python in this case, maybe it 
would be appropriate to return the favor?

-- Talin



From greg.ewing at canterbury.ac.nz  Tue Mar  6 00:00:33 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 06 Mar 2007 12:00:33 +1300
Subject: [Python-ideas] if with as
In-Reply-To: <45EC977C.3020808@hastings.org>
References: <loom.20070303T092453-381@post.gmane.org>
	<45E9C386.6020303@acm.org> <45EBCC27.7090008@hastings.org>
	<b1c02c610703051137p6f1e2692xae818351e8a3f44e@mail.gmail.com>
	<45EC977C.3020808@hastings.org>
Message-ID: <45ECA111.60708@canterbury.ac.nz>

Larry Hastings wrote:

> So the intent for "except E as N" is that N does *not* outlive the 
> "except" block.  And indeed that's what happens in my month-old Py3k.

That's only because the traceback is being attached
to the exception, and there's a desire to avoid
creating a cycle. There are indications that this
idea might be dropped, in which case deleting the
exception would no longer be necessary.

In any case, it's still not really introducing
a new scope, since if you use 'e' anywhere else
in the function, it's the same variable.

> I'm sure people have requested inline 
> assignment zillions of times before.  Since Python does not support it, 
> I'm guessing Guido doesn't want it for some reason.

I believe the main objection is that it would
reintroduce the potential for accidentally
writing

   if a = b:

instead of

   if a == b:

and having it go undetected. Using an 'as' clause
would be one way of avoiding that. Using a different
operator, such as

   if a := b:

would be another way.

A further possible objection is that allowing in-line
assignments anywhere in any expression could lead to
hard-to-follow code. That could be mitigated by only
allowing them in certain places, such as the conditions
of if and while statements.

 > Clearly the original intent with "as" was that it was
 > specifically *not* direct assignment.

Each usage of "as" is unique. It's whatever it needs to
be in each case. In general it's not direct assignment,
but that doesn't mean that some of its uses couldn't be
direct assignment if we wanted.

--
Greg


From brett at python.org  Tue Mar  6 00:12:58 2007
From: brett at python.org (Brett Cannon)
Date: Mon, 5 Mar 2007 15:12:58 -0800
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <45EC9C77.10603@acm.org>
References: <45EC9C77.10603@acm.org>
Message-ID: <bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>

On 3/5/07, Talin <talin at acm.org> wrote:
> As you probably know, the next version of Javascript (1.7) will have a
> number of ideas that have been borrowed from Python. In particular, the
> "tuple assignment" syntax will now be supported in Javascript, which
> will be a pleasant addition to the language.
>
> However, I noticed that the Javascript version is, in some respects, a
> superset of the Python functionality.
>
> If you are interested, you might have a look at this page:
>
>     http://developer.mozilla.org/en/docs/New_in_JavaScript_1.7
>
> Go to the section called "Destructuring assignment" to check out how the
> new syntax is going to work.
>
> As an example of what I mean, the Javascript unpacking syntax will allow
> variables to be skipped:
>
>    [a,,b] = [1,2,3]
>
> In other words, a is assigned the value 1, the value 2 is thrown away,
> and b is assigned the value 3. In today's Python, this requires a dummy
> variable.

It skips in the middle?!?  Yuck.

>
> I admit that this is not a particularly important feature; however,
> given that Javascript is being inspired by Python in this case, maybe it
> would be appropriate to return the favor?
>

I personally am -1 on the idea.  Explicit is better than implicit.

-Brett


From greg.ewing at canterbury.ac.nz  Tue Mar  6 00:48:09 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 06 Mar 2007 12:48:09 +1300
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>
References: <45EC9C77.10603@acm.org>
	<bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>
Message-ID: <45ECAC39.4090908@canterbury.ac.nz>

Brett Cannon wrote:

> I personally am -1 on the idea.  Explicit is better than implicit.

One thing I *would* like to see, that Javascript doesn't
seem to have either yet, is a *-arg on the end of the
unpacking list:

   a, b, *c = [1, 2, 3, 4, 5]

giving a == 1, b == 2, c == [3, 4, 5].
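
Until something like that exists, the closest spelling today is slicing 
(a minimal workaround sketch):

    seq = [1, 2, 3, 4, 5]
    a, b, c = seq[0], seq[1], seq[2:]
    print a, b, c    # 1 2 [3, 4, 5]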

--
Greg


From brett at python.org  Tue Mar  6 01:31:09 2007
From: brett at python.org (Brett Cannon)
Date: Mon, 5 Mar 2007 16:31:09 -0800
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <45ECAC39.4090908@canterbury.ac.nz>
References: <45EC9C77.10603@acm.org>
	<bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>
	<45ECAC39.4090908@canterbury.ac.nz>
Message-ID: <bbaeab100703051631ua93c14fta636e9cdbf89eb3d@mail.gmail.com>

On 3/5/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Brett Cannon wrote:
>
> > I personally am -1 on the idea.  Explicit is better than implicit.
>
> One thing I *would* like to see, that Javascript doesn't
> seem to have either yet, is a *-arg on the end of the
> unpacking list:
>
>    a, b, *c = [1, 2, 3, 4, 5]
>
> giving a == 1, b == 2, c == [3, 4, 5].
>

Now I know that was discussed on python-dev once, but I don't remember
why it didn't end up happening.

-Brett


From jason.orendorff at gmail.com  Tue Mar  6 01:55:10 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Mon, 5 Mar 2007 19:55:10 -0500
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <45EC9C77.10603@acm.org>
References: <45EC9C77.10603@acm.org>
Message-ID: <bb8868b90703051655s544823cej39d7119317ca9ff6@mail.gmail.com>

On 3/5/07, Talin <talin at acm.org> wrote:
> As you probably know, the next version of Javascript (1.7) will have a
> number of ideas that have been borrowed from Python. [...]

JavaScript 1.7 is the current version.  It shipped with Firefox 2.0.

-j


From jason.orendorff at gmail.com  Tue Mar  6 04:17:22 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Mon, 5 Mar 2007 22:17:22 -0500
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <bb8868b90703051655s544823cej39d7119317ca9ff6@mail.gmail.com>
References: <45EC9C77.10603@acm.org>
	<bb8868b90703051655s544823cej39d7119317ca9ff6@mail.gmail.com>
Message-ID: <bb8868b90703051917k153b31a2hb46332be35a30048@mail.gmail.com>

To me, the interesting development on the JavaScript front is Tamarin,
the new JavaScript VM contributed by Adobe.  Tamarin contains a
just-in-time compiler.

I don't know how many people here have seriously looked at the
JavaScript engine, but it's not *that* different from Python.  I know
compiling to machine code has been discussed here, and the consensus
is that it's not worth it.  I'm wondering: how can this be right for
JavaScript and wrong for Python?  Is Mozilla making a mistake?

-j


From jimjjewett at gmail.com  Tue Mar  6 18:47:05 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 6 Mar 2007 12:47:05 -0500
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>
References: <45EC9C77.10603@acm.org>
	<bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>
Message-ID: <fb6fbf560703060947wb23a766m6ced9d773568071b@mail.gmail.com>

On 3/5/07, Brett Cannon <brett at python.org> wrote:
> On 3/5/07, Talin <talin at acm.org> wrote:
> >     http://developer.mozilla.org/en/docs/New_in_JavaScript_1.7
...

> >    [a,,b] = [1,2,3]
> >
> > In other words, a is assigned the value 1, the value 2 is thrown away,
> > and b is assigned the value 3. In today's Python, this requires a dummy
> > variable.

> It skips in the middle?!?  Yuck.

Are you assuming variable-length skips, so that

    [a,,b] = [1,2,3,4,5]   would mean a=1;b=5 ?

I had read it as just not requiring a name for the unused dummy
variable.  It doesn't really seem worse than

    [a, _junk, b] = [1,2,3]

or less explicit than

    [a,_,b] = [1,2,3]

-jJ


From jcarlson at uci.edu  Tue Mar  6 19:25:30 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 06 Mar 2007 10:25:30 -0800
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <bb8868b90703051917k153b31a2hb46332be35a30048@mail.gmail.com>
References: <bb8868b90703051655s544823cej39d7119317ca9ff6@mail.gmail.com>
	<bb8868b90703051917k153b31a2hb46332be35a30048@mail.gmail.com>
Message-ID: <20070306102331.74A3.JCARLSON@uci.edu>


"Jason Orendorff" <jason.orendorff at gmail.com> wrote:
> To me, the interesting development on the JavaScript front is Tamarin,
> the new JavaScript VM contributed by Adobe.  Tamarin contains a
> just-in-time compiler.
> 
> I don't know how many people here have seriously looked at the
> JavaScript engine, but it's not *that* different from Python.  I know
> compiling to machine code has been discussed here, and the consensus
> is that it's not worth it.  I'm wondering: how can this be right for
> JavaScript and wrong for Python?  Is Mozilla making a mistake?

For reference: psyco is a JIT compiler for Python, and I believe PyPy
has a JIT compiler (or possibly could, as I believe it has a backend for
LLVM, which is used to generate object code in GCC/G++).

 - Josiah



From brett at python.org  Tue Mar  6 20:26:20 2007
From: brett at python.org (Brett Cannon)
Date: Tue, 6 Mar 2007 11:26:20 -0800
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <fb6fbf560703060947wb23a766m6ced9d773568071b@mail.gmail.com>
References: <45EC9C77.10603@acm.org>
	<bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>
	<fb6fbf560703060947wb23a766m6ced9d773568071b@mail.gmail.com>
Message-ID: <bbaeab100703061126v3cbfa38fs6fe035a43888d8c@mail.gmail.com>

On 3/6/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 3/5/07, Brett Cannon <brett at python.org> wrote:
> > On 3/5/07, Talin <talin at acm.org> wrote:
> > >     http://developer.mozilla.org/en/docs/New_in_JavaScript_1.7
> ...
>
> > >    [a,,b] = [1,2,3]
> > >
> > > In other words, a is assigned the value 1, the value 2 is thrown away,
> > > and b is assigned the value 3. In today's Python, this requires a dummy
> > > variable.
>
> > It skips in the middle?!?  Yuck.
>
> Are you assuming variable-length skips, so that
>
>     [a,,b] = [1,2,3,4,5]   would mean a=1;b=5 ?
>

Yep, that's how I read it.

> I had read it is just not requiring a name for the unused dummy
> variable.  It doesn't really seem worse than
>
>     [a, _junk, b] = [1,2,3]
>
> or less explicit than
>
>     [a,_,b] = [1,2,3]

Right, but why the ends?  And what if that example had three variables
to unpack to, e.g., a, b, and c?  With a = 1 and c = 5, what does b
get?  2, 4, 3?  It can be interpreted in so many ways that it's ambiguous
without referencing the documentation.

-Brett


From tdelaney at avaya.com  Tue Mar  6 23:10:55 2007
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Wed, 7 Mar 2007 09:10:55 +1100
Subject: [Python-ideas] Javascript Destructuring Assignment
Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1ECED@au3010avexu1.global.avaya.com>

Brett Cannon wrote:

>> Are you assuming variable-length skips, so that
>> 
>>     [a,,b] = [1,2,3,4,5]   would mean a=1;b=5 ?
>> 
> 
> Yep, that's how I read it.

I read it differently. I think you need to have a comma for each
skipped element:

    [a,,,b] = [1,2,3,4,5]

Anyone with in-depth knowledge, or Firefox 2.0 available to test this?

Tim Delaney


From greg.ewing at canterbury.ac.nz  Tue Mar  6 23:20:36 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 07 Mar 2007 11:20:36 +1300
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <bb8868b90703051917k153b31a2hb46332be35a30048@mail.gmail.com>
References: <45EC9C77.10603@acm.org>
	<bb8868b90703051655s544823cej39d7119317ca9ff6@mail.gmail.com>
	<bb8868b90703051917k153b31a2hb46332be35a30048@mail.gmail.com>
Message-ID: <45EDE934.9040100@canterbury.ac.nz>

Jason Orendorff wrote:
> Tamarin contains a just-in-time compiler.
> 
> I'm wondering: how can this be right for
> JavaScript and wrong for Python?

Javascript is a much simpler language than Python,
with far fewer potential data types. Maybe that
makes it easier to compile efficiently. Most
likely it makes the task of writing a compiler
much easier.

Or, who knows, maybe it isn't right for Javascript
either. Just because someone does something
doesn't necessarily mean it's a good idea.

--
Greg


From thobes at gmail.com  Wed Mar  7 00:17:57 2007
From: thobes at gmail.com (Tobias Ivarsson)
Date: Wed, 7 Mar 2007 00:17:57 +0100
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1ECED@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5FF1ECED@au3010avexu1.global.avaya.com>
Message-ID: <9997d5e60703061517v41bb961fq42bfa659710c18d4@mail.gmail.com>

On 3/6/07, Delaney, Timothy (Tim) <tdelaney at avaya.com> wrote:
>
> Brett Cannon wrote:
>
> >> Are you assuming variable-length skips, so that
> >>
> >>     [a,,b] = [1,2,3,4,5]   would mean a=1;b=5 ?
> >>
> >
> > Yep, that's how I read it.
>
> I read it differently. I think you need to have a comma for each
> skipped element:
>
>     [a,,,b] = [1,2,3,4,5]
>
> Anyone with in-depth knowledge, or Firefox 2.0 available to test this?


Firefox 2.0  gives me these results:
[a,,b] = [1,2,3,4];
a == 1
b == 3
[a,,,b] = [1,2,3,4];
a == 1
b == 4
[,a,,b] = [1,2,3,4];
a == 2
b == 4

/Tobias Ivarsson


From bwinton at latte.ca  Wed Mar  7 01:15:24 2007
From: bwinton at latte.ca (Blake Winton)
Date: Tue, 06 Mar 2007 19:15:24 -0500
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1ECED@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5FF1ECED@au3010avexu1.global.avaya.com>
Message-ID: <45EE041C.1020909@latte.ca>

(re-sending, because I forgot to subscribe first...)

Delaney, Timothy (Tim) wrote:
> Brett Cannon wrote:
>>> Are you assuming variable-length skips, so that
>>>     [a,,b] = [1,2,3,4,5]   would mean a=1;b=5 ?
>> Yep, that's how I read it.
> I read it differently. I think you need to have a comma for each
> skipped element:
>     [a,,,b] = [1,2,3,4,5]
> Anyone with in-depth knowledge, or Firefox 2.0 available to test this?

Oddly enough, I built a Javascript 1.7 interpreter just yesterday...

js> [a,b] = [1,2]
1,2
js> a
1
js> b
2
js> [a,b] = [1,2,3]
1,2,3
js> a
1
js> b
2
js> [a,,b] = [1,2,3,4,5]
1,2,3,4,5
js> a
1
js> b
3
js> [a,,,b] = [1,2,3,4,5]
1,2,3,4,5
js> a
1
js> b
4

which is pretty much what I'd expect.  Having said that, it does make
it easy to mess up in exactly the way you messed up in your example...

Also, for the curious:
js> [a,[b,c],d] = [1,2,3,4]
1,2,3,4
js> a
1
js> b
js> c
js> d
3

I'd far prefer it to throw an error if you tried to unpack into a
sequence of the wrong length, but that's not the Javascript way, I
suppose...

Later,
Blake.



From rrr at ronadam.com  Wed Mar  7 07:13:24 2007
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 07 Mar 2007 00:13:24 -0600
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <bbaeab100703051631ua93c14fta636e9cdbf89eb3d@mail.gmail.com>
References: <45EC9C77.10603@acm.org>	<bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>	<45ECAC39.4090908@canterbury.ac.nz>
	<bbaeab100703051631ua93c14fta636e9cdbf89eb3d@mail.gmail.com>
Message-ID: <45EE5804.9080301@ronadam.com>

Brett Cannon wrote:
> On 3/5/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>> Brett Cannon wrote:
>>
>>> I personally am -1 on the idea.  Explicit is better than implicit.
>> One thing I *would* like to see, that Javascript doesn't
>> seem to have either yet, is a *-arg on the end of the
>> unpacking list:
>>
>>    a, b, *c = [1, 2, 3, 4, 5]
>>
>> giving a == 1, b == 2, c == [3, 4, 5].
>>
> 
> Now I know that was discussed on python-dev once, but I don't remember
> why it didn't end up happening.

(If I recall correctly) There was some support for using the '*' outside of 
function signatures.  I think it died out because of too many alternative 
suggestions.  Or there was some sort of ambiguous situation I'm not 
remembering, possibly confusion with the multiply operator.

I have mixed feelings on it myself.  The reason being that, to me, using the 
'*' for both packing and unpacking is not the most readable solution.

Also the '*' syntax can't be used to unpack nested items.

 >>> data = [1, [2, [3, 4]]]
 >>> a, (b, (c, d)) = data
 >>> print a, b, c, d
1 2 3 4


Ron


From greg.ewing at canterbury.ac.nz  Wed Mar  7 09:34:34 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 07 Mar 2007 21:34:34 +1300
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <45EE5804.9080301@ronadam.com>
References: <45EC9C77.10603@acm.org>
	<bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>
	<45ECAC39.4090908@canterbury.ac.nz>
	<bbaeab100703051631ua93c14fta636e9cdbf89eb3d@mail.gmail.com>
	<45EE5804.9080301@ronadam.com>
Message-ID: <45EE791A.8000307@canterbury.ac.nz>

Ron Adam wrote:

>> I have mixed feelings on it myself.  

As far as I remember, there weren't really any objective
reasons given against it, just gut feelings of dislike
from some people, including Guido. I'm living in hopes
that he may come round to the idea in time (as he did
with conditional expressions, for example).

 > The reason being, (to me), using the '*' for both packing
 > and unpacking is not the most readable solution.

To me it doesn't seem any less readable than using
[...,...] or (...,...) for both packing and unpacking,
or using * in both function definitions and calls. In
fact the symmetry is a large part of the beauty of
the idea.

> Also the '*' syntax can't be used to unpack nested items.

I'm not sure what you mean by that.

    >>> [[a, b, *c], [d, e, *f], *g] = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10]

    >>> print a, b, c, d, e, f, g
    1 2 [3, 4] 5 6 [7, 8] [9, 10]

Makes perfectly good sense to me.

--
Greg


From scott+python-ideas at scottdial.com  Sun Mar  4 10:23:22 2007
From: scott+python-ideas at scottdial.com (Scott Dial)
Date: Sun, 04 Mar 2007 04:23:22 -0500
Subject: [Python-ideas] if with as
In-Reply-To: <45E9C386.6020303@acm.org>
References: <loom.20070303T092453-381@post.gmane.org>
	<45E9C386.6020303@acm.org>
Message-ID: <45EA900A.7010905@scottdial.com>

Talin wrote:
> Personally, I like it - it's an issue that I've brought up before, but 
> your syntax is better. With the introduction of 2.5's "with A as B", and 
> the new exception-handling syntax in Py3K 'except E as v' (and the 
> already existing import syntax), it seems to me that we are, in fact, 
> establishing a general rule that:
> 
> 	<keyword> <expression> as <variable>:
> 
> ...is a common syntactical pattern in Python, meaning 'do something 
> special with expression, and then as a side effect, assign that 
> expression to the named variable for this block."

While I am not going to advocate it, I would like to point out that 
these are all just broken versions of an infix assignment operator[1]. 
As Josiah pointed out, they are used right now in places where explicit 
assignment is not possible. I don't believe you will ever successfully 
push such an operator through when it could easily be done explicitly.

As for this particular case, it is only useful in a very restricted set 
of expressions and I was only able to find a handful of cases in stdlib 
where I could drop in a "if x as y". I believe this is an indication of 
how rarely one wants to do this. YMMV.

-Scott

[1] An infix assignment operator would, grammatically, add to 'test' the 
form: test 'as' testlist. This is just a can of worms yielding pure 
obfuscation.

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu


From g.brandl at gmx.net  Wed Mar  7 11:19:59 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 07 Mar 2007 11:19:59 +0100
Subject: [Python-ideas] if with as
In-Reply-To: <45EA900A.7010905@scottdial.com>
References: <loom.20070303T092453-381@post.gmane.org>	<45E9C386.6020303@acm.org>
	<45EA900A.7010905@scottdial.com>
Message-ID: <esm3kg$82u$1@sea.gmane.org>

Scott Dial schrieb:
> Talin wrote:
>> Personally, I like it - it's an issue that I've brought up before, but 
>> your syntax is better. With the introduction of 2.5's "with A as B", and 
>> the new exception-handling syntax in Py3K 'except E as v' (and the 
>> already existing import syntax), it seems to me that we are, in fact, 
>> establishing a general rule that:
>> 
>> 	<keyword> <expression> as <variable>:
>> 
>> ...is a common syntactical pattern in Python, meaning 'do something 
>> special with expression, and then as a side effect, assign that 
>> expression to the named variable for this block."
> 
> While I am not going to advocate it, I would like to point out that 
> these are all just broken versions of an infix assignment operator[1]. 
> As Josiah pointed out, they are used right now in places where explicit 
> assignment is not possible. I don't believe you will ever successfully 
> push such an operator through when it could easily be done explicitly.

The existing uses for "as" are all different. The ones with "except" and
"with" do not just assign the left hand expression to the right hand name,
but the one with "import" does.

> As for this particular case, it is only useful in a very restricted set 
> of expressions and I was only able to find a handful of cases in stdlib 
> where I could drop in a "if x as y". I believe this is an indication of 
> how rarely one wants to do this. YMMV.

It does here. I can't present use cases right now, but I do recall several
times where I thought this could have made the intent of the code clearer;
I think I even proposed it myself some time ago.

Georg



From jimjjewett at gmail.com  Wed Mar  7 16:48:05 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 7 Mar 2007 10:48:05 -0500
Subject: [Python-ideas] while with as [was: if with as]
Message-ID: <fb6fbf560703070748u7bb57d56me5198e497ab1c814@mail.gmail.com>

Scott Dial schrieb:
> Talin wrote:
...
>> already existing import syntax), it seems to me that we are, in fact,
>> establishing a general rule that:

>>      <keyword> <expression> as <variable>:

>> ...is a common syntactical pattern in Python, meaning 'do something
>> special with expression, and then as a side effect, assign that
>> expression to the named variable for this block."

Rather, assign the result of the do-something.

> these are all just broken versions of an infix assignment operator[1].

Not quite, because the precise do-something is implicit in the keywords.

With "import", you assign the imported module (or object), rather than
the name (string) that was the expression itself.

With "with", you assign the result of the enter call, rather than the
context object itself.

With except, you assign the caught instance, rather than the tuple of
classes that caught it.

By strict analogy, in

    while expr as myvar

myvar should be the result of calling bool(expr).  Fortunately, that
is so useless that people would understand the slight change by
analogy to "and" and "or", which return the full value rather than the
boolean to which it maps.

    while expr as myvar:   # proposal
        foo(myvar)
    <==>

    while (myvar=expr; bool(myvar)):    # except that we can't have inline statements
        foo(myvar)
    <==>

    # current python workaround -- though normally we use object-specific
    # knowledge to keep it from being quite this ugly.
    _indirect = [None]              # must be pre-sized; [] would IndexError
    def _test_and_set(val):
        _indirect[0] = val
        return val
    while _test_and_set(expr):      # was "val"; the loop evaluates expr
        foo(_indirect[0])

> As for this particular case, it is only useful in a very restricted set
> of expressions and I was only able to find a handful of cases in stdlib
> where I could drop in a "if x as y".

I have wanted it, though not nearly so often as I have wanted it for
the "while" loop.

The workarounds with break/continue or changing to a for-loop always bother me.

That said, I'm still not sure it would be worth the cost, because
people might start trying to write

    while (1,2,3)

and expecting an implicit iteration; the confusion to beginners
*might* outweigh the benefits.

-jJ


From g.brandl at gmx.net  Wed Mar  7 17:20:18 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 07 Mar 2007 17:20:18 +0100
Subject: [Python-ideas] while with as [was: if with as]
In-Reply-To: <fb6fbf560703070748u7bb57d56me5198e497ab1c814@mail.gmail.com>
References: <fb6fbf560703070748u7bb57d56me5198e497ab1c814@mail.gmail.com>
Message-ID: <esmoo3$mpa$1@sea.gmane.org>

Jim Jewett schrieb:

>> As for this particular case, it is only useful in a very restricted set
>> of expressions and I was only able to find a handful of cases in stdlib
>> where I could drop in a "if x as y".
> 
> I have wanted it, though not nearly so often as I have wanted it for
> the "while" loop.
> 
> The workarounds with break/continue or changing to a for-loop always bother me.
> 
> That said, I'm still not sure it would be worth the cost, because
> people might start trying to write
> 
>     while (1,2,3)
> 
> and expecting an implicit iteration; the confusion to beginners
> *might* outweigh the benefits.

Hm, why would anyone write that because of the new "as" syntax?

Georg



From jimjjewett at gmail.com  Wed Mar  7 18:09:53 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 7 Mar 2007 12:09:53 -0500
Subject: [Python-ideas] while with as [was: if with as]
In-Reply-To: <esmoo3$mpa$1@sea.gmane.org>
References: <fb6fbf560703070748u7bb57d56me5198e497ab1c814@mail.gmail.com>
	<esmoo3$mpa$1@sea.gmane.org>
Message-ID: <fb6fbf560703070909w4431f9f1ofde572eed4ebd517@mail.gmail.com>

On 3/7/07, Georg Brandl <g.brandl at gmx.net> wrote:
> Jim Jewett schrieb:
> > That said, I'm still not sure it would be worth the cost, because
> > people might start trying to write

> >     while (1,2,3)

> > and expecting an implicit iteration; the confusion to beginners
> > *might* outweigh the benefits.

> Hm, why would anyone write that because of the new "as" syntax?

It isn't always clear (particularly to a beginner, or someone coming
from another programming language) when to use "while" and when to use
"for".  I've seen plenty of C code with for loops that don't increment
a counter -- they could easily be while loops.

I imagine getting into it somehow along the following lines;

    # OK, I want to go through the list.
    # I need a loop.  "while" gives me a loop.

    while [1, 2, 3] as num:
        print num

    # wait, I don't really need the number for this, I just need to get this
    # stupid thing to loop.  Maybe if I take out the "as"?

    while [1,2,3]:
        print "got in"

Obviously, you can say that the right answer here is a for loop

    for num in [1, 2, 3]: ...

but I'm not sure how hard it would be to explain to a new user.  It
may be that I'm still thinking pre-file-iterators, and newbies won't
have a problem ... but I'm not confident of that.

-jJ


From rrr at ronadam.com  Wed Mar  7 20:37:55 2007
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 07 Mar 2007 13:37:55 -0600
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <45EE791A.8000307@canterbury.ac.nz>
References: <45EC9C77.10603@acm.org>	<bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>	<45ECAC39.4090908@canterbury.ac.nz>	<bbaeab100703051631ua93c14fta636e9cdbf89eb3d@mail.gmail.com>	<45EE5804.9080301@ronadam.com>
	<45EE791A.8000307@canterbury.ac.nz>
Message-ID: <45EF1493.9090906@ronadam.com>

Greg Ewing wrote:
> Ron Adam wrote:
> 
>> I have mixed feelings on it myself.  
> 
> As far as I remember, there weren't really any objective
> reasons given against it, just gut feelings of dislike
> from some people, including Guido. I'm living in hopes
> that he may come round to the idea in time (as he did
> with conditional expressions, for example).
> 
>  > The reason being, (to me), using the '*' for both packing
>  > and unpacking is not the most readable solution.
> 
> To me it doesn't seem any less readable than using
> [...,...] or (...,...) for both packing and unpacking,
> or using * in both function definitions and calls. In
> fact the symmetry is a large part of the beauty of
> the idea.
> 
>> Also the '*' syntax can't be used to unpack nested items.
> 
> I'm not sure what you mean by that.
> 
>     >>> [[a, b, *c], [d, e, *f], *g] = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10]
> 
>     >>> print a, b, c, d, e, f, g
>     1 2 [3, 4] 5 6 [7, 8] [9, 10]
> 
> Makes perfectly good sense to me.

Didn't say it didn't.

Symmetry is not always the best solution.  Sometimes asymmetry is good 
because it can communicate a different context more clearly. That is more 
of a 'human' issue than a machine one.  My opinion is in regard to what 
would be better for me.  It's not a right or wrong point of view and may 
not be better for others.



Hmmm... I think there might be an idea related to this of separating 
formatting from the assignment in such a way that the destination isn't 
specific to the source structure. (food for thought?)

For example: *what if* a sort of string formatting style were used to 
repack objects at their destination?

Where '%%' means repack this object this way at the destination, '[]' is 
unpack, '*' is pack, and ',' are used to indicate place holders.

The example from above could then be...

    >>> data = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10]
    >>> a, b, c, d, e, f, g = "[[,,*],[,,*],*]" %% data

    >>> print a, b, c, d, e, f, g
    1 2 [3, 4] 5 6 [7, 8] [9, 10]


A chained operation would need to be done in this next example.  It works 
from left to right.

    >>> data = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10]
    >>> a, b, c, d = ",,,*" %% "[,[],,]" %% data
    >>> print a, b, c, d
    [1, 2, 3, 4] 5 6 [7, 8, 9, 10]

Possibly the repack specifiers would be pushed onto a stack then pulled 
back off to do the actual repacking at assignment time. (?)


These are just examples to illustrate a concept.  The point here is to 
allow for the separation of the data structure knowledge from assignments 
while still being an operation that can happen at the destination.  That 
may avoid creating intermediate objects.

This type of abstraction may make it easier to interface different types of 
objects and data structures dynamically.

Of course a function could be made to do this, but it would likely be much 
much slower.

     a, b, c = repack(data, repack_specifier)

Ron



From free.condiments at gmail.com  Wed Mar  7 20:59:58 2007
From: free.condiments at gmail.com (Sam)
Date: Wed, 7 Mar 2007 19:59:58 +0000
Subject: [Python-ideas] if with as
In-Reply-To: <45EA900A.7010905@scottdial.com>
References: <loom.20070303T092453-381@post.gmane.org>
	<45E9C386.6020303@acm.org> <45EA900A.7010905@scottdial.com>
Message-ID: <b1c02c610703071159q6bf6e782pb3daadcb993c37a5@mail.gmail.com>

On 04/03/07, Scott Dial <scott+python-ideas at scottdial.com> wrote:
> As for this particular case, it is only useful in a very restricted set
> of expressions and I was only able to find a handful of cases in stdlib
> where I could drop in a "if x as y". I believe this is an indication of
> how rarely one wants to do this. YMMV.

I have a concrete use case. My hobby MUD server needs to do a lot of
parsing of players' input, as you'd expect. I use pyparsing to
construct the grammar, but I then have the slightly hairy problem of
dispatching to the correct method. Let's consider a single command,
'target', where the rest of the line can take one of three valid
forms: 'set $name to word list here', 'clear $name', and 'list'.
Presently, the code looks like this (minus some permission-checking
complications and room sanity checks):

def targetDistributor(actor, text):
    try:
        name, target = target_set_pattern.parseString(text)
    except ParseException:
        pass
    else:
        targetSet(actor, name.lower(), target)
        return
    try:
        name, = target_clear_pattern.parseString(text)
    except ParseException:
        pass
    else:
        targetClear(actor, name.lower())
        return
    try:
        target_list_pattern.parseString(text)
    except ParseException:
        pass
    else:
        targetList(actor)
        return
    badSyntax(actor)

Yuck. But, let's rewrite this using as-syntax and two new functions:

# for patterns which return no useful results, but just need to match
# (whose results, when bool-ified, result in False)
def matchnoresults(pattern, string):
    try:
        pattern.parseString(string)
    except ParseException:
        return False
    return True

def matchwithresults(pattern, string):
    try:
        res = pattern.parseString(string)
    except ParseException:
        return False
    return res

def targetDistributor(actor, rest, info):
    if matchwithresults(target_set_pattern, rest) as name, target:
        targetSet(actor, name, target)
    elif matchwithresults(target_clear_pattern, rest) as name,:
        targetClear(actor, name)
    elif matchnoresults(target_list_pattern, rest):
        targetList(actor)
    else:
        badSyntax(actor)

I do think that the majority of use cases will be parsing, or in
similar cases where one needs to both test for success and obtain
results from the test.


From jcarlson at uci.edu  Wed Mar  7 22:43:17 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 07 Mar 2007 13:43:17 -0800
Subject: [Python-ideas] if with as
In-Reply-To: <b1c02c610703071159q6bf6e782pb3daadcb993c37a5@mail.gmail.com>
References: <45EA900A.7010905@scottdial.com>
	<b1c02c610703071159q6bf6e782pb3daadcb993c37a5@mail.gmail.com>
Message-ID: <20070307132522.74B6.JCARLSON@uci.edu>


Sam <free.condiments at gmail.com> wrote:
> On 04/03/07, Scott Dial <scott+python-ideas at scottdial.com> wrote:
> > As for this particular case, it is only useful in a very restricted set
> > of expressions and I was only able to find a handful of cases in stdlib
> > where I could drop in a "if x as y". I believe this is an indication of
> > how rarely one wants to do this. YMMV.
[snip]
> I do think that the majority of use cases will be parsing, or in
> similar cases where one needs to both test for success and obtain
> results from the test.

I'm going to agree with Scott Dial on this.  You aren't going to need it
often enough to warrant this syntax change.  Coupled with the potential
confusion that Jim Jewett pointed out, this particular suggestion
smells like a misfeature.

Also, with the nonlocal declaration that is going to be making it into
Python 3 (or was it 2.6?), you can use a closure without a list to do
the same thing.

    def foo():
        result = None
        def do_something(...):
            nonlocal result
            ...
            result = 8
            return True
        
        if do_something(...):
            #do something with result
        ...

Then again, closures, assignments in while/if, etc., all smell to me
like a way of getting the features of object semantics, without actually
using objects.  Take your final example and use an object instead.

def targetDistributor(actor, rest, info):
    matcher = Matcher()

    if matcher.match(target_set_pattern, rest):
        name, target = matcher.result
        targetSet(actor, name, target)
    elif matcher.match(target_clear_pattern, rest):
        name, = matcher.result
        targetClear(actor, name)
    elif matcher.match(target_list_pattern, rest):
        targetList(actor)
    else:
        badSyntax(actor)

You know what?  That works *today* AND is clearer and more concise than
your original code.  Even better, it doesn't require a language syntax
change.

Try using an object for this.  You may find that it can do everything
that the assignment thing could do.
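
For completeness, the Matcher above only takes a few lines (a sketch; 
the class isn't from any library, and it assumes pyparsing's 
ParseException):

    from pyparsing import ParseException

    class Matcher(object):
        def __init__(self):
            self.result = None
        def match(self, pattern, string):
            # keep the parse results on success so the caller can
            # unpack them; signal failure with a plain False
            try:
                self.result = pattern.parseString(string)
            except ParseException:
                self.result = None
                return False
            return True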


 - Josiah



From jimjjewett at gmail.com  Wed Mar  7 23:14:06 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 7 Mar 2007 17:14:06 -0500
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <45EF1493.9090906@ronadam.com>
References: <45EC9C77.10603@acm.org>
	<bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>
	<45ECAC39.4090908@canterbury.ac.nz>
	<bbaeab100703051631ua93c14fta636e9cdbf89eb3d@mail.gmail.com>
	<45EE5804.9080301@ronadam.com> <45EE791A.8000307@canterbury.ac.nz>
	<45EF1493.9090906@ronadam.com>
Message-ID: <fb6fbf560703071414t7d2f7bbfx6753dbbe5130b840@mail.gmail.com>

On 3/7/07, Ron Adam <rrr at ronadam.com> wrote:
> The example from above could then be...
>
>     >>> data = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10]
>     >>> a, b, c, d, e, f, g = "[[,,*],[,,*],*]" %% data
>
>     >>> print a, b, c, d, e, f, g
>     1 2 [3, 4] 5 6 [7, 8] [9, 10]

...

>     >>> data = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10]
>     >>> a, b, c, d = ",,,*" %% "[,[],,]" %% data
>     >>> print a, b, c, d
>     [1, 2, 3, 4] 5 6 [7, 8, 9, 10]

...

> This type of abstraction may make it easier to interface different types of
> objects and data structures dynamically.

I can see how this might be made efficient.

I'm not seeing how I could ever maintain code that used it.

-jJ


From greg.ewing at canterbury.ac.nz  Thu Mar  8 00:08:18 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 08 Mar 2007 12:08:18 +1300
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <45EF1493.9090906@ronadam.com>
References: <45EC9C77.10603@acm.org>
	<bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>
	<45ECAC39.4090908@canterbury.ac.nz>
	<bbaeab100703051631ua93c14fta636e9cdbf89eb3d@mail.gmail.com>
	<45EE5804.9080301@ronadam.com> <45EE791A.8000307@canterbury.ac.nz>
	<45EF1493.9090906@ronadam.com>
Message-ID: <45EF45E2.6010308@canterbury.ac.nz>

Ron Adam wrote:
> Greg Ewing wrote:
> 
>> Ron Adam wrote:
>>
>>> Also the '*' syntax can't be used to unpack nested items.
>>
>> Makes perfectly good sense to me.
> 
> Didn't say it didn't.

Then I still don't know what you meant by your original
comment. What kind of nested item unpacking would you
like to do that the * syntax wouldn't handle?

> Symmetry is not always the best solution.  Sometimes asymmetry is good 
> because it can communicate a different context more clearly.

Perhaps, but why do you think that the symmetry we already
have between packing and unpacking is okay, but a new
symmetry involving * wouldn't be okay?

--
Greg


From rrr at ronadam.com  Thu Mar  8 05:23:08 2007
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 07 Mar 2007 22:23:08 -0600
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <fb6fbf560703071414t7d2f7bbfx6753dbbe5130b840@mail.gmail.com>
References: <45EC9C77.10603@acm.org>	
	<bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>	
	<45ECAC39.4090908@canterbury.ac.nz>	
	<bbaeab100703051631ua93c14fta636e9cdbf89eb3d@mail.gmail.com>	
	<45EE5804.9080301@ronadam.com> <45EE791A.8000307@canterbury.ac.nz>	
	<45EF1493.9090906@ronadam.com>
	<fb6fbf560703071414t7d2f7bbfx6753dbbe5130b840@mail.gmail.com>
Message-ID: <45EF8FAC.2080306@ronadam.com>

Jim Jewett wrote:
> On 3/7/07, Ron Adam <rrr at ronadam.com> wrote:
>> The example from above could then be...
>>
>>     >>> data = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10]
>>     >>> a, b, c, d, e, f, g = "[[,,*],[,,*],*]" %% data
>>
>>     >>> print a, b, c, d, e, f, g
>>     1 2 [3, 4] 5 6 [7, 8] [9, 10]
> 
> ...
> 
>>     >>> data = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10]
>>     >>> a, b, c, d = ",,,*" %% "[,[],,]" %% data
>>     >>> print a, b, c, d
>>     [1, 2, 3, 4] 5 6 [7, 8, 9, 10]
> 
> ...
> 
>> This type of abstraction may make it easier to interface different 
>> types of
>> objects and data structures dynamically.
> 
> I can see how this might be made efficient.
> 
> I'm not seeing how I could ever maintain code that used it.

Yes, that would be a problem.

(And I need to resist posting things like this because then I feel 
obligated to try to explain them.  I.e., wait 24 hours, and if it still 
seems like a good idea, then post it.)


Regarding maintaining code:

For interfacing purposes you would probably define the re-packers with the 
data and not where you actually use them.  And use in-line comments as well.

Trying a different format.   (This can't be a serious proposal unless it 
can be made simple to understand.)

      '[]'  for unpack items, these can be nested.
      '.'   a single unchanged item
      '.n'  for next n unchanged items
      '*n'  for packing next n items
      '*'   pack the rest of the items, use at the end only.
      ''    Null, don't change anything

* Doing it in two steps is the easy way, unpack+repack.
* Both unpacking/repacking can be done in one step
   if the items are aligned.
* partial unpacking/repacking is perfectly ok.
   (if it does what you want)


data_source_v1 = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10]
unpack_v1 = "[[][]..]"
repack_to_v2 = "..*2..*2*"   # 1,2,[3,4],5,6,[7,8],[9,10]

data_source_v2 = [1, 2, [3, 4], 5, 6, [7, 8], [9, 10]]
unpack_v2 = "[..[]..[][]]"
repack_to_v1 = "*4*4.."      # [1,2,3,4],[5,6,7,8],9,10


def interface_v1(data, unpack='', repack=''):
     a, b, c, d = repack %% unpack %% data
     # Do something with a thru d.

def interface_v2(data, unpack='', repack=''):
     a, b, c, d, e, f, g = repack %% unpack %% data
     # do something with a thru g


# use data sources with interface v1

interface_v1(data_source_v1)       # use v1 data as is

interface_v1(data_source_v2, unpack_v2, repack_to_v1)


# use data sources with interface V2

interface_v2(data_source_v2)       # use v2 data as is

interface_v2(data_source_v1, unpack_v1, repack_to_v2)


Decorators probably fit this use better currently, I think.

Cheers,
   Ron


From jcarlson at uci.edu  Thu Mar  8 06:52:42 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 07 Mar 2007 21:52:42 -0800
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <45EF8FAC.2080306@ronadam.com>
References: <fb6fbf560703071414t7d2f7bbfx6753dbbe5130b840@mail.gmail.com>
	<45EF8FAC.2080306@ronadam.com>
Message-ID: <20070307215028.74D1.JCARLSON@uci.edu>


Ron Adam <rrr at ronadam.com> wrote:
[snip]
> Trying a different format.   (This can't be a serious proposal unless it 
> can be made simple to understand.)
> 
>       '[]'  for unpack items, these can be nested.
>       '.'   a single unchanged item
>       '.n'  for next n unchanged items
>       '*n'  for packing next n items
>       '*'   pack the rest of the items, use at the end only.
>       ''    Null, don't change anything

Anything with a . or * is going to be *very* confusing to write and
maintain.

 - Josiah



From rrr at ronadam.com  Thu Mar  8 06:54:10 2007
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 07 Mar 2007 23:54:10 -0600
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <45EF45E2.6010308@canterbury.ac.nz>
References: <45EC9C77.10603@acm.org>
	<bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>
	<45ECAC39.4090908@canterbury.ac.nz>
	<bbaeab100703051631ua93c14fta636e9cdbf89eb3d@mail.gmail.com>
	<45EE5804.9080301@ronadam.com> <45EE791A.8000307@canterbury.ac.nz>
	<45EF1493.9090906@ronadam.com> <45EF45E2.6010308@canterbury.ac.nz>
Message-ID: <45EFA502.6070305@ronadam.com>

Greg Ewing wrote:
> Ron Adam wrote:
>> Greg Ewing wrote:
>>
>>> Ron Adam wrote:
>>>
>>>> Also the '*' syntax can't be used to unpack nested items.
>>>
>>> Makes perfectly good sense to me.
>>
>> Didn't say it didn't.
> 
> Then I still don't know what you meant by your original
> comment. What kind of nested item unpacking would you
> like to do that the * syntax wouldn't handle?

This is what I was originally referring to.

     data = [1, 2, [3, 4], 5]
     a, b, c, d, e = data        # can't unpack [3,4] here with *

     a, b, [c, d], e  = data     # must use [] on the left side instead


>> Symmetry is not always the best solution.  Sometimes asymmetry is good 
>> because it can communicate a different context more clearly.
> 
> Perhaps, but why do you think that the symmetry we already
> have between packing and unpacking is okay, but a new
> symmetry involving * wouldn't be okay?

The * is used mostly with names, and those tend to look more alike, in more 
situations, than the equivalent packing or unpacking using () or [].  The () 
and [] are much more explicit in both packing and unpacking operations.

I like the packing and unpacking features of python very much, it's just I 
would like it a bit more if the '*' symbol for packing and unpacking were 
different in this particular case.  And especially so if '*' packing and 
unpacking features are used in a more general way.

     def foo(*args, **kwds):    # packs here

     bar(&args, &&kwds):        # unpacks here

     a, b, *c = a, b, c, d      # packs here

     a, b, c = &items           # unpacks here


It would just be easier for me to see and keep in my head while I'm working 
with it.  I'm not sure I can explain it better than that, or say exactly 
why, as it probably has more to do with how my brain works than whether it 
is okay or not okay in any programming sense.

Is that clearer?  (I don't expect this to be changed)

Ron


From greg.ewing at canterbury.ac.nz  Sat Mar 10 08:53:28 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 10 Mar 2007 20:53:28 +1300
Subject: [Python-ideas] Syntax to help with currying
In-Reply-To: <bbaeab100703091823y7b2ea950l806998518d795cd5@mail.gmail.com>
References: <20070228205212.GD5537@performancedrivers.com>
	<20070310001054.GN18179@performancedrivers.com>
	<bbaeab100703091823y7b2ea950l806998518d795cd5@mail.gmail.com>
Message-ID: <45F263F8.6000004@canterbury.ac.nz>

Brett Cannon wrote:
>    def register(cls, account_id, form_id):
>        def inner(to_decorate):
>            ...
>         return inner

Thinking the other day about how this pattern arises quite
frequently when doing currying like this, and wondering
what sort of syntax might make it easier, I came up with

   def f():
     return g():
       ...

--
Greg


From talin at acm.org  Sun Mar 11 08:51:11 2007
From: talin at acm.org (Talin)
Date: Sat, 10 Mar 2007 23:51:11 -0800
Subject: [Python-ideas] Mutable chars objects
Message-ID: <45F3B4EF.1010103@acm.org>

Something that I have wanted in Python for a long time is something like 
the Java StringBuffer class - a mutable buffer, with string-like 
methods, that holds characters instead of bytes.

I do a lot of stuff with parsing, and it's often convenient to build up 
long strings of text one character at a time. Doing this with strings in 
Python is obviously not the way to go, since each time you append a 
character you have to construct a new string object. Doing it with lists 
is better, except that you still have to pay the overhead of the dynamic 
typing information for each character.
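
(By "doing it with lists" I mean the usual accumulate-and-join idiom; 
"source" below is just a stand-in for whatever is being scanned:)

    chunks = []
    for ch in source:
        chunks.append(ch)     # cheap amortized appends
    text = ''.join(chunks)    # one string allocation at the end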

Also, unlike a list or an array, you'd ideally want something that has 
string-like methods, such as toupper() and so on. Calling str( buffer ) 
should create a string of the contents of the buffer, not generate a 
repr() of the object which is what would happen if you call str() on a 
list or array. Passing this buffer to 'print' should also just print the 
characters. Similarly, you ought to be able to comparisons between the 
mutable buffer and a real string; slices of the buffer should be 
strings, not lists, and so on.

In other words - it ought to act pretty much like STL strings.

Also, the class ought to be optimized for single-character appending; it 
should be smart enough to grow memory in right-sized chunks.  And no, 
there's no particular reason why the memory needs to be contiguous, 
although it could be.

Originally, I had thought that such a class might be called 'characters' 
(to correspond with 'bytes' in Python 3000), but it could just as easily 
be called strbuffer or something else.

-- Talin


From larry at hastings.org  Sun Mar 11 09:56:44 2007
From: larry at hastings.org (Larry Hastings)
Date: Sun, 11 Mar 2007 00:56:44 -0800
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <bbaeab100703051631ua93c14fta636e9cdbf89eb3d@mail.gmail.com>
References: <45EC9C77.10603@acm.org>	<bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>	<45ECAC39.4090908@canterbury.ac.nz>
	<bbaeab100703051631ua93c14fta636e9cdbf89eb3d@mail.gmail.com>
Message-ID: <45F3C44C.4020401@hastings.org>

Brett Cannon wrote:
> On 3/5/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>   
>> One thing I *would* like to see, that Javascript doesn't
>> seem to have either yet, is a *-arg on the end of the
>> unpacking list:
>>    a, b, *c = [1, 2, 3, 4, 5]
>> giving a == 1, b == 2, c == [3, 4, 5].
> > Now I know that was discussed on python-dev once, but I don't remember why it didn't end up happening.
Surely you haven't forgotten your own recipe?
    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/163968
That was also published in the Python Cookbook (2nd Edition) as recipe 
19.4: "Unpacking a Few Items in a Multiple Assignment".

As mentioned in the recipe, the original discussion was here:
    http://mail.python.org/pipermail/python-dev/2002-November/030380.html
The last message posted on the thread was from GvR, who was against it:
    http://mail.python.org/pipermail/python-dev/2002-November/030437.html

Cheers,


/larry/


From jcarlson at uci.edu  Sun Mar 11 11:31:48 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 11 Mar 2007 02:31:48 -0800
Subject: [Python-ideas] Mutable chars objects
In-Reply-To: <45F3B4EF.1010103@acm.org>
References: <45F3B4EF.1010103@acm.org>
Message-ID: <20070311022621.751C.JCARLSON@uci.edu>


Talin <talin at acm.org> wrote:
> Something that I have wanted in Python for a long time is something like 
> the Java StringBuffer class - a mutable buffer, with string-like 
> methods, that holds characters instead of bytes.

8-bit ASCII characters, or compile-time specified unicode characters (16
or 32 bit)?  If all you want is mutable characters, there's array.array('c');
it's smart about appending.  The lack of string methods kind of kills it
though.

One of the reasons I was pushing for string views, oh, about 7 months ago,
was very similar; it would be *really* nice to be able to
add string methods to anything that provided the buffer interface. 
Nevermind that if it offered a multi-byte buffer view (like the extended
buffer interface that will be coming in Py3k), you could treat arbitrary
data as if it were strings - an array of 16 bit ints would be the same
as 8 bit ints, the same as 8 bit characters, the same as 32 bit ints,
etc.  I guess I was 7 months too early in my proposal.


 - Josiah



From brett at python.org  Sun Mar 11 19:51:59 2007
From: brett at python.org (Brett Cannon)
Date: Sun, 11 Mar 2007 10:51:59 -0800
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <45F3C44C.4020401@hastings.org>
References: <45EC9C77.10603@acm.org>
	<bbaeab100703051512i6adb291bs3c108aa2c761dca7@mail.gmail.com>
	<45ECAC39.4090908@canterbury.ac.nz>
	<bbaeab100703051631ua93c14fta636e9cdbf89eb3d@mail.gmail.com>
	<45F3C44C.4020401@hastings.org>
Message-ID: <bbaeab100703111151y5d329157je65695898acf0d31@mail.gmail.com>

On 3/11/07, Larry Hastings <larry at hastings.org> wrote:
> Brett Cannon wrote:
> > On 3/5/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> >
> >> One thing I *would* like to see, that Javascript doesn't
> >> seem to have either yet, is a *-arg on the end of the
> >> unpacking list:
> >>    a, b, *c = [1, 2, 3, 4, 5]
> >> giving a == 1, b == 2, c == [3, 4, 5].
> > Now I know that was discussed on python-dev once, but I don't remember why it didn't end up happening.
> Surely you haven't forgotten your own recipe?
>     http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/163968
> That was also published in the Python Cookbook (2nd Edition) as recipe
> 19.4: "Unpacking a Few Items in a Multiple Assignment".

I remember writing the code, but I totally forgot I submitted it (and
it got published) as a recipe.  =)

-Brett


From jason.orendorff at gmail.com  Sun Mar 11 21:53:17 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Sun, 11 Mar 2007 16:53:17 -0400
Subject: [Python-ideas] Mutable chars objects
In-Reply-To: <45F3B4EF.1010103@acm.org>
References: <45F3B4EF.1010103@acm.org>
Message-ID: <bb8868b90703111353n46dea5b5ucff8f192ed112e8e@mail.gmail.com>

On 3/11/07, Talin <talin at acm.org> wrote:
> I do a lot of stuff with parsing, and it's often convenient to build up
> long strings of text one character at a time.

Could you be more specific about this?  When I write a parser it
always starts with either

  _token_re = re.compile(r'''(?x)
     ...15 lines omitted...
     ''')

or

  import yapps2  # wheeee!

I've never had much luck hand-coding a lexer in straight-up Python.
Not only is it slow, I feel like Python's syntax is working against
me-- no switch statement, no do-while.  (This is not a complaint!
It's all right.  I should be using a parser-generator anyway.)

Josiah mentioned array.array('c').  There's also array.array('u'),
which is an array of Py_UNICODEs.  You can add the string methods in a
subclass for a quick prototype.
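
Something like this, I mean (a rough sketch with only a couple of 
methods filled in; the name is made up):

    import array

    class unibuffer(array.array):
        def __new__(cls, initial=u''):
            return array.array.__new__(cls, 'u', initial)
        def upper(self):
            return self.tounicode().upper()
        def __str__(self):
            return self.tounicode()

    buf = unibuffer()
    for ch in u'hello':
        buf.append(ch)
    print buf.upper()    # HELLO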

-j


From jcarlson at uci.edu  Sun Mar 11 23:47:02 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 11 Mar 2007 15:47:02 -0700
Subject: [Python-ideas] Mutable chars objects
In-Reply-To: <bb8868b90703111353n46dea5b5ucff8f192ed112e8e@mail.gmail.com>
References: <45F3B4EF.1010103@acm.org>
	<bb8868b90703111353n46dea5b5ucff8f192ed112e8e@mail.gmail.com>
Message-ID: <20070311154251.FB62.JCARLSON@uci.edu>


"Jason Orendorff" <jason.orendorff at gmail.com> wrote:
> 
> On 3/11/07, Talin <talin at acm.org> wrote:
> > I do a lot of stuff with parsing, and it's often convenient to build up
> > long strings of text one character at a time.
> 
> Could you be more specific about this?  When I write a parser it
> always starts with either

I don't believe he's talking about parsing in the language lexer sense,
I believe he is talking about perhaps url parsing (breaking it down into
its component parts), "unmarshaling" (think pickle, marshal, etc.), or
possibly even configuration files.

> Josiah mentioned array.array('c').  There's also array.array('u'),
> which is an array of Py_UNICODEs.  You can add the string methods in a
> subclass for a quick prototype.

Kind-of, but it's horribly slow.  The point of string views that I
mentioned is that you get all of the benefits of the underlying C
implementation, from speed to "it's already been implemented and
debugged".


 - Josiah



From collinw at gmail.com  Mon Mar 12 05:59:53 2007
From: collinw at gmail.com (Collin Winter)
Date: Sun, 11 Mar 2007 23:59:53 -0500
Subject: [Python-ideas] [Python-Dev] datetime module enhancements
In-Reply-To: <d11dcfba0703091403j1fa0eafbw6cbd38f2c2b37dab@mail.gmail.com>
References: <ess0hc$gqn$1@sea.gmane.org>
	<43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com>
	<43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com>
	<d11dcfba0703091403j1fa0eafbw6cbd38f2c2b37dab@mail.gmail.com>
Message-ID: <43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com>

[Moving to python-ideas...]

On 3/9/07, Steven Bethard <steven.bethard at gmail.com> wrote on python-dev:
> On 3/9/07, Collin Winter <collinw at gmail.com> wrote on python-dev:
> > One solution that just occurred to me -- and that skirts the issue of
> > choosing an interpretation -- is that, when comparing date and
> > datetime objects, the datetime's .date() method is called and the
> > result of that call is compared to the original date. That is,
> >
> > datetime_obj < date_obj
> >
> > is implicitly equivalent to
> >
> > datetime_obj.date() < date_obj
>
> Using the .date() is fine when the year/month/day doesn't match.  So
> the following are fine::
>     datetime.datetime(2005, 1, 1, 0, 0, 0) < datetime.date(2006, 1, 1)
>     datetime.datetime(2007, 1, 1, 0, 0, 0) > datetime.date(2006, 1, 1)
> It's *not* okay to say that a date() is less than, greater than or
> equal to a datetime() if the year/month/day *does* match.  The correct
> temporal relation is During, but Python doesn't have a During
> operator. During is not the same as less-than, greater-than or
> equal-to, so all of these should be False::
>     datetime.datetime(2006, 1, 1, 0, 0, 0)  < datetime.date(2006, 1, 1)
>     datetime.datetime(2006, 1, 1, 0, 0, 0)  > datetime.date(2006, 1, 1)
>     datetime.datetime(2006, 1, 1, 0, 0, 0)  == datetime.date(2006, 1, 1)
> That is, the datetime() is not less than, greater than or equal to the
> corresponding date().
>
> Some discussion of these kinds of issues is here:
>     http://citeseer.ist.psu.edu/allen94actions.html
> The essence is that in order to properly compare intervals, you need
> the Meets, Overlaps, Starts, During and Finishes operators in addition
> to the Before (<) and Simultaneous (=) operators.
>
> So, let's not conflate Before, After or Simultaneous with the other
> relations -- if it's not strictly Before (<), After (>) or
> Simultaneous (=), we can just say so by returning False.

It might be neat to add a __contains__ method to date() objects so
that "datetime(2007, 1, 1, 0, 0, 0) in date(2007, 1, 1)" would be
True. This would seem to fulfill the During operator.

Collin Winter


From lists at cheimes.de  Mon Mar 12 14:07:58 2007
From: lists at cheimes.de (Christian Heimes)
Date: Mon, 12 Mar 2007 14:07:58 +0100
Subject: [Python-ideas] [Python-Dev] datetime module enhancements
In-Reply-To: <43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com>
References: <ess0hc$gqn$1@sea.gmane.org>	<43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com>	<43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com>	<d11dcfba0703091403j1fa0eafbw6cbd38f2c2b37dab@mail.gmail.com>
	<43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com>
Message-ID: <et3jd7$ncl$1@sea.gmane.org>

Collin Winter wrote:
> It might be neat to add a __contains__ method to date() objects so
> that "datetime(2007, 1, 1, 0, 0, 0) in date(2007, 1, 1)" would be
> True. This would seem to fulfill the During operator.

Yeah, I had the same idea:
http://permalink.gmane.org/gmane.comp.python.devel/86828

In my opinion the comparison operations for date and datetime objects
should be defined as:

datetime before date: <
-----------------------

>>> datetime(2007, 1, 1, 23, 59, 59) < date(2007, 1, 2)
True
Valid for all datetimes before (smaller than) datetime(2007, 1, 2, 0, 0, 0)

datetime during date: in
------------------------
>>> datetime(2007, 1, 2, 23, 59, 59) in date(2007, 1, 2)
True
Valid for all datetimes where
datetime(2007, 1, 2, *, *, *).date() == date(2007, 1, 2)

datetime after date: >
----------------------
>>> datetime(2007, 1, 3, 0, 0, 0) > date(2007, 1, 2)
True
Valid for all datetimes after 2007-01-02 (greater than or equal to
2007-01-03 00:00:00)

datetime <= date and datetime >= date should raise a TypeError. The
result is ambiguous. A date with time is never equal to a date.
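
A rough sketch of the date side in pure Python ('date2' is an invented
stand-in; a real patch would change datetime.date itself):

    from datetime import date, datetime

    class date2(date):
        def __contains__(self, dt):
            if not isinstance(dt, datetime):
                raise TypeError("'in <date>' requires a datetime")
            return dt.date() == self

    datetime(2007, 1, 2, 23, 59, 59) in date2(2007, 1, 2)   # True
    datetime(2007, 1, 3, 0, 0, 0) in date2(2007, 1, 2)      # False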

Christian



From jimjjewett at gmail.com  Mon Mar 12 16:43:18 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 12 Mar 2007 11:43:18 -0400
Subject: [Python-ideas] [Python-Dev] datetime module enhancements
In-Reply-To: <et3jd7$ncl$1@sea.gmane.org>
References: <ess0hc$gqn$1@sea.gmane.org>
	<43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com>
	<43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com>
	<d11dcfba0703091403j1fa0eafbw6cbd38f2c2b37dab@mail.gmail.com>
	<43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com>
	<et3jd7$ncl$1@sea.gmane.org>
Message-ID: <fb6fbf560703120843s1c7b6can604452844ebc72b0@mail.gmail.com>

On 3/12/07, Christian Heimes <lists at cheimes.de> wrote:
> datetime before date: <
> datetime during date: in
> datetime after date: >

> datetime <= date and datetime >= date should raise a TypeError. The
> result is ambiguous. A date with time is never equal to a date.

As a practical matter, sorting uses <=

I would be somewhat annoyed if

    dt < d

were True, but when I tried to sort them,

    dt <= d

raised a TypeError.

It would be OK with me to raise an error (ValueError?) on "dt <= d" if
"dt in d".   Others might feel differently, and prefer to always get
the TypeError -- which is probably why dt<d doesn't already do the
obvious thing.

-jJ


From collinw at gmail.com  Mon Mar 12 16:52:27 2007
From: collinw at gmail.com (Collin Winter)
Date: Mon, 12 Mar 2007 10:52:27 -0500
Subject: [Python-ideas] [Python-Dev] datetime module enhancements
In-Reply-To: <d11dcfba0703112225o4d615c99nd505b22338cdfc86@mail.gmail.com>
References: <ess0hc$gqn$1@sea.gmane.org>
	<43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com>
	<43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com>
	<d11dcfba0703091403j1fa0eafbw6cbd38f2c2b37dab@mail.gmail.com>
	<43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com>
	<d11dcfba0703112225o4d615c99nd505b22338cdfc86@mail.gmail.com>
Message-ID: <43aa6ff70703120852y773d11edr690a39b8dc2c2ce@mail.gmail.com>

On 3/12/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> [I'm not on this list, so please keep me in the CC if you reply]
>
> On 3/11/07, Collin Winter <collinw at gmail.com> wrote:
> > On 3/9/07, Steven Bethard <steven.bethard at gmail.com> wrote on python-dev:
> > > It's *not* okay to say that a date() is less than, greater than or
> > > equal to a datetime() if the year/month/day *does* match.  The correct
> > > temporal relation is During, but Python doesn't have a During
> > > operator. During is not the same as less-than, greater-than or
> > > equal-to, so all of these should be False::
> > >     datetime.datetime(2006, 1, 1, 0, 0, 0)  < datetime.date(2006, 1, 1)
> > >     datetime.datetime(2006, 1, 1, 0, 0, 0)  > datetime.date(2006, 1, 1)
> > >     datetime.datetime(2006, 1, 1, 0, 0, 0)  == datetime.date(2006, 1, 1)
> > > That is, the datetime() is not less than, greater than or equal to the
> > > corresponding date().
> > >
> > > Some discussion of these kinds of issues is here:
> > >     http://citeseer.ist.psu.edu/allen94actions.html
> > > The essence is that in order to properly compare intervals, you need
> > > the Meets, Overlaps, Starts, During and Finishes operators in addition
> > > to the Before (<) and Simultaneous (=) operators.
> > >
> > > So, let's not conflate Before, After or Simultaneous with the other
> > > relations -- if it's not strictly Before (<), After (>) or
> > > Simultaneous (=), we can just say so by returning False.
> >
> > It might be neat to add a __contains__ method to date() objects so
> > that "datetime(2007, 1, 1, 0, 0, 0) in date(2007, 1, 1)" would be
> > True. This would seem to fulfill the During operator.
>
> That's a nice idea.  With the simplest implementation, you could then
> guarantee that one of the following would always be true::
>
>     datetime < date
>     datetime in date
>     datetime > date
>
> (That would actually conflate the Starts, Finishes and During
> relations in the __contains__ operator, but I think that's a perfectly
> reasonable interpretation, and I know it would be useful in my code at
> least.)

I'll work up a patch.

Collin Winter


From lists at cheimes.de  Mon Mar 12 17:08:04 2007
From: lists at cheimes.de (Christian Heimes)
Date: Mon, 12 Mar 2007 17:08:04 +0100
Subject: [Python-ideas] [Python-Dev] datetime module enhancements
In-Reply-To: <43aa6ff70703120852y773d11edr690a39b8dc2c2ce@mail.gmail.com>
References: <ess0hc$gqn$1@sea.gmane.org>	<43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com>	<43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com>	<d11dcfba0703091403j1fa0eafbw6cbd38f2c2b37dab@mail.gmail.com>	<43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com>	<d11dcfba0703112225o4d615c99nd505b22338cdfc86@mail.gmail.com>
	<43aa6ff70703120852y773d11edr690a39b8dc2c2ce@mail.gmail.com>
Message-ID: <et3tuu$3c2$1@sea.gmane.org>

Collin Winter wrote:
> I'll work up a patch.

I'm already working on a patch, too.

Christian



From steven.bethard at gmail.com  Mon Mar 12 18:42:21 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 12 Mar 2007 11:42:21 -0600
Subject: [Python-ideas] datetime module enhancements
Message-ID: <d11dcfba0703121042o5cbbe3cbx592becdafb6bd1af@mail.gmail.com>

On 3/12/07, Christian Heimes <lists at cheimes.de> wrote:
> datetime before date: <
> datetime during date: in
> datetime after date: >

> datetime <= date and datetime >= date should raise a TypeError. The
> result is ambiguous. A date with time is never equal to a date.

Jim Jewett wrote:
> As a practical matter, sorting uses <=

Are you sure?

    >>> class C(object):
    ...     def __init__(self, x):
    ...         self.x = x
    ...     def __lt__(self, other):
    ...         return self.x < other.x
    ...     def __le__(self, other):
    ...         raise TypeError()
    ...     def __repr__(self):
    ...         return 'C(%r)' % self.x
    ...
    >>> sorted([C(3), C(2), C(1)])
    [C(1), C(2), C(3)]


Looks like it's just using < to me.

(And thanks to Collin and Christian for jumping on the patch already.)

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From lists at cheimes.de  Mon Mar 12 18:56:56 2007
From: lists at cheimes.de (Christian Heimes)
Date: Mon, 12 Mar 2007 18:56:56 +0100
Subject: [Python-ideas] datetime module enhancements
In-Reply-To: <d11dcfba0703121042o5cbbe3cbx592becdafb6bd1af@mail.gmail.com>
References: <d11dcfba0703121042o5cbbe3cbx592becdafb6bd1af@mail.gmail.com>
Message-ID: <et44b2$ud2$1@sea.gmane.org>

Steven Bethard schrieb:
>> datetime <= date and datetime >= date should raise a TypeError. The
>> result is ambiguous. A date with time is never equal to a date.
> 
> Jim Jewett wrote:
>> As a practical matter, sorting uses <=
> 
> Are you sure?

It depends on which tp slot is defined. The C API supports either comparison
(cmp-style -1, 0, 1) or rich comparison (func(self, other, operator)).

Christian



From steven.bethard at gmail.com  Mon Mar 12 19:06:15 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 12 Mar 2007 12:06:15 -0600
Subject: [Python-ideas] datetime module enhancements
In-Reply-To: <et44b2$ud2$1@sea.gmane.org>
References: <d11dcfba0703121042o5cbbe3cbx592becdafb6bd1af@mail.gmail.com>
	<et44b2$ud2$1@sea.gmane.org>
Message-ID: <d11dcfba0703121106x2a6dfc63p23242d65a162b420@mail.gmail.com>

On 3/12/07, Christian Heimes <lists at cheimes.de> wrote:
> Steven Bethard schrieb:
> >> datetime <= date and datetime >= date should raise a TypeError. The
> >> result is ambiguous. A date with time is never equal to a date.
> >
> > Jim Jewett wrote:
> >> As a practical matter, sorting uses <=
> >
> > Are you sure?
>
> It depends on which tp slot is defined. The C API supports either comparison
> (cmp-style -1, 0, 1) or rich comparison (func(self, other, operator)).

Fair enough.  My only point was that as long as __lt__ is defined,
__le__ can throw a TypeError() and it won't break sorted().

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From collinw at gmail.com  Mon Mar 12 20:01:55 2007
From: collinw at gmail.com (Collin Winter)
Date: Mon, 12 Mar 2007 14:01:55 -0500
Subject: [Python-ideas] [Python-Dev] datetime module enhancements
In-Reply-To: <43aa6ff70703120852y773d11edr690a39b8dc2c2ce@mail.gmail.com>
References: <ess0hc$gqn$1@sea.gmane.org>
	<43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com>
	<43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com>
	<d11dcfba0703091403j1fa0eafbw6cbd38f2c2b37dab@mail.gmail.com>
	<43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com>
	<d11dcfba0703112225o4d615c99nd505b22338cdfc86@mail.gmail.com>
	<43aa6ff70703120852y773d11edr690a39b8dc2c2ce@mail.gmail.com>
Message-ID: <43aa6ff70703121201q27594bedwebbc4a2078c96f2d@mail.gmail.com>

On 3/12/07, Collin Winter <collinw at gmail.com> wrote:
> On 3/12/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> > [I'm not on this list, so please keep me in the CC if you reply]
> >
[snip]
> > That's a nice idea.  With the simplest implementation, you could then
> > guarantee that one of the following would always be true::
> >
> >     datetime < date
> >     datetime in date
> >     datetime > date
> >
> > (That would actually conflate the Starts, Finishes and During
> > relations in the __contains__ operator, but I think that's a perfectly
> > reasonable interpretation, and I know it would be useful in my code at
> > least.)
>
> I'll work up a patch.

Posted as #1679204 (http://python.org/sf/1679204). In addition to
date.__contains__, I had to add a datetime.__contains__ that throws a
TypeError (since datetime inherits from date).

While writing the patch, I had the idea of making "time in date"
always return True, but I'm not sure that would be useful.

Collin Winter


From steven.bethard at gmail.com  Mon Mar 12 20:07:19 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 12 Mar 2007 13:07:19 -0600
Subject: [Python-ideas] [Python-Dev] datetime module enhancements
In-Reply-To: <43aa6ff70703121201q27594bedwebbc4a2078c96f2d@mail.gmail.com>
References: <ess0hc$gqn$1@sea.gmane.org>
	<43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com>
	<43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com>
	<d11dcfba0703091403j1fa0eafbw6cbd38f2c2b37dab@mail.gmail.com>
	<43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com>
	<d11dcfba0703112225o4d615c99nd505b22338cdfc86@mail.gmail.com>
	<43aa6ff70703120852y773d11edr690a39b8dc2c2ce@mail.gmail.com>
	<43aa6ff70703121201q27594bedwebbc4a2078c96f2d@mail.gmail.com>
Message-ID: <d11dcfba0703121207p167230e9i30413b60dafc501a@mail.gmail.com>

On 3/12/07, Collin Winter <collinw at gmail.com> wrote:
> Posted as #1679204 (http://python.org/sf/1679204). In addition to
> date.__contains__, I had to add a datetime.__contains__ that throws a
> TypeError (since datetime inherits from date).

Very cool.  Thanks!

> While writing the patch, I had the idea of making "time in date"
> always return True, but I'm not sure that would be useful.

Yeah, I'd hold off on that one until someone indicates that they need
it.  Seems like there would be a number of places where True might be
wrong for a particular bit of code.

Steve
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From greg.ewing at canterbury.ac.nz  Mon Mar 12 23:35:32 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 13 Mar 2007 11:35:32 +1300
Subject: [Python-ideas] [Python-Dev] datetime module enhancements
In-Reply-To: <fb6fbf560703120843s1c7b6can604452844ebc72b0@mail.gmail.com>
References: <ess0hc$gqn$1@sea.gmane.org>
	<43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com>
	<43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com>
	<d11dcfba0703091403j1fa0eafbw6cbd38f2c2b37dab@mail.gmail.com>
	<43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com>
	<et3jd7$ncl$1@sea.gmane.org>
	<fb6fbf560703120843s1c7b6can604452844ebc72b0@mail.gmail.com>
Message-ID: <45F5D5B4.3060607@canterbury.ac.nz>

Jim Jewett wrote:

> It would be OK with me to raise an error (ValueError?) on "dt <= d" if
> "dt in d".   Others might feel differently, and prefer to always get
> the TypeError -- which is probably why dt<d doesn't already do the
> obvious thing.

Seems to me that given all the conflicting behaviours
people want this to have in different circumstances,
refusing to compare dates and datetimes is the right
thing to do.

All the use cases I've seen here for comparing them
are easily accommodated by either extracting a date
from the datetime to compare with the other date,
or deriving a datetime from the date with whatever
default time part you want. EIBTI.

--
Greg


From guido at python.org  Mon Mar 12 23:49:51 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 12 Mar 2007 15:49:51 -0700
Subject: [Python-ideas] [Python-Dev] datetime module enhancements
In-Reply-To: <45F5D5B4.3060607@canterbury.ac.nz>
References: <ess0hc$gqn$1@sea.gmane.org>
	<43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com>
	<43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com>
	<d11dcfba0703091403j1fa0eafbw6cbd38f2c2b37dab@mail.gmail.com>
	<43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com>
	<et3jd7$ncl$1@sea.gmane.org>
	<fb6fbf560703120843s1c7b6can604452844ebc72b0@mail.gmail.com>
	<45F5D5B4.3060607@canterbury.ac.nz>
Message-ID: <ca471dc20703121549k32c0519cs828305221abff389@mail.gmail.com>

On 3/12/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Jim Jewett wrote:
> > It would be OK with me to raise an error (ValueError?) on "dt <= d" if
> > "dt in d".   Others might feel differently, and prefer to always get
> > the TypeError -- which is probably why dt<d doesn't already do the
> > obvious thing.
>
> Seems to me that given all the conflicting behaviours
> people want this to have in different circumstances,
> refusing to compare dates and datetimes is the right
> thing to do.

Right. A lot of thought went into this when the original design was done.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From steven.bethard at gmail.com  Tue Mar 13 00:13:40 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 12 Mar 2007 17:13:40 -0600
Subject: [Python-ideas] [Python-Dev] datetime module enhancements
In-Reply-To: <45F5D5B4.3060607@canterbury.ac.nz>
References: <ess0hc$gqn$1@sea.gmane.org>
	<43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com>
	<43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com>
	<d11dcfba0703091403j1fa0eafbw6cbd38f2c2b37dab@mail.gmail.com>
	<43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com>
	<et3jd7$ncl$1@sea.gmane.org>
	<fb6fbf560703120843s1c7b6can604452844ebc72b0@mail.gmail.com>
	<45F5D5B4.3060607@canterbury.ac.nz>
Message-ID: <d11dcfba0703121613t79e75dd1icab08d53483406f6@mail.gmail.com>

On 3/12/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Jim Jewett wrote:
>
> > It would be OK with me to raise an error (ValueError?) on "dt <= d" if
> > "dt in d".   Others might feel differently, and prefer to always get
> > the TypeError -- which is probably why dt<d doesn't already do the
> > obvious thing.
>
> Seems to me that given all the conflicting behaviours
> people want this to have in different circumstances,
> refusing to compare dates and datetimes is the right
> thing to do.

The "conflicting behaviors" are really just at the boundaries, and the
only reason they're "conflicting" is because people don't really seem
to know yet what they need.  This is probably an indication that an
appropriately extended datetime module should be distributed as a
third-party module for a while first...

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From jimjjewett at gmail.com  Tue Mar 13 02:17:06 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 12 Mar 2007 21:17:06 -0400
Subject: [Python-ideas] datetime module enhancements
In-Reply-To: <d11dcfba0703121106x2a6dfc63p23242d65a162b420@mail.gmail.com>
References: <d11dcfba0703121042o5cbbe3cbx592becdafb6bd1af@mail.gmail.com>
	<et44b2$ud2$1@sea.gmane.org>
	<d11dcfba0703121106x2a6dfc63p23242d65a162b420@mail.gmail.com>
Message-ID: <fb6fbf560703121817p3af78946qee474fe6a8798b4b@mail.gmail.com>

On 3/12/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 3/12/07, Christian Heimes <lists at cheimes.de> wrote:
> > Steven Bethard schrieb:
> > >> datetime <= date and datetime >= date should raise a TypeError. The
> > >> result is ambiguous. A date with time is never equal to a date.

But it can still be <=, by being <.
I would personally be OK with just saying that (year, month, day)
sorts less than (year, month, day, ...) regardless of time, simply
because of the type -- but I admit that would be arbitrary.

...
> Fair enough.  My only point was that as long as __lt__ is defined,
> __le__ can throw a TypeError() and it won't break sorted().

Mea culpa.  I was mis-remembering, and thought that even this would
break because of sort stability.

-jJ


From thobes at gmail.com  Wed Mar 21 17:57:44 2007
From: thobes at gmail.com (Tobias Ivarsson)
Date: Wed, 21 Mar 2007 17:57:44 +0100
Subject: [Python-ideas] Builtin infinite generator
Message-ID: <9997d5e60703210957x581eec9do9deed56199a155c5@mail.gmail.com>

Quite often I find myself wanting to write an infinite for-loop, or rather a
loop with a counter, that terminates on some inner condition. Recently I
even wanted to do this together with the generator comprehension syntax; I
don't remember exactly what I wanted to do, but it was something like:
zip( some_list_of_unknown_length, ('a' for x in infinit_generator) )
I admit that my example is silly, but it still serves as an example of what
I wanted to do.

Then I read that Ellipsis will become generally accepted in py3k [1] and
thought why not let range accept ... as end-parameter to mean "until
forever".

More silly examples to demonstrate how it would work:

>>> for i in range(...):
...     print i
...     if i == 4: break
0
1
2
3
4
>>> for i in range(10,...,10):
...     print i
...     if i == 40: break
10
20
30
40

Any thoughts on this?

/Tobias

[1] http://mail.python.org/pipermail/python-3000/2006-April/000996.html

From collinw at gmail.com  Wed Mar 21 18:35:22 2007
From: collinw at gmail.com (Collin Winter)
Date: Wed, 21 Mar 2007 12:35:22 -0500
Subject: [Python-ideas] Builtin infinite generator
In-Reply-To: <9997d5e60703210957x581eec9do9deed56199a155c5@mail.gmail.com>
References: <9997d5e60703210957x581eec9do9deed56199a155c5@mail.gmail.com>
Message-ID: <43aa6ff70703211035o5e47bc5xb5d8ced11d4723d8@mail.gmail.com>

On 3/21/07, Tobias Ivarsson <thobes at gmail.com> wrote:
> Quite often I find myself wanting to write an infinite for-loop, or rather a
> loop with a counter, that terminates on some inner condition. Recently I
> even wanted to do this together with the generator comprehension syntax; I
> don't remember exactly what I wanted to do, but it was something like:
> zip( some_list_of_unknown_length, ('a' for x in infinit_generator) )
> I admit that my example is silly, but it still serves as an example of what
> I wanted to do.

itertools.count()
(http://docs.python.org/lib/itertools-functions.html) handles just
this. Wrapping it in a genexp can take care of the second example:

> >>> for i in range(...):
> ...     print i
> ...     if i == 4: break

for i in itertools.count():
    ....

> >>> for i in range(10,...,10):
>  ...     print i
> ...     if i == 40: break

for i in (i * 10 for i in itertools.count(1)):
    ....
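
And the zip() example from the original post can be spelled the same
way (a sketch; 'some_list' stands in for the list of unknown length):

    import itertools

    some_list = ['x', 'y', 'z']
    print zip(some_list, ('a' for _ in itertools.count()))
    # [('x', 'a'), ('y', 'a'), ('z', 'a')]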

Collin Winter


From jcarlson at uci.edu  Wed Mar 21 18:43:54 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 21 Mar 2007 10:43:54 -0700
Subject: [Python-ideas] Builtin infinite generator
In-Reply-To: <9997d5e60703210957x581eec9do9deed56199a155c5@mail.gmail.com>
References: <9997d5e60703210957x581eec9do9deed56199a155c5@mail.gmail.com>
Message-ID: <20070321103536.FC8F.JCARLSON@uci.edu>


"Tobias Ivarsson" <thobes at gmail.com> wrote:
> Quite often I find myself wanting to write an infinite for-loop, or rather a
> loop with a counter, that terminates on some inner condition. Recently I
> even wanted to do this together with the generator comprehension syntax; I
> don't remember exactly what I wanted to do, but it was something like:
> zip( some_list_of_unknown_length, ('a' for x in infinit_generator) )
> I admit that my example is silly, but it still serves as an example of what
> I wanted to do.
> 
> Then I read that Ellipsis will become generally accepted in py3k [1] and
> thought why not let range accept ... as end-parameter to mean "until
> forever".
> 
> More silly examples to demonstrate how it would work:
> 
> >>> for i in range(...):
> ...     print i
> ...     if i == 4: break
[snip]
> Any thoughts on this?

It isn't reasonable in Python 2.x, as range() returns an actual list,
and an infinite list isn't reasonable.  It would be *possible* with
xrange() in Python 2.x, but then the question is "is such desirable?".
The answer there is also no, as currently 2.x range() and xrange() are
limited by your platform int size (generally 32 bits)...

    >>> xrange(2**31)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OverflowError: long int too large to convert to int
    >>> xrange(2**31-1)
    xrange(2147483647)
    >>>

In Python 3, ints and longs will be unified, so the whole 'limited by
platform int' shouldn't be applicable.  Further, xrange is renamed to
range.  So it becomes more reasonable.  On the other hand, a user (you)
should be able to give a *huge* value to range, and it would work as you
want ... to.  Generally, you are pretty safe with 2**64 * increment, but
if you are really anal, go with 2**128 * increment; that should keep you
safe until the big freeze.


Ultimately, -1.  It's not usable in Python 2.x, and Python 3.x will
support a variant trivially.

 - Josiah



From greg.ewing at canterbury.ac.nz  Wed Mar 21 23:06:39 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 22 Mar 2007 10:06:39 +1200
Subject: [Python-ideas] Builtin infinite generator
In-Reply-To: <20070321103536.FC8F.JCARLSON@uci.edu>
References: <9997d5e60703210957x581eec9do9deed56199a155c5@mail.gmail.com>
	<20070321103536.FC8F.JCARLSON@uci.edu>
Message-ID: <4601AC6F.4040904@canterbury.ac.nz>

Josiah Carlson wrote:

> Generally, you are pretty safe with 2**64 * increment, but
> if you are really anal, go with 2**128 * increment; that should keep you
> safe until the big freeze.

Although this will work for all practical purposes,
code which does things like this is still a bit
smelly. I prefer it when there are ways of explicitly
representing infinitely-large or unbounded values.

--
Greg


From talin at acm.org  Thu Mar 22 08:50:39 2007
From: talin at acm.org (Talin)
Date: Thu, 22 Mar 2007 00:50:39 -0700
Subject: [Python-ideas] Python and Concurrency
Message-ID: <4602354F.1010003@acm.org>

I would like to direct your attention to a presentation that was given 
by Mr. Herb Sutter on the subject of concurrency support in programming 
languages. It's a long video, but I would urge you to watch at least the 
first five minutes of it:

http://video.google.com/videoplay?docid=7625918717318948700&q=herb+sutter

Mr. Sutter is the secretary of the ISO/C++ standards committee. I 
mention this only so that you'll understand he's not just some random 
yo-yo with an opinion.

(There are a number of other papers by Mr. Sutter on the same topic 
which you can google for; however, none of them are as compelling or 
comprehensive as the video presentation, IMHO.)

The gist of the introduction is this (my words, not his): Starting from 
now, and for the rest of your career as a programmer, you better start 
programming for massively concurrent systems or you are doomed to 
irrelevance.

As he points out in the talk, the good news is that Moore's law is going 
to continue for several decades to come. The bad news is that most 
people don't understand Moore's law or what they get from it. Moore's 
law is really about the number of transistors on a chip. It says nothing 
about processor speed or performance.

He calls it "the end of the free lunch", meaning that in the past you 
could write a program, wait a little while, and it would run faster - 
because the hardware got faster while you were waiting. This will no 
longer be true for single-threaded programs. Scalar processors, with all 
of their pipelining, prefetching and other tricks, have gotten about as 
fast as they are ever going to get; we've reached the point of 
diminishing returns.

So the only thing left to do is throw more and more cores on the chip. 
There's no physical reason why the number of cores won't continue to 
double every 18 months; the only reason why manufacturers might choose 
not to make 128-core CPUs by the year 2012 is that there won't be enough 
software to take advantage of that degree of parallelism. (Even with 
today's transistor count, you could put 100 Pentium-class processors on 
a single die.)

Moreover, programming languages which have built-in support for scaling 
to many processors are going to win in the long run, and programming 
languages that don't have strong, built-in concurrency support will 
gradually fade away. The reason for this is simple: People don't like 
using applications which don't get faster when they buy a better computer.

Now, on the server side, all of this is a solved problem. We have a very 
robust model for handling hundreds of requests in parallel, and a very 
strong model for dealing with shared state. This model is called 
"transactions" or more formally ACID.

On the client side, however, things are much worse. As he puts it, locks 
are the best we have, and locks suck. Locks are not "logically 
composable", in the sense that you can't simply take two pieces of 
software that were written independently and glue them together unless 
they share a common locking architecture. This ruins a large part of the 
power of modern software, which is the ability to compose software out 
of smaller, independently-written components.

On the client side, you have heterogeneous components accessing shared 
state at high bandwidth with myriads of pointer-based references to that 
shared state. Add concurrency, and what you end up with is a system in 
which it is impossible to make logical inferences about the behavior of 
the system.

(And before you ask - yes, functional languages - and their drawbacks - 
are addressed in the talk. The same is true for 'lock-free programming' 
and other techniques du jour.)

Now, in the talk he proposes some strategies for building concurrency 
into the language so that client-side programming can be done in a way 
that isn't rocket science. We won't be able to get rid of locks, he 
claims, but with the right language support at least we'll be able to 
reason about them, and maybe push them off to the side so that they 
mostly stay in their corner.

In any case, what I would like to open up is a discussion of how these 
ideas might influence the design of the Python language.

One thing that is important to understand is that I'm not talking about 
"automatic parallelization" where the compiler automatically figures out 
what parts can be parallelized. That would require so much change to the 
Python language as to make it no longer Python.

Nor am I talking about manually creating "thread" objects, and having to 
carefully place protections around any shared state. That kind of 
parallelism is too low-level, too coarse-grained, and too hard to do, 
even for people who think they know how to write concurrent code.

I am not even necessarily talking about changing the Python language 
(although certainly the internals of the interpreter will need to be 
changed.) The same language can be used to describe the same kinds of 
problems and operations, but the implications of those language elements 
will change. This is analogous to the fact that these massively 
multicore CPUs 10 years from now will most likely be using the same 
instruction sets as today - but that does not mean that programming as a 
discipline will be anything like what it is now.

As an example of what I mean, suppose the Python 'with' statement could 
be used to indicate an atomic transaction. Any operations that were done 
within that block would either be committed or rolled back. This is not 
a new idea - here's an interesting paper on it:

    http://www-sal.cs.uiuc.edu/~zilles/tm.html

I'm sure that there's a lot more like that out there. However, there is 
also a lot of stuff out there that is *superficially* similar to what I 
am talking about, and I want to make the distinction clear. For example, 
any discussion of concurrency in Python will naturally raise the topic 
of both IPython and Stackless. However, IPython (from what I understand) 
is really about distributed computing and not so much about fine-grained 
concurrency; And Stackless (from what I understand) is really about 
coroutines or continuations, which is a different kind of concurrency. 
Unless I am mistaken (and I very well could be) neither of these are 
immediately applicable to the problem of authoring Python programs for 
multi-core CPUs, but I think that both of them contain valuable ideas 
that are worth discussing.

Now, I will be the first to admit that I am not an expert in these 
matters. Don't let my poor, naive ideas be a limit to what is discussed. 
What I've primarily tried to do in this posting is to get your attention 
and convince you that this is important and worth talking about. And I 
hope to be able to learn something along the way.

My overall goal here is to be able to continue writing programs in 
Python 10 years from now, not just as a hobby, but as part of my 
professional work. If Python is able to leverage the power of the CPUs 
that are being created at that time, I will be able to make a strong 
case for doing so. On the other hand, if I have a 128-core CPU on my 
desk, and Python is only able to utilize 1/128th of that power without 
resorting to tedious calculations of race conditions and deadlocks, then 
it's likely that my Python programming will be relegated to the role of a 
hobby.

-- Talin


From rrr at ronadam.com  Thu Mar 22 11:24:21 2007
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 22 Mar 2007 05:24:21 -0500
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <4602354F.1010003@acm.org>
References: <4602354F.1010003@acm.org>
Message-ID: <46025955.5030509@ronadam.com>

Talin wrote:

> My overall goal here is to be able to continue writing programs in 
> Python 10 years from now, not just as a hobby, but as part of my 
> professional work. If Python is able to leverage the power of the CPUs 
> that are being created at that time, I will be able to make a strong 
> case for doing so. On the other hand, if I have a 128-core CPU on my 
> desk, and Python is only able to utilize 1/128th of that power without 
> resorting to tedious calculations of race conditions and deadlocks, then 
> it's likely that my Python programming will be relegated to the role of a 
> hobby.
> 
> -- Talin

A very nice introduction, Talin.  I will certainly look at the video
tomorrow morning.  Thanks.

My first thought is that it's not quite as bad as it seems, because any
third-party extensions will be able to use the remaining 127/128ths of the
power. ;-)

You would need to subtract any CPUs used by the OS and other concurrent
processes. (These would probably continue to use more resources as well.)

I wonder if some of the CPUs would be definable for special purposes or not.
Maybe 64 of them set aside for simultaneous parallel calculations?

There may be a way to express to the OS that a particular operation on a data
structure be carried out as 'wide' as possible ('narrow' as possible being on
a single CPU in a single thread).  It might be very nice for these 'wide'
structures to have their own API as well.

All speculation of course, ;-)

Cheers,
    Ron


From lucio.torre at gmail.com  Thu Mar 22 14:33:57 2007
From: lucio.torre at gmail.com (Lucio Torre)
Date: Thu, 22 Mar 2007 10:33:57 -0300
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <4602354F.1010003@acm.org>
References: <4602354F.1010003@acm.org>
Message-ID: <999187ed0703220633v2ee4c507x1494a8670b52644@mail.gmail.com>

> I'm sure that there's a lot more like that out there. However, there is
> also a lot of stuff out there that is *superficially* similar to what I
> am talking about, and I want to make the distinction clear. For example,
> any discussion of concurrency in Python will naturally raise the topic
> of both IPython and Stackless. However, IPython (from what I understand)
> is really about distributed computing and not so much about fine-grained
> concurrency; And Stackless (from what I understand) is really about
> coroutines or continuations, which is a different kind of concurrency.
> Unless I am mistaken (and I very well could be) neither of these are
> immediately applicable to the problem of authoring Python programs for
> multi-core CPUs, but I think that both of them contain valuable ideas
> that are worth discussing.

From what I understand, I think that the main contribution of the
Stackless approach to concurrency is microthreads: the ability to have
lots and lots of cheap threads. If you want to program for some huge
number of cores, you will have to have even more threads than cores
you have today.

The problem is that right now Python (on my Linux box) will only let
me have 381 threads. And if we want concurrent programming to be made
easy, the user is not supposed to start their programs with "how many
threads can I create? 500? OK, so I should partition my app like
this". This leads to:

a) applications that won't get faster when the user can create more
than 500 threads, or
b) a lot of complicated logic to partition the software at runtime

The method should go the other way around: make all the threads you
can think of; if there are enough cores, they will run in parallel.

jm2c.

Regards,

Lucio.


From jcarlson at uci.edu  Thu Mar 22 18:03:14 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 22 Mar 2007 10:03:14 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <999187ed0703220633v2ee4c507x1494a8670b52644@mail.gmail.com>
References: <4602354F.1010003@acm.org>
	<999187ed0703220633v2ee4c507x1494a8670b52644@mail.gmail.com>
Message-ID: <20070322095206.FCA8.JCARLSON@uci.edu>


"Lucio Torre" <lucio.torre at gmail.com> wrote:
> > I'm sure that there's a lot more like that out there. However, there is
> > also a lot of stuff out there that is *superficially* similar to what I
> > am talking about, and I want to make the distinction clear. For example,
> > any discussion of concurrency in Python will naturally raise the topic
> > of both IPython and Stackless. However, IPython (from what I understand)
> > is really about distributed computing and not so much about fine-grained
> > concurrency; And Stackless (from what I understand) is really about
> > coroutines or continuations, which is a different kind of concurrency.
> > Unless I am mistaken (and I very well could be) neither of these are
> > immediately applicable to the problem of authoring Python programs for
> > multi-core CPUs, but I think that both of them contain valuable ideas
> > that are worth discussing.
> 
> From what i understand, i think that the main contribution of the
> stackless aproach to concurrency is microthreads: The ability to have
> lots and lots of cheap threads. If you want to program for some huge
> amount of cores, you will have to have even more threads than cores
> you have today.

But it's not about threads, it is about concurrent execution of code
(which threads in Python do not allow).  The only way to allow this is
to basically attach a re-entrant lock on every single Python object
(depending on the platform, perhaps 12 bytes minimum for count, process,
thread).  The sheer volume of the number of acquire/release cycles
during execution is staggering (think about the number of incref/decref
operations), and the increase in size of every object by around 12 bytes
is not terribly appealing.

On the upside, this is possible (change the PyObject_HEAD macro,
PyINCREF, PyDECREF, remove the GIL), but the amount of work necessary to
actually make it happen is huge, and it would likely result in negative
performance until sheer concurrency wins out over the acquire/release
overhead.


 - Josiah



From jcarlson at uci.edu  Thu Mar 22 18:30:58 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 22 Mar 2007 10:30:58 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <4602354F.1010003@acm.org>
References: <4602354F.1010003@acm.org>
Message-ID: <20070322100428.FCAB.JCARLSON@uci.edu>


Talin <talin at acm.org> wrote:
> I would like to direct your attention to a presentation that was given 
> by Mr. Herb Sutter on the subject of concurrency support in programming 
> languages. It's a long video, but I would urge you to watch at least the 
> first five minutes of it:

I'll watch it later today, but you seem to have done a great job of
explaining what it is saying.

[snip]
> The gist of the introduction is this (my words, not his): Starting from 
> now, and for the rest of your career as a programmer, you better start 
> programming for massively concurrent systems or you are doomed to 
> irrelevance.

Given an inexpensive fork() with certain features (the optional ability
to not copy threads or certain file handles, the ability to quickly pass
data between the parent and child processes, etc.), task-level concurrency
is not quite as hard as it is right now. In the same way that one can use
a threadpool to handle queued tasks, one could use whole processes to do
the same thing, which gets us concurrency in Python.
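
For instance, a minimal POSIX-only sketch ('spawn' is an invented
helper; no error handling or result passing):

    import os

    def spawn(func, *args):
        pid = os.fork()
        if pid == 0:            # child: run the task, then exit hard
            func(*args)
            os._exit(0)
        return pid              # parent: continues immediately

    pids = [spawn(os.system, 'echo task %d' % i) for i in range(4)]
    for pid in pids:
        os.waitpid(pid, 0)      # wait for all the children to finish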


Of course we run into the issue where processor scheduling needs to get
better to handle all of the processes, but that's going to be a
requirement for these multi-core processors anyway.  Windows doesn't
have a .fork() (cygwin emulates it by using a shared mmap to copy all
program state).  Sometimes transferring objects between processes is
difficult (file handles end up being relatively easy on *nix AND Windows),
but the difficulty is there.  Etc.


 - Josiah



From rrr at ronadam.com  Thu Mar 22 21:55:34 2007
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 22 Mar 2007 15:55:34 -0500
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <20070322095206.FCA8.JCARLSON@uci.edu>
References: <4602354F.1010003@acm.org>	<999187ed0703220633v2ee4c507x1494a8670b52644@mail.gmail.com>
	<20070322095206.FCA8.JCARLSON@uci.edu>
Message-ID: <4602ED46.9050202@ronadam.com>

Josiah Carlson wrote:
> "Lucio Torre" <lucio.torre at gmail.com> wrote:
>>> I'm sure that there's a lot more like that out there. However, there is
>>> also a lot of stuff out there that is *superficially* similar to what I
>>> am talking about, and I want to make the distinction clear. For example,
>>> any discussion of concurrency in Python will naturally raise the topic
>>> of both IPython and Stackless. However, IPython (from what I understand)
>>> is really about distributed computing and not so much about fine-grained
>>> concurrency; And Stackless (from what I understand) is really about
>>> coroutines or continuations, which is a different kind of concurrency.
>>> Unless I am mistaken (and I very well could be) neither of these are
>>> immediately applicable to the problem of authoring Python programs for
>>> multi-core CPUs, but I think that both of them contain valuable ideas
>>> that are worth discussing.
>> From what i understand, i think that the main contribution of the
>> stackless aproach to concurrency is microthreads: The ability to have
>> lots and lots of cheap threads. If you want to program for some huge
>> amount of cores, you will have to have even more threads than cores
>> you have today.
> 
> But it's not about threads, it is about concurrent execution of code
> (which threads in Python do not allow).  The only way to allow this is
> to basically attach a re-entrant lock on every single Python object
> (depending on the platform, perhaps 12 bytes minimum for count, process,
> thread).  The sheer volume of the number of acquire/release cycles
> during execution is staggering (think about the number of incref/decref
> operations), and the increase in size of every object by around 12 bytes
> is not terribly appealing.
> 
> On the upside, this is possible (change the PyObject_HEAD macro,
> PyINCREF, PyDECREF, remove the GIL), but the amount of work necessary to
> actually make it happen is huge, and it would likely result in negative
> performance until sheer concurrency wins out over the acquire/release
> overhead.

It seems to me some types of operations are more suited to concurrent 
execution than others, so maybe new objects that are designed to be 
naturally usable in this way could help.  Or maybe there's a way to lock 
groups of objects at the same time by having them share a lock if they are 
related?

I imagine there will be some low-level C support that could be used 
transparently, such as copying large areas of memory with multiple CPUs. 
These may even be the existing C copy functions reimplemented to take 
advantage of multiple-CPU environments, so new versions of Python may have 
limited use of this even if no support is explicitly added.


Thinking out loud of ways a python program may use concurrent processing:

* Applying a single function concurrently over a list.  (A more limited 
function object might make this easier.)

* Feeding a single set of arguments concurrently over a list of callables.

* Generators with the semantics of calculating first and waiting on 'yield' 
for 'next', so the value is immediately returned. (depends on CPU load)

* Listcomps that perform the same calculation on each item may be a natural 
multi-processing structure.


Ron


From jason.orendorff at gmail.com  Thu Mar 22 22:01:11 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Thu, 22 Mar 2007 17:01:11 -0400
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <4602354F.1010003@acm.org>
References: <4602354F.1010003@acm.org>
Message-ID: <bb8868b90703221401ua8d21d7v8132a7ea537883a@mail.gmail.com>

On 3/22/07, Talin <talin at acm.org> wrote:
> I would like to direct your attention to a presentation that was given
> by Mr. Herb Sutter on the subject of concurrency support in programming
> languages. It's a long video, but I would urge you to watch at least the
> first five minutes of it:
>
> http://video.google.com/videoplay?docid=7625918717318948700&q=herb+sutter

Thanks for the excellent summary.

I won't try to summarize Brendan Eich's February blog entry
on this topic, because it's short enough and worth reading:

http://weblogs.mozillazine.org/roadmap/archives/2007/02/threads_suck.html

-j


From guido at python.org  Thu Mar 22 22:20:45 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 22 Mar 2007 14:20:45 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <bb8868b90703221401ua8d21d7v8132a7ea537883a@mail.gmail.com>
References: <4602354F.1010003@acm.org>
	<bb8868b90703221401ua8d21d7v8132a7ea537883a@mail.gmail.com>
Message-ID: <ca471dc20703221420j7f71d48cj84569c1f94087c09@mail.gmail.com>

On 3/22/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> On 3/22/07, Talin <talin at acm.org> wrote:
> > I would like to direct your attention to a presentation that was given
> > by Mr. Herb Sutter on the subject of concurrency support in programming
> > languages. It's a long video, but I would urge you to watch at least the
> > first five minutes of it:
> >
> > http://video.google.com/videoplay?docid=7625918717318948700&q=herb+sutter
>
> Thanks for the excellent summary.
>
> I won't try to summarize Brendan Eich's February blog entry
> on this topic, because it's short enough and worth reading:
>
> http://weblogs.mozillazine.org/roadmap/archives/2007/02/threads_suck.html

Right. I saw Sutter give this talk (or a very similar one) in Oxford
last April, and I'm thoroughly unconvinced that Python is doomed
unless it adds more thread support.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From jcarlson at uci.edu  Thu Mar 22 23:40:27 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 22 Mar 2007 15:40:27 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <4602ED46.9050202@ronadam.com>
References: <20070322095206.FCA8.JCARLSON@uci.edu>
	<4602ED46.9050202@ronadam.com>
Message-ID: <20070322143830.FCB4.JCARLSON@uci.edu>


Ron Adam <rrr at ronadam.com> wrote:
> Josiah Carlson wrote:
> > But it's not about threads, it is about concurrent execution of code
> > (which threads in Python do not allow).  The only way to allow this is
> > to basically attach a re-entrant lock on every single Python object
> > (depending on the platform, perhaps 12 bytes minimum for count, process,
> > thread).  The sheer volume of the number of acquire/release cycles
> > during execution is staggering (think about the number of incref/decref
> > operations), and the increase in size of every object by around 12 bytes
> > is not terribly appealing.
> > 
> > On the upside, this is possible (change the PyObject_HEAD macro,
> > PyINCREF, PyDECREF, remove the GIL), but the amount of work necessary to
> > actually make it happen is huge, and it would likely result in negative
> > performance until sheer concurrency wins out over the acquire/release
> > overhead.
> 
> It seems to me some types of operations are more suited for concurrent 
> operations than others, so maybe new objects that are designed to be 
> naturally usable in this way could help.  Or maybe there's a way to lock 
> groups of objects at the same time by having them share a lock if they are 
> related?

That is a fine-grained vs. coarse-grained locking argument.  There is
literature.

On the other hand, there is also the pi-calculus:
http://scienceblogs.com/goodmath/2007/03/an_experiment_with_calculus_an_1.php

Of course, the pi-calculus is not reasonable unless one starts with a
lambda calculus and decides to modify it (parallel lisp?), so it isn't
all that applicable here.


> Thinking out loud of ways a python program may use concurrent processing:
> 
> * Applying a single function concurrently over a list.  (A more limited 
> function object might make this easier.)
> * Feeding a single set of arguments concurrently over a list of callables.
> * Generators with the semantics of calculating first and waiting on 'yield' 
> for 'next', so the value is immediately returned. (depends on CPU load)
> * Listcomps that perform the same calculation on each item may be a natural 
> multi-processing structure.

These examples are all what are generally referred to as "embarrassingly
parallel" in the literature.  One serious issue with programming parallel
algorithms generally is that not all algorithms are necessarily
parallelizable. Some are, certainly, but not all.  The task is to
discover those alternate algorithms that *are* parallelizable in such a
way as to offer gains that are "worth it".

Personally, I think that if there were a *cheap* IPC to make
cross-process calls not too expensive, many of the examples above that
you talk about would be handled easily.
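
For a sense of scale, here is a toy sketch of one cross-process call
over a pipe, with pickle for the transfer (framing and error handling
omitted; the actual costs would need measuring):

    import os, pickle

    r, w = os.pipe()
    if os.fork() == 0:          # child: compute one result, send it back
        os.close(r)
        os.write(w, pickle.dumps(sum(x * x for x in xrange(1000))))
        os._exit(0)
    os.close(w)
    result = pickle.loads(os.read(r, 65536))  # parent: block for the answer
    os.wait()                                 # reap the child
    print result                              # 332833500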


 - Josiah



From greg.ewing at canterbury.ac.nz  Fri Mar 23 02:32:30 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 23 Mar 2007 13:32:30 +1200
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <4602ED46.9050202@ronadam.com>
References: <4602354F.1010003@acm.org>
	<999187ed0703220633v2ee4c507x1494a8670b52644@mail.gmail.com>
	<20070322095206.FCA8.JCARLSON@uci.edu> <4602ED46.9050202@ronadam.com>
Message-ID: <46032E2E.8020408@canterbury.ac.nz>

Ron Adam wrote:

> Thinking out loud of ways a python program may use concurrent processing:
> 
> * Applying a single function concurrently over a list.
> * Feeding a single set of arguments concurrently over a list of callables.
> * Generators with the semantics of calculating first and waiting on 'yield'
> * Listcomps that perform the same calculation on each item

The solution is obvious -- just add a 'par' statement.

Then-we-can-call-it-Poccam-ly,
Greg


From greg.ewing at canterbury.ac.nz  Fri Mar 23 02:38:24 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 23 Mar 2007 13:38:24 +1200
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <ca471dc20703221420j7f71d48cj84569c1f94087c09@mail.gmail.com>
References: <4602354F.1010003@acm.org>
	<bb8868b90703221401ua8d21d7v8132a7ea537883a@mail.gmail.com>
	<ca471dc20703221420j7f71d48cj84569c1f94087c09@mail.gmail.com>
Message-ID: <46032F90.1090706@canterbury.ac.nz>

Guido van Rossum wrote:

> I saw Sutter give this talk (or a very similar one) in Oxford
> last April, and I'm thoroughly unconvinced that Python is doomed
> unless it adds more thread support.

I'm unconvinced, too. Python has always relied mostly
on C-level code to get grunty things done efficiently.
If the C-level code knows how to take advantage of
multiple cores, Python applications will benefit, too.

As long as Numeric can make use of the other 127 cores,
and OpenGL can run my 512-core GPU at full tilt, I'll
be happy. :-)

--
Greg



From rrr at ronadam.com  Fri Mar 23 03:21:30 2007
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 22 Mar 2007 21:21:30 -0500
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <20070322143830.FCB4.JCARLSON@uci.edu>
References: <20070322095206.FCA8.JCARLSON@uci.edu>
	<4602ED46.9050202@ronadam.com>
	<20070322143830.FCB4.JCARLSON@uci.edu>
Message-ID: <460339AA.4030505@ronadam.com>

Josiah Carlson wrote:
> Ron Adam <rrr at ronadam.com> wrote:
>> Josiah Carlson wrote:
>>> But it's not about threads, it is about concurrent execution of code
>>> (which threads in Python do not allow).  The only way to allow this is
>>> to basically attach a re-entrant lock on every single Python object
>>> (depending on the platform, perhaps 12 bytes minimum for count, process,
>>> thread).  The sheer volume of the number of acquire/release cycles
>>> during execution is staggering (think about the number of incref/decref
>>> operations), and the increase in size of every object by around 12 bytes
>>> is not terribly appealing.
>>>
>>> On the upside, this is possible (change the PyObject_HEAD macro,
>>> PyINCREF, PyDECREF, remove the GIL), but the amount of work necessary to
>>> actually make it happen is huge, and it would likely result in negative
>>> performance until sheer concurrency wins out over the acquire/release
>>> overhead.
>> It seems to me some types of operations are more suited for concurrent 
>> operations than others, so maybe new objects that are designed to be 
>> naturally usable in this way could help.  Or maybe there's a way to lock 
>> groups of objects at the same time by having them share a lock if they are 
>> related?
> 
> That is a fine-grained vs. coarse-grained locking argument.  There is
> literature.
> 
> On the other hand, there is also the pi-calculus:
> http://scienceblogs.com/goodmath/2007/03/an_experiment_with_calculus_an_1.php
> 
> Of course, the pi-calculus is not reasonable unless one starts with a
> lambda calculus and decides to modify it (parallel lisp?), so it isn't
> all that applicable here.

Interesting, but I think you are correct; pi-calculus is probably better 
suited to a special-purpose library.


Here are some more complete examples of what I was thinking, from a Python 
programmer's point of view.  The underlying implementation would need to be 
worked out, of course, either with locks, or messages, or by limiting access 
to mutable objects in some other way.

>> Thinking out loud of ways a python program may use concurrent processing:
>>
>> * Applying a single function concurrently over a list.  (A more limited 
>> function object might make this easier.)

(1)
    xyz = vector(r)
    forall coords as c:     # parallel modify coords 'inplace' with body
        c = rotate(c, xyz)

    * 'with' syntax form because 'c' does not outlive the body.


>> * Feeding a single set of arguments concurrently over a list of callables.

(2)
    x = 42
    result = [None] * len(funcs)
    forall (funcs, result) as (f, r):
        r = f(x)


>> * Generators with the semantics of calculating first and waiting on 'yield' 
>> for 'next', so the value is immediately returned. (depends on CPU load)

(3)  This corresponds nicely to the suggested use of 'future' in the video.

     def foo(x=42):
        # Starts when process is created instead of waiting
        # for f.next() call to start.
        while 1:
           future yield x
           x += 42

     f = foo()             # start a concurrent process

     result = f.wait()     # waits here for value if it's not ready.
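
Something with roughly those semantics can be faked today with a thread
(a toy sketch; 'future' is an invented wrapper, and 'future yield' itself
would be new syntax):

     import threading, Queue

     def future(gen):
         q = Queue.Queue(maxsize=1)    # precompute one value ahead
         def pump():
             for value in gen:
                 q.put(value)          # blocks until the consumer catches up
         t = threading.Thread(target=pump)
         t.setDaemon(True)
         t.start()
         class handle(object):
             def wait(self):
                 return q.get()        # blocks only if no value is ready yet
         return handle()

     def foo(x=42):
         while 1:
             yield x
             x += 42

     f = future(foo())
     print f.wait()    # 42
     print f.wait()    # 84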


>> * Listcomps that perform the same calculation on each item may be a natural 
>> multi-processing structure.

(4)

    result = [x**2 forall x in values]
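
(As a rough illustration of what (1), (2) and (4) mean in today's
Python, here is a hypothetical 'parallel_map' built on plain threads;
under the GIL this is a sketch of the semantics, not a source of
speedup.)

    import threading

    def parallel_map(func, items):
        # One thread per item; results are collected positionally.
        results = [None] * len(items)
        def work(i, item):
            results[i] = func(item)
        threads = [threading.Thread(target=work, args=(i, item))
                   for i, item in enumerate(items)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return results

    # (4) then reads:  result = parallel_map(lambda x: x**2, values)
    # (2) then reads:  result = parallel_map(lambda f: f(x), funcs)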


> These examples are all what are generally referred to as "embarrassingly
> parallel" in literature.  One serious issue with programming parallel
> algorithms generally is that not all algorithms are necessarily
> parallelizable. Some are, certainly, but not all.  The task is to
> discover those alternate algorithms that *are* parallelizable in such a
> way to offer gains that are "worth it".


Yes, they are obvious, but why not start with the obvious?

It's what most users will understand first, and it may be that the less 
obvious stuff can be put in terms of the more obvious "embarrassingly 
parallel" things.


> Personally, I think that if there were a *cheap* IPC to make
> cross-process calls not too expensive, many of the examples above that
> you talk about would be handled easily.

In the video he also talked about avoiding locks.  I was thinking a more 
limited function object (for concurrent uses only) might be used that has:

    * no global keyword
    * only access to external immutable objects
    * no closures.

In other words, these would need to pass 'values' explicitly at call time 
and at return time (or at 'future yield' time).
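
(A crude, purely illustrative check for the first and third constraints
is possible today by poking at the code object.  'is_limited' is made
up, and the second constraint -- touching only immutable externals --
can't really be verified statically.)

    import dis

    def is_limited(func):
        code = func.func_code
        if func.func_closure or code.co_freevars:
            return False        # uses a closure
        # Crude bytecode scan for global rebinding; matching the raw
        # code string can false-positive on operand bytes.
        if chr(dis.opmap['STORE_GLOBAL']) in code.co_code:
            return False        # uses the global keyword
        return True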


Ron






From talin at acm.org  Fri Mar 23 08:12:39 2007
From: talin at acm.org (Talin)
Date: Fri, 23 Mar 2007 00:12:39 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <20070322143830.FCB4.JCARLSON@uci.edu>
References: <20070322095206.FCA8.JCARLSON@uci.edu>	<4602ED46.9050202@ronadam.com>
	<20070322143830.FCB4.JCARLSON@uci.edu>
Message-ID: <46037DE7.5050300@acm.org>

Josiah Carlson wrote:
> These examples are all what are generally referred to as "embarrassingly
> parallel" in literature.  One serious issue with programming parallel
> algorithms generally is that not all algorithms are necessarily
> parallelizable. Some are, certainly, but not all.  The task is to
> discover those alternate algorithms that *are* parallelizable in such a
> way to offer gains that are "worth it".

A couple of ideas I wanted to explore:

-- A base class or perhaps metaclass that makes Python objects 
transactional.  This wraps all attribute access, so that what you see is 
your transaction's view of the current state of the object.

Something like:

    with atomic():
       obj1.attribute = 1

       # Other threads can't 'see' the new value until the
       # transaction commits.
       value = obj1.attribute

'atomic()' starts a new transaction in your thread, which is stored in 
thread-local data; any objects that you mutate become part of the 
transaction automatically (one would have to be careful about built-in 
mutable objects such as lists).  At the end, the transaction either 
commits, or, if there was a conflict, it rolls back and you get to do it 
over again.
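
A minimal sketch of that machinery, assuming a 'Transactional' base
class and ignoring conflict detection entirely (which is of course the
hard part):

    from __future__ import with_statement   # Python 2.5

    import threading

    _state = threading.local()

    class atomic(object):
        # Opens a per-thread transaction log; commits on clean exit,
        # rolls back (by discarding the log) on exception.
        def __enter__(self):
            _state.log = {}        # (obj, attr) -> uncommitted value
            return self
        def __exit__(self, etype, evalue, tb):
            log, _state.log = _state.log, None
            if etype is None:
                for (obj, name), value in log.items():
                    object.__setattr__(obj, name, value)

    class Transactional(object):
        # Writes go into the log; reads prefer the log, so a thread
        # sees its own uncommitted changes and nobody else does.
        def __setattr__(self, name, value):
            log = getattr(_state, 'log', None)
            if log is None:
                object.__setattr__(self, name, value)
            else:
                log[(self, name)] = value
        def __getattribute__(self, name):
            log = getattr(_state, 'log', None)
            if log and (self, name) in log:
                return log[(self, name)]
            return object.__getattribute__(self, name)

Here obj1 would inherit from Transactional; a real version would also
record read sets so that conflicts can be detected and retried.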

This would be useful for large networks of objects where you want to 
make large numbers of local changes, each local change affecting an 
object and perhaps its surrounding objects.  What I am describing is very 
similar to many complex 3D game worlds.

ZODB already has a transactional object mechanism, although it's 
oriented towards database-style transactions and object persistence.

-- A way to partition the space of python objects, such that objects in 
each partition cannot have references outside of the partition without 
going through some sort of synchronization mechanism, perhaps via some 
proxy. The idea is to be able to guarantee that shared state is only 
accessible in certain ways that you can reason about.
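
A sketch of the proxy half of that idea, assuming method calls are the
only way across a partition boundary ('PartitionProxy' is made up):

    import threading

    class PartitionProxy(object):
        # Cross-partition access funnels through one lock; direct
        # references to the wrapped object stay inside the partition.
        def __init__(self, obj):
            self._obj = obj
            self._lock = threading.RLock()
        def call(self, name, *args, **kwds):
            self._lock.acquire()
            try:
                return getattr(self._obj, name)(*args, **kwds)
            finally:
                self._lock.release()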

-- Talin


From jimjjewett at gmail.com  Fri Mar 23 18:48:52 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 23 Mar 2007 13:48:52 -0400
Subject: [Python-ideas] [Python-checkins] r54539 - in python/trunk:
	Lib/string.py Misc/NEWS Objects/typeobject.c
In-Reply-To: <20070323045846.0C6001E4005@bag.python.org>
References: <20070323045846.0C6001E4005@bag.python.org>
Message-ID: <fb6fbf560703231048n704fdcfp9ba832418bb3a1ec@mail.gmail.com>

(Please note that most replies should trim at least one list from the Cc)

On 3/23/07, guido.van.rossum <python-checkins at python.org> wrote:
> Note: without the change to string.py, lots of spurious warnings happen.
> What's going on there?

I assume it was a defensive measure for subclasses of both Template
and X, where X has its own metaclass (which might have an __init__
that shouldn't be ignored).

This is where cooperative classes get ugly.  You could argue that the
"correct" code is to not bother making the super call if it would go
all the way to object (or even type), but there isn't a good way to
spell that.

> +        # A super call makes no sense since type() doesn't define __init__().
> +        # (Or does it? And should type.__init__() accept three args?)
> +        # super(_TemplateMetaclass, cls).__init__(name, bases, dct)

In this particular case, you could define a type.__init__ that did
nothing.  (If the signature were wrong, type.__new__ would have
already caught it.  If __new__ and __init__ are seeing something
different, then the change was probably intentional.)
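
For illustration, an intermediate metaclass with a do-nothing __init__
would make the cooperative call safe (the names here are made up):

    class QuietMeta(type):
        # Accepts and ignores the class-creation arguments, so
        # metaclasses below it can super-call unconditionally.
        def __init__(cls, name, bases, dct):
            pass

    class TemplateMeta(QuietMeta):
        def __init__(cls, name, bases, dct):
            super(TemplateMeta, cls).__init__(name, bases, dct)
            # ... pattern handling as in string.py ...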

The problem isn't really limited to type.__init__ though.  You'll
sometimes see similar patterns for __del__, close, save, etc.  The
main difference is that they have to either catch an error or check
first, since object doesn't have an implementation of those methods.
object.__init__ doesn't really do anything either, except check for
errors.  Getting rid of it should have the same effect as complaining
about parameters.

The ideal solution (discussion of which probably ought to stay on
python-ideas) might be to replace object.__init__ with some sort of
PlaceHolder object that raises an error *unless* called through a
cooperative super.  This PlaceHolder would also be useful for
AbstractBaseClasses/Interfaces.  PlaceHolder still wouldn't deal with
the original concern of verifying that all arguments had already been
stripped and used; but the ABCs might be able to.

-jJ

> Modified: python/trunk/Lib/string.py
> ==============================================================================
> --- python/trunk/Lib/string.py  (original)
> +++ python/trunk/Lib/string.py  Fri Mar 23 05:58:42 2007
> @@ -108,7 +108,9 @@
>      """
>
>      def __init__(cls, name, bases, dct):
> -        super(_TemplateMetaclass, cls).__init__(name, bases, dct)
> +        # A super call makes no sense since type() doesn't define __init__().
> +        # (Or does it? And should type.__init__() accept three args?)
> +        # super(_TemplateMetaclass, cls).__init__(name, bases, dct)
>          if 'pattern' in dct:
>              pattern = cls.pattern
>          else:


From jimjjewett at gmail.com  Fri Mar 23 19:30:34 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 23 Mar 2007 14:30:34 -0400
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <460339AA.4030505@ronadam.com>
References: <20070322095206.FCA8.JCARLSON@uci.edu>
	<4602ED46.9050202@ronadam.com> <20070322143830.FCB4.JCARLSON@uci.edu>
	<460339AA.4030505@ronadam.com>
Message-ID: <fb6fbf560703231130t5c4bd1f8n5affdb27ceb2f82b@mail.gmail.com>

On 3/22/07, Ron Adam <rrr at ronadam.com> wrote:
> Josiah Carlson wrote:
> > Ron Adam <rrr at ronadam.com> wrote:
> >> Josiah Carlson wrote:

> >>> On the upside, this is possible (change the PyObject_HEAD macro,
> >>> PyINCREF, PyDECREF, remove the GIL), but the amount of work

The zillions of details are the same details you have to consider when
writing a concurrent database.  Having python store its objects
externally and retrieve them when needed (vs allocating memory and
pointers) would be a huge one-time loss, but it might eventually be
worthwhile.

PyPy calls that "external database" an "object_space", so the language
support is already there, and still will be when the hardware makes it
worthwhile.


> (1)
>     xyz = vector(r)
>     forall coords as c:     # parallel modify coords 'inplace' with body
>         c = rotate(c, xyz)
>
>     * 'with' syntax form because 'c' does not outlive the body.


Why not just keep using

    xyz = vector(r)
    rotate(xyz)

and let the compiler take care of it?

At most, we would want a way of marking objects as "read-only" or
callables as "instant", so that the compiler would know it doesn't
have to worry about the definition of "len" changing mid-stream.  (Ron
suggests something similar at the end of the message, and Talin's
Transactional metaclass is related.)

> >> * Generators with the semantics of calculating first and waiting on 'yield'
> >> for 'next', so the value is immediately returned. (depends on CPU load)

On its own, this just makes response time worse.  That said,
generators are also callables, and the compiler might do a better job
if it knew that it wouldn't have to worry about external
redefinitions.

There may also be some value in some sort of "idletasks" abstraction
that says "hey, go ahead and precompute this, but only if you've got
nothing better to do."  Garbage collection could certainly benefit
from this, and translating (or duplicating) string representations
into multiple encodings.
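
(Python threads have no priorities, so about the best a user-level
sketch can manage today is a daemon worker draining a queue of
deferrable work; 'idletask' is made up, and the OS, not Python, decides
when it actually runs.)

    import Queue
    import threading

    _idle = Queue.Queue()

    def idletask(func, *args):
        # Queue work that is only worth doing if nothing else is.
        _idle.put((func, args))

    def _idle_worker():
        while True:
            func, args = _idle.get()
            func(*args)

    _worker = threading.Thread(target=_idle_worker)
    _worker.setDaemon(True)    # dies silently with the interpreter
    _worker.start()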

> In the video he also talked about avoiding locks.  I was thinking a more
> limited function object (for concurrent uses only) might be used that has:

>     * no global keyword
>     * only access to external immutable objects

Or at least, it doesn't mutate them itself, and it doesn't promise to
use newer versions that get created after the call begins.

>     * no closures.

This level of information should be useful.

But in practice, most of my functions already meet this definition.
(In theory, they access builtins that could be replaced, etc.)  I
wouldn't want to mark them all by hand.  I'm still inclined to trust
the compiler, and just accept that it may eventually be the PyPy
translator rather than the CPython interpreter.

-jJ


From rrr at ronadam.com  Fri Mar 23 23:47:33 2007
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 23 Mar 2007 17:47:33 -0500
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <fb6fbf560703231130t5c4bd1f8n5affdb27ceb2f82b@mail.gmail.com>
References: <20070322095206.FCA8.JCARLSON@uci.edu>	
	<4602ED46.9050202@ronadam.com>
	<20070322143830.FCB4.JCARLSON@uci.edu>	
	<460339AA.4030505@ronadam.com>
	<fb6fbf560703231130t5c4bd1f8n5affdb27ceb2f82b@mail.gmail.com>
Message-ID: <46045905.9030907@ronadam.com>

Jim Jewett wrote:
> On 3/22/07, Ron Adam <rrr at ronadam.com> wrote:
>> Josiah Carlson wrote:
>> > Ron Adam <rrr at ronadam.com> wrote:
>> >> Josiah Carlson wrote:
> 
>> >>> On the upside, this is possible (change the PyObject_HEAD macro,
>> >>> PyINCREF, PyDECREF, remove the GIL), but the amount of work
> 
> The zillions of details are the same details you have to consider when
> writing a concurrent database.  Having python store its objects
> externally and retrieve them when needed (vs allocating memory and
> pointers) would be a huge one-time loss, but it might eventually be
> worthwhile.
> 
> PyPy calls that "external database" an "object_space", so the language
> support is already there, and still will be when the hardware makes it
> worthwhile.
> 
> 
>> (1)
>>     xyz = vector(r)
>>     forall coords as c:     # parallel modify coords 'inplace' with body
>>         c = rotate(c, xyz)
>>
>>     * 'with' syntax form because 'c' does not outlive the body.
> 
> 
> Why not just keep using
> 
>    xyz = vector(r)
>    rotate(xyz)
> 
> and let the compiler take care of it?

Just how much should the 'python' compiler do?

And what about dynamic modeling where things change according to input?

(Not just thinking of graphics.)


> At most, we would want a way of marking objects as "read-only" or
> callables as "instant", so that the compiler would know it doesn't
> have to worry about the definition of "len" changing mid-stream.  (Ron
> suggests something similar at the end of the message, and Talin's
> Transactional metaclass is related.)

Would it be possible to have a container whose contents are read-only? 
(as a side effect of being in the container)  Then individual items 
would not need their own read-only attributes.


A few other thoughts...

Possibly ... when a module is first imported or run, nothing needs to be 
marked read-only.  Then when execution falls off the end, *everything* 
existing at that point is marked read-only.  If the module was run as a 
script, then a main function is executed.  These could be optional 
settings that could be turned on...

     from __far_future__ import multi-processing
     __multi-processing__ = True
     __main__ = "my_main"

So then there would be an initiation pass followed by an execution phase. 
In the initiation phase it's pretty much just the way things are now, 
except you can't use threads.  In the execution phase you can use threads, 
but you can't change anything created in the initiation phase.

(Other controls would still be needed of course.)
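
(A user-level sketch of the sealing part; the sys.modules replacement
trick is real, everything else here is hypothetical.)

    import sys

    class SealedModule(object):
        # Wraps a module: reads pass through, rebinding raises.
        def __init__(self, module):
            self.__dict__['_module'] = module
        def __getattr__(self, name):
            return getattr(self._module, name)
        def __setattr__(self, name, value):
            raise TypeError('namespace is read-only after initiation')

    # At the end of the initiation phase:
    # sys.modules[__name__] = SealedModule(sys.modules[__name__])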


>> >> * Generators with the semantics of calculating first and waiting on 
>> 'yield'
>> >> for 'next', so the value is immediately returned. (depends on CPU 
>> load)
> 
> On its own, this just makes response time worse.

Yes, you really wouldn't use these for very simple counters.  For more 
complex things, the calling overhead becomes a much smaller percentage of 
the total, and it would tend to smooth out response time rather than 
hurt it.


> That said,
> generators are also callables, and the compiler might do a better job
> if it knew that it wouldn't have to worry about external
> redefinitions.
> 
> There may also be some value in some sort of "idletasks" abstraction
> that says "hey, go ahead and precompute this, but only if you've got
> nothing better to do."

Having a way to set process priority may be good.  (or bad)


> Garbage collection could certainly benefit
> from this, and so could translating (or duplicating) string
> representations into multiple encodings.
> 
>> In the video he also talked about avoiding locks.  I was thinking a more
>> limited function object (for concurrent uses only) might be used that 
>> has:
> 
>>     * no global keyword
>>     * only access to external immutable objects
> 
> Or at least, it doesn't mutate them itself, and it doesn't promise to
> use newer versions that get created after the call begins.

The difficulty is that even a method call on a global (or parent-scope) 
object can result in it being mutated.  So you are back to using locks 
and/or transactions on everything.


>>     * no closures.
> 
> This level of information should be useful.
> 
> But in practice, most of my functions already meet this definition.

Mine too.

Ron



> (In theory, they access builtins that could be replaced, etc.)  I
> wouldn't want to mark them all by hand.  I'm still inclined to trust
> the compiler, and just accept that it may eventually be the PyPy
> translator rather than the CPython interpreter.
> 
> -jJ
> 
> 



From talin at acm.org  Sun Mar 25 09:59:05 2007
From: talin at acm.org (Talin)
Date: Sun, 25 Mar 2007 00:59:05 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <4602354F.1010003@acm.org>
References: <4602354F.1010003@acm.org>
Message-ID: <46062BC9.2020208@acm.org>

Thinking more about this, it seems to me that discussions of syntax for 
doing parallel operations and nifty classes for synchronization are a 
bit premature. The real question, it seems to me, is how to get Python 
to operate concurrently at all.

Python's GIL cannot be removed without going through and fixing a 
thousand different places where different threads might access shared 
variables within the interpreter. Clearly, that's not the kind of work 
that is going to be done in a month or less.

It might be useful to decompose that task further into some digestible 
chunks. One of these chunks is garbage collection. It seems to me that 
reference counting as it exists today is a serious hindrance to 
concurrency, because it requires writing to an object each time you 
create a new reference. Instead, it should be possible to pass a 
reference to an object between threads, without actually modifying the 
object unless one of the threads actually changes an attribute.

There are a number of papers on concurrent garbage collection out there 
on the web that might serve as a useful starting point. Of course, the 
.Net CLR and Java VM already have collectors of this type, so maybe 
those versions of Python already get this for free.

I also wonder which other things that the GIL protects can be broken 
out as large, coherent chunks.

-- Talin


From aahz at pythoncraft.com  Sun Mar 25 15:52:42 2007
From: aahz at pythoncraft.com (Aahz)
Date: Sun, 25 Mar 2007 06:52:42 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <46062BC9.2020208@acm.org>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org>
Message-ID: <20070325135242.GA24610@panix.com>

On Sun, Mar 25, 2007, Talin wrote:
>
> Thinking more about this, it seems to me that discussions of syntax for 
> doing parallel operations and nifty classes for synchronization are a 
> bit premature. The real question, it seems to me, is how to get Python 
> to operate concurrently at all.

Maybe that's what it seems to you; to others of us who have been looking
at this problem for a while, the real question is how to get a better
multi-process control and IPC library in Python, preferably one that is
cross-platform.  You can investigate that right now, and you don't even
need to discuss it with other people.

(Despite my oft-stated fondness for threading, I do recognize the
problems with threading, and if there were a way to make processes as
simple as threads from a programming standpoint, I'd be much more
willing to push processes.)
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Typing is cheap.  Thinking is expensive."  --Roy Smith


From jcarlson at uci.edu  Sun Mar 25 18:34:58 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 25 Mar 2007 09:34:58 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <20070325135242.GA24610@panix.com>
References: <46062BC9.2020208@acm.org> <20070325135242.GA24610@panix.com>
Message-ID: <20070325084247.FCED.JCARLSON@uci.edu>


Aahz <aahz at pythoncraft.com> wrote:
> 
> On Sun, Mar 25, 2007, Talin wrote:
> >
> > Thinking more about this, it seems to me that discussions of syntax for 
> > doing parallel operations and nifty classes for synchronization are a 
> > bit premature. The real question, it seems to me, is how to get Python 
> > to operate concurrently at all.
> 
> Maybe that's what it seems to you; to others of us who have been looking
> at this problem for a while, the real question is how to get a better
> multi-process control and IPC library in Python, preferably one that is
> cross-platform.  You can investigate that right now, and you don't even
> need to discuss it with other people.
> 
> (Despite my oft-stated fondness for threading, I do recognize the
> problems with threading, and if there were a way to make processes as
> simple as threads from a programming standpoint, I'd be much more
> willing to push processes.)

I generally agree.  XML-RPC works pretty well in this regard, though as 
I talked about a couple of months ago, its transport-format encoding and 
decoding add overhead during calling that leaves something to be 
desired.


 - Josiah



From talin at acm.org  Sun Mar 25 18:40:43 2007
From: talin at acm.org (Talin)
Date: Sun, 25 Mar 2007 09:40:43 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <20070325135242.GA24610@panix.com>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org>
	<20070325135242.GA24610@panix.com>
Message-ID: <4606A60B.4040402@acm.org>

Aahz wrote:
> On Sun, Mar 25, 2007, Talin wrote:
>> Thinking more about this, it seems to me that discussions of syntax for 
>> doing parallel operations and nifty classes for synchronization are a 
>> bit premature. The real question, it seems to me, is how to get Python 
>> to operate concurrently at all.
> 
> Maybe that's what it seems to you; to others of us who have been looking
> at this problem for a while, the real question is how to get a better
> multi-process control and IPC library in Python, preferably one that is
> cross-platform.  You can investigate that right now, and you don't even
> need to discuss it with other people.

If you mean some sort of inter-process messaging system, there are a 
number that already exist; I'd look at IPython and py-globus for starters.

My feeling is that while such an approach is vastly easier from the 
standpoint of Python developers, and may be easier from the standpoint 
of a typical Python programmer, it doesn't actually solve the problem 
that I'm attempting to address, which is figuring out how to write 
client-side software that dynamically scales to the number of processors 
on the system.

My view is that while the number of algorithms that we have that can be 
efficiently parallelized in a fine-grained threading environment is 
small (compared to the total number of strictly sequential algorithms), 
the number of algorithms that can be adapted to heavy-weight, 
coarse-grained processes is much smaller still.

For example, it is easy to imagine a quicksort routine where different 
threads are responsible for sorting various sub-partitions of the array. 
If this were to be done via processes, the overhead of marshalling and 
unmarshalling the array elements would completely swamp the benefits of 
making it concurrent.

-- Talin


From rrr at ronadam.com  Sun Mar 25 19:13:34 2007
From: rrr at ronadam.com (Ron Adam)
Date: Sun, 25 Mar 2007 12:13:34 -0500
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <46062BC9.2020208@acm.org>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org>
Message-ID: <4606ADBE.1030406@ronadam.com>

Talin wrote:
 > Thinking more about this, it seems to me that discussions of syntax for
 > doing parallel operations and nifty classes for synchronization are a
 > bit premature.

Yes, but it may help to have a few possible "end user" examples in mind 
while you work on the problem.  The point isn't the exact syntax, but the 
uses that they imply and what other requirements these examples need in 
order to work.


 >The real question, it seems to me, is how to get Python
 > to operate concurrently at all.

Yes, and I agree that it would be better to split the problem into smaller 
chunks.  Maybe start by finding some "easier" simple cases to do first.


 > Python's GIL cannot be removed without going through and fixing a
 > thousand different places where different threads might access shared
 > variables within the interpreter. Clearly, that's not the kind of work
 > that is going to be done in a month or less.

Here is an older discussion on removing the GIL and multiprocessing vs 
threading.  Maybe it will be of help?

http://mail.python.org/pipermail/python-dev/2005-September/056423.html


 > It might be useful to decompose that task further into some digestible
 > chunks. One of these chunks is garbage collection. It seems to me that
 > reference counting as it exists today is a serious hindrance to
 > concurrency, because it requires writing to an object each time you
 > create a new reference. Instead, it should be possible to pass a
 > reference to an object between threads, without actually modifying the
 > object unless one of the threads actually changes an attribute.

I'm not familiar with the details of python's garbage collecting yet.

I was thinking the problem may be simplified by having a single mutable 
container-cell object.  But that wouldn't be enough, because in Python it 
isn't just a problem of a mutable object changing, but one of a name's 
reference to an object being changed.

Could an explicit name space for sharing in threads and processes help?


 > There are a number of papers on concurrent garbage collection out there
 > on the web that might serve as a useful starting point. Of course, the
 > .Net CLR and Java VM already have collectors of this type, so maybe
 > those versions of Python already get this for free.
 >
 > I also wonder what other things that the GIL is protecting can be broken
 > out as large, coherent chunks.

I have no idea,
   ... Ron



From aahz at pythoncraft.com  Sun Mar 25 19:39:05 2007
From: aahz at pythoncraft.com (Aahz)
Date: Sun, 25 Mar 2007 10:39:05 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <4606A60B.4040402@acm.org>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org>
	<20070325135242.GA24610@panix.com> <4606A60B.4040402@acm.org>
Message-ID: <20070325173905.GA15655@panix.com>

On Sun, Mar 25, 2007, Talin wrote:
> Aahz wrote:
>>On Sun, Mar 25, 2007, Talin wrote:
>>>
>>>Thinking more about this, it seems to me that discussions of syntax for 
>>>doing parallel operations and nifty classes for synchronization are a 
>>>bit premature. The real question, it seems to me, is how to get Python 
>>>to operate concurrently at all.
>>
>>Maybe that's what it seems to you; to others of us who have been looking
>>at this problem for a while, the real question is how to get a better
>>multi-process control and IPC library in Python, preferably one that is
>>cross-platform.  You can investigate that right now, and you don't even
>>need to discuss it with other people.
> 
> If you mean some sort of inter-process messaging system, there are a 
> number that already exist; I'd look at IPython and py-globus for starters.
> 
> My feeling is that while such an approach is vastly easier from the 
> standpoint of Python developers, and may be easier from the standpoint 
> of a typical Python programmer, it doesn't actually solve the problem 
> that I'm attempting to address, which is figuring out how to write 
> client-side software that dynamically scales to the number of processors 
> on the system.

How not?  Keep in mind that if this kind of library becomes part of the
Python Standard Library, the standard library can be written to use this
multi-process library.

> My view is that while the number of algorithms that we have that can be 
> efficiently parallelized in a fine-grained threading environment is 
> small (compared to the total number of strictly sequential algorithms), 
> the number of algorithms that can be adapted to heavy-weight, 
> coarse-grained processes is much smaller still.

Maybe.  I'm not convinced, but see below.

> For example, it is easy to imagine a quicksort routine where different 
> threads are responsible for sorting various sub-partitions of the array. 
> If this were to be done via processes, the overhead of marshalling and 
> unmarshalling the array elements would completely swamp the benefits of 
> making it concurrent.

The problem, though, is that Threading Doesn't Work for what you're
talking about.  SMP threading doesn't really scale when you're talking
about hundreds of CPUs.  This kind of problem really is better handled at
the library level: if it's worth splitting, the sort algorithm can figure
out how to do that.  (Whether it's threading or processes really doesn't
matter, the sort algorithm just calls an underlying library to manage it.
For example, it could put a lock around the list and the C library
releases the GIL to do its work.  As long as the overall sort() call was
synchronous, it should work.)  Generally speaking, it won't be worth
splitting for less than a million elements...
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Typing is cheap.  Thinking is expensive."  --Roy Smith


From jcarlson at uci.edu  Sun Mar 25 20:03:06 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 25 Mar 2007 11:03:06 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <4606A60B.4040402@acm.org>
References: <20070325135242.GA24610@panix.com> <4606A60B.4040402@acm.org>
Message-ID: <20070325101901.FCF0.JCARLSON@uci.edu>


Talin <talin at acm.org> wrote:
> 
> Aahz wrote:
> > On Sun, Mar 25, 2007, Talin wrote:
> >> Thinking more about this, it seems to me that discussions of syntax for 
> >> doing parallel operations and nifty classes for synchronization are a 
> >> bit premature. The real question, it seems to me, is how to get Python 
> >> to operate concurrently at all.
> > 
> > Maybe that's what it seems to you; to others of us who have been looking
> > at this problem for a while, the real question is how to get a better
> > multi-process control and IPC library in Python, preferably one that is
> > cross-platform.  You can investigate that right now, and you don't even
> > need to discuss it with other people.
> 
> If you mean some sort of inter-process messaging system, there are a 
> number that already exist; I'd look at IPython and py-globus for starters.
> 
> My feeling is that while such an approach is vastly easier from the 
> standpoint of Python developers, and may be easier from the standpoint 
> of a typical Python programmer, it doesn't actually solve the problem 
> that I'm attempting to address, which is figuring out how to write 
> client-side software that dynamically scales to the number of processors 
> on the system.

At some point either the user or the system (Python) needs to figure out
that splitting up a sequential task into multiple parallel tasks is
productive.  On the system end of things, that isn't easy.  How much
money and time have been poured into C/C++ compiler development, and
about all they can auto-parallelize (via vector operations) are things
like:
    for (i=0;i<n;i++)
        a[i] = b[i] OP c[i];
for a restricted set of OP and input types for a, b, and c.

Could Python do better than that?  Sure, given enough time and research
money.


> My view is that while the number of algorithms that we have that can be 
> efficiently parallelized in a fine-grained threading environment is 
> small (compared to the total number of strictly sequential algorithms), 
> the number of algorithms that can be adapted to heavy-weight, 
> coarse-grained processes is much smaller still.
> 
> For example, it is easy to imagine a quicksort routine where different 
> threads are responsible for sorting various sub-partitions of the array. 
> If this were to be done via processes, the overhead of marshalling and 
> unmarshalling the array elements would completely swamp the benefits of 
> making it concurrent.

But that algorithm wouldn't be used for sorting data on multiple
processors. A variant of mergesort would be used (distribute blocks
equally to processors, sort them individually, merge the results - in
parallel).  But again, all of this relies on two things:

1. a method for executing multiple streams of instructions
simultaneously
2. a method of communication between the streams of instructions

Without significant work, #1 isn't possible using threads in Python.  It
is trivial using processes.

Without work, #2 isn't "fast" using processes in Python.  It is trivial
using threads.  But here's the thing: with work, #2 can be made fast. 
Using unix domain sockets (on linux, 3.4 ghz P4 Xeons, DDR2-PC4200
memory (you can get 50% faster memory nowadays)), I've been able to push
400 megs/second between processes.  Maybe anonymous or named pipes, or
perhaps a shared mmap with some sort of synchronization would allow for
IPC to be cross platform and just about as fast.

The reason that I (and perhaps others) have been pushing for IPC is
that it is easier to solve than the removal of Python's threading
limitations, with many of the same benefits, and even a few extra (being
able to distribute processes across different machines).


 - Josiah



From r.m.oudkerk at googlemail.com  Sun Mar 25 23:39:01 2007
From: r.m.oudkerk at googlemail.com (Richard Oudkerk)
Date: Sun, 25 Mar 2007 22:39:01 +0100
Subject: [Python-ideas]  Python and Concurrency
Message-ID: <ac4216f0703251439w5aac6a2co3b9d58aed11dc138@mail.gmail.com>

Aahz wrote:
> Maybe that's what it seems to you; to others of us who have been looking
> at this problem for a while, the real question is how to get a better
> multi-process control and IPC library in Python, preferably one that is
> cross-platform.  You can investigate that right now, and you don't even
> need to discuss it with other people.

> (Despite my oft-stated fondness for threading, I do recognize the
> problems with threading, and if there were a way to make processes as
> simple as threads from a programming standpoint, I'd be much more
> willing to push processes.)

<plug>

The processing package at

    http://cheeseshop.python.org/pypi/processing

is multi-platform and mostly follows the API of threading.  It also
allows use of 'shared objects' which live in a manager process.

For example the following code is almost identical to the equivalent
written with threads:

    from processing import Process, Manager

    def f(q):
        for i in range(10):
            q.put(i*i)
        q.put('STOP')

    if __name__ == '__main__':
        manager = Manager()
        queue = manager.Queue(maxsize=3)

        p = Process(target=f, args=[queue])
        p.start()

        result = None
        while result != 'STOP':
            result = queue.get()
            print result

        p.join()

Josiah wrote:
> Without work, #2 isn't "fast" using processes in Python.  It is trivial
> using threads.  But here's the thing: with work, #2 can be made fast.
> Using unix domain sockets (on linux, 3.4 ghz P4 Xeons, DDR2-PC4200
> memory (you can get 50% faster memory nowadays)), I've been able to push
> 400 megs/second between processes.  Maybe anonymous or named pipes, or
> perhaps a shared mmap with some sort of synchronization would allow for
> IPC to be cross platform and just about as fast.

The IPC uses sockets or (on Windows) named pipes.  Linux and Windows
are roughly equal in speed.  On a P4 2.5Ghz laptop one can retrieve an
element from a shared dict about 20,000 times/sec.  Not sure if that
qualifies as fast enough.

</plug>

Richard


From jcarlson at uci.edu  Mon Mar 26 01:15:51 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 25 Mar 2007 16:15:51 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <ac4216f0703251439w5aac6a2co3b9d58aed11dc138@mail.gmail.com>
References: <ac4216f0703251439w5aac6a2co3b9d58aed11dc138@mail.gmail.com>
Message-ID: <20070325155417.FCF3.JCARLSON@uci.edu>


"Richard Oudkerk" <r.m.oudkerk at googlemail.com> wrote:
> Josiah wrote:
> > Without work, #2 isn't "fast" using processes in Python.  It is trivial
> > using threads.  But here's the thing: with work, #2 can be made fast.
> > Using unix domain sockets (on linux, 3.4 ghz P4 Xeons, DDR2-PC4200
> > memory (you can get 50% faster memory nowadays)), I've been able to push
> > 400 megs/second between processes.  Maybe anonymous or named pipes, or
> > perhaps a shared mmap with some sort of synchronization would allow for
> > IPC to be cross platform and just about as fast.
> 
> The IPC uses sockets or (on Windows) named pipes.  Linux and Windows
> are roughly equal in speed.  On a P4 2.5Ghz laptop one can retrieve an
> element from a shared dict about 20,000 times/sec.  Not sure if that
> qualifies as fast enough.

Depends on what the element is, but I suspect it isn't fast enough. 
Fairly large native dictionaries seem to run on the order of 1.3 million
fetches/second on my 2.8 ghz machine.

    >>> import time
    >>> d = dict.fromkeys(xrange(65536))
    >>> if 1:
    ...     t = time.time()
    ...     for j in xrange(1000000):
    ...             _ = d[j&65535]
    ...     print 1000000/(time.time()-t)
    ...
    1305482.97346
    >>>

But really, transferring little bits of data back and forth isn't what
is of my concern in terms of speed.  My real concern is transferring
nontrivial blocks of data; I usually benchmark blocks of sizes: 1k, 4k,
16k, 64k, 256k, 1M, 4M, 16M, and 64M. Those are usually pretty good to
discover the "sweet spot" for a particular implementation, and also
allow a person to discover whether or not their system can be used for
nontrivial processor loads.


 - Josiah



From gsakkis at rutgers.edu  Mon Mar 26 02:07:13 2007
From: gsakkis at rutgers.edu (George Sakkis)
Date: Sun, 25 Mar 2007 20:07:13 -0400
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <20070325155417.FCF3.JCARLSON@uci.edu>
References: <ac4216f0703251439w5aac6a2co3b9d58aed11dc138@mail.gmail.com>
	<20070325155417.FCF3.JCARLSON@uci.edu>
Message-ID: <91ad5bf80703251707s220d4346u3e81737ea8f1a162@mail.gmail.com>

On 3/25/07, Josiah Carlson <jcarlson at uci.edu> wrote:

> But really, transferring little bits of data back and forth isn't what
> is of my concern in terms of speed.  My real concern is transferring
> nontrivial blocks of data; I usually benchmark blocks of sizes: 1k, 4k,
> 16k, 64k, 256k, 1M, 4M, 16M, and 64M. Those are usually pretty good to
> discover the "sweet spot" for a particular implementation, and also
> allow a person to discover whether or not their system can be used for
> nontrivial processor loads.

Not directly relevant to the discussion, but I attended recently a
talk from the main developer of STXXL (http://stxxl.sourceforge.net/),
an STL-compatible library for handling huge volumes of data. The keys
to efficient processing are support for parallel disks, explicit
overlapping between I/O and computation, and I/O pipelining. More
details are available at
http://i10www.ira.uka.de/dementiev/stxxl/report/.

George


From r.m.oudkerk at googlemail.com  Tue Mar 27 01:24:37 2007
From: r.m.oudkerk at googlemail.com (Richard Oudkerk)
Date: Tue, 27 Mar 2007 00:24:37 +0100
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <20070325155417.FCF3.JCARLSON@uci.edu>
References: <ac4216f0703251439w5aac6a2co3b9d58aed11dc138@mail.gmail.com>
	<20070325155417.FCF3.JCARLSON@uci.edu>
Message-ID: <ac4216f0703261624lc23814fsba0cbaf146847f9f@mail.gmail.com>

On 26/03/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> But really, transferring little bits of data back and forth isn't what
> is of my concern in terms of speed.  My real concern is transferring
> nontrivial blocks of data; I usually benchmark blocks of sizes: 1k, 4k,
> 16k, 64k, 256k, 1M, 4M, 16M, and 64M. Those are usually pretty good to
> discover the "sweet spot" for a particular implementation, and also
> allow a person to discover whether or not their system can be used for
> nontrivial processor loads.

The "20,000 fetches/sec" was just for retreving a "small"
object (an integer), so it only really reflects the server
overhead.  (Sending integer objects directly between processes
is maybe 6 times faster.)

Fetching string objects of particular sizes from a shared dict gives
the following results on the same computer:

string size  fetches/sec  throughput
-----------  -----------  ----------
  1 KB         15,000       15 MB/s
  4 KB         13,000       52 MB/s
 16 KB          8,500      130 MB/s
 64 KB          1,800      110 MB/s
256 KB            196       49 MB/s
  1 MB             50       50 MB/s
  4 MB             13       52 MB/s
 16 MB              3.2     51 MB/s
 64 MB              0.84    54 MB/s


From jcarlson at uci.edu  Tue Mar 27 06:09:04 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 26 Mar 2007 21:09:04 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <ac4216f0703261624lc23814fsba0cbaf146847f9f@mail.gmail.com>
References: <20070325155417.FCF3.JCARLSON@uci.edu>
	<ac4216f0703261624lc23814fsba0cbaf146847f9f@mail.gmail.com>
Message-ID: <20070326210239.FCFF.JCARLSON@uci.edu>


"Richard Oudkerk" <r.m.oudkerk at googlemail.com> wrote:
> On 26/03/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> > But really, transferring little bits of data back and forth isn't what
> > is of my concern in terms of speed.  My real concern is transferring
> > nontrivial blocks of data; I usually benchmark blocks of sizes: 1k, 4k,
> > 16k, 64k, 256k, 1M, 4M, 16M, and 64M. Those are usually pretty good to
> > discover the "sweet spot" for a particular implementation, and also
> > allow a person to discover whether or not their system can be used for
> > nontrivial processor loads.
> 
> The "20,000 fetches/sec" was just for retreving a "small"
> object (an integer), so it only really reflects the server
> overhead.  (Sending integer objects directly between processes
> is maybe 6 times faster.)

That's a positive sign.


> Fetching string objects of particular sizes from a shared dict gives
> the following results on the same computer:

Those numbers look pretty good.  Would I be correct in assuming that
there is a speedup sending blocks directly between processes? (though
perhaps not the 6x that integer sending gains)

I will definitely have to dig deeper; this could be the library that
we've been looking for.


 - Josiah



From lists at cheimes.de  Tue Mar 27 20:52:57 2007
From: lists at cheimes.de (Christian Heimes)
Date: Tue, 27 Mar 2007 20:52:57 +0200
Subject: [Python-ideas] Replace time_t by Py_time_t
Message-ID: <eubp8d$p9m$1@sea.gmane.org>

In the thread "datetime module enhancements" Guido and others said that
it is unpythonic to limit timestamp (seconds since Epoch) to an signed
int with 32bit.

http://permalink.gmane.org/gmane.comp.python.devel/86750

I've made a patch that introduces a new type Py_time_t as a first step
to increase the size of time stamps. For now it's just an alias for time_t.

  typedef time_t Py_time_t;

I'm proposing to change time_t in two steps:

Python 2.6:
Replace every occurrence of time_t by Py_time_t and give third-party
authors time to change their software

Python 2.7 / 3000:
Change Py_time_t to a signed 64bit int on all platforms and provide the
necessary workaround for platforms with a 32bit time_t.

Patch: http://python.org/sf/1689402

Christian



From r.m.oudkerk at googlemail.com  Wed Mar 28 02:36:32 2007
From: r.m.oudkerk at googlemail.com (Richard Oudkerk)
Date: Wed, 28 Mar 2007 01:36:32 +0100
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <20070326210239.FCFF.JCARLSON@uci.edu>
References: <20070325155417.FCF3.JCARLSON@uci.edu>
	<ac4216f0703261624lc23814fsba0cbaf146847f9f@mail.gmail.com>
	<20070326210239.FCFF.JCARLSON@uci.edu>
Message-ID: <ac4216f0703271736r3d7a616dq980384f5e03237ab@mail.gmail.com>

On 27/03/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> Those numbers look pretty good.  Would I be correct in assuming that
> there is a speedup sending blocks directly between processes? (though
> perhaps not the 6x that integer sending gains)

Yes, sending blocks directly between processes is over 3 times faster
for 1k blocks, and twice as fast for 4k blocks, but after that it makes
little difference.  (This is using the 'processing.connection'
sub-package which is partly written in C.)

Of course, since these blocks are string data you can avoid the pickle
translation, which makes things faster still: the peak bandwidth I
get is 40,000 x 16KB blocks / sec = 630 MB/s.

PS.  It would be nice if the standard library had support for sending
message-oriented data over a connection, so that you could just do
'recv()' and 'send()' without worrying about whether the whole message
was successfully read/written.  You can use 'socket.makefile()' for
line-oriented text messages but not for binary data.


From jcarlson at uci.edu  Wed Mar 28 03:14:13 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 27 Mar 2007 18:14:13 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <ac4216f0703271736r3d7a616dq980384f5e03237ab@mail.gmail.com>
References: <20070326210239.FCFF.JCARLSON@uci.edu>
	<ac4216f0703271736r3d7a616dq980384f5e03237ab@mail.gmail.com>
Message-ID: <20070327180152.FD0D.JCARLSON@uci.edu>


"Richard Oudkerk" <r.m.oudkerk at googlemail.com> wrote:
> On 27/03/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> > Those numbers look pretty good.  Would I be correct in assuming that
> > there is a speedup sending blocks directly between processes? (though
> > perhaps not the 6x that integer sending gains)
> 
> Yes, sending blocks directly between processes is over 3 times faster
> for 1k blocks, and twice as fast for 4k blocks, but after that it makes
> little difference.  (This is using the 'processing.connection'
> sub-package which is partly written in C.)

I'm surprised that larger objects see little gain from the removal of an
encoding/decoding step and an extra transfer.


> Of course, since these blocks are string data you can avoid the pickle
> translation, which makes things faster still: the peak bandwidth I
> get is 40,000 x 16KB blocks / sec = 630 MB/s.

Very nice.


> PS.  It would be nice if the standard library had support for sending
> message oriented data over a connection so that you could just do
> 'recv()' and 'send()' without worrying about whether the whole message
> was successfully read/written.  You can use 'socket.makefile()' for
> line oriented text messages but not for binary data.

Well, there's also the problem that sockets, files, and pipes behave
differently on Windows.

If one is only concerned about sockets, there are various lightly
defined protocols that can be simply implemented on top of
asyncore/asynchat; among them is sending a 32-bit length field in
network-endian order, followed immediately by the data.
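
A minimal sketch of that framing over a plain socket (the helper names
are made up):

    import struct

    def send_msg(sock, data):
        # 32-bit network-endian length header, then the payload.
        sock.sendall(struct.pack('!I', len(data)) + data)

    def recv_msg(sock):
        (size,) = struct.unpack('!I', _recv_exact(sock, 4))
        return _recv_exact(sock, size)

    def _recv_exact(sock, n):
        chunks = []
        while n:
            chunk = sock.recv(n)
            if not chunk:
                raise EOFError('connection closed mid-message')
            chunks.append(chunk)
            n -= len(chunk)
        return ''.join(chunks)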

Taking some methods and tossing them into a synchronous sockets package
wouldn't be terribly difficult (I've done a variant of this for a
commercial project).  Doing this generally may not find support, as my
idea of sharing encoding/decoding/internal state transitions/etc. between
sync and async servers was shot down at least a year ago.


 - Josiah



From r.m.oudkerk at googlemail.com  Thu Mar 29 01:10:21 2007
From: r.m.oudkerk at googlemail.com (Richard Oudkerk)
Date: Thu, 29 Mar 2007 00:10:21 +0100
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <20070327180152.FD0D.JCARLSON@uci.edu>
References: <20070326210239.FCFF.JCARLSON@uci.edu>
	<ac4216f0703271736r3d7a616dq980384f5e03237ab@mail.gmail.com>
	<20070327180152.FD0D.JCARLSON@uci.edu>
Message-ID: <ac4216f0703281610k228c0388x5fa09f140e4c8ccf@mail.gmail.com>

On 28/03/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> Well, there's also the problem that sockets, files, and pipes behave
> differently on Windows.

Windows named pipes have a native message mode.

> If one is only concerned about sockets, there are various lightly
> defined protocols that can be simply implemented on top of
> asyncore/asynchat, among them is the sending of a 32 bit length field in
> network-endian order, followed by the data to be sent immediately
> afterwards.

That's exactly what I was doing.


From greg.ewing at canterbury.ac.nz  Thu Mar 29 04:27:10 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 29 Mar 2007 14:27:10 +1200
Subject: [Python-ideas] Thread IPC idea: Quipe? Sockqueue? (Re: Python and
	Concurrency)
In-Reply-To: <46062BC9.2020208@acm.org>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org>
Message-ID: <460B23FE.2010309@canterbury.ac.nz>

I was thinking about thread killing, and why we
think it's okay to kill OS processes but not
threads.

Killing an OS process is unlikely to cause other
processes with which it's communicating to hang,
since closing one end of a pipe or socket causes
anything on the other end to get EOF on reading
or a signal or error on writing.

But the way threads usually communicate, using
locks and queues, means that it's easy for a
thread to get hung up waiting for something that
another thread is supposed to do, but won't,
because it's been killed.

So I'm wondering if we want something for inter-
thread communication that works something like a
cross between a queue and a pipe. It knows what
threads are connected to it, and if all threads
on one end exit or get killed, threads on the
other end find out about it.

We could call it a Quipe or Sockqueue or Quocket
(or if we want to be boring, a Channel).

This would probably have to be hooked into the
threads implementation at a low level, since it
would need to be able to detect the death of
a thread by any means, without relying on any
cooperation from the user's code.
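
(Short of such low-level hooks, a pure-Python sketch can only poll the
peer's liveness; 'Channel' below is a made-up, one-directional
approximation of the idea.)

    import Queue
    import threading

    class Channel(object):
        # A queue that raises EOFError when the producing thread has
        # died and there is nothing left to read.
        def __init__(self, producer):
            self._q = Queue.Queue()
            self._producer = producer      # a threading.Thread
        def put(self, item):
            self._q.put(item)
        def get(self, poll=0.1):
            while True:
                try:
                    return self._q.get(timeout=poll)
                except Queue.Empty:
                    if not self._producer.isAlive():
                        raise EOFError('producer thread has exited')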

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+


From jimjjewett at gmail.com  Thu Mar 29 15:29:16 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 29 Mar 2007 09:29:16 -0400
Subject: [Python-ideas] Thread IPC idea: Quipe? Sockqueue? (Re: Python
	and Concurrency)
In-Reply-To: <460B23FE.2010309@canterbury.ac.nz>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org>
	<460B23FE.2010309@canterbury.ac.nz>
Message-ID: <fb6fbf560703290629q3f7b7b59wf168b78f8867473e@mail.gmail.com>

On 3/28/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> I was thinking about thread killing, and why we
> think it's okay to kill OS processes but not
> threads.

[Suggestion for a Queue variant that knows when one end is dead]

I think the bigger problem is when threads don't restrict themselves
to queues, but just use the same memory directly.  If a thread dies in
the middle of an "atomic" action, other threads will see corrupted
memory.  If a process dies, then nobody else would be able to see the
memory anyhow.

What we really need is a Task object that treats shared memory
(perhaps with a small list of specified exceptions) as immutable.

If you're willing to rely on style guidelines, then you can already
get this today.

If you want safety, and efficiency ... that may be harder to do as an addon.

-jJ


From jcarlson at uci.edu  Thu Mar 29 18:32:02 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 29 Mar 2007 09:32:02 -0700
Subject: [Python-ideas] Thread IPC idea: Quipe? Sockqueue? (Re: Python
	and Concurrency)
In-Reply-To: <460B23FE.2010309@canterbury.ac.nz>
References: <46062BC9.2020208@acm.org> <460B23FE.2010309@canterbury.ac.nz>
Message-ID: <20070329092625.FD40.JCARLSON@uci.edu>


Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> I was thinking about thread killing, and why we
> think it's okay to kill OS processes but not
> threads.

I don't know how useful the feature would be (I've not had this
particular issue before), but one implementation strategy would be to
use thread-local storage and weak references to the incoming queues of
other threads.  Getting queues into the proper thread-local storage
between two threads is a little trickier when you want it done
automatically, but a couple of lines of boilerplate fix that.
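
A made-up sketch of that strategy; it relies on the dead thread's local
storage being collected, at which point the weak reference goes stale:

    import Queue
    import threading
    import weakref

    _local = threading.local()

    def my_queue():
        # Each thread lazily owns one incoming queue in its TLS.
        if not hasattr(_local, 'queue'):
            _local.queue = Queue.Queue()
        return _local.queue

    def sender_for(incoming):
        # Hold only a weak reference to the peer's incoming queue;
        # once the peer thread dies and its locals are collected,
        # sends start failing instead of silently piling up.
        ref = weakref.ref(incoming)
        def send(item):
            q = ref()
            if q is None:
                raise EOFError('peer queue has gone away')
            q.put(item)
        return send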

 - Josiah



From rrr at ronadam.com  Thu Mar 29 21:07:21 2007
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 29 Mar 2007 14:07:21 -0500
Subject: [Python-ideas] Thread IPC idea: Quipe? Sockqueue? (Re: Python
 and Concurrency)
In-Reply-To: <fb6fbf560703290629q3f7b7b59wf168b78f8867473e@mail.gmail.com>
References: <4602354F.1010003@acm.org>
	<46062BC9.2020208@acm.org>	<460B23FE.2010309@canterbury.ac.nz>
	<fb6fbf560703290629q3f7b7b59wf168b78f8867473e@mail.gmail.com>
Message-ID: <460C0E69.9060007@ronadam.com>

Jim Jewett wrote:
> On 3/28/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>> I was thinking about thread killing, and why we
>> think it's okay to kill OS processes but not
>> threads.
> 
> [Suggestion for a Queue variant that knows when one end is dead]
> 
> I think the bigger problem is when threads don't restrict themselves
> to queues, but just use the same memory directly.  If a thread dies in
> the middle of an "atomic" action, other threads will see corrupted
> memory.  If a process dies then, nobody else would be able to see the
> memory anyhow.
> 
> What we really need is a Task object that treats shared memory
> (perhaps with a small list of specified exceptions) as immutable.

I think a set of all of these as tools would be good.  They are all 
different parts of the same elephant.  And I don't see why it needs to be a 
single unified thing.

* A 'better' task object for easily creating tasks.
      + We have a threading object now. (Needs improving.)

* A message mechanism (Que-pipe) for getting the status of a task.
      + In and out message queues for communicating with a thread.
      + A way to wait on task events (messages) nicely.
      + A way for exceptions to propagate out of task objects.

* Shared memory -
      + Prevent names from being rebound
      + Prevent objects from being altered

(It isn't just about objects being immutable, but also about names not 
being rebound to other objects.  Unless you want to pass object references 
for every object to every task?  Or you trust that any imported code will 
play nice.)


Would the following be a good summary of the issues?

(Where the terms used mean)

     frozen: object can't be altered while frozen
     locked: name can't be rebound to another object


Threading issues:  (In order of difficulty)

     1. Pass immutable objects back and forth.
           + Works now.

     2. Share immutable objects by-ref.
           + Works now.

     3. Pass mutable "deep" copies back and forth.
           ? Works now. (but not for all objects?)

     4. Pass frozen mutable objects.
           - Needs freezable/unfreezable mutable objects.
             (Not the same as making an immutable copy.)

     5. Share immutable object in a shared name space.
           - Needs name locking.

     6. Share "frozen" mutable objects in shared name space.
           - Needs name locking
           - Needs freezable mutable objects

     7. Pass mutable objects.
           - Has conflicts with shared access.

     8. Share mutable object by-ref.
           - Has conflicts. (same as #7.)

     9. Share mutable object in shared name space.
           - Needs name locking.
           - Has conflicts.


If we can make the first 6 of these work, that may be enough.  7, 8 and 9 
have to do with race conditions and other simultaneous data access issues.

Name locking might work like this: Doing 'lock <name>' and 'unlock <name>' 
could just move the name to and from a locked name space in the same scope. 
Attempting to rebind a name while it's in a locked name space would raise 
an exception.

The point is, rather than attaching a lock to each individual name, it may 
be much easier and faster, and save memory, to have a new name space for 
this.  You could also pass a locked-name-space reference to a task all at 
once, like we pass locals or globals now.

It doesn't identify who locked what, so unlocking a name used by a thread 
would be in the "if it hurts, don't do that" category.  If locking or 
unlocking a name in outside scopes is disallowed, then knowing who locked 
what won't be a problem.
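
A sketch of the locked-name-space idea as a plain object, with
lock/unlock as methods rather than keywords (all names made up):

    class LockableNamespace(object):
        # Locked names move into a separate mapping; rebinding one
        # raises instead of silently succeeding.
        def __init__(self):
            self.__dict__['_locked'] = {}
        def __setattr__(self, name, value):
            if name in self.__dict__['_locked']:
                raise AttributeError("'%s' is locked" % name)
            self.__dict__[name] = value
        def __getattr__(self, name):
            # Reached only when normal lookup fails, i.e. locked names.
            try:
                return self.__dict__['_locked'][name]
            except KeyError:
                raise AttributeError(name)
        def lock(self, name):
            self.__dict__['_locked'][name] = self.__dict__.pop(name)
        def unlock(self, name):
            self.__dict__[name] = self.__dict__['_locked'].pop(name)

    # ns = LockableNamespace()
    # ns.x = 1; ns.lock('x')
    # ns.x = 2   --> AttributeError: 'x' is locked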


Freezing would be some way of preventing an object from being changed.  It 
isn't concerned with the name it is referenced by, and it should not 
require copying the object.  Part of that may be made easier by locking 
names. (?)

Frozen objects may be useful in other ways besides threading.(?)

Names locked to immutable objects act like constants, so they may have 
other uses as well.


Cheers,
    Ron




> If you're willing to rely on style guidelines, then you can already
> get this today.
> 
> If you want safety, and efficiency ... that may be harder to do as an addon.




From jimjjewett at gmail.com  Fri Mar 30 21:49:01 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 30 Mar 2007 15:49:01 -0400
Subject: [Python-ideas] Thread IPC idea: Quipe? Sockqueue? (Re: Python
	and Concurrency)
In-Reply-To: <460C0E69.9060007@ronadam.com>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org>
	<460B23FE.2010309@canterbury.ac.nz>
	<fb6fbf560703290629q3f7b7b59wf168b78f8867473e@mail.gmail.com>
	<460C0E69.9060007@ronadam.com>
Message-ID: <fb6fbf560703301249r7b3352f5j492803cf65c6d36@mail.gmail.com>

On 3/29/07, Ron Adam <rrr at ronadam.com> wrote:
> Jim Jewett wrote:
> > What we really need is a Task object that treats shared memory
> > (perhaps with a small list of specified exceptions) as immutable.

> * A 'better' task object for easily creating tasks.
>       + We have a threading object now. (Needs improving.)

But the task isn't in any way restricted.  Brett's security sandbox
might be a useful starting point, if it is fast enough.  Otherwise,
we'll probably need to stick with microthreading to get things small
enough to contain.

> * Shared memory -
>       + Prevent names from being rebound
>       + Prevent objects from being altered

I had thought of the names as being part of a shared dictionary.  (Of
course, immutable dictionaries aren't really available out-of-the-box
now, and I'm not sure I would trust the supposed immutability of
anything that wasn't proxied.)
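
(For example, a proxied "immutable" mapping is only a few lines;
'FrozenDictProxy' is made up, and it is only as trustworthy as the
absence of other references to the underlying dict.)

    class FrozenDictProxy(object):
        # A read-only view: deliberately no __setitem__ or other
        # mutating methods, and no direct handle on the dict.
        def __init__(self, mapping):
            self._mapping = dict(mapping)   # private copy, to be safe
        def __getitem__(self, key):
            return self._mapping[key]
        def __contains__(self, key):
            return key in self._mapping
        def __len__(self):
            return len(self._mapping)
        def keys(self):
            return self._mapping.keys()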

>      frozen: object can't be altered while frozen
>      locked: name can't be rebound to another object


>      3. Pass mutable "deep" copies back and forth.
>            ? Works now. (but not for all objects?)

Well, anything that can be deep-copied -- unless you also want the
mutations to be collected back into a single location.

>      4. Pass frozen mutable objects.
>            - Needs freezable/unfreezable mutable objects.
>              (Not the same as making an immutable copy.)

And there is where it starts to fall apart.  Though if you look at the
pypy dict and interpreter optimizations, they have started to deal
with it through versioning types.

http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#multi-dicts
http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#id23

-jJ