From adam at  Sat Mar  1 23:24:37 2008
From: adam at (Adam Atlas)
Date: Sat, 1 Mar 2008 17:24:37 -0500
Subject: [Python-ideas] List-comprehension-like extensions in normal for
How about letting ordinary for loops contain multiple "for"s and  
optionally "if"s like in list comprehensions and generator  
expressions? For example, on a site I was just looking at, it had the
line:
     for record in [ r for r in db if == 'pierre' ]:
which could instead be:
     for record in db if == 'pierre':

And of course there could be much more complex ones too. I know you  
could just do "for x in <some generator expression>" and get the same  
effect and roughly the same speed, but I think this looks a lot nicer,  
and it would make sense to have this sort of consistency across the  
multiple contexts in which "for" can be used.
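
For comparison, here is a minimal sketch of the two spellings that already exist and that the proposed syntax would shorten. The Record type, its name and age fields, and the sample data are hypothetical stand-ins, since the attribute being tested is not shown in the example above. It is written for a current Python 3 interpreter.

```python
from collections import namedtuple

# Hypothetical record type and data, used only for illustration.
Record = namedtuple("Record", "name age")
db = [Record("pierre", 30), Record("marie", 25), Record("pierre", 41)]

# Spelling 1: loop over a generator expression that does the filtering.
via_genexp = []
for record in (r for r in db if r.name == "pierre"):
    via_genexp.append(record.age)

# Spelling 2: an ordinary loop with a nested if.
via_nested = []
for record in db:
    if record.name == "pierre":
        via_nested.append(record.age)
```

Both loops visit the same records in the same order; the proposal would only add a third way of writing the filter.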

From eyal.lotem at  Sat Mar  1 23:57:08 2008
From: eyal.lotem at (Eyal Lotem)
Date: Sun, 2 Mar 2008 00:57:08 +0200
Subject: [Python-ideas] List-comprehension-like extensions in normal for
How about just nesting the for's/if's?

for record in db:
  if == 'pierre':
    ...

Isn't that the one obvious way to do it?

On Sun, Mar 2, 2008 at 12:24 AM, Adam Atlas <adam at> wrote:
> How about letting ordinary for loops contain multiple "for"s and
>  optionally "if"s like in list comprehensions and generator
>  expressions? For example, on a site I was just looking at, it had the
>  line:
>      for record in [ r for r in db if == 'pierre' ]:
>  which could instead be:
>      for record in db if == 'pierre':
>  And of course there could be much more complex ones too. I know you
>  could just do "for x in <some generator expression>" and get the same
>  effect and roughly the same speed, but I think this looks a lot nicer,
>  and it would make sense to have this sort of consistency across the
>  multiple contexts in which "for" can be used.
From grantgm at  Sun Mar  2 23:02:31 2008
From: grantgm at (Gabriel Grant)
Date: Sun, 2 Mar 2008 17:02:31 -0500
Subject: [Python-ideas] Restartable Threads
Message-ID: <>

Hi everyone,

Why is it that threads can't be restarted?

I hope this is the right place for this discussion. If this has been
(or should be) discussed somewhere else, I apologize: my searches for
"restart thread" and similar only turned up statements that restarting
threads is impossible, which haven't satiated my curiosity.

Is there any fundamental reason why this can't (or shouldn't) be done?
If not, what would you think of making thread restartability an

For those who are wondering why I might wish to condemn myself by
using threads at all (rather than, say, subprocesses), never mind
threads that can be restarted while maintaining state, my use case is
as follows:

I am performing low-level hardware control through a C API that I have
wrapped with ctypes. The main "run"-type C function takes a pointer to
a struct in which it stores information about its state. This function
blocks while the system is running, but also needs access to the
shared memory space, and thus has to be executed in its own thread (I
believe - other options welcomed). In order to signal events, the
function returns with a signal code. Once the signal has been dealt
with, the API specifies that the same C function call be made, passing
the pointer to the original struct, so that the system can resume
operation where it left off.

I'm sure this could be done using a standard thread (although I
haven't actually done it) with something like:

def myloop():
    while not self.ret == 0:
        self.ret = sharedLib.blocking_call(self.c_state_struct)

... do some things ...
... deal with the signal ...

Or some such ugliness, but it seemed to me that the most natural
implementation of such a system would be something more like:

class myThread(Thread):
    def __init__(self):
        Thread.__init__(self)  # the base class must be initialized first
        self.c_state_struct = structMaker()
    def run(self):
        self.ret = sharedLib.blocking_call(self.c_state_struct)

which would then be executed with:

>>> t = myThread()
>>> t.start()
... do some other stuff ...
>>> t.join()
>>> signal_handle(t.ret)  # deal with the returned value
>>> t.start()                     # resume operation

However this is impossible, since a thread's start() method can only
be called once (as explained in [1], [2] and [3] python2.5 raises an
assertion error, although as of rev 55785 this has been changed to a
RuntimeError). What I have been unable to find explained, however, is
why this should/needs to be the case.
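
The restriction can be reproduced with a short sketch like the one below. It assumes a current Python 3 interpreter, where the second start() raises RuntimeError; the work function is a placeholder introduced only for this illustration.

```python
import threading

def work():
    # Placeholder body; a real thread would do something useful here.
    pass

t = threading.Thread(target=work)
t.start()
t.join()

# A second start() on the same Thread object is rejected.
try:
    t.start()
except RuntimeError as exc:
    second_start_error = exc
else:
    second_start_error = None
```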

To see if I could get around this limitation, I initially hacked this together:

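class myThread(Thread):
	def __init__(self):
		Thread.__init__(self)
		self.i = 1
	def start(self):
		# re-run Thread.__init__() so that start() is accepted again
		Thread.__init__(self)
		Thread.start(self)
	def run(self):
		print self.i
		self.i += 1
		return self.i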

to be used as:

>>> t = myThread()	
>>> t.start()
>>> t.join()
>>> t.start()
>>> t.join()

Obviously it is not usable in the general case, since it completely
clobbers the thread's internal state through the repeated __init__()s,
but one could certainly imagine a more delicate implementation that
saves the relevant bits and pieces, while resetting those that need

With that in mind, I had a look into and, not immediately
seeing any reason this couldn't be done, implemented essentially that
functionality. The attached patch is implemented against
from trunk. I've also uploaded a patched copy of my that
can be used with python2.5 to [4], if anyone needs that.

In order to maintain complete backward compatibility, I've left the
default behaviour to have threads behave as they do today, but by
initializing them with "restartable=True", start() can be called
repeatedly. For example:

class Counter(Thread):
	def run(self):
		if not hasattr(self, "count"):
			self.count = 0
		# increment on every run, not only the first one
		self.count += 1

could be used with:

>>> t = Counter(restartable=True)
>>> t.start()
>>> t.join()
>>> print t.count
>>> t.start()
>>> t.join()
>>> print t.count

If an attempt is made to restart the thread while it is executing, it
still raises a RuntimeError, which I think makes sense:

>>> t = LongThread(restartable=True)
>>> t.start()
>>> t.start()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "", line 441, in start
    raise RuntimeError("thread already started")
RuntimeError: thread already started

So this _seems_ to work, but I have to admit, I'm somewhat afraid to
use it. I can't help but wonder: is it safe, or is it tempting the
Gods of parallelism to inflict sudden, multi-threaded death?

Less superstitious opinions than my own would be greatly appreciated.



Note: In addition to the patch, I have attached a few usage
examples/test cases, that I should really make into actual unit tests.
Some of these are expected to fail, so the file can't be executed
directly - the examples should be run in an interpreter.

From santagada at  Sun Mar  2 23:11:57 2008
From: santagada at (Leonardo Santagada)
Date: Sun, 2 Mar 2008 19:11:57 -0300
Subject: [Python-ideas] Restartable Threads
In-Reply-To: <>
References: <>
Message-ID: <>

Sorry if I missed it in your email, but why can't you just create
another thread object before each start call?

I think the only objection to restarting a thread would be that the idea
is that each thread object represents a thread... but I might be
completely wrong.

Leonardo Santagada

From grantgm at  Mon Mar  3 00:16:03 2008
From: grantgm at (Gabriel Grant)
Date: Sun, 2 Mar 2008 18:16:03 -0500
Subject: [Python-ideas] Restartable Threads
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Mar 2, 2008 at 5:11 PM, Leonardo Santagada <santagada at> wrote:
> Sorry if I missed it from you email,
I know the message was rather long. Sorry about that.

> but why cant you just create
>  another thread object before each start call?

The state of the thread needs to be preserved from start() to start()
because the C function needs to be passed the same object each time it
is called.

The state could be maintained by creating the persistent object in the
parent thread and passing it to a new child thread before each call,
but for a few reasons this feels wrong:

It seems to me that this would break encapsulation - objects exist for
the purpose of carrying state. They shouldn't rely on their parent to
do that for them.
Doing so would muck up the parent, especially once there are a)
multiple child threads and b) multiple state-carrying objects that
need to be maintained within each thread.

Also, from a more conceptual point of view, the C function basically
represents a single, restartable process, so it seems it should be
packaged and used as such. When the function returns, it is more akin
to a synchronization point between threads than stopping one and
creating then starting another.

Hopefully that clarifies my thinking a bit (or at least doesn't muddy
the waters any further :)

>  I think the only objection to restart a thread would be that the idea
>  is that each thread object represents a thread... but I might be
>  completely wrong.

And that may be a valid objection, although the lifetime of the Thread
object does not directly correspond with that of the thread it wraps.
The thread is created upon calling start(), and dies when run()
returns. The way it is implemented, the Thread object is more of a
thread creator and controller than a physical thread. Otherwise, I
would think it should disappear after being join()ed. It seems to me
that these objects represent a more palatable abstraction of the
physical thread...but I might (also :) be completely wrong.

Given that we accept (enjoy, even?) some level of abstraction on top
of physical threads (for instance we start them after they have been
initialized, and we check whether they are running, not whether they
exist), it seems reasonable to me that stopping and restarting these
conceptual threads should be possible. What do you think?

Thanks again for your consideration,


From josiah.carlson at  Mon Mar  3 01:32:10 2008
From: josiah.carlson at (Josiah Carlson)
Date: Sun, 2 Mar 2008 16:32:10 -0800
Subject: [Python-ideas] Restartable Threads
In-Reply-To: <>
References: <>
Message-ID: <>

My 2 cents from my 30 seconds of reading this email thread:
encapsulation shouldn't be done on the thread level, it should be done
on the object level.  Create an object that offers the behavior you
want to have (call it ThreadStarter or something), and give it a
'start_thread()' method that returns a thread handle from which you
can .join() as necessary.  This ThreadStarter object keeps references
to the necessary structures that you need to pass to the lower level
threads.  Or heck, this ThreadStarter could handle the .join()
dispatch, etc.  If you think about it for 5 minutes, I'm sure you
could implement it.
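
A minimal sketch of that suggestion, written for Python 3, might look like the following. The ThreadStarter name and its start_thread() method come from the paragraph above; everything else (the constructor arguments, the ret attribute, and fake_blocking_call, which stands in for the wrapped C function) is an assumption made for illustration.

```python
import threading

class ThreadStarter:
    """Holds the persistent state; every start_thread() uses a new Thread."""

    def __init__(self, blocking_call, state):
        self.blocking_call = blocking_call  # stand-in for the C "run" call
        self.state = state                  # stand-in for the C state struct
        self.ret = None

    def _run(self):
        # Runs in the child thread and records the returned signal code.
        self.ret = self.blocking_call(self.state)

    def start_thread(self):
        thread = threading.Thread(target=self._run)
        thread.start()
        return thread  # the caller can .join() this handle

def fake_blocking_call(state):
    # Hypothetical replacement for the C function: returns 1 ("signal")
    # twice, then 0 ("finished"), and keeps its progress in `state`.
    state["calls"] += 1
    return 0 if state["calls"] >= 3 else 1

starter = ThreadStarter(fake_blocking_call, {"calls": 0})
signals = []
while True:
    starter.start_thread().join()
    signals.append(starter.ret)  # this is where a signal would be handled
    if starter.ret == 0:
        break
```

The state object survives across runs because the ThreadStarter owns it, while each run gets a fresh thread and therefore a fresh stack.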

Also, while it isn't impossible to "restart threads" the way you
conceive of it, your way of conceiving of the "restart" is
fundamentally wrong.  Can you restart a process whose stack you've
thrown away?  Of course not.  You've thrown away the process/thread's
stack (which can be seen by the fact that you can .join() the thread),
so you aren't "restarting" the thread, you are creating a new thread
with a new stack with some of the same arguments to called functions.

 - Josiah

(this message does not mean that I'm going to be spending much time in
this list anymore, just that I saw this silly idea and had to comment)

On Sun, Mar 2, 2008 at 3:16 PM, Gabriel Grant <grantgm at> wrote:
> On Sun, Mar 2, 2008 at 5:11 PM, Leonardo Santagada <santagada at> wrote:
>  > Sorry if I missed it from you email,
>  I know the message was a rather long. Sorry about that.
>  > but why cant you just create
>  >  another thread object before each start call?
>  The state of the thread needs to be preserved from start() to start()
>  because the C function needs to be passed the same object each time it
>  is called.
>  The state could be maintained by creating the persistent object in the
>  parent thread and passing it to a new child thread before each call,
>  but for a few reasons this feels wrong:
>  It seems to me that this would break encapsulation - objects exist for
>  the purpose of carrying state. They shouldn't rely on their parent to
>  do that for them.
>  Doing so would muck up the parent, especially once there are a)
>  multiple child threads and b) multiple state-carrying objects that
>  need to be maintained within each thread.
>  Also, from a more conceptual point of view, the C function basically
>  represents a single, restartable process, so it seems it should be
>  packaged and used as such. When the function returns, it is more akin
>  to a synchronization point between threads than stopping one and
>  creating then starting another.
>  Hopefully that clarifies my thinking a bit (or at least doesn't muddy
>  the waters any further :)
>  >  I think the only objection to restart a thread would be that the idea
>  >  is that each thread object represents a thread... but I might be
>  >  completely wrong.
>  And that may be a valid objection, although the lifetime of the Thread
>  object does not directly correspond with that of the thread it wraps.
>  The thread is created upon calling start(), and dies when run()
>  returns. The way it is implemented, the Thread object is more of a
>  thread creator and controller, than a physical thread. Otherwise, I
>  would think it should disapear after being join()ed. It seem to me
>  that these objects represent a more palatable abstraction of the
>  physical thread...but I might (also :) be completely wrong.
>  Given that we accept (enjoy, even?) some level of abstraction on top
>  of physical threads (for instance we start them after they have been
>  initialized, and we check whether they are running, not whether they
>  exist), it seems reasonable to me that stopping and restarting these
>  conceptual threads should be possible. What do you think?
>  Thanks again for your consideration,
>  -Gabriel
From aahz at  Mon Mar  3 01:33:50 2008
From: aahz at (Aahz)
Date: Sun, 2 Mar 2008 16:33:50 -0800
Subject: [Python-ideas] Restartable Threads
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Mar 02, 2008, Gabriel Grant wrote:
> Why is it that threads can't be restarted?

That's an interesting question.  Unfortunately, the best person to answer
it isn't on this list (Tim Peters).  Generally speaking, the standard
answer is to have a worker thread that uses a Queue in a loop.
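
As a rough sketch of that pattern in Python 3 (where the module is named queue rather than Queue): one long-lived worker thread waits for requests and re-issues the blocking call each time. The worker function, the two queues, the None sentinel, and fake_blocking_call are all assumptions for this illustration, not part of any proposed API.

```python
import queue
import threading

def worker(requests, results, blocking_call, state):
    # One thread for the whole lifetime; each request resumes the call.
    while True:
        item = requests.get()
        if item is None:  # sentinel: shut the worker down
            break
        results.put(blocking_call(state))

def fake_blocking_call(state):
    # Hypothetical stand-in for the wrapped C function.
    state["calls"] += 1
    return state["calls"]

requests = queue.Queue()
results = queue.Queue()
state = {"calls": 0}
t = threading.Thread(target=worker,
                     args=(requests, results, fake_blocking_call, state))
t.start()

returned = []
for _ in range(3):
    requests.put("resume")          # ask the worker to run the call again
    returned.append(results.get())  # wait for its return value

requests.put(None)
t.join()
```

Here the thread is never restarted at all; the parent resumes the work by sending a message, and the state stays inside one thread.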

> So this _seems_ to work, but I have to admit, I'm somewhat afraid to
> use it. I can't help but wonder: is it safe, or is it tempting the
> Gods of parallelism to inflict sudden, multi-threaded death?

If I had to guess, I think it's just adding an unnecessary layer of
complexity to the existing Thread class.  Moreover, the existing
implementation prevents the following code:


which IMO definitely should be a bug.  If you want to try creating a
patch that adds a t.restart() method, I think it certainly wouldn't
hurt anything and would be a good way of getting feedback.
Aahz (aahz at           <*>

"All problems in computer science can be solved by another level of     
indirection."  --Butler Lampson

From josiah.carlson at  Mon Mar  3 01:35:02 2008
From: josiah.carlson at (Josiah Carlson)
Date: Sun, 2 Mar 2008 16:35:02 -0800
Subject: [Python-ideas] List-comprehension-like extensions in normal for
In-Reply-To: <>
References: <>
Message-ID: <>

Or even...

for record in (r for r in db == 'pierre'):

Hello generator expressions (available since Python 2.4).  But I'm
with Eyal, personally.

 - Josiah

On Sat, Mar 1, 2008 at 2:57 PM, Eyal Lotem <eyal.lotem at> wrote:
> How about just nesting the for's/if's?
>  for record in db:
>   if == 'pierre':
>     ...
>  Isn't that the one obvious way to do it?
>  On Sun, Mar 2, 2008 at 12:24 AM, Adam Atlas <adam at> wrote:
>  > How about letting ordinary for loops contain multiple "for"s and
>  >  optionally "if"s like in list comprehensions and generator
>  >  expressions? For example, on a site I was just looking at, it had the
>  >  line:
>  >      for record in [ r for r in db if == 'pierre' ]:
>  >  which could instead be:
>  >      for record in db if == 'pierre':
>  >
>  >  And of course there could be much more complex ones too. I know you
>  >  could just do "for x in <some generator expression>" and get the same
>  >  effect and roughly the same speed, but I think this looks a lot nicer,
>  >  and it would make sense to have this sort of consistency across the
>  >  multiple contexts in which "for" can be used.
From artomegus at  Wed Mar  5 03:36:36 2008
From: artomegus at (Anthony Tolle)
Date: Tue, 4 Mar 2008 21:36:36 -0500
Subject: [Python-ideas] new super redux (better late than never?)
Message-ID: <>

I was looking at the reference implementation in PEP 3135 (New Super),
and I was inspired to put together a slightly different implementation
that doesn't fiddle with bytecode.  I know that the new super() in
python 3000 doesn't follow the reference implementation in the PEP,
but the code intrigued me enough to offer up this little tidbit, which
can easily be used in python 2.5.

What I did was borrow the idea of using a metaclass to do a
post-definition fix-up on the methods, but added a new function
decorator called autosuper_method.  Like staticmethod or classmethod,
the decorator wraps the function using the non-data descriptor
protocol.

The method wrapped by the decorator will receive an extra implicit
argument (super) inserted before the instance argument (self).

One caveat about the decorator: it must be the first decorator in the
list (i.e. the outermost wrapper), or else the metaclass will not
recognize the wrapped function as an instance of the decorator class.

This implementation strikes me as more pythonic than the
spooky behavior of the new python 3000 super() built-in, and it is
more flexible because of the implicit argument design.  This allows
things like the ability to use the super argument in inner functions
without worrying about the 'first argument' assumption of python
3000's super().
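
The inner-function limitation can be illustrated with a small sketch, assuming the zero-argument super() as it behaves in a current Python 3 interpreter. The class names mirror the demo classes used in this post; C2 and the variable sup are introduced only for this illustration.

```python
class A:
    def f(self):
        return 'A'

class C(A):
    def f(self):
        def inner():
            # inner() takes no arguments, so zero-argument super()
            # has no "first argument" to treat as the instance.
            return 'C' + super().f()
        return inner()

class C2(A):
    def f(self):
        sup = super()  # resolve super in the method itself ...
        def inner():
            return 'C' + sup.f()  # ... and let the closure capture it
        return inner()

try:
    C().f()
except RuntimeError as exc:
    inner_error = exc
else:
    inner_error = None

captured = C2().f()
```

Capturing the super object in the enclosing method plays the same role as the explicit super argument described above.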

The implementation follows, which is also called autosuper in
deference to the original reference implementation.  It includes a
demonstration of some of its flexibility:


#!/usr/bin/env python

class autosuper_method(object):
    def __init__(self, func, cls=None):
        self.func = func
        self.cls = cls

    def __get__(self, obj, type=None):
        # return self if self.cls is not set yet
        if self.cls is None:
            return self

        if obj is None:
            # class binding - assume first argument is instance,
            # and insert superclass before it
            def newfunc(*args, **kwargs):
                if not len(args):
                    raise TypeError('instance argument missing')
                return self.func(super(self.cls, args[0]),
                                 *args,
                                 **kwargs)
        else:
            # instance binding - insert superclass as first
            # argument, and instance as second
            def newfunc(*args, **kwargs):
                return self.func(super(self.cls, obj),
                                 obj,
                                 *args,
                                 **kwargs)
        return newfunc

class autosuper_meta(type):
    def __init__(cls, name, bases, clsdict):
        # set cls attribute of all instances of autosuper_method
        for v in clsdict:
            o = getattr(cls, v)
            if isinstance(o, autosuper_method):
                o.cls = cls

class autosuper(object):
    __metaclass__ = autosuper_meta

if __name__ == '__main__':
    class A(autosuper):
        def f(self):
            return 'A'

    # Demo - standard use
    class B(A):
        @autosuper_method
        def f(super, self):
            return 'B' + super.f()

    # Demo - reference super in inner function
    class C(A):
        @autosuper_method
        def f(super, self):
            def inner():
                return 'C' + super.f()
            return inner()

    # Demo - define function before class definition
    @autosuper_method
    def D_f(super, self):
        return 'D' + super.f()

    class D(B, C):
        f = D_f

    # Demo - define function after class definition
    class E(B, C):
        pass
    # don't use @autosuper_method here!  The metaclass has already
    # processed E, so it won't be able to set the cls attribute
    def E_f(super, self):
        return 'E' + super.f()

    # instead, use the extended version of the decorator
    E.f = autosuper_method(E_f, E)

    d = D()
    assert d.f() == 'DBCA'      # Instance binding
    assert D.f(d) == 'DBCA'     # Class binding

    e = E()
    assert e.f() == 'EBCA'      # Instance binding
    assert E.f(e) == 'EBCA'     # Class binding


P.S. I know that using the word 'super' as an argument name might be
frowned upon, but I'm just copying what I've seen done in the standard
python library (e.g. using 'list' as a local variable name :).
Anyway, it doesn't really hurt anything unless you wanted to call the
original super() built-in from the decorated method, which would kind
of defeat the purpose.

P.P.S. Something like this might have been offered up already.  I've
been searching the mail list archives for a while, and found a few
references to using decorators, but didn't find any full
implementations.  This implementation also has the advantage of being
compatible with existing code.

From guido at  Wed Mar  5 04:24:35 2008
From: guido at (Guido van Rossum)
Date: Tue, 4 Mar 2008 19:24:35 -0800
Subject: [Python-ideas] new super redux (better late than never?)
In-Reply-To: <>
References: <>
Message-ID: <>

Ehhh! The PEP's "reference implementation" is useless and probably
doesn't even work. The actual implementation is completely different.
If you want to help, a rewrite of the PEP to match reality would be
most welcome!

On Tue, Mar 4, 2008 at 6:36 PM, Anthony Tolle <artomegus at> wrote:
> I was looking at the reference implementation in PEP 3135 (New Super),
>  and I was inspired to put together a slightly different implementation
>  that doesn't fiddle with bytecode.  I know that the new super() in
>  python 3000 doesn't follow the reference implementation in the PEP,
>  but the code intrigued me enough to offer up this little tidbit, which
>  can be easily be used in python 2.5.
>  What I did was borrow the idea of using a metaclass to do a
>  post-definition fix-up on the methods, but added a new function
>  decorator called autosuper_method.  Like staticmethod or classmethod,
>  the decorator wraps the function using the non-data descriptor
>  protocol.
>  The method wrapped by the decorator will receive an extra implicit
>  argument (super) inserted before the instance argument (self).
>  One caveat about the decorator: it must be the first decorator in the
>  list (i.e. the outermost wrapper), or else the metaclass will not
>  recognize the wrapped function as an instance of the decorator class.
>  I think this implementation strikes me as more pythonic than the
>  spooky behavior of the new python 3000 super() built-in, and it is
>  more flexible because of the implicit argument design.  This allows
>  things like the ability to use the super argument in inner functions
>  without worrying about the 'first argument' assumption of python
>  3000's super().
>  The implementation follows, which is also called autosuper in
>  deference to the original reference implementation.  It includes a
>  demonstration of some of its flexibility:
>  ------------------------------------------------------------
>  #!/usr/bin/env python
>  #
>  #
>  class autosuper_method(object):
>     def __init__(self, func, cls=None):
>         self.func = func
>         self.cls = cls
>     def __get__(self, obj, type=None):
>         # return self if self.cls is not set yet
>         if self.cls is None:
>             return self
>         if obj is None:
>             # class binding - assume first argument is instance,
>             # and insert superclass before it
>             def newfunc(*args, **kwargs):
>                 if not len(args):
>                     raise TypeError('instance argument missing')
>                 return self.func(super(self.cls, args[0]),
>                                  *args,
>                                  **kwargs)
>         else:
>             # instance binding - insert superclass as first
>             # argument, and instance as second
>             def newfunc(*args, **kwargs):
>                 return self.func(super(self.cls, obj),
>                                  obj,
>                                  *args,
>                                  **kwargs)
>         return newfunc
>  class autosuper_meta(type):
>     def __init__(cls, name, bases, clsdict):
>         # set cls attribute of all instances of autosuper_method
>         for v in clsdict:
>             o = getattr(cls, v)
>             if isinstance(o, autosuper_method):
>                 o.cls = cls
>  class autosuper(object):
>     __metaclass__ = autosuper_meta
>  if __name__ == '__main__':
>     class A(autosuper):
>         def f(self):
>             return 'A'
>     # Demo - standard use
>     class B(A):
>         @autosuper_method
>         def f(super, self):
>             return 'B' + super.f()
>     # Demo - reference super in inner function
>     class C(A):
>         @autosuper_method
>         def f(super, self):
>             def inner():
>                 return 'C' + super.f()
>             return inner()
>     # Demo - define function before class definition
>     @autosuper_method
>     def D_f(super, self):
>         return 'D' + super.f()
>     class D(B, C):
>         f = D_f
>     # Demo - define function after class definition
>     class E(B, C):
>         pass
>     # don't use @autosuper_method here!  The metaclass has already
>     # processed E, so it won't be able to set the cls attribute
>     def E_f(super, self):
>         return 'E' + super.f()
>     # instead, use the extended version of the decorator
>     E.f = autosuper_method(E_f, E)
>     d = D()
>     assert d.f() == 'DBCA'      # Instance binding
>     assert D.f(d) == 'DBCA'     # Class binding
>     e = E()
>     assert e.f() == 'EBCA'      # Instance binding
>     assert E.f(e) == 'EBCA'     # Class binding
>  ------------------------------------------------------------
>  P.S. I know that using the word 'super' as an argument name might be
>  frowned upon, but I'm just copying what I've seen done in the standard
>  python library (e.g. using 'list' as a local variable name :).
>  Anyway, it doesn't really hurt anything unless you wanted to call the
>  original super() built-in from the decorated method, which would kind
>  of defeat the purpose.
>  P.P.S. Something like this might have been offered up already.  I've
>  been searching the mail list archives for a while, and found a few
>  reference to using decorators, but didn't find any full
>  implementations.  This implementation also has the advantage of being
>  compatible with existing code.
--Guido van Rossum (home page:

From artomegus at  Wed Mar  5 08:49:02 2008
From: artomegus at (Anthony Tolle)
Date: Wed, 5 Mar 2008 02:49:02 -0500
Subject: [Python-ideas] new super redux (better late than never?)
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Mar 4, 2008 at 10:24 PM, Guido van Rossum <guido at> wrote:
> Ehhh! The PEP's "reference implementation" is useless and probably
>  doesn't even work. The actual implementation is completely different.
>  If you want to help, a rewrite of the PEP to match reality would be
>  most welcome!

Yep, I knew the actual implementation was completely different from
the reference implementation.  I was really just trying to offer a
different take on 'fixing' super, even though I know it is too late to
suggest this type of change for python 3000.  That's one reason I
refrained from posting in the python-3000 list.

I was enamored with the idea of passing the super object as an actual
parameter to the method that needs it.  Using a decorator with
descriptor behavior (like staticmethod or classmethod) seemed the best
way to do this.

The only downside is that my implementation depends on using a
metaclass to fix up the decorator objects after the class definition
is completed (or catching assignment to class attributes after the
class definition).

It would be nice if the decorator class could be self-contained
without depending on an associated metaclass.  However, the __get__
method of the decorator would have to dynamically determine the class
that the wrapped function belongs to.  Since functions can be defined
outside of a class and then arbitrarily assigned to a class attribute
(or even multiple classes!), this seems to be difficult.  In fact, the
code in my previous post has a bug related to this.
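
For readers on a newer interpreter: Python 3.6 and later call a descriptor's __set_name__ hook when a class body is executed, which gives one way to drop the metaclass. The following is only a sketch under that assumption, not the implementation discussed in this post. The parameter is named sup here instead of super, and the class-binding branch takes the instance as an explicit first argument.

```python
class autosuper_method:
    def __init__(self, func):
        self.func = func
        self.cls = None

    def __set_name__(self, owner, name):
        # Called once per class body in which this object is assigned.
        self.cls = owner

    def __get__(self, obj, objtype=None):
        if self.cls is None:
            return self
        if obj is None:
            # class binding: the caller passes the instance explicitly
            def newfunc(inst, *args, **kwargs):
                return self.func(super(self.cls, inst), inst,
                                 *args, **kwargs)
        else:
            # instance binding: insert the super object before the instance
            def newfunc(*args, **kwargs):
                return self.func(super(self.cls, obj), obj,
                                 *args, **kwargs)
        return newfunc

class A:
    def f(self):
        return 'A'

class B(A):
    @autosuper_method
    def f(sup, self):
        return 'B' + sup.f()

class C(A):
    @autosuper_method
    def f(sup, self):
        def inner():
            return 'C' + sup.f()
        return inner()

class D(B, C):
    @autosuper_method
    def f(sup, self):
        return 'D' + sup.f()

d = D()
instance_result = d.f()
class_result = D.f(d)
```

The same difficulties remain in this sketch: __set_name__ is not called for assignments made after the class exists, and one descriptor object shared by several classes only remembers the last class that claimed it.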

Which brings me to posting a new version of my code:
-- Defined __setattr__ in the metaclass to make demo code more
consistent (and less ugly).
-- Modified __init__ function in the metaclass so it doesn't generate
__get__ calls.
-- Fix-ups now create new instance of autosuper_method object instead
of modifying cls attribute of existing object.  Reason: assigning a
decorated function to multiple classes would modify the original
object, breaking functionality for all classes but one.
-- Known issue: cases such as E.f = D.f are not caught, because
__get__ on D.f doesn't return an instance of autosuper_method.  Can be
resolved by having autosuper_method.__get__ return a callable subclass
of autosuper_method.  However, it makes me wonder if my idea isn't so
hot after all. :/

Here's the new version:


#!/usr/bin/env python

class autosuper_method(object):
    def __init__(self, func, cls=None):
        self.func = func
        self.cls = cls

    def __get__(self, obj, type=None):
        # return self if self.cls is not set - prevents use
        # by methods of classes that don't subclass autosuper
        if self.cls is None:
            return self

        if obj is None:
            # class binding - assume first argument is instance,
            # and insert superclass before it
            def newfunc(*args, **kwargs):
                if not len(args):
                    raise TypeError('instance argument missing')
                return self.func(super(self.cls, args[0]),
                                 *args,
                                 **kwargs)
        else:
            # instance binding - insert superclass as first
            # argument, and instance as second
            def newfunc(*args, **kwargs):
                return self.func(super(self.cls, obj),
                                 obj,
                                 *args,
                                 **kwargs)
        return newfunc

class autosuper_meta(type):
    def __init__(cls, name, bases, clsdict):
        # fix up all autosuper_method instances in class
        for attr in clsdict:
            value = clsdict[attr]
            if isinstance(value, autosuper_method):
                setattr(cls, attr, autosuper_method(value.func, cls))

    def __setattr__(cls, attr, value):
        # catch assignment after class definition
        if isinstance(value, autosuper_method):
            value = autosuper_method(value.func, cls)
        type.__setattr__(cls, attr, value)

class autosuper(object):
    __metaclass__ = autosuper_meta

if __name__ == '__main__':
    class A(autosuper):
        def f(self):
            return 'A'

    # Demo - standard use
    class B(A):
        @autosuper_method
        def f(super, self):
            return 'B' + super.f()

    # Demo - reference super in inner function
    class C(A):
        @autosuper_method
        def f(super, self):
            def inner():
                return 'C' + super.f()
            return inner()

    # Demo - define function before class definition
    @autosuper_method
    def D_f(super, self):
        return 'D' + super.f()

    class D(B, C):
        f = D_f

    # Demo - define function after class definition
    class E(B, C):
        pass

    @autosuper_method
    def E_f(super, self):
        return 'E' + super.f()

    E.f = E_f

    # Test D
    d = D()
    assert d.f() == 'DBCA'      # Instance binding
    assert D.f(d) == 'DBCA'     # Class binding

    # Test E
    e = E()
    assert e.f() == 'EBCA'      # Instance binding
    assert E.f(e) == 'EBCA'     # Class binding


Regardless of the flaws in my code, I still like the idea of a
decorator syntax to specify methods that want to receive a super
object as a parameter.  It could use the same 'magic' that allows the
new Python 3000 super() to determine the method's class from the stack
frame, but doesn't depend on grabbing the first argument as the
instance (i.e. breaking use with inner functions).
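
For comparison, here is a minimal sketch of how the zero-argument super()
behaves in Python 3 as it eventually shipped. The class names are
illustrative and not taken from the code above. Called from an inner function
that takes no arguments, super() has no first argument to treat as the
instance and raises RuntimeError; capturing the proxy in the enclosing method
is one way around that.

```python
class A:
    def f(self):
        return 'A'


class Broken(A):
    def f(self):
        def inner():
            # inner() takes no arguments, so zero-argument super()
            # has nothing to use as the instance.
            return 'C' + super().f()
        return inner()


class Fixed(A):
    def f(self):
        sup = super()  # resolve the proxy in the method itself

        def inner():
            return 'C' + sup.f()
        return inner()


def call_broken():
    """Return the result of Broken().f(), or a description of the error."""
    try:
        return Broken().f()
    except RuntimeError as exc:
        return 'RuntimeError: %s' % exc


print(call_broken())
print(Fixed().f())
```

Passing the super object in as an explicit parameter, as the decorator idea
suggests, would sidestep the problem because the inner function simply closes
over an ordinary local name.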

From aaron.watters at  Wed Mar  5 16:11:48 2008
From: aaron.watters at (Aaron Watters)
Date: Wed, 5 Mar 2008 10:11:48 -0500
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
Message-ID: <>

I just checked the python site documentation on marshal and pickle and I
consider them to be irresponsibly and dangerously misleading.

For example, suppose Mercurial is implemented using pickle.load (I sure
hope it isn't -- is it?).

1) I send someone a "patch" for their software claiming it makes their
package run faster.

2) That person uses mercurial to "unpack" the patch and mercurial uses

BAM!  That person's filesystem is GONE!  AND I'M NOT ASSUMING

Now: suppose Mercurial is implemented using marshal: no such scenario is possible
unless there is a security bug in mercurial where they explicitly execute

RESOLVED: pickle should come with a large red label:


It doesn't have one.

Marshal needs no such label: but it has one:

*Warning:* The marshal module is not intended to be secure against erroneous
or maliciously constructed data. Never unmarshal data received from an
untrusted or unauthenticated source.

This is bullshit.

Sorry, for the french and the caps, but this is REALLY IMPORTANT.

   -- Aaron Watters
From george.sakkis at  Wed Mar  5 16:25:57 2008
From: george.sakkis at (George Sakkis)
Date: Wed, 5 Mar 2008 10:25:57 -0500
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Mar 5, 2008 at 10:11 AM, Aaron Watters <aaron.watters at> wrote:

> I just checked the python site documentation on marshal and pickle and I
> consider them to be irresponsibly and dangerously misleading.
> RESOLVED: pickle should come with a large red label:
> It doesn't have one.

So what is this [1] ?

Warning: The pickle module is not intended to be secure against
erroneous or maliciously constructed data. Never unpickle data
received from an untrusted or unauthenticated source.

You may want to check your facts better next time you go on a rampage.



From phd at  Wed Mar  5 16:27:59 2008
From: phd at (Oleg Broytmann)
Date: Wed, 5 Mar 2008 18:27:59 +0300
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Mar 05, 2008 at 10:11:48AM -0500, Aaron Watters wrote:
> RESOLVED: pickle should come with a large red label:

   "Warning: The pickle module is not intended to be secure against
erroneous or maliciously constructed data. Never unpickle data received
from an untrusted or unauthenticated source."

   Enough for me, though it is not as big or as red...

     Oleg Broytmann              phd at
           Programmers don't die, they just GOSUB without RETURN.

From aaron.watters at  Wed Mar  5 17:11:26 2008
From: aaron.watters at (Aaron Watters)
Date: Wed, 5 Mar 2008 11:11:26 -0500
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <>

In response to Oleg and George.

Yes apparently there is an acknowledgement in some subordinate page
somewhere that there might be some problem with security and pickle.  This
should be on the first page in bold face like the unneeded one for marshal.
I missed it just now because I just looked at the first page for marshal and
pickle, like most people probably would, sorry.

Also this line from the marshal doc has got to go:

"For general persistence and transfer of Python objects through RPC calls,
see the modules pickle <> and
shelve <>. "

which should read
"For RPC calls never use pickle."

And the security warning for marshal beneath it should be removed because it
is nonsense.

The implication of the current documentation is that most of my public
projects contain serious security holes when they don't.
And if you don't read the documentation carefully (like the implementers of
Plone apparently didn't) the docs seem to suggest
that pickle is somehow "safer" when it is about as unsafe as it could be.

-- Aaron Watters
From guido at  Wed Mar  5 18:36:56 2008
From: guido at (Guido van Rossum)
Date: Wed, 5 Mar 2008 09:36:56 -0800
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <>

I'm assuming that someone confronted you with this security issue
somehow? Otherwise I don't understand why you'd be so upset about it.

BTW the warning for marshal is legit -- the C code that unpacks
marshal data has not been carefully analyzed against buffer overflows
and so on. Remember the first time someone broke into a system through
a malicious JPEG? The same could happen with marshal. Seriously.

I agree that the pickle module's warning needs to be moved to a more
prominent place (Georg has probably already done this by the time I'm
finished typing this message :-). But I see no reason to get so upset
about it as to use all caps.


On Wed, Mar 5, 2008 at 8:11 AM, Aaron Watters <aaron.watters at> wrote:
> In response to Oleg and George.
> Yes apparently there is an acknowledgement in some subordinate page
> somewhere that there might be some problem with security and pickle.  This
> should be on the first page in bold face like the unneeded one for marshal.
> I missed it just now because I just looked at the first page for marshal and
> pickle, like most people probably would, sorry.
> Also this line from the marshal doc has got to go:
> "For general persistence and transfer of Python objects through RPC calls,
> see the modules pickle and shelve. "
> which should read
> "For RPC calls never use pickle."
> And the security warning for marshal beneath it should be removed because it
> is nonsense.
> The implication of the current documentation is that most of my public
> projects contain serious security holes when they don't.
>  And if you don't read the documentation carefully (like the implementers of
> Plone apparently didn't) the docs seem to suggest
> that pickle is somehow "safer" when it is about as unsafe as it could be.
> -- Aaron Watters
--Guido van Rossum (home page:

From santagada at  Wed Mar  5 19:12:47 2008
From: santagada at (Leonardo Santagada)
Date: Wed, 5 Mar 2008 15:12:47 -0300
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <>

On 05/03/2008, at 13:11, Aaron Watters wrote:

> like the implementers of Plone apparently didn't

I know this is just a strange angry rant, but why do you say this? Do  
you mean that ZODB and ZRPC should be implemented using marshal? Can  
this even be done?

Now if you want to do secure pickles, just sign them with a crypto
method (completely secure). It is simple and, I think, the only way to
do this securely. Or use a secure transport layer like SSL to
transfer things, with signatures to identify your peers. Not a reason
to rant, but to install the crypto package.
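
A rough sketch of the signing idea, assuming both ends share a secret key.
It uses the standard library's hmac and hashlib rather than any particular
crypto package, and the names SECRET_KEY, dumps_signed and loads_signed are
made up for illustration. Note that this only shows the data came from
someone holding the key; it does nothing to make a pickle from that party
harmless.

```python
import hashlib
import hmac
import pickle

# Assumption: a key shared out of band between sender and receiver.
SECRET_KEY = b'replace-with-a-real-shared-secret'
TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def dumps_signed(obj, key=SECRET_KEY):
    """Pickle obj and prepend an HMAC-SHA256 tag computed over the payload."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def loads_signed(blob, key=SECRET_KEY):
    """Check the tag first; unpickle only if it matches."""
    tag, payload = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError('signature mismatch; refusing to unpickle')
    return pickle.loads(payload)


blob = dumps_signed({'level': 14})
print(loads_signed(blob))
```

The important detail is the order: the tag is verified before pickle.loads()
ever sees the bytes.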

Leonardo Santagada

From g.brandl at  Wed Mar  5 20:02:51 2008
From: g.brandl at (Georg Brandl)
Date: Wed, 05 Mar 2008 20:02:51 +0100
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>	<>
Message-ID: <fqmqjs$the$>

Guido van Rossum wrote:
> I'm assuming that someone confronted you with this security issue
> somehow? Otherwise I don't understand why you'd be so upset about it.
> BTW the warning for marshal is legit -- the C code that unpacks
> marshal data has not been carefully analyzed against buffer overflows
> and so on. Remember the first time someone broke into a system through
> a malicious JPEG? The same could happen with marshal. Seriously.
> I agree that the pickle module's warning needs to be moved to a more
> prominent place (Georg has probably already done this by the time I'm
> finished typing this message :-). But I see no reason to get so upset
> about it as to use all caps.

I used the time machine :)

Though the warning is at the same location in 
<>, since all pickle docs are
on the same page it's visible enough in my opinion.


From aaron.watters at  Wed Mar  5 20:03:23 2008
From: aaron.watters at (Aaron Watters)
Date: Wed, 5 Mar 2008 14:03:23 -0500
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <>

What follows is a brief summary of offline discussions with Guido and
Leonardo (I hope represented correctly, please complain if not):

Guido pointed out that previous versions of marshal could crash python.

I replied that that is a bug and all known instances have been fixed.
Pickle executes arbitrary code by design -- which is much worse than just
crashing a program.
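
To make "by design" concrete, here is a small and deliberately harmless
sketch (the class name Payload is invented for the example). An object's
__reduce__ method may name any callable plus arguments, and the unpickler
calls it while loading. Here the callable is eval on an arithmetic
expression; a hostile pickle could name something far less friendly.

```python
import pickle


class Payload(object):
    def __reduce__(self):
        # Tell pickle: "to rebuild me, call eval('6 * 7')".
        return (eval, ('6 * 7',))


blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # eval() runs here, during loading

print(result)
print(isinstance(result, Payload))
```

What comes back is whatever the callable returned, not a Payload instance.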

Leonardo mentioned that pickle security concerns could be addressed using
crypto tricks.

I replied that I would be comfortable unmarshalling a file from a known
hostile party -- no crypto verification required, because the worst that
could happen is that it would crash the interpreter.  With pickle I'd be
handing my keyboard to a villain.

In summary: I think marshal.loads(s) is just as safe as unicode(s) or
pickle.loads(s) is morally equivalent to __import__(s) or eval(s).
I think the security warning for marshal and the implied recommendation that
pickle is okay for RPC should be removed.

  alright already, 'nuff said. whatever.  -- Aaron Watters
From arne_bab at  Wed Mar  5 20:46:37 2008
From: arne_bab at (Arne Babenhauserheide)
Date: Wed, 5 Mar 2008 20:46:37 +0100
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <>

I'd also agree that the warning should be really prominent (especially since
I just saw someone saying "for game states: Just pickle them", which could 
result in people getting problems when they get a mail saying "hey, look, I 
got to the 14th level"), but I don't think the warning was irresponsibly 

At least I saw it, when I began to learn python (but I had forgotten it until 

Maybe it could be replaced by yaml at some point, though, which offers a mode 
that doesn't execute everything (safe_load):

"safe_load(stream) parses the given stream and returns a Python object 
constructed from for the first document in the stream. If there are no 
documents in the stream, it returns None. safe_load recognizes only standard 
YAML tags and cannot construct an arbitrary Python object."

And there's also a C implementation:
Which can be relicensed under the Python License:

Or pickle could get a safe_load function itself (if it doesn't yet have it). 

Best wishes, 

On Wednesday, 5 March 2008 18:36:56, Guido van Rossum wrote:
> I'm assuming that someone confronted you with this security issue
> somehow? Otherwise I don't understand why you'd be so upset about it.
> BTW the warning for marshal is legit -- the C code that unpacks
> marshal data has not been carefully analyzed against buffer overflows
> and so on. Remember the first time someone broke into a system through
> a malicious JPEG? The same could happen with marshal. Seriously.
> I agree that the pickle module's warning needs to be moved to a more
> prominent place (Georg has probably already done this by the time I'm
> finished typing this message :-). But I see no reason to get so upset
> about it as to use all caps.
> --Guido
> On Wed, Mar 5, 2008 at 8:11 AM, Aaron Watters <aaron.watters at> 
> > In response to Oleg and George.
> >
> > Yes apparently there is an acknowledgement in some subordinate page
> > somewhere that there might be some problem with security and pickle. 
> > This should be on the first page in bold face like the unneeded one for
> > marshal. I missed it just now because I just looked at the first page for
> > marshal and pickle, like most people probably would, sorry.
> >
> > Also this line from the marshal doc has got to go:
> >
> > "For general persistence and transfer of Python objects through RPC
> > calls, see the modules pickle and shelve. "
> >
> >
> > which should read
> > "For RPC calls never use pickle."
> >
> > And the security warning for marshal beneath it should be removed because
> > it is nonsense.
> >
> > The implication of the current documentation is that most of my public
> > projects contain serious security holes when they don't.
> >  And if you don't read the documentation carefully (like the implementers
> > of Plone apparently didn't) the docs seem to suggest
> > that pickle is somehow "safer" when it is about as unsafe as it could be.
> >
> > -- Aaron Watters

Being apolitical
means being political
without noticing it.
- Arne Babenhauserheide ( )
-- Weblog:

-- My public key (PGP/GnuPG):
From idadesub at  Wed Mar  5 21:45:17 2008
From: idadesub at (Erick Tryzelaar)
Date: Wed, 5 Mar 2008 12:45:17 -0800
Subject: [Python-ideas] adding a trim convenience function
Message-ID: <>

I find that when I'm normalizing strings, I end up writing this a lot:

sites = ['', '', '']
new_sites = []
for site in sites:
  if site.startswith('http://'):
    site = site[len('http://'):]
  new_sites.append(site)

But it'd be much nicer if I could use a convenience function trim that
would do this for me, so I could just use a comprehension:

def ltrim(s, prefix):
  if s.startswith(prefix):
    return s[len(prefix):]
  return s

sites = ['', '', '']
sites = [ltrim(site, 'http://') for site in sites]
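
The suffix and both-ends companions are not spelled out in this post. A
minimal sketch of what they might look like, assuming they mirror ltrim
(remove the exact string at most once per end), follows. The empty-string
guard is an addition: without it, s[:-len('')] would be s[:0], the empty
string.

```python
def ltrim(s, prefix):
    """Remove prefix once from the start of s, if present."""
    if prefix and s.startswith(prefix):
        return s[len(prefix):]
    return s


def rtrim(s, suffix):
    """Remove suffix once from the end of s, if present."""
    # Guard against an empty suffix: s[:-0] would be the empty string.
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


def trim(s, affix):
    """Remove affix once from each end of s (hypothetical semantics)."""
    return rtrim(ltrim(s, affix), affix)


print(rtrim('archive.tar.gz', '.gz'))
print(trim('--title--', '--'))
```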

Would there be any interest to add this helper function, as well as an
"rtrim" and "trim", to the str class?


From tjreedy at  Wed Mar  5 22:33:40 2008
From: tjreedy at (Terry Reedy)
Date: Wed, 5 Mar 2008 16:33:40 -0500
Subject: [Python-ideas] adding a trim convenience function
References: <>
Message-ID: <fqn3jj$2t6$>

"Erick Tryzelaar" 
<idadesub at> wrote in 
message news:1ef034530803051245u7fdf525dn6f4efc74a8af59a8 at
| sites = ['', '', '']
| sites = [ltrim(site, 'http://') for site in sites]

>>> [site.replace('http://', '') for site in sites]
['', '', '']

| Would there be any interest to add this helper function, as well as an
| "rtrim" and "trim", to the str class?

Try another use case.  I think str pretty well has the basic tools needed 
to construct whatever specific tools one needs.


From idadesub at  Wed Mar  5 23:11:56 2008
From: idadesub at (Erick Tryzelaar)
Date: Wed, 5 Mar 2008 14:11:56 -0800
Subject: [Python-ideas] adding a trim convenience function
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Mar 5, 2008 at 1:30 PM, Matthew Russell <matt.horizon5 at> wrote:
> Of course that should have read:
> sites = list(item.strip("http://") for item in sites)

{l,r,}strip doesn't actually do what I'm talking about, which confused
me for a long time. Consider this simple case:

>>> 'abaaabcd'.lstrip('ab')
'cd'

ltrim in this case would produce 'aaabcd'.

On Wed, Mar 5, 2008 at 1:33 PM, Terry Reedy <tjreedy at> wrote:
> >>> [site.replace('http://', '')for site in sites]
> ['', '', '']

Unfortunately that would break down in certain cases with different
strings, like 'foo bar foo'.replace('foo', ''), which just results in
' bar '.
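
The three behaviours side by side, as a small runnable comparison. ltrim is
the helper from the original proposal; the variable names are only for the
example.

```python
def ltrim(s, prefix):
    if s.startswith(prefix):
        return s[len(prefix):]
    return s


s = 'abaaabcd'
stripped = s.lstrip('ab')       # drops any leading 'a' or 'b' characters
trimmed = ltrim(s, 'ab')        # drops the exact prefix, once

t = 'foo bar foo'
replaced = t.replace('foo', '')  # drops every occurrence
trimmed_t = ltrim(t, 'foo')      # drops only the leading one

print(repr(stripped), repr(trimmed))
print(repr(replaced), repr(trimmed_t))
```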

>  Try another use case.  I think str pretty well has the basic tools needed
>  to construct whatever specific tools one needs.

Oh sure it can, considering that I can implement ltrim in three lines.
This is just to reduce a common pattern in my code, and to remove
rewriting it in multiple projects.

From greg.ewing at  Wed Mar  5 23:24:37 2008
From: greg.ewing at (Greg Ewing)
Date: Thu, 06 Mar 2008 11:24:37 +1300
Subject: [Python-ideas] An official complaint regarding the marshal and
 pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <>

Guido van Rossum wrote:
> BTW the warning for marshal is legit -- the C code that unpacks
> marshal data has not been carefully analyzed against buffer overflows
> and so on.

I thought the main issue with marshal is that it's happy
to create code objects, which pickle doesn't do -- ostensibly
for security reasons.

But if pickle is inherently insecure anyway, does the
exclusion of code objects really make much difference?
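
A small sketch of the difference being described, as it behaves in current
Python 3 (file name and variable names are illustrative). marshal will
round-trip a code object, but loading it does not run it; pickle declines to
serialise code objects in the first place.

```python
import marshal
import pickle

code = compile('answer = 6 * 7', '<demo>', 'exec')

# marshal round-trips the code object; loading alone does not execute it.
restored = marshal.loads(marshal.dumps(code))
namespace = {}
exec(restored, namespace)  # execution is a separate, explicit step
print(namespace['answer'])

# pickle refuses to serialise code objects at all.
try:
    pickle.dumps(code)
    pickled_ok = True
except (TypeError, pickle.PicklingError):
    pickled_ok = False
print(pickled_ok)
```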

BTW, I only consider pickle suitable for quick and dirty
uses anyway, because it ties the external representation very
closely to internal details of your program, which can make
it difficult to evolve the program without invalidating
previously written files.

For long-term use, it's better to invest time in a
properly-thought-out external format for the task, designed
with extensibility in mind.


From greg.ewing at  Wed Mar  5 23:29:49 2008
From: greg.ewing at (Greg Ewing)
Date: Thu, 06 Mar 2008 11:29:49 +1300
Subject: [Python-ideas] An official complaint regarding the marshal and
 pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <>

Aaron Watters wrote:

> In summary: I think marshal.loads(s) is just as safe as unicode(s) or 
>  pickle.loads(s) is morally equivalent to __import__(s) or
> eval(s).

According to the docs, you can use a customised unpickler
to restrict the set of things it can use as constructors.
It might be worth mentioning that in a prominent place near
the security warning as well.
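
A sketch of that approach in current Python 3, where the hook is
Unpickler.find_class. The allow-list and the names RestrictedUnpickler,
restricted_loads and EvalPayload are choices made for this example, not
anything prescribed by the docs.

```python
import builtins
import io
import pickle

# Assumption: only these builtins are considered safe to reconstruct.
SAFE_BUILTINS = {'range', 'complex', 'set', 'frozenset', 'slice'}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global by name.
        if module == 'builtins' and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            'global %s.%s is forbidden' % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but refusing globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class EvalPayload(object):
    """Illustrative hostile object: asks the unpickler to call eval()."""
    def __reduce__(self):
        return (eval, ('6 * 7',))


print(restricted_loads(pickle.dumps([1, 2, range(3)])))
try:
    restricted_loads(pickle.dumps(EvalPayload()))
except pickle.UnpicklingError as exc:
    print('refused:', exc)
```

Plain containers and the allowed builtins load as usual, while the request
for eval is rejected before anything is called.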


From taleinat at  Wed Mar  5 23:59:07 2008
From: taleinat at (Tal Einat)
Date: Thu, 6 Mar 2008 00:59:07 +0200
Subject: [Python-ideas] adding a trim convenience function
In-Reply-To: <>
References: <>
Message-ID: <>

Erick Tryzelaar wrote:
> I find that when I'm normalizing strings, I end up writing this a lot:
>  sites = ['', '', '']
>  new_sites = []
>  for site in sites:
>   if site.startswith('http://'):
>     site = site[len('http://'):]
>   new_sites.append(site)
>  But it'd be much nicer if I could use a convenience function trim that
>  would do this for me, so I could just use a comprehension:
>  def ltrim(s, prefix):
>   if s.startswith(prefix):
>     return s[len(prefix):]
>   return s
>  sites = ['', '', '']
>  sites = [ltrim(site, 'http://') for site in sites]
>  Would there be any interest to add this helper function, as well as an
>  "rtrim" and "trim", to the str class?

I'm against adding this as a string method, or even as a function in the stdlib.

I've done a lot of text processing with Python and have hardly ever
needed something like this. If you think this would be useful often, a
good way to convince this list is to show some examples of how it
could improve code in the standard library, noting how common they
are.

In general, having a lot of string methods is very harmful because it
makes learning Python a longer and more confusing process.
Furthermore, this functionality is very simple and easy to implement,
I just thought of 3 different ways [1] to implement this function in a
simple, readable one-liner. For these reasons, unless you can show
that this will be very useful very often, I'm against.

- Tal

[1]
ltrim = lambda item, to_trim: re.sub('^' + to_trim, '', item)
ltrim = lambda item, x: item[0 if not item.startswith(x) else len(x):]
ltrim = lambda item, to_trim: ''.join(item.split(to_trim, 1))

From taleinat at  Thu Mar  6 00:06:01 2008
From: taleinat at (Tal Einat)
Date: Thu, 6 Mar 2008 01:06:01 +0200
Subject: [Python-ideas] adding a trim convenience function
In-Reply-To: <>
References: <>
Message-ID: <>

Tal Einat wrote:
> Erick Tryzelaar wrote:
>  > I find that when I'm normalizing strings, I end up writing this a lot:
>  >
>  >  sites = ['', '', '']
>  >  new_sites = []
>  >  for site in sites:
>  >   if site.startswith('http://'):
>  >     site = site[len('http://'):]
>  >   new_sites.append(site)
>  >
>  >  But it'd be much nicer if I could use a convenience function trim that
>  >  would do this for me, so I could just use a comprehension:
>  >
>  >  def ltrim(s, prefix):
>  >   if s.startswith(prefix):
>  >     return s[len(prefix):]
>  >   return s
>  >
>  >  sites = ['', '', '']
>  >  sites = [ltrim(site, 'http://') for site in sites]
>  >
>  >  Would there be any interest to add this helper function, as well as an
>  >  "rtrim" and "trim", to the str class?
>  >
>  I'm against adding this as a string method, or even as a function in the stdlib.
>  I've done a lot of text processing with Python and have hardly ever
>  needed something like this. If you think this would be useful often, a
>  good way to convince this list is to show some examples of how it
>  could improve code in the standard library, noting how common they
>  are.
>  In general, having a lot of string methods is very harmful because it
>  makes learning Python a longer and more confusing process.
>  Furthermore, this functionality is very simple and easy to implement,
>  I just thought of 3 different ways [1] to implement this function in a
>  simple, readable one-liner. For these reasons, unless you can show
>  that this will be very useful very often, I'm against.
>  - Tal
>  [1]
>  ltrim = lambda item, to_trim,: re.sub('^' + to_trim, '', item)
>  ltrim = lambda item, x: item[0 if not item.startswith(x) else len(x):]
>  ltrim = lambda item, to_trim: ''.join(item.split(to_trim, 1))

Ignore the third implementation, it's broken... here's another one in its place:
ltrim = lambda item, x: item[item.startswith(x) * len(x):]
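
A quick check of the one-liners from the earlier message, given distinct
names here so they can be compared. It shows where the split-based version
goes wrong, and also that the regex version treats the prefix as a pattern.
The re.escape variant is an addition for this example, not part of either
post.

```python
import re

# The one-liners as posted, renamed for side-by-side comparison.
ltrim_re = lambda item, to_trim: re.sub('^' + to_trim, '', item)
ltrim_slice = lambda item, x: item[0 if not item.startswith(x) else len(x):]
ltrim_split = lambda item, to_trim: ''.join(item.split(to_trim, 1))
ltrim_mult = lambda item, x: item[item.startswith(x) * len(x):]

# Added variant: escape regex metacharacters in the prefix.
ltrim_re_safe = lambda item, to_trim: re.sub('^' + re.escape(to_trim), '', item)

# split() removes the first match wherever it is, not only at the start.
print(ltrim_split('a-http://b', 'http://'))

# In the unescaped regex, '.' matches any character.
print(ltrim_re('axb', 'a.'))
print(ltrim_re_safe('axb', 'a.'))

# The slice-based versions only touch a real leading prefix.
print(ltrim_slice('http://x', 'http://'), ltrim_mult('http://x', 'http://'))
```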

- Tal

From greg.ewing at  Wed Mar  5 23:44:10 2008
From: greg.ewing at (Greg Ewing)
Date: Thu, 06 Mar 2008 11:44:10 +1300
Subject: [Python-ideas] adding a trim convenience function
In-Reply-To: <fqn3jj$2t6$>
References: <>
Message-ID: <>

Terry Reedy wrote:

>>>>[site.replace('http://', '')for site in sites]

Not exactly the same thing, as the original only replaced
at the beginning of the string.

An re substitution could be used, but that could be
seen as overkill.


From santagada at  Thu Mar  6 02:33:25 2008
From: santagada at (Leonardo Santagada)
Date: Wed, 5 Mar 2008 22:33:25 -0300
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <>

On 05/03/2008, at 16:03, Aaron Watters wrote:
> Guido pointed out that previous versions of marshal could crash  
> python.
> I replied that that is a bug and all known instances have been  
> fixed.  Pickle executes arbitrary code by design -- which is much  
> worse than just crashing a program.

Just read carefully what Guido said, if there is a bug it can not just  
crash your program, it can execute any kind of code, as bad or even  
worse than pickle... that is what is called a buffer overflow

Talking about it, the PyPy project has a directory somewhere with lots
of snippets of ways to crash CPython... Not just the set recursion
limit and overflow the stack one.

> Leonardo mentioned that pickle security concerns could be addressed  
> using crypto tricks.

For some uses, for others some modified version of pure python pickle  
could be used, so you have a controlled and almost safe pickle.

> I replied that I would be comfortable unmarshalling a file from a  
> known hostile party -- no crypto verification required, because the  
> worst that could happen is that it would crash the interpreter.   
> With pickle I'd be handing my keyboard to a villain.
> In summary: I think marshal.loads(s) is just as safe as unicode(s)
> or  pickle.loads(s) is morally equivalent to
> __import__(s) or eval(s).

No, marshal.load does lots of stuff in pure unverified C code...
anything could happen, as Guido pointed out.

> I think the security warning for marshal and the implied  
> recommendation that pickle is okay for RPC should be removed.

No, AFAIK marshal can only load ints and simple objects... and that  
will give you a very poor rpc (for example it could never be used to  
replace pickle as it is used in ZODB and ZRPC).

Leonardo Santagada

From bborcic at  Thu Mar  6 13:45:41 2008
From: bborcic at (Boris Borcic)
Date: Thu, 06 Mar 2008 13:45:41 +0100
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <fqop3f$u11$>

Aaron Watters wrote:

> Sorry, for the french and the caps, but this is REALLY IMPORTANT.

I see nothing french in your post.


From aaron.watters at  Thu Mar  6 18:40:35 2008
From: aaron.watters at (Aaron Watters)
Date: Thu, 6 Mar 2008 12:40:35 -0500
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Mar 5, 2008 at 8:33 PM, Leonardo Santagada <santagada at>
wrote:

> On 05/03/2008, at 16:03, Aaron Watters wrote:
> > Guido pointed out that previous versions of marshal could crash
> > python.
> >
> > I replied that that is a bug and all known instances have been
> > fixed.  Pickle executes arbitrary code by design -- which is much
> > worse than just crashing a program.
> Just read carefully what Guido said, if there is a bug it can not just
> crash your program, it can execute any kind of code, as bad or even
> worse than pickle... that is what is called a buffer overflow

I'd like to know the actual number of successful
buffer overflow attacks that have ever happened on the planet in the wild.
Maybe one?  Okay, according to Wikipedia there have been 4.  I don't really
know but I think an overflowing buffer in marshal is not very likely to be
near where a code segment could jump to because almost everything
in marshal is dynamically
allocated.  The known attacks have been where the arrays were in static
I believe.

And it's not worse than pickle because pickle is perfectly capable of
compiling and loading an assembly language component without you knowing
anything about it.  Pickle can do anything that the computer can do.

Also it's not worse than pickle because you have to be a highly experienced
perverted assembly language programmer to construct an overflow attack, and
there has to be a bug in marshal to allow it.  To abuse pickle requires almost
no skill at all, and you don't have to be perverted, you just have to be
stupid.  In fact pickle is designed to execute arbitrary code, and even
documented.

For all I know it's just as feasible to stage buffer overflow attacks in
many other
places in python as it is in marshal -- like maybe
unicode.join or anyplace else where an array
is constructed.  Which is to say it's not very feasible in those places

I was clearly off my medication to start this discussion. I suppose
people into thinking marshal is dangerous is better than suggesting pickle
is safe.
Peace and love everyone.  bye now.

  -- Aaron Watters
From lists at  Thu Mar  6 18:56:05 2008
From: lists at (Christian Heimes)
Date: Thu, 06 Mar 2008 18:56:05 +0100
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <fqop3f$u11$>
References: <>
Message-ID: <>

Boris Borcic wrote:
> Aaron Watters wrote:
> [...]
>> Sorry, for the french and the caps, but this is REALLY IMPORTANT.
> I see nothing french in your post.



From lists at  Thu Mar  6 18:59:00 2008
From: lists at (Christian Heimes)
Date: Thu, 06 Mar 2008 18:59:00 +0100
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>	<>
Message-ID: <fqpbd5$91o$>

Leonardo Santagada wrote:
>> I replied that that is a bug and all known instances have been  
>> fixed.  Pickle executes arbitrary code by design -- which is much  
>> worse than just crashing a program.
> Just read carefully what Guido said, if there is a bug it can not just  
> crash your program, it can execute any kind of code, as bad or even  
> worse than pickle... that is what is called a buffer overflow

marshal is *ONLY* designed to store and load trusted pyc files. It's not
designed for anything else. It *CAN* be used for simple stuff, too. But
it doesn't support fancy stuff and it can easily be broken. IIRC it
doesn't support nested structures like a list containing a reference to
itself. Use it at your own risk.


From tjreedy at  Thu Mar  6 20:37:17 2008
From: tjreedy at (Terry Reedy)
Date: Thu, 6 Mar 2008 14:37:17 -0500
Subject: [Python-ideas] adding a trim convenience function
References: <><><>
Message-ID: <fqph5b$2bu$>

"Erick Tryzelaar" 
<idadesub at> wrote in 
message
| On Wed, Mar 5, 2008 at 1:33 PM, Terry Reedy <tjreedy at> wrote:
| > >>> [site.replace('http://', '')for site in sites]
| > ['', '', '']
| Unfortunately that would break down in certain cases with different
| strings, like 'foo bar foo'.replace('foo', ''), which just results in
| ' bar '.

I knew that, of course, but that objection does not apply to the use case 
you presented.  I simply gave the simplest thing that worked, that passed 
your 'test'.
Tal gave a more general answer.

| >  Try another use case.  I think str pretty well has the basic tools needed
| >  to construct whatever specific tools one needs.
| Oh sure it can, considering that I can implement ltrim in three lines.
| This is just to reduce a common pattern in my code, and to remove
| rewriting it in multiple projects.

How many such uses are like your 'foo bar foo'?


From ntoronto at  Thu Mar  6 20:17:27 2008
From: ntoronto at (Neil Toronto)
Date: Thu, 06 Mar 2008 12:17:27 -0700
Subject: [Python-ideas] An official complaint regarding the marshal and
 pickle documentation
In-Reply-To: <fqop3f$u11$>
References: <>
Message-ID: <>

Boris Borcic wrote:
> Aaron Watters wrote:
> [...]
>> Sorry, for the french and the caps, but this is REALLY IMPORTANT.
> I see nothing french in your post.

I could have sworn I read something like "I fart in your general direction".


From idadesub at  Thu Mar  6 21:16:16 2008
From: idadesub at (Erick Tryzelaar)
Date: Thu, 6 Mar 2008 12:16:16 -0800
Subject: [Python-ideas] adding a trim convenience function
In-Reply-To: <fqph5b$2bu$>
References: <>
Message-ID: <>

On Thu, Mar 6, 2008 at 11:37 AM, Terry Reedy <tjreedy at> wrote:

fyi, I looked through a bunch of code, and it does seem that there is
less need for this than I thought.

>  How many such uses are like your 'foo bar foo'?

The case I ran into is that I used strip in a fashion like
'abaaab'.lstrip('ab') before I understood exactly what strip did. The
replace trick won't work for me because all of the instances where I
used this were in an API, so I couldn't assume that the string I was
trimming didn't have other instances of the prefix/suffix in the
middle.
From aahz at  Thu Mar  6 21:25:27 2008
From: aahz at (Aahz)
Date: Thu, 6 Mar 2008 12:25:27 -0800
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Mar 06, 2008, Greg Ewing wrote:
> BTW, I only consider pickle suitable for quick and dirty uses anyway,
> because it ties the external representation very closely to internal
> details of your program, which can make it difficult to evolve the
> program without invalidating previously written files.
> For long-term use, it's better to invest time in a
> properly-thought-out external format for the task, designed with
> extensibility in mind.

Maybe so, but my company has been using pickle as a primary long-term
storage mechanism for more than a decade.  We only rarely have problems
with code changes causing pickle problems (less than once per year).
OTOH, we mostly only have a growing internal format -- we almost never
change the internal format.
Aahz (aahz at           <*>

"All problems in computer science can be solved by another level of     
indirection."  --Butler Lampson

From george.sakkis at  Thu Mar  6 21:29:42 2008
From: george.sakkis at (George Sakkis)
Date: Thu, 6 Mar 2008 15:29:42 -0500
Subject: [Python-ideas] adding a trim convenience function
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Mar 6, 2008 at 3:16 PM, Erick Tryzelaar
<idadesub at> wrote:

> On Thu, Mar 6, 2008 at 11:37 AM, Terry Reedy <tjreedy at> wrote:
>  fyi, I looked through a bunch of code, and it does seem that there is
>  less need for this than I thought.
>  >  How many such uses are like your 'foo bar for'?
>  The case I ran into is that I used in a fashion like
>  'abaaab'.lstrip('ab') before I understood exactly what strip did. The
>  replace trick won't work for me because all of the instances where I
>  used this were in an api, so I couldn't assume that the string i was
>  trimming didn't have other instances of the prefix/suffix in the
>  middle.

What about adding an optional boolean parameter to str.*strip that
treats the argument as either a set of characters (default, just like
now) or an exact string ? Something like

>>> 'abaaab'.lstrip('ab')
>>> 'abaaab'.lstrip('ab', exact=True)
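A minimal sketch of the difference, written as plain helper functions
rather than as a change to str itself. The names lstrip_exact and
rstrip_exact are hypothetical, the exact=True keyword above is only the
proposal (not an existing parameter), and the sketch assumes that
"exact" means removing a single copy of the string:

```python
def lstrip_exact(s, prefix):
    """Remove one leading copy of prefix, treated as an exact string."""
    if prefix and s.startswith(prefix):
        return s[len(prefix):]
    return s


def rstrip_exact(s, suffix):
    """Remove one trailing copy of suffix, treated as an exact string."""
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


# str.lstrip treats its argument as a set of characters, so every
# leading 'a' or 'b' is removed and nothing is left.
char_set_result = 'abaaab'.lstrip('ab')      # ''

# The exact-string helpers remove only the literal prefix or suffix.
exact_left = lstrip_exact('abaaab', 'ab')    # 'aaab'
exact_right = rstrip_exact('abaaab', 'ab')   # 'abaa'
```

Later Python versions (3.9 and up) provide str.removeprefix and
str.removesuffix, which behave like these helpers.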


From santagada at  Thu Mar  6 22:00:13 2008
From: santagada at (Leonardo Santagada)
Date: Thu, 6 Mar 2008 18:00:13 -0300
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <>
References: <>
Message-ID: <>

On 06/03/2008, at 17:25, Aahz wrote:

> Maybe so, but my company has been using pickle as a primary long-term
> storage mechanism for more than a decade.  We only rarely have  
> problems
> with code changes causing pickle problems (less than once per year).
> OTOH, we mostly only have a growing internal format -- we almost never
> change the internal format.

And then tons of companies are using ZODB which uses pickle... so no
biggie... but using pickle directly can be a problem if you have lots  
of data.

Leonardo Santagada

From ntoronto at  Mon Mar 10 07:28:43 2008
From: ntoronto at (Neil Toronto)
Date: Mon, 10 Mar 2008 00:28:43 -0600
Subject: [Python-ideas] [Python-Dev] PEP Proposal: Revised slice objects
 & lists use slice objects as indexes
In-Reply-To: <>
References: <>
Message-ID: <>

Alexandre Vassalotti wrote:
> On Sun, Mar 9, 2008 at 7:21 PM, Forrest Voight <voights at> wrote:
>> This would simplify the handling of list slices.
>>  Slice objects that are produced in a list index area would be different,
>>  and optionally the syntax for slices in list indexes would be expanded
>>  to work everywhere. Instead of being containers for the start, end,
>>  and step numbers, they would be generators, similar to xranges.
> I am not sure what you are trying to propose here. The slice object
> isn't special, it's just a regular built-in type.
>   >>> slice(1,4)
>   slice(1, 4, None)
>   >>> [1,2,3,4,5,6][slice(1,4)]
>   [2, 3, 4]
> I don't see how introducing new syntax would simplify indexing.

Likewise. It would simplify looping, though:

     >>> for i in 1:5:
     ...     print i
     1
     2
     3
     4
     >>>

Since this kind of loop happens frequently, it makes sense to shorten 
it. Slice objects (and syntax) seem ready-made for that - it wouldn't be 
*new* syntax, just repurposed syntax.

Though Forrest didn't bring this up directly, I've often thought that 
Python's having both xrange and slice (and in 3000, range and slice) is 
mostly vestigial. Their information content is identical and their 
purposes are highly analogous. Unifying them would reduce the number of 
new concepts for a beginner by one, and these are frequently-used 
concepts at that.

Negative indexes could throw the idea for a loop, though. (Pun! Ha ha!) 
And this makes the colons look like some kind of enclosure:

     >>> for i in :5:


From aaron.watters at  Mon Mar 10 16:33:25 2008
From: aaron.watters at (Aaron Watters)
Date: Mon, 10 Mar 2008 11:33:25 -0400
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
Message-ID: <>

Hi.  Some months ago I complained on the python-list
that python gc did too much work for apps that allocate
and deallocate lots of structures.  In fact one of my apps
was spending about 1/3 of its time garbage collecting
and not finding anything to collect (before I disabled gc).

My proposal was that python
should have some sort of a smarter strategy for garbage
collection, perhaps involving watching the global
high water mark for memory allocation or other tricks.

The appropriate response was:
"great idea! patch please!" :)

Unfortunately dealing with cross platform
memory management internals is beyond my
C-level expertise, and I'm not having a lot of
luck finding good information sources.  Does anyone
have any clues on this or other ideas for improving
the gc heuristic?  For example, how do you find
out the allocated heap size(s) in a cross platform
way?

This link provides some clues, but I don't really
understand this code well enough to hope to
patch gc.

  -- Aaron Watters
From rhamph at  Mon Mar 10 17:30:56 2008
From: rhamph at (Adam Olsen)
Date: Mon, 10 Mar 2008 09:30:56 -0700
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Mar 10, 2008 at 8:33 AM, Aaron Watters <aaron.watters at> wrote:
> Hi.  Some months ago I complained on the python-list
> that python gc did too much work for apps that allocate
> and deallocate lots of structures.  In fact one of my apps
> was spending about 1/3 of its time garbage collecting
>  and not finding anything to collect (before i disabled gc).
> My proposal was that python
> should have some sort of a smarter strategy for garbage
> collection, perhaps involving watching the global
> high water mark for memory allocation or other tricks.
> The appropriate response was:
> "great idea! patch please!" :)
> Unfortunately dealing with cross platform
> memory management internals is beyond my
> C-level expertise, and I'm not having a lot of
>  luck finding good information sources.  Does anyone
> have any clues on this or other ideas for improving
> the gc heuristic?  For example, how do you find
> out the allocated heap size(s) in a cross platform
> way?
> This link provides some clues, but I don't really
> understand this code well enough to hope to
> patch gc.

You can of course tweak gc.set_threshold() (and I would expect this to
be quite effective, once you find out what an appropriate threshold0
is for your app.)  I don't believe you'll find any existing counters
of the current heap size though (be it number of allocated objects or
total size consumed by those objects.)

Adam Olsen, aka Rhamphoryncus

From g.brandl at  Mon Mar 10 18:44:05 2008
From: g.brandl at (Georg Brandl)
Date: Mon, 10 Mar 2008 18:44:05 +0100
Subject: [Python-ideas] [Python-Dev] PEP Proposal: Revised slice objects
 & lists use slice objects as indexes
In-Reply-To: <>
References: <>	<>
Message-ID: <fr3rs5$o7a$>

Neil Toronto schrieb:
> Alexandre Vassalotti wrote:
>> On Sun, Mar 9, 2008 at 7:21 PM, Forrest Voight <voights at> wrote:
>>> This would simplify the handling of list slices.
>>>  Slice objects that are produced in a list index area would be different,
>>>  and optionally the syntax for slices in list indexes would be expanded
>>>  to work everywhere. Instead of being containers for the start, end,
>>>  and step numbers, they would be generators, similar to xranges.
>> I am not sure what you are trying to propose here. The slice object
>> isn't special, it's just a regular built-in type.
>>   >>> slice(1,4)
>>   slice(1, 4, None)
>>   >>> [1,2,3,4,5,6][slice(1,4)]
>>   [2, 3, 4]
>> I don't see how introducing new syntax would simplify indexing.
> Likewise. It would simplify looping, though:
>      >>> for i in 1:5:
>      ...     print i
>      1
>      2
>      3
>      4
>      >>>

See for a similar proposal.


From aaron.watters at  Tue Mar 11 15:28:30 2008
From: aaron.watters at (Aaron Watters)
Date: Tue, 11 Mar 2008 10:28:30 -0400
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Mar 10, 2008 at 12:30 PM, Adam Olsen <rhamph at> wrote:

> You can of course tweak gc.set_threshold() (and I would expect this to
> be quite effective, once you find out what an appropriate threshold0
> is for your app.)  I don't believe you'll find any existing counters
> of the current heap size though (be it number of allocated objects or
> total size consumed by those objects.)...

It would be nice if the threshold would adjust based
on the performance characteristics of the app.
In particular it'd be nice if the garbage collector would
notice when it's never finding anything and wait longer
every time it finds nothing for the next collection attempt.

How about this.
- The threshold slides between minimumThresh and maximumThresh
- At each collection the current number of objects
  collected is compared to the last number collected (collectionTrend).
- If the collectionTrend is negative or zero the next threshold slides
  towards the maximum.
- If the collectionTrend is a small increase, the threshold stays the same.
- If the collectionTrend is a large increase the next threshold slides
  towards the minimum.
That way for apps that need no garbage collection
(outside of refcounting) the threshold would slide to the
maximum and stay there, but for apps that need a lot of
gc the threshold would bounce up and down near the minimum.

This is almost easy enough that I could implement it...
    -- Aaron Watters

From aaron.watters at  Tue Mar 11 15:42:11 2008
From: aaron.watters at (Aaron Watters)
Date: Tue, 11 Mar 2008 10:42:11 -0400
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <>
References: <>
Message-ID: <>


> - If the collectionTrend is a small increase, the threshold stays the
> same.

footnote: for stability you would not update the "last collection count"
in this case so the comparison is always against a fixed point until
the threshold adjusts....
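The sliding-threshold rules, together with this footnote, might be
sketched roughly as follows. This is only an illustration of the idea,
not a patch for gcmodule.c: the function name, the doubling/halving
step, the value of small_increase, and the minimum and maximum are all
assumptions chosen for the example.

```python
def adapt_threshold(threshold, baseline, collected,
                    min_thresh=700, max_thresh=1000000,
                    small_increase=10, factor=2):
    """Return (next_threshold, next_baseline) after one collection.

    collected is the number of objects found by this collection and
    baseline is the count it is compared against (collectionTrend).
    """
    trend = collected - baseline
    if trend <= 0:
        # Nothing new is being found: collect less often.
        return min(max_thresh, threshold * factor), collected
    if trend <= small_increase:
        # Small increase: keep the threshold, and keep the old baseline
        # so the comparison stays against a fixed point.
        return threshold, baseline
    # Large increase: collect more often.
    return max(min_thresh, threshold // factor), collected


# An app that never produces cyclic garbage drifts to the maximum.
threshold, baseline = 700, 0
for _ in range(20):
    threshold, baseline = adapt_threshold(threshold, baseline, 0)
```

After the loop the threshold sits at max_thresh and stays there, which
is the behaviour described for apps that need no gc outside of
refcounting.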
  -- Aaron Watters

From rhamph at  Tue Mar 11 17:25:48 2008
From: rhamph at (Adam Olsen)
Date: Tue, 11 Mar 2008 09:25:48 -0700
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Mar 11, 2008 at 7:28 AM, Aaron Watters <aaron.watters at> wrote:
> On Mon, Mar 10, 2008 at 12:30 PM, Adam Olsen <rhamph at> wrote:
> > You can of course tweak gc.set_threshold() (and I would expect this to
> > be quite effective, once you find out what an appropriate threshold0
> > is for your app.)  I don't believe you'll find any existing counters
> > of the current heap size though (be it number of allocated objects or
> > total size consumed by those objects.)...
>  It would be nice if the threshold would adjust based
>  on the performance characteristics of the app.
>  In particular it'd be nice if the garbage collector would
>  notice when it's never finding anything and wait longer
>  everytime it finds nothing for the next collection attempt.
>  How about this.
>  - The threshold slides between minimumThresh and maximumThresh
>  - At each collection the current number of objects
>    collected is compared to the last number collected (collectionTrend).
>  - If the collectionTrend is negative or zero the next threshold slides
>    towards the maximum.
>  - If the collectionTrend is a small increase, the threshold stays the same.
>  - If the collectionTrend is a large increase the next threshold slides
> towards
>    the minimum.
>  That way for apps that need no garbage collection
>  (outside of refcounting) the threshold would slide to the
>  maximum and stay there, but for apps that need a lot of
>  gc the threshold would bounce up and down near the minimum.
>  This is almost easy enough that I could implement it...
>      -- Aaron Watters

It sounds plausible to me.

But have you tried just tweaking the threshold?  Surely there's a
value at which it performs well, and that'd need to be within your
maximum anyway.

Adam Olsen, aka Rhamphoryncus

From aaron.watters at  Tue Mar 11 19:20:50 2008
From: aaron.watters at (Aaron Watters)
Date: Tue, 11 Mar 2008 14:20:50 -0400
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <>
References: <>
Message-ID: <>

> It sounds plausible to me.
> But have you tried just tweaking the threshold?  Surely there's a
> value at which it performs well, and that'd need to be within your
> maximum anyway.

In that case it works best when gc is disabled.
If I add a new feature, the gc requirements may
change completely without me realizing it.
I'm interested in not having to think about it :).
   -- Aaron Watters
From rhamph at  Tue Mar 11 20:08:37 2008
From: rhamph at (Adam Olsen)
Date: Tue, 11 Mar 2008 12:08:37 -0700
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Mar 11, 2008 at 11:20 AM, Aaron Watters <aaron.watters at> wrote:
> > It sounds plausible to me.
> >
> > But have you tried just tweaking the threshold?  Surely there's a
> > value at which it performs well, and that'd need to be within your
> > maximum anyway.
> In that case it works best when gc is disabled.
> If I add a new feature, the gc requirements may
> change completely without me realizing it.
> I'm interested in not having to think about it :).

You're concerned that a new feature may increase how high of a
threshold you need, yet it could also exceed the "maximum" of your
adaptive scheme.

I'm not convinced you need that high of a threshold anyway.  I'd like
to see a benchmark showing how your app performs at different levels.

Adam Olsen, aka Rhamphoryncus

Aaron Watters wrote:
> It would be nice if the threshold would adjust based
> on the performance characteristics of the app.
> In particular it'd be nice if the garbage collector would
> notice when it's never finding anything and wait longer
> everytime it finds nothing for the next collection attempt.

Have you read the code and comments in Modules/gcmodule.c? The cyclic GC
has three generations. A gc sweep for the highest generation is started
every 70,000 instructions. You can tune the levels for the generations
yourself through the gc module set threshold function.


On Tue, Mar 11, 2008 at 12:57 PM, Christian Heimes <lists at> wrote:
> Aaron Watters wrote:
>  > It would be nice if the threshold would adjust based
>  > on the performance characteristics of the app.
>  > In particular it'd be nice if the garbage collector would
>  > notice when it's never finding anything and wait longer
>  > everytime it finds nothing for the next collection attempt.
>  Have you read the code and comments in Modules/gcmodule.c? The cyclic GC
>  has three generations. A gc sweep for the highest generation is started
>  every 70,000 instructions. You can tune the levels for the generations
>  yourself through the gc module set threshold function.

Not instructions.  There's a counter that's incremented on allocation
and decremented on deallocation.  Each time it hits 700 it triggers a
collection.  The collections are normally only gen0, but after 10 it
does a gen1 collection.  After 10 of the second generation it does the
gen2 (ie a full collection.)

Although, given the way the math is done, I think the 701st object
will be the one that triggers the gen0 collection, and the 12th time
that happens it does gen1 instead.  I get a grand total of.. 93233
objects to trigger a gen2 collection.  Not that it matters.
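For anyone who wants to experiment with these counters, here is a small
sketch using the gc module. gc.get_threshold() and gc.set_threshold()
are real functions; the helper gen0_threshold and the value 100000 are
assumptions made up for the example, and the default thresholds may
differ between Python versions.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gen0_threshold(threshold0):
    """Temporarily change the gen0 allocation threshold (hypothetical helper)."""
    old = gc.get_threshold()            # (threshold0, threshold1, threshold2)
    gc.set_threshold(threshold0, *old[1:])
    try:
        yield
    finally:
        gc.set_threshold(*old)          # always restore the previous values


with gen0_threshold(100000):
    inside = gc.get_threshold()[0]
    # Allocation-heavy work with no reference cycles runs here with far
    # fewer collection attempts.
    data = {(hex(i), oct(i)): str(i) for i in range(10000)}

restored = gc.get_threshold()
```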

Without more detailed information on what the app is doing we can't
seriously attempt to improve the heuristics for it.

Adam Olsen, aka Rhamphoryncus

> You're concerned that a new feature may increase how high of a
> threshold you need, yet it could also exceed the "maximum" of your
> adaptive scheme.
> I'm not convinced you need that high of a threshold anyway.  I'd like
> to see a benchmark showing how your app performs at different levels.

You are absolutely right that I can set a threshold high enough.
With the default values Python is extremely slow for certain cases.
I'm arguing it should automatically detect when it is being stupid
and attempt to fix it.  In particular I would set the maximum very
high and start at the minimum, which might be near the current
defaults.

For example I get the following in a simple test (python2.6):

> python
gc not disabled
elapsed 19.2473409176
> python disable
gc disabled
elapsed 4.88715791702

In this case the interpreter is spending 80% of its time trying
to collect non-existent garbage.  Now a newbie who
didn't know to go fiddling with the garbage collector
might just conclude "python is ssslllooowwww" and go
back to using Perl or Ruby or whatever in a case like this.
Maybe the powers that be couldn't care less about it, I don't
know.  (I know newbies can be irritating).

The problem is quadratic also: if I double the limit the
penalty goes up by a factor of 4.

Here is the source:

def test(disable=False, limit=1000000):
    from time import time
    import gc
    if disable:
        gc.disable()
        print "gc disabled"
    else:
        print "gc not disabled"
    now = time()
    # build a large dict of tuples and strings; none of it is cyclic garbage
    D = {}
    for i in range(limit):
        D[ (hex(i), oct(i)) ] = str(i)+repr(i)
    L = [ (y,x) for (x,y) in D.iteritems() ]
    elapsed = time()-now
    print "elapsed", elapsed

if __name__=="__main__":
    import sys
    disable = False
    if "disable" in sys.argv:
        disable = True
    test(disable)
-- Aaron Watters

On Tue, Mar 11, 2008 at 1:57 PM, Aaron Watters <aaron.watters at> wrote:
> >
> >
> > You're concerned that a new feature may increase how high of a
> > threshold you need, yet it could also exceed the "maximum" of your
> > adaptive scheme.
> >
> > I'm not convinced you need that high of a threshold anyway.  I'd like
> > to see a benchmark showing how your app performs at different levels.
> You are absolutely right that I can set a threshold high enough.
> With the default values Python is extremely slow for certain cases.
>  I'm arguing it should automatically detect when it is being stupid
> and attempt to fix it.  In particular I would set the maximum very
> high and start at the minimum, which might be near the current
> defaults.
> For example I get the following in a simple test (python2.6):
> > python
> gc not disabled
> elapsed 19.2473409176
> > python disable
> gc disabled
> elapsed 4.88715791702
> In this case the interpreter is spending 80% of its time trying
>  to collect non-existent garbage.  Now a newbie who
> didn't know to go fiddling with the garbage collector
> might just conclude "python is ssslllooowwww" and go
> back to using Perl or Ruby or whatever in a case like this.
>  Maybe the powers that be couldn't care less about it, I don't
> know.  (I know newbies can be irritating).
> The problem is quadratic also: if I double the limit the
> penalty goes up by a factor of 4.
>  Here is the source:
> def test(disable=False, limit=1000000):
>     from time import time
>     import gc
>     if disable:
>         gc.disable()
>         print "gc disabled"
>     else:
>         print "gc not disabled"
>     now = time()
>     D = {}
>     for i in range(limit):
>         D[ (hex(i), oct(i)) ] = str(i)+repr(i)
>     L = [ (y,x) for (x,y) in D.iteritems() ]
>     elapsed = time()-now
>     print "elapsed", elapsed
> if __name__=="__main__":
>     import sys
>     disable = False
>     if "disable" in sys.argv:
>         disable = True
>     test(disable)

Interesting.  With some further testing, it's become clear that the
problem is in gen2.  gen0 and gen1 both add a constant overhead (their
size is bounded), but gen2's size grows linearly, and with a linear
number of scans that gives quadratic performance.

I'm unsure how to best fix this.  Anything we do will effectively
disable gen2 for short-running programs, unless they do the right
thing to trigger the heuristics.  Long running programs have a little
more chance of triggering them, but may do so much later than

Something must be done though.  The costs should be linear with time,
not quadratic.  The frequency at which an object gets scanned should
be inversely proportional to the number of objects to be scanned.
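A toy model makes the difference visible. It is not how gcmodule.c
works: it assumes that nothing is ever freed, that a full scan touches
every live object once, and it only counts objects touched. The
function names, the fixed interval of 1000 and the "quarter of the
heap" rule are all made up for the illustration.

```python
def total_scan_work(n_allocations, next_interval):
    """Total objects examined by full scans while the heap grows."""
    work = 0
    heap = 0
    since_scan = 0
    interval = next_interval(heap)
    for _ in range(n_allocations):
        heap += 1
        since_scan += 1
        if since_scan >= interval:
            work += heap                # a full scan touches every live object
            since_scan = 0
            interval = next_interval(heap)
    return work


def fixed_interval(heap):
    """Scan after every 1000 allocations, whatever the heap size."""
    return 1000


def proportional_interval(heap):
    """Wait for allocations equal to a quarter of the heap (at least 1000)."""
    return max(1000, heap // 4)


fixed_small = total_scan_work(100000, fixed_interval)    # 5050000
fixed_large = total_scan_work(200000, fixed_interval)    # 20100000
prop_small = total_scan_work(100000, proportional_interval)
prop_large = total_scan_work(200000, proportional_interval)
```

With the fixed interval, doubling the number of allocations roughly
quadruples the work; with the proportional interval the work grows
roughly in step with the number of allocations.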

Adam Olsen, aka Rhamphoryncus

As per Aahz's suggestion, I'm moving this discussion here, from Python-Dev.
(Thanks Aahz!)

Mark Dickinson wrote:
> On Thu, Mar 13, 2008 at 4:20 AM, Imri Goldberg <lorgandon at 
> <mailto:lorgandon at>> wrote:
>     My suggestion is to do either of the following:
>     1. Change floating point == to behave like a valid floating point
>     comparison. That means using precision and some error measure.
>     2. Change floating point == to raise an exception, with an error
>     string
>     suggesting using precision comparison, or the decimal module.
> I don't much like either of these;  I think option 1 would cause
> a lot of confusion and difficulty---it changes a conceptually
> simple operation into something more complicated.
> As for option 2., I'd agree that there are situations where having
> a warning (not an exception) for floating-point equality (and
> inequality) tests might be helpful;  but that warning should be
> off by default, or at least easily turned off.
As I said earlier, I'd like static checkers (like Python-Lint) to catch 
this sort of cases, whatever the decision may be.
> Some Fortran compilers have such a (compile-time) warning,
> I believe.  But Fortran's users are much more likely to be
> writing the sort of code that cares about this.
>     Since this change is not backwards compatible, I suggest it be added
>     only to Python 3.
> It's already too late for Python 3.0.
Still, I believe it is worth discussing.
>     3. Programmers will still need the regular ==:
>     Maybe, and even then, only for very rare cases. For these, a special
>     function\method might be used, which could be named floating_exact_eq.
> I disagree with the 'very rare' here.  I've seen, and written, code like:
> if a == 0.0:
>     # deal with exceptional case
> else:
>     b = c/a
>     ...
> or similarly, a test (a==b) before doing a division by a-b.  That
> one's kind of dodgy, by the way:  a != b doesn't always guarantee
> that a-b is nonzero, though you're okay if you're on an IEEE 754
> platform and a and b are both finite numbers.
While checking against a==0.0 (and other similar conditions) before 
dividing will indeed protect from outright division by zero, it will 
enlarge any error you will have in the computation. I guess it would be 
better to do the same check for 'a is small' for appropriate values of
'small'.
> Or what if you wanted to generate random numbers in the open interval
> (0.0, 1.0).  random.random gives you numbers in [0.0, 1.0), so a
> careful programmer might well write:
> while True:
>     x = random.random()
>     if x != 0.0:
>         break
> (A less fussy programmer might just say that the chance
> of getting 0.0 is about 1 in 2**53, so it's never going to happen...)
> Other thoughts:
>  - what should x == x do?
If suggestion no. 1 is accepted, always return True. If no. 2 is 
accepted, raise an exception.
Checking x==x is as meaningful as checking x==y.
>  - what should
> 1.0 in set([0.0, 1.0, 2.0])
> and 
> 3.0 in set([0.0, 1.0, 2.0])
> do?
Actually, one of the reasons I thought about this subject in the first 
place, was dict lookup for floating point numbers. It seems to me that 
it's something you just shouldn't do.
As for your examples, I believe these two should both raise an 
exception. This is even worse than normal comparison - here you are 
checking against the hash of a floating point number. So if you do that 
in the current implementation, there's a good chance you'll get 
unexpected results. If you do that given the implementation of 
suggestion 1, you'll have a hard time making set work.
> Mark

Imri Goldberg
Insert Signature Here

Aargh!  Sorry about the multiple emails.  The first one bounced because I
wasn't subscribed to python-ideas, so I canceled it and sent it again,
forgetting that you would still have got a copy of the first email.

And now I'm sending you a third one, just to apologise for the second one
(or was it the first?)

Double apologies, and I'll try not to do it again.

(with apologies for the random extra level of quoting in the below...)

> On Thu, Mar 13, 2008 at 11:09 AM, Imri Goldberg <lorgandon at>
> wrote:
> As I said earlier, I'd like static checkers (like Python-Lint) to catch
> > this sort of cases, whatever the decision may be.
> >
Hmm.  Isn't that tricky?  How does the static checker decide
whether the objects being compared are floats?  I guess one could
be content with catching some cases where the operands to ==
are clearly floats...  Wouldn't you have to have run-time warnings
to be really sure of catching all the cases?

> > It's already too late for Python 3.0.
> > Still, I believe it is worth discussing.
> >

Sure.  I didn't mean that to come out in quite the dismissive way it did :).
Apologies.  Maybe a PEP aimed at Python 4.0 is in order.  If you're open
to the idea of just having some way to enable warnings, it could be
much sooner.

> > While checking against a==0.0 (and other similar conditions) before
> > dividing will indeed protect from outright division by zero, it will
> > enlarge any error you will have in the computation. I guess it would be
> > better to do the same check for 'a is small' for appropriate values of
> > 'small'.
Still, a check for 0.0 is good enough in some cases:  if a is tiny, the
large intermediate values may appear and then disappear happily
before giving a sensible final result.  These are usually the sort
of cases where just having division by 0.0 return an infinity
would have "just worked" too (making the whole "if" redundant), but
that's not (currently!) an option in Python.

It's a truism that floating-point equality tests should be avoided, but
it's just not true that floating-point equality testing is *always* wrong,
and I don't think that Python should make it so.

> > Actually, one of the reasons I thought about this subject in the first
> > place, was dict lookup for floating point numbers. It seems to me that
> > it's something you just shouldn't do.
> >
So your proposal would presumably include making

  x in dict

and

  x not in dict

errors for any float x, regardless of the contents of the dictionary
(or list, or set, or frozenset, or...) dict?

What would you do about Decimals? A Decimal is just another
floating point format (albeit base 10 instead of base 2); so
presumably all these warnings/errors should apply equally
to Decimal instances?  If not, why not?

I'm not trying to be negative here---as Aahz says, this is an
interesting idea;  I'm just trying to understand exactly how
things might work.

Mark Dickinson wrote:

> (with apologies for the random extra level of quoting in the below...)
>     On Thu, Mar 13, 2008 at 11:09 AM, Imri Goldberg
>     <lorgandon at <mailto:lorgandon at>> wrote:
>         As I said earlier, I'd like static checkers (like Python-Lint)
>         to catch
>         this sort of cases, whatever the decision may be.
> Hmm.  Isn't that tricky?  How does the static checker decide
> whether the objects being compared are floats?  I guess one could
> be content with catching some cases where the operands to ==
> are clearly floats...  Wouldn't you have to have run-time warnings
> to be really sure of catching all the cases?

Yes. Writing a static-checker for Python is tricky in any case. For the 
sake of this discussion, it might be useful to refer to some 'ideal' 
static checker. This will allow us to better define what is the desired 

>         > It's already too late for Python 3.0.
>         Still, I believe it is worth discussing.
> Sure.  I didn't mean that to come out in quite the dismissive way it 
> did :).
> Apologies.  Maybe a PEP aimed at Python 4.0 is in order.  If you're open
> to the idea of just having some way to enable warnings, it could be
> much sooner.

I think that generating a warning (by default?) is a strong enough 
change in the right direction, so we should add that as another option. 
(Was also suggested in a comment on my blog.)

>         While checking against a==0.0 (and other similar conditions)
>         before
>         dividing will indeed protect from outright division by zero,
>         it will
>         enlarge any error you will have in the computation. I guess it
>         would be
>         better to do the same check for 'a is small' for appropriate
>         values of
>         'small'.
> Still, a check for 0.0 is good enough in some cases:  if a is tiny, the
> large intermediate values may appear and then disappear happily
> before giving a sensible final result.  These are usually the sort
> of cases where just having division by 0.0 return an infinity
> would have "just worked" too (making the whole "if" redundant), but
> that's not (currently!) an option in Python.
> It's a truism that floating-point equality tests should be avoided, but
> it's just not true that floating-point equality testing is *always* wrong,
> and I don't think that Python should make it so.

Alright, that's why in my original suggestion, I proposed a function for 
'old-style' comparison.
It still seems to me that in most cases you are better off doing 
something other than using the current ==.

A point I'm not sure of, though, is what happens to the other comparison
operators, namely,
<=, <, >, >=. If they retain their original meaning then <= and >=
become at least a bit inconsistent.
I'll be glad to hear more opinions about this.

>         Actually, one of the reasons I thought about this subject in
>         the first
>         place, was dict lookup for floating point numbers. It seems to
>         me that
>         it's something you just shouldn't do.
> So your proposal would presumably include making
>   x in dict
> and
>   x not in dict
> errors for any float x, regardless of the contents of the dictionary
> (or list, or set, or frozenset, or...) dict?
> What would you do about Decimals? A Decimal is just another
> floating point format (albeit base 10 instead of base 2); so
> presumably all these warnings/errors should apply equally
> to Decimal instances?  If not, why not?

This last note gave me pause. I still need to think more about this, but 
here are my thoughts so far:

1. Decimal's behavior might be considered even more inconsistent - the 
precision applies to arithmetical operations, but not to comparisons.
2. As a result, it seems to me that decimal's behavior might also be 
It needn't be the same change as regular floating point though - decimal 
behavior might follow suggestion 1, while regular floating points might 
follow suggestion 2. (I see no point in it being the other way around 
3. Usage in containers depending on __hash__ should change according to 
how == behaves for decimals. If == raises a warning/exception, so
should "x in {..}". If == will be changed to work according to precision 
for decimals, then usage in containers will be (very) problematic, 
because of context changes. (Consider what happens when changing the 
4. Right now, I would avoid using decimal or regular floating points in 
such containers. The results are just not predictable enough. Using the 
'ideal static-checker' mentioned above, I'd say that any such use should 
result in a warning.

In any case, there might be a place for a way to do floating point 
comparisons in a 'standard' manner.
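As an illustration of what such a 'standard' comparison could look
like, here is a sketch using math.isclose, which was added to the
standard library in Python 3.5 (long after this thread). The tolerance
1e-9 passed as abs_tol is an arbitrary choice for the example.

```python
import math

# Exact equality is sensitive to rounding in the last bit.
exact = (0.1 + 0.2 == 0.3)                               # False

# math.isclose compares using a relative tolerance (rel_tol=1e-09 by
# default).
close = math.isclose(0.1 + 0.2, 0.3)                     # True

# A relative tolerance is of no use near zero; an absolute tolerance
# is needed to express "a is small".
near_zero_rel = math.isclose(1e-10, 0.0)                 # False
near_zero_abs = math.isclose(1e-10, 0.0, abs_tol=1e-9)   # True
```

The last two lines show why the 'a is small' check mentioned earlier
needs a tolerance chosen for the problem at hand rather than a single
built-in default.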

> I'm not trying to be negative here---as Aahz says, this is an
> interesting idea;  I'm just trying to understand exactly how
> things might work.
> Mark

Sure, so do I.


Imri Goldberg

Message-ID: <>

This might be a minor thing, but I kind of wish that I could write this:

sys.stderr.print('first line')
sys.stderr.print('another line here')
sys.stderr.print('and again')

instead of:

print('first line', file=sys.stderr)
print('another line here', file=sys.stderr)
print('and again', file=sys.stderr)

As it's a lot easier to read for me. Of course you can always add
spaces to make the lines line up, but with a long print statement your
eye has to go a long distance to figure out what file, if any, you're
printing to. It could be pretty simple to add:

class ...:
  def print(self, *args, **kwargs):
    io.print(file=self, *args, **kwargs)

I haven't been able to find any discussion on this, has this already
been rejected?

From greg.ewing at  Thu Mar 13 23:02:19 2008
From: greg.ewing at (Greg Ewing)
Date: Fri, 14 Mar 2008 11:02:19 +1300
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <>
References: <>
Message-ID: <>

Imri Goldberg wrote:
> what happens to other comparison 
> operators, namely,
> <=, <, >, >=. If they retain their original meaning then <= and >= 
> become at least a bit inconsistent.

Also, if you have <= and >= then you can cheat by
doing 'x <= y and x >= y'. :-)


From lorgandon at  Thu Mar 13 23:18:35 2008
From: lorgandon at (Imri Goldberg)
Date: Fri, 14 Mar 2008 00:18:35 +0200
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
Message-ID: <>

Greg Ewing wrote:
> Imri Goldberg wrote:
>> what happens to other comparison 
>> operators, namely,
>> <=, <, >, >=. If they retain their original meaning then <= and >= 
>> become at least a bit inconsistent.
> Also, if you have <= and >= then you can cheat by
> doing 'x <= y and x >= y'. :-)

That's part of what I meant.

There's also the problem that if x>y, then you want x!=y. This means 
that there are implications for all comparison operators.

This makes changing == behavior to an epsilon comparison more involved. 
I still think it is feasible, but will require much more consideration.

In any case, emitting a warning for == is still 'cheap', and the 
original arguments stand.

Imri Goldberg

I'd rather use functools.partial than introduce new syntax:

import functools
import sys

print_to_stderr = functools.partial(print, file=sys.stderr)

print_to_stderr('first line')
print_to_stderr('second line')

- Tal

> This might be a minor thing, but I kind of wish that I could write this:
> sys.stderr.print('first line')
> sys.stderr.print('another line here')
> sys.stderr.print('and again')
> instead of:
> print('first line', file=sys.stderr)
> print('another line here', file=sys.stderr)
> print('and again', file=sys.stderr)
> As it's a lot easier to read for me. Of course you can always add
> spaces to make the lines line up, but with a long print statement your
> eye has to go a long distance to figure out what file, if any, you're
> printing to. It could be pretty simple to add:
> class ...:
>  def print(*args, **kwargs):
>    io.print(file=self, *args, **kwargs)
> I haven't been able to find any discussion on this, has this already
> been rejected?
On Thu, Mar 13, 2008 at 6:18 PM, Imri Goldberg <lorgandon at> wrote:

> This makes changing == behavior to an epsilon comparison more involved.
> I still think it is feasible, but will require much more consideration.

Okay, now I am going to be negative. :-)

I really think that there's essentially zero chance of == and != ever
changing to 'fuzzy' comparisons in Python.  I don't want to discourage you
from working out possible details as an academic exercise, or perhaps with
some other (Python-like?) language in mind, but I just don't see it ever
happening in Python.  Maybe I'm wrong, in which case I hope other python
people will tell me so, but I think pursuing this is, in the end, going to
be a waste of time.

Some reasons, and then I'll shut up:

Too much complication and magic implicit stuff going on
behind the scenes.  In a fuzzy a == b there are hidden choices about the
fuzziness scheme and the amount of fuzz to allow, and those choices
are going to confuse the hell out of newbie and expert programmers alike.

As above, you'd have to choose defaults for the fuzziness, and by Murphy's
Law those defaults would be wrong for almost everybody else's particular
applications, meaning that almost everybody else would have to go away
and learn about how to change or turn off the fuzziness.

Fundamental and well-understood laws (trichotomy, transitivity of equality)
would break.  It's really unclear how the other comparison operators
would be affected.  If 1.0 == 1.0+2e-16 returns True, shouldn't
1.0 >= 1.0+2e-16 also return True?
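
A minimal sketch of the transitivity and trichotomy problem follows. The fuzzy_eq helper and its tolerance of 1.0 are hypothetical, chosen only so that the arithmetic is exact.

```python
def fuzzy_eq(x, y, tol=1.0):
    """Hypothetical 'fuzzy' equality: true when x and y differ by at most tol."""
    return abs(x - y) <= tol

a, b, c = 0.0, 0.75, 1.5

a_eq_b = fuzzy_eq(a, b)   # differ by 0.75, within tol
b_eq_c = fuzzy_eq(b, c)   # differ by 0.75, within tol
a_eq_c = fuzzy_eq(a, c)   # differ by 1.5, outside tol: transitivity is lost

# If < keeps its exact meaning, a is both "less than" and "equal to" b.
less_and_equal = (a < b) and fuzzy_eq(a, b)
```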

Containers would be affected in peculiar ways.  I think people would be
really surprised to find that 1.0+2e-16 *was* an element of the set {1.0},
or that 1.0 and 1.0+2e-16 weren't allowed to be different keys in a dict.
And how on earth do you check for set or dict membership under the
hood?

I don't know of any other language that has successfully done this, even
though I've seen the idea floated many times for different languages.
That doesn't mean much, since I only know a small handful of the many
hundreds (thousands?) of languages out there.  If you know a
counterexample, I'd be interested to hear it.

Tiago A.O.A. wrote:
> I would suggest something like a ~= b, for "approximately equal to". How 
> approximately? Well, there would be a default that could be changed 
> somewhere.
> Don't know if it's all that useful, though.

Don't forget a !~ b, a <~ b, and a >~ b, and the associated __sim__, 
__nsim__, __ltsim__, and __gtsim__ slots.

I'm not at all sure how serious I am right now. It's late, and I have 
fuzzy recollections of how those kinds of things might have been nice in 
some past numerical code.

And then =~ and !~ could be defined for strings and do regular 
expression matching! Woo! More operators! With pronouns!


From lorgandon at  Fri Mar 14 10:01:28 2008
From: lorgandon at (Imri Goldberg)
Date: Fri, 14 Mar 2008 11:01:28 +0200
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <>
References: <>	
	<> <>	
Message-ID: <>

Mark Dickinson wrote:

> On Thu, Mar 13, 2008 at 6:18 PM, Imri Goldberg <lorgandon at 
> <mailto:lorgandon at>> wrote:
>     This makes changing == behavior to an epsilon comparison more
>     involved.
>     I still think it is feasible, but will require much more
>     consideration.
> Okay, now I am going to be negative. :-)
> I really think that there's essentially zero chance of == and != ever 
> changing
> to 'fuzzy' comparisons in Python.  I don't want to discourage you from 
> working
> out possible details as an academic exercise, or perhaps with some other
> (Python-like?) language in mind, but I just don't see it ever 
> happening in Python.
> Maybe I'm wrong, in which case I hope other python people will tell me so,
> but I think pursuing this is, in the end, going to be a waste of time.

Alright, I agree it's a good idea to drop the proposal to changing 
floating point == into an epsilon compare.
What about issuing a warning though?
Consider the following course of action. It is the one with the least 
changes:

== for regular floating point numbers now issues a warning, but still 
works. This warning might be turned off. All other operators are left 
unchanged.
Do you think this should be dropped as well?
Just for my own code, I think I'd like this behavior. I still consider 
floating point == a potential bug, and this helps me catch it, in the 
absence of the 'ideal static checker'.

> Containers would be affected in peculiar ways.  I think people would be
> really surprised to find that 1.0+2e-16 *was* an element of the set {1.0},
> or that 1.0 and 1.0+2e-16 weren't allowed to be different keys in a dict.
> And how on earth do you check for set or dict membership under the 
> hood?

I think that right now containers behave in peculiar ways when used with 
FP numbers.
Take set for example - you might as well just use list instead of it.
When you consider dict, then doing d[x] might not return the result you 
actually want.
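
The behaviour described here can be seen with a value that is computed rather than written as a literal. This is only an illustration of current semantics; the key and value names are made up.

```python
x = 0.1 + 0.2                 # mathematically 0.3, but a different double than the literal 0.3

in_set = x in {0.3}           # set membership uses hash and ==
in_list = x in [0.3]          # list membership uses == only, with the same result here
d = {0.3: "value"}
lookup = d.get(x)             # None: x is treated as a different key
```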

> I don't know of any other language that has successfully done this, even
> though I've seen the idea floated many times for different languages.
> That doesn't mean much, since I only know a small handful of the many
> hundreds (thousands?) of languages out there.  If you know a
> counterexample, I'd be interested to hear it.
> Mark
Don't know of a good counterexample. I agree that before changing the 
behavior of == to fuzzy comparison, you'll want  experience with that 
kind of change.


Imri Goldberg

From aaron.watters at  Fri Mar 14 15:40:58 2008
From: aaron.watters at (Aaron Watters)
Date: Fri, 14 Mar 2008 10:40:58 -0400
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
Message-ID: <>

For systems programming I often use
floats as timestamps in dictionaries,
and in this case I never do calculations
and all I care about is "same" or "different",
meaning "any single bit difference".

If you change the way == works and
also follow through and change the way
floats in dictionaries work you would
probably break very many applications
like this.  I think any "almost equal"
should be implemented using a new
method or syntax x~=y rather than
break things.

   - Aaron Watters
On Fri, Mar 14, 2008 at 5:01 AM, Imri Goldberg <lorgandon at> wrote:

> Alright, I agree it's a good idea to drop the proposal to changing
> floating point == into an epsilon compare.
> What about issuing a warning though?
> Consider the following course of action. It is the one with the least
> changes:
> == for regular floating point numbers now issues a warning, but still
> works. This warning might be turned off. All other operators are left
> unchanged.

> Do you think this should be dropped as well?

To be honest, yes.  There isn't currently a SmellyCodeWarning or
IsThatReallyWhatYouMeanWarning in Python, and there doesn't
seem to be a lot of precedent for warning on code constructs that
may often be wrong but also have legitimate uses.  Most of
the current warnings have more to do with syntactic or semantic
changes between various versions of Python.

But I think it would be entirely appropriate to warn about
floating-point (in)equality checks in something like PyChecker
or Pylint, if you can get past the technical difficulties of detecting
floating-point comparisons statically.

Mark Dickinson writes:
> There isn't currently a SmellyCodeWarning ... in Python

Though, clearly, that's what DeprecationWarning should immediately be
renamed to :-).


>>     My suggestion is to do either of the following:
>>     1. Change floating point == to behave like a valid floating point
>>     comparison. That means using precision and some error measure
There are two ways:

1. python users have to know, that representation of float has some 

2. python users must not care about internal float representation

Solution "2." is not good, because someday somebody will complain that 
computer calculations are not accurate (some scientist who was not 
willing to learn how a computer stores floats).

It is better to choose "1." -- beginners will have to accept that a 
computer is not able to store every real number, because floats are 
stored as binary numbers. Maybe operator "==" for floats should be 
deprecated, and people should use something like "!~" or "=~", and they 
should be able to set the precision for float numbers?

On 3/13/08, Leszek Dubiel <leszek at> wrote:
>  I would suggest to deprecate one-element tuple construction with comma at
> the end, because this looks ugly, is not self-evident for other people
> reading code, looks like some type of trickery.

Notice that nearly all reasonable programming languages allow for an
extra trailing comma in a comma-delimited list, for consistency and to
make programmatic code generation easier.  So for example, [1,2,3,] is
intentionally correct.

Once you accept (1,2,3,) as a reasonable tuple, you'll see that
disallowing or deprecating (1,) is wrong, too.

> >>> tuple(['hello'])
>  ('hello',)

Tuples are a fundamental data type, and it would be irresponsible to
steer beginners away from their simple literal syntax.

It's mildly unfortunate that (1,2,) (1,2) (1,) and () represent tuples
but (1) doesn't.  But this rule is simple, well motivated, and
described quite straightforwardly at the very page you link to.
Better to leave it as is, so beginners can learn the rule, accept it,
and move on.
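
The rule can be checked directly. The variable names below are only for illustration.

```python
empty = ()
single = (1,)
pair = (1, 2)
pair_trailing = (1, 2,)
not_a_tuple = (1)      # just a parenthesized int
bare = 1,              # the comma, not the parentheses, makes the tuple
```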

Greg F

On Thu, Mar 13, 2008 at 3:56 PM, Erick Tryzelaar
<idadesub at> wrote:
> This might be a minor thing, but I kind of wish that I could write this:
>  sys.stderr.print('first line')
>  sys.stderr.print('another line here')
>  sys.stderr.print('and again')
>  instead of:
>  print('first line', file=sys.stderr)
>  print('another line here', file=sys.stderr)
>  print('and again', file=sys.stderr)
>  As it's a lot easier to read for me. Of course you can always add
>  spaces to make the lines line up, but with a long print statement your
>  eye has to go a long distance to figure out what file, if any, you're
>  printing to. It could be pretty simple to add:
>  class ...:
>   def print(*args, **kwargs):
>     io.print(file=self, *args, **kwargs)
>  I haven't been able to find any discussion on this, has this already
>  been rejected?

It was brought up, considered, and rejected. The reason is that it
would require *every* stream-like object to implement the print()
functionality, which is rather hairy; or subclass a specific base
class, which we traditionally haven't required. Making it a function
that takes a file argument avoids these problems.

And, by the way, it's too late to bring up new py3k proposals.

--Guido van Rossum (home page:

From greg at  Fri Mar 14 21:46:43 2008
From: greg at (Gregory P. Smith)
Date: Fri, 14 Mar 2008 15:46:43 -0500
Subject: [Python-ideas] One-element tuple
In-Reply-To: <>
References: <>
Message-ID: <>

-1 on deprecating the syntax.  The tuple(['hello']) syntax is much, much
slower, a factor of 20x here.  A trailing comma when you only have one item
is how Python tuple syntax is defined, allowing tuples to use ()s instead of
needing other tokens.


On 3/12/08, Leszek Dubiel <Leszek.Dubiel at> wrote:
> I would suggest to deprecate one-element tuple construction with comma at
> the end, because this looks ugly, is not self-evident for other people
> reading code, looks like some type of trickery.
> I would prefer tutorial (
> to use instead of
>  >>> ('hello',)
> ('hello',)
> this syntax:
>  >>> tuple(['hello'])
> ('hello',)
> .
> PS.
> Functions set(), tuple(), list() and dict() are good!
> Syntax
>     myset = {'a', 'b'}
> is absolutely perfect too!
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
Neil Toronto wrote:

> Don't forget a !~ b, a <~ b, and a >~ b, and the associated __sim__, 
> __nsim__, __ltsim__, and __gtsim__ slots.

I think that all of these are a bad idea. In my experience,
when comparing with a tolerance, you need to think carefully
about what the appropriate tolerance is for each and every
comparison. Having a global default tolerance would just
lead people to write sloppy and unreliable numerical code.


From greg.ewing at  Sat Mar 15 00:34:12 2008
From: greg.ewing at (Greg Ewing)
Date: Sat, 15 Mar 2008 12:34:12 +1300
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Imri Goldberg wrote:

> == for regular floating point numbers now issues a warning, but still 
> works. This warning might be turned off.

I think I would find it annoying to have to disable a warning
whenever I legitimately wanted to do a floating ==.

Also, having a global warning/no warning setting for the
whole program isn't really right -- whether a floating == is
legitimate is something that needs to be decided on a
case-by-case basis.


From greg.ewing at  Sat Mar 15 01:27:32 2008
From: greg.ewing at (Greg Ewing)
Date: Sat, 15 Mar 2008 13:27:32 +1300
Subject: [Python-ideas] py3k: adding "print" methods to file-like objects
In-Reply-To: <>
References: <>
Message-ID: <>

> On Thu, Mar 13, 2008 at 3:56 PM, Erick Tryzelaar
> <idadesub at> wrote:
>> instead of:
>> print('first line', file=sys.stderr)
>> print('another line here', file=sys.stderr)
>> print('and again', file=sys.stderr)

Perhaps it would help if there were a function

   def fprint(f, *args):
     print(*args, file=f)

then the above could be written

   fprint(sys.stderr, 'first line')
   fprint(sys.stderr, 'another line here')
   fprint(sys.stderr, 'and again')

which to me is a lot easier to read, since the file argument
is in a consistent place, making it easier to see that it's
the same from one line to the next.

Also, it enables making the file argument very abbreviated, e.g.

   f = sys.stdout
   fprint(f, 'first line')
   fprint(f, 'another line here')
   fprint(f, 'and again')

Otherwise, the shortest you can get it down to is 'file=f',
which is 6 times as long. It might not seem much, but that's
5 less characters of print arguments that you can fit in
without having to split the line.


From idadesub at  Sat Mar 15 02:39:43 2008
From: idadesub at (Erick Tryzelaar)
Date: Fri, 14 Mar 2008 18:39:43 -0700
Subject: [Python-ideas] py3k: adding "print" methods to file-like objects
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Mar 14, 2008 at 5:27 PM, Greg Ewing <greg.ewing at> wrote:
>  Also, it enables making the file argument very abbreviated, e.g.
>    f = sys.stdout
>    fprint(f, 'first line')
>    fprint(f, 'another line here')
>    fprint(f, 'and again')
>  Otherwise, the shortest you can get it down to is 'file=f',
>  which is 6 times as long. It might not seem much, but that's
>  5 less characters of print arguments that you can fit in
>  without having to split the line.

In that case I think partial is a better option, since you could do:

from functools import partial
import sys

p = partial(print, file=sys.stderr)
p('first line')
p('another line here')
p('and again')

I completely forgot about partial which does a good job of filling in
for what I wanted. I just need to consider the combination of the py3k
stuff a bit more.

From larry at  Sat Mar 15 06:35:57 2008
From: larry at (Larry Hastings)
Date: Sat, 15 Mar 2008 00:35:57 -0500
Subject: [Python-ideas] Python Pragmas
Message-ID: <>

Recently-ish on c.l.py3k (iirc) folks were discussing how to write a 
script that exited with a human-friendly warning message if run under an 
incompatible version of the language.  The problem with this code:

     import sys
     if sys.version_info[0] < 3: sys.exit("Sorry, this script needs Python 3000")

is that the code only executes once tokenization is finished--if your 
script uses any incompatible syntax, it will fail in the tokenizer, most 
likely with an error message that doesn't make it particularly clear 
what is going on.

After thinking about the problem for a while, it hit me--this is best 
expressed as a "pragma".  For Python's purposes, I would define a 
"pragma" as an instruction to the tokenizer / compiler, executed 
immediately upon its complete tokenization.  The use case here is

     pragma version >= 3 # python version must be at least 3.0

Again, this would be executed immediately, aborting before the tokenizer 
has a chance to see some old syntax it didn't like.

What else might we use "pragma" for?  Well, consider that Python already 
has two specialized syntaxes that are really pragmas: "from __future__ 
import" and "# -*- coding: ".  I think this functionality would be more 
clearly expressed with a "pragma" syntax, for example:

     pragma encoding latin-1
     pragma enable floatdivision

It's a matter of taste, but I've never liked it when languages hide 
important directives in comments--isn't the compiler supposed to 
*ignore* comments?--nor do I like how "from __future__ import" doesn't 
really have anything to do with importing modules.  Your tastes may vary.

There was some discussion back in 2000 about adding a "pragma" to the
language:

It sounds like GvR wasn't wholly against the idea:

But nothing seems to have come of it.  The discussion died out in early
September of 2000, and I didn't find any subsequent revivals.

There was some worry back then that pragmas would be a slippery slope, 
resulting in increasingly elaborate pragma syntaxes until 
we--shudder!--wake up one day and have preprocessor macros.  I agree 
that we don't want to go too far down this slippery slope.  I have some 
specific suggestions on how we could obviate the temptation, but they 
are predicated on having pragmas at all, so I might as well keep quiet 
until such point as pragmas get traction.

If you'd like to discuss this in person--or just give me a good hard 
slap for even suggesting it--I'm bouncing around PyCon until Wednesday 
afternoon.  I'm the guy with the Facebook logos plastered around his person.



From greg at  Sat Mar 15 07:15:48 2008
From: greg at (Gregory P. Smith)
Date: Sat, 15 Mar 2008 01:15:48 -0500
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <>
References: <> <frc5d6$ueh$>
	<> <>
Message-ID: <>

On 3/14/08, Greg Ewing <greg.ewing at> wrote:
> Neil Toronto wrote:
> > Don't forget a !~ b, a <~ b, and a >~ b, and the associated __sim__,
> > __nsim__, __ltsim__, and __gtsim__ slots.
> I think that all of these are a bad idea. In my experience,
> when comparing with a tolerance, you need to think carefully
> about what the appropriate tolerance is for each and every
> comparison. Having a global default tolerance would just
> lead people to write sloppy and unreliable numerical code.

Agreed, no quick "fix" for float imprecision is going to make life better
for programmers beyond the first week.  Floats are imprecise; the sooner
programmers learn that, the better.  If you want things that can be compared
without thinking, use a decimal and avoid irrational numbers.  Good luck. ;)

Though I don't use them myself, I believe the popular math language packages
like matlab and mathematica may even allow you to compute all values with a
second error/precision component that gets mutated properly based on the
computations being done, so that you know the accuracy of your result without
manually having to calculate accuracy every step of the way based on the
algorithm and order of floating point operations used.  (If not, it's an
interesting idea and could be fleshed out as a pure python object
implementation by someone who cares about these things, to see if enough
people find it useful.)
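
A rough sketch of what such a pure Python object might look like is given below. The Approx class is hypothetical, it tracks only a worst-case absolute error bound, and it ignores the rounding error introduced by the operations themselves; a real implementation would need much more care.

```python
class Approx:
    """Hypothetical value-with-error-bound type (not a standard library class)."""

    def __init__(self, value, err=0.0):
        self.value = float(value)
        self.err = abs(float(err))   # absolute error bound, always non-negative

    def __add__(self, other):
        # Absolute error bounds add under addition.
        return Approx(self.value + other.value, self.err + other.err)

    def __mul__(self, other):
        # Worst-case bound for a product: |a|*eb + |b|*ea + ea*eb.
        err = (abs(self.value) * other.err
               + abs(other.value) * self.err
               + self.err * other.err)
        return Approx(self.value * other.value, err)

    def __repr__(self):
        return "Approx(%r, err=%r)" % (self.value, self.err)

a = Approx(2.0, 0.5)
b = Approx(1.0, 0.25)
total = a + b
product = a * b
```

The error component grows with every operation, which is the information the paragraph above suggests carrying along with each result.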

Larry Hastings schrieb:

> There was some discussion back in 2000 about adding a "pragma" to the
> language:
> It sounds like GvR wasn't wholly against the idea:
> But nothing seems to have come of it.  The discussion died out in early
> September of 2000, and I didn't find any subsequent revivals.

There is the "directive" PEP, which was rejected:


Larry Hastings <larry at> writes:

> Recently-ish on c.l.py3k (iirc) folks were discussing how to write a 
> script that exited with a human-friendly warning message if run under an 
> incompatible version of the language.  The problem with this code:
>      import sys
>      if sys.version_info[0] < 3: sys.exit("Sorry, this script needs Python 3000")
> is that the code only executes once tokenization is finished--if your 
> script uses any incompatible syntax, it will fail in the tokenizer, most 
> likely with an error message that doesn't make it particularly clear 
> what is going on.

Personally, in scenarios where I'm worried about that, I just make my
entry point script a thin one with code suitable for any releases I'm
worried about, and only import the main script once the version checks
have passed.  It also permits conditional importing of a version-specific
script when that's appropriate as well.

Avoids the need to introduce execution into the tokenizer, and trying
to worry about all the possible types of comparisons you might want to
make (and thus need to be supported by such execution).
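
A minimal sketch of such a thin entry point, under some assumptions: the REQUIRED version, the helper names and the module name passed to launch are all made up for illustration. The launcher itself sticks to syntax that old interpreters can compile.

```python
import sys

REQUIRED = (3, 0)   # assumed minimum version for this example

def version_ok(current=None, required=REQUIRED):
    """Return True when `current` (default: the running interpreter) is new enough."""
    if current is None:
        current = sys.version_info
    return tuple(current[:2]) >= tuple(required)

def launch(module_name):
    """Check the version first, then import the real program by name."""
    if not version_ok():
        sys.exit("Sorry, this script needs Python %d.%d or later" % REQUIRED)
    # The import happens only after the check, so new syntax in the real
    # module is never compiled by an interpreter that is too old.
    return __import__(module_name)
```

The entry script would then call something like launch("myapp_main"), where myapp_main is a placeholder for the version-specific module.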

-- David

I've given it more thought over the past few days.

Given the discussion here, and some more reading on my part, it seems to 
me that there isn't much chance for me convincing anyone to raise an 
exception on FP ==. I'm not too sure that it's the right move anyway. 
While I'll probably avoid FP == in my code, it seems to me that there 
are some cases it is useful (even given the inaccuracy of the results).

Regarding adding warnings to pychecker/pylint, I think it's a good idea. 
Probably for another mailing list though :).

Also, I considered the subject of runtime warnings as well.

Adding the relevant warnings to any static checker could be really hard 
work while warning during runtime could be a lot easier. Therefore, it 
seems worthwhile to consider this option. I didn't happen to use the 
warnings module before, so I read its documentation now (also the PEP) 
and played with it a little.

First, if a warning is generated for floating point ==, it can be turned 
off globally, or on a line-by-line basis.
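
As a sketch of how that could look with the warnings module: FloatEqualityWarning and WarnFloat below are hypothetical (no such category exists in Python), and a float subclass stands in for float because the built-in type cannot be patched.

```python
import warnings

class FloatEqualityWarning(UserWarning):
    """Hypothetical warning category; nothing like it exists in Python."""

class WarnFloat(float):
    """Stand-in for float that warns on ==."""

    def __eq__(self, other):
        warnings.warn("floating point == comparison",
                      FloatEqualityWarning, stacklevel=2)
        return float.__eq__(self, other)

    __hash__ = float.__hash__   # keep instances hashable despite defining __eq__

def count_warnings(action):
    """Run one comparison under the given filter action and count warnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action, FloatEqualityWarning)
        WarnFloat(0.5) == 0.5
    return len(caught)

shown = count_warnings("always")     # the warning is emitted
silenced = count_warnings("ignore")  # the category is turned off globally
```

For the line-by-line case, warnings.filterwarnings accepts module and lineno arguments, so a single known-good comparison can be exempted without silencing the rest.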

Second, regarding Mark's comment on SmellyCodeWarning. I thought about 
it a bit, and it seems no joke to me. gcc has a -Wall mode, so does 
Python. Why not use it in this situation? (i.e. having some warnings not 
displayed by default.)

I think it would be interesting to consider more cases of 
'SmellyCodeWarning' in general, and adding them under some warning 
category. If there's a need for a use case, we've already got the first 
one - floating point comparisons.



Imri Goldberg

Mark Dickinson wrote:

> On Fri, Mar 14, 2008 at 5:01 AM, Imri Goldberg <lorgandon at 
> <mailto:lorgandon at>> wrote:
>     Alright, I agree it's a good idea to drop the proposal to changing
>     floating point == into an epsilon compare.
>     What about issuing a warning though?
>     Consider the following course of action. It is the one with the least
>     changes:
>     == for regular floating point numbers now issues a warning, but still
>     works. This warning might be turned off. All other operators are left
>     unchanged.
>     Do you think this should be dropped as well?
> To be honest, yes.  There isn't currently a SmellyCodeWarning or
> IsThatReallyWhatYouMeanWarning in Python, and there doesn't
> seem to be a lot of precedent for warning on code constructs that
> may often be wrong but also have legitimate uses.  Most of
> the current warnings have more to do with syntactic or semantic
> changes between various versions of Python.
> But I think it would be entirely appropriate to warn about
> floating-point (in)equality checks in something like PyChecker
> or Pylint, if you can get past the technical difficulties of detecting
> floating-point comparisons statically.
>  Mark

I would suggest adding question 4.28 to the FAQ. Everybody who learns Python 
reads it and will not ask questions about "one-element tuples". I have 
compiled the answer from responses to my last question about one-element tuples.

Question: Why does Python allow a comma at the end of a list? This looks 
ugly and seems to break common rules...

Answer: There are many reasons, as follows.

1. If you define a multiline dictionary

        d = {
            "A": [1, 5],
            "B": [6, 7],  # last trailing comma is optional but good style
        }

it is easier to add more elements, because you don't have to care 
about commas -- you always put a comma at the end of the line and don't have 
to reedit other lines. It eases sorting of such lines too -- just cut a 
line and paste it above.

2. A missing comma can lead to errors that are hard to diagnose. For example:

        x = [
            "fee",
            "fie"
            "foo",
            "fum",
        ]

contains three elements "fee", "fiefoo" and "fum". So if the programmer always 
puts a comma at the end of the line, he saves lots of trouble in the future.

2. Nearly all reasonable programming languages (C, C++, Java) allow for an
extra trailing comma in a comma-delimited list, for consistency and to
make programmatic code generation easier. So for example [1,2,3,] is
intentionally correct.

3. Creating one-element tuples using tuple(['hello']) syntax is much much
slower (a factor of 20x here) than writing just ['hello', ].  Trailing 
commas when you only have one item are how python tuple syntax is defined
allowing them to use commas instead of needing other tokens. If python
didn't allow comma at the end of tuple, you would have to use such slow 
syntax.

4. The same rule applies to other types of lists, where a delimiter can occur
at the end. For example both strings "alfa\nbeta\n" and "alfa\nbeta" contain
two lines.
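
The last point can be checked with str.splitlines, which treats the newline as a line terminator in the same way a trailing comma terminates the last list element. The variable names are only for illustration.

```python
with_trailing = "alfa\nbeta\n".splitlines()
without_trailing = "alfa\nbeta".splitlines()

# split("\n") treats the newline as a separator instead, so the trailing
# newline produces an extra empty string.
as_separator = "alfa\nbeta\n".split("\n")

as_list = [1, 2, 3,]   # the trailing comma adds no extra element
```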



>  3. Creating one-element tuples using tuple(['hello']) syntax is much much
>  slower (a factor of 20x here) then writing just ['hello', ].

That should be ('hello',)

From jimjjewett at  Tue Mar 18 23:11:31 2008
On 3/17/08, Leszek Dubiel <leszek at> wrote:

>  I would suggest to add question 4.28 to faq. Everybody who learns Python
>  reads that and will not ask questions about "one-element tuples". I have
>  compiled answer from responses to my last question about one-element tuple.

This (with the followup correction) is great; please post it to the
Issue Tracker, so that someone (probably Georg) with commit privs can
check it in.


From jimjjewett at  Tue Mar 18 23:33:06 2008
On 3/13/08, Mark Dickinson <dickinsm at> wrote:
> On Thu, Mar 13, 2008 at 6:18 PM, Imri Goldberg <lorgandon at> wrote:

> I really think that there's essentially zero chance of == and != ever
> changing to 'fuzzy' comparisons in Python.

They sort of already did -- you can define __eq__ and __ne__ on your
own class in bizarre and inconsistent ways.  [Though I think you can't
easily override that (x is y) ==> (x==y).]

You can even do this with your own float-alike class.

What you're really asking for is that the float class take advantage of this.
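
A sketch of such a float-alike class follows. FuzzyFloat and its TOLERANCE value are hypothetical and chosen only for illustration.

```python
class FuzzyFloat(float):
    """Hypothetical float-alike whose == uses a fixed absolute tolerance."""

    TOLERANCE = 1e-9   # illustrative value, not a recommendation

    def __eq__(self, other):
        return abs(float(self) - float(other)) <= self.TOLERANCE

    def __ne__(self, other):
        return not self.__eq__(other)

    # Values that compare equal must hash equal, which a tolerance cannot
    # guarantee, so this sketch opts out of hashing altogether.
    __hash__ = None

x = FuzzyFloat(0.1 + 0.2)
y = FuzzyFloat(0.3)

fuzzy_equal = (x == y)
plain_equal = (float(x) == float(y))

try:
    {x}
    hashable = True
except TypeError:
    hashable = False
```

Opting out of hashing is one design choice among several; it sidesteps the container questions raised earlier in the thread rather than answering them.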

> I don't know of any other language that has successfully done this, ...

Changing an existing class requires that the class be "open".  That is
the default in languages like smalltalk or ruby.  It is even the
default for python classes -- but it is certainly not the default for
"python" classes that are actually coded in C -- which includes
floats.


From jimjjewett at  Tue Mar 18 23:42:28 2008
On 3/14/08, Imri Goldberg <lorgandon at> wrote:

> Alright, I agree it's a good idea to drop the proposal to changing
>  floating point == into an epsilon compare.
>  What about issuing a warning though?
>  Consider the following course of action. It is the one with the least
>  changes:

>  == for regular floating point numbers now issues a warning, but still
>  works. This warning might be turned off. All other operators are left
>  unchanged.

If you change ==, you should really change !=, and probably the other
comparisons as well.

I suspect what you really want is a warning on any usage of a floating
point.  And I'm only half-joking.  Comparison (or arithmetic) with
other floats adds error.  Comparison (or arithmetic) with ints is
*usually* a bug (unless one of the operands is a constant that someone
was too lazy to write correctly).


From dickinsm at  Wed Mar 19 00:58:22 2008
On Tue, Mar 18, 2008 at 6:33 PM, Jim Jewett <jimjjewett at> wrote:
>  They sort of already did -- you can define __eq__ and __ne__ on your
>  own class in bizarre and inconsistent ways.  [Though I think you can't
>  easily override that (x is y) ==> (x==y).]

Why not?  I get this with Python 2.5.1:

>>> from decimal import *
>>> Decimal.__eq__ = lambda x, y: False
>>> x = Decimal(2)
>>> x == x
False
>>> x is x
True

Or am I misunderstanding your meaning?

<unnecessary pedantry> Of course, even for floats it's not true
that x is y implies x == y:

>>> x = float('nan')
>>> x is x
True
>>> x == x
False

</unnecessary pedantry>

>  Changing an existing class requires that the class be "open".  That is
>  the default in languages like smalltalk or ruby.  It is even the
>  default for python classes -- but it is certainly not the default for
>  "python" classes that are actually coded in C -- which includes
>  floats.

You mean like:

>>> float.__eq__ = lambda x, y: False
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't set attributes of built-in/extension type 'float'

?  Presumably there are good reasons for this restriction
(performance? convenience? lack of round tuits?), but
I've no idea what they are.  I can't say that I've ever felt a
need to do anything like this.


From greg.ewing at  Wed Mar 19 01:55:53 2008
Jim Jewett wrote:
> Comparison (or arithmetic) with ints is
> *usually* a bug (unless one of the operands is a constant that someone
> was too lazy to write correctly).

That depends on what you regard as "correct". Python
generally permits a duck-typed approach to numbers
wherein using integers as a subset of floats is
considered legitimate, and not lazy at all.
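
A short Python 3 sketch of what that duck-typed view looks like in practice. The helper `mean` is a made-up example function; the `numbers` module is the standard-library numeric tower.

```python
import numbers


def mean(values):
    """Average any mix of ints and floats without caring which is which."""
    values = list(values)
    return sum(values) / len(values)


mixed = mean([1, 2.5, 4])       # ints and floats together
ints_only = mean([1, 2, 3])     # ints only, result is still a float

# The numeric tower treats every int as a real number.
is_real = isinstance(3, numbers.Real)

# Equal numbers hash equally, so 1 and 1.0 address the same dict entry.
same_key = {1: "one"}[1.0]
```

None of these uses involves a "lazy" constant; the int simply participates as a float-compatible value.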


From ggpolo at  Thu Mar 20 16:45:50 2008
I've read this idea about "preparing an existing RPC mechanism for the
standard library" at StandardLibrary ideas and I would be interested
in doing it, but as you all know, including something into stdlib is
not exactly easy and shouldn't be anyway. Also I'm not even sure if
this idea is still desired.

I'm considering the inclusion of rpyc, with appropriate changes
(possibly lots). And would like to know your opinions towards this.


-- Guilherme H. Polo Goncalves

On Thu, Mar 20, 2008 at 8:45 AM, Guilherme Polo <ggpolo at> wrote:
> Hello,
>  I've read this idea about "preparing an existing RPC mechanism for the
>  standard library" at StandardLibrary ideas and I would be interested
>  in doing it, but as you all know, including something into stdlib is
>  not exactly easy and shouldn't be anyway. Also I'm not even sure if
>  this idea is still desired.
>  I'm considering the inclusion of rpyc, with appropriate changes
>  (possibly lots). And would like to know your opinions towards this.

I know from my end I am not even familiar with rpyc so I have no
comment. And I suspect most other people have a similar reason for
having not commented on this so far.


From santagada at  Mon Mar 24 00:11:37 2008
On 20/03/2008, at 12:45, Guilherme Polo wrote:
> I'm considering the inclusion of rpyc, with appropriate changes
> (possibly lots). And would like to know your opinions towards this.

I think the route you would have to go is making a pep, and one of the
things I would like to see in this pep would be why rpyc and not any
of the other rpc modules around (like the not recommended for general
use zrpc or pyro or the thing the guys from twisted have). Only if
your pep is accepted I think you should waste your time making it better
for the stdlib.

The two "think" in my last paragraph are there because I am not sure
this is the right route, this is just a guess.

Leonardo Santagada

From taleinat at  Mon Mar 24 07:58:10 2008
On Mon, Mar 24, 2008 at 1:03 AM, Brett Cannon <brett at> wrote:
> On Thu, Mar 20, 2008 at 8:45 AM, Guilherme Polo <ggpolo at> wrote:
>  > Hello,
>  >
>  >  I've read this idea about "preparing an existing RPC mechanism for the
>  >  standard library" at StandardLibrary ideas and I would be interested
>  >  in doing it, but as you all know, including something into stdlib is
>  >  not exactly easy and shouldn't be anyway. Also I'm not even sure if
>  >  this idea is still desired.
>  >
>  >  I'm considering the inclusion of rpyc, with appropriate changes
>  >  (possibly lots). And would like to know your opinions towards this.
>  >
>  I know from my end I am not even familiar with rpyc so I have no
>  comment. And I suspect most other people have a similar reason for
>  having not commented on this so far.

I believe the reason that the OP is considering RPyC is because it is
the most Pythonic RPC mechanism of the lot. That, and its relative
simplicity, are the reasons I recently chose RPyC for a project, and
it worked out pretty well. If any RPC mechanism is added to the
standard library, I hope it has an API as Pythonic as RPyC's!

I ran into two main problems while using RPyC (v2.60), neither of them
show breakers for me. The first was that debugging it can be hard
because its exception handling (propagation across the RPC link) isn't
good enough (yet). The second is that the RPC is two-way and very
transparent, so that once the application became complex I had to take
special measures to avoid deadlocks. All things considered, RPyC got
the job done.

I know RPyC's developer and maintainer, Tomer Filiba, and he's a great
guy, though recently much busier than he used to be. He had plans to
add distributed computing capabilities to RPyC in version 3.0, and
probably quite a few other features, but AFAIK development is
currently frozen. I'm CC-ing the RPyC newsgroup in hopes that he (and
the users) will comment on this.

- Tal

From ggpolo at  Mon Mar 24 11:31:38 2008
2008/3/24, Tal Einat <taleinat at>:
> On Mon, Mar 24, 2008 at 1:03 AM, Brett Cannon <brett at> wrote:
>  > On Thu, Mar 20, 2008 at 8:45 AM, Guilherme Polo <ggpolo at> wrote:
>  >  > Hello,
>  >  >
>  >  >  I've read this idea about "preparing an existing RPC mechanism for the
>  >  >  standard library" at StandardLibrary ideas and I would be interested
>  >  >  in doing it, but as you all know, including something into stdlib is
>  >  >  not exactly easy and shouldn't be anyway. Also I'm not even sure if
>  >  >  this idea is still desired.
>  >  >
>  >  >  I'm considering the inclusion of rpyc, with appropriate changes
>  >  >  (possibly lots). And would like to know your opinions towards this.
>  >  >
>  >
>  >  I know from my end I am not even familiar with rpyc so I have no
>  >  comment. And I suspect most other people have a similar reason for
>  >  having not commented on this so far.
> I believe the reason that the OP is considering RPyC is because it is
>  the most Pythonic RPC mechanism of the lot. That, and its relative
>  simplicity, are the reasons I recently chose RPyC for a project, and
>  it worked out pretty well. If any RPC mechanism is added to the
>  standard library, I hope it has an API as Pythonic as RPyC's!
>  I ran into two main problems while using RPyC (v2.60), neither of them
>  show breakers for me. The first was that debugging it can be hard
>  because its exception handling (propagation across the RPC link) isn't
>  good enough (yet). The second is that the RPC is two-way and very
>  transparent, so that once the application became complex I had to take
>  special measures to avoid deadlocks. All things considered, RPyC got
>  the job done.
>  I know RPyC's developer and maintainer, Tomer Filiba, and he's a great
>  guy, though recently much busier than he used to be. He had plans to
>  add distributed computing capabilities to RPyC in version 3.0, and
>  probably quite a few other features, but AFAIK development is
>  currently frozen. I'm CC-ing the RPyC newsgroup in hopes that he (and
>  the users) will comment on this.

I talked with him before posting this here, Tal. Also,
development of the new version is active.

>  - Tal

-- Guilherme H. Polo Goncalves

From janssen at  Mon Mar 24 19:24:38 2008
> I'm considering the inclusion of rpyc, with appropriate changes
> (possibly lots). And would like to know your opinions towards this.

Might think about reviving the ILU kernel (just the runtime, not the
stubbers) as a Python-only module.  Open source, pretty complete
bindings to Python, multithreaded, threadsafe, etc., etc.  On the
other hand, I haven't even compiled it in 6 years :-).  The key
advantage would be that ILU speaks a number of different RPC protocols
under the covers, and it's straightforward to add new ones.  I'd love
to see our implementation of wmux (a way of multiplexing multiple
virtual connections, in either direction, over a single TCP
connection) actually in use.


From tomerfiliba at  Tue Mar 25 12:18:52 2008
hi all.

i don't feel i may join in this discussion as i'm certainly biased,
but i don't want to just leave it on the wall. for those of you who
haven't heard of rpyc, here's a link:
and a short demo/tutorial at

i can bring many use cases that demonstrate rpyc's superiority over
other RPC mechanisms, but then again, rpyc has its drawbacks too
(mainly security and frequent IOs). i can make a list of both pros
and cons, but i don't see how it could advance this discussion.

just some final words, as guilherme has said, i am now actively
working on rpyc3.0. in fact the core (parallel to rpyc2.6) is already
stable and quite tested (you can find it on the svn), but if any
attempt is made to integrate rpyc into the stdlib, it should wait
until the final 3.0 release.


From ggpolo at  Tue Mar 25 15:05:57 2008
(this is an idea for Python 3)

Is there any reason for keeping the directory lib-tk at Lib ? I
believe renaming it to tkinter and making it a package would make more
sense. Tkinter module's code could then reside into maybe.
Another change that could be made in this package would be renaming some
modules:

Dialog -> dialog
FileDialog -> filedialog
FixTk -> fixtk

Also, I believe tkSimpleDialog and dialog could be in a single module.
There are other modules like tkColorChooser and tkCommonDialog and
even tkSimpleDialog (and some others) that I'm not totally sure what
to do about them, but for me they should possibly reside in a single
module.

That's it. Thanks,

-- Guilherme H. Polo Goncalves

From qgallet at  Tue Mar 25 15:19:37 2008
Hi Guilherme,

Tkinter is scheduled to become a package in py3k; this is documented in
PEP 3108.
If you wish to help on related issues, feel free to join the stdlib-sig
where the reorganization is being discussed :-)
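
For readers following up later: the reorganization did land roughly along these lines. Below is a small Python 3 sketch of the old-to-new module names as they ended up in Python 3. The dictionary and the helper `py3_name` are illustrative names invented here, and the table is a partial list, not a complete copy of the PEP.

```python
# Python 2 module name -> Python 3 module name (partial list).
TKINTER_RENAMES = {
    "Tkinter": "tkinter",
    "Dialog": "tkinter.dialog",
    "FileDialog": "tkinter.filedialog",
    "tkFileDialog": "tkinter.filedialog",
    "tkSimpleDialog": "tkinter.simpledialog",
    "tkColorChooser": "tkinter.colorchooser",
    "tkCommonDialog": "tkinter.commondialog",
    "tkMessageBox": "tkinter.messagebox",
}


def py3_name(old_name):
    """Return the Python 3 name for an old module, or the name unchanged."""
    return TKINTER_RENAMES.get(old_name, old_name)
```

For example, `py3_name("tkSimpleDialog")` gives `"tkinter.simpledialog"`, and a name that was not renamed is passed through as-is.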


On Tue, Mar 25, 2008 at 3:05 PM, Guilherme Polo <ggpolo at> wrote:

> Hello,
> (this is an idea for Python 3)
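> Is there any reason for keeping the directory lib-tk at Lib ? I
> believe renaming it to tkinter and making it a package would make more
> sense. Tkinter module's code could then reside into maybe.
> Other change that could be done in this package would be renaming some
> modules: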
> Dialog -> dialog
> FileDialog -> filedialog
> FixTk -> fixtk
> ...
> Also, I believe tkSimpleDialog and dialog could be in a single module.
> There are other modules like tkColorChooser and tkCommonDialog and
> even tkSimpleDialog (and some others) that I'm not totally sure what
> to do about them, but for me they should reside in possible a single
> module.
> That is. Thanks,
> --
> -- Guilherme H. Polo Goncalves
From rasky at  Thu Mar 27 12:25:41 2008
inspired by Greg's post about ideas on making the lambda syntax more 
concise, like:

    x,y => x+y

I was wondering if using unnamed arguments had already been debated. 
Something like:

    \(_1+_2)

where basically you're implicitly declaring that your lambda
takes two arguments. You wouldn't be able to call them through keyword
arguments, nor to accept a variable number of arguments (nor to accept
more arguments than are actually used), but wouldn't it cover most
use cases and be really compact?

Other examples:


Giovanni Bajo
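
For comparison, the most common cases of such positional-placeholder lambdas can already be spelled without naming arguments, using the standard `operator` module. This is only a sketch of existing Python 3 tools, not of the proposed syntax; the sample data is made up.

```python
from functools import reduce
import operator

# "Add the first and second argument" without naming either one.
add = operator.add

# "Take item 0 of the single argument" without naming it.
first = operator.itemgetter(0)

total = reduce(add, [1, 2, 3, 4])

k = [(3, "c"), (1, "a"), (2, "b")]
k.sort(key=first)
```

After this runs, `total` is 10 and `k` is sorted by the first element of each pair.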

From jjb5 at  Thu Mar 27 14:29:37 2008
Greg wrote:

 >  What's needed is something very concise and unobtrusive,
 >  such as
 >     x, y => x + y

As inspired by Prolog:

     x, y :- x + y

So this:

     f = lambda x, y: x ** 2 + y ** 2

Or this:

     def f(x, y): return x ** 2 + y ** 2

Becomes this:

     f = x, y :- x ** 2 + y ** 2

And would logically transpose into this:

     f(x, y) :- x ** 2 + y ** 2

Oooo...this rabbit hole is fun!


From helmert at  Thu Mar 27 17:32:18 2008
[follow up from py3k.devel list]

Neil Toronto wrote:

> Yep. In my seven years of CS instruction so far, I've only come across 
> this once, in a theory of programming languages course. "Lambda" simply 
> doesn't show up unless you do language theory or program in a Lisp... or 
> in Python.

Since you mention Haskell below:

> It's a little less terse than Haskell's "\->"

it's worth pointing out that Haskell uses the backslash syntax because
it is the nearest ASCII equivalent to the (lower-case) letter lambda.
For example, see
or the Google results for "haskell lambda backslash" (without the quotes).


From brett at  Thu Mar 27 19:17:21 2008
On Thu, Mar 27, 2008 at 4:25 AM, Giovanni Bajo <rasky at> wrote:
> Hello,
>  inspired by Greg's post about ideas on making the lambda syntax more
>  concise, like:
>     x,y => x+y
>  I was wondering if using unnamed arguments had already been debated.
>  Something like:
>     \(_1+_2)
>  where basically you're declaring implicitally declaring that your lambda
>  takes two arguments. You wouldn't be able to call them through keyword
>  arguments, nor to accept a variable number of arguments (nor to accept
>  more arguments than they are actually used), but wouldn't it cover most
>  use cases and be really compact?
>  Other examples:
>      k.sort(key=\(
>      k.sort(key=\(_1[0]))

Two reasons for being -1:

One is it's just plain ugly to me.

Two, why break from how functions and methods work to save a few
keystrokes? Explicit is better than implicit.


From tjreedy at  Thu Mar 27 20:44:43 2008
"Malte Helmert" <helmert at> wrote in message news:fsgi6i$dpv$1 at
| > It's a little less terse than Haskell's "\->"
| it's worth pointing out that Haskell uses the backslash syntax because
| it is the nearest ASCII equivalent to the (lower-case) letter lambda.

With unicode source, we could use the real thing (ducks ;-). 

From leszek at  Mon Mar 31 09:23:52 2008
Joel Bender wrote:
> Greg wrote:
>  >  What's needed is something very concise and unobtrusive,
>  >  such as
>  >
>  >     x, y => x + y
> As inspired by Prolog:
>      x, y :- x + y
> So this:
>      f = lambda x, y: x ** 2 + y ** 2
> Or this:
>      def f(x, y): return x ** 2 + y ** 2
> Becomes this:
>      f = x, y :- x ** 2 + y ** 2
> And would logically transpose into this:
>      f(x, y) :- x ** 2 + y ** 2
> Oooo...this rabbit hole is fun!

Lambda should have the same syntax as ordinary functions. The only 
difference should be: you don't have to put the name of the function.

def f (x, y):
    return x ** 2 + y ** 2

g = f

h = def (x, y): return x ** 2 + y ** 2

Functions f, g and h all do the same thing.
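
Since `h = def (x, y): ...` is proposed syntax and does not run, here is the closest equivalent in current Python 3, as a minimal sketch: `h` is written with the existing `lambda` form instead of an anonymous `def`.

```python
def f(x, y):
    return x ** 2 + y ** 2


# A second name bound to the very same function object.
g = f

# The anonymous form that exists today.
h = lambda x, y: x ** 2 + y ** 2

same_object = g is f
results = (f(3, 4), g(3, 4), h(3, 4))
names = (f.__name__, g.__name__, h.__name__)
```

All three calls return 25. The visible difference is the name: `f` and `g` report `'f'`, while `h` reports `'<lambda>'`.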

From eli at  Mon Mar 31 14:19:35 2008
On Mon, Mar 31, 2008 at 3:23 AM, Leszek Dubiel <leszek at> wrote:

> Lambda should have the same syntax as ordinary functions. The only
> difference should be: you don't have to put the name of the function.
> def f (x, y): return x ** 2 + y ** 2
> g = f
> h = def (x, y): return x ** 2 + y ** 2
> Functions f, g and h are doing the same.

Javascript handles anonymous functions this way as well:

function f(x, y) { return x*x + y*y; }

g = f;

h = function(x, y) { return x*x + y*y; }

With that being said, it makes sense for the return statement to be omitted
in lambdas (or anonymous defs, as I hope they will eventually be called),
since those functions are limited to a single expression.

- Eli
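
A small Python 3 sketch of that restriction: the body of a lambda is an expression, so a conditional expression is fine but a `return` statement is a syntax error. The helper `accepts` is a hypothetical name used only for this demonstration.

```python
# A conditional *expression* is allowed in a lambda body.
sign = lambda n: "negative" if n < 0 else "non-negative"


def accepts(source):
    """Report whether the given source parses as a single expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


expression_ok = accepts("lambda x: x + 1")
statement_ok = accepts("lambda x: return x + 1")
```

Here `expression_ok` is True and `statement_ok` is False, which is why an explicit `return` has nothing to attach to in the lambda form.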
From grosser.meister.morti at  Mon Mar 31 16:17:11 2008
Maybe dictionary unpacking would be a nice thing?

 >>> d = {'foo': 42, 'egg': 23}
 >>> {'foo': bar, 'egg': spam} = d
 >>> print bar, spam
42 23

What do you think? Bad idea? Good idea?
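
For reference, the same effect is available today with ordinary tuple unpacking. This is a minimal Python 3 sketch of existing tools, not of the proposed syntax; `operator.itemgetter` with several keys returns a tuple of the corresponding values.

```python
from operator import itemgetter

d = {"foo": 42, "egg": 23}

# Pull several keys out of the dict in one step, then unpack the tuple.
bar, spam = itemgetter("foo", "egg")(d)

# The plain spelling does the same thing with more repetition.
bar2, spam2 = d["foo"], d["egg"]
```

Both forms bind 42 and 23; a missing key raises `KeyError` in either case.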


From aahz at  Mon Mar 31 16:43:09 2008
On Mon, Mar 31, 2008, Mathias Panzenböck wrote:
> Maybe dictionary unpacking would be a nice thing?
>  >>> d = {'foo': 42, 'egg': 23}
>  >>> {'foo': bar, 'egg': spam} = d
>  >>> print bar, spam
> 42 23
> What do you think? Bad idea? Good idea?

Horrible idea.  ;-)
Aahz (aahz at           <*>

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan