From adam at atlas.st  Sat Mar  1 23:24:37 2008
From: adam at atlas.st (Adam Atlas)
Date: Sat, 1 Mar 2008 17:24:37 -0500
Subject: [Python-ideas] List-comprehension-like extensions in normal for
	loops
Message-ID: <B8987EEE-679A-4BA0-9449-8F77003F8FBC@atlas.st>

How about letting ordinary for loops contain multiple "for"s and  
optionally "if"s like in list comprehensions and generator  
expressions? For example, a site I was just looking at had the  
line:
     for record in [ r for r in db if r.name == 'pierre' ]:
which could instead be:
     for record in db if record.name == 'pierre':

And of course there could be much more complex ones too. I know you  
could just do "for x in <some generator expression>" and get the same  
effect and roughly the same speed, but I think this looks a lot nicer,  
and it would make sense to have this sort of consistency across the  
multiple contexts in which "for" can be used.
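For comparison, the equivalent already works today with a generator
expression; a quick self-contained sketch (with made-up records rather
than the site's actual db):

```python
class Record(object):
    def __init__(self, name):
        self.name = name

# hypothetical sample data, just for illustration
db = [Record('pierre'), Record('anne'), Record('pierre')]

# today's spelling of the proposed "for record in db if record.name == 'pierre':"
names = []
for record in (r for r in db if r.name == 'pierre'):
    names.append(record.name)

assert names == ['pierre', 'pierre']
```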


From eyal.lotem at gmail.com  Sat Mar  1 23:57:08 2008
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Sun, 2 Mar 2008 00:57:08 +0200
Subject: [Python-ideas] List-comprehension-like extensions in normal for
	loops
In-Reply-To: <B8987EEE-679A-4BA0-9449-8F77003F8FBC@atlas.st>
References: <B8987EEE-679A-4BA0-9449-8F77003F8FBC@atlas.st>
Message-ID: <b64f365b0803011457i7772c5cfi89e3ddc85b035d89@mail.gmail.com>

How about just nesting the for's/if's?

for record in db:
  if record.name == 'pierre':
    ...

Isn't that the one obvious way to do it?

On Sun, Mar 2, 2008 at 12:24 AM, Adam Atlas <adam at atlas.st> wrote:
> How about letting ordinary for loops contain multiple "for"s and
>  optionally "if"s like in list comprehensions and generator
>  expressions? For example, on a site I was just looking at, it had the
>  line:
>      for record in [ r for r in db if r.name == 'pierre' ]:
>  which could instead be:
>      for record in db if record.name == 'pierre':
>
>  And of course there could be much more complex ones too. I know you
>  could just do "for x in <some generator expression>" and get the same
>  effect and roughly the same speed, but I think this looks a lot nicer,
>  and it would make sense to have this sort of consistency across the
>  multiple contexts in which "for" can be used.
>  _______________________________________________
>  Python-ideas mailing list
>  Python-ideas at python.org
>  http://mail.python.org/mailman/listinfo/python-ideas
>


From grantgm at mcmaster.ca  Sun Mar  2 23:02:31 2008
From: grantgm at mcmaster.ca (Gabriel Grant)
Date: Sun, 2 Mar 2008 17:02:31 -0500
Subject: [Python-ideas] Restartable Threads
Message-ID: <e23fbae90803021402o1176f0e8qd8509741ddcd7573@mail.gmail.com>

Hi everyone,

Why is it that threads can't be restarted?

I hope this is the right place for this discussion. If this has been
(or should be) discussed somewhere else, I apologize: my searches for
"restart thread" and similar only turned up statements that restarting
threads is impossible, which haven't satiated my curiosity.

Is there any fundamental reason why this can't (or shouldn't) be done?
If not, what would you think of making thread restartability an
option?

For those who are wondering why I might wish to condemn myself by
using threads at all (rather than, say, subprocesses), never mind
threads that can be restarted while maintaining state, my use case is
as follows:

I am performing low-level hardware control through a C API that I have
wrapped with ctypes. The main "run"-type C function takes a pointer to
a struct in which it stores information about its state. This function
blocks while the system is running, but also needs access to the
shared memory space, and thus has to be executed in its own thread (I
believe - other options welcomed). In order to signal events, the
function returns with a signal code. Once the signal has been dealt
with, the API specifies that the same C function call be made, passing
the pointer to the original struct, so that the system can resume
operation where it left off.

I'm sure this could be done using a standard thread (although I
haven't actually done it) with something like:

def myloop():
    self.ret = None
    while self.ret != 0:
        self.resume_evt.clear()
        self.ret = sharedLib.blocking_call(self.c_state_struct)
        self.signal_evt.set()
        self.resume_evt.wait()

t = Thread(target=myloop)
t.start()
... do some things ...
t.signal_evt.wait()
... deal with the signal ...
t.signal_evt.clear()
t.resume_evt.set()

Or some such ugliness, but it seemed to me that the most natural
implementation of such a system would be something more like:

class myThread(Thread):
    def __init__(self):
        self.c_state_struct = structMaker()
        Thread.__init__(self)
    def run(self):
        self.ret = sharedLib.blocking_call(self.c_state_struct)

which would then be executed with:

>>> t = myThread()
>>> t.start()
... do some other stuff ...
>>> t.join()
>>> signal_handle(t.ret)  # deal with the returned value
>>> t.start()                     # resume operation

However, this is impossible, since a thread's start() method can only
be called once (as explained in [1], [2], and [3]; Python 2.5 raises an
AssertionError, although as of rev 55785 this has been changed to a
RuntimeError). What I have been unable to find explained, however, is
why this should/needs to be the case.

To see if I could get around this limitation, I initially hacked this together:

class myThread(Thread):
	def __init__(self):
		self.i = 1
		Thread.__init__(self)
		
	def start(self):
		Thread.__init__(self)
		Thread.start(self)
		
	def run(self):
		print self.i
		self.i += 1
		return self.i

to be used as:

>>> t = myThread()	
>>> t.start()
>>> t.join()
1
>>> t.start()
>>> t.join()
2

Obviously it is not usable in the general case, since it completely
clobbers the thread's internal state through the repeated __init__()s,
but one could certainly imagine a more delicate implementation that
saves the relevant bits and pieces, while resetting those that need
it.
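
For instance, such a "delicate" version might look roughly like this.
This is an untested sketch: it leans on re-running Thread.__init__ to
reset the internal bookkeeping (which is exactly the part I'm unsure is
safe), and the preserved attribute names are just placeholders:

```python
import threading

class ReusableThread(threading.Thread):
    # Hypothetical sketch of a "more delicate" restart: stash the
    # attributes we care about, re-run Thread.__init__ (which resets
    # the started/stopped bookkeeping), then restore what we stashed.
    _preserve = ('c_state_struct', 'ret')  # hypothetical attribute names

    def start(self):
        saved = dict((k, getattr(self, k)) for k in self._preserve
                     if hasattr(self, k))
        threading.Thread.__init__(self)  # reset internal thread state
        for k, v in saved.items():
            setattr(self, k, v)
        threading.Thread.start(self)

class Probe(ReusableThread):
    def run(self):
        # attributes that __init__ doesn't touch also survive restarts
        self.runs = getattr(self, 'runs', 0) + 1

p = Probe()
p.start(); p.join()
p.start(); p.join()  # a second start() now works
assert p.runs == 2
```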

With that in mind, I had a look into threading.py and, not immediately
seeing any reason this couldn't be done, implemented essentially that
functionality. The attached patch is implemented against threading.py
from trunk. I've also uploaded a patched copy of my threading.py that
can be used with python2.5 to [4], if anyone needs that.

In order to maintain complete backward compatibility, I've left the
default behaviour to have threads behave as they do today, but by
initializing them with "restartable=True", start() can be called
repeatedly. For example:

class Counter(Thread):
	def run(self):
		if not hasattr(self, "count"):
			self.count = 0
		else:
			self.count += 1

could be used with:

>>> t = Counter(restartable=True)
>>> t.start()
>>> t.join()
>>> print t.count
0
>>> t.start()
>>> t.join()
>>> print t.count
1


If an attempt is made to restart the thread while it is executing, it
still raises a RuntimeError, which I think makes sense:

>>> t = LongThread(restartable=True)
>>> t.start()
>>> t.start()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "threading.py", line 441, in start
    raise RuntimeError("thread already started")
RuntimeError: thread already started


So this _seems_ to work, but I have to admit, I'm somewhat afraid to
use it. I can't help but wonder: is it safe, or is it tempting the
Gods of parallelism to inflict sudden, multi-threaded death?

Less superstitious opinions than my own would be greatly appreciated.

Thanks,

-Gabriel

Note: In addition to the patch, I have attached a few usage
examples/test cases, that I should really make into actual unit tests.
Some of these are expected to fail, so the file can't be executed
directly - the examples should be run in an interpreter.

[1]: http://docs.python.org/lib/thread-objects.html
[2]: http://mail.python.org/pipermail/python-list/2006-November/415503.html
[3]: http://lookherefirst.wordpress.com/2007/12/20/can-i-restart-a-thread-in-python/
[4]: http://ieeesb.mcmaster.ca/~grantgm/reThread/threading.py
-------------- next part --------------
A non-text attachment was scrubbed...
Name: threading-restartable-thread.patch
Type: text/x-patch
Size: 1257 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080302/e44a6038/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: restartable_Thread_examples.py
Type: text/x-python
Size: 1635 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080302/e44a6038/attachment.py>

From santagada at gmail.com  Sun Mar  2 23:11:57 2008
From: santagada at gmail.com (Leonardo Santagada)
Date: Sun, 2 Mar 2008 19:11:57 -0300
Subject: [Python-ideas] Restartable Threads
In-Reply-To: <e23fbae90803021402o1176f0e8qd8509741ddcd7573@mail.gmail.com>
References: <e23fbae90803021402o1176f0e8qd8509741ddcd7573@mail.gmail.com>
Message-ID: <2273EC76-A23B-407B-9BF9-220F9BF971C7@gmail.com>

Sorry if I missed it from your email, but why can't you just create  
another thread object before each start call?

I think the only objection to restarting a thread would be that the idea  
is that each thread object represents a thread... but I might be  
completely wrong.

--
Leonardo Santagada





From grantgm at mcmaster.ca  Mon Mar  3 00:16:03 2008
From: grantgm at mcmaster.ca (Gabriel Grant)
Date: Sun, 2 Mar 2008 18:16:03 -0500
Subject: [Python-ideas] Restartable Threads
In-Reply-To: <2273EC76-A23B-407B-9BF9-220F9BF971C7@gmail.com>
References: <e23fbae90803021402o1176f0e8qd8509741ddcd7573@mail.gmail.com>
	<2273EC76-A23B-407B-9BF9-220F9BF971C7@gmail.com>
Message-ID: <e23fbae90803021516o20f39c80ve4025947a51150e8@mail.gmail.com>

On Sun, Mar 2, 2008 at 5:11 PM, Leonardo Santagada <santagada at gmail.com> wrote:
> Sorry if I missed it from you email,
I know the message was rather long. Sorry about that.

> but why cant you just create
>  another thread object before each start call?

The state of the thread needs to be preserved from start() to start()
because the C function needs to be passed the same object each time it
is called.

The state could be maintained by creating the persistent object in the
parent thread and passing it to a new child thread before each call,
but for a few reasons this feels wrong:

It seems to me that this would break encapsulation - objects exist for
the purpose of carrying state. They shouldn't rely on their parent to
do that for them.
Doing so would muck up the parent, especially once there are a)
multiple child threads and b) multiple state-carrying objects that
need to be maintained within each thread.
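
For concreteness, that fresh-thread-per-call arrangement would look
something like the following sketch (FakeLib is a stand-in for the real
ctypes wrapper, and a dict stands in for the C state struct):

```python
import threading

class FakeLib(object):
    # stand-in for the ctypes-wrapped C library (hypothetical)
    def blocking_call(self, state):
        state['calls'] = state.get('calls', 0) + 1
        return state['calls']

sharedLib = FakeLib()
c_state_struct = {}   # the persistent state, owned by the parent
results = []

def call_once(state, out):
    out.append(sharedLib.blocking_call(state))

for _ in range(2):
    # a brand-new Thread for every resume; only the state object persists
    t = threading.Thread(target=call_once, args=(c_state_struct, results))
    t.start()
    t.join()
    # ... deal with the "signal" in results[-1] here ...

assert results == [1, 2]
```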

Also, from a more conceptual point of view, the C function basically
represents a single, restartable process, so it seems it should be
packaged and used as such. When the function returns, it is more akin
to a synchronization point between threads than to stopping one thread
and then creating and starting another.

Hopefully that clarifies my thinking a bit (or at least doesn't muddy
the waters any further :)

>  I think the only objection to restart a thread would be that the idea
>  is that each thread object represents a thread... but I might be
>  completely wrong.

And that may be a valid objection, although the lifetime of the Thread
object does not directly correspond with that of the thread it wraps.
The thread is created upon calling start(), and dies when run()
returns. The way it is implemented, the Thread object is more of a
thread creator and controller than a physical thread. Otherwise, I
would think it should disappear after being join()ed. It seems to me
that these objects represent a more palatable abstraction of the
physical thread...but I might (also :) be completely wrong.

Given that we accept (enjoy, even?) some level of abstraction on top
of physical threads (for instance we start them after they have been
initialized, and we check whether they are running, not whether they
exist), it seems reasonable to me that stopping and restarting these
conceptual threads should be possible. What do you think?

Thanks again for your consideration,

-Gabriel


From josiah.carlson at gmail.com  Mon Mar  3 01:32:10 2008
From: josiah.carlson at gmail.com (Josiah Carlson)
Date: Sun, 2 Mar 2008 16:32:10 -0800
Subject: [Python-ideas] Restartable Threads
In-Reply-To: <e23fbae90803021516o20f39c80ve4025947a51150e8@mail.gmail.com>
References: <e23fbae90803021402o1176f0e8qd8509741ddcd7573@mail.gmail.com>
	<2273EC76-A23B-407B-9BF9-220F9BF971C7@gmail.com>
	<e23fbae90803021516o20f39c80ve4025947a51150e8@mail.gmail.com>
Message-ID: <e6511dbf0803021632x78a296a6o61a27b437bd936d2@mail.gmail.com>

My 2 cents from my 30 seconds of reading this email thread:
encapsulation shouldn't be done on the thread level, it should be done
on the object level.  Create an object that offers the behavior you
want to have (call it ThreadStarter or something), and give it a
'start_thread()' method that returns a thread handle from which you
can .join() as necessary.  This ThreadStarter object keeps references
to the necessary structures that you need to pass to the lower level
threads.  Or heck, this ThreadStarter could handle the .join()
dispatch, etc.  If you think about it for 5 minutes, I'm sure you
could implement it.
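
Something like this rough sketch (all the names are made up):

```python
import threading

class ThreadStarter(object):
    # sketch of the suggestion: the *object* carries the state, and each
    # start_thread() call hands back a fresh thread handle to .join()
    def __init__(self, target, state):
        self.target = target  # e.g. the blocking ctypes call
        self.state = state    # the persistent struct stand-in

    def start_thread(self):
        t = threading.Thread(target=self.target, args=(self.state,))
        t.start()
        return t

def work(state):              # hypothetical worker function
    state['n'] = state.get('n', 0) + 1

starter = ThreadStarter(work, {})
starter.start_thread().join()
starter.start_thread().join()
assert starter.state['n'] == 2
```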

Also, while it isn't impossible to "restart threads" the way you
conceive of it, your way of conceiving of the "restart" is
fundamentally wrong.  Can you restart a process whose stack you've
thrown away?  Of course not.  You've thrown away the process/thread's
stack (which can be seen by the fact that you can .join() the thread),
so you aren't "restarting" the thread, you are creating a new thread
with a new stack with some of the same arguments to called functions.

 - Josiah

(this message does not mean that I'm going to be spending much time in
this list anymore, just that I saw this silly idea and had to comment)

On Sun, Mar 2, 2008 at 3:16 PM, Gabriel Grant <grantgm at mcmaster.ca> wrote:
> On Sun, Mar 2, 2008 at 5:11 PM, Leonardo Santagada <santagada at gmail.com> wrote:
>  > Sorry if I missed it from you email,
>  I know the message was a rather long. Sorry about that.
>
>
>  > but why cant you just create
>  >  another thread object before each start call?
>
>  The state of the thread needs to be preserved from start() to start()
>  because the C function needs to be passed the same object each time it
>  is called.
>
>  The state could be maintained by creating the persistent object in the
>  parent thread and passing it to a new child thread before each call,
>  but for a few reasons this feels wrong:
>
>  It seems to me that this would break encapsulation - objects exist for
>  the purpose of carrying state. They shouldn't rely on their parent to
>  do that for them.
>  Doing so would muck up the parent, especially once there are a)
>  multiple child threads and b) multiple state-carrying objects that
>  need to be maintained within each thread.
>
>  Also, from a more conceptual point of view, the C function basically
>  represents a single, restartable process, so it seems it should be
>  packaged and used as such. When the function returns, it is more akin
>  to a synchronization point between threads than stopping one and
>  creating then starting another.
>
>  Hopefully that clarifies my thinking a bit (or at least doesn't muddy
>  the waters any further :)
>
>
>  >  I think the only objection to restart a thread would be that the idea
>  >  is that each thread object represents a thread... but I might be
>  >  completely wrong.
>
>  And that may be a valid objection, although the lifetime of the Thread
>  object does not directly correspond with that of the thread it wraps.
>  The thread is created upon calling start(), and dies when run()
>  returns. The way it is implemented, the Thread object is more of a
>  thread creator and controller, than a physical thread. Otherwise, I
>  would think it should disapear after being join()ed. It seem to me
>  that these objects represent a more palatable abstraction of the
>  physical thread...but I might (also :) be completely wrong.
>
>  Given that we accept (enjoy, even?) some level of abstraction on top
>  of physical threads (for instance we start them after they have been
>  initialized, and we check whether they are running, not whether they
>  exist), it seems reasonable to me that stopping and restarting these
>  conceptual threads should be possible. What do you think?
>
>  Thanks again for your consideration,
>
>  -Gabriel
>
>
> _______________________________________________
>  Python-ideas mailing list
>  Python-ideas at python.org
>  http://mail.python.org/mailman/listinfo/python-ideas
>


From aahz at pythoncraft.com  Mon Mar  3 01:33:50 2008
From: aahz at pythoncraft.com (Aahz)
Date: Sun, 2 Mar 2008 16:33:50 -0800
Subject: [Python-ideas] Restartable Threads
In-Reply-To: <e23fbae90803021402o1176f0e8qd8509741ddcd7573@mail.gmail.com>
References: <e23fbae90803021402o1176f0e8qd8509741ddcd7573@mail.gmail.com>
Message-ID: <20080303003350.GA19156@panix.com>

On Sun, Mar 02, 2008, Gabriel Grant wrote:
> 
> Why is it that threads can't be restarted?

That's an interesting question.  Unfortunately, the best person to answer
it isn't on this list (Tim Peters).  Generally speaking, the standard
answer is to have a worker thread that uses a Queue in a loop.
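
That pattern looks roughly like this (a sketch; the queue import is
spelled so it works on both Python 2 and 3):

```python
import threading
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2

tasks = queue.Queue()
results = queue.Queue()

def worker():
    # one long-lived thread; "restarting" is just sending another task
    while True:
        item = tasks.get()
        if item is None:    # sentinel: shut the worker down
            break
        results.put(item * 2)

t = threading.Thread(target=worker)
t.start()
tasks.put(3)
tasks.put(4)
tasks.put(None)
t.join()

out = [results.get(), results.get()]
assert out == [6, 8]
```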

> So this _seems_ to work, but I have to admit, I'm somewhat afraid to
> use it. I can't help but wonder: is it safe, or is it tempting the
> Gods of parallelism to inflict sudden, multi-threaded death?

If I had to guess, I think it's just adding an unnecessary layer of
complexity to the existing Thread class.  Moreover, the existing
implementation prevents the following code:

t.start()
t.run()
t.run()

which IMO definitely should be a bug.  If you want to try creating a
patch that adds a t.restart() method, I think it certainly wouldn't
hurt anything and would be a good way of getting feedback.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"All problems in computer science can be solved by another level of     
indirection."  --Butler Lampson


From josiah.carlson at gmail.com  Mon Mar  3 01:35:02 2008
From: josiah.carlson at gmail.com (Josiah Carlson)
Date: Sun, 2 Mar 2008 16:35:02 -0800
Subject: [Python-ideas] List-comprehension-like extensions in normal for
	loops
In-Reply-To: <b64f365b0803011457i7772c5cfi89e3ddc85b035d89@mail.gmail.com>
References: <B8987EEE-679A-4BA0-9449-8F77003F8FBC@atlas.st>
	<b64f365b0803011457i7772c5cfi89e3ddc85b035d89@mail.gmail.com>
Message-ID: <e6511dbf0803021635i71bf887br81dfdf3a61dd905b@mail.gmail.com>

Or even...

for record in (r for r in db if r.name == 'pierre'):
  ...

Hello generator expressions (available since Python 2.4).  But I'm
with Eyal, personally.

 - Josiah

On Sat, Mar 1, 2008 at 2:57 PM, Eyal Lotem <eyal.lotem at gmail.com> wrote:
> How about just nesting the for's/if's?
>
>
>  for record in db:
>   if record.name == 'pierre':
>     ...
>
>  Isn't that the one obvious way to do it?
>
>
>
>  On Sun, Mar 2, 2008 at 12:24 AM, Adam Atlas <adam at atlas.st> wrote:
>  > How about letting ordinary for loops contain multiple "for"s and
>  >  optionally "if"s like in list comprehensions and generator
>  >  expressions? For example, on a site I was just looking at, it had the
>  >  line:
>  >      for record in [ r for r in db if r.name == 'pierre' ]:
>  >  which could instead be:
>  >      for record in db if record.name == 'pierre':
>  >
>  >  And of course there could be much more complex ones too. I know you
>  >  could just do "for x in <some generator expression>" and get the same
>  >  effect and roughly the same speed, but I think this looks a lot nicer,
>  >  and it would make sense to have this sort of consistency across the
>  >  multiple contexts in which "for" can be used.
>  >  _______________________________________________
>  >  Python-ideas mailing list
>  >  Python-ideas at python.org
>  >  http://mail.python.org/mailman/listinfo/python-ideas
>  >
>  _______________________________________________
>  Python-ideas mailing list
>  Python-ideas at python.org
>  http://mail.python.org/mailman/listinfo/python-ideas
>


From artomegus at gmail.com  Wed Mar  5 03:36:36 2008
From: artomegus at gmail.com (Anthony Tolle)
Date: Tue, 4 Mar 2008 21:36:36 -0500
Subject: [Python-ideas] new super redux (better late than never?)
Message-ID: <47857e720803041836v6796c236v6ad8c675b7fafd75@mail.gmail.com>

I was looking at the reference implementation in PEP 3135 (New Super),
and I was inspired to put together a slightly different implementation
that doesn't fiddle with bytecode.  I know that the new super() in
python 3000 doesn't follow the reference implementation in the PEP,
but the code intrigued me enough to offer up this little tidbit, which
can easily be used in python 2.5.

What I did was borrow the idea of using a metaclass to do a
post-definition fix-up on the methods, but added a new function
decorator called autosuper_method.  Like staticmethod or classmethod,
the decorator wraps the function using the non-data descriptor
protocol.

The method wrapped by the decorator will receive an extra implicit
argument (super) inserted before the instance argument (self).

One caveat about the decorator: it must be the first decorator in the
list (i.e. the outermost wrapper), or else the metaclass will not
recognize the wrapped function as an instance of the decorator class.

This implementation strikes me as more pythonic than the
spooky behavior of the new python 3000 super() built-in, and it is
more flexible because of the implicit argument design.  This allows
things like the ability to use the super argument in inner functions
without worrying about the 'first argument' assumption of python
3000's super().

The implementation follows, which is also called autosuper in
deference to the original reference implementation.  It includes a
demonstration of some of its flexibility:

------------------------------------------------------------

#!/usr/bin/env python
#
# autosuper.py

class autosuper_method(object):
    def __init__(self, func, cls=None):
        self.func = func
        self.cls = cls

    def __get__(self, obj, type=None):
        # return self if self.cls is not set yet
        if self.cls is None:
            return self

        if obj is None:
            # class binding - assume first argument is instance,
            # and insert superclass before it
            def newfunc(*args, **kwargs):
                if not len(args):
                    raise TypeError('instance argument missing')
                return self.func(super(self.cls, args[0]),
                                 *args,
                                 **kwargs)
        else:
            # instance binding - insert superclass as first
            # argument, and instance as second
            def newfunc(*args, **kwargs):
                return self.func(super(self.cls, obj),
                                 obj,
                                 *args,
                                 **kwargs)
        return newfunc

class autosuper_meta(type):
    def __init__(cls, name, bases, clsdict):
        # set cls attribute of all instances of autosuper_method
        for v in clsdict:
            o = getattr(cls, v)
            if isinstance(o, autosuper_method):
                o.cls = cls

class autosuper(object):
    __metaclass__ = autosuper_meta

if __name__ == '__main__':
    class A(autosuper):
        def f(self):
            return 'A'

    # Demo - standard use
    class B(A):
        @autosuper_method
        def f(super, self):
            return 'B' + super.f()

    # Demo - reference super in inner function
    class C(A):
        @autosuper_method
        def f(super, self):
            def inner():
                return 'C' + super.f()
            return inner()

    # Demo - define function before class definition
    @autosuper_method
    def D_f(super, self):
        return 'D' + super.f()

    class D(B, C):
        f = D_f

    # Demo - define function after class definition
    class E(B, C):
        pass

    # don't use @autosuper_method here!  The metaclass has already
    # processed E, so it won't be able to set the cls attribute
    def E_f(super, self):
        return 'E' + super.f()

    # instead, use the extended version of the decorator
    E.f = autosuper_method(E_f, E)

    d = D()
    assert d.f() == 'DBCA'      # Instance binding
    assert D.f(d) == 'DBCA'     # Class binding

    e = E()
    assert e.f() == 'EBCA'      # Instance binding
    assert E.f(e) == 'EBCA'     # Class binding

------------------------------------------------------------

P.S. I know that using the word 'super' as an argument name might be
frowned upon, but I'm just copying what I've seen done in the standard
python library (e.g. using 'list' as a local variable name :).
Anyway, it doesn't really hurt anything unless you wanted to call the
original super() built-in from the decorated method, which would kind
of defeat the purpose.

P.P.S. Something like this might have been offered up already.  I've
been searching the mail list archives for a while, and found a few
references to using decorators, but didn't find any full
implementations.  This implementation also has the advantage of being
compatible with existing code.


From guido at python.org  Wed Mar  5 04:24:35 2008
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Mar 2008 19:24:35 -0800
Subject: [Python-ideas] new super redux (better late than never?)
In-Reply-To: <47857e720803041836v6796c236v6ad8c675b7fafd75@mail.gmail.com>
References: <47857e720803041836v6796c236v6ad8c675b7fafd75@mail.gmail.com>
Message-ID: <ca471dc20803041924s7c87797n4e5098bbd84abd81@mail.gmail.com>

Ehhh! The PEP's "reference implementation" is useless and probably
doesn't even work. The actual implementation is completely different.
If you want to help, a rewrite of the PEP to match reality would be
most welcome!

On Tue, Mar 4, 2008 at 6:36 PM, Anthony Tolle <artomegus at gmail.com> wrote:
> I was looking at the reference implementation in PEP 3135 (New Super),
>  and I was inspired to put together a slightly different implementation
>  that doesn't fiddle with bytecode.  I know that the new super() in
>  python 3000 doesn't follow the reference implementation in the PEP,
>  but the code intrigued me enough to offer up this little tidbit, which
>  can be easily be used in python 2.5.
>
>  What I did was borrow the idea of using a metaclass to do a
>  post-definition fix-up on the methods, but added a new function
>  decorator called autosuper_method.  Like staticmethod or classmethod,
>  the decorator wraps the function using the non-data descriptor
>  protocol.
>
>  The method wrapped by the decorator will receive an extra implicit
>  argument (super) inserted before the instance argument (self).
>
>  One caveat about the decorator: it must be the first decorator in the
>  list (i.e. the outermost wrapper), or else the metaclass will not
>  recognize the wrapped function as an instance of the decorator class.
>
>  I think this implementation strikes me as more pythonic than the
>  spooky behavior of the new python 3000 super() built-in, and it is
>  more flexible because of the implicit argument design.  This allows
>  things like the ability to use the super argument in inner functions
>  without worrying about the 'first argument' assumption of python
>  3000's super().
>
>  The implementation follows, which is also called autosuper in
>  deference to the original reference implementation.  It includes a
>  demonstration of some of its flexibility:
>
>  ------------------------------------------------------------
>
>  #!/usr/bin/env python
>  #
>  # autosuper.py
>
>  class autosuper_method(object):
>     def __init__(self, func, cls=None):
>         self.func = func
>         self.cls = cls
>
>     def __get__(self, obj, type=None):
>         # return self if self.cls is not set yet
>         if self.cls is None:
>             return self
>
>         if obj is None:
>             # class binding - assume first argument is instance,
>             # and insert superclass before it
>             def newfunc(*args, **kwargs):
>                 if not len(args):
>                     raise TypeError('instance argument missing')
>                 return self.func(super(self.cls, args[0]),
>                                  *args,
>                                  **kwargs)
>         else:
>             # instance binding - insert superclass as first
>             # argument, and instance as second
>             def newfunc(*args, **kwargs):
>                 return self.func(super(self.cls, obj),
>                                  obj,
>                                  *args,
>                                  **kwargs)
>         return newfunc
>
>  class autosuper_meta(type):
>     def __init__(cls, name, bases, clsdict):
>         # set cls attribute of all instances of autosuper_method
>         for v in clsdict:
>             o = getattr(cls, v)
>             if isinstance(o, autosuper_method):
>                 o.cls = cls
>
>  class autosuper(object):
>     __metaclass__ = autosuper_meta
>
>  if __name__ == '__main__':
>     class A(autosuper):
>         def f(self):
>             return 'A'
>
>     # Demo - standard use
>     class B(A):
>         @autosuper_method
>         def f(super, self):
>             return 'B' + super.f()
>
>     # Demo - reference super in inner function
>     class C(A):
>         @autosuper_method
>         def f(super, self):
>             def inner():
>                 return 'C' + super.f()
>             return inner()
>
>     # Demo - define function before class definition
>     @autosuper_method
>     def D_f(super, self):
>         return 'D' + super.f()
>
>     class D(B, C):
>         f = D_f
>
>     # Demo - define function after class definition
>     class E(B, C):
>         pass
>
>     # don't use @autosuper_method here!  The metaclass has already
>     # processed E, so it won't be able to set the cls attribute
>     def E_f(super, self):
>         return 'E' + super.f()
>
>     # instead, use the extended version of the decorator
>     E.f = autosuper_method(E_f, E)
>
>     d = D()
>     assert d.f() == 'DBCA'      # Instance binding
>     assert D.f(d) == 'DBCA'     # Class binding
>
>     e = E()
>     assert e.f() == 'EBCA'      # Instance binding
>     assert E.f(e) == 'EBCA'     # Class binding
>
>  ------------------------------------------------------------
>
>  P.S. I know that using the word 'super' as an argument name might be
>  frowned upon, but I'm just copying what I've seen done in the standard
>  python library (e.g. using 'list' as a local variable name :).
>  Anyway, it doesn't really hurt anything unless you wanted to call the
>  original super() built-in from the decorated method, which would kind
>  of defeat the purpose.
>
>  P.P.S. Something like this might have been offered up already.  I've
>  been searching the mail list archives for a while, and found a few
>  reference to using decorators, but didn't find any full
>  implementations.  This implementation also has the advantage of being
>  compatible with existing code.
>  _______________________________________________
>  Python-ideas mailing list
>  Python-ideas at python.org
>  http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From artomegus at gmail.com  Wed Mar  5 08:49:02 2008
From: artomegus at gmail.com (Anthony Tolle)
Date: Wed, 5 Mar 2008 02:49:02 -0500
Subject: [Python-ideas] new super redux (better late than never?)
In-Reply-To: <ca471dc20803041924s7c87797n4e5098bbd84abd81@mail.gmail.com>
References: <47857e720803041836v6796c236v6ad8c675b7fafd75@mail.gmail.com>
	<ca471dc20803041924s7c87797n4e5098bbd84abd81@mail.gmail.com>
Message-ID: <47857e720803042349t74fc13a9y3ad927980d4810a1@mail.gmail.com>

On Tue, Mar 4, 2008 at 10:24 PM, Guido van Rossum <guido at python.org> wrote:
> Ehhh! The PEP's "reference implementation" is useless and probably
>  doesn't even work. The actual implementation is completely different.
>  If you want to help, a rewrite of the PEP to match reality would be
>  most welcome!

Yep, I knew the actual implementation was completely different from
the reference implementation.  I was really just trying to offer a
different take on 'fixing' super, even though I know it is too late to
suggest this type of change for python 3000.  That's one reason I
refrained from posting in the python-3000 list.

I was enamored with the idea of passing the super object as an actual
parameter to the method that needs it.  Using a decorator with
descriptor behavior (like staticmethod or classmethod) seemed the best
way to do this.

The only downside is that my implementation depends on using a
metaclass to fix up the decorator objects after the class definition
is completed (or catching assignment to class attributes after the
fact).

It would be nice if the decorator class could be self-contained
without depending on an associated metaclass.  However, the __get__
method of the decorator would have to dynamically determine the class
that the wrapped function belongs to.  Since functions can be defined
outside of a class and then arbitrarily assigned to a class attribute
(or even multiple classes!), this seems to be difficult.  In fact, the
code in my previous post has a bug related to this.

Which brings me to posting a new version of my code:
-- Defined __setattr__ in the metaclass to make demo code more
consistent (and less ugly).
-- Modified __init__ function in the metaclass so it doesn't generate
__get__ calls.
-- Fix-ups now create new instance of autosuper_method object instead
of modifying cls attribute of existing object.  Reason: assigning a
decorated function to multiple classes would modify the original
object, breaking functionality for all classes but one.
-- Known issue: cases such as E.f = D.f are not caught, because
__get__ on D.f doesn't return an instance of autosuper_method.  Can be
resolved by having autosuper_method.__get__ return a callable subclass
of autosuper_method.  However, it makes me wonder if my idea isn't so
hot after all. :/

Here's the new version:

------------------------------------------------------------

#!/usr/bin/env python
#
# autosuper.py

class autosuper_method(object):
    def __init__(self, func, cls=None):
        self.func = func
        self.cls = cls

    def __get__(self, obj, type=None):
        # return self if self.cls is not set - prevents use
        # by methods of classes that don't subclass autosuper
        if self.cls is None:
            return self

        if obj is None:
            # class binding - assume first argument is instance,
            # and insert superclass before it
            def newfunc(*args, **kwargs):
                if not len(args):
                    raise TypeError('instance argument missing')
                return self.func(super(self.cls, args[0]),
                                 *args,
                                 **kwargs)
        else:
            # instance binding - insert superclass as first
            # argument, and instance as second
            def newfunc(*args, **kwargs):
                return self.func(super(self.cls, obj),
                                 obj,
                                 *args,
                                 **kwargs)
        return newfunc

class autosuper_meta(type):
    def __init__(cls, name, bases, clsdict):
        # fix up all autosuper_method instances in class
        for attr in clsdict:
            value = clsdict[attr]
            if isinstance(value, autosuper_method):
                setattr(cls, attr, autosuper_method(value.func, cls))

    def __setattr__(cls, attr, value):
        # catch assignment after class definition
        if isinstance(value, autosuper_method):
            value = autosuper_method(value.func, cls)
        type.__setattr__(cls, attr, value)

class autosuper(object):
    __metaclass__ = autosuper_meta

if __name__ == '__main__':
    class A(autosuper):
        def f(self):
            return 'A'

    # Demo - standard use
    class B(A):
        @autosuper_method
        def f(super, self):
            return 'B' + super.f()

    # Demo - reference super in inner function
    class C(A):
        @autosuper_method
        def f(super, self):
            def inner():
                return 'C' + super.f()
            return inner()

    # Demo - define function before class definition
    @autosuper_method
    def D_f(super, self):
        return 'D' + super.f()

    class D(B, C):
        f = D_f

    # Demo - define function after class definition
    class E(B, C):
        pass

    @autosuper_method
    def E_f(super, self):
        return 'E' + super.f()

    E.f = E_f

    # Test D
    d = D()
    assert d.f() == 'DBCA'      # Instance binding
    assert D.f(d) == 'DBCA'     # Class binding

    # Test E
    e = E()
    assert e.f() == 'EBCA'      # Instance binding
    assert E.f(e) == 'EBCA'     # Class binding

------------------------------------------------------------

Regardless of the flaws in my code, I still like the idea of a
decorator syntax to specify methods that want to receive a super
object as a parameter.  It could use the same 'magic' that allows the
new python 3000 super() to determine the method's class from the stack
frame, but doesn't depend on grabbing the first argument as the
instance (i.e. breaking use with inner functions).
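
[Editor's sketch, not part of the original post: the pass-super-as-argument
idea can be done without a metaclass in much later Pythons, because
__set_name__ (added in Python 3.6) lets a descriptor learn its owning class
at class-creation time. The names here are illustrative, not the proposal's.]

```python
# Sketch only: __set_name__ (Python 3.6+) replaces the metaclass fix-ups.
class with_super:
    """Decorator that passes a super object as the method's first argument."""

    def __init__(self, func):
        self.func = func

    def __set_name__(self, owner, name):
        # Called automatically when the owning class is created.
        self.owner = owner

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        def bound(*args, **kwargs):
            return self.func(super(self.owner, obj), obj, *args, **kwargs)
        return bound

class A:
    def f(self):
        return 'A'

class B(A):
    @with_super
    def f(sup, self):
        return 'B' + sup.f()

class C(A):
    @with_super
    def f(sup, self):
        return 'C' + sup.f()

class D(B, C):
    @with_super
    def f(sup, self):
        return 'D' + sup.f()

assert D().f() == 'DBCA'   # same MRO walk as the thread's demo
```

The metaclass and __setattr__ machinery in the posted code exists precisely
because nothing like __set_name__ was available at the time.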


From aaron.watters at gmail.com  Wed Mar  5 16:11:48 2008
From: aaron.watters at gmail.com (Aaron Watters)
Date: Wed, 5 Mar 2008 10:11:48 -0500
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
Message-ID: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>

I just checked the python site documentation on marshal and pickle and I
consider them to be irresponsibly and dangerously misleading.

For example, suppose Mercurial is implemented using pickle.load (I sure
hope it isn't -- is it?).

1) I send someone a "patch" for their software claiming it makes their
package run faster.

2) That person uses mercurial to "unpack" the patch and mercurial uses
pickle.load.

BAM!  That person's filesystem is GONE!  AND I'M NOT ASSUMING
THAT THERE IS ANY BUG IN MERCURIAL!

Now suppose Mercurial is implemented using marshal: no such scenario is
possible unless there is a security bug in Mercurial where they explicitly
execute something.

RESOLVED: pickle should come with a large red label:

WARNING: LARK'S VOMIT --
NEVER USE PICKLE TO IMPLEMENT UNTRUSTED ARCHIVING OF ANY KIND.

It doesn't have one.

Marshal needs no such label: but it has one:

*Warning:* The marshal module is not intended to be secure against erroneous
or maliciously constructed data. Never unmarshal data received from an
untrusted or unauthenticated source.

This is bullshit.

Sorry, for the french and the caps, but this is REALLY IMPORTANT.

   -- Aaron Watters
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080305/2b718355/attachment.html>

From george.sakkis at gmail.com  Wed Mar  5 16:25:57 2008
From: george.sakkis at gmail.com (George Sakkis)
Date: Wed, 5 Mar 2008 10:25:57 -0500
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
Message-ID: <91ad5bf80803050725y51c06959uf2a87c2d404da67@mail.gmail.com>

On Wed, Mar 5, 2008 at 10:11 AM, Aaron Watters <aaron.watters at gmail.com> wrote:

> I just checked the python site documentation on marshal and pickle and I
> consider them to be irresponsibly and dangerously misleading.
> RESOLVED: pickle should come with a large red label:
>
> WARNING: LARK'S VOMIT --
> NEVER USE PICKLE TO IMPLEMENT UNTRUSTED ARCHIVING OF ANY KIND.
>
> It doesn't have one.

So what is this [1] ?

'''
Warning: The pickle module is not intended to be secure against
erroneous or maliciously constructed data. Never unpickle data
received from an untrusted or unauthenticated source.
'''

You may want to check your facts better next time you go on a rampage.

George

[1] http://docs.python.org/lib/node314.html


From phd at phd.pp.ru  Wed Mar  5 16:27:59 2008
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Wed, 5 Mar 2008 18:27:59 +0300
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
Message-ID: <20080305152759.GB21001@phd.pp.ru>

On Wed, Mar 05, 2008 at 10:11:48AM -0500, Aaron Watters wrote:
> RESOLVED: pickle should come with a large red label:
> 
> WARNING: LARK'S VOMIT --
> NEVER USE PICKLE TO IMPLEMENT UNTRUSTED ARCHIVING OF ANY KIND.

   http://docs.python.org/lib/node314.html

   "Warning: The pickle module is not intended to be secure against
erroneous or maliciously constructed data. Never unpickle data received
from an untrusted or unauthenticated source."

   Enough for me, though it is not as big or as red...

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.


From aaron.watters at gmail.com  Wed Mar  5 17:11:26 2008
From: aaron.watters at gmail.com (Aaron Watters)
Date: Wed, 5 Mar 2008 11:11:26 -0500
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
Message-ID: <fc13a6500803050811x40c6bb46t1191129d93be61e6@mail.gmail.com>

In response to Oleg and George.

Yes, apparently there is an acknowledgement on some subordinate page
somewhere that there might be some problem with security and pickle.  This
should be on the first page, in bold face, like the unneeded one for marshal.
I missed it just now because I only looked at the first page for marshal and
pickle, like most people probably would; sorry.

Also this line from the marshal doc has got to go:

"For general persistence and transfer of Python objects through RPC calls,
see the modules pickle <http://docs.python.org/lib/module-pickle.html> and
shelve <http://docs.python.org/lib/module-shelve.html>. "
http://docs.python.org/lib/module-marshal.html

which should read
"For RPC calls never use pickle."

And the security warning for marshal beneath it should be removed because it
is nonsense.

The implication of the current documentation is that most of my public
projects contain serious security holes when they don't.
And if you don't read the documentation carefully (like the implementers of
Plone apparently didn't) the docs seem to suggest
that pickle is somehow "safer" when it is about as unsafe as it could be.

-- Aaron Watters
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080305/b49dd673/attachment.html>

From guido at python.org  Wed Mar  5 18:36:56 2008
From: guido at python.org (Guido van Rossum)
Date: Wed, 5 Mar 2008 09:36:56 -0800
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <fc13a6500803050811x40c6bb46t1191129d93be61e6@mail.gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
	<fc13a6500803050811x40c6bb46t1191129d93be61e6@mail.gmail.com>
Message-ID: <ca471dc20803050936r7b90196pc56f9c23c841b33b@mail.gmail.com>

I'm assuming that someone confronted you with this security issue
somehow? Otherwise I don't understand why you'd be so upset about it.

BTW the warning for marshal is legit -- the C code that unpacks
marshal data has not been carefully analyzed against buffer overflows
and so on. Remember the first time someone broke into a system through
a malicious JPEG? The same could happen with marshal. Seriously.

I agree that the pickle module's warning needs to be moved to a more
prominent place (Georg has probably already done this by the time I'm
finished typing this message :-). But I see no reason to get so upset
about it as to use all caps.

--Guido

On Wed, Mar 5, 2008 at 8:11 AM, Aaron Watters <aaron.watters at gmail.com> wrote:
> In response to Oleg and George.
>
> Yes apparently there is an acknowledgement in some subordinate page
> somewhere that there might be some problem with security and pickle.  This
> should be on the first page in bold face like the unneeded one for marshal.
> I missed it just now because I just looked at the first page for marshal and
> pickle, like most people probably would, sorry.
>
> Also this line from the marshal doc has got to go:
>
> "For general persistence and transfer of Python objects through RPC calls,
> see the modules pickle and shelve. "
> http://docs.python.org/lib/module-marshal.html
>
> which should read
> "For RPC calls never use pickle."
>
> And the security warning for marshal beneath it should be removed because it
> is nonsense.
>
> The implication of the current documentation is that most of my public
> projects contain serious security holes when they don't.
>  And if you don't read the documentation carefully (like the implementers of
> Plone apparently didn't) the docs seem to suggest
> that pickle is somehow "safer" when it is about as unsafe as it could be.
>
> -- Aaron Watters
>
>
> _______________________________________________
>  Python-ideas mailing list
>  Python-ideas at python.org
>  http://mail.python.org/mailman/listinfo/python-ideas
>
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From santagada at gmail.com  Wed Mar  5 19:12:47 2008
From: santagada at gmail.com (Leonardo Santagada)
Date: Wed, 5 Mar 2008 15:12:47 -0300
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <fc13a6500803050811x40c6bb46t1191129d93be61e6@mail.gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
	<fc13a6500803050811x40c6bb46t1191129d93be61e6@mail.gmail.com>
Message-ID: <F97B1443-3405-4546-AC6F-EBD6E6D915EF@gmail.com>


On 05/03/2008, at 13:11, Aaron Watters wrote:

> like the implementers of Plone apparently didn't


I know this is just a strange angry rant, but why do you say this? Do  
you mean that ZODB and ZRPC should be implemented using marshal? Can  
this even be done?

Now if you want to do secure pickles, just sign them with a crypto
method (completely secure). Simple, and I think the only way to do
this securely. Or use a secure transport layer like SSL to transfer
things, with signatures to identify your peers. Not a reason to rant,
but to install the crypto package.
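
[Editor's sketch of the signing idea above, assuming a pre-shared key; the
key value and helper names are illustrative, not from the post. The pickled
bytes are authenticated with an HMAC, and anything that fails verification
is refused before it ever reaches the unpickler.]

```python
import hashlib
import hmac
import pickle

SECRET_KEY = b'shared-secret'  # hypothetical pre-shared key

def dump_signed(obj):
    # Prefix the pickle with a 32-byte HMAC-SHA256 tag over the data.
    data = pickle.dumps(obj)
    sig = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    return sig + data

def load_signed(blob):
    sig, data = blob[:32], blob[32:]
    expected = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    # Constant-time comparison; reject unsigned or tampered data.
    if not hmac.compare_digest(sig, expected):
        raise ValueError('signature mismatch - refusing to unpickle')
    return pickle.loads(data)

blob = dump_signed({'level': 14})
assert load_signed(blob) == {'level': 14}
```

Note this only authenticates the sender; it does nothing for data that a
trusted-but-compromised peer signs.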

--
Leonardo Santagada





From g.brandl at gmx.net  Wed Mar  5 20:02:51 2008
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 05 Mar 2008 20:02:51 +0100
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <ca471dc20803050936r7b90196pc56f9c23c841b33b@mail.gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>	<fc13a6500803050811x40c6bb46t1191129d93be61e6@mail.gmail.com>
	<ca471dc20803050936r7b90196pc56f9c23c841b33b@mail.gmail.com>
Message-ID: <fqmqjs$the$1@ger.gmane.org>

Guido van Rossum schrieb:
> I'm assuming that someone confronted you with this security issue
> somehow? Otherwise I don't understand why you'd be so upset about it.
> 
> BTW the warning for marshal is legit -- the C code that unpacks
> marshal data has not been carefully analyzed against buffer overflows
> and so on. Remember the first time someone broke into a system through
> a malicious JPEG? The same could happen with marshal. Seriously.
> 
> I agree that the pickle module's warning needs to be moved to a more
> prominent place (Georg has probably aready done this by the time I'm
> finished typing this message :-). But I see no reason to get so upset
> about it as to use all caps.

I used the time machine :)

Though the warning is at the same location in 
<http://docs.python.org/dev/library/pickle>, since all pickle docs are
on the same page it's visible enough in my opinion.

cheers,
Georg



From aaron.watters at gmail.com  Wed Mar  5 20:03:23 2008
From: aaron.watters at gmail.com (Aaron Watters)
Date: Wed, 5 Mar 2008 14:03:23 -0500
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
Message-ID: <fc13a6500803051103s5c0b5e2fi992a9e1fe0b8085@mail.gmail.com>

What follows is a brief summary of offline discussions with Guido and
Leonardo (I hope I've represented them correctly; please complain if not):

Guido pointed out that previous versions of marshal could crash python.

I replied that that is a bug and all known instances have been fixed.
Pickle executes arbitrary code by design -- which is much worse than just
crashing a program.

Leonardo mentioned that pickle security concerns could be addressed using
crypto tricks.

I replied that I would be comfortable unmarshalling a file from a known
hostile party -- no crypto verification required, because the worst that
could happen is that it would crash the interpreter.  With pickle I'd be
handing my keyboard to a villain.

In summary: I think marshal.loads(s) is just as safe as unicode(s) or
file.read().  pickle.loads(s) is morally equivalent to __import__(s) or
eval(s).
I think the security warning for marshal and the implied recommendation that
pickle is okay for RPC should be removed.

  alright already, 'nuff said. whatever.  -- Aaron Watters
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080305/bc3aa705/attachment.html>

From arne_bab at web.de  Wed Mar  5 20:46:37 2008
From: arne_bab at web.de (Arne Babenhauserheide)
Date: Wed, 5 Mar 2008 20:46:37 +0100
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <ca471dc20803050936r7b90196pc56f9c23c841b33b@mail.gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
	<fc13a6500803050811x40c6bb46t1191129d93be61e6@mail.gmail.com>
	<ca471dc20803050936r7b90196pc56f9c23c841b33b@mail.gmail.com>
Message-ID: <200803052046.40314.arne_bab@web.de>

I'd also agree that the warning should be really prominent (especially since 
I just saw someone saying "for game states: Just pickle them", which could 
result in people getting problems when they get a mail saying "hey, look, I 
got to the 14th level"), but I don't think the warning was irresponsibly 
small. 

At least I saw it, when I began to learn python (but I had forgotten it until 
now). 

Maybe it could be replaced by yaml at some point, though, which offers a mode 
that doesn't execute everything (safe_load): 

http://pyyaml.org/wiki/PyYAMLDocumentation#LoadingYAML

"safe_load(stream) parses the given stream and returns a Python object 
constructed from the first document in the stream. If there are no 
documents in the stream, it returns None. safe_load recognizes only standard 
YAML tags and cannot construct an arbitrary Python object."

And there's also a C implementation: http://pyyaml.org/browser/libyaml/trunk
Which can be relicensed under the Python License: 
http://pyyaml.org/browser/libyaml/trunk/LICENSE

Or pickle could get a safe_load function itself (if it doesn't yet have it). 

Best wishes, 
Arne


On Wednesday, 5 March 2008 18:36:56, Guido van Rossum wrote:
> I'm assuming that someone confronted you with this security issue
> somehow? Otherwise I don't understand why you'd be so upset about it.
>
> BTW the warning for marshal is legit -- the C code that unpacks
> marshal data has not been carefully analyzed against buffer overflows
> and so on. Remember the first time someone broke into a system through
> a malicious JPEG? The same could happen with marshal. Seriously.
>
> I agree that the pickle module's warning needs to be moved to a more
> prominent place (Georg has probably already done this by the time I'm
> finished typing this message :-). But I see no reason to get so upset
> about it as to use all caps.
>
> --Guido
>
> On Wed, Mar 5, 2008 at 8:11 AM, Aaron Watters <aaron.watters at gmail.com> 
wrote:
> > In response to Oleg and George.
> >
> > Yes apparently there is an acknowledgement in some subordinate page
> > somewhere that there might be some problem with security and pickle. 
> > This should be on the first page in bold face like the unneeded one for
> > marshal. I missed it just now because I just looked at the first page for
> > marshal and pickle, like most people probably would, sorry.
> >
> > Also this line from the marshal doc has got to go:
> >
> > "For general persistence and transfer of Python objects through RPC
> > calls, see the modules pickle and shelve. "
> > http://docs.python.org/lib/module-marshal.html
> >
> > which should read
> > "For RPC calls never use pickle."
> >
> > And the security warning for marshal beneath it should be removed because
> > it is nonsense.
> >
> > The implication of the current documentation is that most of my public
> > projects contain serious security holes when they don't.
> >  And if you don't read the documentation carefully (like the implementers
> > of Plone apparently didn't) the docs seem to suggest
> > that pickle is somehow "safer" when it is about as unsafe as it could be.
> >
> > -- Aaron Watters
> >
> >
> > _______________________________________________
> >  Python-ideas mailing list
> >  Python-ideas at python.org
> >  http://mail.python.org/mailman/listinfo/python-ideas



-- 
Unpolitisch sein
Heißt politisch sein
Ohne es zu merken. 
- Arne Babenhauserheide ( http://draketo.de )
-- Weblog: http://blog.draketo.de

-- Mein öffentlicher Schlüssel (PGP/GnuPG): 
http://draketo.de/inhalt/ich/pubkey.txt
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part.
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080305/56b93a26/attachment.pgp>

From idadesub at users.sourceforge.net  Wed Mar  5 21:45:17 2008
From: idadesub at users.sourceforge.net (Erick Tryzelaar)
Date: Wed, 5 Mar 2008 12:45:17 -0800
Subject: [Python-ideas] adding a trim convenience function
Message-ID: <1ef034530803051245u7fdf525dn6f4efc74a8af59a8@mail.gmail.com>

I find that when I'm normalizing strings, I end up writing this a lot:

sites = ['www.google.com', 'http://python.org', 'www.yahoo.com']
new_sites = []
for site in sites:
  if site.startswith('http://'):
    site = site[len('http://'):]
  new_sites.append(site)

But it'd be much nicer if I could use a convenience function trim that
would do this for me, so I could just use a comprehension:

def ltrim(s, prefix):
  if s.startswith(prefix):
    return s[len(prefix):]
  return s

sites = ['www.google.com', 'http://python.org', 'www.yahoo.com']
sites = [ltrim(site, 'http://') for site in sites]

Would there be any interest to add this helper function, as well as an
"rtrim" and "trim", to the str class?

-e


From tjreedy at udel.edu  Wed Mar  5 22:33:40 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 5 Mar 2008 16:33:40 -0500
Subject: [Python-ideas] adding a trim convenience function
References: <1ef034530803051245u7fdf525dn6f4efc74a8af59a8@mail.gmail.com>
Message-ID: <fqn3jj$2t6$1@ger.gmane.org>


"Erick Tryzelaar" 
<idadesub at users.sourceforge.net> wrote in 
message news:1ef034530803051245u7fdf525dn6f4efc74a8af59a8 at mail.gmail.com...
| sites = ['www.google.com', 'http://python.org', 'www.yahoo.com']
| sites = [ltrim(site, 'http://') for site in sites]

>>> [site.replace('http://', '') for site in sites]
['www.google.com', 'python.org', 'www.yahoo.com']

| Would there be any interest to add this helper function, as well as an
| "rtrim" and "trim", to the str class?

Try another use case.  I think str pretty well has the basic tools needed 
to construct whatever specific tools one needs.

tjr






From idadesub at users.sourceforge.net  Wed Mar  5 23:11:56 2008
From: idadesub at users.sourceforge.net (Erick Tryzelaar)
Date: Wed, 5 Mar 2008 14:11:56 -0800
Subject: [Python-ideas] adding a trim convenience function
In-Reply-To: <3b5110850803051330h188951d3y990e460e590127a9@mail.gmail.com>
References: <1ef034530803051245u7fdf525dn6f4efc74a8af59a8@mail.gmail.com>
	<3b5110850803051330n362f24a4nd9bfffce90f1aa72@mail.gmail.com>
	<3b5110850803051330h188951d3y990e460e590127a9@mail.gmail.com>
Message-ID: <1ef034530803051411i35dc94f5seb48911e4a0343b0@mail.gmail.com>

On Wed, Mar 5, 2008 at 1:30 PM, Matthew Russell <matt.horizon5 at gmail.com> wrote:
> Of course that should have read:
> sites = list(item.strip("http://") for item in sites)

{l,r,}strip doesn't actually do what I'm talking about, which confused
me for a long time. Consider this simple case:

>>> 'abaaabcd'.lstrip('ab')
'cd'

ltrim in this case would produce 'aaabcd'.
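
[Editor's runnable illustration of the distinction above, using the ltrim
posted earlier in the thread: lstrip strips any leading run of characters
drawn from its argument set, while ltrim removes one exact prefix.]

```python
def ltrim(s, prefix):
    # Remove one exact occurrence of prefix from the front, if present.
    if s.startswith(prefix):
        return s[len(prefix):]
    return s

assert 'abaaabcd'.lstrip('ab') == 'cd'       # any leading 'a'/'b' chars removed
assert ltrim('abaaabcd', 'ab') == 'aaabcd'   # only the literal 'ab' prefix
```

(Python 3.9 eventually added exactly this behaviour as str.removeprefix and
str.removesuffix, via PEP 616.)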

On Wed, Mar 5, 2008 at 1:33 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> >>> [site.replace('http://', '')for site in sites]
> ['www.google.com', 'python.org', 'www.yahoo.com']

Unfortunately that would break down in certain cases with different
strings, like 'foo bar foo'.replace('foo', ''), which just results in
' bar '.

>  Try another use case.  I think str pretty well has the basic tools needed
>  to construct whatever specific tools one needs.

Oh sure it can, considering that I can implement ltrim in three lines.
This is just to reduce a common pattern in my code, and to remove
rewriting it in multiple projects.


From greg.ewing at canterbury.ac.nz  Wed Mar  5 23:24:37 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 06 Mar 2008 11:24:37 +1300
Subject: [Python-ideas] An official complaint regarding the marshal and
 pickle documentation
In-Reply-To: <ca471dc20803050936r7b90196pc56f9c23c841b33b@mail.gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
	<fc13a6500803050811x40c6bb46t1191129d93be61e6@mail.gmail.com>
	<ca471dc20803050936r7b90196pc56f9c23c841b33b@mail.gmail.com>
Message-ID: <47CF1DA5.9010501@canterbury.ac.nz>

Guido van Rossum wrote:
> BTW the warning for marshal is legit -- the C code that unpacks
> marshal data has not been carefully analyzed against buffer overflows
> and so on.

I thought the main issue with marshal is that it's happy
to create code objects, which pickle doesn't do -- ostensibly
for security reasons.

But if pickle is inherently insecure anyway, does the
exclusion of code objects really make much difference?

BTW, I only consider pickle suitable for quick and dirty
uses anyway, because it ties the external representation very
closely to internal details of your program, which can make
it difficult to evolve the program without invalidating
previously written files.

For long-term use, it's better to invest time in a
properly-thought-out external format for the task, designed
with extensibility in mind.

-- 
Greg


From greg.ewing at canterbury.ac.nz  Wed Mar  5 23:29:49 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 06 Mar 2008 11:29:49 +1300
Subject: [Python-ideas] An official complaint regarding the marshal and
 pickle documentation
In-Reply-To: <fc13a6500803051103s5c0b5e2fi992a9e1fe0b8085@mail.gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
	<fc13a6500803051103s5c0b5e2fi992a9e1fe0b8085@mail.gmail.com>
Message-ID: <47CF1EDD.20008@canterbury.ac.nz>

Aaron Watters wrote:

> In summary: I think marshal.loads(s) is just as safe as unicode(s) or 
> file.read().  pickle.loads(s) is morally equivalant to __import__(s) or 
> eval(s).

According to the docs, you can use a customised unpickler
to restrict the set of things it can use as constructors.
It might be worth mentioning that in a prominent place near
the security warning as well.
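
[Editor's sketch of that customised-unpickler idea, using the modern
pickle.Unpickler API; the allowlist contents here are illustrative.
Overriding find_class restricts which globals the pickle stream may use
as constructors.]

```python
import builtins
import io
import pickle

# Illustrative allowlist: only these (module, name) pairs may be constructed.
ALLOWED = {('builtins', 'range')}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Every global lookup in the stream funnels through here.
        if (module, name) in ALLOWED:
            return getattr(builtins, name)
        raise pickle.UnpicklingError('%s.%s is forbidden' % (module, name))

def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()

assert restricted_loads(pickle.dumps([1, 2, 3])) == [1, 2, 3]  # no globals needed
assert restricted_loads(pickle.dumps(range(3))) == range(3)    # allowlisted
try:
    restricted_loads(pickle.dumps(print))                      # not allowlisted
except pickle.UnpicklingError:
    pass
```

This narrows the attack surface but is still not a sandbox: anything on the
allowlist runs with full privileges when the stream invokes it.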

-- 
Greg


From taleinat at gmail.com  Wed Mar  5 23:59:07 2008
From: taleinat at gmail.com (Tal Einat)
Date: Thu, 6 Mar 2008 00:59:07 +0200
Subject: [Python-ideas] adding a trim convenience function
In-Reply-To: <1ef034530803051245u7fdf525dn6f4efc74a8af59a8@mail.gmail.com>
References: <1ef034530803051245u7fdf525dn6f4efc74a8af59a8@mail.gmail.com>
Message-ID: <7afdee2f0803051459x77c0eb80m9b028a8ed7df6676@mail.gmail.com>

Erick Tryzelaar wrote:
> I find that when I'm normalizing strings, I end up writing this a lot:
>
>  sites = ['www.google.com', 'http://python.org', 'www.yahoo.com']
>  new_sites = []
>  for site in sites:
>   if site.startswith('http://'):
>     site = site[len('http://'):]
>   new_sites.append(site)
>
>  But it'd be much nicer if I could use a convenience function trim that
>  would do this for me, so I could just use a comprehension:
>
>  def ltrim(s, prefix):
>   if s.startswith(prefix):
>     return s[len(prefix):]
>   return s
>
>  sites = ['www.google.com', 'http://python.org', 'www.yahoo.com']
>  sites = [ltrim(site, 'http://') for site in sites]
>
>  Would there be any interest to add this helper function, as well as an
>  "rtrim" and "trim", to the str class?
>

I'm against adding this as a string method, or even as a function in the stdlib.

I've done a lot of text processing with Python and have hardly ever
needed something like this. If you think this would be useful often, a
good way to convince this list is to show some examples of how it
could improve code in the standard library, noting how common they
are.

In general, having a lot of string methods is very harmful because it
makes learning Python a longer and more confusing process.
Furthermore, this functionality is very simple and easy to implement,
I just thought of 3 different ways [1] to implement this function in a
simple, readable one-liner. For these reasons, unless you can show
that this will be very useful very often, I'm against.

- Tal


[1]
ltrim = lambda item, to_trim: re.sub('^' + to_trim, '', item)
ltrim = lambda item, x: item[0 if not item.startswith(x) else len(x):]
ltrim = lambda item, to_trim: ''.join(item.split(to_trim, 1))


From taleinat at gmail.com  Thu Mar  6 00:06:01 2008
From: taleinat at gmail.com (Tal Einat)
Date: Thu, 6 Mar 2008 01:06:01 +0200
Subject: [Python-ideas] adding a trim convenience function
In-Reply-To: <7afdee2f0803051459x77c0eb80m9b028a8ed7df6676@mail.gmail.com>
References: <1ef034530803051245u7fdf525dn6f4efc74a8af59a8@mail.gmail.com>
	<7afdee2f0803051459x77c0eb80m9b028a8ed7df6676@mail.gmail.com>
Message-ID: <7afdee2f0803051506q379c6835we28f1930884cd88d@mail.gmail.com>

Tal Einat wrote:
> Erick Tryzelaar wrote:
>  > I find that when I'm normalizing strings, I end up writing this a lot:
>  >
>  >  sites = ['www.google.com', 'http://python.org', 'www.yahoo.com']
>  >  new_sites = []
>  >  for site in sites:
>  >   if site.startswith('http://'):
>  >     site = site[len('http://'):]
>  >   new_sites.append(site)
>  >
>  >  But it'd be much nicer if I could use a convenience function trim that
>  >  would do this for me, so I could just use a comprehension:
>  >
>  >  def ltrim(s, prefix):
>  >   if s.startswith(prefix):
>  >     return s[len(prefix):]
>  >   return s
>  >
>  >  sites = ['www.google.com', 'http://python.org', 'www.yahoo.com']
>  >  sites = [ltrim(site, 'http://') for site in sites]
>  >
>  >  Would there be any interest to add this helper function, as well as an
>  >  "rtrim" and "trim", to the str class?
>  >
>
>  I'm against adding this as a string method, or even a a function in the stdlib.
>
>  I've done a lot of text processing with Python and have hardly ever
>  needed something like this. If you think this would be useful often, a
>  good way to convince this list is to show some examples of how it
>  could improve code in the standard library, noting how common they
>  are.
>
>  In general, having a lot of string methods is very harmful because it
>  makes learning Python a longer and more confusing process.
>  Furthermore, this functionality is very simple and easy to implement,
>  I just thought of 3 different ways [1] to implement this function in a
>  simple, readable one-liner. For these reasons, unless you can show
>  that this will be very useful very often, I'm against.
>
>  - Tal
>
>
>  [1]
>  ltrim = lambda item, to_trim,: re.sub('^' + to_trim, '', item)
>  ltrim = lambda item, x: item[0 if not item.startswith(x) else len(x):]
>  ltrim = lambda item, to_trim: ''.join(item.split(to_trim, 1))
>

Ignore the third implementation; it's broken. Here's another one in its place:
ltrim = lambda item, x: item[item.startswith(x) * len(x):]
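This one works because startswith returns a bool, which multiplies as
0 or 1, so the slice starts at len(x) only when the prefix is present:

```python
ltrim = lambda item, x: item[item.startswith(x) * len(x):]

# bool * len(x) is 0 when the prefix is absent and len(x) when present
assert ltrim('http://python.org', 'http://') == 'python.org'
assert ltrim('www.google.com', 'http://') == 'www.google.com'
```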

- Tal


From greg.ewing at canterbury.ac.nz  Wed Mar  5 23:44:10 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 06 Mar 2008 11:44:10 +1300
Subject: [Python-ideas] adding a trim convenience function
In-Reply-To: <fqn3jj$2t6$1@ger.gmane.org>
References: <1ef034530803051245u7fdf525dn6f4efc74a8af59a8@mail.gmail.com>
	<fqn3jj$2t6$1@ger.gmane.org>
Message-ID: <47CF223A.8050407@canterbury.ac.nz>

Terry Reedy wrote:

>>>>[site.replace('http://', '') for site in sites]

Not exactly the same thing, as the original only replaced
at the beginning of the string.

An re substitution could be used, but that could be
seen as overkill.

-- 
Greg


From santagada at gmail.com  Thu Mar  6 02:33:25 2008
From: santagada at gmail.com (Leonardo Santagada)
Date: Wed, 5 Mar 2008 22:33:25 -0300
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <fc13a6500803051103s5c0b5e2fi992a9e1fe0b8085@mail.gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
	<fc13a6500803051103s5c0b5e2fi992a9e1fe0b8085@mail.gmail.com>
Message-ID: <44602C61-86DB-4CC7-A48F-BD62D69F55D5@gmail.com>


On 05/03/2008, at 16:03, Aaron Watters wrote:
> Guido pointed out that previous versions of marshal could crash  
> python.
>
> I replied that that is a bug and all known instances have been  
> fixed.  Pickle executes arbitrary code by design -- which is much  
> worse than just crashing a program.

Just read carefully what Guido said: if there is a bug, it cannot just  
crash your program, it can execute any kind of code, as bad as or even  
worse than pickle... that is what is called a buffer overflow.

Speaking of which, the PyPy project has a directory somewhere with lots  
of snippets showing ways to crash CPython... not just the usual one of  
raising the recursion limit and overflowing the stack.

> Leonardo mentioned that pickle security concerns could be addressed  
> using crypto tricks.

For some uses, yes; for others, a modified version of the pure-Python  
pickle could be used, so you have a controlled and almost safe pickle.

> I replied that I would be comfortable unmarshalling a file from a  
> known hostile party -- no crypto verification required, because the  
> worst that could happen is that it would crash the interpreter.   
> With pickle I'd be handing my keyboard to a villian.
>
> In summary: I think marshal.loads(s) is just as safe as unicode(s)  
> or file.read().  pickle.loads(s) is morally equivalant to  
> __import__(s) or eval(s).

No: marshal's load does lots of stuff in pure, unverified C code...  
anything could happen, as Guido pointed out.

> I think the security warning for marshal and the implied  
> recommendation that pickle is okay for RPC should be removed.


No, AFAIK marshal can only load ints and simple objects... and that  
will give you a very poor RPC (for example, it could never be used to  
replace pickle as it is used in ZODB and ZRPC).

--
Leonardo Santagada





From bborcic at gmail.com  Thu Mar  6 13:45:41 2008
From: bborcic at gmail.com (Boris Borcic)
Date: Thu, 06 Mar 2008 13:45:41 +0100
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
Message-ID: <fqop3f$u11$1@ger.gmane.org>

Aaron Watters wrote:
[...]

> Sorry, for the french and the caps, but this is REALLY IMPORTANT.

I see nothing french in your post.

BB



From aaron.watters at gmail.com  Thu Mar  6 18:40:35 2008
From: aaron.watters at gmail.com (Aaron Watters)
Date: Thu, 6 Mar 2008 12:40:35 -0500
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <44602C61-86DB-4CC7-A48F-BD62D69F55D5@gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
	<fc13a6500803051103s5c0b5e2fi992a9e1fe0b8085@mail.gmail.com>
	<44602C61-86DB-4CC7-A48F-BD62D69F55D5@gmail.com>
Message-ID: <fc13a6500803060940r514c64a1u452c95b8606e9c9d@mail.gmail.com>

On Wed, Mar 5, 2008 at 8:33 PM, Leonardo Santagada <santagada at gmail.com>
wrote:

>
> On 05/03/2008, at 16:03, Aaron Watters wrote:
> > Guido pointed out that previous versions of marshal could crash
> > python.
> >
> > I replied that that is a bug and all known instances have been
> > fixed.  Pickle executes arbitrary code by design -- which is much
> > worse than just crashing a program.
>
> Just read carefully what Guido said, if there is a bug it can not just
> crash your program, it can execute any kind of code, as bad or even
> worse than pickle... that is what is called a buffer overflow


I'd like to know the actual number of successful buffer overflow
attacks that have ever happened on the planet in the wild. Maybe one?
Okay, according to Wikipedia there have been 4. I don't really know,
but I think an overflowing buffer in marshal is not very likely to be
anywhere near where a code segment could jump to, because almost
everything in marshal is dynamically allocated. The known attacks have
been where the arrays were in static locations, I believe.

And it's not worse than pickle, because pickle is perfectly capable of
compiling and loading an assembly language component without you
knowing anything about it. Pickle can do anything that the computer
can do.

Also, it's not worse than pickle because you have to be a highly
experienced and perverted assembly language programmer to construct an
overflow attack, and there has to be a bug in marshal to allow it. To
abuse pickle requires almost no skill at all, and you don't have to be
perverted, you just have to be stupid. In fact, pickle is designed to
execute arbitrary code, and this is even documented.
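The "designed to execute arbitrary code" point can be demonstrated in
a few lines via the documented __reduce__ protocol. The payload here
is a harmless os.getcwd, but it could be any callable at all:

```python
import os
import pickle

class Payload:
    def __reduce__(self):
        # pickle records (callable, args); the unpickler calls
        # callable(*args) and uses the result as the object.
        return (os.getcwd, ())

blob = pickle.dumps(Payload())
result = pickle.loads(blob)   # calls os.getcwd() during unpickling
assert result == os.getcwd()
```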

For all I know, it's just as feasible to stage buffer overflow attacks
in many other places in Python as it is in marshal, like maybe
unicode.join or anyplace else where an array is constructed. Which is
to say, it's not very feasible in those places either.

I was clearly off my medication to start this discussion. I suppose
misleading people into thinking marshal is dangerous is better than
suggesting pickle is safe. Peace and love, everyone. Bye now.

  -- Aaron Watters
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080306/a3ca741d/attachment.html>

From lists at cheimes.de  Thu Mar  6 18:56:05 2008
From: lists at cheimes.de (Christian Heimes)
Date: Thu, 06 Mar 2008 18:56:05 +0100
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <fqop3f$u11$1@ger.gmane.org>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
	<fqop3f$u11$1@ger.gmane.org>
Message-ID: <47D03035.6010205@cheimes.de>

Boris Borcic wrote:
> Aaron Watters wrote:
> [...]
> 
>> Sorry, for the french and the caps, but this is REALLY IMPORTANT.
> 
> I see nothing french in your post.

http://en.wikipedia.org/wiki/Pardon_my_French

HTH

Christian



From lists at cheimes.de  Thu Mar  6 18:59:00 2008
From: lists at cheimes.de (Christian Heimes)
Date: Thu, 06 Mar 2008 18:59:00 +0100
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <44602C61-86DB-4CC7-A48F-BD62D69F55D5@gmail.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>	<fc13a6500803051103s5c0b5e2fi992a9e1fe0b8085@mail.gmail.com>
	<44602C61-86DB-4CC7-A48F-BD62D69F55D5@gmail.com>
Message-ID: <fqpbd5$91o$2@ger.gmane.org>

Leonardo Santagada wrote:
>> I replied that that is a bug and all known instances have been  
>> fixed.  Pickle executes arbitrary code by design -- which is much  
>> worse than just crashing a program.
> 
> Just read carefully what Guido said, if there is a bug it can not just  
> crash your program, it can execute any kind of code, as bad or even  
> worse than pickle... that is what is called a buffer overflow

marshal is *ONLY* designed to store and load trusted pyc files. It's
not designed for anything else. It *CAN* be used for simple stuff,
too, but it doesn't support fancy stuff and it can easily be broken.
IIRC it doesn't support nested structures like a list containing a
reference to itself. Use it at your own risk.
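The "simple stuff" case is easy to check: a round-trip of plain
containers of ints and strings works fine.

```python
import marshal

# Round-trip plain, trusted data through marshal, the use case it
# actually supports.
data = {'name': 'pierre', 'ids': [1, 2, 3]}
blob = marshal.dumps(data)
assert marshal.loads(blob) == data
```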

Christian



From tjreedy at udel.edu  Thu Mar  6 20:37:17 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 6 Mar 2008 14:37:17 -0500
Subject: [Python-ideas] adding a trim convenience function
References: <1ef034530803051245u7fdf525dn6f4efc74a8af59a8@mail.gmail.com><3b5110850803051330n362f24a4nd9bfffce90f1aa72@mail.gmail.com><3b5110850803051330h188951d3y990e460e590127a9@mail.gmail.com>
	<1ef034530803051411i35dc94f5seb48911e4a0343b0@mail.gmail.com>
Message-ID: <fqph5b$2bu$1@ger.gmane.org>


"Erick Tryzelaar" 
<idadesub at users.sourceforge.net> wrote in 
message | On Wed, Mar 5, 2008 at 1:33 PM, Terry Reedy 
<tjreedy at udel.edu> wrote:
| > >>> [site.replace('http://', '') for site in sites]
| > ['www.google.com', 'python.org', 'www.yahoo.com']
|
| Unfortunately that would break down in certain cases with different
| strings, like 'foo bar foo'.replace('foo', ''), which just results in
| ' bar '.

I knew that, of course, but that objection does not apply to the use
case you presented.  I simply gave the simplest thing that worked and
passed your 'test'.
Tal gave a more general answer.

| >  Try another use case.  I think str pretty well has the basic tools 
needed
| >  to construct whatever specific tools one needs.
|
| Oh sure it can, considering that I can implement ltrim in three lines.
| This is just to reduce a common pattern in my code, and to remove
| rewriting it in multiple projects.

How many such uses are like your 'foo bar foo'?

tjr





From ntoronto at cs.byu.edu  Thu Mar  6 20:17:27 2008
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Thu, 06 Mar 2008 12:17:27 -0700
Subject: [Python-ideas] An official complaint regarding the marshal and
 pickle documentation
In-Reply-To: <fqop3f$u11$1@ger.gmane.org>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
	<fqop3f$u11$1@ger.gmane.org>
Message-ID: <47D04347.2030104@cs.byu.edu>

Boris Borcic wrote:
> Aaron Watters wrote:
> [...]
> 
>> Sorry, for the french and the caps, but this is REALLY IMPORTANT.
> 
> I see nothing french in your post.

I could have sworn I read something like "I fart in your general direction".

Neil



From idadesub at users.sourceforge.net  Thu Mar  6 21:16:16 2008
From: idadesub at users.sourceforge.net (Erick Tryzelaar)
Date: Thu, 6 Mar 2008 12:16:16 -0800
Subject: [Python-ideas] adding a trim convenience function
In-Reply-To: <fqph5b$2bu$1@ger.gmane.org>
References: <1ef034530803051245u7fdf525dn6f4efc74a8af59a8@mail.gmail.com>
	<3b5110850803051330n362f24a4nd9bfffce90f1aa72@mail.gmail.com>
	<3b5110850803051330h188951d3y990e460e590127a9@mail.gmail.com>
	<1ef034530803051411i35dc94f5seb48911e4a0343b0@mail.gmail.com>
	<fqph5b$2bu$1@ger.gmane.org>
Message-ID: <1ef034530803061216k6fae06b1gc3e0e178ced5736c@mail.gmail.com>

On Thu, Mar 6, 2008 at 11:37 AM, Terry Reedy <tjreedy at udel.edu> wrote:

fyi, I looked through a bunch of code, and it does seem that there is
less need for this than I thought.

>  How many such uses are like your 'foo bar foo'?

The case I ran into is that I used it in a fashion like
'abaaab'.lstrip('ab') before I understood exactly what strip did. The
replace trick won't work for me because all of the instances where I
used this were in an API, so I couldn't assume that the string I was
trimming didn't have other instances of the prefix/suffix in the
middle.


From aahz at pythoncraft.com  Thu Mar  6 21:25:27 2008
From: aahz at pythoncraft.com (Aahz)
Date: Thu, 6 Mar 2008 12:25:27 -0800
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <47CF1DA5.9010501@canterbury.ac.nz>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
	<fc13a6500803050811x40c6bb46t1191129d93be61e6@mail.gmail.com>
	<ca471dc20803050936r7b90196pc56f9c23c841b33b@mail.gmail.com>
	<47CF1DA5.9010501@canterbury.ac.nz>
Message-ID: <20080306202527.GB1724@panix.com>

On Thu, Mar 06, 2008, Greg Ewing wrote:
>
> BTW, I only consider pickle suitable for quick and dirty uses anyway,
> because it ties the external representation very closely to internal
> details of your program, which can make it difficult to evolve the
> program without invalidating previously written files.
>
> For long-term use, it's better to invest time in a
> properly-thought-out external format for the task, designed with
> extensibility in mind.

Maybe so, but my company has been using pickle as a primary long-term
storage mechanism for more than a decade.  We only rarely have problems
with code changes causing pickle problems (less than once per year).
OTOH, we mostly only have a growing internal format -- we almost never
change the internal format.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"All problems in computer science can be solved by another level of     
indirection."  --Butler Lampson


From george.sakkis at gmail.com  Thu Mar  6 21:29:42 2008
From: george.sakkis at gmail.com (George Sakkis)
Date: Thu, 6 Mar 2008 15:29:42 -0500
Subject: [Python-ideas] adding a trim convenience function
In-Reply-To: <1ef034530803061216k6fae06b1gc3e0e178ced5736c@mail.gmail.com>
References: <1ef034530803051245u7fdf525dn6f4efc74a8af59a8@mail.gmail.com>
	<3b5110850803051330n362f24a4nd9bfffce90f1aa72@mail.gmail.com>
	<3b5110850803051330h188951d3y990e460e590127a9@mail.gmail.com>
	<1ef034530803051411i35dc94f5seb48911e4a0343b0@mail.gmail.com>
	<fqph5b$2bu$1@ger.gmane.org>
	<1ef034530803061216k6fae06b1gc3e0e178ced5736c@mail.gmail.com>
Message-ID: <91ad5bf80803061229x7c8993a5j76e726873ee4a8ff@mail.gmail.com>

On Thu, Mar 6, 2008 at 3:16 PM, Erick Tryzelaar
<idadesub at users.sourceforge.net> wrote:

> On Thu, Mar 6, 2008 at 11:37 AM, Terry Reedy <tjreedy at udel.edu> wrote:
>
>  fyi, I looked through a bunch of code, and it does seem that there is
>  less need for this than I thought.
>
>  >  How many such uses are like your 'foo bar foo'?
>
>  The case I ran into is that I used in a fashion like
>  'abaaab'.lstrip('ab') before I understood exactly what strip did. The
>  replace trick won't work for me because all of the instances where I
>  used this were in an api, so I couldn't assume that the string i was
>  trimming didn't have other instances of the prefix/suffix in the
>  middle.

What about adding an optional boolean parameter to str.*strip that
treats the argument as either a set of characters (the default, just
like now) or an exact string? Something like:

>>> 'abaaab'.lstrip('ab')
''
>>> 'abaaab'.lstrip('ab', exact=True)
'aaab'
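A plain-function version of the proposed exact semantics makes the
contrast concrete (lstrip_exact is a hypothetical name for the sake of
the sketch, not an existing method):

```python
def lstrip_exact(s, prefix):
    # What the proposed exact=True flag would do: remove the prefix
    # only if the string actually starts with it, exactly once.
    if s.startswith(prefix):
        return s[len(prefix):]
    return s

assert 'abaaab'.lstrip('ab') == ''             # char-set semantics
assert lstrip_exact('abaaab', 'ab') == 'aaab'  # exact-prefix semantics
```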


George


From santagada at gmail.com  Thu Mar  6 22:00:13 2008
From: santagada at gmail.com (Leonardo Santagada)
Date: Thu, 6 Mar 2008 18:00:13 -0300
Subject: [Python-ideas] An official complaint regarding the marshal and
	pickle documentation
In-Reply-To: <20080306202527.GB1724@panix.com>
References: <fc13a6500803050711r3168b060g322a1bc404dfacba@mail.gmail.com>
	<fc13a6500803050811x40c6bb46t1191129d93be61e6@mail.gmail.com>
	<ca471dc20803050936r7b90196pc56f9c23c841b33b@mail.gmail.com>
	<47CF1DA5.9010501@canterbury.ac.nz>
	<20080306202527.GB1724@panix.com>
Message-ID: <E1CD0E53-ED83-47B6-9ADE-B88FEA13D6C1@gmail.com>


On 06/03/2008, at 17:25, Aahz wrote:

> Maybe so, but my company has been using pickle as a primary long-term
> storage mechanism for more than a decade.  We only rarely have  
> problems
> with code changes causing pickle problems (less than once per year).
> OTOH, we mostly only have a growing internal format -- we almost never
> change the internal format.


And then tons of companies are using ZODB, which uses pickle... so no  
biggie. But using pickle directly can be a problem if you have lots  
of data.

--
Leonardo Santagada





From ntoronto at cs.byu.edu  Mon Mar 10 07:28:43 2008
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Mon, 10 Mar 2008 00:28:43 -0600
Subject: [Python-ideas] [Python-Dev] PEP Proposal: Revised slice objects
 & lists use slice objects as indexes
In-Reply-To: <acd65fa20803091735j2864514bi782e13ba640f0fc@mail.gmail.com>
References: <b572c9e10803091621ic60ab29n5f6b6f3c9f406dc1@mail.gmail.com>
	<acd65fa20803091735j2864514bi782e13ba640f0fc@mail.gmail.com>
Message-ID: <47D4D51B.6070802@cs.byu.edu>

Alexandre Vassalotti wrote:
> On Sun, Mar 9, 2008 at 7:21 PM, Forrest Voight <voights at gmail.com> wrote:
>> This would simplify the handling of list slices.
>>
>>  Slice objects that are produced in a list index area would be different,
>>  and optionally the syntax for slices in list indexes would be expanded
>>  to work everywhere. Instead of being containers for the start, end,
>>  and step numbers, they would be generators, similar to xranges.
> 
> I am not sure what you are trying to propose here. The slice object
> isn't special, it's just a regular built-in type.
> 
>   >>> slice(1,4)
>   slice(1, 4, None)
>   >>> [1,2,3,4,5,6][slice(1,4)]
>   [2, 3, 4]
> 
> I don't see how introducing new syntax would simplify indexing.

Likewise. It would simplify looping, though:

     >>> for i in 1:5:
     ...     print i
     1
     2
     3
     4
     >>>

Since this kind of loop happens frequently, it makes sense to shorten 
it. Slice objects (and syntax) seem ready-made for that - it wouldn't be 
*new* syntax, just repurposed syntax.

Though Forrest didn't bring this up directly, I've often thought that 
Python's having both xrange and slice (and in 3000, range and slice) is 
mostly vestigial. Their information content is identical and their 
purposes are highly analogous. Unifying them would reduce the number of 
new concepts for a beginner by one, and these are frequently-used 
concepts at that.
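The overlap is easy to demonstrate: a slice carries the same (start,
stop, step) triple as a range, and slice.indices() normalizes it
against a concrete sequence length.

```python
# A slice holds the same information as a range; slice.indices()
# normalizes start/stop/step against a given sequence length.
s = slice(1, 5)
seq = [10, 20, 30, 40, 50, 60]

assert seq[s] == [20, 30, 40, 50]
assert [seq[i] for i in range(*s.indices(len(seq)))] == seq[s]

# Negative indices are normalized the same way:
assert slice(-3, None).indices(6) == (3, 6, 1)
```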

Negative indexes could throw the idea for a loop, though. (Pun! Ha ha!) 
And this makes the colons look like some kind of enclosure:

     >>> for i in :5:
     ...

Neil


From aaron.watters at gmail.com  Mon Mar 10 16:33:25 2008
From: aaron.watters at gmail.com (Aaron Watters)
Date: Mon, 10 Mar 2008 11:33:25 -0400
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
Message-ID: <fc13a6500803100833of41967n2b0dce8b044d90d5@mail.gmail.com>

Hi.  Some months ago I complained on the python-list that Python's gc
did too much work for apps that allocate and deallocate lots of
structures.  In fact, one of my apps was spending about 1/3 of its
time garbage collecting and not finding anything to collect (before I
disabled gc).

My proposal was that python
should have some sort of a smarter strategy for garbage
collection, perhaps involving watching the global
high water mark for memory allocation or other tricks.

The appropriate response was:
"great idea! patch please!" :)

Unfortunately dealing with cross platform
memory management internals is beyond my
C-level expertise, and I'm not having a lot of
luck finding good information sources.  Does anyone
have any clues on this or other ideas for improving
the gc heuristic?  For example, how do you find
out the allocated heap size(s) in a cross platform
way?

This link provides some clues, but I don't really
understand this code well enough to hope to
patch gc.

http://www.xfeedme.com/nucular/pydistro.py/go?FocusId=74&FREETEXT=high%20water%20mark

  -- Aaron Watters
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080310/643de42e/attachment.html>

From rhamph at gmail.com  Mon Mar 10 17:30:56 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Mon, 10 Mar 2008 09:30:56 -0700
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <fc13a6500803100833of41967n2b0dce8b044d90d5@mail.gmail.com>
References: <fc13a6500803100833of41967n2b0dce8b044d90d5@mail.gmail.com>
Message-ID: <aac2c7cb0803100930t43d13206hd4ebff30357f41d5@mail.gmail.com>

On Mon, Mar 10, 2008 at 8:33 AM, Aaron Watters <aaron.watters at gmail.com> wrote:
> Hi.  Some months ago I complained on the python-list
> that python gc did too much work for apps that allocate
> and deallocate lots of structures.  In fact one of my apps
> was spending about 1/3 of its time garbage collecting
>  and not finding anything to collect (before i disabled gc).
>
> My proposal was that python
> should have some sort of a smarter strategy for garbage
> collection, perhaps involving watching the global
> high water mark for memory allocation or other tricks.
>
> The appropriate response was:
> "great idea! patch please!" :)
>
> Unfortunately dealing with cross platform
> memory management internals is beyond my
> C-level expertise, and I'm not having a lot of
>  luck finding good information sources.  Does anyone
> have any clues on this or other ideas for improving
> the gc heuristic?  For example, how do you find
> out the allocated heap size(s) in a cross platform
> way?
>
> This link provides some clues, but I don't really
> understand this code well enough to hope to
> patch gc.
>
> http://www.xfeedme.com/nucular/pydistro.py/go?FocusId=74&FREETEXT=high%20water%20mark

You can of course tweak gc.set_threshold() (and I would expect this to
be quite effective, once you find out what an appropriate threshold0
is for your app.)  I don't believe you'll find any existing counters
of the current heap size though (be it number of allocated objects or
total size consumed by those objects.)
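The tuning Adam suggests is a one-liner against the gc module; the
100x factor below is an arbitrary example for illustration, not a
recommendation:

```python
import gc

# Inspect the current collection thresholds; CPython's defaults are
# (700, 10, 10) for the three generations.
t0, t1, t2 = gc.get_threshold()

# Make generation-0 collections 100x rarer for an allocation-heavy app.
gc.set_threshold(t0 * 100, t1, t2)
assert gc.get_threshold()[0] == t0 * 100

gc.set_threshold(t0, t1, t2)   # restore the original thresholds
```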


-- 
Adam Olsen, aka Rhamphoryncus


From g.brandl at gmx.net  Mon Mar 10 18:44:05 2008
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 10 Mar 2008 18:44:05 +0100
Subject: [Python-ideas] [Python-Dev] PEP Proposal: Revised slice objects
 & lists use slice objects as indexes
In-Reply-To: <47D4D51B.6070802@cs.byu.edu>
References: <b572c9e10803091621ic60ab29n5f6b6f3c9f406dc1@mail.gmail.com>	<acd65fa20803091735j2864514bi782e13ba640f0fc@mail.gmail.com>
	<47D4D51B.6070802@cs.byu.edu>
Message-ID: <fr3rs5$o7a$1@ger.gmane.org>

Neil Toronto schrieb:
> Alexandre Vassalotti wrote:
>> On Sun, Mar 9, 2008 at 7:21 PM, Forrest Voight <voights at gmail.com> wrote:
>>> This would simplify the handling of list slices.
>>>
>>>  Slice objects that are produced in a list index area would be different,
>>>  and optionally the syntax for slices in list indexes would be expanded
>>>  to work everywhere. Instead of being containers for the start, end,
>>>  and step numbers, they would be generators, similar to xranges.
>> 
>> I am not sure what you are trying to propose here. The slice object
>> isn't special, it's just a regular built-in type.
>> 
>>   >>> slice(1,4)
>>   slice(1, 4, None)
>>   >>> [1,2,3,4,5,6][slice(1,4)]
>>   [2, 3, 4]
>> 
>> I don't see how introducing new syntax would simplify indexing.
> 
> Likewise. It would simplify looping, though:
> 
>      >>> for i in 1:5:
>      ...     print i
>      1
>      2
>      3
>      4
>      >>>

See http://www.python.org/dev/peps/pep-0204/ for a similar proposal.

Georg



From aaron.watters at gmail.com  Tue Mar 11 15:28:30 2008
From: aaron.watters at gmail.com (Aaron Watters)
Date: Tue, 11 Mar 2008 10:28:30 -0400
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <aac2c7cb0803100930t43d13206hd4ebff30357f41d5@mail.gmail.com>
References: <fc13a6500803100833of41967n2b0dce8b044d90d5@mail.gmail.com>
	<aac2c7cb0803100930t43d13206hd4ebff30357f41d5@mail.gmail.com>
Message-ID: <fc13a6500803110728g7196ac12g477ea66fbacb1fa1@mail.gmail.com>

On Mon, Mar 10, 2008 at 12:30 PM, Adam Olsen <rhamph at gmail.com> wrote:

>
> You can of course tweak gc.set_threshold() (and I would expect this to
> be quite effective, once you find out what an appropriate threshold0
> is for your app.)  I don't believe you'll find any existing counters
> of the current heap size though (be it number of allocated objects or
> total size consumed by those objects.)...


It would be nice if the threshold would adjust based on the
performance characteristics of the app. In particular, it'd be nice if
the garbage collector would notice when it's never finding anything
and wait longer before the next collection attempt every time it
finds nothing.

How about this.
- The threshold slides between minimumThresh and maximumThresh.
- At each collection, the current number of objects collected is
  compared to the last number collected (collectionTrend).
- If the collectionTrend is negative or zero, the next threshold
  slides towards the maximum.
- If the collectionTrend is a small increase, the threshold stays the
  same.
- If the collectionTrend is a large increase, the next threshold
  slides towards the minimum.
That way, for apps that need no garbage collection (outside of
refcounting) the threshold would slide to the maximum and stay there,
but for apps that need a lot of gc the threshold would bounce up and
down near the minimum.
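The scheme can be sketched in pure Python on top of gc.set_threshold().
The class name, constants, and step size are all inventions for this
sketch, not an existing API; a real version would hook into the
collector itself:

```python
import gc

MIN_THRESH, MAX_THRESH, STEP = 700, 70000, 700

class AdaptiveThreshold:
    def __init__(self):
        self.threshold = MIN_THRESH
        self.last_collected = 0   # the fixed comparison point

    def after_collection(self, collected):
        trend = collected - self.last_collected
        if trend <= 0:
            # Nothing new found: back off and collect less often.
            self.threshold = min(self.threshold + STEP, MAX_THRESH)
            self.last_collected = collected
        elif trend > STEP:
            # Garbage is growing quickly: collect more often.
            self.threshold = max(self.threshold - STEP, MIN_THRESH)
            self.last_collected = collected
        # A small increase keeps both the threshold and the fixed
        # comparison point unchanged.
        gc.set_threshold(self.threshold)

adaptive = AdaptiveThreshold()
adaptive.after_collection(0)      # nothing collected: threshold -> 1400
adaptive.after_collection(0)      # still nothing: threshold -> 2100
adaptive.after_collection(5000)   # big jump: threshold -> 1400
```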

This is almost easy enough that I could implement it...
    -- Aaron Watters

===
http://www.xfeedme.com/nucular/pydistro.py/go?FREETEXT=stupid+animation
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080311/2e18fa03/attachment.html>

From aaron.watters at gmail.com  Tue Mar 11 15:42:11 2008
From: aaron.watters at gmail.com (Aaron Watters)
Date: Tue, 11 Mar 2008 10:42:11 -0400
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <fc13a6500803110728g7196ac12g477ea66fbacb1fa1@mail.gmail.com>
References: <fc13a6500803100833of41967n2b0dce8b044d90d5@mail.gmail.com>
	<aac2c7cb0803100930t43d13206hd4ebff30357f41d5@mail.gmail.com>
	<fc13a6500803110728g7196ac12g477ea66fbacb1fa1@mail.gmail.com>
Message-ID: <fc13a6500803110742w3eb166ees3cb615fa5fdde7b5@mail.gmail.com>

...

> - If the collectionTrend is a small increase, the threshold stays the
> same.
>

footnote: for stability you would not update the "last collection count"
in this case so the comparison is always against a fixed point until
the threshold adjusts....
  -- Aaron Watters

===
http://www.xfeedme.com/nucular/pydistro.py/go?FREETEXT=arggh
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080311/9bb806c5/attachment.html>

From rhamph at gmail.com  Tue Mar 11 17:25:48 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Tue, 11 Mar 2008 09:25:48 -0700
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <fc13a6500803110728g7196ac12g477ea66fbacb1fa1@mail.gmail.com>
References: <fc13a6500803100833of41967n2b0dce8b044d90d5@mail.gmail.com>
	<aac2c7cb0803100930t43d13206hd4ebff30357f41d5@mail.gmail.com>
	<fc13a6500803110728g7196ac12g477ea66fbacb1fa1@mail.gmail.com>
Message-ID: <aac2c7cb0803110925k44a49fdcw75fc5c0a2272a8c2@mail.gmail.com>

On Tue, Mar 11, 2008 at 7:28 AM, Aaron Watters <aaron.watters at gmail.com> wrote:
> On Mon, Mar 10, 2008 at 12:30 PM, Adam Olsen <rhamph at gmail.com> wrote:
> > You can of course tweak gc.set_threshold() (and I would expect this to
> > be quite effective, once you find out what an appropriate threshold0
> > is for your app.)  I don't believe you'll find any existing counters
> > of the current heap size though (be it number of allocated objects or
> > total size consumed by those objects.)...
>
>  It would be nice if the threshold would adjust based
>  on the performance characteristics of the app.
>  In particular it'd be nice if the garbage collector would
>  notice when it's never finding anything and wait longer
>  everytime it finds nothing for the next collection attempt.
>
>  How about this.
>  - The threshold slides between minimumThresh and maximumThresh
>  - At each collection the current number of objects
>    collected is compared to the last number collected (collectionTrend).
>  - If the collectionTrend is negative or zero the next threshold slides
>    towards the maximum.
>  - If the collectionTrend is a small increase, the threshold stays the same.
>  - If the collectionTrend is a large increase the next threshold slides
> towards
>    the minimum.
>  That way for apps that need no garbage collection
>  (outside of refcounting) the threshold would slide to the
>  maximum and stay there, but for apps that need a lot of
>  gc the threshold would bounce up and down near the minimum.
>
>  This is almost easy enough that I could implement it...
>      -- Aaron Watters

It sounds plausible to me.

But have you tried just tweaking the threshold?  Surely there's a
value at which it performs well, and that'd need to be within your
maximum anyway.

-- 
Adam Olsen, aka Rhamphoryncus


From aaron.watters at gmail.com  Tue Mar 11 19:20:50 2008
From: aaron.watters at gmail.com (Aaron Watters)
Date: Tue, 11 Mar 2008 14:20:50 -0400
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <aac2c7cb0803110925k44a49fdcw75fc5c0a2272a8c2@mail.gmail.com>
References: <fc13a6500803100833of41967n2b0dce8b044d90d5@mail.gmail.com>
	<aac2c7cb0803100930t43d13206hd4ebff30357f41d5@mail.gmail.com>
	<fc13a6500803110728g7196ac12g477ea66fbacb1fa1@mail.gmail.com>
	<aac2c7cb0803110925k44a49fdcw75fc5c0a2272a8c2@mail.gmail.com>
Message-ID: <fc13a6500803111120y57e5d81fo9a2313c51cc8cd9c@mail.gmail.com>

> It sounds plausible to me.
>
> But have you tried just tweaking the threshold?  Surely there's a
> value at which it performs well, and that'd need to be within your
> maximum anyway.


In that case it works best when gc is disabled.
If I add a new feature, the gc requirements may
change completely without me realizing it.
I'm interested in not having to think about it :).
   -- Aaron Watters
===
http://www.xfeedme.com/nucular/pydistro.py/go?FREETEXT=hello+there+goof+ball
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080311/2e3cc271/attachment.html>

From rhamph at gmail.com  Tue Mar 11 20:08:37 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Tue, 11 Mar 2008 12:08:37 -0700
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <fc13a6500803111120y57e5d81fo9a2313c51cc8cd9c@mail.gmail.com>
References: <fc13a6500803100833of41967n2b0dce8b044d90d5@mail.gmail.com>
	<aac2c7cb0803100930t43d13206hd4ebff30357f41d5@mail.gmail.com>
	<fc13a6500803110728g7196ac12g477ea66fbacb1fa1@mail.gmail.com>
	<aac2c7cb0803110925k44a49fdcw75fc5c0a2272a8c2@mail.gmail.com>
	<fc13a6500803111120y57e5d81fo9a2313c51cc8cd9c@mail.gmail.com>
Message-ID: <aac2c7cb0803111208lf1d1b79l8a86191a3299c88f@mail.gmail.com>

On Tue, Mar 11, 2008 at 11:20 AM, Aaron Watters <aaron.watters at gmail.com> wrote:
> > It sounds plausible to me.
> >
> > But have you tried just tweaking the threshold?  Surely there's a
> > value at which it performs well, and that'd need to be within your
> > maximum anyway.
>
> In that case it works best when gc is disabled.
> If I add a new feature, the gc requirements may
> change completely without me realizing it.
> I'm interested in not having to think about it :).

You're concerned that a new feature may increase how high of a
threshold you need, yet it could also exceed the "maximum" of your
adaptive scheme.

I'm not convinced you need that high of a threshold anyway.  I'd like
to see a benchmark showing how your app performs at different levels.


-- 
Adam Olsen, aka Rhamphoryncus


From lists at cheimes.de  Tue Mar 11 20:57:02 2008
From: lists at cheimes.de (Christian Heimes)
Date: Tue, 11 Mar 2008 20:57:02 +0100
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <fc13a6500803110728g7196ac12g477ea66fbacb1fa1@mail.gmail.com>
References: <fc13a6500803100833of41967n2b0dce8b044d90d5@mail.gmail.com>	<aac2c7cb0803100930t43d13206hd4ebff30357f41d5@mail.gmail.com>
	<fc13a6500803110728g7196ac12g477ea66fbacb1fa1@mail.gmail.com>
Message-ID: <fr6o6f$4pb$1@ger.gmane.org>

Aaron Watters wrote:
> It would be nice if the threshold would adjust based
> on the performance characteristics of the app.
> In particular it'd be nice if the garbage collector would
> notice when it's never finding anything and wait longer
> every time it finds nothing for the next collection attempt.

Have you read the code and comments in Modules/gcmodule.c? The cyclic GC
has three generations. A gc sweep for the highest generation is started
every 70,000 instructions. You can tune the levels for the generations
yourself through the gc module's set_threshold() function.
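The knobs being described here are exposed directly through the gc module; a minimal sketch (the threshold values below are illustrative, not recommendations):

```python
import gc

print(gc.get_threshold())   # default is (700, 10, 10) in CPython
print(gc.get_count())       # allocations counted so far, per generation

# Raising the gen0 threshold makes collections run less often:
gc.set_threshold(10000, 10, 10)
print(gc.get_threshold())   # (10000, 10, 10)
```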

Christian



From rhamph at gmail.com  Tue Mar 11 21:49:45 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Tue, 11 Mar 2008 13:49:45 -0700
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <fr6o6f$4pb$1@ger.gmane.org>
References: <fc13a6500803100833of41967n2b0dce8b044d90d5@mail.gmail.com>
	<aac2c7cb0803100930t43d13206hd4ebff30357f41d5@mail.gmail.com>
	<fc13a6500803110728g7196ac12g477ea66fbacb1fa1@mail.gmail.com>
	<fr6o6f$4pb$1@ger.gmane.org>
Message-ID: <aac2c7cb0803111349r7effd238me403d4648872b0aa@mail.gmail.com>

On Tue, Mar 11, 2008 at 12:57 PM, Christian Heimes <lists at cheimes.de> wrote:
> Aaron Watters wrote:
>  > It would be nice if the threshold would adjust based
>  > on the performance characteristics of the app.
>  > In particular it'd be nice if the garbage collector would
>  > notice when it's never finding anything and wait longer
>  > every time it finds nothing for the next collection attempt.
>
>  Have you read the code and comments in Modules/gcmodule.c? The cyclic GC
>  has three generations. A gc sweep for the highest generation is started
>  every 70,000 instructions. You can tune the levels for the generations
>  yourself through the gc module's set_threshold() function.

Not instructions.  There's a counter that's incremented on allocation
and decremented on deallocation.  Each time it hits 700 it triggers a
collection.  The collections are normally only gen0, but after 10 it
does a gen1 collection.  After 10 of the second generation it does the
gen2 (i.e. a full collection).

Although, given the way the math is done, I think the 701st object
will be the one that triggers the gen0 collection, and the 12th time
that happens it does gen1 instead.  I get a grand total of... 93233
objects to trigger a gen2 collection.  Not that it matters.

Without more detailed information on what the app is doing, we can't
seriously attempt to improve the heuristics for it.

-- 
Adam Olsen, aka Rhamphoryncus


From aaron.watters at gmail.com  Tue Mar 11 21:57:25 2008
From: aaron.watters at gmail.com (Aaron Watters)
Date: Tue, 11 Mar 2008 16:57:25 -0400
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <aac2c7cb0803111208lf1d1b79l8a86191a3299c88f@mail.gmail.com>
References: <fc13a6500803100833of41967n2b0dce8b044d90d5@mail.gmail.com>
	<aac2c7cb0803100930t43d13206hd4ebff30357f41d5@mail.gmail.com>
	<fc13a6500803110728g7196ac12g477ea66fbacb1fa1@mail.gmail.com>
	<aac2c7cb0803110925k44a49fdcw75fc5c0a2272a8c2@mail.gmail.com>
	<fc13a6500803111120y57e5d81fo9a2313c51cc8cd9c@mail.gmail.com>
	<aac2c7cb0803111208lf1d1b79l8a86191a3299c88f@mail.gmail.com>
Message-ID: <fc13a6500803111357s498fb55ex7a5adb0f14bfd8d5@mail.gmail.com>

>
> You're concerned that a new feature may increase how high of a
> threshold you need, yet it could also exceed the "maximum" of your
> adaptive scheme.
>
> I'm not convinced you need that high of a threshold anyway.  I'd like
> to see a benchmark showing how your app performs at different levels.


You are absolutely right that I can set a threshold high enough.
With the default values Python is extremely slow for certain cases.
I'm arguing it should automatically detect when it is being stupid
and attempt to fix it.  In particular I would set the maximum very
high and start at the minimum, which might be near the current
defaults.

For example I get the following in a simple test (python2.6):

> python gctest.py
gc not disabled
elapsed 19.2473409176
> python gctest.py disable
gc disabled
elapsed 4.88715791702

In this case the interpreter is spending 80% of its time trying
to collect non-existent garbage.  Now a newbie who
didn't know to go fiddling with the garbage collector
might just conclude "python is ssslllooowwww" and go
back to using Perl or Ruby or whatever in a case like this.
Maybe the powers that be couldn't care less about it, I don't
know.  (I know newbies can be irritating).

The problem is quadratic also: if I double the limit the
penalty goes up by a factor of 4.

Here is the source:

def test(disable=False, limit=1000000):
    from time import time
    import gc
    if disable:
        gc.disable()
        print "gc disabled"
    else:
        print "gc not disabled"
    now = time()
    D = {}
    for i in range(limit):
        D[ (hex(i), oct(i)) ] = str(i)+repr(i)
    L = [ (y,x) for (x,y) in D.iteritems() ]
    elapsed = time()-now
    print "elapsed", elapsed

if __name__=="__main__":
    import sys
    disable = False
    if "disable" in sys.argv:
        disable = True
    test(disable)


-- Aaron Watters

===
http://www.xfeedme.com/nucular/pydistro.py/go?FREETEXT=being+anal

From rhamph at gmail.com  Tue Mar 11 22:31:17 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Tue, 11 Mar 2008 14:31:17 -0700
Subject: [Python-ideas] cross platform memory usage high water mark for
	improved gc?
In-Reply-To: <fc13a6500803111357s498fb55ex7a5adb0f14bfd8d5@mail.gmail.com>
References: <fc13a6500803100833of41967n2b0dce8b044d90d5@mail.gmail.com>
	<aac2c7cb0803100930t43d13206hd4ebff30357f41d5@mail.gmail.com>
	<fc13a6500803110728g7196ac12g477ea66fbacb1fa1@mail.gmail.com>
	<aac2c7cb0803110925k44a49fdcw75fc5c0a2272a8c2@mail.gmail.com>
	<fc13a6500803111120y57e5d81fo9a2313c51cc8cd9c@mail.gmail.com>
	<aac2c7cb0803111208lf1d1b79l8a86191a3299c88f@mail.gmail.com>
	<fc13a6500803111357s498fb55ex7a5adb0f14bfd8d5@mail.gmail.com>
Message-ID: <aac2c7cb0803111431jd5a5a7ch60b21722b4bda414@mail.gmail.com>

On Tue, Mar 11, 2008 at 1:57 PM, Aaron Watters <aaron.watters at gmail.com> wrote:
>
>
> >
> >
> > You're concerned that a new feature may increase how high of a
> > threshold you need, yet it could also exceed the "maximum" of your
> > adaptive scheme.
> >
> > I'm not convinced you need that high of a threshold anyway.  I'd like
> > to see a benchmark showing how your app performs at different levels.
>
> You are absolutely right that I can set a threshold high enough.
> With the default values Python is extremely slow for certain cases.
>  I'm arguing it should automatically detect when it is being stupid
> and attempt to fix it.  In particular I would set the maximum very
> high and start at the minimum, which might be near the current
> defaults.
>
> For example I get the following in a simple test (python2.6):
>
> > python gctest.py
> gc not disabled
> elapsed 19.2473409176
> > python gctest.py disable
> gc disabled
> elapsed 4.88715791702
>
> In this case the interpreter is spending 80% of its time trying
>  to collect non-existent garbage.  Now a newbie who
> didn't know to go fiddling with the garbage collector
> might just conclude "python is ssslllooowwww" and go
> back to using Perl or Ruby or whatever in a case like this.
>  Maybe the powers that be couldn't care less about it, I don't
> know.  (I know newbies can be irritating).
>
> The problem is quadratic also: if I double the limit the
> penalty goes up by a factor of 4.
>
>  Here is the source:
>
> def test(disable=False, limit=1000000):
>     from time import time
>     import gc
>     if disable:
>         gc.disable()
>         print "gc disabled"
>     else:
>         print "gc not disabled"
>     now = time()
>     D = {}
>     for i in range(limit):
>         D[ (hex(i), oct(i)) ] = str(i)+repr(i)
>     L = [ (y,x) for (x,y) in D.iteritems() ]
>     elapsed = time()-now
>     print "elapsed", elapsed
>
> if __name__=="__main__":
>     import sys
>     disable = False
>     if "disable" in sys.argv:
>         disable = True
>     test(disable)

Interesting.  With some further testing, it's become clear that the
problem is in gen2.  gen0 and gen1 both add a constant overhead (their
size is bounded), but gen2's size grows linearly, and with a linear
number of scans that gives quadratic performance.

I'm unsure how to best fix this.  Anything we do will effectively
disable gen2 for short-running programs, unless they do the right
thing to trigger the heuristics.  Long running programs have a little
more chance of triggering them, but may do so much later than
desirable.

Something must be done though.  The costs should be linear with time,
not quadratic.  The frequency at which an object gets scanned should
be inversely proportional to the number of objects to be scanned.
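Aaron's earlier suggestion — wait longer whenever the collector finds nothing — can be sketched with nothing but the public gc API. Here `adaptive_collect` is a hypothetical helper, not an existing function, and the doubling/halving policy is an arbitrary illustrative choice:

```python
import gc

def adaptive_collect(max_threshold=1000000):
    """Run a full collection and adapt the gen0 threshold to what it found."""
    t0, t1, t2 = gc.get_threshold()
    found = gc.collect()                 # returns the number of unreachable objects
    if found == 0:
        t0 = min(t0 * 2, max_threshold)  # nothing found: wait longer next time
    else:
        t0 = max(t0 // 2, 700)           # cycles found: collect sooner again
    gc.set_threshold(t0, t1, t2)
    return found
```

A program that never creates cyclic garbage would see its collection interval grow geometrically, which is one way to make the cost proportional to actual garbage rather than to heap size.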


-- 
Adam Olsen, aka Rhamphoryncus


From lorgandon at gmail.com  Thu Mar 13 16:09:43 2008
From: lorgandon at gmail.com (Imri Goldberg)
Date: Thu, 13 Mar 2008 17:09:43 +0200
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
References: <47D8E3E0.7010509@gmail.com>
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
Message-ID: <47D943B7.60405@gmail.com>

As per Aahz's suggestion, I'm moving this discussion here, from Python-Dev.
(Thanks Aahz!)

Mark Dickinson wrote:
> On Thu, Mar 13, 2008 at 4:20 AM, Imri Goldberg <lorgandon at gmail.com 
> <mailto:lorgandon at gmail.com>> wrote:
>
>     My suggestion is to do either of the following:
>     1. Change floating point == to behave like a valid floating point
>     comparison. That means using precision and some error measure.
>     2. Change floating point == to raise an exception, with an error
>     string
>     suggesting using precision comparison, or the decimal module.
>
>
> I don't much like either of these;  I think option 1 would cause
> a lot of confusion and difficulty---it changes a conceptually
> simple operation into something more complicated.
>
> As for option 2., I'd agree that there are situations where having
> a warning (not an exception) for floating-point equality (and
> inequality) tests might be helpful;  but that warning should be
> off by default, or at least easily turned off.
As I said earlier, I'd like static checkers (like Python-Lint) to catch 
this sort of case, whatever the decision may be.
>
> Some Fortran compilers have such a (compile-time) warning,
> I believe.  But Fortran's users are much more likely to be
> writing the sort of code that cares about this.
>  
>
>     Since this change is not backwards compatible, I suggest it be added
>     only to Python 3.
>
>
> It's already too late for Python 3.0.
Still, I believe it is worth discussing.
>
>
>     3. Programmers will still need the regular ==:
>     Maybe, and even then, only for very rare cases. For these, a special
>     function/method might be used, which could be named floating_exact_eq.
>
>
> I disagree with the 'very rare' here.  I've seen, and written, code like:
>
> if a == 0.0:
>     # deal with exceptional case
> else:
>     b = c/a
>     ...
>
> or similarly, a test (a==b) before doing a division by a-b.  That
> one's kind of dodgy, by the way:  a != b doesn't always guarantee
> that a-b is nonzero, though you're okay if you're on an IEEE 754
> platform and a and b are both finite numbers.
While checking against a==0.0 (and other similar conditions) before 
dividing will indeed protect from outright division by zero, dividing by 
a very small a will amplify any error already present in the computation. 
I guess it would be better to check instead for 'a is small', for an 
appropriate value of 'small'.
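Such a guard might look like this (a sketch: `safe_divide` and the 1e-12 cutoff are illustrative assumptions, not anything proposed in the thread):

```python
def safe_divide(c, a, small=1e-12):
    # Guard against near-zero divisors, not just an exact 0.0
    if abs(a) < small:
        raise ZeroDivisionError("divisor %r is too close to zero" % a)
    return c / a

print(safe_divide(1.0, 2.0))   # 0.5
```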
>
> Or what if you wanted to generate random numbers in the open interval
> (0.0, 1.0).  random.random gives you numbers in [0.0, 1.0), so a
> careful programmer might well write:
>
> while True:
>     x = random.random()
>     if x != 0.0:
>         break
>
> (A less fussy programmer might just say that the chance
> of getting 0.0 is about 1 in 2**53, so it's never going to happen...)
>
> Other thoughts:
>
>  - what should x == x do?
If suggestion no. 1 is accepted, always return True. If no. 2 is 
accepted, raise an exception.
Checking x==x is as meaningful as checking x==y.
>  - what should
>
> 1.0 in set([0.0, 1.0, 2.0])
>
> and 
>
> 3.0 in set([0.0, 1.0, 2.0])
>
> do?
>
Actually, one of the reasons I thought about this subject in the first 
place, was dict lookup for floating point numbers. It seems to me that 
it's something you just shouldn't do.
As for your examples, I believe these two should both raise an 
exception. This is even worse than normal comparison - here you are 
checking against the hash of a floating point number. So if you do that 
in the current implementation, there's a good chance you'll get 
unexpected results. If you do that given the implementation of 
suggestion 1, you'll have a hard time making sets work.
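The dict-lookup trap in question is easy to reproduce: a key computed in floating point may fail to match the key that was stored, even though they "should" be equal.

```python
d = {0.3: "found"}
key = 0.1 + 0.2            # 0.30000000000000004, not 0.3

print(key == 0.3)          # False
print(key in d)            # False: the computed key hashes differently
```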
> Mark
Cheers,
Imri.

-------------------------
Imri Goldberg
www.algorithm.co.il/blogs
www.imri.co.il
-------------------------
Insert Signature Here
-------------------------



From dickinsm at gmail.com  Thu Mar 13 17:17:39 2008
From: dickinsm at gmail.com (Mark Dickinson)
Date: Thu, 13 Mar 2008 12:17:39 -0400
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>
References: <47D8E3E0.7010509@gmail.com>
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
	<47D943B7.60405@gmail.com>
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>
	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>
Message-ID: <5c6f2a5d0803130917m1142f081rb71998efe904963d@mail.gmail.com>

Imri,

Aargh!  Sorry about the multiple emails.  The first one bounced because I
wasn't subscribed to python-ideas, so I canceled it and sent it again,
forgetting that you would still have got a copy of the first email.

And now I'm sending you a third one, just to apologise for the second one
(or was it the first?).

Double apologies, and I'll try not to do it again.

Mark

From dickinsm at gmail.com  Thu Mar 13 17:08:56 2008
From: dickinsm at gmail.com (Mark Dickinson)
Date: Thu, 13 Mar 2008 12:08:56 -0400
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>
References: <47D8E3E0.7010509@gmail.com>
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
	<47D943B7.60405@gmail.com>
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>
Message-ID: <5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>

(with apologies for the random extra level of quoting in the below...)


> On Thu, Mar 13, 2008 at 11:09 AM, Imri Goldberg <lorgandon at gmail.com>
> wrote:
>
> As I said earlier, I'd like static checkers (like Python-Lint) to catch
> > this sort of case, whatever the decision may be.
> >
>
Hmm.  Isn't that tricky?  How does the static checker decide
whether the objects being compared are floats?  I guess one could
be content with catching some cases where the operands to ==
are clearly floats...  Wouldn't you have to have run-time warnings
to be really sure of catching all the cases?


> > It's already too late for Python 3.0.
> > Still, I believe it is worth discussing.
> >
>

Sure.  I didn't mean that to come out in quite the dismissive way it did :).
Apologies.  Maybe a PEP aimed at Python 4.0 is in order.  If you're open
to the idea of just having some way to enable warnings, it could be
much sooner.


> > While checking against a==0.0 (and other similar conditions) before
> > dividing will indeed protect from outright division by zero, it will
> > enlarge any error you will have in the computation. I guess it would be
> > better to do the same check for 'a is small' for appropriate values of
> > 'small'.
>
>
Still, a check for 0.0 is good enough in some cases:  if a is tiny, the
large intermediate values may appear and then disappear happily
before giving a sensible final result.  These are usually the sort
of cases where just having division by 0.0 return an infinity
would have "just worked" too (making the whole "if" redundant), but
that's not (currently!) an option in Python.

It's a truism that floating-point equality tests should be avoided, but
it's just not true that floating-point equality testing is *always* wrong,
and I don't think that Python should make it so.

Actually, one of the reasons I thought about this subject in the first
> > place, was dict lookup for floating point numbers. It seems to me that
> > it's something you just shouldn't do.
> >
>
So your proposal would presumably include making

  x in dict

and

  x not in dict

errors for any float x, regardless of the contents of the dictionary
(or list, or set, or frozenset, or...) dict?

What would you do about Decimals? A Decimal is just another
floating point format (albeit base 10 instead of base 2); so
presumably all these warnings/errors should apply equally
to Decimal instances?  If not, why not?

I'm not trying to be negative here---as Aahz says, this is an
interesting idea;  I'm just trying to understand exactly how
things might work.

Mark

From lorgandon at gmail.com  Thu Mar 13 19:38:12 2008
From: lorgandon at gmail.com (Imri Goldberg)
Date: Thu, 13 Mar 2008 20:38:12 +0200
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>
References: <47D8E3E0.7010509@gmail.com>	
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>	
	<47D943B7.60405@gmail.com>	
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>
	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>
Message-ID: <47D97494.7040103@gmail.com>



Mark Dickinson wrote:

> (with apologies for the random extra level of quoting in the below...)
>  
>
>     On Thu, Mar 13, 2008 at 11:09 AM, Imri Goldberg
>     <lorgandon at gmail.com <mailto:lorgandon at gmail.com>> wrote:
>
>         As I said earlier, I'd like static checkers (like Python-Lint)
>         to catch
>         this sort of case, whatever the decision may be.
>
>  
> Hmm.  Isn't that tricky?  How does the static checker decide
> whether the objects being compared are floats?  I guess one could
> be content with catching some cases where the operands to ==
> are clearly floats...  Wouldn't you have to have run-time warnings
> to be really sure of catching all the cases?
>

Yes. Writing a static-checker for Python is tricky in any case. For the 
sake of this discussion, it might be useful to refer to some 'ideal' 
static checker. This will allow us to better define what the desired 
behavior is.

>
>         > It's already too late for Python 3.0.
>         Still, I believe it is worth discussing.
>
>
>  
> Sure.  I didn't mean that to come out in quite the dismissive way it 
> did :).
> Apologies.  Maybe a PEP aimed at Python 4.0 is in order.  If you're open
> to the idea of just having some way to enable warnings, it could be
> much sooner.
>

I think that generating a warning (by default?) is a strong enough 
change in the right direction, so we should add that as another option. 
(Was also suggested in a comment on my blog.)

>
>         While checking against a==0.0 (and other similar conditions)
>         before
>         dividing will indeed protect from outright division by zero,
>         it will
>         enlarge any error you will have in the computation. I guess it
>         would be
>         better to do the same check for 'a is small' for appropriate
>         values of
>         'small'.
>
>  
> Still, a check for 0.0 is good enough in some cases:  if a is tiny, the
> large intermediate values may appear and then disappear happily
> before giving a sensible final result.  These are usually the sort
> of cases where just having division by 0.0 return an infinity
> would have "just worked" too (making the whole "if" redundant), but
> that's not (currently!) an option in Python.
>
> It's a truism that floating-point equality tests should be avoided, but
> it's just not true that floating-point equality testing is *always* wrong,
> and I don't think that Python should make it so.
>

Alright, that's why in my original suggestion, I proposed a function for 
'old-style' comparison.
It still seems to me that in most cases you are better off doing 
something other than using the current ==.

A point I'm not sure of though, is what happens to other comparison 
operators, namely,
<=, <, >, >=. If they retain their original meaning then <= and >= 
become at least a bit inconsistent.
I'll be glad to hear more opinions about this.


>         Actually, one of the reasons I thought about this subject in
>         the first
>         place, was dict lookup for floating point numbers. It seems to
>         me that
>         it's something you just shouldn't do.
>
>  
> So your proposal would presumably include making
>
>   x in dict
>
> and
>
>   x not in dict
>
> errors for any float x, regardless of the contents of the dictionary
> (or list, or set, or frozenset, or...) dict?
>
> What would you do about Decimals? A Decimal is just another
> floating point format (albeit base 10 instead of base 2); so
> presumably all these warnings/errors should apply equally
> to Decimal instances?  If not, why not?
>

This last note gave me pause. I still need to think more about this, but 
here are my thoughts so far:

1. Decimal's behavior might be considered even more inconsistent - the 
precision applies to arithmetical operations, but not to comparisons.
2. As a result, it seems to me that decimal's behavior might also be 
changed.
It needn't be the same change as regular floating point though - decimal 
behavior might follow suggestion 1, while regular floating points might 
follow suggestion 2. (I see no point in it being the other way around 
though.)
3. Usage in containers depending on __hash__ should change according to 
how == behaves for decimals. If == raises a warning/exception, so 
should "x in {..}". If == will be changed to work according to precision 
for decimals, then usage in containers will be (very) problematic, 
because of context changes. (Consider what happens when changing the 
precision.)
4. Right now, I would avoid using decimal or regular floating points in 
such containers. The results are just not predictable enough. Using the 
'ideal static-checker' mentioned above, I'd say that any such use should 
result in a warning.
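Point 1 is easy to demonstrate: the Decimal context precision rounds the results of arithmetic, while == always compares exact values (the values below are chosen only to make the effect visible):

```python
from decimal import Decimal, getcontext

getcontext().prec = 3      # three significant digits for arithmetic
a = Decimal("1.000")
b = Decimal("1.0005")

print(a == b)              # False: comparison is exact, ignoring precision
print(a + 0 == b + 0)      # True: both sums round to Decimal('1.00')
```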

In any case, there might be a place for a way to do floating point 
comparisons in a 'standard' manner.

> I'm not trying to be negative here---as Aahz says, this is an
> interesting idea;  I'm just trying to understand exactly how
> things might work.
>
> Mark

Sure, so do I.

Cheers,
Imri.


-------------------------
Imri Goldberg
www.algorithm.co.il/blogs
www.imri.co.il
-------------------------
Insert Signature Here
-------------------------


From idadesub at users.sourceforge.net  Thu Mar 13 21:56:20 2008
From: idadesub at users.sourceforge.net (Erick Tryzelaar)
Date: Thu, 13 Mar 2008 13:56:20 -0700
Subject: [Python-ideas] py3k: adding "print" methods to file-like objects
Message-ID: <1ef034530803131356m3320cd93v8fbfb5ceca464640@mail.gmail.com>

This might be a minor thing, but I kind of wish that I could write this:

sys.stderr.print('first line')
sys.stderr.print('another line here')
sys.stderr.print('and again')

instead of:

print('first line', file=sys.stderr)
print('another line here', file=sys.stderr)
print('and again', file=sys.stderr)

As it's a lot easier to read for me. Of course you can always add
spaces to make the lines line up, but with a long print statement your
eye has to go a long distance to figure out what file, if any, you're
printing to. It could be pretty simple to add:

class ...:
  def print(self, *args, **kwargs):
    io.print(*args, file=self, **kwargs)

I haven't been able to find any discussion on this, has this already
been rejected?
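Pending any such method, the effect can be approximated today with a thin wrapper around a file object (`PrintableFile` is a hypothetical name for illustration, not a stdlib class):

```python
import sys

class PrintableFile:
    """Wrap a file-like object and expose a .print() method on it."""
    def __init__(self, f):
        self._f = f

    def print(self, *args, **kwargs):
        # Delegate to the builtin print, pinning file= to the wrapped object
        print(*args, file=self._f, **kwargs)

    def __getattr__(self, name):
        return getattr(self._f, name)   # forward everything else

err = PrintableFile(sys.stderr)
err.print("first line")
err.print("another line here")
```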


From greg.ewing at canterbury.ac.nz  Thu Mar 13 23:02:19 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Mar 2008 11:02:19 +1300
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <47D97494.7040103@gmail.com>
References: <47D8E3E0.7010509@gmail.com>
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
	<47D943B7.60405@gmail.com>
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>
	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>
	<47D97494.7040103@gmail.com>
Message-ID: <47D9A46B.2080900@canterbury.ac.nz>

Imri Goldberg wrote:
> 
> what happens to other comparison 
> operators, namely,
> <=, <, >, >=. If they retain their original meaning then <= and >= 
> become at least a bit inconsistent.

Also, if you have <= and >= then you can cheat by
doing 'x <= y and x >= y'. :-)

-- 
Greg


From lorgandon at gmail.com  Thu Mar 13 23:18:35 2008
From: lorgandon at gmail.com (Imri Goldberg)
Date: Fri, 14 Mar 2008 00:18:35 +0200
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <47D9A46B.2080900@canterbury.ac.nz>
References: <47D8E3E0.7010509@gmail.com>	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>	<47D943B7.60405@gmail.com>	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>	<47D97494.7040103@gmail.com>
	<47D9A46B.2080900@canterbury.ac.nz>
Message-ID: <47D9A83B.1050908@gmail.com>

Greg Ewing wrote:
> Imri Goldberg wrote:
>   
>> what happens to other comparison 
>> operators, namely,
>> <=, <, >, >=. If they retain their original meaning then <= and >= 
>> become at least a bit inconsistent.
>>     
>
> Also, if you have <= and >= then you can cheat by
> doing 'x <= y and x >= y'. :-)
>
>   

That's part of what I meant.

There's also the problem that if x>y, then you want x!=y. This means 
that there are implications for all comparison operators.

This makes changing == behavior to an epsilon comparison more involved. 
I still think it is feasible, but will require much more consideration.

In any case, emitting a warning for == is still 'cheap', and the 
original arguments stand.

-------------------------
Imri Goldberg
www.algorithm.co.il/blogs
www.imri.co.il
-------------------------
Insert Signature Here
-------------------------




From taleinat at gmail.com  Thu Mar 13 23:40:03 2008
From: taleinat at gmail.com (Tal Einat)
Date: Fri, 14 Mar 2008 00:40:03 +0200
Subject: [Python-ideas] py3k: adding "print" methods to file-like objects
In-Reply-To: <1ef034530803131356m3320cd93v8fbfb5ceca464640@mail.gmail.com>
References: <1ef034530803131356m3320cd93v8fbfb5ceca464640@mail.gmail.com>
Message-ID: <7afdee2f0803131540x5c453f62q6f38b604ad152ab1@mail.gmail.com>

I prefer using partial than introducing new syntax:

print_to_stderr = functools.partial(print, file=sys.stderr)

print_to_stderr('first line')
print_to_stderr('second line')
...

- Tal

On Thu, Mar 13, 2008 at 10:56 PM, Erick Tryzelaar <
idadesub at users.sourceforge.net> wrote:

> This might be a minor thing, but I kind of wish that I could write this:
>
> sys.stderr.print('first line')
> sys.stderr.print('another line here')
> sys.stderr.print('and again')
>
> instead of:
>
> print('first line', file=sys.stderr)
> print('another line here', file=sys.stderr)
> print('and again', file=sys.stderr)
>
> As it's a lot easier to read for me. Of course you can always add
> spaces to make the lines line up, but with a long print statement your
> eye has to go a long distance to figure out what file, if any, you're
> printing to. It could be pretty simple to add:
>
> class ...:
>  def print(self, *args, **kwargs):
>    io.print(*args, file=self, **kwargs)
>
> I haven't been able to find any discussion on this, has this already
> been rejected?
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From dickinsm at gmail.com  Fri Mar 14 02:01:10 2008
From: dickinsm at gmail.com (Mark Dickinson)
Date: Thu, 13 Mar 2008 21:01:10 -0400
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <47D9A83B.1050908@gmail.com>
References: <47D8E3E0.7010509@gmail.com>
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
	<47D943B7.60405@gmail.com>
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>
	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>
	<47D97494.7040103@gmail.com> <47D9A46B.2080900@canterbury.ac.nz>
	<47D9A83B.1050908@gmail.com>
Message-ID: <5c6f2a5d0803131801j3441d619p6004be30252b6b8a@mail.gmail.com>

On Thu, Mar 13, 2008 at 6:18 PM, Imri Goldberg <lorgandon at gmail.com> wrote:

> This makes changing == behavior to an epsilon comparison more involved.
> I still think it is feasible, but will require much more consideration.
>

Okay, now I am going to be negative. :-)

I really think that there's essentially zero chance of == and != ever
changing
to 'fuzzy' comparisons in Python.  I don't want to discourage you from
working
out possible details as an academic exercise, or perhaps with some other
(Python-like?) language in mind, but I just don't see it ever happening in
Python.
Maybe I'm wrong, in which case I hope other python people will tell me so,
but I think pursuing this is, in the end, going to be a waste of time.

Some reasons, and then I'll shut up:

Too much complication and magic implicit stuff going on
behind the scenes.  In a fuzzy a == b there are hidden choices about the
fuzziness scheme and the amount of fuzz to allow, and those choices
are going to confuse the hell out of newbie and expert programmers alike.

As above, you'd have to choose defaults for the fuzziness, and by Murphy's
Law those defaults would be wrong for almost everybody else's particular
applications, meaning that almost everybody else would have to go away
and learn about how to change or turn off the fuzziness.

Fundamental and well-understood laws (trichotomy, transitivity of equality)
would break.  It's really unclear how the other comparison operators
would be affected.  If 1.0 == 1.0+2e-16 returns True, shouldn't
1.0 >= 1.0+2e-16 also return True?

Containers would be affected in peculiar ways.  I think people would be
really surprised to find that 1.0+2e-16 *was* an element of the set {1.0},
or that 1.0 and 1.0+2e-16 weren't allowed to be different keys in a dict.
And how on earth do you check for set or dict membership under the
hood?
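For concreteness, today's exact-equality behavior is easy to check (the values below are just the example from above):

```python
# Under today's exact float equality, nearby doubles are distinct
# set members and distinct dict keys.
a = 1.0
b = 1.0 + 2e-16   # rounds up to the next representable double above 1.0

print(b in {a})                    # False: b is not exactly 1.0
print(len({a: "one", b: "near"}))  # 2: they hash and compare as different keys
```

Any fuzzy == would have to decide what these containers should do instead.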

I don't know of any other language that has successfully done this, even
though I've seen the idea floated many times for different languages.
That doesn't mean much, since I only know a small handful of the many
hundreds (thousands?) of languages out there.  If you know a
counterexample, I'd be interested to hear it.

Mark

From ntoronto at cs.byu.edu  Fri Mar 14 08:29:45 2008
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Fri, 14 Mar 2008 01:29:45 -0600
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <47D9E7E6.9090401@cos.ufrj.br>
References: <47D8E3E0.7010509@gmail.com> <frc5d6$ueh$1@ger.gmane.org>
	<47D9E7E6.9090401@cos.ufrj.br>
Message-ID: <47DA2969.9090903@cs.byu.edu>

Tiago A.O.A. wrote:
> I would suggest something like a ~= b, for "approximately equal to". How 
> approximately? Well, there would be a default that could be changed 
> somewhere.
> 
> Don't know if it's all that useful, though.

Don't forget a !~ b, a <~ b, and a >~ b, and the associated __sim__, 
__nsim__, __ltsim__, and __gtsim__ slots.

I'm not at all sure how serious I am right now. It's late, and I have 
fuzzy recollections of how those kinds of things might have been nice in 
some past numerical code.

And then =~ and !~ could be defined for strings and do regular 
expression matching! Woo! More operators! With pronouns!
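(None of these operators exist, of course. For illustration, the comparison a hypothetical a ~= b would have to perform can be sketched as a plain function; the tolerance parameters here are invented, not any real API:)

```python
def approx_eq(a, b, rel_tol=1e-9, abs_tol=0.0):
    # True if a and b agree to within a relative or absolute tolerance --
    # roughly what an a ~= b operator would compute behind the scenes.
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

print(1.0 == 1.0 + 1e-12)           # False: exact comparison
print(approx_eq(1.0, 1.0 + 1e-12))  # True: within the default tolerance
```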

Neil


From lorgandon at gmail.com  Fri Mar 14 10:01:28 2008
From: lorgandon at gmail.com (Imri Goldberg)
Date: Fri, 14 Mar 2008 11:01:28 +0200
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <5c6f2a5d0803131801j3441d619p6004be30252b6b8a@mail.gmail.com>
References: <47D8E3E0.7010509@gmail.com>	
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>	
	<47D943B7.60405@gmail.com>	
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>	
	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>	
	<47D97494.7040103@gmail.com> <47D9A46B.2080900@canterbury.ac.nz>	
	<47D9A83B.1050908@gmail.com>
	<5c6f2a5d0803131801j3441d619p6004be30252b6b8a@mail.gmail.com>
Message-ID: <47DA3EE8.7060902@gmail.com>


Mark Dickinson wrote:

> On Thu, Mar 13, 2008 at 6:18 PM, Imri Goldberg <lorgandon at gmail.com 
> <mailto:lorgandon at gmail.com>> wrote:
>
>     This makes changing == behavior to an epsilon comparison more
>     involved.
>     I still think it is feasible, but will require much more
>     consideration.
>
>
> Okay, now I am going to be negative. :-)
>
> I really think that there's essentially zero chance of == and != ever 
> changing
> to 'fuzzy' comparisons in Python.  I don't want to discourage you from 
> working
> out possible details as an academic exercise, or perhaps with some other
> (Python-like?) language in mind, but I just don't see it ever 
> happening in Python.
> Maybe I'm wrong, in which case I hope other python people will tell me so,
> but I think pursuing this is, in the end, going to be a waste of time.
>

Alright, I agree it's a good idea to drop the proposal to change 
floating point == into an epsilon compare.
What about issuing a warning though?
Consider the following course of action. It is the one with the least 
changes:

== for regular floating point numbers now issues a warning, but still 
works. This warning might be turned off. All other operators are left 
unchanged.

Do you think this should be dropped as well?
Just for my own code, I think I'd like this behavior. I still consider 
floating point == a potential bug, and this helps me catch it, in the 
absence of the 'ideal static checker'.


> Containers would be affected in peculiar ways.  I think people would be
> really surprised to find that 1.0+2e-16 *was* an element of the set {1.0},
> or that 1.0 and 1.0+2e-16 weren't allowed to be different keys in a dict.
> And how on earth do you check for set or dict membership under the 
> hood?
>

I think that right now containers already behave in peculiar ways when 
used with FP numbers.
Take set, for example -- you might as well just use a list instead.
And with a dict, doing d[x] might not return the result you 
actually want.

> I don't know of any other language that has successfully done this, even
> though I've seen the idea floated many times for different languages.
> That doesn't mean much, since I only know a small handful of the many
> hundreds (thousands?) of languages out there.  If you know a
> counterexample, I'd be interested to hear it.
>
> Mark
Don't know of a good counterexample. I agree that before changing the 
behavior of == to fuzzy comparison, you'll want  experience with that 
kind of change.

Cheers,
Imri


-------------------------
Imri Goldberg
www.algorithm.co.il/blogs
www.imri.co.il
-------------------------
Insert Signature Here
-------------------------




From aaron.watters at gmail.com  Fri Mar 14 15:40:58 2008
From: aaron.watters at gmail.com (Aaron Watters)
Date: Fri, 14 Mar 2008 10:40:58 -0400
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
Message-ID: <fc13a6500803140740x6d1609f3u5a27c0cad1df4498@mail.gmail.com>

For systems programming I often use
floats as timestamps in dictionaries,
and in this case I never do calculations
and all I care about is "same" or "different",
meaning "any single bit difference".

If you change the way == works and
also follow through and change the way
floats in dictionaries work you would
probably break very many applications
like this.  I think any "almost equal"
should be implemented using a new
method or syntax x~=y rather than
break things.
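A minimal sketch of the pattern Aaron describes (the timestamp values are invented for illustration), which relies on bit-exact == and hashing:

```python
# Float timestamps as dict keys: only "same bits" should collide.
events = {}
t1 = 1205500000.123456
t2 = 1205500000.123457   # differs only in the last decimal digit
events[t1] = "event A"
events[t2] = "event B"

print(len(events))       # 2: a fuzzy == could collapse these into one entry
```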

   - Aaron Watters

http://www.xfeedme.com/nucular/pydistro.py/go?FREETEXT=stdio+stinks

From dickinsm at gmail.com  Fri Mar 14 16:00:37 2008
From: dickinsm at gmail.com (Mark Dickinson)
Date: Fri, 14 Mar 2008 11:00:37 -0400
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <47DA3EE8.7060902@gmail.com>
References: <47D8E3E0.7010509@gmail.com>
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
	<47D943B7.60405@gmail.com>
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>
	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>
	<47D97494.7040103@gmail.com> <47D9A46B.2080900@canterbury.ac.nz>
	<47D9A83B.1050908@gmail.com>
	<5c6f2a5d0803131801j3441d619p6004be30252b6b8a@mail.gmail.com>
	<47DA3EE8.7060902@gmail.com>
Message-ID: <5c6f2a5d0803140800h37310696rdeb3cfb17603d458@mail.gmail.com>

On Fri, Mar 14, 2008 at 5:01 AM, Imri Goldberg <lorgandon at gmail.com> wrote:

> Alright, I agree it's a good idea to drop the proposal to changing
> floating point == into an epsilon compare.
> What about issuing a warning though?
> Consider the following course of action. It is the one with the least
> changes:
>
> == for regular floating point numbers now issues a warning, but still
> works. This warning might be turned off. All other operators are left
> unchanged.


> Do you think this should be dropped as well?


To be honest, yes.  There isn't currently a SmellyCodeWarning or
IsThatReallyWhatYouMeanWarning in Python, and there doesn't
seem to be a lot of precedent for warning on code constructs that
may often be wrong but also have legitimate uses.  Most of
the current warnings have more to do with syntactic or semantic
changes between various versions of Python.

But I think it would be entirely appropriate to warn about
floating-point (in)equality checks in something like PyChecker
or Pylint, if you can get past the technical difficulties of detecting
floating-point comparisons statically.

 Mark

From janssen at parc.com  Fri Mar 14 17:25:13 2008
From: janssen at parc.com (Bill Janssen)
Date: Fri, 14 Mar 2008 09:25:13 PDT
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <5c6f2a5d0803140800h37310696rdeb3cfb17603d458@mail.gmail.com> 
References: <47D8E3E0.7010509@gmail.com>
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
	<47D943B7.60405@gmail.com>
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>
	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>
	<47D97494.7040103@gmail.com> <47D9A46B.2080900@canterbury.ac.nz>
	<47D9A83B.1050908@gmail.com>
	<5c6f2a5d0803131801j3441d619p6004be30252b6b8a@mail.gmail.com>
	<47DA3EE8.7060902@gmail.com>
	<5c6f2a5d0803140800h37310696rdeb3cfb17603d458@mail.gmail.com>
Message-ID: <08Mar14.092522pdt."58696"@synergy1.parc.xerox.com>

Mark Dickinson writes:
> There isn't currently a SmellyCodeWarning ... in Python

Though, clearly, that's what DeprecationWarning should immediately be
renamed to :-).

Bill


From leszek at dubiel.pl  Thu Mar 13 16:49:45 2008
From: leszek at dubiel.pl (Leszek Dubiel)
Date: Thu, 13 Mar 2008 16:49:45 +0100
Subject: [Python-ideas] One-element tuple
Message-ID: <47D94D19.7080806@dubiel.pl>

An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080313/977ac338/attachment.html>

From leszek at dubiel.pl  Thu Mar 13 17:35:03 2008
From: leszek at dubiel.pl (Leszek Dubiel)
Date: Thu, 13 Mar 2008 17:35:03 +0100
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <47D943B7.60405@gmail.com>
References: <47D8E3E0.7010509@gmail.com>	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
	<47D943B7.60405@gmail.com>
Message-ID: <47D957B7.6060200@dubiel.pl>


>>     My suggestion is to do either of the following:
>>     1. Change floating point == to behave like a valid floating point
>>     comparison. That means using precision and some error measure
>>     
There are two ways:

1. Python users have to know that the representation of floats has some 
problems.

2. Python users must not have to care about the internal float 
representation.

Solution 2 is not good, because someday somebody will complain that 
computer calculations are not accurate (some scientist who was unwilling 
to learn how computers store floats).

It is better to choose 1 -- beginners will have to accept that a 
computer is not able to store every real number, because floats are 
stored as binary numbers. Maybe operator "==" for floats should be 
deprecated, and people should use something like "!~" or "=~", and they 
should be able to set the precision for float numbers?
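The standard demonstration of point 1 -- that binary floats cannot represent every decimal exactly:

```python
# 0.1 has no exact binary representation, so repeated addition drifts.
total = 0.0
for _ in range(10):
    total += 0.1

print(total == 1.0)   # False
print(total)          # 0.9999999999999999
```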





From veloso at verylowsodium.com  Fri Mar 14 18:56:00 2008
From: veloso at verylowsodium.com (Greg Falcon)
Date: Fri, 14 Mar 2008 13:56:00 -0400
Subject: [Python-ideas] One-element tuple
In-Reply-To: <47D94D19.7080806@dubiel.pl>
References: <47D94D19.7080806@dubiel.pl>
Message-ID: <3cdcefb80803141056w65aabbb0jd878f0830e0a2648@mail.gmail.com>

On 3/13/08, Leszek Dubiel <leszek at dubiel.pl> wrote:
>  I would suggest to deprecate one-element tuple construction with a comma at
> the end, because it looks ugly, is not self-evident for other people
> reading code, and looks like some type of trickery.

Notice that nearly all reasonable programming languages allow for an
extra trailing comma in a comma-delimited list, for consistency and to
make programmatic code generation easier.  So for example, [1,2,3,] is
intentionally correct.

Once you accept (1,2,3,) as a reasonable tuple, you'll see that
disallowing or deprecating (1,) is wrong, too.

> >>> tuple(['hello'])
>  ('hello',)

Tuples are a fundamental data type, and it would be irresponsible to
steer beginners away from their simple literal syntax.

It's mildly unfortunate that (1,2,) (1,2) (1,) and () represent tuples
but (1) doesn't.  But this rule is simple, well motivated, and
described quite straightforwardly at the very page you link to.
Better to leave it as is, so beginners can learn the rule, accept it,
and move on.
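The whole rule fits in four lines -- it is the comma, not the parentheses, that makes a tuple:

```python
print((1, 2, 3) == (1, 2, 3,))  # True: a trailing comma is allowed, ignored
print(type((1,)))               # tuple: the comma makes it a tuple
print(type((1)))                # int: just a parenthesized expression
print(() == tuple())            # True: the empty tuple is the one exception
```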

Greg F


From guido at python.org  Fri Mar 14 19:31:44 2008
From: guido at python.org (Guido van Rossum)
Date: Fri, 14 Mar 2008 13:31:44 -0500
Subject: [Python-ideas] py3k: adding "print" methods to file-like objects
In-Reply-To: <1ef034530803131356m3320cd93v8fbfb5ceca464640@mail.gmail.com>
References: <1ef034530803131356m3320cd93v8fbfb5ceca464640@mail.gmail.com>
Message-ID: <ca471dc20803141131n2fb697cco6b272413813482ae@mail.gmail.com>

On Thu, Mar 13, 2008 at 3:56 PM, Erick Tryzelaar
<idadesub at users.sourceforge.net> wrote:
> This might be a minor thing, but I kind of wish that I could write this:
>
>  sys.stderr.print('first line')
>  sys.stderr.print('another line here')
>  sys.stderr.print('and again')
>
>  instead of:
>
>  print('first line', file=sys.stderr)
>  print('another line here', file=sys.stderr)
>  print('and again', file=sys.stderr)
>
>  As it's a lot easier to read for me. Of course you can always add
>  spaces to make the lines line up, but with a long print statement your
>  eye has to go a long distance to figure out what file, if any, you're
>  printing to. It could be pretty simple to add:
>
>  class ...:
>   def print(*args, **kwargs):
>     io.print(file=self, *args, **kwargs)
>
>  I haven't been able to find any discussion on this, has this already
>  been rejected?

It was brought up, considered, and rejected. The reason is that it
would require *every* stream-like object to implement the print()
functionality, which is rather hairy; or to subclass a specific base
class, which we traditionally haven't required. Making it a function
that takes a file argument avoids these problems.

And, by the way, it's too late to bring up new py3k proposals.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From Leszek.Dubiel at dubielvitrum.pl  Wed Mar 12 13:33:15 2008
From: Leszek.Dubiel at dubielvitrum.pl (Leszek Dubiel)
Date: Wed, 12 Mar 2008 13:33:15 +0100
Subject: [Python-ideas] One-element tuple
Message-ID: <47D7CD8B.2010206@dubielvitrum.pl>

An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080312/fc095f44/attachment.html>

From greg at krypto.org  Fri Mar 14 21:46:43 2008
From: greg at krypto.org (Gregory P. Smith)
Date: Fri, 14 Mar 2008 15:46:43 -0500
Subject: [Python-ideas] One-element tuple
In-Reply-To: <47D7CD8B.2010206@dubielvitrum.pl>
References: <47D7CD8B.2010206@dubielvitrum.pl>
Message-ID: <52dc1c820803141346tfbf99cnca6911e5ecafae11@mail.gmail.com>

-1 on deprecating the syntax.  The tuple(['hello']) syntax is much, much
slower -- a factor of 20x here.  Trailing commas when you only have one item
are how Python tuple syntax is defined, allowing tuples to use ()s instead
of needing other tokens.

gps

On 3/12/08, Leszek Dubiel <Leszek.Dubiel at dubielvitrum.pl> wrote:
>
>
>
> I would suggest to deprecate one-element tuple construction with a comma at
> the end, because it looks ugly, is not self-evident for other people
> reading code, and looks like some type of trickery.
>
> I would prefer the tutorial (
> http://docs.python.org/dev/3.0/tutorial/datastructures.html#tuples-and-sequences)
> to use, instead of
>
>  >>> ('hello',)
> ('hello',)
>
>
>
> this syntax:
>
>  >>> tuple(['hello'])
> ('hello',)
>
> .
>
>
>
> PS.
>
> Functions set(), tuple(), list() and dict() are good!
>
> Syntax
>
>     myset = {'a', 'b'}
>
> is absolutely perfect too!
>
>
>
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>

From greg.ewing at canterbury.ac.nz  Sat Mar 15 00:19:20 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 15 Mar 2008 12:19:20 +1300
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <47DA2969.9090903@cs.byu.edu>
References: <47D8E3E0.7010509@gmail.com> <frc5d6$ueh$1@ger.gmane.org>
	<47D9E7E6.9090401@cos.ufrj.br> <47DA2969.9090903@cs.byu.edu>
Message-ID: <47DB07F8.4010900@canterbury.ac.nz>

Neil Toronto wrote:

> Don't forget a !~ b, a <~ b, and a >~ b, and the associated __sim__, 
> __nsim__, __ltsim__, and __gtsim__ slots.

I think that all of these are a bad idea. In my experience,
when comparing with a tolerance, you need to think carefully
about what the appropriate tolerance is for each and every
comparison. Having a global default tolerance would just
lead people to write sloppy and unreliable numerical code.

-- 
Greg


From greg.ewing at canterbury.ac.nz  Sat Mar 15 00:34:12 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 15 Mar 2008 12:34:12 +1300
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <47DA3EE8.7060902@gmail.com>
References: <47D8E3E0.7010509@gmail.com>
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
	<47D943B7.60405@gmail.com>
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>
	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>
	<47D97494.7040103@gmail.com> <47D9A46B.2080900@canterbury.ac.nz>
	<47D9A83B.1050908@gmail.com>
	<5c6f2a5d0803131801j3441d619p6004be30252b6b8a@mail.gmail.com>
	<47DA3EE8.7060902@gmail.com>
Message-ID: <47DB0B74.6030709@canterbury.ac.nz>

Imri Goldberg wrote:

> == for regular floating point numbers now issues a warning, but still 
> works. This warning might be turned off.

I think I would find it annoying to have to disable a warning
whenever I legitimately wanted to do a floating ==.

Also, having a global warning/no warning setting for the
whole program isn't really right -- whether a floating == is
legitimate is something that needs to be decided on a
case-by-case basis.

-- 
Greg


From greg.ewing at canterbury.ac.nz  Sat Mar 15 01:27:32 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 15 Mar 2008 13:27:32 +1300
Subject: [Python-ideas] py3k: adding "print" methods to file-like objects
In-Reply-To: <ca471dc20803141131n2fb697cco6b272413813482ae@mail.gmail.com>
References: <1ef034530803131356m3320cd93v8fbfb5ceca464640@mail.gmail.com>
	<ca471dc20803141131n2fb697cco6b272413813482ae@mail.gmail.com>
Message-ID: <47DB17F4.7050602@canterbury.ac.nz>

> On Thu, Mar 13, 2008 at 3:56 PM, Erick Tryzelaar
> <idadesub at users.sourceforge.net> wrote:
 >
>> instead of:
>>
>> print('first line', file=sys.stderr)
>> print('another line here', file=sys.stderr)
>> print('and again', file=sys.stderr)

Perhaps it would help if there were a function

   def fprint(f, *args):
       print(*args, file=f)

then the above could be written

   fprint(sys.stderr, 'first line')
   fprint(sys.stderr, 'another line here')
   fprint(sys.stderr, 'and again')

which to me is a lot easier to read, since the file argument
is in a consistent place, making it easier to see that it's
the same from one line to the next.

Also, it enables making the file argument very abbreviated, e.g.

   f = sys.stdout
   fprint(f, 'first line')
   fprint(f, 'another line here')
   fprint(f, 'and again')

Otherwise, the shortest you can get it down to is 'file=f',
which is 6 times as long. It might not seem like much, but that's
5 fewer characters of print arguments that you can fit in
without having to split the line.

-- 
Greg


From idadesub at users.sourceforge.net  Sat Mar 15 02:39:43 2008
From: idadesub at users.sourceforge.net (Erick Tryzelaar)
Date: Fri, 14 Mar 2008 18:39:43 -0700
Subject: [Python-ideas] py3k: adding "print" methods to file-like objects
In-Reply-To: <47DB17F4.7050602@canterbury.ac.nz>
References: <1ef034530803131356m3320cd93v8fbfb5ceca464640@mail.gmail.com>
	<ca471dc20803141131n2fb697cco6b272413813482ae@mail.gmail.com>
	<47DB17F4.7050602@canterbury.ac.nz>
Message-ID: <1ef034530803141839v6c9bd4c8t8ea54ffa41df17d3@mail.gmail.com>

On Fri, Mar 14, 2008 at 5:27 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>  Also, it enables making the file argument very abbreviated, e.g.
>
>    f = sys.stdout
>    fprint(f, 'first line')
>    fprint(f, 'another line here')
>    fprint(f, 'and again')
>
>  Otherwise, the shortest you can get it down to is 'file=f',
>  which is 6 times as long. It might not seem much, but that's
>  5 less characters of print arguments that you can fit in
>  without having to split the line.

In that case I think partial is a better option; you could do:

from functools import partial

p = partial(print, file=sys.stderr)
p('first line')
p('another line here')
p('and again')

I completely forgot about partial which does a good job of filling in
for what I wanted. I just need to consider the combination of the py3k
stuff a bit more.


From larry at hastings.org  Sat Mar 15 06:35:57 2008
From: larry at hastings.org (Larry Hastings)
Date: Sat, 15 Mar 2008 00:35:57 -0500
Subject: [Python-ideas] Python Pragmas
Message-ID: <47DB603D.2040907@hastings.org>



Recently-ish on c.l.py3k (iirc) folks were discussing how to write a 
script that exited with a human-friendly warning message if run under an 
incompatible version of the language.  The problem with this code:

     import sys
     if sys.version_info < (3, 0): sys.exit("Sorry, this script needs Python 3000")

is that the code only executes once tokenization is finished--if your 
script uses any incompatible syntax, it will fail in the tokenizer, most 
likely with an error message that doesn't make it particularly clear 
what is going on.


After thinking about the problem for a while, it hit me--this is best 
expressed as a "pragma".  For Python's purposes, I would define a 
"pragma" as an instruction to the tokenizer / compiler, executed 
immediately upon its complete tokenization.  The use case here is

     pragma version >= 3 # python version must be at least 3.0

Again, this would be executed immediately, aborting before the tokenizer 
has a chance to see some old syntax it didn't like.


What else might we use "pragma" for?  Well, consider that Python already 
has two specialized syntaxes that are really pragmas: "from __future__ 
import" and "# -*- coding: ".  I think this functionality would be more 
clearly expressed with a "pragma" syntax, for example:

     pragma encoding latin-1
     pragma enable floatdivision

It's a matter of taste, but I've never liked it when languages hide 
important directives in comments--isn't the compiler supposed to 
*ignore* comments?--nor do I like how "from __future__ import" doesn't 
really have anything to do with importing modules.  Your tastes may vary.


There was some discussion back in 2000 about adding a "pragma" to the
language:

     http://www.python.org/dev/summary/2000-08-2/

It sounds like GvR wasn't wholly against the idea:

     http://mail.python.org/pipermail/python-dev/2000-August/008840.html

But nothing seems to have come of it.  The discussion died out in early
September of 2000, and I didn't find any subsequent revivals.

There was some worry back then that pragmas would be a slippery slope, 
resulting in increasingly elaborate pragma syntaxes until 
we--shudder!--wake up one day and have preprocessor macros.  I agree 
that we don't want to go too far down this slippery slope.  I have some 
specific suggestions on how we could obviate the temptation, but they 
are predicated on having pragmas at all, so I might as well keep quiet 
until such point as pragmas get traction.


If you'd like to discuss this in person--or just give me a good hard 
slap for even suggesting it--I'm bouncing around PyCon until Wednesday 
afternoon.  I'm the guy with the Facebook logos plastered around his person.


Cheers,


/larry/


From greg at krypto.org  Sat Mar 15 07:15:48 2008
From: greg at krypto.org (Gregory P. Smith)
Date: Sat, 15 Mar 2008 01:15:48 -0500
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <47DB07F8.4010900@canterbury.ac.nz>
References: <47D8E3E0.7010509@gmail.com> <frc5d6$ueh$1@ger.gmane.org>
	<47D9E7E6.9090401@cos.ufrj.br> <47DA2969.9090903@cs.byu.edu>
	<47DB07F8.4010900@canterbury.ac.nz>
Message-ID: <52dc1c820803142315i23fd86a4q8ca2b4b6a29c1bf3@mail.gmail.com>

On 3/14/08, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>
> Neil Toronto wrote:
>
> > Don't forget a !~ b, a <~ b, and a >~ b, and the associated __sim__,
> > __nsim__, __ltsim__, and __gtsim__ slots.
>
>
> I think that all of these are a bad idea. In my experience,
> when comparing with a tolerance, you need to think carefully
> about what the appropriate tolerance is for each and every
> comparison. Having a global default tolerance would just
> lead people to write sloppy and unreliable numerical code.


Agreed -- no quick "fix" for float imprecision is going to make life better
for programmers beyond the first week.  Floats are imprecise; the sooner
programmers learn that, the better.  If you want things that can be compared
without thinking, use a decimal and avoid irrational numbers. Good luck. ;)

Though I don't use them myself, I believe the popular math language packages
like Matlab and Mathematica may even allow you to compute all values with a
second error/precision component that gets updated properly based on the
computations being done, so that you know the accuracy of your result
without having to calculate accuracy manually at every step based on the
algorithm and the order of floating point operations used.  (If not, it's an
interesting idea and could be fleshed out as a pure Python object
implementation by someone who cares about these things, to see if enough
people find it useful.)

-gps

From g.brandl at gmx.net  Sat Mar 15 08:36:40 2008
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 15 Mar 2008 08:36:40 +0100
Subject: [Python-ideas] Python Pragmas
In-Reply-To: <47DB603D.2040907@hastings.org>
References: <47DB603D.2040907@hastings.org>
Message-ID: <frfu59$33l$1@ger.gmane.org>

Larry Hastings schrieb:

> There was some discussion back in 2000 about adding a "pragma" to the
> language:
> 
>      http://www.python.org/dev/summary/2000-08-2/
> 
> It sounds like GvR wasn't wholly against the idea:
> 
>      http://mail.python.org/pipermail/python-dev/2000-August/008840.html
> 
> But nothing seems to have come of it.  The discussion died out in early
> September of 2000, and I didn't find any subsequent revivals.

There is the "directive" PEP, which was rejected:

http://www.python.org/dev/peps/pep-0244/

Georg



From db3l.net at gmail.com  Sat Mar 15 19:59:07 2008
From: db3l.net at gmail.com (David Bolen)
Date: Sat, 15 Mar 2008 14:59:07 -0400
Subject: [Python-ideas] Python Pragmas
References: <47DB603D.2040907@hastings.org>
Message-ID: <m2ve3nq0ec.fsf@valheru.db3l.homeip.net>

Larry Hastings <larry at hastings.org> writes:

> Recently-ish on c.l.py3k (iirc) folks were discussing how to write a 
> script that exited with a human-friendly warning message if run under an 
> incompatible version of the language.  The problem with this code:
>
>      import sys
>      if sys.version_info < (3, 0): sys.exit("Sorry, this script needs Python 3000")
>
> is that the code only executes once tokenization is finished--if your 
> script uses any incompatible syntax, it will fail in the tokenizer, most 
> likely with an error message that doesn't make it particularly clear 
> what is going on.

Personally, in scenarios where I'm worried about that, I just make my
entry point script a thin one with code suitable for any releases I'm
worried about, and only import the main script once the version checks
have passed.  It also permits conditional importing of a version-specific
script when that's appropriate.

This avoids the need to introduce execution into the tokenizer, and the
need to worry about all the possible types of comparisons you might want
to make (which would then need to be supported by such execution).
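David's wrapper approach, sketched (the module name below is hypothetical): the entry script uses only syntax valid in every version it might run under, and imports the real code only after the check passes.

```python
# Thin entry-point sketch: keep this file free of any new syntax, and
# import the real (possibly syntax-incompatible) module only after the
# version check passes.
import sys

def require_version(minimum=(3, 0)):
    # Exit with a friendly message before any incompatible syntax is parsed.
    if sys.version_info < minimum:
        sys.exit("Sorry, this script needs Python %d.%d or later" % minimum)

require_version()
# import app_main     # hypothetical module containing the actual program
# app_main.main()
```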

-- David



From lorgandon at gmail.com  Mon Mar 17 00:13:25 2008
From: lorgandon at gmail.com (Imri Goldberg)
Date: Mon, 17 Mar 2008 01:13:25 +0200
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <5c6f2a5d0803140800h37310696rdeb3cfb17603d458@mail.gmail.com>
References: <47D8E3E0.7010509@gmail.com>	
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>	
	<47D943B7.60405@gmail.com>	
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>	
	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>	
	<47D97494.7040103@gmail.com> <47D9A46B.2080900@canterbury.ac.nz>	
	<47D9A83B.1050908@gmail.com>	
	<5c6f2a5d0803131801j3441d619p6004be30252b6b8a@mail.gmail.com>	
	<47DA3EE8.7060902@gmail.com>
	<5c6f2a5d0803140800h37310696rdeb3cfb17603d458@mail.gmail.com>
Message-ID: <47DDA995.3090906@gmail.com>

I've given it more thought over the past few days.


Given the discussion here, and some more reading on my part, it seems to 
me that there isn't much chance of me convincing anyone to raise an 
exception on FP ==. I'm not too sure that it's the right move anyway. 
While I'll probably avoid FP == in my own code, it seems to me that there 
are some cases where it is useful (even given the inaccuracy of the results).


Regarding adding warnings to pychecker/pylint, I think it's a good idea. 
Probably for another mailing list though :).


Also, I considered the subject of runtime warnings as well.

Adding the relevant warnings to any static checker could be really hard 
work, while warning at runtime could be a lot easier. Therefore, it 
seems worthwhile to consider this option. I hadn't used the 
warnings module before, so I have now read its documentation (and the PEP) 
and played with it a little.


First, if a warning is generated for floating point ==, it can be turned 
off globally, or on a line-by-line basis.
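The mechanics are already there: a warning category (FloatEqualityWarning is invented here; no such class exists in Python) can be silenced globally with one filter call, or per-module/per-line via the same machinery:

```python
import warnings

class FloatEqualityWarning(UserWarning):
    """Hypothetical category for float == float comparisons."""

def checked_eq(a, b):
    # Stand-in for what the interpreter might emit on a float == float.
    warnings.warn("== between floats", FloatEqualityWarning, stacklevel=2)
    return a == b

warnings.simplefilter("ignore", FloatEqualityWarning)  # turn it off globally
print(checked_eq(0.1 + 0.2, 0.3))  # False, and no warning is shown
```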

Second, regarding Mark's comment on SmellyCodeWarning: I thought about 
it a bit, and it seems like no joke to me. gcc has a -Wall mode, and so 
does Python. Why not use it in this situation (i.e. having some warnings 
not displayed by default)?


I think it would be interesting to consider more cases of 
'SmellyCodeWarning' in general, and adding them under some warning 
category. If there's a need for a use case, we've already got the first 
one - floating point comparisons.


Cheers,

Imri.

-------------------------
Imri Goldberg
www.algorithm.co.il/blogs
www.imri.co.il
-------------------------
Insert Signature Here
-------------------------



Mark Dickinson wrote:

> On Fri, Mar 14, 2008 at 5:01 AM, Imri Goldberg <lorgandon at gmail.com 
> <mailto:lorgandon at gmail.com>> wrote:
>
>     Alright, I agree it's a good idea to drop the proposal to changing
>     floating point == into an epsilon compare.
>     What about issuing a warning though?
>     Consider the following course of action. It is the one with the least
>     changes:
>
>     == for regular floating point numbers now issues a warning, but still
>     works. This warning might be turned off. All other operators are left
>     unchanged.
>
>
>     Do you think this should be dropped as well?
>
>
> To be honest, yes.  There isn't currently a SmellyCodeWarning or
> IsThatReallyWhatYouMeanWarning in Python, and there doesn't
> seem to be a lot of precedent for warning on code constructs that
> may often be wrong but also have legitimate uses.  Most of
> the current warnings have more to do with syntactic or semantic
> changes between various versions of Python.
>
> But I think it would be entirely appropriate to warn about
> floating-point (in)equality checks in something like PyChecker
> or Pylint, if you can get past the technical difficulties of detecting
> floating-point comparisons statically.
>
>  Mark


From leszek at dubiel.pl  Mon Mar 17 09:56:35 2008
From: leszek at dubiel.pl (Leszek Dubiel)
Date: Mon, 17 Mar 2008 09:56:35 +0100
Subject: [Python-ideas] Faq 4.28 Suggestion -- Trailing comas
Message-ID: <47DE3243.5040509@dubiel.pl>



I would suggest adding question 4.28 to the FAQ. Everybody who learns 
Python reads that and will not ask questions about "one-element tuples". 
I have compiled the answer from responses to my last question about 
one-element tuples.



Question: Why does Python allow a comma at the end of a list? This looks 
ugly and seems to break common rules...

Answer: There are many reasons, which follow.

1. If you define a multiline dictionary

        d = {       
            "A": [1, 5],
            "B": [6, 7],  # last trailing comma is optional but good style
        }

it is easier to add more elements, because you don't have to care 
about commas -- you always put a comma at the end of the line and never 
have to re-edit other lines. It eases sorting of such lines too -- just 
cut a line and paste it above another.

2. A missing comma can lead to errors that are hard to diagnose. For example:

        x = [
          "fee",
          "fie"
          "foo",
          "fum"
        ]

contains three elements: "fee", "fiefoo" and "fum", because adjacent 
string literals are concatenated. So a programmer who always puts a 
comma at the end of the line saves a lot of trouble in the future.
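The pitfall is easy to verify interactively -- the list really does end up with three elements:

```python
x = [
    "fee",
    "fie"   # <- missing comma: "fie" "foo" merge into one string
    "foo",
    "fum",
]
assert x == ["fee", "fiefoo", "fum"]
assert len(x) == 3
```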


3. Nearly all reasonable programming languages (C, C++, Java) allow for an
extra trailing comma in a comma-delimited list, for consistency and to
make programmatic code generation easier. So for example [1,2,3,] is
intentionally correct.


4. Creating one-element tuples using the tuple(['hello']) syntax is much
slower (a factor of 20x here) than writing just ('hello',). Trailing
commas are how Python's tuple syntax handles the one-element case,
letting tuples be written with commas instead of needing other tokens.
If Python didn't allow a comma at the end of a tuple, you would have to
use the slow syntax.


5. The same rule applies to other types of lists, where the delimiter can
occur at the end. For example, both the strings "alfa\nbeta\n" and
"alfa\nbeta" contain two lines.
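The string analogy can be checked directly: splitlines() treats a trailing newline as a line terminator, not as the start of an empty third line.

```python
assert "alfa\nbeta\n".splitlines() == ["alfa", "beta"]
assert "alfa\nbeta".splitlines() == ["alfa", "beta"]

# split("\n"), by contrast, treats the final newline as a separator
# and yields a trailing empty string:
assert "alfa\nbeta\n".split("\n") == ["alfa", "beta", ""]
```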



Sources:

-- http://mail.python.org/pipermail/python-list/2003-October/231419.html
-- http://mail.python.org/pipermail/python-ideas/2008-March/001478.html
-- http://mail.python.org/pipermail/python-ideas/2008-March/001475.html





From szekeres at iii.hu  Mon Mar 17 11:17:26 2008
From: szekeres at iii.hu (=?UTF-8?Q?Szekeres_Istv=C3=A1n?=)
Date: Mon, 17 Mar 2008 11:17:26 +0100
Subject: [Python-ideas] Faq 4.28 Suggestion -- Trailing comas
In-Reply-To: <47DE3243.5040509@dubiel.pl>
References: <47DE3243.5040509@dubiel.pl>
Message-ID: <f4fe1c2a0803170317q6e15f4f6s97ad3b4a2a686209@mail.gmail.com>

>  3. Creating one-element tuples using tuple(['hello']) syntax is much much
>  slower (a factor of 20x here) then writing just ['hello', ].

That should be ('hello',)


From jimjjewett at gmail.com  Tue Mar 18 23:11:31 2008
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 18 Mar 2008 18:11:31 -0400
Subject: [Python-ideas] Faq 4.28 Suggestion -- Trailing comas
In-Reply-To: <47DE3243.5040509@dubiel.pl>
References: <47DE3243.5040509@dubiel.pl>
Message-ID: <fb6fbf560803181511n3d7961a3x25f282aa6eaf23cd@mail.gmail.com>

On 3/17/08, Leszek Dubiel <leszek at dubiel.pl> wrote:

>  I would suggest to add question 4.28 to faq. Everybody who learns Python
>  reads that and will not ask questions about "one-element tuples". I have
>  compiled answer from responses to my last question about one-element tuple.

This (with the followup correction) is great; please post it to the
Issue Tracker, so that someone (probably Georg) with commit privs can
check it in.

-jJ


From jimjjewett at gmail.com  Tue Mar 18 23:33:06 2008
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 18 Mar 2008 18:33:06 -0400
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <5c6f2a5d0803131801j3441d619p6004be30252b6b8a@mail.gmail.com>
References: <47D8E3E0.7010509@gmail.com>
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
	<47D943B7.60405@gmail.com>
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>
	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>
	<47D97494.7040103@gmail.com> <47D9A46B.2080900@canterbury.ac.nz>
	<47D9A83B.1050908@gmail.com>
	<5c6f2a5d0803131801j3441d619p6004be30252b6b8a@mail.gmail.com>
Message-ID: <fb6fbf560803181533j5e010839l235399272aa02443@mail.gmail.com>

On 3/13/08, Mark Dickinson <dickinsm at gmail.com> wrote:
> On Thu, Mar 13, 2008 at 6:18 PM, Imri Goldberg <lorgandon at gmail.com> wrote:

> I really think that there's essentially zero chance of == and != ever
> changing to 'fuzzy' comparisons in Python.

They sort of already did -- you can define __eq__ and __ne__ on your
own class in bizarre and inconsistent ways.  [Though I think you can't
easily override that (x is y) ==> (x==y).]

You can even do this with your own float-alike class.

What you're really asking for is that the float class take advantage of this.

> I don't know of any other language that has successfully done this, ...

Changing an existing class requires that the class be "open".  That is
the default in languages like smalltalk or ruby.  It is even the
default for python classes -- but it is certainly not the default for
"python" classes that are actually coded in C -- which includes
floats.

-jJ


From jimjjewett at gmail.com  Tue Mar 18 23:42:28 2008
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 18 Mar 2008 18:42:28 -0400
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <47DA3EE8.7060902@gmail.com>
References: <47D8E3E0.7010509@gmail.com>
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
	<47D943B7.60405@gmail.com>
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>
	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>
	<47D97494.7040103@gmail.com> <47D9A46B.2080900@canterbury.ac.nz>
	<47D9A83B.1050908@gmail.com>
	<5c6f2a5d0803131801j3441d619p6004be30252b6b8a@mail.gmail.com>
	<47DA3EE8.7060902@gmail.com>
Message-ID: <fb6fbf560803181542t16cfdc40icacba6631de4177d@mail.gmail.com>

On 3/14/08, Imri Goldberg <lorgandon at gmail.com> wrote:

> Alright, I agree it's a good idea to drop the proposal to changing
>  floating point == into an epsilon compare.
>  What about issuing a warning though?
>  Consider the following course of action. It is the one with the least
>  changes:

>  == for regular floating point numbers now issues a warning, but still
>  works. This warning might be turned off. All other operators are left
>  unchanged.

If you change ==, you should really change !=, and probably the other
comparisons as well.

I suspect what you really want is a warning on any usage of a floating
point.  And I'm only half-joking.  Comparison (or arithmetic) with
other floats adds error.  Comparison (or arithmetic) with ints is
*usually* a bug (unless one of the operands is a constant that someone
was too lazy to write correctly).
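The classic illustration of why == on floats is suspect -- decimal fractions like 0.1 have no exact binary representation, so "equal" arithmetic results compare unequal:

```python
a = 0.1 + 0.2
assert a != 0.3              # binary floats can't represent 0.1 exactly
assert abs(a - 0.3) < 1e-9   # an explicit tolerance states the intent
```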

-jJ


From dickinsm at gmail.com  Wed Mar 19 00:58:22 2008
From: dickinsm at gmail.com (Mark Dickinson)
Date: Tue, 18 Mar 2008 19:58:22 -0400
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <fb6fbf560803181533j5e010839l235399272aa02443@mail.gmail.com>
References: <47D8E3E0.7010509@gmail.com>
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
	<47D943B7.60405@gmail.com>
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>
	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>
	<47D97494.7040103@gmail.com> <47D9A46B.2080900@canterbury.ac.nz>
	<47D9A83B.1050908@gmail.com>
	<5c6f2a5d0803131801j3441d619p6004be30252b6b8a@mail.gmail.com>
	<fb6fbf560803181533j5e010839l235399272aa02443@mail.gmail.com>
Message-ID: <5c6f2a5d0803181658i25feb850lc26fbf6768930e3d@mail.gmail.com>

On Tue, Mar 18, 2008 at 6:33 PM, Jim Jewett <jimjjewett at gmail.com> wrote:
>
>  They sort of already did -- you can define __eq__ and __ne__ on your
>  own class in bizarre and inconsistent ways.  [Though I think you can't
>  easily override that (x is y) ==> (x==y).]

Why not?  I get this with Python 2.5.1:

>>> from decimal import *
>>> Decimal.__eq__ = lambda x, y: False
>>> x = Decimal(2)
>>> x == x
False
>>> x is x
True
>>>

Or am I misunderstanding your meaning?

<unnecessary pedantry> Of course, even for floats it's not true
that x is y implies x == y:

>>> x = float('nan')
>>> x is x
True
>>> x == x
False

</unnecessary pedantry>

>  Changing an existing class requires that the class be "open".  That is
>  the default in languages like smalltalk or ruby.  It is even the
>  default for python classes -- but it is certainly not the default for
>  "python" classes that are actually coded in C -- which includes
>  floats.

You mean like:

>>> float.__eq__ = lambda x, y: False
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't set attributes of built-in/extension type 'float'

?  Presumably there are good reasons for this restriction
(performance? convenience? lack of round tuits?), but
I've no idea what they are.  I can't say that I've ever felt a
need to do anything like this.

Mark


From greg.ewing at canterbury.ac.nz  Wed Mar 19 01:55:53 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 19 Mar 2008 12:55:53 +1200
Subject: [Python-ideas] [Python-Dev] The Case Against Floating Point ==
In-Reply-To: <fb6fbf560803181542t16cfdc40icacba6631de4177d@mail.gmail.com>
References: <47D8E3E0.7010509@gmail.com>
	<5c6f2a5d0803130741k6adbb85cia0a19ea59ed830f4@mail.gmail.com>
	<47D943B7.60405@gmail.com>
	<5c6f2a5d0803130837j21137d2cg84ee62c501a4af84@mail.gmail.com>
	<5c6f2a5d0803130908l53bab326i8eb6e417f0c2627c@mail.gmail.com>
	<47D97494.7040103@gmail.com> <47D9A46B.2080900@canterbury.ac.nz>
	<47D9A83B.1050908@gmail.com>
	<5c6f2a5d0803131801j3441d619p6004be30252b6b8a@mail.gmail.com>
	<47DA3EE8.7060902@gmail.com>
	<fb6fbf560803181542t16cfdc40icacba6631de4177d@mail.gmail.com>
Message-ID: <47E06499.9080705@canterbury.ac.nz>

Jim Jewett wrote:
> Comparison (or arithmetic) with ints is
> *usually* a bug (unless one of the operands is a constant that someone
> was too lazy to write correctly).

That depends on what you regard as "correct". Python
generally permits a duck-typed approach to numbers
wherein using integers as a subset of floats is
considered legitimate, and not lazy at all.
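For illustration, mixed int/float comparison and arithmetic do the mathematically expected thing, which is why passing ints where floats are expected is idiomatic rather than lazy:

```python
assert 2 == 2.0              # int/float comparison does the right thing

def hypotenuse_squared(x, y):     # written with floats in mind...
    return x ** 2 + y ** 2

assert hypotenuse_squared(3, 4) == 25         # ...but ints work fine
assert hypotenuse_squared(0.5, 0.5) == 0.5    # 0.5 is exact in binary
```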

-- 
Greg


From ggpolo at gmail.com  Thu Mar 20 16:45:50 2008
From: ggpolo at gmail.com (Guilherme Polo)
Date: Thu, 20 Mar 2008 12:45:50 -0300
Subject: [Python-ideas] Preparing an existing RPC mechanism for the standard
	library
Message-ID: <ac2200130803200845l3e6c4757ja4ff328eb1a8774f@mail.gmail.com>

Hello,

I've read this idea about "preparing an existing RPC mechanism for the
standard library" at StandardLibrary ideas and I would be interested
in doing it, but as you all know, including something into stdlib is
not exactly easy and shouldn't be anyway. Also I'm not even sure if
this idea is still desired.

I'm considering the inclusion of rpyc, with appropriate changes
(possibly lots). And would like to know your opinions towards this.

Thanks,

-- 
-- Guilherme H. Polo Goncalves


From brett at python.org  Mon Mar 24 00:03:29 2008
From: brett at python.org (Brett Cannon)
Date: Sun, 23 Mar 2008 16:03:29 -0700
Subject: [Python-ideas] Preparing an existing RPC mechanism for the
	standard library
In-Reply-To: <ac2200130803200845l3e6c4757ja4ff328eb1a8774f@mail.gmail.com>
References: <ac2200130803200845l3e6c4757ja4ff328eb1a8774f@mail.gmail.com>
Message-ID: <bbaeab100803231603j73c59b3apf0643218c3d7cbe4@mail.gmail.com>

On Thu, Mar 20, 2008 at 8:45 AM, Guilherme Polo <ggpolo at gmail.com> wrote:
> Hello,
>
>  I've read this idea about "preparing an existing RPC mechanism for the
>  standard library" at StandardLibrary ideas and I would be interested
>  in doing it, but as you all know, including something into stdlib is
>  not exactly easy and shouldn't be anyway. Also I'm not even sure if
>  this idea is still desired.
>
>  I'm considering the inclusion of rpyc, with appropriate changes
>  (possibly lots). And would like to know your opinions towards this.
>

I know from my end I am not even familiar with rpyc so I have no
comment. And I suspect most other people have a similar reason for
having not commented on this so far.

-Brett


From santagada at gmail.com  Mon Mar 24 00:11:37 2008
From: santagada at gmail.com (Leonardo Santagada)
Date: Sun, 23 Mar 2008 20:11:37 -0300
Subject: [Python-ideas] Preparing an existing RPC mechanism for the
	standard library
In-Reply-To: <ac2200130803200845l3e6c4757ja4ff328eb1a8774f@mail.gmail.com>
References: <ac2200130803200845l3e6c4757ja4ff328eb1a8774f@mail.gmail.com>
Message-ID: <69959DA8-4D76-4A4C-A068-EFDA678DF7D8@gmail.com>


On 20/03/2008, at 12:45, Guilherme Polo wrote:
> I'm considering the inclusion of rpyc, with appropriate changes
> (possibly lots). And would like to know your opinions towards this.


I think the route you would have to go is writing a PEP, and one of the  
things I would like to see in this PEP is why rpyc and not any  
of the other RPC modules around (like the not-recommended-for-general-use  
zrpc, or Pyro, or the thing the guys from Twisted have). Only if  
your PEP is accepted do I think you should spend your time making it better  
for the stdlib.

The two "think"s in my last paragraph are there because I am not sure  
this is the right route; this is just a guess.


--
Leonardo Santagada






From taleinat at gmail.com  Mon Mar 24 07:58:10 2008
From: taleinat at gmail.com (Tal Einat)
Date: Mon, 24 Mar 2008 08:58:10 +0200
Subject: [Python-ideas] Preparing an existing RPC mechanism for the
	standard library
In-Reply-To: <bbaeab100803231603j73c59b3apf0643218c3d7cbe4@mail.gmail.com>
References: <ac2200130803200845l3e6c4757ja4ff328eb1a8774f@mail.gmail.com>
	<bbaeab100803231603j73c59b3apf0643218c3d7cbe4@mail.gmail.com>
Message-ID: <7afdee2f0803232358q1e4cd909rd3da056c85e9fd05@mail.gmail.com>

On Mon, Mar 24, 2008 at 1:03 AM, Brett Cannon <brett at python.org> wrote:
> On Thu, Mar 20, 2008 at 8:45 AM, Guilherme Polo <ggpolo at gmail.com> wrote:
>  > Hello,
>  >
>  >  I've read this idea about "preparing an existing RPC mechanism for the
>  >  standard library" at StandardLibrary ideas and I would be interested
>  >  in doing it, but as you all know, including something into stdlib is
>  >  not exactly easy and shouldn't be anyway. Also I'm not even sure if
>  >  this idea is still desired.
>  >
>  >  I'm considering the inclusion of rpyc, with appropriate changes
>  >  (possibly lots). And would like to know your opinions towards this.
>  >
>
>  I know from my end I am not even familiar with rpyc so I have no
>  comment. And I suspect most other people have a similar reason for
>  having not commented on this so far.

I believe the reason that the OP is considering RPyC is because it is
the most Pythonic RPC mechanism of the lot. That, and its relative
simplicity, are the reasons I recently chose RPyC for a project, and
it worked out pretty well. If any RPC mechanism is added to the
standard library, I hope it has an API as Pythonic as RPyC's!

I ran into two main problems while using RPyC (v2.60), neither of them
a show-stopper for me. The first was that debugging it can be hard
because its exception handling (propagation across the RPC link) isn't
good enough (yet). The second is that the RPC is two-way and very
transparent, so that once the application became complex I had to take
special measures to avoid deadlocks. All things considered, RPyC got
the job done.

I know RPyC's developer and maintainer, Tomer Filiba, and he's a great
guy, though recently much busier than he used to be. He had plans to
add distributed computing capabilities to RPyC in version 3.0, and
probably quite a few other features, but AFAIK development is
currently frozen. I'm CC-ing the RPyC newsgroup in hopes that he (and
the users) will comment on this.

- Tal


From ggpolo at gmail.com  Mon Mar 24 11:31:38 2008
From: ggpolo at gmail.com (Guilherme Polo)
Date: Mon, 24 Mar 2008 07:31:38 -0300
Subject: [Python-ideas] Preparing an existing RPC mechanism for the
	standard library
In-Reply-To: <7afdee2f0803232358q1e4cd909rd3da056c85e9fd05@mail.gmail.com>
References: <ac2200130803200845l3e6c4757ja4ff328eb1a8774f@mail.gmail.com>
	<bbaeab100803231603j73c59b3apf0643218c3d7cbe4@mail.gmail.com>
	<7afdee2f0803232358q1e4cd909rd3da056c85e9fd05@mail.gmail.com>
Message-ID: <ac2200130803240331i2ee224edq1e4d4448ea4df1e@mail.gmail.com>

2008/3/24, Tal Einat <taleinat at gmail.com>:
> On Mon, Mar 24, 2008 at 1:03 AM, Brett Cannon <brett at python.org> wrote:
>  > On Thu, Mar 20, 2008 at 8:45 AM, Guilherme Polo <ggpolo at gmail.com> wrote:
>  >  > Hello,
>  >  >
>  >  >  I've read this idea about "preparing an existing RPC mechanism for the
>  >  >  standard library" at StandardLibrary ideas and I would be interested
>  >  >  in doing it, but as you all know, including something into stdlib is
>  >  >  not exactly easy and shouldn't be anyway. Also I'm not even sure if
>  >  >  this idea is still desired.
>  >  >
>  >  >  I'm considering the inclusion of rpyc, with appropriate changes
>  >  >  (possibly lots). And would like to know your opinions towards this.
>  >  >
>  >
>  >  I know from my end I am not even familiar with rpyc so I have no
>  >  comment. And I suspect most other people have a similar reason for
>  >  having not commented on this so far.
>
>
> I believe the reason that the OP is considering RPyC is because it is
>  the most Pythonic RPC mechanism of the lot. That, and its relative
>  simplicity, are the reasons I recently chose RPyC for a project, and
>  it worked out pretty well. If any RPC mechanism is added to the
>  standard library, I hope it has an API as Pythonic as RPyC's!
>
>  I ran into two main problems while using RPyC (v2.60), neither of them
>  show breakers for me. The first was that debugging it can be hard
>  because its exception handling (propagation across the RPC link) isn't
>  good enough (yet). The second is that the RPC is two-way and very
>  transparent, so that once the application became complex I had to take
>  special measures to avoid deadlocks. All things considered, RPyC got
>  the job done.
>
>  I know RPyC's developer and maintainer, Tomer Filiba, and he's a great
>  guy, though recently much busier than he used to be. He had plans to
>  add distributed computing capabilities to RPyC in version 3.0, and
>  probably quite a few other features, but AFAIK development is
>  currently frozen. I'm CC-ing the RPyC newsgroup in hopes that he (and
>  the users) will comment on this.
>

I've talked with him before posting this here Tal. Also, the
development of the new version is active.

>
>  - Tal
>


-- 
-- Guilherme H. Polo Goncalves


From janssen at parc.com  Mon Mar 24 19:24:38 2008
From: janssen at parc.com (Bill Janssen)
Date: Mon, 24 Mar 2008 11:24:38 PDT
Subject: [Python-ideas] Preparing an existing RPC mechanism for the
	standard library
In-Reply-To: <ac2200130803200845l3e6c4757ja4ff328eb1a8774f@mail.gmail.com> 
References: <ac2200130803200845l3e6c4757ja4ff328eb1a8774f@mail.gmail.com>
Message-ID: <08Mar24.112443pdt."58696"@synergy1.parc.xerox.com>

> I'm considering the inclusion of rpyc, with appropriate changes
> (possibly lots). And would like to know your opinions towards this.

Might think about reviving the ILU kernel (just the runtime, not the
stubbers) as a Python-only module.  Open source, pretty complete
bindings to Python, multithreaded, threadsafe, etc., etc.  On the
other hand, I haven't even compiled it in 6 years :-).  The key
advantage would be that ILU speaks a number of different RPC protocols
under the covers, and it's straightforward to add new ones.  I'd love
to see our implementation of wmux (a way of multiplexing multiple
virtual connections, in either direction, over a single TCP
connection) actually in use.

http://www2.parc.com/istl/projects/ILU/

Bill


From tomerfiliba at gmail.com  Tue Mar 25 12:18:52 2008
From: tomerfiliba at gmail.com (tomer filiba)
Date: Tue, 25 Mar 2008 04:18:52 -0700 (PDT)
Subject: [Python-ideas] Preparing an existing RPC mechanism for the
	standard library
In-Reply-To: <ac2200130803240331i2ee224edq1e4d4448ea4df1e@mail.gmail.com>
References: <ac2200130803200845l3e6c4757ja4ff328eb1a8774f@mail.gmail.com> 
	<bbaeab100803231603j73c59b3apf0643218c3d7cbe4@mail.gmail.com> 
	<7afdee2f0803232358q1e4cd909rd3da056c85e9fd05@mail.gmail.com> 
	<ac2200130803240331i2ee224edq1e4d4448ea4df1e@mail.gmail.com>
Message-ID: <2153be49-0e9a-4c84-a951-3d3c54eedd26@8g2000hsu.googlegroups.com>

hi all.

i don't feel i may join in this discussion as i'm certainly biased,
but i don't want to just leave it on the wall. for those of you who
haven't heard of rpyc, here's a link: http://rpyc.wikispaces.com/
and a short demo/tutorial at http://rpyc.wikispaces.com/tutorial

i can bring many use cases that demonstrate rpyc's superiority over
other RPC mechanisms, but then again, rpyc has its drawbacks too
(mainly security and frequent IOs). i can make a list of both pros
and cons, but i don't see how it could advance this discussion.

just some final words, as guilherme has said, i am now actively
working on rpyc3.0. in fact the core (parallel to rpyc2.6) is already
stable and quite tested (you can find it on the svn), but if any
attempt is made to integrate rpyc into the stdlib, it should wait
until the final 3.0 release.


-tomer

On Mar 24, 12:31 pm, "Guilherme Polo" <ggp... at gmail.com> wrote:
> 2008/3/24, Tal Einat <talei... at gmail.com>:
>
>
>
> > On Mon, Mar 24, 2008 at 1:03 AM, Brett Cannon <br... at python.org> wrote:
> >  > On Thu, Mar 20, 2008 at 8:45 AM, Guilherme Polo <ggp... at gmail.com> wrote:
> >  >  > Hello,
>
> >  >  >  I've read this idea about "preparing an existing RPC mechanism for the
> >  >  >  standard library" at StandardLibrary ideas and I would be interested
> >  >  >  in doing it, but as you all know, including something into stdlib is
> >  >  >  not exactly easy and shouldn't be anyway. Also I'm not even sure if
> >  >  >  this idea is still desired.
>
> >  >  >  I'm considering the inclusion of rpyc, with appropriate changes
> >  >  >  (possibly lots). And would like to know your opinions towards this.
>
> >  >  I know from my end I am not even familiar with rpyc so I have no
> >  >  comment. And I suspect most other people have a similar reason for
> >  >  having not commented on this so far.
>
> > I believe the reason that the OP is considering RPyC is because it is
> >  the most Pythonic RPC mechanism of the lot. That, and its relative
> >  simplicity, are the reasons I recently chose RPyC for a project, and
> >  it worked out pretty well. If any RPC mechanism is added to the
> >  standard library, I hope it has an API as Pythonic as RPyC's!
>
> >  I ran into two main problems while using RPyC (v2.60), neither of them
> >  show breakers for me. The first was that debugging it can be hard
> >  because its exception handling (propagation across the RPC link) isn't
> >  good enough (yet). The second is that the RPC is two-way and very
> >  transparent, so that once the application became complex I had to take
> >  special measures to avoid deadlocks. All things considered, RPyC got
> >  the job done.
>
> >  I know RPyC's developer and maintainer, Tomer Filiba, and he's a great
> >  guy, though recently much busier than he used to be. He had plans to
> >  add distributed computing capabilities to RPyC in version 3.0, and
> >  probably quite a few other features, but AFAIK development is
> >  currently frozen. I'm CC-ing the RPyC newsgroup in hopes that he (and
> >  the users) will comment on this.
>
> I've talked with him before posting this here Tal. Also, the
> development of the new version is active.
>
>
>
> >  - Tal
>
> --
> -- Guilherme H. Polo Goncalves
> _______________________________________________
> Python-ideas mailing list
> Python-id... at python.org
> http://mail.python.org/mailman/listinfo/python-ideas


From ggpolo at gmail.com  Tue Mar 25 15:05:57 2008
From: ggpolo at gmail.com (Guilherme Polo)
Date: Tue, 25 Mar 2008 11:05:57 -0300
Subject: [Python-ideas] Lib/lib-tk goes away, package tkinter joins in
Message-ID: <ac2200130803250705v6fd89c69i7735cdbae553b76f@mail.gmail.com>

Hello,

(this is an idea for Python 3)

Is there any reason for keeping the directory lib-tk at Lib ? I
believe renaming it to tkinter and making it a package would make more
sense. Tkinter module's code could then reside into __init__.py maybe.
Other change that could be done in this package would be renaming some
modules:

Dialog -> dialog
FileDialog -> filedialog
FixTk -> fixtk
...

Also, I believe tkSimpleDialog and dialog could be in a single module.
There are other modules like tkColorChooser and tkCommonDialog and
even tkSimpleDialog (and some others) that I'm not totally sure what
to do about them, but for me they should reside in possible a single
module.

That is. Thanks,

-- 
-- Guilherme H. Polo Goncalves


From qgallet at gmail.com  Tue Mar 25 15:19:37 2008
From: qgallet at gmail.com (Quentin Gallet-Gilles)
Date: Tue, 25 Mar 2008 15:19:37 +0100
Subject: [Python-ideas] Lib/lib-tk goes away, package tkinter joins in
In-Reply-To: <ac2200130803250705v6fd89c69i7735cdbae553b76f@mail.gmail.com>
References: <ac2200130803250705v6fd89c69i7735cdbae553b76f@mail.gmail.com>
Message-ID: <8b943f2b0803250719j2de946ffre23515a1e8d0a5c2@mail.gmail.com>

Hi Guilherme,

Tkinter is scheduled to become a package in py3k, this is documented in the
PEP 3108 : http://www.python.org/dev/peps/pep-3108/#tk-package
If you wish to help on related issues, feel free to join the stdlib-sig
where the reorganization is being discussed :-)

Quentin

On Tue, Mar 25, 2008 at 3:05 PM, Guilherme Polo <ggpolo at gmail.com> wrote:

> Hello,
>
> (this is an idea for Python 3)
>
> Is there any reason for keeping the directory lib-tk at Lib ? I
> believe renaming it to tkinter and making it a package would make more
> sense. Tkinter module's code could then reside into __init__.py maybe.
> Other change that could be done in this package would be renaming some
> modules:
>
> Dialog -> dialog
> FileDialog -> filedialog
> FixTk -> fixtk
> ...
>
> Also, I believe tkSimpleDialog and dialog could be in a single module.
> There are other modules like tkColorChooser and tkCommonDialog and
> even tkSimpleDialog (and some others) that I'm not totally sure what
> to do about them, but for me they should reside in possible a single
> module.
>
> That is. Thanks,
>
> --
> -- Guilherme H. Polo Goncalves
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080325/47b7b0d9/attachment.html>

From rasky at develer.com  Thu Mar 27 12:25:41 2008
From: rasky at develer.com (Giovanni Bajo)
Date: Thu, 27 Mar 2008 11:25:41 +0000 (UTC)
Subject: [Python-ideas] Lambda again: unnamed arguments
Message-ID: <fsg07l$oh$1@ger.gmane.org>

Hello,

inspired by Greg's post about ideas on making the lambda syntax more 
concise, like:

    x,y => x+y

I was wondering if using unnamed arguments had already been debated. 
Something like:

    \(_1+_2)

where basically you're implicitly declaring that your lambda 
takes two arguments. You wouldn't be able to call them through keyword 
arguments, nor to accept a variable number of arguments (nor to accept 
more arguments than are actually used), but wouldn't it cover most 
use cases and be really compact?

Other examples:

     k.sort(key=\(_1.foo))
     k.sort(key=\(_1[0]))
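For those two sort examples specifically, the stdlib's operator module already offers a compact spelling with no new syntax (Record and the sample data below are made up for the demo):

```python
from operator import attrgetter, itemgetter

class Record(object):
    def __init__(self, foo):
        self.foo = foo

k = [Record(3), Record(1), Record(2)]
k.sort(key=attrgetter('foo'))            # stands in for key=\(_1.foo)
assert [r.foo for r in k] == [1, 2, 3]

pairs = [(2, 'b'), (1, 'a')]
pairs.sort(key=itemgetter(0))            # stands in for key=\(_1[0])
assert pairs == [(1, 'a'), (2, 'b')]
```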

-- 
Giovanni Bajo



From jjb5 at cornell.edu  Thu Mar 27 14:29:37 2008
From: jjb5 at cornell.edu (Joel Bender)
Date: Thu, 27 Mar 2008 09:29:37 -0400
Subject: [Python-ideas] Lambda again: Anonymous function definition
Message-ID: <47EBA141.7080306@cornell.edu>

Greg wrote:

 >  What's needed is something very concise and unobtrusive,
 >  such as
 >
 >     x, y => x + y

As inspired by Prolog:

     x, y :- x + y

So this:

     f = lambda x, y: x ** 2 + y ** 2

Or this:

     def f(x, y): return x ** 2 + y ** 2

Becomes this:

     f = x, y :- x ** 2 + y ** 2

And would logically transpose into this:

     f(x, y) :- x ** 2 + y ** 2

Oooo...this rabbit hole is fun!


Joel




From helmert at informatik.uni-freiburg.de  Thu Mar 27 17:32:18 2008
From: helmert at informatik.uni-freiburg.de (Malte Helmert)
Date: Thu, 27 Mar 2008 16:32:18 +0000
Subject: [Python-ideas] lambda
In-Reply-To: <47EBC900.5000201@cs.byu.edu>
References: <b2da89060803251342t7fb947exd887358f6d539230@mail.gmail.com>	<319e029f0803260051w4f1c0b8ela4ab55dcaa8a930a@mail.gmail.com>	<47EA157E.5080903@gmail.com>	<319e029f0803260233oba7e8bdr64c3ca09ea49b261@mail.gmail.com>	<b2da89060803260332r6f03c055vc759a5e5b00afc71@mail.gmail.com>	<47EA61AA.2030607@gmail.com>	<b2da89060803270822m4904c02yfc0a812430c0cdff@mail.gmail.com>
	<47EBC900.5000201@cs.byu.edu>
Message-ID: <fsgi6i$dpv$1@ger.gmane.org>

[follow up from py3k.devel list]


Neil Toronto wrote:

> Yep. In my seven years of CS instruction so far, I've only come across 
> this once, in a theory of programming languages course. "Lambda" simply 
> doesn't show up unless you do language theory or program in a Lisp... or 
> in Python.

Since you mention Haskell below:

> It's a little less terse than Haskell's "\->"

it's worth pointing out that Haskell uses the backslash syntax because
it is the nearest ASCII equivalent to the (lower-case) letter lambda.
For example, see http://en.wikibooks.org/wiki/Haskell/More_on_functions
or the Google results for "haskell lambda backslash" (without the quotes).

Malte



From brett at python.org  Thu Mar 27 19:17:21 2008
From: brett at python.org (Brett Cannon)
Date: Thu, 27 Mar 2008 11:17:21 -0700
Subject: [Python-ideas] Lambda again: unnamed arguments
In-Reply-To: <fsg07l$oh$1@ger.gmane.org>
References: <fsg07l$oh$1@ger.gmane.org>
Message-ID: <bbaeab100803271117h4d6067ecmae4d56dd065dff3e@mail.gmail.com>

On Thu, Mar 27, 2008 at 4:25 AM, Giovanni Bajo <rasky at develer.com> wrote:
> Hello,
>
>  inspired by Greg's post about ideas on making the lambda syntax more
>  concise, like:
>
>     x,y => x+y
>
>  I was wondering if using unnamed arguments had already been debated.
>  Something like:
>
>     \(_1+_2)
>
>  where basically you're declaring implicitally declaring that your lambda
>  takes two arguments. You wouldn't be able to call them through keyword
>  arguments, nor to accept a variable number of arguments (nor to accept
>  more arguments than they are actually used), but wouldn't it cover most
>  use cases and be really compact?
>
>  Other examples:
>
>      k.sort(key=\(_1.foo))
>      k.sort(key=\(_1[0]))

Two reasons for being -1:

One is it's just plain ugly to me.

Two, why break from how functions and methods work to save a few
keystrokes? Explicit is better than implicit.

-Brett


From tjreedy at udel.edu  Thu Mar 27 20:44:43 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 27 Mar 2008 15:44:43 -0400
Subject: [Python-ideas] lambda
References: <b2da89060803251342t7fb947exd887358f6d539230@mail.gmail.com>	<319e029f0803260051w4f1c0b8ela4ab55dcaa8a930a@mail.gmail.com>	<47EA157E.5080903@gmail.com>	<319e029f0803260233oba7e8bdr64c3ca09ea49b261@mail.gmail.com>	<b2da89060803260332r6f03c055vc759a5e5b00afc71@mail.gmail.com>	<47EA61AA.2030607@gmail.com>	<b2da89060803270822m4904c02yfc0a812430c0cdff@mail.gmail.com><47EBC900.5000201@cs.byu.edu>
	<fsgi6i$dpv$1@ger.gmane.org>
Message-ID: <fsgtf8$s6b$1@ger.gmane.org>


"Malte Helmert" 
<helmert at informatik.uni-freiburg.de> 
wrote in message news:fsgi6i$dpv$1 at ger.gmane.org...
| > It's a little less terse than Haskell's "\->"
|
| it's worth pointing out that Haskell uses the backslash syntax because
| it is the nearest ASCII equivalent to the (lower-case) letter lambda.

With unicode source, we could use the real thing (ducks ;-). 
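
(Half-seriously: under PEP 3131, accepted for Py3k, the letter is at
least usable as an ordinary identifier, though of course not as a
replacement for the keyword itself:)

```python
# PEP 3131 non-ASCII identifiers: the Greek letter as a *name*,
# bound to a perfectly ordinary lambda.
λ = lambda x: x + 1
assert λ(1) == 2
```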





From leszek at dubiel.pl  Mon Mar 31 09:23:52 2008
From: leszek at dubiel.pl (Leszek Dubiel)
Date: Mon, 31 Mar 2008 09:23:52 +0200
Subject: [Python-ideas] Lambda again: Anonymous function definition
In-Reply-To: <47EBA141.7080306@cornell.edu>
References: <47EBA141.7080306@cornell.edu>
Message-ID: <47F09188.9090001@dubiel.pl>



Joel Bender wrote:
> Greg wrote:
>
>  >  What's needed is something very concise and unobtrusive,
>  >  such as
>  >
>  >     x, y => x + y
>
> As inspired by Prolog:
>
>      x, y :- x + y
>
> So this:
>
>      f = lambda x, y: x ** 2 + y ** 2
>
> Or this:
>
>      def f(x, y): return x ** 2 + y ** 2
>
> Becomes this:
>
>      f = x, y :- x ** 2 + y ** 2
>
> And would logically transpose into this:
>
>      f(x, y) :- x ** 2 + y ** 2
>
> Oooo...this rabbit hole is fun!
>   

Lambda should have the same syntax as ordinary functions. The only
difference should be that you don't have to give the function a name.

def f(x, y):
    return x ** 2 + y ** 2

g = f

h = def (x, y): return x ** 2 + y ** 2


Functions f, g and h all do the same thing.
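
For comparison, the nearest thing that actually runs today is `lambda`,
which is restricted to a single expression:

```python
def f(x, y):
    return x ** 2 + y ** 2

g = f

# today's closest working spelling of the proposed anonymous def
h = lambda x, y: x ** 2 + y ** 2

assert f(3, 4) == g(3, 4) == h(3, 4) == 25
```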



From eli at courtwright.org  Mon Mar 31 14:19:35 2008
From: eli at courtwright.org (Eli Courtwright)
Date: Mon, 31 Mar 2008 08:19:35 -0400
Subject: [Python-ideas] Lambda again: Anonymous function definition
In-Reply-To: <47F09188.9090001@dubiel.pl>
References: <47EBA141.7080306@cornell.edu> <47F09188.9090001@dubiel.pl>
Message-ID: <3f6c86f50803310519ka2db919ye2b8673d9fb25b56@mail.gmail.com>

On Mon, Mar 31, 2008 at 3:23 AM, Leszek Dubiel <leszek at dubiel.pl> wrote:

> Lambda should have the same syntax as ordinary functions. The only
> difference should be that you don't have to give the function a name.
>
> def f (x, y): return x ** 2 + y ** 2
>
> g = f
>
> h = def (x, y): return x ** 2 + y ** 2
>
> Functions f, g and h all do the same thing.



Javascript handles anonymous functions this way as well:

function f(x, y) { return x*x + y*y; }

g = f;

h = function(x, y) { return x*x + y*y; }

With that being said, it makes sense for the return statement to be omitted
in lambdas (or anonymous defs, as I hope they will eventually be called),
since those functions are limited to a single expression.

- Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080331/a7f6a732/attachment.html>

From grosser.meister.morti at gmx.net  Mon Mar 31 16:17:11 2008
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Mon, 31 Mar 2008 16:17:11 +0200
Subject: [Python-ideas] dictionary unpacking
Message-ID: <47F0F267.8080005@gmx.net>

Maybe dictionary unpacking would be a nice thing?

 >>> d = {'foo': 42, 'egg': 23}
 >>> {'foo': bar, 'egg': spam} = d
 >>> print bar, spam
42 23


What do you think? Bad idea? Good idea?

	-panzi
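
For reference, the effect the example above is after can already be had
without new syntax; a sketch:

```python
from operator import itemgetter

d = {'foo': 42, 'egg': 23}

# today's spellings of the proposed {'foo': bar, 'egg': spam} = d
bar, spam = d['foo'], d['egg']
assert (bar, spam) == (42, 23)

# or, extracting several keys in one call:
bar, spam = itemgetter('foo', 'egg')(d)
assert (bar, spam) == (42, 23)
```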


From aahz at pythoncraft.com  Mon Mar 31 16:43:09 2008
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 31 Mar 2008 07:43:09 -0700
Subject: [Python-ideas] dictionary unpacking
In-Reply-To: <47F0F267.8080005@gmx.net>
References: <47F0F267.8080005@gmx.net>
Message-ID: <20080331144309.GA20205@panix.com>

On Mon, Mar 31, 2008, Mathias Panzenböck wrote:
>
> Maybe dictionary unpacking would be a nice thing?
> 
>  >>> d = {'foo': 42, 'egg': 23}
>  >>> {'foo': bar, 'egg': spam} = d
>  >>> print bar, spam
> 42 23
> 
> What do you think? Bad idea? Good idea?

Horrible idea.  ;-)
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan