Mailman 3 May 2010 - Python-Dev

Reasons behind misleading TypeError message when passing the wrong number of arguments to a method
by Giampaolo Rodolà May 21, 2010

>>> class A:
...     def echo(self, x):
...         return x
...
>>> a = A()
>>> a.echo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: echo() takes exactly 2 arguments (1 given)
>>>

I bet my last 2 cents this has already been raised in the past, but I want to give it a try and revamp the subject anyway. Is there a reason why the error shouldn't be adjusted to state that *1* argument is actually required …
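As an illustration of why the argument count in that message is confusing (this sketch is not part of the original post), inspect.signature can be used to compare the function stored on the class with the bound method. The bound method's signature omits self, which is the count a caller actually sees. The exact TypeError wording differs between Python versions, so the example captures and prints the message rather than assuming it.

```python
import inspect


class A:
    def echo(self, x):
        return x


a = A()

# The function stored on the class still lists `self`.
unbound_params = list(inspect.signature(A.echo).parameters)

# The bound method hides `self`; only `x` is left for the caller to supply.
bound_params = list(inspect.signature(a.echo).parameters)

try:
    a.echo()
    message = None
except TypeError as exc:
    # Wording varies by interpreter version, so it is captured, not assumed.
    message = str(exc)

print(unbound_params, bound_params)
print(message)
```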

bug or feature? fixing argparse's default help value for version actions
by Steven Bethard May 20, 2010

Sorry I haven't had time to get around to the argparse issues. I should have time this weekend. I need a release manager call on one of the issues though.

Two things I assume are fine to fix at this stage:

* In the documentation, the '--version' example should either not use a shorthand, or should use the conventional '-V'
* In the documentation, the difference between the argparse and optparse ways of specifying versions needs to be mentioned in the section on migrating from optparse.

One …
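A minimal sketch of the conventional '-V' spelling mentioned in the first bullet, using argparse's built-in "version" action. The program name "demo" and the version string are placeholders chosen for this example, and the helper run_version_flag is hypothetical. Both stdout and stderr are captured because the stream used for the version text has differed between Python releases.

```python
import argparse
import contextlib
import io

parser = argparse.ArgumentParser(prog="demo")
# '-V' is the conventional short form; '-v' is often used for --verbose.
parser.add_argument("-V", "--version", action="version", version="%(prog)s 1.0")


def run_version_flag(flag):
    """Return (exit_code, text) produced when the parser sees `flag`."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        try:
            parser.parse_args([flag])
        except SystemExit as exc:
            # The version action prints the string and exits the parser.
            return exc.code, out.getvalue() + err.getvalue()
    return None, ""


exit_code, text = run_version_flag("-V")
print(exit_code, text.strip())
```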

Incorrect length of collections.Counter objects / Multiplicity function
by Gustavo Narea May 20, 2010

Hello, everyone. I've checked the new collections.Counter class and I think I've found a bug:

> >>> from collections import Counter
> >>> c1 = Counter([1, 2, 1, 3, 2])
> >>> c2 = Counter([1, 1, 2, 2, 3])
> >>> c3 = Counter([1, 1, 2, 3])
> >>> c1 == c2 and c3 not in (c1, c2)
> True
> >>> # Perfect, so far. But... There's always a "but":
> ...
> >>> len(c1)
> 3

The length of a Counter is the amount of …
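For comparison, a short sketch (not from the original post) of the two quantities being discussed: len() on a Counter counts its distinct keys, as it does for any dict, while the sum of the stored counts gives the total number of items including repeats.

```python
from collections import Counter

c1 = Counter([1, 2, 1, 3, 2])

distinct = len(c1)                    # number of distinct elements (keys)
total = sum(c1.values())              # sum of the multiplicities
expanded = sorted(c1.elements())      # every element repeated by its count

print(distinct, total, expanded)
```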

Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
by David Beazley May 19, 2010

> From: "Martin v. Löwis" <martin(a)v.loewis.de>
> To: Dj Gilcrease <digitalxero(a)gmail.com>
> Cc: python-dev(a)python.org
> Subject: Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
> Message-ID: <4BF385E3.9030903(a)v.loewis.de>
> Content-Type: text/plain; charset=ISO-8859-1
>
>> I think the new GIL should be given a year or so in the wild before
>> you start trying to optimize theoretical issues you may run into. If
>> in a year …

pybuildbot.identify?
by Bill Janssen May 19, 2010

The PPC buildbots are running pretty well, now that I've opened a few more ports, but I'd like to find this script "pybuildbot.identify" that they keep complaining about, and install it. I've poked around the Python sources, but haven't found it. Anyone know where to get it from? Thanks. Bill

Fixing the GIL (with a BFS scheduler)
by Nir Aides May 19, 2010

Hi all,

Here is a second (last?) attempt at getting traction on fixing the GIL (is it broken?) with BFS (WTF?). So don't be shy (don't be too rude either) since ignoring counts as down voting.

Relevant Python issue: http://bugs.python.org/issue7946

*Bottom line first*

I submitted an implementation of BFS (http://ck.kolivas.org/patches/bfs/sched-BFS.txt) as a patch to the GIL, which to the extent I have tested it, behaves nicely on Windows XP, Windows 7, GNU/Linux with either CFS or O(1) schedulers, 1/2/4 cores, laptop, desktop and VirtualBox VM guest (some data below).

The patch is still work in progress and requires work in terms of style, moving code where it belongs, test code, etc... nevertheless, Python core developers recommended I already (re)post to python-dev for discussion.

*So is the GIL broken?*

There seems to be some disagreement on that question among Python core developers (unless you all agree it is not broken :) ). Some developers maintain the effects described by David Beazley do not affect real world systems. Even I took the role of a devil's advocate in a previous discussion, but in fact I think that Python, being a general purpose language, is similar to the OS in that regard. It is used across many application domains, platforms, and development paradigms, just as OS schedulers are, and therefore accepting thread scheduling with such properties as a fact of life is not a good idea.

I was first bitten by the original GIL last year while testing a system, and found David's research while looking for answers, and later had to work around that problem in another system. Here are other real world cases:

1) Zope people hit this back in 2002 and documented the problem with interesting insight:

http://www.zope.org/Members/glpb/solaris/multiproc
"I have directly observed a 30% penalty under MP constraints when the sys.setcheckinterval value was too low (and there was too much GIL thrashing)."

http://www.zope.org/Members/glpb/solaris/report_ps
"A machine that's going full-throttle isn't as bad, curiously enough -- because the other CPU's are busy doing real work, the GIL doesn't have as much opportunity to get shuffled between CPUs. On a MP box it's very important to set sys.setcheckinterval() up to a fairly large number, I recommend pystones / 50 or so."

2) Python mailing list - 2005
http://mail.python.org/pipermail/python-list/2005-August/336286.html
"The app suffers from serious performance degradation (compared to pure c/C++) and high context switches that I suspect the GIL unlocking may be aggravating ?"

3) Python mailing list - 2008
http://mail.python.org/pipermail/python-list/2008-June/1143217.html
"When I start the server, it sometimes eats up 100% of the CPU for a good minute or so... though none of the threads are CPU-intensive"

4) Twisted
http://twistedmatrix.com/pipermail/twisted-python/2005-July/011048.html
"When I run a CPU intensive method via threads.deferToThread it takes all the CPU away and renders the twisted process unresponsive."

Admittedly, it is not easy to dig reports up in Google.

Finally, I think David explained the relevance of this problem quite nicely:
http://mail.python.org/pipermail/python-dev/2010-March/098416.html

*What about the new GIL?*

There is no real world experience with the new GIL since it is under development. What we have is David's analysis and a few benchmarks from the bug report.

*Evolving the GIL into a scheduler*

The problem addressed by the GIL has always been *scheduling* threads to the interpreter, not just controlling access to it. The patches by Antoine and David essentially evolve the GIL into a scheduler, however both cause thread starvation or high rate of context switching in some scenarios (see data below).

*BFS*

Enter BFS, a new scheduler designed by Con Kolivas, a Linux kernel hacker who is an expert in this field:

http://ck.kolivas.org/patches/bfs/sched-BFS.txt
"The goal of the Brain Fuck Scheduler, referred to as BFS from here on, is to completely do away with the complex designs of the past for the cpu process scheduler and instead implement one that is very simple in basic design. The main focus of BFS is to achieve excellent desktop interactivity and responsiveness without heuristics and tuning knobs that are difficult to understand, impossible to model and predict the effect of, and when tuned to one workload cause massive detriment to another."

I submitted an implementation of BFS (bfs.patch) which on my machines gives comparable performance to gilinter2.patch (Antoine's) and seems to schedule threads more fairly, predictably, and with lower rate of context switching (see data below).

There are however, some issues in bfs.patch:

1) It works on top of the OS scheduler, which means (for all GIL patches!):
   a) It does not control and is not aware of information such as OS thread preemption, CPU core to run on, etc...
   b) There may be hard to predict interaction between BFS and the underlying OS scheduler, which needs to be tested on each target platform.

2) It works best when TSC (http://en.wikipedia.org/wiki/Time_Stamp_Counter) is available and otherwise falls back to gettimeofday(). I expect the scheduler to misbehave to some degree or affect performance when TSC is not available and either of the following is true:
   a) if gettimeofday() is very expensive to read (impacts release/acquire overhead).
   b) if gettimeofday() has very low precision ~10ms.

By design of BFS, once CPU load crosses a given threshold (about 8 CPU bound tasks which need the CPU at once), the scheduler falls back to FIFO behavior and latency goes up sharply.

I have no data on how bfs.patch behaves on ARM, AMD, old CPU models, OSX, FreeBSD, Solaris, or mobile. The patch may require some tuning to work properly on those systems, so data is welcome (make sure TSC code in Include/cycle.h works on those systems before benching).

All that said, to the extent I have tested it, bfs.patch behaves nicely on Windows XP, Windows 7, GNU/Linux with either CFS or O(1) schedulers, 1/2/4 cores, laptop, desktop and VirtualBox VM guest.

*Data*

Comparison of proposed patches running ccbench on Windows XP:
http://bugs.python.org/issue7946#msg104899

Comparison of proposed patches running Florent's write.py test on Ubuntu Karmic:
http://bugs.python.org/issue7946#msg105687

Comparison of old GIL, new GIL and BFS running ccbench on Ubuntu Karmic:
http://bugs.python.org/issue7946#msg105874

Last comparison includes a run of old GIL with sys.setcheckinterval(2500) as Zope people do. IO latency shoots up to ~1000ms as result.

*What can be done with it?*

Here are some options:

1) Abandon it - no one is interested, yawn.
2) Take ideas and workarounds from its code and apply to other patches.
3) Include it in the interpreter as an auxiliary (turn on with a runtime switch) scheduler.
4) Adopt it as the Python scheduler.

*Opinion?*

Your opinion is needed (however, please submit code review comments which are not likely to interest other people, e.g. "why did you use volatile for X?", at the issue page: http://bugs.python.org/issue7946).

Thanks,
Nir
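The kind of latency effect discussed in this thread can be probed with a rough sketch like the one below. It is not ccbench and not part of the patch: it simply measures how far short sleeps overshoot while CPU-bound threads spin in the background. The function names (spin, sleep_overshoot, run), the sample count, and the 1 ms interval are all assumptions made for this example, and the numbers it prints will depend heavily on the interpreter version, OS scheduler, and hardware.

```python
import threading
import time


def spin(stop_event):
    """CPU-bound loop standing in for a background compute thread."""
    counter = 0
    while not stop_event.is_set():
        counter += 1


def sleep_overshoot(samples=20, interval=0.001):
    """Measure how much longer than `interval` each short sleep takes."""
    overshoots = []
    for _ in range(samples):
        started = time.perf_counter()
        time.sleep(interval)  # stands in for a blocking I/O call
        overshoots.append(time.perf_counter() - started - interval)
    return overshoots


def run(background_threads):
    """Return sleep overshoots measured while N spinning threads run."""
    stop_event = threading.Event()
    workers = [
        threading.Thread(target=spin, args=(stop_event,))
        for _ in range(background_threads)
    ]
    for worker in workers:
        worker.start()
    try:
        return sleep_overshoot()
    finally:
        # Always stop and join the spinners, even if measuring fails.
        stop_event.set()
        for worker in workers:
            worker.join()


if __name__ == "__main__":
    for n in (0, 1, 2):
        worst = max(run(n))
        print(f"background threads={n}: worst overshoot {worst * 1000:.2f} ms")
```

A sleep is only a stand-in for real I/O, so results from a sketch like this say nothing definitive about any of the patches; the benchmarks linked under *Data* remain the reference.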

Re: [Python-Dev] Summing up
by David Beazley May 19, 2010

Antoine,

This is a pretty good summary that mirrors my thoughts on the GIL matter as well. In the big picture, I do think it's desirable for Python to address the multicore performance issue--namely to not have the performance needlessly thrashed in that environment. The original new GIL addressed this.

The I/O convoy effect problem is more subtle. Personally, I think it's an issue that at least merits further study because trying to overlap I/O with computation is a known programming technique that might be useful for people using Python to do message passing, distributed computation, etc. As an example, the multiprocessing module uses threads as part of its queue implementation. Is it impacted by convoying? I honestly don't know. I agree that getting some more real-world experience would be useful.

Cheers,
Dave

> From: Antoine Pitrou <solipsis(a)pitrou.net>
>
> Ok, this is a good opportunity to try to sum up, from my point of view.
>
> The main problem of the old GIL, which was evidenced in Dave's original
> study (not this year's, but the previous one) *is* fixed unless someone
> demonstrates otherwise.
>
> It should be noted that witnessing a slight performance degradation on
> a multi-core machine is not enough to demonstrate such a thing. The
> degradation could be caused by other factors, such as thread migration,
> bad OS behaviour, or even locking peculiarities in your own
> application, which are not related to the GIL. A good test is whether
> performance improves if you play with sys.setswitchinterval().
>
> Dave's newer study regards another issue, which I must stress is also
> present in the old GIL algorithm, and therefore must have affected, if
> it is serious, real-world applications in 2.x. And indeed, the test I
> recently added to ccbench evidences the huge drop in socket I/Os per
> second when there's a background CPU thread; this test exercises the
> same situation as Dave's demos, only with a less trivial CPU workload:
>
> == CPython 2.7b2+.0 (trunk:81274M) ==
> == x86_64 Linux on 'x86_64' ==
>
> --- I/O bandwidth ---
>
> Background CPU task: Pi calculation (Python)
>
> CPU threads=0: 23034.5 packets/s.
> CPU threads=1: 6.4 ( 0 %)
> CPU threads=2: 15.7 ( 0 %)
> CPU threads=3: 13.9 ( 0 %)
> CPU threads=4: 20.8 ( 0 %)
>
> (note: I've just changed my desktop machine, so these figures are
> different from what I've posted weeks or months ago)
>
> Regardless of the fact that apparently noone reported it in real-world
> conditions, we *could* decide that the issue needs fixing. If we
> decide so, Nir's approach is the most rigorous one: it tries to fix
> the problem thoroughly, rather than graft an additional heuristic. Nir
> also has tested his patch on a variety of machines, more so than Dave
> and I did with our own patches; he is obviously willing to go forward.
>
> Right now, there are two problems with Nir's proposal:
>
> - first, what Nick said: the difficulty of having reliable
> high-precision cross-platform time sources, which are necessary for
> the BFS algorithm. Ironically, timestamp counters have their own
> problems on multi-core machines (they can go out of sync between
> CPUs). gettimeofday() and clock_gettime() may be precise enough on
> most Unices, though.
>
> - second, the BFS algorithm is not that well-studied, since AFAIK it
> was refused for inclusion in the Linux kernel; someone in the
> python-dev community would therefore have to make sense of, and
> evaluate, its heuristic.
>
> I also don't consider my own patch a very satisfactory "solution",
> although it has the reassuring quality of being simple and short (and
> easy to revert!).
>
> That said, most of us are programmers and we love to invent ways
> of fixing technical issues. It sometimes leads us to consider some
> things issues even when they are mostly theoretical. This is why I
> am lukewarm on this. I think interested people should focus on
> real-world testing (rather than Dave and I's synthetic tests) of the new
> GIL, with or without the various patches, and share the results.
>
> Otherwise, Dj Gilcrease's suggestion of waiting for third-party reports
> is also a very good one.
>
> Regards
>
> Antoine.
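The suggestion to "play with sys.setswitchinterval()" can be tried with a small helper such as the one sketched here. sys.getswitchinterval() and sys.setswitchinterval() are the real interpreter functions available with the new GIL in Python 3.2 and later; the helper name with_switch_interval and the 0.05 second value are assumptions for illustration only.

```python
import sys

original = sys.getswitchinterval()  # in seconds


def with_switch_interval(seconds, func):
    """Call func() under a temporary switch interval, then restore the old one."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func()
    finally:
        sys.setswitchinterval(previous)


# Here the "workload" just reports the interval in effect while it runs.
observed = with_switch_interval(0.05, sys.getswitchinterval)
print(original, observed, sys.getswitchinterval())
```

In practice func would be the application workload or benchmark under test, run once per candidate interval so the timings can be compared.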

Python versions for Ubuntu 10.10 (Maverick Meerkat)
by Barry Warsaw May 18, 2010

I just wanted to let the python-dev community know about some tracks we had at the recently concluded Ubuntu Developer Summit in Brussels. Among the several Python-related discussions, we talked about what versions of Python will be supported and default in the next version of Ubuntu (10.10, code name Maverick Meerkat, to be released in October).

If you're interested in following and participating in this discussion, I've started a wiki page as the central place to collect information: https:…

Possible patch for functools partial - Interested?
by VanL May 17, 2010

Howdy all -

I have an app where I am using functools.partial to bundle up jobs to do, where a job is defined as a callable + args. In one case, I wanted to keep track of whether I had previously seen a job, so I started putting them into a set... only to find out that partials never test equal to each other:

>>> import operator
>>> from functools import partial
>>> p1 = partial(operator.add)
>>> p2 = partial(operator.add)
>>> p1 == p2
False
>…
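One possible workaround, sketched here and not taken from the proposed patch, is to deduplicate on a key built from the partial's public attributes (func, args, keywords) instead of on the partial object itself. The helper name partial_key is hypothetical, and the approach assumes every positional and keyword value is hashable.

```python
import operator
from functools import partial


def partial_key(p):
    """Build a hashable key from a partial's public attributes.

    Assumes the positional and keyword values are themselves hashable.
    """
    return (p.func, p.args, tuple(sorted((p.keywords or {}).items())))


p1 = partial(operator.add, 1)
p2 = partial(operator.add, 1)
p3 = partial(operator.add, 2)

# Track jobs by key rather than by partial object.
seen = {partial_key(p1)}

print(p1 == p2)                # partial objects compare by identity
print(partial_key(p2) in seen)
print(partial_key(p3) in seen)
```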

OS X buildbots and launchd
by Bill Janssen May 16, 2010

I can find no evidence that the buildbot installation process given on the wiki will cause the buildbot slave to be restarted after a reboot of the machine. To accomplish this, you should also undertake the work described in http://buildbot.net/trac/wiki/UsingLaunchd

On my Leopard slave, I created the file /Library/LaunchAgents/org.python.buildbot.slave.plist, and put in it this XML:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "…
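The XML above is truncated in this archive view, and the sketch below does not attempt to reconstruct it. It only shows, under stated assumptions, how a launchd job description of this general shape could be generated with the standard plistlib module. Label, ProgramArguments and RunAtLoad are standard launchd keys; the label value is guessed from the filename, and the program path and arguments are placeholders, not the author's settings.

```python
import plistlib

# Every value below is a placeholder for illustration, not the original file.
job = {
    # Assumed to match the plist filename mentioned in the post.
    "Label": "org.python.buildbot.slave",
    # Hypothetical command line; substitute the real buildslave invocation.
    "ProgramArguments": ["/path/to/buildslave", "start", "/path/to/slave-dir"],
    # Ask launchd to start the job when the plist is loaded.
    "RunAtLoad": True,
}

xml_bytes = plistlib.dumps(job)          # XML is plistlib's default format
round_tripped = plistlib.loads(xml_bytes)

print(xml_bytes.decode("utf-8"))
```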
