Mailman 3 January 2005 - Python-Dev

python-dev Summary for 2004-12-16 through 2004-12-31 [draft]
by Brett C. Feb. 1, 2005

Nice and short summary this time.  Plan to send this off Wednesday or Thursday, so get corrections in before then.

------------------------------

=====================
Summary Announcements
=====================
You can still `register <http://www.python.org/pycon/2005/register.html>`__ for `PyCon`_.  The `schedule of talks`_ is now online.  Jim Hugunin is lined up to be the keynote speaker on the first day with Guido being the keynote on Thursday.  Once again PyCon looks like it is going to be great.

On a different note, as I am sure you are all aware, I am still about a month behind in summaries.  School this quarter has just turned out hectic for me.  I think it is lack of motivation thanks to having finished my 14 doctoral applications just a little over a week ago (and no, that number is not a typo).  For the first time in my life I am going to come up with a very regimented study schedule that will hopefully let me fit in weekly Python time so that I can catch up on summaries.

And this summary is not short because I wanted to finish it.  2.4 was released just before the time this summary covers, so most of the traffic was about bug fixes discovered after the release.

.. _PyCon: http://www.pycon.org/
.. _schedule of talks: http://www.python.org/pycon/2005/schedule.html

=======
Summary
=======
-------------
PEP movements
-------------
I introduced a `proto-PEP <http://mail.python.org/pipermail/python-dev/2005-January/050753.html>`__ to the list on how one can go about changing CPython's bytecode.  It will need rewriting once the AST branch is merged into HEAD on CVS.  Plus I need to get a PEP number assigned to me.  =)

Contributing threads:
  - `proto-pep: How to change Python's bytecode <>`__

------------------------------------
Handling versioning within a package
------------------------------------
The suggestion of extending import syntax to support explicit version importation came up.  The idea was to have something along the lines of ``import foo version 2, 4`` so that a package could contain different versions of itself and there would be an easy way to specify which version was desired.

The idea didn't fly, though.  The main objection was that import-as support was all you really needed; ``import foo_2_4 as foo``.  And if you had a ton of references to a specific package and didn't want to burden yourself with explicit imports, you can always have a single place, run before other code starts executing, that does ``import foo_2_4; sys.modules["foo"] = foo_2_4``.  And the burden can be lowered even further by creating a foo.py file that does the above for you.

You can also look at how wxPython handles it at http://wiki.wxpython.org/index.cgi/MultiVersionInstalls .

Contributing threads:
  - `Re: [Pythonmac-SIG] The versioning question... <>`__

===============
Skipped Threads
===============
- Problems compiling Python 2.3.3 on Solaris 10 with gcc 3.4.1
- 2.4 news reaches interesting places
    see `last summary`_ for coverage of this thread
- RE: [Python-checkins] python/dist/src/Modules posixmodule.c, 2.300.8.10, 2.300.8.11
- mmap feature or bug?
- Re: [Python-checkins] python/dist/src/Python marshal.c, 1.79, 1.80
- Latex problem when trying to build documentation
- Patches: 1 for the price of 10.
- Python for Series 60 released
- Website documentation - link to descriptor information
- Build extensions for windows python 2.4 what are the compiler rules?
- Re: [Python-checkins] python/dist/src setup.py, 1.208, 1.209
- Zipfile needs?
    fake 32-bit unsigned int overflow with ``x = x & 0xFFFFFFFFL`` and signed ints with the additional ``if x & 0x80000000L: x -= 0x100000000L`` .
- Re: [Python-checkins] python/dist/src/Mac/OSX fixapplepython23.py, 1.1, 1.2
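The ``sys.modules`` aliasing trick from the versioning thread can be sketched as follows. This is only an illustration: ``foo_2_4`` is a hypothetical module built in memory here so the example runs on its own, and its ``VERSION`` attribute is invented for the demonstration.

```python
import sys
import types

# Hypothetical stand-in for an installed, version-suffixed package.
# A real project would have foo_2_4 on disk; this is built in memory instead.
foo_2_4 = types.ModuleType("foo_2_4")
foo_2_4.VERSION = (2, 4)  # invented attribute, for demonstration only
sys.modules["foo_2_4"] = foo_2_4

# The aliasing step from the summary: run once, before other code imports foo.
import foo_2_4
sys.modules["foo"] = foo_2_4

# Any later plain import now resolves to the chosen version,
# because the import system consults sys.modules first.
import foo

print(foo is foo_2_4)
```

Putting the two aliasing lines into a small ``foo.py`` shim, as the summary suggests, moves the choice of version into one file instead of every importing module.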

RE: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls)
by Michael Chermside Feb. 1, 2005

Evan Jones writes:
> My knowledge about garbage collection is weak, but I have read a little
> bit of Hans Boehm's work on garbage collection. [...] The biggest
> disadvantage mentioned is that simple pointer assignments end up
> becoming "increment ref count" operations as well...

Hans Boehm certainly has some excellent points. I believe a little searching through the Python dev archives will reveal that attempts have been made in the past to use his GC tools with CPython, and that the results have been disappointing. That may be because other parts of CPython are optimized for reference counting, or it may be just because this stuff is so bloody difficult!

However, remember that changing away from reference counting is a change to the semantics of CPython. Right now, people can (and often do) assume that objects which don't participate in a reference loop are collected as soon as they go out of scope. They write code that depends on this... idioms like:

    >>> text_of_file = open(file_name, 'r').read()

Perhaps such idioms aren't a good practice (they'd fail in Jython or in IronPython), but they ARE common. So we shouldn't stop using reference counting unless we can demonstrate that the alternative is clearly better. Of course, we'd also need to devise a way for extensions to cooperate (which is a problem Jython, at least, doesn't face).

So it's NOT an obvious call, and so far numerous attempts to review other GC strategies have failed. I wouldn't be so quick to dismiss reference counting.

> My only argument for making Python capable of leveraging multiple
> processor environments is that multithreading seems to be where the big
> performance increases will be in the next few years. I am currently
> using Python for some relatively large simulations, so performance is
> important to me.

CPython CAN leverage such environments, and it IS used that way. However, this requires using multiple Python processes and inter-process communication of some sort (there are lots of choices, take your pick). It's a technique which is more trouble for the programmer, but in my experience it is usually less likely to contain subtle parallel processing bugs. Sure, it'd be great if Python threads could make use of separate CPUs, but if the cost of that were that Python dictionaries performed as poorly as a Java Hashtable or synchronized HashMap, then it wouldn't be worth the cost. There's a reason why Java moved away from Hashtable (the threadsafe data structure) to HashMap (not threadsafe).

Perhaps the REAL solution is just a really good IPC library that makes it easier to write programs that launch "threads" as separate processes and communicate with them. No change to the internals, just a new library to encourage people to use the technique that already works.

-- Michael Chermside
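The multi-process technique described in the message above can be sketched with the standard library's subprocess module. This is only an illustration under assumptions: the worker program, the function name sum_of_squares_in_processes, and the text-over-pipes protocol are all invented here and do not come from the thread.

```python
import subprocess
import sys

# Hypothetical worker program run in each child interpreter:
# read whitespace-separated integers from stdin, print the sum of squares.
WORKER_SOURCE = (
    "import sys\n"
    "numbers = [int(tok) for tok in sys.stdin.read().split()]\n"
    "print(sum(n * n for n in numbers))\n"
)


def sum_of_squares_in_processes(numbers, n_workers=2):
    """Split the work across separate Python processes and combine the results."""
    chunks = [numbers[i::n_workers] for i in range(n_workers)]

    # Start every worker and hand it its input before reading any result,
    # so the children can run at the same time on separate CPUs.
    workers = []
    for chunk in chunks:
        worker = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        worker.stdin.write(" ".join(str(n) for n in chunk))
        worker.stdin.close()  # signals end of input to the child
        workers.append(worker)

    # Collect one line of output per worker.
    total = 0
    for worker in workers:
        total += int(worker.stdout.read())
        worker.stdout.close()
        worker.wait()
    return total


if __name__ == "__main__":
    print(sum_of_squares_in_processes(list(range(10))))
```

Each child has its own interpreter, so no Python objects are shared and no locking is needed; the price is that all data has to be serialized across the pipes.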

RE: [Python-Dev] Speed up function calls
by Raymond Hettinger Jan. 31, 2005

> > In theory, I don't see how you could improve on METH_O and METH_NOARGS.
> > The only saving is the time for the flag test (a predictable branch).
> > Offsetting that savings is the additional time for checking min/max args
> > and for constructing a C call with the appropriate number of args. I
> > suspect there is no savings here and that the timings will get worse.
>
> I think I tested a method I changed from METH_O to METH_ARGS and could
> not measure a difference.

Something is probably wrong with the measurements. The new call does much more work than METH_O or METH_NOARGS. Those two common and essential cases cannot be faster and are likely slower on at least some compilers and some machines. If some timing shows differently, then it is likely a mirage (falling into an unsustainable local minimum).

The patch introduces range checks, an extra C function call, nine variable initializations, and two additional unpredictable branches (the case statements). The only benefit (in terms of timing) is possibly saving a tuple allocation/deallocation. That benefit only kicks in for METH_VARARGS and even then only when the tuple free list is empty.

I recommend not changing ANY of the METH_O and METH_NOARGS calls. These are already close to optimal.

> A benefit would be to consolidate METH_O,
> METH_NOARGS, and METH_VARARGS into a single case. This should
> make code simpler all around (IMO).

Will backwards compatibility allow those cases to be eliminated? It would be a bummer if most existing extensions could not compile with Py2.5. Also, METH_VARARGS will likely have to hang around unless a way can be found to handle more than nine arguments.

This patch appears to be taking on a life of its own and is being applied more broadly than is necessary or wise. The patch is extensive and introduces a new C API that cannot be taken back later, so we ought to be careful with it.

For the time being, try not to touch the existing METH_O and METH_NOARGS methods. Focus on situations that do stand a chance of being improved (such as methods with a signature like "O|O").

That being said, I really like the concept. I just worry that many of the stated benefits won't materialize:

* having to keep the old versions for backwards compatibility,
* being slower than METH_O and METH_NOARGS,
* not handling more than nine arguments,
* separating function signature info from the function itself,
* the time to initialize all the argument variables to NULL,
* somewhat unattractive case stmt code for building the c function call.

Raymond
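Since the exchange above turns on whether a timing difference can be measured at all, here is a minimal sketch of one common way to reduce noise: repeat the measurement several times and keep the fastest run. The helper name best_time_per_call is hypothetical, and the two timed statements are arbitrary stand-ins for a one-argument and a no-argument call; which METH_* flag a given builtin uses depends on the CPython version, and nothing here exercises the patch under discussion.

```python
import timeit


def best_time_per_call(stmt, setup="pass", number=100_000, repeat=5):
    """Return the fastest observed time per execution of stmt, in seconds.

    The minimum over several repeats is used because slower runs mostly
    reflect interference from other activity on the machine.
    """
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number


# Arbitrary stand-ins: a call taking one argument and a call taking none.
one_arg = best_time_per_call("len(s)", setup="s = 'abc'")
no_arg = best_time_per_call("s.isdigit()", setup="s = 'abc'")

print("one argument: %.1f ns per call" % (one_arg * 1e9))
print("no arguments: %.1f ns per call" % (no_arg * 1e9))
```

Differences of a few nanoseconds between two such numbers are easily within run-to-run variation, which is one reason a "could not measure a difference" result is hard to interpret.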

linux executable - how?
by apocalypznow Jan. 31, 2005

How can I take my Python scripts and create a Linux executable out of them (to be distributed without having to also distribute Python)?

Patch review: [ 1094542 ] add Bunch type to collections module
by Alan Green Jan. 31, 2005

Steven Bethard is proposing a new collection class named Bunch. I had a few suggestions which I attached as comments to the patch - but what is really required is a bit more work on the draft PEP, and then discussion on the python-dev mailing list.

http://sourceforge.net/tracker/?func=detail&aid=1100942&group_id=5470&atid=…

Alan.
--
Alan Green
alan.green(a)cardboard.nu - http://cardboard.nu

Bug tracker reviews
by Tony Meyer Jan. 30, 2005

As promised, here are five bug reviews with recommendations. If they help

[ 1112812 ] Patch for Lib/bsddb/__init__.py to work with modulefinder
<http://sourceforge.net/tracker/index.php?func=detail&aid=1112812&group_id=5470&atid=305470>

get reviewed, then that'd be great. Otherwise I'll just take the good karma and run :)

-----

[ 531205 ] Bugs in rfc822.parseaddr()
<http://sourceforge.net/tracker/index.php?func=detail&aid=531205&group_id=5470&atid=…

Should Python's library modules be written to help the freeze tools?
by Tony Meyer Jan. 30, 2005

The Python 2.4 Lib/bsddb/__init__.py contains this:

"""
# for backwards compatibility with python versions older than 2.3, the
# iterator interface is dynamically defined and added using a mixin
# class. old python can't tokenize it due to the yield keyword.
if sys.version >= '2.3':
    exec """
import UserDict
from weakref import ref
class _iter_mixin(UserDict.DictMixin):
...
"""

Because the imports are inside an exec, modulefinder (e.g. when using bsddb with a py2exe built application) …
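Why a static scanner misses imports hidden inside an exec string can be sketched as follows. This is an illustration, not modulefinder itself: it uses the ast module as a stand-in for any tool that inspects code without running it, the helper name statically_visible_imports is hypothetical, and the "visible" variant is just one possible arrangement, not necessarily the patch proposed in the thread.

```python
import ast


def statically_visible_imports(source):
    """Return the module names that a static scan of source can see."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module)
    return found


# Imports buried in a string: to a scanner they are just string data.
HIDDEN = '''
import sys
if sys.version >= '2.3':
    exec("import UserDict\\nfrom weakref import ref")
'''

# The same imports written as ordinary statements under the version guard.
VISIBLE = '''
import sys
if sys.version >= '2.3':
    import UserDict
    from weakref import ref
'''

print(sorted(statically_visible_imports(HIDDEN)))
print(sorted(statically_visible_imports(VISIBLE)))
```

The source is only parsed, never executed, so it does not matter that UserDict no longer exists in current Python versions.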

RE: [Python-Dev] Should Python's library modules be written to help the freeze tools?
by Tony Meyer Jan. 30, 2005

[Tony Meyer]
> The main question (to steal Thomas's words) is whether the
> library modules should be written to help the freeze tools
> - if the answer is 'yes', then I'll submit the above as a
> patch for 2.5.

[Martin v. Löwis]
> The answer to this question certainly is "yes, if possible". In this
> specific case, I wonder whether the backwards compatibility is still
> required in the first place. According to PEP 291, Greg Smith and
> Barry Warsaw decide on this, so …

Improving the Python Memory Allocator
by Evan Jones Jan. 30, 2005

This message is a follow-up to a thread I started on python-dev back in October, archived here:

http://mail.python.org/pipermail/python-dev/2004-October/049480.html

Basically, the problem I am trying to solve is that the Python memory allocator never frees memory back to the operating system. I have attached a patch against obmalloc.c for discussion. The patch still has some rough edges and possibly some bugs, so I don't think it should be merged as is. However, I would appreciate any feedback on the chances for getting this implementation into the core. The rest of this message lists some disadvantages to this implementation, a description of the important changes, a benchmark, and my future plans if this change gets accepted.

The patch works for any version of Python that uses obmalloc.c (which includes Python 2.3 and 2.4), but I did my testing with Python 2.5 from CVS under Linux and Mac OS X. This version of the allocator will actually free memory. It has two disadvantages:

First, there is slightly more overhead with programs that allocate a lot of memory, release it, then reallocate it. The original allocator simply holds on to all the memory, allowing it to be efficiently reused. This allocator will call free(), so it also must call malloc() again when the memory is needed. I have a "worst case" benchmark which shows that this cost isn't too significant, but it could be a problem for some workloads. If it is, I have an idea for how to work around it.

Second, the previous allocator went out of its way to permit a module to call PyObject_Free while another thread is executing PyObject_Malloc. Apparently, this was a backwards compatibility hack for old Python modules which erroneously call these functions without holding the GIL. These modules will have to be fixed if this implementation is accepted into the core.

Summary of the changes:

- Add an "arena_object" structure for tracking pages that belong to each 256kB arena.
- Change the "arenas" array from an array of pointers to an array of arena_object structures.
- When freeing a page (a pool), it is placed on a free pool list for the arena it belongs to, instead of a global free pool list.
- When freeing a page, if the arena is completely unused, the arena is deallocated.
- When allocating a page, it is taken from the arena that is the most full. This gives arenas that are almost completely unused a chance to be freed.

Benchmark:

The only benchmark I have performed at the moment is the worst case for this allocator: a program that allocates 1 000 000 Python objects which occupy nearly 200MB, frees them, reallocates them, then quits. I ran the program four times, and discarded the initial time. Here is the object:

class Obj:
    def __init__( self ):
        self.dumb = "hello"

And here are the average execution times for this program:

Python 2.5:
real time: 16.304
user time: 16.016
system: 0.257

Python 2.5 + patch:
real time: 16.062
user time: 15.593
system: 0.450

As expected, the patched version spends nearly twice as much system time as the original version. This is because it calls free() and malloc() twice as many times. However, this difference is offset by the fact that the user space execution time is actually *less* than the original version. How is this possible? The likely cause is that the original version defined the arenas pointer to be "volatile" in order to work when Free and Malloc were called simultaneously. Since this version breaks that, the pointer no longer needs to be volatile, which allows the value to be stored in a register instead of being read from memory on each operation.

Here are some graphs of the memory allocator behaviour running this benchmark.

Original: http://www.eng.uwaterloo.ca/~ejones/original.png
New: http://www.eng.uwaterloo.ca/~ejones/new.png

Future Plans:

- More detailed benchmarking.
- The "specialized" allocators for the basic types, such as ints, also need to free memory back to the system.
- Potentially the allocator should keep some amount of free memory around to improve the performance of programs that cyclically allocate and free large amounts of memory. This amount should be "self-tuned" to the application.

Thank you for your feedback,

Evan Jones
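The pool and arena bookkeeping listed under "Summary of the changes" can be modelled in a few lines. This is a toy sketch in Python, not the obmalloc.c patch: the class and method names are hypothetical, each arena's free pool list is reduced to a counter, and the figure of 64 pools per arena assumes 4 kB pools inside the 256 kB arenas mentioned in the message.

```python
POOLS_PER_ARENA = 64  # assumption: 256 kB arena divided into 4 kB pools


class Arena:
    """Toy stand-in for arena_object: each arena tracks its own free pools."""

    def __init__(self, arena_id):
        self.arena_id = arena_id
        self.free_pools = POOLS_PER_ARENA  # per-arena free list, as a count

    @property
    def used_pools(self):
        return POOLS_PER_ARENA - self.free_pools


class ToyAllocator:
    def __init__(self):
        self.arenas = {}   # arena_id -> Arena
        self.next_id = 0
        self.released = 0  # arenas handed back to the "operating system"

    def allocate_pool(self):
        """Take a pool from the fullest arena that still has a free pool."""
        candidates = [a for a in self.arenas.values() if a.free_pools > 0]
        if candidates:
            arena = max(candidates, key=lambda a: a.used_pools)
        else:
            arena = Arena(self.next_id)  # stands in for malloc() of an arena
            self.arenas[arena.arena_id] = arena
            self.next_id += 1
        arena.free_pools -= 1
        return arena.arena_id

    def free_pool(self, arena_id):
        """Return a pool to its own arena; release the arena once it is empty."""
        arena = self.arenas[arena_id]
        arena.free_pools += 1
        if arena.used_pools == 0:
            del self.arenas[arena_id]    # stands in for free() of the arena
            self.released += 1


alloc = ToyAllocator()
# One pool more than fits in a single arena, so a second arena is created.
owners = [alloc.allocate_pool() for _ in range(POOLS_PER_ARENA + 1)]
# Freeing the only pool of the second arena empties it, so it is released.
alloc.free_pool(owners.pop())
print(alloc.released, sorted(alloc.arenas))
```

Preferring the fullest arena concentrates live pools in as few arenas as possible, which is what gives the nearly empty ones a chance to drain and be returned.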

Weekly Python Patch/Bug Summary
by Kurt B. Kaiser Jan. 29, 2005

Patch / Bug Summary
___________________

Patches :  280 open ( +7) /  2747 closed ( +1) /  3027 total ( +8)
Bugs    :  803 open ( +6) /  4799 closed (+10) /  5602 total (+16)
RFE     :  167 open ( +1) /   141 closed ( +0) /   308 total ( +1)

New / Reopened Patches
______________________

tarfile.ExFileObject iterators  (2005-01-23)
       http://python.org/sf/1107973  opened by  Mitch Chapman

Allow slicing of any iterator by default  (2005-01-24)
       http://python.org/sf/1108272  opened by …
