Mailman 3 December 2005 - Python-Dev

fixing log messages
by Fredrik Lundh March 8, 2006

March 8, 2006

just noticed an embarrasing misspelling in one of my recent checkins, only to find that I cannot fix it: $ svn propedit --revprop -r 41759 svn:log svn: Repository has not been enabled to accept revision propchanges; ask the administrator to create a pre-revprop-change hook $ would it be a good idea to ask the administrator to do this ? </F>

4 8

problem with genexp
by Neal Norwitz Feb. 24, 2006

Feb. 24, 2006

There's a problem with genexp's that I think really needs to get fixed. See http://python.org/sf/1167751 the details are below. This code: >>> foo(a = i for i in range(10)) generates "NameError: name 'i' is not defined" when run because: 2 0 LOAD_GLOBAL 0 (foo) 3 LOAD_CONST 1 ('a') 6 LOAD_GLOBAL 1 (i) 9 CALL_FUNCTION 256 12 POP_TOP 13 LOAD_CONST … [View More]

5 10

Re: [Python-Dev] [Doc-SIG] that library reference, again
by David Goodger Jan. 30, 2006

Jan. 30, 2006

On 12/29/05, Fredrik Lundh <fredrik(a)pythonware.com> wrote: > however, given that the discussion that led up to this has been dead for > almost a week, You mean since Christmas? > I'm beginning to fear that I've wasted my time on a project > that nobody's interested in. Could be. I don't see HTML+PythonDoc as a significant improvement over LaTeX. Yes, I'm biased. So are you. > are we stuck with latex for the foreseeable future? Until we have something clearly and … [View More]

12 22

New PEP: Using ssize_t as the index type
by "Martin v. Löwis" Jan. 19, 2006

Jan. 19, 2006

Please let me know what you think. Regards, Martin PEP: XXX Title: Using ssize_t as the index type Version: $Revision$ Last-Modified: $Date$ Author: Martin v. Löwis <martin(a)v.loewis.de> Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 18-Dec-2005 Post-History: Abstract ======== In Python 2.4, indices of sequences are restricted to the C type int. On 64-bit machines, sequences therefore cannot use the full address space, and are restricted to 2**31 elements. … [View More]This PEP proposes to change this, introducing a platform-specific index type Py_ssize_t. An implementation of the proposed change is in http://svn.python.org/projects/python/branches/ssize_t. Rationale ========= 64-bit machines are becoming more popular, and the size of main memory increases beyond 4GiB. On such machines, Python currently is limited, in that sequences (strings, unicode objects, tuples, lists, array.arrays, ...) cannot contain more than 2GiElements. Today, very few machines have memory to represent larger lists: as each pointer is 8B (in a 64-bit machine), one needs 16GiB to just hold the pointers of such a list; with data in the list, the memory consumption grows even more. However, there are three container types for which users request improvements today: * strings (currently restricted to 2GiB) * mmap objects (likewise; plus the system typically won't keep the whole object in memory concurrently) * Numarray objects (from Numerical Python) As the proposed change will cause incompatibilities on 64-bit machines, it should be carried out while such machines are not in wide use (IOW, as early as possible). Specification ============= A new type Py_ssize_t is introduced, which has the same size as the compiler's size_t type, but is signed. It will be a typedef for ssize_t where available. The internal representation of the length fields of all container types is changed from int to ssize_t, for all types included in the standard distribution. In particular, PyObject_VAR_HEAD is changed to use Py_ssize_t, affecting all extension modules that use that macro. All occurrences of index and length parameters and results are changed to use Py_ssize_t, including the sequence slots in type objects. New conversion functions PyInt_FromSsize_t, PyInt_AsSsize_t, PyLong_AsSsize_t are introduced. PyInt_FromSsize_t will transparently return a long int object if the value exceeds the MAX_INT. New function pointer typedefs ssizeargfunc, ssizessizeargfunc, ssizeobjargproc, and ssizessizeobjargproc are introduced. A new conversion code 'n' is introduced for PyArg_ParseTuple and Py_BuildValue, which operates on Py_ssize_t. The conversion codes 's#' and 't#' will output Py_ssize_t if the macro PY_SIZE_T_CLEAN is defined before Python.h is included, and continue to output int if that macro isn't defined. At places where a conversion from size_t/Py_ssize_t to int is necessary, the strategy for conversion is chosen on a case-by-case basis (see next section). Conversion guidelines ===================== Module authors have the choice whether they support this PEP in their code or not; if they support it, they have the choice of different levels of compatibility. If a module is not converted to support this PEP, it will continue to work unmodified on a 32-bit system. On a 64-bit system, compile-time errors and warnings might be issued, and the module might crash the interpreter if the warnings are ignored. Conversion of a module can either attempt to continue using int indices, or use Py_ssize_t indices throughout. If the module should continue to use int indices, care must be taken when calling functions that return Py_ssize_t or size_t, in particular, for functions that return the length of an object (this includes the strlen function and the sizeof operator). A good compiler will warn when a Py_ssize_t/size_t value is truncated into an int. In these cases, three strategies are available: * statically determine that the size can never exceed an int (e.g. when taking the sizeof a struct, or the strlen of a file pathname). In this case, add a debug assert() that the value is indeed smaller than INT_MAX, and cast the value to int. * statically determine that the value shouldn't overflow an int unless there is a bug in the C code somewhere. Test whether the value is smaller than INT_MAX, and raise an InternalError if it isn't. * otherwise, check whether the value fits an int, and raise a ValueError if it doesn't. The same care must be taking for tp_as_sequence slots, in addition, the signatures of these slots change, and the slots must be explicitly recast (e.g. from intargfunc to ssizeargfunc). Compatibility with previous Python versions can be achieved with the test:: #if PY_VERSION_HEX < 0x02050000 typedef int Py_ssize_t; #endif and then using Py_ssize_t in the rest of the code. For the tp_as_sequence slots, additional typedefs might be necessary; alternatively, by replacing:: PyObject* foo_item(struct MyType* obj, int index) { ... } with:: PyObject* foo_item(PyObject* _obj, Py_ssize_t index) { struct MyType* obj = (struct MyType*)_obj; ... } it becomes possible to drop the cast entirely; the type of foo_item should then match the sq_item slot in all Python versions. If the module should be extended to use Py_ssize_t indices, all usages of the type int should be reviewed, to see whether it should be changed to Py_ssize_t. The compiler will help in finding the spots, but a manual review is still necessary. Particular care must be taken for PyArg_ParseTuple calls: they need all be checked for s# and t# converters, and PY_SIZE_T_CLEAN must be defined before including Python.h if the calls have been updated accordingly. Discussion ========== Why not size_t -------------- An initial attempt to implement this feature tried to use size_t. It quickly turned out that this cannot work: Python uses negative indices in many places (to indicate counting from the end). Even in places where size_t would be usable, to many reformulations of code where necessary, e.g. in loops like:: for(index = length-1; index >= 0; index--) This loop will never terminate if index is changed from int to size_t. Doesn't this break much code? ----------------------------- With the changes proposed, code breakage is fairly minimal. On a 32-bit system, no code will break, as Py_ssize_t is just a typedef for int. On a 64-bit system, the compiler will warn in many places. If these warnings are ignored, the code will continue to work as long as the container sizes don't exceeed 2**31, i.e. it will work nearly as good as it does currently. There are two exceptions to this statement: if the extension module implements the sequence protocol, it must be updated, or the calling conventions will be wrong. The other exception is the places where Py_ssize_t is output through a pointer (rather than a return value); this applies most notably to codecs and slice objects. If the conversion of the code is made, the same code can continue to work on earlier Python releases. Doesn't this consume too much memory? ------------------------------------- One might think that using Py_ssize_t in all tuples, strings, lists, etc. is a waste of space. This is not true, though: on a 32-bit machine, there is no change. On a 64-bit machine, the size of many containers doesn't change, e.g. * in lists and tuples, a pointer immediately follows the ob_size member. This means that the compiler currently inserts a 4 padding bytes; with the change, these padding bytes become part of the size. * in strings, the ob_shash field follows ob_size. This field is of type long, which is a 64-bit type on most 64-bit systems (except Win64), so the compiler inserts padding before it as well. Copyright ========= This document has been placed in the public domain. [View Less]

11 27

Re: [Python-Dev] Automated Python testing (was Re: status of development documentation)
by "Martin v. Löwis" Jan. 6, 2006

Jan. 6, 2006

Jean-Paul Calderone wrote: > I guess the config for this particular behavior would look something > like... You were right that I needed two schedulers for that. Unfortunately, it doesn't work at all, because svn_buildbot.py does not report branches on which a change happened, so if you have multiple schedulers for a subversion source, they either all build when a change occurs, or none of them. If svn_version knew about branches (which I'll have to implement, or incorporate the patch … [View More]

3 3

Jython and CPython
by Fredrik Lundh Jan. 5, 2006

Jan. 5, 2006

BTW, what's the policy wrt. Jython-specific modules in the standard library? Expat isn't available under Jython, but I have a Java/Jython-driver for ElementTree on my disk. Can / should this go into the CPython standard library ? </F>

5 4

slight inconsistency in svn checkin email subject lines
by Trent Mick Jan. 4, 2006

Jan. 4, 2006

Here are the subject lines for two recent svn commit emails: [Python-checkins] commit of r41847 - in python/trunk: Lib/test/test__locale.py Python/as... [Python-checkins] commit of r41848 - python/trunk/setup.py ^ `--- one extra space There is an extra space when the checkin includes exactly one file (at least, I think that is the condition). Is this intentional? If not, could someone point me to where the svn trigger scripts are maintained so … [View More]

5 8

a quit that actually quits
by Fredrik Lundh Jan. 3, 2006

Jan. 3, 2006

sourceforge just went off the air, so I'm posting this patch here, in order to distract you all from Christian's deque thread. this silly little patch changes the behaviour of the interpreter so that "quit" and "exit" actually exits the interpreter. it does this by installing a custom excepthook that looks for NameErrors at the top level, in interactive mode only. whaddya think? </F> Index: Lib/site.py =================================================================== --- Lib/site.… [View More]

28 79

Re: [Python-Dev] When do sets shrink?
by Raymond Hettinger Jan. 1, 2006

Jan. 1, 2006

[Noam] > For example, iteration over a set which once had > 1,000,000 members and now has 2 can take 1,000,000 operations every > time you traverse all the (2) elements. Do you find that to be a common or plausible use case? Was Guido's suggestion of s=set(s) unworkable for some reason? dicts and sets emphasize fast lookups over fast iteration -- apps requiring many iterations over a collection may be better off converting to a list (which has no dummy entries or empty gaps between … [View More]

5 7

When do sets shrink?
by Noam Raphael Dec. 31, 2005

Dec. 31, 2005

Hello, If I do something like this: s = set() for i in xrange(1000000): s.add(i) while s: s.pop() gc.collect() the memory consumption of the process remains the same even after the pops. I checked the code (that's where I started from, really), and there's nothing in set.pop or set.remove that resizes the table. And it turns out that it's the same with dicts. Should something be done about it? Noam

8 20