From scott+python-dev at  Mon Nov  1 00:08:00 2010
From: scott+python-dev at (Scott Dial)
Date: Sun, 31 Oct 2010 19:08:00 -0400
Subject: [Python-Dev] [Python-checkins] r85934 - in
 python/branches/py3k: Misc/NEWS Modules/socketmodule.c
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 10/30/2010 4:08 PM, Martin v. Löwis wrote:
>> I think size should be in TCHARs, not in bytes. (MSDN says so)
>> And GetComputerName's signature differs from MSDN. (Maybe should
>> use GetComputerNameExW again?)
> You are right. So how about this patch?

Still not quite right. The call to GetComputerNameExW after
ERROR_MORE_DATA (which gives the number of *bytes* needed) still needs
to pass "size/sizeof(wchar_t)" back into GetComputerNameExW since it
wants the number of TCHARs. I don't think the +1 is needed either (MSDN
says it already includes the null terminator in the byte count).

Scott Dial
scott at
scodial at

From benjamin at  Mon Nov  1 00:20:07 2010
From: benjamin at (Benjamin Peterson)
Date: Sun, 31 Oct 2010 18:20:07 -0500
Subject: [Python-Dev] str.format_from_mapping
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

2010/10/31 Glenn Linderman <v+python at>:
> On 10/31/2010 3:32 PM, Eric Smith wrote:
> On 10/31/2010 6:28 PM, Glenn Linderman wrote:
> On 10/31/2010 2:02 PM, Benjamin Peterson wrote:
> 2010/10/31 Antoine Pitrou<solipsis at>:
>> On Sun, 31 Oct 2010 16:39:44 -0400
>> Eric Smith <eric at> wrote:
>>> What are your thoughts on adding a str.format_from_mapping (or similar
>>> name, maybe the suggested "format_map") to 3.2? See
>>> . This method would be similar to
>>> "%(foo)s %(bar)s" % d, where d is a dict (or rather any mapping object),
>>> but of course would use str.format syntax: "{foo}
>>> {bar}".format_from_mapping(d).
>> I must be missing something, but what's the difference with
>> XXX.format(**d)?
> It allows arbitrary mappings.
> Other than the language moratorium, why are arbitrary mappings not
> allowed for the (**d) syntax?
> An arbitrary mapping would be converted to a dict.
> Yes, but why convert?

Because callees always get a dictionary *copy* of the arguments.
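
The difference shows up with a mapping that computes values for missing keys. A minimal sketch, assuming the method ends up with the shorter name `format_map`; the `Default` class and `callee` function are hypothetical helpers for illustration only:

```python
class Default(dict):
    """Hypothetical mapping that invents a value for any missing key."""

    def __missing__(self, key):
        return "<" + key + ">"


def callee(**kwargs):
    # The callee receives a fresh plain dict, not the object passed with **.
    return type(kwargs)


d = Default(foo="spam")

# The mapping is used directly, so __missing__ is consulted for 'bar'.
direct = "{foo} {bar}".format_map(d)

# ** unpacking copies the items into a plain dict, so 'bar' is simply absent.
try:
    "{foo} {bar}".format(**d)
    unpacked_ok = True
except KeyError:
    unpacked_ok = False
```

With `**d` the custom lookup behaviour is lost in the copy, which is why passing the mapping itself can give a different result.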


From barry at  Mon Nov  1 01:45:21 2010
From: barry at (Barry Warsaw)
Date: Sun, 31 Oct 2010 20:45:21 -0400
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <>
References: <>
Message-ID: <20101031204521.263769bf@mission>

On Oct 31, 2010, at 09:54 AM, Gregory P. Smith wrote:

>> - moving the documentation to an "advanced" or "complete reference" section
>Agreed, I prefer simply deemphasizing these methods by reorganizing the
>documentation and mentioning in their documentation to, "just use
>assertEqual."  De-documenting them is the first step towards causing
>unnecessary pain by taking either of the next two steps:
>- make the methods non-public by prepending an underscore
>> - leaving them public but adding deprecation warnings to the code
>Please do not make any existing released methods from the unittest module
>non-public or add any deprecation warnings.  That will simply cause
>unnecessary code churn and pain for people porting their code from one
>version to the next without benefiting anyone.

I was hoping someone would get my not-too-subtle hint. :)

From barry at  Mon Nov  1 01:53:05 2010
From: barry at (Barry Warsaw)
Date: Sun, 31 Oct 2010 20:53:05 -0400
Subject: [Python-Dev] str.format_from_mapping
In-Reply-To: <>
References: <>
Message-ID: <20101031205305.012eb563@snowdog>

On Oct 31, 2010, at 04:39 PM, Eric Smith wrote:

>What are your thoughts on adding a str.format_from_mapping (or similar
>name, maybe the suggested "format_map") to 3.2?

+1 for the shorter name.

From kristjan at  Mon Nov  1 03:32:01 2010
From: kristjan at (Kristján Valur Jónsson)
Date: Mon, 1 Nov 2010 10:32:01 +0800
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <>
References: <>
Message-ID: <>

You just moved your copying down one level into
This magic function must be implemented by possibly concatenating several "socket.recv()" calls.
This invariably involves data copying, either by "".join() or stringio.write()

-----Original Message-----
From: at [ at] On Behalf Of "Martin v. Löwis"
Sent: Friday, October 29, 2010 18:15
To: python-dev at
Subject: Re: [Python-Dev] new buffer in python2.7
That is easy to achieve using the existing API:

def read_and_unpack(stream, format):
    data =
    return struct.unpack(format, data)

> Otherwise, I'm +1 on your suggestion, avoiding copying is a good thing.

I believe my function also doesn't involve any unnecessary copies.

From fuzzyman at  Mon Nov  1 03:55:35 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 01 Nov 2010 02:55:35 +0000
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 30/10/2010 06:56, Raymond Hettinger wrote:
> On Oct 29, 2010, at 9:11 PM, Michael Foord wrote:
>>> Just to clarify. The following fails in Python 3:
>>> sorted([3, 1, 2, None])
>>> If you want to compare that two iterables containing heterogeneous 
>>> types have the same members then it is tricky to implement correctly 
>>> and assertItemsEqual does it for you.
>>> I agree that the name is not ideal and would be happy to change the 
>>> name (deprecating the old name as it was released in 2.7). API churn 
>>> is as bad as API bloat, but at least changing the name is something 
>>> only done once.
>> Sorry for the noise. Suggested alternative name:
>> assertElementsEqual
>> The docs need updating to make it clear that the method isn't just a 
>> synonym for assertEqual(sorted(iter1), sorted(iter2)) and that it 
>> works with unorderable types.
> I looked at this again and think we should just remove 
> assertItemsEqual() from Py3.2 and dedocument it in Py2.7. It is listed 
> as being new in 3.2 so nothing is lost.

As it has been released in 2.7 (and in unittest2 for earlier versions of 
Python) removing it would add another pain point for those porting from 
Python 2 to 3. From a backwards compatibility point of view this method 
has been released (it is only new in 3.2 for the Python 3 series).

Note that for this issue, plus the other cleanup-related topics we have 
been discussing, Raymond has created issue 10273:

> A new name like assertElementsEqual is an improvement because it 
> doesn't suggest something like assertEqual(d.items(), d.items()), but 
> it falls short in describing its key features:
> * the method doesn't care about order
Something that implied order would be good but we shouldn't let the 
perfect be the enemy of the good.

> * it does care about duplicates
Both the old name and the new one imply that it does care about 
duplicates (to me at least).

> * it doesn't need hashability
> * it can handle sequences of non-comparable types

The name doesn't imply that it needs hashability or comparable types 
either (although the latter needs to be documented as the current 
documentation could be read as saying that comparable types are needed). 
The name doesn't need to include all its *non-requirements*, it just 
needs to describe what it does.
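
For reference, the behaviour being described (order ignored, duplicates counted, no hashing or ordering required) can be sketched in a few lines. This illustrates the quadratic approach only and is not the unittest implementation; `same_elements` is a made-up name:

```python
def same_elements(first, second):
    """Return True if both iterables hold the same items, ignoring order.

    Only == is used, so items need not be hashable or orderable.
    Each lookup scans the remaining list, hence O(n**2) overall.
    """
    remaining = list(second)
    for item in first:
        try:
            remaining.remove(item)  # removes one matching occurrence
        except ValueError:
            return False  # item missing, or present too few times
    return not remaining  # leftovers mean second had extra items


mixed_ok = same_elements([3, 1, 2, None], [None, 2, 1, 3])
duplicates_differ = same_elements([1, 1, 2], [1, 2, 2])
unhashable_ok = same_elements([[1], {2}], [{2}, [1]])
```

The first and third calls succeed even though the lists cannot be sorted or put into a set; the second fails because the duplicate counts differ.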

> Also, I think the O(n**2) behavior is unexpected.

I agree that this should be fixed.

> There is a O(n log n) fast-path but it has a bug and needs to be 
> removed. See issue 10242.
Having a more efficient 'slow-path' and moving to that by default would 
fix it. The bug is only a duplicate of the bug in sorted - caused by the 
fact that sets / frozensets can't be sorted in the standard Python way 
(their less than comparison adheres to the set definition). This is 
something that will probably surprise many Python developers:

 >>> a = [{2,4}, {1,2}]
 >>> b = a[::-1]
 >>> sorted(a)
[set([2, 4]), set([1, 2])]
 >>> sorted(b)
[set([1, 2]), set([2, 4])]

(Fixing the bug in sorted would fix assertItemsEqual ;-)

As I stated in my previous email, the functionality is still useful. Add 
to that the fact that this has already been released and I'm -1 on removing, 
+1 on fixing the O(n**2) behaviour and +0 on an alternative name.

> The sole benefit over the more explicit variants like 
> assertEqual(set(a), set(b)) and assertEqual(sorted(a), sorted(b)) is 
> that it handles a somewhat rare corner case where neither of those 
> work (unordered comparison of non-comparable types when you do care 
> about duplicates). That particular case doesn't come-up much and isn't 
> suggested by either the current name or its proposed replacement.
I have test suites littered with self.assertEqual(sorted(expected), 
sorted(actual)) - anywhere I care about the contents of a sequence but 
not about the order it is generated in (perhaps created by iteration 
over a set or dictionary). It is not uncommon for these lists to contain 
None which makes them un-sortable in Python 3. Decorating the members 
with something that allows a stable sort would fix that - and that is 
one possible fix for the efficiency issue. It would probably propagate 
the issue that sets / frozensets don't work with sorted.
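
One way to read "decorating the members" is a key function that puts a type tag in front of each value, so that values of different types are never compared with each other. A rough sketch under that assumption; `type_tagged` is a hypothetical helper, and it does nothing for sets, whose `<` means "proper subset":

```python
def type_tagged(value):
    # Compare by type name first; values are only compared within one type.
    return (type(value).__name__, value)


expected = [3, 1, 2, None]
actual = [None, 2, 3, 1]

# Plain sorted(expected) raises TypeError in Python 3 because of the None.
same = sorted(expected, key=type_tagged) == sorted(actual, key=type_tagged)
```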

> FWIW, I checked-out some other unittest suites in other languages and 
> did not find an equivalent. That strongly suggests this is YAGNI and 
> it shouldn't be added in Py3.2. There needs to be more evidence of 
> need before putting this in. And if it goes in, it needs a really good 
> name that tells what operations are hidden behind the abstraction. 
> When reading a test assertion, it is vital that the reader understand 
> exactly what is being tested. It's an API fail if a reader guesses 
> that assertElementsEqual(a,b) means list(a)==list(b); the test will 
> pass unintentionally.

I agree very much that asserts need to be readable. I think 
assertSameElements is "good enough" on this score though.

All the best,


> See:
> Raymond


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From martin at  Mon Nov  1 07:22:14 2010
From: martin at ("Martin v. Löwis")
Date: Mon, 01 Nov 2010 07:22:14 +0100
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <>
References: <>	<>
Message-ID: <>

>> def read_and_unpack(stream, format): 
>>   data =
>>   return struct.unpack(format, data)
>>> Otherwise, I'm +1 on your suggestion, avoiding copying is a good
>>> thing.
>> I believe my function also doesn't involve any unnecessary copies.
> You just moved your copying down one level into This
> magic function must be implemented by possibly concatenating
> several "socket.recv()" calls.
> This invariably involves data copying, either by "".join() or
> stringio.write()

Assuming there are multiple recv calls. For a typical struct, all data
will come out of the stream with a single recv, so no join will be
necessary.


From stefan_ml at  Mon Nov  1 09:35:41 2010
From: stefan_ml at (Stefan Behnel)
Date: Mon, 01 Nov 2010 09:35:41 +0100
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <>
References: <>	<>	<>	<1288178142.3533.9.camel@localhost.localdomain>	<>
Message-ID: <ialu4t$32b$>

Kristján Valur Jónsson, 27.10.2010 16:32:
> Sorry, here the tables properly formatted:

Certainly looked better on your first try.


From stefan_ml at  Mon Nov  1 09:45:06 2010
From: stefan_ml at (Stefan Behnel)
Date: Mon, 01 Nov 2010 09:45:06 +0100
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <>
References: <>	<>	<>	<1288178142.3533.9.camel@localhost.localdomain>	<>
Message-ID: <ialumi$67g$>

Kristján Valur Jónsson, 27.10.2010 16:28:
> Notice how  a Slice object is generated.  Then a PyObject_GetItem() is
> done.  The salient code path is from apply_slice().  A slice object must
> be constructed and destroyed.

If slice object creation bothers you here, it might be worth using a free 
list in PySlice_New() instead of creating new slice objects on request.

Creating a slice of something is not necessarily such a costly operation 
that it dominates creating the slice object, so optimising the slice 
request itself sounds like a good idea.

You can take a look at how it's done in tupleobject.c if you want to provide 
a patch. Then, please open a bug tracker ticket and attach the patch there 
(and post a link to the ticket in this thread).


From kristjan at  Mon Nov  1 09:57:09 2010
From: kristjan at (Kristján Valur Jónsson)
Date: Mon, 1 Nov 2010 16:57:09 +0800
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <ialumi$67g$>
References: <>
Message-ID: <>

I've already created a patch.  See
I was working with 2.7, where slicing sequences is done differently than in 3.2, so the difference is not very great.
I'm going to have another go at profiling the 3.2 version later and see why slicing a bytearray is so much more expensive than slicing a bytes object.

> -----Original Message-----
> From: at
> [ at] On Behalf
> Of Stefan Behnel
> Sent: 1. nóvember 2010 16:45
> To: python-dev at
> Subject: Re: [Python-Dev] new buffer in python2.7
> Kristján Valur Jónsson, 27.10.2010 16:28:
> > Notice how  a Slice object is generated.  Then a PyObject_GetItem()
> is
> > done.  The salient code path is from apply_slice().  A slice object
> must
> > be constructed and destroyed.
> If slice object creation bothers you here, it might be worth using a
> free
> list in PySlice_New() instead of creating new slice objects on request.
> Creating a slice of something is not necessarily such a costly
> operation
> that it dominates creating the slice object, so optimising the slice
> request itself sounds like a good idea.
> You can take a look at how it's done in tupleobject.c if you want to
> provide
> a patch. Then, please open a bug tracker ticket and attach the patch
> there
> (and post a link to the ticket in this thread).
> Stefan

From stefan_ml at  Mon Nov  1 09:58:22 2010
From: stefan_ml at (Stefan Behnel)
Date: Mon, 01 Nov 2010 09:58:22 +0100
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <ialumi$67g$>
References: <>	<>	<>	<1288178142.3533.9.camel@localhost.localdomain>	<>	<>
Message-ID: <ialvfe$7eh$>

Stefan Behnel, 01.11.2010 09:45:
> If slice object creation bothers you here, it might be worth using a
> free list in PySlice_New() instead of creating new slice objects on
> request.
> You can take a look at how it's done in tupleobject.c if you want to
> provide a patch.

Hmm, that's actually a particularly bad place to look. The implementation 
in listobject.c is much simpler.


From kristjan at  Mon Nov  1 10:09:31 2010
From: kristjan at (Kristján Valur Jónsson)
Date: Mon, 1 Nov 2010 17:09:31 +0800
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <>
References: <>
Message-ID: <>

Ah, yes.  There are, in my case.  (why do I always seem to be doing stuff that is different from what you all are doing:)
The particular piece of code is from the chunked reader. It may be reading rather large chunks at a time (several lots of Kb.):

import struct

def recvchunk(socket):
    # 4-byte length prefix, then the payload
    (length,) = struct.unpack('i', recv_exactly(socket, 4))
    return recv_exactly(socket, length)

#old style
def recv_exactly(socket, length):
    data = []
    while length:
        got = socket.recv(length)
        if not got: raise EOFError
        data.append(got)
        length -= len(got)
    return "".join(data)

#new style
def recv_exactly(socket, length):
    data = bytearray(length)
    view = memoryview(data)
    while length:
        got = socket.recv_into(view[-length:])  # returns the number of bytes read
        if not got: raise EOFError
        length -= got
    return data

Here I spot another optimization opportunity: let memoryview[:] return self, since the object is immutable, I believe.
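
The new-style version depends on a slice of a memoryview being another view onto the same buffer rather than a copy. A small check of that property, written for Python 3 and using made-up sample data:

```python
data = bytearray(b"abcdef")
view = memoryview(data)

# Slicing the view yields a new memoryview over the same underlying bytes.
tail = view[-2:]

# Writing through the slice changes the original buffer; nothing was copied.
tail[:] = b"XY"
```

This is what lets `recv_into()` fill the tail of the preallocated bytearray in place.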

> -----Original Message-----
> From: "Martin v. Löwis" [mailto:martin at]
> Sent: 1. nóvember 2010 14:22
> To: Kristján Valur Jónsson
> Cc: python-dev at
> Subject: Re: [Python-Dev] new buffer in python2.7
> Assuming there are multiple recv calls. For a typical struct, all data
> will come out of the stream with a single recv. so no join will be
> necessary.
> Regards,
> Martin

From ncoghlan at  Mon Nov  1 12:25:22 2010
From: ncoghlan at (Nick Coghlan)
Date: Mon, 1 Nov 2010 21:25:22 +1000
Subject: [Python-Dev] str.format_from_mapping
In-Reply-To: <20101031205305.012eb563@snowdog>
References: <> <20101031205305.012eb563@snowdog>
Message-ID: <>

On Mon, Nov 1, 2010 at 10:53 AM, Barry Warsaw <barry at> wrote:
> On Oct 31, 2010, at 04:39 PM, Eric Smith wrote:
>>What are your thoughts on adding a str.format_from_mapping (or similar
>>name, maybe the suggested "format_map") to 3.2?
> +1 for the shorter name.

+1 for a format_map() method that takes a single mapping argument
(Eric's patch on the issue). -1 for the most recent patch attached to
that issue that allows further positional arguments after the mapping.

(Raymond mentioned it on the issue, but I'll mention it here as well:
this addition would fall under the "case-by-case exemption" clause in
the moratorium PEP, since it adds a new method to a builtin type)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Mon Nov  1 12:32:51 2010
From: ncoghlan at (Nick Coghlan)
Date: Mon, 1 Nov 2010 21:32:51 +1000
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <>
References: <>
Message-ID: <>

2010/11/1 Kristján Valur Jónsson <kristjan at>:
> Ah, yes.  There are, in my case.  (why do I always seem to be doing stuff that is different from what you all are doing:)

I would guess that most of us aren't writing MMOs for a living. Gamers
seem to be a particularly demanding breed of user :)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From solipsis at  Mon Nov  1 12:33:31 2010
From: solipsis at (Antoine Pitrou)
Date: Mon, 1 Nov 2010 12:33:31 +0100
Subject: [Python-Dev] Stable sort and partial order
References: <>
Message-ID: <>

On Mon, 01 Nov 2010 02:55:35 +0000
Michael Foord <fuzzyman at> wrote:
> Having a more efficient 'slow-path' and moving to that by default would 
> fix it. The bug is only a duplicate of the bug in sorted - caused by the 
> fact that sets / frozensets can't be sorted in the standard Python way 
> (their less than comparison adheres to the set definition). This is 
> something that will probably surprise many Python developers:
>  >>> a = [{2,4}, {1,2}]
>  >>> b = a[::-1]
>  >>> sorted(a)
> [set([2, 4]), set([1, 2])]
>  >>> sorted(b)
> [set([1, 2]), set([2, 4])]
> (Fixing the bug in sorted would fix assertItemsEqual ;-)

How is this a bug? The sort algorithm is stable, which means the above
behaviour is a feature.
I see no easy way of eliminating the O(n*n) issue. Custom key functions
can't work in all cases.



From rdmurray at  Mon Nov  1 14:06:40 2010
From: rdmurray at (R. David Murray)
Date: Mon, 01 Nov 2010 09:06:40 -0400
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, 01 Nov 2010 12:33:31 +0100, Antoine Pitrou <solipsis at> wrote:
> On Mon, 01 Nov 2010 02:55:35 +0000
> Michael Foord <fuzzyman at> wrote:
> > Having a more efficient 'slow-path' and moving to that by default would 
> > fix it. The bug is only a duplicate of the bug in sorted - caused by the 
> > fact that sets / frozensets can't be sorted in the standard Python way 
> > (their less than comparison adheres to the set definition). This is 
> > something that will probably surprise many Python developers:
> > 
> >  >>> a = [{2,4}, {1,2}]
> >  >>> b = a[::-1]
> >  >>> sorted(a)
> > [set([2, 4]), set([1, 2])]
> >  >>> sorted(b)
> > [set([1, 2]), set([2, 4])]
> > 
> > (Fixing the bug in sorted would fix assertItemsEqual ;-)
> How is this a bug? The sort algorithm is stable, which means the above
> behaviour is a feature.  I see no easy way of eliminating the O(n*n)
> issue. Custom key functions can't work in all cases.

Even granting some theoretical way to sort sets by their contents, it
still wouldn't be a bug in sorted.  Sorted is just using the results
returned by '__lt__', which is what it should do.  Special casing sets
in sorted would be wrong.
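
The point can be checked directly: for sets, `<` means "is a proper subset of", so two sets that are not subsets of each other are each "not less than" the other, and a stable sort leaves them in their input order. A short illustration using the lists from the example above:

```python
a = [{2, 4}, {1, 2}]
b = a[::-1]

# Neither set is a proper subset of the other, so both comparisons are False.
neither_less = not (a[0] < a[1]) and not (a[1] < a[0])

# sorted() finds no pair that is out of order and keeps each input order.
sorted_a = sorted(a)
sorted_b = sorted(b)
```

Both results are "sorted" as far as `__lt__` can tell, which is why they differ from each other.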

R. David Murray                            

From kristjan at  Mon Nov  1 14:34:09 2010
From: kristjan at (Kristján Valur Jónsson)
Date: Mon, 1 Nov 2010 21:34:09 +0800
Subject: [Python-Dev] Continuing 2.x
In-Reply-To: <>
References: <>
	<>	<20101028120414.1f4f3016@mission>
	<> <>
Message-ID: <>

I've been sitting on the sideline seeing this unfold.
We've seen some different viewpoints on the matter and I'm happy to see that I'm not the only one lamenting the proclaimed death of the 2.x lineage.
However, as correctly stated by Martin, I merely voiced a suggestion and I have gotten helpful counter-suggestions.
A private branch is fine (More correctly a fork, even, as people have pointed out) and Hg is going to support user-branches.
In the meantime, however, unless someone strongly objects, I'm probably going to set up a temporary branch off /release27-maint under /stackless/sandboxes/ until the Hg switchover.  Name undecided yet.


> -----Original Message-----
> From: at
> [ at] On Behalf
> Of "Martin v. Löwis"
> Sent: 29. október 2010 22:13
> This thread was started by a specific proposal from Kristjan, and
> Kristjan got a specific suggestion on how to proceed (namely, wait
> for the Mercurial switchover, then publish his changes in a branch).
> So despite the more general subject (which I think is still mostly
> hypothetical), the real issue Kristjan raised has been resolved,
> AFAICT (although Kristjan has not yet voiced an opinion of whether
> he finds that resolution acceptable).

From kristjan at  Mon Nov  1 15:00:59 2010
From: kristjan at (Kristján Valur Jónsson)
Date: Mon, 1 Nov 2010 22:00:59 +0800
Subject: [Python-Dev] time.wallclock()
Message-ID: <>

Working on Condition variables and semaphores (see I noticed that time.time() was being used to correctly time blocking system calls.  On Windows, I would have used time.clock(), but reading the documentation made me realize that on Unix that would return CPU seconds, which are useless when blocking.  However, on Windows, time.clock() has a much higher resolution, apart from being a "wallclock" time, and is thus better suited to timing than time.time().  In addition, time.time() has the potential of giving unexpected results if someone messes with the system clock.

I was wondering if it would be helpful to have a function such as time.wallclock() which is specified to give relative wallclock time between invocations, or an approximation thereof, to the system's best ability?

We could then choose this to be an alias of time.clock() on Windows and time.time() on any other machine, or even have custom implementations on machines that support such a notion.
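
A rough sketch of the alias idea, assuming the name `wallclock` from the proposal. Note that `time.clock()` was removed in Python 3.8, so the sketch only uses it where it still exists and otherwise falls back to `time.time()`:

```python
import sys
import time


def wallclock():
    """Best-effort wall-clock seconds, meant only for measuring intervals."""
    if sys.platform == "win32" and hasattr(time, "clock"):
        # On Windows, time.clock() is a high-resolution wall-clock timer.
        return time.clock()
    # Elsewhere time.clock() counts CPU seconds, so use time.time() instead.
    return time.time()


start = wallclock()
time.sleep(0.01)  # stands in for a blocking call
elapsed = wallclock() - start
```

Callers would then time a blocking operation the same way on every platform, without doing the platform check themselves.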

From olemis at  Mon Nov  1 15:04:10 2010
From: olemis at (Olemis Lang)
Date: Mon, 1 Nov 2010 09:04:10 -0500
Subject: [Python-Dev] Change to logging Formatters: support for
 alternative format styles
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Oct 31, 2010 at 9:55 AM, Vinay Sajip <vinay_sajip at> wrote:
> Olemis Lang <olemis <at>> writes:
>> On Fri, Oct 29, 2010 at 10:07 AM, Barry Warsaw <barry <at>> wrote:
>> > I haven't played with it yet, but do you think it makes sense to add a
>> > 'style' keyword argument to basicConfig()?  That would make it pretty easy
>> > to get the formatting style you want without having to explicitly
>> > instantiate a Formatter, at least for simple logging clients.
>> >
>> Since this may be considered as a little sophisticated, I'd rather
>> prefer these new classes to be added to configuration sections using
>> fileConfig (and default behavior if missing), and still leave
>> `basicConfig` unchanged (i.e. *basic*) .
> Actually it's no biggie to have an optional style argument for basicConfig.
> People who don't use it don't have to specify it; the style argument would only
> apply if format was specified.
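
A sketch of what the {}-style looks like in use, assuming the `style` keyword discussed here is accepted by both `Formatter` and `basicConfig`. The logger name and format string are illustrative only:

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
# style="{" selects str.format-style fields instead of %-style ones.
handler.setFormatter(
    logging.Formatter("{levelname}:{name}:{message}", style="{")
)

logger = logging.getLogger("style_demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output away from the root logger

logger.info("started")
output = stream.getvalue()

# The basicConfig spelling of the same choice would be:
#   logging.basicConfig(format="{levelname}:{name}:{message}", style="{")
```

Code that never passes `style` keeps the %-style default, so existing callers are unaffected.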


> For some people, use of {} over % is more about personal taste than about the
> actual usage of str.format's flexibility;

Thought you were talking about me, you only needed to say "he has
black hair and blue eyes" ... ;o)

> we may as well accommodate that
> preference, as it encourages in a small way the use of {}-formatting.

ok , nevermind , it's ok for me anyway (provided that sections for
`fileConfig` will be available) .



From fuzzyman at  Mon Nov  1 15:05:14 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 01 Nov 2010 14:05:14 +0000
Subject: [Python-Dev] time.wallclock()
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/11/2010 14:00, Kristján Valur Jónsson wrote:
> Working on Condition variables and semaphores (see 
> I noticed that time.time() was 
> being used to correctly time blocking system calls.  On windows, I 
> would have used time.clock() but reading the documentation made me 
> realize that on Unix that would return CPU seconds which are useless 
> when blocking.  However, on Windows, time.clock() has a much higher 
> resolution, apart from being a "wallclock" time, and is thus better 
> suited to timing than time.time().   In addition, time.time() has the 
> potential of giving unexpected results if someone messes with the 
> system clock.
> I was wondering if it were helpful to have a function such as 
> time.wallclock() which is specified to give relative wallclock time 
> between invocations or an approximation thereof, to the system's best 
> ability?
> We could then choose this to be an alias of time.clock() on windows 
> and time.time() on any other machine, or even have custom 
> implementations on machines that support such a notion.

I think this would be helpful. Having to do platform specific checks to 
choose which time function to use is annoying.


> Kristján

From fuzzyman at  Mon Nov  1 15:26:19 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 01 Nov 2010 14:26:19 +0000
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
Message-ID: <>

On 01/11/2010 11:33, Antoine Pitrou wrote:
> On Mon, 01 Nov 2010 02:55:35 +0000
> Michael Foord<fuzzyman at>  wrote:
>> Having a more efficient 'slow-path' and moving to that by default would
>> fix it. The bug is only a duplicate of the bug in sorted - caused by the
>> fact that sets / frozensets can't be sorted in the standard Python way
>> (their less than comparison adheres to the set definition). This is
>> something that will probably surprise many Python developers:
>>   >>>  a = [{2,4}, {1,2}]
>>   >>>  b = a[::-1]
>>   >>>  sorted(a)
>> [set([2, 4]), set([1, 2])]
>>   >>>  sorted(b)
>> [set([1, 2]), set([2, 4])]
>> (Fixing the bug in sorted would fix assertItemsEqual ;-)
> How is this a bug? The sort algorithm is stable, which means the above
> behaviour is a feature.

Well, bug is the wrong word as it is obviously an intended feature (or 
consequence of a feature). I still think, given the general behaviour of 
Python sorting, that it is unexpected. It breaks what is usually an 
invariant for sortable types when sorting without an explicit key: 
sorted(l) == sorted(l[::-1]).

There is however a note in the set documentation:

Since sets only define partial ordering (subset relationships), the 
output of the list.sort() method is undefined for lists of sets.

> I see no easy way of eliminating the O(n*n) issue. Custom key functions
> can't work in all cases.
Right. Special casing sets and frozensets would be one (particularly 
inelegant) way however.

All the best,

> Regards
> Antoine.

From steven.bethard at  Mon Nov  1 15:48:48 2010
From: steven.bethard at (Steven Bethard)
Date: Mon, 1 Nov 2010 14:48:48 +0000
Subject: [Python-Dev] okay to remove argparse.__all__?
Message-ID: <>

I think the easiest and most sensible way to address is to simply remove the __all__
definition from argparse - everything that doesn't start with an
underscore in the module is already meant to be exposed.

But then I wonder - is __all__ considered part of the public API of a
module? Or is it okay to just remove it and assume that no one should
have been accessing it directly anyway?

Where did you get that preposterous hypothesis?
Did Steve tell you that?
        --- The Hiphopopotamus

From fuzzyman at  Mon Nov  1 15:53:24 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 01 Nov 2010 14:53:24 +0000
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/11/2010 14:48, Steven Bethard wrote:
> I think the easiest and most sensible way to address
> is to simply remove the __all__
> definition from argparse - everything that doesn't start with an
> underscore in the module is already meant to be exposed.
> But then I wonder - is __all__ considered part of the public API of a
> module? Or is it okay to just remove it and assume that no one should
> have been accessing it directly anyway?

Isn't it better to add the missing elements - what is the problem with 
that approach?

Not defining __all__ will mean that "from argparse import *" will also 
export all the modules you import (copy, os, re, sys, textwrap).

All the best,


> Steve


From steven.bethard at  Mon Nov  1 15:55:25 2010
From: steven.bethard at (Steven Bethard)
Date: Mon, 1 Nov 2010 14:55:25 +0000
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 1, 2010 at 2:53 PM, Michael Foord <fuzzyman at> wrote:
> On 01/11/2010 14:48, Steven Bethard wrote:
>> I think the easiest and most sensible way to address
>> is to simply remove the __all__
>> definition from argparse - everything that doesn't start with an
>> underscore in the module is already meant to be exposed.
>> But then I wonder - is __all__ considered part of the public API of a
>> module? Or is it okay to just remove it and assume that no one should
>> have been accessing it directly anyway?
> Isn't it better to add the missing elements - what is the problem with that
> approach?

It just requires extra synchronization, and history shows that I
always forget to add them. ;-)

> Not defining __all__ will mean that "from argparse import *" will also
> export all the modules you import (copy, os, re, sys, textwrap).

That won't happen in the case of argparse - all modules are imported
like "import os as _os".
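A small sketch of the star-import behaviour being discussed, using a throwaway
module built in memory. The module name demo_mod and the helper
star_import_names are made up for illustration and are not part of argparse.
Without __all__, a star import binds every module-level name that lacks a
leading underscore, imported modules included; the "import os as _os" spelling
or an explicit __all__ keeps them out.

```python
import sys
import types


def star_import_names(source, name="demo_mod"):
    """Return the names that `from <name> import *` would bind."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)          # populate the fake module
    sys.modules[name] = module             # make it importable by name
    try:
        namespace = {}
        exec(f"from {name} import *", namespace)
    finally:
        del sys.modules[name]
    namespace.pop("__builtins__", None)    # added by exec(), not by the import
    return sorted(namespace)


plain = star_import_names("import os\ndef parse(): pass\n")
aliased = star_import_names("import os as _os\ndef parse(): pass\n")
with_all = star_import_names(
    "import os\n__all__ = ['parse']\ndef parse(): pass\n")

print(plain)     # ['os', 'parse']
print(aliased)   # ['parse']
print(with_all)  # ['parse']
```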

Where did you get that preposterous hypothesis?
Did Steve tell you that?
        --- The Hiphopopotamus

From guido at  Mon Nov  1 15:57:13 2010
From: guido at (Guido van Rossum)
Date: Mon, 1 Nov 2010 07:57:13 -0700
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 1, 2010 at 7:53 AM, Michael Foord <fuzzyman at> wrote:
> On 01/11/2010 14:48, Steven Bethard wrote:
>> I think the easiest and most sensible way to address
>> is to simply remove the __all__
>> definition from argparse - everything that doesn't start with an
>> underscore in the module is already meant to be exposed.
>> But then I wonder - is __all__ considered part of the public API of a
>> module? Or is it okay to just remove it and assume that no one should
>> have been accessing it directly anyway?
> Isn't it better to add the missing elements - what is the problem with that
> approach?

Agreed, that's what I would do.

> Not defining __all__ will mean that "from argparse import *" will also
> export all the modules you import (copy, os, re, sys, textwrap).

Well, the copy of that I have carefully renames those to
_copy, _os etc. to avoid this.
You never know.

It is also possible to write automated tests that flag likely missing
symbols in __all__ (as well as symbols in __all__ missing from the
module).
--Guido van Rossum (

From steven.bethard at  Mon Nov  1 15:59:03 2010
From: steven.bethard at (Steven Bethard)
Date: Mon, 1 Nov 2010 14:59:03 +0000
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 1, 2010 at 2:57 PM, Guido van Rossum <guido at> wrote:
> On Mon, Nov 1, 2010 at 7:53 AM, Michael Foord <fuzzyman at> wrote:
>> On 01/11/2010 14:48, Steven Bethard wrote:
>>> But then I wonder - is __all__ considered part of the public API of a
>>> module? Or is it okay to just remove it and assume that no one should
>>> have been accessing it directly anyway?
>> Isn't it better to add the missing elements - what is the problem with that
>> approach?
> Agreed, that's what I would do.

Ok, sounds good.

> It is also possible to write automated tests that flag likely missing
> symbols in __all__ (as well as symbols in __all__ missing from the
> module).

Yep, I plan on doing that. I already had a test something like this to
remind me how I broke __all__ before. ;-)

Where did you get that preposterous hypothesis?
Did Steve tell you that?
        --- The Hiphopopotamus

From fuzzyman at  Mon Nov  1 15:59:07 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 01 Nov 2010 14:59:07 +0000
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/11/2010 14:57, Guido van Rossum wrote:
> [snip...]
>> Not defining __all__ will mean that "from argparse import *" will also
>> export all the modules you import (copy, os, re, sys, textwrap).
> Well, the copy of that I have carefully renames those to
> _copy, _os etc. to avoid this.

Bah.... Sorry about that.




From ncoghlan at  Mon Nov  1 16:10:44 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 2 Nov 2010 01:10:44 +1000
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 12:57 AM, Guido van Rossum <guido at> wrote:
> It is also possible to write automated tests that flag likely missing
> symbols in __all__ (as well as symbols in __all__ missing from the
> module).

These days, test___all__ checks that everything in __all__ exists in
standard library modules. It is also possible for individual module
tests to include a check that goes the other way along the lines of:

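def test_all_is_complete(self):
    # Names that are deliberately left out of __all__.
    known_private = {"known", "unexported", "names"}
    # A set (not a generator), so it can be compared with set(mod.__all__).
    expected_public = {k for k in mod.__dict__
                       if k not in known_private and not k.startswith("_")}
    self.assertEqual(set(mod.__all__), expected_public)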
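For anyone who wants to try the idea outside a TestCase, here is a standalone
sketch that checks both directions. The helper name audit_all and the
in-memory module "demo" are hypothetical, chosen only so the example runs on
its own; a real test would pass the module under test instead.

```python
import types


def audit_all(module, known_private=frozenset()):
    """Compare module.__all__ with the module's apparently public names.

    Returns (unlisted, dangling): public-looking names missing from
    __all__, and names in __all__ that the module does not define.
    """
    declared = set(module.__all__)
    defined = set(vars(module))
    public = {name for name in defined
              if not name.startswith("_") and name not in known_private}
    return sorted(public - declared), sorted(declared - defined)


# Build a small module with one forgotten name and one stale entry.
demo = types.ModuleType("demo")
exec(
    "__all__ = ['parse', 'gone']\n"
    "def parse(): pass\n"
    "def format_help(): pass\n"
    "HELPER = 1\n",
    demo.__dict__,
)

unlisted, dangling = audit_all(demo, known_private={"HELPER"})
print(unlisted)  # ['format_help']
print(dangling)  # ['gone']
```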


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From kristjan at  Mon Nov  1 16:10:51 2010
From: kristjan at (Kristján Valur Jónsson)
Date: Mon, 1 Nov 2010 23:10:51 +0800
Subject: [Python-Dev] time.wallclock()
In-Reply-To: <>
References: <>
Message-ID: <>

Ok, please see

From: Michael Foord [mailto:fuzzyman at]
Sent: 1. nóvember 2010 22:05
To: Kristján Valur Jónsson
Cc: python-dev at
Subject: Re: [Python-Dev] time.wallclock()

I think this would be helpful. Having to do platform specific checks to choose which time function to use is annoying.



From rdmurray at  Mon Nov  1 16:10:53 2010
From: rdmurray at (R. David Murray)
Date: Mon, 01 Nov 2010 11:10:53 -0400
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, 01 Nov 2010 14:26:19 -0000, Michael Foord <fuzzyman at> wrote:
> On 01/11/2010 11:33, Antoine Pitrou wrote:
> > On Mon, 01 Nov 2010 02:55:35 +0000
> > Michael Foord<fuzzyman at>  wrote:
> >> Having a more efficient 'slow-path' and moving to that by default would
> >> fix it. The bug is only a duplicate of the bug in sorted - caused by the
> >> fact that sets / frozensets can't be sorted in the standard Python way
> >> (their less than comparison adheres to the set definition). This is
> >> something that will probably surprise many Python developers:
> >>
> >>   >>>  a = [{2,4}, {1,2}]
> >>   >>>  b = a[::-1]
> >>   >>>  sorted(a)
> >> [set([2, 4]), set([1, 2])]
> >>   >>>  sorted(b)
> >> [set([1, 2]), set([2, 4])]
> >>
> >> (Fixing the bug in sorted would fix assertItemsEqual ;-)
> > How is this a bug? The sort algorithm is stable, which means the above
> > behaviour is a feature.
> Well, bug is the wrong word as it is obviously an intended feature (or
> consequence of a feature). I still think, given the general behaviour of
> Python sorting, that it is unexpected. It breaks what is usually an
> invariant for sorting without an explicit key that sortable types
> sorted(l) = sorted(l[::-1]).

Well, as Antoine pointed out, Python's sorting algorithm is stable,
so that is in fact *not* an invariant:

>>> x = ['abcd', 'foo'*50, 'foo'*50, 'dkke']
>>> y = x[::-1]
>>> [id(n) for n in sorted(x)]
[3073747680, 3073747904, 3073747624, 3073747512]
>>> [id(n) for n in sorted(y)]
[3073747680, 3073747904, 3073747512, 3073747624]

Yes, == usually hides the fact that the *objects* are in a different
order, but obviously that doesn't apply to sets :)

R. David Murray                            

From fuzzyman at  Mon Nov  1 16:14:36 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 01 Nov 2010 15:14:36 +0000
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/11/2010 15:10, R. David Murray wrote:
> On Mon, 01 Nov 2010 14:26:19 -0000, Michael Foord<fuzzyman at>  wrote:
>> On 01/11/2010 11:33, Antoine Pitrou wrote:
>>> On Mon, 01 Nov 2010 02:55:35 +0000
>>> Michael Foord<fuzzyman at>   wrote:
>>>> Having a more efficient 'slow-path' and moving to that by default would
>>>> fix it. The bug is only a duplicate of the bug in sorted - caused by the
>>>> fact that sets / frozensets can't be sorted in the standard Python way
>>>> (their less than comparison adheres to the set definition). This is
>>>> something that will probably surprise many Python developers:
>>>>    >>>   a = [{2,4}, {1,2}]
>>>>    >>>   b = a[::-1]
>>>>    >>>   sorted(a)
>>>> [set([2, 4]), set([1, 2])]
>>>>    >>>   sorted(b)
>>>> [set([1, 2]), set([2, 4])]
>>>> (Fixing the bug in sorted would fix assertItemsEqual ;-)
>>> How is this a bug? The sort algorithm is stable, which means the above
>>> behaviour is a feature.
>> Well, bug is the wrong word as it is obviously an intended feature (or
>> consequence of a feature). I still think, given the general behaviour of
>> Python sorting, that it is unexpected. It breaks what is usually an
>> invariant for sorting without an explicit key that sortable types
>> sorted(l) = sorted(l[::-1]).
> Well, as Antoine pointed out, Python's sorting algorithm is stable,
> so that is in fact *not* an invariant:
>>>> x = ['abcd', 'foo'*50, 'foo'*50, 'dkke']
>>>> y = x[::-1]
>>>> [id(n) for n in sorted(x)]
> [3073747680, 3073747904, 3073747624, 3073747512]
>>>> [id(n) for n in sorted(y)]
> [3073747680, 3073747904, 3073747512, 3073747624]
> Yes, == usually hides the fact that the *objects* are in a different
> order, but obviously that doesn't apply to sets :)

Sorry, that should have been sorted(l) == sorted(l[::-1]) - which *is* 
the case for your example above.


> --
> R. David Murray                            



From ocean-city at  Mon Nov  1 16:12:47 2010
From: ocean-city at (Hirokazu Yamamoto)
Date: Tue, 02 Nov 2010 00:12:47 +0900
Subject: [Python-Dev] PyMem_MALLOC vs PyMem_Malloc
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 2010/10/31 2:32, M.-A. Lemburg wrote:
> M.-A. Lemburg wrote:
>> Hirokazu Yamamoto wrote:
>>> Hello. I found several codes using PyMem_Free to free
>>> allocated memory with PyMem_MALLOC (ie: PyUnicode_AsWideCharString)
>>> Is it safe?
>> Within the interpreter: yes.
>> In extensions: depends on the platform, but probably not.
>> The macros provide faster access to the C lib malloc calls.
>> The functions need to be used in extensions in case the interpreter will
>> free the resource or the extension wants to free an interpreter
>> allocated resource. They provide access to the malloc calls
>> used by the interpreter, which may operate on a different heap
>> than the extensions.
>> Within an extension the macros use the extension heap.
>> A subtle, but important difference.
> BTW: If you were referring to extensions using PyMem_Free()
> to deallocate memory allocated in the interpreter using
> PyMem_MALLOC(), then that's exactly how things should be
> done.
> I was referring to use of the two mentioned APIs within
> an extension.

Thank you for the reply; I think I understand now.

From phd at  Mon Nov  1 16:08:27 2010
From: phd at (Oleg Broytman)
Date: Mon, 1 Nov 2010 18:08:27 +0300
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 01, 2010 at 02:55:25PM +0000, Steven Bethard wrote:
> On Mon, Nov 1, 2010 at 2:53 PM, Michael Foord <fuzzyman at> wrote:
> > Isn't it better to add the missing elements - what is the problem with that
> > approach?
> It just requires extra synchronization, and history shows that I
> always forget to add them. ;-)


__all__ = []
for key, value in list(globals().items()):
    if not key.startswith('_'):
        __all__.append(key)

   Further filter (by key or value) to your needs.

     Oleg Broytman              phd at
           Programmers don't die, they just GOSUB without RETURN.

From rdmurray at  Mon Nov  1 16:33:39 2010
From: rdmurray at (R. David Murray)
Date: Mon, 01 Nov 2010 11:33:39 -0400
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, 01 Nov 2010 15:14:36 -0000, Michael Foord <fuzzyman at> wrote:
> On 01/11/2010 15:10, R. David Murray wrote:
> > On Mon, 01 Nov 2010 14:26:19 -0000, Michael Foord<fuzzyman at>  wrote:
> >> Well, bug is the wrong word as it is obviously an intended feature (or
> >> consequence of a feature). I still think, given the general behaviour of
> >> Python sorting, that it is unexpected. It breaks what is usually an
> >> invariant for sorting without an explicit key that sortable types
> >> sorted(l) = sorted(l[::-1]).
> > Well, as Antoine pointed out, Python's sorting algorithm is stable,
> > so that is in fact *not* an invariant:
> >
> >>>> x = ['abcd', 'foo'*50, 'foo'*50, 'dkke']
> >>>> y = x[::-1]
> >>>> [id(n) for n in sorted(x)]
> > [3073747680, 3073747904, 3073747624, 3073747512]
> >>>> [id(n) for n in sorted(y)]
> > [3073747680, 3073747904, 3073747512, 3073747624]
> >
> > Yes, == usually hides the fact that the *objects* are in a different
> > order, but obviously that doesn't apply to sets :)
> >
> Sorry, that should have been sorted(l) == sorted(l[::-1]) - which *is* 
> the case for your example above.

Yes, I know that's what you meant, that's why I said "== usually hides
this".  If you are restricting yourself to built in types, then your
invariant is mostly true but (IMO) misleading, as set demonstrates.
And it certainly doesn't have to be true for custom types, even if they
don't redefine __lt__.  You can argue that in a good design it should
be, but as the set example indicates, there are problem domains where it
is useful for it not to be.

Or, to put it another way, *if* there is a bug here it would be in set,
not sorted.

R. David Murray                            

From ocean-city at  Mon Nov  1 17:10:32 2010
From: ocean-city at (Hirokazu Yamamoto)
Date: Tue, 02 Nov 2010 01:10:32 +0900
Subject: [Python-Dev] [Python-checkins] r85987 -
In-Reply-To: <>
References: <>
Message-ID: <>

On 2010/10/31 6:24, brian.curtin wrote:
> Author: brian.curtin
> Date: Sat Oct 30 23:24:21 2010
> New Revision: 85987
> Log:
> Fix #10257. Clear resource warnings by using os.popen's context manager.
> Modified:
>     python/branches/py3k/Lib/test/
> Modified: python/branches/py3k/Lib/test/
> ==============================================================================
> --- python/branches/py3k/Lib/test/	(original)
> +++ python/branches/py3k/Lib/test/	Sat Oct 30 23:24:21 2010
> @@ -406,17 +406,19 @@
>           os.environ.clear()
>           if os.path.exists("/bin/sh"):
>               os.environ.update(HELLO="World")
> -            value = os.popen("/bin/sh -c 'echo $HELLO'").read().strip()
> -            self.assertEquals(value, "World")
> +            with os.popen("/bin/sh -c 'echo $HELLO'") as popen:
> +                value = popen.read().strip()
> +                self.assertEquals(value, "World")

Does this really cause resource warning? I think os.popen instance
won't be into traceback because it's not declared as variable. So I
suppose it will be deleted by reference count == 0 even when exception

From ncoghlan at  Mon Nov  1 17:23:41 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 2 Nov 2010 02:23:41 +1000
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 1:33 AM, R. David Murray <rdmurray at> wrote:
> Or, to put it another way, *if* there is a bug here it would be in set,
> not sorted.

Put me in the "it's not a bug, it's a feature" camp. Providing a
"elements equal" check that doesn't rely on LT providing a total
ordering is a non-trivial exercise.

Looking at assertItemsEqual, I'd be inclined to insert a check that
falls back to the "unorderable_list_difference" approach in the case
where "expected != sorted(reversed(expected))" (only need to check the
one, since if the expected values are totally ordered, while the
actual values are not, this should show up when comparing the
elements). It slows down the fast path a bit, but the updated function
should at least handle partial orderings more correctly than it does
now.


P.S. Late night post, so I may be missing something obvious...

Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From fuzzyman at  Mon Nov  1 17:26:35 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 01 Nov 2010 16:26:35 +0000
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 01/11/2010 16:23, Nick Coghlan wrote:
> On Tue, Nov 2, 2010 at 1:33 AM, R. David Murray<rdmurray at>  wrote:
>> Or, to put it another way, *if* there is a bug here it would be in set,
>> not sorted.
> Put me in the "it's not a bug, it's a feature" camp. Providing a
> "elements equal" check that doesn't rely on LT providing a total
> ordering is a non-trivial exercise.
> Looking at assertItemsEqual, I'd be inclined to insert a check that
> falls back to the "unorderable_list_difference" approach in the case
> where "expected != sorted(reversed(expected))"

If that is sufficient then it would be a nice way of keeping the fast path.

(I'm not arguing that Antoine and R. David aren't correct in what 
they're saying about set ordering - I'm just saying that I was surprised 
and bet I'm not the only one. Bit of a dead end discussion. :-)


> (only need to check the
> one, since if the expected values are totally ordered, while the
> actual values are not, this should show up when comparing the
> elements). It slows down the fast path a bit, but the updated function
> should at least handle partial orderings more correctly than it does
> now.
> Cheers,
> Nick.
> P.S. Late night post, so I may be missing something obvious...



From ncoghlan at  Mon Nov  1 17:30:09 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 2 Nov 2010 02:30:09 +1000
Subject: [Python-Dev] [Python-checkins] r85987 -
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 2:10 AM, Hirokazu Yamamoto
<ocean-city at> wrote:
> Does this really cause resource warning? I think os.popen instance
> won't be into traceback because it's not declared as variable. So I
> suppose it will be deleted by reference count == 0 even when exception
> occurs.

Any time __del__ has to close the resource triggers ResourceWarning,
regardless of whether that is due to the cyclic garbage collector or
the refcount naturally falling to zero. In the past dealing with this
was clumsy, so it made sense to rely on CPython's refcounting to do
the work. However, we have better tools for deterministic resource
management now (in the form of context managers), so these updates
help make the standard library and its test suite more suitable for
use with non-refcounting Python implementations (such as PyPy, Jython
and IronPython).
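A rough way to see the difference, offered as a sketch under assumptions: it
uses an ordinary open() on a temporary file rather than os.popen, and it
relies on the interpreter finalizing the unreferenced file object during the
block (CPython does so immediately; the gc.collect() calls are there for
other implementations).

```python
import gc
import os
import tempfile
import warnings

# Scratch file to open; created and removed by this example only.
fd, path = tempfile.mkstemp()
os.close(fd)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    open(path).read()        # never closed explicitly; the finalizer has to
    gc.collect()             # nudge implementations without refcounting
    leaked = [w for w in caught if issubclass(w.category, ResourceWarning)]

    before = len(caught)
    with open(path) as handle:   # closed deterministically at block exit
        handle.read()
    gc.collect()
    clean = len(caught) == before

os.remove(path)
print(bool(leaked), clean)
```

The first open() is expected to record a ResourceWarning, while the
context-manager form should add nothing to the list.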


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Mon Nov  1 17:38:40 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 2 Nov 2010 02:38:40 +1000
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 2:26 AM, Michael Foord <fuzzyman at> wrote:
> On 01/11/2010 16:23, Nick Coghlan wrote:
>> Looking at assertItemsEqual, I'd be inclined to insert a check that
>> falls back to the "unorderable_list_difference" approach in the case
>> where "expected != sorted(reversed(expected))"
> If that is sufficient then it would be a nice way of keeping the fast path.

As far as I can tell, that check is a valid partial ordering detector
for any sequence that contains one or more pairs of items for which
LT, EQ and GT are all False:

>>> seq = [{1}, {2}]
>>> seq[0] < seq[1]
False
>>> seq[0] == seq[1]
False
>>> seq[0] > seq[1]
False
>>> sorted(seq)
[{1}, {2}]
>>> sorted(reversed(sorted(seq)))
[{2}, {1}]

Obviously, if the sequence doesn't contain any such items (e.g. all
subsets of each other), then it will look like a total ordering and
use the fast path. I see that as an upside :)
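Put together, the idea might look roughly like the sketch below. This is not
the unittest implementation; same_items is a made-up name, and the slow path
is a plain remove-one-match-at-a-time loop standing in for the
"unorderable_list_difference" approach.

```python
def same_items(expected, actual):
    """Return True if both iterables hold equal items, in any order."""
    expected, actual = list(expected), list(actual)
    try:
        exp_sorted = sorted(expected)
        act_sorted = sorted(actual)
        # Partial-order detector: re-sorting the reversed result only
        # reproduces the same list when the ordering looks total.
        if exp_sorted == sorted(reversed(exp_sorted)):
            return exp_sorted == act_sorted      # fast path
    except TypeError:
        pass                                     # unorderable items
    # Slow path: relies on == only, quadratic in the worst case.
    remaining = list(actual)
    for item in expected:
        try:
            remaining.remove(item)
        except ValueError:
            return False
    return not remaining


sets_a = [{2, 4}, {1, 2}]
sets_b = sets_a[::-1]
print(sorted(sets_a) == sorted(sets_b))   # False
print(same_items(sets_a, sets_b))         # True
print(same_items([3, 1, 2], [1, 2, 3]))   # True
print(same_items([1, 2], [1, 2, 2]))      # False
```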


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From brett at  Mon Nov  1 17:50:12 2010
From: brett at (Brett Cannon)
Date: Mon, 1 Nov 2010 09:50:12 -0700
Subject: [Python-Dev] Continuing 2.x
In-Reply-To: <>
References: <>
	<> <20101028120414.1f4f3016@mission>
	<> <>
Message-ID: <>

2010/11/1 Kristján Valur Jónsson <kristjan at>:
> I've been sitting on the sideline seeing this unfold.
> We've seen some different viewpoints on the matter and I'm happy to see that I'm not the only one lamenting the proclaimed death of the 2.x lineage.
> However, as correctly stated by Martin, I merely voiced a suggestion and I have gotten helpful counter-suggestions.
> A private branch is fine (More correctly a fork, even, as people have pointed out) and Hg is going to support user-branches.
> In the meantime, however, unless someone strongly objects, I'm probably going to set up a temporary branch off /release27-maint under /stackless/sandboxes/ until the Hg switchover. Name undecided yet.

No objection from me; branches in svn are for experimental stuff and
this is what you are proposing.


> Cheers,
> Kristján
>> -----Original Message-----
>> From: at
>> [ at] On Behalf
>> Of "Martin v. Löwis"
>> Sent: 29. október 2010 22:13
>> This thread was started by a specific proposal from Kristjan, and
>> Kristjan got a specific suggestion on how to proceed (namely, wait
>> for the Mercurial switchover, then publish his changes in a branch).
>> So despite the more general subject (which I think is still mostly
>> hypothetical), the real issue Kristjan raised has been resolved,
>> AFAICT (although Kristjan has not yet voiced an opinion of whether
>> he finds that resolution acceptable).

From vinay_sajip at  Mon Nov  1 18:49:36 2010
From: vinay_sajip at (Vinay Sajip)
Date: Mon, 1 Nov 2010 17:49:36 +0000 (UTC)
Subject: [Python-Dev] Change to logging Formatters: support for
	alternative format styles
References: <>
Message-ID: <>

Olemis Lang <olemis <at>> writes:

> > For some people, use of {} over % is more about personal taste than about the
> > actual usage of str.format's flexibility;
> Thought you were talking about me, you only needed to say "he has
> black hair and blue eyes" ... ;o)

No, it was a general comment; I don't know your preferences. The basicConfig()
change has now been checked into the py3k branch.


Vinay Sajip

From tjreedy at  Mon Nov  1 19:37:33 2010
From: tjreedy at (Terry Reedy)
Date: Mon, 01 Nov 2010 14:37:33 -0400
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <>
References: <>	<>	<>	<>	<>
Message-ID: <ian1d8$8i5$>

On 10/31/2010 10:55 PM, Michael Foord wrote:

> fact that sets / frozensets can't be sorted in the standard Python way
> (their less than comparison adheres to the set definition). This is
> something that will probably surprise many Python developers:

Any programmer who sorts (or uses functions that depend on proper 
sorting) should know and respect the difference between partial orders, 
such as set inclusion, and total orders, such as lex order of sequences. 
So I am surprised by the above claim ;-).

>  >>> a = [{2,4}, {1,2}]
>  >>> b = a[::-1]
>  >>> sorted(a)
> [set([2, 4]), set([1, 2])]
>  >>> sorted(b)
> [set([1, 2]), set([2, 4])]

The bug is not in the sort method, but the attempt to sort partially 
ordered items, which are not properly sortable.

a = [{2,4}, {1,2}]
b = a[::-1]
#[{1, 2}, {2, 4}]
#[{1, 2}, {2, 4}]
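One way to get the same list from both a and b, given here as an assumption
rather than as the call used above, is to sort with an explicit key that
imposes a total order, for example the sorted list of each set's elements:

```python
a = [{2, 4}, {1, 2}]
b = a[::-1]

# key=sorted maps each set to a list of its elements in order; lists
# compare lexicographically, which is a total order for these items.
by_elements_a = sorted(a, key=sorted)
by_elements_b = sorted(b, key=sorted)

print(by_elements_a == by_elements_b)   # True
print(sorted(a) == sorted(b))           # False
```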

A test method (or internal branch) that depends on sorting to work 
properly could just refuse to work with sets (and frozensets).

Terry Jan Reedy

From ctb at  Tue Nov  2 02:40:05 2010
From: ctb at (C. Titus Brown)
Date: Mon, 1 Nov 2010 18:40:05 -0700
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <ian1d8$8i5$>
References: <>
	<> <ian1d8$8i5$>
Message-ID: <>

On Mon, Nov 01, 2010 at 02:37:33PM -0400, Terry Reedy wrote:
> On 10/31/2010 10:55 PM, Michael Foord wrote:
>> fact that sets / frozensets can't be sorted in the standard Python way
>> (their less than comparison adheres to the set definition). This is
>> something that will probably surprise many Python developers:
> Any programmer who sorts (or uses functions that depend on proper  
> sorting) should know and respect the difference between partial orders,  
> such as set inclusion, and total orders, such as lex order of sequences.  
> So I am surprised by the above claim ;-).

Huh.  Count me out.  I guess I don't live up to your standards.


p.s. Seriously?  I can accept that there's a rational minimalist argument
for this "feature", but arguing that it's somehow the responsibility of
a programmer to *expect* this seems kind of whack.
C. Titus Brown, ctb at

From brett at  Tue Nov  2 03:35:18 2010
From: brett at (Brett Cannon)
Date: Mon, 1 Nov 2010 19:35:18 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Oct 27, 2010 at 03:42, Antoine Pitrou <solipsis at> wrote:
> On Tue, 26 Oct 2010 22:06:37 -0400
> Alexander Belopolsky <alexander.belopolsky at> wrote:
>> While I appreciate your and Michael's eloquence, I don't really see
>> why five 400-line modules are necessarily easier to maintain than one
>> 2000-line module.  Splitting code into modules is certainly a good
>> thing when the resulting modules can be used independently.  This
>> helps users write leaner programs, reduces mental footprint of
>> individual modules, etc, etc.  The split unittest module does not
>> bring any such benefits.  It still presents a single "big-ball-of-mud"
>> namespace, only rather than implemented in a single file, it is now
>> swept in from eight different files.
> Are you saying that it has become a pile of medium-sized balls of mud?
> I would like to say thanks for the mud, Michael! It's high quality mud
> for sure.

I realize I am a little late in this reply but issue 10273 linked to
this and so now I am actually bothering to read this thread since it
felt like bikeshedding when the thread began.

I think the issue here is that the file structure of the code no
longer matches the public API documented by unittest. Personally I,
like most people it seems, prefer source files to be structured in a
way to match the public API. In the case of unittest Michael didn't.
He did ask python-dev if it was okay to do what he did, we all kept
quiet, and now we have realized that most of us prefer to have files
that mirror the API; lesson learned. But Python 2.7 shipped with this
file layout so we have to stick with it lest we break any imports out
there that use the package-like file structure Michael went with
(which we could actually document and use if we wanted now that
Michael has already broken things up). Reversing the trend by sticking
all the code into unittest/ and then sticking import shims
into the existing modules would be a stupid waste of time, especially
considering the head maintainer of the package likes it the way it is.

So basically it seems like we have learned a lesson: we prefer to have
our code structured in files that match the public API. I think that
is a legitimate design rule for the stdlib to follow from now on, but
in the case of unittest it's too late to change it back (and it's a
minor price to pay to learn this lesson and to have Michael
maintaining unittest like he has been, plus we could consider using
the new structure so that the public API matches the file structure
when the need arises).

From stephen at  Tue Nov  2 09:28:43 2010
From: stephen at (Stephen J. Turnbull)
Date: Tue, 02 Nov 2010 17:28:43 +0900
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <>
References: <>
	<> <ian1d8$8i5$>
Message-ID: <>

C. Titus Brown writes:

 > p.s. Seriously?  I can accept that there's a rational minimalist argument
 > for this "feature",

It is a feature, even if you aren't gonna need it.  I want it.<wink>

Many programmers do know that sets are partially ordered by inclusion,
preordered by size, and (in Python) totally ordered by memory address.
There's nothing wrong with not knowing that -- these are rather
abstract mathematical concepts.  But it's very useful that sorted() or
.sort() use <=, and it's very useful that Python so often obeys simple
consistent rules, and it would be quite confusing to those who do
understand that "in Python the set type is partially ordered by
inclusion" if sorted() used some other relation to order collections
of sets.

It's not so hard to change this:

class SizedSet(set):
    # Order sets by size; len() is the builtin that returns it.
    def __lt__(a, b):
        return len(a) < len(b)
    def __le__(a, b):
        return len(a) <= len(b)
    def __gt__(a, b):
        return len(a) > len(b)
    def __ge__(a, b):
        return len(a) >= len(b)
    # These two are arguable, which makes size comparison not so
    # great as a candidate for the OOWTDI of set.__lt__().
    def __eq__(a, b):
        return len(a) == len(b)
    def __ne__(a, b):
        return len(a) != len(b)

If there were an obvious way to compare sets for use in sorting, that
way would very likely be the most useful definition for <=, too.  But
there isn't, really (it's pretty obvious that comparing memory
addresses is implausible, but otherwise, there are lots of candidates
that are at least sometimes useful).  Do you think otherwise?  If so,
what do you propose for the OOWTDI of sorting a collection of sets?

 > but arguing that it's somehow the responsibility of a programmer to
 > *expect* this seems kind of whack.

I don't quite agree that everyone should "expect exactly the
implemented behavior", but I do think it's a Python *programmer's*
responsibility to refrain from expecting something else in this case.

From victor.stinner at  Tue Nov  2 13:55:40 2010
From: victor.stinner at (Victor Stinner)
Date: Tue, 2 Nov 2010 13:55:40 +0100
Subject: [Python-Dev] [Python-checkins] r85902 - in
	python/branches/py3k/Lib: test/
In-Reply-To: <>
References: <>
Message-ID: <>

I don't know how to ignore the BytesWarning without importing warnings. How can 
I do that?


On Friday 29 October 2010 04:31:42, Benjamin Peterson wrote:
> 2010/10/28 victor.stinner <python-checkins at>:
> > Author: victor.stinner
> > Date: Fri Oct 29 02:38:58 2010
> > New Revision: 85902
> > 
> > Log:
> > Issue #10210: os.get_exec_path() ignores BytesWarning warnings
> > 
> > 
> > Modified:
> >   python/branches/py3k/Lib/
> >   python/branches/py3k/Lib/test/
> > 
> > Modified: python/branches/py3k/Lib/
> > ==============================================================================
> > --- python/branches/py3k/Lib/      (original)
> > +++ python/branches/py3k/Lib/      Fri Oct 29 02:38:58 2010
> > @@ -382,18 +382,32 @@
> >     *env* must be an environment variable dict or None.  If *env* is
> >     None, os.environ will be used.
> >     """
> > +    # Use a local import instead of a global import to avoid bootstrap issue:
> > +    # the os module is used to build Python extensions.
> > +    import warnings
> This sort of function import should be avoided.

From ctb at  Tue Nov  2 15:05:55 2010
From: ctb at (C. Titus Brown)
Date: Tue, 2 Nov 2010 07:05:55 -0700
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <>
References: <>
	<> <ian1d8$8i5$>
Message-ID: <>

On Tue, Nov 02, 2010 at 05:28:43PM +0900, Stephen J. Turnbull wrote:
> C. Titus Brown writes:
>  > p.s. Seriously?  I can accept that there's a rational minimalist argument
>  > for this "feature",
> It is a feature, even if you aren't gonna need it.  I want it.<wink>
> Many programmers do know that sets are partially ordered by inclusion,
> preordered by size, and (in Python) totally ordered by memory address.
> There's nothing wrong with not knowing that -- these are rather
> abstract mathematical concepts.  But it's very useful that sorted() or
> .sort() use <=, and it's very useful that Python so often obeys simple
> consistent rules, and it would be quite confusing to those who do
> understand that "in Python the set type is partially ordered by
> inclusion" if sorted() used some other relation to order collections
> of sets.
> It's not so hard to change this:

[ ... ]

> If there were an obvious way to compare sets for use in sorting, that
> way would very likely be the most useful definition for <=, too.  But
> there isn't, really (it's pretty obvious that comparing memory
> addresses is implausible, but otherwise, there are lots of candidates
> that are at least sometimes useful).  Do you think otherwise?  If so,
> what do you propose for the OOWTDI of sorting a collection of sets?

I don't have one...

>  > but arguing that it's somehow the responsibility of a programmer to
>  > *expect* this seems kind of whack.
> I don't quite agree that everyone should "expect exactly the
> implemented behavior", but I do think it's a Python *programmer's*
> responsibility to refrain from expecting something else in this case.

...but, as someone who has to figure out how to teach stuff to CSE undergrads
(and biology grads) I hate the statement "...any programmer should
expect this..." because (unless you're going to disqualify a huge swathe of
people from being programmers) it's *just not true*.  I don't expect Python
to cater to the lowest common denominator but we should be mindful of our
audience, too.

I think Python has a great advantage in not being too surprising much of the
time, which helps quite a bit with learning. I hope people keep that in mind
for future features.

C. Titus Brown, ctb at

From ocean-city at  Tue Nov  2 16:03:33 2010
From: ocean-city at (Hirokazu Yamamoto)
Date: Wed, 03 Nov 2010 00:03:33 +0900
Subject: [Python-Dev] Resource leaks warnings
In-Reply-To: <>
References: <>	<i6e4rs$pm8$>
Message-ID: <>

Sorry for late post.

On 2010/09/29 20:01, Antoine Pitrou wrote:
> Furthermore, it can produce real bugs, especially under Windows when
> coupled with reference cycles created by traceback objects

I think this can be relaxed with the patch in #9815. ;-)

From tjreedy at  Tue Nov  2 17:23:10 2010
From: tjreedy at (Terry Reedy)
Date: Tue, 02 Nov 2010 12:23:10 -0400
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
	<ian1d8$8i5$>	<>	<>
Message-ID: <iapdt8$tsp$>

On 11/2/2010 10:05 AM, C. Titus Brown wrote:

> ...but, as someone who has to figure out how to teach stuff to CSE undergrads
> (and biology grads) I hate the statement "...any programmer should
> expect this..."

And indeed I (intentionally) did not say that. People who are ignorant 
and inexperienced about something should avoid making expectations in 
any direction until they have read the doc and experimented a bit.

What I did say in the post you responded to is "Any programmer who sorts 
(or uses functions that depend on proper sorting) should know and 
respect the difference between partial orders, such as set inclusion, 
and total orders, such as lex order of sequences." I should hope that 
you teach the difference, or rather, help students to notice what they 
already know. Tell them to consider the difference between sorting 
people by a totally ordered characteristic like height or weight and a 
characteristic that is at best partially ordered, like hair color or 
ethical character. Or have them consider the partial order dependencies 
between morning get-ready-for-class activities (socks before shoes 
versus pants and shirt in either order). They already do topological 
sorting every day, even if the name seems fancy.
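
For readers who want to see the distinction in code, here is a minimal sketch (variable names are only illustrative) showing that the comparison operators on sets test inclusion, so two sets can be unrelated in every direction:

```python
# Set comparison operators test inclusion, which is only a partial order.
a = {1, 2}
b = {1, 2, 3}
c = {4}

subset = a <= b            # True: every element of a is in b
unrelated_lt = a < c       # False: a is not a proper subset of c
unrelated_gt = a > c       # False: a is not a proper superset of c
unrelated_eq = a == c      # False: and they are not equal either

# In a total order, exactly one of <, ==, > would hold for any pair.
is_comparable = unrelated_lt or unrelated_gt or unrelated_eq
print(subset, is_comparable)
```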

Terry Jan Reedy

From fuzzyman at  Tue Nov  2 17:29:36 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 02 Nov 2010 16:29:36 +0000
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <iapdt8$tsp$>
References: <>	<>	<>	<>	<>	<>	<ian1d8$8i5$>	<>	<>	<>
Message-ID: <>

On 02/11/2010 16:23, Terry Reedy wrote:
> On 11/2/2010 10:05 AM, C. Titus Brown wrote:
>> ...but, as someone who has to figure out how to teach stuff to CSE 
>> undergrads
>> (and biology grads) I hate the statement "...any programmer should
>> expect this..."
> And indeed I (intentionally) did not say that. People who are ignorant 
> and inexperienced about something should avoid making expectations in 
> any direction until they have read the doc and experimented a bit.
Expectations come from consistent behaviour. sorted behaves consistently 
for *most* of the built-in types and will also work for custom types 
that provide a 'standard' (total ordering) implementation of __lt__.
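
A minimal sketch of such a custom type, assuming a made-up Version class ordered by a (major, minor) tuple; functools.total_ordering fills in the remaining comparison methods from __eq__ and __lt__:

```python
from functools import total_ordering

@total_ordering
class Version:
    """Hypothetical example type with a total order on (major, minor)."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key == other.key

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key < other.key

    def __repr__(self):
        return "Version(%d, %d)" % self.key

versions = [Version(3, 2), Version(2, 7), Version(3, 1)]
ordered = sorted(versions)   # well defined: any two versions are comparable
print(ordered)
```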

It is very easy to *not realise* that a consequence of sets (and 
frozensets) providing partial ordering through operator overloading is 
that sorting is undefined for them. Particularly as it still works for 
other mutable collections. Worth being aware that custom implementations 
of standard operators will break expectations of users who aren't 
intimately aware of the problem domains that the specific type may be 
created for.

All the best,

Michael Foord

> What I did say in the post you responded to is "Any programmer who 
> sorts (or uses functions that depend on proper sorting) should know 
> and respect the difference between partial orders, such as set 
> inclusion, and total orders, such as lex order of sequences." I should 
> hope that you teach the difference, or rather, help students to notice 
> what they already know. Tell them to consider the difference between 
> sorting people by a totally ordered characteristic like height or 
> weight and a characteristic that is at best partially ordered, like 
> hair color or ethical character. Or have them consider the partial 
> order dependencies between morning get-ready-for-class activities 
> (socks before shoes versus pants and shirt in either order). They 
> already do topological sorting every day, even if the name seems fancy.


From jacob at  Tue Nov  2 17:37:17 2010
From: jacob at (Jacob Kaplan-Moss)
Date: Tue, 2 Nov 2010 11:37:17 -0500
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <iapdt8$tsp$>
References: <>
	<> <>
	<> <ian1d8$8i5$>
	<> <iapdt8$tsp$>
Message-ID: <>

On Tue, Nov 2, 2010 at 11:23 AM, Terry Reedy <tjreedy at> wrote:
> What I did say in the post you responded to is "Any programmer who sorts (or
> uses functions that depend on proper sorting) should know and respect the
> difference between partial orders, such as set inclusion, and total orders,
> such as lex order of sequences."

FWIW (i.e. not much): before this thread if you'd asked me about
partial and total orders I'd have had to run to Wikipedia real quick
to figure it out.

Hopefully I'm still allowed to use Python.


From fdrake at  Tue Nov  2 17:41:37 2010
From: fdrake at (Fred Drake)
Date: Tue, 2 Nov 2010 12:41:37 -0400
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <>
References: <>
	<> <>
	<> <ian1d8$8i5$>
	<> <iapdt8$tsp$>
Message-ID: <>

On Tue, Nov 2, 2010 at 12:37 PM, Jacob Kaplan-Moss <jacob at> wrote:
> Hopefully I'm still allowed to use Python.

Definitely!  Python's a great place to learn about all these things.  :-)

  -Fred

Fred L. Drake, Jr.    <fdrake at>
"A storm broke loose in my mind."  --Albert Einstein

From stephen at  Tue Nov  2 18:00:22 2010
From: stephen at (Stephen J. Turnbull)
Date: Wed, 03 Nov 2010 02:00:22 +0900
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <iapdt8$tsp$>
References: <>
	<> <ian1d8$8i5$>
	<> <iapdt8$tsp$>
Message-ID: <>

Terry Reedy writes:

 > ethical character. Or have them consider the partial order dependencies 
 > between morning get-ready-for-class activities (socks before shoes 
 > versus pants and shirt in either order). They already do topological 
 > sorting every day, even if the name seems fancy.

Augment the example a bit, perhaps: socks and pants before shoes,
socks and pants in either order.
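
The dressing example can be sketched as a small topological sort. This is only an illustration using Kahn's algorithm; the function name topological_sort and the constraints dict are made up here, and any order that respects the constraints would be an equally valid answer.

```python
from collections import deque

def topological_sort(before):
    """Kahn's algorithm; `before` maps an item to the items that must follow it."""
    items = set(before)
    for followers in before.values():
        items.update(followers)
    # Count how many prerequisites each item still waits for.
    indegree = {item: 0 for item in items}
    for followers in before.values():
        for item in followers:
            indegree[item] += 1
    ready = deque(sorted(item for item in items if indegree[item] == 0))
    order = []
    while ready:
        item = ready.popleft()
        order.append(item)
        for follower in sorted(before.get(item, ())):
            indegree[follower] -= 1
            if indegree[follower] == 0:
                ready.append(follower)
    if len(order) != len(items):
        raise ValueError("dependency cycle")
    return order

# Socks and pants before shoes; the shirt is unconstrained.
constraints = {"socks": {"shoes"}, "pants": {"shoes"}, "shirt": set()}
order = topological_sort(constraints)
print(order)
```

The result is one of several valid orders, which is exactly the "non-unique sort" that a partial order allows.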

From exarkun at  Tue Nov  2 18:17:38 2010
From: exarkun at (exarkun at)
Date: Tue, 02 Nov 2010 17:17:38 -0000
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <>
References: <>
	<> <ian1d8$8i5$>
	<> <iapdt8$tsp$>
Message-ID: <20101102171738.2040.158093645.divmod.xquotient.507@localhost.localdomain>

On 04:29 pm, fuzzyman at wrote:
>On 02/11/2010 16:23, Terry Reedy wrote:
>>On 11/2/2010 10:05 AM, C. Titus Brown wrote:
>>>...but, as someone who has to figure out how to teach stuff to CSE 
>>>undergrads
>>>(and biology grads) I hate the statement "...any programmer should
>>>expect this..."
>>And indeed I (intentionally) did not say that. People who are ignorant 
>>and inexperienced about something should avoid making expectations in 
>>any direction until they have read the doc and experimented a bit.
>Expectations come from consistent behaviour. sorted behaves 
>consistently for *most* of the built-in types and will also work for 
>custom types that provide a 'standard' (total ordering) implementation 
>of __lt__.
>It is very easy to *not realise* that a consequence of sets (and 
>frozensets) providing partial ordering through operator overloading is 
>that sorting is undefined for them.

Perhaps.  The documentation for sets says this, though:

  Since sets only define partial ordering (subset relationships), the 
output of the list.sort() method is undefined for lists of sets.
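
A small sketch of what "undefined" can look like in practice, assuming CPython's stable sort: for pairwise-disjoint sets every x < y test is False, so the sort treats all items as ties and the result depends only on the input order. The variable names are illustrative.

```python
# Three pairwise-disjoint sets: x < y is False for every pair.
first = [{1}, {2}, {3}]
second = [{3}, {1}, {2}]

# sorted() only asks "is x < y?"; since the answer is always False, the
# stable sort sees nothing to move and leaves the input order alone.
sorted_first = sorted(first)
sorted_second = sorted(second)

same_collection = all(s in first for s in second)
same_result = sorted_first == sorted_second
print(same_collection, same_result)
```

The same collection of sets therefore "sorts" to two different lists, and no exception is raised along the way.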
>Particularly as it still works for other mutable collections. Worth 
>being aware that custom implementations of standard operators will 
>break expectations of users who aren't intimately aware of the problem 
>domains that the specific type may be created for.

I can't help thinking that most of this confusion is caused by using < 
for determining subsets.  If < were not defined for sets and people had 
to use "set.issubset" (which exists already), then sorting a list with 
sets would raise an exception, a much more understandable failure mode 
than getting back a list in arbitrary order.
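
To illustrate that failure mode, here is a sketch of a hypothetical set subclass (UnorderedSet is not a real stdlib type) whose ordering operators return NotImplemented, assuming Python 3 comparison semantics:

```python
class UnorderedSet(set):
    """Hypothetical set variant that refuses the ordering operators."""

    def _refuse(self, other):
        # Returning NotImplemented from both operands makes Python 3
        # raise TypeError for <, <=, > and >=.
        return NotImplemented

    __lt__ = __le__ = __gt__ = __ge__ = _refuse

a = UnorderedSet({1})
b = UnorderedSet({1, 2})

contained = a.issubset(b)   # the named method still works

try:
    sorted([b, a])
    sort_error = None
except TypeError as exc:
    sort_error = exc

print(contained, type(sort_error).__name__)
```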


From fuzzyman at  Tue Nov  2 18:23:45 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 02 Nov 2010 17:23:45 +0000
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <20101102171738.2040.158093645.divmod.xquotient.507@localhost.localdomain>
References: <>	<>	<>	<>	<>	<>
	<ian1d8$8i5$>	<>	<>	<>
	<iapdt8$tsp$>	<>
Message-ID: <>

On 02/11/2010 17:17, exarkun at wrote:
> On 04:29 pm, fuzzyman at wrote:
>> On 02/11/2010 16:23, Terry Reedy wrote:
>>> On 11/2/2010 10:05 AM, C. Titus Brown wrote:
>>>> ...but, as someone who has to figure out how to teach stuff to CSE 
>>>> undergrads
>>>> (and biology grads) I hate the statement "...any programmer should
>>>> expect this..."
>>> And indeed I (intentionally) did not say that. People who are 
>>> ignorant and inexperienced about something should avoid making 
>>> expectations in any direction until they have read the doc and 
>>> experimented a bit.
>> Expectations come from consistent behaviour. sorted behaves 
>> consistently for *most* of the built-in types and will also work for 
>> custom types that provide a 'standard' (total ordering) 
>> implementation of __lt__.
>> It is very easy to *not realise* that a consequence of sets (and 
>> frozensets) providing partial ordering through operator overloading 
>> is that sorting is undefined for them.
> Perhaps. The documentation for sets says this, though:
> Since sets only define partial ordering (subset relationships), the 
> output of the list.sort() method is undefined for lists of sets.

Right, I did quote that exact text earlier in the thread. False 
expectations come when there are exceptions to otherwise-consistent 
behaviour.

>> Particularly as it still works for other mutable collections. Worth 
>> being aware that custom implementations of standard operators will 
>> break expectations of users who aren't intimately aware of the 
>> problem domains that the specific type may be created for.
> I can't help thinking that most of this confusion is caused by using < 
> for determining subsets. If < were not defined for sets and people had 
> to use "set.issubset" (which exists already), then sorting a list with 
> sets would raise an exception, a much more understandable failure mode 
> than getting back a list in arbitrary order.
I agree. This is a cost of overloading operators with domain specific 
meanings.

All the best,

Michael Foord

> Jean-Paul



From tjreedy at  Tue Nov  2 23:13:47 2010
From: tjreedy at (Terry Reedy)
Date: Tue, 02 Nov 2010 18:13:47 -0400
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<ian1d8$8i5$>	<>	<>	<>	<iapdt8$tsp$>	<>	<20101102171738.2040.158093645.divmod.xquotient.507@localhost.localdomain>
Message-ID: <iaq2et$495$>

On 11/2/2010 1:23 PM, Michael Foord wrote:

> Right, I did quote that exact text earlier in the thread. False
> expectations come when there are exceptions to otherwise-consistent
> behaviour.
>>> Particularly as it still works for other mutable collections. Worth
>>> being aware that custom implementations of standard operators will
>>> break expectations of users who aren't intimately aware of the
>>> problem domains that the specific type may be created for.
>> I can't help thinking that most of this confusion is caused by using <
>> for determining subsets. If < were not defined for sets and people had
>> to use "set.issubset" (which exists already), then sorting a list with
>> sets would raise an exception, a much more understandable failure mode
>> than getting back a list in arbitrary order.
> I agree. This is a cost of overloading operators with domain specific
> meanings.

I disagree. In mathematics, total ordering is a special case of partial 
ordering, not the other way around. Set inclusion is a standard example 
of non-total ordering. In everyday life, another example (other than 
action dependencies) is ancestry relationships. In general, acyclic 
directed graphs model sets with partial orders. Totally ordered linear 
chains, as with integers, are a special case.

A Python program, for instance, is usually a non-unique topological sort 
of a set of statements with a non-total dependency order. This is related 
to a topological sort of a set of actions with a non-total dependency 
order. A NameError, if not due to a misspelling, is typically a result 
of violating one of the space or time order constraints.

So I stick with my statement that a programmer should have some 
understanding (at least at a gut level) of non-total orders and 
non-unique sorts. They are a major part of what programming is.

Terry Jan Reedy

From ncoghlan at  Tue Nov  2 23:33:28 2010
From: ncoghlan at (Nick Coghlan)
Date: Wed, 3 Nov 2010 08:33:28 +1000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 12:35 PM, Brett Cannon <brett at> wrote:
> So basically it seems like we have learned a lesson: we prefer to have
> our code structured in files that match the public API. I think that
> is a legitimate design rule for the stdlib to follow from now on, but
> in the case of unittest it's too late to change it back (and it's a
> minor price to pay to learn this lesson and to have Michael
> maintaining unittest like he has been, plus we could consider using
> the new structure so that the public API matches the file structure
> when the need arises).

Something to note in PEP 8, perhaps?


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From greg.ewing at  Tue Nov  2 23:33:39 2010
From: greg.ewing at (Greg Ewing)
Date: Wed, 03 Nov 2010 11:33:39 +1300
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <20101102171738.2040.158093645.divmod.xquotient.507@localhost.localdomain>
References: <>
	<> <>
	<> <ian1d8$8i5$>
	<> <iapdt8$tsp$>
Message-ID: <>

exarkun at wrote:

> I can't help thinking that most of this confusion is caused by using < 
> for determining subsets.  If < were not defined for sets and people had 
> to use "set.issubset" (which exists already), then sorting a list with 
> sets would raise an exception, a much more understandable failure mode 
> than getting back a list in arbitrary order.

Personally I think it was premature to throw out __cmp__.

What should have happened instead is for __cmp__ to be
augmented with a fourth outcome, "not equal but unordered".
Then operations such as sorting that require a total ordering
could use __cmp__ and complain if they get an unordered


From ncoghlan at  Tue Nov  2 23:38:12 2010
From: ncoghlan at (Nick Coghlan)
Date: Wed, 3 Nov 2010 08:38:12 +1000
Subject: [Python-Dev] [Python-checkins] r85902 - in
 python/branches/py3k/Lib: test/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 10:55 PM, Victor Stinner
<victor.stinner at> wrote:
> I don't know how to ignore the BytesWarning without importing warning. How can
> I do that?

I was suggesting trying to fix the bootstrap issue so you could use a
top-level import, instead of working around it with a function level
import (which we've learned from experience is a recipe for later
reports from users of programs deadlocking on the import lock - we've
made lots of improvement to avoid such deadlocks, but still prefer to
avoid function level imports anyway).
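
For illustration only, a sketch of the top-level-import style, assuming the bootstrap issue has been solved so that warnings can be imported at module level. The helper get_path_entry is hypothetical and is not the code from r85902.

```python
import warnings  # top-level import, assuming the bootstrap issue is solved

def get_path_entry(env):
    """Hypothetical helper: look up PATH while ignoring BytesWarning."""
    with warnings.catch_warnings():
        # Under "python -b", comparing str and bytes keys can emit
        # BytesWarning; silence it only inside this block.
        warnings.simplefilter("ignore", BytesWarning)
        try:
            return env.get("PATH")
        except TypeError:
            return None

entry = get_path_entry({"PATH": "/usr/bin:/bin"})
print(entry)
```

catch_warnings() restores the previous filter state on exit, so the "ignore" rule does not leak into the caller.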


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From brett at  Tue Nov  2 23:43:29 2010
From: brett at (Brett Cannon)
Date: Tue, 2 Nov 2010 15:43:29 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 15:33, Nick Coghlan <ncoghlan at> wrote:
> On Tue, Nov 2, 2010 at 12:35 PM, Brett Cannon <brett at> wrote:
>> So basically it seems like we have learned a lesson: we prefer to have
>> our code structured in files that match the public API. I think that
>> is a legitimate design rule for the stdlib to follow from now on, but
>> in the case of unittest it's too late to change it back (and it's a
>> minor price to pay to learn this lesson and to have Michael
>> maintaining unittest like he has been, plus we could consider using
>> the new structure so that the public API matches the file structure
>> when the need arises).
> Something to note in PEP 8, perhaps?

If everyone agrees with making this policy, then yes.


> Cheers,
> Nick.
> --
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From raymond.hettinger at  Tue Nov  2 23:47:58 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Tue, 2 Nov 2010 15:47:58 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
	[issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 1, 2010, at 7:35 PM, Brett Cannon wrote:
> I think the issue here is that the file structure of the code no
> longer matches the public API documented by unittest. Personally I,
> like most people it seems, prefer source files to be structured in a
> way to match the public API. In the case of unittest Michael didn't.
> He did ask python-dev if it was okay to do what he did, we all kept
> quiet, and now we have realized that most of us prefer to have files
> that mirror the API; lesson learned. But Python 2.7 shipped with this
> file layout so we have to stick with it lest we break any imports out
> there that use the package-like file structure Michael went with
> (which we could actually document and use if we wanted now that
> Michael has already broken things up). Reversing the trend by sticking
> all the code into unittest/ and then sticking import shims
> into the existing modules would be a stupid waste of time, especially
> considering the head maintainer of the package likes it the way it is.

I'm not sure I follow where we're stuck with the current package.
AFAICT, the module is still used with "import unittest".
The file splitting was done badly, so I don't think any of the
components are usable directly, i.e. "from import SkipTest".
Also, I don't think the package structure was documented or announced.

This is in contrast to the logging module which does have a
clean separation of components and where it isn't unusual
to import just part of the package.

What is it you're seeing as a risk that I'm not seeing?
Are we permanently locked into the exact ten filenames
that are currently used:  utils, suite, loader, case, result, main, signals, etc?
Is the file structure now frozen?


From raymond.hettinger at  Tue Nov  2 23:52:11 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Tue, 2 Nov 2010 15:52:11 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
	[issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 2, 2010, at 3:33 PM, Nick Coghlan wrote:

> On Tue, Nov 2, 2010 at 12:35 PM, Brett Cannon <brett at> wrote:
>> So basically it seems like we have learned a lesson: we prefer to have
>> our code structured in files that match the public API. I think that
>> is a legitimate design rule for the stdlib to follow from now on, but
>> in the case of unittest it's too late to change it back (and it's a
>> minor price to pay to learn this lesson and to have Michael
>> maintaining unittest like he has been, plus we could consider using
>> the new structure so that the public API matches the file structure
>> when the need arises).
> Something to note in PEP 8, perhaps?

I'll propose some PEP 8 wording in the bug tracker
(essentially advice on when and how to use packaging),
and everyone can offer their assent, dissent, and


From barry at  Tue Nov  2 23:58:05 2010
From: barry at (Barry Warsaw)
Date: Tue, 2 Nov 2010 18:58:05 -0400
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <20101102185805.08ca8ca8@mission>

On Nov 02, 2010, at 03:43 PM, Brett Cannon wrote:

>On Tue, Nov 2, 2010 at 15:33, Nick Coghlan <ncoghlan at> wrote:
>> On Tue, Nov 2, 2010 at 12:35 PM, Brett Cannon <brett at> wrote:
>>> So basically it seems like we have learned a lesson: we prefer to have
>>> our code structured in files that match the public API. I think that
>>> is a legitimate design rule for the stdlib to follow from now on, but
>>> in the case of unittest it's too late to change it back (and it's a
>>> minor price to pay to learn this lesson and to have Michael
>>> maintaining unittest like he has been, plus we could consider using
>>> the new structure so that the public API matches the file structure
>>> when the need arises).
>> Something to note in PEP 8, perhaps?
>If everyone agrees with making this policy, then yes.

If SHOULD not MUST, then +0


From guido at  Tue Nov  2 23:58:51 2010
From: guido at (Guido van Rossum)
Date: Tue, 2 Nov 2010 15:58:51 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 3:47 PM, Raymond Hettinger
<raymond.hettinger at> wrote:
> I'm not sure I follow where we're stuck with the current package.
> AFAICT, the module is still used with "import unittest".
> The file splitting was done badly, so I don't think any of the
> components are usable directly, i.e. "from import SkipTest".
> Also, I don't think the package structure was documented or announced.
> This is in contrast to the logging module which does have a
> clean separation of components and where it isn't unusual
> to import just part of the package.
> What is it you're seeing as a risk that I'm not seeing?
> Are we permanently locked into the exact ten filenames
> that are currently used:  utils, suite, loader, case, result, main, signals,
> etc?
> Is the file structure now frozen?

To spout a somewhat contrarian opinion, I just browsed the new
unittest package, and the structure seems reasonable to me, even if
its submodules are not particularly reusable. I've used this kind of
style for development myself. What is so offensive about it?

--Guido van Rossum (

From solipsis at  Tue Nov  2 23:59:28 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 02 Nov 2010 23:59:28 +0100
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <1288738768.3541.12.camel@localhost.localdomain>

On Tuesday 02 November 2010 at 15:47 -0700, Raymond Hettinger wrote:
> What is it you're seeing as a risk that I'm not seeing?
> Are we permanently locked into the exact ten filenames
> that are currently used:  utils, suite, loader, case, result, main,
> signals, etc?
> Is the file structure now frozen?

I don't think it's frozen. It's just that Michael and Benjamin (perhaps
others too) prefer it like that, and given who does most of the
maintenance and improvement work it is reasonable to respect that

If one day someone else becomes maintainer of unittest, then, sure, they
can undo the splitting or do it differently. But right now there's no
reason to change.

Oh, and I much prefer a splitting without any impact on the public API.
I *hate* writing "urllib.request.urlopen" and I really wish we hadn't
done that; "urllib.urlopen" was so much easier to remember it isn't
funny :/



From brett at  Wed Nov  3 00:00:40 2010
From: brett at (Brett Cannon)
Date: Tue, 2 Nov 2010 16:00:40 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 15:47, Raymond Hettinger
<raymond.hettinger at> wrote:
> On Nov 1, 2010, at 7:35 PM, Brett Cannon wrote:
> I think the issue here is that the file structure of the code no
> longer matches the public API documented by unittest. Personally I,
> like most people it seems, prefer source files to be structured in a
> way to match the public API. In the case of unittest Michael didn't.
> He did ask python-dev if it was okay to do what he did, we all kept
> quiet, and now we have realized that most of us prefer to have files
> that mirror the API; lesson learned. But Python 2.7 shipped with this
> file layout so we have to stick with it lest we break any imports out
> there that use the package-like file structure Michael went with
> (which we could actually document and use if we wanted now that
> Michael has already broken things up). Reversing the trend by sticking
> all the code into unittest/ and then sticking import shims
> into the existing modules would be a stupid waste of time, especially
> considering the head maintainer of the package likes it the way it is.
> I'm not sure I follow where we're stuck with the current package.
> AFAICT, the module is still used with "import unittest".

Yes, as far as you can tell, but who the hell knows what someone is
doing with code you are *not* aware of. As I said, Python 2.7 shipped
with the code structured like this, so it's possible someone is
importing instead of unittest.TestCase.

> The file splitting was done badly, so I don't think any of the
> components are usable directly, i.e. "from import SkipTest".

I wouldn't say it was done badly, just non-standard. I was able to
figure out where all the key classes were based on the file names. I
personally would have no trouble doing ``from import
TestCase`` if more test case classes came along for various needs.

> Also, I don't think the package structure was documented or announced.

Announced publicly? No, not that I know of.

> This is in contrast to the logging module which does have a
> clean separation of components and where it isn't unusual
> to import just part of the package.
> What is it you're seeing as a risk that I'm not seeing?

Broken imports between Python 2.7 code and any version of Python where
unittest is re-merged back into a single module.

> Are we permanently locked into the exact ten filenames
> that are currently used:  utils, suite, loader, case, result, main, signals,
> etc?
> Is the file structure now frozen?

Somewhat, yes. Screwing with unittest is always touchy as absolutely
no one wants their tests to break, and that includes messing with
imports. We could stick in import shims to shift everything into
unittest/, but the benefits you have outlined already don't
strike me as worth the hassle, especially since you won't ever get
your format back.
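
For anyone unfamiliar with the term, here is a self-contained sketch of what an import shim does. The package name demo_pkg and its TestCase class are made up, and the modules are built in memory with types.ModuleType only so the example runs without files on disk.

```python
import sys
import types

# Hypothetical merged module holding the real implementation.
merged = types.ModuleType("demo_pkg")

class TestCase:
    """Stand-in for a class that used to live in demo_pkg.case."""

merged.TestCase = TestCase
sys.modules["demo_pkg"] = merged

# The shim: an otherwise empty module that re-exports the moved name so
# that old "from demo_pkg.case import TestCase" imports keep working.
shim = types.ModuleType("demo_pkg.case")
shim.TestCase = merged.TestCase
merged.case = shim
sys.modules["demo_pkg.case"] = shim

from demo_pkg.case import TestCase as old_style
from demo_pkg import TestCase as new_style

print(old_style is new_style)
```

On disk the shim would simply be a small file whose only job is to import the name from its new home, and each such file is one more thing to keep in sync, which is the hassle being weighed here.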

From benjamin at  Wed Nov  3 00:08:51 2010
From: benjamin at (Benjamin Peterson)
Date: Tue, 2 Nov 2010 18:08:51 -0500
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

2010/11/2 Raymond Hettinger <raymond.hettinger at>:
> On Nov 1, 2010, at 7:35 PM, Brett Cannon wrote:
> I think the issue here is that the file structure of the code no
> longer matches the public API documented by unittest. Personally I,
> like most people it seems, prefer source files to be structured in a
> way to match the public API. In the case of unittest Michael didn't.
> He did ask python-dev if it was okay to do what he did, we all kept
> quiet, and now we have realized that most of us prefer to have files
> that mirror the API; lesson learned. But Python 2.7 shipped with this
> file layout so we have to stick with it lest we break any imports out
> there that use the package-like file structure Michael went with
> (which we could actually document and use if we wanted now that
> Michael has already broken things up). Reversing the trend by sticking
> all the code into unittest/ and then sticking import shims
> into the existing modules would be a stupid waste of time, especially
> considering the head maintainer of the package likes it the way it is.
> I'm not sure I follow where we're stuck with the current package.
> AFAICT, the module is still used with "import unittest".
> The file splitting was done badly, so I don't think there any of the
> components are usable directly, i.e. "from import SkipTest".
> Also, I don't think the package structure was documented or announced.
> This is in contrast to the logging module which does have a
> clean separation of components and where it isn't unusual
> to import just part of the package.



From fuzzyman at  Wed Nov  3 00:11:57 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 02 Nov 2010 23:11:57 +0000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 02/11/2010 22:43, Brett Cannon wrote:
> On Tue, Nov 2, 2010 at 15:33, Nick Coghlan<ncoghlan at>  wrote:
>> On Tue, Nov 2, 2010 at 12:35 PM, Brett Cannon<brett at>  wrote:
>>> So basically it seems like we have learned a lesson: we prefer to have
>>> our code structured in files that match the public API. I think that
>>> is a legitimate design rule for the stdlib to follow from now on, but
>>> in the case of unittest it's too late to change it back (and it's a
>>> minor price to pay to learn this lesson and to have Michael
>>> maintaining unittest like he has been, plus we could consider using
>>> the new structure so that the public API matches the file structure
>>> when the need arises).
>> Something to note in PEP 8, perhaps?
> If everyone agrees with making this policy, then yes.
I'd like to reply a bit further; I'll do it as a reply to your earlier 
email, though.


> -Brett
>> Cheers,
>> Nick.
>> --
>> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From raymond.hettinger at  Wed Nov  3 00:20:38 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Tue, 2 Nov 2010 16:20:38 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
	[issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 2, 2010, at 3:58 PM, Guido van Rossum wrote:

> To spout a somewhat contrarian opinion, I just browsed the new
> unittest package, and the structure seems reasonable to me, even if
> its submodules are not particularly reusable. I've used this kind of
> style for development myself. What is so offensive about it?

I don't find anything offensive about it.  The issues have to do
with being able to find and analyze code.

For example, to find out what assertItemsEqual does, I have
to figure out that it was put in the file.  In Py2.6,
you could use IDLE's Open Module tool to immediately
bring up all the source for unittest.   Then you could fire up
the class browser to quickly see and navigate the structure,
but that also no longer works in Py2.7.   Also, it used to be
that just knowing the module name was sufficient to find the
code with
All you needed to study the code was a web browser and
its find function.   Now you need to open ten tabs to be able
to browse this code.  IOW, the packaging broke a read-the-source-luke
style of research that I've been teaching people to use for years.

I probably didn't articulate the above very well, but I think
Martin said it more succinctly in this same thread.

The other issue that Brett pointed out is that the file names
now become part of the API, "from unittest.utils import safe_repr".

In the logging module, packaging was done well.  The files
fell along natural lines in the API, some of the components
were usable separately and testable separately.  Likewise
with the xml packages.  In contrast, the unittest module
is full of cross-imports and tightly coupled pieces (like
suite and case) have been separated.



From solipsis at  Wed Nov  3 00:24:46 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 03 Nov 2010 00:24:46 +0100
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <1288740286.3541.22.camel@localhost.localdomain>

On Tuesday, 2 November 2010 at 16:20 -0700, Raymond Hettinger wrote:
> For example, to find out what assertItemsEqual does, I have
> to figure out that it was put in the file.

Well, it's a TestCase method, so it seems rather intuitive to look for
it in



From raymond.hettinger at  Wed Nov  3 00:32:09 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Tue, 2 Nov 2010 16:32:09 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
	[issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 2, 2010, at 4:00 PM, Brett Cannon wrote:
>> Are we permanently locked into the exact ten filenames
>> that are currently used:  utils, suite, loader, case, result, main, signals,
>> etc?
>> Is the file structure now frozen?
> Somewhat, yes.

That's a bummer.

Sounds like a decision to split a module into a package is a big commitment.  Each of the individual file names becomes a permanent part of the API.  Even future additional splits are precluded because it might break someone's dotted import (i.e. not a single function can be moved between those files -- once in unittest.utils, always in unittest.utils).


From solipsis at  Wed Nov  3 00:34:15 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 03 Nov 2010 00:34:15 +0100
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <1288740855.3541.24.camel@localhost.localdomain>

On Tuesday, 2 November 2010 at 16:32 -0700, Raymond Hettinger wrote:
> On Nov 2, 2010, at 4:00 PM, Brett Cannon wrote:
> >> Are we permanently locked into the exact ten filenames
> >> that are currently used:  utils, suite, loader, case, result, main, signals,
> >> etc?
> >> Is the file structure now frozen?
> > 
> > Somewhat, yes.
> That's a bummer.
> Sounds like a decision to split a module into a package is a big
> commitment.  Each of the individual file names becomes a permanent
> part of the API.  Even future additional splits are precluded because
> it might break someone's dotted import (i.e. not a single function can
> be moved between those files -- once in unittest.utils, always in
> unittest.utils).

I don't agree with this. Until it's documented, it's an implementation
detail and should be able to change without notice.
If someone wants to depend on some undocumented detail of the directory
layout it's their problem (like people depending on bytecode and other
stuff).

From fuzzyman at  Wed Nov  3 00:34:38 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 02 Nov 2010 23:34:38 +0000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 02/11/2010 23:00, Brett Cannon wrote:
> On Tue, Nov 2, 2010 at 15:47, Raymond Hettinger
> <raymond.hettinger at>  wrote:
>> On Nov 1, 2010, at 7:35 PM, Brett Cannon wrote:
>> I think the issue here is that the file structure of the code no
>> longer matches the public API documented by unittest. Personally I,
>> like most people it seems, prefer source files to be structured in a
>> way to match the public API. In the case of unittest Michael didn't.
>> He did ask python-dev if it was okay to do what he did, we all kept
>> quiet, and now we have realized that most of us prefer to have files
>> that mirror the API; lesson learned. But Python 2.7 shipped with this
>> file layout so we have to stick with it lest we break any imports out
>> there that use the package-like file structure Michael went with
>> (which we could actually document and use if we wanted now that
>> Michael has already broken things up). Reversing the trend by sticking
>> all the code into unittest/ and then sticking import shims
>> into the existing modules would be a stupid waste of time, especially
>> considering the head maintainer of the package likes it the way it is.
>> I'm not sure I follow where we're stuck with the current package.
>> AFAICT, the module is still used with "import unittest".
> Yes, as far as you can tell, but who the hell knows what someone is
> doing with code you are *not* aware of. As I said, Python 2.7 shipped
> with the code structured like this, so it's possible someone is
> importing instead of unittest.TestCase.

It is also shipped in unittest (and unittest2py3k I might add) so that 
users of earlier versions of Python can use the new features seamlessly. 
(unittest2 will be in Django 1.3.)

Much better times to discuss this would be when it was proposed or when 
it was done, not months after it has been shipped in a production release.

> [snip...]
>> This is in contrast to the logging module which does have a
>> clean separation of components and where it isn't unusual
>> to import just part of the package.
>> What is it you're seeing as a risk that I'm not seeing?
> Broken imports between Python 2.7 code and any version of Python where
> unittest is re-merged back into a single module.

I *know* that some people are using the new package structure directly, 
because the topic has come up on the Testing in Python mailing list.

>> Are we permanently locked into the exact ten filenames
>> that are currently used:  utils, suite, loader, case, result, main, signals,
>> etc?
>> Is the file structure now frozen?
> Somewhat, yes. Screwing with unittest is always touchy as absolutely
> no one wants their tests to break, and that includes messing with
> imports. We could stick in import shims to shift everything into
> unittest/, but the benefits you have outlined already don't
> strike me as worth the hassle, especially since you won't ever get
> your format back.
Absolutely, that would be the worst of all possible worlds - a 
monolithic huge module but still embedded in a package.

People *are* using the existing package structure to import directly 
from (against my advice!) as this particular topic has been discussed on 
the Testing In Python mailing list.

Although Raymond has been vociferous in his detestation of this 
particular split he does not represent a "clear consensus" in favour of 
merging back. Both Fred Drake and Barry Warsaw voiced their approval 
and on the "Clean up unittest API" issue both yourself (Brett) and 
Antoine have supported keeping the existing structure.

Alexander Belopolsky and Martin Loewis expressed difficulties with the 
new structure, but that was in response to the original email from 
Raymond that didn't seem (on my reading) to expressly suggest re-merging 
unittest back into a module but instead seemed to be using it as an 

I am aware of the costs of having a package rather than a single (however 
large it may be) module, but to my mind the benefits to maintenance 
still outweigh these costs. I'm happy to again discuss these benefits at 
great length, but having had the same conversation in person with 
Raymond twice and repeated most of the points (but by no means all) 
in this thread on the mailing list it really feels like going round in 
circles.

As the maintainer of unittest I'd like to say that in the absence of 
clear consensus that the merger should happen, or a fiat from the BDFL, 
the merger won't happen. I believe that this is standard Python 
development process.

All the best,



From fuzzyman at  Wed Nov  3 00:39:13 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 02 Nov 2010 23:39:13 +0000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 02/11/2010 22:58, Guido van Rossum wrote:
> On Tue, Nov 2, 2010 at 3:47 PM, Raymond Hettinger
> <raymond.hettinger at>  wrote:
>> I'm not sure I follow where we're stuck with the current package.
>> AFAICT, the module is still used with "import unittest".
>> The file splitting was done badly, so I don't think there any of the
>> components are usable directly, i.e. "from import SkipTest".
>> Also, I don't think the package structure was documented or announced.
>> This is in contrast to the logging module which does have a
>> clean separation of components and where it isn't unusual
>> to import just part of the package.
>> What is it you're seeing as a risk that I'm not seeing?
>> Are we permanently locked into the exact ten filenames
>> that are currently used:  utils, suite, loader, case, result, main, signals,
>> etc?
>> Is the file structure now frozen?
> To spout a somewhat contrarian opinion, I just browsed the new
> unittest package, and the structure seems reasonable to me, even if
> its submodules are not particularly reusable. I've used this kind of
> style for development myself. What is so offensive about it?
Amen. Although not that contrarian, others have spoken up in favour.

The split is pretty obvious (in general - I'm sure it's not perfect) and 
divided along major functionality.

TestCase -
TestResult -
TestSuite -
TextTestRunner -
TestLoader -
main function -
signal handling -

The utils module is somewhat an odd one out as it is only used by, but this is hardly the most egregious error in the world. If 
you can't guess where a class lives, shows you explicitly (a 
clear advantage of exporting the public API at the top level ;-)

Due to the original design of unittest (and I have many thoughts on 
that) the modules aren't strictly "reusable" (i.e. isolated from each 
other) - but many test frameworks (for example) just use the TestCase 
without using other components. I find it difficult to believe that this 
package structure is only acceptable if we make people import the 
TestCase from and not expose it at the top level.

As mentioned in another email, but this thread has many long and tedious 
emails, there is no clear consensus that there should be a remerger and 
I am aware that there are already some consumers of the new package 

As the maintainer of unittest I'd like to say that in the absence of 
clear consensus that the merger should happen, or a fiat from the BDFL, 
the merger won't happen. I believe that this is standard Python 
development process.

All the best,




From guido at  Wed Nov  3 00:43:29 2010
From: guido at (Guido van Rossum)
Date: Tue, 2 Nov 2010 16:43:29 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 4:39 PM, Michael Foord <fuzzyman at> wrote:
> As the maintainer of unittest I'd like to say that in the absence of clear
> consensus that the merger should happen, or a fiat from the BDFL, the merger
> won't happen. I believe that this is standard Python development process.

I don't think that anybody seriously expected the unittest package
would be restructured again. The remaining thrust of the thread seems
to be whether PEP 8 should advise against breaking code up into many
little modules. Personally I don't think it should -- it should by now
be clear that this is not an area where one style will fit all. I also
think that the convenience of one style over another might have
something to do with the tools being used.

--Guido van Rossum (

From raymond.hettinger at  Wed Nov  3 00:44:12 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Tue, 2 Nov 2010 16:44:12 -0700
Subject: [Python-Dev] Question on imports and packages
Message-ID: <>

Brett,  Does the import mechanism for importing packages preserve enough information to be able to figure-out where all the components are defined?  I'm wondering if it is possible for the class browser to be built-out to scan/navigate class structure across a module that has been split into a package.


From raymond.hettinger at  Wed Nov  3 01:03:00 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Tue, 2 Nov 2010 17:03:00 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
	[issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 2, 2010, at 4:43 PM, Guido van Rossum wrote:

>  The remaining thrust of the thread seems
> to be whether PEP 8 should advise against breaking code up into many
> little modules.

I was thinking of PEP 8 wording that listed the forces for and against.

For example, ply.yacc and ply.lex was a very useful split (separately testable, natural division of concerns, no nested or cross-imports).

The xml.sax, xml.dom, and xml.dom.minidom was a nice split because it separated distinct tools.  The xml packaging also worked well because it is easy to substitute in alternate parsers implementing the same API.

I think we also want to recommend against putting much if any code in

Some forces against packaging are that it breaks the class browser.  As you say, different users of different toolsets are affected differently.  For me, the unittest split broke my usual ways of finding out how the new methods were implemented.

Another force against is what Brett pointed-out, that the package file structure becomes a permanent and unchangeable part of the API.  It's a one-way street.

In general, I think the advice should be that packaging should be done when there is some clear benefit beyond "turning one big file into lots of smaller files". 


From fuzzyman at  Wed Nov  3 01:02:37 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 03 Nov 2010 00:02:37 +0000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 02/11/2010 23:34, Michael Foord wrote:
> On 02/11/2010 23:00, Brett Cannon wrote:
>> On Tue, Nov 2, 2010 at 15:47, Raymond Hettinger
>> <raymond.hettinger at>  wrote:
>>> On Nov 1, 2010, at 7:35 PM, Brett Cannon wrote:
>>> I think the issue here is that the file structure of the code no
>>> longer matches the public API documented by unittest. Personally I,
>>> like most people it seems, prefer source files to be structured in a
>>> way to match the public API. In the case of unittest Michael didn't.
>>> He did ask python-dev if it was okay to do what he did, we all kept
>>> quiet, and now we have realized that most of us prefer to have files
>>> that mirror the API; lesson learned. But Python 2.7 shipped with this
>>> file layout so we have to stick with it lest we break any imports out
>>> there that use the package-like file structure Michael went with
>>> (which we could actually document and use if we wanted now that
>>> Michael has already broken things up). Reversing the trend by sticking
>>> all the code into unittest/ and then sticking import shims
>>> into the existing modules would be a stupid waste of time, especially
>>> considering the head maintainer of the package likes it the way it is.
>>> I'm not sure I follow where we're stuck with the current package.
>>> AFAICT, the module is still used with "import unittest".
>> Yes, as far as you can tell, but who the hell knows what someone is
>> doing with code you are *not* aware of. As I said, Python 2.7 shipped
>> with the code structured like this, so it's possible someone is
>> importing instead of unittest.TestCase.
> It is also shipped in unittest (and unittest2py3k I might add) so that 
> users of earlier versions of Python can use the new features 
> seamlessly. (unittest2 will be in Django 1.3.)

unittest2 dammit.




From ben+python at  Wed Nov  3 01:06:56 2010
From: ben+python at (Ben Finney)
Date: Wed, 03 Nov 2010 11:06:56 +1100
Subject: [Python-Dev] On breaking modules into packages
References: <>
Message-ID: <>

Raymond Hettinger <raymond.hettinger at> writes:

> >> Are we permanently locked into the exact ten filenames that are
> >> currently used: utils, suite, loader, case, result, main, signals,
> >> etc?
> Sounds like a decision to split a module into a package is a big
> commitment. Each of the individual file names becomes a permanent part
> of the API. Even future additional splits are precluded because it
> might break someone's dotted import (i.e. not a single function can be
> moved between those files -- once in unittest.utils, always in
> unittest.utils).

Is this a case where it would be better if the package names had the
leading underscore: "_utils", "_suite", etc.?

Does the convention on single-leading-underscore identifiers as "don't
rely on this name staying the same in future versions" hold for package

 \          "Alternative explanations are always welcome in science, if |
  `\    they are better and explain more. Alternative explanations that |
_o__) explain nothing are not welcome." --Victor J. Stenger, 2001-11-05 |
Ben Finney

From guido at  Wed Nov  3 01:28:48 2010
From: guido at (Guido van Rossum)
Date: Tue, 2 Nov 2010 17:28:48 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 5:03 PM, Raymond Hettinger
<raymond.hettinger at> wrote:
> Some forces against packaging are that it breaks the class browser.  As you say, different users of different toolsets are affected differently.  For me, the unittest split broke my usual ways of finding out how the new methods were implemented.

Maybe the IDLE class browser can be fixed; there is plenty of code
with this structure that can't or won't be restructured, no matter how
strongly PEP 8 is worded. FWIW, personally I don't use the IDLE class
browser -- I use Emacs, grep, and find. :-)

--Guido van Rossum (

From guido at  Wed Nov  3 01:35:55 2010
From: guido at (Guido van Rossum)
Date: Tue, 2 Nov 2010 17:35:55 -0700
Subject: [Python-Dev] Question on imports and packages
In-Reply-To: <>
References: <>
Message-ID: <>

If you are importing the code, the __module__ attribute on each class
should tell you where it is actually defined (as opposed to where you
imported it from). Then sys.modules gives you the module object which
has a __file__ attribute, etc.
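
A minimal sketch of that lookup, assuming the code has already been
imported; where_defined is just an illustrative helper name, not an
existing API:

```python
import sys
import unittest


def where_defined(obj):
    """Return (module name, source file) for the module that defines obj."""
    module_name = obj.__module__        # where the object was actually defined
    module = sys.modules[module_name]   # the already-imported module object
    # Built-in or frozen modules may lack __file__, hence the getattr default.
    return module_name, getattr(module, "__file__", None)


# TestCase is imported from the top-level package, but __module__ still
# points at the submodule that contains the class statement.
name, path = where_defined(unittest.TestCase)
print(name, path)
```

A class browser could walk the public names of a package this way and
group them by defining submodule, at the price of having to import the
package first.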

On Tue, Nov 2, 2010 at 4:44 PM, Raymond Hettinger
<raymond.hettinger at> wrote:
> Brett,  Does the import mechanism for importing packages preserve enough information to be able to figure-out where all the components are defined?  I'm wondering if it is possible for the class browser to be built-out to scan/navigate class structure across a module that has been split into a package.

--Guido van Rossum (

From ben+python at  Wed Nov  3 01:47:55 2010
From: ben+python at (Ben Finney)
Date: Wed, 03 Nov 2010 11:47:55 +1100
Subject: [Python-Dev] On breaking modules into packages Was:
	[issue10199] Move Demo/turtle under Lib/
References: <>
Message-ID: <>

Antoine Pitrou <solipsis at> writes:

> I don't agree with this. Until it's documented, it's an implementation
> detail and should be able to change without notice.

If it's an implementation detail, shouldn't it be named as one (i.e.
with a leading underscore)?

> If someone wants to depend on some undocumented detail of the
> directory layout it's their problem (like people depending on bytecode
> and other stuff).

I would say that names without a single leading underscore are part of
the public API, whether documented or not.

 \         "Your [government] representative owes you, not his industry |
  `\    only, but his judgment; and he betrays, instead of serving you, |
_o__)        if he sacrifices it to your opinion." --Edmund Burke, 1774 |
Ben Finney

From fuzzyman at  Wed Nov  3 02:08:21 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 03 Nov 2010 01:08:21 +0000
Subject: [Python-Dev] Question on imports and packages
In-Reply-To: <>
References: <>
Message-ID: <>

On 02/11/2010 23:44, Raymond Hettinger wrote:
> Brett,  Does the import mechanism for importing packages preserve enough information to be able to figure-out where all the components are defined?  I'm wondering if it is possible for the class browser to be built-out to scan/navigate class structure across a module that has been split into a package.

Can it not do that through static analysis - just look at the classes / 
functions defined in the sub-modules? I mean, you could do it from the 
ast, right? Relying on importing code to analyse it is unpleasant if the 
code has top level side-effects (which no good code does of course).

There may be *some* cases where magic makes things weird (__package__), 
but how common are those in practice?

If you build up a data-structure representing definitions in a package, 
working out where any individual class / function used in a module is 
defined is a matter of looking at where it is imported (assuming it 
hasn't been aliased or fetched dynamically) and matching the import to a 
package you have analysed (or analyse on the fly).
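
As a rough sketch of the static approach, the standard ast module can
list the definitions in a source file without ever executing it. The
sample source and the helper name top_level_definitions are made up
for illustration:

```python
import ast

# Stand-in for the text of one sub-module; a real tool would read each
# file of the package from disk instead.
SAMPLE_SOURCE = '''
import os

class Widget:
    def render(self):
        pass

def helper():
    pass

value = 1
'''


def top_level_definitions(source):
    """List (kind, name) for classes and functions defined at module level."""
    tree = ast.parse(source)  # parses only; the code is never run
    wanted = (ast.ClassDef, ast.FunctionDef)
    return [(type(node).__name__, node.name)
            for node in tree.body
            if isinstance(node, wanted)]


definitions = top_level_definitions(SAMPLE_SOURCE)
```

Running this over every file in a package gives the per-module table of
definitions; matching Import and ImportFrom nodes against that table is
the remaining step and is not shown here.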

A project that attempts to do something like this is pysmell:

All the best,


> Raymond

From exarkun at  Wed Nov  3 03:23:47 2010
From: exarkun at (exarkun at
Date: Wed, 03 Nov 2010 02:23:47 -0000
Subject: [Python-Dev] On breaking modules into packages Was:
	[issue10199]	Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <20101103022347.2040.1348266477.divmod.xquotient.512@localhost.localdomain>

On 12:47 am, ben+python at wrote:
>Antoine Pitrou <solipsis at> writes:
>>I don't agree with this. Until it's documented, it's an implementation
>>detail and should be able to change without notice.
>If it's an implementation detail, shouldn't it be named as one (i.e.
>with a leading underscore)?
>>If someone wants to depend on some undocumented detail of the
>>directory layout it's their problem (like people depending on bytecode
>>and other stuff).
>I would say that names without a single leading underscore are part of
>the public API, whether documented or not.

And if that isn't the rule, then what the heck is?


From fdrake at  Wed Nov  3 03:49:29 2010
From: fdrake at (Fred Drake)
Date: Tue, 2 Nov 2010 22:49:29 -0400
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 8:47 PM, Ben Finney <ben+python at> wrote:
> I would say that names without a single leading underscore are part of
> the public API, whether documented or not.

I don't recall this ever being the standard library's policy.  There are
many modules using leading underscores as hints, and many others which

Other people consider the presence of a docstring on a non-underscored
name significant, while still others refer to the out-of-line
documentation as definitive.

For modules, an __all__ attribute is generally agreed on as definitive,
if present, but that's a fairly limited case.
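
To make the __all__ case concrete, here is a small sketch. The helper
public_names is hypothetical; its fallback (names without a leading
underscore) mirrors what "from module import *" does when __all__ is
absent, which is only one of the competing conventions and not a
stated policy:

```python
import types


def public_names(module):
    """Return __all__ if defined, else names without a leading underscore."""
    declared = getattr(module, "__all__", None)
    if declared is not None:
        return sorted(declared)
    return sorted(name for name in vars(module) if not name.startswith("_"))


# Two throwaway modules, built in memory purely for illustration.
with_all = types.ModuleType("with_all")
with_all.__all__ = ["spam"]
with_all.spam = 1
with_all.eggs = 2          # not underscored, yet excluded by __all__

without_all = types.ModuleType("without_all")
without_all.spam = 1
without_all._hidden = 2    # excluded only by the underscore convention
```

Note that eggs is treated as public or private depending solely on
whether __all__ exists, which is exactly why no single rule settles
the question.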

At this point, there isn't a single clear way to determine if something
is public API.  I doubt we are likely to agree on a single
definition now without engendering a lengthy discussion on whether names
can be changed to reflect such a policy (where backward compatibility is
sure to be lost).

I'm sticking to the out-of-line documentation to determine what's public.


Fred L. Drake, Jr.    <fdrake at>
"A storm broke loose in my mind."  --Albert Einstein

From brett at  Wed Nov  3 03:50:11 2010
From: brett at (Brett Cannon)
Date: Tue, 2 Nov 2010 19:50:11 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 16:43, Guido van Rossum <guido at> wrote:
> On Tue, Nov 2, 2010 at 4:39 PM, Michael Foord <fuzzyman at> wrote:
>> As the maintainer of unittest I'd like to say that in the absence of clear
>> consensus that the merger should happen, or a fiat from the BDFL, the merger
>> won't happen. I believe that this is standard Python development process.
> I don't think that anybody seriously expected the unittest package
> would be restructured again. The remaining thrust of the thread seems
> to be whether PEP 8 should advise against breaking code up into many
> little modules. Personally I don't think it should -- it should by now
> be clear that this is not an area where one style will fit all. I also
> think that the convenience of one style over another might have
> something to do with the tools being used.

This is not what I am suggesting for PEP 8. I want to say that a
package's file structure should reflect the public API. I personally
have no trouble with modules in packages that do not have a ton of
objects in them. I just think if you have pkg/, pkg.mod should
be exposed in the API, else name the file In the case of
unittest that would just mean documenting that it's and that unittest.TestCase is for legacy
reasons, much like os.path is just blindly added on to os even though
it is a separate module(s).

From fuzzyman at  Wed Nov  3 03:50:51 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 03 Nov 2010 02:50:51 +0000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 02/11/2010 02:35, Brett Cannon wrote:
> On Wed, Oct 27, 2010 at 03:42, Antoine Pitrou<solipsis at>  wrote:
>> On Tue, 26 Oct 2010 22:06:37 -0400
>> Alexander Belopolsky<alexander.belopolsky at>  wrote:
>>> While I appreciate your and Michael's eloquence, I don't really see
>>> why five 400-line modules are necessarily easier to maintain than one
>>> 2000-line module.  Splitting code into modules is certainly a good
>>> thing when the resulting modules can be used independently.  This
>>> helps users write leaner programs, reduces mental footprint of
>>> individual modules, etc, etc.   The split unittest module does not
>>> bring any such benefits.  It still presents a single "big-ball-of-mud"
>>> namespace, only rather than implemented in a single file, it is now
>>> swept in from eight different files.
>> Are you saying that it has become a pile of medium-sized balls of mud?
>> I would like to say thanks for the mud, Michael! It's high quality mud
>> for sure.
> I realize I am a little late in this reply but issue 10273 linked to
> this and so now I am actually bothering to read this thread since it
> felt like bikeshedding when the thread began.
> I think the issue here is that the file structure of the code no
> longer matches the public API documented by unittest. Personally I,
> like most people it seems, prefer source files to be structured in a
> way to match the public API. In the case of unittest Michael didn't.

Well the structure *does* match the API (which is primarily why I 
disagree with Raymond that this is a 'bad split').

How could we have split the module into a package in a way that matched 
the API, whilst still retaining backwards compatibility with the old 
API? We had no choice but to export the public names at the top level.

> He did ask python-dev if it was okay to do what he did, we all kept
> quiet, and now we have realized that most of us prefer to have files

Most of us? Raymond, Alexander and Martin are the only ones I *recall* 
complaining about the split specifically in this thread and not all of 
those were on the grounds you mention. Several people supported the 
split. Guido didn't object to it on these grounds and Antoine noted that 
finding core classes was generally straightforward.

> [snip...]
> So basically it seems like we have learned a lesson: we prefer to have
> our code structured in files that match the public API.

When designing packages from the ground up that is a sensible rule of 
thumb to follow, but usually follows naturally from good design. This 
doesn't necessarily mean that all the sub-modules will export public 
APIs for consumers, so this is where I disagree with this. Python's 
package system is a very useful way of providing internal structure for 
projects. That doesn't mean that this structure must always be exposed
publicly.

It should be as easy to navigate as possible (and there is plenty about 
the old module that wasn't easy to navigate I can assure 
you), but I *don't* think that the new package fails in a substantially 
greater way on that score.

As Guido points out, this may depend a lot on which tools you use. I 
could write more about the role and value of packages, but I guess I'll 
save it for a blog post.

All the best,

Michael Foord

> I think that
> is a legitimate design rule for the stdlib to follow from now on, but
> in the case of unittest it's too late to change it back (and it's a
> minor price to pay to learn this lesson and to have Michael
> maintaining unittest like he has been, plus we could consider using
> the new structure so that the public API matches the file structure
> when the need arises).
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at
> Unsubscribe:


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From brett at  Wed Nov  3 03:54:05 2010
From: brett at (Brett Cannon)
Date: Tue, 2 Nov 2010 19:54:05 -0700
Subject: [Python-Dev] Question on imports and packages
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 17:35, Guido van Rossum <guido at> wrote:
> If you are importing the code, the __module__ attribute on each class
> should tell you where it is actually defined (as opposed to where you
> imported it from). Then sys.modules gives you the module object which
> has a __file__ attribute, etc.
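
The lookup described in the quoted paragraph can be sketched as follows. This is only an illustration using unittest.TestCase as the class being traced; the variable names are arbitrary.

```python
import sys
import unittest

cls = unittest.TestCase

# __module__ names the module where the class was defined, which can
# differ from the module it was imported from.
defining_name = cls.__module__
defining_module = sys.modules[defining_name]

print(defining_name)
# Modules loaded from source files normally carry a __file__ attribute.
print(getattr(defining_module, "__file__", "<no __file__>"))
```

This approach requires actually importing the code, so any import-time side effects of the module will run.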

What Guido said. It's the equivalent of browsing an object that a
function returned to you. Working backwards to where something is
defined has nothing to do with imports and more to do with __module__,
__class__, etc. Import has nothing to do with introspection for things
that you access off of a module that happened to have imported the

> On Tue, Nov 2, 2010 at 4:44 PM, Raymond Hettinger
> <raymond.hettinger at> wrote:
>> Brett, does the import mechanism for importing packages preserve enough information to be able to figure out where all the components are defined?  I'm wondering if it is possible for the class browser to be built out to scan/navigate class structure across a module that has been split into a package.
> --
> --Guido van Rossum (

From brett at  Wed Nov  3 03:57:48 2010
From: brett at (Brett Cannon)
Date: Tue, 2 Nov 2010 19:57:48 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 19:50, Michael Foord <fuzzyman at> wrote:
> On 02/11/2010 02:35, Brett Cannon wrote:
>> On Wed, Oct 27, 2010 at 03:42, Antoine Pitrou<solipsis at>  wrote:
>>> On Tue, 26 Oct 2010 22:06:37 -0400
>>> Alexander Belopolsky<alexander.belopolsky at>  wrote:
>>>> While I appreciate your and Michael's eloquence, I don't really see
>>>> why five 400-line modules are necessarily easier to maintain than one
>>>> 2000-line module.  Splitting code into modules is certainly a good
>>>> thing when the resulting modules can be used independently.  This
>>>> helps users write leaner programs, reduces mental footprint of
>>>> individual modules, etc, etc.   The split unittest module does not
>>>> bring any such benefits.  It still presents a single "big-ball-of-mud"
>>>> namespace, only rather than implemented in a single file, it is now
>>>> swept in from eight different files.
>>> Are you saying that it has become a pile of medium-sized balls of mud?
>>> I would like to say thanks for the mud, Michael! It's high quality mud
>>> for sure.
>> I realize I am a little late in this reply but issue 10273 linked to
>> this and so now I am actually bothering to read this thread since it
>> felt like bikeshedding when the thread began.
>> I think the issue here is that the file structure of the code no
>> longer matches the public API documented by unittest. Personally I,
>> like most people it seems, prefer source files to be structured in a
>> way to match the public API. In the case of unittest Michael didn't.
> Well the structure *does* match the API (which is primarily why I disagree
> with Raymond that this is a 'bad split').

But not the public API as documented, e.g., it's documented as
unittest.TestCase, not as the file structure suggests.

> How could we have split the module into a package in a way that matched the
> API, whilst still retaining backwards compatibility with the old API? We had
> no choice but to export the public names at the top level.

I'm not disagreeing with that. What I am saying is we can now document
that it's instead of having that just be an
implementation detail.


>> He did ask python-dev if it was okay to do what he did, we all kept
>> quiet, and now we have realized that most of us prefer to have files
> Most of us? Raymond, Alexander and Martin are the only ones I *recall*
> complaining about the split specifically in this thread and not all of those
> were on the grounds you mention. Several people supported the split. Guido
> didn't object to it on these grounds and Antoine noted that finding core
> classes was generally straightforward.
>> [snip...]
>> So basically it seems like we have learned a lesson: we prefer to have
>> our code structured in files that match the public API.
> When designing packages from the ground up that is a sensible rule of thumb
> to follow, but usually follows naturally from good design. This doesn't
> necessarily mean that all the sub-modules will export public APIs for
> consumers, so this is where I disagree with this. Python's package system is
> a very useful way of providing internal structure for projects. That doesn't
> mean that this structure must always be exposed publicly.
> It should be as easy to navigate as possible (and there is plenty about the
> old module that wasn't easy to navigate I can assure you), but I
> *don't* think that the new package fails in a substantially greater way on
> that score.
> As Guido points out, this may depend a lot on which tools you use. I could
> write more about the role and value of packages, I guess I'll save it for a
> blog post.
> All the best,
> Michael Foord
>> I think that
>> is a legitimate design rule for the stdlib to follow from now on, but
>> in the case of unittest it's too late to change it back (and it's a
>> minor price to pay to learn this lesson and to have Michael
>> maintaining unittest like he has been, plus we could consider using
>> the new structure so that the public API matches the file structure
>> when the need arises).

From guido at  Wed Nov  3 04:02:42 2010
From: guido at (Guido van Rossum)
Date: Tue, 2 Nov 2010 20:02:42 -0700
Subject: [Python-Dev] Question on imports and packages
In-Reply-To: <>
References: <>
Message-ID: <>

FWIW, I also agree with Michael that static analysis would be much
preferred. You never know what side effects importing a module has.
(This could even be construed as an attack vector.)


On Tue, Nov 2, 2010 at 7:54 PM, Brett Cannon <brett at> wrote:
> On Tue, Nov 2, 2010 at 17:35, Guido van Rossum <guido at> wrote:
>> If you are importing the code, the __module__ attribute on each class
>> should tell you where it is actually defined (as opposed to where you
>> imported it from). Then sys.modules gives you the module object which
>> has a __file__ attribute, etc.
> What Guido said. It's the equivalent of browsing an object that a
> function returned to you. Working backwards to where something is
> defined has nothing to do with imports and more to do with __module__,
> __class__, etc. Import has nothing to do with introspection for things
> that you access off of a module that happened to have imported the
> object.
>> On Tue, Nov 2, 2010 at 4:44 PM, Raymond Hettinger
>> <raymond.hettinger at> wrote:
>>> Brett, does the import mechanism for importing packages preserve enough information to be able to figure out where all the components are defined?  I'm wondering if it is possible for the class browser to be built out to scan/navigate class structure across a module that has been split into a package.
>> --
>> --Guido van Rossum (

--Guido van Rossum (

From solipsis at  Wed Nov  3 04:33:48 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 3 Nov 2010 04:33:48 +0100
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
References: <>
Message-ID: <>

On Wed, 03 Nov 2010 11:47:55 +1100
Ben Finney <ben+python at> wrote:
> > If someone wants to depend on some undocumented detail of the
> > directory layout it's their problem (like people depending on bytecode
> > and other stuff).
> I would say that names without a single leading underscore are part of
> the public API, whether documented or not.

That's not what we are talking about; we are talking about their
locations. If the official location is the unittest package, then
I don't see why we should also support undocumented locations just
because they happen to work. Otherwise we should also support e.g.
"unittest.unlink" if the unittest package happens to have "from os
import unlink" at its top. I don't think it's reasonable.
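
The leak described above can be reproduced with a throwaway module. In this sketch "demo_pkg" and "TestThing" are made-up names, and defined_under() is a rough heuristic of my own for separating names a package defines from names it merely imported; it is not something proposed in the thread.

```python
import os
import types

# "demo_pkg" is a made-up stand-in for a package whose top-level module
# happens to import a helper from os.
source = '''
from os import unlink

class TestThing:
    pass
'''
demo_pkg = types.ModuleType("demo_pkg")
exec(source, demo_pkg.__dict__)

# The imported name is reachable through the package, even though it was
# never meant to be part of its API.
print(demo_pkg.unlink is os.unlink)


def defined_under(obj, package_name):
    """Heuristic: was obj defined in package_name or one of its submodules?"""
    module_name = getattr(obj, "__module__", None) or ""
    return (module_name == package_name
            or module_name.startswith(package_name + "."))


print(defined_under(demo_pkg.TestThing, "demo_pkg"))
print(defined_under(demo_pkg.unlink, "demo_pkg"))
```

The heuristic depends on objects carrying a meaningful __module__, so it says nothing about plain constants or about deliberate re-exports from other packages.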


From solipsis at  Wed Nov  3 04:35:38 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 3 Nov 2010 04:35:38 +0100
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, 2 Nov 2010 19:57:48 -0700
Brett Cannon <brett at> wrote:
> >
> > How could we have split the module into a package in a way that matched the
> > API, whilst still retaining backwards compatibility with the old API? We had
> > no choice but to export the public names at the top level.
> I'm not disagreeing with that. What I am saying is can now document
> that it's instead of having that just be an
> implementation detail.

-1.  unittest.TestCase is far simpler and more obvious than any
javaesque qualified name.  urllib.request and friends are already
annoying enough.



From guido at  Wed Nov  3 04:01:18 2010
From: guido at (Guido van Rossum)
Date: Tue, 2 Nov 2010 20:01:18 -0700
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 7:50 PM, Brett Cannon <brett at> wrote:
> This is not what I am suggesting for PEP 8. I want to say that a
> package's file structure should reflect the public API.

But what does that mean? I could argue that unittest's structure
(TestCase in, etc.) reflects its public API just fine.

> I personally
> have no trouble with modules in packages that do not have a ton of
> objects in them. I just think if you have pkg/, pkg.mod should
> be exposed in the API, else name the file In the case of
> unittest that would just mean documenting that it's
> and that unittest.TestCase is for legacy
> reasons, much like os.path is just blindly added on to os even though
> it is a separate module(s).

I really don't think we should encourage the use as -- it's unnecessarily introducing structure. I
think it's fine now that the cat is out of the bag to document as an alternative spelling, but I don't think
it should be the preferred one.

os.path is so old that it should not be taken as an example for anything.
(It predates packages!) But it should not be changed either; there'd
be too much churn.

--Guido van Rossum (

From ben+python at  Wed Nov  3 05:29:18 2010
From: ben+python at (Ben Finney)
Date: Wed, 03 Nov 2010 15:29:18 +1100
Subject: [Python-Dev] On breaking modules into packages
References: <>
	<> <>
Message-ID: <>

Antoine Pitrou <solipsis at> writes:

> On Wed, 03 Nov 2010 11:47:55 +1100
> Ben Finney <ben+python at> wrote:
> > 
> > > If someone wants to depend on some undocumented detail of the
> > > directory layout it's their problem (like people depending on
> > > bytecode and other stuff).
> > 
> > I would say that names without a single leading underscore are part
> > of the public API, whether documented or not.
> That's not what we are talking about; we are talking about their
> locations. If the official location is the unittest package, then I
> don't see why we should also support undocumented locations just
> because they happen to work.

So long as the names available for import are such that they indicate
whether they're public or implementation-detail (i.e. without a leading
single underscore or with one), I agree that this is distinct from the
issue of locations on the filesystem.

> Otherwise we should also support e.g. "unittest.unlink" if the
> unittest package happens to have "from os import unlink" at its top. I
> don't think it's reasonable.

Hmm. That example does give me pause. I'm trying to think of a simple
way that such imports are excluded from being "public interface", but
can't immediately think of one.

The distinction is clear in my head, though, for what it's worth :-)

 \        "I don't accept the currently fashionable assertion that any |
  `\       view is automatically as worthy of respect as any equal and |
_o__)                                  opposite view." --Douglas Adams |
Ben Finney

From kristjan at  Wed Nov  3 06:08:02 2010
From: kristjan at (=?utf-8?B?S3Jpc3Rqw6FuIFZhbHVyIErDs25zc29u?=)
Date: Wed, 3 Nov 2010 13:08:02 +0800
Subject: [Python-Dev] On breaking modules into packages
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Just a small input into this discussion:

In EVE, for historical reasons, we implemented our own importing mechanism.  I think it is because we started out with an ancient Python version that didn't support packages.
Regardless, we still have a system where a hierarchy of files is scanned, and then code in each .py file determines where in the "namespace" it lands.  This can be done declaratively (by using a __guid__ attribute on a class, for instance) or by defining a special __exports__ dict at the module level.
The good thing about this system is that it allows us to separate code in a manner independent of the API.  We can choose, for example, to group all network code in a folder, or to have each class in the "game.entity" namespace defined in its own file.  It unhooks file structure from name structure.
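
For readers who have not seen this kind of scheme, a very rough sketch of the idea follows. It is not the EVE implementation: the function register_exports(), the Ship class, and the assumption that __exports__ maps dotted names to objects are all invented here purely to illustrate decoupling file location from namespace location.

```python
import types


def register_exports(module, root):
    """Attach each object in module.__exports__ under its dotted name on root."""
    for dotted_name, obj in getattr(module, "__exports__", {}).items():
        *parents, leaf = dotted_name.split(".")
        node = root
        for part in parents:
            # Create intermediate namespaces on demand.
            if not hasattr(node, part):
                setattr(node, part, types.SimpleNamespace())
            node = getattr(node, part)
        setattr(node, leaf, obj)


# A stand-in for a source file stored anywhere in the tree, e.g. in a
# folder that groups network code.
class Ship:
    pass


source_file = types.ModuleType("some_file_on_disk")
source_file.__exports__ = {"game.entity.Ship": Ship}  # assumed format

root = types.SimpleNamespace()
register_exports(source_file, root)
print(root.game.entity.Ship is Ship)
```

The file could be moved to another folder without changing the name under which Ship is published, which is the property described above.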

Now, this has its own problems of course, the biggest being that it is non-standard.  Off-the-shelf IDEs have problems with it, and we have to implement dynamic reloading on our own.  The list goes on, and for that reason we are moving away from it in favor of the standard Python import.

However, I am personally not super happy about how this will force one to think in "API" terms when creating source files.  As has been mentioned, files cannot be moved and restructured once in general use, and when writing new code, one has to think long and hard about "where" to put the source, not "what" to put in it.  What is more, a hierarchy, while a convenient system for storing files, does not, IMHO, always map to the problem domain.

But we're having a go at it.
Time will tell if "forcing us to think inside the hierarchy" will be beneficial in the long run.


From g.brandl at  Wed Nov  3 08:06:49 2010
From: g.brandl at (Georg Brandl)
Date: Wed, 03 Nov 2010 07:06:49 +0000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <iar1oc$c3q$>

On 03.11.2010 03:35, Antoine Pitrou wrote:
> On Tue, 2 Nov 2010 19:57:48 -0700
> Brett Cannon <brett at> wrote:
>> >
>> > How could we have split the module into a package in a way that matched the
>> > API, whilst still retaining backwards compatibility with the old API? We had
>> > no choice but to export the public names at the top level.
>> I'm not disagreeing with that. What I am saying is can now document
>> that it's instead of having that just be an
>> implementation detail.
> -1.  unittest.TestCase is far simpler and more obvious than any
> javaesque qualified name.  urllib.request and friends are already
> annoying enough.

Agreed.  There are about 30 names importable from unittest, that is quite
manageable in a single namespace.


Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.

From victor.stinner at  Wed Nov  3 11:55:41 2010
From: victor.stinner at (Victor Stinner)
Date: Wed, 3 Nov 2010 11:55:41 +0100
Subject: [Python-Dev] [Python-checkins] r85902 - in
	python/branches/py3k/Lib: test/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tuesday 02 November 2010 23:38:12, you wrote:
> On Tue, Nov 2, 2010 at 10:55 PM, Victor Stinner
> <victor.stinner at> wrote:
> > I don't know how to ignore the BytesWarning without importing warning.
> > How can I do that?
> I was suggesting trying to fix the bootstrap issue so you could use a
> top-level import, instead of working around it with a function level
> import (which we've learned from experience is a recipe for later
> reports from users of programs deadlocking on the import lock - we've
> made lots of improvement to avoid such deadlocks, but still prefer to
> avoid function level imports anyway).

I don't know if there is a bootstrap issue. I'm using a local import because 
os is always loaded at startup, and get_exec_path() is only used to run a 
subprocess: os.exec*() and subprocess.Popen() (only the POSIX implementation). 
I suppose that a top level "import warnings" would augment the memory 
footprint.


From fuzzyman at  Wed Nov  3 12:25:50 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 03 Nov 2010 11:25:50 +0000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On 03/11/2010 02:57, Brett Cannon wrote:
> On Tue, Nov 2, 2010 at 19:50, Michael Foord<fuzzyman at>  wrote:
>> On 02/11/2010 02:35, Brett Cannon wrote:
>>> On Wed, Oct 27, 2010 at 03:42, Antoine Pitrou<solipsis at>    wrote:
>>>> On Tue, 26 Oct 2010 22:06:37 -0400
>>>> Alexander Belopolsky<alexander.belopolsky at>    wrote:
>>>>> While I appreciate your and Michael's eloquence, I don't really see
>>>>> why five 400-line modules are necessarily easier to maintain than one
>>>>> 2000-line module.  Splitting code into modules is certainly a good
>>>>> thing when the resulting modules can be used independently.  This
>>>>> helps users write leaner programs, reduces mental footprint of
>>>>> individual modules, etc, etc.   The split unittest module does not
>>>>> bring any such benefits.  It still presents a single "big-ball-of-mud"
>>>>> namespace, only rather than implemented in a single file, it is now
>>>>> swept in from eight different files.
>>>> Are you saying that it has become a pile of medium-sized balls of mud?
>>>> I would like to say thanks for the mud, Michael! It's high quality mud
>>>> for sure.
>>> I realize I am a little late in this reply but issue 10273 linked to
>>> this and so now I am actually bothering to read this thread since it
>>> felt like bikeshedding when the thread began.
>>> I think the issue here is that the file structure of the code no
>>> longer matches the public API documented by unittest. Personally I,
>>> like most people it seems, prefer source files to be structured in a
>>> way to match the public API. In the case of unittest Michael didn't.
>> Well the structure *does* match the API (which is primarily why I disagree
>> with Raymond that this is a 'bad split').
> But not the public API as documented, e.g., it's documented as
> unittest.TestCase, not as the file structure
> suggests.

Right. I don't think that whether the unittest package structure is a 
good structure is determined by where we make users import the names 
from. Like others, I see little value in recommending that people use 
the longer form for imports.

All the best,

Michael Foord

>> How could we have split the module into a package in a way that matched the
>> API, whilst still retaining backwards compatibility with the old API? We had
>> no choice but to export the public names at the top level.
> I'm not disagreeing with that. What I am saying is can now document
> that it's instead of having that just be an
> implementation detail.
> -Brett
>>> He did ask python-dev if it was okay to do what he did, we all kept
>>> quiet, and now we have realized that most of us prefer to have files
>> Most of us? Raymond, Alexander and Martin are the only ones I *recall*
>> complaining about the split specifically in this thread and not all of those
>> were on the grounds you mention. Several people supported the split. Guido
>> didn't object to it on these grounds and Antoine noted that finding core
>> classes was generally straightforward.
>>> [snip...]
>>> So basically it seems like we have learned a lesson: we prefer to have
>>> our code structured in files that match the public API.
>> When designing packages from the ground up that is a sensible rule of thumb
>> to follow, but usually follows naturally from good design. This doesn't
>> necessarily mean that all the sub-modules will export public APIs for
>> consumers, so this is where I disagree with this. Python's package system is
>> a very useful way of providing internal structure for projects. That doesn't
>> mean that this structure must always be exposed publicly.
>> It should be as easy to navigate as possible (and there is plenty about the
>> old module that wasn't easy to navigate I can assure you), but I
>> *don't* think that the new package fails in a substantially greater way on
>> that score.
>> As Guido points out, this may depend a lot on which tools you use. I could
>> write more about the role and value of packages, I guess I'll save it for a
>> blog post.
>> All the best,
>> Michael Foord
>>> I think that
>>> is a legitimate design rule for the stdlib to follow from now on, but
>>> in the case of unittest it's too late to change it back (and it's a
>>> minor price to pay to learn this lesson and to have Michael
>>> maintaining unittest like he has been, plus we could consider using
>>> the new structure so that the public API matches the file structure
>>> when the need arises).



From benjamin at  Wed Nov  3 13:19:44 2010
From: benjamin at (Benjamin Peterson)
Date: Wed, 3 Nov 2010 07:19:44 -0500
Subject: [Python-Dev] [Python-checkins] r85902 - in
 python/branches/py3k/Lib: test/
In-Reply-To: <>
References: <>
Message-ID: <>

2010/11/3 Victor Stinner <victor.stinner at>:
> On Tuesday 02 November 2010 23:38:12, you wrote:
>> On Tue, Nov 2, 2010 at 10:55 PM, Victor Stinner
>> <victor.stinner at> wrote:
>> > I don't know how to ignore the BytesWarning without importing warning.
>> > How can I do that?
>> I was suggesting trying to fix the bootstrap issue so you could use a
>> top-level import, instead of working around it with a function level
>> import (which we've learned from experience is a recipe for later
>> reports from users of programs deadlocking on the import lock - we've
>> made lots of improvement to avoid such deadlocks, but still prefer to
>> avoid function level imports anyway).
> I don't know if there is a bootstrap issue. I'm using a local import because
> os is always loaded at startup, and get_exec_path() is only used to run a
> subprocess: os.exec*() and subprocess.Popen() (only the POSIX implementation).
> I suppose that a top level "import warnings" would augment the memory
> footprint.

Warnings is loaded every time anyway.


From hrvoje.niksic at  Wed Nov  3 13:38:58 2010
From: hrvoje.niksic at (Hrvoje Niksic)
Date: Wed, 03 Nov 2010 13:38:58 +0100
Subject: [Python-Dev] On breaking modules into packages
 Was:	[issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<1288740855.3541.24.camel@localhost.localdomain>
Message-ID: <>

On 11/03/2010 01:47 AM, Ben Finney wrote:
>>  If someone wants to depend on some undocumented detail of the
>>  directory layout it's their problem (like people depending on bytecode
>>  and other stuff).
> I would say that names without a single leading underscore are part of
> the public API, whether documented or not.

I understand this reasoning, but I'd like to offer counter-examples. 
For instance, would you say that glob.glob0 and glob.glob1 are public 
API?  They're undocumented, they're not in __all__, but they don't have 
a leading underscore either, and source comments call them "helper 
functions."  I'm sure there are many other examples like that, both 
in the standard library and in Python packages out there.
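
One way to find names in this grey area is sketched below. The helper undeclared_public_names() and the module "demo_glob" are made up for the illustration; the demo module simply mirrors the shape of the glob example (one listed entry point plus helpers without underscores).

```python
import types


def undeclared_public_names(module):
    """Names with no leading underscore that the module's __all__ omits.

    Returns an empty list when the module defines no __all__.
    """
    declared = getattr(module, "__all__", None)
    if declared is None:
        return []
    return sorted(
        name for name in vars(module)
        if not name.startswith("_") and name not in declared
    )


# A made-up module mirroring the glob situation: one listed entry point
# plus helpers that lack a leading underscore.
demo = types.ModuleType("demo_glob")
exec(
    '__all__ = ["search"]\n'
    'def search(): pass\n'
    'def search0(): pass\n'
    'def search1(): pass\n',
    demo.__dict__,
)
print(undeclared_public_names(demo))
```

Run against a real module, the result would also include any names that module imported for its own use, so it is only a starting point for a manual review.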

Other than the existing practice, there is the matter of esthetics. 
Accepting underscore-less identifiers as automatically public leads to a 
proliferation of identifiers with leading underscores, which many people 
(myself included) plainly don't like.

From ncoghlan at  Wed Nov  3 15:00:30 2010
From: ncoghlan at (Nick Coghlan)
Date: Thu, 4 Nov 2010 00:00:30 +1000
Subject: [Python-Dev] [Python-checkins] r85902 - in
 python/branches/py3k/Lib: test/
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Nov 3, 2010 at 10:19 PM, Benjamin Peterson <benjamin at> wrote:
> Warnings is loaded every time anyway.

I would have agreed with you, but the contents of sys.modules in a
just-started interactive interpreter suggests that isn't true any more
(which surprised me).
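
A rough way to check this without relying on memory is to ask a fresh interpreter which modules it has loaded. This is only a sketch: the helper `loaded_at_startup` is hypothetical, and it uses `python -c` rather than a true interactive session, so the result may differ slightly from what the REPL shows.

```python
import subprocess
import sys


def loaded_at_startup(*names):
    """Return the subset of names present in a fresh interpreter's sys.modules."""
    code = "import sys; print(' '.join(n for n in sys.argv[1:] if n in sys.modules))"
    completed = subprocess.run(
        [sys.executable, "-c", code, *names],
        capture_output=True,
        text=True,
        check=True,
    )
    return set(completed.stdout.split())


# Compare the pure-Python module with its C accelerator.
startup = loaded_at_startup("sys", "warnings", "_warnings")
print(sorted(startup))
```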


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Wed Nov  3 15:05:44 2010
From: ncoghlan at (Nick Coghlan)
Date: Thu, 4 Nov 2010 00:05:44 +1000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger
<raymond.hettinger at> wrote:
> Sounds like a decision to split a module into a package is a big commitment.  Each of the individual file names becomes a permanent part of the API.  Even future additional splits are precluded because it might break someone's dotted import (i.e. not a single function can be moved between those files -- once in unittest.utils, always in unittest.utils).

Can Python 2.7 pickles containing unittest classes be unpickled using
2.6 or earlier? Even if nobody uses the new names for imports, I
believe they implicitly end up included in any pickles involving
affected classes (I seem to recall we've been bitten by that before
when moving things around).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From fuzzyman at  Wed Nov  3 15:16:18 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 03 Nov 2010 14:16:18 +0000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 03/11/2010 14:05, Nick Coghlan wrote:
> On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger
> <raymond.hettinger at>  wrote:
>> Sounds like a decision to split a module into a package is a big commitment.  Each of the individual file names becomes a permanent part of the API.  Even future additional splits are precluded because it might break someone's dotted import (i.e. not a single function can be moved between those files -- once in unittest.utils, always in unittest.utils).
> Can Python 2.7 pickles containing unittest classes be unpickled using
> 2.6 or earlier? Even if nobody uses the new names for imports, I
> believe they implicitly end up included in any pickles involving
> affected classes (I seem to recall we've been bitten by that before
> when moving things around).

Yes, since unittest.TestCase is still available (as are all the names). 
I believe so anyway...


> Cheers,
> Nick.


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From solipsis at  Wed Nov  3 15:17:49 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 03 Nov 2010 15:17:49 +0100
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <1288793869.5969.2.camel@localhost.localdomain>

On Wednesday 03 November 2010 at 14:16 +0000, Michael Foord wrote:
> On 03/11/2010 14:05, Nick Coghlan wrote:
> > On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger
> > <raymond.hettinger at>  wrote:
> >> Sounds like a decision to split a module into a package is a big commitment.  Each of the individual file names becomes a permanent part of the API.  Even future additional splits are precluded because it might break someone's dotted import (i.e. not a single function can be moved between those files -- once in unittest.utils, always in unittest.utils).
> > Can Python 2.7 pickles containing unittest classes be unpickled using
> > 2.6 or earlier? Even if nobody uses the new names for imports, I
> > believe they implicitly end up included in any pickles involving
> > affected classes (I seem to recall we've been bitten by that before
> > when moving things around).
> Yes, since unittest.TestCase is still available (as are all the names). 
> I believe so anyway...

unittest.TestCase is not really pickleable. There were
test_multiprocessing issues because of that (see recent SVN checkins).



From fuzzyman at  Wed Nov  3 15:26:23 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 03 Nov 2010 14:26:23 +0000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <1288793869.5969.2.camel@localhost.localdomain>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 03/11/2010 14:17, Antoine Pitrou wrote:
> On Wednesday 03 November 2010 at 14:16 +0000, Michael Foord wrote:
>> On 03/11/2010 14:05, Nick Coghlan wrote:
>>> On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger
>>> <raymond.hettinger at>   wrote:
>>>> Sounds like a decision to split a module into a package is a big commitment.  Each of the individual file names becomes a permanent part of the API.  Even future additional splits are precluded because it might break someone's dotted import (i.e. not a single function can be moved between those files -- once in unittest.utils, always in unittest.utils).
>>> Can Python 2.7 pickles containing unittest classes be unpickled using
>>> 2.6 or earlier? Even if nobody uses the new names for imports, I
>>> believe they implicitly end up included in any pickles involving
>>> affected classes (I seem to recall we've been bitten by that before
>>> when moving things around).
>> Yes, since unittest.TestCase is still available (as are all the names).
>> I believe so anyway...
> unittest.TestCase is not really pickleable. There were
> test_multiprocessing issues because of that (see recent SVN checkins).

Interesting. We made some fixes before 2.7 to ensure they were copyable, 
but we fixed this in the copy module. TestCase instances now store some 
method objects in a dictionary which may make them unpickleable, so that 
could be a new problem. I'll test with 2.6 and 2.7 to see.

An easy fix would be to store the method names rather than the method 
objects themselves (if this is indeed the cause of the problem). This is 
what unittest2 does so that it works with earlier versions of Python 
that don't have the fix we put in copy.
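
As a minimal sketch of that idea (the class and attribute names below are hypothetical, not the actual unittest code): keep only method names in the instance dictionary and look the bound method up at call time, so the instance state contains nothing but plain strings.

```python
class Checker:
    """Hypothetical class that dispatches on argument type via a lookup table."""

    def __init__(self):
        # Store method *names*, not bound methods, so the instance state
        # holds only plain strings.
        self._equality_funcs = {list: "compare_lists", dict: "compare_dicts"}

    def compare_lists(self, first, second):
        return list(first) == list(second)

    def compare_dicts(self, first, second):
        return dict(first) == dict(second)

    def compare(self, first, second):
        name = self._equality_funcs.get(type(first))
        if name is None:
            return first == second
        # Resolve the name to a bound method only when it is needed.
        return getattr(self, name)(first, second)


checker = Checker()
print(checker.compare([1, 2], [1, 2]))
```

The cost is one getattr per call; the benefit is that copying or pickling the instance never has to deal with method objects.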


> Regards
> Antoine.

From solipsis at  Wed Nov  3 15:33:18 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 03 Nov 2010 15:33:18 +0100
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <1288794798.5969.6.camel@localhost.localdomain>

On Wednesday 03 November 2010 at 14:26 +0000, Michael Foord wrote:
> Interesting. We made some fixes before 2.7 to ensure they were copyable, 
> but we fixed this in the copy module. TestCase instances now store some 
> method objects in a dictionary which may make them unpickleable, so that 
> could be a new problem. I'll test with 2.6 and 2.7 to see.

I don't think it is a problem in unittest, unless pickling TestCase
objects is really useful. I have fixed the problem in
test_multiprocessing instead.



From fuzzyman at  Wed Nov  3 15:38:56 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 03 Nov 2010 14:38:56 +0000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<1288793869.5969.2.camel@localhost.localdomain>
Message-ID: <>

On 03/11/2010 14:26, Michael Foord wrote:
> On 03/11/2010 14:17, Antoine Pitrou wrote:
>> On Wednesday 03 November 2010 at 14:16 +0000, Michael Foord wrote:
>>> On 03/11/2010 14:05, Nick Coghlan wrote:
>>>> On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger
>>>> <raymond.hettinger at>   wrote:
>>>>> Sounds like a decision to split a module into a package is a big 
>>>>> commitment.  Each of the individual file names becomes a permanent 
>>>>> part of the API.  Even future additional splits are precluded 
>>>>> because it might break someone's dotted import (i.e. not a single 
>>>>> function can be moved between those files -- once in 
>>>>> unittest.utils, always in unittest.utils).
>>>> Can Python 2.7 pickles containing unittest classes be unpickled using
>>>> 2.6 or earlier? Even if nobody uses the new names for imports, I
>>>> believe they implicitly end up included in any pickles involving
>>>> affected classes (I seem to recall we've been bitten by that before
>>>> when moving things around).
>>> Yes, since unittest.TestCase is still available (as are all the names).
>>> I believe so anyway...
>> unittest.TestCase is not really pickleable. There were
>> test_multiprocessing issues because of that (see recent SVN checkins).
> Interesting. We made some fixes before 2.7 to ensure they were 
> copyable, but we fixed this in the copy module. TestCase instances now 
> store some method objects in a dictionary which may make them 
> unpickleable, so that could be a new problem. I'll test with 2.6 and 
> 2.7 to see.
> An easy fix would be to store the method names rather than the method 
> objects themselves (if this is indeed the cause of the problem). This is 
> what unittest2 does so that it works with earlier versions of Python 
> that don't have the fix we put in copy.
Yep, looks like 2.7 introduced a bug making it impossible to pickle 
TestCase instances. I think it will be easy to fix, I'll create a 
specific issue:

Python 2.6.5 (r265:79359, Mar 24 2010, 01:32:55)
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
 >>> from unittest import TestCase
 >>> from pickle import dumps
 >>> t = TestCase('assert_')
 >>> dumps(t)
the test unless the expression is 
 >>> michael$ python2.7
ActivePython (ActiveState Software Inc.) based on
Python 2.7 (r27:82500, Jul  4 2010, 13:58:56)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
 >>> from unittest import TestCase
 >>> from pickle import dumps
 >>> t = TestCase('assert_')
 >>> dumps(t)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
line 1374, in dumps
     Pickler(file, protocol).dump(obj)
line 306, in save
     rv = reduce(self.proto)
line 70, in _reduce_ex
     raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle instancemethod objects

All the best,


> Michael
>> Regards
>> Antoine.

From eric at  Wed Nov  3 15:53:11 2010
From: eric at (Eric Smith)
Date: Wed, 03 Nov 2010 10:53:11 -0400
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 11/3/10 10:16 AM, Michael Foord wrote:
> On 03/11/2010 14:05, Nick Coghlan wrote:
>> On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger
>> <raymond.hettinger at> wrote:
>>> Sounds like a decision to split a module into a package is a big
>>> commitment. Each of the individual file names becomes a permanent
>>> part of the API. Even future additional splits are precluded because
>>> it might break someone's dotted import (i.e. not a single function can
>>> be moved between those files -- once in unittest.utils, always in
>>> unittest.utils).
>> Can Python 2.7 pickles containing unittest classes be unpickled using
>> 2.6 or earlier? Even if nobody uses the new names for imports, I
>> believe they implicitly end up included in any pickles involving
>> affected classes (I seem to recall we've been bitten by that before
>> when moving things around).
> Yes, since unittest.TestCase is still available (as are all the names).
> I believe so anyway...

Actually I think the answer is "no" (assuming you could pickle a 
TestCase). Here's an example with TestLoader:

$ python27
Python 2.7.0+ (release27-maint:85878, Oct 28 2010, 06:40:25)
[GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> import unittest
 >>> x = unittest.TestLoader()
 >>> import pickle
 >>> pickle.dumps(x)
'ccopy_reg\n_reconstructor\np0\n(cunittest.loader\nTestLoader\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.'
 >>>

$ python24
Python 2.4.4 (#1, Oct 23 2006, 13:58:00)
[GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> import pickle
 >>> pickle.loads('ccopy_reg\n_reconstructor\np0\n(cunittest.loader\nTestLoader\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.')
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "/usr/lib/python2.4/", line 1394, in loads
     return Unpickler(file).load()
   File "/usr/lib/python2.4/", line 872, in load
     dispatch[key](self)
   File "/usr/lib/python2.4/", line 1104, in load_global
     klass = self.find_class(module, name)
   File "/usr/lib/python2.4/", line 1138, in find_class
     __import__(module)
ImportError: No module named loader

The problem is that there is no unittest.loader in 2.4, and 
unittest.loader.TestLoader is the name that the 2.7 pickle creates. We 
see this problem every time we try and move anything in the stdlib.
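
The same effect can be checked on a current interpreter with a short sketch like the one below. It assumes, as I believe is the case on Python 3, that a TestLoader instance is picklable; the variable names are illustrative only.

```python
import pickle
import unittest

# Import through the public name, exactly as user code would.
loader = unittest.TestLoader()
data = pickle.dumps(loader, protocol=0)

# The pickle records the module where the class is *defined*,
# not the name that was used to reach it.
defining_module = type(loader).__module__
print(defining_module)
print(defining_module.encode("ascii") in data)
```

So even code that only ever writes `unittest.TestLoader` ends up depending on the submodule name as soon as an instance is pickled.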


From barry at  Wed Nov  3 15:54:33 2010
From: barry at (Barry Warsaw)
Date: Wed, 3 Nov 2010 10:54:33 -0400
Subject: [Python-Dev] On breaking modules into packages
In-Reply-To: <>
References: <>
Message-ID: <20101103105433.1530f1fd@mission>

On Nov 03, 2010, at 11:06 AM, Ben Finney wrote:

>Is this a case where it would be better if the package names had the
>leading underscore: "_utils", "_suite", etc.?
>Does the convention on single-leading-underscore identifiers as "don't
>rely on this name staying the same in future versions" hold for package

I would vote "yes".  I have seen more and more packages use this convention to
signal that the module name is not intended to be imported directly.  This
should be part of any PEP 8 recommendation, IMO.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: <>

From barry at  Wed Nov  3 15:55:07 2010
From: barry at (Barry Warsaw)
Date: Wed, 3 Nov 2010 10:55:07 -0400
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <1288740855.3541.24.camel@localhost.localdomain>
References: <>
Message-ID: <20101103105507.11e12b41@mission>

On Nov 03, 2010, at 12:34 AM, Antoine Pitrou wrote:

>I don't agree with this. Until it's documented, it's an implementation
>detail and should be able to change without notice.
>If someone wants to depend on some undocumented detail of the directory
>layout it's their problem (like people depending on bytecode and other

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: <>

From fuzzyman at  Wed Nov  3 15:56:45 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 03 Nov 2010 14:56:45 +0000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 03/11/2010 14:53, Eric Smith wrote:
> On 11/3/10 10:16 AM, Michael Foord wrote:
>> On 03/11/2010 14:05, Nick Coghlan wrote:
>>> On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger
>>> <raymond.hettinger at> wrote:
>>>> Sounds like a decision to split a module into a package is a big
>>>> commitment. Each of the individual file names becomes a permanent
>>>> part of the API. Even future additional splits are precluded because
>>>> it might break someone's dotted import (i.e. not a single function can
>>>> be moved between those files -- once in unittest.utils, always in
>>>> unittest.utils).
>>> Can Python 2.7 pickles containing unittest classes be unpickled using
>>> 2.6 or earlier? Even if nobody uses the new names for imports, I
>>> believe they implicitly end up included in any pickles involving
>>> affected classes (I seem to recall we've been bitten by that before
>>> when moving things around).
>> Yes, since unittest.TestCase is still available (as are all the names).
>> I believe so anyway...
> Actually I think the answer is "no" (assuming you could pickle a 
> TestCase). Here's an example with TestLoader:

Ah dammit, I read the question the other way round.


> $ python27
> Python 2.7.0+ (release27-maint:85878, Oct 28 2010, 06:40:25)
> [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import unittest
> >>> x = unittest.TestLoader()
> >>> import pickle
> >>> pickle.dumps(x)
> 'ccopy_reg\n_reconstructor\np0\n(cunittest.loader\nTestLoader\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.' 
> >>>
> $ python24
> Python 2.4.4 (#1, Oct 23 2006, 13:58:00)
> [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import pickle
> >>> 
> pickle.loads('ccopy_reg\n_reconstructor\np0\n(cunittest.loader\nTestLoader\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.')
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> File "/usr/lib/python2.4/", line 1394, in loads
> return Unpickler(file).load()
> File "/usr/lib/python2.4/", line 872, in load
> dispatch[key](self)
> File "/usr/lib/python2.4/", line 1104, in load_global
> klass = self.find_class(module, name)
> File "/usr/lib/python2.4/", line 1138, in find_class
> __import__(module)
> ImportError: No module named loader
> The problem is that there is no unittest.loader in 2.4, and 
> unittest.loader.TestLoader is the name that the 2.7 pickle creates. We 
> see this problem every time we try and move anything in the stdlib.



From benjamin at  Wed Nov  3 16:01:32 2010
From: benjamin at (Benjamin Peterson)
Date: Wed, 3 Nov 2010 10:01:32 -0500
Subject: [Python-Dev] [Python-checkins] r85902 - in
 python/branches/py3k/Lib: test/
In-Reply-To: <>
References: <>
Message-ID: <>

2010/11/3 Nick Coghlan <ncoghlan at>:
> On Wed, Nov 3, 2010 at 10:19 PM, Benjamin Peterson <benjamin at> wrote:
>> Warnings is loaded every time anyway.
> I would have agreed with you, but the contents of sys.modules in a
> just-started interactive interpreter suggests that isn't true any more
> (which surprised me).

Is that perhaps because of _warnings?


From eric at  Wed Nov  3 16:25:36 2010
From: eric at (Eric Smith)
Date: Wed, 03 Nov 2010 11:25:36 -0400
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 11/3/10 10:53 AM, Eric Smith wrote:

> The problem is that there is no unittest.loader in 2.4, and
> unittest.loader.TestLoader is the name that the 2.7 pickle creates. We
> see this problem every time we try and move anything in the stdlib.

And BTW: for me, this is the strongest reason not to break up modules 
into packages or otherwise reorganize the stdlib.


From alexander.belopolsky at  Wed Nov  3 16:26:18 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Wed, 3 Nov 2010 11:26:18 -0400
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 2, 2010 at 6:58 PM, Guido van Rossum <guido at> wrote:
> To spout a somewhat contrarian opinion, I just browsed the new
> unittest package, and the structure seems reasonable to me, even if
> its submodules are not particularly reusable. I've used this kind of
> style for development myself. What is so offensive about it?

I would not call it "offensive", but what I find annoying is

>>> import unittest
>>> unittest.TestCase.__module__

This may not be a problem for smart tools, but for me and a simple
editor what used to be:

Let's find code for unittest.TestCase.

1. Open Lib/
2. Search for "class TestCase".

is now

1. Open Lib/
-> No such file or directory.

2. OK, I'm in 2.7.  Open Lib/unittest/
3. Search for "class TestCase"
-> beep

4. OK, search for "TestCase"
-> from .case import (TestCase, FunctionTestCase, SkipTest, skip, skipIf, ..

5. Hmm, what is ".case". Ah, the relative import - open
7.  Search for "class TestCase".
8. What exactly was I looking for?

The above is an only slightly exaggerated scenario that I went through
several times when I started using 2.7, before I conditioned myself to
grep in Lib/unittest/*.py.
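
For what it is worth, the interpreter can do this lookup itself. The following is a small sketch (the helper name `locate` is made up) that reports the defining module of an object and, where one exists, the source file it came from.

```python
import inspect
import unittest


def locate(obj):
    """Return (defining module name, source file path or None) for obj."""
    module_name = obj.__module__
    try:
        path = inspect.getsourcefile(obj)
    except TypeError:
        path = None  # built-in objects have no Python source file
    return module_name, path


module_name, path = locate(unittest.TestCase)
print(module_name)
print(path)
```

This does not help someone armed with only a plain editor, which is the complaint above, but it avoids guessing which file in the package to open.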

What is unfortunate is that the file split was accompanied by an explosion
of assert* methods in the TestCase API, which means that anyone reading 2.7
unittests is likely to encounter an unfamiliar method that has to be
looked up.

I think the problem that I described is just a slightly reworded
problem that Raymond reported at the beginning of this thread.   In
other words, I am not alone in seeing this as a problem.

PS: For a "made from scratch" API I would prefer TestCase only be
available from

From foom at  Wed Nov  3 18:04:53 2010
From: foom at (James Y Knight)
Date: Wed, 3 Nov 2010 13:04:53 -0400
Subject: [Python-Dev] On breaking modules into packages Was:
	[issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
	<> <>
Message-ID: <>

On Nov 3, 2010, at 11:25 AM, Eric Smith wrote:

> On 11/3/10 10:53 AM, Eric Smith wrote:
>> The problem is that there is no unittest.loader in 2.4, and
>> unittest.loader.TestLoader is the name that the 2.7 pickle creates. We
>> see this problem every time we try and move anything in the stdlib.
> And BTW: for me, this is the strongest reason not to break up modules into packages or otherwise reorganize the stdlib.

This is the strongest reason why I recommend to everyone I know that they not use pickle for storage they'd like to keep working after upgrades [not just of stdlib, but other 3rd party software or their own software]. :)


From techtonik at  Wed Nov  3 19:21:27 2010
From: techtonik at (anatoly techtonik)
Date: Wed, 3 Nov 2010 20:21:27 +0200
Subject: [Python-Dev] Code coverage doesn't show .py stats
Message-ID: <>


Python code coverage doesn't include any .py files. What happened?

Did it work before?
anatoly t.

From glyph at  Wed Nov  3 20:08:33 2010
From: glyph at (Glyph Lefkowitz)
Date: Wed, 3 Nov 2010 15:08:33 -0400
Subject: [Python-Dev] On breaking modules into packages Was:
	[issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
	<> <>
Message-ID: <>

On Nov 3, 2010, at 1:04 PM, James Y Knight wrote:

> This is the strongest reason why I recommend to everyone I know that they not use pickle for storage they'd like to keep working after upgrades [not just of stdlib, but other 3rd party software or their own software]. :)


Twisted actually tried to preserve pickle compatibility in the bad old days, but it was impossible.  Pickles should never really be saved to disk unless they contain nothing but lists, ints, strings, and dicts.
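
One way to enforce that rule is to refuse to pickle anything else. This is a minimal sketch under my own assumptions: the helpers `is_plain` and `dump_plain` are hypothetical, the check is deliberately strict (exact types only, so subclasses and bools are rejected), and it does not guard against self-referencing containers.

```python
import pickle


def is_plain(obj):
    """True if obj is built only from exact list, int, str, and dict objects."""
    kind = type(obj)
    if kind in (int, str):
        return True
    if kind is list:
        return all(is_plain(item) for item in obj)
    if kind is dict:
        return all(is_plain(k) and is_plain(v) for k, v in obj.items())
    return False


def dump_plain(obj):
    """Pickle obj, but only if it contains no class instances."""
    if not is_plain(obj):
        raise TypeError("refusing to pickle non-plain data")
    return pickle.dumps(obj)


payload = dump_plain({"name": "example", "values": [1, 2, 3]})
print(pickle.loads(payload))
```

Data that passes this check carries no module or class names, so no later reorganization of code can break it.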

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

From fuzzyman at  Wed Nov  3 20:26:53 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 03 Nov 2010 19:26:53 +0000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 03/11/2010 14:53, Eric Smith wrote:
> On 11/3/10 10:16 AM, Michael Foord wrote:
>> On 03/11/2010 14:05, Nick Coghlan wrote:
>>> On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger
>>> <raymond.hettinger at> wrote:
>>>> Sounds like a decision to split a module into a package is a big
>>>> commitment. Each of the individual file names becomes a permanent
>>>> part of the API. Even future additional splits are precluded because
>>>> it might break someone's dotted import (i.e. not a single function can
>>>> be moved between those files -- once in unittest.utils, always in
>>>> unittest.utils).
>>> Can Python 2.7 pickles containing unittest classes be unpickled using
>>> 2.6 or earlier? Even if nobody uses the new names for imports, I
>>> believe they implicitly end up included in any pickles involving
>>> affected classes (I seem to recall we've been bitten by that before
>>> when moving things around).
>> Yes, since unittest.TestCase is still available (as are all the names).
>> I believe so anyway...
> Actually I think the answer is "no" (assuming you could pickle a 
> TestCase). Here's an example with TestLoader:

It is actually fixable by temporarily switching the __module__ attribute 
of the classes inside a __reduce__ or __reduce_ex__ method. I couldn't 
see a cleaner way of doing it using the pickling protocol methods. I 
asked on #python-dev but the *only* person who claimed to understand the 
pickle protocol methods was Barry, and he is clearly insane.
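
To illustrate the underlying mechanism, here is a sketch of a simpler variant, not the temporary switch inside `__reduce__` described above: it sets `__module__` once, permanently, to the public location. Everything in it is hypothetical; the two modules are built in memory purely so that the example is self-contained.

```python
import pickle
import sys
import types

# Hypothetical layout: demo_pkg is the public name, demo_pkg_impl is where
# the class "really" lives. Both are registered so pickle can import them.
public = types.ModuleType("demo_pkg")
private = types.ModuleType("demo_pkg_impl")
sys.modules["demo_pkg"] = public
sys.modules["demo_pkg_impl"] = private


class Loader:
    """Stand-in for a class defined in a private submodule."""


# Pretend the class was defined in the private module and re-exported.
Loader.__module__ = "demo_pkg_impl"
private.Loader = Loader
public.Loader = Loader

before = pickle.dumps(Loader(), protocol=0)

# Advertise the public location instead; pickle reads __module__ at dump time.
Loader.__module__ = "demo_pkg"
after = pickle.dumps(Loader(), protocol=0)

print(b"demo_pkg_impl" in before)
print(b"demo_pkg_impl" in after)
```

The first pickle names the private module and the second does not, so only the second would survive the private module being renamed or merged back.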

Antoine is firmly of the opinion that making TestCase instances 
unpickleable is a feature...

Although in practice this is less likely to be an issue for TestCase 
directly as it is extremely rare to use them without subclassing. More 
likely to be an issue for the test result or runner objects.

All the best,

Michael Foord

> $ python27
> Python 2.7.0+ (release27-maint:85878, Oct 28 2010, 06:40:25)
> [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import unittest
> >>> x = unittest.TestLoader()
> >>> import pickle
> >>> pickle.dumps(x)
> 'ccopy_reg\n_reconstructor\np0\n(cunittest.loader\nTestLoader\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.' 
> >>>
> $ python24
> Python 2.4.4 (#1, Oct 23 2006, 13:58:00)
> [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import pickle
> >>> 
> pickle.loads('ccopy_reg\n_reconstructor\np0\n(cunittest.loader\nTestLoader\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.')
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> File "/usr/lib/python2.4/", line 1394, in loads
> return Unpickler(file).load()
> File "/usr/lib/python2.4/", line 872, in load
> dispatch[key](self)
> File "/usr/lib/python2.4/", line 1104, in load_global
> klass = self.find_class(module, name)
> File "/usr/lib/python2.4/", line 1138, in find_class
> __import__(module)
> ImportError: No module named loader
> The problem is that there is no unittest.loader in 2.4, and 
> unittest.loader.TestLoader is the name that the 2.7 pickle creates. We 
> see this problem every time we try and move anything in the stdlib.



From solipsis at  Wed Nov  3 20:45:04 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 3 Nov 2010 20:45:04 +0100
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
References: <>
Message-ID: <>

On Wed, 03 Nov 2010 19:26:53 +0000
Michael Foord <fuzzyman at> wrote:
> Antoine is firmly of the opinion that making TestCase instances 
> unpickleable is a feature...

Apparently you didn't really understand me. I'm of the opinion that
making TestCase instances pickleable is useless if that pickling
doesn't have well-defined semantics. And I wonder what the semantics of
pickling a TestCase could be, and what the use cases are.



From jnoller at  Wed Nov  3 20:48:27 2010
From: jnoller at (Jesse Noller)
Date: Wed, 3 Nov 2010 15:48:27 -0400
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Nov 3, 2010 at 3:45 PM, Antoine Pitrou <solipsis at> wrote:
> On Wed, 03 Nov 2010 19:26:53 +0000
> Michael Foord <fuzzyman at> wrote:
>> Antoine is firmly of the opinion that making TestCase instances
>> unpickleable is a feature...
> Apparently you didn't really understand me. I'm of the opinion that
> making TestCase instances pickleable is useless if that pickling
> doesn't have well-defined semantics. And I wonder what the semantics of
> pickling a TestCase could be, and what the use cases are.
> Regards
> Antoine.

Splitting groups of tests to run in parallel via multiple processes is
a pretty good use case.

From solipsis at  Wed Nov  3 20:56:51 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 03 Nov 2010 20:56:51 +0100
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <1288814211.5969.19.camel@localhost.localdomain>

On Wednesday 03 November 2010 at 15:48 -0400, Jesse Noller wrote:
> On Wed, Nov 3, 2010 at 3:45 PM, Antoine Pitrou <solipsis at> wrote:
> > On Wed, 03 Nov 2010 19:26:53 +0000
> > Michael Foord <fuzzyman at> wrote:
> >>
> >> Antoine is firmly of the opinion that making TestCase instances
> >> unpickleable is a feature...
> >
> > Apparently you didn't really understand me. I'm of the opinion that
> > making TestCase instances pickleable is useless if that pickling
> > doesn't have well-defined semantics. And I wonder what the semantics of
> > pickling a TestCase could be, and what the use cases are.
> >
> > Regards
> >
> > Antoine.
> >
> Splitting groups of tests to run in parallel via multiple processes is
> a pretty good use case.

Indeed, but it implies a lot of things about TestCase instances, which
could have additional non-pickleable attributes (e.g. file objects).
You'd better pickle the TestCase class instead, or simply the module
name as we do with regrtest -jN.



From fuzzyman at  Wed Nov  3 21:15:51 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 03 Nov 2010 20:15:51 +0000
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 03/11/2010 19:48, Jesse Noller wrote:
> On Wed, Nov 3, 2010 at 3:45 PM, Antoine Pitrou<solipsis at>  wrote:
>> On Wed, 03 Nov 2010 19:26:53 +0000
>> Michael Foord<fuzzyman at>  wrote:
>>> Antoine is firmly of the opinion that making TestCase instances
>>> unpickleable is a feature...
>> Apparently you didn't really understand me. I'm of the opinion that
>> making TestCase instances pickleable is useless if that pickling
>> doesn't have well-defined semantics. And I wonder what the semantics of
>> pickling a TestCase could be, and what the use cases are.
>> Regards
>> Antoine.
> Splitting groups of tests to run in parallel via multiple processes is
> a pretty good use case.

That's something I've been thinking about a lot (and talking to Holger 
about) for the unittest plugins. I definitely won't be doing it with
pickles but, as Antoine says, by sending test names to the subprocesses. You
really want tests run in a child process to behave differently and it 
makes sense to set them up inside the child process.

All the best,

Michael Foord



READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From glyph at  Wed Nov  3 21:59:35 2010
From: glyph at (Glyph Lefkowitz)
Date: Wed, 3 Nov 2010 16:59:35 -0400
Subject: [Python-Dev] On breaking modules into packages Was:
	[issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 3, 2010, at 11:26 AM, Alexander Belopolsky wrote:

> This may not be a problem for smart tools, but for me and a simple
> editor what used to be:

Maybe this is the real problem?  It's 2010, we should all be far enough
beyond EDLIN that our editors can jump to the definition of a Python
class.  Even Vim can be convinced to do this (<>).  Could Python itself
make this easier?  Maybe ship with a command that says "hey, somewhere
on sys.path, there is a class with <this name>.  Please run '$EDITOR
file +line' (or the current OS's equivalent) so I can look at the source
code".


From alexander.belopolsky at  Wed Nov  3 22:18:41 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Wed, 3 Nov 2010 17:18:41 -0400
Subject: [Python-Dev] On breaking modules into packages Was:
 [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Nov 3, 2010 at 4:59 PM, Glyph Lefkowitz <glyph at> wrote:
> Maybe ship with a command that says "hey, somewhere on sys.path,
> there is a class with <this name>.  Please run '$EDITOR file +line' (or the
> current OS's equivalent) so I can look at the source code".

Well, we already have inspect.findsource() for that.
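
A rough sketch of how inspect.findsource() could feed the "$EDITOR file
+line" idea quoted above. The helper name locate is hypothetical, and
json.JSONDecoder is used only as a convenient pure-Python class to look
up.

```python
import inspect
import json


def locate(obj):
    """Return (source_file, 1-based line number) for a class or function."""
    filename = inspect.getsourcefile(obj)
    _lines, index = inspect.findsource(obj)  # index is 0-based
    return filename, index + 1


filename, line = locate(json.JSONDecoder)
# Form of the command suggested in the thread: $EDITOR file +line
print("$EDITOR {} +{}".format(filename, line))
```

This only works for objects whose source is available as a .py file;
classes implemented in C have no source for inspect to find.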

From ncoghlan at  Wed Nov  3 23:12:01 2010
From: ncoghlan at (Nick Coghlan)
Date: Thu, 4 Nov 2010 08:12:01 +1000
Subject: [Python-Dev] [Python-checkins] r85902 - in
 python/branches/py3k/Lib: test/
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 1:01 AM, Benjamin Peterson <benjamin at> wrote:
> 2010/11/3 Nick Coghlan <ncoghlan at>:
>> On Wed, Nov 3, 2010 at 10:19 PM, Benjamin Peterson <benjamin at> wrote:
>>> Warnings is loaded every time anyway.
>> I would have agreed with you, but the contents of sys.modules in a
>> just-started interactive interpreter suggests that isn't true any more
>> (which surprised me).
> Is that perhaps because of _warnings?

I suspect it's a combination of that and the patch to allow
non-blocking module imports (which turns some things that would
previously have been deadlocks into runtime exceptions).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From eric at  Thu Nov  4 01:14:27 2010
From: eric at (Eric Smith)
Date: Wed, 03 Nov 2010 20:14:27 -0400
Subject: [Python-Dev] str.format_from_mapping
In-Reply-To: <>
References: <>
Message-ID: <>

On 10/31/10 4:39 PM, Eric Smith wrote:
> What are your thoughts on adding a str.format_from_mapping (or similar
> name, maybe the suggested "format_map") to 3.2? See
> . This method would be similar to
> "%(foo)s %(bar)s" % d, where d is a dict (or rather any mapping object),
> but of course would use str.format syntax: "{foo}
> {bar}".format_from_mapping(d).
> I like the idea. It's particularly handy when converting from %-formatting.
> Eric.

I've updated the issue with tests, minimal docs, and a name change to 
str.format_map. Having heard no objections and some support, I'll commit 
this shortly.
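
A small illustration of what the new method allows, written against the
str.format_map name given above. The DefaultingMap class is invented for
this example; the point is only that format_map consults the mapping
itself, while format(**d) copies it into a plain dict first.

```python
class DefaultingMap(dict):
    """Hypothetical mapping that supplies a placeholder for missing keys."""

    def __missing__(self, key):
        return "<" + key + ">"


d = DefaultingMap(foo="spam")

# format_map uses the mapping directly, so __missing__ is consulted.
result = "{foo} {bar}".format_map(d)

# format(**d) unpacks the items into a plain dict, so looking up the
# absent key raises KeyError instead.
try:
    "{foo} {bar}".format(**d)
    unpacked_ok = True
except KeyError:
    unpacked_ok = False
```

Here result is "spam <bar>" and unpacked_ok ends up False.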


From allan at  Thu Nov  4 05:44:09 2010
From: allan at (Allan McRae)
Date: Thu, 04 Nov 2010 14:44:09 +1000
Subject: [Python-Dev] Python-3 transition in Arch Linux
Message-ID: <>


While this is not strictly related to python development, I thought that 
developers of python might be interested in some of the lessons provided 
by this. So forgive me if this is really wrong for this list...

Recently Arch Linux did a big transition with respect to python.  Now we 
support two python packages: "python" and "python2".

The "python" package will always contain the latest 3.x release and 
currently has /usr/bin/python3.1 with symlinks to /usr/bin/python3 and 

The "python2" package contains the latest from the "legacy" python-2.x 
branch and has /usr/bin/python2.7 with a symlink to /usr/bin/python2.

I really do not want to debate the sanity of pointing /usr/bin/python at 
python-3.x here, but it suffices to say that I am of the opinion that if 
python-3.x is really the future of python, then /usr/bin/python must 
eventually point to a 3.x version.  Also, Arch Linux is very bleeding 
edge and we expect our users to be competent enough to deal with things
like this.  According to #python, we are all idiots....  And I have been
(figuratively) yelled at by a couple of Debian developers (which is
incidentally the only major distro I found without a /usr/bin/python2
symlink).

Anyway, this transition was rather simple from a distribution point of 
view apart from the sheer number of packages involved.  All our 
supported packages were rebuilt to work with that symlink layout and any 
"porting" software to use that layout was relatively simple.  Most 
packages either detected the python2 symlink during the rebuild and just 
worked while others required some sort of "export PYTHON=python2" or
"--with-python=python2" or "python2" or the like during the build.

Software packages tend to fall into three categories at roughly equal 
1) packages that detected or were pointed at python2 and everything worked
2) packages that detected or were pointed at python2 and partially worked
3) packages that needed adjustment to work with the python2 symlink.

The second case was particularly interesting.  These packages would
change some of their #! to point at the python2 symlink and leave the 
rest pointing at python.  Note that python-2.7 itself falls into this 
category as many files in /usr/lib/python2.7 still have "#!/usr/bin/env 
python" even when installed with "make altinstall".  I can not remember 
the exact details, but I recall that some of these files were installed 
with executable permissions which would be bad, but I need to look into 
this again now things have calmed down...

The packages that did not auto-detect and work with /usr/bin/python2 or 
/usr/bin/python2.7 mostly required a sed of their shebangs or a patch to 
any hardcoded /usr/bin/python paths so were easily fixed.

So that is something that python software developers could think about 
for the future.  When someone configures a module using a particular 
version of python, then ALL shebangs need to be changed to use that version.
And it is generally bad practice to hardcode /usr/bin/python into any 
application as you are never quite sure which version you are getting. 
Instead allow it to be configured, keeping the current value as default.
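
As a sketch of the "change ALL shebangs" advice, an install step could
rewrite the first line of each script to the configured interpreter. The
function below is hypothetical (it is not what distutils or any
particular build system does) and deliberately simple.

```python
def rewrite_shebang(source, interpreter):
    """Point a script's shebang at `interpreter` if it names python.

    Only the first line is examined. Scripts without a python shebang
    are returned unchanged. Any interpreter options on the original
    shebang line are dropped by this simple version.
    """
    first, sep, rest = source.partition("\n")
    if first.startswith("#!") and "python" in first:
        return "#!" + interpreter + sep + rest
    return source
```

For example, rewrite_shebang("#!/usr/bin/env python\nprint(1)\n",
"/usr/bin/python2") gives a script starting with "#!/usr/bin/python2".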


Allan McRae
Arch Linux Developer

From techtonik at  Thu Nov  4 07:28:11 2010
From: techtonik at (anatoly techtonik)
Date: Thu, 4 Nov 2010 08:28:11 +0200
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules
 into packages)
Message-ID: <>

On Wed, Nov 3, 2010 at 9:08 PM, Glyph Lefkowitz <glyph at> wrote:
> This is the strongest reason why I recommend to everyone I know that they
> not use pickle for storage they'd like to keep working after upgrades [not
> just of stdlib, but other 3rd party software or their own software]. :)
> +1.
> Twisted actually tried to preserve pickle compatibility in the bad old days,
> but it was impossible.  Pickles should never really be saved to disk unless
> they contain nothing but lists, ints, strings, and dicts.

But what is the alternative in the stdlib?
Don't you think that Python doesn't provide any?
anatoly t.

From victor.stinner at  Thu Nov  4 12:16:17 2010
From: victor.stinner at (Victor Stinner)
Date: Thu, 4 Nov 2010 12:16:17 +0100
Subject: [Python-Dev] [Python-checkins] r85902 - in
	python/branches/py3k/Lib: test/
In-Reply-To: <>
References: <>
Message-ID: <>

On Wednesday 03 November 2010 23:12:01 Nick Coghlan wrote:
> On Thu, Nov 4, 2010 at 1:01 AM, Benjamin Peterson <benjamin at> 
> > 2010/11/3 Nick Coghlan <ncoghlan at>:
> >> On Wed, Nov 3, 2010 at 10:19 PM, Benjamin Peterson <benjamin at> 
> >>> Warnings is loaded every time anyway.
> >> 
> >> I would have agreed with you, but the contents of sys.modules in a
> >> just-started interactive interpreter suggests that isn't true any more
> >> (which surprised me).
> > 
> > Is that perhaps because of _warnings?
> I suspect it's a combination of that and the patch to allow
> non-blocking module imports (which turns some things that would
> previously have been deadlocks into runtime exceptions).

So do you still think that I should patch the os module to use a global import 
or not?


From exarkun at  Thu Nov  4 13:12:19 2010
From: exarkun at (exarkun at
Date: Thu, 04 Nov 2010 12:12:19 -0000
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On
	breaking	modules into packages)
In-Reply-To: <>
References: <>
Message-ID: <20101104121219.2040.225904071.divmod.xquotient.555@localhost.localdomain>

On 06:28 am, techtonik at wrote:
>On Wed, Nov 3, 2010 at 9:08 PM, Glyph Lefkowitz 
><glyph at> wrote:
>>This is the strongest reason why I recommend to everyone I know that they
>>not use pickle for storage they'd like to keep working after upgrades [not
>>just of stdlib, but other 3rd party software or their own software]. :)
>>Twisted actually tried to preserve pickle compatibility in the bad old days,
>>but it was impossible.  Pickles should never really be saved to disk unless
>>they contain nothing but lists, ints, strings, and dicts.
>But what is alternative in stdlib?
>Don't you think that Python doesn't provide any?

Persistence is a very hard problem.  Lots and lots of trade-offs need to 
be made, and you generally want to tailor those trade-offs to the 
particular application at hand.  This probably means that the stdlib 
isn't a suitable place to try to solve the problem.

Look outside the stdlib and you'll find an extremely vibrant and diverse 
collection of software which is aimed at solving this problem, though.


From ncoghlan at  Thu Nov  4 14:33:38 2010
From: ncoghlan at (Nick Coghlan)
Date: Thu, 4 Nov 2010 23:33:38 +1000
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 2:44 PM, Allan McRae <allan at> wrote:
> The second case was particularly interesting.  These software would change
> some of their #! to point at the python2 symlink and leave the rest pointing
> at python.  Note that python-2.7 itself falls into this category as many
> files in /usr/lib/python2.7 still have "#!/usr/bin/env python" even when
> installed with "make altinstall".  I can not remember the exact details, but
> I recall that some of these files were installed with executable permissions
> which would be bad, but I need to look into this again now things have
> calmed down...
> The packages that did not auto-detect and work with /usr/bin/python2 or
> /usr/bin/python2.7 mostly required a sed of their shebangs or a patch to any
> hardcoded /usr/bin/python paths so were easily fixed.

A very interesting exercise, indeed - especially the observation
regarding software (including python itself) that supports
installation under alternate names, but doesn't subsequently ensure
use of that name in its shebang lines.

I just did a quick grep of Lib in my py3k directory, and it looks like
is incorrectly set to use "/usr/local/bin/python", while the
other files with shebang lines are set to "/usr/bin/env python3" as
expected.

Tools also had a few discrepancies:
  scripts/ /usr/bin/env python (necessary, I think - I believe
2to3 is a 2.x only program)
  scripts/ /usr/bin/env python32.3 (Huh? Automated
correction gone wrong, perhaps?)
  scripts/ /usr/bin/env python (probably incorrect)
  pybench: /usr/bin/env python (not sure - has pybench been forward
ported on the 3.x branch?)
  world: /usr/bin/env python (I have no idea what this script is even for)

(Note that these examples are a matter of simply respecting the
*default* install location for python3, without even getting into
questions of altinstall or configured installation locations)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Thu Nov  4 14:38:33 2010
From: ncoghlan at (Nick Coghlan)
Date: Thu, 4 Nov 2010 23:38:33 +1000
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking
 modules into packages)
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 4:28 PM, anatoly techtonik <techtonik at> wrote:
> On Wed, Nov 3, 2010 at 9:08 PM, Glyph Lefkowitz <glyph at> wrote:
>> This is the strongest reason why I recommend to everyone I know that they
>> not use pickle for storage they'd like to keep working after upgrades [not
>> just of stdlib, but other 3rd party software or their own software]. :)
>> +1.
>> Twisted actually tried to preserve pickle compatibility in the bad old days,
>> but it was impossible.  Pickles should never really be saved to disk unless
>> they contain nothing but lists, ints, strings, and dicts.
> But what is alternative in stdlib?
> Don't you think that Python doesn't provide any?

Python 3.2a3+ (py3k:85817, Oct 24 2010, 19:25:28)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> dir(json)
['JSONDecoder', 'JSONEncoder', '__all__', '__author__',
'__builtins__', '__cached__', '__doc__', '__file__', '__name__',
'__package__', '__path__', '__version__', '_default_decoder',
'_default_encoder', 'decoder', 'dump', 'dumps', 'encoder', 'load',
'loads', 'scanner']

pickle gets overspecific in many ways, and hence (despite our best
efforts, and those of third parties) may break when changing Python
versions. Serialising to something more language natural (be it JSON,
YAML, XML or one of the multitude of other state encoding formats out
there) is far more likely to be future proof.

As a tool for communicating between different instances of the *same*
version of Python though, pickle is fine.
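
For the simple case discussed in this thread (lists, ints, strings and
dicts), a JSON round trip looks roughly like this. The state dictionary
is just sample data.

```python
import json

state = {"name": "example", "sizes": [1, 2, 3], "nested": {"ok": "yes"}}

# The serialised form is plain text that other tools and languages can read.
text = json.dumps(state, sort_keys=True)
restored = json.loads(text)
```

Note that JSON has no tuple type and only allows string keys in objects,
so tuples come back as lists and non-string dict keys come back as
strings.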


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From walter at  Thu Nov  4 14:19:16 2010
From: walter at (=?UTF-8?B?V2FsdGVyIETDtnJ3YWxk?=)
Date: Thu, 04 Nov 2010 14:19:16 +0100
Subject: [Python-Dev] Code coverage doesn't show .py stats
In-Reply-To: <>
References: <>
Message-ID: <>

On 03.11.10 19:21, anatoly techtonik wrote:

> Hi,
> Python code coverage doesn't include any .py files. What happened?
> Did it work before?

It did, however currently the logfile

shows the following exception:

Traceback (most recent call last):
  File "Lib/test/", line 1500, in <module>
  File "Lib/test/", line 696, in main
    r.write_results(show_missing=True, summary=True, coverdir=coverdir)
  File "/home/coverage/python/Lib/", line 319, in write_results
    lnotab, count)
  File "/home/coverage/python/Lib/", line 369, in write_results_file
UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in
position 30: ordinal not in range(128)

BTW, this is the py3k branch (i.e.

It seems the trace module has a problem with unicode.
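
The failure mode in the traceback can be reproduced in isolation. This
only illustrates the error itself; how the trace module should actually
be fixed is not something this sketch claims to know, and the sample
string is made up.

```python
text = "n\xe4me"  # contains U+00E4, the character named in the traceback

# Encoding with the ASCII codec fails for any character above 127.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# An encoding that covers the character, such as UTF-8, succeeds.
encoded = text.encode("utf-8")
```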


From solipsis at  Thu Nov  4 14:46:05 2010
From: solipsis at (Antoine Pitrou)
Date: Thu, 4 Nov 2010 14:46:05 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
References: <>
Message-ID: <>

On Thu, 4 Nov 2010 23:33:38 +1000
Nick Coghlan <ncoghlan at> wrote:
> Tools also had a few discrepancies:
>   scripts/ /usr/bin/env python (necessary, I think - I believe
> 2to3 is a 2.x only program)
>   scripts/ /usr/bin/env python32.3 (Huh? Automated
> correction gone wrong, perhaps?)

Or time machine gone wild?
I think it is the version which automatically renames your classes
and methods based on good taste, but still has the old
assertLessEqual method at the bottom of the now 5-level deep unittest
package hierarchy (while Michael enjoys his 3251st PSF community award
after it was decided to make it a daily ceremony). pyclbr has been
patched to handle it fine, though.



From benjamin at  Thu Nov  4 14:59:45 2010
From: benjamin at (Benjamin Peterson)
Date: Thu, 4 Nov 2010 08:59:45 -0500
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

2010/11/4 Nick Coghlan <ncoghlan at>:
> On Thu, Nov 4, 2010 at 2:44 PM, Allan McRae <allan at> wrote:
>> The second case was particularly interesting.  These software would change
>> some of their #! to point at the python2 symlink and leave the rest pointing
>> at python.  Note that python-2.7 itself falls into this category as many
>> files in /usr/lib/python2.7 still have "#!/usr/bin/env python" even when
>> installed with "make altinstall".  I can not remember the exact details, but
>> I recall that some of these files were installed with executable permissions
>> which would be bad, but I need to look into this again now things have
>> calmed down...
>> The packages that did not auto-detect and work with /usr/bin/python2 or
>> /usr/bin/python2.7 mostly required a sed of their shebangs or a patch to any
>> hardcoded /usr/bin/python paths so were easily fixed.
> A very interesting exercise, indeed - especially the observation
> regarding software (including python itself) that supports
> installation under alternate names, but doesn't subsequently ensure
> use of that name in its shebang lines.
> I just did a quick grep of Lib in my py3k directory, and it looks like
> is incorrectly set to use "/usr/local/bin/python", while the
> other files with shebang lines are set to "/usr/bin/env python3" as
> expected.
> Tools also had a few discrepancies:
>   scripts/ /usr/bin/env python (necessary, I think - I believe
> 2to3 is a 2.x only program)

No, I believe distutils is supposed to patch that up, though.


From ocean-city at  Thu Nov  4 15:09:39 2010
From: ocean-city at (Hirokazu Yamamoto)
Date: Thu, 04 Nov 2010 23:09:39 +0900
Subject: [Python-Dev] [Python-checkins] r85987
	-	python/branches/py3k/Lib/test/
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 2010/11/02 1:30, Nick Coghlan wrote:
> On Tue, Nov 2, 2010 at 2:10 AM, Hirokazu Yamamoto
> <ocean-city at>  wrote:
>> Does this really cause resource warning? I think os.popen instance
>> won't be into traceback because it's not declared as variable. So I
>> suppose it will be deleted by reference count == 0 even when exception
>> occurs.
> Any time __del__ has to close the resource triggers ResourceWarning,
> regardless of whether that is due to the cyclic garbage collector or
> the refcount naturally falling to zero. In the past dealing with this
> was clumsy, so it made sense to rely on CPython's refcounting to do
> the work. However, we have better tools for deterministic resource
> management now (in the form of context managers), so these updates
> help make the standard library and its test suite more suitable for
> use with non-refcounting Python implementations (such as PyPy, Jython
> and IronPython).
> Cheers,
> Nick.

Thank you for the reply. This is probably a difficult problem. I often
use the with statement, but sometimes I also feel this warning is
a bit noisy. Is there a way to turn this off?

C:\Documents and Settings\Ocean>py3k
Python 3.2a3+ (py3k, Nov  3 2010, 00:27:28) [MSC v.1200 32 bit (Intel)] 
on win32

Type "help", "copyright", "credits" or "license" for more information.
 >>> open("").read()
__main__:1: ResourceWarning: unclosed file <_io.TextIOWrapper 
name='' encodi
'\nimport timeit\n\nt = 
timeit.Timer("""\nos.stat("e:/voltest/lnk")\n""", """\ni
mport os\n""")\n\nprint(t.timeit(1000))\n\n'
[49593 refs]

From solipsis at  Thu Nov  4 15:23:58 2010
From: solipsis at (Antoine Pitrou)
Date: Thu, 4 Nov 2010 15:23:58 +0100
Subject: [Python-Dev] [Python-checkins] r85987 -
References: <>
Message-ID: <>

On Thu, 04 Nov 2010 23:09:39 +0900
Hirokazu Yamamoto <ocean-city at> wrote:
> On 2010/11/02 1:30, Nick Coghlan wrote:
> > On Tue, Nov 2, 2010 at 2:10 AM, Hirokazu Yamamoto
> > <ocean-city at>  wrote:
> >> Does this really cause resource warning? I think os.popen instance
> >> won't be into traceback because it's not declared as variable. So I
> >> suppose it will be deleted by reference count == 0 even when exception
> >> occurs.
> >
> > Any time __del__ has to close the resource triggers ResourceWarning,
> > regardless of whether that is due to the cyclic garbage collector or
> > the refcount naturally falling to zero. In the past dealing with this
> > was clumsy, so it made sense to rely on CPython's refcounting to do
> > the work. However, we have better tools for deterministic resource
> > management now (in the form of context managers), so these updates
> > help make the standard library and its test suite more suitable for
> > use with non-refcounting Python implementations (such as PyPy, Jython
> > and IronPython).
> >
> > Cheers,
> > Nick.
> >
> Thank you for reply. Probably this is difficult problem. I often
> use with statement, but it's also true sometimes I feel this warning is
> a bit noisy. Is there a way to turn this off?

You can use all the usual means of controlling emission of warnings, so
for example "python -Wi" would work to silence them all.
Also, ResourceWarning is silenced by default in "release" builds.
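
For completeness, a sketch of both options: avoiding the warning with a
context manager, and silencing just this category from code rather than
with "python -Wi". The helper name read_text is made up. On the command
line, "-W ignore::ResourceWarning" would be the narrower equivalent of
the filter call below.

```python
import warnings

# Ignore only ResourceWarning, leaving other warning categories alone.
warnings.simplefilter("ignore", ResourceWarning)


def read_text(path):
    """Read a whole file, closing it deterministically."""
    with open(path) as f:
        return f.read()
```

With the with statement the file is closed before the function returns,
so there is nothing left for __del__ to clean up and no warning to
silence in the first place.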



From ocean-city at  Thu Nov  4 15:41:04 2010
From: ocean-city at (Hirokazu Yamamoto)
Date: Thu, 04 Nov 2010 23:41:04 +0900
Subject: [Python-Dev] [Python-checkins] r85987 -
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

On 2010/11/04 23:23, Antoine Pitrou wrote:
> You can use all the usual means of controlling emission of warnings, so
> for example "python -Wi" would work to silence them all.
> Also, ResourceWarning is silenced by default in "release" builds.
> Regards
> Antoine.

Thank you, this works. (I couldn't find the way from "python --help")

From guido at  Thu Nov  4 15:51:17 2010
From: guido at (Guido van Rossum)
Date: Thu, 4 Nov 2010 07:51:17 -0700
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking
 modules into packages)
In-Reply-To: <>
References: <>
Message-ID: <>

> On Wed, Nov 3, 2010 at 9:08 PM, Glyph Lefkowitz <glyph at> wrote:
>> This is the strongest reason why I recommend to everyone I know that they
>> not use pickle for storage they'd like to keep working after upgrades [not
>> just of stdlib, but other 3rd party software or their own software]. :)
>> +1.
>> Twisted actually tried to preserve pickle compatibility in the bad old days,
>> but it was impossible.  Pickles should never really be saved to disk unless
>> they contain nothing but lists, ints, strings, and dicts.

But *that* set of types can safely be marshalled using the marshal module...

--Guido van Rossum (

From alexander.belopolsky at  Thu Nov  4 15:57:39 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Thu, 4 Nov 2010 10:57:39 -0400
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking
 modules into packages)
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 10:51 AM, Guido van Rossum <guido at> wrote:
>>> Twisted actually tried to preserve pickle compatibility in the bad old days,
>>> but it was impossible.  Pickles should never really be saved to disk unless
>>> they contain nothing but lists, ints, strings, and dicts.
> But *that* set of types can safely be marshalled using the marshal module...

Not if the instances contain reference cycles.
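
To make "reference cycles" concrete: the sketch below builds a list that
contains itself and shows how two other stdlib serialisers treat it.
marshal itself is deliberately left out, since its handling of such
structures is the point under discussion and is not something this
example asserts.

```python
import json
import pickle

cycle = []
cycle.append(cycle)  # a list that contains itself

# pickle tracks object identity, so the cycle survives a round trip.
copy = pickle.loads(pickle.dumps(cycle))
pickle_keeps_cycle = copy[0] is copy

# json checks for circular references by default and refuses them.
try:
    json.dumps(cycle)
    json_accepts_cycle = True
except ValueError:
    json_accepts_cycle = False
```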

From ncoghlan at  Thu Nov  4 15:59:22 2010
From: ncoghlan at (Nick Coghlan)
Date: Fri, 5 Nov 2010 00:59:22 +1000
Subject: [Python-Dev] [Python-checkins] r85902 - in
 python/branches/py3k/Lib: test/
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 9:16 PM, Victor Stinner
<victor.stinner at> wrote:
> So do you still think that I should patch the os module to use a global import
> or not?

I'm actually more inclined to suggest we avoid triggering the warning
under -bb in the first place by iterating over the environment in that
case instead of using the mapping interface. (I was going to suggest a
smarter version that used a SafeKey class instead, but it turns out
os.environ only works with real string objects).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From barry at  Thu Nov  4 16:29:00 2010
From: barry at (Barry Warsaw)
Date: Thu, 4 Nov 2010 11:29:00 -0400
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <20101104112900.6ab53bd5@mission>

On Nov 04, 2010, at 02:44 PM, Allan McRae wrote:

>While this is not strictly related to python development, I thought that
>developers of python might be interested in some of the lessons provided by
>this. So forgive me if this is really wrong for this list...
>Recently Arch Linux did a big transition with respect to python.  Now we
>support two python packages: "python" and "python2".

Very cool to hear about this first hand, thanks for posting it Allan.  I was
recently at the Ubuntu Developers Summit and Arch Linux's transition was a
source of several hallway discussions.  It's good to read about your work and
successes in blazing that trail.

>I really do not want to debate the sanity of pointing /usr/bin/python at
>python-3.x here, but it suffices to say that I am of the opinion that if
>python-3.x is really the future of python, then /usr/bin/python must
>eventually point to a 3.x version.  Also, Arch Linux is very bleeding edge
>and we expect our users to be competent enough to deal with thing like this.
>According to #python, we are all idiots....  And I have been (figuratively)
>yelled at by a couple of Debian developers (which is incidentally the only
>major distro I found without a /usr/bin/python2 symlink).

Ah too bad, no one needs to yell :).  It's an interesting discussion topic
though and it's something I think other distros should start considering.  In
Ubuntu 11.04, we'll have Python 3.1 and 3.2, 2.6 and 2.7, with the default
(i.e. /usr/bin/python) either at 2.6 (probable) or 2.7 (possible).  `python3`
currently points to 3.1.2, but we've talked about getting that to 3.2 for this

Ubuntu's next Long Term Support release is scheduled for April 2012.  It's an
ambitious but worthy goal to see if we can transition to Python 3 as the
default Python by then.


From barry at  Thu Nov  4 16:31:36 2010
From: barry at (Barry Warsaw)
Date: Thu, 4 Nov 2010 11:31:36 -0400
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <20101104113136.0c1147b6@mission>

On Nov 04, 2010, at 11:33 PM, Nick Coghlan wrote:

>  world: /usr/bin/env python (I have no idea what this script is even for)

It's basically a front-end to ISO 3166 country codes.  IOW, it prints the
expansion of top-level domain names and can do some reverse lookups too.

% Tools/world/world us
us originated from United States

I once started to rip it out into a separate package but haven't gotten too
far with that.


From techtonik at  Thu Nov  4 17:15:57 2010
From: techtonik at (anatoly techtonik)
Date: Thu, 4 Nov 2010 18:15:57 +0200
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking
 modules into packages)
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 3:38 PM, Nick Coghlan <ncoghlan at> wrote:
> On Thu, Nov 4, 2010 at 4:28 PM, anatoly techtonik <techtonik at> wrote:
>> On Wed, Nov 3, 2010 at 9:08 PM, Glyph Lefkowitz <glyph at> wrote:
>>> This is the strongest reason why I recommend to everyone I know that they
>>> not use pickle for storage they'd like to keep working after upgrades [not
>>> just of stdlib, but other 3rd party software or their own software]. :)
>>> +1.
>>> Twisted actually tried to preserve pickle compatibility in the bad old days,
>>> but it was impossible.  Pickles should never really be saved to disk unless
>>> they contain nothing but lists, ints, strings, and dicts.
>> But what is alternative in stdlib?
>> Don't you think that Python doesn't provide any?
> Python 3.2a3+ (py3k:85817, Oct 24 2010, 19:25:28)
> [GCC 4.4.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> import json
>>>> dir(json)
> ['JSONDecoder', 'JSONEncoder', '__all__', '__author__',
> '__builtins__', '__cached__', '__doc__', '__file__', '__name__',
> '__package__', '__path__', '__version__', '_default_decoder',
> '_default_encoder', 'decoder', 'dump', 'dumps', 'encoder', 'load',
> 'loads', 'scanner']
> pickle gets overspecific in many ways, and hence (despite our best
> efforts, and those of third parties) may break when changing Python
> versions. Serialising to something more language natural (be it JSON,
> YAML, XML or one of the multitude of other state encoding formats out
> there) is far more likely to be future proof.
> As a tool for communicating between different instances of the *same*
> version of Python though, pickle is fine.

pickle is insecure, marshal too. What about JSON? IIUC you need a
definition of a class to be able to unserialize it in all cases. I
wonder how this definition is validated, i.e. what to watch for when
modifying classes that can be serialized.
anatoly t.

From guido at  Thu Nov  4 17:49:39 2010
From: guido at (Guido van Rossum)
Date: Thu, 4 Nov 2010 09:49:39 -0700
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking
 modules into packages)
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 9:15 AM, anatoly techtonik <techtonik at> wrote:
> pickle is insecure, marshal too.

What's the attack you're thinking of on marshal? It never executes any
code while unmarshalling (although it can unmarshal code objects --
but the receiving program has to do something additionally to execute
them).
> What about JSON? IIUC you need a
> definition of a class to be able to unserialize it in all cases. I
> wonder how this definition is validated, i.e. what to watch for when
> modifying classes that can be serialized.

Security is all in the code used to deserialize. I haven't analyzed
the json library that comes in the stdlib these days, but couldn't it
in theory be as safe as XML? (Not that there haven't been any attacks
on XML -- but they depended on bugs in the unmarshalling code, the
format itself is not insecure.)
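
To make the point about the deserializing code concrete, here is a minimal sketch of round-tripping a class instance through the stdlib json module. The Point class, the "__point__" tag and both helper functions are hypothetical names chosen for illustration. json itself only yields dicts, lists, strings, numbers, booleans and None, so any object reconstruction is code the receiver writes and can validate.

```python
import json


class Point:
    """Hypothetical example class; not part of any library."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def encode_point(obj):
    # Called by json.dumps for objects it cannot serialize natively.
    if isinstance(obj, Point):
        return {"__point__": True, "x": obj.x, "y": obj.y}
    raise TypeError("cannot serialize %r" % (obj,))


def decode_point(d):
    # Called by json.loads for every decoded JSON object (dict).
    # Only an explicitly tagged dict with numeric fields becomes a Point.
    if d.get("__point__") is True:
        x, y = d.get("x"), d.get("y")
        if not all(isinstance(v, (int, float)) for v in (x, y)):
            raise ValueError("invalid Point fields")
        return Point(x, y)
    return d


text = json.dumps({"origin": Point(0, 0), "tip": Point(3, 4)}, default=encode_point)
data = json.loads(text, object_hook=decode_point)
```

Changing the class later means revisiting the two helpers, which is where validation of incoming data lives in this sketch.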

--Guido van Rossum

From thomas at  Thu Nov  4 18:27:55 2010
From: thomas at (Thomas Wouters)
Date: Thu, 4 Nov 2010 18:27:55 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 05:44, Allan McRae <allan at> wrote:

> According to #python, we are all idiots....

To clarify (but I don't speak for the rest of #python, just myself), I think
the move was premature, but I don't use Arch and I don't know what typical
Arch users expect. The reason I think it's premature is that 'python2' just
doesn't work everywhere, and I would have gone for a transitionary period
where '/usr/bin/python' is something that screams loudly that it shouldn't
be used before it executes 'python2'. That would've allowed for more time to
fix things that use the wrong shebang line, or tools that use 'python'
instead of letting distutils set it for them. I hope that's something other
distributions will consider before changing the meaning of /usr/bin/python.
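
A rough sketch of what such a transitional '/usr/bin/python' wrapper could look like follows. The warning text, the function names and the choice of 'python2' as the target are all assumptions made for illustration; no distribution is known to ship this exact script.

```python
import os
import sys

# Hypothetical warning text for the transition period.
WARNING = (
    "warning: '/usr/bin/python' is ambiguous on this system; "
    "use 'python2' or 'python3' explicitly in your #! line\n"
)


def wrapper_command(argv, target="python2"):
    """Return the command the wrapper would exec: target plus the original arguments."""
    return [target] + list(argv[1:])


def main(argv=None):
    argv = sys.argv if argv is None else argv
    sys.stderr.write(WARNING)          # scream first...
    cmd = wrapper_command(argv)
    os.execvp(cmd[0], cmd)             # ...then hand over to python2 (does not return on success)
```

Installed as the wrapper, the script would call main() at the end; the call is left out here so the sketch can be imported without replacing the current process.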

As for #python, well, we got this storm of people utterly confused about how
their stuff doesn't work anymore, and putting the blame in the wrong place.
I don't think a distribution should ever cause that (even though many do in
lesser ways) -- but as I said, I don't use Arch so maybe I don't understand
the purpose of it. The complaints seem to have died down now (though
possibly because of the 'no arch' topic :)

Thomas Wouters <thomas at>

Hi! I'm a .signature virus! copy me into your .signature file to help me

From martin at  Thu Nov  4 21:12:41 2010
From: martin at ("Martin v. Löwis")
Date: Thu, 04 Nov 2010 21:12:41 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

> To clarify (but I don't speak for the rest of #python, just myself), I
> think the move was premature, but I don't use Arch and I don't know what
> typical Arch users expect. The reason I think it's premature is that
> 'python2' just doesn't work everywhere, and I would have gone for a
> transitionary period where '/usr/bin/python' is something that screams
> loudly that it shouldn't be used before it executes 'python2'.

I really do think the key point here is "don't know what typical
Arch users expect". I don't know either, but my personal feeling is that
Arch isn't that widely used, but ISTM that Arch users are expected to be
technically advanced, compared to the wider community of Linux users.

So if these users find a problem, they might know how to fix it, and they
might know how to make bug reports. In essence, you are asking that
there should be a smoother path to making /usr/bin/python Python 3 - and
I observe that Arch's switching actually is a very useful step on that
smoother path. If they figure out what changes to make, many of the
changes may have been done when other Linux distributions just start to
consider the change.

> As for #python, well, we got this storm of people utterly confused about
> how their stuff doesn't work anymore, and putting the blame in the wrong
> place. I don't think a distribution should ever cause that (even though
> many do in lesser ways) -- but as I said, I don't use Arch so maybe I
> don't understand the purpose of it. The complaints seem to have died
> down now (though possibly because of the 'no arch' topic :)

So apparently, there are quite a number of Arch users, and they do make
bug reports. Good :-)

If this gets attributed correctly (i.e. as a deliberate decision by
Arch, revealing bugs in many packages that have long existed), and if
Google picks the canonical resolution quickly, I don't think any harm
is done - and in the long run, it will smooth the migration for
everybody else.


From glyph at  Thu Nov  4 21:25:47 2010
From: glyph at (Glyph Lefkowitz)
Date: Thu, 4 Nov 2010 16:25:47 -0400
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking
	modules into packages)
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 4, 2010, at 12:49 PM, Guido van Rossum wrote:

> What's the attack you're thinking of on marshal? It never executes any
> code while unmarshalling (although it can unmarshal code objects --
> but the receiving program has to do something additionally to execute
> those).

These issues may have been fixed now, but a long time ago I recall seeing some nasty segfaults which looked exploitable when feeding marshal malformed data.  If they still exist, running a fuzzer on some pyc files should reveal them pretty quickly.

When I ran across them I didn't think much of them, and probably did not even report the bug, since marshal is mostly used to load code anyway, which is implicitly trusted.


From doko at  Thu Nov  4 22:08:07 2010
From: doko at (Matthias Klose)
Date: Thu, 04 Nov 2010 22:08:07 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 04.11.2010 21:12, "Martin v. Löwis" wrote:
>> To clarify (but I don't speak for the rest of #python, just myself), I
>> think the move was premature, but I don't use Arch and I don't know what
>> typical Arch users expect. The reason I think it's premature is that
>> 'python2' just doesn't work everywhere, and I would have gone for a
>> transitionary period where '/usr/bin/python' is something that screams
>> loudly that it shouldn't be used before it executes 'python2'.

IIRC, it was an explicit decision made at the 2009 language summit not to
introduce a python2 symlink, but to use python3 for python3.x instead.
Debian/Ubuntu don't ship a python2 symlink by intent.

Did the plans change, i.e. are there plans to provide a python symlink for
python 3.x altinstall in a future release, e.g. in 3.4 or 3.5?


From thomas at  Thu Nov  4 22:21:18 2010
From: thomas at (Thomas Wouters)
Date: Thu, 4 Nov 2010 22:21:18 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 21:12, "Martin v. Löwis" <martin at> wrote:

> > As for #python, well, we got this storm of people utterly confused about
> > how their stuff doesn't work anymore, and putting the blame in the wrong
> > place. I don't think a distribution should ever cause that (even though
> > many do in lesser ways) -- but as I said, I don't use Arch so maybe I
> > don't understand the purpose of it. The complaints seem to have died
> > down now (though possibly because of the 'no arch' topic :)
> So apparently, there is quite a number of Arch users, and they do make
> bug reports. Good :-)

I don't know that they do. I just know that people came to #python and
complained, which is unfortunately something completely different. (We did
ask every single one to take it up with the right forum, and I know at least
one person did file a bug, but that's about it.)

Thomas Wouters <thomas at>

Hi! I'm a .signature virus! copy me into your .signature file to help me

From guido at  Thu Nov  4 22:24:09 2010
From: guido at (Guido van Rossum)
Date: Thu, 4 Nov 2010 14:24:09 -0700
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking
 modules into packages)
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 1:25 PM, Glyph Lefkowitz <glyph at> wrote:
> On Nov 4, 2010, at 12:49 PM, Guido van Rossum wrote:
> What's the attack you're thinking of on marshal? It never executes any
> code while unmarshalling (although it can unmarshal code objects --
> but the receiving program has to do something additionally to execute
> those).
> These issues may have been fixed now, but a long time ago I recall seeing
> some nasty segfaults which looked exploitable when feeding marshal malformed
> data.  If they still exist, running a fuzzer on some pyc files should reveal
> them pretty quickly.
> When I ran across them I didn't think much of them, and probably did not
> even report the bug, since marshal is mostly used to load code anyway, which
> is implicitly trusted.

I'm not sure that all these were fixed but it would be a finite (and
probably small) amount of work to get it fixed -- unlike fixing
pickling, which is impossible (unless you implemented some kind of
sandboxing solution :-).

A good use for pickling is when it's optional. Example: putting
pickles in memcache. The source of the pickles is (presumably)
trusted, so the only remaining problem is occasional version skew. If
the unpickling fails it can just be treated as a cache miss. (Tricky:
when unpickling succeeds but returns a broken object. "Nobody's
perfect." :-)
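
The "treat a failed unpickle as a cache miss" idea can be sketched as below. A plain dict stands in for the memcache client, and cache_get and cache_set are hypothetical helper names. The sketch assumes, as above, that whatever wrote the pickles is trusted; it does nothing about a pickle that loads successfully but yields a broken object.

```python
import pickle


def cache_set(cache, key, value):
    cache[key] = pickle.dumps(value, pickle.HIGHEST_PROTOCOL)


def cache_get(cache, key):
    """Return the cached object, or None if it is missing or cannot be unpickled."""
    blob = cache.get(key)
    if blob is None:
        return None
    try:
        return pickle.loads(blob)
    except Exception:
        # Version skew or a corrupt entry: drop it and report a miss.
        cache.pop(key, None)
        return None


cache = {}  # stand-in for a memcache client
cache_set(cache, "answer", {"value": 42})
cache["stale"] = b"not a pickle"  # simulates an entry this process cannot read
```

The caller then recomputes the value on None, exactly as it would for a key that was never cached.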

--Guido van Rossum

From lvh at  Thu Nov  4 23:40:46 2010
From: lvh at (Laurens Van Houtven)
Date: Thu, 4 Nov 2010 23:40:46 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 5:44 AM, Allan McRae <allan at> wrote:
> According to #python, we are all idiots....

I realize this is not really what your message was about and for the sake
of brevity you used a bit of hyperbole, but like Thomas I would
still like to nip in right there. #python is a pretty big channel. I
think everyone understands that reducing it in its entirety to a
single opinion as inflammatory as "you're all idiots" is at best
oversimplifying and at worst offensive.

(FWIW, Thomas has already said a bunch of stuff I completely agree
with, so +1 everything he said.)

What is true is that there's a new and temporary "NO ARCH" rule in the
topic, and it's for the same reason there's a "NO LOL" in the
topic: to keep the signal to noise ratio high. Apparently there is a
large number of packages (or perhaps just commonly used ones) either
in Arch itself or AUR that didn't work anymore. This caused a lot of
people to complain about problems that are actually Arch-specific
problems: not really something #python is there for nor something it
is good at helping with. That wouldn't be helping people with Python,
that would be helping people with Arch. It is not intended as, and
should not be interpreted as, some kind of public "declaration of war"
against Arch. It simply means that #python isn't going to do
Arch-specific support for packages that no longer work after an
update, since that's not our job nor expertise.

I don't think grudges or misunderstandings help anyone, and Python in
particular, further along. I think I've demonstrated that I'm eager to
get rid of them before. If you (or anyone else for that matter) are
worried about behavior or policy in #python in the future (I assure
you there's really not as much as people generally seem to think there
is) and would like clarification, there's an easy way to access a list
of the ops:

/msg chanserv access #python list

Or just shout "are there any ops on" in #python whenever you like.
These people should be able to tell you what you want to know or at
least point you to the right person to ask.

But basically, to reiterate a point I've made a bunch of times and
have already made (not to you in particular, just in general): #python
is a bunch of people, please don't extrapolate the opinions of a few
to the opinions of many. It's easy and tempting, but it often leads to
demonizing a bunch of people and putting words in people's mouths
which they didn't say or even agree with.

cheers and good luck

From allan at  Fri Nov  5 01:19:59 2010
From: allan at (Allan McRae)
Date: Fri, 05 Nov 2010 10:19:59 +1000
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On 05/11/10 08:40, Laurens Van Houtven wrote:
> On Thu, Nov 4, 2010 at 5:44 AM, Allan McRae<allan at>  wrote:
>> According to #python, we are all idiots....
> I realize this is not really what your message was about and for sake
> of brevity you used a bit of a hyperbole, but like Thomas I would
> still like to nip in right there. #python is a pretty big channel. I
> think everyone understands that reducing it in its entirety to a
> single opinion as inflammatory as "you're all idiots" is at best
> oversimplifying and at worst offensive.

Of course, and I was not intending to offend here.  It was more of a 
running commentary on the unintended influx of Arch Linux users to the 
channel and some of the responses posted to them (some of which I found 
rather amusing when forwarded to me - especially the early response as 
people were figuring out what was going on).

I also agree with the "NO ARCH" topic at the moment. I was fairly 
surprised so many people went to #python for help given we had made news 
posts and had a topic in our IRC channel pointing to how to start fixing
issues.

Allan


From jcea at  Fri Nov  5 01:26:31 2010
From: jcea at (Jesus Cea)
Date: Fri, 05 Nov 2010 01:26:31 +0100
Subject: [Python-Dev] Help with warnings not being raised
Message-ID: <>

Hi all. I just committed r86180, but there is something I don't like.

If you read the tests I did (by hand) at , python should show the
unraisable and THEN the "C API unavailable" warning, but it is not
showing the warning.

I don't know why.

I have committed the patch because it solves the original bug, but I am
pretty uncomfortable not knowing why Python is not doing exactly what I want...

Any ideas?

Sorry for wasting your time with probably trivial stuff, but I need to
know... :-?

PS: I am using "PyErr_Warn()", that is deprecated, because this code
should work in Python 2.3 too. I tried "PyErr_WarnEx()" too, it didn't
work either.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From benjamin at  Fri Nov  5 01:36:25 2010
From: benjamin at (Benjamin Peterson)
Date: Thu, 4 Nov 2010 19:36:25 -0500
Subject: [Python-Dev] Help with warnings not being raised
In-Reply-To: <>
References: <>
Message-ID: <>

2010/11/4 Jesus Cea <jcea at>:
> Hi all. I just committed r86180, but there is something I don't like.
> If you read the tests I did (by hand) at
> , python should show the
> unraisable and THEN the "C API unavailable" warning, but it is not
> showing the warning.
> I don't know why.

Are you passing -3 -Wall?


From marc at  Fri Nov  5 01:21:41 2010
From: marc at (Marcel Hellkamp)
Date: Fri, 05 Nov 2010 01:21:41 +0100
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking
 modules into packages)
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 04.11.2010 17:15, anatoly techtonik wrote:
 > pickle is insecure, marshal too.

If the transport or storage layer is not safe, you should
cryptographically sign the data anyway::

     import base64
     import hmac
     import pickle

     def pickle_encode(data, key):
         msg = base64.b64encode(pickle.dumps(data, -1))
         sig = base64.b64encode(hmac.new(key, msg).digest())
         return sig + ':' + msg

     def pickle_decode(data, key):
         if data and ':' in data:
             sig, msg = data.split(':', 1)
             if sig == base64.b64encode(hmac.new(key, msg).digest()):
                 return pickle.loads(base64.b64decode(msg))
         raise pickle.UnpicklingError("Wrong or missing signature.")

Bottle (a web framework) uses a similar approach to store non-string 
data in client-side cookies. I don't see a (security) problem here.

Kind regards,
Marcel Hellkamp

From stephen at  Fri Nov  5 01:43:30 2010
From: stephen at (Stephen J. Turnbull)
Date: Fri, 05 Nov 2010 09:43:30 +0900
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

Thomas Wouters writes:

 > To clarify (but I don't speak for the rest of #python, just myself), I think
 > the move was premature, but I don't use Arch and I don't know what typical
 > Arch users expect.

All of the Arch users I know expect Arch to occasionally do radical
things because they're the right things to do in the long run.  But
every avant garde distribution picks up its share of wannabes who
don't understand how the process works.

 > The reason I think it's premature is that 'python2' just doesn't
 > work everywhere, and I would have gone for a transitionary period
 > where '/usr/bin/python' is something that screams loudly that it
 > shouldn't be used before it executes 'python2'.

This is unrealistic.  It would seriously annoy Arch's intended
audience.  (Eg, recently I've become a lot more favorable to using
Word instead of OOo because Word doesn't pop up a useless warning
every time I save a .doc file.)  Practically speaking, it would have
to be off by default, like Python pending deprecation warnings.

Anyway, I bet that anybody capable of upgrading their *Arch* packages
and complaining to *#python* about resulting breakage would be capable
of complaining to #python about the weird warning about python2.  And
you can't have a NO /USR/BIN/PYTHON topic, can you?<wink>

 > As for #python, well, we got this storm of people utterly confused
 > about how their stuff doesn't work anymore, and putting the blame
 > in the wrong place.

How so?  Ultimately, Guido is responsible for this.  Sure, the
immediate symptom was caused by Arch's action, but Python 3 *is*
rather incompatible with Python 2.  You're going to get a storm every
time a distro changes, and in a year or two, it's no longer going to
be something you can dispose of by setting a hotkey to "Google for
'BOGUS Linux python'" -- it's going to be stuff that requires a real
understanding of how Python 3 differs from Python 2, and often will be
pretty subtle.

 > I don't think a distribution should ever cause that (even though
 > many do in lesser ways)

Sure, and Guido should have exercised the Time Machine a little harder
so that Python 3 never needed to happen.  IOW, this is the price of
success and wide distribution.

BTW, I hope the next distribution to make the jump does try your
suggestion to make /usr/bin/python scream.  It might work, even work

From devin.c.cook at  Fri Nov  5 01:52:05 2010
From: devin.c.cook at (Devin Cook)
Date: Thu, 4 Nov 2010 19:52:05 -0500
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 7:19 PM, Allan McRae <allan at> wrote:
> I also agree with the "NO ARCH" topic at the moment. I was fairly surprised
> so many people went to #python for help given we had made news posts and had
> a topic in our IRC channel pointing to how to start fixing issues.
> Allan

I don't remember seeing any warning about it during the upgrade. That
may have helped people (ones that read the warnings, at least) figure
out what was going on. I think a warning from /usr/bin/python may have
helped as well, but I do suppose that might be a bit extreme.

FWIW, I found those news posts and the Python wiki page pretty quickly
after I realized my scripts weren't working anymore.


From steve at  Fri Nov  5 01:56:05 2010
From: steve at (Steven D'Aprano)
Date: Fri, 05 Nov 2010 11:56:05 +1100
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking
 modules into packages)
In-Reply-To: <>
References: <>
Message-ID: <>

Nick Coghlan wrote:

> As a tool for communicating between different instances of the *same*
> version of Python though, pickle is fine.

I'm using pickle to pass a list and dict of floats and strings from 
Python 2.6 to 3.1. I've never had any problems with it. Am I living in a 
state of sin or is that okay?
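
For reference, a minimal sketch of that kind of exchange, assuming the writer pins an explicit protocol. Protocol 2 is the highest one Python 2.x understands, so both 2.6 and 3.x can read and write it; the sample data below is made up for illustration.

```python
import pickle

# Only plain containers, floats and strings: no custom classes involved.
data = (["spam", 1.5, 2.25], {"pi": 3.14159, "name": "eggs"})

# Pin protocol 2 so a Python 2.x reader or writer can take part.
blob = pickle.dumps(data, protocol=2)
restored = pickle.loads(blob)
```

One thing to watch in the 2.x to 3.x direction: byte strings written by Python 2 are decoded on the Python 3 side according to the encoding argument of pickle.loads, which defaults to ASCII.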


From foom at  Fri Nov  5 02:04:53 2010
From: foom at (James Y Knight)
Date: Thu, 4 Nov 2010 21:04:53 -0400
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 4, 2010, at 8:43 PM, Stephen J. Turnbull wrote:
> All of the Arch users I know expect Arch to occasionally do radical
> things because they're the right things to do in the long run.

But the previous consensus (at least, as I, and presumably many other people understood it) was that python2 would remain the owner of the name "/usr/bin/python" for the indefinite future, and python3 would be invoked with /usr/bin/python3.

Given that, it's not at all clear that Arch's actions are the right thing to do.

IMO, moving away from that consensus should've been brought up on python-dev rather than just one distro just doing it all alone, causing incompatibilities and annoyance. If python-dev wants python3 to inherit the name /usr/bin/python, then python2 should've been installing a binary called /usr/bin/python2 for a couple years ahead of time, and recommending that everyone use that in their #! lines, so that the switch could've been done without breaking everything...

> Sure, and Guido should have exercised the Time Machine a little harder
> so that Python 3 never needed to happen.  IOW, this is the price of
> success and wide distribution.

Well, other programming languages seem to have avoided making sweeping bidirectionally-incompatible changes, despite being successful and widely distributed. But that's a whole other discussion.


From jcea at  Fri Nov  5 02:12:55 2010
From: jcea at (Jesus Cea)
Date: Fri, 05 Nov 2010 02:12:55 +0100
Subject: [Python-Dev] Help with warnings not being raised
In-Reply-To: <>
References: <>
Message-ID: <>

On 05/11/10 01:36, Benjamin Peterson wrote:
>> I don't know why.
> Are you passing -3 -Wall?

I am passing "-3 -Werror", to induce the error control I have committed.
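
As a point of comparison only, this is how the same filter actions behave for a warning raised from Python code. It does not diagnose the C-level PyErr_Warn() case discussed here; legacy_call and the RuntimeWarning category are hypothetical stand-ins.

```python
import warnings


def legacy_call():
    # Hypothetical function that emits a warning and then carries on.
    warnings.warn("C API unavailable", RuntimeWarning)
    return "ok"


# Comparable in spirit to -Werror: the warning becomes an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        legacy_call()
        raised = False
    except RuntimeWarning:
        raised = True

# With the "always" action the warning is reported (here: recorded) and
# the call completes normally.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = legacy_call()
```

Under the "error" action the call never returns its value, which is one reason a warning issued while another exception is being handled can end up looking different from what was expected.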



From jcea at  Fri Nov  5 02:20:56 2010
From: jcea at (Jesus Cea)
Date: Fri, 05 Nov 2010 02:20:56 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/11/10 05:44, Allan McRae wrote:
> The second case was particularly interesting.  These software would
> change some of their #! to point at the python2 symlink and leave the
> rest pointing at python.  Note that python-2.7 itself falls into this
> category as many files in /usr/lib/python2.7 still have "#!/usr/bin/env
> python" even when installed with "make altinstall".  I can not remember
> the exact details, but I recall that some of these files were installed
> with executable permissions which would be bad, but I need to look into
> this again now things have calmed down...

PLEASE, open a bug with this. It is a serious bug. "make altinstall"
*SHOULD* be respected.



From jcea at  Fri Nov  5 02:31:05 2010
From: jcea at (Jesus Cea)
Date: Fri, 05 Nov 2010 02:31:05 +0100
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking
 modules into packages)
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 04/11/10 15:57, Alexander Belopolsky wrote:
> On Thu, Nov 4, 2010 at 10:51 AM, Guido van Rossum <guido at> wrote:
> ..
>>>> Twisted actually tried to preserve pickle compatibility in the bad old days,
>>>> but it was impossible.  Pickles should never really be saved to disk unless
>>>> they contain nothing but lists, ints, strings, and dicts.
>> But *that* set of types can safely be marshalled using the marshal module...
> Not if the instances contain reference cycles.

Moreover, in the docs the marshal module EXPLICITLY says that the
format is undocumented on purpose, and subject to change. Seems a pretty
bad option for persistence, if you expect to read your data back in the

This module contains functions that can read and write Python values in
a binary format. The format is specific to Python, but independent of
machine architecture issues (e.g., you can write a Python value to a
file on a PC, transport the file to a Sun, and read it back there).
Details of the format are undocumented on purpose; it may change between
Python versions (although it rarely does).
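
On the reference-cycle point raised above, a small sketch showing that pickle preserves cycles and shared references across a round trip. The structure is made up for illustration; marshal is deliberately not exercised here, since its handling of such data depends on the marshal format version.

```python
import pickle

# A list that contains itself, plus a dict that points back to the list.
cycle = ["node"]
cycle.append(cycle)
shared = {"owner": cycle}
cycle.append(shared)

# pickle memoizes objects it has already seen, so the cycle survives.
copy = pickle.loads(pickle.dumps(cycle))
```

After the round trip, copy[1] is copy itself and copy[2]["owner"] points back to the same list, rather than to duplicated data.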



From steve at  Fri Nov  5 02:40:10 2010
From: steve at (Steven D'Aprano)
Date: Fri, 05 Nov 2010 12:40:10 +1100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

James Y Knight wrote:

> But the previous consensus (at least, as I, and presumably many other
> people understood it) was that python2 would remain the owner of the
> name "/usr/bin/python" for the indefinite future, and python3 would
> be invoked with /usr/bin/python3.
> Given that, it's not at all clear that Arch's actions are the right
> thing to do.

The time will come when even Python 2.7 is long obsolete. I think it is
silly to insist that people invoke python3 to run their Python 3.7
scripts. Arch might be jumping the gun a little, or even a lot, but 
sooner or later it should be done.

Besides, this is another sign that the Python 3 haters are wrong. We now 
have a distro that has made Python 3 the standard system python. It 
might be a bleeding-edge distro not recommended for non-experts, but 
it's still pretty cool that *somebody* has done it.

> IMO, moving away from that consensus should've been brought up on 
> python-dev rather than just one distro just doing it all alone, 
> causing incompatibilities and annoyance. 

We're all adults here. If Arch wants to live on the bleeding edge, more 
power to them. That's why my server runs Centos :)


From exarkun at  Fri Nov  5 05:09:35 2010
From: exarkun at (exarkun at
Date: Fri, 05 Nov 2010 04:09:35 -0000
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On
	breaking	modules into packages)
In-Reply-To: <>
References: <>
Message-ID: <20101105040935.2040.490777150.divmod.xquotient.622@localhost.localdomain>

On 12:21 am, marc at wrote:
>On 04.11.2010 17:15, anatoly techtonik wrote:
> > pickle is insecure, marshal too.
>If the transport or storage layer is not safe, you should 
>cryptographically sign the data anyway::
>     def pickle_encode(data, key):
>         msg = base64.b64encode(pickle.dumps(data, -1))
>         sig = base64.b64encode(hmac.new(key, msg).digest())
>         return sig + ':' + msg
>     def pickle_decode(data, key):
>         if data and ':' in data:
>             sig, msg = data.split(':', 1)
>             if sig == base64.b64encode(hmac.new(key, msg).digest()):
>                 return pickle.loads(base64.b64decode(msg))
>         raise pickle.UnpicklingError("Wrong or missing signature.")
>Bottle (a web framework) uses a similar approach to store non-string 
>data in client-side cookies. I don't see a (security) problem here.

Your pickle_decode leaks information about the key.  An attacker will 
eventually (a few seconds to a few minutes, depending on how they have 
access to this system) be able to determine your key and send you 
arbitrary pickles (ie, execute arbitrary code on your system).


This stuff is hard.  If you're going to mess around with it, make sure 
you're *serious* (better approach: don't mess around with it).


From bob at  Fri Nov  5 05:21:57 2010
From: bob at (Bob Ippolito)
Date: Fri, 5 Nov 2010 12:21:57 +0800
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking
 modules into packages)
In-Reply-To: <20101105040935.2040.490777150.divmod.xquotient.622@localhost.localdomain>
References: <>
Message-ID: <>

On Friday, November 5, 2010,  <exarkun at> wrote:
> On 12:21 am, marc at wrote:
> On 04.11.2010 17:15, anatoly techtonik wrote:
>> pickle is insecure, marshal too.
> If the transport or storage layer is not safe, you should cryptographically sign the data anyway::
>     def pickle_encode(data, key):
>         msg = base64.b64encode(pickle.dumps(data, -1))
>         sig = base64.b64encode(hmac.new(key, msg).digest())
>         return sig + ':' + msg
>     def pickle_decode(data, key):
>         if data and ':' in data:
>             sig, msg = data.split(':', 1)
>             if sig == base64.b64encode(hmac.new(key, msg).digest()):
>                 return pickle.loads(base64.b64decode(msg))
>         raise pickle.UnpicklingError("Wrong or missing signature.")
> Bottle (a web framework) uses a similar approach to store non-string data in client-side cookies. I don't see a (security) problem here.
> Your pickle_decode leaks information about the key.  An attacker will eventually (a few seconds to a few minutes, depending on how they have access to this system) be able to determine your key and send you arbitrary pickles (ie, execute arbitrary code on your system).
> Oops.
> This stuff is hard.  If you're going to mess around with it, make sure you're *serious* (better approach: don't mess around with it).

Specifically you need to use a constant time signature verification or
else there are possible timing attacks. Sounds like something a hmac
module should provide in the first place.

But yeah, this stuff is hard, better to just not have a code execution
hole in the first place.
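
For illustration, a sketch of the quoted helpers with a constant-time comparison. It assumes hmac.compare_digest, which was added to the stdlib later (Python 3.3), and it picks SHA-256 as the digest and bytes for the token; both are choices made for this example rather than anything from the quoted code. The _sign helper is a hypothetical name.

```python
import base64
import hashlib
import hmac
import pickle


def _sign(msg, key):
    # HMAC-SHA256 over the base64 payload; the digest choice is an assumption.
    return base64.b64encode(hmac.new(key, msg, hashlib.sha256).digest())


def pickle_encode(data, key):
    msg = base64.b64encode(pickle.dumps(data, -1))
    return _sign(msg, key) + b":" + msg


def pickle_decode(data, key):
    if data and b":" in data:
        sig, msg = data.split(b":", 1)
        # compare_digest is designed not to reveal where the inputs differ.
        if hmac.compare_digest(sig, _sign(msg, key)):
            return pickle.loads(base64.b64decode(msg))
    raise pickle.UnpicklingError("Wrong or missing signature.")
```

This only addresses the timing leak in the comparison. Anyone who holds the key can still get arbitrary pickles executed, so the advice about not having the hole in the first place stands.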


From mcrae_allan at  Fri Nov  5 04:14:02 2010
From: mcrae_allan at (Allan McRae)
Date: Fri, 05 Nov 2010 13:14:02 +1000
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 05/11/10 11:20, Jesus Cea wrote:
> On 04/11/10 05:44, Allan McRae wrote:
>> The second case was particularly interesting.  These software would
>> change some of their #! to point at the python2 symlink and leave the
>> rest pointing at python.  Note that python-2.7 itself falls into this
>> category as many files in /usr/lib/python2.7 still have "#!/usr/bin/env
>> python" even when installed with "make altinstall".  I can not remember
>> the exact details, but I recall that some of these files were installed
>> with executable permissions which would be bad, but I need to look into
>> this again now things have calmed down...
> PLEASE, open a bug with this. It is a serious bug. "make altinstall"
> *SHOULD* be respected.



From thomas at  Fri Nov  5 09:47:18 2010
From: thomas at (Thomas Wouters)
Date: Fri, 5 Nov 2010 09:47:18 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 5, 2010 at 01:43, Stephen J. Turnbull <stephen at> wrote:

> Thomas Wouters writes:
>  > To clarify (but I don't speak for the rest of #python, just myself), I think
>  > the move was premature, but I don't use Arch and I don't know what typical
>  > Arch users expect.
> All of the Arch users I know expect Arch to occasionally do radical
> things because they're the right things to do in the long run.  But
> every avant garde distribution picks up its share of wannabes who
> don't understand how the process works.
>  > The reason I think it's premature is that 'python2' just doesn't
>  > work everywhere, and I would have gone for a transitionary period
>  > where '/usr/bin/python' is something that screams loudly that it
>  > shouldn't be used before it executes 'python2'.
> This is unrealistic.  It would seriously annoy Arch's intended
> audience.  (Eg, recently I've become a lot more favorable to using
> Word instead of OOo because Word doesn't pop up a useless warning
> every time I save a .doc file.)  Practically speaking, it would have
> to be off by default, like Python pending deprecation warnings.

Wait, what? Warning about impending brokenness is *more annoying* than just
plain breaking? How on earth would the warning be "useless"?
Keep in mind that the warning would only show up *if stuff would otherwise
not work*.

> Anyway, I bet that anybody capable of upgrading their *Arch* packages
> and complaining to *#python* about resulting breakage would be capable
> of complaining to #python about the weird warning about python2.  And
> you can't have a NO /USR/BIN/PYTHON topic, can you?<wink>

Any change is disruptive. My comment wasn't about the crowd of people
visiting #python and complaining, it was about the decision to change
/usr/bin/python, and how it was done. However, a warning with a clear
description -- for example, a link to a webpage explaining the situation --
would most assuredly have prevented many people from coming to #python in
desperation. They might still have *complained*, in #python or elsewhere,
but it would have been a lot clearer.

>  > As for #python, well, we got this storm of people utterly confused
>  > about how their stuff doesn't work anymore, and putting the blame
>  > in the wrong place.
> How so?  Ultimately, Guido is responsible for this.  Sure, the
> immediate symptom was caused by Arch's action, but Python 3 *is*
> rather incompatible with Python 2.  You're going to get a storm every
> time a distro changes, and in a year or two, it's no longer going to
> be something you can dispose of by setting a hotkey to "Google for
> 'BOGUS Linux python'" -- it's going to be stuff that requires a real
> understanding of how Python 3 differs from Python 2, and often will be
> pretty subtle.

>  > I don't think a distribution should ever cause that (even though
>  > many do in lesser ways)
> Sure, and Guido should have exercised the Time Machine a little harder
> so that Python 3 never needed to happen.  IOW, this is the price of
> success and wide distribution.

No, that's not my point at all. The problem isn't that Python 3 is
incompatible with Python 2. The problem is that stuff broke without
(apparently) fair warning. This isn't a Python thing, this is a distribution
thing: for users of a distribution, having a clear, usable migration path
for incompatible changes is *important*. For users, not packagers, this
means you have to slap them in the face with upcoming incompatible changes,
or they won't notice. It may not be important for Arch, or for the users
Arch expects to have, but it sure as hell is important to me and every
sysadmin I know :)

Thomas Wouters <thomas at>

Hi! I'm a .signature virus! copy me into your .signature file to help me

From allan at  Fri Nov  5 11:25:35 2010
From: allan at (Allan McRae)
Date: Fri, 05 Nov 2010 20:25:35 +1000
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

On 05/11/10 18:47, Thomas Wouters wrote:
> No, that's not my point at all. The problem isn't that Python 3 is
> incompatible with Python 2. The problem is that stuff broke without
> (apparently) fair warning.


Just to clarify (and going way off topic for this list...), this was 
discussed on the Arch Linux mailing lists around six months in advance, 
then again about two months beforehand when the rebuild started.  Then 
it sat in our testing repository for a month where issues were discussed 
on our mailing lists and forums.  Also a news post was made on our 
website front page before moving it into our main repos.

With a rolling release distro we do not have the luxury of making 
release notes every major release so we make it abundantly clear to our 
users that we expect them to at least always read the front page news 
before updating.  There are even wrapper scripts for our package manager 
that print the news headlines before updating.

So there was warning.  As always, it was just ignored by a portion of 
our users.


From ncoghlan at  Fri Nov  5 13:55:58 2010
From: ncoghlan at (Nick Coghlan)
Date: Fri, 5 Nov 2010 22:55:58 +1000
Subject: [Python-Dev] Help with warnings not being raised
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 5, 2010 at 11:12 AM, Jesus Cea <jcea at> wrote:
> On 05/11/10 01:36, Benjamin Peterson wrote:
>>> I don't know why.
>> Are you passing -3 -Wall?
> I am passing "-3 -Werror", to induce the error control I have committed.

Under -We, PyErr_Warn raises an exception rather than printing to
stderr. That exception is clobbered by the immediately following call
to PyErr_Clear.
Since you *only* hit that branch under -We in the first place, a
second call to PyErr_WriteUnraisable should get the error to actually
print out.
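
The same effect can be reproduced in pure Python, as a rough analogue of the C-level behaviour described above rather than the code in question. The function name emit_and_swallow and the warning text are made up for illustration; the try/except that discards the exception stands in for the PyErr_Clear call.

```python
import warnings


def emit_and_swallow():
    """Warn, then discard any resulting exception (analogue of PyErr_Clear)."""
    try:
        warnings.warn("py3k incompatibility", DeprecationWarning)
    except DeprecationWarning:
        # Under an "error" filter the warning arrives here as an exception
        # and is silently dropped, so nothing is ever reported.
        return "swallowed"
    return "warned"


# Normal filters: the warning is recorded (or printed) as usual.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    default_result = emit_and_swallow()
    default_count = len(caught)

# Equivalent of running with -W error: the warning becomes an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    error_result = emit_and_swallow()
```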


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Fri Nov  5 14:18:45 2010
From: ncoghlan at (Nick Coghlan)
Date: Fri, 5 Nov 2010 23:18:45 +1000
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking
 modules into packages)
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 5, 2010 at 10:56 AM, Steven D'Aprano <steve at> wrote:
> Nick Coghlan wrote:
>> As a tool for communicating between different instances of the *same*
>> version of Python though, pickle is fine.
> I'm using pickle to pass a list and dict of floats and strings from Python
> 2.6 to 3.1. I've never had any problems with it. Am I living in a state of
> sin or is that okay?

Builtins are generally fine, and we do try reasonably hard to keep the
pickle formats properly compatible across versions. It's corner cases
(like pickling unittest objects) that may sometimes break, since
pickle implicitly depends on things that *should* be disregarded as
implementation details. Specifically, without explicit directions to
do otherwise, pickle encodes objects based on what they *are*, which
may include implementation details, such as optional acceleration
modules, platform specific variants of classes returned by a factory
function, etc. Technically such things are bugs in an object's
pickling support, but they're *really* non-obvious (and genuinely
harmless in most cases).
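
A small sketch of what "encodes objects based on what they *are*" means in practice, using fractions.Fraction purely as a convenient standard library example (it is not a class mentioned in this thread). Builtins round-trip as plain data, while the class instance is stored as a reference to a module and class name that must still resolve when the pickle is loaded.

```python
import pickle
from fractions import Fraction

# Builtins: the pickle contains only data, no references to library internals.
builtin_blob = pickle.dumps([1.5, "text", {"k": 2.0}], protocol=2)
builtin_roundtrip = pickle.loads(builtin_blob)

# A class instance: the pickle names the defining module and the class, so a
# later rename or move of either would break loading in the older layout.
instance_blob = pickle.dumps(Fraction(1, 3), protocol=2)
module_name_stored = b"fractions" in instance_blob
class_name_stored = b"Fraction" in instance_blob
```

Protocol 2 is used here because it is the highest protocol that both Python 2 and Python 3 can read.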

As I see it, there are at least 3 levels of pickling support:
1. Complete, version independent (implementation details are weeded
out from the pickle, or deliberately kept the same across versions to
preserve pickle compatibility)
2. Partial, potentially version dependent (pickles may be infected
with implementation details that affect cross-version compatibility if
they happen to change)
3. None (can't even pickle it in the first place)

Builtins are in category 1, but there are plenty of things in the
standard library (like unittest classes) that rely on default pickling
behaviour and hence fit into category 2 (we just very, very rarely
move anything around, so such classes may as well be in category 1
most of the time).

Notably, this mostly causes problems when reading pickles generated
with a *new* version of Python in an *old* version. When going the
other way, we can adjust the unpickling process to cope with any
discrepancies (for the "relying on implementation details" case,
usually by the simple expedient of keeping both sets of names around).
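
For illustration only, here is one way a reader of old pickles can cope with a moved name. Note that this uses a rename table in a custom Unpickler, which is a different mechanism from keeping both names in the library itself as described above. The module name old_fractions, the RENAMED table, and the hand-written pickle bytes are invented for the example.

```python
import io
import pickle
from fractions import Fraction

# Hypothetical rename table: (old module, old name) -> (new module, new name).
RENAMED = {("old_fractions", "Fraction"): ("fractions", "Fraction")}


class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that redirects lookups of names that have since moved."""

    def find_class(self, module, name):
        module, name = RENAMED.get((module, name), (module, name))
        return super().find_class(module, name)


def loads_with_renames(data):
    return RenamingUnpickler(io.BytesIO(data)).load()


# Hand-written protocol 0 pickle that calls old_fractions.Fraction(1, 3).
legacy_blob = b"cold_fractions\nFraction\n(I1\nI3\ntR."
restored = loads_with_renames(legacy_blob)
```

A plain pickle.loads(legacy_blob) fails with an ImportError because old_fractions does not exist, which is the new-to-old direction of the problem in miniature.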


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Fri Nov  5 15:17:51 2010
From: ncoghlan at (Nick Coghlan)
Date: Sat, 6 Nov 2010 00:17:51 +1000
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 11:59 PM, Benjamin Peterson <benjamin at> wrote:
> 2010/11/4 Nick Coghlan <ncoghlan at>:
>> Tools also had a few discrepancies:
>>  scripts/ /usr/bin/env python (necessary, I think - I believe
>> 2to3 is a 2.x only program)
> No, I believe distutils is supposed to patch that up, though.

Yeah, I did a more thorough grep and the ready-to-install version of has a correctly updated shebang line.
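
A rough sketch of that kind of check, for anyone wanting to repeat it on their own install tree. The function name, the regular expression, and the decision to report the executable bit are assumptions made for this example; it is not the command that was actually run.

```python
import os
import re

# Matches "#!/usr/bin/env python" or "#!/some/path/python" (optionally with
# arguments), but not versioned names such as python2 or python3.
UNVERSIONED = re.compile(rb"^#!\s*(?:\S*/env\s+)?(?:\S*/)?python(?:\s|$)")


def find_unversioned_shebangs(root):
    """Return sorted (path, is_executable) pairs for files under root whose
    first line is an unversioned python shebang."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "rb") as handle:
                    first_line = handle.readline(256)
            except OSError:
                continue  # unreadable file: skip rather than abort the scan
            if UNVERSIONED.match(first_line.rstrip(b"\r\n")):
                hits.append((path, os.access(path, os.X_OK)))
    return sorted(hits)
```

Calling find_unversioned_shebangs on an installed library directory would list the files that still point at plain "python", with the second element showing which of them are executable.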


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From stephen at  Fri Nov  5 17:09:38 2010
From: stephen at (Stephen J. Turnbull)
Date: Sat, 06 Nov 2010 01:09:38 +0900
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

Thomas Wouters writes:

 > > This is unrealistic.  It would seriously annoy Arch's intended
 > > audience.  (Eg, recently I've become a lot more favorable to using
 > > Word instead of OOo because Word doesn't pop up a useless warning
 > > every time I save a .doc file.)  Practically speaking, it would have
 > > to be off by default, like Python pending deprecation warnings.
 > Wait, what? Warning about impending brokenness is *more annoying* than just
 > plain breaking? How on earth would the warning be "useless"?
 > Keep in mind that the warning would only show up *if stuff would otherwise
 > not work*.

As I understood it, what you proposed was that in a *Python 2-based*
distribution thinking about switching to Python 3 as the default
/usr/bin/python, they should first substitute a bitch'n'run-python2
script for the python (Python 2) binary, and after that works the bugs
out, switch to Python 3.

In that scenario, the bitching is useful *exactly* once: the first
time anybody reports the bug to someone who can do something about it.
But for some time, *every time* you run your app, it bitches
uselessly: it would work fine if you just install Python 2 as
/usr/bin/python, without bitching.  That's not very graceful.  And
"some time" will often stretch into weeks or months for any given
user, since few distros will bless a new package with zero testing.

 > No, that's not my point at all. The problem isn't that Python 3 is
 > incompatible with Python 2. The problem is that stuff broke without
 > (apparently) fair warning.

Warning was given; they weren't listening.

From thomas at  Fri Nov  5 17:58:30 2010
From: thomas at (Thomas Wouters)
Date: Fri, 5 Nov 2010 17:58:30 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 5, 2010 at 17:09, Stephen J. Turnbull <stephen at> wrote:

> Thomas Wouters writes:
>  > > This is unrealistic.  It would seriously annoy Arch's intended
>  > > audience.  (Eg, recently I've become a lot more favorable to using
>  > > Word instead of OOo because Word doesn't pop up a useless warning
>  > > every time I save a .doc file.)  Practically speaking, it would have
>  > > to be off by default, like Python pending deprecation warnings.
>  >
>  > Wait, what? Warning about impending brokenness is *more annoying* than just
>  > plain breaking? How on earth would the warning be "useless"?
>  > Keep in mind that the warning would only show up *if stuff would otherwise
>  > not work*.
> As I understood it, what you proposed was that in a *Python 2-based*
> distribution thinking about switching to Python 3 as the default
> /usr/bin/python, they should first substitute a bitch'n'run-python2
> script for the python (Python 2) binary, and after that works the bugs
> out, switch to Python 3.
> In that scenario, the bitching is useful *exactly* once: the first
> time anybody reports the bug to someone who can do something about it.
> But for some time, *every time* you run your app, it bitches
> uselessly: it would work fine if you just install Python 2 as
> /usr/bin/python, without bitching.  That's not very graceful.  And
> "some time" will often stretch into weeks or months for any given
> user, since few distros will bless a new package with zero testing.

No, what I suggested was that *instead of changing /usr/bin/python to Python
3*, it would produce a warning. So, as before, change everything you know
about to python2. Keep everything that is python3 using python3. Change
/usr/bin/python, which *should* now be unused, to something that complains.
Since all the distribution-installed packages were changed, the only
warnings will come from invocations that would otherwise have spectacularly
and possibly quite confusingly blown up. As I said, the warning can provide
clear instructions on updating the software. Heck, the /usr/bin/python
wrapper could be made to be quiet for a day at a time by having the user
press a button.
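
To make the idea concrete, a wrapper along these lines could look roughly like the sketch below. Everything in it is hypothetical: the message text, the function name run_with_warning, and the injectable exec_fn and stream parameters (there only so the behaviour can be exercised without replacing the current process). No distribution is known to ship this.

```python
import os
import sys

WARNING = (
    "warning: {script} was started via the unversioned 'python' command.\n"
    "Its #! line should name 'python2' or 'python3' explicitly.\n"
)


def run_with_warning(argv, exec_fn=os.execvp, stream=sys.stderr):
    """Complain on stderr, then hand the unchanged arguments to python2."""
    script = argv[1] if len(argv) > 1 else "<interactive>"
    stream.write(WARNING.format(script=script))
    # os.execvp replaces the current process, so the wrapped program keeps
    # the original stdin, stdout, and exit status.
    exec_fn("python2", ["python2"] + list(argv[1:]))
```

Installed as /usr/bin/python, the script would end with a call to run_with_warning(sys.argv). The "quiet for a day" refinement would need some per-user state and is left out.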

>  > No, that's not my point at all. The problem isn't that Python 3 is
>  > incompatible with Python 2. The problem is that stuff broke without
>  > (apparently) fair warning.
> Warning was given; they weren't listening.

Yes, that's what users do. They don't look at the websites or read the
mailinglists, they just care that their stuff keeps working and they don't
want to pay the maintenance cost :) I'm not saying Arch should have done
this, but most Linux distributions do *not* have attentive users. This is
not news. I would rather we stay with an explicit 'python3' for another
decade (as, after all, Perl did with perl5 as well) than that more people
are confronted with the switch to python3 by having their own code break.

Thomas Wouters <thomas at>

Hi! I'm a .signature virus! copy me into your .signature file to help me

From status at  Fri Nov  5 18:08:19 2010
From: status at (Python tracker)
Date: Fri,  5 Nov 2010 18:08:19 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <>

ACTIVITY SUMMARY (2010-10-29 - 2010-11-05)
Python tracker at

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    2514 (+17)
  closed 19597 (+78)
  total  22111 (+95)

Open issues with patches: 1044 

Issues opened (56)

#5251: contextlib.nested inconsistent with, well, nested with stateme  reopened by ncoghlan

#10236: Sporadic failures of test_ssl  opened by ixokai

#10237: failure in Barrier tests  opened by pitrou

#10238: ctypes not building under OS X 10.6 with LLVM/Clang 2.8  opened by brett.cannon

#10239: multiprocessing signal defect  opened by Neal.Becker

#10240: dict.update.__doc__ is misleading  opened by ivank

#10241: gc fixes for module m_copy attribute  opened by nascheme

#10242: unittest's assertItemsEqual() method makes too many assumption  opened by rhettinger

#10245: Fix resource warnings in test_telnetlib  opened by bbrazil

#10248: Fix resource warnings in test_xmlrpclib  opened by bbrazil

#10252: Fix resource warnings in distutils  opened by bbrazil

#10254: unicodedata.normalize('NFC', s) regression  opened by valhallasw

#10255: refleak in initstdio  opened by nascheme

#10259: Entry text not set if all of 'Font', 'Foreground' and 'Justify  opened by iarspider

#10260: Add a threading.Condition.wait_for() method  opened by krisvale

#10261: tarfile iterator without members caching  opened by karstenw

#10262: Add --disable-abi-flags option to `configure`  opened by Arfrever

#10267: test_ttk_guionly leaks many references  opened by pitrou

#10270: Fix resource warnings in test_threading  opened by bbrazil

#10271: warnings.showwarning should allow any callable object  opened by lekma

#10272: SSL handshake timeouts not caught by transient_internet  opened by pitrou

#10273: Clean-up Unittest API  opened by rhettinger

#10274: imaplib should provide a means to validate a remote server ssl  opened by db

#10276: zlib crc32/adler32 buffer length truncation (64-bit)  opened by nvawda

#10278: add time.wallclock() method  opened by krisvale

#10282: IMPLEMENTATION token differently delt with in NNTP capability  opened by jelie

#10284: NNTP should accept bytestrings for username and password  opened by jelie

#10287: NNTP authentication should check capabilities  opened by jelie

#10288: Remove deprecated C "character" handling macros ISUPPER() etc  opened by dmalcolm

#10289: Document magic methods called by built-in functions  opened by eric.araujo

#10291: Clean-up turtledemo in-package documentation  opened by belopolsky

#10292: tarinfo should use relative symlinks  opened by magcius

#10296: ctypes catches BreakPoint error on windows 32  opened by krisvale

#10297: decimal module documentation is misguiding  opened by hafiza.jameel

#10298: zipfile: incorrect comment size will prevent extracting  opened by rep

#10299: Add index with links section for built-in functions  opened by nestor

#10302: Add class-functions to hash many small objects with hashlib  opened by ebfe

#10303: small inconsistency in tutorial  opened by maltehelmert

#10304: error in tutorial triple-string example  opened by maltehelmert

#10305: Cleanup up ResourceWarnings in multiprocessing  opened by brian.curtin

#10308: Modules/getpath.c bugs  opened by hfuru

#10309: dlmalloc.c needs _GNU_SOURCE for mremap()  opened by hfuru

#10310: signed:1 bitfields rarely make sense  opened by hfuru

#10311: Signal handlers must preserve errno  opened by hfuru

#10312: intcatcher() can deadlock  opened by hfuru

#10318: "make altinstall" installs many files with incorrect shebangs  opened by allan

#10319: SocketServer.TCPServer truncates responses on close (in some s  opened by jrodman2

#10320: printf %qd is nonstandard  opened by hfuru

#10321: Add support for Message objects and binary data to smtplib.sen  opened by r.david.murray

#10323: Final state of underlying sequence in islice  opened by shashank

#10324: Modules/binascii.c: simplify expressions  opened by nikai

#10325: PY_LLONG_MAX & co - preprocessor constants or not?  opened by hfuru

#10326: Can't pickle unittest.TestCase instances  opened by michael.foord

#10327: Abnormal SSL timeouts when using socket timeouts - once again  opened by pakal

#10328: re.sub[n] doesn't seem to handle /Z replacements correctly in  opened by Alexander.Schmolck

#10329: and unicode in Python 3  opened by doerwalter

Most recent 15 issues with no replies (15)

#10329: and unicode in Python 3

#10328: re.sub[n] doesn't seem to handle /Z replacements correctly in

#10326: Can't pickle unittest.TestCase instances

#10325: PY_LLONG_MAX & co - preprocessor constants or not?

#10324: Modules/binascii.c: simplify expressions

#10321: Add support for Message objects and binary data to smtplib.sen

#10320: printf %qd is nonstandard

#10319: SocketServer.TCPServer truncates responses on close (in some s

#10312: intcatcher() can deadlock

#10310: signed:1 bitfields rarely make sense

#10309: dlmalloc.c needs _GNU_SOURCE for mremap()

#10308: Modules/getpath.c bugs

#10303: small inconsistency in tutorial

#10298: zipfile: incorrect comment size will prevent extracting

#10297: decimal module documentation is misguiding

Most recent 15 issues waiting for review (15)

#10329: and unicode in Python 3

#10324: Modules/binascii.c: simplify expressions

#10321: Add support for Message objects and binary data to smtplib.sen

#10312: intcatcher() can deadlock

#10311: Signal handlers must preserve errno

#10310: signed:1 bitfields rarely make sense

#10308: Modules/getpath.c bugs

#10299: Add index with links section for built-in functions

#10298: zipfile: incorrect comment size will prevent extracting

#10292: tarinfo should use relative symlinks

#10288: Remove deprecated C "character" handling macros ISUPPER() etc

#10278: add time.wallclock() method

#10276: zlib crc32/adler32 buffer length truncation (64-bit)

#10270: Fix resource warnings in test_threading

#10267: test_ttk_guionly leaks many references

Top 10 most discussed issues (10)

#10273: Clean-up Unittest API  19 msgs

#10284: NNTP should accept bytestrings for username and password  18 msgs

#2636: Regexp 2.7 (modifications to current re 2.2.2)  16 msgs

#1926: NNTPS support in nntplib  12 msgs

#7061: Improve 24.5. turtle doc  12 msgs

#9611: FileIO not 64-bit safe under Windows  11 msgs

#10278: add time.wallclock() method  11 msgs

#9377: socket, PEP 383: Mishandling of non-ASCII bytes in host/domain  10 msgs

#10181: Problems with Py_buffer management in memoryobject.c (and else  10 msgs

#10311: Signal handlers must preserve errno  10 msgs

Issues closed (78)

#3699: test_bigaddrspace broken  closed by pitrou

#4403: regression from 2.6: requiring ascii for sending me  closed by r.david.murray

#4510: ValueError for list.remove() not very helpful  closed by benjamin.peterson

#5573: multiprocessing Pipe poll() and recv() semantics.  closed by asksol

#5729: Allows tabs for indenting JSON output  closed by rhettinger

#6081: str.format_map()  closed by eric.smith

#6706: asyncore's accept() is broken  closed by giampaolo.rodola

#7059: 'checking getaddrinfo bug' doesn't output the result during ./  closed by benjamin.peterson

#7266: test_lib2to3 failure under Windows  closed by benjamin.peterson

#7402: Improve reduce example in doanddont.rst  closed by rhettinger

#7447: Sum() doc and behavior mismatch  closed by rhettinger

#7547: test_timeout should skip, not fail, when the remote host is no  closed by pitrou

#7826: support caching for 2to3  closed by benjamin.peterson

#9340: argparse parse_known_args does not work with subparsers  closed by bethard

#9352: argparse eats characters when parsing multiple merged short op  closed by bethard

#9353: argparse __all__ is incomplete  closed by bethard

#9355: argparse add_mutually_exclusive_group more than once has incor  closed by bethard

#9553: 80 failures if COLUMNS env var set to a valu  closed by bethard

#9675: segfault: PyDict_SetItem: Assertion `value' failed.  closed by jcea

#9733: Can't iterate over multiprocessing.managers.DictProxy  closed by asksol

#9779: argparse.ArgumentParser not support unicode in print help  closed by bethard

#9886: Make operator.itemgetter/attrgetter/methodcaller easier to dis  closed by rhettinger

#9926: Wrapped TestSuite subclass does not get __call__ executed  closed by michael.foord

#9981: let make_buildinfo use a temporary directory on windows  closed by krisvale

#10025: random.seed not initialized as advertised  closed by rhettinger

#10038: json.loads() on str should return unicode, not str  closed by barry

#10110: Queue doesn't recognize it is full after shrinking maxsize  closed by rhettinger

#10157: Refleaks in pythonrun.c  closed by ocean-city

#10160: operator.attrgetter slower than lambda after adding dotted nam  closed by pitrou

#10171: Ugly buttons in some Tkinter objects in Windows  closed by eric.araujo

#10173: Don't pickle TestCase instances in test_multiprocessing  closed by pitrou

#10177: PyUnicode_AsWideCharString and PyMem_Free  closed by terry.reedy

#10184: tarfile touches directories twice  closed by loewis

#10199: Move Demo/turtle under Lib/  closed by belopolsky

#10221: {}.pop('a') raises non-standard KeyError exception  closed by rhettinger

#10230: test_tarfile failure (test_extractall) on AMD64 debian paralle  closed by georg.brandl

#10233: fix test_tarfile ResourceWarnings  closed by pitrou

#10235: test_argparse depends on the COLUMNS environment variable  closed by pitrou

#10243: Packaged Pythons  closed by loewis

#10244: PEP100 has broken links  closed by fijall

#10246: uu.encode fd leak if arguments are filenames  closed by pitrou

#10247: mold builder  closed by pitrou

#10249: Fix resource warnings in test_unicodedata  closed by pitrou

#10250: Fix resource warnings in test_urllib2_localnet  closed by pitrou

#10251: Fix resource warnings in test_file  closed by pitrou

#10253: Fix fd leak in fileio.c and test resource warnings  closed by pitrou

#10257: Fix resource warnings in test_os  closed by brian.curtin

#10258: Fix resource warnings in test_tokenize  closed by brian.curtin

#10263: "python -m site" does not print path details  closed by ned.deily

#10264: Fix resource warnings in test_smtplib  closed by benjamin.peterson

#10265: Fix fd leak in sunau  closed by pitrou

#10266: uu.decode fd leak if in_file is a filename  closed by pitrou

#10268: Add --enable-loadable-sqlite-extensions option to `configure`  closed by benjamin.peterson

#10269: Fix some resource warnings in test_sax  closed by benjamin.peterson

#10275: how to know that a module is a module, a function is a functio  closed by brian.curtin

#10277: sax leaks a fd if source is a filename  closed by benjamin.peterson

#10279: test_gc failure on Windows x64  closed by pitrou

#10280: nntp_version set to the most recent advertised version  closed by pitrou

#10281: Exception raised when an NNTP overview field is absent  closed by pitrou

#10283: New parameter for an NNTP newsgroup pattern in LIST ACTIVE  closed by pitrou

#10285: Other status field flags in documentation for NNTP LIST comman  closed by pitrou

#10286: URLOpener => URLopener x2 in  closed by georg.brandl

#10290: Fix resource warnings in distutils  closed by brian.curtin

#10293: PyMemoryView object has obsolete members  closed by pitrou

#10294: Lib/test/ contains dead code  closed by brett.cannon

#10295: _socket.pyd uses winsock2, select.pyd uses winsock 1  closed by krisvale

#10300: Documentation of three PyDict_* functions  closed by benjamin.peterson

#10301: Zipfile cannot be used in "with" Statement  closed by benjamin.peterson

#10306: Weakref callback exceptions should be turned into warnings.  closed by oddthinking

#10307: compile error in readline.c  closed by orsenthil

#10313: Reassure user: test_os BytesWarning is OK  closed by r.david.murray

#10314: Improve JSON encoding with sort_keys=True  closed by pitrou

#10315: smtplib.SMTP_SSL new in 2.6  closed by georg.brandl

#10316: tkFileDialog.askopenfilenames scrambling multiple file selecti  closed by ned.deily

#10317: Add TurtleShell to turtle  closed by rhettinger

#10322: sys.argv and quoted arguments on command line  closed by fcr

#960325: "--require <feature>" option for configure/make  (fail if buil  closed by terry.reedy

#10256: Fix resource warnings in test_pkgimport  closed by brian.curtin

From debatem1 at  Fri Nov  5 18:10:34 2010
From: debatem1 at (geremy condra)
Date: Fri, 5 Nov 2010 10:10:34 -0700
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 4, 2010 at 3:40 PM, Laurens Van Houtven <lvh at> wrote:
> On Thu, Nov 4, 2010 at 5:44 AM, Allan McRae <allan at> wrote:


> What is true is that there's a new and temporary "NO ARCH" rule in the
> topic

It's your channel and you can do with it what you want, but seriously-
does this strike you as the best response to a widespread problem?
You're basically telling people to get lost, and in all caps no less.

Geremy Condra

From fuzzyman at  Fri Nov  5 18:14:02 2010
From: fuzzyman at (Michael Foord)
Date: Fri, 05 Nov 2010 17:14:02 +0000
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 05/11/2010 17:10, geremy condra wrote:
> On Thu, Nov 4, 2010 at 3:40 PM, Laurens Van Houtven<lvh at>  wrote:
>> On Thu, Nov 4, 2010 at 5:44 AM, Allan McRae<allan at>  wrote:
> <snip>
>> What is true is that there's a new and temporary "NO ARCH" rule in the
>> topic
> It's your channel and you can do with it what you want,

Actually it's a PSF run channel.

> but seriously-
> does this strike you as the best response to a widespread problem?
> You're basically telling people to get lost, and in all caps no less.

They're saying that the channel isn't the correct place to get support
on that particular issue.

All the best,


> Geremy Condra


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From j.reid at  Fri Nov  5 18:17:43 2010
From: j.reid at (John Reid)
Date: Fri, 05 Nov 2010 17:17:43 +0000
Subject: [Python-Dev] *** glibc detected *** gdb: malloc(): smallbin double
	linked list
Message-ID: <ib1e7p$b5g$>


I've compiled
Python 2.7 (r27:82500, Nov  2 2010, 09:00:37)
[GCC 4.4.3] on linux2

with the following configure options
./configure --prefix=/home/john/local/python-dbg --with-pydebug

I've installed numpy and some other packages but when I try to run my 
extension code under gdb I get the errors below. Does anyone have any 
ideas of how to track down what's happening here? I imagine I've 
misconfigured something somewhere. Is valgrind the answer?


*** glibc detected *** gdb: malloc(): smallbin double linked list 
corrupted: 0x0000000004de7ad0 ***
======= Backtrace: =========
======= Memory map: ========
00400000-00818000 r-xp 00000000 08:05 4832730 
00a17000-00a18000 r--p 00417000 08:05 4832730 
00a18000-00a25000 rw-p 00418000 08:05 4832730 
00a25000-00a43000 rw-p 00000000 00:00 0
0287f000-0b920000 rw-p 00000000 00:00 0 
7f0a1c000000-7f0a1c021000 rw-p 00000000 00:00 0
7f0a1c021000-7f0a20000000 ---p 00000000 00:00 0
7f0a20fc0000-7f0a20fd6000 r-xp 00000000 08:05 3498245 
7f0a20fd6000-7f0a211d5000 ---p 00016000 08:05 3498245 
7f0a211d5000-7f0a211d6000 r--p 00015000 08:05 3498245 
7f0a211d6000-7f0a211d7000 rw-p 00016000 08:05 3498245 
7f0a211fd000-7f0a21211000 r--p 000dc000 08:05 4825848 
7f0a21211000-7f0a21218000 r--p 00018000 08:05 4841756 
7f0a21218000-7f0a21226000 r--p 00001000 08:05 4841756 
7f0a21226000-7f0a2123e000 r--p 000bc000 08:05 4653290 
7f0a2123e000-7f0a21287000 r--p 003dd000 08:05 4653290 
7f0a21287000-7f0a21299000 r--p 00425000 08:05 4653290 
7f0a21299000-7f0a213e7000 r--p 0018c000 08:05 4653290 
7f0a213e7000-7f0a2152f000 r--p 0207c000 08:05 4653324 
7f0a2152f000-7f0a22027000 r--p 01585000 08:05 4653324 
7f0a22027000-7f0a22400000 rw-p 00000000 00:00 0
7f0a22408000-7f0a224d1000 r--p 00315000 08:05 4653290 
7f0a224d1000-7f0a224ff000 r--p 002e8000 08:05 4653290 
7f0a224ff000-7f0a22526000 r--p 00038000 08:05 4653310 
7f0a22526000-7f0a2259c000 r--p 01510000 08:05 4653324 
7f0a2259c000-7f0a2280c000 r--p 012a0000 08:05 4653324 
7f0a2280c000-7f0a2343f000 rw-p 00000000 00:00 0
7f0a23443000-7f0a2344c000 r--p 0001a000 08:05 6169643 
7f0a2344c000-7f0a2345c000 r--p 002d9000 08:05 4653290 
7f0a2345c000-7f0a23461000 r--p 0005e000 08:05 4653310 
7f0a23461000-7f0a23477000 r--p 0001f000 08:05 4653310 
7f0a23477000-7f0a2347d000 r--p 00004000 08:05 4653095 
7f0a2347d000-7f0a2350c000 r--p 00757000 08:05 4653324 
7f0a2350c000-7f0a23555000 r--p 021c3000 08:05 4653324 
7f0a23555000-7f0a2355b000 r--p 00048000 08:05 6169627 
7f0a2355b000-7f0a2356f000 r--p 0002d000 08:05 6169627 
7f0a2356f000-7f0a23575000 r--p 000b1000 08:05 3489898 
7f0a23575000-7f0a2357c000 r--p 000ab000 08:05 3489898 
7f0a2357c000-7f0a2358d000 r--p 0009b000 08:05 3489898 
7f0a2358d000-7f0a2359b000 r--p 000dd000 08:05 4827887 
7f0a2359b000-7f0a235ac000 r--p 00416000 08:05 6709644 
7f0a235ac000-7f0a23668000 rw-p 00000000 00:00 0
7f0a23668000-7f0a2366d000 r--p 00033000 08:05 3180358 
7f0a2366d000-7f0a23678000 r--p 00052000 08:05 3180358 
7f0a23678000-7f0a2367d000 r--p 0004c000 08:05 3180358 
7f0a2367d000-7f0a23690000 r--p 00039000 08:05 3180358 
7f0a23690000-7f0a23698000 r--p 0001b000 08:05 6169649 
7f0a23698000-7f0a236a7000 r--p 004fd000 08:05 3180355 
7f0a236a7000-7f0a2374f000 rw-p 00000000 00:00 0
7f0a2374f000-7f0a2375a000 r--p 0001b000 08:05 3180353 
7f0a2375a000-7f0a23762000 r--p 00065000 08:05 3180320 
7f0a23762000-7f0a23774000 r--p 000ae000 08:05 3180320 
7f0a23774000-7f0a2377a000 r--p 000a9000 08:05 3180320 
7f0a2377a000-7f0a23780000 r--p 000a4000 08:05 3180320 
7f0a23780000-7f0a237b4000 r--p 00071000 08:05 3180320 
7f0a237b4000-7f0a23881000 rw-p 00000000 00:00 0
7f0a23883000-7f0a23888000 r--p 0000f000 08:05 3146117 
7f0a23888000-7f0a23897000 r--p 000b9000 08:05 3180362 
7f0a23897000-7f0a238a1000 r--p 00118000 08:05 3180362 
7f0a238a1000-7f0a238ae000 r--p 0010c000 08:05 3180362 
7f0a238ae000-7f0a238e8000 r--p 000d3000 08:05 3180362 
7f0a238e8000-7f0a23aa4000 r--p 004e2000 08:05 4841832 
7f0a23aa4000-7f0a23b03000 r--p 0069d000 08:05 4841832 
7f0a23b03000-7f0a23b27000 r--p 004bf000 08:05 4841832 
7f0a23b27000-7f0a23bc3000 r--p 00424000 08:05 4841832 
7f0a23bc3000-7f0a23c3e000 r--p 003aa000 08:05 4841832 
7f0a23c3e000-7f0a23fca000 r--p 0001f000 08:05 4841832 
7f0a23fca000-7f0a240f6000 rw-p 00000000 00:00 0
7f0a240f8000-7f0a24118000 r--p 00121000 08:05 3180362 
7f0a24118000-7f0a24129000 r--p 0000e000 08:05 4950482 
7f0a24129000-7f0a24133000 r--p 00000000 08:05 4950482 
7f0a24133000-7f0a24154000 r--p 00155000 08:05 2900170 
7f0a24154000-7f0a241a2000 r--p 00061000 08:05 4841716 
7f0a241a2000-7f0a241a8000 r--p 0005c000 08:05 4841716 
7f0a241a8000-7f0a241bb000 r--p 0004a000 08:05 4841716 
7f0a241bb000-7f0a241ed000 r--p 00007000 08:05 4841716 
7f0a241ed000-7f0a241f4000 r-xp 00000000 08:05 2900165 
7f0a241f4000-7f0a243f3000 ---p 00007000 08:05 2900165 
7f0a243f3000-7f0a243f4000 r--p 00006000 08:05 2900165 
7f0a243f4000-7f0a243f5000 rw-p 00007000 08:05 2900165 
7f0a243f9000-7f0a2440c000 r--p 00038000 08:05 4841716 
7f0a2440c000-7f0a2441b000 r--p 00000000 08:05 4841839 
7f0a2441b000-7f0a24431000 r--p 00078000 08:05 4841828 
7f0a24431000-7f0a24439000 r--p 00071000 08:05 4841828 
7f0a24439000-7f0a2444c000 r--p 0005f000 08:05 4841828 
7f0a2444c000-7f0a2445c000 r--p 00050000 08:05 4841828 
7f0a2445c000-7f0a244a9000 r--p 00004000 08:05 4841828 
7f0a244a9000-7f0a244cc000 r--p 00063000 08:05 4841753 
7f0a244cc000-7f0a244d6000 r--p 00085000 08:05 4841753 
7f0a244d6000-7f0a244f3000 r--p 001be000 08:05 221210 
7f0a244f3000-7f0a24537000 r--p 00370000 08:05 221210 
7f0a24537000-7f0a2453e000 r--p 003b3000 08:05 221210 
7f0a2453e000-7f0a2455c000 r--p 00353000 08:05 221210 
7f0a2455c000-7f0a24583000 r--p 0032d000 08:05 221210 
7f0a24583000-7f0a24591000 r--p 00320000 08:05 221210 
7f0a24591000-7f0a2468b000 r--p 00227000 08:05 221210 
7f0a2468b000-7f0a247a8000 rw-p 00000000 00:00 0
7f0a247a8000-7f0a247aa000 r-xp 00000000 08:05 2900166 
7f0a247aa000-7f0a249a9000 ---p 00002000 08:05 2900166 
7f0a249a9000-7f0a249aa000 r--p 00001000 08:05 2900166 
7f0a249aa000-7f0a249ab000 rw-p 00002000 08:05 2900166 
7f0a249ab000-7f0a249c3000 r-xp 00000000 08:05 2900168 
7f0a249c3000-7f0a24bc2000 ---p 00018000 08:05 2900168 
7f0a24bc2000-7f0a24bc3000 r--p 00017000 08:05 2900168 
7f0a24bc3000-7f0a24bc4000 rw-p 00018000 08:05 2900168 
7f0a24bc4000-7f0a24bc8000 rw-p 00000000 00:00 0
7f0a24bc8000-7f0a24d30000 r-xp 00000000 08:05 2901949 
7f0a24d30000-7f0a24f2f000 ---p 00168000 08:05 2901949 
7f0a24f2f000-7f0a24f3c000 r--p 00167000 08:05 2901949 
7f0a24f3c000-7f0a24f54000 rw-p 00174000 08:05 2901949 
7f0a24f54000-7f0a24f58000 rw-p 00000000 00:00 0
7f0a24f58000-7f0a24fa3000 r-xp 00000000 08:05 2901950 
7f0a24fa3000-7f0a251a2000 ---p 0004b000 08:05 2901950 
7f0a251a2000-7f0a251a4000 r--p 0004a000 08:05 2901950 
7f0a251a4000-7f0a251a9000 rw-p 0004c000 08:05 2901950 
7f0a251a9000-7f0a251aa000 rw-p 00000000 00:00 0
7f0a251aa000-7f0a25324000 r-xp 00000000 08:05 2900170 
7f0a25324000-7f0a25523000 ---p 0017a000 08:05 2900170 
7f0a25523000-7f0a25527000 r--p 00179000 08:05 2900170 
7f0a25527000-7f0a25528000 rw-p 0017d000 08:05 2900170 
7f0a25528000-7f0a2552d000 rw-p 00000000 00:00 0
7f0a2552d000-7f0a2552f000 r-xp 00000000 08:05 2900174 
7f0a2552f000-7f0a2572f000 ---p 00002000 08:05 2900174 
7f0a2572f000-7f0a25730000 r--p 00002000 08:05 2900174 
7f0a25730000-7f0a25731000 rw-p 00003000 08:05 2900174 
7f0a25731000-7f0a25757000 r-xp 00000000 08:05 2900004 
7f0a25757000-7f0a25957000 ---p 00026000 08:05 2900004 
7f0a25957000-7f0a25959000 r--p 00026000 08:05 2900004 
7f0a25959000-7f0a2595a000 rw-p 00028000 08:05 2900004 
7f0a2595a000-7f0a25b98000 r-xp 00000000 08:05 4827971 
7f0a25b98000-7f0a25d98000 ---p 0023e000 08:05 4827971 
7f0a25d98000-7f0a25d9a000 r--p 0023e000 08:05 4827971 
7f0a25d9a000-7f0a25dfc000 rw-p 00240000 08:05 4827971 
7f0a25dfc000-7f0a25e0b000 rw-p 00000000 00:00 0
7f0a25e0b000-7f0a25e8d000 r-xp 00000000 08:05 2900011 
7f0a25e8d000-7f0a2608c000 ---p 00082000 08:05 2900011 
7f0a2608c000-7f0a2608d000 r--p 00081000 08:05 2900011 
7f0a2608d000-7f0a2608e000 rw-p 00082000 08:05 2900011 
7f0a2608e000-7f0a260a4000 r-xp 00000000 08:05 2900157 
7f0a260a4000-7f0a262a3000 ---p 00016000 08:05 2900157 
7f0a262a3000-7f0a262a4000 r--p 00015000 08:05 2900157 
7f0a262a4000-7f0a262a5000 rw-p 00016000 08:05 2900157 
7f0a262a5000-7f0a262e3000 r-xp 00000000 08:05 3498266 
7f0a262e3000-7f0a264e3000 ---p 0003e000 08:05 3498266 
7f0a264e3000-7f0a264e7000 r--p 0003e000 08:05 3498266 
7f0a264e7000-7f0a264e8000 rw-p 00042000 08:05 3498266 
7f0a264e8000-7f0a26521000 r-xp 00000000 08:05 3498308 
7f0a26521000-7f0a26720000 ---p 00039000 08:05 3498308 
7f0a26720000-7f0a26722000 r--p 00038000 08:05 3498308 
7f0a26722000-7f0a26728000 rw-p 0003a000 08:05 3498308 
7f0a26728000-7f0a26729000 rw-p 00000000 00:00 0
7f0a26729000-7f0a26749000 r-xp 00000000 08:05 2900131 
7f0a26749000-7f0a2674f000 r--p 00013000 08:05 6169622 
7f0a2674f000-7f0a26758000 r--p 0004c000 08:05 4841753 
7f0a26758000-7f0a267a4000 r--p 00001000 08:05 4841753 
7f0a267a4000-7f0a26857000 rw-p 00000000 00:00 0
7f0a26857000-7f0a26858000 r--p 00000000 08:05 5792628 
7f0a26858000-7f0a268da000 rw-p 00000000 00:00 0
7f0a268da000-7f0a26919000 r--p 00000000 08:05 4874536 
7f0a26919000-7f0a26920000 rw-p 00000000 00:00 0
7f0a26922000-7f0a26928000 r--p 0005e000 08:05 4841753 
7f0a26928000-7f0a26933000 r--p 00054000 08:05 4841753 
7f0a26935000-7f0a26938000 rw-p 00000000 00:00 0
7f0a26938000-7f0a2693e000 r--p 00000000 08:05 5792627 
7f0a2693e000-7f0a26945000 r--s 00000000 08:05 5417899 
7f0a26945000-7f0a26946000 r--p 00000000 08:05 4875999 
7f0a26946000-7f0a26948000 rw-p 00000000 00:00 0
7f0a26948000-7f0a26949000 r--p 0001f000 08:05 2900131 
7f0a26949000-7f0a2694a000 rw-p 00020000 08:05 2900131 
7f0a2694a000-7f0a2694b000 rw-p 00000000 00:00 0
7ffff92d6000-7ffff92f8000 rw-p 00000000 00:00 0 
7ffff93ff000-7ffff9400000 r-xp 00000000 00:00 0 
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 

From lvh at  Fri Nov  5 19:00:19 2010
From: lvh at (Laurens Van Houtven)
Date: Fri, 5 Nov 2010 19:00:19 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 5, 2010 at 6:10 PM, geremy condra <debatem1 at> wrote:
> On Thu, Nov 4, 2010 at 3:40 PM, Laurens Van Houtven <lvh at> wrote:
>> On Thu, Nov 4, 2010 at 5:44 AM, Allan McRae <allan at> wrote:
> <snip>
>> What is true is that there's a new and temporary "NO ARCH" rule in the
>> topic
> It's your channel and you can do with it what you want, but seriously-
> does this strike you as the best response to a widespread problem?
> You're basically telling people to get lost, and in all caps no less.
> Geremy Condra

It is not by any means "my channel" -- I apologize if I gave anyone
the impression that I alone decided that was going up, because that's
not true. Unfortunately Freenode does not give us the ability to be
more verbose in IRC topics. In fact, to put that up, we had to remove
something less important. As a result, NO ARCH is roughly the best we
can do. (Similarly NO LOL is really NO CHATSPEAK, but topics are
length limited.)


From lvh at  Fri Nov  5 19:08:35 2010
From: lvh at (Laurens Van Houtven)
Date: Fri, 5 Nov 2010 19:08:35 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

Whoops, pressed send too soon. This should've followed my previous email:

Unscientifically judging by the rate of people who used to have vague
problems that turned out to be Arch-related, I don't really think
anyone feels they're being told to "get lost". People ask a question
about it, which is great: answering that issue in the detail it
deserves (as you've mentioned), which is something we can't do in the
/topic but *can* do in the channel itself, takes a lot less time for
everyone and leads to the correct answer (such as "tell the package
maintainer") faster.

As soon as this dies down and it stops being an issue, we're obviously
taking it down.


From fetchinson at  Fri Nov  5 19:57:08 2010
From: fetchinson at (Daniel Fetchinson)
Date: Fri, 5 Nov 2010 19:57:08 +0100
Subject: [Python-Dev] *** glibc detected *** gdb: malloc(): smallbin
 double linked list
In-Reply-To: <ib1e7p$b5g$>
References: <ib1e7p$b5g$>
Message-ID: <>

> Hi,
> I've compiled
> Python 2.7 (r27:82500, Nov  2 2010, 09:00:37)
> [GCC 4.4.3] on linux2
> with the following configure options
> ./configure --prefix=/home/john/local/python-dbg --with-pydebug
> I've installed numpy and some other packages but when I try to run my
> extension code under gdb I get the errors below. Does anyone have any
> ideas of how to track down what's happening here? I imagine I've
> misconfigured something somewhere. Is valgrind the answer?
> Thanks,
> John.

Hi John, the right place for asking such questions is the python
mailing list python-list at, please see

This python-dev list is for the development *of* python and not
development *with* python. For the latter python-list is the
appropriate forum.


Psss, psss, put it down! -

From merwok at  Sat Nov  6 01:59:24 2010
From: merwok at (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Sat, 06 Nov 2010 01:59:24 +0100
Subject: [Python-Dev] [Python-checkins] r86170 - in
 python/branches/py3k: Doc/library/stdtypes.rst Lib/test/
 Misc/NEWS Objects/stringlib/string_format.h Objects/unicodeobject.c
In-Reply-To: <>
References: <>
Message-ID: <>

> Author: eric.smith
> Date: Thu Nov  4 18:06:58 2010
> New Revision: 86170
> Log:
> Issue #6081: Add str.format_map. str.format_map(mapping) is similar to str.format(**mapping), except mapping does not get converted to a dict.

> Modified: python/branches/py3k/Doc/library/stdtypes.rst
> ==============================================================================
> --- python/branches/py3k/Doc/library/stdtypes.rst	(original)
> +++ python/branches/py3k/Doc/library/stdtypes.rst	Thu Nov  4 18:06:58 2010
> @@ -1038,6 +1038,14 @@
>     that can be specified in format strings.
> +.. method:: str.format_map(mapping)
> +
> +   Similar to ``str.forrmat(**mapping)``, except that ``mapping`` is

Yarrr me hearrties, it be forrrmat!

From debatem1 at  Sat Nov  6 03:38:21 2010
From: debatem1 at (geremy condra)
Date: Fri, 5 Nov 2010 19:38:21 -0700
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 5, 2010 at 10:14 AM, Michael Foord
<fuzzyman at> wrote:
> On 05/11/2010 17:10, geremy condra wrote:
>> On Thu, Nov 4, 2010 at 3:40 PM, Laurens Van Houtven<lvh at>
>> wrote:
>>> On Thu, Nov 4, 2010 at 5:44 AM, Allan McRae<allan at> wrote:
>> <snip>
>>> What is true is that there's a new and temporary "NO ARCH" rule in the
>>> topic
>> It's your channel and you can do with it what you want,
> Actually it's a PSF run channel.
>> but seriously-
>> does this strike you as the best response to a widespread problem?
>> You're basically telling people to get lost, and in all caps no less.
> They're saying that the channel isn't the correct place to get support on
> that particular issue.

In the same way that telling someone to RTFM n00b is the same thing as
telling them to kindly refer to the documents produced by man, yes.

As you said during the "python 2 or 3" discussion some months back
"given the topic is far more nuanced than an IRC topic can express
maybe that just isn't the right place for it".

Geremy Condra

From ezio.melotti at  Sat Nov  6 05:44:42 2010
From: ezio.melotti at (Ezio Melotti)
Date: Sat, 06 Nov 2010 06:44:42 +0200
Subject: [Python-Dev] Summary of Python tracker Issues
In-Reply-To: <>
References: <>
Message-ID: <>


On 05/11/2010 19.08, Python tracker wrote:
> ACTIVITY SUMMARY (2010-10-29 - 2010-11-05)
> Python tracker at
> To view or respond to any of the issues listed below, click on the issue.
> Do NOT respond to this message.
> Issues counts and deltas:
>    open    2514 (+17)
>    closed 19597 (+78)
>    total  22111 (+95)

as suggested in recent mails[0][1] I changed these values to represent 
the deltas with the previous week.
Now let's try to keep the "open" delta negative ;)

Best Regards,
Ezio Melotti


From merwok at  Sat Nov  6 06:00:16 2010
From: merwok at (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Sat, 06 Nov 2010 06:00:16 +0100
Subject: [Python-Dev] [Python-checkins] r86170 - in
 python/branches/py3k: Doc/library/stdtypes.rst Lib/test/
 Misc/NEWS Objects/stringlib/string_format.h Objects/unicodeobject.c
In-Reply-To: <>
References: <>
Message-ID: <>

According to #python-dev, there's no need to go through
python-checkins/-dev for typos, so I fixed this one in r86247.

Piratical regards

From eric at  Sat Nov  6 11:43:45 2010
From: eric at (Eric Smith)
Date: Sat, 06 Nov 2010 06:43:45 -0400
Subject: [Python-Dev] [Python-checkins] r86170 - in
 python/branches/py3k:	Doc/library/stdtypes.rst Lib/test/
 Misc/NEWS	Objects/stringlib/string_format.h Objects/unicodeobject.c
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/6/10 1:16 AM, Ezio Melotti wrote:
>> +.. method:: str.format_map(mapping)
>> +
>> + Similar to ``str.forrmat(**mapping)``, except that ``mapping`` is
>> + used directly and not copied to a :class:`dict` . This is useful
>> + if for example ``mapping`` is a dict subclass.
> Including the "__missing__" example might be better. From the
> description, it's not clear why str.format(**dict_subclass) wouldn't
> work and that the previous line refers to the fact that ** converts the
> mapping in a plain dict, thus making __missing__ and other things unusable.

I agree, but I was hesitant to add a long example. But thinking about it 
some more I think I'll add it.
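
A minimal sketch of the __missing__ case discussed here, assuming an
interpreter where str.format_map exists (3.2 or later). The Default class
and its fallback behaviour are illustrative names, not taken from the
checkin:

```python
class Default(dict):
    """Illustrative dict subclass: unknown keys fall back to the key name."""
    def __missing__(self, key):
        return key

mapping = Default(name='Guido')

# format_map uses the mapping directly, so __missing__ is consulted.
direct = '{name} was born in {country}'.format_map(mapping)

# ** unpacking copies the items into a plain dict first, so the subclass
# behaviour is lost and the missing key raises KeyError.
try:
    '{name} was born in {country}'.format(**mapping)
    unpacked_ok = True
except KeyError:
    unpacked_ok = False
```

Here direct holds the formatted string with the fallback applied, while
the ** form fails because the plain dict has no 'country' key.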

>> +
>> + self.assertEqual('{foo._x}'.format_map({'foo': C(20)}), '20')
>> +
> The classes D-H seem unused, did you forget to add some tests or am I
> missing something?

It was a big copy job from the other tests. I'll review them all.

>> +PyDoc_STRVAR(format_map__doc__,
>> + "S.format_map(mapping) -> str\n\
>> +\n\
>> +");
>> +
> Wouldn't a more verbose docstring be better? (str.format seems to lack
> one too)

Undoubtedly true. Any suggestions? How about (for .format): "Returns S 
formatted with substitutions from args and kwargs."

I also see that __format__'s docstring is similarly terse.

Thanks for reviewing!


From victor.stinner at  Sat Nov  6 12:19:55 2010
From: victor.stinner at (Victor Stinner)
Date: Sat, 6 Nov 2010 12:19:55 +0100
Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2 3.x"
Message-ID: <>


I noticed "OSError: [Errno 23] Too many open files in system" errors on your 
FreeBSD buildbot. I would like to know if you configured a limit on the open 
files or maybe of child processes on this buildbot or not, or if it is a 
failure in Python?

The first error always occurs in the first test of test_concurrent_futures. It's 
maybe because this test uses a lot of open files or processes?


From martin at  Sat Nov  6 12:31:39 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 06 Nov 2010 12:31:39 +0100
Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2
 3.x"	buildbot
In-Reply-To: <>
References: <>
Message-ID: <>

Am 06.11.2010 12:19, schrieb Victor Stinner:
> Hi,
> I noticed "OSError: [Errno 23] Too many open files in system" errors on your 
> FreeBSD buildbot. I would like to know if you configured a limit on the open 
> files or maybe of child processes on this buildbot or not, or if it is a 
> failure in Python?

Before David responds: feel free to temporarily put a "limits -a"
command into the build process, or some such.
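
The same information can also be read from inside the interpreter. This
is only a sketch, assuming a Unix platform where the standard resource
module is available; describe_limits is a hypothetical helper name, and
RLIMIT_NPROC is looked up defensively because not every platform
defines it:

```python
import resource

def describe_limits():
    """Return {name: (soft, hard)} for the limits relevant to this report."""
    limits = {}
    for name in ("RLIMIT_NOFILE", "RLIMIT_NPROC"):
        const = getattr(resource, name, None)  # may be missing on some systems
        if const is None:
            continue
        limits[name] = resource.getrlimit(const)
    return limits

if __name__ == "__main__":
    for name, (soft, hard) in describe_limits().items():
        print(name, "soft =", soft, "hard =", hard)
```

These are per-process limits inherited by the test run; they say nothing
about system-wide settings.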


From lvh at  Sat Nov  6 13:53:12 2010
From: lvh at (Laurens Van Houtven)
Date: Sat, 6 Nov 2010 13:53:12 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

I'm sorry you feel that way.

Experience teaches us that people do speak up more than they tend to
keep schtum. We do get feedback on most things, including the "NO
ARCH" rule. At least so far, responses have not been anywhere near
what you'd expect if you'd tell people to "RTFM n00b" (in terms of
defensiveness and verbal hostility, at least). From the things I've
seen (and I've asked other regulars, they seem to agree), the related
interactions have been short, clear, and cordial. The first is
important to #python because it keeps the signal to noise ratio high.
The second is important to the person with the broken package, so they
know what to do to fix it and how to get it fixed for other people as
well. The last part is important to everyone.

As usual, any and all policy is up for debate, but I really see too
much result (not just for #python, but for the people with the broken
package as well) and too little badness to consider taking it down
right now. I believe I speak for all of the ops and regulars in
#python when I say that. Even Allan himself has said that he agrees
with the rule, and yes: I do honestly believe that right now, it is
the best thing we can actually *do*. That doesn't mean it has to be
the best thing bar none: like with software projects, "patches
welcome", if you have any suggestions for improving this, we're all
ears. However, I've already said: this is temporary, it's going down
as soon as we stop getting feedback on it. (Checking if that has
occurred or not is in my tickler file for next Friday.)

It has already been pointed out in this thread that Arch is a distro
with a target audience of above average knowledge. Yes, the rule does
expect people to understand the difference between an Arch-specific
problem and something that's likely to be whatever distro
it is you're running. Even the people who do feel instantly offended
and just leave without asking questions, hey, at least they're likely
to go to Arch-specific spots next for support, and that's the right
place (FWIW I do not believe this to be a significant amount of

Also, sometimes pointing people to the FM is just the only reasonable
thing left to do. If you've got recent-ish logs (24h) I can give you a
recent prime example of that. I do doubt that anyone used terminology
like 'RTFM n00b'. If you think 'NO ARCH' is the same kind of language,
well, we'll just have to agree to disagree there. I could see how
someone would think that, but IRC typically forces people to be more
brief, and a lot of people understandably mistake that for being blunt
or even downright rude. That's an unfortunate side effect of the
medium that pretty much every large channel I know of has had to deal
with in some way.


From martin at  Sat Nov  6 14:41:08 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 06 Nov 2010 14:41:08 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

> But the previous consensus (at least, as I, and presumably many other
> people understood it) was that python2 would remain the owner of the
> name "/usr/bin/python" for the indefinite future, and python3 would
> be invoked with /usr/bin/python3.

Can you cite references for that (not that other people agree, but that
this was consensus)? I couldn't find any summary report of the 2009
language summit, and, despite having been present there, I don't recall
that aspect even being discussed.

Instead, I recall that a decision was made (and I'm not sure whether
with consensus or not) that "make install" would install
/usr/bin/python3, for the time being. Period.

So I don't recall a decision that there shouldn't be a python2
binary, nor a decision that anything is done indefinitely
(it may be that the decision was actually just about 3.1 - changing
it again for 3.2 would require another decision, but certainly can't
be ruled out categorically).


From g.brandl at  Sat Nov  6 15:38:22 2010
From: g.brandl at (Georg Brandl)
Date: Sat, 06 Nov 2010 15:38:22 +0100
Subject: [Python-Dev] Summary of Python tracker Issues
In-Reply-To: <>
References: <>
Message-ID: <ib3pb7$ot$>

Am 06.11.2010 05:44, schrieb Ezio Melotti:
> Hi,
> On 05/11/2010 19.08, Python tracker wrote:
>> ACTIVITY SUMMARY (2010-10-29 - 2010-11-05)
>> Python tracker at
>> To view or respond to any of the issues listed below, click on the issue.
>> Do NOT respond to this message.
>> Issues counts and deltas:
>>    open    2514 (+17)
>>    closed 19597 (+78)
>>    total  22111 (+95)
> as suggested in recent mails[0][1] I changed these values to represent 
> the deltas with the previous week.
> Now let's try to keep the "open" delta negative ;)

That is a worthy goal, however the difference between the "open" and "closed"
deltas is already quite an achievement and shows that our triage works.


Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.

From rdmurray at  Sat Nov  6 15:46:57 2010
From: rdmurray at (R. David Murray)
Date: Sat, 06 Nov 2010 10:46:57 -0400
Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2
	3.x" buildbot
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, 06 Nov 2010 12:31:39 +0100, Martin wrote:
> Am 06.11.2010 12:19, schrieb Victor Stinner:
> > Hi,
> > 
> > I noticed "OSError: [Errno 23] Too many open files in system" errors on your 
> > FreeBSD buildbot. I would like to know if you configured a limit on the open 
> > files or maybe of child processes on this buildbot or not, or if it is a 
> > failure in Python?
> Before David responds: feel free to put temporarily a "limits -a"
> command into the build process, or some such.

You might also want to check the value of sysctl kern.maxfiles.  On the
FreeBSD (6.3) systems to which I have access the default value for
kern.maxfiles appears to be 12328, but that information is of limited
utility since its value is set based on kern.maxusers, which in
turn is set at boot time based primarily on the available system memory.
The systems I got the above number from have 1GB of memory.
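
The wording "Too many open files in system" is the message for ENFILE
(the system-wide file table is full), as opposed to EMFILE (the
per-process descriptor limit), which is why the system-wide setting is
worth checking. A small sketch of telling the two apart in Python;
classify_open_files_error is a hypothetical helper, not something in the
test suite:

```python
import errno

def classify_open_files_error(exc):
    """Say which limit an OSError about open files points at."""
    if exc.errno == errno.ENFILE:
        return "system-wide"   # kernel file table exhausted
    if exc.errno == errno.EMFILE:
        return "per-process"   # this process hit its descriptor limit
    return "other"
```

An exception classified as "system-wide" would suggest looking at the
host configuration rather than at the descriptors held by the test
itself.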

R. David Murray                            

From rdmurray at  Sat Nov  6 16:42:03 2010
From: rdmurray at (R. David Murray)
Date: Sat, 06 Nov 2010 11:42:03 -0400
Subject: [Python-Dev] Summary of Python tracker Issues
In-Reply-To: <ib3pb7$ot$>
References: <>
	<> <ib3pb7$ot$>
Message-ID: <>

On Sat, 06 Nov 2010 15:38:22 +0100, Georg Brandl <g.brandl at> wrote:
> Am 06.11.2010 05:44, schrieb Ezio Melotti:
> > Hi,
> > 
> > On 05/11/2010 19.08, Python tracker wrote:
> >> ACTIVITY SUMMARY (2010-10-29 - 2010-11-05)
> >> Python tracker at
> >>
> >> To view or respond to any of the issues listed below, click on the issue.
> >> Do NOT respond to this message.
> >>
> >> Issues counts and deltas:
> >>    open    2514 (+17)
> >>    closed 19597 (+78)
> >>    total  22111 (+95)
> > 
> > as suggested in recent mails[0][1] I changed these values to represent 
> > the deltas with the previous week.
> > Now let's try to keep the "open" delta negative ;)
> That is a worthy goal, however the difference between the "open" and "closed"
> deltas is already quite an achievement and shows that our triage works.


We did have negative open deltas for several weeks running in October.
Kudos to everyone involved, and let's do it some more :)  I'm looking
forward to making a non-trivial dent in the open count during the bug
weekend on the 20th/21st.

R. David Murray                            

From martin at  Sat Nov  6 17:00:42 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 06 Nov 2010 17:00:42 +0100
Subject: [Python-Dev] [Python-checkins] r86264
	-	python/branches/release27-maint/Lib/distutils/
In-Reply-To: <>
References: <>
Message-ID: <>

> Remove one trace of Mac OS 9 support (#7908 follow-up)
> This was overlooked in r80804.  This change is not really a bug fix,

I'm skeptical that this change should be carried out, then.

It's easy to argue that this can't possibly hurt (but I can certainly
come up with code that will break under that change); however,
I fail to see what good it does.


From merwok at  Sat Nov  6 17:26:50 2010
From: merwok at (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Sat, 06 Nov 2010 17:26:50 +0100
Subject: [Python-Dev] [Python-checkins] r86264 -
In-Reply-To: <>
References: <>
Message-ID: <>

> I'm skeptical that this change should be carried out, then.
Yes, I asked myself the same question.  I had done the svnmerge from
py3k, and when I saw that only one change was left, I wondered whether I
should commit or revert.

> It's easy to argue that this can't possibly hurt (but I can certainly
> come up with code that will break under that change); however, I fail
> to see what good it does.
This was a private function used on an unsupported platform, this should
do no harm.  We've been bitten by "should do no harm" before though, so
I am ready to revert this change (and learn from this :)


From martin at  Sat Nov  6 17:33:06 2010
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sat, 06 Nov 2010 17:33:06 +0100
Subject: [Python-Dev] [Python-checkins] r86264 -
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

> This was a private function used on an unsupported platform, this should
> do no harm.  We've been bitten by "should do no harm" before though, so
> I am ready to revert this change (and learn from this :)

Do as you like. I won't insist on it being reverted.

It's rather a matter of agreeing when moving forward: IMO, mere style
changes, code cleanup etc shouldn't be applied to the bug fix branches,
as their only purpose is to provide bug fixes for existing users.


From tjreedy at  Sat Nov  6 17:55:45 2010
From: tjreedy at (Terry Reedy)
Date: Sat, 06 Nov 2010 12:55:45 -0400
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>	<>	<>	<>	<>
Message-ID: <ib41ag$31f$>

On 11/6/2010 8:53 AM, Laurens Van Houtven wrote:

> Experience teaches us that people do speak up more than they tend to
> keep schtum. We do get feedback on most things, including the "NO
> ARCH" rule.

It strikes me as reasonable to warn people that they would be wasting 
their time typing out a multiline question about problems with the new 
Arch distro. They can always ask briefly 'Why NO ARCH' and get back 
'Beyond our knowledge' (or a longer pasted response).

Terry Jan Reedy

From tjreedy at  Sat Nov  6 18:01:56 2010
From: tjreedy at (Terry Reedy)
Date: Sat, 06 Nov 2010 13:01:56 -0400
Subject: [Python-Dev] Summary of Python tracker Issues
In-Reply-To: <>
References: <>	<>
Message-ID: <ib41m3$4in$>

On 11/6/2010 11:42 AM, R. David Murray wrote:
> On Sat, 06 Nov 2010 15:38:22 +0100, Georg Brandl<g.brandl at>  wrote:
>> Am 06.11.2010 05:44, schrieb Ezio Melotti:
>>> Hi,
>>> On 05/11/2010 19.08, Python tracker wrote:
>>>> ACTIVITY SUMMARY (2010-10-29 - 2010-11-05)
>>>> Python tracker at
>>>> To view or respond to any of the issues listed below, click on the issue.
>>>> Do NOT respond to this message.
>>>> Issues counts and deltas:
>>>>     open    2514 (+17)

This seems wrong. A default search for open issues returns 2452 and it 
was about the same yesterday just a few hours after the report.

>>>>     closed 19597 (+78)
>>>>     total  22111 (+95)
>>> as suggested in recent mails[0][1] I changed these values to represent
>>> the deltas with the previous week.
>>> Now let's try to keep the "open" delta negative ;)

Since there were more issues closed than opened I think it really was. 
Anyway, we are down 300 from the 2750 peak.

>> That is a worthy goal, however the difference between the "open" and "closed"
>> deltas is already quite an achievement and shows that our triage works.
> Agreed.
> We did have negative open deltas for several weeks running in October.
> Kudos to everyone involved, and lets do it some more :)  I'm looking
> forward to making a non-trivial dent in the open count during the bug
> weekend on the 20th/21st.

Terry Jan Reedy

From tjreedy at  Sat Nov  6 18:22:35 2010
From: tjreedy at (Terry Reedy)
Date: Sat, 06 Nov 2010 13:22:35 -0400
Subject: [Python-Dev] [Python-checkins] r86264 -
In-Reply-To: <>
References: <>	<>
	<> <>
Message-ID: <ib42sp$9go$>

On 11/6/2010 12:33 PM, "Martin v. Löwis" wrote:
>> This was a private function used on an unsupported platform, this should
>> do no harm.  We've been bitten by "should do no harm" before though, so
>> I am ready to revert this change (and learn from this :)
> Do as you like. I won't insist on it being reverted.
> It's rather a matter of agreeing when moving forward: IMO, mere style
> changes, code cleanup etc shouldn't be applied to the bug fix branches,
> as their only purpose is to provide bug fixes for existing users.

The omission of the deletion from the 5/5 revision was a bug in that 
revision. If the removal of OS9 support was documented (announced), 
which I presume it was, then one could consider any visible trace 
remaining to be a bug.

Perhaps the policy on code cleanup should be a bit more liberal for 2.7 
*because* it will be maintained for several years and *because* there is 
no newer 2.x branch to apply changes to. If I were going to maintain 2.7 
for several years, I would want to have the benefit of gradual 
improvements that make maintenance easier.

Applying such a cleanup to 3.1, say, is less necessary because a) the 
code will soon be end-of-lifed and not maintained much and b) it can be 
applied to the newer (3.2) branch and benefit that and all future 
releases thereafter.

Terry Jan Reedy

From ezio.melotti at  Sat Nov  6 18:25:37 2010
From: ezio.melotti at (Ezio Melotti)
Date: Sat, 06 Nov 2010 19:25:37 +0200
Subject: [Python-Dev] Summary of Python tracker Issues
In-Reply-To: <ib41m3$4in$>
References: <>	<>	<ib3pb7$ot$>	<>
Message-ID: <>

On 06/11/2010 19.01, Terry Reedy wrote:
> On 11/6/2010 11:42 AM, R. David Murray wrote:
>> On Sat, 06 Nov 2010 15:38:22 +0100, Georg Brandl<g.brandl at>  
>> wrote:
>>> Am 06.11.2010 05:44, schrieb Ezio Melotti:
>>>> Hi,
>>>> On 05/11/2010 19.08, Python tracker wrote:
>>>>> ACTIVITY SUMMARY (2010-10-29 - 2010-11-05)
>>>>> Python tracker at
>>>>> To view or respond to any of the issues listed below, click on the 
>>>>> issue.
>>>>> Do NOT respond to this message.
>>>>> Issues counts and deltas:
>>>>>     open    2514 (+17)
> This seems wrong. A default search for open issues returns 2452 and it 
> was about the same yesterday just a few hours after the report.

That's because the "open" count includes about 25 languishing and 39 
pending issues (technically they are still open).

>>>>>     closed 19597 (+78)
>>>>>     total  22111 (+95)
>>>> as suggested in recent mails[0][1] I changed these values to represent
>>>> the deltas with the previous week.
>>>> Now let's try to keep the "open" delta negative ;)
> Since there were more issues closed than opened I think it really was. 
> Anyway, we are down 300 from the 2750 peak.
>>> That is a worthy goal, however the difference between the "open" and 
>>> "closed"
>>> deltas is already quite an achievement and shows that our triage works.

Yes, even if having a negative delta would be best, having the "closed" 
delta higher than the "open" one is still very good.
So congrats to everyone who worked and works to make this possible.

>> Agreed.
>> We did have negative open deltas for several weeks running in October.
>> Kudos to everyone involved, and lets do it some more :)  I'm looking
>> forward to making a non-trivial dent in the open count during the bug
>> weekend on the 20th/21st.
Best Regards,
Ezio Melotti

From g.rodola at  Sat Nov  6 18:53:30 2010
From: g.rodola at (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Sat, 6 Nov 2010 18:53:30 +0100
Subject: [Python-Dev] SSH access against buildbot boxes
Message-ID: <>

Sorry in advance if this sounds a little indiscreet, but I think it
would be great if we had SSH access to some of the computers
used to host buildbots.
Personally, I would find this particularly useful for OSX since it's
one of the few OSes I can't manage to virtualize and which often
causes me problems.
Some examples: (this one also involves Solaris)

In such cases I would find it easier to be able to connect to the
machine and test myself rather than create a separate branch, commit,
schedule a buildbot run, wait for it to complete and see whether
everything is "green".

On the other hand, I perfectly understand that opening up blanket ssh
access is not something everyone is comfortable with doing.
AFAICR there was someone who was setting up an environment to solve
exactly this problem but I'm not sure whether this is already usable.

Best regards,

--- Giampaolo

From rrr at  Sat Nov  6 20:18:43 2010
From: rrr at (Ron Adam)
Date: Sat, 06 Nov 2010 14:18:43 -0500
Subject: [Python-Dev] Summary of Python tracker Issues
In-Reply-To: <ib41m3$4in$>
References: <>	<>	<ib3pb7$ot$>	<>
Message-ID: <>

On 11/06/2010 12:01 PM, Terry Reedy wrote:
> On 11/6/2010 11:42 AM, R. David Murray wrote:
>> On Sat, 06 Nov 2010 15:38:22 +0100, Georg Brandl<g.brandl at> wrote:
>>> Am 06.11.2010 05:44, schrieb Ezio Melotti:
>>>> Hi,
>>>> On 05/11/2010 19.08, Python tracker wrote:
>>>>> ACTIVITY SUMMARY (2010-10-29 - 2010-11-05)
>>>>> Python tracker at
>>>>> To view or respond to any of the issues listed below, click on the
>>>>> issue.
>>>>> Do NOT respond to this message.
>>>>> Issues counts and deltas:
>>>>> open 2514 (+17)
> This seems wrong. A default search for open issues returns 2452 and it
> was about the same yesterday just a few hours after the report.
>>>>> closed 19597 (+78)
>>>>> total 22111 (+95)
>>>> as suggested in recent mails[0][1] I changed these values to represent
>>>> the deltas with the previous week.
>>>> Now let's try to keep the "open" delta negative ;)
> Since there were more issues closed than opened I think it really was.
> Anyway, we are down 300 from the 2750 peak.

Current status from the tracker...

    don't care:     22134
    not closed:      2491
    not selected:       1

    open:            2451
    languishing:       25
    pending:           39
    closed:         19604

That gives us...

      2451 open
         1 not selected
        39 pending
        25 languishing
      2516 Total open

      2451 open
        39 languishing
         1 not selected
      2491 total "not closed"

      19604 closed
       2491 not closed
         39 pending
      22134 Total issues

My guess as to how this got this way, is that different fields were merged 
at some time where the meanings didn't quite match up. <shrug>

It would be nicer if...

    closed + not_closed = total issues

    closed + open + not_selected = total issues

Pending and languishing should be keywords or sub categories of open.
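
The sums above can be re-checked mechanically. A minimal sketch, using only the counts quoted in this message; the variable names are illustrative and are not tracker field names:

```python
# Counts as returned by the tracker searches quoted in this message.
counts = {"open": 2451, "languishing": 25, "pending": 39, "closed": 19604}
not_selected = 1
not_closed = 2491   # result of the "not closed" search
dont_care = 22134   # result of the "don't care" search, i.e. every issue

# First sum: everything that is not in the closed state.
total_open = (counts["open"] + not_selected
              + counts["pending"] + counts["languishing"])
print(total_open)  # 2516

# Second sum: open plus the 39 figure plus "not selected".
print(counts["open"] + 39 + not_selected == not_closed)  # True

# Third sum: the grand total as added up in the message.
print(counts["closed"] + not_closed + counts["pending"] == dont_care)  # True
```

The first sum lands within two of the 2514 "open" figure in the weekly report, which fits the idea that the report counts pending and languishing issues as open.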


From martin at  Sat Nov  6 20:15:20 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 06 Nov 2010 20:15:20 +0100
Subject: [Python-Dev] SSH access against buildbot boxes
In-Reply-To: <>
References: <>
Message-ID: <>

> sorry in advance if this sounds a little indiscreet, but I think it
> would be great if we'd have SSH access against some of the computers
> used to host buildbots.

The only way this can work is on a bilateral basis. If you need shell
access to one of the build slaves, contact the slave operator.


From rrr at  Sat Nov  6 20:30:47 2010
From: rrr at (Ron Adam)
Date: Sat, 06 Nov 2010 14:30:47 -0500
Subject: [Python-Dev] Summary of Python tracker Issues
In-Reply-To: <>
References: <>	<>	<ib3pb7$ot$>	<>	<ib41m3$4in$>
Message-ID: <>

> Current status from the tracker...
> don't care: 22134
> not closed: 2491
> not selected: 1
> open: 2451
> languishing: 25
> pending: 39
> closed: 19604
> That gives us...
> 2451 open
> 1 not selected
> 39 pending
> 25 languishing
> ----
> 2516 Total open
> 2451 open
> 39 languishing

Should be 39 pending here, not languishing.


> 1 not selected
> ----
> 2491 total "not closed"
> 19604 closed
> 2491 not closed
> 39 pending
> -----
> 22134 Total issues

From eric at  Sat Nov  6 20:38:14 2010
From: eric at (Eric Smith)
Date: Sat, 06 Nov 2010 15:38:14 -0400
Subject: [Python-Dev] [Python-checkins] r86170 - in
 python/branches/py3k: Doc/library/stdtypes.rst Lib/test/
 Misc/NEWS	Objects/stringlib/string_format.h Objects/unicodeobject.c
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 11/6/10 6:43 AM, Eric Smith wrote:
> On 11/6/10 1:16 AM, Ezio Melotti wrote:

<issues with format_map documentation and docstrings for str.format and 

I've addressed all of these issues, although if anyone has suggestions 
for the docstrings or documentation they'd be appreciated.

Thanks again.


From me+python at  Sat Nov  6 21:19:32 2010
From: me+python at (Stephen Hansen)
Date: Sat, 06 Nov 2010 13:19:32 -0700
Subject: [Python-Dev] SSH access against buildbot boxes
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/6/10 10:53 AM, Giampaolo Rodolà wrote:
> Personally, I would find this particularly useful for OSX since it's
> one of the few OSes I can't manage to virtualize and which often
> causes me problems.

Although I said this on IRC, I'll repeat the offer to the list for those
not present -- I'm operating the Leopard and Snow Leopard buildslaves,
and although I try to be proactive watching for failures, if someone
wants to test something out before committing they can poke me and I'd
be happy to help.

I can either run a test or two and report back to you, or if you need it
I can open up SSH or even VNC access on a temporary/as-needed basis.
Heck, if you're doing some longer-term work that is more than just
debugging a certain issue and would need access over a longer period of
time, I can probably work something out for you.

I'm just not comfortable opening up such access except on a
person-by-person/case-by-case basis.

I idle on #python-dev as "ixokai" -- you can ping me there and I
generally wake up rather promptly. That, or email works too.


   Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+python (AT) ixokai (DOT) io
   ... Blog:

From at  Sat Nov  6 22:36:44 2010
From: at (David Bolen)
Date: Sat, 6 Nov 2010 17:36:44 -0400
Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2
	3.x" buildbot
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Nov 6, 2010 at 7:19 AM, Victor Stinner <victor.stinner at> wrote:
> I noticed "OSError: [Errno 23] Too many open files in system" errors on your
> FreeBSD buildbot. I would like to know if you configured a limit on the open
> files or maybe of child processes on this buildbot or not, or if it is a
> failure in Python?
> The first error always occurs in the first test of test_concurrent_futures. It's
> maybe because this test uses a lot of open files or processes?

I couldn't find the matching failures that you're talking about, but
then I figured out you mean the FreeBSD7 (7.2) buildbot, not the
FreeBSD (6.4) buildbot ....

I haven't configured any specific limits with respect to open files.
On both FreeBSD buildbots, kern.maxfiles is 3600 and
kern.maxfilesperproc is 3060.  Both have limits of 1530 processes.
The latter also agrees with the maximum descriptors as shown by limit.
 In regards to R. David Murray's response, the buildbots are VMs with
limited memory, so the dynamic calculation he references for
descriptors is much lower than his system.
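
For comparison with the sysctl values above, the per-process descriptor limit can also be read from inside Python. A minimal sketch, assuming a Unix-like platform where the standard resource module is available; it reports only the per-process limit, not the system-wide kern.maxfiles value:

```python
import resource

# RLIMIT_NOFILE is the per-process cap on open file descriptors.
# getrlimit() returns a (soft, hard) pair; either value may be
# resource.RLIM_INFINITY when no limit is set.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft)
print("hard limit:", hard)
```

Neither number says anything about semaphores, so a "too many open files" error raised from a semaphore call would not show up here.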

Looks like the reason FreeBSD is ok and FreeBSD7 isn't is that, on the
former, the relevant tests don't run due to lack of POSIX semaphore support.  I
manually enabled their use on FreeBSD7 a while back (11/2009,
issue7272) since they aren't on by default.  I'd be surprised if at
least test_multiprocessing didn't pass at that point (since that's
what the issue was for) but even it seems to be generating the open
files error now.  The buildbots haven't changed, but I suppose the
tests might just have grown in the number of files they need over

I noticed that the failures seem to always be on a semaphore call.
Some quick googling found a few references that seem to imply that
the number of POSIX semaphores is very limited (like 30), and can't
be changed without recompiling the kernel from source.  So that's not
so big a threshold for the tests to have perhaps started crossing
since issue7272 was fixed.  Certainly seems more likely than 3000+
files or 1500+ processes.

I wonder if it's possible to deduce if this started recently or not?
The web buildbot interface doesn't go back that far, and an additional
complexity is that the FreeBSD builds tend to have various errors
somewhat consistently over time, but perhaps there are server logs we
can grep for this particular error?

Not sure if the best approach at this point is to see if the tests can
use fewer semaphores, skip these tests under FreeBSD 7 like 6, or if
it's important enough to compile a new kernel with a higher semaphore
limit.

-- David

From at  Sat Nov  6 23:35:44 2010
From: at (David Bolen)
Date: Sat, 06 Nov 2010 18:35:44 -0400
Subject: [Python-Dev] SSH access against buildbot boxes
References: <>
Message-ID: <>

Giampaolo Rodolà <g.rodola at> writes:

> In such cases I would find more easy to be able to connect to the
> machine and test myself rather than create a separate branch, commit,
> schedule a buildbot run, wait for it to complete and see whether
> everything is "green".

I agree with both Stephen and Martin's prior responses.  For me, I'm
happy to arrange for individual access on a case by case basis, but am
less comfortable leaving access enabled permanently.

I've arranged access to both my Windows and FreeBSD buildbots in the
past, and while I suspect my OSX Tiger buildbot may be a little less
interesting than the other OSX boxes, the offer remains open for any
of my buildbots.

-- David

From victor.stinner at  Sun Nov  7 04:30:54 2010
From: victor.stinner at (Victor Stinner)
Date: Sun, 7 Nov 2010 04:30:54 +0100
Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2
	3.x" buildbot
In-Reply-To: <>
References: <>
Message-ID: <>

On Saturday 06 November 2010 22:36:44 you wrote:
> I couldn't find the matching failures that you're talking about, but
> then I figured out you mean the FreeBSD7 (7.2) buildbot, not the
> FreeBSD (6.4) buildbot ....

Search "test_concurrent_futures" in:

I specified "x86 FreeBSD 7.2 3.x" in the email title.

> (...)
> I noticed that the failures seem to always be on a semaphore call.
> Some quick googling found a few references that seems to imply that
> the number of posix semaphores are very limited (like 30), and can't
> be changed without recompiling the kernel from source.  So that's not
> so big a threshold for the tests to have perhaps started crossing
> since issue7272 was fixed.  Certainly seems more likely than 3000+
> files or 1500+ processes.

Nice catch. The problem is the total number of semaphores: I reproduced the 
bug in my FreeBSD 8 VM. The first test fails at the creation of the 31st 
semaphore.

The first failing test is test_all_completed. And it looks like this test 
doesn't destroy the semaphore at exit: my counter (based on __init__/__del__) 
is still at 15 when exiting the test!
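
A counter of the kind described above could look like the following. This is a hypothetical sketch, not the code used for this measurement: TrackedResource stands in for whatever object wraps a semaphore, and the count relies on CPython's reference counting running __del__ as soon as the last reference goes away.

```python
class TrackedResource:
    """Hypothetical stand-in for an object that wraps a semaphore."""

    live = 0  # instances created but not yet finalized

    def __init__(self):
        TrackedResource.live += 1

    def __del__(self):
        TrackedResource.live -= 1


def clean_test():
    # The list is dropped when the function returns, so every
    # instance is finalized and the counter goes back down.
    items = [TrackedResource() for _ in range(15)]
    return len(items)


def leaky_test(registry):
    # Storing the objects somewhere long-lived keeps them alive
    # after the "test" returns, which is what a leak looks like.
    registry.extend(TrackedResource() for _ in range(15))


registry = []
clean_test()
after_clean = TrackedResource.live
leaky_test(registry)
after_leaky = TrackedResource.live
print(after_clean, after_leaky)
```

A non-zero count after a test finishes means something is still holding references, which matches the symptom reported here.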

> I wonder if it's possible to deduce if this started recently or not?
> The web buildbot interface doesn't go back that far, and an additional
> complexity is that the FreeBSD builds tend to have various errors
> somewhat consistently over time, but perhaps there are server logs we
> can grep for this particular error?

No idea.

> Not sure if the best approach at this point is to see if the tests can
> use fewer semaphores, skip these tests under FreeBSD 7 like 6, or if
> it's important enough to compile a new kernel with a higher semaphore
> limit.

You wrote that "POSIX" semaphores are very limited. Does that mean that there 
is another kind of semaphore with a higher limit?

I don't think that skipping the test is a good idea: it looks like a real bug 
in (a limitation of) the ProcessPoolExecutor implementation on FreeBSD. Eg. 
test_map fails on FreeBSD 7.2 with ProcessPoolExecutorTest which uses 
self.executor = futures.ProcessPoolExecutor(max_workers=1): only one worker 

It looks like it is possible to tune semaphore limits on FreeBSD, without 
recompiling the kernel, by using boot loader options (kern.ipc.sem* options). 
But asking FreeBSD users to tune their boot loader options in order to use the 
concurrent.futures module is not practical :-)


From ncoghlan at  Sun Nov  7 06:44:07 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 7 Nov 2010 15:44:07 +1000
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Nov 6, 2010 at 11:41 PM, "Martin v. Löwis" <martin at> wrote:
> Instead, I recall that a decision was made (and I'm not sure whether
> with consensus or not) that "make install" would install
> /usr/bin/python3, for the time being. Period.

Indeed, that's my recollection as well. Whether python3 ever inherits
the python symlink at some point in the future is a different question
that has never really been discussed (and probably makes more sense at
the distro level at this point in time - "python = Python 2.x, python3
= Python 3.x" will likely stand as python-dev's consensus
recommendation for quite some time to come).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sun Nov  7 06:55:37 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 7 Nov 2010 15:55:37 +1000
Subject: [Python-Dev] SSH access against buildbot boxes
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 7, 2010 at 3:53 AM, Giampaolo Rodolà <g.rodola at> wrote:
> In such cases I would find more easy to be able to connect to the
> machine and test myself rather than create a separate branch, commit,
> schedule a buildbot run, wait for it to complete and see whether
> everything is "green".
> On the other side I perfectly understand how opening up blanket ssh
> access is not something everyone is comfortable with doing.
> AFAICR there was someone who was setting up an environment to solve
> exactly this problem but I'm not sure whether this is already usable.

Dealing with exactly this problem is one of the goals of the Snakebite project.

As far as I know, the folks behind that project are still working on
it - I've cc'ed Trent Nelson to see if he can provide any additional
info on the topic.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ezio.melotti at  Sat Nov  6 21:00:46 2010
From: ezio.melotti at (Ezio Melotti)
Date: Sat, 06 Nov 2010 22:00:46 +0200
Subject: [Python-Dev] Summary of Python tracker Issues
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 06/11/2010 17.42, R. David Murray wrote:
> On Sat, 06 Nov 2010 15:38:22 +0100, Georg Brandl<g.brandl at>  wrote:
>> Am 06.11.2010 05:44, schrieb Ezio Melotti:
>>> Hi,
>>> On 05/11/2010 19.08, Python tracker wrote:
>>>> ACTIVITY SUMMARY (2010-10-29 - 2010-11-05)
>>>> Python tracker at
>>>> To view or respond to any of the issues listed below, click on the issue.
>>>> Do NOT respond to this message.
>>>> Issues counts and deltas:
>>>>     open    2514 (+17)
>>>>     closed 19597 (+78)
>>>>     total  22111 (+95)
>>> as suggested in recent mails[0][1] I changed these values to represent
>>> the deltas with the previous week.
>>> Now let's try to keep the "open" delta negative ;)
>> That is a worthy goal, however the difference between the "open" and "closed"
>> deltas is already quite an achievement and shows that our triage works.
> Agreed.
> We did have negative open deltas for several weeks running in October.
> Kudos to everyone involved, and lets do it some more :)  I'm looking
> forward to making a non-trivial dent in the open count during the bug
> weekend on the 20th/21st.

Just to get a better idea I tried to plot a graph with the values of the 
last 13 weeks.
The resulting image is attached to the mail.

Best Regards,
Ezio Melotti
-------------- next part --------------
A non-text attachment was scrubbed...
Name: issues.png
Type: image/png
Size: 57338 bytes
Desc: not available
URL: <>

From martin at  Sun Nov  7 09:01:22 2010
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sun, 07 Nov 2010 09:01:22 +0100
Subject: [Python-Dev] [Python-checkins] r86264
	-	python/branches/release27-maint/Lib/distutils/
In-Reply-To: <ib42sp$9go$>
References: <>	<>	<>
	<> <ib42sp$9go$>
Message-ID: <>

>> It's rather a matter of agreeing when moving forward: IMO, mere style
>> changes, code cleanup etc shouldn't be applied to the bug fix branches,
>> as their only purpose is to provide bug fixes for existing users.
> The omission of the deletion from the 5/5 revision was a bug in that
> revision. If the removal of OS9 support was documented (announced),
> which I presume it was, then one could consider any visible trace
> remaining to be a bug.

Well, the question is: can anything break due to the code removal.
In principle, stuff *could* break even by a function that is supposedly
unused, and had supposedly been removed. The problem is that a
supposedly-unused function actually might be used somewhere, in some
context unrelated to its intended usage.

> Perhaps the policy on code cleanup should be a bit more liberal for 2.7
> *because* it will be maintained for several years and *because* there is
> no newer 2.x branch to apply changes to.

You mean, it's ok to break stuff with no gain in 2.7 bug fix releases?

> If I were going to maintain 2.7
> for several years, I would want to have the benefit of gradual
> improvements that make maintenance easier.

I question whether cleanup on a maintenance branch makes maintenance
easier. For example, one may (and I often do) compare the code base
of the previous bug fix release with the upcoming one, to see whether
any suspicious change accidentally was backported. Code cleanup is
in the way of such analysis, making maintenance more difficult.


From trent at  Sun Nov  7 12:24:59 2010
From: trent at (Trent Nelson)
Date: Sun, 7 Nov 2010 06:24:59 -0500
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <>
References: <>
Message-ID: <>

On 07-Nov-10 1:55 AM, Nick Coghlan wrote:
> On Sun, Nov 7, 2010 at 3:53 AM, Giampaolo Rodolà<g.rodola at>  wrote:
>> In such cases I would find more easy to be able to connect to the
>> machine and test myself rather than create a separate branch, commit,
>> schedule a buildbot run, wait for it to complete and see whether
>> everything is "green".
>> On the other side I perfectly understand how opening up blanket ssh
>> access is not something everyone is comfortable with doing.
>> AFAICR there was someone who was setting up an environment to solve
>> exactly this problem but I'm not sure whether this is already usable.
> Dealing with exactly this problem is one of the goals of the Snakebite project.
> As far as I know, the folks behind that project are still working on
> it - I've cc'ed Trent Nelson to see if he can provide any additional
> info on the topic.

Thanks for the ping Nick, I might have missed this otherwise.  Good 
timing, too, as Titus and I were just discussing which low hanging 
fruit/pain points Snakebite should tackle first (now that all the server 
room stuff has finally been taken care of).

Luckily, the problems that we faced 2.5 years ago when I came up with 
the idea of Snakebite are still just as ever present today ;-)

1.  Not having access to buildbots is a pain when something doesn't work 
right.  Doing dummy debug commits against trunk to try and coerce some 
more information out of a failing platform is painful.  Losing a build 
slave entirely due to a particularly hard crash and requiring the 
assistance of the owner is also super frustrating.

2.  The buildbot web interface for building non-(trunk|2.x|py3k) 
branches is also crazy unfriendly.  Per-activity branches are a great 
way to isolate development, even with Subversion, but it kinda' blows 
that you don't *really* get any feedback about how your code behaves on 
other platforms until you re-integrate your changes back into a mainline 
branch.  (I'm sure none of us have been masochistic enough to manually 
kick off individual builds for every platform via the buildbot web page 
after every commit to a non-standard branch.)

So, enter Snakebite.  We've got three racks filled with way more 
hardware than I should have ever purchased.  Ignoring the overhead of 
having to set machines up and whatnot, let's just assume that over the 
next couple of months, if there's a platform we need a stable buildbot 
for, Snakebite can provide it.  (And if we feel like bringing IRIX/MIPS 
and Tru64/Alphas back as primary platforms, we've got the hardware to do 
that, too ;-).)

Now, the fact that they're all in the one place and under my complete 
control is a big advantage, as I can start addressing some of the pain 
points that led me down this twisted path 2.5 years ago.

I'd like to get some feedback from the development community on what 
they'd prefer.  In my mind, I could take one of the following two steps:

1.  Set up standard build slaves on all the platforms, but put something 
in place that allowed committers to ssh/mstsc in to said slaves when 
things go wrong in order to aid with debugging and/or maintaining 
general buildbot health (OK'ing modal crash dialogues on Windows, for 
example).

2.  Address the second problem of the buildbot web interface sucking for 
non-standard branches.  I'm thinking along the lines of a hack to 
buildbot, such that upon creation of new per-activity branches off a 
mainline, something magically runs in the background and sets up a 
complete buildbot view at<your-branch-name>, just as if you 
were looking at a trunk buildbot page.

I'm not sure how easy the second point will be when we switch to hg; and 
I'll admit if there have been any python-dev discussions about buildbot 
once we're on hg, I've missed them.

Of course there's a third option, which is using the infrastructure I've 
mentioned to address a similarly annoying pain point I haven't thought 
of -- so feel free to mention anything else you'd like to see first 
instead of the above two things.

Titus, for example, alluded to some nifty way for a committer to push 
his local hg branch/changes somewhere, such that it would kick off 
builds on multiple platforms in the same sorta' vein as point 2, but 
able to leverage cloud resources like Amazon's EC2, not just Snakebite 
hardware.

Look forward to hearing some feedback!



From ncoghlan at  Sun Nov  7 12:50:31 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 7 Nov 2010 21:50:31 +1000
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 7, 2010 at 9:24 PM, Trent Nelson <trent at> wrote:
> 1.  Set up standard build slaves on all the platforms, but put something in
> place that allowed committers to ssh/mstsc in to said slaves when things go
> wrong in order to aid with debugging and/or maintaining general buildbot
> health (OK'ing modal crash dialogues on Windows, for example).

This sounds like a great place to start. Perhaps focus on one or two
of the less common platforms first (e.g. FreeBSD 7 has been hitting a
few semaphore issues lately). The big 3 (Windows/Mac/Linux) are
usually reasonably well covered for debugging purposes by people that
use them for development.

> 2.  Address the second problem of the buildbot web interface sucking for
> non-standard branches.  I'm thinking along the lines of a hack to buildbot,
> such that upon creation of new per-activity branches off a mainline,
> something magically runs in the background and sets up a complete buildbot
> view at<your-branch-name>, just as if you
> were looking at a trunk buildbot page.
> I'm not sure how easy the second point will be when we switch to hg; and
> I'll admit if there have been any python-dev discussions about buildbot once
> we're on hg, I've missed them.

With the switch to imminent, it may be better to focus
on Hg for that part (unless you have other projects in mind that also
use SVN). I believe Martin and/or Dirkjan have worked out the
equivalent triggers and build commands needed to switch the buildbot
fleet from svn to hg, but I'm not entirely certain about that one.

Good to know things are still progressing though - traffic on the
website news feed and the mailing list has been a little sparse this
year ;)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From dirkjan at  Sun Nov  7 13:15:15 2010
From: dirkjan at (Dirkjan Ochtman)
Date: Sun, 7 Nov 2010 13:15:15 +0100
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 7, 2010 at 12:24, Trent Nelson <trent at> wrote:
> Titus, for example, alluded to some nifty way for a committer to push his
> local hg branch/changes somewhere, such that it would kick off builds on
> multiple platforms in the same sorta' vein as point 2, but able to leverage
> cloud resources like Amazon's EC2, not just Snakebite hardware.

Mozilla has something called the "try server", where people push
changes as they would to any normal repository, but the result is that it
runs all the test suites they have. This lets people painlessly test
stuff on all platforms before actually pushing it to one of the main

On Sun, Nov 7, 2010 at 12:50, Nick Coghlan <ncoghlan at> wrote:
> With the switch to imminent, it may be better to focus
> on Hg for that part (unless you have other projects in mind that also
> use SVN). I believe Martin and/or Dirkjan have worked out the
> equivalent triggers and build commands needed to switch the buildbot
> fleet from svn to hg, but I'm not entirely certain about that one.

Yeah, Martin has things for buildbot worked out. Notes about this are
in the repository.



From solipsis at  Sun Nov  7 14:42:20 2010
From: solipsis at (Antoine Pitrou)
Date: Sun, 7 Nov 2010 14:42:20 +0100
Subject: [Python-Dev] [Python-checkins] r86264 -
References: <>
	<> <>
	<> <ib42sp$9go$>
Message-ID: <>

On Sun, 07 Nov 2010 09:01:22 +0100
"Martin v. Löwis" <martin at> wrote:
> > If I were going to maintain 2.7
> > for several years, I would want to have the benefit of gradual
> > improvements that make maintenance easier.
> I question whether cleanup on a maintenance branch makes maintenance
> easier.

It certainly does when using svnmerge. You can have many merge
conflicts if cleanups on the dev branch aren't backported to the bugfix



From khamenya at  Sun Nov  7 11:19:11 2010
From: khamenya at (Valery Khamenya)
Date: Sun, 7 Nov 2010 11:19:11 +0100
Subject: [Python-Dev] rlcompleter -- auto-complete dictionary keys (+ tests)
Message-ID: <>


A) I missed the auto-complete feature for dictionary keys a lot in python
console. This patch seems to do the job.

B) There are no rlcompleter tests in trunk for some reason. So, I've taken
the 2.7.x one and extended it.

C) The patched rlcompleter as such works OK for unicode dictionary keys as well.
All tests pass OK. HOWEVER, readline's completion mechanism seems to be
confused by unicode strings -- see the comments on
Completer.dict_key_matches(). So, perhaps, some changes should be applied to the
readline code too.
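
To illustrate the idea in (A), here is a minimal, self-contained sketch of dictionary-key completion. It is not the attached patch (the attachments were scrubbed from the archive) and it does not reproduce Completer.dict_key_matches(); the helper name dict_key_candidates and the regular expression are assumptions made purely for illustration, and only string keys on a plain name are handled.

```python
import re

# Matches an identifier, an opening bracket, a quote, and a partial key,
# e.g. "d['fo" gives name "d", quote "'", prefix "fo".
_KEY_EXPR = re.compile(r"""(\w+)\[(['"])([^'"]*)$""")


def dict_key_candidates(text, namespace):
    """Return completions for a partially typed string key (hypothetical helper)."""
    match = _KEY_EXPR.match(text)
    if match is None:
        return []
    name, quote, prefix = match.groups()
    obj = namespace.get(name)
    if not isinstance(obj, dict):
        return []
    # Only string keys can be completed from typed text.
    keys = sorted(k for k in obj
                  if isinstance(k, str) and k.startswith(prefix))
    return ["%s[%s%s%s]" % (name, quote, k, quote) for k in keys]


ns = {"d": {"foo": 1, "food": 2, "bar": 3, 42: 4}}
print(dict_key_candidates("d['fo", ns))  # ["d['foo']", "d['food']"]
```

A real completer would also have to cooperate with readline's word delimiters, which is presumably where the unicode confusion mentioned in (C) comes in.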


1. (as for trunk)

2. test_rlcompleter (as for trunk)

3. rlcompleter_trunk_to_new.diff (created as: diff >rlcompleter_trunk_to_new.diff)

P.S. Thanks to kerio & ssbr on icq for their advice.

best regards
Valery A.Khamenya
-------------- next part --------------
A non-text attachment was scrubbed...
Type: text/x-python
Size: 9001 bytes
Desc: not available
URL: <>
-------------- next part --------------
A non-text attachment was scrubbed...
Type: text/x-python
Size: 6134 bytes
Desc: not available
URL: <>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rlcompleter_trunk_to_new.diff
Type: text/x-patch
Size: 3674 bytes
Desc: not available
URL: <>

From g.brandl at  Sun Nov  7 14:51:09 2010
From: g.brandl at (Georg Brandl)
Date: Sun, 07 Nov 2010 14:51:09 +0100
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <ib6auo$5cc$>

Am 07.11.2010 12:50, schrieb Nick Coghlan:
> On Sun, Nov 7, 2010 at 9:24 PM, Trent Nelson <trent at> wrote:
>> 1.  Set up standard build slaves on all the platforms, but put something in
>> place that allowed committers to ssh/mstsc in to said slaves when things go
>> wrong in order to aid with debugging and/or maintaining general buildbot
>> health (OK'ing modal crash dialogues on Windows, for example).
> This sounds like a great place to start. Perhaps focus on one or two
> of the less common platforms first (e.g. FreeBSD 7 has been hitting a
> few semaphore issues lately). The big 3 (Windows/Mac/Linux) are
> usually reasonably well covered for debugging purposes by people that
> use them for development.
>> 2.  Address the second problem of the buildbot web interface sucking for
>> non-standard branches.  I'm thinking along the lines of a hack to buildbot,
>> such that upon creation of new per-activity branches off a mainline,
>> something magically runs in the background and sets up a complete buildbot
>> view at<your-branch-name>, just as if you
>> were looking at a trunk buildbot page.
>> I'm not sure how easy the second point will be when we switch to hg; and
>> I'll admit if there have been any python-dev discussions about buildbot once
>> we're on hg, I've missed them.
> With the switch to imminent, it may be better to focus
> on Hg for that part (unless you have other projects in mind that also
> use SVN). I believe Martin and/or Dirkjan have worked out the
> equivalent triggers and build commands needed to switch the buildbot
> fleet from svn to hg, but I'm not entirely certain about that one.

I've spent a good bit of time on that, and left all the instructions in
the buildbot master config.  I also adapted buildbot's hg hook to our
situation (e.g. to send a change to multiple masters, as required for
the community buildbots), so it should be quite easy to actually
switch the buildbots over on migration day.


From foom at  Sun Nov  7 15:57:06 2010
From: foom at (James Y Knight)
Date: Sun, 7 Nov 2010 09:57:06 -0500
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

On Nov 6, 2010, at 9:41 AM, Martin v. Löwis wrote:
> So I don't recall a decision that there shouldn't be a python2
> binary,

The decision to make one would have to be an active decision, since Python has never installed one before. If there should be one, then the Python Makefile should make one by default.

> nor a decision that anything is done indefinitely
> (it may be that the decision was actually just about 3.1 - changing
> it again for 3.2 would require another decision, but certainly can't
> be ruled out categorically).

When I said "indefinite", I meant "until some point in the future not yet determined", with an implied undertone of "not anytime soon".


From exarkun at  Sun Nov  7 17:25:18 2010
From: exarkun at (exarkun at
Date: Sun, 07 Nov 2010 16:25:18 -0000
Subject: [Python-Dev] Snakebite,
	buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
	access	against buildbot boxes)
In-Reply-To: <>
References: <>
Message-ID: <20101107162518.2040.178068202.divmod.xquotient.717@localhost.localdomain>

On 11:24 am, trent at wrote:
>2.  Address the second problem of the buildbot web interface sucking 
>for non-standard branches.  I'm thinking along the lines of a hack to 
>buildbot, such that upon creation of new per-activity branches off a 
>mainline, something magically runs in the background and sets up a 
>complete buildbot view at<your- 
>branch-name>, just as if you were looking at a trunk buildbot page.

This is basically trivial.  I gave #python-dev a tool for forcing 
builds, dunno if anyone still has a copy, but it's easy to reconstruct 
from <> (which is what the Twisted project uses).  Plus, you can add 
?branch=<name> to most BuildBot views to limit display of results to 
just builds for the named branch.
>Titus, for example, alluded to some nifty way for a committer to push 
>his local hg branch/changes somewhere, such that it would kick off 
>builds on multiple platforms in the same sorta' vein as point 2, but 
>able to leverage cloud resources like Amazon's EC2, not just Snakebite 

BuildBot supports managing EC2 instance lifetimes to run builds.


From brian.curtin at  Sun Nov  7 17:41:09 2010
From: brian.curtin at (Brian Curtin)
Date: Sun, 7 Nov 2010 10:41:09 -0600
Subject: [Python-Dev] rlcompleter -- auto-complete dictionary keys (+
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 7, 2010 at 04:19, Valery Khamenya <khamenya at> wrote:

> Hi,
> A) I have missed an auto-complete feature for dictionary keys in the Python
> console a lot. This patch seems to do the job.
> B) There are no rlcompleter tests in trunk for some reason. So, I've taken
> the 2.7.x ones and extended them.
> C) The patched rlcompleter works OK for unicode dictionary keys as
> well. All tests pass. HOWEVER, readline's completion mechanism seems to be
> confused by unicode strings -- see the comments on
> Completer.dict_key_matches(). So, perhaps, some changes should be applied to
> the readline code too.
> Attached:
> 1. (as for trunk)
> 2. test_rlcompleter (as for trunk)
> 3. rlcompleter_trunk_to_new.diff (created as: diff
> >rlcompleter_trunk_to_new.diff)
> P.S. Thanks to kerio & ssbr on icq for the advice.
> best regards
> --
> Valery A.Khamenya
Can you post your patch on

From khamenya at  Sun Nov  7 18:07:31 2010
From: khamenya at (Valery Khamenya)
Date: Sun, 7 Nov 2010 18:07:31 +0100
Subject: [Python-Dev] rlcompleter -- auto-complete dictionary keys (+
In-Reply-To: <>
References: <>
Message-ID: <>

> Can you post your patch on

 the site is not working currently.

Also, I forgot to mention that the usual lines in .pythonstartup
should now look like this:

# the usual lines:
import readline
import rlcompleter
readline.parse_and_bind('tab: complete')
readline.parse_and_bind('Control-Space: complete')

# and now the additional line to allow the '[' char and both quote

From bobbyi at  Sun Nov  7 18:30:17 2010
From: bobbyi at (Bobby Impollonia)
Date: Sun, 7 Nov 2010 09:30:17 -0800
Subject: [Python-Dev] not responding (Was: rlcompleter --
 auto-complete dictionary keys (+ tests))
Message-ID: <>

On Sun, Nov 7, 2010 at 9:07 AM, Valery Khamenya <khamenya at> wrote:
>> Can you post your patch on
> the site is not working currently.

Yes, it is down for me too, trying from multiple hosts. It was up
approximately an hour ago, but has now been unresponsive for the past
twenty or thirty minutes. I cannot even ping The main site seems to be fine.

From rdmurray at  Sun Nov  7 20:19:44 2010
From: rdmurray at (R. David Murray)
Date: Sun, 07 Nov 2010 14:19:44 -0500
Subject: [Python-Dev] not responding (Was: rlcompleter
	-- auto-complete dictionary keys (+ tests))
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, 07 Nov 2010 09:30:17 -0800, Bobby Impollonia <bobbyi at> wrote:
> On Sun, Nov 7, 2010 at 9:07 AM, Valery Khamenya <khamenya at> wrote:
> >> Can you post your patch on
> >
> > the site is not working currently.
> Yes, it is down for me too, trying from multiple hosts. It was up
> approximately an hour ago, but has now been unresponsive for the past
> twenty or thirty minutes. I cannot even ping The main
> site seems to be fine.

The hosting company is working on the problem, which seems to be a hardware
issue.  Hopefully it will be resolved soon.

FYI and are different machines, and
in fact the two machines are not even hosted at the same location.

Valery,  I would advise you to submit the patch to when
it comes back up.  Patches posted to this mailing list will in general
just get forgotten.

R. David Murray                            

From python at  Sun Nov  7 22:05:32 2010
From: python at (MRAB)
Date: Sun, 07 Nov 2010 21:05:32 +0000
Subject: [Python-Dev] Bug track down?
Message-ID: <>

It looks like the bug tracker is down.

From martin at  Sun Nov  7 22:06:49 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 Nov 2010 22:06:49 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

Am 07.11.2010 15:57, schrieb James Y Knight:
> On Nov 6, 2010, at 9:41 AM, Martin v. Löwis wrote:
>> So I don't recall a decision that there shouldn't be a python2 
>> binary,
> The decision to make one would have to be an active decision, since
> Python has never installed one before. If there should be one, then
> the Python Makefile should make one by default.

No. Creation of additional symlinks is certainly in the realm of what
Python packagers can decide on their own.


From martin at  Sun Nov  7 22:26:46 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 Nov 2010 22:26:46 +0100
Subject: [Python-Dev] Bug track down?
In-Reply-To: <>
References: <>
Message-ID: <>

Am 07.11.2010 22:05, schrieb MRAB:
> It looks like the bug tracker is down.

Thanks - we have already contacted the hosting company, who have already
contacted the datacenter. It appears that the bug tracker actually
wasn't down (at least, it believes it was up all the time), which suggests
that there was some kind of networking problem. It came back, then went
away again.


From solipsis at  Sun Nov  7 23:11:59 2010
From: solipsis at (Antoine Pitrou)
Date: Sun, 7 Nov 2010 23:11:59 +0100
Subject: [Python-Dev] r86276 -
References: <>
Message-ID: <>

On Sat,  6 Nov 2010 19:03:53 +0100 (CET)
eric.araujo <python-checkins at> wrote:
> Author: eric.araujo
> Date: Sat Nov  6 19:03:52 2010
> New Revision: 86276
> Log:
> Fix #10252 again (hopefully definitely).  Patch by Brian Curtin.

It seems this and previous fixes should be backported to 2.7.



From at  Mon Nov  8 00:34:36 2010
From: at (David Bolen)
Date: Sun, 07 Nov 2010 18:34:36 -0500
Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2
	3.x" buildbot
References: <>
Message-ID: <>

Victor Stinner <victor.stinner at> writes:

> You wrote that the "POSIX" semaphores are very limited. Does it mean that there
> is another kind of semaphore with a higher limit?

Well, I think the SYSV semaphores are either less limited or at least
more adjustable.  They've certainly been around longer in FreeBSD.
The POSIX semaphore support is not enabled by default in FreeBSD 7, so
I added loader.conf stuff to load them (as part of issue7272).  I
don't think the Python internals are using the SYSV semaphores though.
SYSV functions have no underscore (e.g., semget) whereas the POSIX ones do
(e.g., sem_open).  Also, I believe only POSIX has named semaphores.

> I don't think that skipping the test is a good idea: it looks like a real bug 
> in (a limitation of) the ProcessPoolExecutor implementation on FreeBSD. Eg. 
> test_map fails on FreeBSD 7.2 with ProcessPoolExecutorTest which uses 
> self.executor = futures.ProcessPoolExecutor(max_workers=1): only one worker 
> process!
> It looks like it is possible to tune semaphore limits on FreeBSD, without
> recompiling the kernel, by using boot loader options (kern.ipc.sem* options).
> But asking FreeBSD users to tune their boot loader options in order to use the
> concurrent.futures module is not practical :-)

Yeah, I guess the key question is if changing the limit is just needed
to get around an artifact of the test process (which I'm willing to do
for the buildbot), or if it would be needed to be able to use the
regular modules in practice.  If the latter, I doubt too many users
are going to jump through such hoops, particularly if it needs a
kernel rebuild, so we may need to make other choices in terms of
support under FreeBSD.

I'm also not entirely sure just what is the limiting factor.  I think
the kern.ipc.sem* options are for the SYSV semaphores, not POSIX, though
some of them do have a similar limit.  Some are adjustable by sysctl,
others by loader.conf.

The references I found were talking about a limit set explicitly
(#define SEM_MAX) in the kernel source (uipc_sem.c) which exports its
value (at least in 7.2) via the sysctl p1003_1b.sem_nsems_max, which
is read-only.  I got the impression they weren't adjustable even in
loader.conf, but haven't actually tried it yet myself.
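
For anyone who wants to compare boxes, here is a minimal sketch of reading
the POSIX semaphore limit from Python. It assumes the platform exposes the
limit through sysconf under the name SC_SEM_NSEMS_MAX (which I would expect
to mirror the sysctl mentioned above, but I have not verified that on 7.2);
the function name is just for illustration.

```python
import os


def posix_semaphore_limit():
    """Return the POSIX semaphore limit reported by sysconf, or None."""
    name = "SC_SEM_NSEMS_MAX"
    # os.sysconf is missing on some platforms (e.g. Windows), and the
    # name may not be known to this build of Python.
    if not hasattr(os, "sysconf") or name not in os.sysconf_names:
        return None
    try:
        value = os.sysconf(name)
    except (OSError, ValueError):
        return None
    # sysconf reports -1 when the limit is indeterminate.
    return value if value >= 0 else None


limit = posix_semaphore_limit()
print("POSIX semaphore limit:", limit)
```

A result of None only means the limit could not be determined this way, not
that there is no limit.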

It may be different in 8.x, but one email thread I found indicated
that the changes proposed to make the POSIX limits adjustable didn't
make the 8.1 cut (current release), though might make it in the next
8.x release.

-- David

From martin at  Mon Nov  8 01:09:27 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Nov 2010 01:09:27 +0100
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <>
References: <>	<>
Message-ID: <>

> Luckily, the problems that we faced 2.5 years ago when I came up with
> the idea of Snakebite are still just as ever present today ;-)

Is this bashing of existing infrastructure really necessary?
People (like me) might start bashing about vaporware and how
a bird in the hand is worth two in the bush. Cooperate, don't
confront.


From martin at  Mon Nov  8 01:13:19 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Nov 2010 01:13:19 +0100
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <ib6auo$5cc$>
References: <>	<>	<>	<>
Message-ID: <>

> I've spent a good bit of time on that, and left all the instructions in
> the buildbot master config.  I also adapted buildbot's hg hook to our
> situation (e.g. to send a change to multiple masters, as required for
> the community buildbots), so it should be quite easy to actually
> switch the buildbots over on migration day.

I'm not sure this is the right way of doing it. AFAICT, hg can have
multiple handlers for the same hook, e.g. incoming.buildbot and

Furthermore, I believe the community buildbot farm is currently dead,
and unlikely to come back.


From solipsis at  Mon Nov  8 01:58:05 2010
From: solipsis at (Antoine Pitrou)
Date: Mon, 8 Nov 2010 01:58:05 +0100
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
References: <>
Message-ID: <>

On Sun, 7 Nov 2010 06:24:59 -0500
Trent Nelson <trent at> wrote:
> (And if we feel like bringing IRIX/MIPS 
> and Tru64/Alphas back as primary platforms, we've got the hardware to do 
> that, too ;-).)

Unless you want to rename your project zombiebite, it would probably be
better not to resurrect those old corpses.
(I'm talking about the OSes, not the chips)

> Of course there's a third option, which is using the infrastructure I've 
> mentioned to address a similarly annoying pain point I haven't thought 
> of -- so feel free to mention anything else you'd like to see first 
> instead of the above two things.

I'm sure there are various special builds that could be useful. One is
a build with heavy resource consumption (lots of RAM, lots of disk) if
there's a machine which can handle that.

Another is testing memory leaks on all 3 branches (I have a daily
script which does that for 3.x on my personal server).

Perhaps there could even be some automated fuzzing if Victor is looking
for something to do in his free time :)



From scott+python-dev at  Mon Nov  8 03:32:33 2010
From: scott+python-dev at (Scott Dial)
Date: Sun, 07 Nov 2010 21:32:33 -0500
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

On 11/7/2010 7:09 PM, Martin v. Löwis wrote:
>> Luckily, the problems that we faced 2.5 years ago when I came up with
>> the idea of Snakebite are still just as ever present today ;-)
> Is this bashing of existing infrastructure really necessary?
> People (like me) might start bashing about vaporware and how
> a bird in the hand is worth two in the bush. Cooperate, don't
> confront.

+1 Respect your (software) elders.

The Snakebite rhetoric has always been less than generous with regard
to Buildbot, but Buildbot has been providing an infinitely more useful
service to the community for much longer than Snakebite has for those
2.5 years.

Scott Dial
scott at
scodial at

From victor.stinner at  Mon Nov  8 03:36:34 2010
From: victor.stinner at (Victor Stinner)
Date: Mon, 8 Nov 2010 03:36:34 +0100
Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2
	3.x" buildbot
In-Reply-To: <>
References: <>
Message-ID: <>

On Monday 08 November 2010 00:34:36 David Bolen wrote:
> Victor Stinner <victor.stinner at> writes:
> > You wrote that the "POSIX" semaphores are very limited. Does it mean that
> > there is another kind of semaphore with a higher limit?
> Well, I think the SYSV semaphores are either less limited or at least
> more adjustable.  They've certainly been around longer in FreeBSD.
> The POSIX semaphore support is not enabled by default in FreeBSD 7, so
> I added loader.conf stuff to load them (as part of issue7272).  I
> don't think the Python internals are using the SYSV semaphores though.
> SYSV functions have no underscore (e.g., semget) whereas the POSIX ones do
> (e.g., sem_open).  Also, I believe only POSIX has named semaphores.

I created the issue to suggest this.


From ctb at  Mon Nov  8 03:58:56 2010
From: ctb at (C. Titus Brown)
Date: Sun, 7 Nov 2010 18:58:56 -0800
Subject: [Python-Dev] Snakebite,
	buildbot and low hanging fruit --	feedback wanted! (Was Re: SSH
	access against buildbot boxes)
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Sun, Nov 07, 2010 at 09:32:33PM -0500, Scott Dial wrote:
> On 11/7/2010 7:09 PM, Martin v. Löwis wrote:
> >> Luckily, the problems that we faced 2.5 years ago when I came up with
> >> the idea of Snakebite are still just as ever present today ;-)
> > 
> > Is this bashing of existing infrastructure really necessary?
> > People (like me) might start bashing about vaporware and how
> > a bird in the hand is worth two in the bush. Cooperate, don't
> > confront.
> +1 Respect your (software) elders.
> The Snakebite rhetoric has always been less than generous with regard
> to Buildbot, but Buildbot has been providing an infinitely more useful
> service to the community for much longer than Snakebite has for those
> 2.5 years.

Yes, yes, I agree that some graciousness is a good idea.

Oh, wait... you're not helping.

Anyway, I think buildbot is a good local optimum for python-dev, largely
because it's maintained by someone who cares enough to do it well.  And, if
Trent had been talking about buildbot only, MvL's comment would be more than
fair.  But Trent, and I, and others, have talked about quite a bit more than
buildbot being "the" problem. Things like enabling *and maintaining* easy EC2
spin-up with buildbot, or providing SSH key access, or making a 'try' server
available and maintaining it, would be clearly beneficial.  And that's some of
what Trent has been talking about providing.  It turns out it's hard to do
without lots and lots of time and money.  If you truly think it's not useful,
I'd be interested in hearing your opinions, because we've spent an ungodly
amount of the above on it.

In the larger context, I worry very much that we're settling for a rather
suboptimal support setup (on svn, on continuous integration, and on some
other aspects of Python infrastructure) because the current maintainers are so
overloaded and few others are stepping up to bear burdens.  This is a big
concern of at least some people in the PSF.  But it's not an easy problem to
solve - quelle surprise.  And I'm not in a personal position to help, so I've
basically tried to shut up about it :).

As for buildbot, I've been pretty hard on buildbot myself, and I'm happy to
justify it to others -- I've done so in public fora so I'm sure you can find
the records, if you care to look.  But it's not really very relevant to this
conversation, especially since Trent has always been interested in building off
the buildbot setup rather than replacing it.

C. Titus Brown, ctb at

From qiyong at  Mon Nov  8 02:43:33 2010
From: qiyong at (Qi Yong)
Date: Sun, 7 Nov 2010 18:43:33 -0700
Subject: [Python-Dev] KeyboardInterrupt not catch
Message-ID: <>


With this script, after Ctrl-D, the Ctrl-C exception is not caught.
Is it a Python bug or wrong exception usage? Thanks.
With "import readline", this problem disappears.

-- qiyong

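def parse():
        try:
                answer = raw_input("Eo: ")
                print answer
        except EOFError:
                pass    # handler body missing from the archived message
        except KeyboardInterrupt:
                pass    # handler body missing from the archived message

def main():
        while True:
                parse()

if __name__ == "__main__":
        main()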

Qi Yong

From tjreedy at  Mon Nov  8 04:45:43 2010
From: tjreedy at (Terry Reedy)
Date: Sun, 07 Nov 2010 22:45:43 -0500
Subject: [Python-Dev] r86276 -
In-Reply-To: <>
References: <>
Message-ID: <ib7rp5$50l$>

On 11/7/2010 5:11 PM, Antoine Pitrou wrote:
> On Sat,  6 Nov 2010 19:03:53 +0100 (CET)
> eric.araujo<python-checkins at>  wrote:
>> Author: eric.araujo
>> Date: Sat Nov  6 19:03:52 2010
>> New Revision: 86276
>> Log:
>> Fix #10252 again (hopefully definitely).  Patch by Brian Curtin.
> It seems this and previous fixes should be backported to 2.7.

Perhaps there should be a 'backport 2.7' keyword to check on issues that 
might be but have not been.

Terry Jan Reedy

From scott+python-dev at  Mon Nov  8 04:51:44 2010
From: scott+python-dev at (Scott Dial)
Date: Sun, 07 Nov 2010 22:51:44 -0500
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit --	feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On 11/7/2010 9:58 PM, C. Titus Brown wrote:
> Yes, yes, I agree that some graciousness is a good idea.
> Oh, wait... you're not helping.


I don't remember being invited to help. is a dead end.
snakebite-list hasn't had a post for over a year. Where is the list of
things that you need done so that I can get started on that? Oh wait..

Seriously, all I asked was for you to tone down your insults to a
technology that is already solving problems today. Why you feel the need
to attack me personally is beyond my understanding. Furthermore, I don't
see why I need to be "helping" -- somebody who doesn't want help -- to
be able to deduce that your message is being insulting to the authors of

On 11/7/2010 9:58 PM, C. Titus Brown wrote:
> And I'm not in a personal position to help, so I've basically tried
> to shut up about it :).

At least I am in good company. ;)

Scott Dial
scott at
scodial at

From ben+python at  Mon Nov  8 05:05:05 2010
From: ben+python at (Ben Finney)
Date: Mon, 08 Nov 2010 15:05:05 +1100
Subject: [Python-Dev] KeyboardInterrupt not catch
References: <>
Message-ID: <>

Qi Yong <qiyong at> writes:

> With this script, after Ctrl-D, the Ctrl-C exception is not caught.

When I run it, the Ctrl-D doesn't affect the behaviour of Ctrl-C. Can
you confirm that the behaviour is dependent on whether Ctrl-D is used?

> With "import readline", this problem disappears.

Again, whether I "import readline" or not, the behaviour is unchanged. Can
you show a specific series of steps that changes the behaviour?

> Is it a Python bug or wrong exception usage? Thanks.

The "raw_input" function uses the "readline" library. That library uses
signal handlers for many of the terminal signals, as documented at

 \       "You've got the brain of a four-year-old boy, and I'll bet he |
  `\                       was glad to get rid of it." -- Groucho Marx |
_o__)                                                                  |
Ben Finney

From ralf at  Mon Nov  8 06:35:27 2010
From: ralf at (Ralf Schmitt)
Date: Mon, 08 Nov 2010 06:35:27 +0100
Subject: [Python-Dev] KeyboardInterrupt not catch
In-Reply-To: <> (Qi Yong's message of
	"Sun, 7 Nov 2010 18:43:33 -0700")
References: <>
Message-ID: <>

Qi Yong <qiyong at> writes:

> Hello,
> With this script, after Ctrl-D, the Ctrl-C exception is not caught.
> Is it a Python bug or wrong exception usage? Thanks.
> With "import readline", this problem disappears.

there's already a bug in the issue tracker:

- Ralf

From g.brandl at  Mon Nov  8 07:39:31 2010
From: g.brandl at (Georg Brandl)
Date: Mon, 08 Nov 2010 07:39:31 +0100
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <>
References: <>	<>	<>	<>	<ib6auo$5cc$>
Message-ID: <ib861f$1qj$>

Am 08.11.2010 01:13, schrieb "Martin v. Löwis":
>> I've spent a good bit of time on that, and left all the instructions in
>> the buildbot master config.  I also adapted buildbot's hg hook to our
>> situation (e.g. to send a change to multiple masters, as required for
>> the community buildbots), so it should be quite easy to actually
>> switch the buildbots over on migration day.
> I'm not sure this is the right way of doing it. AFAICT, hg can have
> multiple handlers for the same hook, e.g. incoming.buildbot and

That is true, however it doesn't help you: the hook takes its configuration
from the hgrc file, so you can configure exactly one host:port to send
changes to.
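
To make the limitation concrete, here is a rough sketch of the kind of hgrc
layout under discussion, read with Python's configparser purely for
illustration (Mercurial has its own config parser). The second hook name,
both hook targets, the [hgbuildbot] section with its "master" key, and the
host name are assumptions made for the example, not our actual setup.

```python
import configparser

# Hypothetical hgrc fragment: two handlers for the same "incoming" hook,
# but a single section holding the one master address.
SAMPLE_HGRC = """
[hooks]
incoming.buildbot = python:buildbot.changes.hgbuildbot.hook
incoming.other = python:example_hooks.other

[hgbuildbot]
master = buildmaster.example.org:9989
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_HGRC)

# Several handlers can share the hook name by using different suffixes ...
incoming_hooks = sorted(k for k in parser["hooks"] if k.startswith("incoming."))

# ... but each of them would read the same single host:port setting.
master = parser["hgbuildbot"]["master"]

print(incoming_hooks, master)
```

So registering the hook twice would not by itself give two destinations,
which is why the hook was adapted to handle more than one master.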

> Furthermore, I believe the community buildbot farm is currently dead,
> and unlikely to come back.

Then it's easy not to use that feature :)


From martin at  Mon Nov  8 09:05:23 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Nov 2010 09:05:23 +0100
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <ib861f$1qj$>
References: <>	<>	<>	<>	<ib6auo$5cc$>	<>
Message-ID: <>

> That is true, however it doesn't help you: the hook takes its configuration
> from the hgrc file, so you can configure exactly one host:port to send
> changes to.

Ah, ok.


From asmodai at  Mon Nov  8 10:05:24 2010
From: asmodai at (Jeroen Ruigrok van der Werven)
Date: Mon, 8 Nov 2010 10:05:24 +0100
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <>
References: <>
Message-ID: <>

-On [20101107 12:52], Nick Coghlan (ncoghlan at wrote:
>This sounds like a great place to start. Perhaps focus on one or two
>of the less common platforms first (e.g. FreeBSD 7 has been hitting a
>few semaphore issues lately).

Nick, do you have some pointers for this? I am one of those BSD Python
users/coders and would like to resolve any issues.

By default FreeBSD 7, at least, has limits on semaphores and the like, but
those can be expanded.

Jeroen Ruigrok van der Werven <asmodai(-at-)> / asmodai
GPG: 2EAC625B
Hypocrisy is the homage which vice pays to virtue...

From regebro at  Mon Nov  8 10:42:26 2010
From: regebro at (Lennart Regebro)
Date: Mon, 8 Nov 2010 10:42:26 +0100
Subject: [Python-Dev] Continuing 2.x
In-Reply-To: <>
References: <Act2RJ70W6LCt2gPS2e4wNV7jFYtAQ==>
Message-ID: <>

2010/10/28 Kristján Valur Jónsson <kristjan at>:
> Hello all.
> So, python 2.7 is in bugfix only mode. "trunk" is off limit. So, where
> does one make improvements to the distinguished, and still very much alive,
> 2.x series of Python?
> The answer would seem to be "one doesn't". But must it be that way?

Except for making releases that start backporting Python 3 features
and breaking backwards compatibility gradually (which may or may not
be a good idea) I don't see the point. There isn't much to do when it
comes to improving the language, and there is a moratorium anyway.
Improvements in the standard library can be more easily done in
external libraries anyway, and then you can release the improved
libraries for everything from Python 2.4 and forwards if you like.

So it can be done, but the question is "Why?"

Lennart Regebro, Colliberty:
Telephone: +48 691 268 328

From khamenya at  Mon Nov  8 11:43:36 2010
From: khamenya at (Valery Khamenya)
Date: Mon, 8 Nov 2010 11:43:36 +0100
Subject: [Python-Dev] not responding (Was: rlcompleter
 -- auto-complete dictionary keys (+ tests))
In-Reply-To: <>
References: <>
Message-ID: <>

Hi David,

> Valery,  I would advise you to submit the patch to when
>  it comes back up.  Patches posted to this mailing list will in general
> just get forgotten.

OK. Although, as I can already see, the situation with changes in 2.x trunk
isn't so simple.
I hope the patch won't be forgotten. (After all, we here still rely very much
on 2.x.)


From regebro at  Mon Nov  8 12:08:24 2010
From: regebro at (Lennart Regebro)
Date: Mon, 8 Nov 2010 12:08:24 +0100
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Oct 27, 2010 at 12:36, Antoine Pitrou <solipsis at> wrote:
> On Wed, 27 Oct 2010 10:13:12 +0800
> Kristján Valur Jónsson <kristjan at> wrote:
>> Although 2.7 has the new buffer interface and memoryview
>> objects, these are widely not accepted in the built in modules.
> That's true, and slightly unfortunate. It could be a reason for
> switching to 3.1/3.2 :-)

It's rather a reason against it, as it makes supporting both Python 2
and Python 3 harder.
However, fixing this in 2.7 just means that you need to support 2.7.x
or later only, so it's not a good solution.
I think using compatibility types is a better solution. I suggested
something like that for inclusion in "six", but it was softly
rejected. :-)

Something like this, for example. It's a str in Python 2 and a bytes in
Python 3, but it extends both classes with a consistent interface.
Improvements, comments and ideas are welcome.

import sys

if sys.version_info[0] < 3:
    class Bites(str):
        def __new__(cls, value):
            if isinstance(value[0], int):
                # It's a list of integers
                value = ''.join([chr(x) for x in value])
            return super(Bites, cls).__new__(cls, value)

        def itemint(self, index):
            # Indexing a Python 2 str gives a 1-char str, so convert it
            return ord(self[index])

        def iterint(self):
            for x in self:
                yield ord(x)

else:
    class Bites(bytes):
        def __new__(cls, value):
            if isinstance(value, str):
                # It's a unicode string:
                value = value.encode('ISO-8859-1')
            return super(Bites, cls).__new__(cls, value)

        def itemint(self, x):
            # Indexing Python 3 bytes already gives an int
            return self[x]

        def iterint(self):
            for x in self:
                yield x
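
To show the intended use, here is a small self-contained sketch with only
the Python 3 variant of the class repeated, so it runs on its own. The
variable names are just for illustration.

```python
class Bites(bytes):
    """Python 3 variant only, repeated here so the example is runnable."""

    def __new__(cls, value):
        if isinstance(value, str):
            # Text is encoded byte-for-byte via ISO-8859-1
            value = value.encode("ISO-8859-1")
        return super().__new__(cls, value)

    def itemint(self, index):
        return self[index]

    def iterint(self):
        for x in self:
            yield x


data = Bites("abc")              # built from text
from_ints = Bites([104, 105])    # built from a list of integers

first = data.itemint(0)          # integer value of the first byte
codes = list(data.iterint())     # integer values of all bytes

print(first, codes, from_ints)
```

The point is that itemint() and iterint() give integers on both major
versions, so calling code does not need to branch on the Python version.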

Lennart Regebro:
Python 3 Porting:
+33 661 58 14 64

From ncoghlan at  Mon Nov  8 12:44:56 2010
From: ncoghlan at (Nick Coghlan)
Date: Mon, 8 Nov 2010 21:44:56 +1000
Subject: [Python-Dev] Backward incompatible API changes in the pydoc module
Message-ID: <>


I was about to commit the patch for issue 2001 (the improvements to
the pydoc web server and the removal of the Tk GUI) when I realised
that pydoc.serve() and pydoc.gui() are technically public standard
library APIs (albeit undocumented ones).

Currently the patch switches serve() to start the new server
implementation and gui() to start the server and open a browser window
for it.

It occurred to me that, despite the "it's an application" feel to the
pydoc web server APIs, it may be a better idea to leave the two
existing functions alone (aside from adding DeprecationWarning), and
using new private function names to start the new server and the web
browser.

Is following the standard deprecation procedure the better course
here, or am I being overly paranoid?


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Mon Nov  8 12:51:44 2010
From: ncoghlan at (Nick Coghlan)
Date: Mon, 8 Nov 2010 21:51:44 +1000
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 7:05 PM, Jeroen Ruigrok van der Werven
<asmodai at> wrote:
> -On [20101107 12:52], Nick Coghlan (ncoghlan at wrote:
>>This sounds like a great place to start. Perhaps focus on one or two
>>of the less common platforms first (e.g. FreeBSD 7 has been hitting a
>>few semaphore issues lately).
> Nick, do you have some pointers for this? I am one of those BSD Python
> users/coders and would like to resolve any issues.
> By default FreeBSD 7, at least, has limits on semaphores and the likes, but
> those can be expanded.

Possibly a bad example on my part, since David and Victor actually
seem to be making reasonable progress in tracking down the problem:


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From dirkjan at  Mon Nov  8 12:55:24 2010
From: dirkjan at (Dirkjan Ochtman)
Date: Mon, 8 Nov 2010 12:55:24 +0100
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 7, 2010 at 13:15, Dirkjan Ochtman <dirkjan at> wrote:
> Yeah, Martin has things for buildbot worked out. Notes about this are
> in the repository.

I meant Georg here, of course. Sorry, Georg!



From ncoghlan at  Mon Nov  8 12:57:45 2010
From: ncoghlan at (Nick Coghlan)
Date: Mon, 8 Nov 2010 21:57:45 +1000
Subject: [Python-Dev] Snakebite,
 buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH
 access against buildbot boxes)
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Mon, Nov 8, 2010 at 10:09 AM, "Martin v. Löwis" <martin at> wrote:
>> Luckily, the problems that we faced 2.5 years ago when I came up with
>> the idea of Snakebite are still just as ever present today ;-)
> Is this bashing of existing infrastructure really necessary?
> People (like me) might start bashing about vaporware and how
> a bird in the hand is worth two in the bush. Cooperate, don't
> confront.

I don't believe the comment was meant to be a slight on the efforts of
the current infrastructure maintainers.

I took Trent's message as referring to the problems Giampaolo
mentioned in the original post (i.e. the ability to grant buildbot
access in an easy-to-use way to existing core developers without
burdening every buildbot operator with decisions as to who they can
trust with access to their buildbot). Buildbot (and similar tools) are
fine for what they do, but there are some problems like this that they
don't even *try* to solve (because they aren't software problems -
they're dependent on physical infrastructure).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From michael at  Mon Nov  8 13:09:42 2010
From: michael at (Michael Foord)
Date: Mon, 08 Nov 2010 12:09:42 +0000
Subject: [Python-Dev] GUI test runner tool
Message-ID: <>

Hello all,

Now that unittest has test discovery, Mark Roddy has been working on 
resurrecting the old GUI test runner (using Tkinter):

This was part of the original pyunit project but I believe it was never 
part of the standard library:

Here's a screenshot of what it looks like:

I'd like to propose adding it to Python in Tools/ and am volunteering to 
maintain it. If the answer is "not yet" that is fine as it can go into 
unittest2 first. Mark has updated it to work with test discovery and 
added support for configuring test discovery in the same way as you can 
from the command line. It is a nice tool for those new to writing tests 
who aren't yet familiar with the command line or prefer a GUI.

In its basic form you simply pick a directory and unittestgui will 
discover and run all the tests it finds. It would be nice if it provided 
more diagnostic information on tests it ran (clicking through test 
results) but these can be added later.

All the best,

Michael Foord


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From asmodai at  Mon Nov  8 13:23:33 2010
From: asmodai at (Jeroen Ruigrok van der Werven)
Date: Mon, 8 Nov 2010 13:23:33 +0100
Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2
 3.x" buildbot
In-Reply-To: <>
References: <>
Message-ID: <>

-On [20101108 00:36], David Bolen ( at wrote:
>Victor Stinner <victor.stinner at> writes:
>Well, I think the SYSV semaphores are either less limited or at least
>more adjustable.  They've certainly been around longer in FreeBSD.
>The POSIX semaphore support is not enabled by default in FreeBSD 7, so
>I added loader.conf stuff to load them (as part of issue7272).

It is enabled by default on FreeBSD 8 at least.
Looking through the repository it seems 7-STABLE has it enabled by default
as well in the GENERIC kernel (the standard one it boots with after its
first install). It seems this was added for 7.3 and onward. So 7.2 and
before need an "options P1003_1B_SEMAPHORES" added to their kernel at least.
The SYSV options are already present in the entire 7.x line.

>> It looks like it is possible to tune semaphore limits on FreeBSD, without 
>> recompiling the kernel, by using boot loader option (kern.ipc.sem* options). 
>> But asking FreeBSD users to tune their boot loader options to use the 
>> concurrent.futures module is not practical :-)

PostgreSQL installations via ports as well as its documentation instruct the
FreeBSD user to tweak kern.ipc settings.

>Yeah, I guess the key question is if changing the limit is just needed
>to get around an artifact of the test process (which I'm willing to do
>for the buildbot), or if it would be needed to be able to use the
>regular modules in practice.  If the latter, I doubt too many users
>are going to jump through such hoops, particularly if it needs a
>kernel rebuild, so we may need to make other choices in terms of
>support under FreeBSD.

Almost every FreeBSD user I know of compiles a new kernel. It's just one of
those BSD things that every user goes through.

>I'm also not entirely sure just what is the limiting factor.  I think
>the kern.ipc.sem* options are for the SYSV semaphores, not POSIX, though
>some of them do have a similar limit.  Some are adjustable by sysctl,
>others by loader.conf.

kern.ipc is about System V IPC. As you indicate later on, p1003_1b is the
POSIX related IPC sysctl tree.
The three semaphore settings semmni, semmns, and semmnu are only tweakable
via loader.conf.

>The references I found were talking about a limit set explicitly
>(#define SEM_MAX) in the kernel source (uipc_sem.c) which exports its
>value (at least in 7.2) via the sysctl p1003_1b.sem_nsems_max, which
>is read-only.  I got the impression they weren't adjustable even in
>loader.conf, but haven't actually tried it yet myself.
>It may be different in 8.x, but one email thread I found indicated
>that the changes proposed to make the POSIX limits adjustable didn't
>make the 8.1 cut (current release), though might make it in the next
>8.x release.

After checking the repository I saw that there were MFCs (Merge From
Current, backport) to 8-STABLE prior to the 8.1 release for dynamic

On my 8.1 machine:

nexus% sudo sysctl -w p1003_1b.sem_nsems_max=31
p1003_1b.sem_nsems_max: 30 -> 31

7.x is hardlocked at the moment unless someone manually edits the file to up
the SEM_MAX define. The same goes for FreeBSD 8.0.

Jeroen Ruigrok van der Werven <asmodai(-at-)> / asmodai
GPG: 2EAC625B
Nothing yet from nothing ever came...

From jcea at  Mon Nov  8 13:59:09 2010
From: jcea at (Jesus Cea)
Date: Mon, 08 Nov 2010 13:59:09 +0100
Subject: [Python-Dev] Help with warnings not being raised
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

On 05/11/10 13:55, Nick Coghlan wrote:
> Under -We, PyErr_Warn raises an exception rather than printing to
> stdout. That exception is clobbered by the immediately following call
> to PyErr_Clear.
> Since you *only* hit that branch under -We in the first place, a
> second call to PyErr_WriteUnraisable should get the error to actually
> print out.

Excellent explanation, Nick. Thanks.

Patched in r86317. Up-ported to upcoming pybsddb 5.1.1.

PS: is still down.

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From merwok at  Mon Nov  8 15:30:18 2010
From: merwok at (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Mon, 08 Nov 2010 15:30:18 +0100
Subject: [Python-Dev] r86276 -
In-Reply-To: <>
References: <>
Message-ID: <>

>> New Revision: 86276
>> Log:
>> Fix #10252 again (hopefully definitely).  Patch by Brian Curtin.
> It seems this and previous fixes should be backported to 2.7.

Certainly.  I was waiting on buildbot feedback before doing it.


From merwok at  Mon Nov  8 15:34:36 2010
From: merwok at (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Mon, 08 Nov 2010 15:34:36 +0100
Subject: [Python-Dev] r86276 -
In-Reply-To: <ib7rp5$50l$>
References: <>	<>
Message-ID: <>

>> It seems this and previous fixes should be backported to 2.7.
> Perhaps there should be a 'backport 2.7' keyword to check on issues that 
> might be but have not been.

The “Your issues” list is very helpful and works well for me.  This bug
is still open and assigned to me (and opened in my web browser,
incidentally), so I don’t fear I’ll forget it.  This new keyword would
IMO be redundant with existing fields (status:open + version:2.7).

(The once-existing 26backport was entirely different if I recall
correctly, it was used to tag 3.x features to be added to 2.x.)


From merwok at  Mon Nov  8 15:38:56 2010
From: merwok at (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Mon, 08 Nov 2010 15:38:56 +0100
Subject: [Python-Dev] [Python-checkins] r86300 - in
 python/branches/py3k: Misc/NEWS PC/winsound.c
In-Reply-To: <>
References: <>
Message-ID: <>

> Author: hirokazu.yamamoto
> New Revision: 86300
> Log:
> Issue #6317: Now winsound.PlaySound only accepts unicode with MvL's approval.
> Modified: python/branches/py3k/Misc/NEWS
> ==============================================================================
> --- python/branches/py3k/Misc/NEWS	(original)
> +++ python/branches/py3k/Misc/NEWS	Sun Nov  7 15:29:26 2010
> @@ -251,6 +251,8 @@
>  Extension Modules
>  -----------------
> +- Issue #6317: Now winsound.PlaySound only accepts unicode.
> +
>  - Issue #6317: Now winsound.PlaySound can accept non ascii filename.

Shouldn’t that be only one entry?


From exarkun at  Mon Nov  8 16:12:55 2010
From: exarkun at (exarkun at
Date: Mon, 08 Nov 2010 15:12:55 -0000
Subject: [Python-Dev] Backward incompatible API changes in the pydoc
In-Reply-To: <>
References: <>
Message-ID: <20101108151255.2040.168229257.divmod.xquotient.738@localhost.localdomain>

On 11:44 am, ncoghlan at wrote:
>I was about to commit the patch for issue 2001 (the improvements to
>the pydoc web server and the removal of the Tk GUI) when I realised
>that pydoc.serve() and pydoc.gui() are technically public standard
>library APIs (albeit undocumented ones).
>Currently the patch switches serve() to start the new server
>implementation and gui() to start the server and open a browser window
>for it.
>It occurred to me that, despite the "it's an application" feel to the
>pydoc web server APIs, it may be a better idea to leave the two
>existing functions alone (aside from adding DeprecationWarning), and
>using new private function names to start the new server and the web
>browser.
>Is following the standard deprecation procedure the better course
>here, or am I being overly paranoid?

Following the deprecation procedure here sounds awesome to me.  Thanks 
for considering it, I hope you'll choose to go that way.


From foom at  Mon Nov  8 16:34:50 2010
From: foom at (James Y Knight)
Date: Mon, 8 Nov 2010 10:34:50 -0500
Subject: [Python-Dev] Continuing 2.x
In-Reply-To: <>
References: <Act2RJ70W6LCt2gPS2e4wNV7jFYtAQ==>
Message-ID: <>

On Nov 8, 2010, at 4:42 AM, Lennart Regebro wrote:
> Except for making releases that start backporting Python 3 features
> and breaking backwards compatibility gradually (which may or may not
> be a good idea) I don't see the point. There isn't much to do when it
> comes to improving the language, and there is a moratorium anyway.
> Improvements in the standard library can be more easily done in
> external libraries anyway, and then you can release the improved
> libraries for everything from Python 2.4 and forwards if you like.
> So it can be done, but the question is "Why?"

To keep the batteries included?


From merwok at  Mon Nov  8 17:00:27 2010
From: merwok at (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Mon, 08 Nov 2010 17:00:27 +0100
Subject: [Python-Dev] [Python-checkins] r86264 -
In-Reply-To: <>
References: <>	<>	<>	<>
	<ib42sp$9go$> <>
Message-ID: <>

>>> It's rather a matter of agreeing when moving forward: IMO, mere style
>>> changes, code cleanup etc shouldn't be applied to the bug fix branches,
>>> as their only purpose is to provide bug fixes for existing users.
>> [Terry]
>> The omission of the deletion from the 5/5 revision was a bug in that
>> revision. If the removal of OS9 support was documented (announced),
>> which I presume it was, then one could consider any visible trace
>> remaining to be a bug.

FTR, it was documented in PEP 11 as removed in 2.4 (but not in 2.4’s

> Well, the question is: can anything break due to the code removal.
> In principle, stuff *could* break even by a function that is supposedly
> unused, and had supposedly been removed. The problem is that a
> supposedly-unused function actually might be used somewhere, in some
> context unrelated to its intended usage.

It’s known that people do modify distutils.sysconfig._config_vars, a
private dictionary; I can imagine some really contrived example of code
using _init_mac, the function I removed, to set sysconfig values for Mac
OS 9 in 2.7 code.  1% chance, I guess.

>> Perhaps the policy on code cleanup should be a bit more liberal for 2.7
>> *because* it will be maintained for several years and *because* there is
>> no newer 2.x branch to apply changes to.
> You mean, it's ok to break stuff with no gain in 2.7 bug fix releases?

I don’t think Terry was suggesting breakages, just other kinds of
cleanup.  In this particular case, I think now that I should have
followed distutils policy (which is less liberal than the rest of the
stdlib).  If there are no arguments against it this week, I will revert
the commit.


From merwok at  Mon Nov  8 17:02:24 2010
From: merwok at (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Mon, 08 Nov 2010 17:02:24 +0100
Subject: [Python-Dev] Backward incompatible API changes in the pydoc
In-Reply-To: <>
References: <>
Message-ID: <>

Hi Nick,

If there is no enormous difficulty in maintaining compatibility, I think
the usual deprecation process should be followed.  We don’t know who is
using pydoc as a library, so let’s play safe and not risk breaking their
code (especially considering that it must not have been easy to write
code extending pydoc :).

BTW, doesn’t the process start with PendingDeprecationWarnings, then
DeprecationWarnings?

From tseaver at  Mon Nov  8 17:15:01 2010
From: tseaver at (Tres Seaver)
Date: Mon, 08 Nov 2010 11:15:01 -0500
Subject: [Python-Dev] Continuing 2.x
In-Reply-To: <>
References: <Act2RJ70W6LCt2gPS2e4wNV7jFYtAQ==>	<>
Message-ID: <ib97m5$oqt$>

On 11/08/2010 04:42 AM, Lennart Regebro wrote:
> 2010/10/28 Kristján Valur Jónsson <kristjan at>:
>> Hello all.
>> So, python 2.7 is in bugfix only mode.  “trunk” is off limit.  So, where
>> does one make improvements to the distinguished, and still very much alive,
>> 2.x series of Python?
>> The answer would seem to be “one doesn’t”.  But must it be that way?
> Except for making releases that start backporting Python 3 features
> and breaking backwards compatibility gradually (which may or may not
> be a good idea) I don't see the point. There isn't much to do when it
> comes to improving the language, and there is a moratorium anyway.
> Improvements in the standard library can be more easily done in
> external libraries anyway, and then you can release the improved
> libraries for everything from Python 2.4 and forwards if you like.
> So it can be done, but the question is "Why?"

The OP has existing patches to contribute which the core python-dev team
consider "not-a-bugfix", and hence not acceptable for the 2.7 branch.

-- 
Tres Seaver          +1 540-429-0999          tseaver at
Palladion Software   "Excellence by Design"


From g.brandl at  Mon Nov  8 17:13:45 2010
From: g.brandl at (Georg Brandl)
Date: Mon, 08 Nov 2010 17:13:45 +0100
Subject: [Python-Dev] Backward incompatible API changes in the pydoc
In-Reply-To: <>
References: <>
Message-ID: <ib97m7$ou2$>

On 08.11.2010 17:02, Éric Araujo wrote:
> Hi Nick,
> If there is no enormous difficulty in maintaining compatibility, I think
> the usual deprecation process should be followed.  We don’t know who is
> using pydoc as a library, so let’s play safe and not risk breaking their
> code (especially considering that it must not have been easy to write
> code extending pydoc :).
> BTW, doesn’t the process start with PendingDeprecationWarnings, then
> DeprecationWarnings?

PendingDeprecationWarnings only make sense for larger changes, especially
now that both Pending and normal DeprecationWarnings are silent by default.
See PEP 387 (which is only a draft though).


From rdmurray at  Mon Nov  8 17:18:59 2010
From: rdmurray at (R. David Murray)
Date: Mon, 08 Nov 2010 11:18:59 -0500
Subject: [Python-Dev] Backward incompatible API changes in the pydoc
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, 08 Nov 2010 17:02:24 +0100, <merwok at> wrote:
> If there is no enormous difficulty in maintaining compatibility, I think
> the usual deprecation process should be followed.  We don’t know who is
> using pydoc as a library, so let’s play safe and not risk breaking their
> code (especially considering that it must not have been easy to write
> code extending pydoc :).
> BTW, doesn’t the process start with PendingDeprecationWarnings, then
> DeprecationWarnings?

No, PendingDeprecationWarning was something used when we wanted a
default-silent deprecation warning for a while before doing an
actual deprecation.  Now that deprecation warnings are silent
by default we'll probably never need PendingDeprecationWarning ever
again :)
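
As a minimal sketch of what "silent by default" means in practice: the
warning is still issued, but a filter has to be enabled before anyone sees
it. The function name old_api is hypothetical and only serves the
illustration:

```python
import warnings


def old_api():
    # Hypothetical deprecated function, for illustration only.
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


def call_and_collect(func):
    """Call func with all warnings enabled and return (result, messages)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")   # override the default filters
        result = func()
    return result, [str(w.message) for w in caught]


result, messages = call_and_collect(old_api)
```

Outside of code, running the interpreter with -W default (or -W error
during testing) should have a similar effect for a whole program.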


From belopolsky at  Mon Nov  8 18:20:23 2010
From: belopolsky at (Alexander Belopolsky)
Date: Mon, 8 Nov 2010 12:20:23 -0500
Subject: [Python-Dev] Breaking undocumented API
Message-ID: <>

Was: [issue2001] Pydoc interactive browsing enhancement

On Sun, Nov 7, 2010 at 9:17 AM, Nick Coghlan <report at> wrote:
> I'd actually started typing out the command to commit this before it finally clicked that the patch changes public
> APIs of the pydoc module in incompatible ways. Sure, they aren't documented, but the fact they aren't protected
> by an underscore means I'm not comfortable with the idea of removing them or radically change their functionality
> without going through a deprecation period first.

I have a similar issue with the trace module and would appreciate some
guidance on this as well.  The trace module documented API includes
just the Trace class, but the module defines several helper functions
and classes  that do not start with a leading underscore and are not
excluded from * imports by __all__.  (There is no trace.__all__.)
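
To make the point about * imports concrete, here is a small sketch using a
throwaway module. The module name demo_mod and its contents are made up;
only the Trace and find_strings names echo the trace module:

```python
import sys
import types


def star_import(source):
    """Build a throwaway module from source and list what `import *` exposes."""
    module = types.ModuleType("demo_mod")       # hypothetical module name
    exec(source, module.__dict__)
    sys.modules["demo_mod"] = module
    try:
        namespace = {}
        exec("from demo_mod import *", namespace)
    finally:
        del sys.modules["demo_mod"]
    namespace.pop("__builtins__", None)         # added by exec, not by the import
    return sorted(namespace)


SOURCE = (
    "class Trace: pass\n"
    "def find_strings(filename): pass\n"
    "def _private(): pass\n"
)

without_all = star_import(SOURCE)
with_all = star_import(SOURCE + "__all__ = ['Trace']\n")
```

Without __all__, every name lacking a leading underscore is exported; with
it, only the listed names are.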

I don't think a strict "don't remove without deprecation" policy is
workable.  For example, is the trace.rx_blank constant part of the trace
module API that needs to be preserved indefinitely?  I don't even know
if it is possible to add a deprecation warning to it, but
CoverageResults._blank_re would certainly be a better place for it.

The functions I have a specific need to modify are trace.find_strings() and
find_executable_linenos().  The functions take a module's file name, but
I need to make them take the module object in order to be able to
deal with modules that have custom loaders.
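
A rough sketch of why the module object helps: with it, the source can be
requested from the module's loader instead of assuming a plain file on
disk. This is not the trace module's code; get_module_source is a
hypothetical helper, and it relies on the __spec__ attribute found in
current Python versions:

```python
import json  # any pure-Python module works as an example


def get_module_source(module):
    """Return the source text of a module object, asking its loader first."""
    spec = getattr(module, "__spec__", None)
    loader = getattr(spec, "loader", None)
    if loader is not None and hasattr(loader, "get_source"):
        source = loader.get_source(module.__name__)
        if source is not None:
            return source
    # Fall back to reading the file named by __file__ (plain modules only).
    with open(module.__file__, encoding="utf-8") as f:
        return f.read()


source = get_module_source(json)
```

A function that only receives a file name has no way to reach the loader,
which is the limitation described above.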

The trace.find_strings() function is clearly internal.  Its name does
not even reflect what it does (finding docstring locations), so it was
never intended for use outside of the trace module.  However, google
code search reveals that people do use it and other functions in their

This suggests that trace.find_strings() should probably be preserved
or properly deprecated.  If this is the case, should we fix bugs in
it?  Note that it currently has a bug because it ignores the coding
cookie when opening python source file.  Should this be fixed?
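
For reference, a minimal sketch of reading a source file while honouring
its coding cookie, using tokenize.open() from current Python versions. The
helper name read_source and the sample file are made up for the
illustration; this is not the actual fix:

```python
import os
import tempfile
import tokenize


def read_source(path):
    """Read a Python source file using the encoding named by its coding cookie."""
    with tokenize.open(path) as f:   # detects the cookie or a BOM, else UTF-8
        return f.read()


# Write a small Latin-1 encoded file to demonstrate.
raw = b"# -*- coding: latin-1 -*-\nname = '\xe9'\n"
fd, path = tempfile.mkstemp(suffix=".py")
try:
    with os.fdopen(fd, "wb") as f:
        f.write(raw)
    text = read_source(path)
finally:
    os.remove(path)
```

Opening the same file with a hard-coded UTF-8 encoding would fail on the
0xE9 byte, which is the kind of bug described above.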

I freely admit that I have more questions than answers, so I would
like to hear from a wider audience.

From fuzzyman at  Mon Nov  8 18:35:56 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 08 Nov 2010 17:35:56 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

> Was: [issue2001] Pydoc interactive browsing enhancement
> [snip...]
> This suggests that trace.find_strings() should probably be preserved
> or properly deprecated.  If this is the case, should we fix bugs in
> it?  Note that it currently has a bug because it ignores the coding
> cookie when opening python source file.  Should this be fixed?
> I freely admit that I have more questions than answers, so I would
> like to hear from a wider audience.

If you deprecate it then you don't *have* to fix bugs in it. If we know 
it is used then we can't remove it without deprecation.

If the function is no longer needed but we want to exclude it from the 
public API, you could create a new function in the module, with a 
leading underscore name, fix the bugs in that and deprecate the old name.

Alternatively you could make the old name an alias for the new one with 
a deprecation warning applied. That way the old name does get the 
bugfixes but is still deprecated.




From fuzzyman at  Mon Nov  8 18:40:47 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 08 Nov 2010 17:40:47 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

>> Was: [issue2001] Pydoc interactive browsing enhancement
>> [snip...]
>> This suggests that trace.find_strings() should probably be preserved
>> or properly deprecated.  If this is the case, should we fix bugs in
>> it?  Note that it currently has a bug because it ignores the coding
>> cookie when opening python source file.  Should this be fixed?
>> I freely admit that I have more questions than answers, so I would
>> like to hear from a wider audience.
> If you deprecate it then you don't *have* to fix bugs in it. If we 
> know it is used then we can't remove it without deprecation.
> If the function is no longer needed but we want to exclude it from the 
> public API, you could create a new function in the module, with a 
> leading underscore name, fix the bugs in that and deprecate the old name.

Sorry, this was meant to say "if the function is *still needed* (internally 
to the module) but we want to exclude it from the API"...

This would be a good approach to clarifying the public API of standard 
library modules. At least that way we could work towards a consistent 

All the best,


> Alternatively you could make the old name an alias for the new one 
> with a deprecation warning applied. That way the old name does get the 
> bugfixes but is still deprecated.
> Michael



From solipsis at  Mon Nov  8 18:50:32 2010
From: solipsis at (Antoine Pitrou)
Date: Mon, 08 Nov 2010 18:50:32 +0100
Subject: [Python-Dev] Buildbot for AIX
In-Reply-To: <>
References: <>
	<>	<>
	<>	<>
Message-ID: <1289238632.3572.14.camel@localhost.localdomain>

On Monday 08 November 2010 at 18:46 +0100, Sébastien Sablé wrote:
> xlc: 1501-216 (W) command option - -qmaxmem=18000 is not recognized - 
> passed to ld

Is -qmaxmem really necessary to build Python?
If so, you could try passing it in CFLAGS.

> However running 2 different slaves per host in order to distinguish xlc 
> and gcc would be OK; though I would appreciate if they could run 
> sequentially rather than in parallel as that would limit the host load.

If there are two separate slaves, I can't think of any simple way to run
builds sequentially. Perhaps you can assign both of them to a single CPU
(assuming AIX allows that).



From ctb at  Mon Nov  8 18:57:28 2010
From: ctb at (C. Titus Brown)
Date: Mon, 8 Nov 2010 09:57:28 -0800
Subject: [Python-Dev] Buildbot for AIX
In-Reply-To: <1289238632.3572.14.camel@localhost.localdomain>
References: <>
Message-ID: <>

On Mon, Nov 08, 2010 at 06:50:32PM +0100, Antoine Pitrou wrote:
> > However running 2 different slaves per host in order to distinguish xlc 
> > and gcc would be OK; though I would appreciate if they could run 
> > sequentially rather than in parallel as that would limit the host load.
> If there are two separate slaves, I can't think of any simple way to run
> builds sequentially. Perhaps you can assign both of them to a single CPU
> (assuming AIX allows that).

You can specify a slave lock to do this, in buildbot:

One of the neat things that a master/slave system like buildbot provides...

C. Titus Brown, ctb at

From sable at  Mon Nov  8 18:46:26 2010
From: sable at (=?ISO-8859-1?Q?S=E9bastien_Sabl=E9?=)
Date: Mon, 08 Nov 2010 18:46:26 +0100
Subject: [Python-Dev] Buildbot for AIX
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
Message-ID: <>

Hi Antoine,

I tried to provide command lines arguments to configure instead of 
environment variables with:

configureFlags = ["--with-pydebug", "--without-computed-gotos", 
"CC=xlc", 'OPT="-O2 -qmaxmem=18000"']

But that would fail: on the slave, configure would run like that:
./configure --with-pydebug --without-computed-gotos CC=xlc OPT="-O2 

And the compilation would give some error like that:
	xlc -c  "-O2 -qmaxmem=18000" -O2 -O2  -I. -IInclude -I./Include 
-I/home/cis/.buildbot/support-buildbot/include/ncurses  -DPy_BUILD_CORE 
-o Modules/python.o ./Modules/python.c
xlc: 1501-216 (W) command option - -qmaxmem=18000 is not recognized - 
passed to ld
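
A possible explanation, sketched under the assumption that buildbot runs
the command from an argument list without going through a shell: the
double quotes inside the list element are then never stripped and become
part of the OPT value. The snippet only compares strings and does not run
configure:

```python
import shlex

# The flag exactly as written in the configureFlags list: the double quotes
# are part of the string, and no shell is involved to strip them.
list_element = 'OPT="-O2 -qmaxmem=18000"'

# What a shell would hand to ./configure for the same text.
shell_parsed = shlex.split('CC=xlc OPT="-O2 -qmaxmem=18000"')

# Dropping the inner quotes gives the list form equivalent to the shell form.
fixed_element = 'OPT=-O2 -qmaxmem=18000'
```

If that assumption holds, writing the list element without the inner
quotes may be enough to keep -qmaxmem as a separate compiler option.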

However running 2 different slaves per host in order to distinguish xlc 
and gcc would be OK; though I would appreciate if they could run 
sequentially rather than in parallel as that would limit the host load.


Sébastien Sablé

On 28/10/2010 16:45, Antoine Pitrou wrote:
> On Fri, 15 Oct 2010 17:38:47 +0200
> Sébastien Sablé <sable at> wrote:
>> Could you please take a look at those modifications in master.cfg,
>> provide me some password for the bot slaves and apply the corrections in
>> those issues?
> About the master.cfg modifications: there should be no need for
> separate environment variables. Instead, you should be able to specify
> them as command-line arguments to ./configure, e.g.:
> ["--with-pydebug", "--without-computed-gotos", "CC=xlc",
>   'OPT="-O2 -qmaxmem=18000"']
> Can you check this works for you?
> Also, there's no need to complicate the buildbot naming procedure.
> You should be able to run several buildslaves on a single machine,
> provided we give you separate credentials: one per compiler type.
> Regards
> Antoine.

From exarkun at  Mon Nov  8 19:07:24 2010
From: exarkun at (exarkun at
Date: Mon, 08 Nov 2010 18:07:24 -0000
Subject: [Python-Dev] Buildbot for AIX
In-Reply-To: <1289238632.3572.14.camel@localhost.localdomain>
References: <>
Message-ID: <20101108180724.2040.587669116.divmod.xquotient.740@localhost.localdomain>

On 05:50 pm, solipsis at wrote:
>On Monday 08 November 2010 at 18:46 +0100, Sébastien Sablé wrote:
>>xlc: 1501-216 (W) command option - -qmaxmem=18000 is not recognized -
>>passed to ld
>Is -qmaxmem really necessary to build Python?
>If so, you could try passing it in CFLAGS.
>>However running 2 different slaves per host in order to distinguish xlc
>>and gcc would be OK; though I would appreciate if they could run
>>sequentially rather than in parallel as that would limit the host load.
>If there are two separate slaves, I can't think of any simple way to run
>builds sequentially. Perhaps you can assign both of them to a single CPU
>(assuming AIX allows that).

A master lock will allow this.  Although just having a single slave and 
using a slave lock would be simpler.


From rrr at  Mon Nov  8 19:07:47 2010
From: rrr at (Ron Adam)
Date: Mon, 08 Nov 2010 12:07:47 -0600
Subject: [Python-Dev] Backward incompatible API changes in the pydoc
In-Reply-To: <20101108151255.2040.168229257.divmod.xquotient.738@localhost.localdomain>
References: <>
Message-ID: <>

On 11/08/2010 09:12 AM, exarkun at wrote:
> On 11:44 am, ncoghlan at wrote:
>> All,
>> I was about to commit the patch for issue 2001 (the improvements to
>> the pydoc web server and the removal of the Tk GUI) when I realised
>> that pydoc.serve() and pydoc.gui() are technically public standard
>> library APIs (albeit undocumented ones).
>> Currently the patch switches serve() to start the new server
>> implementation and gui() to start the server and open a browser window
>> for it.
>> It occurred to me that, despite the "it's an application" feel to the
>> pydoc web server APIs, it may be a better idea to leave the two
>> existing functions alone (aside from adding DeprecationWarning), and
>> using new private function names to start the new server and the web
>> browser.
>> Is following the standard deprecation procedure the better course
>> here, or am I being overly paranoid?
> Following the deprecation procedure here sounds awesome to me. Thanks
> for considering it, I hope you'll choose to go that way.

I want to be clear on what isn't changing.

All of the help() function features that Python depends on, and any code
required for them, are staying the same.

All of the static HTML document-generating features, and the code they
depend on, are staying the same.  These static pages do not depend on any
part of pydoc after they are generated.

Those are the parts that are most likely to be used in other applications 
as well.

The new changes only affect the interactive browsing mode.  The Tk search
box was removed, which allows the browser interface to be used on systems
that don't have Tk installed.

The html web server was rewritten and a search feature was added so that 
you can do the same searches in the web browser that you did in the tk 
search box.

Do you (or anyone) know of any programs that access pydoc's Tk search
window, or its server parts, directly?  The server was tightly tied to
pydoc and included very specific pydoc HTML code, so it would have been
very difficult to use it for anything else.  Any thoughts?

I think the main issue Nick is concerned with is the functions and options
used to start pydoc in the interactive mode.


From tjreedy at  Mon Nov  8 19:12:18 2010
From: tjreedy at (Terry Reedy)
Date: Mon, 08 Nov 2010 13:12:18 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <ib9ehu$m1$>

On 11/8/2010 12:20 PM, Alexander Belopolsky wrote:
> Was: [issue2001] Pydoc interactive browsing enhancement
> On Sun, Nov 7, 2010 at 9:17 AM, Nick Coghlan<report at>  wrote:
> ..
>> I'd actually started typing out the command to commit this before it finally clicked that the patch changes public
>> APIs of the pydoc module in incompatible ways. Sure, they aren't documented, but the fact they aren't protected
>> by an underscore means I'm not comfortable with the idea of removing them or radically changing their functionality
>> without going through a deprecation period first.
> I have a similar issue with the trace module and would appreciate some
> guidance on this as well.  The trace module documented API includes
> just the Trace class, but the module defines several helper functions
> and classes  that do not start with a leading underscore and are not
> excluded from * imports by __all__.  (There is no trace.__all__.)

The trace module *appears* to be an ancient module written at a time 
(fictional or actual) when there was no '_' and '__all__' convention and 
only a loose 'public' == documented convention. The undocumented 
public-looking private stuff is a huge mess that Eli and I intentionally 
passed over in our July/August patch documenting (and fixing) the public 
stuff. I hope we included everything that should be public.

In order to warn about constants getting renamed or moved, is it 
possible to issue an off-by-default warning on module import, something like
"Trace is an ancient module with public names for many undocumented 
private constants and functions. Use of these is deprecated. See lib doc 
for more."

Terry Jan Reedy

From belopolsky at  Mon Nov  8 19:35:39 2010
From: belopolsky at (Alexander Belopolsky)
Date: Mon, 8 Nov 2010 13:35:39 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 12:35 PM, Michael Foord
<fuzzyman at> wrote:
> If you deprecate it then you don't *have* to fix bugs in it. If we know it
> is used then we can't remove it without deprecation.

What about the maintenance branch?

From fuzzyman at  Mon Nov  8 19:39:26 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 08 Nov 2010 18:39:26 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>
Message-ID: <>

> On Mon, Nov 8, 2010 at 12:35 PM, Michael Foord
> <fuzzyman at>  wrote:
> ..
>> If you deprecate it then you don't *have* to fix bugs in it. If we know it
>> is used then we can't remove it without deprecation.
> What about the maintenance branch?

So you have a bug in the module that can only be fixed in a function you
want to deprecate?

It depends what approach you are taking in 3.2. If you are creating a 
new private function, in which you will fix the bug, but keeping an 
alias around to the old name so that you can deprecate it - then merely 
fixing the bug in the maintenance branch should be fine.

(If you're deprecating the function because it is unneeded then you 
don't need to fix bugs in the maintenance branch either - I guess no-one 
would complain if you did though.)



READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From belopolsky at  Mon Nov  8 19:44:22 2010
From: belopolsky at (Alexander Belopolsky)
Date: Mon, 8 Nov 2010 13:44:22 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 1:39 PM, Michael Foord <fuzzyman at> wrote:
> So you have a bug in the module that can only be fixed in a function you
> want to deprecate?
No, I have a bug in a function that I want to deprecate.  You said I
don't need to fix it if I add a deprecation warning.  However, as far
as I know, deprecation warnings are not backported  to maintenance
branches while bug fixes are.  So the specific question is: there is a
bug in trace.find_strings() - should it be fixed in 3.1-maint?

From fuzzyman at  Mon Nov  8 19:46:39 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 08 Nov 2010 18:46:39 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

> On Mon, Nov 8, 2010 at 1:39 PM, Michael Foord<fuzzyman at>  wrote:
> ..
>> So you have a bug in the module that can only be fixed in a function you
>> want to deprecate?
> No, I have a bug in a function that I want to deprecate.  You said I
> don't need to fix it if I add a deprecation warning.  However, as far
> as I know, deprecation warnings are not backported  to maintenance
> branches while bug fixes are.  So the specific question is: there is a
> bug in trace.find_strings() - should it be fixed in 3.1-maint?

My opinion would be:

* No, we don't backport the deprecation warning
* No, we don't need to fix the bug

Others may disagree. (The logic being that we won't fix the bug in 3.2; if
we fixed it in 2.7 then we would have to fix it in 3.2 as well. Therefore
we shouldn't fix it in 2.7.)




From brett at  Mon Nov  8 20:00:45 2010
From: brett at (Brett Cannon)
Date: Mon, 8 Nov 2010 11:00:45 -0800
Subject: [Python-Dev] GUI test runner tool
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 04:09, Michael Foord <michael at> wrote:
> Hello all,
> Now that unittest has test discovery, Mark Roddy has been working on
> resurrecting the old GUI test runner (using Tkinter):
> This was part of the original pyunit project but I believe it was never part
> of the standard library:
> Here's a screenshot of what it looks like:
> I'd like to propose adding it to Python in Tools/ and am volunteering to
> maintain it.

Does that mean upgrading it as well? =) For instance it would be great
to get it to use ttk so it looks a bit sharper, supports skipped tests
and expected failures, and dream-of-dreams ties into regrtest so you
can just check boxes instead of passing a ton of CLI flags.

> If the answer is "not yet" that is fine as it can go into
> unittest2 first. Mark has updated it to work with test discovery and added
> support for configuring test discovery in the same way as you can from the
> command line. It is a nice tool for those new to writing tests who aren't
> yet familiar with the command line or prefer a GUI.

I personally have no problem with it going into tools as long as it
can also be used to run the tests in the stdlib. Just don't put it in
Demos/ . =)


> In its basic form you simply pick a directory and unittestgui will discover
> and run all the tests it finds. It would be nice if it provided more
> diagnostic information on tests it ran (clicking through test results) but
> these can be added later.
> All the best,
> Michael Foord

From alexander.belopolsky at  Mon Nov  8 20:28:25 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Mon, 8 Nov 2010 14:28:25 -0500
Subject: [Python-Dev] GUI test runner tool
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 7:09 AM, Michael Foord <michael at> wrote:
> I'd like to propose adding [unittestgui] to Python in Tools/ and am volunteering to
> maintain it.

Why not add it under Lib/unittest/?  I think Tools/ is a less
attractive location for most users than, say, PyPI or some other package
repository.  Tools/ is for stuff that is primarily of interest to
Python developers, not Python users.  OS vendors are less likely to
install packages from Tools/ in a user-visible place than they are to
install a popular third-party package.

From brett at  Mon Nov  8 20:58:25 2010
From: brett at (Brett Cannon)
Date: Mon, 8 Nov 2010 11:58:25 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 09:20, Alexander Belopolsky
<belopolsky at> wrote:
> Was: [issue2001] Pydoc interactive browsing enhancement
> On Sun, Nov 7, 2010 at 9:17 AM, Nick Coghlan <report at> wrote:
> ..
>> I'd actually started typing out the command to commit this before it finally clicked that the patch changes public
>> APIs of the pydoc module in incompatible ways. Sure, they aren't documented, but the fact they aren't protected
>> by an underscore means I'm not comfortable with the idea of removing them or radically changing their functionality
>> without going through a deprecation period first.
> I have a similar issue with the trace module and would appreciate some
> guidance on this as well.  The trace module documented API includes
> just the Trace class, but the module defines several helper functions
> and classes that do not start with a leading underscore and are not
> excluded from * imports by __all__.  (There is no trace.__all__.)

I think we need to, as a group, decide how to handle undocumented APIs
that don't have a leading underscore: do they get treated just the same
as the documented APIs, or are they private regardless and thus we can
change them at our whim?

> I don't think a strict don't remove without deprecation policy is
> workable.  For example, is trace.rx_blank constant part of the trace
> module API that needs to be preserved indefinitely?  I don't even know
> if it is possible to add a deprecation warning to it, but
> CoverageResults._blank_re would certainly be a better place for it.

The deprecation policy obviously cannot apply to module-level attributes.

> The functions I have specific need to modify (See
> are trace.find_strings(), and
> find_executable_linenos().  The functions take the module's file name, but
> I need to make them take the module object in order to be able to
> deal with modules that have custom loaders.
> The trace.find_strings() function is clearly internal.  Its name does
> not even reflect what it does (finding docstring locations), so it was
> never intended for use outside of the trace module.  However, Google
> Code Search reveals that people do use it and other functions in their
> code.
> This suggests that trace.find_strings() should probably be preserved
> or properly deprecated.  If this is the case, should we fix bugs in
> it?  Note that it currently has a bug because it ignores the coding
> cookie when opening a Python source file.  Should this be fixed?
> I freely admit that I have more questions than answers, so I would
> like to hear from a wider audience.

The main reason I have said that non-underscore names should be
properly deprecated (assuming they are not contained in an
underscored-named module) is that dir() and help() do not distinguish.
If you are perusing a module from the interpreter prompt you have no
way to know whether something is public or private if it lacks an
underscore. Is it reasonable to assume that any API found through
dir() or help() must be checked with the official docs before you can
consider using it, even if you have no explicit need to read the
official docs?
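To make the ambiguity concrete, here is a rough sketch of which names a
"from module import *" would export: the contents of __all__ when it is
defined, otherwise every name without a leading underscore.  The helper
star_import_names and the demo module are illustrative assumptions, not
part of any stdlib API.

```python
import types


def star_import_names(module):
    """Return the names that `from module import *` would bind."""
    if hasattr(module, "__all__"):
        return sorted(module.__all__)
    # Without __all__, every name lacking a leading underscore is exported.
    return sorted(name for name in vars(module) if not name.startswith("_"))


# A stand-in module built by hand for the demonstration.
demo = types.ModuleType("demo")
demo.Trace = object        # plays the role of the documented class
demo.find_strings = len    # undocumented helper that still looks public
demo._cache = {}           # clearly private

print(star_import_names(demo))  # ['Trace', 'find_strings']
```

Nothing in that output tells the reader that only one of the two names is
documented, which is the problem described above.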

I (unfortunately) say no, which is why I have argued that
non-underscored names need to be properly deprecated. This obviously
places a nasty burden on us, though, so I don't like taking this
position. Unless we can make it clearly known through help() or
something that the official docs must be checked to know what can and
cannot be reliably used I don't think it is reasonable to force users
to not be able to rely on help() (we should probably change help() to
print a big disclaimer for anything with a leading underscore,
though).

But that doesn't mean we can't go through, fix up our names, and
deprecate the old public names; that's fair game in my book.

From raymond.hettinger at  Mon Nov  8 21:56:22 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Mon, 8 Nov 2010 12:56:22 -0800
Subject: [Python-Dev] GUI test runner tool
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 8, 2010, at 11:28 AM, Alexander Belopolsky wrote:

> On Mon, Nov 8, 2010 at 7:09 AM, Michael Foord <michael at> wrote:
> ..
>> I'd like to propose adding [unittestgui] to Python in Tools/ and am volunteering to
>> maintain it.
> Why not add it under Lib/unittest/?

Michael's instinct to put it in Tools is a good one.
GUI preferences and support vary among users and environments.
Also, any given GUI runner is just one of many possible solutions
and there is no reason to commit to one.  Better to add something
to Tools, post an ASPN recipe, or list a package on PyPI.

If you need it to be more visible, we can always give it a mention
in the docs, though we might also want to mention more full-featured
tools like Hudson.

Remember, the standard library is where code goes to die ;-)


From exarkun at  Mon Nov  8 22:03:18 2010
From: exarkun at (exarkun at
Date: Mon, 08 Nov 2010 21:03:18 -0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain>

On 07:58 pm, brett at wrote:
>>I don't think a strict don't remove without deprecation policy is
>>workable.  For example, is trace.rx_blank constant part of the trace
>>module API that needs to be preserved indefinitely?  I don't even know
>>if it is possible to add a deprecation warning to it, but
>>CoverageResults._blank_re would certainly be a better place for it.
>The deprecation policy obviously cannot apply to module-level
>attributes.

I'm not sure why this is.  Can you elaborate?

>The main reason I have said that non-underscore names should be
>properly deprecated (assuming they are not contained in an
>underscored-named module) is that dir() and help() do not distinguish.
>If you are perusing a module from the interpreter prompt you have no
>way to know whether something is public or private if it lacks an
>underscore. Is it reasonable to assume that any API found through
>dir() or help() must be checked with the official docs before you can
>consider using it, even if you have no explicit need to read the
>official docs?
>I (unfortunately) say no, which is why I have argued that
>non-underscored names need to be properly deprecated. This obviously
>places a nasty burden on us, though, so I don't like taking this
>position. Unless we can make it clearly known through help() or
>something that the official docs must be checked to know what can and
>cannot be reliably used I don't think it is reasonable to force users
>to not be able to rely on help() (we should probably change help() to
>print a big disclaimer for anything with a leading underscore,
>though).
>But that doesn't mean we can't go through, fix up our names, and
>deprecate the old public names; that's fair game in my book.

+1

Jean-Paul



From at  Mon Nov  8 22:11:36 2010
From: at (David Bolen)
Date: Mon, 08 Nov 2010 16:11:36 -0500
Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2
	3.x" buildbot
References: <>
Message-ID: <>

Jeroen Ruigrok van der Werven <asmodai at> writes:

> -On [20101108 00:36], David Bolen ( at wrote:
>>Well, I think the SYSV semaphores are either less limited or at least
>>more adjustable.  They've certainly been around longer in FreeBSD.
>>The POSIX semaphore support is not enabled by default in FreeBSD 7, so
>>I added loader.conf stuff to load them (as part of issue7272).
> It is enabled by default on FreeBSD 8 at least.
> Looking through the repository it seems 7-STABLE has it enabled by default
> as well in the GENERIC kernel (the standard one it boots with after its
> first install). It seems this was added for 7.3 and onward. So 7.2 and
> before need an "options P1003_1B_SEMAPHORES" added to their kernel at least.
> The SYSV options are already present in the entire 7.x line.

My use of "enabled" may not have been the best word choice since I
didn't mean to imply a kernel option.

I'm still using GENERIC on the 7.2 buildbot, so I didn't need to
recompile the kernel in that release either.  The issue was that the
POSIX semaphore module wasn't loaded by default (something I thought
only changed in 8.x), so the buildbot currently has a 'sem_load="YES"'
loader.conf entry to ensure that's done.

-- David

From brett at  Mon Nov  8 22:25:58 2010
From: brett at (Brett Cannon)
Date: Mon, 8 Nov 2010 13:25:58 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 13:03,  <exarkun at> wrote:
> On 07:58 pm, brett at wrote:
>>> I don't think a strict don't remove without deprecation policy is
>>> workable.  For example, is trace.rx_blank constant part of the trace
>>> module API that needs to be preserved indefinitely?  I don't even know
>>> if it is possible to add a deprecation warning to it, but
>>> CoverageResults._blank_re would certainly be a better place for it.
>> The deprecation policy obviously cannot apply to module-level attributes.
> I'm not sure why this is.  Can you elaborate?

There is no way to directly trigger a DeprecationWarning for an
attribute. We can still document it, but there is just no way to
programmatically enforce it.
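Attribute access on a plain module has no hook of its own, but one possible
workaround is to put a proxy object in front of the module whose
__getattr__ emits the warning.  The following is only a sketch of that
idea: the class name DeprecatingModule, the legacy module and its rx_blank
value are made up for illustration, and this is not claimed to be how any
particular library implements it.

```python
import types
import warnings


class DeprecatingModule(types.ModuleType):
    """Proxy that forwards to a real module and warns on chosen names."""

    def __init__(self, wrapped, deprecated):
        super().__init__(wrapped.__name__, wrapped.__doc__)
        # Store state directly in __dict__ so that looking it up never
        # falls through to __getattr__ below.
        self.__dict__["_wrapped"] = wrapped
        self.__dict__["_deprecated"] = deprecated

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for names that live
        # on the wrapped module rather than on the proxy itself.
        if name in self._deprecated:
            warnings.warn(
                "{}.{} is deprecated: {}".format(
                    self.__name__, name, self._deprecated[name]),
                DeprecationWarning,
                stacklevel=2,
            )
        return getattr(self._wrapped, name)


# Hand-built stand-in for a module with a public-looking constant.
legacy = types.ModuleType("legacy")
legacy.rx_blank = r"^\s*(#.*)?$"
proxy = DeprecatingModule(legacy, {"rx_blank": "it is an internal detail"})
```

In real use the proxy would have to be installed in sys.modules under the
module's name so that importers receive it instead of the plain module.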


>> The main reason I have said that non-underscore names should be
>> properly deprecated (assuming they are not contained in an
>> underscored-named module) is that dir() and help() do not distinguish.
>> If you are perusing a module from the interpreter prompt you have no
>> way to know whether something is public or private if it lacks an
>> underscore. Is it reasonable to assume that any API found through
>> dir() or help() must be checked with the official docs before you can
>> consider using it, even if you have no explicit need to read the
>> official docs?
>> I (unfortunately) say no, which is why I have argued that
>> non-underscored names need to be properly deprecated. This obviously
>> places a nasty burden on us, though, so I don't like taking this
>> position. Unless we can make it clearly known through help() or
>> something that the official docs must be checked to know what can and
>> cannot be reliably used I don't think it is reasonable to force users
>> to not be able to rely on help() (we should probably change help() to
>> print a big disclaimer for anything with a leading underscore,
>> though).
>> But that doesn't mean we can't go through, fix up our names, and
>> deprecate the old public names; that's fair game in my book.
> +1
> Jean-Paul

From rrr at  Mon Nov  8 22:36:25 2010
From: rrr at (Ron Adam)
Date: Mon, 08 Nov 2010 15:36:25 -0600
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/08/2010 01:58 PM, Brett Cannon wrote:
> On Mon, Nov 8, 2010 at 09:20, Alexander Belopolsky
> <belopolsky at>  wrote:
>> Was: [issue2001] Pydoc interactive browsing enhancement
>> On Sun, Nov 7, 2010 at 9:17 AM, Nick Coghlan<report at>  wrote:
>> ..
>>> I'd actually started typing out the command to commit this before it finally clicked that the patch changes public
>>> APIs of the pydoc module in incompatible ways. Sure, they aren't documented, but the fact they aren't protected
>>> by an underscore means I'm not comfortable with the idea of removing them or radically changing their functionality
>>> without going through a deprecation period first.
>> I have a similar issue with the trace module and would appreciate some
>> guidance on this as well.  The trace module documented API includes
>> just the Trace class, but the module defines several helper functions
>> and classes  that do not start with a leading underscore and are not
>> excluded from * imports by __all__.  (There is no trace.__all__.)
> I think we need to, as a group, decide how to handle undocumented APIs
> that don't have a leading underscore: they get treated just the same
> as the documented APIs, or are they private regardless and thus we can
> change them at our whim?

My understanding is that anything with an actual docstring is part of the
public API.  Anything with a leading underscore is private.

And to a lesser extent, objects without docstrings, which have comments
instead or nothing at all, may change, so don't depend on them.
Thankfully most things do have docstrings.

>> I freely admit that I have more questions than answers, so I would
>> like to hear from a wider audience.
> The main reason I have said that non-underscore names should be
> properly deprecated (assuming they are not contained in an
> underscored-named module) is that dir() and help() do not distinguish.
> If you are perusing a module from the interpreter prompt you have no
> way to know whether something is public or private if it lacks an
> underscore. Is it reasonable to assume that any API found through
> dir() or help() must be checked with the official docs before you can
> consider using it, even if you have no explicit need to read the
> official docs?
> I (unfortunately) say no, which is why I have argued that
> non-underscored names need to be properly deprecated. This obviously
> places a nasty burden on us, though, so I don't like taking this
> position. Unless we can make it clearly known through help() or
> something that the official docs must be checked to know what can and
> cannot be reliably used I don't think it is reasonable to force users
> to not be able to rely on help() (we should probably change help() to
> print a big disclaimer for anything with a leading underscore,
> though).

+1 on the help disclaimer for objects with leading underscores.

Currently help() does not see comments when they are used in place of a 
docstring.  I think it would be easy to have help notate things with no 
docstrings as "Warning: Undocumented <object>. Use at your own risk."
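A rough sketch of that check, written as a standalone function rather than
as a change to pydoc itself.  The name describe and the two sample
functions are hypothetical; only the warning text comes from the
suggestion above.

```python
def describe(obj):
    """Return obj's docstring, or a warning if it has none."""
    doc = getattr(obj, "__doc__", None)
    if doc:
        return doc
    # Fill the <object> slot of the suggested message with the type name.
    kind = type(obj).__name__
    return "Warning: Undocumented {}. Use at your own risk.".format(kind)


def documented():
    """Does something useful."""


def undocumented():
    pass
```

Here describe(documented) returns the docstring unchanged, while
describe(undocumented) returns the warning text with "function" filled in.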

At first, it would probably have the nice side effect of getting any public
APIs documented with docstrings (if they aren't already).

> But that doesn't mean we can't go through, fix up our names, and
> deprecate the old public names; that's fair game in my book.

I agree.

It may also be useful to clarify that importing some "utility" modules is 
not recommended because they may be changed more often and may not follow 
the standard process.  Would something like the following work, but still 
allow for importing if the exception is caught with a try except?

if __name__ != "__main__":
    # True only when the module is imported rather than run as a script.
    raise ImportWarning("This is a utility module and may be changed.")


From exarkun at  Mon Nov  8 22:45:12 2010
From: exarkun at (exarkun at
Date: Mon, 08 Nov 2010 21:45:12 -0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <20101108214512.2040.865703760.divmod.xquotient.763@localhost.localdomain>

On 09:25 pm, brett at wrote:
>On Mon, Nov 8, 2010 at 13:03,  <exarkun at> wrote:
>>On 07:58 pm, brett at wrote:
>>>>I don't think a strict don't remove without deprecation policy is
>>>>workable.  For example, is trace.rx_blank constant part of the trace
>>>>module API that needs to be preserved indefinitely?  I don't even know
>>>>if it is possible to add a deprecation warning to it, but
>>>>CoverageResults._blank_re would certainly be a better place for it.
>>>The deprecation policy obviously cannot apply to module-level attributes.
>>I'm not sure why this is.  Can you elaborate?
>There is no way to directly trigger a DeprecationWarning for an
>attribute. We can still document it, but there is just no way to
>programmatically enforce it.

What about `deprecatedModuleAttribute` 
or zope.deprecation 
(<>) which inspired 


From rrr at  Mon Nov  8 22:36:25 2010
From: rrr at (Ron Adam)
Date: Mon, 08 Nov 2010 15:36:25 -0600
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/08/2010 01:58 PM, Brett Cannon wrote:
> On Mon, Nov 8, 2010 at 09:20, Alexander Belopolsky
> <belopolsky at>  wrote:
>> Was: [issue2001] Pydoc interactive browsing enhancement
>> On Sun, Nov 7, 2010 at 9:17 AM, Nick Coghlan<report at>  wrote:
>> ..
>>> I'd actually started typing out the command to commit this before it finally clicked that the patch changes public
>>> APIs of the pydoc module in incompatible ways. Sure, they aren't documented, but the fact they aren't protected
>>> by an underscore means I'm not comfortable with the idea of removing them or radically change their functionality
>>> without going through a deprecation period first.
>> I have a similar issue with the trace module and would appreciate some
>> guidance on this as well.  The trace module documented API includes
>> just the Trace class, but the module defines several helper functions
>> and classes  that do not start with a leading underscore and are not
>> excluded from * imports by __all__.  (There is no trace.__all__.)
> I think we need to, as a group, decide how to handle undocumented APIs
> that don't have a leading underscore: they get treated just the same
> as the documented APIs, or are they private regardless and thus we can
> change them at our whim?

My understanding is that anything with an actual docstring is part of the 
public API.  Any thing with a leading underscore is private.

And to a lesser extent, objects with out docstrings, but have comments 
instead or nothing, may change, so don't depend on them.  Thankfully most 
things do have docstrings.

>> I freely admit that I have more questions than answers, so I would
>> like to hear from a wider audience.
> The main reason I have said that non-underscore names should be
> properly deprecated (assuming they are not contained in an
> underscored-named module) is that dir() and help() do not distinguish.
> If you are perusing a module from the interpreter prompt you have no
> way to know whether something is public or private if it lacks an
> underscore. Is it reasonable to assume that any API found through
> dir() or help() must be checked with the official docs before you can
> consider using it, even if you have no explicit need to read the
> official docs?
> I (unfortunately) say no, which is why I have argued that
> non-underscored names need to be properly deprecated. This obviously
> places a nasty burden on us, though, so I don't like taking this
> position. Unless we can make it clearly known through help() or
> something that the official docs must be checked to know what can and
> cannot be reliably used I don't think it is reasonable to force users
> to not be able to rely on help() (we should probably change help() to
> print a big disclaimer for anything with a leading underscore,
> though).

+1 on the help disclaimer for objects with leading underscores.

Currently help() does not see comments when they are used in place of a 
docstring.  I think it would be easy to have help notate things with no 
docstrings as "Warning: Undocumented <object>. Use at your own risk."

At first, it would probably have the nice side effect of getting any public 
APIs documented with docstrings (if they aren't already).

> But that doesn't mean we can't go through, fix up our names, and
> deprecate the old public names; that's fair game in my book.

I agree.

It may also be useful to clarify that importing some "utility" modules is 
not recommended because they may be changed more often and may not follow 
the standard process.  Would something like the following work, but still 
allow for importing if the exception is caught with a try except?

if __name__ == "__main__":
     main()
else:
     raise ImportWarning("This is a utility module and may be changed.")


From brett at  Mon Nov  8 22:57:43 2010
From: brett at (Brett Cannon)
Date: Mon, 8 Nov 2010 13:57:43 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <20101108214512.2040.865703760.divmod.xquotient.763@localhost.localdomain>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 13:45,  <exarkun at> wrote:
> On 09:25 pm, brett at wrote:
>> On Mon, Nov 8, 2010 at 13:03,  <exarkun at> wrote:
>>> On 07:58 pm, brett at wrote:
>>>>> I don't think a strict don't remove without deprecation policy is
>>>>> workable.  For example, is trace.rx_blank constant part of the trace
>>>>> module API that needs to be preserved indefinitely?  I don't even know
>>>>> if it is possible to add a deprecation warning to it, but
>>>>> CoverageResults._blank_re would certainly be a better place for it.
>>>> The deprecation policy obviously cannot apply to module-level
>>>> attributes.
>>> I'm not sure why this is.  Can you elaborate?
>> There is no way to directly trigger a DeprecationWarning for an
>> attribute. We can still document it, but there is just no way to
>> programmatically enforce it.
> What about `deprecatedModuleAttribute`
> (<>)
> or zope.deprecation
> (<>) which inspired it?

Just checked the code and it looks like it replaces the module with
some proxy object? To begin with, that breaks subclass checks. After that I
don't know the ramifications without really digging into the
ModuleType code.

From brett at  Mon Nov  8 23:01:18 2010
From: brett at (Brett Cannon)
Date: Mon, 8 Nov 2010 14:01:18 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 13:36, Ron Adam <rrr at> wrote:
> On 11/08/2010 01:58 PM, Brett Cannon wrote:
>> On Mon, Nov 8, 2010 at 09:20, Alexander Belopolsky
>> <belopolsky at>  wrote:
>>> Was: [issue2001] Pydoc interactive browsing enhancement
>>> On Sun, Nov 7, 2010 at 9:17 AM, Nick Coghlan<report at>
>>>  wrote:
>>> ..
>>>> I'd actually started typing out the command to commit this before it
>>>> finally clicked that the patch changes public
>>>> APIs of the pydoc module in incompatible ways. Sure, they aren't
>>>> documented, but the fact they aren't protected
>>>> by an underscore means I'm not comfortable with the idea of removing
>>>> them or radically change their functionality
>>>> without going through a deprecation period first.
>>> I have a similar issue with the trace module and would appreciate some
>>> guidance on this as well.  The trace module documented API includes
>>> just the Trace class, but the module defines several helper functions
>>> and classes  that do not start with a leading underscore and are not
>>> excluded from * imports by __all__.  (There is no trace.__all__.)
>> I think we need to, as a group, decide how to handle undocumented APIs
>> that don't have a leading underscore: they get treated just the same
>> as the documented APIs, or are they private regardless and thus we can
>> change them at our whim?
> My understanding is that anything with an actual docstring is part of the
> public API.  Any thing with a leading underscore is private.

That's a bad rule. Why shouldn't I be able to document something that
is not meant for the public so that fellow developers know what the
heck should be going on in the code?

> And to a lesser extent, objects with out docstrings, but have comments
> instead or nothing, may change, so don't depend on them.  Thankfully most
> things do have docstrings.
>>> I freely admit that I have more questions than answers, so I would
>>> like to hear from a wider audience.
>> The main reason I have said that non-underscore names should be
>> properly deprecated (assuming they are not contained in an
>> underscored-named module) is that dir() and help() do not distinguish.
>> If you are perusing a module from the interpreter prompt you have no
>> way to know whether something is public or private if it lacks an
>> underscore. Is it reasonable to assume that any API found through
>> dir() or help() must be checked with the official docs before you can
>> consider using it, even if you have no explicit need to read the
>> official docs?
>> I (unfortunately) say no, which is why I have argued that
>> non-underscored names need to be properly deprecated. This obviously
>> places a nasty burden on us, though, so I don't like taking this
>> position. Unless we can make it clearly known through help() or
>> something that the official docs must be checked to know what can and
>> cannot be reliably used I don't think it is reasonable to force users
>> to not be able to rely on help() (we should probably change help() to
>> print a big disclaimer for anything with a leading underscore,
>> though).
> +1 on the help disclaimer for objects with leading underscores.
> Currently help() does not see comments when they are used in place of a
> docstring.  I think it would be easy to have help notate things with no
> docstrings as "Warning: Undocumented <object>. Use at your own risk."
> At first, it would probably have a nice side effect of getting any public
> API's documented with doc strings. (if they aren't already.)
>> But that doesn't mean we can't go through, fix up our names, and
>> deprecate the old public names; that's fair game in my book.
> I agree.
> It may also be useful to clarify that importing some "utility" modules is
> not recommended because they may be changed more often and may not follow
> the standard process.  Would something like the following work, but still
> allow for importing if the exception is caught with a try except?
> if __name__ == "__main__":
>     main()
> else:
>     raise ImportWarning("This is utility module and may be changed.")

Sure it would work, but that doesn't make it pleasant to use. It
already breaks how warnings are typically handled by raising it
instead of calling warnings.warn(). Plus I'm now supposed to
try/except certain imports? That's messy. At that point we are coding
in visibility rules instead of following convention and that doesn't
sit well with me.
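
For comparison, a minimal sketch of the conventional approach: the utility
module calls warnings.warn() at import time instead of raising, so the import
still succeeds and the caller needs no try/except. The module name
"_demo_utility" and the load_utility() helper are invented for illustration.
Note that ImportWarning is ignored by the default warning filters, so it has
to be enabled explicitly to be seen.

```python
import sys
import types
import warnings

# Source of a hypothetical utility module; "_demo_utility" is an invented name.
UTILITY_SOURCE = '''
import warnings
warnings.warn("This is a utility module and may be changed.",
              ImportWarning, stacklevel=2)

def helper():
    return 42
'''


def load_utility():
    """Execute the sketch module and register it, roughly as an import would."""
    module = types.ModuleType("_demo_utility")
    exec(UTILITY_SOURCE, module.__dict__)
    sys.modules[module.__name__] = module
    return module


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # ImportWarning is ignored by default
    utility = load_utility()

# The module is usable even though it warned while loading.
print(utility.helper())
print([str(w.message) for w in caught])
```

Because the warning is issued rather than raised, whether it is shown, hidden,
or turned into an error is left to the warning filters of the importing program.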


> Cheers,
>  Ron

From raymond.hettinger at  Mon Nov  8 23:07:46 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Mon, 8 Nov 2010 14:07:46 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 8, 2010, at 11:58 AM, Brett Cannon wrote:

> I think we need to, as a group, decide how to handle undocumented APIs
> that don't have a leading underscore: they get treated just the same
> as the documented APIs, or are they private regardless and thus we can
> change them at our whim?

To start with, it doesn't hurt for a maintainer to add an __all__ entry and to only document the parts of the API we think need to be exposed.  That way, we can at least declare the parts that are intended to be public on a go-forward basis.

For the most part, the non-underscored parts of the API shouldn't be changed "at our whim".  Some sense needs to be applied to the decision.  Google's code search is great for showing how people actually have used a module in real world code.  If that shows that people are accessing and/or changing an attribute, it probably needs to remain exposed.   In the absence of a code search, good guesses can be made about what someone might reasonably and usefully be accessing (i.e. glob0 isn't likely).   The goal is to improve the standard library while minimizing breakage, and that will involve trade-offs depending on what is being changed.

IIRC, we've been trying to get away from deprecations because they're so disruptive.  For example, when the pprint rewrite is finally ready, if there is an incompatible API change, I expect that a new clean class will be offered, but that the old will be left in-place so that tons of existing code won't break.  Likewise, with the unittest clean-ups, I'm expecting that Michael will introduce aliases when fixing-up mis-named methods, rather than break code that uses the existing names.
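
A minimal sketch of the alias approach, assuming a made-up Checker class (the
class and all three method names are hypothetical, not unittest's actual code).
The plain alias keeps old callers working silently; the wrapper variant shows
what a warning-emitting deprecation would look like by contrast.

```python
import warnings


class Checker(object):
    """Hypothetical class whose method was renamed; all names are invented."""

    def assert_equal(self, first, second):
        """New, correctly named method."""
        if first != second:
            raise AssertionError("%r != %r" % (first, second))

    # Plain alias: existing callers keep working and no warning is issued.
    assertEquals = assert_equal

    def failUnlessEqual(self, first, second):
        """Old name kept as a thin wrapper that warns, then delegates."""
        warnings.warn("failUnlessEqual is deprecated; use assert_equal",
                      DeprecationWarning, stacklevel=2)
        return self.assert_equal(first, second)


checker = Checker()
checker.assertEquals(1, 1)        # old spelling, same function object
checker.assert_equal(1, 1)        # new spelling
```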



From steve at  Mon Nov  8 23:17:40 2010
From: steve at (Steven D'Aprano)
Date: Tue, 09 Nov 2010 09:17:40 +1100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>
Message-ID: <>

Ron Adam wrote:

> My understanding is that anything with an actual docstring is part of 
> the public API.

I frequently add docstrings to _private functions. Just because it is 
private doesn't mean I don't want documentation for it, and it is very 
handy for running doctests.

Yes, I test my internal functions *wink*

The convention I use is:

* If __all__ exists, anything in that is public.
* Anything not listed in __all__ but without a leading underscore is 
public, but not part of the module's API; e.g. utility functions, 
imported modules, globals (but hopefully not too many of the last). That 
means I don't expect you to use it, but you can if you want.
* Anything with a _private name is internal use only. That includes 
modules. Any attribute of a private object is also private.

If a class is flagged as private, _MyClass, you wouldn't expect that 
_MyClass.attribute were public just because the attribute name wasn't 
also flagged with an underscore. So why treat as public?
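
The three-way convention above can be sketched as a small classifier. This is
only an illustration: classify() and the in-memory "demo" module are invented
names, and the rule that attributes of a private object are also private is
not modelled here.

```python
import types


def classify(module):
    """Sort a module's names into the three groups of the convention."""
    declared = set(getattr(module, "__all__", ()))
    groups = {"api": [], "public": [], "private": []}
    for name in sorted(vars(module)):
        if name.startswith("__") and name.endswith("__"):
            continue                      # skip dunder metadata such as __name__
        if name in declared:
            groups["api"].append(name)    # listed in __all__
        elif name.startswith("_"):
            groups["private"].append(name)
        else:
            groups["public"].append(name) # usable, but not part of the API
    return groups


# A throwaway module built in memory; every name in it is invented.
demo = types.ModuleType("demo")
exec('''
__all__ = ["Trace"]
import os
class Trace: pass
def find_lines(): pass
def _helper(): pass
''', demo.__dict__)

print(classify(demo))
```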

> +1 on the help disclaimer for objects with leading underscores.

I don't know that it will be that useful; I don't think it will help 
that much. +0.

> Currently help() does not see comments when they are used in place of a 
> docstring.  I think it would be easy to have help notate things with no 
> docstrings as "Warning: Undocumented <object>. Use at your own risk."

I wouldn't like that. I don't think that "no docstring" = "undocumented" 
-- the documentation might exist somewhere else.

Besides, I don't think that help() should start misidentifying public 
objects as private if you run it under python -OO.
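
A rough sketch of why the -OO case matters, assuming the proposed check would
be based on __doc__ (describe() and the two sample functions are invented for
illustration). At run time a comment is invisible, and under python -OO every
docstring is stripped, so __doc__ is None in both cases.

```python
def describe(obj):
    """Return a one-line notice in the style of the proposed help() warning."""
    name = getattr(obj, "__name__", repr(obj))
    if obj.__doc__ is None:
        return "Warning: Undocumented %s. Use at your own risk." % name
    return "%s: documented" % name


def documented():
    """Has a docstring (unless the interpreter was started with -OO)."""


def commented():
    # Only a comment, which is not available at run time.
    pass


print(describe(documented))
print(describe(commented))
```

Run normally, only commented() is flagged; run with -OO, both functions would
be, which is the misidentification described above.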

> It may also be useful to clarify that importing some "utility" modules 
> is not recommended because they may be changed more often and may not 
> follow the standard process.  Would something like the following work, 
> but still allow for importing if the exception is caught with a try except?
> if __name__ == "__main__":
>     main()
> else:
>     raise ImportWarning("This is utility module and may be changed.")

There's no way for the imported module to know what module is importing 
it, is there? Because the API I'd much prefer is:

safe_modules = [a, b, c, d]  # List of modules allowed to import me.
if calling_module not in safe_modules:
     warnings.warn("private module, are you sure you want to do this?")


From merwok at  Mon Nov  8 23:22:42 2010
From: merwok at (Éric Araujo)
Date: Mon, 08 Nov 2010 23:22:42 +0100
Subject: [Python-Dev] [Python-checkins] r86327 - in
 python/branches/py3k: Doc/includes/
 Doc/includes/	Doc/library/smtplib.rst Doc/whatsnew/3.2.rst
 Lib/ Lib/test/ Misc/NEWS
In-Reply-To: <>
References: <>
Message-ID: <>

> Author: r.david.murray
> New Revision: 86327
> Log: #10321: Add support for sending binary DATA and Message objects to smtplib
> Modified: python/branches/py3k/Doc/includes/
> ==============================================================================
>  # Send the email via our own SMTP server.
>  s = smtplib.SMTP()
> -s.sendmail(me, family, msg.as_string())
> +s.sendmail(msg)
>  s.quit()

If I'm not mistaken, you're giving a message object to a method that
only accepts str or bytes.  That line should read s.send_message(msg).
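
A minimal sketch of the corrected call, assuming an SMTP server is reachable
on localhost. The build_message() and send() helpers, the default host, and
the example addresses are illustrative only and are not part of the patch.

```python
import smtplib
from email.mime.text import MIMEText


def build_message(sender, recipients, subject, body):
    """Build a simple text message from caller-supplied values."""
    msg = MIMEText(body)
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    return msg


def send(msg, host="localhost"):
    """Send msg through an SMTP server assumed to be listening on host."""
    server = smtplib.SMTP(host)
    try:
        # send_message() accepts a Message object and takes the envelope
        # addresses from its headers; sendmail() expects str or bytes.
        server.send_message(msg)
    finally:
        server.quit()


message = build_message("me@example.com",
                        ["a@example.com", "b@example.com"],
                        "Family reunion", "See you there.")
# send(message)  # needs a running SMTP server, so it is not called here
```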


From exarkun at  Mon Nov  8 23:35:53 2010
From: exarkun at (exarkun at
Date: Mon, 08 Nov 2010 22:35:53 -0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <20101108223553.2040.1385281841.divmod.xquotient.766@localhost.localdomain>

On 09:57 pm, brett at wrote:
>On Mon, Nov 8, 2010 at 13:45,  <exarkun at> wrote:
>>On 09:25 pm, brett at wrote:
>>>On Mon, Nov 8, 2010 at 13:03,  <exarkun at> wrote:
>>>>On 07:58 pm, brett at wrote:
>>>>>>I don't think a strict don't remove without deprecation policy is
>>>>>>workable.  For example, is trace.rx_blank constant part of the trace
>>>>>>module API that needs to be preserved indefinitely?  I don't even know
>>>>>>if it is possible to add a deprecation warning to it, but
>>>>>>CoverageResults._blank_re would certainly be a better place for it.
>>>>>The deprecation policy obviously cannot apply to module-level
>>>>>attributes.
>>>>I'm not sure why this is.  Can you elaborate?
>>>There is no way to directly trigger a DeprecationWarning for an
>>>attribute. We can still document it, but there is just no way to
>>>programmatically enforce it.
>>What about `deprecatedModuleAttribute`
>>or zope.deprecation
>>(<>) which 
>>inspired it?
>Just checked the code and it looks like it replaces the module with
>some proxy object? To begin with, that breaks subclass checks. After that I
>don't know the ramifications without really digging into the
>ModuleType code.

That could be fixed if ModuleType allowed subclassing. :)

For what it's worth, no one has complained about problems caused by 
`deprecatedModuleAttribute`, but we've only been using it for about two 
and a half years.
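
The proxy idea discussed above can be sketched roughly as follows. This is an
illustration under stated assumptions, not the Twisted or zope.deprecation
implementation: _DeprecatingProxy, the module name "demo_trace", and the
placeholder value of rx_blank are all invented.

```python
import sys
import types
import warnings


class _DeprecatingProxy(object):
    """Stand-in for a module that warns when selected attributes are read."""

    def __init__(self, module, deprecated):
        self._module = module
        self._deprecated = deprecated     # maps attribute name -> message

    def __getattr__(self, name):
        # Only called for names that are not found on the proxy itself.
        message = self._deprecated.get(name)
        if message is not None:
            warnings.warn(message, DeprecationWarning, stacklevel=2)
        return getattr(self._module, name)


demo = types.ModuleType("demo_trace")     # invented module name
demo.rx_blank = r"^\s*(#.*)?$"            # placeholder value for illustration
sys.modules["demo_trace"] = _DeprecatingProxy(
    demo, {"rx_blank": "demo_trace.rx_blank is deprecated"})

import demo_trace                         # binds the proxy from sys.modules

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = demo_trace.rx_blank           # triggers the DeprecationWarning

print(value, len(caught))
print(isinstance(demo_trace, types.ModuleType))
```

The last line shows the cost raised in the quoted reply: the object bound by
the import is no longer an instance of ModuleType, so type checks against it
fail.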


From tjreedy at  Mon Nov  8 23:59:40 2010
From: tjreedy at (Terry Reedy)
Date: Mon, 08 Nov 2010 17:59:40 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>
Message-ID: <ib9vcq$oqd$>

On 11/8/2010 4:36 PM, Ron Adam wrote:

> My understanding is that anything with an actual docstring is part of
> the public API. Any thing with a leading underscore is private.

When the trace module was written, the rule seems to have been more 
like: docs (but no docstrings) for public API, docstrings (but no doc 
mention) for private stuff. Eli and I fixed the first part.

Terry Jan Reedy

From tjreedy at  Tue Nov  9 00:05:19 2010
From: tjreedy at (Terry Reedy)
Date: Mon, 08 Nov 2010 18:05:19 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <ib9vnb$qe5$>

On 11/8/2010 2:58 PM, Brett Cannon wrote:

> I think we need to, as a group, decide how to handle undocumented APIs
> that don't have a leading underscore: they get treated just the same
> as the documented APIs, or are they private regardless and thus we can
> change them at our whim?

How about in between: deprecate as if private, but do so much more 
freely than we would for public stuff. I think this is what you actually 
propose. We might deprecate faster too.

> The main reason I have said that non-underscore names should be
> properly deprecated (assuming they are not contained in an
> underscored-named module) is that dir() and help() do not distinguish.
> If you are perusing a module from the interpreter prompt you have no
> way to know whether something is public or private if it lacks an
> underscore. Is it reasonable to assume that any API found through
> dir() or help() must be checked with the official docs before you can
> consider using it, even if you have no explicit need to read the
> official docs?
> I (unfortunately) say no, which is why I have argued that
> non-underscored names need to be properly deprecated. This obviously
> places a nasty burden on us, though, so I don't like taking this

Completely naive question: Is there anything that could be automated to 
reduce the burden?

> position. Unless we can make it clearly known through help() or
> something that the official docs must be checked to know what can and
> cannot be reliably used I don't think it is reasonable to force users
> to not be able to rely on help() (we should probably change help() to
> print a big disclaimer for anything with a leading underscore,
> though).
> But that doesn't mean we can't go through, fix up our names, and
> deprecate the old public names; that's fair game in my book.

Terry Jan Reedy

From regebro at  Tue Nov  9 00:08:08 2010
From: regebro at (Lennart Regebro)
Date: Tue, 9 Nov 2010 00:08:08 +0100
Subject: [Python-Dev] Continuing 2.x
In-Reply-To: <>
References: <>
Message-ID: <>

2010/11/8 James Y Knight <foom at>:
> On Nov 8, 2010, at 4:42 AM, Lennart Regebro wrote:
>> Except for making releases that start backporting Python 3 features
>> and breaking backwards compatibility gradually (which may or may not
>> be a good idea) I don't see the point. There isn't much to do when it
>> comes to improving the language, and there is a moratorium anyway.
>> Improvements in the standard library can be more easily done in
>> external libraries anyway, and then you can release the improved
>> libraries for everything from Python 2.4 and forwards if you like.
>> So it can be done, but the question is "Why?"
> To keep the batteries included?

But they'll only be included in > 2.7, which won't be used much, which
defeats the purpose of including those batteries.

Lennart Regebro, Colliberty:
Telephone: +48 691 268 328

From bobbyi at  Tue Nov  9 00:26:58 2010
From: bobbyi at (Bobby Impollonia)
Date: Mon, 8 Nov 2010 15:26:58 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 2:07 PM, Raymond Hettinger
<raymond.hettinger at> wrote:
> On Nov 8, 2010, at 11:58 AM, Brett Cannon wrote:
>> I think we need to, as a group, decide how to handle undocumented APIs
>> that don't have a leading underscore: they get treated just the same
>> as the documented APIs, or are they private regardless and thus we can
>> change them at our whim?
> To start with, it doesn't hurt for a maintainer to add an __all__ entry and to only document the parts of the API we think need to be exposed.  That way, we can at least declare the parts that are intended to be public on a go-forward basis.

This does hurt because anyone who was relying on "import *" to get a
name which is now omitted from __all__ is going to upgrade and find
their program failing with NameErrors. This is a backwards-incompatible
change and shouldn't happen without a deprecation warning first.
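
A small sketch of the breakage, using two in-memory modules that differ only
in whether __all__ is defined (the install() helper and the module names
"demo_old" and "demo_new" are invented for illustration):

```python
import sys
import types


def install(name, source):
    """Register an in-memory module so that it can be imported by name."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module


BODY = "def Trace(): pass\ndef find_lines(): pass\n"
install("demo_old", BODY)                            # no __all__ defined
install("demo_new", '__all__ = ["Trace"]\n' + BODY)  # __all__ added later

old_ns, new_ns = {}, {}
exec("from demo_old import *", old_ns)
exec("from demo_new import *", new_ns)

# Without __all__, every non-underscore name is imported; with it, only
# the listed names are, so code using find_lines would hit a NameError.
print("find_lines" in old_ns)
print("find_lines" in new_ns)
```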

From guido at  Tue Nov  9 00:47:03 2010
From: guido at (Guido van Rossum)
Date: Mon, 8 Nov 2010 15:47:03 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 3:26 PM, Bobby Impollonia <bobbyi at> wrote:
> On Mon, Nov 8, 2010 at 2:07 PM, Raymond Hettinger
> <raymond.hettinger at> wrote:
>> On Nov 8, 2010, at 11:58 AM, Brett Cannon wrote:
>>> I think we need to, as a group, decide how to handle undocumented APIs
>>> that don't have a leading underscore: they get treated just the same
>>> as the documented APIs, or are they private regardless and thus we can
>>> change them at our whim?
>> To start with, it doesn't hurt for a maintainer to add an __all__ entry and to only document the parts of the API we think need to be exposed.  That way, we can at least declare the parts that are intended to be public on a go-forward basis.
> This does hurt because anyone who was relying on "import *" to get a
> name which is now omitted from __all__ is going to upgrade and find
> their program failing with NameErrors. This is a backwards-incompatible
> change and shouldn't happen without a deprecation warning first.

Given that import * is generally frowned upon you can't make a blanket
statement like this without referring to the specifics of the name
being considered for removal. In fact, for any proposed change the
risk and reward need to be weighed properly. If the risk is "someone's
code could break if they used some undocumented API" it is useful to
estimate the probability that this would happen and that somebody
would care (rather than just fixing their code and moving on). Many
factors go into such an estimate. Just one example would be if we knew
of usage of the offending name in code that could reasonably be
assumed to be widely copied or distributed -- in such cases we should
move very carefully indeed no matter how "officially undocumented"
something is.

I don't want to go into the specifics of the trace module (even if I
wrote it, it's too long ago to remember, nor can I recall using it)
but I do want to warn about the dangers of applying simplifying rules

--Guido van Rossum (

From glyph at  Tue Nov  9 00:55:23 2010
From: glyph at (Glyph Lefkowitz)
Date: Mon, 8 Nov 2010 15:55:23 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <20101108223553.2040.1385281841.divmod.xquotient.766@localhost.localdomain>
References: <>
Message-ID: <>

On Nov 8, 2010, at 2:35 PM, exarkun at wrote:

> On 09:57 pm, brett at wrote:
>> On Mon, Nov 8, 2010 at 13:45,  <exarkun at> wrote:
>>> On 09:25 pm, brett at wrote:
>>>> On Mon, Nov 8, 2010 at 13:03,  <exarkun at> wrote:
>>>>> On 07:58 pm, brett at wrote:
>>>>>>> I don't think a strict don't remove without deprecation policy is
>>>>>>> workable.  For example, is trace.rx_blank constant part of the trace
>>>>>>> module API that needs to be preserved indefinitely?  I don't even know
>>>>>>> if it is possible to add a deprecation warning to it, but
>>>>>>> CoverageResults._blank_re would certainly be a better place for it.
>>>>>> The deprecation policy obviously cannot apply to module-level
>>>>>> attributes.
>>>>> I'm not sure why this is.  Can you elaborate?
>>>> There is no way to directly trigger a DeprecationWarning for an
>>>> attribute. We can still document it, but there is just no way to
>>>> programmatically enforce it.
>>> What about `deprecatedModuleAttribute`
>>> (<>)
>>> or zope.deprecation
>>> (<>) which inspired it?
>> Just checked the code and it looks like it replaces the module with
>> some proxy object? To begin with, that breaks subclass checks. After that I
>> don't know the ramifications without really digging into the
>> ModuleType code.
> That could be fixed if ModuleType allowed subclassing. :)
> For what it's worth, no one has complained about problems caused by `deprecatedModuleAttribute`, but we've only been using it for about two and a half years.

This seems like a pretty clear case of "practicality beats purity".  Not only has nobody complained about deprecatedModuleAttribute, but there are tons of things which show up in sys.modules that aren't modules in the sense of 'instances of ModuleType'.  The Twisted reactor, for example, is an instance, and we've been doing *that* for about 10 years with no complaints.
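
The mechanism being relied on can be shown in a few lines. This is a toy
sketch with an invented Reactor class and the made-up name "demo_reactor",
not Twisted's code: the import statement simply binds whatever object is
stored in sys.modules under the requested name.

```python
import sys
import types


class Reactor(object):
    """Invented stand-in for an object published under a module name."""

    def run(self):
        return "running"


sys.modules["demo_reactor"] = Reactor()   # "demo_reactor" is a made-up name

import demo_reactor                       # binds the instance, not a module

print(demo_reactor.run())
print(isinstance(demo_reactor, types.ModuleType))
```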

From rrr at  Tue Nov  9 01:10:17 2010
From: rrr at (Ron Adam)
Date: Mon, 08 Nov 2010 18:10:17 -0600
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

On 11/08/2010 04:01 PM, Brett Cannon wrote:

>> My understanding is that anything with an actual docstring is part of the
>> public API.  Any thing with a leading underscore is private.
> That's a bad rule. Why shouldn't I be able to document something that
> is not meant for the public so that fellow developers know what the
> heck should be going on in the code?

You can use comments instead of a docstring.

Here are the possible cases concerned with the subject.  I'm using 
functions here for these examples, but this also applies to other objects.

def public_api():
     """ Should always have a nice docstring. """
     ...

def _private_api():
     # Isn't it a good practice to use comments here?
     ...

def _publicly_documented_private_api():
     """  Not sure why you would want to do this
          instead of using comments.
     """
     ...

def undocumented_public_api():
     ...

def _undocumented_private_api():
     ...
Out of these, the two that are problematic are the 
_publicly_documented_private_api() and the undocumented_public_api().

The _publicly_documented_private_api() is a problem because people *will* 
use it even though it has a leading underscore.  Especially those who are 
new to python.

The undocumented_public_api() wouldn't be a problem if all private APIs 
used a leading underscore, but for older modules, it isn't always clear what 
the intention was.  Was it undocumented because the programmer simply 
forgot, or was it intended to be a private API?

>> It may also be useful to clarify that importing some "utility" modules is
>> not recommended because they may be changed more often and may not follow
>> the standard process.  Would something like the following work, but still
>> allow for importing if the exception is caught with a try except?
>> if __name__ == "__main__":
>>     main()
>> else:
>>     raise ImportWarning("This is utility module and may be changed.")
> Sure it would work, but that doesn't make it pleasant to use. It
> already breaks how warnings are typically handled by raising it
> instead of calling warnings.warn(). Plus I'm now supposed to
> try/except certain imports? That's messy. At that point we are coding
> in visibility rules instead of following convention and that doesn't
> sit well with me.

No, you're not supposed to try/except imports.  That's the point.

You can do that, only if you really want to abuse the intended purpose of a 
module that isn't meant to be imported in the first place.  If someone 
wants to do that, it isn't a problem.  They are well aware of the risks if 
they do it.  (This is just one option and probably one that isn't thought 
out very well.)

Brett, I'm sure you can come up with a better alternative.   ;-)


From ben+python at  Tue Nov  9 01:32:05 2010
From: ben+python at (Ben Finney)
Date: Tue, 09 Nov 2010 11:32:05 +1100
Subject: [Python-Dev] Breaking undocumented API
References: <>
Message-ID: <>

Bobby Impollonia <bobbyi at> writes:

> On Mon, Nov 8, 2010 at 2:07 PM, Raymond Hettinger
> <raymond.hettinger at> wrote:
> > To start with, it doesn't hurt for a maintainer to add an __all__
> > entry and to only document the parts of the API we think need to be
> > exposed.  That way, we can at least declare the parts that are
> > intended to be public on a go-forward basis.
> This does hurt because anyone who was relying on "import *" to get a
> name which is now omitted from __all__ is going to upgrade and find
> their program failing with NameErrors. This is a backwards-incompatible
> change and shouldn't happen without a deprecation warning first.

It also introduces a (perhaps small, but clearly non-zero) maintenance
burden: the name of an object must be added, changed, and removed not
only where it is defined, but also in the '__all__' entry.

This burden is avoided when using the spelling of the name itself as the
indicator for exposure in the API.

 \         "In any great organization it is far, far safer to be wrong |
  `\         with the majority than to be right alone." --John Kenneth |
_o__)                                            Galbraith, 1989-07-28 |
Ben Finney

From ben+python at  Tue Nov  9 01:46:59 2010
From: ben+python at (Ben Finney)
Date: Tue, 09 Nov 2010 11:46:59 +1100
Subject: [Python-Dev] Breaking undocumented API
References: <>
Message-ID: <>

Ron Adam <rrr at> writes:

> def _publicly_documented_private_api():
>     """  Not sure why you would want to do this
>          instead of using comments.
>     """
>     ...

Because the docstring is available at the interpreter via 'help()', and
because it's automatically available to 'doctest', and most of the other
good reasons for docstrings.

> The _publicly_documented_private_api() is a problem because people
> *will* use it even though it has a leading underscore. Especially
> those who are new to python.

That isn't an argument against docstrings, since the problem you
describe isn't dependent on the presence or absence of docstrings.

 \     “I wish there was a knob on the TV to turn up the intelligence. |
  `\          There's a knob called ‘brightness’ but it doesn't work.” |
_o__)                                            --Eugene P. Gallagher |
Ben Finney

From guido at  Tue Nov  9 01:50:42 2010
From: guido at (Guido van Rossum)
Date: Mon, 8 Nov 2010 16:50:42 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 3:55 PM, Glyph Lefkowitz <glyph at> wrote:
> This seems like a pretty clear case of "practicality beats purity".  Not
> only has nobody complained about deprecatedModuleAttribute, but there are
> tons of things which show up in sys.modules that aren't modules in the
> sense of 'instances of ModuleType'.  The Twisted reactor, for example, is
> an instance, and we've been doing *that* for about 10 years with no
> complaints.

But the Twisted universe is only a subset of the Python universe. The
Python stdlib needs to move more carefully.

--Guido van Rossum (

From rrr at  Tue Nov  9 02:18:00 2010
From: rrr at (Ron Adam)
Date: Mon, 08 Nov 2010 19:18:00 -0600
Subject: [Python-Dev] Backward incompatible API changes in the pydoc
In-Reply-To: <>
References: <>
Message-ID: <iba7ge$ot7$>

On 11/08/2010 05:44 AM, Nick Coghlan wrote:
> All,
> I was about to commit the patch for issue 2001 (the improvements to
> the pydoc web server and the removal of the Tk GUI) when I realised
> that pydoc.serve() and pydoc.gui() are technically public standard
> library APIs (albeit undocumented ones).
> Currently the patch switches serve() to start the new server
> implementation and gui() to start the server and open a browser window
> for it.
> It occurred to me that, despite the "it's an application" feel to the
> pydoc web server APIs, it may be a better idea to leave the two
> existing functions alone (aside from adding DeprecationWarning), and
> using new private function names to start the new server and the web
> browser.
> Is following the standard deprecation procedure the better course
> here, or am I being overly paranoid?

What do you think about adding a new module along with a loader module
with a basic user api?  The number 3, so that it matches python3.x.

We can then keep the old unchanged and be free to make a lot more 
changes to the file without having to be even a little paranoid.


From brett at  Tue Nov  9 02:18:33 2010
From: brett at (Brett Cannon)
Date: Mon, 8 Nov 2010 17:18:33 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 16:10, Ron Adam <rrr at> wrote:
> On 11/08/2010 04:01 PM, Brett Cannon wrote:
>>> My understanding is that anything with an actual docstring is part of the
>>> public API.  Anything with a leading underscore is private.
>> That's a bad rule. Why shouldn't I be able to document something that
>> is not meant for the public so that fellow developers know what the
>> heck should be going on in the code?
> You can use comments instead of a docstring.
> Here are the possible cases concerned with the subject.  I'm using functions
> here for these examples, but this also applies to other objects.
> def public_api():
>     """ Should always have a nice docstring. """
>     ...
> def _private_api():
>     #
>     # Isn't it a good practice to use comments here?
>     #
>     ...

That is ugly. I already hate doing that for unittest, I'm not about to
champion that for anything else.

It would also lead to essentially requiring a docstring for
everything that is public, whether someone wants to bother writing a
docstring or not. I don't think we should be suggesting that a
docstring be required either.

> def _publicly_documented_private_api():
>     """  Not sure why you would want to do this
>          instead of using comments.
>     """
>     ...
> def undocumented_public_api():
>     ...
> def _undocumented_private_api():
>     ...
> Out of these, the two that are problematic are the
> _publicly_documented_private_api() and the undocumented_public_api().
> The _publicly_documented_private_api() is a problem because people *will*
> use it even though it has a leading underscore.  Especially those who are
> new to python.
> The undocumented_public_api() wouldn't be a problem if all private api's
> used a leading underscore, but for older modules, it isn't always clear what
> the intention was.  Was it undocumented because the programmer simply
> forgot, or was it intended to be a private api?
>>> It may also be useful to clarify that importing some "utility" modules is
>>> not recommended because they may be changed more often and may not follow
>>> the standard process.  Would something like the following work, but still
>>> allow for importing if the exception is caught with a try except?
>>> if __name__ == "__main__":
>>>     main()
>>> else:
>>>     raise ImportWarning("This is utility module and may be changed.")
>> Sure it would work, but that doesn't make it pleasant to use. It
>> already breaks how warnings are typically handled by raising it
>> instead of calling warnings.warn(). Plus I'm now supposed to
>> try/except certain imports? That's messy. At that point we are coding
>> in visibility rules instead of following convention and that doesn't
>> sit well with me.
> No, you're not supposed to try/except imports.  That's the point.
> You can do that, only if you really want to abuse the intended purpose of a
> module that isn't meant to be imported in the first place.  If someone wants
> to do that, it isn't a problem.  They are well aware of the risks if they do
> it.  (This is just one option and probably one that isn't thought out very
> well.)
> Brett, I'm sure you can come up with a better alternative.   ;-)

But I don't want to have to do that in the stdlib by remembering what
modules I should or should not import. This is just as much about
developer burden on core devs as it is making sure we don't yank the
rug out from underneath users.
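
For comparison with the "raise ImportWarning(...)" idea quoted above, this
is a small sketch of the conventional route through warnings.warn(). The
function "_load_utility_module" is a hypothetical stand-in for the top-level
code of such a module; it is not taken from any real stdlib module.

```python
import warnings


def _load_utility_module():
    """Stand-in for the top-level code of a hypothetical utility module."""
    # warn() reports the condition and then lets execution continue,
    # whereas raising the warning class would abort the import.
    warnings.warn(
        "This is a utility module and may be changed.",
        ImportWarning,
        stacklevel=2,
    )
    return "module body ran"


# Record warnings so the outcome can be inspected instead of printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    outcome = _load_utility_module()

categories = [w.category for w in caught]
print(outcome, categories)
```

With this spelling no caller needs a try/except around the import, and the
usual warning filters decide whether anything is shown. Note that the
default filters ignore ImportWarning, so the message is invisible unless
someone opts in.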

From guido at  Tue Nov  9 02:21:50 2010
From: guido at (Guido van Rossum)
Date: Mon, 8 Nov 2010 17:21:50 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Mon, Nov 8, 2010 at 4:46 PM, Ben Finney <ben+python at> wrote:
> Ron Adam <rrr at> writes:
>> def _publicly_documented_private_api():
>>     """  Not sure why you would want to do this
>>          instead of using comments.
>>     """
>>     ...
> Because the docstring is available at the interpreter via ‘help()’, and
> because it's automatically available to ‘doctest’, and most of the other
> good reasons for docstrings.
>> The _publicly_documented_private_api() is a problem because people
>> *will* use it even though it has a leading underscore. Especially
>> those who are new to python.
> That isn't an argument against docstrings, since the problem you
> describe isn't dependent on the presence or absence of docstrings.


--Guido van Rossum (

From a.badger at  Tue Nov  9 02:39:47 2010
From: a.badger at (Toshio Kuratomi)
Date: Mon, 8 Nov 2010 17:39:47 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <20101109013947.GD22818@unaka.lan>

On Tue, Nov 09, 2010 at 11:46:59AM +1100, Ben Finney wrote:
> Ron Adam <rrr at> writes:
> > def _publicly_documented_private_api():
> >     """  Not sure why you would want to do this
> >          instead of using comments.
> >     """
> >     ...
> Because the docstring is available at the interpreter via ‘help()’, and
> because it's automatically available to ‘doctest’, and most of the other
> good reasons for docstrings.
> > The _publicly_documented_private_api() is a problem because people
> > *will* use it even though it has a leading underscore. Especially
> > those who are new to python.
> That isn't an argument against docstrings, since the problem you
> describe isn't dependent on the presence or absence of docstrings.
Just wanted to expand a bit here: as a general practice, you may be
involved in a project where the _private_api() is not intended for use by
people outside of the project but is intended to be used in multiple places
within the project.  If you have different people working on those different
areas, it can be very useful for them to be able to use help(_private_api)
on the other functions from within the interpreter shell.

From victor.stinner at  Tue Nov  9 02:57:00 2010
From: victor.stinner at (Victor Stinner)
Date: Tue, 9 Nov 2010 02:57:00 +0100
Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2
	3.x" buildbot
In-Reply-To: <>
References: <>
Message-ID: <>

On Monday 08 November 2010 13:23:33 Jeroen Ruigrok van der Werven wrote:
> >The POSIX semaphore support is not enabled by default in FreeBSD 7, so
> >I added loader.conf stuff to load them (as part of issue7272).
> It is enabled by default on FreeBSD 8 at least.

Ok, but I suppose that many users use older versions.

> PostgreSQL installations via ports as well as its documentation instruct
> the FreeBSD user to tweak kern.ipc settings.

I found some information about SysV semaphores on FreeBSD in a Firebird 
patch, which means that Firebird uses SysV semaphores on FreeBSD :-) (at least 
in Debian/kFreeBSD).

> Almost every FreeBSD user I know of compiles a new kernel. It's just one of
> those BSD things that every user goes through.

If #10348 is implemented, FreeBSD users will be able to use the 
multiprocessing module without having to recompile their kernel. The question 
is more who would like to implement SysV semaphores in Python :-) I don't know 
anything about these semaphores.


From exarkun at  Tue Nov  9 03:03:23 2010
From: exarkun at (exarkun at
Date: Tue, 09 Nov 2010 02:03:23 -0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <20101109020323.2040.1678073058.divmod.xquotient.812@localhost.localdomain>

On 12:50 am, guido at wrote:
>On Mon, Nov 8, 2010 at 3:55 PM, Glyph Lefkowitz 
><glyph at> wrote:
>>This seems like a pretty clear case of "practicality beats purity". 
>>Not only has nobody complained about deprecatedModuleAttribute, but 
>>there are tons of things which show up in sys.modules that aren't 
>>modules in the sense of 'instances of ModuleType'.  The Twisted 
>>reactor, for example, is an instance, and we've been doing *that* for 
>>about 10 years with no complaints.
>But the Twisted universe is only a subset of the Python universe. The
>Python stdlib needs to move more carefully.

I think that Twisted developers are pretty careful to consider the 
consequences of changes they make to Twisted.  We have an explicit, 
documented backwards compatibility policy, for example.  We also have 
mandatory code review for all changes, with a documented set of 
guidelines outlining the minimum things a reviewer should be 

I wonder if there are any actual technical arguments to be made against 
something like `deprecatedModuleAttribute`?

Also, it turns out that ModuleType can be subclassed these days.
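
To show what subclassing ModuleType makes possible, here is a minimal
sketch of a module object that warns when a deprecated attribute is read.
This is not Twisted's implementation of `deprecatedModuleAttribute`; the
class "_DeprecatingModule", the module name "demo" and the attribute names
are all hypothetical.

```python
import types
import warnings


class _DeprecatingModule(types.ModuleType):
    """Hypothetical module subclass that warns on deprecated names."""

    def __init__(self, name, deprecated):
        super().__init__(name)
        # Maps attribute name -> (value, warning message).
        self.__dict__["_deprecated"] = dict(deprecated)

    def __getattr__(self, attr):
        # Only reached when normal lookup fails, so live attributes
        # are returned without any extra work.
        try:
            value, message = self.__dict__["_deprecated"][attr]
        except KeyError:
            raise AttributeError(attr) from None
        warnings.warn(message, DeprecationWarning, stacklevel=2)
        return value


mod = _DeprecatingModule("demo", {"OldName": (42, "demo.OldName is deprecated")})
mod.new_name = 1

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = mod.OldName   # triggers the warning
    live = mod.new_name   # ordinary attribute, no warning

print(value, live, [str(w.message) for w in caught])
```

An object like this would still have to be placed in sys.modules to stand
in for a real module, and that is where the stdlib's need for caution
applies.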


From rdmurray at  Tue Nov  9 03:07:52 2010
From: rdmurray at (R. David Murray)
Date: Mon, 08 Nov 2010 21:07:52 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, 08 Nov 2010 18:10:17 -0600, Ron Adam <rrr at> wrote:
> def _private_api():
>      #
>      # Isn't it a good practice to use comments here?
>      #
>      ...

IMO, no.

R. David Murray                            

From belopolsky at  Tue Nov  9 04:28:40 2010
From: belopolsky at (Alexander Belopolsky)
Date: Mon, 8 Nov 2010 22:28:40 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 8, 2010 at 2:58 PM, Brett Cannon <brett at> wrote:
> But that doesn't mean we can't go through, fix up our names, and
> deprecate the old public names; that's fair game in my book.



From ncoghlan at  Tue Nov  9 05:26:22 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 9 Nov 2010 14:26:22 +1000
Subject: [Python-Dev] Backward incompatible API changes in the pydoc
In-Reply-To: <iba7ge$ot7$>
References: <>
Message-ID: <>

On Tue, Nov 9, 2010 at 11:18 AM, Ron Adam <rrr at> wrote:
> What do you think about adding a new module along with a
> loader module with a basic user api?  The number 3, so that it
> matches python3.x.
> We can then keep the old unchanged and be free to make a lot more
> changes to the file without having to be even a little paranoid.

I think changing the behaviour of the pydoc command line app is a fine
idea - it's only the pydoc.serve and pydoc.gui functions that are
worrying me. As I noted on the tracker issue, there's a reasonably
clean way to do this, even given the coupling between the 3.1 GUI app
and server: leave the existing serve() and gui() functions alone
(aside from adding DeprecationWarning), and add your new
implementation as a parallel private API.
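
A rough sketch of that arrangement follows. These are stand-in functions,
not the real pydoc code: "serve" here only mimics an old public entry
point, and "_start_server" is a hypothetical name for the new private one.

```python
import warnings


def _start_server(port):
    """Hypothetical private entry point for the new implementation."""
    return f"new server on port {port}"


def serve(port):
    """Old public name: behaviour unchanged, but now deprecated."""
    warnings.warn(
        "serve() is deprecated and will be removed in a future release",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to serve()
    )
    return f"old server on port {port}"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_result = serve(8000)
    new_result = _start_server(8000)

print(old_result, new_result, len(caught))
```

Existing callers of the old name keep working and get a warning, while the
new code path is free to change because it was never public.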


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From rrr at  Tue Nov  9 05:28:32 2010
From: rrr at (Ron Adam)
Date: Mon, 08 Nov 2010 22:28:32 -0600
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<>
Message-ID: <>

On 11/08/2010 07:18 PM, Brett Cannon wrote:
> On Mon, Nov 8, 2010 at 16:10, Ron Adam<rrr at>  wrote:

>> def _private_api():
>>     #
>>     # Isn't it a good practice to use comments here?
>>     #
>>     ...
> That is ugly. I already hate doing that for unittest, I'm not about to
> champion that for anything else.

Ugly?  I suppose it's a matter of what you are used to.

> It would also lead to essentially requiring a docstring for
> everything that is public, whether someone wants to bother writing a
> docstring or not. I don't think we should be suggesting that a
> docstring be required either.

I can see where that would be overly strict in an application or script 
made with python.

But it seems odd to me, to have undocumented api's in a programming 
language.  If it's being replaced with something else, the doc string can 
say that.  A null string is also a valid doc string if you just need a 
place holder until someone gets to it.


>> Brett, I'm sure you can come up with a better alternative.   ;-)
> But I don't want to have to do that in the stdlib by remembering what
> modules I should or should not import. This is just as much about
> developer burden on core devs as it is making sure we don't yank the
> rug out from underneath users.

Yes, I agree.  But how to best do that?

From ncoghlan at  Tue Nov  9 05:37:03 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 9 Nov 2010 14:37:03 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 9, 2010 at 1:28 PM, Alexander Belopolsky
<belopolsky at> wrote:
> On Mon, Nov 8, 2010 at 2:58 PM, Brett Cannon <brett at> wrote:
> ..
>> But that doesn't mean we can't go through, fix up our names, and
>> deprecate the old public names; that's fair game in my book.

Indeed. I've now recommended Ron do exactly that for the pydoc patch.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From rrr at  Tue Nov  9 05:47:34 2010
From: rrr at (Ron Adam)
Date: Mon, 08 Nov 2010 22:47:34 -0600
Subject: [Python-Dev] Backward incompatible API changes in the pydoc
In-Reply-To: <>
References: <>	<iba7ge$ot7$>
Message-ID: <ibajp7$2q8$>

On 11/08/2010 10:26 PM, Nick Coghlan wrote:
> On Tue, Nov 9, 2010 at 11:18 AM, Ron Adam<rrr at>  wrote:
>> What do you think about adding a new module along with a
>> loader module with a basic user api?  The number 3, so that it
>> matches python3.x.
>> We can then keep the old unchanged and be free to make a lot more
>> changes to the file without having to be even a little paranoid.
> I think changing the behaviour of the pydoc command line app is a fine
> idea - it's only the pydoc.serve and pydoc.gui functions that are
> worrying me. As I noted on the tracker issue, there's a reasonably
> clean way to do this, even given the coupling between the 3.1 GUI app
> and server: leave the existing serve() and gui() functions alone
> (aside from adding DeprecationWarning), and add your new
> implementation as a parallel private API.

Ok, I guess that's what needs to be done then.  I can try to do it over the 
next few days, and will probably need a bit more advice on how to add in 
the deprecation warnings.  Or if you want to go ahead and do it, I'm more 
than OK with that.

Thanks for the help on this.  I do appreciate it.


From eliben at  Tue Nov  9 06:30:08 2010
From: eliben at (Eli Bendersky)
Date: Tue, 9 Nov 2010 07:30:08 +0200
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 9, 2010 at 04:07, R. David Murray <rdmurray at> wrote:

> On Mon, 08 Nov 2010 18:10:17 -0600, Ron Adam <rrr at> wrote:
> > def _private_api():
> >      #
> >      # Isn't it a good practice to use comments here?
> >      #
> >      ...
> IMO, no.
FWIW, I agree completely. Docstrings are a part of Python I don't see a
reason to leave out for "non-public" code. They're convenient in the
beginning of functions and we all are used to seeing them there. IDEs use
them to display helpful "tooltips" on functions, and so on.

From khamenya at  Tue Nov  9 11:40:16 2010
From: khamenya at (Valery Khamenya)
Date: Tue, 9 Nov 2010 11:40:16 +0100
Subject: [Python-Dev] rlcompleter -- auto-complete dictionary keys (+
In-Reply-To: <>
References: <>
Message-ID: <>

> Can you post your patch on

done -- now both 2.x and 3.x patches are available on

py3k turned out to be *much* friendlier with regard to the unpleasant
Unicode issues that I faced in Python 2.x.

From fuzzyman at  Tue Nov  9 11:59:30 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 09 Nov 2010 10:59:30 +0000
Subject: [Python-Dev] GUI test runner tool
In-Reply-To: <>
References: <>
Message-ID: <>

On 08/11/2010 19:00, Brett Cannon wrote:
> On Mon, Nov 8, 2010 at 04:09, Michael Foord<michael at>  wrote:
>> Hello all,
>> Now that unittest has test discovery, Mark Roddy has been working on
>> resurrecting the old GUI test runner (using Tkinter):
>> This was part of the original pyunit project but I believe it was never part
>> of the standard library:
>> Here's a screenshot of what it looks like:
>> I'd like to propose adding it to Python in Tools/ and am volunteering to
>> maintain it.
> Does that mean upgrading it as well? =)


>   For instance it would be great
> to get it to use ttk so it looks a bit sharper,

I've never used Ttk. Patches welcomed...

> supports skipped tests
> and expected failures,

It already does, the screenshot is a bit old. :-)

>   and dream-of-dreams ties into regrtest so you
> can just check boxes instead of passing a ton of CLI flags.
That would be great, but regrtest is a bit custom. It's a great idea, 
but would need a different UI shell.

>> If the answer is "not yet" that is fine as it can go into
>> unittest2 first. Mark has updated it to work with test discovery and added
>> support for configuring test discovery in the same way as you can from the
>> command line. It is a nice tool for those new to writing tests who aren't
>> yet familiar with the command line or prefer a GUI.
> I personally have no problem with it going into tools as long as it
> can also be used to run the tests in the stdlib.

Unfortunately the stdlib tests largely aren't compatible with test 
discovery. There is an open issue about that. Many of the tests depend 
on being run with regrtest, and use features that are in many places now 
obsolete due to improvements in unittest. No-one has yet done the work 
to switch them over. It is 'on my list' though.

All the best,

Michael Foord

>   Just don't put it in
> Demos/ . =)
> -Brett
>> In its basic form you simply pick a directory and unittestgui will discover
>> and run all the tests it finds. It would be nice if it provided more
>> diagnostic information on tests it ran (clicking through test results) but
>> these can be added later.
>> All the best,
>> Michael Foord
>> --
>> READ CAREFULLY. By accepting and reading this email you agree,
>> on behalf of your employer, to release me from all obligations
>> and waivers arising from any and all NON-NEGOTIATED agreements,
>> licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
>> confidentiality, non-disclosure, non-compete and acceptable use
>> policies (“BOGUS AGREEMENTS”) that I have entered into with your
>> employer, its partners, licensors, agents and assigns, in
>> perpetuity, without prejudice to my ongoing rights and privileges.
>> You further represent that you have the authority to release me
>> from any BOGUS AGREEMENTS on behalf of your employer.
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at
>> Unsubscribe:

From solipsis at  Tue Nov  9 12:53:44 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 9 Nov 2010 12:53:44 +0100
Subject: [Python-Dev] Breaking undocumented API
References: <>
Message-ID: <>

On Tue, 09 Nov 2010 02:03:23 -0000
exarkun at wrote:
> I wonder if there are any actual technical arguments to be made against 
> something like `deprecatedModuleAttribute`?

For example, does it work well with import hacks such as Mercurial's



From solipsis at  Tue Nov  9 13:00:57 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 9 Nov 2010 13:00:57 +0100
Subject: [Python-Dev] r86351 - python/branches/py3k/Lib/
References: <>
Message-ID: <>

On Tue,  9 Nov 2010 04:43:58 +0100 (CET)
raymond.hettinger <python-checkins at> wrote:
> Author: raymond.hettinger
> Date: Tue Nov  9 04:43:58 2010
> New Revision: 86351
> Log:
> Simplify code
> Modified:
>    python/branches/py3k/Lib/
> Modified: python/branches/py3k/Lib/
> ==============================================================================
> --- python/branches/py3k/Lib/	(original)
> +++ python/branches/py3k/Lib/	Tue Nov  9 04:43:58 2010
> @@ -108,30 +108,19 @@
>      _RandomNameSequence is an iterator."""
> -    characters = ("abcdefghijklmnopqrstuvwxyz" +
> -                  "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
> -                  "0123456789_")
> +    characters = "abcdefghijklmnopqrstuvwxyz0123456789_"

Aren't you reducing entropy here?

From fuzzyman at  Tue Nov  9 13:17:13 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 09 Nov 2010 12:17:13 +0000
Subject: [Python-Dev] GUI test runner tool
In-Reply-To: <>
References: <>
Message-ID: <>

On 08/11/2010 19:28, Alexander Belopolsky wrote:
> On Mon, Nov 8, 2010 at 7:09 AM, Michael Foord<michael at>  wrote:
> ..
>> I'd like to propose adding [unittestgui] to Python in Tools/ and am volunteering to
>> maintain it.
> Why not adding it under Lib/unittest/?
I really don't want to make Tk a dependency for unittest itself. :-)

I also don't want a GUI test runner to in any way be part of the *api* 
of unittest...

> I think Tools/  is a less
> attractive location for most users than say PyPI or some other package
> repository.  Tools/ is for stuff that is primarily of interest to
> python developers, not python users.  OS vendors are less likely to
> install packages in Tools/ in a user-visible place than they are a
> popular 3rd-party package.

Well, there's always Demos/. ;-)

I realise that putting it in Tools/ means that distros will probably 
have to make a conscious decision to package it. unittest2 will install 
it as a script. I don't think the gui runner belongs in Demos/, so 
Tools/ is the logical choice for including in core-Python. As Raymond 
says we can point to it in the docs. (The gui test runner is merely a 
convenience / beginners tool - so pointing to more "production suitable" 
tools like Hudson would also be good.)

I was looking to see what else was in Tools/ that was distributed with 
Python, but I don't *think* the Mac distribution includes them at all. 
(freeze is in the Tools/ directory of the repo and is an 'end user' tool 
rather than core-developer tool.) The Mac distribution does put a bunch 
of stuff in the Python 'bin' directory, and ideally it would go there.

All the best,

Michael Foord

From ncoghlan at  Tue Nov  9 13:26:22 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 9 Nov 2010 22:26:22 +1000
Subject: [Python-Dev] r86351 - python/branches/py3k/Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 9, 2010 at 10:00 PM, Antoine Pitrou <solipsis at> wrote:
>> -    characters = ("abcdefghijklmnopqrstuvwxyz" +
>> -                  "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
>> -                  "0123456789_")
>> +    characters = "abcdefghijklmnopqrstuvwxyz0123456789_"
> Aren't you reducing entropy here?

Perhaps in some cases, but it also makes the behaviour consistent
across all platforms.
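
For a sense of scale, the sketch below compares the two alphabets from the
diff. The name length of six characters is an assumption made here for
illustration, not something stated in the checkin message, and the helper
"name_entropy_bits" is hypothetical.

```python
import math

LOWER_ONLY = "abcdefghijklmnopqrstuvwxyz0123456789_"
MIXED_CASE = ("abcdefghijklmnopqrstuvwxyz"
              "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
              "0123456789_")
NAME_LENGTH = 6  # assumed length of a generated name, for illustration only


def name_entropy_bits(alphabet, length):
    """Bits of entropy in a name of `length` characters drawn uniformly."""
    return length * math.log2(len(set(alphabet)))


bits_lower = name_entropy_bits(LOWER_ONLY, NAME_LENGTH)
bits_mixed = name_entropy_bits(MIXED_CASE, NAME_LENGTH)
print(len(LOWER_ONLY), len(MIXED_CASE))
print(round(bits_lower, 2), round(bits_mixed, 2))
```

Under that assumption the mixed-case alphabet buys a few extra bits per
name, but only on filesystems that treat names differing in case as
distinct; on case-insensitive ones the upper-case letters add nothing,
which is the consistency argument.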


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From exarkun at  Tue Nov  9 17:10:23 2010
From: exarkun at (exarkun at
Date: Tue, 09 Nov 2010 16:10:23 -0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <20101109161023.2040.316260932.divmod.xquotient.834@localhost.localdomain>

On 11:53 am, solipsis at wrote:
>On Tue, 09 Nov 2010 02:03:23 -0000
>exarkun at wrote:
>>I wonder if there are any actual technical arguments to be made against
>>something like `deprecatedModuleAttribute`?
>For example, does it work well with import hacks such as Mercurial's

I haven't tried before, but a quick experiment suggests that the two 
happily co-exist (aside from demandimport getting the blame instead of 
the true offending code, but that's really a problem with the warnings 

  >>> import mercurial.demandimport as di
  >>> di.enable()
  >>> import twisted.python.threadpool as tp
  >>> tp.ThreadSafeList
DeprecationWarning: twisted.python.threadpool.ThreadSafeList was 
deprecated in Twisted 10.1.0: This was an internal implementation detail 
of support for Jython 2.1, which is now obsolete.
  return getattr(self._module, attr)
  <class twisted.python.threadpool.ThreadSafeList at 0xb746decc>


From alexander.belopolsky at  Tue Nov  9 17:23:23 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 9 Nov 2010 11:23:23 -0500
Subject: [Python-Dev] [Python-checkins] r86355 -
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 9, 2010 at 4:39 AM, victor.stinner
<python-checkins at> wrote:
> Log:
> Issue #10359: Remove useless comma, invalid in ISO C

C99 allows it.  Which compiler is giving you trouble?

From solipsis at  Tue Nov  9 17:36:57 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 9 Nov 2010 17:36:57 +0100
Subject: [Python-Dev] r86355 - python/branches/py3k/Modules/_pickle.c
References: <>
Message-ID: <>

On Tue, 9 Nov 2010 11:23:23 -0500
Alexander Belopolsky <alexander.belopolsky at> wrote:
> On Tue, Nov 9, 2010 at 4:39 AM, victor.stinner
> <python-checkins at> wrote:
> ..
> > Log:
> > Issue #10359: Remove useless comma, invalid in ISO C
> C99 allows it.  Which compiler is giving you trouble?

One part of the answer is that we generally try to enforce C89
compatibility. I don't know if any modern compiler would mind, though.



From alexander.belopolsky at  Tue Nov  9 18:12:35 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 9 Nov 2010 12:12:35 -0500
Subject: [Python-Dev] r86355 - python/branches/py3k/Modules/_pickle.c
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 9, 2010 at 11:36 AM, Antoine Pitrou <solipsis at> wrote:
>> C99 allows it.  Which compiler is giving you trouble?
> One part of the answer is that we generally try to enforce C89
> compatibility. I don't know if any modern compiler would mind, though.

I know, but if we ever start making exceptions, this would be a
particularly harmless one.  There must be a reason why we don't use
-std=c89 flag with the standard config.  I don't think too many people
remember that c89 allows trailing commas in array and struct
initialization lists, but not in enum declarations.  Without compiler
help, enforcing this is an unnecessary maintenance burden.

From stefan-usenet at  Tue Nov  9 18:29:53 2010
From: stefan-usenet at (Stefan Krah)
Date: Tue, 9 Nov 2010 18:29:53 +0100
Subject: [Python-Dev] r86355 - python/branches/py3k/Modules/_pickle.c
In-Reply-To: <>
References: <>
Message-ID: <>

Alexander Belopolsky <alexander.belopolsky at> wrote:
> On Tue, Nov 9, 2010 at 11:36 AM, Antoine Pitrou <solipsis at> wrote:
> ..
> >> C99 allows it.  Which compiler is giving you trouble?
> >
> > One part of the answer is that we generally try to enforce C89
> > compatibility. I don't know if any modern compiler would mind, though.
> I know, but if we ever start making exceptions, this would be a
> particularly harmless one.  There must be a reason why we don't use
> -std=c89 flag with the standard config.  I don't think too many people
> remember that c89 allows trailing commas in array and struct
> initialization lists, but not in enum declarations.  Without compiler
> help, enforcing this is an unnecessary maintenance burden.

xlc on AIX has problems:

Stefan Krah

From tseaver at  Tue Nov  9 19:49:01 2010
From: tseaver at (Tres Seaver)
Date: Tue, 09 Nov 2010 13:49:01 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <ibc52t$8lf$>

On 11/08/2010 06:26 PM, Bobby Impollonia wrote:

> This does hurt because anyone who was relying on "import *" to get a
> name which is now omitted from __all__ is going to upgrade and find
> their program failing with NameErrors. This is a backwards incompatible
> change and shouldn't happen without a deprecation warning first.

Outside an interactive prompt, anyone using "from foo import *" has set
themselves and their users up to lose anyway.

That syntax is the single worst misfeature in all of Python.  It impairs
readability and discoverability for *no* benefit beyond one-time typing
convenience.  Module writers who compound the error by expecting to be
imported this way, thereby bogarting the global namespace for their own
purposes, should be fish-slapped. ;)

- -- 
Tres Seaver          +1 540-429-0999          tseaver at
Palladion Software   "Excellence by Design"


From merwok at  Tue Nov  9 20:46:41 2010
From: merwok at (Éric Araujo)
Date: Tue, 09 Nov 2010 20:46:41 +0100
Subject: [Python-Dev] [Python-checkins] r86348 - in
 python/branches/py3k/Lib:	test/ xml/etree/
In-Reply-To: <>
References: <>
Message-ID: <>

Hello Senthil

> Author: senthil.kumaran
> New Revision: 86348
> Log: Fix Issue10205 - XML QName error when different tags have same QName.
> Modified:
>    python/branches/py3k/Lib/test/
>    python/branches/py3k/Lib/xml/etree/

Shouldn't this include an entry in NEWS and maybe in ACKS?


From glyph at  Tue Nov  9 21:36:16 2010
From: glyph at (Glyph Lefkowitz)
Date: Tue, 9 Nov 2010 12:36:16 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 8, 2010, at 4:50 PM, Guido van Rossum wrote:
> On Mon, Nov 8, 2010 at 3:55 PM, Glyph Lefkowitz <glyph at> wrote:
>> This seems like a pretty clear case of "practicality beats purity".  Not only has nobody complained about deprecatedModuleAttribute, but there are tons of things which show up in sys.modules that aren't modules in the sense of 'instances of ModuleType'.  The Twisted reactor, for example, is an instance, and we've been doing *that* for about 10 years with no complaints.
> But the Twisted universe is only a subset of the Python universe. The
> Python stdlib needs to move more carefully.

While this is true, I think the Twisted universe generally represents a particularly conservative, compatibility-conscious area within the Python universe (multiverse?).  I know of several Twisted users who regularly upgrade to the most recent version of Twisted without incident, but can't move from Python 2.4->2.5 because of compatibility issues.

That's not to say that there are no areas within the larger Python ecosystem that I'm unaware of where putting non-module-objects into sys.modules would cause issues.  But if it were a practice that were at all common, I suspect that we would have bumped into it by now.


From a.badger at  Tue Nov  9 21:48:06 2010
From: a.badger at (Toshio Kuratomi)
Date: Tue, 9 Nov 2010 12:48:06 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <ibc52t$8lf$>
References: <>
Message-ID: <20101109204806.GB14976@unaka.lan>

On Tue, Nov 09, 2010 at 01:49:01PM -0500, Tres Seaver wrote:
> On 11/08/2010 06:26 PM, Bobby Impollonia wrote:
> > This does hurt because anyone who was relying on "import *" to get a
> > name which is now omitted from __all__ is going to upgrade and find
> > their program failing with NameErrors. This is a backwards incompatible
> > change and shouldn't happen without a deprecation warning first.
> Outside an interactive prompt, anyone using "from foo import *" has set
> themselves and their users up to lose anyway.
> That syntax is the single worst misfeature in all of Python.  It impairs
> readability and discoverability for *no* benefit beyond one-time typing
> convenience.  Module writers who compound the error by expecting to be
> imported this way, thereby bogarting the global namespace for their own
> purposes, should be fish-slapped. ;)
I think there's a valid case for bogarting the namespace in this instance,
but let me know if there's a better way to do it::

# Method to use system libraries if available, otherwise use a bundled copy,
# aka: make both system packagers and developers happy::

Relevant directories and files for this module::

+ foo/
++ compat/
 ++ bar/

foo/compat/bar/ is a bundled module.

foo/compat/bar/ has:

try:
    from bar import *
    from bar import __all__
except ImportError:
    from import *
    from import __all__


From martin at  Tue Nov  9 22:17:25 2010
From: martin at ("Martin v. Löwis")
Date: Tue, 09 Nov 2010 22:17:25 +0100
Subject: [Python-Dev] migration
Message-ID: <>

is moving to a new hardware; this also involves a new IP
address. The migration will happen on Thursday, likely around 8:00 UTC.
If all goes well, outage should be very short.


From ncoghlan at  Tue Nov  9 23:09:09 2010
From: ncoghlan at (Nick Coghlan)
Date: Wed, 10 Nov 2010 08:09:09 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <ibc52t$8lf$>
References: <>
Message-ID: <>

On Wed, Nov 10, 2010 at 4:49 AM, Tres Seaver <tseaver at> wrote:
> Outside an interactive prompt, anyone using "from foo import *" has set
> themselves and their users up to lose anyway.
> That syntax is the single worst misfeature in all of Python.  It impairs
> readability and discoverability for *no* benefit beyond one-time typing
> convenience.  Module writers who compound the error by expecting to be
> imported this way, thereby bogarting the global namespace for their own
> purposes, should be fish-slapped. ;)

Be prepared to fish-slap all of python-dev then - we use precisely
this technique to support optional acceleration modules. The pure
Python versions of pairs like profile/_profile and heapq/_heapq
include a try/except block at the end that does the equivalent of:

  try:
    from _accelerated import * # Allow accelerated overrides
  except ImportError:
    pass # Use pure Python versions

This allows each implementation to make its own decisions about
exactly which parts to accelerate without needing to change the pure
Python version. In CPython itself, different *builds* may vary based
on which components are available during the build process.

There are utility functions provided in that allow us to
make sure that these modules are tested both with and without their
accelerated components.

The new unittest package in 2.7 and 3.2 also uses it in the module
__init__ to present the old "flat" namespace despite becoming a package
under the hood.

Star imports are certainly open to abuse, but there are legitimate use
cases when you want to lie about where particular APIs live in the
module hierarchy. Those use cases generally involve being imported by
one *specific* other module, such that anyone else importing the
module directly *at all* is already doing the wrong thing.
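
A minimal sketch of the try/except star import described above, and of
exercising it both with and without the accelerator. Everything named here
is an assumption made for illustration: `_demo_accel`, `total`, `describe`
and `load_demo` are hypothetical, and the accelerator is a hand-built module
object rather than a real C extension.

```python
import sys
import types

# Hypothetical accelerator module (the name `_demo_accel` is made up). It is
# registered by hand so the sketch runs without a compiled extension, and it
# overrides only one of the two pure Python functions.
accel = types.ModuleType("_demo_accel")
accel.total = lambda values: ("accelerated", sum(values))
sys.modules["_demo_accel"] = accel

# Source of a hypothetical pure Python module that ends with the
# try/except star import.
PURE_SOURCE = """
def total(values):
    result = 0
    for value in values:
        result += value
    return ("pure", result)

def describe():
    return "pure helper"

try:
    from _demo_accel import *  # Allow accelerated overrides
except ImportError:
    pass  # Use pure Python versions
"""


def load_demo():
    """Execute PURE_SOURCE in a fresh module object and return it."""
    module = types.ModuleType("demo")
    exec(PURE_SOURCE, module.__dict__)
    return module


with_accel = load_demo()

# A None entry in sys.modules makes the import raise ImportError, which lets
# the same source be exercised without its accelerator.
sys.modules["_demo_accel"] = None
without_accel = load_demo()
del sys.modules["_demo_accel"]

print(with_accel.total([1, 2, 3]))
print(without_accel.total([1, 2, 3]))
print(with_accel.describe())
```

The pure Python source never names the functions being overridden, which is
the property the star import buys: the accelerator alone decides what it
replaces, and anything it leaves out (here `describe`) stays pure Python.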


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From tseaver at  Tue Nov  9 23:12:00 2010
From: tseaver at (Tres Seaver)
Date: Tue, 09 Nov 2010 17:12:00 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <20101109204806.GB14976@unaka.lan>
References: <>	<>	<>	<>	<ibc52t$8lf$>
Message-ID: <ibcgvg$6rp$>

On 11/09/2010 03:48 PM, Toshio Kuratomi wrote:

> I think there's a valid case for bogarting the namespace in this instance,
> but let me know if there's a better way to do it::
> # Method to use system libraries if available, otherwise use a bundled copy,
> # aka: make both system packagers and developers happy::
> Relevant directories and files for this module::
> + foo/
> +-
> ++ compat/
>  +-
>  ++ bar/
>   +-
>   +-
> foo/compat/bar/ is a bundled module.
> foo/compat/bar/ has:
> try:
>     from bar import *
>     from bar import __all__
> except ImportError:
>     from import *
>     from import __all__

I guess the usual caveats apply for doppelgangers / proxies. ;)

- -- 
Tres Seaver          +1 540-429-0999          tseaver at
Palladion Software   "Excellence by Design"


From foom at  Wed Nov 10 00:33:35 2010
From: foom at (James Y Knight)
Date: Tue, 9 Nov 2010 18:33:35 -0500
Subject: [Python-Dev] Continuing 2.x
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 8, 2010, at 6:08 PM, Lennart Regebro wrote:

> 2010/11/8 James Y Knight <foom at>:
>> On Nov 8, 2010, at 4:42 AM, Lennart Regebro wrote:
>>> So it can be done, but the question is "Why?"
>> To keep the batteries included?
> But they'll only be included in > 2.7, which won't be used much, [...]

If there was going to be an official sanctioned Python 2.8 release, I'm not at all sure that'd be the case. Since there isn't going to be one, then yes, that's probably true.


From orsenthil at  Wed Nov 10 00:48:42 2010
From: orsenthil at (Senthil Kumaran)
Date: Wed, 10 Nov 2010 07:48:42 +0800
Subject: [Python-Dev] [Python-checkins] r86348 - in
 python/branches/py3k/Lib: test/ xml/etree/
In-Reply-To: <>
References: <>
Message-ID: <20101109234842.GA1068@rubuntu>

Hello Éric,

On Tue, Nov 09, 2010 at 08:46:41PM +0100, Éric Araujo wrote:
> Shouldn't this include an entry in NEWS and maybe in ACKS?

It was a very simple bug fix (for an initial oversight), so I did not
add NEWS/ACKS entries. For features, larger fixes or complete patches,
I add NEWS and ACKS as appropriate.


From stephen at  Wed Nov 10 05:12:09 2010
From: stephen at (Stephen J. Turnbull)
Date: Wed, 10 Nov 2010 13:12:09 +0900
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

Nick Coghlan writes:

 > > Module writers who compound the error by expecting to be imported
 > > this way, thereby bogarting the global namespace for their own
 > > purposes, should be fish-slapped. ;)
 > Be prepared to fish-slap all of python-dev then - we use precisely
 > this technique to support optional acceleration modules. The pure
 > Python versions of pairs like profile/_profile and heapq/_heapq
 > include a try/except block at the end that does the equivalent of:
 >   try:
 >     from _accelerated import * # Allow accelerated overrides
 >   except ImportError:
 >     pass # Use pure Python versions

But these identifiers will appear at the module level, not global, no?
Otherwise this technique couldn't be used.  I don't really understand
what Tres is talking about when he writes "modules that expect to be
imported this way".  The *imported* module shouldn't care, no?  This
is an issue for the *importing* code to deal with.

From stephen at  Wed Nov 10 05:20:58 2010
From: stephen at (Stephen J. Turnbull)
Date: Wed, 10 Nov 2010 13:20:58 +0900
Subject: [Python-Dev] Continuing 2.x
In-Reply-To: <>
References: <>
Message-ID: <>

James Y Knight writes:
 > On Nov 8, 2010, at 6:08 PM, Lennart Regebro wrote:
 > > 2010/11/8 James Y Knight <foom at>:
 > >> On Nov 8, 2010, at 4:42 AM, Lennart Regebro wrote:
 > >>> So it can be done, but the question is "Why?"
 > >> 
 > >> To keep the batteries included?
 > > 
 > > But they'll only be included in > 2.7, which won't be used much, [...]
 > If there was going to be an official sanctioned Python
 > 2.8 release, I'm not at all sure that'd be the case. Since there
 > isn't going to be one, then yes, that's probably true.

Which pretty much demonstrates that the argument for a sanctioned 2.8
is weak, and ditto for adding features to 2.7.

Python 2.7 is a great language; existing projects which need to go
beyond that need to port to a different language.  The OP is already
doing that IIUC: Stackless is a pretty faithful implementation of
Python (in several versions of the language, too!), but not quite
100%, right?  OTOH, how many derivatives has C spawned?  Or Pascal,
FORTRAN, LISP?  ML?  And people continue to find that variety
*constraining*, and invent new languages!

python-dev's decision to offer that different language as Python 3,
where *almost all* of your skills will upgrade transparently (even
though unfortunately a lot of code won't, at least not today), is
probably a great boon to developers *in* Python.  Time will tell.

From ncoghlan at  Wed Nov 10 14:32:52 2010
From: ncoghlan at (Nick Coghlan)
Date: Wed, 10 Nov 2010 23:32:52 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Nov 10, 2010 at 10:23 PM, Michael Foord
<fuzzyman at> wrote:
> On 09/11/2010 22:09, Nick Coghlan wrote:
>> The new unittest package in 2.7 and 3.2 also uses it in the module
>> __init__ to present the old "flat" namespace despite become a package
>> under the hood.
> Look again. :-)
> Benjamin did the refactoring into a package and he obviously dislikes
> "import *" as much as me. If he had used "import *" I would have changed it
> anyway, but he didn't.
> We also define a __all__ to make the exported names explicit.

Fair cop :)

(and in that particular case, the maintenance burden in being explicit
is minimal, since new top-level names in unittest are going to be
significantly more rare than new methods on existing unittest classes)

Even some of the acceleration modules (such as _hashlib) use
approaches that are more explicit than using "import *". The point at
least stands for the cases where the pure Python version is largely
agnostic as to exactly which names the acceleration module overrides.
It's a very, very niche use case though, so the default position of
"if you use a star import anywhere other than at the interactive
prompt, you're most likely wrong to do so" is still a reasonable stance
to take :)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From victor.stinner at  Wed Nov 10 13:28:36 2010
From: victor.stinner at (Victor Stinner)
Date: Wed, 10 Nov 2010 13:28:36 +0100
Subject: [Python-Dev] [Python-checkins] r86355 -
In-Reply-To: <>
References: <>
Message-ID: <>

On Tuesday 09 November 2010 17:23:23 Alexander Belopolsky wrote:
> On Tue, Nov 9, 2010 at 4:39 AM, victor.stinner
> <python-checkins at> wrote:
> ..
> > Log:
> > Issue #10359: Remove useless comma, invalid in ISO C
> C99 allows it.  Which compiler is giving you trouble?

I don't know, but the commit is trivial and cheap. If it improves support
on uncommon compilers, I am happy to commit such a change.


From flub at  Wed Nov 10 10:20:09 2010
From: flub at (Floris Bruynooghe)
Date: Wed, 10 Nov 2010 09:20:09 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On 10 November 2010 04:12, Stephen J. Turnbull <stephen at> wrote:
> Nick Coghlan writes:
>  > > Module writers who compound the error by expecting to be imported
>  > > this way, thereby bogarting the global namespace for their own
>  > > purposes, should be fish-slapped. ;)
>  >
>  > Be prepared to fish-slap all of python-dev then - we use precisely
>  > this technique to support optional acceleration modules. The pure
>  > Python versions of pairs like profile/_profile and heapq/_heapq
>  > include a try/except block at the end that does the equivalent of:
>  >
>  >   try:
>  >     from _accelerated import * # Allow accelerated overrides
>  >   except ImportError:
>  >     pass # Use pure Python versions
> But these identifiers will appear at the module level, not global, no?
> Otherwise this technique couldn't be used.  I don't really understand
> what Tres is talking about when he writes "modules that expect to be
> imported this way".  The *imported* module shouldn't care, no?  This
> is an issue for the *importing* code to deal with.

I can't think of stdlib examples, but for 3rd party packages I'd say
storm.locals and fabric.api are examples of packages designed with
"from foo import * " in mind.  So this does happen.


Debian GNU/Linux -- The Power of Freedom | |

From hrvoje.niksic at  Wed Nov 10 13:23:35 2010
From: hrvoje.niksic at (Hrvoje Niksic)
Date: Wed, 10 Nov 2010 13:23:35 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<ibc52t$8lf$>	<>
Message-ID: <>

On 11/10/2010 05:12 AM, Stephen J. Turnbull wrote:
> But these identifiers will appear at the module level, not global, no?
> Otherwise this technique couldn't be used.  I don't really understand
> what Tres is talking about when he writes "modules that expect to be
> imported this way".  The *imported* module shouldn't care, no?

I think he's referring to the choice of identifiers, and the usage 
examples given in the documentation and tutorials.  For example, in the 
original PyGTK, all identifiers included "Gtk" in the name, so it made 
sense to write from pygtk import * so you could spell GtkWindow as 
GtkWindow rather than the redundant pygtk.GtkWindow.  In that sense the 
module writer "expected" to be imported this way, although you are right 
that it doesn't matter in the least for the correct operation of the 
module itself.  For GTK 2, PyGTK switched to "gtk.Window", which 
effectively removes the temptation to import * from the module.

There are other examples of that school, most notably ctypes, but also 
Tkinter and the python2 threading module.  Fortunately it has become 
much less popular in the last ~5 years of Python history.

From jcea at  Wed Nov 10 12:34:47 2010
From: jcea at (Jesus Cea)
Date: Wed, 10 Nov 2010 12:34:47 +0100
Subject: [Python-Dev] migration
In-Reply-To: <>
References: <>
Message-ID: <>

On 09/11/10 22:17, "Martin v. Löwis" wrote:
> is moving to a new hardware; this also involves a new IP
> address. The migration will happen on Thursday, likely around 8:00 UTC.
> If all goes well, outage should be very short.

Seems to be offline now. I get timeouts.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From raymond.hettinger at  Wed Nov 10 20:33:55 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Wed, 10 Nov 2010 11:33:55 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On Nov 10, 2010, at 5:47 AM, Michael Foord wrote:

> So it is obvious that we don't have a clearly stated policy for what defines the public API of standard library modules.
> How about making this explicit (either pep 8 or our developer docs):

I believe the point of Guido's email was that it is a situation dependent judgment call and not readily boiled down to a set of rules for PEP 8.


From fuzzyman at  Wed Nov 10 13:23:17 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 10 Nov 2010 12:23:17 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<ibc52t$8lf$>
Message-ID: <>

On 09/11/2010 22:09, Nick Coghlan wrote:
> On Wed, Nov 10, 2010 at 4:49 AM, Tres Seaver<tseaver at>  wrote:
>> Outside an interactive prompt, anyone using "from foo import *" has set
>> themselves and their users up to lose anyway.
>> That syntax is the single worst misfeature in all of Python.  It impairs
>> readability and discoverability for *no* benefit beyond one-time typing
>> convenience.  Module writers who compound the error by expecting to be
>> imported this way, thereby bogarting the global namespace for their own
>> purposes, should be fish-slapped. ;)
> Be prepared to fish-slap all of python-dev then - we use precisely
> this technique to support optional acceleration modules. The pure
> Python versions of pairs like profile/_profile and heapq/_heapq
> include a try/except block at the end that does the equivalent of:
>    try:
>      from _accelerated import * # Allow accelerated overrides
>    except ImportError:
>      pass # Use pure Python versions
> This allows each implementation to make its own decisions about
> exactly which parts to accelerate without needing to change the pure
> Python version. In CPython itself, different *builds* may vary based
> on which components are available during the build process.
> There are utility functions provided in that allow us to
> make sure that these modules are tested both with and without their
> accelerated components.
> The new unittest package in 2.7 and 3.2 also uses it in the module
> __init__ to present the old "flat" namespace despite become a package
> under the hood.

Look again. :-)

Benjamin did the refactoring into a package and he obviously dislikes 
"import *" as much as me. If he had used "import *" I would have changed 
it anyway, but he didn't.

We also define a __all__ to make the exported names explicit.

All the best,


> Star imports are certainly open to abuse, but there are legitimate use
> cases when you want to lie about where particular APIs live in the
> module heirarchy. Those use cases generally involve being imported by
> one *specific* other module, such that anyone else importing the
> module directly *at all* is already doing the wrong thing.
> Cheers,
> Nick.


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From fuzzyman at  Wed Nov 10 14:47:39 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 10 Nov 2010 13:47:39 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 08/11/2010 22:07, Raymond Hettinger wrote:
> On Nov 8, 2010, at 11:58 AM, Brett Cannon wrote:
>> I think we need to, as a group, decide how to handle undocumented APIs
>> that don't have a leading underscore: they get treated just the same
>> as the documented APIs, or are they private regardless and thus we can
>> change them at our whim?
> To start with, it doesn't hurt for a maintainer to add an __all__ entry and to only document the parts of the API we think need to be exposed.  That way, we can at least declare the parts that are intended to be public on a go-forward basis.
> For the most part, the non-underscored parts of the API shouldn't be changed "at our whim".  Some sense needs to be applied to the decision.  Google's code search is great for showing how people actually have used a module in real world code.  If that shows that people are accessing and/or changing an attribute, it probably needs to remain exposed.   In the absence of a code search, good guesses can be made about what someone might reasonably and usefully be accessing (i.e. glob0 isn't likely).   The goal is to improve the standard library while minimizing breakage, and that will involve trade-offs depending on what is being changed.
> IIRC, we've been trying to get away from deprecations because they're so disruptive.  For example, when the pprint rewrite is finally ready, if there is an incompatible API change, I expect that a new clean class will be offered, but that the old will be left in-place so that tons of existing code won't break).  Likewise, with the unittest clean-ups, I'm expecting that Michael will introduce aliases when fixing-up mis-named methods, rather than break code that uses the existing names.

So it is obvious that we don't have a clearly stated policy for what 
defines the public API of standard library modules.

How about making this explicit (either pep 8 or our developer docs):

If a module or package defines __all__ that authoritatively defines the 
public interface. Modules with __all__ SHOULD still respect the naming 
conventions (leading underscore for private members) to avoid confusing 
users. Modules SHOULD NOT export private members in __all__.

Names imported into a module are never considered part of its public API 
unless documented to be so or included in __all__.

Methods / functions / classes and module attributes whose names begin 
with a leading underscore are private.

If a class name begins with a leading underscore none of its members are 
public, whether or not they begin with a leading underscore.

If a module name in a package begins with a leading underscore none of 
its members are public, whether or not they begin with a leading underscore.

If a module or package doesn't define __all__ then all names that don't 
start with a leading underscore are public.

All public members MUST be documented. Public functions, methods and 
classes SHOULD have docstrings. Private members may have docstrings.

Where in the standard library this means that a module exports stuff 
that isn't helpful or shouldn't be part of the public API we need to 
migrate to private names and follow our deprecation process for the 
public names.
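
As a rough illustration of how the rules above could be checked
mechanically, here is a minimal sketch. The helper names `public_names` and
`make_module` are hypothetical, and the test for "imported into a module"
(skipping module objects and callables whose `__module__` names another
module) is an assumption of this sketch, not part of the proposal.

```python
import types


def public_names(module):
    """Return the names the proposed rules would treat as public."""
    declared = getattr(module, "__all__", None)
    if declared is not None:
        # Rule: __all__, when defined, is authoritative.
        return sorted(declared)
    names = []
    for name, value in vars(module).items():
        if name.startswith("_"):
            continue  # Rule: leading underscore means private.
        if isinstance(value, types.ModuleType):
            continue  # Assumption: an imported module is not public API.
        defined_in = getattr(value, "__module__", None)
        if callable(value) and defined_in != module.__name__:
            continue  # Assumption: callable was imported from elsewhere.
        names.append(name)
    return sorted(names)


def make_module(name, source):
    """Build a throwaway module object from source text for the demo."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


implicit = make_module(
    "implicit",
    "import os\n"
    "from os.path import join\n"
    "LIMIT = 3\n"
    "def run():\n"
    "    return LIMIT\n"
    "def _helper():\n"
    "    return None\n",
)
explicit = make_module(
    "explicit",
    "__all__ = ['run']\n"
    "LIMIT = 3\n"
    "def run():\n"
    "    return LIMIT\n",
)

print(public_names(implicit))
print(public_names(explicit))
```

Without `__all__`, the sketch reports `LIMIT` and `run` but not `os`, `join`
or `_helper`; with `__all__`, only what is listed counts, even though `LIMIT`
has no leading underscore.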

All the best,

Michael Foord
> my-two-cents,
> Raymond
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at
> Unsubscribe:



From rdmurray at  Wed Nov 10 16:48:37 2010
From: rdmurray at (R. David Murray)
Date: Wed, 10 Nov 2010 10:48:37 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, 10 Nov 2010 13:12:09 +0900, "Stephen J. Turnbull" <stephen at> wrote:
> Nick Coghlan writes:
>  > > Module writers who compound the error by expecting to be imported
>  > > this way, thereby bogarting the global namespace for their own
>  > > purposes, should be fish-slapped. ;)
>  > 
>  > Be prepared to fish-slap all of python-dev then - we use precisely
>  > this technique to support optional acceleration modules. The pure
>  > Python versions of pairs like profile/_profile and heapq/_heapq
>  > include a try/except block at the end that does the equivalent of:
>  > 
>  >   try:
>  >     from _accelerated import * # Allow accelerated overrides
>  >   except ImportError:
>  >     pass # Use pure Python versions
> But these identifiers will appear at the module level, not global, no?
> Otherwise this technique couldn't be used.  I don't really understand
> what Tres is talking about when he writes "modules that expect to be
> imported this way".  The *imported* module shouldn't care, no?  This
> is an issue for the *importing* code to deal with.

I think Tres was referring to certain packages (which shall remain
nameless since I don't feel like googling to find one) whose
documentation recommends the 'from <me> import *' methodology.

At least that's how I read "Module writers who..."  (that is, he's not
saying the *module* expects to be imported that way). [*]

R. David Murray                            

[*] although reading that sentence literally, the thought of such a
module writer themselves being imported that way (a la Tron) has a
certain charm....

From tseaver at  Wed Nov 10 17:58:17 2010
From: tseaver at (Tres Seaver)
Date: Wed, 10 Nov 2010 11:58:17 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<ibc52t$8lf$>	<>
Message-ID: <ibeiv9$qr4$>

On 11/09/2010 11:12 PM, Stephen J. Turnbull wrote:
> Nick Coghlan writes:
>  > > Module writers who compound the error by expecting to be imported
>  > > this way, thereby bogarting the global namespace for their own
>  > > purposes, should be fish-slapped. ;)
>  > 
>  > Be prepared to fish-slap all of python-dev then - we use precisely
>  > this technique to support optional acceleration modules. The pure
>  > Python versions of pairs like profile/_profile and heapq/_heapq
>  > include a try/except block at the end that does the equivalent of:
>  > 
>  >   try:
>  >     from _accelerated import * # Allow accelerated overrides
>  >   except ImportError:
>  >     pass # Use pure Python versions
> But these identifiers will appear at the module level, not global, no?
> Otherwise this technique couldn't be used.  I don't really understand
> what Tres is talking about when he writes "modules that expect to be
> imported this way".  The *imported* module shouldn't care, no?  This
> is an issue for the *importing* code to deal with.

Right -- "private" star imports aren't the issue for me, because the
same user who creates them is responsible for the other end of the
stick.  I was ranting about library authors who document star imports as
the expected usage pattern for their external users.

Note that I still wouldn't use star imports in the "private
acceleration" case myself.  I would prefer a pattern like:

----------------------- $< -----------------------------

# Pure python API implementation
def foo(spat, blarg):

def bar(qux):

# Replace with accelerated C implementation
try:
    import _spam
except ImportError:
    pass # accelerated version not available
else:
    foo =
    bar =
----------------------- $< -----------------------------

This explicit name remapping catches unintentional errors (e.g., _spam
renames a method) better than the star import.
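
A minimal sketch of that difference, under stated assumptions: the `_spam`
module here is a hand-built stand-in in which `bar` has been accidentally
renamed to `bar2`, and `remap_explicitly` / `remap_with_star` are
hypothetical helpers that mimic the two binding styles on a plain dict
instead of a real module namespace.

```python
import types


# Pure Python API implementation (illustrative bodies).
def foo(spat, blarg):
    return ("pure", spat + blarg)


def bar(qux):
    return ("pure", qux * 2)


# Hypothetical accelerator in which `bar` was accidentally renamed `bar2`.
_spam = types.ModuleType("_spam")
_spam.foo = lambda spat, blarg: ("accelerated", spat + blarg)
_spam.bar2 = lambda qux: ("accelerated", qux * 2)


def remap_explicitly(accel):
    """Bind each name by hand; a missing name raises AttributeError."""
    return {"foo": accel.foo, "bar": accel.bar}


def remap_with_star(accel, pure):
    """Mimic a star import: copy every public name over the pure versions."""
    merged = dict(pure)
    for name, value in vars(accel).items():
        if not name.startswith("_"):
            merged[name] = value
    return merged


pure = {"foo": foo, "bar": bar}

# The star-style merge succeeds and silently keeps the pure `bar`.
star_result = remap_with_star(_spam, pure)

# The explicit remapping fails loudly on the renamed function.
try:
    remap_explicitly(_spam)
    explicit_error = None
except AttributeError as exc:
    explicit_error = exc

print(star_result["bar"](2))
print(type(explicit_error).__name__)
```

With the star-style merge the rename goes unnoticed and callers quietly get
the slow `bar`; with explicit remapping the mismatch surfaces as an
exception the first time the module is imported.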

- -- 
Tres Seaver          +1 540-429-0999          tseaver at
Palladion Software   "Excellence by Design"


From greg.ewing at  Wed Nov 10 21:44:26 2010
From: greg.ewing at (Greg Ewing)
Date: Thu, 11 Nov 2010 09:44:26 +1300
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

Stephen J. Turnbull wrote:
> I don't really understand
> what Tres is talking about when he writes "modules that expect to be
> imported this way".  The *imported* module shouldn't care, no?  This
> is an issue for the *importing* code to deal with.

I think he's talking about modules that add a prefix to all
of their exported names, such as Tkinter starting everything
with "Tk", on the expectation that import * will be the normal
way of using the module.

For very well-known modules with very well-known prefixes,
this probably doesn't do too much harm, since it's usually
fairly obvious where a given name is coming from. However,
it's probably best not encouraged, as it could lead people
who don't know better into bad habits.

There's also the downside that people who choose *not* to
use import *, and instead import the module itself and use
qualified references, end up with everything being prefixed
twice, e.g. 'import Tkinter as tk' leads to 'tk.TkWhatever'.

On the other hand, when wrapping a C library there's a desire
to keep the Python names as close as possible to the C ones,
which usually come with prefixes to manage C's totally-global
namespace. So there's a bit of a double bind there.


From brett at  Wed Nov 10 21:45:53 2010
From: brett at (Brett Cannon)
Date: Wed, 10 Nov 2010 12:45:53 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Nov 10, 2010 at 05:47, Michael Foord <fuzzyman at> wrote:
> On 08/11/2010 22:07, Raymond Hettinger wrote:
>> On Nov 8, 2010, at 11:58 AM, Brett Cannon wrote:
>>> I think we need to, as a group, decide how to handle undocumented APIs
>>> that don't have a leading underscore: they get treated just the same
>>> as the documented APIs, or are they private regardless and thus we can
>>> change them at our whim?
>> To start with, it doesn't hurt for a maintainer to add an __all__ entry
>> and to only document the parts of the API we think need to be exposed.  That
>> way, we can at least declare the parts that are intended to be public on a
>> go-forward basis.
>> For the most part, the non-underscored parts of the API shouldn't be
>> changed "at our whim".  Some sense needs to be applied to the decision.
>> Google's code search is great for showing how people actually have used a
>> module in real world code.  If that shows that people are accessing and/or
>> changing an attribute, it probably needs to remain exposed.  In the absence
>> of a code search, good guesses can be made about what someone might
>> reasonably and usefully be accessing (i.e. glob0 isn't likely).  The goal
>> is to improve the standard library while minimizing breakage, and that will
>> involve trade-offs depending on what is being changed.
>> IIRC, we've been trying to get away from deprecations because they're so
>> disruptive.  For example, when the pprint rewrite is finally ready, if there
>> is an incompatible API change, I expect that a new clean class will be
>> offered, but that the old will be left in-place so that tons of existing
>> code won't break.  Likewise, with the unittest clean-ups, I'm expecting
>> that Michael will introduce aliases when fixing-up mis-named methods, rather
>> than break code that uses the existing names.
> So it is obvious that we don't have a clearly stated policy for what defines
> the public API of standard library modules.
> How about making this explicit (either pep 8 or our developer docs):
> If a module or package defines __all__ that authoritatively defines the
> public interface. Modules with __all__ SHOULD still respect the naming
> conventions (leading underscore for private members) to avoid confusing
> users. Modules SHOULD NOT export private members in __all__.
> Names imported into a module are never considered part of its public API
> unless documented to be so or included in __all__.
> Methods / functions / classes and module attributes whose names begin with a
> leading underscore are private.
> If a class name begins with a leading underscore none of its members are
> public, whether or not they begin with a leading underscore.
> If a module name in a package begins with a leading underscore none of its
> members are public, whether or not they begin with a leading underscore.
> If a module or package doesn't define __all__ then all names that don't
> start with a leading underscore are public.
> All public members MUST be documented. Public functions, methods and classes
> SHOULD have docstrings. Private members may have docstrings.
> Where in the standard library this means that a module exports stuff that
> isn't helpful or shouldn't be part of the public API we need to migrate to
> private names and follow our deprecation process for the public names.

All sounds reasonable to me and what common practice out in the community is.
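
The naming part of the rules quoted above can be sketched as a small
helper.  This is only an illustration: `is_public_path` is a hypothetical
name, it looks at dotted names as plain strings, it ignores `__all__` and
documentation entirely, and it treats double-underscore names like any
other leading-underscore name, which is a simplification.

```python
def is_public_path(dotted_name):
    """Return True if no component of a dotted name starts with an underscore.

    A private module or class makes everything inside it private, whether or
    not the member itself has a leading underscore.
    """
    return not any(part.startswith("_") for part in dotted_name.split("."))


# Hypothetical names, used only to exercise the rule.
examples = [
    "pkg.module.func",            # public all the way down
    "pkg.module._helper",         # private function
    "pkg.module._Helper.method",  # member of a private class
    "pkg._impl.func",             # member of a private module
]
results = {name: is_public_path(name) for name in examples}
```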


> All the best,
> Michael Foord
>> my-two-cents,
>> Raymond
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at
>> Unsubscribe:
> --
> READ CAREFULLY. By accepting and reading this email you agree,
> on behalf of your employer, to release me from all obligations
> and waivers arising from any and all NON-NEGOTIATED agreements,
> licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
> confidentiality, non-disclosure, non-compete and acceptable use
> policies ("BOGUS AGREEMENTS") that I have entered into with your
> employer, its partners, licensors, agents and assigns, in
> perpetuity, without prejudice to my ongoing rights and privileges.
> You further represent that you have the authority to release me
> from any BOGUS AGREEMENTS on behalf of your employer.

From alexander.belopolsky at  Wed Nov 10 21:48:38 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Wed, 10 Nov 2010 15:48:38 -0500
Subject: [Python-Dev] [Python-checkins] r86355 -
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Nov 10, 2010 at 7:28 AM, Victor Stinner
<victor.stinner at> wrote:
> I don't know, but the commit is trivial and cheap. If it improves the support
> on uncommon compiler, I agree to commit such change.

But it does it at the cost of invalidating the "svn blame" for the
last enum entry now and for future additions.   The problem is that
when you change from

enum {
  ..
  X
}

to

enum {
  ..
  X,
  Y
}

you modify the X line while you are not responsible for adding the X
entry.  Someone who will then add Z, will be blamed for Y as well.

From barry at  Wed Nov 10 22:19:40 2010
From: barry at (Barry Warsaw)
Date: Wed, 10 Nov 2010 16:19:40 -0500
Subject: [Python-Dev] migration
In-Reply-To: <>
References: <>
Message-ID: <20101110161940.21d978f5@mission>

On Nov 10, 2010, at 12:34 PM, Jesus Cea wrote:

>On 09/11/10 22:17, "Martin v. L?wis" wrote:
>> is moving to a new hardware; this also involves a new IP
>> address. The migration will happen on Thursday, likely around 8:00 UTC.
>> If all goes well, outage should be very short.
>Seems to be offline now. I get timeouts.

I just had no problems updating issue 9807.

From barry at  Wed Nov 10 22:27:19 2010
From: barry at (Barry Warsaw)
Date: Wed, 10 Nov 2010 16:27:19 -0500
Subject: [Python-Dev] issue 9807 - abiflags in paths and symlinks (updated
Message-ID: <20101110162719.11ae7fe6@mission>

I finally found a chance to address all the outstanding technical issues
mentioned in bug 9807:

I've uploaded a new patch which contains the rest of the changes I'm
proposing.  I think we still need consensus about whether these changes are
good to commit.  With 3.2b1 coming soon, now's the time to do that.

If there are any remaining concerns about the details of the patch, please add
them to the tracker issue.  If you have any remaining objections to the
change, please let me know or follow up here.


From foom at  Wed Nov 10 23:21:54 2010
From: foom at (James Y Knight)
Date: Wed, 10 Nov 2010 17:21:54 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On Nov 10, 2010, at 8:47 AM, Michael Foord wrote:
> How about making this explicit (either pep 8 or our developer docs):
> If a module or package defines __all__ that authoritatively defines the public interface. Modules with __all__ SHOULD still respect the naming conventions (leading underscore for private members) to avoid confusing users. Modules SHOULD NOT export private members in __all__.

I don't like the idea of the authoritative definition of a public interface being defined based on __all__, because that provides users almost no warning that they're using a private API: the __all__ attribute doesn't do anything if you aren't using import *. If there was some proposal to make it so that accessing an attribute not in __all__ did prevent or somehow warn users that they're doing something dangerous, that'd be different, but there isn't such a proposal, and I don't even know what such a proposal would look like...

On the other hand, if you make the primary mechanism to indicate privateness be a leading underscore, that's obvious to everyone.
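
The point about `__all__` above can be seen in a small sketch.  The module
`demo_mod` and its two functions are hypothetical and are built in memory
purely for the demonstration.

```python
import sys
import types

# Build a throwaway module in memory; "demo_mod" and its contents are hypothetical.
demo_mod = types.ModuleType("demo_mod")
exec(
    "__all__ = ['public']\n"
    "def public():\n"
    "    return 'supported'\n"
    "def helper():\n"
    "    return 'internal'\n",
    demo_mod.__dict__,
)
sys.modules["demo_mod"] = demo_mod

# "from demo_mod import *" honours __all__ ...
star_namespace = {}
exec("from demo_mod import *", star_namespace)
star_names = sorted(name for name in star_namespace if not name.startswith("__"))

# ... but plain attribute access does not consult __all__ and gives no warning.
helper_result = demo_mod.helper()
```

Only `public` reaches the star-import namespace, yet `demo_mod.helper()`
runs without complaint, so a user who never writes `import *` gets no hint
from `__all__` that `helper` was meant to be internal.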


From rrr at  Thu Nov 11 00:10:21 2010
From: rrr at (Ron Adam)
Date: Wed, 10 Nov 2010 17:10:21 -0600
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <ibf8p4$ajc$>

On 11/10/2010 01:33 PM, Raymond Hettinger wrote:
> On Nov 10, 2010, at 5:47 AM, Michael Foord wrote:
>> So it is obvious that we don't have a clearly stated policy for what defines the public API of standard library modules.
>> How about making this explicit (either pep 8 or our developer docs):
> I believe the point of Guido's email was that it is a situation dependent judgment call and not readily boiled down to a set of rules for PEP 8.

The way I read Guido's email is that it is a situation dependent judgment 
call for those cases that aren't clear.

I think what Michael is trying to say is for us to agree on some things so 
we can go forward with a little more clarity.


From jcea at  Thu Nov 11 01:20:16 2010
From: jcea at (Jesus Cea)
Date: Thu, 11 Nov 2010 01:20:16 +0100
Subject: [Python-Dev] migration
In-Reply-To: <20101110161940.21d978f5@mission>
References: <>	<>
Message-ID: <>

On 10/11/10 22:19, Barry Warsaw wrote:
> On Nov 10, 2010, at 12:34 PM, Jesus Cea wrote:
>> Seems to be offline now. I get timeouts.
> I just had no problems updating issue 9807.

That was 10 hours after my message :-).

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From glyph at  Thu Nov 11 03:41:11 2010
From: glyph at (Glyph Lefkowitz)
Date: Wed, 10 Nov 2010 18:41:11 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On Nov 10, 2010, at 2:21 PM, James Y Knight wrote:

> On the other hand, if you make the primary mechanism to indicate privateness be a leading underscore, that's obvious to everyone.


One of the best features of Python is the ability to make a conscious decision to break the interface of a library and just get on with your work, even if your use-case is not really supported, because nothing can stop you calling its private functionality.

But, IMHO the worst problem with Python is the fact that you can do this _without realizing it_ and pay a steep maintenance price later when an upgrade of something springs the trap that you had unwittingly set for yourself.

The leading-underscore convention is the only thing I've found that even mitigates this problem.

From stephen at  Thu Nov 11 04:04:39 2010
From: stephen at (Stephen J. Turnbull)
Date: Thu, 11 Nov 2010 12:04:39 +0900
Subject: [Python-Dev] [Python-checkins] r86355
	-	python/branches/py3k/Modules/_pickle.c
In-Reply-To: <>
References: <>
Message-ID: <>

Alexander Belopolsky writes:
 > On Wed, Nov 10, 2010 at 7:28 AM, Victor Stinner
 > <victor.stinner at> wrote:
 > ..
 > > I don't know, but the commit is trivial and cheap. If it improves the support
 > > on uncommon compiler, I agree to commit such change.
 > >
 > But it does it at the cost of invalidating the "svn blame" for the
 > last enum entry now and for future additions.   The problem is that
 > when you change from
 > enum {
 >   ..
 >   X
 > }
 > to
 > enum {
 >   ..
 >   X,
 >   Y
 > }

If that bothers you, you can write

enum {
  , B
  /* etc */
  , X


enum {
  /* etc */

I prefer the last; it's a compiler (and debugger) space burden, but
shouldn't affect the running python.  On the original question, I
think it's preferable to keep compilers happy unless you're willing to
*require* C99.

From alexander.belopolsky at  Thu Nov 11 05:31:22 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Wed, 10 Nov 2010 23:31:22 -0500
Subject: [Python-Dev] [Python-checkins] r86355 -
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Nov 10, 2010 at 10:04 PM, Stephen J. Turnbull
<stephen at> wrote:
> ...  On the original question, I
> think it's preferable to keep compilers happy unless you're willing to
> *require* C99.

Hmm, maybe I should take another look at .

Note that issue #10359 was not about any real compiler - it was about
compiling with gcc -pedantic.  If we *require* pedantic c89 compliance
- we should add -pedantic -std=c89 to the standard build flags.
Otherwise non-compliant code will accumulate between "ISO C cleanups"
and such cleanups will continue to pollute VC logs.

From alexander.belopolsky at  Thu Nov 11 06:41:16 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Thu, 11 Nov 2010 00:41:16 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <ibf8p4$ajc$>
References: <>
Message-ID: <>

On Wed, Nov 10, 2010 at 6:10 PM, Ron Adam <rrr at> wrote:
>> On Nov 10, 2010, at 5:47 AM, Michael Foord wrote:
>>> So it is obvious that we don't have a clearly stated policy for what
>>> defines the public API of standard library modules.
>>> How about making this explicit (either pep 8 or our developer docs):
>> ..
> The way I read Guido's email is that it is a situation dependent judgment
> call for those cases that aren't clear.
> I think what Michael is trying to say is for us to agree on some things so
> we can go forward with a little more clarity.

I don't understand why everyone seems to have accepted Michael's
premise that "we don't have a clearly stated policy for what defines
the public API of standard library modules."  We do have such a policy
and it is well known (while the location in the reference manual may
not be):

"""
The public names defined by a module are determined by checking the
module's namespace for a variable named __all__; if defined, it must
be a sequence of strings which are names defined or imported by that
module. The names given in __all__ are all considered public and are
required to exist. If __all__ is not defined, the set of public names
includes all names found in the module's namespace which do not begin
with an underscore character ('_'). __all__ should contain the entire
public API. It is intended to avoid accidentally exporting items that
are not part of the API (such as library modules which were imported
and used within the module).
"""  -- <>
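
The rule quoted above can be sketched in a few lines.  `public_names` is a
hypothetical helper, not something in the standard library, and the two
in-memory modules exist only to exercise it.

```python
import types


def public_names(module):
    """Return the names the quoted rule treats as public for this module."""
    if hasattr(module, "__all__"):
        # __all__ is authoritative when it is defined.
        return list(module.__all__)
    # Otherwise: every name in the namespace without a leading underscore.
    return [name for name in vars(module) if not name.startswith("_")]


# Hypothetical module that defines __all__ and also holds an incidental name.
with_all = types.ModuleType("with_all")
with_all.__all__ = ["run"]
with_all.run = lambda: None
with_all.incidental = object()

# Hypothetical module without __all__.
without_all = types.ModuleType("without_all")
without_all.run = lambda: None
without_all._private = object()

names_with_all = public_names(with_all)
names_without_all = public_names(without_all)
```

Note that in the second case any module imported at the top of the file
would count as public too, which is the accident `__all__` is meant to
prevent.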

The question that I had when I started this thread was not about a
definition of "public API."  It was about a policy with respect to
modules that precede the introduction of __all__ and the modern
definition of public names.  (See r18692 "Two changes to
from...import", and r23920 ' adding a definition of "public names"'.)

Is it OK to add __all__ to such modules that does not include all
names not starting with an underscore?  Is it OK to then remove names
that clearly were not intended to be public?

Case in point: trace.rx_blank.  See also <>.

From stephen at  Thu Nov 11 07:09:58 2010
From: stephen at (Stephen J. Turnbull)
Date: Thu, 11 Nov 2010 15:09:58 +0900
Subject: [Python-Dev] [Python-checkins] r86355
	-	python/branches/py3k/Modules/_pickle.c
In-Reply-To: <>
References: <>
Message-ID: <>

Alexander Belopolsky writes:
 > On Wed, Nov 10, 2010 at 10:04 PM, Stephen J. Turnbull
 > <stephen at> wrote:
 > > ...  On the original question, I
 > > think it's preferable to keep compilers happy unless you're willing to
 > > *require* C99.
 > Hmm, maybe I should take another look at .
 > Note that issue #10359 was not about any real compiler

True, but a real compiler has been mentioned in the thread, and I know
that every time XEmacs lets a non-C89 feature slip through (most
commonly, "//" comments and declarations following non-declarations,
the latter being a killer feature in C-like languages IMO, but our
current coding standard says "C89") we get build breakage reports.

From fuzzyman at  Thu Nov 11 12:51:26 2010
From: fuzzyman at (Michael Foord)
Date: Thu, 11 Nov 2010 11:51:26 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<ibc52t$8lf$>	<>	<>
Message-ID: <>

On 10/11/2010 15:48, R. David Murray wrote:
> On Wed, 10 Nov 2010 13:12:09 +0900, "Stephen J. Turnbull"<stephen at>  wrote:
>> Nick Coghlan writes:
>>   >  >  Module writers who compound the error by expecting to be imported
>>   >  >  this way, thereby bogarting the global namespace for their own
>>   >  >  purposes, should be fish-slapped. ;)
>>   >
>>   >  Be prepared to fish-slap all of python-dev then - we use precisely
>>   >  this technique to support optional acceleration modules. The pure
>>   >  Python versions of pairs like profile/_profile and heapq/_heapq
>>   >  include a try/except block at the end that does the equivalent of:
>>   >
>>   >    try:
>>   >      from _accelerated import * # Allow accelerated overrides
>>   >    except ImportError:
>>   >      pass # Use pure Python versions
>> But these identifiers will appear at the module level, not global, no?
>> Otherwise this technique couldn't be used.  I don't really understand
>> what Tres is talking about when he writes "modules that expect to be
>> imported this way".  The *imported* module shouldn't care, no?  This
>> is an issue for the *importing* code to deal with.
> I think Tres was referring to certain packages (which shall remain
> nameless since I don't feel like googling to find one) whose
> documentation recommends the 'from<me>  import *' methodology.

Contenders include popular libraries like fabric and django:

All the best,


> At least that's how I read "Module writers who..."  (that is, he's not
> saying the *module* expects to be imported that way). [*]
> --
> R. David Murray                            
> [*] although reading that sentence literally, the thought of such a
> module writer themselves being imported that way (a la Tron) has a
> certain charm....

From fuzzyman at  Thu Nov 11 13:01:16 2010
From: fuzzyman at (Michael Foord)
Date: Thu, 11 Nov 2010 12:01:16 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<ibf8p4$ajc$>
Message-ID: <>

On 11/11/2010 05:41, Alexander Belopolsky wrote:
> On Wed, Nov 10, 2010 at 6:10 PM, Ron Adam<rrr at>  wrote:
> ..
>>> On Nov 10, 2010, at 5:47 AM, Michael Foord wrote:
>>>> So it is obvious that we don't have a clearly stated policy for what
>>>> defines the public API of standard library modules.
>>>> How about making this explicit (either pep 8 or our developer docs):
>>> ..
>> The way I read Guido's email is that it is a situation dependent judgment
>> call for those cases that aren't clear.
>> I think what Michael is trying to say is for us to agree on some things so
>> we can go forward with a little more clarity.
> I don't understand why everyone seems to have accepted Michael's
> premise that "we don't have a clearly stated policy for what defines
> the public API of standard library modules."  We do have such a policy
> and it is well known (while the location in the reference manual may
> not be):

Ha. 14 paragraphs into the grammar reference on the import statement is 
perhaps not where developers would go to look for Python standard 
library development policy (and it *isn't* where they should go - 
standard library policy should be in pep 8 or our developer docs).

What you're saying is that the behaviour of "import *" *already* defines 
the public API at module level (but says nothing about class members or 
modules whose names begin with a leading underscore - those rules follow 
as a natural extension though).

By "clearly stated", I meant part of the python development 
documentation and / or standard library documentation. This is so that 
both users and developers are clear about the rules, and we have 
somewhere obvious to point people to. From this discussion it is clear 
that developers *don't* have a common understanding about what defines 
the public API of a standard library module. Suggestions as to what the 
rule is have included "only documented APIs are public" and "every 
member with a docstring is public"...

This largely comes from the heritage of the standard library which, as 
you point out, pre-dates the addition of __all__ / import * behaviour to 
the language. However many newer modules don't define __all__ either and 
several core developers have said they don't consider it a requirement 
that they do (as __all__ is a maintenance burden).

> """
> The public names defined by a module are determined by checking the
> module's namespace for a variable named __all__;
> if defined, it must
> be a sequence of strings which are names defined or imported by that
> module. The names given in __all__ are all considered public and are
> required to exist. If __all__ is not defined, the set of public names
> includes all names found in the module's namespace which do not begin
> with an underscore character ('_'). __all__ should contain the entire
> public API. It is intended to avoid accidentally exporting items that
> are not part of the API (such as library modules which were imported
> and used within the module).
> """  --<>
> The question that I had when I started this thread was not about a
> definition of "public API."  It was about a policy with respect to
> modules that precede the introduction of __all__ and the modern
> definition of public names.  (See r18692 "Two changes to
> from...import", and r23920 ' adding a definition of "public names"'.)

Well - restated your question is asking if adding a __all__ *changes* 
the public API of a standard library module. If it does then it has 
stronger backwards compatibility implications than if it doesn't.

So given a standard library module that doesn't define __all__, what is 
considered the public API?

> Is it OK to add __all__ to such modules that does not include all
> names not starting with an underscore?  Is it OK to then remove names
> that clearly were not intended to be public?

Given the rules I suggested, which are basically the same as the ones 
*you* are saying are already in place, if "import *" exports these names 
then you shouldn't change that behaviour without going through the 
deprecation process.
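
One way to see what would change is to list the names that "import *"
exports today but would no longer export once an __all__ is added.  The
sketch below assumes the module does not define __all__ yet;
`names_dropped_by_all` and the `legacy` module are hypothetical.

```python
import types


def names_dropped_by_all(module, proposed_all):
    """Names exported by "import *" now that a proposed __all__ would omit.

    Assumes the module currently has no __all__, so its star-import exports
    are all names without a leading underscore.
    """
    currently_exported = {name for name in vars(module) if not name.startswith("_")}
    return sorted(currently_exported - set(proposed_all))


# Hypothetical legacy module: one intended function, one stray constant,
# and one name that stands in for an incidentally imported module.
legacy = types.ModuleType("legacy")
legacy.run = lambda: None
legacy.rx_pattern = "placeholder"
legacy.sys = object()

dropped = names_dropped_by_all(legacy, ["run"])
```

Each name in `dropped` is a candidate for either the deprecation process
or a judgment that nobody could reasonably be relying on it.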

It would be clearer if these rules were stated either in pep 8 or our 
developer documentation of course.

All the best,


> Case in point: trace.rx_blank.  See also<>.

From kmtracey at  Thu Nov 11 13:22:00 2010
From: kmtracey at (Karen Tracey)
Date: Thu, 11 Nov 2010 07:22:00 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 11, 2010 at 6:51 AM, Michael Foord <fuzzyman at> wrote:

> On 10/11/2010 15:48, R. David Murray wrote:
>> I think Tres was referring to certain packages (which shall remain
>> nameless since I don't feel like googling to find one) whose
>> documentation recommends the 'from<me>  import *' methodology.
> Contenders include popular libraries like fabric and django:
That is one very specific module in Django that gets imported that way, it
is not a general pattern recommended by Django. For every other Django
module besides that one you will see specific imports being used in the doc.

From alexander.belopolsky at  Thu Nov 11 14:23:46 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Thu, 11 Nov 2010 08:23:46 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 11, 2010 at 7:01 AM, Michael Foord
<fuzzyman at> wrote:
>> Is it OK to add __all__ to such modules that does not include all
>> names not starting with an underscore?  Is it OK to then remove names
>> that clearly were not intended to be public?
> Given the rules I suggested, which are basically the same as the one *you*
> are saying are already in place, if "import *" exports these names then you
> shouldn't change that behaviour without going through the deprecation
> process.

I don't dispute that these are *the* rules, but my question was
whether it is ok to break them in specific cases such as
trace.rx_blank.  If not, how can we deprecate trace.rx_blank which is
a regex constant?

Another specific case is token.main.  See <>.

From tseaver at  Thu Nov 11 14:39:54 2010
From: tseaver at (Tres Seaver)
Date: Thu, 11 Nov 2010 08:39:54 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>
Message-ID: <ibgrn9$kvf$>

On 11/11/2010 08:23 AM, Alexander Belopolsky wrote:
> On Thu, Nov 11, 2010 at 7:01 AM, Michael Foord
> <fuzzyman at> wrote:
> ..
>>> Is it OK to add __all__ to such modules that does not include all
>>> names not starting with an underscore?  Is it OK to then remove names
>>> that clearly were not intended to be public?
>> Given the rules I suggested, which are basically the same as the one *you*
>> are saying are already in place, if "import *" exports these names then you
>> shouldn't change that behaviour without going through the deprecation
>> process.
> I don't dispute that these are *the* rules, but my question was
> whether it is ok to break them in specific cases such as
> trace.rx_blank.  If not, how can we deprecate trace.rx_blank which is
> a regex constant?
> Another specific case is token.main.  See <>.

I would argue that the narrative documentation for the module is
normative for defining "public API", trumping even a pre-existing
'__all__'.  Given that all non-private stdlib modules have such docs,
nobody should be relying on '__all__' as anything other than a convenience.

Therefore, in the absence of an '__all__', adding one which conforms to
the docs should not require deprecations, as the set of applications /
modules which both use the undocumented names *and* do so via 'import *'
can be safely deemed "too small to worry about".

- -- 
Tres Seaver          +1 540-429-0999          tseaver at
Palladion Software   "Excellence by Design"


From fdrake at  Thu Nov 11 14:43:44 2010
From: fdrake at (Fred Drake)
Date: Thu, 11 Nov 2010 08:43:44 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 11, 2010 at 8:23 AM, Alexander Belopolsky
<alexander.belopolsky at> wrote:
> I don't dispute that these are *the* rules, but my question was
> whether it is ok to break them in specific cases such as
> trace.rx_blank.  If not, how can we deprecate trace.rx_blank which is
> a regex constant?

Since trace is documented and rx_blank isn't covered, I think it's
pretty clear it was never intended as API.  I'd be fine with changing
the visibility of rx_blank, and see no need to change its name.

> Another specific case is token.main.  See <>.

Yep.  Again, it's clear that it's not API, and that's a documented module.

  -Fred

Fred L. Drake, Jr.    <fdrake at>
"A storm broke loose in my mind."  --Albert Einstein

From alexander.belopolsky at  Thu Nov 11 14:51:32 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Thu, 11 Nov 2010 08:51:32 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 11, 2010 at 8:43 AM, Fred Drake <fdrake at> wrote:
> Since trace is documented and rx_blank isn't covered, I think it's
> pretty clear it was never intended as API.  I'd be fine with changing
> the visibility of rx_blank, and see no need to change its name.

While I obviously agree with your conclusion, your logic is not
perfect because trace documentation is *much* younger than the module.
 How would you apply your reasoning to trace.find_strings()?  It is
undocumented, its name is misleading, but it is used in the wild
according to google code search.  I draw the line somewhere between
trace.rx_blank and trace.find_strings.

From fuzzyman at  Thu Nov 11 14:57:53 2010
From: fuzzyman at (Michael Foord)
Date: Thu, 11 Nov 2010 13:57:53 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <ibgrn9$kvf$>
References: <>	<>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>
Message-ID: <>

On 11/11/2010 13:39, Tres Seaver wrote:
> On 11/11/2010 08:23 AM, Alexander Belopolsky wrote:
>> On Thu, Nov 11, 2010 at 7:01 AM, Michael Foord
>> <fuzzyman at>  wrote:
>> ..
>>>> Is it OK to add __all__ to such modules that does not include all
>>>> names not starting with an underscore?  Is it OK to then remove names
>>>> that clearly were not intended to be public?
>>> Given the rules I suggested, which are basically the same as the one *you*
>>> are saying are already in place, if "import *" exports these names then you
>>> shouldn't change that behaviour without going through the deprecation
>>> process.
>> I don't dispute that these are *the* rules, but my question was
>> whether it is ok to break them in specific cases such as
>> trace.rx_blank.  If not, how can we deprecate trace.rx_blank which is
>> a regex constant?
>> Another specific case is token.main.  See<>.
> I would argue that the narrative documentation for the module is
> normative for defining "public API", trumping even a pre-existing
> '__all__'.  Given that all non-private stdlib modules have such docs,
> nobody should be relying on '__all__' as anything other than a convenience.
> Therefore, in the absence of an '__all__', adding one which conforms to
> the docs should not require deprecations, as the set of applications /
> modules which both use the undocumented names *and* do so via 'import *'
> can be safely deemed "too small to worry about".

I don't think this is generally sufficient given the not-infrequent 
occurrence of undocumented-but-used APIs in the standard library.

Another example is re.Scanner.

Making the rules explicit and following a deprecation process seems like 
a sensible way forward to me. That still leaves Alexander's question 
open; how to handle module level constants that can't easily be formally 
deprecated. One possibility is using something similar to the twisted 
technique for deprecating module constants. That would mean adding code 
to the standard library to do this.
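
A minimal sketch of what such code could look like, assuming a proxy
object stands in for the real module. The class name and the demo
module are hypothetical and this is not Twisted's actual API; rx_blank
is reused only as a familiar name:

```python
import types
import warnings


class DeprecatingModule(types.ModuleType):
    """Module proxy that warns when a deprecated constant is read."""

    def __init__(self, wrapped, deprecated):
        super().__init__(wrapped.__name__, wrapped.__doc__)
        self._wrapped = wrapped               # the real module
        self._deprecated = dict(deprecated)   # {name: warning message}

    def __getattr__(self, name):
        # Only reached when normal lookup on the proxy fails, which is
        # the case for every name that lives in the wrapped module.
        if name in self._deprecated:
            warnings.warn(self._deprecated[name], DeprecationWarning,
                          stacklevel=2)
        return getattr(self._wrapped, name)


# Hypothetical module with one constant that is being phased out.
demo = types.ModuleType("demo")
demo.rx_blank = r"^\s*(#.*)?$"
demo.find_strings = lambda: []

proxy = DeprecatingModule(demo, {"rx_blank": "demo.rx_blank is deprecated"})
```

In a real module the proxy would be installed with something like
sys.modules[__name__] = proxy at the end of the module body. Functions
do not need this; they can call warnings.warn directly.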

I would say that if it seems unlikely that the constants are used in the 
wild, and google code search confirms this, then it is fine to skip the 
deprecation process. If there are known uses we should at least document 
the deprecation (and alias) for a release before removing.

All the best,


> Tres.


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From fuzzyman at  Thu Nov 11 15:00:06 2010
From: fuzzyman at (Michael Foord)
Date: Thu, 11 Nov 2010 14:00:06 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<>
Message-ID: <>

On 11/11/2010 13:51, Alexander Belopolsky wrote:
> On Thu, Nov 11, 2010 at 8:43 AM, Fred Drake<fdrake at>  wrote:
> ..
>> Since trace is documented and rx_blank isn't covered, I think it's
>> pretty clear it was never intended as API.  I'd be fine with changing
>> the visibility of rx_blank, and see no need to change its name.
> While I obviously agree with your conclusion, your logic is not
> perfect because trace documentation is *much* younger than the module.
>   How would you apply your reasoning to trace.find_strings()?  It is
> undocumented, its name is misleading, but it is used in the wild
> according to google code search.  I draw the line somewhere between
> trace.rx_blank and trace.find_strings.
I agree. Known / likely usage has to be the determining factor for 
poorly named but undocumented members.

For functions though formal deprecation is easy - so it should only be 
an *issue* for constants. (And there the issue is not whether we can 
remove but how we do it.)




From ncoghlan at  Thu Nov 11 15:02:58 2010
From: ncoghlan at (Nick Coghlan)
Date: Fri, 12 Nov 2010 00:02:58 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <ibgrn9$kvf$>
References: <>
Message-ID: <>

On Thu, Nov 11, 2010 at 11:39 PM, Tres Seaver <tseaver at> wrote:
> I would argue that the narrative documentation for the module is
> normative for defining "public API", trumping even a pre-existing
> '__all__'.  Given that all non-private stdlib modules have such docs,
> nobody should be relying on '__all__' as anything other than a convenience.
> Therefore, in the absence of an '__all__', adding one which conforms to
> the docs should not require deprecations, as the set of applications /
> modules which both use the undocumented names *and* do so via 'import *'
> can be safely deemed "too small to worry about".

Except, as noted earlier in the thread, many Python programmers (and I
count myself amongst this group) often use dir() and help() to find
out what a module has available, and only resort to the written
documentation if we get stuck.

My personal opinion is that we should be trying to get the standard
library to the point where __all__ definitions are unnecessary - if a
name isn't in __all__, it should start with an underscore (and if that
is true, then the __all__ definition becomes effectively redundant).

That way, all sources of information (docs, dir(), help(), import *)
give the same answer as to what constitutes the public API.
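
A small sketch of how that consistency could be checked, assuming the
rule is "every name without a leading underscore is listed in
__all__". The helper name and the demo module are made up for the
example:

```python
import types


def undeclared_public_names(module):
    """Return non-underscore names that the module's __all__ does not list."""
    declared = set(getattr(module, "__all__", ()))
    visible = {name for name in vars(module) if not name.startswith("_")}
    return sorted(visible - declared)


# Hypothetical module that leaks a helper constant.
demo = types.ModuleType("demo")
demo.__all__ = ["run"]
demo.run = lambda: None
demo.rx_blank = r"^\s*$"
demo._cache = {}

leaks = undeclared_public_names(demo)
print(leaks)  # ['rx_blank']
```

An empty result means dir(), help() and import * would all agree on
the public names of that module.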


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From lukasz at  Thu Nov 11 15:45:47 2010
From: lukasz at (=?UTF-8?B?xYF1a2FzeiBMYW5nYQ==?=)
Date: Thu, 11 Nov 2010 15:45:47 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>
Message-ID: <>

Am 08.11.2010 23:07, schrieb Raymond Hettinger:
> Some sense needs to be applied to the decision. Google's code search 
> is great for showing how people actually have used a module in real 
> world code. If that shows that people are accessing and/or changing an 
> attribute, it probably needs to remain exposed. In the absence of a 
> code search, good guesses can be made about what someone might 
> reasonably and usefully be accessing (i.e. glob0 isn't likely).

Danger, Will Robinson!

I just tried to use that to determine if I could consider moving a 
module-wide constant in configparser to the parser instance (to enable 

Search on returned me four incompatible result sets 
within 30 minutes. One had only two entries whereas another had 7 pages 
of results.

Search using found 3 pages of results 
different than the search on Google Code. The best part is that 
codesearch found some occurrences on Google Code which Google Code's own 
search didn't.

None of them returned results whereas search on found occurrences only on SourceForge.

The idea to use a code search engine is ingenious but the current tools 
are not yet reliable enough for the task.

> For example, when the pprint rewrite is finally ready, if there is an incompatible API change, I expect that a new clean class will be offered, but that the old will be left in-place so that tons of existing code won't break).
Unrelated but that's the way I'm doing it.

From barry at  Thu Nov 11 16:01:25 2010
From: barry at (Barry Warsaw)
Date: Thu, 11 Nov 2010 10:01:25 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <20101111100125.2ca443c1@mission>

On Nov 11, 2010, at 12:41 AM, Alexander Belopolsky wrote:

>Is it OK to add __all__ to such modules that does not include all
>names not starting with an underscore?  Is it OK to then remove names
>that clearly were not intended to be public?

I would say in general, yes.  It's a good small modernization and stdlib
improvement.  However, this shouldn't be done as a bug fix to a stable
release, and care must be taken to consider backward compatibility.  IOW, if
you really think it's a name that is not used publicly, or is usually only
imported explicitly, then I think it's fine leaving it out of __all__.  It's
not a difficult change to work around.



From barry at  Thu Nov 11 16:05:16 2010
From: barry at (Barry Warsaw)
Date: Thu, 11 Nov 2010 10:05:16 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <20101111100516.6e90aa41@mission>

On Nov 12, 2010, at 12:02 AM, Nick Coghlan wrote:

>My personal opinion is that we should be trying to get the standard
>library to the point where __all__ definitions are unnecessary - if a
>name isn't in __all__, it should start with an underscore (and if that
>is true, then the __all__ definition becomes effectively redundant).

Agreed, though I wouldn't *remove* __all__'s, I would establish a convention
where they can be generated programmatically.  Keeping __all__ in sync with
the code is a PITA.  It screams for automation.


From lukasz at  Thu Nov 11 16:17:07 2010
From: lukasz at (=?UTF-8?B?xYF1a2FzeiBMYW5nYQ==?=)
Date: Thu, 11 Nov 2010 16:17:07 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <20101111100516.6e90aa41@mission>
References: <>	<>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>
Message-ID: <>

Am 11.11.2010 16:05, schrieb Barry Warsaw:
> Agreed, though I wouldn't *remove* __all__'s, I would establish a 
> convention
> where they can be generated programmatically.  Keeping __all__ in sync with
> the code is a PITA.  It screams for automation.

You mean runtime automation, e.g. creating __all__ on the fly omitting 
underscored names?

Best regards,
Łukasz Langa

From fuzzyman at  Thu Nov 11 16:18:40 2010
From: fuzzyman at (Michael Foord)
Date: Thu, 11 Nov 2010 15:18:40 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>
Message-ID: <>

On 11/11/2010 15:17, Łukasz Langa wrote:
> Am 11.11.2010 16:05, schrieb Barry Warsaw:
>> Agreed, though I wouldn't *remove* __all__'s, I would establish a 
>> convention
>> where they can be generated programmatically.  Keeping __all__ in 
>> sync with
>> the code is a PITA.  It screams for automation.
> You mean runtime automation, e.g. creating __all__ on the fly omitting 
> underscored names?
Writing code to generate a __all__ that duplicates the default behaviour 
seems redundant to me.




From solipsis at  Thu Nov 11 16:32:31 2010
From: solipsis at (Antoine Pitrou)
Date: Thu, 11 Nov 2010 16:32:31 +0100
Subject: [Python-Dev] Breaking undocumented API
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

On Thu, 11 Nov 2010 15:18:40 +0000
Michael Foord <fuzzyman at> wrote:
> On 11/11/2010 15:17, Łukasz Langa wrote:
> > Am 11.11.2010 16:05, schrieb Barry Warsaw:
> >> Agreed, though I wouldn't *remove* __all__'s, I would establish a 
> >> convention
> >> where they can be generated programmatically.  Keeping __all__ in 
> >> sync with
> >> the code is a PITA.  It screams for automation.
> >
> > You mean runtime automation, e.g. creating __all__ on the fly omitting 
> > underscored names?
> >
> Writing code to generate a __all__ that duplicates the default behaviour 
> seems redundant to me.

Agreed with Michael.  __all__ is useful mostly when you don't adhere to
the convention that private APIs should have a leading underscore.



From ocean-city at  Thu Nov 11 17:07:28 2010
From: ocean-city at (Hirokazu Yamamoto)
Date: Fri, 12 Nov 2010 01:07:28 +0900
Subject: [Python-Dev] Removal of Win32 ANSI API
Message-ID: <>

Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA)
and only use Win32 WIDE API (ie: GetFileAttributesW)?
Mainly in posixmodule.c.
I think we can simplify the code hugely. (This means dropping bytes
support for os.stat etc. on Windows.)

# I recently did it for winsound.PlaySound with MvL's approval

Thank you.

From mail at  Thu Nov 11 17:10:35 2010
From: mail at (Tim Golden)
Date: Thu, 11 Nov 2010 16:10:35 +0000
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/11/2010 16:07, Hirokazu Yamamoto wrote:
> Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA)
> and only use Win32 WIDE API (ie: GetFileAttributesW)?
> Mainly in posixmodule.c.
> I think we can simplify the code hugely. (This means dropping bytes
> support for os.stat etc on windows)
> # I recently did it for winsound.PlaySound with MvL's approval

+1 from me


From eckhardt at  Thu Nov 11 17:18:08 2010
From: eckhardt at (Ulrich Eckhardt)
Date: Thu, 11 Nov 2010 17:18:08 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
Message-ID: <>

On Thursday 11 November 2010, Hirokazu Yamamoto wrote:
> Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA)
> and only use Win32 WIDE API (ie: GetFileAttributesW)?
> Mainly in posixmodule.c.
> I think we can simplify the code hugely.


MS Windows variants that only support the ANSI API (win9x) are officially 
unsupported since 2.5 or 2.6. Further, this also eases porting to MS Windows 
CE, which I'd still like to see one day.

> (This means dropping bytes support for os.stat etc on windows)

I disagree that not using the ANSI win32 API means dropping byte support for 
os.stat. I'd rather say that it means converting bytes at the earliest 
possible time and only using unicode internally. But I'm guessing a bit here, 
I haven't looked at the code for a while.

> # I recently did it for winsound.PlaySound with MvL's approval

Interesting, is there a ticket associated with this? Also, was that on Python 3 
or 2? Which commits?


Sator Laser GmbH, Fangdieckstraße 75a, 22547 Hamburg, Deutschland
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932


From solipsis at  Thu Nov 11 17:43:35 2010
From: solipsis at (Antoine Pitrou)
Date: Thu, 11 Nov 2010 17:43:35 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
References: <>
Message-ID: <>

On Thu, 11 Nov 2010 16:10:35 +0000
Tim Golden <mail at> wrote:
> On 11/11/2010 16:07, Hirokazu Yamamoto wrote:
> > Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA)
> > and only use Win32 WIDE API (ie: GetFileAttributesW)?
> > Mainly in posixmodule.c.
> > I think we can simplify the code hugely. (This means dropping bytes
> > support for os.stat etc on windows)
> >
> > # I recently did it for winsound.PlaySound with MvL's approval
> +1 from me

How do you support cross-platform code using bytes filenames?
IIRC, it has already been argued that it was an important feature. Many
filesystem-related utilities might prefer to handle filenames in bytes
form.

("winsound" is a Windows-specific module so that wasn't a concern



From merwok at  Thu Nov 11 18:38:11 2010
From: merwok at (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Thu, 11 Nov 2010 18:38:11 +0100
Subject: [Python-Dev] [Python-checkins] r86348 - in
 python/branches/py3k/Lib: test/ xml/etree/
In-Reply-To: <20101109234842.GA1068@rubuntu>
References: <>
	<> <20101109234842.GA1068@rubuntu>
Message-ID: <>

>> Shouldn't this include an entry in NEWS and maybe in ACKS?
> It was a very simple bug fix (caused by an initial oversight), so I
> did not add NEWS/ACKS. For features, larger fixes or complete patches,
> I then add NEWS and ACKS as appropriate.

Thanks for the reply.  Now I'm unsure about the rules for adding NEWS
entries: some bugs are important but have a very simple fix (see
#1718574 for an example).  I guess I'll just always add an entry :)

Brett, maybe this is something to cover in the dev docs.

make-patchcheck-ly yours

From brett at  Thu Nov 11 18:56:11 2010
From: brett at (Brett Cannon)
Date: Thu, 11 Nov 2010 09:56:11 -0800
Subject: [Python-Dev] [Python-checkins] r86348 - in
 python/branches/py3k/Lib: test/ xml/etree/
In-Reply-To: <>
References: <>
	<20101109234842.GA1068@rubuntu> <>
Message-ID: <>

On Thu, Nov 11, 2010 at 09:38, Éric Araujo <merwok at> wrote:
>>> Shouldn't this include an entry in NEWS and maybe in ACKS?
>> It was a very simple bug fix (caused by an initial oversight), so I
>> did not add NEWS/ACKS. For features, larger fixes or complete patches,
>> I then add NEWS and ACKS as appropriate.
> Thanks for the reply.  Now I'm unsure about the rules for adding NEWS
> entries: some bugs are important but have a very simple fix (see
> #1718574 for an example).  I guess I'll just always add an entry :)
> Brett, maybe this is something to cover in the dev docs.

I just follow Guido's own personal rule: if the fix required thought
they should go into Misc/ACKS.

From alexander.belopolsky at  Thu Nov 11 19:01:10 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Thu, 11 Nov 2010 13:01:10 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

2010/11/11 Michael Foord <fuzzyman at>:
>> You mean runtime automation, e.g. creating __all__ on the fly omitting
>> underscored names?
> Writing code to generate a __all__ that duplicates the default behaviour
> seems redundant to me.

FWIW, I like having __all__ at the top of the module.  It feels like a
table of contents at the start of a chapter.  In some cases it may
also serve as an optimization when len(__all__) is much smaller than
len(__dict__).  I also don't like _ prefix to become an exclusive
means to express privateness.

I think the current definition of "public names" is a good one and
just needs to be made more visible in the docs.  If the module defines
__all__, that should be the ultimate answer to what is public in that
module.   (Users should learn to use help(module) instead of
dir(module) for API discovery.)   If __all__ is not defined in the
module, I think it is good to introduce it after a careful review of
what it should contain.  And __all__ should never contain names that
start with _.

From merwok at  Thu Nov 11 19:10:43 2010
From: merwok at (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Thu, 11 Nov 2010 19:10:43 +0100
Subject: [Python-Dev] [Python-checkins] r86348 - in
 python/branches/py3k/Lib: test/ xml/etree/
In-Reply-To: <>
References: <>
	<> <20101109234842.GA1068@rubuntu>
Message-ID: <>

> I just follow Guido's own personal rule: if the fix required thought
> they should go into Misc/ACKS.

Okay.  Same rule for NEWS?

From brett at  Thu Nov 11 19:16:04 2010
From: brett at (Brett Cannon)
Date: Thu, 11 Nov 2010 10:16:04 -0800
Subject: [Python-Dev] [Python-checkins] r86348 - in
 python/branches/py3k/Lib: test/ xml/etree/
In-Reply-To: <>
References: <>
	<20101109234842.GA1068@rubuntu> <>
Message-ID: <>

On Thu, Nov 11, 2010 at 10:10, Éric Araujo <merwok at> wrote:
>> I just follow Guido's own personal rule: if the fix required thought
>> they should go into Misc/ACKS.
> Okay.  Same rule for NEWS?

I do a NEWS entry if a bug was fixed or semantics changed/added for
anything public (e.g., I don't do an entry for every little
clarification in the docs or new tests fixed or written).

From steve at  Thu Nov 11 19:16:16 2010
From: steve at (Steven D'Aprano)
Date: Fri, 12 Nov 2010 05:16:16 +1100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>
Message-ID: <>

Nick Coghlan wrote:

> My personal opinion is that we should be trying to get the standard
> library to the point where __all__ definitions are unnecessary - if a
> name isn't in __all__, it should start with an underscore (and if that
> is true, then the __all__ definition becomes effectively redundant).

You don't *need* to define __all__ -- if you don't, import * will import 
everything that doesn't start with a leading underscore. __all__ is only 
useful when you want more control over what is or isn't imported. If you 
don't need that control, just don't define __all__, and the problem is 

> That way, all sources of information (docs, dir(), help(), import *)
> give the same answer as to what constitutes the public API.

I disagree with the underlying assumption that import * need necessarily 
import the entire public API. That's not how I use it in my modules, and 
the option should be available to std library modules as well.

When I create a module, I distinguish between three categories of functions:

* private, which start with an underscore;
* the core public API, which is listed in __all__; and
* support/helper functions, which are not part of the core functionality 
of the module but are public.

If you import * you will get just the core functions. If you want the 
support functions, you need to use the fully qualified, or 
otherwise import them yourself.
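
To make the three categories concrete, here is a small runnable
sketch. The module name shapes_demo and its functions are invented for
the example; it builds a throwaway module in memory and shows what a
star import picks up when __all__ lists only the core API:

```python
import sys
import types

SOURCE = '''
__all__ = ["area"]

def area(w, h):          # core public API, listed in __all__
    return w * h

def describe(w, h):      # public helper, deliberately left out of __all__
    return "%d x %d" % (w, h)

def _check(w, h):        # private
    return w > 0 and h > 0
'''

# Build a throwaway module and register it so it can be imported.
shapes = types.ModuleType("shapes_demo")
exec(SOURCE, shapes.__dict__)
sys.modules["shapes_demo"] = shapes

star_ns = {}
exec("from shapes_demo import *", star_ns)

explicit_ns = {}
exec("from shapes_demo import describe", explicit_ns)

print(sorted(n for n in star_ns if not n.startswith("__")))  # ['area']
```

The helper stays reachable through an explicit import, which is the
"second class" public status described above.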

This division of public functions into first and second class API 
functions is a deliberate design choice on my part. I expect the core 
functionality to be fully documented. Helper and support functions may 
not be -- there should be some docs, but doing so is a lower priority.

The support functions are public, and available for use, if you go 
looking for them, but I neither encourage nor discourage users from 
doing so.

I don't see any reason that the standard library should not be permitted 
to use the same convention.

Another couple of objections to getting rid of __all__:

If you're proxying modules or built-ins, you may not be able to use a 
_private name, but you may not want import * to pick up your proxies.

I find it annoying to see this:

import module as _module

(instead of import module and merely leaving module out of __all__)

I accept that some standard library authors may choose this convention, 
but I don't want to see it become mandatory.


From alexander.belopolsky at  Thu Nov 11 19:40:36 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Thu, 11 Nov 2010 13:40:36 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 11, 2010 at 7:01 AM, Michael Foord
<fuzzyman at> wrote:
>> I don't understand why everyone seem to have accepted Michael's
>> premise that "we don't have a clearly stated policy for what defines
>> the public API of standard library modules."  We do have such a policy
>> and it is well known (while the location in the reference manual may
>> not be):
> Ha. 14 paragraphs into the grammar reference on the import statement is
> perhaps not where developers would go to look for Python standard library
> development policy..

Very true.  To make it slightly more visible, any objections to the
following patch? (It adds "public names (in module globals)" linking
to that 14-th paragraph in the index.)

Index: Doc/reference/simple_stmts.rst
--- Doc/reference/simple_stmts.rst	(revision 86409)
+++ Doc/reference/simple_stmts.rst	(working copy)
@@ -794,6 +794,7 @@
 namespace of the :keyword:`import` statement..

 .. index:: single: __all__ (optional module attribute)
+.. index:: public names (in module globals)

 The *public names* defined by a module are determined by checking the module's
 namespace for a variable named ``__all__``; if defined, it must be a sequence of

From solipsis at  Thu Nov 11 19:47:34 2010
From: solipsis at (Antoine Pitrou)
Date: Thu, 11 Nov 2010 19:47:34 +0100
Subject: [Python-Dev] Breaking undocumented API
References: <>
Message-ID: <>

On Thu, 11 Nov 2010 13:40:36 -0500
Alexander Belopolsky <alexander.belopolsky at> wrote:
> On Thu, Nov 11, 2010 at 7:01 AM, Michael Foord
> <fuzzyman at> wrote:
> ..
> >> I don't understand why everyone seem to have accepted Michael's
> >> premise that "we don't have a clearly stated policy for what defines
> >> the public API of standard library modules."  We do have such a policy
> >> and it is well known (while the location in the reference manual may
> >> not be):
> >
> > Ha. 14 paragraphs into the grammar reference on the import statement is
> > perhaps not where developers would go to look for Python standard library
> > development policy..
> Very true.  To make it slightly more visible, any objections to the
> following patch? (It adds "public names (in module globals)" linking
> to that 14-th paragraph in the index.)

I think what Michael meant is that the language grammar reference is not
(and shouldn't be) the authority on stdlib development policy. To which
I would agree.



From victor.stinner at  Thu Nov 11 20:26:24 2010
From: victor.stinner at (Victor Stinner)
Date: Thu, 11 Nov 2010 20:26:24 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
Message-ID: <>

On Thursday 11 November 2010 17:07:28 Hirokazu Yamamoto wrote:
> Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA)
> and only use Win32 WIDE API (ie: GetFileAttributesW)?
> Mainly in posixmodule.c.

Even if I hate the MBCS encoding, because it replaces undecodable characters 
by similar glyphs by default, I'm not certain that it is a good idea to drop 
the bytes API. Can it be a problem to port programs from Python2 to Python3? 
Do major Python2 programs/libraries rely on the bytes API?

> I think we can simplify the code hugely. (This means dropping bytes
> support for os.stat etc on windows)

Sure, it will divide the number of lines, of the code specific to Windows, by 
two.


From martin at  Thu Nov 11 20:44:52 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 11 Nov 2010 20:44:52 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>	<>
Message-ID: <>

> How do you support cross-platform code using bytes filenames?
> IIRC, it has already been argued that it was an important feature. Many
> filesystem-related utilities might prefer to handle filenames in bytes
> form.

It would be a policy decision. However, I think it is hearsay that
filesystem-related utilities might prefer byte file names. On Windows,
some files are inaccessible if you constrain yourself to byte filenames,
so once people learn about this limitation, I expect them to switch to
Unicode filenames on Windows - for the same reason they use byte
filenames on Unix (i.e. to be able to access all files correctly).


From martin at  Thu Nov 11 20:50:35 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 11 Nov 2010 20:50:35 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
Message-ID: <>

> Even if I hate the MBCS encoding, because it replaces undecodable characters 
> by similar glyphs by default, I'm not certain that it is a good idea to drop 
> the bytes API. Can it be a problem to port programs from Python2 to Python3? 
> Do major Python2 programs/libraries rely on the bytes API?

I don't actually know for a fact, but I expect that the answer is "no".

The question is: where do file names typically come from? My guess
is that they come from
a) hard-coded strings in the source code
b) command line arguments/environment variables
c) directory listings
[of course, there are other ways, like GUI input, getcwd(), etc]

In case a), you have filenames such as ".", e.g. as a parameter to
listdir or walk. These will typically be regular strings in Python 2,
which become Unicode strings in 3. You would actively need to put b""
prefixes into the code.

In case b), they will be Unicode strings in Python 3.

In case c), they will be Unicode strings if the argument is a Unicode
string. So by induction, file names will be typically unicode. The
exception will be libraries/applications which make deliberate attempts
to get byte-oriented file names.
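
Case c) is easy to see with os.listdir, which returns names of the
same type as its argument. A short sketch in Python 3, using a
temporary directory and a made-up file name:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    # Create one file so the listing is not empty.
    open(os.path.join(directory, "example.txt"), "w").close()

    text_names = os.listdir(directory)                # str in, str out
    byte_names = os.listdir(os.fsencode(directory))   # bytes in, bytes out

print(text_names, byte_names)
```

Code only ends up with bytes names if it passed bytes in to begin
with.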


From solipsis at  Thu Nov 11 21:02:43 2010
From: solipsis at (Antoine Pitrou)
Date: Thu, 11 Nov 2010 21:02:43 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Thu, 11 Nov 2010 20:44:52 +0100
"Martin v. L?wis" <martin at> wrote:
> > How do you support cross-platform code using bytes filenames?
> > IIRC, it has already been argued that it was an important feature. Many
> > filesystem-related utilities might prefer to handle filenames in bytes
> > form.
> It would be a policy decision. However, I think it is hearsay that
> filesystem-related utilities might prefer byte file names.

One possible situation is when you receive filenames in bytes form from
an external API or tool (or even the contents of a file). If you don't
know the encoding, keeping the bytes form is obviously recommended.

I don't know how often this happens.



From eric at  Thu Nov 11 21:44:23 2010
From: eric at (Eric Smith)
Date: Thu, 11 Nov 2010 15:44:23 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <ibeiv9$qr4$>
References: <>	<>	<>	<>	<ibc52t$8lf$>	<>	<>
Message-ID: <>

On 11/10/2010 11:58 AM, Tres Seaver wrote:
> On 11/09/2010 11:12 PM, Stephen J. Turnbull wrote:
>> Nick Coghlan writes:
>>   >  >  Module writers who compound the error by expecting to be imported
>>   >  >  this way, thereby bogarting the global namespace for their own
>>   >  >  purposes, should be fish-slapped. ;)
>>   >
>>   >  Be prepared to fish-slap all of python-dev then - we use precisely
>>   >  this technique to support optional acceleration modules. The pure
>>   >  Python versions of pairs like profile/_profile and heapq/_heapq
>>   >  include a try/except block at the end that does the equivalent of:
>>   >
>>   >    try:
>>   >      from _accelerated import * # Allow accelerated overrides
>>   >    except ImportError:
>>   >      pass # Use pure Python versions
>> But these identifiers will appear at the module level, not global, no?
>> Otherwise this technique couldn't be used.  I don't really understand
>> what Tres is talking about when he writes "modules that expect to be
>> imported this way".  The *imported* module shouldn't care, no?  This
>> is an issue for the *importing* code to deal with.
> Right -- "private" star imports aren't the issue for me, because the
> same user who creates them is responsible for the other end of the
> stick.  I was ranting about library authors who document star imports as
> the expected usage pattern for their external users.
> Note that I still wouldn't use star imports in the "private
> acceleration" case myself.  I would prefer a pattern like:
> ----------------------- $<  -----------------------------
> #
> # Pure python API implementation
> def foo(spat, blarg):
>      ...
> def bar(qux):
>      ...
> # Replace with accelerated C implementation
> try:
>      import _spam
> except ImportError:
>      pass # accelerated version not available
> else:
>      foo =
>      bar =
> ----------------------- $<  -----------------------------
> This explicit name remapping catches unintentional errors (e.g., _spam
> renames a method) better than the star import.

But then you're saying that all implementations of _spam have to support 
the same API. What if CPython's _spam has foo, bar, and baz, but 
Jython's only has foo and bar, and IronPython's only has baz? Without 
getting into special casing or lots of try/except blocks on individual 
names, I think import * is the best way to go.
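A middle ground between explicit remapping and import * can be sketched in Python. This is only an illustration under stated assumptions: install_accelerated and the fake _spam object are hypothetical names, not anything in the standard library. The helper copies whichever of the listed names the accelerator actually provides and leaves the pure Python versions in place otherwise.

```python
import types


def install_accelerated(namespace, accelerator, names):
    """Override entries of `namespace` with same-named attributes of
    `accelerator`, skipping names the accelerator does not provide.

    Returns the list of names that were actually replaced.
    """
    replaced = []
    for name in names:
        try:
            namespace[name] = getattr(accelerator, name)
        except AttributeError:
            continue  # keep the pure Python version
        replaced.append(name)
    return replaced


# Pure Python implementations (stand-ins for a real module's API).
def foo():
    return "pure foo"


def bar():
    return "pure bar"


# Hypothetical accelerator that only implements foo, as might happen on
# an implementation that ships a partial _spam.
_fake_spam = types.SimpleNamespace(foo=lambda: "fast foo")

api = {"foo": foo, "bar": bar}
replaced = install_accelerated(api, _fake_spam, ["foo", "bar"])
```

In a real module the namespace argument would be globals(), and the explicit list of names keeps the set of overridable functions visible in one place.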


From ncoghlan at  Thu Nov 11 23:01:32 2010
From: ncoghlan at (Nick Coghlan)
Date: Fri, 12 Nov 2010 08:01:32 +1000
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 12, 2010 at 5:26 AM, Victor Stinner
<victor.stinner at> wrote:
> On Thursday 11 November 2010 17:07:28 Hirokazu Yamamoto wrote:
>> Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA)
>> and only use Win32 WIDE API (ie: GetFileAttributesW)?
>> Mainly in posixmodule.c.
> Even if I hate the MBCS encoding, because it replaces undecodable characters
> by similar glyphs by default, I'm not certain that it is a good idea to drop
> the bytes API. Can it be a problem to port programs from Python2 to Python3?
> Do major Python2 programs/libraries rely on the bytes API?
>> I think we can simplify the code hugely. (This means dropping bytes
>> support for os.stat etc on windows)
> Sure, it will divide the number of lines, of the code specific to Windows, by
> two.

Can we get most of the code cleanup benefit without the backwards
compatibility risk by doing the decode from 'mbcs' on our side of the
fence?

That is, have code that was the C equivalent of:

arg_is_bytes = not isinstance(arg, str)
if arg_is_bytes:
  val = _decode_mbcs(arg)
  # Decoding error checking here
else:
  val = arg
# Common processing using WIDE API
if arg_is_bytes:
  result = _encode_mbcs(wide_result)
  # Encoding error checking here
else:
  result = wide_result
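
The same shape can be sketched as runnable Python. This is a minimal illustration, not the proposed C code: accept_bytes_paths and normalize are hypothetical names, and os.fsdecode/os.fsencode stand in for the _decode_mbcs/_encode_mbcs helpers above (they use the filesystem encoding, which is not necessarily 'mbcs').

```python
import functools
import os


def accept_bytes_paths(wide_func):
    """Wrap a str-only path function so that it also accepts bytes.

    Bytes arguments are decoded on the way in and the result is encoded
    on the way out, so the wrapped function only ever sees str.
    """
    @functools.wraps(wide_func)
    def wrapper(path):
        arg_is_bytes = isinstance(path, bytes)
        val = os.fsdecode(path) if arg_is_bytes else path
        wide_result = wide_func(val)  # common processing on str
        return os.fsencode(wide_result) if arg_is_bytes else wide_result
    return wrapper


@accept_bytes_paths
def normalize(path):
    # Stand-in for a function implemented only on top of the wide API.
    return path.replace("\\", "/")
```

Calling normalize with str returns str, and calling it with bytes returns bytes, which is the compatibility property the pseudocode is after.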


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Thu Nov 11 23:15:36 2010
From: ncoghlan at (Nick Coghlan)
Date: Fri, 12 Nov 2010 08:15:36 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 12, 2010 at 4:16 AM, Steven D'Aprano <steve at> wrote:
> Another couple of objections to getting rid of __all__:
> If you're proxying modules or built-ins, you may not be able to use a
> _private name, but you may not want import * to pick up your proxies.
> I find it annoying to see this:
> import module as _module
> _module.func()
> (instead of import module and merely leaving module out of __all__)

That gets us back to dir() and help() giving the wrong impression of
the module's public API though.

The issue I have is that the current policy (public APIs may or may
not be in __all__, private APIs may or may not be prefixed by a leading
underscore) makes it impossible to reliably extract a module's public
API programmatically.

If we instead adopt the explicit policy that private APIs are:
- imported modules (with the exception of os.path)
- any names starting with a leading underscore

Then we get the 3 API tiers you describe: core public API in __all__,
other public functions and globals without leading underscores,
private API with leading underscores (or imported modules).

We could even add two additional functions to the inspect module (e.g.
getpublicnames() and getimportstarnames()) which applied the relevant
filtering rules.
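
A rough sketch of what those two functions might look like, in Python. Neither function exists in the inspect module; the names come from the suggestion above and the filtering rules are an assumption based on the policy as stated. The os.path special case is left out for brevity.

```python
import inspect
import os
import types


def getpublicnames(module):
    """Names that are public under the proposed policy: no leading
    underscore and not an imported module."""
    return sorted(
        name
        for name, value in vars(module).items()
        if not name.startswith("_") and not inspect.ismodule(value)
    )


def getimportstarnames(module):
    """Names that `from module import *` would bind."""
    if hasattr(module, "__all__"):
        return list(module.__all__)
    return sorted(name for name in vars(module) if not name.startswith("_"))


# Small demonstration module covering all three tiers.
demo = types.ModuleType("demo")
demo.os = os                    # imported module: private
demo.core = lambda: None        # core public API, listed in __all__
demo.extra = lambda: None       # public, but not part of import *
demo._helper = lambda: None     # private by underscore
demo.__all__ = ["core"]
```

For the demo module, getpublicnames reports core and extra, while getimportstarnames reports only core.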


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From greg.ewing at  Thu Nov 11 23:24:49 2010
From: greg.ewing at (Greg Ewing)
Date: Fri, 12 Nov 2010 11:24:49 +1300
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

Nick Coghlan wrote:

> My personal opinion is that we should be trying to get the standard
> library to the point where __all__ definitions are unnecessary - if a
> name isn't in __all__, it should start with an underscore (and if that
> is true, then the __all__ definition becomes effectively redundant).

What about names imported from other modules that are used by
the module, but not intended for re-export? How would you
prevent them from turning up in help() etc. without using
__all__?


From nd at  Fri Nov 12 08:51:58 2010
From: nd at (=?iso-8859-1?q?Andr=E9_Malo?=)
Date: Fri, 12 Nov 2010 08:51:58 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
Message-ID: <>

On Thursday 11 November 2010 20:50:35 Martin v. Löwis wrote:
> > Even if I hate the MBCS encoding, because it replaces undecodable
> > characters by similar glyphs by default, I'm not certain that it is a
> > good idea to drop the bytes API. Can it be a problem to port programs
> > from Python2 to Python3? Do major Python2 programs/libraries rely on the
> > bytes API?
> I don't actually know for a fact, but I expect that the answer is "no".
> The questions is: where do file names typically come from? My guess
> is that they come from
> a) hard-coded strings in the source code
> b) command line arguments/environment variables


> In case b), they will be Unicode strings in Python 3.

But not necessarily with unicode semantics if I get the discussions about the 
environment topic right.


Additionally:
d) Over a socket (like the HTTP protocol) -> Bytes.


From p.f.moore at  Fri Nov 12 09:44:03 2010
From: p.f.moore at (Paul Moore)
Date: Fri, 12 Nov 2010 08:44:03 +0000
Subject: [Python-Dev] Issues 9931 and 9055 - test_ttk_guionly and buildbot
	run as a service
Message-ID: <>

My buildbot has been failing for some time because of these 2 issues,
both related to the fact that tests are hanging when run as a service
(and hence have no display to open GUI elements on). Both issues have
patches, and as far as I am aware, the patches fix the issues
reasonably well. What can I do to move these 2 issues forwards? As
things stand, my buildbot is not providing a lot of value on the 3.x
branch :-(


From martin at  Fri Nov 12 09:51:19 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 12 Nov 2010 09:51:19 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

> Additionally:
> d) Over a socket (like the HTTP protocol) -> Bytes.

Sure. However, you can't really expect that the bytes you receive
over the socket are a meaningful filename on your local Windows
installation. So it would be a bug in the application to not decode
the bytes that you receive before using them as a file name.
In a well-specified network protocol, you would know the encoding of
the bytes; IETF recommends using UTF-8 for all new protocols. Using
a UTF-8 string as a filename on Windows will create mojibake.
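
The mojibake effect is easy to reproduce in Python. The sketch below assumes cp1252 as the ANSI code page, which is only one possible configuration, and the file name is made up for illustration.

```python
# A file name as it might arrive over a UTF-8 based protocol.
name = "r\u00e9sum\u00e9.txt"          # "résumé.txt"
wire_bytes = name.encode("utf-8")

# Interpreting the UTF-8 bytes in the ANSI code page (assumed cp1252
# here) turns each two-byte sequence into two unrelated characters.
misread = wire_bytes.decode("cp1252")

# Decoding with the protocol's declared encoding recovers the name.
correct = wire_bytes.decode("utf-8")
```

Here misread contains "Ã©" wherever the original had "é", while correct equals the original name.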


From martin at  Fri Nov 12 10:29:31 2010
From: martin at (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 12 Nov 2010 10:29:31 +0100
Subject: [Python-Dev] buildbot master update
Message-ID: <>

As you may have noticed: I updated the buildbot master to release 0.8.2.
If you notice any problems, please post them here.

Slave operators can upgrade their installations at their own pace;
buildbot is highly backwards compatible. As a recommendation, I suggest
that slaves run at least at the version that is available in Debian
stable (currently 0.7.8).


From martin at  Fri Nov 12 10:32:46 2010
From: martin at (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 12 Nov 2010 10:32:46 +0100
Subject: [Python-Dev] migration complete
Message-ID: <>

is now on the new hardware. There have been some
problems in the migration: the old hardware would start failing before
the scheduled migration date, so the migration was done early, causing
outage for some people who then had the old address in their DNS caches.
In addition, there was initially a misconfiguration preventing outgoing
IP traffic, particularly preventing outgoing emails from being
delivered. This is all fixed now; report any remaining issues to the
metatracker.


From hrvoje.niksic at  Fri Nov 12 10:49:48 2010
From: hrvoje.niksic at (Hrvoje Niksic)
Date: Fri, 12 Nov 2010 10:49:48 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>
Message-ID: <>

On 11/11/2010 11:24 PM, Greg Ewing wrote:
> Nick Coghlan wrote:
>>  My personal opinion is that we should be trying to get the standard
>>  library to the point where __all__ definitions are unnecessary - if a
>>  name isn't in __all__, it should start with an underscore (and if that
>>  is true, then the __all__ definition becomes effectively redundant).
> What about names imported from other modules that are used by
> the module, but not intended for re-export? How would you
> prevent them from turning up in help() etc. without using
> __all__?

import foo as _foo

I believe I am not the only one who finds that practice ugly, but I find 
it just as ugly to underscore-ize every non-public helper function. 
__all__ is there for a reason, let's use it.  Maybe help() could 
automatically ignore stuff not in __all__, or display it but warn the 
user of non-public identifiers?
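
As a rough illustration of the "warn about non-public identifiers" idea, the Python sketch below lists names that look public but are missing from __all__. The function name names_outside_all is hypothetical, and this is not how help() is implemented.

```python
import types


def names_outside_all(module):
    """Return names without a leading underscore that the module does
    not list in __all__. Modules without __all__ yield an empty list,
    since there is nothing to compare against."""
    declared = getattr(module, "__all__", None)
    if declared is None:
        return []
    return sorted(
        name
        for name in vars(module)
        if not name.startswith("_") and name not in declared
    )


# Demonstration module: one declared name, one undeclared helper.
demo = types.ModuleType("demo")
demo.public = lambda: None
demo.helper = lambda: None
demo.__all__ = ["public"]

bare = types.ModuleType("bare")  # no __all__ at all
```

A help()-like tool could display the returned names with a "not in __all__" marker instead of hiding them.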

From lukasz at  Fri Nov 12 11:34:01 2010
From: lukasz at (=?UTF-8?B?xYF1a2FzeiBMYW5nYQ==?=)
Date: Fri, 12 Nov 2010 11:34:01 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<>
Message-ID: <>

Am 11.11.2010 23:15, schrieb Nick Coghlan:
> If we instead adopt the explicit policy that private APIs are:
> - imported modules (with the exception of os.path)
> - any names starting with a leading underscore
> Then we get the 3 API tiers you describe: core public API in __all__,
> other public functions and globals without leading underscores,
> private API with leading underscores (or imported modules).


I like this approach *very much*. Let me elaborate:

1. The community knows, understands and accepts _names as private. We 
need to have _names for private functions and constants because we can 
change or remove those in later versions. It's very explicit: when the 
user complains "What, you removed _foo?" we can say "Yes, it was 
considered an implementation detail *from the start*." And it's hard to 
beat that argument. It was private from the start. You knew that because 
the name you called specifies that.  If we were now to proclaim 
__all__ as the deciding factor in what's private and what's not, it would 
make the lives of all Python programmers (I mean the users as well) more 
complicated.

2. That being said, having help() mark non-underscored names which 
aren't included in __all__ as private is a good idea, too [1].  I'm a 
heavy user of interactive API discovery using dir() and help() and this 
would be definitely welcome. And even for those who don't use those 
tools, this feature can expose inconsistencies between documentation and 

3. "import name as _name" or "from x.y import z as _z" is just bad form. 
There may be valid exceptions but imagine if that were the default 
way to do it. Uglier than nights of November.

4. This is why I think considering all imports as private (unless 
they're in __all__) is a fine example of "practicality beats purity". 
We could try to conceive a way to expose this information 
programmatically but that's not so important at the moment.

[1] As Hrvoje Niksic wrote here:

Best regards,
Łukasz Langa

From fdrake at  Fri Nov 12 12:23:31 2010
From: fdrake at (Fred Drake)
Date: Fri, 12 Nov 2010 06:23:31 -0500
Subject: [Python-Dev] [Python-checkins] r86429 -
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 12, 2010 at 3:57 AM, georg.brandl
<python-checkins at> wrote in a commit:
> Add a deprecated-removed directive that allows to give the version of removal for deprecations.

This sounds pretty general-purpose rather than Python-specific.  Any
chance this will move into Sphinx?  I know a few projects that like to
deprecate things and would use this.  :-)

  -Fred

Fred L. Drake, Jr.    <fdrake at>
"A storm broke loose in my mind."  --Albert Einstein

From victor.stinner at  Fri Nov 12 13:08:30 2010
From: victor.stinner at (Victor Stinner)
Date: Fri, 12 Nov 2010 13:08:30 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
Message-ID: <>

On Thursday 11 November 2010 23:01:32 you wrote:
> > Sure, it will divide the number of lines, of the code specific to
> > Windows, by two.
> Can we get most of the code cleanup benefit without the backwards
> compatibility risk by doing the decode from 'mbcs' on our side of the
> fence?

I created PyUnicode_FSDecoder, a ParseTuple converter used to work on unicode 
paths, instead of bytes paths. On Windows, this converter uses the mbcs 
encoding in strict mode, whereas the Windows conversion uses the replace error 
handler to decode, and ignore to encode. So I don't think that we should use 
this converter on Windows.
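
The difference between the three error handlers can be shown in a few lines of Python. The mbcs codec only exists on Windows, so this sketch uses the ascii codec as a stand-in; the handlers behave the same way with any codec.

```python
raw = b"caf\xe9"  # 0xE9 is not valid ASCII

# strict: undecodable input raises an exception.
try:
    raw.decode("ascii", "strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# replace: undecodable bytes become U+FFFD when decoding.
replaced = raw.decode("ascii", "replace")

# ignore: unencodable characters are silently dropped when encoding.
ignored = "caf\u00e9".encode("ascii", "ignore")
```

With replace and ignore the call succeeds but the name that comes out no longer matches the one that went in, which is why mixing them with a strict converter gives inconsistent results.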

> That is, have code that was the C equivalent of:
> arg_is_bytes = not isinstance(arg, str)
> if arg_is_bytes:
>   val = _decode_mbcs(arg)
>   # Decoding error checking here
> else:
>   val = arg
> # Common processing using WIDE API
> if arg_is_bytes:
>   result = _encode_mbcs(wide_result)
>   # Encoding error checking here
> else:
>   result = wide_result

This doesn't make the code shorter, it may be longer than the actual code, and 
it is less compliant with the Windows native API...


From victor.stinner at  Fri Nov 12 13:13:08 2010
From: victor.stinner at (Victor Stinner)
Date: Fri, 12 Nov 2010 13:13:08 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <> <>
Message-ID: <>

On Thursday 11 November 2010 21:02:43 Antoine Pitrou wrote:
> On Thu, 11 Nov 2010 20:44:52 +0100
> "Martin v. Löwis" <martin at> wrote:
> > > How do you support cross-platform code using bytes filenames?
> > > IIRC, it has already been argued that it was an important feature. Many
> > > filesystem-related utilities might prefer to handle filenames in bytes
> > > form.
> > 
> > It would be a policy decision. However, I think it is hearsay that
> > filesystem-related utilities might prefer byte file names.
> One possible situation is when you receive filenames in bytes form from
> an external API or tool (or even the contents of a file). If you don't
> know the encoding, keeping the bytes form is obviously recommended.

I disagree with you: the filename stored in the binary content/network stream 
may be encoded with a different code page than the current Windows code page. 
The application has to decode the filename itself; the application has more 
information about the right encoding than Windows.

 - MKV video stores filenames in utf-8
 - ZIP stores filenames in cp437 or utf-8
 - tar stores filenames... in the locale encoding (except for PAX format which 
uses utf-8)
 - etc.
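
For the ZIP case, the choice between the two encodings is signalled per entry by bit 11 (0x800) of the general purpose flags. A minimal Python sketch of that decision follows; decode_zip_name is a hypothetical helper written for illustration, not a function of the zipfile module.

```python
UTF8_FLAG = 0x800  # general purpose bit 11: name is UTF-8 encoded


def decode_zip_name(raw_name, flag_bits):
    """Decode a raw ZIP member name using the encoding announced by
    the entry's general purpose flags."""
    encoding = "utf-8" if flag_bits & UTF8_FLAG else "cp437"
    return raw_name.decode(encoding)


utf8_entry = decode_zip_name("\u00e9t\u00e9.txt".encode("utf-8"), 0x800)
legacy_entry = decode_zip_name(b"\x82t\x82.txt", 0)
```

Both calls produce the same text because the application, not the operating system, picks the codec from information stored in the archive.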


From victor.stinner at  Fri Nov 12 13:15:35 2010
From: victor.stinner at (Victor Stinner)
Date: Fri, 12 Nov 2010 13:15:35 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
Message-ID: <>

On Thursday 11 November 2010 20:50:35 you wrote:
> > Even if I hate the MBCS encoding, because it replaces undecodable
> > characters by similar glyphs by default, I'm not certain that it is a
> > good idea to drop the bytes API. Can it be a problem to port programs
> > from Python2 to Python3? Do major Python2 programs/libraries rely on the
> > bytes API?
> I don't actually know for a fact, but I expect that the answer is "no".
> The questions is: where do file names typically come from? My guess
> is that they come from
> a) hard-coded strings in the source code
> b) command line arguments/environment variables
> c) directory listings
> [of course, there are other ways, like GUI input, getcwd(), etc]
> In case a), you have filenames such as ".", e.g. as a parameter to
> listdir or walk. These will typically be regular strings in Python 2,
> which become Unicode strings in 3. You would actively need to put b""
> prefixes into the code.
> In case b), they will be Unicode strings in Python 3.
> In case c), they will be Unicode strings if the argument is a Unicode
> string. So by induction, file names will be typically unicode. The
> exception will be libraries/applications which make deliberate attempts
> to get byte-oriented file names.

Ok, good answer. In this case, I vote +1 to remove completely the ANSI version 
from all Python modules.
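
Case c) of the quoted argument can be checked directly in Python 3: os.listdir returns names of the same type as its argument. The sketch below uses a temporary directory and a made-up file name.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    # str argument in, str names out.
    str_names = os.listdir(tmp)

    # bytes argument in, bytes names out.
    bytes_names = os.listdir(os.fsencode(tmp))
```

Code that starts from str paths therefore keeps getting str names back, and only code that deliberately passes bytes sees bytes.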

I consider the ANSI version as a compatibility layer for old applications 
written for MS-Dos or early versions of Windows. Even if these APIs are still 
widely used in C/C++ applications, the wide versions should always be 


From solipsis at  Fri Nov 12 14:40:29 2010
From: solipsis at (Antoine Pitrou)
Date: Fri, 12 Nov 2010 14:40:29 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
References: <> <>
Message-ID: <>

On Fri, 12 Nov 2010 13:13:08 +0100
Victor Stinner <victor.stinner at> wrote:
> On Thursday 11 November 2010 21:02:43 Antoine Pitrou wrote:
> > On Thu, 11 Nov 2010 20:44:52 +0100
> > 
> > "Martin v. Löwis" <martin at> wrote:
> > > > How do you support cross-platform code using bytes filenames?
> > > > IIRC, it has already been argued that it was an important feature. Many
> > > > filesystem-related utilities might prefer to handle filenames in bytes
> > > > form.
> > > 
> > > It would be a policy decision. However, I think it is hearsay that
> > > filesystem-related utilities might prefer byte file names.
> > 
> > One possible situation is when you receive filenames in bytes form from
> > an external API or tool (or even the contents of a file). If you don't
> > know the encoding, keeping the bytes form is obviously recommended.
> I disagree with you: the filename stored in the binary content/network stream 
> may be encoded with a different code page than the current Windows code page. 
> The application has to decode the filename itself; the application has more 
> information about the right encoding than Windows.

I'm not talking about Windows obviously. POSIX filenames are natively
bytes, so if you get a bytes filename from an external source, it makes
sense to reuse the bytes form.

I think it would be a mistake to allow bytes filenames under POSIX but
not under Windows. It makes porting harder.

>  - tar stores filenames... in the locale encoding (except for PAX format which 
> uses utf-8)

So bytes filenames are useful at least for tar. I'm sure there are many
other cases (actually, most kinds of configuration files containing
paths would apply).
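
For reference, this is how Python 3 on POSIX can carry such a bytes name through str without losing information, using the surrogateescape error handler. The sketch assumes UTF-8 as the filesystem encoding and uses a made-up Latin-1 file name.

```python
raw = b"caf\xe9.txt"  # Latin-1 bytes, not valid UTF-8

# Undecodable bytes are mapped to lone surrogates instead of failing.
as_text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = as_text.encode("utf-8", "surrogateescape")
```

The round trip is lossless, but the intermediate str is not printable as-is, so it does not remove the need to know the real encoding when the name must be displayed.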



From barry at  Fri Nov 12 17:15:53 2010
From: barry at (Barry Warsaw)
Date: Fri, 12 Nov 2010 11:15:53 -0500
Subject: [Python-Dev] buildbot master update
In-Reply-To: <>
References: <>
Message-ID: <20101112111553.44da8c08@mission>

On Nov 12, 2010, at 10:29 AM, Martin v. Löwis wrote:

>As you may have noticed: I updated the buildbot master to release 0.8.2.
>If you notice any problems, please post them here.

Pretty!  My buildbot seems fine.

>Slave operators can upgrade their installations at their own pace;
>buildbot is highly backwards compatible. As a recommendation, I suggest
>that slaves run at least at the version that is available in Debian
>stable (currently 0.7.8).

Thanks Martin, for all you do to keep our infrastructure humming along
smoothly, including the recent Roundup migration.

From status at  Fri Nov 12 18:07:02 2010
From: status at (Python tracker)
Date: Fri, 12 Nov 2010 18:07:02 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <>

ACTIVITY SUMMARY (2010-11-05 - 2010-11-12)
Python tracker at

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    2526 (+12)
  closed 19651 (+54)
  total  22177 (+66)

Open issues with patches: 1050 

Issues opened (47)

#9313: distutils error on MSVC older than 8  reopened by eric.araujo

#10252: Fix resource warnings in distutils  reopened by eric.araujo

#10329: and unicode in Python 3  reopened by belopolsky

#10332: Multiprocessing maxtasksperchild results in hang  opened by Jimbofbx

#10333: Remove ancient backwards compatibility GC API  opened by nascheme

#10336: test_xmlrpc fails if gzip is not supported by client  opened by ocean-city

#10338: test_lib2to3 failure on buildbot x86 debian parallel 3.x: node  opened by haypo

#10339: test_lib2to3 leaks  opened by pitrou

#10340: asyncore doesn't properly handle EINVAL on OSX  opened by giampaolo.rodola

#10342: trace module cannot produce coverage reports for zipped module  opened by belopolsky

#10344: codecs.readline doesn't care buffering=0  opened by Santiago.Piccinini

#10348: multiprocessing: use SysV semaphores on FreeBSD  opened by haypo

#10349: Error in Module/python.c when building on OpenBSD 4.8  opened by pgurumur

#10350: errno is read too late  opened by hfuru

#10351: Add autocompletion for keys in dictionaries  opened by Valery.Khamenya

#10354: tempfile.template is broken  opened by giampaolo.rodola

#10355: SpooledTemporaryFile's name property is broken  opened by giampaolo.rodola

#10356: hash of -1  opened by skrah

#10357: ** and "mapping" are poorly defined in python docs  opened by Fergal.Daly

#10358: Doc styles for print should only use dark colors  opened by fdrake

#10359: ISO C cleanup  opened by hfuru

#10360: _weakrefset.WeakSet.__contains__ should not propagate TypeErro  opened by tseaver

#10362: AttributeError: addinfourl instance has no attribute 'tell'  opened by Valentin.Lorentz

#10363: Embedded python, handle (memory) leak  opened by martind

#10364: Color coding fails after running program.  opened by Typo

#10365: IDLE Crashes on File Open Dialog when code window closed befor  opened by william.barr

#10366: Remove unneeded '(object)' from 3.x class examples  opened by terry.reedy

#10367: "python sdist upload --show-response" can fail with "  opened by jcea

#10369: tarfile requires an actual file on disc; a file-like object is  opened by strombrg

#10371: Deprecate trace module undocumented API  opened by belopolsky

#10373: Setup Script example incorrect  opened by lensart

#10374: caches outdated scripts in the build tree  opened by gjb1002

#10375: 2to3 print(single argument)  opened by hfuru

#10376: ZipFile unzip is unbuffered  opened by Jimbofbx

#10377: cProfile incorrectly labels its output  opened by exarkun

#10379: locale.format() input regression  opened by barry

#10381: Add timezone support to datetime C API  opened by belopolsky

#10382: Command line error marker misplaced on unicode entry  opened by belopolsky

#10383: test_os leaks under Windows  opened by pitrou

#10384: SyntaxError should contain exact location of the invalid chara  opened by belopolsky

#10385: Mark up "subprocess" as module in its doc  opened by belopolsky

#10388: spwd returning different value depending on privileges  opened by giampaolo.rodola

#10391: obj2ast's error handling can lead to python crashing with a C-  opened by dmalcolm

#10392: GZipFile crash when fileobj.mode is None  opened by bgreenlee

#10394: subprocess Popen deadlock  opened by Christoph.Mathys

#10395: os.path.commonprefix broken by design  opened by ronaldoussoren

#10345: fcntl.ioctl always fails claiming an invalid fd  opened by bgamari

Most recent 15 issues with no replies (15)

#10394: subprocess Popen deadlock

#10392: GZipFile crash when fileobj.mode is None

#10388: spwd returning different value depending on privileges

#10384: SyntaxError should contain exact location of the invalid chara

#10381: Add timezone support to datetime C API

#10377: cProfile incorrectly labels its output

#10375: 2to3 print(single argument)

#10373: Setup Script example incorrect

#10350: errno is read too late

#10339: test_lib2to3 leaks

#10338: test_lib2to3 failure on buildbot x86 debian parallel 3.x: node

#10332: Multiprocessing maxtasksperchild results in hang

#10320: printf %qd is nonstandard

#10310: signed:1 bitfields rarely make sense

#10309: dlmalloc.c needs _GNU_SOURCE for mremap()

Most recent 15 issues waiting for review (15)

#10392: GZipFile crash when fileobj.mode is None

#10391: obj2ast's error handling can lead to python crashing with a C-

#10385: Mark up "subprocess" as module in its doc

#10382: Command line error marker misplaced on unicode entry

#10371: Deprecate trace module undocumented API

#10369: tarfile requires an actual file on disc; a file-like object is

#10360: _weakrefset.WeakSet.__contains__ should not propagate TypeErro

#10359: ISO C cleanup

#10356: hash of -1

#10354: tempfile.template is broken

#10351: Add autocompletion for keys in dictionaries

#10350: errno is read too late

#10342: trace module cannot produce coverage reports for zipped module

#10340: asyncore doesn't properly handle EINVAL on OSX

#10329: and unicode in Python 3

Top 10 most discussed issues (10)

#10329: and unicode in Python 3  11 msgs

#7061: Improve turtle module documentation   9 msgs

#10354: tempfile.template is broken   9 msgs

#10359: ISO C cleanup   9 msgs

#10379: locale.format() input regression   9 msgs

#10325: PY_LLONG_MAX & co - preprocessor constants or not?   8 msgs

#5412: extend configparser to support mapping access(__*item__)   7 msgs

#10252: Fix resource warnings in distutils   7 msgs

#10349: Error in Module/python.c when building on OpenBSD 4.8   7 msgs

#10364: Color coding fails after running program.   7 msgs

Issues closed (51)

#1602: windows console doesn't print utf8 (Py30a2)  closed by haypo

#1926: NNTPS support in nntplib  closed by pitrou

#6058: Add cp65001 to encodings/  closed by haypo

#6226: Inconsistent 'file' vs 'stream' kwarg in pprint, other stdlibs  closed by eric.araujo

#6317: winsound.PlaySound doesn't accept non-unicode string  closed by ocean-city

#8634: get method for dbm interface  closed by eric.araujo

#8679: write a distutils to distutils2 converter  closed by eric.araujo

#8804: http.client should support SSL contexts  closed by pitrou

#9421: configparser.ConfigParser's getint, getboolean and getfloat do  closed by lukasz.langa

#9508: python3.2 reversal of distutils reintrocud macos9 support  closed by eric.araujo

#10008: Two links point to same place  closed by georg.brandl

#10022: Emit more information in decoded SSL certificates  closed by pitrou

#10145: float.is_integer is undocumented  closed by mark.dickinson

#10180: File objects should not pickleable  closed by pitrou

#10226: urlparse example is wrong  closed by orsenthil

#10229: Refleak run of test_gettext fails  closed by eric.araujo

#10232: Tkinter issues with Scrollbar and custom widget list  closed by terry.reedy

#10245: Fix resource warnings in test_telnetlib  closed by orsenthil

#10282: IMPLEMENTATION token differently delt with in NNTP capability  closed by pitrou

#10297: decimal module documentation is misguiding  closed by mark.dickinson

#10302: Add class-functions to hash many small objects with hashlib  closed by gregory.p.smith

#10303: small inconsistency in tutorial  closed by orsenthil

#10304: error in tutorial triple-string example  closed by terry.reedy

#10311: Signal handlers must preserve errno  closed by pitrou

#10321: Add support for Message objects and binary data to smtplib.sen  closed by r.david.murray

#10324: Modules/binascii.c: simplify expressions  closed by orsenthil

#10327: Abnormal SSL timeouts when using socket timeouts - once again  closed by pakal

#10330: trace module doesn't work without threads  closed by belopolsky

#10331: test_gdb failure when warnings printed out  closed by dmalcolm

#10334: Add new reST directive for links to source code  closed by georg.brandl

#10335: open a file with encoding detected from a cod  closed by haypo

#10337: testTanh() of test_math fails on "NetBSD 5 i386 3.x"  closed by haypo

#10341: Remove traces of setuptools  closed by eric.araujo

#10343: urllib.parse problems with bytes vs str  closed by r.david.murray

#10346: strange arithmetic behaviour  closed by mark.dickinson

#10347: regrtest progress counter makes -f option less useful  closed by pitrou

#10352: has no tests in trunk  closed by georg.brandl

#10353: 2to3 and places argument in unitests assertAlmostEqual  closed by r.david.murray

#10361: Fix issue 9995 - distutils forces developers to store password  closed by eric.araujo

#10368: "python sdist upload --show-response" fails  closed by eric.araujo

#10370: py3 readlines() reports wrong offset for UnicodeDecodeError  closed by haypo

#10372: [REGRESSION] test_gc fails in non-debug mode.  closed by pitrou

#10378: Typo in results of help(divmod)  closed by benjamin.peterson

#10380: AttributeError: 'module' object has no attribute 'exc_tracebac  closed by georg.brandl

#10386: token module should define __all__  closed by belopolsky

#10387: ConfigParser's getboolean method is broken  closed by lukasz.langa

#10389: Document rules for use of case in section titles  closed by belopolsky

#10390: json.load should handle bytes input  closed by r.david.murray

#10393: "with" statement isn't thread-safe  closed by amaury.forgeotdarc

#1466065: base64 module ignores non-alphabet characters  closed by r.david.murray

#962772: when both maintainer and author provided, author discarded  closed by tarek

From tjreedy at  Fri Nov 12 18:07:44 2010
From: tjreedy at (Terry Reedy)
Date: Fri, 12 Nov 2010 12:07:44 -0500
Subject: [Python-Dev] Issues 9931 and 9055 - test_ttk_guionly and
 buildbot run as a service
In-Reply-To: <>
References: <>
Message-ID: <ibjs91$t2m$>

On 11/12/2010 3:44 AM, Paul Moore wrote:
> Hi,
> My buildbot has been failing for some time because of these 2 issues,
> both related to the fact that tests are hanging when run as a service
> (and hence have no display to open GUI elements on). Both issues have
> patches, and as far as I am aware, the patches fix the issues
> reasonably well. What can I do to move these 2 issues forwards? As
> things stand, my buildbot is not providing a lot of value on the 3.x
> branch :-(
is marked as a 2.7 issue only, perhaps fixed by Tim Golden's committed 
patches. Should it be re-versioned for 3.1/2? There is no patch file 
attached, though perhaps the code in Yamamoto's message is meant as such 
(but for which version?). So the first thing you could do is clarify the 
current status and remaining issue on the tracker.
by Yamamoto is marked for all 3 versions. It seems to be a similar 
issue, though marked 'test' rather than 'ctypes'. It does have a patch 
by him apparently based on his previous comments. The issue has no 
responses and needs a patch review. So the first thing you could do is 
to provide one;-). If it looks great (no changes that you can think of) 
and works great, say so. Then it can move on to commit review stage.

PS. Providing links like the above makes it easier for multiple people 
to take a look and respond.

Terry Jan Reedy

From tjreedy at  Fri Nov 12 18:11:40 2010
From: tjreedy at (Terry Reedy)
Date: Fri, 12 Nov 2010 12:11:40 -0500
Subject: [Python-Dev] migration complete
In-Reply-To: <>
References: <>
Message-ID: <ibjsgc$t2m$>

On 11/12/2010 4:32 AM, "Martin v. Löwis" wrote:
> is now on the new hardware. There have been some
> problems in the migration: the old hardware would start failing before
> the scheduled migration date, so the migration was done early, causing
> outage for some people who then had the old address in their DNS caches.
> In addition, there was initially a misconfiguration preventing outgoing
> IP traffic, particularly preventing outgoing emails from being
> delivered. This is all fixed now; report any remaining issues to the
> metatracker.

I got stymied by some of the late failures, but it has been working 
great, with quick response, since last night. Thanks for the upgrade.

Terry Jan Reedy

From p.f.moore at  Fri Nov 12 18:25:05 2010
From: p.f.moore at (Paul Moore)
Date: Fri, 12 Nov 2010 17:25:05 +0000
Subject: [Python-Dev] buildbot master update
In-Reply-To: <20101112111553.44da8c08@mission>
References: <>
Message-ID: <>

On 12 November 2010 16:15, Barry Warsaw <barry at> wrote:
> On Nov 12, 2010, at 10:29 AM, Martin v. Löwis wrote:
>>As you may have noticed: I updated the buildbot master to release 0.8.2.
>>If you notice any problems, please post them here.
> Pretty! ?My buildbot seems fine.

Yes, I like the new look.

>>Slave operators can upgrade their installations at their own pace;
>>buildbot is highly backwards compatible. As a recommendation, I suggest
>>that slaves run at least at the version that is available in Debian
>>stable (currently 0.7.8).
> Thanks Martin, for all you do to keep our infrastructure humming along
> smoothly, including the recent Roundup migration.

Thanks from me, too!

From solipsis at  Fri Nov 12 20:42:00 2010
From: solipsis at (Antoine Pitrou)
Date: Fri, 12 Nov 2010 20:42:00 +0100
Subject: [Python-Dev] r86418 - in python/branches/release27-maint:
 Doc/library/difflib.rst Lib/ Lib/test/ Misc/NEWS
References: <>
Message-ID: <>


On Fri, 12 Nov 2010 00:22:19 +0100 (CET)
terry.reedy <python-checkins at> wrote:
> +
> +   .. versionadded:: 2.7
> +      The *autojunk* parameter.

Maybe I've missed something, but is there any reason to add a new
parameter in a bugfix release?
(apart from security issues)



From martin at  Fri Nov 12 20:44:34 2010
From: martin at ("Martin v. Löwis")
Date: Fri, 12 Nov 2010 20:44:34 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
	<>	<>	<>
Message-ID: <>

> I'm not talking about Windows obviously. POSIX filenames are natively
> bytes, so if you get a bytes filename from an external source, it makes
> sense to reuse the bytes form.
> I think it would be a mistake to allow bytes filenames under POSIX but
> not under Windows. It makes porting harder.

Not really. People who want to write portable code should use Unicode
filenames everywhere, not byte filenames.

>>  - tar stores filenames... in the locale encoding (except for PAX format which 
>> uses utf-8)
> So bytes filenames are useful at least for tar.

No, they are not. The tarfile module decodes all file names on its own,


From martin at  Fri Nov 12 20:46:27 2010
From: martin at ("Martin v. Löwis")
Date: Fri, 12 Nov 2010 20:46:27 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

> Ok, good answer. In this case, I vote +1 to completely remove the ANSI version
> from all Python modules.

I think caution is still necessary. So I propose to deprecate byte
filenames on Windows in 3.2, with removal in 3.3. People who think this
is a terrible mistake and breaks their applications with no hope of a
sensible solution can then still intervene.


From martin at  Fri Nov 12 20:53:00 2010
From: martin at ("Martin v. Löwis")
Date: Fri, 12 Nov 2010 20:53:00 +0100
Subject: [Python-Dev] buildbot master update
In-Reply-To: <20101112111553.44da8c08@mission>
References: <> <20101112111553.44da8c08@mission>
Message-ID: <>

> Thanks Martin, for all you do to keep our infrastructure humming along
> smoothly, including the recent Roundup migration.

I just write the announcements :-) In this case, thanks should also
extend to Izak Burger of Upfront Hosting who did most of the setup
(I just did the DNS changes), and to bitdancer who
investigated (together with Izak) the configuration problems of the new
installation.

From solipsis at  Fri Nov 12 21:07:52 2010
From: solipsis at (Antoine Pitrou)
Date: Fri, 12 Nov 2010 21:07:52 +0100
Subject: [Python-Dev] buildbot master update
References: <> <20101112111553.44da8c08@mission>
Message-ID: <>

On Fri, 12 Nov 2010 20:53:00 +0100
"Martin v. Löwis" <martin at> wrote:
> > Thanks Martin, for all you do to keep our infrastructure humming along
> > smoothly, including the recent Roundup migration.
> I just write the announcements :-) In this case, thanks should also
> extend to Izak Burger of Upfront Hosting who did most of the setup
> (I just did the DNS changes), and to bitdancer who
> investigated (together with Izak) the configuration problems of the new
> installation.

And for the record, bitdancer is R. David Murray :-)



From hnassrat at  Fri Nov 12 21:08:42 2010
From: hnassrat at (Hatem Nassrat)
Date: Fri, 12 Nov 2010 13:08:42 -0700
Subject: [Python-Dev] Closures / Python Scopes
Message-ID: <>

A colleague of mine came across something anecdotal when working with
lambdas, it is expressed by the following code snippet.

  In [1]: def a():
     ...:     for i in range(10):
     ...:         def b():
     ...:             return i
     ...:         yield b

  In [2]: funcs = list(a())

  In [3]: print [f() for f in funcs]
  [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]

I understand that for loops in python do not have a scope, neither do
if statements, and python is awesome for that. Is this something
accidental? i.e. will python ever evolve into having scopes for if and
for loops (and other blocks that are not functions)? the reason I ask
is with the introduction of it
seems like something that can happen.
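
For what it's worth, here is a minimal sketch of what is going on
(Python 3 syntax; the function names are made up for illustration).
The nested function looks up i when it is called, not when it is
defined, so once list() has exhausted the generator every closure sees
the last value. Binding the current value as a default argument is one
common workaround, and calling each function while the generator is
still suspended also gives the "expected" numbers.

```python
def late_binding():
    # Every b() shares the single variable i of this generator's frame.
    for i in range(3):
        def b():
            return i
        yield b


def bound_at_definition():
    # The default argument captures the value of i at definition time.
    for i in range(3):
        def b(i=i):
            return i
        yield b


# list() runs the loop to completion first, so i has its final value.
late = [f() for f in list(late_binding())]

# Each function carries its own copy of i as a default.
early = [f() for f in list(bound_at_definition())]

# Calling f() while the generator is paused sees the current i.
lazy = [f() for f in late_binding()]

print(late, early, lazy)
```

None of this needs a per-iteration scope for the for loop; it only
depends on when the name is looked up.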

Hatem Nassrat

From tjreedy at  Fri Nov 12 21:32:21 2010
From: tjreedy at (Terry Reedy)
Date: Fri, 12 Nov 2010 15:32:21 -0500
Subject: [Python-Dev] r86418 - in python/branches/release27-maint:
 Doc/library/difflib.rst Lib/ Lib/test/ Misc/NEWS
In-Reply-To: <>
References: <>
Message-ID: <ibk88m$r56$>

On 11/12/2010 2:42 PM, Antoine Pitrou wrote:
> Hello,
> On Fri, 12 Nov 2010 00:22:19 +0100 (CET)
> terry.reedy<python-checkins at>  wrote:
>> +
>> +   .. versionadded:: 2.7
>> +      The *autojunk* parameter.
> Maybe I've missed something, but is there any reason to add a new
> parameter in a bugfix release?
> (apart from security issues)

This is a bugfix. We discussed this (with Tim's participation) here last 
July/August and pretty well agreed that this was the least obnoxious 
solution to a bad situation.

Terry Jan Reedy

From tjreedy at  Fri Nov 12 21:38:19 2010
From: tjreedy at (Terry Reedy)
Date: Fri, 12 Nov 2010 15:38:19 -0500
Subject: [Python-Dev] Closures / Python Scopes
In-Reply-To: <>
References: <>
Message-ID: <ibk8jr$srs$>

On 11/12/2010 3:08 PM, Hatem Nassrat wrote:
> A colleague of mine came across something anecdotal when working with
> lambdas, it is expressed by the following code snippet.
>    In [1]: def a():
>       ...:     for i in range(10):
>       ...:         def b():
>       ...:             return i
>       ...:         yield b
>       ...:
>       ...:
>    In [2]: funcs = list(a())
>    In [3]: print [f() for f in funcs]
>    [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
> I understand that for loops in python do not have a scope, neither do
> if statements, and python is awesome for that. Is this something
> accidental? i.e. will python ever evolve into having scopes for if and
> for loops (and other blocks that are not functions)? the reason I ask
> is with the introduction of
> it
> seems like something that can happen.

Question/discussion issues like this belong on python-list or 
python-ideas list.

Terry Jan Reedy

From tjreedy at  Fri Nov 12 21:53:17 2010
From: tjreedy at (Terry Reedy)
Date: Fri, 12 Nov 2010 15:53:17 -0500
Subject: [Python-Dev] r86418 - in python/branches/release27-maint:
 Doc/library/difflib.rst Lib/ Lib/test/ Misc/NEWS
In-Reply-To: <ibk88m$r56$>
References: <>	<>
Message-ID: <ibk9fs$vr$>

On 11/12/2010 3:32 PM, Terry Reedy wrote:
> On 11/12/2010 2:42 PM, Antoine Pitrou wrote:
>> Hello,
>> On Fri, 12 Nov 2010 00:22:19 +0100 (CET)
>> terry.reedy<python-checkins at> wrote:
>>> +
>>> + .. versionadded:: 2.7
>>> + The *autojunk* parameter.

I just realized that this should say 2.7.1 so people know not to use it 
with the original 2.7. I will repeat it again in the SequenceMatcher 

>> Maybe I've missed something, but is there any reason to add a new
>> parameter in a bugfix release?
>> (apart from security issues)
> This is a bugfix. We discussed this (with Tim's participation) here last
> July/August and pretty well agreed that this was the least obnoxious
> solution to a bad situation.

Terry Jan Reedy

From ncoghlan at  Sat Nov 13 01:45:22 2010
From: ncoghlan at (Nick Coghlan)
Date: Sat, 13 Nov 2010 10:45:22 +1000
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Nov 13, 2010 at 5:46 AM, "Martin v. Löwis" <martin at> wrote:
>> Ok, good answer. In this case, I vote +1 to completely remove the ANSI version
>> from all Python modules.
> I think caution is still necessary. So I propose to deprecate byte
> filenames on Windows in 3.2, with removal in 3.3. People who think this
> is a terrible mistake and breaks their applications with no hope of a
> sensible solution can then still intervene.

I was going to suggest much the same thing.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Nov 13 01:51:03 2010
From: ncoghlan at (Nick Coghlan)
Date: Sat, 13 Nov 2010 10:51:03 +1000
Subject: [Python-Dev] r86418 - in python/branches/release27-maint:
 Doc/library/difflib.rst Lib/ Lib/test/ Misc/NEWS
In-Reply-To: <ibk88m$r56$>
References: <>
	<> <ibk88m$r56$>
Message-ID: <>

On Sat, Nov 13, 2010 at 6:32 AM, Terry Reedy <tjreedy at> wrote:
> On 11/12/2010 2:42 PM, Antoine Pitrou wrote:
>> Maybe I've missed something, but is there any reason to add a new
>> parameter in a bugfix release?
>> (apart from security issues)
> This is a bugfix. We discussed this (with Tim's participation) here last
> July/August and pretty well agreed that this was the least obnoxious
> solution to a bad situation.

Yep, as Terry said, the current behaviour is irredeemably broken in
some situations, but switching it off completely would break other
cases. Adding a new optional parameter that defaulted to the 2.7
behaviour was considered the least-bad option out of those available
(do nothing, add parameter, change default behaviour, add new API).
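
As a rough illustration of the behaviour being discussed (a sketch
only: the input strings are made up, and the thresholds are as I read
the current difflib source, not something stated in this thread): when
the second sequence has at least 200 items, any element that accounts
for more than about 1% of it is treated as junk, unless the new
parameter is passed as autojunk=False.

```python
from difflib import SequenceMatcher

# Made-up inputs: b is 300 copies of one character, so that character is
# "popular" and the automatic heuristic drops it from the match index.
a = "y" + "x" * 300
b = "x" * 300

# Default behaviour (autojunk=True), i.e. what 2.7 always did.
ratio_default = SequenceMatcher(None, a, b).ratio()

# Heuristic switched off via the new optional parameter.
ratio_no_junk = SequenceMatcher(None, a, b, autojunk=False).ratio()

print("default       :", ratio_default)
print("autojunk=False:", ratio_no_junk)
```

The two strings differ by a single leading character, yet only the
second call reports them as nearly identical.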


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From tjreedy at  Sat Nov 13 02:31:49 2010
From: tjreedy at (Terry Reedy)
Date: Fri, 12 Nov 2010 20:31:49 -0500
Subject: [Python-Dev] [Python-checkins] r86441
	-	python/branches/py3k/Lib/test/
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/12/2010 7:28 PM, antoine.pitrou wrote:
> Author: antoine.pitrou
> Date: Sat Nov 13 01:28:53 2010
> New Revision: 86441
> Log:
> Switch from gmane to another provider for NNTP tests (as gmane isn't reliable
> enough).  Also, use setUpClass in order to connect only once per test run.

>       class NetworkedNNTP_SSLTests(NetworkedNNTPTestsMixin, unittest.TestCase):
> -        NNTP_HOST = ''
> -        GROUP_NAME = 'gmane.comp.python.devel'
> -        GROUP_PAT = 'gmane.comp.python.d*'

gmane is most problematical on weekends.

> +        NNTP_HOST = ''
> +        GROUP_NAME = 'comp.lang.python'
> +        GROUP_PAT = 'comp.lang.*'

aioe went away for several months a couple of years ago or so.
Let us hope it stays up for awhile now.
The ssl connection currently does not work (expired certificate).


From greg.ewing at  Sat Nov 13 04:05:35 2010
From: greg.ewing at (Greg Ewing)
Date: Sat, 13 Nov 2010 16:05:35 +1300
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Steven D'Aprano wrote:
> By the way, did you intend to send this off-list?

No, I didn't realise I hadn't sent it to the list.

If you don't document them, I won't use them, because I won't
know if it's one of these don't-ask-don't-tell pseudo-public
functions or something private that's accidentally been given
a non-underscore name.

> Greg Ewing wrote:

>> Also it means that help() wouldn't show me documentation for
>> the support functions, which is a bad thing if they really are
>> intended for public use.
> I don't see why... if you import the module, and call help(module), they 
> will show up as normal.

If the module has an __all__ list, then help(module) will only
show functions included in that list. So your pseudo-public
functions would not show up in it. Without some other reason to
suspect their existence, I would probably never find them.


From guido at  Sat Nov 13 05:38:16 2010
From: guido at (Guido van Rossum)
Date: Fri, 12 Nov 2010 20:38:16 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Fri, Nov 12, 2010 at 1:49 AM, Hrvoje Niksic <hrvoje.niksic at> wrote:
> On 11/11/2010 11:24 PM, Greg Ewing wrote:
>> Nick Coghlan wrote:
>>> My personal opinion is that we should be trying to get the standard
>>> library to the point where __all__ definitions are unnecessary - if a
>>> name isn't in __all__, it should start with an underscore (and if that
>>> is true, then the __all__ definition becomes effectively redundant).
>> What about names imported from other modules that are used by
>> the module, but not intended for re-export? How would you
>> prevent them from turning up in help() etc. without using
>> __all__?
> import foo as _foo
> I believe I am not the only one who finds that practice ugly,


> but I find it
> just as ugly to underscore-ize every non-public helper function. __all__ is
> there for a reason, let's use it. ?Maybe help() could automatically ignore
> stuff not in __all__, or display it but warn the user of non-public
> identifiers?

No, I like all non-public functions, constants, classes and variables
(but excluding imported modules) to start with _.

You'd still need __all__ to make "import *" do the right thing, but
the reader of the source code should not have to look up every name in
__all__ to find whether it is supposed to be public or private. Plus,
the same convention should carry over to methods and other class
attributes, where you don't have __all__.

If help() is broken we should fix it. (I'm not very happy with it
myself anyway, I rarely use it.)

Note that __all__ was originally invented to give "from package import
*" a well-defined meaning when the package included submodules that
might not have been loaded yet. Using it for other export control
(while a good idea) could be considered "newfangled". :-)
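
To make the two mechanisms concrete, here is a small sketch (the
module names and the helpers make_module/star_import are invented for
the demo; only the import semantics are real). With __all__ defined,
"import *" binds exactly the listed names; without it, every name that
does not start with an underscore is bound, including imported modules.

```python
import sys
import types


def make_module(name, source):
    """Build a throwaway module from source text and register it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


def star_import(name):
    """Return the sorted names that 'from <name> import *' would bind."""
    namespace = {}
    exec(f"from {name} import *", namespace)
    return sorted(n for n in namespace if n != "__builtins__")


make_module("demo_with_all", """
__all__ = ['public']
def public(): pass
def helper(): pass      # no underscore, but left out of __all__
def _private(): pass
""")

make_module("demo_without_all", """
import os               # an imported module is exported as well
def public(): pass
def helper(): pass
def _private(): pass
""")

with_all = star_import("demo_with_all")
without_all = star_import("demo_without_all")
print(with_all)
print(without_all)
```

In the first module a reader cannot tell from the def line alone that
helper is private, which is the argument for the underscore prefix; in
the second, os leaks out, which is the argument for keeping __all__.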

--Guido van Rossum (

From solipsis at  Sat Nov 13 13:06:46 2010
From: solipsis at (Antoine Pitrou)
Date: Sat, 13 Nov 2010 13:06:46 +0100
Subject: [Python-Dev] Breaking undocumented API
References: <>
	<> <>
Message-ID: <>

On Fri, 12 Nov 2010 20:38:16 -0800
Guido van Rossum <guido at> wrote:
> Note that __all__ was originally invented to give "from package import
> *" a well-defined meaning when the package included submodules that
> might not have been loaded yet. Using it for other export control
> (while a good idea) could be considered "newfangled". :-)

Newfangled in a rather old way already, then, perhaps :p



From solipsis at  Sat Nov 13 13:08:39 2010
From: solipsis at (Antoine Pitrou)
Date: Sat, 13 Nov 2010 13:08:39 +0100
Subject: [Python-Dev] [Python-checkins] r86441 -
References: <>
Message-ID: <>

On Fri, 12 Nov 2010 20:31:49 -0500
Terry Reedy <tjreedy at> wrote:
> >       class NetworkedNNTP_SSLTests(NetworkedNNTPTestsMixin, unittest.TestCase):
> > -        NNTP_HOST = ''
> > -        GROUP_NAME = 'gmane.comp.python.devel'
> > -        GROUP_PAT = 'gmane.comp.python.d*'
> gmane is most problematical on weekends.

Well we've had buildbot failures in the middle of the week.

> > +        NNTP_HOST = ''
> > +        GROUP_NAME = 'comp.lang.python'
> > +        GROUP_PAT = 'comp.lang.*'
> aioe went away for several months a couple of years ago or so.
> Let us hope it stays up for awhile now.
> The ssl connection currently does not work (expired certificate).

Funny, it shows that the NNTP SSL tests don't check the certificate,
then.



From g.rodola at  Sat Nov 13 13:12:31 2010
From: g.rodola at (Giampaolo Rodolà)
Date: Sat, 13 Nov 2010 13:12:31 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

+1 on everything.

2010/11/11 Alexander Belopolsky <alexander.belopolsky at>:
> 2010/11/11 Michael Foord <fuzzyman at>:
> ..
>>> You mean runtime automation, e.g. creating __all__ on the fly omitting
>>> underscored names?
>> Writing code to generate a __all__ that duplicates the default behaviour
>> seems redundant to me.
> FWIW, I like having __all__ at the top of the module.  It feels like a
> table of contents at the start of a chapter.  In some cases it may
> also serve as an optimization when len(__all__) is much smaller than
> len(__dict__).  I also don't like _ prefix to become an exclusive
> means to express privateness.
> I think the current definition of "public names" is a good one and
> just needs to be made more visible in the docs.  If the module defines
> __all__, that should be the ultimate answer to what is public in that
> module.  (Users should learn to use help(module) instead of
> dir(module) for API discovery.)  If __all__ is not defined in the
> module, I think it is good to introduce it after a careful review of
> what it should contain.  And __all__ should never contain names that
> start with _.

From foom at  Sat Nov 13 13:30:05 2010
From: foom at (James Y Knight)
Date: Sat, 13 Nov 2010 07:30:05 -0500
Subject: [Python-Dev] [Python-checkins] r86441 -
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Nov 13, 2010, at 7:08 AM, Antoine Pitrou wrote:
> Funny, it shows that the NNTP SSL tests don't check the certificate,
> then.

Unsurprising, given that you need 140 lines of pretty non-obvious python code to do so...


From solipsis at  Sat Nov 13 13:37:12 2010
From: solipsis at (Antoine Pitrou)
Date: Sat, 13 Nov 2010 13:37:12 +0100
Subject: [Python-Dev] Stable buildbots
Message-ID: <>


Just to let you know that we now have 8 stable buildbots, including
Barry's own PPC Ubuntu machine (even though the Windows buildbots give
a rather unconventional meaning to the word "stability").

Right now they are mostly green:



From solipsis at  Sat Nov 13 13:40:25 2010
From: solipsis at (Antoine Pitrou)
Date: Sat, 13 Nov 2010 13:40:25 +0100
Subject: [Python-Dev] [Python-checkins] r86441 -
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Sat, 13 Nov 2010 07:30:05 -0500
James Y Knight <foom at> wrote:
> On Nov 13, 2010, at 7:08 AM, Antoine Pitrou wrote:
> > Funny, it shows that the NNTP SSL tests don't check the certificate,
> > then.
> Unsurprising, given that you need 140 lines of pretty non-obvious python code to do so...

You must have missed the new match_hostname() function:

(its implementation is 50 lines rather than 140 lines, though)



From dickinsm at  Sat Nov 13 14:00:29 2010
From: dickinsm at (Mark Dickinson)
Date: Sat, 13 Nov 2010 13:00:29 +0000
Subject: [Python-Dev] buildbot master update
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 12, 2010 at 9:29 AM, "Martin v. Löwis" <martin at> wrote:
> As you may have noticed: I updated the buildbot master to release 0.8.2.
> If you notice any problems, please post them here.

One effect of this change seems to be that bbreport[1] no longer
works, since it appears that buildbot 0.8.2 has done away with the
XMLRPC interface that bbreport uses.

But that's really a bbreport issue rather than a buildbot one...



From g.brandl at  Sat Nov 13 15:15:43 2010
From: g.brandl at (Georg Brandl)
Date: Sat, 13 Nov 2010 15:15:43 +0100
Subject: [Python-Dev] buildbot master update
In-Reply-To: <>
References: <>
Message-ID: <ibm6l6$mmg$>

Am 13.11.2010 14:00, schrieb Mark Dickinson:
> On Fri, Nov 12, 2010 at 9:29 AM, "Martin v. Löwis" <martin at> wrote:
>> As you may have noticed: I updated the buildbot master to release 0.8.2.
>> If you notice any problems, please post them here.
> One effect of this change seems to be that bbreport[1] no longer
> works, since it appears that buildbot 0.8.2 has done away with the
> XMLRPC interface that bbreport uses.
> But that's really a bbreport issue rather than a buildbot one...
> Mark

I've added a quickfix by copying the removed xmlrpc interface to the
local buildbot installation now.  I had to patch it up a bit though,
because of an apparent API change somewhere in buildbot, and I'm not
sure whether this was correct.


From ocean-city at  Sat Nov 13 15:47:53 2010
From: ocean-city at (Hirokazu Yamamoto)
Date: Sat, 13 Nov 2010 23:47:53 +0900
Subject: [Python-Dev] Issues 9931 and 9055 - test_ttk_guionly and
 buildbot run as a service
In-Reply-To: <ibjs91$t2m$>
References: <>
Message-ID: <>

On 2010/11/13 2:07, Terry Reedy wrote:
> On 11/12/2010 3:44 AM, Paul Moore wrote:
>> Hi,
>> My buildbot has been failing for some time because of these 2 issues,
>> both related to the fact that tests are hanging when run as a service
>> (and hence have no display to open GUI elements on). Both issues have
>> patches, and as far as I am aware, the patches fix the issues
>> reasonably well. What can I do to move these 2 issues forwards? As
>> things stand, my buildbot is not providing a lot of value on the 3.x
>> branch :-(
> is marked as a 2.7 issue only, perhaps fixed by Tim Golden's committed
> patches. Should it be re-versioned for 3.1/2? There is no patch file
> attached, though perhaps the code in Yamamoto's message is meant as such
> (but for which version?). So the first thing you could do is clarify the
> current status and remaining issue on the tracker.
> by Yamamoto is marked for all 3 versions. It seems to be a similar
> issue, though marked 'test' rather than 'ctypes'. It does have a patch
> by him apparently based on his previous comments. The issue has no
> responses and needs a patch review. So the first thing you could do is
> to provide one;-). If it looks great (no changes that you can think of)
> and works great, say so. Then it can move on to commit review stage.
> PS. Providing links like the above makes it easier for multiple people
> to take a look and respond.

My patch won't fix issue9055 directly, but solves issue9931.
Probably it's easy to create a patch to fix issue9055 based
on my patch.

One problem is how to skip the test. With a single decorator like
skip_unless_symlink? Or let requires() raise SkipTest?
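
A minimal sketch of the two options, using plain unittest (everything
named here is hypothetical: gui_available() is a stand-in for a real
display probe and simply returns False, and skip_unless_gui and
requires_gui are not the actual test.support helpers):

```python
import io
import unittest


def gui_available():
    """Hypothetical probe; a real one would try to open a display."""
    return False


# Option 1: a single decorator, in the style of skip_unless_symlink.
skip_unless_gui = unittest.skipUnless(gui_available(),
                                      "requires a GUI display")


# Option 2: a requires()-style helper that raises SkipTest itself.
def requires_gui():
    if not gui_available():
        raise unittest.SkipTest("requires a GUI display")


class DecoratorStyle(unittest.TestCase):
    @skip_unless_gui
    def test_widget(self):
        self.fail("would have opened a window")


class RequiresStyle(unittest.TestCase):
    def test_widget(self):
        requires_gui()
        self.fail("would have opened a window")


def run_demo():
    loader = unittest.TestLoader()
    suite = unittest.TestSuite([
        loader.loadTestsFromTestCase(DecoratorStyle),
        loader.loadTestsFromTestCase(RequiresStyle),
    ])
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


result = run_demo()
print(len(result.skipped), "tests skipped")
```

Both variants report the test as skipped rather than failed. The
decorator is evaluated once at import time, while the helper is
evaluated each time the test runs.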

From ocean-city at  Sat Nov 13 17:21:37 2010
From: ocean-city at (Hirokazu Yamamoto)
Date: Sun, 14 Nov 2010 01:21:37 +0900
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

On 2010/11/12 4:26, Victor Stinner wrote:
 > On Thursday 11 November 2010 17:07:28 Hirokazu Yamamoto wrote:
 >> Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA)
 >> and only use Win32 WIDE API (ie: GetFileAttributesW)?
 >> Mainly in posixmodule.c.
 > Even if I hate the MBCS encoding, because it replaces undecodable
 > characters by similar glyphs by default, I'm not certain that it is a
 > good idea to drop the bytes API.

On 2010/11/12 21:08, Victor Stinner wrote:
> On Thursday 11 November 2010 23:01:32 you wrote:
>>> Sure, it will divide the number of lines, of the code specific to
>>> Windows, by two.
>> Can we get most of the code cleanup benefit without the backwards
>> compatibility risk by doing the decode from 'mbcs' on our side of the
>> fence?
> I created PyUnicode_FSDecoder, a ParseTuple converter used to work on unicode
> paths, instead of bytes paths. On Windows, this converter uses mbcs encoding
> in strict mode, whereas Windows converter uses replace error handler to
> decode, and ignore to encode. So I don't think that we should use this converter
> on Windows.
>> That is, have code that was the C equivalent of:
>> arg_is_bytes = not isinstance(arg, str)
>> if arg_is_bytes:
>>    val = _decode_mbcs(arg)
>>    # Decoding error checking here
>> else:
>>    val = arg
>> # Common processing using WIDE API
>> if arg_is_bytes:
>>    result = _encode_mbcs(wide_result)
>>    # Encoding error checking here
>> else:
>>    result = wide_result
> This doesn't make the code shorter, it may be longer than the actual code, and
> it is less compliant with the Windows native API...

Is it possible to implement a new PyArg_ParseTuple converter to use
     PyUnicode_Decode(const char *s,
                      Py_ssize_t size,
                      const char *encoding, /* mbcs */
                      const char *errors) /* replace */
and use it?
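
In Python terms, the idea in the quoted pseudo-code combined with the
"replace" error handler would look roughly like the sketch below. The
names call_with_wide_api and wide_func are hypothetical stand-ins for
the C code and the WIDE API call. The 'mbcs' codec only exists on
Windows, so the demo passes 'ascii' to stay runnable elsewhere.

```python
def call_with_wide_api(arg, wide_func, encoding="mbcs", errors="replace"):
    """Decode bytes arguments, do the work on str, re-encode the result.

    wide_func stands in for the single code path that would call the
    WIDE (Unicode) Windows API.
    """
    arg_is_bytes = isinstance(arg, bytes)
    if arg_is_bytes:
        # Undecodable bytes become U+FFFD instead of raising an error.
        val = arg.decode(encoding, errors)
    else:
        val = arg

    wide_result = wide_func(val)  # common processing on str only

    if arg_is_bytes:
        # Unencodable characters become '?' on the way back out.
        return wide_result.encode(encoding, errors)
    return wide_result


# str in, str out: untouched by any codec.
from_str = call_with_wide_api("abc", str.upper, encoding="ascii")

# bytes in, bytes out: the undecodable byte is silently replaced.
from_bytes = call_with_wide_api(b"caf\xe9", str.upper, encoding="ascii")

print(from_str, from_bytes)
```

The second call shows the cost of "replace": the original byte is lost,
so the name that comes back is not the name that went in.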

From tjreedy at  Sat Nov 13 19:40:54 2010
From: tjreedy at (Terry Reedy)
Date: Sat, 13 Nov 2010 13:40:54 -0500
Subject: [Python-Dev] [Python-checkins] r86441 -
In-Reply-To: <>
References: <>	<>
Message-ID: <ibmm3m$o69$>

On 11/13/2010 7:08 AM, Antoine Pitrou wrote:
> On Fri, 12 Nov 2010 20:31:49 -0500
> Terry Reedy<tjreedy at>  wrote:
>>>        class NetworkedNNTP_SSLTests(NetworkedNNTPTestsMixin, unittest.TestCase):
>>> -        NNTP_HOST = ''
>>> -        GROUP_NAME = 'gmane.comp.python.devel'
>>> -        GROUP_PAT = 'gmane.comp.python.d*'
>> gmane is most problematical on weekends.
> Well we've had buildbot failures in the middle of the week.

Which is why I did not say 'only' ;-).

>>> +        NNTP_HOST = ''
>>> +        GROUP_NAME = 'comp.lang.python'
>>> +        GROUP_PAT = 'comp.lang.*'
>> aioe went away for several months a couple of years ago or so.
>> Let us hope it stays up for awhile now.
>> The ssl connection currently does not work (expired certificate).

More specifically, if, with Thunderbird, I turn on SSL/TLS, (which 
switches from port 119 to 563), I get *invalid* certificate message - 
good for, news.aioe,org, but not I believe SSL 
worked before the hiatus so it might be an oversight in restarting.

> Funny, it shows that the NNTP SSL tests don't check the certificate,
> then.

Or not thoroughly.

Terry Jan Reedy

From tjreedy at  Sat Nov 13 20:29:23 2010
From: tjreedy at (Terry Reedy)
Date: Sat, 13 Nov 2010 14:29:23 -0500
Subject: [Python-Dev] [Python-checkins] r86441 -
In-Reply-To: <ibmm3m$o69$>
References: <>	<>	<>
Message-ID: <ibmouj$4ih$>

> More specifically, if, with Thunderbird, I turn on SSL/TLS, (which
> switches from port 119 to 563), I get *invalid* certificate message -
> good for, news.aioe,org, but not I believe SSL
> worked before the hiatus so it might be an oversight in restarting.
>> Funny, it shows that the NNTP SSL tests don't check the certificate,
>> then.
> Or not thoroughly.

I can access gmane with SSL, so you could add a conditional (on being up 
and running) certificate check using that.

Terry Jan Reedy

From tjreedy at  Sat Nov 13 19:17:25 2010
From: tjreedy at (Terry Reedy)
Date: Sat, 13 Nov 2010 13:17:25 -0500
Subject: [Python-Dev] [Python-checkins] r86451 -
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/13/2010 8:25 AM, georg.brandl wrote:
> Author: georg.brandl
> Date: Sat Nov 13 14:25:40 2010
> New Revision: 86451

> -  unused undocumented value PyBUF_SHADOW, and strangely-looking code in
> +  undocumented value PyBUF_SHADOW, and strangely-looking code in

For future reference, 'strangely-looking' should be either
'strange-looking' or 'strangely appearing'. First, '-ly' adverbs are never
hyphenated even when modifying adjectives. Second, 'strangely looking
code' would mean that the code is actively looking around strangely (as 
opposed to passively sitting there appearing strange).


From tjreedy at  Sat Nov 13 19:21:09 2010
From: tjreedy at (Terry Reedy)
Date: Sat, 13 Nov 2010 13:21:09 -0500
Subject: [Python-Dev] [Python-checkins] r86453 - in
 python/branches/release31-maint: Include/patchlevel.h
 Lib/distutils/	Lib/idlelib/ Misc/NEWS
 Misc/RPM/python-3.1.spec README
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/13/2010 12:28 PM, benjamin.peterson wrote:
> Author: benjamin.peterson
> Date: Sat Nov 13 18:28:56 2010
> New Revision: 86453

> Modified: python/branches/release31-maint/README
> ==============================================================================
> --- python/branches/release31-maint/README	(original)
> +++ python/branches/release31-maint/README	Sat Nov 13 18:28:56 2010
> @@ -1,5 +1,5 @@
> -This is Python version 3.1.2
> -============================
> +This is Python version 3.1.2 release candidate 1
> +================================================

That should be 3.1.3 ;-)

From janssen at  Sat Nov 13 21:56:11 2010
From: janssen at (Bill Janssen)
Date: Sat, 13 Nov 2010 12:56:11 PST
Subject: [Python-Dev] [Python-checkins] r86441 -
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Antoine Pitrou <solipsis at> wrote:

> On Sat, 13 Nov 2010 07:30:05 -0500
> James Y Knight <foom at> wrote:
> > On Nov 13, 2010, at 7:08 AM, Antoine Pitrou wrote:
> > > Funny, it shows that the NNTP SSL tests don't check the certificate,
> > > then.
> > 
> > Unsurprising, given that you need 140 lines of pretty non-obvious python code to do so...
> You must have missed the new match_hostname() function:
> (its implementation is 50 lines rather than 140 lines, though)

On the client side, it's pretty easy to see an invalid (say, expired)
certificate.  Just call get_server_certificate(), which will fail if the
server certificate is invalid.

That's a separate issue from matching the request hostname to the
various host identifiers in the certificate, which various application
protocols may or may not require.
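
A sketch of the two separate checks, written against the present-day
ssl module rather than the API available at the time of this thread
(ssl.create_default_context() postdates it; the helper names
make_context and open_verified are made up, and port 563 is the NNTP
over SSL port mentioned earlier in the thread):

```python
import socket
import ssl


def make_context(match_hostname=True):
    """Return a client context that always validates the certificate."""
    # The default context requires a valid chain (which includes expiry).
    context = ssl.create_default_context()
    # Hostname matching is a second check that can be toggled on its own.
    context.check_hostname = match_hostname
    return context


def open_verified(host, port=563, match_hostname=True):
    """Connect and handshake; an invalid certificate raises ssl.SSLError."""
    context = make_context(match_hostname)
    sock = socket.create_connection((host, port), timeout=10)
    try:
        return context.wrap_socket(sock, server_hostname=host)
    except Exception:
        sock.close()
        raise


strict = make_context()                          # validity + hostname
chain_only = make_context(match_hostname=False)  # validity only

print(strict.verify_mode, strict.check_hostname)
print(chain_only.verify_mode, chain_only.check_hostname)
```

Nothing here touches the network until open_verified() is called with
a real host name.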


From benjamin at  Sun Nov 14 00:08:10 2010
From: benjamin at (Benjamin Peterson)
Date: Sat, 13 Nov 2010 17:08:10 -0600
Subject: [Python-Dev] [RELEASED] Python 3.1.3 release candidate 1
Message-ID: <>

On behalf of the Python development team, I'm gladsome to announce a release
candidate of the third bugfix release for the Python 3.1 series, Python 3.1.3.

This bug fix release fixes numerous issues found in 3.1.2.  Please try it with
your packages and report any bugs you find.  The final of 3.1.3 is scheduled to
be released in two weeks.

The Python 3.1 version series focuses on the stabilization and optimization of
the features and changes that Python 3.0 introduced.  For example, the new I/O
system has been rewritten in C for speed.  File system APIs that use unicode
strings now handle paths with undecodable bytes in them. Other features include
an ordered dictionary implementation, a condensed syntax for nested with
statements, and support for ttk Tile in Tkinter.  For a more extensive list of
changes in 3.1, see or Misc/NEWS in
the Python distribution.

To download Python 3.1.3 visit:

A list of changes in 3.1.3 can be found here:

The 3.1 documentation can be found at:

Bugs can always be reported to:


Benjamin Peterson
Release Manager
benjamin at
(on behalf of the entire python-dev team and 3.1.3's contributors)

From benjamin at  Sun Nov 14 00:12:22 2010
From: benjamin at (Benjamin Peterson)
Date: Sat, 13 Nov 2010 17:12:22 -0600
Subject: [Python-Dev] [RELEASED] Python 2.7.1 release candidate 1
Message-ID: <>

On behalf of the Python development team, I'm chuffed to announce a release
candidate of Python 2.7.1.

Please test the release candidate with your packages and report any bugs you
find.  2.7.1 final is scheduled in two weeks.

2.7 includes many features that were first released in Python 3.1. The faster io
module, the new nested with statement syntax, improved float repr, set literals,
dictionary views, and the memoryview object have been backported from 3.1. Other
features include an ordered dictionary implementation, unittests improvements, a
new sysconfig module, auto-numbering of fields in the str/unicode format method,
and support for ttk Tile in Tkinter.  For a more extensive list of changes in
2.7, see or Misc/NEWS in the Python
distribution.

To download Python 2.7.1 visit:

The 2.7.1 changelog is at:

2.7 documentation can be found at:

This is a testing release, so we encourage developers to test it with their
applications and libraries.  Please report any bugs you find, so they can be
fixed in the final release.  The bug tracker is at:


Benjamin Peterson
Release Manager
benjamin at
(on behalf of the entire python-dev team and 2.7.1's contributors)

From victor.stinner at  Sun Nov 14 01:06:55 2010
From: victor.stinner at (Victor Stinner)
Date: Sun, 14 Nov 2010 01:06:55 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
Message-ID: <>

On Saturday 13 November 2010 17:21:37 you wrote:
> On 2010/11/12 4:26, Victor Stinner wrote:
>  > On Thursday 11 November 2010 17:07:28 Hirokazu Yamamoto wrote:
>  >> Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA)
>  >> and only use Win32 WIDE API (ie: GetFileAttributesW)?
>  >> Mainly in posixmodule.c.
>  > 
>  > Even if I hate the MBCS encoding, because it replaces undecodable
> characters
>  > by similar glyphs by default, I'm not certain that it is a good idea
> to drop
>  > the bytes API.
> On 2010/11/12 21:08, Victor Stinner wrote:
> > On Thursday 11 November 2010 23:01:32 you wrote:
> >>> Sure, it will divide the number of lines, of the code specific to
> >>> Windows, by two.
> >> 
> >> Can we get most of the code cleanup benefit without the backwards
> >> compatibility risk by doing the decode from 'mbcs' on our side of the
> >> fence?
> > 
> > I created PyUnicode_FSDecoder, a ParseTuple converter used to work on
> > unicode paths, instead of bytes paths. On Windows, this converter uses
> > mbcs encoding in strict mode, whereas Windows converter uses replace
> > error handler to decode, and ignore to encode. So I don't think that we
> > should use this converter on Windows.
> > 
> >> That is, have code that was the C equivalent of:
> >> 
> >> arg_is_bytes = not isinstance(arg, str)
> >> 
> >> if arg_is_bytes:
> >>    val = _decode_mbcs(arg)
> >>    # Decoding error checking here
> >> 
> >> else:
> >>    val = arg
> >> 
> >> # Common processing using WIDE API
> >> 
> >> if arg_is_bytes:
> >>    result = _encode_mbcs(wide_result)
> >>    # Encoding error checking here
> >> 
> >> else:
> >>    result = wide_result
> > 
> > This doesn't make the code shorter, it may be longer than the actual
> > code, and it is less compliant with the Windows native API...
> Is it possible to implement new PyArg_ParseTuple converter to use
>      PyUnicode_Decode(const char *s,
>                       Py_ssize_t size,
>                       const char *encoding, /* mbcs */
>                       const char *errors) /* replace */
> and use it?

Yes, but how do you check if the input argument is a bytes or a str object 
with your PyArg_Parse converter? You should use "O" format and manually 
convert it to unicode, and then convert the result back to bytes (if the input 
was bytes). I don't think that it makes the code shorter.
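
To make the trade-off concrete, here is a minimal Python sketch of the
decode-on-entry, encode-on-exit pattern quoted above. It is only an
illustration under stated assumptions: normalize_path is a hypothetical
wrapper, os.path.normpath stands in for the call to the wide API, and the
codec falls back to the filesystem encoding outside Windows, where 'mbcs' is
not available. It uses strict error handling; the replace/ignore handlers
mentioned earlier would behave differently on undecodable input.

```python
import os
import sys

# Assumption: 'mbcs' only exists on Windows, so fall back elsewhere
# to keep the sketch runnable on any platform.
ENCODING = "mbcs" if sys.platform == "win32" else sys.getfilesystemencoding()


def normalize_path(arg, encoding=ENCODING):
    """Hypothetical wrapper: accept bytes or str, work on str internally."""
    arg_is_bytes = isinstance(arg, bytes)

    # Decode bytes input up front; a decoding error surfaces here.
    val = arg.decode(encoding, "strict") if arg_is_bytes else arg

    # Common processing on the unicode value (stand-in for the wide API).
    wide_result = os.path.normpath(val)

    # Return the same type the caller passed in.
    if arg_is_bytes:
        return wide_result.encode(encoding, "strict")
    return wide_result
```

Both branches have to be repeated for every function that accepts a path,
which is why this pattern does not obviously shrink the code.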

The code is currently working. The question is if we have to drop the ANSI API 
now, later or never. It looks like the decision moves to "later" (deprecate in 
3.2, remove in 3.3). I still think that dropping it now doesn't really hurt.


From solipsis at  Sun Nov 14 01:19:28 2010
From: solipsis at (Antoine Pitrou)
Date: Sun, 14 Nov 2010 01:19:28 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
References: <>
Message-ID: <>

On Sun, 14 Nov 2010 01:06:55 +0100
Victor Stinner <victor.stinner at> wrote:
> The code is currently working. The question is if we have to drop the ANSI API 
> now, later or never.

If the code is currently working and isn't a security hole, then we
obviously don't "have to".
Apparently several developers "want to", which is different.

> It looks like the decision moves to "later" (deprecate in 
> 3.2, remove in 3.3). I still think that drop now doesn't really hurt.

If you drop code without first deprecating it, chances are it will
hurt someone.  That's why having a deprecation period is the rule we
usually follow (most of the time :-)).



From ncoghlan at  Sun Nov 14 02:06:57 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 14 Nov 2010 11:06:57 +1000
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 14, 2010 at 10:19 AM, Antoine Pitrou <solipsis at> wrote:
> On Sun, 14 Nov 2010 01:06:55 +0100
> Victor Stinner <victor.stinner at> wrote:
>> The code is currently working. The question is if we have to drop the ANSI API
>> now, later or never.
> If the code is currently working and isn't a security hole, then we
> obviously don't "have to".
> Apparently several developers "want to", which is different.

We should also keep in mind that *Microsoft* have chosen to keep the
bytes Win32 APIs around, despite their flaws, all in the name of
backwards compatibility. While the goal of nudging third party
developers towards the superior Unicode APIs is an admirable one, it
is still the case that there is a *lot* of ASCII-only code out there.
E.g. applications could easily be storing filenames in an ASCII only
datastore that provides them back to the application as bytes in 3.x.

>> It looks like the decision moves to "later" (deprecate in
>> 3.2, remove in 3.3). I still think that drop now doesn't really hurt.
> If you drop code without first deprecating it, chances are it will
> hurt someone.  That's why having a deprecation period is the rule we
> usually follow (most of the time :-)).



Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sun Nov 14 02:28:31 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 14 Nov 2010 11:28:31 +1000
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
Message-ID: <>

Following the python-checkins list, I get to see both the current SVN
notifications and the Hg notifications from Tarek's pushes into the
distutils repository. I realised today that there is one key reason as
to why the latter strikes me as a big wall of unintelligible text,
while I find the SVN notification quite easy to read: vertical
whitespace.

The SVN notification uses vertical whitespace to separate out the log
message and the list of files affected clearly from the rest of the
header fields. It makes it *really* easy to see at a glance what the
checkin was about and which files were affected. For the Hg
notification, both of these fields are embedded in a big header block
along with all the other fields, so it is quite difficult to make out
the same information.

It would be really nice if the formatting could be improved for the
email notifications on the Hg side when we adopt it for the main
CPython repository. The changes would be to:
- add a blank line before and after the summary field
- add a carriage return between the header and content for the summary
field and the files field
- indent the list of files by two spaces and use a carriage return
rather than a comma to separate named files

I've included an example below based on one of Tarek's recent pushes:

Current Hg notification header and start of first diff:
tarek.ziade pushed 7ebf14ab2840 to distutils2:
changeset:   816:7ebf14ab2840
tag:         tip
user:        Tarek Ziade <tarek at>
date:        Sat Nov 13 12:40:33 2010 +0100
summary:     compiler_type -> name
files:       distutils2/compiler/,
distutils2/compiler/, distutils2/compiler/,
distutils2/compiler/, distutils2/tests/

diff --git a/distutils2/compiler/ b/distutils2/compiler/
--- a/distutils2/compiler/
+++ b/distutils2/compiler/
@@ -13,7 +13,7 @@

Proposed change to separate out summary and files fields:
tarek.ziade pushed 7ebf14ab2840 to distutils2:
changeset:   816:7ebf14ab2840
tag:         tip
user:        Tarek Ziade <tarek at>
date:        Sat Nov 13 12:40:33 2010 +0100

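summary:
compiler_type -> name

files:
  distutils2/compiler/
  distutils2/compiler/
  distutils2/compiler/
  distutils2/compiler/
  distutils2/compiler/
  distutils2/compiler/
  distutils2/compiler/
  distutils2/tests/
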
diff --git a/distutils2/compiler/ b/distutils2/compiler/
--- a/distutils2/compiler/
+++ b/distutils2/compiler/
@@ -13,7 +13,7 @@


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From at  Sun Nov 14 03:40:22 2010
From: at (David Bolen)
Date: Sat, 13 Nov 2010 21:40:22 -0500
Subject: [Python-Dev] Stable buildbots
References: <>
Message-ID: <>

Antoine Pitrou <solipsis at> writes:

> (even though the Windows buildbots give
> a rather unconventional meaning to the word "stability").

Nag, nag, nag .... :-)

There's been a bit of an uptick in the past few weeks with hung
python_d processes (not a new issue, but it ebbs and flows), so I'm
going to try to pull together a monitor script this weekend to start
killing them off automatically.  Should at least get rid of some of
the low hanging fruit that interferes with subsequent builds.

-- David

From tjreedy at  Sun Nov 14 04:10:11 2010
From: tjreedy at (Terry Reedy)
Date: Sat, 13 Nov 2010 22:10:11 -0500
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <>
References: <>
Message-ID: <ibnjui$1m8$>

On 11/13/2010 8:28 PM, Nick Coghlan wrote:
> Following the python-checkins list, I get to see both the current SVN
> notifications and the Hg notifications from Tarek's pushes into the
> distutils repository. I realised today that there is one key reason as
> to why the latter strikes me as a big wall of unintelligible text,
> while I find the SVN notification quite easy to read: vertical
> whitespace.
> The SVN notification uses vertical whitespace to separate out the log
> message and the list of files affected clearly from the rest of the
> header fields. It makes it *really* easy to see at a glance what the
> checkin was about and which files were affected. For the Hg
> notification, both of these fields are embedded in a big header block
> along with all the other fields, so it is quite difficult to make out
> the same information.
> It would be really nice if the formatting could be improved for the
> email notifications on the Hg side when we adopt it for the main
> CPython repository. The changes would be to:
> - add a blank line before and after the summary field
> - add a carriage return between the header and content for the summary
> field and the files field
> - indent the list of files by two spaces and use a carriage return
> rather than a comma to separate named files
> I've included an example below based on one of Tarek's recent pushes:
> Current Hg notification header and start of first diff:
> ================================================
> tarek.ziade pushed 7ebf14ab2840 to distutils2:
> changeset:   816:7ebf14ab2840
> tag:         tip
> user:        Tarek Ziade<tarek at>
> date:        Sat Nov 13 12:40:33 2010 +0100
> summary:     compiler_type ->  name
> files:       distutils2/compiler/,
> distutils2/compiler/, distutils2/compiler/,
> distutils2/compiler/,
> distutils2/compiler/,
> distutils2/compiler/,
> distutils2/compiler/, distutils2/tests/
> diff --git a/distutils2/compiler/ b/distutils2/compiler/
> --- a/distutils2/compiler/
> +++ b/distutils2/compiler/
> @@ -13,7 +13,7 @@
> ====================================================
> Proposed change to separate out summary and files fields:
> ================================================
> tarek.ziade pushed 7ebf14ab2840 to distutils2:
> changeset:   816:7ebf14ab2840
> tag:         tip
> user:        Tarek Ziade<tarek at>
> date:        Sat Nov 13 12:40:33 2010 +0100
> summary:
> compiler_type ->  name
> files:
>    distutils2/compiler/
>    distutils2/compiler/
>    distutils2/compiler/
>    distutils2/compiler/
>    distutils2/compiler/
>    distutils2/compiler/
>    distutils2/compiler/
>    distutils2/tests/
> diff --git a/distutils2/compiler/ b/distutils2/compiler/
> --- a/distutils2/compiler/
> +++ b/distutils2/compiler/
> @@ -13,7 +13,7 @@
> ====================================================

Much better, except possibly for the \n after 'summary:'

Terry Jan Reedy

From rdmurray at  Sun Nov 14 04:40:52 2010
From: rdmurray at (R. David Murray)
Date: Sat, 13 Nov 2010 22:40:52 -0500
Subject: [Python-Dev] unexpected traceback/stack behavior with chained
	exceptions (issue 1553375)
Message-ID: <>

Issue 1553375 [1] proposes a patch to add an 'allframes' option to the
traceback printing and formatting routines so that the full traceback
from the top of the execution stack down to the exception is printed,
instead of just from the point where the exception is caught down to
the exception.  This is useful when the reason you are capturing the
traceback is to log it, and you have several different points in your
application where you do such traceback logging.  You often really want
to know the entire stack in that case; logging only from the capture
point down can lose important debugging information depending on how
the application is structured.

The patch seems to work well, except for one problem that I don't have
enough CPython internals knowledge to understand.  If the traceback we
are printing has a chained traceback, the resulting full traceback shows
the line that is printing the traceback instead of the line from the 'try'
block.  (It prints the expected line if there is no chained traceback).

So, is this a failure in my understanding of how tracebacks are supposed
to work, or a bug in how chained tracebacks are constructed?


R. David Murray                            

From ncoghlan at  Sun Nov 14 09:22:31 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 14 Nov 2010 18:22:31 +1000
Subject: [Python-Dev] Stable buildbots
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 14, 2010 at 12:40 PM, David Bolen < at> wrote:
> Antoine Pitrou <solipsis at> writes:
>> (even though the Windows buildbots give
>> a rather unconventional meaning to the word "stability").
> Nag, nag, nag .... :-)
> There's been a bit of an uptick in the past few weeks with hung
> python_d processes (not a new issue, but it ebbs and flows), so I'm
> going to try to pull together a monitor script this weekend to start
> killing them off automatically.  Should at least get rid of some of
> the low hanging fruit that interferes with subsequent builds.

Do we have any idea why the workaround to avoid the popup windows
stopped working? (assuming it ever worked reliably - I thought it did,
but that impression may have been incorrect)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sun Nov 14 09:25:27 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 14 Nov 2010 18:25:27 +1000
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <ibnjui$1m8$>
References: <>
Message-ID: <>

On Sun, Nov 14, 2010 at 1:10 PM, Terry Reedy <tjreedy at> wrote:
> Much better except possible for \n after 'summary:'

That extra line break helps more for multi-line checkin messages
(which happen reasonably often). Doesn't really bother me either way -
I'm mainly looking for info on who has the ability to change the
format in the first place :)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From at  Sun Nov 14 09:48:53 2010
From: at (David Bolen)
Date: Sun, 14 Nov 2010 03:48:53 -0500
Subject: [Python-Dev] Stable buildbots
References: <>
Message-ID: <>

Nick Coghlan <ncoghlan at> writes:

> Do we have any idea why the workaround to avoid the popup windows
> stopped working? (assuming it ever worked reliably - I thought it did,
> but that impression may have been incorrect)

Oh, the pop-up handling for the RTL dialogs still seems to be working
fine (at least I haven't seen any since I put it in place).  That, plus
the original buildbot tweaks to block any OS popups still looks solid
for avoiding any dialogs that block a test process.

This is a completely separate issue, though probably around just as
long, and like the popup problem its frequency changes over time.  By
"hung" here I'm referring to cases where something must go wrong with
a test and/or its cleanup such that a python_d process remains
running, usually several of them at the same time.  So I end up with a
bunch of python_d processes in the background (but not with any
dialogs pending), which eventually cause errors the next time the same
builder is used, since the file remains in use.

I expect some of this may be the lack of a good process group cleanup
under Windows, though the root cause may not be unique to Windows.  I
see something very similar with reasonable frequency on my OSX Tiger
buildbot as well.  But since the filesystem there lets the build
tree get cleaned and rebuilt even with a stranded executable, the
impact on subsequent tests is smaller than on Windows, though the OSX
processes do burn a ton of CPU.  I run a script on OSX to kill them
off, but that was quick to whip up since in those cases the stranded
processes all end up getting owned by init so it's a simple ps grep
and kill.  In the Windows case I'll probably just set a time limit so
if the processes have been around more than a few hours I figure
they're safe to kill.
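
As a rough sketch of that time-limit idea (not the actual monitor script),
the selection step could look like the Python below. The names stale_pids and
MAX_AGE_SECONDS, the three-hour threshold, and the shape of the input tuples
are all assumptions for illustration; enumerating the processes and killing
them (for example with pskill) is deliberately left out.

```python
import time

# "A few hours" is not pinned down above; three hours is an assumed value.
MAX_AGE_SECONDS = 3 * 60 * 60


def stale_pids(processes, now=None, name="python_d.exe",
               max_age=MAX_AGE_SECONDS):
    """Return the PIDs of processes called `name` older than `max_age`.

    `processes` is an iterable of (pid, image_name, start_time) tuples,
    with start_time in seconds since the epoch. How that list is gathered
    is platform specific and outside this sketch.
    """
    if now is None:
        now = time.time()
    wanted = name.lower()
    return [
        pid
        for pid, image, started in processes
        if image.lower() == wanted and now - started > max_age
    ]
```

Keeping the selection separate from the kill step makes it easy to log what
would be killed before enabling the destructive part.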

-- David

From martin at  Sun Nov 14 11:09:08 2010
From: martin at ("Martin v. Löwis")
Date: Sun, 14 Nov 2010 11:09:08 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

> If the code is currently working and isn't a security hole, then we
> obviously don't "have to".
> Apparently several developers "want to", which is different.

In case the motivation for that isn't clear: it would produce a
significant code reduction, and therefore ease maintenance.


From martin at  Sun Nov 14 11:14:27 2010
From: martin at ("Martin v. Löwis")
Date: Sun, 14 Nov 2010 11:14:27 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>	<>	<>	<>	<>
Message-ID: <>

> We should also keep in mind that *Microsoft* have chosen to keep the
> bytes Win32 APIs around, despite their flaws, all in the name of
> backwards compatibility.

Of course, Microsoft is in a different position. If they remove a
functionality in some release, their users typically can't go back
and continue to use the old version - at least not on the same computer.

For Python, it's different: our users can go back to use an old version
if the new one breaks their applications. And we do break applications
from time to time, most notably with the introduction of Python 3.

> While the goal of nudging third party
> developers towards the superior Unicode APIs is an admirable one, it
> is still the case that there is a *lot* of ASCII-only code out there.

The question is: is there also a lot of ASCII-only Python 3 software out
there? And would developers of such software have difficulty porting
it to a Unicode file name API?

> E.g. applications could easily be storing filenames in an ASCII only
> datastore that provides them back to the application as bytes in 3.x.

That's speculation. My speculation would be that authors of such a
datastore find that they can't even print the data anymore in a
reasonable way, so they changed their API to return strings (i.e.
decoding from ASCII) when they ported it to Python 3. They wouldn't
even consider it a change, because it returned strings all the time,
and now Python 3 has a different string type.

>> If you drop code without first deprecating it, chances are it will
>> hurt someone.  That's why having a deprecation period is the rule we
>> usually follow (most of the time :-)).

I'm in favor of deprecating it first.


From martin at  Sun Nov 14 11:18:07 2010
From: martin at ("Martin v. Löwis")
Date: Sun, 14 Nov 2010 11:18:07 +0100
Subject: [Python-Dev] Stable buildbots
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

> This is a completely separate issue, though probably around just as
> long, and like the popup problem its frequency changes over time.  By
> "hung" here I'm referring to cases where something must go wrong with
> a test and/or its cleanup such that a python_d process remains
> running, usually several of them at the same time.  So I end up with a
> bunch of python_d processes in the background (but not with any
> dialogs pending), which eventually cause errors during attempts the
> next time the same builder is used since the file remains in use.

This is what kill_python.exe is supposed to solve. So I recommend
investigating why it fails to kill the hanging Pythons.


From martin at  Sun Nov 14 11:20:47 2010
From: martin at ("Martin v. Löwis")
Date: Sun, 14 Nov 2010 11:20:47 +0100
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <>
References: <>	<ibnjui$1m8$>
Message-ID: <>

Am 14.11.2010 09:25, schrieb Nick Coghlan:
> On Sun, Nov 14, 2010 at 1:10 PM, Terry Reedy <tjreedy at> wrote:
>> Much better except possible for \n after 'summary:'
> That extra line break helps more for multi-line checkin messages
> (which happen reasonably often). Doesn't really bother me either way -
> I'm mainly looking for info on who has the ability to change the
> format in the first place :)


You should have push permissions to that repository.


From at  Sun Nov 14 11:32:25 2010
From: at (David Bolen)
Date: Sun, 14 Nov 2010 05:32:25 -0500
Subject: [Python-Dev] Stable buildbots
References: <>
Message-ID: <>

"Martin v. Löwis" <martin at> writes:

> This is what kill_python.exe is supposed to solve. So I recommend to
> investigate why it fails to kill the hanging Pythons.

Yeah, I know, and I can't say I disagree in principle - not sure why
Windows doesn't let the kill in that module work (or if there's an
issue actually running it under all conditions).

At the moment though, I do know that using the sysinternals pskill
utility externally (which is what I currently do interactively)
definitely works so to be honest, automating that is a guaranteed bang
for buck at this point with no analysis involved.  Looking into
kill_python or its use can be a follow-on.

-- David

From ncoghlan at  Sun Nov 14 12:41:59 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 14 Nov 2010 21:41:59 +1000
Subject: [Python-Dev] unexpected traceback/stack behavior with chained
 exceptions (issue 1553375)
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 14, 2010 at 1:40 PM, R. David Murray <rdmurray at> wrote:
> Issue 1553375 [1] proposes a patch to add an 'allframes' option to the
> traceback printing and formatting routines so that the full traceback
> from the top of the execution stack down to the exception is printed,
> instead of just from the point where the exception is caught down to
> the exception.  This is useful when the reason you are capturing the
> traceback is to log it, and you have several different points in your
> application where you do such traceback logging.  You often really want
> to know the entire stack in that case; logging only from the capture
> point down can lose important debugging information depending on how
> the application is structured.
> The patch seems to work well, except for one problem that I don't have
> enough CPython internals knowledge to understand.  If the traceback we
> are printing has a chained traceback, the resulting full traceback shows
> the line that is printing the traceback instead of the line from the 'try'
> block.  (It prints the expected line if there is no chained traceback).
> So, is this a failure in my understanding of how tracebacks are supposed
> to work, or a bug in how chained tracebacks are constructed?

It looks to me like you're grabbing a reference to a frame that is
currently executing and that frame has moved on since the exception
was thrown (to your exception handler). The print_stack() call in the
patch then accurately reflects this.

The other thing to keep in mind is that the exception currently being
handled is the *last* one produced by _iter_chain - all of the rest
have already been caught and handled, it was the handlers for those
that raised the subsequent exceptions in the chain.

Basically, the whole patch strikes me as fundamentally misguided. If
someone wants this information in their exception handler, they should
put a print_stack() with the appropriate header information after the
print_exception() call rather than trying to embed it in the display
of the exception information. logging could also gain an independent
"stack_trace=True" option to request inclusion of a stack trace
independently of whether or not exception information is included.
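
A minimal sketch of that suggestion, using the formatting counterparts of
print_exception() and print_stack() so the result can be logged as one
string. The helper name describe_failure and the header text are made up for
illustration, and the helper must be called from inside an except block.

```python
import traceback


def describe_failure():
    """Return the exception text followed by the current call stack."""
    # The exception traceback: from the frame that caught it down to the raise.
    parts = [traceback.format_exc()]
    # Then the full stack of the code doing the logging, under its own header.
    parts.append("Stack at the point of logging (most recent call last):\n")
    parts.extend(traceback.format_stack())
    return "".join(parts)


def risky():
    raise ValueError("boom")


def handler():
    try:
        risky()
    except ValueError:
        return describe_failure()


report = handler()
```

The two pieces stay clearly separated in the output, so the stack shown is
explicitly the stack at logging time rather than part of the exception.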

(Side note: there's a typo in the format_tb docstring claiming it is a
wrapper around extract_stack - that's incorrect, it is a wrapper
around extract_tb)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sun Nov 14 12:44:19 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 14 Nov 2010 21:44:19 +1000
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 14, 2010 at 8:14 PM, "Martin v. Löwis" <martin at> wrote:
> I'm in favor of deprecating it first.

Aye. I've made the best case I could for keeping it, and even I don't
find it terribly convincing. So deprecation for 3.2 sounds like a
reasonable option.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sun Nov 14 12:46:41 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 14 Nov 2010 21:46:41 +1000
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 14, 2010 at 8:20 PM, "Martin v. Löwis" <martin at> wrote:
> Am 14.11.2010 09:25, schrieb Nick Coghlan:
>> On Sun, Nov 14, 2010 at 1:10 PM, Terry Reedy <tjreedy at> wrote:
>>> Much better except possible for \n after 'summary:'
>> That extra line break helps more for multi-line checkin messages
>> (which happen reasonably often). Doesn't really bother me either way -
>> I'm mainly looking for info on who has the ability to change the
>> format in the first place :)
> See
> You should have push permissions to that repository.

Thanks - it will give me a chance to use Hg for something meaningful as well.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sun Nov 14 13:39:40 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 14 Nov 2010 22:39:40 +1000
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 14, 2010 at 8:20 PM, "Martin v. Löwis" <martin at> wrote:
> See
> You should have push permissions to that repository.

I suspect my hg-fu is inadequate at this point - I get an 'access
to repository "" not permitted' error when I try to
push the modified file to "ssh://hg at". (I actually
got the same error when cloning, but if I understand hg correctly, it
shouldn't matter that my clone came from the http URL rather than the
ssh one).

My username and email address in my hgrc file match those in Dirkjan's
author map, so I'm not sure what's going on there.

The change I tried to make was to replace the last couple of lines of
the header creation's incoming() function with the following 3
lines:

    body += log.splitlines()[:-2]
    body += ['summary:\n  ' + ctx.description(), '']
    body += ['files:\n  ' + '\n  '.join(ctx.files()), '']
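
To show what those three lines would produce, here is a standalone sketch
that runs without Mercurial. It rests on assumptions: format_header is a
hypothetical wrapper, the description and file list are passed in directly
instead of coming from ctx, the last two lines of the log text are assumed to
be the summary and files fields, body is assumed to be a list of lines joined
with newlines, and the file names are invented.

```python
def format_header(log, description, files):
    """Rebuild the notification header with summary and files set apart."""
    body = []
    # Keep every header line except the last two (summary and files).
    body += log.splitlines()[:-2]
    # Summary on its own indented line, followed by a blank line.
    body += ['summary:\n  ' + description, '']
    # One file per line, indented by two spaces.
    body += ['files:\n  ' + '\n  '.join(files), '']
    return '\n'.join(body)


log = (
    "changeset:   816:7ebf14ab2840\n"
    "tag:         tip\n"
    "date:        Sat Nov 13 12:40:33 2010 +0100\n"
    "summary:     compiler_type -> name\n"
    "files:       pkg/a.py, pkg/b.py"
)
header = format_header(log, "compiler_type -> name", ["pkg/a.py", "pkg/b.py"])
print(header)
```

Note that only the first line of a multi-line description is indented this
way; the remaining lines would need the same treatment as the file list.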


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From g.brandl at  Sun Nov 14 14:05:12 2010
From: g.brandl at (Georg Brandl)
Date: Sun, 14 Nov 2010 14:05:12 +0100
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <>
References: <>	<ibnjui$1m8$>	<>	<>
Message-ID: <ibomt0$cs0$>

Am 14.11.2010 13:39, schrieb Nick Coghlan:
> On Sun, Nov 14, 2010 at 8:20 PM, "Martin v. Löwis" <martin at> wrote:
>> See
>> You should have push permissions to that repository.
> I suspect my hg-fu is inadequate to at this point - I get an 'access
> to repository "" not permitted' error when I try to
> push the modified file to "ssh://hg at".

Martin told you only half the truth: the SSH URL is (currently)
<ssh://hg at>.  I think we will change that to
remove the /repos/ part before going live with the cpython repo, but
the hg username remains, corresponding to the pythondev username for SVN.

> (I actually
> got the same error when cloning, but if I understand hg correctly, it
> shouldn't matter that my clone came from the http URL rather than the
> ssh one).

That's correct.

> My username and email address in my hgrc file match those in Dirkjan's
> author map, so I'm not sure what's going on there.

The usernames and email addresses you use for commits don't matter; as
long as you can connect via SSH you can push commits with any author.


From p.f.moore at  Sun Nov 14 18:31:19 2010
From: p.f.moore at (Paul Moore)
Date: Sun, 14 Nov 2010 17:31:19 +0000
Subject: [Python-Dev] Issues 9931 and 9055 - test_ttk_guionly and
 buildbot run as a service
In-Reply-To: <ibjs91$t2m$>
References: <>
Message-ID: <>

On 12 November 2010 17:07, Terry Reedy <tjreedy at> wrote:
> On 11/12/2010 3:44 AM, Paul Moore wrote:
>> Hi,
>> My buildbot has been failing for some time because of these 2 issues,
>> both related to the fact that tests are hanging when run as a service
>> (and hence have no display to open GUI elements on). Both issues have
>> patches, and as far as I am aware, the patches fix the issues
>> reasonably well. What can I do to move these 2 issues forwards? As
>> things stand, my buildbot is not providing a lot of value on the 3.x
>> branch :-(
> is marked as a 2.7 issue only, perhaps fixed by Tim Golden's committed
> patches. Should it be re-versioned for 3.1/2? There is no patch file
> attached, though perhaps the code in Yamamoto's message is meant as such
> (but for which version?). So the first thing you could do is clarify the
> current status and remaining issue on the tracker.

Ah, sorry. I misremembered the history - you are right, I suspect this
is fixed (at least to the extent that my buildbot isn't permanently
red :-))

On rereading, I get the impression that a cleaner fix may be possible
by using the ideas in the patch for 9931, but that's probably for
another time.

> by Yamamoto is marked for all 3 versions. It seems to be a similar issue,
> though marked 'test' rather than 'ctypes'. It does have a patch by him
> apparently based on his previous comments. The issue has no responses and
> needs a patch review. So the first thing you could do is to provide one;-).
> If it looks great (no changes that you can think of) and works great, say
> so. Then it can move on to commit review stage.

OK, thanks. I'll see if I can provide a review, and see how it goes from there.

Really, it's not that urgent that this gets fixed in the wider scheme
of things - but as my buildbot is a bit useless while the problem
remains, I'm motivated to do what I can to work on it. I'm just a
little limited in what I can do, hence the request for suggestions.

> PS. Providing links like the above makes it easier for multiple people to
> take a look and respond.

You're right, and I apologise for that. I sent the email in a hurry
and didn't consider others before sending.


From p.f.moore at  Sun Nov 14 18:49:36 2010
From: p.f.moore at (Paul Moore)
Date: Sun, 14 Nov 2010 17:49:36 +0000
Subject: [Python-Dev] Stable buildbots
In-Reply-To: <>
References: <>
Message-ID: <>

On 14 November 2010 02:40, David Bolen < at> wrote:
> There's been a bit of an uptick in the past few weeks with hung
> python_d processes (not a new issue, but it ebbs and flows), so I'm
> going to try to pull together a monitor script this weekend to start
> killing them off automatically.  Should at least get rid of some of
> the low hanging fruit that interferes with subsequent builds.

My buildslave (x86 XP-5, see runs
buildbot as a service. I set it up that way as I assumed that would be
the most sensible approach to avoid manual intervention on reboots,
keeping a user session permanently running, etc. But it seems that
there are a few areas where things don't work quite right when run
from a service (see, for example,
and I assumed that some of my hung python_d processes were related to

Do you run your slave as a service? (And for that matter, what do
other Windows slave owners do?) Are there any "best practices" for
ongoing admin of a Windows buildslave that might be worth collecting
together? (I'll try to put some notes on what I've found together -
maybe a page on the Python wiki would be the best place to collect


From martin at  Sun Nov 14 19:27:22 2010
From: martin at ("Martin v. Löwis")
Date: Sun, 14 Nov 2010 19:27:22 +0100
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <>
References: <>	<ibnjui$1m8$>	<>	<>
Message-ID: <>

> I suspect my hg-fu is inadequate to at this point - I get an 'access
> to repository "" not permitted' error when I try to
> push the modified file to "ssh://hg at".


Try
ssh://hg at

I think this is something that needs to be fixed: I fail to see the
point of having this extra repos/ directory in the path (even though
it's certainly useful to have them all in a separate directory on disk).

It's also unfortunate that hg complains it can't give access to /hooks,
when the problem really is that the repository doesn't exist. I guess
this is because it tries to create it, and then finds that it can't.


From solipsis at  Sun Nov 14 19:35:07 2010
From: solipsis at (Antoine Pitrou)
Date: Sun, 14 Nov 2010 19:35:07 +0100
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
References: <>
Message-ID: <>

On Sun, 14 Nov 2010 19:27:22 +0100
"Martin v. Löwis" <martin at> wrote:
> > I suspect my hg-fu is inadequate to at this point - I get an 'access
> > to repository "" not permitted' error when I try to
> > push the modified file to "ssh://hg at".
> Try
> ssh://hg at
> I think this is something that needs to be fixed: I fail to see the
> point of having this extra repos/ directory in the path (even though
> it's certainly useful to have them all in a separate directory on disk).

IIUC, "repos/hooks" is interpreted as a relative path to the "hg"
user's HOME. The "ssh://" scheme executes remote hg over an ssh
session, I don't think there's any additional magic.
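
A minimal Python sketch of the path interpretation described above. The host name `hg.example.org` is a placeholder (the real one is not shown here) and `remote_path` is a hypothetical helper, not part of Mercurial. It assumes Mercurial's documented convention that a path after a single slash is relative to the remote user's home directory, while a doubled slash makes it absolute.

```python
from urllib.parse import urlsplit


def remote_path(url):
    """Classify the path part of an ssh:// URL the way Mercurial reads it.

    Hypothetical helper, for illustration only.
    """
    parts = urlsplit(url)
    if parts.scheme != "ssh":
        raise ValueError("not an ssh:// URL: %r" % url)
    # Drop the single slash that separates the host from the path.
    path = parts.path[1:]
    kind = "absolute" if path.startswith("/") else "relative to HOME"
    return kind, path


# Placeholder host; only the shape of the path matters here.
print(remote_path("ssh://hg@hg.example.org/repos/hooks"))
print(remote_path("ssh://hg@hg.example.org//srv/hg/hooks"))
```

Under that convention, hiding the "repos/" prefix from users would need something on the server side to map the short path onto the real directory.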



From martin at  Sun Nov 14 19:49:44 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 14 Nov 2010 19:49:44 +0100
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <>
References: <>	<ibnjui$1m8$>	<>	<>	<>	<>
Message-ID: <>

>> I think this is something that needs to be fixed: I fail to see the
>> point of having this extra repos/ directory in the path (even though
>> it's certainly useful to have them all in a separate directory on disk).
> IIUC, "repos/hooks" is interpreted as a relative path to the "hg"
> user's HOME. The "ssh://" scheme executes remote hg over an ssh
> session, I don't think there's any additional magic.

Correct. However, this just means that additional magic is required.


From vinay_sajip at  Sun Nov 14 21:05:16 2010
From: vinay_sajip at (Vinay Sajip)
Date: Sun, 14 Nov 2010 20:05:16 +0000 (UTC)
Subject: [Python-Dev] unexpected traceback/stack behavior with chained
	exceptions (issue 1553375)
References: <>
Message-ID: <>

Nick Coghlan <ncoghlan <at>> writes:

> of the exception information. logging could also gain an independent
> "stack_trace=True" option to request inclusion of a stack trace
> independently of whether or not exception information is included.

Good point, Nick. There are times when you'd want to know how you got to a
certain point in code, irrespective of whether any exception occurred. So your
suggestion makes sense, and I'll try and see if I can get it into 3.2.

Another benefit of this is that a user only gets this if they want it; if I were
to use the allframes flag in logging, then everyone would get the print_stack()
even if they didn't want it.
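
A short sketch of what opting in could look like from the caller's side. It assumes the keyword ends up being spelled `stack_info` (the name used in Python 3.2 and later, rather than the "stack_trace" wording above); `ListHandler`, `do_work` and the logger name are made up for the example.

```python
import logging


class ListHandler(logging.Handler):
    """Collect records in a list so they can be inspected afterwards."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("stack_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)


def do_work():
    # No exception is active here; we only want to know how we got here.
    logger.debug("reached do_work", stack_info=True)


do_work()
record = handler.records[0]
print(record.exc_info)    # no exception information is attached
print(record.stack_info)  # text of the call stack leading to the logging call
```

Callers that do not pass the keyword get records without any stack text, which is the "only if they want it" behaviour described above.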


Vinay Sajip

From g.brandl at  Sun Nov 14 21:36:37 2010
From: g.brandl at (Georg Brandl)
Date: Sun, 14 Nov 2010 21:36:37 +0100
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <>
References: <>	<ibnjui$1m8$>	<>	<>	<>	<>
Message-ID: <ibphbe$r6f$>

Am 14.11.2010 19:35, schrieb Antoine Pitrou:
> On Sun, 14 Nov 2010 19:27:22 +0100
> "Martin v. Löwis" <martin at> wrote:
>> > I suspect my hg-fu is inadequate to at this point - I get an 'access
>> > to repository "" not permitted' error when I try to
>> > push the modified file to "ssh://hg at".
>> Try
>> ssh://hg at
>> I think this is something that needs to be fixed: I fail to see the
>> point of having this extra repos/ directory in the path (even though
>> it's certainly useful to have them all in a separate directory on disk).
> IIUC, "repos/hooks" is interpreted as a relative path to the "hg"
> user's HOME. The "ssh://" scheme executes remote hg over an ssh
> session, I don't think there's any additional magic.

There is; we already have a custom authorized_keys command in place
to call the hg-ssh wrapper, and all that's needed is to customize that
command a bit more.


From at  Sun Nov 14 22:24:55 2010
From: at (David Bolen)
Date: Sun, 14 Nov 2010 16:24:55 -0500
Subject: [Python-Dev] Stable buildbots
References: <>
Message-ID: <>

Paul Moore <p.f.moore at> writes:

> Do you run your slave as a service? (And for that matter, what do
> other Windows slave owners do?) Are there any "best practices" for
> ongoing admin of a Windows buildslave that might be worth collecting
> together? (I'll try to put some notes on what I've found together -
> maybe a page on the Python wiki would be the best place to collect
> them).

I've always run my slave interactively under Windows (well, started it
interactively).  Not sure if I tried a service in the beginning or
not, it was a while ago.  So your slave is probably the guinea pig for
service operation.

There is (for which I
can't take any credit).  It could probably use a little love and
updating, and it's largely aimed at setting things up, but not as much
operating it.

I think the only stuff I'm doing on my slave above and beyond the
basic setup is a small patch to buildbot (circa 2007, couldn't get it
back upstream at the time) to use SetErrorMode to disable OS pop-ups,
and the AutoIt script (from earlier this year) to auto-acknowledge C
RTL pop-ups.  The kill script in this thread as a safety net above
kill_python would be a third tweak.  There was a buildbot fix for
uploading that was only needed for the short-lived MSI generation, and
which I think later buildbot versions have their own changes for.
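
For the kill script mentioned above, the selection logic might look roughly like the sketch below. Everything in it is an assumption for illustration: the `ProcInfo` tuple, the three-hour limit, and the way the process snapshot is obtained (not shown; it would have to come from some process-listing tool). Only the `taskkill /F /PID` command is a standard Windows facility.

```python
import subprocess
import time
from collections import namedtuple

# Hypothetical snapshot of one process; how it is gathered is left open.
ProcInfo = namedtuple("ProcInfo", "pid name start_time")

MAX_AGE_SECONDS = 3 * 60 * 60  # an arbitrary "few hours" limit


def stale_processes(procs, now, name="python_d.exe", max_age=MAX_AGE_SECONDS):
    """Return the processes called `name` that have run longer than max_age."""
    return [p for p in procs
            if p.name.lower() == name and now - p.start_time > max_age]


def kill(pid):
    """Force-terminate one process with the Windows taskkill command."""
    subprocess.call(["taskkill", "/F", "/PID", str(pid)])


now = time.time()
snapshot = [
    ProcInfo(101, "python_d.exe", now - 5 * 60 * 60),  # stranded for hours
    ProcInfo(102, "python_d.exe", now - 60),           # probably a live test
    ProcInfo(103, "buildbot.exe", now - 9 * 60 * 60),  # not a test process
]
for proc in stale_processes(snapshot, now):
    print("would kill", proc.pid)  # call kill(proc.pid) on a real slave
```

Keeping the age check separate from the actual kill makes it easy to dry-run the script before letting it loose.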

I'd be happy to work with you if you're willing to combine/edit our
bits of information.  Probably something we can take off-list, so just
let me know.

-- David

From ncoghlan at  Mon Nov 15 12:45:46 2010
From: ncoghlan at (Nick Coghlan)
Date: Mon, 15 Nov 2010 21:45:46 +1000
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 15, 2010 at 4:27 AM, "Martin v. Löwis" <martin at> wrote:
>> I suspect my hg-fu is inadequate to at this point - I get an 'access
>> to repository "" not permitted' error when I try to
>> push the modified file to "ssh://hg at".
> Try
> ssh://hg at

And done :)

Hopefully I didn't break anything in the process...


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Mon Nov 15 14:24:01 2010
From: ncoghlan at (Nick Coghlan)
Date: Mon, 15 Nov 2010 23:24:01 +1000
Subject: [Python-Dev] [Python-checkins] r86467 - in
 python/branches/py3k: Doc/library/logging.rst Lib/logging/
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 15, 2010 at 7:33 AM, vinay.sajip <python-checkins at> wrote:
> +   .. attribute:: stack_info
> +
> +      Stack frame information (where available) from the bottom of the stack
> +      in the current thread, up to and including the stack frame of the
> +      logging call which resulted in the creation of this record.
> +

Interesting - my mental model of the call stack is that the outermost
frame is the top of the stack and the stack grows downwards as calls
are executed (there are a few idioms like "recursive descent", the
intuitive parallel with "inner functions" being lower in the stack
than "outer functions" as well as the order in which Python prints
stack traces that reinforce this view).

According to the sys._getframe documentation, my mental model is wrong though :)

(I'll note that the documentation of frame objects in the language
reference itself appears a little confused on the matter - either that
or I'm completely misunderstanding when writing to f_lineno will work)
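
A tiny CPython sketch of the convention in question: depth 0 is the currently executing frame (the "top" in the documentation's terms), and `f_back` walks towards the outermost caller. The function names `outer` and `inner` are made up for the example, and `sys._getframe` is not guaranteed to exist in every Python implementation.

```python
import sys


def outer():
    return inner()


def inner():
    # Depth 0 is the frame of the function that is executing right now.
    current = sys._getframe(0)
    # f_back links each frame to the frame of its caller.
    caller = current.f_back
    return current.f_code.co_name, caller.f_code.co_name


print(outer())
```

So the most recent call sits at the top, and the "bottom of the stack" in the stack_info wording is the outermost frame.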


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From reid.kleckner at  Mon Nov 15 18:01:36 2010
From: reid.kleckner at (Reid Kleckner)
Date: Mon, 15 Nov 2010 12:01:36 -0500
Subject: [Python-Dev] [Python-checkins] r86467 - in
 python/branches/py3k: Doc/library/logging.rst Lib/logging/
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 15, 2010 at 8:24 AM, Nick Coghlan <ncoghlan at> wrote:
> On Mon, Nov 15, 2010 at 7:33 AM, vinay.sajip <python-checkins at> wrote:
>> +   .. attribute:: stack_info
>> +
>> +      Stack frame information (where available) from the bottom of the stack
>> +      in the current thread, up to and including the stack frame of the
>> +      logging call which resulted in the creation of this record.
>> +
> Interesting - my mental model of the call stack is that the outermost
> frame is the top of the stack and the stack grows downwards as calls
> are executed (there are a few idioms like "recursive descent", the
> intuitive parallel with "inner functions" being lower in the stack
> than "outer functions" as well as the order in which Python prints
> stack traces that reinforce this view).

Probably because the C stack tends to grow down for most
architectures, but most stack data structures are implemented over
arrays and hence, grow upwards from 0.  Depending on the author's
background, they probably use one mental model or the other.


From techtonik at  Mon Nov 15 21:43:08 2010
From: techtonik at (anatoly techtonik)
Date: Mon, 15 Nov 2010 22:43:08 +0200
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <ibnjui$1m8$>
References: <>
Message-ID: <>

On Sun, Nov 14, 2010 at 5:10 AM, Terry Reedy <tjreedy at> wrote:
> On 11/13/2010 8:28 PM, Nick Coghlan wrote:
>> Following the python-checkins list, I get to see both the current SVN
>> notifications and the Hg notifications from Tarek's pushes into the
>> distutils repository. I realised today that there is one key reason as
>> to why the latter strikes me as a big wall of unintelligible text,
>> while I find the SVN notification quite easy to read: vertical
>> whitespace.
>> The SVN notification uses vertical whitespace to separate out the log
>> message and the list of files affected clearly from the rest of the
>> header fields. It makes it *really* easy to see at a glance what the
>> checkin was about and which files were affected. For the Hg
>> notification, both of these fields are embedded in a big header block
>> along with all the other fields, so it is quite difficult to make out
>> the same information.
>> It would be really nice if the formatting could be improved for the
>> email notifications on the Hg side when we adopt it for the main
>> CPython repository. The changes would be to:
>> - add a blank line before and after the summary field
>> - add a carriage return between the header and content for the summary
>> field and the files field
>> - indent the list of files by two spaces and use a carriage return
>> rather than a comma to separate named files
>> I've included an example below based on one of Tarek's recent pushes:
>> Current Hg notification header and start of first diff:
>> ================================================
>> tarek.ziade pushed 7ebf14ab2840 to distutils2:
>> changeset:   816:7ebf14ab2840
>> tag:         tip
>> user:        Tarek Ziade<tarek at>
>> date:        Sat Nov 13 12:40:33 2010 +0100
>> summary:     compiler_type ->  name
>> files:       distutils2/compiler/,
>> distutils2/compiler/, distutils2/compiler/,
>> distutils2/compiler/,
>> distutils2/compiler/,
>> distutils2/compiler/,
>> distutils2/compiler/, distutils2/tests/
>> diff --git a/distutils2/compiler/
>> b/distutils2/compiler/
>> --- a/distutils2/compiler/
>> +++ b/distutils2/compiler/
>> @@ -13,7 +13,7 @@
>> ====================================================
>> Proposed change to separate out summary and files fields:
>> ================================================
>> tarek.ziade pushed 7ebf14ab2840 to distutils2:
>> changeset:   816:7ebf14ab2840
>> tag:         tip
>> user:        Tarek Ziade<tarek at>
>> date:        Sat Nov 13 12:40:33 2010 +0100
>> summary:
>> compiler_type ->  name
>> files:
>>   distutils2/compiler/
>>   distutils2/compiler/
>>   distutils2/compiler/
>>   distutils2/compiler/
>>   distutils2/compiler/
>>   distutils2/compiler/
>>   distutils2/compiler/
>>   distutils2/tests/
>> diff --git a/distutils2/compiler/
>> b/distutils2/compiler/
>> --- a/distutils2/compiler/
>> +++ b/distutils2/compiler/
>> @@ -13,7 +13,7 @@
>> ====================================================
> Much better except possibly for the \n after 'summary:'

Why not drop the "summary" label altogether? The purpose of the text
delimited with newlines is quite obvious.
anatoly t.
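
To make the layout options concrete, here is a rough Python sketch of a formatter for the proposed notification header. It is not the actual hook code; `format_notification`, its parameters and the file names in the usage lines are invented for illustration. The `show_label` flag covers the variant without the "summary" label.

```python
def format_notification(pusher, changeset, repo, fields, summary, files,
                        show_label=True):
    """Lay out a push notification with the summary and file list set apart."""
    lines = ["%s pushed %s to %s:" % (pusher, changeset, repo)]
    # Short header fields stay in one aligned block.
    lines += ["%-12s %s" % (key + ":", value) for key, value in fields]
    # Blank lines around the summary make the log message stand out.
    lines += [""] + (["summary:"] if show_label else []) + [summary, ""]
    # One indented file per line instead of a comma-separated run.
    lines += ["files:"] + ["  " + name for name in files]
    return "\n".join(lines)


text = format_notification(
    "tarek.ziade", "7ebf14ab2840", "distutils2",
    [("changeset", "816:7ebf14ab2840"), ("tag", "tip")],
    "compiler_type -> name",
    ["pkg/a.py", "pkg/b.py"],  # placeholder file names
)
print(text)
```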

From brian.curtin at  Tue Nov 16 01:23:51 2010
From: brian.curtin at (Brian Curtin)
Date: Mon, 15 Nov 2010 18:23:51 -0600
Subject: [Python-Dev] Stable buildbots
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 14, 2010 at 02:48, David Bolen < at> wrote:

> Nick Coghlan <ncoghlan at> writes:
> > Do we have any idea why the workaround to avoid the popup windows
> > stopped working? (assuming it ever worked reliably - I thought it did,
> > but that impression may have been incorrect)
> Oh, the pop-up handling for the RTL dialogs still seems to be working
> fine (at least I haven't seen any since I put it in place).  That, plus
> the original buildbot tweaks to block any OS popups still looks solid
> for avoiding any dialogs that block a test process.
> This is a completely separate issue, though probably around just as
> long, and like the popup problem its frequency changes over time.  By
> "hung" here I'm referring to cases where something must go wrong with
> a test and/or its cleanup such that a python_d process remains
> running, usually several of them at the same time.  So I end up with a
> bunch of python_d processes in the background (but not with any
> dialogs pending), which eventually cause errors during attempts the
> next time the same builder is used since the file remains in use.
> I expect some of this may be the lack of a good process group cleanup
> under Windows, though the root cause may not be unique to Windows.  I
> see something very similar with reasonable frequency on my OSX Tiger
> buildbot as well.  But since the filesystem there can let the build
> tree get cleaned and rebuilt even with a stranded executable, the
> impact on subsequent tests is smaller than on Windows, though the OSX
> processes do burn a ton of CPU.  I run a script on OSX to kill them
> off, but that was quick to whip up since in those cases the stranded
> processes all end up getting owned by init so it's a simple ps grep
> and kill.  In the Windows case I'll probably just set a time limit so
> if the processes have been around more than a few hours I figure
> they're safe to kill.
> -- David

Is the dialog closer script available somewhere? I'm guessing this is the
same script that closes the window which pops up during test_capi's crash?

I just set up a Windows Server 2008 R2 x64 build slave and noticed it hanging
due to the popup.

From at  Tue Nov 16 03:35:05 2010
From: at (David Bolen)
Date: Mon, 15 Nov 2010 21:35:05 -0500
Subject: [Python-Dev] Stable buildbots
References: <>
Message-ID: <>

Brian Curtin <brian.curtin at> writes:

> Is the dialog closer script available somewhere? I'm guessing this is the
> same script that closes the window which pops up during test_capi's crash?

Not sure about that specific test, as I won't normally see the windows.

If the failure is causing a C RTL pop-up, then yes, the script will
be closing it. If the test is generating an OS level pop-up (process
error dialog from the OS, not RTL) then that is instead suppressed for
any of the child processes run on my slave, so it never shows up at all.

The RTL script is trivial enough that I'll just include it inline:

          - - - - - - - - - - - - - - - - - - - - - - - - -
; buildbot.au3
; Forcibly acknowledge any RTL pop-ups that may occur during testing

$MSVCRT = "Microsoft Visual C++ Runtime Library"

While 1
    ; Wait for any RTL pop-up and then acknowledge
    WinWait($MSVCRT)
    ControlClick($MSVCRT, "", "[CLASS:Button; TEXT:OK]")

    ; Safety check to avoid spinning if it doesn't go away
    ; (the one-second pause is an assumed value)
    If WinExists($MSVCRT) Then
        Sleep(1000)
    EndIf
WEnd
          - - - - - - - - - - - - - - - - - - - - - - - - -

Execute with AutoIt3 (  I just
use the plain autoit3.exe against this script from the Startup folder.

The error mode buildbot patch was discussed in the past on this list
(or it might have been the python-3000-devel list at the time).
Originally it just used pywin32, but I added a fallback to ctypes if
available.  When first done, we were still building pre-2.5 builds - I
suppose at this point it could just assume the presence of ctypes.
The patch below is from 0.7.11p3:

          - - - - - - - - - - - - - - - - - - - - - - - - -
---	2009-08-13 11:53:17.000000000 -0400
+++ /cygdrive/d/python/2.6/lib/site-packages/buildbot/slave/	2009-11-08 02:09:38.000000000 -0500
@@ -489,6 +489,23 @@
         if not self.keepStdinOpen:
+        # [db3l] Under Win32, try to control error mode
+        win32_SetErrorMode = None
+        if runtime.platformType  == 'win32':
+            try:
+                import win32api
+                win32_SetErrorMode = win32api.SetErrorMode
+            except:
+                try:
+                    import ctypes
+                    win32_SetErrorMode = ctypes.windll.kernel32.SetErrorMode
+                except:
+                    pass
+        if win32_SetErrorMode:
+            log.msg(" Setting Windows error mode")
+            old_err_mode = win32_SetErrorMode(7)
         # win32eventreactor's spawnProcess (under twisted <= 2.0.1) returns
         # None, as opposed to all the posixbase-derived reactors (which
         # return the new Process object). This is a nuisance. We can make up
@@ -509,6 +526,10 @@
         if not self.process:
             self.process = p
+        # [db3l]
+        if win32_SetErrorMode:
+            win32_SetErrorMode(old_err_mode)
         # connectionMade also closes stdin as long as we're not using a PTY.
         # This is intended to kill off inappropriately interactive commands
         # better than the (long) hung-command timeout. ProcessPTY should be
          - - - - - - - - - - - - - - - - - - - - - - - - -
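
For readers who want the idea without the buildbot context, the same set-and-restore pattern can be sketched as a standalone context manager. This is an illustration, not the patch itself: `quiet_error_mode` and `QUIET_MODE` are made-up names, the sketch assumes ctypes is available, and it does nothing on non-Windows platforms. The flag values are the standard Win32 constants, whose combination gives the literal 7 used in the patch.

```python
import contextlib
import sys

# Win32 error-mode flags (values from the Windows SDK headers).
SEM_FAILCRITICALERRORS = 0x0001
SEM_NOGPFAULTERRORBOX = 0x0002
SEM_NOALIGNMENTFAULTEXCEPT = 0x0004

# The literal 7 in the patch is the combination of these three flags.
QUIET_MODE = (SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX
              | SEM_NOALIGNMENTFAULTEXCEPT)


@contextlib.contextmanager
def quiet_error_mode(mode=QUIET_MODE):
    """Suppress OS error dialogs for processes spawned inside the block."""
    if sys.platform != "win32":
        yield None  # nothing to do on other platforms
        return
    import ctypes
    set_error_mode = ctypes.windll.kernel32.SetErrorMode
    old_mode = set_error_mode(mode)
    try:
        yield old_mode
    finally:
        set_error_mode(old_mode)  # restore the parent's previous mode


with quiet_error_mode():
    pass  # spawn the child process here; it inherits the error mode
```

The error mode is inherited by child processes, which is why it is enough to change it around the spawn call and restore it right afterwards.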

-- David

From janssen at  Tue Nov 16 04:57:10 2010
From: janssen at (Bill Janssen)
Date: Mon, 15 Nov 2010 19:57:10 PST
Subject: [Python-Dev] Stable buildbots
In-Reply-To: <>
References: <>
Message-ID: <>

Both the Tiger buildbots are suddenly failing 3.x on test_cmd_line.
Looking at the changes since the last success, I can't see anything
which would obviously affect that...  Any suspects?

Here's what's failing:

======================================================================
ERROR: test_run_code (test.test_cmd_line.CmdLineTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/", line 95, in test_run_code
    assert_python_failure('-c')
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/", line 55, in assert_python_failure
    return _assert_python(False, *args, **env_vars)
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/", line 29, in _assert_python
    env=env)
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/", line 683, in __init__
    self.stdin =, 'wb', bufsize)
OSError: [Errno 9] Bad file descriptor

======================================================================
ERROR: test_run_module (test.test_cmd_line.CmdLineTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/", line 72, in test_run_module
    assert_python_failure('-m')
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/", line 55, in assert_python_failure
    return _assert_python(False, *args, **env_vars)
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/", line 29, in _assert_python
    env=env)
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/", line 683, in __init__
    self.stdin =, 'wb', bufsize)
OSError: [Errno 9] Bad file descriptor

======================================================================
ERROR: test_version (test.test_cmd_line.CmdLineTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/", line 48, in test_version
    rc, out, err = assert_python_ok('-V')
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/", line 48, in assert_python_ok
    return _assert_python(True, *args, **env_vars)
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/", line 29, in _assert_python
    env=env)
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/", line 683, in __init__
    self.stdin =, 'wb', bufsize)
OSError: [Errno 9] Bad file descriptor


From nad at  Tue Nov 16 10:21:29 2010
From: nad at (Ned Deily)
Date: Tue, 16 Nov 2010 01:21:29 -0800
Subject: [Python-Dev] Stable buildbots
References: <>
Message-ID: <>

In article <30929.1289879830 at>, Bill Janssen <janssen at> wrote:

> Both the Tiger buildbots are suddenly failing 3.x on test_cmd_line.
> Looking at the changes since the last success, I can't see anything
> which would obviously affect that...  Any suspects?

It appears to be a duplicate of Issue8458.  Playing with it again, it 
seems to be a race condition: sometimes I see all three failures you 
reported, sometimes just one, sometimes none.  Again, only on 10.4 
(Tiger), not 10.5 or 10.6.  But the 10.4 machine I'm using is by far the 
slowest of the three so it is possible that could be a factor.  Perhaps 
a race condition with cleaning up the p2c pipe from a previous run?

> Here's what's failing:
> ======================================================================
> ERROR: test_run_code (test.test_cmd_line.CmdLineTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File 
>   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
>   , line 95, in test_run_code
>     assert_python_failure('-c')
>   File 
>   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
>   , line 55, in assert_python_failure
>     return _assert_python(False, *args, **env_vars)
>   File 
>   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
>   , line 29, in _assert_python
>     env=env)
>   File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/", 
>   line 683, in __init__
>     self.stdin =, 'wb', bufsize)
> OSError: [Errno 9] Bad file descriptor
> ======================================================================
> ERROR: test_run_module (test.test_cmd_line.CmdLineTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File 
>   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
>   , line 72, in test_run_module
>     assert_python_failure('-m')
>   File 
>   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
>   , line 55, in assert_python_failure
>     return _assert_python(False, *args, **env_vars)
>   File 
>   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
>   , line 29, in _assert_python
>     env=env)
>   File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/", 
>   line 683, in __init__
>     self.stdin =, 'wb', bufsize)
> OSError: [Errno 9] Bad file descriptor
> ======================================================================
> ERROR: test_version (test.test_cmd_line.CmdLineTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File 
>   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
>   , line 48, in test_version
>     rc, out, err = assert_python_ok('-V')
>   File 
>   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
>   , line 48, in assert_python_ok
>     return _assert_python(True, *args, **env_vars)
>   File 
>   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
>   , line 29, in _assert_python
>     env=env)
>   File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/", 
>   line 683, in __init__
>     self.stdin =, 'wb', bufsize)
> OSError: [Errno 9] Bad file descriptor

 Ned Deily,
 nad at

From georg at  Tue Nov 16 15:05:51 2010
From: georg at (Georg Brandl)
Date: Tue, 16 Nov 2010 15:05:51 +0100
Subject: [Python-Dev] [RELEASED] Python 3.2 alpha 4
Message-ID: <>

On behalf of the Python development team, I'm happy to announce the
fourth and (this time really) final alpha preview release of Python 3.2.

Python 3.2 is a continuation of the efforts to improve and stabilize the
Python 3.x line.  Since the final release of Python 2.7, the 2.x line
will only receive bugfixes, and new features are developed for 3.x only.

Since PEP 3003, the Moratorium on Language Changes, is in effect, there
are no changes in Python's syntax and built-in types in Python 3.2.
Development efforts concentrated on the standard library and support for
porting code to Python 3.  Highlights are:

* numerous improvements to the unittest module
* PEP 3147, support for .pyc repository directories
* PEP 3149, support for version tagged dynamic libraries
* an overhauled GIL implementation that reduces contention
* many consistency and behavior fixes for numeric operations
* countless fixes regarding string/unicode issues; among them full
  support for a bytes environment (filenames, environment variables)
* a sysconfig module to access configuration information
* a pure-Python implementation of the datetime module
* additions to the shutil module, among them archive file support
* improvements to pdb, the Python debugger

For an extensive list of changes in 3.2, see Misc/NEWS in the Python

To download Python 3.2 visit:

3.2 documentation can be found at:

Please consider trying Python 3.2 with your code and reporting any bugs
you may notice to:


-- 
Georg Brandl, Release Manager
georg at
(on behalf of the entire python-dev team and 3.2's contributors)


From p.f.moore at  Tue Nov 16 16:05:49 2010
From: p.f.moore at (Paul Moore)
Date: Tue, 16 Nov 2010 15:05:49 +0000
Subject: [Python-Dev] [RELEASED] Python 3.2 alpha 4
In-Reply-To: <>
References: <>
Message-ID: <>

(Copying to the list, sorry Georg for the duplicate)

On 16 November 2010 14:05, Georg Brandl <georg at> wrote:
> On behalf of the Python development team, I'm happy to announce the
> fourth and (this time really) final alpha preview release of Python 3.2.

PEP 3148 (Futures) is noted in the PEP as going into 3.2. It also
seems to be in the release.

Should it not be added to the "What's new in 3.2" document and the
release announcements? It's a fairly significant feature.


From alexander.belopolsky at  Tue Nov 16 16:16:15 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 16 Nov 2010 10:16:15 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

What this thread has shown is that there is no consensus on what
public names are and what rules should be followed when changing names
that can be imported from a module.  I have opened an issue at to address this.  My vote is to
adopt the definition spelled out in the language reference, copy it to
the library manual and add some discussion of the deprecation
policies.

I also have a similar question about C API.  Here, in absence of
__all__, the answer should be clear: all symbols in public header
files should start with either _Py_ or Py_ and those that start with
Py_ are public.   The question is what should be done with names that
start with Py_, but are not documented?  Can we add an underscore to
those names?  If so, should a (deprecated) alias be made available?
Should they be documented as deprecated?

I think these questions can only be answered on a case-by-case basis,
with the choices being:

1. Document.
2. Document as deprecated.
3. Document as deprecated, add underscore prefix and retain a deprecated alias.
4. Add an underscore prefix.

The specific set of names that I would like to consider is the
following from unicode.h.  I am marking with (*) the names that I
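think should be documented and with (D) those that should be
deprecated:

PyUnicode_GetMax
PyUnicode_Resize (*)
PyUnicode_InternImmortal
PyUnicode_FromOrdinal (*)
PyUnicode_GetDefaultEncoding (D)
PyUnicode_AsDecodedObject
PyUnicode_AsDecodedUnicode
PyUnicode_AsEncodedObject
PyUnicode_AsEncodedUnicode
PyUnicode_BuildEncodingMap
PyUnicode_EncodeDecimal (*)
PyUnicode_Append (*)
PyUnicode_AppendAndDel (*)
PyUnicode_Partition (*)
PyUnicode_RPartition (*)
PyUnicode_RSplit (*)
PyUnicode_IsIdentifier (*)
Py_UNICODE_strlen
Py_UNICODE_strcpy
Py_UNICODE_strcat
Py_UNICODE_strncpy
Py_UNICODE_strcmp
Py_UNICODE_strncmp
Py_UNICODE_strchr
Py_UNICODE_strrchr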

On Sat, Nov 13, 2010 at 7:12 AM, Giampaolo Rodolà <g.rodola at> wrote:
> +1 on everything.
> 2010/11/11 Alexander Belopolsky <alexander.belopolsky at>:
>> 2010/11/11 Michael Foord <fuzzyman at>:
>> ..
>>>> You mean runtime automation, e.g. creating __all__ on the fly omitting
>>>> underscored names?
>>> Writing code to generate a __all__ that duplicates the default behaviour
>>> seems redundant to me.
>> FWIW, I like having __all__ at the top of the module.  It feels like a
>> table of contents at the start of a chapter.  In some cases it may
>> also serve as an optimization when len(__all__) is much smaller than
>> len(__dict__).  I also don't like _ prefix to become an exclusive
>> means to express privateness.
>> I think the current definition of "public names" is a good one and
>> just needs to be made more visible in the docs.  If the module defines
>> __all__, that should be the ultimate answer to what is public in that
>> module.   (Users should learn to use help(module) instead of
>> dir(module) for API discovery.)   If __all__ is not defined in the
>> module, I think it is good to introduce it after a careful review of
>> what it should contain.  And __all__ should never contain names that
>> start with _.
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at
>> Unsubscribe:

From fuzzyman at  Tue Nov 16 16:31:10 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 16 Nov 2010 15:31:10 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>	<>	<>	<>	<>
Message-ID: <>

On 16/11/2010 15:16, Alexander Belopolsky wrote:
> What this thread has shown is that there is no consensus on what
> public names are and what rules should be followed when changing names
> that can be imported from a module.  I have opened an issue at
> to address this.  My vote is to
> adopt the definition spelled out in the language reference, copy it to
> the library manual and add some discussion of the deprecation
> policies.

Whilst the definition in the reference manual is fine, it only covers
module-level public APIs (which I realise is your particular concern); it
doesn't cover whether a module in a package is public and doesn't cover
class members. The rules for these follow as a natural extension, but if
we are going to bother codifying the rules (which I think is good given
the confusion) then it is worth covering these cases.

I posted a suggested wording in an earlier message:

We could also note that existing modules that don't follow these rules 
will generally follow the deprecation rules for "accidentally public" 
names, but that this will be decided on a case-by-case basis and that 
names *obviously* never intended to be public may be changed if it is 
believed that they aren't (or really shouldn't be) in use.
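
For reference, the module-level rule from the language reference (use `__all__` if the module defines it, otherwise every name that does not start with an underscore) can be sketched in a few lines of Python. `public_names` is a hypothetical helper written for this message, not an existing stdlib function, and it deliberately says nothing about submodules or class members.

```python
import types


def public_names(module):
    """Return the public names of a module per the language reference rule."""
    if hasattr(module, "__all__"):
        return sorted(module.__all__)
    return sorted(name for name in vars(module) if not name.startswith("_"))


demo = types.ModuleType("demo")
demo.helper = 1
demo._private = 2
print(public_names(demo))       # ['helper']

demo.__all__ = ["helper", "extra"]
demo.extra = 3
print(public_names(demo))       # ['extra', 'helper']
```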

All the best,

Michael Foord

> I also have a similar question about C API.  Here, in absence of
> __all__, the answer should be clear: all symbols in public header
> files should start with either _Py_ or Py_ and those that start with
> Py_ are public.   The question is what should be done with names that
> start with Py_, but are not documented?  Can we add an underscore to
> those names?  If so, should a (deprecated) alias be made available?
> Should they be documented as deprecated?
> I think these questions can only be answered on a case-by-case basis,
> with the choices being:
> 1. Document.
> 2. Document as deprecated.
> 3. Document as deprecated, add underscore prefix and retain a deprecated alias.
> 4. Add an underscore prefix.
> The specific set of names that I would like to consider is the
> following from unicode.h.  I am marking with (*) the names that I
> think should be documented and with (D) those that should be
> deprecated:
> PyUnicode_GetMax
> PyUnicode_Resize (*)
> PyUnicode_InternImmortal
> PyUnicode_FromOrdinal (*)
> PyUnicode_GetDefaultEncoding (D)
> PyUnicode_AsDecodedObject
> PyUnicode_AsDecodedUnicode
> PyUnicode_AsEncodedObject
> PyUnicode_AsEncodedUnicode
> PyUnicode_BuildEncodingMap
> PyUnicode_EncodeDecimal (*)
> PyUnicode_Append (*)
> PyUnicode_AppendAndDel (*)
> PyUnicode_Partition (*)
> PyUnicode_RPartition (*)
> PyUnicode_RSplit (*)
> PyUnicode_IsIdentifier (*)
> Py_UNICODE_strlen
> Py_UNICODE_strcpy
> Py_UNICODE_strcat
> Py_UNICODE_strncpy
> Py_UNICODE_strcmp
> Py_UNICODE_strncmp
> Py_UNICODE_strchr
> Py_UNICODE_strrchr
> On Sat, Nov 13, 2010 at 7:12 AM, Giampaolo Rodolà<g.rodola at>  wrote:
>> +1 on everything.
>> 2010/11/11 Alexander Belopolsky<alexander.belopolsky at>:
>>> 2010/11/11 Michael Foord<fuzzyman at>:
>>> ..
>>>>> You mean runtime automation, e.g. creating __all__ on the fly omitting
>>>>> underscored names?
>>>> Writing code to generate a __all__ that duplicates the default behaviour
>>>> seems redundant to me.
>>> FWIW, I like having __all__ at the top of the module.  It feels like a
>>> table of contents at the start of a chapter.  In some cases it may
>>> also serve as an optimization when len(__all__) is much smaller than
>>> len(__dict__).  I also don't like _ prefix to become an exclusive
>>> means to express privateness.
>>> I think the current definition of "public names" is a good one and
>>> just needs to be made more visible in the docs.  If the module defines
>>> __all__, that should be the ultimate answer to what is public in that
>>> module.   (Users should learn to use help(module) instead of
>>> dir(module) for API discovery.)   If __all__ is not defined in the
>>> module, I think it is good to introduce it after a careful review of
>>> what it should contain.  And __all__ should never contain names that
>>> start with _.


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From mal at  Tue Nov 16 16:38:04 2010
From: mal at (M.-A. Lemburg)
Date: Tue, 16 Nov 2010 16:38:04 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

Alexander Belopolsky wrote:
> What this thread has shown is that there is no consensus on what
> public names are and what rules should be followed when changing names
> that can be imported from a module.  I have opened an issue at
> to address this.  My vote is to
> adopt the definition spelled out in the language reference, copy it to
> the library manual and add some discussion of the deprecation
> policies.
> I also have a similar question about C API.  Here, in absence of
> __all__, the answer should be clear: all symbols in public header
> files should start with either _Py_ or Py_ and those that start with
> Py_ are public.   The question is what should be done with names that
> start with Py_, but are not documented?  Can we add an underscore to
> those names?  If so, should a (deprecated) alias be made available?
> Should they be documented as deprecated?
> I think these questions can only be answered on a case-by-case basis,
> with the choices being:
> 1. Document.
> 2. Document as deprecated.
> 3. Document as deprecated, add underscore prefix and retain a deprecated alias.
> 4. Add an underscore prefix.
> The specific set of names that I would like to consider is the
> following from unicode.h.  I am marking with (*) the names that I
> think should be documented and with (D) those that should be
> deprecated:
> PyUnicode_GetMax
> PyUnicode_Resize (*)
> PyUnicode_InternImmortal
> PyUnicode_FromOrdinal (*)
> PyUnicode_GetDefaultEncoding (D)
> PyUnicode_AsDecodedObject
> PyUnicode_AsDecodedUnicode
> PyUnicode_AsEncodedObject
> PyUnicode_AsEncodedUnicode
> PyUnicode_BuildEncodingMap
> PyUnicode_EncodeDecimal (*)
> PyUnicode_Append (*)
> PyUnicode_AppendAndDel (*)
> PyUnicode_Partition (*)
> PyUnicode_RPartition (*)
> PyUnicode_RSplit (*)
> PyUnicode_IsIdentifier (*)
> Py_UNICODE_strlen
> Py_UNICODE_strcpy
> Py_UNICODE_strcat
> Py_UNICODE_strncpy
> Py_UNICODE_strcmp
> Py_UNICODE_strncmp
> Py_UNICODE_strchr
> Py_UNICODE_strrchr

For Unicode, unicodeobject.h defines which APIs are private or not.
APIs which don't appear in the header file are either private or
need to be added to the header file (but I don't think there are
any in this category).

All APIs in the header that do not appear in the documentation,
should be added there as well. unicodeobject.h already provides
documentation for most of the APIs you've listed above (except some
new ones that were added later on).

One API I'm not sure about is PyUnicode_AppendAndDel(). It's somewhat
obscure and given that we already have PyUnicode_Concat(), I think
it should be made private and eventually dropped.
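
To illustrate the "make it private, keep a deprecated alias" route in
general terms, here is a rough Python-level analogue. It is only a sketch
under stated assumptions: the names deprecated_alias, _append_and_del and
append_and_del are hypothetical and are not the C functions themselves; the
point is just the shape of the transition.

```python
import functools
import warnings


def deprecated_alias(func, old_name):
    """Return a wrapper that emits a DeprecationWarning, then calls func."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; it is now private and may be removed",
            DeprecationWarning,
            stacklevel=2,
        )
        return func(*args, **kwargs)

    # Report the old public name rather than the private one.
    wrapper.__name__ = old_name
    return wrapper


def _append_and_del(left, right):
    """Hypothetical private implementation (not the real C function)."""
    return left + right


# The old public name stays importable for a deprecation period.
append_and_del = deprecated_alias(_append_and_del, "append_and_del")
```

Callers of the old name keep working but see a warning, which gives third
party code a release or two to migrate before the alias is dropped.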

Marc-Andre Lemburg

Professional Python Services directly from the Source  (#1, Nov 16 2010)
>>> Python/Zope Consulting and Support ...
>>> mxODBC.Zope.Database.Adapter ...   
>>> mxODBC, mxDateTime, mxTextTools ...

::: Try our new mxODBC.Connect Python Database Interface for free ! :::: Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From guido at  Tue Nov 16 16:48:20 2010
From: guido at (Guido van Rossum)
Date: Tue, 16 Nov 2010 07:48:20 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

On Tue, Nov 16, 2010 at 7:16 AM, Alexander Belopolsky
<alexander.belopolsky at> wrote:
> What this thread has shown is that there is no consensus on what
> public names are and what rules should be followed when changing names
> that can be imported from a module.  I have opened an issue at
> to address this.  My vote is to
> adopt the definition spelled out in the language reference, copy it to
> the library manual and add some discussion of the deprecation
> policies.

Hm. Apart from the specific semantics assigned by the language to
single and double leading (and trailing) underscores, I still think
this belongs in a style guide, not in the library manual. When reading
the library manual, one should always assume that undocumented
features are subject to change at any time.

When writing library code, one should of course be much more
conservative, and guidelines for contributors are needed to ensure
that in the future we won't repeat the mistakes of the past (mostly my
own mistakes :-).

> I also have a similar question about C API.  Here, in absence of
> __all__, the answer should be clear: all symbols in public header
> files should start with either _Py_ or Py_ and those that start with
> Py_ are public.   The question is what should be done with names that
> start with Py_, but are not documented?  Can we add an underscore to
> those names?  If so, should a (deprecated) alias be made available?
> Should they be documented as deprecated?

Even more care should be taken here, since breakage is harder to fix,
especially in 3rd party code that needs to be compatible with a wide
range of Python versions.

The good news here is that the intended rule is very clear:

- *no* symbols that don't start with Py_ or _Py_ (unless there's a
technical reason why it can't be named that way)
- public == Py_
- private == _Py_
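
The rule is simple enough to check mechanically. A minimal sketch, assuming
the prefixes are read loosely as "Py" and "_Py" (names discussed in this
thread such as PyUnicode_GetMax have no underscore directly after "Py"); the
function name classify_symbol is hypothetical:

```python
def classify_symbol(name):
    """Classify a C symbol name by the intended prefix rule.

    Assumption: "Py_" and "_Py_" are read loosely as the prefixes "Py"
    and "_Py", so that names like PyUnicode_GetMax count as public.
    """
    # Check the private prefix first, since "_Py" does not start with "Py".
    if name.startswith("_Py"):
        return "private"
    if name.startswith("Py"):
        return "public"
    return "nonconforming"
```

Anything reported as "nonconforming" would need the technical reason
mentioned in the first bullet.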

> I think these questions can only be answered on a case-by-case basis,
> with the choices being:
> 1. Document.
> 2. Document as deprecated.
> 3. Document as deprecated, add underscore prefix and retain a deprecated alias.
> 4. Add an underscore prefix.
> The specific set of names that I would like to consider is the
> following from unicode.h.  I am marking with (*) the names that I
> think should be documented and with (D) those that should be
> deprecated:
> PyUnicode_GetMax
> PyUnicode_Resize (*)
> PyUnicode_InternImmortal
> PyUnicode_FromOrdinal (*)
> PyUnicode_GetDefaultEncoding (D)
> PyUnicode_AsDecodedObject
> PyUnicode_AsDecodedUnicode
> PyUnicode_AsEncodedObject
> PyUnicode_AsEncodedUnicode
> PyUnicode_BuildEncodingMap
> PyUnicode_EncodeDecimal (*)
> PyUnicode_Append (*)
> PyUnicode_AppendAndDel (*)
> PyUnicode_Partition (*)
> PyUnicode_RPartition (*)
> PyUnicode_RSplit (*)
> PyUnicode_IsIdentifier (*)
> Py_UNICODE_strlen
> Py_UNICODE_strcpy
> Py_UNICODE_strcat
> Py_UNICODE_strncpy
> Py_UNICODE_strcmp
> Py_UNICODE_strncmp
> Py_UNICODE_strchr
> Py_UNICODE_strrchr

I'll leave this to others more familiar with the Unicode code; I would
recommend being fairly conservative though since these have been
around for a long time.

--Guido van Rossum

From janssen at  Tue Nov 16 17:30:44 2010
From: janssen at (Bill Janssen)
Date: Tue, 16 Nov 2010 08:30:44 PST
Subject: [Python-Dev] Stable buildbots
In-Reply-To: <>
References: <>
Message-ID: <>

Ned Deily <nad at> wrote:

> In article <30929.1289879830 at>, Bill Janssen <janssen at> 
> wrote:
> > Both the Tiger buildbots are suddenly failing 3.x on test_cmd_line.
> > Looking at the changes since the last success, I can't see anything
> > which would obviously affect that...  Any suspects?
> It appears to be a duplicate of Issue8458.  Playing with it again, it 
> seems to be a race condition: sometimes I see all three failures you 
> reported, sometimes just one, sometimes none.  Again, only on 10.4 
> (Tiger), not 10.5 or 10.6.  But the 10.4 machine I'm using is by far the 
> slowest of the three so it is possible that could be a factor.

Good thought.  It's also the slowest of my buildbots -- dual 1GHz PPC.

> Perhaps a race condition with cleaning up the p2c pipe from a previous run?
> > Here's what's failing:
> > 
> > ======================================================================
> > ERROR: test_run_code (test.test_cmd_line.CmdLineTest)
> > ----------------------------------------------------------------------
> > Traceback (most recent call last):
> >   File 
> >   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
> >   , line 95, in test_run_code
> >     assert_python_failure('-c')
> >   File 
> >   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
> >   , line 55, in assert_python_failure
> >     return _assert_python(False, *args, **env_vars)
> >   File 
> >   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
> >   , line 29, in _assert_python
> >     env=env)
> >   File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/", 
> >   line 683, in __init__
> >     self.stdin =, 'wb', bufsize)
> > OSError: [Errno 9] Bad file descriptor
> > 
> > ======================================================================
> > ERROR: test_run_module (test.test_cmd_line.CmdLineTest)
> > ----------------------------------------------------------------------
> > Traceback (most recent call last):
> >   File 
> >   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
> >   , line 72, in test_run_module
> >     assert_python_failure('-m')
> >   File 
> >   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
> >   , line 55, in assert_python_failure
> >     return _assert_python(False, *args, **env_vars)
> >   File 
> >   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
> >   , line 29, in _assert_python
> >     env=env)
> >   File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/", 
> >   line 683, in __init__
> >     self.stdin =, 'wb', bufsize)
> > OSError: [Errno 9] Bad file descriptor
> > 
> > ======================================================================
> > ERROR: test_version (test.test_cmd_line.CmdLineTest)
> > ----------------------------------------------------------------------
> > Traceback (most recent call last):
> >   File 
> >   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
> >   , line 48, in test_version
> >     rc, out, err = assert_python_ok('-V')
> >   File 
> >   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
> >   , line 48, in assert_python_ok
> >     return _assert_python(True, *args, **env_vars)
> >   File 
> >   "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/"
> >   , line 29, in _assert_python
> >     env=env)
> >   File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/", 
> >   line 683, in __init__
> >     self.stdin =, 'wb', bufsize)
> > OSError: [Errno 9] Bad file descriptor
> -- 
>  Ned Deily,
>  nad at

From exarkun at  Tue Nov 16 17:34:54 2010
From: exarkun at (exarkun at
Date: Tue, 16 Nov 2010 16:34:54 -0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>

On 03:48 pm, guido at wrote:
>On Tue, Nov 16, 2010 at 7:16 AM, Alexander Belopolsky
><alexander.belopolsky at> wrote:
>>What this thread has shown is that there is no consensus on what
>>public names are and what rules should be followed when changing names
>>that can be imported from a module.  I have opened an issue at
>> to address this.  My vote is to
>>adopt the definition spelled out in the language reference, copy it to
>>the library manual and add some discussion of the deprecation
>>policies.
>Hm. Apart from the specific semantics assigned by the language to
>single and double leading (and trailing) underscores, I still think
>this belongs in a style guide, not in the library manual. When reading
>the library manual, one should always assume that undocumented
>features are subject to change at any time.

I don't think it belongs only in PEP 8 (that's "a style guide" you're 
referring to, correct?).  It needs to be front and center.  This is 
information that every single user of the stdlib needs in order to use 
the stdlib correctly.

Imagine trying to use a dictionary without knowing about alphabetical 
ordering.  Or driving a car without knowing what lane markers indicate.

No matter how many times we discuss this policy on this list (I know 
it's come up here before), the majority of python users still won't 
learn about it.

PEP 8 isn't nearly visible enough, either.  Whatever the rule is, it 
needs to be presented with the information itself.  If the rule is that 
things not documented in the library manual have no compatibility 
guarantees, then all of the means of getting documentation *other* than 
looking at the library manual need to indicate this somehow 
(alternatively, the information shouldn't be duplicated, but I doubt 
I'll convince anyone of that).

Here's a stupid proposal.  What if the top of pydoc output said (for 
stdlib modules only) "The library manual is the canonical reference. 
Refer to it before using APIs you find in this documentation."  Still 
inconvenient, but inconvenient is better than secret/impossible.


From raymond.hettinger at  Tue Nov 16 18:03:03 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Tue, 16 Nov 2010 09:03:03 -0800
Subject: [Python-Dev] [RELEASED] Python 3.2 alpha 4
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 16, 2010, at 7:05 AM, Paul Moore wrote:
> PEP 3148 (Futures) is noted in the PEP as going into 3.2. It also
> seems to be in the release.
> Should it not be added to the "What's new in 3.2" document and the
> release announcements? It's a fairly significant feature.

I'll update the whatsnew document before the beta goes out.


From solipsis at  Tue Nov 16 18:06:40 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 16 Nov 2010 18:06:40 +0100
Subject: [Python-Dev] Breaking undocumented API
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

On Tue, 16 Nov 2010 16:34:54 -0000
exarkun at wrote:
> Imagine trying to use a dictionary without knowing about alphabetical 
> ordering.

You mean an ordered dictionary, right?

From alexander.belopolsky at  Tue Nov 16 18:13:57 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 16 Nov 2010 12:13:57 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

On Tue, Nov 16, 2010 at 10:38 AM, M.-A. Lemburg <mal at> wrote:
> One API I'm not sure about is PyUnicode_AppendAndDel(). It's somewhat
> obscure and given that we already have PyUnicode_Concat(), I think
> it should be made private and eventually dropped.

What about PyUnicode_GetMax()?  Isn't that supposed to be

From lukasz at  Tue Nov 16 18:16:21 2010
From: lukasz at (Łukasz Langa)
Date: Tue, 16 Nov 2010 18:16:21 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>
	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>
Message-ID: <>

On 16.11.2010 18:06, Antoine Pitrou wrote:
> On Tue, 16 Nov 2010 16:34:54 -0000
> exarkun at wrote:
>> Imagine trying to use a dictionary without knowing about alphabetical
>> ordering.
> You mean an ordered dictionary, right?

He meant the ones with actual paper pages.

From fuzzyman at  Tue Nov 16 18:21:38 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 16 Nov 2010 17:21:38 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>
Message-ID: <>

On 16/11/2010 17:16, Łukasz Langa wrote:
> On 16.11.2010 18:06, Antoine Pitrou wrote:
>> On Tue, 16 Nov 2010 16:34:54 -0000
>> exarkun at wrote:
>>> Imagine trying to use a dictionary without knowing about alphabetical
>>> ordering.
>> You mean an ordered dictionary, right?
> He meant the ones with actual paper pages.

But given that we are particularly talking about how to handle 
undocumented APIs, a more apropos comparison would be to ask how 
dictionary readers are supposed to look up words that aren't in the 
dictionary.

This is why I think it *is* a style issue for developers - the more 
important decision is codifying how we decide what words need to go in 
the dictionary (to continue to torture the analogy).



From exarkun at  Tue Nov 16 18:30:49 2010
From: exarkun at (exarkun at
Date: Tue, 16 Nov 2010 17:30:49 -0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <20101116173049.2040.989476246.divmod.xquotient.936@localhost.localdomain>

On 05:21 pm, fuzzyman at wrote:
>On 16/11/2010 17:16, Łukasz Langa wrote:
>>On 16.11.2010 18:06, Antoine Pitrou wrote:
>>>On Tue, 16 Nov 2010 16:34:54 -0000
>>>exarkun at wrote:
>>>>Imagine trying to use a dictionary without knowing about 
>>>>alphabetical ordering.
>>>You mean an ordered dictionary, right?
>>He meant the ones with actual paper pages.
>But given that we are particularly talking about how to handle 
>undocumented APIs, a more apropos comparison would be to ask how 
>dictionary readers are supposed to look up words that aren't in the 
>dictionary.

No, this isn't an appropriate comparison.  The dictionary was an example 
of something that presents information but is very hard to use without 
knowing the rules.

We're not talking about undocumented APIs.  We're talking about APIs 
that are documented somewhere other than in the library manual.


From mal at  Tue Nov 16 19:06:22 2010
From: mal at (M.-A. Lemburg)
Date: Tue, 16 Nov 2010 19:06:22 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

Alexander Belopolsky wrote:
> On Tue, Nov 16, 2010 at 10:38 AM, M.-A. Lemburg <mal at> wrote:
> ..
>> One API I'm not sure about is PyUnicode_AppendAndDel(). It's somewhat
>> obscure and given that we already have PyUnicode_Concat(), I think
>> it should be made private and eventually dropped.
> What about PyUnicode_GetMax()?  Isn't that supposed to be

Traditionally, all uppercase symbols refer to macros, whereas
the mixed case ones refer to functions.

Now, we can't use a macro for this, since the information has
to be available as callable in order for applications or extensions
to use it (without recompile).

Regarding the name: PyUnicode_MaxOrdinal() would certainly
have been better.

BTW: I'm not really happy about the Py_UNICODE_ prefix for functions
in unicodeobject.h, but I guess it's too late to change those.
It would be better to stick to one prefix for Unicode related
APIs, i.e. "PyUnicode_".

Marc-Andre Lemburg


From g.brandl at  Tue Nov 16 19:05:44 2010
From: g.brandl at (Georg Brandl)
Date: Tue, 16 Nov 2010 19:05:44 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>
	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>
Message-ID: <ibuh8l$v8d$>

On 16.11.2010 18:06, Antoine Pitrou wrote:
> On Tue, 16 Nov 2010 16:34:54 -0000
> exarkun at wrote:
>> Imagine trying to use a dictionary without knowing about alphabetical 
>> ordering.
> You mean an ordered dictionary, right?

That one's a sorted dictionary, though.


From alexander.belopolsky at  Tue Nov 16 19:31:32 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 16 Nov 2010 13:31:32 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

On Tue, Nov 16, 2010 at 1:06 PM, M.-A. Lemburg <mal at> wrote:
> Now, we can't use a macro for [PyUnicode_GetMax()], since the information has
> to be available as callable in order for applications or extensions
> to use it (without recompile).

.. but it *is* a macro resolving to either PyUnicodeUCS2_GetMax or
PyUnicodeUCS4_GetMax.  What is the scenario when we may want to change
what PyUnicodeUCS?_GetMax returns and have extensions pick up the
change without a recompile? UCS2 case will certainly never change
since it is already 0xFFFF.  Is it possible that UCS4 will be expanded
beyond 0x10FFFF?  Note that we can have both a macro and a function
version.  This is fairly standard practice in Python C-API.
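
For reference, the same limit is visible from Python as sys.maxunicode
(0xFFFF on narrow builds, 0x10FFFF on wide builds, and always 0x10FFFF on
interpreters since 3.3). A small sketch of a range check built on it; the
helper names is_valid_ordinal and from_ordinal are hypothetical, and
from_ordinal is only a Python-level analogue of PyUnicode_FromOrdinal(),
not the C function:

```python
import sys


def is_valid_ordinal(ordinal):
    """Return True if chr(ordinal) would succeed on this interpreter."""
    return 0 <= ordinal <= sys.maxunicode


def from_ordinal(ordinal):
    """Return the one-character string for ordinal, or raise ValueError."""
    if not is_valid_ordinal(ordinal):
        raise ValueError(
            f"ordinal {ordinal:#x} not in range({sys.maxunicode + 1:#x})"
        )
    return chr(ordinal)
```

Because the limit is looked up at run time, code written this way picks up
a different limit without being changed, which is the property being
discussed for the C function.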

From jcea at  Tue Nov 16 19:38:07 2010
From: jcea at (Jesus Cea)
Date: Tue, 16 Nov 2010 19:38:07 +0100
Subject: [Python-Dev] Mercurial Schedule
Message-ID: <>

Is there an updated Mercurial schedule?

Any impact from the new 3.2 schedule (three-week offset)?

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From alexander.belopolsky at  Tue Nov 16 19:40:36 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 16 Nov 2010 13:40:36 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

On Tue, Nov 16, 2010 at 1:06 PM, M.-A. Lemburg <mal at> wrote:
> BTW: I'm not really happy about the Py_UNICODE_ prefix for functions
> in unicodeobject.h, but I guess it's too late to change those.
> It would be better to stick to one prefix for Unicode related
> APIs, i.e. "PyUnicode_".

I don't have a problem with this.  It makes sense that functions that
operate on PyUnicode objects start with PyUnicode_ and those that
operate on Py_UNICODE ordinals start with Py_UNICODE_.  Of course,
PyUnicode should have been named PyUnicodeObject and Py_UNICODE should
have been named Py_wchar_t, but that's a different story.

From mal at  Tue Nov 16 19:57:04 2010
From: mal at (M.-A. Lemburg)
Date: Tue, 16 Nov 2010 19:57:04 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

Alexander Belopolsky wrote:
> On Tue, Nov 16, 2010 at 1:06 PM, M.-A. Lemburg <mal at> wrote:
> ..
>> Now, we can't use a macro for [PyUnicode_GetMax()], since the information has
>> to be available as callable in order for applications or extensions
>> to use it (without recompile).
> .. but it *is* a macro resolving to either PyUnicodeUCS2_GetMax or
> PyUnicodeUCS4_GetMax.

That doesn't count :-) It's only a trick to prevent external code
from using the wrong Unicode APIs.

There still is a real function behind the renaming.

> What is the scenario when we may want to change
> what PyUnicodeUCS?_GetMax returns and have extensions pick up the
> change without a recompile? 

If an extension uses the stable ABI, it will want to know
whether the interpreter was built for UCS2 or UCS4 (even if
it doesn't use the Unicode APIs directly).

> UCS2 case will certainly never change
> since it is already 0xFFFF.  Is it possible that UCS4 will be expanded
> beyond 0x10FFFF? 

Well, the Unicode Consortium decided to not go beyond 0x10FFFF,
but then you never know... when they started out on the quest,
16 bits appeared more than enough, but they found out relatively
quickly that the Asian scripts had enough code points to easily
fill that space.

Once space is available, it tends to get used sooner or later :-)

> Note that we can have both a macro and a function
> version.  This is fairly standard practice in Python C-API.

Sure, but what for ?

Marc-Andre Lemburg


From alexander.belopolsky at  Tue Nov 16 20:06:37 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 16 Nov 2010 14:06:37 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

On Tue, Nov 16, 2010 at 1:57 PM, M.-A. Lemburg <mal at> wrote:
>> Note that we can have both a macro and a function
>> version.  This is fairly standard practice in Python C-API.
> Sure, but what for ?

Mostly just for consistency with the other macros:

Wait, these actually map to C functions as well.  So this is just a
naming issue.

From tjreedy at  Tue Nov 16 20:08:18 2010
From: tjreedy at (Terry Reedy)
Date: Tue, 16 Nov 2010 14:08:18 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>
	<>	<>	<>	<>
Message-ID: <ibukqv$ini$>

On 11/16/2010 10:16 AM, Alexander Belopolsky wrote:
> What this thread has shown is that there is no consensus on what
> public names are and what rules should be followed when changing names
> that can be imported from a module.

Nor is there any consensus on the use of __all__ in the stdlib, with 
opinion ranging from never to sometimes to always.

I do not have any opinions on the particular solution adopted, but 
appreciate your persistence in pushing to *some* solution. It would be 
nice to add 'Cleanly separated public and private APIs' to the list of 
3.x features.
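
Whatever convention is adopted, part of it could be checked by a script. A
rough sketch, assuming only the two points stated earlier in the thread
(__all__, when defined, is the answer to what is public, and it should never
contain names starting with an underscore); the helper name audit_all is
hypothetical:

```python
import types


def audit_all(module):
    """Return a list of problems found with a module's __all__."""
    declared = getattr(module, "__all__", None)
    if declared is None:
        return ["__all__ is not defined"]
    problems = []
    for name in declared:
        if name.startswith("_"):
            problems.append(f"{name!r} in __all__ starts with an underscore")
        if not hasattr(module, name):
            problems.append(f"{name!r} in __all__ is not defined in the module")
    return problems
```

An empty list means the module passes both checks; it says nothing about
whether the contents of __all__ were chosen well, which still needs review
by a person.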

Terry Jan Reedy

From mal at  Tue Nov 16 20:16:50 2010
From: mal at (M.-A. Lemburg)
Date: Tue, 16 Nov 2010 20:16:50 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

Alexander Belopolsky wrote:
> On Tue, Nov 16, 2010 at 1:57 PM, M.-A. Lemburg <mal at> wrote:
> ..
>>> Note that we can have both a macro and a function
>>> version.  This is fairly standard practice in Python C-API.
>> Sure, but what for ?
> Mostly just for consistency with the other macros:
> Wait, these actually map to C functions as well.  So this is just a
> naming issue.

As said: the UCS2/4 name mangling doesn't fall under the
macro naming scheme, since it's done transparently and with a
different reasoning in mind than when you decide to use a macro
to access some object detail, or want to avoid repetition. This
trick was also added after the original APIs had already been documented
for a while, so there was no way to change their names anymore.

The various ctype functions use macro names for historic
reasons: they were directed to different functions and/or
inline code depending on a configuration switch. This is now
gone, since the lib C ctype functions were locale aware and
often implemented things a little differently than the Python
ctype tables.

Marc-Andre Lemburg


From alexander.belopolsky at  Tue Nov 16 20:52:07 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 16 Nov 2010 14:52:07 -0500
Subject: [Python-Dev] PyUnicode_GetMax() and PyUnicode_FromOrdinal() Was:
 Breaking undocumented API
Message-ID: <>

On Tue, Nov 16, 2010 at 1:57 PM, M.-A. Lemburg <mal at> wrote:
> Alexander Belopolsky wrote:
>> On Tue, Nov 16, 2010 at 1:06 PM, M.-A. Lemburg <mal at> wrote:
>> ..
>>> Now, we can't use a macro for [PyUnicode_GetMax()], since the information has
>>> to be available as callable in order to applications or extensions
>>> to use it (without recompile).
>> .. but it *is* a macro resolving to either PyUnicodeUCS2_GetMax or
>> PyUnicodeUCS4_GetMax.
> That doesn't count :-) It's only a trick to prevent external code
> from using the wrong Unicode APIs.
> There still is a real function behind the renaming.
>> What is the scenario when may want to change
>> what PyUnicodeUCS?_GetMax return and have extensions pick up the
>> change without a recompile?
> If an extensions uses the stable ABI, it will want to know
> whether the interpreter was built for UCS2 or UCS4 (even if
> it doesn't use the Unicode APIs directly).
>> UCS2 case will certainly never change
> since it is already 0xFFFF.  Is it possible that UCS4 will be expanded
>> beyond 0x10FFFF?
> Well, the Unicode Consortium decided to not go beyond 0x10FFFF,
> but then you never know... when they started out on the quest,
> 16 bits appeared more than enough, but they found out relatively
> quickly that the Asian scripts had enough code points to easily
> fill that space.
> Once space is available, it tends to get used sooner or later :-)
>> Note that we can have both a macro and a function
>> version.  This is fairly standard practice in Python C-API.
> Sure, but what for ?

Note that PyUnicode_FromOrdinal()  is documented (in unicodeobject.h)
as follows without a reference to PyUnicode_GetMax():

   Create a Unicode Object from the given Unicode code point ordinal.

   The ordinal must be in range(0x10000) on narrow Python builds
   (UCS2), and range(0x110000) on wide builds (UCS4). A ValueError is
   raised in case it is not.

The actual implementation checks the UCS4 range only:

    if (ordinal < 0 || ordinal > 0x10ffff) {
        PyErr_SetString(PyExc_ValueError,
                        "chr() arg not in range(0x110000)");
        return NULL;
    }

This actually looks like a bug:

>>> len(chr(0x10FFFF))
2

(on a UCS2 build.)

Also, I think PyUnicode_FromOrdinal() should take a Py_UNICODE argument
rather than an int.

From mal at  Tue Nov 16 21:06:15 2010
From: mal at (M.-A. Lemburg)
Date: Tue, 16 Nov 2010 21:06:15 +0100
Subject: [Python-Dev] PyUnicode_GetMax() and PyUnicode_FromOrdinal()
 Was: Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

Alexander Belopolsky wrote:
> On Tue, Nov 16, 2010 at 1:57 PM, M.-A. Lemburg <mal at> wrote:
>> Alexander Belopolsky wrote:
>>> On Tue, Nov 16, 2010 at 1:06 PM, M.-A. Lemburg <mal at> wrote:
>>> ..
>>>> Now, we can't use a macro for [PyUnicode_GetMax()], since the information has
>>>> to be available as callable in order to applications or extensions
>>>> to use it (without recompile).
>>> .. but it *is* a macro resolving to either PyUnicodeUCS2_GetMax or
>>> PyUnicodeUCS4_GetMax.
>> That doesn't count :-) It's only a trick to prevent external code
>> from using the wrong Unicode APIs.
>> There still is a real function behind the renaming.
>>> What is the scenario when may want to change
>>> what PyUnicodeUCS?_GetMax return and have extensions pick up the
>>> change without a recompile?
>> If an extensions uses the stable ABI, it will want to know
>> whether the interpreter was built for UCS2 or UCS4 (even if
>> it doesn't use the Unicode APIs directly).
>>> UCS2 case will certainly never change
>>> since it is already 0xFFFF.  Is it possible that UCS4 will be expanded
>>> beyond 0x10FFFF?
>> Well, the Unicode Consortium decided to not go beyond 0x10FFFF,
>> but then you never know... when they started out on the quest,
>> 16 bits appeared more than enough, but they found out relatively
>> quickly that the Asian scripts had enough code points to easily
>> fill that space.
>> Once space is available, it tends to get used sooner or later :-)
>>> Note that we can have both a macro and a function
>>> version.  This is fairly standard practice in Python C-API.
>> Sure, but what for ?
> Note that PyUnicode_FromOrdinal()  is documented (in unicodeobject.h)
> as follows without a reference to PyUnicode_GetMax():
> """
>    Create a Unicode Object from the given Unicode code point ordinal.
>    The ordinal must be in range(0x10000) on narrow Python builds
>    (UCS2), and range(0x110000) on wide builds (UCS4). A ValueError is
>    raised in case it is not.
> """
> The actual implementation actually checks UCS4 range only.
>     if (ordinal < 0 || ordinal > 0x10ffff) {
> 	PyErr_SetString(PyExc_ValueError,
>                         "chr() arg not in range(0x110000)");
>         return NULL;
>     }
> This actually looks like a bug:
>>>> len(chr(0x10FFFF))
> 2
> (on a UCS2 build.)

Yes, it's a documentation bug. I guess someone forgot to update
the comment in unicodeobject.h after the change to have chr()/unichr()
return a 2-char string instead of a 1-char string for non-BMP
code points.
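
For anyone following along: the 2-char result on a narrow build is a UTF-16
surrogate pair. Below is a minimal Python sketch of that arithmetic. The
helper name to_surrogate_pair is made up for illustration and is not part of
any Python API; it only shows the standard UTF-16 split, not what the
interpreter does internally.

```python
def to_surrogate_pair(code_point):
    """Split a non-BMP code point into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000       # 20 significant bits remain
    high = 0xD800 + (offset >> 10)      # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits go into the low surrogate
    return high, low


high, low = to_surrogate_pair(0x10FFFF)
print(hex(high), hex(low))
```

The same two code units can be cross-checked against
chr(0x10FFFF).encode("utf-16-be"), which works regardless of how the
interpreter stores strings.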

> Also, I think PyUnicode_FromOrdinal()  should take Py_UNICODE argument
> rather than int.

No, an ordinal is a number, not a typed value. We have
PyUnicode_FromUnicode() to create strings from Py_UNICODE*.

Marc-Andre Lemburg


From rbp at  Tue Nov 16 21:15:56 2010
From: rbp at (Rodrigo Bernardo Pimentel)
Date: Tue, 16 Nov 2010 18:15:56 -0200
Subject: [Python-Dev] Python bug week-end : 20-21 November
In-Reply-To: <ia7c9r$887$>
References: <>
Message-ID: <>

On 26 October 2010 18:04, Georg Brandl <g.brandl at> wrote:
> Am 26.10.2010 19:53, schrieb Brett Cannon:
>> Can whomever has edit access to the Python Google Calendar add this?
> Done.

The Bug Weekend is still up, right? I don't see mention of it at (and when I tried to log in
to edit, I got "A problem occurred in a Python script." - now, I
thought no problems ever occurred on Python scripts! ;)).

rbp

From alexander.belopolsky at  Tue Nov 16 21:31:13 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 16 Nov 2010 15:31:13 -0500
Subject: [Python-Dev] PyUnicode_GetMax() and PyUnicode_FromOrdinal()
 Was: Breaking undocumented API
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 16, 2010 at 3:06 PM, M.-A. Lemburg <mal at> wrote:
>>>>> len(chr(0x10FFFF))
>> 2
>> (on a UCS2 build.)
> Yes, it's a documentation bug. I guess someone forgot to update
> the comment in unicodeobject.h after the change to have chr()/unichr()
> return a 2-char string instead of a 1-char string for non-BMP
> code points.

Same problem in reST doc for chr(i):


Return the string of one character whose Unicode codepoint is the
integer i. For example, chr(97) returns the string 'a'. This is the
inverse of ord(). The valid range for the argument depends how Python
was configured - it may be either UCS2 [0..0xFFFF] or UCS4
[0..0x10FFFF]. ValueError will be raised if i is outside that range.

And in ord(c):


Given a string of length one, return an integer representing the
Unicode code point of the character. For example, ord('a') returns the
integer 97 and ord('\u2020') returns 8224. This is the inverse of
chr().

If the argument length is not one, a TypeError will be raised. (If
Python was built with UCS2 Unicode, then the character's code point
must be in the range [0..65535] inclusive; otherwise the string length
is two!)
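
The parts of those two entries that do not depend on the build can be checked
directly. This is only a sketch: the helpers ord_error and chr_error are
hypothetical names introduced here, and the snippet deliberately does not
assert anything about len(chr(0x10FFFF)), since that is exactly the
build-dependent part under discussion.

```python
def ord_error(text):
    """Return the exception type ord() raises for text, or None if it succeeds."""
    try:
        ord(text)
    except (TypeError, ValueError) as exc:
        return type(exc)
    return None


def chr_error(i):
    """Return the exception type chr() raises for i, or None if it succeeds."""
    try:
        chr(i)
    except (TypeError, ValueError) as exc:
        return type(exc)
    return None


# Examples taken from the documentation text quoted above.
print(chr(97), ord("a"), ord("\u2020"))
# A two-character argument is rejected by ord().
print(ord_error("ab"))
# An argument beyond the last Unicode code point is rejected by chr().
print(chr_error(0x110000))
```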

From g.brandl at  Tue Nov 16 21:49:01 2010
From: g.brandl at (Georg Brandl)
Date: Tue, 16 Nov 2010 21:49:01 +0100
Subject: [Python-Dev] Python bug week-end : 20-21 November
In-Reply-To: <>
References: <>	<>	<ia7c9r$887$>
Message-ID: <ibuqqq$hcs$>

Am 16.11.2010 21:15, schrieb Rodrigo Bernardo Pimentel:
> On 26 October 2010 18:04, Georg Brandl <g.brandl at> wrote:
>> Am 26.10.2010 19:53, schrieb Brett Cannon:
>>> Can whomever has edit access to the Python Google Calendar add this?
>> Done.
> The Bug Weekend is still up, right? I don't see mention of it at
> (and when I tried to log in
> to edit, I got "A problem occurred in a Python script." - now, I
> thought no problems ever occurred on Python scripts! ;)).

Yeah, somebody (Antoine?) should update that wiki page...


From ben+python at  Tue Nov 16 22:31:41 2010
From: ben+python at (Ben Finney)
Date: Wed, 17 Nov 2010 08:31:41 +1100
Subject: [Python-Dev] Breaking undocumented API
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

exarkun at writes:

> On 03:48 pm, guido at wrote:
> >Hm. Apart from the specific semantics assigned by the language to
> >single and double leading (and trailing) underscores, I still think
> >this belongs in a style guide, not in the library manual.
> I don't think it belongs only in PEP 8 (that's "a style guide" you're
> referring to, correct?).

I don't know about Guido, but I'd be -1 on suggestions to add more
normative information to PEP 7, PEP 8, PEP 257, or any other established
style guide PEP. I certainly don't want to have to keep going back to
the same documents frequently just to see if the set of recommendations
I already know has changed recently.

Rather, I took Guido's mention of "this belongs in a style guide" as
suggesting a *new* style guide. Perhaps one that explicitly obsoletes an
existing one or perhaps not; either way, the updated normative
recommendations are in a new document with a new name, so that one knows
whether one has already read it.

> It needs to be front and center. This is information that every single
> user of the stdlib needs in order to use the stdlib correctly.

True enough. This is information that goes beyond a style guide for
writers, and into conventions that API users need to know also.

 \         "I went to the museum where they had all the heads and arms |
  `\     from the statues that are in all the other museums." --Steven |
_o__)                                                           Wright |
Ben Finney

From fdrake at  Tue Nov 16 22:41:39 2010
From: fdrake at (Fred Drake)
Date: Tue, 16 Nov 2010 16:41:39 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

On Tue, Nov 16, 2010 at 4:31 PM, Ben Finney <ben+python at> wrote:
> I don't know about Guido, but I'd be -1 on suggestions to add more
> normative information to PEP 7, PEP 8, PEP 257, or any other established
> style guide PEP. I certainly don't want to have to keep going back to
> the same documents frequently just to see if the set of recommendations
> I already know has changed recently.


Many style guides are written as extensions of PEP 8 in particular.
This has already bitten the Zope community, which was developing style
beyond what was even written in its own extension, only to have PEP 8
change out from under it in a contrary manner.

Lessons we learned:

- If you refer to someone else's documents, refer to specific versions.
  References can be updated explicitly if desired.

- If you have even an advisory point of style, write it down in the style
  guide, so people who read the foundational documents you referred to
  without version information will be aware of the expectations.

Otherwise, you may as well not have one.


Fred L. Drake, Jr.    <fdrake at>
"A storm broke loose in my mind."  --Albert Einstein

From guido at  Tue Nov 16 22:49:16 2010
From: guido at (Guido van Rossum)
Date: Tue, 16 Nov 2010 13:49:16 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

On Tue, Nov 16, 2010 at 8:34 AM,  <exarkun at> wrote:
> On 03:48 pm, guido at wrote:
>> On Tue, Nov 16, 2010 at 7:16 AM, Alexander Belopolsky
>> <alexander.belopolsky at> wrote:
>>> What this thread has shown is that there is no consensus on what
>>> public names are and what rules should be followed when changing names
>>> that can be imported from a module.  I have opened an issue at
>>> to address this.  My vote is to
>>> adopt the definition spelled out in the language reference, copy it to
>>> the library manual and add some discussion of the deprecation
>>> policies.
>> Hm. Apart from the specific semantics assigned by the language to
>> single and double leading (and trailing) underscores, I still think
>> this belongs in a style guide, not in the library manual. When reading
>> the library manual, one should always assume that undocumented
>> features are subject to change at any time.
> I don't think it belongs only in PEP 8 (that's "a style guide" you're
> referring to, correct?).  It needs to be front and center.  This is
> information that every single user of the stdlib needs in order to use the
> stdlib correctly.

That depends on what methods you're imagining "every single user" is
using to find out what the API *is*.

In my experience there are many ways people do this:

- by reading the source

- by reading the official docs

- by trial and error

- inspection of objects (e.g. dir())

- using help()

- by reading pydoc output collected on some website (or local disk)

- by following tutorials

- by reading books containing reference documentation generated by 3rd
party authors

Most people do several of those things. (Personally, I learned about
many APIs by creating them. But I'm probably an exception. :-)
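
To make the gap concrete: dir() shows everything in a module's namespace,
while the language reference's definition of public names looks at __all__
first and falls back to the leading-underscore rule. Here is a rough sketch
of that rule. The function public_names and the module "demo" are
hypothetical, made up for this example, and this is not how any stdlib tool
is implemented.

```python
import types


def public_names(module):
    """Names a module exports: __all__ if present, else non-underscore names."""
    declared = getattr(module, "__all__", None)
    if declared is not None:
        return sorted(declared)
    return sorted(name for name in vars(module) if not name.startswith("_"))


# Build a throwaway module so the example is self-contained.
demo = types.ModuleType("demo")
demo.useful = lambda: None
demo._helper = lambda: None

print("_helper" in dir(demo))   # dir() lists the private helper as well
print(public_names(demo))       # the underscore rule filters it out

# Once __all__ is defined, it alone decides what counts as public.
demo.__all__ = ["useful"]
print(public_names(demo))
```

Someone exploring with dir() sees _helper next to useful, with nothing but
the naming convention to tell them apart.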

> No matter how many times we discuss this policy on this list (I know it's
> come up here before), the majority of python users still won't learn about
> it.

Agreed. And adding a disclaimer to help() or pydoc output won't make
much of a difference, I expect.

> PEP 8 isn't nearly visible enough, either.  Whatever the rule is, it needs
> to be presented with the information itself.  If the rule is that things not
> documented in the library manual have no compatibility guarantees, then all
> of the means of getting documentation *other* than looking at the library
> manual need to indicate this somehow (alternatively, the information
> shouldn't be duplicated, but I doubt I'll convince anyone of that).

Assuming people actually read the disclaimers.

> Here's a stupid proposal.  What if the top of pydoc output said (for stdlib
> modules only) "The library manual is the canonical reference. Refer to it
> before using APIs you find in this documentation."  Still inconvenient, but
> inconvenient is better than secret/impossible.

Personally I think it would be sufficient if the disclaimer was at the
top of the library reference itself. That's certainly enough from a
legalistic "I told you so" POV and I doubt that we'll be able to move
the POV of what people actually use...

--Guido van Rossum (

From alexander.belopolsky at  Tue Nov 16 22:54:24 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 16 Nov 2010 16:54:24 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

On Tue, Nov 16, 2010 at 4:31 PM, Ben Finney <ben+python at> wrote:
> I don't know about Guido, but I'd be -1 on suggestions to add more
> normative information to PEP 7, PEP 8, PEP 257, or any other established
> style guide PEP. I certainly don't want to have to keep going back to
> the same documents frequently just to see if the set of recommendations
> I already know has changed recently.
> Rather, I took Guido's mention of "this belongs in a style guide" as
> suggesting a *new* style guide. Perhaps one that explicitly obsoletes an
> existing one or perhaps not; either way, the updated normative
> recommendations are in a new document with a new name, so that one knows
> whether one has already read it.


Numbered PEPs, while well-known to old-timers, are a really odd place
for newcomers to find a style guide.  This really should be a separate
part at the top level of  Note that we already have a
documentation style guide under "Documenting Python."   Maybe we
should reuse this slot and have, say, a "Python Development" part which
will put together PEP 7, PEP 8 and the documentation "Style Guide" in one
convenient package.

This, however, is a much bigger project than what I had in mind when I
started this thread.

From alexander.belopolsky at  Tue Nov 16 23:19:36 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 16 Nov 2010 17:19:36 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

I created to follow up on unicode C
API issues.

On Tue, Nov 16, 2010 at 10:38 AM, M.-A. Lemburg <mal at> wrote:
> Alexander Belopolsky wrote:
>> What this thread has shown is that there is no consensus on what
>> public names are and what rules should be followed when changing names
>> that can be imported from a module.  I have opened an issue at
>> to address this.  My vote is to
>> adopt the definition spelled out in the language reference, copy it to
>> the library manual and add some discussion of the deprecation
>> policies.
>> I also have a similar question about C API.  Here, in absence of
>> __all__, the answer should be clear: all symbols in public header
>> files should start with either _Py_ or Py_ and those that start with
>> Py_ are public.  The question is what should be done with names that
>> start with Py_, but are not documented?  Can we add an underscore to
>> those names?  If so, should a (deprecated) alias be made available?
>> Should they be documented as deprecated?
>> I think these questions can only be answered on a case by case basis
>> with choices being:
>> 1. Document.
>> 2. Document as deprecated.
>> 3. Document as deprecated, add underscore prefix and retain a deprecated alias.
>> 4. Add an underscore prefix.
>> The specific set of names that I would like to consider is the
>> following from unicode.h.  I am marking with (*) the names that I
>> think should be documented and with (D) those that should be
>> deprecated:
>> PyUnicode_GetMax
>> PyUnicode_Resize (*)
>> PyUnicode_InternImmortal
>> PyUnicode_FromOrdinal (*)
>> PyUnicode_GetDefaultEncoding (D)
>> PyUnicode_AsDecodedObject
>> PyUnicode_AsDecodedUnicode
>> PyUnicode_AsEncodedObject
>> PyUnicode_AsEncodedUnicode
>> PyUnicode_BuildEncodingMap
>> PyUnicode_EncodeDecimal (*)
>> PyUnicode_Append (*)
>> PyUnicode_AppendAndDel (*)
>> PyUnicode_Partition (*)
>> PyUnicode_RPartition (*)
>> PyUnicode_RSplit (*)
>> PyUnicode_IsIdentifier (*)
>> Py_UNICODE_strlen
>> Py_UNICODE_strcpy
>> Py_UNICODE_strcat
>> Py_UNICODE_strncpy
>> Py_UNICODE_strcmp
>> Py_UNICODE_strncmp
>> Py_UNICODE_strchr
>> Py_UNICODE_strrchr
> For Unicode, unicodeobject.h defines which APIs are private or not.
> APIs which don't appear in the header file are either private or
> need to be added to the header file (but I don't think there are
> any in this category).
> All APIs in the header that do not appear in the documentation,
> should be added there as well. unicodeobject.h already provides
> documentation for most of the APIs you've listed above (except some
> new ones that were added later on).
> One API I'm not sure about is PyUnicode_AppendAndDel(). It's somewhat
> obscure and given that we already have PyUnicode_Concat(), I think
> it should be made private and eventually dropped.
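
As a rough illustration of the prefix convention quoted above, here is a
small Python sketch that sorts C symbol names into the buckets being
discussed. Everything in it is an assumption made for the example: the
function classify_c_api_name, the idea of passing in a set of documented
names, and the name _Py_example are all hypothetical, and matching on "Py"
rather than "Py_" is a simplification so that names like PyUnicode_GetMax
are covered.

```python
def classify_c_api_name(name, documented=frozenset()):
    """Classify a C symbol name by the prefix convention discussed above."""
    if name.startswith("_Py"):
        return "private"
    if name.startswith("Py"):
        # Public by prefix; flag it if the (assumed) docs do not mention it.
        return "public" if name in documented else "public but undocumented"
    return "not a Python C-API name"


# Hypothetical input: pretend only PyUnicode_Concat is documented.
documented = frozenset({"PyUnicode_Concat"})
for symbol in ("PyUnicode_Concat", "PyUnicode_GetMax", "_Py_example"):
    print(symbol, "->", classify_c_api_name(symbol, documented))
```

The "public but undocumented" bucket is the one the four choices in the
quoted list apply to.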

From glyph at  Wed Nov 17 00:41:42 2010
From: glyph at (Glyph Lefkowitz)
Date: Tue, 16 Nov 2010 18:41:42 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

On Nov 16, 2010, at 4:49 PM, Guido van Rossum wrote:

>> PEP 8 isn't nearly visible enough, either.  Whatever the rule is, it needs
>> to be presented with the information itself.  If the rule is that things not
>> documented in the library manual have no compatibility guarantees, then all
>> of the means of getting documentation *other* than looking at the library
>> manual need to indicate this somehow (alternatively, the information
>> shouldn't be duplicated, but I doubt I'll convince anyone of that).
> Assuming people actually read the disclaimers.

I don't think it necessarily needs to be presented as a disclaimer.  There will always be people who just ignore part of the information presented, but the message could be something along the lines of "Here's some basic documentation, but it might be out-of-date or incomplete.  You can find a better reference at <>."  If it's easy to click on the link, I think a lot of people will click on it.  Especially since the library reference really _is_ more helpful than the docstrings, for the standard library.

(IMHO, dir()'s semantics are so weird that it should emit a warning too, like "looking for docs?  please use help()".)


From g.brandl at  Wed Nov 17 08:18:59 2010
From: g.brandl at (Georg Brandl)
Date: Wed, 17 Nov 2010 08:18:59 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>
Message-ID: <ibvvo1$40o$>

Am 16.11.2010 19:38, schrieb Jesus Cea:
> Is there any updated mercurial schedule?.
> Any impact related with the new 3.2 schedule (three weeks offset)?

I've been trying to contact Dirkjan and ask; generally, I don't
see much connection to the 3.2 schedule (with the exception that
the final migration day should not be a release day.)


From ncoghlan at  Wed Nov 17 12:45:39 2010
From: ncoghlan at (Nick Coghlan)
Date: Wed, 17 Nov 2010 21:45:39 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

On Wed, Nov 17, 2010 at 2:34 AM,  <exarkun at> wrote:
> I don't think it belongs only in PEP 8 (that's "a style guide" you're
> referring to, correct?).  It needs to be front and center.  This is
> information that every single user of the stdlib needs in order to use the
> stdlib correctly.
> Imagine trying to use a dictionary without knowing about alphabetical
> ordering.  Or driving a car without knowing what lane markers indicate.

The definition of the public/private policy in all its gory detail
should be in PEP 8 as Guido suggests.

The library documentation may then contain a note about the difference
in compatibility guarantees for public and private APIs, say that any
interface and behaviour documented in the manual qualifies as public,
then point readers to PEP 8 for the precise details.

A similar note could be placed in the C API documentation (with a
reference to the detailed policy in PEP 7, perhaps REsTify'ing that
PEP in the process in order to link directly to the naming convention
section).

Cheers,
Nick.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From fuzzyman at  Wed Nov 17 12:57:17 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 17 Nov 2010 11:57:17 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>
	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>
Message-ID: <>

On 17/11/2010 11:45, Nick Coghlan wrote:
> On Wed, Nov 17, 2010 at 2:34 AM,<exarkun at>  wrote:
>> I don't think it belongs only in PEP 8 (that's "a style guide" you're
>> referring to, correct?).  It needs to be front and center.  This is
>> information that every single user of the stdlib needs in order to use the
>> stdlib correctly.
>> Imagine trying to use a dictionary without knowing about alphabetical
>> ordering.  Or driving a car without knowing what lane markers indicate.
> The definition of the public/private policy in all its gory detail
> should be in PEP 8 as Guido suggests.

+1


Have we agreed the policy though?

> The library documentation may then contain a note about the difference
> in compatibility guarantees for public and private APIs, say that any
> interface and behaviour documented in the manual qualifies as public,
> then point readers to PEP 8 for the precise details.

+1


This sounds like the right approach to me.

All the best,

> A similar note could be placed in the C API documentation (with a
> reference to the detailed policy in PEP 7, perhaps REsTify'ing that
> PEP in the process in order to link directly to the naming convention
> section).
> Cheers,
> Nick.


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From lukasz at  Wed Nov 17 13:37:27 2010
From: lukasz at (Łukasz Langa)
Date: Wed, 17 Nov 2010 13:37:27 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>
Message-ID: <>

Am 17.11.2010 12:57, schrieb Michael Foord:
> On 17/11/2010 11:45, Nick Coghlan wrote:
>> The definition of the public/private policy in all its gory detail
>> should be in PEP 8 as Guido suggests.
> +1

Guido did not say that, though. I'm with Fred and other people who
agree that PEPs should be more or less immutable. Let's make a new document
(PEP 88?). The reasoning was well laid out here:

> Have we agreed the policy though?

Everybody has their own opinion on the matter. This discussion thread is 
getting too fractured to actually get us far enough with the 
conclusions. Let's make a PEP and discuss concrete wording on a concrete
proposal.

>> The library documentation may then contain a note about the difference
>> in compatibility guarantees for public and private APIs, say that any
>> interface and behaviour documented in the manual qualifies as public,
>> then point readers to PEP 8 for the precise details.
> +1

Yes, point to PEP 88.

Best regards,
Łukasz Langa

From jcea at  Wed Nov 17 13:51:49 2010
From: jcea at (Jesus Cea)
Date: Wed, 17 Nov 2010 13:51:49 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <ibvvo1$40o$>
References: <> <ibvvo1$40o$>
Message-ID: <>


On 17/11/10 08:18, Georg Brandl wrote:
> Am 16.11.2010 19:38, schrieb Jesus Cea:
>> Is there any updated mercurial schedule?.
>> Any impact related with the new 3.2 schedule (three weeks offset)?
> I've been trying to contact Dirkjan and ask; generally, I don't
> see much connection to the 3.2 schedule (with the exception that
> the final migration day should not be a release day.)

I can't find the mail now, but I remember that months ago the Mercurial
migration schedule was mid-December. I wonder if there is any update.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz

From emile.anclin at  Wed Nov 17 13:48:06 2010
From: emile.anclin at (Emile Anclin)
Date: Wed, 17 Nov 2010 13:48:06 +0100
Subject: [Python-Dev] python3k vs _ast
Message-ID: <201011171348.07169.emile.anclin@logilab>

hello everybody,

migrating Pylint to python3.x, we encounter a little problem:
in the tree generated by _ast, if we consider an "args" node (representing
an argument of a function), the "lineno" (and the "col_offset")
information has disappeared from those nodes. Is there a particular
reason for that? In python2.x, the "args" nodes were just "Name" nodes,
and as for now we keep them as "AssName" nodes in astng/pylint and would
like to know where it was defined.

thx for any information
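
For reference, the nodes in question can be inspected with the public ast
module. This is only a sketch for reproducing the observation: it uses
getattr with a default so that it runs whether or not the interpreter at
hand attaches position information to argument nodes, and it makes no claim
about which versions do.

```python
import ast

source = "def f(a, b):\n    return a + b\n"
tree = ast.parse(source)
func = tree.body[0]          # the FunctionDef node for f

for node in func.args.args:
    # In Python 3 each positional argument is an ast.arg node, not a Name.
    print(
        type(node).__name__,
        node.arg,
        getattr(node, "lineno", None),
        getattr(node, "col_offset", None),
    )
```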


Emile Anclin <emile.anclin at> 
Informatique scientifique & et gestion de connaissances

From fuzzyman at  Wed Nov 17 14:11:51 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 17 Nov 2010 13:11:51 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>
Message-ID: <>

On 17/11/2010 12:37, Łukasz Langa wrote:
> Am 17.11.2010 12:57, schrieb Michael Foord:
>> On 17/11/2010 11:45, Nick Coghlan wrote:
>>> The definition of the public/private policy in all its gory detail
>>> should be in PEP 8 as Guido suggests.
>> +1
> Guido did not say that, though.

I think that is a reasonable interpretation, and the suggestion that
"in a style guide" means "create a new style guide" is more of a stretch.

> I'm with Fred and other people that agree that PEPs should be 
> more-less immutable. Let's make a new document (PEP 88?). The 
> reasoning was well laid out here:

In those emails Fred provides a solution to his most substantial
difficulty, that other people base their own documents off PEP 8, by
recommending that extension documents should refer to a specific revision.

I don't think those reasons are compelling, and the costs of splitting the
Python development style guide into multiple documents are higher. (They
run the risk of contradicting each other, if you want to find a 
particular rule you have multiple places to check, there is no single 
authoritative place to send people, people *wanting* to base documents 
off the Python style rules now have to refer to multiple places, etc.)

So -1 on splitting Python development style guide into multiple documents.


>> Have we agreed the policy though?
> Everybody has their own opinion on the matter. This discussion thread 
> is getting too fractured to actually get us far enough with the 
> conclusions. Let's make a PEP and discuss concrete wording on a 
> concrete proposal.
>>> The library documentation may then contain a note about the difference
>>> in compatibility guarantees for public and private APIs, say that any
>>> interface and behaviour documented in the manual qualifies as public,
>>> then point readers to PEP 8 for the precise details.
>> +1
> Yes, point to PEP 88.
> Best regards,
> Łukasz Langa


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From fdrake at  Wed Nov 17 14:21:57 2010
From: fdrake at (Fred Drake)
Date: Wed, 17 Nov 2010 08:21:57 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

2010/11/17 Michael Foord <fuzzyman at>:
> So -1 on splitting Python development style guide into multiple documents.

I don't think that the publicness or API stability promises of the
standard library are part of a style guide.  They're an essential part
of the library documentation.  They aren't a guide for 3rd-party code,
and are specific to the standard library.

If we can't come up with something reasonable for the standard
library, we *certainly* shouldn't be making recommendations on the
matter for 3rd party code.  If we do come up with something
reasonable, we can recommend it to others later (once field-proven),
and without duplication.  (Possibly by referring to the standard
library documentation, and possibly by refactoring.  That's not
important until we have something, though.)

  -Fred

Fred L. Drake, Jr.    <fdrake at>
"A storm broke loose in my mind."  --Albert Einstein

From ncoghlan at  Wed Nov 17 14:24:39 2010
From: ncoghlan at (Nick Coghlan)
Date: Wed, 17 Nov 2010 23:24:39 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

2010/11/17 Michael Foord <fuzzyman at>:
> I don't think those reasons are compelling and the cost of splitting the
> Python development style guide into multiple documents are higher. (They run
> the risk of contradicting each other, if you want to find a particular rule
> you have multiple places to check, there is no single authoritative place to
> send people, people *wanting* to base documents off the Python style rules
> now have to refer to multiple places, etc.)
> So -1 on splitting Python development style guide into multiple documents.

Indeed. We don't need to clarify things very often, but the idea of
creating a new PEP every time we want to make something explicit that
was historically implicit (or otherwise underspecified) is a silly
idea. Allowing traceable revisions is what version control is for, and
hence why the PEP archive is part of the SVN repository.

As far as notifying current developers of any changes, they will
generally be following python-dev anyway, or else will get pulled up
on python-checkins if the policy change is significant (and this one
really *isn't* all that significant - the only people it will affect
are those deciding whether to document or deprecate implicitly public
APIs and that almost never happens, since the vast majority of our
APIs are explicitly public or private).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From fuzzyman at  Wed Nov 17 14:25:34 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 17 Nov 2010 13:25:34 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

On 17/11/2010 13:21, Fred Drake wrote:
> 2010/11/17 Michael Foord<fuzzyman at>:
>> So -1 on splitting Python development style guide into multiple documents.
> I don't think that the publicness or API stability promises of the
> standard library are part of a style guide.  They're an essential part
> of the library documentation.  They aren't a guide for 3rd-party code,
> and are specific to the standard library.

PEP 8 *isn't* targeted at third party code - it is the development style 
guide for the Python standard library.

This document gives coding conventions for the Python code comprising the
standard library in the main Python distribution.

The ideal place for informing the Python core developers of the naming 
conventions we should use for our public APIs...

(Which is why Guido said that a style guide *is* the right place for 
this information.)

It doesn't mean it shouldn't be information provided to library users as 
well. (As discussed.)

All the best,

Michael Foord
> If we can't come up with something reasonable for the standard
> library, we *certainly* shouldn't be making recommendations on the
> matter for 3rd party code.  If we do come up with something
> reasonable, we can recommend it to others later (once field-proven),
> and without duplication.  (Possibly by referring to the standard
> library documentation, and possibly by refactoring.  That's not
> important until we have something, though.)
>    -Fred
> --
> Fred L. Drake, Jr.<fdrake at>
> "A storm broke loose in my mind."  --Albert Einstein



From dirkjan at  Wed Nov 17 14:23:59 2010
From: dirkjan at (Dirkjan Ochtman)
Date: Wed, 17 Nov 2010 14:23:59 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <> <ibvvo1$40o$>
Message-ID: <>

On Wed, Nov 17, 2010 at 13:51, Jesus Cea <jcea at> wrote:
> I can't find the mail now, but I remember that months ago the Mercurial
> migration schedule was mid-december. I wonder if there is any update.

I'm still aiming for that date. I've had some problems getting the
test repository together. It's almost done, but I'm on holiday in
Boston and NYC this week, so I don't have much time to spend on it.
The delay shouldn't be much more than a week, and we'll just compress
the testing period such that the migration date should still be about
the same, release schedules willing.

Georg, if you have any further questions, mail is better than IRC
while I'm here.



From phd at  Wed Nov 17 14:29:59 2010
From: phd at (Oleg Broytman)
Date: Wed, 17 Nov 2010 16:29:59 +0300
Subject: [Python-Dev] python3k vs _ast
In-Reply-To: <201011171348.07169.emile.anclin@logilab>
References: <201011171348.07169.emile.anclin@logilab>
Message-ID: <>

Seems to be rather a usage question, not a development question (python-dev
is about *developing* python, not *using* it).

On Wed, Nov 17, 2010 at 01:48:06PM +0100, Emile Anclin wrote:
> hello everybody,
> migrating Pylint to python3.x, we encounter a little problem :
> in the tree generated by _ast, if we consider a "args" node (representing 
> an argument of a function), the "lineno" (and the "col_offset")
> information disappeared from those nodes. Is there a particular 
> reason for that ? In python2.x, the "args" nodes were just "Name" nodes,
> and as for now we keep them as "AssName" nodes in astng/pylint and would 
> like to know where it was defined.
> thx for any information
> -- 
> Emile Anclin <emile.anclin at>
> Informatique scientifique & et gestion de connaissances

     Oleg Broytman              phd at
           Programmers don't die, they just GOSUB without RETURN.

From ncoghlan at  Wed Nov 17 14:30:25 2010
From: ncoghlan at (Nick Coghlan)
Date: Wed, 17 Nov 2010 23:30:25 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

On Wed, Nov 17, 2010 at 11:21 PM, Fred Drake <fdrake at> wrote:
> 2010/11/17 Michael Foord <fuzzyman at>:
>> So -1 on splitting Python development style guide into multiple documents.
> I don't think that the publicness or API stability promises of the
> standard library are part of a style guide.  They're an essential part
> of the library documentation.  They aren't a guide for 3rd-party code,
> and are specific to the standard library.
> If we can't come up with something reasonable for the standard
> library, we *certainly* shouldn't be making recommendations on the
> matter for 3rd party code.  If we do come up with something
> reasonable, we can recommend it to others later (once field-proven),
> and without duplication.  (Possibly by referring to the standard
> library documentation, and possibly by refactoring.  That's not
> important until we have something, though.)

Would it make people happier if we left PEP 7 and PEP 8 alone, and put
the clarification of what constitutes a "public API" into PEP 5
instead? PEP 5 is currently the deprecation policy for language
constructs; it would be easy enough to extend it to all public APIs.

The library documentation is *not* the right place for quibbling about
what constitutes a public API when using other means than the library
documentation to find APIs to call.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From lukasz at  Wed Nov 17 14:31:41 2010
From: lukasz at (Łukasz Langa)
Date: Wed, 17 Nov 2010 14:31:41 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>
	<4CE3CC87.1000105@lan> <4CE3D497.50102@voidspace.or>
Message-ID: <>

On 17.11.2010 14:11, Michael Foord wrote:
> I don't think those reasons are compelling and the cost of splitting 
> the Python development style guide into multiple documents are higher. 
> (They run the risk of contradicting each other, if you want to find a 
> particular rule you have multiple places to check, there is no single 
> authoritative place to send people, people *wanting* to base documents 
> off the Python style rules now have to refer to multiple places, etc.)
> So -1 on splitting Python development style guide into multiple 
> documents.

Bah, again my English skills failed me in a critical moment ;) I was 
proposing creation of PEP 88 to supersede PEP 8. This would be better 
IMO for the following reasons:

1. Existing projects wouldn't have to explain afterwards why they differ 
from PEP 8, e.g. in terms of public/private API declaration. "Your 
project claims PEP8 conformance! Why don't you use __all__?" "Ah, that 
was before they've added this part to PEP8."

2. All other projects (new and old) would have a much more explicit 
(better than implicit) sign that *something significant has changed* in 
the recommended style.

3. As someone already said, PEP8 is not visible enough. Transition from 
PEP 8 to PEP 88 could help to make some hype that would help raise the 
awareness within the community.

Mutating PEP8 is bad form. We fight mercilessly over source code 
backwards compatibility so I think PEPs should be taken just as 
seriously in that regard.


From benjamin at  Wed Nov 17 14:36:37 2010
From: benjamin at (Benjamin Peterson)
Date: Wed, 17 Nov 2010 07:36:37 -0600
Subject: [Python-Dev] python3k vs _ast
In-Reply-To: <>
References: <201011171348.07169.emile.anclin@logilab>
Message-ID: <>

2010/11/17 Oleg Broytman <phd at>:
> Seems to be rather a usage question, not a development question (python-dev
> is about *developing* python, not *using* it).

Well, technically I think it's a feature request.

> On Wed, Nov 17, 2010 at 01:48:06PM +0100, Emile Anclin wrote:
>> hello everybody,
>> migrating Pylint to python3.x, we encounter a little problem :
>> in the tree generated by _ast, if we consider a "args" node (representing
>> an argument of a function), the "lineno" (and the "col_offset")
>> information disappeared from those nodes. Is there a particular
>> reason for that ? In python2.x, the "args" nodes were just "Name" nodes,
>> and as for now we keep them as "AssName" nodes in astng/pylint and would
>> like to know where it was defined.

I wouldn't object to adding them back if you want to file a bug report.
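
For readers who want to check what their own interpreter reports, here is a minimal sketch using the public `ast` module rather than `_ast` directly. The helper name `argument_positions` is hypothetical. Whether `lineno` and `col_offset` are present on argument nodes depends on the Python version, so the sketch reads them defensively and reports `None` when they are missing.

```python
import ast


def argument_positions(source):
    """Return (name, lineno, col_offset) for every function argument node.

    Hypothetical helper: position attributes are read with getattr so the
    sketch also runs on interpreters whose "arg" nodes carry no position.
    """
    tree = ast.parse(source)
    positions = []
    for node in ast.walk(tree):
        if isinstance(node, ast.arg):
            positions.append(
                (
                    node.arg,
                    getattr(node, "lineno", None),
                    getattr(node, "col_offset", None),
                )
            )
    return positions


if __name__ == "__main__":
    for entry in argument_positions("def f(x, y):\n    return x + y\n"):
        print(entry)
```

A tool such as the one described above could fall back to the position of the enclosing function definition whenever the helper reports `None`.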


From fdrake at  Wed Nov 17 14:45:03 2010
From: fdrake at (Fred Drake)
Date: Wed, 17 Nov 2010 08:45:03 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

On Wed, Nov 17, 2010 at 8:30 AM, Nick Coghlan <ncoghlan at> wrote:
> The library documentation is *not* the right place for quibbling about
> what constitutes a public API when using other means than the library
> documentation to find APIs to call.

Quibbling can happen on the mailing list, where it can be ignored by
those who aren't interested.

But the documentation is the right place to document what we come up
with for the standard library.  I expect what the tools do will inform
any decisions, and the tools (those in the stdlib) will henceforth be
maintained with that in mind.

I *am* suggesting that the scope of this be restricted to what's
appropriate for the standard library, rather than a general
recommendation for others.  Third-party projects are free to use what
we come up with, or provide their own policies.  That's theirs to
decide, and I see no value in interfering with that.

  -Fred

Fred L. Drake, Jr.    <fdrake at>
"A storm broke loose in my mind."  --Albert Einstein

From fuzzyman at  Wed Nov 17 14:53:24 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 17 Nov 2010 13:53:24 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>
	<4CE3CC87.1000105@lan> <4CE3D497.50102@voidspace.or>
Message-ID: <>

On 17/11/2010 13:31, Łukasz Langa wrote:
> On 17.11.2010 14:11, Michael Foord wrote:
>> I don't think those reasons are compelling and the cost of splitting 
>> the Python development style guide into multiple documents are 
>> higher. (They run the risk of contradicting each other, if you want 
>> to find a particular rule you have multiple places to check, there is 
>> no single authoritative place to send people, people *wanting* to 
>> base documents off the Python style rules now have to refer to 
>> multiple places, etc.)
>> So -1 on splitting Python development style guide into multiple 
>> documents.
> Bah, again my English skills failed me in a critical moment ;) I was 
> proposing creation of PEP 88 to supersede PEP 8. This would be better 
> IMO for the following reasons:
> 1. Existing projects wouldn't have to explain afterwards why they 
> differ from PEP 8, e.g. in terms of public/private API declaration. 
> "Your project claims PEP8 conformance! Why don't you use __all__?" 
> "Ah, that was before they've added this part to PEP8."
> 2. All other projects (new and old) would have a much more explicit 
> (better than implicit) sign that *something significant has changed* 
> in the recommended style.
> 3. As someone already said, PEP8 is not visible enough. Transition 
> from PEP 8 to PEP 88 could help to make some hype that would help 
> raise the awareness within the community.
> Mutating PEP8 is bad form. We fight mercilessly over source code 
> backwards compatibility so I think PEPs should be taken just as 
> seriously in that regard.

Given the following:

Anyone who thinks that PEP 8 is immutable (and should remain so) is 
already wrong...

As discussed, the goal is to codify what is already considered "best 
practise" within the wider community and the standard library *anyway*. 
So in practise this won't be a great surprise or change.

As to the publicity, PEP 8 is both the most widely known PEP and the 
most widely known Python style guide. This isn't an argument for letting 
it rot, nor for deprecating it and invalidating all those tutorials / 
developers / links / books that consider it authoritative. Better to 
carefully and slowly evolve it as practise and the language change.

For those wanting immutable versions we provide that in the form of 
specific revisions.

All the best,


> Łukasz



From steve at  Wed Nov 17 15:16:53 2010
From: steve at (Steven D'Aprano)
Date: Thu, 18 Nov 2010 01:16:53 +1100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>
	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>
Message-ID: <>

Ben Finney wrote:

> I don't know about Guido, but I'd be -1 on suggestions to add more
> normative information to PEP 7, PEP 8, PEP 257, or any other established
> style guide PEP. I certainly don't want to have to keep going back to
> the same documents frequently just to see if the set of recommendations
> I already know has changed recently.

This is not a problem unique to any specific PEP. How do we learn about 
any changes that might interest us? What are the alternatives?

- our knowledge is fixed to what we knew at some particular date, and 
gets further and further obsolete as time goes by;

- we actively search out new knowledge;

- we wait for somebody to tell us that something we knew has changed.

(E.g. I was rather surprised to learn that, sometime over the last few 
years, the number of extra-solar planets known to astronomers has 
increased from the one or two I was aware of to multiple dozens.)

All three strategies have advantages and disadvantages.

Regardless of whether future versions of the style-guide are called "PEP 
8" or whether they are given new names ("PEP 8" -> "PEP 88" -> ...), we 
have the identical problem -- how do we know whether or not there is a 
new version of the style guide to look for? In twelve months time, how 
sure will we be that PEP 88 is the most recent version to look for? 
Perhaps we missed the release of PEP 95.

The one advantage of giving each revision of the document an updated 
name is that, under some circumstances, we *might* be able to detect a 
new revision easily. If I think that PEP 88 is the most recent version, 
and somebody says that the recommended style guide is PEP 89, I might:

- think that he merely made a mistake, and meant to say 88; or
- think that there is a new document for me to look at.

> Rather, I took Guido's mention of "this belongs in a style guide" as
> suggesting a *new* style guide. Perhaps one that explicitly obsoletes an
> existing one or perhaps not; either way, the updated normative
> recommendations are in a new document with a new name, so that one knows
> whether one has already read it.

How do you know which is the most recent version of the style guide to 
look at? Instead of doing a O(1) lookup of PEP 8, you have to follow a 
potentially O(N) search:

PEP 8 is obsoleted by PEP 88... go and look at PEP 88.
PEP 88 is obsoleted by PEP 93... go and look at PEP 93.
PEP 93 is obsoleted by PEP 123... go and look at PEP 123.
PEP 123 doesn't contain an "obsoleted by" notice, so:
(1) either it is the current document, or
(2) it has been obsoleted, but the link to the new version was missed, 
and it is now very hard to discover what the current document is called.
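
The lookup cost described above can be sketched in a few lines. This is only an illustration: the function name `resolve_current` and the mapping are hypothetical, and the PEP numbers are the made-up ones from the example, not real documents.

```python
def resolve_current(pep, obsoleted_by):
    """Follow "obsoleted by" links until a document has no successor.

    Returns the final PEP number and the number of hops taken. The hop
    count grows with the length of the chain, which is the O(N) cost.
    """
    hops = 0
    seen = {pep}
    while pep in obsoleted_by:
        pep = obsoleted_by[pep]
        if pep in seen:
            # A cycle of notices would otherwise loop forever.
            raise ValueError("cyclic 'obsoleted by' chain at PEP %d" % pep)
        seen.add(pep)
        hops += 1
    return pep, hops


# Hypothetical chain taken from the example: 8 -> 88 -> 93 -> 123
chain = {8: 88, 88: 93, 93: 123}

if __name__ == "__main__":
    print(resolve_current(8, chain))
```

If a single link is missing from the mapping, the walk stops early and reports a stale document as current, which is case (2) in the list.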

Personally, I don't think the current PEP arrangement is broken enough 
to change it. Each PEP is already tracked in VCS and history is 
available for it. There's insufficient advantage, and some disadvantage, 
to splitting each revision of the PEPs into new documents with new 
names. -1 on the idea.


From emile.anclin at  Wed Nov 17 15:18:14 2010
From: emile.anclin at (Emile Anclin)
Date: Wed, 17 Nov 2010 15:18:14 +0100
Subject: [Python-Dev] python3k vs _ast
In-Reply-To: <>
References: <201011171348.07169.emile.anclin@logilab>
Message-ID: <201011171518.14387.emile.anclin@logilab>

On Wednesday 17 November 2010 14:36:37 Benjamin Peterson wrote:
> I wouldn't object to adding them back if you want to file a bug report.

Ok, thank you for the quick reply.

here is the issue :


Emile Anclin <emile.anclin at> 
Informatique scientifique & et gestion de connaissances

From steve at  Wed Nov 17 15:19:22 2010
From: steve at (Steven D'Aprano)
Date: Thu, 18 Nov 2010 01:19:22 +1100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>	<4CE3CC87.1000105@lan> <4CE3D497.50102@voidspace.or>
Message-ID: <>

Łukasz Langa wrote:

> Mutating PEP8 is bad form. We fight mercilessly over source code 
> backwards compatibility so I think PEPs should be taken just as 
> seriously in that regard.

There's no comparison between the two.

If you change your library's API -- not "source code", it doesn't matter 
if the source code changes so long as the interface remains backwards 
compatible -- then you will break other people's code.

If we change PEP 8, then all that will happen is that some people's 
coding style will no longer be exactly compatible with PEP 8. Their code 
will continue to work.


From ncoghlan at  Wed Nov 17 15:19:39 2010
From: ncoghlan at (Nick Coghlan)
Date: Thu, 18 Nov 2010 00:19:39 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

On Wed, Nov 17, 2010 at 11:45 PM, Fred Drake <fdrake at> wrote:
> On Wed, Nov 17, 2010 at 8:30 AM, Nick Coghlan <ncoghlan at> wrote:
>> The library documentation is *not* the right place for quibbling about
>> what constitutes a public API when using other means than the library
>> documentation to find APIs to call.
> Quibbling can happen on the mailing list, where it can be ignored by
> those who aren't interested.
> But the documentation is the right place to document what we come up
> with for the standard library.  I expect what the tools do will inform
> any decisions, and the tools (those in the stdlib) will henceforth be
> maintained with that in mind.
> I *am* suggesting that the scope of this be restricted to what's
> appropriate for the standard library, rather than a general
> recommendation for others.  Third-party projects are free to use what
> we come up with, or provide their own policies.  That's theirs to
> decide, and I see no value in interfering with that.

The standard library documentation should say that the public API is
what the documentation says it is. Officially, anyone going outside
those documented APIs should not be surprised if things get removed or
changed arbitrarily without warning. That has long been the python-dev
policy and I, for one, don't think it should change.

What we're talking about in this thread is what to do in the grey area
of APIs which are not included in the official documentation, but also
don't have names starting with an underscore so they "look public"
when reading the source code or exploring the API in the interactive
interpreter. It *may* be appropriate for the standard library
documentation to acknowledge that this grey area exists (I'm not yet
convinced on that point), but it definitely should *not* be
encouraging anyone to rely on it or on our policies for dealing with
it.

The policy we're aiming to clarify here is what we should do when we
come across standard library APIs that land in the grey area, with
there being two appropriate ways to deal with them:
1. Document them and make them officially public
2. Deprecate the public names and make them officially private (with
the public names later removed in accordance with normal deprecation
procedures)

The actual approach taken will vary on a case-by-case basis (and is a
little trickier in the case of module level globals, since those can't
be deprecated properly), but is always aimed at bringing the standard
library more into line with the official position (i.e. APIs are
either public-and-documented or private).

So the official policy from a language *user* point of view would
remain unchanged (i.e. if it isn't documented, you're on your own). As
a *pragmatic* policy, however, we would explicitly acknowledge that
developers may inadvertently use an undocumented API without realising
that it isn't technically public, and hence apply the normal
deprecation process even though the official policy says we don't have
to.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From rdmurray at  Wed Nov 17 15:19:35 2010
From: rdmurray at (R. David Murray)
Date: Wed, 17 Nov 2010 09:19:35 -0500
Subject: [Python-Dev] python3k vs _ast
In-Reply-To: <>
References: <201011171348.07169.emile.anclin@logilab>
Message-ID: <>

On Wed, 17 Nov 2010 07:36:37 -0600, Benjamin Peterson <benjamin at> wrote:
> 2010/11/17 Oleg Broytman <phd at>:
> > Seems to be rather a usage question, not a development question (python-dev
> > is about *developing* python, not *using* it).
> Well, technically I think it's a feature request.
> >
> > On Wed, Nov 17, 2010 at 01:48:06PM +0100, Emile Anclin wrote:
> >> hello everybody,
> >>
> >> migrating Pylint to python3.x, we encounter a little problem :
> >> in the tree generated by _ast, if we consider a "args" node (representing
> >> an argument of a function), the "lineno" (and the "col_offset")
> >> information disappeared from those nodes. Is there a particular
> >> reason for that ? In python2.x, the "args" nodes were just "Name" nodes,
> >> and as for now we keep them as "AssName" nodes in astng/pylint and would
> >> like to know where it was defined.
> I wouldn't object to adding them back if you want to file a bug report.

It also seems to me that it was a perfectly appropriate question
for this list.  The question was "why did you developers drop this
(obscure) feature that we depend on in Python3?"  I don't think that
question would make sense on python-list.  Granted, there's a fuzzy
line there, but pylint is really development infrastructure :)

The python-porting list would have been a good alternate choice.

R. David Murray                            

From fuzzyman at  Wed Nov 17 15:25:01 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 17 Nov 2010 14:25:01 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<20101111100516.6e90aa41@mission>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 17/11/2010 14:19, Nick Coghlan wrote:
> On Wed, Nov 17, 2010 at 11:45 PM, Fred Drake<fdrake at>  wrote:
>> On Wed, Nov 17, 2010 at 8:30 AM, Nick Coghlan<ncoghlan at>  wrote:
>>> The library documentation is *not* the right place for quibbling about
>>> what constitutes a public API when using other means than the library
>>> documentation to find APIs to call.
>> Quibbling can happen on the mailing list, where it can be ignored by
>> those who aren't interested.
>> But the documentation is the right place to document what we come up
>> with for the standard library.  I expect what the tools do will inform
>> any decisions, and the tools (those in the stdlib) will henceforth be
>> maintained with that in mind.
>> I *am* suggesting that the scope of this be restricted to what's
>> appropriate for the standard library, rather than a general
>> recommendation for others.  Third-party projects are free to use what
>> we come up with, or provide their own policies.  That's theirs to
>> decide, and I see no value in interfering with that.
> The standard library documentation should say that the public API is
> what the documentation says it is. Officially, anyone going outside
> those documented APIs should not be surprised if things get removed or
> changed arbitrarily without warning. That has long been the python-dev
> policy and I, for one, don't think it should change.
> What we're talking about in this thread is what to do in the grey area
> of APIs which are not included in the official documentation, but also
> don't have names starting with an underscore so they "look public"

We're *also* discussing codifying the naming conventions (or using 
__all__) within the standard library, so it isn't just about 
deprecations (which is why I think PEP 8 rather than PEP 5). This is so 
that in the future if a name looks public users can have more confidence 
that it actually is...
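
To make the two conventions concrete, here is a small sketch of how `__all__` and leading underscores affect what `from module import *` exports. The module name `demo_mod`, its functions and the helper `star_exports` are all hypothetical. The module is built in memory only so that the example is self-contained.

```python
import sys
import types

# Hypothetical module body: one intended API, one "looks public" helper,
# and one name marked private with a leading underscore.
BODY = '''
def parse(text):
    return text.split()

def helper(text):
    return text.strip()

def _internal(text):
    return text
'''

WITH_ALL = '__all__ = ["parse"]\n' + BODY
WITHOUT_ALL = BODY


def star_exports(source, name="demo_mod"):
    """Return the names that "from <name> import *" would bind."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    try:
        namespace = {}
        exec("from %s import *" % name, namespace)
    finally:
        del sys.modules[name]
    # Drop __builtins__, which exec() adds to the namespace itself.
    return sorted(n for n in namespace if not n.startswith("__"))


if __name__ == "__main__":
    print(star_exports(WITH_ALL))
    print(star_exports(WITHOUT_ALL))
```

With `__all__` defined, only the listed name is exported. Without it, every name that lacks a leading underscore is exported, including `helper`, which is the kind of name that merely looks public.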

Obviously what to do about modules that don't follow these rules 
currently is a big part of it (and how the discussion started).

All the best,


> when reading the source code or exploring the API in the interactive
> interpreter. It *may* be appropriate for the standard library
> documentation to acknowledge that this grey area exists (I'm not yet
> convinced on that point), but it definitely should *not* be
> encouraging anyone to rely on it or on our policies for dealing with
> it.
> The policy we're aiming to clarify here is what we should do when we
> come across standard library APIs that land in the grey area, with
> there being two appropriate ways to deal with them:
> 1. Document them and make them officially public
> 2. Deprecate the public names and make them officially private (with
> the public names later removed in accordance with normal deprecation
> procedures)
> The actual approach taken will vary on a case-by-case basis (and is a
> little trickier in the case of module level globals, since those can't
> be deprecated properly), but is always aimed at bringing the standard
> library more into line with the official position (i.e. APIs are
> either public-and-documented or private).
> So the official policy from a language *user* point of view would
> remain unchanged (i.e. if it isn't documented, you're on your own). As
> a *pragmatic* policy, however, we would explicitly acknowledge that
> developers may inadvertently use an undocumented API without realising
> that it isn't technically public, and hence apply the normal
> deprecation process even though the official policy says we don't have
> to.
> Regards,
> Nick.


From jcea at  Wed Nov 17 15:31:02 2010
From: jcea at (Jesus Cea)
Date: Wed, 17 Nov 2010 15:31:02 +0100
Subject: [Python-Dev] I need help with IO testuite
Message-ID: <>

Hi all. I am modifying the IO module for Python 3.2, and I am unable to
understand the mechanism used in the IO test suite to test both the C and the
Python implementation.

In particular I need to test that the implementation passes some
parameters to the OS.

The module uses "Mock" classes, but I think "Mock" is something else,
and I don't see how it interposes between the C/Python code and the OS.

If somebody could explain the mechanism a bit...

Thanks for your time and attention.

Some background:

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From ncoghlan at  Wed Nov 17 15:34:22 2010
From: ncoghlan at (Nick Coghlan)
Date: Thu, 18 Nov 2010 00:34:22 +1000
Subject: [Python-Dev] Proposed adjustments to PEP 0 generation
Message-ID: <>

The lists of Meta-PEPs and Other Informational PEPs at the beginning
of PEP 0 are starting to get a little long, and contain some outdated
information that doesn't really deserve pride of place at the top of
the PEP index.

If I don't hear any objections in this thread, I plan to make the
following tweaks to the PEP 0 generator "soonish":
- make these two lists respect the "Withdrawn" and "Rejected" flags
(i.e. taking the relevant PEPs out of this list and dropping them into
later categories)
- adding a new "Historical" category for PEPs that have served their
purpose and are no longer of immediate interest (primarily old release
PEPs, but also the old SVN migration PEP, the DVCS study and PEP 42)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From solipsis at  Wed Nov 17 15:44:15 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 17 Nov 2010 15:44:15 +0100
Subject: [Python-Dev] I need help with IO testuite
References: <>
Message-ID: <>

On Wed, 17 Nov 2010 15:31:02 +0100
Jesus Cea <jcea at> wrote:
> Hi all. I am modifying IO module for Python 3.2, and I am unable to
> understand the mechanism used in IO testsuite to test both the C and the
> Python implementation.
> In particular I need to test that the implementation passes some
> parameters to the OS.
> The module uses "Mock" classes, but I think "Mock" is something else,
> and I don't see how it interposes between the C/Python code and the OS.

It doesn't interpose between Python and the OS: it mocks the OS.  It
is, therefore, a mock (!).

Consequently, if you want to test that parameters are passed to the OS,
you shouldn't use a mock, but an actual file. There are several tests
which already do that, it shouldn't be too hard to write your own.
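
As a rough illustration of testing against an actual file rather than a
mock, here is a minimal sketch. It assumes that append mode is the
behaviour of interest (that choice, and the helper name, are illustrative
and not taken from the thread): if the append request really reaches the
OS, a write after seek(0) should still land at the end of the file.

```python
import os
import tempfile


def append_after_seek():
    """Write to a real temp file in append mode and return its contents."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"first")
        with open(path, "ab") as f:
            f.seek(0)            # has no effect on where appended data goes
            f.write(b"second")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)          # always clean up the real file
```

The same shape (create a temp file, exercise the real implementation, then
inspect the result through an independent read) should carry over to other
parameters whose effect is observable from user space.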



From ncoghlan at  Wed Nov 17 15:46:01 2010
From: ncoghlan at (Nick Coghlan)
Date: Thu, 18 Nov 2010 00:46:01 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

On Thu, Nov 18, 2010 at 12:25 AM, Michael Foord
<fuzzyman at> wrote:
> We're *also* discussing codifying the naming conventions (or using __all__)
> within the standard library, so it isn't just about deprecations (which is
> why I think PEP 8 rather than PEP 5). This is so that in the future if a
> name looks public users can have more confidence that it actually is...

I deliberately glossed over that, since my stance on the naming
conventions is "don't change them" (i.e. PEP 8 already says that a
leading underscore is an internal use indicator, and I think that's
how we should guide the clarification of our deprecation policy - just
carving out an exception for imported modules).

My original question related to dealing with the grey area in the
deprecation policy (i.e. wanting to remove an API that was
undocumented, but had a public name) and I'm happy that the existing
style guide does answer my question (even though the implications
aren't necessarily obvious).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From phd at  Wed Nov 17 15:50:56 2010
From: phd at (Oleg Broytman)
Date: Wed, 17 Nov 2010 17:50:56 +0300
Subject: [Python-Dev] python3k vs _ast
In-Reply-To: <>
References: <201011171348.07169.emile.anclin@logilab>
Message-ID: <>

On Wed, Nov 17, 2010 at 09:19:35AM -0500, R. David Murray wrote:
> On Wed, 17 Nov 2010 07:36:37 -0600, Benjamin Peterson <benjamin at> wrote:
> > 2010/11/17 Oleg Broytman <phd at>:
> > > Seems to be rather a usage question, not a development question (python-dev
> > > is about *developing* python, not *using* it).
> > 
> > Well, technically I think it's a feature request.
> > 
> > >
> > > On Wed, Nov 17, 2010 at 01:48:06PM +0100, Emile Anclin wrote:
> > >> hello everybody,
> > >>
> > >> migrating Pylint to python3.x, we encounter a little problem :
> > >> in the tree generated by _ast, if we consider a "args" node (representing
> > >> an argument of a function), the "lineno" (and the "col_offset")
> > >> information disappeared from those nodes. Is there a particular
> > >> reason for that ? In python2.x, the "args" nodes were just "Name" nodes,
> > >> and as for now we keep them as "AssName" nodes in astng/pylint and would
> > >> like to know where it was defined.
> > 
> > I wouldn't object to adding them back if you want to file a bug report.
> It also seems to me that it was a perfectly appropriate question
> for this list.  The question was "why did you developers drop this
> (obscure) feature that we depend on in Python3?"

   The problem for me is the wording. A question like "why did you
developers drop a feature?" is certainly a development question, while
"like to know where it was defined" seems more like a usage question.
   I apologize for misunderstanding.
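
For readers who want to check the behaviour Emile describes on their own
interpreter, a small sketch using the ast module follows. The helper name
is made up for illustration; getattr() is used with a default because
whether argument nodes carry "lineno" depends on the Python version.

```python
import ast


def describe_args(source):
    """Return (node type, argument name, lineno or None) for each
    positional argument of the first function defined in ``source``."""
    func = ast.parse(source).body[0]
    return [
        (type(node).__name__, node.arg, getattr(node, "lineno", None))
        for node in func.args.args
    ]


if __name__ == "__main__":
    for row in describe_args("def f(a, b):\n    return a\n"):
        print(row)
```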

> I don't think that
> question would make sense on python-list.  Granted, there's a fuzzy
> line there, but pylint is really development infrastructure :)
> The python-porting list would have been a good alternate choice.
> --
> R. David Murray                            

     Oleg Broytman              phd at
           Programmers don't die, they just GOSUB without RETURN.

From ncoghlan at  Wed Nov 17 15:58:30 2010
From: ncoghlan at (Nick Coghlan)
Date: Thu, 18 Nov 2010 00:58:30 +1000
Subject: [Python-Dev] I need help with IO testuite
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 18, 2010 at 12:31 AM, Jesus Cea <jcea at> wrote:
> Hi all. I am modifying IO module for Python 3.2, and I am unable to
> understand the mechanism used in IO testsuite to test both the C and the
> Python implementation.
> In particular I need to test that the implementation passes some
> parameters to the OS.
> The module uses "Mock" classes, but I think "Mock" is something else,
> and I don't see how it interposes between the C/Python code and the OS.

The "Mock" refers to stubbing out or substituting various layers of
the IO stack with the Python implementations in the test file. It
isn't related specifically to the C/Python switching.

> If somebody could explain the mechanism a bit...

The actual C/Python switching happens later in the file. It is best to
start from the bottom of the file (with the list of test cases that
are actually executed) and work your way up from there.
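
The general pattern can be sketched as follows. This is not a copy of the
real test file, only an illustration of one way to run a single test body
against both implementations; the class names are invented, and _pyio is
the pure-Python counterpart of the C-accelerated io module.

```python
import io
import unittest
import _pyio as pyio  # pure-Python implementation of the io module


class BytesIORoundTrip:
    """Implementation-agnostic test body; subclasses supply ``io``."""

    io = None  # bound to a concrete implementation by the subclasses

    def test_write_then_read(self):
        buf = self.io.BytesIO()
        buf.write(b"spam")
        self.assertEqual(buf.getvalue(), b"spam")


class CBytesIORoundTrip(BytesIORoundTrip, unittest.TestCase):
    io = io      # C implementation


class PyBytesIORoundTrip(BytesIORoundTrip, unittest.TestCase):
    io = pyio    # Python implementation


def run_both():
    """Run the shared test once per implementation; return the result."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite(
        loader.loadTestsFromTestCase(case)
        for case in (CBytesIORoundTrip, PyBytesIORoundTrip)
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Only the two concrete subclasses are collected, which is why reading the
list of executed test cases first makes the rest of such a file easier to
follow.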

For what Amaury is talking about, what you can test is that the higher
layers of the IO stack (e.g. BufferedReader) correctly pass the new
flags down to the RawIO layer. You're correct that you can't really
test that RawIO is actually passing the flags down to the OS. However,
if you have a way to check whether the filesystem in use is ZFS, you
may be able to create a conditionally executed test, such that correct
behaviour can be verified just by running on a machine that uses ZFS
for its temp directory.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From tseaver at  Wed Nov 17 15:58:37 2010
From: tseaver at (Tres Seaver)
Date: Wed, 17 Nov 2010 09:58:37 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>
Message-ID: <ic0qit$us7$>

On 11/17/2010 09:16 AM, Steven D'Aprano wrote:
> Ben Finney wrote:
>> I don't know about Guido, but I'd be -1 on suggestions to add more
>> normative information to PEP 7, PEP 8, PEP 257, or any other established
>> style guide PEP. I certainly don't want to have to keep going back to
>> the same documents frequently just to see if the set of recommendations
>> I already know has changed recently.
> This is not a problem unique to any specific PEP. How do we learn about 
> any changes that might interest us? What are the alternatives?
> - our knowledge is fixed to what we knew at some particular date, and 
> gets further and further obsolete as time goes by;
> - we actively search out new knowledge;
> - we wait for somebody to tell us that something we knew has changed.
> (E.g. I was rather surprised to learn that, sometime over the last few 
> years, the number of extra-solar planets known to astronomers has 
> increased from the one or two I was aware of to multiple dozens.)
> All three strategies have advantages and disadvantages.
> Regardless of whether future versions of the style-guide are called "PEP 
> 8" or whether they are given new names ("PEP 8" -> "PEP 88" -> ...), we 
> have the identical problem -- how do we know whether or not there is a 
> new version of the style guide to look for? In twelve months time, how 
> sure will we be that PEP 88 is the most recent version to look for? 
> Perhaps we missed the release of PEP 95.
> The one advantage of giving each revision of the document an updated 
> name is that, under some circumstances, we *might* be able to detect a 
> new revision easily. If I think that PEP 88 is the most recent version, 
> and somebody says that the recommended style guide is PEP 89, I might:
> - think that he merely made a mistake, and meant to say 88; or
> - think that there is a new document for me to look at.
>> Rather, I took Guido's mention of "this belongs in a style guide" as
>> suggesting a *new* style guide. Perhaps one that explicitly obsoletes an
>> existing one or perhaps not; either way, the updated normative
>> recommendations are in a new document with a new name, so that one knows
>> whether one has already read it.
> How do you know which is the most recent version of the style guide to 
> look at? Instead of doing a O(1) lookup of PEP 8, you have to follow a 
> potentially O(N) search:
> PEP 8 is obsoleted by PEP 88... go and look at PEP 88.
> PEP 88 is obsoleted by PEP 93... go and look at PEP 93.
> PEP 93 is obsoleted by PEP 123... go and look at PEP 123.
> PEP 123 doesn't contain an "obsoleted by" notice, so:
> (1) either it is the current document, or
> (2) it has been obsoleted, but the link to the new version was missed, 
> and it is now very hard to discover what the current document is called.
> Personally, I don't think the current PEP arrangement is broken enough 
> to change it. Each PEP is already tracked in VCS and history is 
> available for it. There's insufficient advantage, and some disadvantage, 
> to splitting each revision of the PEPs into new documents with new 
> names. -1 on the idea.

FWIW, Guido recently ruled that updating PEP 333 to indicate how WSGI
would work in Python3 was not appropriate, and suggested instead a new
PEP (3333), stating[1]:

 Of those, IMO only textual clarifications ought to be made to an
 existing, accepted, widely implemented standards-track PEP.

Note that the BDFL ruled this way even though the changes to PEP 333
were essentially clarifications which applied only to Python 3:  the
existing Python 2 semantics would have remained the same.[2]


-- 
Tres Seaver          +1 540-429-0999          tseaver at
Palladion Software   "Excellence by Design"


From ncoghlan at  Wed Nov 17 16:00:20 2010
From: ncoghlan at (Nick Coghlan)
Date: Thu, 18 Nov 2010 01:00:20 +1000
Subject: [Python-Dev] I need help with IO testuite
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 18, 2010 at 12:58 AM, Nick Coghlan <ncoghlan at> wrote:
> For what Amaury is talking about, what you can test is that the higher
> layers of the IO stack (e.g. BufferedReader) correctly pass the new
> flags down to the RawIO layer. You're correct that you can't really
> test that RawIO is actually passing the flags down to the OS. However,
> if you have a way to check whether the filesystem in use is ZFS, you
> may be able to create a conditionally executed test, such that correct
> behaviour can be verified just by running on a machine that uses ZFS
> for its temp directory.

On further thought, the test should probably be unconditional - just
allow a ValueError as an acceptable result that indicates the
underlying filesystem isn't ZFS.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From alexander.belopolsky at  Wed Nov 17 16:17:45 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Wed, 17 Nov 2010 10:17:45 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

On Wed, Nov 17, 2010 at 9:19 AM, Nick Coghlan <ncoghlan at> wrote:
> The standard library documentation should say that the public API is
> what the documentation says it is. Officially, anyone going outside
> those documented APIs should not be surprised if things get removed or
> changed arbitrarily without warning. That has long been the python-dev
> policy and I, for one, don't think it should change.

That's another reason why it is appropriate to document this in both
Library Reference and the Developers Guide (whatever it is).  In the
Library Reference we can say point-blank: "This is the authoritative
documentation of what Python Library provides.  Anything not mentioned
here is subject to change between releases without notice."  In the
Developers Guide, however, we can take a more nuanced approach
that would start with a general policy that changing existing APIs
public or not is costly and should not be done without significant
offsetting benefit.  More on this below.

> What we're talking about in this thread is what to do in the grey area
> of APIs which are not included in the official documentation, but also
> don't have names starting with an underscore so they "look public"
> when reading the source code or exploring the API in the interactive
> interpreter. It *may* be appropriate for the standard library
> documentation to acknowledge that this grey area exists (I'm not yet
> convinced on that point), but it definitely should *not* be
> encouraging anyone to rely on it or on our policies for dealing with
> it.

Users will venture into grey area regardless of whether its existence
is acknowledged or not.  Developers Guide should take this into
consideration, but there is no need to encourage this practice in the
Library Reference.   In the Developers Guide, we can list a set of
factors that need to be considered when changing or removing an
undocumented API.  For example:

1. Does it start with an underscore?
2. Is __all__ defined for the module? If so, is the name in __all__?
3. Is API name well chosen for what it does?
4. How old is the module?  Was it written before modern policies were
adopted?
5. Is API used in the standard library outside of the module?
6. Is API broken?  Can it be fixed?  (If it was broken in several
releases and nobody complained - it is ok to remove.)
7. Is API used?  General google search or google code search can give
an insight.
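
The first two factors are mechanical enough to sketch in code. The helpers
below are illustrative only (their names and the "documented" set are
assumptions, not an existing tool); imported modules are skipped, in line
with the exception for imported modules mentioned earlier in the thread.

```python
import types


def public_looking_names(module):
    """Names a reader would take as public: those listed in __all__ if it
    is defined, otherwise every non-module name without a leading
    underscore."""
    declared = getattr(module, "__all__", None)
    if declared is not None:
        return sorted(declared)
    return sorted(
        name
        for name, value in vars(module).items()
        if not name.startswith("_")
        and not isinstance(value, types.ModuleType)
    )


def grey_area_names(module, documented):
    """Public-looking names that are missing from the documentation."""
    return [n for n in public_looking_names(module) if n not in documented]
```

The remaining factors (age, quality of the name, actual usage) need human
judgment, which is why the decision stays case by case.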

The decision to remove an API should be always done on a case by case
basis.  Purely style compliance changes such as let's add __all__ and
rename all names not in __all__ by prepending an underscore should always
add old names back as deprecated aliases.  (Breaking from xyz import *
by adding __all__ to xyz is probably ok because code using  from xyz
import * may be broken by any addition to xyz and users have been

> So the official policy from a language *user* point of view would
> remain unchanged (i.e. if it isn't documented, you're on your own). As
> a *pragmatic* policy, however, we would explicitly acknowledge that
> developers may inadvertently use an undocumented API without realising
> that it isn't technically public, and hence apply the normal
> deprecation process even though the official policy says we don't have
> to.

From foom at  Wed Nov 17 16:24:12 2010
From: foom at (James Y Knight)
Date: Wed, 17 Nov 2010 10:24:12 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

On Nov 17, 2010, at 9:19 AM, Nick Coghlan wrote:
> (and is a little trickier in the case of module level globals, since those can't be deprecated properly)

People keep saying this, but there have already been examples shown of how to do it. I actually think that python should include a way to do so standard -- it's a reasonable enough desire, as shown by how many times in this thread the inability to do so has been mentioned. If the existing working 3rd-party mechanisms aren't good enough for python-dev standards, come up with a new way...


From guido at  Wed Nov 17 16:30:03 2010
From: guido at (Guido van Rossum)
Date: Wed, 17 Nov 2010 07:30:03 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

On Wed, Nov 17, 2010 at 7:24 AM, James Y Knight <foom at> wrote:
> On Nov 17, 2010, at 9:19 AM, Nick Coghlan wrote:
>> (and is a little trickier in the case of module level globals, since those can't be deprecated properly)
> People keep saying this, but there have already been examples shown of how to do it. I actually think that python should include a way to do so standard -- it's a reasonable enough desire, as shown by how many times in this thread the inability to do so has been mentioned. If the existing working 3rd-party mechanisms aren't good enough for python-dev standards, come up with a new way...

That's quite the distraction from the current thread though. Start
discussing it on python-ideas, or submit a code fix, or something in
between. But the hackish way that some 3rd party frameworks use
(replacing the module object with a class instance in sys.modules) is
clearly not right for the standard library (I'll explain on
python-ideas if you insist).
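
For readers unfamiliar with the technique being referred to, here is a
rough sketch of it. This is an illustration of the third-party approach,
not something proposed for the standard library; the class and function
names are invented for the example.

```python
import sys
import types
import warnings


class _DeprecatingModule(types.ModuleType):
    """Module subclass that warns when selected attributes are read."""

    _deprecated = {}  # old name -> (value, message); replaced per instance

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so deprecated names must
        # not be stored in the module's own __dict__.
        try:
            value, message = self._deprecated[name]
        except KeyError:
            raise AttributeError(name) from None
        warnings.warn(message, DeprecationWarning, stacklevel=2)
        return value


def install(name, namespace, deprecated):
    """Register a warning-aware module object under ``name``."""
    module = _DeprecatingModule(name)
    module.__dict__.update(namespace)
    module._deprecated = dict(deprecated)
    sys.modules[name] = module   # the replacement step described above
    return module
```

The object in sys.modules is no longer a plain module created by the
import system, which is the part described above as hackish.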

--Guido van Rossum (

From guido at  Wed Nov 17 16:52:37 2010
From: guido at (Guido van Rossum)
Date: Wed, 17 Nov 2010 07:52:37 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
Message-ID: <>

On Tue, Nov 16, 2010 at 1:31 PM, Ben Finney <ben+python at> wrote:
> I don't know about Guido, but I'd be -1 on suggestions to add more
> normative information to PEP 7, PEP 8, PEP 257, or any other established
> style guide PEP. I certainly don't want to have to keep going back to
> the same documents frequently just to see if the set of recommendations
> I already know has changed recently.
> Rather, I took Guido's mention of "this belongs in a style guide" as
> suggesting a *new* style guide. Perhaps one that explicitly obsoletes an
> existing one or perhaps not; either way, the updated normative
> recommendations are in a new document with a new name, so that one knows
> whether one has already read it.

That's not what I meant. In the case of style guides I think it is
totally appropriate to update the PEP as new rules are developed or
existing ones are clarified (or even changed).

I certainly don't want to get into the situation where the style guide
is spread over multiple documents that need to be taken together to
make sense. It's not like PEP 8 specifies an API that is going to
break code in the future -- it is a set of conventions. You could
create a new PEP or move the style guide out of the PEP system (a not
unreasonable option) but the effect of changes to the style guide is
the same: some fraction of old code will become non-compliant. So
what? A style guide is just that -- a guide for coding style. Every
good style guide contains an escape clause: in PEP 8 it is the section
named "A Foolish Consistency is the Hobgoblin of Little Minds".

I've seen many unreasonable uses of style guides. This is a recurring
theme with Google's internal style guides too. For example, some
people get in an argument with a code reviewer about what's the best
way to do something, and they can't agree -- so now they want a
resolution in the style guide, no matter how specific their argument
is to one particular context. Other people claim you cannot change a
style guide because it would make existing code unnecessarily
non-compliant. There are the people who insist that the style guide be
followed mindlessly, even in situations where using a different style
would be clearly better. Then there are the people who want to update
the entire code base to become compliant after each style change.
Etc., etc.

All I want to say is, people lighten up. The style guide can't solve
all your problems. You are never going to have all code compliant. Use
the style guide when it helps, ignore it when it's in the way.

Finally, there's the issue of the scope of PEP 8. Its heading says
that it applies to the stdlib. The reason I put this in was so that
3rd party developers who disagreed with (part of) PEP 8 would not feel
obligated to follow it. At the same time I would hope that most people
see its value and follow (most of) it for their own code, accepting
that a more universal set of conventions helps readability of all
code. I would not be against changes to the style guide that emphasize
that some rules apply specifically to the stdlib (the rules about
mostly not using non-ASCII characters come to mind) and even to
include some normative rules for stdlib developers (e.g. exactly how
to use __all__ and private names).

But we cannot hope that all stdlib modules will all look exactly
alike. It is the work of many contributors, over many years, with
different backgrounds and intentions. That's fine. Let's try to make
new stdlib modules use the best style we can think of, but limit the
time spent fretting over code that's already there.

--Guido van Rossum (

From jcea at  Wed Nov 17 17:07:02 2010
From: jcea at (Jesus Cea)
Date: Wed, 17 Nov 2010 17:07:02 +0100
Subject: [Python-Dev] Help deploying a new buildbot running OpenIndiana/x86
Message-ID: <>

Hi, everybody.

I am glad to say I am installing an OpenIndiana zone (OpenIndiana is a
fork of Indiana, a distribution of OpenSolaris) with the aim to be a
buildbot for python development.

This machine has plenty of disk (even SSD!), CPU and memory for the task.

I am reading . I have installed
buildbotslave already, but I need passwords, etc., to link to python
buildbot infrastructure.

The machine is behind a NAT system, so any incoming connection will need
to be documented and a port mapping request to be done.

So, after installing buildbotslave, what is the next step?

Thanks to OpenIndiana staff, especially Alasdair Lumsden, for providing
the physical resources for this attempt.

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From solipsis at  Wed Nov 17 17:23:01 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 17 Nov 2010 17:23:01 +0100
Subject: [Python-Dev] Help deploying a new buildbot running
References: <>
Message-ID: <>

On Wed, 17 Nov 2010 17:07:02 +0100
Jesus Cea <jcea at> wrote:
> I am reading . I have installed
> buildbotslave already, but I need passwords, etc., to link to python
> buildbot infrastructure.
> The machine is behind a NAT system, so any incoming connection will need
> to be documented and a port mapping request to be done.

There is no incoming connection; however, a bunch of outgoing
connections are made to various hosts by various tests, so it's better
if there's no overzealous firewall in-between.



From foom at  Wed Nov 17 17:23:43 2010
From: foom at (James Y Knight)
Date: Wed, 17 Nov 2010 11:23:43 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

On Nov 17, 2010, at 10:30 AM, Guido van Rossum wrote:
> On Wed, Nov 17, 2010 at 7:24 AM, James Y Knight <foom at> wrote:
>> On Nov 17, 2010, at 9:19 AM, Nick Coghlan wrote:
>>> (and is a little trickier in the case of module level globals, since those can't be deprecated properly)
>> People keep saying this, but there have already been examples shown of how to do it. I actually think that python should include a way to do so standard -- it's a reasonable enough desire, as shown by how many times in this thread the inability to do so has been mentioned. If the existing working 3rd-party mechanisms aren't good enough for python-dev standards, come up with a new way...
> That's quite the distraction from the current thread though. Start
> discussing it on python-ideas, or submit a code fix, or something in
> between. But the hackish way that some 3rd party frameworks use
> (replacing the module object with a class instance in sys.modules) is
> clearly not right for the standard library (I'll explain on
> python-ideas if you insist).

I just don't want people to use the current lack as an excuse to simply remove module attributes without prior deprecation (or make a compatibility policy which recommends doing such a thing). I'll leave it up to the experts on this list (or python-ideas...) to determine how to implement a module-level deprecation in a way that isn't considered "hackish". (Or, if there is no such way, there's also the alternative of simply never removing module-level names.)


From guido at  Wed Nov 17 17:38:09 2010
From: guido at (Guido van Rossum)
Date: Wed, 17 Nov 2010 08:38:09 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

On Wed, Nov 17, 2010 at 8:23 AM, James Y Knight <foom at> wrote:
> On Nov 17, 2010, at 10:30 AM, Guido van Rossum wrote:
>> On Wed, Nov 17, 2010 at 7:24 AM, James Y Knight <foom at> wrote:
>>> On Nov 17, 2010, at 9:19 AM, Nick Coghlan wrote:
>>>> (and is a little trickier in the case of module level globals, since those can't be deprecated properly)
>>> People keep saying this, but there have already been examples shown of how to do it. I actually think that python should include a way to do so standard -- it's a reasonable enough desire, as shown by how many times in this thread the inability to do so has been mentioned. If the existing working 3rd-party mechanisms aren't good enough for python-dev standards, come up with a new way...
>> That's quite the distraction from the current thread though. Start
>> discussing it on python-ideas, or submit a code fix, or something in
>> between. But the hackish way that some 3rd party frameworks use
>> (replacing the module object with a class instance in sys.modules) is
>> clearly not right for the standard library (I'll explain on
>> python-ideas if you insist).
> I just don't want people to use the current lack as an excuse to simply remove module attributes without prior deprecation (or make a compatibility policy which recommends doing such a thing). I'll leave it up to the experts on this list (or python-ideas...) to determine how to implement a module-level deprecation in a way that isn't considered "hackish". (Or, if there is no such way, there's also the alternative of simply never removing module-level names.)

Deprecation doesn't *require* logging a warning or raising an
exception. You can also add a note to the docs, or if it is
undocumented, just add a comment to the code. (Though if it is in
widespread use despite being undocumented, a better way would be to
document it first -- as immediately deprecated if necessary.)

Deprecation is in the end a way to give people advance warning about
future changes. The mechanism of the warning doesn't always have to be
implemented by the interpreter/compiler/parser or whatever other tool.

--Guido van Rossum (

From jcea at  Wed Nov 17 17:52:14 2010
From: jcea at (Jesus Cea)
Date: Wed, 17 Nov 2010 17:52:14 +0100
Subject: [Python-Dev] Help deploying a new buildbot
	running	OpenIndiana/x86
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 17/11/10 17:23, Antoine Pitrou wrote:
> There is no incoming connection; however, a bunch of outgoing
> connections are made to various hosts by various tests, so it's better
> if there's no overzealous firewall in-between.

I know that, just confirming.

You'll need to get someone to create the slavename/slavepasswd on before doing this. Talk to someone like Antoine
Pitrou, Martin von Löwis, Anthony or Neal Norwitz to do this.
#python-dev on freenode is a good place to ask.

Could you provide the connection credentials? I rather prefer to skip
the IRC (I am a XMPP guy), but I can connect to freenode if you need it.

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From foom at  Wed Nov 17 18:05:02 2010
From: foom at (James Y Knight)
Date: Wed, 17 Nov 2010 12:05:02 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

On Nov 17, 2010, at 11:38 AM, Guido van Rossum wrote:
> Deprecation doesn't *require* logging a warning or raising an
> exception. You can also add a note to the docs, or if it is
> undocumented, just add a comment to the code. (Though if it is in
> widespread use despite being undocumented, a better way would be to
> document it first -- as immediately deprecated if necessary.)
> Deprecation is in the end a way to give people advance warning about
> future changes. The mechanism of the warning doesn't always have to be
> implemented by the interpreter/compiler/parser or whatever other tool.

Well, that's certainly a possible policy. I'd suggest that adding notes to the docs after-the-fact is a singularly ineffective way of giving people advance warning of feature removal compared to having the interpreter/compiler/parser or whatever other tool warn you. And if that's to be python's policy, when it's possible to do better, I'm disappointed. (But won't respond further, my point is made.)


From solipsis at  Wed Nov 17 18:10:02 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 17 Nov 2010 18:10:02 +0100
Subject: [Python-Dev] Help deploying a new buildbot
	running	OpenIndiana/x86
References: <> <>
Message-ID: <>

> Could you provide the connection credentials? I rather prefer to skip
> the IRC (I am a XMPP guy), but I can connect to freenode if you need it.

I've already sent you a private e-mail.

From jcea at  Wed Nov 17 18:13:24 2010
From: jcea at (Jesus Cea)
Date: Wed, 17 Nov 2010 18:13:24 +0100
Subject: [Python-Dev] Help deploying a new
	buildbot	running	OpenIndiana/x86
In-Reply-To: <>
References: <>
	<>	<>
Message-ID: <>

On 17/11/10 18:10, Antoine Pitrou wrote:
>> Could you provide the connection credentials? I rather prefer to skip
>> the IRC (I am a XMPP guy), but I can connect to freenode if you need it.
> I've already sent you a private e-mail.

OK. Sorry. My mail greylist is probably involved. Let's wait for another

Thanks for your time, Antoine.

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From janssen at  Wed Nov 17 18:12:53 2010
From: janssen at (Bill Janssen)
Date: Wed, 17 Nov 2010 09:12:53 PST
Subject: [Python-Dev] Help deploying a new buildbot running
In-Reply-To: <>
References: <> <>
Message-ID: <>

Jesus Cea <jcea at> wrote:

> On 17/11/10 17:23, Antoine Pitrou wrote:
> > There is no incoming connection; however, a bunch of outgoing
> > connections are made to various hosts by various tests, so it's better
> > if there's no overzealous firewall in-between.

For those of us who can't do that, there's a list of what machines the testing
framework needs to be able to reach at <>.

If you modify the tests, please keep that list up-to-date.


From alexander.belopolsky at  Wed Nov 17 19:35:19 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Wed, 17 Nov 2010 13:35:19 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

On Wed, Nov 17, 2010 at 8:30 AM, Nick Coghlan <ncoghlan at> wrote:
> The library documentation is *not* the right place for quibbling about
> what constitutes a public API when using other means than the library
> documentation to find APIs to call.

People who bother to read the Library Reference most likely already
know that it is the authoritative source.  People who read the sources
or use deep introspection most likely know that they are walking on
thin ice.   The only grey area is help() and dir().  Unfortunately many
novice guides recommend using these tools for learning as follows:

>>> L = []
>>> dir(L)
['append', 'count', 'extend', 'index', 'insert', 'pop', 'remove',
'reverse', 'sort']
>>> help(L.append)
Help on built-in function append:


Given the quirkiness of dir(), this is probably not the best practice.
 For the standard library however,

>>> help('module')


$ pydoc module

already refer users to the official manual.  Unfortunately this
feature is slightly broken in 3.x (the link takes you to 2.x
documentation instead of 3.x).

I have opened a bug report about this, and would like to add a sentence or
two to the "MODULE DOC" section explaining the differences between the
auto-generated docs and the official manual.

We may also revisit the rules used by help() to decide what to include
on the auto-generated module implementation.  Note that currently
help() output excludes names not in __all__ if the module has __all__
defined.  While I advocated this rule earlier in this thread, I now
realize that it may not be quite practical.  Consider the recent
addition of open() to the tokenize module.  It was documented in the
manual, but (wisely) excluded from tokenize.__all__.

It appears that this discussion is converging to the conclusion that
public API = documented in the reST manual.  An unfortunate
consequence is that it is not easy to discover public API
programmatically.  However, "not easy" does not mean "impossible."
ReST documentation is highly structured and Sphinx already generates
various indices that can be easily queried.  Maybe some of these
indices should be distilled into something compact and made available
to pydoc by the build process.  This would allow help(anyobject) to
display a deep link to the official documentation or a warning that
anyobject is undocumented.

From tjreedy at  Wed Nov 17 20:52:58 2010
From: tjreedy at (Terry Reedy)
Date: Wed, 17 Nov 2010 14:52:58 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<ibf8p4$ajc$>	<>	<>	<>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>
	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>
Message-ID: <ic1bqq$p07$>

On 11/17/2010 10:52 AM, Guido van Rossum wrote:

> That's not what I meant. In the case of style guides I think it is
> totally appropriate to update the PEP as new rules are developed or
> existing ones are clarified (or even changed).

Revising style guides is standard practice. The Chicago Manual of Style, 
which is practically the 'Bible' of American publishing, is now in its 
16th edition after 104 years.

Idea: include the 'current' version of PEP8 in the doc set for each 
Python version as the frozen Python Stdlib Style Guide for that version. 
Then people could specifically refer to the 3.2 version of the style 
guide. PEP8 would then be the trunk version subject to further revision. 
Include with the frozen version the repository id info needed to do a 
diff between it and future revisions so people can discover what has 
changed since whenever.

Terry Jan Reedy

From merwok at  Wed Nov 17 22:08:32 2010
From: merwok at (Éric Araujo)
Date: Wed, 17 Nov 2010 22:08:32 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<ibgrn9$kvf$>	<>	<20101111100516.6e90aa41@mission>
	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>
	<>	<>	<>	<>
Message-ID: <>

> We may also revisit the rules used by help() to decide what to include
> on the auto-generated module implementation.  Note that currently
> help() output excludes names not in __all__ if the module has __all__
> defined.  While I advocated this rule earlier in this thread, I now
> Consider the recent addition of open() to the tokenize module.  It
> was documented in the manual, but (wisely) excluded from tokenize.__all__.

I'm not sure this was on purpose.  Victor?

From ncoghlan at  Wed Nov 17 22:10:01 2010
From: ncoghlan at (Nick Coghlan)
Date: Thu, 18 Nov 2010 07:10:01 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

On Thu, Nov 18, 2010 at 7:08 AM, Éric Araujo <merwok at> wrote:
>> We may also revisit the rules used by help() to decide what to include
>> on the auto-generated module implementation.  Note that currently
>> help() output excludes names not in __all__ if the module has __all__
>> defined.  While I advocated this rule earlier in this thread, I now
>> Consider the recent addition of open() to the tokenize module.  It
>> was documented in the manual, but (wisely) excluded from tokenize.__all__.
> I'm not sure this was on purpose.  Victor?

Excluding a builtin name from __all__ sounds like a perfectly sensible
idea, so even if it wasn't deliberate, I'd say it qualifies as
fortuitous :)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From jaraco at  Wed Nov 17 21:58:10 2010
From: jaraco at (Jason R. Coombs)
Date: Wed, 17 Nov 2010 12:58:10 -0800
Subject: [Python-Dev] new LRU cache API in Py3.2
In-Reply-To: <i5svvt$jkk$>
References: <i5svvt$jkk$>
Message-ID: <12C7AB425F0DD546B6049311F827C74E0986D4151B@VA3DIAXVS141.RED001.local>

I see now that my previous reply went only to Stefan, so I'm re-submitting,
this time to the list.

> -----Original Message-----
> From: Stefan Behnel
> Sent: Saturday, 04 September, 2010 04:29
> What about adding an intermediate namespace called "cache", so that 
> the new operations are available like this:
>      print get_phone_number.cache.hits
>      get_phone_number.cache.clear()

I agree. While the function-based implementation is highly efficient, the
pure use of functions has the counter-Pythonic effect of obfuscating the
internal state (the same way the 'private' keyword does in Java). A
class-based implementation would be capable of having its state introspected
and could easily be extended. While the functional implementation is a
powerful construct, it fails to generalize well. IMHO, a stdlib
implementation should err on the side of transparency and extensibility over

That said, I've adapted Hettinger's Python 2.5 implementation to a
class-based implementation. I've tried to keep the performance optimizations
in place, but instead of instrumenting the wrapped method with lots of
cache_* functions, I simply attach the cache object itself, which then
provides the interface suggested by Stefan. This technique allows access to
the cache object and all of its internal state, so it's also possible to do
things like:

    get_phone_number.cache.maxsize += 100



These techniques are nearly impossible in the functional implementation, as
the state is buried in the locals() of the nested functions.
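
For illustration only, here is a minimal sketch of the kind of class-based
cache described above. It is not the implementation attached to this message
and not the stdlib's functools.lru_cache; the names LRUCache, class_lru_cache
and get_or_compute are hypothetical, and only positional, hashable arguments
are handled to keep it short.

```python
from collections import OrderedDict
from functools import wraps


class LRUCache:
    """Hypothetical cache object: all state lives in plain attributes."""

    def __init__(self, maxsize=100):
        self.maxsize = maxsize
        self.hits = 0
        self.misses = 0
        self._data = OrderedDict()  # least recently used entry comes first

    def get_or_compute(self, key, compute):
        if key in self._data:
            self.hits += 1
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]
        self.misses += 1
        value = compute()
        self._data[key] = value
        while len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict least recently used
        return value

    def clear(self):
        self._data.clear()
        self.hits = self.misses = 0


def class_lru_cache(maxsize=100):
    """Decorator that attaches the cache object to the wrapped function."""
    def decorator(func):
        cache = LRUCache(maxsize)

        @wraps(func)
        def wrapper(*args):
            return cache.get_or_compute(args, lambda: func(*args))

        wrapper.cache = cache  # the interface suggested by Stefan
        return wrapper
    return decorator


@class_lru_cache(maxsize=2)
def get_phone_number(name):
    return "555-" + name


get_phone_number("ann")
get_phone_number("ann")
print(get_phone_number.cache.hits)      # 1
get_phone_number.cache.maxsize += 100   # state can be adjusted after the fact
print(get_phone_number.cache.maxsize)   # 102
```

Because the cache is an ordinary object, a subclass could override eviction
or statistics without touching the decorator.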

I'm most grateful to Raymond for contributing this to Python; on many
occasions, I've used the ActiveState recipes for simple caches, but in
almost every case, I've had to adapt the implementation to provide more
transparency. I'd prefer to not have to do the same with the stdlib.

Jason R. Coombs


From merwok at  Wed Nov 17 22:16:10 2010
From: merwok at (Éric Araujo)
Date: Wed, 17 Nov 2010 22:16:10 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<20101111100516.6e90aa41@mission>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

> Excluding a builtin name from __all__ sounds like a perfectly sensible
> idea, so even if it wasn't deliberate, I'd say it qualifies as
> fortuitous :)

But then, a tool that looks into __all__ to find for example what
objects to document will miss open.  I'd put open in __all__.


From g.brandl at  Wed Nov 17 22:22:50 2010
From: g.brandl at (Georg Brandl)
Date: Wed, 17 Nov 2010 22:22:50 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<20101111100516.6e90aa41@mission>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<AANLkTimrUO2kM>	<>	<>
Message-ID: <ic1h69$jmd$>

On 17.11.2010 22:16, Éric Araujo wrote:
>> Excluding a builtin name from __all__ sounds like a perfectly sensible
>> idea, so even if it wasn't deliberate, I'd say it qualifies as
>> fortuitous :)
> But then, a tool that looks into __all__ to find for example what
> objects to document will miss open.  I'd put open in __all__.

So it comes down again to what we'd like __all__ to mean foremost:
public API, or just a list for "import *"?


From fdrake at  Wed Nov 17 22:39:25 2010
From: fdrake at (Fred Drake)
Date: Wed, 17 Nov 2010 16:39:25 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <ic1h69$jmd$>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
	<> <ic1h69$jmd$>
Message-ID: <>

On Wed, Nov 17, 2010 at 4:22 PM, Georg Brandl <g.brandl at> wrote:
> So it comes down again to what we'd like __all__ to mean foremost:
> public API, or just a list for "import *"?

It is and has been since its inception *the* list for "import *".

Any additional meaning will have to accommodate that usage as well.
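
A small sketch of that star-import behaviour, for readers who want to see it
run. The module "demo_mod" and its contents are made up for the example and
built in memory; nothing here is taken from the stdlib.

```python
import sys
import types

SOURCE = '''
__all__ = ["visible", "_also_exported"]
visible = 1
_also_exported = 2
not_listed = 3
'''

# Build a throwaway module and register it so "import" can find it.
demo_mod = types.ModuleType("demo_mod")
exec(SOURCE, demo_mod.__dict__)
sys.modules["demo_mod"] = demo_mod

# Run the star import in a fresh namespace and see which names were bound.
namespace = {}
exec("from demo_mod import *", namespace)
imported = sorted(name for name in namespace if name != "__builtins__")
print(imported)   # ['_also_exported', 'visible']
```

With __all__ defined, exactly the listed names are bound, including the one
with a leading underscore, while not_listed is skipped.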

  -Fred

Fred L. Drake, Jr.    <fdrake at>
"A storm broke loose in my mind."  --Albert Einstein

From solipsis at  Wed Nov 17 22:48:01 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 17 Nov 2010 22:48:01 +0100
Subject: [Python-Dev] PEP 3151 dictator
Message-ID: <>


I would like to announce that, following Guido's (private) suggestion
that I find a temporary dictator for PEP 3151, Barry has accepted to
fill in this role.



From g.brandl at  Wed Nov 17 22:50:10 2010
From: g.brandl at (Georg Brandl)
Date: Wed, 17 Nov 2010 22:50:10 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>
	<>	<>	<>	<>	<>	<>	<>
Message-ID: <ic1iph$s0r$>

On 17.11.2010 22:39, Fred Drake wrote:
> On Wed, Nov 17, 2010 at 4:22 PM, Georg Brandl <g.brandl at> wrote:
>> So it comes down again to what we'd like __all__ to mean foremost:
>> public API, or just a list for "import *"?
> It is and has been since its inception *the* list for "import *".
> Any additional meaning will have to accommodate that usage as well.

Seeing that "import *" is discouraged anywhere I look, it might just not
be as important anymore.

BTW, "open" is listed in __all__ for lots of modules: io, gzip, dbm...
and even "ancient" ones like aifc.


From steve at  Wed Nov 17 22:57:00 2010
From: steve at (Steven D'Aprano)
Date: Thu, 18 Nov 2010 08:57:00 +1100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<20101111100516.6e90aa41@mission>
	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>
	<>	<>	<>	<>	<>
Message-ID: <>

Nick Coghlan wrote:

> The policy we're aiming to clarify here is what we should do when we
> come across standard library APIs that land in the grey area, with
> there being two appropriate ways to deal with them:
> 1. Document them and make them officially public
> 2. Deprecate the public names and make them officially private (with
> the public names later removed in accordance with normal deprecation
> procedures)

You missed at least two other options:

3. Treat "documented" and "public" as orthogonal, not synonymous: 
undocumented public API is not an oxymoron, and neither is documented 
private API.

4. Do nothing. Inertia wins. Is this problem we're trying to solve so 
serious that we need to solve it now except on a case-by-case basis?

The approach that gives us the most flexibility is #3. Clearly one would 
not need to document private APIs for the use of the general public, but 
adding docstrings to private functions and classes for in-house use is a 
sensible thing to do. This applies equally to the standard library as to 
any other major project.

Likewise, one might introduce a public function into some module, but 
for whatever reason, choose not to document it. (Perhaps it's a lack of 
hours in the day, perhaps it is a deliberate decision.) In this case, 
the mere lack of documentation shouldn't relieve us of the 
responsibility of treating the function as public.

For emphasis: I strongly believe that public/private and 
documented/undocumented are orthogonal qualities, and should not be 
treated as, or forced to be, identical.

The use of imported modules is possibly an exception. If a user is 
writing something like (say) getopt.os.getcwd() instead of importing os 
directly, then they're on shaky ground. We shouldn't expect module 
authors to write "import os as _os" just to avoid making os a part of 
their public API.

I'd be prepared to make an exception to the rule "no leading underscore 
means public": imported modules are implementation details unless 
explicitly documented otherwise. E.g. the os module explicitly makes 
path part of its public API, but os.sys is an implementation detail.


From ben+python at  Thu Nov 18 02:08:08 2010
From: ben+python at (Ben Finney)
Date: Thu, 18 Nov 2010 12:08:08 +1100
Subject: [Python-Dev] Breaking undocumented API
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
Message-ID: <>

Steven D'Aprano <steve at> writes:

> 3. Treat "documented" and "public" as orthogonal, not synonymous:
> undocumented public API is not an oxymoron, and neither is documented
> private API.

+1

> The use of imported modules is possibly an exception. If a user is
> writing something like (say) getopt.os.getcwd() instead of importing
> os directly, then they're on shaky ground. We shouldn't expect module
> authors to write "import os as _os" just to avoid making os a part of
> their public API.
> I'd be prepared to make an exception to the rule "no leading
> underscore means public": imported modules are implementation details
> unless explicitly documented otherwise. E.g. the os module explicitly
> makes path part of its public API, but os.sys is an implementation
> detail.

After reading the discussion for many days, I'm leaning to this position
also.

 \         "I may disagree with what you say, but I will defend to the |
  `\        death your right to mis-attribute this quote to Voltaire." |
_o__)                   -Avram Grumer, rec.arts.sf.written, 2000-05-30 |
Ben Finney

From guido at  Thu Nov 18 03:44:35 2010
From: guido at (Guido van Rossum)
Date: Wed, 17 Nov 2010 18:44:35 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
	<> <>
Message-ID: <>

On Wed, Nov 17, 2010 at 5:08 PM, Ben Finney <ben+python at> wrote:
> Steven D'Aprano <steve at> writes:
>> 3. Treat "documented" and "public" as orthogonal, not synonymous:
>> undocumented public API is not an oxymoron, and neither is documented
>> private API.
> +1
>> The use of imported modules is possibly an exception. If a user is
>> writing something like (say) getopt.os.getcwd() instead of importing
>> os directly, then they're on shaky ground. We shouldn't expect module
>> authors to write "import os as _os" just to avoid making os a part of
>> their public API.
>> I'd be prepared to make an exception to the rule "no leading
>> underscore means public": imported modules are implementation details
>> unless explicitly documented otherwise. E.g. the os module explicitly
>> makes path part of its public API, but os.sys is an implementation
>> detail.
> After reading the discussion for many days, I'm leaning to this position
> also.

Agreed on both counts.

--Guido van Rossum (

From fuzzyman at  Thu Nov 18 11:47:18 2010
From: fuzzyman at (Michael Foord)
Date: Thu, 18 Nov 2010 10:47:18 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<20101111100516.6e90aa41@mission>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 17/11/2010 21:16, Éric Araujo wrote:
>> Excluding a builtin name from __all__ sounds like a perfectly sensible
>> idea, so even if it wasn't deliberate, I'd say it qualifies as
>> fortuitous :)
> But then, a tool that looks into __all__ to find for example what
> objects to document will miss open.  I'd put open in __all__.

"import *" would then override the builtin open. A good reason not to 
use "import *" I guess, but also a good reason not to create names that 
shadow builtins.
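
To make the shadowing concrete, here is a sketch using a made-up module,
"fake_gz", whose __all__ lists its own open(). The module and its function
are hypothetical stand-ins, not any real stdlib module.

```python
import builtins
import sys
import types

SOURCE = '''
__all__ = ["open"]

def open(path):
    return "fake_gz.open(%r)" % path
'''

fake_gz = types.ModuleType("fake_gz")
exec(SOURCE, fake_gz.__dict__)
sys.modules["fake_gz"] = fake_gz

# After the star import, the bare name "open" refers to the module's
# function in this namespace; the builtin itself is untouched.
namespace = {}
exec("from fake_gz import *", namespace)
exec("result = open('data.gz')", namespace)

print(namespace["result"])                   # fake_gz.open('data.gz')
print(namespace["open"] is fake_gz.open)     # True
print(builtins.open is fake_gz.open)         # False
```

Writing "import fake_gz" and calling fake_gz.open() keeps both functions
reachable under unambiguous names.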

All the best,


> Regards
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at
> Unsubscribe:


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From fuzzyman at  Thu Nov 18 11:54:23 2010
From: fuzzyman at (Michael Foord)
Date: Thu, 18 Nov 2010 10:54:23 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <ic1h69$jmd$>
References: <>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<AANLkTimrUO2kM>	<>	<>	<>
Message-ID: <>

On 17/11/2010 21:22, Georg Brandl wrote:
> On 17.11.2010 22:16, Éric Araujo wrote:
>>> Excluding a builtin name from __all__ sounds like a perfectly sensible
>>> idea, so even if it wasn't deliberate, I'd say it qualifies as
>>> fortuitous :)
>> But then, a tool that looks into __all__ to find for example what
>> objects to document will miss open.  I'd put open in __all__.
> So it comes down again to what we'd like __all__ to mean foremost:
> public API, or just a list for "import *"?

Well, as noted earlier in this discussion - the language reference 
*states* that __all__ defines the module level public API.


     "If the list of identifiers is replaced by a star ('*'), all public 
names defined in the module are bound in the local namespace of the 
import statement."


     "The public names defined by a module are determined by checking 
the module's namespace for a variable named __all__"

If we decide that __all__ is purely for "import *" we should refine the 
use of the word public on this page.

All the best,

Michael Foord
> Georg

From fuzzyman at  Thu Nov 18 12:41:07 2010
From: fuzzyman at (Michael Foord)
Date: Thu, 18 Nov 2010 11:41:07 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<20101111100516.6e90aa41@mission>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 17/11/2010 21:57, Steven D'Aprano wrote:
> Nick Coghlan wrote:
>> The policy we're aiming to clarify here is what we should do when we
>> come across standard library APIs that land in the grey area, with
>> there being two appropriate ways to deal with them:
>> 1. Document them and make them officially public
>> 2. Deprecate the public names and make them officially private (with
>> the public names later removed in accordance with normal deprecation
>> procedures)
> You missed at least two other options:
> 3. Treat "documented" and "public" as orthogonal, not synonymous: 
> undocumented public API is not an oxymoron, and neither is documented 
> private API.

Along with the others +1

I think how we handle the deprecations (legacy modules with unclear or 
clearly wrong naming policies) is the least interesting part of this 
discussion. For deprecating existing names we have *no choice* but to 
proceed on a case-by-case basis evaluating how likely the deprecation is 
to break other code, whether or not the name was originally intended to 
be public or not. (At least that is how we *should* proceed and part of 
our standard deprecation policy - it is why we aren't removing 
unittest.TestCase.assertEquals and assert_ even though they are 
deprecated. They are just too widely used.)

What is more important is that we have a clearly stated policy for new 
modules and adding names to existing modules so that we don't have to 
repeat this debate in five years time.

My suggestion, which fits in with the use of __all__ by the language and 
also the convention widely in use by the community already boils down to:

* If __all__ exists it is definitive
* Imported names are never part of the public API of a module unless in 
__all__ or documented to be part of the API
* Names with leading underscores are private unless in __all__ (and if 
you want to export leading underscore names as part of a public API you 
should define __all__ or "import *" won't export them)
* Leading underscore convention extends to packages and class members; 
no members of a package or class whose name begins with a leading 
underscore are public

It is still good practise that public APIs *should* be documented (and 
*should* have docstrings). There is however no corollary that private 
APIs should not be documented (and they may have docstrings).
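
As a rough sketch of how a tool might apply the rules listed above to a
module (the function name public_names and the sample module are
hypothetical). The "documented to be part of the API" exception cannot be
detected by introspection and is not modelled, and the __module__ check for
spotting imported names is only a heuristic.

```python
import inspect
import types


def public_names(module):
    """Return the names the proposed rules would treat as public."""
    explicit = getattr(module, "__all__", None)
    if explicit is not None:
        return sorted(explicit)       # if __all__ exists it is definitive

    names = []
    for name, value in vars(module).items():
        if name.startswith("_"):
            continue                  # leading underscore means private
        if inspect.ismodule(value):
            continue                  # imported modules are not public
        origin = getattr(value, "__module__", module.__name__)
        if origin != module.__name__:
            continue                  # heuristic: defined elsewhere, imported
        names.append(name)
    return sorted(names)


# A hypothetical module built in memory to exercise the rules.
sample = types.ModuleType("sample")
exec(
    "import os\n"
    "from collections import OrderedDict\n"
    "def api(): pass\n"
    "def _helper(): pass\n"
    "LIMIT = 10\n",
    sample.__dict__,
)
print(public_names(sample))           # ['LIMIT', 'api']

sample.__all__ = ["api", "_helper"]
print(public_names(sample))           # ['_helper', 'api']
```

The class-member and package rules would need the same underscore check
applied one level down; they are left out here.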

All the best,

Michael Foord

> 4. Do nothing. Inertia wins. Is this problem we're trying to solve so 
> serious that we need to solve it now except on a case-by-case basis?
> The approach that gives us the most flexibility is #3. Clearly one 
> would not need to document private APIs for the use of the general 
> public, but adding docstrings to private functions and classes for 
> in-house use is a sensible thing to do. This applies equally to the 
> standard library as to any other major project.
> Likewise, one might introduce a public function into some module, but 
> for whatever reason, choose not to document it. (Perhaps it's a lack 
> of hours in the day, perhaps it is a deliberate decision.) In this 
> case, the mere lack of documentation shouldn't relieve us of the 
> responsibility of treating the function as public.
> For emphasis: I strongly believe that public/private and 
> documented/undocumented are orthogonal qualities, and should not be 
> treated as, or forced to be, identical.
> The use of imported modules is possibly an exception. If a user is 
> writing something like (say) getopt.os.getcwd() instead of importing 
> os directly, then they're on shaky ground. We shouldn't expect module 
> authors to write "import os as _os" just to avoid making os a part of 
> their public API.
> I'd be prepared to make an exception to the rule "no leading 
> underscore means public": imported modules are implementation details 
> unless explicitly documented otherwise. E.g. the os module explicitly 
> makes path part of its public API, but os.sys is an implementation 
> detail.



From ncoghlan at  Thu Nov 18 13:16:35 2010
From: ncoghlan at (Nick Coghlan)
Date: Thu, 18 Nov 2010 22:16:35 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <ic1h69$jmd$>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
	<> <ic1h69$jmd$>
Message-ID: <>

On Thu, Nov 18, 2010 at 7:22 AM, Georg Brandl <g.brandl at> wrote:
> On 17.11.2010 22:16, Éric Araujo wrote:
>>> Excluding a builtin name from __all__ sounds like a perfectly sensible
>>> idea, so even if it wasn't deliberate, I'd say it qualifies as
>>> fortuitous :)
>> But then, a tool that looks into __all__ to find for example what
>> objects to document will miss open.  I'd put open in __all__.
> So it comes down again to what we'd like __all__ to mean foremost:
> public API, or just a list for "import *"?

It's the list for star imports. This intended use case is borne out by
the description of the feature when it was first added to the language
back in 2.1:

The public API (for documentation and introspection purposes) is any
name that doesn't start with an underscore and isn't an imported
module. If a tool is attempting to use __all__ as more than just the
list of names for star imports, I would call the tool buggy.
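
Taken literally, that rule can be sketched in a few lines. The helper name
introspected_api is hypothetical, and this is only an illustration of the
rule as worded here, not of what pydoc or any existing tool does.

```python
import inspect
import os


def introspected_api(module):
    """Names without a leading underscore that are not modules."""
    return sorted(
        name
        for name, value in vars(module).items()
        if not name.startswith("_") and not inspect.ismodule(value)
    )


names = introspected_api(os)
print("getcwd" in names)   # True
print("sys" in names)      # False
print("path" in names)     # False
```

The last line shows the limit of a purely mechanical rule: os.path is a
module, so it is dropped even though the os documentation makes it public,
which is the "unless explicitly documented otherwise" case raised earlier in
the thread.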

The use of the term "public names" in the language reference when
describing the semantics of __all__ is an unfortunate choice, but it
is used specifically in the context of talking about star imports and
clarifying which names they bring in without making any reference to
standards for documentation or deprecation policies.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From g.brandl at  Thu Nov 18 13:37:38 2010
From: g.brandl at (Georg Brandl)
Date: Thu, 18 Nov 2010 13:37:38 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<AANLkTimrUO2kM>	<>	<>	<>
Message-ID: <ic36pi$6i5$>

On 18.11.2010 11:47, Michael Foord wrote:
> On 17/11/2010 21:16, Éric Araujo wrote:
>>> Excluding a builtin name from __all__ sounds like a perfectly sensible
>>> idea, so even if it wasn't deliberate, I'd say it qualifies as
>>> fortuitous :)
>> But then, a tool that looks into __all__ to find for example what
>> objects to document will miss open.  I'd put open in __all__.
> "import *" would then override the builtin open. A good reason not to 
> use "import *" I guess, but also a good reason not to create names that 
> shadow builtins.

Heh.  Instead have fun with io.ioopen(), gzip.gzipopen(),
webbrowser.webbrowseropen(), etc.?  We do have namespace support for a reason.


From fuzzyman at  Thu Nov 18 13:48:57 2010
From: fuzzyman at (Michael Foord)
Date: Thu, 18 Nov 2010 12:48:57 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <ic36pi$6i5$>
References: <>	<>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<AANLkTimrUO2kM>	<>	<>	<>	<>
Message-ID: <>

On 18/11/2010 12:37, Georg Brandl wrote:
> On 18.11.2010 11:47, Michael Foord wrote:
>> On 17/11/2010 21:16, Éric Araujo wrote:
>>>> Excluding a builtin name from __all__ sounds like a perfectly sensible
>>>> idea, so even if it wasn't deliberate, I'd say it qualifies as
>>>> fortuitous :)
>>> But then, a tool that looks into __all__ to find for example what
>>> objects to document will miss open.  I'd put open in __all__.
>> "import *" would then override the builtin open. A good reason not to
>> use "import *" I guess, but also a good reason not to create names that
>> shadow builtins.
> Heh.  Instead have fun with io.ioopen(), gzip.gzipopen(),
> webbrowser.webbrowseropen(), etc.?  We do have namespace support for a reason.

Or urllib2.urlopen, oh wait - that's real...

If I was importing from those namespaces I probably *would* import and 
rename to have unambiguous names (and you would *have* to if there was 
any possibility of you using the builtin open). is arguably an 
exception to this as it does the same as the builtin open...

Using meaningful names is *good*. This is a reason I dislike modules 
that just call their base exception class "Error". You *have* to use it 
from the namespace (or import with import as and give it a good name) 
for it to have any meaning.


> Georg

From lukasz at  Thu Nov 18 14:13:39 2010
From: lukasz at (Łukasz Langa)
Date: Thu, 18 Nov 2010 14:13:39 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<AANLkTimrUO2kM>	<>	<>	<>	<>	<ic36pi$6i5$>
Message-ID: <>

On 18.11.2010 13:48, Michael Foord wrote:
> On 18/11/2010 12:37, Georg Brandl wrote:
>> On 18.11.2010 11:47, Michael Foord wrote:
>>> On 17/11/2010 21:16, Éric Araujo wrote:
>>>>> Excluding a builtin name from __all__ sounds like a perfectly sensible
>>>>> idea, so even if it wasn't deliberate, I'd say it qualifies as
>>>>> fortuitous :)
>>>> But then, a tool that looks into __all__ to find for example what
>>>> objects to document will miss open.  I'd put open in __all__.
>>> "import *" would then override the builtin open. A good reason not to
>>> use "import *" I guess, but also a good reason not to create names that
>>> shadow builtins.
>> Heh.  Instead have fun with io.ioopen(), gzip.gzipopen(),
>> webbrowser.webbrowseropen(), etc.?  We do have namespace support for 
>> a reason.
> Or urllib2.urlopen, oh wait - that's real...
> If I was importing from those namespaces I probably *would* import and 
> rename to have unambiguous names (and you would *have* to if there was 
> any possibility of you using the builtin open). is arguably an 
> exception to this as it does the same as the builtin open...
> Using meaningful names is *good*. This is a reason I dislike modules 
> that just call their base exception class "Error". You *have* to use 
> it from the namespace (or import with import as and give it a good 
> name) for it to have any meaning.

Guys, I may agree or disagree with these statements but we are drifting 
towards "opinion" versus "solid, well understood practice". Let's focus 
on the subject.

For that matter, "import *" is a discouraged mechanism anyway, apart from
the rare exceptions where its use is valid. If you use star-imports
and you don't know what you're doing, you might just as well hurt
yourself in other ways than just by "open".
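
To make the shadowing concern concrete, here is a minimal sketch. The
module name shadowing_demo and its open() function are made up for the
illustration; they do not refer to any stdlib module discussed in this
thread. It only shows that a star import of a module exporting "open"
rebinds the name in the importing namespace.

```python
import builtins
import sys
import types

# Hypothetical module that defines its own open() and exports it via __all__.
mod = types.ModuleType("shadowing_demo")
exec(
    "__all__ = ['open']\n"
    "def open(name):\n"
    "    return 'shadowing_demo.open(%r)' % (name,)\n",
    mod.__dict__,
)
sys.modules["shadowing_demo"] = mod

# Run the star import in a fresh namespace, as a user module would.
namespace = {}
exec("from shadowing_demo import *", namespace)

# The name "open" in that namespace is no longer the builtin.
shadowed = namespace["open"] is not builtins.open
result = namespace["open"]("data.txt")
print(shadowed, result)
```

With "import shadowing_demo" instead, the builtin stays reachable as
open() and the module's function as shadowing_demo.open().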

Maybe we should just sum up the discussion somewhere already. Keeping up 
with a thread reaching a megabyte in size is starting to be painful.

Best regards,

From fdrake at  Thu Nov 18 14:47:05 2010
From: fdrake at (Fred Drake)
Date: Thu, 18 Nov 2010 08:47:05 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
	<> <>
Message-ID: <>

On Thu, Nov 18, 2010 at 6:41 AM, Michael Foord
<fuzzyman at> wrote:
> Along with the others +1

I agree with keeping these distinct and orthogonal as well.

> What is more important is that we have a clearly stated policy for new
> modules and adding names to existing modules so that we don't have to repeat
> this debate in five years time.

Agreed again.

> My suggestion, which fits in with the use of __all__ by the language and
> also the convention widely in use by the community already boils down to:
> * If __all__ exists it is definitive

I think this is overly vague.  :-)

Specifically, if something is mentioned in __all__, it's public.
Non-inclusion in __all__ doesn't imply privateness.

> * Names with leading underscores are private unless in __all__ (and if you
> want to export leading underscore names as part of a public API you should
> define __all__ or "import *" won't export them)

We shouldn't confuse non-export via "import *" with non-public, however.
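
A small sketch of the star-import rules referred to above, using a
throwaway module (the name all_demo and the helper star_import are
invented for this example). It shows what "import *" binds with and
without __all__; it says nothing about which names count as public,
which is the point under debate.

```python
import sys
import types


def star_import(source):
    """Build a throwaway module from source and list what 'import *' binds."""
    mod = types.ModuleType("all_demo")
    exec(source, mod.__dict__)
    sys.modules["all_demo"] = mod
    namespace = {}
    try:
        exec("from all_demo import *", namespace)
    finally:
        del sys.modules["all_demo"]
    return sorted(name for name in namespace if name != "__builtins__")


# No __all__: every name without a leading underscore is imported.
without_all = star_import("public = 1\n_private = 2\n")

# With __all__: exactly the listed names are imported, underscore or not.
with_all = star_import("__all__ = ['_exported']\npublic = 1\n_exported = 2\n")

print(without_all)
print(with_all)
```

In the second case "public" is not star-exported, yet it is still
reachable as all_demo.public, so non-export and non-public are separate
questions.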

  -Fred

Fred L. Drake, Jr.    <fdrake at>
"A storm broke loose in my mind."  --Albert Einstein

From solipsis at  Thu Nov 18 16:18:57 2010
From: solipsis at (Antoine Pitrou)
Date: Thu, 18 Nov 2010 16:18:57 +0100
Subject: [Python-Dev] r86514 - in python/branches/py3k/Lib:
 test/ xmlrpc/
References: <>
Message-ID: <>

On Thu, 18 Nov 2010 16:00:53 +0100 (CET)
senthil.kumaran <python-checkins at> wrote:
> Author: senthil.kumaran
> Date: Thu Nov 18 16:00:53 2010
> New Revision: 86514
> Log:
> Fix Issue 9991: xmlrpc client ssl check faulty
> +    def test_ssl_presence(self):
> +        #Check for ssl support
> +        have_ssl = False
> +        if hasattr(socket, 'ssl'):
> +            have_ssl = True

This is not the right way to check for ssl.  socket.ssl is deprecated in
2.x and doesn't exist in 3.x.  "import ssl" is enough.
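
A minimal sketch of the import-based check, assuming the test only needs
to know whether the ssl module is importable. The class and method names
are placeholders, not the ones from the commit.

```python
import unittest

try:
    import ssl
except ImportError:
    ssl = None  # interpreter was built without SSL support

have_ssl = ssl is not None


class SSLPresenceTest(unittest.TestCase):  # hypothetical test case name
    @unittest.skipUnless(have_ssl, "requires the ssl module")
    def test_ssl_available(self):
        # Only runs when "import ssl" succeeded.
        self.assertTrue(hasattr(ssl, "SSLError"))
```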

> +        try:
> +            xmlrpc.client.ServerProxy('https://localhost:9999').bad_function()
> +        except:
> +            exc = sys.exc_info()
> +        if exc[0] == socket.error:

This is a rather clumsy way to check for exception types. Why
don't you just write "except socket.error"?
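
For illustration, a sketch of the narrower except clause. failing_call
is a made-up stand-in for the ServerProxy call in the quoted test, so
that the example runs without a network.

```python
import socket


def failing_call():
    # Stand-in for the remote call; raises the error the test expects.
    raise socket.error("connection refused")


def raised_socket_error(func):
    """Return True if func() raises socket.error, False if it returns."""
    try:
        func()
    except socket.error:
        return True
    return False


print(raised_socket_error(failing_call))
```

Any other exception type propagates instead of being swallowed by a bare
"except:", which is the practical gain over inspecting sys.exc_info().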

> -        if not hasattr(socket, "ssl"):
> +        if not hasattr(http.client, "ssl"):

That isn't better. "http.client.ssl" is not a public API. You should
check for http.client.HTTPSConnection instead.
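
A sketch of the suggested check, assuming the code only needs to decide
whether HTTPS transport is usable. The variable name have_https is
illustrative.

```python
import http.client

# http.client only defines HTTPSConnection when ssl could be imported,
# so this tests the public name rather than the private module attribute.
have_https = hasattr(http.client, "HTTPSConnection")
print(have_https)
```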



From orsenthil at  Thu Nov 18 17:23:25 2010
From: orsenthil at (Senthil Kumaran)
Date: Fri, 19 Nov 2010 00:23:25 +0800
Subject: [Python-Dev] r86514 - in python/branches/py3k/Lib:
 test/ xmlrpc/
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 18, 2010 at 11:18 PM, Antoine Pitrou <solipsis at> wrote:
>> Log:
>> Fix Issue 9991: xmlrpc client ssl check faulty
> [...]
>> +    def test_ssl_presence(self):
>> +        #Check for ssl support
>> +        have_ssl = False
>> +        if hasattr(socket, 'ssl'):
>> +            have_ssl = True
> This is not the right way to check for ssl.  socket.ssl is deprecated in
> 2.x and doesn't exist in 3.x.  "import ssl" is enough.

The history of the bug report showed that it was closed earlier with
comments such as "Python should be compiled with SSL", which had
resulted in some confusion. So, after some thought, I let those earlier
verifications remain (just for readability and for understanding the
context of the tests). Thinking again, I see that they are not required.

I agree with your comments on the code changes and will change it.


From martin at  Thu Nov 18 17:25:41 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 18 Nov 2010 17:25:41 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <ibvvo1$40o$>
References: <> <ibvvo1$40o$>
Message-ID: <>

On 17.11.2010 08:18, Georg Brandl wrote:
> On 16.11.2010 19:38, Jesus Cea wrote:
>> Is there any updated mercurial schedule?.
>> Any impact related with the new 3.2 schedule (three weeks offset)?
> I've been trying to contact Dirkjan and ask; generally, I don't
> see much connection to the 3.2 schedule (with the exception that
> the final migration day should not be a release day.)

Please reconsider. When Python migrates to Mercurial, new features
will be added to Python, most notably a new way of identifying versions,
perhaps new variables in the sys module. So far, the policy has been
that no new features can be added after beta 1. Consequently,
migrating 3.2 to Mercurial would violate that policy if done after b1.
We would then need to release 3.2 from Subversion, which
in turn means that the Mercurial migration should be delayed until
after 3.2 is released.

Alternatively, b1 should be postponed until after the Mercurial
migration is done.


From guido at  Thu Nov 18 17:50:22 2010
From: guido at (Guido van Rossum)
Date: Thu, 18 Nov 2010 08:50:22 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <>
References: <>
	<20101111100516.6e90aa41@mission> <>
	<> <>
	<> <ic1h69$jmd$>
Message-ID: <>

On Thu, Nov 18, 2010 at 4:16 AM, Nick Coghlan <ncoghlan at> wrote:
> On Thu, Nov 18, 2010 at 7:22 AM, Georg Brandl <g.brandl at> wrote:
>> So it comes down again to what we'd like __all__ to mean foremost:
>> public API, or just a list for "import *"?
> It's the list for star imports. This intended use case is borne out by
> the description of the feature when it was first added to the language
> back in 2.1:
> The public API (for documentation and introspection purposes) is any
> name that doesn't start with an underscore and isn't an imported
> module. If a tool is attempting to use __all__ as more than just the
> list of names for star imports, I would call the tool buggy.

Not so fast. The feature's meaning has clearly evolved.

> The use of the term "public names" in the language reference when
> describing the semantics of __all__ is an unfortunate choice, but it
> is used specifically in the context of talking about star imports and
> clarifying which names they bring in without making any reference to
> standards for documentation or deprecation policies.

Let's live with a little ambiguity. There are more shades of gray here
than you can imagine. I like gray.

--Guido van Rossum (

From guido at  Thu Nov 18 17:57:58 2010
From: guido at (Guido van Rossum)
Date: Thu, 18 Nov 2010 08:57:58 -0800
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <> <ibvvo1$40o$>
Message-ID: <>

On Thu, Nov 18, 2010 at 8:25 AM, "Martin v. Löwis" <martin at> wrote:
> On 17.11.2010 08:18, Georg Brandl wrote:
>> On 16.11.2010 19:38, Jesus Cea wrote:
>>> Is there any updated mercurial schedule?.
>>> Any impact related with the new 3.2 schedule (three weeks offset)?
>> I've been trying to contact Dirkjan and ask; generally, I don't
>> see much connection to the 3.2 schedule (with the exception that
>> the final migration day should not be a release day.)
> Please reconsider. When Python migrates to Mercurial, new features
> will be added to Python, most notably a new way of identifying versions,
> perhaps new variables in the sys module. So far, the policy has been
> that no new features can be added after beta 1. So consequentially,
> migrating 3.2 to Mercurial would violate that policy if done after b1.
> Consequentially, we would need to release 3.2 from Subversion, which
> in turn means that the Mercurial migration should be delayed until
> after 3.2 is released.
> Alternatively, b1 should be postponed until after the Mercurial
> migration is done.

I think this "new feature" is not so shocking that it can be used as
an argument to hold up the migration. If you have another reason to
stop the migration please say so; personally I can't wait for it to
happen.

--Guido van Rossum (

From g.brandl at  Thu Nov 18 18:08:10 2010
From: g.brandl at (Georg Brandl)
Date: Thu, 18 Nov 2010 18:08:10 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <> <ibvvo1$40o$>
Message-ID: <ic3mkq$qg0$>

On 18.11.2010 17:25, "Martin v. Löwis" wrote:
> On 17.11.2010 08:18, Georg Brandl wrote:
>> On 16.11.2010 19:38, Jesus Cea wrote:
>>> Is there any updated mercurial schedule?.
>>> Any impact related with the new 3.2 schedule (three weeks offset)?
>> I've been trying to contact Dirkjan and ask; generally, I don't
>> see much connection to the 3.2 schedule (with the exception that
>> the final migration day should not be a release day.)
> Please reconsider. When Python migrates to Mercurial, new features
> will be added to Python, most notably a new way of identifying versions,
> perhaps new variables in the sys module. So far, the policy has been
> that no new features can be added after beta 1. So consequentially,
> migrating 3.2 to Mercurial would violate that policy if done after b1.
> Consequentially, we would need to release 3.2 from Subversion, which
> in turn means that the Mercurial migration should be delayed until
> after 3.2 is released.

I'm with Guido here.  Plus, if you like, it can be seen as a bug fix:
the SVN build identification stops working, and we need to fix it.


From martin at  Thu Nov 18 18:32:33 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 18 Nov 2010 18:32:33 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <> <ibvvo1$40o$>
Message-ID: <>

>> Alternatively, b1 should be postponed until after the Mercurial
>> migration is done.
> I think this "new feature" is not so shocking that it can be used as
> an argument to hold up the migration. If you have another reason to
> stop the migration please say so; personally I can't wait for it to
> happen.

I can't point out any other specific concern, just a general feeling
that *when* the migration happens, it will be rushed, and we will have
to deal for a long time with the aftermath. For example, I expect that
it will take me several days until I get the Windows build process to
work correctly, and, if the migration gets as rushed as it appears to,
that the migration will happen without everything being worked out
beforehand.

Therefore, I'm concerned that I will have to work out all the details
on my own, just so that I can produce the b2 binaries (say); this is
not something I look forward to.

I'm not asking that the migration be stopped - I'm asking that it be
accelerated, so that there is plenty of time to identify all the
problems. But I'm also not willing to put time into it.

Failing the acceleration, I ask that appropriate consequences for
the 3.2 release are drawn: either it is postponed, or done using
Subversion until the final release (I think something can be worked
out then to get the 3.2.1 release from Mercurial - with only slight
incompatibilities).

In general, I'm *also* concerned about the lack of volunteers that
are interested in working on the infrastructure. I wish some of the
people who stated that they can't wait for the migration to happen
would work on solving some of the remaining problems.


From g.brandl at  Thu Nov 18 19:56:51 2010
From: g.brandl at (Georg Brandl)
Date: Thu, 18 Nov 2010 19:56:51 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>
	<ibvvo1$40o$>	<>	<>
Message-ID: <ic3t0k$sm2$>

On 18.11.2010 18:32, "Martin v. Löwis" wrote:
>>> Alternatively, b1 should be postponed until after the Mercurial
>>> migration is done.
>> I think this "new feature" is not so shocking that it can be used as
>> an argument to hold up the migration. If you have another reason to
>> stop the migration please say so; personally I can't wait for it to
>> happen.
> I can't point out any other specific concern, just a general feeling
> that *when* the migration happens, it will be rushed, and we will have
> to deal for a long time with the aftermath. For example, I expect that
> it will take me several days until I get the Windows build process to
> work correctly, and, if the migration gets as rushed as it appears to,
> that the migration will happen without everything being worked out
> beforehand.
> Therefore, I'm concerned that I will have to work out all the details
> on my own, just so that I can produce the b2 binaries (says); this is
> not something I look forward to.

How much does the binary build process really depend on version control?
I.e., what would be stopping you from making a binary from an archive made
with e.g. "svn export"?  (I'm really asking because I don't know.)

Concerning the SVN external/ subdir, that is quite orthogonal to the
main development repo, and doesn't need to be migrated in lockstep (if it is
migrated to Mercurial at all in its current shape).

> I'm not asking that the migration be stopped - I'm asking that it be
> accelerated, so that there is plenty of time to identify all the
> problems. But I'm also not willing to put time into it.

I think we have anticipated what we could.  Of course there will still be
problems, but I think not of the sort that causes big disruptions
everywhere, preventing our developers from committing or breaking the
issue tracker, etc.

> Failing the acceleration, I ask that appropriate consequences for
> the 3.2 release are drawn: either it is postponed, or done using
> Subversion until the final release (I think something can be worked
> out then to get the 3.2.1 release from Mercurial - with only slight
> incompatibilities).
> In general, I'm *also* concerned about the lack of volunteers that
> are interested in working on the infrastructure. I wish some of the
> people who stated that they can't wait for the migration to happen
> would work on solving some of the remaining problems.

Well, put some butter to the fish: how many volunteers would you deem
sufficient, and which specific tasks are uncared for in the infrastructure?
I can only speak for myself, but I am prepared to put in my time.


From martin at  Thu Nov 18 20:33:40 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 18 Nov 2010 20:33:40 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <ic3t0k$sm2$>
References: <>	<ibvvo1$40o$>	<>	<>	<>
Message-ID: <>

>> Therefore, I'm concerned that I will have to work out all the details
>> on my own, just so that I can produce the b2 binaries (says); this is
>> not something I look forward to.
> How much does the binary build process really depend on version control?
> I.e., what would be stopping you from making a binary from an archive made
> with e.g. "svn export"?  (I'm really asking because I don't know.)

The build process currently compiles a program (make_buildinfo), which
in turn finds the subversion installation, and runs subwcrev if found.
If no .svn folder is found, it falls back to the version information in
the export.

I would have to try out what exactly will happen when I try to build
the current hg conversion result on Windows, but chances are that
the resulting interpreter will crash because the string manipulation
fails to find the right substrings.

> Well, put some butter to the fish: how many volunteers would you deem
> sufficient, and which specific tasks are uncared for in the infrastructure?
> I can only speak for myself, but I am prepared to put in my time.

As a starting point, I'd like to see a complete, current conversion
result, using as many repositories as planned, and including as many
branches into each repository as planned (rather than the giant
cpython repository which we have now - unless the plan now is that
there will be a single giant repository).

Then the existing patches to the build identification should be applied,
and the repositories should be opened for (test) commits.

Then people could start identifying problems.

As a parallel activity, I'd also ask that the PEP is finished, or
at least put into a form where the authors consider it complete
(again so that people could start identifying issues, and determine
where the PEP differs from reality - currently most obviously
in the branching approach).


From jcea at  Fri Nov 19 03:13:38 2010
From: jcea at (Jesus Cea)
Date: Fri, 19 Nov 2010 03:13:38 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>
	<ibvvo1$40o$>	<>	<>
Message-ID: <>

On 18/11/10 18:32, "Martin v. Löwis" wrote:
> In general, I'm *also* concerned about the lack of volunteers that
> are interested in working on the infrastructure. I wish some of the
> people who stated that they can't wait for the migration to happen
> would work on solving some of the remaining problems.

Do we have an exhaustive list of Mercurial "to do" things?

I thought the plan was to keep a read-only SVN mirror fed from
Mercurial. The 3.2 build could come from the mirror, I guess.

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From benjamin at  Fri Nov 19 03:23:25 2010
From: benjamin at (Benjamin Peterson)
Date: Thu, 18 Nov 2010 20:23:25 -0600
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <> <ibvvo1$40o$>
	<> <>
Message-ID: <>

2010/11/18 Jesus Cea <jcea at>:
> On 18/11/10 18:32, "Martin v. Löwis" wrote:
>> In general, I'm *also* concerned about the lack of volunteers that
>> are interested in working on the infrastructure. I wish some of the
>> people who stated that they can't wait for the migration to happen
>> would work on solving some of the remaining problems.
> Do we have a exhaustive list of mercurial "to do" things?.


From g.brandl at  Fri Nov 19 08:43:15 2010
From: g.brandl at (Georg Brandl)
Date: Fri, 19 Nov 2010 08:43:15 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>
	<ibvvo1$40o$>	<>	<>	<>
Message-ID: <ic59tk$8vf$>

On 19.11.2010 03:23, Benjamin Peterson wrote:
> 2010/11/18 Jesus Cea <jcea at>:
>> On 18/11/10 18:32, "Martin v. Löwis" wrote:
>>> In general, I'm *also* concerned about the lack of volunteers that
>>> are interested in working on the infrastructure. I wish some of the
>>> people who stated that they can't wait for the migration to happen
>>> would work on solving some of the remaining problems.
>> Do we have a exhaustive list of mercurial "to do" things?.

Uh, that's the list of things to do *at* the migration.  The todo list is


From martin at  Fri Nov 19 08:58:27 2010
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Fri, 19 Nov 2010 08:58:27 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>
	<ibvvo1$40o$>	<>	<>	<>
Message-ID: <>

On 19.11.2010 03:23, Benjamin Peterson wrote:
> 2010/11/18 Jesus Cea <jcea at>:
>> On 18/11/10 18:32, "Martin v. Löwis" wrote:
>>> In general, I'm *also* concerned about the lack of volunteers that
>>> are interested in working on the infrastructure. I wish some of the
>>> people who stated that they can't wait for the migration to happen
>>> would work on solving some of the remaining problems.
>> Do we have a exhaustive list of mercurial "to do" things?.

This doesn't, but IMO should, list

- resolve open issues in PEP
- finalize and implement branch structure
- set and implement policy for external code bases for Windows builds
- set up account management infrastructure, determine account managers


From kristjan at  Fri Nov 19 08:31:59 2010
From: kristjan at (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=)
Date: Fri, 19 Nov 2010 15:31:59 +0800
Subject: [Python-Dev] sha digest endianness
Message-ID: <>

Please see this defect:
It would appear that the digest and hexdigest for sha are wrong on little-endian machines.
There certainly is a discrepancy between little- and big-endian ones, irrespective of which one is "right".
Any thoughts?
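
One way to tell which side is "right" is to compare against a published
test vector. The sketch below uses hashlib.sha1 and the standard "abc"
vector from FIPS 180; that the defect concerns this particular
implementation is an assumption, since the report itself is not quoted
here.

```python
import hashlib
import sys

# SHA-1 test vector for the three-byte message "abc" (FIPS 180).
expected = "a9993e364706816aba3e25717850c26c9cd0d89d"
actual = hashlib.sha1(b"abc").hexdigest()

# A correct implementation gives the same digest on any byte order.
print(sys.byteorder, actual == expected)
```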


From ncoghlan at  Fri Nov 19 14:50:36 2010
From: ncoghlan at (Nick Coghlan)
Date: Fri, 19 Nov 2010 23:50:36 +1000
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <ic59tk$8vf$>
References: <> <ibvvo1$40o$>
	<> <>
Message-ID: <>

On Fri, Nov 19, 2010 at 5:43 PM, Georg Brandl <g.brandl at> wrote:
> On 19.11.2010 03:23, Benjamin Peterson wrote:
>> 2010/11/18 Jesus Cea <jcea at>:
>>> On 18/11/10 18:32, "Martin v. Löwis" wrote:
>>>> In general, I'm *also* concerned about the lack of volunteers that
>>>> are interested in working on the infrastructure. I wish some of the
>>>> people who stated that they can't wait for the migration to happen
>>>> would work on solving some of the remaining problems.
>>> Do we have a exhaustive list of mercurial "to do" things?.
> Uh, that's the list of things to do *at* the migration.  The todo list is

That kind of link is the sort of thing that should really be in the
PEP... (along with the info about where to find the hooks repository,
specific URLs for at least 3.x, 3.1 and 2.7, pointers to a draft FAQ
to replace the current SVN focused FAQ, etc)

Target dates for the following specific activities would also be useful:
- date a "final draft" of converted repository will be made available
to Martin and Ronald to dry run creation of Windows and Mac OS X
- date SVN will go read only
- date Hg will be available for write access (it should be frozen for
a while, to give the folks doing the conversion a chance to make sure
buildbot is back up and running, commit emails are working properly, etc)

So as long as we acknowledge that any migration problems may mean
additional beta releases of 3.2 to iron things out, I don't see a
problem with releasing beta 1 as planned to close the door on any
*other* new features, and giving the Hg migration a clear run at the
source repository before we start working seriously on dealing with
bug reports (either existing ones, or those from the first beta).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From martin at  Fri Nov 19 15:36:35 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 19 Nov 2010 15:36:35 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>
	<ibvvo1$40o$>	<>	<>	<>
	<>	<>	<ic59tk$8vf$>
Message-ID: <>

> - date Hg will be available for write access (it should be frozen for
> a while, to give the folks doing the conversion a chance to make sure
> buildbot is back up and run, commit emails are working properly, etc)

I would target the build slaves to the Mercurial repository already in
the testing phase, e.g. by creating builders for building from commits
to the 3k branch. I hope Buildbot supports multiple change sources now.
Likewise, I'd also see commit emails being delivered in the test phase
already, and let committers make test commits to trigger this all (and
also to get acquainted with the Mercurial tools they are going to use,
without fear of breaking something).


From barry at  Fri Nov 19 15:46:57 2010
From: barry at (Barry Warsaw)
Date: Fri, 19 Nov 2010 09:46:57 -0500
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <> <ibvvo1$40o$>
	<> <>
Message-ID: <20101119094657.1a7cc24a@mission>

On Nov 19, 2010, at 11:50 PM, Nick Coghlan wrote:

>- date SVN will go read only

Please note that svn cannot be made completely read-only.  We've already
decided that versions already in maintenance or security-only mode (2.5, 2.6,
2.7, 3.1) will get updates and releases only via svn.  But only the release
managers should have write access to the svn repositories.


From ncoghlan at  Fri Nov 19 15:56:40 2010
From: ncoghlan at (Nick Coghlan)
Date: Sat, 20 Nov 2010 00:56:40 +1000
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <20101119094657.1a7cc24a@mission>
References: <> <ibvvo1$40o$>
	<> <>
Message-ID: <>

On Sat, Nov 20, 2010 at 12:46 AM, Barry Warsaw <barry at> wrote:
> On Nov 19, 2010, at 11:50 PM, Nick Coghlan wrote:
>>- date SVN will go read only
> Please note that svn cannot be made completely read-only.  We've already
> decided that versions already in maintenance or security-only mode (2.5, 2.6,
> 2.7, 3.1) will get updates and releases only via svn.  But only the release
> managers should have write access to the svn repositories.

Again, something that should be in PEP 385 (but isn't).

It seems that the work *is* going on, and the people actually doing it
have a reasonable idea as to what has been decided and where things
are going, but those of us "out here" have a fair stake in this as
well, and without an up to date PEP 385 there's no one place to go to
to see the current state of the migration.

That's enough to make folks like me somewhat nervous as to whether or
not we're actually going to have a usable source control system come
December 12.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From dirkjan at  Fri Nov 19 16:00:40 2010
From: dirkjan at (Dirkjan Ochtman)
Date: Fri, 19 Nov 2010 16:00:40 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <> <ibvvo1$40o$>
	<> <>
Message-ID: <>

On Fri, Nov 19, 2010 at 15:56, Nick Coghlan <ncoghlan at> wrote:
> That's enough to make folks like me somewhat nervous as to whether or
> not we're actually going to have a usable source control system come
> December 12.

Yes, I've been negligent about updating the PEP. I'll try to do so next
week. Georg, if you have time to update it a bit, that would be great
as well.



From g.brandl at  Fri Nov 19 17:23:44 2010
From: g.brandl at (Georg Brandl)
Date: Fri, 19 Nov 2010 17:23:44 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>	<ibvvo1$40o$>	<>	<>	<>	<>	<>
Message-ID: <ic68dj$o2q$>

On 19.11.2010 08:58, "Martin v. Löwis" wrote:
> On 19.11.2010 03:23, Benjamin Peterson wrote:
>> 2010/11/18 Jesus Cea <jcea at>:
>>> On 18/11/10 18:32, "Martin v. Löwis" wrote:
>>>> In general, I'm *also* concerned about the lack of volunteers that
>>>> are interested in working on the infrastructure. I wish some of the
>>>> people who stated that they can't wait for the migration to happen
>>>> would work on solving some of the remaining problems.
>>> Do we have a exhaustive list of mercurial "to do" things?.
> This doesn't, but IMO should, list
> - resolve open issues in PEP
> - finalize and implement branch structure
> - set and implement policy for external code bases for Windows builds
> - set up account management infrastructure, determine account managers

Good points, I've added the missing ones to the todo list.


From john at  Fri Nov 19 17:38:16 2010
From: john at (John Arbash Meinel)
Date: Fri, 19 Nov 2010 10:38:16 -0600
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>
	<ibvvo1$40o$>	<>	<>	<>
	<>	<>	<ic59tk$8vf$>
Message-ID: <>

On 11/19/2010 7:50 AM, Nick Coghlan wrote:
> On Fri, Nov 19, 2010 at 5:43 PM, Georg Brandl <g.brandl at> wrote:
>> On 19.11.2010 03:23, Benjamin Peterson wrote:
>>> 2010/11/18 Jesus Cea <jcea at>:
>>>> On 18/11/10 18:32, "Martin v. Löwis" wrote:
>>>>> In general, I'm *also* concerned about the lack of volunteers that
>>>>> are interested in working on the infrastructure. I wish some of the
>>>>> people who stated that they can't wait for the migration to happen
>>>>> would work on solving some of the remaining problems.
>>>> Do we have a exhaustive list of mercurial "to do" things?.
>> Uh, that's the list of things to do *at* the migration.  The todo list is
> That kind of link is the sort of thing that should really be in the
> PEP... (along with the info about where to find the hooks repository,
> specific URLs for at least 3.x, 3.1 and 2.7, pointers to a draft FAQ
> to replace the current SVN focused FAQ, etc)

Well, if it goes in the pep, you should at least use the 'always the
most recent' version :)




From g.brandl at  Fri Nov 19 17:51:23 2010
From: g.brandl at (Georg Brandl)
Date: Fri, 19 Nov 2010 17:51:23 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>
	<ibvvo1$40o$>	<>	<>	<>
	<>	<>	<ic59tk$8vf$>	<>	<20101119094657.1a7cc24a@mission>	<>
Message-ID: <ic6a1d$1dm$>

On 19.11.2010 16:00, Dirkjan Ochtman wrote:
> On Fri, Nov 19, 2010 at 15:56, Nick Coghlan <ncoghlan at> wrote:
>> That's enough to make folks like me somewhat nervous as to whether or
>> not we're actually going to have a usable source control system come
>> December 12.
> Yes, I've been negligent about updating the PEP. I'll try do so next
> week. Georg, if you have time to update it a bit, that would be great
> as well.

I'm at it.  In fact, I think I will merge both todo.txt and tasks.txt
into the PEP.  It's not more of a burden to update it there, and it's
more visible to the developer community.


From alexander.belopolsky at  Fri Nov 19 17:53:58 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Fri, 19 Nov 2010 11:53:58 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
Message-ID: <>

I was recently surprised to learn that chr(i) can produce a string of
length 2 in Python 3.x.  I suspect that I am not alone in finding this
behavior non-obvious, given that a mistake in the Python manual stating
the contrary survived several releases.  [1]  Note that I am not arguing
that the change was bad.  In Python 2.x, \U escapes have been
producing surrogate pairs on narrow builds for a long time, if not since
the introduction of unicode.  I do believe, however, that a change like
this [2] and its consequences should be better publicized.  I have not
found any discussion of this change in PEPs or "What's new" documents.
The closest find was a mention of a related issue #3280 in the 3.0
NEWS file. [3]  Since this feature will be first documented in the
Library Reference in 3.2, I wonder if it will be appropriate to
mention it in "What's new in 3.2"?
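
As a rough illustration of the behavior described above (a sketch, not
code from this thread): on a narrow build, len() counts UTF-16 code
units, so a character outside the BMP counts twice.  The helper name
utf16_length is hypothetical; it emulates that count on any build by
encoding to UTF-16 and dividing the byte length by two.

```python
def utf16_length(s):
    """Return the number of UTF-16 code units needed to store s.

    This is what len(s) reported on a narrow build; on builds that store
    full code points, len(s) counts code points instead.
    """
    # utf-16-le writes no byte order mark, so every code unit is 2 bytes.
    return len(s.encode("utf-16-le")) // 2


bmp_char = chr(0x20AC)       # EURO SIGN, inside the Basic Multilingual Plane
non_bmp_char = chr(0x10000)  # first code point outside the BMP

print(utf16_length(bmp_char))
print(utf16_length(non_bmp_char))
```

The BMP character needs one code unit and the non-BMP character needs
two, which is the length-2 result under discussion.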


From g.brandl at  Fri Nov 19 17:53:28 2010
From: g.brandl at (Georg Brandl)
Date: Fri, 19 Nov 2010 17:53:28 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>	<ibvvo1$40o$>	<>	<>	<>	<>	<>	<ic59tk$8vf$>	<>
Message-ID: <ic6a5a$1dm$>

On 19.11.2010 15:36, "Martin v. Löwis" wrote:
>> - date Hg will be available for write access (it should be frozen for
>> a while, to give the folks doing the conversion a chance to make sure
>> buildbot is back up and running, commit emails are working properly, etc)
> I would target the build slaves to the Mercurial repository already in
> the testing phase, e.g by creating builders for building from commits
> to the 3k branch. I hope Buildbot supports multiple change sources now.
> Likewise, I'd also see commit emails being delivered in the test phase
> already, and let committers make test commits to trigger this all (and
> also to get acquainted with the Mercurial tools they are going to use,
> without fear of breaking something).

I've already let my Mercurial buildbot configuration run for a few checkins
while testing it; a separate changesource was not needed.

The commit email hook has also been tested extensively through its use for the
distutils2 repo, whose commit emails are also sent to python-checkins.

That said, it will of course be nice to activate both for the test repo
as well, once it's available.


From status at  Fri Nov 19 18:07:02 2010
From: status at (Python tracker)
Date: Fri, 19 Nov 2010 18:07:02 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <>

ACTIVITY SUMMARY (2010-11-12 - 2010-11-19)
Python tracker at

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    2549 (+23)
  closed 19694 (+43)
  total  22243 (+66)

Open issues with patches: 1058 

Issues opened (43)

#2571: always uses raw_input, even when another stdin is speci  reopened by eric.araujo

#4153: Unicode HOWTO up to date?  reopened by belopolsky

#6941: Socket error when launching IDLE  reopened by 08jpurcell

#10356: hash of -1  reopened by rhettinger

#10399: AST Optimization: inlining of function calls  opened by dmalcolm

#10401: Globals / builtins cache  opened by pitrou

#10402: sporadic test_bsddb3 failures  opened by pitrou

#10403: Use "member" consistently  opened by fdrake

#10404: IDLE on OS X popup menus do not work: cannot set/clear breakpo  opened by ned.deily

#10405: IDLE breakpoint facility undocumented  opened by ned.deily

#10406: IDLE 2.7 on OS X does not enable Rstrip extension by default  opened by ned.deily

#10407: missing errno import in distutils/  opened by zbysz

#10408: Denser dicts and linear probing  opened by pitrou

#10415: readline.insert_text documentation incomplete  opened by Justin.Lebar

#10417: unittest triggers UnicodeEncodeError with non-ASCII character  opened by jammon

#10419: distutils command build_scripts fails with UnicodeDecodeError  opened by hagen

#10420: Document of Bdb.effective is wrong.  opened by naoki

#10423: s/args/options in arpgarse "Upgrading optparse code"  opened by bethard

#10424: better error message from argparse when positionals missing  opened by bethard

#10427: 24:00 Hour in DateTime  opened by ingo.janssen

#10430: _sha.sha().digest() method is endian-sensitive. and hexdigest(  opened by krisvale

#10433: Document unique behavior of 'getgroups' on OSX  opened by r.david.murray

#10434: Document the rules for "public names"  opened by belopolsky

#10435: Document unicode C-API in reST  opened by belopolsky

#10436: tarfile.extractfile in "r|" stream mode fails with filenames o  opened by David.Nesting

#10437: ThreadPoolExecutor should accept max_workers=None  opened by stutzbach

#10438: list an example for calling static methods from WITHIN classes  opened by ifreecarve

#10439: PyCodec C API is not documented in reST  opened by belopolsky

#10441: some stdlib modules need to be updated to handle SSL certifica  opened by db

#10444: A mechanism is needed to override waiting for Python threads t  opened by michaelahughes

#10446: pydoc3 links to 2.x library reference  opened by belopolsky

#10448: Add Mako template benchmark to Python Benchmark Suite  opened by bobbyi

#10449: "os.environ was modified by test_httpservers"  opened by eric.araujo

#10450: Fix markup in Misc/NEWS  opened by eric.araujo

#10451: memoryview can be used to write into readonly buffer  opened by abacabadabacaba

#10453: Add -h/--help option to compileall  opened by eric.araujo

#10454: Clarify compileall command-line options  opened by eric.araujo

#10457: "Related help topics" shown outside pager  opened by cben

#10458: 2.7 += re.ASCII  opened by hfuru

#10459: missing character names in unicodedata (CJK...)  opened by vbr

#10460: Misc/ does not reflect PEP 7  opened by Mick.Beaver

#10461: Use with statement throughout the docs  opened by eric.araujo

#10445: _ast py3k : add lineno back to "args" node  opened by emile.anclin

Most recent 15 issues with no replies (15)

#10461: Use with statement throughout the docs

#10460: Misc/ does not reflect PEP 7

#10457: "Related help topics" shown outside pager

#10451: memoryview can be used to write into readonly buffer

#10449: "os.environ was modified by test_httpservers"

#10445: _ast py3k : add lineno back to "args" node

#10439: PyCodec C API is not documented in reST

#10437: ThreadPoolExecutor should accept max_workers=None

#10433: Document unique behavior of 'getgroups' on OSX

#10424: better error message from argparse when positionals missing

#10423: s/args/options in arpgarse "Upgrading optparse code"

#10420: Document of Bdb.effective is wrong.

#10419: distutils command build_scripts fails with UnicodeDecodeError

#10406: IDLE 2.7 on OS X does not enable Rstrip extension by default

#10405: IDLE breakpoint facility undocumented

Most recent 15 issues waiting for review (15)

#10448: Add Mako template benchmark to Python Benchmark Suite

#10446: pydoc3 links to 2.x library reference

#10444: A mechanism is needed to override waiting for Python threads t

#10435: Document unicode C-API in reST

#10419: distutils command build_scripts fails with UnicodeDecodeError

#10408: Denser dicts and linear probing

#10406: IDLE 2.7 on OS X does not enable Rstrip extension by default

#10404: IDLE on OS X popup menus do not work: cannot set/clear breakpo

#10401: Globals / builtins cache

#10399: AST Optimization: inlining of function calls

#10391: obj2ast's error handling can lead to python crashing with a C-

#10385: Mark up "subprocess" as module in its doc

#10383: test_os leaks under Windows

#10382: Command line error marker misplaced on unicode entry

#10371: Deprecate trace module undocumented API

Top 10 most discussed issues (10)

#3871: cross and native build of python for mingw32 with distutils  17 msgs

#10441: some stdlib modules need to be updated to handle SSL certifica  16 msgs

#2001: Pydoc interactive browsing enhancement  14 msgs

#10356: hash of -1  14 msgs

#10446: pydoc3 links to 2.x library reference  12 msgs

#7900: posix.getgroups() failure on Mac OS X  11 msgs

#10435: Document unicode C-API in reST  11 msgs

#4153: Unicode HOWTO up to date?  10 msgs

#10417: unittest triggers UnicodeEncodeError with non-ASCII character   8 msgs

#1553375: Add traceback.print_full_exception()   8 msgs

Issues closed (44)

#4471: IMAP4 missing support for starttls  closed by pitrou

#4476: compileall fails if current dir has a "types" package  closed by ncoghlan

#5111: httplib: wrong Host header when connecting to IPv6 litteral UR  closed by orsenthil

#7828: chr() and ord() documentation for wide characters  closed by belopolsky

#8649: Py_UNICODE_* functions are undocumented  closed by belopolsky

#9076: Add C-API documentation for PyUnicode_AsDecodedObject/Unicode  closed by georg.brandl

#9520: Add Patricia Trie high performance container  closed by rhettinger

#9991: xmlrpc client ssl check faulty  closed by orsenthil

#10070: 2to3 wishes for already-2to3'ed files  closed by loewis

#10205: Can't have two tags with the same QName  closed by orsenthil

#10260: Add a threading.Condition.wait_for() method  closed by krisvale

#10373: Setup Script example incorrect  closed by eric.araujo

#10392: GZipFile crash when fileobj.mode is None  closed by r.david.murray

#10396: stdin argument to pdb.Pdb doesn't work unless you also set Pdb  closed by georg.brandl

#10397: Unified Benchmark Suite fails on py3k with --track-memory  closed by pitrou

#10398: errors in docs re module initialization vs self arg to functio  closed by georg.brandl

#10400: updating unicodedata to Unicode 6  closed by loewis

#10409: mkcfg crashes with ValueError  closed by tarek

#10410: Is iterable a container type?  closed by rhettinger

#10411: Pickle benchmark fails after converting Benchmark Suite to py3  closed by pitrou

#10412: Add py3k support for "slow" pickle benchmark in Benchmark Suit  closed by pitrou

#10413: Comments in unicode.h are out of date  closed by belopolsky

#10414: socket.gethostbyname  doesn't return an ipv6 address  closed by loewis

#10416: UnicodeDecodeError when 2to3 is run on a dir with numpy .npy f  closed by benjamin.peterson

#10418: test_io hangs on 3.1.3rc1  closed by vdupras

#10421: Failed issue tracker submission  closed by eric.araujo

#10422: : error when loading multiple stats files  closed by ezio.melotti

#10425: xmlrpclib support for None isn't compliant with XMLRPC  closed by orsenthil

#10426: The whole thing is NOT good  closed by georg.brandl

#10428: IDLE Trouble shooting  closed by r.david.murray

#10429: bug in test_imaplib  closed by pitrou

#10431: Failed issue tracker submission  closed by ezio.melotti

#10432: concurrent.futures.as_completed() spins waiting for futures to  closed by bquinlan

#10440: support RUSAGE_THREAD as a constant in the resource module  closed by pitrou

#10442: Please by default enforce ssl certificate checking in modules  closed by ned.deily

#10443: add wrapper for SSL_CTX_set_default_verify_paths  closed by pitrou

#10447: zipfile: IOError for long directory paths on Windows  closed by amaury.forgeotdarc

#10452: Unhelpful diagnostic 'cannot find the path specified'  closed by eric.smith

#10455: typo in urllib.request documentation  closed by ezio.melotti

#10456: unittest.main(verbosity=2) broke in python31, worked when I ha  closed by r.david.murray

#1599329: urllib(2) should allow automatic decoding by charset  closed by eric.araujo

#1376292: Write user's version of the reference guide  closed by akuchling

#1509798: replace dist/src/Tools/scripts/ with tmick's which  closed by eric.araujo

#1520831: urrlib2 max_redirections=0 disables redirects  closed by orsenthil

From g.brandl at  Fri Nov 19 18:12:22 2010
From: g.brandl at (Georg Brandl)
Date: Fri, 19 Nov 2010 18:12:22 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <20101119094657.1a7cc24a@mission>
References: <>
	<ibvvo1$40o$>	<>	<>	<>
	<>	<>	<ic59tk$8vf$>	<>
Message-ID: <ic6b8o$7fo$>

On 19.11.2010 15:46, Barry Warsaw wrote:
> On Nov 19, 2010, at 11:50 PM, Nick Coghlan wrote:
>>- date SVN will go read only
> Please note that svn cannot be made completely read-only.  We've already
> decided that versions already in maintenance or security-only mode (2.5, 2.6,
> 2.7, 3.1) will get updates and releases only via svn.  But only the release
> managers should have write access to the svn repositories.

Really?  I can understand this for security-only branches (commits there will
be rare, and equivalent commits to the Mercurial branches can be made by others
than the release managers, in order to keep history consistent).

But having the maintenance branches (by then, that will mostly be 2.7 because
3.1 will go to security-only mode soon) in SVN will be a burden for every
developer, since they have to backport bugfixes from Hg to SVN...


From solipsis at  Fri Nov 19 18:17:20 2010
From: solipsis at (Antoine Pitrou)
Date: Fri, 19 Nov 2010 18:17:20 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
References: <>
Message-ID: <>

On Fri, 19 Nov 2010 11:53:58 -0500
Alexander Belopolsky <alexander.belopolsky at> wrote:
> Since this feature will be first documented in the
> Library Reference in 3.2, I wonder if it will be appropriate to
> mention it in "What's new in 3.2"?

No, since it's not new in 3.2.  No need to further confuse users.
If there's a porting guide to 3.x it should be mentioned there.



From barry at  Fri Nov 19 18:41:58 2010
From: barry at (Barry Warsaw)
Date: Fri, 19 Nov 2010 12:41:58 -0500
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <ic6b8o$7fo$>
References: <> <ibvvo1$40o$>
	<> <>
	<20101119094657.1a7cc24a@mission> <ic6b8o$7fo$>
Message-ID: <20101119124158.3d8debc9@mission>

On Nov 19, 2010, at 06:12 PM, Georg Brandl wrote:

>On 19.11.2010 15:46, Barry Warsaw wrote:
>> On Nov 19, 2010, at 11:50 PM, Nick Coghlan wrote:
>>>- date SVN will go read only
>> Please note that svn cannot be made completely read-only.  We've already
>> decided that versions already in maintenance or security-only mode (2.5, 2.6,
>> 2.7, 3.1) will get updates and releases only via svn.  But only the release
>> managers should have write access to the svn repositories.
>Really?  I can understand this for security-only branches (commits there will
>be rare, and equivalent commits to the Mercurial branches can be made by
>others than the release managers, in order to keep history consistent).
>But having the maintenance branches (by then, that will mostly be 2.7 because
>3.1 will go to security-only mode soon) in SVN will be a burden for every
>developer, since they have to backport bugfixes from Hg to SVN...

Maybe I misremembered Martin's suggestion, and he was only talking about
security releases.  I think the key thing is whether you're going to backport
the vcs related bits to stable releases.

I plan to only do releases for 2.6 from svn, because it's not worth breaking
things like sys.subversion, and as you say the number of commits will be
small.


From solipsis at  Fri Nov 19 19:06:09 2010
From: solipsis at (Antoine Pitrou)
Date: Fri, 19 Nov 2010 19:06:09 +0100
Subject: [Python-Dev] Mercurial Schedule
References: <> <ibvvo1$40o$>
	<> <>
	<20101119094657.1a7cc24a@mission> <ic6b8o$7fo$>
Message-ID: <>

On Fri, 19 Nov 2010 12:41:58 -0500
Barry Warsaw <barry at> wrote:
> >Really?  I can understand this for security-only branches (commits there will
> >be rare, and equivalent commits to the Mercurial branches can be made by
> >others than the release managers, in order to keep history consistent).
> >
> >But having the maintenance branches (by then, that will mostly be 2.7 because
> >3.1 will go to security-only mode soon) in SVN will be a burden for every
> >developer, since they have to backport bugfixes from Hg to SVN...
> Maybe I misremembered Martin's suggestion, and he was only talking about
> security releases.  I think the key thing is whether you're going to backport
> the vcs related bits to stable releases.

It would be horribly burdensome to use two different VCSes depending on
whether you're working on a bugfix branch or a feature branch.

> I plan to only do releases for 2.6 from svn, because it's not worth breaking
> things like sys.subversion, and as you say the number of commits will be
> small.

But 2.6 is security-fixes only, right? It would really be annoying if
the same rules applied for 2.7 and 3.1.

I don't understand all the worry about sys.subversion. It's not like
it's useful to anybody else than us, and I think it should have been
named sys._subversion instead. There's no point in making API-like
promises about which DVCS, bug tracker or documentation toolset we use
for our workflow.



From merwok at  Fri Nov 19 19:41:54 2010
From: merwok at (Éric Araujo)
Date: Fri, 19 Nov 2010 19:41:54 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>
	<ibvvo1$40o$>	<>	<>	<>
	<>	<>	<ic59tk$8vf$>	<>	<20101119094657.1a7cc24a@mission>
	<ic6b8o$7fo$>	<20101119124158.3d8debc9@mission>
Message-ID: <>

> I don't understand all the worry about sys.subversion. It's not like
> it's useful to anybody else than us, and I think it should have been
> named sys._subversion instead. There's no point in making API-like
> promises about which DVCS, bug tracker or documentation toolset we use
> for our workflow.

I read "subversion" as "sub-piece of information about version", not the
name of a VCS, so I have no problem with its continuing existence under
Mercurial (it's in PEP 385).


From brett at  Fri Nov 19 19:52:03 2010
From: brett at (Brett Cannon)
Date: Fri, 19 Nov 2010 10:52:03 -0800
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <> <ibvvo1$40o$>
	<> <>
Message-ID: <>

On Fri, Nov 19, 2010 at 05:50, Nick Coghlan <ncoghlan at> wrote:
> On Fri, Nov 19, 2010 at 5:43 PM, Georg Brandl <g.brandl at> wrote:
>> On 19.11.2010 03:23, Benjamin Peterson wrote:
>>> 2010/11/18 Jesus Cea <jcea at>:
>>>> On 18/11/10 18:32, "Martin v. Löwis" wrote:
>>>>> In general, I'm *also* concerned about the lack of volunteers that
>>>>> are interested in working on the infrastructure. I wish some of the
>>>>> people who stated that they can't wait for the migration to happen
>>>>> would work on solving some of the remaining problems.
>>>> Do we have an exhaustive list of Mercurial "to do" things?
>> Uh, that's the list of things to do *at* the migration.  The todo list is
> That kind of link is the sort of thing that should really be in the
> PEP... (along with the info about where to find the hooks repository,
> specific URLs for at least 3.x, 3.1 and 2.7, pointers to a draft FAQ
> to replace the current SVN focused FAQ, etc)

I am spending my PSF grant time in January rewriting
practically from scratch. Any needed updates to take Mercurial into
account will happen no later than then.


> Target dates for the following specific activities would also be useful:
> - date a "final draft" of converted repository will be made available
> to Martin and Ronald to dry run creation of Windows and Mac OS X
> installers
> - date SVN will go read only
> - date Hg will be available for write access (it should be frozen for
> a while, to give the folks doing the conversion a chance to make sure
> buildbot is back up and running, commit emails are working properly, etc)
> So as long as we acknowledge that any migration problems may mean
> additional beta releases of 3.2 to iron things out, I don't see a
> problem with releasing beta 1 as planned to close the door on any
> *other* new features, and giving the Hg migration a clear run at the
> source repository before we start working seriously on dealing with
> bug reports (either existing ones, or those from the first beta).
> Cheers,
> Nick.
> --
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From victor.stinner at  Fri Nov 19 21:23:14 2010
From: victor.stinner at (Victor Stinner)
Date: Fri, 19 Nov 2010 21:23:14 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
Message-ID: <>


On Friday 19 November 2010 17:53:58 Alexander Belopolsky wrote:
> I was recently surprised to learn that chr(i) can produce a string of
> length 2 in python 3.x.

Yes, but only on narrow build. Eg. Debian and Ubuntu compile Python 3.1 in 
wide mode (sys.maxunicode == 1114111).
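
A minimal sketch of the check mentioned above, relying only on the
documented sys.maxunicode attribute; the function name build_kind is made
up for illustration.  Note that from Python 3.3 on (PEP 393) the
narrow/wide distinction no longer exists and sys.maxunicode is always
1114111.

```python
import sys


def build_kind(maxunicode=sys.maxunicode):
    """Classify a build by the largest code point a single str item can hold."""
    # 0xFFFF (65535) means 16-bit code units; 0x10FFFF (1114111) means
    # every Unicode code point fits in one item.
    return "narrow" if maxunicode <= 0xFFFF else "wide"


print(build_kind())
```

Passing the value explicitly, e.g. build_kind(0xFFFF), shows how a narrow
build would be classified.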

> I suspect that I am not alone finding this behavior non-obvious 
> given that a mistake in Python manual stating the contrary survived 
> several releases.  [1]

It was a documentation bug and you fixed it. Non-BMP characters are rare, so
few (maybe only you?) noticed the documentation bug. I consider the behaviour
an improvement in Python3's non-BMP support.

Python is unclear about non-BMP characters: the narrow build was called "ucs2"
for a long time, even though it is UTF-16 (each character is encoded as one or
two UTF-16 words). Python2 accepts non-BMP characters with \U syntax, but not
with chr(). This is inconsistent and I see this as a bug. But I don't want to
touch Python2 about non-BMP characters, and the "bug" is already fixed in Python3!

> I do believe, however that a change like
> this [2] and its consequences should be better publicized.

Change made before the release of Python 3.0. Do you want to patch the "What's 
new in Python 3.0?" document?

> I have not
> found any discussion of this change in PEPs or "What's new" documents.
>  The closest find was a mentioning of a related issue #3280 in the 3.0
> NEWS file. [3]  Since this feature will be first documented in the
> Library Reference in 3.2, I wonder if it will be appropriate to
> mention it in "What's new in 3.2"?

In my opinion, the question is rather why it was not fixed in Python2. I suppose
that the answer is something ugly like "backward compatibility" or "historical
reasons" :-)


From martin at  Fri Nov 19 22:25:08 2010
From: martin at ("Martin v. Löwis")
Date: Fri, 19 Nov 2010 22:25:08 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <20101119124158.3d8debc9@mission>
References: <>
	<ibvvo1$40o$>	<>	<>	<>
	<>	<>	<ic59tk$8vf$>	<>	<20101119094657.1a7cc24a@mission>
	<ic6b8o$7fo$> <20101119124158.3d8debc9@mission>
Message-ID: <>

> Maybe I misremembered Martin's suggestion, and he was only talking about
> security releases.

Technically, I was only talking about 2.5. For each branch, the
respective release manager should make a decision. For 2.5 and 2.6,
it's been decided; Benjamin has not yet announced plans how 2.7 and 3.1
will be maintained after the switchover.


From martin at  Fri Nov 19 22:35:54 2010
From: martin at ("Martin v. Löwis")
Date: Fri, 19 Nov 2010 22:35:54 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>
	<ibvvo1$40o$>	<>	<>	<>
	<>	<>	<ic59tk$8vf$>	<>	<20101119094657.1a7cc24a@mission>
	<ic6b8o$7fo$>	<20101119124158.3d8debc9@mission>
Message-ID: <>

> I don't understand all the worry about sys.subversion.

Really? For a security release, there should be *zero* chance that it
breaks existing applications, unless the application relies on the
security bug that has been fixed. By "zero chance", I mean absolutely
no chance, never. I'm pretty sure that applications *will* break because
of the change to sys.subversion, or sys.version. People made bug reports
complaining that sys.version has a newline on some systems and not on

> It's not like
> it's useful to anybody else than us

I think you underestimate what API people actually use in applications
etc.

From g.brandl at  Fri Nov 19 22:39:04 2010
From: g.brandl at (Georg Brandl)
Date: Fri, 19 Nov 2010 22:39:04 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>	<ibvvo1$40o$>	<>	<>	<>	<>	<>	<ic59tk$8vf$>	<>	<20101119094657.1a7cc24a@mission>	<ic6b8o$7fo$>	<20101119124158.3d8debc9@mission>	<>
Message-ID: <ic6qsr$gfs$>

On 19.11.2010 22:35, "Martin v. Löwis" wrote:
>> I don't understand all the worry about sys.subversion.
> Really? For a security release, there should be *zero* chance that it
> breaks existing applications, unless the application relies on the
> security bug that has been fixed. By "zero chance", I mean absolutely
> no chance, never. I'm pretty sure that applications *will* break because
> of the change to sys.subversion, or sys.version. People made bug reports
> complaining that sys.version has a newline on some systems and not on
> others.
>> It's not like
>> it's useful to anybody else than us
> I think you underestimate what API people actually use in applications
> etc.

Well, it should not be a problem to continue to provide a sys.subversion
that at least will not break applications reading it.

And yes, I am in favor of giving the new attribute a leading underscore.
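
As an illustration of an application reading the attribute without
breaking when it is absent or has an unexpected shape, here is a sketch
under assumptions: vcs_info is a hypothetical helper, and the
three-item tuple below is an example value rather than output from a
real build.

```python
import sys
from types import SimpleNamespace


def vcs_info(module=sys):
    """Return the (repo, branch, version) triple, or None if unavailable."""
    info = getattr(module, "subversion", None)
    # Be defensive about shape: only accept a 3-item sequence.
    if info is None or len(info) != 3:
        return None
    return tuple(info)


# Stand-in for an interpreter that still provides the attribute
# (example values, not taken from a real build).
fake_sys = SimpleNamespace(subversion=("CPython", "branches/example", "12345"))
print(vcs_info(fake_sys))
print(vcs_info(SimpleNamespace()))
```

Code written this way keeps working whether the attribute stays, is
renamed with a leading underscore, or disappears.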


From solipsis at  Fri Nov 19 22:43:12 2010
From: solipsis at (Antoine Pitrou)
Date: Fri, 19 Nov 2010 22:43:12 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <> <ibvvo1$40o$>
	<> <>
	<20101119094657.1a7cc24a@mission> <ic6b8o$7fo$>
	<20101119124158.3d8debc9@mission> <>
Message-ID: <1290202992.3621.4.camel@localhost.localdomain>

On Friday 19 November 2010 at 22:35 +0100, "Martin v. Löwis" wrote:
> > I don't understand all the worry about sys.subversion.
> Really? For a security release, there should be *zero* chance that it
> breaks existing applications,

It should have been clear that my message explicitly excluded security



From martin at  Fri Nov 19 22:43:45 2010
From: martin at ("Martin v. Löwis")
Date: Fri, 19 Nov 2010 22:43:45 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
Message-ID: <>

> In my opinion, the question is rather why it was not fixed in Python2. I suppose
> that the answer is something ugly like "backward compatibility" or "historical
> reasons" :-)

No, there was a deliberate decision to not support that, see

There had been a long discussion on this specific detail when PEP 261
was written, and in the end, an explicit, deliberate, considered
decision was made to raise a ValueError.


From ezio.melotti at  Fri Nov 19 23:05:51 2010
From: ezio.melotti at (Ezio Melotti)
Date: Sat, 20 Nov 2010 00:05:51 +0200
Subject: [Python-Dev] [Python-checkins] r86530
	-	python/branches/py3k/Doc/howto/unicode.rst
In-Reply-To: <>
References: <>
Message-ID: <>


On 19/11/2010 18.10, alexander.belopolsky wrote:
> Author: alexander.belopolsky
> Date: Fri Nov 19 17:09:58 2010
> New Revision: 86530
> Log:
> Issue #4153: Updated Unicode HOWTO.
> Modified:
>     python/branches/py3k/Doc/howto/unicode.rst
> Modified: python/branches/py3k/Doc/howto/unicode.rst
> ==============================================================================
> --- python/branches/py3k/Doc/howto/unicode.rst	(original)
> +++ python/branches/py3k/Doc/howto/unicode.rst	Fri Nov 19 17:09:58 2010
> [...]
> -Python 2.x's Unicode Support
> -============================
> +Python's Unicode Support
> +========================
>   Now that you've learned the rudiments of Unicode, we can look at Python's
>   Unicode features.
> @@ -265,7 +263,7 @@
>       UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0:
>                           unexpected code byte
>       >>>  b'\x80abc'.decode("utf-8", "replace")
> -    '\ufffdabc'
> +    '?abc'

Apparently 'make latex' and 'make all-pdf' don't like this char.

>       >>>  b'\x80abc'.decode("utf-8", "ignore")
>       'abc'
>   [...]
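
For readers who want to try the calls shown in the quoted diff, a small
self-contained sketch using only bytes.decode and its standard error
handlers (the variable names are illustrative):

```python
data = b"\x80abc"  # 0x80 is not a valid start byte in UTF-8

try:
    data.decode("utf-8")  # default handler is "strict"
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

replaced = data.decode("utf-8", "replace")  # bad byte becomes U+FFFD
ignored = data.decode("utf-8", "ignore")    # bad byte is dropped

# ascii() keeps the output printable even where U+FFFD cannot be rendered.
print(strict_failed, ascii(replaced), ascii(ignored))
```

Printing through ascii() sidesteps the kind of rendering problem with the
literal replacement character mentioned above.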

Best Regards,
Ezio Melotti

From benjamin at  Fri Nov 19 23:20:25 2010
From: benjamin at (Benjamin Peterson)
Date: Fri, 19 Nov 2010 16:20:25 -0600
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <> <ibvvo1$40o$>
	<> <>
	<20101119094657.1a7cc24a@mission> <ic6b8o$7fo$>
	<20101119124158.3d8debc9@mission> <>
Message-ID: <>

2010/11/19 "Martin v. Löwis" <martin at>:
>> Maybe I misremembered Martin's suggestion, and he was only talking about
>> security releases.
> Technically, I was only talking about 2.5. For each branch, the
> respective release manager should make a decision. For 2.5 and 2.6,
> it's been decided; Benjamin has not yet announced plans how 2.7 and 3.1
> will be maintained after the switchover.

I propose that they follow the development branches over to hg. Having
to backport bug fixes with any frequency from hg to svn would probably
be more unpleasant than the current svnmerge situation.


From mal at  Fri Nov 19 23:25:03 2010
From: mal at (M.-A. Lemburg)
Date: Fri, 19 Nov 2010 23:25:03 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
Message-ID: <>

Victor Stinner wrote:
> Hi,
> On Friday 19 November 2010 17:53:58 Alexander Belopolsky wrote:
>> I was recently surprised to learn that chr(i) can produce a string of
>> length 2 in python 3.x.
> Yes, but only on narrow build. Eg. Debian and Ubuntu compile Python 3.1 in 
> wide mode (sys.maxunicode == 1114111).
>> I suspect that I am not alone finding this behavior non-obvious 
>> given that a mistake in Python manual stating the contrary survived 
>> several releases.  [1]
> It was a documentation bug and you fixed it. Non-BMP characters are rare, so
> few (maybe only you?) noticed the documentation bug. I consider the behaviour
> an improvement in Python3's non-BMP support.
> Python is unclear about non-BMP characters: the narrow build was called "ucs2"
> for a long time, even though it is UTF-16 (each character is encoded as one or
> two UTF-16 words).

No, no, no :-)

UCS2 and UCS4 are more appropriate than "narrow" and "wide" or even
"UTF-16" and "UTF-32".

It's rather common to confuse a transfer encoding with a storage format.
UCS2 and UCS4 refer to code units (the storage format). You can use
UCS2 and UCS4 code units to represent UTF-16 and UTF-32 resp., but those
are not the same things.

In UTF-16 0xD800 has a special meaning, in UCS2 it doesn't.
Python uses UCS2 internally. It does not assign a special meaning
to those surrogate code point ranges.

However, when it comes to codecs, we do try to make use of the fact
that UCS2 can easily be used to represent a UTF-16 encoding, and
that's why you often see surrogates being created for code points
that wouldn't otherwise fit into UCS2, and you see those surrogates
being converted back to single code units in UCS4 builds.
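
To make the surrogate mechanics concrete, here is a sketch of the
standard UTF-16 arithmetic (not CPython's codec source); the function
name combine_surrogates is chosen for illustration, and the result is
cross-checked against Python's own utf-16-be codec.

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a high/low surrogate pair")
    # Each surrogate carries 10 bits of the value above 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# Cross-check against the UTF-16 codec: U+1D11E (MUSICAL SYMBOL G CLEF)
units = "\U0001D11E".encode("utf-16-be")
high = int.from_bytes(units[:2], "big")
low = int.from_bytes(units[2:], "big")
print(hex(high), hex(low), hex(combine_surrogates(high, low)))
```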

I don't know who invented the terms "narrow" and "wide" builds
for Python3. Not me, that's for sure :-) They don't have any
meaning in Unicode terminology and thus cause even more confusion
than UCS2 and UCS4. E.g. the import errors you
get when importing extensions built for a different Unicode
version (correctly) refer to UCS2 vs. UCS4 and now give even less
of a clue that they relate to a difference in Unicode builds (since
these are now labeled "narrow" and "wide").

IMO, we should go back to the Python2 terms UCS2 and UCS4 which
are correct and provide a clear description of what Python uses
internally for code units.

> Python2 accepts non-BMP characters with \U syntax, but not with 
> chr(). This is inconsistent and I see this as a bug. But I don't want to touch 
> Python2 about non-BMP characters, and the "bug" is already fixed in Python3!
>> I do believe, however that a change like
>> this [2] and its consequences should be better publicized.
> Change made before the release of Python 3.0. Do you want to patch the "What's 
> new in Python 3.0?" document?

Perhaps add a section "What we forgot to mention in 3.0" or
"What's not so new in 3.2" to "What's new in 3.2" :-)

>> I have not
>> found any discussion of this change in PEPs or "What's new" documents.
>>  The closest find was a mentioning of a related issue #3280 in the 3.0
>> NEWS file. [3]  Since this feature will be first documented in the
>> Library Reference in 3.2, I wonder if it will be appropriate to
>> mention it in "What's new in 3.2"?
> In my opinion, the question is rather why it was not fixed in Python2. I suppose
> that the answer is something ugly like "backward compatibility" or "historical
> reasons" :-)

Backwards compatibility.

Python2 applications don't expect unichr(i)
to return anything other than a single character. If you need this
in Python2, it's easy enough to get around, though, with a little
helper function.
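
Such a helper might look roughly like the following.  This is a guess at
the idea, not code from the thread, and it is written for Python 3 (on
Python 2 one would use unichr and get a unicode result); the name
wide_chr is hypothetical.

```python
def wide_chr(cp):
    """Return cp as UTF-16 code units: one item for BMP code points, two
    (a surrogate pair) for code points above U+FFFF."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of range")
    if cp <= 0xFFFF:
        return chr(cp)
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return chr(high) + chr(low)


pair = wide_chr(0x1F600)
print(len(pair), [hex(ord(c)) for c in pair])
```

Callers that assume a single-character result can keep using the builtin,
while code that needs non-BMP code points opts in to the helper.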

Marc-Andre Lemburg


From martin at  Fri Nov 19 23:46:08 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 19 Nov 2010 23:46:08 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>
Message-ID: <>

> It's rather common to confuse a transfer encoding with a storage format.
> UCS2 and UCS4 refer to code units (the storage format).

Actually, they don't. Instead, they refer to "coded character sets",
in W3C terminology: mapping of characters to natural numbers. See

The term "UCS-2" is a character set that can encode only 65536
characters; it thus refers to Unicode 1.1. According to the Unicode
Consortium's FAQ, the term UCS-2 should be avoided these days.

> IMO, we should go back to the Python2 terms UCS2 and UCS4 which
> are correct and provide a clear description of what Python uses
> internally for code units.

No, we shouldn't. The term UCS-2 is deprecated, see above.


From v+python at  Sat Nov 20 04:48:58 2010
From: v+python at (Glenn Linderman)
Date: Fri, 19 Nov 2010 19:48:58 -0800
Subject: [Python-Dev] Web servers, bytes, str, documentation, Python 3.2a4
Message-ID: <>

So maybe this is the wrong forum, if so please tell me what the right 
forum is for each of the various pieces.  I'm assuming that I should 
file some bugs in the tracker, but I'm not exactly sure whether to file 
them on cgitb, http.server, or subprocess, or all of the above.  Pretty 
sure there are at least some in http.server, but maybe some of those 
will be considered "enhancement requests" since they are long 
outstanding in the predecessor code.

So I've been writing CGI scripts in Python behind Apache.  No framework, 
just raw CGI.

Got everything working on Python 2.6 (it's the newest that the hosting 
company has).  Whacked at 2.6's until I got an 
environment that would actually run CGI programs in the same sort of way 
that Apache does, so I can test faster, locally.  Got the site working.  
Am happy.

Now I decided to tackle porting the code to Python 3, in hopes that 
someday the hosting company might have it, and to see what I could learn 
about the "Subject:" matters, and to altruistically see if 3.2a4 has a 
consistent story.  Um.  Well.  One of me, Python 3.2a4, or its 
documentation is missing something.  Maybe several somethings.

Here's some code to ponder.

import sys
import traceback
sys.stdout = open("sob", "wb")  # WSGI sez data should be binary, so stdout should be binary???
import cgitb
fhb = open("fhb", "wb")
fhb.write("abcdef")  # try writing non-binary to binary file.  Expect an error, of course.

Feed it to python32...

Error in sys.excepthook:
TypeError: 'str' does not support the buffer interface

Original exception was:
Traceback (most recent call last):
   File "d:\my\py\", line 8, in <module>
     fhb.write("abcdef")  # try writing non-binary to binary file.  Expect an error, of course.
TypeError: 'str' does not support the buffer interface

So it seems that cgitb can't write to binary files, to report the 
error?  Or how else should I interpret the Error in sys.excepthook ?
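
The underlying behaviour can be reproduced without cgitb. The sketch below
uses an in-memory io.BytesIO as a stand-in for a file opened with "wb" (an
assumption made only to keep the example self-contained): writing str fails,
writing encoded bytes works, and an io.TextIOWrapper layered on top gives
text-writing code something it can use.

```python
import io

binary_file = io.BytesIO()  # stands in for open("fhb", "wb")

try:
    binary_file.write("abcdef")      # str into a binary stream
except TypeError as exc:
    error_message = str(exc)         # wording differs between Python versions
else:
    error_message = None

# Encoding explicitly turns the text into bytes the stream accepts.
binary_file.write("abcdef".encode("utf-8"))

# A text layer over the same binary stream accepts str and encodes it.
text_view = io.TextIOWrapper(binary_file, encoding="utf-8")
text_view.write("ghi")
text_view.flush()

written = binary_file.getvalue()
```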

So then I tweaked the code for cgitb's enjoyment:

import sys
import traceback
sys.stdout = open("sob", "w", encoding="UTF-8")  # WSGI sez data should be binary, so stdout should be binary???
import cgitb
fhb = open("fhb", "wb")
fhb.write("abcdef")  # try writing non-binary to binary file.  Expect an error, of course.

Now I get the following report in the stdout file:

out<!--: spam
Content-Type: text/html

<body bgcolor="#f0f0f8"><font color="#f0f0f8" size="-5"> -->
<body bgcolor="#f0f0f8"><font color="#f0f0f8" size="-5"> --> -->
</font> </font> </font> </script> </object> </blockquote> </pre>
</table> </table> </table> </table> </table> </font> </font> </font><p>A 
problem occurred in a Python script.

and the following error on the console:

Error in sys.excepthook:
Traceback (most recent call last):
   File "c:\python32\lib\", line 209, in _mkstemp_inner
     fd =, flags, 0o600)
OSError: [Errno 22] Invalid argument

Original exception was:
Traceback (most recent call last):
   File "d:\my\py\", line 8, in <module>
     fhb.write("abcdef")  # try writing non-binary to binary file.  Expect an error, of course.
TypeError: 'str' does not support the buffer interface

I was expecting to see a whole cgitb report in sob, but no such luck.  Not sure 
why it is trying to create a temporary file, but it seems to fail to do 
so.
Of course, the next test, would have been to write binary data into fhb, 
and try to copy it to stdout, which would fail, because stdout has to 
not be binary to make cgitb work???

That brings me to http.server, the 3.2a4 replacement for CGIHTTPServer.  
There are definitely some improvements here, and some 
reported-but-yet-unfixed bugs.  And some pitiful missing features, 
especially on Windows.  I applied some of the whacks I had applied to 
CGIHTTPServer, and got some things working, but, per what I was trying 
to demonstrate above, there seems to be an incompatibility with the idea 
of using cgitb (which wants stdout open with some encoding provided) and 
serving binary files (which wants stdout open in binary) [this latter is 
supported by the WSGI spec too].

So it seems to be that there are some problems.  Yet, it seems that 
http.server can somehow accept the data sent by cgitb, which comes from 
subprocess running my CGI script, but my CGI script fails to be able to 
copy a binary file to its stdout (a subprocess created PIPE).  The 
subprocess documentation doesn't say what encoding is supplied to the 
PIPE-created handles, if any, but since cgitb data is accepted but 
binary file data is not, I infer it must be a non-binary handle, 
encoding unknown.  The subprocess documentation doesn't document any way 
to specify what encoding should be used on the PIPE-created handles, 
either.  So this isn't very enlightening.  In the absence of a 
specification or parameter, I would have expected the PIPEs to be 
binary, but this seems to be experimentally false.

Yet http.server, when serving plain files, seems to open them in binary 
mode, and transfer them successfully to the browser.  And it can also 
accept the non-binary?? data from cgitb from my CGI script, and display 
it in the browser.  The former comes from a file it opens in binary 
mode, and the latter from the subprocess PIPE in unknown mode.

It seems that the socketfile.server opens the socket in "wb" mode, and 
encodes most data.  That in turn, seems to imply that the binary data 
from SimpleHTTPServer files are reasonably returned, and I note the 
headers and such are explicitly encoded before being written to wfile... 
again, consistent with the socket, wfile, being in binary mode.

But the data coming back from the subprocess PIPE from my CGI script 
seems to be acceptable to be written to wfile also, implying that the 
PIPEs are binary, as the absence of specifications and parameters, and 
knowledge of pipes as being bytestreams, would lead one to expect.  But then, it 
would seem that the cgitb output should be in binary to get into the 
PIPE, but it seems that using a binary stdout makes cgitb fail, in the 
above experiment... and I can't find any code in cgitb that does 
explicit encoding.

So I'm confused, and it seems a little extra documentation might help 
decide which are the modules that have bugs or missing features, and 
which do not.

One of the cgitb outputs from my attempt to serve the binary file claims 
that my CGI script's output file (which comes from a subprocess PIPE) is 
a TextIOWrapper with encoding cp1252.  Maybe that is the default that 
comes when a new Python is launched, even though it gets a subprocess 
PIPE as stdout?
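
One way to check this from outside, as a sketch only: it assumes a current
Python 3 where subprocess.run is available (it did not exist in 3.2a4), and
the child program is a made-up one-off. The parent reads raw bytes from the
PIPE, while the child reports what kind of object its own sys.stdout is and
writes binary data through sys.stdout.buffer.

```python
import subprocess
import sys

# Child program: report the type of sys.stdout on stderr, then write raw
# bytes to stdout through the underlying binary buffer.
child_code = (
    "import sys\n"
    "sys.stderr.write(type(sys.stdout).__name__)\n"
    "sys.stdout.buffer.write(bytes([0, 255, 65]))\n"
)

result = subprocess.run(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

child_stdout_type = result.stderr.decode("ascii", errors="replace")
payload = result.stdout  # bytes, exactly as the child wrote them
```

If this holds on a given platform, the pipe itself is a byte stream and the
text layer (with whatever default encoding) is added by the child
interpreter when it sets up sys.stdout.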

From stephen at  Sat Nov 20 05:11:48 2010
From: stephen at (Stephen J. Turnbull)
Date: Sat, 20 Nov 2010 13:11:48 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

"Martin v. Löwis" writes:

 > The term "UCS-2" is a character set that can encode only 65536
 > characters; it thus refers to Unicode 1.1. According to the Unicode
 > Consortium's FAQ, the term UCS-2 should be avoided these days.

So what do you propose we call the Python implementation?  You can
call it "code-unit-oriented" if you like, but in fact it is identical
to UCS-2 for all non-hairsplitting purposes.  AFAICS the Unicode
Consortium deprecates the *term* UCS-2 because they would like us to
avoid *implementations* that don't encode the full Unicode character
set, not because the term is technically incorrect.

Strictly speaking, internally Python only encodes 65536 characters in
2-octet builds.  Its (Unicode) string-handling code does not know
about surrogates at all, AFAIK, and therefore is not UTF-16
conforming.  (The anomalies discussed here are type transformations,
not string-handling, for my purpose.)  I really don't see why we
shouldn't call a UCS-2 implementation by its name.

AFAIK this was not supposed to change in Python 3; indexing and
slicing go by code unit (isomorphic to UCS-n), not character, and due
to PEP 383 4-octet builds do not conform (internally) to UTF-32, and
can produce output that conforms to Unicode not at all (as a user
option, of course, but it's still non-conformant).

 > > IMO, we should go back to the Python2 terms UCS2 and UCS4 which
 > > are correct and provide a clear description of what Python uses
 > > internally for code units.
 > No, we shouldn't. The term UCS-2 is deprecated, see above.

Too bad for the Unicode Consortium, I say.  UCS-2 is the closest term
that folks who are not Unicode geeks will have a chance of

I agree with Marc-Andre that "narrow" and "wide" are too ambiguous to
be useful.  Many people will interpret that as "UTF-16" (or even
"UTF-8") and "UTF-32", respectively, which is dead wrong.  Others
won't have a clue.  Using "UCS-2" and "UCS-4" has the correct
connotations to Unicode geeks, and they are easy to look up for
non-geeks who care about precise definitions.  Cf. the second half of
the FAQ you quote:

    Instead, "UCS-2" has sometimes been used in the past to indicate
    that an implementation does not support supplementary characters
    and doesn't interpret pairs of surrogate code points as
    characters. Such an implementation would not handle processing
    like character properties, codepoint boundaries, collation,
    etc. for supplementary characters.

"Hey, Python, I'm looking at you!"  (Strictly speaking, Python
libraries do some of that for us, but the Python *language* does not.)

From brian.curtin at  Sat Nov 20 05:24:38 2010
From: brian.curtin at (Brian Curtin)
Date: Fri, 19 Nov 2010 22:24:38 -0600
Subject: [Python-Dev] [Python-checkins] r86540 - in
 python/branches/py3k: Parser/ Python/Python-ast.c
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 19, 2010 at 20:01, benjamin.peterson <python-checkins at> wrote:

> Author: benjamin.peterson
> Date: Sat Nov 20 03:01:45 2010
> New Revision: 86540
> Log:
> c89 declarations
> Modified:
>   python/branches/py3k/Parser/
>   python/branches/py3k/Python/Python-ast.c
> Modified: python/branches/py3k/Parser/
> ==============================================================================
> --- python/branches/py3k/Parser/       (original)
> +++ python/branches/py3k/Parser/       Sat Nov 20 03:01:45 2010
> @@ -366,9 +366,9 @@
>         self.emit("obj2ast_%s(PyObject* obj, %s* out, PyArena* arena)" %
> (name, ctype), 0)
>         self.emit("{", 0)
>         self.emit("PyObject* tmp = NULL;", 1)
> +        self.emit("int isinstance;", 1)
>         # Prevent compiler warnings about unused variable.
>         self.emit("tmp = tmp;", 1)
> -        self.emit("int isinstance;", 1)
>         self.emit("", 0)
>     def sumTrailer(self, name, add_label=False):
> Modified: python/branches/py3k/Python/Python-ast.c
> ==============================================================================
> --- python/branches/py3k/Python/Python-ast.c    (original)
> +++ python/branches/py3k/Python/Python-ast.c    Sat Nov 20 03:01:45 2010
> @@ -3375,8 +3375,8 @@
>  obj2ast_mod(PyObject* obj, mod_ty* out, PyArena* arena)
>  {
>         PyObject* tmp = NULL;
> -        tmp = tmp;
>         int isinstance;
> +        tmp = tmp;

Windows builds fail due to this change.

From v+python at  Sat Nov 20 07:56:18 2010
From: v+python at (Glenn Linderman)
Date: Fri, 19 Nov 2010 22:56:18 -0800
Subject: [Python-Dev] Web servers, bytes, str, documentation,
	Python 3.2a4
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/19/2010 7:48 PM, Glenn Linderman wrote:
> One of the cgitb outputs from my attempt to serve the binary file 
> claims that my CGI script's output file (which comes from a subprocess 
> PIPE) is a TextIOWrapper with encoding cp1252.  Maybe that is the 
> default that comes when a new Python is launched, even though it gets 
> a subprocess PIPE as stdout?

So the rather gross code below solves the cp1252 stdout problem, and 
also permits both strings and bytes to be written to the same file, 
although those two features are separable.  But now that I've worked 
around it, it seems that subprocess should somehow ensure that launched 
Python programs know they are working on a binary stream?  Of course, 
not all programs launched are Python programs... so maybe it should be a 
documentation issue, but it seems to be missing from the documentation.

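import sys

if sys.version_info[ 0 ] == 2:
     class IOMix():
         def __init__( self, fh, encoding="UTF-8"):
             self.fh = fh
             self.encoding = encoding  # keep it; write() needs it
         def write( self, param ):
             if isinstance( param, unicode ):
                 self.fh.write( param.encode( self.encoding ))
             else:
                 self.fh.write( param )
if sys.version_info[ 0 ] == 3:
     class IOMix():
         def __init__( self, fh, encoding="UTF-8"):
             if hasattr( fh, 'buffer'):
                 # attribute name assumed; it is missing in the archived copy
                 self.bio = fh.buffer
                 self.last = 'b'
                 import io
                 # arguments after the encoding are missing in the archived copy
                 self.txt = io.TextIOWrapper( self.bio, encoding )
             else:
                 raise ValueError("not a buffered stream")
         def write( self, param ):
             if isinstance( param, str ):
                 self.last = 't'
                 self.txt.write( param )
             else:
                 if self.last == 't':
                     # assumed step: push pending text out before raw bytes
                     self.txt.flush()
                 self.last = 'b'
                 self.bio.write( param )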


From martin at  Sat Nov 20 10:05:38 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 20 Nov 2010 10:05:38 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

On 20.11.2010 05:11, Stephen J. Turnbull wrote:
> "Martin v. Löwis" writes:
>  > The term "UCS-2" is a character set that can encode only 65536
>  > characters; it thus refers to Unicode 1.1. According to the Unicode
>  > Consortium's FAQ, the term UCS-2 should be avoided these days.
> So what do you propose we call the Python implementation?

A technically correct description would be to say that Python uses either
16-bit code units or 32-bit code units; for brevity, these can be called
narrow and wide code units.
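
To make the distinction concrete, here is a small sketch that counts code
units by encoding, so the result does not depend on how the running
interpreter stores strings. The function name `code_unit_counts` is
hypothetical.

```python
def code_unit_counts(text):
    """Return (16-bit units, 32-bit units) needed to store text."""
    utf16_units = len(text.encode("utf-16-le")) // 2
    utf32_units = len(text.encode("utf-32-le")) // 4
    return utf16_units, utf32_units


bmp_char = "\u20ac"          # EURO SIGN, inside the BMP
astral_char = "\U00010000"   # first code point outside the BMP

bmp_counts = code_unit_counts(bmp_char)
astral_counts = code_unit_counts(astral_char)
```

A BMP character takes one unit either way; a character outside the BMP
takes two narrow units (a surrogate pair) but still one wide unit.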

> Strictly speaking, internally Python only encodes 65536 characters in
> 2-octet builds.  Its (Unicode) string-handling code does not know
> about surrogates at all, AFAIK

Here you are mistaken: it does indeed know about UTF-16 and surrogates
in several places, e.g. in the UTF-8 codec, or in the repr()
implementation; likewise in the parser.

> and therefore is not UTF-16 conforming.

I disagree. Python does "conform" to "UTF-16" (certainly in the
sense that no UTF-16 specification ever mandates a certain Python
API, and that Python follows all general requirements of the
UTF-16 specification).

> AFAIK this was not supposed to change in Python 3; indexing and
> slicing go by code unit (isomorphic to UCS-n), not character, and due
> to PEP 383 4-octet builds do not conform (internally) to UTF-32, and
> can produce output that conforms to Unicode not at all (as a user
> option, of course, but it's still non-conformant).

What behavior specifically do you consider non-conforming, and what
specific specification do you think it is not conforming to? For
example, it *is* fully conforming with UTF-8.


From merwok at  Sat Nov 20 12:38:53 2010
From: merwok at (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Sat, 20 Nov 2010 12:38:53 +0100
Subject: [Python-Dev] Web servers, bytes, str, documentation,
	Python 3.2a4
In-Reply-To: <>
References: <>
Message-ID: <>


> cgitb.enable(0,"d:\temp")

Isn't that expanded to "d:<tab>emp"?
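
A quick check of the escape, with two common ways around it (a raw string
or a doubled backslash). The variable names are only for illustration.

```python
plain = "d:\temp"      # \t is read as a TAB character
raw = r"d:\temp"       # raw string keeps the backslash
escaped = "d:\\temp"   # doubled backslash, same result as the raw string

has_tab = "\t" in plain
```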

From ncoghlan at  Sat Nov 20 14:16:27 2010
From: ncoghlan at (Nick Coghlan)
Date: Sat, 20 Nov 2010 23:16:27 +1000
Subject: [Python-Dev] [Python-checkins] pymigr: Build identification
 patch is updated, but only for Unix.
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Nov 20, 2010 at 6:02 PM, georg.brandl
<python-checkins at> wrote:
> georg.brandl pushed abd0dc1328ce to pymigr:
> changeset:   70:abd0dc1328ce
> tag:         tip
> user:        Georg Brandl <georg at>
> date:        Sat Nov 20 09:01:03 2010 +0100
> summary:     Build identification patch is updated, but only for Unix.
> files:       todo.txt

Does this repository use the same set of hooks as distutils2? (I'm
hoping not, since if it does, my change to the email hook didn't


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Nov 20 14:55:57 2010
From: ncoghlan at (Nick Coghlan)
Date: Sat, 20 Nov 2010 23:55:57 +1000
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <ic6a1d$1dm$>
References: <> <ibvvo1$40o$>
	<> <>
Message-ID: <>

On Sat, Nov 20, 2010 at 2:51 AM, Georg Brandl <g.brandl at> wrote:
> I'm at it.  In fact, I think I will merge both todo.txt and tasks.txt
> into the PEP.  It's not more of a burden to update it there, and it's
> more visible to the developer community.

The latest checkin was definitely an improvement (especially the
updated timeline).

According to the PEP, the .hgeol rules aren't currently enforced
server side - having such a hook in place before Hg went live was
definitely one of the things we agreed on before the hgeol extension
even existed in a usable form.

For fixing whitespace issues (another open question mentioned in the
PEP), "make patchcheck" can continue to handle that - no need to
create a Hg specific extension for it.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Nov 20 16:21:32 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 21 Nov 2010 01:21:32 +1000
Subject: [Python-Dev] [Python-checkins] r86566 - in
 python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst
 Lib/ Lib/test/ Misc/NEWS Misc/python-wing4.wpr
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 21, 2010 at 1:07 AM, michael.foord
<python-checkins at> wrote:
> +Fetching attributes statically
> +------------------------------
> +
> +Both :func:`getattr` and :func:`hasattr` can trigger code execution when
> +fetching or checking for the existence of attributes. Descriptors, like
> +properties, will be invoked and :meth:`__getattr__` and :meth:`__getattribute__`
> +may be called.
> +
> +For cases where you want passive introspection, like documentation tools, this
> +can be inconvenient. `getattr_static` has the same signature as :func:`getattr`
> +but avoids executing code when it fetches attributes.

This description feels a little strong to me - getattr_static still
executes all those things on the metaclass as it retrieves the
information it needs to do the "static" lookup. Leaving this original
description (which assumes metaclass=type) alone and adding a note
near the end of the section to say that metaclass code is still
executed might be an improvement.
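
For reference, the basic difference the new docs describe can be seen with a
property. This is only a sketch of the metaclass=type case; it does not
demonstrate the metaclass caveat, and the class name `Lazy` is made up.

```python
import inspect

calls = []


class Lazy:
    @property
    def value(self):
        calls.append("property ran")   # side effect marks code execution
        return 42


obj = Lazy()

dynamic = getattr(obj, "value")                 # runs the property body
static = inspect.getattr_static(obj, "value")   # returns the property object itself
```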


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From fuzzyman at  Sat Nov 20 16:29:13 2010
From: fuzzyman at (Michael Foord)
Date: Sat, 20 Nov 2010 15:29:13 +0000
Subject: [Python-Dev] [Python-checkins] r86566 - in
 python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst
 Lib/ Lib/test/ Misc/NEWS Misc/python-wing4.wpr
In-Reply-To: <>
References: <>
Message-ID: <>

On 20/11/2010 15:21, Nick Coghlan wrote:
> On Sun, Nov 21, 2010 at 1:07 AM, michael.foord
> <python-checkins at>  wrote:
>> +Fetching attributes statically
>> +------------------------------
>> +
>> +Both :func:`getattr` and :func:`hasattr` can trigger code execution when
>> +fetching or checking for the existence of attributes. Descriptors, like
>> +properties, will be invoked and :meth:`__getattr__` and :meth:`__getattribute__`
>> +may be called.
>> +
>> +For cases where you want passive introspection, like documentation tools, this
>> +can be inconvenient. `getattr_static` has the same signature as :func:`getattr`
>> +but avoids executing code when it fetches attributes.
> This description feels a little strong to me - getattr_static still
> executes all those things on the metaclass as it retrieves the
> information it needs to do the "static" lookup. Leaving this original
> description (which assumes metaclass=type) alone and adding a note
> near the end of the section to say that metaclass code is still
> executed might be an improvement.
Can you give an example of code in a metaclass that may be executed by 
getattr_static? It's not that I don't believe you I just can't think of 
an example. Looking up the class and the mro are the only two examples I 
can think of (klass.__mro__ and instance.__class__ - and they are noted 
in the docs?) but aren't metaclass specific.


> Cheers,
> Nick.


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From solipsis at  Sat Nov 20 16:42:30 2010
From: solipsis at (Antoine Pitrou)
Date: Sat, 20 Nov 2010 16:42:30 +0100
Subject: [Python-Dev] r86570 - in python/branches/py3k:
 Lib/unittest/ Lib/unittest/test/ Misc/NEWS
References: <>
Message-ID: <>

On Sat, 20 Nov 2010 16:34:26 +0100 (CET)
michael.foord <python-checkins at> wrote:
> +
> +    def testPickle(self):
> +        # Issue 10326
> +
> +        # Can't use TestCase classes defined in Test class as
> +        # pickle does not work with inner classes
> +        test = unittest.TestCase('run')
> +        for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
> +
> +            # blew up prior to fix
> +            pickled_test = pickle.dumps(test, protocol=protocol)

You must also check that the object can be unpickled, otherwise
making TestCase picklable is not only pointless, but misleading the
user.  Other classes which claim to be picklable (such as e.g.
io.BytesIO) are careful to check that unpickling works fine and
produces a usable object.



From fuzzyman at  Sat Nov 20 16:48:59 2010
From: fuzzyman at (Michael Foord)
Date: Sat, 20 Nov 2010 15:48:59 +0000
Subject: [Python-Dev] r86570 - in python/branches/py3k:
 Lib/unittest/ Lib/unittest/test/ Misc/NEWS
In-Reply-To: <>
References: <>
Message-ID: <>

On 20/11/2010 15:42, Antoine Pitrou wrote:
> On Sat, 20 Nov 2010 16:34:26 +0100 (CET)
> michael.foord<python-checkins at>  wrote:
>> +
>> +    def testPickle(self):
>> +        # Issue 10326
>> +
>> +        # Can't use TestCase classes defined in Test class as
>> +        # pickle does not work with inner classes
>> +        test = unittest.TestCase('run')
>> +        for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
>> +
>> +            # blew up prior to fix
>> +            pickled_test = pickle.dumps(test, protocol=protocol)
> You must also check that the object can be unpickled, otherwise
> making TestCase picklable is not only pointless, but misleading the
> user.  Other classes which claim to be picklable (such as e.g.
> io.BytesIO) are careful to check that unpickling works fine and
> produces a usable object.

Well, given the *particular* bug it is fixing, ensuring that the 
TestCase instances can be pickled is enough. If they fail to unpickle 
that is a bug in pickle and not in unittest. *However*, the test is very 
easy to extend to what you suggest so I have done it.
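
One way such an extension could look, as a sketch rather than the committed
test: it is written as a plain script instead of a test method, and the
list `restored` exists only so the copies can be inspected afterwards.

```python
import pickle
import unittest

test = unittest.TestCase("run")
restored = []

for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    pickled_test = pickle.dumps(test, protocol=protocol)
    unpickled_test = pickle.loads(pickled_test)
    # Use the copy, so a half-restored object would show up here.
    unpickled_test.assertEqual(set(), set())
    restored.append(unpickled_test)
```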

All the best,


> Regards
> Antoine.



From solipsis at  Sat Nov 20 16:59:49 2010
From: solipsis at (Antoine Pitrou)
Date: Sat, 20 Nov 2010 16:59:49 +0100
Subject: [Python-Dev] r86570 - in python/branches/py3k:
 Lib/unittest/ Lib/unittest/test/ Misc/NEWS
In-Reply-To: <>
References: <>
Message-ID: <1290268789.3560.12.camel@localhost.localdomain>

On Saturday, 20 November 2010 at 15:48 +0000, Michael Foord wrote:
> On 20/11/2010 15:42, Antoine Pitrou wrote:
> > On Sat, 20 Nov 2010 16:34:26 +0100 (CET)
> > michael.foord<python-checkins at>  wrote:
> >> +
> >> +    def testPickle(self):
> >> +        # Issue 10326
> >> +
> >> +        # Can't use TestCase classes defined in Test class as
> >> +        # pickle does not work with inner classes
> >> +        test = unittest.TestCase('run')
> >> +        for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
> >> +
> >> +            # blew up prior to fix
> >> +            pickled_test = pickle.dumps(test, protocol=protocol)
> > You must also check that the object can be unpickled, otherwise
> > making TestCase picklable is not only pointless, but misleading the
> > user.  Other classes which claim to be picklable (such as e.g.
> > io.BytesIO) are careful to check that unpickling works fine and
> > produces a usable object.
> Well, given the *particular* bug it is fixing, ensuring that the 
> TestCase instances can be pickled is enough. If they fail to unpickle 
> that is a bug in pickle and not in unittest.

It wouldn't be, no.  pickle provides several different APIs to ensure
that state gets correctly stored *and* restored, but it's up to
application classes such as TestCase to ensure that they implement those
APIs correctly for the intended behaviour.  Therefore, checking that
pickling "works" fine (or, rather, seems to work) is only half of the

(for example, if you define a __getstate__, chances are you must define
a __setstate__ too, and it is your job to make it work properly)
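
A toy illustration of that pairing, under stated assumptions: the class
`Connection`, its `handle` attribute and the helper `round_trip` are all
invented for this sketch and have nothing to do with TestCase internals.

```python
import pickle


class Connection:
    """Toy class with one attribute that cannot be pickled."""

    def __init__(self, host):
        self.host = host
        self.handle = lambda: None   # stand-in for an unpicklable resource

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["handle"]          # drop what pickle cannot store
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.handle = lambda: None   # rebuild the dropped attribute


def round_trip(obj):
    """Pickle and unpickle obj with every protocol, returning the copies."""
    return [
        pickle.loads(pickle.dumps(obj, protocol=protocol))
        for protocol in range(pickle.HIGHEST_PROTOCOL + 1)
    ]


copies = round_trip(Connection("example.invalid"))
```

Without the __setstate__ half, dumps() would still succeed, but every copy
would come back without a `handle`, which is the kind of failure a
dumps-only test cannot see.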


From ncoghlan at  Sat Nov 20 17:01:06 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 21 Nov 2010 02:01:06 +1000
Subject: [Python-Dev] [Python-checkins] r86566 - in
 python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst
 Lib/ Lib/test/ Misc/NEWS Misc/python-wing4.wpr
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 21, 2010 at 1:29 AM, Michael Foord
<fuzzyman at> wrote:
> Can you give an example of code in a metaclass that may be executed by
> getattr_static? It's not that I don't believe you I just can't think of an
> example. Looking up the class and the mro are the only two examples I can
> think of (klass.__mro__ and instance.__class__ - and they are noted in the
> docs?) but aren't metaclass specific.

The description heavily implies that arbitrary Python code won't be
executed by calling getattr_static, and that isn't necessarily true.
It's almost certain to be true in the case when the metaclass is type,
but can't be guaranteed otherwise. The retrieval of __class__ is a
normal lookup on the object, so it can trigger all of the things
getattr_static is trying to avoid (unavoidable if you want to support
proxy classes at all), and the lookup of __mro__ invokes all of those
things on the metaclass.

I'll see if I'm still of the same opinion after I sleep on it, but my
first impression of the docs was that they slightly oversold the
strength of the "doesn't execute arbitrary code" aspect of the new
function. The existing caveats were all relating to when getattr() and
getattr_static() might give different answers, while the additional
caveats I was suggesting related to cases where arbitrary code may
still be executed.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From fuzzyman at  Sat Nov 20 17:06:59 2010
From: fuzzyman at (Michael Foord)
Date: Sat, 20 Nov 2010 16:06:59 +0000
Subject: [Python-Dev] [Python-checkins] r86566 - in
 python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst
 Lib/ Lib/test/ Misc/NEWS Misc/python-wing4.wpr
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

On 20/11/2010 16:01, Nick Coghlan wrote:
> On Sun, Nov 21, 2010 at 1:29 AM, Michael Foord
> <fuzzyman at>  wrote:
>> Can you give an example of code in a metaclass that may be executed by
>> getattr_static? It's not that I don't believe you I just can't think of an
>> example. Looking up the class and the mro are the only two examples I can
>> think of (klass.__mro__ and instance.__class__ - and they are noted in the
>> docs?) but aren't metaclass specific.
> The description heavily implies that arbitrary Python code won't be
> executed by calling getattr_static, and that isn't necessarily true.
> It's almost certain to be true in the case when the metaclass is type,
> but can't be guaranteed otherwise.

Given the way that member lookups are done by getattr_static I don't 
think any assumptions about the metaclass are made. I'm happy to be 
proven wrong (but would rather fix it than document it as an exception). 
(Actually we assume the metaclass doesn't use __slots__, but only 
because it isn't *possible* for a metaclass to use __slots__.)

> The retrieval of __class__ is a
> normal lookup on the object, so it can trigger all of the things
> getattr_static is trying to avoid (unavoidable if you want to support
> proxy classes at all), and the lookup of __mro__ invokes all of those
> things on the metaclass.

__class__ and mro lookup are noted in the docs as being exceptions. We 
could actually remove the __class__ lookup from the list of exceptions 
by using type(...) instead of obj.__class__.

> I'll see if I'm still of the same opinion after I sleep on it, but my
> first impression of the docs was that they slightly oversold the
> strength of the "doesn't execute arbitrary code" aspect of the new
> function. The existing caveats were all relating to when getattr() and
> getattr_static() might give different answers, while the additional
> caveats I was suggesting related to cases where arbitrary code may
> still be executed.
I'm happy to change the wording to make the promise less strong.

All the best,


> Cheers,
> Nick.



From fuzzyman at  Sat Nov 20 17:10:42 2010
From: fuzzyman at (Michael Foord)
Date: Sat, 20 Nov 2010 16:10:42 +0000
Subject: [Python-Dev] r86570 - in python/branches/py3k:
 Lib/unittest/ Lib/unittest/test/ Misc/NEWS
In-Reply-To: <1290268789.3560.12.camel@localhost.localdomain>
References: <>	<>	<>
Message-ID: <>

On 20/11/2010 15:59, Antoine Pitrou wrote:
> On Saturday 20 November 2010 at 15:48 +0000, Michael Foord wrote:
>> On 20/11/2010 15:42, Antoine Pitrou wrote:
>>> On Sat, 20 Nov 2010 16:34:26 +0100 (CET)
>>> michael.foord<python-checkins at>   wrote:
>>>> +
>>>> +    def testPickle(self):
>>>> +        # Issue 10326
>>>> +
>>>> +        # Can't use TestCase classes defined in Test class as
>>>> +        # pickle does not work with inner classes
>>>> +        test = unittest.TestCase('run')
>>>> +        for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
>>>> +
>>>> +            # blew up prior to fix
>>>> +            pickled_test = pickle.dumps(test, protocol=protocol)
>>> You must also check that the object can be unpickled, otherwise
>>> making TestCase picklable is not only pointless, but misleading the
>>> user.  Other classes which claim to be picklable (such as e.g.
>>> io.BytesIO) are careful to check that unpickling works fine and
>>> produces a usable object.
>> Well, given the *particular* bug it is fixing, ensuring that the
>> TestCase instances can be pickled is enough. If they fail to unpickle
>> that is a bug in pickle and not in unittest.
> It wouldn't be, no.  pickle provides several different APIs to ensure
> that state gets correctly stored *and* restored, but it's up to
> application classes such as TestCase to ensure that they implement those
> APIs correctly for the intended behaviour.  Therefore, checking that
> pickling "works" fine (or, rather, seems to work) is only half of the
> job.
> (for example, if you define a __getstate__, chances are you must define
> a __setstate__ too, and it is your job to make it work properly)

Yes, but unittest.TestCase doesn't implement any of those APIs (and if 
we did we would *definitely* need to test unpickling). That aside I have 
extended the test in the way you suggest.
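
A rough sketch of a round-trip check along the lines Antoine suggests. The helper name roundtrip_all_protocols is made up for illustration, and this is not the test that was actually committed:

```python
import pickle
import unittest

def roundtrip_all_protocols(obj):
    """Pickle and unpickle obj once per protocol, returning the copies."""
    copies = []
    for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
        data = pickle.dumps(obj, protocol=protocol)
        copies.append(pickle.loads(data))
    return copies

test = unittest.TestCase('run')
copies = roundtrip_all_protocols(test)

# Exercise each copy so that a broken but unpicklable-looking object is caught.
for copy in copies:
    copy.assertEqual(set(), set())
```

Comparing each copy with the original, and then calling an assert method on it, checks that the restored object is usable and not merely that dumps() did not raise.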

Actually it would be nice to implement custom pickling / unpickling 
methods to allow Python 2.7 / 3.2 pickled TestCases to be unpickled on 
earlier versions of Python. I couldn't see how to change the class name 
in the pickle using the pickle protocol methods. Suggestions welcomed.


> Antoine.



From fuzzyman at  Sat Nov 20 17:28:40 2010
From: fuzzyman at (Michael Foord)
Date: Sat, 20 Nov 2010 16:28:40 +0000
Subject: [Python-Dev] [Python-checkins] r86566 - in
 python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst
 Lib/ Lib/test/ Misc/NEWS Misc/python-wing4.wpr
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

On 20/11/2010 16:06, Michael Foord wrote:
> On 20/11/2010 16:01, Nick Coghlan wrote:
> [snip...]
>> The retrieval of __class__ is a
>> normal lookup on the object, so it can trigger all of the things
>> getattr_static is trying to avoid (unavoidable if you want to support
>> proxy classes at all), and the lookup of __mro__ invokes all of those
>> things on the metaclass.
> __class__ and mro lookup are noted in the docs as being exceptions. We 
> could actually remove the __class__ lookup from the list of exceptions 
> by using type(...) instead of obj.__class__.


>> I'll see if I'm still of the same opinion after I sleep on it, but my
>> first impression of the docs was that they slightly oversold the
>> strength of the "doesn't execute arbitrary code" aspect of the new
>> function. The existing caveats were all relating to when getattr() and
>> getattr_static() might give different answers, while the additional
>> caveats I was suggesting related to cases where arbitrary code may
>> still be executed.
> I'm happy to change the wording to make the promise less strong.

I've also removed the __mro__ exception. This is done with:


If you can think of any other exceptions then please let me know.


> All the best,
> Michael
>> Cheers,
>> Nick.



From v+python at  Sat Nov 20 19:19:11 2010
From: v+python at (Glenn Linderman)
Date: Sat, 20 Nov 2010 10:19:11 -0800
Subject: [Python-Dev] Web servers, bytes, str, documentation,
	Python 3.2a4
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 11/20/2010 3:38 AM, Éric Araujo wrote:
> Hello
>> cgitb.enable(0,"d:\temp")
> Isn't that expanded to "d:<tab>emp"?

Oops.  Yes, that fixes the problem with creation of the temp file, 
thanks for catching that.  I  now get a complete report of the original 
error in the temp file (below).  I am a bit less confused now... but it 
seems that there are still a number of issues.  Here is an enumeration 
of problems I was hard pressed to make before you removed my confusion 
on this issue.
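
For reference, a small sketch of the escape problem Éric spotted and two common ways around it (the variable names are only for illustration):

```python
broken = "d:\temp"     # "\t" is the tab escape, so this is d:<TAB>emp
escaped = "d:\\temp"   # doubled backslash gives a literal backslash
raw = r"d:\temp"       # a raw string literal leaves the backslash alone
```

The escaped and raw spellings produce the same seven-character path, while the broken one is six characters long and contains a tab.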

1. cgitb should expect to report to a binary stdout, using whatever 
encoding (possibly ASCII) that seems appropriate for the output that it 
generates.

2. Some appropriate documentation or API or both should be provided to 
enable a script to set "binary" mode for stdout for CGI scripts. This 
demonstrates the confusion (wish I had found it earlier) that is 
encountered by such lack.  One must tell msvcrt the stream is binary (I 
had figured that out early on), one must also sidestep the use of the 
cp1252 default when printing binary, one must also choose a proper text 
encoding corresponding to the HTTP headers sent.  My second email in 
this thread, sent a few hours after the first, shows a convenient set of 
cures for all but msvcrt (as long as only "write" is used for writing.  
"print" support could be added, similarly).  Likely something along this 
line is needed for stdin as well, I haven't yet experimented with 
uploading binary content to a CGI.

One could speculate about having the Python runtime auto-detect CGI 
mode, but I don't know of any foolproof technique for that, and the 
selection of the "proper" text encoding depends on the details of the 
CGI, so having instead an API or two that assists with doing this sort 
of thing would be better; the need for documentation, at least, seems 
imperative.

3. subprocess documentation could be improved to point out  that when 
using subprocess.PIPE to talk to a Python subprocess, that the 
communications will be in binary.  Again, I don't know of any way to 
autodetect the subprocess environment, but if it were possible to select 
an appropriate encoding and use it consistently on both sides of the 
PIPE, that would be a convenience to its use; if not possible, 
documenting the issue, and providing an API to use to easily select such 
encodings both in client and server, would be helpful.

While the layers are all there, and ".buffer" is documented for 
TextIOWrapper, the use of sys.stdout.buffer and the fact that it has a 
full set of operations isn't immediately obvious from the reference 
material; perhaps it is in a tutorial I haven't found, but... I was 
looking, and didn't find it.

Of course, subprocess may launch non-Python programs; they will have 
their own ideas of binary vs text encoding, so it is important that it 
is convenient to match them on the Python side.

It would be nice if subprocess had a mechanism for providing no-deadlock 
stdout data to the parent prior to the child terminating.  A CGI 
implementation via subprocess shouldn't accumulate all of stdout (or all 
of stderr, for that matter, although less important).  I don't (yet) 
know enough about Python threading to know if this is possible, but it 
certainly would be useful.

4. http.server has a number of bugs and limitations.
4a. _url_collapse_path_split seems inefficient (although I have to 
benchmark it against what I think would be more efficient), and for its 
only use within http.server it produces the wrong information, so the 
information has to be recombined and resplit to make it function 
properly, adding to the perception of inefficiency.
4b. Detection of "executable" on Windows is simply wrong.  Unix 
execution bits do not exist.
4c. is_cgi doesn't properly handle PATHINFO parts of the path, this is 
the other half of 4a.  The Python2.x had this right, 
but the introduction and use of _url_collapse_path_split broke it.
4d. Searching for a ? to find an explicit query string should use 
.find('?') rather than .rfind('?') as there is no prohibition on using 
'?' within a query string, AFAIK.
4e. doesn't set the REQUEST_URI, HTTP_HOST, or HTTP_PORT  environment 
variables for the CGI.
4f. Should not send the 200 response until it sees if the CGI sends a 
Status: header.
4g. Should not buffer all of stdout: subprocess.communicate is 
inappropriate for a web server CGI interface.  The data should stream 
through to avoid consuming inordinate amounts of memory.  The only 
solution within the current limitations of subprocess is to abandon 
stderr, force the CGI to do its own error logging, and use 
shutil.copyfileobj to hook up p.stdout to self.wfile once the Status: 
message processing has happened.
4h. Doesn't seem to close p.stdin (I'm not sure if that is necessary, it 
may happen when p is garbage collected, but effort was made to close 
p.stdout and p.stderr, which seem similar.)
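
To illustrate 4g and 4h, here is a hedged sketch of streaming a child's stdout with shutil.copyfileobj, not the http.server code itself. The function name stream_child_output and the in-memory sink are assumptions for the example, and stderr is simply inherited, as the text suggests. Writing a large request body before reading could still deadlock, so this only shows the general shape:

```python
import io
import shutil
import subprocess
import sys

def stream_child_output(args, dest, request_body=b""):
    """Run args and copy its stdout to dest in chunks instead of buffering it all."""
    proc = subprocess.Popen(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    try:
        proc.stdin.write(request_body)
    finally:
        proc.stdin.close()              # signal EOF to the child explicitly
    shutil.copyfileobj(proc.stdout, dest)
    proc.stdout.close()
    return proc.wait()

sink = io.BytesIO()  # stands in for the handler's self.wfile
child_code = "import sys; sys.stdout.write('Status: 200 OK\\n\\nhello')"
status = stream_child_output([sys.executable, "-c", child_code], sink)
```

Because communicate() is not used, p.stdin has to be closed by hand, which matches the observation in 4h.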

*TypeError* 	Python 3.2a4: c:\python32\python.exe
Sat Nov 20 09:28:41 2010

A problem occurred in a Python script. Here is the sequence of function 
calls leading up to the error, in the order they occurred.

d:\my\py\ in **()
     4 import cgitb
     5 sys.stdout.write("out")
     6 fhb = open("fhb", "wb")
     7 cgitb.enable(0,"d:\\temp")
=>    8 fhb.write("abcdef")  # try writing non-binary to binary file.  Expect an error, of course.
*fhb* = <_io.BufferedWriter name='fhb'>, fhb.*write* = <built-in method 
write of _io.BufferedWriter object>

*TypeError*: 'str' does not support the buffer interface
args = ("'str' does not support the buffer interface",)
with_traceback = <built-in method with_traceback of TypeError object>

From alexander.belopolsky at  Sat Nov 20 23:32:28 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sat, 20 Nov 2010 17:32:28 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Sat, Nov 20, 2010 at 4:05 AM, "Martin v. Löwis" <martin at> wrote:
> A technically correct description would be to say that Python uses either
> 16-bit code units or 32-bit code units; for brevity, these can be called
> narrow and wide code units.


PEP 261 introduced terms "wide Py_UNICODE" and "narrow Py_UNICODE,"
but when discussion is at Python level, I don't think we should use
names of C typedefs.   I think "wide/narrow Unicode" builds describe
the two options clearly and unambiguously.   I prefer Python-specific
terminology to Unicode terms because in Python reference documentation
we often discuss details that are outside of the scope of Unicode
Standard.  For example, interpretation of lone surrogates on narrow
builds is one such detail.

From ziade.tarek at  Sun Nov 21 00:05:12 2010
From: ziade.tarek at (Tarek Ziadé)
Date: Sun, 21 Nov 2010 00:05:12 +0100
Subject: [Python-Dev] Reminder: Distutils vs Distutils2
Message-ID: <>


I have seen some efforts recently to improve Distutils in the standard library,

Just a quick reminder of the status of Distutils: it's frozen and is
just being bug fixed at this time. The work I did last year was
reverted and pushed to Distutils2.
A lot of work has been done since then, and we had 4 GSOC students
working this summer on Distutils2.

It's backward-incompatible, so we can remove the things we don't like
and add new things w/o suffering from backward compatibility pains.

So if you want to improve the tool, or if you have some pending
changes to Distutils, I would encourage you to join the Distutils2
effort and not to waste time on Distutils anymore.  The patches that
did not make it to Distutils can still be added in Distutils2, for
most of them.

The workflow we currently use to change the code is as follows and makes
it easy for everyone to contribute:

1. clone
2. discuss / propose a patch on IRC (#distutils - Freenode) or on the
dedicated mailing list
3. I review and merge all changes at bitbucket, then push them on

Crazy ideas are welcome. "" is gone in d2 for instance ;)

Thanks !


Tarek Ziadé |

From ziade.tarek at  Sun Nov 21 00:15:41 2010
From: ziade.tarek at (Tarek Ziadé)
Date: Sun, 21 Nov 2010 00:15:41 +0100
Subject: [Python-Dev] Reminder: Distutils vs Distutils2
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 21, 2010 at 12:05 AM, Tarek Ziadé <ziade.tarek at> wrote:
> Crazy ideas are welcome. "" is gone in d2 for instance ;)

But you can still use a similar form if you want - just to mention

From ncoghlan at  Sun Nov 21 04:52:19 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 21 Nov 2010 13:52:19 +1000
Subject: [Python-Dev] [Python-checkins] r86566 - in
 python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst
 Lib/ Lib/test/ Misc/NEWS Misc/python-wing4.wpr
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 21, 2010 at 2:06 AM, Michael Foord
<fuzzyman at> wrote:
>> I'll see if I'm still of the same opinion after I sleep on it, but my
>> first impression of the docs was that they slightly oversold the
>> strength of the "doesn't execute arbitrary code" aspect of the new
>> function. The existing caveats were all relating to when getattr() and
>> getattr_static() might give different answers, while the additional
>> caveats I was suggesting related to cases where arbitrary code may
>> still be executed.
> I'm happy to change the wording to make the promise less strong.

Your latest changes may have actually made the stronger wording
accurate (I certainly can't think of any loopholes off the top of my
head). If you did still want to soften the wording, I'd be inclined to
replace the word "avoids" with "minimises" in the appropriate places.
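
As a small illustration of the "avoids executing code" promise under discussion, this sketch contrasts getattr() with inspect.getattr_static() on a property. The class Lazy and the list calls are invented for the example:

```python
import inspect

calls = []

class Lazy:
    @property
    def value(self):
        # Arbitrary code that an introspection tool may not want to run.
        calls.append("property executed")
        return 42

obj = Lazy()
static_result = inspect.getattr_static(obj, "value")  # the property object itself
calls_after_static = list(calls)                       # still empty at this point
dynamic_result = getattr(obj, "value")                 # runs the property
```

The static lookup hands back the descriptor without calling it, which is also one of the documented cases where the two functions give different answers.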


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sun Nov 21 04:54:11 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 21 Nov 2010 13:54:11 +1000
Subject: [Python-Dev] [Python-checkins] r86566 - in
 python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst
 Lib/ Lib/test/ Misc/NEWS Misc/python-wing4.wpr
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 21, 2010 at 1:07 AM, michael.foord
<python-checkins at> wrote:
> Author: michael.foord
> Date: Sat Nov 20 16:07:30 2010
> New Revision: 86566
> Log:
> Issue 9732: addition of getattr_static to the inspect module
> Modified:
>   python/branches/py3k/Doc/glossary.rst
>   python/branches/py3k/Doc/library/inspect.rst
>   python/branches/py3k/Lib/
>   python/branches/py3k/Lib/test/
>   python/branches/py3k/Misc/NEWS
>   python/branches/py3k/Misc/python-wing4.wpr

Unrelated to my previous comment - when adding
inspect.getgeneratorstate, I noticed that inspect.getattr_static isn't
mentioned in the 3.2 What's New yet (I put a XXX placeholder in for

Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From v+python at  Sun Nov 21 08:52:45 2010
From: v+python at (Glenn Linderman)
Date: Sat, 20 Nov 2010 23:52:45 -0800
Subject: [Python-Dev] Web servers, bytes, str, documentation,
	Python 3.2a4
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 11/20/2010 10:19 AM, Glenn Linderman wrote:
> Oops.  Yes, that fixes the problem with creation of the temp file, 
> thanks for catching that.  I  now get a complete report of the 
> original error in the temp file (below).  I am a bit less confused 
> now... but it seems that there are still a number of issues.  Here is 
> an enumeration of problems I was hard pressed to make before you 
> removed my confusion on this issue.

Related issues, regarding binary stream requirements for cgi interface.  
Perhaps the cgi module should have the API to set binary mode.

Sadly, input handling seems to depend on the email module, 
thought to be fixed for 3.2, but it is not clear if that has been 
achieved, or if the surrogate encode workaround is sufficient for this.  
More testing needed, but I don't have such a test case developed yet.

> 1. cgitb should expect to report to a binary stdout, using whatever 
> encoding (possibly ASCII) that seems appropriate for the output that 
> it generates.

Maybe should have an API to set the stdin and stdout to binary 
streams.   Although deals more with stdin than stdout, cgitb 
deals more with stdout.
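
A sketch of what writing CGI output to a binary stream might look like once stdout is treated as bytes. The helper write_cgi_response is hypothetical, not an existing cgi or cgitb API, and an in-memory stream stands in for sys.stdout.buffer:

```python
import io

def write_cgi_response(stream, body, charset="utf-8"):
    """Write headers and body as bytes, encoding the text explicitly."""
    payload = body.encode(charset)
    header = "Content-Type: text/html; charset={}\r\n\r\n".format(charset)
    stream.write(header.encode("ascii"))   # headers are plain ASCII
    stream.write(payload)                  # body uses the declared charset

out = io.BytesIO()  # in a real script this might be sys.stdout.buffer
write_cgi_response(out, "caf\u00e9")
```

The point is that the encoding named in the header and the encoding used for the body are chosen in one place, instead of depending on a platform default such as cp1252.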


> 2. Some appropriate documentation or API or both should be provided to 
> enable a script to set "binary" mode for stdout for CGI scripts. This 
> link 
> <> 
> demonstrates the confusion (wish I had found it earlier) that is 
> encountered by such lack.  One must tell msvcrt the stream is binary 
> (I had figured that out early on), one must also sidestep the use of 
> the cp1252 default when printing binary, one must also choose a proper 
> text encoding corresponding to the HTTP headers sent.  My second email 
> in this thread, sent a few hours after the first, shows a convenient 
> set of cures for all but msvcrt (as long as only "write" is used for 
> writing.  "print" support could be added, similarly).  Likely 
> something along this line is needed for stdin as well, I haven't yet 
> experimented with uploading binary content to a CGI.
> One could speculate about having the Python runtime auto-detect CGI 
> mode, but I don't know of any foolproof technique for that, and the 
> selection of the "proper" text encoding depends on the details of the 
> CGI, so having instead an API or two that assists with doing this sort 
> of thing would be better; the need for documentation, at least, seems 
> imperative.


> 3. subprocess documentation could be improved to point out  that when 
> using subprocess.PIPE to talk to a Python subprocess, that the 
> communications will be in binary.  Again, I don't know of any way to 
> autodetect the subprocess environment, but if it were possible to 
> select an appropriate encoding and use it consistently on both sides 
> of the PIPE, that would be a convenience to its use; if not possible, 
> documenting the issue, and providing an API to use to easily select 
> such encodings both in client and server, would be helpful.
> While the layers are all there, and ".buffer" is documented for 
> TextIOWrapper, the use of sys.stdout.buffer and the fact that it has a 
> full set of operations isn't immediately obvious from the reference 
> material; perhaps it is in a tutorial I haven't found, but... I was 
> looking, and didn't find it.
> Of course, subprocess may launch non-Python programs; they will have 
> their own ideas of binary vs text encoding, so it is important that it 
> is convenient to match them on the Python side.
> It would be nice if subprocess had a mechanism for providing 
> no-deadlock stdout data to the parent prior to the child terminating.  
> A CGI implementation via subprocess shouldn't accumulate all of stdout 
> (or all of stderr, for that matter, although less important).  I don't 
> (yet) know enough about Python threading to know if this is possible, 
> but it certainly would be useful.

[link scrubbed] for subprocess to document that communicate produces byte
stream output.
[link scrubbed] for subprocess enhancements to handle more cases without
deadlock.
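
A short sketch of the behaviour being described: with subprocess.PIPE and no text mode requested, communicate() hands back bytes, so the parent has to pick a decoding itself. The child command here is just an illustrative one-liner:

```python
import subprocess
import sys

proc = subprocess.Popen(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
)
raw_output, _ = proc.communicate()          # bytes, not str
text = raw_output.decode("ascii").strip()   # the caller chooses the encoding
```

Both sides have to agree on that encoding; nothing in the pipe itself records it.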

Found which documents how to switch 
stdin/stdout/stderr to binary mode, and even back!   I couldn't track 
the documented change to the actual documentation, though, but I did 
find it in section 26.1, under the documentation for the three stdio 

import sys

def make_streams_binary():
    # Replace each text wrapper with its underlying binary buffer.
    sys.stdin = sys.stdin.detach()
    sys.stdout = sys.stdout.detach()
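
To show what detach() does without touching the real standard streams, here is a sketch using an in-memory stream (the names binary, text_layer and raw are illustrative):

```python
import io

binary = io.BytesIO()
text_layer = io.TextIOWrapper(binary, encoding="ascii", newline="\n")
text_layer.write("header\n")

raw = text_layer.detach()     # flushes pending text and returns the byte stream
raw.write(b"\x00\x01\x02")    # arbitrary bytes can be written from here on
```

After detach() the text wrapper is unusable, so switching back to text means wrapping the binary stream in a new TextIOWrapper.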

> 4. http.server has a number of bugs and limitations.
> 4a. _url_collapse_path_split seems inefficient (although I have to 
> benchmark it against what I think would be more efficient), and for 
> its only use within http.server it produces the wrong information, so 
> the information has to be recombined and resplit to make it function 
> properly, adding to the perception of inefficiency.
> 4b. Detection of "executable" on Windows is simply wrong.  Unix 
> execution bits do not exist.

[link scrubbed] for 4b.

> 4c. is_cgi doesn't properly handle PATHINFO parts of the path, this is 
> the other half of 4a.  The Python2.x had this right, 
> but the introduction and use of _url_collapse_path_split broke it.

[link scrubbed] for 4a and 4c.

> 4d. Searching for a ? to find an explicit query string should use 
> .find('?') rather than .rfind('?') as there is no prohibition on using 
> '?' within a query string, AFAIK.

[link scrubbed] for 4d.
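
A small sketch of the difference 4d describes, using a made-up request path that contains a second '?' inside the query string:

```python
path = "/cgi-bin/search.py?q=what?&page=2"

first = path.find("?")    # index of the first '?'
last = path.rfind("?")    # index of the last '?'

script, query = path[:first], path[first + 1:]
wrong_script, wrong_query = path[:last], path[last + 1:]
```

Splitting at the first '?' keeps the whole query string together, while splitting at the last one leaves part of the query attached to the script path.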

> 4e. doesn't set the REQUEST_URI, HTTP_HOST, or HTTP_PORT  environment 
> variables for the CGI.

[link scrubbed] for 4e.

> 4f. Should not send the 200 response until it sees if the CGI sends a 
> Status: header.

[link scrubbed] for 4f and 4g.
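
One way the Status: handling in 4f could look, as a sketch only. The function split_status is hypothetical, it assumes the CGI output uses CRLF line endings and is fully available as bytes, and it is not how http.server is implemented:

```python
def split_status(cgi_output):
    """Return (status, remaining_output) for raw CGI output bytes."""
    head, sep, body = cgi_output.partition(b"\r\n\r\n")
    status = b"200 OK"  # default when the script sends no Status: header
    kept = []
    for line in head.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"status":
            status = value.strip()      # the script chose its own status
        else:
            kept.append(line)           # pass other headers through untouched
    return status, b"\r\n".join(kept) + sep + body

status, rest = split_status(
    b"Status: 404 Not Found\r\nContent-Type: text/plain\r\n\r\nmissing"
)
```

The server would send its response line only after this step, using the returned status instead of an unconditional 200.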

> 4g. Should not buffer all of stdout: subprocess.communicate is 
> inappropriate for a web server CGI interface.  The data should stream 
> through to avoid consuming inordinate amounts of memory.  The only 
> solution within the current limitations of subprocess is to abandon 
> stderr, force the CGI to do its own error logging, and use 
> shutil.copyfileobj to hook up p.stdout to self.wfile once the Status: 
> message processing has happened.
> 4h. Doesn't seem to close p.stdin (I'm not sure if that is necessary, 
> it may happen when p is garbage collected, but effort was made to 
> close p.stdout and p.stderr, which seem similar.)

Discovered that subprocess.communicate closes p.stdin, so it wasn't 
needed until I quit using .communicate in my version of the code.

From stephen at  Sun Nov 21 13:55:12 2010
From: stephen at (Stephen J. Turnbull)
Date: Sun, 21 Nov 2010 21:55:12 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

"Martin v. L?wis" writes:
 > On 20.11.2010 05:11, Stephen J. Turnbull wrote:
 > > "Martin v. Löwis" writes:
 > > 
 > >  > The term "UCS-2" is a character set that can encode only 65536
 > >  > characters; it thus refers to Unicode 1.1. According to the Unicode
 > >  > Consortium's FAQ, the term UCS-2 should be avoided these days.
 > > 
 > > So what do you propose we call the Python implementation?
 > A technically correct description would be to say that Python uses either
 > 16-bit code units or 32-bit code units; for brevity, these can be called
 > narrow and wide code units.

I agree that's technically correct.  Unfortunately, it's also useless
to anybody who doesn't already know more about Unicode than anybody
should have to know.

 > > and therefore is not UTF-16 conforming.
 > I disagree. Python does "conform" to "UTF-16"

I'm sure the codecs do.  But the Unicode standard doesn't care about
the parts of the process, it cares about what it does as a whole.
Python's internal coding does not conform to UTF-16, and that internal
coding can, under certain conditions, escape to the outside world as
invalid "Unicode" output.

 > > AFAIK this was not supposed to change in Python 3; indexing and
 > > slicing go by code unit (isomorphic to UCS-n), not character, and due
 > > to PEP 383 4-octet builds do not conform (internally) to UTF-32, and
 > > can produce output that conforms to Unicode not at all (as a user
 > > option, of course, but it's still non-conformant).
 > What behavior specifically do you consider non-conforming, and what
 > specific specification do you think it is not conforming to? For
 > example, it *is* fully conforming with UTF-8.


    f = open('/tmp/broken','wt',encoding='utf8',errors='surrogateescape')

for one.  That produces a non-UTF-8 file in a 32-bit-code-unit build.
You can say, "oh, but that's not really a UTF-8 codec", and I'd agree.
Nevertheless, the program is able to produce output from internal
"Unicode" strings that does not conform to Unicode at all.  A Unicode-
conforming Python implementation would error at the chr() call, or
perhaps would not provide surrogateescape error handlers.
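
A sketch of the kind of output being described, done in memory instead of through a file. The choice of chr(0xDC80) is an assumption for illustration, since the original chr() call is not shown above:

```python
lone = chr(0xDC80)  # a lone low surrogate, not a valid Unicode scalar value

# surrogateescape turns U+DC80..U+DCFF back into the raw bytes 0x80..0xFF.
escaped_bytes = lone.encode("utf-8", errors="surrogateescape")

try:
    lone.encode("utf-8")          # the strict handler refuses the same string
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

try:
    escaped_bytes.decode("utf-8")  # the escaped output is not valid UTF-8
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
```

So the string is accepted internally, the strict codec rejects it, and the surrogateescape handler emits a byte sequence that a strict UTF-8 decoder will not read back.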

It is, of course, possible to write Python programs that conform (and
easier than in any other language I know), but Python itself does not
conform to post-1.1 Unicode standards.  Too bad for the standards:
"Although practicality beats purity."

The point is that internal code is *not* UTF-16 (or -32), but it *is*
isomorphic to UCS-2 (or -4).  *That is very useful information to
users*, it's not a technical detail of interest only to Unicode geeks.
It means that if you stick to defined characters in the BMP when
giving Python input, then slicing and indexing unicode (Python 2) or
str (Python 3) objects gives only valid output even in builds with
16-bit code units.  OTOH, invalid processing (involving functions like
'chr' or input using surrogateescape codecs) can lead to invalid
output even in builds with 32-bit code units.

IMO, saying "UCS-2" or "UCS-4" tells ordinary developers most of what
they need to know about the limitations of their Python vis-a-vis full
conformance, at least with respect to the string manipulation functions.

From rdmurray at  Sun Nov 21 18:18:20 2010
From: rdmurray at (R. David Murray)
Date: Sun, 21 Nov 2010 12:18:20 -0500
Subject: [Python-Dev] Web servers, bytes, str, documentation,
	Python 3.2a4
In-Reply-To: <>
References: <> <>
	<> <>
Message-ID: <>

On Sat, 20 Nov 2010 23:52:45 -0800, Glenn Linderman <v+python at> wrote:
> Sadly, input handling seems to depend on the email module, 
> thought to be fixed for 3.2, but it is not clear if that has been 
> achieved, or if the surrogate encode workaround is sufficient for this.  
> More testing needed, but I don't have such a test case developed yet.

Indeed, this should theoretically be fixable now.  The email module
is now perfectly capable of both consuming and producing binary data.
The user of the module doesn't need to care how this was achieved unless
they want to do processing of non-RFC conformant data.

I want to look at the CGI issue, but I'm not sure when I'll get to it.

R. David Murray                            

From jcea at  Sun Nov 21 18:27:42 2010
From: jcea at (Jesus Cea)
Date: Sun, 21 Nov 2010 18:27:42 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <>
Message-ID: <>


What is the impact on the buildbot architecture? Must the slaves do
anything? At least they need to have Mercurial installed, I guess.

What, as a buildslave manager, must I do to ready my server for the
migration?

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From rdmurray at  Sun Nov 21 18:38:25 2010
From: rdmurray at (R. David Murray)
Date: Sun, 21 Nov 2010 12:38:25 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Sun, 21 Nov 2010 21:55:12 +0900, "Stephen J. Turnbull" <stephen at> wrote:
> "Martin v. Löwis" writes:
>  > On 20.11.2010 05:11, Stephen J. Turnbull wrote:
>  > > "Martin v. Löwis" writes:
>  > >
>  > >  > The term "UCS-2" is a character set that can encode only 65536
>  > >  > characters; it thus refers to Unicode 1.1. According to the Unicode
>  > >  > Consortium's FAQ, the term UCS-2 should be avoided these days.
>  > >
>  > > So what do you propose we call the Python implementation?
>  >
>  > A technically correct description would be to say that Python uses either
>  > 16-bit code units or 32-bit code units; for brevity, these can be called
>  > narrow and wide code units.
> I agree that's technically correct.  Unfortunately, it's also useless
> to anybody who doesn't already know more about Unicode than anybody
> should have to know.

> The point is that internal code is *not* UTF-16 (or -32), but it *is*
> isomorphic to UCS-2 (or -4).  *That is very useful information to
> users*, it's not a technical detail of interest only to Unicode geeks.
> It means that if you stick to defined characters in the BMP when
> giving Python input, then slicing and indexing unicode (Python 2) or
> str (Python 3) objects gives only valid output even in builds with
> 16-bit code units.  OTOH, invalid processing (involving functions like
> 'chr' or input using surrogateescape codecs) can lead to invalid
> output even in builds with 32-bit code units.
> IMO, saying "UCS-2" or "UCS-4" tells ordinary developers most of what
> they need to know about the limitations of their Python vis-a-vis full
> conformance, at least with respect to the string manipulation functions.

I'm sorry, but I have to disagree.  As a relative unicode ignoramus,
"UCS-2" and "UCS-4" convey almost no information to me, and the bits I
have heard about them on this list have only confused me.  On the other
hand, I understand that 'narrow' means that fewer bytes are used for
each internal character, meaning that some unicode characters need to
be represented by more than one string element, and thus that slicing
strings containing such characters on a narrow build causes problems.
Now, you could tell me the same information using the terms 'UCS-2'
and 'UCS-4' instead of 'narrow' and 'wide', but to my ear 'narrow'
and 'wide' convey a better gut level feeling for what is going on than
'UCS-2' and 'UCS-4' do.  And it avoids any question of whether or not
Python's internal representation actually conforms to whatever standard
it is that UCS refers to, a point on which there seems to be some

Having written the above, I googled for UCS-2 and got the Wikipedia
article on UTF16/UCS-2 [1].  Scanning that article, I do not see anything
that would clue me in to the problems of slicing strings in a Python
narrow build.  Indeed, reading that article with my limited unicode
knowledge, if I were told Python used UCS-2, I would assume that non-BMP
characters could not be processed by a Python narrow build.
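
To make the "more than one string element" point concrete, here is a sketch of the arithmetic that splits a code point into 16-bit units. The function utf16_code_units is written for this example and only models what a narrow build stores. Python 3.3 and later no longer distinguish narrow and wide builds, so the function works on any current interpreter:

```python
def utf16_code_units(ch):
    """Return the 16-bit code units a narrow build would store for ch."""
    cp = ord(ch)
    if cp < 0x10000:
        return [cp]                      # BMP character: one unit
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)       # leading (high) surrogate
    low = 0xDC00 + (offset & 0x3FF)      # trailing (low) surrogate
    return [high, low]

bmp_units = utf16_code_units("A")
astral_units = utf16_code_units(chr(0x1F600))
```

Slicing between the two units of astral_units is what leaves a lone surrogate behind on a narrow build.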

R. David Murray                            


From g.brandl at  Sun Nov 21 18:58:53 2010
From: g.brandl at (Georg Brandl)
Date: Sun, 21 Nov 2010 18:58:53 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <>
References: <> <>
Message-ID: <icbmo3$knc$>

Am 21.11.2010 18:27, schrieb Jesus Cea:
> What is the impact in the buildbot architecture?. Slaves must do
> anything?. At least they need to have mercurial installed, I guess.
> What, as a buildslave manager, must I do to ready my server for the
> migration?.

Apart from having Mercurial installed and "hg" in the PATH (that will
be important for Windows I assume), I don't think anything else is required.


From raymond.hettinger at  Sun Nov 21 19:17:57 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Sun, 21 Nov 2010 10:17:57 -0800
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Nov 21, 2010, at 9:38 AM, R. David Murray wrote:
> I'm sorry, but I have to disagree.  As a relative unicode ignoramus,
> "UCS-2" and "UCS-4" convey almost no information to me, and the bits I
> have heard about them on this list have only confused me. 

From the user's point of view, it doesn't much matter which encoding is
used internally.  

Neither UTF-16 nor UCS-2 is exactly correct anyway.  The former encodes
the entire range of unicode characters in a variable length code
(a character is usually 2 bytes but is sometimes 4 bytes long).  The latter
encodes only a subset of unicode (the basic multilingual plane) in a
fixed-length code (2 bytes per character).

What we use internally looks like utf-16 but a character encoded with
4 bytes is treated as two 2-byte characters (hence the subject of this
thread).   Our hybrid internal coding lets us handle the entire
range of unicode while getting speed and simplicity by doing len()
and slicing with a surrogate pair being treated as two separate
characters.

For the "wide" build, the entire range of unicode is encoded at
4 bytes per character and slicing/len operate correctly since
every character is the same length.   This used to be called UCS-4
and is now UTF-32.

So, with "wide" builds there isn't much confusion (except perhaps
unfamiliar terminology).   The real issue seems to be that for 
"narrow" builds, none of the usual encoding names is exactly correct.  

From a user's point-of-view, the actual encoding or encoding name 
doesn't matter much.  They just need to be able to predict the relevant
behaviors (memory consumption and len/slicing behavior).

For the narrow build, that behavior is:
- Characters in the BMP consume 2 bytes and count as one char
  for purposes of len and slicing.
- Characters above the BMP consume 4 bytes and count as
  two distinct chars for purposes of len and slicing.

For wide builds, all characters are 4 bytes and count as a single
char for len and slicing.
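
The len() rules above can be sketched on a current interpreter, where
narrow builds no longer exist, by counting encoded code units instead.
The helper names narrow_len and wide_len are made up for this
illustration; they only simulate what len() would report on each kind
of build.

```python
def narrow_len(text):
    """Count 16-bit code units: what len() reports on a narrow build."""
    return len(text.encode("utf-16-le")) // 2


def wide_len(text):
    """Count 32-bit code units: what len() reports on a wide build."""
    return len(text.encode("utf-32-le")) // 4


bmp_char = "\u00e9"          # inside the BMP
astral_char = chr(0x10000)   # first code point above the BMP

print(narrow_len(bmp_char), wide_len(bmp_char))
print(narrow_len(astral_char), wide_len(astral_char))
```

The non-BMP character counts as two elements in the narrow simulation
and as one in the wide simulation.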

Hope this helps,


From martin at  Sun Nov 21 19:51:44 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 21 Nov 2010 19:51:44 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
Message-ID: <>

>  > I disagree. Python does "conform" to "UTF-16"
> I'm sure the codecs do.  But the Unicode standard doesn't care about
> the parts of the process, it cares about what it does as a whole.

Chapter and verse?

> Python's internal coding does not conform to UTF-16, and that internal
> coding can, under certain conditions, escape to the outside world as
> invalid "Unicode" output.

I'm fairly certain there are provisions in the Unicode standard for such
behavior (taking into account "certain conditions").

>  > What behavior specifically do you consider non-conforming, and what
>  > specific specification do you think it is not conforming to? For
>  > example, it *is* fully conforming with UTF-8.
> Oh,
>     f = open('/tmp/broken','wt',encoding='utf8',errors='surrogateescape')
>     f.write(chr(int('dc80',16)))
>     f.close()
> for one.  That produces a non-UTF-8 file

Right. You are using an API that does not promise to create UTF-8, and
hence isn't UTF-8. The Unicode standard certainly allows implementations
to use character encoding schemes other than UTF-8; this one being
"UTF-8 with surrogate escapes", which is different from "UTF-8" (IANA
MIBEnum 106).
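
A minimal sketch of the distinction, using only the str.encode and
bytes.decode methods (the variable names are illustrative): the same
lone surrogate is rejected by the strict UTF-8 codec, but passes
through as a raw byte when the surrogateescape handler is requested,
and that byte is not valid UTF-8.

```python
lone = chr(0xDC80)

# With the surrogateescape handler the lone surrogate becomes byte 0x80.
escaped = lone.encode("utf-8", errors="surrogateescape")

# The strict codec refuses to encode the same string.
try:
    lone.encode("utf-8")
    strict_accepts = True
except UnicodeEncodeError:
    strict_accepts = False

# The escaped output does not decode as UTF-8.
try:
    escaped.decode("utf-8")
    output_is_utf8 = True
except UnicodeDecodeError:
    output_is_utf8 = False

print(escaped, strict_accepts, output_is_utf8)
```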

> You can say, "oh, but that's not really a UTF-8 codec", and I'd agree.

See above :-)

> Nevertheless, the program is able to produce output from internal
> "Unicode" strings that does not conform to Unicode at all.

*Any* Unicode implementation will do that, since they all have to
support legacy encodings in some form. This is certainly conforming to
the Unicode standard, and in fact one of the primary Unicode design
principles.

> A Unicode-
> conforming Python implementation would error at the chr() call, or
> perhaps would not provide surrogateescape error handlers.

Chapter and verse?

> "Although practicality beats purity."

The Unicode standard itself is based on practicality. It wouldn't
have received the success it did if it was based on purity only
(and indeed, was often rejected in cases where it put purity over
practicality, e.g. with the Hangul syllables).


From rdmurray at  Sun Nov 21 20:29:15 2010
From: rdmurray at (R. David Murray)
Date: Sun, 21 Nov 2010 14:29:15 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Sun, 21 Nov 2010 10:17:57 -0800, Raymond Hettinger <raymond.hettinger at> wrote:
> On Nov 21, 2010, at 9:38 AM, R. David Murray wrote:
> > I'm sorry, but I have to disagree.  As a relative unicode ignoramus,
> > "UCS-2" and "UCS-4" convey almost no information to me, and the bits I
> > have heard about them on this list have only confused me.


> From a user's point-of-view, the actual encoding or encoding name
> doesn't matter much.  They just need to be able to predict the relevant
> behaviors (memory consumption and len/slicing behavior).
> For the narrow build, that behavior is:
> - Characters in the BMP consume 2 bytes and count as one char
>   for purposes of len and slicing.
> - Characters above the BMP consume 4 bytes and count as
>   two distinct chars for purposes of len and slicing.
> For wide builds, all characters are 4 bytes and count as a single
> char for len and slicing.
> Hope this helps,

Thank you, that nicely summarizes and confirms what I thought I knew about
wide versus narrow build.  And as I said, using the names UCS-2/UCS-4
would only *confuse* that understanding, not clarify it.

R. David Murray                            

From alexander.belopolsky at  Sun Nov 21 23:13:22 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 21 Nov 2010 17:13:22 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 19, 2010 at 4:43 PM, "Martin v. Löwis" <martin at> wrote:
>> In my opinion, the question is more why was it not fixed in Python2. I suppose
>> that the answer is something ugly like "backward compatibility" or "historical
>> reasons" :-)
> No, there was a deliberate decision to not support that, see
> There had been a long discussion on this specific detail when PEP 261
> was written, and in the end, an explicit, deliberate, considered
> decision was made to raise a ValueError.

Yes, the existence of PEP 261 was one of the reasons I was surprised
that a change like this was made without deliberation.   Personally,
I've never used chr() or ord() other than on the python command
prompt.  Processing text one character at a time is just too slow in
Python.  So for my own use cases, the change is quite welcome.  I also
find that bytes() items being int in 3.x more or less removes the
need for ord().  On the other hand any 2.x program that uses unichr()
and ord() is very likely to exhibit subtly buggy behavior when ported
to 3.x.  I don't think len(chr(i)) = 2 is likely to cause problems,
but map(ord, s) not being an iterator over code points is likely to
break naive programs.   This is especially true because as far as I
can tell there is no easy way to iterate over code points in a Python
string on a narrow build.
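
For illustration only, here is roughly what such an iteration has to
do by hand.  This is a sketch, not an existing API: iter_code_points
is a hypothetical helper, and it takes a sequence of 16-bit code unit
integers (what map(ord, s) yields on a narrow build) rather than a
str, so that it behaves the same on any interpreter.

```python
def iter_code_points(units):
    """Yield code points from 16-bit code units, joining surrogate pairs.

    Unpaired surrogates are passed through unchanged.
    """
    pending = None  # high surrogate waiting for its low partner
    for unit in units:
        if pending is not None:
            if 0xDC00 <= unit <= 0xDFFF:
                # Combine the pair into one supplementary code point.
                yield 0x10000 + ((pending - 0xD800) << 10) + (unit - 0xDC00)
                pending = None
                continue
            yield pending  # the high surrogate was unpaired
            pending = None
        if 0xD800 <= unit <= 0xDBFF:
            pending = unit
        else:
            yield unit
    if pending is not None:
        yield pending


units = [0x61, 0xD83D, 0xDE00, 0x62]
print([hex(cp) for cp in iter_code_points(units)])
```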

From merwok at  Mon Nov 22 01:54:34 2010
From: merwok at (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Mon, 22 Nov 2010 01:54:34 +0100
Subject: [Python-Dev] [Python-checkins] r86633 - in
 python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst
 Lib/	Lib/test/ Misc/NEWS
In-Reply-To: <>
References: <>
Message-ID: <>

> Author: nick.coghlan
> New Revision: 86633
> Issue #10220: Add inspect.getgeneratorstate(). Initial patch by Rodolpho Eckhardt
> Modified: python/branches/py3k/Doc/library/inspect.rst
> ==============================================================================
> --- python/branches/py3k/Doc/library/inspect.rst	(original)
> +++ python/branches/py3k/Doc/library/inspect.rst	Sun Nov 21 04:44:04 2010
> @@ -620,3 +620,25 @@
>             # in which case the descriptor itself will
>             # have to do
>             pass
> +
> +Current State of a Generator
> +----------------------------
> +
> +When implementing coroutine schedulers and for other advanced uses of
> +generators, it is useful to determine whether a generator is currently
> +executing, is waiting to start or resume or execution, or has already
> +terminated. func:`getgeneratorstate` allows the current state of a
> +generator to be determined easily.
> +
> +.. function:: getgeneratorstate(generator)
> +
> +    Get current state of a generator-iterator.
> +
> +    Possible states are:
> +      GEN_CREATED: Waiting to start execution.
> +      GEN_RUNNING: Currently being executed by the interpreter.
> +      GEN_SUSPENDED: Currently suspended at a yield expression.
> +      GEN_CLOSED: Execution has completed.

I wonder if those shouldn't be marked up as :data: or something to make
them indexed.
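
For readers who have not tried the new function, a short sketch of
three of the documented states as seen from outside the generator
(the ticker generator is just a placeholder for this example):

```python
import inspect


def ticker():
    yield "first"
    yield "second"


gen = ticker()
states = [inspect.getgeneratorstate(gen)]      # not started yet
next(gen)
states.append(inspect.getgeneratorstate(gen))  # paused at a yield
gen.close()
states.append(inspect.getgeneratorstate(gen))  # finished
print(states)
```

GEN_RUNNING is only observable while the generator's own frame is
executing, so it does not show up in this outside view.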

From v+python at  Mon Nov 22 04:59:54 2010
From: v+python at (Glenn Linderman)
Date: Sun, 21 Nov 2010 19:59:54 -0800
Subject: [Python-Dev] Web servers, bytes, str, documentation,
	Python 3.2a4
In-Reply-To: <>
References: <> <>
	<> <>
Message-ID: <>

On 11/21/2010 9:18 AM, R. David Murray wrote:
> I want to look at the CGI issue, but I'm not sure when I'll get to it.

Actually, since this code was working before 3.x, and if email.parser 
can now accept binary streams, it seems like maybe the only thing that 
might be wrong is that presently it is getting a text stream instead, so 
that is something or the application program would have to 
switch, and then maybe some testing would discover correctness, or maybe 
a specification of UTF-8 as the encoding to use for the text parts would 
have to be done.

From rdmurray at  Mon Nov 22 05:39:57 2010
From: rdmurray at (R. David Murray)
Date: Sun, 21 Nov 2010 23:39:57 -0500
Subject: [Python-Dev] Web servers, bytes, str, documentation,
	Python 3.2a4
In-Reply-To: <>
References: <> <>
	<> <>
Message-ID: <>

On Sun, 21 Nov 2010 19:59:54 -0800, Glenn Linderman <v+python at> wrote:
> On 11/21/2010 9:18 AM, R. David Murray wrote:
> > I want to look at the CGI issue, but I'm not sure when I'll get to it.
> Actually, since this code was working before 3.x, and if email.parser 
> can now accept binary streams, it seems like maybe the only thing that 
> might be wrong is that presently it is getting a text stream instead, so 
> that is something or the application program would have to 
> switch, and then maybe some testing would discover correctness, or maybe 
> a specification of UTF-8 as the encoding to use for the text parts would 
> have to be done.

Well, given the bytes/string split in Python3, code definitely has to
be changed to make this work, since you have to explicitly call bytes
processing routines (message_from_bytes, message_from_binary_file,
BytesFeedParser, etc.) to parse binary data, and likewise use
BytesGenerator to emit binary data.
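
A minimal sketch of that explicit bytes path, assuming a tiny
hand-written message rather than real CGI input (the header values
and body are invented for the example):

```python
import email
from email.generator import BytesGenerator
from io import BytesIO

# Invented sample message; real input would come from a binary stream.
raw = b"From: sender@example.com\nSubject: form upload\n\nplain ASCII body\n"

# Parse bytes with the bytes-specific entry point.
msg = email.message_from_bytes(raw)

# Emit bytes again with the bytes-specific generator.
buffer = BytesIO()
BytesGenerator(buffer).flatten(msg)
emitted = buffer.getvalue()

print(msg["Subject"])
print(type(emitted).__name__)
```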

R. David Murray                            

From brian.curtin at  Mon Nov 22 06:14:24 2010
From: brian.curtin at (Brian Curtin)
Date: Sun, 21 Nov 2010 23:14:24 -0600
Subject: [Python-Dev] Bug week-end on the 20th-21st?
In-Reply-To: <>
References: <> <i9v7ad$scf$>
Message-ID: <>

On Mon, Oct 25, 2010 at 15:04, Antoine Pitrou <solipsis at> wrote:

> On Mon, 25 Oct 2010 11:32:42 -0400
> "R. David Murray" <rdmurray at> wrote:
> > On Mon, 25 Oct 2010 12:22:24 -0200, Rodrigo Bernardo Pimentel <
> rbp at> wrote:
> > >> Am 23.10.2010 19:08, schrieb Antoine Pitrou:
> > >>> The first 3.2 beta is scheduled by Georg for November 13th.
> > >>> What would you think of scheduling a bug week-end one week later,
> that
> > >>> is on November 20th and 21st? We would need enough core developers to
> > >>> be available on #python-dev.
> > >
> > >FWIW, I'm +1, and I'll try to get the Sao Paulo users group to
> participate.
> >
> > I think this is a great idea (both Antoine's initial suggestion and the
> > idea of getting users groups to participate).
> >
> > I'll be around and able to participate that weekend except for evening
> > US Eastern time.
> Ok, so 20th-21st of November it shall be!
> Regards
> Antoine.

Although a few time zones are still celebrating Bug Weekend, it looks like
at least 76 bugs got closed out [0]. Some of those happened thanks to a
number of first time contributors. Thanks to everyone for their efforts!


From stephen at  Mon Nov 22 06:28:13 2010
From: stephen at (Stephen J. Turnbull)
Date: Mon, 22 Nov 2010 14:28:13 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

"Martin v. Löwis" writes:

 > Chapter and verse?

Unicode 5.0, Chapter 3, verse C9:

    When a process generates a code unit sequence which purports to be
    in a Unicode character encoding form, it shall not emit ill-formed
    code sequences.

I think anything called "UTF-8 something" is likely to be taken to
"purport".  Furthermore, users don't necessarily see which error
handlers are being used.  A user who specifies "utf8" as the output
codec is likely to be rather surprised if non-UTF-8 is emitted because
the app specified surrogateescape.  Eg, consider a script which munges
file descriptions into reasonable-length file names on Unix.  Yes,
technically the non-Unicode output is the app's fault, but I expect
many users will put some blame on Python.

I am in full agreement with you about the technicalities, but I am
looking for ways to clue in users that (a) the technicalities matter,
and (b) that Python does a *very* good job of making things as safe as
possible without becoming unable to handle bytes.  I think "wide"
vs. "narrow" fails at both.  It focuses on storage issues, which of
course are important, but at the cost of ignoring the fact that for
users of non-BMP characters 32-bit code units are much safer.  Users
who need non-BMP characters are relatively few, and at least at the
present time most are painfully aware of the need to care for
technicalities.  I expect them to be pleasantly surprised by how easy
it is to get reasonably safe behavior even from a 16-bit build.

 > > Python's internal coding does not conform to UTF-16, and that internal
 > > coding can, under certain conditions, escape to the outside world as
 > > invalid "Unicode" output.
 > I'm fairly certain there are provisions in the Unicode standard for such
 > behavior (taking into account "certain conditions").

Sure.  There's nothing in the Unicode standard that says you have to
conform to it unless you claim to conform to it.

So it is valid to say that Python's Unicode codecs without
surrogateescape do conform.  The point is that Python does not, even
if all of the input is valid Unicode, because of the provision of
surrogateescape and the lack of Unicode conformance-checking for
certain internal functionality like chr() and slicing.

You can say "we don't make any such claim", but IMO the distinction in
question is too fine a point for most users, and requires a very large
amount of Unicode knowledge (not to mention standards geekiness) to
even understand the precise statement.

"Unicode support" to users should mean that Python does the right
thing, not that if you look hard enough in the documentation you will
discover that Python doesn't claim to do the right thing even though
in practice it mostly does.  IMO, "UCS-2" is a pretty good description
of what the user can leave up to Python in perfect safety.  RDM's
reply worries me a little, but I'll reply to his message separately.

 > *Any* Unicode implementation will do that, since they all have to
 > support legacy encodings in some form. This is certainly conforming to
 > the Unicode standard, and in fact one of the primary Unicode design
 > principles.

No.  Support for legacy encodings takes you outside of the realm of
Unicode conformance by definition.  Their names tell you that,
however.  "UTF-8 with surrogate escapes" on the other hand is an
entirely different kettle of fish.  It pretends to be UTF-8, but
isn't.  I think that users who give Python valid input should be able
to expect valid output, but they can't.

Chapter 3, verse C7:

    When a process purports not to modify the interpretation of a
    valid coded character sequence, it shall make no change to that
    coded character sequence other than the possible replacement of
    character sequences by their canonical-equivalent sequences, or
    the deletion of *noncharacter* code points.

Sure, you can tell users the truth: "Python may modify your Unicode
characters if you slice or index Unicode strings.  It may even
silently turn them into invalid codes which will eventually raise
Errors."  Then you are conformant, but why would anyone want to use
such a program?

If you tell them "UCS-2[sic] Python is safe to use with *no* extra
care if you use only UCS-2 [or BMP] characters", suddenly Python looks
very nice indeed again.  "UCS-4" Python is even better; all you have
to do is to avoid surrogateescape codecs.  However, you're still
vulnerable to hard-to-diagnose errors at the output stage in case of
program bugs, because not enough checking of values is done by Python.

 > > A Unicode-conforming Python implementation would error at the
 > > chr() call, or perhaps would not provide surrogateescape error
 > > handlers.
 > Chapter and verse?

Chapter 3, verse C9 again.

 > > "Although practicality beats purity."
 > The Unicode standard itself is based on practicality. It wouldn't
 > have received the success it did if it was based on purity only
 > (and indeed, was often rejected in cases where it put purity over
 > practicality, e.g. with the Hangul syllables).

Python practicality is very different from Unicode practicality.

From v+python at  Mon Nov 22 06:40:22 2010
From: v+python at (Glenn Linderman)
Date: Sun, 21 Nov 2010 21:40:22 -0800
Subject: [Python-Dev] is this a bug?  no environment variables
Message-ID: <>

In reviewing my notes from my experimentations with CGIHTTPServer 
(Python2.6) and then http.server (Python 3.2a4), I note one behavior I 
haven't reported as a bug, nor do I know where to start to figure it 
out, other than experimentally.

The experiment: launching CGIHTTPServer without environment variables, 
by the simple expedient of using a batch file to unset all the existing 
environment variables, and then launching Python2.6 with CGIHTTPServer.

So it failed early: fails at line 110 (Python 2.6).

I suppose it is possible that some environment variables are used by 
Python directly (but I can't seem to find a documented list of them) 
although I would expect that usage to be optional, with fall-back 
defaults when they don't exist.  I suppose it is even possible that some 
Windows APIs might depend on some environment variables, but I expected 
that the registry had replaced such usage completely, by now, with the 
environment variables mostly being a convenience tool for batch files, 
or for optional, temporary alteration of particular settings.

If anyone knows of documentation listing what environment variables are 
required by Python on Windows, I would appreciate a pointer, searches 
and doc browsing having not turned it up.

I'll attempt to recreate the test situation later this week with Python 
3.2a4, if no one responds, but the only debug technique I can think of 
is to slowly remove environment variables until I find the minimum set 
required to run http.server successfully for my tests with CGI files.

From stephen at  Mon Nov 22 07:14:46 2010
From: stephen at (Stephen J. Turnbull)
Date: Mon, 22 Nov 2010 15:14:46 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

R. David Murray writes:

 > I'm sorry, but I have to disagree.  As a relative unicode ignoramus,
 > "UCS-2" and "UCS-4" convey almost no information to me, and the bits I
 > have heard about them on this list have only confused me.

OK, point taken.

 > On the other hand, I understand that 'narrow' means that fewer
 > bytes are used for each internal character, meaning that some
 > unicode characters need to be represented by more than one string
 > element, and thus that slicing strings containing such characters
 > on a narrow build causes problems.  Now, you could tell me the same
 > information using the terms 'UCS-2' and 'UCS-4' instead of 'narrow'
 > and 'wide', but to my ear 'narrow' and 'wide' convey a better gut
 > level feeling for what is going on than 'UCS-2' and 'UCS-4' do.

I think that is probably conditioned by your long experience with
Python's Unicode features, specifically the knowledge that Python's
Unicode strings are not arrays of characters, which often is referred
to on this list.

My guess is that very few newbies would know that, and it is not
implied by "narrow".  For example, both Emacs (for sure) and Perl
(IIUC) index strings of variable-width character by characters (at
great expense of performance in Emacs, at least), not as code units.

 > And it avoids any question of whether or not Python's internal
 > representation actually conforms to whatever standard it is that
 > UCS refers to, a point on which there seems to be some dissension.

UCS-2 refers to ISO 10646, Annex 1 IIRC.[1]  Anyway, it's somewhere in
ISO 10646.  I don't think there's actually dissension on conformance
to UCS-2, as that's very easy to achieve.  Rather, Guido explicitly
pronounced that Python processes arrays of code units, not
characters.  My point is that if you pretend that Python is processing
*characters* according to UCS-2 rules for characters, you'll always
come to the same conclusion about what Python will do as if you use
the technically correct terminology of code units.  (At least for the
BMP and UTF-16 private areas.  There will necessarily be some
confusion about surrogates, since in UCS-2 they are characters while
in UTF-16 they're merely "code points", and the Unicode characters
they represent can't be represented at all in UCS-2.)

 > Indeed, reading that article with my limited unicode knowledge, if
 > I were told Python used UCS-2, I would assume that non-BMP
 > characters could not be processed by a Python narrow build.

Actually, I'm almost happy with that.

That is, the precise formulation is "could not be processed *safely
without extra care* by a Python narrow build."  Specifically, AFAIK if
you range check characters that have been indexed out of a string, or
are located at slice boundaries, or produced by chr() or a
surrogateescape input codec, you're safe.  But practically speaking
few apps will actually do those checks and therefore they are unsafe:
processing non-BMP characters can easily lead to show-stopping
Exceptions.  It's very analogous to the kind of show-stopping "bad
character in a header" exception that plagued Mailman for so long, and
had to be fixed on a case-by-case basis.  But the restriction to BMP
characters is much more reasonable (at least for now) than RFC 822's
restriction to ASCII!
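
As a sketch of the kind of range check meant here (the function name
find_lone_surrogates is made up for this illustration), the following
reports the indexes of surrogate code units that are not part of a
well-formed high/low pair:

```python
def find_lone_surrogates(text):
    """Return indexes of surrogates not part of a high/low pair."""
    bad = []
    i = 0
    while i < len(text):
        cp = ord(text[i])
        is_high = 0xD800 <= cp <= 0xDBFF
        if is_high and i + 1 < len(text) and 0xDC00 <= ord(text[i + 1]) <= 0xDFFF:
            i += 2  # well-formed pair, as stored on a 16-bit build
            continue
        if 0xD800 <= cp <= 0xDFFF:
            bad.append(i)
        i += 1
    return bad


print(find_lone_surrogates("a\ud800b"))
print(find_lone_surrogates("plain text"))
```

An app would run such a check on strings that were sliced, indexed,
built with chr(), or decoded with a surrogateescape codec before
handing them to a strict output codec.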

But evidently you take it much more stringently.  So the question is,
"what fraction of developers who think as you do would therefore be
put off from using Python to build their applications?"  If most would
say "OK, we'll stick with BMP for now and use UCS-4 or some hack to
deal with extended characters later -- it can't really be true that
it's absolutely impossible to use non-BMP characters," I don't mind
that misunderstanding.

OTOH, yes, it would be bad if the use of "UCS-2" were to imply to more
than a couple of developers that 16-bit builds of Python can't handle
UTF-16 *at all*.

[1]  It simply says "we have a subset of the Unicode character set all
of whose code points can be represented in 16 bits, excluding 0xFFFF."
It goes on to define a private area, reserved for use by applications
that will never be standardized, and it says that if you don't know
what a code point in the character area is, don't change it (you can
delete it, however).  ISTR that a later Amendment added 0xFFFE to the
short-list of non-characters.

The surrogate area was taken out of the private area, so a UCS-2
application will simply consider each surrogate to be an unknown
character and pass it through unchanged -- unless it deletes it, or
inserts other characters between the code points of a surrogate pair.
And that's why UCS-2 isn't UTF-16 conforming -- which is basically why
Python isn't either.

From martin at  Mon Nov 22 09:20:59 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 22 Nov 2010 09:20:59 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

> Unicode 5.0, Chapter 3, verse C9:
>     When a process generates a code unit sequence which purports to be
>     in a Unicode character encoding form, it shall not emit ill-formed
>     code sequences.

>  > > A Unicode-conforming Python implementation would error at the
>  > > chr() call, or perhaps would not provide surrogateescape error
>  > > handlers.
>  > 
>  > Chapter and verse?
> Chapter 3, verse C9 again.

I agree that the surrogateescape error handler is non-conforming, but,
as you say, it doesn't claim to, either (would your concern about utf-8
being misleading here been resolved if the thing had been called

More interestingly (and to the subject) is chr: how did you arrive
at C9 banning Python3's definition of chr? This chr function puts
the code sequence into well-formed UTF-16; that's the whole point of
UTF-16.


From stephen at  Mon Nov 22 11:47:09 2010
From: stephen at (Stephen J. Turnbull)
Date: Mon, 22 Nov 2010 19:47:09 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

"Martin v. Löwis" writes:

 > More interestingly (and to the subject) is chr: how did you arrive
 > at C9 banning Python3's definition of chr? This chr function puts
 > the code sequence into well-formed UTF-16; that's the whole point of
 > UTF-16.

No, it doesn't, in the specific case of surrogate code points.  In
3.1.2 from MacPorts on an iBook G4 and from Gentoo on AMD64,
chr(0xd800) returns "\ud800".

I don't know if that's by design (eg, so that it can be used in the
implementation of the surrogateescape error handler) or a correctable
oversight, but it's not conformant.

From stephen at  Mon Nov 22 11:48:42 2010
From: stephen at (Stephen J. Turnbull)
Date: Mon, 22 Nov 2010 19:48:42 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Raymond Hettinger writes:

 > Neither UTF-16 nor UCS-2 is exactly correct anyway.

From a standards lawyer point of view, UCS-2 is exactly correct, as
far as I can tell upon rereading ISO 10646-1, especially Annexes H
("retransmitting devices") and Q ("UTF-16").  Annex Q makes it clear
that UTF-16 was intentionally designed so that Python-style processing
could be done in a UCS-2 context.

 > For the "wide" build, the entire range of unicode is encoded at
 > 4 bytes per character and slicing/len operate correctly since
 > every character is the same length.   This used to be called UCS-4
 > and is now UTF-32.

That's inaccurate, I believe.  UCS-4 is not a UTF, and doesn't satisfy
the range restrictions of a UTF.

 > So, with "wide" builds there isn't much confusion (except perhaps
 > unfamiliar terminology).   The real issue seems to be that for 
 > "narrow" builds, none of the usual encoding names is exactly
 > correct.  

I disagree.  I do see a problem with "UCS-2", because it fails to tell
us that Python implements a large number of features that make it easy
to do a very good job of working with non-BMP data in 16-bit builds of
Python, with no extra effort.  Python is not perfect, and (rarely)
some of the imperfections may be very distressing.  But it's very
good, and deserves to be advertised as such.

However, I don't see how "narrow" tells us more than "UCS-2" does.  If
"UCS-2" is equally (or more) informative, I prefer it because it is
the technically precise, already well-defined, term.

 > From a users point-of-view, the actual encoding or encoding name 
 > doesn't matter much.  They just need to be able to predict the relevant
 > behaviors (memory consumption and len/slicing behavior).

"UCS-2" indicates those behaviors precisely and concisely.  The
problems are (a) the lack of familiarity of users with this term, if
David is reasonably representative, and (b) the fact that it fails to
advertise Python's UTF-16 capabilities.  "Narrow" suffers from both of
those problems, and further from the fact that it has no independent
standard definition.  Furthermore, "wide" has a very widespread,
platform-dependent meaning derived from wchar_t.

If we have to document what the terms we choose mean anyway, why not
document the existing terms and reduce entropy, rather than invent new
ones and increase entropy?

From martin at  Mon Nov 22 12:22:35 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 22 Nov 2010 12:22:35 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Am 22.11.2010 11:47, schrieb Stephen J. Turnbull:
> "Martin v. Löwis" writes:
>  > More interestingly (and to the subject) is chr: how did you arrive
>  > at C9 banning Python3's definition of chr? This chr function puts
>  > the code sequence into well-formed UTF-16; that's the whole point of
>  > UTF-16.
> No, it doesn't, in the specific case of surrogate code points.  In
> 3.1.2 from MacPorts on an iBook G4 and from Gentoo on AMD64,
> chr(0xd800) returns "\ud800".

Ah, I see - this is *not* the subject's issue, right?

> I don't know if that's by design (eg, so that it can be used in the
> implementation of the surrogateescape error handler) or a correctable
> oversight, but it's not conformant.

I disagree: Quoting from Unicode 5.0, section 5.4:

# The individual components of implementations may have different
# levels of support for surrogates, as long as those components are
# assembled and communicate correctly. Low-level string processing,
# where a Unicode string is not interpreted but is handled simply as an
# array of code units, may ignore surrogate pairs. With such strings,
# for example, a truncation operation with an arbitrary offset might
# break a surrogate pair. (For further discussion, see Section 2.7,
# Unicode Strings.) For performance in string operations, such behavior
# is reasonable at a low level, but it requires higher-level processes
# to ensure that offsets are on character boundaries so as to guarantee
# the integrity of surrogate pairs.

So lower-level routines (which I claim chr() is one) are allowed
to create lone surrogates. The formal requirement behind this is C1:

# A process shall not interpret a high-surrogate code point or a
# low-surrogate code point as an abstract character.

I also claim that Python, in both narrow and wide mode, conforms to
this requirement. Notice that the requirement is a ban on interpreting
the code point as a character. In particular, unicodedata.category
claims that the code point is of class Cs (surrogate), which I consider

By the same line of reasoning, it is also OK that chr() allows the
creation of unassigned code points, even though C2 says that they
must not be interpreted as abstract characters.

The rationale for supporting these characters in chr() goes back much
further than the surrogateescape handler - as Python unicode strings
are sequences of code points, it would be impractical if you couldn't
create some of them, or even would have to consult the UCD before
determining whether they can be created.
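
A minimal sketch of the behaviour described above, assuming a current CPython 3 interpreter (the variable names are only illustrative): chr() hands back the lone surrogate as a one-item string, unicodedata reports it as class Cs rather than as a character, and it is the strict UTF-8 codec that refuses it.

```python
import unicodedata

# chr() does not reject a lone high surrogate; it is just a code point.
lone = chr(0xD800)

# The code point is classified as a surrogate (Cs), not as a character.
category = unicodedata.category(lone)
print(category)

# A strict UTF-8 encoder refuses it, so the check happens at the codec
# boundary rather than inside chr().
try:
    lone.encode("utf-8")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)
```

The design point is that string construction stays cheap and table-free, while the higher-level step (encoding for interchange) enforces well-formedness.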


From martin at  Mon Nov 22 12:43:00 2010
From: martin at (=?windows-1252?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 22 Nov 2010 12:43:00 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 22.11.2010 11:48, Stephen J. Turnbull wrote:
> Raymond Hettinger writes:
>  > Neither UTF-16 nor UCS-2 is exactly correct anyway.
> From a standards lawyer point of view, UCS-2 is exactly correct, as
> far as I can tell upon rereading ISO 10646-1, especially Annexes H
> ("retransmitting devices") and Q ("UTF-16").  Annex Q makes it clear
> that UTF-16 was intentionally designed so that Python-style processing
> could be done in a UCS-2 context.

I could only find the FCD of 10646:2010, where annex H was integrated
into section 10:

There they have stopped using the term UCS-2, and added a note

# NOTE - Former editions of this standard included references to a
# two-octet BMP form called UCS-2 which would be a subset
# of the UTF-16 encoding form restricted to the BMP UCS scalar values.
# The UCS-2 form is deprecated.

I think they are now acknowledging that UCS-2 was a misleading term,
making it ambiguous whether this refers to a CCS, a CEF, or a CES;
like "ASCII", people have been using it for all three of them.

Apparently, the ISO WG interprets earlier revisions as saying that
UCS-2 is a CEF that restricted UTF-16 to the BMP. THIS IS NOT WHAT
PYTHON DOES. In a narrow Python build, the character set is *not*
restricted to the BMP. Instead, Unicode strings are meant to be
interpreted (by applications) as UTF-16.
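
To make the "interpreted as UTF-16" point concrete, here is a rough sketch of how a code point outside the BMP maps onto two 16-bit code units. The helper names to_surrogate_pair and utf16_code_units are hypothetical, chosen for this illustration only; the arithmetic is the standard UTF-16 surrogate formula.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point into UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000          # 20 significant bits
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def utf16_code_units(text):
    """Count the 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


high, low = to_surrogate_pair(0x1D11E)     # MUSICAL SYMBOL G CLEF
print(hex(high), hex(low))
print(utf16_code_units(chr(0x1D11E)))
```

On a build with 16-bit code units, the two values returned by to_surrogate_pair are what ends up stored as two separate string items, which is why len() reports 2 there.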

>  > For the "wide" build, the entire range of unicode is encoded at
>  > 4 bytes per character and slicing/len operate correctly since
>  > every character is the same length.   This used to be called UCS-4
>  > and is now UTF-32.
> That's inaccurate, I believe.  UCS-4 is not a UTF, and doesn't satisfy
> the range restrictions of a UTF.

Not sure what it says in your copy; in mine, section 9.3 says

# 9.3 UTF-32 (UCS-4)
# UTF-32 (or UCS-4) is the UCS encoding form that assigns each UCS
# scalar value to a single unsigned 32-bit code unit. The terms UTF-32
# and UCS-4 can be used interchangeably to designate this encoding
# form.

so they (now) view the two as synonyms.

I think that when ISO 10646 started, they were also fairly confused
about these issues (as the group/plane/row/cell structure demonstrates,
IMO). This is not surprising, since the notion of byte-based character
sets had been ingrained for so long. It took 20 years to learn that
a UCS scalar value really is *not* a sequence of bytes, but a natural
number.

> However, I don't see how "narrow" tells us more than "UCS-2" does.  If
> "UCS-2" is equally (or more) informative, I prefer it because it is
> the technically precise, already well-defined, term.

But it's not. It is a confusing term, one that the relevant standards
bodies are abandoning. After reading FCD 10646:2010, I could agree to
call the two implementations UTF-16 and UTF-32 (as these terms
designate CEFs). Unfortunately, they also designate CESs.

> If we have to document what the terms we choose mean anyway, why not
> document the existing terms and reduce entropy, rather than invent new
> ones and increase entropy?

Because the proposed existing term is deprecated.


From mal at  Mon Nov 22 13:47:29 2010
From: mal at (M.-A. Lemburg)
Date: Mon, 22 Nov 2010 13:47:29 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>


it is really irrelevant whether the standards have decided
to no longer use the terms UCS-2 and UCS-4 in their latest
standard documents.

The definitions still stand (just like Unicode 2.0 is still a valid
standard, even if it's ten years old):

* UCS-2 is defined as "Universal Character Set coded in 2 octets"
by ISO 10646: (see

* UCS-4 is defined as "Universal Character Set coded in 4 octets"
by ISO 10646.

Those two terms have been in use for many years. They refer to
the Unicode character set as it can be represented in 2 or 4
bytes. As such they don't include any of the special meanings
associated with the UTF transfer encodings. There are no invalid
sequences, no invalid code points, etc. as you can find in the UTF
encodings. And that's an important detail.

If you interpret them as encodings, they are 1-1 mappings of
Unicode code point ordinals to integers represented using
2 or 4 bytes.

UCS-2 only supports BMP code points and can conveniently
be interpreted as UTF-16, if you need to encode non-BMP
code points (which we do in the UTF codecs).

UCS-4 also supports non-BMP code points directly.
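
As a rough illustration of the distinction drawn above between plain fixed-width storage and the UTF encodings, the sketch below packs code point ordinals directly into 2 or 4 bytes and compares that with a UTF codec. The helper pack_fixed_width is hypothetical, and the codec behaviour shown assumes a current CPython 3 (3.4 or later), where the utf-32 encoder rejects surrogate code points.

```python
import struct


def pack_fixed_width(code_points, width):
    """Store each ordinal as a little-endian integer of 2 or 4 bytes."""
    fmt = {2: "<H", 4: "<I"}[width]
    return b"".join(struct.pack(fmt, cp) for cp in code_points)


# Plain storage has no notion of an invalid code point: 0xD800 is
# just a number that fits in the unit.
raw = pack_fixed_width([0xD800], 4)
print(raw)

# The UTF-32 codec, by contrast, refuses the same ordinal.
try:
    chr(0xD800).encode("utf-32-le")
    utf32_accepts_surrogate = True
except UnicodeEncodeError:
    utf32_accepts_surrogate = False
print(utf32_accepts_surrogate)
```

With width 2 the same helper raises struct.error for any ordinal above 0xFFFF, which mirrors the statement that a 2-byte unit can only hold BMP code points directly.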

Now, from an ISO or Unicode Consortium point of view, deprecating
the term UCS-2 in *their* standard papers is only natural, since
they are actively starting to assign non-BMP code points which
cannot be represented in UCS-2.

However, this deprecation is only relevant for the purpose of defining
the standard. The above definitions are still useful
when it comes to defining code units, i.e. the used storage format,
(as opposed to the transfer format).

For the purpose of describing the code units we are using in Python
they are (still) the most correct terms and that's also the reason
why we chose to use them when introducing the configure options
in Python2.

There are no other accurate definitions we could use. The terms
"narrow" and "wide" are simply too inaccurate to be used as
description of UCS-2 and UCS-4 code units.

Please also note that we have used the terms UCS-2 and UCS-4 in Python2
for 9+ years now and users are just starting to learn the difference
and get acquainted with the fact that Python uses these two forms.

Confronting them with "narrow" and "wide" builds is only
going to cause more confusion, not less, and adding those
strings to Python package files isn't going to help much either,
since the terms don't convey any relationship to Unicode:


I opt for switching to the following config options:

--with-unicode=ucs2 (default)
--with-unicode=ucs4

and using "UCS-2" and "UCS-4" in the Python documentation when
describing the two different build modes.  We can add glossary
entries for the two which clarify the differences.

Python2 used --enable-unicode=ucs2/ucs4, but since Python3 doesn't
build without Unicode support, the above two versions appear more

We can keep the alternative --with-wide-unicode as an alias
for --with-unicode=ucs4 to maintain 3.x backwards compatibility.

Marc-Andre Lemburg


From foom at  Mon Nov 22 15:18:02 2010
From: foom at (James Y Knight)
Date: Mon, 22 Nov 2010 09:18:02 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>
	<> <>
Message-ID: <>

Why don't y'all just call them "--unichar-width=16/32"? That describes precisely what the options do, and doesn't invite any quibbling over definitions.


From ncoghlan at  Mon Nov 22 16:14:46 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 23 Nov 2010 01:14:46 +1000
Subject: [Python-Dev] [Python-checkins] r86633 - in
 python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst
 Lib/ Lib/test/ Misc/NEWS
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 22, 2010 at 10:54 AM, Éric Araujo <merwok at> wrote:
>> +.. function:: getgeneratorstate(generator)
>> +
>> +    Get current state of a generator-iterator.
>> +
>> +    Possible states are:
>> +      GEN_CREATED: Waiting to start execution.
>> +      GEN_RUNNING: Currently being executed by the interpreter.
>> +      GEN_SUSPENDED: Currently suspended at a yield expression.
>> +      GEN_CLOSED: Execution has completed.
> I wonder if those shouldn't be marked up as :data: or something to make
> them indexed.

The same definitions are in the docstrings, and they're just integer
constants so I'm not sure why anyone would be looking them up
directly. Still, if someone with greater Sphinx-fu thinks additional
markup would be helpful, I have no problem with them adding it :)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From fuzzyman at  Mon Nov 22 16:19:04 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 22 Nov 2010 15:19:04 +0000
Subject: [Python-Dev] [Python-checkins] r86633 - in
 python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst
 Lib/ Lib/test/ Misc/NEWS
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 22/11/2010 15:14, Nick Coghlan wrote:
> On Mon, Nov 22, 2010 at 10:54 AM, Éric Araujo<merwok at>  wrote:
>>> +.. function:: getgeneratorstate(generator)
>>> +
>>> +    Get current state of a generator-iterator.
>>> +
>>> +    Possible states are:
>>> +      GEN_CREATED: Waiting to start execution.
>>> +      GEN_RUNNING: Currently being executed by the interpreter.
>>> +      GEN_SUSPENDED: Currently suspended at a yield expression.
>>> +      GEN_CLOSED: Execution has completed.
>> I wonder if those shouldn't be marked up as :data: or something to make
>> them indexed.
> The same definitions are in the docstrings, and they're just integer
> constants so I'm not sure why anyone would be looking them up
> directly. Still, if someone with greater Sphinx-fu thinks additional
> markup would be helpful, I have no problem with them adding it :)

Why not use string constants instead? You lose comparability (less than 
/ greater than) but gain readability. Comparability may be a requirement 
- of course if Python had an Enum type we could use that and have both.

> Cheers,
> Nick.


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From ncoghlan at  Mon Nov 22 16:37:21 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 23 Nov 2010 01:37:21 +1000
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, Nov 22, 2010 at 10:47 PM, M.-A. Lemburg <mal at> wrote:
> Please also note that we have used the terms UCS-2 and UCS-4 in Python2
> for 9+ years now and users are just starting to learn the difference
> and get acquainted with the fact that Python uses these two forms.
> Confronting them with "narrow" and "wide" builds is only
> going to cause more confusion, not less, and adding those
> strings to Python package files isn't going to help much either,
> since the terms don't convey any relationship to Unicode:

I was personally surprised to learn in this discussion that there had
even been an *attempt* to change the names of the two build variants
to anything other than UCS2/UCS4. The concrete API implementations
certainly still use those two terms to prevent inadvertent linkage
with the wrong version of the C API.

For practical purposes, UCS2/UCS4 convey far more inherent information
than narrow/wide:
- many developers will recognise them as Unicode related, even if they
don't know exactly what they mean
- even those that don't recognise them, can soon learn that they're
Unicode related just by plugging them into Google*
- a bit more digging should reveal that they're Unicode storage
formats closely related to the UTF-16 and UTF-32 transfer encodings

*(The first Google hit for "ucs2" is the UTF-16/UCS-2 article on
Wikipedia, the first hit for "ucs4" is the UTF-32/UCS-4 article)

All that just armed with Google, without even looking at the Python
docs specifically.

So don't just think about "what will developers know?", also think
about "what will developers know, and what will a quick trip to a
search engine tell them?". And once you take that stance, the overly
generic narrow/wide terms fail, badly.

+1 for MAL's suggested tweaks to the Py3k configure options.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From solipsis at  Mon Nov 22 16:37:22 2010
From: solipsis at (Antoine Pitrou)
Date: Mon, 22 Nov 2010 16:37:22 +0100
Subject: [Python-Dev] [Python-checkins] r86633 - in
 python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst
 Lib/ Lib/test/ Misc/NEWS
References: <>
Message-ID: <>

On Mon, 22 Nov 2010 15:19:04 +0000
Michael Foord <fuzzyman at> wrote:

> On 22/11/2010 15:14, Nick Coghlan wrote:
> > On Mon, Nov 22, 2010 at 10:54 AM, Éric Araujo<merwok at>  wrote:
> >>> +.. function:: getgeneratorstate(generator)
> >>> +
> >>> +    Get current state of a generator-iterator.
> >>> +
> >>> +    Possible states are:
> >>> +      GEN_CREATED: Waiting to start execution.
> >>> +      GEN_RUNNING: Currently being executed by the interpreter.
> >>> +      GEN_SUSPENDED: Currently suspended at a yield expression.
> >>> +      GEN_CLOSED: Execution has completed.
> >> I wonder if those shouldn't be marked up as :data: or something to make
> >> them indexed.
> > The same definitions are in the docstrings, and they're just integer
> > constants so I'm not sure why anyone would be looking them up
> > directly. Still, if someone with greater Sphinx-fu thinks additional
> > markup would be helpful, I have no problem with them adding it :)
> >
> Why not use string constants instead? You lose comparability (less than 
> / greater than) but gain readability. Comparability may be a requirement 
> - of course if Python had an Enum type we could use that and have both.

+1.  The problem with int constants is that the int gets printed, not
the name, when you dump them for debugging purposes :)



From ncoghlan at  Mon Nov 22 16:45:28 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 23 Nov 2010 01:45:28 +1000
Subject: [Python-Dev] [Python-checkins] r86633 - in
 python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst
 Lib/ Lib/test/ Misc/NEWS
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 23, 2010 at 1:19 AM, Michael Foord
<fuzzyman at> wrote:
> On 22/11/2010 15:14, Nick Coghlan wrote:
>> On Mon, Nov 22, 2010 at 10:54 AM, Éric Araujo<merwok at>  wrote:
>>>> +    Possible states are:
>>>> +      GEN_CREATED: Waiting to start execution.
>>>> +      GEN_RUNNING: Currently being executed by the interpreter.
>>>> +      GEN_SUSPENDED: Currently suspended at a yield expression.
>>>> +      GEN_CLOSED: Execution has completed.
>>> I wonder if those shouldn't be marked up as :data: or something to make
>>> them indexed.
>> The same definitions are in the docstrings, and they're just integer
>> constants so I'm not sure why anyone would be looking them up
>> directly. Still, if someone with greater Sphinx-fu thinks additional
>> markup would be helpful, I have no problem with them adding it :)
> Why not use string constants instead? You lose comparability (less than /
> greater than) but gain readability. Comparability may be a requirement - of
> course if Python had an Enum type we could use that and have both.

With only 4 states, comparability isn't really necessary. I'm just so
used to using the range() trick as a replacement for the lack of
proper Enum type that using strings instead didn't even occur to me.

The lack of printability did bother me a bit, so yeah, +1 from me as
well (I've reopened the relevant issue to remind me to change it
before beta 1).
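
For reference, a small sketch of what the string-valued states look like in use, assuming a Python 3 release in which inspect.getgeneratorstate() returns the state names as strings (which is how the function eventually shipped). The countdown generator is only an example.

```python
import inspect


def countdown(n):
    """Yield n, n-1, ..., 1."""
    while n:
        yield n
        n -= 1


gen = countdown(1)
states = [inspect.getgeneratorstate(gen)]      # not started yet

next(gen)
states.append(inspect.getgeneratorstate(gen))  # paused at the yield

for _ in gen:                                  # run to completion
    pass
states.append(inspect.getgeneratorstate(gen))

# Printing the list shows readable names rather than bare integers.
print(states)
```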


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From alexander.belopolsky at  Mon Nov 22 17:03:47 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Mon, 22 Nov 2010 11:03:47 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, Nov 22, 2010 at 10:37 AM, Nick Coghlan <ncoghlan at> wrote:
> *(The first Google hit for "ucs2" is the UTF-16/UCS-2 article on
> Wikipedia, the first hit for "ucs4" is the UTF-32/UCS-4 article)

Do you think these articles are helpful for someone learning how to
use chr() and ord() in Python for the first time?

From hrvoje.niksic at  Mon Nov 22 17:08:36 2010
From: hrvoje.niksic at (Hrvoje Niksic)
Date: Mon, 22 Nov 2010 17:08:36 +0100
Subject: [Python-Dev] [Python-checkins] r86633 - in
 python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst
 Lib/ Lib/test/ Misc/NEWS
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

On 11/22/2010 04:37 PM, Antoine Pitrou wrote:
> +1.  The problem with int constants is that the int gets printed, not
> the name, when you dump them for debugging purposes :)

Well, it's trivial to subclass int to something with a nicer __repr__. 
PyGTK uses that technique for wrapping C enums:

 >>> import gtk
 >>> gtk.PREVIEW_GRAYSCALE
 <enum GTK_PREVIEW_GRAYSCALE of type GtkPreviewType>
 >>> isinstance(gtk.PREVIEW_GRAYSCALE, int)
 True
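
The same technique can be sketched in a few lines of pure Python. This is not PyGTK's implementation; the class name NamedInt and the repr format are made up for illustration, and the constant names are borrowed from the inspect discussion.

```python
class NamedInt(int):
    """An int that remembers a symbolic name and shows it in repr()."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        return "<constant %s = %d>" % (self._name, int(self))


# The classic range() trick, but with readable constants.
GEN_CREATED, GEN_RUNNING = (
    NamedInt(name, value)
    for value, name in enumerate(["GEN_CREATED", "GEN_RUNNING"])
)

print(repr(GEN_CREATED))
print(GEN_CREATED < GEN_RUNNING)   # still comparable like plain ints
```

The trade-off is that arithmetic on such a value returns a plain int, so the name is lost as soon as the constant is combined with anything.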

From ncoghlan at  Mon Nov 22 17:13:39 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 23 Nov 2010 02:13:39 +1000
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Tue, Nov 23, 2010 at 2:03 AM, Alexander Belopolsky
<alexander.belopolsky at> wrote:
> On Mon, Nov 22, 2010 at 10:37 AM, Nick Coghlan <ncoghlan at> wrote:
> ..
>> *(The first Google hit for "ucs2" is the UTF-16/UCS-2 article on
>> Wikipedia, the first hit for "ucs4" is the UTF-32/UCS-4 article)
> Do you think these articles are helpful for someone learning how to
> use chr() and ord() in Python for the first time?

No, that's what the documentation of chr() and ord() is for. For that
use case, it doesn't matter *what* the terms are. They could say "in a
FOO build this will do X, in a BAR build it will do Y, see <link> for
a detailed explanation of the differences between FOO and BAR builds
of Python" and be perfectly adequate for the task. If there is no
appropriate documentation link to point to (probably somewhere in the
C API docs if it isn't anywhere else) then that is a key issue that
needs to be fixed, rather than trying to change the terms that have
been in use for the better part of a decade already.

The raw meaning of UCS2/UCS4 mainly comes into the story when people
are encountering this as a config option when building Python. The
whole idea of changing the terms for the two build types *should* have
been short circuited by the "status quo wins a stalemate" guideline,
but apparently that didn't happen at the time.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From solipsis at  Mon Nov 22 17:24:40 2010
From: solipsis at (Antoine Pitrou)
Date: Mon, 22 Nov 2010 17:24:40 +0100
Subject: [Python-Dev] [Python-checkins] r86633 - in
 python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst
 Lib/ Lib/test/ Misc/NEWS
References: <>
	<> <>
Message-ID: <>

On Mon, 22 Nov 2010 17:08:36 +0100
Hrvoje Niksic <hrvoje.niksic at> wrote:
> On 11/22/2010 04:37 PM, Antoine Pitrou wrote:
> > +1.  The problem with int constants is that the int gets printed, not
> > the name, when you dump them for debugging purposes :)
> Well, it's trivial to subclass int to something with a nicer __repr__. 
> PyGTK uses that technique for wrapping C enums:

Nice. It might be useful to add a private _Constant class somewhere for
stdlib purposes.



From guido at  Mon Nov 22 17:33:57 2010
From: guido at (Guido van Rossum)
Date: Mon, 22 Nov 2010 08:33:57 -0800
Subject: [Python-Dev] is this a bug? no environment variables
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 21, 2010 at 9:40 PM, Glenn Linderman <v+python at> wrote:
> In reviewing my notes from my experimentations with CGIHTTPServer
> (Python2.6) and then http.server (Python 3.2a4), I note one behavior I
> haven't reported as a bug, nor do I know where to start to figure it out,
> other than experimentally.
> The experiment: launching CGIHTTPServer without environment variables, by
> the simple expedient of using a batch file to unset all the existing
> environment variables, and then launching Python2.6 with CGIHTTPServer.
> So it failed early: fails at line 110 (Python 2.6).

What specific traceback do you get? In my copy of the code that line says

                a = long(_hexlify(_urandom(16)), 16)

and I could just imagine that _urandom() fails for some reason to do
with the environment (it is a reference to os.urandom()), which, being
part of the C library code, might depend on the environment.

But you're not giving enough info to debug this.

> I suppose it is possible that some environment variables are used by Python
> directly (but I can't seem to find a documented list of them) although I
> would expect that usage to be optional, with fall-back defaults when they
> don't exist.

That is certainly the idea, but the fallbacks may not always be nice.

Environment variables used by Python or the stdlib itself are supposed
to be named PYTHON<whatever> if they are Python-specific, and there's
a way to disable all of these (-E). But there are other environment
variables (HOME and PATH come to mind) that have a broader definition
and that are used in some part of the stdlib. Plus, as I mentioned,
who knows what the non-Python C library uses (well, somebody probably
knows, but I don't know of a central source that we can actually trust
across the many platforms where Python runs).
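
One way to gather the missing information is to reproduce the empty-environment launch from Python itself and capture the traceback. The sketch below is only that, a sketch: the helper name run_without_environment is hypothetical, it uses subprocess.run (so it needs Python 3.5 or later rather than the versions discussed here), and whether the child succeeds is platform-dependent, which is exactly what it is meant to reveal.

```python
import subprocess
import sys


def run_without_environment(code):
    """Run a snippet in a child interpreter that inherits no variables."""
    return subprocess.run(
        [sys.executable, "-E", "-c", code],   # -E: ignore PYTHON* variables
        env={},                               # pass an empty environment
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


# os.urandom() is the call suspected above of depending on the environment.
result = run_without_environment("import os; print(len(os.urandom(16)))")
print(result.returncode)
print(result.stderr)   # the traceback, if the child failed
```

Adding variables back to the env dict one at a time then narrows down the minimum set a given platform needs.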

> I suppose it is even possible that some Windows APIs might
> depend on some environment variables, but I expected that the registry had
> replaced such usage completely, by now, with the environment variables
> mostly being a convenience tool for batch files, or for optional, temporary
> alteration of particular settings.

That sounds like wishful thinking. :-)

> If anyone knows of documentation listing what environment variables are
> required by Python on Windows, I would appreciate a pointer, searches and
> doc browsing having not turned it up.
> I'll attempt to recreate the test situation later this week with Python
> 3.2a4, if no one responds, but the only debug technique I can think of is to
> slowly remove environment variables until I find the minimum set required to
> run http.server successfully for my tests with CGI files.

--Guido van Rossum (

From fuzzyman at  Mon Nov 22 17:58:56 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 22 Nov 2010 16:58:56 +0000
Subject: [Python-Dev] [Python-checkins] r86633 - in
 python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst
 Lib/ Lib/test/ Misc/NEWS
In-Reply-To: <>
References: <>	<>	<>	<>	<>
	<> <>
Message-ID: <>

On 22/11/2010 16:24, Antoine Pitrou wrote:
> On Mon, 22 Nov 2010 17:08:36 +0100
> Hrvoje Niksic<hrvoje.niksic at>  wrote:
>> On 11/22/2010 04:37 PM, Antoine Pitrou wrote:
>>> +1.  The problem with int constants is that the int gets printed, not
>>> the name, when you dump them for debugging purposes :)
>> Well, it's trivial to subclass int to something with a nicer __repr__.
>> PyGTK uses that technique for wrapping C enums:
> Nice. It might be useful to add a private _Constant class somewhere for
> stdlib purposes.
Why not just solve the problem properly and add it to the standard 
library... (Allowing for flag enums too that can be or'd together and 
still have a decent repr.)


> Regards
> Antoine.



From alexander.belopolsky at  Mon Nov 22 18:00:14 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Mon, 22 Nov 2010 12:00:14 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, Nov 22, 2010 at 11:13 AM, Nick Coghlan <ncoghlan at> wrote:
>> Do you think these articles are helpful for someone learning how to
>> use chr() and ord() in Python for the first time?
> No, that's what the documentation of chr() and ord() is for. For that
> use case, it doesn't matter *what* the terms are.

I recently updated the chr() and ord() documentation and used
"narrow/wide" terms.  I thought UCS2/4 proponents objected to that on
the basis that these terms are imprecise.

> They could say "in a
> FOO build this will do X, in a BAR build it will do Y, see <link> for
> a detailed explanation of the differences between FOO and BAR builds
> of Python" and be perfectly adequate for the task. If there is no
> appropriate documentation link to point to (probably somewhere in the
> C API docs if it isn't anywhere else) then that is a key issue that
> needs to be fixed, rather than trying to change the terms that have
> been in use for the better part of a decade already.

That's the point that I was trying to make.  Using somewhat vague
narrow/wide terms gives us an opportunity to describe exactly what is
going on without confusing the reader with the intricacies of the
Unicode Standard or Python's compliance with a particular version of

> The raw meaning of UCS2/UCS4 mainly comes into the story when people
> are encountering this as a config option when building Python. The
> whole idea of changing the terms for the two build types *should* have
> been short circuited by the "status quo wins a stalemate" guideline,
> but apparently that didn't happen at the time.

It also comes in the "Data model" reference section on String which is
currently out of date:

The items of a string object are Unicode code units. A Unicode code
unit is represented by a string object of one item and can hold either
a 16-bit or 32-bit value representing a Unicode ordinal (the maximum
value for the ordinal is given in sys.maxunicode, and depends on how
Python is configured at compile time). Surrogate pairs may be present
in the Unicode object, and will be reported as two separate items. The
built-in functions chr() and ord() convert between code units and
nonnegative integers representing the Unicode ordinals as defined in
the Unicode Standard 3.0. Conversion from and to other encodings are
possible through the string method encode().

The out of date part is the reference to the Unicode Standard 3.0.  I
don't think we should refer to a specific version of Unicode here.  It
has little consequence for the "Python data model" and AFAICT does not
come into play anywhere except unicodedata which is currently at
version 6.0.

The description of chr() and ord() is also not accurate on narrow
builds and neither is the statement "The items of a string object are
Unicode code units."
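
A short sketch of how the build difference shows up from Python code, using only sys.maxunicode as quoted from the data model text. The helper items_for is hypothetical. On interpreters from Python 3.3 onwards sys.maxunicode is always 0x10FFFF, so the two-item branch only applies to the older 16-bit builds discussed in this thread.

```python
import sys


def items_for(code_point):
    """Number of string items chr(code_point) is expected to occupy."""
    # Above sys.maxunicode the value has to be stored as a surrogate pair.
    return 1 if code_point <= sys.maxunicode else 2


s = chr(0x10000)            # first code point outside the BMP
print(hex(sys.maxunicode))
print(len(s), items_for(0x10000))
```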

From exarkun at  Mon Nov 22 17:46:54 2010
From: exarkun at (exarkun at
Date: Mon, 22 Nov 2010 16:46:54 -0000
Subject: [Python-Dev] [Python-checkins] r86633 - in
	python/branches/py3k:	Doc/library/inspect.rst
	Doc/whatsnew/3.2.rst Lib/	Lib/test/
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>

On 04:24 pm, solipsis at wrote:
>On Mon, 22 Nov 2010 17:08:36 +0100
>Hrvoje Niksic <hrvoje.niksic at> wrote:
>>On 11/22/2010 04:37 PM, Antoine Pitrou wrote:
>> > +1.  The problem with int constants is that the int gets printed, not
>> > the name, when you dump them for debugging purposes :)
>>Well, it's trivial to subclass int to something with a nicer __repr__.
>>PyGTK uses that technique for wrapping C enums:
>Nice. It might be useful to add a private _Constant class somewhere for
>stdlib purposes.
>Python-Dev mailing list
>Python-Dev at
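
A minimal sketch of the int-subclass technique mentioned above. This is
not the PyGTK code and not a stdlib class; the names NamedInt and
EXAMPLE_FLAG are hypothetical and chosen only for illustration.

```python
class NamedInt(int):
    """An int that remembers a symbolic name for debugging output."""

    def __new__(cls, name, value):
        # int is immutable, so the value must be set in __new__.
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        return "<%s: %d>" % (self._name, int(self))


# Hypothetical constant; it still behaves like the plain int 4.
EXAMPLE_FLAG = NamedInt("EXAMPLE_FLAG", 0x04)
print(repr(EXAMPLE_FLAG))
print(EXAMPLE_FLAG | 0x01)
```

Arithmetic on such a constant returns a plain int, so the name is only
kept on the constant itself, which is probably what debugging needs.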

From ezio.melotti at  Mon Nov 22 18:14:03 2010
From: ezio.melotti at (Ezio Melotti)
Date: Mon, 22 Nov 2010 19:14:03 +0200
Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest
Message-ID: <>

I would like to re-enable by default warnings for regrtest and/or unittest.

The reasons are:
   1) these tools are used mainly by developers and they (should) care 
about warnings;
   2) developers won't have to remember that warnings are silenced and 
how to enable them manually;
   3) developers won't have to enable them manually every time they run 
the tests;
   4) some developers are not even aware that warnings have been 
silenced and might not notice things like DeprecationWarnings until the 
function/method/class/etc gets removed and breaks their code;
   5) another developer tool -- the --with-pydebug flag -- already 
re-enables warnings when it's used;

If this is fixed in unittest it won't be necessary to patch regrtest.
If it's fixed in regrtest only the core developers will benefit from this.

This could be fixed checking if any warning flags (-Wx) are passed to 
If no flags are passed the default will be -Wd, otherwise the behavior 
will be the one specified by the flag.
This will allow developers to use `python -Wi` to ignore warnings explicitly.

Best Regards,
Ezio Melotti
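
A minimal sketch of the check proposed above, assuming it is done by
looking at sys.warnoptions (the list of -W options given on the command
line). The function name enable_default_warnings is hypothetical; this is
not the patch that went into regrtest or unittest.

```python
import sys
import warnings


def enable_default_warnings(warnoptions=None):
    """Install the 'default' filter unless the user passed -W flags.

    Returns True if the filter was installed, False otherwise.
    """
    if warnoptions is None:
        warnoptions = sys.warnoptions
    if warnoptions:
        # The user asked for something specific (e.g. -Wi): respect it.
        return False
    warnings.simplefilter("default")  # same effect as running with -Wd
    return True
```

A test runner could call this once at startup; running `python -Wi`
would then leave the explicit choice untouched.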

From rdmurray at  Mon Nov 22 18:30:29 2010
From: rdmurray at (R. David Murray)
Date: Mon, 22 Nov 2010 12:30:29 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, 22 Nov 2010 12:00:14 -0500, Alexander Belopolsky <alexander.belopolsky at> wrote:
> I recently updated  chr() and ord()  documentation and used
> "narrow/wide" terms.  I thought USC2/4 proponents objected to that on
> the basis that these terms are imprecise.

For reference, a grep in py3k/Doc reveals that there are currently exactly
23 lines mentioning UCS2 or UCS4 in the docs.  Most are in the unicode part
of the c-api, and 6 are in what's new for 2.2:

c-api/arg.rst:      Convert a null-terminated buffer of Unicode (UCS-2 or UCS-4) data to a Python
c-api/arg.rst:      Convert a Unicode (UCS-2 or UCS-4) data buffer and its length to a Python

c-api/unicode.rst:   for :c:type:`Py_UNICODE` and store Unicode values internally as UCS2. It is also
c-api/unicode.rst:   possible to build a UCS4 version of Python (most recent Linux distributions come
c-api/unicode.rst:   with UCS4 builds of Python). These builds then use a 32-bit type for
c-api/unicode.rst:   :c:type:`Py_UNICODE` and store Unicode data internally as UCS4. On platforms
c-api/unicode.rst:   short` (UCS2) or :c:type:`unsigned long` (UCS4).
c-api/unicode.rst:Note that UCS2 and UCS4 Python builds are not binary compatible. Please keep
c-api/unicode.rst:   values is interpreted as an UCS-2 character.

whatsnew/2.2.rst:usually stored as UCS-2, as 16-bit unsigned integers. Python 2.2 can also be
whatsnew/2.2.rst:compiled to use UCS-4, 32-bit unsigned integers, as its internal encoding by
whatsnew/2.2.rst:supplying :option:`--enable-unicode=ucs4` to the configure script.   (It's also
whatsnew/2.2.rst:When built to use UCS-4 (a "wide Python"), the interpreter can natively handle
whatsnew/2.2.rst:compiled to use UCS-2 (a "narrow Python"), values greater than 65535 will still
whatsnew/2.2.rst:Marc-André Lemburg.  The changes to support using UCS-4 internally were

howto/unicode.rst:.. comment Additional topic: building Python w/ UCS2 or UCS4 support
howto/unicode.rst:           - [ ] Building Python (UCS2, UCS4)

library/sys.rst:   characters are stored as UCS-2 or UCS-4.

library/json.rst:   specified.  Encodings that are not ASCII based (such as UCS-2) are not

faq/extending.rst:When importing module X, why do I get "undefined symbol: PyUnicodeUCS2*"?
faq/extending.rst:If instead the name of the undefined symbol starts with ``PyUnicodeUCS4``, the
faq/extending.rst:   ...     print('UCS4 build')
faq/extending.rst:   ...     print('UCS2 build')

R. David Murray                            

From lukasz at  Mon Nov 22 18:35:16 2010
From: lukasz at (=?UTF-8?B?xYF1a2FzeiBMYW5nYQ==?=)
Date: Mon, 22 Nov 2010 18:35:16 +0100
Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest
In-Reply-To: <>
References: <>
Message-ID: <>

Am 22.11.2010 18:14, schrieb Ezio Melotti:
> I would like to re-enable by default warnings for regrtest and/or 
> unittest.


+1

Especially in regrtest it could help manage stdlib quality (currently we 
have a horde of ResourceWarnings, zipfile mostly). I would even be +1 on 
making warnings errors for regrtest but that seems to be unpopular on 
#python-dev.

Best regards,
Łukasz Langa

From alexander.belopolsky at  Mon Nov 22 18:37:59 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Mon, 22 Nov 2010 12:37:59 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, Nov 22, 2010 at 12:30 PM, R. David Murray <rdmurray at> wrote:
> For reference, a grep in py3k/Doc reveals that there are currently exactly
> 23 lines mentioning UCS2 or UCS4 in the docs.

Did you grep for USC-2 and USC-4 as well?  I have to admit that my
aversion to these terms is mostly due to the fact that I don't know
how to spell them correctly. :-)

From tjreedy at  Mon Nov 22 18:41:46 2010
From: tjreedy at (Terry Reedy)
Date: Mon, 22 Nov 2010 12:41:46 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>
	<>	<>	<>	<>	<>	<>
Message-ID: <icea0n$g9v$>

On 11/22/2010 5:48 AM, Stephen J. Turnbull wrote:

> I disagree.  I do see a problem with "UCS-2", because it fails to tell
> us that Python implements a large number of features that make it easy
> to do a very good job of working with non-BMP data in 16-bit builds of

Yes. As I read the standard, UCS-2 is limited to BMP chars. So I was a 
bit confused when Python was described as UCS-2, until I realized that 
the term was inaccurate. Using that term punishes people like me who 
take the time to read the standard or otherwise learn what the term means.

What Python does might be called USC-2+ or UCS-2e (xtended).

Terry Jan Reedy

From fuzzyman at  Mon Nov 22 18:45:58 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 22 Nov 2010 17:45:58 +0000
Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 22/11/2010 17:35, Łukasz Langa wrote:
> Am 22.11.2010 18:14, schrieb Ezio Melotti:
>> I would like to re-enable by default warnings for regrtest and/or 
>> unittest.
> +1
> Especially in regrtest it could help manage stdlib quality (currently 
> we have a horde of ResourceWarnings, zipfile mostly). I would even be 
> +1 on making warnings errors for regrtest but that seems to be 
> unpopular on #python-dev.

Enabling it for regrtest makes sense. For unittest I still think it is a 
choice that should be left to developers.


> Best regards,
> Łukasz Langa


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From raymond.hettinger at  Mon Nov 22 19:13:30 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Mon, 22 Nov 2010 10:13:30 -0800
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Nov 22, 2010, at 2:48 AM, Stephen J. Turnbull wrote:

> Raymond Hettinger writes:
>> Neither UTF-16 nor UCS-2 is exactly correct anyway.
> From a standards lawyer point of view, UCS-2 is exactly correct, 

You're twisting yourself into definitional knots.

Any explanation we give users needs to let them know two things:
* that we cover the entire range of unicode not just BMP
* that sometimes len(chr(i)) is one and sometimes two

The term UCS-2 is a complete communications failure
in that regard.  If someone looks up the term, they will
immediately see something like the wikipedia entry which says,
"UCS-2 cannot represent code points outside the BMP".
How is that helpful?



From raymond.hettinger at  Mon Nov 22 19:29:33 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Mon, 22 Nov 2010 10:29:33 -0800
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <icea0n$g9v$>
References: <>	<>	<>
	<>	<>	<>	<>	<>	<>
Message-ID: <>

On Nov 22, 2010, at 9:41 AM, Terry Reedy wrote:

> On 11/22/2010 5:48 AM, Stephen J. Turnbull wrote:
>> I disagree.  I do see a problem with "UCS-2", because it fails to tell
>> us that Python implements a large number of features that make it easy
>> to do a very good job of working with non-BMP data in 16-bit builds of
> Yes. As I read the standard, UCS-2 is limited to BMP chars. So I was a bit confused when Python was described as UCS-2, until I realized that the term was inaccurate. Using that term punishes people like me who take the time to read the standard or otherwise learn what the term means.


Thanks for the excellent summary of the problem.

> What Python does might be called USC-2+ or UCS-2e (xtended).

That would be a step in the right direction.


From jcea at  Mon Nov 22 19:34:49 2010
From: jcea at (Jesus Cea)
Date: Mon, 22 Nov 2010 19:34:49 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
Message-ID: <>


A Solaris installation ALWAYS contains both 32-bit and 64-bit libraries, so
on any Solaris system you can run 32-bit and 64-bit programs, and compile
for either.

To allow this, the 32-bit libraries are stored in "/usr/lib", for instance,
while the same libraries in 64-bit form are stored in "/usr/lib/64".

Currently, Python does not take this into account.

We have Solaris 10 buildslaves, but they compile in 32 bits, apparently.
For instance

We now have 32 and 64 bits OpenIndiana buildslaves, so we can actually
check this. They were deployed yesterday.

Apparently the changes would be pretty simple, adding ".../64" to
library paths, to try to find the extra libraries.

What do you think?

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From mal at  Mon Nov 22 19:53:00 2010
From: mal at (M.-A. Lemburg)
Date: Mon, 22 Nov 2010 19:53:00 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>
	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Raymond Hettinger wrote:
> Any explanation we give users needs to let them know two things:
> * that we cover the entire range of unicode not just BMP
> * that sometimes len(chr(i)) is one and sometimes two
> The term UCS-2 is a complete communications failure
> in that regard.  If someone looks up the term, they will
> immediately see something like the wikipedia entry which says,
> "UCS-2 cannot represent code points outside the BMP".
> How is that helpful?

It's very helpful, since it explains why a UCS-2 build of Python
requires a surrogate pair to represent a non-BMP code point
and explains why chr(i) gives you a length 2 string rather than
a length 1 string.

A UCS-4 build does not need to use surrogates for this, hence
you get a length 1 string from chr(i).

There are two levels we have to explain to users:

1. the transfer level

2. the storage level

The UTF encodings address the transfer level and are what
you deal with in I/O. These provide variable length encodings of
the complete Unicode code point range, regardless of whether
you have a UCS-2 or a UCS-4 build.

The storage level becomes important if you want to work on
strings using indexing and slicing. Here you do have to know
whether you're dealing with a UCS-2 or a UCS-4 build, since the
indexes will vary if you're using non-BMP code points.

Finally, to tie both together, we have to explain that UTF-16
(the transfer encoding) maps to UCS-2 in a straight-forward way,
so it is possible to work with a UCS-2 build of Python and still
use the complete Unicode code point range - you only have to
take into consideration that Python's string indexing will not
necessarily point you to the n-th code point in a string, but may
well give you half of a surrogate pair.

Note that while that last aspect may appear like a good argument
for UCS-4 builds, in reality it is not. UCS-4 has the same
issue on a different level: the letters that get printed on
the screen or printer (graphemes) may well be made up of
multiple combining code points, e.g. an "e" and an "´".
Those again map to two indexes in the Python string, even
though they appear to be one character on output.

Now try to explain all of the above using the terms "narrow"
and "wide" (while remembering "explicit is better than implicit"
and "avoid the temptation to guess") :-)

It is not really helpful to replace a correct and accurate
term with a fuzzy term: either way we're stuck with the

However, the correct and accurate terms at least give
you a chance to figure out and understand the reasoning
behind the design. UCS-2 vs. UCS-4 is a trade-off, "narrow"
and "wide" is marketing talk with an implicit emphasis on
one side :-)

Marc-Andre Lemburg
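
A minimal sketch of the two effects described above, added for
illustration and not taken from the original message. The function name
to_surrogate_pair is an assumption; the arithmetic is the standard UTF-16
mapping. The second half shows that a base letter plus a combining accent
occupies two indexes even when each code point fits in one item.

```python
import unicodedata


def to_surrogate_pair(code_point):
    """Split a non-BMP code point into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is inside the BMP or out of range")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


high, low = to_surrogate_pair(0x1D11E)  # MUSICAL SYMBOL G CLEF
print(hex(high), hex(low))

# "e" followed by COMBINING ACUTE ACCENT: one grapheme, two code points.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), len(composed))
```

Normalizing to NFC merges this particular pair into one code point, but
not every combining sequence has a precomposed form, so indexing by
code point never guarantees indexing by visible character.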


From ezio.melotti at  Mon Nov 22 19:58:33 2010
From: ezio.melotti at (Ezio Melotti)
Date: Mon, 22 Nov 2010 20:58:33 +0200
Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 22/11/2010 19.45, Michael Foord wrote:
> On 22/11/2010 17:35, Łukasz Langa wrote:
>> Am 22.11.2010 18:14, schrieb Ezio Melotti:
>>> I would like to re-enable by default warnings for regrtest and/or 
>>> unittest.
>> +1
>> Especially in regrtest it could help manage stdlib quality (currently 
>> we have a horde of ResourceWarnings, zipfile mostly). I would even be 
>> +1 on making warnings errors for regrtest but that seems to be 
>> unpopular on #python-dev.

As I said on IRC, I think it makes sense to turn them into errors once we 
have fixed/silenced all the ones that we have now. That would help keep 
the number of warnings at 0.

> Enabling it for regrtest makes sense. For unittest I still think it is 
> a choice that should be left to developers.

If we consider that most of the developers want to see them, I'd prefer 
to have the warnings by default rather than having to use -Wd explicitly 
every time I run the tests (keep in mind that many developers out there 
don't even know/remember that now they should use -Wd).

> Michael
>> Best regards,
>> Łukasz Langa

From alexander.belopolsky at  Mon Nov 22 20:09:14 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Mon, 22 Nov 2010 14:09:14 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <icea0n$g9v$>
References: <>
	<> <>
Message-ID: <>

On Mon, Nov 22, 2010 at 12:41 PM, Terry Reedy <tjreedy at> wrote:
> What Python does might be called USC-2+ or UCS-2e (xtended).

Wow!  I am not the only one who can't get the order of letters right
in these acronyms.  (I am usually consistent within one sentence,
though.) :-)

I-can't-spell-three-letter-acronyms-right-ly yours ...

From brett at  Mon Nov 22 20:12:26 2010
From: brett at (Brett Cannon)
Date: Mon, 22 Nov 2010 11:12:26 -0800
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 22, 2010 at 10:34, Jesus Cea <jcea at> wrote:
> A Solaris installation ALWAYS contains both 32-bit and 64-bit libraries, so
> on any Solaris system you can run 32-bit and 64-bit programs, and compile
> for either.
> To allow this, the 32-bit libraries are stored in "/usr/lib", for instance,
> while the same libraries in 64-bit form are stored in "/usr/lib/64".
> Currently, Python does not take this into account.
> We have Solaris 10 buildslaves, but they compile in 32 bits, apparently.
> For instance
> <>.
> We now have 32 and 64 bits OpenIndiana buildslaves, so we can actually
> check this. They were deployed yesterday.
> Apparently the changes would be pretty simple, adding ".../64" to
> library paths, to try to find the extra libraries.
> What do you think?

Are you asking about buildbots only or as a general policy? If you are
asking about the buildbots then I definitely think we should use 64
bits. If you are asking about policy I would say it should be an
option in case people are using C extensions that are not designed to
work with 64 bits.

From brett at  Mon Nov 22 20:24:34 2010
From: brett at (Brett Cannon)
Date: Mon, 22 Nov 2010 11:24:34 -0800
Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest
In-Reply-To: <>
References: <> <>
	<> <>
Message-ID: <>

On Mon, Nov 22, 2010 at 10:58, Ezio Melotti <ezio.melotti at> wrote:
> On 22/11/2010 19.45, Michael Foord wrote:
>> On 22/11/2010 17:35, Łukasz Langa wrote:
>>> Am 22.11.2010 18:14, schrieb Ezio Melotti:
>>>> I would like to re-enable by default warnings for regrtest and/or
>>>> unittest.
>>> +1
>>> Especially in regrtest it could help manage stdlib quality (currently we
>>> have a horde of ResourceWarnings, zipfile mostly). I would even be +1 on
>>> making warnings errors for regrtest but that seems to be unpopular on
>>> #python-dev.
> As I said on IRC, I think it makes sense to turn them into errors once we
> have fixed/silenced all the ones that we have now. That would help keep the
> number of warnings at 0.

I agree.

>> Enabling it for regrtest makes sense. For unittest I still think it is a
>> choice that should be left to developers.
> If we consider that most of the developers want to see them, I'd prefer to
> have the warnings by default rather than having to use -Wd explicitly every
> time I run the tests (keep in mind that many developers out there don't even
> know/remember that now they should use -Wd).

The problem with that is it means developers who switch to Python 3.2
or whatever are suddenly going to have their tests fail until they
update their code to turn the warnings off. Then again, if we make the
switch for this dead simple to add and backwards-compatible so that
turning them off doesn't trigger an error in older versions then I am
all for turning warnings on by default.

Another approach is to have unittest's runner, when run in verbose
mode, print out what the warnings filter is set to so developers are
aware that they are silencing warnings.
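
A minimal sketch of that second idea, assuming a verbose runner would
simply list the active filters. The function name
describe_warning_filters is hypothetical and nothing like it exists in
unittest; it only reads the public warnings.filters list.

```python
import warnings


def describe_warning_filters():
    """Return one 'action::Category' string per active warnings filter."""
    lines = []
    # Each entry is (action, message, category, module, lineno).
    for action, _message, category, _module, _lineno in warnings.filters:
        lines.append("%s::%s" % (action, category.__name__))
    return lines


for line in describe_warning_filters():
    print(line)
```

A developer seeing a line such as "ignore::DeprecationWarning" in the
verbose output would at least know that something is being silenced.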


>> Michael
>>> Best regards,
>>> Łukasz Langa

From jcea at  Mon Nov 22 20:26:40 2010
From: jcea at (Jesus Cea)
Date: Mon, 22 Nov 2010 20:26:40 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>
Message-ID: <>


On 22/11/10 20:12, Brett Cannon wrote:
> Are you asking about buildbots only or as a general policy? If you are
> asking about the buildbots then I definitely think we should use 64
> bits. If you are asking about policy I would say it should be an
> option in case people are using C extensions that are not designed to
> work with 64 bits.

The point is that building Python in 64 bits under Solaris (family) is
not easy, because the 64-bit libraries (zlib, openssl, berkeley db,
curses, etc.) are not in "/usr/lib", "/usr/local/lib", etc.,
but in "/usr/lib/64", "/usr/local/lib/64", etc.

Solaris overcomes most of the issue by having separate library search paths
for 32 and 64 bits (via the "crle" command). But in some cases Python tries
to find a library in "/usr/local/lib", and my point is that it should
ALSO search inside "/usr/local/lib/64".
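
A minimal sketch of the search order being proposed, for illustration
only. The function name candidate_lib_dirs is hypothetical and this is
not code from Python's build scripts; it just lists the "64" variant of
each directory ahead of the plain one when a 64-bit build is wanted.

```python
def candidate_lib_dirs(base_dirs, want_64bit):
    """Return library directories to search, in order.

    For a 64-bit build each base directory is preceded by its "64"
    subdirectory, following the Solaris layout described above.
    """
    dirs = []
    for base in base_dirs:
        if want_64bit:
            dirs.append(base.rstrip("/") + "/64")
        dirs.append(base)
    return dirs


print(candidate_lib_dirs(["/usr/lib", "/usr/local/lib"], want_64bit=True))
```

Whether it is safe to keep the plain directory as a fallback in a 64-bit
build is a separate question that this sketch does not settle.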

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From barry at  Mon Nov 22 20:28:43 2010
From: barry at (Barry Warsaw)
Date: Mon, 22 Nov 2010 14:28:43 -0500
Subject: [Python-Dev] issue 9807 - abiflags in paths and symlinks
	(updated patch)
In-Reply-To: <20101110162719.11ae7fe6@mission>
References: <20101110162719.11ae7fe6@mission>
Message-ID: <20101122142843.45ae45ae@mission>

On Nov 10, 2010, at 04:27 PM, Barry Warsaw wrote:

>I finally found a chance to address all the outstanding technical issues
>mentioned in bug 9807:
>I've uploaded a new patch which contains the rest of the changes I'm
>proposing.  I think we still need consensus about whether these changes are
>good to commit.  With 3.2b1 coming soon, now's the time to do that.
>If there are any remaining concerns about the details of the patch, please add
>them to the tracker issue.  If you have any remaining objections to the
>change, please let me know or follow up here.

The patch has now been updated to address the last few comments in the tracker
issue.  I am now ready to commit it to py3k.  If there are any remaining
objections or concerns, please reply here or update the tracker issue.
Otherwise, I plan to commit this to py3k on Wednesday.


From martin at  Mon Nov 22 20:42:16 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 22 Nov 2010 20:42:16 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>	<>
Message-ID: <>

> Solaris overcomes most of the issue by having separate library search paths
> for 32 and 64 bits (via the "crle" command). But in some cases Python tries
> to find a library in "/usr/local/lib", and my point is that it should
> ALSO search inside "/usr/local/lib/64".

I don't think this will work. If the linker finds a library of the wrong
ELF type, then it will choke.

Before enabling anything on a build slave, a patch needs to be
contributed to make it work in the first place.


From rdmurray at  Mon Nov 22 20:50:14 2010
From: rdmurray at (R. David Murray)
Date: Mon, 22 Nov 2010 14:50:14 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, 22 Nov 2010 12:37:59 -0500, Alexander Belopolsky <alexander.belopolsky at> wrote:
> On Mon, Nov 22, 2010 at 12:30 PM, R. David Murray <rdmurray at> wrote:
> ..
> > For reference, a grep in py3k/Doc reveals that there are currently exactly
> > 23 lines mentioning UCS2 or UCS4 in the docs.
> Did you grep for USC-2 and USC-4 as well?  I have to admit that my
> aversion to these terms is mostly due to the fact that I don't know
> how to spell them correctly. :-)

I grepped using "-ri ucs." and eliminated the false positives (of
which there were only a few) by hand.

R. David Murray                            
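
A minimal Python sketch of the same kind of search, assuming a
recursive, case-insensitive match on the pattern "ucs." as in the grep
invocation above. The function name grep_lines is hypothetical; weeding
out false positives would still be done by hand.

```python
import os
import re


def grep_lines(root, pattern, flags=re.IGNORECASE):
    """Yield (path, line) for each line under root that matches pattern."""
    regex = re.compile(pattern, flags)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            # errors="replace" keeps the scan going on oddly encoded files.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for line in handle:
                    if regex.search(line):
                        yield path, line.rstrip("\n")
```

Calling list(grep_lines("Doc", r"ucs.")) from a checkout would give the
candidate lines to review; the directory name here is only an example.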

From guido at  Mon Nov 22 22:08:57 2010
From: guido at (Guido van Rossum)
Date: Mon, 22 Nov 2010 13:08:57 -0800
Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest
In-Reply-To: <>
References: <> <>
	<> <>
Message-ID: <>

On Mon, Nov 22, 2010 at 11:24 AM, Brett Cannon <brett at> wrote:
> The problem with that is it means developers who switch to Python 3.2
> or whatever are suddenly going to have their tests fail until they
> update their code to turn the warnings off.

That sounds like a feature to me... :-)

--Guido van Rossum (

From jcea at  Mon Nov 22 22:31:21 2010
From: jcea at (Jesus Cea)
Date: Mon, 22 Nov 2010 22:31:21 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>


On 22/11/10 20:42, "Martin v. Löwis" wrote:
> Before enabling anything on a build slave, a patch needs to be
> contributed to make it work in the first place.

I actually agree. I am not sure yet, but I am thinking that adding a
"--build-64" parameter to "configure" could be an option under Solaris.
Most OSs (let's say, Linux) force you to choose 32/64 bits at install
time, but Solaris can use both at the same time, and its compilers can
build for either (using -m32 or -m64).

Since choosing 32 or 64 bits when compiling Python under Solaris changes
the requirements, paths, etc., automating it should be a goal.

PS: Martin, is there any reason to restrict the Solaris 10 buildslaves
to 32 bits, besides the said problems?

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From v+python at  Mon Nov 22 22:54:47 2010
From: v+python at (Glenn Linderman)
Date: Mon, 22 Nov 2010 13:54:47 -0800
Subject: [Python-Dev] is this a bug? no environment variables
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/22/2010 8:33 AM, Guido van Rossum wrote:
> On Sun, Nov 21, 2010 at 9:40 PM, Glenn Linderman<v+python at>  wrote:
>> In reviewing my notes from my experimentations with CGIHTTPServer
>> (Python2.6) and then http.server (Python 3.2a4), I note one behavior I
>> haven't reported as a bug, nor do I know where to start to figure it out,
>> other than experimentally.
>> The experiment: launching CGIHTTPServer without environment variables, by
>> the simple expedient of using a batch file to unset all the existing
>> environment variables, and then launching Python2.6 with CGIHTTPServer.
>> So it failed early: fails at line 110 (Python 2.6).
> What specific traceback do you get? In my copy of the code that line says
>                  a = long(_hexlify(_urandom(16)), 16)
> and I could just imagine that _urandom() fails for some reason to do
> with the environment (it is a reference to os.urandom()), which, being
> part of the C library code, might depend on the environment.
> But you're not giving enough info to debug this.

Yep, that's the line.  I'll have to re-run the scenario, but will do it 
on 3.2a4, hopefully tonight or tomorrow, to get the traceback.

>> I suppose it is possible that some environment variables are used by Python
>> directly (but I can't seem to find a documented list of them) although I
>> would expect that usage to be optional, with fall-back defaults when they
>> don't exist.
> That is certainly the idea, but the fallbacks may not always be nice.
> Environment variables used by Python or the stdlib itself are supposed
> to be named PYTHON<whatever>  if they are Python-specific, and there's
> a way to disable all of these (-E). But there are other environment
> variables (HOME and PATH come to mind) that have a broader definition
> and that are used in some part of the stdlib. Plus, as I mentioned,
> who knows what the non-Python C library uses (well, somebody probably
> knows, but I don't know of a central source that we can actually trust
> across the many platforms where Python runs).

OK, thanks for the philosophy statement.  That's what I didn't know, 
being new.

>> I suppose it is even possible that some Windows APIs might
>> depend on some environment variables, but I expected that the registry had
>> replaced such usage completely, by now, with the environment variables
>> mostly being a convenience tool for batch files, or for optional, temporary
>> alteration of particular settings.
> That sounds like wishful thinking. :-)

Well, wishful thinking from me regarding the Windows and the registry is 
that Windows would be better off without a registry.  But it seemed like 
their direction was instead to do away with environment variables, but 
in any case, I have little idea if they've achieved it, but should have 
achieved something in 6.1 versions of Windows!
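
A minimal sketch of a way to reproduce the experiment without a batch
file, added for illustration. The function name run_without_environment
is hypothetical; it starts a child interpreter with an empty environment
and reports what happened, without assuming the outcome on any platform.

```python
import subprocess
import sys


def run_without_environment(code):
    """Run code in a child interpreter that inherits no environment variables."""
    return subprocess.run(
        [sys.executable, "-c", code],
        env={},  # empty mapping: the child sees no variables at all
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


# The call under suspicion in this thread is os.urandom().
result = run_without_environment("import os; print(len(os.urandom(16)))")
print("return code:", result.returncode)
print("stderr:", result.stderr)
```

Any traceback ends up in result.stderr, which is the information asked
for earlier in the thread.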

From fuzzyman at  Mon Nov 22 23:01:12 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 22 Nov 2010 22:01:12 +0000
Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest
In-Reply-To: <>
References: <>
	<>	<>
	<>	<>
Message-ID: <>

On 22/11/2010 21:08, Guido van Rossum wrote:
> On Mon, Nov 22, 2010 at 11:24 AM, Brett Cannon<brett at>  wrote:
>> The problem with that is it means developers who switch to Python 3.2
>> or whatever are suddenly going to have their tests fail until they
>> update their code to turn the warnings off.
> That sounds like a feature to me... :-)
I think Ezio was suggesting just turning warnings on by default when 
unittest is run, not turning them into errors. Ezio is suggesting that 
developers could explicitly turn warnings off again, but when you use 
the default test runner warnings would be shown. His logic is that 
warnings are for developers, and so are tests...




From martin at  Mon Nov 22 23:05:40 2010
From: martin at (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 22 Nov 2010 23:05:40 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

> I actually agree. I am not sure yet, but I am thinking that adding a
> "--build-64" parameter to "configure" could be an option under Solaris.
> Most OSs (let's say, Linux) force you to choose 32/64 bits at install
> time

Actually, that's not at all the case. Most systems these days support
32-bit and 64-bit applications simultaneously, and also support compiler
tool chains that allow building for either mode.
Solaris, Linux, and Windows are about on-par in this respect;
OS X is more advanced, as it allows a single binary that
supports both 32-bit and 64-bit execution (removing the need for
adjusted path names).

> Since choosing 32 or 64 bits when compiling python under Solaris change
> the requirement, paths, etc., automating it should be a goal.
> PS: Martin, is there any reason to restrict the solaris 10 buildslaves
> to 32 bits, beside the said problems?.

I don't see that as a restriction. I have to make a choice, and there
are sooo many choices to make:
- gcc vs. SunPRO
- 32-bit vs. 64-bit
- GNU make vs. /usr/ccs/bin/make

I picked the combination which was easiest to set up, and is therefore
likely to be used by most users (except for those who think 64-bit
is somehow "better" than 32-bit, when it is actually the other way
'round - IMO).

As for configuration, I personally prefer that setting CC indicates
what type of build you want. Set CC to "gcc -m64" to indicate a
64-bit build. Ideally, you will *not* have to adjust library paths, since
the other compiler will know on its own where to search for things.


From nad at  Mon Nov 22 23:12:05 2010
From: nad at (Ned Deily)
Date: Mon, 22 Nov 2010 14:12:05 -0800
Subject: [Python-Dev] Solaris family and 64 bits compiling
References: <>
	<> <>
Message-ID: <>

In article <4CEAE129.2060505 at>, Jesus Cea <jcea at> wrote:
> On 22/11/10 20:42, "Martin v. Löwis" wrote:
> > Before enabling anything on a build slave, a patch needs to be
> > contributed to make it work in the first place.
> I actually agree. I am not sure yet, but I am thinking that adding a
> "--build-64" parameter to "configure" could be an option under Solaris.
> Most OSs (let say, Linux) force you to choose 32/64 bits at install
> time, but Solaris can use both at the same time, and compilers allow to
> compile both (using -m32 or -m64).
> Since choosing 32 or 64 bits when compiling python under Solaris change
> the requirement, paths, etc., automating it should be a goal.

You might want to look at the existing --with-universal-archs=ARCH in 
configure for how this is done for OS X builds.  It's probably both 
simpler and more complicated than would be needed elsewhere: on OS X, a 
single file can contain object code for multiple architectures, e.g. 
32-bit and 64-bit, rather than having to have multiple files.

 Ned Deily,
 nad at

From brett at  Mon Nov 22 23:20:21 2010
From: brett at (Brett Cannon)
Date: Mon, 22 Nov 2010 14:20:21 -0800
Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest
In-Reply-To: <>
References: <> <>
	<> <>
Message-ID: <>

On Mon, Nov 22, 2010 at 13:08, Guido van Rossum <guido at> wrote:
> On Mon, Nov 22, 2010 at 11:24 AM, Brett Cannon <brett at> wrote:
>> The problem with that is it means developers who switch to Python 3.2
>> or whatever are suddenly going to have their tests fail until they
>> update their code to turn the warnings off.
> That sounds like a feature to me... :-)

=) I meant update their tests with the switch to turn off the
warnings, not update to make the warnings properly disappear.

I guess it's a question of whether the warnings will be errors by default
or simply be printed. I can get behind printing the warnings by default
and adding a switch to turn them into errors or to turn them off.
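
A minimal sketch of the three policies discussed here (show warnings, turn
them into errors, or switch them off), using only the standard warnings
module. The helper name run_with_policy and the sample deprecated_call are
illustrative assumptions, not part of unittest or regrtest:

```python
import warnings


def run_with_policy(action, func):
    """Call func() under a single warnings filter and report the outcome.

    `action` is one of the standard filter actions, e.g. "default",
    "error" or "ignore". This helper is hypothetical, for illustration.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action)
        try:
            func()
        except Warning as exc:
            # With the "error" action the warning is raised as an exception.
            return "raised " + type(exc).__name__
    return "recorded %d warning(s)" % len(caught)


def deprecated_call():
    """Stand-in for test code that triggers a DeprecationWarning."""
    warnings.warn("old API", DeprecationWarning)


for action in ("default", "error", "ignore"):
    print(action, "->", run_with_policy(action, deprecated_call))
```

With "default" the warning is shown (here: recorded) without failing
anything, with "error" the same call fails, and with "ignore" it is silent,
which corresponds to the default-plus-switch behaviour proposed above.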



From anurag.chourasia at  Mon Nov 22 23:46:16 2010
From: anurag.chourasia at (Anurag Chourasia)
Date: Tue, 23 Nov 2010 04:16:16 +0530
Subject: [Python-Dev] Missing Python Symbols when Starting Python App
Message-ID: <>


I have a problem in starting my Python(Django) App using Apache and Mod_Wsgi

I am using Django 1.2.3 and Python 2.6.6 running on Apache 2.2.17 with
Mod_Wsgi 3.3

When I try to access the app from a web browser, I get these errors:

[Mon Nov 22 09:45:25 2010] [notice] Apache/2.2.17 (Unix) mod_wsgi/3.3
Python/2.6.6 configured -- resuming normal operations

[Mon Nov 22 09:45:43 2010] [error] [client] mod_wsgi
(pid=1273874): Target WSGI script '/u01/home/apli/wm/app/gdd/pyserver/
apache/django.wsgi' cannot be loaded as Python module.

[Mon Nov 22 09:45:43 2010] [error] [client] mod_wsgi
(pid=1273874): Exception occurred processing WSGI script '/u01/home/

[Mon Nov 22 09:45:43 2010] [error] [client] Traceback
(most recent call last):

[Mon Nov 22 09:45:43 2010] [error] [client]   File "/u01/
home/apli/wm/app/gdd/pyserver/apache/django.wsgi", line 19, in

[Mon Nov 22 09:45:43 2010] [error] [client]     import

[Mon Nov 22 09:45:43 2010] [error] [client]   File "/usr/
local/lib/python2.6/site-packages/django/core/handlers/", line
1, in <module>

[Mon Nov 22 09:45:43 2010] [error] [client]     from
threading import Lock

[Mon Nov 22 09:45:43 2010] [error] [client]   File "/usr/
local/lib/python2.6/", line 13, in <module>

[Mon Nov 22 09:45:43 2010] [error] [client]     from
functools import wraps

[Mon Nov 22 09:45:43 2010] [error] [client]   File "/usr/
local/lib/python2.6/", line 10, in <module>

[Mon Nov 22 09:45:43 2010] [error] [client]     from
_functools import partial, reduce

[Mon Nov 22 09:45:43 2010] [error] [client] ImportError:
rtld: 0712-001 Symbol PyArg_UnpackTuple was referenced

[Mon Nov 22 09:45:43 2010] [error] [client]       from
module /usr/local/lib/python2.6/lib-dynload/, but a
runtime definition

[Mon Nov 22 09:45:43 2010] [error] [client]       of the
symbol was not found.

[Mon Nov 22 09:45:43 2010] [error] [client] rtld:
0712-001 Symbol PyCallable_Check was referenced

[Mon Nov 22 09:45:43 2010] [error] [client]       from
module /usr/local/lib/python2.6/lib-dynload/, but a
runtime definition

[Mon Nov 22 09:45:43 2010] [error] [client]       of the
symbol was not found.

[Mon Nov 22 09:45:43 2010] [error] [client] rtld:
0712-001 Symbol PyDict_Copy was referenced

[Mon Nov 22 09:45:43 2010] [error] [client]       from
module /usr/local/lib/python2.6/lib-dynload/, but a
runtime definition

[Mon Nov 22 09:45:43 2010] [error] [client]       of the
symbol was not found.

[Mon Nov 22 09:45:43 2010] [error] [client] rtld:
0712-001 Symbol PyDict_Merge was referenced

[Mon Nov 22 09:45:43 2010] [error] [client]       from
module /usr/local/lib/python2.6/lib-dynload/, but a
runtime definition

[Mon Nov 22 09:45:43 2010] [error] [client]       of the
symbol was not found.

[Mon Nov 22 09:45:43 2010] [error] [client] rtld:
0712-001 Symbol PyDict_New was referenced

[Mon Nov 22 09:45:43 2010] [error] [client]       from
module /usr/local/lib/python2.6/lib-dynload/, but a
runtime definition

[Mon Nov 22 09:45:43 2010] [error] [client]       of the
symbol was not found.

[Mon Nov 22 09:45:43 2010] [error] [client] rtld:
0712-001 Symbol PyErr_Occurred was referenced

[Mon Nov 22 09:45:43 2010] [error] [client]       from
module /usr/local/lib/python2.6/lib-dynload/, but a
runtime definition

[Mon Nov 22 09:45:43 2010] [error] [client]       of the
symbol was not found.

[Mon Nov 22 09:45:43 2010] [error] [client] rtld:
0712-001 Symbol PyErr_SetString was referenced

[Mon Nov 22 09:45:43 2010] [error] [client]       from
module /usr/local/lib/python2.6/lib-dynload/, but a
runtime definition

[Mon Nov 22 09:45:43 2010] [error] [client]       of the
symbol was not found.

[Mon Nov 22 09:45:43 2010] [error] [client] \t0509-021
Additional errors occurred but are not reported.

I assume that those missing runtime definitions are supposed to be in
the Python executable. Doing an nm on the first missing symbol reveals
that it does exist.

root [zibal]% nm  /usr/local/bin/python | grep -i PyArg_UnpackTuple
.PyArg_UnpackTuple   T   268683204         524
PyArg_UnpackTuple    D   537073500
PyArg_UnpackTuple    d   537073500          12
PyArg_UnpackTuple:F-1 -         224

Please guide.


From merwok at  Mon Nov 22 23:51:18 2010
From: merwok at (Éric Araujo)
Date: Mon, 22 Nov 2010 23:51:18 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>
Message-ID: <>


I think this bug is related:
"Problems with /usr/lib64 builds."


From tlesher at  Mon Nov 22 23:56:25 2010
From: tlesher at (Tim Lesher)
Date: Mon, 22 Nov 2010 17:56:25 -0500
Subject: [Python-Dev] is this a bug? no environment variables
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 22, 2010 at 16:54, Glenn Linderman <v+python at> wrote:
> I suppose it is possible that some environment variables are used by Python
> directly (but I can't seem to find a documented list of them) although I
> would expect that usage to be optional, with fall-back defaults when they
> don't exist.

I can verify that that's the case: Python (at least through 3.1.2)
runs fine on Windows platforms when environment variables are
completely unavailable.  I know that from running our port for Windows
CE (which has no environment variables at all), cross-compiled for
Windows XP.
Tim Lesher <tlesher at>

From martin at  Tue Nov 23 00:16:47 2010
From: martin at ("Martin v. Löwis")
Date: Tue, 23 Nov 2010 00:16:47 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 22.11.2010 23:51, Éric Araujo wrote:
> Hi,
> I think this bug is related:
> "Problems with /usr/lib64 builds."

Perhaps more closely related:


From jcea at  Tue Nov 23 00:41:19 2010
From: jcea at (Jesus Cea)
Date: Tue, 23 Nov 2010 00:41:19 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>	<>	<>	<>	<>
Message-ID: <>

On 22/11/10 23:05, "Martin v. Löwis" wrote:
>> PS: Martin, is there any reason to restrict the solaris 10 buildslaves
>> to 32 bits, beside the said problems?.
> I don't see that as a restriction. I have to make a choice, and there
> are sooo many choices to make:
> - gcc vs. SunPRO
> - 32-bit vs. 64-bit
> - GNU make vs. /usr/ccs/bin/make
> I picked the combination which was most easy to setup, and is therefore
> likely to be used by most users (except for those who think 64-bit
> is somehow "better" than 32-bit, when it is actually the other way
> 'round - IMO).

Do not think this is a personal attack. Not at all. I am deploying 32-bit
and 64-bit buildslaves (on the same machine) and feeling the pain. You
are far more experienced than me with buildbots and Python. I want to
know if I am missing something.

> As for configuration, I personally prefer that setting CC indicates
> what type of build you want. Set CC to "gcc -m64" to indicate a
> 64-build. Ideally, you will *not* have to adjust library paths, since
> the other compiler will know on its own where to search things.

The problem is not with system library paths. Compilers overcome that.
The problem is with things like "/usr/local/lib" and hardcoded library
paths in Python.

For example, checking

gcc -shared -m64
-L/usr/lib/termcap -L/usr/local/lib -lreadline -lncursesw -o
ld: fatal: file /usr/local/lib/ wrong ELF class: ELFCLASS32
ld: fatal: file processing errors. No output written to
collect2: ld returned 1 exit status

The "-L/usr/local/lib" should be "-L/usr/local/lib/64". An example of many.

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From benjamin at  Tue Nov 23 00:47:16 2010
From: benjamin at (Benjamin Peterson)
Date: Mon, 22 Nov 2010 17:47:16 -0600
Subject: [Python-Dev] [Python-checkins] r86699 -
In-Reply-To: <>
References: <>
Message-ID: <>

No test?

2010/11/22 lukasz.langa <python-checkins at>:
> Author: lukasz.langa
> Date: Tue Nov 23 00:31:26 2010
> New Revision: 86699
> Log:
> Issue #9846: ZipExtFile provides no mechanism for closing the underlying file object
> Modified:
>   python/branches/py3k/Lib/
> Modified: python/branches/py3k/Lib/
> ==============================================================================
> --- python/branches/py3k/Lib/ (original)
> +++ python/branches/py3k/Lib/ Tue Nov 23 00:31:26 2010
> @@ -473,9 +473,11 @@
>     # Search for universal newlines or line chunks.
>     PATTERN = re.compile(br'^(?P<chunk>[^\r\n]+)|(?P<newline>\n|\r\n?)')
> -    def __init__(self, fileobj, mode, zipinfo, decrypter=None):
> +    def __init__(self, fileobj, mode, zipinfo, decrypter=None,
> +                 close_fileobj=False):
>         self._fileobj = fileobj
>         self._decrypter = decrypter
> +        self._close_fileobj = close_fileobj
>         self._compress_type = zipinfo.compress_type
>         self._compress_size = zipinfo.compress_size
> @@ -647,6 +649,12 @@
>         self._offset += len(data)
>         return data
> +    def close(self):
> +        try:
> +            if self._close_fileobj:
> +                self._fileobj.close()
> +        finally:
> +            super().close()
>  class ZipFile:
> @@ -889,8 +897,10 @@
>         # given a file object in the constructor
>         if self._filePassed:
>             zef_file = self.fp
> +            should_close = False
>         else:
>             zef_file =, 'rb')
> +            should_close = True
>         # Make sure we have an info object
>         if isinstance(name, ZipInfo):
> @@ -944,7 +954,7 @@
>             if h[11] != check_byte:
>                 raise RuntimeError("Bad password for file", name)
> -        return  ZipExtFile(zef_file, mode, zinfo, zd)
> +        return  ZipExtFile(zef_file, mode, zinfo, zd, close_fileobj=should_close)
>     def extract(self, member, path=None, pwd=None):
>         """Extract a member from the archive to the current working directory,


From jcea at  Tue Nov 23 00:48:06 2010
From: jcea at (Jesus Cea)
Date: Tue, 23 Nov 2010 00:48:06 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>	<>	<>	<>	<>
Message-ID: <>

I think this is probably trivial, but is there any foolproof way to
detect 64-bit builds in Python, besides "sys.maxint"?

And is there any macro usable for conditional compilation in C?

Checking the Solaris 10 header files, I see macros like "_LP64". Portability
would be nice, but in this personal case it is probably unneeded...



From tjreedy at  Tue Nov 23 00:58:03 2010
From: tjreedy at (Terry Reedy)
Date: Mon, 22 Nov 2010 18:58:03 -0500
Subject: [Python-Dev] Missing Python Symbols when Starting Python App
In-Reply-To: <>
References: <>
Message-ID: <icf028$qmp$>

On 11/22/2010 5:46 PM, Anurag Chourasia wrote:
> [Mon Nov 22 09:45:43 2010] [error] [client] mod_wsgi
> (pid=1273874): Target WSGI script '/u01/home/apli/wm/app/gdd/pyserver/
> apache/django.wsgi' cannot be loaded as Python module.

All the other errors probably stem from this.

> Please guide.

Ask usage questions like this on python-list or a django-specific list.
python-dev is for discussion of development of future versions of 
Python, not usage of current versions.

Terry Jan Reedy

From martin at  Tue Nov 23 01:05:59 2010
From: martin at ("Martin v. Löwis")
Date: Tue, 23 Nov 2010 01:05:59 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
Message-ID: <>

On 23.11.2010 00:41, Jesus Cea wrote:
> On 22/11/10 23:05, "Martin v. Löwis" wrote:
>>> PS: Martin, is there any reason to restrict the solaris 10 buildslaves
>>> to 32 bits, beside the said problems?.
>> I don't see that as a restriction. I have to make a choice, and there
>> are sooo many choices to make:
>> - gcc vs. SunPRO
>> - 32-bit vs. 64-bit
>> - GNU make vs. /usr/ccs/bin/make
>> I picked the combination which was most easy to setup, and is therefore
>> likely to be used by most users (except for those who think 64-bit
>> is somehow "better" than 32-bit, when it is actually the other way
>> 'round - IMO).
> Do not think this is a personal attack.

No offense taken. If you really want to know the historical background:
this was the very first build slave (before I actually announced it to
python-dev), and I haven't changed much from the initial setup.

I just point out that none of the binaries in /usr/bin is a 64-bit
binary; this includes the Sun-provided /usr/sfw/bin/python

> The "-L/usr/local/lib" should be "-L/usr/local/lib/64". An example of many.

Is that really the case? I.e. will ncurses automatically install into
/usr/local/lib/64 if built with a 64-bit compiler? My installation
doesn't even have a /usr/local/lib/64 folder.

In any case: this shouldn't need a configure option. Instead, Python can
find out itself whether it's a 64-bit build, and make modifications
it considers necessary.
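
One way such self-detection could look, as a rough sketch: the pointer size
tells whether the build is 64-bit, and a "64" subdirectory is preferred
when it exists. The function names and the "lib/64" layout (taken from the
Solaris examples in this thread) are assumptions for illustration, not what
Python's build machinery actually does:

```python
import os
import struct


def is_64bit_build():
    """True if this interpreter was built with 8-byte pointers."""
    return struct.calcsize("P") == 8


def prefer_64bit_dirs(lib_dirs, is_64bit=None, isdir=os.path.isdir):
    """Return lib_dirs with each entry swapped for its "64" subdirectory.

    The swap only happens for 64-bit builds and only when the subdirectory
    exists. `isdir` is injectable so the logic can be tried without
    touching a real file system. Hypothetical helper.
    """
    if is_64bit is None:
        is_64bit = is_64bit_build()
    if not is_64bit:
        return list(lib_dirs)
    adjusted = []
    for directory in lib_dirs:
        candidate = os.path.join(directory, "64")
        adjusted.append(candidate if isdir(candidate) else directory)
    return adjusted


print(prefer_64bit_dirs(["/usr/local/lib", "/usr/lib/termcap"]))
```

On a machine without any ".../64" directories the list comes back
unchanged, which matches the case of an installation that has no
/usr/local/lib/64 folder at all.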


From solipsis at  Tue Nov 23 01:06:12 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 01:06:12 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
References: <>
	<> <>
	<> <>
Message-ID: <>

On Tue, 23 Nov 2010 00:48:06 +0100
Jesus Cea <jcea at> wrote:
> I think this is probably trivial, but is there any foolproof way to
> detect 64 bit builds in python, beside "sys.maxint"?.


> And any macro useable for conditional compilation in C?.


From brian.curtin at  Tue Nov 23 01:06:33 2010
From: brian.curtin at (Brian Curtin)
Date: Mon, 22 Nov 2010 18:06:33 -0600
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, Nov 22, 2010 at 17:48, Jesus Cea <jcea at> wrote:

> I think this is probably trivial, but is there any foolproof way to
> detect 64 bit builds in python, beside "sys.maxint"?.

import platform

From martin at  Tue Nov 23 01:12:16 2010
From: martin at ("Martin v. Löwis")
Date: Tue, 23 Nov 2010 01:12:16 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
Message-ID: <>

On 23.11.2010 00:48, Jesus Cea wrote:
> I think this is probably trivial, but is there any foolproof way to
> detect 64 bit builds in python, beside "sys.maxint"?.

The canonical way is to use platform.architecture().

> And any macro useable for conditional compilation in C?.

You need to be more specific than that. There are perhaps ten
independent properties you may query, depending on what precise problem
you try to solve. Most likely, you are looking for SIZEOF_VOID_P
(but don't use that unless you literally want to know how many bytes
a pointer uses, or whether it uses 4 or 8 bytes).
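
A short sketch of the Python-side checks mentioned in this thread, assuming
a reasonably recent Python 3 (where sys.maxsize replaces sys.maxint). The
helper name pointer_bits is an illustrative assumption; struct.calcsize("P")
is the Python-level counterpart of asking for the pointer size, and it
answers only that question:

```python
import platform
import struct
import sys


def pointer_bits():
    """Width of a C pointer in this interpreter build, in bits."""
    return struct.calcsize("P") * 8


# platform.architecture() returns a (bits, linkage) tuple of strings
# describing the interpreter executable.
bits, linkage = platform.architecture()

print("platform.architecture():", bits, linkage)
print("pointer size in bits:   ", pointer_bits())
print("sys.maxsize > 2**32:    ", sys.maxsize > 2**32)
```

The three answers normally agree, but they measure different things (the
executable file, the pointer size, and the size of Py_ssize_t), so which
one to use depends on the precise problem being solved.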


From lukasz at  Tue Nov 23 01:25:01 2010
From: lukasz at (Łukasz Langa)
Date: Tue, 23 Nov 2010 01:25:01 +0100
Subject: [Python-Dev] [Python-checkins] r86699 -
In-Reply-To: <>
References: <>
Message-ID: <>

Message written by Benjamin Peterson on 2010-11-23, at 00:47:

> No test?

The tests were there already, raising ResourceWarnings. After this change, they stopped doing that. You may say: now they pass for the first time :)

Best regards,

> 2010/11/22 lukasz.langa <python-checkins at>:
>> Author: lukasz.langa
>> Date: Tue Nov 23 00:31:26 2010
>> New Revision: 86699
>> Log:
>> Issue #9846: ZipExtFile provides no mechanism for closing the underlying file object
>> Modified:
>>   python/branches/py3k/Lib/
>> Modified: python/branches/py3k/Lib/
>> ==============================================================================
>> --- python/branches/py3k/Lib/ (original)
>> +++ python/branches/py3k/Lib/ Tue Nov 23 00:31:26 2010
>> @@ -473,9 +473,11 @@
>>     # Search for universal newlines or line chunks.
>>     PATTERN = re.compile(br'^(?P<chunk>[^\r\n]+)|(?P<newline>\n|\r\n?)')
>> -    def __init__(self, fileobj, mode, zipinfo, decrypter=None):
>> +    def __init__(self, fileobj, mode, zipinfo, decrypter=None,
>> +                 close_fileobj=False):
>>         self._fileobj = fileobj
>>         self._decrypter = decrypter
>> +        self._close_fileobj = close_fileobj
>>         self._compress_type = zipinfo.compress_type
>>         self._compress_size = zipinfo.compress_size
>> @@ -647,6 +649,12 @@
>>         self._offset += len(data)
>>         return data
>> +    def close(self):
>> +        try:
>> +            if self._close_fileobj:
>> +                self._fileobj.close()
>> +        finally:
>> +            super().close()
>>  class ZipFile:
>> @@ -889,8 +897,10 @@
>>         # given a file object in the constructor
>>         if self._filePassed:
>>             zef_file = self.fp
>> +            should_close = False
>>         else:
>>             zef_file =, 'rb')
>> +            should_close = True
>>         # Make sure we have an info object
>>         if isinstance(name, ZipInfo):
>> @@ -944,7 +954,7 @@
>>             if h[11] != check_byte:
>>                 raise RuntimeError("Bad password for file", name)
>> -        return  ZipExtFile(zef_file, mode, zinfo, zd)
>> +        return  ZipExtFile(zef_file, mode, zinfo, zd, close_fileobj=should_close)
>>     def extract(self, member, path=None, pwd=None):
>>         """Extract a member from the archive to the current working directory,

Kind regards,
Łukasz Langa
tel. +48 791 080 144


From reinout at  Mon Nov 22 23:52:10 2010
From: reinout at (Reinout van Rees)
Date: Mon, 22 Nov 2010 23:52:10 +0100
Subject: [Python-Dev] Missing Python Symbols when Starting Python App
In-Reply-To: <>
References: <>
Message-ID: <ices6q$a80$>

On 11/22/2010 11:46 PM, Anurag Chourasia wrote:
> I have a problem in starting my Python(Django) App using Apache and Mod_Wsgi

I'm pretty sure you're asking on the wrong list.  This one is for 
discussing development of python-the-language :-)

You'd better head over to the django-user mailinglist, for instance via


Reinout van Rees - reinout at -

From lukasz at  Tue Nov 23 01:43:21 2010
From: lukasz at (Łukasz Langa)
Date: Tue, 23 Nov 2010 01:43:21 +0100
Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest
In-Reply-To: <>
References: <>
	<>	<>
	<>	<>
Message-ID: <>

Message written by Michael Foord on 2010-11-22, at 23:01:

> On 22/11/2010 21:08, Guido van Rossum wrote:
>> On Mon, Nov 22, 2010 at 11:24 AM, Brett Cannon<brett at>  wrote:
>>> The problem with that is it means developers who switch to Python 3.2
>>> or whatever are suddenly going to have their tests fail until they
>>> update their code to turn the warnings off.
>> That sounds like a feature to me... :-)
> I think Ezio was suggesting just turning warnings on by default when unittest is run, not turning them into errors. Ezio is suggesting that developers could explicitly turn warnings off again, but when you use the default test runner warnings would be shown. His logic is that warnings are for developers, and so are tests...

Then again, he is not against the idea of turning those warnings into errors, at least for regrtest.

If you agree to do that for regrtest I will clean up the tests for warnings. Already did that for zipfile so it doesn't raise ResourceWarnings anymore. I just need to correct multiprocessing and xmlrpc ResourceWarnings, silence some DeprecationWarnings in the tests and we're all set. Ah, I see a couple more with -uall but nothing scary.

Anyway, I find warnings as errors in regrtest a welcome feature. Let's make it happen :)

Best regards,
Łukasz Langa
tel. +48 791 080 144


From jcea at  Tue Nov 23 01:47:01 2010
From: jcea at (Jesus Cea)
Date: Tue, 23 Nov 2010 01:47:01 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 23/11/10 01:05, "Martin v. Löwis" wrote:
> No offense taken. If you really want to know the historical background:
> this was the very first build slave (before I actually announced it to
> python-dev), and I haven't changed much from the initial setup.

I do really want to know. I love trivia :-). Thanks.

> I just point out that none of the binaries in /usr/bin is a 64-bit
> binary; this includes the Sun-provided /usr/sfw/bin/python
>> The "-L/usr/local/lib" should be "-L/usr/local/lib/64". An example of many.
> Is that really the case? I.e. will ncurses automatically install into
> /usr/local/lib/64 if built with a 64-bit compiler? My installation
> doesn't even have a /usr/local/lib/64 folder.

A fresh Solaris 10 install doesn't even have a "/usr/local" directory :).

Sadly, most Open Source code today is written as if Linux were the only
Unix system out there.

I was amazed that OpenSSL 1.0 installs automatically in
"/usr/local/ssl/lib" when compiled in 32 bits, and in
"/usr/local/ssl/lib/64" when compiled in 64 bits. I almost cried.

> In any case: this shouldn't need a configure option. Instead, Python can
> find out itself whether it's a 64-bit build, and make modifications
> it considers necessary.

I agree. Python should detect it automatically and update the paths when



From jcea at  Tue Nov 23 01:58:46 2010
From: jcea at (Jesus Cea)
Date: Tue, 23 Nov 2010 01:58:46 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 23/11/10 01:05, "Martin v. Löwis" wrote:
> I just point out that none of the binaries in /usr/bin is a 64-bit
> binary; this includes the Sun-provided /usr/sfw/bin/python

True. This is for simplicity reasons (provide only one binary valid for
32 and 64 bits CPUs) and because 64 bits is overkill for a lot of stuff.

In my own system my only 64 bits libraries are OpenSSL, GMP, and some
multimedia stuff like mencoder, vorbis, etc, where the difference is big.

And the GCC 4.5.x install, that installs libraries (fortran, stdc++,
objective C, etc) automatically under "/usr/local/lib/64". GOOD.

But if we say that Python can be compiled as 64-bit under Solaris, it
would be nice if that were actually true. Now that we have a buildbot
(under OpenIndiana) to test, it is doable.

If not, we could say that Solaris+64 bits is unsupported. I don't think
we should go that way. Solaris+64 bits should be a full citizen.



From benjamin at  Tue Nov 23 05:00:08 2010
From: benjamin at (Benjamin Peterson)
Date: Mon, 22 Nov 2010 22:00:08 -0600
Subject: [Python-Dev] [Python-checkins] r86699 -
In-Reply-To: <>
References: <>
Message-ID: <>

2010/11/22 Łukasz Langa <lukasz at>:
> Message written by Benjamin Peterson on 2010-11-23, at 00:47:
> No test?
> The tests were there already, raising ResourceWarnings. After this change,
> they stopped doing that. You may say: now they pass for the first time :)

It looks like you added new API, though. For that, we would expect new tests.
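
As a rough idea of what such a test could check, here is a self-contained
sketch. It does not use zipfile itself: MemberReader is a hypothetical
stand-in that copies only the close_fileobj ownership pattern from the diff
in this thread, and check_close_ownership is an illustrative name:

```python
import io


class MemberReader(io.BufferedIOBase):
    """Hypothetical reader mirroring the close_fileobj pattern of r86699."""

    def __init__(self, fileobj, close_fileobj=False):
        self._fileobj = fileobj
        self._close_fileobj = close_fileobj

    def readable(self):
        return True

    def read(self, size=-1):
        return self._fileobj.read(size)

    def close(self):
        # Close the underlying file only if this reader owns it.
        try:
            if self._close_fileobj:
                self._fileobj.close()
        finally:
            super().close()


def check_close_ownership():
    """Exercise both ownership modes; returns True if all checks pass."""
    owned = io.BytesIO(b"data")
    reader = MemberReader(owned, close_fileobj=True)
    assert reader.read() == b"data"
    reader.close()
    assert reader.closed and owned.closed

    borrowed = io.BytesIO(b"data")
    reader = MemberReader(borrowed)
    reader.close()
    assert reader.closed and not borrowed.closed
    return True


print("ownership checks passed:", check_close_ownership())
```

A real test for the change would exercise ZipFile.open() with both a file
name and a caller-supplied file object, and assert the same two outcomes.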


From ocean-city at  Tue Nov 23 05:13:38 2010
From: ocean-city at (Hirokazu Yamamoto)
Date: Tue, 23 Nov 2010 13:13:38 +0900
Subject: [Python-Dev] OpenSSL Voluntarily (openssl-1.0.0a)
Message-ID: <>

Hello. Does this affect python? Thank you.

From glyph at  Tue Nov 23 06:07:09 2010
From: glyph at (Glyph Lefkowitz)
Date: Tue, 23 Nov 2010 00:07:09 -0500
Subject: [Python-Dev] OpenSSL Voluntarily (openssl-1.0.0a)
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 22, 2010 at 11:13 PM, Hirokazu Yamamoto <
ocean-city at> wrote:

> Hello. Does this affect python? Thank you.


From tjreedy at  Tue Nov 23 07:13:44 2010
From: tjreedy at (Terry Reedy)
Date: Tue, 23 Nov 2010 01:13:44 -0500
Subject: [Python-Dev] [Python-checkins] r86702
	-	python/branches/py3k/Lib/idlelib/
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/23/2010 1:01 AM, terry.reedy wrote:
> Author: terry.reedy
> Date: Tue Nov 23 07:01:31 2010
> New Revision: 86702
> Log:
Issue 9222 Fix filetypes for open dialog

Sorry, forgot to add this before clicking [go] or whatever the button 
is. Is there any way to revise a revision ;-?

> Modified:
>     python/branches/py3k/Lib/idlelib/
> Modified: python/branches/py3k/Lib/idlelib/
> ==============================================================================
> --- python/branches/py3k/Lib/idlelib/	(original)
> +++ python/branches/py3k/Lib/idlelib/	Tue Nov 23 07:01:31 2010
> @@ -476,8 +476,8 @@
>       savedialog = None
>       filetypes = [
> -        ("Python and text files", "*.py *.pyw *.txt", "TEXT"),
> -        ("All text files", "*", "TEXT"),
> +        ("Python files", "*.py *.pyw", "TEXT"),
> +        ("Text files", "*.txt", "TEXT"),
>           ("All files", "*"),
>           ]

From orsenthil at  Tue Nov 23 07:16:12 2010
From: orsenthil at (Senthil Kumaran)
Date: Tue, 23 Nov 2010 14:16:12 +0800
Subject: [Python-Dev] [Python-checkins] r86703 -
In-Reply-To: <>
References: <>
Message-ID: <>

Hi Terry,

On Tue, Nov 23, 2010 at 2:07 PM, terry.reedy <python-checkins at> wrote:
> Author: terry.reedy
> Date: Tue Nov 23 07:07:04 2010
> New Revision: 86703
> Log:
> Issue 9222 Fix filetypes for open dialog
> Modified:
>   python/branches/release31-maint/Lib/idlelib/

You should be using script ( referenced in the dev FAQ),
to merge your changes to release31-maint. This helps in merge tracking
and helpful to release managers when they do the release.

It is pretty simple, in your release31-maint checkout:

Just run python merge -r 9221 (your py3k revision value)
If successful, do a svn commit -F svnmerge-output-filename ( this file
is autogenerated)

If any conflicts occur, resolve them and then do the step 2.


From g.brandl at  Tue Nov 23 07:44:43 2010
From: g.brandl at (Georg Brandl)
Date: Tue, 23 Nov 2010 07:44:43 +0100
Subject: [Python-Dev] [Python-checkins] r86702 -
In-Reply-To: <>
References: <>
Message-ID: <icfo05$cju$>

On 23.11.2010 07:13, Terry Reedy wrote:
> On 11/23/2010 1:01 AM, terry.reedy wrote:
>> Author: terry.reedy
>> Date: Tue Nov 23 07:01:31 2010
>> New Revision: 86702
>> Log:
> Issue 9222 Fix filetypes for open dialog
> Sorry, forgot to add this before clicking [go] or whatever the button 
> is. Is there any way to revise a revision ;-?

Yes, with SVN there is.  I don't know if you can do it with whatever
GUI tool you use, but the command is the following:

svn propedit --revprop -r 86702 svn:log

In a short time however, after switching to Mercurial, commits will be
truly immutable.  However, since the equivalent to committing in SVN is
a two-step process (commit locally and then push one or more commits to
the public repo on the server), you can review your commits locally
before pushing them, and fix mistakes by "rewriting history" (you can
see from that description that it won't work when the changes are already


From tjreedy at  Tue Nov 23 07:49:56 2010
From: tjreedy at (Terry Reedy)
Date: Tue, 23 Nov 2010 01:49:56 -0500
Subject: [Python-Dev] [Python-checkins] r86703
	-	python/branches/release31-maint/Lib/idlelib/
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/23/2010 1:16 AM, Senthil Kumaran wrote:
> Hi Terry,
> On Tue, Nov 23, 2010 at 2:07 PM, terry.reedy<python-checkins at>  wrote:
>> Author: terry.reedy
>> Date: Tue Nov 23 07:07:04 2010
>> New Revision: 86703
>> Log:
>> Issue 9222 Fix filetypes for open dialog
>> Modified:
>>    python/branches/release31-maint/Lib/idlelib/
> You should be using script ( referenced in the dev FAQ),
> to merge your changes to release31-maint. This helps in merge tracking
> and helpful to release managers when they do the release.
> It is pretty simple, in your release31-maint checkout:
> Just run python merge -r 9221 (your py3k revision value)
> If successful, do a svn commit -F svnmerge-output-filename ( this file
> is autogenerated)

I am using TortoiseSVN which has a similar merge but does not seem to 
autogenerate anything. I did use its merge + commit for the 2.7 backport.


From martin at  Tue Nov 23 07:55:20 2010
From: martin at (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 23 Nov 2010 07:55:20 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

> But if we say the Python can be compiled as 64 bits under Solaris, would
> be nice if that was actually true. Now that we have a buildbot (under
> OpenIndiana) to test, it is doable.

But it is true, and always has been true. The lib/64 issue did not
prevent one building Python on Solaris/SPARC64 at all, including the
extension modules. Just edit Modules/Setup to suit your needs - that
has worked since 1995 (before distutils was even written).

> If not, we could say that Solaris+64 bits is unsupported. I don't think
> we should go that way. Solaris+64 bits should be a full citizen.

There we go again: "supported". Python builds on many systems which we
don't have buildbots for, including obscure systems (although Guido has
ruled that we won't specifically accept code for obscure systems
anymore, unlike we did before). It is never fully automatic (you always
have to at least make sure manually that the dependencies are


From tjreedy at  Tue Nov 23 08:16:11 2010
From: tjreedy at (Terry Reedy)
Date: Tue, 23 Nov 2010 02:16:11 -0500
Subject: [Python-Dev] [Python-checkins] r86702 -
In-Reply-To: <icfo05$cju$>
References: <>	<>
Message-ID: <icfpno$isk$>

On 11/23/2010 1:44 AM, Georg Brandl wrote:
> On 23.11.2010 07:13, Terry Reedy wrote:
>> On 11/23/2010 1:01 AM, terry.reedy wrote:
>>> Author: terry.reedy
>>> Date: Tue Nov 23 07:01:31 2010
>>> New Revision: 86702
>>> Log:
>> Issue 9222 Fix filetypes for open dialog
>> Sorry, forgot to add this before clicking [go] or whatever the button
>> is. Is there any way to revise a revision ;-?
> Yes, with SVN there is.  I don't know if you can do it with whatever
> GUI tool you use, but the command is the following:
> svn propedit --revprop -r 86702 svn:log
(followed by new message?)

OK, done. TortoiseSVN has a nice revision log dialog. Right click and 
one of the choices is 'edit log message'. Easy. I see that there is a 
TortoiseHg as well.

Terry Jan Reedy

From g.brandl at  Tue Nov 23 09:10:46 2010
From: g.brandl at (Georg Brandl)
Date: Tue, 23 Nov 2010 09:10:46 +0100
Subject: [Python-Dev] [Python-checkins] r86703 -
In-Reply-To: <>
References: <>	<>
Message-ID: <icft1f$ukq$>

On 23.11.2010 07:49, Terry Reedy wrote:
> On 11/23/2010 1:16 AM, Senthil Kumaran wrote:
>> Hi Terry,
>> On Tue, Nov 23, 2010 at 2:07 PM, terry.reedy<python-checkins at>  wrote:
>>> Author: terry.reedy
>>> Date: Tue Nov 23 07:07:04 2010
>>> New Revision: 86703
>>> Log:
>>> Issue 9222 Fix filetypes for open dialog
>>> Modified:
>>>    python/branches/release31-maint/Lib/idlelib/
>> You should be using script ( referenced in the dev FAQ),
>> to merge your changes to release31-maint. This helps in merge tracking
>> and helpful to release managers when they do the release.
>> It is pretty simple, in your release31-maint checkout:
>> Just run python merge -r 9221 (your py3k revision value)
>> If successful, do a svn commit -F svnmerge-output-filename ( this file
>> is autogenerated)
> I am using TortoiseSVN which has a similar merge but does not seem to 
> autogenerate anything. I did use its merge + commit for the 2.7 backport.

While the policy is to use svnmerge and I'd expect developers to follow
this policy, in this specific case it's not as important anymore since we
use neither svnmerge's mass merging nor its blocking feature anymore.


From trent at  Tue Nov 23 09:40:50 2010
From: trent at (Trent Nelson)
Date: Tue, 23 Nov 2010 03:40:50 -0500
Subject: [Python-Dev] Stable buildbots
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

On 14-Nov-10 3:48 AM, David Bolen wrote:
> This is a completely separate issue, though probably around just as
> long, and like the popup problem its frequency changes over time.  By
> "hung" here I'm referring to cases where something must go wrong with
> a test and/or its cleanup such that a python_d process remains
> running, usually several of them at the same time.

My guess: the "hung" (single-threaded) Python process has called 
select() without a timeout in order to wait for some data.  However, the 
data never arrives (due to a broken/failed test), and the select() never 

On Windows, processes seem harder to kill when they get into this state.
If I purposely wedge a Windows process with select() in the
interactive interpreter, ctrl-c has absolutely no effect (whereas on
Unix, ctrl-c will interrupt the select()).
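
A minimal sketch of the difference a timeout makes, assuming a local
socket pair stands in for the connection the test is waiting on
(wait_readable is a made-up helper name, not something in the test
suite). With a timeout, select() returns empty lists instead of
blocking forever when the peer never writes:

```python
import select
import socket


def wait_readable(sock, timeout):
    """Return True if sock has data to read within `timeout` seconds."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


# socketpair() gives two connected sockets without touching the network.
reader, writer = socket.socketpair()
try:
    # Nothing has been sent yet: a select() without a timeout would hang here.
    ready_before = wait_readable(reader, 0.1)
    writer.sendall(b"x")
    ready_after = wait_readable(reader, 1.0)
finally:
    reader.close()
    writer.close()
```

Here ready_before is False (the call timed out) and ready_after is True
once a byte has been written to the other end.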

As for why kill_python.exe doesn't seem to be able to kill said wedged 
processes, the MSDN documentation on TerminateProcess[1] states the 

	The terminated process cannot exit until all
	pending I/O has been completed or canceled. (sic)

It's not unreasonable to assume a wedged select() constitutes pending 
I/O, so that's a possible explanation as to why kill_python.exe isn't 
able to terminate the processes.

(Also, kill_python currently assumes TerminateProcess() always works; 
perhaps this optimism is misplaced.  Also note the XXX TODO regarding 
the fact that we don't kill processes that have loaded our python*.dll, 
but may not be named python_d.exe.  I don't think that's the issue here, 

On 14-Nov-10 5:32 AM, David Bolen wrote:
 > "Martin v. Löwis"<martin at>  writes:
 >> This is what kill_python.exe is supposed to solve. So I recommend to
 >> investigate why it fails to kill the hanging Pythons.
 > Yeah, I know, and I can't say I disagree in principle - not sure why
 > Windows doesn't let the kill in that module work (or if there's an
 > issue actually running it under all conditions).
 > At the moment though, I do know that using the sysinternals pskill
 > utility externally (which is what I currently do interactively)
 > definitely works so to be honest,

That's interesting.  (That kill_python.exe doesn't kill the wedged 
processes, but pskill does.)  kill_python is pretty simple, it just 
calls TerminateProcess() after acquiring a handle with the relevant 
PROCESS_TERMINATE access right.  That being said, that's the recommended 
way to kill a process -- I doubt pskill would be going about it any 
differently (although, it is sysinternals... you never know what kind of 
crazy black magic it's doing behind the scenes).

Are you calling pskill with the -t flag? i.e. kill process and all
dependents?  That might be the ticket, especially if killing the child
process that the wedged select() is waiting on causes it to return, and
thus, makes it killable.

Otherwise, if it happens again, can you try kill_python.exe first, then 
pskill, and confirm if the former fails but the latter succeeds?



From v+python at  Tue Nov 23 11:30:31 2010
From: v+python at (Glenn Linderman)
Date: Tue, 23 Nov 2010 02:30:31 -0800
Subject: [Python-Dev] is this a bug? no environment variables
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/22/2010 8:33 AM, Guido van Rossum wrote:
> On Sun, Nov 21, 2010 at 9:40 PM, Glenn Linderman<v+python at>  wrote:
>> >  In reviewing my notes from my experimentations with CGIHTTPServer
>> >  (Python2.6) and then http.server (Python 3.2a4), I note one behavior I
>> >  haven't reported as a bug, nor do I know where to start to figure it out,
>> >  other than experimentally.
>> >
>> >  The experiment: launching CGIHTTPServer without environment variables, by
>> >  the simple expedient of using a batch file to unset all the existing
>> >  environment variables, and then launching Python2.6 with CGIHTTPServer.
>> >
>> >  So it failed early: fails at line 110 (Python 2.6).
> What specific traceback do you get? In my copy of the code that line says
>                  a = long(_hexlify(_urandom(16)), 16)
> and I could just imagine that _urandom() fails for some reason to do
> with the environment (it is a reference to os.urandom()), which, being
> part of the C library code, might depend on the environment.
> But you're not giving enough info to debug this.

OK, here is the traceback.  I've upgraded the application from Python 
2.6 + + bugfixes to Python 3.2a4 + http.server + 
bugfixes, hoping that it would fix the problem or, failing that, that the
traceback would be more relevant.  It seems that _urandom is the likely

Traceback (most recent call last):
   File "d:\my\web\areliabl\0test\", line 5, in <module>
     import server
   File "d:\my\web\areliabl\0test\", line 88, in <module>
     import email.message
   File "C:\Python32\lib\email\", line 17, in <module>
     from email import utils
   File "C:\Python32\lib\email\", line 27, in <module>
     import random
   File "C:\Python32\lib\", line 698, in <module>
     _inst = Random()
   File "C:\Python32\lib\", line 90, in __init__
   File "C:\Python32\lib\", line 108, in seed
     a = int.from_bytes(_urandom(32), 'big')
WindowsError: [Error -2146893818] Invalid Signature

From amauryfa at  Tue Nov 23 11:55:08 2010
From: amauryfa at (Amaury Forgeot d'Arc)
Date: Tue, 23 Nov 2010 11:55:08 +0100
Subject: [Python-Dev] is this a bug? no environment variables
In-Reply-To: <>
References: <>
Message-ID: <>


2010/11/23 Glenn Linderman <v+python at>:
>   File "C:\Python32\lib\", line 108, in seed
>     a = int.from_bytes(_urandom(32), 'big')
> WindowsError: [Error -2146893818] Invalid Signature

In the subprocess documentation
"""On Windows, in order to run a side-by-side assembly the specified
env *must* include a valid SystemRoot."""

Can you keep this variable and start again?

Amaury Forgeot d'Arc

From martin at  Tue Nov 23 12:55:38 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 23 Nov 2010 12:55:38 +0100
Subject: [Python-Dev] is this a bug? no environment variables
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

On 23.11.2010 11:55, Amaury Forgeot d'Arc wrote:
> Hi,
> 2010/11/23 Glenn Linderman <v+python at>:
>>   File "C:\Python32\lib\", line 108, in seed
>>     a = int.from_bytes(_urandom(32), 'big')
>> WindowsError: [Error -2146893818] Invalid Signature
> In the subprocess documentation
> """On Windows, in order to run a side-by-side assembly the specified
> env *must* include a valid SystemRoot."""

Indeed, setting SystemRoot might solve this problem. According to

CryptoAPI, in Windows 7, requires this variable be set. Failure to
find the enhanced crypto provider would explain why the "random"
module of Python fails to work.

The specific cause is in the registry:
Strong Cryptographic Provider has as its ImagePath value


So the registry (and COM) do rely on environment variables.
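
One way to run such an experiment without tripping over this, sketched
under the assumption that SystemRoot is the only variable that has to
survive (minimal_env and run_with_minimal_env are hypothetical helper
names):

```python
import os
import subprocess


def minimal_env(keep=("SystemRoot",), source=None):
    """Copy only the variables named in `keep` from `source` (default: os.environ)."""
    source = os.environ if source is None else source
    return {name: source[name] for name in keep if name in source}


def run_with_minimal_env(args):
    """Run a command with the stripped-down environment; return its exit code."""
    return subprocess.call(args, env=minimal_env())
```

For example, run_with_minimal_env([sys.executable, "-c", "import random"])
would start a child interpreter that sees SystemRoot and nothing else.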


From stephen at  Tue Nov 23 13:15:20 2010
From: stephen at (Stephen J. Turnbull)
Date: Tue, 23 Nov 2010 21:15:20 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <icea0n$g9v$>
References: <>
	<> <>
Message-ID: <>

Terry Reedy writes:

 > Yes. As I read the standard, UCS-2 is limited to BMP chars.

Et tu, Terry?

OK, I change my vote on the suggestion of "UCS2" to -1.  If a couple
of conscientious blokes like you and David both understand it that
way, I can't see any way to fight it.

FWIW, ISO/IEC 10646 (which is authoritative for UCS-2 and UCS-4) is
available via

Probably I'm the last non-author to ever read that document!

From nadeem.vawda at  Tue Nov 23 13:15:18 2010
From: nadeem.vawda at (Nadeem Vawda)
Date: Tue, 23 Nov 2010 14:15:18 +0200
Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest
In-Reply-To: <>
References: <> <>
	<> <>
Message-ID: <>

2010/11/23 Łukasz Langa <lukasz at>:
> If you agree to do that for regrtest I will clean up the tests for warnings.
> Already did that for zipfile so it doesn't raise ResourceWarnings anymore. I
> just need to correct multiprocessing and xmlrpc ResourceWarnings, silence
> some DeprecationWarnings in the tests and we're all set. Ah, I see a couple
> more with -uall but nothing scary.

There are also some in test_socket - I've submitted a patch on

Looking at the multiprocessing warnings, they seem to be caused by
leaks in the underlying package, unlike xmlrpc and socket, where it's
just a matter of the test code neglecting to close the connection.  So
+1 to:

> Anyway, I find warnings as errors in regrtest a welcome feature. Let's make
> it happen :)


From jcea at  Tue Nov 23 13:19:39 2010
From: jcea at (Jesus Cea)
Date: Tue, 23 Nov 2010 13:19:39 +0100
Subject: [Python-Dev] Solaris family and 64 bits compiling
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 23/11/10 07:55, "Martin v. Löwis" wrote:
>> But if we say the Python can be compiled as 64 bits under Solaris, would
>> be nice if that was actually true. Now that we have a buildbot (under
>> OpenIndiana) to test, it is doable.
>
> But it is true, and always has been true. The lib/64 issue did not
> prevent one building Python on Solaris/SPARC64 at all, including the
> extension modules. Just edit Modules/Setup to suit your needs - that
> works since 1995 (before distutils was even written).

Would it be acceptable to change something like:


to something similar to:

if (platform.uname()=="SunOS") and (platform.architecture()[0]=="64bits") :
else :

Would python-dev consider that change OK?
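
For what it is worth, a check along those lines could be sketched as
below. Note that platform.uname() returns a tuple rather than a string,
and platform.architecture() reports "64bit" (no trailing "s"), so the
comparison has to be spelled slightly differently from the snippet
above. is_64bit_sunos is a hypothetical name; what goes in each branch
is left open, as in the proposal.

```python
import platform


def is_64bit_sunos(system=None, bits=None):
    """Return True for a 64-bit interpreter running on SunOS/Solaris.

    Both arguments default to the values reported by the platform module;
    they can be passed explicitly to try the check on another machine.
    """
    system = platform.system() if system is None else system
    bits = platform.architecture()[0] if bits is None else bits
    return system == "SunOS" and bits == "64bit"
```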

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From ncoghlan at  Tue Nov 23 14:41:05 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 23 Nov 2010 23:41:05 +1000
Subject: [Python-Dev] [Python-checkins] r86633 - in
 python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst
 Lib/ Lib/test/ Misc/NEWS
In-Reply-To: <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>
References: <>
	<> <>
Message-ID: <>

On Tue, Nov 23, 2010 at 2:46 AM,  <exarkun at> wrote:
> On 04:24 pm, solipsis at wrote:
>> On Mon, 22 Nov 2010 17:08:36 +0100
>> Hrvoje Niksic <hrvoje.niksic at> wrote:
>>> On 11/22/2010 04:37 PM, Antoine Pitrou wrote:
>>> > +1.  The problem with int constants is that the int gets printed, not
>>> > the name, when you dump them for debugging purposes :)
>>> Well, it's trivial to subclass int to something with a nicer __repr__.
>>> PyGTK uses that technique for wrapping C enums:
>> Nice. It might be useful to add a private _Constant class somewhere for
>> stdlib purposes.

Indeed, it is difficult to do enums in such a way that they feel
sufficiently robust to be worth the effort of including them (although
these days, I would be inclined to follow the namedtuple API style
rather than that presented in PEP 354).
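
The int-subclass technique mentioned above, as a minimal sketch
(NamedInt is an illustrative name, not the private _Constant class
being proposed, and not PyGTK's implementation):

```python
class NamedInt(int):
    """An int that shows a symbolic name in its repr."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        return self._name


SEEK_END = NamedInt("SEEK_END", 2)
```

repr(SEEK_END) then gives the name, while SEEK_END == 2 still holds and
the object can be passed anywhere a plain int is expected.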


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From fuzzyman at  Tue Nov 23 14:50:53 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 23 Nov 2010 13:50:53 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>
	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>
Message-ID: <>

On 23/11/2010 13:41, Nick Coghlan wrote:
> On Tue, Nov 23, 2010 at 2:46 AM,<exarkun at>  wrote:
>> On 04:24 pm, solipsis at wrote:
>>> On Mon, 22 Nov 2010 17:08:36 +0100
>>> Hrvoje Niksic<hrvoje.niksic at>  wrote:
>>>> On 11/22/2010 04:37 PM, Antoine Pitrou wrote:
>>>>> +1.  The problem with int constants is that the int gets printed, not
>>>>> the name, when you dump them for debugging purposes :)
>>>> Well, it's trivial to subclass int to something with a nicer __repr__.
>>>> PyGTK uses that technique for wrapping C enums:
>>> Nice. It might be useful to add a private _Constant class somewhere for
>>> stdlib purposes.
> Indeed, it is difficult to do enums in such a way that they feel
> sufficiently robust to be worth the effort of including them (although
> these days, I would be inclined to follow the namedtuple API style
> rather than that presented in PEP 354).
Right. As it happens I just submitted a patch to Barry Warsaw's enum 
package (nice), flufl.enum [1], to allow namedtuple style creation of 
named constants:

 >>> from flufl.enum import make_enum
 >>> Colors = make_enum('Colors', 'red green blue')
 >>> Colors
<Colors {red: 1, green: 2, blue: 3}>

PEP 354 was rejected for two primary reasons - lack of interest and 
nowhere obvious to put it. Would it be *so bad* if an enum type lived in 
its own module? There is certainly more interest now, and if we are to 
use something like this in the standard library it *has* to be in the 
standard library (unless every module implements their own private 
_Constant class).

Time to revisit the PEP?

All the best,



> Cheers,
> Nick.


From solipsis at  Tue Nov 23 15:02:19 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 15:02:19 +0100
Subject: [Python-Dev] OpenSSL Voluntarily (openssl-1.0.0a)
References: <>
Message-ID: <>

On Tue, 23 Nov 2010 00:07:09 -0500
Glyph Lefkowitz <glyph at> wrote:
> On Mon, Nov 22, 2010 at 11:13 PM, Hirokazu Yamamoto <
> ocean-city at> wrote:
> > Hello. Does this affect python? Thank you.
> >
> >
> >
> No.

Well, actually it does, but Python links against the system OpenSSL on
most platforms (except Windows), so it's up to the OS vendor to apply
the patch.



From ncoghlan at  Tue Nov 23 15:03:53 2010
From: ncoghlan at (Nick Coghlan)
Date: Wed, 24 Nov 2010 00:03:53 +1000
Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest
In-Reply-To: <>
References: <> <>
	<> <>
Message-ID: <>

On Tue, Nov 23, 2010 at 8:01 AM, Michael Foord
<fuzzyman at> wrote:
> On 22/11/2010 21:08, Guido van Rossum wrote:
>> On Mon, Nov 22, 2010 at 11:24 AM, Brett Cannon<brett at>  wrote:
>>> The problem with that is it means developers who switch to Python 3.2
>>> or whatever are suddenly going to have their tests fail until they
>>> update their code to turn the warnings off.
>> That sounds like a feature to me... :-)
> I think Ezio was suggesting just turning warnings on by default when
> unittest is run, not turning them into errors. Ezio is suggesting that
> developers could explicitly turn warnings off again, but when you use the
> default test runner warnings would be shown. His logic is that warnings are
> for developers, and so are tests...

Having at least the default test runner change the default warnings
behaviour to -Wd (while still respecting sys.warnoptions) sounds like
a good idea. That way users won't see the warnings (as intended with
that change), but developers are less likely to get nasty surprises
when things break in future releases (which was one of our major
concerns when we made the decision to change the default handling of
DeprecationWarning). A similar change may be appropriate for doctest
as well.

Printing out the list of suppressed warnings in verbose mode may also be useful.

A blanket -We is unlikely to work for the test suite, since generating
warnings on some platforms is expected behaviour (e.g. due to the
ongoing argument between multiprocessing and FreeBSD as to the
appropriate behaviour of semaphores). However, we may be able to get
to the point where it is run that way by default and then affected
tests use check_warnings() to alter the filter configuration
(something that many such affected tests already do).
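
A rough illustration of the two filter settings discussed here, using
only the warnings module (old_api is a stand-in for any deprecated
function; the test suite's check_warnings() helper is not shown).
"default" corresponds to -Wd and "error" to -We:

```python
import warnings


def old_api():
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


# -Wd style: the warning is reported (recorded here) but the call succeeds.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("default")
    value = old_api()

# -We style: the same warning is turned into an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        old_api()
        raised = False
    except DeprecationWarning:
        raised = True
```

catch_warnings() restores the previous filter configuration on exit, so
neither block changes the settings seen by the rest of the program.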


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From solipsis at  Tue Nov 23 15:02:57 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 15:02:57 +0100
Subject: [Python-Dev] r86699 - python/branches/py3k/Lib/
References: <>
Message-ID: <>

On Mon, 22 Nov 2010 22:00:08 -0600
Benjamin Peterson <benjamin at> wrote:
> 2010/11/22 Łukasz Langa <lukasz at>:
> > Message written by Benjamin Peterson on 2010-11-23, at 00:47:
> >
> > No test?
> >
> >
> > The tests were there already, raising ResourceWarnings. After this change,
> > they stopped doing that. You may say: now they pass for the first time :)
> It looks like you added new API, though. For that, we would expect new tests.

It's an internal API, although ZipExtFile doesn't begin with an
underscore.


From ncoghlan at  Tue Nov 23 15:16:15 2010
From: ncoghlan at (Nick Coghlan)
Date: Wed, 24 Nov 2010 00:16:15 +1000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Tue, Nov 23, 2010 at 11:50 PM, Michael Foord
<fuzzyman at> wrote:
> PEP 354 was rejected for two primary reasons - lack of interest and nowhere
> obvious to put it. Would it be *so bad* if an enum type lived in its own
> module? There is certainly more interest now, and if we are to use something
> like this in the standard library it *has* to be in the standard library
> (unless every module implements their own private _Constant class).
> Time to revisit the PEP?

If you (or anyone else) wanted to revisit the PEP, then I would advise
trawling through the standard library looking for constants that could
be sensibly converted to enum values.

A decision would also need to be made as to whether or not to subclass
int, or just provide __index__ (the former has the advantage of being
able to drop cleanly into OS level APIs that expect a numerical
constant).

Whether enums should provide arbitrary name-value mappings (ala C
enums) or were restricted to sequential indices starting from zero
would be another question best addressed by a code survey of at least
the stdlib.

And getgeneratorstate() doesn't count as a use case, since the
ordering isn't needed and using string literals instead of integers
will cover the debugging aspect :)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From fuzzyman at  Tue Nov 23 15:24:18 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 23 Nov 2010 14:24:18 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>
Message-ID: <>

On 23/11/2010 14:16, Nick Coghlan wrote:
> On Tue, Nov 23, 2010 at 11:50 PM, Michael Foord
> <fuzzyman at>  wrote:
>> PEP 354 was rejected for two primary reasons - lack of interest and nowhere
>> obvious to put it. Would it be *so bad* if an enum type lived in its own
>> module? There is certainly more interest now, and if we are to use something
>> like this in the standard library it *has* to be in the standard library
>> (unless every module implements their own private _Constant class).
>> Time to revisit the PEP?
> If you (or anyone else) wanted to revisit the PEP, then I would advise
> trawling through the standard library looking for constants that could
> be sensibly converted to enum values.
> A decision would also need to be made as to whether or not to subclass
> int, or just provide __index__ (the former has the advantage of being
> able to drop cleanly into OS level APIs that expect a numerical
> constant).
> Whether enums should provide arbitrary name-value mappings (ala C
> enums) or were restricted to sequential indices starting from zero
> would be another question best addressed by a code survey of at least
> the stdlib.
> And getgeneratorstate() doesn't count as a use case, since the
> ordering isn't needed and using string literals instead of integers
> will cover the debugging aspect :)
Well, for backwards compatibility reasons the new constants would have 
to *behave* like the old ones (including having the same underlying 
value and comparing equal to it).

In many cases it is *likely* that subclassing int is a better way of 
achieving that. Actually looking through the standard library to 
evaluate it is the only way of confirming that.

Another API, that reduces the duplication of creating the enum and 
setting the names, could be something like:

     make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int,
module=__name__)

Using __name__ we can set the module globals in the call to make_enums.
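
A sketch of how such a helper might work. Everything here is
hypothetical: make_enums does not exist in the stdlib, and the start=1
default merely mirrors the flufl.enum example earlier in this message.
A throwaway module is used for the demonstration; real code would pass
module=__name__.

```python
import sys
import types


def make_enums(type_name, names, base_type=int, module=None, start=1):
    """Create named constants and optionally bind them as module globals."""

    def __repr__(self):
        return self._name

    cls = type(type_name, (base_type,), {"__repr__": __repr__})
    members = {}
    for value, name in enumerate(names.split(), start):
        member = cls(value)
        member._name = name
        members[name] = member
    if module is not None:
        # Look the module up by name and inject the constants as its globals.
        vars(sys.modules[module]).update(members)
    return members


# A throwaway module stands in for the caller's own module here.
demo = types.ModuleType("enum_demo")
sys.modules["enum_demo"] = demo
members = make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE",
                     base_type=int, module="enum_demo")
```

After the call, demo.NAME_TWO compares equal to 2 and its repr is
"NAME_TWO", so existing code that expects the integer keeps working.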

All the best,


> Cheers,
> Nick.


From solipsis at  Tue Nov 23 15:42:29 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 15:42:29 +0100
Subject: [Python-Dev] constant/enum type in stdlib
References: <>
	<> <>
Message-ID: <>

On Tue, 23 Nov 2010 14:24:18 +0000
Michael Foord <fuzzyman at> wrote:
> Well, for backwards compatibility reasons the new constants would have 
> to *behave* like the old ones (including having the same underlying 
> value and comparing equal to it).
> In many cases it is *likely* that subclassing int is a better way of 
> achieving that. Actually looking through the standard library to 
> evaluate it is the only way of confirming that.
> Another API, that reduces the duplication of creating the enum and 
> setting the names, could be something like:
>      make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int, 
> module=__name__)
> Using __name__ we can set the module globals in the call to make_enums.

I don't understand why people insist on calling that an "enum". enum is
a C legacy and it doesn't bring anything useful as far as I can tell. Instead,
just assign the values explicitly.


From benjamin at  Tue Nov 23 15:49:37 2010
From: benjamin at (Benjamin Peterson)
Date: Tue, 23 Nov 2010 08:49:37 -0600
Subject: [Python-Dev] r86699 - python/branches/py3k/Lib/
In-Reply-To: <>
References: <>
Message-ID: <>

2010/11/23 Antoine Pitrou <solipsis at>:
> On Mon, 22 Nov 2010 22:00:08 -0600
> Benjamin Peterson <benjamin at> wrote:
>> 2010/11/22 Łukasz Langa <lukasz at>:
>> > Message written by Benjamin Peterson on 2010-11-23, at 00:47:
>> >
>> > No test?
>> >
>> >
>> > The tests were there already, raising ResourceWarnings. After this change,
>> > they stopped doing that. You may say: now they pass for the first time :)
>> It looks like you added new API, though. For that, we would expect new tests.
> It's an internal API, although ZipExtFile doesn't begin with an
> underscore.

Why is it internal API then?


From benjamin at  Tue Nov 23 15:52:09 2010
From: benjamin at (Benjamin Peterson)
Date: Tue, 23 Nov 2010 08:52:09 -0600
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

2010/11/23 Antoine Pitrou <solipsis at>:
> On Tue, 23 Nov 2010 14:24:18 +0000
> Michael Foord <fuzzyman at> wrote:
>> Well, for backwards compatibility reasons the new constants would have
>> to *behave* like the old ones (including having the same underlying
>> value and comparing equal to it).
>> In many cases it is *likely* that subclassing int is a better way of
>> achieving that. Actually looking through the standard library to
>> evaluate it is the only way of confirming that.
>> Another API, that reduces the duplication of creating the enum and
>> setting the names, could be something like:
>>      make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int,
>> module=__name__)
>> Using __name__ we can set the module globals in the call to make_enums.
> I don't understand why people insist on calling that an "enum". enum is
> a C legacy and it doesn't bring anything useful as I can tell. Instead,
> just assign the values explicitly.

The concept of an "enumeration" of values is still useful outside its
stunted C incarnation.

Out of curiosity, why is enum "legacy" in C?


From fuzzyman at  Tue Nov 23 15:56:36 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 23 Nov 2010 14:56:36 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>
	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>
Message-ID: <>

On 23/11/2010 14:42, Antoine Pitrou wrote:
> On Tue, 23 Nov 2010 14:24:18 +0000
> Michael Foord<fuzzyman at>  wrote:
>> Well, for backwards compatibility reasons the new constants would have
>> to *behave* like the old ones (including having the same underlying
>> value and comparing equal to it).
>> In many cases it is *likely* that subclassing int is a better way of
>> achieving that. Actually looking through the standard library to
>> evaluate it is the only way of confirming that.
>> Another API, that reduces the duplication of creating the enum and
>> setting the names, could be something like:
>>       make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int,
>> module=__name__)
>> Using __name__ we can set the module globals in the call to make_enums.
> I don't understand why people insist on calling that an "enum". enum is
> a C legacy and it doesn't bring anything useful as I can tell. Instead,
> just assign the values explicitly.

enum isn't only in C. (They are in C# as well at least.) Wikipedia links 
enum to "enumerated type" and says:

     an enumerated type (also called enumeration or enum) is a data type 
consisting of a set of named values

It sounds entirely appropriate. I have no problem with explicitly 
assigning values instead of doing it automagically.

All the best,


> Antoine.


From stephen at  Tue Nov 23 16:00:22 2010
From: stephen at (Stephen J. Turnbull)
Date: Wed, 24 Nov 2010 00:00:22 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

If you don't care about the ISO standard, but only about Python,
Martin's right, I was wrong.  You can stop reading now.<wink>

"Martin v. Löwis" writes:

 > I could only find the FCD of 10646:2010, where annex H was integrated
 > into section 10:

Thank you for the reference.

I referred to two older versions, 10646-1:1993 (for the annexes and
Amendment, and my basic understanding) and 10646:2003 (for the
detailed definition of UCS-2 in Sections 7, 8 and 13; unfortunately, I
missed the most important detail, which is in Section 9).  In :2003
the Annex I referred to as "Annex H" is Annex J, and "Annex Q" is
partly in Section 9.1 and mostly in Annex C.  I don't know where the
former is in the 2010 FCD, and the latter is section 9.2.

 > I think they are now acknowledging that UCS-2 was a misleading term,
 > making it ambiguous whether this refers to a CCS, a CEF, or a CES;
 > like "ASCII", people have been using it for all three of them.

In :1993 it wasn't ambiguous, they simply didn't make those
distinctions.  They were not needed for ISO 10646's published
versions, although they certainly are for Unicode.

Now, quite clearly, the ISO has *changed the definition* in every new
version, progressively adding new restrictions that go beyond
clarifying ambiguity.  But even in :2003, in view of 4.2, 6.2, 6.3,
and 13.1, UCS-2 is clearly well-defined as a CM according to UTR#17,
which can probably be identified with CCS in :2003 terminology.  Ie,
returning to UTR#17 terminology, it is the composition of a CES, a
CEF, and a CCS, which are not defined individually.  Note: The
definition of "coded character" changed between :2003 and the 2010
FCD, from "character with representation" to "character with integer".

There is a NOTE indicating that 16-bit integers may be used in
processing.  Given that this is a non-normative note, I take it to
mean that in an array of 16-bit integers, "most significant octet" is
to be interpreted in the natural way for the architecture rather than
by the representation in memory, which might be little-endian.  IMO
it's unnatural to think that that changes the definition of UCS-2 to
be either a CEF, or a composition of a CEF and a CCS.

 > Apparently, the ISO WG interprets earlier revisions as saying that
 > UCS-2 is a CEF that restricted UTF-16 to the BMP.

I think that ISO 10646-1:1993 admits only one interpretation, a CM
restricted to the BMP (including surrogates), and ISO 10646:2003
admits only one interpretation, a CM restricted to the BMP (not
including surrogates).  The note under Table 4 on p.24 of the FCD is,
uh, well, a lie.  Earlier versions certainly did not restrict to
"scalar values"; they had no such concept.


Well, no shit, Sherlock.  You don't have to yell at me, I know what
Python does.  The question is, what does UCS-2 do?  The answer is
that in :1993, AFAICT it did what Python does.  In :2003, they added
(last sentence, section 9.1):

    UCS-2 cannot be used to represent any characters on the
    supplementary planes.

I assume they maintain that position in 2010, so End Of Thread.

I apologize for missing that when I was reviewing the standard
earlier, but I expected restrictions on UCS-2 to be explained in 13.1
or perhaps 14.  And 13.1 simply requires that characters in the BMP be
represented by their defined code positions, truncated to two octets.
Like earlier versions, it doesn't prohibit use of surrogates or say
that non-BMP characters can't be represented.

 > Not sure what it says in your copy; in mine, section 9.3 says


Mine (:2003) says "NOTE 2 - When confined to the code positions in
Planes 00 to 10, UCS-4 is also referred to as UCS Transformation
Format 32 (UTF-32)."  Then it references the Unicode Standard (v4.0)
as the authority for UTF-32.  Obviously they continued to be confused
at this point in time; by the draft you have, apparently the WG had
decided to pretty much completely synchronize the whole standard to a
subset of Unicode.  This seems pointless to me (unlike, say, the work
that has been done on standardizing criteria for repertoire changes).

In particular, the :1993 definition of UCS-2 was a perfectly good
standard for describing the processing Python actually does
internally.  The current definition of UCS-2 as identical to the BMP
is useless, and good riddance, I'm perfectly happy to have them
deprecate it.

From solipsis at  Tue Nov 23 16:01:06 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 16:01:06 +0100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <1290524466.3642.4.camel@localhost.localdomain>

On Tuesday 23 November 2010 at 08:52 -0600, Benjamin Peterson wrote:
> 2010/11/23 Antoine Pitrou <solipsis at>:
> > On Tue, 23 Nov 2010 14:24:18 +0000
> > Michael Foord <fuzzyman at> wrote:
> >> Well, for backwards compatibility reasons the new constants would have
> >> to *behave* like the old ones (including having the same underlying
> >> value and comparing equal to it).
> >>
> >> In many cases it is *likely* that subclassing int is a better way of
> >> achieving that. Actually looking through the standard library to
> >> evaluate it is the only way of confirming that.
> >>
> >> Another API, that reduces the duplication of creating the enum and
> >> setting the names, could be something like:
> >>
> >>      make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int,
> >> module=__name__)
> >>
> >> Using __name__ we can set the module globals in the call to make_enums.
> >
> > I don't understand why people insist on calling that an "enum". enum is
> > a C legacy and it doesn't bring anything useful as far as I can tell. Instead,
> > just assign the values explicitly.
> The concept of an "enumeration" of values is still useful outside its
> stunted C incarnation.

Well, it is easy to assign range(N) to a tuple of names when desired. I
don't think an automatically-enumerating constant generator is needed.
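
A minimal sketch of the explicit style, for illustration only. The names
are the placeholders from the make_enums example quoted above, and the
values are arbitrary:

```python
# Explicit assignment: unpack range(N) into a tuple of names.
NAME_ONE, NAME_TWO, NAME_THREE = range(3)

# The constants are plain ints, so they compare and print as bare numbers.
print(NAME_ONE, NAME_TWO, NAME_THREE)
```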



From solipsis at  Tue Nov 23 16:01:59 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 16:01:59 +0100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<>	<>
	<>	<>
	<> <>
Message-ID: <1290524519.3642.5.camel@localhost.localdomain>

On Tuesday 23 November 2010 at 14:56 +0000, Michael Foord wrote:
> On 23/11/2010 14:42, Antoine Pitrou wrote:
> > On Tue, 23 Nov 2010 14:24:18 +0000
> > Michael Foord<fuzzyman at>  wrote:
> >> Well, for backwards compatibility reasons the new constants would have
> >> to *behave* like the old ones (including having the same underlying
> >> value and comparing equal to it).
> >>
> >> In many cases it is *likely* that subclassing int is a better way of
> >> achieving that. Actually looking through the standard library to
> >> evaluate it is the only way of confirming that.
> >>
> >> Another API, that reduces the duplication of creating the enum and
> >> setting the names, could be something like:
> >>
> >>       make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int,
> >> module=__name__)
> >>
> >> Using __name__ we can set the module globals in the call to make_enums.
> > I don't understand why people insist on calling that an "enum". enum is
> > a C legacy and it doesn't bring anything useful as far as I can tell. Instead,
> > just assign the values explicitly.
> >
> enum isn't only in C. (They are in C# as well at least.)

Well, it's been inherited by C-like languages, no doubt. Like braces and
semicolons :)



From solipsis at  Tue Nov 23 15:59:59 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 15:59:59 +0100
Subject: [Python-Dev] r86699 - python/branches/py3k/Lib/
In-Reply-To: <>
References: <>
Message-ID: <1290524399.3642.3.camel@localhost.localdomain>

On Tuesday 23 November 2010 at 08:49 -0600, Benjamin Peterson wrote:
> 2010/11/23 Antoine Pitrou <solipsis at>:
> > On Mon, 22 Nov 2010 22:00:08 -0600
> > Benjamin Peterson <benjamin at> wrote:
> >> 2010/11/22 Łukasz Langa <lukasz at>:
> >> > Message written by Benjamin Peterson on 2010-11-23, at 00:47:
> >> >
> >> > No test?
> >> >
> >> >
> >> > The tests were there already, raising ResourceWarnings. After this change,
> >> > they stopped doing that. You may say: now they pass for the first time :)
> >>
> >> It looks like you added new API, though. For that, we would expect new tests.
> >
> > It's an internal API, although ZipExtFile doesn't begin with an
> > underscore.
> Why is it internal API then?

Because it's for use by The ZipExtFile constructor is
not supposed to be called by the user.
You might instead ask why ZipExtFile isn't called _ZipExtFile,
and I have no idea.



From fuzzyman at  Tue Nov 23 16:15:29 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 23 Nov 2010 15:15:29 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290524466.3642.4.camel@localhost.localdomain>
References: <>	<>	<>	<>	<>	<>
	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>
	<>	<>
Message-ID: <>

On 23/11/2010 15:01, Antoine Pitrou wrote:
> On Tuesday 23 November 2010 at 08:52 -0600, Benjamin Peterson wrote:
>> 2010/11/23 Antoine Pitrou<solipsis at>:
>>> On Tue, 23 Nov 2010 14:24:18 +0000
>>> Michael Foord<fuzzyman at>  wrote:
>>>> Well, for backwards compatibility reasons the new constants would have
>>>> to *behave* like the old ones (including having the same underlying
>>>> value and comparing equal to it).
>>>> In many cases it is *likely* that subclassing int is a better way of
>>>> achieving that. Actually looking through the standard library to
>>>> evaluate it is the only way of confirming that.
>>>> Another API, that reduces the duplication of creating the enum and
>>>> setting the names, could be something like:
>>>>       make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int,
>>>> module=__name__)
>>>> Using __name__ we can set the module globals in the call to make_enums.
>>> I don't understand why people insist on calling that an "enum". enum is
>>> a C legacy and it doesn't bring anything useful as far as I can tell. Instead,
>>> just assign the values explicitly.
>> The concept of an "enumeration" of values is still useful outside its
>> stunted C incarnation.
> Well, it is easy to assign range(N) to a tuple of names when desired. I
> don't think an automatically-enumerating constant generator is needed.
Right, and that is current practise. It has the disadvantage (that you 
seemed to acknowledge) that when debugging the integer values are seen 
instead of something with a useful repr.

Having a *simple* class (and API to create them) that produces named 
constants with a useful repr, is what we are discussing, and that seems 
awfully like an enum (in the general sense not in a C specific sense). 
For backwards compatibility these constants, where they replace integer 
constants, would need to be integer subclasses with the same behaviour. 
Like the Qt example you appreciated so much. ;-)

There are still two reasonable APIs (unless you have changed your mind 
and think that sticking with plain integers is best), of which I prefer 
the latter:

SOME_CONST = Constant('SOME_CONST', 1)
OTHER_CONST = Constant('OTHER_CONST', 2)

or:

Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1)

(Well, there is a third option that takes __name__ and sets the 
constants in the module automagically. I can understand why people would 
dislike that though.)
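
To make the two spellings concrete, here is a rough sketch, not a
proposed implementation. The bodies of Constant and make_constants and
the repr format are assumptions made up for illustration; only the call
signatures come from the examples above.

```python
class Constant(int):
    """Hypothetical int subclass that remembers its name for its repr."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        return "<%s %s: %d>" % (type(self).__name__, self._name, int(self))


def make_constants(typename, names, start=0):
    """Hypothetical factory: build a class holding one Constant per name."""
    members = {
        name: Constant(name, value)
        for value, name in enumerate(names.split(), start)
    }
    return type(typename, (), members)


SOME_CONST = Constant('SOME_CONST', 1)
Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1)

# Still an int for backwards compatibility, but with a readable repr.
assert Constants.SOME_CONST == 1 == SOME_CONST
print(repr(Constants.OTHER_CONST))
```

Because each constant is an int subclass, existing code that passes or
compares the raw integer values would keep working.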

All the best,

Michael Foord


> Regards
> Antoine.


From solipsis at  Tue Nov 23 16:30:53 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 16:30:53 +0100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<>	<>
	<> <>
	<> <>
Message-ID: <1290526253.3642.9.camel@localhost.localdomain>

On Tuesday 23 November 2010 at 15:15 +0000, Michael Foord wrote:
> There are still two reasonable APIs (unless you have changed your mind 
> and think that sticking with plain integers is best), of which I prefer 
> the latter:
> SOME_CONST = Constant('SOME_CONST', 1)
> OTHER_CONST = Constant('OTHER_CONST', 2)
> or:
> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1)


Or:

Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST',
                           values=range(1, 3))

Again, auto-enumeration is useless since it's trivial to achieve
explicitly.


From fuzzyman at  Tue Nov 23 16:40:28 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 23 Nov 2010 15:40:28 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290526253.3642.9.camel@localhost.localdomain>
References: <>	<>	<>	<>	<>	<>
	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>
	<>	<>	<1290524466.3642.4.camel@localhost.localdomain>	<>
Message-ID: <>

On 23/11/2010 15:30, Antoine Pitrou wrote:
> On Tuesday 23 November 2010 at 15:15 +0000, Michael Foord wrote:
>> There are still two reasonable APIs (unless you have changed your mind
>> and think that sticking with plain integers is best), of which I prefer
>> the latter:
>> SOME_CONST = Constant('SOME_CONST', 1)
>> OTHER_CONST = Constant('OTHER_CONST', 2)
>> or:
>> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1)
> Or:
> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST',
>                             values=range(1, 3))
> Again, auto-enumeration is useless since it's trivial to achieve
> explicitly.

Ah, I see. It is the auto-enumeration you disliked. Sure - not a problem.

I think the step that Nick described, of evaluating places in the 
standard library that this could be used, is a good one. I'll try to get 
around to it and perhaps attempt to resuscitate the PEP. (Any 
suggestions as to an appropriate module if having it live in its own 
module is still an objection?)


> Regards
> Antoine.


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From solipsis at  Tue Nov 23 17:05:19 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 17:05:19 +0100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<>	<>
	<> <>
	<> <>
Message-ID: <1290528319.3642.11.camel@localhost.localdomain>

Le mardi 23 novembre 2010 ? 15:40 +0000, Michael Foord a ?crit :
> On 23/11/2010 15:30, Antoine Pitrou wrote:
> > On Tuesday 23 November 2010 at 15:15 +0000, Michael Foord wrote:
> >> There are still two reasonable APIs (unless you have changed your mind
> >> and think that sticking with plain integers is best), of which I prefer
> >> the latter:
> >>
> >> SOME_CONST = Constant('SOME_CONST', 1)
> >> OTHER_CONST = Constant('OTHER_CONST', 2)
> >>
> >> or:
> >>
> >> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1)
> > Or:
> >
> > Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST',
> >                             values=range(1, 3))
> >
> > Again, auto-enumeration is useless since it's trivial to achieve
> > explicitly.
> Ah, I see. It is the auto-enumeration you disliked. Sure - not a problem.
> I think the step that Nick described, of evaluating places in the 
> standard library that this could be used, is a good one. I'll try to get 
> around to it and perhaps attempt to resuscitate the PEP. (Any 
> suggestions as to an appropriate module if having it live in its own 
> module is still an objection?)

We already have a bunch of bizarrely unrelated stuff in collections
(such as Callable), so we could put enum there too.



From fuzzyman at  Tue Nov 23 17:07:30 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 23 Nov 2010 16:07:30 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290528319.3642.11.camel@localhost.localdomain>
References: <>	<>	<>	<>	<>	<>
	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>
	<>	<>	<1290524466.3642.4.camel@localhost.localdomain>	<>	<1290526253.3642.9.camel@localhost.localdomain>	<>
Message-ID: <>

On 23/11/2010 16:05, Antoine Pitrou wrote:
> On Tuesday 23 November 2010 at 15:40 +0000, Michael Foord wrote:
>> On 23/11/2010 15:30, Antoine Pitrou wrote:
>>> On Tuesday 23 November 2010 at 15:15 +0000, Michael Foord wrote:
>>>> There are still two reasonable APIs (unless you have changed your mind
>>>> and think that sticking with plain integers is best), of which I prefer
>>>> the latter:
>>>> SOME_CONST = Constant('SOME_CONST', 1)
>>>> OTHER_CONST = Constant('OTHER_CONST', 2)
>>>> or:
>>>> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1)
>>> Or:
>>> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST',
>>>                              values=range(1, 3))
>>> Again, auto-enumeration is useless since it's trivial to achieve
>>> explicitly.
>> Ah, I see. It is the auto-enumeration you disliked. Sure - not a problem.
>> I think the step that Nick described, of evaluating places in the
>> standard library that this could be used, is a good one. I'll try to get
>> around to it and perhaps attempt to resuscitate the PEP. (Any
>> suggestions as to an appropriate module if having it live in its own
>> module is still an objection?)
> We already have a bunch of bizarrely unrelated stuff in collections
> (such as Callable), so we could put enum there too.

I guess it creates collections of constants...


> Regards
> Antoine.



From Ben.Cottrell at  Tue Nov 23 16:37:43 2010
From: Ben.Cottrell at (Ben.Cottrell at
Date: Tue, 23 Nov 2010 07:37:43 -0800
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: Your message of "Tue, 23 Nov 2010 15:15:29 GMT."
Message-ID: <>

On Tue, 23 Nov 2010 15:15:29 +0000, Michael Foord wrote:
> There are still two reasonable APIs (unless you have changed your mind 
> and think that sticking with plain integers is best), of which I prefer 
> the latter:
> SOME_CONST = Constant('SOME_CONST', 1)
> OTHER_CONST = Constant('OTHER_CONST', 2)
> or:
> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1)

I prefer the latter too, because that makes it possible to have
'Constants' be a rendezvous point for making sure that you're
passing something valid. Perhaps using 'in':

def func(foo):
    if foo not in Constants:
        raise ValueError('foo must be SOME_CONST or OTHER_CONST')

I know this is probably not going to happen, but I would *so much*
like it if functions would start rejecting "the wrong kind of 2".
Constants that are valid, integer-wise, but which aren't part of
the set of constants allowed for that argument. I'd prefer not to
think of the number of times I've made the following mistake:

s = socket.socket(socket.SOCK_DGRAM, socket.AF_INET)


From turnbull at  Tue Nov 23 17:16:55 2010
From: turnbull at (Stephen J. Turnbull)
Date: Wed, 24 Nov 2010 01:16:55 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

"Martin v. Löwis" writes:

 > I disagree: Quoting from Unicode 5.0, section 5.4:
 > # The individual components of implementations may have different
 > # levels of support for surrogates, as long as those components are
 > # assembled and communicate correctly.

"Assembly" is the problem.  If chr() or a slice creates a lone
surrogate and surrogateescape passes it back out, Python as a whole is

Technically, you can hide behind "none of slicing, chr(), or
surrogateescape promises to conform", and maybe that would fly to a
standards lawyer; I'd have to see the precise statement.

Here's a more convincing example.  A user specifies "utf8" as her
locale charset.  Then she specifies a string containing a non-BMP
character as the "description" of a file, and internal code munges
this via slicing into a file name conforming to some specification
(eg, length limit + uniquifier if needed).  Then if the non-BMP
character is in the "right" place, she will get either a broken file
name, which will either get written to disk or raise an exception,
depending on whether the munging program has enabled surrogateescape
or not.
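
A rough illustration of the truncation problem. Current CPython no
longer has narrow builds, so this sketch emulates one by working on
UTF-16 code units directly; the description string, the length limit of
8, and the helper name are all made up for the example.

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


# A "description" that ends in a non-BMP character (U+1D11E).
description = "report-\U0001D11E"
units = utf16_code_units(description)

# Hypothetical length limit of 8 code units: the cut falls between the
# two halves of the surrogate pair.
truncated = units[:8]
file_name = "".join(chr(unit) for unit in truncated)

# The result ends in a lone high surrogate, which strict UTF-8 rejects.
try:
    file_name.encode("utf-8")
except UnicodeEncodeError as exc:
    print("cannot encode:", exc.reason)
```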

I claim both of those results are non-conforming to the specification
of UTF-16, and therefore Python Unicode processing as a whole must be
considered non-conforming.

It's still pretty damn good.  But I've elaborated that point

 > The rationale for supporting these characters in chr() goes back much
 > further than the surrogateescape handler - as Python unicode strings
 > are sequences of code points, it would be impractical if you couldn't
 > create some of them, or even would have to consult the UCD before
 > determining whether they can be created.

The Zen is irrelevant to determining conformance to Unicode, which has
its own Zen.

From stephen at  Tue Nov 23 17:18:57 2010
From: stephen at (Stephen J. Turnbull)
Date: Wed, 24 Nov 2010 01:18:57 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

Nick Coghlan writes:

 > For practical purposes, UCS2/UCS4 convey far more inherent information
 > than narrow/wide:

That was my stance, but in fact (1) the ISO JTC1/SC2 has deliberately
made them ambiguous by changing their definitions over the years[1],
and (2) the more recent definitions and "interpretations" of UCS-2
*prohibit* use of surrogates in UCS-2 as far as I can tell.  And
that's what you'll see everywhere you look, because Wikipedia and
friends pick up the most recent versions of everything.

 > So don't just think about "what will developers know?", also think
 > about "what will developers know, and what will a quick trip to a
 > search engine tell them?".

It will tell them that UCS-2 cannot even *express* non-BMP characters.
Terry and David are *not* dummies, and that's what they got from more
or less careful study of the issue.

 > And once you take that stance, the overly
 > generic narrow/wide terms fail, badly.

I still agree that something more accurate would be nice, but face it:
the ISO will redefine and deprecate such terms as soon as they notice
us using them.<wink>

 > +1 for MAL's suggested tweaks to the Py3k configure options.

Despite my natural sympathy for your arguments, and MAL's, I'm still
-1.  I really wish I could switch back, but it seems to me that
"UCS-2" is a liability we don't need, *especially* on Windows where
the default build is presumably going to be UCS2 forever.

[1]  You'd think it would be hard to change the definition of UCS-4,
but they managed. :-(

From fuzzyman at  Tue Nov 23 17:19:16 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 23 Nov 2010 16:19:16 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
Message-ID: <>

On 23/11/2010 15:37, Ben.Cottrell at wrote:
> On Tue, 23 Nov 2010 15:15:29 +0000, Michael Foord wrote:
>> There are still two reasonable APIs (unless you have changed your mind
>> and think that sticking with plain integers is best), of which I prefer
>> the latter:
>> SOME_CONST = Constant('SOME_CONST', 1)
>> OTHER_CONST = Constant('OTHER_CONST', 2)
>> or:
>> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1)
> I prefer the latter too, because that makes it possible to have
> 'Constants' be a rendezvous point for making sure that you're
> passing something valid. Perhaps using 'in':
> def func(foo):
>      if foo not in Constants:
>          raise ValueError('foo must be SOME_CONST or OTHER_CONST')
>      ...
> I know this is probably not going to happen, but I would *so much*
> like it if functions would start rejecting "the wrong kind of 2".
> Constants that are valid, integer-wise, but which aren't part of
> the set of constants allowed for that argument. I'd prefer not to
> think of the number of times I've made the following mistake:
> s = socket.socket(socket.SOCK_DGRAM, socket.AF_INET)

Well it would be perfectly possible for the __contains__ method (on the 
metaclass so that a Constants class can act as a container) to permit a 
*raw integer* (to be backwards compatible with code using hard coded 
values) but not permit other constants that aren't valid. Code that is 
*deliberately* using the wrong constants would be screwed of course...
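
A sketch of what that could look like, purely as an illustration. The
class names, the Constant helper, and the numeric values are assumptions
(the values are chosen so that two different constants share the value
2), not an existing API.

```python
class Constant(int):
    """Hypothetical named integer constant."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self.name = name
        return self

    def __repr__(self):
        return "<Constant %s: %d>" % (self.name, int(self))


class ConstantsMeta(type):
    """Metaclass so that the class object itself supports ``in``."""

    def __contains__(cls, item):
        members = [v for v in vars(cls).values() if isinstance(v, Constant)]
        if isinstance(item, Constant):
            # A named constant must be one of this group's own members.
            return any(item is member for member in members)
        # A raw integer is accepted if it matches a member's value.
        return any(item == member for member in members)


class AddressFamily(metaclass=ConstantsMeta):
    AF_INET = Constant("AF_INET", 2)
    AF_INET6 = Constant("AF_INET6", 10)


class SocketKind(metaclass=ConstantsMeta):
    SOCK_STREAM = Constant("SOCK_STREAM", 1)
    SOCK_DGRAM = Constant("SOCK_DGRAM", 2)


assert 2 in AddressFamily                         # raw integer still accepted
assert AddressFamily.AF_INET in AddressFamily     # the right kind of 2
assert SocketKind.SOCK_DGRAM not in AddressFamily # the wrong kind of 2
```

Note that SocketKind.SOCK_DGRAM == AddressFamily.AF_INET is still true
here, since both are ints; only the membership test tells them apart.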

All the best,

> 	~Ben



From barry at  Tue Nov 23 17:27:03 2010
From: barry at (Barry Warsaw)
Date: Tue, 23 Nov 2010 11:27:03 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <20101123112703.42b42812@mission>

On Nov 23, 2010, at 01:50 PM, Michael Foord wrote:

>Right. As it happens I just submitted a patch to Barry Warsaw's enum package
>(nice), flufl.enum [1], to allow namedtuple style creation of named
>constants:

Thanks for the plug (and the nice patch).

FWIW, the documentation for the package is here:

I made some explicit decisions about the API and semantics of this package, to
fit my own use cases and sensibilities.  I guess you wouldn't expect anything
else <wink>, but I'm willing to acknowledge that others would make different
decisions, and certainly the number of existing enum implementations out there
proves that there are lots of interesting ways to go about it.

That said, there are several things I like about my package:

* Enums are not subclassed from ints or strs.  They are a distinct data type
  that can be converted to and from ints and strs.  EIBTI.

* The typical way to create them is through a simple, but explicit class
  definition.  I personally like being explicit about the item values, and the
  assignments are required to make the metaclass work properly, but Michael's
  convenience patch is totally appropriate for cases where you don't care, or
  you want a one-liner.

* Enum items are singletons and are intended to be compared by identity.  They
  can be compared by equality but are not ordered.

* Enum items have an unambiguous symbolic repr and a nice human readable str.

* Given an enum item, you can get to its enum class, and given the class you
  can get to the set of items.

* Enums can be subclassed (though all items in the subclass must have unique
  values).

In any case it may be that enums are too tied to specific use cases to find a
good common ground for the stdlib.  I've been using my module for years and if
there's interest I would of course be happy to donate it for use in the
stdlib.  Like the original sets implementation, it makes perfect sense to
provide them in a separate module rather than as a built-in type.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: <>

From barry at  Tue Nov 23 17:31:27 2010
From: barry at (Barry Warsaw)
Date: Tue, 23 Nov 2010 11:31:27 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <20101123113127.78506cb5@mission>

On Nov 23, 2010, at 03:15 PM, Michael Foord wrote:

>(Well, there is a third option that takes __name__ and sets the constants in
>the module automagically. I can understand why people would dislike that
>though.)

Personally, I think if you want that, then the explicit class definition is a
better way to go.



From pje at  Tue Nov 23 17:52:37 2010
From: pje at (P.J. Eby)
Date: Tue, 23 Nov 2010 11:52:37 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <20101123113127.78506cb5@mission>
References: <>
	<> <>
Message-ID: <>

At 11:31 AM 11/23/2010 -0500, Barry Warsaw wrote:
>On Nov 23, 2010, at 03:15 PM, Michael Foord wrote:
> >(Well, there is a third option that takes __name__ and sets the constants in
> >the module automagically. I can understand why people would dislike that
> >though.)
>Personally, I think if you want that, then the explicit class definition is a
>better way to go.

This reminds me: a stdlib enum should support proper pickling and 
copying; i.e.:

    assert SomeEnum.anEnum is pickle.loads(pickle.dumps(SomeEnum.anEnum))

This could probably be implemented by adding something like:

    def __reduce__(self):
        return getattr, (self._class, self._enumname)

in the EnumValue class.
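
Spelled out as a self-contained sketch. The EnumValue constructor, the
_value attribute, and the way SomeEnum is populated are assumptions made
for illustration; only __reduce__ is taken from the snippet above.

```python
import copy
import pickle


class EnumValue:
    """Hypothetical enum item that knows its owning class and its name."""

    def __init__(self, cls, enumname, value):
        self._class = cls
        self._enumname = enumname
        self._value = value

    def __repr__(self):
        return "<EnumValue %s.%s>" % (self._class.__name__, self._enumname)

    def __reduce__(self):
        # Rebuild by looking the item up on its class, so unpickling and
        # copying return the existing singleton instead of a new object.
        return getattr, (self._class, self._enumname)


class SomeEnum:
    pass


SomeEnum.anEnum = EnumValue(SomeEnum, "anEnum", 1)

# copy and deepcopy both go through __reduce__ and return the same object.
assert copy.copy(SomeEnum.anEnum) is SomeEnum.anEnum
assert copy.deepcopy(SomeEnum.anEnum) is SomeEnum.anEnum


def roundtrip(obj):
    """Pickle and unpickle obj; needs the enum class to be importable."""
    return pickle.loads(pickle.dumps(obj))
```

With SomeEnum defined at module level, roundtrip(SomeEnum.anEnum) should
return the very same object, which is the identity check asserted above.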

From fuzzyman at  Tue Nov 23 18:02:33 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 23 Nov 2010 17:02:33 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <20101123112703.42b42812@mission>
References: <>	<>	<>	<>	<>
	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>
Message-ID: <>

On 23/11/2010 16:27, Barry Warsaw wrote:
> On Nov 23, 2010, at 01:50 PM, Michael Foord wrote:
>> Right. As it happens I just submitted a patch to Barry Warsaw's enum package
>> (nice), flufl.enum [1], to allow namedtuple style creation of named
>> constants:
> Thanks for the plug (and the nice patch).
> FWIW, the documentation for the package is here:
> I made some explicit decisions about the API and semantics of this package, to
> fit my own use cases and sensibilities.  I guess you wouldn't expect anything
> else<wink>, but I'm willing to acknowledge that others would make different
> decisions, and certainly the number of existing enum implementations out there
> proves that there are lots of interesting ways to go about it.
> That said, there are several things I like about my package:
> * Enums are not subclassed from ints or strs.  They are a distinct data type
>    that can be converted to and from ints and strs.  EIBTI.

But if we are to use it *in* the standard library (as opposed to merely 
adding a module *to* the standard library) there are backwards 
compatibility concerns. Where modules are already using integers for 
constants then integers still need to work.

One easy way to achieve this is to subclass integer. If we don't do that 
(assuming we decide that putting a solution in the standard library is 
appropriate) then we'll have to evaluate what we mean by backwards 
compatible. If the modules that use the constants aren't to change then 
comparing equal to the underlying value is the minimum (so that the 
original value can still be used in place of the new named constant). 
Not sure if you'd be happy to make that change in flufl.enum.

> * The typical way to create them is through a simple, but explicit class
>    definition.  I personally like being explicit about the item values, and the
>    assignments are required to make the metaclass work properly, but Michael's
>    convenience patch is totally appropriate for cases where you don't care, or
>    you want a one-liner.

If make_enum was to take a set of values to use (as Antoine suggested) I 
don't see what's un-explicit about it.

All the best,


> * Enum items are singletons and are intended to be compared by identity.  They
>    can be compared by equality but are not ordered.
> * Enum items have an unambiguous symbolic repr and a nice human readable str.
> * Given an enum item, you can get to its enum class, and given the class you
>    can get to the set of items.
> * Enums can be subclassed (though all items in the subclass must have unique
>    values).
> In any case it may be that enums are too tied to specific use cases to find a
> good common ground for the stdlib.  I've been using my module for years and if
> there's interest I would of course be happy to donate it for use in the
> stdlib.  Like the original sets implementation, it makes perfect sense to
> provide them in a separate module rather than as a built-in type.
> -Barry


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.


From solipsis at  Tue Nov 23 18:37:40 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 18:37:40 +0100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <1290533860.3642.73.camel@localhost.localdomain>

On Tuesday 23 November 2010 at 12:32 -0500, Isaac Morland wrote:
> On Tue, 23 Nov 2010, Antoine Pitrou wrote:
> > We already have a bunch of bizarrely unrelated stuff in collections
> > (such as Callable), so we could put enum there too.
> Why not just "enum" (i.e., "from enum import [...]" or "import 
> enum.[...]")?  Enumerations are one of the basic kinds of types overall 
> (speaking informally and independent of any specific language) - they 
> aren't at all exotic.

Enumerations aren't a type at all (they have no distinguishing
property).

> And "Flat is better than nested", after all.

Not when it means creating a separate module for every micro-feature.



From ijmorlan at  Tue Nov 23 18:32:15 2010
From: ijmorlan at (Isaac Morland)
Date: Tue, 23 Nov 2010 12:32:15 -0500 (EST)
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290528319.3642.11.camel@localhost.localdomain>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Tue, 23 Nov 2010, Antoine Pitrou wrote:

> We already have a bunch of bizarrely unrelated stuff in collections
> (such as Callable), so we could put enum there too.

Why not just "enum" (i.e., "from enum import [...]" or "import 
enum.[...]")?  Enumerations are one of the basic kinds of types overall 
(speaking informally and independent of any specific language) - they 
aren't at all exotic.  And "Flat is better than nested", after all.

Isaac Morland			CSCF Web Guru
DC 2554C, x36650		WWW Software Specialist

From ijmorlan at  Tue Nov 23 18:50:31 2010
From: ijmorlan at (Isaac Morland)
Date: Tue, 23 Nov 2010 12:50:31 -0500 (EST)
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290533860.3642.73.camel@localhost.localdomain>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Tue, 23 Nov 2010, Antoine Pitrou wrote:

> On Tuesday 23 November 2010 at 12:32 -0500, Isaac Morland wrote:
>> On Tue, 23 Nov 2010, Antoine Pitrou wrote:
>>> We already have a bunch of bizarrely unrelated stuff in collections
>>> (such as Callable), so we could put enum there too.
>> Why not just "enum" (i.e., "from enum import [...]" or "import
>> enum.[...]")?  Enumerations are one of the basic kinds of types overall
>> (speaking informally and independent of any specific language) - they
>> aren't at all exotic.
> Enumerations aren't a type at all (they have no distinguishing
> property).

Each enumeration is a type (well, OK, not in every language, presumably, 
but certainly in many languages).  The word "basic" is more important than 
"types" in my sentence - the point is that an enumeration capability is a 
very common one in a type system, and is very general, not specific to any 
particular application.

>> And "Flat is better than nested", after all.
> Not when it means creating a separate module for every micro-feature.

Classes have their own keyword.  I don't think it's disproportionate to 
give enums a top-level module name.

Having said that, I understand we're trying to have a not-too-flat module 
namespace and I can see the sense in putting it in "collections".  But I 
think the idea that enumerations are of very wide applicability and hence 
deserve a shorter name should be seriously considered.

I'll leave it at that, except for:

Hey, how about this syntax:

enum Colors:
 	red = 0
 	green = 10
 	blue

(blue gets the value 11)


Isaac Morland			CSCF Web Guru
DC 2554C, x36650		WWW Software Specialist

From fdrake at  Tue Nov 23 18:57:20 2010
From: fdrake at (Fred Drake)
Date: Tue, 23 Nov 2010 12:57:20 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290533860.3642.73.camel@localhost.localdomain>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Tue, Nov 23, 2010 at 12:37 PM, Antoine Pitrou <solipsis at> wrote:
> Enumerations aren't a type at all (they have no distinguishing
> property).

In any given language, this may be true, or not.  Whether they should
be distinct in Python is core to the current discussion.

From a backward-compatibility perspective, what makes sense depends on
whether they're used to implement existing constants (socket.AF_INET,
etc.) or if they are reserved for new features only.

  -Fred

Fred L. Drake, Jr.    <fdrake at>
"A storm broke loose in my mind."  --Albert Einstein

From solipsis at  Tue Nov 23 19:06:42 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 19:06:42 +0100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <1290535602.3642.87.camel@localhost.localdomain>

On Tuesday 23 November 2010 at 12:57 -0500, Fred Drake wrote:
> On Tue, Nov 23, 2010 at 12:37 PM, Antoine Pitrou <solipsis at> wrote:
> > Enumerations aren't a type at all (they have no distinguishing
> > property).
> In any given language, this may be true, or not.  Whether they should
> be distinct in Python is core to the current discussion.

I meant "type" in the structural sense (hence the parenthesis). enums
are just auto-generated constants. Since Python makes it trivial to
generate sequential integers, there's no need for a specific "enum"
construct.

Now you may argue that enums should be strongly-typed, but that would be
a bit backwards given Python's preference for duck-typing.
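
As an illustration of the "sequential integers" point above, here is a
minimal sketch; the constant names and the COLOR_NAMES table are made up
for the example. It also shows what the plain-integer approach leaves to
the programmer: a name only appears in output if a reverse table is
maintained by hand.

```python
# Three "enum" constants as plain sequential integers.
RED, GREEN, BLUE = range(3)

# Hand-written reverse mapping, needed only if names are wanted in output.
COLOR_NAMES = {RED: "RED", GREEN: "GREEN", BLUE: "BLUE"}

print(GREEN)               # the bare integer
print(COLOR_NAMES[GREEN])  # the name, via the hand-written table
```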

> From a backward-compatibility perspective, what makes sense depends on
> whether they're used to implement existing constants (socket.AF_INET,
> etc.) or if they are reserved for new features only.

It's not only backwards compatibility. New features relying on C APIs
have to be able to map constants to the integers used in the C library.
It would be much better if this were done naturally rather than through
explicit conversion maps.
(this really means subclassing int, if we don't want to complicate
C-level code)
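
A small sketch of the difference between "naturally" and "through explicit
conversion", using struct.pack as a stand-in for C-level code that expects
an integer. OpaqueConstant and IntConstant are hypothetical classes written
for this example only.

```python
import struct


class OpaqueConstant:
    """Hypothetical constant that is NOT an int; conversion is explicit."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __int__(self):
        return self.value


class IntConstant(int):
    """Hypothetical constant that IS an int, with a name attached."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self.name = name
        return self


opaque = OpaqueConstant("SOME_FLAG", 4)
natural = IntConstant("SOME_FLAG", 4)

# The int subclass is accepted directly by code that wants an integer.
packed = struct.pack("<i", natural)

# The opaque object is rejected unless the caller converts it first.
try:
    struct.pack("<i", opaque)
    opaque_accepted = True
except (struct.error, TypeError):
    opaque_accepted = False

packed_explicit = struct.pack("<i", int(opaque))
```

With the opaque type, every call site (or a conversion map) has to apply
int() before the value reaches the C layer.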



From solipsis at  Tue Nov 23 19:07:56 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 19:07:56 +0100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <1290535676.3642.89.camel@localhost.localdomain>

On Tuesday 23 November 2010 at 12:50 -0500, Isaac Morland wrote:
> Each enumeration is a type (well, OK, not in every language, presumably, 
> but certainly in many languages).  The word "basic" is more important than 
> "types" in my sentence - the point is that an enumeration capability is a 
> very common one in a type system, and is very general, not specific to any 
> particular application.

Python already has an enumeration capability. It's called range().
There's nothing else that C enums have. AFAICT, neither do enums in
other mainstream languages (assuming they even exist; I don't remember
Perl, PHP or Javascript having anything like that, but perhaps I'm



From v+python at  Tue Nov 23 19:56:20 2010
From: v+python at (Glenn Linderman)
Date: Tue, 23 Nov 2010 10:56:20 -0800
Subject: [Python-Dev] is this a bug? no environment variables
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

On 11/23/2010 3:55 AM, "Martin v. L?wis" wrote:
> Am 23.11.2010 11:55, schrieb Amaury Forgeot d'Arc:
>> Hi,
>> 2010/11/23 Glenn Linderman<v+python at>:
>>>    File "C:\Python32\lib\", line 108, in seed
>>>      a = int.from_bytes(_urandom(32), 'big')
>>> WindowsError: [Error -2146893818] Invalid Signature
>> In the subprocess documentation
>> """On Windows, in order to run a side-by-side assembly the specified
>> env *must* include a valid SystemRoot."""
> Indeed, setting SystemRoot might solve this problem. According to
> CryptoAPI, in Windows 7, requires this variable be set. Failure to
> find the enhanced crypto provider would explain why the "random"
> module of Python fails to work.
> The specific cause is in the registry:
> HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Cryptography\Defaults\Provider\Microsoft
> Strong Cryptographic Provider has as its ImagePath value
> %SystemRoot%\system32\rsaenh.dll
> So the registry (and COM) do rely on environment variables.
> Regards,
> Martin

I find it sad but hilarious that after working so hard to remove the 
need for environment variables from Windows that M$ has introduced new 
dependencies on them.

I wonder if this particular registry variable is simply an oversight/bug 
on M$' part, that they will eventually fix, or if it is a turnaround toward
the use of more environment variables in the future.  Hmm.  Time will 
tell, I suppose.  I'm unaware of any benefits in _changing_ SystemRoot 
to other values, so not pre-expanding it in that registry location seems 
only to add an unnecessary dependency on the environment.

Indeed, preserving that one environment variable allows my version of 
http.server to proceed with, as far as initial testing can determine, 
proper behavior.  Thanks for your help in figuring this out.  That was a 
lot faster than a "binary search" to choose which variable(s) to preserve.

My purpose in such testing was two-fold: firstly, web servers, for 
security purposes, generally limit the number of environment variables 
that are seen by CGI programs, and secondly, in debugging whether or not 
http.server was properly setting the necessary environment variables, 
the many other environment variables were cluttering up log dumps of all 
environment variables.  It will be nicer to limit the "passed through" 
environment variables to SystemRoot, and see how things go.
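
A minimal sketch of limiting the "passed through" environment to
SystemRoot when starting a child process. The helper names minimal_env and
run_child are hypothetical and are not taken from http.server; only
subprocess.run and its env parameter are real library features.

```python
import os
import subprocess
import sys


def minimal_env(source=None, keep=("SystemRoot",)):
    """Return a new dict holding only the variables named in keep."""
    if source is None:
        source = os.environ
    return {name: source[name] for name in keep if name in source}


def run_child(args):
    """Run a child Python process with the stripped-down environment."""
    return subprocess.run(
        [sys.executable] + list(args),
        env=minimal_env(),
        capture_output=True,
    )
```

On systems without a SystemRoot variable, minimal_env() simply returns an
empty dict. Further names can be added through the keep parameter if
testing shows they are needed.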

I have read some about side-by-side assemblies but had considered them a 
good reason to stick with the outdated M$VC 6.0 compiler, which doesn't 
seem to need to create them, and their myriad requirements, which seem 
far from necessary for simply compiling a program.  I was disappointed 
to realize that Python was heading down the path of using the newer 
tools that create side-by-side assemblies, but I suppose using an old 
and crufty compiler like M$VC 6.0 cannot support some of the newer 
features of Windows, which may seem to be necessary to some.... like 
64-bit support, which does seem necessary, even to me.

I was well aware that shortcuts and the registry _may_ refer to 
environment variables, and have a number of environment variables of my 
own which leverage that capability, to avoid hard-coded drive letters 
and paths in certain areas, and for the convenience of shortening the
specification of some of the long-winded path names that Windows foists 
upon us (some of those have been significantly shortened in Windows 6.1, 
and maybe 6.0 which I used only for 2 months with disgust; 6.1 has 
helped alleviate the disgust, but I still recommend XP for people that 
don't need 64-bit capabilities).


From v+python at  Tue Nov 23 19:58:37 2010
From: v+python at (Glenn Linderman)
Date: Tue, 23 Nov 2010 10:58:37 -0800
Subject: [Python-Dev] is this a bug? no environment variables
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

On 11/22/2010 2:56 PM, Tim Lesher wrote:
> On Mon, Nov 22, 2010 at 16:54, Glenn Linderman<v+python at>  wrote:
>> I suppose it is possible that some environment variables are used by Python
>> directly (but I can't seem to find a documented list of them) although I
>> would expect that usage to be optional, with fall-back defaults when they
>> don't exist.
> I can verify that that's the case: Python (at least through 3.1.2)
> runs fine on Windows platforms when environment variables are
> completely unavailable.  I know that from running our port for Windows
> CE (which has no environment variables at all), cross-compiled for
> Windows XP.

Is the Windows CE port generally available?  From where?  The CE ports I 
have found in past searches seem to have been quite outdated and not 
much on-going activity.

From alexander.belopolsky at  Tue Nov 23 20:11:06 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 23 Nov 2010 14:11:06 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Mon, Nov 22, 2010 at 1:13 PM, Raymond Hettinger
<raymond.hettinger at> wrote:
> Any explanation we give users needs to let them know two things:
> * that we cover the entire range of unicode not just BMP
> * that sometimes len(chr(i)) is one and sometimes two

This discussion motivated me to start looking into how well Python
library itself is prepared to deal with len(chr(i)) = 2.  I was not
surprised to find that textwrap does not handle the issue that well:

>>> len(wrap(' \U00010140' * 80, 20))
12
>>> len(wrap(' \U00000140' * 80, 20))
8

That module should probably be rewritten to properly implement  the
Unicode line breaking algorithm

Yet finding a bug in a str object method after a 5 min review was a
bit discouraging:

>>> 'xyz'.center(20, '\U00010140')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: The fill character must be exactly one character long

Given the apparent difficulty of writing even basic text processing
algorithms in presence of surrogate pairs, I wonder how wise it is to
expose Python users to them.  As Wikipedia explains, [1]

Because the most commonly used characters are all in the Basic
Multilingual Plane, converting between surrogate pairs and the
original values is often not tested thoroughly. This leads to
persistent bugs, and potential security holes, even in popular and
well-reviewed application software.

Since UCS-2 (the Character Encoding Form (CEF)) is now defined [1] to
cover only BMP, maybe rather than changing the terms used in the
reference manual, we should tighten the code to conform to the updated
standards?

Again, given that the str object itself has at least one non-BMP
character bug as we are closing on the third major release of py3k,
how likely are 3rd party developers to get their libraries right as
they port to 3.x?


From amauryfa at  Tue Nov 23 20:19:28 2010
From: amauryfa at (Amaury Forgeot d'Arc)
Date: Tue, 23 Nov 2010 20:19:28 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

2010/11/23 Alexander Belopolsky <alexander.belopolsky at>:
> This discussion motivated me to start looking into how well Python
> library itself is prepared to deal with len(chr(i)) = 2.  I was not
> surprised to find that textwrap does not handle the issue that well:
>>>> len(wrap(' \U00010140' * 80, 20))
> 12
>>>> len(wrap(' \U00000140' * 80, 20))
> 8
> That module should probably be rewritten to properly implement  the
> Unicode line breaking algorithm
> <>.
> Yet finding a bug in a str object method after a 5 min review was a
> bit discouraging:
>>>> 'xyz'.center(20, '\U00010140')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: The fill character must be exactly one character long
> Given the apparent difficulty of writing even basic text processing
> algorithms in presence of surrogate pairs, I wonder how wise it is to
> expose Python users to them.

This was already discussed two years ago:

So yes, wrap() and center() should be fixed.

Amaury Forgeot d'Arc

From janssen at  Tue Nov 23 20:26:57 2010
From: janssen at (Bill Janssen)
Date: Tue, 23 Nov 2010 11:26:57 PST
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Isaac Morland <ijmorlan at> wrote:

> On Tue, 23 Nov 2010, Antoine Pitrou wrote:
> > On Tuesday 23 November 2010 at 12:32 -0500, Isaac Morland wrote:
> >> On Tue, 23 Nov 2010, Antoine Pitrou wrote:
> >>
> >>> We already have a bunch of bizarrely unrelated stuff in collections
> >>> (such as Callable), so we could put enum there too.
> >>
> >> Why not just "enum" (i.e., "from enum import [...]" or "import
> >> enum.[...]")?  Enumerations are one of the basic kinds of types overall
> >> (speaking informally and independent of any specific language) - they
> >> aren't at all exotic.
> >
> > Enumerations aren't a type at all (they have no distinguishing
> > property).

Not in C, but in some other languages.

> Each enumeration is a type (well, OK, not in every language,
> presumably, but certainly in many languages).

The main purpose of that is to be able to catch type mismatches with
static typing, though.  Seems kind of pointless for Python.

> Classes have their own keyword.  I don't think it's disproportionate
> to give enums a top-level module name.

I do.

> Hey, how about this syntax:
> enum Colors:
> 	red = 0
> 	green = 10
> 	blue

Why not

  class Color:
     red = (255, 0, 0)
     green = (0, 255, 0)
     blue = (0, 0, 255)

Seems to handle the situation OK.


From mal at  Tue Nov 23 20:31:37 2010
From: mal at (M.-A. Lemburg)
Date: Tue, 23 Nov 2010 20:31:37 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>
	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Alexander Belopolsky wrote:
> On Mon, Nov 22, 2010 at 1:13 PM, Raymond Hettinger
> <raymond.hettinger at> wrote:
> ..
>> Any explanation we give users needs to let them know two things:
>> * that we cover the entire range of unicode not just BMP
>> * that sometimes len(chr(i)) is one and sometimes two
> This discussion motivated me to start looking into how well Python
> library itself is prepared to deal with len(chr(i)) = 2.  I was not
> surprised to find that textwrap does not handle the issue that well:
>>>> len(wrap(' \U00010140' * 80, 20))
> 12
>>>> len(wrap(' \U00000140' * 80, 20))
> 8
> That module should probably be rewritten to properly implement  the
> Unicode line breaking algorithm
> <>.
> Yet finding a bug in a str object method after a 5 min review was a
> bit discouraging:
>>>> 'xyz'.center(20, '\U00010140')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: The fill character must be exactly one character long
> Given the apparent difficulty of writing even basic text processing
> algorithms in presence of surrogate pairs, I wonder how wise it is to
> expose Python users to them. 

What's the alternative ?

Without surrogates, Python users with UCS-2 build (e.g. the Windows
Python users) would not be allowed to play with non-BMP code points.

IMHO, it's better to fix the stdlib. This is a long process, as you
can see with the Python3 stdlib evolution, but Python will eventually
get there.

> As Wikipedia explains, [1]
> """
> Because the most commonly used characters are all in the Basic
> Multilingual Plane, converting between surrogate pairs and the
> original values is often not tested thoroughly. This leads to
> persistent bugs, and potential security holes, even in popular and
> well-reviewed application software.
> """
> Since UCS-2 (the Character Encoding Form (CEF)) is now defined [1] to
> cover only BMP, maybe rather than changing the terms used in the
> reference manual, we should tighten the code to conform to the updated
> standards?

Can we please stop turning this around over and over again :-)
UCS-2 has never supported anything other than the BMP. However,
you can interpret sequences of UCS-2 code unit as UTF-16 and
then get access to the full Unicode character set. We've been
doing this in codecs ever since UCS-4 builds were introduced
some 8-9 years ago.
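
The relationship between a non-BMP code point and its UTF-16 code units
can be sketched as below. The function name to_surrogate_pair is made up
for this example. The sketch works at the encoding level so that it does
not depend on the build: on interpreters where str is indexed by code
point (CPython 3.3 and later), len(chr(0x10140)) is 1, while the UTF-16
form still takes two code units.

```python
def to_surrogate_pair(code_point):
    """Split a non-BMP code point into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is inside the BMP or out of range")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


high, low = to_surrogate_pair(0x10140)

# The codec produces the same two 16-bit code units.
utf16 = "\U00010140".encode("utf-16-be")
assert utf16 == high.to_bytes(2, "big") + low.to_bytes(2, "big")

# Two code units for the non-BMP character, one for a BMP character.
assert len(utf16) // 2 == 2
assert len("\u0140".encode("utf-16-be")) // 2 == 1
```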

The change to have chr(i) return surrogates on UCS-2 builds
was perhaps done too early, but then, without such changes you'd
never notice that your code doesn't work well with surrogates.
It's just one piece of the puzzle when going from 8-bit strings
to Unicode.

> Again, given that the str object itself has at least one non-BMP
> character bug as we are closing on the third major release of py3k,
> how likely are 3rd party developers to get their libraries right as
> they port to 3.x?
> [1]
> [2]

Marc-Andre Lemburg

Professional Python Services directly from the Source  (#1, Nov 23 2010)
>>> Python/Zope Consulting and Support ...
>>> mxODBC.Zope.Database.Adapter ...   
>>> mxODBC, mxDateTime, mxTextTools ...

::: Try our new mxODBC.Connect Python Database Interface for free ! :::: Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From guido at  Tue Nov 23 20:34:17 2010
From: guido at (Guido van Rossum)
Date: Tue, 23 Nov 2010 11:34:17 -0800
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290535602.3642.87.camel@localhost.localdomain>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Tue, Nov 23, 2010 at 10:06 AM, Antoine Pitrou <solipsis at> wrote:
> On Tuesday 23 November 2010 at 12:57 -0500, Fred Drake wrote:
>> On Tue, Nov 23, 2010 at 12:37 PM, Antoine Pitrou <solipsis at> wrote:
>> > Enumerations aren't a type at all (they have no distinguishing
>> > property).
>> In any given language, this may be true, or not.  Whether they should
>> be distinct in Python is core to the current discussion.
> I meant "type" in the structural sense (hence the parenthesis). enums
> are just auto-generated constants. Since Python makes it trivial to
> generate sequential integers, there's no need for a specific "enum"
> construct.
> Now you may argue that enums should be strongly-typed, but that would be
> a bit backwards given Python's preference for duck-typing.

Please take a step back.

The best example of the utility of enums even for Python is bool. I
resisted this for the longest time but people kept asking for it. Some
properties of bool:

(a) bool is a (final) subclass of int, and an int is acceptable in a
pinch where a bool is expected
(b) bool values are guaranteed unique -- there is only one instance
with value True, and only one with value False
(c) bool values have a str() and repr() that shows their name instead
of their value (but not their class -- that's rarely an issue, and
makes the output more compact)
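
The three properties listed above can be checked directly in the
interpreter; this is just a demonstration of existing bool behaviour, with
the variable name bool_is_final invented for the example.

```python
# (a) bool is a subclass of int, and behaves as one in arithmetic.
assert issubclass(bool, int)
assert True + True == 2
assert [10, 20][True] == 20

# (a, "final") bool itself cannot be subclassed.
try:
    class MyBool(bool):
        pass
    bool_is_final = False
except TypeError:
    bool_is_final = True

# (b) values are unique: conversion always returns the same two objects.
assert bool(1) is True
assert bool("x") is True
assert bool(0) is False

# (c) str() and repr() show the name, not the underlying value.
assert str(True) == "True"
assert repr(False) == "False"
assert str(int(True)) == "1"
```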

I think it makes sense to add a way to the stdlib to add other types
like bool. I think (c) is probably the most important feature,
followed by (a) -- except the *final* part: I want to subclass enums.
(b) is probably easy to do but I don't think it matters that much in

>> From a backward-compatibility perspective, what makes sense depends on
>> whether they're used to implement existing constants (socket.AF_INET,
>> etc.) or if they are reserved for new features only.
> It's not only backwards compatibility. New features relying on C APIs
> have to be able to map constants to the integers used in the C library.
> It would be much better if this were done naturally rather than through
> explicit conversion maps.

I'm not sure what you mean here. Can you give an example of what you
mean? I agree that it should be possible to make pretty much any
constant in the OS modules enums -- even if the values vary across

> (this really means subclassing int, if we don't want to complicate
> C-level code)


FWIW I don't think I'm particular about the exact API to construct a
new enum type in Python code; I think in most cases explicitly
assigning values is fine. Often the values are constrained by
something external anyway; it should be easy to dynamically set the
values of a particular enum type (even add new values after the fact).
There might also be enums with the same value (even though the mapping
from int to enum will then have to pick one).

I expect that the API to convert between enums and bare ints should be
i = int(e) and e = <enumclass>(i). It would be nice if s = str(e) and
e = <enumclass>(s) would work too.
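
One way the conversions described above could look, as a sketch only. The
NamedInt base class, its define() method and the Color example are all
hypothetical; no existing package is being described. The sketch assumes
an int subclass, values added dynamically, and "first definition wins"
when two names share a value.

```python
class NamedInt(int):
    """Hypothetical base class for bool-like named integer constants."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._by_value = {}   # int -> item (first definition wins)
        cls._by_name = {}    # str -> item

    def __new__(cls, key):
        # Calling the class is a lookup: cls(i) or cls(s).
        table = cls._by_name if isinstance(key, str) else cls._by_value
        try:
            return table[key]
        except KeyError:
            raise ValueError("%r is not a valid %s" % (key, cls.__name__)) from None

    @classmethod
    def define(cls, name, value):
        """Add a new named value; may be called after class creation."""
        self = int.__new__(cls, value)
        self._name = name
        cls._by_name[name] = self
        cls._by_value.setdefault(value, self)
        setattr(cls, name, self)
        return self

    def __repr__(self):
        return self._name

    __str__ = __repr__


class Color(NamedInt):
    pass


Color.define("red", 0)
Color.define("green", 10)
Color.define("crimson", 0)   # same value as red

assert int(Color.green) == 10             # i = int(e)
assert Color(10) is Color.green           # e = <enumclass>(i)
assert str(Color.green) == "green"        # s = str(e)
assert Color("green") is Color.green      # e = <enumclass>(s)
assert Color(0) is Color.red              # duplicate value: first one picked
```

Because the items are ints, Color.crimson == Color.red is true even though
they are distinct objects with different names.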

--Guido van Rossum (

From barry at  Tue Nov 23 20:40:45 2010
From: barry at (Barry Warsaw)
Date: Tue, 23 Nov 2010 14:40:45 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <20101123144045.17b00ac4@mission>

On Nov 23, 2010, at 12:57 PM, Fred Drake wrote:

>From a backward-compatibility perspective, what makes sense depends on
>whether they're used to implement existing constants (socket.AF_INET,
>etc.) or if they are reserved for new features only.

As is usually the case, there's little reason to change existing working code.
Enums can be used whenever a module or API is updated.


From barry at  Tue Nov 23 20:47:47 2010
From: barry at (Barry Warsaw)
Date: Tue, 23 Nov 2010 14:47:47 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <20101123144747.44a2f4c9@mission>

On Nov 23, 2010, at 05:02 PM, Michael Foord wrote:

>> * Enums are not subclassed from ints or strs.  They are a distinct data type
>>    that can be converted to and from ints and strs.  EIBTI.
>But if we are to use it *in* the standard library (as opposed to merely
>adding a module *to* the standard library) there are backwards compatibility
>concerns. Where modules are already using integers for constants then
>integers still need to work.

Is int(enum_value) enough, or must the enum value actually *be* an int?

>One easy way to achieve this is to subclass integer. If we don't do that
>(assuming we decide that putting a solution in the standard library is
>appropriate) then we'll have to evaluate what we mean by backwards
>compatible. If the modules that use the constants aren't to change then
>comparing equal to the underlying value is the minimum (so that the original
>value can still be used in place of the new named constant). Not sure if
>you'd be happy to make that change in flufl.enum.

I'm not sure either.  In flufl.enum enum_class(i) also works as expected.

>> * The typical way to create them is through a simple, but explicit class
>>    definition.  I personally like being explicit about the item values, and
>>    the assignments are required to make the metaclass work properly, but
>>    Michael's convenience patch is totally appropriate for cases where you
>>    don't care, or you want a one-liner.
>If make_enum was to take a set of values to use (as Antoine suggested) I
>don't see what's un-explicit about it.

When I saw your patch I immediately thought that I could add a default
argument that was something like `int_iter`, i.e. an iterator of integers for
the values in the string.  I suspect YAGNI, which is why I didn't just add it,
but I'm not totally opposed to it.
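
A sketch of what such an `int_iter` argument might look like. This
make_enum is a hypothetical stand-alone function written for illustration;
it is not the flufl.enum implementation, and the default of counting from
1 is an assumption.

```python
import itertools


def make_enum(name, names, int_iter=None):
    """Hypothetical helper: build a class of named integer constants.

    names is a whitespace-separated string; int_iter supplies the values.
    """
    if int_iter is None:
        int_iter = itertools.count(1)   # assumed default: 1, 2, 3, ...
    members = dict(zip(names.split(), int_iter))
    return type(name, (object,), members)


# Default numbering.
Animals = make_enum("Animals", "ant bee cat")

# Caller-supplied values, here powers of two.
Flags = make_enum("Flags", "read write execute",
                  int_iter=(1 << n for n in range(3)))
```

Note that zip() stops at the shorter input, so an int_iter that runs out
early would silently drop names; a real implementation would probably want
to check for that.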


From barry at  Tue Nov 23 21:01:02 2010
From: barry at (Barry Warsaw)
Date: Tue, 23 Nov 2010 15:01:02 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <20101123150102.75f6256c@mission>

On Nov 23, 2010, at 11:52 AM, P.J. Eby wrote:

>This reminds me: a stdlib enum should support proper pickling and copying;
>    assert SomeEnum.anEnum is pickle.loads(pickle.dumps(SomeEnum.anEnum))
>This could probably be implemented by adding something like:
>    def __reduce__(self):
>        return getattr, (self._class, self._enumname)
>in the EnumValue class.

Excellent idea, thanks.  Added to flufl.enum in r38.  However, only enums
created with the class syntax can be pickled.
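
The __reduce__ suggestion quoted above can be sketched as follows. The
EnumValue and SomeEnum classes here are simplified stand-ins that reuse
the attribute names from the quoted snippet; they are not the flufl.enum
classes.

```python
import copy
import pickle


class EnumValue:
    """Hypothetical enum item that knows its owning class and its name."""

    def __init__(self, cls, name, value):
        self._class = cls
        self._enumname = name
        self._value = value

    def __repr__(self):
        return "<%s.%s>" % (self._class.__name__, self._enumname)

    def __reduce__(self):
        # Recreate the item by attribute lookup on its class, so that
        # unpickling or copying returns the existing singleton.
        return getattr, (self._class, self._enumname)


class SomeEnum:
    """Stand-in for an enum class; its items are attached below."""


SomeEnum.anEnum = EnumValue(SomeEnum, "anEnum", 1)


def pickle_roundtrip(item):
    """Pickle and unpickle; requires the enum class to be importable."""
    return pickle.loads(pickle.dumps(item))


# copy and deepcopy go through __reduce__ as well.
assert copy.copy(SomeEnum.anEnum) is SomeEnum.anEnum
assert copy.deepcopy(SomeEnum.anEnum) is SomeEnum.anEnum
```

When SomeEnum is defined at module level (or in a script run as __main__),
pickle_roundtrip(SomeEnum.anEnum) should return the same object too.
pickle stores the class by reference to its module-level name, so a class
that cannot be looked up that way cannot be pickled with this approach.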


From guido at  Tue Nov 23 21:00:51 2010
From: guido at (Guido van Rossum)
Date: Tue, 23 Nov 2010 12:00:51 -0800
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <20101123144747.44a2f4c9@mission>
References: <>
	<> <>
	<> <20101123112703.42b42812@mission>
	<> <20101123144747.44a2f4c9@mission>
Message-ID: <>

On Tue, Nov 23, 2010 at 11:47 AM, Barry Warsaw <barry at> wrote:
> On Nov 23, 2010, at 05:02 PM, Michael Foord wrote:
>>> * Enums are not subclassed from ints or strs.  They are a distinct data type
>>>    that can be converted to and from ints and strs.  EIBTI.
>>But if we are to use it *in* the standard library (as opposed to merely
>>adding a module *to* the standard library) there are backwards compatibility
>>concerns. Where modules are already using integers for constants then
>>integers still need to work.
> Is int(enum_value) enough, or must the enum value actually *be* an int?

I vote for *be*, following bool's example.

>>One easy way to achieve this is to subclass integer. If we don't do that
>>(assuming we decide that putting a solution in the standard library is
>>appropriate) then we'll have to evaluate what we mean by backwards
>>compatible. If the modules that use the constants aren't to change then
>>comparing equal to the underlying value is the minimum (so that the original
>>value can still be used in place of the new named constant). Not sure if
>>you'd be happy to make that change in flufl.enum.
> I'm not sure either.  In flufl.enum enum_class(i) also works as expected.
>>> * The typical way to create them is through a simple, but explicit class
>>>    definition.  I personally like being explicit about the item values, and
>>>    the assignments are required to make the metaclass work properly, but
>>>    Michael's convenience patch is totally appropriate for cases where you
>>>    don't care, or you want a one-liner.
>>If make_enum was to take a set of values to use (as Antoine suggested) I
>>don't see what's un-explicit about it.
> When I saw your patch I immediately thought that I could add a default
> argument that was something like `int_iter`, i.e. an iterator of integers for
> the values in the string.  I suspect YAGNI, which is why I didn't just add it,
> but I'm not totally opposed to it.
> -Barry

--Guido van Rossum (

From jcea at  Tue Nov 23 21:33:02 2010
From: jcea at (Jesus Cea)
Date: Tue, 23 Nov 2010 21:33:02 +0100
Subject: [Python-Dev] Sporadic problems with
Message-ID: <>

This happened to me last Sunday, and it is happening again right now.

I can access just fine, but trying to post a
message, open a new bug, change nosy, etc., takes a LONG time (minutes)
and it is finally failing with a "400 Bad Request" error:

Bad Request

Your browser sent a request that this server could not understand.
Apache/2.2.9 (Debian) mod_python/3.3.1 Python/2.5.2 mod_ssl/2.2.9
OpenSSL/0.9.8g mod_wsgi/2.5 Server at Port 80

Last Sunday I was able to open the bug after a while. Today I have been
retrying for a while, with no luck yet.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From martin at  Tue Nov 23 21:33:19 2010
From: martin at ("Martin v. Löwis")
Date: Tue, 23 Nov 2010 21:33:19 +0100
Subject: [Python-Dev] is this a bug? no environment variables
In-Reply-To: <>
References: <>	<>	<>
	<> <>
Message-ID: <>

> I have read some about side-by-side assemblies but had considered them a
> good reason to stick with the outdated M$VC 6.0 compiler, which doesn't
> seem to need to create them, and their myriad requirements, which seem
> far from necessary for simply compiling a program.  I was disappointed
> to realize that Python was heading down the path of using the newer
> tools that create side-by-side assemblies, but I suppose using an old
> and crufty compiler like M$VC 6.0 cannot support some of the newer
> features of Windows, which may seem to be necessary to some.... like
> 64-bit support, which does seem necessary, even to me.

The rationale for moving along with the releases is different, though:
you cannot obtain the old versions anymore, except perhaps on Ebay.
So new developers coming to Python would not be able to build Python
extensions if we didn't always try to use a compiler that is still
available (and we are stressing that a little bit: 3.2 will use
VS 2008, even though it has already been superseded).

In any case, VS 2010 will stop using SxS for the CRT.


From v+python at  Tue Nov 23 21:42:40 2010
From: v+python at (Glenn Linderman)
Date: Tue, 23 Nov 2010 12:42:40 -0800
Subject: [Python-Dev] is this a bug? no environment variables
In-Reply-To: <>
References: <>	<>	<>
	<> <>
Message-ID: <>

On 11/23/2010 12:33 PM, "Martin v. Löwis" wrote:
> In any case, VS 2010 will stop using SxS for the CRT.

Good news!  Maybe M$VC will become a useful compiler yet again :)

From v+python at  Tue Nov 23 21:43:05 2010
From: v+python at (Glenn Linderman)
Date: Tue, 23 Nov 2010 12:43:05 -0800
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>
	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>
	<>	<>	<1290524466.3642.4.camel@localhost.localdomain>	<>	<1290526253.3642.9.camel@localhost.localdomain>	<>	<1290528319.3642.11.camel@localhost.localdomain>	<>	<1290533860.3642.73.camel@localhost.localdomain>	<>	<1290535602.3642.87.camel@localhost.localdomain>
Message-ID: <>

On 11/23/2010 11:34 AM, Guido van Rossum wrote:
> The best example of the utility of enums even for Python is bool. I
> resisted this for the longest time but people kept asking for it. Some
> properties of bool:
> (a) bool is a (final) subclass of int, and an int is acceptable in a
> pinch where a bool is expected
> (b) bool values are guaranteed unique -- there is only one instance
> with value True, and only one with value False
> (c) bool values have a str() and repr() that shows their name instead
> of their value (but not their class -- that's rarely an issue, and
> makes the output more compact)
> I think it makes sense to add a way to the stdlib to add other types
> like bool. I think (c) is probably the most important feature,
> followed by (a) -- except the *final* part: I want to subclass enums.
> (b) is probably easy to do but I don't think it matters that much in
> practice.

I was concerned about uniqueness constraints some were touting.  While 
that can be a useful property for some enumerations, it can also be 
convenient for other enumerations to have multiple names map to the same value.

Bool seems appropriately not extensible to additional values.  While 
there are tri-valued (and other) logic systems, they deserve a separate 

Bool seems to be an example, then, of a "set of distinguished names, with 
values associated to the names", and is restricted to [two] [unique] 
integer values.  C/C++/C# enum is somewhat like that, and is also 
restricted to integer values [not necessarily unique].  I wonder if a 
set of distinguished names need to be restricted to integer values to be 
useful, although I have no doubt that distinguished names with integer 
values are useful.  Someone used an example of color names class having 
RGB tuple values, which is a counter example to a restriction to 
integers.  I can think of others as well.

Perhaps a "set of distinguished names, with values associated to the 
names" is really a dict, with the unique names restricted to Python 
identifier syntax (to be useful), and the values unrestricted. The type 
of the named value, and the value of the named value, seem not to need 
to be restricted.

But the implementations

Bool = {'False': 0, 'True': 1}

or alternately

class Bool():
     # Python 2 only; in Python 3, False and True are keywords
     False = 0
     True = 1

are missing a couple of characteristics of Python's present bool: the names 
are not special, and the values are not immutable.  Perhaps games could 
be played to make the second implementation effectively immutable.

So I think the real trick of the "enum" (or a generalized "distinguished 
names") is in the naming.  A technique to import the keys that are legal 
Python identifiers from a dict into a namespace, and retain henceforth 
immutable values for those names would permit the syntactical usage that 
people are accustomed to from the C/C++/C# enum, but with extended 
ranges and types of values, and it seems Bool could be mostly 
reimplemented via that technique.

What is still missing?  The "debugging" help: the values, once imported, 
should not become "just" values of their type, but rather a new type of 
value, that has an associated name (and type, I think).  Whatever magic 
is worked under the covers to make sure that there is just one True and 
just one False, so that they can be distinguished from the values 1 and 
0, and so reported, should also be applied to these values.

So there need not be new syntax for creating the name/value pairs; just 
use dict.  The only new API would be the code that "imports" the dict 
into the local namespace.
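
A minimal sketch of that "import a dict into a namespace" idea, under stated
assumptions: NamedValue and import_names are hypothetical names, not an
existing API, and the sketch only handles integer values even though the
proposal above allows any type. Nothing here stops the names from being
rebound afterwards.

```python
class NamedValue(int):
    """Hypothetical int subclass that remembers the name it was published under."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        # Show the distinguished name instead of the bare number.
        return self._name

    __str__ = __repr__


def import_names(mapping, namespace):
    """Copy the identifier-like keys of `mapping` into `namespace` as NamedValue objects."""
    for name, value in mapping.items():
        if isinstance(name, str) and name.isidentifier():
            namespace[name] = NamedValue(name, value)


ns = {}
import_names({'RED': 0, 'GREEN': 1, 'not an identifier': 2}, ns)
print(ns['GREEN'], ns['GREEN'] + 1, sorted(ns))
```

Passing globals() or a module's __dict__ as the namespace would give the
C-style usage; arithmetic on a NamedValue falls back to a plain int.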

Note that other scoped definitions of True and False are not possible 
today because True and False are keywords.  It would be inappropriate to 
define these distinguished names as all being keywords, so it seems like 
one could still override the names, even once defined, but such 
overridden names would lose their special value that makes them a 
distinguished name.

From solipsis at  Tue Nov 23 21:48:43 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 21:48:43 +0100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <1290545323.3642.101.camel@localhost.localdomain>

On Tuesday 23 November 2010 at 11:34 -0800, Guido van Rossum wrote:
> >> From a backward-compatibility perspective, what makes sense depends on
> >> whether they're used to implement existing constants (socket.AF_INET,
> >> etc.) or if they are reserved for new features only.
> >
> > It's not only backwards compatibility. New features relying on C APIs
> > have to be able to map constants to the integers used in the C library.
> > It would be much better if this were done naturally rather than through
> > explicit conversion maps.
> I'm not sure what you mean here. Can you give an example of what you
> mean? I agree that it should be possible to make pretty much any
> constant in the OS modules enums -- even if the values vary across
> platforms.

I mean that PyArg_ParseTuple should continue to be practical even if e.g.
os.SEEK_SET and friends become named constants.  It implies that the
various format codes such as "i", "l", etc. are still usable with those
constants. Hence:

> > (this really means subclassing int, if we don't want to complicate
> > C-level code)
> Right.
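
To illustrate the point from the Python side, here is a small sketch,
assuming a hypothetical NamedInt class. io.BytesIO.seek stands in for any
C-implemented function that expects a C int; an int subclass instance is
accepted there without any explicit conversion.

```python
import io
import os


class NamedInt(int):
    """Hypothetical named constant that is still a real int."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        return self._name


# Wrap the existing integer constant rather than inventing a new value.
SEEK_END = NamedInt('SEEK_END', os.SEEK_END)

buf = io.BytesIO(b'abcdef')
pos = buf.seek(0, SEEK_END)   # the C-level code sees an ordinary int
print(repr(SEEK_END), pos)
```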




From rrr at  Tue Nov 23 22:03:21 2010
From: rrr at (Ron Adam)
Date: Tue, 23 Nov 2010 15:03:21 -0600
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290535676.3642.89.camel@localhost.localdomain>
References: <>	<>	<>	<>
	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>
	<>	<>	<1290524466.3642.4.camel@localhost.localdomain>	<>	<1290526253.3642.9.camel@localhost.localdomain>	<>	<1290528319.3642.11.camel@localhost.localdomain>	<>	<1290533860.3642.73.camel@localhost.localdomain>	<>
Message-ID: <icha6q$uar$>

On 11/23/2010 12:07 PM, Antoine Pitrou wrote:
> On Tuesday 23 November 2010 at 12:50 -0500, Isaac Morland wrote:
>> Each enumeration is a type (well, OK, not in every language, presumably,
>> but certainly in many languages).  The word "basic" is more important than
>> "types" in my sentence - the point is that an enumeration capability is a
>> very common one in a type system, and is very general, not specific to any
>> particular application.
> Python already has an enumeration capability. It's called range().
> There's nothing else that C enums have. AFAICT, neither do enums in
> other mainstream languages (assuming they even exist; I don't remember
> Perl, PHP or Javascript having anything like that, but perhaps I'm
> mistaken).

Aren't we forgetting enumerate?


 >>> dict(e for e in enumerate(colors.split()))
{0: 'BLACK', 1: 'BROWN', 2: 'RED', 3: 'ORANGE', 4: 'YELLOW', 5: 'GREEN', 6: 
'BLUE', 7: 'VIOLET', 8: 'GREY', 9: 'WHITE'}

 >>> dict((f, n) for (n, f) in enumerate(colors.split()))
{'BLUE': 6, 'BROWN': 1, 'GREY': 8, 'YELLOW': 4, 'GREEN': 5, 'VIOLET': 7, 
'ORANGE': 3, 'BLACK': 0, 'WHITE': 9, 'RED': 2}

Most other languages that use numbered constants number them by base n^2.

 >>> [x**2 for x in range(10)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Binary flags have the advantage of saving memory because you can pack 
more than one into a single integer.  Another advantage is that other 
languages use them, so it can be easier to interface with them.  There may 
also be some performance advantages, since you can test for multiple flags 
with a single comparison.
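
A short sketch of that kind of flag test, assuming power-of-two values
(1, 2, 4, ...) and made-up flag names:

```python
names = 'READ WRITE EXECUTE'.split()

# Give each name its own bit: 1, 2, 4, ...
flags = {name: 1 << i for i, name in enumerate(names)}

# Several flags packed into one integer.
mode = flags['READ'] | flags['WRITE']

# Test for a whole group of flags with one mask and one comparison.
wanted = flags['READ'] | flags['WRITE']
has_all = (mode & wanted) == wanted
has_execute = bool(mode & flags['EXECUTE'])

print(flags, mode, has_all, has_execute)
```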

Sets of strings can also work when you don't need to associate a numeric 
value with the constant, i.e. the constant is the value.  In this case the 
set supplies the API.


From glyph at  Tue Nov 23 22:06:41 2010
From: glyph at (Glyph Lefkowitz)
Date: Tue, 23 Nov 2010 16:06:41 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 23, 2010, at 10:37 AM, Ben.Cottrell at wrote:

> I'd prefer not to think of the number of times I've made the following mistake:
> s = socket.socket(socket.SOCK_DGRAM, socket.AF_INET)

If it's any consolation, it's fewer than the number of times I have :).

(More fun, actually, is where you pass a file descriptor to the wrong argument of 'fromfd'...)
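
A sketch of how distinct constant types could catch the swapped-argument
mistake at call time. AddressFamily, SocketKind and open_socket are
hypothetical stand-ins, not the socket module's API, and the numbers are
placeholders rather than real platform values.

```python
class AddressFamily(int):
    """Hypothetical type for AF_* constants."""


class SocketKind(int):
    """Hypothetical type for SOCK_* constants."""


AF_INET = AddressFamily(2)      # placeholder value
SOCK_DGRAM = SocketKind(2)      # placeholder value, deliberately the same number


def open_socket(family, kind):
    """Stand-in for a constructor that checks which kind of constant it was given."""
    if not isinstance(family, AddressFamily):
        raise TypeError('family must be an AddressFamily, got %s'
                        % type(family).__name__)
    if not isinstance(kind, SocketKind):
        raise TypeError('kind must be a SocketKind, got %s'
                        % type(kind).__name__)
    return (int(family), int(kind))


print(open_socket(AF_INET, SOCK_DGRAM))
try:
    open_socket(SOCK_DGRAM, AF_INET)    # arguments swapped
except TypeError as exc:
    print('caught:', exc)
```

With plain integers both calls would look identical, since the two
placeholder constants share the same numeric value.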


From steve at  Tue Nov 23 22:06:45 2010
From: steve at (Steven D'Aprano)
Date: Wed, 24 Nov 2010 08:06:45 +1100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290526253.3642.9.camel@localhost.localdomain>
References: <>	<>	<>	<>	<>	<>
	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>
	<>	<>	<1290524466.3642.4.camel@localhost.localdomain>	<>
Message-ID: <>

Antoine Pitrou wrote:

> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST',   
>                            values=range(1, 3))
> Again, auto-enumeration is useless since it's trivial to achieve
> explicitly.

That doesn't make auto-enumeration "useless". Unnecessary, perhaps, but 
not useless.

But even then it's only unnecessary if the number of constants is small 
enough that you can see how many there are without counting 
(essentially, 4 or fewer). When you have more, it becomes error-prone 
and a nuisance to have to count them by hand:

Constants = make_constants(


From glyph at  Tue Nov 23 22:10:00 2010
From: glyph at (Glyph Lefkowitz)
Date: Tue, 23 Nov 2010 16:10:00 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290524466.3642.4.camel@localhost.localdomain>
References: <>
	<> <>
Message-ID: <>

On Nov 23, 2010, at 10:01 AM, Antoine Pitrou wrote:

> Well, it is easy to assign range(N) to a tuple of names when desired. I
> don't think an automatically-enumerating constant generator is needed.

I don't think that numerical enumerations are the only kind of constants we're talking about.  Others have already mentioned strings.  Also, see <> for some other use-cases.  Since this isn't coming to 2.x, we're probably going to do our own thing anyway (unless it turns out that flufl.enum is so great that we want to add another dependency...) but I'm hoping that the outcome of this discussion will point to something we can be compatible with.


From solipsis at  Tue Nov 23 22:15:20 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 23 Nov 2010 22:15:20 +0100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <1290546920.3642.104.camel@localhost.localdomain>

On Tuesday 23 November 2010 at 16:10 -0500, Glyph Lefkowitz wrote:
> On Nov 23, 2010, at 10:01 AM, Antoine Pitrou wrote:
> > Well, it is easy to assign range(N) to a tuple of names when
> > desired. I
> > don't think an automatically-enumerating constant generator is
> > needed.
> I don't think that numerical enumerations are the only kind of
> constants we're talking about.  Others have already mentioned strings.
> Also, see <> for some other use-cases.  Since this
> isn't coming to 2.x, we're probably going to do our own thing anyway
> (unless it turns out that flufl.enum is so great that we want to add
> another dependency...) but I'm hoping that the outcome of this
> discussion will point to something we can be compatible with.

I think that asking for too many features would get in the way, and also
make the API quite un-Pythonic. If you want your values to be e.g.
OR'able, just choose your values wisely ;)



From rrr at  Tue Nov 23 22:21:17 2010
From: rrr at (Ron Adam)
Date: Tue, 23 Nov 2010 15:21:17 -0600
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <icha6q$uar$>
References: <>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<1290524466.3642.4.camel@localhost.localdomain>	<>	<1290526253.3642.9.camel@localhost.localdomain>	<>	<1290528319.3642.11.camel@localhost.localdomain>	<>	<1290533860.3642.73.camel@localhost.localdomain>	<>	<1290535676.3642.89.camel@localhost.localdomain>
Message-ID: <ichb8d$3mi$>

Oops..  x**2 should have been 2**x below.

On 11/23/2010 03:03 PM, Ron Adam wrote:
> On 11/23/2010 12:07 PM, Antoine Pitrou wrote:
>> On Tuesday 23 November 2010 at 12:50 -0500, Isaac Morland wrote:
>>> Each enumeration is a type (well, OK, not in every language, presumably,
>>> but certainly in many languages). The word "basic" is more important than
>>> "types" in my sentence - the point is that an enumeration capability is a
>>> very common one in a type system, and is very general, not specific to any
>>> particular application.
>> Python already has an enumeration capability. It's called range().
>> There's nothing else that C enums have. AFAICT, neither do enums in
>> other mainstream languages (assuming they even exist; I don't remember
>> Perl, PHP or Javascript having anything like that, but perhaps I'm
>> mistaken).
> Aren't we forgetting enumerate?
>  >>> dict(e for e in enumerate(colors.split()))
> {0: 'BLACK', 1: 'BROWN', 2: 'RED', 3: 'ORANGE', 4: 'YELLOW', 5: 'GREEN', 6:
> 'BLUE', 7: 'VIOLET', 8: 'GREY', 9: 'WHITE'}
>  >>> dict((f, n) for (n, f) in enumerate(colors.split()))
> {'BLUE': 6, 'BROWN': 1, 'GREY': 8, 'YELLOW': 4, 'GREEN': 5, 'VIOLET': 7,
> 'ORANGE': 3, 'BLACK': 0, 'WHITE': 9, 'RED': 2}
> Most other languages that use numbered constants number them by base n^2.
>  >>> [x**2 for x in range(10)]
> [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

 >>> [2**x for x in range(10)]
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]

> Binary flags have the advantage of saving memory because you can pack
> more than one into a single integer. Another advantage is that other
> languages use them, so it can be easier to interface with them. There may
> also be some performance advantages, since you can test for multiple flags
> with a single comparison.
> Sets of strings can also work when you don't need to associate a numeric
> value with the constant, i.e. the constant is the value. In this case the
> set supplies the API.
> Cheers,
> Ron

From steve at  Tue Nov 23 22:30:37 2010
From: steve at (Steven D'Aprano)
Date: Wed, 24 Nov 2010 08:30:37 +1100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290535676.3642.89.camel@localhost.localdomain>
References: <>	<>	<>	<>
	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>
	<>	<>	<1290524466.3642.4.camel@localhost.localdomain>	<>	<1290526253.3642.9.camel@localhost.localdomain>	<>	<1290528319.3642.11.camel@localhost.localdomain>	<>	<1290533860.3642.73.camel@localhost.localdomain>	<>
Message-ID: <>

Antoine Pitrou wrote:

> Python already has an enumeration capability. It's called range().
> There's nothing else that C enums have. AFAICT, neither do enums in
> other mainstream languages (assuming they even exist; I don't remember
> Perl, PHP or Javascript having anything like that, but perhaps I'm
> mistaken).

In Pascal, enumerations are a type, and the values behind the names 
are an implementation detail. E.g. one would define an enumerated type:

   flavour = (sweet, salty, sour, bitter, umami);
   x: flavour;

and then you would write something like:

x := sour;

Notice that the constants sweet etc. aren't explicitly predefined, since 
they're purely internal details and the compiler is allowed to number 
them any way it likes. In Python, we would need stronger guarantees 
about the values chosen, so that they could be exposed to external 
modules, pickled, etc.

But that doesn't mean we should be forced to specify the values ourselves.


From greg.ewing at  Tue Nov 23 22:26:58 2010
From: greg.ewing at (Greg Ewing)
Date: Wed, 24 Nov 2010 10:26:58 +1300
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

Antoine Pitrou wrote:

> I don't understand why people insist on calling that an "enum". enum is
> a C legacy and it doesn't bring anything useful as far as I can tell.

The usefulness is that they can have a str() or repr() that
displays the name of the value instead of an integer.

The bool type was added for much the same reason -- otherwise
we would simply have gotten builtin names False = 0 and
True = 1.
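
For reference, a few lines showing the bool behaviour being described, using
only built-ins:

```python
is_int = isinstance(True, int)     # bool is a subclass of int
total = True + 1                   # and takes part in integer arithmetic
shown = (repr(True), repr(1))      # but displays its name, not its value
same_object = True is bool(1)      # and there is only one True instance

print(is_int, total, shown, same_object)
```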


From greg.ewing at  Tue Nov 23 22:27:02 2010
From: greg.ewing at (Greg Ewing)
Date: Wed, 24 Nov 2010 10:27:02 +1300
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290524519.3642.5.camel@localhost.localdomain>
References: <>
	<> <>
	<> <>
Message-ID: <>

Antoine Pitrou wrote:

> Well, it's been inherited by C-like languages, no doubt. Like braces and
> semicolons :)

The idea isn't confined to the C family. Pascal and many of the
languages inspired by it also have enumerated types.


From tjreedy at  Tue Nov 23 23:44:07 2010
From: tjreedy at (Terry Reedy)
Date: Tue, 23 Nov 2010 17:44:07 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>
	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <ichg3p$qpe$>

On 11/23/2010 2:11 PM, Alexander Belopolsky wrote:

> This discussion motivated me to start looking into how well Python
> library itself is prepared to deal with len(chr(i)) = 2.  I was not

Good idea!

> surprised to find that textwrap does not handle the issue that well:
> >>> len(wrap(' \U00010140' * 80, 20))
> 12
> >>> len(wrap(' \U00000140' * 80, 20))
> 8

How well does textwrap handle composable pairs (letter + accent)? Does 
it count two codepoints as one char space, and avoid putting line breaks 
between them? I suspect textwrap should be regarded as 
> That module should probably be rewritten to properly implement  the
> Unicode line breaking algorithm
> <>.

Probably a good idea

> Yet finding a bug in a str object method after a 5 min review was a
> bit discouraging:
> >>> 'xyz'.center(20, '\U00010140')
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> TypeError: The fill character must be exactly one character long

Again, what does it do with letter + decorator combinations? It seems to 
me that the whole notion that one code point == one printed character 
space is broken once one leaves ASCII. Perhaps we need an is_uchar 
function to recognize multi-code sequences, including surrogate pairs, 
that represent one char for the purpose of character-oriented functions.
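
A rough sketch of what such an is_uchar function might look like. It is an
approximation, not an existing function: it only treats "one base code point
plus combining marks" as a single character, it is not the full Unicode
grapheme cluster algorithm, and it assumes a Python build where each str
item is a whole code point (so surrogate pairs are not handled here).

```python
import unicodedata


def is_uchar(s):
    """Return True if s looks like one base code point plus combining marks."""
    if not s:
        return False
    if unicodedata.combining(s[0]):
        # A sequence that starts with a combining mark has no base character.
        return False
    # Every following code point must have a non-zero combining class.
    return all(unicodedata.combining(ch) for ch in s[1:])


print(is_uchar('e\u0301'), is_uchar('a'), is_uchar('ab'))
```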

> Given the apparent difficulty of writing even basic text processing
> algorithms in presence of surrogate pairs, I wonder how wise it is to
> expose Python users to them.  As Wikipedia explains, [1]
> """
> Because the most commonly used characters are all in the Basic
> Multilingual Plane, converting between surrogate pairs and the
> original values is often not tested thoroughly. This leads to
> persistent bugs, and potential security holes, even in popular and
> well-reviewed application software.
> """

So we did not test thoroughly enough and need to add appropriate unit 
tests as bugs are fixed.

Terry Jan Reedy

From tjreedy at  Wed Nov 24 00:07:03 2010
From: tjreedy at (Terry Reedy)
Date: Tue, 23 Nov 2010 18:07:03 -0500
Subject: [Python-Dev] [Python-checkins] r86720 -
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/23/2010 5:43 PM, Éric Araujo wrote:
>> Modified: python/branches/py3k/Misc/ACKS
>> ==============================================================================
>> --- python/branches/py3k/Misc/ACKS	(original)
>> +++ python/branches/py3k/Misc/ACKS	Tue Nov 23 21:32:47 2010
>> @@ -1,4 +1,4 @@
>> -Acknowledgements
>> +?Acknowledgements
> This change introduced a so-called UTF-8 BOM in the file.  Is
> TortoiseSvn the culprit or a text editor?

I used Notepad to edit the file, TortoiseSvn to commit, the same as I 
did for #9222, rev86702, Lib\idlelib\, yesterday.
If the latter is OK, perhaps *.py gets filtered better than misc. text 
files. I believe I have the config as specified in dev/faq.

enable-auto-props = yes

* = svn:eol-style=native
*.c = svn:keywords=Id
*.h = svn:keywords=Id
*.py = svn:keywords=Id
*.txt = svn:keywords=Author Date Id Revision
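
As an aside, a UTF-8 BOM like the one reported above can be detected and
removed with the standard codecs module. A minimal sketch, where
strip_utf8_bom is a hypothetical helper name and the sample bytes are made
up for the illustration:

```python
import codecs


def strip_utf8_bom(data):
    """Return (had_bom, data_without_bom) for a bytes object."""
    if data.startswith(codecs.BOM_UTF8):
        return True, data[len(codecs.BOM_UTF8):]
    return False, data


sample = codecs.BOM_UTF8 + b'Acknowledgements\n'
had_bom, cleaned = strip_utf8_bom(sample)
print(had_bom, cleaned)

# The 'utf-8-sig' codec drops a leading BOM while decoding.
text = sample.decode('utf-8-sig')
```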


From ijmorlan at  Wed Nov 24 00:15:03 2010
From: ijmorlan at (Isaac Morland)
Date: Tue, 23 Nov 2010 18:15:03 -0500 (EST)
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
	<> <>
	<Pine.GSO.4.64.1011231241050.4991@core.cs!> <>
Message-ID: <>

On Tue, 23 Nov 2010, Bill Janssen wrote:

> The main purpose of that is to be able to catch type mismatches with
> static typing, though.  Seems kind of pointless for Python.

The concept can work dynamically.  In fact, the flufl.enum package which 
has been discussed here makes each enumeration into a separate class so 
many of the advantages of catching type mismatches are obtained.

>> Hey, how about this syntax:
>> enum Colors:
>> 	red = 0
>> 	green = 10
>> 	blue
> Why not
>  class Color:
>     red = (255, 0, 0)
>     green = (0, 255, 0)
>     blue = (0, 0, 255)
> Seems to handle the situation OK.

Yes, this looks almost exactly like flufl.enum syntax.  In any case my 
suggestion of a new keyword was not meant to be taken seriously.  If I 
ever think I have a good reason to suggest a new keyword I'll sleep on it, 
take a vacation, and then if I still think a new keyword is justified I 
will specifically disclaim any possibility of the suggestion being a joke.

Isaac Morland			CSCF Web Guru
DC 2554C, x36650		WWW Software Specialist

From at  Wed Nov 24 00:18:33 2010
From: at (David Bolen)
Date: Tue, 23 Nov 2010 18:18:33 -0500
Subject: [Python-Dev] Stable buildbots
References: <>
Message-ID: <>

Trent Nelson <trent at> writes:

> That's interesting.  (That kill_python.exe doesn't kill the wedged
> processes, but pskill does.)  kill_python is pretty simple, it just
> calls TerminateProcess() after acquiring a handle with the relevant
> PROCESS_TERMINATE access right.  (...)
> Are you calling pskill with the -t flag? i.e. kill process and all
> dependents?  That might be the ticket, especially if killing the child
> process that wedged select() is waiting on causes it to return, and
> thus, makes it killable.

Nope, just "pskill python_d".  Haven't bothered to check the pskill
source but I'm assuming it's just a basic TerminateProcess. Ideally my
quickest workaround would just be to replace the kill_python in the
buildbot tools script with that command but of course they could get
updated on checkouts and I'm not arguing it's generally appropriate enough
to belong in the source.

I suspect the problem may be on the "identify which process to kill"
rather than the "kill it" part, but it's definitely going to take time
to figure that out for sure.  While the approach kill_python takes is
much more appropriate, since we don't currently have multiple builds
running simultaneously (and for me the machines are dedicated as build
slaves, so I won't be having my own python_d), a more blanket kill
operation is safe enough.

> Otherwise, if it happens again, can you try kill_python.exe first,
> then pskill, and confirm if the former fails but the latter succeeds?

Yeah, I've got a temporary tree with a built-binary around, but still
have to make sure of the right way to run it manually in a way that it
will do the identification right (which I think also means I need to
figure out from which build tree the hung process started).  Up until
now, typically when I've found a hung setup, the rest of the build
tree which originally applied to that process has been cleaned.

I definitely sympathize with Martin's position though - it wasn't the
simplest tool to write (and I still have some email from him about the
week+ it took just to test the process identification part remotely
through buildbots at the time), so I regret not jumping right in to
try to fix it.  But it's just way more effort than typing "pskill
python_d", at least with my current availability.

-- David

From greg.ewing at  Wed Nov 24 00:32:39 2010
From: greg.ewing at (Greg Ewing)
Date: Wed, 24 Nov 2010 12:32:39 +1300
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290546920.3642.104.camel@localhost.localdomain>
References: <>
	<> <>
	<> <>
Message-ID: <>

Antoine Pitrou wrote:

> I think that asking for too many features would get in the way, and also
> make the API quite un-Pythonic. If you want your values to be e.g.
> OR'able, just choose your values wisely ;)

On the other hand it could be useful to have an easy way to
request power-of-2 value assignment, seeing as it's another
common pattern.


From greg.ewing at  Wed Nov 24 00:32:56 2010
From: greg.ewing at (Greg Ewing)
Date: Wed, 24 Nov 2010 12:32:56 +1300
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

Bill Janssen wrote:

> The main purpose of that is to be able to catch type mismatches with
> static typing, though.  Seems kind of pointless for Python.

But catching type mismatches with dynamic typing doesn't
seem pointless for Python. There's nothing static about
the proposals being made here that I can see.

> Why not
>   class Color:
>      red = (255, 0, 0)
>      green = (0, 255, 0)
>      blue = (0, 0, 255)

If all you want is a bunch of named constants, that's fine.
But the facilities being discussed here are designed to give
you other things as well, such as

   c =

printing "red" rather than "(255, 0, 0)".
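
A sketch of one way to get that behaviour for non-integer values, assuming a
hypothetical NamedColor tuple subclass. Note that it still makes you write
each name twice.

```python
class NamedColor(tuple):
    """Hypothetical tuple subclass that prints as its name."""

    def __new__(cls, name, rgb):
        self = super().__new__(cls, rgb)
        self.name = name
        return self

    def __str__(self):
        return self.name


class Color:
    red = NamedColor('red', (255, 0, 0))
    green = NamedColor('green', (0, 255, 0))
    blue = NamedColor('blue', (0, 0, 255))


c = Color.red
print(c)                          # uses __str__, so the name is shown
print(c[0], c == (255, 0, 0))     # still behaves like the underlying tuple
```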


From greg.ewing at  Wed Nov 24 00:33:02 2010
From: greg.ewing at (Greg Ewing)
Date: Wed, 24 Nov 2010 12:33:02 +1300
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290526253.3642.9.camel@localhost.localdomain>
References: <>
	<> <>
	<> <>
Message-ID: <>

Antoine Pitrou wrote:

> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST',   
>                            values=range(1, 3))
> Again, auto-enumeration is useless since it's trivial to achieve
> explicitly.

But seeing as it's going to be a common thing to do, why not
make it the default?

When defining an enum, often you don't *care* what the
underlying values are, so assigning sequential natural numbers
is as good a default as any.

In fact, with the Pascal concept of an enumerated type you
don't get any choice in the matter. It's only in the C family
that you get this bastardised conflation of enumerations with
arbitrary named constants...
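
A sketch of make_constants with auto-enumeration as the default. The
function's real signature was never spelled out in this thread, so the one
below is an assumption modelled on the call quoted above.

```python
from itertools import count


def make_constants(class_name, names, values=None):
    """Build a class whose attributes are the given names (hypothetical helper)."""
    names = names.split()
    if values is None:
        values = count()              # default: 0, 1, 2, ...
    else:
        values = list(values)
        if len(values) != len(names):
            raise ValueError('need exactly one value per name')
    return type(class_name, (), dict(zip(names, values)))


Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST')
Explicit = make_constants('Explicit', 'SOME_CONST OTHER_CONST',
                          values=range(1, 3))
print(Constants.SOME_CONST, Constants.OTHER_CONST, Explicit.OTHER_CONST)
```

Checking the lengths in the explicit case catches the miscounting problem
that hand-written ranges invite.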


From greg.ewing at  Wed Nov 24 00:41:50 2010
From: greg.ewing at (Greg Ewing)
Date: Wed, 24 Nov 2010 12:41:50 +1300
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

Isaac Morland wrote:
> In any case my 
> suggestion of a new keyword was not meant to be taken seriously.

I don't think it need be taken entirely as a joke, either.
All the proposed patterns for creating enums that I've seen
end up leaving something to be desired. They violate DRY
by requiring you to write the class name twice, or they
make you write the names of the values in quotes, or some
other minor ugliness.

While it may be possible to work around these things with
sufficient levels of metaclass hackery and black magic, at
some point one has to consider whether new syntax might
be the least worst option.


From greg.ewing at  Wed Nov 24 00:49:42 2010
From: greg.ewing at (Greg Ewing)
Date: Wed, 24 Nov 2010 12:49:42 +1300
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Alexander Belopolsky wrote:

> """
> Because the most commonly used characters are all in the Basic
> Multilingual Plane, converting between surrogate pairs and the
> original values is often not tested thoroughly. This leads to
> persistent bugs, and potential security holes, even in popular and
> well-reviewed application software.
> """

Maybe Python should have used UTF-8 as its internal unicode
representation. Then people who were foolish enough to assume
one character per string item would have their programs break
rather soon under only light unicode testing. :-)


From foom at  Wed Nov 24 01:22:23 2010
From: foom at (James Y Knight)
Date: Tue, 23 Nov 2010 19:22:23 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Nov 23, 2010, at 6:49 PM, Greg Ewing wrote:
> Maybe Python should have used UTF-8 as its internal unicode
> representation. Then people who were foolish enough to assume
> one character per string item would have their programs break
> rather soon under only light unicode testing. :-)

You put a smiley, but, in all seriousness, I think that's actually the right thing to do if anyone writes a new programming language. It is clearly the right thing if you don't have to be concerned with backwards-compatibility: nobody really needs to be able to access the Nth codepoint in a string in constant time, so there's not really any point in storing a vector of codepoints.

Instead, provide bidirectional iterators which can traverse the string by byte, codepoint, or by grapheme (that is: the set of combining characters + base character that go together, making up one thing which a human would think of as a character).
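
A minimal sketch of those three traversal levels, assuming Python 3.3
or later (where iterating a str yields whole code points). The
function names are made up for illustration, the iterators are
forward-only, and the grapheme rule (base character plus following
combining marks) is a simplification of the real segmentation rules
in Unicode UAX #29:

```python
import unicodedata


def iter_bytes(text):
    """Yield the UTF-8 encoded bytes of text, one int at a time."""
    return iter(text.encode("utf-8"))


def iter_codepoints(text):
    """Yield one code point at a time."""
    return iter(text)


def iter_graphemes(text):
    """Yield each base character together with its combining marks."""
    cluster = ""
    for ch in text:
        # A character with combining class 0 starts a new cluster.
        if cluster and unicodedata.combining(ch) == 0:
            yield cluster
            cluster = ""
        cluster += ch
    if cluster:
        yield cluster


# 'e' followed by COMBINING ACUTE ACCENT, then a plain 'a'.
sample = "e\u0301a"
byte_count = len(list(iter_bytes(sample)))
codepoint_count = len(list(iter_codepoints(sample)))
graphemes = list(iter_graphemes(sample))
```

The same three-character-looking string gives a different length at
each level, which is the point of making the caller pick one.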


From jcea at  Wed Nov 24 01:31:01 2010
From: jcea at (Jesus Cea)
Date: Wed, 24 Nov 2010 01:31:01 +0100
Subject: [Python-Dev] Sporadic problems with
In-Reply-To: <>
References: <>
Message-ID: <>

On 23/11/10 21:33, Jesus Cea wrote:
> Happen to me last Sunday, and happening just now.
> I can access just fine, but trying to post a
> message, open a new bug, change nosy, etc., takes a LONG time (minutes)
> and it is finally failing with a "400 Bad Request" error:
> """
> Bad Request
> Your browser sent a request that this server could not understand.
> Apache/2.2.9 (Debian) mod_python/3.3.1 Python/2.5.2 mod_ssl/2.2.9
> OpenSSL/0.9.8g mod_wsgi/2.5 Server at Port 80
> """
> Last sunday I was able to open the bug after a time. Today I have been
> retrying for while, with no luck yet.

Still retrying, with no luck.

Can anybody else reproduce?

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From fuzzyman at  Wed Nov 24 01:41:37 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 24 Nov 2010 00:41:37 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290546920.3642.104.camel@localhost.localdomain>
References: <>	<>	<>	<>	<>	<>
	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>
	<>	<>	<1290524466.3642.4.camel@localhost.localdomain>	<>
Message-ID: <>

On 23/11/2010 21:15, Antoine Pitrou wrote:
> On Tuesday 23 November 2010 at 16:10 -0500, Glyph Lefkowitz wrote:
>> On Nov 23, 2010, at 10:01 AM, Antoine Pitrou wrote:
>>> Well, it is easy to assign range(N) to a tuple of names when
>>> desired. I
>>> don't think an automatically-enumerating constant generator is
>>> needed.
>> I don't think that numerical enumerations are the only kind of
>> constants we're talking about.  Others have already mentioned strings.
>> Also, see<>  for some other use-cases.  Since this
>> isn't coming to 2.x, we're probably going to do our own thing anyway
>> (unless it turns out that flufl.enum is so great that we want to add
>> another dependency...) but I'm hoping that the outcome of this
>> discussion will point to something we can be compatible with.
> I think that asking for too many features would get in the way, and also
> make the API quite un-Pythonic. If you want your values to be e.g.
> OR'able, just choose your values wisely ;)

Well, the point of an OR'able flag is that the result shows the OR'd 
values in the repr. Raymond suggests using a set of strings where you 
need flag constants. For new apis (so no backwards compatibility 
constraints) where you don't need to use integers (i.e. not wrapping a C 
library) that's a great suggestion:

     flags = {'FOO', 'BAR'}

> Regards
> Antoine.


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From lukasz at  Wed Nov 24 01:50:23 2010
From: lukasz at (Łukasz Langa)
Date: Wed, 24 Nov 2010 01:50:23 +0100
Subject: [Python-Dev] Centos 5.5 freeze during test_concurrent_futures
Message-ID: <>

Hi there!

py3k built from trunk on Centos 5.5 freezes during regrtest on test_concurrent_futures with "Fatal Python error: Invalid thread state for this thread". As in a typical concurrent problem, subsequent calls freeze in different test cases, but the freeze itself is always reproducible and always during this test.

A colorful example:

I created an issue for that here:

If necessary, I can provide Centos 5.5 shell access. I would also like to donate a Centos 5.5 buildbot.

Best regards,
Łukasz Langa
tel. +48 791 080 144


From jcea at  Wed Nov 24 02:32:05 2010
From: jcea at (Jesus Cea)
Date: Wed, 24 Nov 2010 02:32:05 +0100
Subject: [Python-Dev] Sporadic problems with
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 24/11/10 01:31, Jesus Cea wrote:
> Still retrying, with no luck.
> Anybody else can reproduce?.

One of my tracker changes was just processed.

The important one is still retrying every 5 minutes...

I hope I can go to sleep before dawn :-P.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From tjreedy at  Wed Nov 24 02:51:20 2010
From: tjreedy at (Terry Reedy)
Date: Tue, 23 Nov 2010 20:51:20 -0500
Subject: [Python-Dev] Sporadic problems with
In-Reply-To: <>
References: <> <>
Message-ID: <ichr2o$4ec$>

On 11/23/2010 8:32 PM, Jesus Cea wrote:
> On 24/11/10 01:31, Jesus Cea wrote:
>> Still retrying, with no luck.
>> Anybody else can reproduce?.
> One of my tracker changes was just processed.
> The important one still retrying every 5 minutes...
> I hope I can go sleep before dawn :-P.

I added a comment to one issue and opened another with no problem during 
the last couple of hours.

Terry Jan Reedy

From glyph at  Wed Nov 24 02:52:13 2010
From: glyph at (Glyph Lefkowitz)
Date: Tue, 23 Nov 2010 20:52:13 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Nov 23, 2010, at 7:22 PM, James Y Knight wrote:

> On Nov 23, 2010, at 6:49 PM, Greg Ewing wrote:
>> Maybe Python should have used UTF-8 as its internal unicode
>> representation. Then people who were foolish enough to assume
>> one character per string item would have their programs break
>> rather soon under only light unicode testing. :-)
> You put a smiley, but, in all seriousness, I think that's actually the right thing to do if anyone writes a new programming language. It is clearly the right thing if you don't have to be concerned with backwards-compatibility: nobody really needs to be able to access the Nth codepoint in a string in constant time, so there's not really any point in storing a vector of codepoints.
> Instead, provide bidirectional iterators which can traverse the string by byte, codepoint, or by grapheme (that is: the set of combining characters + base character that go together, making up one thing which a human would think of as a character).

I really hope that this idea is not just for new programming languages.  If you switch from doing unicode "wrong" to doing unicode "right" in Python, you quadruple the memory footprint of programs which primarily store and manipulate large amounts of text.
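
A rough way to see the factor involved, as a sketch only: it compares
encoded payload sizes and ignores per-object overhead, so it is not a
measurement of any particular Python build. The helper name and the
sample strings are invented for illustration:

```python
def encoded_sizes(text):
    """Return the encoded size in bytes of text for three encodings."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(text.encode(enc)) for enc in encodings}


# Protocol-style ASCII text: one byte per character in UTF-8.
ascii_sizes = encoded_sizes("GET /index.html HTTP/1.1")

# Three BMP ideographs: three bytes each in UTF-8, two in UTF-16.
cjk_sizes = encoded_sizes("\u65e5\u672c\u8a9e")
```

For ASCII-heavy data the UTF-32 payload is four times the UTF-8 one;
for BMP ideographs the gap narrows and UTF-16 is the smallest of the
three.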

This is especially ridiculous in PyGTK applications, where the GUI's internal representation is UTF-8 anyway, so the round-tripping of string data back and forth to the exploded UTF-32 representation is wasting gobs of memory and time.  It at least makes sense when your C library's idea about character width and your Python build match up.

But, in a desktop app this is unlikely to be a performance concern; in servers, it's a big deal; measurably so.  I am pretty sure that in the server apps that I work on, we are eventually going to need our own string type and UTF-8 logic that does exactly what James suggested - certainly if we ever hope to support Py3.

(I dimly recall that both James and I have made this point before, but it's pretty important, so it bears repeating.)

From glyph at  Wed Nov 24 02:56:57 2010
From: glyph at (Glyph Lefkowitz)
Date: Tue, 23 Nov 2010 20:56:57 -0500
Subject: [Python-Dev] OpenSSL Voluntarily (openssl-1.0.0a)
In-Reply-To: <>
References: <>
Message-ID: <>

On Nov 23, 2010, at 9:02 AM, Antoine Pitrou wrote:

> On Tue, 23 Nov 2010 00:07:09 -0500
> Glyph Lefkowitz <glyph at> wrote:
>> On Mon, Nov 22, 2010 at 11:13 PM, Hirokazu Yamamoto <
>> ocean-city at> wrote:
>>> Hello. Does this affect python? Thank you.
>> No.
> Well, actually it does, but Python links against the system OpenSSL on
> most platforms (except Windows), so it's up to the OS vendor to apply
> the patch.

It does?  If so, I must have misunderstood the vulnerability.  Can you explain how it affects Python?

From stephen at  Wed Nov 24 03:29:47 2010
From: stephen at (Stephen J. Turnbull)
Date: Wed, 24 Nov 2010 11:29:47 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Alexander Belopolsky writes:

 > Yet finding a bug in a str object method after a 5 min review was a
 > bit discouraging:
 > >>> 'xyz'.center(20, '\U00010140')
 > Traceback (most recent call last):
 >   File "<stdin>", line 1, in <module>
 > TypeError: The fill character must be exactly one character long
 > Given the apparent difficulty of writing even basic text processing
 > algorithms in presence of surrogate pairs, I wonder how wise it is to
 > expose Python users to them.

"Consenting adults" applies here.

What to do?  Write tests, fix the stdlib.  Raise the probability of
surrogate pair tests in the fuzzer.

But "expose the users to surrogate pairs in an efficient (ie, UCS-2)
implementation" is a fundamental design principle of Python.
Tightening up the internal implementation is -10 unacceptable IMO

 > Again, given that the str object itself has at least one non-BMP
 > character bug as we are closing on the third major release of py3k,
 > how likely are 3rd party developers to get their libraries right as
 > they port to 3.x?

Not our problem, really.  We need to fix the stdlib, but 3rd party
libraries know what they're doing.

I guess we could provide a fuzztest module that generates known nasty
data (zero, very big numbers, "\0x00", "\U00010140", etc) that people
would be able to plug in as a data source for their own code.

Of course that doesn't replace conventional unittests based on
analysis of edge cases and tests designed to tickle them, but it would
be a start for many projects.
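
A minimal sketch of what such a data source might look like. No
"fuzztest" module exists in the stdlib; the names NASTY_VALUES and
nasty_values are hypothetical, and the value list is only a starting
point:

```python
import random

# Known-awkward inputs: boundary integers, the empty string, a NUL,
# a non-BMP character, a base character plus combining mark, and a
# long run of ASCII.
NASTY_VALUES = [
    0, -1, 2**31 - 1, 2**31, 2**63, 2**64,
    "", "\x00", "\U00010140", "e\u0301", "a" * 10000,
]


def nasty_values(count, seed=0):
    """Yield count values drawn from NASTY_VALUES, reproducibly."""
    rng = random.Random(seed)
    for _ in range(count):
        yield rng.choice(NASTY_VALUES)
```

Seeding the generator keeps a failing run reproducible, which matters
more for a shared test data source than raw randomness does.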

From raymond.hettinger at  Wed Nov 24 03:35:35 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Tue, 23 Nov 2010 18:35:35 -0800
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Nov 23, 2010, at 3:41 PM, Greg Ewing wrote:

> While it may be possible to work around these things with
> sufficient levels of metaclass hackery and black magic, at
> some point one has to consider whether new syntax might
> be the least worst option.

The least worst option is to do nothing at all.
That's better than creating a new little monster with its 
own nuances and limitations.

We've gotten by well for almost two decades without
this particular static language feature creeping into Python.

For the most part, strings work well enough (see
decimal.ROUND_UP for example).  They are self-documenting
and work well with the rest of the language.

When a cluster of names cries out for its own namespace,
the usual technique is to put the names in a class (see the examples
in the namedtuple docs for a way to make this a one-liner) 
or in a module (see for example).
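
For reference, the one-liner in question looks roughly like this (a
sketch along the lines of the enumerated-constants example in the
namedtuple docs; the Status names are only illustrative):

```python
from collections import namedtuple

# Field names become the constant names; _make() fills in 0, 1, 2.
Status = namedtuple("Status", "open pending closed")._make(range(3))

current = Status.pending
```

The values are plain ints, so they compare and hash like any other
integer, and the repr of Status lists every name with its value.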

For xor'able and or'able flags, sets of strings work well:
   flags = {'runnable', 'callable'}
   flags |= {'runnable', 'kissable'}
   if 'callable' in flags:
      . . .
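
Spelled out as a runnable sketch (the variable name toggled is added
here for illustration):

```python
flags = {"runnable", "callable"}

# OR: union with another set of flag names.
flags |= {"runnable", "kissable"}

# XOR: symmetric difference flips the named flags.
toggled = flags ^ {"callable"}

# Membership test replaces "value & MASK".
is_callable = "callable" in flags

# The names themselves show up when the flags are displayed.
shown = sorted(flags)
```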

We have a hard enough time getting people to not program
Java in Python.  IMO, adding a new enumeration type
would make this situation worse.  Also, it adds weight to
the language -- Python is not in need of yet another fundamental


P.S.  I do recognize that lots of people have written their own
versions of Enum(), but I think they do it either out of habits formed
from statically compiled languages that lack all of our namespace
mechanisms or they do it because it is easy and fun to write
(just like people seem to enjoy writing flatten() recipes more
than they like actually using them).

One other thought:  With Py3.x, the language had its one chance
to get smaller.  Old-style classes were tossed, some built-ins 
vanished, and a few obsolete modules got nuked.  It would be
easy to have a "let's add thingie x" fest and lose those benefits.
There are many devs who find that the language does not 
fit-in-their-heads anymore, so considerable restraint needs to
be exercised before adding a new language feature that would
soon permeate everyone's code base and add yet another thing
that infrequent users have to learn before being able to read code.

From stephen at  Wed Nov 24 03:44:40 2010
From: stephen at (Stephen J. Turnbull)
Date: Wed, 24 Nov 2010 11:44:40 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

James Y Knight writes:

 > You put a smiley, but, in all seriousness, I think that's actually
 > the right thing to do if anyone writes a new programming
 > language. It is clearly the right thing if you don't have to be
 > concerned with backwards-compatibility: nobody really needs to be
 > able to access the Nth codepoint in a string in constant time, so
 > there's not really any point in storing a vector of codepoints.

A sad commentary on the state of Emacs usage, "nobody".

The theory is that accessing the first character of a region in a
string often occurs as a primitive operation in O(N) or worse
algorithms, sometimes without enough locality at the "collection of
regions" level to give a reasonably small average access time.

In practice, any *Emacs user can tell you that yes, we do need to be
able to access the Nth codepoint in a buffer in constant time.  The
O(N) behavior of current Emacs implementations means that people often
use a binary coding system on large files.  Yes, some position caching
is done, but if you have a large file (eg, a mail file) which is
virtually segmented using pointers to regions, locality gets lost.
(This is not a design bug, this is a fundamental requirement: consider
fast switching between threaded view and author-sorted view.)

And of course an operation that sorts regions in a buffer using
character pointers will have the same problem.  Working with memory
pointers, OTOH, sucks more than that; GNU Emacs recently bit the
bullet and got rid of their higher-level memory-oriented APIs, all of
the Lisp structures now work with pointers, and only the very
low-level structures know about character-to-memory pointer
translation.

This performance issue is perceptible even on 3GHz machines with not
so large (50MB) mbox files.  It's *horrid* if you do something like
"occur" on a 1GB log file, then try randomly jumping to detected log
entries.

From fdrake at  Wed Nov 24 03:58:47 2010
From: fdrake at (Fred Drake)
Date: Tue, 23 Nov 2010 21:58:47 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Tue, Nov 23, 2010 at 9:35 PM, Raymond Hettinger
<raymond.hettinger at> wrote:
> The least worst option is to do nothing at all.

For the standard library, I agree.

There are enough variants that are needed/desired in different
contexts, and there isn't a single clear winner.  Nor is there any
compelling reason to have a winner.

I'm generally in favor of enums (or whatever you want to call them),
and I'm in favor of importing support for the flavor you need, or just
defining constants in whatever way makes sense for your library or

I don't see any problems that aren't solved by that.

  -Fred

Fred L. Drake, Jr.    <fdrake at>
"A storm broke loose in my mind."  --Albert Einstein

From jcea at  Wed Nov 24 04:03:36 2010
From: jcea at (Jesus Cea)
Date: Wed, 24 Nov 2010 04:03:36 +0100
Subject: [Python-Dev] Sporadic problems with
In-Reply-To: <ichr2o$4ec$>
References: <>
	<>	<>
Message-ID: <>

On 24/11/10 02:51, Terry Reedy wrote:
>> I hope I can go sleep before dawn :-P.
> I added a comment to one issue and opened another with no problem during
> the last couple of hours.

My changes have gone through now, after about 8 hours and a retry every five minutes.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From glyph at  Wed Nov 24 04:27:38 2010
From: glyph at (Glyph Lefkowitz)
Date: Tue, 23 Nov 2010 22:27:38 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Nov 23, 2010, at 9:44 PM, Stephen J. Turnbull wrote:

> James Y Knight writes:
>> You put a smiley, but, in all seriousness, I think that's actually
>> the right thing to do if anyone writes a new programming
>> language. It is clearly the right thing if you don't have to be
>> concerned with backwards-compatibility: nobody really needs to be
>> able to access the Nth codepoint in a string in constant time, so
>> there's not really any point in storing a vector of codepoints.
> A sad commentary on the state of Emacs usage, "nobody".
> The theory is that accessing the first character of a region in a
> string often occurs as a primitive operation in O(N) or worse
> algorithms, sometimes without enough locality at the "collection of
> regions" level to give a reasonably small average access time.

I'm not sure what you mean by "the theory is".  Whose theory?  About what?

> In practice, any *Emacs user can tell you that yes, we do need to be
> able to access the Nth codepoint in a buffer in constant time.  The
> O(N) behavior of current Emacs implementations means that people often
> use a binary coding system on large files.  Yes, some position caching
> is done, but if you have a large file (eg, a mail file) which is
> virtually segmented using pointers to regions, locality gets lost.
> (This is not a design bug, this is a fundamental requirement: consider
> fast switching between threaded view and author-sorted view.)

Sounds like a design bug to me.  Personally, I'd implement "fast switching between threaded view and author-sorted view" the same way I'd address any other multiple-views-on-the-same-data problem.  I'd retain data structures for both, and update them as the underlying model changed.

These representations may need to maintain cursors into the underlying character data, if they must retain giant wads of character data as an underlying representation (arguably the _main_ design bug in Emacs, that it encourages you to do that for everything, rather than imposing a sensible structure), but those cursors don't need to be code-point counters; they could be byte offsets, or opaque handles whose precise meaning varied with the potentially variable underlying storage.

Also, please remember that Emacs couldn't be implemented with giant Python strings anyway: crucially, all of this stuff is _mutable_ in Emacs.

> And of course an operation that sorts regions in a buffer using
> character pointers will have the same problem.  Working with memory
> pointers, OTOH, sucks more than that; GNU Emacs recently bit the
> bullet and got rid of their higher-level memory-oriented APIs, all of
> the Lisp structures now work with pointers, and only the very
> low-level structures know about character-to-memory pointer
> translation.
> This performance issue is perceptible even on 3GHz machines with not
> so large (50MB) mbox files.  It's *horrid* if you do something like
> "occur" on a 1GB log file, then try randomly jumping to detected log
> entries.

Case in point: "occur" needs to scan the buffer anyway; you can't do better than linear time there.  So you're going to iterate through the buffer, using one of the techniques that James proposed, and remember some locations.  Why not just have those locations be opaque cursors into your data?

In summary: you're right, in that James missed a spot.  You need bidirectional, *copyable* iterators that can traverse the string by byte, codepoint, grapheme, or decomposed glyph.

From v+python at  Wed Nov 24 05:28:19 2010
From: v+python at (Glenn Linderman)
Date: Tue, 23 Nov 2010 20:28:19 -0800
Subject: [Python-Dev] http.server - reference to bug #427345
Message-ID: <>

Where might I find the bug #427345 that is referred to in a comment 
inside http.server ?  Here is a code excerpt:

             # throw away additional data [see bug #427345]
             while select.select([self.rfile._sock], [], [], 0)[0]:
                 if not self.rfile._sock.recv(1):
                     break


From brian.curtin at  Wed Nov 24 05:35:10 2010
From: brian.curtin at (Brian Curtin)
Date: Tue, 23 Nov 2010 22:35:10 -0600
Subject: [Python-Dev] http.server - reference to bug #427345
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 23, 2010 at 22:28, Glenn Linderman <v+python at> wrote:

>  Where might I find the bug #427345 that is referred to in a comment inside
> http.server ?  Here is a code excerpt:
>             # throw away additional data [see bug #427345]
>             while select.select([self.rfile._sock], [], [], 0)[0]:
>                 if not self.rfile._sock.recv(1):
>                     break
> has a box on the left-hand side where you can enter
issue numbers.

From stephen at  Wed Nov 24 06:07:52 2010
From: stephen at (Stephen J. Turnbull)
Date: Wed, 24 Nov 2010 14:07:52 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Note that I'm not saying that there shouldn't be a UTF-8 string type;
I'm just saying that for some purposes it might be a good idea to keep
UTF-16 and UTF-32 string types around.

Glyph Lefkowitz writes:

 > > The theory is that accessing the first character of a region in a
 > > string often occurs as a primitive operation in O(N) or worse
 > > algorithms, sometimes without enough locality at the "collection of
 > > regions" level to give a reasonably small average access time.
 > I'm not sure what you mean by "the theory is".  Whose theory?  About what?

Mine.  About why somebody somewhere someday would need fast random
access to character positions.  "Nobody ever needs that" is a strong

 > > In practice, any *Emacs user can tell you that yes, we do need to be
 > > able to access the Nth codepoint in a buffer in constant time.  The
 > > O(N) behavior of current Emacs implementations means that people often
 > > use a binary coding system on large files.  Yes, some position caching
 > > is done, but if you have a large file (eg, a mail file) which is
 > > virtually segmented using pointers to regions, locality gets lost.
 > > (This is not a design bug, this is a fundamental requirement: consider
 > > fast switching between threaded view and author-sorted view.)
 > Sounds like a design bug to me.  Personally, I'd implement "fast
 > switching between threaded view and author-sorted view" the same
 > way I'd address any other multiple-views-on-the-same-data problem.
 > I'd retain data structures for both, and update them as the
 > underlying model changed.

Um, that's precisely the design I'm talking about.  But as you
recognize later, the message content is not part of those structures
because there's no real point in copying it *if you have fast access
to character positions*.  In a variable width character, character-
addressed design, there can be a perceptible delay in accessing even
the "next" message's content if you're in the wrong view.

 > These representations may need to maintain cursors into the
 > underlying character data, if they must retain giant wads of
 > character data as an underlying representation (arguably the _main_
 > design bug in Emacs, that it encourages you to do that for
 > everything, rather than imposing a sensible structure), but those
 > cursors don't need to be code-point counters; they could be byte
 > offsets, or opaque handles whose precise meaning varied with the
 > potentially variable underlying storage.

Both byte offsets and opaque handles really really suck to design,
implement, and maintain, if Lisp or Python level users can use them.
They're hard enough to do when you can hide them behind internal APIs,
but if they're accessible to users they're an endless source of user
bugs.  What was that you were saying about the difficulty of
remembering which argument is the fd?  It's like that.  Sure, you can
design APIs to help get that right, but it's not easy to provide one
that can be used for all the different applications out there.

 > Also, please remember that Emacs couldn't be implemented with giant
 > Python strings anyway: crucially, all of this stuff is _mutable_ in
 > Emacs.

No, that's a red herring.  The use-cases where Emacs users complain
most is browsing giant logs and reading old mail; neither needs the
content to be mutable (although of course it's a convenience in the
mail case if you delete messages or fetch new mail, but that could be
done with transaction logs that get appended to the on-disk file).

 > Case in point: "occur" needs to scan the buffer anyway; you can't
 > do better than linear time there.  So you're going to iterate
 > through the buffer, using one of the techniques that James
 > proposed, and remember some locations.  Why not just have those
 > locations be opaque cursors into your data?

They are.  But unless you're willing to implement correct character
motion, they need to be character indices, which will be slow to
access the actual locations.  We've implemented caches, as does Emacs,
but they don't always get hits.  Finding an arbitrary position once
can involve perceptible delay on up to 1GHz machines; doing it in a
loop (which mail programs have a habit of doing) could be very

 > In summary: you're right, in that James missed a spot.  You need
 > bidirectional, *copyable* iterators that can traverse the string by
 > byte, codepoint, grapheme, or decomposed glyph.

That's a good start, yes.  But once you talk about "remembering some
locations", you're implicitly talking about random access.  Either you
maintain position indexes which naively implemented can easily be
close to the size of the text buffer (indexes are going to be at least
4 bytes, possibly 8, per position, and something like "occur" can
generate a lot of positions) -- in which case you might as well just
use a representation that is an array in the first place -- or you
need to implement a position cache which can be very hairy to do well.
Or you can give user programs memory indices, and enjoy the fun as
the poor developers do things like "pos += 1" which works fine on
the ASCII data they have lying around, then wonder why they get
Unicode errors when they take substrings.
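
A small sketch of that failure mode, assuming the "memory index" is a
byte offset into UTF-8 data (the sample word and variable names are
invented for illustration):

```python
# "naive" with a diaeresis: the second-to-last vowel takes two bytes.
data = "na\u00efve".encode("utf-8")

pos = 2                                # offset of the two-byte character
prefix = data[:pos].decode("utf-8")    # fine: falls on a boundary

pos += 1                               # now inside that character
try:
    data[:pos].decode("utf-8")
    split_error = None
except UnicodeDecodeError as exc:
    split_error = exc
```

With pure ASCII input every offset is a boundary, so the same code
passes its tests and only fails once real-world text arrives.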

I'm sure it all can be done, but I don't think it will be done right
the first time around.  You may be right that designs better adapted
to large data sets than Emacs's "everything is a big buffer" will
almost always be available with reasonable effort.  But remember, a
lot of good applications start small, when a flat array might make
lots of sense as the underlying structure, and then need to scale.  If
you need to scale for the paying customers, well, "ouch!" but you can
afford it, but for many volunteer or startup projects that takes the
wind right out of your sails.

Note that if the user doesn't use private space, in a UCS-2 build you
have about 1.5K code points available for compressing non-BMP
characters into a 2-byte, valid Unicode representation (of course you
need to save off the table somewhere if that ever gets out of your
program, but that's easy).  I find it hard to imagine that there will
be many use-cases that need more than that many non-BMP characters.
So probably you can tell those few users who care to use a UCS-4
build; most of the array use-cases can be served by UCS-2.  Note that
in my Japanese corpuses, UTF-8 averages just about 2 bytes per
character anyway, and those are mail files, where two lines of
Japanese may be preceded by 2KB of ASCII-only header.  I suspect
Hebrew, Arabic, and Cyrillic users will have similar experiences.

By the way, to send the ball back into your court, I have this feeling
that the demand for UTF-8 is once again driven by native English
speakers who are very shortly going to find themselves, and the data
they are most familiar with, very much in the minority.  Of course the
market that benefits from UTF-8 compression will remain very large for
the immediate future, but in the grand scheme of things, most of the
world is going to prefer UTF-16 by a substantial margin.

N.B.  I'm not talking about persistent storage, where it's 6 of one
and half a dozen of the other; you can translate UTF-8 to UTF-16 way
faster than you can read content from disk, of course.

From foom at  Wed Nov 24 07:26:11 2010
From: foom at (James Y Knight)
Date: Wed, 24 Nov 2010 01:26:11 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Nov 24, 2010, at 12:07 AM, Stephen J. Turnbull wrote:
> Or you can give user programs memory indices, and enjoy the fun as
> the poor developers do things like "pos += 1" which works fine on
> the ASCII data they have lying around, then wonder why they get
> Unicode errors when they take substrings.

a) You seem to be hung up on implementation details of Emacs. But yes, positions should be stored as a byte offset into the UTF-8 string, NOT as the number of codepoints since the beginning of the string. Probably you want it to be somewhat opaque, so that you actually have to specify whether you wanted to go to +1 byte, codepoint, or grapheme.

b) Those poor developers are *already* screwed if they're using pos += 1 when pos is a codepoint index and they then take a substring based on that! They will get half a character when the string contains combining characters...
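
To make that concrete, a short sketch assuming a Python where str
indexes by code point (3.3 or later, or a wide build); the sample
string is invented for illustration:

```python
import unicodedata

# "cafes" with the accent stored as a separate combining code point.
text = "cafe\u0301s"

pos = 3          # code point index of the "e"
pos += 1         # "next character", or so it seems
head, tail = text[:pos], text[pos:]

# The accent has been cut off its base letter and now leads the tail.
orphaned_mark = unicodedata.combining(tail[0]) != 0
```

No exception is raised here, which is arguably worse: the split
silently produces an unaccented "cafe" and a tail that starts with a
stray combining mark.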

Pretending that "codepoints" are a useful abstraction just makes poor developers get by without doing the correct thing (incrementing to the next grapheme boundary) for a little bit longer. But once you [the language implementor] are providing correct abstractions for grapheme movement, it's just as easy to also provide an abstraction for codepoint movement, and make your low-level implementation of the iterator object be a byte-offset into a UTF8 buffer.
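
A minimal sketch of such an iterator object, assuming the buffer holds
valid UTF-8. The class and method names are hypothetical, and only
codepoint movement is shown; grapheme movement would be layered on top
using the Unicode segmentation rules:

```python
class Utf8Cursor:
    """Copyable position in a UTF-8 buffer, stored as a byte offset."""

    def __init__(self, buf, offset=0):
        self.buf = buf          # bytes holding valid UTF-8
        self.offset = offset    # always on a code point boundary

    def copy(self):
        return Utf8Cursor(self.buf, self.offset)

    @staticmethod
    def _is_continuation(byte):
        # Continuation bytes have the bit pattern 10xxxxxx.
        return (byte & 0xC0) == 0x80

    def next_codepoint(self):
        """Advance past one code point; return False at the end."""
        if self.offset >= len(self.buf):
            return False
        self.offset += 1
        while (self.offset < len(self.buf)
               and self._is_continuation(self.buf[self.offset])):
            self.offset += 1
        return True

    def prev_codepoint(self):
        """Step back one code point; return False at the start."""
        if self.offset == 0:
            return False
        self.offset -= 1
        while self.offset > 0 and self._is_continuation(self.buf[self.offset]):
            self.offset -= 1
        return True

    def current(self):
        """Return the code point at the cursor ('' at the end)."""
        end = self.copy()
        if not end.next_codepoint():
            return ""
        return self.buf[self.offset:end.offset].decode("utf-8")
```

Each step costs time proportional to the width of one character, and
a saved copy() returns to its position without any rescan from the
start of the buffer.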


From foom at  Wed Nov 24 07:27:52 2010
From: foom at (James Y Knight)
Date: Wed, 24 Nov 2010 01:27:52 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Nov 24, 2010, at 12:07 AM, Stephen J. Turnbull wrote:
> By the way, to send the ball back into your court, I have this feeling
> that the demand for UTF-8 is once again driven by native English
> speakers who are very shortly going to find themselves, and the data
> they are most familiar with, very much in the minority.  Of course the
> market that benefits from UTF-8 compression will remain very large for
> the immediate future, but in the grand scheme of things, most of the
> world is going to prefer UTF-16 by a substantial margin.

No, the demand for UTF-8 is because that's what much of the internet (and not coincidentally, unix) world has standardized on. The main pieces of software using UTF-16 (Windows, Java) started doing so before it became apparent that 16 bits wasn't enough to actually hold a unicode codepoint, so they were actually implementing UCS-2. In those days, UCS-2 was a fairly sensible choice.

But, now, if your choices are UTF-8 or UTF-16, UTF-8 is clearly superior. Not because it's smaller -- it's pretty much a tossup -- but because it is an ASCII superset, and thus more easily compatible with other software. That also makes it most commonly used for internet communication. (So, there's a huge advantage for using it internally as well right there: no transcoding necessary for writing your HTML output). UTF-16 is incompatible with ASCII, and furthermore, it's still a variable-width encoding, with all the same issues that causes. As such, there's really very little to be said in favor of it.

If you really want a fixed-width encoding, you have to go to UTF-32, which is excessively large. UTF-32 is a losing choice, simply because of the wasted memory usage.

But that's all a side issue: even if you do choose UTF-16 as your underlying encoding, you *still* need to provide iterators that work by "byte" (only now bytes are 16-bits), by codepoint, and by grapheme. Of course, people who implement UTF-16 (such as python, java, and windows) often pretend they're still implementing UCS-2, and don't bother even providing their users with the necessary APIs to do things correctly. Which, you can often get away with...just so long as you don't mind that you sometimes end up splitting a string in the middle of a codepoint and causing a unicode error!
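
To make the variable-width point concrete, here is a small sketch using only
the standard codecs. The choice of character (U+1D11E) is arbitrary; any
code point outside the BMP behaves the same way:

```python
clef = "\U0001D11E"                # MUSICAL SYMBOL G CLEF, outside the BMP
units = clef.encode("utf-16-le")   # a surrogate pair: two 16-bit code units

# Encoded size in bytes of this one code point in each encoding.
sizes = {enc: len(clef.encode(enc))
         for enc in ("utf-8", "utf-16-le", "utf-32-le")}

try:
    units[:2].decode("utf-16-le")  # cut between the two surrogates
    split_ok = True
except UnicodeDecodeError:
    split_ok = False               # half a surrogate pair does not decode
```

Cutting the UTF-16 data at a 16-bit boundary is not enough to stay on a
codepoint boundary, which is the same class of problem as cutting UTF-8 at an
arbitrary byte.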


From v+python at  Wed Nov 24 08:43:18 2010
From: v+python at (Glenn Linderman)
Date: Tue, 23 Nov 2010 23:43:18 -0800
Subject: [Python-Dev] Web servers, bytes, str, documentation,
	Python 3.2a4
In-Reply-To: <>
References: <> <>
	<> <>
Message-ID: <>

On 11/21/2010 8:39 PM, R. David Murray wrote:
> On Sun, 21 Nov 2010 19:59:54 -0800, Glenn Linderman<v+python at>  wrote:
>> On 11/21/2010 9:18 AM, R. David Murray wrote:
>>> I want to look at the CGI issue, but I'm not sure when I'll get to it.
>> Actually, since this code was working before 3.x, and if email.parser
>> can now accept binary streams, it seems like maybe the only thing that
>> might be wrong is that presently it is getting a text stream instead, so
>> that is something or the application program would have to
>> switch, and then maybe some testing would discover correctness, or maybe
>> a specification of UTF-8 as the encoding to use for the text parts would
>> have to be done.
> Well, given the bytes/string split in Python3, code definitely has to
> be changed to make this work, since you have to explicitly call bytes
> processing routines (message_from_bytes, message_from_binary_file,
> BytesFeedparser, etc) to parse binary data, and likewise use
> BytesGenerator to emit binary data.

Looks like also calls http.client and both of them would need to 
be changed to deal with bytes.  I don't have the full translation of API 
calls in my head, nor have I ever used the email.parser API to know what 
the calls actually do... just read a bit about it... but that is 
different than using it...

However, I find code in http.client.parse_headers that is attempting to 
work-around reading a binary stream and feeding email.parser a string.  
So definitely some work to be done to fix things.
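
For reference, a minimal sketch of the bytes-oriented entry point David
mentions (email.message_from_bytes, available from 3.2). The sample message
is made up for illustration, and this says nothing about how http.client or
the CGI code should actually be changed:

```python
import email

# A made-up message whose body contains non-ASCII UTF-8 bytes.
raw = (b"Content-Type: text/plain; charset=utf-8\r\n"
       b"Subject: test\r\n"
       b"\r\n"
       b"body \xc3\xa9\r\n")

msg = email.message_from_bytes(raw)       # parse bytes, not str
subject = msg["Subject"]
charset = msg.get_content_charset()
payload = msg.get_payload(decode=True)    # body comes back as bytes
text = payload.decode(charset)
```

The point is only that the parser is handed bytes and the caller decides when
(and with which charset) the body becomes text.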

I did add some explicit threads to http.server CGI script code that I 
think work around the deadlocks that can result from attempting to 
serialize 3 pipes, and yet not require full buffering of stdin or 
stdout.  At the moment, I still am doing full buffering of stderr, but 
that is thought to be small potatoes in an http.server environment, 

But since my test case is a CGI form data, I'm stuck until this is 
fixed, or I wrap my head around the code in http.client and 
email.parser.  But not tonight (yawn!).


From solipsis at  Wed Nov 24 09:02:13 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 24 Nov 2010 09:02:13 +0100
Subject: [Python-Dev] OpenSSL Vulnerability (openssl-1.0.0a)
In-Reply-To: <>
References: <>
Message-ID: <1290585733.3642.2.camel@localhost.localdomain>

On Tuesday, 23 November 2010 at 20:56 -0500, Glyph Lefkowitz wrote:
> On Nov 23, 2010, at 9:02 AM, Antoine Pitrou wrote:
> > On Tue, 23 Nov 2010 00:07:09 -0500
> > Glyph Lefkowitz <glyph at> wrote:
> >> On Mon, Nov 22, 2010 at 11:13 PM, Hirokazu Yamamoto <
> >> ocean-city at> wrote:
> >> 
> >>> Hello. Does this affect python? Thank you.
> >>> 
> >>>
> >>> 
> >> 
> >> No.
> > 
> > Well, actually it does, but Python links against the system OpenSSL on
> > most platforms (except Windows), so it's up to the OS vendor to apply
> > the patch.
> It does?  If so, I must have misunderstood the vulnerability.  Can you
> explain how it affects Python?

If I believe the link above:
"Any OpenSSL based TLS server is vulnerable if it is multi-threaded and
uses OpenSSL's internal caching mechanism. Servers that are
multi-process and/or disable internal session caching are NOT affected."

So, you just have to create a multithreaded TLS server which doesn't
disable server-side session caching (it is enabled by default according
to )



From solipsis at  Wed Nov 24 09:42:07 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 24 Nov 2010 09:42:07 +0100
Subject: [Python-Dev] Centos 5.5 freeze during test_concurrent_futures
References: <>
Message-ID: <>


> py3k built from trunk on Centos 5.5 freezes during regrtest on test_concurrent_futures with "Fatal Python error: Invalid thread state for this thread". As in a typical concurrent problem, subsequent calls freeze in different test cases, but the freeze itself is always reproducible and always during this test.

Well, could you run this under gdb and report the stacks for the
various threads when the process crashes?
(when compiled --with-pydebug, if possible)

Thank you


From solipsis at  Wed Nov 24 09:43:12 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 24 Nov 2010 09:43:12 +0100
Subject: [Python-Dev] http.server - reference to bug #427345
References: <>
Message-ID: <>

On Tue, 23 Nov 2010 22:35:10 -0600
Brian Curtin <brian.curtin at> wrote:
> On Tue, Nov 23, 2010 at 22:28, Glenn Linderman
> <v+python at>
> > wrote:
> >  Where might I find the bug #427345 that is referred to in a comment inside
> > http.server ?  Here is a code excerpt:
> >
> >             # throw away additional data [see bug #427345]
> >             while select.select([self.rfile._sock], [], [], 0)[0]:
> >                 if not self.rfile._sock.recv(1):
> >                     break
> >
> has a box on the left-hand side where you can enter
> issue numbers.

And of course you can also reverse-engineer the clever URL scheme used
by Roundup bug entries ;)



From stephen at  Wed Nov 24 10:03:29 2010
From: stephen at (Stephen J. Turnbull)
Date: Wed, 24 Nov 2010 18:03:29 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

James Y Knight writes:

 > a) You seem to be hung up on implementation details of emacs.

Hung up?  No.  It's the program whose text model I know best, and even
if its design could theoretically be a lot better for this purpose, I
can't say I've seen a real program whose model is obviously better for
the purpose of a language for implementing text editors.[1]  So it's not
obvious to me that its model can be ruled out on a priori grounds.  If
not, it would be nice if your new language could implement it
efficiently without contorted programming.

 >    But yes, positions should be stored as a byte offset into the
 >    utf8 string. NOT as number of codepoints since the beginning of
 >    the string. Probably you want it to be somewhat opaque, so that
 >    you actually have to specify whether you wanted to go to +1
 >    byte, codepoint, or grapheme.

Well, first of all, +1 byte should not be available to a text
iterator, at least not with the same iterator/position object that
implements character and/or grapheme movement.  (You seem to have
thought about this issue a lot, but mixing bytes with text units makes
me wonder how much practical implementation you've done.)

Second, incrementing to grapheme boundaries is relatively easy to do
efficiently, just as incrementing to a UTF-8 character boundary is
easy to do.  We already do the latter, the former is pragmatically
harder, but not a conceptual stretch.  That's not the question.  The
question is how do we identify an arbitrary position in the text?
Sometimes it's nice to have a numerical measure of size or location.

It is not obvious that position by grapheme count is going to be the
obvious way to determine position in a text.  Eg, for languages with
variable metric characters, character counts as a way of lining up
table columns is going the way of Tyrannosaurus.  In the Han-using
languages, yes, column counts within lines are going to be important
forever, because the characters are literally square for most
practical purposes ... but they don't use composing characters (all
the Japanese kana are precomposed, for example), so position by
grapheme is going to be very close to position by character, and fine
positioning will be done either by mouse or by incrementing the last
few characters.  Nor do I think operations like "advance 1,000,000
characters" will have less meaning than "advance 1,000,000 graphemes."
Both of them are just a way of saying "go way far away", end up in
about the same place, and where there's a bias, it will be pretty
consistent in a statistical sense for any given natural language (and
therefore, for 99% of users).

 > But once you [the language implementor] are providing correct
 > abstractions for grapheme movement, it's just as easy to also
 > provide an abstraction for codepoint movement, and make your
 > low-level implementation of the iterator object be a byte-offset
 > into a UTF8 buffer.

Sure, that's fine for something that just iterates over the text.  But
if you actually need to remember positions, or regions, to jump to
later or to communicate to other code that manipulates them, doing
this stuff the straightforward way (just copying the whole iterator
object to hang on to its state) becomes expensive.  You end up
proliferating types that all do the same kind of thing.  Judicious use
of inheritance helps, but getting the fundamental abstraction right is
hard.  Or at least, Emacs hasn't found it in 20 years of trying.

OTOH, all that stuff "just works" and just works efficiently, up to
the grapheme vs. character issue, with an array.
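
A rough sketch of the trade-off, assuming a plain UTF-8 bytes buffer and no
cached index (the helper name byte_offset is hypothetical): an integer
character position has to be turned into a byte offset by scanning, whereas
with an array of characters the position is the index.

```python
def byte_offset(buf, char_index):
    """Return the byte offset of the char_index-th code point in UTF-8 buf.

    This is a linear scan; with a fixed-width array the same lookup is O(1).
    """
    count = 0
    for i, b in enumerate(buf):
        if (b & 0xC0) != 0x80:      # not a continuation byte: a code point starts here
            if count == char_index:
                return i
            count += 1
    if count == char_index:         # position just past the last code point
        return len(buf)
    raise IndexError(char_index)

buf = "a\u00e9\u20acb".encode("utf-8")   # 1-, 2-, 3- and 1-byte sequences
offsets = [byte_offset(buf, i) for i in range(5)]
```

An implementation could of course cache or index these offsets; the sketch
only shows what the uncached lookup costs.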

About that issue, to go back to tired old Emacs, *all* of the things I
can think of that I might want to do by grapheme (display, insert,
delete, move a few places) do fit the "increment until done" model.
These things already work quite well for the variable-width buffer
that "multilingual" Emacsen use, whether the old Mule encoding or
UTF-8.  So I can see how the UTF-8 model with appropriate iterators
for characters and graphemes can work well for lots of applications
and use cases.

But Emacs already has opaque "markers", yet nevertheless the use of
integer character positions in strings and buffers has survived.  That
*may* have to do with mutability, and the "all the world is a buffer"
design, as Glyph suggested, but I think it more likely that markers
are very expensive to create and use compared to integers.  Perhaps an
editor of power similar to Emacs could be implemented with string
operations on lines, or the like, and these issues would go away.  But
it's not obvious to me.

[1]  Yes, I know that not all programs are text editors.  So shoot
me.  It's still the text manipulation program I know best, and it's
not obvious to me that it's the unique class that would need these

From stephen at  Wed Nov 24 10:51:49 2010
From: stephen at (Stephen J. Turnbull)
Date: Wed, 24 Nov 2010 18:51:49 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

James Y Knight writes:

 > But, now, if your choices are UTF-8 or UTF-16, UTF-8 is clearly
 > superior [...] because it is an ASCII superset, and thus more
 > easily compatible with other software. That also makes it most
 > commonly used for internet communication.

Sure, UTF-8 is very nice as a protocol for communicating text.  So
what?  If your application involves shoveling octets real fast, don't
convert and shovel those octets.  If your application involves
significant text processing, well, conversion can almost always be
done as fast as you can do I/O so it doesn't cost wallclock time, and
generally doesn't require a huge percentage of CPU time compared to
the actual text processing.  It's just a specialization of
serialization, that we do all the time for more complex data
structures.

So wire protocols are not a killer argument for or against any
particular internal representation of text.

 > (So, there's a huge advantage for using it internally as well right
 > there: no transcoding necessary for writing your HTML output).

I don't know your use cases but for mine, transcoding (whether in Lisp
or Python or C) is invariably the least of my worries.  *Especially*
transcoding to UTF-8, which is the default codec for me, and I *never*
mix bytes and text, so having not bothered to set the codec, I don't
bother to transcode explicitly.

 > If you really want a fixed-width encoding, you have to go to
 > UTF-32

Not really.  I never bothered implementing the codec, because I
haven't yet seen a non-BMP Unicode character in the wild (I still see
a lot of non-Unicode characters, but hey, that's the price you pay for
living in the land that invented sushi, sake, and anime).  For most
use cases, those are going to be rare, where by "rare" I mean "you
aren't going to see 6400 *different* non-BMP characters."[1]  So
instead of having the codec produce UTF-16, you have it produce (Holy
CEF, Batman!) "pure" UCS-2 with the non-BMP characters registered on
demand and encoded in the BMP private area.  Python, of course, will
never know the difference, and your language won't need to care, either.
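
A sketch of how such an on-demand registry might look. This is only an
illustration of the idea, not the codec described above: the class name
PuaRegistry is hypothetical, and it assumes the input text does not already
use BMP private-use code points for its own purposes.

```python
class PuaRegistry:
    """Map non-BMP characters to BMP private-use code points on demand."""

    PUA_START, PUA_END = 0xE000, 0xF8FF     # 6400 private-use slots in the BMP

    def __init__(self):
        self._to_pua = {}
        self._from_pua = {}

    def encode(self, text):
        out = []
        for ch in text:
            if ord(ch) > 0xFFFF:            # outside the BMP
                pua = self._to_pua.get(ch)
                if pua is None:             # first sighting: register it
                    code = self.PUA_START + len(self._to_pua)
                    if code > self.PUA_END:
                        raise OverflowError("private-use area exhausted")
                    pua = chr(code)
                    self._to_pua[ch] = pua
                    self._from_pua[pua] = ch
                out.append(pua)
            else:
                out.append(ch)
        return "".join(out)

    def decode(self, text):
        return "".join(self._from_pua.get(ch, ch) for ch in text)

registry = PuaRegistry()
original = "a\U0001D11Eb\U0001D11E"
internal = registry.encode(original)        # every character now fits in 16 bits
```

Every character of internal is a single BMP code point, so fixed-width
indexing works, and registry.decode() restores the original text.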

 > But that's all a side issue: even if you do choose UTF-16 as your
 > underlying encoding, you *still* need to provide iterators that
 > work by "byte" (only now bytes are 16-bits), by codepoint,

Nope, see above.  Codepoints can be bytes and vice versa.  The needed
codec is no harder to use than any other codec, and only slightly less
efficient than the normal UTF-8 codec unless you're basically
restricted to a rather uncommon script (and even then there are

 > and by grapheme.

Sure, but as I point out elsewhere, the use cases where grapheme
movement is distinguished from character movement I can come up with
are all iterative, and I don't need array behavior for both anyway.
So since I *can* have a character array in Unicode, and I *can't* have
a grapheme array (except maybe by a scheme like the above), I'll go
for the character array.

Unless maybe you convince me I don't need it, but I'm yet to be

 > away with...just so long as you don't mind that you sometimes end
 > up splitting a string in the middle of a codepoint and causing a
 > unicode error!

I *do* mind, but I like Python anyway.<wink>

[1]  OK, in practice a lot of the private space will be taken by
existing system characters, such as the Apple logo (absolutely
essential for writing email on Mac, at least in Japan).  Whose
use-case is going to see 1000 different non-BMP characters in a
session?  I do know a couple of Buddhist dictionary editors, but
aside from them, I can't think of anybody.  Lara Croft, maybe.

From solipsis at  Wed Nov 24 11:27:30 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 24 Nov 2010 11:27:30 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
References: <>
	<> <>
Message-ID: <>

On Wed, 24 Nov 2010 18:51:49 +0900
"Stephen J. Turnbull" <stephen at> wrote:
> James Y Knight writes:
>  > But, now, if your choices are UTF-8 or UTF-16, UTF-8 is clearly
>  > superior [...] because it is an ASCII superset, and thus more
>  > easily compatible with other software. That also makes it most
>  > commonly used for internet communication.
> Sure, UTF-8 is very nice as a protocol for communicating text.  So
> what?  If your application involves shoveling octets real fast, don't
> convert and shovel those octets.  If your application involves
> significant text processing, well, conversion can almost always be
> done as fast as you can do I/O so it doesn't cost wallclock time, and
> generally doesn't require a huge percentage of CPU time compared to
> the actual text processing.  It's just a specialization of
> serialization, that we do all the time for more complex data
> structures.
> So wire protocols are not a killer argument for or against any
> particular internal representation of text.

Agreed. Decoding and encoding utf-8 is so fast that it should be
dwarfed by any actual processing done on the text.



From solipsis at  Wed Nov 24 12:37:54 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 24 Nov 2010 12:37:54 +0100
Subject: [Python-Dev] r86726 -
References: <>
Message-ID: <>

On Wed, 24 Nov 2010 11:39:23 +0100 (CET)
armin.rigo <python-checkins at> wrote:
> Author: armin.rigo
> Date: Wed Nov 24 11:39:23 2010
> New Revision: 86726
> Log:
> A no-op change.  It looks like this call was not meant to be a recursive
> call, but just call the helper (which the recursive call ends up doing).

Since it's allegedly a no-op change, it doesn't come with a test, and
2.7.1 is in rc phase, is it really the right time to do it? What is the
motivation for it?



> Modified:
>    python/branches/release27-maint/Objects/setobject.c
> Modified: python/branches/release27-maint/Objects/setobject.c
> ==============================================================================
> --- python/branches/release27-maint/Objects/setobject.c	(original)
> +++ python/branches/release27-maint/Objects/setobject.c	Wed Nov 24 11:39:23 2010
> @@ -1858,7 +1858,7 @@
>          tmpkey = make_new_set(&PyFrozenSet_Type, key);
>          if (tmpkey == NULL)
>              return -1;
> -        rv = set_contains(so, tmpkey);
> +        rv = set_contains_key(so, tmpkey);
>          Py_DECREF(tmpkey);
>      }
>      return rv;

From fuzzyman at  Wed Nov 24 13:30:15 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 24 Nov 2010 12:30:15 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>
Message-ID: <>

On 23/11/2010 14:16, Nick Coghlan wrote:
> On Tue, Nov 23, 2010 at 11:50 PM, Michael Foord
> <fuzzyman at>  wrote:
>> PEP 354 was rejected for two primary reasons - lack of interest and nowhere
>> obvious to put it. Would it be *so bad* if an enum type lived in its own
>> module? There is certainly more interest now, and if we are to use something
>> like this in the standard library it *has* to be in the standard library
>> (unless every module implements their own private _Constant class).
>> Time to revisit the PEP?
> If you (or anyone else) wanted to revisit the PEP, then I would advise
> trawling through the standard library looking for constants that could
> be sensibly converted to enum values.

Based on a non-exhaustive search, Python standard library modules 
currently using integers for constants:

* re - has flags (OR'able constants) defined in sre_constants, each flag 
has two names (e.g. re.IGNORECASE and re.I)
* os has SEEK_SET, SEEK_CUR, SEEK_END - *plus* those implemented in 
posix / nt
* doctest has its own flag system, but is really just using integer 
flags / constants (quite a few of them)
* token has a tonne of constants (autogenerated)
* socket exports a bunch of constants defined in _socket

* errno (builtin module)


* opcode has HAVE_ARGUMENT, EXTENDED_ARG. In fact pretty much the whole 
of opcode is about defining and exposing named constants
* msilib uses flag constants
* multiprocessing.pool - RUN, CLOSE, TERMINATE
* multiprocessing.util - NOTSET, SUBDEBUG, DEBUG, INFO, SUBWARNING
* xml.dom and xml.dom.Node (in have a bunch of constants
* xml.dom.NodeFilter.NodeFilter holds a bunch of constants (some of them 
* xmlrpc.client has a bunch of error constants
* calendar uses constants to represent weekdays, plus one for the EPOCH 
that is best left alone
* http.client has a tonne of constants - recognisable as ports / error 
codes though
* dis has flags in COMPILER_FLAG_NAMES, which are then set as locals in 
* io defines SEEK_SET, SEEK_CUR, SEEK_END (same as os)

Where constants are implemented in C but exported via a Python module 
(the constants exported by os and socket for example) they could be 
wrapped. Where they are exported directly by a C extension or builtin 
module (e.g. errno) they are probably best left.

Raymond feels that having an enum / constant type would be Javaesque and 
unused. If we used it in the standard library the unused fear at least 
would be unwarranted. The change would be largely transparent to 
developers, except they get better debugging info. Twisted is also 
looking for an enum / constant type:

Because we would need to subclass int for backwards compatibility, 
it couldn't replace float / string constants (unless the base class is 
set dynamically, which I don't propose). Hopefully it 
would still be sufficient to allow Twisted to use it. (Although they do 
so love reimplementing parts of the standard library - usually better 
than the standard library it has to be said.)

All the best,


There are a tonne of constants that are used as numbers (MAX_LINE_LENGTH 
appears in a few places) and aren't just arbitrary constants. There are 
also some other interesting ones:

* poplib has POP3_PORT, POP3_SSL_PORT - recognisable as port numbers, 
should be left as ints
* colorsys has float constants
* tty uses constants for termios list indexes (used as numbers I guess)
* curses.ascii has a whole bunch of integer constants referring to ascii 
* Several modules - decimal, concurrent.futures, uuid (and now inspect) 
already use strings

> A decision would also need to be made as to whether or not to subclass
> int, or just provide __index__ (the former has the advantage of being
> able to drop cleanly into OS level APIs that expect a numerical
> constant).
> Whether enums should provide arbitrary name-value mappings (ala C
> enums) or were restricted to sequential indices starting from zero
> would be another question best addressed by a code survey of at least
> the stdlib.
> And getgeneratorstate() doesn't count as a use case, since the
> ordering isn't needed and using string literals instead of integers
> will cover the debugging aspect :)
> Cheers,
> Nick.


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From ncoghlan at  Wed Nov 24 15:08:04 2010
From: ncoghlan at (Nick Coghlan)
Date: Thu, 25 Nov 2010 00:08:04 +1000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Wed, Nov 24, 2010 at 10:30 PM, Michael Foord
<fuzzyman at> wrote:
> Based on a non-exhaustive search, Python standard library modules currently
> using integers for constants:

Thanks for that review. I think following up on the "NamedConstant"
idea may make more sense than pursuing enums in their own right. That
way we could get the debugging benefits on the Python side regardless
of any type constraints on the value (e.g. needing to be an integer in
order to interface to C code), without needing to design an enum API
that suited all purposes.
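
One way the idea could look, as a sketch only: the name NamedConstant comes
from this thread, but the constructor signature and repr format below are
assumptions, not a proposed API.

```python
import io

class NamedConstant(int):
    """An int that remembers its name, for friendlier debugging output."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        return "<%s: %d>" % (self._name, int(self))

# Hypothetical replacement for a plain integer constant.
SEEK_SET = NamedConstant("SEEK_SET", 0)

stream = io.BytesIO(b"abcdef")
stream.read(4)
position = stream.seek(1, SEEK_SET)   # still accepted wherever an int is expected
```

Because the value is still an int, existing callers and C-level APIs keep
working, while repr(SEEK_SET) shows the name instead of a bare 0.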


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From exarkun at  Wed Nov 24 16:01:06 2010
From: exarkun at (exarkun at
Date: Wed, 24 Nov 2010 15:01:06 -0000
Subject: [Python-Dev] OpenSSL Vulnerability (openssl-1.0.0a)
In-Reply-To: <1290585733.3642.2.camel@localhost.localdomain>
References: <>
Message-ID: <20101124150106.2109.660794265.divmod.xquotient.197@localhost.localdomain>

On 08:02 am, solipsis at wrote:
>On Tuesday, 23 November 2010 at 20:56 -0500, Glyph Lefkowitz wrote:
>>On Nov 23, 2010, at 9:02 AM, Antoine Pitrou wrote:
>> > On Tue, 23 Nov 2010 00:07:09 -0500
>> > Glyph Lefkowitz <glyph at> wrote:
>> >> On Mon, Nov 22, 2010 at 11:13 PM, Hirokazu Yamamoto <
>> >> ocean-city at> wrote:
>> >>
>> >>> Hello. Does this affect python? Thank you.
>> >>>
>> >>>
>> >>>
>> >>
>> >> No.
>> >
>> > Well, actually it does, but Python links against the system OpenSSL on
>> > most platforms (except Windows), so it's up to the OS vendor to apply
>> > the patch.
>>It does?  If so, I must have misunderstood the vulnerability.  Can you
>>explain how it affects Python?
>If I believe the link above:
> "Any OpenSSL based TLS server is vulnerable if it is multi-threaded and
>uses OpenSSL's internal caching mechanism. Servers that are
>multi-process and/or disable internal session caching are NOT 
>affected."
>So, you just have to create a multithreaded TLS server which doesn't
>disable server-side session caching (it is enabled by default according

Hm.  The session cache is enabled by default, but nothing will ever use 
it unless the server specifies a session id using 
SSL_set_session_id_context or SSL_CTX_set_session_id_context.  Python 
doesn't expose these, so I don't think any Python SSL server can set 
them.

The vulnerability announcement isn't 100% clear on this, but I took a 
look at the patch which fixes the issue and it /appears/ as though if a 
client never tries to re-use a session then you will be safe from this 
bug.  However, perhaps this only means that only malicious clients 
(which send a session id even when they can't actually have one) will be 
able to trigger the bug.

Or I may misunderstand how SSL sessions work in OpenSSL entirely.  The 
documentation for them is on par with that for most of the rest of 
OpenSSL.


From solipsis at  Wed Nov 24 16:11:20 2010
From: solipsis at (Antoine Pitrou)
Date: Wed, 24 Nov 2010 16:11:20 +0100
Subject: [Python-Dev] OpenSSL Vulnerability (openssl-1.0.0a)
References: <>
Message-ID: <>

On Wed, 24 Nov 2010 15:01:06 -0000
exarkun at wrote:
> >
> >If I believe the link above:
> > "Any OpenSSL based TLS server is vulnerable if it is multi-threaded and
> >uses OpenSSL's internal caching mechanism. Servers that are
> >multi-process and/or disable internal session caching are NOT 
> >affected."
> >
> >So, you just have to create a multithreaded TLS server which doesn't
> >disable server-side session caching (it is enabled by default according
> >to 
> >)
> Hm.  The session cache is enabled by default, but nothing will ever use 
> it unless the server specifies a session id using 
> SSL_set_session_id_context or SSL_CTX_set_session_id_context.  Python 
> doesn't expose these, so I don't think any Python SSL server can set 
> them.

Well, Python calls SSL_CTX_set_session_id_context() implicitly, starting
from 3.2 (precisely so that the session cache gets used). The
"documentation" I've found about the "session id context" seems to
suggest that a process-wide constant is enough.

(and you can verify that caching occurs using the new
SSLContext.session_stats() method)

> Or I may misunderstand how SSL sessions work in OpenSSL entirely.  The 
> documentation for them is on par with that for most of the rest of 
> OpenSSL.




From steve at  Wed Nov 24 16:44:57 2010
From: steve at (Steven D'Aprano)
Date: Thu, 25 Nov 2010 02:44:57 +1100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>
	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>
Message-ID: <>

Nick Coghlan wrote:
> On Wed, Nov 24, 2010 at 10:30 PM, Michael Foord
> <fuzzyman at> wrote:
>> Based on a non-exhaustive search, Python standard library modules currently
>> using integers for constants:
> Thanks for that review. I think following up on the "NamedConstant"
> idea may make more sense than pursuing enums in their own right. 

Pardon me if I've missed something in this thread, but when you say 
"NamedConstant", do you mean actual constants that can only be bound 
once but not re-bound? If so, +1. If not, what do you mean?

I thought PEP 3115 could be used to implement such constants, but I 
can't get it to work...

class readonlydict(dict):
     def __setitem__(self, key, value):
         if key in self:
             raise TypeError("can't rebind constant")
         dict.__setitem__(self, key, value)
     # Need to also handle updates, del, pop, etc.

class MetaConstant(type):
     @classmethod
     def __prepare__(metacls, name, bases):
         return readonlydict()
     def __new__(cls, name, bases, classdict):
         assert type(classdict) is readonlydict
         return type.__new__(cls, name, bases, classdict)

class Constant(metaclass=MetaConstant):
     a = 1
     b = 2
     c = 3

What I expect is that Constant.a should return 1, and Constant.a=2 
should raise TypeError, but what I get is a normal class __dict__.

 >>> Constant.a
 1
 >>> Constant.a = 2
 >>> Constant.a
 2


From exarkun at  Wed Nov 24 17:23:12 2010
From: exarkun at (exarkun at
Date: Wed, 24 Nov 2010 16:23:12 -0000
Subject: [Python-Dev] OpenSSL Vulnerability (openssl-1.0.0a)
In-Reply-To: <>
References: <>
Message-ID: <20101124162312.2109.1025683352.divmod.xquotient.215@localhost.localdomain>

On 03:11 pm, solipsis at wrote:
>On Wed, 24 Nov 2010 15:01:06 -0000
>exarkun at wrote:
>> >
>> >If I believe the link above:
>> > "Any OpenSSL based TLS server is vulnerable if it is multi-threaded and
>> >uses OpenSSL's internal caching mechanism. Servers that are
>> >multi-process and/or disable internal session caching are NOT
>> >affected."
>> >
>> >So, you just have to create a multithreaded TLS server which doesn't
>> >disable server-side session caching (it is enabled by default according
>> >to 
>> >)
>>Hm.  The session cache is enabled by default, but nothing will ever use
>>it unless the server specifies a session id using
>>SSL_set_session_id_context or SSL_CTX_set_session_id_context.  Python
>>doesn't expose these, so I don't think any Python SSL server can set
>>them.
>Well, Python calls SSL_CTX_set_session_id_context() implicitly, starting
>from 3.2 (precisely so that the session cache gets used). The
>"documentation" I've found about the "session id context" seems to
>suggest that a process-wide constant is enough.

Ah.  Okay, then Python 3.2 would be vulnerable.  Good thing it isn't 
released yet. ;)


From benjamin at  Wed Nov 24 17:32:56 2010
From: benjamin at (Benjamin Peterson)
Date: Wed, 24 Nov 2010 10:32:56 -0600
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

2010/11/24 Steven D'Aprano <steve at>:
> Nick Coghlan wrote:
>> On Wed, Nov 24, 2010 at 10:30 PM, Michael Foord
>> <fuzzyman at> wrote:
>>> Based on a non-exhaustive search, Python standard library modules
>>> currently
>>> using integers for constants:
>> Thanks for that review. I think following up on the "NamedConstant"
>> idea may make more sense than pursuing enums in their own right.
> Pardon me if I've missed something in this thread, but when you say
> "NamedConstant", do you mean actual constants that can only be bound once
> but not re-bound? If so, +1. If not, what do you mean?
> I thought PEP 3115 could be used to implement such constants, but I can't
> get it to work...
> class readonlydict(dict):
> ? ?def __setitem__(self, key, value):
> ? ? ? ?if key in self:
> ? ? ? ? ? ?raise TypeError("can't rebind constant")
> ? ? ? ?dict.__setitem__(self, key, value)
> ? ?# Need to also handle updates, del, pop, etc.
> class MetaConstant(type):
> ? ?@classmethod
> ? ?def __prepare__(metacls, name, bases):
> ? ? ? ?return readonlydict()
> ? ?def __new__(cls, name, bases, classdict):
> ? ? ? ?assert type(classdict) is readonlydict
> ? ? ? ?return type.__new__(cls, name, bases, classdict)
> class Constant(metaclass=MetaConstant):
> ? ?a = 1
> ? ?b = 2
> ? ?c = 3
> What I expect is that Constant.a should return 1, and Constant.a=2 should
> raise TypeError, but what I get is a normal class __dict__.

The construction namespace can be customized, but class.__dict__ must
always be a real dict.
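
Since class.__dict__ is always a real dict, one possible workaround is to
put the rebinding check in the metaclass's __setattr__ instead of in the
namespace returned by __prepare__.  The following is only a minimal sketch
under that assumption; it reuses the names from the quoted example and is
not something proposed in this thread.

```python
class MetaConstant(type):
    """Metaclass that refuses to rebind or delete existing class attributes."""

    def __setattr__(cls, name, value):
        # cls.__dict__ is a read-only proxy, but membership tests work on it.
        if name in cls.__dict__:
            raise TypeError("can't rebind constant %r" % name)
        super().__setattr__(name, value)

    def __delattr__(cls, name):
        raise TypeError("can't delete constant %r" % name)


class Constant(metaclass=MetaConstant):
    a = 1
    b = 2
    c = 3
```

With this, Constant.a still returns 1, while Constant.a = 2 raises
TypeError.  Names that are not yet bound can still be added, which matches
the "bind once" reading of a constant.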


From jsbueno at  Wed Nov 24 18:23:57 2010
From: jsbueno at (Joao S. O. Bueno)
Date: Wed, 24 Nov 2010 15:23:57 -0200
Subject: [Python-Dev] Fwd:  constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Hi --

If I may add my 0.02 cents - this sample has a sample implementation
of the proposed features I found most interesting up to now:
1) inherit from int
2) display the constant's name on 'repr'
3) optionally populate a module with the constants
4) Optionally provide a starting value for the enum
5) Optionally provide a mapping with the values

(implementation is in python 2)

Todo here:
6) Make them "read only"
7) Make the base type optional, with "int" as default - but also being able
to create "constants" inheriting from other objects
8) more ideas?

I am willing to keep working on this sample code as the discussion goes
on, if there is any feedback.
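
Features 1 to 4 of the list above can be sketched roughly as follows.  This
is not the linked implementation (which is in Python 2); NamedInt and
make_constants are hypothetical names chosen for illustration, and the
sketch is written for Python 3.

```python
class NamedInt(int):
    """An int that remembers the name it was defined under."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        return '<constant %s=%d>' % (self._name, int(self))


def make_constants(names, start=0, namespace=None):
    """Create consecutive named constants, optionally populating a namespace.

    `namespace` may be any mutable mapping, for example a module's globals().
    """
    constants = {}
    for offset, name in enumerate(names):
        constants[name] = NamedInt(name, start + offset)
    if namespace is not None:
        namespace.update(constants)
    return constants
```

Calling make_constants(['RED', 'GREEN'], start=1, namespace=globals()) would
bind RED and GREEN in the calling module; the returned dict is only a
convenience.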


From alexander.belopolsky at  Wed Nov 24 18:37:43 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Wed, 24 Nov 2010 12:37:43 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Tue, Nov 23, 2010 at 2:18 PM, Amaury Forgeot d'Arc
<amauryfa at> wrote:
>> Given the apparent difficulty of writing even basic text processing
>> algorithms in presence of surrogate pairs, I wonder how wise it is to
>> expose Python users to them.
> This was already discussed two years ago:

Thanks for the link.   Let me summarize that discussion as I read it.

The discussion starts with a reference to Guido's 2001 post which concluded with

"""
... if we had wanted to use a
variable-length internal representation, we should have picked UTF-8
way back, like Perl did.  Moving to a UTF-16-based internal
representation now will give us all the problems of the Perl choice
without any of the benefits.
""" [1]

and proposes to move to UCS-4 completely for Python 3.0.  Note that
this is not the option that I would like to discuss here.   I don't
propose to discuss abandoning narrow builds.  Instead, I would like to
discuss the costs and benefits associated with using variable width
CES as an internal representation.  This is where the 2008 discussion
moved.  OP did not realize that narrow build supported UTF-16 and like
myself was surprised that application developers should be aware of
surrogates if they want to use narrow builds.  It was also suggested
that Python itself is likely to have many bugs that can be triggered
by non-BMP characters on narrow builds.  Guido's response was:

"""
I'd also prefer to receive bug reports about breakages actually
encountered in the wild than purely theoretical issues
"""

I don't think this is a good position to take.  Programs that expect
one code unit where Python may produce two are likely to have security
holes.  Even when programmers carefully sanitize their input, they are
likely to do it at the code point level based on Unicode category and
0xFFFF boundary does not mean anything special for their applications.
  I think anyone who wants to write a robust application has two
choices in practice:  (a) use wide Unicode build; (b) restrict all
text to BMP.  Supporting surrogates at the application level is likely
to be prohibitively expensive.

It was later suggested that the main benefit of "UTF-16" builds is
that they can easily interface with system libraries that are "UTF-16"
based.  However, how likely are these libraries to be bug-free when it
comes to non-BMP characters?  History teaches us that not very

Daniel Arbuckle presented arguments against imposing the burden of
dealing with surrogates on application writers. [2]

The recurrent theme on the thread was that non-BMP characters are rare
and those who need them can afford the extra development cost
associated with the surrogates.  This point was very eloquently
articulated by Guido:

"""
Who are the many here? Who are the few? I'd venture that (at least for
the foreseeable future, say, until China will finally have taken over
the role of the US as the de-facto dominant super power :-) the many
are people whose app will never see a Unicode character outside the
BMP, or who do such minimal string processing that their code doesn't
care whether it's handling UTF-16-encoded data.
""" [3]

This argument can also be used to support the position that narrow
builds should not support non-BMP characters.

Later the discussion started resembling this thread when it went into
a scholastic dispute over fine points in Unicode Standard terminology.

Then BDFL vetoed len(u"\U00012345") returning 1 on narrow builds. [4]
I would be against that as well.  I don't see len("\U00012345") == 2
as a big problem because application developers can simply avoid using
\U literals if they don't want to support non-BMP characters.  On the
other hand, an option to warn users about non-BMP literals on a narrow
build may be useful but it is easy to implement in lint-like tools.

There were multiple suggestions for standard library additions to help
application writers to deal with surrogate pairs, but as far as I can
tell, nothing has been done in this area in the following two years.
I don't think there is a recipe for fixing a legacy
character-by-character processing loop such as

   for c in string:

to make it iterate over code points consistently in wide and narrow
builds.  (Note that I am not asking for a grapheme iterator here.
This is clearly an application level feature.)
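
A minimal sketch of what such a recipe could look like, assuming the
string is seen as a sequence of code units in which a high surrogate
followed by a low surrogate stands for one code point.  iter_code_points is
a hypothetical helper name, not an existing standard library function.

```python
def iter_code_points(units):
    """Yield one string per code point, joining UTF-16 surrogate pairs.

    Unpaired surrogates are passed through unchanged rather than rejected.
    """
    pending = None  # high surrogate waiting for its low half
    for u in units:
        if pending is not None:
            if '\udc00' <= u <= '\udfff':
                yield pending + u  # complete pair: one code point
                pending = None
                continue
            yield pending  # unpaired high surrogate
            pending = None
        if '\ud800' <= u <= '\udbff':
            pending = u
        else:
            yield u
    if pending is not None:
        yield pending  # string ended on a high surrogate
```

Replacing "for c in string:" with "for c in iter_code_points(string):"
gives the same number of iterations whether a non-BMP character is stored
as one unit or as a surrogate pair; the loop body must then accept that c
may have length 2.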

> So yes, wrap() and center() should be fixed.

I opened an issue 10521 for that. [5]  I am fully prepared to see it
dismissed as "theoretical" and be closed with "won't fix" or linger
indefinitely.   Fixing it would most likely involve writing the second
version of pad() utility function specifically for the narrow build.

All examples I've seen in Python C code of dealing with surrogates
came with hand-coded #ifndef Py_UNICODE_WIDE fragments and no
user-friendly macros or APIs that would abstract it away.

A quick grep for maxunicode in the standard library revealed only one
case of "narrow-build aware" code:

        if sys.maxunicode != 65535:
            # XXX: negation does not work with big charsets
            return charset

See  Lib/  Not exactly a model to follow.

To conclude, I feel that rather than trying to fully support non-BMP
characters as surrogate pairs in narrow builds, we should make it
easier for application developers to avoid them.  If abandoning
internal use of UTF-16 is not an option, I think we should at least
add an option for decoders that currently produce surrogate pairs to
treat non-BMP characters as errors and handle them according to user's
choice.


From fuzzyman at  Wed Nov 24 18:41:08 2010
From: fuzzyman at (Michael Foord)
Date: Wed, 24 Nov 2010 17:41:08 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>
Message-ID: <>

On 24/11/2010 14:08, Nick Coghlan wrote:
> On Wed, Nov 24, 2010 at 10:30 PM, Michael Foord
> <fuzzyman at>  wrote:
>> Based on a non-exhaustive search, Python standard library modules currently
>> using integers for constants:
> Thanks for that review. I think following up on the "NamedConstant"
> idea may make more sense than pursuing enums in their own right. That
> way we could get the debugging benefits on the Python side regardless
> of any type constraints on the value (e.g. needing to be an integer in
> order to interface to C code), without needing to design an enum API
> that suited all purposes.

Can you explain what you see as the difference?

I'm not particularly interested in type validation but I like the fact 
that typical enum APIs allow you to group constants: the generated 
constant class acts as a namespace for all the defined constants.

Are you just suggesting something along the lines of:

class NamedConstant(int):
    def __new__(cls, name, val):
        return int.__new__(cls, val)

    def __init__(self, name, val):
        self._name = name

    def __repr__(self):
        return '<NamedConstant %s>' % self._name

FOO = NamedConstant('FOO', 3)

In general the less features the better, but I'd like a few more 
features than that. :-)

All the best,


> Cheers,
> Nick.


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From mal at  Wed Nov 24 19:50:57 2010
From: mal at (M.-A. Lemburg)
Date: Wed, 24 Nov 2010 19:50:57 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>
	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Alexander Belopolsky wrote:
> To conclude, I feel that rather than trying to fully support non-BMP
> characters as surrogate pairs in narrow builds, we should make it
> easier for application developers to avoid them. 

I don't understand what you're after here. Programmers can easily
avoid them by not using them :-)

> If abandoning
> internal use of UTF-16 is not an option, I think we should at least
> add an option for decoders that currently produce surrogate pairs to
> treat non-BMP characters as errors and handle them according to user's
> choice.

But what do you gain by doing this ? You'd lose the round-trip
safety of those codecs and that's not a good thing.

Note that most text processing APIs in Python work based on code
units, which in most cases represent single code points, but in
some cases can also represent surrogates (both on UCS-2 and on
UCS-4 builds).

E.g. centers the string in a padded string that
is composed of n code units. Whether that operation will result
in a text that's centered visually on output is a completely
different story. The original string could contain surrogates,
it could also contain combining code points, so the visual
presentation of the result may very well not be centered at
all; it may not even appear as having the length n to the user.

Since we're not going to change the semantics of those APIs,
it is OK to not support padding with non-BMP code points on
UCS-2 builds.

Supporting such cases would only cause problems:

* if the methods would pad with surrogates, the resulting
  string would no longer have length n; breaking the
  assumption that len( == n

* if the methods would pad with half the number of surrogates
  to make sure that len( == n, the resulting
  output to e.g. a terminal would be further off, than what
  you already have with surrogates and combining code points
  in the original string.

More on codecs supporting surrogates:

Perhaps it's time to reconsider a project I once started
but that never got off the ground:

Here's the pre-PEP:

Marc-Andre Lemburg


From brett at  Wed Nov 24 20:04:01 2010
From: brett at (Brett Cannon)
Date: Wed, 24 Nov 2010 11:04:01 -0800
Subject: [Python-Dev] [Python-checkins] r86720 -
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Tue, Nov 23, 2010 at 15:07, Terry Reedy <tjreedy at> wrote:
> On 11/23/2010 5:43 PM, Éric Araujo wrote:
>>> Modified: python/branches/py3k/Misc/ACKS
>>> ==============================================================================
>>> --- python/branches/py3k/Misc/ACKS      (original)
>>> +++ python/branches/py3k/Misc/ACKS      Tue Nov 23 21:32:47 2010
>>> @@ -1,4 +1,4 @@
>>> -Acknowledgements
>>> +?Acknowledgements
>> This change introduced a so-called UTF-8 BOM in the file.  Is
>> TortoiseSvn the culprit or a text editor?
> I used Notepad to edit the file, TortoiseSvn to commit, the same as I did
> for #9222, rev86702, Lib\idlelib\, yesterday.
> If the latter is OK, perhaps *.py gets filtered better than misc. text
> files. I believe I have the config as specified in dev/faq.

Adding the BOM will be an editor thing, not a svn thing. Doing a
Google search for [ms notepad bom] shows that Notepad did the
"helpful", invisible edit.


> [miscellany]
> enable-auto-props = yes
> [auto-props]
> * = svn:eol-style=native
> *.c = svn:keywords=Id
> *.h = svn:keywords=Id
> *.py = svn:keywords=Id
> *.txt = svn:keywords=Author Date Id Revision
> Terry

From tjreedy at  Wed Nov 24 20:25:17 2010
From: tjreedy at (Terry Reedy)
Date: Wed, 24 Nov 2010 14:25:17 -0500
Subject: [Python-Dev] [Python-checkins] r86720 -
In-Reply-To: <>
References: <>	<>
Message-ID: <icjoqu$tro$>

On 11/24/2010 2:04 PM, Brett Cannon wrote:
> On Tue, Nov 23, 2010 at 15:07, Terry Reedy<tjreedy at>  wrote:

>> I used Notepad to edit the file, TortoiseSvn to commit, the same as I did
>> for #9222, rev86702, Lib\idlelib\, yesterday.
>> If the latter is OK, perhaps *.py gets filtered better than misc. text
>> files. I believe I have the config as specified in dev/faq.
> Adding the BOM will be an editor thing, not a svn thing. Doing a
> Google search for [ms notepad bom] shows that Notepad did the
> "helpful", invisible edit.

So I presume it did the same with Does *.py get filtered 
in a way that could be extended to no-extension files? Do *.txt files 
get BOM filtered off? Should all text files in repository have some 
extension (default .txt)?

More to the point, can better filtering be added to the new hg 
repository? Or can a local Windows hg setup have such filtering on local 
commits before pushing?

I know now that I could always edit with IDLE's editor, but it is a lot 
easier to right click and select edit than it is to run thru the 
directory tree in an open dialog. And of course, since the pseudo-BOM 
addition is undocumented within notepad itself, and probably other 
editors, it is easy to not know.

Terry Jan Reedy

From g.brandl at  Wed Nov 24 21:04:40 2010
From: g.brandl at (Georg Brandl)
Date: Wed, 24 Nov 2010 21:04:40 +0100
Subject: [Python-Dev] [Python-checkins] r86720 -
In-Reply-To: <icjoqu$tro$>
References: <>	<>	<>	<>
Message-ID: <icjr4r$8ha$>

Am 24.11.2010 20:25, schrieb Terry Reedy:
> On 11/24/2010 2:04 PM, Brett Cannon wrote:
>> On Tue, Nov 23, 2010 at 15:07, Terry Reedy<tjreedy at>  wrote:
>>> I used Notepad to edit the file, TortoiseSvn to commit, the same as I did
>>> for #9222, rev86702, Lib\idlelib\, yesterday.
>>> If the latter is OK, perhaps *.py gets filtered better than misc. text
>>> files. I believe I have the config as specified in dev/faq.
>> Adding the BOM will be an editor thing, not a svn thing. Doing a
>> Google search for [ms notepad bom] shows that Notepad did the
>> "helpful", invisible edit.
> So I presume it did the same with Does *.py get filtered 
> in a way that could be extended to no-extension files? Do *.txt files 
> get BOM filtered off? Should all text files in repository have some 
> extension (default .txt)?
> More to the point, can better filtering be added to the new hg 
> repository? Or can a local Windows hg setup have such filtering on local 
> commits before pushing?

Of course it can; it's just a matter of writing the respective hooks.
What we *can* do in any case is to check for UTF-8 "BOMs" server-side
in the whitespace checking hook.

> I know now that I could always edit with IDLE's editor, but it is a lot 
> easier to right click and select edit than it is to run thru the 
> directory tree in an open dialog. And of course, since the pseudo-BOM 
> addition is undocumented within notepad itself, and probably other 
> editors, it is easy to not know.

It should show up as an invisible change in the first line of a file when you
look at a "svn diff".  (It is a very good practice to look at a diff before
committing anyway.)
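
The check itself is small.  A minimal sketch of the BOM test such a hook
could run on file contents, assuming the contents are available as bytes;
the function names are hypothetical and the hook wiring is not shown.

```python
import codecs


def has_utf8_bom(data):
    """Return True if the bytes start with the UTF-8 encoded U+FEFF."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data):
    """Return the bytes without a leading UTF-8 BOM, if there is one."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

The same test can be run by hand on the first bytes of a file before
committing, for example on open(path, 'rb').read(3).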


From alexander.belopolsky at  Wed Nov 24 21:06:25 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Wed, 24 Nov 2010 15:06:25 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Wed, Nov 24, 2010 at 1:50 PM, M.-A. Lemburg <mal at> wrote:
>> add an option for decoders that currently produce surrogate pairs to
>> treat non-BMP characters as errors and handle them according to user's
>> choice.
> But what do you gain by doing this ? You'd lose the round-trip
> safety of those codecs and that's not a good thing.

Any non-trivial text processing is likely to be broken in presence of
surrogates.  Producing them on input is just trading a known issue for
an unknown one.  Processing surrogate pairs in python code is hard.
Software that has to support non-BMP characters will most likely be
written for a wide build and contain subtle bugs when run under a
narrow build.  Note that my latest proposal does not abolish
surrogates outright.  Users who want them can still use something like
"surrogateescape"  error handler for non-BMP characters.

> Since we're not going to change the semantics of those APIs,
> it is OK to not support padding with non-BMP code points on
> UCS-2 builds.

Well, I think more users are willing to accept slightly misaligned
text in their web-app logs than those willing to cope with

Traceback (most recent call last):
TypeError: The fill character must be exactly one character long


Yes, allowing non-trusted users to specify fill character is unlikely,
but it is quite likely that naive slicing or iteration over string
units would result in

Traceback (most recent call last):
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud800' in
position 0: surrogates not allowed
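
The second failure can be reproduced with a small sketch.  It assumes that
the string holds the two UTF-16 code units a narrow build would store for
U+10000; on a build without narrow strings these are simply two lone
surrogate code points, so the snippet only illustrates what a naive slice
leaves behind.  encode_error is a hypothetical helper.

```python
def encode_error(text):
    """Return the UnicodeEncodeError raised by UTF-8 encoding, or None."""
    try:
        text.encode('utf-8')
    except UnicodeEncodeError as exc:
        return exc
    return None


# The code units stored for U+10000 when it is kept as a surrogate pair.
pair = '\ud800\udc00'
# A naive slice by code unit keeps only the high surrogate.
half = pair[:1]
error = encode_error(half)
```

Here error is a UnicodeEncodeError for the lone '\ud800', while
encode_error('abc') returns None.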

> Supporting such cases would only cause problems:
> * if the methods would pad with surrogates, the resulting
> ?string would no longer have length n; breaking the
> ?assumption that len( == n

I agree, but how is this different from breaking the assumption that
len(chr(i)) == 1?

> * if the methods would pad with half the number of surrogates
> ?to make sure that len( == n, the resulting
> ?output to e.g. a terminal would be further off, than what
> ?you already have with surrogates and combining code points
> ?in the original string.

I agree again.  As I suggested on the tracker, supporting non-BMP
characters in narrow builds should mean that library functions given
input with the same UCS-4 encoding should produce output with the same
UCS-4 encoding.

> Perhaps it's time to reconsider a project I once started
> but that never got off the ground:
> ?
> Here's the pre-PEP:
> ?

I agree again, but I feel that exposing code units rather than code
points at the Python string level takes us back to 2.x days of mixing
bytes and strings.

Let me quote Guido circa 2001 again:

"""
... if we had wanted to use a
variable-length internal representation, we should have picked UTF-8
way back, like Perl did.  Moving to a UTF-16-based internal
representation now will give us all the problems of the Perl choice
without any of the benefits.
"""

I don't understand what changed since 2001 that made this argument
invalid.   I note that an opinion has been raised on this thread that
if we want compressed internal representation for strings, we should
use UTF-8.  I tend to agree, but UTF-8 has been repeatedly rejected as
too hard to implement.  What makes UTF-16 easier than UTF-8?  Only the
fact that you can ignore bugs longer, in my view.

From g.brandl at  Wed Nov 24 21:24:49 2010
From: g.brandl at (Georg Brandl)
Date: Wed, 24 Nov 2010 21:24:49 +0100
Subject: [Python-Dev] [Preview] Comments and change proposals on
Message-ID: <icjsal$eqk$>


at <>, you can look at a version of the 3.2
docs that has the upcoming commenting feature.  JavaScript is mandatory.
I've switched on anonymous comments for testing, but usually at least
comments from anonymous users can be moderated.  Be sure to test the
"propose a change" feature too.  Login currently allows OpenID exclusively.

Credits go to Jacob Mason, whose GSOC project is responsible for almost all
of what you see there.  [1]

Please test on a smaller page, such as <>,
there is currently a speed issue with larger pages.  (Helpful tips from
JS experts are welcome.)

Other things I have to do before this can go live:

* reuse existing logins from either wiki or tracker?
* (re)Captcha integration for anonymous comments
* easier moderation (currently emails are sent on new comments)
* facility for (semi)automatic applying of proposals (once Hg is live, this
  should be easy to do due to the separation between commit and merge)
* allow commenting on code blocks (figure out where to place the "bubble")

Any feedback is appreciated (I'd suggest mailing it to doc-SIG only, to avoid
cluttering up python-dev).

Have fun,

[1] The source for the webapp is at
    <>, but most of the
    functionality is implemented in Sphinx trunk.

From anurag.chourasia at  Wed Nov 24 22:01:32 2010
From: anurag.chourasia at (Anurag Chourasia)
Date: Thu, 25 Nov 2010 02:31:32 +0530
Subject: [Python-Dev] collect2: library libpython2.6 not found while
	building extensions (--enable-shared)
Message-ID: <>


When I configure python to enable shared libraries, none of the extensions
are getting built during the make step due to this error.

building 'cStringIO' extension
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall
-Wstrict-prototypes -I. -I/u01/home/apli/wm/GDD/Python-2.6.6/./Include -I.
-IInclude -I./Include -I/opt/freeware/include
-I/opt/freeware/include/readline -I/opt/freeware/include/ncurses
-I/usr/local/include -I/u01/home/apli/wm/GDD/Python-2.6.6/Include
-I/u01/home/apli/wm/GDD/Python-2.6.6 -c
/u01/home/apli/wm/GDD/Python-2.6.6/Modules/cStringIO.c -o
./Modules/ld_so_aix gcc -pthread -bI:Modules/python.exp
-L/usr/local/lib *-lpython2.6* -o build/lib.aix-5.3-2.6/
*collect2: library libpython2.6 not found*

building 'cPickle' extension
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall
-Wstrict-prototypes -I. -I/u01/home/apli/wm/GDD/Python-2.6.6/./Include -I.
-IInclude -I./Include -I/opt/freeware/include
-I/opt/freeware/include/readline -I/opt/freeware/include/ncurses
-I/usr/local/include -I/u01/home/apli/wm/GDD/Python-2.6.6/Include
-I/u01/home/apli/wm/GDD/Python-2.6.6 -c
/u01/home/apli/wm/GDD/Python-2.6.6/Modules/cPickle.c -o
./Modules/ld_so_aix gcc -pthread -bI:Modules/python.exp
-L/usr/local/lib *-lpython2.6* -o build/lib.aix-5.3-2.6/
*collect2: library libpython2.6 not found*

This is on AIX 5.3, GCC 4.2, Python 2.6.6

I can confirm that there is a libpython2.6.a file in the top level directory
from where I am doing the configure/make etc

Here are the options supplied to the configure command

./configure --enable-shared --disable-ipv6 --with-gcc=gcc CPPFLAGS="-I
/opt/freeware/include -I /opt/freeware/include/readline -I

Please guide me in getting past this error.

Thanks for your help on this.


From martin at  Wed Nov 24 23:13:50 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 24 Nov 2010 23:13:50 +0100
Subject: [Python-Dev] [Python-checkins] r86720
	-	python/branches/py3k/Misc/ACKS
In-Reply-To: <icjoqu$tro$>
References: <>	<>	<>	<>
Message-ID: <>

> So I presume it did the same with

No. This file contains only ASCII characters, so notepad has decided
to not add the BOM.


From dreamingforward at  Thu Nov 25 00:38:01 2010
From: dreamingforward at (average)
Date: Wed, 24 Nov 2010 16:38:01 -0700
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Is immutability a general need that should have a general solution?  By
generalizing the idea to lists/tuples, set/frozenset, dicts, and strings
(for example), it seems one could simplify the container classes, eliminate
code complexity, and perhaps improve resource utilization.


From solipsis at  Thu Nov 25 00:41:58 2010
From: solipsis at (Antoine Pitrou)
Date: Thu, 25 Nov 2010 00:41:58 +0100
Subject: [Python-Dev] r86731 - in python/branches/py3k:
 Lib/distutils/command/ Lib/distutils/
 Lib/ Misc/ configure
References: <>
Message-ID: <>

On Wed, 24 Nov 2010 20:43:47 +0100 (CET)
barry.warsaw <python-checkins at> wrote:
> Author: barry.warsaw
> Date: Wed Nov 24 20:43:47 2010
> New Revision: 86731
> Log:
> Final patch for issue 9807.

This seems to have broken compilation under Windows:

Build started: Project: ssl, Configuration: Debug|Win32
Performing Makefile project actions
Traceback (most recent call last):
  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 519, in <module>
    main()
  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 507, in main
    known_paths = addusersitepackages(known_paths)
  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 253, in addusersitepackages
    user_site = getusersitepackages()
  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 228, in getusersitepackages
    user_base = getuserbase() # this will also set USER_BASE
  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 218, in getuserbase
    USER_BASE = get_config_var('userbase')
  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 586, in get_config_var
    return get_config_vars().get(name)
  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 478, in get_config_vars
    _CONFIG_VARS['abiflags'] = sys.abiflags
AttributeError: 'module' object has no attribute 'abiflags'



From barry at  Thu Nov 25 00:50:25 2010
From: barry at (Barry Warsaw)
Date: Wed, 24 Nov 2010 18:50:25 -0500
Subject: [Python-Dev] r86731 - in python/branches/py3k:
 Lib/distutils/command/ Lib/distutils/
 Lib/ Misc/ configure
In-Reply-To: <>
References: <>
Message-ID: <20101124185025.6cb67127@mission>

On Nov 25, 2010, at 12:41 AM, Antoine Pitrou wrote:

>On Wed, 24 Nov 2010 20:43:47 +0100 (CET)
>barry.warsaw <python-checkins at> wrote:
>> Author: barry.warsaw
>> Date: Wed Nov 24 20:43:47 2010
>> New Revision: 86731
>> Log:
>> Final patch for issue 9807.
>This seems to have broken compilation under Windows:
>Build started: Project: ssl, Configuration: Debug|Win32
>Performing Makefile project actions
>Traceback (most recent call last):
>  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 519, in <module>
>    main()
>  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 507, in main
>    known_paths = addusersitepackages(known_paths)
>  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 253, in addusersitepackages
>    user_site = getusersitepackages()
>  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 228, in getusersitepackages
>    user_base = getuserbase() # this will also set USER_BASE
>  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 218, in getuserbase
>    USER_BASE = get_config_var('userbase')
>  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 586, in get_config_var
>    return get_config_vars().get(name)
>  File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\", line 478, in get_config_vars
>    _CONFIG_VARS['abiflags'] = sys.abiflags
>AttributeError: 'module' object has no attribute 'abiflags'

As discussed on IRC, _CONFIG_VARS['abiflags'] = '' if sys.abiflags is not
defined.  Amaury is going to test that.


From greg.ewing at  Thu Nov 25 01:19:37 2010
From: greg.ewing at (Greg Ewing)
Date: Thu, 25 Nov 2010 13:19:37 +1300
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On 24/11/10 13:22, James Y Knight wrote:

> Instead, provide bidirectional iterators which can traverse the string by byte,
> codepoint, or by grapheme

Maybe it would be a good idea to add some iterators like this
to Python. (Or has the time machine beaten me there?)


From stephen at  Thu Nov 25 03:17:44 2010
From: stephen at (Stephen J. Turnbull)
Date: Thu, 25 Nov 2010 11:17:44 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Alexander Belopolsky writes:

 > Any non-trivial text processing is likely to be broken in presence of
 > surrogates.

If you're worried about this, write a UCS-2-producing codec that
rejects surrogates or stuffs them into the private zone of the BMP.
Maybe such a codec should be default, but so far nobody seems to want
one enough; they want UTF-16 even though they know it's wrong.<wink>
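
A real codec would hook into the codecs machinery; as a much smaller
sketch of the "reject" variant, one can decode first and then refuse
anything outside the BMP.  decode_bmp_only is a hypothetical name, and
raising ValueError is an arbitrary choice made for this illustration.

```python
def decode_bmp_only(data, encoding='utf-8'):
    """Decode bytes, rejecting non-BMP characters and surrogate code units."""
    text = data.decode(encoding)
    for index, ch in enumerate(text):
        code = ord(ch)
        # Non-BMP code points appear directly where strings are not stored
        # as 16-bit units, and as surrogates where they are.
        if code > 0xFFFF or 0xD800 <= code <= 0xDFFF:
            raise ValueError(
                'non-BMP character U+%04X at position %d' % (code, index))
    return text
```

Stuffing such characters into the private zone of the BMP instead would
need a reversible mapping, which this sketch does not attempt.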

One of the things that makes the 16-bit code unit attractive to me is
that the options for working around the variable-width nature of
UTF-16 (without actually implementing conformance to UTF-16 in
internal operations!) are many.  If you use octets as code units, you
don't have such options: you have to do it right.

 > Processing surrogate pairs in python code is hard.

Sure, but as James Knight and MAL point out, so is processing composing
characters, and those errors will go undetected in your proposals,
even with a strict UCS-2 definition.  What can you do?  Banning
composing characters isn't going to fly!

 > Yes, allowing non-trusted users to specify fill character is unlikely,
 > but it is quite likely that naive slicing or iteration over string
 > units would result in
 > Traceback (most recent call last):

Naive slicing yes, but naive iteration (ie, iteration that consumes
the whole string, or up to a known character, rather than up to a
specified position) is highly unlikely to result in such a traceback.
It is precisely that property (non-BMP characters get passed through
unchanged, or ignored) that makes extension to non-BMP code points

 > I agree again, but I feel that exposing code units rather than code
 > points at the Python string level takes us back to 2.x days of mixing
 > bytes and strings.

It does, but there's a difference.  With bytes as UTF-8, only ASCII
values have defined semantics in Unicode.  The rest have semantics
that is context-dependent, and they are frequent in any non-English
processing and many English use cases (math symbols, correctly-
oriented punctuation).  With 16-bit code units, all values have well-
defined semantics in Unicode, and non-characters are going to be
extremely rare in the vast majority of use cases.  IOW, you can think
of Python as a UCS-2 device processing characters, and let surrounding
UTF-16 processors deal with the errors.

 > Let me quote Guido circa 2001 again:
 > """
 > ... if we had wanted to use a
 > variable-length internal representation, we should have picked UTF-8
 > way back, like Perl did.  Moving to a UTF-16-based internal
 > representation now will give us all the problems of the Perl choice
 > without any of the benefits.
 > """
 > I don't understand what changed since 2001 that made this argument
 > invalid.

Nothing.  The internal representation of Python is UCS-2, not UTF-16.
People who want to think otherwise are kidding themselves.  The
presence of surrogates is not sufficient to call something UTF-16.
Preserving the Unicode code points through any builtin operations is a
necessary condition, and Python doesn't do that.  *However*, in my
opinion, it's not a big deal to allow surrogates in UCS-2 a la ISO
10646-1:1996.  That lets people who want a quick and dirty way to
handle BMP text that *might* (but usually won't) contain some non-BMP
characters go a long way fast.  "Although practicality beats purity."
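
To make the code unit versus code point distinction concrete, here is a
minimal sketch.  It assumes a current Python 3 (3.3 or later), where str is
indexed by code point, and recovers the 16-bit code units a narrow build
would have exposed by encoding to UTF-16.  The helper name
utf16_code_units is made up for illustration.

```python
def utf16_code_units(text):
    """Return the 16-bit UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-le")  # little-endian, no BOM, 2 bytes per unit
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


ch = chr(0x1F600)                # a code point outside the BMP
units = utf16_code_units(ch)
print(len(units))                # 2: a high surrogate and a low surrogate
print([hex(u) for u in units])   # ['0xd83d', '0xde00']
```

A BMP character yields a single unit, which is why 16-bit code that never
looks inside surrogate pairs passes such text through unchanged.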

 > I note that an opinion has been raised on this thread that
 > if we want compressed internal representation for strings, we should
 > use UTF-8.  I tend to agree, but UTF-8 has been repeatedly rejected as
 > too hard to implement.  What makes UTF-16 easier than UTF-8?  Only the
 > fact that you can ignore bugs longer, in my view.

That's mostly true.  My guess is that we can probably ignore those
bugs for as long as it takes someone to write the higher-level
libraries that James suggests and MAL has actually proposed and
started a PEP for.

From greg.ewing at  Thu Nov 25 03:35:50 2010
From: greg.ewing at (Greg Ewing)
Date: Thu, 25 Nov 2010 15:35:50 +1300
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On 24/11/10 22:03, Stephen J. Turnbull wrote:
> But
> if you actually need to remember positions, or regions, to jump to
> later or to communicate to other code that manipulates them, doing
> this stuff the straightforward way (just copying the whole iterator
> object to hang on to its state) becomes expensive.

If the internal representation of a text pointer (I won't call it
an iterator because that means something else in Python) is a byte
offset or something similar, it shouldn't take up any more space
than a Python int, which is what you'd be using anyway if you
represented text positions by grapheme indexes or whatever.

If you want the text pointer to also remember which string it
points into, it'll be a bit bigger, but again, no bigger than
you would need to get the same functionality using a grapheme
index plus a reference to the original string. Probably smaller,
because it would all be encapsulated in one object.

So I don't really see what you're arguing for here. How do
*you* think positions in unicode strings should be represented?


From greg.ewing at  Thu Nov 25 04:19:33 2010
From: greg.ewing at (Greg Ewing)
Date: Thu, 25 Nov 2010 16:19:33 +1300
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On 25/11/10 06:37, Alexander Belopolsky wrote:

> I don't think there is a recipe on how to fix legacy
> character-by-character processing loop such as
>     for c in string:
>        ...
> to make it iterate over code points consistently in wide and narrow
> builds.

A couple of possibilities:

1) Make things so that 'for c in string' does actually
iterate over characters rather than code units. This could
break existing code, though.

2) Provide some things like

    for c in string.chars():

    for c in string.graphemes():

where chars() and graphemes() return appropriate iterators.
(Or possibly iterable views, but that would raise the
expectation that the views could also be randomly indexed
by char or grapheme, which we probably wouldn't want to


From greg.ewing at  Thu Nov 25 04:46:53 2010
From: greg.ewing at (Greg Ewing)
Date: Thu, 25 Nov 2010 16:46:53 +1300
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On 25/11/10 12:38, average wrote:
> Is immutability a general need that should have general solution?

I don't think it really generalizes. Tuples are not just frozen
lists, for example -- they have a different internal structure
that's more efficient to create and access.


From stephen at  Thu Nov 25 04:55:40 2010
From: stephen at (Stephen J. Turnbull)
Date: Thu, 25 Nov 2010 12:55:40 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Greg Ewing writes:
 > On 24/11/10 22:03, Stephen J. Turnbull wrote:
 > > But
 > > if you actually need to remember positions, or regions, to jump to
 > > later or to communicate to other code that manipulates them, doing
 > > this stuff the straightforward way (just copying the whole iterator
 > > object to hang on to its state) becomes expensive.
 > If the internal representation of a text pointer (I won't call it
 > an iterator because that means something else in Python) is a byte
 > offset or something similar, it shouldn't take up any more space
 > than a Python int, which is what you'd be using anyway if you
 > represented text positions by grapheme indexes or whatever.

That's not necessarily true.  Eg, in Emacs ("there you go again"),
Lisp integers are not only immediate (saving one pointer), but the
type is encoded in the lower bits, so that there is no need for a type
pointer -- the representation is smaller than the opaque marker type.
Altogether, up to 8 of 12 bytes saved on a 32-bit platform, or 16 of
24 bytes on a 64-bit platform.

In Python it's true that markers can use the same data structure as
integers and simply provide different methods, and it's arguable that
Python's design is better.  But if you use bytes internally, then you
have problems.  Do you expose that byte value to the user?  Can users
(programmers using the language and end users) specify positions in
terms of byte values?  If so, what do you do if the user specifies a
byte value that points into a multibyte character?  What if the user
wants to specify position by number of characters?  Can you translate

As I say elsewhere, it's possible that there really never is a need to
efficiently specify an absolute position in a large text as a
character (grapheme, whatever) count.  But I think it would be hard to
implement an efficient text-processing *language*, eg, a Python module
for *full conformance* in handling Unicode, on top of UTF-8.  Any time
you have an algorithm that requires efficient access to arbitrary text
positions, you'll spend all your skull sweat fighting the
representation.  At least, that's been my experience with Emacsen.
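
The two questions above (what if a byte position points into a multibyte
character, and what does it cost to translate a character count into a byte
position) can be sketched for UTF-8 as follows.  Both helper names are
hypothetical, and "character" here means code point, not grapheme.

```python
def is_char_boundary(data, offset):
    """True if offset does not point into the middle of a UTF-8 sequence."""
    if offset < 0 or offset > len(data):
        return False
    if offset == len(data):
        return True
    # Continuation bytes have the bit pattern 10xxxxxx.
    return (data[offset] & 0xC0) != 0x80


def char_to_byte_offset(data, char_index):
    """Translate a code point index into a byte offset by a linear scan."""
    count = 0
    for offset, byte in enumerate(data):
        if (byte & 0xC0) != 0x80:      # start of a new code point
            if count == char_index:
                return offset
            count += 1
    if count == char_index:
        return len(data)
    raise IndexError("character index out of range")


data = "a\u00e9\u20ac".encode("utf-8")   # 1 + 2 + 3 bytes
print(is_char_boundary(data, 2))          # False: inside the 2-byte sequence
print(char_to_byte_offset(data, 2))       # 3
```

The scan in char_to_byte_offset is linear in the length of the text, which
is the cost being discussed for large buffers.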

 > So I don't really see what you're arguing for here. How do
 > *you* think positions in unicode strings should be represented?

I think what users should see is character positions, and they should
be able to specify them numerically as well as via an opaque marker
object.  I don't care whether that position is represented as bytes or
characters internally, except that the experience of Emacsen is that
representation as byte positions is both inefficient and fragile.  The
representation as character positions is more robust but slightly more

From alexander.belopolsky at  Thu Nov 25 05:37:33 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Wed, 24 Nov 2010 23:37:33 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Wed, Nov 24, 2010 at 9:17 PM, Stephen J. Turnbull <stephen at> wrote:
>  > I note that an opinion has been raised on this thread that
>  > if we want compressed internal representation for strings, we should
>  > use UTF-8.  I tend to agree, but UTF-8 has been repeatedly rejected as
>  > too hard to implement.  What makes UTF-16 easier than UTF-8?  Only the
>  > fact that you can ignore bugs longer, in my view.
> That's mostly true.  My guess is that we can probably ignore those
> bugs for as long as it takes someone to write the higher-level
> libraries that James suggests and MAL has actually proposed and
> started a PEP for.

As far as I can tell, that PEP generated a grand total of one comment in
nine years.  This may or may not be indicative of how far away we are
from seeing it implemented.  :-)

As far as the UTF-8 vs. UCS-2/4 debate goes, I have an idea that may be
even more far-fetched.  Once upon a time, Python Unicode strings supported
the buffer protocol and would lazily fill an internal buffer with bytes in
the default encoding.  In 3.x the default encoding has been fixed as
UTF-8, buffer protocol support was removed from strings, but the
internal buffer caching the (now UTF-8) encoded representation remained.
Maybe we can now implement the defenc logic in reverse.  Recall that
strings are stored as UCS-2/4 sequences, but once a buffer is requested
in 2.x Python code or a char* is obtained via
_PyUnicode_AsStringAndSize() at the C level in 3.x, an internal buffer
is filled with UTF-8 bytes and defenc is set to point to that buffer.
So the idea is for strings to store their data as a UTF-8 buffer
pointed to by defenc upon construction.  If an application uses string
indexing, UTF-8-only strings will lazily fill their UCS-2/4 buffer.
Proper, Unicode-aware algorithms such as grapheme, word or line
iteration or simple operations such as concatenation, search or
substitution would operate directly on defenc buffers.  Presumably
over time fewer and fewer applications would use code unit indexing
that requires the UCS-2/4 buffer, and eventually Python strings can stop
supporting indexing altogether, just like they stopped supporting the
buffer protocol in 3.x.
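
A toy model of the "defenc in reverse" idea, written in pure Python rather
than at the C level where it would really live.  The class LazyText and its
attribute names are invented for this sketch: the UTF-8 bytes are the
primary storage, and the indexable form is only built when indexing is
first used.

```python
class LazyText:
    """Text stored as UTF-8; the indexable form is built on demand."""

    def __init__(self, text=""):
        self._utf8 = text.encode("utf-8")   # primary storage
        self._decoded = None                # filled lazily by __getitem__

    def __add__(self, other):
        # Concatenation works directly on the UTF-8 buffers.
        result = LazyText()
        result._utf8 = self._utf8 + other._utf8
        return result

    def __contains__(self, other):
        # Substring search on valid UTF-8 only matches at character
        # boundaries, so no decoding is needed.
        return other._utf8 in self._utf8

    def __getitem__(self, index):
        if self._decoded is None:
            self._decoded = self._utf8.decode("utf-8")
        return self._decoded[index]


t = LazyText("h\u00e9llo")
print(LazyText("llo") in t)    # True, without decoding
print(t._decoded is None)      # True: nothing has asked for indexing yet
print(t[1] == "\u00e9")        # True, and the decoded form is now cached
```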

From tjreedy at  Thu Nov 25 06:22:01 2010
From: tjreedy at (Terry Reedy)
Date: Thu, 25 Nov 2010 00:22:01 -0500
Subject: [Python-Dev] [Python-checkins] r86720 -
In-Reply-To: <icjr4r$8ha$>
References: <>	<>	<>	<>	<icjoqu$tro$>
Message-ID: <ickrpp$4l5$>

On 11/24/2010 3:04 PM, Georg Brandl wrote:

>>> Adding the BOM will be an editor thing, not a svn thing. Doing a

> It should show up as an invisible change in the first line of a file when you
> look at a "svn diff".  (It is a very good practice to look at a diff before
> committing anyway.)

It does show up, and yes I agree. That should be in dev/faq if not already

Terry Jan Reedy

From tjreedy at  Thu Nov 25 06:23:27 2010
From: tjreedy at (Terry Reedy)
Date: Thu, 25 Nov 2010 00:23:27 -0500
Subject: [Python-Dev] [Python-checkins] r86720 -
In-Reply-To: <>
References: <>	<>	<>	<>	<icjoqu$tro$>
Message-ID: <ickrsf$4l5$>

On 11/24/2010 5:13 PM, "Martin v. Löwis" wrote:
>> So I presume it did the same with
> No. This file contains only ASCII characters, so notepad has decided
> to not add the BOM.

Or it somehow got removed from the .py file. I tried with another .py 
file (and reverted!) and the diff showed the invisible change to the 
first line that Georg predicted.

Terry Jan Reedy

From tjreedy at  Thu Nov 25 06:39:30 2010
From: tjreedy at (Terry Reedy)
Date: Thu, 25 Nov 2010 00:39:30 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>
	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <icksqj$84p$>

On 11/24/2010 3:06 PM, Alexander Belopolsky wrote:

> Any non-trivial text processing is likely to be broken in presence of
> surrogates.  Producing them on input is just trading known issue for
> an unknown one.  Processing surrogate pairs in python code is hard.
> Software that has to support non-BMP characters will most likely be
> written for a wide build and contain subtle bugs when run under a
> narrow build.  Note that my latest proposal does not abolish
> surrogates outright.  Users who want them can still use something like
> "surrogateescape"  error handler for non-BMP characters.

It seems to me that what you are asking for is an alternate, optional, 
utf-8-bmp codec that would raise an error, in either direction, for 
non-bmp chars. Then, as you suggest, if one is not prepared for 
surrogates, they are not allowed.
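
Such a codec could look roughly like the pair of functions below.  This is
only a sketch: "utf-8-bmp" is not a registered codec name, the function
names are made up, and a real codec would be registered through the codecs
module instead.

```python
def encode_utf8_bmp(text):
    """Encode to UTF-8, refusing any character outside the BMP."""
    for index, char in enumerate(text):
        if ord(char) > 0xFFFF:
            raise UnicodeEncodeError("utf-8-bmp", text, index, index + 1,
                                     "character outside the BMP")
    return text.encode("utf-8")


def decode_utf8_bmp(data):
    """Decode UTF-8, refusing any character outside the BMP."""
    text = data.decode("utf-8")
    if any(ord(char) > 0xFFFF for char in text):
        raise ValueError("input contains a character outside the BMP")
    return text


print(encode_utf8_bmp("h\u00e9llo"))   # BMP-only text encodes normally
```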

Terry Jan Reedy

From anurag.chourasia at  Thu Nov 25 10:24:34 2010
From: anurag.chourasia at (Anurag Chourasia)
Date: Thu, 25 Nov 2010 14:54:34 +0530
Subject: [Python-Dev] AIX 5.3 - Enabling Shared Library Support Vs Extensions
Message-ID: <>


When I configure python to enable shared libraries, none of the
extensions are getting built during the make step due to this error.

building 'cStringIO' extension
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall
-Wstrict-prototypes -I. -I/u01/home/apli/wm/GDD/Python-2.6.6/./Include -I.
-IInclude -I./Include -I/opt/freeware/include
-I/opt/freeware/include/readline -I/opt/freeware/include/ncurses
-I/usr/local/include -I/u01/home/apli/wm/GDD/Python-2.6.6/Include
-I/u01/home/apli/wm/GDD/Python-2.6.6 -c
/u01/home/apli/wm/GDD/Python-2.6.6/Modules/cStringIO.c -o
./Modules/ld_so_aix gcc -pthread -bI:Modules/python.exp
-L/usr/local/lib *-lpython2.6* -o build/lib.aix-5.3-2.6/
*collect2: library libpython2.6 not found*

building 'cPickle' extension
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall
-Wstrict-prototypes -I. -I/u01/home/apli/wm/GDD/Python-2.6.6/./Include -I.
-IInclude -I./Include -I/opt/freeware/include
-I/opt/freeware/include/readline -I/opt/freeware/include/ncurses
-I/usr/local/include -I/u01/home/apli/wm/GDD/Python-2.6.6/Include
-I/u01/home/apli/wm/GDD/Python-2.6.6 -c
/u01/home/apli/wm/GDD/Python-2.6.6/Modules/cPickle.c -o
./Modules/ld_so_aix gcc -pthread -bI:Modules/python.exp
-L/usr/local/lib *-lpython2.6* -o build/lib.aix-5.3-2.6/
*collect2: library libpython2.6 not found*

This is on AIX 5.3, GCC 4.2, Python 2.6.6

I can confirm that there is a libpython2.6.a file in the top level
directory from where I am doing the configure/make etc

Here are the options supplied to the configure command

./configure --enable-shared --disable-ipv6 --with-gcc=gcc CPPFLAGS="-I
/opt/freeware/include -I /opt/freeware/include/readline -I

Please guide me in getting past this error.

Thanks for your help on this.


From v+python at  Thu Nov 25 10:34:51 2010
From: v+python at (Glenn Linderman)
Date: Thu, 25 Nov 2010 01:34:51 -0800
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<1290524466.3642.4.camel@localhost.localdomain>	<>	<1290526253.3642.9.camel@localhost.localdomain>	<>	<1290528319.3642.11.camel@localhost.localdomain>	<>	<1290533860.3642.73.camel@localhost.localdomain>	<>	<1290535602.3642.87.camel@localhost.localdomain>	<>
Message-ID: <>

So the following code defines constants with associated names that get 
put in the repr.

I'm still a Python newbie in some areas, particularly classes and 
metaclasses, maybe more.
But this Python 3 code seems to create constants with names ... works 
for int and str at least.

Special case for int defines a special  __or__ operator to OR both the 
values and the names, which some might like.

Dunno why it doesn't work for dict, and it is too late to research that 
today.  That's the last test case in the code below, so you can see how 
it works for int and string before it bombs.

There's some obvious cleanup work to be done, and it would be nice to 
make the names actually be constant... but they do lose their .name if 
you ignorantly assign the base type, so at least it is hard to change 
the value and keep the associated .name that gets reported by repr, 
which might reduce some confusion at debug time.

An idea I had, but have no idea how to implement, is that it might be 
nice to say:

     with imported_constants_from_module:
            do_stuff

where do_stuff could reference the constants without qualifying them by 
module.  Of course, if you knew it was just a module of constants, you 
could "import * from module" :)  But the idea of with is that they'd go 
away at the end of that scope.

Some techniques here came from Raymond's namedtuple code.

def constant( name, val ):
     # Derive the bare type name, e.g. "<class 'int'>" -> "int".
     typ = str( type( val ))
     if typ.startswith("<class '")  and  typ[ -2: ] == "'>":
         typ = typ[ 8:-2 ]
     # Source for a subclass of the value's type that also carries a name.
     ev = '''
class constant_%s( %s ):
     def __new__( cls, val, name ):
         self = %s.__new__( cls, val )
         self.name = name
         return self
     def __repr__( self ):
         return self.name + ': ' + str( self )
'''
     if typ == 'int':
         # ints also get an __or__ that ORs both the values and the names.
         ev += '''
     def __or__( self, other ):
         if isinstance( other, constant_int ):
             return constant_int( int( self ) | int( other ),
                         self.name + ' | ' + other.name )
'''
     ev += '''
%s = constant_%s( %s, '%s' )
'''
     ev = ev % ( typ, typ, typ, name, typ, repr( val ), name )
     print( ev )
     exec( ev, globals())

constant('O_RANDOM', val=16 )

constant('O_SEQUENTIAL', val=32 )

constant("O_STRING", val="string")

def foo( x ):
     print( str( x ))
     print( repr( x ))
     print( type( x ))

foo( O_RANDOM )
foo( O_STRING )


# presumably; the line defining zz did not survive in the archive
zz = O_RANDOM | O_SEQUENTIAL
foo( zz )

y = {'ab': 2, 'yz': 3 }
constant('O_DICT', y )


From mal at  Thu Nov 25 10:51:09 2010
From: mal at (M.-A. Lemburg)
Date: Thu, 25 Nov 2010 10:51:09 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <icksqj$84p$>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Terry Reedy wrote:
> On 11/24/2010 3:06 PM, Alexander Belopolsky wrote:
>> Any non-trivial text processing is likely to be broken in presence of
>> surrogates.  Producing them on input is just trading known issue for
>> an unknown one.  Processing surrogate pairs in python code is hard.
>> Software that has to support non-BMP characters will most likely be
>> written for a wide build and contain subtle bugs when run under a
>> narrow build.  Note that my latest proposal does not abolish
>> surrogates outright.  Users who want them can still use something like
>> "surrogateescape"  error handler for non-BMP characters.
> It seems to me that what you are asking for is an alternate, optional,
> utf-8-bmp codec that would raise an error, in either direction, for
> non-bmp chars. Then, as you suggest, if one is not prepared for
> surrogates, they are not allowed.

That would be a possibility as well... but I doubt that many users
are going to bother, since slicing surrogates is just as bad as
slicing combining code points and the latter are much more common in
real life and they do happen to mostly live in the BMP.
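
A short illustration of the combining code point case, using only the
standard unicodedata module.  The variable names are arbitrary.

```python
import unicodedata

decomposed = "e\u0301"      # 'e' followed by U+0301 COMBINING ACUTE ACCENT
print(len(decomposed))       # 2: two code points, both inside the BMP

# A non-zero combining class marks U+0301 as a combining character.
print(unicodedata.combining(decomposed[1]))   # 230

sliced = decomposed[:1]      # a naive slice silently drops the accent
print(sliced)                # e

composed = unicodedata.normalize("NFC", decomposed)
print(len(composed))         # 1: the precomposed U+00E9
```

Normalizing to NFC helps only where a precomposed form exists, so it does
not remove the problem in general.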

Marc-Andre Lemburg

Professional Python Services directly from the Source  (#1, Nov 25 2010)
>>> Python/Zope Consulting and Support ...
>>> mxODBC.Zope.Database.Adapter ...   
>>> mxODBC, mxDateTime, mxTextTools ...

::: Try our new mxODBC.Connect Python Database Interface for free ! :::: Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From mal at  Thu Nov 25 10:57:17 2010
From: mal at (M.-A. Lemburg)
Date: Thu, 25 Nov 2010 10:57:17 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>	<>	<>
	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Alexander Belopolsky wrote:
> On Wed, Nov 24, 2010 at 9:17 PM, Stephen J. Turnbull <stephen at> wrote:
> ..
>>  > I note that an opinion has been raised on this thread that
>>  > if we want compressed internal representation for strings, we should
>>  > use UTF-8.  I tend to agree, but UTF-8 has been repeatedly rejected as
>>  > too hard to implement.  What makes UTF-16 easier than UTF-8?  Only the
>>  > fact that you can ignore bugs longer, in my view.
>> That's mostly true.  My guess is that we can probably ignore those
>> bugs for as long as it takes someone to write the higher-level
>> libraries that James suggests and MAL has actually proposed and
>> started a PEP for.
> As far as I can tell, that PEP generated grand total of one comment in
> nine years.  This may or may not be indicative of how far away we are
> from seeing it implemented.  :-)

At the time it was too early for people to start thinking about
these issues. Actual use of Unicode really only started a few years

Since I didn't have a need for such an indexing module myself
(and didn't have much time to work on it anyway), I punted on the

If someone else wants to pick up the idea, I'd gladly help out with
the details.

> As far as UTF-8 vs. UCS-2/4 debate, I have an idea that may be even
> more far fetched.  Once upon a time, Python Unicode strings supported
> buffer protocol and would lazily fill an internal buffer with bytes in
> the default encoding.  In 3.x the default encoding has been fixed as
> UTF-8, buffer protocol support was removed from strings, but the
> internal buffer caching (now UTF-8) encoded representation remained.
> Maybe we can now implement defenc logic in reverse.  Recall that
> strings are stored as UCS-2/4 sequences, but once buffer is requested
> in 2.x Python code or char* is obtained via
> _PyUnicode_AsStringAndSize() at the C level in 3.x, an internal buffer
> is filled with UTF-8 bytes and  defenc is set to point to that buffer.

The original idea was for that buffer to go away once we moved
to Unicode for strings. Reality has shown that we still need
to stick with the buffer, though, since the UTF-8 representation
of Unicode objects is used a lot.

>   So the idea is for strings to store their data as UTF-8 buffer
> pointed by defenc upon construction.  If an application uses string
> indexing, UTF-8 only strings will lazily fill their UCS-2/4 buffer.
> Proper, Unicode-aware algorithms such as grapheme, word or line
> iteration or simple operations such as concatenation, search or
> substitution would operate directly on defenc buffers.  Presumably
> over time fewer and fewer applications would use code unit indexing
> that require UCS-2/4 buffer and eventually Python strings can stop
> supporting indexing altogether just like they stopped supporting the
> buffer protocol in 3.x.

I don't follow you: how would UTF-8, which has even more issues
with variable length representation of code points, make something
easier compared to UTF-16, which has far fewer such issues and
then only for non-BMP code points ?

Please note that we can only provide one way of string indexing
in Python using the standard s[1] notation and since we
want that operation to be fast and no more than O(1), using the
code units as items is the only reasonable way to implement it.

With an indexing module, we could then let applications work
based on higher level indexing schemes such as complete code
points (skipping surrogates), combined code points, graphemes
(ignoring e.g. most control code points and zero width
code points), words (with some customizations as to where to
break words, which will likely have to be language dependent),
lines (which can be complicated for scripts that use columns
instead ;-)), paragraphs, etc.

It would also help to add transparent indexing for right-to-left
scripts and text that uses both left-to-right and right-to-left
text (BIDI).

However, in order for these indexing methods to actually work,
they will need to return references to the code units, so we cannot
just drop that access method.

* Back on the surrogates topic:

In any case, I think this discussion is losing its grip on reality.

By far, most strings you find in actual applications don't use
surrogates at all, so the problem is being exaggerated.

If you need to be careful about surrogates for some reason, I think
a single new method .hassurrogates() on string objects would
go a long way in making detection and adding special-casing for
these a lot easier.
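
As a plain function rather than a str method, such a check might look like
the sketch below.  The name has_surrogates is an assumption.  Note that on
Python 3.3 and later, surrogates only appear in a str as lone code points
(for example from the "surrogateescape" error handler), not as pairs.

```python
def has_surrogates(text):
    """True if text contains any code point in the surrogate range."""
    return any(0xD800 <= ord(char) <= 0xDFFF for char in text)


print(has_surrogates("plain BMP text"))    # False

# surrogateescape smuggles the undecodable byte 0xFF in as U+DCFF.
escaped = b"abc\xff".decode("utf-8", "surrogateescape")
print(has_surrogates(escaped))             # True
```

A linear scan like this is what a built-in method would avoid, for instance
by caching the answer on the string object.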

If adding support for surrogates doesn't make sense (e.g. in the
case of the formatting methods), then we simply punt on that and
leave such handling to other tools.

* Regarding preventing surrogates from entering the Python

It is by far more important to maintain round-trip safety for
Unicode data than getting every bit of code to work correctly
with surrogates (often, there won't be a single correct way).

With a new method for fast detection of surrogates, we could
protect code which obviously doesn't work with surrogates and
then consider each case individually by either adding special
cases as necessary or punting on the support.

Marc-Andre Lemburg


From nadeem.vawda at  Thu Nov 25 11:12:20 2010
From: nadeem.vawda at (Nadeem Vawda)
Date: Thu, 25 Nov 2010 12:12:20 +0200
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Thu, Nov 25, 2010 at 11:34 AM, Glenn Linderman <v+python at> wrote:
> So the following code defines constants with associated names that get put
> in the repr.

The code you gave doesn't work if the constant() function is moved
into a separate module from the code that calls it.  The globals()
function, as I understand it, gives you access to the global namespace
*of the current module*, so the constants end up being defined in the
module containing constant(), not the module you're calling it from.

You could get around this by passing the globals of the calling module
to constant(), but I think it's cleaner to use a class to provide a
distinct namespace for the constants.
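
Both alternatives can be sketched in a few lines.  The names
define_constant and Flags are invented here, and the values are taken from
the test calls in the original post.

```python
def define_constant(name, value, namespace):
    """Bind name in the namespace the caller passes in."""
    namespace[name] = value


# Alternative 1: the caller hands over its own globals explicitly.
define_constant("O_RANDOM", 16, globals())


# Alternative 2: a class body serves as a distinct namespace.
class Flags:
    O_RANDOM = 16
    O_SEQUENTIAL = 32


print(Flags.O_RANDOM | Flags.O_SEQUENTIAL)   # 48
```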

> An idea I had, but have no idea how to implement, is that it might be nice
> to say:
>     with imported_constants_from_module:
>            do_stuff
> where do_stuff could reference the constants without qualifying them by
> module.  Of course, if you knew it was just a module of constants, you could
> "import * from module" :)  But the idea of with is that they'd go away at
> the end of that scope.

I don't think this is possible - the context manager protocol doesn't
allow you to modify the namespace of the caller like that.  Also, a
with statement does not have its own namespace; any names defined
inside its body will continue to be visible in the containing scope.
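
The second point is easy to check.  This small example uses
contextlib.nullcontext (available from Python 3.7) as a stand-in for any
context manager.

```python
import contextlib

with contextlib.nullcontext():
    inside = 42      # bound in the enclosing scope, not in a "with scope"

print(inside)        # 42: the name is still visible after the block ends
```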

Of course, if you want to achieve something similar (at function
scope), you could say:

def foo(bar, baz):
    from module import *

From fuzzyman at  Thu Nov 25 11:34:25 2010
From: fuzzyman at (Michael Foord)
Date: Thu, 25 Nov 2010 10:34:25 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<1290524466.3642.4.camel@localhost.localdomain>	<>	<1290526253.3642.9.camel@localhost.localdomain>	<>	<1290528319.3642.11.camel@localhost.localdomain>	<>	<1290533860.3642.73.camel@localhost.localdomain>	<>	<1290535602.3642.87.camel@localhost.localdomain>	<>	<>
Message-ID: <>

On 25/11/2010 10:12, Nadeem Vawda wrote:
> On Thu, Nov 25, 2010 at 11:34 AM, Glenn Linderman<v+python at>  wrote:
>> So the following code defines constants with associated names that get put
>> in the repr.
> The code you gave doesn't work if the constant() function is moved
> into a separate module from the code that calls it.  The globals()
> function, as I understand it, gives you access to the global namespace
> *of the current module*, so the constants end up being defined in the
> module containing constant(), not the module you're calling it from.
> You could get around this by passing the globals of the calling module
> to constant(), but I think it's cleaner to use a class to provide a
> distinct namespace for the constants.
>> An idea I had, but have no idea how to implement, is that it might be nice
>> to say:
>>      with imported_constants_from_module:
>>             do_stuff
>> where do_stuff could reference the constants without qualifying them by
>> module.  Of course, if you knew it was just a module of constants, you could
>> "import * from module" :)  But the idea of with is that they'd go away at
>> the end of that scope.
> I don't think this is possible - the context manager protocol doesn't
> allow you to modify the namespace of the caller like that.  Also, a
> with statement does not have its own namespace; any names defined
> inside its body will continue to be visible in the containing scope.
> Of course, if you want to achieve something similar (at function
> scope), you could say:
> def foo(bar, baz):
>      from module import *
>      ...

Not in Python 3 you can't. :-)

That's invalid syntax, import * can only be used at module level. This 
makes *testing* import * (i.e. testing your __all__) annoying - you have 
to exec('from module import *') instead.
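
Both halves of that can be demonstrated with the standard library.  The
string module stands in for "module" here only because it defines __all__;
the variable names are arbitrary.

```python
import string

# A star-import inside a function body is rejected at compile time.
source = "def f():\n    from string import *\n"
try:
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError:
    rejected = True
print(rejected)      # True on Python 3

# Testing __all__: run the star-import via exec into a scratch namespace.
namespace = {}
exec("from string import *", namespace)
missing = [name for name in string.__all__ if name not in namespace]
print(missing)       # []
```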


> _______________________________________________
> Python-Dev mailing list
> Python-Dev at
> Unsubscribe:


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From fuzzyman at  Thu Nov 25 11:37:13 2010
From: fuzzyman at (Michael Foord)
Date: Thu, 25 Nov 2010 10:37:13 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<1290524466.3642.4.camel@localhost.localdomain>	<>	<1290526253.3642.9.camel@localhost.localdomain>	<>	<1290528319.3642.11.camel@localhost.localdomain>	<>	<1290533860.3642.73.camel@localhost.localdomain>	<>	<1290535602.3642.87.camel@localhost.localdomain>	<>	<>
Message-ID: <>

On 25/11/2010 09:34, Glenn Linderman wrote:
> So the following code defines constants with associated names that get 
> put in the repr.
> I'm still a Python newbie in some areas, particularly classes and 
> metaclasses, maybe more.
> But this Python 3 code seems to create constants with names ... works 
> for int and str at least.
> Special case for int defines a special  __or__ operator to OR both the 
> values and the names, which some might like.
> Dunno why it doesn't work for dict, and it is too late to research 
> that today.  That's the last test case in the code below, so you can 
> see how it works for int and string before it bombs.
> There's some obvious cleanup work to be done, and it would be nice to 
> make the names actually be constant... but they do lose their .name if 
> you ignorantly assign the base type, so at least it is hard to change 
> the value and keep the associated .name that gets reported by repr, 
> which might reduce some confusion at debug time.
> An idea I had, but have no idea how to implement, is that it might be 
> nice to say:
>     with imported_constants_from_module:
>            do_stuff
> where do_stuff could reference the constants without qualifying them 
> by module.  Of course, if you knew it was just a module of constants, 
> you could "import * from module" :)  But the idea of with is that 
> they'd go away at the end of that scope.
> Some techniques here came from Raymond's namedtuple code.
> def constant( name, val ):
>     typ = str( type( val ))
>     if typ.startswith("<class '")  and  typ[ -2: ] == "'>":
>         typ = typ[ 8:-2 ]
>     ev = '''
> class constant_%s( %s ):
>     def __new__( cls, val, name ):
>         self = %s.__new__( cls, val )
>         self.name = name
>         return self
>     def __repr__( self ):
>         return self.name + ': ' + str( self )
> '''
>     if typ == 'int':
>         ev += '''
>     def __or__( self, other ):
>         if isinstance( other, constant_int ):
>             return constant_int( int( self ) | int( other ),
>                         self.name + ' | ' + other.name )
> '''

Not quite correct. If you OR a value with itself you should get back
just the value, not something with "name|name" as the repr.
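
The point about OR-ing a value with itself can be sketched as follows. This
is only an illustration: the class name NamedInt is hypothetical and is not
the code from the quoted message, and returning the left operand unchanged
for equal values is one possible choice among several.

```python
class NamedInt(int):
    """An int that carries a name for its repr (hypothetical helper)."""

    def __new__(cls, value, name):
        self = super().__new__(cls, value)
        self.name = name
        return self

    def __repr__(self):
        # Use int(self) rather than str(self) so repr never calls itself.
        return f"{self.name}: {int(self)}"

    def __or__(self, other):
        if not isinstance(other, NamedInt):
            # Fall back to plain int behaviour for unnamed operands.
            return super().__or__(other)
        if int(self) == int(other):
            # x | x is just x, so keep the single name.
            return self
        return NamedInt(int(self) | int(other), f"{self.name} | {other.name}")


O_RANDOM = NamedInt(16, "O_RANDOM")
O_SEQUENTIAL = NamedInt(32, "O_SEQUENTIAL")
```

With these definitions, O_RANDOM | O_RANDOM is O_RANDOM itself, while
O_RANDOM | O_SEQUENTIAL gets a combined name.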

We can hold off on implementations until we have general agreement that 
some kind of named constant *should* be added, and what the feature set 
should look like.

All the best,


>     ev += '''
> %s = constant_%s( %s, '%s' )
> '''
>     ev = ev % ( typ, typ, typ, name, typ, repr( val ), name )
>     print( ev )
>     exec( ev, globals())
> constant('O_RANDOM', val=16 )
> constant('O_SEQUENTIAL', val=32 )
> constant("O_STRING", val="string")
> def foo( x ):
>     print( str( x ))
>     print( repr( x ))
>     print( type( x ))
> foo( O_RANDOM )
> foo( O_STRING )
> foo( zz )
> y = {'ab': 2, 'yz': 3 }
> constant('O_DICT', y )
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at
> Unsubscribe:


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From merwok at  Thu Nov 25 12:47:00 2010
From: merwok at (Éric Araujo)
Date: Thu, 25 Nov 2010 12:47:00 +0100
Subject: [Python-Dev] [Python-checkins] r86748 - in
 python/branches/py3k-urllib/Lib: http/ urllib/
In-Reply-To: <>
References: <>
Message-ID: <>

> Author: senthil.kumaran
> New Revision: 86748
> Log:
> Experimental - Transparent gzip Encoding in urllib2. There should be a good way to deal with Content-Length.
Cool feature!  But...

> Modified:
>    python/branches/py3k-urllib/Lib/http/
>    python/branches/py3k-urllib/Lib/urllib/
No tests?  Misc/NEWS?  :)


From rob.cliffe at  Thu Nov 25 13:52:44 2010
From: rob.cliffe at (Rob Cliffe)
Date: Thu, 25 Nov 2010 12:52:44 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 25/11/2010 03:46, Greg Ewing wrote:
> On 25/11/10 12:38, average wrote:
>> Is immutability a general need that should have general solution?
Yes, I have sometimes thought this.  Might be nice to have a "mutable" 
attribute that could be read and could be changed from True to False, 
though presumably not vice versa.
> I don't think it really generalizes. Tuples are not just frozen
> lists, for example -- they have a different internal structure
> that's more efficient to create and access.
But couldn't they be presented to the Python programmer as a single 
type, with the implementation details hidden "under the hood"?
     MyList.__mutable__ = False
would have the same effect as the present
     MyList = tuple(MyList)
This would simplify some code that copes with either list(s) or tuple(s) 
as input data.
One would need syntax for (im)mutable literals, e.g.
     []i    # immutable list (really a tuple).  Bit of a shame that 
"i[]" doesn't work.
     []f    # frozen list (same thing)
     []     # mutable list (same as now)
     []m  # alternative syntax for mutable list
This would reduce the overloading on parentheses and avoid having to 
write a tuple of one item as (t,) which often trips up newbies.  It would
also avoid one FAQ: Why does Python have separate list and tuple types?  
Also the syntax could be extended, e.g.
     {a,b,c}f      # frozen set with 3 objects
     {p:x,q:y}f  # frozen dictionary with 2 items
     {:}f,  {}f      # (re the thread on set literals) frozen empty 
dictionary and frozen empty set!
Just some thoughts for Python 4.
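
As a rough illustration of the one-way "mutable" switch described above, here
is a minimal sketch. The class FreezableList and its freeze() method are
hypothetical names, not an existing or proposed API, and only two mutating
operations are guarded to keep the example short.

```python
class FreezableList(list):
    """A list that can be switched, one way only, from mutable to frozen."""

    def __init__(self, iterable=()):
        super().__init__(iterable)
        self._mutable = True

    @property
    def mutable(self):
        # Readable flag; there is deliberately no way to set it back to True.
        return self._mutable

    def freeze(self):
        self._mutable = False

    def _check_mutable(self):
        if not self._mutable:
            raise TypeError("list is frozen")

    # Only append and item assignment are guarded in this sketch; a complete
    # version would have to guard every mutating method.
    def append(self, item):
        self._check_mutable()
        super().append(item)

    def __setitem__(self, index, value):
        self._check_mutable()
        super().__setitem__(index, value)
```

A real design would also need an answer for hashing, which is one of the
things that distinguishes tuple from list today.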
Best wishes
Rob Cliffe

From g.brandl at  Thu Nov 25 14:27:14 2010
From: g.brandl at (Georg Brandl)
Date: Thu, 25 Nov 2010 14:27:14 +0100
Subject: [Python-Dev] [Python-checkins] r86748 - in
 python/branches/py3k-urllib/Lib: http/ urllib/
In-Reply-To: <>
References: <>
Message-ID: <iclo7m$1ct$>

On 25.11.2010 12:47, Éric Araujo wrote:
>> Author: senthil.kumaran
>> New Revision: 86748
>> Log:
>> Experimental - Transparent gzip Encoding in urllib2. There should be a good way to deal with Content-Length.
> Cool feature!  But...
>> Modified:
>>    python/branches/py3k-urllib/Lib/http/
>>    python/branches/py3k-urllib/Lib/urllib/
> No tests?  Misc/NEWS?  :)

Note that this is work in a separate branch.


From emile.anclin at  Thu Nov 25 15:30:23 2010
From: emile.anclin at (Emile Anclin)
Date: Thu, 25 Nov 2010 15:30:23 +0100
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
Message-ID: <201011251530.23947.emile.anclin@logilab>


Working on Pylint, we have a lot of deliberately corrupted files to test
Pylint behavior; for instance
$ cat /home/emile/var/pylint/test/input/ 
# -*- coding: IBO-8859-1 -*-
""" check correct unknown encoding declaration
"""
__revision__ = '????'

and we try to find that module :
find_module('func_unknown_encoding', None). But python3 raises SyntaxError 
in that case ; it didn't raise SyntaxError on python2 nor does so on our 
func_nonascii_noencoding and func_wrong_encoding modules (with obvious
names)

Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36) 
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from imp import find_module
>>> find_module('func_unknown_encoding', None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SyntaxError: encoding problem: with BOM
>>> find_module('func_wrong_encoding', None)
(<_io.TextIOWrapper name=5 encoding='utf-8'>, '', 
('.py', 'U', 1))
>>> find_module('func_nonascii_noencoding', None)
(<_io.TextIOWrapper name=6 encoding='utf-8'>, 
'', ('.py', 'U', 1))

So what is the reason for this selective behavior?
Furthermore, there is BOM in our module.


Emile Anclin <emile.anclin at> 
Informatique scientifique & et gestion de connaissances

From rrr at  Thu Nov 25 18:22:58 2010
From: rrr at (Ron Adam)
Date: Thu, 25 Nov 2010 11:22:58 -0600
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
In-Reply-To: <201011251530.23947.emile.anclin@logilab>
References: <201011251530.23947.emile.anclin@logilab>
Message-ID: <>

On 11/25/2010 08:30 AM, Emile Anclin wrote:
> hello,
> working on Pylint, we have a lot of voluntary corrupted files to test
> Pylint behavior; for instance
> $ cat /home/emile/var/pylint/test/input/
> # -*- coding: IBO-8859-1 -*-
> """ check correct unknown encoding declaration
> """
> __revision__ = '????'
> and we try to find that module :
> find_module('func_unknown_encoding', None). But python3 raises SyntaxError
> in that case ; it didn't raise SyntaxError on python2 nor does so on our
> func_nonascii_noencoding and func_wrong_encoding modules (with obvious
> names)
> Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36)
> [GCC 4.3.4] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> from imp import find_module
>>>> find_module('func_unknown_encoding', None)
> Traceback (most recent call last):
>    File "<stdin>", line 1, in<module>
> SyntaxError: encoding problem: with BOM
>>>> find_module('func_wrong_encoding', None)
> (<_io.TextIOWrapper name=5 encoding='utf-8'>, '',
> ('.py', 'U', 1))
>>>> find_module('func_nonascii_noencoding', None)
> (<_io.TextIOWrapper name=6 encoding='utf-8'>,
> '', ('.py', 'U', 1))
> So what is the reason of this selective behavior?
> Furthermore, there is BOM in our module.

I don't think there is a clear reason by design.  Also try importing the 
same modules directly and noting the differences in the errors you get.

For example, the problem that brought this to my attention in python3.2.

 >>> find_module('test/badsyntax_pep3120')
Segmentation fault

 >>> from test import badsyntax_pep3120
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/usr/local/lib/python3.2/test/", line 1
SyntaxError: Non-UTF-8 code starting with '\xf6' in file 
/usr/local/lib/python3.2/test/ on line 1, but no 
encoding declared; see for details

The import statement uses parser.c, and tokenizer.c indirectly, to import a 
file, but the imp module uses tokenizer.c directly.  They aren't consistent 
in how they handle errors because the different error messages are 
generated in different places depending on what the error is, *and* what 
the code path to get to that point was, *and* whether or not a filename was
set.  For the example above with imp.find_module(), the filename isn't set,
so you get a different error than if you used import, which uses the parser 
module and that does set the filename.

 From what I've seen, it would help if the imp module was rewritten to use 
parser.c like the import statement does, rather than tokenizer.c directly. 
The error handling in parser.c is much better than tokenizer.c.  Possibly 
tokenizer.c could be cleaned up after that and be made much simpler.
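
The encoding check itself can be reproduced from pure Python with the
standard tokenize module, which is a convenient way to see which files will
be rejected before anything is imported. This is only a sketch of the
detection step; it does not exercise tokenizer.c or the imp module, and the
helper name declared_encoding is made up for the example.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the detected source encoding, or the SyntaxError text on failure."""
    try:
        encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
        return encoding
    except SyntaxError as exc:
        return f"SyntaxError: {exc}"


# An unknown codec name in the coding cookie, as in func_unknown_encoding.
unknown = declared_encoding(b"# -*- coding: IBO-8859-1 -*-\nx = 1\n")

# Non-UTF-8 bytes on the first line with no cookie at all.
undeclared = declared_encoding(b"x = '\xf6'\n")

# A well-formed file with no cookie falls back to UTF-8.
plain = declared_encoding(b"x = 1\n")
```

Both of the first two cases surface as SyntaxError here, which at least gives
a single place to catch and report them.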

Ron Adam

From merwok at  Thu Nov 25 18:53:54 2010
From: merwok at (Éric Araujo)
Date: Thu, 25 Nov 2010 18:53:54 +0100
Subject: [Python-Dev] [Python-checkins] r86748 - in
 python/branches/py3k-urllib/Lib: http/ urllib/
In-Reply-To: <iclo7m$1ct$>
References: <>	<>
Message-ID: <>

>>> Modified:
>>>    python/branches/py3k-urllib/Lib/http/
>>>    python/branches/py3k-urllib/Lib/urllib/
>> No tests?  Misc/NEWS?  :)
> Note that this is work in a separate branch.

Ah, didn't notice that!  Senthil replied as much in private email:

> That was in a different branch. Once stable shall definitely include
> the tests and news.

unconsciously-ignoring-svn-branches-to-preserve-sanity-ly yours,

From victor.stinner at  Thu Nov 25 22:39:00 2010
From: victor.stinner at (Victor Stinner)
Date: Thu, 25 Nov 2010 22:39:00 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
Message-ID: <>

On Friday 19 November 2010 23:25:03 you wrote:
> > Python is unclear about non-BMP characters: narrow build was called
> > "ucs2" for long time, even if it is UTF-16 (each character is encoded to
> > one or two UTF-16 words).
> No, no, no :-)
> UCS2 and UCS4 are more appropriate than "narrow" and "wide" or even
> "UTF-16" and "UTF-32".

Ok for Python 2:

$ ./python 
Python 2.7.0+ (release27-maint:84618M, Sep  8 2010, 12:43:49) 
>>> import sys; sys.maxunicode
>>> x=u'\U0010ffff'; len(x)
>>> ord(x)
TypeError: ord() expected a character, but string of length 2 found

But Python 3 does use UTF-16 for narrow build:

$ ./python                                                                                                                                  
Python 3.2a3+ (py3k:86396:86399M, Nov 10 2010, 15:24:09)                                                                                   
>>> import sys; sys.maxunicode
>>> c=chr(0x10ffff); len(c)
>>> ord(c)
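
The UTF-16 arithmetic behind the narrow-build behavior can be checked on any
current interpreter. Note that since CPython 3.3 the narrow/wide distinction
is gone and len(chr(0x10ffff)) is 1, so the sketch below counts UTF-16 code
units explicitly instead of relying on len(). The helper names are
illustrative only.

```python
def utf16_code_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def surrogate_pair(code_point):
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


units = utf16_code_units(chr(0x10FFFF))   # 2 code units for one character
high, low = surrogate_pair(0x10FFFF)      # 0xDBFF, 0xDFFF
```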


From merwok at  Fri Nov 26 02:32:43 2010
From: merwok at (Éric Araujo)
Date: Fri, 26 Nov 2010 02:32:43 +0100
Subject: [Python-Dev] [Python-checkins] r86750 -
In-Reply-To: <>
References: <>
Message-ID: <>


> Author: senthil.kumaran
> Log:
> Mouse support and colour to Demo/curses/ by Dafydd Crosby
> Modified:
>    python/branches/py3k/Demo/curses/
Okay, this time I'm reacting to the right branch <wink>

> Modified: python/branches/py3k/Demo/curses/
> ==============================================================================
> --- python/branches/py3k/Demo/curses/	(original)
> +++ python/branches/py3k/Demo/curses/	Thu Nov 25 15:56:44 2010
> @@ -1,6 +1,7 @@
>  #!/usr/bin/env python3
>  # -- A curses-based version of Conway's Game of Life.
>  # Contributed by AMK
> +# Mouse support and colour by Dafydd Crosby
Shouldn't his name rather be in Misc/ACKS too?  Modules typically
(warning: non-scientific data) include the name of the author or first
contributors but not the name of every contributor.

I think these cool features deserve a note in Misc/NEWS too :)

Re: "colour": the rest of the file uses US English, as do the function
names (see for example curses.has_color).  It's good to use one dialect
consistently in one file.

going-back-to-stare-at-shiny-colors-ly yours,

From orsenthil at  Fri Nov 26 03:15:24 2010
From: orsenthil at (Senthil Kumaran)
Date: Fri, 26 Nov 2010 10:15:24 +0800
Subject: [Python-Dev] [Python-checkins] r86750 -
In-Reply-To: <>
References: <>
Message-ID: <20101126021524.GA1450@rubuntu>

On Fri, Nov 26, 2010 at 02:32:43AM +0100, Éric Araujo wrote:
> Shouldn't his name rather be in Misc/ACKS too?  Modules typically
> (warning: non-scientific data) include the name of the author or first
> contributors but not the name of every contributor.
> I think these cool features deserve a note in Misc/NEWS too :)

I don't think it is required. Demo stuffs are usually fun
demonstrations. The contributor had added his name to patch in the
header, and I just left it like that. It's fine.

For features and important patches (subjective), Misc/{ACKS,NEWS} are
both added.

> Re: "colour": the rest of the file uses US English, as do the function
> names (see for example curses.has_color).  It's good to use one dialect
> consistently in one file.

Good catch. I did not notice it because we write it as colour too.
Changing it.


From stephen at  Fri Nov 26 03:42:33 2010
From: stephen at (Stephen J. Turnbull)
Date: Fri, 26 Nov 2010 11:42:33 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
	<icksqj$84p$> <>
Message-ID: <>

M.-A. Lemburg writes:

 > That would be a possibility as well... but I doubt that many users
 > are going to bother, since slicing surrogates is just as bad as
 > slicing combining code points and the latter are much more common in
 > real life and they do happen to mostly live in the BMP.

That's only if you require 100% fidelity in the data, which may not be
true in some use cases.  Where 99.99% fidelity is good enough, an
unexpected sliced surrogate pair is a show-stopper, while a sliced
combining character sequence not only doesn't stop the show (at least
in Python, and I doubt any correct Unicode process can signal a fatal
error there either, I can put a tilde on a Cyrillic character if I
want to, no?), it's probably readable enough that readers will assume
a keypunch error.
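
The asymmetry between the two kinds of bad slice is easy to demonstrate. The
sketch below assumes the failure shows up at encoding time (for example when
writing the text out as UTF-8); the helper name is made up for the example.

```python
def can_encode_utf8(text):
    """True if text can be written out as UTF-8 without raising."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# What a careless slice of a UTF-16 surrogate pair leaves behind:
# a lone high surrogate.
lone_surrogate = "\ud83d"

# What a careless slice of a combining sequence leaves behind:
# a combining tilde without its Cyrillic base letter.
cyrillic_with_tilde = "\u0430\u0303"
lone_combining = cyrillic_with_tilde[1:]

surrogate_ok = can_encode_utf8(lone_surrogate)   # fails to encode
combining_ok = can_encode_utf8(lone_combining)   # encodes, merely looks odd
```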

Personally, if available I would always use some such dodge in server
software (I don't care enough about 24x7 availability to write it
myself, though).  And never in a script for interactive use; something
needs fixing, may as well take the fatal error and fix it on the spot.
(Again, "on the spot" for me can mean "tomorrow".)

From stephen at  Fri Nov 26 04:02:09 2010
From: stephen at (Stephen J. Turnbull)
Date: Fri, 26 Nov 2010 12:02:09 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

M.-A. Lemburg writes:

 > Please note that we can only provide one way of string indexing
 > in Python using the standard s[1] notation and since we don't
 > want that operation to be fast and no more than O(1), using the
 > code units as items is the only reasonable way to implement it.

AFAICT, the "we" that wants "no more than O(1)" does not include Glyph
Lefkowitz, James Knight, and Greg Ewing.  Greg even said that in
designing a UTF-8 string type he might not provide an indexing
operation at all.  (Caution: That may not be what he meant; I'm just
reporting the way I interpreted it.)  Of course none of them are
proposing to change Python, that's all in the context of designing a
new language.  But it does suggest that a lot of people can't think of
use cases where O(1) string indexing is more important than Unicode

 > It is by far more important to maintain round-trip safety for
 > Unicode data, than getting every bit of code work correctly
 > with surrogates (often, there won't be a single correct way).

But surely it's more important than that to ensure that surrogates
can't crash a Python process with unexpected UnicodeErrors?

From jcea at  Fri Nov 26 05:11:56 2010
From: jcea at (Jesus Cea)
Date: Fri, 26 Nov 2010 05:11:56 +0100
Subject: [Python-Dev] Question about GDB bindings and 32/64 bits
Message-ID: <>

I have installed GDB 7.2 32 bits and 32 bits buildslaves are green.
Nevertheless 64 bits buildslaves are failing test_gdb.

Is there any expectation that a 32 bits GDB be able to debug a 64 bits
Python? If not, the gdb test should compare "platform.architecture()" (for
python and gdb in the system) and run only when they are the same. If
this should work, I would open a bug and maybe spend some time with it.
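
A comparison along those lines could look like the sketch below. It assumes
that platform.architecture() gives a usable answer for both executables on
the platform in question (the documentation notes that it can be unreliable
for some binaries), and the helper name same_bitness is hypothetical, not
part of test_gdb.

```python
import platform
import shutil
import sys


def same_bitness(executable_a, executable_b):
    """True if both executables report the same bit architecture."""
    bits_a, _linkage_a = platform.architecture(executable_a)
    bits_b, _linkage_b = platform.architecture(executable_b)
    return bits_a == bits_b


# Locate gdb on PATH; None if it is not installed.
gdb_path = shutil.which("gdb")

# Only meaningful when gdb is installed; otherwise there is nothing to compare.
gdb_matches_python = (
    same_bitness(sys.executable, gdb_path) if gdb_path is not None else None
)
```

A test could then be skipped when gdb_matches_python is False rather than
reported as a failure.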

But before thinking about investing time, I would like to know if this
mix is actually expected or not to work.

If not, I would consider installing a 64 bits GDB too and doing some tricks
(like using an "/usr/local/bin/gdb" script wrapper to choose 32/64
"real" gdb version) to actually execute "test_gdb" in both buildslaves
(they are running in the same physical machine).

Any advice?

PS: I am talking about AMD64 OpenIndiana buildbots. Haven't checked others.

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at -     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz


From glyph at  Fri Nov 26 08:21:26 2010
From: glyph at (Glyph Lefkowitz)
Date: Fri, 26 Nov 2010 02:21:26 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Nov 24, 2010, at 4:03 AM, Stephen J. Turnbull wrote:

> You end up proliferating types that all do the same kind of thing.  Judicious use of inheritance helps, but getting the fundamental abstraction right is hard.  Or at least, Emacs hasn't found it in 20 years of trying.

Emacs hasn't even figured out how to do general purpose iteration in 20 years of trying either.  The easiest way I've found to loop across an arbitrary pile of 'stuff' is the CL 'loop' macro, which you're not even supposed to use.  Even then, you still have to make the arcane and pointless distinction of using 'across' or 'in' or 'on'.  Python, on the other hand, has iteration pretty well tied up nicely in a bow.

I don't know how to respond to the rest of your argument.  Nothing you've said has in any way indicated to me why having code-point offsets is a good idea, only that people who know C and elisp would rather sling around piles of integers than have good abstract types.

For example:

> I think it more likely that markers are very expensive to create and use compared to integers.

What?  When you do 'for x in str' in python, you are already creating an iterator object, which has to store the exact same amount of state that our proposed 'marker' or 'character pointer' would have to store.  The proposed UTF-8 marker would have to do a tiny bit more work when iterating because it would have to combine multibyte characters, but in exchange for that you get to skip a whole ton of copying when encoding and decoding.  How is this expensive to create and use?  For every application I have ever designed, encountered, or can even conjecture about, this would be cheaper.  (Assuming not just a UTF-8 string type, but one for UTF-16 as well, where native data is in that format already.)

For what it's worth, not wanting to use abstract types in Emacs makes sense to me: I've written my share of elisp code, and it is hard to create reasonable abstractions in Emacs, because the facilities for defining types and creating polymorphic logic are so crude.  It's a lot easier to just assume your underlying storage is an array, because at the end of the day you're going to need to call some functions on it which care whether it's an array or an alist or a list or a vector anyway, so you might as well just say so up front.  But in Python we could just call 'mystring.by_character()' or 'mystring.by_codepoint()' and get an iterator object back and forget about all that junk.
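
To make the iterator idea concrete, here is a minimal sketch of what a
by_codepoint() over UTF-8 storage might look like. It is written as a free
function over bytes because no such string method exists; it assumes the
input is already valid UTF-8 and does only minimal validation.

```python
def by_codepoint(data):
    """Yield (byte_offset, character) pairs from valid UTF-8 bytes."""
    offset = 0
    while offset < len(data):
        lead = data[offset]
        # The lead byte alone tells us how wide the encoded character is.
        if lead < 0x80:
            width = 1
        elif lead >> 5 == 0b110:
            width = 2
        elif lead >> 4 == 0b1110:
            width = 3
        elif lead >> 3 == 0b11110:
            width = 4
        else:
            raise ValueError(f"unexpected continuation byte at offset {offset}")
        yield offset, data[offset:offset + width].decode("utf-8")
        offset += width


pairs = list(by_codepoint("a\u00e9\u20ac\U0001F600".encode("utf-8")))
```

The only state the iterator carries is the byte offset, which is the point
made above about the cost of a marker.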

From glyph at  Fri Nov 26 08:51:35 2010
From: glyph at (Glyph Lefkowitz)
Date: Fri, 26 Nov 2010 02:51:35 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Nov 24, 2010, at 10:55 PM, Stephen J. Turnbull wrote:

> Greg Ewing writes:
>> On 24/11/10 22:03, Stephen J. Turnbull wrote:
>>> But
>>> if you actually need to remember positions, or regions, to jump to
>>> later or to communicate to other code that manipulates them, doing
>>> this stuff the straightforward way (just copying the whole iterator
>>> object to hang on to its state) becomes expensive.
>> If the internal representation of a text pointer (I won't call it
>> an iterator because that means something else in Python) is a byte
>> offset or something similar, it shouldn't take up any more space
>> than a Python int, which is what you'd be using anyway if you
>> represented text positions by grapheme indexes or whatever.
> That's not necessarily true.  Eg, in Emacs ("there you go again"),
> Lisp integers are not only immediate (saving one pointer), but the
> type is encoded in the lower bits, so that there is no need for a type
> pointer -- the representation is smaller than the opaque marker type.
> Altogether, up to 8 of 12 bytes saved on a 32-bit platform, or 16 of
> 24 bytes on a 64-bit platform.

Yes, yes, lisp is very clever.  Maybe some other runtime, like PyPy, could make this optimization.  But I don't think that anyone is filling up main memory with gigantic piles of character indexes and needs to squeeze out that extra couple of bytes of memory on such a tiny object.  Plus, this would allow such a user to stop copying the character data itself just to decode it, and on mostly-ascii UTF-8 text (a common use-case) this is a 2x savings right off the bat.

> In Python it's true that markers can use the same data structure as
> integers and simply provide different methods, and it's arguable that
> Python's design is better.  But if you use bytes internally, then you
> have problems.

No, you just have design questions.

> Do you expose that byte value to the user?

Yes, but only if they ask for it.  It's useful for computing things like quota and the like.

> Can users (programmers using the language and end users) specify positions in terms of byte values?

Sure, why not?

> If so, what do you do if the user specifies a byte value that points into a multibyte character?

Go to the beginning of the multibyte character.  Report that position; if the user then asks the requested marker object for its position, it will report that byte offset, not the originally-requested one.  (Obviously, do the same thing for surrogate pair code points.)
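
The "go to the beginning of the multibyte character" rule is cheap to
implement on UTF-8, because continuation bytes are recognizable on their own.
A minimal sketch, assuming valid UTF-8 input and a hypothetical helper name:

```python
def snap_to_char_start(data, offset):
    """Move a byte offset back to the start of the UTF-8 character it falls in."""
    if not 0 <= offset <= len(data):
        raise IndexError("offset out of range")
    # Continuation bytes always look like 0b10xxxxxx; step back over them.
    while 0 < offset < len(data) and (data[offset] & 0xC0) == 0x80:
        offset -= 1
    return offset


text = "a\u20acb".encode("utf-8")   # b'a\xe2\x82\xacb'
snapped = snap_to_char_start(text, 2)
```

Since a UTF-8 character is at most four bytes long, the loop takes at most
three steps on valid input.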

> What if the user wants to specify position by number of characters?

Part of the point that we are trying to make here is that nobody really cares about that use-case.  In order to know anything useful about a position in a text, you have to have traversed to that location in the text. You can remember interesting things like the offsets of starts of lines, or the x/y positions of characters.

> Can you translate efficiently?

No, because there's no point :).  But you _could_ implement an overlay that cached things like the beginning of lines, or the x/y positions of interesting characters.
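
Such an overlay might be as simple as a sorted list of line-start byte
offsets searched with bisect. The class below is a sketch under that
assumption; LineIndex is a made-up name and it only handles "\n" line
endings.

```python
import bisect


class LineIndex:
    """Cache of line-start byte offsets for a UTF-8 buffer."""

    def __init__(self, data):
        self.starts = [0]
        pos = data.find(b"\n")
        while pos != -1:
            # The next line begins right after each newline byte.
            self.starts.append(pos + 1)
            pos = data.find(b"\n", pos + 1)

    def line_of(self, byte_offset):
        """Zero-based line number containing the given byte offset."""
        return bisect.bisect_right(self.starts, byte_offset) - 1


index = LineIndex("ab\ncd\u00e9\nf".encode("utf-8"))
```

Building the index is a single pass over the bytes with no decoding, and each
lookup is O(log n) in the number of lines.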

> As I say elsewhere, it's possible that there really never is a need to efficiently specify an absolute position in a large text as a character (grapheme, whatever) count.

> But I think it would be hard to implement an efficient text-processing *language*, eg, a Python module
> for *full conformance* in handling Unicode, on top of UTF-8.

Still: why?  I guess if I have some free time I'll try my hand at it, and maybe I'll run into a wall and realize you're right :).

> Any time you have an algorithm that requires efficient access to arbitrary text positions, you'll spend all your skull sweat fighting the representation.  At least, that's been my experience with Emacsen.

What sort of algorithm would that be, though?  The main thing that I could think of is a text editor trying to efficiently allow the user to scroll to the middle of a large file without reading the whole thing into memory.  But, in that case, you could use byte-positions to estimate, and display a heuristic number while calculating the real line numbers.  (This is what 'less' does, and it seems to work well.)

>> So I don't really see what you're arguing for here. How do
>> *you* think positions in unicode strings should be represented?
> I think what users should see is character positions, and they should
> be able to specify them numerically as well as via an opaque marker
> object.  I don't care whether that position is represented as bytes or
> characters internally, except that the experience of Emacsen is that
> representation as byte positions is both inefficient and fragile.  The
> representation as character positions is more robust but slightly more
> inefficient.

Is it really the representation as byte positions which is fragile (i.e. the internal implementation detail), or the exposure of that position to calling code, and the idiomatic usage of that number as an integer?

From facundobatista at  Fri Nov 26 16:05:09 2010
From: facundobatista at (Facundo Batista)
Date: Fri, 26 Nov 2010 12:05:09 -0300
Subject: [Python-Dev] [Preview] Comments and change proposals on
In-Reply-To: <icjsal$eqk$>
References: <icjsal$eqk$>
Message-ID: <>

On Wed, Nov 24, 2010 at 5:24 PM, Georg Brandl <g.brandl at> wrote:

> at <>, you can look at a version of the 3.2
> docs that has the upcoming commenting feature.  JavaScript is mandatory.

This is awesome!! Thanks for this work, remind me to buy you a beer next PyCon!

> Credits go to Jacob Mason, whose GSOC project is responsible for almost all
> of what you see there.  [1]

Ok, two beers.

.    Facundo


From ocean-city at  Fri Nov 26 17:33:50 2010
From: ocean-city at (Hirokazu Yamamoto)
Date: Sat, 27 Nov 2010 01:33:50 +0900
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

On 2010/11/14 9:06, Victor Stinner wrote:
> Yes, but how do you check if the input argument is a bytes or a str object
> with your PyArg_Parse converter? You should use "O" format and manually
> convert it to unicode, and then convert the result back to bytes (if the input
> was bytes). I don't think that it makes the code shorter.
> The code is currently working. The question is if we have to drop the ANSI API
> now, later or never. It looks like the decision moves to "later" (deprecate in
> 3.2, remove in 3.3). I still think that drop now doesn't really hurt.
> Victor

Humble thoughts...
Is it possible for a conversion from bytes (ANSI) to unicode to fail on
Windows? If not, would it be acceptable to convert to unicode with
PyUnicode_FSDecoder when the function doesn't return str? For example,
os.stat() takes str as arguments but doesn't return str.

# I noticed win_readlink() in Modules/posixmodule.c is already unicode
# only. Maybe not so much of a problem? ;-)

From ocean-city at  Fri Nov 26 18:06:06 2010
From: ocean-city at (Hirokazu Yamamoto)
Date: Sat, 27 Nov 2010 02:06:06 +0900
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>
Message-ID: <>

On 2010/11/12 1:18, Ulrich Eckhardt wrote:
>> # I recently did it for winsound.PlaySound with MvL's approval
> Interesting, is there a ticket associated with this? Also, was that on Python 3
> or 2? Which commits?

Sorry for the late reply. Rev 86300 and Issue 6317.

From status at  Fri Nov 26 18:07:01 2010
From: status at (Python tracker)
Date: Fri, 26 Nov 2010 18:07:01 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <>

ACTIVITY SUMMARY (2010-11-19 - 2010-11-26)
Python tracker at

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    2533 (-16)
  closed 19792 (+98)
  total  22325 (+82)

Open issues with patches: 1083 

Issues opened (66)

#1178: IDLE - add "paste code" functionality  reopened by ned.deily

#3709: BaseHTTPRequestHandler innefficient when sending HTTP header  reopened by r.david.murray

#5150: IDLE to support  reopened by rhettinger

#8879: Implement on Windows  reopened by amaury.forgeotdarc

#9769: PyUnicode_FromFormatV() doesn't handle non-ascii text correctl  reopened by belopolsky

#10220: Make generator state easier to introspect  reopened by ncoghlan

#10268: Add --enable-loadable-sqlite-extensions option to `configure`  reopened by ned.deily

#10441: some stdlib modules need to be updated to handle SSL certifica  reopened by pitrou

#10453: Add -h/--help option to compileall  reopened by eric.araujo

#10464: netrc module not parsing passwords containing #s.  opened by the_isz

#10466: resetlocale throws exception on Windows (getdefaultl  opened by skoczian

#10469: test_socket fails using Visual Studio 2010  opened by Kotan

#10475: hardcoded compilers for LDSHARED/LDCXXSHARED on NetBSD  opened by njoly

#10478: Ctrl-C locks up the interpreter  opened by isandler

#10479: should assume a binary stream for output  opened by v+python

#10480: should document the need for binary stdin/stdout  opened by v+python

#10481: subprocess PIPEs are byte streams  opened by v+python

#10482: subprocess and deadlock avoidance  opened by v+python

#10483: http.server - what is executable on Windows  opened by v+python

#10484: http.server.is_cgi fails to handle CGI URLs containing PATH_IN  opened by v+python

#10485: http.server fails when query string contains addition '?' char  opened by v+python

#10486: http.server doesn't set all CGI environment variables  opened by v+python

#10487: http.server - doesn't process Status: header from CGI scripts  opened by v+python

#10492: test_doctest fails with iso-8859-15 locale  opened by pitrou

#10494: Demo/comparisons/ needs some usage information.  opened by ramiroluz

#10495: Demo/comparisons/ needs some usage information.  opened by ramiroluz

#10496: "import site failed" when Python can't find home directory  opened by bbi5291

#10497: Incorrect use of gettext in argparse  opened by eric.araujo

#10498: calendar.LocaleHTMLCalendar.formatyearpage() results in traceb  opened by r.david.murray

#10499: Modular interpolation in configparser  opened by lukasz.langa

#10500: Palevo.DZ worm msix86 installer 3.x installer  opened by VilIgnoble

#10502: Add unittestguirunner to Tools/  opened by michael.foord

#10503: os.getuid() documentation should be clear on what kind of uid  opened by giampaolo.rodola

#10504: Trivial mingw compile fixes  opened by jonny

#10507: Check well-formedness of reST markup within "make patchcheck"  opened by dmalcolm

#10509: PyTokenizer_FindEncoding can lead to a segfault if bad charact  opened by Trundle

#10510: distutils upload/register should use CRLF in HTTP requests  opened by Brian.Jones

#10512: regrtest ResourceWarning - unclosed sockets and files  opened by nvawda

#10513: sqlite3.InterfaceError after commit  opened by anders.blomdell at

#10514: configure does not create accurate Makefile  opened by daelious

#10515: csv sniffer does not recognize quotes at the end of line  opened by Martin.Budaj

#10516: Add list.clear() and list.copy()  opened by terry.reedy

#10517: test_concurrent_futures crashes with "Fatal Python error: Inva  opened by lukasz.langa

#10518: Bring back callable()  opened by pitrou

#10519: setobject.c no-op typo  opened by arigo

#10521: str methods don't accept non-BMP fillchar on a narrow Unicode  opened by belopolsky

#10522: test_telnet exception  opened by pitrou

#10523: argparse has problem parsing option files containing empty row  opened by Michal.Pomorski

#10524: Patch to add Pardus to supported dists in platform  opened by zaburt

#10527: multiprocessing.Pipe problem: "handle out of range in select()  opened by synapse

#10528: argparse uses %s in gettext calls  opened by eric.araujo

#10529: Write argparse i18n howto  opened by eric.araujo

#10530: distutils2 should allow the installing of python files with in  opened by michael.foord

#10531: write tilted text in turtle  opened by lanyjie

#10532: A bug related to matching the empty string  opened by lanyjie

#10533: Need example of using __missing__  opened by lukasz.langa

#10534: difflib.SequenceMatcher: expose junk sets, deprecate undocumen  opened by terry.reedy

#10535: Enable warnings by default in unittest  opened by ezio.melotti

#10536: Enhancements to gettext docs  opened by eric.araujo

#10537: IDLE crashes when you paste something.  opened by 5ragar5

#10538: PyArg_ParseTuple("s*") does not always incref object  opened by krisvale

#10539: Regular expression not checking 'range' element on 1st char in  opened by TxRxFx

#10540: test_shutil fails on Windows after r86733  opened by brian.curtin

#10541: -T broken  opened by doerwalter

#10542: Py_UNICODE_NEXT and other macros for surrogates  opened by belopolsky

#10543: Test discovery (unittest) does not work with jython  opened by michael.foord

Most recent 15 issues with no replies (15)

#10543: Test discovery (unittest) does not work with jython

#10542: Py_UNICODE_NEXT and other macros for surrogates

#10541: -T broken

#10539: Regular expression not checking 'range' element on 1st char in

#10538: PyArg_ParseTuple("s*") does not always incref object

#10537: IDLE crashes when you paste something.

#10536: Enhancements to gettext docs

#10534: difflib.SequenceMatcher: expose junk sets, deprecate undocumen

#10531: write tilted text in turtle

#10530: distutils2 should allow the installing of python files with in

#10523: argparse has problem parsing option files containing empty row

#10522: test_telnet exception

#10514: configure does not create accurate Makefile

#10507: Check well-formedness of reST markup within "make patchcheck"

#10499: Modular interpolation in configparser

Most recent 15 issues waiting for review (15)

#10542: Py_UNICODE_NEXT and other macros for surrogates

#10540: test_shutil fails on Windows after r86733

#10536: Enhancements to gettext docs

#10535: Enable warnings by default in unittest

#10527: multiprocessing.Pipe problem: "handle out of range in select()

#10524: Patch to add Pardus to supported dists in platform

#10521: str methods don't accept non-BMP fillchar on a narrow Unicode

#10518: Bring back callable()

#10515: csv sniffer does not recognize quotes at the end of line

#10512: regrtest ResourceWarning - unclosed sockets and files

#10509: PyTokenizer_FindEncoding can lead to a segfault if bad charact

#10504: Trivial mingw compile fixes

#10499: Modular interpolation in configparser

#10498: calendar.LocaleHTMLCalendar.formatyearpage() results in traceb

#10497: Incorrect use of gettext in argparse

Top 10 most discussed issues (10)

#10461: Use with statement throughout the docs  27 msgs

#7995: On Mac / BSD sockets returned by accept inherit the parent's F  24 msgs

#10453: Add -h/--help option to compileall  24 msgs

#9915: speeding up sorting with a key  14 msgs

#9742: Python 2.7: math module fails to build on Solaris 9  13 msgs

#10533: Need example of using __missing__  13 msgs

#9509: argparse FileType raises ugly exception for missing file  12 msgs

#10469: test_socket fails using Visual Studio 2010  12 msgs

#10504: Trivial mingw compile fixes  12 msgs

#10518: Bring back callable()  12 msgs

Issues closed (92)

#2244: urllib and urllib2 decode userinfo multiple times  closed by orsenthil

#2986: difflib.SequenceMatcher not matching long sequences  closed by terry.reedy

#3292: Position index limit; s.insert(i,x) not same as s[i:i]=[x]  closed by rhettinger

#4493: urllib2 doesn't always supply / where URI path component is em  closed by orsenthil

#4925: Improve error message of subprocess when cannot open  closed by benjamin.peterson

#5353: Improve IndexError messages with actual values  closed by rhettinger

#5412: extend configparser to support mapping access(__*item__)  closed by lukasz.langa

#5616: Distutils 2to3 support doesn't have the doctest_only flag.  closed by eric.araujo

#6166: encoding error for ' --author' when read via subproces  closed by eric.araujo

#6378: Patch to make 'idle.bat' run idle.pyw using appropriate Python  closed by brian.curtin

#6466: duplicate get_version() code between cygwinccompiler and emxcc  closed by eric.araujo

#6722: collections.namedtuple: confusing example  closed by rhettinger

#6799: mimetypes does not give canonical extension for guess_extensio  closed by eric.araujo

#6878: changed return type from tkinter.Canvas.coords  closed by belopolsky

#7212: Retrieve an arbitrary element from a set without removing it  closed by rhettinger

#7226: IDLE right-clicks don't work on Mac OS 10.5  closed by ned.deily

#7257: Improve documentation of list.sort and sorted()  closed by rhettinger

#7645: test_distutils fails on Windows XP  closed by brian.curtin

#7770: sin/cos function in decimal-docs  closed by rhettinger

#7804: test_readline failure  closed by pitrou

#8078: add more baud constants to termios  closed by pitrou

#8340: bytearray undocumented on trunk  closed by pitrou

#8381: IDLE 2.6 freezes on OS X 10.6  closed by ned.deily

#8569: Upgrade OpenSSL in Windows builds  closed by brian.curtin

#8590: test_httpservers.CGIHTTPServerTestCase failure on 3.1-maint Ma  closed by michael.foord

#8631: subprocess.Popen.communicate(...) hangs on Windows  closed by brian.curtin

#8645: PyUnicode_AsEncodedObject is undocumented  closed by belopolsky

#8646: PyUnicode_EncodeDecimal is undocumented  closed by belopolsky

#8647: PyUnicode_GetMax is undocumented  closed by eric.araujo

#8705: shutil.rmtree with empty filepath  closed by brian.curtin

#8938: Mac OS  dialogs(Save As..., Load) translation  closed by ned.deily

#9222: IDLE: Fix open/saveas 'Files of type' choices  closed by terry.reedy

#9500: urllib2: Content-Encoding  closed by r.david.murray

#9732: Addition of getattr_static for inspect module  closed by michael.foord

#9746: All sequence types support .index and .count  closed by eric.araujo

#9802: Document 'stability' of builtin min() and max()  closed by rhettinger

#9807: deriving configuration information for different builds	with t  closed by barry

#9846: ZipExtFile provides no mechanism for closing the underlying fi  closed by lukasz.langa

#9852: test_ctypes fail with clang  closed by ned.deily

#9876: ConfigParser can't interpolate values from other sections  closed by lukasz.langa

#9965: Loading malicious pickle may cause excessive memory usage  closed by georg.brandl

#10134: test_email failures on Windows: end of line issue?  closed by r.david.murray

#10138: calendar module does not support years outside [1, 9999] range  closed by belopolsky

#10164: Add an assertBytesEqual to unittest and use it for bytes asser  closed by rhettinger

#10172: code block has no syntax coloring  closed by georg.brandl

#10183: test_concurrent_futures failure on Windows  closed by bquinlan

#10255: refleak in initstdio  closed by pitrou

#10299: Add index with links section for built-in functions  closed by ezio.melotti

#10319: SocketServer.TCPServer truncates responses on close (in some s  closed by orsenthil

#10325: PY_LLONG_MAX & co - preprocessor constants or not?  closed by mark.dickinson

#10366: Remove unneeded '(object)' from 3.x class examples  closed by eric.araujo

#10371: Deprecate trace module undocumented API  closed by belopolsky

#10377: cProfile incorrectly labels its output  closed by orsenthil

#10391: obj2ast's error handling can lead to python crashing with a C-  closed by benjamin.peterson

#10420: Document of Bdb.effective is wrong.  closed by georg.brandl

#10430: _sha.sha().digest() method is endian-sensitive. and hexdigest(  closed by krisvale

#10437: ThreadPoolExecutor should accept max_workers=None  closed by stutzbach

#10439: PyCodec C API is not documented in reST  closed by georg.brandl

#10448: Add Mako template benchmark to Python Benchmark Suite  closed by pitrou

#10450: Fix markup in Misc/NEWS  closed by eric.araujo

#10458: 2.7 += re.ASCII  closed by terry.reedy

#10459: missing character names in unicodedata (CJK...)  closed by loewis

#10460: Misc/ does not reflect PEP 7  closed by georg.brandl

#10462: Handler.close is not called in subclass while Logger.removeHan  closed by vinay.sajip

#10463: Wrong return type for xml.etree.ElementTree.parse()  closed by tiwoc

#10465: gzip module calls getattr incorrectly  closed by georg.brandl

#10467: io.BytesIO.readinto() segfaults when used on BytesIO object se  closed by benjamin.peterson

#10468: Document UnicodeError access functions  closed by georg.brandl

#10470: python -m unittest ought to default to discovery  closed by michael.foord

#10471: include documentation in python docs and under python -h for o  closed by georg.brandl

#10472: Strange tab key behaviour in interactive python 2.7 OSX 10.6.2  closed by ned.deily

#10473: Strange behavior for socket.timeout  closed by ned.deily

#10474: range.count returns boolean  closed by benjamin.peterson

#10476: __iter__ on a byte file object using a method to return an ite  closed by benjamin.peterson

#10477: AttributeError: 'NoneType' object has no attribute 'name'  (bo  closed by eric.araujo

#10488: Improve documentation for 'float' built-in.  closed by mark.dickinson

#10489: configparser: remove broken `__name__` support  closed by lukasz.langa

#10490: mimetypes read_windows_registry fails for non-ASCII keys  closed by r.david.murray

#10491: Insecure Windows python directory permissions  closed by loewis

#10493: test_strptime failures under OpenIndiana  closed by jcea

#10501: make_buildinfo regression with unquoted path  closed by krisvale

#10505: test_compileall: failure on Windows  closed by eric.araujo

#10506: argparse execute system exit in python prompt  closed by r.david.murray

#10508: compiler warnings about formatting pid_t as an int  closed by georg.brandl

#10511: heapq docs clarification  closed by georg.brandl

#10520: Build with --enable-shared fails  closed by barry

#10525: Added mouse and colour support to Game of Life curses	demo  closed by orsenthil

#10526: Minor typo in What's New in Python 2.7  closed by georg.brandl

#10345: fcntl.ioctl always fails claiming an invalid fd  closed by ned.deily

#1059244: distutil bdist hardcodes the python location  closed by eric.araujo

#1574217: isinstance swallows exceptions  closed by r.david.murray

#1699853: locale.getlocale() output fails as setlocale() input  closed by r.david.murray

From fijall at  Fri Nov 26 19:23:45 2010
From: fijall at (Maciej Fijalkowski)
Date: Fri, 26 Nov 2010 20:23:45 +0200
Subject: [Python-Dev] PyPy 1.4 released
Message-ID: <>

===============================
PyPy 1.4: Ouroboros in practice
===============================

We're pleased to announce the 1.4 release of PyPy. This is a major breakthrough
in our long journey, as PyPy 1.4 is the first PyPy release that can translate
itself faster than CPython.  Starting today, we are using PyPy more for
our every-day development.  So may you :) You can download it here:

What is PyPy
============

PyPy is a very compliant Python interpreter, almost a drop-in replacement
for CPython. It's fast (`pypy 1.4 and cpython 2.6`_ comparison).

Among its new features, this release includes numerous performance improvements
(which made fast self-hosting possible), a 64-bit JIT backend, as well
as serious stabilization. As of now, we can consider the 32-bit and 64-bit
linux versions of PyPy stable enough to run `in production`_.

Numerous speed achievements are described on `our blog`_. Normalized speed
charts comparing `pypy 1.4 and pypy 1.3`_ as well as `pypy 1.4 and cpython 2.6`_
are available on the benchmark website. For the impatient: yes, we got a lot faster!

More highlights
===============

* PyPy's built-in Just-in-Time compiler is fully transparent and
  automatically generated; it now also has very reasonable memory
  requirements.  The total memory used by a very complex and
  long-running process (translating PyPy itself) is within 1.5x to
  at most 2x the memory needed by CPython, for a speed-up of 2x.

* More compact instances.  All instances are as compact as if
  they had ``__slots__``.  This can give programs a big gain in
  memory.  (In the example of translation above, we already have
  carefully placed ``__slots__``, so there is no extra win.)

* `Virtualenv support`_: now PyPy is fully compatible with virtualenv_: note
  that to use it, you need a recent version of virtualenv (>= 1.5).

* Faster (and JITted) regular expressions - huge boost in speeding up
  the `re` module.

* Other speed improvements, like JITted calls to functions like map().
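
For readers unfamiliar with ``__slots__``, the following minimal sketch
shows what the "compact instances" item refers to, as seen on CPython. The
class names are made up for illustration; it says nothing about how PyPy
lays out instances internally.

```python
class PlainPoint:
    """Ordinary class: each instance carries its own __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedPoint:
    """Declares __slots__, so instances have fixed storage and no __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


plain = PlainPoint(1, 2)
slotted = SlottedPoint(1, 2)

print(hasattr(plain, "__dict__"))    # per-instance dict is present
print(hasattr(slotted, "__dict__"))  # no per-instance dict

plain.z = 3          # arbitrary attributes are allowed
try:
    slotted.z = 3    # only the declared slots may be assigned
except AttributeError as exc:
    print("AttributeError:", exc)
```

On CPython the slotted form trades flexibility for memory; the item above
says PyPy 1.4 gives ordinary instances comparable compactness without the
declaration.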

.. _virtualenv:
.. _`Virtualenv support`:
.. _`in production`:
.. _`our blog`:
.. _`pypy 1.4 and pypy 1.3`:,1%2B172&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20&env=1&hor=false&bas=1%2B41&chart=normal+bars
.. _`pypy 1.4 and cpython 2.6`:,1%2B172&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20&env=1&hor=false&bas=2%2B35&chart=normal+bars


Cheers,
Carl Friedrich Bolz, Antonio Cuni, Maciej Fijalkowski,
Amaury Forgeot d'Arc, Armin Rigo and the PyPy team

From reid.kleckner at  Fri Nov 26 19:33:54 2010
From: reid.kleckner at (Reid Kleckner)
Date: Fri, 26 Nov 2010 13:33:54 -0500
Subject: [Python-Dev] PyPy 1.4 released
In-Reply-To: <>
References: <>
Message-ID: <>

Congratulations!  Excellent work.


On Fri, Nov 26, 2010 at 1:23 PM, Maciej Fijalkowski <fijall at> wrote:
> ===============================
> PyPy 1.4: Ouroboros in practice
> ===============================
> We're pleased to announce the 1.4 release of PyPy. This is a major breakthrough
> in our long journey, as PyPy 1.4 is the first PyPy release that can translate
> itself faster than CPython.  Starting today, we are using PyPy more for
> our every-day development.  So may you :) You can download it here:
> What is PyPy
> ============
> PyPy is a very compliant Python interpreter, almost a drop-in replacement
> for CPython. It's fast (`pypy 1.4 and cpython 2.6`_ comparison)
> Among its new features, this release includes numerous performance improvements
> (which made fast self-hosting possible), a 64-bit JIT backend, as well
> as serious stabilization. As of now, we can consider the 32-bit and 64-bit
> linux versions of PyPy stable enough to run `in production`_.
> Numerous speed achievements are described on `our blog`_. Normalized speed
> charts comparing `pypy 1.4 and pypy 1.3`_ as well as `pypy 1.4 and cpython 2.6`_
> are available on benchmark website. For the impatient: yes, we got a lot faster!
> More highlights
> ===============
> * PyPy's built-in Just-in-Time compiler is fully transparent and
>   automatically generated; it now also has very reasonable memory
>   requirements.  The total memory used by a very complex and
>   long-running process (translating PyPy itself) is within 1.5x to
>   at most 2x the memory needed by CPython, for a speed-up of 2x.
> * More compact instances.  All instances are as compact as if
>   they had ``__slots__``.  This can give programs a big gain in
>   memory.  (In the example of translation above, we already have
>   carefully placed ``__slots__``, so there is no extra win.)
> * `Virtualenv support`_: now PyPy is fully compatible with virtualenv_: note
>   that to use it, you need a recent version of virtualenv (>= 1.5).
> * Faster (and JITted) regular expressions - huge boost in speeding up
>   the `re` module.
> * Other speed improvements, like JITted calls to functions like map().
> .. _virtualenv:
> .. _`Virtualenv support`:
> .. _`in production`:
> .. _`our blog`:
> .. _`pypy 1.4 and pypy 1.3`:
> .. _`pypy 1.4 and cpython 2.6`:
> Cheers,
> Carl Friedrich Bolz, Antonio Cuni, Maciej Fijalkowski,
> Amaury Forgeot d'Arc, Armin Rigo and the PyPy team

From brian.curtin at  Fri Nov 26 19:52:22 2010
From: brian.curtin at (Brian Curtin)
Date: Fri, 26 Nov 2010 12:52:22 -0600
Subject: [Python-Dev] [Python-checkins] r86817 -
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 26, 2010 at 12:44, hirokazu.yamamoto <python-checkins at
> wrote:

> Author: hirokazu.yamamoto
> Date: Fri Nov 26 19:44:28 2010
> New Revision: 86817
> Log:
> Now can reproduce the error on AMD64 Windows Server 2008
> even where os.symlink is not supported.
> Modified:
>   python/branches/py3k-stat-on-windows/Lib/test/
> Modified: python/branches/py3k-stat-on-windows/Lib/test/
> ==============================================================================
> --- python/branches/py3k-stat-on-windows/Lib/test/
>  (original)
> +++ python/branches/py3k-stat-on-windows/Lib/test/        Fri
> Nov 26 19:44:28 2010
> @@ -271,24 +271,32 @@
>             shutil.rmtree(src_dir)
>             shutil.rmtree(os.path.dirname(dst_dir))
> -    @support.skip_unless_symlink
> +    @unittest.skipUnless(hasattr(os, 'link'), 'requires')
>     def test_dont_copy_file_onto_link_to_itself(self):
>         # bug 851123.
>         os.mkdir(TESTFN)
>         src = os.path.join(TESTFN, 'cheese')
>         dst = os.path.join(TESTFN, 'shop')
>         try:
> -            f = open(src, 'w')
> -            f.write('cheddar')
> -            f.close()
> -
> -            if hasattr(os, "link"):
> -      , dst)
> -                self.assertRaises(shutil.Error, shutil.copyfile, src, dst)
> -                with open(src, 'r') as f:
> -                    self.assertEqual(, 'cheddar')
> -                os.remove(dst)
> +            with open(src, 'w') as f:
> +                f.write('cheddar')
> +  , dst)
> +            self.assertRaises(shutil.Error, shutil.copyfile, src, dst)
> +            with open(src, 'r') as f:
> +                self.assertEqual(, 'cheddar')
> +            os.remove(dst)
> +        finally:
> +            shutil.rmtree(TESTFN, ignore_errors=True)
> +    @support.skip_unless_symlink
> +    def test_dont_copy_file_onto_symlink_to_itself(self):
> +        # bug 851123.
> +        os.mkdir(TESTFN)
> +        src = os.path.join(TESTFN, 'cheese')
> +        dst = os.path.join(TESTFN, 'shop')
> +        try:
> +            with open(src, 'w') as f:
> +                f.write('cheddar')
>             # Using `src` here would mean we end up with a symlink pointing
>             # to TESTFN/TESTFN/cheese, while it should point at
>             # TESTFN/cheese.
> @@ -298,10 +306,7 @@
>                 self.assertEqual(, 'cheddar')
>             os.remove(dst)
>         finally:
> -            try:
> -                shutil.rmtree(TESTFN)
> -            except OSError:
> -                pass
> +            shutil.rmtree(TESTFN, ignore_errors=True)
>     @support.skip_unless_symlink
>     def test_rmtree_on_symlink(self):
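
The shape of the hard-link test in the diff above can be sketched as a
standalone test case. This is an illustration only, not the actual test
from the branch: the class name, the temporary-directory handling and the
skip message are assumptions, since the quoted diff has lost some of its
text.

```python
import os
import shutil
import tempfile
import unittest


class CopyOntoHardLinkTest(unittest.TestCase):
    """Illustrative stand-in for the hard-link test shown in the diff."""

    def setUp(self):
        self.tmp = tempfile.mkdtemp()

    def tearDown(self):
        # ignore_errors=True replaces the try/except OSError: pass idiom.
        shutil.rmtree(self.tmp, ignore_errors=True)

    # Skip (rather than silently pass) where hard links are unavailable.
    @unittest.skipUnless(hasattr(os, "link"), "requires os.link")
    def test_dont_copy_file_onto_link_to_itself(self):
        src = os.path.join(self.tmp, "cheese")
        dst = os.path.join(self.tmp, "shop")
        with open(src, "w") as f:
            f.write("cheddar")
        os.link(src, dst)
        # Copying a file onto a hard link to itself must be refused ...
        self.assertRaises(shutil.Error, shutil.copyfile, src, dst)
        # ... and must leave the original contents untouched.
        with open(src) as f:
            self.assertEqual(f.read(), "cheddar")
```

Splitting the hard-link and symlink cases into separate tests, each with
its own skip condition, means a missing symlink privilege no longer hides
the hard-link check.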

You might be working on something slightly different, but I have an issue
created for the failure of that test:

It slipped past me because I was only running the test suite as a regular
user without the required symlink privilege, so the test was skipped. That
Server 2008 build slave runs the test suite as administrator, so it was
running that test and going into the block, which it didn't do until
r86733.

From ocean-city at  Fri Nov 26 20:45:18 2010
From: ocean-city at (Hirokazu Yamamoto)
Date: Sat, 27 Nov 2010 04:45:18 +0900
Subject: [Python-Dev] [Python-checkins] r86817
	-	python/branches/py3k-stat-on-windows/Lib/test/
In-Reply-To: <>
References: <>
Message-ID: <>

On 2010/11/27 3:52, Brian Curtin wrote:
> You might be working on something slightly different, but I have an issue
> created for the failure of that test:
> It slipped past me because I was only running the test suite as a regular
> user without the required symlink privilege, so the test was skipped. That
> Server 2008 build slave runs the test suite as administrator, so it was
> running that test and going into the block, which it didn't do until
> r86733.

I'm not sure, but why does os.path.samefile return False for a hard link
on Windows? MSDN says,

 > A hard link is the file system representation of a file by which more 
 > than one path references a single file in the same volume.

I know st_ino on Windows is a bit different from POSIX, so I'm just not
sure. ;-)

From brian.curtin at  Fri Nov 26 21:02:29 2010
From: brian.curtin at (Brian Curtin)
Date: Fri, 26 Nov 2010 14:02:29 -0600
Subject: [Python-Dev] [Python-checkins] r86817 -
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 26, 2010 at 13:45, Hirokazu Yamamoto <ocean-city at
> wrote:

> On 2010/11/27 3:52, Brian Curtin wrote:
>> You might be working on something slightly different, but I have an issue
>> created for the failure of that test:
>> It slipped past me because I was only running the test suite as a regular
>> user without the required symlink privilege, so the test was skipped. That
>> Server 2008 build slave runs the test suite as administrator, so it was
>> running that test and going into the block, which it didn't do until
>> r86733.
> I'm not sure, but why does os.path.samefile return False for hard link
> on windows? MSDN says,
> > A hard link is the file system representation of a file by which more
> > than one path references a single file in the same volume.
> I know st_ino on windows is a bit different from POSIX, so, just I'm not
> sure. ;-)

The samefile thing, I don't know either. GetFinalPathNameByHandle does not
appear to work with hard links, at least how it's being used right now. It
has no problem with symlinks. We briefly chatted about this on the
feature issue, but I never found a way around it.

I'll look into it this weekend.

From ocean-city at  Fri Nov 26 21:18:58 2010
From: ocean-city at (Hirokazu Yamamoto)
Date: Sat, 27 Nov 2010 05:18:58 +0900
Subject: [Python-Dev] [Python-checkins] r86817 -
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

On 2010/11/27 5:02, Brian Curtin wrote:
> We briefly chatted about this on the
> feature issue, but I never found a way around it.

How about implementing os.path.samefile in
Modules/posixmodule.c like this?

# I hope this works.

From brian.curtin at  Fri Nov 26 21:31:49 2010
From: brian.curtin at (Brian Curtin)
Date: Fri, 26 Nov 2010 14:31:49 -0600
Subject: [Python-Dev] [Python-checkins] r86817 -
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Nov 26, 2010 at 14:18, Hirokazu Yamamoto <ocean-city at> wrote:

> On 2010/11/27 5:02, Brian Curtin wrote:
>> We briefly chatted about this on the
>> feature issue, but I never found a way around it.
> How about implementing os.path.samefile in
> Modules/posixmodule.c like this?
> # I hope this works.

That's almost identical to what the current os.path.sameopenfile is.

Lib/ opens both files, then compares them via _getfileinformation.
That function is implemented to take in a file descriptor, call
GetFileInformationByHandle with it, and return a tuple
of dwVolumeSerialNumber, nFileIndexHigh, and nFileIndexLow.
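
For comparison, here is a minimal sketch of the same identity check in pure Python. It assumes that os.stat fills st_dev and st_ino with values that identify a file on the platform in question (device and inode on POSIX; on Windows the documentation describes st_ino as the file index). The helper names are made up for illustration and are not part of the standard library.

```python
import os

def file_identity(path):
    """Return a (device, file id) pair for path (hypothetical helper)."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)

def looks_like_same_file(path_a, path_b):
    """True when both paths resolve to the same underlying file."""
    return file_identity(path_a) == file_identity(path_b)
```

Two hard links to one file should compare equal under this check, while two separate files with identical contents should not.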

From martin at  Fri Nov 26 21:39:36 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 26 Nov 2010 21:39:36 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

> Is it possible a conversion from bytes (ANSI) to unicode fails on
> windows?

It should fail sometimes, right? Not for windows-1252, but certainly
for shift-jis (you know better than me). It seems that whether
MultiByteToWideChar will fail depends on whether MB_ERR_INVALID_CHARS
is given or not. I don't know what it will do if this flag is not
given - my guess is that it fills in REPLACEMENT CHARACTER.
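
As a rough analogy only, Python's own codecs show the same split between a strict conversion that fails and a lenient one that substitutes U+FFFD. This sketch does not call MultiByteToWideChar, and the sample bytes are an assumption chosen to end in a lone Shift JIS lead byte.

```python
# Sample input (assumed): ASCII text followed by a lone Shift JIS lead byte.
data = b"abc\x81"

try:
    data.decode("shift_jis")        # "strict" is the default error handler
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors="replace" the undecodable byte becomes U+FFFD instead.
lenient = data.decode("shift_jis", errors="replace")
```

The strict call plays the role of passing MB_ERR_INVALID_CHARS; whether the Win32 function really behaves like the lenient call without that flag is exactly the open question above.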

> If not, is it allowed to convert to unicode with
> PyUnicode_FSDecoder if function doesn't return str? For example,
> os.stat() takes str as arguments but doesn't return str.

This I don't understand. os.stat doesn't return text at all -
so what do you want to convert?

> # I noticed win_readlink() in Modules/posixmodule.c already unicode
> # only. Maybe not so much problem? ;-)

Well, readlink is new on Windows, and symlinks are not widespread.
So there is no backwards compatibility concern here.


From ncoghlan at  Sat Nov 27 08:35:52 2010
From: ncoghlan at (Nick Coghlan)
Date: Sat, 27 Nov 2010 17:35:52 +1000
Subject: [Python-Dev] [Python-checkins] r86720 -
In-Reply-To: <icjoqu$tro$>
References: <>
	<> <>
Message-ID: <>

On Thu, Nov 25, 2010 at 5:25 AM, Terry Reedy <tjreedy at> wrote:
> I know now that I could always edit with IDLE's editor, but it is a lot
> easier to right click and select edit than it is to run thru the directory
> tree in an open dialog.

If you want a decent free text editor on Windows, the open source
Notepad++ does a very nice job. It also adds an "Edit with Notepad++"
to the explorer context menu :)

> And of course, since the pseudo-BOM addition is
> undocumented within notepad itself, and probably other editors, it is easy
> to not know.

As far as the implicit BOM addition itself goes, and could probably be updated to check for it, but the
miscellaneous files (like ACKS) are likely to continue to need manual


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From stephen at  Sat Nov 27 09:48:52 2010
From: stephen at (Stephen J. Turnbull)
Date: Sat, 27 Nov 2010 17:48:52 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Glyph Lefkowitz writes:

 > But I don't think that anyone is filling up main memory with
 > gigantic piles of character indexes and need to squeeze out that
 > extra couple of bytes of memory on such a tiny object.

How do you think editors and browsers represent the regions that they
highlight, then?  How do you think that structure-oriented editors
represent the structures that they work with, then?  In a detailed
analysis of a C or Java file, it's easy to end up with almost 1:2
positions to characters ratio.  Note that *buffer* characters are
typically smaller than a platform word, so saving one word in the
representation of a position means a 100% or more increase in the
character count of the buffer.  Even in the case of UCS-4 on a 32-bit
platform, that's a 50% increase in the maximum usable size of a
buffer before a parser starts raising OOM errors.

There are two plausible ways to represent these structures that I can
think of offhand.  The first is to do it the way Emacs does, by
reading the text into a buffer and using position offsets to map to
display or structure attributes.  The second is to use a hierarchical
document model, and render the display by traversing the document

It's not obvious to me that forcing use of the second representation
is a good idea for performance in an editor, and I would think that
they have similar memory requirements.

 > Plus, this would allow such a user to stop copying the character
 > data itself just to decode it, and on mostly-ascii UTF-8 text (a
 > common use-case) this is a 2x savings right off the bat.

Which only matters if you're a server in the business of shoveling
octets really fast but are CPU bound (seems unlikely to me, but I'm no
expert; WDYT?), and even then is only that big a savings if you can
push off the issue of validating the purported UTF-8 text on others.
If you're not validating, you may as well acknowledge that you're
processing binary data, not text.[1]  But we're talking about text.

And of course, if you copy mostly-Han UTF-8 text (a common use-case)
to UCS-2, this is a 1.5x memory savings right off the bat, and a 3x
time savings when iterating in most architectures (one increment
operation per character instead of three).
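
The size arithmetic on both sides can be checked with a short sketch. The sample strings are assumptions; "utf-16-le" stands in for UCS-2 here, which is equivalent as long as every character is in the BMP.

```python
# Assumed sample texts: one all-Han, one all-ASCII, both entirely in the BMP.
han = "漢字" * 1000
ascii_text = "ab" * 1000

han_utf8 = len(han.encode("utf-8"))            # 3 bytes per Han character
han_ucs2 = len(han.encode("utf-16-le"))        # 2 bytes per BMP character

ascii_utf8 = len(ascii_text.encode("utf-8"))       # 1 byte per ASCII character
ascii_ucs2 = len(ascii_text.encode("utf-16-le"))   # 2 bytes per character

han_ratio = han_utf8 / han_ucs2        # UTF-8 is larger for Han text
ascii_ratio = ascii_ucs2 / ascii_utf8  # UCS-2 is larger for ASCII text
```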

As I've already said, I don't think this is an argument in favor of
either representation.  Sometimes one wins, sometimes the other.  I
don't think supplying both is a great idea, although I've proposed it
myself for XEmacs (but made as opaque as possible).

 > > In Python it's true that markers can use the same data structure as
 > > integers and simply provide different methods, and it's arguable that
 > > Python's design is better.  But if you use bytes internally, then you
 > > have problems.
 > No, you just have design questions.

Call them what you like, they're as yet unanswered.  In any given
editing scenario, I'd concede that it's a "SMOD".  But if you're
designing a language for text processing, it's a restriction that I
believe to be a hindrance to applications.  Many applications may
prefer to use a straightforward array implementation of text and focus
their design efforts on the real problems of their use cases.

 > > Do you expose that byte value to the user?  If so, what do you do
 > > if the user specifies a byte value that points into a multibyte
 > > character?
 > Go to the beginning of the multibyte character.  Report that
 > position; if the user then asks the requested marker object for its
 > position, it will report that byte offset, not the
 > originally-requested one.  (Obviously, do the same thing for
 > surrogate pair code points.)

I will guarantee that some use cases will prefer that you go to the
beginning of the *next* character.  For an obvious example, your
algorithm will infloop if you iterate "pos += 1".  (And the opposite
problem appears for "beginning of next character" combined with
"pos -= 1".)  Of course this trivial example is easily addressed by
saying "the user should be using the character iterator API here", but
I expect the issue can arise where that is not an easy answer.  Either
the API becomes complex, or the user/developers will have to do
complex bookkeeping that should be done by the text implementation.
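
The non-terminating iteration can be sketched as follows, assuming a UTF-8 byte buffer and a hypothetical snap_to_char_start helper that implements the "go to the beginning of the multibyte character" rule. The loop is capped so the sketch terminates.

```python
def snap_to_char_start(buf, pos):
    """Move pos back to the first byte of the UTF-8 character containing it."""
    # Continuation bytes have the bit pattern 10xxxxxx.
    while pos > 0 and (buf[pos] & 0xC0) == 0x80:
        pos -= 1
    return pos

buf = "aé b".encode("utf-8")   # b'a\xc3\xa9 b'; 'é' occupies bytes 1 and 2

pos = 0
steps = 0
# Naive "pos += 1" iteration; capped at 5 steps because it never advances
# past the two-byte character.
while pos < len(buf) - 1 and steps < 5:
    pos = snap_to_char_start(buf, pos + 1)
    steps += 1
```

After the cap is reached, pos is still 1: every increment lands on the continuation byte and is snapped back to the start of the same character.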

Nor is it obvious that surrogate pairs will be present in a UCS-2
representation.  Specifically, they can be encoded to single private
space characters in almost all applications, at a very small cost in

 > > What if the user wants to specify position by number of
 > > characters?
 > Part of the point that we are trying to make here is that nobody
 > really cares about that use-case.  In order to know anything useful
 > about a position in a text, you have to have traversed to that
 > location in the text.

Binary search of an ordered text is useful.  Granted, this
particular example can be addressed usefully in terms of byte
positions (viz. your example of less), but your basic premise is

 > You can remember interesting things like the offsets of starts of
 > lines, or the x/y positions of characters.
 > > Can you translate efficiently?
 > No, because there's no point :).  But you _could_ implement an
 > overlay that cached things like the beginning of lines, or the x/y
 > positions of interesting characters.

Emacs does, and a lot of effort has gone into it, and it still sucks
compared to an array representation.  Maybe _you_ _could_ do better,
but as yet we haven't managed to pull it off. :-(
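
For illustration, the simplest form of such an overlay is a sorted list of line-start offsets searched with bisect. This is only a sketch with hypothetical function names; it is not how Emacs or XEmacs implement their caches, and it ignores the hard part, which is keeping the cache valid while the text is edited.

```python
import bisect

def line_starts(text):
    """Return the offsets at which each line of text begins."""
    starts = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            starts.append(i + 1)
    return starts

def line_of(starts, offset):
    """Map a character offset to a zero-based line number via binary search."""
    return bisect.bisect_right(starts, offset) - 1

text = "ab\ncd\nef"
starts = line_starts(text)
```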

 > > But I think it would be hard to implement an efficient
 > > text-processing *language*, eg, a Python module for *full
 > > conformance* in handling Unicode, on top of UTF-8.
 > Still: why?  I guess if I have some free time I'll try my hand at
 > it, and maybe I'll run into a wall and realize you're right :).

I'd rather have you make it plausible to me that there's no point in
having efficient access to arbitrary character positions.  Then maybe
you can delegate that implementation to me. :-)  But my Emacs
experience says otherwise, and IIUC the intuition and/or experience of
MAL and Guido says this is not a YAGNI.

 > > Any time you have an algorithm that requires efficient access to
 > > arbitrary text positions, you'll spend all your skull sweat
 > > fighting the representation.  At least, that's been my experience
 > > with Emacsen.
 > What sort of algorithm would that be, though?  The main thing that
 > I could think of is a text editor trying to efficiently allow the
 > user to scroll to the middle of a large file without reading the
 > whole thing into memory.

Reading into memory or not is a red herring, I think.  For many legacy
encodings you have to pretty much read the whole thing because they
are stateful, and it's just not very expensive compared to the text
processing itself (unless your application is shoveling octets as fast
as possible, in which case character positions are indeed a YAGNI).

The question is whether opaque markers are always sufficient.  For
example, XEmacs does use byte positions internally for markers and
extents (objects representing regions of text that can carry arbitrary
properties but are tuned for display properties).  Obviously, we have
the marker objects you propose as sufficient, and indeed the
representation is as efficient as you claim.  However, these positions
are not exposed as integers to end users, Lisp, or even most of the C
code.  If a client (end user or code) requests a position, they get a
character position.
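
The translation a client triggers when it asks for a position can be sketched as below, assuming a UTF-8 buffer. The function name is hypothetical, and the scan is deliberately naive: it costs O(n) per request, which is the cost a position cache tries to avoid.

```python
def byte_to_char_pos(buf, byte_pos):
    """Count the characters that start before byte_pos in a UTF-8 buffer."""
    # Every byte that is not a continuation byte (10xxxxxx) starts a character.
    return sum(1 for b in buf[:byte_pos] if (b & 0xC0) != 0x80)

text = "héllo"
buf = text.encode("utf-8")      # b'h\xc3\xa9llo'
char_pos = byte_to_char_pos(buf, 3)   # byte 3 is the first 'l'
```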

Such requests are frequent enough that they constitute a major drag on
many practical applications.  It may be that this is unnecessary, as
less shows for its application.  But less is not an editor, let alone
a language for writing editors.

Do you know of an editor language of power comparable to Emacs Lisp
that is not based on an array representation of text?

 > Is it really the representation as byte positions which is fragile
 > (i.e. the internal implementation detail), or the exposure of that
 > position to calling code, and the idiomatic usage of that number as
 > an integer?

It's the latter.  Sufficient effort can make it safe to use byte
positions, and the effort is not all that great as long as you don't
demand efficiency.  XEmacs vs. Emacs implementation of Mule
demonstrates that.

We at XEmacs never did expose byte positions to even the C code (other
than to buffer and string methods), and that implementation has not
had to change much, if at all, in 15 years.  The caching mechanism to
make character position access reasonably efficient, however, has been
buggy and not so efficient, and so complex that RMS said "I was going
to implement your [position cache] in Emacs but it was too hard for me
to understand".  (OTOH, the alternative Emacs had implemented turned
out to be O(n**2) or worse, so he had to replace it.  Translating byte
positions to character positions seems to be a real loser.)

Emacs did expose byte positions for efficiency reasons, and has had at
least four regressions of the "\201 bug".  "\201" prefixes a Latin-1
character in internal code, and code that treated byte positions would
often result in this being duplicated because all trailing bytes in
Mule code are also Latin-1 code points.  (Don't ask me about the exact
mechanism, XEmacs's implementation is quite different and never
suffered from this bug.)

Note that a \201-like bug is very unlikely to occur in Python's UCS-2
representation because the semantics of surrogate values in Unicode is
unambiguous.  However, I believe similar bugs would be possible in a
UTF-8 representation -- if code is allowed to choose whether to view
UTF-8 in binary or text mode -- because trailing byte values are
Latin-1 code points.  Maybe I'm just an old granny, scared of my

[1]  I have no objection to providing "text" algorithms (such as
regexps) for use on "binary" data.  But then they don't provide any
guarantees that transformations of purported text remain text.

From ncoghlan at  Sat Nov 27 11:51:38 2010
From: ncoghlan at (Nick Coghlan)
Date: Sat, 27 Nov 2010 20:51:38 +1000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Thu, Nov 25, 2010 at 3:41 AM, Michael Foord
<fuzzyman at> wrote:
> Can you explain what you see as the difference?
> I'm not particularly interested in type validation but I like the fact that
> typical enum APIs allow you to group constants: the generated constant class
> acts as a namespace for all the defined constants.

The problem with blessing one particular "enum API" is that people
have so many different ideas as to what an enum API should look like.

However, the one thing they all have in common is the ability to take
a value and give it a name, then present *both* of those in debugging

> Are you just suggesting something along the lines of:
> class NamedConstant(int):
>     def __new__(cls, name, val):
>         return int.__new__(cls, val)
>     def __init__(self, name, val):
>         self._name = name
>     def __repr__(self):
>         return '<NamedConstant %s>' % self._name
> FOO = NamedConstant('FOO', 3)
> In general the less features the better, but I'd like a few more features
> than that. :-)

Not quite. I'm suggesting a factory function that works for any value,
and derives the parent class from the type of the supplied value.
However, what you wrote is still the essence of the idea - we would be
primarily providing a building block that makes it easier for people
to *create* enum APIs if they want to, but for simple use cases (where
all they really wanted was the enhanced debugging information) they
wouldn't need to bother. In the standard library, wherever we do
"enum-like things" we would switch to using named values where it
makes sense to do so.

Doing so may actually make sense for more than just constants - it may
make sense for significant mutable globals as well.

# Implementation (more than just a sketch, since it handles some
# interesting corner cases)
import functools
@functools.lru_cache()
def _make_named_value_type(base_type):
    class _NamedValueType(base_type):
        def __new__(cls, name, value):
            return base_type.__new__(cls, value)
        def __init__(self, name, value):
            self.__name = name
            super().__init__(value)
        @property
        def _name(self):
            return self.__name
        def _raw(self):
            return base_type(self)
        def __repr__(self):
            return "{}={}".format(self._name, super().__repr__())
        if base_type.__str__ is object.__str__:
            __str__ = base_type.__repr__
    _NamedValueType.__name__ = "Named<{}>".format(base_type.__name__)
    return _NamedValueType

def named_value(name, value):
    return _make_named_value_type(type(value))(name, value)

def set_named_values(namespace, **kwds):
    for k, v in kwds.items():
        namespace[k] = named_value(k, v)

x = named_value("FOO", 1)
y = named_value("BAR", "Hello World!")
z = named_value("BAZ", dict(a=1, b=2, c=3))

print(x, y, z, sep="\n")
print("\n".join(map(repr, (x, y, z))))
print("\n".join(map(str, map(type, (x, y, z)))))

set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw())
print("\n".join(map(repr, (foo, bar, baz))))
print(type(x) is type(foo), type(y) is type(bar), type(z) is type(baz))


# Session output for the last 6 lines
>>> print(x, y, z, sep="\n")
1
Hello World!
{'a': 1, 'c': 3, 'b': 2}

>>> print("\n".join(map(repr, (x, y, z))))
FOO=1
BAR='Hello World!'
BAZ={'a': 1, 'c': 3, 'b': 2}

>>> print("\n".join(map(str, map(type, (x, y, z)))))
<class '__main__.Named<int>'>
<class '__main__.Named<str>'>
<class '__main__.Named<dict>'>

>>> set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw())
>>> print("\n".join(map(repr, (foo, bar, baz))))
foo=1
bar='Hello World!'
baz={'a': 1, 'c': 3, 'b': 2}

>>> print(type(x) is type(foo), type(y) is type(bar), type(z) is type(baz))
True True True

For "normal" use, such objects would look like ordinary instances of
their class. They would only behave differently when their
representation is printed (prepending their name), or when their type
is interrogated (being an instance of the named subclass rather than
the ordinary type).
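
On current Python 3 releases object.__init__ raises TypeError when it is handed extra arguments, so an unconditional super().__init__(value) call fails for immutable bases such as int and str. The sketch below is one possible adaptation of the same factory idea, not the posted implementation: it forwards the value only when the base type defines its own __init__. The names make_named_type and _named_value_name are assumptions introduced for illustration.

```python
import functools

@functools.lru_cache(maxsize=None)
def make_named_type(base_type):
    """Build (and cache) a subclass of base_type that remembers a name."""
    class NamedValue(base_type):
        def __new__(cls, name, value):
            return base_type.__new__(cls, value)

        def __init__(self, name, value):
            self._named_value_name = name
            # Mutable bases such as dict populate themselves in __init__;
            # int and str inherit object.__init__, which takes no arguments.
            if base_type.__init__ is not object.__init__:
                super().__init__(value)

        @property
        def _name(self):
            return self._named_value_name

        def _raw(self):
            return base_type(self)

        def __repr__(self):
            return "{}={}".format(self._name, super().__repr__())

        # Keep str() showing the plain value when the base has no own __str__.
        if base_type.__str__ is object.__str__:
            __str__ = base_type.__repr__

    NamedValue.__name__ = "Named<{}>".format(base_type.__name__)
    return NamedValue

def named_value(name, value):
    return make_named_type(type(value))(name, value)

x = named_value("FOO", 1)
z = named_value("BAZ", {"a": 1})
```

The cache is what makes type(x) identical for every named int; without it each call would create a fresh subclass.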


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Nov 27 13:05:32 2010
From: ncoghlan at (Nick Coghlan)
Date: Sat, 27 Nov 2010 22:05:32 +1000
Subject: [Python-Dev] [Preview] Comments and change proposals on
In-Reply-To: <icjsal$eqk$>
References: <icjsal$eqk$>
Message-ID: <>

On Thu, Nov 25, 2010 at 6:24 AM, Georg Brandl <g.brandl at> wrote:
> Hi,
> at <>, you can look at a version of the 3.2
> docs that has the upcoming commenting feature.  JavaScript is mandatory.

Very nice!

I'm not sure what to do about the discoverability of the comment
bubbles at the end of each paragraph. I initially thought commenting
wasn't available on What's New or the Using Python docs until seeing
where the blue comment bubbles appeared in the math module docs.

A discreet notice at the bottom of the sidebar and/or an explanation
at the "Report a Bug" page may cover it I guess.

> Please test on a smaller page, such as <>,
> there is currently a speed issue with larger pages.  (Helpful tips from
> JS experts are welcome.)

I gave the JS a fair few comments on the first paragraph to digest. I
also put my detailed UI comments there as well (I needed something to
write about while testing, so I figured I may as well make it useful
to you!)

> Other things I have to do before this can go live:
> * reuse existing logins from either wiki or tracker?

Tracker sounds like the best bet to me.

> Any feedback is appreciated (I'd suggest mailing it to doc-SIG only, to avoid
> cluttering up python-dev).

My comments on the math module may give you a chance to see how
easy it is to get text out of comments into a form suitable for
sending to a mailing list or posting to a tracker issue for further
discussion :)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Nov 27 13:17:31 2010
From: ncoghlan at (Nick Coghlan)
Date: Sat, 27 Nov 2010 22:17:31 +1000
Subject: [Python-Dev] [Python-checkins] r86745 - in
 python/branches/py3k: Doc/library/difflib.rst Lib/
 Lib/test/ Misc/NEWS
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Nov 25, 2010 at 4:12 PM, terry.reedy <python-checkins at> wrote:
>  The :class:`SequenceMatcher` class has this constructor:
> -.. class:: SequenceMatcher(isjunk=None, a='', b='')
> +.. class:: SequenceMatcher(isjunk=None, a='', b='', autojunk=True)
>    Optional argument *isjunk* must be ``None`` (the default) or a one-argument
>    function that takes a sequence element and returns true if and only if the
> @@ -340,6 +349,9 @@
>    The optional arguments *a* and *b* are sequences to be compared; both default to
>    empty strings.  The elements of both sequences must be :term:`hashable`.
> +   The optional argument *autojunk* can be used to disable the automatic junk
> +   heuristic.
> +

Catching up on checkins traffic, so a later checkin may already fix
this, but there should be a versionchanged tag in the docs to note
when the autojunk parameter was added.
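
For readers who have not met the parameter, a minimal usage sketch follows. According to the difflib documentation the heuristic applies to sequences of at least 200 items and treats an item as junk when its duplicates make up more than 1% of the sequence; the sample strings here are assumptions chosen to be long and repetitive enough to trigger it.

```python
import difflib

# Assumed sample input: 300 items built from only two distinct characters.
a = "ab" * 150
b = "ab" * 150

# Default behaviour: the automatic junk heuristic is enabled.
with_heuristic = difflib.SequenceMatcher(None, a, b).ratio()

# Same comparison with the heuristic switched off.
without_heuristic = difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()
```

With the heuristic off, two identical sequences score a ratio of 1.0; the default setting may report a lower similarity for input like this.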


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Nov 27 13:22:50 2010
From: ncoghlan at (Nick Coghlan)
Date: Sat, 27 Nov 2010 22:22:50 +1000
Subject: [Python-Dev] [Python-checkins] r86750 -
In-Reply-To: <20101126021524.GA1450@rubuntu>
References: <>
	<> <20101126021524.GA1450@rubuntu>
Message-ID: <>

On Fri, Nov 26, 2010 at 12:15 PM, Senthil Kumaran <orsenthil at> wrote:
>> Re: "colour": the rest of the file use US English, as do the function
>> names (see for example curses.has_color).  It's good to use one dialect
>> consistently in one file.
> Good catch. Did not realize it because, we write it as colour too.
> Changing it.

I just resign myself to having to spell words like colour and
serialise wrong when I'm working on Python. Compared to the
adjustments the non-native English speakers have to make, I figure I'm
getting off lightly ;)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From fuzzyman at  Sat Nov 27 13:52:40 2010
From: fuzzyman at (Michael Foord)
Date: Sat, 27 Nov 2010 12:52:40 +0000
Subject: [Python-Dev] [Python-checkins] r86750
	-	python/branches/py3k/Demo/curses/
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 27/11/2010 12:22, Nick Coghlan wrote:
> On Fri, Nov 26, 2010 at 12:15 PM, Senthil Kumaran<orsenthil at>  wrote:
>>> Re: "colour": the rest of the file use US English, as do the function
>>> names (see for example curses.has_color).  It's good to use one dialect
>>> consistently in one file.
>> Good catch. Did not realize it because, we write it as colour too.
>> Changing it.
> I just resign myself to having to spell words like colour and
> serialise wrong when I'm working on Python. Compared to the
> adjustments the non-native English speakers have to make, I figure I'm
> getting off lightly ;)

I *thought* that the Python policy was that English speakers wrote 
documentation in English and American speakers wrote documentation in 
American and that we *don't* insist on US spellings in the Python 


> Cheers,
> Nick.


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From eliben at  Sat Nov 27 14:00:27 2010
From: eliben at (Eli Bendersky)
Date: Sat, 27 Nov 2010 15:00:27 +0200
Subject: [Python-Dev] [Python-checkins] r86745 - in
 python/branches/py3k: Doc/library/difflib.rst Lib/
 Lib/test/ Misc/NEWS
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Nov 27, 2010 at 14:17, Nick Coghlan <ncoghlan at> wrote:

> On Thu, Nov 25, 2010 at 4:12 PM, terry.reedy <python-checkins at>
> wrote:
> >  The :class:`SequenceMatcher` class has this constructor:
> >
> >
> > -.. class:: SequenceMatcher(isjunk=None, a='', b='')
> > +.. class:: SequenceMatcher(isjunk=None, a='', b='', autojunk=True)
> >
> >    Optional argument *isjunk* must be ``None`` (the default) or a
> one-argument
> >    function that takes a sequence element and returns true if and only if
> the
> > @@ -340,6 +349,9 @@
> >    The optional arguments *a* and *b* are sequences to be compared; both
> default to
> >    empty strings.  The elements of both sequences must be
> :term:`hashable`.
> >
> > +   The optional argument *autojunk* can be used to disable the automatic
> junk
> > +   heuristic.
> > +
> Catching up on checkins traffic, so a later checkin may already fix
> this, but there should be a versionchanged tag in the docs to note
> when the autojunk parameter was added.

Hi Nick,

Since autojunk was added in 2.7.1 (the docs of which do indicate this in a
versionchanged tag), I think Terry may have left the tag in 3.2 out on
purpose. That said, personally I don't know what the policy is regarding
features added just in 3.2 and 2.7 (and that didn't exist in 3.1) in this
respect.

From fuzzyman at  Sat Nov 27 14:02:36 2010
From: fuzzyman at (Michael Foord)
Date: Sat, 27 Nov 2010 13:02:36 +0000
Subject: [Python-Dev] [Python-checkins] r86745 - in
 python/branches/py3k: Doc/library/difflib.rst Lib/
 Lib/test/ Misc/NEWS
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 27/11/2010 13:00, Eli Bendersky wrote:
> On Sat, Nov 27, 2010 at 14:17, Nick Coghlan <ncoghlan at> wrote:
>     On Thu, Nov 25, 2010 at 4:12 PM, terry.reedy <python-checkins at>
>     wrote:
>     >  The :class:`SequenceMatcher` class has this constructor:
>     >
>     >
>     > -.. class:: SequenceMatcher(isjunk=None, a='', b='')
>     > +.. class:: SequenceMatcher(isjunk=None, a='', b='', autojunk=True)
>     >
>     >    Optional argument *isjunk* must be ``None`` (the default) or
>     a one-argument
>     >    function that takes a sequence element and returns true if
>     and only if the
>     > @@ -340,6 +349,9 @@
>     >    The optional arguments *a* and *b* are sequences to be
>     compared; both default to
>     >    empty strings.  The elements of both sequences must be
>     :term:`hashable`.
>     >
>     > +   The optional argument *autojunk* can be used to disable the
>     automatic junk
>     > +   heuristic.
>     > +
>     Catching up on checkins traffic, so a later checkin may already fix
>     this, but there should be a versionchanged tag in the docs to note
>     when the autojunk parameter was added.
> Hi Nick,
> Since autojunk was added in 2.7.1 (the docs of which do indicate this 
> is the versionchanged tag), I think Terry may have left the tag in 3.2 
> out on purpose. That said, personally I don't know what the policy is 
> regarding features added just in 3.2 and 2.7 (and didn't exist in 3.1) 
> in this respect.

Features new in Python 3.2 that didn't exist in 3.1 should have a 
versionadded:: 3.2 tag.


> Eli



From fuzzyman at  Sat Nov 27 15:01:22 2010
From: fuzzyman at (Michael Foord)
Date: Sat, 27 Nov 2010 14:01:22 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 27/11/2010 10:51, Nick Coghlan wrote:
> On Thu, Nov 25, 2010 at 3:41 AM, Michael Foord
> <fuzzyman at>  wrote:
>> Can you explain what you see as the difference?
>> I'm not particularly interested in type validation but I like the fact that
>> typical enum APIs allow you to group constants: the generated constant class
>> acts as a namespace for all the defined constants.
> The problem with blessing one particular "enum API" is that people
> have so many different ideas as to what an enum API should look like.

There actually seemed to be quite a bit of agreement around basic 
functionality though.

> However, the one thing they all have in common is the ability to take
> a value and give it a name, then present *both* of those in debugging
> information.

And this is the most important functionality. I would say that the 
grouping (namespacing) of constants is also useful, provided by *most* 
Python enum APIs and easy to implement without over complexifying the API.

(Note that there is no *particular* hurry to get this into 3.2 - the 
beta is due imminently. I wouldn't object to it )

>> Are you just suggesting something along the lines of:
>> class NamedConstant(int):
>>     def __new__(cls, name, val):
>>         return int.__new__(cls, val)
>>     def __init__(self, name, val):
>>         self._name = name
>>     def __repr__(self):
>>         return '<NamedConstant %s>' % self._name
>> FOO = NamedConstant('FOO', 3)
>> In general the less features the better, but I'd like a few more features
>> than that. :-)
> Not quite. I'm suggesting a factory function that works for any value,
> and derives the parent class from the type of the supplied value.
> However, what you wrote is still the essence of the idea - we would be
> primarily providing a building block that makes it easier for people
> to *create* enum APIs if they want to, but for simple use cases (where
> all they really wanted was the enhanced debugging information) they
> wouldn't need to bother. In the standard library, wherever we do
> "enum-like things" we would switch to using named values where it
> makes sense to do so.
> Doing so may actually make sense for more than just constants - it may
> make sense for significant mutable globals as well.

Very interesting proposal (typed named values rather than just named 
constants). It doesn't handle flag values, which I would still like, but 
that only really makes sense for integers (sets can be OR'd but their 
representation is already understandable). Perhaps the integer named 
type could be special cased for that.
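
One way the integer special case might look is sketched below. This is a hypothetical illustration of the flag idea, not part of the proposed implementation: NamedFlag and the sample flags are invented names, and only the | operator is handled.

```python
class NamedFlag(int):
    """Hypothetical int subclass whose repr shows the combined flag names."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __or__(self, other):
        if isinstance(other, NamedFlag):
            # Combine both the names and the underlying integer values.
            combined = "{}|{}".format(self._name, other._name)
            return NamedFlag(combined, int(self) | int(other))
        # Fall back to plain integer behaviour for unnamed operands.
        return int(self) | other

    def __repr__(self):
        return "<{}: {}>".format(self._name, int(self))

READ = NamedFlag("READ", 1)
WRITE = NamedFlag("WRITE", 2)
both = READ | WRITE
```

The combined value still compares equal to the plain integer 3, so code that only cares about the number is unaffected.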

Without the grouping functionality (associating a bunch of names 
together) you lose the 'from_name' functionality. Guido was in favour of 
this, and it is an obvious feature where you have grouping:

"""I expect that the API to convert between enums and bare ints should be
i = int(e) and e = <enumclass>(i). It would be nice if s = str(e) and
e = <enumclass>(s) would work too."""

This wouldn't work with your suggested implementation (as it is). 
Grouping and mutable "named values" could be inefficient and have issues 
around identity / equality. Maybe restrict the API to the immutable 

All the best,

> ==========================================================================
> # Implementation (more than just a sketch, since it handles some
> interesting corner cases)
> import functools
> @functools.lru_cache()
> def _make_named_value_type(base_type):
>      class _NamedValueType(base_type):
>          def __new__(cls, name, value):
>              return base_type.__new__(cls, value)
>          def __init__(self, name, value):
>              self.__name = name
>              super().__init__(value)
>          @property
>          def _name(self):
>              return self.__name
>          def _raw(self):
>              return base_type(self)
>          def __repr__(self):
>              return "{}={}".format(self._name, super().__repr__())
>          if base_type.__str__ is object.__str__:
>              __str__ = base_type.__repr__
>      _NamedValueType.__name__ = "Named<{}>".format(base_type.__name__)
>      return _NamedValueType
> def named_value(name, value):
>      return _make_named_value_type(type(value))(name, value)
> def set_named_values(namespace, **kwds):
>      for k, v in kwds.items():
>          namespace[k] = named_value(k, v)
> x = named_value("FOO", 1)
> y = named_value("BAR", "Hello World!")
> z = named_value("BAZ", dict(a=1, b=2, c=3))
> print(x, y, z, sep="\n")
> print("\n".join(map(repr, (x, y, z))))
> print("\n".join(map(str, map(type, (x, y, z)))))
> set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw())
> print("\n".join(map(repr, (foo, bar, baz))))
> print(type(x) is type(foo), type(y) is type(bar), type(z) is type(baz))
> ==========================================================================
> # Session output for the last 6 lines
>>>> print(x, y, z, sep="\n")
> 1
> Hello World!
> {'a': 1, 'c': 3, 'b': 2}
>>>> print("\n".join(map(repr, (x, y, z))))
> FOO=1
> BAR='Hello World!'
> BAZ={'a': 1, 'c': 3, 'b': 2}
>>>> print("\n".join(map(str, map(type, (x, y, z)))))
> <class '__main__.Named<int>'>
> <class '__main__.Named<str>'>
> <class '__main__.Named<dict>'>
>>>> set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw())
>>>> print("\n".join(map(repr, (foo, bar, baz))))
> foo=1
> bar='Hello World!'
> baz={'a': 1, 'c': 3, 'b': 2}
>>>> print(type(x) is type(foo), type(y) is type(bar), type(z) is type(baz))
> True True True
> For "normal" use, such objects would look like ordinary instances of
> their class. They would only behave differently when their
> representation is printed (prepending their name), or when their type
> is interrogated (being an instance of the named subclass rather than
> the ordinary type).
> Cheers,
> Nick.


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From ncoghlan at  Sat Nov 27 15:58:08 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 28 Nov 2010 00:58:08 +1000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Sun, Nov 28, 2010 at 12:01 AM, Michael Foord
<fuzzyman at> wrote:
> Very interesting proposal (typed named values rather than just named
> constants). It doesn't handle flag values, which I would still like, but
> that only really makes sense for integers (sets can be OR'd but their
> representation is already understandable). Perhaps the integer named type
> could be special cased for that.
> Without the grouping functionality (associating a bunch of names together)
> you lose the 'from_name' functionality. Guido was in favour of this, and it
> is an obvious feature where you have grouping:
> """I expect that the API to convert between enums and bare ints should be
> i = int(e) and e = <enumclass>(i). It would be nice if s = str(e) and
> e = <enumclass>(s) would work too."""

Note that the "i = int(e)" and "s = str(e)" parts of Guido's
expectation do work (they are, in fact, the underlying implementation
of the _raw() method), so an enum class would only be needed to
provide the other half of the equation. The named values have no
opinion on equivalence at all (they just defer to the parent class),
but change the rules for identity (which are always murky anyway,
since caching is optional even for immutable types).
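
A minimal sketch of the behaviour described above, written for current
Python 3 rather than copied from the thread. The named_value helper here is
a simplified stand-in for the earlier implementation; skipping the
__init__ arguments for immutable bases is an assumption of this sketch,
needed because object.__init__ now rejects extra arguments. It shows a named
value comparing and hashing like its parent type while being a distinct
object of a distinct type.

```python
import functools


@functools.lru_cache(maxsize=None)
def _make_named_value_type(base_type):
    class _NamedValueType(base_type):
        def __new__(cls, name, value):
            return base_type.__new__(cls, value)

        def __init__(self, name, value):
            self._name = name
            # Immutable bases (int, str) inherit object.__init__, which
            # rejects extra arguments; only bases that define their own
            # __init__ (dict, list, ...) receive the value here.
            if base_type.__init__ is not object.__init__:
                base_type.__init__(self, value)

        def _raw(self):
            # Convert back to a plain instance of the parent type
            return base_type(self)

        def __repr__(self):
            return "{}={}".format(self._name, base_type.__repr__(self))

        # Types without their own __str__ print as their plain repr
        if base_type.__str__ is object.__str__:
            __str__ = base_type.__repr__

    _NamedValueType.__name__ = "Named<{}>".format(base_type.__name__)
    return _NamedValueType


def named_value(name, value):
    return _make_named_value_type(type(value))(name, value)


x = named_value("FOO", 1)
plain = 1
print(repr(x))                                          # FOO=1
print(str(x), int(x))                                   # 1 1
print(x == plain, hash(x) == hash(plain))               # True True
print(x is plain, type(x) is int, isinstance(x, int))   # False False True
```

Equality and hashing are inherited untouched from int, so a named value can
stand in for the plain value as a dict key; only identity and the exact type
differ.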

> This wouldn't work with your suggested implementation (as it is). Grouping
> and mutable "named values" could be inefficient and have issues around
> identity / equality. Maybe restrict the API to the immutable primitives.

My proposal doesn't say anything about grouping at all - it's just an
idea for "here's a standard way to associate a canonical name with a
particular object, independent of the namespaces that happen to
reference that object".

Now, a particular *grouping* API may want to restrict itself in
various ways, but that's my point. We should be looking at a standard
solution for the ground level problem (i.e. the idea named_value
attempts to solve) and then let various 3rd party enum/name grouping
implementations flourish on top of that, rather than trying to create
an all-singing all-dancing "value grouping" API (which is going to be
far more intrusive than a simple API for "here's a way to give your
constants and important data structures names that show up in their

For example, using named_value as a primitive, you can fairly easily do:

class Namegroup:
    # Missing lots of niceties of a real enum class, but shows the idea
    # as to how a real implementation could leverage named_value
    def __init__(self, _groupname, **kwds):
        self._groupname = _groupname
        pattern = _groupname + ".{}"
        self._value_map = {}
        for k, v in kwds.items():
            attr = named_value(pattern.format(k), v)
            setattr(self, k, attr)
            self._value_map[v] = attr
    @classmethod
    def from_names(cls, groupname, *args):
        kwds = dict(zip(args, range(len(args))))
        return cls(groupname, **kwds)
    def __call__(self, arg):
        return self._value_map[arg]

silly = Namegroup.from_names("Silly", "FOO", "BAR", "BAZ")

>>> silly.FOO
>>> int(silly.FOO)
>>> silly(0)

named_value deals with all the stuff to do with pretending to be the
original type of object (only with an associated name), leaving the
grouping API to deal with issues of creating groups of names and
mapping between them and the original values in various ways.
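
As a rough illustration of that division of labour, the sketch below layers
a grouping class on a simplified named_value and adds the name-to-member
lookup that Guido's quoted expectation asks for. The class name NameGroup
and the methods from_value and from_name are hypothetical names chosen
for this sketch, and the helper is a reduced version of the one earlier in
the thread, adjusted to run on current Python 3.

```python
import functools


@functools.lru_cache(maxsize=None)
def _make_named_value_type(base_type):
    class _NamedValueType(base_type):
        def __new__(cls, name, value):
            return base_type.__new__(cls, value)

        def __init__(self, name, value):
            self._name = name
            # Only bases with their own __init__ (dict, list) take the value
            if base_type.__init__ is not object.__init__:
                base_type.__init__(self, value)

        def __repr__(self):
            return "{}={}".format(self._name, base_type.__repr__(self))

    _NamedValueType.__name__ = "Named<{}>".format(base_type.__name__)
    return _NamedValueType


def named_value(name, value):
    return _make_named_value_type(type(value))(name, value)


class NameGroup:
    """Hypothetical grouping class; values must be hashable."""

    def __init__(self, groupname, **members):
        self._by_value = {}
        self._by_name = {}
        for name, value in members.items():
            member = named_value("{}.{}".format(groupname, name), value)
            setattr(self, name, member)
            self._by_value[value] = member
            self._by_name[name] = member

    @classmethod
    def from_names(cls, groupname, *names):
        # Number the names 0, 1, 2, ... in the order given
        return cls(groupname, **{name: i for i, name in enumerate(names)})

    def from_value(self, value):
        return self._by_value[value]

    def from_name(self, name):
        return self._by_name[name]


silly = NameGroup.from_names("Silly", "FOO", "BAR", "BAZ")
print(repr(silly.FOO))               # Silly.FOO=0
print(int(silly.BAR))                # 1
print(repr(silly.from_value(2)))     # Silly.BAZ=2
print(repr(silly.from_name("BAR")))  # Silly.BAR=1
```

Keeping the two lookups as separate methods avoids the ambiguity a single
call would have when the group's values are themselves strings.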


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Nov 27 16:04:17 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 28 Nov 2010 01:04:17 +1000
Subject: [Python-Dev] [Python-checkins] r86750 -
In-Reply-To: <>
References: <>
	<> <20101126021524.GA1450@rubuntu>
Message-ID: <>

On Sat, Nov 27, 2010 at 10:52 PM, Michael Foord
<fuzzyman at> wrote:
>> I just resign myself to having to spell words like colour and
>> serialise wrong when I'm working on Python. Compared to the
>> adjustments the non-native English speakers have to make, I figure I'm
>> getting off lightly ;)
> I *thought* that the Python policy was that English speakers wrote
> documentation in English and American speakers wrote documentation in
> American and that we *don't* insist on US spellings in the Python
> documentation?

If we're just talking about those things in general, then that's a
reasonable rule. But when in close proximity to an actual API that
uses the American spelling, or modifying a file that uses the relevant
word a lot, following the prevailing style is a definite courtesy to
the reader.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From fuzzyman at  Sat Nov 27 16:07:18 2010
From: fuzzyman at (Michael Foord)
Date: Sat, 27 Nov 2010 15:07:18 +0000
Subject: [Python-Dev] [Python-checkins] r86750 -
In-Reply-To: <>
References: <>	<>	<20101126021524.GA1450@rubuntu>	<>	<>
Message-ID: <>

On 27/11/2010 15:04, Nick Coghlan wrote:
> On Sat, Nov 27, 2010 at 10:52 PM, Michael Foord
> <fuzzyman at>  wrote:
>>> I just resign myself to having to spell words like colour and
>>> serialise wrong when I'm working on Python. Compared to the
>>> adjustments the non-native English speakers have to make, I figure I'm
>>> getting off lightly ;)
>> I *thought* that the Python policy was that English speakers wrote
>> documentation in English and American speakers wrote documentation in
>> American and that we *don't* insist on US spellings in the Python
>> documentation?
> If we're just talking about those things in general, then that's a
> reasonable rule. But when in close proximity to an actual API that
> uses the American spelling, or modifying a file that uses the relevant
> word a lot, following the prevailing style is a definite courtesy to
> the reader.
Ok, thanks. Sounds like a good guideline.


> Cheers,
> Nick.



From ncoghlan at  Sat Nov 27 16:07:35 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 28 Nov 2010 01:07:35 +1000
Subject: [Python-Dev] [Python-checkins] r86745 - in
 python/branches/py3k: Doc/library/difflib.rst Lib/
 Lib/test/ Misc/NEWS
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Nov 27, 2010 at 11:02 PM, Michael Foord
<fuzzyman at> wrote:
> Features new in Python 3.2 that didn't exist in 3.1 should have a
> versionadded:: 3.2 tag.

As Michael said, from a docs point of view, the version flow is
independent: "2.6 -> 2.7" and "3.1 -> 3.2".

The issue has really only come up with this release, since there was
no intervening 2.x release between 3.0 and 3.1.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From barry at  Sat Nov 27 19:22:16 2010
From: barry at (Barry Warsaw)
Date: Sat, 27 Nov 2010 13:22:16 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <20101127132216.533f7332@mission>

On Nov 27, 2010, at 02:01 PM, Michael Foord wrote:

>(Note that there is no *particular* hurry to get this into 3.2 - the beta is
>due imminently. I wouldn't object to it )

Indeed.  I don't think the time is right to try to get this into 3.2.


From anurag.chourasia at  Sat Nov 27 19:45:44 2010
From: anurag.chourasia at (Anurag Chourasia)
Date: Sun, 28 Nov 2010 00:15:44 +0530
Subject: [Python-Dev] Python make fails with error "Fatal Python error:
 Interpreter not initialized (version mismatch?)"
Message-ID: <>

Hi All,

During the make step of python, I am encountering a weird error. This is on
AIX 5.3 using gcc as the compiler.

My configuration options are as follows

./configure --enable-shared --disable-ipv6 --with-gcc=gcc CPPFLAGS="-I
/opt/freeware/include -I /opt/freeware/include/readline -I
/opt/freeware/include/ncurses" LDFLAGS="-L. -L/usr/local/lib"

Below is the transcript from the make step.
running build
running build_ext
ldd: /lib/libreadline.a: File is an archive.
INFO: Can't locate Tcl/Tk libs and/or headers
building '_struct' extension
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall
-Wstrict-prototypes -I. -I/u01/home/apli/wm/GDD/Python-2.6.6/./Include -I.
-IInclude -I./Include -I/opt/freeware/include
-I/opt/freeware/include/readline -I/opt/freeware/include/ncurses
-I/usr/local/include -I/u01/home/apli/wm/GDD/Python-2.6.6/Include
-I/u01/home/apli/wm/GDD/Python-2.6.6 -c
/u01/home/apli/wm/GDD/Python-2.6.6/Modules/_struct.c -o
./Modules/ld_so_aix gcc -pthread -bI:Modules/python.exp -L. -L/usr/local/lib
-L. -L/usr/local/lib -lpython2.6 -o build/lib.aix-5.3-2.6/
*Fatal Python error: Interpreter not initialized (version mismatch?)*
*make: 1254-059 The signal code from the last command is 6.*

The last command that i see above (ld_so_aix) seems to have completed as the
file exists after this command and hence I am not sure which step
is failing.

There is no other Python version on my machine.

Please guide.

From tjreedy at  Sat Nov 27 21:50:11 2010
From: tjreedy at (Terry Reedy)
Date: Sat, 27 Nov 2010 15:50:11 -0500
Subject: [Python-Dev] [Python-checkins] r86745 - in
 python/branches/py3k: Doc/library/difflib.rst Lib/
 Lib/test/ Misc/NEWS
In-Reply-To: <>
References: <>
Message-ID: <>

On 11/27/2010 7:17 AM, Nick Coghlan wrote:
> On Thu, Nov 25, 2010 at 4:12 PM, terry.reedy<python-checkins at>  wrote:
>>   The :class:`SequenceMatcher` class has this constructor:
>> -.. class:: SequenceMatcher(isjunk=None, a='', b='')
>> +.. class:: SequenceMatcher(isjunk=None, a='', b='', autojunk=True)
>>     Optional argument *isjunk* must be ``None`` (the default) or a one-argument
>>     function that takes a sequence element and returns true if and only if the
>> @@ -340,6 +349,9 @@
>>     The optional arguments *a* and *b* are sequences to be compared; both default to
>>     empty strings.  The elements of both sequences must be :term:`hashable`.
>> +   The optional argument *autojunk* can be used to disable the automatic junk
>> +   heuristic.
>> +
> Catching up on checkins traffic, so a later checkin may already fix
> this, but there should be a versionchanged tag in the docs to note
> when the autojunk parameter was added.

Right. When S.C. forward-ported the 2.7 patch, he must have thought it 
not needed and I missed the difference between the diffs. Will add note 
in both places needed immediately.


From v+python at  Sat Nov 27 21:56:14 2010
From: v+python at (Glenn Linderman)
Date: Sat, 27 Nov 2010 12:56:14 -0800
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>
	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 11/27/2010 2:51 AM, Nick Coghlan wrote:
> Not quite. I'm suggesting a factory function that works for any value,
> and derives the parent class from the type of the supplied value.

Nick, thanks for the much better implementation than I achieved; you 
seem to have the same goals as my implementation.  I learned a bit 
making mine, and more understanding yours to some degree.  What I still 
don't understand about your implementation, is that when adding one 
additional line to your file, it fails:

w = named_value("ABC", z )

Now I can understand why it might not be a good thing to make a named 
value of a named value (confusing, at least), but I was surprised, and 
still do not understand, that it failed reporting the __new__() takes 
exactly 3 arguments (2 given).


From steve at  Sat Nov 27 23:11:44 2010
From: steve at (Steven D'Aprano)
Date: Sun, 28 Nov 2010 09:11:44 +1100
Subject: [Python-Dev] [Preview] Comments and change proposals
	on	documentation
In-Reply-To: <>
References: <icjsal$eqk$>
Message-ID: <>

Nick Coghlan wrote:
> On Thu, Nov 25, 2010 at 6:24 AM, Georg Brandl <g.brandl at> wrote:
>> Hi,
>> at <>, you can look at a version of the 3.2
>> docs that has the upcoming commenting feature.  JavaScript is mandatory.
> Very nice!
> I'm not sure what to do about the discoverability of the comment
> bubbles at the end of each paragraph. I initially thought commenting
> wasn't available on What's New or the Using Python docs until seeing
> where the blue comment bubbles appeared in the math module docs.

I wonder what the point of the comment bubbles is? This isn't a 
graphical UI where (contrary to popular opinion) a picture is *not* 
worth a thousand words, but may require a help-bubble to explain. This 
is text. If you want to make a comment on some text, the usual practice 
is to add more text :)

I wasn't able to find a comment bubble that contained anything, so I 
don't know what sort of information you expect them to contain -- every 
one I tried said "0 comments". But it seems to me that comments are 
superfluous, if not actively harmful:

(1) Anything important enough to tell the reader should be included in 
the text, where it can be easily seen, read and printed.

(2) Discovery is lousy -- not only do you need to be running Javascript, 
which many people do not for performance, privacy and convenience[*], 
but you have to carefully mouse-over the paragraph just to see the blue 
bubble, and THEN you have to *precisely* mouse-over the bubble itself.

(3) This will be a horrible and possibly even literally painful 
experience for anyone with a physical disability that makes precise 
positioning of the mouse difficult.

(4) Accessibility for the blind and those using screen readers will 
probably be non-existent.

(5) If the information in the comment bubbles is trivial enough that 
we're happy to say that the blind, the disabled and those who avoid 
Javascript don't need it, then perhaps *nobody* needs it.

[*] In my experience, websites tend to fall into two basic categories: 
those that don't work at all without Javascript, and those that run 
better, faster, and with fewer anti-features and inconveniences without it.


From g.brandl at  Sat Nov 27 23:37:29 2010
From: g.brandl at (Georg Brandl)
Date: Sat, 27 Nov 2010 23:37:29 +0100
Subject: [Python-Dev] [Preview] Comments and change proposals on
In-Reply-To: <>
References: <icjsal$eqk$>	<>
Message-ID: <ics17i$che$>

Am 27.11.2010 23:11, schrieb Steven D'Aprano:
> Nick Coghlan wrote:
>> On Thu, Nov 25, 2010 at 6:24 AM, Georg Brandl <g.brandl at> wrote:
>>> Hi,
>>> at <>, you can look at a version of the 3.2
>>> docs that has the upcoming commenting feature.  JavaScript is mandatory.
>> Very nice!
>> I'm not sure what to do about the discoverability of the comment
>> bubbles at the end of each paragraph. I initially thought commenting
>> wasn't available on What's New or the Using Python docs until seeing
>> where the blue comment bubbles appeared in the math module docs.
> I wonder what the point of the comment bubbles is? This isn't a 
> graphical UI where (contrary to popular opinion) a picture is *not* 
> worth a thousand words, but may require a help-bubble to explain. This 
> is text. If you want to make a comment on some text, the usual practice 
> is to add more text :)

Yes, I already mentioned that the bubbles could be replaced by text links
if they prove too confusing.

> I wasn't able to find a comment bubble that contained anything, so I 
> don't know what sort of information you expect them to contain -- every 
> one I tried said "0 comments".

Maybe you should have tried the page I recommended as a demo, and where Nick
made his comments? :)

> But it seems to me that comments are superfluous, if not actively harmful:

(I've not read anything about harmful below.  Was that just FUD?)

> (1) Anything important enough to tell the reader should be included in 
> the text, where it can be easily seen, read and printed.

Yes.  There need to be ways for the reader to feed back to the author
what they want to have included.  Currently, this is

I'm all for removing comments with suggestions once they have been
integrated in the main text.

> (2) Discovery is lousy -- not only do you need to be running Javascript, 
> which many people do not for performance, privacy and convenience[*], 

That is not an argument nowadays, seeing how many sites/web applications
require JS.  (Most people who deactivate JS globally maintain a whitelist
anyway, and can easily add to that.)

These comments are an optional feature and therefore do not need to
be accessible for 100% of users.

> but you have to carefully mouse-over the paragraph just to see the blue 
> bubble, and THEN you have to *precisely* mouse-over the bubble itself.

Bubbles are always shown for paragraphs *with* comments.

> (3) This will be a horrible and possibly even literally painful 
> experience for anyone with a physical disability that makes precise 
> positioning of the mouse difficult.

You're making this point just because of the size of the bubbles?  Well,
these users can register on the site and there can be a user preference
to display larger links instead (if we choose to keep the bubbles, anyway.)

> (4) Accessibility for the blind and those using screen readers will 
> probably be non-existent.

It will be the same as for other web apps using JavaScript.  Since I'm not
a professional user interface designer, I don't know what screen readers
can and cannot do.

> (5) If the information in the comment bubbles is trivial enough that 
> we're happy to say that the blind, the disabled and those who avoid 
> Javascript don't need it, then perhaps *nobody* needs it.

Sorry, but that is a nonsensical argument.  Apart from the questionable
notion that anything must be available to everyone to be worth anything,
it also doesn't consider that the comments are not only for fellow users:
as I said above, the comments are designed to be a very quick way to give
feedback to *us* developers.  (This is the reason for the "propose a
change" feature, for example.)

So even if only 30% of all users had access to the comments and could use
that to help us improve the documentation by submitting suggestions and
corrections they never would have bothered registering in the tracker for,
that would be a net gain.


From raymond.hettinger at  Sun Nov 28 00:26:13 2010
From: raymond.hettinger at (Raymond Hettinger)
Date: Sat, 27 Nov 2010 15:26:13 -0800
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>
	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On Nov 27, 2010, at 12:56 PM, Glenn Linderman wrote:

> On 11/27/2010 2:51 AM, Nick Coghlan wrote:
>> Not quite. I'm suggesting a factory function that works for any value,
>> and derives the parent class from the type of the supplied value.
> Nick, thanks for the much better implementation than I achieved; you seem to have the same goals as my implementation.  I learned a bit making mine, and more understanding yours to some degree.  What I still don't understand about your implementation, is that when adding one additional line to your file, it fails:
> w = named_value("ABC", z )
> Now I can understand why it might not be a good thing to make a named value of a named value (confusing, at least), but I was surprised, and still do not understand, that it failed reporting the __new__() takes exactly 3 arguments (2 given).

Can I suggest that an enum-maker be offered as a third-party module rather than prematurely adding it into the standard library.


From steve at  Sun Nov 28 00:58:52 2010
From: steve at (Steven D'Aprano)
Date: Sun, 28 Nov 2010 10:58:52 +1100
Subject: [Python-Dev] [Preview] Comments and change proposals
	on	documentation
In-Reply-To: <ics17i$che$>
References: <icjsal$eqk$>	<>	<>
Message-ID: <>

Georg Brandl wrote:
> Am 27.11.2010 23:11, schrieb Steven D'Aprano:

>> I wasn't able to find a comment bubble that contained anything, so I 
>> don't know what sort of information you expect them to contain -- every 
>> one I tried said "0 comments".
> Maybe you should have tried the page I recommended as a demo, and where Nick
> made his comments? :)

Aha! I never would have guessed that the bubbles are clickable -- I 
thought you just moused-over them and they showed static comments put 
there by the developers, part of the documentation itself. I didn't 
realise that it was for users to add spam^W comments to the page. With 
that perspective, I need to rethink.

Yes, I failed to fully read the instructions you sent, or understand 
them. That's what users do -- they don't read your instructions, and 
they misunderstand them. If your UI isn't easily discoverable, users 
will not be able to use it, and will be frustrated and annoyed. The user 
is always right, even when they're doing it wrong *wink*

>> But it seems to me that comments are superfluous, if not actively harmful:
> (I've not read anything about harmful below.  Was that just FUD?)

Lowering accessibility to parts of the documentation is what I was 
talking about when I said "actively harmful". But now that I have better 
understanding of what the comment system is actually for, I have to rethink.


From glenn at  Sun Nov 28 02:04:49 2010
From: glenn at (Glenn Linderman)
Date: Sat, 27 Nov 2010 17:04:49 -0800
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 11/27/2010 12:56 PM, Glenn Linderman wrote:
> On 11/27/2010 2:51 AM, Nick Coghlan wrote:
>> Not quite. I'm suggesting a factory function that works for any value,
>> and derives the parent class from the type of the supplied value.
> Nick, thanks for the much better implementation than I achieved; you 
> seem to have the same goals as my implementation.  I learned a bit 
> making mine, and more understanding yours to some degree.  What I 
> still don't understand about your implementation, is that when adding 
> one additional line to your file, it fails:
> w = named_value("ABC", z )
> Now I can understand why it might not be a good thing to make a named 
> value of a named value (confusing, at least), but I was surprised, and 
> still do not understand, that it failed reporting the __new__() takes 
> exactly 3 arguments (2 given).

OK, I puzzled out the error, and here is a "cure" of sorts.

         def __new__(cls, name, value):
             try:
                 return base_type.__new__(cls, value)
             except TypeError:
                 return base_type.__new__(cls, name, value)
         def __init__(self, name, value):
             self.__name = name
             try:
                 super().__init__(value)
             except TypeError:
                 super().__init__(name, value)

Probably it would be better for the except clause to raise a different 
type of error ( Can't recursively create named value ) or to cleverly 
bypass the intermediate named value, and simply apply a new name to the 
original value.  Hmm...  For this, only __new__ need be changed:

         def __new__(cls, name, value):
             try:
                 return base_type.__new__(cls, value)
             except TypeError:
                 return _make_named_value_type(type(value._raw()))(
                     name, value._raw())
         def __init__(self, name, value):
             self.__name = name

Thanks for not responding too quickly, I figured out more, and learned more.
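
For readers following along, here is a small sketch of where the error
comes from and of the "apply a new name to the original value" idea. It is
not the code from the thread: the helper is simplified to run on current
Python 3, and the function names naive_named_value and
renaming_named_value are hypothetical. The failure arises because the type
of an already-named value is itself a named type, so the new subclass calls
a parent __new__ that expects (cls, name, value) but passes only
(cls, value).

```python
import functools


@functools.lru_cache(maxsize=None)
def _make_named_value_type(base_type):
    class _NamedValueType(base_type):
        def __new__(cls, name, value):
            # With a plain base this is e.g. dict.__new__(cls, value).
            # With a named base it reaches the parent's
            # __new__(cls, name, value) one argument short.
            return base_type.__new__(cls, value)

        def __init__(self, name, value):
            self._name = name
            if base_type.__init__ is not object.__init__:
                base_type.__init__(self, value)

        def _raw(self):
            return base_type(self)

        def __repr__(self):
            return "{}={}".format(self._name, base_type.__repr__(self))

    _NamedValueType.__name__ = "Named<{}>".format(base_type.__name__)
    return _NamedValueType


def naive_named_value(name, value):
    return _make_named_value_type(type(value))(name, value)


def renaming_named_value(name, value):
    # Unwrap an existing named value first, so the new name is applied
    # to the original plain value instead of stacking subclasses.
    raw = getattr(value, "_raw", None)
    if callable(raw):
        value = raw()
    return _make_named_value_type(type(value))(name, value)


z = naive_named_value("BAZ", dict(a=1))

try:
    naive_named_value("ABC", z)
    nesting_failed = False
except TypeError as exc:
    nesting_failed = True
    print("nesting fails:", exc)

w = renaming_named_value("ABC", z)
print(repr(w))               # ABC={'a': 1}
print(type(w) is type(z))    # True
```

Detecting a named value by the presence of a _raw attribute is only a
convenience for this sketch; any unrelated object with a callable _raw would
be unwrapped as well.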

From ncoghlan at  Sun Nov 28 03:38:27 2010
From: ncoghlan at (Nick Coghlan)
Date: Sun, 28 Nov 2010 12:38:27 +1000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Sun, Nov 28, 2010 at 9:26 AM, Raymond Hettinger
<raymond.hettinger at> wrote:
> On Nov 27, 2010, at 12:56 PM, Glenn Linderman wrote:
>> On 11/27/2010 2:51 AM, Nick Coghlan wrote:
>>> Not quite. I'm suggesting a factory function that works for any value,
>>> and derives the parent class from the type of the supplied value.
>> Nick, thanks for the much better implementation than I achieved; you seem to have the same goals as my implementation.  I learned a bit making mine, and more understanding yours to some degree.  What I still don't understand about your implementation, is that when adding one additional line to your file, it fails:
>> w = named_value("ABC", z )
>> Now I can understand why it might not be a good thing to make a named value of a named value (confusing, at least), but I was surprised, and still do not understand, that it failed reporting the __new__() takes exactly 3 arguments (2 given).
> Can I suggest that an enum-maker be offered as a third-party module rather than prematurely adding it into the standard library.

Indeed. Glenn's failing example suggests to me that using a new
metaclass is probably going to be a cleaner option than trying to
dance around type's default behaviour within an ordinary class
definition (if nothing else, a separate metaclass makes it much easier
to detect when you're dealing with an instance of a named type).
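
One way the metaclass idea might look, as a sketch under stated
assumptions rather than a worked-out design: NamedValueMeta and is_named
are hypothetical names, and the helper is simplified to run on current
Python 3. Because every generated type shares the metaclass, detecting a
named value becomes a single isinstance check on its type, and named_value
can unwrap an existing named value before renaming it.

```python
import functools


class NamedValueMeta(type):
    """Hypothetical metaclass shared by every generated named type."""


def is_named(obj):
    # A value is "named" exactly when its type was built by the metaclass
    return isinstance(type(obj), NamedValueMeta)


@functools.lru_cache(maxsize=None)
def _make_named_value_type(base_type):
    class _NamedValueType(base_type, metaclass=NamedValueMeta):
        def __new__(cls, name, value):
            return base_type.__new__(cls, value)

        def __init__(self, name, value):
            self._name = name
            if base_type.__init__ is not object.__init__:
                base_type.__init__(self, value)

        def _raw(self):
            return base_type(self)

        def __repr__(self):
            return "{}={}".format(self._name, base_type.__repr__(self))

    _NamedValueType.__name__ = "Named<{}>".format(base_type.__name__)
    return _NamedValueType


def named_value(name, value):
    if is_named(value):
        value = value._raw()   # rename the underlying plain value
    return _make_named_value_type(type(value))(name, value)


z = named_value("BAZ", dict(a=1))
w = named_value("ABC", z)
print(repr(w))                           # ABC={'a': 1}
print(is_named(w), is_named({"a": 1}))   # True False
```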

Regardless, I still see value in approaching this whole discussion as
a two-level design problem, with "named values" as the more
fundamental concept, and then higher level grouping APIs to get
enum-style behaviour. Eventually attaining "One Obvious Way" for the
former seems achievable to me, while the diversity of use cases for
grouping APIs suggests to me that "one-size-fits-all" isn't going to
work unless that "one size" is a Frankenstein API with more options
than anyone could reasonably hope to keep in their head at once.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From tjreedy at  Sun Nov 28 04:20:50 2010
From: tjreedy at (Terry Reedy)
Date: Sat, 27 Nov 2010 22:20:50 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <icshqf$ak$>

On 11/27/2010 6:26 PM, Raymond Hettinger wrote:

> Can I suggest that an enum-maker be offered as a third-party module

Possibly with competing versions for trial and testing ;-)

> rather than prematurely adding it into the standard library.

I had the same thought.

Terry Jan Reedy

From donjohnston at  Sun Nov 28 05:17:11 2010
From: donjohnston at (Don Johnston)
Date: Sun, 28 Nov 2010 04:17:11 +0000 (UTC)
Subject: [Python-Dev]
References: <icjsal$eqk$>	<>	<>
	<ics17i$che$> <>
Message-ID: <>

Steven D'Aprano <steve <at>> writes:

> Aha! I never would have guessed that the bubbles are clickable -- I 
> thought you just moused-over them and they showed static comments put 
> there by the developers, part of the documentation itself. I didn't 
> realise that it was for users to add spam^W comments to the page. With 
> that perspective, I need to rethink.
> Yes, I failed to fully read the instructions you sent, or understand 
> them. That's what users do -- they don't read your instructions, and 
> they misunderstand them. If your UI isn't easily discoverable, users 
> will not be able to use it, and will be frustrated and annoyed. The user 
> is always right, even when they're doing it wrong *wink*
> >> But it seems to me that comments are superfluous, if not actively harmful:
> > 
> > (I've not read anything about harmful below.  Was that just FUD?)
> Lowering accessibility to parts of the documentation is what I was 
> talking about when I said "actively harmful". But now that I have better 
> understanding of what the comment system is actually for, I have to rethink.

As an end-user, I, too, share concerns about the accessibility of the pending 
(proposed?) commenting functionality.

A read-only JSON API would be great.

Up until now, Sphinx has been an incredibly helpful tool for generating 
beautiful documentation from ReStructuredText, which is great for limiting the 
risk of malformed input.

The new commenting feature ("dynamic application functionality") requires 
persistence for user-submitted content. Database persistence is currently 
implemented with the -excellent- SQLAlchemy ORM.

So, this is a transition from Sphinx being an excellent publishing tool to being 
a dynamic publishing platform for user-submitted content ("comments"). I am sure 
this was not without due consideration, and FUD.

The Python Web Framework communities (favorite framework *here*) will be the 
first to reiterate the challenges that all web application developers (and 
commenting API providers) face on a daily basis:

- SQL Injection
- XSS (Cross Site Scripting)
- CSRF (Cross Site Request Forgery)

Here are a few scenarios to consider:

(1) Freeloading jackass decides that each paragraph of our documentation would 
look better with 200 "comments" for Viagra. Freeloading jackass is aware of how 
HTTP GETs work.

- What markup features are supported?
- How does the application sanitize user-supplied input?
- Is html5lib good enough?
- On, how are 1000 inappropriate (freeloading) comments from 
1000 different IPs deleted?
- What's the roadmap for {..., Akismet, ReCaptcha, ...} support?
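
To make the sanitization question concrete, here is a minimal sketch of escaping user-supplied comment text before it is placed in a page. It is not taken from the Sphinx source; the function name render_comment and the markup are hypothetical, and it assumes comments are plain text with no markup allowed at all.

```python
import html


def render_comment(author, body):
    """Return an HTML fragment in which all user-supplied text is escaped.

    Hypothetical helper for illustration only; not part of Sphinx.
    """
    # html.escape replaces &, < and >, and with quote=True also " and '.
    safe_author = html.escape(author, quote=True)
    safe_body = html.escape(body, quote=True)
    return '<div class="comment"><b>{}</b>: {}</div>'.format(safe_author, safe_body)


print(render_comment("mallory", "<script>alert('spam')</script>"))
```

Escaping only addresses script injection; it does nothing about volume (the 1000-comments case), which needs rate limiting or moderation.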

(2) Freeloading jackass buys a block of javascript adspace on <insert-site->. The block of javascript surreptitiously posts helpful comments on 
behalf of unwitting users.

- How does the application ensure that comments are submitted from the site 
hosting the documentation?
- Which frameworks have existing, reviewed CSRF protections?
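
For the second scenario, one common approach (an assumption on my part, not something the current code is known to do) is to tie every comment form to a token derived from the user's session and a server-side secret. A rough sketch, with SECRET_KEY, make_token and is_valid_token all being hypothetical names:

```python
import hashlib
import hmac
import secrets

# Hypothetical per-deployment secret; a real site would load it from configuration.
SECRET_KEY = secrets.token_bytes(32)


def make_token(session_id):
    """Derive a token to embed in the comment form served to this session."""
    return hmac.new(SECRET_KEY, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def is_valid_token(session_id, token):
    """Check a submitted token in constant time."""
    expected = make_token(session_id)
    return hmac.compare_digest(expected.encode("ascii"), token.encode("utf-8"))


token = make_token("session-123")
print(is_valid_token("session-123", token))   # token issued for this session
print(is_valid_token("session-456", token))   # token replayed from another session
```

A script running on a third-party page cannot read the token out of the documentation site's form, so its forged POST would fail the check.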

Trying to read through the new source here [1], but there aren't many docstrings 
and BB doesn't yet support inline commenting. AFAIK, there are not yet any 
issues filed for these concerns. [2] 

1. In the event that that kind of bug is discovered, how should the community 
report the issues?
2. If we have an alternate method of encouraging documentation feedback, how can 
this feature be turned off?

Thanks again for a great publishing tool,


From benjamin at  Sun Nov 28 05:33:43 2010
From: benjamin at (Benjamin Peterson)
Date: Sat, 27 Nov 2010 22:33:43 -0600
Subject: [Python-Dev] [RELEASED] Python 2.7.1
Message-ID: <>

On behalf of the Python development team, I'm happy as a clam to announce the
immediate availability of Python 2.7.1.

2.7 includes many features that were first released in Python 3.1. The faster io
module, the new nested with statement syntax, improved float repr, set literals,
dictionary views, and the memoryview object have been backported from 3.1. Other
features include an ordered dictionary implementation, unittest improvements, a
new sysconfig module, auto-numbering of fields in the str/unicode format method,
and support for ttk Tile in Tkinter.  For a more extensive list of changes in
2.7, see or Misc/NEWS in the Python
distribution.

To download Python 2.7.1 visit:

The 2.7.1 changelog is at:

2.7 documentation can be found at:

This is a production release.  Please report any bugs you find to the bug


Benjamin Peterson
Release Manager
benjamin at
(on behalf of the entire python-dev team and 2.7.1's contributors)

From benjamin at  Sun Nov 28 05:34:42 2010
From: benjamin at (Benjamin Peterson)
Date: Sat, 27 Nov 2010 22:34:42 -0600
Subject: [Python-Dev] [RELEASED] Python 3.1.3
Message-ID: <>

On behalf of the Python development team, I'm happy as a lark to announce the
third bugfix release for the Python 3.1 series, Python 3.1.3.

This bug fix release features numerous bug fixes and documentation improvements
over 3.1.2.

The Python 3.1 version series focuses on the stabilization and optimization of
the features and changes that Python 3.0 introduced.  For example, the new I/O
system has been rewritten in C for speed.  File system APIs that use unicode
strings now handle paths with undecodable bytes in them. Other features include
an ordered dictionary implementation, a condensed syntax for nested with
statements, and support for ttk Tile in Tkinter.  For a more extensive list of
changes in 3.1, see or Misc/NEWS in
the Python distribution.

This is a production release. To download Python 3.1.3 visit:

A list of changes in 3.1.3 can be found here:

The 3.1 documentation can be found at:

Bugs can always be reported to:


Benjamin Peterson
Release Manager
benjamin at
(on behalf of the entire python-dev team and 3.1.3's contributors)

From martin at  Sun Nov 28 09:09:53 2010
From: martin at (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 28 Nov 2010 09:09:53 +0100
Subject: [Python-Dev] Virus on python-3.1.2.msi?
Message-ID: <>

Issue 1050 claims that the 3.1.2 installer has the virus Palevo.DZ.
Can somebody with a virus scanner please confirm or contest that


From fuzzyman at  Sun Nov 28 14:48:08 2010
From: fuzzyman at (Michael Foord)
Date: Sun, 28 Nov 2010 13:48:08 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <icshqf$ak$>
References: <>	<>	<>	<>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 28/11/2010 03:20, Terry Reedy wrote:
> On 11/27/2010 6:26 PM, Raymond Hettinger wrote:
>> Can I suggest that an enum-maker be offered as a third-party module
> Possibly with competing versions for trial and testing ;-)
>> rather than prematurely adding it into the standard library.
> I had same thought.

There are already *several* enum packages for Python available. The 
implementation by Ben Finney, associated with the previous PEP, is on 
PyPI and the most recent release has over 4000 downloads making it 
reasonably popular:

Other contenders include flufl.enum and lazr.enum. The Twisted guys 
would like a named constant type, and have a ticket for it, and PyQt has 
its own implementation (subclassing int) providing this functionality. 
In terms of assessing *general* usefulness in the wider community that 
step has already been done.

This discussion came out of yet-another-set-of-integer-constants being 
added to the Python standard library (since changed to strings). We have 
integer constants, with the associated inscrutability when used from the 
interactive interpreter or debugging, in *many* standard library 
modules. The particular features and use cases being discussed have use 
*within* the standard library in mind.

Releasing yet-another-enum-library-that-the-standard-library-can't-use 
would be a particularly pointless outcome of this discussion. The 
decision is whether or not to use named constants in the standard 
library, otherwise we can just point people at one of the several 
existing packages.

All the best,

Michael Foord


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From doko at  Sun Nov 28 16:46:09 2010
From: doko at (Matthias Klose)
Date: Sun, 28 Nov 2010 16:46:09 +0100
Subject: [Python-Dev] Question about GDB bindings and 32/64 bits
In-Reply-To: <>
References: <>
Message-ID: <>

On 26.11.2010 05:11, Jesus Cea wrote:
> Hash: SHA1
> I have installed GDB 7.2 32 bits and 32 bits buildslaves are green.
> Nevertheless 64 bits buildslaves are failing test_gdb.
> Is there any expectation that a 32 bits GDB be able to debug a 64 bits
> python?. If not, gdb test should compare "platform.architecture()" (for
> python and gdb in the system) and run only when they are the same.

that would be too restrictive, as a 64bit gdb is able to handle 32bit binaries too.
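
A rough sketch of a less restrictive check than strict equality. The helper names are hypothetical, and the "32-bit gdb cannot debug a 64-bit python" half is only an assumption based on the failing buildslaves, not something test_gdb is known to encode:

```python
import platform
import sys


def bits(executable=sys.executable):
    """Return '32bit' or '64bit' for the given executable."""
    # platform.architecture returns a (bits, linkage) tuple.
    return platform.architecture(executable)[0]


def gdb_can_debug(gdb_bits, python_bits):
    """Same width always works; a 64-bit gdb also handles 32-bit binaries."""
    if gdb_bits == python_bits:
        return True
    # Assumption: the reverse combination (32-bit gdb, 64-bit python) does not work.
    return gdb_bits == "64bit" and python_bits == "32bit"


print(bits())
print(gdb_can_debug("64bit", "32bit"))
print(gdb_can_debug("32bit", "64bit"))
```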

> If
> this should work, I would open a bug and maybe spend some time with it.
> But before thinking about investing time, I would like to know if this
> mix is actually expected or not to work.
> If not, I would consider to install a 64 bits GDB too and do some tricks
> (like using an "/usr/local/bin/gdb" script wrapper to choose 32/64
> "real" gdb version) to actually execute "test_gdb" in both buildslaves
> (they are running in the same physical machine).

yes, and then you should be able to use this gdb for both 32 and 64bit builds. 
No need for a wrapper (Such a gdb is available in the gdb64 package on 


From fuzzyman at  Sun Nov 28 17:28:00 2010
From: fuzzyman at (Michael Foord)
Date: Sun, 28 Nov 2010 16:28:00 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>
	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 28/11/2010 02:38, Nick Coghlan wrote:
> On Sun, Nov 28, 2010 at 9:26 AM, Raymond Hettinger
> <raymond.hettinger at>  wrote:
>> On Nov 27, 2010, at 12:56 PM, Glenn Linderman wrote:
>>> On 11/27/2010 2:51 AM, Nick Coghlan wrote:
>>>> Not quite. I'm suggesting a factory function that works for any value,
>>>> and derives the parent class from the type of the supplied value.
>>> Nick, thanks for the much better implementation than I achieved; you seem to have the same goals as my implementation.  I learned a bit     making mine, and more understanding yours to some degree.  What I still don't understand about your implementation, is that when adding one additional line to your file, it fails:
>>> w = named_value("ABC", z )
>>> Now I can understand why it might not be a good thing to make a named value of a named value (confusing, at least), but I was surprised, and still do not understand, that it failed reporting the __new__() takes exactly 3 arguments (2 given).
>> Can I suggest that an enum-maker be offered as a third-party module rather than prematurely adding it into the standard library.
> Indeed. Glenn's failing example suggests to me that using a new
> metaclass is probably going to be a cleaner option than trying to
> dance around type's default behaviour within an ordinary class
> definition (if nothing else, a separate metaclass makes it much easier
> to detect when you're dealing with an instance of a named type).

Yep, for representing a group of names a single class with a metaclass 
seems like a reasonable approach. See my note below about agreeing 
minimal feature-set and minimal-api before we discuss implementation 

> Regardless, I still see value in approaching this whole discussion as
> a two-level design problem, with "named values" as the more
> fundamental concept, and then higher level grouping APIs to get
> enum-style behaviour.

It seems like using the term "enum" provokes a strong negative reaction 
in some of the core-devs who are basically in favour of named constants and 
not actively against grouping. I'm happy with NamedConstant and 
GroupedNames (or similar) and dropping the use of the term enum.

There are also valid concerns about over-engineering (and not so valid 
concerns...). Simplicity in creating them and no additional burden in 
using them are fundamental, but in the APIs / implementations suggested 
so far I think we are keeping that in mind.

> Eventually attaining "One Obvious Way" for the
> former seems achievable to me, while the diversity of use cases for
> grouping APIs suggests to me that "one-size-fits-all" isn't going to
> work unless that "one size" is a Frankenstein API with more options
> than anyone could reasonably hope to keep in their head at once.
Well... yes - treating it as a two level design problem is fine.

I don't think there are *many* competing features, in fact as far as 
feature requests on python-dev go I think this is a relatively 
straightforward one with a lot of *agreement* on the basic functionality.

We have had various discussions about what the API should look like, or 
what the implementation should look like, but I don't think there is a 
lot of disagreement about basic features. There are some 'optional 
features'. Many of these can be added later without backwards 
compatibility issues, so those can profitably be omitted from an initial 
implementation.

Features as I see them:

Named constant

* Nice repr
* Subclass of the type it represents
* Trivially easy to convert to either a string (name) or the value it 
represents
* If an integer type, can be OR'd with other named constants and retains 
a useful repr

Grouped constants
* Easy to create a group of named constants, accessible as attributes on 
group object
* Capability to go from name or value to corresponding constants

Optional Features

* Ability to dynamically add new named values to a group. (Suggested by 
Guido)
* Ability to test if a name or value is in a group
* Ability to list all names in a group
* ANDing as well as ORing
* Constants are unique
* OR'ing with an integer will look up the name (or calculate it if the 
int itself represents flags that have already been OR'd) and return a 
named value (with useful repr) instead of just an integer
* Named constants be named values that can wrap *any* type and not just 
immutable values. (Note that wrapping mutable types makes providing 
"from_value" functionality harder *unless* we guarantee that named 
values are unique. If they aren't unique named values for a mutable type 
can have different values and there is no single definition of what the 
named value actually is.)
* Requiring that values only have one name - or alternatively that values 
on a group could have multiple names (obviously incompatible features).
* Requiring all names in a group to be of the same type
* Allow names to be set automatically in a namespace, for example in a 
class namespace or on a module
* Allow subclassing and adding of new values only present in subclass

I'd rather we agree a suitable (minimal) API and feature set and go to 
implementation from that.

For wrapping mutable types I'm tempted to say YAGNI. For the standard 
library wrapping integers meets almost all our use-cases except for one 
float. (At work we have a decimal constant as it happens.) Perhaps we 
could require immutable types for groups but allow arbitrary values for 
individual named values?

For the named values api:

name = NamedValue('name', value)

For the grouping (tentatively accepted as reasonable by Antoine):

Group = make_constants('Group', name1=value1, name2=value2)
name1, name2 = Group.name1, Group.name2
flag = name1 | name2

value = int(Group.name1)
name = Group('name1')
# alternatively: value = Group.from_name('name1')
name = Group.from_value(value1)
# Group(value1) could work only if values aren't strings
# perhaps: name = Group(value=value1)

Group.new_name = value3 # create new value on the group
names = Group.all_names()
# further bikeshedding on spelling of all_names required
# correspondingly 'all_values' I guess, returning the constants themselves
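
To make the minimal feature set concrete, here is one possible sketch of the API above. It is only an illustration, not a proposed implementation: the ConstantGroup class and the repr format are my own inventions, lookups are spelled from_name / from_value rather than Group('name1'), and OR'ing two constants yields a plain int here (so the "retains a useful repr" feature is not covered).

```python
_named_types = {}  # cache: base type -> generated subclass


def NamedValue(name, value):
    """Wrap value in a subclass of its own type that remembers its name."""
    base = type(value)
    cls = _named_types.get(base)
    if cls is None:
        def __repr__(self):
            return "<{} = {}>".format(self._name, base.__repr__(self))
        # Types that cannot be subclassed (bool, NoneType) raise TypeError here.
        cls = type("Named" + base.__name__.capitalize(), (base,),
                   {"__repr__": __repr__})
        _named_types[base] = cls
    obj = cls(value)
    obj._name = name
    return obj


class ConstantGroup:
    """Hypothetical group object; constants are exposed as attributes."""

    def __init__(self, group_name, **values):
        self._group_name = group_name
        self._by_name = {}
        for name, value in values.items():
            constant = NamedValue(name, value)
            self._by_name[name] = constant
            setattr(self, name, constant)

    def from_name(self, name):
        return self._by_name[name]

    def from_value(self, value):
        for constant in self._by_name.values():
            if constant == value:
                return constant
        raise ValueError("no constant with value {!r}".format(value))

    def all_names(self):
        return sorted(self._by_name)


def make_constants(group_name, **values):
    return ConstantGroup(group_name, **values)


Colours = make_constants("Colours", RED=1, GREEN=2)
print(repr(Colours.RED))
print(Colours.from_value(2) is Colours.GREEN)
print(Colours.all_names())
```

Because each constant is an instance of a subclass of its value's type, existing code that expects an int keeps working unchanged.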

Some of the optional features couldn't later be added without backwards 
compatibility concerns (I think the type checking features and requiring 
unique values for example). We should at least consider these if we are 
to make adding them later difficult. I would be fine with not having 
these features.

All the best,


> Cheers,
> Nick.



From fuzzyman at  Sun Nov 28 18:05:12 2010
From: fuzzyman at (Michael Foord)
Date: Sun, 28 Nov 2010 17:05:12 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 28/11/2010 16:28, Michael Foord wrote:
> [snip...]
> I don't think there are *many* competing features, in fact as far as 
> feature requests on python-dev go I think this is a relatively 
> straightforward one with a lot of *agreement* on the basic functionality.
> We have had various discussions about what the API should look like, 
> or what the implementation should look like, but I don't think there 
> is a lot of disagreement about basic features. There are some 
> 'optional features'. Many of these can be added later without 
> backwards compatibility issues, so those can profitably be omitted 
> from an initial implementation.
> Features as I see them:
> Named constant
> --------------
> * Nice repr
> * Subclass of the type it represents
> * Trivially easy to convert either to a string (name) and the value it 
> represents
> * If an integer type, can be OR'd with other named constants and 
> retains a useful repr
Note that having an OR repr is meaningless *unless* the constants are 
intended to be flags, so support for OR'ing should be specified explicitly:

     name = NamedValue('name', value, flags=True)

Where flags defaults to False. Typically you will use this through the 
grouping API anyway - where it can either be a keyword argument 
(slightly annoying because the suggestion is to create the named values 
through keyword arguments) or we can have two group-factory functions:

     Group = make_constants('Group', name1=value1, name2=value2)
     Flags = make_flags('Flags', name1=value1, name2=value2)

It is sensible if flag values are only powers of 2; we could enforce 
that or not... (Another one for the optional feature list.)
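
A sketch of what a make_flags factory with the power-of-2 check might look like. This is one possible shape only; the Flag base class, the repr format and the choice to raise ValueError are assumptions, not agreed API.

```python
class Flag(int):
    """An int subclass whose repr lists the names of the bits that are set."""

    _names = {}  # bit value -> name; replaced per group by make_flags

    def __or__(self, other):
        return type(self)(int(self) | int(other))

    __ror__ = __or__

    def __and__(self, other):
        return type(self)(int(self) & int(other))

    __rand__ = __and__

    def __repr__(self):
        names = [name for bit, name in sorted(self._names.items())
                 if int(self) & bit]
        return "<{}: {}>".format(" | ".join(names) or "0", int(self))


def make_flags(group_name, **values):
    """Create a Flag subclass with one class attribute per named bit."""
    for name, value in values.items():
        # A positive power of 2 has exactly one bit set.
        if value <= 0 or value & (value - 1):
            raise ValueError("{}={!r} is not a power of 2".format(name, value))
    flag_type = type(group_name, (Flag,),
                     {"_names": {v: n for n, v in values.items()}})
    for name, value in values.items():
        setattr(flag_type, name, flag_type(value))
    return flag_type


Perms = make_flags("Perms", READ=1, WRITE=2, EXEC=4)
print(repr(Perms.READ | Perms.WRITE))
print(repr(Perms.READ | 4))
```

In this sketch OR'ing with a plain integer also returns a named value, which is one of the items on the optional list.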

I forgot auto-enumeration (specifying names only and having values 
autogenerated) from the optional feature set by the way. I think Antoine 
strongly disapproves of this feature because it reminds him of C enums.

Mark Dickinson thinks that the flags feature could be an optional 
feature too. If we have ORing it makes sense to have ANDing, so I guess 
they belong together. I think there is value in it though.

I realise that the optional feature list is now not small, and 
implementing all of it would create the "franken-api" Nick is worried 
about. The minimal feature list is nicely small though and provides 
useful functionality.

All the best,


> Grouped constants
> ----------------
> * Easy to create a group of named constants, accessible as attributes 
> on group object
> * Capability to go from name or value to corresponding constants
> Optional Features
> ---------------
> * Ability to dynamically add new named values to a group. (Suggested 
> by Guido)
> * Ability to test if a name or value is in a group
> * Ability to list all names in a group
> * ANDing as well as ORing
> * Constants are unique
> * OR'ing with an integer will look up the name (or calculate it if the 
> int itself represents flags that have already been OR'd) and return a 
> named value (with useful repr) instead of just an integer
> * Named constants be named values that can wrap *any* type and not 
> just immutable values. (Note that wrapping mutable types makes 
> providing "from_value" functionality harder *unless* we guarantee that 
> named values are unique. If they aren't unique named values for a 
> mutable type can have different values and there is no single 
> definition of what the named value actually is.)
> Requiring that values only have one name - or alternatively that 
> values on a group could have multiple names (obviously incompatible 
> features).
> * Requiring all names in a group to be of the same type
> * Allow names to be set automatically in a namespace, for example in a 
> class namespace or on a module
> * Allow subclassing and adding of new values only present in subclass
> I'd rather we agree a suitable (minimal) API and feature set and go to 
> implementation from that.
> For wrapping mutable types I'm tempted to say YAGNI. For the standard 
> library wrapping integers meets almost all our use-cases except for 
> one float. (At work we have a decimal constant as it happens.) Perhaps 
> we could require immutable types for groups but allow arbitrary values 
> for individual named values?
> For the named values api:
> name = NamedValue('name', value)
> For the grouping (tentatively accepted as reasonable by Antoine):
> Group = make_constants('Group', name1=value1, name2=value2)
> name1, name2 = Group.name1, Group.name2
> flag = name1 | name2
> value = int(Group.name1)
> name = Group('name1')
> # alternatively: value = Group.from_name('name1')
> name = Group.from_value(value1)
> # Group(value1) could work only if values aren't strings
> # perhaps: name = Group(value=value1)
> Group.new_name = value3 # create new value on the group
> names = Group.all_names()
> # further bikeshedding on spelling of all_names required
> # correspondingly 'all_values' I guess, returning the constants 
> themselves
> Some of the optional features couldn't later be added without 
> backwards compatibility concerns (I think the type checking features 
> and requiring unique values for example). We should at least consider 
> these if we are to make adding them later difficult. I would be fine 
> with not having these features.
> All the best,
> Michael
>> Cheers,
>> Nick.



From fuzzyman at  Sun Nov 28 18:16:21 2010
From: fuzzyman at (Michael Foord)
Date: Sun, 28 Nov 2010 17:16:21 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 28/11/2010 17:05, Michael Foord wrote:
> [snip...]
> It is sensible if flag values are only powers of 2; we could enforce 
> that or not... (Another one for the optional feature list.)
Another 'optional' feature I omitted was Phillip J. Eby's suggestion / 
requirement that named values be pickleable. Email is clunky for 
handling this; is there enough support (there is still some objection, 
that's for sure) to revive the PEP or create a new one?

I also didn't include Nick's suggested API, which is slightly different 
from the one I suggested:

silly = Namegroup.from_names("Silly", "FOO", "BAR", "BAZ")
 >>> silly.FOO
 >>> int(silly.FOO)
 >>> silly(0)

x = named_value("FOO", 1)
y = named_value("BAR", "Hello World!")
z = named_value("BAZ", dict(a=1, b=2, c=3))

set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw())

Where a named value created from an integer is an int subclass, from a 
dict a dict subclass and so on.





From steve at  Sun Nov 28 19:05:55 2010
From: steve at (Steven D'Aprano)
Date: Mon, 29 Nov 2010 05:05:55 +1100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Michael Foord wrote:

> Another 'optional' feature I omitted was Phillip J. Eby's suggestion / 
> requirement that named values be pickleable. Email is clunky for 
> handling this, is there enough support (there is still some objection 
> that is sure) to revive the PEP or create a new one?

I think it definitely needs a PEP. I don't care whether you revive the 
old PEP or write a new one.


From fuzzyman at  Sun Nov 28 19:49:30 2010
From: fuzzyman at (Michael Foord)
Date: Sun, 28 Nov 2010 18:49:30 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 28/11/2010 18:05, Steven D'Aprano wrote:
> Michael Foord wrote:
>> Another 'optional' feature I omitted was Phillip J. Eby's suggestion 
>> / requirement that named values be pickleable. Email is clunky for 
>> handling this, is there enough support (there is still some objection 
>> that is sure) to revive the PEP or create a new one?
> I think it definitely needs a PEP. I don't care whether you revive the 
> old PEP or write a new one.
Well, "if it were to be accepted it would need a PEP" and "the next step 
should be a PEP" are slightly different statements. :-)

As I agree with the former *anyway*, at worst starting a PEP will 
waste time, so I guess I'll get that underway when I get a chance...





From alexander.belopolsky at  Sun Nov 28 21:24:37 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 15:24:37 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
Message-ID: <>

Two recently reported issues brought to light the fact that the Python
language definition is closely tied to character properties maintained
by the Unicode Consortium. [1,2]  For example, when Python switches to
Unicode 6.0.0 (planned for the upcoming 3.2 release), we will gain two
additional characters that Python can use in identifiers. [3]

With Python 3.1:

>>> exec('\u0CF1 = 1')
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "<string>", line 1
   ? = 1
SyntaxError: invalid character in identifier

but with Python 3.2a4:

>>> exec('\u0CF1 = 1')
>>> eval('\u0CF1')
1
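
A small sketch for checking which UCD version a given interpreter was built with and how it classifies a character. The describe helper is a hypothetical name; the values printed for U+0CF1 depend on the Unicode version, so none are shown here.

```python
import unicodedata


def describe(ch):
    """Return (name, general category, usable-as-identifier) for one character."""
    return (unicodedata.name(ch, "<unnamed>"),
            unicodedata.category(ch),
            ch.isidentifier())


# Version of the Unicode Character Database compiled into this interpreter.
print(unicodedata.unidata_version)
print(describe("a"))
print(describe("\u0CF1"))
```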

Of course, the likelihood is low that this change will affect any
user, but the change in str.isspace() reported in [1] is likely to
cause some trouble:

Python 2.6.5:
>>> u'A\u200bB'.split()
[u'A', u'B']

Python 2.7:
>>> u'A\u200bB'.split()
[u'A\u200bB']

While we have little choice but to follow UCD in defining
str.isidentifier(), I think Python can promise users more stability in
what it treats as space or as a digit in its builtins.   For example,
I don't think that supporting

>>> float('????.??')
1234.56

is more important than to assure users that once their program
accepted some text as a number, they can assume that the text is
ASCII.
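
The behaviour in question can be reproduced on a current Python 3 with the sketch below. The non-ASCII digits in the original example did not survive archiving, so the Arabic-Indic digits used here are my own choice for illustration.

```python
# ARABIC-INDIC DIGITS ONE..SIX around an ASCII decimal point (illustrative choice).
amount = "\u0661\u0662\u0663\u0664.\u0665\u0666"

# float() accepts any Unicode decimal digits, not just ASCII 0-9.
print(float(amount))

# The accepted text is nevertheless not ASCII.
try:
    amount.encode("ascii")
    print("ASCII")
except UnicodeEncodeError:
    print("not ASCII")

# ZERO WIDTH SPACE is no longer treated as whitespace by split().
print("A\u200bB".split())
```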


From solipsis at  Sun Nov 28 21:43:11 2010
From: solipsis at (Antoine Pitrou)
Date: Sun, 28 Nov 2010 21:43:11 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
References: <>
Message-ID: <>

On Sun, 28 Nov 2010 15:24:37 -0500
Alexander Belopolsky <alexander.belopolsky at> wrote:
> While we have little choice but to follow UCD in defining
> str.isidentifier(), I think Python can promise users more stability in
> what it treats as space or as a digit in its builtins.

Well, if "unicode support" means "support the latest version of the
Unicode standard", I'm not sure we have a choice.
We can make exceptions, but that would only confuse users even more,
wouldn't it?

> For example,
> I don't think that supporting
> >>> float('????.??')
> 1234.56
> is more important than to assure users that once their program
> accepted some text as a number, they can assume that the text is
> ASCII.

Why would they assume the text is ASCII?



From alexander.belopolsky at  Sun Nov 28 21:58:33 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 15:58:33 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 28, 2010 at 3:43 PM, Antoine Pitrou <solipsis at> wrote:
>> For example,
>> I don't think that supporting
>> >>> float('????.??')
>> 1234.56
>> is more important than to assure users that once their program
>> accepted some text as a number, they can assume that the text is
>> ASCII.
> Why would they assume the text is ASCII?

def deposit(self, amountstr):
      self.balance += float(amountstr)
      audit_log("Deposited: " + amountstr)

Auditor:

$ cat numbered-account.log
Deposited: ?????.??

From solipsis at  Sun Nov 28 22:04:15 2010
From: solipsis at (Antoine Pitrou)
Date: Sun, 28 Nov 2010 22:04:15 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, 28 Nov 2010 15:58:33 -0500
Alexander Belopolsky <alexander.belopolsky at> wrote:

> On Sun, Nov 28, 2010 at 3:43 PM, Antoine Pitrou <solipsis at> wrote:
> ..
> >> For example,
> >> I don't think that supporting
> >>
> >> >>> float('????.??')
> >> 1234.56
> >>
> >> is more important than to assure users that once their program
> >> accepted some text as a number, they can assume that the text is
> >> ASCII.
> >
> > Why would they assume the text is ASCII?
> def deposit(self, amountstr):
>       self.balance += float(amountstr)
>       audit_log("Deposited: " + amountstr)
> Auditor:
> $ cat numbered-account.log
> Deposited: ?????.??

I'm not sure that's how banking applications are written :)


From jsbueno at  Sun Nov 28 22:12:09 2010
From: jsbueno at (Joao S. O. Bueno)
Date: Sun, 28 Nov 2010 19:12:09 -0200
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 28, 2010 at 7:04 PM, Antoine Pitrou <solipsis at> wrote:
> On Sun, 28 Nov 2010 15:58:33 -0500
> Alexander Belopolsky <alexander.belopolsky at> wrote:
>> On Sun, Nov 28, 2010 at 3:43 PM, Antoine Pitrou <solipsis at> wrote:
>> ..
>> >> For example,
>> >> I don't think that supporting
>> >>
>> >> >>> float('????.??')
>> >> 1234.56
>> >>
>> >> is more important than to assure users that once their program
>> >> accepted some text as a number, they can assume that the text is
>> >> ASCII.
>> >
>> > Why would they assume the text is ASCII?
>> def deposit(self, amountstr):
>>       self.balance += float(amountstr)
>>       audit_log("Deposited: " + amountstr)
>> Auditor:
>> $ cat numbered-account.log
>> Deposited: ?????.??
> I'm not sure that's how banking applications are written :)
+1 for this being bogus  - I see no correlation whatsoever in numbers
inside unicode having to be "ASCII" if we have surpassed all technical
barriers for needing to behave like that.  ASCII is an
oversimplification of human communication needed for computing devices
not complex enough to represent it fully.

Let novice C programmers in English speaking countries deal with the
fact that 1 character is not 1 byte anymore. We are past this point.


> Antoine.
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at
> Unsubscribe:

From alexander.belopolsky at  Sun Nov 28 22:18:06 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 16:18:06 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 28, 2010 at 4:12 PM, Joao S. O. Bueno <jsbueno at> wrote:
> Let novice C programmers in English speaking countries deal with the
> fact that 1 character is not 1 byte anymore. We are past this point.

If you are, please contribute your expertise here:

From greg.ewing at  Sun Nov 28 22:23:56 2010
From: greg.ewing at (Greg Ewing)
Date: Mon, 29 Nov 2010 10:23:56 +1300
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

Rob Cliffe wrote:

> But couldn't they be presented to the Python programmer as a single 
> type, with the implementation details hidden "under the hood"?

Not in CPython, because tuple items are kept in the same block
of memory as the object header. Because CPython can't move
objects, this means that the size of the tuple must be known
when the object is created.
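
A rough way to observe this layout from Python is to compare reported object sizes. This is only a sketch relying on CPython implementation details; `sys.getsizeof` reports whatever the running interpreter chooses to report, so the exact numbers are not guaranteed.

```python
import sys

# Sizes of tuples holding 0..4 items. In CPython the item slots live in the
# same allocation as the object header, so the size grows with the length.
sizes = [sys.getsizeof((None,) * n) for n in range(5)]
growth = [b - a for a, b in zip(sizes, sizes[1:])]

print(sizes)
print(growth)   # one pointer-sized slot per extra item on CPython
```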


From martin at  Sun Nov 28 23:17:13 2010
From: martin at ("Martin v. Löwis")
Date: Sun, 28 Nov 2010 23:17:13 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

>>>>> float('????.??')
>> 1234.56

I think it's a bug that this works. The definition of the float builtin says

Convert a string or a number to floating point. If the argument is a
string, it must contain a possibly signed decimal or floating point
number, possibly embedded in whitespace. The argument may also be
'[+|-]nan' or '[+|-]inf'.

Now, one may wonder what precisely a "possibly signed floating point
number" is, but most likely, this refers to

floatnumber   ::=  pointfloat | exponentfloat
pointfloat    ::=  [intpart] fraction | intpart "."
exponentfloat ::=  (intpart | pointfloat) exponent
intpart       ::=  digit+
fraction      ::=  "." digit+
exponent      ::=  ("e" | "E") ["+" | "-"] digit+
digit          ::=  "0"..."9"
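
To make the comparison concrete, the grammar above can be transcribed into a regular expression and set against what float() accepts. This is an illustrative sketch only: the names (`FLOATNUMBER`, `matches_grammar`) are invented here, and the grammar covers the unsigned literal without surrounding whitespace.

```python
import re

# One pattern per production of the quoted grammar.
DIGIT = r"[0-9]"
INTPART = rf"{DIGIT}+"
FRACTION = rf"\.{DIGIT}+"
EXPONENT = rf"[eE][+-]?{DIGIT}+"
POINTFLOAT = rf"(?:(?:{INTPART})?{FRACTION}|{INTPART}\.)"
EXPONENTFLOAT = rf"(?:{INTPART}|{POINTFLOAT}){EXPONENT}"
FLOATNUMBER = re.compile(rf"(?:{EXPONENTFLOAT}|{POINTFLOAT})")


def matches_grammar(text):
    """Return True if text is a floatnumber according to the grammar."""
    return FLOATNUMBER.fullmatch(text) is not None


# ARABIC-INDIC digits one to four, a full stop, then five and six.
arabic_indic = "\u0661\u0662\u0663\u0664.\u0665\u0666"

for candidate in ["1234.56", "1e4", "1234", "infinity", arabic_indic]:
    print(ascii(candidate), matches_grammar(candidate), float(candidate))
```

The last three candidates are accepted by float() although they are not a floatnumber under the grammar, which is the gap between documentation and implementation under discussion.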


From alexander.belopolsky at  Sun Nov 28 23:31:51 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 17:31:51 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Sun, Nov 28, 2010 at 5:17 PM, "Martin v. Löwis" <martin at> wrote:
>>>>>> float('????.??')
>>> 1234.56
> I think it's a bug that this works. The definition of the float builtin says
> Convert a string or a number to floating point. If the argument is a
> string, it must contain a possibly signed decimal or floating point
> number, possibly embedded in whitespace. The argument may also be
> '[+|-]nan' or '[+|-]inf'.

This definition fails long before we get beyond 127-th code point:

>>> float('infinity')
inf

From mal at  Sun Nov 28 23:42:31 2010
From: mal at (M.-A. Lemburg)
Date: Sun, 28 Nov 2010 23:42:31 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
Message-ID: <>

"Martin v. L?wis" wrote:
>>>>>> float('????.??')
>>> 1234.56
> I think it's a bug that this works. The definition of the float builtin says
> Convert a string or a number to floating point. If the argument is a
> string, it must contain a possibly signed decimal or floating point
> number, possibly embedded in whitespace. The argument may also be
> '[+|-]nan' or '[+|-]inf'.
> Now, one may wonder what precisely a "possibly signed floating point
> number" is, but most likely, this refers to
> floatnumber   ::=  pointfloat | exponentfloat
> pointfloat    ::=  [intpart] fraction | intpart "."
> exponentfloat ::=  (intpart | pointfloat) exponent
> intpart       ::=  digit+
> fraction      ::=  "." digit+
> exponent      ::=  ("e" | "E") ["+" | "-"] digit+
> digit          ::=  "0"..."9"

I don't see why the language spec should limit the wealth of number
formats supported by float().

It is not uncommon for Asians and other non-Latin script users to
use their own native script symbols for numbers. Just because these
digits may look strange to someone doesn't mean that they are
meaningless or should be discarded.

Please also remember that Python3 now allows Unicode names for
identifiers for much the same reasons.

Note that the support in float() (and the other numeric constructors)
to work with Unicode code points was explicitly added when Unicode
support was added to Python and has been available since Python 1.6.

It is not a bug by any definition of "bug", even though the feature
may bug someone occasionally to go read up a bit on what else
the world has to offer other than Arabic numerals :-)

Marc-Andre Lemburg

Professional Python Services directly from the Source  (#1, Nov 28 2010)
>>> Python/Zope Consulting and Support ...
>>> mxODBC.Zope.Database.Adapter ...   
>>> mxODBC, mxDateTime, mxTextTools ...

::: Try our new mxODBC.Connect Python Database Interface for free ! :::: Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From mal at  Sun Nov 28 23:48:59 2010
From: mal at (M.-A. Lemburg)
Date: Sun, 28 Nov 2010 23:48:59 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

Alexander Belopolsky wrote:
> Two recently reported issues brought into light the fact that Python
> language definition is closely tied to character properties maintained
> by the Unicode Consortium. [1,2]  For example, when Python switches to
> Unicode 6.0.0 (planned for the upcoming 3.2 release), we will gain two
> additional characters that Python can use in identifiers. [3]
> With Python 3.1:
>>>> exec('\u0CF1 = 1')
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "<string>", line 1
>    ? = 1
>      ^
> SyntaxError: invalid character in identifier
> but with Python 3.2a4:
>>>> exec('\u0CF1 = 1')
>>>> eval('\u0CF1')
> 1

Such changes are not new, but I agree that they should probably
be highlighted in the "What's new in Python x.x".

> Of course, the likelihood is low that this change will affect any
> user, but the change in str.isspace() reported in [1] is likely to
> cause some trouble:
> Python 2.6.5:
>>>> u'A\u200bB'.split()
> [u'A', u'B']
> Python 2.7:
>>>> u'A\u200bB'.split()
> [u'A\u200bB']

That's a classical bug fix.

> While we have little choice but to follow UCD in defining
> str.isidentifier(), I think Python can promise users more stability in
> what it treats as space or as a digit in its builtins. 

Why should we divert from the work done by the Unicode Consortium ?
After all, most of their changes are in fact bug fixes as well.

> For example,
> I don't think that supporting
>>>> float('????.??')
> 1234.56
> is more important than to assure users that once their program
> accepted some text as a number, they can assume that the text is
> ASCII.

Sorry, but I don't agree.

If ASCII numerals are an important aspect of an application, the
application should make sure that only those numerals are used
(e.g. by using a regular expression for checking).
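
A minimal sketch of such a check follows. The function name `parse_ascii_amount` and the accepted amount format are assumptions made for illustration. Note that in Python 3 the `\d` class matches any Unicode decimal digit in str patterns unless `re.ASCII` is given, so the pattern spells out `[0-9]`.

```python
import re

# Digits, optionally followed by a full stop and one or two more digits.
ASCII_AMOUNT = re.compile(r"[0-9]+(?:\.[0-9]{1,2})?")


def parse_ascii_amount(amountstr):
    """Convert amountstr to float, rejecting anything but ASCII digits."""
    text = amountstr.strip()
    if ASCII_AMOUNT.fullmatch(text) is None:
        raise ValueError("expected ASCII digits, got %a" % (amountstr,))
    return float(text)


arabic_indic = "\u0661\u0662\u0663\u0664"          # ARABIC-INDIC digits 1-4
unicode_digits_match = re.fullmatch(r"\d+", arabic_indic) is not None
ascii_only_match = re.fullmatch(r"\d+", arabic_indic, re.ASCII) is not None
print(parse_ascii_amount(" 182.33 "), unicode_digits_match, ascii_only_match)
```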

In a Unicode world, not accepting non-Arabic numerals would be
a limitation, not a feature. Besides Python has had this support
since Python 1.6.

> [1]
> [2]
> [3]

Marc-Andre Lemburg


From alexander.belopolsky at  Sun Nov 28 23:51:00 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 17:51:00 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Sun, Nov 28, 2010 at 5:42 PM, M.-A. Lemburg <mal at> wrote:
> I don't see why the language spec should limit the wealth of number
> formats supported by float().

The Language Spec (whatever it is) should not, but hopefully the
Library Reference should.  If you follow link and
the references therein, you'll end up with

digit          ::=  "0"..."9"

From martin at  Sun Nov 28 23:56:47 2010
From: martin at ("Martin v. Löwis")
Date: Sun, 28 Nov 2010 23:56:47 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

Am 28.11.2010 23:31, schrieb Alexander Belopolsky:
> On Sun, Nov 28, 2010 at 5:17 PM, "Martin v. Löwis" <martin at> wrote:
>>>>>>> float('????.??')
>>>> 1234.56
>> I think it's a bug that this works. The definition of the float builtin says
>> Convert a string or a number to floating point. If the argument is a
>> string, it must contain a possibly signed decimal or floating point
>> number, possibly embedded in whitespace. The argument may also be
>> '[+|-]nan' or '[+|-]inf'.
> This definition fails long before we get beyond 127-th code point:
>>>> float('infinity')
> inf

What do you infer from that? That the definition is wrong, or the code is wrong?


From tjreedy at  Mon Nov 29 00:00:25 2010
From: tjreedy at (Terry Reedy)
Date: Sun, 28 Nov 2010 18:00:25 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
Message-ID: <icumu6$tlq$>

On 11/28/2010 3:58 PM, Alexander Belopolsky wrote:
> On Sun, Nov 28, 2010 at 3:43 PM, Antoine Pitrou<solipsis at>  wrote:
> ..
>>> For example,
>>> I don't think that supporting
>>>>>> float('????.??')
>>> 1234.56

Even if this is somehow an accident or something that someone snuck in, 
I think it is a good idea that *users* be able to input amounts with their 
native digits. That is different from requiring *programmers* to write 
literals with euro-ascii-digits.

>>> is more important than to assure users that once their program
>>> accepted some text as a number, they can assume that the text is
>>> ASCII.
>> Why would they assume the text is ASCII?
> def deposit(self, amountstr):
>        self.balance += float(amountstr)
>        audit_log("Deposited: " + amountstr)

If the programmer wants to ensure ASCII, he can produce a string, 
possibly formatted, from the amount:

depform = "Deposited: ${:14.2f}".format
def deposit(self, amountstr):
     amount = float(amountstr)
     self.balance += amount
     audit_log(depform(amount))
     # audit_log("Deposited: " + str(amount))  # simple version

Given that amountstr could be something like '        182.33        ', I 
think the programmer should plan to format it.
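
As a small illustration of the point (a sketch only; `audit_line` is a hypothetical helper, not part of the code above), formatting the parsed float produces ASCII output whatever digits or padding the input used:

```python
depform = "Deposited: ${:14.2f}".format


def audit_line(amountstr):
    """Build the log line from the parsed amount, not from the raw input."""
    amount = float(amountstr)
    return depform(amount)


# Input written with ARABIC-INDIC digits, and input padded with spaces.
line_from_native_digits = audit_line("\u0661\u0662\u0663\u0664.\u0665\u0666")
line_from_padded_input = audit_line("        182.33        ")
print(line_from_native_digits)
print(line_from_padded_input)
```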

Terry Jan Reedy

From alexander.belopolsky at  Mon Nov 29 00:01:10 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 18:01:10 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Sun, Nov 28, 2010 at 5:56 PM, "Martin v. Löwis" <martin at> wrote:
>> This definition fails long before we get beyond 127-th code point:
>>>>> float('infinity')
>> inf
> What do you infer from that? That the definition is wrong, or the code is wrong?

The development version of the reference manual is more detailed, but
as far as I can tell, it still defines digit as 0-9.

From martin at  Mon Nov 29 00:03:45 2010
From: martin at ("Martin v. Löwis")
Date: Mon, 29 Nov 2010 00:03:45 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
	<> <>
Message-ID: <>

>> Now, one may wonder what precisely a "possibly signed floating point
>> number" is, but most likely, this refers to
>> floatnumber   ::=  pointfloat | exponentfloat
>> pointfloat    ::=  [intpart] fraction | intpart "."
>> exponentfloat ::=  (intpart | pointfloat) exponent
>> intpart       ::=  digit+
>> fraction      ::=  "." digit+
>> exponent      ::=  ("e" | "E") ["+" | "-"] digit+
>> digit          ::=  "0"..."9"
> I don't see why the language spec should limit the wealth of number
> formats supported by float().

If it doesn't, there should be some other specification of what
is correct and what is not. It must not be unspecified.

> It is not uncommon for Asians and other non-Latin script users to
> use their own native script symbols for numbers. Just because these
> digits may look strange to someone doesn't mean that they are
> meaningless or should be discarded.

Then these users should speak up and indicate their need, or somebody
should speak up and confirm that there are users who actually want
'????.??' to denote 1234.56. To my knowledge, there is no writing
system in which '????.??e4' means 12345600.0.

> Please also remember that Python3 now allows Unicode names for
> identifiers for much the same reasons.

No no no. Addition of Unicode identifiers has a well-designed,
deliberate specification, with a PEP and all. The support for
non-ASCII digits in float appears to be ad-hoc, and not founded
on actual needs of actual users.

> Note that the support in float() (and the other numeric constructors)
> to work with Unicode code points was explicitly added when Unicode
> support was added to Python and has been available since Python 1.6.

That doesn't necessarily make it useful. Alexander's complaint is that
it makes Python unstable (i.e. changing as the UCD changes).

> It is not a bug by any definition of "bug"

Most certainly it is: the documentation is either underspecified,
or deviates from the implementation (when taking the most plausible
interpretation). This is the very definition of "bug".


From tjreedy at  Mon Nov 29 00:03:30 2010
From: tjreedy at (Terry Reedy)
Date: Sun, 28 Nov 2010 18:03:30 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
	<>	<>
Message-ID: <icun3u$tlq$>

On 11/28/2010 5:51 PM, Alexander Belopolsky wrote:

> The Language Spec (whatever it is) should not, but hopefully the
> Library Reference should.  If you follow
> link and
> the references therein, you'll end up with
> digit          ::=  "0"..."9"

So fix the doc for builtin float() and perhaps int().

Terry Jan Reedy

From alexander.belopolsky at  Mon Nov 29 00:05:56 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 18:05:56 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

+1 on all points below.

On Sun, Nov 28, 2010 at 6:03 PM, "Martin v. Löwis" <martin at> wrote:
>>> Now, one may wonder what precisely a "possibly signed floating point
>>> number" is, but most likely, this refers to
>>> floatnumber   ::=  pointfloat | exponentfloat
>>> pointfloat    ::=  [intpart] fraction | intpart "."
>>> exponentfloat ::=  (intpart | pointfloat) exponent
>>> intpart       ::=  digit+
>>> fraction      ::=  "." digit+
>>> exponent      ::=  ("e" | "E") ["+" | "-"] digit+
>>> digit          ::=  "0"..."9"
>> I don't see why the language spec should limit the wealth of number
>> formats supported by float().
> If it doesn't, there should be some other specification of what
> is correct and what is not. It must not be unspecified.
>> It is not uncommon for Asians and other non-Latin script users to
>> use their own native script symbols for numbers. Just because these
>> digits may look strange to someone doesn't mean that they are
>> meaningless or should be discarded.
> Then these users should speak up and indicate their need, or somebody
> should speak up and confirm that there are users who actually want
> '????.??' to denote 1234.56. To my knowledge, there is no writing
> system in which '????.??e4' means 12345600.0.
>> Please also remember that Python3 now allows Unicode names for
>> identifiers for much the same reasons.
> No no no. Addition of Unicode identifiers has a well-designed,
> deliberate specification, with a PEP and all. The support for
> non-ASCII digits in float appears to be ad-hoc, and not founded
> on actual needs of actual users.
>> Note that the support in float() (and the other numeric constructors)
>> to work with Unicode code points was explicitly added when Unicode
>> support was added to Python and has been available since Python 1.6.
> That doesn't necessarily make it useful. Alexander's complaint is that
> it makes Python unstable (i.e. changing as the UCD changes).
>> It is not a bug by any definition of "bug"
> Most certainly it is: the documentation is either underspecified,
> or deviates from the implementation (when taking the most plausible
> interpretation). This is the very definition of "bug".
> Regards,
> Martin

From martin at  Mon Nov 29 00:08:29 2010
From: martin at ("Martin v. Löwis")
Date: Mon, 29 Nov 2010 00:08:29 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>	<>	<>
Message-ID: <>

Am 29.11.2010 00:01, schrieb Alexander Belopolsky:
> On Sun, Nov 28, 2010 at 5:56 PM, "Martin v. Löwis" <martin at> wrote:
> ..
>>> This definition fails long before we get beyond 127-th code point:
>>>>>> float('infinity')
>>> inf
>> What do you infer from that? That the definition is wrong, or the code is wrong?
> The development version of the reference manual is more detailed, but
> as far as I can tell, it still defines digit as 0-9.

I wasn't asking about 0..9, but about "infinity". According to the
spec, it shouldn't accept that (and neither should it accept
'infinitY'). However, whether that's a spec bug or an implementation
bug - it seems like a minor issue to me (i.e. easily fixed).


From alexander.belopolsky at  Mon Nov 29 00:12:44 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 18:12:44 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Sun, Nov 28, 2010 at 6:03 PM, "Martin v. Löwis" <martin at> wrote:
>> Note that the support in float() (and the other numeric constructors)
>> to work with Unicode code points was explicitly added when Unicode
>> support was added to Python and has been available since Python 1.6.
> That doesn't necessarily make it useful. Alexander's complaint is that
> it makes Python unstable (i.e. changing as the UCD changes).

What makes it worse, is that while superficially, Unicode versions
follow the same X.Y.Z format as Python versions, the stability
promises are completely different.  For example, it appears that the
general category for the ZERO WIDTH SPACE was changed in Unicode
4.0.1.  I don't think a change affecting str.split(), int(), float()
and probably numerous other library functions would be acceptable in a
Python micro release.
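
The Unicode data an interpreter was built with can be inspected at run time. The sketch below assumes a Python 3 interpreter; `unicodedata.ucd_3_2_0` is the frozen Unicode 3.2.0 database that the standard library ships alongside the current one, which allows the general category of ZERO WIDTH SPACE to be compared across the two versions.

```python
import unicodedata

zwsp = "\u200b"                      # ZERO WIDTH SPACE
old_ucd = unicodedata.ucd_3_2_0      # frozen Unicode 3.2.0 database

print("bundled UCD version:", unicodedata.unidata_version)
print("frozen UCD version: ", old_ucd.unidata_version)

# General category of the same code point under each database.
current_category = unicodedata.category(zwsp)
old_category = old_ucd.category(zwsp)
print(current_category, old_category)
```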

From alexander.belopolsky at  Mon Nov 29 00:16:24 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 18:16:24 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Sun, Nov 28, 2010 at 6:08 PM, "Martin v. Löwis" <martin at> wrote:
> Am 29.11.2010 00:01, schrieb Alexander Belopolsky:
>> On Sun, Nov 28, 2010 at 5:56 PM, "Martin v. Löwis" <martin at> wrote:
>> ..
>>>> This definition fails long before we get beyond 127-th code point:
>>>>>>> float('infinity')
>>>> inf
>>> What do you infer from that? That the definition is wrong, or the code is wrong?
>> The development version of the reference manual is more detailed, but
>> as far as I can tell, it still defines digit as 0-9.
> I wasn't asking about 0..9, but about "infinity". According to the
> spec, it shouldn't accept that (and neither should it accept
> 'infinitY').

According to the link that I mentioned,

infinity       ::=  "Infinity" | "inf"

and "Case is not significant, so, for example, “inf”, “Inf”,
“INFINITY” and “iNfINity” are all acceptable spellings for positive
infinity."

I completely agree with your arguments and the reference manual has
been improved a lot in the recent years.

From martin at  Mon Nov 29 00:19:54 2010
From: martin at ("Martin v. Löwis")
Date: Mon, 29 Nov 2010 00:19:54 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>	<>	<>
Message-ID: <>

> What makes it worse, is that while superficially, Unicode versions
> follow the same X.Y.Z format as Python versions, the stability
> promises are completely different.  For example, it appears that the
> general category for the ZERO WIDTH SPACE was changed in Unicode
> 4.0.1.  I don't think a change affecting str.split(), int(), float()
> and probably numerous other library functions would be acceptable in a
> Python micro release.

Well, we managed to completely break Unicode normalization between
2.6.5 and 2.6.6, due to a bug.

You can see the Unicode Consortium's stability policy at

In a sense, this is stronger than Python's backwards compatibility
promises (which allow for certain incompatible changes to occur
over time, whereas Unicode makes promises about all future versions).


From benjamin at  Mon Nov 29 00:23:01 2010
From: benjamin at (Benjamin Peterson)
Date: Sun, 28 Nov 2010 17:23:01 -0600
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

2010/11/28 M.-A. Lemburg <mal at>:
> "Martin v. Löwis" wrote:
>>>>>>> float('????.??')
>>>> 1234.56
>> I think it's a bug that this works. The definition of the float builtin says
>> Convert a string or a number to floating point. If the argument is a
>> string, it must contain a possibly signed decimal or floating point
>> number, possibly embedded in whitespace. The argument may also be
>> '[+|-]nan' or '[+|-]inf'.
>> Now, one may wonder what precisely a "possibly signed floating point
>> number" is, but most likely, this refers to
>> floatnumber   ::=  pointfloat | exponentfloat
>> pointfloat    ::=  [intpart] fraction | intpart "."
>> exponentfloat ::=  (intpart | pointfloat) exponent
>> intpart       ::=  digit+
>> fraction      ::=  "." digit+
>> exponent      ::=  ("e" | "E") ["+" | "-"] digit+
>> digit          ::=  "0"..."9"
> I don't see why the language spec should limit the wealth of number
> formats supported by float().
> It is not uncommon for Asians and other non-Latin script users to
> use their own native script symbols for numbers. Just because these
> digits may look strange to someone doesn't mean that they are
> meaningless or should be discarded.

That's different. Python doesn't assign any semantic meaning to the
characters in identifiers. The non-latin support for numerals, though,
could change the meaning of a program dramatically and needs to be
well-specified. Whether int() should do this is debatable. I, for one,
think this kind of support belongs in the locale module.


From alexander.belopolsky at  Mon Nov 29 00:29:47 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 18:29:47 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Sun, Nov 28, 2010 at 6:19 PM, "Martin v. Löwis" <martin at> wrote:
> You can see the Unicode Consortium's stability policy at

From the link above:
As more experience is gathered in implementing the characters,
adjustments in the properties may become necessary. Examples of such
properties include, but are not limited to, the following:

> In a sense, this is stronger than Python's backwards compatibility
> promises (which allow for certain incompatible changes to occur
> over time, whereas Unicode makes promises about all future versions).

I would say it is *different* and should be taken into account when
tying language features to Unicode specifications. This was done in
PEP 3131.  Note that one of the stated objections was "Unicode is
young; its problems are not yet well understood and solved;"  (It is
still true.)

From martin at  Mon Nov 29 00:33:23 2010
From: martin at ("Martin v. Löwis")
Date: Mon, 29 Nov 2010 00:33:23 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <icumu6$tlq$>
References: <>	<>	<>
Message-ID: <>

>>>>>>> float('????.??')
>>>> 1234.56
> Even if this is somehow an accident or something that someone snuck in,
> I think it a good idea that *users* be able to input amounts with their
> native digits. That is different from requiring *programmers* to write
> literals with euro-ascii-digits

So one question is what kind of data float() is aimed at. I claim that
it is about "programmer" data, not "user" data. If it supported "user"
data, it probably would have to support "1,000" to denote 1e3 in the
U.S., and denote 1e0 in Germany. Our users are generally confused
on whether they should use the full stop or the comma as the decimal
separator.

As not even the locale-dependent issues are considered in float(),
it is clear to me that entering local numbers cannot possibly be
the objective of the function.

Instead, following a wide-spread Python convention, it is meant to be
the reverse of repr().
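
Both halves of this claim can be checked directly. The following is a sketch with arbitrarily chosen sample values; `accepts` is a hypothetical helper introduced only for the demonstration.

```python
# float() undoes repr() for floats (exact round trip on Python 3.1 and later).
values = [0.1, 1234.56, 1e-7, 2.0 ** 70, -3.5]
round_trips = [float(repr(v)) == v for v in values]


def accepts(text):
    """Return True if float() parses text without raising ValueError."""
    try:
        float(text)
    except ValueError:
        return False
    return True


print(round_trips)
print(accepts("1,000"), accepts(" 1e3 "))
```

The grouped form "1,000" is rejected outright rather than interpreted under either convention, which fits the reading that float() targets programmer data.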

Can you name a single person who actually wants to write '????.??'
as a number? I'm fairly skeptical that users of arabic-indic digits.

suggests that they would rather U+066B, i.e. '???????', which isn't
supported by Python.


From martin at  Mon Nov 29 00:40:31 2010
From: martin at ("Martin v. Löwis")
Date: Mon, 29 Nov 2010 00:40:31 +0100
Subject: [Python-Dev] PEP 384 final review
Message-ID: <>

I have now completed

Benjamin has volunteered to rule on this PEP.

Please comment with any changes you want to see, or speak in
favor or against this PEP.


From fuzzyman at  Mon Nov 29 00:44:50 2010
From: fuzzyman at (Michael Foord)
Date: Sun, 28 Nov 2010 23:44:50 +0000
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>	<icumu6$tlq$>
Message-ID: <>

On 28/11/2010 23:33, "Martin v. Löwis" wrote:
>>>>>>>> float('????.??')
>>>>> 1234.56
>> Even if this is somehow an accident or something that someone snuck in,
>> I think it a good idea that *users* be able to input amounts with their
>> native digits. That is different from requiring *programmers* to write
>> literals with euro-ascii-digits
> So one question is what kind of data float() is aimed at. I claim that
> it is about "programmer" data, not "user" data. If it supported "user"
> data, it probably would have to support "1,000" to denote 1e3 in the
> U.S., and denote 1e0 in Germany. Our users are generally confused
> on whether they should use the full stop or the comma as the decimal
> separator.
FWIW the C# equivalent is locale aware *unless* you pass in a specific 
culture.

If you're not aware that your code may be run on non-US computers this 
is a trap for the unwary. If you *are* aware then it is very useful.

An alternative overload allows you to specify the culture used to do the 


> As not even the locale-dependent issues are considered in float(),
> it is clear to me that entering local numbers cannot possibly be
> the objective of the function.
> Instead, following a wide-spread Python convention, it is meant to be
> the reverse of repr().
> Can you name a single person who actually wants to write '????.??'
> as a number? I'm fairly skeptical that users of arabic-indic digits.
> Instead,
> suggests that they would rather U+066B, i.e. '???????', which isn't
> supported by Python.
> Regards,
> Martin


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies (“BOGUS AGREEMENTS”) that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From alexander.belopolsky at  Mon Nov 29 00:56:00 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 18:56:00 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Sun, Nov 28, 2010 at 6:03 PM, "Martin v. Löwis" <martin at> wrote:
> No no no. Addition of Unicode identifiers has a well-designed,
> deliberate specification, with a PEP and all. The support for
> non-ASCII digits in float appears to be ad-hoc, and not founded
> on actual needs of actual users.

I wonder how carefully right-to-left scripts were considered when PEP
3131 was discussed.
Try the following on the python prompt:

>>> ?= int('???')
>>> ?

In my OSX Terminal window, entering ? flips the >>> prompt and the
session looks like this:

('???')int = ? <<<

From martin at  Mon Nov 29 00:59:12 2010
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 29 Nov 2010 00:59:12 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>	<icumu6$tlq$>
	<> <>
Message-ID: <>

> FWIW the C# equivalent is locale aware *unless* you pass in a specific
> culture.
> (System.Double.Parse):

That's not quite the equivalent of float(), I would say: this one
apparently is locale-aware, so it is more the equivalent of locale.atof.

The next question then is if it supports indo-arabic digits in any
locale (or more specifically in an arabic locale).


From solipsis at  Mon Nov 29 01:01:12 2010
From: solipsis at (Antoine Pitrou)
Date: Mon, 29 Nov 2010 01:01:12 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
References: <>
	<> <>
Message-ID: <>

On Sun, 28 Nov 2010 17:23:01 -0600
Benjamin Peterson <benjamin at> wrote:
> 2010/11/28 M.-A. Lemburg <mal at>:
> >
> >
> > "Martin v. Löwis" wrote:
> >>>>>>> float('١٢٣٤.٥٦')
> >>>> 1234.56
> >>
> >> I think it's a bug that this works. The definition of the float builtin says
> >>
> >> Convert a string or a number to floating point. If the argument is a
> >> string, it must contain a possibly signed decimal or floating point
> >> number, possibly embedded in whitespace. The argument may also be
> >> '[+|-]nan' or '[+|-]inf'.
> >>
> >> Now, one may wonder what precisely a "possibly signed floating point
> >> number" is, but most likely, this refers to
> >>
> >> floatnumber   ::=  pointfloat | exponentfloat
> >> pointfloat    ::=  [intpart] fraction | intpart "."
> >> exponentfloat ::=  (intpart | pointfloat) exponent
> >> intpart       ::=  digit+
> >> fraction      ::=  "." digit+
> >> exponent      ::=  ("e" | "E") ["+" | "-"] digit+
> >> digit         ::=  "0"..."9"
> >
> > I don't see why the language spec should limit the wealth of number
> > formats supported by float().
> >
> > It is not uncommon for Asians and other non-Latin script users to
> > use their own native script symbols for numbers. Just because these
> > digits may look strange to someone doesn't mean that they are
> > meaningless or should be discarded.
> That's different. Python doesn't assign any semantic meaning to the
> characters in identifiers. The non-latin support for numerals, though,
> could change the meaning of a program dramatically and needs to be
> well-specified. Whether int() should do this is debatable.

Perhaps int(), float(), Decimal() and friends could take an optional
parameter indicating whether non-ascii digits are considered. It would
then satisfy all parties.
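
A minimal sketch of what such an opt-in could look like, written as a
wrapper rather than as a change to the builtins. The name parse_float and
its allow_non_ascii_digits parameter are hypothetical; nothing like them
exists in the standard library.

```python
def parse_float(text, allow_non_ascii_digits=False):
    """Hypothetical wrapper around float() with an opt-in for non-ASCII digits."""
    if not allow_non_ascii_digits:
        try:
            text.encode("ascii")
        except UnicodeEncodeError:
            # Reject anything outside ASCII unless the caller opted in.
            raise ValueError("non-ASCII characters in %r" % (text,)) from None
    return float(text)


arabic_indic = "\u0661\u0662\u0663\u0664.\u0665\u0666"  # Arabic-Indic 1234, '.', 56
print(parse_float("1234.56"))
print(parse_float(arabic_indic, allow_non_ascii_digits=True))
```

With the default argument the wrapper raises ValueError for the
Arabic-Indic string, so existing callers would have to opt in explicitly.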


From martin at  Mon Nov 29 01:02:18 2010
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 29 Nov 2010 01:02:18 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>	<>	<>
Message-ID: <>

Am 29.11.2010 00:56, schrieb Alexander Belopolsky:
> On Sun, Nov 28, 2010 at 6:03 PM, "Martin v. Löwis" <martin at> wrote:
> ..
>> No no no. Addition of Unicode identifiers has a well-designed,
>> deliberate specification, with a PEP and all. The support for
>> non-ASCII digits in float appears to be ad-hoc, and not founded
>> on actual needs of actual users.
> I wonder how carefully right-to-left scripts were considered when PEP
> 3131 was discussed.

IIRC, some Hebrew users have spoken in favor of the PEP, despite the
obvious difficulties it would create. I may misremember, but I think
someone pointed out that they had these difficulties all the time,
and that it wasn't really a burden.

Unicode specifies that one should always use "logical order" in memory,
and that's what the PEP does. Rendering is then a tool issue.


From alexander.belopolsky at  Mon Nov 29 01:04:53 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 19:04:53 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<icumu6$tlq$> <>
	<> <>
Message-ID: <>

On Sun, Nov 28, 2010 at 6:59 PM, "Martin v. Löwis" <martin at> wrote:
> The next question then is if it supports indo-arabic digits in any
> locale (or more specifically in an arabic locale).

And once you answered that question, does it support Devanagari or
Bengali digits?  And if so, an arbitrary mix of those and indo-arabic
digits?

From alexander.belopolsky at  Mon Nov 29 01:25:37 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 19:25:37 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Sun, Nov 28, 2010 at 7:01 PM, Antoine Pitrou <solipsis at> wrote:
>> That's different. Python doesn't assign any semantic meaning to the
>> characters in identifiers. The non-latin support for numerals, though,
>> could change the meaning of a program dramatically and needs to be
>> well-specified. Whether int() should do this is debatable.
> Perhaps int(), float(), Decimal() and friends could take an optional
> parameter indicating whether non-ascii digits are considered. It would
> then satisfy all parties.

What parties?  I don't think anyone has claimed to actually have used
non-ASCII digits with float(). Of course it is fun that Python can
process Bengali numerals, but so would be allowing Roman numerals.
There is a reason why after careful consideration, PEP 313 was
ultimately rejected.

BTW, it is common in Russia to specify months using roman numerals.
Maybe we should consider accepting '1.IV.2011'.

From fuzzyman at  Mon Nov 29 01:25:40 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 29 Nov 2010 00:25:40 +0000
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>	<icumu6$tlq$>
	<> <>
Message-ID: <>

On 28/11/2010 23:59, "Martin v. Löwis" wrote:
>> FWIW the C# equivalent is locale aware *unless* you pass in a specific
>> culture.
>> (System.Double.Parse):
> That's not quite the equivalent of float(), I would say: this one
> apparently is locale-aware, so it is more the equivalent of locale.atof.

Right. It is *the* standard way of getting a float from a string though, 
whereas in Python we have two depending on whether or not you want to be 
locale aware. The standard way in C# is locale aware. To be non-locale 
aware you pass in a specific culture or number format.
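
For comparison, a small sketch of the two Python entry points mentioned
here: float(), which never consults the locale, and locale.atof(), which
follows the current LC_NUMERIC settings. The helper atof_in_locale is a
hypothetical name, and "de_DE.UTF-8" is only an assumption about which
locales happen to be installed; the helper returns None when it is missing.

```python
import locale


def atof_in_locale(text, locale_name):
    """Parse text with locale.atof() under locale_name; None if it is not installed."""
    saved = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        try:
            locale.setlocale(locale.LC_NUMERIC, locale_name)
        except locale.Error:
            return None
        return locale.atof(text)
    finally:
        # Always restore the caller's numeric locale.
        locale.setlocale(locale.LC_NUMERIC, saved)


print(float("1234.56"))                          # float() ignores the locale
print(atof_in_locale("1234.56", "C"))            # the "C" locale is always available
print(atof_in_locale("1234,56", "de_DE.UTF-8"))  # None if that locale is missing
```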

> The next question then is if it supports indo-arabic digits in any
> locale (or more specifically in an arabic locale).

I don't think so actually. The float parse formatting rules are defined 
like this:


(From )

integral-digits, fractional-digits and exponential-digits are all 
defined as "A series of digits ranging from 0 to 9". Arguably this is
not conclusive. In fact the NumberFormatInfo class seems to hint that
it may be otherwise:

See DigitSubstitution on that page. I would have to try it to be sure 
and I don't have a Windows VM in convenient reach right now.

All the best,


> Regards,
> Martin



From fuzzyman at  Mon Nov 29 01:28:59 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 29 Nov 2010 00:28:59 +0000
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>	<icumu6$tlq$>	<>	<>	<>
Message-ID: <>

On 29/11/2010 00:04, Alexander Belopolsky wrote:
> On Sun, Nov 28, 2010 at 6:59 PM, "Martin v. Löwis"<martin at>  wrote:
> ..
>> The next question then is if it supports indo-arabic digits in any
>> locale (or more specifically in an arabic locale).
> And once you answered that question, does it support Devanagari or
> Bengali digits?  And if so, an arbitrary mix of those and indo-arabic
> digits?
Haha. Go and try it yourself. :-)




From solipsis at  Mon Nov 29 01:29:40 2010
From: solipsis at (Antoine Pitrou)
Date: Mon, 29 Nov 2010 01:29:40 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <1290990580.8242.2.camel@localhost.localdomain>

> > Perhaps int(), float(), Decimal() and friends could take an optional
> > parameter indicating whether non-ascii digits are considered. It would
> > then satisfy all parties.
> What parties?  I don't think anyone has claimed to actually have used
> non-ASCII digits with float().

Have you done a poll of all Python 3 users?

> Of course it is fun that Python can
> process Bengali numerals, but so would be allowing Roman numerals.
> There is a reason why after careful consideration, PEP 313 was
> ultimately rejected.

That's mostly irrelevant. This feature exists and someone, somewhere,
may be using it. We normally don't remove stuff without deprecation.


From ncoghlan at  Mon Nov 29 01:48:51 2010
From: ncoghlan at (Nick Coghlan)
Date: Mon, 29 Nov 2010 10:48:51 +1000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Mon, Nov 29, 2010 at 2:28 AM, Michael Foord
<fuzzyman at> wrote:
> For wrapping mutable types I'm tempted to say YAGNI. For the standard
> library wrapping integers meets almost all our use-cases except for one
> float. (At work we have a decimal constant as it happens.) Perhaps we could
> require immutable types for groups but allow arbitrary values for individual
> named values?

Whereas my opinion is that "immutable vs mutable" is such a blurry
distinction that we shouldn't try to make it at the lowest level.
Would it be possible to name frozenset instances? Tuples? How about
objects that are conceptually immutable, but don't close all the
loopholes allowing you to mutate them? (e.g. Decimal, Fraction)

Better to design a named value API that doesn't care about mutability,
and then leave questions of reverse mappings from values back to names
to the grouping API level. At that level, it would be trivial (and
natural) to limit names to referencing Hashable values so that a
reverse lookup table would be easy to implement. For standard library
purposes, we could even reasonably provide an int-only grouping API,
since the main use case is almost certainly to be in managing
translation of OS-level integer constants to named values.
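
A rough sketch of that split, assuming two hypothetical classes
(NamedValue and NamedGroup; neither is a proposed or existing stdlib API).
The value wrapper accepts anything, while the grouping level insists on
hashable values so that it can keep a reverse lookup table. The errno-style
numbers are only illustrative.

```python
class NamedValue:
    """Hypothetical wrapper pairing a name with an arbitrary value."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __repr__(self):
        return "%s=%r" % (self.name, self.value)


class NamedGroup:
    """Hypothetical grouping API: names map to hashable values and back."""

    def __init__(self, **values):
        self._by_name = {}
        self._by_value = {}
        for name, value in values.items():
            hash(value)  # raises TypeError for unhashable values such as dict
            named = NamedValue(name, value)
            self._by_name[name] = named
            self._by_value[value] = named

    def __getitem__(self, name):
        return self._by_name[name]

    def lookup(self, value):
        """Reverse lookup: return the NamedValue registered for value."""
        return self._by_value[value]


baz = NamedValue("BAZ", dict(a=1, b=2, c=3))  # a mutable value is fine at this level
errors = NamedGroup(ENOENT=2, EACCES=13)      # int-only group, as for OS constants
print(repr(errors.lookup(13)))
```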


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ben+python at  Mon Nov 29 01:55:33 2010
From: ben+python at (Ben Finney)
Date: Mon, 29 Nov 2010 11:55:33 +1100
Subject: [Python-Dev] Python and the Unicode Character Database
References: <>
	<> <>
Message-ID: <>

Alexander Belopolsky <alexander.belopolsky at> writes:

> On Sun, Nov 28, 2010 at 7:01 PM, Antoine Pitrou <solipsis at> wrote:
> > Perhaps int(), float(), Decimal() and friends could take an optional
> > parameter indicating whether non-ascii digits are considered. It
> > would then satisfy all parties.
> What parties? I don't think anyone has claimed to actually have used
> non-ASCII digits with float().

Rather, it has been pointed out that there is an unknown amount of
existing code which does that. You're not going to know how much or how
little from this forum.

> Of course it is fun that Python can process Bengali numerals, but so
> would be allowing Roman numerals. There is a reason why after careful
> consideration, PEP 313 was ultimately rejected.

Rejecting a proposed *new* capability is a different matter from
disabling an *existing* capability which works in existing Python
releases.

 \       "Following fashion and the status quo is easy. Thinking about |
  `\        your users' lives and creating something practical is much |
_o__)                               harder." --Ryan Singer, 2008-07-09 |
Ben Finney

From fuzzyman at  Mon Nov 29 01:57:27 2010
From: fuzzyman at (Michael Foord)
Date: Mon, 29 Nov 2010 00:57:27 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 29/11/2010 00:48, Nick Coghlan wrote:
> On Mon, Nov 29, 2010 at 2:28 AM, Michael Foord
> <fuzzyman at>  wrote:
>> For wrapping mutable types I'm tempted to say YAGNI. For the standard
>> library wrapping integers meets almost all our use-cases except for one
>> float. (At work we have a decimal constant as it happens.) Perhaps we could
>> require immutable types for groups but allow arbitrary values for individual
>> named values?
> Whereas my opinion is that "immutable vs mutable" is such a blurry
> distinction that we shouldn't try to make it at the lowest level.
> Would it be possible to name frozenset instances? Tuples? How about
> objects that are conceptually immutable, but don't close all the
> loopholes allowing you to mutate them? (e.g. Decimal, Fraction)
> Better to design a named value API that doesn't care about mutability,
> and then leave questions of reverse mappings from values back to names
> to the grouping API level. At that level, it would be trivial (and
> natural) to limit names to referencing Hashable values so that a
> reverse lookup table would be easy to implement. For standard library
> purposes, we could even reasonably provide an int-only grouping API,
> since the main use case is almost certainly to be in managing
> translation of OS-level integer constants to named values.

Sounds reasonable to me.


> Cheers,
> Nick.



From tjreedy at  Mon Nov 29 02:00:56 2010
From: tjreedy at (Terry Reedy)
Date: Sun, 28 Nov 2010 20:00:56 -0500
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <>
Message-ID: <icuu05$mt6$>

On 11/28/2010 6:40 PM, "Martin v. Löwis" wrote:
> I have now completed

The current text contains several error messages like:

"System Message: WARNING/2 (pep-0384.txt, line 194)
Bullet list ends without a blank line; unexpected unindent."

Terry Jan Reedy

From steve at  Mon Nov 29 01:14:31 2010
From: steve at (Steven D'Aprano)
Date: Mon, 29 Nov 2010 11:14:31 +1100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
Message-ID: <>

Martin v. Löwis wrote:
>>>>>> float('١٢٣٤.٥٦')
>>> 1234.56
> I think it's a bug that this works. The definition of the float builtin says

I think that's a documentation bug rather than a coding bug. If Python 
wishes to limit the digits allowed in numeric *literals* to ASCII 0...9, 
that's one thing, but I think that the digits allowed in numeric 
*strings* should allow the full range of digits supported by the Unicode
standard.

The former ensures that literals in code are always readable; the latter
allows users to enter numbers in their own number system. How could that 
be a bad thing?
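
The literal/string distinction can be checked directly. The sketch below
assumes a current CPython 3 and uses escapes so that the source itself
stays ASCII; the variable names are only illustrative.

```python
import unicodedata

# ARABIC-INDIC DIGIT ONE..FOUR, an ASCII '.', then FIVE and SIX
arabic_indic = "\u0661\u0662\u0663\u0664.\u0665\u0666"

# Each non-ASCII digit carries a decimal value in the Unicode Character Database.
values = [unicodedata.decimal(ch) for ch in arabic_indic if ch != "."]
print(values)

# As a *string*, the text is accepted by float().
print(float(arabic_indic))

# As a *literal* in source code, the same characters are rejected.
try:
    compile("x = " + arabic_indic, "<example>", "exec")
    literal_accepted = True
except SyntaxError:
    literal_accepted = False
print(literal_accepted)
```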


From rob.cliffe at  Sun Nov 28 02:07:08 2010
From: rob.cliffe at (Rob Cliffe)
Date: Sun, 28 Nov 2010 01:07:08 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 28/11/2010 21:23, Greg Ewing wrote:
> Rob Cliffe wrote:
>> But couldn't they be presented to the Python programmer as a single 
>> type, with the implementation details hidden "under the hood"?
> Not in CPython, because tuple items are kept in the same block
> of memory as the object header. Because CPython can't move
> objects, this means that the size of the tuple must be known
> when the object is created.
But when a frozen list a.k.a. tuple would be created - either directly, 
or by setting a list's mutable flag to False which would really turn it 
into a tuple - the size *would* be known.  And since the object would 
now be immutable, there would be no requirement for its size to change.  
(My idea doesn't require additional functionality, just a different API.)
Rob Cliffe

From alexander.belopolsky at  Mon Nov 29 02:24:24 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 20:24:24 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Sun, Nov 28, 2010 at 7:55 PM, Ben Finney <ben+python at> wrote:
>> Of course it is fun that Python can process Bengali numerals, but so
>> would be allowing Roman numerals. There is a reason why after careful
>> consideration, PEP 313 was ultimately rejected.
> Rejecting a proposed *new* capability is a different matter from
> disabling an *existing* capability which works in existing Python
> releases.

Was this capability ever documented?  It does not feel like a
deliberate feature.  If it was, '\N{ARABIC DECIMAL SEPARATOR}' would
be accepted in arabic-indic notation.   It feels more like a CPython
implementation detail similar to say:

>>> int('10') is 10
True
>>> int('10000') is 10000
False

Note that the underlying PyUnicode_EncodeDecimal() function is
described in the unicodeobject.h header file as follows:

/* --- Decimal Encoder ---------------------------------------------------- */

/* Takes a Unicode string holding a decimal value and writes it into
   an output buffer using standard ASCII digit codes.
  The encoder converts whitespace to ' ', decimal characters to their
   corresponding ASCII digit and all other Latin-1 characters except
   \0 as-is. Characters outside this range (Unicode ordinals 1-256)
   are treated as errors. This includes embedded NULL bytes.

So the support for non-ASCII digits is accidental and should be
treated as a bug.
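
The point about the decimal separator can be reproduced with a short
check. This is a sketch assuming CPython 3; try_float is a hypothetical
helper that only turns the exception into None.

```python
digits_ascii_point = "\u0661\u0662\u0663\u0664.\u0665\u0666"        # Arabic-Indic digits, ASCII '.'
digits_arabic_point = "\u0661\u0662\u0663\u0664\u066b\u0665\u0666"  # same digits, U+066B separator


def try_float(text):
    """Return float(text), or None if float() rejects the string."""
    try:
        return float(text)
    except ValueError:
        return None


print(try_float(digits_ascii_point))   # digits are translated, '.' is kept
print(try_float(digits_arabic_point))  # ARABIC DECIMAL SEPARATOR is not accepted
```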

From ben+python at  Mon Nov 29 02:25:56 2010
From: ben+python at (Ben Finney)
Date: Mon, 29 Nov 2010 12:25:56 +1100
Subject: [Python-Dev] Python and the Unicode Character Database
References: <>
	<> <>
Message-ID: <>

Steven D'Aprano <steve at> writes:

> If Python wishes to limit the digits allowed in numeric *literals* to
> ASCII 0...9, that's one thing, but I think that the digits allowed in
> numeric *strings* should allow the full range of digits supported by
> the Unicode standard.

I assume you specifically mean that the numeric class constructors, like
‘int’ and ‘float’, should parse their input string such that any
character Unicode defines as a numeric digit is mapped to the
corresponding digit.

That sounds attractive, but it raises questions about mixed notations,
mixing digits from different writing systems, and probably other
questions I haven't thought of. It's not something to make a simple
yes-or-no-decision on now, IMO.

This sounds best suited to a PEP, which someone who cares enough can
champion in ‘python-ideas’.
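
To make the mixed-notation question concrete, here is a sketch of what
current CPython does with digits drawn from three scripts. The helper
scripts_used is hypothetical and deliberately crude: it tags each digit by
the first word of its Unicode name, which is not a real script property.

```python
import unicodedata

# DEVANAGARI DIGIT ONE, BENGALI DIGIT TWO, ARABIC-INDIC DIGIT THREE
mixed = "\u0967\u09e8\u0663"

for ch in mixed:
    print(unicodedata.name(ch), unicodedata.decimal(ch))


def scripts_used(text):
    """Crude script tag: the first word of each decimal digit's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in text if ch.isdecimal()}


# Each character is mapped to its decimal value independently, so the mix parses.
print(int(mixed))
print(sorted(scripts_used(mixed)))
```

A stricter parser could refuse any string for which scripts_used() reports
more than one entry; whether that is the right rule is exactly the kind of
question a PEP would have to settle.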

 \      "The manager has personally passed all the water served here." |
  `\                                                 --hotel, Acapulco |
_o__)                                                                  |
Ben Finney

From steve at  Mon Nov 29 00:43:59 2010
From: steve at (Steven D'Aprano)
Date: Mon, 29 Nov 2010 10:43:59 +1100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

Alexander Belopolsky wrote:
> Two recently reported issues brought into light the fact that Python
> language definition is closely tied to character properties maintained
> by the Unicode Consortium. [1,2]  For example, when Python switches to
> Unicode 6.0.0 (planned for the upcoming 3.2 release), we will gain two
> additional characters that Python can use in identifiers. [3]

Why do you consider this a problem? It would be a problem if previously 
valid identifiers *stopped* being valid, but not the other way around.

> Of course, the likelihood is low that this change will affect any
> user, but the change in str.isspace() reported in [1] is likely to
> cause some trouble:

Looking at the thread here:

I interpret it as indicating that Python's isspace() has been buggy for 
many years, and is only now being fixed. It's always unfortunate when 
people rely on bugs, but I'm not sure we should be promising to support 
bug-for-bug compatibility from one version to the next :)

> While we have little choice but to follow UCD in defining
> str.isidentifier(), I think Python can promise users more stability in
> what it treats as space or as a digit in its builtins.   For example,
> I don't think that supporting
>>>> float('١٢٣٤.٥٦')
> 1234.56
> is more important than to assure users that once their program
> accepted some text as a number, they can assume that the text is

Seems like a pretty foolish assumption, if you ask me, pretty much akin 
to assuming that if string.isalpha() returns true that string is ASCII.

Support for non-Arabic numerals in number strings goes back to at least 
Python 2.4:

[steve at sylar ~]$ python2.4
Python 2.4.6 (#1, Mar 30 2009, 10:08:01)
[GCC 4.1.2 20070925 (Red Hat 4.1.2-27)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> float(u'١٢٣٤.٥٦')

The fact that this is (apparently) only being raised now means that it 
isn't actually a problem in real life. I'd even say that it's a feature, 
and that if Python didn't support non-Arabic numerals, it should.


From alexander.belopolsky at  Mon Nov 29 03:32:15 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 21:32:15 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 28, 2010 at 6:43 PM, Steven D'Aprano <steve at> wrote:
>> is more important than to assure users that once their program
>> accepted some text as a number, they can assume that the text is
> Seems like a pretty foolish assumption, if you ask me, pretty much akin to
> assuming that if string.isalpha() returns true that string is ASCII.

It is not to 99.9% of Python users whose code is written for 2.x.
Their strings are byte strings and string.isdigit() does imply ASCII
even if string.isalpha() does not in many locales.
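
In Python 3 terms the difference looks roughly like this. The sketch
assumes Python 3.7 or later for str.isascii(); is_ascii_digits is a
hypothetical helper name.

```python
ascii_digits = "1234"
arabic_indic = "\u0661\u0662\u0663\u0664"  # Arabic-Indic digits 1-4


def is_ascii_digits(text):
    """True only for non-empty strings made up of the characters 0-9."""
    return text.isascii() and text.isdigit()


# On str, isdigit() alone does not imply ASCII.
print(ascii_digits.isdigit(), arabic_indic.isdigit())
print(is_ascii_digits(ascii_digits), is_ascii_digits(arabic_indic))

# On bytes, isdigit() only ever matches the ASCII digits.
print(b"1234".isdigit(), arabic_indic.encode("utf-8").isdigit())
```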

> The fact that this is (apparently) only being raised now means that it isn't
> actually a problem in real life. I'd even say that it's a feature, and that
> if Python didn't support non-Arabic numerals, it should.

I raised this problem because I found a bug that is related to this
feature.  The bug is also a regression from 2.x.

In 2.7:

>>> float(u'1234\xa1')
ValueError: invalid literal for float(): 1234?

The last character is lost, but the error message is still meaningful.

In 3.x, however:

>>> float('1234\xa1')


While investigating this issue I found that by the time the string
gets to the number parser (_Py_dg_strtod), all non-ascii characters
are dropped by PyUnicode_EncodeDecimal(), so the parser cannot produce
a meaningful diagnostic.

Of course, PyUnicode_EncodeDecimal() can be fixed by making it pass
non-ascii chars through as UTF-8 bytes, but I was wondering if
preserving the ability to parse exotic numerals was worth the effort.

From rrr at  Mon Nov 29 04:03:39 2010
From: rrr at (Ron Adam)
Date: Sun, 28 Nov 2010 21:03:39 -0600
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>
	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 11/27/2010 04:51 AM, Nick Coghlan wrote:
> x = named_value("FOO", 1)
> y = named_value("BAR", "Hello World!")
> z = named_value("BAZ", dict(a=1, b=2, c=3))
> print(x, y, z, sep="\n")
> print("\n".join(map(repr, (x, y, z))))
> print("\n".join(map(str, map(type, (x, y, z)))))
> set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw())
> print("\n".join(map(repr, (foo, bar, baz))))
> print(type(x) is type(foo), type(y) is type(bar), type(z) is type(baz))
> ==========================================================================
> # Session output for the last 6 lines
>>>> >>>  print(x, y, z, sep="\n")
> 1
> Hello World!
> {'a': 1, 'c': 3, 'b': 2}
>>>> >>>  print("\n".join(map(repr, (x, y, z))))
> FOO=1
> BAR='Hello World!'
> BAZ={'a': 1, 'c': 3, 'b': 2}

This reminds me of Python annotations, which seem like an already
forgotten new feature.  Maybe they can help with this?

It does associate additional info to names and creates a nice dictionary to 

 >>> def name_values( FOO: 1,
 ...                  BAR: "Hello World!",
 ...                  BAZ: dict(a=1, b=2, c=3) ):
 ...   return FOO, BAR, BAZ
 ...
 >>> name_values(1,2,3)
(1, 2, 3)
 >>> name_values.__annotations__
{'BAR': 'Hello World!', 'FOO': 1, 'BAZ': {'a': 1, 'c': 3, 'b': 2}}


From stephen at  Mon Nov 29 04:39:32 2010
From: stephen at (Stephen J. Turnbull)
Date: Mon, 29 Nov 2010 12:39:32 +0900
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

M.-A. Lemburg writes:

 > It is not uncommon for Asians and other non-Latin script users to
 > use their own native script symbols for numbers.

Japanese don't, in computational or scientific work where float()
would be used.  Japanese numerals are used for dates and for certain
felicitous ages (and even there so-called "Arabic" numerals are
perfectly acceptable).  Otherwise, it's all ASCII (although it might
be "full-width" compatibility variants).

 > Please also remember that Python3 now allows Unicode names for
 > identifiers for much the same reasons.

I don't think it's the same reason, not for Japanese, anyway.

I agree that Python should make it easy for the programmer to get
numerical values of native numeric strings, but it's not at all clear
to me that there is any point to having float() recognize them by
default.

From ncoghlan at  Mon Nov 29 04:58:05 2010
From: ncoghlan at (Nick Coghlan)
Date: Mon, 29 Nov 2010 13:58:05 +1000
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Mon, Nov 29, 2010 at 1:39 PM, Stephen J. Turnbull <stephen at> wrote:
> I agree that Python should make it easy for the programmer to get
> numerical values of native numeric strings, but it's not at all clear
> to me that there is any point to having float() recognize them by
> default.

Indeed, as someone else suggested earlier in the thread, supporting
non-ASCII digits sounds more like a job for the locale module than for
the builtin types.

Deprecating non-ASCII support in the latter, while ensuring it is
properly supported in the former sounds like a better way forward than
maintaining the status quo (starting in 3.3 though, with the first
beta just around the corner, we don't want to be monkeying with this
in 3.2)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From martin at  Mon Nov 29 08:18:59 2010
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 29 Nov 2010 08:18:59 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
	<>	<>	<>
Message-ID: <>

> Perhaps int(), float(), Decimal() and friends could take an optional
> parameter indicating whether non-ascii digits are considered. It would
> then satisfy all parties.

Not really. I still would want to see what the actual requirement is:
i.e. do any users actually have the desire to have these digits
accepted, yet the alternative decimal points rejected?


From martin at  Mon Nov 29 08:22:46 2010
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 29 Nov 2010 08:22:46 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

> The former ensures that literals in code are always readable; the latter
> allows users to enter numbers in their own number system. How could that
> be a bad thing?

It's YAGNI, feature bloat. It gives the illusion of supporting something
that actually isn't supported very well (namely, parsing local number
strings). I claim that there is no meaningful application
of this feature.


From martin at  Mon Nov 29 08:25:19 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 29 Nov 2010 08:25:19 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <1290990580.8242.2.camel@localhost.localdomain>
References: <>	<>
	<>	<>	<>	<>	<>
Message-ID: <>

> That's mostly irrelevant. This feature exists and someone, somewhere,
> may be using it. We normally don't remove stuff without deprecation.

Sure: it should be deprecated before being removed.


From amauryfa at  Mon Nov 29 08:55:13 2010
From: amauryfa at (Amaury Forgeot d'Arc)
Date: Mon, 29 Nov 2010 08:55:13 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <>
Message-ID: <>

2010/11/29 "Martin v. Löwis" <martin at>

> I have now completed

was structseq.h considered?
IMO it could be made PEP384-compliant with two additions that would replace
two non-compliant functions:

- A new function to create types, since PyStructSequence_InitType
is supposed to work on an uninitialized static variable:
    PyTypeObject *PyStructSequence_NewType(PyStructSequence_Desc *desc);

- PyStructSequence_SetItem(), similar to the
macro PyStructSequence_SET_ITEM; the PyStructSequence structure should be
hidden.

Amaury Forgeot d'Arc

From martin at  Mon Nov 29 09:09:14 2010
From: martin at ("Martin v. Löwis")
Date: Mon, 29 Nov 2010 09:09:14 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <>
Message-ID: <>

>     I have now completed
> was structseq.h considered?

No, it wasn't - unfortunately, it still doesn't get included when
including Python.h. I'll add it.

> IMO it could be made PEP384-compliant with two additions that would
> replace two non-compliant functions:
> - A new function to create types, since PyStructSequence_InitType
> is supposed to work on an uninitialized static variable:
>     PyTypeObject *PyStructSequence_NewType(PyStructSequence_Desc *desc);
> - PyStructSequence_SetItem(), similar to the
> macro PyStructSequence_SET_ITEM; the PyStructSequence structure should
> be hidden.

Sounds good.


From mal at  Mon Nov 29 09:35:05 2010
From: mal at (M.-A. Lemburg)
Date: Mon, 29 Nov 2010 09:35:05 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
	<>	<>
Message-ID: <>

Alexander Belopolsky wrote:
> On Sun, Nov 28, 2010 at 5:42 PM, M.-A. Lemburg <mal at> wrote:
> ..
>> I don't see why the language spec should limit the wealth of number
>> formats supported by float().
> The Language Spec (whatever it is) should not, but hopefully the
> Library Reference should.  If you follow
> link and
> the references therein, you'll end up with

... the language spec again :-)

> digit          ::=  "0"..."9"

That's obviously a bug in the documentation, since the Python 2.7 docs
don't mention any such relationship to the language spec:

Marc-Andre Lemburg

Professional Python Services directly from the Source  (#1, Nov 29 2010)
>>> Python/Zope Consulting and Support ...
>>> mxODBC.Zope.Database.Adapter ...   
>>> mxODBC, mxDateTime, mxTextTools ...

::: Try our new mxODBC.Connect Python Database Interface for free ! :::: Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From g.brandl at  Mon Nov 29 09:36:56 2010
From: g.brandl at (Georg Brandl)
Date: Mon, 29 Nov 2010 09:36:56 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <>	<>
Message-ID: <icvoqb$ea6$>

On 29.11.2010 09:09, "Martin v. Löwis" wrote:
>>     I have now completed
>> was structseq.h considered?
> No, it wasn't - unfortunately, it still doesn't get included when
> including Python.h. I'll add it.

Would 3.2 be a good time to finally include it?  All of its macros and
declarations are named PyStructSequence*, so there shouldn't be a
name clash concern.


From g.brandl at  Mon Nov 29 09:52:19 2010
From: g.brandl at (Georg Brandl)
Date: Mon, 29 Nov 2010 09:52:19 +0100
Subject: [Python-Dev] [Preview] Comments and change proposals on
In-Reply-To: <>
References: <icjsal$eqk$>	<>	<>	<ics17i$che$>
Message-ID: <icvpmv$ikr$>

On 28.11.2010 00:58, Steven D'Aprano wrote:
> Georg Brandl wrote:
>> On 27.11.2010 23:11, Steven D'Aprano wrote:
>>> I wasn't able to find a comment bubble that contained anything, so I 
>>> don't know what sort of information you expect them to contain -- every 
>>> one I tried said "0 comments".
>> Maybe you should have tried the page I recommended as a demo, and where Nick
>> made his comments? :)
> Aha! I never would have guessed that the bubbles are clickable -- I 
> thought you just moused-over them and they showed static comments put 
> there by the developers, part of the documentation itself. I didn't 
> realise that it was for users to add spam^W comments to the page. With 
> that perspective, I need to rethink.
> Yes, I failed to fully read the instructions you sent, or understand 
> them. That's what users do -- they don't read your instructions, and 
> they misunderstand them. If your UI isn't easily discoverable, users 
> will not be able to use it, and will be frustrated and annoyed. The user 
> is always right, even when they're doing it wrong *wink*

That's right, of course.  I'm really coming to the conclusion that a text
link that "looks like" a link, i.e. is underlined, will give a better UI
experience (since we cannot put notes "click bubble to comment" everywhere).

>>> But it seems to me that comments are superfluous, if not actively harmful:
>> (I've not read anything about harmful below.  Was that just FUD?)
> Lowering accessibility to parts of the documentation is what I was 
> talking about when I said "actively harmful". But now that I have better 
> understanding of what the comment system is actually for, I have to rethink.



From doko at  Mon Nov 29 11:24:22 2010
From: doko at (Matthias Klose)
Date: Mon, 29 Nov 2010 11:24:22 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <>
Message-ID: <>

On 29.11.2010 00:40, "Martin v. Löwis" wrote:
> I have now completed
> Benjamin has volunteered to rule on this PEP.
> Please comment with any changes you want to see, or speak in
> favor or against this PEP.

I looked at a diff with r84330 from the py3k branch.

Extensions built with Py_LIMITED_API have the python version encoded in its
name.  Which abi name should be used for these extensions?

  - The m and u modifiers in the abi name are complementary (?)
  - debug builds and Py_LIMITED_API are incompatible (?) and therefore
    the current name should be used?
  - For posix systems the implementation is currently part of the abi name,
    are Py_LIMITED_API extensions supposed to be compatible with e.g. PyPy?
    Should the LIMITED_API abi name include the implementation string?
  - Should the distutils support for LIMITED_API be part of the pep, or
    be implemented later?

In favour of the pep.


From mal at  Mon Nov 29 12:02:57 2010
From: mal at (M.-A. Lemburg)
Date: Mon, 29 Nov 2010 12:02:57 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
	<>	<>	<>
Message-ID: <>

Nick Coghlan wrote:
> On Mon, Nov 29, 2010 at 1:39 PM, Stephen J. Turnbull <stephen at> wrote:
>> I agree that Python should make it easy for the programmer to get
>> numerical values of native numeric strings, but it's not at all clear
>> to me that there is any point to having float() recognize them by
>> default.
> Indeed, as someone else suggested earlier in the thread, supporting
> non-ASCII digits sounds more like a job for the locale module than for
> the builtin types.
> Deprecating non-ASCII support in the latter, while ensuring it is
> properly supported in the former sounds like a better way forward than
> maintaining the status quo (starting in 3.3 though, with the first
> beta just around the corner, we don't want to be monkeying with this
> in 3.2)

Since when do we only support certain Unicode features in specific
locales ?

If we would go down that road, we would also have to disable other
Unicode features based on locale, e.g. whether to apply non-ASCII
case mappings, what to consider whitespace, etc.

We don't do that for a good reason: Unicode is supposed to be
universal and not limited to a single locale.

Marc-Andre Lemburg


From sylvain.thenault at  Mon Nov 29 12:53:11 2010
From: sylvain.thenault at (Sylvain Thénault)
Date: Mon, 29 Nov 2010 12:53:11 +0100
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
In-Reply-To: <>
References: <201011251530.23947.emile.anclin@logilab>
Message-ID: <>

On 25 November 11:22, Ron Adam wrote:
> On 11/25/2010 08:30 AM, Emile Anclin wrote:
> >
> >hello,
> >
> >working on Pylint, we have a lot of voluntarily corrupted files to test
> >Pylint behavior; for instance
> >
> >$ cat /home/emile/var/pylint/test/input/
> ># -*- coding: IBO-8859-1 -*-
> >""" check correct unknown encoding declaration
> >"""
> >
> >__revision__ = '????'
> >
> >
> >and we try to find that module :
> >find_module('func_unknown_encoding', None). But python3 raises SyntaxError
> >in that case ; it didn't raise SyntaxError on python2 nor does so on our
> >func_nonascii_noencoding and func_wrong_encoding modules (with obvious
> >names)
> >
> >Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36)
> >[GCC 4.3.4] on linux2
> >Type "help", "copyright", "credits" or "license" for more information.
> >>>>from imp import find_module
> >>>>find_module('func_unknown_encoding', None)
> >Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >SyntaxError: encoding problem: with BOM
> I don't think there is a clear reason by design.  Also try importing
> the same modules directly and noting the differences in the errors
> you get.

IMO the point is that find_module trying to read the content of the file
at all can be considered a bug, no? Though it seems to do this only for
encoding detection or the like, since find_module doesn't choke on
a module containing another kind of syntax error.

So the question is, should we deal with this in pylint/astng, or can we expect
this to be fixed at some point?
Sylvain Thénault                               LOGILAB, Paris (France)
Python, Debian, and Agile methods training:
Custom software development:
CubicWeb, the semantic web framework:

From ncoghlan at  Mon Nov 29 13:43:26 2010
From: ncoghlan at (Nick Coghlan)
Date: Mon, 29 Nov 2010 22:43:26 +1000
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Mon, Nov 29, 2010 at 9:02 PM, M.-A. Lemburg <mal at> wrote:
> If we would go down that road, we would also have to disable other
> Unicode features based on locale, e.g. whether to apply non-ASCII
> case mappings, what to consider whitespace, etc.
> We don't do that for a good reason: Unicode is supposed to be
> universal and not limited to a single locale.

Because parsing numbers is about more than just the characters used
for the individual digits. There are additional semantics associated
with digit ordering (for any number) and decimal separators and
exponential notation (for floating point numbers) and those vary by
locale. We deliberately chose to make the builtin numeric parsers
unaware of all of those things, and assuming that we can simply parse
other digits as if they were their ASCII equivalents and otherwise
assume a C locale seems questionable.

If the existing semantics can be adequately defined, documented and
defended, then retaining them would be fine. However, the language
reference needs to define the behaviour properly so that other
implementations know what they need to support and what can be chalked
up as being just an implementation accident of CPython. (As a point in
the plus column, both decimal.Decimal and fractions.Fraction were able
to handle the '????.??' example in a manner consistent with the int
and float handling)
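
A minimal sketch of the consistency described above, assuming a current
CPython 3 interpreter. The Arabic-Indic digits are an assumption chosen for
illustration, since the example string did not survive the archive's
encoding:

```python
from decimal import Decimal
from fractions import Fraction

# Arabic-Indic digits spelling "1234.56" (assumed sample; the original
# example string in the thread was lost to the archive's encoding).
digits = "\u0661\u0662\u0663\u0664.\u0665\u0666"

as_float = float(digits)              # builtin float parser
as_decimal = Decimal(digits)          # decimal module
as_fraction = Fraction(digits)        # fractions module
as_int = int(digits.split(".")[0])    # integer part only

print(as_float, as_decimal, as_fraction, as_int)
```

All four constructors read the digits by their decimal value; none of them
consults the locale for the separator, which is always ".".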


Nick Coghlan | ncoghlan at | Brisbane, Australia

From merwok at  Mon Nov 29 14:14:30 2010
From: merwok at (Éric Araujo)
Date: Mon, 29 Nov 2010 14:14:30 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <>
Message-ID: <>


> Please comment with any changes you want to see, or speak in
> favor or against this PEP.

How to get a diff between py3k and this branch?


From doko at  Mon Nov 29 14:37:33 2010
From: doko at (Matthias Klose)
Date: Mon, 29 Nov 2010 14:37:33 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 29.11.2010 14:14, Éric Araujo wrote:
> Hello,
>> Please comment with any changes you want to see, or speak in
>> favor or against this PEP.
> How to get a diff between py3k and this branch?

I used
svn diff svn:// at 84330 

From ncoghlan at  Mon Nov 29 14:58:50 2010
From: ncoghlan at (Nick Coghlan)
Date: Mon, 29 Nov 2010 23:58:50 +1000
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <> <>
Message-ID: <>

On Mon, Nov 29, 2010 at 11:37 PM, Matthias Klose <doko at> wrote:
> On 29.11.2010 14:14, Éric Araujo wrote:
>> Hello,
>>> Please comment with any changes you want to see, or speak in
>>> favor or against this PEP.
>> How to get a diff between py3k and this branch?
> I used
> svn diff svn:// at 84330
> svn://

I had to use the full read/write svn+ssh:pythondev at
repository URLs to get it to give me a diff. The http read only URLs
didn't work (no diff returned, just "svn: OPTIONS of
'': 200 OK
("), and the bare svn protocol isn't enabled on

Since directory diffs don't appear to be enabled on the
ViewVC instance, it would probably be a good idea to put this up on
Rietveld so people can more easily see the details of what has been
changed on the branch to date. If nobody beats me to it, I'll put one
up in the morning.


Nick Coghlan | ncoghlan at | Brisbane, Australia

From ncoghlan at  Mon Nov 29 15:07:32 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 30 Nov 2010 00:07:32 +1000
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 29, 2010 at 9:40 AM, "Martin v. Löwis" <martin at> wrote:
> I have now completed
> Benjamin has volunteered to rule on this PEP.
> Please comment with any changes you want to see, or speak in
> favor or against this PEP.

This is probably an issue independent of the PEP, but there appear to
be a *lot* of exposed typedefs for various type slots and other
function signatures that don't start with the Py prefix (i.e. getter,
setter, unaryfunc and friends). Python.h shouldn't be leaking
unprefixed names like that. We certainly shouldn't be enshrining them
in the stable ABI without adding prefixes first.


Nick Coghlan | ncoghlan at | Brisbane, Australia

From solipsis at  Mon Nov 29 15:19:07 2010
From: solipsis at (Antoine Pitrou)
Date: Mon, 29 Nov 2010 15:19:07 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
References: <>
	<> <>
Message-ID: <>

On Mon, 29 Nov 2010 13:58:05 +1000
Nick Coghlan <ncoghlan at> wrote:
> On Mon, Nov 29, 2010 at 1:39 PM, Stephen J. Turnbull <stephen at> wrote:
> > I agree that Python should make it easy for the programmer to get
> > numerical values of native numeric strings, but it's not at all clear
> > to me that there is any point to having float() recognize them by
> > default.
> Indeed, as someone else suggested earlier in the thread, supporting
> non-ASCII digits sounds more like a job for the locale module than for
> the builtin types.

Not sure, really. For example, "\d" in a regular expression will match
all Unicode digits, unless you pass the re.ASCII flag. The C locale
mechanism generally does a poor job of supporting what MS seems to call
"culture-specific" characteristics.
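
A short sketch of the regular-expression behaviour mentioned above, assuming
Python 3 str patterns. The sample text (Arabic-Indic digits followed by ASCII
digits) is made up for illustration:

```python
import re

# Arabic-Indic "123", a space, then ASCII "456" (assumed sample text).
text = "\u0661\u0662\u0663 456"

# By default, \d in a str pattern matches any Unicode decimal digit.
unicode_matches = re.findall(r"\d+", text)

# With re.ASCII, \d is restricted to the characters 0-9.
ascii_matches = re.findall(r"\d+", text, flags=re.ASCII)

print(unicode_matches, ascii_matches)
```

So the re module already treats ASCII-only matching as an opt-in flag rather
than the default.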



From solipsis at  Mon Nov 29 15:22:24 2010
From: solipsis at (Antoine Pitrou)
Date: Mon, 29 Nov 2010 15:22:24 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
References: <>
Message-ID: <>

On Sun, 28 Nov 2010 21:32:15 -0500
Alexander Belopolsky <alexander.belopolsky at> wrote:
> On Sun, Nov 28, 2010 at 6:43 PM, Steven D'Aprano <steve at> wrote:
> ..
> >> is more important than to assure users that once their program
> >> accepted some text as a number, they can assume that the text is
> >> ASCII.
> >
> > Seems like a pretty foolish assumption, if you ask me, pretty much akin to
> > assuming that if string.isalpha() returns true that string is ASCII.
> >
> It is not to 99.9% of Python users whose code is written for 2.x.
> Their strings are byte strings and string.isdigit() does imply ASCII
> even if string.isalpha() does not in many locales.

We are not talking about string.isdigit(), we are talking about the
float() constructor when given a unicode string.  Constructing a float
from a unicode string is certainly a common thing, even in 2.x.



From foom at  Mon Nov 29 15:15:12 2010
From: foom at (James Y Knight)
Date: Mon, 29 Nov 2010 09:15:12 -0500
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <> <>
Message-ID: <>

On Nov 29, 2010, at 8:58 AM, Nick Coghlan wrote:

> The http read only URLs
> didn't work (no diff returned, just "svn: OPTIONS of
> '': 200 OK
> ("), 

That was the wrong url: you should've used

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

From mal at  Mon Nov 29 16:19:19 2010
From: mal at (M.-A. Lemburg)
Date: Mon, 29 Nov 2010 16:19:19 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
	<>	<>	<>	<>	<>
Message-ID: <>

Nick Coghlan wrote:
> On Mon, Nov 29, 2010 at 9:02 PM, M.-A. Lemburg <mal at> wrote:
>> If we would go down that road, we would also have to disable other
>> Unicode features based on locale, e.g. whether to apply non-ASCII
>> case mappings, what to consider whitespace, etc.
>> We don't do that for a good reason: Unicode is supposed to be
>> universal and not limited to a single locale.
> Because parsing numbers is about more than just the characters used
> for the individual digits. There are additional semantics associated
> with digit ordering (for any number) and decimal separators and
> exponential notation (for floating point numbers) and those vary by
> locale. We deliberately chose to make the builtin numeric parsers
> unaware of all of those things, and assuming that we can simply parse
> other digits as if they were their ASCII equivalents and otherwise
> assume a C locale seems questionable.

Sure, and those additional semantics are locale dependent, even
between ASCII-only locales. However, that does not apply to the
basic building blocks, the decimal digits themselves.

> If the existing semantics can be adequately defined, documented and
> defended, then retaining them would be fine. However, the language
> reference needs to define the behaviour properly so that other
> implementations know what they need to support and what can be chalked
> up as being just an implementation accident of CPython. (As a point in
> the plus column, both decimal.Decimal and fractions.Fraction were able
> to handle the '????.??' example in a manner consistent with the int
> and float handling)

The support is built into the C API, so there's not really much
surprise there.

Regarding documentation, we'd just have to add that numbers may
be made up of Unicode code points in the category "Nd".

See, section
4.6 for details....

"""
Decimal digits form a large subcategory of numbers consisting of those digits that can be
used to form decimal-radix numbers. They include script-specific digits, but exclude
characters such as Roman numerals and Greek acrophonic numerals. (Note that <1, 5> = 15 =
fifteen, but <I, V> = IV = four.) Decimal digits also exclude the compatibility subscript or
superscript digits to prevent simplistic parsers from misinterpreting their values in context.
"""

int(), float() and long() (in Python2) are such simplistic
parsers.
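
A minimal sketch of the "Nd" rule, assuming CPython 3 and the standard
unicodedata module. The sample characters are assumptions picked to show one
accepted and one rejected case:

```python
import unicodedata

# Arabic-Indic digits one to four (U+0661..U+0664); assumed sample.
sample = "\u0661\u0662\u0663\u0664"

# Every character is in general category "Nd" (Number, decimal digit).
categories = [unicodedata.category(ch) for ch in sample]

# int() reads each Nd character by its decimal value.
value = int(sample)

# Superscript two (U+00B2) is category "No", not "Nd", so int() rejects it.
superscript_category = unicodedata.category("\u00b2")
try:
    int("\u00b2")
    superscript_accepted = True
except ValueError:
    superscript_accepted = False

print(categories, value, superscript_category, superscript_accepted)
```

This matches the quoted passage: script-specific decimal digits are
accepted, while compatibility superscript digits are not.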

Marc-Andre Lemburg


From ziade.tarek at  Mon Nov 29 16:59:42 2010
From: ziade.tarek at (Tarek Ziadé)
Date: Mon, 29 Nov 2010 16:59:42 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Nov 29, 2010 at 11:24 AM, Matthias Klose <doko at> wrote:
> On 29.11.2010 00:40, "Martin v. Löwis" wrote:
>> I have now completed
>> Benjamin has volunteered to rule on this PEP.
>> Please comment with any changes you want to see, or speak in
>> favor or against this PEP.
> I looked at a diff with r84330 from the py3k branch.
> Extensions built with Py_LIMITED_API have the python version encoded in its
> name.  Which abi name should be used for these extensions?
>   - Should the distutils support for LIMITED_API be part of the pep, or
>     be implemented later?

In any case, it has to be implemented in Distutils2, not in Distutils.
Distutils is frozen and just in maintenance mode.

Once Distutils2 final is released (it's currently in alpha), it will
be installable from 2.4 to 3.x and can provide this feature.

For Python itself we can backport the feature in its, until
Distutils2 is back in the stdlib.

> In favour of the pep.


> Matthias

Tarek Ziadé |

From alexander.belopolsky at  Mon Nov 29 17:07:03 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Mon, 29 Nov 2010 11:07:03 -0500
Subject: [Python-Dev] [Preview] Comments and change proposals on
In-Reply-To: <icvpmv$ikr$>
References: <icjsal$eqk$>
	<> <ics17i$che$>
	<> <icvpmv$ikr$>
Message-ID: <>

On Mon, Nov 29, 2010 at 3:52 AM, Georg Brandl <g.brandl at> wrote:
>> Yes, I failed to fully read the instructions you sent, or understand
>> them. That's what users do -- they don't read your instructions, and
>> they misunderstand them. If your UI isn't easily discoverable, users
>> will not be able to use it, and will be frustrated and annoyed. The user
>> is always right, even when they're doing it wrong *wink*
> That's right, of course.  I'm really coming to the conclusion that a text
> link that "looks like" a link, i.e. is underlined, will give a better UI
> experience (since we cannot put notes "click bubble to comment" everywhere).
Please don't make comment bubbles more visible.  Doing so will only
decrease signal to noise ratio.  I think a little bit of a learning
barrier is a good thing: it will keep down the number of "Bart was
here" comments.

From alexander.belopolsky at  Mon Nov 29 19:09:58 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Mon, 29 Nov 2010 13:09:58 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, Nov 29, 2010 at 2:22 AM, "Martin v. Löwis" <martin at> wrote:
>> The former ensures that literals in code are always readable; the latter
>> allows users to enter numbers in their own number system. How could that
>> be a bad thing?
> It's YAGNI, feature bloat. It gives the illusion of supporting something
> that actually isn't supported very well (namely, parsing local number
> strings). I claim that there is no meaningful application
> of this feature.

Speaking of YAGNI, does anyone want to defend

>>> complex('????.??j')


Especially given that we reject complex('1234.56i'):

From solipsis at  Mon Nov 29 19:33:02 2010
From: solipsis at (Antoine Pitrou)
Date: Mon, 29 Nov 2010 19:33:02 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, 29 Nov 2010 08:22:46 +0100
"Martin v. Löwis" <martin at> wrote:
> > The former ensures that literals in code are always readable; the latter
> > allows users to enter numbers in their own number system. How could that
> > be a bad thing?
> It's YAGNI, feature bloat. It gives the illusion of supporting something
> that actually isn't supported very well (namely, parsing local number
> strings). I claim that there is no meaningful application
> of this feature.

Still, if it's not detrimental and it's not difficult to support,
then why do you care? You aren't even maintaining that part of the code.

I don't think "remove feature bloat" is part of our development goals
or practices. Given the diversity of our user base, such removal should
be done carefully and only for serious reasons.



From mal at  Mon Nov 29 19:59:57 2010
From: mal at (M.-A. Lemburg)
Date: Mon, 29 Nov 2010 19:59:57 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
	<>	<>
Message-ID: <>

Alexander Belopolsky wrote:
> On Mon, Nov 29, 2010 at 2:22 AM, "Martin v. Löwis" <martin at> wrote:
>>> The former ensures that literals in code are always readable; the latter
>>> allows users to enter numbers in their own number system. How could that
>>> be a bad thing?
>> It's YAGNI, feature bloat. It gives the illusion of supporting something
>> that actually isn't supported very well (namely, parsing local number
>> strings). I claim that there is no meaningful application
>> of this feature.

This is not about parsing local number strings, it's about
parsing number strings represented using different scripts -
besides en-US is a locale as well, ye know :-)

> Speaking of YAGNI, does anyone want to defend
>>>> complex('????.??j')
> 1234.56j
> ?

Yes. The same arguments apply.

Just because ASCII-proponents may have a hard time reading such
literals, doesn't mean that script users have the same trouble.

> Especially given that we reject complex('1234.56i'):

We've had that discussion long before we had Unicode in Python.
The main reason was that 'i' looked too similar to 1 in a number
of fonts which is why it was rejected for Python source code.

However, I don't see any reason why we shouldn't accept both i and
j for complex(), since the input to that constructor
doesn't have to originate in Python source code.
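
As a sketch of what accepting both suffixes could look like at the call
site: parse_complex below is a hypothetical helper, not an existing or
proposed API, and it simply rewrites a trailing "i" to "j" before calling
the builtin. It assumes CPython 3.

```python
def parse_complex(text):
    """Hypothetical helper: accept a trailing 'i' as well as 'j'."""
    stripped = text.strip()
    if stripped.endswith(("i", "I")):
        # Rewrite the imaginary-unit suffix to the one complex() understands.
        stripped = stripped[:-1] + "j"
    return complex(stripped)


# The builtin itself only recognises the 'j' suffix.
try:
    complex("1234.56i")
    builtin_accepts_i = True
except ValueError:
    builtin_accepts_i = False

print(builtin_accepts_i, parse_complex("1234.56i"), parse_complex("1234.56j"))
```

The wrapper leaves source-code literals untouched; it only widens what is
accepted from external input.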

Marc-Andre Lemburg


From brett at  Mon Nov 29 20:22:22 2010
From: brett at (Brett Cannon)
Date: Mon, 29 Nov 2010 11:22:22 -0800
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
In-Reply-To: <>
References: <201011251530.23947.emile.anclin@logilab>
Message-ID: <>

On Mon, Nov 29, 2010 at 03:53, Sylvain Thénault
<sylvain.thenault at> wrote:
> On 25 November 11:22, Ron Adam wrote:
>> On 11/25/2010 08:30 AM, Emile Anclin wrote:
>> >
>> >hello,
>> >
>> >working on Pylint, we have a lot of voluntarily corrupted files to test
>> >Pylint behavior; for instance
>> >
>> >$ cat /home/emile/var/pylint/test/input/
>> ># -*- coding: IBO-8859-1 -*-
>> >""" check correct unknown encoding declaration
>> >"""
>> >
>> >__revision__ = '????'
>> >
>> >
>> >and we try to find that module :
>> >find_module('func_unknown_encoding', None). But python3 raises SyntaxError
>> >in that case ; it didn't raise SyntaxError on python2 nor does so on our
>> >func_nonascii_noencoding and func_wrong_encoding modules (with obvious
>> >names)
>> >
>> >Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36)
>> >[GCC 4.3.4] on linux2
>> >Type "help", "copyright", "credits" or "license" for more information.
>> >>>>from imp import find_module
>> >>>>find_module('func_unknown_encoding', None)
>> >Traceback (most recent call last):
>> >   File "<stdin>", line 1, in <module>
>> >SyntaxError: encoding problem: with BOM
>> I don't think there is a clear reason by design.  Also try importing
>> the same modules directly and noting the differences in the errors
>> you get.
> IMO the point is that find_module trying to read the content of the file
> at all can be considered a bug, no? Though it seems to do this only for
> encoding detection or the like, since find_module doesn't choke on
> a module containing another kind of syntax error.
> So the question is, should we deal with this in pylint/astng, or can we expect
> this to be fixed at some point?

Considering these semantics changed between Python 2 and 3 w/o a
discernible benefit (I would consider it a negative as finding a
module should not be impacted by syntactic correctness; the full act
of importing should be the only thing that cares about that), I would
consider it a bug that should be filed.
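
A rough sketch of the separation argued for here, with locating a module
kept apart from decoding its source. It does not use imp.find_module (the
imp module is gone from recent Python versions); importlib.machinery and
tokenize stand in for it, and the temporary file is an assumed reproduction
of the Pylint test input:

```python
import importlib.machinery
import os
import tempfile
import tokenize

with tempfile.TemporaryDirectory() as tmpdir:
    # Assumed reproduction of the test input: a bogus coding declaration.
    path = os.path.join(tmpdir, "func_unknown_encoding.py")
    with open(path, "wb") as f:
        f.write(b"# -*- coding: IBO-8859-1 -*-\n__revision__ = ''\n")

    # Locating the module only needs the directory listing, not the source.
    spec = importlib.machinery.PathFinder.find_spec(
        "func_unknown_encoding", [tmpdir]
    )
    found = (
        spec is not None
        and os.path.basename(spec.origin) == "func_unknown_encoding.py"
    )

    # Decoding the source is the step where the bad declaration surfaces.
    try:
        with open(path, "rb") as f:
            tokenize.detect_encoding(f.readline)
        encoding_error = None
    except SyntaxError as exc:
        encoding_error = str(exc)

print(found, encoding_error)
```

A tool such as Pylint could locate the file first and then report the
encoding problem itself, rather than having the lookup fail.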

From tjreedy at  Mon Nov 29 20:23:28 2010
From: tjreedy at (Terry Reedy)
Date: Mon, 29 Nov 2010 14:23:28 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <id0ujb$d5s$>

On 11/29/2010 10:19 AM, M.-A. Lemburg wrote:
> Nick Coghlan wrote:
>> On Mon, Nov 29, 2010 at 9:02 PM, M.-A. Lemburg <mal at> wrote:
>>> If we would go down that road, we would also have to disable other
>>> Unicode features based on locale, e.g. whether to apply non-ASCII
>>> case mappings, what to consider whitespace, etc.
>>> We don't do that for a good reason: Unicode is supposed to be
>>> universal and not limited to a single locale.
>> Because parsing numbers is about more than just the characters used
>> for the individual digits. There are additional semantics associated
>> with digit ordering (for any number) and decimal separators and
>> exponential notation (for floating point numbers) and those vary by
>> locale. We deliberately chose to make the builtin numeric parsers
>> unaware of all of those things, and assuming that we can simply parse
>> other digits as if they were their ASCII equivalents and otherwise
>> assume a C locale seems questionable.
> Sure, and those additional semantics are locale dependent, even
> between ASCII-only locales. However, that does not apply to the
> basic building blocks, the decimal digits themselves.
>> If the existing semantics can be adequately defined, documented and
>> defended, then retaining them would be fine. However, the language
>> reference needs to define the behaviour properly so that other
>> implementations know what they need to support and what can be chalked
>> up as being just an implementation accident of CPython. (As a point in
>> the plus column, both decimal.Decimal and fractions.Fraction were able
>> to handle the '????.??' example in a manner consistent with the int
>> and float handling)
> The support is built into the C API, so there's not really much
> surprise there.
> Regarding documentation, we'd just have to add that numbers may
> be made up of Unicode code points in the category "Nd".
> See, section
> 4.6 for details....
> """
> Decimal digits form a large subcategory of numbers consisting of those digits that can be
> used to form decimal-radix numbers. They include script-specific digits, but exclude
> characters such as Roman numerals and Greek acrophonic numerals. (Note that <1, 5> = 15 =
> fifteen, but <I, V> = IV = four.) Decimal digits also exclude the compatibility subscript or
> superscript digits to prevent simplistic parsers from misinterpreting their values in context.
> """
> int(), float() and long() (in Python2) are such simplistic
> parsers.

Since you are the knowledgeable advocate of the current behavior, perhaps
you could open an issue and propose a doc patch, even if not .rst formatted.

Terry Jan Reedy

From alexander.belopolsky at  Mon Nov 29 20:38:46 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Mon, 29 Nov 2010 14:38:46 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, Nov 29, 2010 at 1:33 PM, Antoine Pitrou <solipsis at> wrote:
> On Mon, 29 Nov 2010 08:22:46 +0100
> "Martin v. Löwis" <martin at> wrote:
>> > The former ensures that literals in code are always readable; the latter
>> > allows users to enter numbers in their own number system. How could that
>> > be a bad thing?
>> It's YAGNI, feature bloat. It gives the illusion of supporting something
>> that actually isn't supported very well (namely, parsing local number
>> strings). I claim that there is no meaningful application
>> of this feature.
> Still, if it's not detrimental and it's not difficult to support,
> then why do you care?

It is difficult to support.  A fix for issue10557 would be much
simpler if we did not support non-European digits.  I now added a
patch that handles non-ascii digits, so you can see what's involved.
Note that when the Unicode Consortium inevitably adds more Nd characters
to the non-BMP planes, we will have to add surrogate pair support to
this code.

In any case, there is little we can do about it in 3.2 other than fix
bugs like issue10557 without breaking currently valid code, so I
created a separate issue to continue this debate in context of 3.3.

Now, I would like to bring this thread back to its subject.  Given
that UCD is now affecting the language definition and the standard
library behavior, how should changes to UCD be handled?

- Should Python documentation refer to the specific version of Unicode
that it supports?

Current documentation refers to old versions.  Should version be
updated or removed to imply the latest?

- How should UCD updates be handled during the language moratorium?

During PEP 3003 discussion, it was suggested to handle it on a case by
case basis, but I don't see discussion of the upgrade to 6.0.0 in PEP
3003.  Should this upgrade be backported to 2.7?

- How specific should library reference manual be in defining methods
affected by UCD such as str.upper()?

- What is an acceptable level of variation between Python
implementations?  For example, if '\UXXXXXXXX'.isalpha() returns true
in one implementation, can it return false in another?  Note that even
CPython narrow and wide builds are presently not consistent in this


From alexander.belopolsky at  Mon Nov 29 20:43:14 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Mon, 29 Nov 2010 14:43:14 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <id0ujb$d5s$>
References: <>
	<> <>
	<> <id0ujb$d5s$>
Message-ID: <>

On Mon, Nov 29, 2010 at 2:23 PM, Terry Reedy <tjreedy at> wrote:
> Since you are the knowledgeable advocate of the current behavior, perhaps you
> could open an issue and propose a doc patch, even if not .rst formatted.

I am not an advocate of the current behavior, but an issue for doc patches is at

From martin at  Mon Nov 29 20:38:59 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 29 Nov 2010 20:38:59 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <>	<>
Message-ID: <>

>>  - Should the distutils support for LIMITED_API be part of the pep, or
>>   be implemented later?
> In any case, it has to be implemented in Distutils2, not in Distutils.
> Distutils is frozen and just in maintenance mode.

I think it's too late for that. PEP 3149 is accepted, and it does
specify a change to distutils (namely, the abi= parameter). ISTM that
an approved PEP will override the distutils code freeze.

> For Python itself we can backport the feature in its, until
> Distutils2 is back in the stdlib

This won't be for python itself, but for extension modules.


From ziade.tarek at  Mon Nov 29 20:45:35 2010
From: ziade.tarek at (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Mon, 29 Nov 2010 20:45:35 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <> <>
Message-ID: <>

2010/11/29 "Martin v. Löwis" <martin at>:
>>> - Should the distutils support for LIMITED_API be part of the pep, or
>>>   be implemented later?
>> In any case, it has to be implemented in Distutils2, not in Distutils.
>> Distutils is frozen and just in maintenance mode.
> I think it's too late for that. PEP 3149 is accepted, and it does
> specify a change to distutils (namely, the abi= parameter). ISTM that
> an approved PEP will override the distutils code freeze.

Having an accepted PEP does not imply that it should be implemented in
the standard library.

For instance PEP 345 and PEP 376 are accepted but implemented in Distutils2.

It's also a:

- good opportunity to boost Distutils2 adoption
- way to get feedback from people for that abi= option and have the
chance to correct any design issue before d2 is added to the stdlib

>> For Python itself we can backport the feature in its, until
>> Distutils2 is back in the stdlib
> This won't be for python itself, but for extension modules.


> Regards,
> Martin

Tarek Ziadé |

From rrr at  Mon Nov 29 21:21:07 2010
From: rrr at (Ron Adam)
Date: Mon, 29 Nov 2010 14:21:07 -0600
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
In-Reply-To: <>
References: <201011251530.23947.emile.anclin@logilab>	<>	<>
Message-ID: <id11vk$9d$>

On 11/29/2010 01:22 PM, Brett Cannon wrote:
> On Mon, Nov 29, 2010 at 03:53, Sylvain Thénault
> <sylvain.thenault at>  wrote:
>> On 25 November 11:22, Ron Adam wrote:
>>> On 11/25/2010 08:30 AM, Emile Anclin wrote:
>>>> hello,
>>>> working on Pylint, we have a lot of deliberately corrupted files to test
>>>> Pylint behavior; for instance
>>>> $ cat /home/emile/var/pylint/test/input/
>>>> # -*- coding: IBO-8859-1 -*-
>>>> """ check correct unknown encoding declaration
>>>> """
>>>> __revision__ = '????'
>>>> and we try to find that module :
>>>> find_module('func_unknown_encoding', None). But python3 raises SyntaxError
>>>> in that case ; it didn't raise SyntaxError on python2 nor does so on our
>>>> func_nonascii_noencoding and func_wrong_encoding modules (with obvious
>>>> names)
>>>> Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36)
>>>> [GCC 4.3.4] on linux2
>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>> >>> from imp import find_module
>>>> >>> find_module('func_unknown_encoding', None)
>>>> Traceback (most recent call last):
>>>>    File "<stdin>", line 1, in <module>
>>>> SyntaxError: encoding problem: with BOM
>>> I don't think there is a clear reason by design.  Also try importing
>>> the same modules directly and noting the differences in the errors
>>> you get.
>> IMO the point is that we can consider as a bug the fact that find_module
>> tries to somewhat read the content of the file, no? Though it seems to be
>> doing this only for encoding detection or the like, since find_module doesn't
>> choke on a module containing another kind of syntax error.
>> So the question is, should we deal with this in pylint/astng, or can we expect
>> this to be fixed at some point?
> Considering these semantics changed between Python 2 and 3 w/o a
> discernable benefit (I would consider it a negative as finding a
> module should not be impacted by syntactic correctness; the full act
> of importing should be the only thing that cares about that), I would
> consider it a bug that should be filed.

imp.find_module() returns an open file io object, and its 
output feeds directly into imp.load_module().

 >>> imp.find_module('pydoc')
(<_io.TextIOWrapper name=4 encoding='utf-8'>, 
'/usr/local/lib/python3.2/', ('.py', 'U', 1))

So I think imp.find_module() is supposed to be used when you *do* want 
to do the full act of importing and not for just finding out if or where 
module xyz exists.
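
For readers who only need to locate a module, later Python releases (3.4 and newer) provide importlib.util.find_spec(), which did not exist when this thread was written. The following is a minimal sketch under that assumption. locate_module is a hypothetical helper name, and the temporary file only mimics the test input quoted above. For a top-level module name, find_spec() looks the file up on the path without decoding its source, so the bogus coding line does not raise SyntaxError here.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def locate_module(name, directory):
    """Return the source path of top-level module `name` found in `directory`.

    Returns None if the module cannot be found. The module is not imported.
    """
    sys.path.insert(0, directory)
    try:
        importlib.invalidate_caches()
        spec = importlib.util.find_spec(name)
    finally:
        sys.path.remove(directory)
    return None if spec is None else spec.origin


# Build a file with an unknown encoding declaration, like the Pylint test input.
tmp_dir = tempfile.mkdtemp()
bad_path = os.path.join(tmp_dir, "func_unknown_encoding.py")
with open(bad_path, "w", encoding="ascii") as handle:
    handle.write("# -*- coding: IBO-8859-1 -*-\n__revision__ = 'x'\n")

found = locate_module("func_unknown_encoding", tmp_dir)
missing = locate_module("no_such_module_for_this_sketch", tmp_dir)
print(found, missing)
```

Note that for dotted names find_spec() imports the parent packages first, so this sketch only covers the top-level case.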


From martin at  Mon Nov 29 21:22:02 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 29 Nov 2010 21:22:02 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <> <>
Message-ID: <>

> Extensions built with Py_LIMITED_API have the python version encoded in
> its name.  Which abi name should be used for these extensions?

PEP 3149, IIUC, says it should be "abi3". I don't understand what that
means, though (with respect to, say, distutils)

>  - The m and u modifiers in the abi name are complimentary (?)

See above: none of these will be used. Of course, it is possible to name
an ABI-conforming extension with the regular ABI name of the
Python release.

>  - For posix systems the implementation is currently part of the abi name,
>    are Py_LIMITED_API extensions supposed to be compatible with e.g. PyPy?

That's a choice that PyPy needs to make, of course, but Amaury has
indicated that they are interested in doing so.

>    Should the LIMITED_API abi name include the implementation string?
>  - Should the distutils support for LIMITED_API be part of the pep, or
>    be implemented later?

Depends on what support you want. Currently, all you need to do is to
define Py_LIMITED_API to the preprocessor - this is something that is
already supported in distutils.

If you want the support suggested in PEP 3149 (specifying abi=3),
it should certainly be implemented in Python 3.2, despite the distutils


From martin at  Mon Nov 29 21:36:46 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 29 Nov 2010 21:36:46 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <>
Message-ID: <>

> This is probably an issue independent of the PEP but there appear to
> be a *lot* of exposed typedefs for various type slots and other
> function signatures that don't start with the Py prefix (i.e. getter,
> setter, unaryfunc and friends).

It's indeed independent: the names don't actually affect the ABI, but
the API. Changing them is possible later without risking binary

>  Python.h shouldn't be leaking
> unprefixed names like that. We certainly shouldn't be enshrining them
> in the stable ABI without adding prefixes first.

The stable ABI isn't actually enshrining them - what gets enshrined is
the value of the typedefs, not their names.

I don't mind renaming them, though. I see a number of different cases:

- struct names. I don't see a problem to have
  "typedef struct PyFoo PyFoo"
  I vaguely recall that there had been compiler problems with that
  construct at some point, but to my knowledge, they are past, and
  this is actually both well-formed C and well-formed C++.
- function pointer type names
- "various" other types

For the struct types, in particular for the ones which already have a
typedef, I think renaming them should be possible right away.
Applications that break should be able to use the typedef instead,
and continue to work with older releases.

For the function pointer type names, caution is necessary. We cannot
remove them, since it would break a lot of code. I also think that
some smart naming scheme would be desirable that makes the names all
sound right, yet allows easy mapping from the existing types. Once
such a scheme is added, we should have a graceful deprecation procedure,
such as:
- release A: add typedefs in addition to existing pointer types,
  deprecate pointer types in documentation
- release B>A: make the old names somehow conditional (e.g. put them
  all into a header file rename3.h, or some such)
- release C>B: remove rename3.h

For the rest, I think many of them are considered internal (of
course, they shouldn't appear in the ABI then at all). Renaming them
right away might be fine.


From martin at  Mon Nov 29 21:41:09 2010
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 29 Nov 2010 21:41:09 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 29.11.2010 14:14, Éric Araujo wrote:
> Hello,
>> Please comment with any changes you want to see, or speak in
>> favor or against this PEP.
> How to get a diff between py3k and this branch?

As others have already explained:

svn diff at 84329

(84329 is the value of svnmerge-integrated).

In any case, I posted it to Rietveld as


From greg.ewing at  Mon Nov 29 21:47:23 2010
From: greg.ewing at (Greg Ewing)
Date: Tue, 30 Nov 2010 09:47:23 +1300
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
	<> <>
	<> <>
Message-ID: <>

Rob Cliffe wrote:

> But when a frozen list a.k.a. tuple would be created - either directly, 
> or by setting a list's mutable flag to False which would really turn it 
> into a tuple - the size *would* be known.

But at that point the object consists of two memory blocks -- one
containing just the object header and a pointer to the items, and
the other containing the items.

To turn that into a true tuple structure would require resizing
the main object block to be big enough to hold the items and
copying them into it. The main object can't be moved (because
there are PyObject *s all over the place pointing to it), so
if there's not enough room at its current location, you're out
of luck.

So lists frozen after creation would have to remain as two blocks,
making them second-class citizens compared to those that were
created frozen. Either that or store all lists/tuples as two
blocks, and give up some of the performance advantages of the
current tuple structure.


From martin at  Mon Nov 29 22:04:03 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 29 Nov 2010 22:04:03 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
	<>	<>
	<> <>
Message-ID: <>

On 29.11.2010 19:33, Antoine Pitrou wrote:
> On Mon, 29 Nov 2010 08:22:46 +0100
> "Martin v. Löwis" <martin at> wrote:
>>> The former ensures that literals in code are always readable; the latter
>>> allows users to enter numbers in their own number system. How could that
>>> be a bad thing?
>> It's YAGNI, feature bloat. It gives the illusion of supporting something
>> that actually isn't supported very well (namely, parsing local number
>> strings). I claim that there is no meaningful application
>> of this feature.
> Still, if it's not detrimental and it's not difficult to support,
> then why do you care? You aren't even maintaining that part of the code.

I sure do maintain the Unicode database implementation in Python - the
one that is being used (IMO incorrectly) to implement the conversion in
question (and also the one that triggered this thread).

> I don't think "remove feature bloat" is part of our development goals
> or practices. Given the diversity of our user base, such removal should
> be done carefully and only for serious reasons.

I think it's a serious reason that the intuitive expectation of many
people (including committers) deviates from the actual implementation -
so much that they clarify the documentation in a way that makes the
difference explicit.

Having a mismatch between the expected behavior and the actual behavior
is a serious problem because it could lead to security issues, e.g. when
someone relies on float() to perform certain syntactic checking, making
it then possible to sneak in values that cause corruption later on
(speaking theoretically, of course - I'm not aware of an application
that is vulnerable in this manner).
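
The mismatch described above can be made concrete with a short sketch. The regular expression stands in for what a programmer might assume float() accepts. It is an assumption made for illustration, not something proposed in the thread, and the helper names are hypothetical.

```python
import re

# A hypothetical notion of "what a decimal number looks like": ASCII digits only.
ASCII_DECIMAL = re.compile(r"[0-9]+(\.[0-9]+)?")


def accepted_by_float(text):
    """Return True if float() parses `text` without raising ValueError."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def is_ascii_decimal(text):
    """Return True only for strings made of ASCII digits and an optional fraction."""
    return ASCII_DECIMAL.fullmatch(text) is not None


# ARABIC-INDIC digits spelling "12.3", used purely as an illustrative input.
sample = "\u0661\u0662.\u0663"
print(accepted_by_float(sample), is_ascii_decimal(sample))
```

Code that uses float() as its only syntax check would let such a string through, while a later component expecting ASCII digits would not recognise it.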


From martin at  Mon Nov 29 22:13:41 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 29 Nov 2010 22:13:41 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
	<>	<>
	<>	<>
Message-ID: <>

> - Should Python documentation refer to the specific version of Unicode
> that it supports?

You mean, mention it somewhere? Sure (although it would be nice if the
documentation generator would automatically extract it from the source,
just as it extracts the Python version number).

Of course, such mentioning should explain that this is specific to
CPython, and not an aspect of Python-the-language.
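
As a point of reference, the UCD version that a given interpreter was built with can already be read at run time from the unicodedata module. A minimal sketch (sys.implementation requires Python 3.3 or newer):

```python
import sys
import unicodedata

# The version of the Unicode Character Database bundled with this interpreter.
ucd_version = unicodedata.unidata_version
print("Unicode database version:", ucd_version)

# The value is a property of the implementation, not of the language definition.
print("Implementation:", sys.implementation.name)
```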

> Current documentation refers to old versions.  Should version be
> updated or removed to imply the latest?

What specific reference are you referring to?

> - How should UCD updates be handled during the language moratorium?

It's clearly not affected.

> During PEP 3003 discussion, it was suggested to handle it on a case by
> case basis, but I don't see discussion of the upgrade to 6.0.0 in PEP
> 3003.

It's covered by "As the standard library is not directly tied to the
language definition it is not covered by this moratorium."

>  Should this upgrade be backported to 2.7?

No, it's a new feature.

> - How specific should library reference manual be in defining methods
> affected by UCD such as str.upper()?

It should specify what this actually does in Unicode terminology
(probably in addition to a layman's rephrase of that)

> - What is an acceptable level of variation between Python
> implementations?  For example, if '\UXXXXXXXX'.isalpha() returns true
> in one implementation, can it return false in another?

Implementations are free to use any version of the UCD.


From martin at  Mon Nov 29 22:14:07 2010
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 29 Nov 2010 22:14:07 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <icvoqb$ea6$>
References: <>	<>	<>
Message-ID: <>

On 29.11.2010 09:36, Georg Brandl wrote:
> On 29.11.2010 09:09, "Martin v. Löwis" wrote:
>>>     I have now completed
>>> was structseq.h considered?
>> No, it wasn't - unfortunately, it still doesn't get included when
>> including Python.h. I'll add it.
> Would 3.2 be a good time to finally include it?  All of its macros and
> declarations are named PyStructSequence*, so there shouldn't be a
> name clash concern.

Sure, I see no problem with that.


From greg.ewing at  Mon Nov 29 22:36:51 2010
From: greg.ewing at (Greg Ewing)
Date: Tue, 30 Nov 2010 10:36:51 +1300
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

I don't see how the grouping can be completely separated
from the value-naming. If the named values are to be
subclassed from the base values, then you want all the
members of a group to belong to the *same* subclass.
You can't get that by treating each named value on its
own and then trying to group them together afterwards.


From steve at  Mon Nov 29 23:09:15 2010
From: steve at (Steven D'Aprano)
Date: Tue, 30 Nov 2010 09:09:15 +1100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>	<>	<>
Message-ID: <>

Alexander Belopolsky wrote:

> Speaking of YAGNI, does anyone want to defend
>>>> complex('????.??j')
> 1234.56j

*If* we allow float('????.??') (as we currently do, though this is being 
disputed by some), then we should allow complex('????.??j'). It would be 
silly for complex to be more restrictive than float.

> Especially given that we reject complex('1234.56i'):

I don't understand why you use 'i' when Python uses 'j' as the symbol 
for imaginary numbers.

 >>> complex('1234.56j')

works fine. I have no problem with Python choosing one of i/j as the 
symbol for the imaginary unit and rejecting the other. I prefer i to j 
because my background is in maths rather than electrical engineering, 
but I can live with either.

But in any case, please don't conflate the question of whether Python 
should accept j and/or i for complex numbers with the question of 
supporting non-arabic numerals. The two issues are unrelated.
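
The i/j behaviour discussed above can be checked directly. A minimal sketch, using only ASCII input:

```python
# The "j" suffix is the spelling Python accepts for the imaginary unit.
accepted = complex("1234.56j")
print(accepted)

# The "i" suffix is rejected with ValueError.
try:
    complex("1234.56i")
    i_suffix_accepted = True
except ValueError:
    i_suffix_accepted = False
print(i_suffix_accepted)
```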


From rrr at  Tue Nov 30 00:38:26 2010
From: rrr at (Ron Adam)
Date: Mon, 29 Nov 2010 17:38:26 -0600
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <id1dho$n77$>

On 11/28/2010 09:03 PM, Ron Adam wrote:

> It does associate additional info to names and creates a nice dictionary to
> reference.
>  >>> def name_values( FOO: 1,
> BAR: "Hello World!",
> BAZ: dict(a=1, b=2, c=3) ):
> ... return FOO, BAR, BAZ
> ...
>  >>> foo(1,2,3)
> (1, 2, 3)
>  >>> foo.__annotations__
> {'BAR': 'Hello World!', 'FOO': 1, 'BAZ': {'a': 1, 'c': 3, 'b': 2}}

sigh... I haven't been very focused lately. That should have been:

 >>> def named_values(FOO:1, BAR:"Hello World!", BAZ:dict(a=1, b=2, c=3)):
...   return FOO, BAR, BAZ
 >>> named_values.__annotations__
{'BAR': 'Hello World!', 'FOO': 1, 'BAZ': {'a': 1, 'c': 3, 'b': 2}}
 >>> named_values(1, 2, 3)
(1, 2, 3)


From ncoghlan at  Tue Nov 30 03:04:28 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 30 Nov 2010 12:04:28 +1000
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <>
References: <> <>
Message-ID: <>

On Tue, Nov 30, 2010 at 12:15 AM, James Y Knight <foom at> wrote:
> On Nov 29, 2010, at 8:58 AM, Nick Coghlan wrote:
> The http read only URLs
> didn't work (no diff returned, just "svn: OPTIONS of
> '': 200 OK
> ("),
> That was the wrong url: you should've
> used?
> James

Ah, thanks, I always forget that part (since it isn't there in the
read/write URLs).

The SVN output may qualify as one of the least helpful error messages
I have ever seen, though :)


Nick Coghlan | ncoghlan at | Brisbane, Australia

From ncoghlan at  Tue Nov 30 03:23:04 2010
From: ncoghlan at (Nick Coghlan)
Date: Tue, 30 Nov 2010 12:23:04 +1000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Tue, Nov 30, 2010 at 7:36 AM, Greg Ewing <greg.ewing at> wrote:
> I don't see how the grouping can be completely separated
> from the value-naming. If the named values are to be
> subclassed from the base values, then you want all the
> members of a group to belong to the *same* subclass.
> You can't get that by treating each named value on its
> own and then trying to group them together afterwards.

Note that my sample implementation cached the created types, so that
(for example) there was only ever one "Named<int>" type (my
implementation wasn't quite kosher in that respect, since
functools.lru_cache has a non-optional size limit - setting maxsize to
float('inf') deals with that). A grouping API would use either single
or multiple inheritance to create members that supported both the
naming aspects as well as the grouping aspects.
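
The caching idea can be sketched as follows. This is not the sample implementation referred to above, which is not reproduced in this message. named_type and its attributes are hypothetical names, and maxsize=None (an unbounded cache) is used where the message mentions float('inf').

```python
import functools


@functools.lru_cache(maxsize=None)
def named_type(base):
    """Create (once per base type) a subclass of `base` whose instances carry a name."""

    class Named(base):
        def __new__(cls, name, value):
            self = super().__new__(cls, value)
            self._name = name
            return self

        def __repr__(self):
            return "{}={}".format(self._name, super().__repr__())

    Named.__name__ = "Named<{}>".format(base.__name__)
    return Named


NamedInt = named_type(int)
FOO = NamedInt("FOO", 1)
print(repr(FOO), NamedInt.__name__)

# Because of the cache, asking again for the same base returns the same class.
print(named_type(int) is NamedInt)
```

All named values built from the same base therefore share one subclass, which is the property the grouping question depends on.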


Nick Coghlan | ncoghlan at | Brisbane, Australia

From alexander.belopolsky at  Tue Nov 30 04:46:33 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Mon, 29 Nov 2010 22:46:33 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, Nov 29, 2010 at 5:09 PM, Steven D'Aprano <steve at> wrote:
> But in any case, please don't conflate the question of whether Python should
> accept j and/or i for complex numbers with the question of supporting
> non-arabic numerals. The two issues are unrelated.

The two issues are related because they are both about how strict
numerical constructors should be.   If we want to accept wide
variations in how numbers can be spelled, then surely using i for the
imaginary unit is much more common than using ? for the digit 7.

I see two problems with supporting non-ascii spellings:

1. Support costs.

2. User confusion.

The two are related because when users are confused, they will report
invalid bugs when Python does not meet their expectations.  For
example, why

>>> int('???', 10)

works, but

>>> int('??????', 16)
Traceback (most recent call last):
UnicodeEncodeError: 'decimal' codec can't encode character '\uff21' in
position 3: invalid decimal Unicode string

does not?  And if 'decimal' is a codec, why

>>> '123'.encode('decimal')
Traceback (most recent call last):
LookupError: unknown encoding: decimal
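
The inconsistency can be reproduced with a short sketch. The original characters in the examples above were lost in the archive, so the fullwidth characters below are an assumption chosen to match the '\uff21' in the traceback. On current CPython the base-16 call raises a plain ValueError rather than UnicodeEncodeError. Catching ValueError covers both, since UnicodeEncodeError is a subclass of it.

```python
# FULLWIDTH DIGITS ONE..THREE are category Nd, so base 10 parsing works.
fullwidth_digits = "\uff11\uff12\uff13"
decimal_value = int(fullwidth_digits, 10)
print(decimal_value)

# FULLWIDTH LATIN CAPITAL LETTER A is a letter, not an Nd digit, so base 16 fails.
fullwidth_hex = fullwidth_digits + "\uff21"
try:
    int(fullwidth_hex, 16)
    hex_accepted = True
except ValueError:
    hex_accepted = False
print(hex_accepted)

# There is no user-visible codec called "decimal".
try:
    "123".encode("decimal")
    codec_exists = True
except LookupError:
    codec_exists = False
print(codec_exists)

# int() accepts bases 2 through 36; ASCII letters serve as the higher digits.
base36_value = int("z", 36)
print(base36_value)
```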

Before anyone suggests that int(.., 16) should consult the new
Hex_Digit property in the UCD, let me remind you that int() supports
bases from 2 through 36.

I thought Python design was primarily driven by practicality.  Here
the only plausible argument that one can make is that if Unicode says
it is a digit, we should treat it as a digit.  Purity over

In practical terms, UCD comes at a price.  The unicodedata module size
is over 700K on my machine.  This is almost half the size of the
python executable and by far the largest extension module. (only CJK
encodings come close.)  Making builtins depend on the largest
extension module for operation does not strike me as sound design.

From stephen at  Tue Nov 30 05:20:11 2010
From: stephen at (Stephen J. Turnbull)
Date: Tue, 30 Nov 2010 13:20:11 +0900
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

M.-A. Lemburg writes:

 > Just because ASCII-proponents may have a hard time reading such
 > literals,

That's not the point.

 > doesn't mean that script users have the same trouble.

The script users may have no trouble reading them, but that doesn't
mean it's not a YAGNI.  In Japanese, it's a YAGNI except in addresses
on New Year cards and in dates, which could be handled by specialized
modules, or by a generic module for extracting numeric information
from general (as opposed to program) text.  Neither of those is likely
to appear in program text in context where they would be used as a
numeric literal.

In fact, Python *does* consider it a YAGNI for Han!  Although my
apartment number would be written "???" on a New Year card, Python
won't parse it as 704: unicodedata considers those digits to be Lo,
except for "?" which fails anyway because it's Nl, not Nd.  (To add
insult to injury, it doesn't even return numeric values for those
characters, even though any Han-user would consider them numeric when
used in isolation, except that Japanese would be likely to consider
"?" to be the non-numeric "maru" symbol, ie, circle, meaning "OK"!)

The whole concept of numeric in Unicode is a mess; why import that
mess into Python?

Can you give any examples where people do computation, keep books, or
do nuclear physics in non-Arabic numerals?  I suppose Arabic users
might, but even there I suspect not.
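
The category claims above can be checked with unicodedata. The quoted apartment number was lost in the archive, so the three characters below (the Han ideographs for seven and four, and IDEOGRAPHIC NUMBER ZERO) are an assumption about what was written.

```python
import unicodedata

han_seven = "\u4e03"  # CJK ideograph meaning "seven"
han_zero = "\u3007"   # IDEOGRAPHIC NUMBER ZERO
han_four = "\u56db"   # CJK ideograph meaning "four"

# None of these are in category "Nd": the ideographs are "Lo", the zero is "Nl".
han_categories = [unicodedata.category(ch) for ch in (han_seven, han_zero, han_four)]
print(han_categories)

# Consequently int() does not parse them as the number 704.
try:
    int(han_seven + han_zero + han_four)
    han_accepted = True
except ValueError:
    han_accepted = False
print(han_accepted)
```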

From stephen at  Tue Nov 30 05:39:21 2010
From: stephen at (Stephen J. Turnbull)
Date: Tue, 30 Nov 2010 13:39:21 +0900
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

Steven D'Aprano writes:

 > But in any case, please don't conflate the question of whether Python 
 > should accept j and/or i for complex numbers with the question of 
 > supporting non-arabic numerals. The two issues are unrelated.

Different, yes, unrelated, no.  They're both about whether variant
forms of universally used literals should be allowed in a programming
language, or whether only the canonical form is allowed.  Note that
*nobody* is saying that Python should have no facility for parsing
these numbers, only that by default literal decimal numerals should be
encoded as ASCII digits.

For example, I would not object to int() getting a Boolean flag
meaning "consult unicodedata for non-ASCII digits", just as it has an
optional parameter meaning "decode in base other than 10".[1]  OTOH,
until somebody says "Yes, in Mecca the bazaar traders keep books on
their Lenovos using ISO-8859-6 numerals, and it would be painful for
them to switch to what we call 'Arabic' numerals", I'm going to
consider it a YAGNI.  Just as even though mathematicians clearly
prefer "i" as the imaginary unit, there's not enough pain involved in
them switching to "j" to make it worth supporting both.
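
The Boolean flag idea can be sketched as a wrapper around int(). No such parameter exists on the builtin. parse_int and unicode_digits are hypothetical names used only for illustration.

```python
def parse_int(text, base=10, unicode_digits=False):
    """Parse `text` like int(), but reject non-ASCII input unless asked otherwise.

    With unicode_digits=True the call behaves exactly like the builtin int().
    """
    if not unicode_digits:
        try:
            text.encode("ascii")
        except UnicodeEncodeError:
            raise ValueError("non-ASCII characters in {!r}".format(text)) from None
    return int(text, base)


arabic_indic_twelve = "\u0661\u0662"  # illustrative non-ASCII input

print(parse_int("123"), parse_int("ff", 16))
print(parse_int(arabic_indic_twelve, unicode_digits=True))

try:
    parse_int(arabic_indic_twelve)
    default_accepts_unicode = True
except ValueError:
    default_accepts_unicode = False
print(default_accepts_unicode)
```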

(BTW, my first reaction to the "j" notation was "cool, Python supports
quaternions out of the box!"  It took only a second or so to return to
reality, but that was my first reaction.)

[1]  That might not be a good idea on other grounds, but in principle
I would be OK with such built-ins accepting non-ASCII digits on

From merwok at  Tue Nov 30 07:33:51 2010
From: merwok at (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Tue, 30 Nov 2010 07:33:51 +0100
Subject: [Python-Dev] PEP 291 versus Python 3
Message-ID: <>

Good morning python-dev,

PEP 291 (Backward Compatibility for Standard Library) does not seem to
take Python 3 into account.  Is this PEP only relevant for the 2.7
branch?*  If it's supposed to apply to 3.x too, despite the view that
3.0 was a clean break, what does it mean to have a module that is
developed in the py3k branch and should retain compatibility with 2.3 or

* Tarek's interpretation: "The 2.x needs to stay 2.3 compatible
  so we should keep the 3.x as similar as possible for bugfixes."

In the particular case of distutils (should be compatible with 2.3), we
(myself included) have been lax.  Our tests for example use modern unittest
features like skips, which makes them not runnable on old Pythons.  I am
very uncomfortable with code that seems to run fine but whose tests
(however few) cannot be run, so I think I'll have to trade the skips for
old-style "return" statements.  The other way of solving that is to
change the compat policy.  If I remember correctly, the rationale for
code compat in distutils is that people may copy distutils from Python
x.y to their install of x.y-n; I don't know if this is still an active
practice, and if it is, I don't know if it should be supported,
considering that distutils2 (compatible with 2.4+ and available from
PyPI) is coming.
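
The trade-off between skips and old-style "return" statements can be sketched as follows. FEATURE_AVAILABLE is a hypothetical stand-in for a real platform or dependency check, and the test classes are illustrative rather than taken from the distutils test suite.

```python
import unittest

# Hypothetical stand-in for a real check such as a platform or optional-module test.
FEATURE_AVAILABLE = False


class ModernStyle(unittest.TestCase):
    # Needs unittest.skipUnless, which older unittest versions do not provide.
    @unittest.skipUnless(FEATURE_AVAILABLE, "feature not available")
    def test_feature(self):
        self.assertTrue(FEATURE_AVAILABLE)


class OldStyle(unittest.TestCase):
    def test_feature(self):
        if not FEATURE_AVAILABLE:
            return  # runs everywhere, but is silently reported as a pass
        self.assertTrue(FEATURE_AVAILABLE)


def run_case(case):
    """Run one TestCase class and return its unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result


modern_result = run_case(ModernStyle)
old_result = run_case(OldStyle)
print(len(modern_result.skipped), len(old_result.skipped))
```

The old style keeps the tests runnable on old Pythons at the cost of hiding the fact that nothing was actually checked.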


From regebro at  Tue Nov 30 09:10:37 2010
From: regebro at (Lennart Regebro)
Date: Tue, 30 Nov 2010 09:10:37 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Nov 28, 2010 at 21:24, Alexander Belopolsky
<alexander.belopolsky at> wrote:
> While we have little choice but to follow UCD in defining
> str.isidentifier(), I think Python can promise users more stability in
> what it treats as space or as a digit in its builtins.

Why? I can see this is a problem if one character that earlier was
allowed no longer is. That breaks backwards compatibility. This

>>>> float('????.??')
> 1234.56
> is more important than to assure users that once their program
> accepted some text as a number, they can assume that the text is

*I* think it is more important. In Python 3, you can never ever assume
anything is ASCII any more. ASCII is practically dead and buried as far
as Python goes, unless you explicitly encode to it.

> def deposit(self, amountstr):
>       self.balance += float(amountstr)
>       audit_log("Deposited: " + amountstr)
> Auditor:
> $ cat numbered-account.log
> Deposited: ?????.??

That log reasonably should be in UTF-8 or something else, in which
case this is not a problem. And that's ignoring that it makes way more
sense to log the numerical amount.
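
Logging the parsed amount instead of the raw input string could look like the following minimal sketch. The Account class, the audit_lines list and the two-decimal format are assumptions made for illustration. Only the deposit() signature comes from the quoted example.

```python
class Account:
    """Toy account that records what was parsed, not what was typed."""

    def __init__(self):
        self.balance = 0.0
        self.audit_lines = []

    def deposit(self, amountstr):
        amount = float(amountstr)
        self.balance += amount
        # Format the numeric value, so the log contains ASCII digits
        # regardless of the script used in the input string.
        self.audit_lines.append("Deposited: {:.2f}".format(amount))


account = Account()
# ARABIC-INDIC digits spelling "12345.67", used as an illustrative input.
account.deposit("\u0661\u0662\u0663\u0664\u0665.\u0666\u0667")
print(account.audit_lines[0])
```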

Lennart Regebro:
Python 3 Porting:
+33 661 58 14 64

From hagen at  Tue Nov 30 09:15:54 2010
From: hagen at (=?ISO-8859-1?Q?Hagen_F=FCrstenau?=)
Date: Tue, 30 Nov 2010 09:15:54 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>
Message-ID: <id2brr$v61$>

>> During PEP 3003 discussion, it was suggested to handle it on a case by
>> case basis, but I don't see discussion of the upgrade to 6.0.0 in PEP
>> 3003.
> It's covered by "As the standard library is not directly tied to the
> language definition it is not covered by this moratorium."

How is this restricted to the stdlib if it defines the set of valid

- Hagen

From stephen at  Tue Nov 30 09:23:10 2010
From: stephen at (Stephen J. Turnbull)
Date: Tue, 30 Nov 2010 17:23:10 +0900
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

Lennart Regebro writes:

 > *I* think it is more important. In python 3, you can never ever assume
 > anything is ASCII any more.

Sure you can.  In Python program text, all keywords will be ASCII
(English, even, though it may be en_NL.UTF-8<wink>) for the foreseeable
future.

I see no reason not to make a similar promise for numeric literals.  I
see no good reason to allow compatibility full-width Japanese "ASCII"
numerals or Arabic cursive numerals in "for i in range(...)" for
example.

As soon as somebody gives an example of a culture, however minor, that
uses computers but actively prefers to use non-ASCII numerals to
express numbers in an IT context, I'll review my thinking.  But at the
moment it's 101% YAGNI.

From sylvain.thenault at  Tue Nov 30 09:34:18 2010
From: sylvain.thenault at (Sylvain Thénault)
Date: Tue, 30 Nov 2010 09:34:18 +0100
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
In-Reply-To: <id11vk$9d$>
References: <201011251530.23947.emile.anclin@logilab>
Message-ID: <>

On 29 November 14:21, Ron Adam wrote:
> On 11/29/2010 01:22 PM, Brett Cannon wrote:
> >Considering these semantics changed between Python 2 and 3 w/o a
> >discernable benefit (I would consider it a negative as finding a
> >module should not be impacted by syntactic correctness; the full act
> >of importing should be the only thing that cares about that), I would
> >consider it a bug that should be filed.
> The output of imp.find_module() returns an open file io object, and
> its output feeds directly into imp.load_module().
> >>> imp.find_module('pydoc')
> (<_io.TextIOWrapper name=4 encoding='utf-8'>,
> '/usr/local/lib/python3.2/', ('.py', 'U', 1))
> So I think the imp.find_module() is supposed to be used when you *do*
> want to do the full act of importing and not for just finding out if
> or where module xyz exists.

In Python 2, find_module could be used that way, and a tool like pylint
needs such an API. Is there another way to do this in Python 3?
Sylvain Thénault                               LOGILAB, Paris (France)
Python, Debian, and Agile methods training:
Custom software development:
CubicWeb, the semantic web framework:

From cornsea at  Tue Nov 30 09:41:19 2010
From: cornsea at (haiyang kang)
Date: Tue, 30 Nov 2010 16:41:19 +0800
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>


  I agree with this.

  I have never seen anyone in China use Chinese numerals (there are at
least two kinds: ?, ?, both meaning 1) in a Python program, except for
UI output.

  They can do some mapping when they want to output these non-ASCII numbers.
  Example: if 1: print "?"

  I think it is a little ugly to have code like this: num =
float("?.?"), where the expected result is num = 1.1


On Tue, Nov 30, 2010 at 4:23 PM, Stephen J. Turnbull <stephen at> wrote:
> Lennart Regebro writes:
>  > *I* think it is more important. In python 3, you can never ever assume
>  > anything is ASCII any more.
> Sure you can.  In Python program text, all keywords will be ASCII
> (English, even, though it may be en_NL.UTF-8<wink>) for the foreseeable
> future.
> I see no reason not to make a similar promise for numeric literals.  I
> see no good reason to allow compatibility full-width Japanese "ASCII"
> numerals or Arabic cursive numerals in "for i in range(...)" for
> example.
> As soon as somebody gives an example of a culture, however minor, that
> uses computers but actively prefers to use non-ASCII numerals to
> express numbers in an IT context, I'll review my thinking.  But at the
> moment it's 101% YAGNI.

From ziade.tarek at  Tue Nov 30 10:14:20 2010
From: ziade.tarek at (Tarek Ziadé)
Date: Tue, 30 Nov 2010 10:14:20 +0100
Subject: [Python-Dev] PEP 291 versus Python 3
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 30, 2010 at 7:33 AM, Éric Araujo <merwok at> wrote:
> Good morning python-dev,
> PEP 291 (Backward Compatibility for Standard Library) does not seem to
> take Python 3 into account.  Is this PEP only relevant for the 2.7
> branch?*  If it's supposed to apply to 3.x too, despite the view that
> 3.0 was a clean break, what does it mean to have a module that is
> developed in the py3k branch and should retain compatibility with 2.3 or
> 1.5.2?
> * Tarek's interpretation: "The 2.x needs to stay 2.3 compatible
>   so we should keep the 3.x as similar as possible for bugfixes."
> In the particular case of distutils (should be compatible with 2.3), we
> (including I) have been lax.  Our tests for example use modern unittest
> features like skips, which makes them not runnable on old Pythons.  I am
> very uncomfortable with code that seems to run fine but which tests
> (however few) cannot be run, so I think I'll have to trade the skips for
> old-style "return" statements.

You shouldn't be uncomfortable with the current state of distutils, and
you shouldn't try to improve its tests (or any other nasty stuff you'll
find in that code).

Distutils is dead code. All we have to do is the bare minimum
maintenance. Everything else is a waste of time.

> The other way of solving that is to
> change the compat policy.  If I remember correctly, the rationale for
> code compat in distutils is that people may copy distutils from Python
> x.y to their install of x.y-n; I don't know if this is still an active
> practice, and if it is, I don't know if it should be supported,
> considering that distutils2 (compatible with 2.4+ and available from
> PyPI) is coming.

Again, don't worry about these rules in Distutils now. The only rule
that now applies to Distutils is that we do only bug fixing, and we
should not waste our precious time doing other stuff in there. Plain
Python tests are fine for what we want to do, and they simplify our
forward ports and backports.  One thing we should do, though, is fix
those bugs in Distutils2 first when they exist there too.

I really appreciate all the hard work you are doing in triaging the
issues and bug fixing, by the way!


From emile.anclin at  Tue Nov 30 10:39:29 2010
From: emile.anclin at (Emile Anclin)
Date: Tue, 30 Nov 2010 10:39:29 +0100
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
In-Reply-To: <>
References: <201011251530.23947.emile.anclin@logilab>
Message-ID: <201011301039.30033.emile.anclin@logilab>

On Monday 29 November 2010 20:22:22 Brett Cannon wrote:
> Considering these semantics changed between Python 2 and 3 w/o a
> discernable benefit (I would consider it a negative as finding a
> module should not be impacted by syntactic correctness; the full act
> of importing should be the only thing that cares about that), I would
> consider it a bug that should be filed.

ok, here it is :

Since I did not understand all of it, I just quoted Brett Cannon
in the ticket.


Emile Anclin <emile.anclin at> 
Scientific computing & knowledge management

From steve at  Tue Nov 30 13:59:49 2010
From: steve at (Steven D'Aprano)
Date: Tue, 30 Nov 2010 23:59:49 +1100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

haiyang kang wrote:
> hi,
>   I agree with this.
>   I never seen any man in China using chinese number literals (at
> least two kinds:?, ?, same meaning with 1)
>   in Python program, except UI output.
>   They can do some mappings when want to output these non-ascii numbers.
>   Example: if 1: print "?"
>   I think it is a little ugly to have code like this: num =
> float("?.?"), expected result is: num = 1.1

I don't expect that anyone would sensibly write code like that, except 
for testing. You wouldn't write num = float("1.1") instead of just
num = 1.1 either.

But you should be able to write:

text = input("Enter a number using your preferred digits: ")
num = float(text)

without caring whether the user enters ?.? or 1.1 or something else.


From fuzzyman at  Tue Nov 30 14:09:16 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 30 Nov 2010 13:09:16 +0000
Subject: [Python-Dev] PEP 291 versus Python 3
In-Reply-To: <>
References: <>
Message-ID: <>

On 30/11/2010 06:33, Éric Araujo wrote:
> Good morning python-dev,
> PEP 291 (Backward Compatibility for Standard Library) does not seem to
> take Python 3 into account.  Is this PEP only relevant for the 2.7
> branch?*  If it's supposed to apply to 3.x too, despite the view that
> 3.0 was a clean break, what does it mean to have a module that is
> developed in the py3k branch and should retain compatibility with 2.3 or
> 1.5.2?

PEP 291 is very old and should probably be retired. I don't think anyone 
is maintaining standard libraries in py3k that are also compatible with 
Python 2.anything. (At least not in a single codebase.)

For Python 2.7 that may not be true, but for Python 3 I think we can 
start with a clean slate on compatibility.

> * Tarek's interpretation: "The 2.x needs to stay 2.3 compatible
>    so we should keep the 3.x as similar as possible for bugfixes."
> In the particular case of distutils (should be compatible with 2.3), we
> (including I) have been lax.  Our tests for example use modern unittest
> features like skips, which makes them not runnable on old Pythons.
They can be run on old Pythons with unittest2. This is what distutils2 
is doing.

>   I am
> very uncomfortable with code that seems to run fine but which tests
> (however few) cannot be run, so I think I'll have to trade the skips for
> old-style "return" statements.  The other way of solving that is to
> change the compat policy.

This is only an issue for distutils in Python 2.7 right? Maintaining the 
compat policy for that will be a short-lived pain, and distutils itself 
is getting only infrequent bugfixes *anyway*, right? I defer to Tarek on 
that particular decision.

All the best,

> If I remember correctly, the rationale for
> code compat in distutils is that people may copy distutils from Python
> x.y to their install of x.y-n; I don't know if this is still an active
> practice, and if it is, I don't know if it should be supported,
> considering that distutils2 (compatible with 2.4+ and available from
> PyPI) is coming.
> Regards


READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From steve at  Tue Nov 30 14:23:22 2010
From: steve at (Steven D'Aprano)
Date: Wed, 01 Dec 2010 00:23:22 +1100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
Message-ID: <>

Stephen J. Turnbull wrote:
> Lennart Regebro writes:
>  > *I* think it is more important. In python 3, you can never ever assume
>  > anything is ASCII any more.
> Sure you can.  In Python program text, all keywords will be ASCII
> (English, even, though it may be en_NL.UTF-8<wink>) for the foreseeable
> future.
> I see no reason not to make a similar promise for numeric literals.  I
> see no good reason to allow compatibility full-width Japanese "ASCII"
> numerals or Arabic cursive numerals in "for i in range(...)" for
> example.

I agree with you that numeric *literals* should be restricted to the 
ASCII digits. I don't think anyone here is arguing differently -- if 
they are, they should speak up and try to make the case for allowing 
numeric literals in arbitrary scripts. Python doesn't currently allow 
non-ASCII numeric literals, and even if such a change were desirable, it 
would run up against the moratorium. So let's just forget the specter of 
code like:

x = math.sqrt(????.?? ** ?.?)

It ain't gonna happen :)

But I think there is a good case for allowing the constructors int, 
float and complex to continue to accept numeric *strings* with non-ASCII 
  digits. The code already exists, there's probably people out there who 
rely on it, and in the absence of any convincing demonstration that the 
existing behaviour is causing widespread difficulty, we should leave 
well-enough alone.

Various people have suggested that there should be a function in the 
locale module that handles numeric string input in non-ASCII digits. 
This is a de facto admission that there are use-cases for taking user 
input like the string '?' and turning it into the int 3. Python can 
already do this, and has been able to for many years:

[steve at sylar ~]$ python2.4
Python 2.4.6 (#1, Mar 30 2009, 10:08:01)
[GCC 4.1.2 20070925 (Red Hat 4.1.2-27)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> int(u'?')
3

It seems to me that there's no need to move this functionality into locale.


From solipsis at  Tue Nov 30 14:32:54 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 30 Nov 2010 14:32:54 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
References: <>
Message-ID: <>

On Wed, 01 Dec 2010 00:23:22 +1100
Steven D'Aprano <steve at> wrote:
> But I think there is a good case for allowing the constructors int, 
> float and complex to continue to accept numeric *strings* with non-ASCII 
>   digits. The code already exists, there's probably people out there who 
> rely on it, and in the absence of any convincing demonstration that the 
> existing behaviour is causing widespread difficulty, we should leave 
> well-enough alone.


> It seems to me that there's no need to move this functionality into locale.

Not only, but moving it into locale won't make it easier to maintain



From solipsis at  Tue Nov 30 14:38:22 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 30 Nov 2010 14:38:22 +0100
Subject: [Python-Dev] Module size
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, 29 Nov 2010 22:46:33 -0500
Alexander Belopolsky <alexander.belopolsky at> wrote:
> In practical terms, UCD comes at a price.  The unicodedata module size
> is over 700K on my machine.  This is almost half the size of the
> python executable and by far the largest extension module. (only CJK
> encodings come close.)  Making builtins depend on the largest
> extension module for operation does not strike me as sound design.

Well, do they depend on it? _PyUnicode_EncodeDecimal seems to depend
only on Objects/unicodectype.c.

$ size Objects/unicode*.o
   text	   data	    bss	    dec	    hex	filename
  60398	      0	      0	  60398	   ebee	Objects/unicodectype.o
 130440	  13559	   2208	 146207	  23b1f	Objects/unicodeobject.o


From alexander.belopolsky at  Tue Nov 30 15:18:13 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 30 Nov 2010 09:18:13 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 30, 2010 at 7:59 AM, Steven D'Aprano <steve at> wrote:
> But you should be able to write:
> text = input("Enter a number using your preferred digits: ")
> num = float(text)
> without caring whether the user enters ?.? or 1.1 or something else.

I find it ironic that people who argue for preservation of the current
behavior do it without checking what it actually is:

>>> float('一.一')
UnicodeEncodeError: 'decimal' codec can't encode character '\u4e00' ..

This is one of the biggest problems with this feature.  It does not fit
users' expectations.  Even the original author of the decimal "codec"
expected the above to work. [1]
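
To make the mismatch concrete, here is a minimal sketch of which strings
the constructors take.  It assumes a current CPython 3, where a string
that cannot be converted raises a plain ValueError (the builds discussed
in this thread raise UnicodeEncodeError, which is a ValueError subclass).
The helper name try_float and the sample strings are invented for the
example.

```python
def try_float(text):
    """Return float(text), or None when the constructor rejects it."""
    try:
        return float(text)
    except ValueError:  # also covers UnicodeEncodeError on older builds
        return None

# Characters in category Nd (decimal digits) are accepted.
arabic_indic = "\u0661\u0662\u0663\u0664.\u0665\u0666"  # ARABIC-INDIC 1234.56
fullwidth = "\uff11\uff12"                              # FULLWIDTH DIGIT ONE, TWO

# CJK numerals carry a numeric value but are not decimal digits.
cjk_one_point_one = "\u4e00.\u4e00"

print(try_float(arabic_indic))
print(int(fullwidth))
print(try_float(cjk_one_point_one))
```

So the accepted set is "anything with a decimal digit value", which is
narrower than "anything a reader would call a number".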

> Python can already do this, and has been able to for many years:
> >>> int(u'?')
> 3

but you can do this without support from int() as well:

>>> import unicodedata
>>> unicodedata.digit('?')

and for Unihan numbers, you can do
>>> unicodedata.numeric('?')


>>> unicodedata.numeric('?')

and if you are so inclined,

>>> [unicodedata.numeric(c) for c in "? ? ? ? ?".split()]
[10000.0, 5000.0, 0.6, 0.875, 90000.0]

Do you want to see all these supported by float()?
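
The unicodedata lookups used above answer different questions, which is
part of the confusion.  A small sketch, with sample characters and a
helper name (numeric_properties) chosen only for illustration:

```python
import unicodedata

def numeric_properties(ch):
    """Return (decimal, digit, numeric) for ch, with None where undefined."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )

samples = [
    "\u0663",  # ARABIC-INDIC DIGIT THREE
    "\u00b2",  # SUPERSCRIPT TWO
    "\u215e",  # VULGAR FRACTION SEVEN EIGHTHS
    "\u4e09",  # CJK ideograph for three
]
for ch in samples:
    print("U+%04X" % ord(ch), numeric_properties(ch))
```

Only the first sample has a decimal value; the others have a digit or
numeric value at most.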

[1] " does not support Unihan digit data"

From alexander.belopolsky at  Tue Nov 30 15:32:38 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 30 Nov 2010 09:32:38 -0500
Subject: [Python-Dev] Module size
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Tue, Nov 30, 2010 at 8:38 AM, Antoine Pitrou <solipsis at> wrote:
> On Mon, 29 Nov 2010 22:46:33 -0500
> Alexander Belopolsky <alexander.belopolsky at> wrote:
>> In practical terms, UCD comes at a price.  The unicodedata module size
>> is over 700K on my machine.  This is almost half the size of the
>> python executable and by far the largest extension module. (only CJK
>> encodings come close.)  Making builtins depend on the largest
>> extension module for operation does not strike me as sound design.
> Well, do they depend on it? _PyUnicode_EncodeDecimal seems to depend
> only on Objects/unicodectype.c.

My mistake. That was a late night post.  I wonder why
is so big then.

It must be character names:

$ python -v
>>> '\N{DIGIT ONE}'
dlopen("/.../", 2);
import unicodedata # dynamically loaded from /.../

From solipsis at  Tue Nov 30 15:41:48 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 30 Nov 2010 15:41:48 +0100
Subject: [Python-Dev] Module size
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <1291128108.3538.10.camel@localhost.localdomain>

On Tuesday 30 November 2010 at 09:32 -0500, Alexander Belopolsky wrote:
> On Tue, Nov 30, 2010 at 8:38 AM, Antoine Pitrou <solipsis at> wrote:
> > On Mon, 29 Nov 2010 22:46:33 -0500
> > Alexander Belopolsky <alexander.belopolsky at> wrote:
> >>
> >> In practical terms, UCD comes at a price.  The unicodedata module size
> >> is over 700K on my machine.  This is almost half the size of the
> >> python executable and by far the largest extension module. (only CJK
> >> encodings come close.)  Making builtins depend on the largest
> >> extension module for operation does not strike me as sound design.
> >
> > Well, do they depend on it? _PyUnicode_EncodeDecimal seems to depend
> > only on Objects/unicodectype.c.
> My mistake. That was a late night post.  I wonder why
> is so big then.
> It must be character names:
> $ python -v
> >>> '\N{DIGIT ONE}'
> dlopen("/.../", 2);
> import unicodedata # dynamically loaded from /.../
> '1'

From a quick peek using hexdump, character names seem to only account
for 1/4 of the module size.
That said, I don't think the size is very important. For any non-trivial
Python application, the size of unicodedata will be negligible compared
to the size of Python objects.



From tlesher at  Tue Nov 30 15:48:32 2010
From: tlesher at (Tim Lesher)
Date: Tue, 30 Nov 2010 09:48:32 -0500
Subject: [Python-Dev] Module size
In-Reply-To: <1291128108.3538.10.camel@localhost.localdomain>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Tue, Nov 30, 2010 at 09:41, Antoine Pitrou <solipsis at> wrote:
> That said, I don't think the size is very important. For any non-trivial
> Python application, the size of unicodedata will be negligible compared
> to the size of Python objects.

That depends very much on the platform and the application.  For our
embedded use of Python, static data size (like the text segment of a
shared object) is far dearer than the heap space used by Python
objects, which is why we've had to excise both the UCD and the CJK
codecs in our builds.
Tim Lesher <tlesher at>

From cornsea at  Tue Nov 30 15:56:33 2010
From: cornsea at (haiyang kang)
Date: Tue, 30 Nov 2010 22:56:33 +0800
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

> But you should be able to write:
> text = input("Enter a number using your preferred digits: ")
> num = float(text)
> without caring whether the user enters ?.? or 1.1 or something else.

Yes, from a logical point of view this can happen.

But I really doubt there are users who would want to input numbers like
that.  It would mean that they first use the Google Pinyin input method
to type ?, then switch to the English input method to type ".", then
switch back to Google Pinyin for the other ?; or maybe you mean they type
the whole ?.? with the Google Pinyin input method.

To input 1, users need only one keystroke, but to input ? they need
three (yi SPACE).

Of course, users can also enter something by accident, but then we just
need to give them a friendly reminder.

At least the coders around me restrict their users to entering numbers
in ASCII, and the users seem to be happy with ASCII numbers :).


From alexander.belopolsky at  Tue Nov 30 16:05:42 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 30 Nov 2010 10:05:42 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, Nov 29, 2010 at 4:13 PM, "Martin v. Löwis" <martin at> wrote:
>> - Should Python documentation refer to the specific version of Unicode
>> that it supports?
> You mean, mention it somewhere? Sure (although it would be nice if the
> documentation generator would automatically extract it from the source,
> just as it extracts the Python version number).
> Of course, such mentioning should explain that this is specific to
> CPython, and not an aspect of Python-the-language.
>> Current documentation refers to old versions.  Should version be
>> updated or removed to imply the latest?
> What specific reference are you referring to?
I found two places: A reference to Unicode 3.0 (!) in the Data Model
section and a reference to 5.2.0 in unicodedata docs.


>> - How UCD updates should be handled during the language moratorium?
> It's clearly not affected.

This is not what Guido said last year:
> One question:
> There are currently number of patch waiting on the tracker for
> additional Unicode feature support and it's also likely that we'll
> want to upgrade to a more recent Unicode version within the
> next few years.
> How would such indirect changes be seen under the moratorium ?

That would fall under the Case-by-Case Exemptions section. "Within the
next few years" sounds like it might well wait until the moratorium is
ended though. :-)

I don't see it as a big deal, but technically speaking, with Unicode
6.0 changing the properties of two characters so that they become valid
in identifiers, the Python language definition is affected.  For example,
an alternative implementation based on 5.2.0 will not accept a valid
CPython program that uses one of these characters.

>> During PEP 3003 discussion, it was suggested to handle it on a case by
>> case basis, but I don't see discussion of the upgrade to 6.0.0 in PEP
>> 3003.
> It's covered by "As the standard library is not directly tied to the
> language definition it is not covered by this moratorium."

See above.  Also, it has been suggested that semantics of built-ins
cannot change.  (If that was so, it would put int('????') debate to
rest at least for the time being.:-)

>> Should this upgrade be backported to 2.7?
> No, it's a new feature.
Given that 2.7 will be maintained for 5 years and arguably Unicode
Consortium takes backward compatibility very seriously, wouldn't it
make sense to consider a backport at some point?

I am sure we will soon see a bug report that the following does not
work in 2.7: :-)

>> - How specific should library reference manual be in defining methods
>> affected by UCD such as str.upper()?
> It should specify what this actually does in Unicode terminology
> (probably in addition to a layman's rephrase of that)

I opened an issue for this:

>> .. For example, if '\UXXXXXXXX'.isalpha() returns true
>> in one implementation, can it return false in another?
> Implementations are free to use any version of the UCD.

I was more concerned about wide and narrow unicode CPython builds.  Is
it a bug that   '\UXXXXXXXX'.isalpha() may disagree even when the two
implementations are based on the same version of UCD?

Thanks for your answers.

From alexander.belopolsky at  Tue Nov 30 16:11:24 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 30 Nov 2010 10:11:24 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Nov 30, 2010 at 9:56 AM, haiyang kang <cornsea at> wrote:
>> But you should be able to write:
>> text = input("Enter a number using your preferred digits: ")
>> num = float(text)
>> without caring whether the user enters ?.? or 1.1 or something else.
> yes. from logical point of view, this can happen. ...

Please stop discussing a non-feature.  Python's float *does not*
accept ' ?.?'.  This was reported as a bug and closed as invalid.

See " does not support Unihan digit data"

From barry at  Tue Nov 30 16:35:31 2010
From: barry at (Barry Warsaw)
Date: Tue, 30 Nov 2010 10:35:31 -0500
Subject: [Python-Dev] PEP 291 versus Python 3
In-Reply-To: <>
References: <> <>
Message-ID: <20101130103531.54d79465@mission>

On Nov 30, 2010, at 01:09 PM, Michael Foord wrote:

>PEP 291 is very old and should probably be retired. I don't think anyone is
>maintaining standard libraries in py3k that are also compatible with Python
>2.anything. (At least not in a single codebase.)

I agree.  I think we should change the status of PEP 291 to Final, and add a
few words to make it clear it applies only to Python 2.  Since Neal owns the
PEP, he should get first crack at doing the update, but I volunteer to make
those changes if he declines (or does not respond).

We may eventually need a similar document for Python 3, but it should be a new


From stefan-usenet at  Tue Nov 30 16:55:19 2010
From: stefan-usenet at (Stefan Krah)
Date: Tue, 30 Nov 2010 16:55:19 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
Message-ID: <>

Alexander Belopolsky <alexander.belopolsky at> wrote:
> On Tue, Nov 30, 2010 at 9:56 AM, haiyang kang <cornsea at> wrote:
> >> But you should be able to write:
> >>
> >> text = input("Enter a number using your preferred digits: ")
> >> num = float(text)
> >>
> >> without caring whether the user enters ?.? or 1.1 or something else.
> >
> > yes. from logical point of view, this can happen. ...
> Please stop discussing a non-feature.  Python's float *does not*
> accept ' ?.?'.  This was reported as a bug and closed as invalid.

That seems irrelevant to me. One of the main topics of this thread is
whether actual native speakers would be happy with ascii-only input for

haiyang kang confirmed that this is the case. I hope that more
local speakers will contribute their views.

Stefan Krah

From alexander.belopolsky at  Tue Nov 30 17:40:19 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 30 Nov 2010 11:40:19 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Mon, Nov 29, 2010 at 2:38 PM, Alexander Belopolsky
<alexander.belopolsky at> wrote:
>> Still, if it's not detrimental and it it's not difficult to support,
>> then why do you care?
> It is difficult to support.  A fix for issue10557 would be much
> simpler if we did not support non-European digits.  I now added a
> patch that handles non-ascii digits, so you can see what's involved.
> Note that when Unicode Consortium inevitably adds more Nd characters
> to the non-BMP planes, we will have to add surrogate pairs' support to
> this code.

It turns out that this did in fact happen:

# Newly assigned in Unicode 3.1.0 (March, 2001)


And of course,

>>> unicodedata.digit('\U0001D7CE')
0

but

>>> int('\U0001D7CE')
UnicodeEncodeError: 'decimal' codec can't encode character '\ud835' ..

on a narrow Unicode build.  (Note the character reported in the error message!)
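
The '\ud835' in that message is the first half of a surrogate pair: on a
narrow build a non-BMP character is stored as two UTF-16 code units, and
the decimal codec sees them one at a time.  A sketch that reproduces the
arithmetic by encoding to UTF-16 explicitly (it assumes Python 3.3 or
later, where a non-BMP character is a single str element; the helper name
surrogate_pair is made up for the example):

```python
import unicodedata

def surrogate_pair(ch):
    """Return the (high, low) UTF-16 code units for a non-BMP character."""
    if len(ch) != 1 or ord(ch) <= 0xFFFF:
        raise ValueError("not a single non-BMP character: %r" % ch)
    data = ch.encode("utf-16-be")  # four bytes: high unit, then low unit
    return (int.from_bytes(data[:2], "big"), int.from_bytes(data[2:], "big"))

ch = "\U0001D7CE"  # MATHEMATICAL BOLD DIGIT ZERO
high, low = surrogate_pair(ch)
print(unicodedata.name(ch), hex(high), hex(low))
```

The high unit is the code point named in the error above, not the digit
the user actually typed.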

If you think non-ASCII digits are not difficult to support, please
contribute to the following tracker issues:
(Review and document string format accepted in numeric data type constructors)
(Malformed error message from float())
(Document unicode C-API in reST - Specifically, PyUnicode_EncodeDecimal)
(PyUnicode_EncodeDecimal is undocumented)
(Include more fullwidth chars in the decimal codec)

and back to the issue of user confusion [closed/invalid]
(int(u"\u1234") raises UnicodeEncodeError by Guido van Rossum)

From fuzzyman at  Tue Nov 30 18:40:52 2010
From: fuzzyman at (Michael Foord)
Date: Tue, 30 Nov 2010 17:40:52 +0000
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
	<>	<>
	<>	<>	<>
Message-ID: <>

On 30/11/2010 16:40, Alexander Belopolsky wrote:
> [snip...]
> And of course,
>>>> unicodedata.digit('\U0001D7CE')
> 0
> but
>>>> int('\U0001D7CE')
> ..
> UnicodeEncodeError: 'decimal' codec can't encode character '\ud835' ..
> on a narrow Unicode build.  (Note the character reported in the error message!)
> If you think non-ASCII digits are not difficult to support, please
> contribute to the following tracker issues:

Would moving this functionality to the locale module make the issues any 
easier to fix?


> (Review and document string format accepted in numeric data type constructors)
> (Malformed error message from float())
> (Document unicode C-API in reST - Specifically, PyUnicode_EncodeDecimal)
> (PyUnicode_EncodeDecimal is undocumented)
> (Include more fullwidth chars in the decimal codec)
> and back to the issue of user confusion
> [closed/invalid]
> (int(u"\u1234") raises UnicodeEncodeError by Guido van Rossum)



From alexander.belopolsky at  Tue Nov 30 19:21:30 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 30 Nov 2010 13:21:30 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Tue, Nov 30, 2010 at 12:40 PM, Michael Foord
<fuzzyman at> wrote:
>> If you think non-ASCII digits are not difficult to support, please
>> contribute to the following tracker issues:
> Would moving this functionality to the locale module make the issues any
> easier to fix?

Sure, if we code it in Python, supporting it will be much easier:

import re
import unicodedata

def normalize_digits(s):
    # Distinct digit characters in s; \d matches any Unicode decimal digit.
    digits = {m.group(1) for m in re.finditer(r'(\d)', s)}
    trtab = {ord(d): str(unicodedata.digit(d)) for d in digits}
    return s.translate(trtab)

>>> normalize_digits('????.??')
'1234.56'
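
A variant of the same idea, as a sketch only: it skips the regular
expression and asks unicodedata.decimal() directly, so only true decimal
digits (category Nd) are rewritten and characters such as superscripts
are left alone.  The name normalize_decimal_digits is hypothetical.

```python
import unicodedata

def normalize_decimal_digits(s):
    """Replace each Unicode decimal digit in s with its ASCII equivalent."""
    table = {}
    for ch in set(s):
        value = unicodedata.decimal(ch, None)  # None for non-decimal chars
        if value is not None:
            table[ord(ch)] = str(value)
    return s.translate(table)

print(normalize_decimal_digits("\u0661\u0662\u0663\u0664.\u0665\u0666"))
```

The result is an ordinary ASCII string, so it can be handed to float()
or int() on any build.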

I am not sure this belongs in the locale module, however.  It seems to
me, something like 'unicodealgo' for unicode algorithms would be more
appropriate.

From solipsis at  Tue Nov 30 19:29:52 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 30 Nov 2010 19:29:52 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <1291141792.8628.0.camel@localhost.localdomain>

> Sure, if we code it in Python, supporting it will be much easier:
> def normalize_digits(s):
>     digits = {m.group(1) for m in re.finditer(r'(\d)', s)}
>     trtab = {ord(d): str(unicodedata.digit(d)) for d in digits}
>     return s.translate(trtab)
> >>> normalize_digits('????.??')
> '1234.56'
> I am not sure this belongs to the locale module, however.  It seems to
> me, something like 'unicodealgo' for unicode algorithms would be more
> appropriate.

It could simply be in unicodedata if you split the implementation into a
core C part and some Python bits.



From alexander.belopolsky at  Tue Nov 30 19:59:29 2010
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 30 Nov 2010 13:59:29 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <1291141792.8628.0.camel@localhost.localdomain>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Tue, Nov 30, 2010 at 1:29 PM, Antoine Pitrou <solipsis at> wrote:
>> I am not sure this belongs to the locale module, however.  It seems to
>> me, something like 'unicodealgo' for unicode algorithms would be more
>> appropriate.
> It could simply be in unicodedata if you split the implementation into a
> core C part and some Python bits.

Splitting unicodedata may not be a bad idea.  There are many more
pieces in UCD than covered by unicodedata. [1]  Hardcoding them all
into unicodedata module is hard to justify, but some are quite useful.
 For example, PropertyValueAliases.txt is quite useful for those like
myself who cannot remember what Pd or Zl category names stand for.
SpecialCasing.txt is required for proper casing, but is not currently
included in Python.  I would not want to change str.upper or str.title
because of this, but providing the raw info to someone who wants to
implement proper case mappings may not be a bad idea.  Blocks.txt is
certainly useful for any language-dependent processing.
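To illustrate the PropertyValueAliases.txt point: unicodedata.category() returns only the two-letter code, so the long names have to come from somewhere else. A minimal sketch, assuming a small hand-copied table (CATEGORY_NAMES and describe() are hypothetical names for illustration, not part of unicodedata):

```python
import unicodedata

# Assumption: a tiny hand-copied subset of the long names listed in
# PropertyValueAliases.txt. Python does not ship this table.
CATEGORY_NAMES = {
    "Pd": "Dash_Punctuation",
    "Zl": "Line_Separator",
    "Nd": "Decimal_Number",
    "Lu": "Uppercase_Letter",
}

def describe(ch):
    """Return (short code, long name) for the general category of ch."""
    code = unicodedata.category(ch)
    return code, CATEGORY_NAMES.get(code, "unknown")

print(describe("-"))       # HYPHEN-MINUS
print(describe("\u2028"))  # LINE SEPARATOR
print(describe("5"))
```

A complete table would have to be generated from the UCD files themselves, which is the kind of raw data the message above suggests exposing.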

On the other hand, I think we should keep Unicode data and Unicode
algorithms separate.  And the latter may not even belong to the Python


From martin at  Tue Nov 30 20:13:01 2010
From: martin at ("Martin v. Löwis")
Date: Tue, 30 Nov 2010 20:13:01 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <id2brr$v61$>
References: <>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 30.11.2010 09:15, Hagen Fürstenau wrote:
>>> During PEP 3003 discussion, it was suggested to handle it on a case by
>>> case basis, but I don't see discussion of the upgrade to 6.0.0 in PEP
>>> 3003.
>> It's covered by "As the standard library is not directly tied to the
>> language definition it is not covered by this moratorium."
> How is this restricted to the stdlib if it defines the set of valid
> identifiers?

The language does not change. The language specification says

Python 3.0 introduces additional characters from outside the ASCII range
(see PEP 3131). For these characters, the classification uses the
version of the Unicode Character Database as included in the unicodedata
module.

That remains unchanged. It was a deliberate design decision of PEP 3131
to not codify a fixed set of characters that can be used in identifiers.
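A small sketch of what that wording means in practice: whether a string is a valid identifier depends on the UCD version bundled with the running interpreter, which unicodedata.unidata_version reports. The sample strings are illustrative assumptions only.

```python
import unicodedata

# Version of the Unicode Character Database bundled with this interpreter.
ucd_version = unicodedata.unidata_version

# str.isidentifier() applies the identifier rules of the language
# definition, using the bundled database for non-ASCII characters.
candidates = ["caf\u00e9", "1abc", "x\u0661"]
results = {name: name.isidentifier() for name in candidates}

print(ucd_version)
print(results)
```

A character assigned in a newer UCD release can therefore become usable in identifiers without any change to the language specification.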


From martin at  Tue Nov 30 20:16:49 2010
From: martin at ("Martin v. Löwis")
Date: Tue, 30 Nov 2010 20:16:49 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

> Would moving this functionality to the locale module make the issues any
> easier to fix?

You could delegate it to the C library, so: yes.


From solipsis at  Tue Nov 30 20:23:13 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 30 Nov 2010 20:23:13 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<>	<>
	<>	<>
	<>  <>
Message-ID: <1291144993.8628.1.camel@localhost.localdomain>

On Tuesday, 30 November 2010 at 20:16 +0100, "Martin v. Löwis" wrote:
> > Would moving this functionality to the locale module make the issues any
> > easier to fix?
> You could delegate it to the C library, so: yes.

I hope you don't suggest delegating it to the C locale functions.
Do you?

From martin at  Tue Nov 30 20:40:54 2010
From: martin at ("Martin v. Löwis")
Date: Tue, 30 Nov 2010 20:40:54 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <1291144993.8628.1.camel@localhost.localdomain>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 30.11.2010 20:23, Antoine Pitrou wrote:
> On Tuesday, 30 November 2010 at 20:16 +0100, "Martin v. Löwis" wrote:
>>> Would moving this functionality to the locale module make the issues any
>>> easier to fix?
>> You could delegate it to the C library, so: yes.
> I hope you don't suggest delegating it to the C locale functions.
> Do you?

Yes, I do. Why do you hope I don't?


From brett at  Tue Nov 30 20:41:47 2010
From: brett at (Brett Cannon)
Date: Tue, 30 Nov 2010 11:41:47 -0800
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
In-Reply-To: <id11vk$9d$>
References: <201011251530.23947.emile.anclin@logilab>
Message-ID: <>

On Mon, Nov 29, 2010 at 12:21, Ron Adam <rrr at> wrote:
> On 11/29/2010 01:22 PM, Brett Cannon wrote:
>> On Mon, Nov 29, 2010 at 03:53, Sylvain Thénault
>> <sylvain.thenault at> wrote:
>>> On 25 November 11:22, Ron Adam wrote:
>>>> On 11/25/2010 08:30 AM, Emile Anclin wrote:
>>>>> hello,
>>>>> working on Pylint, we have a lot of voluntary corrupted files to test
>>>>> Pylint behavior; for instance
>>>>> $ cat /home/emile/var/pylint/test/input/
>>>>> # -*- coding: IBO-8859-1 -*-
>>>>> """ check correct unknown encoding declaration
>>>>> """
>>>>> __revision__ = '????'
>>>>> and we try to find that module :
>>>>> find_module('func_unknown_encoding', None). But python3 raises
>>>>> SyntaxError
>>>>> in that case ; it didn't raise SyntaxError on python2 nor does so on
>>>>> our
>>>>> func_nonascii_noencoding and func_wrong_encoding modules (with obvious
>>>>> names)
>>>>> Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36)
>>>>> [GCC 4.3.4] on linux2
>>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>>> >>> from imp import find_module
>>>>> >>> find_module('func_unknown_encoding', None)
>>>>> Traceback (most recent call last):
>>>>>   File "<stdin>", line 1, in <module>
>>>>> SyntaxError: encoding problem: with BOM
>>>> I don't think there is a clear reason by design.  Also try importing
>>>> the same modules directly and noting the differences in the errors
>>>> you get.
>>> IMO the point is that we can consider it a bug that find_module
>>> tries to read part of the content of the file, no? Though it seems to
>>> do this only for encoding detection or the like, since find_module
>>> doesn't choke on a module containing another kind of syntax error.
>>> So the question is, should we deal with this in pylint/astng, or can we
>>> expect this to be fixed at some point?
>> Considering these semantics changed between Python 2 and 3 w/o a
>> discernable benefit (I would consider it a negative as finding a
>> module should not be impacted by syntactic correctness; the full act
>> of importing should be the only thing that cares about that), I would
>> consider it a bug that should be filed.
> imp.find_module() returns an open file io object, and its
> output feeds directly into imp.load_module().
> >>> imp.find_module('pydoc')
> (<_io.TextIOWrapper name=4 encoding='utf-8'>,
> '/usr/local/lib/python3.2/', ('.py', 'U', 1))
> So I think imp.find_module() is supposed to be used when you *do* want to
> do the full act of importing and not for just finding out if or where module
> xyz exists.

Going with your line of argument, why can't imp.load_module be the
call that figures out there is a syntax error? If you look at this
from the perspective of PEP 302, finding a module has absolutely
nothing to do with the validity of the found source, just that
something was found somewhere which (hopefully) contains code that
represents the module.

From solipsis at  Tue Nov 30 20:44:14 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 30 Nov 2010 20:44:14 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<>	<>
	<>	<>
	<>  <>
Message-ID: <1291146254.8628.4.camel@localhost.localdomain>

On Tuesday, 30 November 2010 at 20:40 +0100, "Martin v. Löwis" wrote:
> On 30.11.2010 20:23, Antoine Pitrou wrote:
> > On Tuesday, 30 November 2010 at 20:16 +0100, "Martin v. Löwis" wrote:
> >>> Would moving this functionality to the locale module make the issues any
> >>> easier to fix?
> >>
> >> You could delegate it to the C library, so: yes.
> > 
> > I hope you don't suggest delegating it to the C locale functions.
> > Do you?
> Yes, I do. Why do you hope I don't?

Because we all know how locale is a pile of cr*p, both in specification
and in implementations. Our unit tests for it are a clear proof of that.

Actually, I remember you saying that locale should ideally be replaced
with a wrapper around the ICU library.



From brett at  Tue Nov 30 20:46:07 2010
From: brett at (Brett Cannon)
Date: Tue, 30 Nov 2010 11:46:07 -0800
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
In-Reply-To: <>
References: <201011251530.23947.emile.anclin@logilab>
	<id11vk$9d$> <>
Message-ID: <>

On Tue, Nov 30, 2010 at 00:34, Sylvain Thénault
<sylvain.thenault at> wrote:
> On 29 November 14:21, Ron Adam wrote:
>> On 11/29/2010 01:22 PM, Brett Cannon wrote:
>> >Considering these semantics changed between Python 2 and 3 w/o a
>> >discernable benefit (I would consider it a negative as finding a
>> >module should not be impacted by syntactic correctness; the full act
>> >of importing should be the only thing that cares about that), I would
>> >consider it a bug that should be filed.
>> imp.find_module() returns an open file io object, and
>> its output feeds directly into imp.load_module().
>> >>> imp.find_module('pydoc')
>> (<_io.TextIOWrapper name=4 encoding='utf-8'>,
>> '/usr/local/lib/python3.2/', ('.py', 'U', 1))
>> So I think imp.find_module() is supposed to be used when you *do*
>> want to do the full act of importing and not for just finding out if
>> or where module xyz exists.
> In Python 2, find_module was usable for this purpose, and this is a needed API
> for a tool like pylint. Is there another way to do so with Python 3?

At the moment, no. Best option would be to create an
importlib.find_module function which returns a loader if the module is
found, else returns None. The loader can have its get_source method
called to read the source code (w/o verification). I have this planned
for Python 3.3 but not 3.2 with us so close to 3.2b1.
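For readers on later releases: Python 3.4 and newer provide importlib.util.find_spec, which locates a module without decoding or compiling its source. A minimal sketch of the kind of lookup pylint needs, assuming a top-level module on sys.path (locate_source and the temporary file are hypothetical, for illustration only):

```python
import importlib
import importlib.util
import os
import sys
import tempfile

def locate_source(name):
    """Return the file path of module `name` without importing it, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.has_location:
        return None
    return spec.origin

# Build a deliberately broken module like the one in the original report.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "func_unknown_encoding.py")
with open(path, "wb") as f:
    f.write(b"# -*- coding: IBO-8859-1 -*-\n__revision__ = 'x'\n")

sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

found = locate_source("func_unknown_encoding")
missing = locate_source("no_such_module_for_this_sketch")
print(found, missing)
```

Note that for dotted names find_spec imports the parent package first, so locating a submodule is not entirely free of side effects.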


From martin at  Tue Nov 30 20:55:52 2010
From: martin at ("Martin v. Löwis")
Date: Tue, 30 Nov 2010 20:55:52 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <1291146254.8628.4.camel@localhost.localdomain>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>
	<>	<1291144993.8628.1.camel@localhost.localdomain>	<>
Message-ID: <>

> Because we all know how locale is a pile of cr*p, both in specification
> and in implementations. Our unit tests for it are a clear proof of that.

I wouldn't use expletives, but rather claim that the locale module is
highly platform-dependent.

> Actually, I remember you saying that locale should ideally be replaced
> with a wrapper around the ICU library.

I stand by that; however, I have given up the hope that this will
happen anytime soon.

With respect to local number parsing, I think that the locale module would be
way better than the nonsense that Python currently does. In the locale
module, somebody at least has thought about what specifically
constitutes a number. The current not-ASCII-but-not-local-either
approach is just useless.

Maintaining a reasonable implementation is a burden, so deferring
to the C library is more attractive than having to maintain an
unreasonable implementation.


From solipsis at  Tue Nov 30 21:11:59 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 30 Nov 2010 21:11:59 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>
	<>	<>
	<>	<>
	<>  <>
Message-ID: <1291147919.8628.12.camel@localhost.localdomain>

On Tuesday, 30 November 2010 at 20:55 +0100, "Martin v. Löwis" wrote:
> With respect to local number parsing, I think that the locale module would be
> way better than the nonsense that Python currently does. In the locale
> module, somebody at least has thought about what specifically
> constitutes a number. The current not-ASCII-but-not-local-either
> approach is just useless.

It depends on what you need. If you parse integers it's probably good
enough. And it's better to have a trustworthy standard (Unicode) than a
myriad of ad-hoc, possibly buggy or incomplete, often unavailable,
cultural specifications drafted by OS vendors who have no business (and
no expertise) in drafting them.

At least you can build more sophisticated routines on the simple
information given to you by the unicode database. You cannot build
anything solid on the C locale functions (and even then you are limited
by various issues inherent in the locale semantics, such as the fact
that it relies on process-wide state, which would only be ok, at best,
for single-user applications). There's a reason that e.g. Babel (*)
reimplements locale-like functionality from scratch.
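The process-wide state mentioned above can be seen in a short sketch: any code that wants a particular numeric locale has to set it globally and restore it afterwards. The context manager name is hypothetical, and the "C" locale is used only because it is available everywhere.

```python
import contextlib
import locale

@contextlib.contextmanager
def temporary_numeric_locale(name):
    """Switch LC_NUMERIC for the whole process, then restore it."""
    saved = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, name)
        yield
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)

with temporary_numeric_locale("C"):
    # Every thread in the process sees this setting while the block runs.
    value = locale.atof("1234.56")
    point = locale.localeconv()["decimal_point"]

print(value, point)
```

Because the setting is global, two requests handled concurrently in one process cannot safely use different locales this way.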




From brett at  Tue Nov 30 21:11:58 2010
From: brett at (Brett Cannon)
Date: Tue, 30 Nov 2010 12:11:58 -0800
Subject: [Python-Dev] PEP 291 versus Python 3
In-Reply-To: <20101130103531.54d79465@mission>
References: <> <>
Message-ID: <>

On Tue, Nov 30, 2010 at 07:35, Barry Warsaw <barry at> wrote:
> On Nov 30, 2010, at 01:09 PM, Michael Foord wrote:
>>PEP 291 is very old and should probably be retired. I don't think anyone is
>>maintaining standard libraries in py3k that are also compatible with Python
>>2.anything. (At least not in a single codebase.)
> I agree.

Same here; I have purposefully ignored compatibility requirements
because I always found those promises to be extremely annoying and
somewhat painful to enforce.

> I think we should change the status of PEP 291 to Final, and add a
> few words to make it clear it applies only to Python 2.  Since Neal owns the
> PEP, he should get first crack at doing the update, but I volunteer to make
> those changes if he declines (or does not respond).

I will channel Neal: "I decline and/or do not want to respond". =)

> We may eventually need a similar document for Python 3, but it should be a new
> PEP.

I hope not.

From solipsis at  Tue Nov 30 21:13:07 2010
From: solipsis at (Antoine Pitrou)
Date: Tue, 30 Nov 2010 21:13:07 +0100
Subject: [Python-Dev] ICU
In-Reply-To: <>
References: <>
	<>	<>
	<>	<>
	<>  <>
Message-ID: <1291147987.8628.13.camel@localhost.localdomain>

Oh, about ICU:

> > Actually, I remember you saying that locale should ideally be replaced
> > with a wrapper around the ICU library.
> By that, I stand - however, I have given up the hope that this will
> happen anytime soon.

Perhaps this could be made a GSOC topic.



From ben+python at  Tue Nov 30 21:24:08 2010
From: ben+python at (Ben Finney)
Date: Wed, 01 Dec 2010 07:24:08 +1100
Subject: [Python-Dev] Python and the Unicode Character Database
References: <>
Message-ID: <>

haiyang kang <cornsea at> writes:

>   I think it is a little ugly to have code like this: num =
> float("?.?"), expected result is: num = 1.1

That's a straw man, though. The string need not be a literal in the
program; it can be input to the program.

    num = float(input_from_the_external_world)

Does that change your assessment of whether non-ASCII digits are used?

 \        "The greatest tragedy in mankind's entire history may be the |
  `\       hijacking of morality by religion." -Arthur C. Clarke, 1991 |
_o__)                                                                  |
Ben Finney

From barry at  Tue Nov 30 22:05:43 2010
From: barry at (Barry Warsaw)
Date: Tue, 30 Nov 2010 16:05:43 -0500
Subject: [Python-Dev] PEP 291 versus Python 3
In-Reply-To: <>
References: <> <>
Message-ID: <20101130160543.3b478311@mission>

On Nov 30, 2010, at 12:11 PM, Brett Cannon wrote:

>I will channel Neal: "I decline and/or do not want to respond". =)

PEP 291 updated.

From tjreedy at  Tue Nov 30 23:43:22 2010
From: tjreedy at (Terry Reedy)
Date: Tue, 30 Nov 2010 17:43:22 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <>
References: <>	<>
Message-ID: <id3umc$ps8$>

On 11/30/2010 3:23 AM, Stephen J. Turnbull wrote:

> I see no reason not to make a similar promise for numeric literals.  I
> see no good reason to allow compatibility full-width Japanese "ASCII"
> numerals or Arabic cursive numerals in "for i in range(...)" for
> example.

I do not think that anyone, at least not me, has argued for anything 
other than 0-9 digits (or 0-f for hex) in literals in program code. The 
only issue is whether non-programmer *users* should be able to use their 
native digits in applications in response to input prompts.
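The distinction between literals and input can be checked directly. A minimal sketch, assuming Arabic-Indic digits as the sample input (any Unicode decimal digits should behave the same way; compiles() is a hypothetical helper):

```python
# Sample user input: Arabic-Indic digits with an ASCII decimal point.
user_input = "\u0661\u0662\u0663\u0664.\u0665\u0666"

# float() accepts Unicode decimal digits in strings it is asked to parse.
parsed = float(user_input)

def compiles(source):
    """Return True if source is accepted by the compiler."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True

# The same digits are rejected as a numeric literal in program code.
literal_ok = compiles("x = \u0661\u0662")
ascii_ok = compiles("x = 12")

print(parsed, literal_ok, ascii_ok)
```

So the behaviour under discussion applies only to strings handed to float() or int() at run time, not to source code.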

Terry Jan Reedy

From rrr at  Tue Nov 30 23:48:56 2010
From: rrr at (Ron Adam)
Date: Tue, 30 Nov 2010 16:48:56 -0600
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
In-Reply-To: <>
References: <201011251530.23947.emile.anclin@logilab>	<>	<>	<>	<id11vk$9d$>
Message-ID: <id3v0p$r97$>

On 11/30/2010 01:41 PM, Brett Cannon wrote:
> On Mon, Nov 29, 2010 at 12:21, Ron Adam<rrr at>  wrote:
>> On 11/29/2010 01:22 PM, Brett Cannon wrote:
>>> On Mon, Nov 29, 2010 at 03:53, Sylvain Thénault
>>> <sylvain.thenault at>    wrote:
>>>> On 25 November 11:22, Ron Adam wrote:
>>>>> On 11/25/2010 08:30 AM, Emile Anclin wrote:
>>>>>> hello,
>>>>>> working on Pylint, we have a lot of voluntary corrupted files to test
>>>>>> Pylint behavior; for instance
>>>>>> $ cat /home/emile/var/pylint/test/input/
>>>>>> # -*- coding: IBO-8859-1 -*-
>>>>>> """ check correct unknown encoding declaration
>>>>>> """
>>>>>> __revision__ = '????'
>>>>>> and we try to find that module :
>>>>>> find_module('func_unknown_encoding', None). But python3 raises
>>>>>> SyntaxError
>>>>>> in that case ; it didn't raise SyntaxError on python2 nor does so on
>>>>>> our
>>>>>> func_nonascii_noencoding and func_wrong_encoding modules (with obvious
>>>>>> names)
>>>>>> Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36)
>>>>>> [GCC 4.3.4] on linux2
>>>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>>>> >>> from imp import find_module
>>>>>> >>> find_module('func_unknown_encoding', None)
>>>>>> Traceback (most recent call last):
>>>>>>   File "<stdin>", line 1, in <module>
>>>>>> SyntaxError: encoding problem: with BOM
>>>>> I don't think there is a clear reason by design.  Also try importing
>>>>> the same modules directly and noting the differences in the errors
>>>>> you get.
>>>> IMO the point is that we can consider it a bug that find_module
>>>> tries to read part of the content of the file, no? Though it seems to
>>>> do this only for encoding detection or the like, since find_module
>>>> doesn't choke on a module containing another kind of syntax error.
>>>> So the question is, should we deal with this in pylint/astng, or can we
>>>> expect this to be fixed at some point?
>>> Considering these semantics changed between Python 2 and 3 w/o a
>>> discernable benefit (I would consider it a negative as finding a
>>> module should not be impacted by syntactic correctness; the full act
>>> of importing should be the only thing that cares about that), I would
>>> consider it a bug that should be filed.
>> imp.find_module() returns an open file io object, and its
>> output feeds directly into imp.load_module().
>> >>> imp.find_module('pydoc')
>> (<_io.TextIOWrapper name=4 encoding='utf-8'>,
>> '/usr/local/lib/python3.2/', ('.py', 'U', 1))
>> So I think imp.find_module() is supposed to be used when you *do* want to
>> do the full act of importing and not for just finding out if or where module
>> xyz exists.
> Going with your line of argument, why can't imp.load_module be the
> call that figures out there is a syntax error? If you look at this
> from the perspective of PEP 302, finding a module has absolutely
> nothing to do with the validity of the found source, just that
> something was found somewhere which (hopefully) contains code that
> represents the module.

The part that I'm looking at is: what would find_module return if the
declared encoding is bad or cannot be found?

    <_io.TextIOWrapper name=4 encoding='bad_encoding'>
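One way to see why the text wrapper is the sticking point: the declared encoding has to be resolved before a TextIOWrapper can be built. A minimal sketch using tokenize.detect_encoding, which raises SyntaxError for an unknown codec (declared_encoding is a hypothetical helper, not what imp does internally):

```python
import io
import tokenize

def declared_encoding(source_bytes):
    """Return the source encoding, or None if the declaration is unusable."""
    try:
        encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    except SyntaxError:
        # Raised for an unknown codec name, e.g. "IBO-8859-1".
        return None
    return encoding

bad = declared_encoding(b"# -*- coding: IBO-8859-1 -*-\n__revision__ = 'x'\n")
latin = declared_encoding(b"# -*- coding: latin-1 -*-\nx = 1\n")
default = declared_encoding(b"x = 1\n")
print(bad, latin, default)
```

A finder that returned raw bytes or just a path would not need this step at all, which is the direction the PEP 302 argument points in.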

Maybe we could have some library introspection function in the inspect module
for just looking in the library rather than loading modules.  But I think those
would have the same issues, as packages need to be loaded in order to find 
sub modules.*

* It almost seems like the concept of a sub-module (in a package) is 
flawed.  I'm not sure I can explain what causes me to feel that way at the 
moment though.
