From brett at  Sat Aug  1 00:17:53 2009
From: brett at (Brett Cannon)
Date: Fri, 31 Jul 2009 15:17:53 -0700
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jul 31, 2009 at 14:16, Jacob Rus <jacobolus at> wrote:

> Hi all,
> In an attempt to figure out some twisted.web code, I was reading
> through the Python Standard Library's mimetypes module today, and
> was shocked at the poor quality of the code. I wonder how the
> mimetypes code made it into the standard library, and whether anyone
> has ever bothered to read it or update it: it is an embarrassment.
> Much of the code is redundant, portions fail to execute, control
> flow is routed through a horribly confusing mess of spaghetti, and
> most of the complexity has no clear benefit as far as I can tell. I
> probably should drop the subject and get back to work, but as a good
> citizen, it's hard to just ignore this sort of thing.

I have not looked at the code nor ever used it (that I can remember) so I
can't directly address the quality. But I can say the code was added in 1997
which puts it as an addition in Python 1.4. That's way before Python took
off in the mainstream and began to tighten up the quality control on the
standard library.

I would also like to say that I am not embarrassed by anything in Python.
It's unfortunate if the mimetypes module's code is a mess, but I think
calling it embarrassing is taking it a little far and is borderline insulting
(which I don't think you meant to do).

> The module stores its types in a pair of dictionaries, one for
> "strict" use, and the other for "non-standard types". It creates the
> strict dictionary by default out of apache's mime.types file, and
> then overrides the entries it finds with a set of exceptions. Then
> it creates the non-standard dictionary, which is set to match if the
> strict parameter is set to False when guessing types. Just in this
> basic design, and in the list of types in the file, there are
> several problems:
>  * Various apache mime types files are read, if found, but the
>    ordering of the files is such that older versions of apache are
>    sometimes read after newer ones, overriding updated mime types
>    with out-of-date versions if multiple versions of apache are
>    installed on the system.
>  * The vast majority of types declared in the module are
>    duplicates of types already declared by Apache. In a few cases
>    this is to change the apache default (make an exception, that
>    is), but in most cases the mime type and extension are
>    completely identical. This huge number of redundant types makes
>    the file substantially harder to follow. No comments are
>    provided to explain why various sets of exceptions are made to
>    Apache's default mime types, and in several cases the module
>    seems to just be out of date as compared to recent versions of
>    Apache, for instance not knowing about the 'text/troff' type
>    which was registered in January 2006 in RFC 4263.
>  * The 'non-standard' type dictionary is nearly useless, because
>    all of the types it declares are already in apache's mime.types
>    file, meaning that types are, as far as I can tell trying to
>    follow ugly program flow, *never* drawn from the non-strict
>    dictionary, except in the improbable situation where the
>    mimetypes module is initialized with a custom set of
>    apache-mime.types-like files, which do not include those
>    'non-standard' types. I personally cannot see a use case for
>    initializing the module with a custom set of mime types, but
>    then leaving the very few types included as non-strict to the
>    defaults: this seems like a fragile and pathological use case.
>    Given this, I don't see any benefit to dragging the 'strict'
>    parameter along all the way through the code, and would advise
>    getting rid of it altogether. Does anyone know of any code that
>    uses the mimetypes module with strict set to False, where the
>    non-strict code path ever *actually* is executed?
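
As a rough illustration of how the 'strict' flag selects between the two
dictionaries described above, here is a minimal sketch using a private
MimeTypes instance. The extension '.zzdemo' and the type
'application/x-demo' are made up for the example, and are assumed not to
appear in any mime.types file on the system.

```python
import mimetypes

# A private instance, so the module-level database is left alone.
db = mimetypes.MimeTypes()

# Register a made-up type in the non-standard ("common") dictionary only.
db.add_type("application/x-demo", ".zzdemo", strict=False)

# Same file name, looked up with and without the strict flag.
strict_guess = db.guess_type("sample.zzdemo", strict=True)
lenient_guess = db.guess_type("sample.zzdemo", strict=False)
```

With strict=True only the standard dictionary is consulted; with
strict=False the non-standard dictionary is used as a fallback.
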
> But though these problems, which affect actual use of the code and
> are therefore probably most important, are significant, they really
> pale in comparison to the awful quality of implementation. I'll try
> to briefly outline my understanding of how code flows in
> the module, and what the problems are. I haven't stepped through
> the code in a debugger, this is just from reading it, so I apologize
> in advance if I get something wrong. This is, however, some of the
> worst code I've seen in the standard library or anywhere else.
>  * It defines __all__: I didn't even realize __all__ could be used
>    for single-file modules (w/o submodules), but it definitely
>    shouldn't be here.

__all__ is used to control what a module exports when used in an import *,
nothing more. Thus its use in a module compared to a package is completely
legitimate.

> This specific __all__ oddly does not include
>    all of the documented variables and functions in the mimetypes
>    class. It's not clear why someone calling import * here wouldn't
>    want the bits not included.

If something is documented but not listed in __all__ that is a bug.
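
To make the point about __all__ concrete, here is a small sketch with a
throwaway single-file module built in memory. The module name demo_mod and
its two functions are hypothetical; the only claim illustrated is that
__all__ limits what `import *` copies.

```python
import sys
import types

# Source of a throwaway single-file module (hypothetical name "demo_mod").
source = '''
__all__ = ["public_fn"]

def public_fn():
    return "public"

def other_fn():
    return "not exported by import *"
'''
demo_mod = types.ModuleType("demo_mod")
exec(source, demo_mod.__dict__)
sys.modules["demo_mod"] = demo_mod

# Run a star-import into a scratch namespace and see which names arrive.
namespace = {}
exec("from demo_mod import *", namespace)
star_imported = sorted(name for name in namespace if not name.startswith("__"))
```

Names left out of __all__ stay reachable as attributes of the module; they
are just skipped by the star-import.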

>  * It creates a _default_mime_types() function which declares a
>    bunch of global variables, and then immediately calls
>    _default_mime_types() below the definition. There is literally
>    no difference in result between this and just putting those
>    variables at the top level of the file, so I have no idea why
>    this function exists, except to make the code more confusing.

It could potentially be used for testing, but that's a guess.

>  * It allows command line usage: I don't think this is necessary
>    for a part of the standard library like this. There are better
>    tools for finding mime types from the command line which ship
>    with most operating systems.

Yeah, various modules have command-line versions which are not truly
necessary. This can probably stand to go.

>  * Its API is pretty poorly designed. It offers 6 functions when
>    about 3 are needed, and it takes a couple reads-through of the
>    code to figure out exactly what any of them are supposed to do.
>  * The operation is crazy: It defines a MimeTypes class which
>    actually stores the type mappings, but this class is designed to
>    be a singleton. The way that such a design is enforced is
>    through the use of the module-global 'init' function, which
>    makes an instance of the class, and then maps all of the
>    functions in the module global namespace to instance methods.
>    But confusingly, all such functions are also defined
>    independently of the init function, with definitions such as:
>        def guess_type(url, strict=True):
>            if not inited:
>                init()
>            return guess_type(url, strict)
>    I'd be amazed if anyone could guess what that code was trying to
>    do. I did a double-take when I saw it.

Probably came from someone who is very OO happy. Not everyone comes to
Python ready to embrace its procedural or slightly functional facets.
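
The wrapper quoted above is easier to follow in a reduced form. What follows
is a sketch of the same rebinding pattern in a stand-in module; the names
fake_mimetypes and Registry are hypothetical, and this is not the stdlib
source.

```python
import types

# Stand-in module reproducing the pattern: a wrapper that looks recursive,
# plus an init() that rebinds the wrapper's own global name.
source = '''
inited = False

class Registry:
    def guess_type(self, url, strict=True):
        return "text/plain" if url.endswith(".txt") else None

def guess_type(url, strict=True):
    if not inited:
        init()
    # By now the global name guess_type refers to the bound method.
    return guess_type(url, strict)

def init():
    global guess_type, inited
    inited = True
    guess_type = Registry().guess_type
'''
fake = types.ModuleType("fake_mimetypes")
exec(source, fake.__dict__)

early_reference = fake.guess_type      # what an early from-import would hold
result = early_reference("notes.txt")  # triggers init(), then the rebound call
late_reference = fake.guess_type       # what attribute access sees afterwards
```

The early reference keeps pointing at the wrapper function, while attribute
access on the module now yields the bound method, so the two spellings of the
import end up holding different objects.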

>    Of course, that return call is only ever reached the first time
>    this function is called, if init() has not happened yet. This
>    was all presumably done for lazy initialization, so that the
>    type information would only be loaded when needed. Needless to
>    say, there are more pythonic ways to accomplish such a goal.
>    Oh, also, the other good one here is that it means that someone
>    who writes `from mimetypes import guess_type` gets something
>    different than someone who writes:
>    `import mimetypes; guess_type = mimetypes.guess_type`. In the
>    former case, this wrapper function is saved as guess_type, which
>    each time just calls the (changed after init())
>    mimetypes.guess_type function. This caused a performance
>    nightmare before March of this year, when there was no check for
>    `if not inited` before running init() (amazing!?).
>  * Because the type datastore is set up to be a singleton, any time
>    init() is called in one section of code, it resets any types
>    which have been added manually: this means that if init() is
>    called by different pieces of code in the same python program,
>    they will interfere with each other's type databases, and break
>    each other. This is extremely fragile and, in my opinion, crazy.
>    It is hard for me to imagine any use case that would benefit
>    from this ability to clobber custom type mappings, and I very
>    much doubt that any code calling the mimetypes module realizes
>    that the contract of the API is so flimsy by definition. In
>    practice, I would not advise consumers of this API to ever call
>    init() manually, or to ever add custom mime type mappings,
>    because they are setting themselves up for hard-to-track bugs
>    down the line.
>  * The 'inited' flag is a documented part of the interface, in the
>    standard library documentation. I cannot imagine any reason to
>    set this flag manually: setting it to false when it was true
>    will have no effect, because the top-level functions have
>    already been replaced by instance methods of the 'db' MimeTypes
>    instance. Setting it to true when it was false will make the
>    code just break outright.
>  * In python 3, this has been changed a bit. There's still an
>    inited flag, and it is still in the docs, but now the awful code
>    from above has been changed slightly, to:
>        def guess_type(url, strict=True):
>            if _db is None:
>                init()
>            return _db.guess_type(url, strict)
>    Which is still embarrassingly confusing. On the upside, the
>    inited flag now does literally nothing, but remains defined, and
>    in the docs.
>  * The 'types_map' and 'common_types' (for 'strict' and
>    'common' types, respectively) dictionaries are also a documented
>    part of the interface. When init() is called, a new MimeTypes
>    instance makes a (different) types_map which is a tuple of two
>    dictionaries, for 'strict' and 'common' types. Then this
>    instance reads the apache mime.types files and adds the types to
>    its pair of self.types_map dictionaries, and then after that
>    looks at the global types_map and common_types dictionaries and
>    adds *those* types to its self.types_map. Then at the end it
>    replaces the global types_map with self.types_map[True] and
>    replaces common_types with self.types_map[False]. Unfortunately,
>    while changing these dictionaries will have an effect on the
>    operation of the library, it will not update the types_map_inv
>    mapping, so inverse lookups will not behave as the changer
>    expects. If these dictionaries are going to remain documented,
>    the documentation should be clear to describe them as read only
>    to avoid very confusing bugs.
>  * Speaking of these dictionaries, .copy() is called on those two
>    and a few others inside MimeTypes.__init__(), which happens every
>    time the global init() function is called, but then init() puts
>    the copies back in the global namespace, meaning that the
>    original is discarded. Basically the only reason for the .copy()
>    is to make sure that the correct updates are applied to the
>    apache mimetype defaults, but the code will gladly re-read all
>    of the apache files even after its mapped types are already in
>    these dictionaries, essentially making re-initializing a (very
>    expensive) no-op. All we're doing is a lot of unnecessary extra
>    disk reads and memory allocations and deallocations. The only
>    time this has any effect is when a non-singleton MimeTypes
>    instance is created, as in the read_mime_types function.
>  * And that read_mime_types function is a doozy. It tries to open a
>    filename, spits back None if there's an IOError (instead of
>    raising the exception as it should), and then creates a new
>    MimeTypes instance (remember, this is identical to the singleton
>    MimeTypes instance because it starts itself from that one's
>    mappings), adds any new types it finds in the file with that
>    name, and then returns the 'strict' types_map from it. I'm not
>    sure whether any sane user of this API would expect it to return
>    the existing type mappings *plus* the extra ones in the provided
>    filename, but I really can't imagine this function ever being
>    particularly useful: it requires you are reading mime types in
>    apache format, but not the apache mime type files you already
>    looked at, and then the only way to find out what new mappings
>    were defined is to take the difference of the default mappings
>    with the result of the function.
>  * The code itself, on a line-by-line basis, is unpythonic and
>    unnecessarily verbose, confusing, and slow. The code should be
>    rewritten to use python 2.3-2.6 features: even leaving its
>    functionality identical it could be cut to about half the number
>    of lines, and made clearer.
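
The read_mime_types behaviour described in the list above can be checked
directly. This is a small sketch, assuming a scratch file in mime.types
format that holds one made-up type and extension.

```python
import mimetypes
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch_dir:
    path = os.path.join(scratch_dir, "extra.types")
    with open(path, "w") as handle:
        # mime.types format: a type followed by its extensions (no dots).
        handle.write("application/x-demo zzdemo\n")

    # Existing file: the result holds the new entry plus the defaults.
    merged = mimetypes.read_mime_types(path)

    # Missing file: None is returned rather than an exception being raised.
    missing = mimetypes.read_mime_types(os.path.join(scratch_dir, "absent.types"))
```

Finding only the entries contributed by the file means diffing the result
against the defaults, as noted above.
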
> In case the above doesn't make this clear: this code is extremely
> confusing.

Yeah, kind of picked up on that. =)
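
On the concern above about shared state in the singleton: one way to sidestep
it, sketched here under the assumption that a private database is acceptable
to the caller, is to give each piece of code its own MimeTypes instance. The
type and extension are made up for the example.

```python
import mimetypes

# Two independent databases, as two unrelated pieces of code might create.
first = mimetypes.MimeTypes()
second = mimetypes.MimeTypes()

# A custom mapping added to one instance only.
first.add_type("application/x-demo", ".zzdemo")

seen_by_first = first.guess_type("report.zzdemo")
seen_by_second = second.guess_type("report.zzdemo")
```

A mapping added to one instance is not visible from the other, so neither
caller can clobber the other's additions.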

> Trying to read it has caused all the people around me to
> look up as I shout "what the fuck??!" at the screen every few
> minutes, as each new revelation gives another surprise. I'm not
> convinced that I completely understand what the code does, because
> it has been quite effectively obfuscated, but I understand enough to
> want to throw the whole thing out, and start essentially from
> scratch.
> So the question is, what should be done about this? I'd like to hear
> how people use the mimetypes module, and how they expect it to work,
> to figure out the sanest possible mostly-backwards-compatible
> replacement which could be dropped in (ideally this would just allow
> the use of default mimetypes and rip out the ability to alter the
> default datastore: or is there some easy way to change this away
> from a singleton without breaking code which calls these methods?),
> and then extend that replacement to support a somewhat saner model
> for anyone who actually wants to extend the set of mappings. My
> guess is that replacement code could actually fix subtle bugs in
> existing uses of this module, by people who had a sane expectation
> of how it was supposed to work.
> At the very least, the parts about figuring out exactly which
> exceptions to Apache's set of default types are useful would be a
> good idea, and I'd maybe even recommend including an up-to-date copy
> of Apache's mime.types file in the Python Standard Library, and then
> only overriding its definitions for future versions of Apache (and
> then overriding the combination of both of those with further
> exceptions deemed useful for python, with comments explaining why
> each exception), so that we're not bothering to look up horribly
> out-of-date types in multiple locations from Apache 1, 1.2, 1.3,
> etc. I'd also recommend making the API for overriding definitions be
> the same as the code used to declare the default overrides, because
> as it is there are three ways to define types: a) in a mime.types
> formatted file, b) in a python dictionary that gets initialized with
> a confusing bit of code, and c) through the add_type function.
> Does anyone else have thoughts about this, or maybe some good (it
> had better be *really* good) explanations why this code is the way
> it is? I'd be happy to try to rewrite it, but I think I'd need a bit
> of help figuring out how to make the rewrite backwards-compatible.
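
For reference, the three ways of defining types listed above can all be
routed into a single MimeTypes instance. This is a sketch with made-up types
and extensions; readfp and add_type are the existing instance methods, and
the dictionary loop is only one possible way to feed a plain mapping through
add_type.

```python
import io
import mimetypes

db = mimetypes.MimeTypes()

# a) mime.types-formatted text: type first, then extensions without dots.
db.readfp(io.StringIO("application/x-demo-a zzdemoa  # trailing comment\n"))

# b) a plain dictionary, fed through the same add_type call.
extra_types = {".zzdemob": "application/x-demo-b"}
for extension, mime_type in extra_types.items():
    db.add_type(mime_type, extension)

# c) a direct add_type call.
db.add_type("application/x-demo-c", ".zzdemoc")

guesses = [
    db.guess_type("file" + extension)[0]
    for extension in (".zzdemoa", ".zzdemob", ".zzdemoc")
]
```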

So the problem of changing fundamentally how the code works, even for a
cleanup, is that it will break someone's code out there because they
depended on the module's crazy way of doing things. Now if they are cheating
and looking at things that are meant to be hidden you might be able to clean
things up, but if the semantics are exposed to the user, then there is not
much we can do w/o breaking someone's code.

Honestly, if the code is as bad as it seems -- including its API --, the
best bet would be to come up with a new module for handling MIME types from
scratch, put it up on the Cheeseshop/PyPI, and get the community behind it.
If the community picks it up as the de-facto replacement for mimetypes and
the code has settled we can then talk about adding it to the standard
library and begin deprecating mimetypes.

And thanks for being willing to volunteer to fix this.


From amcnabb at  Sat Aug  1 00:01:10 2009
From: amcnabb at (Andrew McNabb)
Date: Fri, 31 Jul 2009 16:01:10 -0600
Subject: [Python-Dev] standard library mimetypes module
	pathologically	broken?
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jul 31, 2009 at 09:16:02PM +0000, Jacob Rus wrote:
>   * The operation is crazy: It defines a MimeTypes class which 
>     actually stores the type mappings, but this class is designed to 
>     be a singleton. The way that such a design is enforced is 
>     through the use of the module-global 'init' function, which 
>     makes an instance of the class, and then maps all of the 
>     functions in the module global namespace to instance methods. 
>     But confusingly, all such functions are also defined 
>     independently of the init function, with definitions such as:
>         def guess_type(url, strict=True):
>             if not inited:
>                 init()
>             return guess_type(url, strict)

I can't speak for any of your other complaints, but I know that this
weird init stuff is fixed in trunk.

For the other stuff, you seem to have some very good points.  I'm sure a
patch would be welcome.

Andrew McNabb
PGP Fingerprint: 8A17 B57C 6879 1863 DE55  8012 AB4D 6098 8826 6868

From rdmurray at  Sat Aug  1 00:33:20 2009
From: rdmurray at (R. David Murray)
Date: Fri, 31 Jul 2009 18:33:20 -0400 (EDT)
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, 31 Jul 2009 at 15:17, Brett Cannon wrote:
>>  * It creates a _default_mime_types() function which declares a
>>    bunch of global variables, and then immediately calls
>>    _default_mime_types() below the definition. There is literally
>>    no difference in result between this and just putting those
>>    variables at the top level of the file, so I have no idea why
>>    this function exists, except to make the code more confusing.
> It could potentially be used for testing, but that's a guess.

regrtest calls it from dash_R_cleanup as part of "clear[ing]
assorted module caches".
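
A minimal sketch of why a function that re-assigns module globals is handy
for that kind of cleanup. The names below are hypothetical stand-ins, not the
regrtest or mimetypes code.

```python
# Hypothetical module-level cache, standing in for types_map and friends.
types_map = {}

def _default_types():
    """Rebind the global to a fresh dictionary of defaults."""
    global types_map
    types_map = {".txt": "text/plain"}

_default_types()

# A test adds an entry and leaves it behind...
types_map[".zzdemo"] = "application/x-demo"

# ...and a cleanup step restores the defaults by calling the function again.
_default_types()
```

Plain top-level assignments would only run once at import time, so there
would be nothing to call for the reset.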


From brett at  Sat Aug  1 00:52:17 2009
From: brett at (Brett Cannon)
Date: Fri, 31 Jul 2009 15:52:17 -0700
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jul 31, 2009 at 15:38, Jacob Rus <jacobolus at> wrote:

> Brett Cannon wrote:
> > Jacob Rus wrote:
> >>  * It defines __all__: I didn't even realize __all__ could be used
> >>    for single-file modules (w/o submodules), but it definitely
> >>    shouldn't be here.
> >
> > __all__ is used to control what a module exports when used in an
> > import *, nothing more. Thus its use in a module compared to a package
> > is completely legitimate.
> >
> >> This specific __all__ oddly does not include
> >>    all of the documented variables and functions in the mimetypes
> >>    class. It's not clear why someone calling import * here wouldn't
> >>    want the bits not included.
> >
> > If something is documented but not listed in __all__ that is a bug.
> In this case, everything in the module is documented, including parts
> that should be private, but only a small number are in __all__.  My
> recommendation would be to make those private parts be _ variables and
> remove them from the docs (using them has no legitimate use cases I
> can see), and rip out __all__.

Well, if the module had stuff that did not lead with an underscore then you
can't remove it. You can deprecate it under the old name and rename it with
an underscore, but backwards-compatibility says someone out there is using
those functions so you can't just batch rename them w/o the proper warning.
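
The deprecate-then-rename route can be sketched as follows. The function
names are hypothetical and the warning text is only an example; the idea is
that the old public name stays importable but warns before delegating to the
underscore-prefixed implementation.

```python
import warnings

def _read_types(path):
    """Hypothetical private implementation under its new underscore name."""
    return {"path": path}

def read_types(path):
    """Old public name, kept as a thin wrapper that warns before delegating."""
    warnings.warn(
        "read_types() is deprecated and will become private",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return _read_types(path)

# Call the old name while recording any warnings it emits.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = read_types("mime.types")
```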

> >>  * It creates a _default_mime_types() function which declares a
> >>    bunch of global variables, and then immediately calls
> >>    _default_mime_types() below the definition. There is literally
> >>    no difference in result between this and just putting those
> >>    variables at the top level of the file, so I have no idea why
> >>    this function exists, except to make the code more confusing.
> >
> > It could potentially be used for testing, but that's a guess.
> Here's an abridged version of this function. I don't think there's any
> reason for this that I can see.
>    def _default_mime_types():
>        global suffix_map
>        global encodings_map
>        global types_map
>        global common_types
>        suffix_map = {
>            '.tgz': '.tar.gz', #...
>            }
>        encodings_map = {
>            '.gz': 'gzip', #...
>            }
>        types_map = {
>            '.a'      : 'application/octet-stream', #...
>            }
>        common_types = {
>            '.jpg' : 'image/jpg', #...
>            }
>    _default_mime_types()

As R. David pointed out, it is being used by regrtest to clean up after
running the test suite.

> > Probably came from someone who is very OO happy. Not everyone comes to
> > Python ready to embrace its procedural or slightly functional facets.
> Yes, it seems so to me too.
> > So the problem of changing fundamentally how the code works, even for a
> > cleanup, is that it will break someone's code out there because they
> > depended on the module's crazy way of doing things. Now if they are
> > cheating and looking at things that are meant to be hidden you might be
> > able to clean things up, but if the semantics are exposed to the user,
> > then there is not much we can do w/o breaking someone's code.
> The problem is that the semantics as documented are really ambiguous,
> and what I would consider the reasonable interpretation is different
> from what the code actually does. So anyone using this code naively is
> going to run into trouble, and anyone relying on how the code actually
> works is going behind the back of the docs, but they sort of have to
> in order to use much of the functionality of the module. I agree this
> puts us in a tricky spot.

Well, perhaps the docs can be updated to match the code where cleanup would
change the semantics.

> > Honestly, if the code is as bad as it seems -- including its API --, the
> > best bet would be to come up with a new module for handling MIME types
> > from scratch, put it up on the Cheeseshop/PyPI, and get the community
> > behind it. If the community picks it up as the de-facto replacement for
> > mimetypes and the code has settled we can then talk about adding it to
> > the standard library and begin deprecating mimetypes.
> > And thanks for being willing to volunteer to fix this.
> Okay.  Well I'd still like to hear a bit about what people really need
> before trying to make a new API. I'm not such an experienced API
> designer, and I haven't really plumbed the depths of mimetypes use
> cases (though it seems to me like quite a simple module of not more
> than 100 lines of code or so would suffice).

I'm sure you can get help from the community with any of this.

> At the very least, I
> think some changes can be made to this code without altering its basic
> function, which would clean up the actual mime types it returns,
> comment the exceptions to Apache and explain why they're there, and
> make the code flow understandable to someone reading the code.

That all sounds reasonable.


From jacobolus at  Sat Aug  1 01:07:34 2009
From: jacobolus at (Jacob Rus)
Date: Fri, 31 Jul 2009 16:07:34 -0700
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

Brett Cannon wrote:
>>>>  * It creates a _default_mime_types() function which declares a
>>>>    bunch of global variables, and then immediately calls
>>>>    _default_mime_types() below the definition. There is literally
>>>>    no difference in result between this and just putting those
>>>>    variables at the top level of the file, so I have no idea why
>>>>    this function exists, except to make the code more confusing.
>>> It could potentially be used for testing, but that's a guess.
>> Here's an abridged version of this function. I don't think there's any
>> reason for this that I can see.
>>    def _default_mime_types():
>>        global suffix_map
>>        global encodings_map
>>        global types_map
>>        global common_types
>>        suffix_map = {
>>            '.tgz': '.tar.gz', #...
>>            }
>>        encodings_map = {
>>            '.gz': 'gzip', #...
>>            }
>>        types_map = {
>>            '.a'      : 'application/octet-stream', #...
>>            }
>>        common_types = {
>>            '.jpg' : 'image/jpg', #...
>>            }
>>    _default_mime_types()
> As R. David pointed out, it is being used by regrtest to clean up after
> running the test suite.

Yeah, basically the issue is that the default mime types should be
separate objects from the final set after apache's files have been
parsed and custom additions have been made. If these ones at the top
level are renamed and not modified after creation, if new objects with
all the updated stuff are put at these names, and if the test code is
changed to instead reset the ones at these names based on the default
objects, I think that will maybe fix things.  I'll try to write some
potential patches in the next day or two and submit them here for

>> The problem is that the semantics as documented are really ambiguous,
>> and what I would consider the reasonable interpretation is different
>> from what the code actually does. So anyone using this code naively is
>> going to run into trouble, and anyone relying on how the code actually
>> works is going behind the back of the docs, but they sort of have to
>> in order to use much of the functionality of the module. I agree this
>> puts us in a tricky spot.
> Well, perhaps the docs can be updated to match the code where cleanup would
> change the semantics.

I think that would make the docs extremely confusing, and I'm not even
sure it would be possible. The current semantics are vaguely okay if
an API consumer sticks to straight-forward use cases, such as any
which don't break when the current docs are followed (anything
complicated is going to break unless the code is read a few times),
and assuming such uses it would be possible to swap out most of the
implementation for something relatively straight-forward. But if any
of the edges are pushed, the semantics quickly turn insane, to the
point I'm not sure they're document-able. Anyone expecting the code to
work that way is going to have a buggy program anyway, so I'm not sure
it makes sense to bend over backwards leaving the particular set of
bugs unchanged.

Jacob Rus

From jacobolus at  Sat Aug  1 00:38:32 2009
From: jacobolus at (Jacob Rus)
Date: Fri, 31 Jul 2009 15:38:32 -0700
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

Brett Cannon wrote:
> Jacob Rus wrote:
>>  * It defines __all__: I didn't even realize __all__ could be used
>>    for single-file modules (w/o submodules), but it definitely
>>    shouldn't be here.
> __all__ is used to control what a module exports when used in an import *,
> nothing more. Thus its use in a module compared to a package is completely
> legitimate.
>> This specific __all__ oddly does not include
>>    all of the documented variables and functions in the mimetypes
>>    class. It's not clear why someone calling import * here wouldn't
>>    want the bits not included.
> If something is documented but not listed in __all__ that is a bug.

In this case, everything in the module is documented, including parts
that should be private, but only a small number are in __all__.  My
recommendation would be to make those private parts be _ variables and
remove them from the docs (using them has no legitimate use cases I
can see), and rip out __all__.

>>  * It creates a _default_mime_types() function which declares a
>>    bunch of global variables, and then immediately calls
>>    _default_mime_types() below the definition. There is literally
>>    no difference in result between this and just putting those
>>    variables at the top level of the file, so I have no idea why
>>    this function exists, except to make the code more confusing.
> It could potentially be used for testing, but that's a guess.

Here's an abridged version of this function. I don't think there's any
reason for this that I can see.

    def _default_mime_types():
        global suffix_map
        global encodings_map
        global types_map
        global common_types

        suffix_map = {
            '.tgz': '.tar.gz', #...
            }
        encodings_map = {
            '.gz': 'gzip', #...
            }
        types_map = {
            '.a'      : 'application/octet-stream', #...
            }
        common_types = {
            '.jpg' : 'image/jpg', #...
            }

    _default_mime_types()

> Probably came from someone who is very OO happy. Not everyone comes to
> Python ready to embrace its procedural or slightly functional facets.

Yes, it seems so to me too.

> So the problem of changing fundamentally how the code works, even for a
> cleanup, is that it will break someone's code out there because they
> depended on the module's crazy way of doing things. Now if they are cheating
> and looking at things that are meant to be hidden you might be able to clean
> things up, but if the semantics are exposed to the user, then there is not
> much we can do w/o breaking someone's code.

The problem is that the semantics as documented are really ambiguous,
and what I would consider the reasonable interpretation is different
from what the code actually does. So anyone using this code naively is
going to run into trouble, and anyone relying on how the code actually
works is going behind the back of the docs, but they sort of have to
in order to use much of the functionality of the module. I agree this
puts us in a tricky spot.

> Honestly, if the code is as bad as it seems -- including its API --, the
> best bet would be to come up with a new module for handling MIME types from
> scratch, put it up on the Cheeseshop/PyPI, and get the community behind it.
> If the community picks it up as the de-facto replacement for mimetypes and
> the code has settled we can then talk about adding it to the standard
> library and begin deprecating mimetypes.
> And thanks for willing to volunteer to fix this.

Okay.  Well I'd still like to hear a bit about what people really need
before trying to make a new API. I'm not such an experienced API
designer, and I haven't really plumbed the depths of mimetypes use
cases (though it seems to me like quite a simple module of not more
than 100 lines of code or so would suffice). At the very least, I
think some changes can be made to this code without altering its basic
function, which would clean up the actual mime types it returns,
comment the exceptions to Apache and explain why they're there, and
make the code flow understandable to someone reading the code.

Jacob Rus

From jacobolus at  Sat Aug  1 04:53:02 2009
From: jacobolus at (Jacob Rus)
Date: Sat, 1 Aug 2009 02:53:02 +0000 (UTC)
Subject: [Python-Dev]
References: <>
Message-ID: <>

Andrew McNabb wrote:
> Jacob Rus wrote:
>>   * The operation is crazy: It defines a MimeTypes class which 
>>     actually stores the type mappings, but this class is designed to 
>>     be a singleton. The way that such a design is enforced is 
>>     through the use of the module-global 'init' function, which 
>>     makes an instance of the class, and then maps all of the 
>>     functions in the module global namespace to instance methods. 
>>     But confusingly, all such functions are also defined 
>>     independently of the init function, with definitions such as:
>>         def guess_type(url, strict=True):
>>             if not inited:
>>                 init()
>>             return guess_type(url, strict)
> I can't speak for any of your other complaints, but I know that this
> weird init stuff is fixed in trunk.

Actually, this fix changes the semantics of the code quite
substantially (not in any way that is incompatible with the
extremely vague documentation, but in a way that might break any
code that relies on the Python <=2.6 behavior). If such a change is
okay, then we can do quite a bit of implementation change under
these new semantics.

Jacob Rus

From tjreedy at  Sat Aug  1 05:03:27 2009
From: tjreedy at (Terry Reedy)
Date: Fri, 31 Jul 2009 23:03:27 -0400
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>	<>
Message-ID: <h50b9t$be0$>

Jacob Rus wrote:

> Okay.  Well I'd still like to hear a bit about what people really need
> before trying to make a new API. 

Try asking some specific question on python-list.
"How do you use the stdlib mimetypes module?"

From jafo at  Sun Aug  2 02:00:51 2009
From: jafo at (Sean Reifschneider)
Date: Sat, 1 Aug 2009 18:00:51 -0600
Subject: [Python-Dev] REVIEW: PyArg_ParseTuple with "s" format and NUL:
	Bogus	TypeError detail string.
In-Reply-To: <>
References: <>
Message-ID: <>

"make test" in both python and py3k trunks were happy with this change, so
I've documented the issue in Issue #6624 and committed it in 
74277 (2.x) and 74278 (3.x).

 "The only thing more expensive than hiring a professional is hiring
 an amateur."  -- Red Adair,  Oil Well Fire-Fighter
Sean Reifschneider, Member of Technical Staff <jafo at>, ltd. - Linux Consulting since 1995: Ask me about High Availability

From vincent.legoll at  Sun Aug  2 00:40:06 2009
From: vincent.legoll at (Vincent Legoll)
Date: Sun, 2 Aug 2009 00:40:06 +0200
Subject: [Python-Dev] pylinting the stdlib
Message-ID: <>


I've fed parts of the stdlib to pylint and after some filtering
there appear to be some things that look strange. I've
filed a few bugs in the tracker for them.

6623 Lib/ netrc class parsing problem
6622 [RFC] wrong variable used in Lib/
6621 [RFC] Remove leftover use of Carbon module from Lib/
6620 Variable may be used before first being assigned to in Lib/
6619 Remove duplicated function in Lib/

Is this useless and taking reviewers' time for nothing?

Please advise; if this is deemed useful, I'll continue.

Vincent Legoll

From jacobolus at  Sun Aug  2 06:03:12 2009
From: jacobolus at (Jacob Rus)
Date: Sat, 1 Aug 2009 21:03:12 -0700
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

Brett Cannon wrote:
> Jacob Rus wrote:
>> At the very least, I
>> think some changes can be made to this code without altering its basic
>> function, which would clean up the actual mime types it returns,
>> comment the exceptions to Apache and explain why they're there, and
>> make the code flow understandable to someone reading the code.
> That all sounds reasonable.

Okay, as a start, I did a simple code cleanup that I think fixes some
potential bugs (any code using its own instance of the MimeTypes class
should now be insulated from other same-process users of the module),
chops out 80 or 90 lines, removes some redundant code paths, clarifies
some of the micro level behavior of some chunks of code, adds a bit
more to the docstring at the top of the file, and makes the program
flow somewhat clearer -- *without* changing the semantics of the module
or its included list of MIME types.
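
As a rough illustration of the insulation meant here, the snippet below
uses the mimetypes module as it ships with current Python 3 releases
(not the patched file under discussion). The extension ".xmpl" and the
type "application/x-example" are made-up values.

```python
import mimetypes

# Two private registries; each starts from its own copy of the
# built-in defaults.
mine = mimetypes.MimeTypes()
theirs = mimetypes.MimeTypes()

# Made-up type and extension, registered on one instance only.
mine.add_type("application/x-example", ".xmpl")

mine_guess = mine.guess_type("report.xmpl")
theirs_guess = theirs.guess_type("report.xmpl")
```

Only the instance that registered the type knows about it; the other
one returns (None, None) for the same name.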

Here's a diff:

And here's the whole file:

This change does require any tests that previously called
_default_mime_types() to instead call init().

Any thoughts?
Jacob Rus

From jacobolus at  Sun Aug  2 06:58:38 2009
From: jacobolus at (Jacob Rus)
Date: Sat, 1 Aug 2009 21:58:38 -0700
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

Jacob Rus wrote:
> Here's a diff:
> And here's the whole file:

Slightly better:

From jacobolus at  Sun Aug  2 08:37:18 2009
From: jacobolus at (Jacob Rus)
Date: Sat, 1 Aug 2009 23:37:18 -0700
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

Jacob Rus wrote:
> Brett Cannon wrote:
>> Jacob Rus wrote:
>>> At the very least, I
>>> think some changes can be made to this code without altering its basic
>>> function, which would clean up the actual mime types it returns,
>>> comment the exceptions to Apache and explain why they're there, and
>>> make the code flow understandable to someone reading the code.
>> That all sounds reasonable.
> Okay, as a start, I did a simple code cleanup that I think fixes some
> potential bugs (any code using its own instance of the MimeTypes class
> should now be insulated from other same-process users of the module),
> chops out 80 or 90 lines, removes some redundant code paths, clarifies
> some of the micro level behavior of some chunks of code, adds a bit
> more to the docstring at the top of the file, and makes the program
> flow somewhat clearer -- *without* changing the semantics of the module
> or its included list of MIME types.

Here is a somewhat more substantively changed version. This one does
away with the 'inited' flag and the 'init' function, which might be
impossible given that they're documented (though I would be extremely
surprised if anyone calls them in third-party code), and makes the
behavior of the code much clearer, I think, by making it very obvious
how the singleton instance is actually working.

Additionally, this version brings the lazy loading of Apache
mime.types files to every MimeTypes instance, and makes the
read_mime_types() function behave as expected (only getting the
mapping from an apache mime.types file rather than including some
extra types as the current code does).
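
To make "only getting the mapping from an apache mime.types file"
concrete, here is a minimal sketch of such a parser. It is not the code
from the patch; parse_mime_types is a hypothetical helper, and
lower-casing the extensions is a choice made for the example.

```python
def parse_mime_types(lines):
    """Return {'.ext': 'type/subtype'} from Apache mime.types style lines."""
    mapping = {}
    for line in lines:
        # Drop comments, then split into the type and its extensions.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, comment, or a type with no extensions
        mime_type, extensions = fields[0], fields[1:]
        for ext in extensions:
            mapping["." + ext.lower()] = mime_type
    return mapping


# A few sample lines in the mime.types layout (made up for the example).
sample = [
    "# MIME type                  Extensions",
    "text/html                    html htm",
    "image/jpeg                   jpeg jpg jpe",
    "application/x-no-extension",
]
table = parse_mime_types(sample)
```

Nothing beyond what is in the input ends up in the result, which is the
behavior I would expect from a function with that name.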

In this version, tests would want to call the _init_singleton()
function to reset to defaults.

To reiterate: this should still behave identically to the current
module in all reasonable conditions. I still haven't made any changes
to the set of MIME types included in the file, or the behavior of the
module. Some such changes should be made as well, but the changes so
far should be relatively uncontroversial, I hope.

Jacob Rus

From fuzzyman at  Sun Aug  2 12:53:09 2009
From: fuzzyman at (Michael Foord)
Date: Sun, 02 Aug 2009 11:53:09 +0100
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>	<>	<>	<>	<>
Message-ID: <>

Jacob Rus wrote:
> Jacob Rus wrote:
>> Brett Cannon wrote:
>>> Jacob Rus wrote:
>>>> At the very least, I
>>>> think some changes can be made to this code without altering its basic
>>>> function, which would clean up the actual mime types it returns,
>>>> comment the exceptions to Apache and explain why they're there, and
>>>> make the code flow understandable to someone reading the code.
>>> That all sounds reasonable.
>> Okay, as a start, I did a simple code cleanup that I think fixes some
>> potential bugs (any code using its own instance of the MimeTypes class
>> should now be insulated from other same-process users of the module),
>> chops out 80 or 90 lines, removes some redundant code paths, clarifies
>> some of the micro level behavior of some chunks of code, adds a bit
>> more to the docstring at the top of the file, and makes the program
>> flow somewhat clearer -- *without* changing the semantics of the module
>> or its included list of MIME types.
> Here is a somewhat more substantively changed version. This one does
> away with the 'inited' flag and the 'init' function, which might be
> impossible given that they're documented (though I would be extremely
> surprised if anyone calls them in third-party code), and makes the
> behavior of the code much clearer, I think, by making it very obvious
> how the singleton instance is actually working.
> Additionally, this version brings the lazy loading of Apache
> mime.types files to every MimeTypes instance, and makes the
> read_mime_types() function behave as expected (only getting the
> mapping from an apache mime.types file rather than including some
> extra types as the current code does).
> In this version, tests would want to call the _init_singleton()
> function to reset to defaults.
> To reiterate: this should still behave identically to the current
> module in all reasonable conditions. I still haven't made any changes
> to the set of MIME types included in the file, or the behavior of the
> module. Some such changes should be made as well, but the changes so
> far should be relatively uncontroversial, I hope.

Please post the patches to the Python bug tracker:


Michael Foord

> Cheers,
> Jacob Rus


From p.f.moore at  Sun Aug  2 13:45:49 2009
From: p.f.moore at (Paul Moore)
Date: Sun, 2 Aug 2009 12:45:49 +0100
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/2 Michael Foord <fuzzyman at>:
>> In this version, tests would want to call the _init_singleton()
>> function to reset to defaults.
> Please post the patches to the Python bug tracker:
> Thanks

The patch you post should also patch the test suite to use your
replacement initialisation function where needed (if you didn't
already do that).


From stargaming at  Sun Aug  2 13:14:05 2009
From: stargaming at (Robert Lehmann)
Date: Sun, 2 Aug 2009 11:14:05 +0000 (UTC)
Subject: [Python-Dev] standard library mimetypes module
	pathologically	broken?
References: <>
Message-ID: <h53sdt$blt$>

On Sat, 01 Aug 2009 23:37:18 -0700, Jacob Rus wrote:

> Here is a somewhat more substantively changed version. This one does
> away with the 'inited' flag and the 'init' function, which might be
> impossible given that they're documented (though I would be extremely
> surprised if anyone calls them in third-party code)

There seem to be quite a bunch of high-profile third-party modules 
relying on this interface, eg. Zope, Plone, TurboGears, and CherryPy. See for a 
more thorough listing.

Given that most of them aren't ported to Python 3 yet, I guess, changing 
the semantics in 3.x seems not-too-bad to me.


Robert "Stargaming" Lehmann

From python at  Sun Aug  2 17:54:22 2009
From: python at (MRAB)
Date: Sun, 02 Aug 2009 16:54:22 +0100
Subject: [Python-Dev] [regex] memory leak
In-Reply-To: <>
References: <>
Message-ID: <>

John Machin wrote:
> Hi Matthew,
> Your post in about your re rewrite didn't mention where to report 
> bugs etc so I dug this address out of Google Groups ...
> Environment: Python 2.6.2, Windows XP SP3, your latest (29 July) regex 
> from the Python bugtracker.
> Problem is repeated calls of e.g. -- 
> Task Manager performance panel shows increasing memory usage with regex 
> but not with re. It appears to be cumulative i.e. changing to another 
> pattern or text doesn't release memory.
> Example:
> 8<--
> import sys
> import time
> if sys.platform == 'win32':
>     timer = time.clock
> else:
>     timer = time.time
> module = __import__(sys.argv[1])
> count = int(sys.argv[2])
> pattern = sys.argv[3]
> expected = sys.argv[4]
> text = 80 * '~' + 'qwerty'
> rx = module.compile(pattern)
> t0 = timer()
> for i in xrange(count):
>     assert == expected
> t1 = timer()
> print "%d iterations in %.6f seconds" % (count, t1 - t0)
> 8<---
> Here are the results of running this (plus observed difference between 
> peak memory usage and base memory usage):
> dos-prompt>\python26\python regex 1000000 "~" "~"
> 1000000 iterations in 3.811500 seconds [60 Mb]
> dos-prompt>\python26\python regex 2000000 "~" "~"
> 2000000 iterations in 7.581335 seconds [128 Mb]
> dos-prompt>\python26\python re 2000000 "~" "~"
> 2000000 iterations in 2.549738 seconds [3 Mb]
> This happens on a variety of patterns: "w", "wert", "[a-z]+", "[a-z]+t", 
> ...
Thanks for that, John. I should've kept an eye on the Task Manager!
:-) Now fixed.
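
For anyone who wants to repeat the timing part of the test on a current
Python, a rough Python 3 sketch of the loop is below. The expression
inside the assert was lost from the archived copy of John's script, so
rx.search(text).group(0) is an assumption, and bench is a made-up name.
Memory use still has to be watched separately (e.g. in Task Manager).

```python
import re
import time


def bench(module, count, pattern, expected):
    """Run `count` searches with `module` and return the elapsed seconds."""
    text = 80 * "~" + "qwerty"
    rx = module.compile(pattern)
    t0 = time.perf_counter()
    for _ in range(count):
        # Assumed call: the original expression is missing from the archive.
        assert rx.search(text).group(0) == expected
    return time.perf_counter() - t0


# Small iteration count so the example finishes quickly.
elapsed = bench(re, 10000, "[a-z]+", "qwerty")
```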

It's surprising how much time and effort is needed just to manage the

From dickinsm at  Sun Aug  2 18:20:36 2009
From: dickinsm at (Mark Dickinson)
Date: Sun, 2 Aug 2009 17:20:36 +0100
Subject: [Python-Dev] pylinting the stdlib
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Aug 1, 2009 at 11:40 PM, Vincent Legoll<vincent.legoll at> wrote:
> Hello,
> I've fed parts of the stdlib to pylint and after some filtering
> there appears to be some things that looks strange, I've
> filled a few bugs to the tracker for them.
> <buglist snipped>
> Is this useless and taking reviewer's time for nothing ?
> Please advise, if this is deemed useful, I'll continue further

I think this is valuable work---please do continue!

Just out of interest, how many false positives did you have
to filter out in finding the 5 cases above?


From jimjjewett at  Sun Aug  2 19:47:30 2009
From: jimjjewett at (Jim Jewett)
Date: Sun, 2 Aug 2009 13:47:30 -0400
Subject: [Python-Dev] standard library mimetypes module pathologically
Message-ID: <>

[It may be worth creating a patch; I think most of these comments
would be better on the bug-tracker.]

(1)  In a few cases, it looked like you were changing parameter names
between "files" and "filenames".  This might break code that was
calling it with keyword arguments -- as I typically would for this
type of function.

(1a)  If you are going to change the .sig, you might as well do it
right, and make the default be "knownfiles" rather than the empty
tuple.

(2)  The comment about why inited was set true at the beginning of the
function instead of the end should probably be kept, or at least
reworded.

(3) Default values:

(3a) Why the list of known files going back to Apache 1.2, in that
order?  Is there any risk in using too *new* of a MimeTypes file?

I would assume that the goal is to pick up whatever changes the user
has made locally, but in that case, it still makes sense to have the
newest file be the last one read, in case Apache has made bugfixes.

(3b)  Also, this would improve cross-platform consistency; if I read
that correctly, the Apache files will override the python defaults on
unix or a mac, but not on windows.  That will change the results on
the majority of items in _common_types.  (application vs text, whether
to put an x- in front of the word pict.)

(3c)  rtf is listed in non-standard, but does define it.  (Though
whether to guess application vs text is not defined, and python
chooses differently from apache.)

(3d)  jpg is listed as non-standard.  It turns out that this is just
for the inverse mapping, where image/jpg is non-standard (for
image/jpeg) but that is worth a comment.  (see #5)

(3e)  In _types_map, the lines marked duplicates are duplicate keys,
not duplicate values; it would be more clear to also comment out the
(first) line itself, instead of just marking it a duplicate.  (Or
better yet, to mention that it is just being added for the inverse
mapping, if that is the case.)

(4)  Why bother to lazyinit?    Is there any sane usecase for a
MimeTypes that hasn't been inited?

I see value in not reading the default files, but none in not reading
at least the files that were asked for.  I could see value in only
partial initialization if there were several long steps, but right
now, initialization is all-or-nothing.

If the thing is useless without an init, then it makes sense to just
get it done immediately and skip the later checks; anyone who could
have actually saved time should just remove the import.


From jacobolus at  Sun Aug  2 20:56:29 2009
From: jacobolus at (Jacob Rus)
Date: Sun, 2 Aug 2009 11:56:29 -0700
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

Jim Jewett wrote:
> [It may be worth creating a patch; I think most of these comments
> would be better on the bug-tracker.]

I'm going to do that shortly.

> (1)  In a few cases, it looked like you were changing parameter names
> between "files" and "filenames".  This might break code that was
> calling it with keyword arguments -- as I typically would for this
> type of function.

Sorry, that was a mistake.
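
To spell out why the rename matters, here is a small sketch. The two
functions are hypothetical stand-ins, not the real module API; only the
parameter name differs between them.

```python
def read_types_old(files=()):
    """Hypothetical function using the original parameter name."""
    return list(files)


def read_types_renamed(filenames=()):
    """The same function after its parameter has been renamed."""
    return list(filenames)


# A caller that passes the argument by keyword, using the old name.
call = {"files": ["/etc/mime.types"]}
old_result = read_types_old(**call)

try:
    read_types_renamed(**call)
    renamed_error = None
except TypeError as exc:
    # The old keyword is no longer accepted after the rename.
    renamed_error = exc
```

Positional callers would not notice the difference, which is why this
kind of breakage is easy to miss in testing.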

> (1a)  If you are going to change the .sig, you might as well do it
> right, and make the default be "knownfiles" rather than the empty
> tuple.

Seems reasonable.

> (2)  The comment about why inited was set true at the beginning of the
> function instead of the end should probably be kept, or at least
> reworded.
> (3) Default values:
> (3a) Why the list of known files going back to Apache 1.2, in that
> order?  Is there any risk in using too *new* of a MimeTypes file?
> I would assume that the goal is to pick up whatever changes the user
> has made locally, but in that case, it still makes sense to have the
> newest file be the last one read, in case Apache has made bugfixes.

I did not change this in my patch, but I completely agree. Indeed, I
think it makes more sense to grab the newest Apache mime.types and
just include them with the standard library, either as an in-code
python object, or as a mime.types file to be parsed.
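
The "newest file read last" idea amounts to letting later mappings
override earlier ones. A minimal sketch, with a made-up helper name and
made-up tables standing in for an older and a newer mime.types file:

```python
def merge_type_maps(maps):
    """Combine mappings in order; later maps override earlier ones."""
    merged = {}
    for mapping in maps:
        merged.update(mapping)
    return merged


# Hypothetical contents of an older and a newer mime.types file.
older = {".rtf": "text/rtf", ".pict": "image/pict"}
newer = {".rtf": "application/rtf"}

# Reading the newer table last means its entries win on conflict.
merged = merge_type_maps([older, newer])
```

Entries that only exist in the older table survive, so local additions
are kept while conflicting entries take the newer value.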

> (3b)  Also, this would improve cross-platform consistency; if I read
> that correctly, the Apache files will override the python defaults on
> unix or a mac, but not on windows.  That will change the results on
> the majority of items in _common_types.  (application vs text, whether
> to put an x- in front of the word pict.)

Quite possibly true. It actually seems

> (3c)  rtf is listed in non-standard, but
> does define it.  (Though
> whether to guess application vs text is not defined, and python
> chooses differently from apache.)
> (3d)  jpg is listed as non-standard.  It turns out that this is just
> for the inverse mapping, where image/jpg is non-standard (for
> image/jpeg) but that is worth a comment.  (see #5)
> (3e)  In _types_map, the lines marked duplicates are duplicate keys,
> not duplicate values; it would be more clear to also comment out the
> (first) line itself, instead of just marking it a duplicate.  (Or
> better yet, to mention that it is just being added for the inverse
> mapping, if that is the case.)

I completely agree that this whole section should be considered
carefully. It's just that any changes might have more impact on backwards
compatibility than the code flow changes I made, so I thought they
could be in a separate patch.

> (4)  Why bother to lazyinit?    Is there any sane usecase for a
> MimeTypes that hasn't been inited?

Only because the original was written that way, back in 1997 or
whatever. I don't think there's necessarily any need for it these
days: even reading the default files should be blazingly fast, unless
the disk is otherwise thrashing: each is about a 37k file, and there
are at most going to be 3 or 4 of them installed on one machine for
different versions of Apache.

Jacob Rus

From jacobolus at  Sun Aug  2 22:17:45 2009
From: jacobolus at (Jacob Rus)
Date: Sun, 2 Aug 2009 13:17:45 -0700
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

Robert Lehmann wrote:
> Jacob Rus wrote:
>> Here is a somewhat more substantively changed version. This one does
>> away with the 'inited' flag and the 'init' function, which might be
>> impossible given that they're documented (though I would be extremely
>> surprised if anyone calls them in third-party code)
> [snip]
> There seem to be quite a bunch of high-profile third-party modules
> relying on this interface, eg. Zope, Plone, TurboGears, and CherryPy. See
> for a
> more thorough listing.
> Given that most of them aren't ported to Python 3 yet, I guess, changing
> the semantics in 3.x seems not-too-bad to me.

Ooh, okay.  Well I guess we can't get rid of those then!

Michael Foord wrote:
> Please post the patches to the Python bug tracker:

I made a new issue on the bug tracker,
<>, and added a new patch which should
hopefully be fairly reasonable.  I still haven't addressed the issue
of which MIME types should be included by default, and how precisely
the logic should work for setting those up. But again, hopefully this
at least makes it clear what the code is trying to do, so that it's
relatively readable for someone trying to use the module. (For
instance, so they'll be warned off of using init() and breaking
each-other's code)
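
As a sketch of how one caller of init() can step on another, the
snippet below uses the mimetypes module from a current Python 3 release,
where calling init() with no arguments is documented to rebuild the
shared data to its default state. The type and extension are made up.

```python
import mimetypes

# One component registers a made-up type on the shared module-level state.
mimetypes.add_type("application/x-example", ".xmpl")
before = mimetypes.guess_type("report.xmpl")

# Another component re-initialises the shared database with no arguments,
# which rebuilds it from the defaults and the system files.
mimetypes.init()
after = mimetypes.guess_type("report.xmpl")
```

After the second call the first component's registration is gone, with
no error or warning for either party.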

Paul Moore wrote:
> The patch you post should also patch the test suite to use your
> replacement initialisation function where needed (if you didn't
> already do that).

Done. The tests still pass, though to be honest this test suite isn't
really testing any edge cases.

Jacob Rus

From glyph at  Mon Aug  3 00:36:27 2009
From: glyph at (Glyph Lefkowitz)
Date: Sun, 2 Aug 2009 18:36:27 -0400
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Aug 2, 2009 at 4:17 PM, Jacob Rus <jacobolus at> wrote:

> Robert Lehmann wrote:
> > Jacob Rus wrote:
> >> Here is a somewhat more substantively changed version. This one does
> >> away with the 'inited' flag and the 'init' function, which might be
> >> impossible given that they're documented (though I would be extremely
> >> surprised if anyone calls them in third-party code)
> > [snip]
> >
> > There seem to be quite a bunch of high-profile third-party modules
> > relying on this interface, eg. Zope, Plone, TurboGears, and CherryPy. See
> > for a
> > more thorough listing.
> >
> > Given that most of them aren't ported to Python 3 yet, I guess, changing
> > the semantics in 3.x seems not-too-bad to me.

No, it's bad.  If I may quote Guido:

> So, once more for emphasis: *Don't change your APIs at the same time as
> porting to Py3k!*

Please follow this policy as much as possible in the standard library; the
language transition is going to be hard enough.

Put a different way: please don't change the library unless you're
*also* going to write a 2to3 fixer that somehow updates all calling
code, too.

> Ooh, okay.  Well I guess we can't get rid of those then!

Indeed not.

From dirkjan at  Mon Aug  3 13:53:06 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Mon, 3 Aug 2009 13:53:06 +0200
Subject: [Python-Dev] PEP 385: updating the PEP
Message-ID: <>

The diff below should reflect changes from the discussion we had last
time. Please review. (Some comments may be more appropriate for the
other threads I just kicked off.)



Index: pep-0385.txt
--- pep-0385.txt        (revision 74294)
+++ pep-0385.txt        (revision 74296)
@@ -59,27 +59,25 @@
 often has somewhat unintuitive results for people (though this has been
 getting better in recent versions of Mercurial).

-I'm still a bit on the fence about whether Python should adopt cloned
-branches and named branches. Since it usually makes more sense to tag releases
-on the maintenance branch, for example, mainline history would not contain
-release tags if we used cloned branches. Also, Mercurial 1.2 and 1.3 have the
-necessary tools to make named branches less painful (because they can be
-properly closed and closed heads are no longer considered in relevant cases).
+The current proposal is to use named branches for release branches and adopt
+cloned branches for feature branches, with one exception to this rule: the 3.x
+branches will be kept in separate clones from the 2.x branches. I think this
+provides an optimal hybrid approach for Python's uses of branching.

-A disadvantage might be that the used clones will be a good bit larger (since
-they essentially contain all other branches as well). This can me mitigated by
-keeping non-release (feature) branches in separate clones. Also note that it's
-still possible to clone a single named branch from a combined clone, by
-specifying the branch as in hg clone
-Keeping the py3k history in a separate clone problably also makes sense.
+Differences between named branches and cloned branches:

-XXX To do: size comparison for selected separation scenarios.
+* Tags in a different (maintenance) clone aren't available in the local clone
+* Clones with named branches will be larger, since they contain more data

+(The Mercurial book discourages the use of named branches, but it is, in this
+respect, somewhat outdated. Named branches have gotten much easier to use
+since that comment was written, due to improvements in hg.)
 Converting branches

 There are quite a lot of branches in SVN's branches directory. I propose to
-clean this up a bit, by employing the following the strategy:
+clean this up a bit, by following this basic strategy:

 * Keep all release (maintenance) branches
 * Discard branches that haven't been touched in 18 months, unless somone
@@ -87,6 +85,21 @@
 * Keep branches that have been touched in the last 18 months, unless someone
   indicates the branch can be deprecated

+There's a `branch map`_ available that shows info about each branch:
+* keep-clone means we'll keep that branch in a separate clone
+* keep-named means we'll keep that branch as a named branch in one of the clones
+* strip means we won't keep that branch
+* streamed-merge means that it got merged by committing several new revisions
+  to the other branch
+* merged-r* means the branch got merged in the named revision
+* merges? means I haven't checked/found out yet whether that branch was ever
+  merged
+* ? means that your input would be even more helpful than for the other items
+* some items have no action yet, feel free to treat that as just '?'
+.. _branch map:
 Converting tags

@@ -95,8 +108,8 @@
 we should keep all release tags, and consider other tags for inclusion based
 on requests from the developer community. I'd like to consider unifying the
 release tag naming scheme to make some things more consistent, if people feel
-that won't create too many problems. For example, Mercurial itself just uses
-'1.2.1' as a tag, where CPython would currently use r121.
+that won't create too many problems. The current proposal is to bring old
+release tags in line with the current practice of release tag naming.

 Author map
@@ -119,17 +132,19 @@
 possible forms of pattern matching. The current Python repository already
 includes a rudimentary .hgignore file to help with using the hg mirrors.

-It might be useful to have the .hgignore be generated automatically from
-svn:ignore properties. This would make sure all historic revisions also have
-useful ignore information (though one could argue ignoring isn't really
-relevant to just checking out an old revision).
+Since the current Python repository already includes a .hgignore file (for use
+with hg mirrors), we'll just use that. Generating full history of the file
+was debated but deemed impractical (because it's relatively hard with fairly
+little gain, since ignoring is less important for older revisions).

 Revlog reordering

-As an optional optimization technique, we should consider trying a reordering
-pass on the revlogs (internal Mercurial files) resulting from the conversion.
-In some cases this results in dramatic decreases in on-disk repository size.
+As an optional optimization technique, I have performed a reordering pass on
+the revlogs (internal Mercurial files) resulting from the conversion. In some
+cases this results in dramatic decreases in on-disk repository size. This
+especially makes sense for the manifest (where it really helps out quite a lot)
+and oft-edited files like NEWS.txt (with an admittedly smaller effect).

 Other repositories
@@ -138,7 +153,14 @@
 converted. What other projects in the repository should be
 converted? Do we want to convert the peps repository? distutils? others?

+There's now an initial stab at converting the Jython repository. The current
+tip of hgsubversion unfortunately fails at some point. Pending investigation.

+Other repositories that would like to be converted to Mercurial can announce
+themselves to me after the main Python migration is done, and I'll take care
+of their needs.

@@ -165,18 +187,34 @@
   lines. Open issue: do we check only the tip after each push, or do we check
   every commit in a changegroup?

-* commit mails: we can leverage the notify extension for this
+* commit mails: we can leverage the notify extension for this. Emails will
+  include diffs for each changeset committed against the repository.

 * buildbots: both the regular and the community build masters must be notified.
   Fortunately buildbot includes support for hg. I've also implemented this for
   Mercurial itself, so I don't expect problems here.

 * check contributors: in the current setup, all changesets bear the username of
-  committers, who must have signed the contributor agreement. In a DVCS, the
-  committers are not necessarily the same people who push, and so we can't
-  check if the committer is a contributor. We could use a hook to check if the
-  committer is a contributor if we keep a list of registered contributors.
+  committers, who must have signed the contributor agreement. We might want to
+  use a hook to check if the committer is a contributor if we keep a list of
+  registered contributors. Then, the hook might warn users that push a group
+  of revisions containing changesets from unknown contributors.

+End-of-line conversions
+There has been some discussion about the lack of end-of-line conversion support
+in Mercurial. While Mercurial comes with a win32text extension that provides
+some basic support for converting end-of-line data on a file-name pattern
+basis, the lack of exclusion (for specifying broad rules with exceptions) and
+the use of hgrc files (which can't be versioned) make it less than ideal.
+I think the primary line of defense for prevention of inappropriate newlines
+should be hooks on the server side which basically turn down any changegroup
+or changeset introducing such data. The use of the win32text extension (which
+can hopefully be improved/extended to support the usage scenarios mentioned
+above) and/or a commit-time hook could be the first line of defense.

@@ -185,7 +223,16 @@
 build a quick extension to augment the URL rev parser so that it can also take
 r[0-9]+ args and come up with the matching hg revision.


+We'll come up with an auto-linking plugin for roundup, which can match a
+changeset identifier (possibly with a branch prefix), and link it to the
+appropriate revision in the hgwebdir instance. Second, the script above (in
+the hgwebdir section) will make sure that old links to revision should continue
+to work (by pointing to the hg changeset that reflects the svn revision).
 After migration

@@ -222,37 +269,32 @@
  .. _wiki:
  .. _parts of the developer FAQ:

-Think first, commit later?
+Proposed workflow

-In recent history, old versions of Python have been maintained by a select
-group of people backporting patches from trunk to release branches. While
-this may not scale so well as the development pace grows, it also runs into
-some problems with the current crop of distributed versioning tools. These
-tools (I believe similar problems would exist for either git, bzr, or hg,
-though some may cope better than others) are based on the idea of a Directed
-Acyclic Graph (or DAG), meaning they keep track of relations of changesets.
+I propose two workflows for the migration of patches between several branches.

-Mercurial itself has a stable branch which is a ''strict'' subset of the
-unstable branch. This means that generally all fixes for the stable branch
-get committed against the tip of the stable branch, then they get merged into
-the unstable branch (which already contains the parent of the new cset). This
-provides a largely frictionless environment for moving changes from stable to
-unstable branches. Mistakes, where a change that should go on stable goes on
-unstable first, do happen, but they're usually easy to fix. That can be done by
-copying the change over to the stable branch, then trivial-merging with
-unstable -- meaning the merge in fact ignores the parent from the stable
+For migration within 2.x or 3.x branches, I propose a patch always gets
+committed to the oldest branch where it applies first. Then, the resulting
+changeset can be merged using hg merge to all newer branches within that
+series (2.x or 3.x). If it does not apply as-is to the newer branch, hg revert
+can be used to easily revert to the new-branch-native head, patch in some
+alternative version of the patch (or none, if it's not applicable), then commit
+the merge. The premise here is that all changesets from an older branch within
+the series are eventually merged to all newer branches within the series.

-This strategy means a little more work for regular committers, because they
-have to think about whether their change should go on stable or unstable; they
-may even have to ask someone else (the RM) before committing. But it also
-relieves a dedicated group of committers of regular backporting duty, in
-addition to making it easier to work with the tool.
+The upshot is that this provides for the most painless merging procedure. The
+downside is that in the general case, people have to think about the oldest
+branch to which the patch should be applied before actually applying it.

-Now would be a good time to consider changing strategies in this regard,
-although it would be relatively easy to switch to such a model later on.
+For migration between 2.x and 3.x branches (which should all be in the same
+direction, though I'm not sure what direction is most appropriate here),
+changesets should be transplanted (not merged) in some other way. The
+transplant extension, import/export and bundle/unbundle work equally well here.

+Choosing this approach allows 3.x not to carry all of the 2.x history-since-it-
+was-branched, meaning the clone is not as big and the merges not as
 The future of Subversion

@@ -281,7 +323,9 @@
 I propose that the revision identifier will be the short version of hg's
 revision hash, for example 'dd3ebf81af43', augmented with '+' (instead of 'M')
 if the working directory from which it was built was modified. This mirrors
-the output of the hg id command, which is intended for this kind of usage.
+the output of the hg id command, which is intended for this kind of usage. The
+sys.subversion value will also be renamed to sys.mercurial to reflect the
+change in VCS.

 For the tag/branch identifier, I propose that hg will check for tags on the
 currently checked out revision, use the tag if there is one ('tip' doesn't

From dirkjan at  Mon Aug  3 12:41:31 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Mon, 3 Aug 2009 12:41:31 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
Message-ID: <>

So, I've been not-working on this, which I feel bad about. Suffice it
to say the day job has required more of my time than usual for the
past few weeks. I want to get back into it, so let's start by
re-raising this issue, which Mark Hammond conveniently summarized

> On 4/07/2009 2:03 PM, Mark Hammond wrote:
>> On 4/07/2009 12:30 PM, Nick Coghlan wrote:
>>> And since Mercurial doesn't even allow us to say "this is a binary file"
>>> the way CVS used to I'm currently not seeing any way for that to happen
>>> except for win32text to be updated to correctly handle wild cards in
>>> combination with negative filters.
>> I agree with your conclusion. My ruminating on this over the last few
>> months leaves me thinking this would involve:
>> * my older 'accepted but then lost' hg patch to allow an explicit 'none'
>> rule for a single file to override wildcards.

This was and still is a good idea. It would be very nice if you could
un-bitrot it and submit it for inclusion into crew-stable (so that it
may land in the next release, which would hopefully be a somewhat near

>> * win32text be enhanced to use a normal versioned file in the root of
>> the repo, much like .hgignore, where a project can maintain project wide
>> rules.

I'm thinking that it should take stuff from .hgeols or whatever and
apply rules from .hg/hgrc after that, so both may be used (and for
backwards compatibility), but it sounds like a good idea in principle.
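
A minimal sketch of that lookup order, assuming a hypothetical rule
format of (glob pattern, mode) pairs. Neither this .hgeols format nor
the function name exists in win32text today; both are made up for
illustration. Rules from the versioned file are consulted first and
rules from .hg/hgrc after them, so the last matching rule wins, and an
explicit 'none' rule can exempt a single file from a wildcard:

```python
from fnmatch import fnmatchcase


def eol_mode(path, repo_rules, local_rules):
    """Return the EOL mode for path; the last matching rule wins."""
    mode = "none"  # no rule matched: leave the file untouched
    for pattern, rule_mode in list(repo_rules) + list(local_rules):
        if fnmatchcase(path, pattern):
            mode = rule_mode
    return mode


# Hypothetical contents of a versioned .hgeols file.
repo_rules = [
    ("*.py", "native"),
    ("*.bat", "crlf"),
    ("Lib/test/crlf_sample.py", "none"),  # explicit override of "*.py"
]
# Hypothetical per-clone rules from .hg/hgrc, applied afterwards.
local_rules = [("*.txt", "native")]

for name in ("Lib/os.py", "Lib/test/crlf_sample.py", "README.txt"):
    print(name, eol_mode(name, repo_rules, local_rules))
```

Whether local rules should really be able to override the versioned
ones is a design choice this sketch simply assumes.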

>> * win32text be enhanced such that all python developers, regardless of
>> platform, are willing to use this extension, even if the majority of
>> files happen to use their native line ending (sauce for the goose is
>> sauce for the gander, and all that...)

I don't think that is necessary, I will elaborate below.

>> * commit hooks be implemented to enforce this - but this should not be
>> necessary if the above was implemented and socially enforced.

You seem to advocate a two-step approach: enforce line endings through
win32text, catch any errors that slipped through in a hook (commit
hook is an optional first line of defense, changegroup hooks on the
server to protect the rest of the world).

I think inverting that approach would be better: have strict hooks on
the server to prevent people from pushing inappropriate EOLs, and
provide help on configuring win32text as an extra help for developers
on Windows who use editors that work better with \r\n. That leaves
people to pick their own weapon of choice against propagation of \r\n
(e.g. better editor, commit hooks, whatever) while still making sure
no inappropriate line endings land in the repositories. It
also seems to fit well with the whole consenting adults thing (but
that might just be me).
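
To make the server-side check concrete, here is a rough sketch of the
decision such a hook would have to make. It deliberately does not use
the Mercurial API: the function names and the path-to-bytes mapping are
assumptions for illustration, and a real hook would have to extract the
changed file contents from the incoming changesets itself.

```python
def find_crlf_files(changed_files, allow_crlf=()):
    """Return sorted paths whose new content contains CRLF and is not exempt.

    changed_files maps a path to the file's new content as bytes.
    allow_crlf lists paths that are allowed to keep \r\n line endings.
    """
    exempt = set(allow_crlf)
    return sorted(
        path
        for path, data in changed_files.items()
        if path not in exempt and b"\r\n" in data
    )


def reject_push(changed_files, allow_crlf=()):
    """Return True when the incoming changes should be refused."""
    return bool(find_crlf_files(changed_files, allow_crlf))


incoming = {
    "Lib/os.py": b"import sys\n",
    "Lib/shutil.py": b"import os\r\n",
    "PCbuild/build.bat": b"@echo off\r\n",
}
print(find_crlf_files(incoming, allow_crlf=["PCbuild/build.bat"]))
```

The same function could back an optional client-side commit hook, which
is why the check is kept separate from the accept/reject decision.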

On Sun, Jul 19, 2009 at 15:27, Mark Hammond<skippy.hammond at> wrote:
> Sorry Dirkjan - I just noticed I didn't CC you on this mail originally.  I'm
> wondering if you have any more thoughts on these EOL issues and if there is
> anything I can do to help?

Taking up the 'none' filter, first, and .hgeols, secondly, in the
win32text extension would be wonderfully helpful, since I don't do
much development on Windows and am therefore not that familiar with
the extension in the first place.



From solipsis at  Mon Aug  3 18:50:02 2009
From: solipsis at (Antoine Pitrou)
Date: Mon, 3 Aug 2009 16:50:02 +0000 (UTC)
Subject: [Python-Dev] PEP 385: updating the PEP
References: <>
Message-ID: <>

Hello Dirkjan,

Dirkjan Ochtman <dirkjan <at>> writes:
> +As an optional optimization technique, I have performed a reordering pass on
> +the revlogs (internal Mercurial files) resulting from the conversion. In some
> +cases this results in dramatic decreases in on-disk repository size.

Can you give size numbers for the two main repositories (2.x and 3.x)?

Thanks for your work, again! I'm glad this is progressing.



From casey at  Mon Aug  3 16:47:30 2009
From: casey at (Casey Duncan)
Date: Mon, 3 Aug 2009 08:47:30 -0600
Subject: [Python-Dev] In late this am
Message-ID: <>

Going to the Dr., will be in thereafter.


From casey at  Mon Aug  3 16:56:11 2009
From: casey at (Casey Duncan)
Date: Mon, 3 Aug 2009 08:56:11 -0600
Subject: [Python-Dev] In late this am
In-Reply-To: <>
References: <>
Message-ID: <>

Heh, wrong dev list 8^). Sorry for the noise.


On Aug 3, 2009, at 8:47 AM, Casey Duncan wrote:

> Going to the Dr., will be in thereafter.
> -Casey

From vincent.legoll at  Mon Aug  3 09:20:25 2009
From: vincent.legoll at (Vincent Legoll)
Date: Mon, 3 Aug 2009 09:20:25 +0200
Subject: [Python-Dev] pylinting the stdlib
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Aug 2, 2009 at 6:20 PM, Mark Dickinson<dickinsm at> wrote:
> On Sat, Aug 1, 2009 at 11:40 PM, Vincent Legoll<vincent.legoll at> wrote:
>> I've fed parts of the stdlib to pylint and after some filtering
>> there appear to be some things that look strange, I've
>> filed a few bugs in the tracker for them.
>> Is this useless and taking reviewer's time for nothing ?
>> Please advise, if this is deemed useful, I'll continue further
> I think this is valuable work---please do continue!

Thanks, I will

> Just out of interest, how many false positives did you have
> to filter out in finding the 5 cases above?

I can't really tell whether there were false positives; I just started with
the low-hanging fruit, the ones I immediately saw as fishy.
I skipped the rest without too much consideration, and I
think it will take many iterations to do the whole thing.

I used a pylint version which is not capable of understanding
py3k syntax, so a lot of files were simply skipped.

Vincent Legoll

From eric.pruitt at  Mon Aug  3 20:42:03 2009
From: eric.pruitt at (Eric Pruitt)
Date: Mon, 3 Aug 2009 13:42:03 -0500
Subject: [Python-Dev] Functionality in subprocess.Popen.terminate()
Message-ID: <>

In my GSoC project, I have implemented asynchronous I/O in subprocess.Popen.
Since the read/write operations are asynchronous, the program may have
already exited by the time one calls the asyncread function I have
implemented. While it returns the data just fine, I have come across an
issue with the TerminateProcess function in Windows: if the program has
already exited, when subprocess.Popen.terminate calls the Windows built-in
"TerminateProcess" function, an "access denied" error will occur. Should I
just make it so that this exception is simply ignored or perform some kind
of check to see if the process exists beforehand? If the latter, I have been
unable to find a way to do so, to my liking at least. The solutions I saw
would require code that seems a bit excessive to me.
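
For reference, a sketch that combines the two options using only the
existing Popen.poll(), Popen.terminate() and Popen.wait() methods. The
helper name is made up, and this is not what subprocess itself does: it
checks first, and ignores the error only when the process turns out to
have exited in the meantime.

```python
import subprocess
import sys


def terminate_if_running(proc):
    """Terminate proc unless it has already exited; return its exit code."""
    if proc.poll() is not None:
        return proc.returncode  # already gone, nothing to terminate
    try:
        proc.terminate()
    except OSError:
        # The process may exit between poll() and terminate(); only
        # re-raise if it is in fact still running.
        if proc.poll() is None:
            raise
    return proc.wait()


# A child that has already finished by the time we try to stop it.
child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()
print(terminate_if_running(child))
```

The try/except is still needed because the check alone cannot close the
window between poll() and terminate().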


From brett at  Mon Aug  3 23:02:19 2009
From: brett at (Brett Cannon)
Date: Mon, 3 Aug 2009 14:02:19 -0700
Subject: [Python-Dev] PEP 385: updating the PEP
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Aug 3, 2009 at 04:53, Dirkjan Ochtman <dirkjan at> wrote:

> The diff below should reflect changes from the discussion we had last
> time. Please review. (Some comments may be more appropriate for the
> other threads I just kicked off.)
> Cheers,
> Dirkjan
> Index: pep-0385.txt
> ===================================================================
> --- pep-0385.txt        (revision 74294)
> +++ pep-0385.txt        (revision 74296)
> @@ -59,27 +59,25 @@
>  often has somewhat unintuitive results for people (though this has been
>  getting better in recent versions of Mercurial).
> -I'm still a bit on the fence about whether Python should adopt cloned
> -branches and named branches. Since it usually makes more sense to tag
> releases
> -on the maintenance branch, for example, mainline history would not contain
> -release tags if we used cloned branches. Also, Mercurial 1.2 and 1.3 have
> the
> -necessary tools to make named branches less painful (because they can be
> -properly closed and closed heads are no longer considered in relevant
> cases).
> +The current proposal is to use named branches for release branches and
> adopt
> +cloned branches for feature branches, with one exception to this rule: the
> 3.x
> +branches will be kept in separate clones from the 2.x branches. I think
> this
> +provides an optimal hybrid approach for Python's uses of branching.

Sounds good to me.

> -A disadvantage might be that the used clones will be a good bit larger
> (since
> -they essentially contain all other branches as well). This can me
> mitigated by
> -keeping non-release (feature) branches in separate clones. Also note that
> it's
> -still possible to clone a single named branch from a combined clone, by
> -specifying the branch as in hg clone
> -Keeping the py3k history in a separate clone problably also makes sense.
> +Differences between named branches and cloned branches:
> -XXX To do: size comparison for selected separation scenarios.
> +* Tags in a different (maintenance) clone aren't available in the local
> clone
> +* Clones with named branches will be larger, since they contain more data
> +(The Mercurial book discourages the use of named branches, but it is, in
> this
> +respect, somewhat outdated. Named branches have gotten much easier to use
> +since that comment was written, due to improvements in hg.)
> +
>  Converting branches
>  -------------------
>  There are quite a lot of branches in SVN's branches directory. I propose
> to
> -clean this up a bit, by employing the following the strategy:
> +clean this up a bit, by following this basic strategy:
>  * Keep all release (maintenance) branches
>  * Discard branches that haven't been touched in 18 months, unless somone
> @@ -87,6 +85,21 @@
>  * Keep branches that have been touched in the last 18 months, unless
> someone
>   indicates the branch can be deprecated
> +There's a `branch map`_ available that shows info about each branch:
> +
> +* keep-clone means we'll keep that branch in a separate clone
> +* keep-named means we'll keep that branch as a named branch in one of
> the clones
> +* strip means we won't keep that branch
> +* streamed-merge means that it got merged by committing several new
> revisions
> +  to the other branch
> +* merged-r* means the branch got merged in the named revision
> +* merges? means I haven't checked/found out yet whether that branch was
> ever
> +  merged
> +* ? means that your input would be even more helpful than for the other
> items
> +* some items have no action yet, feel free to treat that as just '?'
> +
> +.. _branch map:
> +
>  Converting tags
>  ---------------
> @@ -95,8 +108,8 @@
>  we should keep all release tags, and consider other tags for inclusion
> based
>  on requests from the developer community. I'd like to consider unifying
> the
>  release tag naming scheme to make some things more consistent, if people
> feel
> -that won't create too many problems. For example, Mercurial itself just
> uses
> -'1.2.1' as a tag, where CPython would currently use r121.
> +that won't create too many problems. The current proposal is to bring old
> +release tags in line with the current practice of release tag naming.
>  Author map
>  ----------
> @@ -119,17 +132,19 @@
>  possible forms of pattern matching. The current Python repository already
>  includes a rudimentary .hgignore file to help with using the hg mirrors.
> -It might be useful to have the .hgignore be generated automatically from
> -svn:ignore properties. This would make sure all historic revisions also
> have
> -useful ignore information (though one could argue ignoring isn't really
> -relevant to just checking out an old revision).
> +Since the current Python repository already includes a .hgignore file (for
> use
> +with hg mirrors), we'll just use that. Generating full history of the file
> +was debated but deemed impractical (because it's relatively hard with
> fairly
> +little gain, since ignoring is less important for older revisions).
>  Revlog reordering
>  -----------------
> -As an optional optimization technique, we should consider trying a
> reordering
> -pass on the revlogs (internal Mercurial files) resulting from the
> conversion.
> -In some cases this results in dramatic decreases in on-disk repository
> size.
> +As an optional optimization technique, I have performed a reordering pass
> on
> +the revlogs (internal Mercurial files) resulting from the conversion. In
> some
> +cases this results in dramatic decreases in on-disk repository size. This
> +especially makes sense for the manifest (where it really helps out quite a
> lot)
> +and oft-edited files like NEWS.txt (with an admittedly smaller effect).
>  Other repositories
>  ------------------
> @@ -138,7 +153,14 @@
>  converted. What other projects in the repository should be
>  converted? Do we want to convert the peps repository? distutils? others?
> +There's now an initial stab at converting the Jython repository. The
> current
> +tip of hgsubversion unfortunately fails at some point. Pending
> investigation.
> +Other repositories that would like to converted to Mercurial can announce
> +themselves to me after the main Python migration is done, and I'll take
> care
> +of their needs.
> +
> +
>  Infrastructure
>  ==============
> @@ -165,18 +187,34 @@
>   lines. Open issue: do we check only the tip after each push, or do we
> check
>   every commit in a changegroup?
> -* commit mails: we can leverage the notify extension for this
> +* commit mails: we can leverage the notify extension for this. Emails will
> +  include diffs for each changeset committed against the repository.
>  * buildbots: both the regular and the community build masters must be
> notified.
>   Fortunately buildbot includes support for hg. I've also implemented this
> for
>   Mercurial itself, so I don't expect problems here.
>  * check contributors: in the current setup, all changesets bear the
> username of
> -  committers, who must have signed the contributor agreement. In a DVCS,
> the
> -  committers are not necessarily the same people who push, and so we can't
> -  check if the committer is a contributor. We could use a hook to check if
> the
> -  committer is a contributor if we keep a list of registered contributors.
> +  committers, who must have signed the contributor agreement. We might
> want to
> +  use a hook to check if the committer is a contributor if we keep a list
> of
> +  registered contributors. Then, the hook might warn users that push a
> group
> +  of revisions containing changesets from unknown contributors.

Is this from people who submit patch sets to that include
the individual commits? Or is this for core developers?

> +End-of-line conversions
> +-----------------------
> +
> +There has been some discussion about the lack of end-of-line conversion
> support
> +in Mercurial. While Mercurial comes with a win32text extension that
> provides
> +some basic support for converting end-of-line data on a file-name pattern
> +basis, the lack of exclusion (for specifying broad rules with exceptions)
> and
> +the use of hgrc files (which can't be versioned) make it less than ideal.
> +
> +I think the primary line of defense for prevention of inappropriate
> newlines
> +should be hooks on the server side which basically turn down any
> changegroup
> +or changeset introducing such data. The use of the win32text extension
> (which
> +can hopefully be improved/extended to support the usage scenarios
> mentioned
> +above) and/or a commit-time hook could be the first line of defense.
> +
>  hgwebdir
>  --------
> @@ -185,7 +223,16 @@
>  build a quick extension to augment the URL rev parser so that it can also
> take
>  r[0-9]+ args and come up with the matching hg revision.
> +roundup
> +-------
> +We'll come up with an auto-linking plugin for roundup, which can match a
> +changeset identifier (possibly with a branch prefix), and link it to the
> +appropriate revision in the hgwebdir instance. Second, the script above
> (in
> +the hgwebdir section) will make sure that old links to revision should
> continue
> +to work (by pointing to the hg changeset that reflects the svn revision).
> +
> +
>  After migration
>  ===============
> @@ -222,37 +269,32 @@
>  .. _wiki:
>  .. _parts of the developer FAQ:
> -Think first, commit later?
> ---------------------------
> +Proposed workflow
> +-----------------
> -In recent history, old versions of Python have been maintained by a select
> -group of people backporting patches from trunk to release branches. While
> -this may not scale so well as the development pace grows, it also runs
> into
> -some problems with the current crop of distributed versioning tools. These
> -tools (I believe similar problems would exist for either git, bzr, or hg,
> -though some may cope better than others) are based on the idea of a
> Directed
> -Acyclic Graph (or DAG), meaning they keep track of relations of
> changesets.
> +I propose two workflows for the migration of patches between several
> branches.
> -Mercurial itself has a stable branch which is a ''strict'' subset of the
> -unstable branch. This means that generally all fixes for the stable branch
> -get committed against the tip of the stable branch, then they get merged
> into
> -the unstable branch (which already contains the parent of the new cset).
> This
> -provides a largely frictionless environment for moving changes from stable
> to
> -unstable branches. Mistakes, where a change that should go on stable goes
> on
> -unstable first, do happen, but they're usually easy to fix. That can be
> done by
> -copying the change over to the stable branch, then trivial-merging with
> -unstable -- meaning the merge in fact ignores the parent from the stable
> -branch).
> +For migration within 2.x or 3.x branches, I propose a patch always gets
> +committed to the oldest branch where it applies first. Then, the resulting
> +changeset can be merged using hg merge to all newer branches within that
> +series (2.x or 3.x). If it does not apply as-is to the newer branch, hg
> revert
> +can be used to easily revert to the new-branch-native head, patch in some
> +alternative version of the patch (or none, if it's not applicable), then
> commit
> +the merge. The premise here is that all changesets from an older branch
> within
> +the series are eventually merged to all newer branches within the series.
> -This strategy means a little more work for regular committers, because
> they
> -have to think about whether their change should go on stable or unstable;
> they
> -may even have to ask someone else (the RM) before committing. But it also
> -relieves a dedicated group of committers of regular backporting duty, in
> -addition to making it easier to work with the tool.
> +The upshot is that this provides for the most painless merging procedure.
> The
> +downside is that in the general case, people have to think about the
> oldest
> +branch to which the patch should be applied before actually applying it.

People should be doing that anyway intra-major version, so it should be a
very minor pain point. Plus named branches make this straightforward, right?
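
The within-series rule quoted above boils down to something like the
following sketch. The function and the branch names in the example are
illustrative assumptions, not part of the PEP: commit on the oldest
branch the patch is relevant for, then merge forward into every newer
branch of the same series, even where the merge ends up applying
nothing.

```python
def merge_plan(series, applies_to):
    """Return (commit_branch, merge_targets) for a patch.

    series lists the branches of one series from oldest to newest;
    applies_to is the set of branches the patch is relevant for.
    """
    relevant = [branch for branch in series if branch in applies_to]
    if not relevant:
        raise ValueError("patch applies to no branch in this series")
    oldest = relevant[0]
    # Every newer branch receives the merge, applicable or not.
    newer = series[series.index(oldest) + 1:]
    return oldest, newer


# Illustrative 2.x series, oldest first.
series_2x = ["release25-maint", "release26-maint", "trunk"]
print(merge_plan(series_2x, {"release26-maint", "trunk"}))
```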

> -Now would be a good time to consider changing strategies in this regard,
> -although it would be relatively easy to switch to such a model later on.
> +For migration between 2.x and 3.x branches (which should all be in the
> same
> +direction, though I'm not sure what direction is most appropriate here),

Patches go from 2.x to 3.x typically.

> +changesets should be transplanted (not merged) in some other way. The
> +transplant extension, import/export and bundle/unbundle work equally well
> here.

We will need to choose one and document how to use it. When you are ready to
lock this down let me know and we can start writing a new version of the
dev FAQ.

> +Choosing this approach allows 3.x not to carry all of the 2.x
> history-since-it-
> +was-branched, meaning the clone is not as big and the merges not as
> complicated.
> +

So just like it is now with svnmerge, right? That's fine as long as we
continue to include the revision # in the commit so it is possible to
reference the original commit on the other branch.

>  The future of Subversion
>  ------------------------
> @@ -281,7 +323,9 @@
>  I propose that the revision identifier will be the short version of hg's
>  revision hash, for example 'dd3ebf81af43', augmented with '+' (instead of
> 'M')
>  if the working directory from which it was built was modified. This
> mirrors
> -the output of the hg id command, which is intended for this kind of usage.
> +the output of the hg id command, which is intended for this kind of usage.
> The
> +sys.subversion value will also be renamed to sys.mercurial to reflect the
> +change in VCS.
>  For the tag/branch identifier, I propose that hg will check for tags on
> the
>  currently checked out revision, use the tag if there is one ('tip' doesn't

From dirkjan at  Mon Aug  3 12:51:36 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Mon, 3 Aug 2009 12:51:36 +0200
Subject: [Python-Dev] PEP 385: pruning/reorganizing branches
Message-ID: <>

So PEP 385 proposes to clean up the old branches we still have lying
around in SVN.

This list of branch: action items is what I've come up with to do this
cleanup. Legend first:

- keep-clone means we'll keep that branch in a separate clone
- keep-named means we'll keep that branch as a named branch in one of the clones
- strip means we won't keep that branch
- streamed-merge means that it got merged by committing several new
revisions to the other branch
- merged-r* means the branch got merged in the named revision
- merges? means I haven't checked/found out yet whether that branch
was ever merged
- ? means that your input would be even more helpful than for the other items
- some items have no action yet, feel free to treat that as just '?'

The actual List:

py3k: keep-clone
default: keep-clone
tk_and_idle_maintenance: keep-clone
release26-maint: keep-named
release30-maint: keep-named
pep-0383: keep-clone
py3k-short-float-repr: strip streamed-merge
multiprocessing-autoconf: keep-clone?
release25-maint: keep-named
io-c: keep-clone?
py3k-issue1717: keep-clone
tlee-ast-optimize: keep-clone
release24-maint: keep-named
empty: keep-clone?
py3k-urllib: keep-clone
tnelson-trunk-bsddb-47-upgrade: strip
benjaminp-testing: strip
py3k-importlib: keep-clone
release23-maint: keep-named
py3k-importhook: keep-clone
ctypes-branch: strip
decimal-branch: merged-r58143
bcannon-objcap: keep-clone?
p3yk_no_args_on_exc: strip
amk-mailbox: keep-clone?
twouters-dictviews-backport: keep-clone?
bcannon-sandboxing: keep-clone?
release22-maint: keep-named
theller_modulefinder: strip
hoxworth-stdlib_logging-soc: strip partial
tim-exc_sanity: merged-r46426
IDLE-syntax-branch: merged-r41480 strip-later
ast-objects: strip
ast-arena: merged-r41739
jim-doctest: strip
ast-branch: merged-r39758
release23-branch: merges?
jim-modulator: strip
release21-maint: keep-named
indexing-cleanup-branch: strip
r23c1-branch: merged-r33637
r23b2-branch: merges?
anthony-parser-branch: merged-r35460
r23b1-branch: merged-r32490
idlefork-merge-branch: strip
getargs_mask_mods: strip
cache-attr-branch: strip
folding-reimpl-branch: strip streamed-merge
r23a2-branch: merges?
bsddb-bsddb3-schizo-branch: merged-r31008
r23a1-branch: merged-r30482
py-cvs-vendor-branch: strip
DS_RPC_BRANCH: strip streamed-merge
SourceForge: strip
release22-branch: merged-r24921
r22rc1-branch: strip
r22b2-branch: merges? merged-r24426
r22b1-branch: merges?
r22a4-branch: merges?
r22a3-branch: merges?
r22a2-branch: merged-r22674
descr-branch: merged-r22139
release20-maint: keep-named
gen-branch: merged-r21181
iter-branch: merged-r20492
r161-branch: merges?
cnri-16-start: strip
universal-33: merges?
None: strip
avendor: strip
Distutils_0_1_3-branch: strip partial
release152p1-patches: merges?
string_methods: merged-r13927
PYIDE: strip
OSAM: strip
BBPY: strip
jar: merges?
alpha100: strip streamed-merge
unlabeled-2.36.4: strip partial
unlabeled-2.1.4: strip partial
unlabeled-2.25.4: strip partial
fix-test-ftplib: merged-r66673
py3k-ctypes-pep3118: merged-r62597
trunk-bytearray: merged-r61936
libffi3-branch: merged-r61234
alex-py3k: strip
cpy_merge: strip
py3k-pep3137: merged-r58888
../ctypes-branch: strip
pep302_phase2: strip
py3k-buffer: merged-r57181
p3yk: rename
unlabeled-1.1.1: strip
unlabeled-1.5.4: strip
unlabeled-1.1.2: strip
unlabeled-2.9.2: strip
unlabeled-2.9.4: strip
unlabeled-1.5.2: strip
unlabeled-2.1.2: strip
unlabeled-2.36.2: strip
unlabeled-2.108.2: strip
unlabeled-2.10.2: strip
unlabeled-2.54.2: strip
unlabeled-1.3.2: strip
unlabeled-1.23.4: strip
unlabeled-2.25.2: strip
unlabeled-1.2.2: strip
unlabeled-1.98.2: strip
unlabeled-2.16.2: strip
unlabeled-2.3.2: strip
unlabeled-1.9.2: strip
unlabeled-1.8.2: strip
aimacintyre-sf1454481: merged-r46919
tim-current_frames: merged-r50541
bippolito-newstruct: merges?
runar-longslice-branch: strip
steve-notracing: strip
rjones-funccall: merged-r46096
sreifschneider-newnewexcept: merged-r46456
tim-doctest-branch: merged-r36839
blais-bytebuf: strip
../bippolito-newstruct: rename
rjones-prealloc: strip
sreifschneider-64ints: strip
stdlib-cleanup: strip
ssize_t: merged-r42382
sqlite-integration: merged-r43514
tim-obmalloc: merged-r43059
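
For anyone who wants to process this list mechanically, a small sketch
of reading the "branch: action" format. The function names are made up
for illustration, and this is not the branch map format hgsubversion
will use. An entry without an action is treated as '?', per the legend.

```python
def parse_branch_map(text):
    """Map each branch name to its list of action words."""
    actions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, rest = line.partition(":")
        actions[name.strip()] = rest.split() or ["?"]
    return actions


def merged_revision(words):
    """Return the svn revision from a merged-r* action, or None."""
    for word in words:
        if word.startswith("merged-r"):
            return int(word[len("merged-r"):])
    return None


sample = """\
py3k: keep-clone
release26-maint: keep-named
decimal-branch: merged-r58143
r22b2-branch: merges? merged-r24426
empty: keep-clone?
"""
branch_map = parse_branch_map(sample)
for name, words in branch_map.items():
    print(name, words, merged_revision(words))
```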

Further actions:

- implement branch map support in hgsubversion to be able to do
named/unnamed/no branch on a branch-by-branch basis
- implement splice map support in hgsubversion to be able to convert
given merges to hg-native merge data



From martin at  Tue Aug  4 08:06:04 2009
From: martin at ("Martin v. Löwis")
Date: Tue, 04 Aug 2009 08:06:04 +0200
Subject: [Python-Dev] PEP 385: pruning/reorganizing branches
In-Reply-To: <>
References: <>
Message-ID: <>

> empty: keep-clone?

I use that as a branch to tell build slaves to clean out their
current checkouts. So keep-clone sounds right, assuming it is possible
to target buildslaves at either clones or branches (which, IIUC, would
be necessary anyway, since we are using a mix of branches and clones).

> amk-mailbox: keep-clone?
> twouters-dictviews-backport: keep-clone?
> bcannon-sandboxing: keep-clone?
> bippolito-newstruct: merges?

You'll probably need to explicitly ping the specific owners
(Andrew Kuchling, Thomas Wouters, Brett Cannon, Bob Ippolito)
to understand the fate of these branches.

This also raises the question how developers should publish their
"own" branches. For the bzr setup, there was apparently a proposal
to use directories for that, i.e. giving each developer a directory
on to publish branches.

Not doing that, but keeping owner information encoded in the clone
name, would be fine as well.

> release23-branch: merges?
> r23b2-branch: merges?
> r22rc1-branch: strip
> r22b1-branch: merges?
> r22a4-branch: merges?
> r22a3-branch: merges?
> r161-branch: merges?

It seems we had been creating CVS branches for every release around
that time; I don't remember the details. Each such branch should end
up in a tag. For example, release23-branch should (and does) ultimately
lead to tags/r23. cvs2svn wasn't able to recognize this correctly (as
CVS branches apply to each file individually), so it created the r23
tag out of various copies that were current when the tag was made.

I don't know what your plan is wrt. release tags, i.e. whether you
want to keep them all. If you are stripping out some of the branches,
but plan to keep the release tags, I wonder what the tags look like.

> release22-branch: merged-r24921

Not really. Jack Jansen merged some changes that got first applied
to the 2.2

> r22b2-branch: merges? merged-r24426

> release20-maint: keep-named

See above. So you do plan to keep all past releases?

> release152p1-patches: merges?

Probably merged. I don't recall whether 1.5.2p1 really happened;
in r14966, Fred claims that he merged all changes from 1.5.2p2 (!).

"Hopefully I got all this right!"

I surely hope the same - I doubt anybody would go back and check
whether anything is missing.


From dirkjan at  Tue Aug  4 08:33:49 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Tue, 4 Aug 2009 08:33:49 +0200
Subject: [Python-Dev] PEP 385: pruning/reorganizing branches
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Aug 4, 2009 at 08:06, "Martin v. Löwis"<martin at> wrote:
> I use that as a branch to tell build slaves to clean out their
> current checkouts. So keep-clone sounds right, assuming it is possible
> to target buildslaves at either clones or branches (which, IIUC, would
> be necessary anyway, since we are using a mix of branches and clones).

Yes, that should be straightforward.

>> amk-mailbox: keep-clone?
>> twouters-dictviews-backport: keep-clone?
>> bcannon-sandboxing: keep-clone?
>> bippolito-newstruct: merges?
> You'll probably need to explicitly ping the specific owners
> (Andrew Kuchling, Thomas Wouters, Brett Cannon, Bob Ippolito)
> to understand the fate of these branches.

Will do.

> This also raises the question how developers should publish their
> "own" branches. For the bzr setup, there was apparently a proposal
> to use directories for that, i.e. giving each developer a directory
> on to publish branches.

User repositories have apparently worked well for Mozilla, so yeah,
it's worth discussing.

> Not doing that, but keeping owner information encoded in the clone
> name, would be fine as well.
>> release23-branch: merges?
>> r23b2-branch: merges?
>> r22rc1-branch: strip
>> r22b1-branch: merges?
>> r22a4-branch: merges?
>> r22a3-branch: merges?
>> r161-branch: merges?
> It seems we had been creating CVS branches for every release around
> that time; I don't remember the details. Each such branch should end
> up in a tag. For example, release23-branch should (and does) ultimately
> lead to tags/r23. cvs2svn wasn't able to recognize this correctly (as
> CVS branches apply to each file individually), so it created the r23
> tag out of various copies that were current when the tag was made.
> I don't know what your plan is wrt. release tags, i.e. whether you
> want to keep them all. If you are stripping out some of the branches,
> but plan to keep the release tags, I wonder what the tags look like.

The plan was to keep all maintenance branches and all release tags but
not all release branches (since they seem to contain few commits
anyway).

>> release22-branch: merged-r24921
> Not really. Jack Jansen merged some changes that got first applied
> to the 2.2
>> r22b2-branch: merges? merged-r24426
>> r22b2-branch: merges? merged-r24426
>> release20-maint: keep-named
> See above. So you do plan to keep all past releases?
>> release152p1-patches: merges?
> Probably merged. I don't recall whether 1.5.2p1 really happened;
> in r14966, Fred claims that he merged all changes from 1.5.2p2 (!).
> "Hopefully I got all this right!"
> I surely hope the same - I doubt anybody would go back and check
> whether anything is missing.

Thanks for the thorough review,


From dickinsm at  Tue Aug  4 11:20:09 2009
From: dickinsm at (Mark Dickinson)
Date: Tue, 4 Aug 2009 10:20:09 +0100
Subject: [Python-Dev] PEP 385: pruning/reorganizing branches
In-Reply-To: <>
References: <>
Message-ID: <>

Comments on some of the branches I've had involvement with...

On Mon, Aug 3, 2009 at 11:51 AM, Dirkjan Ochtman<dirkjan at> wrote:

> py3k-short-float-repr: strip streamed-merge

Sounds fine.

> py3k-issue1717: keep-clone

I don't think there's any need to keep this branch;  its contents were
all merged (in pieces) to py3k (various revisions with numbers in
the range 69188--69225).  So I think 'strip streamed-merge' is
appropriate here, if I'm understanding your terminology.

> trunk-math:

I think this one can go down as 'strip', too;  there's nothing there of
interest that isn't already in trunk and py3k.  It was merged to
trunk in r62380.


From ncoghlan at  Tue Aug  4 11:20:13 2009
From: ncoghlan at (Nick Coghlan)
Date: Tue, 04 Aug 2009 19:20:13 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
Message-ID: <>

Dirkjan Ochtman wrote:
>>> * commit hooks be implemented to enforce this - but this should not be
>>> necessary if the above was implemented and socially enforced.
> You seem to advocate a two-step approach: enforce line endings through
> win32text, catch any errors that slipped through in a hook (commit
> hook is an optional first line of defense, changegroup hooks on the
> server to protect the rest of the world).
> I think inverting that approach would be better: have strict hooks on
> the server to prevent people from pushing inappropriate EOLs, and
> provide help on configuring win32text as an extra help for developers
> on Windows who use editors that work better with \r\n. That leaves
> people to pick their own weapon of choice against propagation of \r\n
> (e.g. better editor, commit hooks, whatever) while still making sure
> no inappropriate line endings land in the repositories. It
> also seems to fit well with the whole consenting adults thing (but
> that might just be me).

It's about not treating Windows developers as second class citizens.
Their platform uses \r\n as its native line ending format, so they
should be able to work in that format without any hassles by following
some simple instructions (such as "ensure you have version X of the
Windows hg client, enable the win32text extension and configure it in
such-and-such a way"). Not "oh, yeah, that's an issue but if you search
the Intarwebs there are a few different things you can do that kinda
sorta work but are a bit fragile and klunky".

The precise order the two issues (server side enforcement and client
side assistance) are dealt with doesn't really matter because *both*
issues need to be addressed before we migrate.

win32text needs to be usable on non-Windows clients so that tarballs
generated on a *nix machine get the line endings right in the
Windows-only files.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Tue Aug  4 11:27:46 2009
From: ncoghlan at (Nick Coghlan)
Date: Tue, 04 Aug 2009 19:27:46 +1000
Subject: [Python-Dev] Functionality in subprocess.Popen.terminate()
In-Reply-To: <>
References: <>
Message-ID: <>

Eric Pruitt wrote:
> In my GSoC project, I have implemented asynchronous I/O in
> subprocess.Popen. Since the read/write operations are asynchronous, the
> program may have already exited by the time one calls the asyncread
> function I have implemented. While it returns the data just fine, I have
> come across an issue with the TerminateProcess function in Windows: if
> the program has already exited, when subprocess.Popen.Terminate calls
> the Windows built-in "TerminateProcess" function, an "access denied"
> error will occur. Should I just make it so that this exception is simply
> ignored or perform some kind of check to see if the process exists
> beforehand? If the latter, I have been unable to find a way to do so, to
> my liking at least. The solutions I saw would require code that seems a
> bit excessive to me.

I'm pretty sure we already ignore some spurious error messages in cases
like calling flush() in file.close(). I would suggest checking what the
io module does in such cases and see what kind of precedent it sets.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Tue Aug  4 11:35:33 2009
From: ncoghlan at (Nick Coghlan)
Date: Tue, 04 Aug 2009 19:35:33 +1000
Subject: [Python-Dev] PEP 385: pruning/reorganizing branches
In-Reply-To: <>
References: <>	<>
Message-ID: <>

Dirkjan Ochtman wrote:
> On Tue, Aug 4, 2009 at 08:06, "Martin v. Löwis"<martin at> wrote:
>> I don't know what your plan is wrt. release tags, i.e. whether you
>> want to keep them all. If you are stripping out some of the branches,
>> but plan to keep the release tags, I wonder what the tags look like.
> The plan was to keep all maintenance branches and all release tags but
> not all release branches (since they seem to contain few commits
> anyway).

I think I share Martin's confusion here - how can you keep a release tag
 (e.g. 2.2.2) without also keeping the release branch where that tag was
created? Yes, the maintenance branches contain a comparatively small
number of commits, but they're still the sources of the maintenance
release tags.

Or is this a case where Mercurial's DAG allows you to handle those old
branches as "abandoned" leaves of the DAG in the history of the affected
files, with the tags picking out the relevant versions of the files?


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From amk at  Tue Aug  4 13:43:12 2009
From: amk at (A.M. Kuchling)
Date: Tue, 4 Aug 2009 07:43:12 -0400
Subject: [Python-Dev] PEP 385: pruning/reorganizing branches
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Aug 03, 2009 at 12:51:36PM +0200, Dirkjan Ochtman wrote:
> amk-mailbox: keep-clone?

strip -- this branch was for working on a fix for, but the actual work in the branch
is available as the patches attached to that item.


From robert.schuppenies at  Tue Aug  4 15:41:33 2009
From: robert.schuppenies at (Robert Schuppenies)
Date: Tue, 04 Aug 2009 06:41:33 -0700
Subject: [Python-Dev] PEP 385: pruning/reorganizing branches
In-Reply-To: <>
References: <>
Message-ID: <>

Dirkjan Ochtman wrote:
> okkoto-sizeof

strip - It's a 2008 Google Summer of Code project. The important changes have
been applied in r63856.

From eric.pruitt at  Tue Aug  4 17:01:57 2009
From: eric.pruitt at (Eric Pruitt)
Date: Tue, 4 Aug 2009 10:01:57 -0500
Subject: [Python-Dev] Functionality in subprocess.Popen.terminate()
In-Reply-To: <>
References: <> 
Message-ID: <>

On Tue, Aug 4, 2009 at 04:27, Nick Coghlan<ncoghlan at> wrote:
> Eric Pruitt wrote:
>> In my GSoC project, I have implemented asynchronous I/O in
>> subprocess.Popen. Since the read/write operations are asynchronous, the
>> program may have already exited by the time one calls the asyncread
>> function I have implemented. While it returns the data just fine, I have
>> come across an issue with the TerminateProcess function in Windows: if
>> the program has already exited, when subprocess.Popen.Terminate calls
>> the Windows built-in "TerminateProcess" function, an "access denied"
>> error will occur. Should I just make it so that this exception is simply
>> ignored or perform some kind of check to see if the process exists
>> beforehand? If the latter, I have been unable to find a way to do so, to
>> my liking at least. The solutions I saw would require code that seems a
>> bit excessive to me.
> I'm pretty sure we already ignore some spurious error messages in cases
> like calling flush() in file.close(). I would suggest checking what the
> io module does in such cases and see what kind of precedent it sets.
> Cheers,
> Nick.
> --
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia
> ---------------------------------------------------------------

Sounds good enough to me but I was wondering if it might be a good
idea to add a function like "pidinuse" to subprocess as a whole that
would determine if a process ID was being used and return a simple
boolean value. I came across a number of people searching for a way to
determine if a PID was running (Google "python check if pid exists")
so it seems like the implemented functionality would be of use to the
community as a whole, not just my wrapper class.
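
To make the idea concrete, here is a minimal sketch of what such a
helper might look like on POSIX systems. The name pid_in_use is only
illustrative (nothing by that name exists in subprocess), and the
signal-0 probe it relies on must not be used on Windows, where
os.kill() with an ordinary signal number terminates the target process.

```python
import os


def pid_in_use(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    if os.name != "posix":
        # On Windows, os.kill() with an ordinary signal number terminates
        # the target, so the signal-0 probe must not be used there.
        raise NotImplementedError("signal-0 probing is only safe on POSIX")
    if pid <= 0:
        # Zero and negative values address process groups, not one process.
        raise ValueError("pid must be a positive integer")
    try:
        os.kill(pid, 0)  # signal 0: existence and permission check only
    except ProcessLookupError:
        return False  # ESRCH: no such process
    except PermissionError:
        return True  # EPERM: the process exists but belongs to someone else
    return True
```

Note that the answer can be stale by the time the caller acts on it,
since the process may exit (and its PID may be reused) right after the
check.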


From janzert at  Tue Aug  4 17:24:23 2009
From: janzert at (Janzert)
Date: Tue, 04 Aug 2009 11:24:23 -0400
Subject: [Python-Dev] Functionality in subprocess.Popen.terminate()
In-Reply-To: <>
References: <>
Message-ID: <>

Eric Pruitt wrote:
> On Tue, Aug 4, 2009 at 04:27, Nick Coghlan<ncoghlan at> wrote:
>> Eric Pruitt wrote:
>>> In my GSoC project, I have implemented asynchronous I/O in
>>> subprocess.Popen. Since the read/write operations are asynchronous, the
>>> program may have already exited by the time one calls the asyncread
>>> function I have implemented. While it returns the data just fine, I have
>>> come across an issue with the TerminateProcess function in Windows: if
>>> the program has already exited, when subprocess.Popen.Terminate calls
>>> the Windows built-in "TerminateProcess" function, an "access denied"
>>> error will occur. Should I just make it so that this exception is simply
>>> ignored or perform some kind of check to see if the process exists
>>> beforehand? If the latter, I have been unable to find a way to do so, to
>>> my liking at least. The solutions I saw would require code that seems a
>>> bit excessive to me.
>> I'm pretty sure we already ignore some spurious error messages in cases
>> like calling flush() in file.close(). I would suggest checking what the
>> io module does in such cases and see what kind of precedent it sets.
>> Cheers,
>> Nick.
>> --
>> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia
>> ---------------------------------------------------------------
> Sounds good enough to me but I was wondering if it might be a good
> idea to add a function like "pidinuse" to subprocess as a whole that
> would determine if a process ID was being used and return a simple
> boolean value. I came across a number of people searching for a way to
> determine if a PID was running (Google "python check if pid exists")
> so it seems like the implemented functionality would be of use to the
> community as a whole, not just my wrapper class.
> Eric

I'm not sure of the actual details but it seems from your description
that even if you check first a race condition will still exist.
Specifically the subprocess could terminate after the check and before
the TerminateProcess call. So it seems better just to call
TerminateProcess and then correctly handle any possible error.
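
A rough sketch of that approach, written against the public Popen API
rather than the Windows call itself. The helper name
terminate_if_running is made up for illustration, and the assumption is
that a failed terminate() surfaces as an OSError (or a subclass of it).

```python
import subprocess
import sys


def terminate_if_running(proc):
    """Ask proc to terminate; return False if it had already exited."""
    if proc.poll() is not None:
        return False  # already finished and reaped, nothing to do
    try:
        proc.terminate()
    except OSError:
        # The child may have exited between poll() and terminate().
        # Only swallow the error if that is really what happened.
        if proc.poll() is None:
            raise
        return False
    return True
```

The poll() call up front is just a shortcut; the except clause is what
actually closes the race, because it deals with the child exiting after
the check.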


From brett at  Tue Aug  4 21:28:10 2009
From: brett at (Brett Cannon)
Date: Tue, 4 Aug 2009 12:28:10 -0700
Subject: [Python-Dev] PEP 385: pruning/reorganizing branches
In-Reply-To: <>
References: <> 
Message-ID: <>

On Mon, Aug 3, 2009 at 23:06, "Martin v. Löwis" <martin at> wrote:

> > empty: keep-clone?
> I use that as a branch to tell build slaves to clean out their
> current checkouts. So keep-clone sounds right, assuming it is possible
> to target buildslaves at either clones or branches (which, IIUC, would
> be necessary anyway, since we are using a mix of branches and clones).
> > amk-mailbox: keep-clone?
> > twouters-dictviews-backport: keep-clone?
> > bcannon-sandboxing: keep-clone?
> > bippolito-newstruct: merges?

keep-clone bcannon-objcap, strip bcannon-sandboxing.

> You'll probably need to explicitly ping the specific owners
> (Andrew Kuchling, Thomas Wouters, Brett Cannon, Bob Ippolito)
> to understand the fate of these branches.
> This also raises the question how developers should publish their
> "own" branches. For the bzr setup, there was apparently a proposal
> to use directories for that, i.e. giving each developer a directory
> on to publish branches.

Yeah, I thought I brought this up and people liked the idea of keeping some
user directory on I am fine with as well. But
having some place would be really handy (although having bitbucket and
Google Code makes this not quite as important).

> Not doing that, but keeping owner information encoded in the clone
> name, would be fine as well.
> > release23-branch: merges?
> > r23b2-branch: merges?
> > r22rc1-branch: strip
> > r22b1-branch: merges?
> > r22a4-branch: merges?
> > r22a3-branch: merges?
> > r161-branch: merges?
> It seems we had been creating CVS branches for every release around
> that time; I don't remember the details. Each such branch should end
> up in a tag. For example, release23-branch should (and does) ultimately
> lead to tags/r23. cvs2svn wasn't able to recognize this correctly (as
> CVS branches apply to each file individually), so it created the r23
> tag out of various copies that were current when the tag was made.
> I don't know what your plan is wrt. release tags, i.e. whether you
> want to keep them all. If you are stripping out some of the branches,
> but plan to keep the release tags, I wonder what the tags look like.
> > release22-branch: merged-r24921
> Not really. Jack Jansen merged some changes that got first applied
> to the 2.2
> > r22b2-branch: merges? merged-r24426
> > r22b2-branch: merges? merged-r24426
> > release20-maint: keep-named
> See above. So you do plan to keep all past releases?
> > release152p1-patches: merges?
> Probably merged. I don't recall whether 1.5.2p1 really happened;
> in r14966, Fred claims that he merged all changes from 1.5.2p2 (!).
> "Hopefully I got all this right!"
> I surely hope the same - I doubt anybody would go back and check
> whether anything is missing.
> Regards,
> Martin

From mhammond at  Wed Aug  5 01:43:15 2009
From: mhammond at (Mark Hammond)
Date: Wed, 05 Aug 2009 09:43:15 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
Message-ID: <>

On 4/08/2009 7:20 PM, Nick Coghlan wrote:
> Dirkjan Ochtman wrote:
>>>> * commit hooks be implemented to enforce this - but this should not be
>>>> necessary if the above was implemented and socially enforced.
>> You seem to advocate a two-step approach: enforce line endings through
>> win32text, catch any errors that slipped through in a hook (commit
>> hook is an optional first line of defense, changegroup hooks on the
>> server to protect the rest of the world).
>> I think inverting that approach would be better: have strict hooks on
>> the server to prevent people from pushing inappropriate EOLs, and
>> provide help on configuring win32text as an extra help for developers
>> on Windows who use editors that work better with \r\n. That leaves
>> people to pick their own weapon of choice against propagation of \r\n
>> (e.g. better editor, commit hooks, whatever) while still making sure
>> no inappropriate line endings land in the repositories. It
>> also seems to fit well with the whole consenting adults thing (but
>> that might just be me).
> It's about not treating Windows developers as second class citizens.
> Their platform uses \r\n as its native line ending format, so they

Thanks Nick; I didn't want to be the only one saying that.  There is a 
fine line between asserting reasonable requirements for Windows users 
and being obstructionist and unhelpful, and I'm trying to stay on the 
former side :)

> should be able to work in that format without any hassles by following
> some simple instructions (such as "ensure you have version X of the
> Windows hg client, enable the win32text extension and configure it in
> such-and-such a way"). Not "oh, yeah, that's an issue but if you search
> the Intarwebs there are a few different things you can do that kinda
> sorta work but are a bit fragile and klunky".
> The precise order the two issues (server side enforcement and client
> side assistance) are dealt with doesn't really matter because *both*
> issues need to be addressed before we migrate.

I'm not that happy with the server being the primary line of defense. 
Let's say I make a branch of the hg repo, myself and a few others work 
on it committing as we go, then attempt to merge back upstream.  Let's 
say some of the early commits on that clone introduced "bad" line 
endings.  I'm guessing I would be forced to make a number of 
whitespace-only checkins to normalize the line-endings before it could 
merge - and these checkins would then be in the history forever.  Or I 
could attempt to recreate the clone by somehow "replaying" the commits 
with line endings corrected.  Either way, the situation doesn't seem good.

> win32text needs to be usable on non-Windows clients so that tarballs
> generated on a *nix machine get the line endings right in the
> Windows-only files.

I agree.  It isn't fair to make this Windows users' problem.  It would be 
like me proposing the repo get imported with \r\n line endings, enforce 
that with server side hooks, and let non-Windows users worry about the 
ramifications of that - somehow I doubt that would fly - so neither 
should it fly for Windows users...

I'm more than willing to help on this; I haven't resurrected my stale 
patch because I find win32text only 1/2 a solution that doesn't work in 
practice.  Therefore that patch is as stale for me as it is anyone. 
However, if a plan is put in place which offers a full solution and the 
hg developers are committed to it, I promise I'll put my hand up to help 
with implementation in a fairly timely manner...



From nyamatongwe at  Wed Aug  5 02:44:04 2009
From: nyamatongwe at (Neil Hodgson)
Date: Wed, 5 Aug 2009 10:44:04 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Mark Hammond:

> Thanks Nick; I didn't want to be the only one saying that. There is a fine
> line between asserting reasonable requirements for Windows users and being
> obstructionist and unhelpful, and I'm trying to stay on the former side :)

   I haven't commented on this issue before because I can't really be
helpful. I just don't understand why hg is being considered before
its Windows support is roughly equivalent to svn and cvs.

   There has been some similar experience with the main repository for
the Cocoa port of Scintilla which is in bzr on launchpad. Several
times in that repository, files were checked in with wrong line ends
making every line appear changed when looking through history. There
are several causes for this including user error but bzr (and hg)
should default to more helpful behaviour on text files.
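
The "every line appear changed" effect is easy to reproduce. The
following is only an illustrative sketch using Python's difflib, not
what bzr or hg do internally; changed_line_counts is a made-up helper
name.

```python
import difflib


def changed_line_counts(old_text, new_text):
    """Return (removed, added) line counts between two versions of a file."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1  # lines dropped from the old version
            added += j2 - j1    # lines introduced by the new version
    return removed, added


LF_TEXT = "import os\n\nprint(os.name)\n"
CRLF_TEXT = LF_TEXT.replace("\n", "\r\n")          # same content, other EOLs
EDITED_TEXT = LF_TEXT.replace("os.name", "os.sep")  # one genuine change
```

Comparing LF_TEXT with CRLF_TEXT reports all three lines as removed and
re-added, while the genuine one-line edit in EDITED_TEXT reports a
single changed line.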


From ben+python at  Wed Aug  5 07:56:16 2009
From: ben+python at (Ben Finney)
Date: Wed, 05 Aug 2009 15:56:16 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
References: <>
	<> <>
Message-ID: <>

Mark Hammond <mhammond at> writes:

> Let's say I make a branch of the hg repo, myself and a few others work
> on it committing as we go, then attempt to merge back upstream. Let's
> say some of the early commits on that clone introduced "bad" line
> endings. I'm guessing I would be forced to make a number of
> whitespace-only checkins to normalize the line-endings before it could
> merge - and these checkins would then be in the history forever.

What is wrong with that? I mean, if that is the actual sequence of
events, why should the history not reflect that?

> Either way, the situation doesn't seem good.

I see this assertion made often, so I'm not saying you are necessarily
wrong to make it. I just don't see a justification for making it (and,
without justification, I would say it *is* wrong to make it).

 \         "Our products just aren't engineered for security." --Brian |
  `\             Valentine, senior vice-president of Microsoft Windows |
_o__)                                                      development |
Ben Finney

From digitalxero at  Wed Aug  5 08:02:03 2009
From: digitalxero at (Dj Gilcrease)
Date: Wed, 5 Aug 2009 00:02:03 -0600
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Tue, Aug 4, 2009 at 5:43 PM, Mark Hammond<mhammond at> wrote:
> I'm more than willing to help on this; I haven't resurrected my stale patch
> because I find win32text only 1/2 a solution that doesn't work in practice.
> Therefore that patch is as stale for me as it is anyone. However, if a plan
> is put in place which offers a full solution and the hg developers are
> committed to it, I promise I'll put my hand up to help with implementation
> in a fairly timely manner...

Not sure what your patch was as I cannot find it, but I did up a quick
change to win32text that uses a versioned .win32text file to maintain
encoders, decoders and an ignore list

and add to your hgrc file
precommit.eol_encode = python:hgext.win32text.versioned_encode

it needs to be precommit since it needs to run before the change set
has been created so it can modify the data. Honestly I think this
solution is kind of a hack, a much better solution would be to modify
the encode/decode hooks to accept a filename so you can at least do
ignore pattern matching, but that still ignores versioned encodes /

From skippy.hammond at  Wed Aug  5 08:08:43 2009
From: skippy.hammond at (Mark Hammond)
Date: Wed, 05 Aug 2009 16:08:43 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 5/08/2009 3:56 PM, Ben Finney wrote:
> Mark Hammond<mhammond at>  writes:
>> Let's say I make a branch of the hg repo, myself and a few others work
>> on it committing as we go, then attempt to merge back upstream. Let's
>> say some of the early commits on that clone introduced "bad" line
>> endings. I'm guessing I would be forced to make a number of
>> whitespace-only checkins to normalize the line-endings before it could
>> merge - and these checkins would then be in the history forever.
> What is wrong with that? I mean, if that is the actual sequence of
> events, why should the history not reflect that?

The problem is the sequence of events happened in the first place.  An 
extra burden is placed on the developer that will quickly get tiresome. 
  I wouldn't personally be happy if that workflow became the norm.

>> Either way, the situation doesn't seem good.
> I see this assertion made often, so I'm not saying you are necessarily
> wrong to make it. I just don't see a justification for making it (and,
> without justification, I would say it *is* wrong to make it).

*shrug* - in my opinion, the fact the developer is faced with that 
hurdle in their workflow is justification enough to say that developer's 
situation "doesn't seem good" and should have been prevented from 
happening by the tool much earlier than proposed.


From ben+python at  Wed Aug  5 08:50:05 2009
From: ben+python at (Ben Finney)
Date: Wed, 05 Aug 2009 16:50:05 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
References: <>
	<> <>
	<> <>
Message-ID: <>

Mark Hammond <skippy.hammond at> writes:

> On 5/08/2009 3:56 PM, Ben Finney wrote:
> > Mark Hammond<mhammond at>  writes:
> >
> >> Let's say I make a branch of the hg repo, myself and a few others work
> >> on it committing as we go, then attempt to merge back upstream. Let's
> >> say some of the early commits on that clone introduced "bad" line
> >> endings.
> The problem is the sequence of events happened in the first place. An
> extra burden is placed on the developer that will quickly get
> tiresome. I wouldn't personally be happy if that workflow became the
> norm.

Ah, okay. In that case, the ultimate "problem" is that OS vendors
entrenched their incompatible line-ending conventions instead of
choosing a single standard. Any line-ending burden borne by developers
is a result of that.

If things were different, they'd be different. However, we live with the
legacy of that stupid set of decisions and have no real option to
resolve it permanently short of deprecating entire vistas of tools (or
even entire operating systems).

> *shrug* - in my opinion, the fact the developer is faced with that
> hurdle in their workflow is justification enough to say that
> developer's situation "doesn't seem good" and should have been
> prevented from happening by the tool much earlier than proposed.

AIUI, this is a combination of several things:

* different OSen have incompatible, entrenched conventions for
  line-ending that are embodied in the default output of their text
  processing tools.

* these differences matter in many concrete ways to the tools that
  process text, so the differences need to be preserved, or explicitly

* distributed VCS has the job of preserving data as present on the
  filesystem, including whatever line-ending convention is present in a
  file.

* distributed VCS has the job of managing data exchange between users,
  presenting differences in a way that allows easy inspection and

* humans want to pretend that these incompatibilities don't exist, and
  want "end of line" to be an automatically-handled abstraction.

It's not a simple thing to solve, and many clever people have tried over
the decades. The fact that a centralised VCS can put the problem aside
by requiring an explicit, single decision in the repository, is no help
when addressing the constraints of a distributed VCS.

At some point, the decision about how to handle line endings in
cross-platform data needs to be punted to a human for a
context-sensitive assessment, since (as can be seen) the above list of
requirements is internally inconsistent and can't be relegated to a
one-size-fits-all algorithm.

 \           "All progress has resulted from people who took unpopular |
  `\                                     positions." --Adlai Stevenson |
_o__)                                                                  |
Ben Finney

From martin at  Wed Aug  5 09:35:26 2009
From: martin at ("Martin v. Löwis")
Date: Wed, 05 Aug 2009 09:35:26 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
Message-ID: <>

>    I haven't commented on this issue before because I can't really be
> helpful. I just don't understand why hg is being considered before
> its Windows support is roughly equivalent to svn and cvs.

Is it really that you don't *understand*? It's fairly easy: there was
a PEP which offered a number of options, and there was BDFL
pronouncement. This (BDFL pronouncement) is how Python has always
worked, and, as a principle, it is a good and useful process.

Now, the specific outcome of the process means that more work needs to
be done. So we have a *second* PEP, and we have a lack of volunteers
that help implementing it. The second PEP hasn't been approved yet
(as it isn't complete, yet), so migration to hg is stalled.
The primary volunteer (Dirkjan) has indicated that he can't help with
that specific issue, so other volunteers need to step forward, or we
cannot move to hg.


From skippy.hammond at  Wed Aug  5 09:31:53 2009
From: skippy.hammond at (Mark Hammond)
Date: Wed, 05 Aug 2009 17:31:53 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>
	<> <>
Message-ID: <>

On 5/08/2009 4:50 PM, Ben Finney wrote:
> Mark Hammond<skippy.hammond at>  writes:
>> On 5/08/2009 3:56 PM, Ben Finney wrote:
>>> Mark Hammond<mhammond at>   writes:
>>>> Let's say I make a branch of the hg repo, myself and a few others work
>>>> on it committing as we go, then attempt to merge back upstream. Let's
>>>> say some of the early commits on that clone introduced "bad" line
>>>> endings.
> [...]
>> The problem is the sequence of events happened in the first place. An
>> extra burden is placed on the developer that will quickly get
>> tiresome. I wouldn't personally be happy if that workflow became the
>> norm.
> Ah, okay. In that case, the ultimate "problem" is that OS vendors
> entrenched their incompatible line-ending conventions instead of
> choosing a single standard. Any line-ending burden borne by developers
> is a result of that.

Yeah - this happened around 1964 if wikipedia is any guide.

> If things were different, they'd be different. However, we live with the
> legacy of that stupid set of decisions and have no real option to
> resolve it permanently short of deprecating entire vistas of tools (or
> even entire operating systems).

Agreed - so let's not solve it permanently.

> It's not a simple thing to solve, and many clever people have tried over
> the decades.

As already mentioned in this thread, a capability similar to what svn or 
cvs offers would be sufficient.  While a DVCS does offer unique 
challenges, it seems to me that doing something at commit time without 
requiring magic hooks be configured would go a long way to addressing 
the problem.  Magic hooks on the official repo would then be considered 
the final fallback defense, but should rarely be invoked.
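
As a rough idea of what that fallback check amounts to, here is a
sketch that works on plain bytes and is deliberately not tied to
Mercurial's hook API. The function names are invented for illustration,
and treating any file with a NUL byte as binary is an assumed
heuristic, not a rule taken from hg.

```python
def looks_binary(data):
    """Heuristic (an assumption): any NUL byte means 'not a text file'."""
    return b"\0" in data


def files_with_crlf(files):
    """Return the sorted names of text files that contain CRLF line endings.

    `files` maps a file name to that file's contents as bytes.
    """
    return sorted(
        name
        for name, data in files.items()
        if not looks_binary(data) and b"\r\n" in data
    )
```

A hook built on something like this would refuse the incoming changes
whenever the returned list is non-empty and name the offending files in
its error message.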

> At some point, the decision about how to handle line endings in
> cross-platform data needs to be punted to a human for a
> context-sensitive assessment, since (as can be seen) the above list of
> requirements is internally inconsistent and can't be relegated to a
> one-size-fits-all algorithm.

I'm not sure what point you are trying to make, but I believe it *is* 
possible for a solution to be found here which will keep Windows users 
happy.  I'm guessing you haven't had much practical experience with this 
problem, so probably don't see this as clearly as Windows users do.



From martin at  Wed Aug  5 09:45:18 2009
From: martin at ("Martin v. Löwis")
Date: Wed, 05 Aug 2009 09:45:18 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>
	<> <>
Message-ID: <>

> If things were different, they'd be different. However, we live with the
> legacy of that stupid set of decisions and have no real option to
> resolve it permanently short of deprecating entire vistas of tools (or
> even entire operating systems).

I think you missed the solution to the problem that Mark proposed
(IIUC): a local commit to a hg repository should already get the line
endings right, by automatically converting the file-to-be-committed
into the repository line endings. This is what CVS has supported for
more than ten years, and what svn supports for close to ten years.

> * distributed VCS has the job of preserving data as present on the
>   filesystem, including whatever line-ending convention is present in a
>   file.

No, that's not true. Distributed VCS has the job to help the developer.
That may mean to preserve the file as-is, or it may mean to convert the
file on checkout and checkin. Which of these would be needed depends
on the file, of course.

> It's not a simple thing to solve, and many clever people have tried over
> the decades. The fact that a centralised VCS can put the problem aside
> by requiring an explicit, single decision in the repository, is no help
> when addressing the constraints of a distributed VCS.

Why do you say that? It's not true. The approach that has worked for the
central repository can work just as well for a distributed repository.

> At some point, the decision about how to handle line endings in
> cross-platform data needs to be punted to a human for a
> context-sensitive assessment, since (as can be seen) the above list of
> requirements is internally inconsistent and can't be relegated to a
> one-size-fits-all algorithm.

Right - there needs to be a way for the user to specify what line
endings to use. That's why both CVS and subversion have supported such
configuration, on a per file basis, for many years. I can't see why
hg couldn't, in principle, support the same configuration. Being a DVCS,
such configuration would have to be part of the clone, of course, being
versioned, and all that. I think hg is well capable of keeping versioned
configuration information in the clone, as demonstrated by the .hgignore
file.
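
To make the idea of versioned, per-file configuration concrete, here is a
minimal sketch in Python. Everything in it is an assumption for
illustration: the rule-file syntax, the style names and the helper names
parse_rules and style_for are hypothetical and are not an existing
Mercurial feature or file format.

```python
import fnmatch

# Hypothetical rule file that would be versioned alongside the sources.
# The "pattern = style" syntax is an assumption made for this sketch.
RULES_TEXT = """\
# pattern = style
*.py = native
*.vcproj = windows
*.png = binary
"""


def parse_rules(text):
    """Return an ordered list of (pattern, style) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, _, style = line.partition("=")
        rules.append((pattern.strip(), style.strip()))
    return rules


def style_for(path, rules, default="native"):
    """Return the style of the first matching pattern, else the default."""
    for pattern, style in rules:
        # fnmatchcase is case-sensitive on every platform; "*" also
        # matches "/" so "*.py" covers files in subdirectories.
        if fnmatch.fnmatchcase(path, pattern):
            return style
    return default


rules = parse_rules(RULES_TEXT)
print(style_for("Lib/os.py", rules))
print(style_for("PCbuild/python.vcproj", rules))
```

Because the rules live in an ordinary tracked file, every clone would carry
the same policy, which is the property described above for .hgignore.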


From skippy.hammond at  Wed Aug  5 09:44:18 2009
From: skippy.hammond at (Mark Hammond)
Date: Wed, 05 Aug 2009 17:44:18 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

On 5/08/2009 5:35 PM, "Martin v. Löwis" wrote:

> Now, the specific outcome of the process means that more work needs to
> be done. So we have a *second* PEP, and we have a lack of volunteers
> that help implementing it. The second PEP hasn't been approved yet
> (as it isn't complete, yet), so migration to hg is stalled.
> The primary volunteer (Dirkjan) has indicated that he can't help with
> that specific issue, so other volunteers need to step forward, or we
> cannot move to hg.

I don't recall Dirkjan saying he can't help with that issue - was it a 
lack of time, or a lack of understanding the problem/lack of a Windows 
environment?

The problem I see is a lack of agreement about exactly what the solution 
entails.  I believe there is general agreement win32text needs to be 
enhanced to support versioned 'rules'.  But even with that, the only 
option I see is a truly cross-platform extension to implement these 
rules which every Python committer, regardless of operating-system, is 
expected to use - but that doesn't seem the consensus.

As mentioned, I'm willing to lend manpower for this once there is 
agreement on something workable...



From martin at  Wed Aug  5 09:57:29 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Aug 2009 09:57:29 +0200
Subject: [Python-Dev] Mercurial migration: help needed
Message-ID: <>

This is a repost from a month ago. It didn't get much feedback last
time. I now have two items.

In this thread, I'd like to collect things that ought to be done
but where Dirkjan has indicated that he would prefer if somebody else
did it.

Item 1

The first item is build identification. If you want to work
on this, please either provide a patch (for trunk and/or py3k), or
(if you are a committer) create a subversion branch.

It seems that Barry and I agree that for the maintenance branches,
sys.subversion should be frozen, so we need actually two sets of
patches: one that removes sys.subversion entirely, and the other that
freezes the branch to the respective one, and freezes the subversion
revision to None.

Of course, it seems that the actual representation of branches hasn't
been determined yet, so the build process integration may need to be
changed if named branches aren't going to be used in the end.

Anybody working on this should have good knowledge of the Python source
code, Mercurial, and either autoconf or Visual Studio (preferably both).

Item 2

The second item is line conversion hooks. Dj Gilcrease has posted a
solution which he considers a hack himself. Mark Hammond has also
volunteered, but it seems some volunteer needs to be "in charge",
keeping track of a proposed solution until everybody agrees that it
is a good solution. It may be that two solutions are necessary: a
short-term one, that operates as a hook and has limitations, and
a long-term one, that improves the hook system of Mercurial to
implement the proper functionality (which then might get shipped
with Mercurial in a cross-platform manner).


From ben+python at  Wed Aug  5 10:00:57 2009
From: ben+python at (Ben Finney)
Date: Wed, 05 Aug 2009 18:00:57 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
References: <>
	<> <>
	<> <>
	<> <>
Message-ID: <>

Mark Hammond <skippy.hammond at> writes:

> As already mentioned in this thread, a capability similar to what svn
> or cvs offers would be sufficient.

That capability presented by centralised VCSen is entirely dependent on
the fact that they *are* centralised. Using a distributed VCS means the
same capability doesn't apply.

> While a DVCS does offer unique challenges, it seems to me that doing
> something at commit time without requiring magic hooks be configured
> would go a long way to addressing the problem.

The hand-waving "doing something" is exactly what needs to be solved.

> Magic hooks on the official repo would then be considered the final
> fallback defense, but should rarely be invoked.

Right, so that's "capability similar to centralised VCS" out of
consideration; I'm glad we agree in the end.

> I'm not sure what point you are trying to make

That I disagree with your position. You seem to think that the problem
has an obvious solution, which is not true; and that choice of a
distributed VCS should be delayed until the problem is solved, which I
don't agree with.

> but I believe it *is* possible for a solution to be found here which
> will keep Windows users happy. I'm guessing you haven't had much
> practical experience with this problem, so probably don't see this as
> clearly as Windows users do.

Your guess is incorrect; I've been bitten time and again by this problem
in many different contexts, enough to know that it's not obvious what
the "right" solution is.

 \     "Not to perambulate the corridors in the hours of repose in the |
  `\                         boots of ascension." --ski hotel, Austria |
_o__)                                                                  |
Ben Finney

From skippy.hammond at  Wed Aug  5 10:09:24 2009
From: skippy.hammond at (Mark Hammond)
Date: Wed, 05 Aug 2009 18:09:24 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>
	<>	<>
	<> <>
Message-ID: <>

On 5/08/2009 6:00 PM, Ben Finney wrote:
> Mark Hammond<skippy.hammond at>  writes:
>> As already mentioned in this thread, a capability similar to what svn
>> or cvs offers would be sufficient.
> That capability presented by centralised VCSen is entirely dependent on
> the fact that they *are* centralised. Using a distributed VCS means the
> same capability doesn't apply.

Why do you say that (without justification I might add <wink>) about 
this issue?

>> While a DVCS does offer unique challenges, it seems to me that doing
>> something at commit time without requiring magic hooks be configured
>> would go a long way to addressing the problem.
> The hand-waving "doing something" is exactly what needs to be solved.

I think you have been mis-reading this thread.  It is quite clear what 
'doing something' means in this context - it means implement the 
human-defined rules for the line-ending policy for the repository.

>> Magic hooks on the official repo would then be considered the final
>> fallback defense, but should rarely be invoked.
> Right, so that's "capability similar to centralised VCS" out of
> consideration; I'm glad we agree in the end.

I'm afraid you have lost me again, as clearly we don't agree on what 
useful things can be done at local commit time.

>> I'm not sure what point you are trying to make
> That I disagree with your position. You seem to think that the problem
> has an obvious solution, which is not true; and that choice of a
> distributed VCS should be delayed until the problem is solved, which I
> don't agree with.

Fair enough - but it seems clear to enough of us that we can make 
progress and meet the requirements of the people actually impacted.

>> but I believe it *is* possible for a solution to be found here which
>> will keep Windows users happy. I'm guessing you haven't had much
>> practical experience with this problem, so probably don't see this as
>> clearly as Windows users do.
> Your guess is incorrect; I've been bitten time and again by this problem
> in many different contexts, enough to know that it's not obvious what
> the "right" solution is.

Sorry about that - but that was the only way I could explain you not 
seeing how such a solution can work.



From martin at  Wed Aug  5 10:09:47 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Aug 2009 10:09:47 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>	<>	<>	<>
Message-ID: <>

>> Now, the specific outcome of the process means that more work needs to
>> be done. So we have a *second* PEP, and we have a lack of volunteers
>> that help implementing it. The second PEP hasn't been approved yet
>> (as it isn't complete, yet), so migration to hg is stalled.
>> The primary volunteer (Dirkjan) has indicated that he can't help with
>> that specific issue, so other volunteers need to step forward, or we
>> cannot move to hg.
> I don't recall Dirkjan saying he can't help with that issue - was it a
> lack of time, or a lack of understanding the problem/lack of a Windows
> environment?

I think he said (at some point) that he is not a Windows user, and thus
can't really help. Of course, he also indicated that, as a Mercurial
contributor, he is willing to help as much as he can.

> The problem I see is a lack of agreement about exactly what the solution
> entails.  I believe there is general agreement win32text needs to be
> enhanced to support versioned 'rules'.  But even with that, the only
> option I see is a truly cross-platform extension to implement these
> rules which every Python committer, regardless of operating-system, is
> expected to use - but that doesn't seem the consensus.
> As mentioned, I'm willing to lend manpower for this once there is
> agreement on something workable...

I think it needs to work the other way 'round. Somebody (perhaps you)
needs to propose a hook and configuration settings, and propose that
this hook is used on every system, and that refusal to use these hooks
could lead to changes not being integratable (is that a word?).

There can't be consensus to use a solution that doesn't exist.

My personal favorite outcome would be this:
- most files have svn's "native" eol style; they get stored in LF
  in the repository; the hook will convert them on Windows, and check
  on Unix.
- some files have "windows" eol style; they get stored in CRLF.
  The hook will not convert, but only check.
- not sure whether some files need to be declared as "unix" eol style.
- some files are "binary"; they get stored as-is - the hook will
  do nothing.

With such a setup, using the hook would be truly optional on Unix,
as it only ever checks and never converts. So if you manage to mess
up, and don't have the hook installed on Unix, you lose when trying
to push. That will teach you to be more careful in the future, or
to install the hook (which hopefully becomes built into Mercurial at
some point).
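
The rules listed above can be sketched as a single commit-time function.
This is only an illustration of the proposed behaviour under stated
assumptions, not an existing hook: the names on_commit and EolError are
hypothetical, the style names are the ones from the list, and bare CR
characters in "windows" files are deliberately not examined.

```python
class EolError(ValueError):
    """Raised when a file violates its declared eol style."""


def to_lf(data):
    # Collapse every CRLF pair into a single LF.
    return data.replace(b"\r\n", b"\n")


def to_crlf(data):
    # Normalise to LF first so existing CRLF pairs are not doubled.
    return to_lf(data).replace(b"\n", b"\r\n")


def on_commit(data, style, on_windows):
    """Return the bytes to store in the repository, or raise EolError."""
    if style == "binary":
        return data  # stored as-is, the hook does nothing
    if style == "native":
        if on_windows:
            return to_lf(data)  # convert the working copy to LF
        if b"\r" in data:
            raise EolError("native file contains CR characters")
        return data  # on Unix the hook only checks
    if style == "windows":
        if to_crlf(data) != data:
            raise EolError("windows file contains bare LF characters")
        return data  # never converted, only checked
    raise ValueError("unknown eol style: %r" % (style,))


print(on_commit(b"a\r\nb\r\n", "native", on_windows=True))
```

In this sketch only the Windows branch ever rewrites data, which is why the
hook could stay optional on Unix: leaving it out changes nothing for files
that are already correct.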

Whether it is actually possible to implement all that, I don't know.


From martin at  Wed Aug  5 10:12:38 2009
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 05 Aug 2009 10:12:38 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>
	<>	<>
	<> <>
Message-ID: <>

>> As already mentioned in this thread, a capability similar to what svn
>> or cvs offers would be sufficient.
> That capability presented by centralised VCSen is entirely dependent on
> the fact that they *are* centralised. Using a distributed VCS means the
> same capability doesn't apply.

Why do you say that? People have demonstrated the contrary already.

>> I'm not sure what point you are trying to make
> That I disagree with your position. You seem to think that the problem
> has an obvious solution, which is not true; and that choice of a
> distributed VCS should be delayed until the problem is solved, which I
> don't agree with.

But it *has* an obvious solution. See the implementation from Dj
Gilcrease, or the spec that I just posted.

> Your guess is incorrect; I've been bitten time and again by this problem
> in many different contexts, enough to know that it's not obvious what
> the "right" solution is.

The configuration options of svn have served us well enough.


From nyamatongwe at  Wed Aug  5 10:25:08 2009
From: nyamatongwe at (Neil Hodgson)
Date: Wed, 5 Aug 2009 18:25:08 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Martin v. Löwis:

> Is it really that you don't *understand*? It's fairly easy: there was
> a PEP ...

   The PEP process is straightforward. However, a PEP may produce an
outcome that proves after more experience to be wrong. ISTM a
prerequisite to choosing a DVCS is that it should support the full
range of development platforms and thus the PEP was accepted
prematurely. At some point the PEP should be reexamined and, if
necessary, rescinded. What I don't understand is why the plan is still
to move to hg despite, after several months, there not being a known
good way to include Windows eol support.


From dirkjan at  Wed Aug  5 10:25:19 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Wed, 5 Aug 2009 10:25:19 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Wed, Aug 5, 2009 at 01:43, Mark Hammond<mhammond at> wrote:
> Thanks Nick; I didn't want to be the only one saying that.  There is a fine
> line between asserting reasonable requirements for Windows users and being
> obstructionist and unhelpful, and I'm trying to stay on the former side :)

I'm not trying to be obstructionist and unhelpful (I hope that should
be obvious). On the other hand, I'm working from the point of view of
hg, which has two assumptions:

- we're a distributed system, there's fairly little we can assume about clients
- we exchange checksummed byte streams (even if we have some tools
that assume those streams are code)
- because of the previous point, there's one native (and therefore
better, in a sense) serialization of what you consider "structured"

The first point means, for example, there will always be some clients
who don't have win32text enabled, no matter what, so you can't rely on
it, which is why I want to make the server hooks the primary line of
defense, and view the client-side tools as helper tools (to make it
easy not to trigger the server-side hooks). That doesn't mean I think
Windows users are second-rate, or anything like that!

> I'm not that happy with the server being the primary line of defense. Let's
> say I make a branch of the hg repo, myself and a few others work on it
> committing as we go, then attempt to merge back upstream.  Let's say some of
> the early commits on that clone introduced "bad" line endings.  I'm guessing
> I would be forced to make a number of whitespace-only checkins to normalize
> the line-endings before it could merge - and these checkins would then be in
> the history forever.  Or I could attempt to recreate the clone by somehow
> "replaying" the commits with line endings corrected.  Either way, the
> situation doesn't seem good.

I don't think either is bad. In the first case, you have one or maybe
two extra changesets. As we like to advocate small changesets that fix
one thing, a changeset fixing up whitespace is par for the course. ;)
The other solution would be to employ mq, for example, to fix up the
commits, which mq excels at (although admittedly it has a learning
curve).

> I agree.  It isn't fair to make this windows users problem.  It would be
> like me proposing the repo get imported with \r\n line endings, enforce that
> with server side hooks, and let non-Windows users worry about the
> ramifications of that - somehow I doubt that would fly - so neither should
> it fly for Windows users...
> I'm more than willing to help on this; I haven't resurrected my stale patch
> because I find win32text only 1/2 a solution that doesn't work in practice.
> Therefore that patch is as stale for me as it is anyone. However, if a plan
> is put in place which offers a full solution and the hg developers are
> committed to it, I promise I'll put my hand up to help with implementation
> in a fairly timely manner...

Well, I'd be happy to help convince the hg crew to accept whatever we
come up with, but I'm not sure I'm the best person to come up with it.
It sounds like a versioned .hgeols would help a bunch of issues, but I
have the feeling you know that better than me, so I'm hoping you can
come up with a concrete proposal on what should change in win32text to
fix all the problems you see.



From martin at  Wed Aug  5 10:41:41 2009
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 05 Aug 2009 10:41:41 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>	<>
Message-ID: <>

>    The PEP process is straightforward. However, a PEP may produce an
> outcome that proves after more experience to be wrong. ISTM a
> prerequisite to choosing a DVCS is that it should support the full
> range of development platforms and thus the PEP was accepted
> prematurely.

To be as blunt as possible: the PEP was accepted because Guido
really, Really, REALLY wanted to switch to Mercurial. So you would
have to convince Guido to revert his decision. You may not like
the decision (I did not like using a DVCS in the first place), but
following such decisions has served us well, and will serve us well
this time.

> At some point the PEP should be reexamined and, if
> necessary, rescinded. What I don't understand is why the plan is still
> to move to hg despite, after several months, there not being a known
> good way to include Windows eol support.

You don't understand why it takes many months? That's also easy: because
there is a single volunteer, and because there is a lot of work. I think
it took me a year to migrate to subversion back then, and I wouldn't be
surprised if the Mercurial migration takes even longer.

Or don't you understand why that single unresolved item didn't manage
to revert the decision? Well, there are many unresolved items in
the Mercurial conversion, some much more stressful than the eol issue
(e.g. the branching discussion). None of them is unsolvable (AFAICT);
you can either contribute to the solution, or sit back and wait for
solutions to emerge. Then you can vote on PEP 385 up or down still.


From martin at  Wed Aug  5 10:51:46 2009
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 05 Aug 2009 10:51:46 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
Message-ID: <>

> - we're a distributed system, there's fairly little we can assume about clients

Not as Mercurial, no. As Python, we can certainly expect that all of our
contributors have read the developer FAQ, and set up their systems
accordingly. If all else fails, we can revoke commit access (or is
it "push access"?) if some committer doesn't get the configuration
right. We would, of course, prefer if it was very easy to get the
configuration right, so that problems don't occur in the first place.

> The first point means, for example, there will always be some clients
> who don't have win32text enabled, no matter what, so you can't rely on
> it, which is why I want to make the server hooks the primary line of
> defense

I think it's a terminology issue only: don't say "primary", say "last".

Can we agree that the "last" line of defense will be the server hooks,
and the "primary" line of defense will be the client commits? "primary"
would mean that this is where most errors are detected and fixed; Mark
would really object to a flow where most errors are detected only
at the server.

> That doesn't mean I think
> Windows users are second-rate, or anything like that!

If the server hooks were the primary line of defense, it would
effectively make Windows users second-rate: they will have to redo all
their changes over-and-over again, whereas the Unix users can push the
changes without any obstacles (just because they are less likely to make
mistakes).

If the client machines were the primary line of defense, Windows users
were treated equally: they would make as few mistakes as Unix users,
because the hooks do what they want correctly.

> I don't think either is bad. In the first case, you have one or maybe
> two extra changesets. As we like to advocate small changesets that fix
> one thing, a changeset fixing up whitespace is par for the course. ;)

Whitespace-only changes hurt the "annotate" feature, so we dislike them
very much in Python.
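
To illustrate the point about annotate, the following sketch counts how
many lines a line-based diff attributes to a change. It uses difflib from
the standard library as a stand-in for the comparison a VCS performs; the
helper name changed_lines is hypothetical.

```python
import difflib


def changed_lines(old, new):
    """Count the lines of `new` that a line-based diff marks as changed."""
    matcher = difflib.SequenceMatcher(
        a=old.splitlines(keepends=True),
        b=new.splitlines(keepends=True),
        autojunk=False,
    )
    changed = 0
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            changed += j2 - j1
    return changed


crlf = "import os\r\nprint(os.name)\r\n"
lf = crlf.replace("\r\n", "\n")          # eol-only normalisation
edited = lf.replace("os.name", "os.sep")  # a real one-line edit

print(changed_lines(crlf, lf))
print(changed_lines(lf, edited))
```

The eol-only normalisation marks every line as changed, so annotate would
credit the whole file to that changeset, while the real edit touches a
single line.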

> Well, I'd be happy to help convince the hg crew to accept whatever we
> come up with, but I'm not sure I'm the best person to come up with it.

That is all very well. See my other message (asking for volunteers)
as well. If you have more work you would prefer to delegate, please let
us know.


From skippy.hammond at  Wed Aug  5 11:02:08 2009
From: skippy.hammond at (Mark Hammond)
Date: Wed, 05 Aug 2009 19:02:08 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 5/08/2009 6:25 PM, Dirkjan Ochtman wrote:
> On Wed, Aug 5, 2009 at 01:43, Mark Hammond<mhammond at>  wrote:
>> Thanks Nick; I didn't want to be the only one saying that.  There is a fine
>> line between asserting reasonable requirements for Windows users and being
>> obstructionist and unhelpful, and I'm trying to stay on the former side :)
> I'm not trying to be obstructionist and unhelpful (I hope that should
> be obvious).

It is, and I hope I didn't imply otherwise.

> On the other hand, I'm working from the point of view of
> hg, which has two assumptions:
> - we're a distributed system, there's fairly little we can assume about clients
> - we exchange checksummed byte streams (even if we have some tools
> that assume those streams are code)
> - because of the previous point, there's one native (and therefore
> better, in a sense) serialization of what you consider "structured"
> data
> The first point means, for example, there will always be some clients
> who don't have win32text enabled, no matter what, so you can't rely on
> it, which is why I want to make the server hooks the primary line of
> defense, and view the client-side tools as helper tools (to make it
> easy not to trigger the server-side hooks). That doesn't mean I think
> Windows users are second-rate, or anything like that!

In general I agree - although I think we can enforce a "social contract" 
which puts requirements on people who commit to the Python repository - 
and therefore we can consider the server-side hooks a "secondary" 
defense.  IOW, the system (including the social aspects of the system) 
is set up such that the server-side hooks are very rarely called upon.

>> I'm not that happy with the server being the primary line of defense. Let's
>> say I make a branch of the hg repo, myself and a few others work on it
>> committing as we go, then attempt to merge back upstream.  Let's say some of
>> the early commits on that clone introduced "bad" line endings.  I'm guessing
>> I would be forced to make a number of whitespace-only checkins to normalize
>> the line-endings before it could merge - and these checkins would then be in
>> the history forever.  Or I could attempt to recreate the clone by somehow
>> "replaying" the commits with line endings corrected.  Either way, the
>> situation doesn't seem good.
> I don't think either is bad.

With all due respect, I suspect that is because you don't expect to see 
the issue regularly.  This proposal still leaves the problem squarely in 
the lap of Windows users and imposes a burden on them that would 
probably be considered unreasonable if the situation was reversed.

I'm yet to work on a hg repository without mixed line endings.  If I 
understand correctly, every such repository would have involved a 
developer checking in locally, then at some point in the future pushing 
these changes upstream.  I really really don't want hg to tell me at 
this final step that I need to perform whitespace only fixes purely 
because I am running Windows.

I understand we are discussing how win32text can offer that - but I must 
object to your assertion that the situation I described isn't bad when 
you hit it.

> Well, I'd be happy to help convince the hg crew to accept whatever we
> come up with, but I'm not sure I'm the best person to come up with it.
> It sounds like a versioned .hgeols would help a bunch of issues, but I
> have the feeling you know that better than me, so I'm hoping you can
> come up with a concrete proposal on what should change in win32text to
> fix all the problems you see.

Actually, I think it is easy to make this problem much easier to 
understand; mandate every platform should use win32text, then start 
collating the issues people, including yourself, will no doubt face. 
I'm happy to get this ball rolling, but again, don't want this left 
purely in the domain of "it is a windows problem" - it isn't.



From dirkjan at  Wed Aug  5 11:04:40 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Wed, 5 Aug 2009 11:04:40 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Wed, Aug 5, 2009 at 10:51, "Martin v. Löwis"<martin at> wrote:
> Not as Mercurial, no. As Python, we can certainly expect that all of our
> contributors have read the developer FAQ, and set up their systems
> accordingly. If all else fails, we can revoke commit access (or is
> it "push access"?) if some committer doesn't get the configuration
> right. We would, of course, prefer if it was very easy to get the
> configuration right, so that problems don't occur in the first place.

There will also be non-committers who forge changesets that you want
to be able to push directly to the Python repositories.

> If the client machines were the primary line of defense, Windows users
> were treated equally: they would make as few mistakes as Unix users,
> because the hooks do what they want correctly.

Similarly, if Python kept its .py files in \r\n line endings by
default instead of \n endings, Unix-like users would be more prone to
mistakes. So by keeping the .py files in \n format, Python is making
Windows users second-rate. To cope with that, hg needs to do extra
work on the client side.



From nyamatongwe at  Wed Aug  5 11:09:10 2009
From: nyamatongwe at (Neil Hodgson)
Date: Wed, 5 Aug 2009 19:09:10 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Martin v. Löwis:

> Or don't you understand why that single unresolved item didn't manage
> to revert the decision? Well, there are many unresolved items in
> the Mercurial conversion, some much more stressful than the eol issue
> (e.g. the branching discussion).

   Then these issues should have been included in the initial PEP for
choosing a DVCS since the issues could have driven the choice. PEP 374
implies that win32text effectively solves the Windows eol issue which
no longer appears to be correct.


From dirkjan at  Wed Aug  5 11:09:44 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Wed, 5 Aug 2009 11:09:44 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Wed, Aug 5, 2009 at 11:02, Mark Hammond<skippy.hammond at> wrote:
> In general I agree - although I think we can enforce a "social contract"
> which puts requirements on people who commit to the Python repository - and
> therefore we can consider the server-side hooks a "secondary" defense.  IOW,
> the system (including the social aspects of the system) are setup such that
> the server-side hooks are very rarely called upon.


> With all due respect, I suspect that is because you don't expect to see the
> issue regularly.

I suspect so, too!

> I'm yet to work on a hg repository without mixed line endings.  If I
> understand correctly, every such repository would have involved a developer
> checking in locally, then at some point in the future pushing these changes
> upstream.  I really really don't want hg to tell me at this final step that
> I need to perform whitespace only fixes purely because I am running Windows.
> I understand we are discussing how win32text can offer that - but I must
> object to your assertion that the situation I described isn't bad when you
> hit it.

I agree it is to be avoided, I'm just saying that I think it will be
exceptional and therefore not a large burden, given other kinds of
defenses we can put in place.

> Actually, I think it is easy to make this problem much easier to understand;
> mandate every platform should use win32text, then start collating the issues
> people, including yourself, will no doubt face. I'm happy to get this ball
> rolling, but again, don't want this left purely in the domain of "it is a
> windows problem" - it isn't.

I'm not sure how win32text will provide anything other than
performance degradation for non-Windows developers, but if there's
functionality to be had, I'm happy to mandate its use on every
platform.



From martin at  Wed Aug  5 11:12:58 2009
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 05 Aug 2009 11:12:58 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>	<>
Message-ID: <>

>> Not as Mercurial, no. As Python, we can certainly expect that all of our
>> contributors have read the developer FAQ, and set up their systems
>> accordingly. If all else fails, we can revoke commit access (or is
>> it "push access"?) if some committer doesn't get the configuration
>> right. We would, of course, prefer if it was very easy to get the
>> configuration right, so that problems don't occur in the first place.
> There will also be non-committers who forge changesets that you want
> to be able to push directly to the Python repositories.

They will also have to follow the policies we set up. If they refuse to
do that, we refuse to accept their changes. It's very simple, and
contributors have learned very quickly what the policies were (after
they were explained to them).

Whether that means that they have to fix their changesets, or that they
have to redo them, practice will show.

>> If the client machines were the primary line of defense, Windows users
>> would be treated equally: they would make as few mistakes as Unix users,
>> because the hooks do what they want correctly.
> Similarly, if Python kept its .py files in \r\n line endings by
> default instead of \n endings, Unix-like users would be more prone to
> mistakes. So by keeping the .py files in \n format, Python is making
> Windows users second-rate. To cope with that, hg needs to do extra
> work on the client side.

I think you still miss the point. *If* hg does the extra work, *then*
Windows users are *not* second-class citizens anymore. They *only*
consider themselves second-class if they have to do additional *manual*
work (*).


(*) They may also consider themselves second-class if they have to
install additional software, so hopefully, the necessary extra code for
hg will become part of the regular Mercurial distribution at some point.

From martin at  Wed Aug  5 11:16:37 2009
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 05 Aug 2009 11:16:37 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>	<>
Message-ID: <>

> I'm not sure how win32text will provide anything other than
> performance degradation for non-Windows developers, but if there's
> functionality to be had, I'm happy to mandate its use on every
> platform.

This is all fairly hypothetical - if hg grew a .hgeols file, it would
be good if it supported that cross-platform. It then may make win32text
obsolete (in particular if it provided some useful defaults).

On Unix, the functionality might be as simple as checking conformance
with the eol-style at pre-commit time.
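
A minimal sketch of such a conformance check, assuming the hook is handed
the raw bytes of each file together with its declared style. The function
name and the style labels are illustrative only; nothing here is existing
Mercurial or win32text API. On Unix a "native" file would simply be
checked as LF.

```python
import re

# Hypothetical style labels, modelled on the svn:eol-style values.
# Each pattern matches a line ending that is NOT allowed for that style.
_BAD_EOL = {
    "LF": re.compile(rb"\r"),                     # any CR is a violation
    "CRLF": re.compile(rb"\r(?!\n)|(?<!\r)\n"),   # bare CR or bare LF
    "CR": re.compile(rb"\n"),                     # any LF is a violation
}


def conforms(data: bytes, style: str) -> bool:
    """Return True if every line ending in data matches the given style."""
    return _BAD_EOL[style].search(data) is None
```

A pre-commit hook built on this would only need to refuse the commit when
conforms() returns False for any changed file; it never rewrites anything.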


From skippy.hammond at  Wed Aug  5 11:17:58 2009
From: skippy.hammond at (Mark Hammond)
Date: Wed, 05 Aug 2009 19:17:58 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>	<>
Message-ID: <>

On 5/08/2009 7:09 PM, Dirkjan Ochtman wrote:
> I'm not sure how win32text will provide anything other than
> performance degradation for non-Windows developers, but if there's
> functionality to be had, I'm happy to mandate its use on every
> platform.

I see two practical outcomes of such a mandate:

* line-ending rules are enforced for local checkins, even for linux 
users, even though such 'accidental' inappropriate line-ending checkins 
should be much rarer than for windows.

* practical problems faced by Windows users, including any performance 
considerations, are shared by the community and therefore addressed as a 
community, thereby ensuring all platforms are considered as important as 
any other.



From ben+python at  Wed Aug  5 11:42:05 2009
From: ben+python at (Ben Finney)
Date: Wed, 05 Aug 2009 19:42:05 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
References: <>
	<> <>
	<> <>
	<> <>
	<> <>
Message-ID: <>

"Martin v. Löwis" <martin at> writes:

> > You seem to think that the problem has an obvious solution, which is
> > not true;

> But it *has* an obvious solution. See the implementation from Dj
> Gilcrease, or the spec that I just posted.

Two different solutions are both obvious? There are other solutions
proposed elsewhere too; are they also obvious?

Mark Hammond <skippy.hammond at> writes:

> I think you have been mis-reading this thread.

Quite possibly; I'm not intending to impose my position on anyone. I'll
go back to lurking on the thread for a while and see if it becomes any

 \      “First things first, but not necessarily in that order.” --The |
  `\                                              Doctor, _Doctor Who_ |
_o__)                                                                  |
Ben Finney

From p.f.moore at  Wed Aug  5 12:04:42 2009
From: p.f.moore at (Paul Moore)
Date: Wed, 5 Aug 2009 11:04:42 +0100
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

2009/8/5 "Martin v. Löwis" <martin at>:
> My personal favorite outcome would be this:
> - most files have svn's "native" eol style; they get stored in LF
>   in the repository; the hook will convert them on Windows, and check
>   on Unix.
> - some files have "windows" eol style; they get stored in CRLF.
>   The hook will not convert, but only check.
> - not sure whether some files need to be declared as "unix" eol style.
> - some files are "binary"; they get stored as-is - the hook will
>   do nothing.
> With such a setup, using the hook would be truly optional on Unix,
> as it only ever checks and never converts. So if you manage to mess
> up, and don't have the hook installed on Unix, you lose when trying
> to push. That will teach you to be more careful in the future, or
> to install the hook (which hopefully becomes built into Mercurial at
> some point).
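
The quoted proposal amounts to a small decision table. A rough sketch
follows, using the style names from the quote; the function name and the
returned labels are made up for illustration and do not come from any
existing hook.

```python
import os


def hook_action(style: str, linesep: str = os.linesep) -> str:
    """Return 'convert', 'check' or 'ignore' for a file of the given style.

    The style names mirror the quoted proposal; the function itself is
    a hypothetical illustration.
    """
    if style == "binary":
        return "ignore"
    if style == "native":
        # Stored as LF, so only platforms with a different native
        # line ending need to convert; everyone else just checks.
        return "convert" if linesep != "\n" else "check"
    if style in ("windows", "unix"):
        return "check"
    raise ValueError(f"unknown eol style: {style!r}")
```

Seen this way, the hook never converts anything when the platform's line
separator is already LF, which is why it would be optional on Unix.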

Given that my preference is to use Unix-style EOL for "text" files on
Windows, as every text editor I use (barring notepad!) understands LF
format, it seems to me that this proposal also means that the hook
would be optional for me. That suits me fine - I'd prefer to avoid
having hooks that are required for Python checkouts, as that means I
have to remember to configure them on each clone (IIUC).

Of course, this implies that your proposal only requires any action by
the user in the case of Windows users whose text editing tools insist
on CRLF format text files (sources, etc). Is that really a large group
of developers? (I honestly don't know).

I suspect that there is something missing from your proposal, as if
this were the case, then the problem appears to be limited to a very
small group of developers. Maybe it's Visual Studio that insists on
CRLF for source files? (I don't know, as I don't use the VS editor).
If that's the case, then maybe a VS hook would be an alternative
approach? (I can't imagine such a hook would be an *easier* approach,
I only mention it because it makes it clearer where the issue lies).


From dirkjan at  Wed Aug  5 12:14:24 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Wed, 5 Aug 2009 12:14:24 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Wed, Aug 5, 2009 at 12:04, Paul Moore<p.f.moore at> wrote:
> Given that my preference is to use Unix-style EOL for "text" files on
> Windows, as every text editor I use (barring notepad!) understands LF
> format, it seems to me that this proposal also means that the hook
> would be optional for me. That suits me fine - I'd prefer to avoid
> having hooks that are required for Python checkouts, as that means I
> have to remember to configure them on each clone (IIUC).

Yeah, this may also be what's making it harder for me to understand
the issues. I am actually a Windows user, although I do most of my
development on Linux servers through PuTTY. I just always make sure I
use editors that respect the file's line endings. So where I've used
hg to version code on Windows (for example, when testing a Firefox
extension), and for my colleague who does edit his code inside
Windows, I've just used editors that deal with line endings.
Typically, in my case, that was either Notepad2 (an awesomely
light-weight Notepad replacement) or Komodo (Edit). That solved all of
my issues, so I haven't had a need for win32text so far.



From ncoghlan at  Wed Aug  5 12:43:12 2009
From: ncoghlan at (Nick Coghlan)
Date: Wed, 05 Aug 2009 20:43:12 +1000
Subject: [Python-Dev] Functionality in subprocess.Popen.terminate()
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

Janzert wrote:
> Eric Pruitt wrote:
>> Sounds good enough to me but I was wondering if it might be a good
>> idea to add a function like "pidinuse" to subprocess as a whole that
>> would determine if a process ID was being used and return a simple
>> boolean value. I came across a number of people searching for a way to
>> determine if a PID was running (Google "python check if pid exists")
>> so it seems like the implemented functionality would be of use to the
>> community as a whole, not just my wrapper class.
>> Eric
> I'm not sure of the actual details but it seems from your description
> that even if you check first a race condition will still exist.
> Specifically the subprocess could terminate after the check and before
> the TerminateProcess call. So it seems better just to call
> TerminateProcess and then correctly handle any possible error.

Janzert is correct here - this is a case where ruling out the error
completely is impossible, so you're going to have to handle it regardless.

A cross platform way of checking if a particular subprocess is still
running might be an interesting feature in its own right, but I don't
think it will prevent this exception.
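
As a minimal sketch of "just call it and handle the error", assuming
Python 3 and the standard subprocess module (the helper name is
hypothetical):

```python
import subprocess
import sys


def terminate_quietly(proc: subprocess.Popen) -> int:
    """Terminate proc without first checking whether it is still running.

    A prior "is it alive?" check would not help, because the process
    could exit between the check and the call anyway.
    """
    try:
        proc.terminate()
    except OSError:
        # The process was already gone when the OS call was made.
        pass
    # Reap the child and report its exit status.
    return proc.wait()


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()                     # the child has already exited here
    print(terminate_quietly(child))  # no exception is raised
```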


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From mhammond at  Wed Aug  5 13:19:24 2009
From: mhammond at (Mark Hammond)
Date: Wed, 05 Aug 2009 21:19:24 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	
	<> <>	
	<> <>	
Message-ID: <>

On 5/08/2009 8:04 PM, Paul Moore wrote:
> 2009/8/5 "Martin v. Löwis"<martin at>:

>> With such a setup, using the hook would be truly optional on Unix,
>> as it only ever checks and never converts. So if you manage to mess
>> up, and don't have the hook installed on Unix, you lose when trying
>> to push. That will teach you to be more careful in the future, or
>> to install the hook (which hopefully becomes built into Mercurial at
>> some point).
> Given that my preference is to use Unix-style EOL for "text" files on
> Windows, as every text editor I use (barring notepad!) understands LF
> format,

Most tools that I use will tend to not mix EOL styles in a single file, 
but will tend to create \r\n line endings for new files I create.  Most 
hg repos I come across don't have mixed line endings within individual 
files, so I can only guess these files were accidentally introduced in 
the same way (and indeed I have personally done this.)  I'm hoping to be 
part of the solution instead of part of the problem :)

> it seems to me that this proposal also means that the hook
> would be optional for me.

Technically it would be optional for everyone, of course.  However, the 
solution should be such that everyone, regardless of personal 
preference, is willing to take the hit.

For example, if the repo is converted using \r\n line endings natively, 
then Windows users would need to take no action either, which puts the 
onus back on you (given your stated preferences) to configure the tool 
appropriately.  I assume you would have no objection to that and would 
be happy to make that tool optional for me?

> That suits me fine - I'd prefer to avoid
> having hooks that are required for Python checkouts, as that means I
> have to remember to configure them on each clone (IIUC).

Configuring on each clone would certainly be sub-optimal, so the 
proposal is this configuration be stored in a versioned file in the repo.

> Of course, this implies that your proposal only requires any action by
> the user in the case of Windows users whose text editing tools insist
> on CRLF format text files (sources, etc). Is that really a large group
> of developers? (I honestly don't know).

It applies to all files that aren't "native" EOL style - there are just 
fewer of them regularly modified than those that are so marked.

> I suspect that there is something missing from your proposal, as if
> this were the case, then the problem appears to be limited to a very
> small group of developers. Maybe it's Visual Studio that insists on
> CRLF for source files? (I don't know, as I don't use the VS editor).
> If that's the case, then maybe a VS hook would be an alternative
> approach? (I can't imagine such a hook would be an *easier* approach,
> I only mention it because it makes it clearer where the issue lies).

I must concede that Windows developers are the minority here - but 
assuming we want a level playing field, I don't see how that changes the 
underlying issue...



From mhammond at  Wed Aug  5 13:22:02 2009
From: mhammond at (Mark Hammond)
Date: Wed, 05 Aug 2009 21:22:02 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	
	<> <>	
	<> <>	
Message-ID: <>

On 5/08/2009 8:14 PM, Dirkjan Ochtman wrote:
> endings. Typically, in my case, that was either Notepad2 (an awesomely
> light-weight Notepad replacement) or Komodo (Edit). That solved all of
> my issues, so I haven't had a need for win32text so far.

FWIW, I use komodo and scite as my primary editors, and as mentioned, am 
personally responsible for accidentally checking in \r\n files into what 
should be a \n repo.  I am slowly and painfully learning to be more 
careful - IMO, I shouldn't need to...



From dirkjan at  Wed Aug  5 13:28:49 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Wed, 5 Aug 2009 13:28:49 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Wed, Aug 5, 2009 at 13:19, Mark Hammond<mhammond at> wrote:
> Configuring on each clone would certainly be sub-optimal, so the proposal is
> this configuration be stored in a versioned file in the repo.

Even if we do that, enabling hg extensions will still need to be done
locally -- although it can be done per-user/box instead of per-clone.



From skippy.hammond at  Wed Aug  5 13:46:14 2009
From: skippy.hammond at (Mark Hammond)
Date: Wed, 05 Aug 2009 21:46:14 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>	<>
	<>	<>	<>	<>
Message-ID: <>

On 5/08/2009 9:28 PM, Dirkjan Ochtman wrote:
> On Wed, Aug 5, 2009 at 13:19, Mark Hammond<mhammond at>  wrote:
>> Configuring on each clone would certainly be sub-optimal, so the proposal is
>> this configuration be stored in a versioned file in the repo.
> Even if we do that, enabling hg extensions will still need to be done
> locally -- although it can be done per-user/box instead of per-clone.

That is completely fine, and not unlike SVN where a per-user/box setting 
generally needs to be set once - but after that everything "just works". 
Windows developers don't mind taking a hit once ;)  The dev guide can 
make it clear what the expectations are...



From ncoghlan at  Wed Aug  5 14:50:39 2009
From: ncoghlan at (Nick Coghlan)
Date: Wed, 05 Aug 2009 22:50:39 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
Message-ID: <>

Mark Hammond wrote:
> On 5/08/2009 7:09 PM, Dirkjan Ochtman wrote:
>> I'm not sure how win32text will provide anything other than
>> performance degradation for non-Windows developers, but if there's
>> functionality to be had, I'm happy to mandate its use on every
>> platform.
> I see two practical outcomes of such a mandate:
> * line-ending rules are enforced for local checkins, even for linux
> users, even though such 'accidental' inappropriate line-ending checkins
> should be much rarer than for windows.
> * practical problems faced by Windows users, including any performance
> considerations, are shared by the community and therefore addressed as a
> community, thereby ensuring all platforms are considered as important as
> any other.

The main error that enabling win32text everywhere can catch is the use
of a *nix client to accidentally corrupt one of the files that is
supposed to have \r\n line endings.

It also simplifies the configuration rules in the Python hg FAQ - we
would be able to just tell all developers wanting to contribute patches
to Python to enable the win32text extension when working with the Python
repositories (or clones thereof) without having to worry about what
platform they were on.

So it seems to me that the main client-side feature we want is a
versioned .hgeols file in the repository that allows files to be
explicitly nominated as one of:
- eol=CRLF (i.e. have \r\n line endings in the repository and should be
left that way on the local disk as well - equivalent to SVN eol-style:CRLF)
- eol=LF (i.e. have \n line endings in the repository and should be left
that way on the local disk as well - equivalent to SVN eol-style:LF)
- eol=CR (i.e. have \r line endings in the repository and should be left
that way on the local disk as well - equivalent to SVN eol-style:CR)
- native text (i.e. always stored in the repository with \n line
endings, but uses native line endings on the local disk - equivalent to
SVN eol-style:native)
- binary (i.e. always reproduced on disk exactly as they are in the
repository - equivalent to SVN files without eol-style set at all)

The .hgeols file should also allow the repository to define which of the
above should be used as the default handling mechanism for text files
that are not named in the file (native text, in the specific case of the
Python repositories).

Files which look like binary files (according to the existing win32text
heuristics) would be left alone regardless of what the default handling
was set to in .hgeols.

win32text would then be enhanced to check for a .hgeols file before
falling back to its existing configuration mechanisms.
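
To make the idea concrete, here is a rough sketch of how a client might
read such a file. The syntax ("<path> = <style>" plus a "default" line),
the fallback default, the NUL-byte test and every name below are
assumptions for illustration; no .hgeols format has actually been
specified.

```python
STYLES = {"CRLF", "LF", "CR", "native", "binary"}


def parse_hgeols(text: str):
    """Parse a hypothetical .hgeols body into (per-file styles, default).

    Assumed syntax, one entry per line: "<path> = <style>", plus an
    optional "default = <style>" line. Blank lines and "#" comments
    are skipped.
    """
    styles = {}
    default = "binary"  # assumption: leave files alone unless told otherwise
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        path, _, style = (part.strip() for part in line.partition("="))
        if style not in STYLES:
            raise ValueError(f"bad .hgeols line: {raw!r}")
        if path == "default":
            default = style
        else:
            styles[path] = style
    return styles, default


def style_for(path: str, data: bytes, styles: dict, default: str) -> str:
    """Pick the style for one file; unnamed binary-looking files stay as-is."""
    if path in styles:
        return styles[path]
    # Crude stand-in for the win32text binary heuristic (assumption).
    if b"\0" in data:
        return "binary"
    return default
```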

The above basically provides the SVN eol-style feature in a more
hg-friendly way. Allowing wildcards in the .hgeols files might be nice,
but I don't think it is actually required. We really don't have that
many files that are affected by this problem (it's just the fact that it
is a number greater than zero that is causing the problem).

The server side pre-push hooks for the main Python repositories would be
set to reject change sets which didn't meet the above rules. If a patch
fails those checks, either the committer can fix it themselves and
resubmit, or else send it back to the originator along with a pointer to
the section in the dev FAQ that describes the expected client-side
configuration.
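
A sketch of the check such a server-side hook could run, assuming it can
obtain the new contents of each changed file as bytes and look up the
style of each path. How a real hook would get at that data is left open,
and all names here are hypothetical.

```python
# Line endings expected *in the repository* for each style.
# Native text is always stored with LF.
REPO_EOL = {"CRLF": b"\r\n", "LF": b"\n", "CR": b"\r", "native": b"\n"}


def violations(changed: dict, style_of) -> list:
    """Return the paths in changed (path -> bytes) that break their style.

    style_of is a caller-supplied function mapping a path to a style name.
    """
    bad = []
    for path, data in sorted(changed.items()):
        style = style_of(path)
        if style == "binary":
            continue
        # After removing every expected line ending, no stray CR or LF
        # may be left behind.
        rest = data.replace(REPO_EOL[style], b"")
        if b"\r" in rest or b"\n" in rest:
            bad.append(path)
    return bad
```

The hook would reject the push whenever the returned list is non-empty,
and could print the offending paths for the committer.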


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From python at  Wed Aug  5 15:35:02 2009
From: python at (MRAB)
Date: Wed, 05 Aug 2009 14:35:02 +0100
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Nick Coghlan wrote:
> Mark Hammond wrote:
>> On 5/08/2009 7:09 PM, Dirkjan Ochtman wrote:
>>> I'm not sure how win32text will provide anything other than
>>> performance degradation for non-Windows developers, but if there's
>>> functionality to be had, I'm happy to mandate its use on every
>>> platform.
>> I see two practical outcomes of such a mandate:
>> * line-ending rules are enforced for local checkins, even for linux
>> users, even though such 'accidental' inappropriate line-ending checkins
>> should be much rarer than for windows.
>> * practical problems faced by Windows users, including any performance
>> considerations, are shared by the community and therefore addressed as a
>> community, thereby ensuring all platforms are considered as important as
>> any other.
> The main error that enabling win32text everywhere can catch is the use
> of a *nix client to accidentally corrupt one of the files that is
> supposed to have \r\n line endings.
> It also simplifies the configuration rules in the Python hg FAQ - we
> would be able to just tell all developers wanting to contribute patches
> to Python to enable the win32text extension when working with the Python
> repositories (or clones thereof) without having to worry about what
> platform they were on.
> So it seems to me that the main client-side feature we want is a
> versioned .hgeols file in the repository that allows files to be
> explicitly nominated as one of:
> - eol=CRLF (i.e. have \r\n line endings in the repository and should be
> left that way on the local disk as well - equivalent to SVN eol-style:CRLF)
> - eol=LF (i.e. have \n line endings in the repository and should be left
> that way on the local disk as well - equivalent to SVN eol-style:LF)
> - eol=CR (i.e. have \r line endings in the repository and should be left
> that way on the local disk as well - equivalent to SVN eol-style:CR)
> - native text (i.e. always stored in the repository with \n line
> endings, but uses native line endings on the local disk - equivalent to
> SVN eol-style:native)
> - binary (i.e. always reproduced on disk exactly as they are in the
> repository - equivalent to SVN files without eol-style set at all)
> The .hgeols file should also allow the repository to define which of the
> above should be used as the default handling mechanism for text files
> that are not named in the file (native text, in the specific case of the
> Python repositories).
> Files which look like binary files (according to the existing win32text
> heuristics) would be left alone regardless of what the default handling
> was set to in .hgeols.
> win32text would then be enhanced to check for a .hgeols file before
> falling back to its existing configuration mechanisms.
> The above basically provides the SVN eol-style feature in a more
> hg-friendly way. Allowing wildcards in the .hgeols files might be nice,
> but I don't think it is actually required. We really don't have that
> many files that are affected by this problem (it's just the fact that it
> is a number greater than zero that is causing the problem).
> The server side pre-push hooks for the main Python repositories would be
> set to reject change sets which didn't meet the above rules. If a patch
> fails those checks, either the committer can fix it themselves and
> resubmit, or else send it back to the originator along with a pointer to
> the section in the dev FAQ that describes the expected client-side
> configuration.
Instead of just talking about line endings, could each file have a
specific 'filetype'? This would define what kind of data it contains,
how it's stored in the repository, and what actions to perform for
fetching and committing, including any checks:

     c_header: C header file; LF in repository; native outside

     c_source: C source file; LF in repository; native outside

     text: plain text; LF in repository; native outside

     crlf_text: plain text; CRLF in repository; CRLF outside

     cr_text: plain text; CR in repository; CR outside

     lf_text: plain text; LF in repository; LF outside

     binary: arbitrary binary data; as-is in repository

This could be expanded in the future to include filetypes for JPEG, etc.
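
Purely as an illustration, the list above could be written down as data
along these lines. The class, field and function names are invented for
this sketch and are not part of any existing tool.

```python
import os
from typing import NamedTuple, Optional


class FileType(NamedTuple):
    description: str
    repo_eol: Optional[str]      # None: stored byte-for-byte
    working_eol: Optional[str]   # "native": the platform's own line ending


# The list above, transcribed as data.
FILETYPES = {
    "c_header": FileType("C header file", "LF", "native"),
    "c_source": FileType("C source file", "LF", "native"),
    "text": FileType("plain text", "LF", "native"),
    "crlf_text": FileType("plain text", "CRLF", "CRLF"),
    "cr_text": FileType("plain text", "CR", "CR"),
    "lf_text": FileType("plain text", "LF", "LF"),
    "binary": FileType("arbitrary binary data", None, None),
}

_EOL = {"LF": "\n", "CRLF": "\r\n", "CR": "\r"}


def working_eol(filetype: str, linesep: str = os.linesep) -> Optional[str]:
    """Return the line ending to use on disk, or None for as-is files."""
    style = FILETYPES[filetype].working_eol
    if style is None:
        return None
    return linesep if style == "native" else _EOL[style]
```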

From dirkjan at  Wed Aug  5 15:37:57 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Wed, 5 Aug 2009 15:37:57 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Wed, Aug 5, 2009 at 15:35, MRAB<python at> wrote:
> Instead of just talking about line endings, could each file have a
> specific 'filetype'? This would define what kind of data it contains,
> how it's stored in the repository, and what actions to perform for
> fetching and committing, including any checks:

Sounds like YAGNI to me. The outline Nick provided seems to me to be
quite close to the current win32text settings in syntax and purpose
and staying close to that would help making adoption easier.



From phd at  Wed Aug  5 15:50:03 2009
From: phd at (Oleg Broytmann)
Date: Wed, 5 Aug 2009 17:50:03 +0400
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Wed, Aug 05, 2009 at 02:35:02PM +0100, MRAB wrote:
> Instead of just talking about line endings, could each file have a
> specific 'filetype'?

   EOL-conversion, MIME type and encoding (charset) are three different
concepts. Yes, all of them must be supported, but not necessarily in one
configuration mechanism.
   Subversion handles these issues by providing svn:eol-style and
svn:mime-type (handles both MIME type and charset) properties on a
file-by-file basis.

     Oleg Broytmann              phd at
           Programmers don't die, they just GOSUB without RETURN.

From phd at  Wed Aug  5 15:57:59 2009
From: phd at (Oleg Broytmann)
Date: Wed, 5 Aug 2009 17:57:59 +0400
Subject: [Python-Dev] PEP 385: Mercurial issues
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Wed, Aug 05, 2009 at 05:50:03PM +0400, Oleg Broytmann wrote:
>    Subversion handles these issues by providing ...
> svn:mime-type (handles both MIME type and charset)
> file-by-file basis.

   Dirkjan, how does Mercurial handle charsets? If I have three files in
my repository - one in utf-8, another in koi8-r, and the third in cp1251
encoding - I certainly don't want to convert them back and forth, but I
want the hg web interface to provide charset in the Content-Type header.

     Oleg Broytmann              phd at
           Programmers don't die, they just GOSUB without RETURN.

From dirkjan at  Wed Aug  5 16:04:24 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Wed, 5 Aug 2009 16:04:24 +0200
Subject: [Python-Dev] PEP 385: Mercurial issues
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

On Wed, Aug 5, 2009 at 15:57, Oleg Broytmann<phd at> wrote:
>    Dirkjan, how does Mercurial handle charsets? If I have three files in
> my repository - one in utf-8, another in koi8-r, and the third in cp1251
> encoding - I certainly don't want to convert them back and forth, but I
> want the hg web interface to provide charset in the Content-Type header.

It doesn't currently have any way to provide out-of-band charset info.



From ncoghlan at  Wed Aug  5 16:12:08 2009
From: ncoghlan at (Nick Coghlan)
Date: Thu, 06 Aug 2009 00:12:08 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>	<>	<>	<>
	<>	<>
Message-ID: <>

Dirkjan Ochtman wrote:
> On Wed, Aug 5, 2009 at 15:35, MRAB<python at> wrote:
>> Instead of just talking about line endings, could each file have a
>> specific 'filetype'? This would define what kind of data it contains,
>> how it's stored in the repository, and what actions to perform for
>> fetching and committing, including any checks:
> Sounds like YAGNI to me.

Yep - while SVN does support full mime_type specification for files, I
don't think we have ever used it. The SVN eol-style property is all
we're trying to replicate, since that has served us well in the few
cases where it has mattered.

> The outline Nick provided seems to me to be
> quite close to the current win32text settings in syntax and purpose
> and staying close to that would help making adoption easier.

Yeah, win32text is already tantalisingly close to what we would like so I
deliberately tried to stay close to its existing approach. We're just
being a bit fussier than most about the repository being able to tell
the clients which files should be given special treatment. That way
individual users can just set it up once on their development machine
and then no longer have to worry about it (if more files that need
special treatment are added to the repository, then the same checkin
that adds them should also update .hgeols).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From stephen at  Wed Aug  5 16:28:42 2009
From: stephen at (Stephen J. Turnbull)
Date: Wed, 05 Aug 2009 23:28:42 +0900
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
	<> <>
	<> <>
Message-ID: <>

Mark Hammond writes:

 > I'm not sure what point you are trying to make, but I believe it *is* 
 > possible for a solution to be found here which will keep Windows users 
 > happy.  I'm guessing you haven't had much practical experience with this 
 > problem, so probably don't see this as clearly as Windows users do.

Mercurial is not only open source, it's written in Python.  The
problem is known to be hard in a practical sense, the existing
solutions (written by non-Windows developers, of course) are judged to
be insufficient by Windows users, and the non-Windows developers
"probably don't see this as clearly as Windows users do."

I think the implication is obvious.  There will be no good solution
until Windows users develop it.  I don't see a good reason to wait for
that.  I do see good reason for non-Windows users to put up with some
inconvenience during the "beta" phase of implementing that solution;
it's important enough to be fast-tracked, and doesn't need to be
perfect for everybody to be tried (though it should not be allowed to
endanger repo content, which seems unlikely but needs care since it's
a potential disaster).

From phd at  Wed Aug  5 16:35:33 2009
From: phd at (Oleg Broytmann)
Date: Wed, 5 Aug 2009 18:35:33 +0400
Subject: [Python-Dev] PEP 385: Mercurial issues
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Wed, Aug 05, 2009 at 04:04:24PM +0200, Dirkjan Ochtman wrote:
> On Wed, Aug 5, 2009 at 15:57, Oleg Broytmann<phd at> wrote:
> >    Dirkjan, how does Mercurial handle charsets? If I have three files in
> > my repository - one in utf-8, another in koi8-r, and the third in cp1251
> > encoding - I certainly don't want to convert them back and forth, but I
> > want the hg web interface to provide charset in the Content-Type header.
> It doesn't currently have any way to provide out-of-band charset info.

   Perhaps that's not a big issue for Python, but it's certainly a big
issue for me.

     Oleg Broytmann              phd at
           Programmers don't die, they just GOSUB without RETURN.

From dirkjan at  Wed Aug  5 16:40:31 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Wed, 5 Aug 2009 16:40:31 +0200
Subject: [Python-Dev] PEP 385: Mercurial issues
In-Reply-To: <>
References: <> <>
	<> <>
	<> <>
Message-ID: <>

On Wed, Aug 5, 2009 at 16:35, Oleg Broytmann<phd at> wrote:
>    Perhaps that's not a big issue for Python, but it's certainly a big
> issue for me.

I think there are extensions that try to deal with it. Have a look:

If not, it should be easy to come up with something and write an
extension for it.



From phd at  Wed Aug  5 16:58:57 2009
From: phd at (Oleg Broytmann)
Date: Wed, 5 Aug 2009 18:58:57 +0400
Subject: [Python-Dev] PEP 385: the charset issue
In-Reply-To: <>
References: <> <>
	<> <>
Message-ID: <>

On Thu, Aug 06, 2009 at 12:12:08AM +1000, Nick Coghlan wrote:
> Yep - while SVN does support full mime_type specification for files, I
> don't think we have ever used it.

   These files are in 8859-1 encoding (names in comments, at least):
   If they are not marked as "text/plain; charset=iso-8859-1" I think it's
a bug. Either they should be marked, or converted to ascii or utf-8; the
coding pseudocomment (directive) should be changed accordingly.
   Probably there are other files.

     Oleg Broytmann              phd at
           Programmers don't die, they just GOSUB without RETURN.

From john.arbash.meinel at  Wed Aug  5 16:58:50 2009
From: john.arbash.meinel at (John Arbash Meinel)
Date: Wed, 05 Aug 2009 09:58:50 -0500
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>		<>
	<>		<>		<>
	<>		<>		<>	<>
Message-ID: <>

Mark Hammond wrote:
> On 5/08/2009 8:14 PM, Dirkjan Ochtman wrote:
>> endings. Typically, in my case, that was either Notepad2 (an awesomely
>> light-weight Notepad replacement) or Komodo (Edit). That solved all of
>> my issues, so I haven't had a need for win32text so far.
> FWIW, I use komodo and scite as my primary editors, and as mentioned, am
> personally responsible for accidentally checking in \r\n files into what
> should be a \n repo.  I am slowly and painfully learning to be more
> careful - IMO, I shouldn't need to...
> Cheers,
> Mark

IIRC one of the main problems is Copy & Paste. I believe both Scite and
Visual Studio have had issues where they "preserve" the line endings of
files, but if you paste from another source, it will continue to
"preserve" the line endings of the pasted content.

That said, you also have the "create a new file defaults to CRLF" that
has similar problems.
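
Both cases are easy to detect after the fact. A small sketch (the helper
names are made up) that counts the line endings in a file's bytes and
flags a mix:

```python
import re
from collections import Counter

# CRLF must come first so that it is not counted as a CR plus an LF.
_EOL = re.compile(rb"\r\n|\r|\n")
_NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def eol_counts(data: bytes) -> Counter:
    """Count each kind of line ending found in data."""
    return Counter(_NAMES[m.group()] for m in _EOL.finditer(data))


def has_mixed_eols(data: bytes) -> bool:
    """Return True if data uses more than one kind of line ending."""
    return len(eol_counts(data)) > 1
```

A file that was saved with LF and then had a CRLF snippet pasted in shows
up with both "LF" and "CRLF" in the counter.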


From solipsis at  Wed Aug  5 17:08:25 2009
From: solipsis at (Antoine Pitrou)
Date: Wed, 5 Aug 2009 15:08:25 +0000 (UTC)
Subject: [Python-Dev] PEP 385: the charset issue
References: <> <>
	<> <>
	<> <>
Message-ID: <>

Oleg Broytmann <phd <at>> writes:
>    These files are in 8859-1 encoding (names in comments, at least):
>    If they are not marked as "text/plain; charset=iso-8859-1" I think it's
> a bug. Either they should be marked, or converted to ascii or utf-8; the
> coding pseudocomment (directive) should be changed accordingly.

It's certainly ok to convert them to utf-8 (and add the marker anyway).
There's no point in having different charsets used throughout the code base,
except for testing purposes (just as there's no point in having different
indentation rules used for the same file type throughout the code base ;-)).
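
A rough sketch of what such a conversion could look like, assuming the file really is ISO-8859-1 and carries a PEP 263 coding declaration on one of its first two lines. The helper name latin1_to_utf8 is hypothetical; the regular expression follows the pattern given in PEP 263.

```python
import re

# PEP 263 coding declaration; only the first two lines of a file may carry it.
CODING_RE = re.compile(rb"^([ \t\f]*#.*?coding[:=][ \t]*)([-\w.]+)")


def latin1_to_utf8(source: bytes) -> bytes:
    """Re-encode ISO-8859-1 source bytes as UTF-8 and update the coding line."""
    lines = source.splitlines(keepends=True)
    for i, line in enumerate(lines[:2]):
        match = CODING_RE.match(line)
        if match:
            # Swap only the encoding name; keep the rest of the comment as-is.
            lines[i] = line[:match.start(2)] + b"utf-8" + line[match.end(2):]
            break
    return b"".join(lines).decode("iso-8859-1").encode("utf-8")
```

Rewriting the declaration and the bytes in one step avoids the state where the directive and the actual encoding disagree.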



From stephen at  Wed Aug  5 17:34:39 2009
From: stephen at (Stephen J. Turnbull)
Date: Thu, 06 Aug 2009 00:34:39 +0900
Subject: [Python-Dev] PEP 385: Mercurial issues
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

Oleg Broytmann writes:

 >    Dirkjan, how does Mercurial handle charsets? If I have three files in
 > my repository - one in utf-8, another in koi8-r, and the third in cp1251
 > encoding - I certainly don't want to convert them back and forth, but I
 > want hg web interface to provide charset in the Content-Type header.

How is this relevant to PEP 385?  I hope the answer is "not at all".
I've been there, done that, and my answer is "never again".  (I'm not
telling you what to do with *your* repository, just that I don't see
any good reason for having any encodings but UTF-8 in Python's.)

From phd at  Wed Aug  5 17:54:15 2009
From: phd at (Oleg Broytmann)
Date: Wed, 5 Aug 2009 19:54:15 +0400
Subject: [Python-Dev] PEP 385: Mercurial issues
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Thu, Aug 06, 2009 at 12:34:39AM +0900, Stephen J. Turnbull wrote:
> Oleg Broytmann writes:
>  >    Dirkjan, how does Mercurial handle charsets? If I have three files in
>  > my repository - one in utf-8, another in koi8-r, and the third in cp1251
>  > encoding - I certainly don't want to convert them back and forth, but I
>  > want hg web interface to provide charset in the Content-Type header.
> How is this relevant to PEP 385?  I hope the answer is "not at all".

   There are non-utf8 non-ascii files in the Python source tree. Either
there should be a way to handle them in Mercurial or they have to be
converted to UTF-8 in a proper way (i.e., don't forget to rewrite charset
   Other than that - I am pondering a switch from SVN to hg in other
projects, using the Python process as an example and asking questions that are
slightly off-topic (but only slightly).

> I've been there, done that, and my answer is "never again".  (I'm not
> telling you what to do with *your* repository, just that I don't see
> any good reason for having any encodings but UTF-8 in Python's.)

   We have files in at least two different encodings - utf-8 and cp1251 for
user-visible text-files on w32.

     Oleg Broytmann              phd at
           Programmers don't die, they just GOSUB without RETURN.

From steve at  Wed Aug  5 18:05:11 2009
From: steve at (Steve Holden)
Date: Wed, 05 Aug 2009 12:05:11 -0400
Subject: [Python-Dev] Microsoft MSDN
Message-ID: <>

I sent fourteen requests for licenses in to Microsoft. I've asked them
to let me know which they grant (since they may choose to limit the
number) and will inform you all personally when I hear their decision.

Steve Holden           +1 571 484 6266   +1 800 494 3119
Holden Web LLC       
Watch PyCon on video now!

From p.f.moore at  Wed Aug  5 18:24:17 2009
From: p.f.moore at (Paul Moore)
Date: Wed, 5 Aug 2009 17:24:17 +0100
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

2009/8/5 Mark Hammond <mhammond at>:
> Most tools that I use will tend to not mix EOL styles in a single file, but
> will tend to create \r\n line endings for new files I create.  Most hg repos
> I come across don't have mixed line endings within individual files, so I
> can only guess these files were accidentally introduced in the same way (and
> indeed I have personally done this.)  I'm hoping to be part of the solution
> instead of part of the problem :)

Interesting. I don't recall *ever* having generated CRLF line endings
in a LF-delimited file (I use Vim) although I may have created CRLF in
new files (and then not noticed, as Vim handles it transparently
enough that I missed it).

There are no significant projects where I'm a committer, though, so I
interact via patches, which means I don't get the opportunity to break
the repository :-)

> Technically it would be optional for everyone, of course.  However, the
> solution should be such that everyone, regardless of personal preference, is
> willing to take the hit.
> For example, if the repo is converted using \r\n line endings natively, then
> Windows users would need to take no action either and puts the onus back on
> you (given your stated preferences) to configure the tool appropriately.  I
> assume you would have no objection to that and would be happy to make that
> tool optional for me?

Absolutely. My issue is with 2 points:

1) I'm an infrequent contributor, so I don't keep a checkout around. I
make a new clone "on demand", so I would be likely to forget to enable
the hook on at least a proportion of my clones. The versioned .hgeols
proposal seems to cover this.

2) This behaviour is something needed for Python only. I've no issue
with enabling win32text globally, but I'd want to be clear that it is
a no-op unless specifically requested (ie, something like
**=cleverencode is *not* used in the absence of an explicit set of
rules). That may well be the case, but I had the impression that
win32text tried to be "automatic", so I'd like to verify it.

> I must concede that Windows developers are the minority here - but assuming
> we want a level playing field, I don't see how that changes the underlying
> issue...

Again, agreed entirely.

As a Windows developer who doesn't (knowingly) encounter the issue,
I'm not in a good position to help, but I'm happy to contribute
comments and test things. I'll be offline for a couple of weeks,
though, so you may well have solved it before I can do anything :-)


From v+python at  Wed Aug  5 19:43:57 2009
From: v+python at (Glenn Linderman)
Date: Wed, 05 Aug 2009 10:43:57 -0700
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>	<>
	<>	<>	<>	<>
Message-ID: <>

On approximately 8/5/2009 4:28 AM, came the following characters from 
the keyboard of Dirkjan Ochtman:
> On Wed, Aug 5, 2009 at 13:19, Mark Hammond<mhammond at> wrote:
>> Configuring on each clone would certainly be sub-optimal, so the proposal is
>> this configuration be stored in a versioned file in the repo.
> Even if we do that, enabling hg extensions will still need to be done
> locally -- although it can be done per-user/box instead of per-clone.

On approximately 8/5/2009 9:24 AM, came the following characters from 
the keyboard of Paul Moore:
 > 2) This behaviour is something needed for Python only. I've no issue
 > with enabling win32text globally, but I'd want to be clear that it is
 > a no-op unless specifically requested (ie, something like
 > **=cleverencode is *not* used in the absence of an explicit set of
 > rules). That may well be the case, but I had the impression that
 > win32text tried to be "automatic", so I'd like to verify it.

Depending on [Windows] users to configure their installation of 
Mercurial to work with the Python repository is lame; it will lead to 
new Windows contributors getting beat-up at check-in time, and make them 
less likely to want to contribute even the work they have already done 
(with wrong EOL), and much less to want to start future contributions, 
because some Unix Python hacker will be nasty about "Didn't you RTFM?" 
(Maybe not at first, but eventually).

If the configuration settings have to be different per project for 
Windows developers using Mercurial for multiple projects, then that is 
also lame... Windows developers would have to keep changing their 
configurations, or (implied in above discussion) remember to recreate 
settings for each new clone or branch or whatever of the Python project. 
  This is also error-prone, and leads to the above problem a different way.

I have read this whole discussion, but want to step back and look at it 
from a theoretical viewpoint.  A good solution would have the following
characteristics:

INSTALLATION) The developer should install the [D]VCS (for this 
discussion, Mercurial, present or future version), and attempt to access 
a repository (for this discussion, the Python repository, converted and 
configured for the chosen [D]VCS).  The resultant environment should 
automatically be configured to work properly. If any [D]VCS extensions 
are required for the project, they should be automatically installed and 
configured, or the user given explicit instructions on how to do so, as 
a one-time installation step, that adversely affects no other projects 
for which the [D]VCS is used by that or other users of the present 
installation.  See below for what "properly" means.

EOL CONFIGURATION) Each file, when added to the repository, should have 
a repository setting that indicates what the appropriate EOL type is for 
that file.  The values I have heard are  \n only, \r\n, platform-native, 
and binary.  I haven't heard \r only in this discussion, but have heard 
it in other similar discussions, and it may be a useful setting for 
Mercurial to have, if the feature must be newly implemented there.  I 
believe there are also systems that use RS to separate lines, and 
perhaps other things (and are there new Unicode control characters that 
could be used for line endings?), so it might be good to leave a few 
unassigned values in such a setting.  I don't think any setting should 
be created to allow mixed line ending usage within a file, except 
binary.  Per repository default for this setting should be available to 
avoid burdening the user when creating the typical type of file.

ENCODING CONFIGURATION) Each file, when created, should have a 
repository setting that declares its character repertoire and encoding, 
and if it is a Unicode UTF encoding, whether or not it should have a 
leading BOM.  In my opinion, all source code files should use a Unicode 
encoding, the exception being for test files that help test encoding 
support in internationalized environments.  But the feature supports 
other people's opinions too.  Per repository default for this setting 
should be available to avoid burdening the user when creating the 
typical type of file.

CHECKOUT) Check-outs should be sensitive to the user's local environment 
(platform and locale settings), and non-binary files should be converted 
from the repository format to the local encoding and platform-specific 
line endings.  Settings to override the line endings should be 
optionally available for users whose tools understand other line 
endings, and prefer them over the native line endings.  If the 
characters used within a file cannot be converted losslessly to the 
encoding specified by the locale settings, then it should not be able to 
be checked out.  A special override might be useful for using a lossy 
transformation for a read-only view of the file, at user request.

CHECKIN) Check-ins, even local check-ins to local clones or branches, 
should automatically convert encodings and line endings from the 
platform and locale setting to the encoding and line ending specified by 
the repository for that file.  If the characters in the modified file 
cannot be transformed losslessly to the repository repertoire and 
encoding, the check-in should be prevented.
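
The per-file EOL setting with a per-repository default, described under EOL CONFIGURATION above, might be sketched roughly as follows. The rule table, its patterns, and the style names are hypothetical illustrations and do not correspond to any existing Mercurial configuration format.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: (glob pattern, EOL style). The first match wins.
EOL_RULES = [
    ("*.png", "binary"),
    ("*.bat", "crlf"),
    ("*.sh", "lf"),
    ("*.py", "native"),
]

# Per-repository default, used when no rule matches a path.
DEFAULT_EOL = "native"


def eol_style_for(path: str) -> str:
    """Return the EOL style configured for a repository-relative path."""
    for pattern, style in EOL_RULES:
        if fnmatchcase(path, pattern):
            return style
    return DEFAULT_EOL
```

Keeping such a table in a versioned file inside the repository would mean a fresh clone picks up the rules without per-clone setup.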

The CHECKIN should be a requirement of a useful [D]VCS, regardless of 
whether any other capabilities are present.
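
As an illustration of the CHECKIN requirement, the sketch below converts a working-copy file to the repository's encoding and line ending and refuses the check-in when the conversion is not lossless. All names (CheckinError, normalize_for_checkin, the parameters) are hypothetical; this is not how any existing VCS implements it.

```python
# Repository line-ending styles supported by this sketch.
EOL_TEXT = {"lf": "\n", "crlf": "\r\n"}


class CheckinError(ValueError):
    """Raised when a file cannot be converted losslessly for check-in."""


def normalize_for_checkin(data: bytes, local_encoding: str,
                          repo_encoding: str, repo_eol: str) -> bytes:
    """Convert working-copy bytes to the repository encoding and EOL style."""
    try:
        text = data.decode(local_encoding)
    except UnicodeDecodeError as exc:
        raise CheckinError("file is not valid " + local_encoding) from exc
    # Collapse every line-ending style to "\n", then expand to the repo style.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = text.replace("\n", EOL_TEXT[repo_eol])
    try:
        return text.encode(repo_encoding)
    except UnicodeEncodeError as exc:
        raise CheckinError("characters not representable in "
                           + repo_encoding) from exc
```

The same function could serve as the server-side validation step, since it either produces canonical bytes or fails loudly.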

Even if none of the existing tools can reach the above flexibility, the 
problems that result from using tools that do not have such flexibility 
should be understood in terms of their specific deficiencies compared to 
the theoretical model.

I can think of only one other solution that properly handles the 
problems (which is punting, really): to require the development 
environment to support the repertoire, encoding, and line endings of the 
repository.  Doing this in a cross-platform manner is hard, because the 
tool sets (editors, compilers, databases, etc.) tend to support the 
platform-native convention better than the non-native conventions.  It 
sounds like Mercurial's win32text extension is one form of this sort of 
requirement.  CHECKIN should be a requirement even in this case, to 
validate the incoming data file.  Basic software design requires 
validation of incoming data.

I have no clue how many of these characteristics are implemented by 
Mercurial (or any other VCS or DVCS, I've been 7 years away from using 
SCCS, CVS, and Clearcase, but none of them had such features then, and 
I've not used the modern crop of VCSes much: git, svn, hg, bazaar, 
except a little in passing, but haven't read any documentation, nor 
attempted to set up a project myself in any of them).

If none of the existing tools can reach the above flexibility, then 
there will be problems that result, and understanding what the problems 
are, and coming up with documented workarounds, processes, and auxiliary 
tools on each platform/environment to cure or prevent them, would seem 
to be necessary to support the use of such tools.

Since Mercurial is the presently chosen DVCS for Python to migrate to, 
I'd be delighted to learn how close it comes to the theoretical model, 
and I'm sure someone out there knows.  When I have some time, I'll 
attempt to figure that out by reading the Mercurial documentation... I 
have a personal (Python, cross-platform) project that is in need of a 
DVCS soon, and so I'm watching this discussion with much interest, to 
know whether I should also choose Mercurial, or should choose something 
that is closer to the theoretical solution outlined above (if there is 
something that is, or appears to be more likely to reach it sooner).

Glenn --
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

From lanyjie at  Wed Aug  5 20:10:29 2009
From: lanyjie at (Yingjie Lan)
Date: Wed, 5 Aug 2009 11:10:29 -0700 (PDT)
Subject: [Python-Dev] Reasons for using expy
In-Reply-To: <>
Message-ID: <>


The expy project provides an express way to extend Python. After some careful consideration, I came up with some reasons for using expy (this is not an exhaustive list):

(I). WYSIWYG. The expy project enables you to write your module in Python the way your extension would be (WYSIWYG), and meanwhile write your implementation in pure C. You specify your modules, functions, methods, classes, and even their documentation in the usual way of writing their Python counterparts. Then you provide the implementation of the functions/methods by returning a multi-line string. With this arrangement, everything falls into its right place, and your extension code becomes easy to read and maintain. Also, the generated code is very human-friendly.

(II). You only provide minimal information to indicate your intention of how your module/class would function in Python. So your extension is largely independent of the Python extension API. As your interaction with the Python extension API is reduced to a minimum (you only care about the functionality and logic), it is then possible that your module written in expy can be independent of changes in the extension API.

(III). The building and setup of your project can be done automatically with the distutils tool. In the tutorial, there are ample examples of how easily this is achieved.

(IV). Very lightweight. The expy tool is surprisingly lightweight despite its powerful abilities, as it is written in pure Python. There is no parser or compiler for code generation; rather, the powerful reflection mechanism of Python is exploited in a clever way to generate human-friendly code. Currently, generating code in C is supported; however, the implementation is well modularized, and code generation in other languages such as Java and C++ should be easy.

While there are already a couple of other projects trying to simplify this task with different strategies, such as Cython, Pyrex and modulator, this project is unique and charming in its own way. All you need is the WYSIWYG Python file for your module extension, then expy takes care of everything else. What follows in this documentation is how to extend Python in C using expy-cxpy: the module expy helps define your module, while the module cxpy helps generate C code for your defined module.

For more information about expy, please visit its homepage at:




From martin at  Wed Aug  5 20:22:27 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Aug 2009 20:22:27 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>	<>
	<>	<>	<>
Message-ID: <>

>> Given that my preference is to use Unix-style EOL for "text" files on
>> Windows, as every text editor I use (barring notepad!) understands LF
>> format, it seems to me that this proposal also means that the hook
>> would be optional for me. That suits me fine - I'd prefer to avoid
>> having hooks that are required for Python checkouts, as that means I
>> have to remember to configure them on each clone (IIUC).
> Yeah, this may also be what's making it harder for me to understand
> the issues.

Please trust that there are plenty of editors that get the line ending
implementation wrong. I'm fairly certain that some Visual Studio
versions are among them. They will recognize LF as a line ending, but
add CRLF line breaks when the user presses enter.

In addition, some editors (in particular notepad) choke when confronted
with LF-only files. It is very annoying if you have to look at source
code on somebody else's machine which doesn't have any programmer's
editor installed (except for Visual Studio).


From martin at  Wed Aug  5 20:32:09 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Aug 2009 20:32:09 +0200
Subject: [Python-Dev] PEP 385: the charset issue
In-Reply-To: <>
References: <>
	<>	<>	<>	<>	<>
	<>	<>	<>	<>
Message-ID: <>

>    These files are in 8859-1 encoding (names in comments, at least):
>    If they are not marked as "text/plain; charset=iso-8859-1" I think it's
> a bug. Either they should be marked, or converted to ascii or utf-8; the
> coding pseudocomment (directive) should be changed accordingly.

It's certainly a bug of the web page. I'm not so sure it's a bug in the
files: I would claim that it's a bug in ViewCVS.


From martin at  Wed Aug  5 20:35:02 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Aug 2009 20:35:02 +0200
Subject: [Python-Dev] PEP 385: the charset issue
In-Reply-To: <>
References: <>
	<>	<>	<>	<>	<>
	<>	<>	<>	<>
Message-ID: <>

>>    These files are in 8859-1 encoding (names in comments, at least):
>>    If they are not marked as "text/plain; charset=iso-8859-1" I think it's
>> a bug. Either they should be marked, or converted to ascii or utf-8; the
>> coding pseudocomment (directive) should be changed accordingly.
> It's certainly ok to convert them to utf-8 (and add the marker anyway).

No, it's not. PEP 8 mandates that non-ASCII code in the Python source
code is in Latin-1.


From martin at  Wed Aug  5 20:37:55 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Aug 2009 20:37:55 +0200
Subject: [Python-Dev] PEP 385: Mercurial issues
In-Reply-To: <>
References: <>	<>
	<>	<>	<>	<>	<>
	<>	<>	<>	<>
Message-ID: <>

>  >    Dirkjan, how does Mercurial handle charsets? If I have three files in
>  > my repository - one in utf-8, another in koi8-r, and the third in cp1251
>  > encoding - I certainly don't want to convert them back and forth, but I
>  > want hg web interface to provide charset in the Content-Type header.
> How is this relevant to PEP 385?  I hope the answer is "not at all".
> I've been there, done that, and my answer is "never again".  (I'm not
> telling you what to do with *your* repository, just that I don't see
> any good reason for having any encodings but UTF-8 in Python's.)

Just in case my previous message gets overlooked: PEP 8 mandates Latin-1
for Python 2.x source code (except for files that test PEP 263).


From solipsis at  Wed Aug  5 21:17:42 2009
From: solipsis at (Antoine Pitrou)
Date: Wed, 5 Aug 2009 19:17:42 +0000 (UTC)
Subject: [Python-Dev] PEP 385: the charset issue
References: <>
	<>	<>	<>	<>	<>
	<>	<>	<>	<>
Message-ID: <>

Martin v. Löwis <martin <at>> writes:
> No, it's not. PEP 8 mandates that non-ASCII code in the Python source
> code is in Latin-1.

Ok, point taken.
Having several encodings (and several indentation rules) certainly makes things
more annoying for contributors than they should be, however.



From g.brandl at  Wed Aug  5 21:43:08 2009
From: g.brandl at (Georg Brandl)
Date: Wed, 05 Aug 2009 21:43:08 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>	<>	<>	<>
Message-ID: <h5cnem$sbv$>

Neil Hodgson schrieb:
> Martin v. Löwis:
>> Or don't you understand why that single unresolved item didn't manage
>> to revert the decision? Well, there are many unresolved items in
>> the Mercurial conversion, some much more stressful than the eol issue
>> (e.g. the branching discussion).
>    Then these issues should have been included in the initial PEP for
> choosing a DVCS since the issues could have driven the choice. PEP 374
> implies that win32text effectively solves the Windows eol issue which
> no longer appears to be correct.

Apparently, it was the author's understanding at that time that win32text
would be sufficient.  Also, PEP 374 has not been written in isolation; at
any time during the process people could have notified Dirkjan that this
is not the case.

The branching issue *has* been included in PEP 374; it is not a blocker
for migration, but rather a decision has to be made between two similar,
but in other ways quite different styles for converting SVN branches.

I'm not aware of any other unresolved items; they may exist, but the fact
that they're not discussed on this list in detail means that they are
largely unimportant.


Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.

From g.brandl at  Wed Aug  5 21:56:15 2009
From: g.brandl at (Georg Brandl)
Date: Wed, 05 Aug 2009 21:56:15 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>
	<>	<>
Message-ID: <h5co79$c8$>

Stephen J. Turnbull schrieb:

> Mark Hammond writes:
>  > I'm not sure what point you are trying to make, but I believe it *is* 
>  > possible for a solution to be found here which will keep Windows users 
>  > happy.  I'm guessing you haven't had much practical experience with this 
>  > problem, so probably don't see this is clearly as Windows users do.
> Mercurial is not only open source, it's written in Python.  The
> problem is known to be hard in a practical sense, the existing
> solutions (written by non-Windows developers, of course) are judged to
> be insufficient by Windows users, and the non-Windows developers
> "probably don't see this is clearly as Windows users do."
> I think the implication is obvious.  There will be no good solution
> until Windows users develop it.  I don't see a good reason to wait for
> that.  I do see good reason for non-Windows users to put up with some
> inconvenience during the "beta" phase of implementing that solution;

It's not that obvious -- we at least need the server-side check that doesn't
allow "wrong" line endings as the "last" line of defense, and this check
already needs a way to know which files are supposed to have which line
endings -- deciding how to specify that is already half of the needed


Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.

From mal at  Wed Aug  5 22:04:46 2009
From: mal at (M.-A. Lemburg)
Date: Wed, 05 Aug 2009 22:04:46 +0200
Subject: [Python-Dev] PEP 385: the charset issue
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

"Martin v. Löwis" wrote:
>>>    These files are in 8859-1 encoding (names in comments, at least):
>>>    If they are not marked as "text/plain; charset=iso-8859-1" I think it's
>>> a bug. Either they should be marked, or converted to ascii or utf-8; the
>>> coding pseudocomment (directive) should be changed accordingly.
>> It's certainly ok to convert them to utf-8 (and add the marker anyway).
> No, it's not. PEP 8 mandates that non-ASCII code in the Python source
> code is in Latin-1.

Then I guess it's time to change PEP 8 for Python 2.7 ...

Code in the core Python distribution should always use the ASCII or
UTF-8 encoding together with a PEP 263 encoding comment header.

Since UTF-8 is ASCII compatible, the whole source code will
effectively be UTF-8 encoded.
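
To illustrate the PEP 263 header in practice, here is a small sketch that asks Python 3's tokenize module which encoding a source file declares. The wrapper name declared_encoding is hypothetical; tokenize.detect_encoding is the standard-library function, and under Python 3 it falls back to UTF-8 when no declaration is present.

```python
import io
import tokenize


def declared_encoding(source: bytes) -> str:
    """Return the encoding Python 3 would use for these source bytes."""
    # detect_encoding reads at most two lines, looking for a BOM or a
    # PEP 263 coding declaration.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)
    return encoding
```

Because every ASCII byte sequence decodes identically as ASCII and as UTF-8, a pure-ASCII file needs no change when a tree is declared to be UTF-8.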

Marc-Andre Lemburg


From martin at  Wed Aug  5 22:13:07 2009
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 05 Aug 2009 22:13:07 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <h5cnem$sbv$>
References: <>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

> I'm not aware of any other unresolved items; they may exist, but the fact
> that they're not discussed on this list in detail means that they are
> largely unimportant.

There is a long list of things that still need to be done; each one
potentially creating new problems. In particular:
- the .hgeols plugin needs to be written
- the hooks need to be written, or at least deployed, for code
  style checks, for email notification, and for buildbot triggering
- the build identification patch needs to be written (I do expect
  many problems out of that one, some possibly small - I'm not a
  Mercurial user, so I can't estimate how difficult that will be)
- buildbot configuration needs to be adjusted
- the roundup regex needs to be configured to refer to hgweb links
- access control needs to be set up
- stackless needs to be converted
- a decision on the location of the PEPs must be made and implemented
- developer documentation needs to be written
- a decision must be made what to do with the migrated parts of
  subversion, in the subversion repository

I may have missed some things. I would like to see a test period (say,
two weeks) where we can find further issues.


From g.brandl at  Wed Aug  5 22:18:16 2009
From: g.brandl at (Georg Brandl)
Date: Wed, 05 Aug 2009 22:18:16 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<h5cnem$sbv$>
Message-ID: <h5cpgi$6oc$>

Martin v. Löwis schrieb:
>> I'm not aware of any other unresolved items; they may exist, but the fact
>> that they're not discussed on this list in detail means that they are
>> largely unimportant.
> There is a long list of things that still need to be done; each one
> potentially creating new problems. In particular:
> - the .hgeols plugin needs to be written
> - the hooks need to be written, or at least deployed, for code
>   style checks, for email notification, and for buildbot triggering
> - the build identification patch needs to be written (I do expect
>   many problems out of that one, some possibly small - I'm not a
>   Mercurial user, so I can't estimate how difficult that will be)
> - buildbot configuration needs to be adjusted
> - the roundup regex needs to be configured to refer to hgweb links
> - access control needs to be set up
> - stackless needs to be converted
> - a decision on the location of the PEPs must be made and implemented
> - developer documentation needs to be written
> - a decision must be made what to do with the migrated parts of
>   subversion, in the subversion repository
> I may have missed some things. I would like to see a test period (say,
> two weeks) where we can find further issues.

Sure there are many things to do; I was speaking of issues where the way
to go is not decided, and needs to be before the switch can happen.

Maybe build identification is one of them; but I think everything has been
said in the one thread we had about this.


Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.

From amauryfa at  Wed Aug  5 23:03:44 2009
From: amauryfa at (Amaury Forgeot d'Arc)
Date: Wed, 5 Aug 2009 23:03:44 +0200
Subject: [Python-Dev] PEP 385: pruning/reorganizing branches
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/3 Dirkjan Ochtman <dirkjan at>:
> So PEP 385 proposes to clean up the old branches we still have lying
> around in SVN.
> io-c: keep-clone?

strip - it was merged into py3k some months ago.

Amaury Forgeot d'Arc

From ncoghlan at  Wed Aug  5 23:44:58 2009
From: ncoghlan at (Nick Coghlan)
Date: Thu, 06 Aug 2009 07:44:58 +1000
Subject: [Python-Dev] Reasons for using expy
In-Reply-To: <>
References: <>
Message-ID: <>

Yingjie Lan wrote:
> Hi,
> The expy project provides an express way to extend Python. After some
> careful considerations, I came up with some reasons for expy (this is
> not an exhaustive list):

This kind of advocacy for external projects belongs on python-list, not
python-dev (or, if you're proposing something for use in the standard
library, on python-ideas).


P.S. The message to capi-sig was probably on topic - certainly closer to
being so than the inclusion of python-dev.

Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From nyamatongwe at  Thu Aug  6 00:22:14 2009
From: nyamatongwe at (Neil Hodgson)
Date: Thu, 6 Aug 2009 08:22:14 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Glenn Linderman:

> and perhaps other things (and
> are there new Unicode control characters that could be used for line
> endings?),

   Unicode includes Line Separator U+2028 and Paragraph Separator
U+2029 but they are rarely supported and very rarely used. They are a
pain to work with since they are 3 byte sequences in UTF-8. Visual
Studio does support them.

   Python's file reading does not currently treat these as line
separators, as in this example, which only reads 2 lines rather than 3:

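with open("x.txt", "wb") as f:
	# Payload reconstructed for illustration (the original write call is
	# missing here): three logical lines, one ended by U+2028, one by "\n".
	f.write("one\u2028two\nthree\n".encode("utf-8"))
with open("x.txt", "r", encoding="utf-8") as f:
	n = 1
	for l in f.readlines():
		print(n, repr(l))
		n += 1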
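
For comparison, a short sketch of the two facts mentioned above, using only built-in string methods in Python 3: each separator takes three bytes in UTF-8, and str.splitlines() does recognise both of them even though file iteration does not. The variable names are illustrative.

```python
# U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR between three words.
text = "one\u2028two\u2029three"

# Each separator is a three-byte sequence in UTF-8.
ls_bytes = "\u2028".encode("utf-8")
ps_bytes = "\u2029".encode("utf-8")

# str.splitlines() treats both characters as line boundaries...
parts = text.splitlines()

# ...while splitting on "\n" alone sees a single line.
newline_parts = text.split("\n")
```

So the gap is in how text files are split into lines on reading, not in the string type itself.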


From lanyjie at  Thu Aug  6 00:55:28 2009
From: lanyjie at (Yingjie Lan)
Date: Wed, 5 Aug 2009 15:55:28 -0700 (PDT)
Subject: [Python-Dev] Reasons for using expy
In-Reply-To: <>
Message-ID: <>

> From: Nick Coghlan <ncoghlan at>
> Subject: Re: [Python-Dev] Reasons for using expy
> To: "Yingjie Lan" <lanyjie at>
> Cc: python-dev at
> Date: Thursday, August 6, 2009, 1:44 AM
> This kind of advocacy for external projects belongs on
> python-list, not
> python-dev (or, if you're proposing something for use in
> the standard
> library, on python-ideas).

Thanks Nick. 




From mhammond at  Thu Aug  6 02:34:08 2009
From: mhammond at (Mark Hammond)
Date: Thu, 06 Aug 2009 10:34:08 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 6/08/2009 12:28 AM, Stephen J. Turnbull wrote:
> Mark Hammond writes:
>   >  I'm not sure what point you are trying to make, but I believe it *is*
>   >  possible for a solution to be found here which will keep Windows users
>   >  happy.  I'm guessing you haven't had much practical experience with this
>   >  problem, so probably don't see this as clearly as Windows users do.
> Mercurial is not only open source, it's written in Python.  The
> problem is known to be hard in a practical sense, the existing
> solutions (written by non-Windows developers, of course) are judged to
> be insufficient by Windows users, and the non-Windows developers
> "probably don't see this as clearly as Windows users do."
> I think the implication is obvious.  There will be no good solution
> until Windows users develop it.  I don't see a good reason to wait for
> that.

My conclusion is different.  I'm not sure of the history of win32text, 
but it most certainly is now squarely in the hands of Windows users. 
Patches to win32text, or even general discussion is usually met with 
silence, and when prodded, the response is "sorry - we don't use that - 
it is a Windows problem."

As a result, we end up in the position we are in now - win32text is 
great in theory but doesn't work in practice, attempts to make it work 
are met with indifference, and the "problem" stays squarely with Windows 
users.  Non Windows users remain oblivious to the pain, Windows users 
stop bothering with the extension, and the repository post-commit hooks 
then cause different pain.

Hence my conclusion that the answer is for any such support to be 
developed in conjunction with Windows users, but also in such a way that 
the solution works, almost identically, for non Windows users.  By 
insisting all platforms eat the same dog-food, there is much more chance 
the glaringly obvious (to Windows users) issues are addressed.

> I do see good reason for non-Windows users to put up with some
> inconvenience during the "beta" phase of implementing that solution;
> it's important enough to be fast-tracked, and doesn't need to be
> perfect for everybody to be tried (though it should not be allowed to
> endanger repo content, which seems unlikely but needs care since it's
> a potential disaster).

And on the flip-side, I accept we may migrate without the agreed 
solution fully implemented - I'm happy to accept commitments about what 
*will* be done even if it isn't a reality for a short while...



From mcaninch at  Thu Aug  6 00:22:30 2009
From: mcaninch at (Jeff McAninch)
Date: Wed, 05 Aug 2009 16:22:30 -0600
Subject: [Python-Dev] (try-except) conditional expression similar to
 (if-else) conditional (PEP 308)
Message-ID: <>

I'm new to this list, so please excuse me if this topic has been 
discussed, but I didn't
see anything similar in the archives.

I very often want something like a try-except conditional expression similar
to the if-else conditional. 

An example of the proposed syntax might be:
    x = float(string) except float('nan')
or possibly
    x = float(string) except ValueError float('nan')

Here's a simple example: Converting a large list of strings to floats 
where there may be errors
that I want returned as nan's.

Currently I would write the function:
    def safe_float_function(string):
        try:
            result = float(string)
        except ValueError:
            result = float('nan')
        return result
and get my list of floats using the list comprehension:
    xs = [ safe_float_function(string) for string in strings ]

With a try-except conditional I would instead define the following lambda:
    safe_float_conditional = lambda string : float(string) except float('nan')
leading to:
    xs = [ safe_float_conditional(string) for string in strings ]

My understanding is that the second would be faster at run time, and, 
like if-else conditional expressions,
possibly more easily read by the human.

Again, please excuse me if this has been discussed previously.  If so, 
I'd appreciate being pointed to the discussion.

Please also excuse me if there is some current (pre-Python 3.0) 
idiom that I could use to efficiently get this
same behaviour.  If so, I'd appreciate being educated.
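
For reference, one idiom that works without any new syntax is a small general-purpose helper. The sketch below is only an illustration under assumptions: the name call_or_default and its parameters are hypothetical, not an existing standard library function.

```python
import math

def call_or_default(func, arg, exceptions=(ValueError,), default=float('nan')):
    """Return func(arg), or `default` if one of `exceptions` is raised.

    Hypothetical helper, not part of the standard library.
    """
    try:
        return func(arg)
    except exceptions:
        return default

strings = ["1.5", "oops", "2e3"]
xs = [call_or_default(float, s) for s in strings]

# NaN never compares equal to itself, so test for it with math.isnan().
print([("nan" if math.isnan(x) else x) for x in xs])
```

The helper keeps the list comprehension to one line, at the cost of an extra function call per element compared with inline syntax.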

Jeff McAninch

Jeffrey E. McAninch, PhD
Physicist, X-2-IFD
Los Alamos National Laboratory
Phone: 505-667-0374
Email: mcaninch at

From python at  Thu Aug  6 02:59:44 2009
From: python at (Raymond Hettinger)
Date: Wed, 5 Aug 2009 17:59:44 -0700
Subject: [Python-Dev] (try-except) conditional expression similar to
	(if-else) conditional (PEP 308)
References: <>
Message-ID: <592033416A4F4C20A60ACB28E40F84DF@RaymondLaptop1>

[Jeffrey E. McAninch, PhD]
> I very often want something like a try-except conditional expression similar
> to the if-else conditional.  
> An example of the proposed syntax might be:
>    x = float(string) except float('nan')
> or possibly
>    x = float(string) except ValueError float('nan')

I've long wanted something like this.
One possible spelling is:

   x = float(string) except ValueError else float('nan')

If accepted, this would also solve the feature requests for various functions to have default arguments.
For example:

   x = min(seq) except ValueError else 0     # default to zero for empty sequences

It would also be helpful in calculations that have algebraic restrictions:

  sample_std_deviation = sqrt(sum(x - mu for x in seq) / (len(seq)-1)) except ZeroDivisionError else float('Inf')


From pje at  Thu Aug  6 03:20:54 2009
From: pje at (P.J. Eby)
Date: Wed, 05 Aug 2009 21:20:54 -0400
Subject: [Python-Dev] (try-except) conditional expression similar to
 (if-else) conditional (PEP 308)
In-Reply-To: <592033416A4F4C20A60ACB28E40F84DF@RaymondLaptop1>
References: <>
Message-ID: <>

At 05:59 PM 8/5/2009 -0700, Raymond Hettinger wrote:
>[Jeffrey E. McAninch, PhD]
>>I very often want something like a try-except conditional expression similar
>>to the if-else conditional.
>>An example of the proposed syntax might be:
>>    x = float(string) except float('nan')
>>or possibly
>>    x = float(string) except ValueError float('nan')
>+1 I've long wanted something like this.
>One possible spelling is:
>   x = float(string) except ValueError else float('nan')

I think 'as' would be better than 'else', since 'else' has a 
different meaning in try/except statements, e.g.:

    x = float(string) except ValueError, TypeError as float('nan')

Of course, this is a different meaning of 'as', too, but it's not 
"as" contradictory, IMO...  ;-)

From mcaninch at  Thu Aug  6 04:11:28 2009
From: mcaninch at (Jeff McAninch)
Date: Wed, 05 Aug 2009 20:11:28 -0600
Subject: [Python-Dev] (try-except) conditional expression similar to
 (if-else) conditional (PEP 308)
In-Reply-To: <592033416A4F4C20A60ACB28E40F84DF@RaymondLaptop1>
References: <>
Message-ID: <>

Raymond Hettinger wrote:
> If accepted, this would also solve the feature requests for various 
> functions to have default arguments.
> For example:
>   x = min(seq) except ValueError else 0     # default to zero for 
> empty sequences
> It would also be helpful in calculations that have algebraic 
> restrictions:
>  sample_std_deviation = sqrt(sum(x - mu for x in seq) / (len(seq)-1)) 
> except ZeroDivisionError else float('Inf')
> Raymond
Yes, exactly the situations I keep coding around.

Jeffrey E. McAninch, PhD
Physicist, X-2-IFD
Los Alamos National Laboratory
Phone: 505-667-0374
Email: mcaninch at

From stephen at  Thu Aug  6 07:00:43 2009
From: stephen at (Stephen J. Turnbull)
Date: Thu, 06 Aug 2009 14:00:43 +0900
Subject: [Python-Dev] PEP 385: Mercurial issues
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

"Martin v. Löwis" writes:

 > > I don't see any good reason for having any encodings but UTF-8 in
 > > Python's.
 > Just in case my previous message gets overlooked: PEP 8 mandates Latin-1
 > for Python 2.x source code (except for files that test PEP 263).

You're right, sorry for the misinformation.

An exception should be made for gettext message files, too?

From martin at  Thu Aug  6 07:48:46 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 06 Aug 2009 07:48:46 +0200
Subject: [Python-Dev] PEP 385: Mercurial issues
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

>  > Just in case my previous message gets overlooked: PEP 8 mandates Latin-1
>  > for Python 2.x source code (except for files that test PEP 263).
> You're right, sorry for the misinformation.
> An exception should be made for gettext message files, too?

In principle, perhaps. However, Python doesn't have any .po files, AFAIK.


From stephen at  Thu Aug  6 08:00:54 2009
From: stephen at (Stephen J. Turnbull)
Date: Thu, 06 Aug 2009 15:00:54 +0900
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
	<> <>
	<> <>
Message-ID: <>

Mark Hammond writes:
 > On 6/08/2009 12:28 AM, Stephen J. Turnbull wrote:

 > > I think the implication is obvious.  There will be no good solution
 > > until Windows users develop it.  I don't see a good reason to wait for
 > > that.

 > My conclusion is different.  I'm not sure of the history of win32text, 
 > but it most certainly is now squarely in the hands of Windows users. 
 > Patches to win32text, or even general discussion is usually met with 
 > silence, and when prodded, the response is "sorry - we don't use that - 
 > it is a Windows problem."

Well, yes, it is a Windows problem.  And it will probably always be
that way, because for practical purposes, Windows users cannot
advocate their platform's infrastructure solutions for open source
projects: those solutions are proprietary.  On the flip side, in my
experience at least Windows users do not contribute much to this kind
of infrastructure initiative, undoubtedly due to the high cost of
acquiring familiarity with the usable options[1], and so have less
input into the process.

But that's a matter of certain costs that are built in to the nature
of a proprietary platform.  Somebody has to pay them, and I think it
should be the users of that platform.  Why should the rest of the
community subsidize that platform?

 > As a result, we end up in the position we are in now - win32text is 
 > great in theory but doesn't work in practice, attempts to make it work 
 > are met with indifference, and the "problem" stays squarely with Windows 
 > users.

This is simply false AFAICS.  There was little participation on this
particular issue during PEP 374 that I can recall.  Now that it is
clearly an issue after all, it's still early in the PEP 385 process.
Martin has already picked up the ball on EOL support, and has carried
informal design pretty much to the goal line already ... all that's
left is the detailed design and the implementation, and there are
several people involved who will help develop the patch, all very
capable.  (Of course it's going to be easier said than done and there
are probably bumps in the road to a smooth workflow, but I do claim
that the process is working as well as you could expect.)

 > Hence my conclusion that the answer is for any such support to be 
 > developed in conjunction with Windows users, [...]

Ahem.  Why not "(primarily) by Windows users"?

 > And on the flip-side, I accept we may migrate without the agreed
 > solution fully implemented - I'm happy to accept commitments about
 > what *will* be done even if it isn't a reality for a short while...

Make no mistake about it, EOL support is a tempest in a teapot
compared to the benefits to a large number of core developers in their
*personal* workspaces -- even if the project workflow doesn't change
at all.  That's what is driving this change.

Unless Windows users do it themselves, they are dependent on the good
will of the PEP 385 proponent and other volunteer contributors.  I
don't think "accepting commitments" is part of the game plan.

[1]  Eg, I was willing to participate in PEP 374 because I already
have a great interest in version control and use git daily.  Lots of
Unix users don't, and they didn't participate any more than most
Windows users did.

From martin at  Thu Aug  6 08:40:35 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 06 Aug 2009 08:40:35 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>
	<>	<>
	<>	<>
	<>	<>	<>
Message-ID: <>

> This is simply false AFAICS.  There was little participation on this
> particular issue during PEP 374 that I can recall.  Now that it is
> clearly an issue after all, it's still early in the PEP 385 process.
> Martin has already picked up the ball on EOL support, and has carried
> informal design pretty much to the goal line already ... all that's
> left is the detailed design and the implementation, and there are
> several people involved who will help develop the patch, all very
> capable. 

I'm not so optimistic. To me, it looks like either Dirkjan or Mark
will implement a hg hook, or else it won't happen (for me, I certainly
know that I will not write Mercurial hooks anytime soon).


From stephen at  Thu Aug  6 09:12:04 2009
From: stephen at (Stephen J. Turnbull)
Date: Thu, 06 Aug 2009 16:12:04 +0900
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
	<> <>
	<> <>
Message-ID: <>

"Martin v. Löwis" writes:
 > > This is simply false AFAICS.  There was little participation on this
 > > particular issue during PEP 374 that I can recall.  Now that it is
 > > clearly an issue after all, it's still early in the PEP 385 process.
 > > Martin has already picked up the ball on EOL support, and has carried
 > > informal design pretty much to the goal line already ... all that's
 > > left is the detailed design and the implementation, and there are
 > > several people involved who will help develop the patch, all very
 > > capable. 
 > I'm not so optimistic. To me, it looks like that either Dirkjan or Mark
 > will implement a hg hook, or else it won't happen (for me, I certainly
 > know that I will not write Mercurial hooks anytime soon).

Ouch.  Still, I think the informal discussion so far is pretty close
to a usable solution at that level.

From mal at  Thu Aug  6 10:31:04 2009
From: mal at (M.-A. Lemburg)
Date: Thu, 06 Aug 2009 10:31:04 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>	<>	<>
	<>	<>	<>	<>	<>	<>
Message-ID: <>

Neil Hodgson wrote:
> Glenn Linderman:
>> and perhaps other things (and
>> are there new Unicode control characters that could be used for line
>> endings?),
>    Unicode includes Line Separator U+2028 and Paragraph Separator
> U+2029 but they are rarely supported and very rarely used. They are a
> pain to work with since they are 3 byte sequences in UTF-8. Visual
> Studio does support them.
>    Python does not currently support these line separators such as in
> this example which only reads 2 lines rather than 3:
> with open("x.txt", "wb") as f:
> 	f.write("a\nb\u2029c\n".encode('utf-8'))
> with open("x.txt", "r") as f:
> 	n = 1
> 	for l in f.readlines():
> 		print(n, repr(l))
> 		n += 1

Please file a bug report for this. f.readlines() (or rather
the io layer) should be using Py_UNICODE_ISLINEBREAK(ch)
for detecting line break characters.

Marc-Andre Lemburg

Professional Python Services directly from the Source  (#1, Aug 06 2009)
>>> Python/Zope Consulting and Support ...
>>> mxODBC.Zope.Database.Adapter ...   
>>> mxODBC, mxDateTime, mxTextTools ...

::: Try our new mxODBC.Connect Python Database Interface for free ! :::: Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From solipsis at  Thu Aug  6 10:51:29 2009
From: solipsis at (Antoine Pitrou)
Date: Thu, 6 Aug 2009 08:51:29 +0000 (UTC)
Subject: [Python-Dev] PEP 385: the eol-type issue
References: <>	<>	<>	<>
	<>	<>	<>	<>	<>	<>
Message-ID: <>

M.-A. Lemburg <mal <at>> writes:
> Please file a bug report for this. f.readlines() (or rather
> the io layer) should be using Py_UNICODE_ISLINEBREAK(ch)
> for detecting line break characters.

Actually, no. It has been designed from the start to only recognize the
"standard" line break representations found in common formats/protocols (CR, LF
and CR+LF).
People wanting to split on arbitrary unicode line breaks should use
str.splitlines().
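
The difference between the two behaviours can be seen in a minimal sketch (assuming Python 3; io.StringIO stands in for a real text file here, and the variable names are illustrative):

```python
import io

text = "a\nb\u2029c\n"

# A text stream splits only on the "standard" terminators (here "\n"),
# so U+2029 stays embedded in the second line.
stream_lines = io.StringIO(text).readlines()

# str.splitlines() also treats U+2028/U+2029 (and a few other
# characters) as line boundaries.
str_lines = text.splitlines(keepends=True)

print(len(stream_lines), stream_lines)
print(len(str_lines), str_lines)
```

The stream yields two lines while str.splitlines() yields three.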



From ncoghlan at  Thu Aug  6 12:19:38 2009
From: ncoghlan at (Nick Coghlan)
Date: Thu, 06 Aug 2009 20:19:38 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Antoine Pitrou wrote:
> M.-A. Lemburg <mal <at>> writes:
>> Please file a bug report for this. f.readlines() (or rather
>> the io layer) should be using Py_UNICODE_ISLINEBREAK(ch)
>> for detecting line break characters.
> Actually, no. It has been designed from the start to only recognize the
> "standard" line break representations found in common formats/protocols (CR, LF
> and CR+LF).
> People wanting to split on arbitrary unicode line breaks should use
> str.splitlines().

The fairly long-standing RFE relating to an arbitrarily selectable
newline separator seems relevant here:

As with the discussion there, the problem with using str.splitlines is
that it prevents pipelining approaches that avoid reading a whole file
into memory.
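
One way to keep the pipelining property while still splitting on all Unicode line breaks is to apply str.splitlines() to bounded chunks. This is only a sketch under assumptions: iter_unicode_lines is a hypothetical name, and the stream is assumed to be a text stream whose read(n) returns str and "" at end of file.

```python
import io

def iter_unicode_lines(stream, chunk_size=4096):
    """Yield lines split on all Unicode line breaks, reading in chunks.

    Hypothetical helper: only one chunk plus one partial line is held
    in memory at a time.
    """
    pending = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        lines = pending.splitlines(keepends=True)
        # Hold back the last piece: it may be an incomplete line, or end
        # in "\r" whose matching "\n" arrives with the next chunk.
        pending = lines.pop() if lines else ""
        yield from lines
    if pending:
        yield from pending.splitlines(keepends=True)

text = "a\nb\u2029c\r\nd"
# newline="" asks StringIO not to translate line endings.
result = list(iter_unicode_lines(io.StringIO(text, newline=""), chunk_size=2))
print(result)
```

With a tiny chunk_size the "\r\n" pair is deliberately split across two reads, which exercises the hold-back logic.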

While removing the validity check from readlines() completely is
questionable (the readrecords() approach mentioned in the tracker issue
would still be better there), loosening the validity check to be based
on Py_UNICODE_ISLINEBREAK seems a bit more feasible. (I'd still call it
a feature request rather than a bug though).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From mal at  Thu Aug  6 12:40:09 2009
From: mal at (M.-A. Lemburg)
Date: Thu, 06 Aug 2009 12:40:09 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Nick Coghlan wrote:
> Antoine Pitrou wrote:
>> M.-A. Lemburg <mal <at>> writes:
>>> Please file a bug report for this. f.readlines() (or rather
>>> the io layer) should be using Py_UNICODE_ISLINEBREAK(ch)
>>> for detecting line break characters.
>> Actually, no. It has been designed from the start to only recognize the
>> "standard" line break representations found in common formats/protocols (CR, LF
>> and CR+LF).
>> People wanting to split on arbitrary unicode line breaks should use
>> str.splitlines().
> The fairly long-standing RFE relating to an arbitrarily selectable
> newline separator seems relevant here:
> As with the discussion there, the problem with using str.splitlines is
> that it prevents pipelining approaches that avoid reading a whole file
> into memory.
> While removing the validity check from readlines() completely is
> questionable (the readrecords() approach mentioned in the tracker issue
> would still be better there), loosening the validity check to be based
> on Py_UNICODE_IS_LINEBREAK seems a bit more feasible. (I'd still call it
> a feature requests rather than a bug though).

I've had a look at the io implementation: this appears to be
based on the universal newline support idea which addresses
only a fixed set of "new line" character combinations and is
not as straight forward to extend to support all Unicode
line break characters as I thought.

What I don't understand is why the io layer tries to reinvent
the wheel here instead of just using the codec's .readline()
method - which *does* use .splitlines() and has full support
for all Unicode line break characters (including the CRLF
combination).

Marc-Andre Lemburg

From mal at  Thu Aug  6 12:46:44 2009
From: mal at (M.-A. Lemburg)
Date: Thu, 06 Aug 2009 12:46:44 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

M.-A. Lemburg wrote:
> Nick Coghlan wrote:
>> Antoine Pitrou wrote:
>>> M.-A. Lemburg <mal <at>> writes:
>>>> Please file a bug report for this. f.readlines() (or rather
>>>> the io layer) should be using Py_UNICODE_ISLINEBREAK(ch)
>>>> for detecting line break characters.
>>> Actually, no. It has been designed from the start to only recognize the
>>> "standard" line break representations found in common formats/protocols (CR, LF
>>> and CR+LF).
>>> People wanting to split on arbitrary unicode line breaks should use
>>> str.splitlines().
>> The fairly long-standing RFE relating to an arbitrarily selectable
>> newline separator seems relevant here:
>> As with the discussion there, the problem with using str.splitlines is
>> that it prevents pipelining approaches that avoid reading a whole file
>> into memory.
>> While removing the validity check from readlines() completely is
>> questionable (the readrecords() approach mentioned in the tracker issue
>> would still be better there), loosening the validity check to be based
>> on Py_UNICODE_IS_LINEBREAK seems a bit more feasible. (I'd still call it
>> a feature requests rather than a bug though).
> I've had a look at the io implementation: this appears to be
> based on the universal newline support idea which addresses
> only a fixed set of "new line" character combinations and is
> not as straight forward to extend to support all Unicode
> line break characters as I thought.
> What I don't understand is why the io layer tries to reinvent
> the wheel here instead of just using the codec's .readline()
> method - which *does* use .splitlines() and has full support
> for all Unicode line break characters (including the CRLF
> combination).

... and because of this, the feature is already available if
you use codecs.open() instead of the built-in open():

import codecs

with codecs.open("x.txt", "w", encoding='utf-8') as f:
  f.write("a\nb\u2029c\n")

with codecs.open("x.txt", "r", encoding='utf-8') as f:
  n = 1
  for l in f.readlines():
     print(n, repr(l))
     n += 1

This prints:

1 'a\n'
2 'b\u2029'
3 'c\n'

Marc-Andre Lemburg

From ncoghlan at  Thu Aug  6 12:47:45 2009
From: ncoghlan at (Nick Coghlan)
Date: Thu, 06 Aug 2009 20:47:45 +1000
Subject: [Python-Dev] (try-except) conditional expression similar to
 (if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>	<592033416A4F4C20A60ACB28E40F84DF@RaymondLaptop1>
Message-ID: <>

P.J. Eby wrote:
> At 05:59 PM 8/5/2009 -0700, Raymond Hettinger wrote:
>> [Jeffrey E. McAninch, PhD]
>>> I very often want something like a try-except conditional expression
>>> similar
>>> to the if-else conditional.
>>> An example of the proposed syntax might be:
>>>    x = float(string) except float('nan')
>>> or possibly
>>>    x = float(string) except ValueError float('nan')
>> +1 I've long wanted something like this.
>> One possible spelling is:
>>   x = float(string) except ValueError else float('nan')
> I think 'as' would be better than 'else', since 'else' has a different
> meaning in try/except statements, e.g.:
>    x = float(string) except ValueError, TypeError as float('nan')
> Of course, this is a different meaning of 'as', too, but it's not "as"
> contradictory, IMO...  ;-)

(We're probably well into python-ideas territory at this point, but I'll
keep things where the thread started for now)

The basic idea appears sound to me as well. I suspect finding an
acceptable syntax is going to be the sticking point.

Breaking the problem down, we have three things we want to separate:

1. The expression that may raise the exception
2. The expression defining the exceptions to be caught
3. The expression to be used if the exception actually is caught

From there it is possible to come up with all sorts of variants.

Option 1:

Change the relative order of the clauses by putting the exception
definition last:

  x = float(string) except float('nan') if ValueError
  op(float(string) except float('nan') if ValueError)

I actually like this one (that's why I listed it first). It gets the
clauses out of order relative to the statement, but the meaning still
seems pretty obvious to me.

Option 2:

Follow the lambda model and allow a colon inside this form of expression:

  x = float(string) except ValueError: float('nan')
  op(float(string) except ValueError: float('nan'))

This has the virtue of closely matching the statement syntax, but
embedding colons inside expressions is somewhat ugly. Yes, lambda
already does it, but lambda can hardly be put forward as a paragon of
beauty.

Option 3a/3b:

Raymond's except-else suggestion:

  x = float(string) except ValueError else float('nan')
  op(float(string) except ValueError else float('nan'))

This has the problem of inverting the sense of the else clause relative
to the statement form (where the else clause is executed only if no
exception occurs)

A couple of extra keywords would get the sense correct again, but I'm
not sure the parser could cope with it and it is rather verbose (I much
prefer option 1 to this idea):

  x = float(string) if not except ValueError else float('nan')
  op(float(string) if not except ValueError else float('nan'))

Option 4:

PJE's except-as suggestion:

  x = float(string) except ValueError as float('nan')
  op(float(string) except ValueError as float('nan'))

Given that we now use "except ValueError as ex" in exception statements,
the above strikes me as a really confusing idea.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From solipsis at  Thu Aug  6 13:01:42 2009
From: solipsis at (Antoine Pitrou)
Date: Thu, 6 Aug 2009 11:01:42 +0000 (UTC)
Subject: [Python-Dev] PEP 385: the eol-type issue
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
	<> <>
Message-ID: <>

M.-A. Lemburg <mal <at>> writes:
> What I don't understand is why the io layer tries to reinvent
> the wheel here instead of just using the codec's .readline()
> method - which *does* use .splitlines() and has full support
> for all Unicode line break characters (including the CRLF
> combination).

As for the original Python implementation, the goal was probably to start from a
clean sheet. Besides, the new API has seek() and tell() as well. But I'm not
really qualified to say more -- I didn't participate in its design.

As for the C implementation, it had to be written from scratch anyway -- is pure Python and too slow. Deferring to str.splitlines() would
still have been possible but a bit wasteful since in C you can use buffers
directly.

(and, besides, when writing the C implementation we were concerned with exact
compatibility with the Python version -- including line break semantics)



From digitalxero at  Thu Aug  6 13:18:52 2009
From: digitalxero at (Dj Gilcrease)
Date: Thu, 6 Aug 2009 05:18:52 -0600
Subject: [Python-Dev] (try-except) conditional expression similar to
	(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Aug 6, 2009 at 4:47 AM, Nick Coghlan<ncoghlan at> wrote:
> Option 2:
>   x = float(string) except ValueError: float('nan')
>   op(float(string) except ValueError: float('nan'))
> This has the virtue of closely matching the statement syntax, but
> embedding colons inside expressions is somewhat ugly. Yes, lambda
> already does it, but lambda can hardly be put forward as a paragon of
> beauty.

+1 on this option as it resembles the standard try/except block enough
it would be a quick edit to convert it to one if later you realize you
need to catch more exceptions*

* I recommend NOT allowing multiple exceptions in this form eg
x = float(string)/var except ValueError, ZeroDivisionError, ...: float('nan')

as it will start to reduce readability quickly
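
For comparison, the statement form that such an expression would be converted to already handles several exceptions through a parenthesised tuple. A minimal sketch in today's syntax (the function name ratio_or_nan is hypothetical):

```python
import math

def ratio_or_nan(string, var):
    """Statement-form equivalent of the multi-exception case above."""
    try:
        return float(string) / var
    except (ValueError, ZeroDivisionError):
        # Either a malformed number or a zero divisor maps to NaN.
        return float('nan')

print(ratio_or_nan("4", 2))
print(math.isnan(ratio_or_nan("four", 2)))
print(math.isnan(ratio_or_nan("4", 0)))
```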

From solipsis at  Thu Aug  6 13:32:16 2009
From: solipsis at (Antoine Pitrou)
Date: Thu, 6 Aug 2009 11:32:16 +0000 (UTC)
Subject: [Python-Dev]
References: <>
Message-ID: <>

Raymond Hettinger <python <at>> writes:
> For example:
>    x = min(seq) except ValueError else 0     # default to zero for empty sequences

How about:
    x = min(seq) if seq else 0

Shorter and more readable ("except X else Y" isn't very logical).
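
One caveat worth illustrating: the truth-value test relies on the argument being a sized container. The sketch below (function names are hypothetical) contrasts it with the try/except form when given an empty iterator, which has no length and is therefore always true:

```python
def min_lbyl(seq, default=0):
    # "Look before you leap": relies on the argument's truth value.
    return min(seq) if seq else default

def min_eafp(iterable, default=0):
    # "Easier to ask forgiveness": relies on min() raising ValueError.
    try:
        return min(iterable)
    except ValueError:
        return default

print(min_lbyl([]))          # empty list is false, so the default is used
print(min_eafp(iter([])))    # empty iterator: ValueError caught, default used

# An empty iterator is still "true", so the conditional form calls min()
# on it and the ValueError propagates.
try:
    min_lbyl(iter([]))
    lbyl_raised = False
except ValueError:
    lbyl_raised = True
print(lbyl_raised)
```

For plain lists and tuples the two forms agree; they differ only for arguments without a meaningful truth value.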

>   sample_std_deviation = sqrt(sum(x - mu for x in seq) / (len(seq)-1)) except
> ZeroDivisionError else float('Inf')

Same transformation here.

I have to say that the original example:
    x = float(string) except ValueError else float('nan')

looks artificial. I don't see how it's adequate behaviour to return a NaN when
presented with a string which doesn't represent a float number.

Besides, all this is python-ideas material (and has probably already been
proposed before).



From mal at  Thu Aug  6 13:34:24 2009
From: mal at (M.-A. Lemburg)
Date: Thu, 06 Aug 2009 13:34:24 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Antoine Pitrou wrote:
> M.-A. Lemburg <mal <at>> writes:
>> What I don't understand is why the io layer tries to reinvent
>> the wheel here instead of just using the codec's .readline()
>> method - which *does* use .splitlines() and has full support
>> for all Unicode line break characters (including the CRLF
>> combination).
> As for the original Python implementation, the goal was probably to start from a
> clean sheet. Besides, the new API has seek() and tell() as well. But I'm not
> really qualified to say more -- I didn't participate in its design.
> As for the C implementation, it had to be written from scratch anyway --
> is pure Python and too slow. Deferring to str.splitlines() would
> still have been possible but a bit wasteful since in C you can use buffers
> directly.

Sure, but the code for line splitting is not really all that
complicated (see PyUnicode_Splitlines()), so could easily
be adapted to work on buffers directly.

Marc-Andre Lemburg

From catch-all at  Thu Aug  6 12:25:25 2009
From: catch-all at (Xavier Morel)
Date: Thu, 6 Aug 2009 12:25:25 +0200
Subject: [Python-Dev] (try-except) conditional expression similar to
	(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>

On 6 Aug 2009, at 00:22 , Jeff McAninch wrote:
> I'm new to this list, so please excuse me if this topic has been  
> discussed, but I didn't
> see anything similar in the archives.
> I very often want something like a try-except conditional expression  
> similar
> to the if-else conditional.
I fear this idea is soon going to extend to all compound statements  
one by one.

Wouldn't it be smarter to fix the issue once and for all by looking  
into making Python's compound statements (or even all statements  
without restrictions) expressions that can return values in the first  
place? Now I don't know if it's actually possible, but if it is the  
problem becomes solved not just for try:except: (and twice so for  
if:else:) but also for while:, for: (though that one's already served  
pretty well by comprehensions) and with:.

From python at  Thu Aug  6 13:39:58 2009
From: python at (MRAB)
Date: Thu, 06 Aug 2009 12:39:58 +0100
Subject: [Python-Dev] (try-except) conditional expression similar to
 (if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>	<592033416A4F4C20A60ACB28E40F84DF@RaymondLaptop1>	<>
Message-ID: <>

Nick Coghlan wrote:
> P.J. Eby wrote:
>> At 05:59 PM 8/5/2009 -0700, Raymond Hettinger wrote:
>>> [Jeffrey E. McAninch, PhD]
>>>> I very often want something like a try-except conditional expression
>>>> similar
>>>> to the if-else conditional.
>>>> An example of the proposed syntax might be:
>>>>    x = float(string) except float('nan')
>>>> or possibly
>>>>    x = float(string) except ValueError float('nan')
>>> +1 I've long wanted something like this.
>>> One possible spelling is:
>>>   x = float(string) except ValueError else float('nan')
>> I think 'as' would be better than 'else', since 'else' has a different
>> meaning in try/except statements, e.g.:
>>    x = float(string) except ValueError, TypeError as float('nan')
>> Of course, this is a different meaning of 'as', too, but it's not "as"
>> contradictory, IMO...  ;-)
> (We're probably well into python-ideas territory at this point, but I'll
> keep things where the thread started for now)
> The basic idea appears sound to me as well. I suspect finding an
> acceptable syntax is going to be the sticking point.
> Breaking the problem down, we have three things we want to separate:
> 1. The expression that may raise the exception
> 2. The expression defining the exceptions to be caught
> 3. The expression to be used if the exception actually is caught
> From there it is possible to come up with all sorts of variants.
> Option 1:
> Change the relative order of the clauses by putting the exception
> definition last:
>   x = float(string) except float('nan') if ValueError
>   op(float(string) except float('nan') if ValueError)
> I actually like this one (that's why I listed it first). It gets the
> clauses out of order relative to the statement, but the meaning still
> seems pretty obvious to me.
A further extension (if we need it):

     result = foo(arg) except float('inf') if ZeroDivisionError else float('nan')

The 'else' part handles any other exceptions (not necessarily a good idea!).

or:

     result = foo(arg) except float('inf') if ZeroDivisionError else float('nan') if ValueError

Handles a number of different exceptions.

> Option 2:
> Follow the lambda model and allow a colon inside this form of expression:
>   x = float(string) except ValueError: float('nan')
>   op(float(string) except ValueError: float('nan'))
> This has the virtue of closely matching the statement syntax, but
> embedding colons inside expressions is somewhat ugly. Yes, lambda
> already does it, but lambda can hardly be put forward as a paragon of
> beauty.
A colon is also used in a dict literal.

> Option 3a/3b:
> Raymond's except-else suggestion:
>   x = float(string) except ValueError else float('nan')
>   op(float(string) except ValueError else float('nan'))

From solipsis at  Thu Aug  6 13:42:03 2009
From: solipsis at (Antoine Pitrou)
Date: Thu, 6 Aug 2009 11:42:03 +0000 (UTC)
Subject: [Python-Dev] PEP 385: the eol-type issue
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

M.-A. Lemburg <mal <at>> writes:
> Sure, but the code for line splitting is not really all that
> complicated (see PyUnicode_Splitlines()), so could easily
> be adapted to work on buffers directly.

Certainly indeed. It all comes down to compatibility with the original
(PEP 3116 itself is vague on the subject, but it didn't come to me to question
the validity of the Python implementation, I admit)



From ilya.nikokoshev at  Thu Aug  6 14:03:12 2009
From: ilya.nikokoshev at (ilya)
Date: Thu, 6 Aug 2009 16:03:12 +0400
Subject: [Python-Dev] (try-except) conditional expression similar to
	(if-else) conditional (PEP 308)
Message-ID: <>

I took a look at the options 1 and 2:

    x = float(string) except float('nan') if ValueError
    y = float(string) except ValueError: float('nan')

and I think this can be done just as easily with existing syntax:

    x = try_1(float, string, except_ = float('nan'), if_ = ValueError)
    y = try_2(float, string, { ValueError: float('nan') })

Here's the full example:

----- example starts -----

def try_1(func, *args, except_ = None, if_ = None):
    try:
        return func(*args)
    except if_ as e:
        return except_

def try_2(func, *args):
    'The last argument is a dictionary {exception type: return value}.'
    dic = args[-1]
    try:
        return func(*args[:-1])
    except Exception as e:
        for k,v in dic.items():
            if isinstance(e, k):
                return v
        # No entry matched the exception, so let it propagate.
        raise

for string in ['5', 'five']:
    #   x = float(string) except float('nan') if ValueError
    x = try_1(float, string, except_ = float('nan'), if_ = ValueError)
    #   y = float(string) except ValueError: float('nan')
    y = try_2(float, string, { ValueError: float('nan') })
    print(x, y)

----- example ends -----
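
One difference from the proposed syntax is worth noting: in the helpers
above the fallback value is computed before the call is made, whether or
not an exception occurs. A minimal sketch of a lazy variant follows; the
names try_lazy, fallback, catch and expensive_default are made up for
illustration and are not part of any proposal in this thread.

```python
def try_lazy(func, *args, fallback, catch=Exception):
    """Call func(*args); only if `catch` is raised, call fallback()."""
    try:
        return func(*args)
    except catch:
        return fallback()


calls = []  # records how often the fallback was actually evaluated


def expensive_default():
    calls.append(1)
    return float('nan')


x = try_lazy(float, '5', fallback=expensive_default, catch=ValueError)
y = try_lazy(float, 'five', fallback=expensive_default, catch=ValueError)
print(x, y, len(calls))
```

The fallback runs only for the failing conversion, which is closer to
how an except clause behaves in the statement form.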

As a side note, if I just subscribed to python-dev, is it possible to
quote an old email? Below is my manual cut-and-paste quote:

---------- my quote --------------

Nick Coghlan wrote:
> P.J. Eby wrote:
>> At 05:59 PM 8/5/2009 -0700, Raymond Hettinger wrote:
>>> [Jeffrey E. McAninch, PhD]
>>>> I very often want something like a try-except conditional expression
>>>> similar
>>>> to the if-else conditional.
>>>> An example of the proposed syntax might be:
>>>>    x = float(string) except float('nan')
>>>> or possibly
>>>>    x = float(string) except ValueError float('nan')
>>> +1 I've long wanted something like this.
>>> One possible spelling is:
>>>   x = float(string) except ValueError else float('nan')
>> I think 'as' would be better than 'else', since 'else' has a different
>> meaning in try/except statements, e.g.:
>>    x = float(string) except ValueError, TypeError as float('nan')
>> Of course, this is a different meaning of 'as', too, but it's not "as"
>> contradictory, IMO...  ;-)
> (We're probably well into python-ideas territory at this point, but I'll
> keep things where the thread started for now)
> The basic idea appears sound to me as well. I suspect finding an
> acceptable syntax is going to be the sticking point.
> Breaking the problem down, we have three things we want to separate:
> 1. The expression that may raise the exception
> 2. The expression defining the exceptions to be caught
> 3. The expression to be used if the exception actually is caught
> From there it is possible to come up with all sorts of variants.
> Option 1:
> Change the relative order of the clauses by putting the exception
> definition last:
>   x = float(string) except float('nan') if ValueError
>   op(float(string) except float('nan') if ValueError)
> I actually like this one (that's why I listed it first). It gets the
> clauses out of order relative to the statement, but the meaning still
> seems pretty obvious to me.
A further extension (if we need it):

     result = foo(arg) except float('inf') if ZeroDivisionError else float('nan')

The 'else' part handles any other exceptions (not necessarily a good idea!).

or:

     result = foo(arg) except float('inf') if ZeroDivisionError else float('nan') if ValueError

Handles a number of different exceptions.

> Option 2:
> Follow the lambda model and allow a colon inside this form of expression:
>   x = float(string) except ValueError: float('nan')
>   op(float(string) except ValueError: float('nan'))
> This has the virtue of closely matching the statement syntax, but
> embedding colons inside expressions is somewhat ugly. Yes, lambda
> already does it, but lambda can hardly be put forward as a paragon of
> beauty.
A colon is also used in a dict literal.

> Option 3a/3b:
> Raymond's except-else suggestion:
>   x = float(string) except ValueError else float('nan')
>   op(float(string) except ValueError else float('nan'))

From eric.pruitt at  Thu Aug  6 15:39:59 2009
From: eric.pruitt at (Eric Pruitt)
Date: Thu, 6 Aug 2009 08:39:59 -0500
Subject: [Python-Dev] (try-except) conditional expression similar to
	(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>

What about catching specific error numbers? Maybe an option so that
the dictionary elements can also be dictionaries with integers as the
keys:

    filedata = try_3(open, randomfile, except_ = { IOError: {2: None} })

If it isn't found in the dictionary, then we raise the error.
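
A rough sketch of what such a helper might look like, assuming the
mapping has the shape {exception type: {errno: return value}}. try_3 is
the hypothetical name used above, and missing_path is a made-up path
that is known not to exist.

```python
import errno
import os
import tempfile


def try_3(func, *args, except_):
    """except_ maps exception type -> {errno: value to return}."""
    try:
        return func(*args)
    except Exception as e:
        for exc_type, by_errno in except_.items():
            # Only handle the error if both the type and the errno match.
            if isinstance(e, exc_type) and getattr(e, 'errno', None) in by_errno:
                return by_errno[e.errno]
        # Not found in the dictionary: re-raise the original error.
        raise


# A file name inside a fresh temporary directory cannot exist yet.
missing_path = os.path.join(tempfile.mkdtemp(), 'missing.txt')
filedata = try_3(open, missing_path, except_={IOError: {errno.ENOENT: None}})
print(filedata)
```

Using errno.ENOENT instead of the literal 2 keeps the intent readable.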

On Thu, Aug 6, 2009 at 07:03, ilya<ilya.nikokoshev at> wrote:
> I took a look at the options 1 and 2:
>     x = float(string) except float('nan') if ValueError
>     y = float(string) except ValueError: float('nan')
> and I think this can be done just as easily with existing syntax:
>     x = try_1(float, string, except_ = float('nan'), if_ = ValueError)
>     y = try_2(float, string, { ValueError: float('nan') })
> Here's the full example:
> ----- example starts -----
> def try_1(func, *args, except_ = None, if_ = None):
>     try:
>         return func(*args)
>     except if_ as e:
>         return except_
> def try_2(func, *args):
>     'The last argument is a dictionary {exception type: return value}.'
>     dic = args[-1]
>     try:
>         return func(*args[:-1])
>     except Exception as e:
>         for k,v in dic.items():
>             if isinstance(e, k):
>                 return v
>         raise
> for string in ['5', 'five']:
>     #   x = float(string) except float('nan') if ValueError
>     x = try_1(float, string, except_ = float('nan'), if_ = ValueError)
>     #   y = float(string) except ValueError: float('nan')
>     y = try_2(float, string, { ValueError: float('nan') })
>     print(x, y)
> ----- example ends -----
> As a side note, if I just subscribed to python-dev, is it possible to
> quote an old email? Below is my manual cut-and-paste quote:
> ---------- my quote --------------
> Nick Coghlan wrote:
>> P.J. Eby wrote:
>>> At 05:59 PM 8/5/2009 -0700, Raymond Hettinger wrote:
>>>> [Jeffrey E. McAninch, PhD]
>>>>> I very often want something like a try-except conditional expression
>>>>> similar
>>>>> to the if-else conditional.
>>>>> An example of the proposed syntax might be:
>>>>>    x = float(string) except float('nan')
>>>>> or possibly
>>>>>    x = float(string) except ValueError float('nan')
>>>> +1 I've long wanted something like this.
>>>> One possible spelling is:
>>>>   x = float(string) except ValueError else float('nan')
>>> I think 'as' would be better than 'else', since 'else' has a different
>>> meaning in try/except statements, e.g.:
>>>    x = float(string) except ValueError, TypeError as float('nan')
>>> Of course, this is a different meaning of 'as', too, but it's not "as"
>>> contradictory, IMO...  ;-)
>> (We're probably well into python-ideas territory at this point, but I'll
>> keep things where the thread started for now)
>> The basic idea appears sound to me as well. I suspect finding an
>> acceptable syntax is going to be the sticking point.
>> Breaking the problem down, we have three things we want to separate:
>> 1. The expression that may raise the exception
>> 2. The expression defining the exceptions to be caught
>> 3. The expression to be used if the exception actually is caught
>> From there it is possible to come up with all sorts of variants.
>> Option 1:
>> Change the relative order of the clauses by putting the exception
>> definition last:
>>   x = float(string) except float('nan') if ValueError
>>   op(float(string) except float('nan') if ValueError)
>> I actually like this one (that's why I listed it first). It gets the
>> clauses out of order relative to the statement, but the meaning still
>> seems pretty obvious to me.
> A further extension (if we need it):
>      result = foo(arg) except float('inf') if ZeroDivisionError else float('nan')
> The 'else' part handles any other exceptions (not necessarily a good idea!).
> or:
>      result = foo(arg) except float('inf') if ZeroDivisionError else float('nan') if ValueError
> Handles a number of different exceptions.
>> Option 2:
>> Follow the lambda model and allow a colon inside this form of expression:
>>   x = float(string) except ValueError: float('nan')
>>   op(float(string) except ValueError: float('nan'))
>> This has the virtue of closely matching the statement syntax, but
>> embedding colons inside expressions is somewhat ugly. Yes, lambda
>> already does it, but lambda can hardly be put forward as a paragon of
>> beauty.
> A colon is also used in a dict literal.
>> Option 3a/3b:
>> Raymond's except-else suggestion:
>>   x = float(string) except ValueError else float('nan')
>>   op(float(string) except ValueError else float('nan'))
> [snip]
> -1
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at
> Unsubscribe:

From cool-rr at  Thu Aug  6 15:44:26 2009
From: cool-rr at (cool-RR)
Date: Thu, 6 Aug 2009 15:44:26 +0200
Subject: [Python-Dev] Tkinter has many files
Message-ID: <>

Hello python-dev!

I'm a Python programmer, but this is the first time I'm posting on
python-dev, and I am not familiar at all with how the Python implementation
works -- so this post may be way off.

I've recently released a Python application, PythonTurtle,
which is packaged using py2exe and InnoSetup. Due to the fact that my
program needs to give the user a full Python shell, I've made py2exe package
the entire Python standard library with my application. What I've noticed
when I did that is that Tkinter has *a lot* of files. This is a bit
inconvenient for several reasons, the main one being that the installer for
PythonTurtle takes a long time to copy all of those little files. (I think
the reason for the slowness is not the weight of the files, but the fact
that there are so many of them.) There are also other reasons why it's
annoying: Ohloh thinks my project is "Mostly written in Tcl," and git-gui
gave me trouble for trying to commit so many files.
Do you think it will be a good thing to package all of these Tkinter files
into one big file (or several big files)?

Best Wishes,
Ram Rachum.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

From fuzzyman at  Thu Aug  6 15:59:01 2009
From: fuzzyman at (Michael Foord)
Date: Thu, 06 Aug 2009 14:59:01 +0100
Subject: [Python-Dev] Tkinter has many files
In-Reply-To: <>
References: <>
Message-ID: <>

cool-RR wrote:
> Hello python-dev!
> I'm a Python programmer, but this is the first time I'm posting on 
> python-dev, and I am not familiar at all with how the Python 
> implementation works -- so this post may be way off.
> I've recently released a Python application, PythonTurtle 
> <>, which is packaged using py2exe and 
> InnoSetup. Due to the fact that my program needs to give the user a 
> full Python shell, I've made py2exe package the entire Python standard 
> library with my application. What I've noticed when I did that is that 
> Tkinter has /a lot/ of files. This is a bit inconvenient for several 
> reasons, the main one being that the installer for PythonTurtle takes 
> a long time to copy all of those little files. (I think the reason for 
> the slowness is not the weight of the files, but the fact that there 
> are so many of them.) There are also other reasons why it's annoying: 
> Ohloh thinks my project is "Mostly written in Tcl," and git-gui gave 
> me trouble for trying to commit so many files.
> Do you think it will be a good thing to package all of these Tkinter 
> files into one big file (or several big files)?

Do you mean the .tcl files? Tkinter is a Python wrapper around Tcl - 
which is a separate project / programming environment that includes the 
Tk GUI. Python is not in a position to modify or repackage those files.

Why do you need to keep the whole Python distribution under version 
control? Isn't all you need a script to *generate* the py2exe'd output 
from an *installed* Python? This is the approach I take with Movable 
Python which does something very similar.

All the best,

Michael Foord

> Best Wishes,
> Ram Rachum.


From cool-rr at  Thu Aug  6 16:10:48 2009
From: cool-rr at (cool-RR)
Date: Thu, 6 Aug 2009 16:10:48 +0200
Subject: [Python-Dev] Tkinter has many files
In-Reply-To: <>
References: <> 
Message-ID: <>

> Why do you need to keep the whole Python distribution under version
> control? Isn't all you need a script to *generate* the py2exe'd output from
> an *installed* Python? This is the approach I take with Movable Python which
> does something very similar.
Never mind the source control issue, it's minor.

If it's not possible to minimize the number of files there, I guess I'll
have to live with it.

Ram Rachum

From mcaninch at  Thu Aug  6 16:36:45 2009
From: mcaninch at (Jeff McAninch)
Date: Thu, 06 Aug 2009 08:36:45 -0600
Subject: [Python-Dev] (try-except) conditional expression similar to
 (if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>	<592033416A4F4C20A60ACB28E40F84DF@RaymondLaptop1>
Message-ID: <>

Nick Coghlan wrote:
> Option 1:
> Change the relative order of the clauses by putting the exception
> definition last:
>   x = float(string) except float('nan') if ValueError
>   op(float(string) except float('nan') if ValueError)
> I actually like this one (that's why I listed it first). It gets the
> clauses out of order relative to the statement, but the meaning still
> seems pretty obvious to me.
Since I don't know the parser coding, I won't comment on the relative
implementability (implementableness?) of the syntax options that Nick,
P.J. and Raymond suggested.  But all seem readable and debuggable.

Nick's option 1 seems like it might be the most understandable to 
a Python novice.

Would the full syntax include multiple Exceptions after the "if"?


Jeffrey E. McAninch, PhD
Physicist, X-2-IFD
Los Alamos National Laboratory
Phone: 505-667-0374
Email: mcaninch at

From rowen at  Thu Aug  6 21:55:10 2009
From: rowen at (Russell E. Owen)
Date: Thu, 06 Aug 2009 12:55:10 -0700
Subject: [Python-Dev] (try-except) conditional expression similar to
	(if-else) conditional (PEP 308)
References: <>
Message-ID: <>

In article <D28975E8-6706-4515-9C9E-FB7F90775CA5 at>,
 Xavier Morel <catch-all at> wrote:

> On 6 Aug 2009, at 00:22 , Jeff McAninch wrote:
> > I'm new to this list, so please excuse me if this topic has been  
> > discussed, but I didn't
> > see anything similar in the archives.
> >
> > I very often want something like a try-except conditional expression  
> > similar
> > to the if-else conditional.
> I fear this idea is soon going to extend to all compound statements  
> one by one.
> Wouldn't it be smarter to fix the issue once and for all by looking  
> into making Python's compound statements (or even all statements  
> without restrictions) expressions that can return values in the first  
> place? Now I don't know if it's actually possible, but if it is the  
> problem becomes solved not just for try:except: (and twice so for  
> if:else:) but also for while:, for: (though that one's already served  
> pretty well by comprehensions) and with:.

I like this idea a lot.

-- Russell

From dinov at  Thu Aug  6 23:55:47 2009
From: dinov at (Dino Viehland)
Date: Thu, 6 Aug 2009 21:55:47 +0000
Subject: [Python-Dev] (try-except) conditional expression similar to
 (if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>

On option 1 is this legal then?

x = float(string) except float('nan') if some_check() else float('inf') if ValueError

-----Original Message-----
From: at [ at] On Behalf Of Nick Coghlan
Sent: Thursday, August 06, 2009 3:48 AM
To: P.J. Eby
Cc: python-dev at; Jeff McAninch
Subject: Re: [Python-Dev] (try-except) conditional expression similar to (if-else) conditional (PEP 308)

P.J. Eby wrote:
> At 05:59 PM 8/5/2009 -0700, Raymond Hettinger wrote:
>> [Jeffrey E. McAninch, PhD]
>>> I very often want something like a try-except conditional expression
>>> similar to the if-else conditional.
>>> An example of the proposed syntax might be:
>>>    x = float(string) except float('nan')
>>> or possibly
>>>    x = float(string) except ValueError float('nan')
>> +1 I've long wanted something like this.
>> One possible spelling is:
>>   x = float(string) except ValueError else float('nan')
> I think 'as' would be better than 'else', since 'else' has a different
> meaning in try/except statements, e.g.:
>    x = float(string) except ValueError, TypeError as float('nan')
> Of course, this is a different meaning of 'as', too, but it's not "as"
> contradictory, IMO...  ;-)

(We're probably well into python-ideas territory at this point, but I'll keep things where the thread started for now)

The basic idea appears sound to me as well. I suspect finding an acceptable syntax is going to be the sticking point.

Breaking the problem down, we have three things we want to separate:

1. The expression that may raise the exception
2. The expression defining the exceptions to be caught
3. The expression to be used if the exception actually is caught

From there it is possible to come up with all sorts of variants.

Option 1:

Change the relative order of the clauses by putting the exception definition last:

  x = float(string) except float('nan') if ValueError
  op(float(string) except float('nan') if ValueError)

I actually like this one (that's why I listed it first). It gets the clauses out of order relative to the statement, but the meaning still seems pretty obvious to me.

Option 2:

Follow the lambda model and allow a colon inside this form of expression:

  x = float(string) except ValueError: float('nan')
  op(float(string) except ValueError: float('nan'))

This has the virtue of closely matching the statement syntax, but embedding colons inside expressions is somewhat ugly. Yes, lambda already does it, but lambda can hardly be put forward as a paragon of beauty.

Option 3a/3b:

Raymond's except-else suggestion:

  x = float(string) except ValueError else float('nan')
  op(float(string) except ValueError else float('nan'))

This has the problem of inverting the sense of the else clause relative to the statement form (where the else clause is executed only if no exception occurs)

A couple of extra keywords would get the sense correct again, but I'm not sure the parser could cope with it and it is rather verbose (I much prefer option 1 to this idea):

  x = float(string) if not except ValueError else float('nan')
  op(float(string) if not except ValueError else float('nan'))

Option 4:

PJE's except-as suggestion:

  x = float(string) except ValueError as float('nan')
  op(float(string) except ValueError as float('nan'))

Given that we now use "except ValueError as ex" in exception statements, the above strikes me as a really confusing idea.
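
For reference, all of the spellings above are meant to abbreviate the
same statement form. A small sketch in existing syntax, where to_float
and op are hypothetical stand-ins (op for whatever function consumes
the value):

```python
import math


def to_float(string):
    # Statement equivalent of: float(string) except ValueError: float('nan')
    try:
        return float(string)
    except ValueError:
        return float('nan')


def op(value):
    # Hypothetical consumer of the converted value.
    return value * 2


x = to_float('five')
doubled = op(to_float('5'))
print(math.isnan(x), doubled)
```

The cost of the statement form is the extra named function, or a
temporary variable, wherever the value is needed inside an expression.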


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From kristjan at  Thu Aug  6 22:56:11 2009
From: kristjan at (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=)
Date: Thu, 6 Aug 2009 20:56:11 +0000
Subject: [Python-Dev] issue 6654
Message-ID: <>

I added
I also put a note to python-ideas but have had no response yet.  Any comments?
Here's the summary:

I've created on Rietveld:

By passing the "path" component of the xmlrpc request to the dispatch
method, it becomes possible to dispatch differently according to this.
This patch provides that addition.

Additionally, it provides a MultiPathXMLRPCDispatcher mixin class and a
MultiPathXMLRPCServer that uses it, to have multiple dispatchers for
different paths.

This allows a single server port to serve different XMLRPC servers as
differentiated by the HTTP path.  A test is also provided.
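
A sketch of how such a server might be set up. It assumes the
interface found in xmlrpc.server in later Python 3 releases
(MultiPathXMLRPCServer.add_dispatcher and an empty rpc_paths on the
request handler); the patch under review may differ in detail. The
paths "/math" and "/text" and the class name AnyPathHandler are made
up for illustration.

```python
from xmlrpc.server import (
    MultiPathXMLRPCServer,
    SimpleXMLRPCDispatcher,
    SimpleXMLRPCRequestHandler,
)


class AnyPathHandler(SimpleXMLRPCRequestHandler):
    # An empty rpc_paths makes the handler accept every HTTP path,
    # leaving the choice of dispatcher to the server.
    rpc_paths = ()


def build_server(address=('localhost', 0)):
    # bind_and_activate=False: build the object without opening a port.
    server = MultiPathXMLRPCServer(
        address,
        requestHandler=AnyPathHandler,
        logRequests=False,
        bind_and_activate=False,
    )
    # One independent dispatcher per HTTP path.
    math_api = server.add_dispatcher('/math', SimpleXMLRPCDispatcher())
    math_api.register_function(lambda a, b: a + b, 'add')
    text_api = server.add_dispatcher('/text', SimpleXMLRPCDispatcher())
    text_api.register_function(str.upper, 'upper')
    return server
```

To actually serve requests one would bind and activate the server and
call serve_forever(); a client would then pick an API by URL path, for
example by connecting to the "/math" path on the same port.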


From nyamatongwe at  Fri Aug  7 00:10:28 2009
From: nyamatongwe at (Neil Hodgson)
Date: Fri, 7 Aug 2009 08:10:28 +1000
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>
	<> <>
	<> <>
Message-ID: <>

M.-A. Lemburg:

> ... and because of this, the feature is already available if
> you use instead of the built-in open():

   So should I not add an issue for the basic open because
should be used for this case?


From python at  Fri Aug  7 01:33:49 2009
From: python at (MRAB)
Date: Fri, 07 Aug 2009 00:33:49 +0100
Subject: [Python-Dev] (try-except) conditional expression similar to
 (if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>	<592033416A4F4C20A60ACB28E40F84DF@RaymondLaptop1>	<>	<>
Message-ID: <>

Dino Viehland wrote:
> On option 1 is this legal then?
> x = float(string) except float('nan') if some_check() else float('inf') if ValueError
Well, is this legal?

     try:
         x = float(string)
     except some_check():
         x = float('nan')
     except ValueError:
         x = float('inf')

In other words, some_check() returns an exception _class_.

 >>> def get_exception():
     return ValueError

 >>> try:
     x = float("")
except get_exception():
     print "not a float"

not a float

From dinov at  Fri Aug  7 02:01:23 2009
From: dinov at (Dino Viehland)
Date: Fri, 7 Aug 2009 00:01:23 +0000
Subject: [Python-Dev] (try-except) conditional expression similar to
 (if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>

MRAB wrote:
> Dino Viehland wrote:
> > On option 1 is this legal then?
> >
> > x = float(string) except float('nan') if some_check() else float('inf') if
> ValueError
> >
> Well, is this legal?
>      try:
>          x = float(string)
>      except some_check():
>          x = float('nan')
>      except ValueError:
>          x = float('inf')

I was thinking this would be equal to:

x = float(string) except (float('nan') if some_check() else float('inf')) if ValueError

From python at  Fri Aug  7 02:22:00 2009
From: python at (MRAB)
Date: Fri, 07 Aug 2009 01:22:00 +0100
Subject: [Python-Dev] (try-except) conditional expression similar to
 (if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>	<592033416A4F4C20A60ACB28E40F84DF@RaymondLaptop1>	<>	<>	<>
Message-ID: <>

Dino Viehland wrote:
> MRAB wrote:
>> Dino Viehland wrote:
>>> On option 1 is this legal then?
>>> x = float(string) except float('nan') if some_check() else float('inf') if
>> ValueError
>> Well, is this legal?
>>      try:
>>          x = float(string)
>>      except some_check():
>>          x = float('nan')
>>      except ValueError:
>>          x = float('inf')
> I was thinking this would be equal to:
> x = float(string) except (float('nan') if some_check() else float('inf')) if ValueError
I suppose it depends on the precedence of 'x except y if z' vs 'x if y
else z'.
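
To make the two readings concrete, here is a sketch of each in
existing syntax. The helper names are made up; some_check is split
into two hypothetical functions because the readings need it to return
different things (a boolean in one, an exception class in the other).

```python
def some_check_flag():
    # Hypothetical check returning a boolean (first reading).
    return False


def some_check_class():
    # Hypothetical check returning an exception class (second reading).
    return TypeError


def conditional_fallback(string):
    # x = float(string) except (nan if some_check() else inf) if ValueError
    try:
        return float(string)
    except ValueError:
        return float('nan') if some_check_flag() else float('inf')


def chained_handlers(string):
    # x = float(string) except nan if some_check() else (inf if ValueError)
    try:
        return float(string)
    except some_check_class():
        return float('nan')
    except ValueError:
        return float('inf')


print(conditional_fallback('five'), chained_handlers('five'))
```

Both give inf for 'five' here, but they differ for an input such as
None: the first lets the TypeError propagate, the second returns nan.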

From python at  Fri Aug  7 02:36:34 2009
From: python at (MRAB)
Date: Fri, 07 Aug 2009 01:36:34 +0100
Subject: [Python-Dev] (try-except) conditional expression similar
 to	(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>	<>
Message-ID: <>

Russell E. Owen wrote:
> In article <D28975E8-6706-4515-9C9E-FB7F90775CA5 at>,
>  Xavier Morel <catch-all at> wrote:
>> On 6 Aug 2009, at 00:22 , Jeff McAninch wrote:
>>> I'm new to this list, so please excuse me if this topic has been  
>>> discussed, but I didn't
>>> see anything similar in the archives.
>>> I very often want something like a try-except conditional expression  
>>> similar
>>> to the if-else conditional.
>> I fear this idea is soon going to extend to all compound statements  
>> one by one.
>> Wouldn't it be smarter to fix the issue once and for all by looking  
>> into making Python's compound statements (or even all statements  
>> without restrictions) expressions that can return values in the first  
>> place? Now I don't know if it's actually possible, but if it is the  
>> problem becomes solved not just for try:except: (and twice so for  
>> if:else:) but also for while:, for: (though that one's already served  
>> pretty well by comprehensions) and with:.
> I like this idea a lot.
For some reason this kind of reminds me of BCPL.

A function definition looked like:

     LET func_name(arg1, arg2) = expression

so, strictly speaking, no multiline functions.

However, there was also the VALOF ... RESULTIS ... block.

In Python, the 'return' statement provides the result of a function; in
BCPL, the 'RESULTIS' statement provided the result of the VALOF block,
which was called from within an expression, like:

     LET foo(...) = VALOF
         RESULTIS expression

From tjreedy at  Fri Aug  7 06:28:52 2009
From: tjreedy at (Terry Reedy)
Date: Fri, 07 Aug 2009 13:28:52 +0900
Subject: [Python-Dev] Tkinter has many files
In-Reply-To: <>
References: <>
Message-ID: <h5gai8$ij9$>

cool-RR wrote:
> Hello python-dev!
> I'm a Python programmer, but this is the first time I'm posting on 
> python-dev, and I am not familiar at all with how the Python 
> implementation works -- so this post may be way off.
> I've recently released a Python application, PythonTurtle 
> <>, which is packaged using py2exe and InnoSetup. 
> Due to the fact that my program needs to give the user a full Python 
> shell, I've made py2exe package the entire Python standard library with 
> my application.

I really think you should just make your app sit on top of a standard Python 
installation. The current Windows installers work well. Just decide 
which versions you are willing to support. The usual reasons for 
bundling, to control the versions of multiple 3rd-party libraries, do 
not seem to apply.

From mal at  Fri Aug  7 10:31:01 2009
From: mal at (M.-A. Lemburg)
Date: Fri, 07 Aug 2009 10:31:01 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
	<>	<>
	<>	<>
Message-ID: <>

Neil Hodgson wrote:
> M.-A. Lemburg:
>> ... and because of this, the feature is already available if
>> you use instead of the built-in open():
>    So should I not add an issue for the basic open because
> should be used for this case?

Like Antoine mentioned: Using and .readline()
is about 20-30 times slower than open().

This is mainly due to the fact that the codec's .readline()
method is implemented in pure Python and does its own

IMHO, it would be a lot better to add full Unicode support
for line breaks to the io layer. Given that the code for the
complicated handling of the CRLF combination is already there,
it's not difficult to add support for the remaining line break
characters.

The implementation could reuse the Bloom filter approach
used in unicodeobject.c to make this very fast.

BTW: I'm not sure why the io layer records the line endings
it has seen. This makes processing more complicated for no
apparent reason. In the few cases where you might need this
(I don't see any), you could just as well scan the lines
in a quick loop using Python.
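
A short illustration of the difference under discussion, as a sketch
only: the sample string is made up, and the comparison uses
str.splitlines() against a TextIOWrapper in universal-newlines mode.

```python
import io

# One sample with four terminators: U+2028, CRLF, NEL (U+0085) and LF.
text = "one\u2028two\r\nthree\x85four\n"

# str.splitlines() breaks on every Unicode line boundary.
unicode_lines = text.splitlines()

# The io layer with newline=None only recognises \n, \r and \r\n.
wrapper = io.TextIOWrapper(
    io.BytesIO(text.encode("utf-8")), encoding="utf-8", newline=None
)
io_lines = wrapper.readlines()

print(len(unicode_lines), "lines via str.splitlines()")
print(len(io_lines), "lines via the io layer")
# The line endings the io layer recorded while reading:
print(wrapper.newlines)
```

The same input yields four lines one way and two the other, which is
the inconsistency that full line-break support in the io layer would
remove.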

Marc-Andre Lemburg


From fuzzyman at  Fri Aug  7 12:17:24 2009
From: fuzzyman at (Michael Foord)
Date: Fri, 07 Aug 2009 11:17:24 +0100
Subject: [Python-Dev] Tkinter has many files
In-Reply-To: <h5gai8$ij9$>
References: <>
Message-ID: <>

Terry Reedy wrote:
> cool-RR wrote:
>> Hello python-dev!
>> I'm a Python programmer, but this is the first time I'm posting on 
>> python-dev, and I am not familiar at all with how the Python 
>> implementation works -- so this post may be way off.
>> I've recently released a Python application, PythonTurtle 
>> <>, which is packaged using py2exe and 
>> InnoSetup. Due to the fact that my program needs to give the user a 
>> full Python shell, I've made py2exe package the entire Python 
>> standard library with my application.
> I really think you should just make your app sit on top of a standard 
> Python installation. The current Windows installers work well. Just 
> decide which versions you are willing to support. The usual reasons 
> for bundling, to control the versions of multiple 3rd-party libraries, 
> do not seem to apply.

Actually, on Windows a very common reason for bundling with py2exe is to 
avoid depending on (or requiring) an installed version of Python. For a 
standalone teaching tool this seems reasonable.




From mcaninch at  Fri Aug  7 12:14:59 2009
From: mcaninch at (Jeff McAninch)
Date: Fri, 07 Aug 2009 04:14:59 -0600
Subject: [Python-Dev] (try-except) conditional expression similar to
 (if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>	<592033416A4F4C20A60ACB28E40F84DF@RaymondLaptop1>	<>
Message-ID: <>

Should be legal, right?, since syntax would be
   <expression> except <expression> if <exception>

Dino Viehland wrote:
> On option 1 is this legal then?
> x = float(string) except float('nan') if some_check() else float('inf') if ValueError
Thinking more about the syntax options: if P.J.'s "if" Option is used, 
it should also be optional.
That is, I would want this to also be legal,
  <expression> except <expression>
to trap any exception when robustness is more important than catching a 
specific exception.

What would be the typical next step in trying to put this forward?  A 
draft PEP?

Jeffrey E. McAninch, PhD
Physicist, X-2-IFD
Los Alamos National Laboratory
Phone: 505-667-0374
Email: mcaninch at

From kristjan at  Fri Aug  7 12:22:14 2009
From: kristjan at (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=)
Date: Fri, 7 Aug 2009 10:22:14 +0000
Subject: [Python-Dev] (try-except) conditional expression similar
	to	(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>

Unless I am very much mistaken, this is the approach Ruby takes.
Everything is an expression.  For example, the value of a block is the value of
the last expression in the block.

I've never understood the need to have a distinction between statements and expressions, not when expressions can have side effects.  It's like that difference between procedures and functions in Pascal that only serves to confuse

> -----Original Message-----
> From: at
> [ at] On Behalf
> Of Xavier Morel
> Sent: 6. ágúst 2009 10:25
> To: python-dev at
> Subject: Re: [Python-Dev] (try-except) conditional expression similar
> to (if-else) conditional (PEP 308)

> Wouldn't it be smarter to fix the issue once and for all by looking
> into making Python's compound statements (or even all statements
> without restrictions) expressions that can return values in the first
> place? Now I don't know if it's actually possible, but if it is the
> problem becomes solved not just for try:except: (and twice so for
> if:else:) but also for while:, for: (though that one's already served
> pretty well by comprehensions) and with:.

From python at  Fri Aug  7 13:03:16 2009
From: python at (MRAB)
Date: Fri, 07 Aug 2009 12:03:16 +0100
Subject: [Python-Dev] (try-except) conditional expression similar to
 (if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>	<592033416A4F4C20A60ACB28E40F84DF@RaymondLaptop1>	<>	<>	<>
Message-ID: <>

Jeff McAninch wrote:
> Should be legal, right?, since syntax would be
>   <expression> except <expression> if <exception>
> Dino Viehland wrote:
>> On option 1 is this legal then?
>> x = float(string) except float('nan') if some_check() else 
>> float('inf') if ValueError
> Thinking more about the syntax options: if P.J.'s "if" Option is used, 
> it should also be optional.
> That is, I would want this to also be legal,
>  <expression> except <expression>
> to trap any exception when robustness is more important than catching a 
> specific exception.

Catch all exceptions:

     <expression> except <expression>

Catch specific exceptions, optionally catching all others:

     <expression> except (<expression> if <exception>)+ [else <expression>]

Of course, a catch-all is a bare except, with all its dangers!
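
For comparison, the statement forms that the two proposed expressions would stand in for can be written today. This is only a sketch: the helper names parse_or_nan and parse_or_default are invented for the example, and the catch-all variant uses "except Exception" rather than a bare except so that KeyboardInterrupt and SystemExit still propagate.

```python
import math


def parse_or_nan(string):
    # Today's spelling of:  float(string) except float('nan') if ValueError
    try:
        return float(string)
    except ValueError:
        return float('nan')


def parse_or_default(string, default=0.0):
    # Today's spelling of the catch-all form:  float(string) except default
    # (hypothetical helper; "except Exception" is narrower than a bare except)
    try:
        return float(string)
    except Exception:
        return default


print(parse_or_nan("1.5"))
print(math.isnan(parse_or_nan("abc")))
print(parse_or_default(None))  # float(None) raises TypeError, caught here
```

The specific form lets a TypeError from parse_or_nan(None) propagate, while the catch-all form silently hides it, which is the danger mentioned above.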

> What would be the typical next step in trying to put this forward?  A 
> draft PEP?

From ilya.nikokoshev at  Fri Aug  7 13:06:11 2009
From: ilya.nikokoshev at (ilya)
Date: Fri, 7 Aug 2009 15:06:11 +0400
Subject: [Python-Dev] (try-except) conditional expression similar to
	(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>

I believe people now discuss this both on python-dev and python-ideas,
though since I'm new to both lists, I can't really tell where this
belongs.

I played a little with this syntax, my try_ function and @catch
decorator (which are at

    #   x = float(string) except float('nan') if ValueError
    x = try_(float, string, except_ = float('nan'), if_ = ValueError)

    @catch(ValueError = float('nan'))
    def x1(): return float(string)

    #   y = float(string) except ValueError: float('nan')
    y = try_(float, string, { ValueError: float('nan') })

    @catch({ValueError: float('nan')})
    def y1(): return float(string)

    #   try:
    #       z = open(string, 'r')
    #   except IOError as e:
    #       if e.errno == 2:
    #               z = 'not_exist'
    #       else:
    #               raise
    z = try_(open, string, 'r', iocatcher({2: 'no file!'}))

    @catch(iocatcher({2: 'nothing!'}))
    def z1(): return open(string, 'r')
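
For readers without access to the linked code, a rough idea of how helpers of this kind could be written is sketched below. This is not the implementation referred to above: the signatures are simplified assumptions (keyword-only except_ and if_, and a plain dict of handlers for the decorator), and iocatcher is left out.

```python
def try_(func, *args, except_=None, if_=Exception):
    """Hypothetical helper: call func(*args), return except_ if if_ is raised."""
    try:
        return func(*args)
    except if_:
        return except_


def catch(handlers):
    """Hypothetical decorator: run the function at once and bind its result,
    or the fallback value registered for the exception type that was raised."""
    def run(func):
        try:
            return func()
        except tuple(handlers) as exc:
            # Pick the fallback for the first listed type that matches.
            return next(v for t, v in handlers.items() if isinstance(exc, t))
    return run


string = "not a number"

#   x = float(string) except float('nan') if ValueError
x = try_(float, string, except_=float('nan'), if_=ValueError)


@catch({ValueError: float('nan')})
def y1():
    return float(string)


print(x, y1)
```

Note that after decoration y1 is a value, not a function, which matches how the decorator is used in the examples above.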

Here are my overall feelings:

(1) it would be interesting to come up with syntax for except/if
clause, but it's not obvious how to make one and this fact itself may
kill the idea.
(2) the more reasonable approach to things like this is by defining a
separate block and then performing a "catch" operation with it.
Unfortunately, this looks very clumsy as currently this can only be
done by defining a separate function. I think code blocks are a good
direction to explore.

2009/8/7 Kristján Valur Jónsson <kristjan at>:
> Unless I am very much mistaken, this is the approach Ruby takes.
> Everything is an expression.  For example, the value of a block is the value of
> the last expression in the block.
> I've never understood the need to have a distinction between statements and expressions, not when expressions can have side effects.  It's like that difference between procedures and functions in Pascal that only serves to confuse
> K
>> -----Original Message-----
>> From: at
>> [ at] On Behalf
>> Of Xavier Morel
>> Sent: 6. ágúst 2009 10:25
>> To: python-dev at
>> Subject: Re: [Python-Dev] (try-except) conditional expression similar
>> to (if-else) conditional (PEP 308)
>> Wouldn't it be smarter to fix the issue once and for all by looking
>> into making Python's compound statements (or even all statements
>> without restrictions) expressions that can return values in the first
>> place? Now I don't know if it's actually possible, but if it is the
>> problem becomes solved not just for try:except: (and twice so for
>> if:else:) but also for while:, for: (though that one's already served
>> pretty well by comprehensions) and with:.

From fuzzyman at  Fri Aug  7 13:22:57 2009
From: fuzzyman at (Michael Foord)
Date: Fri, 7 Aug 2009 12:22:57 +0100
Subject: [Python-Dev] (try-except) conditional expression similar to
	(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>


On 7 Aug 2009, at 12:06, ilya <ilya.nikokoshev at> wrote:

> I believe people now discuss this both on python-dev and python-ideas,
> though since I'm new to both lists, I can't really tell where this
> belongs.

It definitely belongs on the ideas list...


> I played a little with this syntax, my try_ function and @catch
> decorator (which are at
>    #   x = float(string) except float('nan') if ValueError
>    x = try_(float, string, except_ = float('nan'), if_ = ValueError)
>    @catch(ValueError = float('nan'))
>    def x1(): return float(string)
>    #   y = float(string) except ValueError: float('nan')
>    y = try_(float, string, { ValueError: float('nan') })
>    @catch({ValueError: float('nan')})
>    def y1(): return float(string)
>    #   try:
>    #       z = open(string, 'r')
>    #   except IOError as e:
>    #       if e.errno == 2:
>    #               z = 'not_exist'
>    #       else:
>    #               raise
>    #
>    z = try_(open, string, 'r', iocatcher({2: 'no file!'}))
>    @catch(iocatcher({2: 'nothing!'}))
>    def z1(): return open(string, 'r')
> Here are my overall feelings:
> (1) it would be interesting to come up with syntax for except/if
> clause, but it's not obvious how to make one and this fact itself may
> kill the idea.
> (2) the more reasonable approach to things like this is by defining a
> separate block and then performing a "catch" operation with it.
> Unfortunately, this looks very clumsy as currently this can only be
> done by defining a separate function. I think code blocks are a good
> direction to explore.
> 2009/8/7 Kristján Valur Jónsson <kristjan at>:
>> Unless I am very much mistaken, this is the approach Ruby takes.
>> Everything is an expression.  For example, the value of a block is  
>> the value of
>> the last expression in the block.
>> I've never understood the need to have a distinction between  
>> statements and expressions, not when expressions can have side  
>> effects.  It's like that difference between procedures and  
>> functions in Pascal that only serves to confuse
>> K
>>> -----Original Message-----
>>> From: at
>>> [ at] On  
>>> Behalf
>>> Of Xavier Morel
>>> Sent: 6. ágúst 2009 10:25
>>> To: python-dev at
>>> Subject: Re: [Python-Dev] (try-except) conditional expression  
>>> similar
>>> to (if-else) conditional (PEP 308)
>>> Wouldn't it be smarter to fix the issue once and for all by looking
>>> into making Python's compound statements (or even all statements
>>> without restrictions) expressions that can return values in the  
>>> first
>>> place? Now I don't know if it's actually possible, but if it is the
>>> problem becomes solved not just for try:except: (and twice so for
>>> if:else:) but also for while:, for: (though that one's already  
>>> served
>>> pretty well by comprehensions) and with:.

From solipsis at  Fri Aug  7 14:12:15 2009
From: solipsis at (Antoine Pitrou)
Date: Fri, 7 Aug 2009 12:12:15 +0000 (UTC)
Subject: [Python-Dev] PEP 385: the eol-type issue
References: <>	<>	<>	<>	<>	<>
	<>	<>
	<>	<>
Message-ID: <>

M.-A. Lemburg <mal <at>> writes:
> IMHO, it would be a lot better to add full Unicode support
> for line breaks to the io layer. Given that the code for the
> complicated handling of the CRLF combination is already there,
> it's not difficult to add support for the remaining line break
> characters.

I'm not against anything in principle here, but I'd just like to point out two
things:

1. Changing line break semantics would break compatibility with the current
behaviour, and it would also diverge from what the `newline` parameter
specifies; this may be annoying if, for example, the TextIOWrapper class is used
to parse some network protocols with a rigorous line ending definition

2. It would be useful to have some input by the original designers of the IO
library (the PEP lists Guido, Daniel Stutzbach and Mike Verdone, but I suppose
other people were involved)
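
As an illustration of the first point, here is a small sketch of the `newline` parameter being used for a protocol with a strict CRLF line ending. The byte string is invented for the example; only io.BytesIO and io.TextIOWrapper from the standard library are used.

```python
import io

# Made-up wire data: CRLF-terminated records, one containing a bare LF.
raw = b"HELO example\r\nbody with\nbare LF\r\nQUIT\r\n"

# With newline="\r\n", lines are terminated only by CRLF and the
# line ending is handed back untranslated.
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline="\r\n")
lines = stream.readlines()

print(lines)
```

The bare LF stays inside the second record, which is the behaviour a parser for such a protocol would rely on.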



From mal at  Fri Aug  7 14:48:39 2009
From: mal at (M.-A. Lemburg)
Date: Fri, 07 Aug 2009 14:48:39 +0200
Subject: [Python-Dev] PEP 385: the eol-type issue
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Antoine Pitrou wrote:
> M.-A. Lemburg <mal <at>> writes:
>> IMHO, it would be a lot better to add full Unicode support
>> for line breaks to the io layer. Given that the code for the
>> complicated handling of the CRLF combination is already there,
>> it's not difficult to add support for the remaining line break
>> characters.
> I'm not against anything in principle here, but I'd just like to point out two
> things:
> 1. Changing line break semantics would break compatibility with the current
> behaviour, and it would also diverge from what the `newline` parameter
> specifies; this may be annoying if, for example, the TextIOWrapper class is used
> to parse some network protocols with a rigorous line ending definition

Sure, but that would still be possible using the newline parameter.
We'd only have to find a way to tell the io layer "accept all Unicode
line break characters".

> 2. It would be useful to have some input by the original designers of the IO
> library (the PEP lists Guido, Daniel Stutzbach and Mike Verdone, but I suppose
> other people were involved)

Fair enough.

Marc-Andre Lemburg


From aleaxit at  Fri Aug  7 16:55:44 2009
From: aleaxit at (Alex Martelli)
Date: Fri, 7 Aug 2009 07:55:44 -0700
Subject: [Python-Dev] (try-except) conditional expression similar to
	(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/7 Kristján Valur Jónsson <kristjan at>:
> Unless I am very much mistaken, this is the approach Ruby takes.
> Everything is an expression.  For example, the value of a block is the value of
> the last expression in the block.
> I've never understood the need to have a distinction between statements and expressions, not when expressions can have side effects.  It's like that difference between procedures and functions in Pascal that only serves to confuse

If you're interested in understanding it better, research
Query-Command Separation (QCS), e.g. starting at and links


From status at  Fri Aug  7 18:07:37 2009
From: status at (Python tracker)
Date: Fri,  7 Aug 2009 18:07:37 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <>

ACTIVITY SUMMARY (07/31/09 - 08/07/09)
Python tracker at

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.

 2315 open (+39) / 16175 closed (+16) / 18490 total (+55)

Open issues with patches:   919

Average duration of open issues: 657 days.
Median duration of open issues: 411 days.

Open Issues Breakdown
   open  2283 (+39)
pending    31 ( +0)

Issues Created Or Reopened (55)

heapq.nsmallest and nlargest should be smarter/more usable/more  07/31/09
CLOSED    created  jab                           

multiprocessing logging support test                             08/01/09    created  OG7                           

PyList_APPEND (append without incref)                            08/01/09
CLOSED    created  ideasman42                    

During compiling python 3.1 getting error Undefined symbol libin 08/01/09    created  thoratsandip                  

Typo in a listing in 5.2.9 of language reference                 08/01/09
CLOSED    created  gregorlingl                   

Remove duplicated function in Lib/                     08/01/09
CLOSED    created  vincele                       

Variable may be used before first being assigned to in Lib/local 08/01/09
CLOSED    created  vincele                       

[RFC] Remove leftover use of Carbon module from Lib/    08/01/09
CLOSED    created  vincele                       

[RFC] wrong variable used in Lib/                       08/01/09
CLOSED    created  vincele                       

Lib/ netrc class parsing problem                        08/01/09    created  vincele                       

PyArg_ParseTuple with "s" format and NUL: Bogus TypeError detail 08/01/09
CLOSED    created  jafo                          

UnicodeEncodeError on pydoc's CLI                                08/02/09    created  christoph                     

show Python mimetypes module some love                           08/02/09    created  jrus                          

threading.local() does not work with C-created threads           08/02/09    created  Nikratio                      

IDLE freezes after encountering a syntax error                   08/02/09    created  brian89                       

seek doesn't properly handle file buffer, leads to silent data c 08/03/09
CLOSED    created  lorentey                      

string.Template custom pattern not working                       08/03/09    created  jcollado                      

urlparse.urlunsplit() can't handle relative files (for urllib*.o 08/03/09    created  albert                        

Include more fullwidth chars in the decimal codec                08/03/09    created  ezio.melotti                  

No handlers could be found for logger                            08/03/09
CLOSED    created  purpleidea                    

sys.exit() called from threads other than the main one: undocume 08/03/09    created  jgehrcke                      

Profiler doesn't print usage (indexError instead)                08/03/09    created  pr0gg3d                       

Non-existant directory in sys.path prevents further imports      08/03/09    created  cemasoniv                     

non-empty defaultdict .copy()  fails returning empty dict        08/03/09
CLOSED    created  tomcl                         

optparse parse_args argument references wrong                    08/04/09    created  kq1quick                      

turtle: _tkinter.TclError: invalid command name ".10170160"      08/04/09    created  srid                          

urlparse should parse mailto: URL headers as query parameters    08/04/09    created  mykmelez                      

strptime doesn't support %z format ?                             08/04/09    created  eka                           

returning after forking a child thread doesn't call Py_Finalize  08/04/09    created  rnk                           

joining a child that forks can deadlock in the forked child proc 08/04/09    created  rnk                           

cmathmodule.c: Extra comma in enum - fails on AIX                08/04/09
CLOSED    created  srid                          

multiprocessing build fails on AIX - /dev/urandom (or equivalent 08/04/09    created  srid                          

test_pickle fails on AIX -- 6.9999999999999994e-308 != 6.9999999 08/05/09    created  srid                          

warnings.catch_warnings is not thread-safe                       08/05/09    created  gagenellina                   

codecs documentation does not mention surrogateescape            08/05/09
CLOSED    created  Nikratio                      

idlelib/ missing exit status on exithook                   08/05/09    created  gpolo                         

sre_parse contains a confusing generic error message             08/05/09    created  torne                         

Py3k's posixpath.relpath not compatible with ntpath.relpath      08/05/09    created  erickt                        

missing cmath functions                                          08/05/09
CLOSED    created  pfeldman at          

Potential memory leak in multiprocessing                         08/05/09    created  jnoller                       

Add "path" to the xmrlpc dispatcher method                       08/05/09    created  krisvale                      

etree iterative find[text]                                       08/06/09    created  Digitalxero                   

locale.format_string fails on escaped percentage                 08/06/09    created  christoph                     

Copy documentation section                                       08/06/09    created  Sheepherd                     

typo in buffer api docs                                          08/06/09
CLOSED    created  ash                           

buffer c-api: memoryview object documentation                    08/06/09    created  ash                           

Desire documentation link to user contribution wiki ( 08/06/09
CLOSED    created  keenethery                    

Transient test_multiprocessing failure                           08/06/09    created  pitrou                        

HTMLParser.HTMLParser doesn't handle malformed charrefs          08/07/09    created  dayveday                      

re.findall does not always return a list of strings              08/07/09    created  pfeldman at          

readlines should understand Line Separator and Paragraph Separat 08/07/09    created  nyamatongwe                   

fnmatch fails on filenames containing \n character               08/07/09    created  rajcze                        

List of dirs to ignore in is applied only for the first 08/07/09    created  bogdan.opanchuk               

logging config - using of FileHandler's delay argument?          08/07/09    created  maro                          

can't parse sr_RS at latin locale                                08/07/09    created  VPeric

Issues Now Closed (32)

Backport set comprehensions                                       506 days    alexandre.vassalotti          

Extension module build fails for MinGW: missing vcvarsall.bat     465 days    tarek                         

PyCF_DONT_IMPLY_DEDENT can be used to activate the with statemen  378 days    gpolo                         

Include Tcl/Tk 8.5.4 in the windows binary for the upcoming beta  352 days    gpolo                         

Arrows key do not browse in the IDLE                              313 days    gpolo                         

(Mac) File Menu MIssing Options                                   249 days    gpolo

multiprocessing.JoinableQueue task_done() issue                   234 days    jnoller                       

bug fix to prevent io.BytesIO from accepting arbitrary keyword a  149 days    alexandre.vassalotti          

OverflowError in RLock.acquire()                                   12 days    davidar                       

Make Decimal constructor accept all unicode decimal digits in in    4 days    marketdickinson               

urllib2 bug on CentOS                                               5 days    rpetrov                       

test_distutils subtest test_get_exe_bytes fails depending on exe    3 days    tarek                         

smtplib.SMTP.sendmail() rejected after quit(),connect() sequence    6 days    amaury.forgeotdarc            

HTMLParser cannot deal with mixture of arbitrary data and charac    1 days    liudongmiao at         

heapq.nsmallest and nlargest should be smarter/more usable/more     0 days    rhettinger                    

PyList_APPEND (append without incref)                               1 days    rhettinger                    

Typo in a listing in 5.2.9 of language reference                    2 days    georg.brandl                  

Remove duplicated function in Lib/                        1 days    marketdickinson               

Variable may be used before first being assigned to in Lib/local    3 days    marketdickinson               

[RFC] Remove leftover use of Carbon module from Lib/       1 days    marketdickinson               

[RFC] wrong variable used in Lib/                          5 days    marketdickinson               

PyArg_ParseTuple with "s" format and NUL: Bogus TypeError detail    0 days    jafo                          

seek doesn't properly handle file buffer, leads to silent data c    4 days    pitrou                        

No handlers could be found for logger                               2 days    purpleidea                    

non-empty defaultdict .copy()  fails returning empty dict           1 days    rhettinger                    

cmathmodule.c: Extra comma in enum - fails on AIX                   0 days    marketdickinson               

codecs documentation does not mention surrogateescape               1 days    georg.brandl                  

missing cmath functions                                             0 days    georg.brandl                  

typo in buffer api docs                                             0 days    georg.brandl                  

Desire documentation link to user contribution wiki (    1 days    keenethery                    

Mouse wheel crashes program                                      2102 days  gpolo                         

add "reload" function to IDLE                                    1583 days gpolo                         

Top Issues Most Discussed (10)

  9 heapq.nsmallest and nlargest should be smarter/more usable/more    0 days

  7 Include more fullwidth chars in the decimal codec                  4 days

  6 [RFC] wrong variable used in Lib/                         5 days

  5 show Python mimetypes module some love                             5 days

  5 json C serializer performance tied to structure depth on some s   10 days

  5 OverflowError in RLock.acquire()                                  12 days

  5 asyncore incorrect failure when connection is refused and using   16 days

  5 error: (10035, 'The socket operation could not complete without  466 days

  5 Backport set literals                                            508 days

  4 Desire documentation link to user contribution wiki     1 days

From greg at  Fri Aug  7 20:08:23 2009
From: greg at (Gregory P. Smith)
Date: Fri, 7 Aug 2009 11:08:23 -0700
Subject: [Python-Dev] socket.makefile and EINTR handling
Message-ID: <>

In particular this issue:

I believe we should handle EINTR internally within the
socket._fileobject wrapper as nobody using a file-like object should
ever expect to get an EINTR.  EINTR only comes from using the lowest
level system calls.

Anyone strongly disagree?
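
A minimal sketch of the kind of internal handling meant here, assuming a generic retry wrapper rather than the actual socket._fileobject code (retry_on_eintr and flaky_read are names made up for the example):

```python
import errno


def retry_on_eintr(func, *args):
    """Call func(*args), retrying for as long as it fails with EINTR."""
    while True:
        try:
            return func(*args)
        except OSError as exc:
            if exc.errno != errno.EINTR:
                raise  # any other error is passed on to the caller


# Stand-in for a low-level read that is interrupted twice by a signal.
attempts = []


def flaky_read():
    attempts.append(1)
    if len(attempts) < 3:
        raise OSError(errno.EINTR, "Interrupted system call")
    return b"data"


result = retry_on_eintr(flaky_read)
print(result, len(attempts))
```

The caller of the wrapper never sees EINTR; it only sees the data or a genuine error.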


From amk at  Sat Aug  8 02:42:52 2009
From: amk at (A.M. Kuchling)
Date: Fri, 7 Aug 2009 20:42:52 -0400
Subject: [Python-Dev] www, down
Message-ID: <20090808004252.GA4185@andrew-kuchlings-macbook.local>

Both and are down.  They're hosted on
the same machine, and it seems to have run into disk problems and
hasn't rebooted even after power-cycling.  Thomas Wouters will be
visiting the machine physically tomorrow to try to diagnose the

(The machine also hosts and


From steve at  Sat Aug  8 08:02:42 2009
From: steve at (Steven D'Aprano)
Date: Sat, 8 Aug 2009 16:02:42 +1000
Subject: [Python-Dev] (try-except) conditional expression similar
	=?iso-8859-1?q?to=09?=(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, 7 Aug 2009 08:22:14 pm Kristján Valur Jónsson wrote:
> Unless I am very much mistaken, this is the approach Ruby takes.
> Everything is an expression.  For example, the value of a block is
> the value of the last expression in the block.

Copying what other languages do is not necessarily a bad thing, but that 
would fail both "explicit is better than implicit" and "in the face of 
ambiguity, avoid the temptation to guess".

It's not immediately obvious to me why the last expression should be 
given that privileged rule. Why not the first expression?

> I've never understood the need to have a distinction between
> statements and expressions, not when expressions can have side
> effects.  It's like that difference between procedures and functions
> in Pascal that only serves to confuse

It's been a while, but I don't think it ever confused me. Being unable to 
return multiple values, *that* confused me, but the distinction 
between "procedures are for doing something, functions are for getting 
something back" was perfectly straightforward.

(And then Pascal went and made it slightly more confusing by adding var 
parameters, so you could get results back from a procedure and have 
side-effects in a function... oh well.)

Steven D'Aprano

From stefan_ml at  Sat Aug  8 14:55:57 2009
From: stefan_ml at (Stefan Behnel)
Date: Sat, 08 Aug 2009 14:55:57 +0200
Subject: [Python-Dev] expy: an expressway to extend Python
In-Reply-To: <>
References: <>
Message-ID: <h5jskt$59e$>

Yingjie Lan wrote:
> This is to announce the initial release of expy 0.1.0.
> More details at

I'm clearly biased, but my main concern here is that expy requires C code
to be written inside of strings. There isn't any good editor support for
that, so I doubt that expy is good for anything but very thin wrappers (as
in the examples you presented).

That said, you might want to look at the argument unpacking code generated
by Cython. It's highly optimised through specialisation and has been
benchmarked quite a bit faster than the generic Python C-API functions for
tuple/keyword extracting. Since argument conversion seems to be more or
less all that expy really does, maybe you want to reuse that code.


From stephen at  Sat Aug  8 15:19:23 2009
From: stephen at (Stephen J. Turnbull)
Date: Sat, 08 Aug 2009 22:19:23 +0900
Subject: [Python-Dev] (try-except) conditional expression
	similar	to	(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>

Steven D'Aprano writes:

 > It's not immediately obvious to me why the last expression should be 
 > given that privileged rule. Why not the first expression?

Or the second, for that matter.  So find a large body of Lisp code and
run "grep -r prog1 | wc", "grep -r prog2 | wc", and "grep -r progn | wc"
on it.  I think the pragmatic answer will be obvious.

Personally, I like functional languages and style.  But I admit the
*need* for a progn construct (ie, "block") to express procedural
style, and see no particular reason why expressing that by making a
syntactic distinction between expressions and statements is worse (or
better) than the progn construct.  That should be kept distinct from
the question of whether extended assignment operators or conditional
operators are appropriate for a given language.

From catch-all at  Sat Aug  8 10:17:10 2009
From: catch-all at (Xavier Morel)
Date: Sat, 8 Aug 2009 10:17:10 +0200
Subject: [Python-Dev] (try-except) conditional expression similar
	to	(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>

On 8 Aug 2009, at 08:02 , Steven D'Aprano wrote:
> On Fri, 7 Aug 2009 08:22:14 pm Kristján Valur Jónsson wrote:
>> Unless I am very much mistaken, this is the approach Ruby takes.
>> Everything is an expression.  For example, the value of a block is
>> the value of the last expression in the block.
> Copying what other languages do is not necessarily a bad thing, but  
> that
> would fail both "explicit is better than implicit" and "in the face of
> ambiguity, avoid the temptation to guess".
I might be able to grant you the first objection, maybe, but the
second one? Where's the ambiguity in "compound statements return the
result of the last evaluated expression"?

> It's not immediately obvious to me why the last expression should be
> given that privileged rule. Why not the first expression?
Because it wouldn't make any sense? When you're computing something,  
the value you want is the one at the end of the computation (usually  
called a result), not some random one somewhere else.

From amk at  Sat Aug  8 22:22:21 2009
From: amk at (A.M. Kuchling)
Date: Sat, 8 Aug 2009 16:22:21 -0400
Subject: [Python-Dev] www/svn status update
Message-ID: <20090808202221.GA4911@andrew-kuchlings-macbook.local>

The following sites are up again on a new machine, but cannot be
updated through SVN hooks or whatever mechanism:

was deliberately not brought up again.  The backups
were a few hours behind and missing the ~10 most recent commits.  Not
disastrous, but it could probably mess up people's SVN trees, so after
some IRC discussion, the decision was to wait until the original disks
are available again.  That will probably not occur until Monday, maybe
Tuesday.

I've disabled donations to the PSF through credit cards, which pointed
to a CGI script that doesn't currently work; PayPal donations still
work.

Do we want to make any edits to the 3.1 or 3.0 pages about the I/O
bug?  I can do that manually if someone will provide the text and/or a
patch to put up.

Unfortunately without SVN we probably can't cut a new 3.1 release,
unless Benjamin or someone has a really up-to-date copy of the
Mercurial tree and wants to work from that.


From dirkjan at  Sat Aug  8 22:25:52 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Sat, 8 Aug 2009 22:25:52 +0200
Subject: [Python-Dev] www/svn status update
In-Reply-To: <20090808202221.GA4911@andrew-kuchlings-macbook.local>
References: <20090808202221.GA4911@andrew-kuchlings-macbook.local>
Message-ID: <>

On Sat, Aug 8, 2009 at 22:22, A.M. Kuchling<amk at> wrote:
> was deliberately not brought up again.  The backups
> were a few hours behind and missing the ~10 most recent commits.  Not
> disastrous, but it could probably mess up people's SVN trees, so after
> some IRC discussion, the decision was to wait until the original disks
> are available again.  That will probably not occur until Monday, maybe
> Tuesday.

What's the last revision supposed to be? I keep a somewhat regularly
updated full sync of the Python repo.



From martin at  Sat Aug  8 22:40:29 2009
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sat, 08 Aug 2009 22:40:29 +0200
Subject: [Python-Dev] www/svn status update
In-Reply-To: <>
References: <20090808202221.GA4911@andrew-kuchlings-macbook.local>
Message-ID: <>

> What's the last revision supposed to be? I keep a somewhat regularly
> updated full sync of the Python repo.

We don't know exactly; python-checkins has recorded r74352. If anybody
has a more recent checkout (svn info .), please speak up.


From martin at  Sat Aug  8 22:47:41 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 08 Aug 2009 22:47:41 +0200
Subject: [Python-Dev] www/svn status update
In-Reply-To: <20090808202221.GA4911@andrew-kuchlings-macbook.local>
References: <20090808202221.GA4911@andrew-kuchlings-macbook.local>
Message-ID: <>

> The following sites are up again on a new machine

I'd like to thank the people who have helped getting the temporary
machine up: Thomas Wouters spent much of his day at XS4ALL, where
he was helped by Gerben Schepers (who also provided the hardware).
Sean Reifschneider provided the backups (from the system at which keeps backups of all machines). Andrew
Kuchling put the backup back into the places to bring the system
into its current state.

The failure of the old system was caused by the RAID controller;
the disks themselves should still be intact. Unfortunately, the
RAID controller keeps its configuration on the controller (not
on the disks), so it is unclear still whether the replacement
will be able to recognize the RAID array.


From malathiramya at  Sun Aug  9 09:16:19 2009
From: malathiramya at (malathi selvaraj)
Date: Sun, 9 Aug 2009 12:46:19 +0530
Subject: [Python-Dev] hi everyone
Message-ID: <>

I am new to this mailing list.

I would like to learn Python.

How do I join IRC for Python? I tried #python, but I couldn't get it to
work. Can you tell me?


From asmodai at  Sun Aug  9 10:20:48 2009
From: asmodai at (Jeroen Ruigrok van der Werven)
Date: Sun, 9 Aug 2009 10:20:48 +0200
Subject: [Python-Dev] hi everyone
In-Reply-To: <>
References: <>
Message-ID: <>

-On [20090809 09:50], malathi selvaraj (malathiramya at wrote:
>I am new one to this mailing list


>I would like to learn python..

There is sufficient information on the website of

Furthermore, this is not the mailinglist you want to email with questions,
you need to mail the normal Python mailinglist for this. This is the
mailinglist related to the actual development of the Python language itself.

>how to join IRC for python,i try it like #python, but i dn't get can you

Python has various channels on Undernet and Freenode at least.

Jeroen Ruigrok van der Werven <asmodai(-at-)> / asmodai
The wisdom of the wise, and the experience of ages, may be preserved by

From ncoghlan at  Sun Aug  9 11:26:09 2009
From: ncoghlan at (Nick Coghlan)
Date: Sun, 09 Aug 2009 19:26:09 +1000
Subject: [Python-Dev] hi everyone
In-Reply-To: <>
References: <>
Message-ID: <>

Jeroen Ruigrok van der Werven wrote:
> -On [20090809 09:50], malathi selvaraj (malathiramya at wrote:
>> I am new one to this mailing list
> Welcome.
>> I would like to learn python..
> There is sufficient information on the website of
> Furthermore, this is not the mailinglist you want to email with questions,
> you need to mail the normal Python mailinglist for this.

Specifically, python-list at (also available as the newsgroup
comp.lang.python).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From skip at  Sun Aug  9 13:27:09 2009
From: skip at (skip at
Date: Sun, 9 Aug 2009 06:27:09 -0500
Subject: [Python-Dev] hi everyone
In-Reply-To: <>
References: <>
Message-ID: <>

    Nick> Specifically, python-list at (also available as the
    Nick> newsgroup comp.lang.python).

Also, if you're a complete beginner, try subscribing to tutor at

and reading through that list's ten years' worth of archived postings.
(Maybe someone could create a BestOfTutor wiki page?)

Skip Montanaro - skip at -
    Getting old sucks, but it beats dying young

From billy.earney at  Sun Aug  9 18:16:14 2009
From: billy.earney at (Billy Earney)
Date: Sun, 9 Aug 2009 11:16:14 -0500
Subject: [Python-Dev] hi everyone
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

There's also the website that contains a free ebook about
That's what I used to get started :)


From jimjjewett at  Mon Aug 10 02:55:12 2009
From: jimjjewett at (Jim Jewett)
Date: Sun, 9 Aug 2009 20:55:12 -0400
Subject: [Python-Dev] statement vs expression [was: (try-except) conditional
	expression similar to (if-else) conditional (PEP 308)]
Message-ID: <>

> Kristján Valur Jónsson wrote:
>> I've never understood the need to have a distinction between statements
>>  and expressions, not when expressions can have side effects.

Alex Martelli responded:
> If you're interested in understanding it better, research
> Query-Command Separation (QCS), e.g. starting at

Either you missed Kristján's point, or your answer was so
subtle that I missed yours.

QCS makes it easy to determine which pieces of code
(queries) are free of side-effects.  I see value in that
for both debugging and optimization.

What I don't see is how that relates to expressions vs
statements **when expressions can have side effects.**

(Actually, in Python, I would say that statements are far
*more* likely to be free of side-effects, as they are often
there for flow control.)
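
To make the distinction above concrete, here is a minimal Python sketch.
The names total and apply_discount are invented for illustration. It shows
a side-effect-free query, a command that mutates its argument and returns
None, and an ordinary expression (list.pop()) that still has a side effect,
which is the case the quoted remark is about.

```python
def total(prices):
    """Query: computes a value and leaves its argument untouched."""
    return sum(prices)


def apply_discount(prices, percent):
    """Command: mutates the list in place and returns None."""
    for i, price in enumerate(prices):
        prices[i] = price * (100 - percent) / 100


prices = [10.0, 20.0, 30.0]
before = total(prices)                # query: no side effect
result = apply_discount(prices, 50)   # command: side effect, no useful value
last = prices.pop()                   # an expression *with* a side effect
```

Nothing in the syntax marks prices.pop() as different from total(prices);
only the convention of the API does.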


From jimjjewett at  Mon Aug 10 03:07:45 2009
From: jimjjewett at (Jim Jewett)
Date: Sun, 9 Aug 2009 21:07:45 -0400
Subject: [Python-Dev] codecs.oen [was: PEP 385: the eol-type issue]
Message-ID: <>

> M.-A. Lemburg wrote:

>> ... and because of this, the feature is already available if
>> you use instead of the built-in open():

Neil Hodgson asked:
> So should I not add an issue for the basic open because
> should be used for this case?

In python 3, why does even still exist?

As best I can tell, should be the same as regular open,
but for a unicode file -- and all text files are treated as unicode in
python 3.0

So at this point, are there any differences beyond:

(a)  The builtin open doesn't work on multi-byte line-endings other
than the multi-character CRLF.  (In other words, it goes by the
traditional Operating System conventions developed when a char was a
byte, but the Unicode standard allows for a few more possibilities,
which are currently rare in practice.)

(b)  The codecs version is much slower, because it hasn't seen the
optimization effort.
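
Point (a) can be observed with the standard library alone. The sketch below
is only an illustration with made-up sample data: it compares the text layer
that backs the built-in open() (io.TextIOWrapper) with str.splitlines(),
which follows the wider Unicode notion of a line boundary. It makes no claim
about how any other file-opening function behaves.

```python
import io

# Sample data (invented): U+2028 is the Unicode LINE SEPARATOR.
data = "first\u2028second\nthird\r\nfourth\n".encode("utf-8")

# The text layer behind the built-in open() splits on \n, \r and \r\n only.
with io.TextIOWrapper(io.BytesIO(data), encoding="utf-8") as stream:
    io_lines = stream.readlines()

# str.splitlines() also treats U+2028 as a line boundary.
unicode_lines = data.decode("utf-8").splitlines()
```

Here io_lines keeps "first" and "second" together on one line, while
unicode_lines separates them.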


From kristjan at  Mon Aug 10 11:18:24 2009
From: kristjan at (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=)
Date: Mon, 10 Aug 2009 09:18:24 +0000
Subject: [Python-Dev] issue 6654
In-Reply-To: <>
References: <>
Message-ID: <>

I've had no response to this yet.  Is no one using xmlrpc?
To clarify the feature:  The xmlrpc server invokes a "dispatch" method on a dispatcher object (typically, just itself) to process xmlrpc requests.
The "path" from the xmlrpc request is currently not passed to that method.  By providing this path, it becomes possible to provide different behaviour for different paths.
The patch provided also includes a new dispatcher, MultiPathXMLRPCDispatcher, which will forward method calls to different dispatchers based on the path.
This makes it possible to multiplex many xmlrpc "servers", each with their own request path, on a single connection.
You may, for example, have installed a server at
but find that you want your server also to handle an entirely different application domain, and can do so now by having those requests sent to


From: at [ at] On Behalf Of Kristján Valur Jónsson
Sent: 6. ágúst 2009 20:56
To: python-dev at
Subject: [Python-Dev] issue 6654

I added
I also put a note to python-ideas but have had no response yet.  Any comments?
Here's the summary:

I've created on Rietveld:

By passing the "path" component of the xmlrpc request to the dispatch
method, it becomes possible to dispatch differently according to this.  This
patch provides that addition.
Additionally, it provides a MultiPathXMLRPCDispatcher mixin class and a
MultiPathXMLRPCServer that uses it, to have multiple dispatchers for
different paths.
This allows a single server port to serve different XMLRPC servers as
differentiated by the HTTP path.  A test is also provided.
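
The idea can be sketched as follows. This assumes a Python 3 whose
xmlrpc.server module ships a MultiPathXMLRPCServer class with an
add_dispatcher() method; it is not the patch from the tracker. The paths
/math and /text and the registered functions are invented for the example,
and the internal _marshaled_dispatch hook is called directly only so that
the sketch runs without opening a network port.

```python
import xmlrpc.client
from xmlrpc.server import MultiPathXMLRPCServer, SimpleXMLRPCDispatcher

# bind_and_activate=False builds the server object without binding a port.
server = MultiPathXMLRPCServer(("127.0.0.1", 0), logRequests=False,
                               bind_and_activate=False)

# One dispatcher per request path, each with its own set of methods.
math_api = server.add_dispatcher("/math", SimpleXMLRPCDispatcher())
math_api.register_function(lambda a, b: a + b, "add")

text_api = server.add_dispatcher("/text", SimpleXMLRPCDispatcher())
text_api.register_function(lambda s: s.upper(), "shout")


def call_locally(path, method, *args):
    """Feed one XML-RPC request to the server as if it arrived on `path`."""
    request = xmlrpc.client.dumps(args, methodname=method).encode("utf-8")
    # Internal hook normally invoked by the HTTP request handler.
    response = server._marshaled_dispatch(request, None, path)
    (value,), _ = xmlrpc.client.loads(response)
    return value


sum_result = call_locally("/math", "add", 2, 3)
shout_result = call_locally("/text", "shout", "hello")
server.server_close()
```

In a real deployment the server would be bound and served normally, and
clients would pick the application by URL path. The request handler's
rpc_paths restriction would also need to allow the extra paths.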


From ncoghlan at  Mon Aug 10 12:12:49 2009
From: ncoghlan at (Nick Coghlan)
Date: Mon, 10 Aug 2009 20:12:49 +1000
Subject: [Python-Dev] issue 6654
In-Reply-To: <>
References: <>
Message-ID: <>

Kristján Valur Jónsson wrote:
> I've had no response to this yet.  Is no one using xmlrpc?

It sounds like a reasonable feature to me, but I'm one of those that
doesn't actually use xmlrpc so my +0 or +1 probably isn't very
meaningful to you...


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From solipsis at  Mon Aug 10 15:45:32 2009
From: solipsis at (Antoine Pitrou)
Date: Mon, 10 Aug 2009 13:45:32 +0000 (UTC)
Subject: [Python-Dev] codecs.oen [was: PEP 385: the eol-type issue]
References: <>
Message-ID: <>

Jim Jewett <jimjjewett <at>> writes:
> In python 3, why does even still exist?

I don't remember anyone proposing to deprecate it, so I suppose that's the
(social) reason.

> So at this point, are there any differences beyond:

(c) The built-in open is probably a little more featureful, especially when it
comes to seek() and tell().

> (b)  The codecs version is much slower, because it hasn't seen the
> optimization effort.

By the way, the built-in open would also benefit from an optimization of's IncrementalEncoder classes: they are just thin Python wrappers
around C function calls, and the overhead of calling a Python method is very
significant when doing a lot of small unicode writes with a non-optimized codec
(a couple of dominant codecs have been optimized by means of internal shortcuts
bypassing latin-1, utf-8, utf-16).



> (a)  The builtin open doesn't work on multi-byte line-endings other
> than the multi-character CRLF.  (In other words, it goes by the
> traditional Operating System conventions developed when a char was a
> byte, but the Unicode standard allows for a few more possibilities,
> which are currently rare in practice.)

From digitalxero at  Mon Aug 10 16:29:32 2009
From: digitalxero at (Dj Gilcrease)
Date: Mon, 10 Aug 2009 08:29:32 -0600
Subject: [Python-Dev] (try-except) conditional expression similar to
	(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>

I figured I would write up the PEP draft. I have never tried writing a
PEP before, but I did read PEP 1 and tried to follow its formatting
guidelines. If there are no additions to the idea, then it seems there
just needs to be a consensus on the syntax before submitting it to the
peps list.

I posted this to the python-ideas version of this thread already, but
since more people seem to be posting to the python-dev list I will
post it here as well.

 PEP: <pep number>
 Title: try-except conditional expressions
 Version: <svn version string>
 Last-Modified: <svn date string>
 Author: Jeff McAninch <mcaninch at>, Dj Gilcrease
<digitalxero at>
 Discussions-To: python-ideas at
 Status: Draft
 Type: Standards Track
 Content-Type: text/plain
 Created: 06-Aug-2009
 Python-Version: 2.7/3.2
 Post-History: <dates of postings to python-list and python-dev>

   I very often want something like a try-except conditional
   expression similar to the if-else conditional instead of resorting
   to a multi-line try-except block.

Design Goals:
   The new syntax should
       * Be simple to read
       * Be intuitive so people who may use it infrequently don't need
           to look up the format every time
       * Make it obvious what is happening

Motivation:
   Often when doing calculations or string recasting (to int, float,
   etc) it is required to wrap the section in a simple try-except
   where the exception just assigns a default value. It would be more
   readable and concise if these types of try-excepts could be written
   on a single line.


   All 3 components would just be ordinary expressions. The exception
   definition would be allowed to resolve to a single exception or a
   tuple of exceptions, just as it is in a normal try/except

Syntax Ideas:
   Option 1:
       x = float(string) except float('nan') if ValueError
       op(float(string) except float('nan') if ValueError)

   Option 2:
       x = float(string) except ValueError: float('nan')
       op(float(string) except ValueError: float('nan'))

   Option 3:
       x = float(string) except ValueError else float('nan')
       op(float(string) except ValueError else float('nan'))
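
For comparison, none of the three options is valid Python today. The sketch
below shows what they would abbreviate: the multi-line try-except the draft
wants to avoid, plus a small helper that gets close to a one-liner with
existing syntax. The helper name attempt and its signature are hypothetical
and not part of the draft.

```python
import math


def float_or_nan(text):
    """The multi-line form that the proposed expression would replace."""
    try:
        return float(text)
    except ValueError:
        return float('nan')


def attempt(func, exceptions, default):
    """Hypothetical helper: call func() and return default if it raises.

    `exceptions` may be one exception class or a tuple of classes, as in
    an ordinary except clause. Unlike the proposed syntax, `default` is
    evaluated eagerly, before func() is called.
    """
    try:
        return func()
    except exceptions:
        return default


x = attempt(lambda: float("not a number"), ValueError, float('nan'))
y = attempt(lambda: float("1.5"), ValueError, float('nan'))
```

The lambda is needed to delay evaluation of the guarded expression, which is
part of what dedicated syntax would make unnecessary.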

From benjamin at  Mon Aug 10 16:39:10 2009
From: benjamin at (Benjamin Peterson)
Date: Mon, 10 Aug 2009 09:39:10 -0500
Subject: [Python-Dev] 3.1.1 plan
Message-ID: <>

Once Subversion is back up (today, tomorrow?), I will tag the 3.1
maintenance branch as 3.1.1rc1. The tree will remain frozen until
Saturday. If at that time, no one has found something wrong with the
RC, I will retag it as the final bugfix release.


From solipsis at  Mon Aug 10 16:49:04 2009
From: solipsis at (Antoine Pitrou)
Date: Mon, 10 Aug 2009 14:49:04 +0000 (UTC)
Subject: [Python-Dev] 3.1.1 plan
References: <>
Message-ID: <>

Benjamin Peterson <benjamin <at>> writes:
> Once Subversion is back up (today, tomorrow?), I will tag the 3.1
> maintenance branch as 3.1.1rc1. The tree will remain frozen until
> Saturday. If at that time, no one has found something wrong with the
> RC, I will retag it as the final bugfix release.

Do you intend to wait for the pdb fix?

From benjamin at  Mon Aug 10 17:00:05 2009
From: benjamin at (Benjamin Peterson)
Date: Mon, 10 Aug 2009 10:00:05 -0500
Subject: [Python-Dev] 3.1.1 plan
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/10 Antoine Pitrou <solipsis at>:
> Benjamin Peterson <benjamin <at>> writes:
>> Once Subversion is back up (today, tomorrow?), I will tag the 3.1
>> maintenance branch as 3.1.1rc1. The tree will remain frozen until
>> Saturday. If at that time, no one has found something wrong with the
>> RC, I will retag it as the final bugfix release.
> Do you intend to wait for the pdb fix?
> (

Georg says he'll commit it once svn is up, so yes.


From brett at  Mon Aug 10 18:07:28 2009
From: brett at (Brett Cannon)
Date: Mon, 10 Aug 2009 09:07:28 -0700
Subject: [Python-Dev] issue 6654
In-Reply-To: <>
References: <> 
Message-ID: <>

2009/8/10 Nick Coghlan <ncoghlan at>

> Kristján Valur Jónsson wrote:
> > I've had no response to this yet.  Is no one using xmlrpc?
> It sounds like a reasonable feature to me, but I'm one of those that
> doesn't actually use xmlrpc so my +0 or +1 probably isn't very
> meaningful to you...


From thomas at  Mon Aug 10 21:12:59 2009
From: thomas at (Thomas Wouters)
Date: Mon, 10 Aug 2009 21:12:59 +0200
Subject: [Python-Dev] www/svn status update
In-Reply-To: <20090808202221.GA4911@andrew-kuchlings-macbook.local>
References: <20090808202221.GA4911@andrew-kuchlings-macbook.local>
Message-ID: <>

On Sat, Aug 8, 2009 at 22:22, A.M. Kuchling <amk at> wrote:

> The following sites are up again on a new machine, but cannot be
> updated through SVN hooks or whatever mechanism:
> was deliberately not brought up again.  The backups
> were a few hours behind and missing the ~10 most recent commits.  Not
> disastrous, but it could probably mess up people's SVN trees, so after
> some IRC discussion, the decision was to wait until the original disks
> are available again.  That will probably not occur until Monday, maybe
> Tuesday.

I'm still waiting on a replacement controller, so it wasn't to be today.
Hopefully tomorrow, if the hardware supplier has one in stock. Still no
news on whether we have any chance at all on getting the old data back.

Thomas Wouters <thomas at>

Hi! I'm a .signature virus! copy me into your .signature file to help me

From v+python at  Tue Aug 11 00:26:40 2009
From: v+python at (Glenn Linderman)
Date: Mon, 10 Aug 2009 15:26:40 -0700
Subject: [Python-Dev] www/svn status update
In-Reply-To: <>
References: <20090808202221.GA4911@andrew-kuchlings-macbook.local>
Message-ID: <>

On approximately 8/10/2009 12:12 PM, came the following characters from 
the keyboard of Thomas Wouters:
> I'm still waiting on a replacement controller, so it wasn't to be today.
> Hopefully tomorrow, if the hardware supplier has one in stock. Still no
> news on whether we have any chance at all on getting the old data back.

Sadly, redundant hardware controlled by non-redundant hardware, 
configured to be redundant without a backup of that configuration, isn't 
all that reliable :(

It is hard to get redundancy correct and complete, so you can't just 
hear the word "RAID" and conclude that it is reliable, or fully 
redundant, without proper system management.

That's why I still recommend RAID 0 with appropriate backup procedures, 
or RAID 1... but only if the RAID 1 is operable by removing the RAID 
controller, and attaching the disks to regular controllers, and having 
them be readable... sadly, many RAID 1 configurations do not permit that.

Glenn --
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

From steve at  Tue Aug 11 01:45:58 2009
From: steve at (Steven D'Aprano)
Date: Tue, 11 Aug 2009 09:45:58 +1000
Subject: [Python-Dev] (try-except) conditional expression similar to
	(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <> <>
Message-ID: <>

On Tue, 11 Aug 2009 12:29:32 am Dj Gilcrease wrote:
> I figure I would write up the PEP draft, I have never tried writing a
> pep before, but i did read PEP 1 and tried to follow it's formating
> guides. If there are no additions to the idea, then it seems there
> just needs to be a consensus on the syntax before submitting it to
> the peps list

Shouldn't there be consensus on whether or not this is a good idea 


> Modivation


>    Often when doing calculations or string recasting (to int, float,
>    etc) it is required to wrap the section in a simple try-except
>    where the exception just assigns a default value. It would be more
>    readable and consise if these type of try-excepts could be written
>    on a single line.

Concise (note spelling) certainly, but I question that it would be more 
readable. Newlines are not a bad thing, but trying to squeeze too much 
into a single line is.

When the `x if y else z` expression was first introduced, I was very 
excited because I thought it would be very useful. But I soon found 
that it actually wasn't that useful to me: it was rare that I wanted 
it, and when I did, it was usually more readable to use an `if` block 
instead. So I don't find this proposal the least bit compelling. It 
seems to me to be primarily useful for saving wear and tear on the 
Enter key.

> Syntax Ideas:
>    Option 1:
>        x = float(string) except float('nan') if ValueError
>        op(float(string) except float('nan') if ValueError)

Looks too confusingly like an if test. I find my eye drawn to the final 
clause, `if ValueError`, and expecting that to evaluate to true.


>    Option 2:
>        x = float(string) except ValueError: float('nan')
>        op(float(string) except ValueError: float('nan'))

At the risk of an extra keyword, I would prefer `unless` instead of 

I find this the least worst of the alternatives. 


>    Option 3:
>        x = float(string) except ValueError else float('nan')
>        op(float(string) except ValueError else float('nan'))

Also looks confusingly like an if test, but not as strongly as Option 1.


Should the PEP allow expressions like this?

    func(obj) except str(e) if ValueError as e  # Option 1
    func(obj) except ValueError as e: str(e)  # Option 2
    func(obj) except ValueError as e else str(e)  # Option 3

Justify your choice please.

Steven D'Aprano

From ben+python at  Tue Aug 11 03:42:53 2009
From: ben+python at (Ben Finney)
Date: Tue, 11 Aug 2009 11:42:53 +1000
Subject: [Python-Dev] Python mail-to-news gateway status (was: www/svn status update)
References: <20090808202221.GA4911@andrew-kuchlings-macbook.local>
Message-ID: <>

"A.M. Kuchling" <amk at> writes:

> The following sites are up again on a new machine, but cannot be
> updated through SVN hooks or whatever mechanism:

I don't see ? there. Is it affected?

I ask because I haven't seen any messages reach the "comp.lang.python"
Usenet group since this report. Is that related?

 \      ?The fact that a believer is happier than a skeptic is no more |
  `\   to the point than the fact that a drunken man is happier than a |
_o__)                                 sober one.? ?George Bernard Shaw |
Ben Finney

From barry at  Tue Aug 11 05:09:34 2009
From: barry at (Barry Warsaw)
Date: Mon, 10 Aug 2009 23:09:34 -0400
Subject: [Python-Dev] Python mail-to-news gateway status (was: www/svn status update)
In-Reply-To: <>
References: <20090808202221.GA4911@andrew-kuchlings-macbook.local>
Message-ID: <>

On Aug 10, 2009, at 9:42 PM, Ben Finney wrote:

> "A.M. Kuchling" <amk at> writes:
>> The following sites are up again on a new machine, but cannot be
>> updated through SVN hooks or whatever mechanism:
> I don't see ? there. Is it affected?

It shouldn't be. is a different machine.

> I ask because I haven't seen any messages reach the "comp.lang.python"
> Usenet group since this report. Is that related?

Hmm, if you're getting this message then mailing lists should be  
working.  I don't know what if anything might be wrong with the  
gateway.  Are both directions affected or only one way?



From guido at  Tue Aug 11 06:03:09 2009
From: guido at (Guido van Rossum)
Date: Mon, 10 Aug 2009 21:03:09 -0700
Subject: [Python-Dev] Python mail-to-news gateway status (was: www/svn status update)
In-Reply-To: <>
References: <20090808202221.GA4911@andrew-kuchlings-macbook.local> 
Message-ID: <>

Wasn't there a problem with the spam filter recently?

On Mon, Aug 10, 2009 at 8:09 PM, Barry Warsaw<barry at> wrote:
> On Aug 10, 2009, at 9:42 PM, Ben Finney wrote:
>> "A.M. Kuchling" <amk at> writes:
>>> The following sites are up again on a new machine, but cannot be
>>> updated through SVN hooks or whatever mechanism:
>> I don't see ? there. Is it affected?
> It shouldn't be.  is a different machine.
>> I ask because I haven't seen any messages reach the "comp.lang.python"
>> Usenet group since this report. Is that related?
> Hmm, if you're getting this message then mailing lists should be working.  I
> don't know what if anything might be wrong with the gateway.  Are both
> directions affected or only one way?
> -Barry

--Guido van Rossum (home page:

From thomas at  Tue Aug 11 16:26:28 2009
From: thomas at (Thomas Wouters)
Date: Tue, 11 Aug 2009 16:26:28 +0200
Subject: [Python-Dev] www/svn status update
In-Reply-To: <>
References: <20090808202221.GA4911@andrew-kuchlings-macbook.local>
Message-ID: <>

On Mon, Aug 10, 2009 at 21:12, Thomas Wouters <thomas at> wrote:

> On Sat, Aug 8, 2009 at 22:22, A.M. Kuchling <amk at> wrote:
>> The following sites are up again on a new machine, but cannot be
>> updated through SVN hooks or whatever mechanism:
>> was deliberately not brought up again.  The backups
>> were a few hours behind and missing the ~10 most recent commits.  Not
>> disastrous, but it could probably mess up people's SVN trees, so after
>> some IRC discussion, the decision was to wait until the original disks
>> are available again.  That will probably not occur until Monday, maybe
>> Tuesday.
> I'm still waiting on a replacement controller, so it wasn't to be today.
> Hopefully tomorrow, if the hardware supplier has one in stock. Still no
> news on whether we have any chance at all on getting the old data back.

The new card had to be ordered (and I couldn't find any other place that had
them in stock) but it should arrive tomorrow or Thursday. On the plus side,
Martin found out there should be no problem with just inserting the card and
having it detect the RAID, so as long as the dying card didn't write garbage
to the disks we should be back up and running quite fast.

Thomas Wouters <thomas at>

Hi! I'm a .signature virus! copy me into your .signature file to help me

From jacobolus at  Wed Aug 12 00:09:15 2009
From: jacobolus at (Jacob Rus)
Date: Tue, 11 Aug 2009 15:09:15 -0700
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

Glyph Lefkowitz wrote:
> Jacob Rus wrote:
> No, [changing the semantics in 3.x] is bad.  If I may quote Guido:
>> So, once more for emphasis: Don't change your APIs at the same time as
>> porting to Py3k!
> Please follow this policy as much as possible in the standard library; the
> language transition is going to be hard enough.
>> Ooh, okay.  Well I guess we can't get rid of those then!
> Indeed not.

Well, I've had some patches up at for
over a week now, and my updated version should have identical
semantics to the current module, just with the module's *actual*
behavior clear to anyone reading the code, some serious edge-case bugs
fixed, and a general performance improvement.

I'd like to make some further changes, particularly in which types and
extensions the module knows about, to bring it up to date, and ideally
even to remove the dependency on an Apache install, but I'd like some
discussion and advice about it.

I have some other questions: How does one deprecate part of a standard
library API? How can we alert users to the deprecation? When can the
deprecated parts be removed?

I don't want to just give up on this, because I put more than a day of
time into it, and I really do think the previous code was of poorer
quality than should be in the standard library: I don't want new
Python users reading it and thinking that's just how things are done
around here. But if no one looks at my patches, I'm not sure what more
I can do.



From ncoghlan at  Wed Aug 12 05:19:07 2009
From: ncoghlan at (Nick Coghlan)
Date: Wed, 12 Aug 2009 13:19:07 +1000
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Jacob Rus wrote:
> Well, I've had some patches up at for
> over a week now, and my updated version should have identical
> semantics to the current module, just with the module's *actual*
> behavior clear to anyone reading the code, some serious edge-case bugs
> fixed, and a general performance improvement.

One thing that would definitely help promote the patch is if you could
figure out a way to test those edge cases in the mimetypes test suite.
Then the usual technique of "add new tests to test suite -> see errors
-> apply fixes to module -> errors go away" demonstrates clearly that
the bugs used to exist and ensures that they won't be reintroduced in
the future.
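
As a rough sketch of that workflow, a regression test pins down the expected
behaviour first, so that it fails before the fix and passes after it. The
three cases below are illustrative only; they are not the edge-case bugs
addressed by the patch, and the class name EdgeCaseTests is made up.

```python
import mimetypes
import unittest


class EdgeCaseTests(unittest.TestCase):
    def setUp(self):
        # Use a private MimeTypes instance rather than the module-level
        # functions, so each test works on its own database object.
        self.db = mimetypes.MimeTypes()

    def test_compressed_tarball(self):
        # The encoding suffix is peeled off before the type lookup.
        self.assertEqual(self.db.guess_type("backup.tar.gz"),
                         ("application/x-tar", "gzip"))

    def test_suffix_alias(self):
        # ".tgz" is mapped to ".tar.gz" before the lookup.
        self.assertEqual(self.db.guess_type("backup.tgz"),
                         ("application/x-tar", "gzip"))

    def test_unknown_extension(self):
        self.assertEqual(self.db.guess_type("notes.no-such-ext"),
                         (None, None))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(EdgeCaseTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In the real test suite these methods would simply be added to the existing
test module and picked up by the normal test runner.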

> I'd like to make some further changes, particularly in which types and
> extensions the module knows about, to bring it up to date, and ideally
> even to remove the dependency on an Apache install, but I'd like some
> discussion and advice about it.

I'd want someone more familiar with using MIME than I am (Barry maybe?)
to chime in before doing anything on that front.

> I have some other questions: How does one deprecate part of a standard
> library API? How can we alert users to the deprecation? When can the
> deprecated parts be removed?

warnings.warn with DeprecationWarning is the way to go for that. The code
stays in for at least one release with the warning (in this case, 2.7
and 3.2) and can then be removed in the subsequent release.
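
A minimal sketch of that pattern, using the standard warnings module. The
functions old_lookup and new_lookup are placeholders invented for the
example, not mimetypes APIs.

```python
import warnings


def new_lookup(name):
    """Replacement API (placeholder behaviour for the example)."""
    return name.lower()


def old_lookup(name):
    """Deprecated alias, kept for one release cycle before removal."""
    # stacklevel=2 attributes the warning to the caller's line, not this one.
    warnings.warn("old_lookup() is deprecated; use new_lookup() instead",
                  DeprecationWarning, stacklevel=2)
    return new_lookup(name)


# Record warnings instead of printing them, to show what callers would see.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = old_lookup("TEXT/HTML")
```

Since DeprecationWarning may be hidden by the default warning filters, the
deprecation is usually also noted in the documentation.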

> I don't want to just give up on this, because I put more than a day of
> time into it, and I really do think the previous code was of poorer
> quality than should be in the standard library: I don't want new
> Python users reading it and thinking that's just how things are done
> around here. But if no one looks at my patches, I'm not sure what more
> I can do.
> Again:

I added myself to the nosy list for your patch precisely so I could look
at it, but the RAID array in the subversion server went down late last
week and won't be fixed for another day or two.

Otherwise I probably would have tried this out over the weekend :(


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From benjamin at  Wed Aug 12 05:22:02 2009
From: benjamin at (Benjamin Peterson)
Date: Tue, 11 Aug 2009 22:22:02 -0500
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/11 Jacob Rus <jacobolus at>:
> I have some other questions: How does one deprecate part of a standard
> library API? How can we alert users to the deprecation? When can the
> deprecated parts be removed?

Basically, you add a DeprecationWarning to the API. Then remove it in
the next major version.

If python-dev was more interested, we would have a policy for this. *cough*

> I don't want to just give up on this, because I put more than a day of
> time into it, and I really do think the previous code was of poorer
> quality than should be in the standard library: I don't want new
> Python users reading it and thinking that's just how things are done
> around here. But if no one looks at my patches, I'm not sure what more
> I can do.

It looks like you need to add some tests for the bugs you fixed to
test_mimetypes. While you're at it, you could improve that test
generally, since it's not exactly extensive.

Then, you might garner some more reviews by putting your patch up on
Rietveld; it makes reviewing much painful.


From ben+python at  Wed Aug 12 06:16:26 2009
From: ben+python at (Ben Finney)
Date: Wed, 12 Aug 2009 14:16:26 +1000
Subject: [Python-Dev] Python mail-to-news gateway status
References: <20090808202221.GA4911@andrew-kuchlings-macbook.local>
Message-ID: <>

Barry Warsaw <barry at> writes:

> On Aug 10, 2009, at 9:42 PM, Ben Finney wrote:
> > I ask because I haven't seen any messages reach the
> > "comp.lang.python" Usenet group since this report. Is that related?
> Hmm, if you're getting this message then mailing lists should be
> working. I don't know what if anything might be wrong with the
> gateway. Are both directions affected or only one way?

It seems to be a problem specific to my Usenet provider. Thanks for the
ongoing work to restore services.

 \        "A hundred times every day I remind myself that [...] I must |
  `\       exert myself in order to give in the same measure as I have |
_o__)              received and am still receiving" -- Albert Einstein |
Ben Finney

From eric at  Wed Aug 12 12:32:33 2009
From: eric at (Eric Smith)
Date: Wed, 12 Aug 2009 06:32:33 -0400
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Benjamin Peterson wrote:
> Then, you might garner some more reviews by putting your patch up on
> Rietveld; it makes reviewing much painful.

"... much _less_ painful", I hope!

From gelhaus at  Wed Aug 12 11:53:31 2009
From: gelhaus at (Martin Gelhaus)
Date: Wed, 12 Aug 2009 11:53:31 +0200
Subject: [Python-Dev] =?utf-8?q?Study_on_communication_and_collaboration_i?=
Message-ID: <>

Dear Python developer,

within the scope of my diploma thesis at the University of Paderborn, Germany, with the title "Study about communication and collaboration in software development in teams" I am conducting a survey of members of software development teams.

I would be very grateful if you help me in my studies and answer the survey at

This is the official description of the survey:

In the last years many means of communication and collaboration were introduced in software projects to assist the development teams with their daily work.

With this study we want to identify requirements for a communication- and collaboration-supporting platform for software development. For this purpose we will evaluate the utilization and effectiveness of different means of communication and collaboration in solving software and managerial problems in software development teams.

The survey will take about 10-15 minutes and contains 55 questions that cover various topics.

Many thanks for your support of my research. If there are any further questions, don't hesitate to contact me.
Best regards from Paderborn, Germany

Martin Gelhaus (gelhaus at

Click here to do the survey:

Martin Gelhaus

Graduand at Didactics of Informatics chair at University of Paderborn
Fürstenallee 11
Room F2.416
D-33102 Paderborn

From chris at  Wed Aug 12 13:05:46 2009
From: chris at (Chris Withers)
Date: Wed, 12 Aug 2009 12:05:46 +0100
Subject: [Python-Dev] how to debug httplib slowness
Message-ID: <>

Hi All,

I'd like to work on this issue:

Specifically, in my case, while IE can download a 150Mb file from a 
local server in about 3 seconds, httplib takes over 20 minutes!

However, I'm kinda stumped on where to start with debugging the 
difference. I've tried upping the buffer size as suggested in the issue, 
but it's had no effect...

Any ideas?


Simplistix - Content Management, Batch Processing & Python Consulting

From solipsis at  Wed Aug 12 13:37:12 2009
From: solipsis at (Antoine Pitrou)
Date: Wed, 12 Aug 2009 11:37:12 +0000 (UTC)
Subject: [Python-Dev] how to debug httplib slowness
References: <>
Message-ID: <>

Chris Withers <chris <at>> writes:
> However, I'm kinda stumped on where to start with debugging the 
> difference. I've tried upping the buffer size as suggested in the issue, 
> but it's had no effect...

Then perhaps it's not the same bug.
Please take a look at CPU utilization during the download. If Python takes close
to 100% CPU, it might be due to the lack of buffering or any other suboptimal
situation in the implementation. If Python takes close to 0%, then it's just
waiting on data to arrive from the network...
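
One rough way to make that check from inside Python, as a sketch only: the
helper name cpu_fraction is made up, and it compares process CPU time with
wall-clock time around a single call. A ratio near 1.0 suggests the call is
CPU-bound; a ratio near 0 suggests it is mostly waiting.

```python
import time


def cpu_fraction(func, *args):
    """Return (result, cpu_seconds / wall_seconds) for one call of func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    # Guard against a zero-length interval on very coarse clocks.
    return result, (cpu_used / wall_used if wall_used else 0.0)


# time.sleep() stands in for a call that only waits on the network.
_, waiting_ratio = cpu_fraction(time.sleep, 0.2)
print("sleeping call used %.0f%% CPU" % (waiting_ratio * 100))
```

Note that time.process_time() counts CPU time for the whole process, so
other busy threads would inflate the ratio.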

From stefan_ml at  Wed Aug 12 14:20:31 2009
From: stefan_ml at (Stefan Behnel)
Date: Wed, 12 Aug 2009 14:20:31 +0200
Subject: [Python-Dev] [issue6673] Py3.1 hangs in coroutine and eats up
	all memory
In-Reply-To: <>
References: <>
Message-ID: <h5uc2h$d1m$>

[moving this from the bug tracker]

Alexandre Vassalotti wrote:
> Alexandre Vassalotti added the comment:
> Not a bug.
> The list comprehension in your chunker:
>     while True:
>         target.send([ (yield) for i in range(chunk_size) ])
> is equivalent to the following generator in Python 3:
>     while True:
>         def g():
>             for i in range(chunk_size):
>                 yield (yield)
>         target.send(list(g()))
> This is clearly not what you want.

Does this do anything meaningful, or would it make sense to output a
compiler warning (or better: an error) here?

Using yield in a comprehension (as opposed to a generator expression, which
I intuitively expected not to work) doesn't look at all dangerous at first
glance, so it was quite surprising to see it fail that drastically.

This is also an important issue for other Python implementations. Cython
simply transforms comprehensions into the equivalent for-loop, so when we
implement PEP 342 in Cython, we will have to find a way to emulate
CPython's behaviour here (unless we decide to stick with Py2.x semantics,
which would not be my preferred solution).


> So, just rewrite your code using for-loop:
>     while True:
>         result = []
>         for i in range(chunk_size):
>             result.append((yield))
>         target.send(result)
> ----------
> nosy: +alexandre.vassalotti
> resolution:  -> invalid
> status: open -> closed
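
For reference, a self-contained sketch of the for-loop rewrite quoted above.
The chunker body follows the quoted code; the collector coroutine and the
demo function are made up here just to have something to send chunks to.

```python
def chunker(chunk_size, target):
    """Coroutine: gather sent values into lists of chunk_size items."""
    while True:
        result = []
        for i in range(chunk_size):
            result.append((yield))
        target.send(result)


def collector(out):
    """Hypothetical sink coroutine: append everything it is sent to out."""
    while True:
        out.append((yield))


def demo():
    """Feed seven values through a chunker of size three."""
    chunks = []
    sink = collector(chunks)
    next(sink)            # prime the sink up to its first yield
    source = chunker(3, sink)
    next(source)          # prime the chunker up to its first yield
    for value in range(7):
        source.send(value)
    return chunks         # the seventh value is still buffered
```

Because the yield now sits directly in the body of chunker, chunker itself
is the generator, which is what the original comprehension version was
meant to achieve.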

From ncoghlan at  Wed Aug 12 14:22:34 2009
From: ncoghlan at (Nick Coghlan)
Date: Wed, 12 Aug 2009 22:22:34 +1000
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Benjamin Peterson wrote:
> <rant>
> If python-dev was more interested, we would have a policy for this. *cough*
> </rant>

PEP 5 isn't enough? (I'll grant that PEP could probably do with
mentioning the use of warnings.warn(DeprecationWarning) explicitly, but
the policy itself seems fine)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Wed Aug 12 14:25:19 2009
From: ncoghlan at (Nick Coghlan)
Date: Wed, 12 Aug 2009 22:25:19 +1000
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Nick Coghlan wrote:
> Benjamin Peterson wrote:
>> <rant>
>> If python-dev was more interested, we would have a policy for this. *cough*
>> </rant>
> PEP 5 isn't enough? (I'll grant that PEP could probably do with
> mentioning the use of warnings.warn(DeprecationWarning) explicitly, but
> the policy itself seems fine)

Oops, I get it now :)


P.S. For anyone else that is slow like me, take a close look at PEP 387...

Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Wed Aug 12 15:00:45 2009
From: ncoghlan at (Nick Coghlan)
Date: Wed, 12 Aug 2009 23:00:45 +1000
Subject: [Python-Dev] [issue6673] Py3.1 hangs in coroutine and eats up
 all memory
In-Reply-To: <h5uc2h$d1m$>
References: <>
Message-ID: <>

Stefan Behnel wrote:
> This is also an important issue for other Python implementations. Cython
> simply transforms comprehensions into the equivalent for-loop, so when we
> implement PEP 342 in Cython, we will have to find a way to emulate
> CPython's behaviour here (unless we decide to stick with Py2.x semantics,
> which would not be my preferred solution).

How do you do that without leaking the iteration variable into the
current namespace?

Avoiding that leakage is where the semantic change between 2.x and 3.x
came from here: 2.x just creates the for loop inline (thus leaking the
iteration variable into the current scope), while 3.x creates an inner
function that does the iteration so that the iteration variables exist
in their own scope without polluting the namespace of the containing

The translation of your example isn't quite as Alexandre describes it -
we do at least avoid the overhead of creating a generator function in
the list comprehension case. It's more like:

    while True:
        def f():
            result = []
            for i in range(chunk_size):
                result.append((yield))
            return result
        target.send(f())

So what you end up with is a generator that has managed to bypass the
syntactic restriction that disallows returning non-None values from
generators. In CPython it appears that happens to end up being executed
as if the return was just another yield expression (most likely due to a
quirk in the implementation of RETURN_VALUE inside generators):

    while True:
        def f():
            result = []
            for i in range(chunk_size):
                result.append((yield))
            yield result
        target.send(f())

It seems to me that CPython should be raising a SyntaxError for yield
expressions inside comprehensions (in line with the "no returning values
other than None from generator functions" rule), and probably for
generator expressions as well.
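
A small probe for checking how a given interpreter treats the construct, as
a sketch: the helper name compiles and the two source strings are made up
for illustration. As far as I know, CPython releases from 3.8 onward do
reject a yield inside a comprehension at compile time, while the explicit
for-loop form is accepted.

```python
import sys

COMPREHENSION_SRC = "def f():\n    return [(yield) for i in range(10)]\n"
LOOP_SRC = (
    "def f():\n"
    "    result = []\n"
    "    for i in range(10):\n"
    "        result.append((yield))\n"
    "    return result\n"
)


def compiles(source):
    """Return True if the compiler accepts source, False on SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


print(sys.version_info[:2], "comprehension:", compiles(COMPREHENSION_SRC))
print(sys.version_info[:2], "for loop:     ", compiles(LOOP_SRC))
```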


P.S. Experimentation at a 3.x interpreter prompt:

>>> def f():
...   return [(yield) for i in range(10)]
>>> x = f()
>>> next(x)
>>> for i in range(8):
...   x.send(i)
>>> x.send(8)
>>> next(x)
[0, 1, 2, 3, 4, 5, 6, 7, 8, None]
>>> x = f()
>>> next(x)
>>> for i in range(10): # A statement with a return value!
...   x.send(i)
[0, 1, 2, 3, 4, 5, 6, 7, 8, None]
>>> dis(f)
  2           0 LOAD_CONST               1 (<code object <listcomp> at
0xb7c53bf0, file "<stdin>", line 2>)
              3 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              9 LOAD_CONST               2 (10)
             12 CALL_FUNCTION            1
             15 GET_ITER
             16 CALL_FUNCTION            1
             19 RETURN_VALUE
>>> dis(f.__code__.co_consts[1])
  2           0 BUILD_LIST               0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                13 (to 22)
              9 STORE_FAST               1 (i)
             12 LOAD_CONST               0 (None)
             15 YIELD_VALUE
             16 LIST_APPEND              2
             19 JUMP_ABSOLUTE            6
        >>   22 RETURN_VALUE

Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From chris at  Wed Aug 12 15:40:56 2009
From: chris at (Chris Withers)
Date: Wed, 12 Aug 2009 14:40:56 +0100
Subject: [Python-Dev] how to debug httplib slowness
In-Reply-To: <>
References: <>
Message-ID: <>

Antoine Pitrou wrote:
> Chris Withers <chris <at>> writes:
>> However, I'm kinda stumped on where to start with debugging the 
>> difference. I've tried upping the buffer size as suggested in the issue, 
>> but it's had no effect...
> Then perhaps it's not the same bug.
> Please take a look at CPU utilization during the download. If Python takes close
> to 100% CPU, it might be due to the lack of buffering or any other suboptimal
> situation in the implementation. 

Well, it's locked at 25% on a quad core box, so yeah, I'd say something 
is wrong ;-)

I guess I could try profiling it and finding out where most of the time is
being spent?
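
A minimal sketch of that with the standard cProfile and pstats modules. The
profile_call helper and the slow_download_stand_in function are made up; in
practice the call being profiled would be whatever drives the httplib
download.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10):
    """Run func(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Cumulative time shows which call chains dominate the run.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def slow_download_stand_in(n):
    """Hypothetical stand-in: build a payload one byte at a time."""
    data = b""
    for _ in range(n):
        data += b"x"
    return len(data)


size, report = profile_call(slow_download_stand_in, 1000)
print(report)
```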


Simplistix - Content Management, Batch Processing & Python Consulting

From stefan_ml at  Wed Aug 12 16:33:54 2009
From: stefan_ml at (Stefan Behnel)
Date: Wed, 12 Aug 2009 16:33:54 +0200
Subject: [Python-Dev] [issue6673] Py3.1 hangs in coroutine and eats up
	all memory
In-Reply-To: <>
References: <>	<h5uc2h$d1m$>
Message-ID: <h5ujsk$bl3$>

Nick Coghlan wrote:
> Stefan Behnel wrote:
>> This is also an important issue for other Python implementations. Cython
>> simply transforms comprehensions into the equivalent for-loop, so when we
>> implement PEP 342 in Cython, we will have to find a way to emulate
>> CPython's behaviour here (unless we decide to stick with Py2.x semantics,
>> which would not be my preferred solution).
> How do you do that without leaking the iteration variable into the
> current namespace?

We currently have 2.x semantics for comprehensions anyway, but the
(long-standing) idea is to move comprehensions into their own scope (not a
function, just a new type of scope), so that all names defined inside the
expressions end up inside of the inner scope. This is completely orthogonal
to the loop transformation itself, though, which would simply happen inside
of the inner scope.
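
For comparison, a tiny sketch of the Py3 scoping being discussed (the
function name is made up): the comprehension's loop variable stays in the
inner scope, while a plain for loop still leaves its variable behind in the
enclosing one.

```python
def comprehension_scope_demo():
    """Show which loop variables survive in the enclosing scope on Py3."""
    i = "outer"
    squares = [i * i for i in range(4)]  # this i lives in the inner scope
    for j in range(4):                   # a plain for loop does leak j
        pass
    return i, squares, j


print(comprehension_scope_demo())
```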

However, having to emulate the other Py3 semantics for comprehensions that
this thread is about, would pretty much kill such a simple solution.

> The translation of your example isn't quite as Alexandre describes it -
> we do at least avoid the overhead of creating a generator function in
> the list comprehension case. It's more like:
>     while True:
>         def f():
>             result = []
>             for i in range(chunk_size):
>                 result.append((yield))
>             return result
>         target.send(f())

So the problem is that f(), i.e. the function-wrapped comprehension itself,
swallows the "(yield)" expression (which redundantly makes it a generator).
That means that the outer function in my example, which was

	def chunker(chunk_size, target):
	    while True:
	        target.send([ (yield) for i in range(chunk_size) ])

doesn't become a generator itself, so the above simply ends up as an
infinite loop.

IMHO, that's pretty far from obvious when you look at the code.

Also, the target receives a "generator object <listcomp>" instead of a
list. That sounds weird.

> It seems to me that CPython should be raising a SyntaxError for yield
> expressions inside comprehensions (in line with the "no returning values
> other than None from generator functions" rule), and probably for
> generator expressions as well.

Yes, that's what I was suggesting. Disallowing it in genexps is a more open
question, though. I wouldn't mind being able to send() values into a
generator expression, or to throw() exceptions during their execution.

Anyway, I have no idea about a use case, so it might just as well be
disallowed for symmetry reasons.


From guido at  Wed Aug 12 17:07:32 2009
From: guido at (Guido van Rossum)
Date: Wed, 12 Aug 2009 08:07:32 -0700
Subject: [Python-Dev] how to debug httplib slowness
In-Reply-To: <>
References: <>
Message-ID: <>

Try instrumenting the actual calls to the lowest-level socket methods
(recv() and send()) and log for each one the arguments, return time,
and how long it took. You might see a pattern. Is this on Windows?
It's embarrassing, we've had problems with socket speed on Windows
since 1999 and they're still not gone... :-(
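
A rough sketch of that kind of instrumentation, under stated assumptions:
LoggingSocket is a made-up wrapper class, not anything in httplib or the
socket module, and the demo uses a local socket.socketpair() instead of a
real HTTP connection. It records, for each recv() and send(), the argument,
the size returned and the elapsed time.

```python
import socket
import time


class LoggingSocket:
    """Hypothetical thin wrapper that records each recv()/send() call."""

    def __init__(self, sock, log):
        self._sock = sock
        self._log = log  # list of (method, argument, returned size, seconds)

    def recv(self, bufsize):
        start = time.perf_counter()
        data = self._sock.recv(bufsize)
        elapsed = time.perf_counter() - start
        self._log.append(("recv", bufsize, len(data), elapsed))
        return data

    def send(self, data):
        start = time.perf_counter()
        sent = self._sock.send(data)
        elapsed = time.perf_counter() - start
        self._log.append(("send", len(data), sent, elapsed))
        return sent

    def __getattr__(self, name):
        # Everything else is passed straight through to the real socket.
        return getattr(self._sock, name)


def demo():
    """Push some bytes through a local socket pair and return the call log."""
    log = []
    left, right = socket.socketpair()
    try:
        writer, reader = LoggingSocket(left, log), LoggingSocket(right, log)
        sent = writer.send(b"x" * 1000)
        received = 0
        while received < sent:
            received += len(reader.recv(256))
    finally:
        left.close()
        right.close()
    return log
```

A log full of very small returned sizes, or of long waits per call, would be
the sort of pattern to look for.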

On Wed, Aug 12, 2009 at 4:05 AM, Chris Withers<chris at> wrote:
> Hi All,
> I'd like to work on this issue:
> Specifically, in my case, while IE can download a 150Mb file from a local
> server in about 3 seconds, httplib takes over 20 minutes!
> However, I'm kinda stumped on where to start with debugging the difference.
> I've tried upping the buffer size as suggested in the issue, but it's had no
> effect...
> Any ideas?
> Chris
> --
> Simplistix - Content Management, Batch Processing & Python Consulting
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at
> Unsubscribe:

--Guido van Rossum (home page:

From guido at  Wed Aug 12 17:07:51 2009
From: guido at (Guido van Rossum)
Date: Wed, 12 Aug 2009 08:07:51 -0700
Subject: [Python-Dev] how to debug httplib slowness
In-Reply-To: <>
References: <>
Message-ID: <>

s/return time/return size/

On Wed, Aug 12, 2009 at 8:07 AM, Guido van Rossum<guido at> wrote:
> Try instrumenting the actual calls to the lowest-level socket methods
> (recv() and send()) and log for each one the arguments, return time,
> and how long it took. You might see a pattern. Is this on Windows?
> It's embarrassing, we've had problems with socket speed on Windows
> since 1999 and they're still not gone... :-(
> On Wed, Aug 12, 2009 at 4:05 AM, Chris Withers<chris at> wrote:
>> Hi All,
>> I'd like to work on this issue:
>> Specifically, in my case, while IE can download a 150Mb file from a local
>> server in about 3 seconds, httplib takes over 20 minutes!
>> However, I'm kinda stumped on where to start with debugging the difference.
>> I've tried upping the buffer size as suggested in the issue, but it's had no
>> effect...
>> Any ideas?
>> Chris
>> --
>> Simplistix - Content Management, Batch Processing & Python Consulting
> --
> --Guido van Rossum (home page:

--Guido van Rossum (home page:

From solipsis at  Wed Aug 12 17:18:14 2009
From: solipsis at (Antoine Pitrou)
Date: Wed, 12 Aug 2009 15:18:14 +0000 (UTC)
Subject: [Python-Dev]
References: <>	<h5uc2h$d1m$>
	<> <h5ujsk$bl3$>
Message-ID: <>

Stefan Behnel <stefan_ml <at>> writes:
> IMHO, that's pretty far from obvious when you look at the code.

A "yield" wrapped in a list comprehension looks far from obvious IMO anyway,
whether in 2.x or 3.x. It's this kind of "smart" writing tricks people find that
only makes code more difficult to read for others (à la Perl).



From chris at  Wed Aug 12 17:34:56 2009
From: chris at (Chris Withers)
Date: Wed, 12 Aug 2009 16:34:56 +0100
Subject: [Python-Dev] how to debug httplib slowness
In-Reply-To: <>
References: <>
Message-ID: <>

Guido van Rossum wrote:
> Try instrumenting the actual calls to the lowest-level socket methods
> (recv() and send()) and log for each one the arguments, return time,
> and how long it took.

Can I do that in python code?

> You might see a pattern. Is this on Windows?

Well, yes, but I'm not 100%. The problematic machine is a Windows box, 
but there are no non-windows boxes on that network and vpn'ing from one 
of my non-windows boxes slows things down enough that I'm not confident 
what I'd be seeing was indicative of the same problem...

> It's embarrassing, we've had problems with socket speed on Windows
> since 1999 and they're still not gone... :-(

Oh dear :-(


Simplistix - Content Management, Batch Processing & Python Consulting

From guido at  Wed Aug 12 18:10:57 2009
From: guido at (Guido van Rossum)
Date: Wed, 12 Aug 2009 09:10:57 -0700
Subject: [Python-Dev] how to debug httplib slowness
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Aug 12, 2009 at 8:34 AM, Chris Withers<chris at> wrote:
> Guido van Rossum wrote:
>> Try instrumenting the actual calls to the lowest-level socket methods
>> (recv() and send()) and log for each one the arguments, return time,
>> and how long it took.
> Can I do that in python code?

Probably if you hack on the file long enough.

>> You might see a pattern. Is this on Windows?
> Well, yes, but I'm not 100%. The problematic machine is a Windows box, but
> there are no non-windows boxes on that network and vpn'ing from one of my
> non-windows boxes slows things down enough that I'm not confident what I'd
> be seeing was indicative of the same problem...

Time to set up a more conclusive test. Do you have something like curl
or wget available on the same box?

>> It's embarrassing, we've had problems with socket speed on Windows
>> since 1999 and they're still not gone... :-(
> Oh dear :-(

Well it may be that it's really just your box. Or proxy settings. Look
into proxy settings.

--Guido van Rossum (home page:

From thomas at  Wed Aug 12 18:33:28 2009
From: thomas at (Thomas Wouters)
Date: Wed, 12 Aug 2009 09:33:28 -0700
Subject: [Python-Dev] www/svn status update
In-Reply-To: <>
References: <20090808202221.GA4911@andrew-kuchlings-macbook.local>
Message-ID: <>

I replaced the RAID controller, the old data was still intact, so I brought
the temporary machine down and the new machine up. Everything seems to work
just fine, so happy svn-up'ing.

(I will reboot for a few minutes, to check its serial
console configuration, but that shouldn't affect anyone.)

On Tue, Aug 11, 2009 at 07:26, Thomas Wouters <thomas at> wrote:

> On Mon, Aug 10, 2009 at 21:12, Thomas Wouters <thomas at> wrote:
>> On Sat, Aug 8, 2009 at 22:22, A.M. Kuchling <amk at> wrote:
>>> The following sites are up again on a new machine, but cannot be
>>> updated through SVN hooks or whatever mechanism:
>>> was deliberately not brought up again.  The backups
>>> were a few hours behind and missing the ~10 most recent commits.  Not
>>> disastrous, but it could probably mess up people's SVN trees, so after
>>> some IRC discussion, the decision was to wait until the original disks
>>> are available again.  That will probably not occur until Monday, maybe
>>> Tuesday.
>> I'm still waiting on a replacement controller, so it wasn't to be today.
>> Hopefully tomorrow, if the hardware supplier has one in stock. Still no
>> news on whether we have any chance at all on getting the old data back.
> The new card had to be ordered (and I couldn't find any other place that
> had them in stock) but it should arrive tomorrow or Thursday. On the plus
> side, Martin found out there should be no problem with just inserting the
> card and having it detect the RAID, so as long as the dying card didn't
> write garbage to the disks we should be back up and running quite fast.
> --
> Thomas Wouters <thomas at>
> Hi! I'm a .signature virus! copy me into your .signature file to help me
> spread!

Thomas Wouters <thomas at>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From greg at  Wed Aug 12 18:55:32 2009
From: greg at (Gregory P. Smith)
Date: Wed, 12 Aug 2009 09:55:32 -0700
Subject: [Python-Dev] www/svn status update
In-Reply-To: <>
References: <20090808202221.GA4911@andrew-kuchlings-macbook.local> 
Message-ID: <>

On Wed, Aug 12, 2009 at 9:33 AM, Thomas Wouters<thomas at> wrote:
> I replaced the RAID controller, the old data was still intact, so I brought
> the temporary machine down and the new machine up. Everything seems to work
> just fine, so happy svn-up'ing.
> (I will reboot for a few minutes, to check its serial
> console configuration, but that shouldn't affect anyone.)

Yay!  Thanks for your dedicated work!


> On Tue, Aug 11, 2009 at 07:26, Thomas Wouters <thomas at> wrote:
>> On Mon, Aug 10, 2009 at 21:12, Thomas Wouters <thomas at> wrote:
>>> On Sat, Aug 8, 2009 at 22:22, A.M. Kuchling <amk at> wrote:
>>>> The following sites are up again on a new machine, but cannot be
>>>> updated through SVN hooks or whatever mechanism:
>>>> was deliberately not brought up again.  The backups
>>>> were a few hours behind and missing the ~10 most recent commits.  Not
>>>> disastrous, but it could probably mess up people's SVN trees, so after
>>>> some IRC discussion, the decision was to wait until the original disks
>>>> are available again.  That will probably not occur until Monday, maybe
>>>> Tuesday.
>>> I'm still waiting on a replacement controller, so it wasn't to be today.
>>> Hopefully tomorrow, if the hardware supplier has one in stock. Still no
>>> news on whether we have any chance at all on getting the old data back.
>> The new card had to be ordered (and I couldn't find any other place that
>> had them in stock) but it should arrive tomorrow or Thursday. On the plus
>> side, Martin found out there should be no problem with just inserting the
>> card and having it detect the RAID, so as long as the dying card didn't
>> write garbage to the disks we should be back up and running quite fast.
>> --
>> Thomas Wouters <thomas at>
>> Hi! I'm a .signature virus! copy me into your .signature file to help me
>> spread!
> --
> Thomas Wouters <thomas at>
> Hi! I'm a .signature virus! copy me into your .signature file to help me
> spread!

From jacobolus at  Wed Aug 12 19:54:30 2009
From: jacobolus at (Jacob Rus)
Date: Wed, 12 Aug 2009 10:54:30 -0700
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

Benjamin Peterson wrote:
> It looks like you need to add some tests for the bugs you fixed to
> test_mimetypes. While you're at it, you could improve that test
> generally, since it's not exactly extensive.

Okay, I'll try to do this sometime in the next few days, if I get the chance.

> Then, you might garner some more reviews by putting your patch up on
> Rietveld; it makes reviewing much painful.

Okay, now that is back up, here's a Rietveld link:

Jacob Rus

From solipsis at  Wed Aug 12 21:04:26 2009
From: solipsis at (Antoine Pitrou)
Date: Wed, 12 Aug 2009 19:04:26 +0000 (UTC)
Subject: [Python-Dev] how to debug httplib slowness
References: <>
Message-ID: <>

Chris Withers <chris <at>> writes:
> Well, it's locked at 25% on a quad core box, so yeah, I'd say something 
> is wrong 
> I guess I could try profile it and finding out where most of the time is 
> being spent?

I guess you could indeed :-)


From stefan_ml at  Thu Aug 13 10:11:30 2009
From: stefan_ml at (Stefan Behnel)
Date: Thu, 13 Aug 2009 10:11:30 +0200
Subject: [Python-Dev] [issue6673] Py3.1 hangs in coroutine and eats up
	all memory
In-Reply-To: <>
References: <>	<h5uc2h$d1m$>	<>
Message-ID: <h60hrk$51l$>

Antoine Pitrou wrote:
> Stefan Behnel <stefan_ml <at>> writes:
>> IMHO, that's pretty far from obvious when you look at the code.
> A "yield" wrapped in a list comprehension looks far from obvious IMO anyway,
> whether in 2.x or 3.x. It's this kind of "smart" writing tricks people find that
> only makes code more difficult to read for others (à la Perl).

So, your vote is to make it a compiler error as well?


From lists at  Thu Aug 13 14:10:56 2009
From: lists at (Christian Heimes)
Date: Thu, 13 Aug 2009 14:10:56 +0200
Subject: [Python-Dev] Microsoft MSDN
In-Reply-To: <>
References: <>
Message-ID: <>

Steve Holden wrote:
> I sent fourteen requests for licenses in to Microsoft. I've asked them
> to let me know which they grant (since they may choose to limit the
> number) and will inform you all personally when I hear their decision.

I've received my MSDN subscription today. Everybody watch out for a 
message from MSDN! I almost confused the email with spam.

Thanks for your work and please forward my gratitude to James Rice.


From g.brandl at  Thu Aug 13 15:26:27 2009
From: g.brandl at (Georg Brandl)
Date: Thu, 13 Aug 2009 15:26:27 +0200
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <h614a3$r92$>

Nick Coghlan schrieb:
> Nick Coghlan wrote:
>> Benjamin Peterson wrote:
>>> <rant>
>>> If python-dev was more interested, we would have a policy for this. *cough*
>>> </rant>
>> PEP 5 isn't enough? (I'll grant that PEP could probably do with
>> mentioning the use of warnings.warn(DeprecationWarning) explicitly, but
>> the policy itself seems fine)
> Oops, I get it now :)
> Cheers,
> Nick.
> P.S. For anyone else that is slow like me, take a close look at PEP 387...

What should we see, other than that we have two PEPs on the same topic that
should be merged?


Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.

From jan.matejek at  Thu Aug 13 20:23:14 2009
From: jan.matejek at (Jan Matejek)
Date: Thu, 13 Aug 2009 20:23:14 +0200
Subject: [Python-Dev] request for comments - standardization of python's
	purelib and platlib
Message-ID: <h61lmi$m3v$>


I'm cross-posting this to distributions at freedesktop and python-dev,
because the topic is relevant to both groups and should be solved in

The issue:

In Python's default configuration (on linux), both purelib (location for
pure python modules) and platlib (location for platform-dependent binary
extensions) point to $prefix/lib/pythonX.Y/site-packages.
That is no good for two main reasons.

One, python depends on the "lib" directory. (from distro's point of
view, prefix is /usr, so let's talk /usr/lib) Due to this, it's
impossible to install python under /usr/lib64 without heavy patching.
Repeated attempts to bring python developers to acknowledge and rectify
the situation have all failed (common argument here is "that would mean
redesign of distutils and huge parts of whatnot").

Conversely, that also means that multiarch setup (/usr/lib or lib32 with
32bit python and /usr/lib64 with 64bit python) is not possible with
stock python.

Two, the default configuration makes purelib and platlib identical,
which somehow defeats the purpose of the distinction in the first place.
You either need to patch the default, or supply some alternate
configuration to take advantage of this feature.
And that's not the end of it - the next step is to make python aware of
two different locations on sys.path, one for purelib and one for
platlib, which is a different story altogether.
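
To see what a given interpreter currently uses for the two locations, here
is a minimal sketch. It assumes an interpreter that ships the sysconfig
module, and the helper name site_package_dirs is made up. On a stock build
the two paths will often print as the same directory, which is the second
problem described above.

```python
import sysconfig


def site_package_dirs():
    """Return the purelib and platlib directories of this interpreter."""
    paths = sysconfig.get_paths()
    return paths["purelib"], paths["platlib"]


purelib, platlib = site_package_dirs()
print("purelib:", purelib)
print("platlib:", platlib)
print("identical:", purelib == platlib)
```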

As distributors, we like to take advantage of purelib/platlib separation
to package pure python modules as platform-independent (noarch for
rpm-speakers). And that's not easy to do properly.

The proposal:

Let's put our heads together and choose good default locations for
purelib and platlib. Then add support to python for recognizing the
locations by default, and possibly leave note in FHS that "this is the

This is IMO a good first step to making python multiarch-aware, and it
would also help a bit with LSB integration [1].

I've come up with three basic options for the configuration (substitute
"/usr" with "$prefix" if you're not a distributor). This list is by no
means comprehensive, it's just what looked reasonable at the time of

1 - the traditional way
purelib = /usr/lib/pythonX.Y/site-packages
platlib = /usr/lib(64)/pythonX.Y/site-packages

+ this is already the default for 32bit systems
+ major distributions (including Fedora, Mandriva and now finally
openSUSE too) do this
- 32bit systems have no separation, poor they!
- with multiarch setup, /usr/lib is "cluttered" by both
platform-dependent files for 32bit and platform-independent files shared
by the platforms. Also, 64bit python can pick up 32bit modules. That
doesn't cause problems in practice, but doesn't feel like a clean design.

2 - the sharedir way
purelib = /usr/share/python/X.Y
platlib = /usr/lib(64)/pythonX.Y/site-packages

+ clean separation of purelib - nice!
+ unheard of - a good place to start anew
- FHS states that /usr/share is for data. But OTOH, they don't say much
about platform-independent bytecode. We could probably get an exception
for this.
- unheard of - everyone will be surprised

3 - the perl way
purelib = /usr/lib/pythonX.Y
platlib = /usr/lib/pythonX.Y/lib-dynload-(platform-identifier)/site-packages

+ possibility of multiarch packages that would install pure python parts
into purelib and extensions or accelerators for more platforms at once -
and therefore, possibility to split large modules into
platform-dependent and platform-independent parts and save space on
installation media
+ "idea compatibility" with perl and ruby, one less install layout to learn
- completely different from what we have now - would require the most
work from both python developers and distributions


jan matejek
python packager for SUSE Linux


From at  Thu Aug 13 20:51:18 2009
From: at (David Bolen)
Date: Thu, 13 Aug 2009 14:51:18 -0400
Subject: [Python-Dev] Microsoft MSDN
References: <> <>
Message-ID: <>

Christian Heimes <lists at> writes:

> Steve Holden wrote:
>> I sent fourteen requests for licenses in to Microsoft. I've asked them
>> to let me know which they grant (since they may choose to limit the
>> number) and will inform you all personally when I hear their decision.
> I've received my MSDN subscription today. Everybody watch out for a
> message from MSDN! I almost confused the email with spam.
> Thanks for your work and please forward my gratitude to James Rice.

Ditto from me (my subscription info arrived yesterday afternoon).  Many
thanks to all involved!

-- David

From brett at  Thu Aug 13 21:22:49 2009
From: brett at (Brett Cannon)
Date: Thu, 13 Aug 2009 12:22:49 -0700
Subject: [Python-Dev] request for comments - standardization of python's
	purelib and platlib
In-Reply-To: <h61lmi$m3v$>
References: <h61lmi$m3v$>
Message-ID: <>

On Thu, Aug 13, 2009 at 11:23, Jan Matejek <jan.matejek at> wrote:

> Hello,
> I'm cross-posting this to distributions at freedesktop and python-dev,
> because the topic is relevant to both groups and should be solved in
> cooperation.
> The issue:
> In Python's default configuration (on linux), both purelib (location for
> pure python modules) and platlib (location for platform-dependent binary
> extensions) point to $prefix/lib/pythonX.Y/site-packages.
> That is no good for two main reasons.
> One, python depends on the "lib" directory. (from distro's point of
> view, prefix is /usr, so let's talk /usr/lib) Due to this, it's
> impossible to install python under /usr/lib64 without heavy patching.
> Repeated attempts to bring python developers to acknowledge and rectify
> the situation have all failed (common argument here is "that would mean
> redesign of distutils and huge parts of whatnot").

This is now Tarek's call, so this may or may not have changed in terms of
what the (now) distutils maintainer thinks.

> Conversely, that also means that multiarch setup (/usr/lib or lib32 with
> 32bit python and /usr/lib64 with 64bit python) is not possible with
> stock python.
> Two, the default configuration makes purelib and platlib identical,
> which somehow defeats the purpose of the distinction in the first place.
> You either need to patch the default, or supply some alternate
> configuration to take advantage of this feature.
> And that's not the end of it - the next step is to make python aware of
> two different locations on sys.path, one for purelib and one for
> platlib, which is a different story altogether.
> As distributors, we like to take advantage of purelib/platlib separation
> to package pure python modules as platform-independent (noarch for
> rpm-speakers). And that's not easy to do properly.
> The proposal:
> Let's put our heads together and choose good default locations for
> purelib and platlib. Then add support to python for recognizing the
> locations by default, and possibly leave note in FHS that "this is the
> place".
> This is IMO a good first step to making python multiarch-aware, and it
> would also help a bit with LSB integration [1].
> I've come up with three basic options for the configuration (substitute
> "/usr" with "$prefix" if you're not a distributor). This list is by no
> means comprehensive, it's just what looked reasonable at the time of
> writing.
> 1 - the traditional way
> purelib = /usr/lib/pythonX.Y/site-packages
> platlib = /usr/lib(64)/pythonX.Y/site-packages

Why can't pure libraries go into lib64 as well? There is nothing saying that
a pure Python package won't have a that installs different files
based on whether it is for a 32-bit or 64-bit CPython install.

> pros:
> + this is already the default for 32bit systems
> + major distributions (including Fedora, Mandriva and now finally
> openSUSE too) do this
> cons:
> - 32bit systems have no separation, poor they!
> - with multiarch setup, /usr/lib is "cluttered" by both
> platform-dependent files for 32bit and platform-independent files shared
> by the platforms. Also, 64bit python can pick up 32bit modules. That
> doesn't cause problems in practice, but doesn't feel like a clean design.
> 2 - the sharedir way
> purelib = /usr/share/python/X.Y
> platlib = /usr/lib(64)/pythonX.Y/site-packages

Now are you proposing that packages that have both Python source and
extensions be split based on the type of files, or that only pure Python
packages go to /usr/share/python and any packages that are mixed go into
lib(64)? If you are proposing the latter this is more reasonable as the
former will require using .pth files to get import to search both locations
for files in the same package and that just feels icky to me.

> pros:
> + clean separation of purelib - nice!
> + unheard of - a good place to start anew
> cons:
> - FHS states that /usr/share is for data. But OTOH, they don't say much
> about platform-independent bytecode. We could probably get an exception
> for this.
> - unheard of - everyone will be surprised

> 3 - the perl way
> purelib = /usr/lib/pythonX.Y
> platlib =
> /usr/lib/pythonX.Y/lib-dynload-(platform-identifier)/site-packages
> pros:
> + possibility of multiarch packages that would install pure python parts
> into purelib and extensions or accelerators for more platforms at once -
> and therefore, possibility to split large modules into
> platform-dependent and platform-independent parts and save space on
> installation media
> + "idea compatibility" with perl and ruby, one less install layout to learn
> cons:
> - completely different from what we have now - would require the most
> work from both python developers and distributions

I think that last con says what chances this approach has of winning. =)

From jnoller at  Thu Aug 13 22:16:33 2009
From: jnoller at (Jesse Noller)
Date: Thu, 13 Aug 2009 16:16:33 -0400
Subject: [Python-Dev] PyCon 2010: Call for Proposals
Message-ID: <>

Yup! It's that time again, I'm encouraging anyone involved in core
development, or wanting to talk about core development - or
python-core internals to submit talk proposals. Lots of people have
expressed interest in such talks.

Call for proposals -- PyCon 2010 -- <>

Due date: October 1st, 2009

Want to showcase your skills as a Python Hacker? Want to have
hundreds of people see your talk on the subject of your choice? Have some
hot button issue you think the community needs to address, or have some
package, code or project you simply love talking about? Want to launch
your master plan to take over the world with python?

PyCon is your platform for getting the word out and teaching something
new to hundreds of people, face to face.

Previous PyCon conferences have had a broad range of presentations,
from reports on academic and commercial projects, tutorials on a broad
range of subjects and case studies. All conference speakers are volunteers
and come from a myriad of backgrounds. Some are new speakers, some
are old speakers. Everyone is welcome so bring your passion and your
code! We're looking to you to help us top the previous years of success
PyCon has had.

PyCon 2010 is looking for proposals to fill the formal presentation tracks.
The PyCon conference days will be February 19-22, 2010 in Atlanta,
Georgia, preceded by the tutorial days (February 17-18), and followed
by four days of development sprints (February 22-25).

Online proposal submission is open now! Proposals will be accepted
through October 1st, with acceptance notifications coming out on
November 15th. For the detailed call for proposals, please see:


For videos of talks from previous years - check out:


We look forward to seeing you in Atlanta!

From benjamin at  Thu Aug 13 23:12:50 2009
From: benjamin at (Benjamin Peterson)
Date: Thu, 13 Aug 2009 16:12:50 -0500
Subject: [Python-Dev] [RELEASED] Python 3.1.1 Release Candidate
Message-ID: <>

On behalf of the Python development team, I'm pleased to announce the first
release candidate of Python 3.1.1.

This bug fix release fixes many normal bugs and several critical ones including
potential data corruption in the io library.  The final version should be out
within the next week.

Python 3.1 focuses on the stabilization and optimization of the features and
changes that Python 3.0 introduced.  For example, the new I/O system has been
rewritten in C for speed.  File system APIs that use unicode strings now handle
paths with undecodable bytes in them. Other features include an ordered
dictionary implementation, a condensed syntax for nested with statements, and
support for ttk Tile in Tkinter.  For a more extensive list of changes in 3.1,
see or Misc/NEWS in the Python
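
As a rough illustration of two of the features listed above (the ordered dictionary and the condensed syntax for nested with statements), here is a small sketch. The data and variable names are made up for the example.

```python
from collections import OrderedDict
import io

# OrderedDict remembers the order in which keys were inserted.
releases = OrderedDict()
releases["3.0"] = 2008
releases["3.1"] = 2009
print(list(releases))

# One with statement managing two context managers, instead of two
# nested with blocks.
with io.StringIO("spam\n") as src, io.StringIO() as dst:
    dst.write(src.read().upper())
    copied = dst.getvalue()
print(repr(copied))
```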

To download Python 3.1.1 visit:

The 3.1 documentation can be found at:

Bugs can always be reported to:


Benjamin Peterson
Release Manager
benjamin at
(on behalf of the entire python-dev team and 3.1.1's contributors)

From david.lyon at  Fri Aug 14 05:02:42 2009
From: david.lyon at (David Lyon)
Date: Thu, 13 Aug 2009 23:02:42 -0400
Subject: [Python-Dev] request for comments - standardization of python's
 purelib and platlib
In-Reply-To: <h61lmi$m3v$>
References: <h61lmi$m3v$>
Message-ID: <>

Hi Jan,

It's not impossible, but you have some dependencies.

If you can patch distutils within Suse, then it mightn't be so
difficult. Distutils is not much more than a file copier.

Inside distutils, a lot of the paths that you are talking
about are hardcoded. 

> One, python depends on the "lib" directory. (from distro's point of
> view, prefix is /usr, so let's talk /usr/lib) Due to this, it's
> impossible to install python under /usr/lib64 without heavy patching.

correction - light patching. 

> Repeated attempts to bring python developers to acknowledge and rectify
> the situation have all failed (common argument here is "that would mean
> redesign of distutils and huge parts of whatnot").

Make it a zope/plone issue... and something might get done about it....


If it's a windows or linux issue... pour petrol on it and light a match..

seriously... it's not major refactoring.. it's just changing a few
constants.. within distutils..

> Let's put our heads together and choose good default locations for
> purelib and platlib. Then add support to python for recognizing the
> locations by default, and possibly leave note in FHS that "this is the
> place".

Sure - discuss away. But you might end up having to patch your own

> 2 - the sharedir way
> purelib = /usr/share/python/X.Y
> platlib = /usr/lib(64)/pythonX.Y/site-packages
> pros:
> + clean separation of purelib - nice!
> + unheard of - a good place to start anew
> cons:
> - FHS states that /usr/share is for data. But OTOH, they don't say much
> about platform-independent bytecode. We could probably get an exception
> for this.
> - unheard of - everyone will be surprised


Go try...


From ncoghlan at  Fri Aug 14 09:34:56 2009
From: ncoghlan at (Nick Coghlan)
Date: Fri, 14 Aug 2009 17:34:56 +1000
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <h614a3$r92$>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Georg Brandl wrote:
> Nick Coghlan schrieb:
>> P.S. For anyone else that is slow like me, take a close look at PEP 387...
> What should we see, other than that we have two PEPs on the same topic that
> should be merged?

Benjamin wrote the second one, so he obviously knows there's a written
deprecation policy in place, and hence his mini-rant probably wasn't
meant to be taken literally - a point I completely missed on first reading.

I agree the two PEPs should probably be consolidated into one, but
absent a volunteer for that task, leaving them as is doesn't really hurt


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ziade.tarek at  Fri Aug 14 10:02:03 2009
From: ziade.tarek at (Tarek Ziadé)
Date: Fri, 14 Aug 2009 10:02:03 +0200
Subject: [Python-Dev] request for comments - standardization of python's
	purelib and platlib
In-Reply-To: <>
References: <h61lmi$m3v$>
Message-ID: <>

On Thu, Aug 13, 2009 at 9:22 PM, Brett Cannon<brett at> wrote:
> On Thu, Aug 13, 2009 at 11:23, Jan Matejek <jan.matejek at> wrote:
>> Hello,
>> I'm cross-posting this to distributions at freedesktop and python-dev,
>> because the topic is relevant to both groups and should be solved in
>> cooperation.
>> The issue:
>> In Python's default configuration (on linux), both purelib (location for
>> pure python modules) and platlib (location for platform-dependent binary
>> extensions) point to $prefix/lib/pythonX.Y/site-packages.
>> That is no good for two main reasons.
>> One, python depends on the "lib" directory. (from distro's point of
>> view, prefix is /usr, so let's talk /usr/lib) Due to this, it's
>> impossible to install python under /usr/lib64 without heavy patching.
>> Repeated attempts to bring python developers to acknowledge and rectify
>> the situation have all failed (common argument here is "that would mean
>> redesign of distutils and huge parts of whatnot").
> This is now Tarek's call, so this may or may not have changed in terms of
> what the (now) distutils maintainer thinks.

I don't recall those repeated attempts, but I've been around for less
than two years.

You are very welcome to come in the Distutils-SIG ML to discuss these matters.
I'm moving the discussion there.

Among the proposals you have detailed, the sharedir way seems like the
simplest and most interesting one (depending on your answer to Brett's
question).


Tarek Ziadé

From paolo.fragu at  Fri Aug 14 13:21:29 2009
From: paolo.fragu at (paolo.fragu at
Date: Fri, 14 Aug 2009 13:21:29 +0200
Subject: [Python-Dev] Tkinter: modify xview of entry widget
Message-ID: <KOD67T$>

I'm Paolo from Italy and I'm a python user.
I would like to propose a small but useful change to a method in the Tkinter library:

Until now, scrolling an Entry widget required writing an external function (calling xview_moveto and xview_scroll).

With my change, this works the same way as for all other scrollable widgets (you just call xview).

Proposed change:
Change the xview method of Entry so that it works like that of every other scrollable widget, while remaining compatible with the 'old' xview.

So, to scroll an Entry widget:

entry_widget['xscrollcommand']=scroll_widget.set
scroll_widget['command']=entry_widget.xview

The change in module Tkinter is:

def xview(self, *args):
    """Query and change horizontal position of the view."""
    #modify
    if not args:
        return self._getdoubles(self.tk.call(self._w, 'xview'))
    #old code
    index=args[0]
    self.tk.call(self._w, 'xview', index)
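
To make the calling convention concrete without needing a display, here is a minimal stand-in sketch. EntryViewSketch is a hypothetical class, not part of Tkinter; it only records what would be passed on to Tk. It assumes the usual Tk scrollbar protocol, where the scrollbar invokes its command with either "moveto <fraction>" or "scroll <number> <units|pages>", and it forwards all arguments rather than only the first one.

```python
class EntryViewSketch:
    """Hypothetical stand-in for an Entry widget; records calls only."""

    def __init__(self):
        self.calls = []            # commands that would be sent to Tk
        self.view = (0.0, 1.0)     # pretend visible fraction of the text

    def xview(self, *args):
        """Query (no arguments) or change (with arguments) the view."""
        if not args:
            return self.view
        self.calls.append(("xview",) + args)


entry = EntryViewSketch()
print(entry.xview())                  # query form, no arguments
entry.xview("moveto", "0.25")         # what dragging a scrollbar sends
entry.xview("scroll", "1", "units")   # what clicking a scrollbar arrow sends
entry.xview(3)                        # the old single-index form
print(entry.calls)
```

The point of the sketch is only that a single xview accepting a variable number of arguments can serve both the scrollbar and existing callers that pass an index.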


I wish that this implementation could be integrated in Tkinter, and I remain at disposal for any question or further information.

Waiting for your response,
Best regards
Paolo Fraguglia 

From fuzzyman at  Fri Aug 14 13:25:05 2009
From: fuzzyman at (Michael Foord)
Date: Fri, 14 Aug 2009 12:25:05 +0100
Subject: [Python-Dev] Tkinter: modify xview of entry widget
In-Reply-To: <KOD67T$>
References: <KOD67T$>
Message-ID: <>

paolo.fragu at wrote:
> Hi,
> I'm Paolo from Italy and I'm a python user.
> I wish to propose a useful and smart method modify in Tkinter Library:

Hi Paolo,

Can you create an issue on the bug tracker - with the patch attached.

Your suggestion stands a much better chance if this patch includes tests 
and documentation.

All the best,

Michael Foord

> Previously to scroll this widget we had to write an external function  (recalling xview_moveto and xview_scroll).
> With my method this operation is cleared and the same as all other widgets  (just have to call xview).
> ----------------------------------------------------------
> Modify Proposal:
> ----------------------------------------------------------
> Change the method xview of entry so it works as all widget scrollable, and it's compatible with 'old' xview.
> So to scroll entry widget:
> entry_widget['xscrollcommand']=scroll_widget.set
> scroll_widget['command']=entry_widget.xview
> The change in module Tkinter is:
> def  xview(self,*args):
>     """Query and change horizontal position of the view."""
>     #modify
>     if not args:
>        return self._getdoubles(self.tk.call(self._w, 'xview'))
>     #old code
>     index=args[0]
>     self.tk.call(self._w, 'xview', index)
> ----------------------------------------------------------
> I wish that this implementation could be integrated in Tkinter, and I remain at disposal for any question or further information.
> Waiting for your response,
> Best regards
> Paolo Fraguglia 


From ggpolo at  Fri Aug 14 14:30:17 2009
From: ggpolo at (Guilherme Polo)
Date: Fri, 14 Aug 2009 09:30:17 -0300
Subject: [Python-Dev] Tkinter: modify xview of entry widget
In-Reply-To: <KOD67T$>
References: <KOD67T$>
Message-ID: <>

2009/8/14 paolo.fragu at <paolo.fragu at>:
> Hi,
> I'm Paolo from Italy and I'm a python user.
> I wish to propose a useful and smart method modify in Tkinter Library:
> Previously to scroll this widget we had to write an external function (recalling xview_moveto and xview_scroll).
> With my method this operation is cleared and the same as all other widgets (just have to call xview).
> I wish that this implementation could be integrated in Tkinter, and I remain at disposal for any question or further information.
> Waiting for your response,

I believe you are trying to mention the fact that the Entry.xview
method doesn't allow being called without passing an index, even if
this index is None. Is that the case ?

Take a look on and, they already address this fix.

> Best regards
> Paolo Fraguglia


-- Guilherme H. Polo Goncalves

From fwierzbicki at  Fri Aug 14 17:46:46 2009
From: fwierzbicki at (Frank Wierzbicki)
Date: Fri, 14 Aug 2009 11:46:46 -0400
Subject: [Python-Dev] Tweaking AST lineno and col_offset
Message-ID: <>

Hi all,

Off and on I have been directly comparing Jython's AST with Python's
AST and generally working towards making them as close to identical as
possible.  There are a couple of places where I haven't "fixed" Jython
because it looks to me like Jython has slightly better offsets.  One
example:

for a,b in c:
    pass

The Tuple node "a,b" ends up with a col_offset of 0 (the position of
the "for") where Jython has the col_offset as 4 (the position of "a").
Jython's result is more consistent with other Tuple node col_offset
results.

I have a local patch that changes the CPython col_offset to match
Jython's, but before I submit a patch I thought I'd ask here if there
is support for this sort of change and if I should continue to find
col_offset and lineno results that look fishy to me, or should I just
change Jython's results to match (one way or another, things will be
much easier for me to test if they match).
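
For anyone who wants to check what their own interpreter reports for this case, a small sketch using the standard ast module follows. The variable names are illustrative, and the Tuple offset printed will depend on the Python version being run.

```python
import ast

source = "for a,b in c:\n    pass\n"
for_node = ast.parse(source).body[0]
target = for_node.target          # the Tuple node for "a,b"
first = target.elts[0]            # the Name node for "a"

# lineno is 1-based, col_offset is 0-based.
print("For   :", for_node.lineno, for_node.col_offset)
print("Tuple :", target.lineno, target.col_offset)
print("Name a:", first.lineno, first.col_offset)
```

Comparing the Tuple line with the Name line shows whether the tuple is reported at the position of the "for" or at the position of its first element.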

Also, would this be a change that would be considered a backwards
incompatibility?  In other words, would patches like this be allowed
in 2.6/3.1 or only in 2.7/3.2.



From status at  Fri Aug 14 18:07:06 2009
From: status at (Python tracker)
Date: Fri, 14 Aug 2009 18:07:06 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <>

ACTIVITY SUMMARY (08/07/09 - 08/14/09)
Python tracker at

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.

 2329 open (+26) / 16195 closed ( +8) / 18524 total (+34)

Open issues with patches:   925

Average duration of open issues: 659 days.
Median duration of open issues: 414 days.

Open Issues Breakdown
   open  2297 (+26)
pending    31 ( +0)

Issues Created Or Reopened (34)

TarFile.getmembers fails at struct.unpack: unpack requires a str 08/07/09    created  srid                          

Printing the 'The Python Tutorial'                               08/08/09    created  brimac                        
                                                                        doesn't respect xfce default browser               08/09/09    created  ava1ar                        

Add Mingw recognition to pyport.h to allow building extensions   08/09/09    created  f0k                           

Py3.1 hangs in coroutine and eats up all memory                  08/09/09
CLOSED    created  scoder                        

Fatal error: deallocating None                                   08/10/09
CLOSED    created  shashi                        

inf == inf (wrong IEEE 754 behaviour)                            08/10/09
CLOSED    created  Cyborg16                      

expat parser throws Memory Error when parsing multiple files     08/10/09    created  realpolitik                   

Place the term "delete" within the documentation for os.remove() 08/10/09    created  mcow                          

inspect.currentframe documentation omits optional depth paramete 08/10/09    created  llimllib                      

obsolete paragraph in re doc for re.sub                          08/10/09
CLOSED    created  MLModel                       

Python 3.1 fails to build when db.h contains non-UTF-8 character 08/10/09
CLOSED    created  Arfrever                      

email.parser clips trailing \n of multipart/mixed part if part e 08/10/09    created  gvanrossum                    

Default traceback does not handle PEP302 loaded modules          08/11/09    created  anders.blomdell at

smtplib authentication - try all mechanisms                      08/11/09    created  shubes                        

"x / 1" and "x * 1" should return x                              08/11/09
CLOSED    created  mrjbq7                        

CGI module documentation references method 'toupper'; should be  08/11/09    created  troy                          

xml.sax.xmlreader.XMLReader.getProperty (xml.sax.handler.propert 08/11/09    created  cms103                        

Move the special-case for integer objects out of PyBytes_FromObj 08/11/09    created  alexandre.vassalotti          
       patch, 26backport                                                       

Optimize PyBytes_FromObject.                                     08/11/09    created  alexandre.vassalotti          

subprocess doesn't pass arguments correctly on Linux when shell= 08/12/09    created  davidfraser                   

BUILD_SET followed by COMPARE_OP (in) can be optimized if all it 08/12/09    created  alex                          

Support for nested classes and function for pyclbr               08/12/09    created  gpolo                         

asyncore kqueue support                                          08/13/09    created  Ikinoki                       

New functions in to get user/global site packages paths  08/13/09    created  tarek                         

itertools documentation still contains references to ifilterfals 08/13/09
CLOSED    created  alex.morega                   

PyXXX_ClearFreeList for dict, set, and list                      08/13/09    created  matthiastroffaes              

Profile objects should be documented                             08/13/09    created  honeyman                      

Python 3.1 segfaults when invalid UTF-8 characters are passed fr 08/13/09    created  Arfrever                      

IDLE no longer opens only an edit window when configured to do s 08/13/09    created  gpolo                         

IDLE: Warn user about overwriting a file that has a newer versio 08/13/09    created  gpolo                         

inspect.getsource() returns incorrect source lines               08/14/09    created  gagenellina                   

Make custom xmlrpc extension easier                              08/14/09    created  bogdan.opanchuk               

Tkinter: modify xview of entry widget                            08/14/09
CLOSED    created  paolo                         

Issues Now Closed (22)

xview/yview of Tix.Grid is broken                                 705 days    gpolo                         

IDLE - use enumerate instead of zip(count(), ...)                 399 days    gpolo                         

subprocess fails in select when descriptors are large             392 days    benjamin.peterson             

Idle doesn't obey the new improved warnings arguements            327 days    gpolo                         

Idle hangs when given a nonexistent filename.                     207 days    gpolo                         

IDLE to support                                       190 days    gpolo                         

Python 3 pdb: shows internal code, breakpoints don't work          78 days    georg.brandl                  

Missing Shell menu in Linux IDLE                                   67 days    gpolo                         

Tkinter.Entry: fix for xview and some doc clarifications           73 days    gpolo                         

test_ttk_guionly buildbot test crash: Tcl_FinalizeNotifier: noti   18 days    gpolo                         

2to3 fails to fix test.test_support                                13 days    benjamin.peterson             

2to3 test_print_function_option fails on Windows                   14 days    benjamin.peterson             

Desire documentation link to user contribution wiki (    5 days    keenethery                    

re.findall does not always return a list of strings                 2 days    pitrou                        

Py3.1 hangs in coroutine and eats up all memory                     2 days    alexandre.vassalotti          

Fatal error: deallocating None                                      0 days    benjamin.peterson             

inf == inf (wrong IEEE 754 behaviour)                               0 days    marketdickinson               

obsolete paragraph in re doc for re.sub                             2 days    georg.brandl                  

Python 3.1 fails to build when db.h contains non-UTF-8 character    2 days    benjamin.peterson             

"x / 1" and "x * 1" should return x                                 0 days    rhettinger                    

itertools documentation still contains references to ifilterfals    0 days    georg.brandl                  

Tkinter: modify xview of entry widget                               0 days    gpolo                         

Top Issues Most Discussed (10)

 15 Regexp 2.7 (modifications to current re 2.2.2)                   486 days

  5 Python 3 pdb: shows internal code, breakpoints don't work         78 days

  5 urllib.quote() escapes characters unnecessarily and contrary to  486 days

  4 inf == inf (wrong IEEE 754 behaviour)                              0 days

  4 Py3.1 hangs in coroutine and eats up all memory                    2 days

  4 subprocess fails in select when descriptors are large            392 days

  3 PyXXX_ClearFreeList for dict, set, and list                        1 days

  3 subprocess doesn't pass arguments correctly on Linux when shell    2 days

  3 xml.sax.xmlreader.XMLReader.getProperty (xml.sax.handler.proper    3 days

  3 email.parser clips trailing \n of multipart/mixed part if part     4 days

From jan.matejek at  Fri Aug 14 18:11:07 2009
From: jan.matejek at (Jan Matejek)
Date: Fri, 14 Aug 2009 18:11:07 +0200
Subject: [Python-Dev] request for comments - standardization of python's
 purelib and platlib
In-Reply-To: <>
References: <h61lmi$m3v$>
Message-ID: <>

Dne 13.8.2009 21:22, Brett Cannon napsal(a):
> On Thu, Aug 13, 2009 at 11:23, Jan Matejek <jan.matejek at> wrote:
>> 1 - the traditional way
>> purelib = /usr/lib/pythonX.Y/site-packages
>> platlib = /usr/lib(64)/pythonX.Y/site-packages
> Why can't pure libraries go into lib64 as well? There is nothing saying that
> a pure Python package won't have a that installs different files
> based on whether it is for a 32-bit or 64-bit CPython install.

What i'd like to accomplish is to have pure "noarch" package that can be
installed unchanged into 32bit or 64bit (or 256bit) system, and the
respective python would still find the files.
Or, to put it another way, a package that can be installed into a
multiarch system and be recognized by pythons of all architectures
(assuming they are the same version, of course).

If the distutils package installs different pure files for 32bit and
64bit python, then it can't be "noarch", so it doesn't matter if it goes
into lib64.

Also, such package would break this particular scheme - in the situation
where the user installs only 32bit version of such package and tries to
run it with 64bit python, it will probably break in some weird way.

Last but not least, i'd argue that if a python-only package installs
different files for different platforms, it is platform-dependent and
therefore not pure ;)

>> 2 - the sharedir way
>> purelib = /usr/share/python/X.Y
>> platlib = /usr/lib(64)/pythonX.Y/site-packages
> Now are you proposing that packages that have both Python source and
> extensions be split based on the type of files, or that only pure Python
> packages go to /usr/share/python and any packages that are mixed go into
> lib(64)? If you are proposing the latter this is more reasonable as the
> former will require using .pth files to get import to search both locations
> for files in the same package and that just feels icky to me.

The latter. Assume no change to "normal" distutils mechanism, only
setting the default paths. (for now anyway)


From benjamin at  Fri Aug 14 18:16:47 2009
From: benjamin at (Benjamin Peterson)
Date: Fri, 14 Aug 2009 11:16:47 -0500
Subject: [Python-Dev] Tweaking AST lineno and col_offset
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/14 Frank Wierzbicki <fwierzbicki at>:
> Hi all,
> Off and on I have been directly comparing Jython's AST with Python's
> AST and generally working towards making them as close to identical as
> possible.  There are a couple of places where I haven't "fixed" Jython
> because it looks to me like Jython has slightly better offsets.  One
> example:
> for a,b in c:
>     pass
> The Tuple node "a,b" ends up with a col_offset of 0 (the position of
> the "for") where Jython has the col_offset as 4 (the position of "a").
> Jython's result is more consistent with other Tuple node col_offset
> results.
> I have a local patch that changes the CPython col_offset to match
> Jython's, but before I submit a patch I thought I'd ask here if there
> is support for this sort of change and if I should continue to find
> col_offset and lineno results that look fishy to me, or should I just
> change Jython's results to match (one way or another, things will be
> much easier for me to test if they match).

Yes, please submit it.

> Also, would this be a change that would be considered a backwards
> incompatibility?  In other words, would patches like this be allowed
> in 2.6/3.1 or only in 2.7/3.2.

While I don't see a problem in backporting it to maintenance branches, I
would personally only apply it to the current development branches. It
doesn't seem to fix a "bug", just make a nice improvement.


From lanyjie at  Fri Aug 14 19:00:49 2009
From: lanyjie at (Yingjie Lan)
Date: Fri, 14 Aug 2009 10:00:49 -0700 (PDT)
Subject: [Python-Dev] expy: an expressway to extend Python
In-Reply-To: <h5jskt$59e$>
Message-ID: <>

--- On Sat, 8/8/09, Stefan Behnel <stefan_ml at> wrote:

> From: Stefan Behnel <stefan_ml at>
> Subject: Re: [Python-Dev] expy: an expressway to extend Python
> To: python-dev at
> Date: Saturday, August 8, 2009, 4:55 PM
> > More details at
> I'm clearly biased, but my main concern here is that expy
> requires C code
> to be written inside of strings. There isn't any good
> editor support for
> that, so I doubt that expy is good for anything but very
> thin wrappers (as
> in the examples you presented).

Thanks a lot for the input -- I have summarized the advantages of expy as four points in the new introduction on the homepage.

The lack of editor highlighting support is quite a problem, but it is possible to add such support. For example, you can use this to indicate the start of embedded code highlighting:

return """

and then the end mark is of course the enclosing """

> That said, you might want to look at the argument unpacking
> code generated
> by Cython. It's highly optimised through specialisation and
> has been
> benchmarked quite a bit faster than the generic Python
> C-API functions for
> tuple/keyword extracting. Since argument conversion seems
> to be more or
> less all that expy really does, maybe you want to reuse
> that code.
> Stefan

Oh sure, it would be nice if that part could be adopted by expy-cxpy. Any help with this would be very welcome.



From jaraco at  Fri Aug 14 20:39:03 2009
From: jaraco at (Jason R. Coombs)
Date: Fri, 14 Aug 2009 14:39:03 -0400
Subject: [Python-Dev] functools.compose to chain functions together
Message-ID: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>

I'd like to express additional interest in python patch 1660179, discussed


On several occasions, I've had the desire for something like this.  I've
made do with lambda functions, but as was mentioned, the lambda is clumsy
and harder to read than functools.compose would be.


A potentially common use-case is when a library has a multi-decorator use
case in which they want to compose a meta decorator out of one or more
individual decorators.


Consider the hypothetical library.


# we have three decorators we use commonly
def dec_register_function_for_x(func):
    # do something with func
    return func

def dec_alter_docstring(func):
    # do something to func.__doc__
    return func

def inject_some_data(data):
    def dec_inject_data(func):
        = data # this may not be legal, but assume it does something useful
        return func
    return dec_inject_data


# we could use these decorators explicitly throughout our project
@dec_register_function_for_x
@dec_alter_docstring
@dec_inject_some_data('foo data 1')
def our_func_1(params):
    pass

@dec_register_function_for_x
@dec_alter_docstring
@dec_inject_some_data('foo data 2')
def our_func_2(params):
    pass


For two functions, that's not too onerous, but if it's used throughout the
application, it would be nice to abstract the collection of decorators.  One
could do this with lambdas.


def meta_decorator(data):
    return lambda func: dec_register_function_for_x(
        dec_alter_docstring(dec_inject_some_data(data)(func)))


But to me, a compose function is much easier to read and much more
consistent with the decorator usage syntax itself.


def meta_decorator(data):
    return compose(dec_register_function_for_x, dec_alter_docstring,
                   dec_inject_some_data(data))


The latter implementation seems much more readable and elegant.  One doesn't
even need to know the decorator signature to effectively compose
meta_decorators.

I've heard it said that Python is not a functional language, but if that
were really the case, then functools would not exist. In addition to the
example described above, I've had multiple occasions where having a general
purpose function composition function would have simplified the
implementation by providing a basic functional construct. While Python isn't
primarily a functional language, it does have some functional constructs,
and this is one of the features that makes Python so versatile; one can
program functionally, procedurally, or in an object-oriented way, all within
the same language.


I admit, I may be a bit biased; my first formal programming course was
taught in Scheme.


Nevertheless, I believe functools is the ideal location for a very basic and
general capability such as composition.


I realize this patch was rejected, but I'd like to propose reviving the
patch and incorporating it into functools.
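
To make the decorator use case concrete, here is a small runnable sketch.
Everything in it is an assumption made for illustration: compose is a
hand-written stand-in (functools has no such function), and the three
decorators are simplified placeholders that record something observable so
the effect of the composition can be checked.

```python
REGISTRY = []  # hypothetical registry filled in by the first decorator


def compose(*funcs):
    """Stand-in helper: compose(f, g, h)(x) == f(g(h(x)))."""
    def composed(value):
        # Apply the right-most function first, like stacked decorators.
        for func in reversed(funcs):
            value = func(value)
        return value
    return composed


def dec_register_function_for_x(func):
    REGISTRY.append(func.__name__)
    return func


def dec_alter_docstring(func):
    func.__doc__ = (func.__doc__ or "") + " [documented]"
    return func


def dec_inject_some_data(data):
    def dec_inject_data(func):
        func.data = data  # placeholder for "something useful"
        return func
    return dec_inject_data


def meta_decorator(data):
    return compose(dec_register_function_for_x, dec_alter_docstring,
                   dec_inject_some_data(data))


@meta_decorator("foo data 1")
def our_func_1(params):
    """Example."""
    return params
```

The single @meta_decorator line applies the three decorators in the same
order as stacking them by hand: the inner-most (right-most) one runs first.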





From fwierzbicki at  Fri Aug 14 20:57:04 2009
From: fwierzbicki at (Frank Wierzbicki)
Date: Fri, 14 Aug 2009 14:57:04 -0400
Subject: [Python-Dev] Tweaking AST lineno and col_offset
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Aug 14, 2009 at 12:16 PM, Benjamin Peterson<benjamin at> wrote:

>> I have a local patch that changes the CPython col_offset to match
>> Jython's, but before I submit a patch I thought I'd ask here if there
>> is support for this sort of change and if I should continue to find
>> col_offset and lineno results that look fishy to me, or should I just
>> change Jython's results to match (one way or another, things will be
>> much easier for me to test if they match).
> Yes, please submit it.
Great, the patch is here:

 BTW - I would have added a test to, but above the test
data it notes: #### EVERYTHING BELOW IS GENERATED ##### and I couldn't
find the tool used for the generation.  Does anyone know where that


From catch-all at  Fri Aug 14 20:58:22 2009
From: catch-all at (Xavier Morel)
Date: Fri, 14 Aug 2009 20:58:22 +0200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
Message-ID: <>

On 14 Aug 2009, at 20:39 , Jason R. Coombs wrote:
> I've heard it said that Python is not a functional language, but if  
> that
> were really the case, then functools would not exist. In addition to  
> the
> example described above, I've had multiple occasions where having a  
> general
> purpose function composition function would have simplified the
> implementation by providing a basic functional construct.

It's not like a basic variable-arity composition function is hard to  
implement though, it's basically:

     def compose(*funcs):
         return reduce(lambda f1, f2:
                           lambda v:
                               f1(f2(v)), funcs)

it'll turn compose(a, b, c, d)(value) into a(b(c(d(value))))
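
For reference, a sketch of the same reduce-based idea on Python 3, where
reduce lives in functools rather than being a builtin. The identity function
used as the initial value is an addition of this sketch, so that calling
compose() with no arguments still returns something callable.

```python
from functools import reduce


def compose(*funcs):
    """Right-to-left composition: compose(a, b, c)(v) == a(b(c(v)))."""
    def identity(v):
        return v
    # Fold the functions pairwise; each step wraps the next call inside
    # the previous one.
    return reduce(lambda f1, f2: lambda v: f1(f2(v)), funcs, identity)


# int runs first, then abs, then str.
print(compose(str, abs, int)("-7"))
```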

From benjamin at  Fri Aug 14 21:11:19 2009
From: benjamin at (Benjamin Peterson)
Date: Fri, 14 Aug 2009 14:11:19 -0500
Subject: [Python-Dev] Tweaking AST lineno and col_offset
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/14 Frank Wierzbicki <fwierzbicki at>:
> On Fri, Aug 14, 2009 at 12:16 PM, Benjamin Peterson<benjamin at> wrote:
>>> I have a local patch that changes the CPython col_offset to match
>>> Jython's, but before I submit a patch I thought I'd ask here if there
>>> is support for this sort of change and if I should continue to find
>>> col_offset and lineno results that look fishy to me, or should I just
>>> change Jython's results to match (one way or another, things will be
>>> much easier for me to test if they match).
>> Yes, please submit it.
> Great, the patch is here:

I'll take a look.

>  BTW - I would have added a test to, but above the test
> data it notes: #### EVERYTHING BELOW IS GENERATED ##### and I couldn't
> find the tool used for the generation.  Does anyone know where that
> is?

It's at the bottom of the test file. :) You can add a handwritten test
above that, though.


From fwierzbicki at  Fri Aug 14 21:41:41 2009
From: fwierzbicki at (Frank Wierzbicki)
Date: Fri, 14 Aug 2009 15:41:41 -0400
Subject: [Python-Dev] Tweaking AST lineno and col_offset
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Aug 14, 2009 at 3:11 PM, Benjamin Peterson<benjamin at> wrote:
> 2009/8/14 Frank Wierzbicki <fwierzbicki at>:
>> On Fri, Aug 14, 2009 at 12:16 PM, Benjamin Peterson<benjamin at> wrote:
>>>> I have a local patch that changes the CPython col_offset to match
>>>> Jython's, but before I submit a patch I thought I'd ask here if there
>>>> is support for this sort of change and if I should continue to find
>>>> col_offset and lineno results that look fishy to me, or should I just
>>>> change Jython's results to match (one way or another, things will be
>>>> much easier for me to test if they match).
>>> Yes, please submit it.
>> Great, the patch is here:
> I'll take a look.
>>  BTW - I would have added a test to, but above the test
>> data it notes: #### EVERYTHING BELOW IS GENERATED ##### and I couldn't
>> find the tool used for the generation.  Does anyone know where that
>> is?
> It's at the bottom of the test file. :) You can add a handwritten test
> above that, though.
Heh -- how did I miss that :) ? -- I'll resubmit the patch with tests.


From fwierzbicki at  Fri Aug 14 22:01:50 2009
From: fwierzbicki at (Frank Wierzbicki)
Date: Fri, 14 Aug 2009 16:01:50 -0400
Subject: [Python-Dev] Tweaking AST lineno and col_offset
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Aug 14, 2009 at 3:41 PM, Frank Wierzbicki<fwierzbicki at> wrote:
>> It's at the bottom of the test file. :) You can add a handwritten test
>> above that, though.
> Heh -- how did I miss that :) ? -- I'll resubmit the patch with tests.
Resubmitted with tests.


From brett at  Fri Aug 14 23:54:30 2009
From: brett at (Brett Cannon)
Date: Fri, 14 Aug 2009 14:54:30 -0700
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
Message-ID: <>

It would be best to discuss this on comp.lang.python or python-ideas to get
general support for the idea before trying to bring this to python-dev in
hopes of changing people's minds.

On Fri, Aug 14, 2009 at 11:39, Jason R. Coombs <jaraco at> wrote:

>  I'd like to express additional interest in python patch 1660179,
> discussed here:
> On several occasions, I've had the desire for something like this.  I've
> made do with lambda functions, but as was mentioned, the lambda is clumsy
> and harder to read than functools.compose would be.
> A potentially common use-case is when a library has a multi-decorator use
> case in which they want to compose a meta decorator out of one or more
> individual decorators.
> Consider the hypothetical library.
> # we have three decorators we use commonly
> def dec_register_function_for_x(func):
>                 # do something with func
>                 return func
> def dec_alter_docstring(func):
>                 # do something to func.__doc__
>                 return func
> def inject_some_data(data):
>                 def dec_inject_data(func):
>                        = data # this may not be legal,
> but assume it does something useful
>                                 return func
>                 return dec_inject_data
> # we could use these decorators explicitly throughout our project
> @dec_register_function_for_x
> @dec_alter_docstring
> @dec_inject_some_data('foo data 1')
> def our_func_1(params):
>                 pass
> @dec_register_function_for_x
> @dec_alter_docstring
> @dec_inject_some_data('foo data 2')
> def our_func_2(params):
>                 pass
> For two functions, that's not too onerous, but if it's used throughout the
> application, it would be nice to abstract the collection of decorators.  One
> could do this with lambdas.
> def meta_decorator(data):
>                 return lambda func:
> dec_register_function_for_x(dec_alter_docstring(dec_inject_some_data(data)(func)))
> But to me, a compose function is much easier to read and much more
> consistent with the decorator usage syntax itself.
> def meta_decorator(data):
> return compose(dec_register_function_for_x, dec_alter_docstring,
> dec_inject_some_data(data))
> The latter implementation seems much more readable and elegant.  One
> doesn't even need to know the decorator signature to effectively compose
> meta_decorators.
> I've heard it said that Python is not a functional language, but if that
> were really the case, then functools would not exist. In addition to the
> example described above, I've had multiple occasions where having a general
> purpose function composition function would have simplified the
> implementation by providing a basic functional construct. While Python isn't
> primarily a functional language, it does have some functional constructs,
> and this is one of the features that makes Python so versatile; one can
> program functionally, procedurally, or in an object-oriented way, all within
> the same language.
> I admit, I may be a bit biased; my first formal programming course was
> taught in Scheme.
> Nevertheless, I believe functools is the ideal location for a very basic
> and general capability such as composition.
> I realize this patch was rejected, but I'd like to propose reviving the
> patch and incorporating it into functools.
> Regards,
> Jason
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at
> Unsubscribe:

From brett at  Fri Aug 14 23:57:02 2009
From: brett at (Brett Cannon)
Date: Fri, 14 Aug 2009 14:57:02 -0700
Subject: [Python-Dev] Tweaking AST lineno and col_offset
In-Reply-To: <>
References: <> 
Message-ID: <>

On Fri, Aug 14, 2009 at 09:16, Benjamin Peterson <benjamin at>wrote:

> 2009/8/14 Frank Wierzbicki <fwierzbicki at>:
> > Hi all,
> >
> > Off and on I have been directly comparing Jython's AST with Python's
> > AST and generally working towards making them as close to identical as
> > possible.  There are a couple of places where I haven't "fixed" Jython
> > because it looks to me like Jython has slightly better offsets.  One
> > example:
> >
> > for a,b in c:
> >    pass
> >
> > The Tuple node "a,b" ends up with a col_offset of 0 (the position of
> > the "for") where Jython has the col_offset as 4 (the position of "a").
> >  Jython's result is more consistent with other Tuple node col_offset
> > results.
> >
> > I have a local patch that changes the CPython col_offset to match
> > Jython's, but before I submit a patch I thought I'd ask here if there
> > is support for this sort of change and if I should continue to find
> > col_offset and lineno results that look fishy to me, or should I just
> > change Jython's results to match (one way or another, things will be
> > much easier for me to test if they match).
> Yes, please submit it.
> >
> > Also, would this be a change that would be considered a backwards
> > incompatibility?  In other words, would patches like this be allowed
> > in 2.6/3.1 or only in 2.7/3.2.
> While I don't see a problem in backporting it to maintenance branches, I
> would personally only apply it to the current development branches. It
> doesn't seem to fix a "bug", just make a nice improvement.

I like the improvement, but I disagree it should be considered for
backporting as it changes semantics for something that could be considered a
bug, but that feels like a stretch.


From fwierzbicki at  Sat Aug 15 00:51:32 2009
From: fwierzbicki at (Frank Wierzbicki)
Date: Fri, 14 Aug 2009 18:51:32 -0400
Subject: [Python-Dev] Tweaking AST lineno and col_offset
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Aug 14, 2009 at 5:57 PM, Brett Cannon<brett at> wrote:
> I like the improvement, but I disagree it should be considered for
> backporting as it changes semantics for something that could be considered a
> bug, but that feels like a stretch.
Just thought I'd ask -- I'm perfectly ok with the change being 2.7/3.2 only.


From alexander.kozlovsky at  Sat Aug 15 00:41:30 2009
From: alexander.kozlovsky at (Alexander Kozlovsky)
Date: Sat, 15 Aug 2009 02:41:30 +0400
Subject: [Python-Dev] (try-except) conditional expression similar to
	(if-else) conditional (PEP 308)
In-Reply-To: <>
References: <>
Message-ID: <>

Jeff McAninch wrote:
> I very often want something like a try-except conditional expression similar
> to the if-else conditional. 

I think it may be done currently with the help of the following function:

    def guard(func, *args):
        try:
            return func()
        except Exception, e:
            for exc_type, exc_func in args:
                if isinstance(e, exc_type):
                    return exc_func()

Example usage:

    a, b, c = 10, 20, 0

    result = a + b/c  # raise ZeroDivisionError

    result = a + guard(lambda: b/c, (TypeError, lambda: 10),
                                    (ZeroDivisionError, lambda: b/2))
Maybe not very concise, but it works...

Best regards,
 Alexander                  mailto:alexander.kozlovsky at
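
A sketch of the same guard idea in Python 3 syntax ("except ... as"). One
behaviour here is an assumption of this sketch and is not stated in the
message above: an exception that matches none of the handlers is re-raised
rather than swallowed, so the function never silently returns None.

```python
def guard(func, *handlers):
    """Call func(); on an exception, return the result of the first
    handler whose exception type matches."""
    try:
        return func()
    except Exception as exc:
        for exc_type, exc_func in handlers:
            if isinstance(exc, exc_type):
                return exc_func()
        raise  # assumption: unmatched exceptions should propagate


a, b, c = 10, 20, 0

# b / c raises ZeroDivisionError, so the second handler supplies b / 2.
result = a + guard(lambda: b / c,
                   (TypeError, lambda: 10),
                   (ZeroDivisionError, lambda: b / 2))
print(result)
```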

From david.lyon at  Sat Aug 15 02:59:02 2009
From: david.lyon at (David Lyon)
Date: Fri, 14 Aug 2009 20:59:02 -0400
Subject: [Python-Dev] request for comments - standardization of python's
 purelib and platlib
In-Reply-To: <>
References: <h61lmi$m3v$>
Message-ID: <>

Hi Tarek,

What is needed is to remove/refactor the hardcoding of paths that
currently exists within distutils and replace it with the ability to 
override the defaults via configuration files. (distutils.cfg?)

If there's one thing that's certain for the future, it's that
python will go onto more platforms. Using different paths.

When people are complaining about paths being hard-coded into
distutils and it causing angst, I think that their complaints are
valid.
I can find posts going back to 2004 for windows users complaining
about exactly the same thing. So it isn't a new issue. The problem
applies to both linux and windows.

Anyway.. do you know the code that we're talking about?


On Fri, 14 Aug 2009 10:02:03 +0200, Tarek Ziadé <ziade.tarek at>
wrote:
> On Thu, Aug 13, 2009 at 9:22 PM, Brett Cannon<brett at> wrote:
>> On Thu, Aug 13, 2009 at 11:23, Jan Matejek <jan.matejek at>
>> wrote:
>>> Hello,
>>> I'm cross-posting this to distributions at freedesktop and python-dev,
>>> because the topic is relevant to both groups and should be solved in
>>> cooperation.
>>> The issue:
>>> In Python's default configuration (on linux), both purelib (location for
>>> pure python modules) and platlib (location for platform-dependent binary
>>> extensions) point to $prefix/lib/pythonX.Y/site-packages.
>>> That is no good for two main reasons.
>>> One, python depends on the "lib" directory. (from distro's point of
>>> view, prefix is /usr, so let's talk /usr/lib) Due to this, it's
>>> impossible to install python under /usr/lib64 without heavy patching.
>>> Repeated attempts to bring python developers to acknowledge and rectify
>>> the situation have all failed (common argument here is "that would mean
>>> redesign of distutils and huge parts of whatnot").
>> This is now Tarek's call, so this may or may not have changed in terms of
>> what the (now) distutils maintainer thinks.
> I don't recall those repeated attempts , but I've been around for less
> than two years.
> You are very welcome to come in the Distutils-SIG ML to discuss these
> matters.
> I'm moving the discussion there.
> Among the proposals you have detailed, the sharedir way seems like the
> most simple/interesting
> one (depending on your answer to Brett's question)
> Regards
> Tarek
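
For readers who want to see what purelib and platlib resolve to on their own
installation, here is a small sketch. It relies on the sysconfig module,
which only exists in later Python versions (2.7/3.2 onward) and was not
available when this thread was written. The helper name site_package_dirs
is made up for the example.

```python
import sysconfig


def site_package_dirs():
    """Return the install locations for pure-Python modules (purelib)
    and platform-dependent extensions (platlib)."""
    return {name: sysconfig.get_path(name) for name in ("purelib", "platlib")}


for name, path in site_package_dirs().items():
    print(name, "->", path)
```

On many installations both names print the same directory, which is the
situation the original complaint describes.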

From jacobolus at  Sat Aug 15 04:45:23 2009
From: jacobolus at (Jacob Rus)
Date: Fri, 14 Aug 2009 19:45:23 -0700
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
Message-ID: <>

11 Aug 2009, Benjamin Peterson wrote:
> 2009/8/11 Jacob Rus:
>> I have some other questions: How does one deprecate part of a standard
>> library API? How can we alert users to the deprecation? When can the
>> deprecated parts be removed?
> Basically, you add a DeprecationWarning to the API. Then remove it in
> the next major version.

Okay, I made another patch,

That adds some deprecation warnings to many of the functions/methods
in the module.
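
As an illustration of the advice quoted above, this is roughly what adding a
DeprecationWarning to a function looks like. The function names guess_old
and guess_new are hypothetical and are not taken from the mimetypes module
or from the patch.

```python
import warnings


def guess_new(name):
    """Hypothetical replacement API."""
    return "text/plain" if name.endswith(".txt") else None


def guess_old(name):
    """Hypothetical legacy entry point kept only for compatibility."""
    warnings.warn(
        "guess_old() is deprecated; use guess_new() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return guess_new(name)
```

The old function keeps working and simply delegates, so existing callers are
not broken while they migrate.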

(I think the 'strict' parameters should also be deprecated. But I'm
considering actually making a new class, MimeTypesRegistry, or
something, and then just making its API stay mostly compatible with
MimeTypes, but extended to behave the way I think it should, and
deprecating the MimeTypes class altogether, making it a subclass in
the interim.)

Is there any way to explicitly (i.e. in code rather than docs)
deprecate string flags or dicts/lists from the module global
namespace? I don't think users should be mucking with the module's
singleton at all, and should be forced to make a new registry instance
to customize the behavior, so they don't break each-other's code.

> Then, you might garner some more reviews by putting your patch up on
> Rietveld; it makes reviewing much less painful.

Okay, my last Rietveld link didn't get any eyeballs, but here's another try:

Jacob Rus

From solipsis at  Sat Aug 15 17:00:33 2009
From: solipsis at (Antoine Pitrou)
Date: Sat, 15 Aug 2009 15:00:33 +0000 (UTC)
Subject: [Python-Dev]
References: <>
Message-ID: <>


Jacob Rus <jacobolus <at>> writes:
> Okay, I made another patch,
> That adds some deprecation warnings to many of the functions/methods
> in the module.

After a fair amount of discussion on Rietveld, I think you should post another
patch without the deprecations.
(since the discussion was fairly long, I won't repeat here the reasons I gave
unless someone asks me to :-))
Besides, it would be nice to have the additional tests you were talking about.

Thanks for doing this anyway.

> (I think the 'strict' parameters should also be deprecated. But I'm
> considering actually making a new class, MimeTypesRegistry, or
> something, and then just making its API stay mostly compatible with
> MimeTypes, but extended to behave the way I think it should, and
> deprecating the MimeTypes class altogether, making it a subclass in
> the interim.)

This sounds very pie-in-the-sky compared to the original intent of the patch
(that is, fix the mimetypes module's implementation oddities). Let's remain
focused. The more a patch tries to cater for different issues, the less easy it
is to review and discuss (and, consequently, the less likely it is to go to the
end of the approval process).



From Scott.Daniels at Acm.Org  Sat Aug 15 21:54:41 2009
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Sat, 15 Aug 2009 12:54:41 -0700
Subject: [Python-Dev] random number generator state
Message-ID: <h6739u$s31$>

I find I have a need in randomized testing for a shorter version
of getstate, even if it _is_ slower to restore.  When running
exhaustive tests, a failure report should show the start state
of the generator.  Unfortunately, our current state includes a
625-element array.  I want a state that can be read off a report
and typed in to reproduce the state.  Something a bit like the
initial seed, a count of cycle calls, and a few other things.

So, in addition to .getstate() and  .setstate(...), I'd at
least need to have .get_slow_state() and possibly expand what
.setstate(...) takes.  However, a call to .setstate should
reset the counter or all is for naught.  That means I need to
change the results of .getstate, thus giving me three kinds of
input to .setstate: old, new-short, and new-long.  In trying to
get this to work, I found what might be a bug:
code says
   mt[0] = 0x80000000UL; /* MSB is 1; assuring non-zero initial array */
but probably should be:
   mt[0] |= 0x80000000UL; /* MSB is 1; assuring non-zero initial array */

In checking into that issue, I went to the original Mersenne-Twister
code, and I see the original authors are pursuing a newer generator,
dSFMT.

I now have a dilemma.  Should I continue the work on the original M-T
code (which is now seeming problematic for compatibility) or simply make
a new generator with similar calls using dSFMT and put the new feature
in that where there is no compatibility problem.  Which would be more
useful for the Python community?

--Scott David Daniels
Scott.Daniels at Acm.Org

From brett at  Sat Aug 15 22:24:42 2009
From: brett at (Brett Cannon)
Date: Sat, 15 Aug 2009 13:24:42 -0700
Subject: [Python-Dev] request for comments - standardization of python's
	purelib and platlib
In-Reply-To: <>
References: <h61lmi$m3v$>
Message-ID: <>

Please do not cross-post to python-dev. This discussion has been taken
to the distutils SIG.

On Fri, Aug 14, 2009 at 17:59, David Lyon<david.lyon at> wrote:
> Hi Tarek,
> What is needed is to remove/refactor the hardcoding of paths that
> currently exists within distutils and replace it with the ability to
> override the defaults via configuration files. (distutils.cfg?)
> If there's one thing that's certain for the future, it's that
> python will go onto more platforms. Using different paths.
> When people are complaining about paths being hard-coded into
> distutils and it causing angst, I think that their complaints are
> valid.
> I can find posts going back to 2004 for windows users complaining
> about exactly the same thing. So it isn't a new issue. The problem
> applies to both linux and windows.
> Anyway.. do you know the code that we're talking about?
> David
> On Fri, 14 Aug 2009 10:02:03 +0200, Tarek Ziadé <ziade.tarek at>
> wrote:
>> On Thu, Aug 13, 2009 at 9:22 PM, Brett Cannon<brett at> wrote:
>>> On Thu, Aug 13, 2009 at 11:23, Jan Matejek <jan.matejek at>
>>> wrote:
>>>> Hello,
>>>> I'm cross-posting this to distributions at freedesktop and python-dev,
>>>> because the topic is relevant to both groups and should be solved in
>>>> cooperation.
>>>> The issue:
>>>> In Python's default configuration (on linux), both purelib (location for
>>>> pure python modules) and platlib (location for platform-dependent binary
>>>> extensions) point to $prefix/lib/pythonX.Y/site-packages.
>>>> That is no good for two main reasons.
>>>> One, python depends on the "lib" directory. (from distro's point of
>>>> view, prefix is /usr, so let's talk /usr/lib) Due to this, it's
>>>> impossible to install python under /usr/lib64 without heavy patching.
>>>> Repeated attempts to bring python developers to acknowledge and rectify
>>>> the situation have all failed (common argument here is "that would mean
>>>> redesign of distutils and huge parts of whatnot").
>>> This is now Tarek's call, so this may or may not have changed in terms of
>>> what the (now) distutils maintainer thinks.
>> I don't recall those repeated attempts , but I've been around for less
>> than two years.
>> You are very welcome to come in the Distutils-SIG ML to discuss these
>> matters.
>> I'm moving the discussion there.
>> Among the proposals you have detailed, the sharedir way seems like the
>> most simple/interesting
>> one (depending on your answer to Brett's question)
>> Regards
>> Tarek

From python at  Sat Aug 15 23:58:10 2009
From: python at (Raymond Hettinger)
Date: Sat, 15 Aug 2009 14:58:10 -0700
Subject: [Python-Dev] random number generator state
References: <h6739u$s31$>
Message-ID: <622D9CF0914D47A39D797968E9A9BB1F@RaymondLaptop1>

[Scott David Daniels]
>I find I have a need in randomized testing for a shorter version
> of getstate, even if it _is_ slower to restore.  When running
> exhaustive tests, a failure report should show the start state
> of the generator.  Unfortunately, our current state includes a
> 625-element array.  I want a state that can be read off a report
> and typed in to reproduce the state.  Something a bit like the
> initial seed, a count of cycle calls, and a few other things.

Sounds like you could easily wrap the generator to get this.
It would slow you down but would give the information you want.
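
A minimal sketch of what such a wrapper might look like. The class name
CountingRandom and its methods are invented for this example. It assumes
every draw goes through random(), and it restores a state by re-seeding and
replaying the recorded number of calls, so restoring is slow for long runs.

```python
import random


class CountingRandom(random.Random):
    """Random subclass whose state can be summarised as (seed, call count)."""

    def __init__(self, seed_value=0):
        self._seed_value = seed_value
        self._calls = 0
        super().__init__(seed_value)

    def random(self):
        self._calls += 1
        return super().random()

    def short_state(self):
        """A state small enough to print in a failure report."""
        return (self._seed_value, self._calls)

    @classmethod
    def from_short_state(cls, state):
        """Rebuild a generator by re-seeding and replaying the calls."""
        seed_value, calls = state
        rng = cls(seed_value)
        for _ in range(calls):
            rng.random()
        return rng


rng = CountingRandom(42)
for _ in range(5):
    rng.random()
print(rng.short_state())
```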

I think it would be a mistake to complexify the API to accommodate
short states -- I'm not even sure that they are generally useful
(recording my initial seed and how many cycles I've run through
is only helpful for sequences short enough that I'm willing to rerun
them).

I'm curious what your use case is.  Why not just record
the sequence as generated -- I don't see any analytic value to
just knowing the initial seed and cycle count.  

Ability to print out a short state implies that you are using only a
small subset of possible states (i.e. the ones you can get to with
a short seed).  A short state print out isn't even possible if you actually
have a random initial state (every state having an equal chance of
being the starting point).

>  In trying to
> get this to work, I found what might be a bug:
> code says
>   mt[0] = 0x80000000UL; /* MSB is 1; assuring non-zero initial array */
> but probably should be:
>   mt[0] |= 0x80000000UL; /* MSB is 1; assuring non-zero initial array */

Please file a bug report for this and assign to me.  I put in the existing
MT code and took it directly from the author's published (and widely
tested) code.  Also, our tests for MT exactly reproduce their published test
sequence.  But, if there is an error, I would be happy to fix it.

> In checking into that issue, I went to the original Mersenne-Twister
> code, and I see the original authors are pursuing a newer generator,
> dSFMT.

The MT itself has the advantage of having been widely exercised and
tested.  The newer generator may have more states but has not been
as extensively tested.

> I now have a dilemma.  Should I continue the work on the original M-T
> code (which is now seeming problematic for compatibility) or simply make
> a new generator with similar calls using dSFMT and put the new feature
> in that where there is no compatibility problem.  Which would be more
> useful for the Python community?

It's not hard to subclass Random and add different generators.  Why not
publish some code on ASPN and see how it gets received.  I've put a
recipe there for a long period generator, ,
but there doesn't seem to have been any real interest in generators with
longer periods than MT. 



From dickinsm at  Sun Aug 16 02:00:42 2009
From: dickinsm at (Mark Dickinson)
Date: Sun, 16 Aug 2009 01:00:42 +0100
Subject: [Python-Dev] random number generator state
In-Reply-To: <h6739u$s31$>
References: <h6739u$s31$>
Message-ID: <>

On Sat, Aug 15, 2009 at 8:54 PM, Scott David
Daniels<Scott.Daniels at> wrote:
> [...] input to .setstate: old, new-short, and new-long.  In trying to
> get this to work, I found what might be a bug:
> code says
>   mt[0] = 0x80000000UL; /* MSB is 1; assuring non-zero initial array */
> but probably should be:
>   mt[0] |= 0x80000000UL; /* MSB is 1; assuring non-zero initial array */

I'm 92.3% sure that this isn't a bug.  For one thing, that line comes
directly from the authors' code[1], so if it's a bug then it's a bug in
the original code, dating from 2002;  this seems unlikely, given how
widely used and (presumably) well-scrutinized MT is.

For a more technical justification, the Mersenne Twister is based
on a linear transformation of a 19937-dimensional vector space
over F2, so its state naturally consists of 19937 bits of information,
which is 623 words plus one additional bit.  In this implementation,
that extra bit is the top bit of the first word;  the other 31 bits of that
first word shouldn't really be regarded as part of the state proper.
If you examine the genrand_int32 function in _randommodule.c,
you'll see that the low 31 bits of mt[0] play no role in updating the
state;  i.e., their value doesn't affect the new state.  So using
mt[0] |= 0x80000000UL instead of mt[0] = 0x80000000UL during
initialization should make no difference to the resulting stream of
random numbers (with the possible exception of the first random
number generated).
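
The arithmetic above, and the claim about the low bits of the first word,
can be checked from Python with getstate()/setstate(). This is only a
sketch, assuming CPython's random module, where the internal state tuple
holds 624 words plus a position index. It alters the low 31 bits of word 0
immediately after seeding and compares the two output streams.

```python
import random

# 19937 bits of state is 623 full 32-bit words plus one extra bit.
full_words, extra_bits = divmod(19937, 32)

rng = random.Random(0)
version, internal, gauss_next = rng.getstate()

# Set the low 31 bits of word 0, leaving its top bit untouched.
tweaked = (internal[0] | 0x7FFFFFFF,) + internal[1:]
other = random.Random()
other.setstate((version, tweaked, gauss_next))

same_stream = ([rng.getrandbits(32) for _ in range(1000)]
               == [other.getrandbits(32) for _ in range(1000)])
print(full_words, extra_bits, len(internal), same_stream)
```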



From greg.ewing at  Sun Aug 16 04:38:41 2009
From: greg.ewing at (Greg Ewing)
Date: Sun, 16 Aug 2009 14:38:41 +1200
Subject: [Python-Dev] random number generator state
In-Reply-To: <h6739u$s31$>
References: <h6739u$s31$>
Message-ID: <>

Scott David Daniels wrote:
> I find I have a need in randomized testing for a shorter version
> of getstate, even if it _is_ slower to restore.  When running
> exhaustive tests, a failure report should show the start state
> of the generator.  Unfortunately, our current state includes a
> 625-element array.

Do you need to use the Mersenne Twister in particular
for this? There are other kinds of generator with very
long cycles and good statistical properties, that can
easily be restored to any state in constant time given
an initial state and a count.

Let me know if you're interested and I can give you
further details.


From jacobolus at  Sun Aug 16 09:21:36 2009
From: jacobolus at (Jacob Rus)
Date: Sun, 16 Aug 2009 07:21:36 +0000 (UTC)
Subject: [Python-Dev] standard library mimetypes module pathologically
References: <>
Message-ID: <>

Antoine Pitrou:
> After a fair amount of discussion on Rietveld, I think you should post another
> patch without the deprecations.
> (since the discussion was fairly long, I won't repeat here the reasons I gave
> unless someone asks me to)
> Besides, it would be nice to have the additional tests you were talking about.

I'd guess that if I make another patch, no one else will actually look at the
discussion on the first one  (though maybe no one will look at it either
way)... I'd rather get another couple of opinions about it before burying
that conversation.

> This sounds very pie-in-the-sky compared to the original intent of the patch
> (that is, fix the mimetypes module's implementation oddities).

Okay. At least for me, the goals are twofold, because not only is the
implementation odd, but I consider  the semantics broken as well. But even
just fixing the obvious implementation problems would be a big  improvement.

Jacob Rus

From steve at  Sun Aug 16 14:14:56 2009
From: steve at (Steven D'Aprano)
Date: Sun, 16 Aug 2009 22:14:56 +1000
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
Message-ID: <>

On Sat, 15 Aug 2009 04:39:03 am Jason R. Coombs wrote:

> I'd like to express additional interest in python patch 1660179,
> discussed here:
> But to me, a compose function is much easier to read and much more
> consistent with the decorator usage syntax itself.
> def meta_decorator(data):
>     return compose(dec_register_function_for_x, dec_alter_docstring,
>     dec_inject_some_data(data))

Surely that's better written as:

meta_decorator = compose(dec_register_function_for_x,
    dec_alter_docstring, dec_inject_some_data)

> I admit, I may be a bit biased; my first formal programming course
> was taught in Scheme.

Mine wasn't -- I've never even used Scheme, or Lisp, or any other 
functional language. But I've come to appreciate Python's functional 
tools, and would like to give a +0.5 to compose(). +1 if anyone can 
come up with additional use-cases.

Steven D'Aprano

From fwierzbicki at  Sun Aug 16 17:36:56 2009
From: fwierzbicki at (Frank Wierzbicki)
Date: Sun, 16 Aug 2009 11:36:56 -0400
Subject: [Python-Dev] Updating tests in branches
Message-ID: <>

I plan on updating the Python unit tests with tests from Jython that
turn out to be generic Python tests.  Should I be putting these tests
into trunk and 3k or should I also put them into the 2.6 and 3.1
maintenance branches as well?



From benjamin at  Sun Aug 16 17:45:59 2009
From: benjamin at (Benjamin Peterson)
Date: Sun, 16 Aug 2009 10:45:59 -0500
Subject: [Python-Dev] Updating tests in branches
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/16 Frank Wierzbicki <fwierzbicki at>:
> I plan on updating the Python unit tests with tests from Jython that
> turn out to be generic Python tests.  Should I be putting these tests
> into trunk and 3k or should I also put them into the 2.6 and 3.1
> maintenance branches as well?


Usually, unless the test is for a bug we are backporting, new tests
only go in the trunk and py3k.


From fwierzbicki at  Sun Aug 16 17:55:05 2009
From: fwierzbicki at (Frank Wierzbicki)
Date: Sun, 16 Aug 2009 11:55:05 -0400
Subject: [Python-Dev] Updating tests in branches
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Aug 16, 2009 at 11:45 AM, Benjamin Peterson<benjamin at> wrote:
> 2009/8/16 Frank Wierzbicki <fwierzbicki at>:
> Usually, unless the test is for a bug we are backporting, new tests
> only go in the trunk and py3k.
Thanks! I'll do that from now on.


From jaraco at  Sun Aug 16 18:22:55 2009
From: jaraco at (Jason R. Coombs)
Date: Sun, 16 Aug 2009 12:22:55 -0400
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
Message-ID: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECA@hornigold>

Steven D'Aprano wrote:
> Sent: Sunday, 16 August, 2009 08:15
> On Sat, 15 Aug 2009 04:39:03 am Jason R. Coombs wrote:
> >
> > def meta_decorator(data):
> >     return compose(dec_register_function_for_x, dec_alter_docstring,
> >     dec_inject_some_data(data))
> Surely that's better written as:
> meta_decorator = compose(dec_register_function_for_x,
>     dec_alter_docstring, dec_inject_some_data)

I agree. The former looks unnecessarily complicated.

I purposely chose a non-trivial use case, one which involves a decorator that requires a parameter and thus must be called first before the actual decorator is returned.  I think for this reason, the former syntax must be used so that the meta_decorator also takes the data parameter and constructs the proper "inject" decorator.  Put another way, both dec_inject_some_data and meta_decorator are more like decorator factories.

I suspect a simpler, and more common use-case would be like the one you described, where either data is global or the "inject" decorator is not used:

meta_decorator = compose(dec_register_function_for_x, dec_alter_docstring)

> Mine wasn't -- I've never even used Scheme, or Lisp, or any other
> functional language. But I've come to appreciate Python's functional
> tools, and would like to give a +0.5 to compose(). +1 if anyone can
> come up with additional use-cases.

Thanks for the interest.  I decided to search through some of my active code for lambdas and see if there are areas where I would prefer to be using a compose function instead of an explicit lambda/reduce combination.

I only found one such application; I attribute this limited finding to the fact that I probably elected for a procedural implementation when the functional implementation might have proven difficult to read, esp. with lambda.

1) Multiple string substitutions.  You have a list of functions that operate on a string, but you want to collect them into a single operator that can be applied to a list of strings.

sub_year = lambda s: s.replace("%Y", "2009")

fix_strings_with_substituted_year = compose(str.strip, textwrap.dedent, sub_year)
map(fix_strings_with_substituted_year, target_strings)

Moreover, it would be great to be able to accept any number of substitutions.

substitutions = [sub_year, sub_month, ...]
fix_strings_with_substitutions = compose(str.strip, textwrap.dedent, *substitutions)

I did conceive of another possibly interesting use case: vector translation.

Consider an application that performs mathematical translations on n-dimensional vectors.  While it would be optimal to use optimized matrix operations to perform these translations, for the sake of this example, all we have are basic Python programming constructs.

At run-time, the user can compose an experiment to be conducted on his series of vectors. To do this, he selects from a list of provided translations and can provide his own.  These translations can be tagged as named translations and thereafter used as translations themselves.  The code might look something like:

translations = selected_translations + custom_translations
meta_translation = compose(*translations)
save_translation(meta_translation, "My New Translation")

def run_experiment(translation, vectors):
        result = map(translation, vectors)
        # do something with result

Then, run_experiment can take a single translation or a meta-translation such as the one created above. This use-case highlights that a composed function must take and return exactly one value, but that the value need not be a primitive scalar.

I'm certain there are other, more obscure examples, but I feel these two use cases demonstrate some fairly common potential use cases for something like a composition function.
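
For readers who want to try the string-substitution example, here is a minimal sketch of a compose helper. It is not the rejected patch, and the argument order is an assumption: the rightmost function is applied first, so compose(f, g)(x) is f(g(x)).

```python
import textwrap


def compose(*functions):
    """Hypothetical helper: compose(f, g, h)(x) == f(g(h(x)))."""
    if not functions:
        raise TypeError("compose() needs at least one function")
    *outer, innermost = functions

    def composed(*args, **kwargs):
        # Only the innermost function sees the original arguments.
        result = innermost(*args, **kwargs)
        for function in reversed(outer):
            result = function(result)
        return result

    return composed


sub_year = lambda s: s.replace("%Y", "2009")
fix_strings_with_substituted_year = compose(str.strip, textwrap.dedent, sub_year)

print(list(map(fix_strings_with_substituted_year, ["   report %Y \n"])))
```

Because every function after the innermost one receives a single value, this sketch has the same "exactly one value in, one value out" constraint described above.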


From solipsis at  Sun Aug 16 18:29:38 2009
From: solipsis at (Antoine Pitrou)
Date: Sun, 16 Aug 2009 16:29:38 +0000 (UTC)
Subject: [Python-Dev] functools.compose to chain functions together
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
Message-ID: <>

Jason R. Coombs <jaraco <at>> writes:
> I'm certain there are other, more obscure examples, but I feel these two use
> cases demonstrate some fairly common potential use cases for something like
> a composition function.

I also think it would be a nice addition.
(but someone has to propose a patch :-))



From python at  Sun Aug 16 18:41:54 2009
From: python at (Raymond Hettinger)
Date: Sun, 16 Aug 2009 09:41:54 -0700
Subject: [Python-Dev] functools.compose to chain functions together
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold><><8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECA@hornigold>
Message-ID: <B984CDB2AC784D6186E02EDB5FDE0DDF@RaymondLaptop1>

[Antoine Pitrou]
> I also think it would be a nice addition.
> (but someone has to propose a patch :-))

I agree with Martin's reasons for rejecting the feature request
(see the bug report for his full explanation).  IIRC, the compose() 
idea had come-up and been rejected in previous discussions as well.

At best, it will be a little syntactic sugar (though somewhat odd because
the traditional mathematical ordering of a composition operator is the
opposite of what intuition would suggest).  At worst, it will be slower
and less flexible than our normal ways of linking functions together.

IMO, its only virtue is that people coming from functional languages
are used to having compose.  Otherwise, it's a YAGNI.


From jaraco at  Sun Aug 16 19:15:21 2009
From: jaraco at (Jason R. Coombs)
Date: Sun, 16 Aug 2009 13:15:21 -0400
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <B984CDB2AC784D6186E02EDB5FDE0DDF@RaymondLaptop1>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold><><8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECA@hornigold>
Message-ID: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECB@hornigold>

> Raymond Hettinger wrote:
> Sent: Sunday, 16 August, 2009 12:42
> [Antoine Pitrou]
> > I also think it would be a nice addition.
> > (but someone has to propose a patch :-))

The patch was proposed and rejected here:; my reason for mentioning it is that the functionality isn't YAGNI for me; it seems like a fundamental capability when employing a functional programming paradigm.

> I agree with Martin's reasons for rejecting the feature request
> (see the bug report for his full explanation).  IIRC, the compose()
> idea had come-up and been rejected in previous discussions as well.
> At best, it will be a little syntactic sugar (though somewhat odd
> because
> the traditional mathematical ordering of a composition operator is the
> opposite of what intuition would suggest).  At worst, it will be slower
> and less flexible than our normal ways of linking functions together.
> IMO, its only virtue is that people coming from functional languages
> are used to having compose.  Otherwise, it's a YAGNI.

Right.  I have great respect for your and Martin's original conclusion.

The reason I came across the old patch was because I was searching for something that did exactly what compose does. That is to say, I had a use case that was compelling enough that I thought there should be something in functools to do what I wanted.  I've encountered this pattern often enough that it might be in the stdlib.

As it turns out, it isn't.  For this reason, I wanted to voice my opinion that contradicts the conclusion of the previous patch discussion.  Specifically, YAGNI doesn't apply to my experiences, and it does seem to have broad, fundamental application, especially with respect to functional programming.

I'm not arguing that just because Jason needs it, it should be in the standard library.  Rather, I just wanted to express that, like Chris AtLee, I would find this function quite useful.

As Steven pointed out, this functionality is desirable even for those without a functional programming background.  I'd like to mention also that even though I learned to program in Scheme in 1994, I haven't used it since, and I've been using Python since 1996, so my affinity for this function is based almost entirely from experiences programming in Python and not in a primarily functional language.

If the Python community still concurs that 'compose' is YAGNI or otherwise undesirable, I understand.  I just wanted to share my experiences and motivations as they pertain to the discussion.  If it turns out that it's included in the stdlib later, all the better.


From Scott.Daniels at Acm.Org  Sun Aug 16 19:40:00 2009
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Sun, 16 Aug 2009 10:40:00 -0700
Subject: [Python-Dev] random number generator state
In-Reply-To: <622D9CF0914D47A39D797968E9A9BB1F@RaymondLaptop1>
References: <h6739u$s31$>
Message-ID: <h69fp7$cbf$>

Raymond Hettinger wrote:
> [Scott David Daniels]
>> I find I have a need in randomized testing for a shorter version
>> of getstate, even if it _is_ slower to restore.  [blah about big state]
> Sounds like you could easily wrap the generator to get this.
> It would slow you down but would give the information you want.
Well, I was thinking that this might be generally useful for randomized

> I think it would be a mistake to complexify the API to accommodate
> short states -- I'm not even sure that they are generally useful
> (recording my initial seed and how many cycles I've run through
> is only helpful for sequences short enough that I'm willing to rerun
> them).
Right, that was what I was asking about.  The complexity of the change
grew on me; I hadn't realized at the outset it would be more than adding
a counter internally.  Consider me officially dissuaded.

> I'm curious what your use case is.  Why not just record the sequence 
> as generated -- I don't see any analytic value to
> just knowing the initial seed and cycle count. 
I'm building data structures controlled by an rng, and then performing
sequences of (again randomly controlled) operations on those data
structures, checking all invariants at each step.  I then lather, rinse,
repeat, recording the start of each failing experiment.  In the morning I
come in and look for commonality in the cases I see.  Having the short
state means I can easily rebuild the data structure and command
list to see what is going on.  I prune commands, simplify the tree, and
thus isolate the problem I found.

I did enough experimenting to see that if I simply provide access to run
N cycles of the block, I can actually do 2**32 cycles in feasible time,
so I have a pair of counters, and the code should take long enough for
eternity to show up before the wrap.

My short state is:
     seed, block_index, cycles_low, cycles_high, floating

block_index + 625 * (cycles_low + (cycles_high << 32)) is the position,
and could be stored as such; the pieces reflect the least-expensive cost
in performance to the rng.  floating is simply the same final floating
piece that the state keeps now.
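
A much simpler variant of the same idea, wrapping the generator rather
than changing it, might look like the sketch below. It is only an
illustration: it counts individual random() calls instead of blocks and
cycles, and the class name ReplayableRandom and its methods are
hypothetical. Restoring costs time proportional to the number of draws.

```python
import random


class ReplayableRandom:
    """Wrap random.Random and remember (seed, number of draws)."""

    def __init__(self, seed):
        self.seed = seed
        self.draws = 0
        self._rng = random.Random(seed)

    def random(self):
        self.draws += 1
        return self._rng.random()

    def short_state(self):
        # Two small values instead of a 625-element array.
        return (self.seed, self.draws)

    @classmethod
    def from_short_state(cls, state):
        seed, draws = state
        rng = cls(seed)
        for _ in range(draws):   # replay and discard earlier draws
            rng.random()
        return rng


rng = ReplayableRandom(42)
for _ in range(10):
    rng.random()
saved = rng.short_state()
restored = ReplayableRandom.from_short_state(saved)
print(restored.random() == rng.random())
```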

> Ability to print out a short state implies that you are using only a
> small subset of possible states (i.e. the ones you can get to with
> a short seed). 
Well, as you see above, I do capture the seed.  I realize that the time-
constructed seeds are distinct from identically provided values as small
ints, and I also mark when the rng gets called by set_state to indicate
that I then know nothing about the seed.
 >> mt[0] = 0x80000000UL; /* MSB is 1; assuring non-zero initial array */
 >> but probably should be:
 >> mt[0] |= 0x80000000UL; /* MSB is 1; assuring non-zero initial array*/
 > Please file a bug report for this and assign to me....  Also, our
 > tests for MT exactly reproduce their published test sequence.
I've been assured it is not a bug, and I filed no report since I had 
just arrived at the point of suspicion.

To summarize, I am officially dissuaded, and will post a recipe if I
get something nice working.

--Scott David Daniels
Scott.Daniels at Acm.Org

From solipsis at  Sun Aug 16 19:39:19 2009
From: solipsis at (Antoine Pitrou)
Date: Sun, 16 Aug 2009 19:39:19 +0200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <B984CDB2AC784D6186E02EDB5FDE0DDF@RaymondLaptop1>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
Message-ID: <1250444359.5558.15.camel@localhost>

Raymond Hettinger <python <at>> writes:
> IMO, its only virtue is that people coming from functional languages
> are used to having compose.  Otherwise, it's a YAGNI.

Then I wonder how partial() ended up in the stdlib. It seems hardly more
useful than compose().
Either we decide it is useful to have a set of basic "functional" tools
in the stdlib, and both partial() and compose() have their place there,
or we decide functools has no place in the stdlib at all. Providing a
half-assed module is probably frustrating to its potential users.

(not being particularly attached to functional tools, I still think
compose() has its value, and Jason did a good job of presenting
potential use cases)



From martin at  Sun Aug 16 23:54:48 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 16 Aug 2009 23:54:48 +0200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECB@hornigold>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold><><8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECA@hornigold>	<>	<B984CDB2AC784D6186E02EDB5FDE0DDF@RaymondLaptop1>
Message-ID: <>

> The reason I came across the old patch was because I was searching
> for something that did exactly what compose does. That is to say, I
> had a use case that was compelling enough that I thought there should
> be something in functools to do what I wanted.  I've encountered this
> pattern often enough that it might be in the stdlib.

Can you kindly give one or two examples of where compose would have been
useful?


From martin at  Mon Aug 17 00:10:16 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 17 Aug 2009 00:10:16 +0200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <1250444359.5558.15.camel@localhost>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>	<>	<8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECA@hornigold>	<>	<B984CDB2AC784D6186E02EDB5FDE0DDF@RaymondLaptop1>
Message-ID: <>

> Then I wonder how partial() ended up in the stdlib. 

PEP 309 was written, discussed, approved, and implemented - that's how
partial ended up in the stdlib. The feature itself might be debatable,
that's what we have the PEP process for.

> Either we decide it is useful to have a set of basic "functional" tools
> in the stdlib, and both partial() and compose() have their place there,
> or we decide functools has no place in the stdlib at all. Providing a
> half-assed module is probably frustrating to its potential users.

So write a PEP and propose to enhance the standard library.

> (not being particularly attached to functional tools, I still think
> compose() has its value, and Jason did a good job of presenting
> potential use cases)

I don't think he did. Comparing it to the one obvious solution (use
a lambda expression), his only reasoning was "it is much easier to
read". I truly cannot believe that a compose function would be easier
to read to the average Python programmer: if you have

  def foo(data):
    return compose(a, b(data), c)

what would you expect that to mean? Please rewrite it as a regular
Python expression, preferably without looking at the patch that
has been proposed first. I bet there is a 50% chance that you get
it wrong (because there are two possible interpretations).
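
The two readings can be made concrete. The sketch below is illustrative
only: compose_right_to_left and compose_left_to_right are hypothetical
names, and a, b and c are stand-in functions, with b acting as a factory
in the way b(data) does above.

```python
from functools import reduce


def compose_right_to_left(*functions):
    # compose(a, b, c)(x) == a(b(c(x)))
    return reduce(lambda f, g: lambda x: f(g(x)), functions)


def compose_left_to_right(*functions):
    # compose(a, b, c)(x) == c(b(a(x)))
    return reduce(lambda f, g: lambda x: g(f(x)), functions)


a = lambda x: x + 1
b = lambda data: (lambda x: x * data)   # factory, called as b(data)
c = lambda x: x - 3

right_to_left = compose_right_to_left(a, b(10), c)
left_to_right = compose_left_to_right(a, b(10), c)

print(right_to_left(5))   # a(b(10)(c(5))) = (5 - 3) * 10 + 1 = 21
print(left_to_right(5))   # c(b(10)(a(5))) = (5 + 1) * 10 - 3 = 57
```

The same call therefore gives different answers depending on which
convention the reader assumes.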


From joshua at  Mon Aug 17 00:08:55 2009
From: joshua at (Joshua Haberman)
Date: Sun, 16 Aug 2009 15:08:55 -0700
Subject: [Python-Dev] another Py_TPFLAGS_HEAPTYPE question
Message-ID: <>

I wrote to this list a few weeks ago asking about Py_TPFLAGS_HEAPTYPE
(  It occurred
to me today that I could probably make object instances INCREF and
DECREF my type appropriately, without setting Py_TPFLAGS_HEAPTYPE, by
writing my own tp_alloc and tp_dealloc functions.  My tp_alloc function
could be:

PyObject *my_tp_alloc(PyTypeObject *type, Py_ssize_t nitems)
{
  PyObject *obj = PyType_GenericAlloc(type, nitems);
  if(obj) Py_INCREF(type);
  return obj;
}

This seems right since it is PyType_GenericAlloc that contains this
check:

  if (type->tp_flags & Py_TPFLAGS_HEAPTYPE)
    Py_INCREF(type);

I don't want to set Py_TPFLAGS_HEAPTYPE, but I want to get that
Py_INCREF(), so far so good.

But I couldn't find the corresponding Py_DECREF() in typeobject.c to
the above Py_INCREF().  Notably, object_dealloc() does not call
Py_DECREF(self->ob_type) if self->ob_type has the Py_TPFLAGS_HEAPTYPE
flag set.

So where does the Py_DECREF() for the above Py_INCREF() live?  I
expected to find this code snippet somewhere, but couldn't:

  if (type->tp_flags & Py_TPFLAGS_HEAPTYPE)
    Py_DECREF(type);


From martin at  Mon Aug 17 00:13:43 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 17 Aug 2009 00:13:43 +0200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold><><8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECA@hornigold>	<>	<B984CDB2AC784D6186E02EDB5FDE0DDF@RaymondLaptop1>	<8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECB@hornigold>
Message-ID: <>

Martin v. Löwis wrote:
>> The reason I came across the old patch was because I was searching
>> for something that did exactly what compose does. That is to say, I
>> had a use case that was compelling enough that I thought there should
>> be something in functools to do what I wanted.  I've encountered this
>> pattern often enough that it might be in the stdlib.
> Can you kindly give one or two examples of where compose would have been
> useful?

I went back in the archives and found your example. What I now don't
understand is why you say that a compose function would be easier to
read than a lambda expression. Can you please elaborate on that?

I deeply believe that it is *harder* to read than a lambda expression,
because the lambda expression makes the evaluation order clear, whereas
the compose function doesn't (of course, function decorators ought to be
commutative, so in this case, lack of clear evaluation order might be
less important).


From martin at  Mon Aug 17 00:37:48 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 17 Aug 2009 00:37:48 +0200
Subject: [Python-Dev] another Py_TPFLAGS_HEAPTYPE question
In-Reply-To: <>
References: <>
Message-ID: <>

> So where does the Py_DECREF() for the above Py_INCREF() live?  I
> expected to find this code snippet somewhere, but couldn't:
>   if (type->tp_flags & Py_TPFLAGS_HEAPTYPE)
>     Py_DECREF(type);

For a regular heaptype, it's in subtype_dealloc:

                /* Can't reference self beyond this point */
                Py_DECREF(type);


From joshua at  Mon Aug 17 01:01:58 2009
From: joshua at (Joshua Haberman)
Date: Sun, 16 Aug 2009 16:01:58 -0700
Subject: [Python-Dev] another Py_TPFLAGS_HEAPTYPE question
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Aug 16, 2009 at 3:37 PM, "Martin v. Löwis"<martin at> wrote:
>> So where does the Py_DECREF() for the above Py_INCREF() live?  I
>> expected to find this code snippet somewhere, but couldn't:
>>   if (type->tp_flags & Py_TPFLAGS_HEAPTYPE)
>>     Py_DECREF(type);
> For a regular heaptype, it's in subtype_dealloc:
>                 /* Can't reference self beyond this point */
>                 Py_DECREF(type);

Thanks for the pointer.  I noticed that subtype_dealloc is only called for types
that are allocated using type_new().  Does this mean that it is not
safe to create
types in C using just PyType_Ready() and set Py_TPFLAGS_HEAPTYPE on
them?  The documentation is not clear on this point.

Here is what I would like to do when I create my types dynamically:

- implement tp_alloc and tp_dealloc() to INCREF and DECREF the type.
- not set Py_TPFLAGS_HEAPTYPE.
- set Py_TPFLAGS_HAVE_GC (because instances of my obj can create cycles)

Does this seem safe?  I notice that subtype_dealloc() does some funky
GC/trashcan stuff.  Is it safe for me not to call subtype_dealloc?  Can I
safely implement my tp_dealloc function like this?

void my_tp_dealloc(PyObject *obj)


From solipsis at  Mon Aug 17 02:07:50 2009
From: solipsis at (Antoine Pitrou)
Date: Mon, 17 Aug 2009 02:07:50 +0200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
	<1250444359.5558.15.camel@localhost>  <>
Message-ID: <1250467670.5528.8.camel@localhost>

> PEP 309 was written, discussed, approved, and implemented - that's how
> partial ended up in the stdlib.

Ok, I'm surprised that a single addition to a module needed a PEP in
order to be approved.

Interestingly, here's what the summary section in PEP 309 says: 
« A standard library module functional should contain an implementation
of partial, /and any other higher-order functions the community want/. »
(emphasis mine)

> I truly cannot believe that a compose function would be easier
> to read to the average Python programmer: if you have
>   def foo(data):
>     return compose(a, b(data), c)
> what would you expect that to mean? Please rewrite it as a regular
> Python expression, preferably without looking at the patch that
> has been proposed first.

Ok, here's my attempt without looking at the patch:

def foo(data):
    def bar(*args, **kwargs):
        return a(b(data)(c(*args, **kwargs)))
    return bar

Whether or not it is easier to read to the "average Python programmer"
is not that important I think. We have lots of things that certainly
aren't, and yet still exist (all of the functions in the operator
module, for example; or `partial` itself for that matter). They are
there for advanced programmers.



From greg.ewing at  Mon Aug 17 02:51:27 2009
From: greg.ewing at (Greg Ewing)
Date: Mon, 17 Aug 2009 12:51:27 +1200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECB@hornigold>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
Message-ID: <>

Jason R. Coombs wrote:
> I had a use case that was compelling enough that I thought there
 > should be something in functools to do what I wanted.

I think this is one of those things that a small minority of
people would use frequently, but everyone else would use
very rarely or never. The decision on whether to include
something in the stdlib needs to be based on the wider

In this case, it's trivial to write your own if you want
it. As they say, "not every one-line function needs to
be in the stdlib".


From benjamin at  Mon Aug 17 03:19:37 2009
From: benjamin at (Benjamin Peterson)
Date: Sun, 16 Aug 2009 20:19:37 -0500
Subject: [Python-Dev] another Py_TPFLAGS_HEAPTYPE question
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/16 Joshua Haberman <joshua at>:
> On Sun, Aug 16, 2009 at 3:37 PM, "Martin v. Löwis"<martin at> wrote:
>>> So where does the Py_DECREF() for the above Py_INCREF() live?  I
>>> expected to find this code snippet somewhere, but couldn't:
>>>   if (type->tp_flags & Py_TPFLAGS_HEAPTYPE)
>>>     Py_DECREF(type);
>> For a regular heaptype, it's in subtype_dealloc:
>>                 /* Can't reference self beyond this point */
>>                 Py_DECREF(type);
> Thanks for the pointer.  I noticed that subtype_dealloc is only called for types
> that are allocated using type_new().  Does this mean that it is not
> safe to create
> types in C using just PyType_Ready() and set Py_TPFLAGS_HEAPTYPE on
> them?  The documentation is not clear on this point.
> Here is what I would like to do when I create my types dynamically:
> - implement tp_alloc and tp_dealloc() to INCREF and DECREF the type.
> - not set Py_TPFLAGS_HEAPTYPE.
> - set Py_TPFLAGS_HAVE_GC (because instances of my obj can create cycles)

[Note that this is really starting to get off topic for python-dev.]

Why do you need to set Py_TPFLAGS_HEAPTYPE on your C type? Is a normal
static type not sufficient? The easiest way to create heaptypes is to
simply call PyType_Type.


From greg.ewing at  Mon Aug 17 03:39:36 2009
From: greg.ewing at (Greg Ewing)
Date: Mon, 17 Aug 2009 13:39:36 +1200
Subject: [Python-Dev] another Py_TPFLAGS_HEAPTYPE question
In-Reply-To: <>
References: <>
Message-ID: <>

Benjamin Peterson wrote:

> Why do you need to set Py_TPFLAGS_HEAPTYPE on your C type?

I think he *doesn't* want to set Py_TPFLAGS_HEAPTYPE, but
does want to create the type dynamically.

But I suspect this is actually FUD, and that letting
Py_TPFLAGS_HEAPTYPE be set wouldn't lead to anything
disastrous happening.

Note that by not giving instances a __dict__, they
will be prevented from having arbitrary attributes
set on them, which is the most noticeable distinction
between built-in and user-defined types.


From greg.ewing at  Mon Aug 17 04:13:01 2009
From: greg.ewing at (Greg Ewing)
Date: Mon, 17 Aug 2009 14:13:01 +1200
Subject: [Python-Dev] random number generator state
In-Reply-To: <4A882D30.2070508@Acm.Org>
References: <h6739u$s31$> <>
Message-ID: <>

Scott David Daniels wrote:

> No, I don't really need MT.  The others would be fine.
> I'd love further details.

The one I've been working with is due to Pierre L'Ecuyer [1]
and is known as MRG32k3a. It's a combined multiple recursive
linear congruential generator with 6 words of state. The
formulas are

     r1[i] = (a12 * r1[i-2] + a13 * r1[i-3]) % m1
     r2[i] = (a21 * r2[i-1] + a23 * r2[i-3]) % m2
     r[i] = (r1[i] - r2[i]) % m1

where

     m1 = 2**32 - 209
     m2 = 2**32 - 22853

     a12 = 1403580
     a13 = -810728
     a21 = 527612
     a23 = -1370589

If you consider the state to be made up of two 3-word
state vectors, then there are two 3x3 matrices which
map a given state onto the next state. So to jump
ahead n steps in the sequence, you raise these matrices
to the power of n.
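
As a rough Python sketch of the jump-ahead idea (this is not the
attached C++ code): each component keeps the state vector
(x[i-3], x[i-2], x[i-1]), one step is a multiplication by a 3x3 matrix
modulo m1 or m2, and jumping n steps uses the n-th matrix power computed
by repeated squaring. The function names and the starting state are
invented for the example; m2 is taken as 2**32 - 22853, the value
published by L'Ecuyer.

```python
M1 = 2**32 - 209
M2 = 2**32 - 22853
A12, A13 = 1403580, -810728
A21, A23 = 527612, -1370589

# Maps (x[i-3], x[i-2], x[i-1]) to (x[i-2], x[i-1], x[i]).
T1 = [[0, 1, 0], [0, 0, 1], [A13 % M1, A12, 0]]
T2 = [[0, 1, 0], [0, 0, 1], [A23 % M2, 0, A21]]


def mat_mul(a, b, m):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) % m
             for j in range(3)] for i in range(3)]


def mat_vec(a, v, m):
    return [sum(a[i][k] * v[k] for k in range(3)) % m for i in range(3)]


def mat_pow(a, n, m):
    """Raise a 3x3 matrix to the n-th power mod m by repeated squaring."""
    result = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    while n:
        if n & 1:
            result = mat_mul(result, a, m)
        a = mat_mul(a, a, m)
        n >>= 1
    return result


def step(state):
    s1, s2 = state
    return (mat_vec(T1, s1, M1), mat_vec(T2, s2, M2))


def jump(state, n):
    """Advance n steps in O(log n) matrix multiplications."""
    s1, s2 = state
    return (mat_vec(mat_pow(T1, n, M1), s1, M1),
            mat_vec(mat_pow(T2, n, M2), s2, M2))


def output(state):
    s1, s2 = state
    return (s1[2] - s2[2]) % M1   # r[i] = (r1[i] - r2[i]) % m1


start = ([12345, 12345, 12345], [12345, 12345, 12345])
walked = start
for _ in range(1000):
    walked = step(walked)
print(jump(start, 1000) == walked)
```

A failing test run then only needs the six starting words and a step
count to reproduce its generator state.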

I've attached some code implementing this generator
together with the jumping-ahead. (Sorry it's in C++,
I hadn't discovered Python when I wrote it.)

[1] Pierre L'Ecuyer, Good Parameters and Implementations for
     Combined Multiple Recursive Random Number Generators,
     Operations Research v47 no1 Jan-Feb 1999

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cmr_random_generator.C
URL: <>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cmr_random_generator.H
URL: <>

From benjamin at  Mon Aug 17 04:55:14 2009
From: benjamin at (Benjamin Peterson)
Date: Sun, 16 Aug 2009 21:55:14 -0500
Subject: [Python-Dev] [RELEASED] Python 3.1.1
Message-ID: <>

On behalf of the Python development team, I'm happy to announce the first bugfix
release of the Python 3.1 series, Python 3.1.1.

This bug fix release fixes many normal bugs and several critical ones including
potential data corruption in the io library.

Python 3.1 focuses on the stabilization and optimization of the features and
changes that Python 3.0 introduced.  For example, the new I/O system has been
rewritten in C for speed.  File system APIs that use unicode strings now handle
paths with undecodable bytes in them. Other features include an ordered
dictionary implementation, a condensed syntax for nested with statements, and
support for ttk Tile in Tkinter.  For a more extensive list of changes in 3.1,
see or Misc/NEWS in the Python

Please note the Windows and Mac binaries are not available yet but
will be in the coming days.

To download Python 3.1.1 visit:

The 3.1 documentation can be found at:

Bugs can always be reported to:


Benjamin Peterson
Release Manager
benjamin at
(on behalf of the entire python-dev team and 3.1.1's contributors)

From martin at  Mon Aug 17 08:53:56 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 17 Aug 2009 08:53:56 +0200
Subject: [Python-Dev] another Py_TPFLAGS_HEAPTYPE question
In-Reply-To: <>
References: <>	<>
Message-ID: <>

> Thanks for the pointer.  I noticed that subtype_dealloc is only called for types
> that are allocated using type_new().  Does this mean that it is not
> safe to create
> types in C using just PyType_Ready() and set Py_TPFLAGS_HEAPTYPE on
> them?  The documentation is not clear on this point.

As Benjamin says, this is getting off-topic - python-dev is not a place
to ask for help in your project.

I believe setting flags on a type is inherently unsafe.

> Here is what I would like to do when I create my types dynamically:
> - implement tp_alloc and tp_dealloc() to INCREF and DECREF the type.
> - not set Py_TPFLAGS_HEAPTYPE.
> - set Py_TPFLAGS_HAVE_GC (because instances of my obj can create cycles)
> Does this seem safe?  I notice that subtype_dealloc() does some funky
> GC/trashcan stuff.  Is it safe for me not to call subtype_dealloc?  Can I
> safely implement my tp_dealloc function like this?

If you bypass documented API, you really need to study the code,
understand its motivation, judge whether certain usage is "safe" wrt.
the current implementation, and judge the likelihood of this code
not getting changed in future versions.


From martin at  Mon Aug 17 09:07:00 2009
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 17 Aug 2009 09:07:00 +0200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <1250467670.5528.8.camel@localhost>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>	<>	<8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECA@hornigold>	<>	<B984CDB2AC784D6186E02EDB5FDE0DDF@RaymondLaptop1>	<1250444359.5558.15.camel@localhost>
	<> <1250467670.5528.8.camel@localhost>
Message-ID: <>

>> PEP 309 was written, discussed, approved, and implemented - that's how
>> partial ended up in the stdlib.
> Ok, I'm surprised that a single addition to a module needed a PEP in
> order to be approved.

A PEP is generally needed if no easy consensus is achievable. It's
not (primarily) the size of a feature that determines the need for a
formal process, but whether the community considers a certain change
"obviously" correct and desirable.

>>   def foo(data):
>>     return compose(a, b(data), c)
> Ok, here's my attempt without looking at the patch:
> def foo(data):
>     def bar(*args, **kwargs):
>         return a(b(data)(c(*args, **kwargs)))
>     return bar

Ok, that's also what the patch has proposed. I was puzzled when I read

   l.sort(key=compose(itemgetter(1), itemgetter(0)))

because I expected it to mean

   l.sort(key=lambda x:x[1][0])

when it would really mean

   l.sort(key=lambda x:x[0][1])
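
To make the two readings concrete, here is a small sketch. The compose() below is a stand-in written for this example (it is not the function from the patch), and it assumes the mathematical order in which the rightmost function runs first.

```python
from operator import itemgetter


def compose(*funcs):
    """compose(f, g, h)(x) == f(g(h(x))); the rightmost function runs first."""
    def composed(*args, **kwargs):
        value = funcs[-1](*args, **kwargs)
        for func in reversed(funcs[:-1]):
            value = func(value)
        return value
    return composed


rows = [("ab", 0), ("ba", 1)]

# itemgetter(0) is applied first, then itemgetter(1): the key is x[0][1].
key = compose(itemgetter(1), itemgetter(0))
by_composed_key = sorted(rows, key=key)
by_lambda_key = sorted(rows, key=lambda x: x[0][1])
```

With this data the x[1][0] reading would not even run, because x[1] is an integer.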

> Whether or not it is easier to read to the "average Python programmer"
> is not that important I think.

I completely disagree. It is one of Python's strengths that it is
"executable pseudo-code", which originates from the code being easy
to read, and meaning the obvious thing even to a reader not familiar
with the language. The proposed compose function breaks this important
property, in a way that allows misinterpretation (i.e. you think you
know what it does, and it actually does something different).

I, personally, was not able to understand the compose function
correctly, so I remain opposed.

> We have lots of things that certainly
> aren't, and yet still exist (all of the functions in the operator
> module, for example; or `partial` itself for that matter). They are
> there for advanced programmers.

It's quite ok if only advanced programmers know that they are there,
and know how to write them. However, I still think it is desirable
that "lesser" programmers are then able to read them, or at least notice
that they mean something that they will need to learn first (such
as a keyword they had never seen before).


From scott+python-dev at  Mon Aug 17 09:50:04 2009
From: scott+python-dev at (Scott Dial)
Date: Mon, 17 Aug 2009 03:50:04 -0400
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>	<>	<8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECA@hornigold>	<>	<B984CDB2AC784D6186E02EDB5FDE0DDF@RaymondLaptop1>	<8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECB@hornigold>
Message-ID: <>

Greg Ewing wrote:
> Jason R. Coombs wrote:
>> I had a use case that was compelling enough that I thought there
>> should be something in functools to do what I wanted.
> I think this is one of those things that a small minority of
> people would use frequently, but everyone else would use
> very rarely or never. The decision on whether to include
> something in the stdlib needs to be based on the wider
> picture.
> In this case, it's trivial to write your own if you want
> it. As they say, "not every one-line function needs to
> be in the stdlib".

I have never found these arguments compelling. They are obviously not
true (e.g., itertools.compress()[1] added in 2.7/3.1), and so what I
really hear is: "I don't like it and I outrank you."

I can't help invoke part of PEP309's justification for

I agree that lambda is usually good enough, just not always. And I want
the possibility of useful introspection and subclassing.

The same reasoning would seem to apply here. In the OP's example, the
meta-decorator becomes opaque due to the use of a lambda. If one could
introspect a compose(), then introspection tools could actually know the
set of decorators being applied. As it is, the "preferred" method of
using a lambda actually makes it quite hard to know anything.

class compose():
    def __init__(self, *funcs):
        if not funcs:
            raise ValueError(funcs)
        self.funcs = funcs

    def __call__(self, *args, **kwargs):
        v = self.funcs[-1](*args, **kwargs)
        for func in reversed(self.funcs[:-1]):
            v = func(v)
        return v

meta = functools.compose(decorator_a, decorator_b)
print meta.funcs

meta = lambda f: decorator_a(decorator_b(f))
# impossible, short of disassembling the lambda



def compress(data, selectors):
    # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
    return (d for d, s in zip(data, selectors) if s)

From steve at  Mon Aug 17 09:43:52 2009
From: steve at (Steven D'Aprano)
Date: Mon, 17 Aug 2009 17:43:52 +1000
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
	<1250444359.5558.15.camel@localhost> <>
Message-ID: <>

On Mon, 17 Aug 2009 08:10:16 am Martin v. Löwis wrote:

> I don't think he did. Comparing it to the one obvious solution (use
> a lambda expression), his only reasoning was "it is much easier to 
> read". I truly cannot believe that a compose function would be
> easier to read to the average Python programmer: if you have
>   def foo(data):
>     return compose(a, b(data), c)
> what would you expect that to mean? 

foo is a factory function that, given an argument `data`, generates a 
function b(data), then composes it with two other functions a and c, 
and returns the result, also a function.

> Please rewrite it as a regular 
> Python expression, preferably without looking at the patch that
> has been proposed first. I bet there is a 50% chance that you get 
> it wrong (because there are two possible interpretations).

But surely only one of them will agree with the standard definition of 
function composition. Both Mathworld and Wikipedia agree that f∘g(x) 
is equivalent to f(g(x)):

and I don't see any reason why a compose() function shouldn't do the
same.

(Aside: how do I look at the patch? The only link I have is here:
but I can't see how to get to the patch from there.)

foo could be written as:

def foo(data):
    return lambda *args, **kwargs: a(b(data)(c(*args, **kwargs)))

Or without lambda:

def foo(data):
    def composed(*args, **kwargs):
        return a(b(data)(c(*args, **kwargs)))
    return composed

This soon gets unwieldy:

def foo(arg1, arg2, arg3):
    return compose(
      f, g, h, factory(arg1), factory(arg2), factory(arg3)
      )


def foo(arg1, arg2, arg3):
    return lambda *a, **kw: (
      f(g(h(factory(arg1)(factory(arg2)(factory(arg3)(*a, **kw))))))
      )

but presumably composing six functions is rare.

A further advantage of compose() is that one could, if desired, 
generate a sensible name and doc string for the returned function. 
Depends on how heavyweight you want compose() to become.
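
A possible sketch of that, assuming a hand-written compose() (hypothetical, not the proposed patch) that builds the name and doc string from the __name__ of each argument:

```python
def compose(*funcs):
    """Return f1(f2(...fn(*args, **kwargs))) with a generated name and doc."""
    if not funcs:
        raise ValueError("compose() needs at least one function")

    def composed(*args, **kwargs):
        value = funcs[-1](*args, **kwargs)
        for func in reversed(funcs[:-1]):
            value = func(value)
        return value

    # Fall back to repr() for callables that have no __name__.
    names = [getattr(f, "__name__", repr(f)) for f in funcs]
    composed.__name__ = "compose(%s)" % ", ".join(names)
    composed.__doc__ = ("Composition of %s; the rightmost function is "
                        "applied first." % ", ".join(names))
    return composed


def double(x):
    return 2 * x


def increment(x):
    return x + 1


double_after_increment = compose(double, increment)
```

A lambda offers nothing comparable: its __name__ is always '<lambda>'.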

I think the compose() version is far more readable and understandable, 
but another factor is the performance cost of the generated function 
compared to a hand-made lambda.

For the record, Haskell makes compose a built-in operator:

It doesn't appear to be standard in Ruby, but it seems to be commonly 
requested, and a version is on Facets:

Steven D'Aprano 

From stefan_ml at  Mon Aug 17 11:14:05 2009
From: stefan_ml at (Stefan Behnel)
Date: Mon, 17 Aug 2009 11:14:05 +0200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <1250444359.5558.15.camel@localhost>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>	<>	<8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECA@hornigold>	<>	<B984CDB2AC784D6186E02EDB5FDE0DDF@RaymondLaptop1>
Message-ID: <h6b70s$gd1$>

Antoine Pitrou wrote:
> Raymond Hettinger <python <at>> writes:
>> IMO, its only virtue is that people coming from functional languages
>> are used to having compose.  Otherwise, it's a YAGNI.
> Then I wonder how partial() ended up in the stdlib. It seems hardly more
> useful than compose().

I would certainly consider it more useful, but that aside, it's also a lot
simpler to understand and use than the proposed compose() function. I think
the main difference is that compose() requires functional/math skills to be
used and read correctly (and might still be surprising in some corner
cases), whereas partial() only requires you to understand how to set a
function argument. Totally different level of mental complexity, IMHO.


From ncoghlan at  Mon Aug 17 12:53:20 2009
From: ncoghlan at (Nick Coghlan)
Date: Mon, 17 Aug 2009 20:53:20 +1000
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <1250467670.5528.8.camel@localhost>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>	<>	<8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECA@hornigold>	<>	<B984CDB2AC784D6186E02EDB5FDE0DDF@RaymondLaptop1>	<1250444359.5558.15.camel@localhost>
	<> <1250467670.5528.8.camel@localhost>
Message-ID: <>

Antoine Pitrou wrote:
>> PEP 309 was written, discussed, approved, and implemented - that's how
>> partial ended up in the stdlib.
> Ok, I'm surprised that a single addition to a module needed a PEP in
> order to be approved.

It makes a little more sense once you realise that there was no
functools module before the implementation of PEP 309. The other
functions it contains in Python 2.5 (update_wrapper() and wraps()) were
added later in the development cycle and reduce() didn't get added to it
until 2.6/3.0.

If a concrete proposal is made that emphasises the improved
introspection capabilities and raw speed increase that a function
composition approach can offer over the use of lambda then I'd
personally be willing to support this idea, since it was at least in
part those two ideas that sold the idea of partial(). (partial() did
have a big advantage over compose() in that the former's readability
gains were far more obvious to most readers).
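
For reference, the introspection that partial() offers looks like this. The func, args and keywords attributes are the documented ones; power() and the two squares are just example names.

```python
from functools import partial


def power(base, exponent):
    return base ** exponent


square = partial(power, exponent=2)

# A partial object records what it wraps and which arguments are frozen.
wrapped = square.func          # the original callable
frozen_args = square.args      # positional arguments fixed so far
frozen_kwargs = square.keywords  # keyword arguments fixed so far

# The equivalent lambda computes the same value but exposes none of this.
square_lambda = lambda base: power(base, 2)
```

An introspectable compose() would presumably expose its list of functions in a similar way.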


P.S. PEP 309 is wrong when it says a C version probably isn't worthwhile
- between the time the PEP was first implemented and the time 2.5 was
actually released, enough investigation was done to show that the speed
gain from Hye-Shik Chang's C version was well worth the additional code
complexity.

Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From solipsis at  Mon Aug 17 12:59:40 2009
From: solipsis at (Antoine Pitrou)
Date: Mon, 17 Aug 2009 12:59:40 +0200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
	<1250444359.5558.15.camel@localhost>  <>
	<1250467670.5528.8.camel@localhost>  <>
Message-ID: <1250506780.5706.11.camel@localhost>

On Monday 17 August 2009 at 09:07 +0200, "Martin v. Löwis" wrote:
> Ok, that's also what the patch has proposed. I was puzzled when I read
>    l.sort(key=compose(itemgetter(1), itemgetter(0)))
> because I expected it to mean
>    l.sort(key=lambda x:x[1][0])

But that's itemgetter's fault, not compose's. Because itemgetter's
obvious equivalent (the [] operator) uses postfix notation, combining
several itemgetters reverses the lexical order of appearance.

Besides, the argument order is similar to the one in the function
composition notation in mathematics (which isn't really advanced stuff
and should have been taught to every former scientific/technical student
out there).



From solipsis at  Mon Aug 17 13:01:31 2009
From: solipsis at (Antoine Pitrou)
Date: Mon, 17 Aug 2009 13:01:31 +0200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
	<1250444359.5558.15.camel@localhost>  <>
	<1250467670.5528.8.camel@localhost>  <>
Message-ID: <1250506891.5706.12.camel@localhost>

On Monday 17 August 2009 at 20:53 +1000, Nick Coghlan wrote:
> P.S. PEP 309 is wrong when it says a C version probably isn't worthwhile
> - between the time the PEP was first implemented and the time 2.5 was
> actually released, enough investigation was done to show that the speed
> gain from Hye-Shik Chang's C version was well worth the additional code
> complexity.

Yes, one-line Python wrappers can kill performance of pure C code.
Seen that in the IO lib, again.



From xavier.morel at  Mon Aug 17 12:38:57 2009
From: xavier.morel at (Xavier Morel)
Date: Mon, 17 Aug 2009 12:38:57 +0200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
	<1250444359.5558.15.camel@localhost> <>
Message-ID: <>

On 17 Aug 2009, at 09:43 , Steven D'Aprano wrote:
> On Mon, 17 Aug 2009 08:10:16 am Martin v. Löwis wrote:
>> I don't think he did. Comparing it to the one obvious solution (use
>> a lambda expression), his only reasoning was "it is much easier to
>> read". I truly cannot believe that a compose function would be
>> easier to read to the average Python programmer: if you have
>>  def foo(data):
>>    return compose(a, b(data), c)
>> what would you expect that to mean?
> foo is a factory function that, given an argument `data`, generates a
> function b(data), then composes it with two other functions a and c,
> and returns the result, also a function.
Judging from his messages, I think Martin's issue with `compose` is with
the composition order rather than the fact that it "pipes" functions:
compose uses the mathematical order, (f ∘ g)(x) = f(g(x)) (so g, the
last function of the composition, is applied first), rather than a
"shell pipe" order of `(f >>> g)(x) = g(f(x))` (where g, the last
function of the composition, is applied last).
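
The two orders can be sketched side by side. Both helpers are hypothetical single-argument versions written for this comparison; `pipe` is just a name picked here for the left-to-right variant.

```python
from functools import reduce


def compose(*funcs):
    """Mathematical order: compose(f, g)(x) == f(g(x))."""
    return lambda x: reduce(lambda acc, f: f(acc), reversed(funcs), x)


def pipe(*funcs):
    """Shell-pipe order: pipe(f, g)(x) == g(f(x))."""
    return lambda x: reduce(lambda acc, f: f(acc), funcs, x)


def add_one(x):
    return x + 1


def double(x):
    return x * 2


math_order = compose(add_one, double)(5)   # add_one(double(5))
pipe_order = pipe(add_one, double)(5)      # double(add_one(5))
```

The same argument list gives different results, which is exactly the 50% guess Martin was talking about.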

> For the record, Haskell makes compose a built-in operator:

Yes, but Haskell also has a left-to-right composition, the (>>>)
operator.

From chris at  Mon Aug 17 16:22:50 2009
From: chris at (Chris Withers)
Date: Mon, 17 Aug 2009 15:22:50 +0100
Subject: [Python-Dev] VC++ versions to match python versions?
Message-ID: <>

Hi All,

Is the Express Edition of Visual C++ 2008 suitable for compiling 
packages for Python 2.6 on Windows?
(And Python 2.6 itself for that matter...)

Ditto for 2.5, 3.1 and the trunk (which I guess becomes 3.2?)



Simplistix - Content Management, Batch Processing & Python Consulting

From phd at  Mon Aug 17 16:30:49 2009
From: phd at (Oleg Broytmann)
Date: Mon, 17 Aug 2009 18:30:49 +0400
Subject: [Python-Dev] VC++ versions to match python versions?
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Aug 17, 2009 at 03:22:50PM +0100, Chris Withers wrote:
> Is the Express Edition of Visual C++ 2008 suitable for compiling  
> packages for Python 2.6 on Windows?
> (And Python 2.6 itself for that matter...)
> Ditto for 2.5, 3.1 and the trunk (which I guess becomes 3.2?)

   These two I know for sure:

Python 2.5: MSVC-7.1 (VC++ 2003)
Python 2.6: MSVC-9.0 (VS 2008)
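
If you have an interpreter at hand, you can also ask it which compiler built it. A small sketch: the "MSC v.NNNN" tag is what Windows builds put in sys.version, the table only covers the two versions listed above (1310 and 1500 are their _MSC_VER numbers), and msvc_build() is a helper name invented for this example.

```python
import re
import sys

# _MSC_VER values for the two compilers named above.
KNOWN_COMPILERS = {
    1310: "MSVC-7.1 (VC++ 2003)",
    1500: "MSVC-9.0 (VS 2008)",
}


def msvc_build(version_string):
    """Return the MSVC release named in a sys.version string, or None."""
    match = re.search(r"MSC v\.(\d+)", version_string)
    if match is None:
        return None  # not built with MSVC (e.g. GCC on Linux)
    number = int(match.group(1))
    return KNOWN_COMPILERS.get(number, "unrecognised MSC v.%d" % number)


print(msvc_build(sys.version))
```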

     Oleg Broytmann              phd at
           Programmers don't die, they just GOSUB without RETURN.

From fuzzyman at  Mon Aug 17 16:31:49 2009
From: fuzzyman at (Michael Foord)
Date: Mon, 17 Aug 2009 15:31:49 +0100
Subject: [Python-Dev] VC++ versions to match python versions?
In-Reply-To: <>
References: <>
Message-ID: <>

Chris Withers wrote:
> Hi All,
> Is the Express Edition of Visual C++ 2008 suitable for compiling 
> packages for Python 2.6 on Windows?
> (And Python 2.6 itself for that matter...)
I would think so - all you really need is the compiler (which the 
express version definitely includes). You may need to manually add some 
directories to the path.

I haven't actually tried it, but then nor have you from the sound of it.

> Ditto for 2.5, 3.1 and the trunk (which I guess becomes 3.2?)

Python 3.1 / 3.2 are built with VS 2008. 2.5 is built with 2003 which is 
difficult to download unless you have an MSDN subscription. VS 2008 
can't (reliably) be used to build extensions for 2005 I believe.

I'm sure someone will correct me if this information is incorrect.


> cheers,
> Chris


From fuzzyman at  Mon Aug 17 16:34:25 2009
From: fuzzyman at (Michael Foord)
Date: Mon, 17 Aug 2009 15:34:25 +0100
Subject: [Python-Dev] VC++ versions to match python versions?
In-Reply-To: <>
References: <>
Message-ID: <>

Michael Foord wrote:
> Chris Withers wrote:
>> Hi All,
>> Is the Express Edition of Visual C++ 2008 suitable for compiling 
>> packages for Python 2.6 on Windows?
>> (And Python 2.6 itself for that matter...)
> I would think so - all you really need is the compiler (which the 
> express version definitely includes). You may need to manually add 
> some directories to the path.
> I haven't actually tried it, but then nor have you from the sound of it.
>> Ditto for 2.5, 3.1 and the trunk (which I guess becomes 3.2?)
> Python 3.1 / 3.2 are built with VS 2008. 2.5 is built with 2003 which 
> is difficult to download unless you have an MSDN subscription. VS 2008 
> can't (reliably) be used to build extensions for 2005 I believe.

D'oh. For 2.5 I mean. It may be *possible* though - just as you *can* 
build extensions for Python 2.5 on windows with mingw (with the 
appropriate distutils configuration), but there are pitfalls with doing
this.

> I'm sure someone will correct me if this information is incorrect.
> Michael
>> cheers,
>> Chris


From chris at  Mon Aug 17 16:36:22 2009
From: chris at (Chris Withers)
Date: Mon, 17 Aug 2009 15:36:22 +0100
Subject: [Python-Dev] VC++ versions to match python versions?
In-Reply-To: <>
References: <>
Message-ID: <>

Michael Foord wrote:
> D'oh. For 2.5 I mean. It may be *possible* though - just as you *can* 
> build extensions for Python 2.5 on windows with mingw (with the 
> appropriate distutils configuration), but there are pitfalls with doing 
> this.

Yes, in my case I'm trying to compile guppy (for heapy, which is an 
amazing tool) but that blows up with mingw...

(But I'm also likely to want to do some python dev on windows, httplib 
download problems and all...)



From eric.pruitt at  Mon Aug 17 17:19:10 2009
From: eric.pruitt at (Eric Pruitt)
Date: Mon, 17 Aug 2009 10:19:10 -0500
Subject: [Python-Dev] PEP Submission
Message-ID: <>

Several days ago, around the time the servers went down, I
submitted a PEP to editor at When things seemed to be working again,
I submitted the PEP again. I have not seen any activity on the PEP in
Python-Dev or any reply acknowledging that it was received. Did I
misunderstand the process of submitting a PEP?

From john at  Mon Aug 17 17:14:09 2009
From: john at (John Arbash Meinel)
Date: Mon, 17 Aug 2009 10:14:09 -0500
Subject: [Python-Dev] VC++ versions to match python versions?
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

Chris Withers wrote:
> Michael Foord wrote:
>> D'oh. For 2.5 I mean. It may be *possible* though - just as you *can*
>> build extensions for Python 2.5 on windows with mingw (with the
>> appropriate distutils configuration), but there are pitfalls with
>> doing this.
> Yes, in my case I'm trying to compile guppy (for heapy, which is an
> amazing tool) but that blows up with mingw...
> (But I'm also likely to want to do some python dev on windows, httplib
> download problems and all...)
> Chris

Guppy doesn't compile on Windows. Pretty much full-stop. It uses static
references to DLL functions, which on Windows is not allowed.

I've tried patching it to remove such things, and I finally got it to
compile, only to have it go "boom!" in actual use.

If you can get it to work, certainly post something so that I can cheer.



From chris at  Mon Aug 17 17:21:34 2009
From: chris at (Chris Withers)
Date: Mon, 17 Aug 2009 16:21:34 +0100
Subject: [Python-Dev] VC++ versions to match python versions?
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

John Arbash Meinel wrote:
> Guppy doesn't compile on Windows. Pretty much full-stop. It uses static
> references to DLL functions, which on Windows is not allowed.

This is no longer true as of the latest version of guppy...

> I've tried patching it to remove such things, and I finally got it to
> compile, only to have it go "boom!" in actual use.
> If you can get it to work, certainly post something so that I can cheer.

Are you on the guppy list? Someone posted a patch to it (which may have 
been you?) which has made it into the latest release. I haven't tried to 
get it working myself yet, but John Machin (who maintains xlrd) has with 
the Express compiler. I was just checking I had the right version and 
what version I should use if I want to try with Python 2.5.

I also wanted to know what versions I should be using for python core 
development should I be foolhardy enough to try any of that on 
Windows... (he says, staring down the barrel of slow httplib downloads 
on Windows :'( )

Will let you know what I find...



From chris at  Mon Aug 17 17:22:48 2009
From: chris at (Chris Withers)
Date: Mon, 17 Aug 2009 16:22:48 +0100
Subject: [Python-Dev] FAO John Arbash Meinel
In-Reply-To: <>
References: <>
Message-ID: <>

Mail Delivery System wrote:
> This is the mail system at host
> I'm sorry to have to inform you that your message could not
> be delivered to one or more recipients. It's attached below.
> For further assistance, please send mail to postmaster.
> If you do so, please include this problem report. You can
> delete your own text from the attached returned message.
>                    The mail system
> <john at>: host[] said: 550
>     relay not permitted (in reply to RCPT TO command)

Looks like something is not happy with your mail setup...



From guido at  Mon Aug 17 17:43:22 2009
From: guido at (Guido van Rossum)
Date: Mon, 17 Aug 2009 08:43:22 -0700
Subject: [Python-Dev] PEP Submission
In-Reply-To: <>
References: <>
Message-ID: <>

Hm... I thought the address was peps at

On Mon, Aug 17, 2009 at 8:19 AM, Eric Pruitt<eric.pruitt at> wrote:
> Several days ago, around the time the servers went down, I
> submitted a PEP to editor at When things seemed to be working again,
> I submitted the PEP again. I have not seen any activity on the PEP in
> Python-Dev or any reply acknowledging that it was received. Did I
> misunderstand the process of submitting a PEP?

--Guido van Rossum (home page:

From joshua at  Mon Aug 17 20:12:21 2009
From: joshua at (Joshua Haberman)
Date: Mon, 17 Aug 2009 11:12:21 -0700
Subject: [Python-Dev] another Py_TPFLAGS_HEAPTYPE question
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Aug 16, 2009 at 11:53 PM, "Martin v. Löwis"<martin at> wrote:
>> Thanks for the pointer.  I noticed that subtype_dealloc is only called for types
>> that are allocated using type_new().  Does this mean that it is not
>> safe to create
>> types in C using just PyType_Ready() and set Py_TPFLAGS_HEAPTYPE on
>> them?  The documentation is not clear on this point.
> As Benjamin says, this is getting off-topic - python-dev is not a place
> to ask for help in your project.

Please let me know where is a more suitable place to discuss the
implementation of CPython as it pertains to C extensions. I wrote
to python-dev only because the other lists appeared to be more focused
on Python-the-language.

> I believe setting flags on a type is inherently unsafe.

Clearly this is not true in general.  Take Py_TPFLAGS_BASETYPE, which
C types are expected to set if they can be subclassed.  Or
Py_TPFLAGS_HAVE_GC, which C types set if they participate in cyclic
reference collection.
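
These flags can be inspected from Python through a type's __flags__ attribute. A sketch follows; the bit positions are assumed from CPython's Include/object.h and are an implementation detail, and describe() and Example are names made up for the illustration.

```python
# Bit positions as defined in CPython's Include/object.h (assumed here;
# they are an implementation detail, not a Python-level API).
Py_TPFLAGS_HEAPTYPE = 1 << 9
Py_TPFLAGS_BASETYPE = 1 << 10
Py_TPFLAGS_HAVE_GC = 1 << 14


def describe(tp):
    """Report which of the three flags are set on a type object."""
    flags = tp.__flags__
    return {
        "heaptype": bool(flags & Py_TPFLAGS_HEAPTYPE),
        "basetype": bool(flags & Py_TPFLAGS_BASETYPE),
        "have_gc": bool(flags & Py_TPFLAGS_HAVE_GC),
    }


class Example(object):
    """Created by type_new(), so CPython marks it as a heap type."""


for tp in (int, bool, list, Example):
    print(tp.__name__, describe(tp))
```

On CPython, the statically allocated built-ins report heaptype as False, while a class created with a class statement reports True.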

The docs do not distinguish (AFAICS) between flags that C types may set
directly and those that they may not.  My reading of the docs left me with the
impression that a type could set Py_TPFLAGS_HEAPTYPE if it had allocated
that type on the heap and wanted it INCREF'd and DECREF'd by instances.
I now know that there is much more to this flag than I anticipated (see, I am just
giving you feedback about why the docs led me to this incorrect conclusion.

In any case, I think I will experiment with a different approach, where instead
of creating types in C dynamically at runtime, I will create a type whose
instances "pretend" to be types (they will create instances when called).  Still,
I would appreciate knowing where I should direct further questions of this type,
which are not questions about how to use Python but rather questions about how to
properly implement extensions.
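
The shape of that approach, sketched in Python rather than C. FakeType and Record are hypothetical names used only for this illustration; the real extension would implement the same thing with tp_call.

```python
class Record(object):
    """Instance produced by a FakeType; it remembers which FakeType made it."""

    def __init__(self, maker, values):
        # Holding a reference to the maker keeps it alive as long as any
        # instance exists, with ordinary reference counting doing the work.
        self.maker = maker
        self.values = values


class FakeType(object):
    """An ordinary object that acts like a type: calling it makes instances."""

    def __init__(self, name, fields):
        self.name = name
        self.fields = tuple(fields)

    def __call__(self, *values):
        if len(values) != len(self.fields):
            raise TypeError("%s expects %d values, got %d"
                            % (self.name, len(self.fields), len(values)))
        return Record(self, dict(zip(self.fields, values)))


Point = FakeType("Point", ["x", "y"])   # created dynamically at runtime
p = Point(1, 2)                          # "instantiating" the pretend type
```

Here Point is a plain instance of FakeType, so no type object is ever created at runtime.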

>> Here is what I would like to do when I create my types dynamically:
>> - implement tp_alloc and tp_dealloc() to INCREF and DECREF the type.
>> - not set Py_TPFLAGS_HEAPTYPE.
>> - set Py_TPFLAGS_HAVE_GC (because instances of my obj can create cycles)
>> Does this seem safe?  I notice that subtype_dealloc() does some funky
>> GC/trashcan stuff.  Is it safe for me not to call subtype_dealloc?  Can I
>> safely implement my tp_dealloc function like this?
> If you bypass documented API, you really need to study the code,
> understand its motivation, judge whether certain usage is "safe" wrt.
> the current implementation, and judge the likelihood of this code
> not getting changed in future versions.

It was not my intention to bypass the documented API.
Py_TPFLAGS_HEAPTYPE is documented here, with no note that
the flag should not be set explicitly by C types:

Also, INCREF'ing and DECREF'ing my type from the tp_new and
tp_dealloc functions doesn't seem outside of the documented API.


From martin at  Mon Aug 17 22:41:44 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 17 Aug 2009 22:41:44 +0200
Subject: [Python-Dev] another Py_TPFLAGS_HEAPTYPE question
In-Reply-To: <>
References: <>	
Message-ID: <>

>> As Benjamin says, this is getting off-topic - python-dev is not a place
>> to ask for help in your project.
> Please let me know where is a more suitable place to discuss the
> implementation of CPython as it pertains to C extensions. I wrote
> to python-dev only because the other lists appeared to be more focused
> on Python-the-language.

The general python-list, or, more specifically, capi-sig.

python-dev is exclusively reserved for current development of Python
(including PEP discussions). It is out-of-scope to ask questions of
the form "how do I do XYZ?".


From martin at  Mon Aug 17 22:53:14 2009
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 17 Aug 2009 22:53:14 +0200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>	<1250444359.5558.15.camel@localhost>
Message-ID: <>

> and I don't see any reason why a compose() function shouldn't do the 
> same.

I was tricked into reading it differently when used with getters, i.e.


is too easy (IMO) to read as applying on all elements of
the list.

> (Aside: how do I look at the patch? The only link I have is here:
> but I can't see how to get to the patch from there.)

It's best to search for "compose" in the bug tracker:


From martin at  Mon Aug 17 23:01:50 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 17 Aug 2009 23:01:50 +0200
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>	<>	<8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECA@hornigold>	<>	<B984CDB2AC784D6186E02EDB5FDE0DDF@RaymondLaptop1>	<8B473FAE8A08C34C9F5666FD4B0A87B6997EED1ECB@hornigold>	<>
Message-ID: <>

> I have never found these arguments compelling. They are obviously not
> true (e.g., itertools.compress()[1] added in 2.7/3.1), and so what I
> really hear is: "I don't like it and I outrank you."

That certainly contributes to it - if you are not a committer, you have
to find a committer that finds the feature important enough to work with
you to integrate it.

Fortunately, there is a process to overcome this problem: the PEP
process. If you really, really want the feature, and can't find
a committer that supports it yet, write a PEP. Then it will be up
to Guido van Rossum to reject it.

> The same reasoning would seem to apply here. In the OP's example, the
> meta-decorator becomes opaque due to the use of a lambda. If one could
> introspect a compose(), then introspection tools could actually know the
> set of decorators being applied. As it is, the "preferred" method of
> using a lambda actually makes it quite hard to know anything.

That makes it even more necessary to write a PEP. I would have never
guessed that introspection on the compose result is desirable. AFAICT,
operator.attrgetter isn't introspectable, either, nor would the patch
proposed in #7762 give you an introspectable getter.

ISTM that people have fairly different requirements wrt. that feature.


From at  Mon Aug 17 23:01:26 2009
From: at (David Bolen)
Date: Mon, 17 Aug 2009 17:01:26 -0400
Subject: [Python-Dev] VC++ versions to match python versions?
References: <>
Message-ID: <>

Chris Withers <chris at> writes:

> Is the Express Edition of Visual C++ 2008 suitable for compiling
> packages for Python 2.6 on Windows?
> (And Python 2.6 itself for that matter...)

Yes - it's currently being used on my buildbot, for example, to build
Python itself.  Works for 2.6 and later.

> Ditto for 2.5, 3.1 and the trunk (which I guess becomes 3.2?)

2.5 needs VS 2003.

-- David

From martin at  Mon Aug 17 23:06:20 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 17 Aug 2009 23:06:20 +0200
Subject: [Python-Dev] PEP Submission
In-Reply-To: <>
References: <>
Message-ID: <>

Guido van Rossum wrote:
> Hm... I thought the address was peps at
> On Mon, Aug 17, 2009 at 8:19 AM, Eric Pruitt<eric.pruitt at> wrote:
>> Several days ago, around the time the servers went down, I
>> submitted a PEP to editor at When things seemed to be working again,
>> I submitted the PEP again. I have not seen any activity on the PEP in
>> Python-Dev or any reply acknowledging that it was received. Did I
>> misunderstand the process of submitting a PEP?

Correct - that's what PEP 1 says.

editor at goes to the HOWTO editor, which is
amk at Not sure whether this alias is still useful.


From joshua at  Tue Aug 18 00:26:37 2009
From: joshua at (Joshua Haberman)
Date: Mon, 17 Aug 2009 15:26:37 -0700
Subject: [Python-Dev] another Py_TPFLAGS_HEAPTYPE question
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Aug 17, 2009 at 1:41 PM, "Martin v. Löwis"<martin at> wrote:
>>> As Benjamin says, this is getting off-topic - python-dev is not a place
>>> to ask for help in your project.
>> Please let me know where is a more suitable place to discuss the
>> implementation of the cPython as it pertains to C extensions. I wrote
>> to python-dev only because the other lists appeared to be more focused
>> on Python-the-language.
> The general python-list, or, more specifically, capi-sig.
> python-dev is exclusively reserved for current development of Python
> (including PEP discussions). It is out-of-scope to ask questions of
> the form "how do I do XYZ?".

Ok, I will direct future questions of this sort to one of those two
places -- thanks, and sorry for posting off-topic.


From steve at  Tue Aug 18 10:01:05 2009
From: steve at (Steven D'Aprano)
Date: Tue, 18 Aug 2009 18:01:05 +1000
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <h6b70s$gd1$>
References: <8B473FAE8A08C34C9F5666FD4B0A87B6997EED1EB6@hornigold>
	<1250444359.5558.15.camel@localhost> <h6b70s$gd1$>
Message-ID: <>

On Mon, 17 Aug 2009 07:14:05 pm Stefan Behnel wrote:
> Antoine Pitrou wrote:
> > Raymond Hettinger <python <at>> writes:
> >> IMO, its only virtue is that people coming from functional
> >> languages are used to having compose.  Otherwise, it's a YAGNI.
> >
> > Then I wonder how partial() ended up in the stdlib. It seems
> > hardly more useful than compose().
> I would certainly consider it more useful, but that aside, it's
> also a lot simpler to understand and use than the proposed
> compose() function. I think the main difference is that compose()
> requires functional/math skills to be used and read correctly (and
> might still be surprising in some corner cases), whereas partial()
> only requires you to understand how to set a function argument.
> Totally different level of mental complexity, IMHO.

I find the opposite -- compose() seems completely simple and 
straight-forward to me, while partial() is still a mystery no matter 
how many times I use it. I always have to look it up to see which way 
it binds.

Putting that aside, partial() too is easy enough to implement with 
lambda: partial(f, 2) is the same as lambda *args: f(2, *args). To my 
mind, there are two important reasons for preferring named functions 
like partial() and compose() over lambda solutions:

* performance: a good C implementation should be better than a 
pure-Python lambda; and

* specificity: there's only one thing compose() or partial() could do, 
whereas a lambda is so general it could do anything. Contrast:

compose(f, g, h)
lambda x: f(g(h(x)))

You need to read virtually the entire lambda before you can 
distinguish it from some other arbitrary lambda:

lambda x: f(g(h))(x)
lambda x: f(g(x) or h(x))
lambda x: f(g(x)) + h(x)
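
A minimal sketch of the equivalences described above. The compose()
here is a hypothetical pure-Python stand-in (functools has no such
function), and f, g, h are placeholder functions chosen only for the
example.

```python
from functools import partial, reduce


def compose(*funcs):
    """Hypothetical compose: compose(f, g, h)(x) == f(g(h(x)))."""
    def composed(x):
        # Apply the right-most function first.
        return reduce(lambda acc, fn: fn(acc), reversed(funcs), x)
    return composed


def f(x):
    return x + 1


def g(x):
    return x * 2


def h(x):
    return x - 3


with_compose = compose(f, g, h)
with_lambda = lambda x: f(g(h(x)))

# partial binds leading positional arguments, like the lambda spelling.
with_partial = partial(divmod, 17)
with_star_lambda = lambda *args: divmod(17, *args)
```

Both spellings of each pair return the same results; the named forms
just announce their intent before the reader reaches the body.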

Steven D'Aprano 

From martin at  Tue Aug 18 10:12:06 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 18 Aug 2009 10:12:06 +0200
Subject: [Python-Dev] Mercurial migration: help needed
Message-ID: <>

This is a repost from two weeks ago. It didn't get much feedback last
time. I still keep trying, reposting to python-list also this time.

In this thread, I'd like to collect things that ought to be done
but where Dirkjan has indicated that he would prefer if somebody else
did it.

Item 1
The first item is build identification. If you want to work
on this, please either provide a patch (for trunk and/or py3k), or
(if you are a committer) create a subversion branch.

It seems that Barry and I agree that for the maintenance branches,
sys.subversion should be frozen, so we actually need two sets of
patches: one that removes sys.subversion entirely, and the other that
freezes the branch to the respective one, and freezes the subversion
revision to None.

The patch should consider what Dirkjan proposes as the branching
strategy: clones to separate 2.x and 3.x, as well as for features,
and branches with the clones for releases and maintenance (see the
PEP for details).

Anybody working on this should have good knowledge of the Python source
code, Mercurial, and either autoconf or Visual Studio (preferably both).

Item 2

The second item is line conversion hooks. Dj Gilcrease has posted a
solution which he considers a hack himself. Mark Hammond has also
volunteered, but it seems some volunteer needs to be "in charge",
keeping track of a proposed solution until everybody agrees that it
is a good solution. It may be that two solutions are necessary: a
short-term one, that operates as a hook and has limitations, and
a long-term one, that improves the hook system of Mercurial to
implement the proper functionality (which then might get shipped
with Mercurial in a cross-platform manner).
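
As a rough sketch of the check such a hook has to perform (this is
plain Python, not Mercurial's hook API; find_crlf_lines is a
hypothetical helper name):

```python
def find_crlf_lines(data):
    """Return 1-based numbers of the lines in `data` (bytes) ending in CRLF."""
    return [
        number
        for number, line in enumerate(data.splitlines(keepends=True), 1)
        if line.endswith(b"\r\n")
    ]


sample = b"unix line\nwindows line\r\nanother unix line\n"
offending = find_crlf_lines(sample)
```

A real hook would run a check like this over every changed file and
reject the commit when the list is non-empty.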


From dirkjan at  Tue Aug 18 10:20:15 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Tue, 18 Aug 2009 10:20:15 +0200
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Aug 18, 2009 at 10:12, "Martin v. Löwis"<martin at> wrote:
> In this thread, I'd like to collect things that ought to be done
> but where Dirkjan has indicated that he would prefer if somebody else
> did it.

I think the most important item here is currently the win32text stuff.
Mark Hammond said he would work on this; Mark, when do you have time
for this? Then I could set apart some time for it as well.

Have stalled a bit on the fine-grained branch processing, hope to move
that forward tomorrow.



From cournape at  Tue Aug 18 11:45:07 2009
From: cournape at (David Cournapeau)
Date: Tue, 18 Aug 2009 02:45:07 -0700
Subject: [Python-Dev] VC++ versions to match python versions?
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Aug 17, 2009 at 2:01 PM, David Bolen< at> wrote:
> Chris Withers <chris at> writes:
>> Is the Express Edition of Visual C++ 2008 suitable for compiling
>> packages for Python 2.6 on Windows?
>> (And Python 2.6 itself for that matter...)
> Yes - it's currently being used on my buildbot, for example, to build
> Python itself.  Works for 2.6 and later.
>> Ditto for 2.5, 3.1 and the trunk (which I guess becomes 3.2?)
> 2.5 needs VS 2003.

The 64 bits version of 2.5 is built with VS 2005, though.



From mhammond at  Tue Aug 18 13:32:23 2009
From: mhammond at (Mark Hammond)
Date: Tue, 18 Aug 2009 21:32:23 +1000
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

On 18/08/2009 6:20 PM, Dirkjan Ochtman wrote:
> On Tue, Aug 18, 2009 at 10:12, "Martin v. Löwis"<martin at>  wrote:
>> In this thread, I'd like to collect things that ought to be done
>> but where Dirkjan has indicated that he would prefer if somebody else
>> did it.
> I think the most important item here is currently the win32text stuff.
> Mark Hammond said he would work on this; Mark, when do you have time
> for this? Then I could set apart some time for it as well.

I can make time, somewhat spasmodically, starting fairly soon.  Might I 
suggest that as a first task I can resurrect my old stale patch, and you 
can arrange to install win32text locally and start experimenting with 
how mixed line-endings can work for you.  Once we are all playing in the 
same ballpark I think we should be able to make good progress.

I-said-ballpark-yet-I-call-myself-an-aussie? ly,


From dirkjan at  Tue Aug 18 13:46:36 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Tue, 18 Aug 2009 13:46:36 +0200
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Aug 18, 2009 at 13:32, Mark Hammond<mhammond at> wrote:
> I can make time, somewhat spasmodically, starting fairly soon.  Might I
> suggest that as a first task I can resurrect my old stale patch, and you can
> arrange to install win32text locally and start experimenting with how mixed
> line-endings can work for you.  Once we are all playing in the same ballpark
> I think we should be able to make good progress.

Sounds good to me.



From brett at  Tue Aug 18 21:46:13 2009
From: brett at (Brett Cannon)
Date: Tue, 18 Aug 2009 12:46:13 -0700
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

[stripping out python-list and Mark from the CC]

On Tue, Aug 18, 2009 at 01:20, Dirkjan Ochtman<dirkjan at> wrote:
> On Tue, Aug 18, 2009 at 10:12, "Martin v. Löwis"<martin at> wrote:
>> In this thread, I'd like to collect things that ought to be done
>> but where Dirkjan has indicated that he would prefer if somebody else
>> did it.
> I think the most important item here is currently the win32text stuff.
> Mark Hammond said he would work on this; Mark, when do you have time
> for this? Then I could set apart some time for it as well.
> Have stalled a bit on the fine-grained branch processing, hope to move
> that forward tomorrow.

Can we possibly get these todo items in the PEP? I keep looking at the
PEP out of habit to see what the blockers are and they are not there,
at which point I have to dig up Martin's email.


> Cheers,
> Dirkjan

From dirkjan at  Tue Aug 18 21:53:30 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Tue, 18 Aug 2009 21:53:30 +0200
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Aug 18, 2009 at 21:46, Brett Cannon<brett at> wrote:
> Can we possibly get these todo items in the PEP? I keep looking at the
> PEP out of habit to see what the blockers are and they are not there,
> at which point I have to dig up Martin's email.

Will do.



From peter at  Tue Aug 18 22:00:06 2009
From: peter at (Peter Moody)
Date: Tue, 18 Aug 2009 13:00:06 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
Message-ID: <>

Howdy folks,

I have a first draft of a PEP for including an IP address manipulation
library in the python stdlib. It seems like there are a lot of really
smart folks with some, ahem, strong ideas about what an IP address
module should and shouldn't be so I wanted to solicit your input on
this pep.

the pep can be found here:

the code can be found here:

Please let me know if you have any comments (some already coming :)


From phd at  Tue Aug 18 22:34:27 2009
From: phd at (Oleg Broytmann)
Date: Wed, 19 Aug 2009 00:34:27 +0400
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for
	the	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

> :

> def IP(address, host=False, version=None):
>     """Take an IP string/int and return an object of the correct type.
>     Args:
>         ip_str: ...

   The arg is 'address', not 'ip_str'.

   There are two classes, IPv4 and IPv6, whose __new__ never creates an
instance of its own class; instead they create instances of other classes. Why
are IPv4 and IPv6 classes and not (factory) functions (like the function IP)?
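
   A minimal sketch of the factory-function alternative. V4Address,
V6Address and make_address are hypothetical names for this example,
not the ipaddr API; the version test is deliberately crude.

```python
class V4Address:
    def __init__(self, text):
        self.text = text


class V6Address:
    def __init__(self, text):
        self.text = text


def make_address(text):
    """Hypothetical factory: choose the concrete class from the text form."""
    # A plain function can return any type without surprising the reader,
    # whereas a class whose __new__ returns a different class does surprise.
    cls = V6Address if ":" in text else V4Address
    return cls(text)


v4 = make_address("192.0.2.1")
v6 = make_address("2001:db8::1")
```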

     Oleg Broytmann              phd at
           Programmers don't die, they just GOSUB without RETURN.

From martin at  Tue Aug 18 22:50:53 2009
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 18 Aug 2009 22:50:53 +0200
Subject: [Python-Dev] VC++ versions to match python versions?
In-Reply-To: <>
References: <>	<>
Message-ID: <>

>>> Ditto for 2.5, 3.1 and the trunk (which I guess becomes 3.2?)
>> 2.5 needs VS 2003.
> The 64 bits version of 2.5 is built with VS 2005, though.

Not really - it is built with the compiler in the platform SDK.


From phd at  Tue Aug 18 23:01:15 2009
From: phd at (Oleg Broytmann)
Date: Wed, 19 Aug 2009 01:01:15 +0400
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for
	the	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Aug 18, 2009 at 01:53:36PM -0700, Peter Moody wrote:
> hold over from when I was trying to be too fancy. fixed as well.

   Thank you. The PEP and the code are OK for me. Something like this should
be in the stdlib. Currently I'm using IPy.

     Oleg Broytmann              phd at
           Programmers don't die, they just GOSUB without RETURN.

From digitalxero at  Wed Aug 19 00:21:26 2009
From: digitalxero at (Dj Gilcrease)
Date: Tue, 18 Aug 2009 16:21:26 -0600
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Aug 18, 2009 at 2:12 AM, "Martin v. Löwis"<martin at> wrote:
> The second item is line conversion hooks. Dj Gilcrease has posted a
> solution which he considers a hack himself. Mark Hammond has also
> volunteered, but it seems some volunteer needs to be "in charge",
> keeping track of a proposed solution until everybody agrees that it
> is a good solution. It may be that two solutions are necessary: a
> short-term one, that operates as a hook and has limitations, and
> a long-term one, that improves the hook system of Mercurial to
> implement the proper functionality (which then might get shipped
> with Mercurial in a cross-platform manner).

My solution is a hack because the hooks in Mercurial need to be
modified to support it properly. I would be happy to help work on this
as it is a situation I run into all the time in my own projects. I can
never seem to get all the developers to enable the hooks, and one of
them always commits with improper line endings =P

From peter at  Tue Aug 18 22:53:36 2009
From: peter at (Peter Moody)
Date: Tue, 18 Aug 2009 13:53:36 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Aug 18, 2009 at 1:34 PM, Oleg Broytmann<phd at> wrote:
>> :
>> def IP(address, host=False, version=None):
>>     """Take an IP string/int and return an object of the correct type.
>>     Args:
>>         ip_str: ...
>   The arg is 'address', not 'ip_str'.

d'oh, fixed.

>   There are two classes, IPv4 and IPv6 whose __new__ never create an
> instance of its class, instead they create instances of other classes. Why
> IPv4 and IPv6 are classes and not (factory) functions (like function IP)?

hold over from when I was trying to be too fancy. fixed as well.


> Oleg.
> --
>     Oleg Broytmann              phd at
>           Programmers don't die, they just GOSUB without RETURN.

From solipsis at  Wed Aug 19 12:20:14 2009
From: solipsis at (Antoine Pitrou)
Date: Wed, 19 Aug 2009 10:20:14 +0000 (UTC)
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for
	the	Python Standard Library
References: <>
Message-ID: <h6gjku$t2n$>

On Tue, 18 Aug 2009 13:00:06 -0700, Peter Moody wrote:
> Howdy folks,
> I have a first draft of a PEP for including an IP address manipulation
> library in the python stdlib. It seems like there are a lot of really
> smart folks with some, ahem, strong ideas about what an IP address
> module should and shouldn't be so I wanted to solicit your input on this
> pep.

When you say :

"the results of the first computation should be cached and only
re-generated should the object properties change"

does it mean that the objects are mutable? Would it make sense to make 
them immutable and therefore hashable (such as, e.g., datetime objects)?

From tino at  Wed Aug 19 15:47:19 2009
From: tino at (Tino Wildenhain)
Date: Wed, 19 Aug 2009 15:47:19 +0200
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for	the
 Python Standard Library
In-Reply-To: <h6gjku$t2n$>
References: <>
Message-ID: <>

Antoine Pitrou wrote:
> On Tue, 18 Aug 2009 13:00:06 -0700, Peter Moody wrote:
>> Howdy folks,
>> I have a first draft of a PEP for including an IP address manipulation
>> library in the python stdlib. It seems like there are a lot of really
>> smart folks with some, ahem, strong ideas about what an IP address
>> module should and shouldn't be so I wanted to solicit your input on this
>> pep.
> When you say :
> "the results of the first computation should be cached and only
> re-generated should the object properties change"
> does it mean that the objects are mutable? Would it make sense to make 
> them immutable and therefore hashable (such as, e.g., datetime objects)?

They could implement __hash__ to behave correctly in this case.

In the examples however I see:

 >>> o.broadcast

this is often used but not the only valid broadcast address,
in fact, any address between network address and max(address with given
netmask) can be defined as broadcast. Maybe biggest or greatest
would be better name for the attribute. User is then free to interpret
it as broadcast if desired.

The attribute network returned as address object also does not seem
right.

The performance hit you mention by translating the object upfront
is negligible I'd say - for any sensible use of the object you'd
need the binary form anyway. You can even use system (e.g. socket)
functions to make the translation very fast. This also saves space
and allows for verification of the input.

(e.g. '' is 18 bytes where it could
  be stored as 8 bytes instead (or even 5 if you use
  ip/prefixlength)

I have a very very old implementation which even did the
translation from cidr format to integer in python code
(I don't say plain ;) but maybe worth a look:



From peter at  Wed Aug 19 17:19:46 2009
From: peter at (Peter Moody)
Date: Wed, 19 Aug 2009 08:19:46 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
	<h6gjku$t2n$> <>
Message-ID: <>

On Wed, Aug 19, 2009 at 6:47 AM, Tino Wildenhain<tino at> wrote:
> Antoine Pitrou wrote:
>> On Tue, 18 Aug 2009 13:00:06 -0700, Peter Moody wrote:
>>> Howdy folks,
>>> I have a first draft of a PEP for including an IP address manipulation
>>> library in the python stdlib. It seems like there are a lot of really
>>> smart folks with some, ahem, strong ideas about what an IP address
>>> module should and shouldn't be so I wanted to solicit your input on this
>>> pep.
>> When you say :
>> "the results of the first computation should be cached and only
>> re-generated should the object properties change"
>> does it mean that the objects are mutable? Would it make sense to make
>> them immutable and therefore hashable (such as, e.g., datetime objects)?
> They could implement __hash__ to behave correctly in this case.
> In the examples however I see:
>>>> o.broadcast
>    IPv4Address('')
> this is often used but not the only valid broadcast address,
> in fact, any address between network address and max(address with given
> netmask) can be defined as broadcast. Maybe biggest or greatest
> would be better name for the attribute. User is then free to interpret
> it as broadcast if desired.
> The attribute network returned as address object also does not seem
> right.

by convention, the highest address in a given network is called the
broadcast address while the lowest address is called the network
address. They're also distinct addresses, as opposed to networks,
hence .broadcast/.network/etc returning IPvXAddress objects. calling
them .biggest and .smallest would be confusing.

am I misinterpreting what you mean?

> The performance hit you mention by translating the object upfront
> is negligible I'd say - for any sensible use of the object you'd
> need the binary form anyway. You can even use system (e.g. socket)
> functions to make the translation very fast. This also saves space
> and allows for verification of the input.

I'll look into using socket where I can, but the computational hit
actually wasn't negligible. A common use for something like this
library might be to verify that an address typed by a user is valid,
'' instead of '1921.68.1.1'; computing the extra attributes
delays the return time and doesn't actually benefit the user or
programmer.

> (e.g. '' is 18 bytes where it could
>  be stored as 8 bytes instead (or even 5 if you use
> ip/prefixlength)
> I have a very very old implementation which even did the
> translation from cidr format to integer in python code
> (I don't say plain ;) but maybe worth a look:
> Regards
> Tino

From peter at  Wed Aug 19 17:35:15 2009
From: peter at (Peter Moody)
Date: Wed, 19 Aug 2009 08:35:15 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <h6gjku$t2n$>
References: <>
Message-ID: <>

On Wed, Aug 19, 2009 at 3:20 AM, Antoine Pitrou<solipsis at> wrote:
> On Tue, 18 Aug 2009 13:00:06 -0700, Peter Moody wrote:
>> Howdy folks,
>> I have a first draft of a PEP for including an IP address manipulation
>> library in the python stdlib. It seems like there are a lot of really
>> smart folks with some, ahem, strong ideas about what an IP address
>> module should and shouldn't be so I wanted to solicit your input on this
>> pep.
> When you say :
> "the results of the first computation should be cached and only
> re-generated should the object properties change"
> does it mean that the objects are mutable? Would it make sense to make
> them immutable and therefore hashable (such as, e.g., datetime objects)?

that's a good point. I'll implement __hash__ in the BaseIP class.


From eric at  Wed Aug 19 17:39:27 2009
From: eric at (Eric Smith)
Date: Wed, 19 Aug 2009 11:39:27 -0400
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<h6gjku$t2n$>
Message-ID: <>

Peter Moody wrote:
> On Wed, Aug 19, 2009 at 3:20 AM, Antoine Pitrou<solipsis at> wrote:
>> On Tue, 18 Aug 2009 13:00:06 -0700, Peter Moody wrote:
>>> Howdy folks,
>>> I have a first draft of a PEP for including an IP address manipulation
>>> library in the python stdlib. It seems like there are a lot of really
>>> smart folks with some, ahem, strong ideas about what an IP address
>>> module should and shouldn't be so I wanted to solicit your input on this
>>> pep.
>> When you say :
>> "the results of the first computation should be cached and only
>> re-generated should the object properties change"
>> does it mean that the objects are mutable? Would it make sense to make
>> them immutable and therefore hashable (such as, e.g., datetime objects)?
> that's a good point. I'll implement __hash__ in the BaseIP class.

But are the objects mutable? I haven't had time to deep dive on this 
yet, but I'd like to. I also use IPy and would like to see this in the
stdlib.


From tino at  Wed Aug 19 18:01:03 2009
From: tino at (Tino Wildenhain)
Date: Wed, 19 Aug 2009 18:01:03 +0200
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	
	<h6gjku$t2n$> <>
Message-ID: <>

Peter Moody wrote:
> On Wed, Aug 19, 2009 at 6:47 AM, Tino Wildenhain<tino at> wrote:
>> Antoine Pitrou wrote:
>>> On Tue, 18 Aug 2009 13:00:06 -0700, Peter Moody wrote:
>>>> Howdy folks,
>>>> I have a first draft of a PEP for including an IP address manipulation
>>>> library in the python stdlib. It seems like there are a lot of really
>>>> smart folks with some, ahem, strong ideas about what an IP address
>>>> module should and shouldn't be so I wanted to solicit your input on this
>>>> pep.
>>> When you say :
>>> "the results of the first computation should be cached and only
>>> re-generated should the object properties change"
>>> does it mean that the objects are mutable? Would it make sense to make
>>> them immutable and therefore hashable (such as, e.g., datetime objects)?
>> They could implement __hash__ to behave correctly in this case.
>> In the examples however I see:
>>>>> o.broadcast
>>    IPv4Address('')
>> this is often used but not the only valid broadcast address,
>> in fact, any address between network address and max(address with given
>> netmask) can be defined as broadcast. Maybe biggest or greatest
>> would be better name for the attribute. User is then free to interpret
>> it as broadcast if desired.
>> The attribute network returned as address object also does not seem
>> right.
> by convention, the highest address in a given network is called the
> broadcast address while the lowest address is called the network
> address. They're also distinct addresses, as opposed to networks,
> hence .broadcast/.network/etc returning IPvXAddress objects. calling
> them .biggest and .smallest would be confusing.
> am I misinterpreting what you mean?

No, I just said it's conventionally used as that, but that's not the
definition of a broadcast (in fact you can have any valid host address
defined as broadcast as long as all members of the network agree on that).

Since you don't want to call the attribute usually_the_broadcast_address
or something, other names which tell you about the data would seem more
appropriate (like greatest).

>> The performance hit you mention by translating the object upfront
>> is negligible I'd say - for any sensible use of the object you'd
>> need the binary form anyway. You can even use system (e.g. socket)
>> functions to make the translation very fast. This also saves space
>> and allows for verification of the input.
> I'll look into using socket where I can, but the computational hit
> actually wasn't negligible. A common use for something like this
> library might be to verify that an address typed by a user is valid,
> '' instead of '1921.68.1.1'; computing the extra attributes
> delays the return time and doesn't actually benefit the user or
> programmer.

Maybe I don't quite understand your extra attributes stuff - the
32 bit integer for ipv4 IP and the netmask in either 32 bit
or prefix length in 5 bits would be enough of a storage attribute.

All others are just representation of the values.

Storing the data as string seems a bit suboptimal since for
any sensible operation with the data you'd need to do the
conversion anyway.
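
A minimal sketch of the integer representation meant here, using the
standard socket and struct modules. The helper names are made up for
this example and the addresses are arbitrary.

```python
import socket
import struct


def ipv4_to_int(text):
    """Pack dotted-quad text into a 32-bit integer (network byte order)."""
    return struct.unpack("!I", socket.inet_aton(text))[0]


def int_to_ipv4(number):
    """Render a 32-bit integer back into dotted-quad text."""
    return socket.inet_ntoa(struct.pack("!I", number))


def prefix_to_netmask(prefixlen):
    """Expand a prefix length (0..32) into a 32-bit netmask."""
    return (0xFFFFFFFF << (32 - prefixlen)) & 0xFFFFFFFF


address = ipv4_to_int("192.168.1.1")
netmask = prefix_to_netmask(24)
network = address & netmask  # the lowest address of the enclosing /24
```

With the address held as one integer and the mask as a prefix length,
every other attribute is a cheap bit operation on those two values.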



From rdmurray at  Wed Aug 19 18:21:40 2009
From: rdmurray at (R. David Murray)
Date: Wed, 19 Aug 2009 12:21:40 -0400 (EDT)
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>
	<h6gjku$t2n$> <>
Message-ID: <>

On Wed, 19 Aug 2009 at 08:19, Peter Moody wrote:
> On Wed, Aug 19, 2009 at 6:47 AM, Tino Wildenhain<tino at> wrote:
>>> On Tue, 18 Aug 2009 13:00:06 -0700, Peter Moody wrote:
>>>>> o.broadcast
>>    IPv4Address('')
>> this is often used but not the only valid broadcast address,
>> in fact, any address between network address and max(address with given
>> netmask) can be defined as broadcast. Maybe biggest or greatest
>> would be better name for the attribute. User is then free to interpret
>> it as broadcast if desired.
>> The attribute network returned as address object also does not seem
>> right.
> by convention, the highest address in a given network is called the
> broadcast address while the lowest address is called the network
> address. They're also distinct addresses, as opposed to networks,
> hence .broadcast/.network/etc returning IPvXAddress objects. calling
> them .biggest and .smallest would be confusing.
> am I misinterpreting what you mean?

Possibly.  Tino means exactly what he said:  the broadcast address
does not _have_ to be the last IP, nor does the last IP _have_ to be
a broadcast, though in practice they almost always are (and using the
last IP as a host IP almost never works in practice in a heterogeneous
network).  Check out the 'broadcast' option of the ifconfig command for
confirmation that the broadcast address can be any IP in the network.
Of course, for that to work all hosts on the network have to agree on
what the broadcast is, hence the normal convention that the broadcast
is the last IP in the network.

As for the 'network' attribute, if you call it 'network' IMO it should
be a network data type, which would make it rather redundant.  What you
are actually returning is what I have normally heard called either the
'zero' of the network, or the "network number" or "network identifier";
but never just "network" (a network has to have at least an implicit
netmask to be meaningful, IMO).

Since you are dealing with networks as a list of addresses, perhaps
you should drop the 'network' attribute, make the 'broadcast' attribute
settable with a default equal to self[-1], and let the user refer to
the zero element to get the zero of the network if they want it.
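
To make the indexing idea concrete, here is a short sketch using the
`ipaddress` module that eventually shipped in the standard library (an
assumption relative to this thread, which discusses the earlier ipaddr
draft). The network below is an arbitrary documentation range.

```python
import ipaddress

net = ipaddress.ip_network("192.0.2.0/24")

zero = net[0]    # the "zero" or network number of the network
last = net[-1]   # the conventional, but not mandatory, broadcast address

same_as_broadcast = last == net.broadcast_address
same_as_network = zero == net.network_address
```

In that module the last address is exposed under the conventional name,
and it is not settable.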


From peter at  Wed Aug 19 18:17:21 2009
From: peter at (Peter Moody)
Date: Wed, 19 Aug 2009 09:17:21 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Aug 19, 2009 at 8:39 AM, Eric Smith<eric at> wrote:
> Peter Moody wrote:
>> On Wed, Aug 19, 2009 at 3:20 AM, Antoine Pitrou<solipsis at>
>> wrote:
>>> On Tue, 18 Aug 2009 13:00:06 -0700, Peter Moody wrote:
>>>> Howdy folks,
>>>> I have a first draft of a PEP for including an IP address manipulation
>>>> library in the python stdlib. It seems like there are a lot of really
>>>> smart folks with some, ahem, strong ideas about what an IP address
>>>> module should and shouldn't be so I wanted to solicit your input on this
>>>> pep.
>>> When you say :
>>> "the results of the first computation should be cached and only
>>> re-generated should the object properties change"
>>> does it mean that the objects are mutable? Would it make sense to make
>>> them immutable and therefore hashable (such as, e.g., datetime objects)?
>> that's a good point. I'll implement __hash__ in the BaseIP class.
> But are the objects mutable? I haven't had time to deep dive on this yet,
> but I'd like to. I also use IPy and would like to see this in the stdlib.

you can't set them directly, if that's what you mean.

>>> import ipaddr
>>> o = ipaddr.IPv4Network('')
>>> o.broadcast
>>> o.broadcast = ipaddr.IPv4Address('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
>>> o.prefixlen = 25
>>> o.broadcast

> Eric.

From solipsis at  Wed Aug 19 18:29:08 2009
From: solipsis at (Antoine Pitrou)
Date: Wed, 19 Aug 2009 16:29:08 +0000 (UTC)
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for
	the	Python Standard Library
References: <>
Message-ID: <h6h98k$5n2$>

On Wed, 19 Aug 2009 08:35:15 -0700, Peter Moody wrote:
>> does it mean that the objects are mutable? Would it make sense to make
>> them immutable and therefore hashable (such as, e.g., datetime
>> objects)?
> that's a good point. I'll implement __hash__ in the BaseIP class.

It is a common practice that only immutable objects define a meaningful 
__hash__ method. The reason is that dicts and sets (and perhaps other 
structures) cache the hash value instead of calling __hash__ again and 
again. If you stick a mutable with a meaningful __hash__ in a dict, and 
then modify the mutable object, lookups will give the wrong results (they 
will be based on the old, stale hash value).
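
The stale-hash problem described above can be reproduced with a small
sketch. MutableKey is a hypothetical class invented for illustration (it
is not part of ipaddr); its hash depends on an attribute that can be
reassigned after the object has already been used as a dict key.

```python
class MutableKey:
    """Hypothetical mutable object whose hash depends on mutable state."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, MutableKey) and self.value == other.value


key = MutableKey(24)
table = {key: "stored"}      # the dict records the hash computed right now

key.value = 25               # mutate the object after insertion

# The dict still files the entry under the old hash, so lookups miss it.
found_by_same_object = key in table
found_by_old_value = MutableKey(24) in table

# The entry has not gone anywhere; it is just unreachable by lookup.
still_listed = any(k is key for k in table)
```

After the mutation the entry is still stored in the dict, but neither the
mutated object nor a fresh object carrying the old value finds it.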

It seems to me that hashability is a more desirable property of IP
objects than modifiability. I don't see any reason to modify an IP object 
after having created it (rather than creating a new object).



From peter at  Wed Aug 19 18:55:46 2009
From: peter at (Peter Moody)
Date: Wed, 19 Aug 2009 09:55:46 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
	<h6gjku$t2n$> <>
Message-ID: <>

On Wed, Aug 19, 2009 at 9:21 AM, R. David Murray<rdmurray at> wrote:
> On Wed, 19 Aug 2009 at 08:19, Peter Moody wrote:
>> On Wed, Aug 19, 2009 at 6:47 AM, Tino Wildenhain<tino at>
>> wrote:
>>>> On Tue, 18 Aug 2009 13:00:06 -0700, Peter Moody wrote:
>>>>>> o.broadcast
>>>    IPv4Address('')
>>> this is often used but not the only valid broadcast address,
>>> in fact, any address between network address and max(address with given
>>> netmask) can be defined as broadcast. Maybe biggest or greatest
>>> would be better name for the attribute. User is then free to interpret
>>> it as broadcast if desired.
>>> The attribute network returned as address object also does not seem
>>> right.
>> by convention, the highest address in a given network is called the
>> broadcast address while the lowest address is called the network
>> address. They're also distinct addresses, as opposed to networks,
>> hence .broadcast/.network/etc returning IPvXAddress objects. calling
>> them .biggest and .smallest would be confusing.
>> am I misinterpreting what you mean?
> Possibly.  Tino means exactly what he said:  the broadcast address
> does not _have_ to be the last IP, nor does the last IP _have_ to be
> a broadcast, though in practice they almost always are (and using the
> last IP as a host IP almost never works in practice in a heterogeneous
> network).  Check out the 'broadcast' option of the ifconfig command for
> confirmation that the broadcast address can be any IP in the network.
> Of course, for that to work all hosts on the network have to agree on
> what the broadcast is, hence the normal convention that the broadcast
> is the last IP in the network.
> As for the 'network' attribute, if you call it 'network' IMO it should
> be a network data type, which would make it rather redundant.  What you
> are actually returning is what I have normally heard called either the
> 'zero' of the network, or the "network number" or "network identifier";
> but never just "network" (a network has to have at least an implicit
> netmask to be meaningful, IMO).
> Since you are dealing with networks as a list of addresses, perhaps
> you should drop the 'network' attribute, make the 'broadcast' attribute
> settable with a default equal to self[-1], and let the user refer to
> the zero element to get the zero of the network if they want it.

making the broadcast address settable (with a default to self[-1])
might be reasonable, though it is different from just about every
other python implementation I've seen (IPy,, netaddr).

I'm not sure I understand your point about the network attribute.
what I'm returning with network is the subnet-id/base address of the
given network. Again, .network seems to be fairly standard for naming.

> --David

From fwierzbicki at  Wed Aug 19 20:10:35 2009
From: fwierzbicki at (Frank Wierzbicki)
Date: Wed, 19 Aug 2009 14:10:35 -0400
Subject: [Python-Dev] Two laments about CPython's AST Nodes
Message-ID: <>

Before I start complaining, I want to mention what a huge help it has
been to be able to directly compare the AST exposed by in
making Jython a better Python.  Thanks for that!

Now on to the complaints: Though I recently added support for this in
Jython, I don't like that nodes can be defined without required
attributes, for example:

node = ast.Assign()

Is valid, even though it requires "node.targets" and "node.value" (I'm
less concerned about the required lineno and col_offset, as I can
understand holding off on these so that you can just use
fix_missing_locations to fill these in for you).

My other (bigger) gripe is that you can define these values with
arbitrary objects that will blow up at parse time.  So for example you
can write:

node = ast.Assign()
node.targets = "whatever"

Which, when you try to parse, blows up with "TypeError: Assign field
"targets" must be a list, not a str".  I'd be much happier if this
blew up right away when you try to make the assignment.  At the
moment, this is how it works in Jython (though I could support this
with some contorting).
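
The deferred check described above can be seen with the ast module as it
ships in current CPython. This is only a sketch under assumptions: it
targets Python 3.8 or later (ast.Constant and the type_ignores argument
did not exist in 2009), and it passes the fields to the constructor first
so that the example does not depend on how missing fields are treated.

```python
import ast

# Start from a well-formed node, then overwrite a field with the wrong type.
node = ast.Assign(targets=[], value=ast.Constant(value=1))
node.targets = "whatever"          # accepted silently at assignment time

# Wrapping the node and filling in locations also succeeds, because
# nothing has looked at the field types yet.
module = ast.fix_missing_locations(ast.Module(body=[node], type_ignores=[]))

try:
    compile(module, "<ast>", "exec")
except TypeError as exc:
    error = exc                    # the complaint arrives only here
else:
    error = None
```

The assignment itself never fails; the TypeError surfaces only once the
tree is handed to compile().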

BTW -- I *am* volunteering to attempt to implement these things in
CPython if there is support :)


From eric at  Wed Aug 19 20:20:20 2009
From: eric at (Eric Smith)
Date: Wed, 19 Aug 2009 14:20:20 -0400
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<h6gjku$t2n$>
	<>	<>	<>
Message-ID: <>

Peter Moody wrote:
> On Wed, Aug 19, 2009 at 9:21 AM, R. David Murray<rdmurray at> wrote:

>> Possibly.  Tino means exactly what he said:  the broadcast address
>> does not _have_ to be the last IP, nor does the last IP _have_ to be
>> a broadcast, though in practice they almost always are (and using the
>> last IP as a host IP almost never works in practice in a heterogeneous
>> network).  Check out the 'broadcast' option of the ifconfig command for
>> confirmation that the broadcast address can be any IP in the network.
>> Of course, for that to work all hosts on the network have to agree on
>> what the broadcast is, hence the normal convention that the broadcast
>> is the last IP in the network.
>> As for the 'network' attribute, if you call it 'network' IMO it should
>> be a network data type, which would make it rather redundant.  What you
>> are actually returning is what I have normally heard called either the
>> 'zero' of the network, or the "network number" or "network identifier";
>> but never just "network" (a network has to have at least an implicit
>> netmask to be meaningful, IMO).
>> Since you are dealing with networks as a list of addresses, perhaps
>> you should drop the 'network' attribute, make the 'broadcast' attribute
>> settable with a default equal to self[-1], and let the user refer to
>> the zero element to get the zero of the network if they want it.
> making the broadcast address settable (with a default to self[-1])
> might be reasonable, though it is different from just about every
> other python implementation I've seen (IPy,, netaddr).
> I'm not sure I understand your point about the network attribute.
> what I'm returning with network is the subnet-id/base address of the
> given network. Again, .network seems to be fairly standard for naming.

I think using .network and .broadcast are pretty well understood to be 
the [0] and [-1] of the network address block. I don't think we want to 
start creating new terms or access patterns here.

+1 on leaving .network and .broadcast as-is (including returning an
IPvXAddress object).


From glyph at  Wed Aug 19 20:28:38 2009
From: glyph at (Glyph Lefkowitz)
Date: Wed, 19 Aug 2009 14:28:38 -0400
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
	<h6gjku$t2n$> <>
Message-ID: <>

On Wed, Aug 19, 2009 at 2:20 PM, Eric Smith <eric at> wrote:

> I think using .network and .broadcast are pretty well understood to be the
> [0] and [-1] of the network address block. I don't think we want to start
> creating new terms or access patterns here.
> +1 on leaving .network and .broadcast as-is (including returning an
> IPvXAddress object).

-1.  I think 'network.number' or '' is a lot clearer than
''.  Maybe '.broadcast' would be okay, as long as it *can* be
adjusted for those unusual, or maybe even only hypothetical, networks where
it is not the [-1].

From brett at  Wed Aug 19 20:34:46 2009
From: brett at (Brett Cannon)
Date: Wed, 19 Aug 2009 11:34:46 -0700
Subject: [Python-Dev] Two laments about CPython's AST Nodes
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Aug 19, 2009 at 11:10, Frank Wierzbicki<fwierzbicki at> wrote:
> Before I start complaining, I want to mention what a huge help it has
> been to be able to directly compare the AST exposed by in
> making Jython a better Python.  Thanks for that!
> Now on to the complaints: Though I recently added support for this in
> Jython, I don't like that nodes can be defined without required
> attributes, for example:
> node = ast.Assign()
> Is valid, even though it requires "node.targets" and "node.value" (I'm
> less concerned about the required lineno and col_offset, as I can
> understand holding off on these so that you can just use
> fix_missing_locations to fill these in for you).
> My other (bigger) gripe is that you can define these values with
> arbitrary objects that will blow up at parse time.  So for example you
> can write:
> node = ast.Assign()
> node.targets = "whatever"
> Which, when you try to parse, blows up with "TypeError: Assign field
> "targets" must be a list, not a str".  I'd be much happier if this
> blew up right away when you try to make the assignment.  At the
> moment, this is how it works in Jython (though I could support this
> with some contorting).
> BTW -- I *am* volunteering to attempt to implement these things in
> CPython if there is support :)

+1 from me for adding this support. While I can see people wanting to
create the node as soon as it is known to be needed and then plug in
the parts as they get parsed, postponing the node creation to later
once the subnodes have been done is not exactly a challenge.


From eric at  Wed Aug 19 20:37:11 2009
From: eric at (Eric Smith)
Date: Wed, 19 Aug 2009 14:37:11 -0400
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<h6gjku$t2n$>
	<>	<>	<>	<>	<>
Message-ID: <>

Glyph Lefkowitz wrote:
> On Wed, Aug 19, 2009 at 2:20 PM, Eric Smith <eric at 
> <mailto:eric at>> wrote:
>     I think using .network and .broadcast are pretty well understood to
>     be the [0] and [-1] of the network address block. I don't think we
>     want to start creating new terms or access patterns here.
>     +1 on leaving .network and .broadcast as-is (including returning an
>     IPvXAddress object).
> -1.  I think 'network.number' or '' is a lot clearer than 
> ''.  Maybe '.broadcast' would be okay, as long as it 
> /can/ be adjusted for those unusual, or maybe even only hypothetical, 
> networks where it is not the [-1].

Is there some existing library that uses .number or .zero? IPy uses .net 
(and .broadcast for [-1]).

From benjamin at  Wed Aug 19 22:06:52 2009
From: benjamin at (Benjamin Peterson)
Date: Wed, 19 Aug 2009 15:06:52 -0500
Subject: [Python-Dev] Two laments about CPython's AST Nodes
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/19 Frank Wierzbicki <fwierzbicki at>:
> Before I start complaining, I want to mention what a huge help it has
> been to be able to directly compare the AST exposed by in
> making Jython a better Python.  Thanks for that!
> Now on to the complaints: Though I recently added support for this in
> Jython, I don't like that nodes can be defined without required
> attributes, for example:
> node = ast.Assign()
> Is valid, even though it requires "node.targets" and "node.value" (I'm
> less concerned about the required lineno and col_offset, as I can
> understand holding off on these so that you can just use
> fix_missing_locations to fill these in for you).


> My other (bigger) gripe is that you can define these values with
> arbitrary objects that will blow up at parse time.  So for example you
> can write:
> node = ast.Assign()
> node.targets = "whatever"
> Which, when you try to parse, blows up with "TypeError: Assign field
> "targets" must be a list, not a str".  I'd be much happier if this
> blew up right away when you try to make the assignment.  At the
> moment, this is how it works in Jython (though I could support this
> with some contorting).

I also think this is a good idea, but this also causes an asymmetry. I
would still be able to do this:

node = ast.Module([])
node.body.append("random stuff")

and not have it type checked until it is compiled. This would be hard
to fix, though, and I think it is worth living with.
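
The asymmetry is easy to observe in current CPython. The sketch below
assumes Python 3.8 or later, where ast.Module also takes a type_ignores
argument that was not part of the API at the time of this thread.

```python
import ast

module = ast.Module(body=[], type_ignores=[])

# body is an ordinary Python list, so it accepts any object at all.
module.body.append("random stuff")

try:
    compile(module, "<ast>", "exec")
except TypeError as exc:
    error = exc          # the bad element is noticed only at compile time
else:
    error = None
```

Checking attribute assignment on the node would not catch this case,
since the list is mutated in place without going through the node.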

> BTW -- I *am* volunteering to attempt to implement these things in
> CPython if there is support :)

Very generous. :)


From martin at  Wed Aug 19 22:45:23 2009
From: martin at ("Martin v. Löwis")
Date: Wed, 19 Aug 2009 22:45:23 +0200
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>		<h6gjku$t2n$>
	<>	<>
Message-ID: <>

> No, I just said it's conventionally used as that, but it's not the definition
> of a broadcast (in fact you can have any valid host address defined
> as broadcast as long as all members of the network agree on that)

You could, but then you are violating existing protocol specifications.

RFC 1122 mandates, in sections and 3.3.6, that certain addresses
MUST be understood as broadcast addresses, by all nodes (independent of
configuration).

I think a Python IP address library should conform to all relevant RFCs.

> Since you don't want to call the attribute usually_the_broadcast_address
> or something, other names which tell you about the data would seem more
> appropriate (like greatest)

No. I think setting the broadcast address to something else just does
not need to be supported.


From martin at  Wed Aug 19 23:00:33 2009
From: martin at ("Martin v. Löwis")
Date: Wed, 19 Aug 2009 23:00:33 +0200
Subject: [Python-Dev] Two laments about CPython's AST Nodes
In-Reply-To: <>
References: <>
Message-ID: <>

> Now on to the complaints: Though I recently added support for this in
> Jython, I don't like that nodes can be defined without required
> attributes, for example:
> node = ast.Assign()

I think we disagree in two points in our evaluation of this behavior:

a) ast.Assign is *not* a node of the CPython AST. The CPython AST is
   a set of C structures in Include/Python-ast.h. ast.Assign is merely
   a mirror structure of that.

b) it is, IMO, not reasonable to require users who create AST trees
   out of nothing to have correct trees at all times. I.e. it must be
   possible to represent incorrect trees.

c) the AST is *not* part of the Python language or library. It may
   change at any time without notice, and Jython is not required to
   duplicate its behavior exactly.

[so that's three items - as there should be in any good list of
two items :-]

> Which, when you try to parse, blows up with "TypeError: Assign field
> "targets" must be a list, not a str".  I'd be much happier if this
> blew up right away when you try to make the assignment.  At the
> moment, this is how it works in Jython (though I could support this
> with some contorting).

What precisely is it that makes this difficult to implement? If you
would follow CPython's implementation strategy (i.e. generate glue
code out of ASDL), I feel that it should be straight-forward to provide
exactly the same behavior in Jython.

> BTW -- I *am* volunteering to attempt to implement these things in
> CPython if there is support :)

I'm not sure I can support such a change. Giving the child nodes at
creation time, optionally, would be fine with me. Requiring the
tree to conform to the grammar at all times is unreasonable, IMO
(you can't simultaneously create all nodes in the tree and glue
them together, so you have to create the tree in steps - which means
that it will be intermittently incorrect).
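
A sketch of the step-by-step construction described above, using the ast
module in current CPython (Python 3.8 or later is assumed; the variable
name "answer" is made up for the example). The tree is invalid between
the two steps and only needs to be correct when it reaches compile().

```python
import ast

# Step 1: create the statement before its targets are known.
assign = ast.Assign(targets=[], value=ast.Constant(value=42))
module = ast.Module(body=[assign], type_ignores=[])
ast.fix_missing_locations(module)

# At this point the tree is incomplete, and the compiler rejects it.
try:
    compile(module, "<ast>", "exec")
    rejected_while_incomplete = False
except (TypeError, ValueError):
    rejected_while_incomplete = True

# Step 2: glue in the missing child and give it location information.
assign.targets.append(ast.Name(id="answer", ctx=ast.Store()))
ast.fix_missing_locations(module)

# Now the tree conforms to the grammar and can be compiled and run.
namespace = {}
exec(compile(module, "<ast>", "exec"), namespace)
```

Passing the children to the constructor, as suggested above, covers the
common case while still allowing the tree to be finished later.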

You seem to propose some middle ground, but I'm not sure I understand
what that middle ground is.


From glyph at  Wed Aug 19 23:02:11 2009
From: glyph at (Glyph Lefkowitz)
Date: Wed, 19 Aug 2009 17:02:11 -0400
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
	<h6gjku$t2n$> <>
	<> <>
Message-ID: <>

On Wed, Aug 19, 2009 at 4:45 PM, "Martin v. Löwis" <martin at> wrote:

> > No, I just said it's conventionally used as that, but it's not the definition
> > of a broadcast (in fact you can have any valid host address defined
> > as broadcast as long as all members of the network agree on that)
> You could, but then you are violating existing protocol specifications.
> RFC 1122 mandates, in sections and 3.3.6, that certain addresses
> MUST be understood as broadcast addresses, by all nodes (independent of
> configuration).
> I think a Python IP address library should conform to all relevant RFCs.

Yes, but section 3.3.6 also states:

There is a class of hosts (4.2BSD Unix and its derivatives, but not 4.3BSD)
that use non-standard broadcast address forms, substituting 0 for -1. All
hosts SHOULD recognize and accept any of these non-standard broadcast
addresses as the destination address of an incoming datagram. A host MAY
optionally have a configuration option to choose the 0 or the -1 form of
broadcast address, for each physical interface, but this option SHOULD
default to the standard (-1) form.

So it sounds like doing what I suggested earlier (default to [-1], allow for
customization) is actually required by the RFC :-).  Although it does sound
like the RFC only requires that you be able to customize to [0] rather than
[-1], rather than any address.  In practical terms though I believe it is
possible to do as Tino suggests and configure any crazy address you want to
be the broadcast address (or addresses, even) for a network.

> I think setting the broadcast address to something else just does not need
> to be supported.

It is unusual, but frankly, needing to actually do operations on broadcast
addresses at all is also a pretty unusual task.  Broadcast itself is a
somewhat obscure corner of networking.  I suspect that in many deployments
that need to write significant code to deal with broadcast addresses, rather
than the usual default stuff, funky configurations will actually be quite
common.

I would not be surprised to find that there are still some 4.2BSD VAXes
somewhere doing something important, and some Python may one day be called
upon to manage their networks.

From eric at  Wed Aug 19 23:13:05 2009
From: eric at (Eric Smith)
Date: Wed, 19 Aug 2009 17:13:05 -0400
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<h6gjku$t2n$>
	<>	<>	<>
Message-ID: <>

> So it sounds like doing what I suggested earlier (default to [-1], allow 
> for customization) is actually required by the RFC :-).  Although it 
> does sound like the RFC only requires that you be able to customize to 
> [0] rather than [-1], rather than any address.  In practical terms 
> though I believe it is possible to do as Tino suggests and configure any 
> crazy address you want to be the broadcast address (or addresses, even) 
> for a network.

If you're doing this, are you really going to be specifying the 
broadcast address as something like network.use_broadcast_index(-2) (or 
even 0) and then using network.broadcast somewhere else? I just don't 
see that happening.

>     I think setting the broadcast address to something else just does
>     not need to be supported.

I agree.

> It is unusual, but frankly, needing to actually do operations on 
> broadcast addresses at all is also a pretty unusual task.  Broadcast 
> itself is a somewhat obscure corner of networking.  I suspect that in 
> many deployments that need to write significant code to deal with 
> broadcast addresses, rather than the usual default stuff, funky 
> configurations will actually be quite common.

I use .broadcast from IPy, and I'm not doing anything funky. All of my 
broadcast addresses are network[-1].

From ncoghlan at  Wed Aug 19 23:17:04 2009
From: ncoghlan at (Nick Coghlan)
Date: Thu, 20 Aug 2009 07:17:04 +1000
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<h6gjku$t2n$>
	<>	<>	<>
Message-ID: <>

Glyph Lefkowitz wrote:
> It is unusual, but frankly, needing to actually do operations on
> broadcast addresses at all is also a pretty unusual task.  Broadcast
> itself is a somewhat obscure corner of networking.  I suspect that in
> many deployments that need to write significant code to deal with
> broadcast addresses, rather than the usual default stuff, funky
> configurations will actually be quite common.
> I would not be surprised to find that there are still some 4.2BSD VAXes
> somewhere doing something important, and some Python may one day be
> called upon to manage their networks.

If using a custom broadcast address rather than the standard one, don't
use the ipnet.broadcast property?

I'm with Martin and the PEP author here - the property can quite happily
just use the conventional meaning without causing any real problems.
People doing something more unusual will still be free to either create
an appropriate IPAddress instance or else create an IPNetwork subclass
that defines the broadcast property differently (e.g. making it the same
as the network address).
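
A rough sketch of the subclass approach, written against the ipaddress
module that ships with current Python rather than the ipaddr draft under
discussion. ZeroBroadcastNetwork and its broadcast property are
hypothetical names; the standard library calls its own attribute
broadcast_address, and the sketch deliberately leaves that one untouched
so that indexing keeps its usual meaning.

```python
import ipaddress


class ZeroBroadcastNetwork(ipaddress.IPv4Network):
    """Hypothetical subclass for a network using the 0 form of broadcast."""

    @property
    def broadcast(self):
        # Same as the network address, following the 4.2BSD convention.
        return self.network_address


net = ZeroBroadcastNetwork("192.0.2.0/24")

conventional = net[-1]       # the usual "all host bits set" address
customized = net.broadcast   # what this particular site has configured
```

Code that needs the site-specific value asks for it explicitly, while
everything else continues to see the conventional last address.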


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Wed Aug 19 23:24:12 2009
From: ncoghlan at (Nick Coghlan)
Date: Thu, 20 Aug 2009 07:24:12 +1000
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<h6gjku$t2n$>
	<>	<>	<>	<>	<>
Message-ID: <>

Glyph Lefkowitz wrote:
> -1.  I think 'network.number' or '' is a lot clearer than
> ''.  Maybe '.broadcast' would be okay, as long as it
> /can/ be adjusted for those unusual, or maybe even only hypothetical,
> networks where it is not the [-1].

Maybe this is something that differs by country, but I have *never*
heard the first address in an IP network (i.e. every bit not covered by
the netmask set to zero) referred to as anything other than the "network
address". Similarly, the last address (every bit not covered by the
netmask set to one) is referred to as the "broadcast address", even if
the relevant RFCs don't actually guarantee that.
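
The two conventional definitions above amount to a little bit arithmetic.
The sketch below uses the ipaddress module from current Python only to
convert between integers and dotted notation; the sample address and
prefix length are arbitrary values from a documentation range.

```python
import ipaddress

address = int(ipaddress.IPv4Address("192.0.2.77"))
prefixlen = 24

# Netmask: the top `prefixlen` bits set; hostmask: the remaining bits.
netmask = (0xFFFFFFFF << (32 - prefixlen)) & 0xFFFFFFFF
hostmask = netmask ^ 0xFFFFFFFF

# Network address: every bit not covered by the netmask set to zero.
network_address = ipaddress.IPv4Address(address & netmask)

# Broadcast address: every bit not covered by the netmask set to one.
broadcast_address = ipaddress.IPv4Address(address | hostmask)
```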

Anyone tasked to deal with a network that is sufficiently unusual to break
those two conventions is almost certainly going to have bigger problems
than the proposed meanings for and ipnet.broadcast not
giving the correct answer for their specific situation.

And if someone does need to deal with that, then they create an
appropriate subclass or use a less lightweight IP addressing library.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Wed Aug 19 23:44:17 2009
From: ncoghlan at (Nick Coghlan)
Date: Thu, 20 Aug 2009 07:44:17 +1000
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<h6gjku$t2n$>	<>	<>
Message-ID: <>

Peter Moody wrote:
> you can't set them directly, if that's what you mean.
>>>> import ipaddr
>>>> o = ipaddr.IPv4Network('')
>>>> o.broadcast
> IPv4Address('')
> IPv4Address('')
>>>> o.broadcast = ipaddr.IPv4Address('')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: can't set attribute
>>>> o.prefixlen = 25
>>>> o.broadcast
> IPv4Address('')

IPAddress instances should definitely be hashable, but as long as
"prefixlen" is mutable, IPNetwork instances should *not* be hashable,
since their value can change.

If prefixlen was made read only, then IPNetwork instances could also be
made hashable. In that case, changing the prefix length would then have
to be done by creating a new instance.
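
For comparison, this is how the read-only variant behaves in the
ipaddress module found in current Python, which may differ in detail from
the ipaddr draft discussed here. The addresses are arbitrary examples.

```python
import ipaddress

net = ipaddress.ip_network("192.0.2.0/24")

# prefixlen is a read-only property, so the hashed value cannot change.
try:
    net.prefixlen = 25
    mutated = True
except AttributeError:
    mutated = False

# Being immutable, the network is safe to use as a dict key.
routes = {net: "upstream"}

# A different prefix length means building a different object.
narrower = ipaddress.ip_network(f"{net.network_address}/25")
```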


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From peter at  Thu Aug 20 00:01:42 2009
From: peter at (Peter Moody)
Date: Wed, 19 Aug 2009 15:01:42 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Aug 19, 2009 at 2:44 PM, Nick Coghlan<ncoghlan at> wrote:
> Peter Moody wrote:
>> you can't set them directly, if that's what you mean.
>>>>> import ipaddr
>>>>> o = ipaddr.IPv4Network('')
>>>>> o.broadcast
>> IPv4Address('')
>> IPv4Address('')
>>>>> o.broadcast = ipaddr.IPv4Address('')
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> AttributeError: can't set attribute
>>>>> o.prefixlen = 25
>>>>> o.broadcast
>> IPv4Address('')
> IPAddress instances should definitely be hashable, but as long as
> "prefixlen" is mutable, IPNetwork instances should *not* be hashable,
> since their value can change.
> If prefixlen was made read only, then IPNetwork instances could also be
> made hashable. In that case, changing the prefix length would then have
> to be done by creating a new instance.

ah, I see. I'll make this fix and I think this might actually simplify
the code.

just to double check, it's fine for IPNetwork to remain hashable if
set_prefix() actually returned a new object, correct?

> Cheers,
> Nick.
> --
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia
> ---------------------------------------------------------------

From solipsis at  Thu Aug 20 00:05:24 2009
From: solipsis at (Antoine Pitrou)
Date: Wed, 19 Aug 2009 22:05:24 +0000 (UTC)
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
References: <>	<h6gjku$t2n$>
	<>	<>	<>	<>	<>
Message-ID: <>

Eric Smith <eric <at>> writes:
> Is there some existing library that uses .number or .zero? IPy uses .net 
> (and .broadcast for [-1]).

Why not be explicit? Use .first and .last.

From fdrake at  Thu Aug 20 00:10:56 2009
From: fdrake at (Fred Drake)
Date: Wed, 19 Aug 2009 18:10:56 -0400
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

On Aug 19, 2009, at 6:01 PM, Peter Moody wrote:
> just to double check, it's fine for IPNetwork to remain hashable if
> set_prefix() actually returned a new object, correct?

The name would be confusing, though.  Perhaps using_prefix() would be  
more clear.


Fred Drake   <fdrake at>

From eric at  Thu Aug 20 06:07:51 2009
From: eric at (Eric Smith)
Date: Thu, 20 Aug 2009 00:07:51 -0400
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<h6gjku$t2n$>	<>	<>	<>	<>	<>
Message-ID: <>

Fred Drake wrote:
> On Aug 19, 2009, at 6:01 PM, Peter Moody wrote:
>> just to double check, it's fine for IPNetwork to remain hashable if
>> set_prefix() actually returned a new object, correct?
> The name would be confusing, though.  Perhaps using_prefix() would be 
> more clear.

I think you'd be better off either doing this with an optional parameter 
to __init__, or a class method factory function (maybe from_prefix or 
similar). I don't see why it should be a method on an existing object.
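
One possible shape for such a factory, sketched on top of the ipaddress
module in current Python rather than ipaddr itself. The Network subclass
and the from_prefix name are hypothetical and follow the suggestion above.

```python
import ipaddress


class Network(ipaddress.IPv4Network):
    """Hypothetical subclass adding a class method factory."""

    @classmethod
    def from_prefix(cls, network, prefixlen):
        # strict=False masks off host bits that appear when the prefix
        # gets shorter, instead of raising ValueError.
        return cls(f"{network.network_address}/{prefixlen}", strict=False)


original = Network("192.0.2.128/25")
wider = Network.from_prefix(original, 24)   # a new object; original is untouched
```

Because the factory never modifies its argument, existing instances stay
immutable and can remain hashable.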

From peter at  Thu Aug 20 06:55:33 2009
From: peter at (Peter Moody)
Date: Wed, 19 Aug 2009 21:55:33 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Aug 19, 2009 at 9:07 PM, Eric Smith<eric at> wrote:
> Fred Drake wrote:
>> On Aug 19, 2009, at 6:01 PM, Peter Moody wrote:
>>> just to double check, it's fine for IPNetwork to remain hashable if
>>> set_prefix() actually returned a new object, correct?
>> The name would be confusing, though.  Perhaps using_prefix() would be more
>> clear.
> I think you'd be better off either doing this with an optional parameter to
> __init__, or a class method factory function (maybe from_prefix or similar).
> I don't see why it should be a method on an existing object.

while not the prettiest, you can already (ignoring the set_prefix)
do something like:

>>> newobject = ipaddr.IP(str( + "/new prefix")

Is this sufficient?

From doko at  Thu Aug 20 11:47:03 2009
From: doko at (Matthias Klose)
Date: Thu, 20 Aug 2009 11:47:03 +0200
Subject: [Python-Dev] [Distutils] request for comments - standardization
 of python's purelib and platlib
In-Reply-To: <>
References: <h61lmi$m3v$>	<>
Message-ID: <>

On 14.08.2009 10:02, Tarek Ziadé wrote:
> On Thu, Aug 13, 2009 at 9:22 PM, Brett Cannon<brett at>  wrote:
>> On Thu, Aug 13, 2009 at 11:23, Jan Matejek<jan.matejek at>  wrote:
>>> Hello,
>>> I'm cross-posting this to distributions at freedesktop and python-dev,
>>> because the topic is relevant to both groups and should be solved in
>>> cooperation.
>>> The issue:
>>> In Python's default configuration (on linux), both purelib (location for
>>> pure python modules) and platlib (location for platform-dependent binary
>>> extensions) point to $prefix/lib/pythonX.Y/site-packages.
>>> That is no good for two main reasons.
>>> One, python depends on the "lib" directory. (from distro's point of
>>> view, prefix is /usr, so let's talk /usr/lib) Due to this, it's
>>> impossible to install python under /usr/lib64 without heavy patching.
>>> Repeated attempts to bring python developers to acknowledge and rectify
>>> the situation have all failed (common argument here is "that would mean
>>> redesign of distutils and huge parts of whatnot").
>> This is now Tarek's call, so this may or may not have changed in terms of
>> what the (now) distutils maintainer thinks.
> I don't recall those repeated attempts , but I've been around for less
> than two years.
> You are very welcome to come in the Distutils-SIG ML to discuss these matters.
> I'm moving the discussion there.
> Among the proposals you have detailed, the sharedir way seems like the
> most simple/interesting
> one (depending on your answer to Brett's question)

The approach of splitting the installation into two different locations seems to 
be wrong, it changes the semantics for imports of python packages which are not 
installed in the same location. Simplest counter example is the use of relative 
imports, which will fail if the imported module/extension is not found in the 
same paths.

Other languages like Perl or Java don't have relative imports, or they map all 
components on the "path" into one logical path so you don't have this kind of 

I don't see an explicit statement that code really has to live inside /usr/share, 
and even generated .py files differ depending on the architecture you build for 
(e.g. sip, qt bindings).


From fwierzbicki at  Thu Aug 20 15:20:49 2009
From: fwierzbicki at (Frank Wierzbicki)
Date: Thu, 20 Aug 2009 09:20:49 -0400
Subject: [Python-Dev] Two laments about CPython's AST Nodes
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Aug 19, 2009 at 5:00 PM, "Martin v. Löwis"<martin at> wrote:
>> Now on to the complaints: Though I recently added support for this in
>> Jython, I don't like that nodes can be defined without required
>> attributes, for example:
>> node = ast.Assign()
> I think we disagree in two points in our evaluation of this behavior:
> a) ast.Assign is *not* a node of the CPython AST. The CPython AST is
>    a set of C structures in Include/Python-ast.h. ast.Assign is merely
>    a mirror structure of that.
Ah -- that is different from Jython's current design (The nodes that
are constructed by ast.Assign() in Jython actually are the exact nodes
that are used in real parsing)

> b) it is, IMO, not reasonable to require users who create AST trees
>    out of nothing to have correct trees at all times. I.e. it must be
>    possible to represent incorrect trees.
That does seem reasonable.  Luckily it was easy to implement for me :)

> c) the AST is *not* part of the Python language or library. It may
>   change at any time without notice, and Jython is not required to
>   duplicate its behavior exactly.
Sure, I'm really just talking about (though for Jython ATM they
are the same thing).  So that I understand better: when this call is
made in Python:

x = compile("def foo():pass", "foo", "exec", _ast.PyCF_ONLY_AST)

Does x contain real AST nodes or does it contain mirror structures
(feel free to just tell me: don't be lazy, go read the code). If it
contains real nodes, this is where I have some implementation trouble.
 If a tree of real nodes is then manipulated so that you end up with a
mix, I don't want to walk the entire thing over again to find the
mirror objects (that might be incomplete) and replace them with real
nodes.  If this creates a tree of mirror nodes, then I may want to
consider doing the same thing on the Jython side (it makes sense, now
that I understand CPython better I realize that the cost I am
incurring is probably due to having the real and mirror AST as the
same beast).

> [so that's three items - as there should be in any good list of
> two items :-]

> What precisely is it that makes this difficult to implement. If you
> would follow CPython's implementation strategy (i.e. generate glue
> code out of ASDL), I feel that it should be straight-forward to provide
> exactly the same behavior in Jython.
I do use the ASDL to generate this stuff, but again, the real and the
mirror nodes are not separated ATM, and that is what makes it
difficult.

>> BTW -- I *am* volunteering to attempt to implement these things in
>> CPython if there is support :)
> I'm not sure I can support such a change. Giving the child nodes at
> creation time, optionally, would be fine with me. Requiring the
> tree to conform to the grammar at all times is unreasonable, IMO
> (you can't simultaneously create all nodes in the tree and glue
> them together, so you have to create the tree in steps - which means
> that it will be intermittently incorrect).
That is quite reasonable, I'll withdraw gripe #1 -- in fact the reason
I have already implemented this in Jython is that there is already
real world use out there.  I still need to understand gripe #2 a
little better before I back down on that one.
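
The point about building a tree in steps can be illustrated with a minimal
sketch against the present-day CPython ast module (not the 2009 code under
discussion). The variable names and the "<sketch>" filename are made up for
illustration; the empty node is rejected by compile() until its fields are
filled in.

```python
import ast
import warnings


def try_compile(module):
    """Return a code object, or None if the tree is rejected."""
    try:
        return compile(module, "<sketch>", "exec")
    except (TypeError, ValueError):
        return None


# Newer Python versions warn when required fields are omitted, so the
# warning is silenced here; the node is still created.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    node = ast.Assign()

module = ast.Module(body=[node], type_ignores=[])
ast.fix_missing_locations(module)
rejected_while_incomplete = try_compile(module) is None

# Glue the children on afterwards, then compile the now-complete tree.
node.targets = [ast.Name(id="x", ctx=ast.Store())]
node.value = ast.Constant(value=42)
ast.fix_missing_locations(module)

namespace = {}
exec(try_compile(module), namespace)
```

The tree is only checked when it is handed to compile(), so it may be
incorrect at any earlier moment.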


From ncoghlan at  Thu Aug 20 15:39:17 2009
From: ncoghlan at (Nick Coghlan)
Date: Thu, 20 Aug 2009 23:39:17 +1000
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<h6gjku$t2n$>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Peter Moody wrote:
> while not the prettiest, you can already (ignoring the set_prefix)
> do something like:
>>>> newobject = ipaddr.IP(str( + "/new prefix")
> Is this sufficient?

At this point, that is probably fine. If it comes up often enough to be
worth providing a cleaner interface then it is easier to add that later.
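
For comparison, the same "rebuild from a string with a new prefix" idea can
be sketched with the ipaddress module that later shipped in the standard
library. This is not the ipaddr API being reviewed in this thread, and the
192.0.2.0 documentation network is an assumed example value.

```python
import ipaddress

net = ipaddress.ip_network("192.0.2.0/24")

# Build a new object from the old network address plus the new prefix.
# strict=False tolerates host bits that the shorter prefix would leave set.
new_prefix = 23
wider = ipaddress.ip_network(f"{net.network_address}/{new_prefix}", strict=False)
```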


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From asmodai at  Thu Aug 20 15:46:51 2009
From: asmodai at (Jeroen Ruigrok van der Werven)
Date: Thu, 20 Aug 2009 15:46:51 +0200
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

-On [20090818 22:15], Peter Moody (peter at wrote:
>I have a first draft of a PEP for including an IP address manipulation
>library in the python stdlib. It seems like there are a lot of really
>smart folks with some, ahem, strong ideas about what an IP address
>module should and shouldn't be so I wanted to solicit your input on
>this pep.
>the pep can be found here:

No chance at the moment to test/look through the code, so please excuse any
obvious ones, I'm basing my comments on the PEP.

Some elaboration on handling ipv4 mapped addresses would be nice, e.g.
::ffff:c000:280 and/or ::ffff:

Some IPv6 examples would also help the PEP I think. Especially on how 0
compression is handled in addresses.
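
As a hedged illustration of the two requests above (IPv4-mapped addresses
and zero compression), here is how the present-day standard library
ipaddress module behaves. It is not the 2009 ipaddr code, and the 2001:db8::
address is an assumed documentation example.

```python
import ipaddress

# IPv4-mapped IPv6 address: the low 32 bits carry the IPv4 address.
mapped = ipaddress.IPv6Address("::ffff:c000:280")
embedded_v4 = mapped.ipv4_mapped

# Zero compression: the longest run of zero groups collapses to "::".
full = ipaddress.IPv6Address("2001:0db8:0000:0000:0000:0000:0000:0001")
short_form = full.compressed
long_form = full.exploded
```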

Maybe show ipv4 examples on non-class boundaries, e.g. /23 instead of /24,
so people are more convinced it handles CIDR properly.

Clarification on whether this library will support converting a sequence of
networks into another sequence where the networks which comprise consecutive
netblocks will be collapsed in a new entry. E.g. 2 /24s that are neighbours
will be represented as one /23.
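
The collapsing behaviour asked about here can be sketched with the
standard library ipaddress module of later Python versions (again, not the
ipaddr code under review). The two neighbouring /24 networks are assumed
example values.

```python
import ipaddress

neighbours = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("192.0.3.0/24"),
]

# collapse_addresses() merges adjacent or overlapping networks.
collapsed = list(ipaddress.collapse_addresses(neighbours))
```

The result is a single /23, which also shows a prefix that does not sit on
a classful boundary.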

I realise some might be answered by the last paragraph of your PEP, but it
would be nice to know what you consider essential and what not.

Jeroen Ruigrok van der Werven <asmodai(-at-)> / asmodai
GPG: 2EAC625B
They have learned nothing, and forgotten nothing...

From p.f.moore at  Thu Aug 20 18:25:52 2009
From: p.f.moore at (Paul Moore)
Date: Thu, 20 Aug 2009 17:25:52 +0100
Subject: [Python-Dev] Microsoft MSDN
In-Reply-To: <>
References: <> <>
Message-ID: <>

2009/8/13 Christian Heimes <lists at>:
> Steve Holden wrote:
>> I sent fourteen requests for licenses in to Microsoft. I've asked them
>> to let me know which they grant (since they may choose to limit the
>> number) and will inform you all personally when I hear their decision.
> I've received my MSDN subscription today. Everybody watch out for a message
> from MSDN! I almost confused the email with spam.
> Thanks for your work and please forward my gratitude to James Rice.

I've received mine, too, and my thanks to all as well.


From p.f.moore at  Thu Aug 20 18:35:18 2009
From: p.f.moore at (Paul Moore)
Date: Thu, 20 Aug 2009 17:35:18 +0100
Subject: [Python-Dev] standard library mimetypes module pathologically
In-Reply-To: <>
References: <>
	<> <>
	<h614a3$r92$> <>
Message-ID: <>

2009/8/14 Nick Coghlan <ncoghlan at>:
> Georg Brandl wrote:
>> Nick Coghlan schrieb:
>>> P.S. For anyone else that is slow like me, take a close look at PEP 387...
>> What should we see, other than that we have two PEPs on the same topic that
>> should be merged?
> Benjamin wrote the second one, so he obviously knows there's a written
> deprecation policy in place, and hence his mini-rant probably wasn't
> meant to be taken literally - a point I completely missed on first reading.

IIRC, the point is probably the fact that PEP 387 has status "Draft"
rather than "Accepted" - Benjamin proposed the PEP, met with a fair
bit of discussion but no consensus, and everything fizzled out before
a conclusion was reached.

> I agree the two PEPs should probably be consolidated into one, but
> absent a volunteer for that task, leaving them as is doesn't really hurt
> anything.

Agreed, if both were accepted...


PS I personally have no firm opinion on PEP 387. The idea seems good,
but I don't feel qualified to say whether the proposed approach is

From peter at  Thu Aug 20 20:11:00 2009
From: peter at (Peter Moody)
Date: Thu, 20 Aug 2009 11:11:00 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Aug 20, 2009 at 6:46 AM, Jeroen Ruigrok van der
Werven<asmodai at> wrote:
> -On [20090818 22:15], Peter Moody (peter at wrote:
>>I have a first draft of a PEP for including an IP address manipulation
>>library in the python stdlib. It seems like there are a lot of really
>>smart folks with some, ahem, strong ideas about what an IP address
>>module should and shouldn't be so I wanted to solicit your input on
>>this pep.
>>the pep can be found here:
> No chance at the moment to test/look through the code, so please excuse any
> obvious ones, I'm basing my comments on the PEP.
> Some elaboration on handling ipv4 mapped addresses would be nice, e.g.
> ::ffff:c000:280 and/or ::ffff:
> Some IPv6 examples would also help the PEP I think. Especially on how 0
> compression is handled in addresses.
> Maybe show ipv4 examples on non-class boundaries, e.g. /23 instead of /24,
> so people are more convinced it handles CIDR properly.
> Clarification on whether this library will support converting a sequence of
> networks into another sequence where the networks which comprise consecutive
> netblocks will be collapsed in a new entry. E.g. 2 /24s that are neighbours
> will be represented as one /23.
> I realise some might be answered by the last paragraph of your PEP, but it
> would be nice to know what you consider essential and what not.

I've updated the pep with lots of examples; most of the stuff you're
asking for is already supported, I just didn't do a good job
explaining it. A few things are pending review.



From brett at  Thu Aug 20 22:27:31 2009
From: brett at (Brett Cannon)
Date: Thu, 20 Aug 2009 13:27:31 -0700
Subject: [Python-Dev] Two laments about CPython's AST Nodes
In-Reply-To: <>
References: <> 
Message-ID: <>

On Thu, Aug 20, 2009 at 06:20, Frank Wierzbicki<fwierzbicki at> wrote:
> On Wed, Aug 19, 2009 at 5:00 PM, "Martin v. Löwis"<martin at> wrote:
>>> Now on to the complaints: Though I recently added support for this in
>>> Jython, I don't like that nodes can be defined without required
>>> attributes, for example:
>>> node = ast.Assign()
>> I think we disagree in two points in our evaluation of this behavior:
>> a) ast.Assign is *not* a node of the CPython AST. The CPython AST is
>>   a set of C structures in Include/Python-ast.h. ast.Assign is merely
>>   a mirror structure of that.
> Ah -- that is different from Jython's current design (The nodes that
> are constructed by ast.Assign() in Jython actually are the exact nodes
> that are used in real parsing)
>> b) it is, IMO, not reasonable to require users who create AST trees
>>   out of nothing to have correct trees at all times. I.e. it must be
>>   possible to represent incorrect trees.
> That does seem reasonable.  Luckily it was easy to implement for me :)
>> c) the AST is *not* part of the Python language or library. It may
>>   change at any time without notice, and Jython is not required to
>>   duplicate its behavior exactly.
> Sure, I'm really just talking about (though for Jython ATM they
> are the same thing).  So that I understand better: when this call is
> made in Python:
> x = compile("def foo():pass", "foo", "exec", _ast.PyCF_ONLY_AST)
> Does x contain real AST nodes or does it contain mirror structures
> (feel free to just tell me: don't be lazy, go read the code). If it
> contains real nodes, this is where I have some implementation trouble.
>  If a tree of real nodes is then manipulated so that you end up with a
> mix, I don't want to walk the entire thing over again to find the
> mirror objects (that might be incomplete) and replace them with real
> nodes.  If this creates a tree of mirror nodes, then I may want to
> consider doing the same thing on the Jython side (it makes sense, now
> that I understand CPython better I realize that the cost I am
> incurring is probably due to having the real and mirror AST as the
> same beast).
I am using PEP 339 to help me with this.

Looking at Python/bltinmodule.c:builtin_compile() you will notice that
when you take an AST and compile it the call goes to
Python/Python-ast.c:PyAST_obj2mod(). That function calls obj2ast_mod()
which turns an ast.Module object into a mod_ty (defined in
Python/Python-ast.c). The reverse is done with ast2obj_mod() when you
get the AST out. And looking at those conversion functions it seems
that the PyObject values convert as needed or reuse constants like
ints and strings (see obj2ast_unaryop() to see a conversion from
object AST to internal AST).

But I would double-check me on all of this. =)


>> [so that's three items - as there should be in any good list of
>> two items :-]
> :)
>> What precisely is it that makes this difficult to implement. If you
>> would follow CPython's implementation strategy (i.e. generate glue
>> code out of ASDL), I feel that it should be straight-forward to provide
>> exactly the same behavior in Jython.
> I do use the ASDL to generate this stuff, but again, the real and the
> mirror nodes are not separated ATM, and that is what makes it
> difficult.
>>> BTW -- I *am* volunteering to attempt to implement these things in
>>> CPython if there is support :)
>> I'm not sure I can support such a change. Giving the child nodes at
>> creation time, optionally, would be fine with me. Requiring the
>> tree to conform to the grammar at all times is unreasonable, IMO
>> (you can't simultaneously create all nodes in the tree and glue
>> them together, so you have to create the tree in steps - which means
>> that it will be intermittently incorrect).
> That is quite reasonable, I'll withdraw gripe #1 -- in fact the reason
> I have already implemented this in Jython is that there is already
> real world use out there.  I still need to understand gripe #2 a
> little better before I back down on that one.
> -Frank

From peter at  Thu Aug 20 23:00:06 2009
From: peter at (Peter Moody)
Date: Thu, 20 Aug 2009 14:00:06 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

The pep has been updated with the excellent suggestions thus far.

Are there any more?


On Tue, Aug 18, 2009 at 1:00 PM, Peter Moody<peter at> wrote:
> Howdy folks,
> I have a first draft of a PEP for including an IP address manipulation
> library in the python stdlib. It seems like there are a lot of really
> smart folks with some, ahem, strong ideas about what an IP address
> module should and shouldn't be so I wanted to solicit your input on
> this pep.
> the pep can be found here:
> the code can be found here:
> Please let me know if you have any comments (some already coming :)
> Cheers,
> /peter

From jjb5 at  Thu Aug 20 22:34:46 2009
From: jjb5 at (Joel Bender)
Date: Thu, 20 Aug 2009 16:34:46 -0400
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<h6gjku$t2n$>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Nick Coghlan wrote:

> Maybe this is something that differs by country, but I have *never*
> heard the first address in an IP network (i.e. every bit not covered by
> the netmask set to zero) referred to as anything other than the "network
> address".

Ah!  A chance to interject a mostly pointless comment...

Prior to IEN-212 [1] it wasn't standardized, the 'zero' was used and 
supported by the Berkeley socket library.  This was a number of years 
ago, however (!), and I dare say the sample code is lost to antiquity.

> And if someone does need to deal with that, then they create an
> appropriate subclass or use a less lightweight IP addressing library.


[1] <>

From martin at  Fri Aug 21 00:11:03 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 21 Aug 2009 00:11:03 +0200
Subject: [Python-Dev] Two laments about CPython's AST Nodes
In-Reply-To: <>
References: <>	
Message-ID: <>

> x = compile("def foo():pass", "foo", "exec", _ast.PyCF_ONLY_AST)
> Does x contain real AST nodes or does it contain mirror structures
> (feel free to just tell me: don't be lazy, go read the code).

It only contains a mirror structure. See
pythonrun.c:Py_CompileStringFlags, and the (generated) PyAST_mod2obj
function. There is no way for a Python script to get hold of the
real AST.
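
A short sketch of what that means in practice, using the ast module (which
re-exports PyCF_ONLY_AST): the value returned by compile() is a tree of
ordinary Python objects that a script can inspect, modify and hand back.
The rename to "bar" is only for illustration.

```python
import ast

tree = compile("def foo():pass", "foo", "exec", ast.PyCF_ONLY_AST)

# The result is a tree of plain Python objects, not the internal C structs.
is_python_object = isinstance(tree, ast.Module) and isinstance(
    tree.body[0], ast.FunctionDef
)

# Because it is a copy, it can be edited and converted back on compile().
tree.body[0].name = "bar"
namespace = {}
exec(compile(tree, "foo", "exec"), namespace)
```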

> I do use the ASDL to generate this stuff, but again, the real and the
> mirror nodes are not separated ATM, and that is what makes it
> difficult.

Couldn't you just generate a check function for your tree that
would be invoked before you try to process a tree that a
script got access to?

> That is quite reasonable, I'll withdraw gripe #1 -- in fact the reason
> I have already implemented this in Jython is that there is already
> real world use out there.  I still need to understand gripe #2 a
> little better before I back down on that one.

If you are asking that a type check is made on assigning a value to
these fields - I'm not quite sure whether you could implement that
check reliably. Wouldn't it be possible to bypass it by filling a
value directly into __dict__?
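
The bypass can be shown with a pure-Python stand-in. CheckedNode is a
hypothetical class invented for this sketch, not one of the real ast node
types; it type-checks assignments in __setattr__, and writing to __dict__
goes around that check.

```python
class CheckedNode:
    """Hypothetical node type that type-checks assignments to `value`."""

    def __setattr__(self, name, new_value):
        if name == "value" and not isinstance(new_value, CheckedNode):
            raise TypeError("value must be a node")
        object.__setattr__(self, name, new_value)


node = CheckedNode()

# The normal attribute path is guarded by the check...
try:
    node.value = "not a node"
    rejected = False
except TypeError:
    rejected = True

# ...but storing directly in the instance dictionary skips __setattr__.
node.__dict__["value"] = "not a node"
bypassed_value = node.value
```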

If you can come up with a patch that checks in a reliable manner,
I would be in favor of adding that (in 2.7 and 3.2), taking out
the corresponding checks when converting to the internal AST.


From casevh at  Fri Aug 21 07:15:59 2009
From: casevh at (Case Vanhorsen)
Date: Thu, 20 Aug 2009 22:15:59 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

>On Thu, Aug 20, 2009 at 2:00 PM, Peter Moody<peter at> wrote:
> The pep has been updated with the excellent suggestions thus far.
> Are there any more?

Thanks for writing the PEP.

I tried a few of the common scenarios that I use at work. Disclaimer:
my comments are based on my work environment.

I was surprised that IP('') returned
IPv4Address('') instead of IPv4Address(''). I
know I can change the behavior by using host=True, but then
IP('', host=True) will raise an error. It makes more
sense, at least to me, that if I input just an IP address, I get an IP
address back. I would prefer that IP('') return an
IPv4Network and IP('') return an IPv4Address.

Would it be possible to provide an iterator that returns just the
valid host IP addresses (minus the network and broadcast addresses)?
"for i in IPv4Network('')" and "for i in
IPv4Network('').iterhosts()" both return all 16 IP
addresses. I normally describe as consisting of one
network address, 14 host addresses, and one broadcast address. I would
prefer that "for i in IPv4Network('')" return all IP
addresses and that "for i in
IPv4Network('').iterhosts()" exclude the network and
broadcast addresses. I think creating a list of IP addresses that can
be assigned to devices on a network is a common task.
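
The split requested here matches how the ipaddress module in later versions
of the standard library behaves, sketched below. This is not the ipaddr API
from the PEP draft, and 192.0.2.0/28 is an assumed example network, since
the addresses in the message above did not survive.

```python
import ipaddress

net = ipaddress.ip_network("192.0.2.0/28")

# Plain iteration yields every address, including network and broadcast.
all_addresses = list(net)

# hosts() leaves out the network and broadcast addresses.
assignable = list(net.hosts())
```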

Can .subnet() be enhanced to accept masks? For example,
IPv4Network('').subnet('/19') would return the eight /19
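
A sketch of subnetting by target prefix, using the present-day standard
library ipaddress module rather than the .subnet() method discussed here.
The 10.0.0.0/16 network is an assumed example, chosen because a /16 splits
into exactly eight /19 networks.

```python
import ipaddress

parent = ipaddress.ip_network("10.0.0.0/16")

# subnets(new_prefix=...) yields every subnet of the requested size.
nineteens = list(parent.subnets(new_prefix=19))
```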

What about supporting multiple parameters to subnet? I frequently need
to create complex subnet layouts. The following subnet layout is NOT
made up!

A possible syntax would be:

Note: I am willing to provide patches to implement my suggestions. I
just won't have much time over the next couple weeks.



From skippy.hammond at  Fri Aug 21 09:16:18 2009
From: skippy.hammond at (Mark Hammond)
Date: Fri, 21 Aug 2009 17:16:18 +1000
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

[Adjusted the CCs...]

On 19/08/2009 8:21 AM, Dj Gilcrease wrote:
> On Tue, Aug 18, 2009 at 2:12 AM, "Martin v. Löwis"<martin at>  wrote:
>> The second item is line conversion hooks. Dj Gilcrease has posted a
>> solution which he considers a hack himself. Mark Hammond has also
>> volunteered, but it seems some volunteer needs to be "in charge",
>> keeping track of a proposed solution until everybody agrees that it
>> is a good solution. It may be that two solutions are necessary: a
>> short-term one, that operates as a hook and has limitations, and
>> a long-term one, that improves the hook system of Mercurial to
>> implement the proper functionality (which then might get shipped
>> with Mercurial in a cross-platform manner).
> My solution is a hack because the hooks in Mercurial need to be
> modified to support it properly, I would be happy to help work on this
> as it is a situation I run into all the time in my own projects. I can
> never seem to get all the developers to enable the hooks, and one of
> them always commits with improper line endings =P

Maybe you can enumerate what you think needs to change in mercurial, 
then once we have a plan in place it will be clearer who can do what.

I'm resurrecting my patch to support a filter called 'none' (which is 
turning out to be harder than I thought).  Off the top of my head, the 
following would give us a pretty solid solution:

* Finish my patch for 'none' as a filter, so '**=cleverencode' can be 
reasonably used (currently you can't specify that specific files *not* have 
cleverencode, making it unsuitable in practice without the concept of 
'none')

* Add support for versioned 'filter rules' - eg, /.hgfilters or similar.

* This might be pushing my luck, but: add 'defensive' support to core hg 
for this feature - if /.hgfilters exists, hg should refuse to operate on 
the working tree unless the win32text extension is enabled.

Note that this last point still leaves win32text optional for hg itself 
- but if the owner of a repository has explicitly 'opted in' for 
win32text support, hg can still assist in refusing to screw the tree. 
The hg user has the option of enabling that extension, declining to use 
that repository, or arguing with the owner of the repo about use of the 
feature in the first place.

Is there something I'm missing?  Or maybe a better way to have hg 
enforce a repository's policy while not inflicting pain on hg users who 
don't want to ever think about windows?



From dirkjan at  Fri Aug 21 09:48:03 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Fri, 21 Aug 2009 09:48:03 +0200
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Aug 21, 2009 at 09:16, Mark Hammond<skippy.hammond at> wrote:
> I'm resurrecting my patch to support a filter called 'none' (which is
> turning out to be harder than I thought).  Off the top of my head, the
> following would give us a pretty solid solution:
> * Finish my patch for 'none' as a filter, so '**=cleverencode' can be
> reasonably used (currently you can't specify specific files *not* have
> cleverencode, making it unsuitable in practice without the concept of
> 'none')
> * Add support for versioned 'filter rules' - eg, /.hgfilters or similar.
> * This might be pushing my luck, but: add 'defensive' support to core hg for
> this feature - if /.hgfilters exists, hg should refuse to operate on the
> working tree unless the win32text extension is enabled.

Sounds great to me. The latter might indeed be hard to get into the
core, but seems like a good idea to try.



From stephen at  Fri Aug 21 10:50:00 2009
From: stephen at (Stephen J. Turnbull)
Date: Fri, 21 Aug 2009 17:50:00 +0900
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

Mark Hammond writes:

 > * Add support for versioned 'filter rules' - eg, /.hgfilters or similar.
 > * This might be pushing my luck, but: add 'defensive' support to core hg 
 > for this feature - if /.hgfilters exists, hg should refuse to operate on 
 > the working tree unless the win32text extension is enabled.

The name ".hgfilters" should be changed, then.  That's way too generic
to be used to "enforce" something as specific as win32text.  I can
imagine all kinds of things wanting to use rules or filters.

How about a scheme where an extension reserves a filter file for
itself in .hgfilters?  In this case the win32text filters would live
in .hgfilters/win32text, and if that file exists hg checks that
the corresponding extension has been enabled, and if not, refuses to
run (and tells you that if you really want to override, you rename the
file to win32text.disabled and commit).
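
The check described above could look roughly like the following sketch. It
is plain Python and does not use any Mercurial API; the function name
missing_extensions and its parameters are hypothetical, and the set of
enabled extensions is simply passed in by the caller.

```python
import os
import tempfile


def missing_extensions(repo_root, enabled, rules_dir=".hgfilters"):
    """List extensions that have a rule file but are not enabled."""
    path = os.path.join(repo_root, rules_dir)
    if not os.path.isdir(path):
        return []
    # A file renamed to <name>.disabled is the explicit override.
    required = [
        name for name in sorted(os.listdir(path))
        if not name.endswith(".disabled")
    ]
    return [name for name in required if name not in enabled]


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    rules = os.path.join(root, ".hgfilters")
    os.makedirs(rules)
    open(os.path.join(rules, "win32text"), "w").close()
    refuse_without_extension = missing_extensions(root, set())
    fine_with_extension = missing_extensions(root, {"win32text"})
    os.rename(
        os.path.join(rules, "win32text"),
        os.path.join(rules, "win32text.disabled"),
    )
    fine_when_disabled = missing_extensions(root, set())
```

A non-empty result is the case in which hg would refuse to run and name the
extension that has to be enabled.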

Note that Bazaar is currently discussing some similar policies.  I
think the name they have settled on is ".bzrrules".  Maybe .hgrules is
a better name.

Q: What are those straight lines?
A: "XEmacs rules."

From ncoghlan at  Fri Aug 21 11:57:38 2009
From: ncoghlan at (Nick Coghlan)
Date: Fri, 21 Aug 2009 19:57:38 +1000
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

Stephen J. Turnbull wrote:
> Note that Bazaar is currently discussing some similar policies.  I
> think the name they have settled on is ".bzrrules".  Maybe .hgrules is
> a better name.

So it would be .hgrules/<extensionname>? With the extension then
defining the contents of the rule file?

An alternative would be to go one level deeper and have:

.hgrules/required/<extensionname>
.hgrules/optional/<extensionname>

If an extension rule file appeared in the first subdirectory then hg
would refuse to operate on the repository without that extension being

I guess something like that might be nice to have, but the support for
negative filtering and versioned rule definitions is all we really need
from a python-dev point of view.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Fri Aug 21 12:02:19 2009
From: ncoghlan at (Nick Coghlan)
Date: Fri, 21 Aug 2009 20:02:19 +1000
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<h6gjku$t2n$>	<>	<>	<>	<>	<>	<>
	<> <>
Message-ID: <>

Joel Bender wrote:
> Nick Coghlan wrote:
>> Maybe this is something that differs by country, but I have *never*
>> heard the first address in an IP network (i.e. every bit not covered by
>> the netmask set to zero) referred to as anything other than the "network
>> address".
> Ah!  A chance to interject a mostly pointless comment...
> Prior to IEN-212 [1] it wasn't standardized, the 'zero' was used and
> supported by the Berkeley socket library.  This was a number of years
> ago, however (!), and I dare say the sample code is lost to antiquity.

Ah, that would be me showing my (lack of) age then :)

I was still six years or so away from getting my first computer and more
than 15 years away from any formal networking training when that note
was published...


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From phd at  Fri Aug 21 13:00:02 2009
From: phd at (Oleg Broytmann)
Date: Fri, 21 Aug 2009 15:00:02 +0400
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for
	the	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

>     _compat_has_real_bytes = bytes != str

   Wouldn't "bytes is not str" be nicer?

     Oleg Broytmann              phd at
           Programmers don't die, they just GOSUB without RETURN.

From asmodai at  Fri Aug 21 13:54:21 2009
From: asmodai at (Jeroen Ruigrok van der Werven)
Date: Fri, 21 Aug 2009 13:54:21 +0200
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

-On [20090820 20:19], Peter Moody (peter at wrote:
>I've updated the pep with lots of examples; most of the stuff you're
>asking for is already supported, I just didn't do a good job
>explaining it. A few things are pending review.

Thanks for that Peter!

Jeroen Ruigrok van der Werven <asmodai(-at-)> / asmodai
GPG: 2EAC625B
Earth to earth, ashes to ashes, dust to dust...

From stephen at  Fri Aug 21 15:00:02 2009
From: stephen at (Stephen J. Turnbull)
Date: Fri, 21 Aug 2009 22:00:02 +0900
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

Nick Coghlan writes:
 > Stephen J. Turnbull wrote:
 > > Note that Bazaar is currently discussing some similar policies.  I
 > > think the name they have settled on is ".bzrrules".  Maybe .hgrules is
 > > a better name.
 > So it would be .hgrules/<extensionname>? With the extension then
 > defining the contents of the rule file?


 > An alternative would be to go one level deeper and have:
 > .hgrules/required/<extensionname>
 > .hgrules/optional/<extensionname>

I thought briefly about that kind of thing.  However, this way would
require deciding the semantics of the subdirectories, and while
"optional" vs "required" is pretty appealing, how about "required"
vs. "requisite"?  (As Dave Barry would say, "I am *still* not
kidding."  See:

Of course anything related to Python would do a better job of
naming<wink>, but such semantic fine points might very well be
important.  And yes, there are people who take their VCS as seriously
as they take authenticating as root.)

So what I thought was that extensions would provide a policy function,
which would make such judgments when called.  But then I realized I
had no clue what the semantics should be, so I didn't mention it.

From fwierzbicki at  Fri Aug 21 15:46:35 2009
From: fwierzbicki at (Frank Wierzbicki)
Date: Fri, 21 Aug 2009 09:46:35 -0400
Subject: [Python-Dev] Two laments about CPython's AST Nodes
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Aug 20, 2009 at 6:11 PM, "Martin v. Löwis"<martin at> wrote:
> Couldn't you just generate a check function for your tree that
> would be invoked before you try to process a tree that a
> script got access to?
That would be one way, though now that I understand CPython's AST
design better, I am tempted to follow the lead.  If I had a private
AST and a public mirror for, the design could become much
simpler, and probably faster for normal parsing.

> If you are asking that a type check is made on assigning a value to
> these fields - I'm not quite sure whether you could implement that
> check reliably. Wouldn't it be possible to bypass it by filling a
> value directly into __dict__?
> If you can come up with a patch that checks in a reliable manner,
> I would be in favor of adding that (in 2.7 and 3.2), taking out
> the corresponding checks when converting to the internal AST.
Great, I may give it a try, but changing the AST impl  for Jython 2.6
will probably be my short term answer.


From digitalxero at  Fri Aug 21 16:10:55 2009
From: digitalxero at (Dj Gilcrease)
Date: Fri, 21 Aug 2009 08:10:55 -0600
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Aug 21, 2009 at 1:16 AM, Mark Hammond<skippy.hammond at> wrote:
> Maybe you can enumerate what you think needs to change in mercurial, then
> once we have a plan in place it will be clearer who can do what.

The encode/decode hooks need to be passed the name of the file they are
working on so that you can have an ignore list. This is why I consider my
method a hack: I am using a precommit hook to do the conversion, because
there I can find out which file I am working on and make sure it is
not in an ignore list. There also needs to be a way to have required
and version controlled extensions.
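
To make the idea concrete, here is a minimal sketch of a filename-aware
line-ending filter. It is plain Python, not Mercurial's actual hook or
filter interface; the function name encode, its signature and the ignore
patterns are all assumptions made for illustration.

```python
import fnmatch

# Hypothetical ignore list: files whose line endings must not be touched.
IGNORE_PATTERNS = ["*.png", "*.bat"]


def encode(data, filename, ignore=IGNORE_PATTERNS):
    """Convert CRLF to LF unless the filename matches the ignore list."""
    if any(fnmatch.fnmatch(filename, pattern) for pattern in ignore):
        return data
    return data.replace(b"\r\n", b"\n")


converted = encode(b"a\r\nb\r\n", "spam.py")
untouched = encode(b"echo hi\r\n", "run.bat")
```

Without the filename argument the filter has no way to tell these two
cases apart, which is the limitation described above.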

This weekend I plan on digging into Mercurial's hook code and doing up
a patch so the encode/decode hooks accept the filename they are
working on in a backwards compatible way.

 > An alternative would be to go one level deeper and have:
 > .hgrules/required/<extensionname>
 > .hgrules/optional/<extensionname>

I like this, though maybe .hgextensions since it would contain
versioned rules and the actual required extension. The extra sub
directories are not really required IMHO, you just have a hgrc file
that works the same as the local hgrc file except it only looks in the
.hgextensions directory for the correct extension so for python we
could have something like

[extensions]
format_enforcer =

[hooks]
pretxncommit.crlf = python:format_enforcer.forbidcrlf = python:format_enforcer.forbidcr

From dirkjan at  Fri Aug 21 16:19:50 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Fri, 21 Aug 2009 16:19:50 +0200
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Aug 21, 2009 at 16:10, Dj Gilcrease<digitalxero at> wrote:
> I like this, though maybe .hgextensions since it would contain
> versioned rules and the actual required extension. The extra sub
> directories are not really required IMHO, you just have a hgrc file
> that works the same as the local hgrc file except it only looks in the
> .hgextensions directory for the correct extension so for python we
> could have something like
> [extensions]
> format_enforcer =

Enabling extensions in a versioned file is not going to fly.



From digitalxero at  Fri Aug 21 16:42:29 2009
From: digitalxero at (Dj Gilcrease)
Date: Fri, 21 Aug 2009 08:42:29 -0600
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Aug 21, 2009 at 8:19 AM, Dirkjan Ochtman<dirkjan at> wrote:
> Enabling extensions in a versioned file is not going to fly.

Any specific reason?

From status at  Fri Aug 21 18:07:59 2009
From: status at (Python tracker)
Date: Fri, 21 Aug 2009 18:07:59 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <>

ACTIVITY SUMMARY (08/14/09 - 08/21/09)
Python tracker at

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.

 2353 open (+40) / 16226 closed (+15) / 18579 total (+55)

Open issues with patches:   928

Average duration of open issues: 656 days.
Median duration of open issues: 411 days.

Open Issues Breakdown
   open  2322 (+40)
pending    30 ( +0)

Issues Created Or Reopened (57)

Allow buffering for HTTPResponse                                 08/18/09
CLOSED    reopened r.david.murray                
       patch, patch                                                            

Py3.1 hangs in coroutine and eats up all memory                  08/17/09    reopened scoder                        

cross platform failure and silly test in doctest                 08/14/09    created  cjw296                        

better col_offset for AST in statements like "for a,b in ..."    08/14/09
CLOSED    created  fwierzbicki                   

'''3,5'''.strip(r''',''') does not strip comma, returns '3,5'    08/14/09
CLOSED    created  mgruen                        

asyncore's accept() is broken                                    08/14/09    created  giampaolo.rodola              

dir() on __new__'d module w/o dict crashes 2.6.2                 08/14/09
CLOSED    created  DinoV                         

raw_input() calls generate compile errors.                       08/15/09
CLOSED    created  starz                         

It's possible to create TryExcept with no handlers               08/15/09    created  benjamin.peterson             

hotshot stats load causes TypeError when multiple files are load 08/15/09    created  j1m                           

macurl2path has typos that raise AttributeError                  08/16/09
CLOSED    created  joe.amenta                    

sys._getframe is not available on all Python implementations     08/16/09    created  johannes.janssen              

Integer & Long types:  Performance improvement of 1.6x to 2x for 08/16/09    created  gawain                        

memmove fails with unicode strings                               08/16/09
CLOSED    created  verigak                       

xz compressor support                                            08/17/09    created  devurandom                    

Windows install error when choosing to compile .py files         08/17/09    created  pds                           

Some problem with recursion handling                             08/17/09    created  gregorlingl                   

ValueError in SocketServer.UDPServer Example          08/17/09    created  ericpope                      

pdb messes up when debugging an non-ascii program                08/17/09    created  smu                           

multiprocessing logging                                          08/17/09    created  benliles                      

Locks in python standard library should be sanitized on fork     08/17/09    created  gregory.p.smith               

collections.namedtuple: confusing example                        08/18/09    created  ash                           

csv.writer: example does not work                                08/18/09
CLOSED    created  nicolasg                      

r74463 causes failures in test_xmlrpc                            08/18/09    created  r.david.murray                

Inconsistency in Documentation: "Name Spaces" vs "Namespaces"    08/18/09    created  CarGuy37                      

pyuic4.bat has bad path to python.exe and               08/18/09
CLOSED    created  maar                          

ImportError when package is symlinked on Windows                 08/18/09    created  jaraco                        

To avoid hang up in using CGIXMLRPCRequestHandler under IIS 7.x  08/18/09    created  sjtuer                        

Add support for ssize_t                                          08/18/09    created  Nikratio                      

dict.fromkeys() should not cross reference mutable value by defa 08/18/09
CLOSED    created  maxlem                        

Add option of non-zero exit status of when building of  08/18/09    created  Arfrever                      

PyInit_shoddy() in shoddy.c does not return anything on success  08/18/09    created  kajiyama                      

curses line wrap broken when mixing full- and half-width unicode 08/19/09    created  fugounashi                    

Imap lib implicit conversion from bytes to string                08/19/09    created  surprising42                  

restype pointer to Structure subclass never initialized          08/19/09
CLOSED    created  dontbugme                     

File handling in Python                                          08/19/09
CLOSED    created  SallaS07                      

PEP 372 odict.__eq__ behaves incorrectly                         08/19/09
CLOSED    created  cegner                        

Wrong doc strings in itertools                                   08/20/09    created  hagen                         

IDLE window won't start or show up after assgining new key in op 08/20/09    created  CaribbeanCruise               

Compounded expressions with lambda functions are evaluated incor 08/20/09
CLOSED    created  mvyskocil                     

Garbage collector release method                                 08/20/09    created  gardster                      

Embedding python into shared library crash on AIX                08/20/09    created  damahay123                    

pprint.pprint should support no objects to print blank lines & a 08/20/09    created  marystern                     

calling kevent repr raises a TypeError                           08/20/09    created  jesstess                      

(curses) addstr() takes str in Python 3                          08/20/09    created  Trundle                       

ValueError raised by IDLE during tooltip open on 64-bit Centos 5 08/20/09
CLOSED    created  srid                          

test test_multiprocessing failed                                 08/21/09    created  LinuxDonald                   

test test_telnetlib failed                                       08/21/09    created  LinuxDonald                   

Support for encrypted zipfiles when interpreting zipfile as scri 08/21/09    created  manis                         

threading issue in __builtins__.print                            08/21/09    created  nullnil                       
       patch, needs review                                                     

Default return value in ConfigParser                             08/21/09    created  jjdominguezm                  

-1**2=-1                                                         08/21/09
CLOSED    created  rahul1618                     

Python 3.1.1 test_cmd_line fails on Fedora 11                    08/21/09    created  Pif                           

Non-existent member 'nb_inplace_divide' in PyNumberMethods       08/21/09    created  kajiyama                      

Patch: new method get_wch for ncurses bindings: accept wide char 08/21/09    created  inigoserna                    

ftplib documentation does not document what the acct parameter i 08/21/09    created  tarjei                        

Marshal's documentation incomplete (Bools)                       08/21/09    created  serprex                       

Issues Now Closed (38)

Allow buffering for HTTPResponse                                    0 days    r.david.murray                
       patch, patch                                                            

Unable to launch IDLE on Windows                                  151 days    srid                          

Missing labelside option for Tix option menu (fix included)       103 days    gpolo                         

Fix O(n**2) performance problem in socket._fileobject              81 days    gregory.p.smith               
       patch, patch, easy, needs review                                        

Support for tcl 8.6                                                69 days    gpolo                         

Bug in hashlib                                                     64 days    gregory.p.smith               

multiline exception logging via syslog handler                     43 days    vsajip                        

IDLE with Tk-Cocoa: Edit, format menus hang                        37 days    wordtech                      

"HOME" is not a standard environment variable on Windows           29 days    tarek                         

urllib.urlopen creates bad requests when location header of 301    23 days    orsenthil                     

test_pickle fails on AIX -- 6.9999999999999994e-308 != 6.9999999   10 days    marketdickinson               

codecs documentation does not mention surrogateescape              13 days    ash                           

logging config - using of FileHandler's delay argument?            14 days    vsajip                        

CGI module documentation references method 'toupper'; should be     6 days    r.david.murray                

New functions in to get user/global site packages paths     8 days    tarek                         

better col_offset for AST in statements like "for a,b in ..."       1 days    benjamin.peterson             

'''3,5'''.strip(r''',''') does not strip comma, returns '3,5'       0 days    eric.smith                    

dir() on __new__'d module w/o dict crashes 2.6.2                    1 days    benjamin.peterson             

raw_input() calls generate compile errors.                          0 days    benjamin.peterson             

macurl2path has typos that raise AttributeError                     4 days    orsenthil                     

memmove fails with unicode strings                                  0 days    eric.smith                    

csv.writer: example does not work                                   0 days    skip.montanaro                

pyuic4.bat has bad path to python.exe and                  0 days    benjamin.peterson             

dict.fromkeys() should not cross reference mutable value by defa    0 days    r.david.murray                

restype pointer to Structure subclass never initialized             0 days    theller                       

File handling in Python                                             0 days    amaury.forgeotdarc            

PEP 372 odict.__eq__ behaves incorrectly                            0 days    rhettinger                    

Compounded expressions with lambda functions are evaluated incor    0 days    eric.smith                    

ValueError raised by IDLE during tooltip open on 64-bit Centos 5    0 days    gpolo                         

-1**2=-1                                                            0 days    amaury.forgeotdarc            

ScrolledText allows Frame.bbox to hide Text.bbox                 1651 days gpolo                         

Tix: PanedWindow.panes nonfunctional                             1477 days gpolo                         

Tix CheckList 'radio' option cannot be changed                   1465 days gpolo                         
       patch

class HList missing info_bbox, info_dragsite and info_dro        1373 days gpolo
       patch, easy                                                             

urllib.FancyURLopener.redirect_internal looses data on POST!     1293 days orsenthil                     

Tix.Grid patch                                                   1131 days gpolo                         

Fix numerous bugs in unittest                                    1084 days jonozzz                       

functools.compose to chain functions together                     912 days rhettinger                    

Top Issues Most Discussed (10)

 17 Add option of non-zero exit status of when building of    3 days

 15 PyXXX_ClearFreeList for dict, set, and list                        8 days

  7 Imap lib implicit conversion from bytes to string                  2 days

  7 sys._getframe is not available on all Python implementations       5 days

  7 Python 3.1 segfaults when invalid UTF-8 characters are passed f    8 days

  7 c_char_p return value returns string, not bytes                   74 days

  7 Conversion of longs to bytes and vice-versa.                    1810 days

  6 httplib read() very slow due to lack of socket buffer            501 days

  6 urllib.FancyURLopener.redirect_internal looses data on POST!    1293 days

  5 fnmatch fails on filenames containing \n character                14 days

From peter at  Fri Aug 21 18:24:46 2009
From: peter at (Peter Moody)
Date: Fri, 21 Aug 2009 09:24:46 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Aug 20, 2009 at 10:15 PM, Case Vanhorsen<casevh at> wrote:
>>On Thu, Aug 20, 2009 at 2:00 PM, Peter Moody<peter at> wrote:
>> The pep has been updated with the excellent suggestions thus far.
>> Are there any more?
> Thanks for writing the PEP.
> I tried a few of the common scenarios that I use at work. Disclaimer:
> my comments are based on my work environment.
> I was surprised that IP('') returned
> IPv4Address('') instead of IPv4Address(''). I
> know I can change the behavior by using host=True, but then
> IP('', host=True) will raise an error. It makes more
> sense, at least to me, that if I input just an IP address, I get an IP
> address back. I would prefer that IP('') return an
> IPv4Network and IP('') return an IPv4Address.

I think you mean that it returns an IPv4Network object (not
IPv4Address).  My suggestion there is that if you know you're dealing
with an address, use one of the IPvXAddress classes (or pass host=True
to the IP function). IP is just a helper function and defaulting to a
network with a /32 prefix seems relatively common.

Knowing that my experience may not always be the most common, I can
change this behavior if it's indeed confusing, but in my conversations
with others and in checking out the current state of ip address
libraries, this seems to be a perfectly acceptable default.

> Would it be possible to provide an iterator that returns just the
> valid host IP addresses (minus the network and broadcast addresses)?
> "for i in IPv4Network('')" and "for i in
> IPv4Network('').iterhosts()" both return all 16 IP
> addresses. I normally describe as consisting of one
> network address, 14 host addresses, and one broadcast address. I would
> prefer that "for i in IPv4Network('')" return all IP
> addresses and that "for i in
> IPv4Network('').iterhosts()" exclude the network and
> broadcast addresses. I think creating a list of IP addresses that can
> be assigned to devices on a network is a common task.

This is a good idea and I'll implement it: .iterhosts() for subnet
- (network|broadcast) and .iterallhosts() for the entire subnet. (In my
testing, looping over an iterator was actually reasonably faster than
just "for i in IP(network):", so I'll support iterators for both.)
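
A minimal sketch of the distinction being requested, written against
the ipaddress module that later shipped in the Python 3 standard
library rather than the ipaddr code under review here, whose API may
differ. The 192.0.2.0/28 network is an assumption chosen for
illustration, since the addresses in this thread were lost from the
archive:

```python
import ipaddress

# Assumed example network: a /28 holds 16 addresses in total.
net = ipaddress.ip_network("192.0.2.0/28")

all_addresses = list(net)            # network and broadcast included
host_addresses = list(net.hosts())   # network and broadcast excluded

print(len(all_addresses), len(host_addresses))  # prints: 16 14
```

Here plain iteration plays the role of the proposed .iterallhosts() and
hosts() plays the role of the proposed .iterhosts().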

> Can .subnet() be enhanced to accept masks? For example,
> IPv4Network('').subnet('/19') would return the eight /19
> subnets.

This seems like an easy win. I'll implement this too.
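
For comparison, a short sketch of mask-based subnetting using the
later standard-library ipaddress module, not ipaddr's .subnet(). The
10.0.0.0/16 parent network is an assumption; it is picked only because
a /16 splits into exactly eight /19 subnets, as in the request above:

```python
import ipaddress

# Assumed parent network for illustration.
net = ipaddress.ip_network("10.0.0.0/16")

# Split the /16 into subnets with a /19 prefix: 2 ** (19 - 16) == 8 of them.
subnets = list(net.subnets(new_prefix=19))

print(len(subnets), subnets[0], subnets[-1])
# prints: 8 10.0.0.0/19 10.0.224.0/19
```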

> What about supporting multiple parameters to subnet? I frequently need
> to create complex subnet layouts. The following subnet layout is NOT
> made up!

I believe it; we have equally odd subnet assignments at work.


> A possible syntax would be:
> .subnet((1,'/23'),(1,'/25'),(2,'/26'),(2,'/28'),(8,'/30'),(16,'/32'),(3,'/28'),(2,'/27'),(1,'/26'))
> Note: I am willing to provide patches to implement my suggestions. I
> just won't have much time over the next couple weeks.

I'm happy reviewing/accepting patches to ipaddr. I'd worry a bit about
the complexity required for this, but I'm open-minded.

> casevh
>> Cheers,
>> /peter
>> On Tue, Aug 18, 2009 at 1:00 PM, Peter Moody<peter at> wrote:
>>> Howdy folks,
>>> I have a first draft of a PEP for including an IP address manipulation
>>> library in the python stdlib. It seems like there are a lot of really
>>> smart folks with some, ahem, strong ideas about what an IP address
>>> module should and shouldn't be so I wanted to solicit your input on
>>> this pep.
>>> the pep can be found here:
>>> ?
>>> the code can be found here:
>>> ?
>>> Please let me know if you have any comments (some already coming :)
>>> Cheers,
>>> /peter
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at
>> Unsubscribe:

From ncoghlan at  Sat Aug 22 01:08:05 2009
From: ncoghlan at (Nick Coghlan)
Date: Sat, 22 Aug 2009 09:08:05 +1000
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

Peter Moody wrote:
> this is a good idea and I'll implement this.  .iterhosts() for subnet
> - (network|broadcast) and .iterallhosts() for the entire subnet (in my
> testing, looping over an iterator was actually reasonably faster than
> just for i in IP(network):, so I'll support iterators for both)

I would suggest just changing __iter__ to be the equivalent of the
current iterhosts() and then changing iterhosts() as described.

Such a change would also fix the thread safety and nested
iteration problems suffered by the current __iter__
implementation. I haven't executed the following, but from reading the
code I am confident they would behave as I describe in the comments:

  # With the current implementation, this is an infinite loop
  net = IPv4Network("")
  for x in net:
  # And this only runs the inner loop once
  for x in net:
    for y in net:


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Aug 22 01:41:49 2009
From: ncoghlan at (Nick Coghlan)
Date: Sat, 22 Aug 2009 09:41:49 +1000
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

Peter Moody wrote:
> On Thu, Aug 20, 2009 at 10:15 PM, Case Vanhorsen<casevh at> wrote:
>> I was surprised that IP('') returned
>> IPv4Address('') instead of IPv4Address(''). I
>> know I can change the behavior by using host=True, but then
>> IP('', host=True) will raise an error. It makes more
>> sense, at least to me, that if I input just an IP address, I get an IP
>> address back. I would prefer that IP('') return an
>> IPv4Network and IP('') return an IPv4Address.
> I think you mean that it returns an IPv4Network object (not
> IPv4Address).  My suggestion there is that if you know you're dealing
> with an address, use one of the IPvXAddress classes (or pass host=True
> to the IP function). IP is just a helper function and defaulting to a
> network with a /32 prefix seems relatively common.
> Knowing that my experience may not always be the most common, I can
> change this behavior if it's indeed confusing, but in my conversations
> with others and in checking out the current state of ip address
> libraries, this seems to be a perfectly acceptable default.

The IP() helper function actually bothers me a bit - it's a function
where the return *type* depends on the parameter *value*. While
switching between IPv4 and IPv6 based on value is probably a necessary
behaviour, perhaps it would be possible to get rid of the "host=True"
ugliness and instead have two separate helper functions:

  IP() - returns either IPv4Address or IPv6Address
  IPNetwork() - returns either IPv4Network or IPv6Network

Both would still accept a version argument, allowing programmatic
control of which version to accept. If an unknown version is passed then
some kind of warning or error should be emitted rather than the current
silent fallback to attempting to guess the version based on the value.
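
A minimal sketch of the two-helper split, assuming the later
standard-library ipaddress module as the backend instead of the ipaddr
classes discussed here. The helper bodies and the _check_version name
are illustrative assumptions, not the PEP's reference implementation:

```python
import ipaddress


def _check_version(obj, version):
    """Reject unknown or mismatched versions instead of silently guessing."""
    if version is None:
        return obj
    if version not in (4, 6):
        raise ValueError("unknown IP version: %r" % (version,))
    if obj.version != version:
        raise ValueError("%s is not an IPv%d value" % (obj, version))
    return obj


def IP(value, version=None):
    """Always return an address object (IPv4Address or IPv6Address)."""
    return _check_version(ipaddress.ip_address(value), version)


def IPNetwork(value, version=None):
    """Always return a network object (IPv4Network or IPv6Network)."""
    return _check_version(ipaddress.ip_network(value), version)
```

Each helper now has a return type that depends only on which function
was called; the IPv4/IPv6 choice is still made from the value.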

I would suggest removing the corresponding IPv4 and IPv6 helper
functions altogether.

My rationale for the above is that hosts and networks are *not* the same
thing. For any given operation, the programmer should know whether they
want a host or a network and ask for whichever one they want. The
IPv4/IPv6 distinction, on the other hand, is something that a lot of
operations are going to be neutral about, so it makes sense to deal with
the difference implicitly.

Other general comments:

- the module appears to have quite a few isinstance() checks against
non-abstract base classes. Either these instance checks should all be
removed (relying on pure duck-typing instead) or else the relevant
classes should be turned into ABCs. (Note: this comment doesn't apply to
the type dispatch in the constructor methods)

- the reference implementation has aliased "CamelCase" names with the
heading "backwards compatibility". This is inappropriate for a standard
library submission (although I can see how it would be useful if you
were already using a different IP address library).

- isinstance() accepts a tuple of types, so isinstance(address, (int,
long)) is a shorter way of writing "isinstance(address, int) or
isinstance(address, long)". The former also has the virtue of executing
faster. However, an even better approach would be to use
operator.index() in order to accept all integral types rather than just
the builtin ones.
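
A small sketch of the operator.index() approach from the last point,
in Python 3 syntax (where there is no separate long type). The Octet
class and the as_int helper are hypothetical names used only for
illustration:

```python
import operator


class Octet:
    """Hypothetical integral type that is not a builtin integer."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


def as_int(address):
    """Accept any object that implements __index__; reject everything else."""
    try:
        return operator.index(address)
    except TypeError:
        raise TypeError("expected an integer-like address, got %s"
                        % type(address).__name__)


# The tuple form of isinstance() only covers the types that are listed.
print(isinstance(Octet(10), (int, bool)))  # prints: False
# operator.index() also accepts third-party integral types.
print(as_int(Octet(10)))                   # prints: 10
```

Floats and strings do not implement __index__, so they are rejected
with a TypeError rather than being silently truncated or parsed.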


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From mg at  Sat Aug 22 01:17:53 2009
From: mg at (Martin Geisler)
Date: Sat, 22 Aug 2009 01:17:53 +0200
Subject: [Python-Dev] Mercurial migration: help needed
References: <>
Message-ID: <>

Dj Gilcrease <digitalxero at> writes:

> On Fri, Aug 21, 2009 at 8:19 AM, Dirkjan Ochtman<dirkjan at> wrote:
>> Enabling extensions in a versioned file is not going to fly.
> any specific reason?

In the general case, you can specify an extension to be enabled by

  foo = ~/src/foo

So if I can enable an extension like that on your system, I might be
evil and commit a bad extension *and* enable it at the same time.

You might argue that one should then limit which extensions one can
enable in a versioned file, but it seems hard to come up with a good
mechanism for this. The current "mechanism" is the user's own ~/.hgrc
file, which can be seen as a whitelist of extensions he trusts.

An alternative could be the new %include syntax for configuration files,
which was introduced in Mercurial 1.3. If you add

  %include ../config

to your .hg/hgrc file, the (versioned!) file named 'config' from the
root of your repository will be included on the spot. The catch is that
you have to add such a line to all your Python clones.

Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See:

From ben+python at  Sat Aug 22 02:16:17 2009
From: ben+python at (Ben Finney)
Date: Sat, 22 Aug 2009 10:16:17 +1000
Subject: [Python-Dev] Issue 1170: Unicode for ‘shlex’
Message-ID: <>

Howdy all,

What is the procedure for finding out why an issue hasn't progressed? I
don't want to fill the bug database with such noise.

In the case of <URL:> (“shlex have
problems with parsing unicode”), the problem is apparently addressed by
a patch, assigned to that issue since 2007-12-22. There is no indication
in the report why it's not yet applied.

I'd really like this fixed in the 2.x series if possible.

 \       “… whoever claims any right that he is unwilling to accord to |
  `\             his fellow-men is dishonest and infamous.” –Robert G. |
_o__)           Ingersoll, _The Liberty of Man, Woman and Child_, 1877 |
Ben Finney

From skippy.hammond at  Sat Aug 22 02:56:36 2009
From: skippy.hammond at (Mark Hammond)
Date: Sat, 22 Aug 2009 10:56:36 +1000
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

On 22/08/2009 12:19 AM, Dirkjan Ochtman wrote:
> On Fri, Aug 21, 2009 at 16:10, Dj Gilcrease<digitalxero at>  wrote:
>> I like this, though maybe .hgextensions since it would contain
>> versioned rules and the actual required extension. The extra sub
>> directories are not really required IMHO, you just have a hgrc file
>> that works the same as the local hgrc file except it only looks in the
>> .hgextensions directory for the correct extension so for python we
>> could have something like
>> [extensions]
>> format_enforcer =
> Enabling extensions in a versioned file is not going to fly.

I like Stephen and Nick's discussion higher in this thread, but wonder 
if some middle ground couldn't work.

Instead of [extensions], just have a place to list the required 
extensions - eg;

Something like ~/.hgrules having:

[config]  # or maybe [rules] ?
required_extensions = win32text, some_pydev_specific_extension

[Encode]
{rules for encoding}

some_custom_property_for_our_custom_ext = 1

... etc ...
(Note I am not proposing we need our own pydev_specific_extension, I 
just included it here to try and show the more general concept)

This way you aren't *enabling* extensions in this versioned file, just 
listing rules about what extensions must be enabled.  From core hg's 
POV, it doesn't care if the required extensions relate to windows line 
endings or re-encoding images - it just honours the wishes of the repo
owner.

 From earlier in the thread, Dirkjan writes:

 > The [concept of hg enforcing required extensions] might indeed be
 > hard to get into the core, but seems like a good idea to try.

 From my POV, this would be required in some form or another before such 
a scheme could actually work.  Without it we end up with an improved 
win32text (good!) but in practice still have the same problems we have 
discussed in this thread which would make it unsuitable for us who 
actually try and use it, particularly as a general solution for projects 
with any kind of windows focus or community.

Given you are a core hg committer and well known in the community, would 
you be willing to start a thread with the hg developers about this 
issue?  If something like this can't get into the core, I will drop any 
expectations of it becoming a viable general solution for windows 
focused projects, so would limit the work I am willing to invest to the 
commitments I've made here.



From mhammond at  Sat Aug 22 02:58:40 2009
From: mhammond at (Mark Hammond)
Date: Sat, 22 Aug 2009 10:58:40 +1000
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>	
Message-ID: <>

On 22/08/2009 12:10 AM, Dj Gilcrease wrote:
> On Fri, Aug 21, 2009 at 1:16 AM, Mark Hammond<skippy.hammond at>  wrote:
>> Maybe you can enumerate what you think needs to change in mercurial, then
>> once we have a plan in place it will be clearer who can do what.
> The encode/decode hooks need to be passed the filename they are
> working on so you can have an ignore list, this is why I consider my
> method a hack since I am using a precommit hook to do conversion since
> I am able to find out which file I am working on and make sure it is
> not in an ignore list. There also needs to be a way to have required
> and version controlled extensions.

I think this is the exact issue my 'none' patch addresses.  Your filters 
can say:


The end result should be that anything with 'none:' forms what you call 
an ignore list.

Would that not meet your requirements?



From stephen at  Sat Aug 22 06:46:43 2009
From: stephen at (Stephen J. Turnbull)
Date: Sat, 22 Aug 2009 13:46:43 +0900
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

Mark Hammond writes:

 > Something like ~/.hgrules having:

Surely you mean $PROJECTROOT/.hgrules?

 > [config]  # or maybe [rules] ?
 > required_extensions = win32text, some_pydev_specific_extension

[extensions]
required_for_commit = win32text,some_other_ext

That might require a change to hg's ini file semantics if currently it
refuses to parse [extension] sections in versioned hgrcs.

Note the change in name: I'm not sure exactly what the semantics
should be, but surely we want to allow browsing the repository,
branching, etc without enabling any extensions.

 > [Encode]
 > {rules for encoding}

No, there must be a way to indicate that "this is a section for a
specific extension".  Bare [Encode] will be seen as polluting the
global namespace, and will get a lot of pushback, I think.

 > This way you aren't *enabling* extensions in this versioned file,

True, but how many people will just download the extension and enable
it?  This would open a door to "social engineering".  (Personally, *I*
am not opposed to it on those grounds, but as devil's advocate I do
want to mention that as an argument you might run into.)

 > just listing rules about what extensions must be enabled.  From
 > core hg's POV, it doesn't care if the required extensions relate to
 > windows line endings or re-encoding images - it just honours the
 > wishes of the repo owner.

If it refuses the user's request, it should issue a message to the
effect of "Please enable win32text, which is required in <absolute
name of .hgrules>."
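
A rough sketch of such a check, assuming a versioned rules file with an
[extensions] section and a required_for_commit key as spelled above.
Nothing here is Mercurial API: the function names, the file path, and
the way the set of enabled extensions is obtained are all hypothetical:

```python
import configparser


def missing_extensions(rules_text, enabled):
    """Return required extensions (in file order) that are not enabled."""
    parser = configparser.ConfigParser()
    parser.read_string(rules_text)
    raw = parser.get("extensions", "required_for_commit", fallback="")
    required = [name.strip() for name in raw.split(",") if name.strip()]
    return [name for name in required if name not in enabled]


def refusal_message(rules_path, rules_text, enabled):
    """Build the refusal text, or return None when nothing is missing."""
    missing = missing_extensions(rules_text, enabled)
    if not missing:
        return None
    return "Please enable %s, which is required in %s." % (
        ", ".join(missing), rules_path)


rules = "[extensions]\nrequired_for_commit = win32text,some_other_ext\n"
print(refusal_message("/repo/.hgrules", rules, {"some_other_ext"}))
# prints: Please enable win32text, which is required in /repo/.hgrules.
```

The versioned file only names what must be present; turning an
extension on remains a decision made in the user's own configuration.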

From mhammond at  Sat Aug 22 07:02:19 2009
From: mhammond at (Mark Hammond)
Date: Sat, 22 Aug 2009 15:02:19 +1000
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
Message-ID: <>

On 22/08/2009 2:46 PM, Stephen J. Turnbull wrote:
> Mark Hammond writes:
>   >  Something like ~/.hgrules having:
> Surely you mean $PROJECTROOT/.hgrules?


>   >  [config]  # or maybe [rules] ?
>   >  required_extensions = win32text, some_pydev_specific_extension
> [extensions]
> required_for_commit = win32text,some_other_ext
> That might require a change to hg's ini file semantics if currently it
> refuses to parse [extension] sections in versioned hgrcs.

Yes - I'm not proposing specific names for sections etc - I'm more 
interested in getting the concepts across, and fully expect the hg guys 
will have their own opinions and make final decisions on the exact spelling.

> Note the change in name: I'm not sure exactly what the semantics
> should be, but surely we want to allow browsing the repository,
> branching, etc without enabling any extensions.
>   >  [Encode]
>   >  {rules for encoding}
> No, there must be a way to indicate that "this is a section for a
> specific extension".  Bare [Encode] will be seen as polluting the
> global namespace, and will get a lot of pushback, I think.

Possibly - although I would expect the existing section names be reused 
when applied to a versioned file, I'd be more than happy for the hg guys 
to declare new names are appropriate for this.

>   >  This way you aren't *enabling* extensions in this versioned file,
> True, but how many people will just download the extension and enable
> it?

In the ideal world, exactly as many people who would read the Python 
developer guide, then download and install the extension based purely on 
that.  IOW, it is Python itself setting the policy, so people need to 
make their own decisions based on that, regardless of whether the tool 
enforces it or not.

> This would open a door to "social engineering".  (Personally, *I*
> am not opposed to it on those grounds, but as devil's advocate I do
> want to mention that as an argument you might run into.)
>   >  just listing rules about what extensions must be enabled.  From
>   >  core hg's POV, it doesn't care if the required extensions relate to
>   >  windows line endings or re-encoding images - it just honours the
>   >  wishes of the repo owner.
> If it refuses the user's request, it should issue a message to the
> effect of "Please enable win32text, which is required in <absolute
> name of .hgrules>."




From benjamin at  Sat Aug 22 08:07:52 2009
From: benjamin at (Benjamin Peterson)
Date: Sat, 22 Aug 2009 01:07:52 -0500
Subject: [Python-Dev]
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/21 Ben Finney <ben+python at>:
> Howdy all,
> What is the procedure for finding out why an issue hasn't progressed? I
> don't want to fill the bug database with such noise.

In this case, it's probably because no one officially maintains the
shlex module at the moment.

> In the case of <URL:> ("shlex have
> problems with parsing unicode"), the problem is apparently addressed by
> a patch, assigned to that issue since 2007-12-22. There is no indication
> in the report why it's not yet applied.

I will leave a few initial comments.
> I'd really like this fixed in the 2.x series if possible.


From dirkjan at  Sat Aug 22 09:35:13 2009
From: dirkjan at (Dirkjan Ochtman)
Date: Sat, 22 Aug 2009 09:35:13 +0200
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Aug 22, 2009 at 01:17, Martin Geisler<mg at> wrote:
> In the general case, you can specify an extension to be enabled by
> filename:
>   [extensions]
>   foo = ~/src/foo
> So if I can enable an extension like that on your system, I might be
> evil and commit a bad extension *and* enable it at the same time.
> You might argue that one should then limit which extensions one can
> enable in a versioned file, but it seems hard to come up with a good
> mechanism for this. The current "mechanism" is the user's own ~/.hgrc
> file which can be seen as a whitelist of extensions he trusts.

Thanks for explaining that bit, Martin. Everyone: Martin is also a hg
crew member.

It sounds to me like somehow requiring extensions to be enabled
(without actually enabling them) would help mitigate the issues
somehow, although it's still a distributed system and so clients
cannot be trusted (e.g. I might put a win32text stub in there
somewhere that does nothing).



From stephen at  Sat Aug 22 10:52:47 2009
From: stephen at (Stephen J. Turnbull)
Date: Sat, 22 Aug 2009 17:52:47 +0900
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

Mark Hammond writes:
 > On 22/08/2009 2:46 PM, Stephen J. Turnbull wrote:

 > Possibly - although I would expect the existing section names be reused 
 > when applied to a versioned file, I'd be more than happy for the hg guys 
 > to declare new names are appropriate for this.

If there's already an [Encode] section, that's different.  (I don't know the
details, I'm not that big a Mercurial fan.)  But you'd still need a
way to differentiate win32text rules from other encoding rules.

 > >   >  This way you aren't *enabling* extensions in this versioned file,
 > >
 > > True, but how many people will just download the extension and enable
 > > it?
 > In the ideal world, exactly as many people who would read the Python 
 > developer guide, then download and install the extension based purely on 
 > that.  IOW, it is Python itself setting the policy, so people need to 
 > make their own decisions based on that, regardless of whether the tool 
 > enforces it or not.

You're missing the point.  I'm not talking about whether it will work
for Python, I'm talking about the worry that somebody will post a way
cool Python branch and require a private extension, which everybody
will just automatically install and enable, which extension then
proceeds to phone home to Spammer Haven, Inc. with the contents of
your email contact list.  That's what I mean by "social engineering,"
and why I worry about policy pushback from Mercurial HQ.

Maybe that's more paranoid than they are....  But it can't hurt your
cause to be ready for that kind of worry.

From stephen at  Sat Aug 22 10:59:37 2009
From: stephen at (Stephen J. Turnbull)
Date: Sat, 22 Aug 2009 17:59:37 +0900
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

Dirkjan Ochtman writes:

 > [Clients] cannot be trusted (e.g. I might put a win32text stub in
 > there somewhere that does nothing).

Heck, just edit the .hgrules file, and do a Houdini on any and all
handcuffs.

Don't trust software, trust people -- but help them avoid thoughtless
mistakes.

From martin at  Sat Aug 22 11:09:21 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 22 Aug 2009 11:09:21 +0200
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>	<>	<>	<>	<>
Message-ID: <>

> From my POV, this would be required in some form or another before such
> a scheme could actually work.  Without it we end up with an improved
> win32text (good!)

I still think this would be actually bad.

Instead, a new extension should be written, with a name that does not
have "win32" as a substring, and that has no provision for guessing
line breaks by inspecting files.


From martin at  Sat Aug 22 11:16:47 2009
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sat, 22 Aug 2009 11:16:47 +0200
Subject: [Python-Dev]
In-Reply-To: <>
References: <>
Message-ID: <>

> What is the procedure for finding out why an issue hasn't progressed?

It's fairly simple: just read through the issue, and it should be
obvious. In the specific case, no committer has ever commented on
the issue, so chances are high that no committer has ever *seen*
the issue.

> I don't want to fill the bug database with such noise.

And likely, posting to the issue won't be a way to find out, since no
committer would see your comment.

> In the case of <URL:> ("shlex have
> problems with parsing unicode"), the problem is apparently addressed by
> a patch, assigned to that issue since 2007-12-22.

Apparently, or really? Did you review the patch?


From mg at  Sat Aug 22 11:57:14 2009
From: mg at (Martin Geisler)
Date: Sat, 22 Aug 2009 11:57:14 +0200
Subject: [Python-Dev] Mercurial migration: help needed
References: <>
Message-ID: <>

"Stephen J. Turnbull" <stephen at> writes:

> Mark Hammond writes:
> [extensions]
> required_for_commit = win32text,some_other_ext
> That might require a change to hg's ini file semantics if currently it
> refuses to parse [extension] sections in versioned hgrcs.

It doesn't refuse anything like that. When Mercurial starts, it reads
these configuration files:

Notice that they are all outside the clone's working directory; the
closest one is the <repo>/.hg/hgrc file.

As I wrote somewhere else in this thread, you can add

  %include ../.repo-settings

in your <repo>/.hg/hgrc file, and this will result in


being loaded (and this file *is* in the working copy and can thus be put
under revision control).

Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See:

From mg at  Sat Aug 22 11:48:58 2009
From: mg at (Martin Geisler)
Date: Sat, 22 Aug 2009 11:48:58 +0200
Subject: [Python-Dev] Mercurial migration: help needed
References: <>
Message-ID: <>

"Stephen J. Turnbull" <stephen at> writes:

> Mark Hammond writes:
>  > On 22/08/2009 2:46 PM, Stephen J. Turnbull wrote:
>  > Possibly - although I would expect the existing section names be reused 
>  > when applied to a versioned file, I'd be more than happy for the hg guys 
>  > to declare new names are appropriate for this.
> If there's already an [Encode] section, that's different.  (I don't know the
> details, I'm not that big a Mercurial fan.)  But you'd still need a
> way to differentiate win32text rules from other encoding rules.

There is a [decode] and an [encode] section:

The win32text extension works by defining new filters which can then be
used like this:

  ** = cleverencode:
  ** = cleverdecode:

(they are "clever" because they skip binary files)
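
As a rough illustration of what a "clever" filter pair could look like,
here is a sketch. It is not the win32text code: the NUL-byte test for
binary data and the function names are assumptions made for this example.

```python
def looks_binary(data: bytes) -> bool:
    # Assumption for this sketch: a NUL byte marks the content as binary.
    return b"\0" in data


def clever_encode(data: bytes) -> bytes:
    """Working copy -> repository: CRLF becomes LF; binary data is untouched."""
    if looks_binary(data):
        return data
    return data.replace(b"\r\n", b"\n")


def clever_decode(data: bytes) -> bytes:
    """Repository -> working copy: LF becomes CRLF; binary data is untouched."""
    if looks_binary(data):
        return data
    # Normalise to LF first so existing CRLF pairs are not doubled.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


print(clever_encode(b"spam\r\neggs\r\n"))
print(clever_decode(b"spam\neggs\n"))
```

Guessing from file content like this is what the thread later objects
to; an explicit per-pattern rule would avoid the heuristic.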

>>> True, but how many people will just download the extension and
>>> enable it?
>> In the ideal world, exactly as many people who would read the Python
>> developer guide, then download and install the extension based purely
>> on that. IOW, it is Python itself setting the policy, so people need
>> to make their own decisions based on that, regardless of whether the
>> tool enforces it or not.
> You're missing the point.  I'm not talking about whether it will work
> for Python, I'm talking about the worry that somebody will post a way
> cool Python branch and require a private extension, which everybody
> will just automatically install and enable, which extension then
> proceeds to phone home to Spammer Haven, Inc. with the contents of
> your email contact list.  That's what I mean by "social engineering,"
> and why I worry about policy pushback from Mercurial HQ.
> Maybe that's more paranoid than they are.... But it can't hurt your
> cause to be ready for that kind of worry.

Oh, we try to be very paranoid in Mercurial :-) That's why you don't see
any support for copying hgrc files when you clone and why hg won't trust
hgrc files not owned by you: it should be safe to do

  cd ~collegue/src/python
  hg tip

Martin Geisler


From p.f.moore at  Sat Aug 22 13:18:35 2009
From: p.f.moore at (Paul Moore)
Date: Sat, 22 Aug 2009 12:18:35 +0100
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/22 Martin Geisler <mg at>:
> Oh, we try to be very paranoid in Mercurial :-) That's why you don't see
> any support for copying hgrc files when you clone and why hg won't trust
> hgrc files not owned by you: it should be safe to do
>   cd ~collegue/src/python
>   hg tip

So, is the implication therefore that there would be resistance to
having some way of making a setting which *is* copied on clone, which
says that you can't commit in this repository unless you have the
following extensions enabled?

Or is the fact that it's only saying "you must have an extension
called win32text enabled" and not actually enabling code directly,
sufficiently secure to make it acceptable?


From mg at  Sat Aug 22 15:35:05 2009
From: mg at (Martin Geisler)
Date: Sat, 22 Aug 2009 15:35:05 +0200
Subject: [Python-Dev] Mercurial migration: help needed
References: <>
Message-ID: <>

Paul Moore <p.f.moore at> writes:

> 2009/8/22 Martin Geisler <mg at>:
>> Oh, we try to be very paranoid in Mercurial :-) That's why you don't
>> see any support for copying hgrc files when you clone and why hg won't
>> trust hgrc files not owned by you: it should be safe to do
>>   cd ~collegue/src/python
>>   hg tip
> So, is the implication therefore that there would be resistance to
> having some way of making a setting which *is* copied on clone, which
> says that you can't commit in this repository unless you have the
> following extensions enabled?

It sounds somewhat invasive to forbid commits. Moreover, repository
owners should remember that clients can do whatever they want, so this
can only be a hint, never a requirement.

I don't think this has been mentioned: When you clone you move history
(changesets) only and I'm pretty sure you cannot even read the
configuration settings over the "wire protocol".

So cloning from an HTTP URL won't copy a setting found in the
<repo>/.hg/hgrc file. This implies that the settings should live in a
version controlled file. I think that is sensible under all

So if the win32text extension (horrible name, I agree... it should have
been made more general and called eolconvert or something like that)
would just read a configuration file from the repository, then all you
should ask people is to enable win32text.

> Or is the fact that it's only saying "you must have an extension
> called win32text enabled" and not actually enabling code directly,
> sufficiently secure to make it acceptable?

It is definitely secure enough to be included. There should be a way to
turn off those hints, though: I might want to clone the Python
repository and play around with it without enabling win32text.

Martin Geisler


From ncoghlan at  Sat Aug 22 17:47:41 2009
From: ncoghlan at (Nick Coghlan)
Date: Sun, 23 Aug 2009 01:47:41 +1000
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Stephen J. Turnbull wrote:
> Dirkjan Ochtman writes:
>  > [Clients] cannot be trusted (e.g. I might put a win32text stub in
>  > there somewhere that does nothing).
> Heck, just edit the .hgrules file, and do a Houdini on any and all
> handcuffs.
> Don't trust software, trust people -- but help them avoid thoughtless
> mistakes.

Yes, on the client side we're not trying to prevent someone doing the
wrong thing deliberately - just nudging them towards doing the right
thing so they won't run afoul of the server side checks that will
actually *enforce* the line ending rules for the main repository.
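
To make the server-side idea concrete, a sketch of the kind of check
such a hook might run. The function and its input (a mapping of paths to
file contents) are hypothetical; a real hook would obtain the contents
from the incoming changesets.

```python
def crlf_offenders(files):
    """Return the sorted paths of non-binary files that contain CRLF.

    files: mapping of path -> bytes (hypothetical input for this sketch).
    A NUL byte is assumed to mark a file as binary.
    """
    return sorted(
        path for path, data in files.items()
        if b"\0" not in data and b"\r\n" in data
    )


incoming = {
    "Lib/os.py": b"import sys\n",
    "Lib/broken.py": b"import sys\r\n",
    "PC/icon.ico": b"\0\x01\r\n",
}
offenders = crlf_offenders(incoming)
if offenders:
    print("rejected, CRLF line endings in: " + ", ".join(offenders))
```

A non-empty result would be the point at which the server refuses the
push, regardless of what the client had enabled.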


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From p.f.moore at  Sat Aug 22 21:28:55 2009
From: p.f.moore at (Paul Moore)
Date: Sat, 22 Aug 2009 20:28:55 +0100
Subject: [Python-Dev] Setting up a buildbot
Message-ID: <>

I've just had a look on, but couldn't immediately see a
pointer to instructions on what the process is to set up a buildbot.
There's a note on setting things up for pybots, but nothing on the core
buildbot setup.

The reason I'm asking is that I'm thinking of seeing if I could set up
a Windows buildbot of some sort, to offer extra coverage. It's early
days, yet, but I wonder if someone could answer a few questions for

- Is there any documentation on how to set up a buildbot? If so, can
someone give me a pointer?
- What configurations would be most useful? (I've got a 64-bit PC, so
I can theoretically set up 32 or 64 bit VMs with VMWare, and with my
shiny new MSDN subscription, I can set up whatever OS is most useful).
- Is it possible to set up the pull/build/test side of the process
separately, before linking it into the full buildbot farm? That would
let me try things out on my own, and iron out any configuration
glitches before dumping it on the world.

Thanks for any pointers. It's early days yet, so it may be a while
before I have anything properly set up, but I'd like to see what I can


From asmodai at  Sat Aug 22 21:40:49 2009
From: asmodai at (Jeroen Ruigrok van der Werven)
Date: Sat, 22 Aug 2009 21:40:49 +0200
Subject: [Python-Dev] Setting up a buildbot
In-Reply-To: <>
References: <>
Message-ID: <>

-On [20090822 21:30], Paul Moore (p.f.moore at wrote:
>I've just had a look on, but couldn't immediately see a
>pointer to instructions on what the process is to set up a buildbot.

comes to mind.

Jeroen Ruigrok van der Werven <asmodai(-at-)> / asmodai
GPG: 2EAC625B
Success is satisfaction with yourself...

From p.f.moore at  Sat Aug 22 22:08:02 2009
From: p.f.moore at (Paul Moore)
Date: Sat, 22 Aug 2009 21:08:02 +0100
Subject: [Python-Dev] Setting up a buildbot
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/22 Jeroen Ruigrok van der Werven <asmodai at>:
> -On [20090822 21:30], Paul Moore (p.f.moore at wrote:
>>I've just had a look on, but couldn't immediately see a
>>pointer to instructions on what the process is to set up a buildbot.
> comes to mind.

Ah, thanks. I'll take a look.


From digitalxero at  Sun Aug 23 00:35:48 2009
From: digitalxero at (Dj Gilcrease)
Date: Sat, 22 Aug 2009 16:35:48 -0600
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Aug 21, 2009 at 6:58 PM, Mark Hammond<mhammond at> wrote:
> [encode]
> *.dsp=none:
> **=cleverencode:
> The end result should be that anything with 'none:' forms what you call an
> ignore list.
> Would that not meet your requirements?

It would, so I guess I'll hold off on digging into the hook code
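
A small sketch of how such rules could yield an ignore list. The
first-match-wins order and the use of fnmatch are assumptions made for
this example, not a description of how hg matches patterns.

```python
from fnmatch import fnmatch

# Rules in the order they appear in the quoted [encode] section.
RULES = [
    ("*.dsp", "none:"),
    ("**", "cleverencode:"),
]


def filter_for(path, rules=RULES):
    """Return the filter named by the first matching pattern, or None."""
    for pattern, name in rules:
        if fnmatch(path, pattern):
            return name
    return None


def ignore_list(paths, rules=RULES):
    """Paths whose filter is 'none:', i.e. files left unconverted."""
    return [path for path in paths if filter_for(path, rules) == "none:"]


print(ignore_list(["PCbuild/python.dsp", "Lib/os.py"]))
```

Under these assumptions the more specific pattern has to come first,
since the catch-all "**" rule matches every path.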

From mhammond at  Sun Aug 23 01:17:57 2009
From: mhammond at (Mark Hammond)
Date: Sun, 23 Aug 2009 09:17:57 +1000
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 22/08/2009 6:52 PM, Stephen J. Turnbull wrote:
> Mark Hammond writes:
>   >  On 22/08/2009 2:46 PM, Stephen J. Turnbull wrote:
>   >  Possibly - although I would expect the existing section names be reused
>   >  when applied to a versioned file, I'd be more than happy for the hg guys
>   >  to declare new names are appropriate for this.
> If there's already an [Encode] section, that's different.  (I don't know the
> details, I'm not that big a Mercurial fan.)  But you'd still need a
> way to differentiate win32text rules from other encoding rules.

As mentioned in my previous post, I'm trying to avoid bike-shedding what 
the hg guys are better placed to decree.  How they choose to spell these 
options is something for hg to decide, and I doubt my opinion matters 
enough to bother sharing, let alone advocating.

>   >  >    >   This way you aren't *enabling* extensions in this versioned file,
>   >  >
>   >  >  True, but how many people will just download the extension and enable
>   >  >  it?
>   >
>   >  In the ideal world, exactly as many people who would read the Python
>   >  developer guide, then download and install the extension based purely on
>   >  that.  IOW, it is Python itself setting the policy, so people need to
>   >  make their own decisions based on that, regardless of whether the tool
>   >  enforces it or not.
> You're missing the point.  I'm not talking about whether it will work
> for Python, I'm talking about the worry that somebody will post a way
> cool Python branch and require a private extension, which everybody
> will just automatically install and enable, which extension then
> proceeds to phone home to Spammer Haven, Inc. with the contents of
> your email contact list.  That's what I mean by "social engineering,"
> and why I worry about policy pushback from Mercurial HQ.

No, you are missing the point - social engineering doesn't require tool 
support - tools simply make certain things easier.

> Maybe that's more paranoid than they are....  But it can't hurt your
> cause to be ready for that kind of worry.

If this becomes seen as 'my' cause, I suspect it will run out of steam 
very quickly.  I truly hope python-dev, as a community, takes some 
ownership of this issue or I predict the effort will fizzle out without 
a workable solution.  There seem to be a number of people who agree the 
status-quo isn't acceptable, so I'm not sure what would happen in that
case...


From mhammond at  Sun Aug 23 01:37:43 2009
From: mhammond at (Mark Hammond)
Date: Sun, 23 Aug 2009 09:37:43 +1000
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>	<>	<>	<>	<>
	<> <>
Message-ID: <>

On 22/08/2009 7:09 PM, "Martin v. Löwis" wrote:
>>  From my POV, this would be required in some form or another before such
>> a scheme could actually work.  Without it we end up with an improved
>> win32text (good!)
> I still think this would be actually bad.
> Instead, a new extension should be written, with a name that does not
> have "win32" as a substring, and that has no provision for guessing
> line breaks by inspecting files.

To be clear, you are suggesting:

* Having hg enforce an extension as required is good.

* Python adopting win32text as that extension would be bad - instead 
another extension with different semantics (ie, no guessing based on 
file content) should be used, and enforced, instead.

Or have I misunderstood?

Assuming I am correct, I am inclined to agree - win32text may be "good 
enough" in the short term, but it is far from ideal.



From ben+python at  Sun Aug 23 03:52:14 2009
From: ben+python at (Ben Finney)
Date: Sun, 23 Aug 2009 11:52:14 +1000
Subject: [Python-Dev]
References: <>
Message-ID: <>

Benjamin Peterson <benjamin at> writes:

> I will leave a few initial comments.

Thank you.

"Martin v. Löwis" <martin at> writes:

> > In the case of <URL:> ("shlex have
> > problems with parsing unicode"), the problem is apparently addressed
> > by a patch, assigned to that issue since 2007-12-22.
> Apparently, or really? Did you review the patch?

No. The bug report showed that others had already tried it and said it
worked; and also, I don't consider myself qualified to review that
particular patch.

> > What is the procedure for finding out why an issue hasn't progressed?
> It's fairly simple: just read through the issue, and it should be
> obvious. In the specific case, no committer has ever commented on the
> issue, so chances are high that no committer has ever *seen* the
> issue.

Okay, so not obvious to someone (like me) who doesn't immediately know
who is or is not a committer. Thanks for the clarification.

> > I don't want to fill the bug database with such noise.
> And likely, posting to the issue won't be a way to find out, since no
> committer would see your comment.

I'll take that as confirmation that asking in this forum is the right

 \           "We spend the first twelve months of our children's lives |
  `\          teaching them to walk and talk and the next twelve years |
_o__)           telling them to sit down and shut up." -Phyllis Diller |
Ben Finney

From martin at  Sun Aug 23 09:16:49 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 23 Aug 2009 09:16:49 +0200
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>	<>	<>	<>	<>
	<> <>
Message-ID: <>

>>>  From my POV, this would be required in some form or another before such
>>> a scheme could actually work.  Without it we end up with an improved
>>> win32text (good!)
>> I still think this would be actually bad.
>> Instead, a new extension should be written, with a name that does not
>> have "win32" as a substring, and that has no provision for guessing
>> line breaks by inspecting files.
> To be clear, you are suggesting:
> * Having hg enforce an extension as required is good.

I have no opinion on that.

> * Python adopting win32text as that extension would be bad - instead
> another extension with different semantics (ie, no guessing based on
> file content) should be used, and enforced, instead.

Yes. The functionality being discussed should not be added to win32text.

> Assuming I am correct, I am inclined to agree - win32text may be "good
> enough" in the short term, but it is far from ideal.

I also feel that an extension that is inherently platform independent
and has a clear specification has much higher chances of becoming a
standard feature of Mercurial one day.


From martin at  Sun Aug 23 09:25:56 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 23 Aug 2009 09:25:56 +0200
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

> If this becomes seen as 'my' cause, I suspect it will run out of steam
> very quickly.  I truly hope python-dev, as a community, takes some
> ownership of this issue

That certainly won't happen. python-dev, as a community, has never ever
taken ownership of anything. It's always individuals who take ownership.

So you essentially say that you want somebody else (but not you) take
ownership - which, of course, is certainly fine. Hence my call for

> There seem to be a number of people who agree the
> status-quo isn't acceptable, so I'm not sure what would happen in that
> case...

My prediction is that it will depend on whether workable code is
available by the time a decision is made to migrate. If code is
available, then migration will happen (no matter whether the code
has an owner); if no code is available, migration will stall.


From shashank.sunny.singh at  Sun Aug 23 18:09:54 2009
From: shashank.sunny.singh at (Shashank Singh)
Date: Sun, 23 Aug 2009 21:39:54 +0530
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
Message-ID: <>

There is an interesting suggestion (
to add support to run encrypted zip files as python scripts.

No doubt this is a useful functionality to have but it would be great to
have some comments on whether
this can (or even should) feasibly be added as built-in support.

Shashank Singh
Senior Undergraduate, Department of Computer Science and Engineering
Indian Institute of Technology Bombay
shashank.sunny.singh at

From guido at  Sun Aug 23 22:09:33 2009
From: guido at (Guido van Rossum)
Date: Sun, 23 Aug 2009 13:09:33 -0700
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Aug 23, 2009 at 9:09 AM, Shashank
Singh<shashank.sunny.singh at> wrote:
> There is an interesting suggestion (
> to add support to run encrypted zip files as python scripts.
> No doubt this is a useful functionality to have but it would be great to
> have some comments on whether
> this can be(or even should be) feasibly added as an inbuilt support.

MvL already asked for a patch so I suppose that means he thinks it's
useful. Personally I've never encountered an encrypted zipfile, so I
just have questions: is there a standard encryption algorithm? What is
encrypted? The entire file or individual members? How are you supposed
to give the password? Also, I suppose there could be (US) export
problems with the code, so it would have to be optional (and we might
not be able to build it into binaries we distribute from

--Guido van Rossum (home page:

From brett at  Sun Aug 23 22:24:48 2009
From: brett at (Brett Cannon)
Date: Sun, 23 Aug 2009 13:24:48 -0700
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>
Message-ID: <>

There is a standard for encrypting entire zip files. And I was looking at
the zip docs the other day and zipfile can already decrypt but not encrypt
(assuming my memory is accurate; doing this from my phone on vacation).

On Aug 23, 2009 2:10 PM, "Guido van Rossum" <guido at> wrote:

> On Sun, Aug 23, 2009 at 9:09 AM, Shashank Singh
> <shashank.sunny.singh at> wrote:
>> There is an...
>
> MvL already asked for a patch so I suppose that means he thinks it's
> useful. Personally I've never encountered an encrypted zipfile, so I
> just have questions: is there a standard encryption algorithm? What is
> encrypted? The entire file or individual members? How are you supposed
> to give the password? Also, I suppose there could be (US) export
> problems with the code, so it would have to be optional (and we might
> not be able to build it into binaries we distribute from
>
> --Guido van Rossum (home page:

From guido at  Sun Aug 23 23:16:20 2009
From: guido at (Guido van Rossum)
Date: Sun, 23 Aug 2009 14:16:20 -0700
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <> 
Message-ID: <>

> On Aug 23, 2009 2:10 PM, "Guido van Rossum" <guido at> wrote:
> MvL already asked for a patch so I suppose that means he thinks it's
> useful. Personally I've never encountered an encrypted zipfile, so I
> just have questions: is there a standard encryption algorithm? What is
> encrypted? The entire file or individual members? How are you supposed
> to give the password? Also, I suppose there could be (US) export
> problems with the code, so it would have to be optional (and we might
> not be able to build it into binaries we distribute from

On Sun, Aug 23, 2009 at 1:24 PM, Brett Cannon<brett at> wrote:
> There is a standard for encrypting entire zip files. And I was looking at
> the zip docs the other day and zipfile can already decrypt but not encrypt
> (assuming my memory is accurate; doing this from my phone on vacation).

Ah, cool. Then the only issue for the patch presumably is an API to
provide the password. Passing it as a command-line flag seems very
insecure (though in some cases there may be no choice), so presumably
it needs to be prompted and read from stdin. (Though it appears from
skimming that it supports encrypted individual archive
members, not the zipfile as a whole. Also the docs mention that
decryption is "extremely slow as it is implemented in native python
rather than C.")
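
A minimal sketch of the prompting approach, using only the existing
zipfile decryption support. The helper name read_member and the UTF-8
encoding of the typed password are assumptions for this example; it does
not touch the interpreter code that runs zip files.

```python
import getpass
import zipfile


def read_member(archive_path, member, password=None):
    """Read one archive member, prompting only if it is flagged as encrypted."""
    with zipfile.ZipFile(archive_path) as archive:
        info = archive.getinfo(member)
        # Bit 0 of the general purpose flags marks an encrypted member.
        if info.flag_bits & 0x1 and password is None:
            prompt = "Password for %s: " % member
            # Assumption: the password is encoded as UTF-8 bytes.
            password = getpass.getpass(prompt).encode("utf-8")
        return archive.read(member, pwd=password)
```

Called as read_member("app.zip", "__main__.py"), this keeps the password
off the command line; the flag check means unencrypted archives never
trigger a prompt.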

Anyway it looks like if someone wants to try this, only the code in needs to be touched.

--Guido van Rossum (home page:

From martin at  Sun Aug 23 23:24:11 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 23 Aug 2009 23:24:11 +0200
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>
Message-ID: <>

>> No doubt this is a useful functionality to have but it would be great to
>> have some comments on whether
>> this can be(or even should be) feasibly added as an inbuilt support.
> MvL already asked for a patch so I suppose that means he thinks it's
> useful.

I am actually skeptical that it is implementable in a reasonable way;
if implemented, I'd say: why not?

> Personally I've never encountered an encrypted zipfile, so I
> just have questions: is there a standard encryption algorithm?

In principle, yes. There are several aspects of encryption described in

There are several encryption algorithms defined, such as
"traditional PKWARE", DES, 3DES, "original RC2", RC4, AES,
"corrected RC2", "corrected RC2-64", blowfish, twofish.	

In the file header general purpose bits, bit 0 indicates "file is
encrypted" (which means "traditional PKWARE"), bit 6 indicates "strong
encryption" (an additional header then giving details).
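
The two flag bits can be inspected through the flag_bits attribute of
zipfile.ZipInfo. The sketch below follows the description above; the
constant and function names are made up for this example.

```python
import io
import zipfile

ENCRYPTED = 0x0001          # bit 0: member is encrypted
STRONG_ENCRYPTION = 0x0040  # bit 6: strong encryption, details in an extra header


def encryption_kind(flag_bits):
    """Classify a member from its general purpose flag bits."""
    if not flag_bits & ENCRYPTED:
        return "none"
    if flag_bits & STRONG_ENCRYPTION:
        return "strong"
    return "traditional PKWARE"


def member_encryption(archive):
    """Map each member name of an open ZipFile to its encryption kind."""
    return {info.filename: encryption_kind(info.flag_bits)
            for info in archive.infolist()}


# Demonstration on an unencrypted archive built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("a.txt", "hello")
with zipfile.ZipFile(buffer) as archive:
    print(member_encryption(archive))
```

A loader could use such a check to decide, per member, whether a
password is needed at all.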

> What is encrypted? The entire file or individual members?

Traditionally, only individual files. With strong encryption (only?),
the central directory can also be encrypted.

> How are you supposed to give the password?

In pkzip: interactively. In the import support: this remains to be seen
in the patch. I assume people requesting that feature have a plan.

> Also, I suppose there could be (US) export
> problems with the code, so it would have to be optional (and we might
> not be able to build it into binaries we distribute from

The zipfile module already supports decryption. I forgot whether we
determined that support for decryption only doesn't fall under the
export restrictions, or whether we reported the module to the BXA as
well.

From greg at  Mon Aug 24 02:59:53 2009
From: greg at (Gregory P. Smith)
Date: Sun, 23 Aug 2009 17:59:53 -0700
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <> 
Message-ID: <>

On Sun, Aug 23, 2009 at 2:24 PM, "Martin v. Löwis" <martin at> wrote:

> >> No doubt this is a useful functionality to have but it would be great to
> >> have some comments on whether
> >> this can be(or even should be) feasibly added as an inbuilt support.
> >
> > MvL already asked for a patch so I suppose that means he thinks it's
> > useful.
> I am actual skeptical that it is implementable in a reasonable way;
> if implemented, I'd say: why not?
> > Personally I've never encountered an encrypted zipfile, so I
> > just have questions: is there a standard encryption algorithm?
> In principle, yes. There are several aspects of encryption described in
> There are several encryption algorithms defined, such as
> "traditional PKWARE", DES, 3DES, "original RC2", RC4, AES,
> "corrected RC2", "corrected RC2-64", blowfish, twofish.
> In the file header general purpose bits , bit 0 indicates "file is
> encrypted" (which means "traditional PKWARE"), bit 6 indicates "strong
> encryption" (an additional header then giving details).
> > What is encrypted? The entire file or individual members?
> Traditionally, only individual files. With strong encryption (only?),
> the central directory can also be encrypted.
> > How are you supposed to give the password?
> In pkzip: interactively. In the import support: this remains to be seen
> in the patch. I assume people requesting that feature have a plan.
> > Also, I suppose there could be (US) export
> > problems with the code, so it would have to be optional (and we might
> > not be able to build it into binaries we distribute from
> The zipfile module already supports decryption. I forgot whether we
> determined that support for decryption only doesn't fall under the
> export restrictions, or whether we reported the module to the BXA as
> well.

I doubt you can even classify the zipfile module's "decryption" support as
encryption.  It is trivially stupid, easily cracked (a 32bit crc based
"cipher").  The zipfile module does not support the various later encryption
schemes that use actual crypto algorithms.

I do not think we should support execution of python scripts or importing of
modules from encrypted zips.  I do not see a valid use case.


From guido at  Mon Aug 24 03:52:26 2009
From: guido at (Guido van Rossum)
Date: Sun, 23 Aug 2009 18:52:26 -0700
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <> 
Message-ID: <>

On Sun, Aug 23, 2009 at 5:59 PM, Gregory P. Smith<greg at> wrote:
> I doubt you can even classify the zipfile module's "decryption" support as
> encryption.  It is trivially stupid, easily cracked (a 32bit crc based
> "cipher").  The zipfile module does not support the various later encryption
> schemes that use actual crypto algorithms.

Oops. I guess this is what Martin called "traditional PKWARE".

Quite separate from the current thread it might make sense to support
the stronger encryption schemes in zipfile.

> I do not think we should support execution of python scripts or importing of
> modules from encrypted zips.  I do not see a valid use case.

I am still awaiting a use case too (for running an encrypted script).
I notice that the OP hasn't replied yet. Let's give them a chance. (I
added Shashank back to the thread just in case.)

--Guido van Rossum (home page:

From ncoghlan at  Mon Aug 24 04:09:02 2009
From: ncoghlan at (Nick Coghlan)
Date: Mon, 24 Aug 2009 12:09:02 +1000
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>
Message-ID: <>

Guido van Rossum wrote:
> Anyway it looks like if someone wants to try this, only the code in
> needs to be touched.

The necessary work would actually be in zipimport. runpy doesn't know
anything about the details of where the module code comes from, it just
asks the relevant importer for the details. For zipfile and directory
execution, they get added to the start of sys.path and then runpy is
invoked to look for the module "__main__". From that point on most of
the heavy lifting is handled by the regular import machinery (aside from
using the pkgutil emulation for the basic import behaviour that isn't
fully exposed by the imp module).
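A minimal sketch of the delegation described above: a zip containing `__main__.py` is handed to runpy, which puts the archive on sys.path and lets the normal import machinery (zipimport) find the code. It uses `runpy.run_path`, which appeared in Python releases later than this thread; the archive layout and the names `demo_app.zip` and `demo_helper` are hypothetical.

```python
import os
import runpy
import tempfile
import zipfile


def build_demo_zip(directory):
    """Write a tiny zip with a __main__.py and one helper module."""
    path = os.path.join(directory, "demo_app.zip")
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("demo_helper.py", "ANSWER = 42\n")
        zf.writestr("__main__.py", "import demo_helper\nRESULT = demo_helper.ANSWER\n")
    return path


def run_zip_main(zip_path):
    """Execute __main__ from the archive and return its module globals.

    runpy itself never opens the zip; it adds the path to sys.path and
    asks the import system for the module named "__main__".
    """
    return runpy.run_path(zip_path, run_name="__main__")


with tempfile.TemporaryDirectory() as tmp:
    namespace = run_zip_main(build_demo_zip(tmp))
    print(namespace["RESULT"])
```

Note that there is no obvious place in this flow to pass a password, which is the difficulty raised in this thread.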

I added a -1 to the tracker issue as well. That's due both to my opinion
on the inherent idiocy of DRM (since shared secrets don't provide
any security when the attacker in your threat model is one of the people
you are sharing the secret with) and to the fact that associating
passwords with the relevant zipfile entries on sys.path would get messy
fairly quickly.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From shashank.sunny.singh at  Mon Aug 24 04:39:00 2009
From: shashank.sunny.singh at (Shashank Singh)
Date: Mon, 24 Aug 2009 08:09:00 +0530
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Aug 24, 2009 at 7:39 AM, Nick Coghlan <ncoghlan at> wrote:

> Guido van Rossum wrote:
> > Anyway it looks like if someone wants to try this, only the code in
> > needs to be touched.
> The necessary work would actually be in zipimport. runpy doesn't know
> anything about the details of where the module code comes from, it just
> asks the relevant importer for the details. For zipfile and directory
> execution, they get added to the start of sys.path and then runpy is
> invoked to look for the module "__main__". From that point on most of
> the heavy lifting is handled by the regular import machinery (aside from
> using the pkgutil emulation for the basic import behaviour that isn't
> fully exposed by the imp module).
> I added a -1 to the tracker issue as well. That's due both to my opinion
> on the inherent idiocy of DRM though (since shared secrets don't provide
> any security when the attacker in your threat model is one of the people
> you are sharing the secret with) and to the fact that associating
> passwords with the relevant zipfile entries on sys.path would get messy
> fairly quickly.
> Cheers.
> Nick.
> --
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia
> ---------------------------------------------------------------
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at
> Unsubscribe:

Shashank Singh
Senior Undergraduate, Department of Computer Science and Engineering
Indian Institute of Technology Bombay
shashank.sunny.singh at

From shashank.sunny.singh at  Mon Aug 24 04:40:02 2009
From: shashank.sunny.singh at (Shashank Singh)
Date: Mon, 24 Aug 2009 08:10:02 +0530
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>
Message-ID: <>

oops..sorry for the empty mail :P

On Mon, Aug 24, 2009 at 8:09 AM, Shashank Singh <
shashank.sunny.singh at> wrote:

> On Mon, Aug 24, 2009 at 7:39 AM, Nick Coghlan <ncoghlan at> wrote:
>> Guido van Rossum wrote:
>> > Anyway it looks like if someone wants to try this, only the code in
>> > needs to be touched.
>> The necessary work would actually be in zipimport. runpy doesn't know
>> anything about the details of where the module code comes from, it just
>> asks the relevant importer for the details. For zipfile and directory
>> execution, they get added to the start of sys.path and then runpy is
>> invoked to look for the module "__main__". From that point on most of
>> the heavy lifting is handled by the regular import machinery (aside from
>> using the pkgutil emulation for the basic import behaviour that isn't
>> fully exposed by the imp module).

That is where I see the problem in creating a natural approach. Correct me
if I am wrong here, but since runpy doesn't know anything about the script
being a zip file, to add such support we will have to break the current
delegation mechanism and bring runpy into the loop too.

Also, since a zip file is automatically checked for (I believe there are no
switches to specify that the script is a zip), will it not be a two-trip
mechanism: you naively try to run a zip, get an error (say
ERR_ZIP_ENCRYPTED), and then ask for

>> I added a -1 to the tracker issue as well. That's due both to my opinion
>> on the inherent idiocy of DRM though (since shared secrets don't provide
>> any security when the attacker in your threat model is one of the people
>> you are sharing the secret with) and to the fact that associating
>> passwords with the relevant zipfile entries on sys.path would get messy
>> fairly quickly.
>> Cheers.
>> Nick.
>> --
>> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia
>> ---------------------------------------------------------------
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at
>> Unsubscribe:
> --
> Regards
> Shashank Singh
> Senior Undergraduate, Department of Computer Science and Engineering
> Indian Institute of Technology Bombay
> shashank.sunny.singh at

Shashank Singh
Senior Undergraduate, Department of Computer Science and Engineering
Indian Institute of Technology Bombay
shashank.sunny.singh at

From guido at  Mon Aug 24 04:41:22 2009
From: guido at (Guido van Rossum)
Date: Sun, 23 Aug 2009 19:41:22 -0700
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <> 
Message-ID: <>

OMG, the use case is actually running a script without giving the user
access to the script's source? Agreed that's a big -1.

I thought it was just for running a zip containing code so secret you
don't want to leave it around on your hard drive without encryption
(say, the program you use to compute your employees' bonuses, or
perhaps a patented algorithm for detecting spam). That use case would
make a small amount of sense, though I personally don't care enough to
write the code to support it.


On Sun, Aug 23, 2009 at 7:09 PM, Nick Coghlan<ncoghlan at> wrote:
> Guido van Rossum wrote:
>> Anyway it looks like if someone wants to try this, only the code in
>> needs to be touched.
> The necessary work would actually be in zipimport. runpy doesn't know
> anything about the details of where the module code comes from, it just
> asks the relevant importer for the details. For zipfile and directory
> execution, they get added to the start of sys.path and then runpy is
> invoked to look for the module "__main__". From that point on most of
> the heavy lifting is handled by the regular import machinery (aside from
> using the pkgutil emulation for the basic import behaviour that isn't
> fully exposed by the imp module).
> I added a -1 to the tracker issue as well. That's due both to my opinion
> on the inherent idiocy of DRM though (since shared secrets don't provide
> any security when the attacker in your threat model is one of the people
> you are sharing the secret with) and to the fact that associating
> passwords with the relevant zipfile entries on sys.path would get messy
> fairly quickly.

--Guido van Rossum (home page:

From mhammond at  Mon Aug 24 04:59:16 2009
From: mhammond at (Mark Hammond)
Date: Mon, 24 Aug 2009 12:59:16 +1000
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>
	<> <>
Message-ID: <>

On 23/08/2009 5:25 PM, "Martin v. Löwis" wrote:
>> If this becomes seen as 'my' cause, I suspect it will run out of steam
>> very quickly.  I truly hope python-dev, as a community, takes some
>> ownership of this issue
> That certainly won't happen. python-dev, as a community, has never ever
> taken ownership of anything. It's always individuals who take ownership.

I believe ownership of a task and ownership of a cause are somewhat
different.

In other words, I'm happy to take ownership of a number of tasks
relating to this cause, but if the general feeling is that it is my
cause rather than *our* cause, then I will probably opt-out - I'm taking
these tasks on at this moment purely because I believe it *is* a common
cause.

> So you essentially say that you want somebody else (but not you) take
> ownership - which, of course, is certainly fine. Hence my call for
> volunteers.

Hence my volunteering and the time I am currently spending.

>> There seem to be a number of people who agree the
>> status-quo isn't acceptable, so I'm not sure what would happen in that
>> case...
> My prediction is that it will depend on whether workable code is
> available by the time a decision is made to migrate. If code is
> available, then migration will happen (no matter whether the code
> has an owner); if no code is available, migration will stall.

Right - I guess we are all still struggling with exactly what "workable 
code" means in this context.



From ncoghlan at  Mon Aug 24 05:15:25 2009
From: ncoghlan at (Nick Coghlan)
Date: Mon, 24 Aug 2009 13:15:25 +1000
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>
Message-ID: <>

Guido van Rossum wrote:
> OMG, the use case is actually running a script without giving the user
> access to the script's source? Agreed that's a big -1.
> I thought it was just for running a zip containing code so secret you
> don't want to leave it around on your hard drive without encryption
> (say, the program you use to compute your employees' bonuses, or
> perhaps a patented algorithm for detecting spam). That use case would
> make a small amount of sense, though I personally don't care enough to
> write the code to support it.

Actually, the issue posting doesn't say either way - it doesn't provide
any real use cases at all.

For local protection of confidential information there are already much
better solutions out there (e.g. whole disk encryption, OS file
permissions, OS folder encryption), so a poor-man's DRM was the only
remaining remotely plausible use case I could see (and that's a bad idea
for all the reasons that DRM is almost always a bad idea).

Now, that could just be a failure of imagination on my part, but genuine
use case suggestions for the feature have been nonexistent so far.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Mon Aug 24 05:20:34 2009
From: ncoghlan at (Nick Coghlan)
Date: Mon, 24 Aug 2009 13:20:34 +1000
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>
	<> <>
Message-ID: <>

Mark Hammond wrote:
> On 23/08/2009 5:25 PM, "Martin v. Löwis" wrote:
>>> If this becomes seen as 'my' cause, I suspect it will run out of steam
>>> very quickly.  I truly hope python-dev, as a community, takes some
>>> ownership of this issue
>> That certainly won't happen. python-dev, as a community, has never ever
>> taken ownership of anything. It's always individuals who take ownership.
> I believe ownership of a task and ownership of a cause are somewhat
> different.
> In other words, I'm happy to take ownership of a number of tasks
> relating to this cause, but if the general feeling is that it is my
> cause rather than *our* cause, then I will probably opt-out - I'm taking
> these tasks on at this moment purely because I believe it *is* a common
> cause.

If by ownership of the cause you just mean "acceptable handling of line
conversions" as being one of the criteria that must be dealt with before
the switch to hg actually happens, then I think you have that agreement

We're not going to accept a regression in line handling from what SVN
provides. Your proposed improvements to win32text (possibly in the form
of a new extension based on win32text rather than a new version of
win32text itself) along with server side enforcement sound like they
will meet the need.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From shashank.sunny.singh at  Mon Aug 24 06:23:24 2009
From: shashank.sunny.singh at (Shashank Singh)
Date: Mon, 24 Aug 2009 09:53:24 +0530
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>
Message-ID: <>

A little off topic, but the zipfile doc says: "Decryption is extremely slow as
it is implemented in native python rather than C".

Why is this limitation there? I mean, is there any specific reason for not
implementing it in C?

On Mon, Aug 24, 2009 at 8:45 AM, Nick Coghlan <ncoghlan at> wrote:

> Guido van Rossum wrote:
> > OMG, the use case is actually running a script without giving the user
> > access to the script's source? Agreed that's a big -1.
> >
> > I thought it was just for running a zip containing code so secret you
> > don't want to leave it around on your hard drive without encryption
> > (say, the program you use to compute your employees' bonuses, or
> > perhaps a patented algorithm for detecting spam). That use case would
> > make a small amount of sense, though I personally don't care enough to
> > write the code to support it.
> Actually, the issue posting doesn't say either way - it doesn't provide
> any real use cases at all.
> For local protection of confidential information there are already much
> better solutions out there (e.g. whole disk encryption, OS file
> permissions, OS folder encryption), so a poor-man's DRM was the only
> remaining remotely plausible use case I could see (and that's a bad idea
> for all the reasons that DRM is almost always a bad idea).
> Now, that could just be a failure of imagination on my part, but genuine
> use case suggestions for the feature have been non existent so far.
> Cheers,
> Nick.
> --
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia
> ---------------------------------------------------------------
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at
> Unsubscribe:

Shashank Singh
Senior Undergraduate, Department of Computer Science and Engineering
Indian Institute of Technology Bombay
shashank.sunny.singh at

From guido at  Mon Aug 24 06:31:49 2009
From: guido at (Guido van Rossum)
Date: Sun, 23 Aug 2009 21:31:49 -0700
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <> 
Message-ID: <>

Because it is easier to write in Python, and (as Greg explained) the
encryption is so lousy that you're unlikely to find heavy use of it.
Therefore nobody (so far) has cared to write an accelerator in C.

On Sun, Aug 23, 2009 at 9:23 PM, Shashank
Singh<shashank.sunny.singh at> wrote:
> A little off topic but the zipfile doc says: "Decryption is extremely slow as
> it is implemented in native python rather than C".
> Why is this limitation there? I mean, is there any specific reason for not
> implementing it in C?
> On Mon, Aug 24, 2009 at 8:45 AM, Nick Coghlan <ncoghlan at> wrote:
>> Guido van Rossum wrote:
>> > OMG, the use case is actually running a script without giving the user
>> > access to the script's source? Agreed that's a big -1.
>> >
>> > I thought it was just for running a zip containing code so secret you
>> > don't want to leave it around on your hard drive without encryption
>> > (say, the program you use to compute your employees' bonuses, or
>> > perhaps a patented algorithm for detecting spam). That use case would
>> > make a small amount of sense, though I personally don't care enough to
>> > write the code to support it.
>> Actually, the issue posting doesn't say either way - it doesn't provide
>> any real use cases at all.
>> For local protection of confidential information there are already much
>> better solutions out there (e.g. whole disk encryption, OS file
>> permissions, OS folder encryption), so a poor-man's DRM was the only
>> remaining remotely plausible use case I could see (and that's a bad idea
>> for all the reasons that DRM is almost always a bad idea).
>> Now, that could just be a failure of imagination on my part, but genuine
>> use case suggestions for the feature have been non existent so far.
>> Cheers,
>> Nick.
>> --
>> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia
>> ---------------------------------------------------------------
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at
>> Unsubscribe:
> --
> Regards
> Shashank Singh
> Senior Undergraduate, Department of Computer Science and Engineering
> Indian Institute of Technology Bombay
> shashank.sunny.singh at
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at
> Unsubscribe:

--Guido van Rossum (home page:

From asmodai at  Mon Aug 24 10:54:05 2009
From: asmodai at (Jeroen Ruigrok van der Werven)
Date: Mon, 24 Aug 2009 10:54:05 +0200
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>
Message-ID: <>

-On [20090823 22:10], Guido van Rossum (guido at wrote:
>Also, I suppose there could be (US) export problems with the code, so it
>would have to be optional (and we might not be able to build it into
>binaries we distribute from

For all I know the website and repository are both located at the XS4All
colocation, so how do US export problems apply? They would apply if the box
were hosted in the USA, but it isn't, for all I know.

It's one of the reasons FreeBSD had their ebones repository located in South
Africa back in the day. Nowadays they can just include all the relevant bits
in the repository. So I wonder how applicable the entire US export
restriction still is nowadays.

In short: I don't think we have much to worry about.

Jeroen Ruigrok van der Werven <asmodai(-at-)> / asmodai
GPG: 2EAC625B
Wisdom is the difference between knowledge and experience...

From guido at  Mon Aug 24 16:46:09 2009
From: guido at (Guido van Rossum)
Date: Mon, 24 Aug 2009 07:46:09 -0700
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <> 
Message-ID: <>

On Mon, Aug 24, 2009 at 1:54 AM, Jeroen Ruigrok van der
Werven<asmodai at> wrote:
> -On [20090823 22:10], Guido van Rossum (guido at wrote:
>>Also, I suppose there could be (US) export problems with the code, so it
>>would have to be optional (and we might not be able to build it into
>>binaries we distribute from
> For all I know the website and repository are both located @ the XS4All
> colocation, so how does US export problems apply? It would apply if the box
> would've been hosted in the USA, but they're not for all I know.
> It's one of the reasons FreeBSD had their ebones repository located in Zuid
> Afrika back in the day. Nowadays they can just include all the relevant bits
> in the repository. So I wonder how applicable the entire US export
> restriction still is nowadays.
> In short: I don't think we have much to worry about.

Are you a lawyer? Do you know the legal history of Python
distributions and the US export laws? It's not so easy -- for one, the
PSF (a US foundation) owns the copyright.

--Guido van Rossum (home page:

From chris at  Mon Aug 24 18:03:15 2009
From: chris at (Chris Withers)
Date: Mon, 24 Aug 2009 17:03:15 +0100
Subject: [Python-Dev]
In-Reply-To: <>
References: <>
Message-ID: <>

Guido van Rossum wrote:
> Anyway it looks like if someone wants to try this, only the code in
> needs to be touched.

Where is to be found?

I'm trying to find whatever implements python -m and the other python 
command line options...


Simplistix - Content Management, Batch Processing & Python Consulting

From benjamin at  Mon Aug 24 18:07:49 2009
From: benjamin at (Benjamin Peterson)
Date: Mon, 24 Aug 2009 11:07:49 -0500
Subject: [Python-Dev]
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/24 Chris Withers <chris at>:
> Guido van Rossum wrote:
>> Anyway it looks like if someone wants to try this, only the code in
>> needs to be touched.
> Where is to be found?

$ find . -name ""


From peter at  Mon Aug 24 19:57:09 2009
From: peter at (Peter Moody)
Date: Mon, 24 Aug 2009 10:57:09 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Aug 21, 2009 at 4:41 PM, Nick Coghlan<ncoghlan at> wrote:
> Peter Moody wrote:
>> On Thu, Aug 20, 2009 at 10:15 PM, Case Vanhorsen<casevh at> wrote:
>>> I was surprised that IP('') returned
>>> IPv4Address('') instead of IPv4Address(''). I
>>> know I can change the behavior by using host=True, but then
>>> IP('', host=True) will raise an error. It makes more
>>> sense, at least to me, that if I input just an IP address, I get an IP
>>> address back. I would prefer that IP('') return an
>>> IPv4Network and IP('') return an IPv4Address.
>> I think you mean that it returns an IPv4Network object (not
>> IPv4Address).  My suggestion there is that if you know you're dealing
>> with an address, use one of the IPvXAddress classes (or pass host=True
>> to the IP function). IP is just a helper function and defaulting to a
>> network with a /32 prefix seems relatively common.
>> Knowing that my experience may not always be the most common, I can
>> change this behavior if it's indeed confusing, but in my conversations
>> with others and in checking out the current state of ip address
>> libraries, this seems to be a perfectly acceptable default.
> The IP() helper function actually bothers me a bit - it's a function
> where the return *type* depends on the parameter *value*. While
> switching between IPv4 and IPv6 based on value is probably a necessary
> behaviour, perhaps it would be possible to get rid of the "host=True"
> ugliness and instead have two separate helper functions:
>   IP() - returns either IPv4Address or IPv6Address
>   IPNetwork() - returns either IPv4Network or IPv6Network

IPAddress() and IPNetwork() seem a little less confusing. I'll get rid
of IP() since it seems to be the source of a fair bit of confusion.

I've added this to the pep3144 change. I still need to update the PEP to
reflect this now.
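For readers wanting to see the address/network split in practice: the standard library's `ipaddress` module, which grew out of this PEP, provides one helper per kind of object, named `ip_address()` and `ip_network()` rather than the `IPAddress()`/`IPNetwork()` spellings discussed here. The sketch below uses that module; the wrapper functions and the documentation-range addresses are assumptions made for the example.

```python
import ipaddress


def describe_address(text):
    """Parse a host address; the IPv4/IPv6 choice follows from the value."""
    addr = ipaddress.ip_address(text)   # IPv4Address or IPv6Address
    return type(addr).__name__, addr.version


def describe_network(text):
    """Parse a network; a bare address becomes a single-host network."""
    net = ipaddress.ip_network(text)    # IPv4Network or IPv6Network
    return type(net).__name__, net.prefixlen


print(describe_address("192.0.2.1"))
print(describe_address("2001:db8::1"))
print(describe_network("192.0.2.0/24"))
print(describe_network("192.0.2.1"))
```

The caller picks host or network explicitly, and only the IP version is inferred from the value.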

> Both would still accept a version argument, allowing programmatic
> control of which version to accept. If an unknown version is passed then
> some kind of warning or error should be emitted rather than the current
> silent fallback to attempting to guess the version based on the value.
> I would suggest removing the corresponding IPv4 and IPv6 helper
> functions altogether.


> My rationale for the above is that hosts and networks are *not* the same
> thing. For any given operation, the programmer should know whether they
> want a host or a network and ask for whichever one they want. The
> IPv4/IPv6 distinction, on the other hand, is something that a lot of
> operations are going to be neutral about, so it makes sense to deal with
> the difference implicitly.

makes sense to me.

> Other general comments:
> - the module appears to have quite a few isinstance() checks against
> non-abstract base classes. Either these instance checks should all be
> removed (relying on pure duck-typing instead) or else the relevant
> classes should be turned into ABCs. (Note: this comment doesn't apply to
> the type dispatch in the constructor methods)

I'll look through this.

> - the reference implementation has aliased "CamelCase" names with the
> heading "backwards compatibility". This is inappropriate for a standard
> library submission (although I can see how it would be useful if you
> were already using a different IP address library).

This is for legacy reasons (how it's used at work). It's fully
expected that those will disappear if/when this is accepted.

> - isinstance() accepts a tuple of types, so isinstance(address, (int,
> long)) is a shorter way of writing "isinstance(address, int) or
> isinstance(address, long)". The former also has the virtue of executing
> faster. However, an even better approach would be to use
> operator.index() in order to accept all integral types rather than just
> the builtin ones.

I can easily make the change to checking tuples.  I'd have to look
further at the operator.index() to see what's required.
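A small sketch of what the `operator.index()` suggestion could look like. It is written for Python 3, where `long` no longer exists, and the function and class names are hypothetical, not taken from the reference implementation.

```python
import operator


def to_int_address(address):
    """Return address as a plain int, accepting any integral type."""
    try:
        # operator.index() calls __index__, so floats and strings are rejected
        return operator.index(address)
    except TypeError:
        raise TypeError("address must be an integer, not %s"
                        % type(address).__name__) from None


class PackedAddress:
    """Hypothetical integral wrapper, present only to exercise __index__."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


print(to_int_address(3232235777))
print(to_int_address(PackedAddress(1)))

# The tuple form of isinstance mentioned above, for comparison:
print(isinstance(1.5, (int, str)))
```

Compared with an isinstance check against builtin types, this accepts any object that declares itself integral through `__index__`.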

> Cheers,
> Nick.
> --
> Nick Coghlan ? | ? ncoghlan at ? | ? Brisbane, Australia
> ---------------------------------------------------------------

From larry.bugbee at  Mon Aug 24 20:52:55 2009
From: larry.bugbee at (Bugbee, Larry)
Date: Mon, 24 Aug 2009 11:52:55 -0700
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>
Message-ID: <>

I like the idea, but...

Here is a quick list of things to think about and if some of this has
already been mentioned, sorry.

Speed:  Encryption speed has been mentioned.  For short scripts this may
not be a problem, although algorithms implemented in C would be faster.

Strength:  Passwords are [very] weak, especially if of the 6-10
alphanumeric variety.  True secret keys, where all bit combinations are
used, are stronger.  Entering passwords has been mentioned but I believe
only passwords were assumed.  It is better to not provide any encryption
than to lure novices into believing they are secure when they are not.

Algorithms:  Be sure to choose good ones and allow for changing later.

Key distribution:  How to distribute secret keys beyond a small group of
friends is problematic.  In short it doesn't scale.  Looking to
public-private key pairs can be equally problematic.  This can get you
into encryption certs, but *how* you use them correctly differs from
signing certs.  More on this later if you want. 

ZIP:  Look beyond just zip files.  A scheme that works for any/all files
in the distribution, not just ZIPs, would be better.  (IIRC there have
been problems with encrypted zips, but that was years ago.  Those issues
may have been fixed.)

Short version:  Doing this right is hard.  Simply supporting a password
based ZIP file is, in my opinion, not real protection.
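To put rough numbers on the "Strength" point above, here is a back-of-the-envelope sketch. It assumes each password character is chosen uniformly at random from 62 alphanumeric symbols, which is generous to human-chosen passwords, and compares the result with a 128-bit random key. The function name and the 128-bit figure are illustrative assumptions.

```python
import math


def password_bits(length, alphabet_size=62):
    """Upper bound on password entropy in bits, assuming uniform random choice."""
    return length * math.log2(alphabet_size)


KEY_BITS = 128  # illustrative size of a true random secret key

for length in (6, 8, 10):
    bits = password_bits(length)
    # Each missing bit doubles the work an attacker saves.
    print("%2d chars: about %.1f bits, %.0f bits short of a %d-bit key"
          % (length, bits, KEY_BITS - bits, KEY_BITS))
```

Even the 10-character case stays under 60 bits, and real passwords are rarely uniformly random.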

Gotta go.  Later.


From guido at  Mon Aug 24 21:09:36 2009
From: guido at (Guido van Rossum)
Date: Mon, 24 Aug 2009 12:09:36 -0700
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <> 
Message-ID: <>

On Mon, Aug 24, 2009 at 11:52 AM, Bugbee, Larry<larry.bugbee at> wrote:
> I like the idea, but...

For what use case?

--Guido van Rossum (home page:

From larry.bugbee at  Mon Aug 24 22:01:07 2009
From: larry.bugbee at (Bugbee, Larry)
Date: Mon, 24 Aug 2009 13:01:07 -0700
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>
Message-ID: <>

> > I like the idea, but...
> For what use case?

I don't have a specific case in mind.  In general, however, it would be
nice to be able to protect intellectual property, but without addressing
the problem from a holistic view, there is little protection afforded
and perhaps a lot of unrewarded work.  

And I forgot one: distribution of crypto across certain international
borders.  Export/import laws by themselves can be a showstopper.

From martin at  Mon Aug 24 22:39:17 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 24 Aug 2009 22:39:17 +0200
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

Bugbee, Larry wrote:
>>> I like the idea, but...
>> For what use case?
> I don't have a specific case in mind.  In general, however, it would be
> nice to be able to protect intellectual property

This I'm also unclear about. How does it protect intellectual property?
Won't the person running the zipfile have to enter the password? Whom
would you protect the IP from?


From digitalxero at  Mon Aug 24 23:01:11 2009
From: digitalxero at (Dj Gilcrease)
Date: Mon, 24 Aug 2009 15:01:11 -0600
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Aug 24, 2009 at 2:01 PM, Bugbee, Larry<larry.bugbee at> wrote:
> I don't have a specific case in mind.  In general, however, it would be
> nice to be able to protect intellectual property, but without addressing
> the problem from a holistic view, there is little protection afforded
> and perhaps a lot of unrewarded work.

I would think just distributing pyc files would achieve that goal

From larry.bugbee at  Mon Aug 24 23:16:42 2009
From: larry.bugbee at (Bugbee, Larry)
Date: Mon, 24 Aug 2009 14:16:42 -0700
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

>>>> I like the idea, but...
>>> For what use case?
>> I don't have a specific case in mind.  In general, however, it 
>> would be nice to be able to protect intellectual property
> This I'm also unclear about. How does it protect intellectual
> property? Won't the person running the zipfile have to enter the 
> password? Whom would you protect the IP from?

I agree the IP will have to be exposed at some point to be useful, but let's not overlook other things that could be in play like PIA agreements and the like.  Also, something stronger than a password will be needed to be secure, and secret key distribution does not scale.  There is a lot more to consider and we are only scratching the surface.  Confidentiality in-the-large will take far more than encrypted ZIP files.  

Please know that I am not pushing for the encryption of ZIP files and this thread is going down a path I did not intend or desire to pursue.  My original post was intended to increase the awareness in those thinking encrypted ZIP files will 1) be easy, 2) afford the protection they desire, and 3) not lead others into a sense of false security.  Encryption sounds good, but doing it right can be a minefield.  A quick fix to support ZIP files will likely create more problems than it will solve.

I still say it would be *nice* if there was some way to protect IP.  I have no expectations that it will be easy, and least of all, solved by encrypted ZIP files and a simple patch to Python.  ...but that should not diminish the desire.  


From drkjam at  Tue Aug 25 00:24:48 2009
From: drkjam at (DrKJam)
Date: Mon, 24 Aug 2009 23:24:48 +0100
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

Good evening fellow Pythonistas,

Considering a PEP is now available I'd like to join this discussion and
raise several points with regard to both the PEP and the ipaddr reference
implementation put forward with it.

1) Firstly, an offering of code.

I'd like to bring to your attention an example implementation of an IP
address library and interface for general discussion to compare and contrast
with ipaddr 2.0.x :-

It is based on netaddr 0.7.2 which I threw together earlier today.

In essence, I've stripped out all of what could be considered non-essential
code for a purely IP related library. This branch should be suitable for
*theoretical* consideration of inclusion into some future version of the
Python standard library (with a little work).

It is a pure subset of netaddr release 0.7.2, *minus* the following :-

- all IEEE layer-2 code
- some fairly non-essential IANA IP data files and lookup code
- IP globbing code (fairly niche)

Aside: Just a small mention here that I listened carefully to Clay McClure's
and others criticisms of the previous incarnation of ipaddr. The 0.7.x
series of netaddr breaks backward compatibility with previous netaddr
releases and is an "answer" of sorts to that discussion and issue raised
within the Python community. I hope you like what I've done with it.

For the purposes of this discussion consider this branch the "Firefox to
netaddr's Mozilla" or maybe just plain old "netaddr-ip-lite" ;-)

2) I refute the bold claim in the PEP that :-

    "Finding a good library for performing those tasks can be somewhat more
difficult."

On the contrary, I wager that netaddr is now a perfectly decent alternative
implementation to ipaddr, containing quite a few more features with little
of the slowness for most common operations, 2/3x faster in a lot of cases,
not that we're counting. What a difference a year makes! I also rate IPy
quite highly even if it is getting a little "long in the tooth". For a lot
of users, IPy could also be considered a nice, stable API!

By the same token I'm happy to note some convergence between the ipaddr and
netaddr's various interfaces, particularly in light of discussions and
arguments put forward by Clay McClure and others. A satisfactory compromise
between the two however still seems a way off.

3) I also disagree with the PEP's claim that :-

    "attempts to combine [IPv4 and IPv6] into one object would be like
trying to force a round peg into a square hole (or vice versa)".

netaddr (and for that matter IPy) cope with this perceived problem
admirably.

netaddr employs a simple variant of the GoF Strategy design pattern (with
added Python sensibility). In the rare cases where ambiguity exists between
IPv4 and IPv6 addresses a version parameter may be passed to the constructor
of the IPAddress class to differentiate between them. Providing an IP
address version to the constructor also provides a small performance
improvement.

IPv4 and IPv6 addresses can be used interchangeably throughout netaddr
without causing issue during operations such as sorting, merging (known in
the PEP as "address collapsing") or address exclusion.

Don't try and do this with the current reference implementation of ipaddr :-

>>> collapse_address_list([IPv4Address(''), IPv6Address('::')])
[IPv4Network('')]

OUCH! Even if this isn't allowed (according to the documentation), it should
raise an Exception rather than silently passing through.

I actually raised this back in May on the ipaddr bug tracker but it hasn't
received any attention so far :-

Compare this with netaddr's behaviour :-

>>> cidr_merge([IPAddress(''), IPAddress('::')])
[IPNetwork(''), IPNetwork('::')]

That's more like it.

4) It may just be me but the design of this latest incarnation of ipaddr
seems somewhat complicated for so few lines of code. Compared with ipaddr,
netaddr doesn't use or require multiple inheritance nor a seemingly
convoluted inheritance hierarchy. There isn't a need for an IP() type
'multiplexer' function either (although I might be missing an important use
case here). But, then again, this may just be my personal preference talking
here. I prefer composition over inheritance in most cases.

In netaddr, if a user wants to represent an IP address (without netmask),
they should use the IPAddress class, if they want to represent an IP
address with some form of mask, they should use the IPNetwork class.

5) The ipaddr library is also missing options for expanding various
(exceedingly common) IP abbreviations.

>>> from netaddr import IPNetwork

>>> IPNetwork('10/8', True)
IPNetwork('')

netaddr also handles classful IP address logic, still pervasive throughout
modern IP stacks :-

>>> IPNetwork('', True)
IPNetwork('')

Note that these options are disabled by default, to keep the speed of the
IPNetwork constructor up for more normal cases.
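
To make the idea of abbreviation expansion concrete, here is a minimal
sketch of padding a shortened IPv4 CIDR such as '10/8' out to four octets.
It is not netaddr's implementation: the helper name expand_abbreviated_cidr
is hypothetical, and it leans on the standard-library ipaddress module
(added in Python 3.3) purely to validate the result.

```python
import ipaddress


def expand_abbreviated_cidr(text):
    """Pad a shortened IPv4 CIDR such as '10/8' to four octets.

    Hypothetical helper for illustration only; it does not apply
    classful rules when the prefix is missing.
    """
    addr, _, prefix = text.partition("/")
    octets = addr.split(".")
    if not 1 <= len(octets) <= 4:
        raise ValueError("expected 1 to 4 octets, got %r" % text)
    # Missing trailing octets are taken to be zero.
    octets += ["0"] * (4 - len(octets))
    full = ".".join(octets)
    if prefix:
        full = full + "/" + prefix
    # strict=False tolerates host bits being set, e.g. '10.1/8'.
    return ipaddress.ip_network(full, strict=False)


ten_slash_eight = expand_abbreviated_cidr("10/8")
private_16 = expand_abbreviated_cidr("192.168/16")
print(ten_slash_eight, private_16)
```

Keeping this in a separate helper, rather than in the constructor, is one
way to leave the common fully-specified case untouched.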

6) netaddr currently contains a lot of useful features absent in ipaddr that
would be extremely useful in a general, "lightweight" IP library.

For example, it already contains routines for :-

- arbitrary address range calculations
- full assistance for IPv4-mapped/compatible IPv6 addressing
- a fully functional IPSet class which allows you to perform operations such
as unions, intersections and symmetric differences between lists of
IPNetwork (CIDR) objects.

The last one is actually really handy and is based on an idea for an IPv4
only library by Heiko Wundram posted to the ASPN Python Cookbook some years
ago (details can be found in the netaddr THANKS file).

There is a lot more to consider here than I can cram into this initial
message, so I'll hand over to you all for some (hopefully) serious debate.


David P. D. Moss
netaddr author and maintainer

PS - Why does the References section in the PEP contain links to patches
already applied to the ipaddr 2.0.x reference implementation?

From peter at  Tue Aug 25 06:54:58 2009
From: peter at (Peter Moody)
Date: Mon, 24 Aug 2009 21:54:58 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Aug 24, 2009 at 3:24 PM, DrKJam<drkjam at> wrote:
> Good evening fellow Pythonistas,
> Considering a PEP is now available I'd like to join this discussion and
> raise several points with regard to both the PEP and the ipaddr reference
> implementation put forward with it.

Hi David,

is this what passes for serious debate? there's more passive
aggressive condescension in here than a teenager's diary. I'll try to
respond with a little more civility than you managed (apologies,
present paragraph excluded).

As it was left in early June, a pep and design modifications were
requested before ipaddr would be considered for inclusion, but if this
is going to start *another* drawn out ipaddr/netaddr thread, perhaps
the mailman admin(s) could setup a new SIG list for this.  I
personally hope that's not required; yours has been the only
dissenting email and I believe I respond to all of your major points

> 1) Firstly, an offering of code.
> I'd like to bring to your attention an example implementation of an IP
> address library and interface for general discussion to compare and contrast
> with ipaddr 2.0.x :-
> It is based on netaddr 0.7.2 which I threw together earlier today.
> In essence, I've stripped out all of what could be considered non-essential
> code for a purely IP related library. This branch should be suitable for
> *theoretical* consideration of inclusion into some future version of the
> Python standard library (with a little work).
> It is a pure subset of netaddr release 0.7.2, *minus* the following :-
> - all IEEE layer-2 code
> - some fairly non-essential IANA IP data files and lookup code
> - IP globbing code (fairly niche)
> Aside: Just a small mention here that I listened carefully to Clay McClure's
> and others criticisms of the previous incarnation of ipaddr. The 0.7.x
> series of netaddr breaks backward compatibility with previous netaddr
> releases and is an "answer" of sorts to that discussion and issue raised
> within the Python community. I hope you like what I've done with it.
> For the purposes of this discussion consider this branch the "Firefox to
> netaddr's Mozilla" or maybe just plain old "netaddr-ip-lite" ;-)
> 2) I refute the bold claim in the PEP that :-
>     "Finding a good library for performing those tasks can be somewhat more
> difficult."
> On the contrary, I wager that netaddr is now a perfectly decent alternative
> implementation to ipaddr, containing quite a few more features with little
> of the slowness for most common operations,

I think you mean refuse, b/c this certainly wasn't the case when I
started writing ipaddr. IPy existed, but it was far too heavyweight
and restrictive for what I needed (no disrespect to the author(s)
intended). I believe I've an email or two from you wherein you
indicate the same.

> 2/3x faster in a lot of cases,
> not that we're counting. What a difference a year makes!
> I also rate IPy quite highly even if it is getting a little "long in the tooth".
> For a lot of users, IPy could also be considered a nice, stable API!

yes, netaddr has sped up quite a bit. It's still slower in many cases
as well. But again, who's timing?

> By the same token I'm happy to note some convergence between the ipaddr and
> netaddr's various interfaces, particularly in light of discussions and
> arguments put forward by Clay McClure and others. A satisfactory compromise
> between the two however still seems a way off.
> 3) I also disagree with the PEP's claim that :-
>     "attempts to combine [IPv4 and IPv6] into one object would be like
> trying to force a round peg into a square hole (or vice versa)".
> netaddr (and for that matter IPy) cope with this perceived problem
> admirably.
> netaddr employs a simple variant of the GoF Strategy design pattern (with
> added Python sensibility). In the rare cases where ambiguity exists between
> IPv4 and IPv6 addresses a version parameter may be passed to the constructor
> of the IPAddress class to differentiate between them. Providing an IP
> address version to the constructor also provides a small performance
> improvement.

I'm not sure what point you're trying to make here. I didn't say it
was impossible, I implied that there are easier ways. having used
code which crams both types into one object, I found it to be kludgy
and complicated so I designed something different.

and as a hardly impartial observer, I'll add that the explicit address
version you can pass to the IPAddress class, but not the IPNetwork
class, is odd. it actually seems to slow down object creation (~5%)
except in the case of an int arg (your default is about twice as

> IPv4 and IPv6 addresses can be used interchangeably throughout netaddr
> without causing issue during operations such as sorting, merging (known in
> the PEP as "address collapsing") or address exclusion.
> Don't try and do this with the current reference implementation of ipaddr :-
>>>> collapse_address_list([IPv4Address(''),
>>>> IPv6Address('::')])
> [IPv4Network('')]
> OUCH! Even if this isn't allowed (according to the documentation), it should
> raise an Exception rather than silently passing through.
> I actually raised this back in May on the ipaddr bug tracker but it hasn't
> received any attention so far :-
> Compare this with netaddr's behaviour :-
>>>> cidr_merge([IPAddress(''), IPAddress('::')])
> [IPNetwork(''), IPNetwork('::')]
> That's more like it.

OUCH! indeed. I'm not even sure that this is a nice corner case
feature, summarizing a single list of mixed ip type objects. with an
extra line or two, this can be done in ipaddr, though 'tis true that
we should now raise an exception and don't (it appears to be something
that was introduced recently).  If this is a feature for which
developers are clamoring, I'm all over it. Yours is the first email
I've heard mention it.
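
For readers on a current Python, the behaviour discussed above can be
sketched with the standard-library ipaddress module (added in Python 3.3
as the outcome of this PEP), not with the 2009 ipaddr code itself. Its
collapse_addresses() raises TypeError when given mixed IP versions, and
the "extra line or two" amounts to splitting the list by version first.
The helper name collapse_by_version is made up for this sketch, and the
addresses come from the documentation ranges.

```python
import ipaddress


def collapse_by_version(addresses):
    """Collapse IPv4 and IPv6 entries separately, then concatenate."""
    v4 = [a for a in addresses if a.version == 4]
    v6 = [a for a in addresses if a.version == 6]
    return (list(ipaddress.collapse_addresses(v4))
            + list(ipaddress.collapse_addresses(v6)))


mixed = [
    ipaddress.ip_address("192.0.2.0"),
    ipaddress.ip_address("192.0.2.1"),
    ipaddress.ip_address("2001:db8::"),
]

# Passing the mixed list directly is rejected rather than silently
# dropping one address family.
try:
    list(ipaddress.collapse_addresses(mixed))
    mixed_raises = False
except TypeError:
    mixed_raises = True

collapsed = collapse_by_version(mixed)
print(mixed_raises, collapsed)
```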

> 4) It may just be me but the design of this latest incarnation of ipaddr
> seems somewhat complicated for so few lines of code. Compared with ipaddr,
> netaddr doesn't use or require multiple inheritance nor a seemingly
> convoluted inheritance hierarchy. There isn't a need for an IP() type
> 'multiplexer' function either (although I might be missing an important use
> case here). But, then again, this may just be my personal preference talking
> here. I prefer composition over inheritance in most cases.

this basically smacks of more petty attackery from the start. so I'll
reply with, "it's just you".

if you want to debate the merits of GOF strategy vs. multiple
inheritance, fine. the class inheritance in ipaddr is very clean, and
leaves very little code duplication. The classes are very clearly
named and laid out, and in general are much easier to follow than the
strategy method you've chosen for netaddr.

> In netaddr, if a user wants to represent an IP address (without netmask),
> they should use the IPAddress class, if they want to represent an IP
> address with some form of mask, they should use the IPNetwork class.

you might've missed the discussions thus far, but that's basically
what ipaddr does at this point.

> 5) The ipaddr library is also missing options for expanding various
> (exceedingly common) IP abbreviations.
>>>> from netaddr import IPNetwork
>>>> IPNetwork('10/8', True)
> IPNetwork('')
> netaddr also handles classful IP address logic, still pervasive throughout
> modern IP stacks :-
>>>> IPNetwork('', True)
> IPNetwork('')
> Note that these options are disabled by default, to keep the speed of the
> IPNetwork constructor up for more normal cases.

these seem like corner case features for the sake of having features,
you don't even seem to put much stock in them. FWIW, I've never seen a
request for something similar. I may say '10 slash 8', but I mean,
''. I'm missing the utility here, but I'm open to reasoned

> 6) netaddr currently contains a lot of useful features absent in ipaddr that
> would be extremely useful in a general, "lightweight" IP library.
> For example, it already contains routines for :-
> - arbitrary address range calculations
> - full assistance for IPv4-mapped/compatible IPv6 addressing
> - a fully functional IPSet class which allows you to perform operations such
> as unions, intersections and symmetric differences between lists of
> IPNetwork (CIDR) objects.
> The last one is actually really handy and is based on an idea for an IPv4
> only library by Heiko Wundram posted to the ASPN Python Cookbook some years
> ago (details can be found in the netaddr THANKS file).
> There is a lot more to consider here than I can cram into this initial
> message, so I'll hand over to you all for some (hopefully) serious debate.

I'm always open to serious debate, and patches/bug reports (apologies
for missing your earlier issue. I'm not sure if you were aware, but
ipaddr was undergoing a major re-write at the time and I never got
around to following up).

Your email however, like many of your previous ones, was not an
opening to a serious debate.

> Regards,
> David P. D. Moss
> netaddr author and maintainer
> PS - Why does the References section in the PEP contain links to patches
> already applied to the ipaddr 2.0.x reference implementation?

There's A link to A patch (singular, both times), which has already
been applied. This link exists b/c, at the time I last updated the
PEP, the patch hadn't been applied as it was still being reviewed. I
prefer having changes to ipaddr reviewed by people before submitting
them (as opposed to your lone submitter model); in general, that leads
to fewer bugs like the following:

>>> help(netaddr.IPNetwork.__init__)
Help on method __init__ in module netaddr.ip:

__init__(self, addr, implicit_prefix=False) unbound netaddr.ip.IPNetwork method

    @param addr: an IPv4 or IPv6 address with optional CIDR prefix,
        netmask or hostmask. May be an IP address in representation
        (string) format, an integer or another IP object (copy

    @param implicit_prefix: if True, the constructor uses classful IPv4
        rules to select a default prefix when one is not provided.
        If False it uses the length of the IP address version.
        (default: False).

>>> netaddr.IPNetwork(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "./netaddr/ip/", line 632, in __init__
    prefix, suffix = addr.split('/')
AttributeError: 'int' object has no attribute 'split'


>>> import ipaddr
>>> ipaddr.IPNetwork(1)

Did you have any other comments on the PEP?



From stephen at  Tue Aug 25 07:10:15 2009
From: stephen at (Stephen J. Turnbull)
Date: Tue, 25 Aug 2009 14:10:15 +0900
Subject: [Python-Dev] Support for Encrypted Zip as python scripts
In-Reply-To: <>
References: <>
Message-ID: <>

Bugbee, Larry writes:

 > My original post was intended to increase the awareness in those
 > thinking encrypted ZIP files will 1) be easy, 2) afford the
 > protection they desire, and 3) not lead others into a sense of
 > false security.

All good points, but note that (even without the DMCA) at least in the
U.S. copyright law provides for *criminal* penalties for willful
infringement.  If you needed to type in a password to copy, you can't
argue "I didn't know this was private property", just as crossing a
fence is stronger evidence of criminal trespass than ignoring "Posted"
signs.  For patents, the password prompt could serve as notice of patent
protection, with similar effect of strengthening penalties for
infringing.  So even weak encryption strengthens the available legal

That is not sufficient reason to consider putting encrypted zips or
anything similar into the stdlib.  It's relevant to users' decisions
should such features become available, that's all.

 > I still say it would be *nice* if there was some way to protect IP.

-1.  Intellectual assets *can* give benefits with zero further costs
of production and almost negligible costs of distribution.  But IP,
like any other property that requires a temporary transfer of
possession to give economic benefit (eg, rental cars), is going to
involve substantial transaction cost for consumers (search for the
product, license negotiation[1]), as well as the usual excess burden of

The current state where only legal protection is feasible is arguably
a good compromise.  Since it involves substantial costs of enforcement
borne by the rightsholder, it's only going to be invoked where the
total social benefit (net consumer value plus vendor profit) is large
enough to swamp the small transaction costs.

I think that Python should spend zero effort on implementing technical
means of IP protection.  Any side effects of privacy protection
devices should be more than enough to serve.


[1]  Not necessarily bargaining, but also including studying the terms
of take it or leave it offers, etc.

From chris at  Tue Aug 25 10:22:39 2009
From: chris at (Chris Withers)
Date: Tue, 25 Aug 2009 09:22:39 +0100
Subject: [Python-Dev]
In-Reply-To: <>
References: <>	
Message-ID: <>

Benjamin Peterson wrote:
> 2009/8/24 Chris Withers <chris at>:
>> Guido van Rossum wrote:
>>> Anyway it looks like if someone wants to try this, only the code in
>>> needs to be touched.
>> Where is to be found?
> $ find . -name ""
> ./Lib/

Heh, grep beats Mk I eyeball ;-)
(I did actually look in Lib...)

Anyway, so how is the stuff in wired up to the command line 
options passed to the interpreter?


Simplistix - Content Management, Batch Processing & Python Consulting

From benjamin at  Tue Aug 25 10:25:15 2009
From: benjamin at (Benjamin Peterson)
Date: Tue, 25 Aug 2009 10:25:15 +0200
Subject: [Python-Dev]
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/25 Chris Withers <chris at>:

> Anyway, so how is the stuff in wired up to the command line options
> passed to the interpreter?

Modules/main.c


From ncoghlan at  Tue Aug 25 11:01:17 2009
From: ncoghlan at (Nick Coghlan)
Date: Tue, 25 Aug 2009 19:01:17 +1000
Subject: [Python-Dev]
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Benjamin Peterson wrote:
> 2009/8/25 Chris Withers <chris at>:
>> Anyway, so how is the stuff in wired up to the command line options
>> passed to the interpreter?
> Modules/main.c

The most relevant functions in there are "RunMainFromImporter()"
(attempting zipfile/directory execution) and "RunModule()" (-m switch
and also called for zipfile/directory execution). The latter function
just uses normal C API calls to actually invoke the runpy code
(specifically "runpy._run_module_as_main()").
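
As a rough illustration of what those two C entry points end up asking
runpy to do, the sketch below uses runpy's public functions rather than
the private _run_module_as_main(). run_module() approximates the -m
switch and run_path() approximates directory execution. The module name
demo_mod and the temporary directory layout are invented for this
example.

```python
import os
import runpy
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # A throwaway module; "demo_mod" is a made-up name for this sketch.
    with open(os.path.join(tmp, "demo_mod.py"), "w") as f:
        f.write("result = __name__\n")
    # A directory becomes executable once it contains a __main__.py.
    with open(os.path.join(tmp, "__main__.py"), "w") as f:
        f.write("result = __name__\n")

    sys.path.insert(0, tmp)
    try:
        # Roughly what "python -m demo_mod" does.
        as_module = runpy.run_module("demo_mod", run_name="__main__")
    finally:
        sys.path.remove(tmp)

    # Roughly what "python <directory>" does.
    as_directory = runpy.run_path(tmp, run_name="__main__")

# Both calls return the globals of the code that was executed.
print(as_module["result"], as_directory["result"])
```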


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From martin.zugnoni at  Tue Aug 25 17:17:50 2009
From: martin.zugnoni at (Martin Zugnoni)
Date: Tue, 25 Aug 2009 12:17:50 -0300
Subject: [Python-Dev] Problems with events in a numeric keyboard
Message-ID: <>

Hi! I'm trying to catch the triple zero (000) key from a numeric keyboard,
but I found that it has the same id as the single zero. So, when I press
the triple zero key once, I receive three events from the single zero key.
I need to make a distinction between these keys, and use them to different
How can I do this?


From steve at  Tue Aug 25 17:32:47 2009
From: steve at (Steven D'Aprano)
Date: Wed, 26 Aug 2009 01:32:47 +1000
Subject: [Python-Dev] Problems with events in a numeric keyboard
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, 26 Aug 2009 01:17:50 am Martin Zugnoni wrote:
> Hi! I'm trying to catch the triple zero (000) key from a numeric
> keyboard...

This list is for the development *of* the Python language, not 
development *with* Python. You should probably try the comp.lang.python 
newsgroup, also available as a mailing list:

Good luck.

Steven D'Aprano

From alexander.belopolsky at  Tue Aug 25 17:50:01 2009
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Tue, 25 Aug 2009 11:50:01 -0400
Subject: [Python-Dev]
In-Reply-To: <>
References: <>
Message-ID: <>

Take a look at two PEPs referenced in runpy doc, :

PEP 338 - Executing modules as scripts
PEP written and implemented by Nick Coghlan.
PEP 366 - Main module explicit relative imports
PEP written and implemented by Nick Coghlan.

(Nick is too modest to self-reference, but these two PEPs give an
excellent exposition. :-)

On Tue, Aug 25, 2009 at 5:01 AM, Nick Coghlan<ncoghlan at> wrote:
> Benjamin Peterson wrote:
>> 2009/8/25 Chris Withers <chris at>:
>>> Anyway, so how is the stuff in wired up to the command line options
>>> passed to the interpreter?
>> Modules/main.c
> The most relevant functions in there are "RunMainFromImporter()"
> (attempting zipfile/directory execution) and "RunModule()" (-m switch
> and also called for zipfile/directory execution). The latter function
> just uses normal C API calls to actually invoke the runpy code
> (specifically "runpy._run_module_as_main()").
> Cheers,
> Nick.
> --
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia
> ---------------------------------------------------------------

From chris at  Tue Aug 25 17:59:44 2009
From: chris at (Chris Withers)
Date: Tue, 25 Aug 2009 16:59:44 +0100
Subject: [Python-Dev] Excluding the current path from module search path?
Message-ID: <>

Hi All,

I'm being bitten by this issue:

I'm not sure I agree with Daniel's closing of it so thought I'd ask here...

Am I right in thinking that the general idea is that "the current 
working directory at the time of invoking a script or interpreter ends 
up on the python path" or should I be thinking "the directory that a 
script exists in should end up on the python path"?

If the latter, then what happens in the case of just starting up an 
interpreter?

If neither, then how come when I have two .py files in a directory, I 
can import one as a module from the other?

In any case, as a parting comment, 
seems to have been committed with no tests and the only documentation 
being a one liner in the NEWS.txt file. Was there other discussion of this?

(Incidentally, export PYTHONPATH= or its Windows equivalent circumvents 
whatever the patch was trying to achieve, so the change doesn't seem to 
make sense anyway...)



Simplistix - Content Management, Batch Processing & Python Consulting

From peter at  Tue Aug 25 18:04:16 2009
From: peter at (Peter Moody)
Date: Tue, 25 Aug 2009 09:04:16 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Aug 21, 2009 at 4:00 AM, Oleg Broytmann<phd at> wrote:
>>     _compat_has_real_bytes = bytes != str
>   Wouldn't it be nicer "bytes is not str"?

it is. fixing this.


> Oleg.
> --
>     Oleg Broytmann                        phd at
>           Programmers don't die, they just GOSUB without RETURN.

From chris at  Tue Aug 25 18:08:05 2009
From: chris at (Chris Withers)
Date: Tue, 25 Aug 2009 17:08:05 +0100
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
Message-ID: <>

Hi All,

Would anyone object if I removed the deletion of 
sys.setdefaultencoding in

I'm guessing "yes!" so thought I'd state my reasons now:

This deletion appears to be pretty flimsy; reload(sys) and you have it 
back. Which is lucky, because I need it after it's been deleted...

Why? Well, because you can no longer put in a 
project-specific location ( and 
because for some projects the only way I can deal with encoded strings 
sensibly is to use setdefaultencoding, in my case at the start of a 
script generated by zc.buildout's zc.recipe.egg (I *know* all the 
encodings in this project are utf-8, but I don't want to go playing 
whack-a-mole with whatever modules this rather large project uses that 
haven't been made properly unicode aware).

Yes, it needs to be used as early as possible, and the docs should say 
this, but deleting it seems to be petty in terms of stopping its use 
when is too early and too system-wide and spraying 
.decode('utf-8')'s all over a code base made up of a load of eggs 
managed by buildout simply isn't feasible...



Simplistix - Content Management, Batch Processing & Python Consulting

From exarkun at  Tue Aug 25 18:23:05 2009
From: exarkun at (exarkun at
Date: Tue, 25 Aug 2009 16:23:05 -0000
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>
Message-ID: <20090825162305.7657.1157242584.divmod.xquotient.112@localhost.localdomain>

On 04:08 pm, chris at wrote:
>Hi All,
>Would anyone object if I removed the deletion of 
>sys.setdefaultencoding in
>I'm guessing "yes!" so thought I'd state my reasons now:
>This deletion appears to be pretty flimsy; reload(sys) and you have it 
>back. Which is lucky, because I need it after it's been deleted...

The ability to change the default encoding is a misfeature.  There's 
essentially no way to write correct Python code in the presence of this 

Using setdefaultencoding is never the sensible way to deal with encoded 
strings.  Actually exposing this function in the sys module would lead 
all kinds of people who haven't fully grasped the way str, unicode, and 
encodings work to do horrible things that create broken programs.  It's 
bad enough that it's already possible to get this function back with the 
reload(sys) trick.
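
A minimal sketch of the explicit alternative to a process-wide default:
decode bytes once where they enter the program, work on text in the
middle, and encode once on the way out. The function names from_wire and
to_wire are invented for this example, and UTF-8 is assumed only because
that is the encoding named in the original message.

```python
def from_wire(raw, encoding="utf-8"):
    """Decode bytes exactly once, at the input boundary."""
    return raw.decode(encoding)


def to_wire(text, encoding="utf-8"):
    """Encode text exactly once, at the output boundary."""
    return text.encode(encoding)


incoming = b"caf\xc3\xa9"      # UTF-8 bytes as received from outside
text = from_wire(incoming)     # text from here on
shouted = text.upper()         # character-aware operation on text
outgoing = to_wire(shouted)    # bytes again only when leaving

print(len(incoming), len(text), outgoing)
```

The encoding is then a visible parameter at each boundary instead of
hidden interpreter state that every imported module silently depends on.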
>Why? Well, because you can no longer put in a project- 
>specific location ( and because for 
>some projects the only way I can deal with encoded strings sensibly is 
>to use setdefaultencoding, in my case at the start of a script 
>generated by zc.buildout's zc.recipe.egg (I *know* all the encodings in 
>this project are utf-8, but I don't want to go playing whack-a-mole 
>with whatever modules this rather large project uses that haven't been 
>made properly unicode aware).
>Yes, it needs to be used as early as possible, and the docs should say 
>this, but deleting it seems to be petty in terms of stopping its use 
>when is too early and too system-wide and spraying 
>.decode('utf-8')'s all over a code base made up of a load of eggs 
>managed by buildout simply isn't feasible...

It may be a major task, but the best thing you can do is find each str 
and unicode operation in the software you're working with and make them 
correct with respect to your inputs and outputs.  Flipping a giant 
switch for the entire process is just going to change which things are 


From pje at  Tue Aug 25 18:30:00 2009
From: pje at (P.J. Eby)
Date: Tue, 25 Aug 2009 12:30:00 -0400
Subject: [Python-Dev] Excluding the current path from module search path?
In-Reply-To: <>
References: <>
Message-ID: <>

At 04:59 PM 8/25/2009 +0100, Chris Withers wrote:
>Hi All,
>I'm being bitten by this issue:
>I'm not sure I agree with Daniel's closing of it so thought I'd ask here...
>Am I right in thinking that the general idea is that "the current 
>working directory at the time of invoking a script or interpreter 
>ends up on the python path" or should I be thinking "the directory 
>that a script exists in should end up on the python path"?
>If the latter, then what happens in the case of just starting up an 
>interpreter?

It's the latter.  In the case where there is no script, then the 
current directory is considered to be the directory of the script.

From benjamin at  Tue Aug 25 18:31:01 2009
From: benjamin at (Benjamin Peterson)
Date: Tue, 25 Aug 2009 18:31:01 +0200
Subject: [Python-Dev] Excluding the current path from module search path?
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/25 Chris Withers <chris at>:
> Hi All,
> I'm being bitten by this issue:
> I'm not sure I agree with Daniel's closing of it so thought I'd ask here...
> Am I right in thinking that the general idea is that "the current working
> directory at the time of invoking a script or interpreter ends up on the
> python path" or should I be thinking "the directory that a script exists in
> should end up on the python path"?

The latter.

> If the latter, then what happens in the case of just starting up an
> interpreter?

Because '' is prepended to sys.path then.


From rdmurray at  Tue Aug 25 18:43:03 2009
From: rdmurray at (R. David Murray)
Date: Tue, 25 Aug 2009 12:43:03 -0400 (EDT)
Subject: [Python-Dev] Excluding the current path from module search path?
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, 25 Aug 2009 at 16:59, Chris Withers wrote:
> In any case, as a parting comment, seems 
> to have been committed with no tests and the only documentation being a one 
> liner in the NEWS.txt file. Was there other discussion of this?

It probably should have gone into What's New as well, but it was too
late for that at the time the bug was filed.

> (Incidentally, export PYTHONPATH= or its Windows equivalent circumvents 
> whatever the patch was trying to achieve, so the change doesn't seem to make 
> sense anyway...)

The change was fixing a clear bug:  blank path elements were being
introduced into the path _unintentionally_ and unexpectedly.  Setting
PYTHONPATH would be a way to do it intentionally.


From mal at  Tue Aug 25 18:49:21 2009
From: mal at (M.-A. Lemburg)
Date: Tue, 25 Aug 2009 18:49:21 +0200
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>
Message-ID: <>

Chris Withers wrote:
> Hi All,
> Would anyone object if I removed the deletion of
> sys.setdefaultencoding in
> I'm guessing "yes!" so thought I'd state my reasons now:
> This deletion appears to be pretty flimsy; reload(sys) and you have it
> back. Which is lucky, because I need it after it's been deleted...
> Why? Well, because you can no longer put in a
> project-specific location ( and
> because for some projects the only way I can deal with encoded strings
> sensibly is to use setdefaultencoding, in my case at the start of a
> script generated by zc.buildout's zc.recipe.egg (I *know* all the
> encodings in this project are utf-8, but I don't want to go playing
> whack-a-mole with whatever modules this rather large project uses that
> haven't been made properly unicode aware).
> Yes, it needs to be used as early as possible, and the docs should say
> this, but deleting it seems to be petty in terms of stopping its use
> when is too early and too system-wide and spraying
> .decode('utf-8')'s all over a code base made up of a load of eggs
> managed by buildout simply isn't feasible...
> Thoughts?

Let's look at this from another angle: sys.setdefaultencoding()
is only made available for use in This is documented
and by design (since a site may want to set the default encoding
based on the locale or to "utf-8").

If you use it anywhere else, you're on your own. Such usage
is not supported and may very well break your interpreter or
cause data corruption (the default encoded versions of Unicode
objects are cached inside the objects).
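
A small probe, offered only as a sketch, shows what a given interpreter
exposes. On Python 2 the setter is normally gone after startup because it
is deleted during site initialization; Python 3 has no such function at
all. The function name default_encoding_report is hypothetical.

```python
import sys


def default_encoding_report():
    """Report the default encoding and whether the setter is reachable."""
    return {
        "default_encoding": sys.getdefaultencoding(),
        # hasattr() is used so the probe runs unchanged on any version.
        "setter_available": hasattr(sys, "setdefaultencoding"),
    }
```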

Now, in your particular case, you're probably better off just
tweaking directly in your custom Python interpreter
rather than relying on (see setencoding() in

To answer your question: yes, this particular API may not be
used outside

Marc-Andre Lemburg


From guido at  Tue Aug 25 19:37:24 2009
From: guido at (Guido van Rossum)
Date: Tue, 25 Aug 2009 10:37:24 -0700
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <> <>
Message-ID: <>

In retrospect, it should have been called sys._setdefaultencoding().
That sends an extra signal that it's not meant for general use.

--Guido van Rossum (home page:

From mcguire at  Tue Aug 25 19:41:55 2009
From: mcguire at (Jake McGuire)
Date: Tue, 25 Aug 2009 10:41:55 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Aug 24, 2009 at 9:54 PM, Peter Moody<peter at> wrote:
> I personally hope that's not required; yours has been the only
> dissenting email and I believe I respond to all of your major points
> here.

Silence is not assent.

ipaddr looks like a reasonable library from here, but AFAIK it's not
widely used outside of google.  I don't know if it's reasonable to
want some amount of public usage before a brand-new API goes into the
standard library, but such use is more likely to uncover API flaws or
quirks than a PEP.


From robert.kern at  Tue Aug 25 20:10:20 2009
From: robert.kern at (Robert Kern)
Date: Tue, 25 Aug 2009 13:10:20 -0500
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <> <>
Message-ID: <h719ef$bet$>

On 2009-08-25 12:37 PM, Guido van Rossum wrote:
> In retrospect, it should have been called sys._setdefaultencoding().
> That sends an extra signal that it's not meant for general use.

Considering all of the sys._getframe() hacks out there, I suspect that this 
would encourage more abuse of the function than the current situation.

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco

From guido at  Tue Aug 25 20:29:37 2009
From: guido at (Guido van Rossum)
Date: Tue, 25 Aug 2009 11:29:37 -0700
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <h719ef$bet$>
References: <> <> 
Message-ID: <>

On Tue, Aug 25, 2009 at 11:10 AM, Robert Kern<robert.kern at> wrote:
> On 2009-08-25 12:37 PM, Guido van Rossum wrote:
>> In retrospect, it should have been called sys._setdefaultencoding().
>> That sends an extra signal that it's not meant for general use.
> Considering all of the sys._getframe() hacks out there, I suspect that this
> would encourage more abuse of the function than the current situation.

Why? It would still be deleted by The abuse of
sys._getframe() exists because it fills a real need. (As does abuse of
sys.setdefaultencoding(). However abusing it is actually more
troublesome, because the problems are much less theoretical.)

--Guido van Rossum (home page:

From robert.kern at  Tue Aug 25 20:35:52 2009
From: robert.kern at (Robert Kern)
Date: Tue, 25 Aug 2009 13:35:52 -0500
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <> <>
Message-ID: <h71aua$fqg$>

On 2009-08-25 13:29 PM, Guido van Rossum wrote:
> On Tue, Aug 25, 2009 at 11:10 AM, Robert Kern<robert.kern at>  wrote:
>> On 2009-08-25 12:37 PM, Guido van Rossum wrote:
>>> In retrospect, it should have been called sys._setdefaultencoding().
>>> That sends an extra signal that it's not meant for general use.
>> Considering all of the sys._getframe() hacks out there, I suspect that this
>> would encourage more abuse of the function than the current situation.
> Why? It would still be deleted by

Ah, yes. You're right. For whatever reason I thought it lived as 
site.setdefaultencoding() when I read your message and thought that you were 
proposing to move it to sys. Never mind me.

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco

From martin at  Tue Aug 25 23:37:34 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 25 Aug 2009 23:37:34 +0200
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

> ipaddr looks like a reasonable library from here, but AFAIK it's not
> widely used outside of google.  I don't know if it's reasonable to
> want some amount of public usage before a brand-new API goes into the
> standard library, but such use is more likely to uncover API flaws or
> quirks than a PEP.

OTOH, the PEP process *is* the stronger of the two approaches, allowing
people to provide explicit opinions even if (and especially if) they
dislike the technology entirely (whereas for an external module, they
would just ignore it).

If they refuse to comment, they can't complain when it gets added
to the standard library - they can still choose to ignore it, then,
of course (just as many people ignore xml.dom).

In the specific case, I'm not worried about timing. Either 2.7 or 3.2
are still a year ahead, which should give people plenty of time to

OTTH, I *like* people to comment strongly on the PEP, in particular
if they are authors of competing libraries. It's no surprise that they
get emotional when their hard work won't be appropriately honored in
the long run - and if they believe there is something wrong with the
technology being proposed (rather than just the words used to describe
it), they are probably right.

I said it before - this is not going to be a fast acceptance path
of a library that gets accepted just because GvR works at google.
People of competing libraries *could* write competing PEPs if they
wanted to see their library incorporated instead - or they can just
state that they don't want *this* library to be incorporated for
specific technical reasons.


From ncoghlan at  Wed Aug 26 13:28:03 2009
From: ncoghlan at (Nick Coghlan)
Date: Wed, 26 Aug 2009 21:28:03 +1000
Subject: [Python-Dev]
In-Reply-To: <>
References: <>	
Message-ID: <>

Alexander Belopolsky wrote:
> Take a look at two PEPs referenced in runpy doc,
> :
> PEP 338 - Executing modules as scripts
> PEP written and implemented by Nick Coghlan.
> PEP 366 - Main module explicit relative imports
> PEP written and implemented by Nick Coghlan.
> (Nick is too modest to self-reference, but these two PEPs give an
> excellent exposition. :-)

The PEPs don't go into the process of how we actually hook the command
line up to the runpy module though - that's something you need to dig
into the main.c code to really understand.

The command line documentation is also relevant since it defines the
intended behaviour:

(Drop the /dev from the URL to see the defined behaviour for 2.6)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Wed Aug 26 13:44:12 2009
From: ncoghlan at (Nick Coghlan)
Date: Wed, 26 Aug 2009 21:44:12 +1000
Subject: [Python-Dev] Excluding the current path from module search path?
In-Reply-To: <>
References: <>
Message-ID: <>

Chris Withers wrote:
> Hi All,
> I'm being bitten by this issue:
> I'm not sure I agree with Daniel's closing of it so thought I'd ask here...
> Am I right in thinking that the general idea is that "the current
> working directory at the time of invoking a script or interpreter ends
> up on the python path" or should I be thinking "the directory that a
> script exists in should end up on the python path"?
> If the latter, then what happens in the case of just starting up an
> interpreter?
> If neither, then how come when I have two .py files in a directory, I
> can import one as a module from the other?

The details of the sys.path manipulation at program startup are
documented here:

The directory prepended to sys.path is based on the code executed by the
command line.

stdin, -c, -m or nothing specified: current directory
Filesystem path pointing to script (source or compiled): directory
containing script
Filesystem path pointing to directory or zipfile: the named directory or
zipfile
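
The rules above can be checked empirically. The following is a minimal
sketch, assuming Python 3 and a writable temporary directory; it starts
child interpreters and reports what they see as sys.path[0]. The helper
names are hypothetical, and PYTHONSAFEPATH is cleared in the child because
newer interpreters skip the prepending when it is set.

```python
import ast
import os
import subprocess
import sys
import tempfile

# One-line program that prints the first sys.path entry as a Python literal.
PROBE = "import sys; print(repr(sys.path[0]))"


def _child_env():
    """Copy of the environment without PYTHONSAFEPATH."""
    env = dict(os.environ)
    env.pop("PYTHONSAFEPATH", None)
    return env


def path0_for_command():
    """sys.path[0] as seen by `python -c ...`."""
    out = subprocess.check_output([sys.executable, "-c", PROBE],
                                  env=_child_env())
    return ast.literal_eval(out.decode().strip())


def path0_for_script():
    """Return (sys.path[0] seen by a script, directory holding the script)."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "probe_script.py")
        with open(script, "w") as fh:
            fh.write(PROBE + "\n")
        out = subprocess.check_output([sys.executable, script],
                                      env=_child_env())
        seen = ast.literal_eval(out.decode().strip())
        # Resolve symlinks on both sides so the two paths are comparable.
        return os.path.realpath(seen), os.path.realpath(tmp)
```

Comparing the two values returned by path0_for_script() shows the
"directory containing script" rule; path0_for_command() shows the entry
used for -c.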


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From solipsis at  Wed Aug 26 16:44:02 2009
From: solipsis at (Antoine Pitrou)
Date: Wed, 26 Aug 2009 14:44:02 +0000 (UTC)
Subject: [Python-Dev] copyright ownership
References: <>
Message-ID: <>

Guido van Rossum <guido <at>> writes:
> Are you a lawyer? Do you know the legal history of Python
> distributions and the US export laws? It's not so easy -- for one, the
> PSF (a US foundation) owns the copyright.

Does it? As far as I understand, the contributor agreement is not a copyright
transfer agreement (« PSF understands and agrees that Contributor retains
copyright in its Contributions »).

Not that it makes the issue easier of course :)


From guido at  Wed Aug 26 17:00:57 2009
From: guido at (Guido van Rossum)
Date: Wed, 26 Aug 2009 08:00:57 -0700
Subject: [Python-Dev] copyright ownership
In-Reply-To: <>
References: <> 
Message-ID: <>

On Wed, Aug 26, 2009 at 7:44 AM, Antoine Pitrou<solipsis at> wrote:
> Guido van Rossum <guido <at>> writes:
>> Are you a lawyer? Do you know the legal history of Python
>> distributions and the US export laws? It's not so easy -- for one, the
>> PSF (a US foundation) owns the copyright.
> Does it? As far as I understand, the contributor agreement is not a copyright
> transfer agreement (« PSF understands and agrees that Contributor retains
> copyright in its Contributions »).

The rights in the individual contributions are retained by the
contributor. However the rights in the distributions as a whole are
most definitely claimed by the PSF. Read the LICENSE file in the
distro. :-)

> Not that it makes the issue easier of course :)

Nothing that involves lawyers is ever easy. That's why well-meaning
suggestions like "but is outside the US" are so aggravating
-- it's so hard to explain why it doesn't work that way.

--Guido van Rossum (home page:

From solipsis at  Wed Aug 26 19:42:02 2009
From: solipsis at (Antoine Pitrou)
Date: Wed, 26 Aug 2009 17:42:02 +0000 (UTC)
Subject: [Python-Dev]
References: <>
Message-ID: <>

DrKJam <drkjam <at>> writes:
> netaddr employs a simple variant of the GoF Strategy design pattern (with
> added Python sensibility).

It would be nice if you could avoid employing this kind of acronym without
explaining them. Not everybody drinks the design pattern kool-aid.
(Google tells me that GoF seems to mean "Gang of Four", which is of course as
meaningful as a hip-hop band name can be :-))

In any case, if you think netaddr's implementation strategy is better than
ipaddr's, a detailed comparison would be welcome.



From martin at  Wed Aug 26 20:35:21 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 26 Aug 2009 20:35:21 +0200
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>
Message-ID: <>

>> netaddr employs a simple variant of the GoF Strategy design pattern (with
>> added Python sensibility).
> It would be nice if you could avoid employing this kind of acronym without
> explaining them. Not everybody drinks the design pattern kool-aid.
> (Google tells me that GoF seems to mean "Gang of Four", which is of course as
> meaningful as a hip-hop band name can be :-))
> In any case, if you think netaddr's implementation strategy is better than
> ipaddr's, a detailed comparison would be welcome.

Just in case it still isn't clear: "Strategy" above doesn't refer to
"implementation strategy" (i.e. "way of implementing things").

Instead, "Strategy" is a specific design pattern, originally defined by
one of Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides.


for a detailed description. The basic idea is that you can switch
algorithms at run-time, by merely replacing one object with another.

To use it, you define an abstract base class with a method called
"execute". You then create as many subclasses as you want, redefining
"execute" to provide specific *strategies*. You create a (often global)
variable pointing to one instance of the base class, and dynamically
assign this variable with an instance of the appropriate strategy
class. Users of the strategy don't need to know what strategy is chosen.

The wikipedia article points (IMO correctly) out that the Strategy
pattern is mostly useless in languages that have true function pointers,
such as Python. You can still have the global variable part, but you
don't need the abstract base class part at all.
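
To make that last point concrete, here is a minimal sketch of the pattern
using plain functions instead of an abstract base class. All names (shout,
whisper, set_strategy, announce) are made up for illustration and have
nothing to do with netaddr or ipaddr.

```python
def shout(text):
    """One strategy: upper-case the message."""
    return text.upper()


def whisper(text):
    """Another strategy: lower-case the message."""
    return text.lower()


# The module-level variable that holds the currently selected strategy.
_current_strategy = shout


def set_strategy(func):
    """Swap the algorithm at run time by rebinding the variable."""
    global _current_strategy
    _current_strategy = func


def announce(message):
    # Callers never need to know which strategy is active.
    return _current_strategy(message)
```

Calling set_strategy(whisper) changes what announce() does without touching
announce() itself, which is all the pattern asks for.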

In the specific case of netaddr, I think David is referring to the
netaddr.strategy package, which has modules called ipv4 and ipv6,
both implementing functions like valid_str and str_to_arpa.
Then, class IPAddress has a method reverse_dns, which is defined as

    def reverse_dns(self):
        """The reverse DNS lookup record for this IP address"""
        return self._module.int_to_arpa(self._value)

So IPv4 addresses and IPv6 addresses share the same class, but instances
have different values of _module. IPAddress.__init__ looks at the
version keyword parameter if given, otherwise, it tries str_to_int
first for v4, then for v6.
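
A stripped-down sketch of that arrangement might look as follows. This is
not netaddr's code: the two "modules" are stand-in namespace objects, the
helper functions are hypothetical, and parsing is delegated to
socket.inet_pton purely to keep the example short.

```python
import socket
from types import SimpleNamespace


def _v4_str_to_int(text):
    return int.from_bytes(socket.inet_pton(socket.AF_INET, text), "big")


def _v4_int_to_arpa(value):
    octets = value.to_bytes(4, "big")
    return ".".join(str(o) for o in reversed(octets)) + ".in-addr.arpa"


def _v6_str_to_int(text):
    return int.from_bytes(socket.inet_pton(socket.AF_INET6, text), "big")


def _v6_int_to_arpa(value):
    nibbles = "%032x" % value  # 32 hex digits, most significant first
    return ".".join(reversed(nibbles)) + ".ip6.arpa"


# Stand-ins for the per-version strategy modules.
ipv4 = SimpleNamespace(version=4, str_to_int=_v4_str_to_int,
                       int_to_arpa=_v4_int_to_arpa)
ipv6 = SimpleNamespace(version=6, str_to_int=_v6_str_to_int,
                       int_to_arpa=_v6_int_to_arpa)


class IPAddress:
    """One class for both versions; behaviour comes from self._module."""

    def __init__(self, addr, version=None):
        candidates = [m for m in (ipv4, ipv6)
                      if version is None or m.version == version]
        for module in candidates:  # v4 is tried first, then v6
            try:
                self._value = module.str_to_int(addr)
            except OSError:
                continue
            self._module = module
            return
        raise ValueError("not a valid IP address: %r" % (addr,))

    def reverse_dns(self):
        """The reverse DNS lookup record for this IP address"""
        return self._module.int_to_arpa(self._value)
```

Both kinds of address are instances of the same class here, so telling them
apart means inspecting the instance rather than its type.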

Whether that's better or not than using subclasses, I don't know.


From ben+python at  Wed Aug 26 20:49:38 2009
From: ben+python at (Ben Finney)
Date: Thu, 27 Aug 2009 04:49:38 +1000
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for
	the	Python Standard Library
References: <>
Message-ID: <>

Antoine Pitrou <solipsis at> writes:

> DrKJam <drkjam <at>> writes:
> > netaddr employs a simple variant of the GoF Strategy design pattern (with
> > added Python sensibility).
> It would be nice if you could avoid employing this kind of acronym
> without explaining them. Not everybody drinks the design pattern
> kool-aid.

A pity, since the entire point of Design Patterns is to give us a
vocabulary of terms to use that enable these concepts to be communicated
*without* continually re-defining them. To that extent, then, they fail
their purpose.

 \        "Our wines leave you nothing to hope for." -restaurant menu, |
  `\                                                       Switzerland |
_o__)                                                                  |
Ben Finney

From solipsis at  Wed Aug 26 21:02:28 2009
From: solipsis at (Antoine Pitrou)
Date: Wed, 26 Aug 2009 19:02:28 +0000 (UTC)
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
References: <>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Martin v. Löwis <martin <at>> writes:
> [...]
> Then, class IPAddress has a method reverse_dns, which is defined
> as
>     def reverse_dns(self):
>         """The reverse DNS lookup record for this IP address"""
>         return self._module.int_to_arpa(self._value)
> So IPv4 addresses and IPv6 addresses share the same class, but instances
> have different values of _module.

Ok, thanks for the explanation. It looks like an inheritance-based approach
would allow for easier and more traditional introspection (e.g. `isinstance(ip,
IPv4Address)`). Not to mention that avoiding an indirection level can make
things faster.
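
For comparison, this is roughly what the subclass-based style looks like
from the caller's side. The sketch uses the ipaddress module that ships
with later Python 3 releases, which did not exist when this thread was
written; the describe() helper is hypothetical.

```python
import ipaddress


def describe(text):
    """Return (family name, reverse DNS name) for an address string."""
    ip = ipaddress.ip_address(text)
    # With one class per version, a plain isinstance() check is enough.
    if isinstance(ip, ipaddress.IPv4Address):
        family = "IPv4"
    else:
        family = "IPv6"
    return family, ip.reverse_pointer
```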



From martin at  Wed Aug 26 21:03:52 2009
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 26 Aug 2009 21:03:52 +0200
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for	the
 Python Standard Library
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

>> DrKJam <drkjam <at>> writes:
>>> netaddr employs a simple variant of the GoF Strategy design pattern (with
>>> added Python sensibility).
>> It would be nice if you could avoid employing this kind of acronym
>> without explaining them. Not everybody drinks the design pattern
>> kool-aid.
> A pity, since the entire point of Design Patterns is to give us a
> vocabulary of terms to use that enable these concepts to be communicated
> *without* continually re-defining them. To that extent, then, they fail
> their purpose.

I think it's too early to tell. It may be that they have not yet
achieved their purpose - just let's wait fifty more years
(and I'm only half-joking).


From fuzzyman at  Wed Aug 26 21:18:40 2009
From: fuzzyman at (Michael Foord)
Date: Wed, 26 Aug 2009 20:18:40 +0100
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Antoine Pitrou wrote:
> DrKJam <drkjam <at>> writes:
>> netaddr employs a simple variant of the GoF Strategy design pattern (with
>> added Python sensibility).
> It would be nice if you could avoid employing this kind of acronym without
> explaining them. Not everybody drinks the design pattern kool-aid.
> (Google tells me that GoF seems to mean "Gang of Four", which is of course as
> meaningful as a hip-hop band name can be :-))

Really? Discussing the GoF design patterns by name seems to be prevalent 
amongst the programmers I know (yourself excluded of course...).


> In any case, if you think netaddr's implementation strategy is better than
> ipaddr's, a detailed comparison would be welcome.
> Regards
> Antoine.


From solipsis at  Wed Aug 26 21:36:46 2009
From: solipsis at (Antoine Pitrou)
Date: Wed, 26 Aug 2009 19:36:46 +0000 (UTC)
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
References: <>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Michael Foord <fuzzyman <at>> writes:
> Really? Discussing the GoF design patterns by name seems to be prevalent 
> amongst the programmers I know (yourself excluded of course...).

Ah? I still haven't understood what "Gang of Four" is supposed to be, however.
Is it a design pattern?

Besides, saying "I use the strategy design pattern" doesn't tell a lot, while an
ad hoc description is much more informational (witness Martin's explanation for
example).

It's like those frameworks who have a class simply named "Factory" ;)



From fuzzyman at  Wed Aug 26 21:43:27 2009
From: fuzzyman at (Michael Foord)
Date: Wed, 26 Aug 2009 20:43:27 +0100
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Antoine Pitrou wrote:
> Michael Foord <fuzzyman <at>> writes:
>> Really? Discussing the GoF design patterns by name seems to be prevalent 
>> amongst the programmers I know (yourself excluded of course...).
> Ah? I still haven't understood what "Gang of Four" is supposed to be, however.
> Is it a design pattern?

The gang of four are the four folk who wrote the classic design patterns 

Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides.
> Besides, saying "I use the strategy design pattern" doesn't tell a lot, while an
> ad hoc description is much more informational (witness Martin's explanation for
> example).
> It's like those frameworks who have a class simply named "Factory" ;)

Well, depending on the circumstances it can convey some to no 
information. :-)


> Regards
> Antoine.


From amentajo at  Wed Aug 26 21:29:25 2009
From: amentajo at (Joe Amenta)
Date: Wed, 26 Aug 2009 15:29:25 -0400
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
Message-ID: <>

Hello all,

I have released the first alpha version of 3to2 after finishing it for my
Google Summer of Code 2009(tm) project.  You can get the tarball for this
release at
This requires python 2.7, because it requires a newer version of 2to3 than
what comes with 2.6.

Release notes are in the RELEASE file.  Development happens at, and the source code for this release
lives at
Report bugs at, please.

Additional notes and comments can (for now) be found at


From martin at  Wed Aug 26 22:20:05 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 26 Aug 2009 22:20:05 +0200
Subject: [Python-Dev] No 2.4.7 release
Message-ID: <>

I once announced that I would be working on releasing 2.4.7 this month.

However, since no patches have been committed to 2.4.6, there is little
point in making a release. As 2.4 is nearing its end-of-life soon, there
likely won't be any 2.4.7 release.

Python 2.5 has seen only two patches since 2.5.4. However, since several
months have passed since that release, I'll be creating a 2.5.5
release candidate within a few days.

Further security releases of Python 2.5 will be made until September 2011.


From martin at  Wed Aug 26 22:26:49 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 26 Aug 2009 22:26:49 +0200
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <>
Message-ID: <>

> I have released the first alpha version of 3to2 after finishing it for
> my Google Summer of Code 2009(tm) project.

Congratulations! I understand SoC is basically over, but I would still
like to request two things:

- can you please register it with PyPI?
- can you please announce/report some plans for the future of this
  project? In particular, will you continue to work on it?


From skip at  Wed Aug 26 23:25:24 2009
From: skip at (skip at
Date: Wed, 26 Aug 2009 16:25:24 -0500
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

    Martin> I think it's too early to tell. It may be that they have not yet
    Martin> achieved their purpose - just let's wait fifty more years (and
    Martin> I'm only half-joking).

So what you're really saying is we only have to wait 25 years...


From amentajo at  Thu Aug 27 00:55:54 2009
From: amentajo at (Joe Amenta)
Date: Wed, 26 Aug 2009 18:55:54 -0400
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <> 
Message-ID: <>

On Wed, Aug 26, 2009 at 4:26 PM, "Martin v. Löwis" <martin at> wrote:

> > I have released the first alpha version of 3to2 after finishing it for
> > my Google Summer of Code 2009(tm) project.
> Congratulations! I understand SoC is basically over, but I would still
> like to request two things:
> - can you please register it with PyPI?
> - can you please announce/report some plans for the future of this
>  project? In particular, will you continue to work on it?
> Thanks,
> Martin

-- 3to2 is now registered with
Did I do it right?
-- I plan to continue to work on 3to2 in my free time, though I have one of
those social lives, so I could certainly use some help; in particular, I
could use some quality bug
My long-term plans for the future are:

   - Bugfixes
   - Keep up with new features added in newer versions of py3k
   - Ensure syntactical correctness with a more robust test suite

My short-term plans for the future are:

   - Fixes imports and imports2 need to work properly
   - Continue to build a suitable test suite that tests common cases of all
   - print fixer refactors the syntax into print statements rather than
   imports print_function from __future__

Thanks for the acknowledgement,

From chris at  Thu Aug 27 01:48:40 2009
From: chris at (Chris Withers)
Date: Thu, 27 Aug 2009 00:48:40 +0100
Subject: [Python-Dev] Excluding the current path from module search path?
In-Reply-To: <>
References: <> <>
Message-ID: <>

Nick Coghlan wrote:
> The details of the sys.path manipulation at program startup are
> documented here:
> The directory prepended to sys.path is based on the code executed by the
> command line.

It's more subtle than that though...

The OP in is being bitten by the 
same expectation that I am: should be found somewhere 
on the sys.path present at the start of the script/module/command/etc 
being executed. (The bug referenced in that report makes things worse, 
because this used to work, at least on Windows ;-) )

The problem is that (and therefore is imported 
early in main.c on line 516 as part of Py_Initialize(), but the path of 
the current script only gets added later on in RunMainFromImporter 
called on line 569.

Strictly speaking, the docs at 
aren't lying, but it takes an understanding of when is imported 
that isn't available to anyone who doesn't read C to know why a path 
that is present on sys.path when the user's script starts isn't being 
searched for
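
Whether a given interpreter behaves this way can be tested rather than read
out of main.c. The sketch below, which assumes Python 3 and a writable
temporary directory, drops a sitecustomize.py next to a script, runs the
script in a child interpreter, and reports whether the customization module
was picked up. The marker attribute and file names are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def sitecustomize_seen_from_script_dir():
    """True if a sitecustomize.py beside the script was imported at startup."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "sitecustomize.py"), "w") as fh:
            # Leave a marker on sys if this file is ever imported.
            fh.write("import sys\nsys._probe_sitecustomize = True\n")
        script = os.path.join(tmp, "probe_script.py")
        with open(script, "w") as fh:
            fh.write("import sys\n"
                     "print(getattr(sys, '_probe_sitecustomize', False))\n")
        env = dict(os.environ)
        env.pop("PYTHONPATH", None)  # keep the temp dir off the path
        out = subprocess.check_output([sys.executable, script], env=env)
        return out.decode().strip() == "True"
```

If the function returns False, the script's directory was not yet on
sys.path when the site machinery looked for the customization module.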

What do people feel about this?

At the very least, I'd like to add a warning box in site.html to explain 
why sitecustomize might not be found where people expect.

I'd *like* to have the paths be the same for as they are for the 
subsequent code that's executed, but would that make too much of a mess 
of main.c and runpython.c?



Simplistix - Content Management, Batch Processing & Python Consulting

From drkjam at  Thu Aug 27 01:48:46 2009
From: drkjam at (DrKJam)
Date: Thu, 27 Aug 2009 00:48:46 +0100
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

I've started a very basic (work in progress) entry on the netaddr wiki to
track various aspects of this discussion that might not be in a format
suitable for publishing to the list or are too lengthy. It will also allow
my ascii art diagrams to render correctly ;-)

I will be updating it as free time becomes available to me over the coming
days. Feel free to make comments on the wiki page itself if you want me to
make any changes. Duncan McGreggor should be able to make changes if I am
not available for whatever reason :-

If anyone has suggestions for a better place to put this, please shout (but
not too loudly please Peter M. ;-)


Dave M.

PS - Can't wait for Google Wave which would make this kind of thing so much
easier ;-)

From chris at  Thu Aug 27 01:51:51 2009
From: chris at (Chris Withers)
Date: Thu, 27 Aug 2009 00:51:51 +0100
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <20090825162305.7657.1157242584.divmod.xquotient.112@localhost.localdomain>
References: <>
Message-ID: <>

exarkun at wrote:
> The ability to change the default encoding is a misfeature.  There's 
> essentially no way to write correct Python code in the presence of this 
> feature.

How so? If every single piece of text in your project is encoded in a 
superset of ascii (such as utf-8), why would this be a problem?
Even if you were evil/stupid and mixed encodings, surely all you'd get 
is different unicode errors or maybe the odd strange character during 

> It may be a major task, but the best thing you can do is find each str 
> and unicode operation in the software you're working with and make them 
> correct with respect to your inputs and outputs.  Flipping a giant 
> switch for the entire process is just going to change which things are 
> wrong.

Well, flipping that giant switch has worked in production for the past 5 
years, so I'm afraid I'll respectfully disagree. I'd suspect the 
pragmatics of real world software are why that function even exists, 
and it's extremely useful when used correctly...


Simplistix - Content Management, Batch Processing & Python Consulting

From chris at  Thu Aug 27 01:59:35 2009
From: chris at (Chris Withers)
Date: Thu, 27 Aug 2009 00:59:35 +0100
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <> <>
Message-ID: <>

M.-A. Lemburg wrote:
> Let's look at this from another angle: sys.setdefaultencoding()
> is only made available for use in 

...see this:

I would like to use for all the very good reasons given 
in this thread:

- I don't want to change the default encoding for every project that 
uses the python installation in question

- I don't even want to change the default encoding for every python 
script run by the current user

- I only want to change the default encoding for one particular project.

Sadly, for the reasons I describe in the thread, won't find a in this situation...

> If you use it anywhere else, you're on your own. 

No problem with that. To be specific, this is a Zope 2.12 instance 
driven by this buildout:

recipe = zc.recipe.egg
eggs = ${buildout:eggs}
interpreter = py
scripts = runzope zopectl
initialization =
    import sys
    sys.argv[1:1] = ['-C','${buildout:directory}/etc/instance.conf']

The call to sys.setdefaultencoding is *very* early in the scheme of 
things... The runzope script that gets run only has some sys.path 
manipulation before sys.setdefaultencoding gets called. What problems 
could there be by calling sys.setdefaultencoding there?

> Such usage
> is not supported and may very well break your interpreter 

Can you give an example?

> or
> cause data corruption (the default encoded versions of Unicode
> objects are cached inside the objects).

When called as early as in the above script, what objects would have 
encoded strings cached in them?

> Now, in your particular case, you're probably better off just
> tweaking directly in your custom Python interpreter
> rather than relying on (see setencoding() in




From chris at  Thu Aug 27 02:00:16 2009
From: chris at (Chris Withers)
Date: Thu, 27 Aug 2009 01:00:16 +0100
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <> <>
Message-ID: <>

Guido van Rossum wrote:
> In retrospect, it should have been called sys._setdefaultencoding().
> That sends an extra signal that it's not meant for general use.

Crazy idea: how about mutating it into sys._setdefaultencoding rather 
than deleting it?



From greg.ewing at  Thu Aug 27 02:30:33 2009
From: greg.ewing at (Greg Ewing)
Date: Thu, 27 Aug 2009 12:30:33 +1200
Subject: [Python-Dev] Problems with events in a numeric keyboard
In-Reply-To: <>
References: <>
Message-ID: <>

Martin Zugnoni wrote:
> when I press
> the triple zero key once, I receive three events from the single zero key.
> I need to make a distinction between these keys

Sounds like you can't, except perhaps by detecting
three '0' key events arriving at almost the same


From martin at  Thu Aug 27 08:47:59 2009
From: martin at ("Martin v. Löwis")
Date: Thu, 27 Aug 2009 08:47:59 +0200
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>	<20090825162305.7657.1157242584.divmod.xquotient.112@localhost.localdomain>
Message-ID: <>

>> The ability to change the default encoding is a misfeature.  There's
>> essentially no way to write correct Python code in the presence of
>> this feature.
> How so? If every single piece of text in your project is encoded in a
> superset of ascii (such as utf-8), why would this be a problem?

What is "every single piece of text"? Every string occurring in source
code? or also every single string that may be read from a file, a
socket, out of a database, or from a user interface?

How can you be certain that any string is UTF-8 when doing any
reasonable IO?

> Even if you were evil/stupid and mixed encodings, surely all you'd get
> is different unicode errors or maybe the odd strange character during
> display?

One specific problem is dictionaries will stop working correctly if you
set the default encoding to anything but ASCII. The reason is that
with UTF-8 as the default encoding, you get

py> u"\u20ac" == u"\u20ac".encode("utf-8")
True
py> hash(u"\u20ac") == hash(u"\u20ac".encode("utf-8"))
False

So objects that compare equal will not hash equal. As a consequence, you
may have two different values for what should be the same key in a
dictionary.

> Well, flipping that giant switch has worked in production for the past 5
> years, so I'm afraid I'll respectfully disagree. I'd suspect the
> pragmatics of real world software are why that function even exists,
> and it's extremely useful when used correctly...

It has worked in your application. See my example above: it is very easy
to create applications that stop working correctly if you use
setdefaultencoding (at all - the only supported value is "latin-1",
since Unicode strings hash the same as byte strings if all characters
are in row 0).
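
The dictionary problem described above can be sketched without touching
the interpreter's default encoding at all. This is only an illustration
under stated assumptions: the MixedKey class and its hash_value argument
are hypothetical stand-ins for a unicode object and its UTF-8 encoded
byte string, which (with a UTF-8 default encoding) compare equal but
hash differently. It is not code from Python itself.

```python
class MixedKey(object):
    """Hypothetical stand-in: equality looks at the text, hashing does not."""

    def __init__(self, text, hash_value):
        self.text = text
        self.hash_value = hash_value

    def __eq__(self, other):
        return isinstance(other, MixedKey) and self.text == other.text

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        # Deliberately unrelated to __eq__, mimicking the broken invariant.
        return self.hash_value


# Two keys that compare equal but carry different hash values.
as_unicode = MixedKey(u"\u20ac", 1)
as_bytes = MixedKey(u"\u20ac", 2)

table = {}
table[as_unicode] = "stored via the unicode key"
table[as_bytes] = "stored via the byte string key"

keys_equal = as_unicode == as_bytes   # the keys compare equal...
entry_count = len(table)              # ...yet the dict keeps two entries
```

Because a dict consults the hash before it ever calls __eq__, the two
"equal" keys end up as separate entries, which is the failure mode
described above.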


From martin at  Thu Aug 27 08:53:02 2009
From: martin at ("Martin v. Löwis")
Date: Thu, 27 Aug 2009 08:53:02 +0200
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>
	<>	<>
Message-ID: <>

>> In retrospect, it should have been called sys._setdefaultencoding().
>> That sends an extra signal that it's not meant for general use.
> Crazy idea: how about mutating it into sys._setdefaultencoding rather
> than deleting it?

Please don't post crazy ideas unless you really mean them.

This specific crazy idea must be rejected; it would break backwards
compatibility, for no good reason.


From chris at  Thu Aug 27 09:27:01 2009
From: chris at (Chris Withers)
Date: Thu, 27 Aug 2009 08:27:01 +0100
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>
	<>	<>
	<> <>
Message-ID: <>

Martin v. Löwis wrote:
>>> In retrospect, it should have been called sys._setdefaultencoding().
>>> That sends an extra signal that it's not meant for general use.
>> Crazy idea: how about mutating it into sys._setdefaultencoding rather
>> than deleting it?
> Please don't post crazy ideas unless you really mean them.
> This specific crazy idea must be rejected; it would break backwards
> compatibility, for no good reason.

How is it breaking backwards compatibility?

- If people were somehow relying on sys.setdefaultencoding to be 
deleted, that's fine, it's still gone

- If people were somehow relying on sys not having an attribute called 
_setdefaultencoding, or were relying on stuffing an attribute into sys 
called _setdefaultencoding then... well... that seems pretty unlikely ;-)



From chris at  Thu Aug 27 09:42:51 2009
From: chris at (Chris Withers)
Date: Thu, 27 Aug 2009 08:42:51 +0100
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>	<20090825162305.7657.1157242584.divmod.xquotient.112@localhost.localdomain>
	<> <>
Message-ID: <>

Martin v. Löwis wrote:
>>> The ability to change the default encoding is a misfeature.  There's
>>> essentially no way to write correct Python code in the presence of
>>> this feature.
>> How so? If every single piece of text in your project is encoded in a
>> superset of ascii (such as utf-8), why would this be a problem?

I guess I should have said "every single piece of text in your project 
is encoded in a superset of ascii (such as utf-8) or is decoded into a 
unicode object at the application boundaries, such as an incoming http 
request or in the process of parsing a file off disk", in which case:

> What is "every single piece of text"? Every string occurring in source
> code? 

Yes.

> or also every single string that may be read from a file,

Yes.

> a
> socket, 

Yes.

> out of a database, 

Yes.

> or from a user interface?

Yes.

Any others I can say Yes to? ;-)

> How can you be certain that any string is UTF-8 when doing any
> reasonable IO?

Careful checking, and a knowledge for people working on the app's 
development that anything else will result in severe pain, both physical 
and mental ;-)

>> Even if you were evil/stupid and mixed encodings, surely all you'd get
>> is different unicode errors or maybe the odd strange character during
>> display?
> One specific problem is dictionaries will stop working correctly if you
> set the default encoding to anything but ASCII. 

...except they haven't.

> The reason is that
> with UTF-8 as the default encoding, you get
> py> u"\u20ac" == u"\u20ac".encode("utf-8")
> True
> py> hash(u"\u20ac") == hash(u"\u20ac".encode("utf-8"))
> False
> So objects that compare equal will not hash equal. As a consequence, you
> may have two different values for what should be the same key in a
> dictionary.

Indeed, but this doesn't happen because the app never has a situation 
where strings and unicodes are put in the same dict. However, it does 
have plenty of situations where lists containing a mixture of utf-8 
encoded strings and unicodes exist, where changing the default encoding 
removes a *lot* of pain.

> It has worked in your application. See my example above: it is very easy
> to create applications that stop working correctly if you use
> setdefaultencoding (at all - the only supported value is "latin-1",
> since Unicode strings hash the same as byte strings if all characters
> are in row 0).

Would anyone object if I added this snippet to the .rst that generates:

It doesn't seem to be recorded anywhere anyone who's likely to use 
setdefaultencoding is likely to find it...



From martin at  Thu Aug 27 09:53:23 2009
From: martin at ("Martin v. Löwis")
Date: Thu, 27 Aug 2009 09:53:23 +0200
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>
	<>	<>
	<> <>
Message-ID: <>

Chris Withers wrote:
> Martin v. Löwis wrote:
>>>> In retrospect, it should have been called sys._setdefaultencoding().
>>>> That sends an extra signal that it's not meant for general use.
>>> Crazy idea: how about mutating it into sys._setdefaultencoding rather
>>> than deleting it?
>> Please don't post crazy ideas unless you really mean them.
>> This specific crazy idea must be rejected; it would break backwards
>> compatibility, for no good reason.
> How is it breaking backwards compatibility?
> - If people were somehow relying on sys.setdefaultencoding to be
> deleted, that's fine, it's still gone
> - If people were somehow relying on sys not having an attribute called
> _setdefaultencoding, or were relying on stuffing an attribute into sys
> called _setdefaultencoding then... well... that seems pretty unlikely ;-)

If people were using the reload trickery, that would break if the
function changed its name.


From chris at  Thu Aug 27 10:01:33 2009
From: chris at (Chris Withers)
Date: Thu, 27 Aug 2009 09:01:33 +0100
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>
	<>	<>
	<> <>
	<> <>
Message-ID: <>

Martin v. Löwis wrote:
>> - If people were somehow relying on sys not having an attribute called
>> _setdefaultencoding, or were relying on stuffing an attribute into sys
>> called _setdefaultencoding then... well... that seems pretty unlikely ;-)
> If people were using the reload trickery, that would break if the
> function changed its name.

No it doesn't:

$ svn diff
Index: Lib/
--- Lib/ (revision 74552)
+++ Lib/ (working copy)
@@ -540,6 +540,7 @@
      if hasattr(sys, "setdefaultencoding"):
+        sys._setdefaultencoding = sys.setdefaultencoding
          del sys.setdefaultencoding

 >>> import sys
 >>> sys._setdefaultencoding
<built-in function setdefaultencoding>
 >>> sys.setdefaultencoding
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'setdefaultencoding'

 >>> reload(sys)
<module 'sys' (built-in)>
 >>> sys.setdefaultencoding
<built-in function setdefaultencoding>
 >>> sys._setdefaultencoding
<built-in function setdefaultencoding>



From martin at  Thu Aug 27 10:02:47 2009
From: martin at ("Martin v. Löwis")
Date: Thu, 27 Aug 2009 10:02:47 +0200
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>	<20090825162305.7657.1157242584.divmod.xquotient.112@localhost.localdomain>
	<> <>
Message-ID: <>

>> One specific problem is dictionaries will stop working correctly if you
>> set the default encoding to anything but ASCII. 
> ...except they haven't.

In your application. Can you please agree that this is a semantic problem
that is completely unacceptable for language design?

> Indeed, but this doesn't happen because the app never has a situation
> where strings and unicodes are put in the same dict. However, it does
> have plenty of situations where lists containing a mixture of utf-8
> encoded strings and unicodes exist, where changing the default encoding
> removes a *lot* of pain.

So you should convert all byte strings to UTF-8 before adding them
to the list. Assuming you have used proper encapsulation and
object-oriented design, it shouldn't be too difficult to find, for each
such list, where the places are that modify the list.
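
One way to read this advice is to decode every byte string explicitly
at the point where it enters such a list, so the list only ever holds
unicode objects. A minimal sketch, assuming all incoming byte strings
really are UTF-8; the to_text helper is hypothetical, not an existing
API:

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding byte strings explicitly."""
    if isinstance(value, bytes):
        # Explicit decode: no reliance on the process-wide default encoding.
        return value.decode(encoding)
    return value


# A list mixing text and UTF-8 encoded byte strings, as in the thread.
mixed = [u"caf\u00e9", u"caf\u00e9".encode("utf-8"), u"\u20ac"]

# Normalise once, at the boundary, before anything else touches the list.
texts = [to_text(item) for item in mixed]
```

After this step comparisons and joins inside the list never trigger an
implicit conversion, so the default encoding no longer matters there.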

> Would anyone object if I added this snippet to the .rst that generates:

The snippet explaining the problem? I don't mind, but Raymond is on
record for objecting to any addition of a warning box to the
documentation, because it gives the impression that Python is full of
problems, when many of these warnings really refer to boundary cases only.


From stephen at  Thu Aug 27 10:12:30 2009
From: stephen at (Stephen J. Turnbull)
Date: Thu, 27 Aug 2009 17:12:30 +0900
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library
	for	the	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

Ben Finney writes:

 > A pity, since the entire point of Design Patterns is to give us a
 > vocabulary of terms to use that enable these concepts to be communicated
 > *without* continually re-defining them. To that extent, then, they fail
 > their purpose.

Of course not!  There will always be children ... I hope.

The point of Design Patterns is to give a vocabulary that can be used
in an "in group" context without any explanation at all, and in the
stylized form

    This implementation follows the "Strategy" design pattern of Ralf,
    Joel, Yen-shih, and Gundarmambanagoong, which is ...

which educates the young and allows the cognoscenti to daydream for a

From martin at  Thu Aug 27 10:06:40 2009
From: martin at ("Martin v. Löwis")
Date: Thu, 27 Aug 2009 10:06:40 +0200
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>
	<>	<>
	<> <>
	<> <>
Message-ID: <>

>      if hasattr(sys, "setdefaultencoding"):
> +        sys._setdefaultencoding = sys.setdefaultencoding
>          del sys.setdefaultencoding

Ah, so you didn't want to rename the function. I agree that this
would not break backwards compatibility.

I guess the basic objection remains: making it so would make
_setdefaultencoding a supported feature, which would then mean
that we should fix all the bugs that it causes - when we already
know (because we thought many years about this) that it is not
possible to implement setdefaultencoding correctly and efficiently
(so the current implementation is only efficient, but not correct).


From mal at  Thu Aug 27 10:34:36 2009
From: mal at (M.-A. Lemburg)
Date: Thu, 27 Aug 2009 10:34:36 +0200
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <> <>
Message-ID: <>

Chris Withers wrote:
> M.-A. Lemburg wrote:
>> Let's look at this from another angle: sys.setdefaultencoding()
>> is only made available for use in 
> ...see this:
> I would like to use for all the very good reasons given
> in this thread:
> - I don't want to change the default encoding for every project that
> uses the python installation in question
> - I don't even want to change the default encoding for every python
> script run by the current user
> - I only want to change the default encoding for one particular project.
> Sadly, for the reasons I describe in the thread, won't find a
> in this situation...
>> If you use it anywhere else, you're on your own. 
> No problem with that. To be specific, this is a Zope 2.12 instance
> driven by this buildout:
> [instance]
> recipe = zc.recipe.egg
> eggs = ${buildout:eggs}
> interpreter = py
> entry-points=
>    zopectl=Zope2.Startup.zopectl:main
> scripts = runzope zopectl
> initialization =
>    import sys
>    reload(sys)
>    sys.setdefaultencoding('utf-8')
>    sys.argv[1:1] = ['-C','${buildout:directory}/etc/instance.conf']
> The call to sys.setdefaultencoding is *very* early in the scheme of
> things... The runzope script that gets run only has some sys.path
> manipulation before sys.setdefaultencoding gets called. What problems
> could there be by calling sys.setdefaultencoding there?
>> Such usage
>> is not supported and may very well break your interpreter 
> Can you give an example?

You can get strange effects caused by the fact that some
string objects will now compare equal while not necessarily
having the same hash value.

Unicode objects and strings have the same hash value provided
that they are both ASCII.

With the ASCII default encoding, a non-ASCII string cannot
be compared to a Unicode object, so the problem does not

>> or
>> cause data corruption (the default encoded versions of Unicode
>> objects are cached inside the objects).
> When called as early as in the above script, what objects would have
> encoded strings cached in them?

Difficult to say. This depends a lot on the environment
where you are running the script.

Note that the codecs are loaded at a very early stage in
the interpreter startup and a lot of them do use Unicode
strings. This wasn't the case in Python 1.6 when
the whole approach to setting the default
encoding was designed, but added later on, in Python 2.1
IIRC, when no one really considered using a different
default encoding anymore.

Using UTF-8 as new default encoding will not cause much
trouble with this, since it is an ASCII superset.

However, changing it more than once will cause the earlier
Unicode objects to still use the old default encoding

Using a different non-ASCII compatible encoding, such
as UTF-16, will cause breakage for the same reason.

The default encoded string version of a Unicode object is
cached in the object and never recreated after it has
first been successfully encoded.

When only changing the default encoding once and using
UTF-8 as the new default encoding, you'll only run into
the hash value problem.

If that's not an issue for your
application, e.g. you don't mix Unicode and string key
objects in your dictionaries and don't rely on the special
relationship between hashes and comparisons elsewhere,
you should be fine.

>> Now, in your particular case, you're probably better off just
>> tweaking directly in your custom Python interpreter
>> rather than relying on (see setencoding() in
> Why?

To get the job done :-)

You could rewrite setencoding() to get the encoding information
from e.g. an os.environ variable or some config file.
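
A rough sketch of that idea, kept separate from any real interpreter
startup code: pick the encoding name from an environment variable and
validate it before use. The function name, the MYAPP_DEFAULT_ENCODING
variable and the "ascii" fallback are all assumptions made for the
example.

```python
import codecs
import os


def choose_default_encoding(environ=None,
                            variable="MYAPP_DEFAULT_ENCODING",
                            fallback="ascii"):
    """Return a validated codec name taken from the environment.

    The variable name is hypothetical; unknown codecs fall back to ASCII.
    """
    if environ is None:
        environ = os.environ
    name = environ.get(variable, fallback)
    try:
        # codecs.lookup() raises LookupError for unknown encodings and
        # returns a CodecInfo whose .name is the normalised codec name.
        return codecs.lookup(name).name
    except LookupError:
        return fallback


encoding = choose_default_encoding({"MYAPP_DEFAULT_ENCODING": "UTF-8"})
```

In a customised Python 2 setencoding(), the returned name would be what
gets handed to sys.setdefaultencoding(); the sketch stops short of that
call so it stays harmless to run.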

Marc-Andre Lemburg


From ncoghlan at  Thu Aug 27 11:08:39 2009
From: ncoghlan at (Nick Coghlan)
Date: Thu, 27 Aug 2009 19:08:39 +1000
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for	the
 Python Standard Library
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Ben Finney wrote:
> Antoine Pitrou <solipsis at> writes:
>> DrKJam <drkjam <at>> writes:
>>> netaddr employs a simple variant of the GoF Strategy design pattern (with
>> added Python sensibility).
>> It would be nice if you could avoid employing this kind of acronyms
>> without explaining them. Not everybody drinks the design pattern
>> kool-aid.
> A pity, since the entire point of Design Patterns is to give us a
> vocabulary of terms to use that enable these concepts to be communicated
> *without* continually re-defining them. To that extent, then, they fail
> their purpose.

My experience with the named design patterns is that:

1. An awful lot of them seem to be just about working around some of the
limitations of static typing, a lack of functions as first class
objects, or are in some other way a lot less useful outside a C++/Java

2. Others are used intuitively by a lot of developers that have never
even heard of any of the GoF authors or the Design Patterns book (the
Adapter pattern being the most common example that occurs to me).

I still think most people (even those that primarily work in dynamic
languages) can learn something by reading it, but following their
examples too slavishly creates evils of its own (especially when
attempting to apply patterns that don't really suit the language being
used). When the inappropriate use of named patterns runs counter to the
accepted idioms of a language is when I see terms like the "design
pattern kool-aid" mentioned above getting thrown around :)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Thu Aug 27 11:18:10 2009
From: ncoghlan at (Nick Coghlan)
Date: Thu, 27 Aug 2009 19:18:10 +1000
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

skip at wrote:
>     Martin> I think it's too early to tell. It may be that they have not yet
>     Martin> achieved their purpose - just let's wait fifty more years (and
>     Martin> I'm only half-joking).
> So what you're really saying is we only have to wait 25 years...


Martin has a really good point though - software development is still
pretty immature as a discipline, having been around for mere decades as
opposed to the centuries (or more) that other activities like
mathematics, physics or construction have behind them.



From ncoghlan at  Thu Aug 27 11:37:25 2009
From: ncoghlan at (Nick Coghlan)
Date: Thu, 27 Aug 2009 19:37:25 +1000
Subject: [Python-Dev] Excluding the current path from module search path?
In-Reply-To: <>
References: <> <>
Message-ID: <>

Chris Withers wrote:
> Nick Coghlan wrote:
>> The details of the sys.path manipulation at program startup are
>> documented here:
>> The directory prepended to sys.path is based on the code executed by the
>> command line.
> It's more subtle than that though...
> The OP in is being bitten by the
> same expectation that I am: should be found somewhere
> on the sys.path present at the start of the script/module/command/etc
> being executed. (The bug referenced in that report makes things worse,
> because this used to work, at least on Windows ;-) )
> The problem is that (and therefore is imported
> early in main.c on line 516 as part of Py_Initialize(), but the path of
> the current script only gets added later on in RunMainFromImporter
> called on line 569.
> Strictly speaking, the docs at
> aren't lying, but it takes an understanding of when is imported
> that isn't available to anyone who doesn't read C to know why a path
> that is present on sys.path when the user's script starts isn't being
> searched for
> What do people feel about this?
> At the very least, I'd like to add a warning box in site.html to explain
> why sitecustomize might not be found where people expect.
> I'd *like* to have the paths be the same for as they are for the
> subsequent code that's executed, but would that make too much of a mess
> of main.c and runpython.c?

Ah, OK - I see the problem now. However, I think the current behaviour
is correct, it just needs to be documented better (probably noted in
both the command line doco regarding sys.path manipulation and in the
doco for

The reason I think the current behaviour is correct is that and are meant to be about customising the *site* (i.e. the
installation of Python that is being executed) rather than about
customizing a particular application. Importing them before the script
specific directories are prepended to sys.path goes a long way towards
achieving that.

Also, as was pointed out on the tracker item, having a script that can
automatically be executed when running an arbitrary Python script
without any request from or notification to the user is not a good idea
from a security standpoint.

When it comes to adding additional paths for specific applications, you
can either bundle the relevant packages into a single directory and use
2.6's directory execution feature or else look into the assorted
application environment customisation tools that are out there like



From stephen at  Thu Aug 27 12:29:52 2009
From: stephen at (Stephen J. Turnbull)
Date: Thu, 27 Aug 2009 19:29:52 +0900
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

Chris Withers writes:

 > > How can you be certain that any string is UTF-8 when doing any
 > > reasonable IO?
 > Careful checking, and a knowledge for people working on the app's 
 > development that anything else will result in severe pain, both physical 
 > and mental ;-)

If you're *that* careful, the additional effort to hack around this is
negligible. The problem is that most people are *never* that careful,
and *all* people are rarely that careful.

I understand your use case, but I don't see a case for exposing this
to the general public.

From drkjam at  Thu Aug 27 15:07:59 2009
From: drkjam at (DrKJam)
Date: Thu, 27 Aug 2009 14:07:59 +0100
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/25 Peter Moody <peter at>

> On Mon, Aug 24, 2009 at 3:24 PM, DrKJam<drkjam at> wrote:


> As it was left in early June, a pep and design modifications were
> requested before ipaddr would be considered for inclusion, but if this
> is going to start *another* drawn out ipaddr/netaddr thread, perhaps
> the mailman admin(s) could setup a new SIG list for this.  I
> personally hope that's not required; yours has been the only
> dissenting email and I believe I respond to all of your major points
> here.

The PEP process is the perfect forum for spending some time scrutinizing and
discussing this topic in more detail. I will be raising further points in
future when I've had time to fully evaluate both the PEP and the reference
implementation of ipaddr.

At this stage, it is premature to assume the reference implementation
provided along with the PEP is necessarily complete, only requiring a few
bug fixes to get through the approval process.

> > 1) Firstly, an offering of code.
> >
> > I'd like to bring to your attention an example implementation of an IP
> > address library and interface for general discussion to compare and
> contrast
> > with ipaddr 2.0.x :-
> >
> >
> >
> > It is based on netaddr 0.7.2 which I threw together earlier today.
> >
> > In essence, I've stripped out all of what could be considered
> non-essential
> > code for a purely IP related library. This branch should be suitable for
> > *theoretical* consideration of inclusion into some future version of the
> > Python standard library (with a little work).
> >
> > It is a pure subset of netaddr release 0.7.2, *minus* the following :-
> >
> > - all IEEE layer-2 code
> > - some fairly non-essential IANA IP data files and lookup code
> > - IP globbing code (fairly niche)
> >
> > Aside: Just a small mention here that I listened carefully to Clay
> McClure's
> > and others criticisms of the previous incarnation of ipaddr. The 0.7.x
> > series of netaddr breaks backward compatibility with previous netaddr
> > releases and is an "answer" of sorts to that discussion and issue raised
> > within the Python community. I hope you like what I've done with it.
> >
> > For the purposes of this discussion consider this branch the "Firefox to
> > netaddr's Mozilla" or maybe just plain old "netaddr-ip-lite" ;-)
> >
> > 2) I refute bold claim in the PEP that :-
> >
> >     "Finding a good library for performing those tasks can be somewhat
> more
> > difficult."
> >
> > On the contrary, I wager that netaddr is now a perfectly decent
> alternative
> > implementation to ipaddr, containing quite a few more features with
> little
> > of the slowness for most common operations,
> I think you mean refuse,

No, I meant refute.

> b/c this certainly wasn't the case when I
> started writing ipaddr. IPy existed, but it was far too heavyweight
> and restrictive for what I needed (no disrespect to the author(s)
> intended). I believe I've an email or two from you wherein you
> indicate the same.

The comment made on IPy, to which I believe you are referring, was in
response to you incorrectly comparing netaddr and IPy's implementation
(assuming conditional logic was used within each method to support IP
versioning). As already stated netaddr gets around this with a strategy
design pattern approach (apologies to readers for using the "Gang of Four"
acronym with regard to this).
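
For readers unfamiliar with the term, here is a rough sketch of what a
strategy-based address class could look like. It is not netaddr's
actual code or API: every class, method and name below is made up for
illustration, and it assumes Python 3 with IPv6 support in the socket
module.

```python
import socket


class IPv4Strategy(object):
    """Version-specific behaviour for IPv4 (hypothetical)."""
    version = 4

    def parse(self, text):
        parts = text.split(".")
        if len(parts) != 4:
            raise ValueError("not a dotted quad: %r" % (text,))
        value = 0
        for part in parts:
            octet = int(part)
            if not 0 <= octet <= 255:
                raise ValueError("octet out of range: %r" % (part,))
            value = (value << 8) | octet
        return value

    def format(self, value):
        return ".".join(str((value >> shift) & 0xFF)
                        for shift in (24, 16, 8, 0))


class IPv6Strategy(object):
    """Version-specific behaviour for IPv6 (hypothetical)."""
    version = 6

    def parse(self, text):
        try:
            packed = socket.inet_pton(socket.AF_INET6, text)
        except (OSError, ValueError):
            raise ValueError("not an IPv6 address: %r" % (text,))
        return int.from_bytes(packed, "big")

    def format(self, value):
        return socket.inet_ntop(socket.AF_INET6, value.to_bytes(16, "big"))


STRATEGIES = {4: IPv4Strategy(), 6: IPv6Strategy()}


class IPAddress(object):
    """One class for both versions; the strategy holds the differences."""

    def __init__(self, text, version=None):
        # An explicit version skips the trial-and-error detection.
        versions = [version] if version is not None else [4, 6]
        for candidate in versions:
            strategy = STRATEGIES[candidate]
            try:
                self.value = strategy.parse(text)
            except ValueError:
                continue
            self.strategy = strategy
            return
        raise ValueError("not a valid IP address: %r" % (text,))

    @property
    def version(self):
        return self.strategy.version

    def sort_key(self):
        # Sort IPv4 before IPv6, then numerically within a version.
        return (self.version, self.value)

    def __str__(self):
        return self.strategy.format(self.value)


addresses = sorted([IPAddress("::1"), IPAddress("10.0.0.1")],
                   key=IPAddress.sort_key)
```

The methods of IPAddress contain no "if version == 4" branches; the
version-specific logic lives in the strategy objects, which is the
point being made about avoiding conditional logic in each method.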

IPy is heavyweight? How so? It is a mere 1200 lines including comments and
deals with IPv4 and IPv6 addressing, much like ipaddr (albeit with fewer
features). There are certainly issues you could raise against it (otherwise
we wouldn't be here), but being heavyweight is not one of them.

I would actively encourage authors of said library (Victor Stinner is listed
as the current maintainer) to get involved in the discussion of this PEP. It
is their legacy that this work is picking up from.

Incidentally, I've noticed a few bug fix releases come through for IPy on
PyPI in the last month so that project certainly seems alive and well.

I think the PEP currently doesn't provide appropriate weight to the efforts
of others in this area.

FYI, here is a wiki entry I've been maintaining for a while now to this end

> > 2/3x faster in a lot of cases,
> > not that we're counting. What a difference a year makes!
> > I also rate IPy quite highly even if it is getting a little "long in the
> tooth".
> > For a lot of users, IPy could also be considered a nice, stable API!
> yes, netaddr has sped up quite a bit. It's still slower in many cases
> as well. But again, who's timing?

I mention speed and timings as the PEP cites this as one of the benefits of
considering the ipaddr reference implementation.

> > By the same token I'm happy to note some convergence between the ipaddr
> and
> > netaddr's various interfaces, particularly in light of discussions and
> > arguments put forward by Clay McClure and others. A satisfactory
> compromise
> > between the two however still seems a way off.
> >
> >
> > 3) I also disagree with the PEP's claim that :-
> >
> >     "attempts to combine [IPv4 and IPv6] into one object would be like
> > trying to force a round peg into a square hole (or vice versa)".
> >
> > netaddr (and for that matter IPy) cope with this perceived problem
> > admirably.
> >
> > netaddr employs a simple variant of the GoF Strategy design pattern (with
> > added Python sensibility). In the rare cases where ambiguity exists
> between
> > IPv4 and IPv6 addresses a version parameter may be passed to the
> constructor
> > of the IPAddress class to differentiate between them. Providing an IP
> > address version to the constructor also provides a small performance
> > improvement.
> I'm not sure what point you're trying to make here. I didn't say it
> was impossible, I inferred that there are easier ways. having used
> code which crams both types into one object, I found it to be cludgey
> and complicated so I designed something different.

Let me clarify. I am +1 on the specific item in the PEP regarding the need
for separate and distinct IPAddress and IPNetwork class interfaces that are
not conflated into a single interface. Clay McClure made this point very
eloquently. I've done a good bit of experimentation on this since it was
mentioned so I am fully aware of the pros and cons of each approach. A brief
look at netaddr.ip.lite confirms that on this we both agree.

Where I disagree is on the need to have yet another split in the interface
to support different IP versions (and a set of Factory functions to pull it
all together again). Hey, another design pattern, also known as the "Factory
Method" a.k.a. Virtual Constructor (or in this case a Python function).
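
To make the ambiguity under discussion concrete, here is a minimal sketch
using the modern standard-library ipaddress module. That module is an
assumption of this example, not the 2009 ipaddr or netaddr code, and
make_address() is a hypothetical helper name. A bare integer could be either
an IPv4 or an IPv6 address, so an optional version argument settles it.

```python
import ipaddress


def make_address(value, version=None):
    """Build an address object; `version` (4 or 6) resolves ambiguous input.

    Hypothetical helper for illustration only.
    """
    if version is None:
        # Let the library guess: small integers are treated as IPv4.
        return ipaddress.ip_address(value)
    if version == 4:
        return ipaddress.IPv4Address(value)
    if version == 6:
        return ipaddress.IPv6Address(value)
    raise ValueError("version must be 4 or 6, got %r" % (version,))


print(make_address(1))             # guessed as IPv4
print(make_address(1, version=6))  # explicitly IPv6
```

The same integer yields two different addresses depending on the version,
which is the case the explicit parameter exists for.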

> and as a hardly partial observer, I'll add the explicit address
> version you can pass to the IPAddress class, but not the IPNetwork
> class, is, odd. it actually seems to slow down object creation (~5%)
> except in the case of an int arg (your default is about twice as
> slow).

Ah, the issue of speed and timings again. Let's concentrate on getting the
interface right before we spend too much effort on optimization. I'm quite
happy to do a full speed comparison of major features in both libraries but
I don't think that would be a worthwhile use of time just now.

Currently I'm ambivalent on whether an IP(vX)Network class constructor
should accept a numerical (i.e. integer) value at all *unless* you explicitly
state somehow that you want the network aspect to be inferred in some
specific way. It isn't a case of just choosing /32 or /128 and having this
as the only option. IP (v4) classful rules are still pervasive in the real
world. A general case IP library available to the whole Python community
should certainly take this into account.
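
As a rough illustration of why /32 is not the only sensible default, the
sketch below infers a prefix from the historical IPv4 class boundaries. It
uses the modern standard-library ipaddress module, and the names
classful_prefix() and network_from_int() are hypothetical. Giving multicast
and reserved space a /32 is an arbitrary choice made for this example, not
something either library is known to do.

```python
import ipaddress


def classful_prefix(address):
    """Return the classful prefix length for an IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:   # class A
        return 8
    if first_octet < 192:   # class B
        return 16
    if first_octet < 224:   # class C
        return 24
    return 32               # class D/E: no classful network, assumed /32


def network_from_int(value, classful=False):
    """Build an IPv4 network from an integer, stating how to infer the prefix."""
    addr = ipaddress.IPv4Address(value)
    prefix = classful_prefix(addr) if classful else 32
    # strict=False masks off the host bits instead of raising ValueError.
    return ipaddress.IPv4Network("%s/%d" % (addr, prefix), strict=False)


print(network_from_int(0x0A010203))                 # host route
print(network_from_int(0x0A010203, classful=True))  # class A network
```

The caller has to ask for classful inference explicitly, so the integer
constructor never silently picks one interpretation.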

> > IPv4 and IPv6 addresses can be used interchangeably throughout netaddr
> > without causing issue during operations such as sorting, merging (known
> in
> > the PEP as "address collapsing") or address exclusion.
> >
> > Don't try and do this with the current reference implementation of ipaddr
> :-
> >
> >>>> collapse_address_list([IPv4Address(''),
> >>>> IPv6Address('::')])
> > [IPv4Network('')]
> >
> > OUCH! Even if this isn't allowed (according to the documentation), it
> should
> > raise an Exception rather than silently passing through.
> >
> > I actually raised this back in May on the ipaddr bug tracker but it
> hasn't
> > received any attention so far :-
> >
> >
> >
> > Compare this with netaddr's behaviour :-
> >
> >>>> cidr_merge([IPAddress(''), IPAddress('::')])
> > [IPNetwork(''), IPNetwork('::')]
> >
> > That's more like it.
> OUCH! indeed. I'm not even sure that this is a nice corner case
> feature, summarizing a single list of mixed ip type objects. with an
> extra line or two, this can be done in ipaddr, though 'tis true that
> we should now raise an exception and don't (it appears to be something
> that was introduced recently).  If this is a feature for which
> developers are clamoring, I'm all over it. Yours is the first email
> I've heard mention it.

I may be the only one raising issues but that shouldn't mean they are any
less relevant. There is a whole different feel and thrust behind both
interfaces each with their own merits.
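
For readers who want to see the "extra line or two" approach spelled out,
here is a hedged sketch with the modern standard-library ipaddress module
(not the 2009 ipaddr reference code). In that module, collapse_addresses()
raises TypeError for a mixed-version list rather than passing silently. The
hypothetical merge_mixed() helper splits the input by IP version and
collapses each group separately.

```python
import ipaddress


def merge_mixed(items):
    """Collapse a list that may mix IPv4 and IPv6 addresses or networks."""
    v4 = [item for item in items if item.version == 4]
    v6 = [item for item in items if item.version == 6]
    # Each call only ever sees one IP version, so neither raises TypeError.
    return (list(ipaddress.collapse_addresses(v4)) +
            list(ipaddress.collapse_addresses(v6)))


mixed = [ipaddress.ip_address('192.0.2.1'), ipaddress.ip_address('::1')]
print(merge_mixed(mixed))
```

Whether the library should do this split itself or reject mixed input is the
design question being argued above. The sketch only shows that both
behaviours are cheap to provide.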

> > 4) It may just be me but the design of this latest incarnation of ipaddr
> > seems somewhat complicated for so few lines of code. Compared with
> ipaddr,
> > netaddr doesn't use or require multiple inheritance nor a seemingly
> > convoluted inheritance hierarchy. There isn't a need for an IP() type
> > 'multiplexer' function either (although I might be missing an important
> use
> > case here). But, then again, this may just be my personal preference
> talking
> > here. I prefer composition over inheritance in most cases.
> this basically smacks of more petty attackery from the start. so I'll
> reply with, "it's just you".
> if you want to debate the merits of GOF strategy vs. multiple
> inheritance, fine. the class inheritance in ipaddr is very clean, and
> leaves very little code duplication. The classes are very clearly
> named and laid out, and in general are much easier to follow than the
> strategy method you've chosen for netaddr.

I realise you've done a lot of work on ipaddr and my observations are not
intended as "petty attackery" as you put it. It was merely to question
whether the shift in approach from earlier incarnations of ipaddr to this is
the correct path to be taking. I don't think that solely relying on "IS A"
via multiple inheritance necessarily brings clarity to this code which, as
stated in the PEP, is intended to be simple for others to understand and
possibly use as a basis for their own extensions. More on this in future

If you missed it I have diagrammed the class hierarchy and internal layout
of each library here for consideration :-


> > 5) The ipaddr library is also missing options for expanding various
> > (exceedingly common) IP abbreviations.
> >
> >>>> from netaddr import IPNetwork
> >
> >>>> IPNetwork('10/8', True)
> > IPNetwork('')
> >
> > netaddr also handles classful IP address logic, still pervasive
> throughout
> > modern IP stacks :-
> >
> >>>> IPNetwork('', True)
> > IPNetwork('')
> >
> > Note that these options are disabled by default, to keep the speed of the
> > IPNetwork constructor up for more normal cases.
> these seem like corner case features for the sake of having features,
> you don't even seem to put much stock in them. FWIW, I've never seen a
> request for something similar. I may say '10 slash 8', but I mean,
> ''. I'm missing the utility here, but I'm open to reasoned
> arguments.

I don't see why genuine features should be automatically dismissed as
"corner cases".

If you need proof, here is an excerpt from RFC 1918 :-

3. Private Address Space

   The Internet Assigned Numbers Authority (IANA) has reserved the
   following three blocks of the IP address space for private
   internets:

        -  (10/8 prefix)
        -  (172.16/12 prefix)
        -  (192.168/16 prefix)


I've also had specific requests from users about this feature, one just in
the last week (which only required me to point them towards the available
switch argument in IPNetwork constructor to enable the required behaviour).

In netaddr 0.7.x I have chosen *not* to make this expansion the default case
because it provides a not insignificant construction penalty for those that
are not interested in it (as you have already noted and of which I am

I believe strongly that this *is* an important option for a general use IP
address library.
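
A minimal sketch of the abbreviation expansion described above, assuming the
simplest rule: missing trailing octets are filled with zeros. The function
name expand_abbreviated() is hypothetical and this is not netaddr's
implementation. It relies on the modern standard-library ipaddress module
for validation.

```python
import ipaddress


def expand_abbreviated(cidr):
    """Expand shorthand such as '10/8' into a full IPv4 network."""
    address, sep, prefix = cidr.partition('/')
    octets = address.split('.')
    if not 1 <= len(octets) <= 4 or not all(o.isdigit() for o in octets):
        raise ValueError("not an abbreviated IPv4 address: %r" % (cidr,))
    # Pad on the right, so '172.16' becomes '172.16.0.0'.
    octets += ['0'] * (4 - len(octets))
    expanded = '.'.join(octets) + sep + prefix
    # strict=False masks any host bits; invalid values still raise ValueError.
    return ipaddress.IPv4Network(expanded, strict=False)


for shorthand in ('10/8', '172.16/12', '192.168/16'):
    print(shorthand, '->', expand_abbreviated(shorthand))
```

Note that right-padding is only one possible reading. Some classic socket
routines interpret a two-part address differently, which is one reason to
keep this behaviour behind an explicit switch.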


> > There is a lot more to consider here than I can cram into this initial
> > message, so I'll hand over to you all for some (hopefully) serious
> debate.
> I'm always open to serious debate, and patches/bug reports (apologies
> for missing your earlier issue. I'm not sure if you were aware, but
> ipaddr was undergoing a major re-write at the time and I never got
> around to following up).

I note your response to this on the ipaddr bug tracker today, thanks.


> PS - Why does the References section in the PEP contain links to patches
> > already applied to the ipaddr 2.0.x reference implementation?
> There's A link to A patch (singular, both times), which has already
> been applied. This link exists b/c, at the time I last updated the
> PEP, the patch hadn't been applied as it was still being reviewed.

Thanks for the clarification.


in general, that leads
> to fewer bugs like the following:
> >>> help(netaddr.IPNetwork.__init__)
> Help on method __init__ in module netaddr.ip:
> __init__(self, addr, implicit_prefix=False) unbound netaddr.ip.IPNetwork
> method
>    Constructor.
>    @param addr: an IPv4 or IPv6 address with optional CIDR prefix,
>        netmask or hostmask. May be an IP address in representation
>        (string) format, an integer or another IP object (copy
>        construction).
>    @param implicit_prefix: if True, the constructor uses classful IPv4
>        rules to select a default prefix when one is not provided.
>        If False it uses the length of the IP address version.
>        (default: False).
> >>> netaddr.IPNetwork(1)
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>   File "./netaddr/ip/", line 632, in __init__
>    prefix, suffix = addr.split('/')
> AttributeError: 'int' object has no attribute 'split'
> vs.
> >>> import ipaddr
> >>> ipaddr.IPNetwork(1)
> IPv4Network('')

Thanks for raising this on the netaddr bug tracker. I'll take a look at it.

Did you have any other comments on the PEP?

Yes I do but they will be coming through in stages unfortunately as I get
time to look at this further.

David Moss

From exarkun at  Thu Aug 27 15:08:57 2009
From: exarkun at (exarkun at
Date: Thu, 27 Aug 2009 13:08:57 -0000
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>
Message-ID: <20090827130857.7475.1053558531.divmod.xquotient.8@localhost.localdomain>

On 26 Aug, 11:51 pm, chris at wrote:
>exarkun at wrote:
>>The ability to change the default encoding is a misfeature.  There's 
>>essentially no way to write correct Python code in the presence of 
>>this feature.
>How so? If every single piece of text in your project is encoded in a 
>superset of ascii (such as utf-8), why would this be a problem?
>Even if you were evil/stupid and mixed encodings, surely all you'd get 
>is different unicode errors or maybe the odd strange character during 

This is what I meant when I said what I said about correct code.  If 
you're happy to have encoding errors and corrupt data, then I guess 
you're happy to have a function like setdefaultencoding.
>>It may be a major task, but the best thing you can do is find each str 
>>and unicode operation in the software you're working with and make 
>>them correct with respect to your inputs and outputs.  Flipping a 
>>giant switch for the entire process is just going to change which 
>>things are wrong.
>Well, flipping that giant switch has worked in production for the past 
>5 years, so I'm afraid I'll respectfully disagree. I'd suspect the 
>pragmatics of real world software are why that function even exists, 
>and it's extremely useful when used correctly...

I suppose it's fortunate for you that the function exists, then.  For my 
part, I have managed to write and operate a lot of code in production 
for at least as long without ever touching it.  Generally speaking, I 
also don't find that I encounter lots of unicode errors or corrupted 
data (*sometimes* I do; in those cases, I fix the broken code and it 
doesn't happen again).
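
The "make each operation correct" approach can be sketched as follows. This
is an illustrative example written for Python 3, where bytes and text are
distinct types, and the helper names are hypothetical. The point is only
that every boundary names its encoding instead of relying on a process-wide
default.

```python
def decode_input(raw, encoding="utf-8"):
    """Turn bytes from the outside world into text, naming the encoding."""
    return raw.decode(encoding)


def encode_output(text, encoding="utf-8"):
    """Turn text back into bytes, again with an explicit encoding."""
    return text.encode(encoding)


def shout(raw, encoding="utf-8"):
    """Work on text internally; touch bytes only at the edges."""
    return encode_output(decode_input(raw, encoding).upper(), encoding)


print(shout("caf\u00e9".encode("utf-8")))
```

Input in the wrong encoding then fails at the boundary with a
UnicodeDecodeError, rather than corrupting data somewhere deeper in the
program.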


From ncoghlan at  Thu Aug 27 15:51:30 2009
From: ncoghlan at (Nick Coghlan)
Date: Thu, 27 Aug 2009 23:51:30 +1000
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

DrKJam wrote:
> Currently I'm ambivalent on whether an IP(vX)Network class constructor
> should accept a numerical (i.e. integer) value at all *unless* you
> explicit state somehow that you want the network aspect to be inferred
> in some specific way. It isn't a case of just choosing /32 or /128 and
> having this as the only option. IP (v4) classful rules are still
> pervasive in the real world. A general case IP library available to the
> whole Python community should certainly take this into account.

Don't forget that separating the "from_int" construction behaviour out
to a separate class method is an available option rather than type-based
behaviour switching in the main constructor.
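
A short sketch of that suggestion, under stated assumptions: the Network
class below is hypothetical, wraps the modern standard-library ipaddress
module, and is not the interface of ipaddr or netaddr. The main constructor
accepts only strings, and integer input goes through a separate from_int()
class method that requires the caller to state the prefix length and
version.

```python
import ipaddress


class Network:
    """Hypothetical wrapper: strings in __init__, integers via from_int()."""

    def __init__(self, text):
        if not isinstance(text, str):
            raise TypeError("use Network.from_int() for integer input")
        self._net = ipaddress.ip_network(text, strict=False)

    @classmethod
    def from_int(cls, value, prefixlen, version=4):
        # The caller states the prefix and version; nothing is inferred.
        if version == 4:
            addr = ipaddress.IPv4Address(value)
        elif version == 6:
            addr = ipaddress.IPv6Address(value)
        else:
            raise ValueError("version must be 4 or 6")
        return cls("%s/%d" % (addr, prefixlen))

    def __str__(self):
        return str(self._net)


print(Network.from_int(0x0A000000, 8))
print(Network.from_int(1, 128, version=6))
```

This keeps type-based switching out of __init__, at the cost of one more
name in the public interface.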


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From peter at  Thu Aug 27 15:52:20 2009
From: peter at (Peter Moody)
Date: Thu, 27 Aug 2009 06:52:20 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

Howdy folks,

the reference code has been updated per your comments; specifically,
there are no more IP/IPv4/IPv6 factory functions, it's all IPAddress()
and IPNetwork constructors.

I've submitted a patch to the PEP with updated examples and a lengthy
description of the class inheritance and the benefits from that design
choice. hopefully that should go live soon.

If there are any more suggestions on the PEP or the code, please let me know.


On Tue, Aug 18, 2009 at 1:00 PM, Peter Moody<peter at> wrote:
> Howdy folks,
> I have a first draft of a PEP for including an IP address manipulation
> library in the python stdlib. It seems like there are a lot of really
> smart folks with some, ahem, strong ideas about what an IP address
> module should and shouldn't be so I wanted to solicit your input on
> this pep.
> the pep can be found here:
> the code can be found here:
> Please let me know if you have any comments (some already coming :)
> Cheers,
> /peter

From peter at  Thu Aug 27 16:15:33 2009
From: peter at (Peter Moody)
Date: Thu, 27 Aug 2009 07:15:33 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Wed, Aug 26, 2009 at 4:48 PM, DrKJam<drkjam at> wrote:
> I've started a very basic (work in progress) entry on the netaddr wiki to
> track various aspects of this discussion that might not be in a format
> suitable for publishing to the list or are too lengthy. It will also allow
> my ascii art diagrams to render correctly ;-)
> I will be updating it as free time becomes available to me over the coming
> days. Feel free to make comments on the wiki page itself if you want me to
> make any changes. Duncan McGreggor should be able to make changes if I am
> not available for whatever reason :-
> If anyone has suggestions for a better place to put this, please shout (but
> not too loudly please Peter M. ;-)

you know, Dave, I'm actually pretty tired of your snide remarks. Your
passive aggressive emails, your inability to remember when or be
honest about why you unsubbed me from the netaddr list, etc. are not
the actions of someone seeking serious debate about the best possible
library for python.

To answer your 'where' question, is pretty clear on where peps
should be submitted:  specifically, "We try to build consensus around
a PEP, but if that's not possible, you can always submit a competing

I'm not personally looking forward to waiting around for more of your
free time to become available before you can get to updating your wiki
(which, btw, seems to be both out of date and otherwise incorrect wrt
ipaddr and shockingly under-representative of the complexity of
netaddr).  Hopefully the code you've written is currently in a more
complete state than the wiki and can be judged as it is.


> Thanks,
> Dave M.
> PS - Can't wait for Google Wave which would make this kind of thing so much
> easier ;-)
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at
> Unsubscribe:

From barry at  Thu Aug 27 16:17:27 2009
From: barry at (Barry Warsaw)
Date: Thu, 27 Aug 2009 10:17:27 -0400
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <20090827130857.7475.1053558531.divmod.xquotient.8@localhost.localdomain>
References: <>
Message-ID: <>

On Aug 27, 2009, at 9:08 AM, exarkun at wrote:

> This is what I meant when I said what I said about correct code.  If  
> you're happy to have encoding errors and corrupt data, then I guess  
> you're happy to have a function like setdefaultencoding.

Whatever happened to "we're all adults here"[1]?  I have no problem  
with making it difficult but possible to write buggy but practical  
code.  Software engineering is a messy business.


[1] That may not be literally true any more, but still :)


From fuzzyman at  Thu Aug 27 16:28:45 2009
From: fuzzyman at (Michael Foord)
Date: Thu, 27 Aug 2009 15:28:45 +0100
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <>
Message-ID: <>

Congratulations and thank you - this is *great* news.


Joe Amenta wrote:
> Hello all,
> I have released the first alpha version of 3to2 after finishing it for 
> my Google Summer of Code 2009(tm) project.  You can get the tarball 
> for this release at 
> This requires python 2.7, because it requires a newer version of 2to3 
> than what comes with 2.6.
> Release notes are in the RELEASE file.  Development happens at 
>, and the source code for this 
> release lives at 
> <>.
> Report bugs at, please.
> Additional notes and comments can (for now) be found at 
> --Joe


From guido at  Thu Aug 27 18:50:38 2009
From: guido at (Guido van Rossum)
Date: Thu, 27 Aug 2009 09:50:38 -0700
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/27 Barry Warsaw <barry at>:
> On Aug 27, 2009, at 9:08 AM, exarkun at wrote:
>> This is what I meant when I said what I said about correct code. If you're happy to have encoding errors and corrupt data, then I guess you're happy to have a function like setdefaultencoding.
> Whatever happened to "we're all adults here"[1]? I have no problem with making it difficult but possible to write buggy but practical code. Software engineering is a messy business.

Being adults about it also means knowing when to give up. Chris, please stop
arguing about this. There are plenty of techniques you can use to get
what you want without changing Python, for example virtualenv, which
allows you to create a custom Python environment for each project. Or
you could switch to Python 3.1, whose different approach to
distinguishing between encoded and decoded strings means that you won't
have to worry about the default encoding quite as much (and you are
free to change the default *filesystem* encoding in Py3k). Or you
could invoke python -S, which skips and, so
you are free to mess up any way you want.

The fundamental reason the designers of Python's 2.x standard library
don't want you to be able to set the default encoding in your app, is
that the standard library is written with the assumption that the
default encoding is fixed, and no guarantees about the correct
workings of the standard library can be made when you change it. There
are no tests for this situation. Nobody knows what will fail when. And
you (or worse, your users) *will* come back to us with complaints if
the standard library suddenly starts doing things you didn't expect.

--Guido van Rossum (home page:

From drkjam at  Thu Aug 27 19:24:35 2009
From: drkjam at (David Moss)
Date: Thu, 27 Aug 2009 18:24:35 +0100
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>


I would like to apologise if I have caused you any offense. Please can  
we put the animosity behind us and stick to pulling together the best  
IP library possible as part of this PEP?


Dave M.

From peter at  Thu Aug 27 19:37:14 2009
From: peter at (Peter Moody)
Date: Thu, 27 Aug 2009 10:37:14 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Thu, Aug 27, 2009 at 10:24 AM, David Moss<drkjam at> wrote:
> Peter,
> I would like to apologise if I have caused you any offense.

Thanks. Accepted.

> Please can we
> put the animosity behind us and stick to pulling together the best IP
> library possible as part of this PEP?

pep-3144 should hopefully soon be updated on with
this past week's suggestions (including a discussion on the ipaddr
class design).  The updated ipaddr reference code should also still be
available for 'svn co' at


> Regards,
> Dave M.

From peter at  Thu Aug 27 19:39:50 2009
From: peter at (Peter Moody)
Date: Thu, 27 Aug 2009 10:39:50 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

On Thu, Aug 27, 2009 at 10:37 AM, Peter Moody<peter at> wrote:
> On Thu, Aug 27, 2009 at 10:24 AM, David Moss<drkjam at> wrote:
>> Peter,
>> I would like to apologise if I have caused you any offense.
> Thanks. Accepted.
>> Please can we
>> put the animosity behind us and stick to pulling together the best IP
>> library possible as part of this PEP?
> pep-3144 should hopefully soon be updated on with
> this past week's suggestions (including a discussion on the ipaddr
> class design). The updated ipaddr reference code should also still be
> available for 'svn co' at

er, make that

https seems to ask for a password.

> Cheers,
> /peter
>> Regards,
>> Dave M.

From fwierzbicki at  Thu Aug 27 20:52:28 2009
From: fwierzbicki at (Frank Wierzbicki)
Date: Thu, 27 Aug 2009 14:52:28 -0400
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Aug 26, 2009 at 3:29 PM, Joe Amenta<amentajo at> wrote:
> Hello all,
> I have released the first alpha version of 3to2 after finishing it for my
> Google Summer of Code 2009(tm) project.

Wow, congratulations!


From benjamin at  Thu Aug 27 22:08:25 2009
From: benjamin at (Benjamin Peterson)
Date: Thu, 27 Aug 2009 15:08:25 -0500
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/26 Joe Amenta <amentajo at>:
> Hello all,
> I have released the first alpha version of 3to2 after finishing it for my
> Google Summer of Code 2009(tm) project. You can get the tarball for this
> release at
> This requires python 2.7, because it requires a newer version of 2to3 than
> what comes with 2.6.

Great work and congratulations on your first release!

Have you posted this to python-list and python-announce-list, too?


From brett at  Thu Aug 27 23:12:06 2009
From: brett at (Brett Cannon)
Date: Thu, 27 Aug 2009 14:12:06 -0700
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <> 
Message-ID: <>

What are the plans to merge this into Python's repository so we can
all help out on this?

On Thu, Aug 27, 2009 at 13:08, Benjamin Peterson<benjamin at> wrote:
> 2009/8/26 Joe Amenta <amentajo at>:
>> Hello all,
>> I have released the first alpha version of 3to2 after finishing it for my
>> Google Summer of Code 2009(tm) project. You can get the tarball for this
>> release at
>> This requires python 2.7, because it requires a newer version of 2to3 than
>> what comes with 2.6.
> Great work and congratulations on your first release!
> Have you posted this to python-list and python-announce-list, too?
> --
> Regards,
> Benjamin

From benjamin at  Thu Aug 27 23:47:35 2009
From: benjamin at (Benjamin Peterson)
Date: Thu, 27 Aug 2009 16:47:35 -0500
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/27 Brett Cannon <brett at>:
> What are the plans to merge this into Python's repository so we can
> all help out on this?

None at the moment. I think the community needs to show its interest
in it and Joe his willingness to maintain it in the future in order
for it to qualify for addition to the stdlib.

I don't see why having it merged into Python's repo is a requirement
for contribution, since he's using Mercurial. :)


From mcguire at  Thu Aug 27 23:15:16 2009
From: mcguire at (Jake McGuire)
Date: Thu, 27 Aug 2009 14:15:16 -0700
Subject: [Python-Dev] deprecated methods on array objects
Message-ID: <>

The python documentation says that the read() and write() methods on array
objects have been deprecated since 1.5.1.  I assume this is because their
semantics are almost the exact opposite of read() and write() on a file-like
object; reads data from a file into the array and array.write()
writes data from the array to a file.
This causes fatal confusion in code that checks for the existence of read()
and write() to determine whether an object is file-like.  Code such as

What is the timeline for removing these methods from array?  It has been 11
years now.
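
The kind of duck-typing check that gets confused can be sketched like this.
The helper name looks_file_like() is hypothetical. The example runs on
current Python 3 releases, where array objects no longer have read() and
write() and offer fromfile() and tofile() instead, so the check gives the
expected answer there.

```python
import array
import io


def looks_file_like(obj):
    """Treat anything with callable read() and write() as a file."""
    return all(callable(getattr(obj, name, None))
               for name in ("read", "write"))


print(looks_file_like(io.BytesIO()))      # a real file-like object
print(looks_file_like(array.array("b")))  # an array is not a file
```

While the deprecated methods still existed, the same check would have
accepted an array and then called methods with very different semantics.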


From jnoller at  Fri Aug 28 00:39:50 2009
From: jnoller at (Jesse Noller)
Date: Thu, 27 Aug 2009 18:39:50 -0400
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Aug 27, 2009 at 5:47 PM, Benjamin Peterson<benjamin at> wrote:
> 2009/8/27 Brett Cannon <brett at>:
>> What are the plans to merge this into Python's repository so we can
>> all help out on this?
> None at the moment. I think the community needs to show its interest
> in it and Joe his willingness to maintain it in the future in order
> for it to qualify for addition to the stdlib.

Is that how 2to3 got in? If I remember correctly, this was a huge
request from the language summit - and by huge, I mean really, really

In fact, I think you were the guy who was heading the SoC project, right? ;)

From brett at  Fri Aug 28 00:48:28 2009
From: brett at (Brett Cannon)
Date: Thu, 27 Aug 2009 15:48:28 -0700
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <> 
Message-ID: <>

On Thu, Aug 27, 2009 at 14:47, Benjamin Peterson<benjamin at> wrote:
> 2009/8/27 Brett Cannon <brett at>:
>> What are the plans to merge this into Python's repository so we can
>> all help out on this?
> None at the moment. I think the community needs to show its interest
> in it and Joe his willingness to maintain it in the future in order
> for it to qualify for addition to the stdlib.

As Jesse said, at the language summit there was serious interest.

> I don't see why having it merged into Python's repo is a requirement
> for contribution, since he's using Mercurial. :)

Just because it's in Mercurial does not mean that Joe can't hold up
patches to apply to the main branch. If we all start forking it to fix
things and Joe ends up falling behind because of school, life, etc.
this could kill the project. But I am willing to wait for now in hopes
that doesn't happen.


From drkjam at  Fri Aug 28 01:03:49 2009
From: drkjam at (DrKJam)
Date: Fri, 28 Aug 2009 00:03:49 +0100
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
	<> <>
Message-ID: <>

I've posted several issues to the ipaddr bug tracker for consideration.
They shouldn't be major discussion topics so I'll leave them off the list.

The following are a few feature requests that might possibly require further
discussion here. If they are no-brainers that don't require any further
mulling over, can we have a few votes -1/+1 to get a feel for the importance
and I'll convert them into tickets.

1) Additional is_<network_class> boolean properties.


Returns True if this address/network is IPv4-compatible IPv6 e.g.
:: or ::, False otherwise.


Returns True if this address/network is IPv4-mapped IPv6 e.g.
::ffff: or ::ffff:, False otherwise.


Possible list of IPv4 networks and ranges to be used for this purpose :-

#    IANA Reserved but subject to allocation

#   Reserved for Future Use

#   Reserved multicast - -

Possible list of IPv6 networks to be used for this purpose :-



True if addresses or networks are within unicast ranges.

2) An IPvXRange class (representing an arbitrary start and end IP address).
This is in addition to the existing summarize_address_range() function.

There are use cases where an actual object representing an arbitrary range
rather than a basic tuple containing two IPvXAddress objects or a list of
IPvXNetwork objects is just simpler to handle, more appropriate and less
hassle overall. Willing to expand on the interface for this.
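
To give the range proposal something concrete to react to, here is a small
sketch of what such an object might look like. The class name IPRange and
its methods are hypothetical. It is built on the modern standard-library
ipaddress module and its summarize_address_range() function, not on the 2009
ipaddr code.

```python
import ipaddress


class IPRange:
    """An arbitrary, inclusive start-to-end range of one IP version."""

    def __init__(self, start, end):
        self.start = ipaddress.ip_address(start)
        self.end = ipaddress.ip_address(end)
        if self.start.version != self.end.version:
            raise TypeError("start and end must be the same IP version")
        if self.start > self.end:
            raise ValueError("start must not be greater than end")

    def __contains__(self, address):
        address = ipaddress.ip_address(address)
        return (address.version == self.start.version
                and self.start <= address <= self.end)

    @property
    def size(self):
        # A property rather than __len__, which overflows for large IPv6 ranges.
        return int(self.end) - int(self.start) + 1

    def cidrs(self):
        """Return the smallest list of networks covering the range exactly."""
        return list(ipaddress.summarize_address_range(self.start, self.end))


r = IPRange('192.0.2.0', '192.0.2.130')
print('192.0.2.5' in r, r.size)
print(r.cidrs())
```

The object keeps the two endpoints as the source of truth and derives the
CIDR list on demand, so a range that is not aligned to a prefix boundary is
still represented exactly.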

3) Support for IPv4-mapped/compatible IPv6 conversion functions.

It would be handy to have methods to convert backwards and forwards between
actual IPv4 addresses and networks to their IPv6 mapped or compatible

Basic examples :-

IPv6 -> IPv4

>>> IPv6Address('::ffff:').ipv4()

>>> IPv6Address('::').ipv4()

IPv4 -> IPv6

>>> IPv4Address('').ipv6()
IPv6Address(::ffff:') Prefer IPv4-compatible as the default (RFC

>>> IPv4Address('').ipv6(ipv4_mapped=True)

By the same token we should provide the same functionality for IP network
classes (with the necessary CIDR prefix conversions) :-

>>> IPv6Network('::ffff:').ipv4()

>>> IPv4Network('').ipv6()

>>> IPv4Network('').ipv6(ipv4_mapped=True)

If address ranges overflow boundaries the necessary exceptions can be
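
A hedged sketch of the IPv4-mapped half of this proposal, using the modern
standard-library ipaddress module rather than the ipv4()/ipv6() methods
suggested above (which are a proposal, not an existing API). The helper
names are hypothetical. The ipv4_mapped property used below does exist on
ipaddress.IPv6Address.

```python
import ipaddress

# IPv4-mapped addresses have the form ::ffff:a.b.c.d
_MAPPED_PREFIX = 0xFFFF << 32


def to_ipv4_mapped(v4):
    """Convert an IPv4 address to its IPv4-mapped IPv6 form."""
    v4 = ipaddress.IPv4Address(v4)
    return ipaddress.IPv6Address(_MAPPED_PREFIX | int(v4))


def from_ipv4_mapped(v6):
    """Recover the IPv4 address, or raise ValueError if it is not mapped."""
    v6 = ipaddress.IPv6Address(v6)
    v4 = v6.ipv4_mapped  # None unless the address is in ::ffff:0:0/96
    if v4 is None:
        raise ValueError("%s is not an IPv4-mapped address" % v6)
    return v4


mapped = to_ipv4_mapped('192.0.2.1')
print(mapped, '->', from_ipv4_mapped(mapped))
```

Raising on a non-mapped address is one way to handle input that falls
outside the convertible range. Returning None, as the stdlib property does,
is the other obvious choice.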

4) IP set based classes.

This is a big topic, so I'll save it for a subsequent post. What are the
general feelings about having something like this in ipaddr? It might be
deemed a little heavyweight but it is a really sweet option for the power
user. Combined with the new speed of collapse_address_list() this could be
handy and fast.

That's all for now,

Dave M.

2009/8/27 Peter Moody <peter at>

> On Thu, Aug 27, 2009 at 10:37 AM, Peter Moody<peter at> wrote:
> > On Thu, Aug 27, 2009 at 10:24 AM, David Moss<drkjam at> wrote:
> >> Peter,
> >>
> >> I would like to apologise if I have caused you any offense.
> >
> > Thanks. Accepted.
> >
> >> Please can we
> >> put the animosity behind us and stick to pulling together the best IP
> >> library possible as part of this PEP?
> >
> > pep-3144 should hopefully soon be updated on with
> > this past week's suggestions (including a discussion on the ipaddr
> > class design).  The updated ipaddr reference code should also still be
> > available for 'svn co' at
> >
> er, make that
> https seems to ask for a password.
> > Cheers,
> > /peter
> >
> >> Regards,
> >>
> >> Dave M.
> >>
> >

From benjamin at  Fri Aug 28 01:40:33 2009
From: benjamin at (Benjamin Peterson)
Date: Thu, 27 Aug 2009 18:40:33 -0500
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/27 Jesse Noller <jnoller at>:
> On Thu, Aug 27, 2009 at 5:47 PM, Benjamin Peterson<benjamin at> wrote:
>> 2009/8/27 Brett Cannon <brett at>:
>>> What are the plans to merge this into Python's repository so we can
>>> all help out on this?
>> None at the moment. I think the community needs to show its interest
>> in it and Joe his willingness to maintain it in the future in order
>> for it to qualify for addition to the stdlib.
> Is that how 2to3 got in? If I remember correctly, this was a huge
> request from the language summit - and by huge, I mean really, really
> big.

2to3 got in because it's part of our current recommended plan for
porting to Python 3.

Regardless of that, there's no hurry. There will be no major Python
releases for at least 6 months, and in that time, 3to2 can enjoy the
benefits of being separate from the core, frequent releases and
testing. People who are interested should look now; being "in the
core" isn't a requirement for that.

And anyway, I don't see the point of importing new stuff into SVN if
we're just going to move to hg.

> In fact, I think you were the guy who was heading the SoC project, right? ;)

No, just mentoring. :)


From martin at  Fri Aug 28 01:43:21 2009
From: martin at ("Martin v. Löwis")
Date: Fri, 28 Aug 2009 01:43:21 +0200
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

>> None at the moment. I think the community needs to show its interest
>> in it and Joe his willingness to maintain it in the future in order
>> for it to qualify for addition to the stdlib.
> Is that how 2to3 got in? If I remember correctly, this was a huge
> request from the language summit - and by huge, I mean really, really
> big.

Ok, so then it should be easy to generate some real interest out of
it, right? E.g. somebody actually running the tool, or perhaps even
a bug report?


From brett at  Fri Aug 28 01:57:51 2009
From: brett at (Brett Cannon)
Date: Thu, 27 Aug 2009 16:57:51 -0700
Subject: [Python-Dev] quick PEP 387 comments
Message-ID: <>

Is the PEP considering all non-private APIs public even if they are
not documented? If so we might want to be up front about that and say
so to make sure we are all very careful about making all non-essential
APIs private (assuming this PEP gets accepted).

And we might want to say that all code in 'test' sub-packages is not
subject to backwards compatibility unless documented. I have a ton of
support code in importlib.test that I do not want to have to maintain
for public consumption as it is meant solely for testing purposes
by me. If you read the PEP it would suggest that all modules in test
are subject to the PEP's compatibility policy which is obviously
absurd.


From arcriley at  Fri Aug 28 02:06:30 2009
From: arcriley at (Arc Riley)
Date: Thu, 27 Aug 2009 20:06:30 -0400
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <>
Message-ID: <>

How about moving it to a new repository on  Give it more of
an "official" feel without the burden of being in the cpython tree?

On Thu, Aug 27, 2009 at 7:43 PM, "Martin v. Löwis" <martin at> wrote:

> Ok, so then it should be easy to generate some real interest out of
> it, right? E.g. somebody actually running the tool, or perhaps even
> a bug report?

From martin at  Fri Aug 28 02:17:54 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 28 Aug 2009 02:17:54 +0200
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>	<>	<>
	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

> IPv6Address.is_ipv4_compat
> IPv6Network.is_ipv4_compat
> Returns True if this address/network is IPv4-compatible IPv6 e.g.
> :: or :: <>, False otherwise.

-1. These addresses are deprecated.

> IPv6Address.is_ipv4_mapped
> IPv6Network.is_ipv4_mapped
> Returns True if this address/network is IPv4-mapped IPv6 e.g.
> ::ffff: or ::ffff: <>, False
> otherwise.

Perhaps there could be a v4_mapped function, that returned either None
or a V4 address?
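
For comparison, the ipaddress module that later grew out of this PEP and
ships with Python 3.3+ exposes roughly this shape as a property rather
than a function. A minimal sketch; the addresses are documentation
examples chosen for illustration, not taken from this thread:

```python
import ipaddress

# Hypothetical example addresses (RFC 5737 / RFC 3849 documentation ranges).
mapped = ipaddress.IPv6Address("::ffff:192.0.2.1")
plain = ipaddress.IPv6Address("2001:db8::1")

# ipv4_mapped is the embedded IPv4Address for ::ffff:a.b.c.d, otherwise None.
print(mapped.ipv4_mapped)
print(plain.ipv4_mapped)
```

Returning None instead of raising lets callers write
"if addr.ipv4_mapped is not None" without a separate is_ipv4_mapped test.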

> IPvXAddress.is_reserved
> IPvXNetwork.is_reserved
> Possible list of IPv4 networks and ranges to be used for this purpose :-
> #    IANA Reserved but subject to allocation
> <>

-1. These are merely unallocated.

> <>

-1. RFC 3330 says they are subject to allocation.

> <>

-1. It's allocated to LACNIC, for further allocation.

> <>

-1. It's allocated to ARIN; RFC 3330 says it's free for

> <>

-1. Not sure what the status is here, but RFC 3330
also lists it as available for allocation; IANA lists as UNALLOCATED.

> #   Reserved for Future Use
> <>     


> #   Reserved multicast
> -
> -

-0. What makes them different from, say,

> Possible list of IPv6 networks to be used for this purpose :-
> FF00::/12

-1. These are multicast addresses, some in use.

> ::/8
> 0100::/8
> 0200::/7
> 0400::/6
> 0800::/5
> 1000::/4
> 4000::/3
> 6000::/3
> 8000::/3
> A000::/3
> C000::/3
> E000::/4
> F000::/5
> F800::/6
> FE00::/9


> IPvXAddress.is_unicast    
> IPvXNetwork.is_unicast
> True if addresses or networks are within unicast ranges.

-0. What about anycast addresses?

> 2) An IPvXRange class (representing an arbitrary start and end IP
> address). This is in addition to the existing summarize_address_range()
> function.
> There are use cases where an actual object representing an arbitrary
> range rather than a basic tuple containing two IPvXAddress objects or a
> list of IPvXNetwork objects is just simple to handle, more appropriate
> and less hassle overall. Willing to expand on the interface for this.

-0. What's a use case where a tuple of two addresses wouldn't be just as
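
To make the two alternatives concrete, here is a small sketch using
today's stdlib ipaddress module (Python 3.3+), which is an assumption on
my part since the thread discusses the earlier ipaddr package. The
helper in_range and the example addresses are hypothetical:

```python
import ipaddress

first = ipaddress.IPv4Address("192.0.2.0")
last = ipaddress.IPv4Address("192.0.2.130")

# Alternative 1: a plain tuple of two addresses.
as_tuple = (first, last)

# Alternative 2: the list of networks that exactly covers the range.
as_networks = list(ipaddress.summarize_address_range(first, last))


def in_range(addr, bounds):
    """Membership test against a (first, last) tuple; hypothetical helper."""
    low, high = bounds
    return low <= addr <= high


print(as_networks)
print(in_range(ipaddress.IPv4Address("192.0.2.100"), as_tuple))
```

Membership against the tuple is a single chained comparison; a dedicated
range class would mostly add iteration, length and set-like operations.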

> Basic examples :-
> IPv4 -> IPv6
>>>> IPv6Address('::ffff:').ipv4()
> IPv4Address('')
>>>> IPv6Address('::').ipv4()
> IPv4Address('')

-1 in this form. See above for compatible and mapped addresses.
I could imagine a method trailing_ipv4, which would give an IPv4 address
when all bits between 64 and 95 are 0; this would also cover cases
where people put the IPv4 address into the IPv6 address using a regular
prefix. Of course, this is guess-work, so the method name has to make
it clear that it is guessing.
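
A rough sketch of what such a trailing_ipv4 could look like, written as
a free function on top of today's stdlib ipaddress module. Neither the
function nor its exact rule exists in any library; it assumes bits are
numbered from the most significant end, so bits 64 to 95 are the third
32-bit group and the last 32 bits carry the IPv4 address:

```python
import ipaddress


def trailing_ipv4(addr6):
    """Guess an embedded IPv4 address; return None if bits 64..95 are not all 0.

    Hypothetical helper, not part of ipaddr or ipaddress.
    """
    value = int(addr6)
    if (value >> 32) & 0xFFFFFFFF:
        return None
    return ipaddress.IPv4Address(value & 0xFFFFFFFF)


# Example addresses are illustrative only.
prefixed = ipaddress.IPv6Address("2001:db8:0:d::192.0.2.1")
mapped = ipaddress.IPv6Address("::ffff:192.0.2.1")

print(trailing_ipv4(prefixed))
print(trailing_ipv4(mapped))
```

Under this rule an IPv4-mapped address is rejected, because its ffff
marker sits in bits 80 to 95; that case would be left to a separate
mapped-address accessor.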

> IPv4 -> IPv6
>>>> IPv4Address('').ipv6()
> IPv6Address(::ffff:') Prefer IPv4-compatible as the default
> (RFC 4291)

-1. These are deprecated.

>>>> IPv4Address('').ipv6(ipv4_mapped=True)
> IPv6Address('::')

-0. Call it ipv4_mapped().

If there is to be an ipv6 method, it should take an IPv6Network,
to allow constructing things like 2001:888:2000:d::

> 4) IP set based classes.
> This is a big topic, so I'll save it for a subsequent post. What are the
> general feelings about having something like this in ipaddr? It might be
> deemed a little heavyweight but it is a really sweet option for the
> power user. Combined with the new speed of collapse_address_list() this
> could be handy and fast.

So when talking about it, please provide use cases.


From martin at  Fri Aug 28 02:20:06 2009
From: martin at (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Fri, 28 Aug 2009 02:20:06 +0200
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
Message-ID: <>

> How about moving it to a new repository on
> <>?  Give it more of an "official" feel without the
> burden of being in the cpython tree?

Unfortunately, this is not yet set up (i.e. you can't push to it).


From benjamin at  Fri Aug 28 02:49:49 2009
From: benjamin at (Benjamin Peterson)
Date: Thu, 27 Aug 2009 19:49:49 -0500
Subject: [Python-Dev] quick PEP 387 comments
In-Reply-To: <>
References: <>
Message-ID: <>

I should probably mark that PEP as abandoned or deferred, since for
various reasons, it seems like this is not what Python-dev feels is
needed [1].


2009/8/27 Brett Cannon <brett at>:
> Is the PEP considering all non-private APIs public even if they are
> not documented? If so we might want to be up front about that and say
> so to make sure we are all very careful about making all non-essential
> APIs private (assuming this PEP gets accepted).
> And we might want to say that all code in 'test' sub-packages are not
> subject to backwards compatibility unless documented. I have a ton of
> support code in importlib.test that I do not want to have to maintain
> for public consumption as they are meant solely for testing purposes
> by me. If you read the PEP it would suggest that all modules in test
> are subject to the PEP's compatibility policy which is obviously
> absurd.
> -Brett


From jnoller at  Fri Aug 28 03:00:57 2009
From: jnoller at (Jesse Noller)
Date: Thu, 27 Aug 2009 21:00:57 -0400
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Aug 27, 2009 at 7:43 PM, "Martin v. Löwis"<martin at> wrote:
>>> None at the moment. I think the community needs to show its interest
>>> in it and Joe his willingness to maintain it in the future in order
>>> for it to qualify for addition to the stdlib.
>> Is that how 2to3 got in? If I remember correctly, this was a huge
>> request from the language summit - and by huge, I mean really, really
>> big.
> Ok, so then it should be easy to generate some real interest out of
> it, right? E.g. somebody actually running the tool, or perhaps even
> a bug report?
> Regards,
> Martin

I'm sure we can get just as much interest in a 3 to 2 tool as we have
for python 3 itself.

Sorry, that was over-catty of me, but it's true. Right now 95% of the
known world is living in 2.x, and I would argue that some of this is
due to an unclear path of maintaining compatibility with 2.x code if
they should jump to python 3. That was part of the driving force
behind getting something like this done, at least at the language
summit and last year's PyCon.

Having an "official" - where "official" could mean linking to it in
the Python 3 docs, announcing something on a python blog, something -
anything to encourage/get more people to use it. I'm not disagreeing
that just plopping it into core may not be the Right Thing To Do, but
making it "semi-official" and "recommended" carries a lot of weight.

I know the mercurial migration is happening Really Soon Now, but even
hosting it in our svn/piggy backing our tracker and putting out a
little post pointing out 2to3 and 3to2 exist as migration
tools could help.

Maybe just a post about the GSOC project, what it does and where to
get it, and "please try it out so we can smash the bugs" on the front



From exarkun at  Fri Aug 28 04:10:36 2009
From: exarkun at (exarkun at
Date: Fri, 28 Aug 2009 02:10:36 -0000
Subject: [Python-Dev] quick PEP 387 comments
In-Reply-To: <>
References: <>
Message-ID: <20090828021036.7475.530690913.divmod.xquotient.62@localhost.localdomain>

On 12:49 am, benjamin at wrote:
>I should probably mark that PEP as abandoned or deferred, since for
>various reasons, it seems like this is not what Python-dev feels is
>needed [1].

Re-reading that thread, I see some good discussion about how to improve 
the PEP, a little bit of misunderstanding about what the PEP is about, 
and not a lot of strong opposition.  Maybe it's worth picking it up 


From amentajo at  Fri Aug 28 06:32:40 2009
From: amentajo at (Joe Amenta)
Date: Fri, 28 Aug 2009 00:32:40 -0400
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <> 
Message-ID: <>

On Thu, Aug 27, 2009 at 4:08 PM, Benjamin Peterson <benjamin at>wrote:

> 2009/8/26 Joe Amenta <amentajo at>:
> > Hello all,
> >
> > I have released the first alpha version of 3to2 after finishing it for my
> > Google Summer of Code 2009(tm) project.  You can get the tarball for this
> > release at
> >
> > This requires python 2.7, because it requires a newer version of 2to3 than
> > what comes with 2.6.
> Great work and congratulations on your first release!

Thank you, I couldn't have done it without you!

> Have you posted this to python-list and python-announce-list, too?

I am going to post it to python-announce-list and python-list if I find that
I can keep up with the amount of traffic I am getting from python-dev.

> --
> Regards,
> Benjamin


From amentajo at  Fri Aug 28 06:58:03 2009
From: amentajo at (Joe Amenta)
Date: Fri, 28 Aug 2009 00:58:03 -0400
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <> 
Message-ID: <>

On Thu, Aug 27, 2009 at 6:48 PM, Brett Cannon <brett at> wrote:

> On Thu, Aug 27, 2009 at 14:47, Benjamin Peterson<benjamin at>
> wrote:
> > 2009/8/27 Brett Cannon <brett at>:
> >> What are the plans to merge this into Python's repository so we can
> >> all help out on this?
> >
> > None at the moment. I think the community needs to show its interest
> > in it and Joe his willingness to maintain it in the future in order
> > for it to qualify for addition to the stdlib.
> >
> As Jesse said, at the language summit there was serious interest.
> > I don't see why having it merged into Python's repo is a requirement
> > for contribution, since he's using Mercurial. :)
> Just because it's in Mercurial does not mean that Joe can't hold up
> patches to apply to the main branch. If we all start forking it to fix
> things and Joe ends up falling behind because of school, life, etc.
> this could kill the project. But I am willing to wait for now in hopes
> that doesn't happen.
> -Brett

My intent is to be very responsive to the administrative end of maintaining
the project, i.e., applying patches to the main branch, bug triage, etc.,
while contributing new code in my (limited) free time as well.  It would
kill me to see the project die due to any of my shortcomings, and I resolve
that if I start falling behind on maintaining 3to2, I will call for someone
else to come forward to replace me.

I will always respond to e-mails sent to amentajo at, so please
contact me if there are any concerns, early if possible.


From shashank.sunny.singh at  Fri Aug 28 10:08:23 2009
From: shashank.sunny.singh at (Shashank Singh)
Date: Fri, 28 Aug 2009 13:38:23 +0530
Subject: [Python-Dev] Adding a C Module to python source distribution
Message-ID: <>

Hi All,

I am trying to add a module written in c to python source on Win32 using
VC++ 9 Pro.
I went through the available documentation but there doesn't seem to be any
clear instruction on how to do that.

Basically I opened pcbuild.sln in vc++, added the c file (xxx.c) to Modules/
Building the solution after that works fine: xxx.c is compiled (no errors,
no warnings) and
the python executable gets created. But I am not able to import the module
in xxx.c using that executable.

Do I need to register this module some place else too ( ?

Any hints and pointers will be appreciated :)


From ncoghlan at  Fri Aug 28 11:31:16 2009
From: ncoghlan at (Nick Coghlan)
Date: Fri, 28 Aug 2009 19:31:16 +1000
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
 Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

Peter Moody wrote:
> If there are any more suggestions on the PEP or the code, please let me know.

I noticed the new paragraphs on the IPv4 vs IPv6 types not being
comparable - is there a canonical ordering for mixed address lists
defined anywhere (e.g. an RFC)?

If there is, then it should be possible to implement that on BaseIP and
BaseNet so that comparisons work as canonically defined. If there isn't,
then that should be mentioned in the PEP as the reason why the PEP
deliberately isn't trying to invent a convention.
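
For reference, this is how the ipaddress module in current Python
(3.3+) behaves; whether that matches what the PEP said at the time of
this message is not something this sketch assumes. Cross-version
comparison raises TypeError, and sorting a mixed list needs an explicit
key. The version-first order it produces is that module's convention.
The example addresses are illustrative:

```python
import ipaddress

addrs = [
    ipaddress.ip_address("2001:db8::1"),
    ipaddress.ip_address("192.0.2.1"),
    ipaddress.ip_address("::1"),
    ipaddress.ip_address("10.0.0.1"),
]

# Plain sorting fails because IPv4 and IPv6 objects are not orderable.
try:
    sorted(addrs)
    comparable = True
except TypeError:
    comparable = False

# get_mixed_type_key groups by IP version first, then by address value.
ordered = sorted(addrs, key=ipaddress.get_mixed_type_key)
print(comparable, [str(a) for a in ordered])
```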


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From lists at  Fri Aug 28 14:58:31 2009
From: lists at (Christian Heimes)
Date: Fri, 28 Aug 2009 14:58:31 +0200
Subject: [Python-Dev] Adding a C Module to python source distribution
In-Reply-To: <>
References: <>
Message-ID: <>

Shashank Singh wrote:
> Do I need to register this module some place else too ( ?
> Any hints and pointers will be appreciated :)

You have to add the module and its initializer to PC/config.c, too.


From status at  Fri Aug 28 18:07:26 2009
From: status at (Python tracker)
Date: Fri, 28 Aug 2009 18:07:26 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <>

ACTIVITY SUMMARY (08/21/09 - 08/28/09)
Python tracker at

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.

 2370 open (+23) / 16247 closed (+15) / 18617 total (+38)

Open issues with patches:   935

Average duration of open issues: 657 days.
Median duration of open issues: 410 days.

Open Issues Breakdown
   open  2339 (+23)
pending    30 ( +0)

Issues Created Or Reopened (39)

add conversion table to time module docs                         08/22/09    reopened cvrebert                      

implement new setuid-related calls and a standard way to drop al 08/21/09
CLOSED    created  solinym

is missing universal newline support                             08/22/09    created  ryles

patch to subprocess docs to better explain Popen's 'args' argume 08/22/09    created  cvrebert                      

Class calling                                                    08/22/09    created  onlyme                        

strange string representation of  xrange in print                08/22/09
CLOSED    created  mintaka                       

Crash on mac os x leopard in mimetypes.guess_type (or PyObject_M 08/22/09    created  santagada                     

os.path.join should call os.path.normpath on result              08/22/09
CLOSED    created  michael.foord                 

math.log, log10 inconsistency                                    08/23/09
CLOSED    created  steve21                       

Cannot modify dictionaries inside dictionaries using Managers fr 08/23/09    created  carlosdf                      

Python as zip package                                            08/23/09    created  Joe                           

asyncore file_wrapper leaking file descriptors?                  08/23/09    created  keysers                       

in NameError: global name 'HTTPSConnection' is not 08/23/09    created  ivank                         

PDF download links of docs for 3.1.1 are broken                  08/24/09
CLOSED    created  Radiant                       

documentation/implementation error                               08/24/09    created  steve21                       

Missing alias  utf-8 in  Standard Encodings list.                08/24/09
CLOSED    created  mintaka                       

subprocess issue on Win 7 x64                                    08/24/09    created  tesla                         

socket.shudown documentation: on some platforms, closing one hal 08/24/09    created  nicdumz                       

readme: correct url                   08/24/09
CLOSED    created  nicdumz                       

A few tests are failing when zlib is not supported               08/24/09
CLOSED    created  nicdumz                       

Python 2.6 tutorial still recommends using Exception.message att 08/24/09    created  cito                          

False positives given through bisect module (binary search)      08/24/09
CLOSED    created  kaashif                       

NotImplementedError on for statement                             08/24/09
CLOSED    created  sbq                           

startswith error message is incomplete                           08/24/09    created  srid                          

even exponentiation of negative numbers                          08/25/09
CLOSED    created  benlbroussard                 

Scoping of variables in closures                                 08/25/09
CLOSED    created  bitfort                       

Add a builtin method to 'int' for base/radix conversion          08/25/09    created  ubershmekel                   

byte/unicode pickle incompatibilities between python2 and python 08/26/09    created  RonnyPfannschmidt             

IncompleteRead / BadStatus when parsing      08/26/09    created  kmoon                         

readline and zero based indexing                                 08/26/09    created  purpleidea                    

thread docs contain an incorrect link in a reference to the 'exi 08/27/09
CLOSED    created  r.david.murray

on Win32 does not force binary mode                              08/27/09    created  EnigmaCurry

ftplib storelines does not honor strings returned in fp.readline 08/27/09    created  aymanhs                       

httplib and array do not play together well                      08/27/09    created  jakemcguire                   

httplib read status memory usage                                 08/28/09    created  m.sucajtys                    

Distutils-based installer does not detect 64bit versions of Pyth 08/28/09    created  erluk

div_nearest ==> _div_nearest                                     08/28/09
CLOSED    created  skrah
       26backport

incorrect results in NaN comparisons                             08/28/09
CLOSED    created  skrah

minor issues && usability                                        08/28/09    created  skrah

Issues Now Closed (26)

Extension module build fails for MinGW: missing vcvarsall.bat     483 days    hagen                         

Allow buffering for HTTPResponse                                    6 days    gregory.p.smith               
       patch, patch                                                            

let unittest.assertRaises() return the exception object caught     75 days    michael.foord                 
       patch, patch, easy, needs review                                        

Add "path" to the xmrlpc dispatcher method                         22 days    krisvale                      

buffer c-api: memoryview object documentation                      20 days    benjamin.peterson             

Desire documentation link to user contribution wiki (   18 days    keenethery                    

Place the term "delete" within the documentation for os.remove()   14 days    mcow                          

ValueError in SocketServer.UDPServer Example             7 days    georg.brandl                  

Inconsistency in Documentation: "Name Spaces" vs "Namespaces"       6 days    georg.brandl                  

dict.fromkeys() should not cross reference mutable value by defa    6 days    georg.brandl                  

implement new setuid-related calls and a standard way to drop al    1 days    exarkun                       

strange string representation of  xrange in print                   0 days    rhettinger                    

os.path.join should call os.path.normpath on result                 0 days    michael.foord

math.log, log10 inconsistency                                       1 days    tim_one                       

PDF download links of docs for 3.1.1 are broken                     1 days    benjamin.peterson             

Missing alias  utf-8 in  Standard Encodings list.                   0 days    georg.brandl                  

readme: correct url                      0 days    georg.brandl                  

A few tests are failing when zlib is not supported                  0 days    ezio.melotti                  

False positives given through bisect module (binary search)         0 days    r.david.murray                

NotImplementedError on for statement                                0 days    r.david.murray                

even exponentiation of negative numbers                             0 days    marketdickinson               

Scoping of variables in closures                                    0 days    benjamin.peterson             

thread docs contain an incorrect link in a reference to the 'exi    1 days    georg.brandl

div_nearest ==> _div_nearest                                        0 days    marketdickinson
       26backport

incorrect results in NaN comparisons                                0 days    marketdickinson

Wide-character curses                                            2357 days  gpolo                         

Top Issues Most Discussed (10)

  9 byte/unicode pickle incompatibilities between python2 and pytho    2 days

  9 is missing universal newline support     6 days

  9 implement new setuid-related calls and a standard way to drop a    1 days

  7 Patch: new method get_wch for ncurses bindings: accept wide cha    7 days

  6 Support for encrypted zipfiles when interpreting zipfile as scr    7 days

  6 (curses) addstr() takes str in Python 3                            8 days

  5 expose setresuid                                                  42 days

  4 Add a builtin method to 'int' for base/radix conversion            3 days

  4 Python as zip package                                              5 days

  4 math.log, log10 inconsistency                                      1 days

From cgw at  Fri Aug 28 19:40:07 2009
From: cgw at (Charles Waldman)
Date: Fri, 28 Aug 2009 12:40:07 -0500
Subject: [Python-Dev] consider for inclusion in std.
Message-ID: <>

Here's a module "timed_command" that I wrote a while ago; it is generally
useful and might be a good addition to the standard library.  It is
like commands.getstatusoutput but lets you run a command with an
optional timeout.  Useful for systems programming where a sub-process 
might hang.  Only works on POSIX, but could perhaps be modified to run
on other platforms (I don't have the knowledge of Windows to do this).
If you would like to add this to the library, I relinquish all rights
to it.
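
The module itself is not reproduced in this archive. As a rough
illustration of the idea only, here is a sketch built on the timeout
support that subprocess gained much later (Python 3.3+; text= needs
3.7+). The function name is borrowed from the message, but the
signature, the argument-list interface and the return convention are
assumptions, not the original API:

```python
import subprocess
import sys


def timed_command(args, timeout=None):
    """Run args and return (status, output); status is None on timeout.

    Illustrative sketch, not the module described above.
    """
    proc = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    try:
        output, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Kill the child, then collect whatever it wrote before dying.
        proc.kill()
        output, _ = proc.communicate()
        return None, output
    return proc.returncode, output


status, output = timed_command([sys.executable, "-c", "print('done')"])
hung_status, _ = timed_command(
    [sys.executable, "-c", "import time; time.sleep(30)"], timeout=0.5
)
print(status, output.strip(), hung_status)
```

Taking an argument list instead of a shell string means kill() reaches
the actual child, not an intermediate shell that may leave it running.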

Here's a link to the source repository:

From brett at  Sat Aug 29 00:24:45 2009
From: brett at (Brett Cannon)
Date: Fri, 28 Aug 2009 15:24:45 -0700
Subject: [Python-Dev] deprecated methods on array objects
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Aug 27, 2009 at 14:15, Jake McGuire<mcguire at> wrote:
> The python documentation says that the read() and write() methods on array
> objects have been deprecated since 1.5.1.  I assume this is because their
> semantics are almost the exact opposite of read() and write() on a file-like
> object; array.read() reads data from a file into the array and array.write()
> writes data from the array to a file.
> This causes fatal confusion in code that checks for the existence of read()
> and write() to determine whether an object is file-like.  Code such as
> httplib.
> What is the timeline for removing these methods from array?  It has been 11
> years now.

They are gone from Python 3.x, so they have been removed where it
counts. Bothering with 2.x is not worth it at this point.
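
A short sketch of the Python 3 situation, where fromfile() and tofile()
do the job and a duck-typing check for read()/write() no longer mistakes
an array for a file. The variable names are illustrative:

```python
import array
import tempfile

a = array.array("i", [1, 2, 3])

# In Python 3 arrays have no read()/write(), so this check is False.
looks_like_file = hasattr(a, "read") and hasattr(a, "write")

with tempfile.TemporaryFile() as f:
    a.tofile(f)        # array -> file
    f.seek(0)
    b = array.array("i")
    b.fromfile(f, 3)   # file -> array, reading 3 items

print(looks_like_file, b == a)
```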

From brett at  Sat Aug 29 00:27:14 2009
From: brett at (Brett Cannon)
Date: Fri, 28 Aug 2009 15:27:14 -0700
Subject: [Python-Dev] consider for inclusion in std.
In-Reply-To: <>
References: <>
Message-ID: <>

To get a module included in the standard library you need to have it
out for about a year, have the community consider it best-of-breed,
and write a PEP passed through python-ideas.

On Fri, Aug 28, 2009 at 10:40, Charles Waldman<cgw at> wrote:
> Here's a module "timed_command" I wrote a while ago and is generally
> useful and might be a good addition to the standard library.  It is
> like commands.getstatusoutput but lets you run a command with an
> optional timeout.  Useful for systems programming where a sub-process
> might hang.  Only works on POSIX, but could perhaps be modified to run
> on other platforms (I don't have the knowledge of Windows to do this).
> If you would like to add this to the library, I relinquish all rights
> to it.
> Here's a link to the source repository:

From scav at  Fri Aug 28 21:27:27 2009
From: scav at (Peter Harris)
Date: Fri, 28 Aug 2009 20:27:27 +0100
Subject: [Python-Dev] functools.compose to chain functions together
In-Reply-To: <>
References: <>
Message-ID: <>

I am personally indifferent to this, even though I had in mind in 
PEP309, that compose would probably end up in there too.

On the one hand, some people will keep on expecting it to be there. The 
ones that care about it will not be confused: they'll expect 
compose(f,g)(x) to be f(g(x)) as is proposed.  It can't do any 
significant harm.

On the other hand it's not likely to be used even as often as partial, 
which I always wanted mostly to make anonymous callables for Tkinter, 
not because of any ivory-tower functional programming bias.  And the 
most common use case of compose() is covered by a one-liner that really 
doesn't need to be in the standard library.
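
For readers who have not met it, the one-liner in question looks
something like this. It is a sketch of the two-function case only, with
compose(f, g)(x) meaning f(g(x)) as described above; the example
functions are arbitrary:

```python
def compose(f, g):
    """Return a callable computing f(g(*args, **kwargs))."""
    return lambda *args, **kwargs: f(g(*args, **kwargs))


# g runs first, then f: abs(-3) is 3, then str gives "3".
abs_then_str = compose(str, abs)
print(abs_then_str(-3))
```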

I'll say +0, with the + because if new Python programmers run across 
compose() in the docs, and aren't familiar with the idea, they can 
follow a link from there to Wikipedia, and maybe it will give them an 
idea we haven't thought of for something cool to do with it.

Peter Harris

From skippy.hammond at  Sun Aug 30 04:44:10 2009
From: skippy.hammond at (Mark Hammond)
Date: Sun, 30 Aug 2009 12:44:10 +1000
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>
Message-ID: <>

On 18/08/2009 6:20 PM, Dirkjan Ochtman wrote:
> On Tue, Aug 18, 2009 at 10:12, "Martin v. Löwis"<martin at>  wrote:
>> In this thread, I'd like to collect things that ought to be done
>> but where Dirkjan has indicated that he would prefer if somebody else
>> did it.
> I think the most important item here is currently the win32text stuff.
> Mark Hammond said he would work on this; Mark, when do you have time
> for this? Then I could set apart some time for it as well.
> Have stalled a bit on the fine-grained branch processing, hope to move
> that forward tomorrow.

I'm afraid I've also stalled on this task and I need some help to get 
things moving again.

1) I've stalled on the 'none:' patch I promised to resurrect.  While 
doing this, I re-discovered that the tests for win32text appear to check 
win32 line endings are used by win32text on *all* platforms, not just 
Windows.

I asked for advice from Dirkjan who referred me to the mercurial-devel 
list, but my request of slightly over a week ago remains unanswered - 
maybe I just need to be more patient...

Further, Martin's comments in this thread indicate he believes a new 
extension will be necessary rather than 'fixing' win32text.  If this is 
the direction we take, it may mean the none: patch, which targets the 
implementation of win32text, is no longer necessary anyway.

2) These same recent discussions about an entirely new extension and no 
clear indication of our expectations regarding what the tool actually 
enforces mean I'm not sure how to make a start on the more general 
issue.  I also fear that should I try to make a start on this, it will 
still wind up fruitless - eg, it seems any work targeting win32text 
specifically would have been wasted, so I'd really like to see a 
consensus on what needs to be done before attempting to start it.

So in short, I'm still offering to work on this issue - I just don't 
know what that currently entails.



From shashank.sunny.singh at  Sun Aug 30 07:45:20 2009
From: shashank.sunny.singh at (Shashank Singh)
Date: Sun, 30 Aug 2009 11:15:20 +0530
Subject: [Python-Dev] Fast Implementation for ZIP decryption
Message-ID: <>

Hi All,

I have implemented the simple zip decryption in C (yes, the much loathed
weak password based classical PKWARE encryption, which incidentally is
the only one currently supported in python).

It performs fast, as one would expect, as compared to the current all-python
implementation.

Does it sound worthy enough to create a patch for and integrate into python
itself?
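
For context, the classical PKWARE stream cipher is small enough to
sketch in pure Python. This is an illustration written from the
published algorithm, not the code in zipfile and not the C
implementation mentioned above. The class name is hypothetical, and the
12-byte encryption header that precedes each entry's data in a real
archive is not handled:

```python
def _make_crc_table():
    """Standard CRC-32 table (reflected polynomial 0xEDB88320)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


_CRC_TABLE = _make_crc_table()


def _crc32_step(crc, byte):
    return (crc >> 8) ^ _CRC_TABLE[(crc ^ byte) & 0xFF]


class ZipCrypto:
    """Traditional ZIP cipher state, seeded from a bytes password."""

    def __init__(self, password):
        self.k0, self.k1, self.k2 = 0x12345678, 0x23456789, 0x34567890
        for b in password:
            self._update(b)

    def _update(self, plain_byte):
        # The three keys are always advanced with the *plaintext* byte.
        self.k0 = _crc32_step(self.k0, plain_byte)
        self.k1 = (self.k1 + (self.k0 & 0xFF)) & 0xFFFFFFFF
        self.k1 = (self.k1 * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = _crc32_step(self.k2, self.k1 >> 24)

    def _stream_byte(self):
        t = self.k2 | 2
        return ((t * (t ^ 1)) >> 8) & 0xFF

    def encrypt(self, data):
        out = bytearray()
        for p in data:
            out.append(p ^ self._stream_byte())
            self._update(p)
        return bytes(out)

    def decrypt(self, data):
        out = bytearray()
        for c in data:
            p = c ^ self._stream_byte()
            self._update(p)
            out.append(p)
        return bytes(out)


secret = ZipCrypto(b"password").encrypt(b"hello, zip")
print(ZipCrypto(b"password").decrypt(secret))
```

The per-byte loop with a table lookup and a 32-bit multiply is the part
that is slow in pure Python and that a C version would speed up.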

Shashank Singh
Senior Undergraduate, Department of Computer Science and Engineering
Indian Institute of Technology Bombay
shashank.sunny.singh at

From martin at  Sun Aug 30 10:55:33 2009
From: martin at (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 30 Aug 2009 10:55:33 +0200
Subject: [Python-Dev] Fast Implementation for ZIP decryption
In-Reply-To: <>
References: <>
Message-ID: <>

> Does it sound worthy enough to create a patch for and integrate into
> python itself?

Probably not, given that people think that the algorithm itself is
fairly useless.


From steve at  Sun Aug 30 14:59:41 2009
From: steve at (Steven D'Aprano)
Date: Sun, 30 Aug 2009 22:59:41 +1000
Subject: [Python-Dev] Fast Implementation for ZIP decryption
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, 30 Aug 2009 06:55:33 pm Martin v. Löwis wrote:
> > Does it sound worthy enough to create a patch for and integrate
> > into python itself?
> Probably not, given that people think that the algorithm itself is
> fairly useless.

I would think that for most people, the threat model isn't "the CIA is 
reading my files" but "my little brother or nosey co-worker is reading 
my files", and for that, zip encryption with a good password is 
probably perfectly adequate. E.g. OpenOffice uses it for 
password-protected documents.

Given that Python already supports ZIP decryption (as it should), are 
there any reasons to prefer the current pure-Python implementation over 
a faster version?

Steven D'Aprano

From mg at  Sun Aug 30 13:37:56 2009
From: mg at (Martin Geisler)
Date: Sun, 30 Aug 2009 13:37:56 +0200
Subject: [Python-Dev] Mercurial migration: help needed
References: <>
Message-ID: <>

Mark Hammond <skippy.hammond at> writes:

> 1) I've stalled on the 'none:' patch I promised to resurrect.  While
> doing this, I re-discovered that the tests for win32text appear to
> check win32 line endings are used by win32text on *all* platforms, not
> just Windows.

I think it is only Patrick Mezard who knows how to run (parts of) the
test suite on Windows.

> I asked for advice from Dirkjan who referred me to the mercurial-devel
> list, but my request of slightly over a week ago remains unanswered
> (
> - 
> maybe I just need to be more patient...

Oh no, that's usually the wrong tactic :-) I've been too busy for real
Mercurial work the last couple of weeks, but you should not feel bad
about poking us if you don't get a reply. Or come to the IRC channel
(#mercurial on where Dirkjan (djc) and myself (mg)
hang out when it's daytime in Europe.

> Further, Martin's comments in this thread indicate he believes a new
> extension will be necessary rather than 'fixing' win32text.  If this
> is the direction we take, it may mean the none: patch, which targets
> the implementation of win32text, is no longer necessary anyway.

I suggested a new extension for two reasons:

* I'm using Linux, and I mentally skip over all extensions that mention
  "win32"... I guess others do the same, and in this case it's really a
  shame since converting EOL markers is a cross-platform problem: if
  someone creates a repository on Windows, I might find it nice to
  translate the EOL markers into LF on my machine.

  As far as I know, all my tools work correctly with CRLF EOL markers,
  but I can see the usefulness of such an extension when adding new
  files (which would default to LF unless I take care).

* A new extension will not have to deal with backwards compatibility
  issues. That would let us clean up the strange names: I think
  "cleverencode:" and "cleverdecode:" are quite poor names that convey
  little meaning (and what's with the colon?). We could instead use the
  same names as Subversion: "native", "CRLF" and "LF".

  The new extension could be named 'convert-eol' or something like that.

> 2) These same recent discussions about an entirely new extension and
> no clear indication of our expectations regarding what the tool
> actually enforces means I'm not sure how to make a start on the more
> general issue.

It would be a folly to require all files in all changesets to use the
right EOL markers -- people will be making mistakes offline. The
important thing is that they fix them before pushing to a public server.

So the extension should do that: either abort commits with the wrong EOL
markers or do as Subversion and automatically convert the file in the
working copy.
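
The two behaviours described above (abort, or convert as Subversion does) can be sketched as follows. This is only an illustration operating on raw bytes; the function names are hypothetical and this is not Mercurial's hook API or the code of any existing extension.

```python
import re

# A line feed that is not preceded by a carriage return.
_BARE_LF = re.compile(rb"(?<!\r)\n")


def has_wrong_eol(data: bytes, style: str) -> bool:
    """Return True if data contains EOL markers that do not match style."""
    if style == "LF":
        return b"\r\n" in data
    if style == "CRLF":
        return _BARE_LF.search(data) is not None
    raise ValueError("unknown EOL style: %r" % style)


def convert_eol(data: bytes, style: str) -> bytes:
    """Rewrite all CRLF/LF line endings in data to the requested style."""
    normalized = data.replace(b"\r\n", b"\n")
    if style == "LF":
        return normalized
    if style == "CRLF":
        return normalized.replace(b"\n", b"\r\n")
    raise ValueError("unknown EOL style: %r" % style)


def check_or_convert(data: bytes, style: str, convert: bool = False) -> bytes:
    """Either refuse data with wrong EOLs or silently fix it."""
    if not has_wrong_eol(data, style):
        return data
    if convert:
        return convert_eol(data, style)
    raise ValueError("abort: file does not use %s line endings" % style)
```

Which of the two modes is the default is a policy decision; the sketch leaves it to the `convert` flag.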

> I also fear that should I try to make a start on this, it will still
> wind up fruitless - eg, it seems any work targeting win32text
> specifically would have been wasted, so I'd really like to see a
> consensus on what needs to be done before attempting to start it.

As I understand it, what is lacking is that win32text will read the
encode/decode settings from a versioned file called <repo>/.hgeol. This
means that you can just enable the extension and be done with it,
instead of configuring it in every clone. The <repo>/.hgeol file should
contain two sections:

  [repository]
  native = LF

  [patterns]
  Windows.txt = CRLF
  Unix.txt = LF
  Tools/buildbot/** = CRLF
  **.txt = native
  **.py = native
  **.dsp = CRLF

The [repository] setting controls what native is translated into upon
commit. The [patterns] section can be translated into safe [decode] /
[encode] settings by the extension:

  [encode]
  Windows.txt = to-crlf
  Unix.txt = to-lf
  Tools/buildbot/** = to-crlf
  **.txt = to-lf
  **.py = to-lf
  **.dsp = to-crlf

  [decode]
  Windows.txt = to-crlf
  Unix.txt = to-lf
  Tools/buildbot/** = to-crlf
  **.txt = to-native
  **.py = to-native
  **.dsp = to-crlf

where to-crlf, to-lf, to-native are filters installed by the extension.
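
The translation from the versioned file to filter settings could look roughly like this. It is a sketch only: the function name is hypothetical, the file is parsed with Python's configparser rather than Mercurial's own config reader, and it assumes the sections are named [repository] and [patterns] as described above.

```python
import configparser

HGEOL_TEXT = """\
[repository]
native = LF

[patterns]
Windows.txt = CRLF
Unix.txt = LF
Tools/buildbot/** = CRLF
**.txt = native
**.py = native
**.dsp = CRLF
"""


def translate_patterns(hgeol_text):
    """Turn [patterns] into (encode, decode) dicts of filter names."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the case of file name patterns
    parser.read_string(hgeol_text)
    native = parser["repository"]["native"]

    encode, decode = {}, {}
    for pattern, style in parser["patterns"].items():
        if style == "native":
            # Stored in the repository in its native format,
            # checked out in the platform's format.
            encode[pattern] = "to-" + native.lower()
            decode[pattern] = "to-native"
        else:
            # Explicit styles are kept the same in both directions.
            encode[pattern] = decode[pattern] = "to-" + style.lower()
    return encode, decode


if __name__ == "__main__":
    encode, decode = translate_patterns(HGEOL_TEXT)
    for name, table in (("encode", encode), ("decode", decode)):
        print("[%s]" % name)
        for pattern, filter_name in table.items():
            print("%s = %s" % (pattern, filter_name))
```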

I guess your 'none' encode/decode filter patch would be needed if the
Unix.txt file were to be stored unchanged in the repository? Instead I
imagine that the extension will convert a modified Unix.txt to LF EOL
markers automatically (Subversion behaves like that, as far as I can
tell from a bit of testing).

That way the repository will contain most files in the format specified
as native for it, but selected files are stored using whatever EOLs they
like. The result is that someone who has not enabled the extension will
get correct files from a checkout. Had we stored the *.dsp files with LF
EOLs in the repository (like Subversion does), then using the extension
would be mandatory for everybody.

Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See:

From exarkun at  Sun Aug 30 15:05:22 2009
From: exarkun at (exarkun at
Date: Sun, 30 Aug 2009 13:05:22 -0000
Subject: [Python-Dev] Fast Implementation for ZIP decryption
In-Reply-To: <>
References: <>
Message-ID: <20090830130522.11528.715943779.divmod.xquotient.4@localhost.localdomain>

On 12:59 pm, steve at wrote:
>On Sun, 30 Aug 2009 06:55:33 pm Martin v. Löwis wrote:
>> > Does it sound worthy enough to create a patch for and integrate
>> > into python itself?
>>Probably not, given that people think that the algorithm itself is
>>fairly useless.
>I would think that for most people, the threat model isn't "the CIA is
>reading my files" but "my little brother or nosey co-worker is reading
>my files", and for that, zip encryption with a good password is
>probably perfectly adequate. E.g. OpenOffice uses it for
>password-protected documents.
>Given that Python already supports ZIP decryption (as it should), are
>there any reasons to prefer the current pure-Python implementation over
>a faster version?

Given that the use case is "protect my biology homework from my little 
brother", how fast does the implementation really need to be?  Is 
speeding it up from 0.1 seconds to 0.001 seconds worth the potential new 
problems that come with more C code (more code to maintain, less 
portability to other runtimes, potential for interpreter crashes or even 
arbitrary code execution vulnerabilities from specially crafted files)?


From shashank.sunny.singh at  Sun Aug 30 16:34:36 2009
From: shashank.sunny.singh at (Shashank Singh)
Date: Sun, 30 Aug 2009 20:04:36 +0530
Subject: [Python-Dev] Fast Implementation for ZIP decryption
In-Reply-To: <20090830130522.11528.715943779.divmod.xquotient.4@localhost.localdomain>
References: <> 
	<> <>
Message-ID: <>

just to give you an idea of the speed up:

a 3.3 MB zip file extracted using the current all-python implementation on
my machine (Win XP, 1.67 GHz, 1.5 GB)
takes approximately 38 seconds.

the same file when extracted using c implementation takes 0.4 seconds.
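
Put as numbers, using only the figures quoted above (the variable names are just for illustration):

```python
size_mb = 3.3               # archive size reported above
pure_python_seconds = 38.0  # current all-python implementation
c_seconds = 0.4             # proposed C implementation

speedup = pure_python_seconds / c_seconds
pure_python_throughput = size_mb / pure_python_seconds
c_throughput = size_mb / c_seconds

print("speedup: about %.0fx" % speedup)
print("pure Python: about %.2f MB/s" % pure_python_throughput)
print("C: about %.2f MB/s" % c_throughput)
```

That is a speedup of roughly 95 times on this one file and machine; other files and machines may differ.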


On Sun, Aug 30, 2009 at 6:35 PM, <exarkun at> wrote:

> On 12:59 pm, steve at wrote:
>> On Sun, 30 Aug 2009 06:55:33 pm Martin v. Löwis wrote:
>>> > Does it sound worthy enough to create a patch for and integrate
>>> > into python itself?
>>> Probably not, given that people think that the algorithm itself is
>>> fairly useless.
>> I would think that for most people, the threat model isn't "the CIA is
>> reading my files" but "my little brother or nosey co-worker is reading
>> my files", and for that, zip encryption with a good password is
>> probably perfectly adequate. E.g. OpenOffice uses it for
>> password-protected documents.
>> Given that Python already supports ZIP decryption (as it should), are
>> there any reasons to prefer the current pure-Python implementation over
>> a faster version?
> Given that the use case is "protect my biology homework from my little
> brother", how fast does the implementation really need to be?  Is speeding
> it up from 0.1 seconds to 0.001 seconds worth the potential new problems
> that come with more C code (more code to maintain, less portability to other
> runtimes, potential for interpreter crashes or even arbitrary code execution
> vulnerabilities from specially crafted files)?
> Jean-Paul

From ludvig at  Sun Aug 30 17:58:59 2009
From: ludvig at (Ludvig Ericson)
Date: Sun, 30 Aug 2009 17:58:59 +0200
Subject: [Python-Dev] Fast Implementation for ZIP decryption
In-Reply-To: <>
References: <>
Message-ID: <>

On 30 aug 2009, at 16:34, Shashank Singh wrote:
> just to give you an idea of the speed up:
> a 3.3 mb zip file extracted using the current all-python  
> implementation on my machine (win xp 1.67Ghz 1.5GB)
> takes approximately 38 seconds.
> the same file when extracted using c implementation takes 0.4 seconds.

If this matters to the users of the API, then likely they'd search for  
alternatives -- no need for it to go into the standard library just  
because it replaces functionality, or am I misunderstanding?

- Ludvig Ericson <ludvig at>

From martin at  Sun Aug 30 19:59:26 2009
From: martin at ("Martin v. Löwis")
Date: Sun, 30 Aug 2009 19:59:26 +0200
Subject: [Python-Dev] Mercurial migration: help needed
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

> I suggested a new extension for two reasons:
> * I'm using Linux, and I mentally skip over all extensions that mention
>   "win32"... I guess others do the same, and in this case it's really a
>   shame since converting EOL markers is a cross-platform problem: if
>   someone creates a repository on Windows, I might find it nice to
>   translate the EOL markers into LF on my machine.
>   As far as I know, all my tools work correctly with CRLF EOL markers,
>   but I can see the usefulness of such an extension when adding new
>   files (which would default to LF unless I take care).
> * A new extension will not have to deal with backwards compatibility
>   issues. That would let us clean up the strange names: I think
>   "cleverencode:" and "cleverdecode:" are quite poor names that convey
>   little meaning (and what's with the colon?). We could instead use the
>   same names as Subversion: "native", "CRLF" and "LF".
>   The new extension could be named 'convert-eol' or something like that.

Thanks for the confirmation - this is also why I think a new extension
would be best. FWIW, in Python, most files would be declared native,
some CRLF, none LF.

>> 2) These same recent discussions about an entirely new extension and
>> no clear indication of our expectations regarding what the tool
>> actually enforces means I'm not sure how to make a start on the more
>> general issue.
> It would be a folly to require all files in all changesets to use the
> right EOL markers -- people will be making mistakes offline. The
> important thing is that they fix them before pushing to a public server.
> So the extension should do that: either abort commits with the wrong EOL
> markers or do as Subversion and automatically convert the file in the
> working copy.

Maybe I misunderstand: when people use the extension, they cannot
possibly make mistakes, right? Because the commit that gets aborted
is already the local commit, right?

Of course, it may still be that not all people use the extension.
I think this is of concern to Mark (and he would like hg to refuse
operation at all if the extension isn't used), but not to me: I would
like this to be a feature of hg eventually, in which case I don't need
to worry whether hg enforces presence of certain extensions.

If people make commits that break the eol style, we could well
refuse to accept them on the server, telling people that they should
have used the extension (or that they should have been more careful
if they don't use the extension).

I think subversion's behavior wrt. incorrect eol-style is more subtle.
In some cases, it will complain about inconsistencies, rather than
fixing them automatically.


From mg at  Sun Aug 30 20:37:36 2009
From: mg at (Martin Geisler)
Date: Sun, 30 Aug 2009 20:37:36 +0200
Subject: [Python-Dev] Mercurial migration: help needed
References: <>
	<> <>
Message-ID: <>

"Martin v. Löwis" <martin at> writes:

>> So the extension should do that: either abort commits with the wrong
>> EOL markers or do as Subversion and automatically convert the file in
>> the working copy.
> Maybe I misunderstand: when people use the extension, they cannot
> possibly make mistakes, right? Because the commit that gets aborted is
> already the local commit, right?
> Of course, it may still be that not all people use the extension.

Exactly, when people use the extension, they won't be able to make bad
commits.

> I think this is of concern to Mark (and he would like hg to refuse
> operation at all if the extension isn't used), but not to me: I would
> like this to be a feature of hg eventually, in which case I don't need
> to worry whether hg enforces presence of certain extensions.

Yes, that would be nice for the future. I don't know if the other
Mercurial developers will see this as a big controversy -- Mercurial has
so far made very sure to never mutate your files behind your back.
Expansion of keywords (like $Id$) is also implemented as an extension.

> If people make commits that break the eol style, we could well refuse
> to accept them on the server, telling people that they should have
> used the extension (or that they should have been more careful if they
> don't use the extension).

Indeed. Their work will not be lost -- one can always take the final
file, convert the line-endings, copy it into a fresh clone and commit
that. With more work one could even salvage the intermediate commits,
but that is probably not necessary.

> I think subversion's behavior wrt. incorrect eol-style is more subtle.
> In some cases, it will complain about inconsistencies, rather than
> fixing them automatically.

Okay --- I don't have much experience with the svn:eol-style, except
that I've read about it in the manual.

Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See:

From ncoghlan at  Sun Aug 30 23:24:44 2009
From: ncoghlan at (Nick Coghlan)
Date: Mon, 31 Aug 2009 07:24:44 +1000
Subject: [Python-Dev] Fast Implementation for ZIP decryption
In-Reply-To: <20090830130522.11528.715943779.divmod.xquotient.4@localhost.localdomain>
References: <>	<>	<>
Message-ID: <>

exarkun at wrote:
> Given that the use case is "protect my biology homework from my little
> brother", how fast does the implementation really need to be?  Is
> speeding it up from 0.1 seconds to 0.001 seconds worth the potential new
> problems that come with more C code (more code to maintain, less
> portability to other runtimes, potential for interpreter crashes or even
> arbitrary code execution vulnerabilities from specially crafted files)?

Also, if the use case is just protecting stuff from a sibling or your
children, use an archiving program to zip/extract it :)

So -1 here as well. Any added C code has a real cost for the reasons
Jean-Paul listed, so it should only be used in cases where there's a
major practical benefit to the speed-up. Faster execution of a
problematic algorithm that is already well implemented by plenty of
other applications doesn't qualify in my book (even if the speedup is by
a couple of orders of magnitude).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From brett at  Mon Aug 31 01:28:25 2009
From: brett at (Brett Cannon)
Date: Sun, 30 Aug 2009 16:28:25 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
Message-ID: <>

I am going through and running the entire test suite using importlib
to ferret out incompatibilities. I have found a bunch, although all
rather minor (raising a different exception typically; not even sure
they are worth backporting as anyone reliant on the old exceptions
might get a nasty surprise in the next micro release), and now I am
down to my last failing test suite: test_import.

Ignoring the execution bit problem (
but I have no clue why this is happening), I am bumping up against
TestPycRewriting.test_incorrect_code_name. Turns out that import
resets co_filename on a code object to __file__ before exec'ing it to
create a module's namespace in order to ignore the file name passed
into compile() for the filename argument. Now I can't change
co_filename from Python as it's a read-only attribute and thus can't
match this functionality in importlib w/o creating some custom code to
allow me to specify the co_filename somewhere (marshal.loads() or some
new function).
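
A small illustration of the problem, assuming a current CPython 3 interpreter (the file names are made up): the name passed to compile() ends up in co_filename, and that attribute cannot be assigned from Python.

```python
def can_assign_filename(code, new_name):
    """Return True if co_filename could be reassigned on this code object."""
    try:
        code.co_filename = new_name
    except (AttributeError, TypeError):
        # Code object attributes are read-only from Python code.
        return False
    return True


# The second argument to compile() is stored as co_filename.
code = compile("x = 1", "original/location.py", "exec")

print(code.co_filename)
print(can_assign_filename(code, "actual/location.py"))
```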

My question is how important is this functionality? Do I really need
to go through and add an argument to marshal.loads or some new
function just to set co_filename to something that someone explicitly
set in a .pyc file? Or I can let this go and have this be the one
place where builtins.__import__ and importlib.__import__ differ and
just not worry about it?


From robertc at  Mon Aug 31 02:13:12 2009
From: robertc at (Robert Collins)
Date: Mon, 31 Aug 2009 10:13:12 +1000
Subject: [Python-Dev] how important is setting co_filename for a module
 being imported to what __file__ is set to?
In-Reply-To: <>
References: <>
Message-ID: <1251677592.1956.69.camel@lifeless-64>

On Sun, 2009-08-30 at 16:28 -0700, Brett Cannon wrote:
> My question is how important is this functionality? Do I really need
> to go through and add an argument to marshal.loads or some new
> function just to set co_filename to something that someone explicitly
> set in a .pyc file? Or I can let this go and have this be the one
> place where builtins.__import__ and importlib.__import__ differ and
> just not worry about it? 

Just to be clear, this would show up if I:
had a python tree
built and run stuff from it
symlinked to that tree from somewhere else
ran stuff from that somewhere else

 - because the pyc is already on disk?

That's been an invaluable 'wtf' debugging tool at various times, because
the odd provenance of the path in the pyc makes it extremely clear that
what is being loaded isn't what one had thought was being loaded.

OTOH, always showing the path that the pyc was *actually found at* would
fix the weirdness that occurs when you mv a python tree from one place
to another.


From brett at  Mon Aug 31 02:23:32 2009
From: brett at (Brett Cannon)
Date: Sun, 30 Aug 2009 17:23:32 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <1251677592.1956.69.camel@lifeless-64>
References: <> 
Message-ID: <>

On Sun, Aug 30, 2009 at 17:13, Robert Collins<robertc at> wrote:
> On Sun, 2009-08-30 at 16:28 -0700, Brett Cannon wrote:
>> My question is how important is this functionality? Do I really need
>> to go through and add an argument to marshal.loads or some new
>> function just to set co_filename to something that someone explicitly
>> set in a .pyc file? Or I can let this go and have this be the one
>> place where builtins.__import__ and importlib.__import__ differ and
>> just not worry about it?
> Just to be clear, this would show up if I:
> had a python tree
> built and run stuff from it
> symlinked to that tree from somewhere else
> ran stuff from that somewhere else

Right; the code object would think it was loaded from the original
location it was created at instead of where it actually is. Now why
someone would want to move their .pyc files around instead of
recompiling I don't know short of not wanting to send someone source.


From guido at  Mon Aug 31 02:24:54 2009
From: guido at (Guido van Rossum)
Date: Sun, 30 Aug 2009 17:24:54 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Aug 30, 2009 at 4:28 PM, Brett Cannon<brett at> wrote:
> I am going through and running the entire test suite using importlib
> to ferret out incompatibilities. I have found a bunch, although all
> rather minor (raising a different exception typically; not even sure
> they are worth backporting as anyone reliant on the old exceptions
> might get a nasty surprise in the next micro release), and now I am
> down to my last failing test suite: test_import.
> Ignoring the execution bit problem (
> but I have no clue why this is happening), I am bumping up against
> TestPycRewriting.test_incorrect_code_name. Turns out that import
> resets co_filename on a code object to __file__ before exec'ing it to
> create a module's namespace in order to ignore the file name passed
> into compile() for the filename argument. Now I can't change
> co_filename from Python as it's a read-only attribute and thus can't
> match this functionality in importlib w/o creating some custom code to
> allow me to specify the co_filename somewhere (marshal.loads() or some
> new function).
> My question is how important is this functionality? Do I really need
> to go through and add an argument to marshal.loads or some new
> function just to set co_filename to something that someone explicitly
> set in a .pyc file? Or I can let this go and have this be the one
> place where builtins.__import__ and importlib.__import__ differ and
> just not worry about it?

ISTR that Bill Janssen once mentioned a file replication mechanism
whereby there were two names for each file: the "canonical" name on a
replicated read-only filesystem, and the longer "writable" name on a
unique master copy. He ended up with the filenames in the .pyc files
being pretty bogus (since not everyone had access to the writable
filesystem). So setting co_filename to match __file__ (i.e. the name
under which the module is being imported) would be a nice service in
this case.

In general this would happen whenever you pre-compile a bunch of .py
files to .pyc/.pyo and then copy the lot to a different location. Not
a completely unlikely scenario.

(I was going to comment on the execution bit issue but I realized I'm
not even sure if you're talking about import.c or not. :-)

--Guido van Rossum (home page:

From guido at  Mon Aug 31 02:26:08 2009
From: guido at (Guido van Rossum)
Date: Sun, 30 Aug 2009 17:26:08 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Sun, Aug 30, 2009 at 5:23 PM, Brett Cannon<brett at> wrote:
> Right; the code object would think it was loaded from the original
> location it was created at instead of where it actually is. Now why
> someone would want to move their .pyc files around instead of
> recompiling I don't know short of not wanting to send someone source.

I already mentioned replication; it could also just be a matter of
downloading a tarball with .py and .pyc files.

--Guido van Rossum (home page:

From brett at  Mon Aug 31 02:34:06 2009
From: brett at (Brett Cannon)
Date: Sun, 30 Aug 2009 17:34:06 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Sun, Aug 30, 2009 at 17:24, Guido van Rossum<guido at> wrote:
> On Sun, Aug 30, 2009 at 4:28 PM, Brett Cannon<brett at> wrote:
>> I am going through and running the entire test suite using importlib
>> to ferret out incompatibilities. I have found a bunch, although all
>> rather minor (raising a different exception typically; not even sure
>> they are worth backporting as anyone reliant on the old exceptions
>> might get a nasty surprise in the next micro release), and now I am
>> down to my last failing test suite: test_import.
>> Ignoring the execution bit problem (
>> but I have no clue why this is happening), I am bumping up against
>> TestPycRewriting.test_incorrect_code_name. Turns out that import
>> resets co_filename on a code object to __file__ before exec'ing it to
>> create a module's namespace in order to ignore the file name passed
>> into compile() for the filename argument. Now I can't change
>> co_filename from Python as it's a read-only attribute and thus can't
>> match this functionality in importlib w/o creating some custom code to
>> allow me to specify the co_filename somewhere (marshal.loads() or some
>> new function).
>> My question is how important is this functionality? Do I really need
>> to go through and add an argument to marshal.loads or some new
>> function just to set co_filename to something that someone explicitly
>> set in a .pyc file? Or I can let this go and have this be the one
>> place where builtins.__import__ and importlib.__import__ differ and
>> just not worry about it?
> ISTR that Bill Janssen once mentioned a file replication mechanism
> whereby there were two names for each file: the "canonical" name on a
> replicated read-only filesystem, and the longer "writable" name on a
> unique master copy. He ended up with the filenames in the .pyc files
> being pretty bogus (since not everyone had access to the writable
> filesystem). So setting co_filename to match __file__ (i.e. the name
> under which the module is being imported) would be a nice service in
> this case.
> In general this would happen whenever you pre-compile a bunch of .py
> files to .pyc/.pyo and then copy the lot to a different location. Not
> a completely unlikely scenario.

Well, to get this level of compatibility I am going to need to add
some magical API somewhere then to overwrite a code object's "file"
location. Blah.

I will either add an argument to marshal.loads to specify an
overriding file path or add an imp.exec that takes a file path
argument to override the code object with.

> (I was going to comment on the execution bit issue but I realized I'm
> not even sure if you're talking about import.c or not. :-)

So it turns out a bunch of execution/write bit stuff has come up in
Python 2.7 and importlib has been ignoring it. =) Importlib has simply
been opening up the bytecode files with 'wb' and writing out the file.
But test_import tests that no execution bit gets set or that a write
bit gets added if the source file lacks it. I guess I can use
posix.chmod and posix.stat to copy the source file's read and write
bits and always mask out the execution bits. I hate this low-level
file permission stuff.
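
A minimal sketch of that idea, using os.stat and os.chmod (the posix module functions are exposed through os); the function name is hypothetical and the behaviour on non-POSIX platforms is not considered:

```python
import os
import stat


def copy_mode_without_exec(source_path, bytecode_path):
    """Give the bytecode file the source's read/write bits, never exec bits."""
    source_mode = stat.S_IMODE(os.stat(source_path).st_mode)
    # 0o666 keeps only the read and write bits for user, group and other.
    bytecode_mode = source_mode & 0o666
    os.chmod(bytecode_path, bytecode_mode)
    return bytecode_mode
```

With a source file of mode 0o755 this yields 0o644 for the bytecode file, and a read-only source (0o444) gives read-only bytecode.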


From guido at  Mon Aug 31 04:34:43 2009
From: guido at (Guido van Rossum)
Date: Sun, 30 Aug 2009 19:34:43 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Sun, Aug 30, 2009 at 5:34 PM, Brett Cannon<brett at> wrote:
> On Sun, Aug 30, 2009 at 17:24, Guido van Rossum<guido at> wrote:
>> On Sun, Aug 30, 2009 at 4:28 PM, Brett Cannon<brett at> wrote:
>>> I am going through and running the entire test suite using importlib
>>> to ferret out incompatibilities. I have found a bunch, although all
>>> rather minor (raising a different exception typically; not even sure
>>> they are worth backporting as anyone reliant on the old exceptions
>>> might get a nasty surprise in the next micro release), and now I am
>>> down to my last failing test suite: test_import.
>>> Ignoring the execution bit problem (
>>> but I have no clue why this is happening), I am bumping up against
>>> TestPycRewriting.test_incorrect_code_name. Turns out that import
>>> resets co_filename on a code object to __file__ before exec'ing it to
>>> create a module's namespace in order to ignore the file name passed
>>> into compile() for the filename argument. Now I can't change
>>> co_filename from Python as it's a read-only attribute and thus can't
>>> match this functionality in importlib w/o creating some custom code to
>>> allow me to specify the co_filename somewhere (marshal.loads() or some
>>> new function).
>>> My question is how important is this functionality? Do I really need
>>> to go through and add an argument to marshal.loads or some new
>>> function just to set co_filename to something that someone explicitly
>>> set in a .pyc file? Or I can let this go and have this be the one
>>> place where builtins.__import__ and importlib.__import__ differ and
>>> just not worry about it?
>> ISTR that Bill Janssen once mentioned a file replication mechanism
>> whereby there were two names for each file: the "canonical" name on a
>> replicated read-only filesystem, and the longer "writable" name on a
>> unique master copy. He ended up with the filenames in the .pyc files
>> being pretty bogus (since not everyone had access to the writable
>> filesystem). So setting co_filename to match __file__ (i.e. the name
>> under which the module is being imported) would be a nice service in
>> this case.
>> In general this would happen whenever you pre-compile a bunch of .py
>> files to .pyc/.pyo and then copy the lot to a different location. Not
>> a completely unlikely scenario.

> Well, to get this level of compatibility I am going to need to add
> some magical API somewhere then to overwrite a code object's "file"
> location. Blah.

Agreed, no fun. Unfortunately for core Python it really pays to go the
extra mile...

> I will either add an argument to marshal.loads to specify an
> overriding file path or add an imp.exec that takes a file path
> argument to override the code object with.

Remember, there are many code objects created from one pyc file.
Adding it to marshal.load*() makes sense because then it's usable for
other purposes too, and that attacks the issue from the root. (in
import.c it's done by update_compiled_module() right after
read_compiled_module(), which is a thin wrapper around marshal.load())
I'm not sure how imp.exec would make sure that introspection of the
loaded code objects always gets the right thing.
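
To illustrate the point about many code objects: nested functions and classes live as constants inside the module's code object, so any rewrite has to recurse. The sketch below assumes Python 3.8 or later, where code objects have a replace() method (which did not exist at the time of this thread); the helper name is hypothetical and this is not what import.c does internally.

```python
import types


def with_filename(code, filename):
    """Return a copy of code, and all nested code objects, using filename."""
    new_consts = tuple(
        with_filename(const, filename)
        if isinstance(const, types.CodeType) else const
        for const in code.co_consts
    )
    return code.replace(co_filename=filename, co_consts=new_consts)


SOURCE = "def outer():\n    def inner():\n        pass\n    return inner\n"

# Pretend this code object came out of a .pyc compiled somewhere else.
stale = compile(SOURCE, "build/machine/mod.py", "exec")
fixed = with_filename(stale, "installed/mod.py")

namespace = {}
exec(fixed, namespace)
print(namespace["outer"].__code__.co_filename)
print(namespace["outer"]().__code__.co_filename)
```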

>> (I was going to comment on the execution bit issue but I realized I'm
>> not even sure if you're talking about import.c or not. :-)
> So it turns out a bunch of execution/write bit stuff has come up in
> Python 2.7 and importlib has been ignoring it. =) Importlib has simply
> been opening up the bytecode files with 'wb' and writing out the file.
> But test_import tests that no execution bit gets set or that a write
> bit gets added if the source file lacks it. I guess I can use
> posix.chmod and posix.stat to copy the source file's read and write
> bits and always mask out the execution bits. I hate this low-level
> file permission stuff.

It's no fun -- see the layers of #ifdefs in open_exclusive() in
import.c. (Though I think you won't need to worry about VMS. :-) But
it's somewhat important to get it right from a security POV. I would
use and wrap an io.BufferedWriter around it.
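
A rough Python analogue of the exclusive-create idea, as a sketch only. The low-level call is missing from the sentence above, so the use of os.open here is an assumption, as is the step of removing a stale file first; the function name simply mirrors the C helper mentioned above.

```python
import io
import os


def open_exclusive(path, mode=0o666):
    """Create path exclusively and return a buffered binary writer."""
    try:
        os.unlink(path)  # assumed: drop any stale file first
    except FileNotFoundError:
        pass
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    flags |= getattr(os, "O_BINARY", 0)  # only defined on Windows
    # O_EXCL makes the call fail if someone else created the file in
    # between; mode (modified by the umask) sets the permission bits.
    fd = os.open(path, flags, mode)
    return io.BufferedWriter(io.FileIO(fd, "w"))
```

Passing a mode without execution bits at creation time avoids a window in which the file exists with the wrong permissions.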

--Guido van Rossum (home page:

From brett at  Mon Aug 31 04:43:48 2009
From: brett at (Brett Cannon)
Date: Sun, 30 Aug 2009 19:43:48 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Sun, Aug 30, 2009 at 19:34, Guido van Rossum<guido at> wrote:
> On Sun, Aug 30, 2009 at 5:34 PM, Brett Cannon<brett at> wrote:
>> On Sun, Aug 30, 2009 at 17:24, Guido van Rossum<guido at> wrote:
>>> On Sun, Aug 30, 2009 at 4:28 PM, Brett Cannon<brett at> wrote:
>>>> I am going through and running the entire test suite using importlib
>>>> to ferret out incompatibilities. I have found a bunch, although all
>>>> rather minor (raising a different exception typically; not even sure
>>>> they are worth backporting as anyone reliant on the old exceptions
>>>> might get a nasty surprise in the next micro release), and now I am
>>>> down to my last failing test suite: test_import.
>>>> Ignoring the execution bit problem (
>>>> but I have no clue why this is happening), I am bumping up against
>>>> TestPycRewriting.test_incorrect_code_name. Turns out that import
>>>> resets co_filename on a code object to __file__ before exec'ing it to
>>>> create a module's namespace in order to ignore the file name passed
>>>> into compile() for the filename argument. Now I can't change
>>>> co_filename from Python as it's a read-only attribute and thus can't
>>>> match this functionality in importlib w/o creating some custom code to
>>>> allow me to specify the co_filename somewhere (marshal.loads() or some
>>>> new function).
>>>> My question is how important is this functionality? Do I really need
>>>> to go through and add an argument to marshal.loads or some new
>>>> function just to set co_filename to something that someone explicitly
>>>> set in a .pyc file? Or can I let this go and have this be the one
>>>> place where builtins.__import__ and importlib.__import__ differ and
>>>> just not worry about it?
>>> ISTR that Bill Janssen once mentioned a file replication mechanism
>>> whereby there were two names for each file: the "canonical" name on a
>>> replicated read-only filesystem, and the longer "writable" name on a
>>> unique master copy. He ended up with the filenames in the .pyc files
>>> being pretty bogus (since not everyone had access to the writable
>>> filesystem). So setting co_filename to match __file__ (i.e. the name
>>> under which the module is being imported) would be a nice service in
>>> this case.
>>> In general this would happen whenever you pre-compile a bunch of .py
>>> files to .pyc/.pyo and then copy the lot to a different location. Not
>>> a completely unlikely scenario.
>> Well, to get this level of compatibility I am going to need to add
>> some magical API somewhere then to overwrite a code object's "file"
>> location. Blah.
> Agreed, no fun. Unfortunately for core Python it really pays to go the
> extra mile...

Definitely, which is why I will do it, just not tonight as I am tired
of compatibility fixing for now. =)

>> I will either add an argument to marshal.loads to specify an
>> overriding file path or add an imp.exec that takes a file path
>> argument to override the code object with.
> Remember, there are many code objects created from one pyc file.
> Adding it to marshal.load*() makes sense because then it's usable for
> other purposes too, and that attacks the issue from the root.

That was my thinking.

> (in
> import.c it's done by update_compiled_module() right after
> read_compiled_module(), which is a thin wrapper around marshal.load())
> I'm not sure how imp.exec would make sure that introspection of the
> loaded code objects always gets the right thing.

Basically it would be imp.exec(module, code, path) and it would tweak
the code object before execution based on introspecting what the
module had set for __file__. But might as well add the support to

>>> (I was going to comment on the execution bit issue but I realized I'm
>>> not even sure if you're talking about import.c or not. :-)
>> So it turns out a bunch of execution/write bit stuff has come up in
>> Python 2.7 and importlib has been ignoring it. =) Importlib has simply
>> been opening up the bytecode files with 'wb' and writing out the file.
>> But test_import tests that no execution bit gets set or that a write
>> bit gets added if the source file lacks it. I guess I can use
>> posix.chmod and posix.stat to copy the source file's read and write
>> bits and always mask out the execution bits. I hate this low-level
>> file permission stuff.
> It's no fun -- see the layers of #ifdefs in open_exclusive() in
> import.c. (Though I think you won't need to worry about VMS. :-) But
> it's somewhat important to get it right from a security POV. I would
> use and wrap an io.BufferedWriter around it.

I will have to see what of that is implemented in C or in Python. I
have always tried to keep all pure Python code out of importlib for
bootstrapping reasons in order to keep the possibility of using
importlib as the implementation of import. But maybe I should not be
worrying about that right at the moment and instead do what keeps the
code simple.


From benjamin at  Mon Aug 31 04:51:00 2009
From: benjamin at (Benjamin Peterson)
Date: Sun, 30 Aug 2009 21:51:00 -0500
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/30 Brett Cannon <brett at>:
> On Sun, Aug 30, 2009 at 19:34, Guido van Rossum<guido at> wrote:
>> On Sun, Aug 30, 2009 at 5:34 PM, Brett Cannon<brett at> wrote:
>>> On Sun, Aug 30, 2009 at 17:24, Guido van Rossum<guido at> wrote:
>>>> (I was going to comment on the execution bit issue but I realized I'm
>>>> not even sure if you're talking about import.c or not. :-)
>>> So it turns out a bunch of execution/write bit stuff has come up in
>>> Python 2.7 and importlib has been ignoring it. =) Importlib has simply
>>> been opening up the bytecode files with 'wb' and writing out the file.
>>> But test_import tests that no execution bit gets set or that a write
>>> bit gets added if the source file lacks it. I guess I can use
>>> posix.chmod and posix.stat to copy the source file's read and write
>>> bits and always mask out the execution bits. I hate this low-level
>>> file permission stuff.
>> It's no fun -- see the layers of #ifdefs in open_exclusive() in
>> import.c. (Though I think you won't need to worry about VMS. :-) But
>> it's somewhat important to get it right from a security POV. I would
>> use and wrap an io.BufferedWriter around it.
> I will have to see what of that is implemented in C or in Python. I
> have always tried to keep all pure Python code out of importlib for
> bootstrapping reasons in order to keep the possibility of using
> importlib as the implementation of import. But maybe I should not be
> worrying about that right at the moment and instead do what keeps the
> code simple.

You can use the C implementation of io, _io, which has a full
buffering implementation. Of course, that also makes it a bit
harder for other implementations which may wish to use importlib
because the io library would have to be completely implemented...


From glyph at  Mon Aug 31 05:13:49 2009
From: glyph at (Glyph Lefkowitz)
Date: Sun, 30 Aug 2009 23:13:49 -0400
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Aug 30, 2009 at 8:26 PM, Guido van Rossum <guido at> wrote:

> On Sun, Aug 30, 2009 at 5:23 PM, Brett Cannon<brett at> wrote:
> > Right; the code object would think it was loaded from the original
> > location it was created at instead of where it actually is. Now why
> > someone would want to move their .pyc files around instead of
> > recompiling I don't know short of not wanting to send someone source.
> I already mentioned replication; it could also just be a matter of
> downloading a tarball with .py and .pyc files.

Also, if you're using Python in an embedded context, bytecode compilation
(or even filesystem access!) can be prohibitively slow, so an uncompressed
.zip file full of compiled .pyc files is really the way to go.

I did this a long time ago on an XScale machine, but recent inspection of
the Android Python scripting stuff shows a similar style of deployment (c.f.

From collinw at  Mon Aug 31 06:28:49 2009
From: collinw at (Collin Winter)
Date: Sun, 30 Aug 2009 21:28:49 -0700
Subject: [Python-Dev] Fast Implementation for ZIP decryption
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Aug 30, 2009 at 7:34 AM, Shashank
Singh<shashank.sunny.singh at> wrote:
> just to give you an idea of the speed up:
> a 3.3 mb zip file extracted using the current all-python implementation on
> my machine (win xp 1.67Ghz 1.5GB)
> takes approximately 38 seconds.
> the same file when extracted using c implementation takes 0.4 seconds.

Are there any applications/frameworks which have zip files on their
critical path, where this kind of (admittedly impressive) speedup
would be beneficial? What was the motivation for writing the C
version?

Collin Winter

> On Sun, Aug 30, 2009 at 6:35 PM, <exarkun at> wrote:
>> On 12:59 pm, steve at wrote:
>>> On Sun, 30 Aug 2009 06:55:33 pm Martin v. Löwis wrote:
>>>> > Does it sound worthy enough to create a patch for and integrate
>>>> > into python itself?
>>>> Probably not, given that people think that the algorithm itself is
>>>> fairly useless.
>>> I would think that for most people, the threat model isn't "the CIA is
>>> reading my files" but "my little brother or nosey co-worker is reading
>>> my files", and for that, zip encryption with a good password is
>>> probably perfectly adequate. E.g. OpenOffice uses it for
>>> password-protected documents.
>>> Given that Python already supports ZIP decryption (as it should), are
>>> there any reasons to prefer the current pure-Python implementation over
>>> a faster version?
>> Given that the use case is "protect my biology homework from my little
>> brother", how fast does the implementation really need to be? Is speeding
>> it up from 0.1 seconds to 0.001 seconds worth the potential new problems
>> that come with more C code (more code to maintain, less portability to other
>> runtimes, potential for interpreter crashes or even arbitrary code execution
>> vulnerabilities from specially crafted files)?
>> Jean-Paul

From asmodai at  Mon Aug 31 07:40:21 2009
From: asmodai at (Jeroen Ruigrok van der Werven)
Date: Mon, 31 Aug 2009 07:40:21 +0200
Subject: [Python-Dev] Fast Implementation for ZIP decryption
In-Reply-To: <>
References: <>
Message-ID: <>

-On [20090831 06:29], Collin Winter (collinw at wrote:
>Are there any applications/frameworks which have zip files on their
>critical path, where this kind of (admittedly impressive) speedup
>would be beneficial? What was the motivation for writing the C
>version?

Would zipped eggs count? For example, SQLAlchemy runs in the 5 MB range.

Jeroen Ruigrok van der Werven <asmodai(-at-)> / asmodai
GPG: 2EAC625B
All for one, one for all...

From brett at  Mon Aug 31 07:43:07 2009
From: brett at (Brett Cannon)
Date: Sun, 30 Aug 2009 22:43:07 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Sun, Aug 30, 2009 at 19:51, Benjamin Peterson<benjamin at> wrote:
> 2009/8/30 Brett Cannon <brett at>:
>> On Sun, Aug 30, 2009 at 19:34, Guido van Rossum<guido at> wrote:
>>> On Sun, Aug 30, 2009 at 5:34 PM, Brett Cannon<brett at> wrote:
>>>> On Sun, Aug 30, 2009 at 17:24, Guido van Rossum<guido at> wrote:
>>>>> (I was going to comment on the execution bit issue but I realized I'm
>>>>> not even sure if you're talking about import.c or not. :-)
>>>> So it turns out a bunch of execution/write bit stuff has come up in
>>>> Python 2.7 and importlib has been ignoring it. =) Importlib has simply
>>>> been opening up the bytecode files with 'wb' and writing out the file.
>>>> But test_import tests that no execution bit gets set or that a write
>>>> bit gets added if the source file lacks it. I guess I can use
>>>> posix.chmod and posix.stat to copy the source file's read and write
>>>> bits and always mask out the execution bits. I hate this low-level
>>>> file permission stuff.
>>> It's no fun -- see the layers of #ifdefs in open_exclusive() in
>>> import.c. (Though I think you won't need to worry about VMS. :-) But
>>> it's somewhat important to get it right from a security POV. I would
>>> use and wrap an io.BufferedWriter around it.
>> I will have to see what of that is implemented in C or in Python. I
>> have always tried to keep all pure Python code out of importlib for
>> bootstrapping reasons in order to keep the possibility of using
>> importlib as the implementation of import. But maybe I should not be
>> worrying about that right at the moment and instead do what keeps the
>> code simple.
> You can use the C implementation of io, _io, which has a full
> buffering implementation. Of course, that also makes it a bit
> harder for other implementations which may wish to use importlib
> because the io library would have to be completely implemented...

True. I guess it's a question of whether making importlib easier to
maintain and as minimally reliant on C-specific modules is more/less
important than trying to bootstrap it in for CPython for __import__ at
some point.


From greg at  Mon Aug 31 08:38:32 2009
From: greg at (Gregory P. Smith)
Date: Sun, 30 Aug 2009 23:38:32 -0700
Subject: [Python-Dev] Fast Implementation for ZIP decryption
In-Reply-To: <>
References: <> 
	<> <>
Message-ID: <>

On Sun, Aug 30, 2009 at 10:40 PM, Jeroen Ruigrok van der Werven <
asmodai at> wrote:

> -On [20090831 06:29], Collin Winter (collinw at wrote:
> >Are there any applications/frameworks which have zip files on their
> >critical path, where this kind of (admittedly impressive) speedup
> >would be beneficial? What was the motivation for writing the C
> >version?
> Would zipped eggs count? For example, SQLAlchemy runs in the 5 MB range.

Unless someone's also pushing for being able to import and execute code from
scrambled zip files, no that doesn't matter.

The C code for this should be trivially tiny.  See the zipfile._ZipDecryptor
class, it's got ~25 lines of actual code in it.  It is not worth arguing
about.  I'll commit this if you post it as a patch in a tracker issue.
 Please make sure your patch includes the following:

* A unittest that compares the C version of the descrambler to the python
version of the descrambler using a variety of inputs and outputs that
exercise any boundary condition.

* Conditional import code in the zipfile module itself so that the module
works even if the C module isn't available.


From greg at  Mon Aug 31 09:01:29 2009
From: greg at (Gregory P. Smith)
Date: Mon, 31 Aug 2009 00:01:29 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Sun, Aug 30, 2009 at 5:24 PM, Guido van Rossum <guido at> wrote:

> On Sun, Aug 30, 2009 at 4:28 PM, Brett Cannon<brett at> wrote:
> > I am going through and running the entire test suite using importlib
> > to ferret out incompatibilities. I have found a bunch, although all
> > rather minor (raising a different exception typically; not even sure
> > they are worth backporting as anyone reliant on the old exceptions
> > might get a nasty surprise in the next micro release), and now I am
> > down to my last failing test suite: test_import.
> >
> > Ignoring the execution bit problem (
> > but I have no clue why this is happening), I am bumping up against
> > TestPycRewriting.test_incorrect_code_name. Turns out that import
> > resets co_filename on a code object to __file__ before exec'ing it to
> > create a module's namespace in order to ignore the file name passed
> > into compile() for the filename argument. Now I can't change
> > co_filename from Python as it's a read-only attribute and thus can't
> > match this functionality in importlib w/o creating some custom code to
> > allow me to specify the co_filename somewhere (marshal.loads() or some
> > new function).
> >
> > My question is how important is this functionality? Do I really need
> > to go through and add an argument to marshal.loads or some new
> > function just to set co_filename to something that someone explicitly
> > set in a .pyc file? Or can I let this go and have this be the one
> > place where builtins.__import__ and importlib.__import__ differ and
> > just not worry about it?
> ISTR that Bill Janssen once mentioned a file replication mechanism
> whereby there were two names for each file: the "canonical" name on a
> replicated read-only filesystem, and the longer "writable" name on a
> unique master copy. He ended up with the filenames in the .pyc files
> being pretty bogus (since not everyone had access to the writable
> filesystem). So setting co_filename to match __file__ (i.e. the name
> under which the module is being imported) would be a nice service in
> this case.
> In general this would happen whenever you pre-compile a bunch of .py
> files to .pyc/.pyo and then copy the lot to a different location. Not
> a completely unlikely scenario.

8-9 years ago while using py2exe on windows to create standalone binaries
out of Python programs for distribution we ran into this issue... The
compiled .pyc's that py2exe bundles up contained the pathname to the source
code on the development build system.  When you get a stacktrace python
would look for the source code based on those.... Really horrible if your
build system used a windows drive letter other than C such as D: as it could
cause windows to pop up a dialog asking the user to insert a CD or spin up a
spun down optical disc or ask for a floppy, etc. ;)

From shashank.sunny.singh at  Mon Aug 31 10:10:45 2009
From: shashank.sunny.singh at (Shashank Singh)
Date: Mon, 31 Aug 2009 13:40:45 +0530
Subject: [Python-Dev] Fast Implementation for ZIP decryption
In-Reply-To: <>
References: <> 
	<> <>
Message-ID: <>

On Mon, Aug 31, 2009 at 12:08 PM, Gregory P. Smith <greg at> wrote:

> On Sun, Aug 30, 2009 at 10:40 PM, Jeroen Ruigrok van der Werven <
> asmodai at> wrote:
>> -On [20090831 06:29], Collin Winter (collinw at wrote:
>> >Are there any applications/frameworks which have zip files on their
>> >critical path, where this kind of (admittedly impressive) speedup
>> >would be beneficial? What was the motivation for writing the C
>> >version?
>> Would zipped eggs count? For example, SQLAlchemy runs in the 5 MB range.
> Unless someone's also pushing for being able to import and execute code
> from scrambled zip files, no that doesn't matter.

For those who have not seen it : asks for
such an ability (there was a good deal of discussion about it on python-dev
too and I think Greg you were a -1 on it :).

> The C code for this should be trivially tiny.  See the
> zipfile._ZipDecryptor class, it's got ~25 lines of actual code in it.

Right you are. It is just a simple translation of those ~25 lines of code
into C.

> It is not worth arguing about.  I'll commit this if you post it as a patch
> in a tracker issue.  Please make sure your patch includes the following:
> * A unittest that compares the C version of the descrambler to the python
> version of the descrambler using a variety of inputs and outputs that
> exercise any boundary condition.
> * Conditional import code in the zipfile module itself so that the module
> works even if the C module isn't available.

I sure can do that.
What boundary conditions do you have in mind?

While we are at it (and forgive my obsession with the zip module :), is
there enough need for supporting the Strong Encryption Specification in the
zip module?
At least one immediate benefit I can see is that the OP of the link I posted
above will be happy :)

The main reason the idea of supporting import of encrypted modules was shot
down is that the simple encryption scheme is too weak to bother about.
Supporting Strong Encryption might do away with that problem, besides
possibly adding a whole new way of distributing Python modules.

Are there any (more?) use cases or am I missing something very trivial why
Strong Encryption was never supported in the zip module?

-- Shashank

Shashank Singh
Senior Undergraduate, Department of Computer Science and Engineering
Indian Institute of Technology Bombay
shashank.sunny.singh at

From solipsis at  Mon Aug 31 11:24:39 2009
From: solipsis at (Antoine Pitrou)
Date: Mon, 31 Aug 2009 09:24:39 +0000 (UTC)
Subject: [Python-Dev]
References: <>
Message-ID: <>

Brett Cannon <brett <at>> writes:
> Now I can't change
> co_filename from Python as it's a read-only attribute and thus can't
> match this functionality in importlib w/o creating some custom code to
> allow me to specify the co_filename somewhere

Why can't we simply make co_filename a writable attribute instead of inventing
some complicated API?

From chris at  Mon Aug 31 13:59:29 2009
From: chris at (Chris Withers)
Date: Mon, 31 Aug 2009 12:59:29 +0100
Subject: [Python-Dev]
In-Reply-To: <>
References: <>	
Message-ID: <>

Nick Coghlan wrote:
> The PEPs don't go into the process of how we actually hook the command
> line up to the runpy module though - that's something you need to dig
> into the main.c code to really understand.

Yeah, main.c does quite a lot... ;-)

This all spawned from a suggestion by Jim Fulton over on the 
distutils-sig that it would be nice if there was a python module that 
did all of the various types of launching found in main.c. His use case 
is so that buildout scripts can easily use the same functionality that 
the interpreter startup uses.

I didn't spot any, but does anyone know of code in that mix that 
couldn't be moved to a pure python module like runpy?

If not, how would people feel about the various types of launching all 
moving to runpy rather than just the -m stuff being there?



Simplistix - Content Management, Batch Processing & Python Consulting

From chris at  Mon Aug 31 14:17:12 2009
From: chris at (Chris Withers)
Date: Mon, 31 Aug 2009 13:17:12 +0100
Subject: [Python-Dev] Excluding the current path from module search path?
In-Reply-To: <>
References: <>
	<>	<>
Message-ID: <>

Nick Coghlan wrote:
> Ah, OK - I see the problem now. However, I think the current behaviour
> is correct, it just needs to be documented better (probably noted in
> both the command line doco 

Not sure what you mean by this?

> regarding sys.path manipulation and in the
> doco for

Agreed :-)

> The reason I think the current behaviour is correct is that and
> are meant to be about customising the *site* (i.e. the
> installation of Python that is being executed) rather than about
> customizing a particular application.

Unless you use virtualenv as Guido suggested in the other thread ;-)

> Importing them before the script
> specific directories are prepended to sys.path goes a long way towards
> achieving that.

If had more uses than the setdefaultencoding hack, I'd 
argue more about this... If it does have other uses, my argument would 
be that "site" wide is a very subjective term that many people, myself 
included, would like to be able to mean "per project, I don't *ever* 
want to screw with my actual Python install, it should stay pristine"...

> Also, as was pointed out on the tracker item, having a script that can
> automatically be executed when running an arbitrary Python script
> without any request from or notification to the user is not a good idea
> from a security standpoint.

Agreed, but I think that's only an issue when you're starting up an 
interpreter. If you're running a script from a file or module, I'd say 
it's more akin to what's specified in PYTHONSTARTUP being executed 
than a random script being silently executed without your permission.



Simplistix - Content Management, Batch Processing & Python Consulting

From chris at  Mon Aug 31 14:23:16 2009
From: chris at (Chris Withers)
Date: Mon, 31 Aug 2009 13:23:16 +0100
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>	<20090825162305.7657.1157242584.divmod.xquotient.112@localhost.localdomain>	<>	<20090827130857.7475.1053558531.divmod.xquotient.8@localhost.localdomain>	<>
Message-ID: <>

Guido van Rossum wrote:
> Being adults about it also means when to give up. Chris, please stop
> arguing about this. 

Sure. Even if people had agreed to this change, it wouldn't end up in a 
python release I could use for this project.

> There are plenty of techniques you can use to get
> what you want without changing Python, for example virtualenv, which
> allows you to create a custom Python environment for each project.

Yep, I'll resort to wrapping the buildout in a virtualenv iff the 
reload(sys) hack ends up causing problems...

> Or
> you could switch to Python 3.1, 

I would love to, once Python 3 has a viable web app story...



Simplistix - Content Management, Batch Processing & Python Consulting

From ncoghlan at  Mon Aug 31 15:13:55 2009
From: ncoghlan at (Nick Coghlan)
Date: Mon, 31 Aug 2009 23:13:55 +1000
Subject: [Python-Dev] how important is setting co_filename for a module
 being imported to what __file__ is set to?
In-Reply-To: <>
References: <>
Message-ID: <>

Brett Cannon wrote:
>> You can use the C implementation of io, _io, which has a full
>> buffering implementation. Of course, that also makes it a bit
>> harder for other implementations which may wish to use importlib
>> because the io library would have to be completely implemented...
> True. I guess it's a question of whether making importlib easier to
> maintain and as minimally reliant on C-specific modules is more/less
> important than trying to bootstrap it in for CPython for __import__ at
> some point.

I'd suggest preferring _io, but falling back to the Python io module if
the accelerated version doesn't exist. You should get the best of both
worlds that way (no bootstrap issues in CPython and other
implementations with an _io module, but a still functional importlib in
other implementations).
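
A minimal sketch of that fallback, assuming both modules expose FileIO and
BufferedWriter under the same names. The alias io_impl and the helper
write_bytes are made up for the example.

```python
try:
    import _io as io_impl  # C-accelerated implementation, where available
except ImportError:
    import io as io_impl  # fall back to the public io module


def write_bytes(path, data):
    """Write data to path through a buffered writer from whichever module loaded."""
    raw = io_impl.FileIO(path, "wb")
    # Closing the BufferedWriter also closes the underlying FileIO object.
    with io_impl.BufferedWriter(raw) as writer:
        writer.write(data)
```

The rest of the code only ever refers to io_impl, so it does not need to
know which of the two implementations it got.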


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Mon Aug 31 15:19:29 2009
From: ncoghlan at (Nick Coghlan)
Date: Mon, 31 Aug 2009 23:19:29 +1000
Subject: [Python-Dev] how important is setting co_filename for a module
 being imported to what __file__ is set to?
In-Reply-To: <>
References: <>
Message-ID: <>

Antoine Pitrou wrote:
> Brett Cannon <brett <at>> writes:
>> Now I can't change
>> co_filename from Python as it's a read-only attribute and thus can't
>> match this functionality in importlib w/o creating some custom code to
>> allow me to specify the co_filename somewhere
> Why can't we simply make co_filename a writable attribute instead of inventing
> some complicated API?

I thought of that question as well, but the later exchange between Guido
and Brett made me realise that a lot more than the top level module code
object is affected here - the adjustment also needs to be propagated to
the code objects created by the module for functions and generators and
so forth.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From fuzzyman at  Mon Aug 31 15:27:25 2009
From: fuzzyman at (Michael Foord)
Date: Mon, 31 Aug 2009 14:27:25 +0100
Subject: [Python-Dev] how important is setting co_filename for a module
 being imported to what __file__ is set to?
In-Reply-To: <>
References: <>	<>
Message-ID: <>

Nick Coghlan wrote:
> Antoine Pitrou wrote:
>> Brett Cannon <brett <at>> writes:
>>> Now I can't change
>>> co_filename from Python as it's a read-only attribute and thus can't
>>> match this functionality in importlib w/o creating some custom code to
>>> allow me to specify the co_filename somewhere
>> Why can't we simply make co_filename a writable attribute instead of inventing
>> some complicated API?
> I thought of that question as well, but the later exchange between Guido
> and Brett made me realise that a lot more than the top level module code
> object is affected here - the adjustment also needs to be propagated to
> the code objects created by the module for functions and generators and
> so forth.

Even if it is not necessary or sufficient it still sounds like a useful 
change. When writing tools that generate modules or manipulate code 
objects these read-only attributes are a great nuisance.


> Cheers,
> Nick.


From solipsis at  Mon Aug 31 15:32:53 2009
From: solipsis at (Antoine Pitrou)
Date: Mon, 31 Aug 2009 13:32:53 +0000 (UTC)
Subject: [Python-Dev]
References: <>
Message-ID: <>

Nick Coghlan <ncoghlan <at>> writes:
> I thought of that question as well, but the later exchange between Guido
> and Brett made me realise that a lot more than the top level module code
> object is affected here - the adjustment also needs to be propagated to
> the code objects created by the module for functions and generators and
> so forth.

I'm not sure I understand. There's a single type of "code object" and it's
PyCodeObject. Making the attribute writable (from Python code) at that level
should be sufficient.
(then of course the recursive machinery needed to mutate all code objects
created in a module may be slightly inefficient if written in Python, but at
least it's possible to write it)
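
The recursive walk mentioned above can be sketched in present-day Python.
This is an approximation of the idea rather than the proposal itself: it
relies on code.replace(), which exists only in Python 3.8 and later (so was
not available when this thread was written), and it builds new code objects
instead of mutating the existing ones. The function name with_filename is
made up.

```python
import types


def with_filename(code, filename):
    """Return a copy of code, and of every nested code object, with co_filename replaced."""
    new_consts = tuple(
        with_filename(const, filename) if isinstance(const, types.CodeType) else const
        for const in code.co_consts
    )
    return code.replace(co_filename=filename, co_consts=new_consts)


# Usage: compile under one name, then rewrite to another before exec'ing.
source = "def f():\n    def g():\n        pass\n    return g\n"
old_code = compile(source, "old_location.py", "exec")
new_code = with_filename(old_code, "new_location.py")
namespace = {}
exec(new_code, namespace)
```

After the exec, the functions defined by the module report the new
filename, which is the behaviour the import machinery is expected to
provide.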



From ncoghlan at  Mon Aug 31 15:36:42 2009
From: ncoghlan at (Nick Coghlan)
Date: Mon, 31 Aug 2009 23:36:42 +1000
Subject: [Python-Dev]
In-Reply-To: <>
References: <>	
	<> <>
Message-ID: <>

Chris Withers wrote:
> Nick Coghlan wrote:
>> The PEPs don't go into the process of how we actually hook the command
>> line up to the runpy module though - that's something you need to dig
>> into the main.c code to really understand.
> Yeah, main.c does quite a lot... ;-)
> This all spawned from a suggestion by Jim Fulton over on the
> distutils-sig that it would be nice if there was a python module that
> did all of the various types of launching found in main.c. His use case
> is so that buildout scripts can easily use the same functionality that
> the interpreter startup uses.
> I didn't spot any, but does anyone know of code in that mix that
> couldn't be moved to a pure python module like runpy?
> If not, how would people feel about the various types of launching all
> moving to runpy rather than just the -m stuff being there?

I haven't timed it, but I believe runpy is a fair bit slower than the
native C functions in main. (That first part of the comment means I
could easily be wrong though - it's definitely possible that overall
interpreter startup time will dwarf any difference between the two
launch mechanisms).

That said, while actually ditching the C code might cause an argument,
expanding runpy with Python equivalents of the C level functionality
(i.e. run script by name, run directory/zipfile by name, '-c' switch,
and other odds and ends that I'm probably forgetting right now, with all
associated modifications to sys.argv and the __main__ module attributes)
should be far less controversial.

For example, _run_module_as_main() has survived long enough now without
anyone poking holes in it (unlike the holes in the original run_module()
that PJE drove a truck through!) that I could probably be talked into
removing the comment I put on it and making it public :)

As you say, making all of that functionality accessible from Python
would allow launch scripts to be far more flexible in handling arguments
as if they were the normal interpreter.
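
As a small sketch of what such a launch script might look like, here is a
wrapper around runpy.run_path(), which runs a script by file name. Note
that run_path() was added to runpy after this thread (Python 2.7 and 3.2);
the helper name launch_script and the demo script are made up.

```python
import os
import runpy
import tempfile


def launch_script(path):
    """Run the script at path as __main__ and return the resulting globals."""
    return runpy.run_path(path, run_name="__main__")


# Usage: write a throwaway script and run it the way "python script.py" would.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.py")
with open(demo_path, "w") as f:
    f.write("result = __name__\n")
demo_globals = launch_script(demo_path)
```

The returned dictionary holds the script's module globals, so a wrapper can
inspect what the script defined after it finishes.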


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From benjamin at  Mon Aug 31 16:55:56 2009
From: benjamin at (Benjamin Peterson)
Date: Mon, 31 Aug 2009 09:55:56 -0500
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <>
Message-ID: <>

2009/8/31 Antoine Pitrou <solipsis at>:
> Brett Cannon <brett <at>> writes:
>> Now I can't change
>> co_filename from Python as it's a read-only attribute and thus can't
>> match this functionality in importlib w/o creating some custom code to
>> allow me to specify the co_filename somewhere
> Why can't we simply make co_filename a writable attribute instead of inventing
> some complicated API?

Because code objects are supposed to be immutable, hashable objects?


From fumanchu at  Mon Aug 31 16:49:30 2009
From: fumanchu at (Robert Brewer)
Date: Mon, 31 Aug 2009 07:49:30 -0700
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <>
References: <>	<20090825162305.7657.1157242584.divmod.xquotient.112@localhost.localdomain>	<>	<20090827130857.7475.1053558531.divmod.xquotient.8@localhost.localdomain>	<><>
Message-ID: <F1962646D3B64642B7C9A06068EE1E6409F6161A@ex10.hostedexchange.local>

Chris Withers wrote:
> Guido van Rossum wrote:
> > Being adults about it also means when to give up. Chris, please stop
> > arguing about this.
> Sure. Even if people had agreed to this change, it wouldn't end up in
> python release I could use for this project.
> > There are plenty of techniques you can use to get
> > what you want without changing Python, for example virtualenv, which
> > allows you to create a custom Python environment for each project.
> Yep, I'll resort to wrapping the buildout in a virtualenv iff the
> reload(sys) hack ends up causing problems...
> > Or
> > you could switch to Python 3.1,
> I would love to, once Python 3 has a viable web app story...

CherryPy 3.2 is now in beta, and mod_wsgi is nearly ready as well. Both
support Python 3. :)

Robert Brewer
fumanchu at

From chris at  Mon Aug 31 17:00:59 2009
From: chris at (Chris Withers)
Date: Mon, 31 Aug 2009 16:00:59 +0100
Subject: [Python-Dev] web apps in python 3
In-Reply-To: <F1962646D3B64642B7C9A06068EE1E6409F6161A@ex10.hostedexchange.local>
References: <>	<20090825162305.7657.1157242584.divmod.xquotient.112@localhost.localdomain>	<>	<20090827130857.7475.1053558531.divmod.xquotient.8@localhost.localdomain>	<><>
Message-ID: <>

Robert Brewer wrote:
>>> you could switch to Python 3.1,
>> I would love to, once Python 3 has a viable web app story...
> CherryPy 3.2 is now in beta, and mod_wsgi is nearly ready as well. Both
> support Python 3. :)

My understanding was that the wsgi spec for Python 3 wasn't finished...


Simplistix - Content Management, Batch Processing & Python Consulting

From solipsis at  Mon Aug 31 17:10:34 2009
From: solipsis at (Antoine Pitrou)
Date: Mon, 31 Aug 2009 15:10:34 +0000 (UTC)
Subject: [Python-Dev]
References: <>
Message-ID: <>

Benjamin Peterson <benjamin <at>> writes:
> > Why can't we simply make co_filename a writable attribute instead of
> > some complicated API?
> Because code objects are supposed to be a immutable hashable object?

Right, but co_filename is used neither in tp_hash nor in tp_richcompare.
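
One way to probe this from Python, as a rough sketch: compile the same source under two made-up file names and compare the results. If co_filename really is left out of hashing and comparison, both flags should come out True on CPython; other implementations may behave differently.

```python
source = "x = 1\n"

# Same source, two different (invented) file names.
code_a = compile(source, "a.py", "exec")
code_b = compile(source, "b.py", "exec")

# Compare the two code objects by value and by hash.
same_value = code_a == code_b
same_hash = hash(code_a) == hash(code_b)

print(code_a.co_filename, code_b.co_filename, same_value, same_hash)
```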



From fumanchu at  Mon Aug 31 17:13:32 2009
From: fumanchu at (Robert Brewer)
Date: Mon, 31 Aug 2009 08:13:32 -0700
Subject: [Python-Dev] web apps in python 3
References: <>	<20090825162305.7657.1157242584.divmod.xquotient.112@localhost.localdomain>	<>	<20090827130857.7475.1053558531.divmod.xquotient.8@localhost.localdomain>	<><>
Message-ID: <F1962646D3B64642B7C9A06068EE1E6418B424@ex10.hostedexchange.local>

Chris Withers wrote:
> Robert Brewer wrote:
>>>> you could switch to Python 3.1,
>>> I would love to, once Python 3 has a viable web app story...
>> CherryPy 3.2 is now in beta, and mod_wsgi is nearly ready as well. Both
>> support Python 3. :)
> My understanding was that the wsgi spec for Python 3 wasn't finished...

The WSGI 1.0 spec has always included Python 3 using unicode strings in
the environ (decoded via ISO-8859-1, and limited to \x00-\xFF). In
addition, the CherryPy and mod_wsgi teams are working to interoperably
support a modified version of WSGI, in which the environ is true unicode
for both Python 2 and 3. We hope these implementations become references
from which a WSGI 1.1 spec can be written; since web-sig has not yet
reached consensus on certain specification details, we are proceeding
together with tools that allow users to get work done now.
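
A small sketch of the ISO-8859-1 convention described above. The path value is invented, and re-decoding as UTF-8 on the application side is an assumption about how an app might choose to recover the original text, not something the spec mandates.

```python
# Bytes as they might arrive on the wire (UTF-8 encoded path).
raw_path = "/caf\u00e9".encode("utf-8")

# Decoding via ISO-8859-1 never fails and maps each byte to one
# code point in the range \x00-\xFF.
environ_value = raw_path.decode("iso-8859-1")

# The application can reverse the step to get the bytes back and
# then apply whatever encoding it believes is correct.
recovered = environ_value.encode("iso-8859-1").decode("utf-8")

print(repr(environ_value), repr(recovered))
```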

Robert Brewer
fumanchu at

From guido at  Mon Aug 31 18:25:06 2009
From: guido at (Guido van Rossum)
Date: Mon, 31 Aug 2009 09:25:06 -0700
Subject: [Python-Dev] deleting setdefaultencoding iin is evil
In-Reply-To: <F1962646D3B64642B7C9A06068EE1E6409F6161A@ex10.hostedexchange.local>
References: <>
Message-ID: <>

On Mon, Aug 31, 2009 at 7:49 AM, Robert Brewer<fumanchu at> wrote:
> CherryPy 3.2 is now in beta, and mod_wsgi is nearly ready as well. Both
> support Python 3. :)

Excellent news! I just saw that PyYAML also supports 3.1. Slowly but
surely, 3.1 is gaining traction...

--Guido van Rossum (home page:

From brett at  Mon Aug 31 18:27:49 2009
From: brett at (Brett Cannon)
Date: Mon, 31 Aug 2009 09:27:49 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Mon, Aug 31, 2009 at 08:10, Antoine Pitrou<solipsis at> wrote:
> Benjamin Peterson <benjamin <at>> writes:
>> > Why can't we simply make co_filename a writable attribute instead of
> inventing
>> > some complicated API?
>> Because code objects are supposed to be a immutable hashable object?
> Right, but co_filename is used neither in tp_hash nor in tp_richcompare.

I didn't suggest this since I assumed co_filename was made read-only
for a reason back when the design decision was made. But if the
original safety concerns are not there then I am happy to simply
change the attribute to writable.


From guido at  Mon Aug 31 18:33:12 2009
From: guido at (Guido van Rossum)
Date: Mon, 31 Aug 2009 09:33:12 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Mon, Aug 31, 2009 at 9:27 AM, Brett Cannon<brett at> wrote:
> On Mon, Aug 31, 2009 at 08:10, Antoine Pitrou<solipsis at> wrote:
>> Benjamin Peterson <benjamin <at>> writes:
>>> > Why can't we simply make co_filename a writable attribute instead of
>> inventing
>>> > some complicated API?
>>> Because code objects are supposed to be a immutable hashable object?
>> Right, but co_filename is used neither in tp_hash nor in tp_richcompare.
> I didn't suggest this since I assumed co_filename was made read-only
> for a reason back when the design decision was made. But if the
> original safety concerns are not there then I am happy to simply
> change the attribute to writable.

Hm... I still wonder if there would be bad side effects of making
co_filename writable, but I can't think of any, so maybe you can make
this work... The next step would be to not write it out when
marshalling a code object -- this might save a bit of space in pyc
files too! (I guess for compatibility you might want to write it as an
empty string.)

Of course, tracking down all the code objects in the return value of
marshal.load*() might be a bit tricky -- API-wise I still think that
making it an argument to marshal.load*() might be simpler. Also it
would preserve the purity of code objects.

(Michael: it would be fine if *other* implementations of Python made
co_filename writable, as long as you can't think of security issues
with this.)
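
For reference, a short sketch showing that the file name currently travels inside the marshalled data. The path is invented for the example.

```python
import marshal

# Compile under a made-up path, then round-trip through marshal.
code_ob = compile("x = 1\n", "/original/location/mod.py", "exec")
data = marshal.dumps(code_ob)
loaded = marshal.loads(data)

# The loaded code object reports the path baked in at compile time,
# wherever the bytes were actually read from.
print(loaded.co_filename)
```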

--Guido van Rossum (home page:

From brett at  Mon Aug 31 18:34:41 2009
From: brett at (Brett Cannon)
Date: Mon, 31 Aug 2009 09:34:41 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Mon, Aug 31, 2009 at 06:13, Nick Coghlan<ncoghlan at> wrote:
> Brett Cannon wrote:
>>> You can use the C implementation of io, _io, which has a full
>>> buffering implementation. Of course, that also makes it a bit
>>> harder for other implementations which may wish to use importlib
>>> because the io library would have to be completely implemented...
>> True. I guess it's a question of whether making importlib easier to
>> maintain and as minimally reliant on C-specific modules is more/less
>> important than trying to bootstrap it in for CPython for __import__ at
>> some point.
> I'd suggest preferring _io, but falling back to the Python io module if
> the accelerated version doesn't exist. You should get the best of both
> worlds that way (no bootstrap issues in CPython and other
> implementations with an _io module, but a still functional importlib in
> other implementations).

Well, all important code is in importlib._bootstrap which lacks a
single import statement; all dependent modules are injected externally
in importlib.__init__. That allows for the possibility of C code to
import importlib/ along with the built-in modules
required, and then inject those built-in modules into importlib's
global namespace. This is why I have functions in there that are
duplications of things found elsewhere.
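
A toy sketch of the injection idea, not the actual importlib code: the module name toy_bootstrap and the read_bytes helper are hypothetical. The source has no import statement and simply expects a global called _io, which is supplied from outside.

```python
import io
import os
import tempfile
import types

# Hypothetical stand-in for a bootstrap module: no imports in its
# source, only a reference to a global named _io.
BOOTSTRAP_SOURCE = '''
def read_bytes(path):
    with _io.FileIO(path, "r") as f:
        return f.read()
'''

bootstrap = types.ModuleType("toy_bootstrap")
exec(BOOTSTRAP_SOURCE, bootstrap.__dict__)

# Inject the dependency externally. Here the pure-Python-facing io
# module is bound under the name _io.
bootstrap._io = io

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    with open(path, "wb") as f:
        f.write(b"payload")
    contents = bootstrap.read_bytes(path)

print(contents)
```

Whichever object is bound to _io only needs to provide FileIO, which is what keeps the bootstrap code independent of how its dependencies were obtained.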

That means that while I have named the module _io and I use
_io.FileIO, you could also easily inject io with the name _io and have
everything just work if you were trying to bootstrap. The deal is that
if I want to keep up the bootstrap goal I need to continue to restrict
myself to either built-in modules or things I know I can choose to
expose later in built-in modules. This is why when Guido suggested I said I will have to see how it is implemented because if it
doesn't come from posix or is easy to duplicate I won't be able to use


From brett at  Mon Aug 31 18:57:13 2009
From: brett at (Brett Cannon)
Date: Mon, 31 Aug 2009 09:57:13 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Mon, Aug 31, 2009 at 09:33, Guido van Rossum<guido at> wrote:
> On Mon, Aug 31, 2009 at 9:27 AM, Brett Cannon<brett at> wrote:
>> On Mon, Aug 31, 2009 at 08:10, Antoine Pitrou<solipsis at> wrote:
>>> Benjamin Peterson <benjamin <at>> writes:
>>>> > Why can't we simply make co_filename a writable attribute instead of
>>> inventing
>>>> > some complicated API?
>>>> Because code objects are supposed to be a immutable hashable object?
>>> Right, but co_filename is used neither in tp_hash nor in tp_richcompare.
>> I didn't suggest this since I assumed co_filename was made read-only
>> for a reason back when the design decision was made. But if the
>> original safety concerns are not there then I am happy to simply
>> change the attribute to writable.
> Hm... I still wonder if there would be bad side effects of making
> co_filename writable, but I can't think of any, so maybe you can make
> this work... The next step would be to not write it out when
> marshalling a code object -- this might save a bit of space in pyc
> files too! (I guess for compatibility you might want to write it as an
> empty string.)

I would only want to consider stripping out the filename from the
marshal format if a filename argument to marshal.load* was required to
guarantee that code objects are always in some sensible state. Otherwise
everyone would end up with tracebacks that made no sense by default.
But adding a required argument to marshal.load* would be quite the
pain for compatibility.

> Of course, tracking down all the code objects in the return value of
> marshal.load*() might be a bit tricky -- API-wise I still think that
> making it an argument to marshal.load*() might be simpler. Also it
> would preserve the purity of code objects.
> (Michael: it would be fine if *other* implementations of Python made
> co_filename writable, as long as you can't think of security issues
> with this.)

OK, so what does co_filename get used for? I think it is referenced to
open files for use in printing out the traceback. Python won't be able
to open files that you can't as a user, so that shouldn't be a
security risk. All places where co_filename is referenced would need
to gain a check or start using some new C function/macro which
verified that co_filename was a string and not some number or
something else which wouldn't get null-terminated and thus lead to
buffer overflow. A quick grep for co_filename turns up 17 uses in C
code, although having to add some check would ruin the purity Guido is
talking about and make a single attribute on code objects something
people have to be careful about instead of having a guarantee that all
attributes have some specific type of value.
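
A quick sketch of where co_filename surfaces in tracebacks. The name made_up_name.py is invented and no such file needs to exist.

```python
import traceback

source = "def fail():\n    raise ValueError('boom')\nfail()\n"
code_ob = compile(source, "made_up_name.py", "exec")

try:
    exec(code_ob, {})
except ValueError as exc:
    # Each traceback entry reports the co_filename of its code object.
    frames = traceback.extract_tb(exc.__traceback__)

filenames = [frame.filename for frame in frames]
print(filenames)
```

The entries for the compiled code carry the name given to compile(), so a stale or wrong co_filename shows up directly in what the user sees.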

I'm with Guido; I would rather add an optional argument to
marshal.load*. It must be a string and, if present, is used to
override co_filename in the resulting code object. Once we have had
the argument around we can then potentially make it a required
argument and have file paths in the marshal data go away (or decide to
default to some string constant when people don't specify the path
argument).


From guido at  Mon Aug 31 19:02:24 2009
From: guido at (Guido van Rossum)
Date: Mon, 31 Aug 2009 10:02:24 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Mon, Aug 31, 2009 at 9:57 AM, Brett Cannon<brett at> wrote:
> On Mon, Aug 31, 2009 at 09:33, Guido van Rossum<guido at> wrote:
>> Hm... I still wonder if there would be bad side effects of making
>> co_filename writable, but I can't think of any, so maybe you can make
>> this work... The next step would be to not write it out when
>> marshalling a code object -- this might save a bit of space in pyc
>> files too! (I guess for compatibility you might want to write it as an
>> empty string.)
> I would only want to consider stripping out the filename from the
> marshal format if a filename argument to marshal.load* was required to
> guarantee that code objects always in some sensible state. Otherwise
> everyone would end up with tracebacks that made no sense by default.
> But adding a required argument to marshal.load* would be quite the
> pain for compatibility.

Well... It would be, but consider this: marshal.load() already takes a
file argument; in most cases you can extract the name from the file
easily. And for marshal.loads(), I'm not sure that the filename baked
into the data is all that reliable anyways.

>> Of course, tracking down all the code objects in the return value of
>> marshal.load*() might be a bit tricky -- API-wise I still think that
>> making it an argument to marshal.load*() might be simpler. Also it
>> would preserve the purity of code objects.
>> (Michael: it would be fine if *other* implementations of Python made
>> co_filename writable, as long as you can't think of security issues
>> with this.)
> OK, so what does co_filename get used for? I think it is referenced to
> open files for use in printing out the traceback. Python won't be able
> to open files that you can't as a user, so that shouldn't be a
> security risk. All places where co_filename is referenced would need
> to gain a check or start using some new C function/macro which
> verified that co_filename was a string and not some number or
> something else which wouldn't get null-terminated and thus lead to
> buffer overflow.

You could also do the validation on assignment.

> A quick grep for co_filename turns up 17 uses in C
> code, although having to add some check would ruin the purity Guido is
> talking about and make a single attribute on code objects something
> people have to be careful about instead of having a guarantee that all
> attributes have some specific type of value.
> I'm with Guido; I would rather add an optional argument to
> marshal.load*. It must be a string and, if present, is used to
> override co_filename in the resulting code object. Once we have had
> the argument around we can then potentially make it a required
> argument and have file paths in the marshal data go away (or decide to
> default to some string constant when people don't specify the path
> argument).

Actually that sounds like a fine transitional argument.

--Guido van Rossum (home page:

From brett at  Mon Aug 31 21:00:48 2009
From: brett at (Brett Cannon)
Date: Mon, 31 Aug 2009 12:00:48 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Mon, Aug 31, 2009 at 10:02, Guido van Rossum<guido at> wrote:
> On Mon, Aug 31, 2009 at 9:57 AM, Brett Cannon<brett at> wrote:
>> On Mon, Aug 31, 2009 at 09:33, Guido van Rossum<guido at> wrote:
>>> Hm... I still wonder if there would be bad side effects of making
>>> co_filename writable, but I can't think of any, so maybe you can make
>>> this work... The next step would be to not write it out when
>>> marshalling a code object -- this might save a bit of space in pyc
>>> files too! (I guess for compatibility you might want to write it as an
>>> empty string.)
>> I would only want to consider stripping out the filename from the
>> marshal format if a filename argument to marshal.load* was required to
>> guarantee that code objects always in some sensible state. Otherwise
>> everyone would end up with tracebacks that made no sense by default.
>> But adding a required argument to marshal.load* would be quite the
>> pain for compatibility.
> Well... It would be, but consider this: marshal.load() already takes a
> file argument; in most cases you can extract the name from the file
> easily. And for marshal.loads(), I'm not sure that the filename baked
> into the data is all that reliable anyways.
>>> Of course, tracking down all the code objects in the return value of
>>> marshal.load*() might be a bit tricky -- API-wise I still think that
>>> making it an argument to marshal.load*() might be simpler. Also it
>>> would preserve the purity of code objects.
>>> (Michael: it would be fine if *other* implementations of Python made
>>> co_filename writable, as long as you can't think of security issues
>>> with this.)
>> OK, so what does co_filename get used for? I think it is referenced to
>> open files for use in printing out the traceback. Python won't be able
>> to open files that you can't as a user, so that shouldn't be a
>> security risk. All places where co_filename is referenced would need
>> to gain a check or start using some new C function/macro which
>> verified that co_filename was a string and not some number or
>> something else which wouldn't get null-terminated and thus lead to
>> buffer overflow.
> You could also do the validation on assignment.
>> A quick grep for co_filename turns up 17 uses in C
>> code, although having to add some check would ruin the purity Guido is
>> talking about and make a single attribute on code objects something
>> people have to be careful about instead of having a guarantee that all
>> attributes have some specific type of value.
>> I'm with Guido; I would rather add an optional argument to
>> marshal.load*. It must be a string and, if present, is used to
>> override co_filename in the resulting code object. Once we have had
>> the argument around we can then potentially make it a required
>> argument and have file paths in the marshal data go away (or decide to
>> default to some string constant when people don't specify the path
>> argument).
> Actually that sounds like a fine transitional argument.

I will plan to take this approach then; will track all of this. Since this is
a 3.2 thing I am not going to rush to implement this.


From pje at  Mon Aug 31 21:08:51 2009
From: pje at (P.J. Eby)
Date: Mon, 31 Aug 2009 15:08:51 -0400
Subject: [Python-Dev] how important is setting co_filename for a module
 being imported to what __file__ is set to?
In-Reply-To: <>
References: <>
Message-ID: <>

At 09:33 AM 8/31/2009 -0700, Guido van Rossum wrote:
>Of course, tracking down all the code objects in the return value of
>marshal.load*() might be a bit tricky -- API-wise I still think that
>making it an argument to marshal.load*() might be simpler. Also it
>would preserve the purity of code objects.

Or maybe we could just do something like this:

     from new import code

     def with_changed_filename(code_ob, filename):
         def remap(ob):
             # Only code objects carry a filename; other constants pass through.
             if not isinstance(ob, code):
                 return ob
             return code(
                 ob.co_argcount, ob.co_nlocals, ob.co_stacksize,
                 ob.co_flags, ob.co_code,
                 # co_consts must be a tuple; nested code objects live here.
                 tuple(map(remap, ob.co_consts)), ob.co_names,
                 ob.co_varnames, filename,
                 ob.co_name, ob.co_firstlineno, ob.co_lnotab,
                 ob.co_freevars, ob.co_cellvars)
         return remap(code_ob)

Granted, this takes a bit more memory than an in-place modification, 
but it's immediately usable and at least works wherever new.code is available.

(I've not tested the above, so it may not work.  I seem to recall the 
last time I wrote something like this there was something tricky 
about handling co_freevars and co_cellvars; I think you may need to 
omit them if empty, or convert them to None, or from None to an empty 
tuple or some such rigamarole.   And a 3.x version is left as an 
exercise for the reader.  ;-) )
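
One possible 3.x sketch of the same idea. It assumes Python 3.8 or later, where code objects provide a replace() method, so it is not something available at the time of this thread. The file names are invented.

```python
import types

def with_changed_filename(code_ob, filename):
    """Return a copy of code_ob, and of any nested code objects,
    with co_filename set to filename."""
    new_consts = tuple(
        with_changed_filename(c, filename) if isinstance(c, types.CodeType) else c
        for c in code_ob.co_consts
    )
    # replace() builds a new code object; the original is untouched.
    return code_ob.replace(co_filename=filename, co_consts=new_consts)

source = "def outer():\n    def inner():\n        return 1\n    return inner\n"
original = compile(source, "old_name.py", "exec")
renamed = with_changed_filename(original, "new_name.py")

print(original.co_filename, renamed.co_filename)
```

Like the version above, this copies rather than mutates, so code objects stay immutable from the caller's point of view.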

From SridharR at  Mon Aug 31 21:18:18 2009
From: SridharR at (Sridhar Ratnakumar)
Date: Mon, 31 Aug 2009 12:18:18 -0700
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <>
References: <>
Message-ID: <op.uzjjwsodbrrvlq@double>

On Wed, 26 Aug 2009 15:55:54 -0700, Joe Amenta <amentajo at> wrote:

> -- 3to2 is now registered with PyPI.  Did I do it right?


Please fix the version number to not contain any whitespace characters.  
Also set the `version` argument in setup(..) in your And then  
you may want to use the `upload` command to upload the new tarball to  
PyPI. See for more details.


From brett at  Mon Aug 31 21:20:51 2009
From: brett at (Brett Cannon)
Date: Mon, 31 Aug 2009 12:20:51 -0700
Subject: [Python-Dev]
In-Reply-To: <>
References: <> 
	<> <> 
Message-ID: <>

On Mon, Aug 31, 2009 at 06:36, Nick Coghlan<ncoghlan at> wrote:
> Chris Withers wrote:
>> Nick Coghlan wrote:
>>> The PEPs don't go into the process of how we actually hook the command
>>> line up to the runpy module though - that's something you need to dig
>>> into the main.c code to really understand.
>> Yeah, main.c does quite a lot... ;-)
>> This all spawned from a suggestion by Jim Fulton over on the
>> distutils-sig that it would be nice if there was a python module that
>> did all of the various types of launching found in main.c. His use case
>> is so that buildout scripts can easily use the same functionality that
>> the interpreter startup uses.
>> I didn't spot any, but does anyone know of code in that mix that
>> couldn't be moved to a pure python module like runpy?
>> If not, how would people feel about the various types of launching all
>> moving to runpy rather than just the -m stuff being there?
> I haven't timed it, but I believe runpy is a fair bit slower than the
> native C functions in main. (That first part of the comment means I
> could easily be wrong though - it's definitely possible that overall
> interpreter startup time will dwarf any difference between the two
> launch mechanisms).

That's quite possible. If you benchmark it you might be able to convince people.

> That said, while actually ditching the C code might cause an argument,
> expanding runpy with Python equivalents of the C level functionality
> (i.e. run script by name, run directory/zipfile by name, '-c' switch,
> and other odds and ends that I'm probably forgetting right now, with all
> associated modifications to sys.argv and the __main__ module attributes)
> should be far less controversial.

It also has the perk of letting alternative VMs not have to implement
all of that stuff themselves, potentially helping to unify even the
command-line interfaces for all the VMs.


From brett at  Mon Aug 31 21:25:50 2009
From: brett at (Brett Cannon)
Date: Mon, 31 Aug 2009 12:25:50 -0700
Subject: [Python-Dev] 3to2 0.1 alpha 1 released
In-Reply-To: <op.uzjjwsodbrrvlq@double>
References: <> 
Message-ID: <>

On Mon, Aug 31, 2009 at 12:18, Sridhar
Ratnakumar<SridharR at> wrote:
> On Wed, 26 Aug 2009 15:55:54 -0700, Joe Amenta <amentajo at> wrote:
>> -- 3to2 is now registered with PyPI.  Did I do it right?
> Please fix the version number to not contain any whitespace characters. Also
> set the `version` argument in setup(..) in your And then you may
> want to use the `upload` command to upload the new tarball to PyPI. See
> for more details.

See PEP 386 ( for what the
current thinking on version numbers is.


From solipsis at  Mon Aug 31 21:27:00 2009
From: solipsis at (Antoine Pitrou)
Date: Mon, 31 Aug 2009 19:27:00 +0000 (UTC)
Subject: [Python-Dev]
References: <>
Message-ID: <>

Brett Cannon <brett <at>> writes:
> I will plan to take this approach then;
> will track all of this. Since this is
> a 3.2 thing I am not going to rush to implement this.

I still don't understand what the point is of this complicated approach (adding
an argument to marshal.load()) compared to the simple and obvious approach
(making co_filename mutable).
Besides, the latter would let you code the recursive renaming algorithm in
Python, which is the whole point of importlib (rewriting most code in Python),
isn't it?



From brett at  Mon Aug 31 21:59:46 2009
From: brett at (Brett Cannon)
Date: Mon, 31 Aug 2009 12:59:46 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Mon, Aug 31, 2009 at 12:27, Antoine Pitrou<solipsis at> wrote:
> Brett Cannon <brett <at>> writes:
>> I will plan to take this approach then;
>> will track all of this. Since this is
>> a 3.2 thing I am not going to rush to implement this.
> I still don't understand what the point is of this complicated approach (adding
> an argument to marshal.load()) compared to the simple and obvious approach
> (making co_filename mutable).

If we add the argument to marshal.load* we can eventually drop the
file location string from marshal data entirely by requiring people to
specify the filename to use when the code object is created. Making
co_filename mutable simply doesn't allow for this case unless we
decide a default value should be used instead.

> Besides, the latter would let you code the recursive renaming algorithm in
> Python, which is the whole point of importlib (rewriting most code in Python),
> isn't it?

Sure, but I am not about to re-implement marshal in pure Python just
because importlib uses it.


From brett at  Mon Aug 31 22:39:04 2009
From: brett at (Brett Cannon)
Date: Mon, 31 Aug 2009 13:39:04 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Mon, Aug 31, 2009 at 12:59, Brett Cannon<brett at> wrote:
> On Mon, Aug 31, 2009 at 12:27, Antoine Pitrou<solipsis at> wrote:
>> Brett Cannon <brett <at>> writes:
>>> I will plan to take this approach then;
>>> will track all of this. Since this is
>>> a 3.2 thing I am not going to rush to implement this.
>> I still don't understand what the point is of this complicated approach (adding
>> an argument to marshal.load()) compared to the simple and obvious approach
>> (making co_filename mutable).
> If we add the argument to marshal.load* we can eventually drop the
> file location string from marshal data entirely by requiring people to
> specify the filename to use when the code object is created. Making
> co_filename mutable simply doesn't allow for this case unless we
> decide a default value should be used instead.

I should also mention that I am +0 on the marshal.load* change. I
could be convinced to try to pursue a mutable co_filename direction,
but considering the BDFL likes the marshal.load* approach and it opens
up the possibility of compacting the marshal format I am leaning
towards sticking with this initial direction.


From solipsis at  Mon Aug 31 22:47:34 2009
From: solipsis at (Antoine Pitrou)
Date: Mon, 31 Aug 2009 20:47:34 +0000 (UTC)
Subject: [Python-Dev]
References: <>
Message-ID: <>

Brett Cannon <brett <at>> writes:
> I should also mention that I am +0 on the marshal.load* change. I
> could be convinced to try to pursue a mutable co_filename direction,
> but considering the BDFL likes the marshal.load* approach and it opens
> up the possibility of compacting the marshal format I am leaning
> towards sticking with this initial direction.

I am really not opinionated on this one. I was just pointing out that choosing a
non-obvious solution generally requires good reasons to do so. The marshal
format compaction sounds like premature optimization, since nobody seems to have
formulated such a request.



From greg at  Mon Aug 31 23:12:14 2009
From: greg at (Gregory P. Smith)
Date: Mon, 31 Aug 2009 14:12:14 -0700
Subject: [Python-Dev] default of returning None hurts performance?
Message-ID: <>

food for thought as noticed by a coworker who has been profiling some hot
code to optimize a library...

If a function does not have a return statement we return None.  Ironically
this makes the foo2 function below faster than the bar2 function at least as
measured using bytecode size:

Python 2.6.2 (r262:71600, Jul 24 2009, 17:29:21)
[GCC 4.2.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> def foo(x):
...   y = x()
...   return y
>>> def foo2(x):
...   return x()
>>> def bar(x):
...   y = x()
>>> def bar2(x):
...   x()
>>> dis.dis(foo)
  2           0 LOAD_FAST                0 (x)
              3 CALL_FUNCTION            0
              6 STORE_FAST               1 (y)

  3           9 LOAD_FAST                1 (y)
             12 RETURN_VALUE
>>> dis.dis(foo2)
  2           0 LOAD_FAST                0 (x)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE
>>> dis.dis(bar)
  2           0 LOAD_FAST                0 (x)
              3 CALL_FUNCTION            0
              6 STORE_FAST               1 (y)
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE
>>> dis.dis(bar2)
  2           0 LOAD_FAST                0 (x)
              3 CALL_FUNCTION            0
              6 POP_TOP
              7 LOAD_CONST               0 (None)
             10 RETURN_VALUE

From solipsis at  Mon Aug 31 23:20:34 2009
From: solipsis at (Antoine Pitrou)
Date: Mon, 31 Aug 2009 21:20:34 +0000 (UTC)
Subject: [Python-Dev] default of returning None hurts performance?
References: <>
Message-ID: <>

Gregory P. Smith <greg <at>> writes:
> food for thought as noticed by a coworker who has been profiling some hot code
> to optimize a library... If a function does not have a return statement we return
> None.  Ironically this makes the foo2 function below faster than the bar2
> function at least as measured using bytecode size

I would be surprised if this "bytecode size" difference made a significant
difference in runtimes, given that function call cost should dwarf the cumulated
cost of POP_TOP and LOAD_CONST (two of the simplest opcodes you could find).

Did your coworker run any timings instead of basing his assumptions on bytecode
size?
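
A rough sketch of how such a timing could be taken with timeit. The callee function and the repeat counts are arbitrary choices for the example, and the numbers will vary by machine and Python version.

```python
import timeit

def callee():
    pass

def foo2(x):
    return x()

def bar2(x):
    x()

# Take the best of several runs to reduce noise from other processes.
t_foo2 = min(timeit.repeat(lambda: foo2(callee), number=100000, repeat=5))
t_bar2 = min(timeit.repeat(lambda: bar2(callee), number=100000, repeat=5))

print("foo2: %.4fs  bar2: %.4fs" % (t_foo2, t_bar2))
```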



From peter at  Mon Aug 31 23:43:33 2009
From: peter at (Peter Moody)
Date: Mon, 31 Aug 2009 14:43:33 -0700
Subject: [Python-Dev] PEP 3144: IP Address Manipulation Library for the
	Python Standard Library
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Aug 28, 2009 at 2:31 AM, Nick Coghlan<ncoghlan at> wrote:
> Peter Moody wrote:
>> If there are any more suggestions on the PEP or the code, please let me know.
> I noticed the new paragraphs on the IPv4 vs IPv6 types not being
> comparable - is there a canonical ordering for mixed address lists
> defined anywhere (e.g. an RFC)?
> If there is, then it should be possible to implement that on BaseIP and
> BaseNet so that comparisons work as canonically defined. If there isn't,
> then that should be mentioned in the PEP as the reason why the PEP
> deliberately isn't trying to invent a convention.

updated the pep with more information about this.
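
For readers who want to see what "not comparable" means in practice, here is a
small sketch using the ipaddress module that later entered the standard
library as a result of this PEP. The sort key (mixed_key) is purely an
assumption for illustration; it is not a canonical ordering from any RFC and
not something the PEP defines.

```python
import ipaddress

v4 = ipaddress.ip_address("192.0.2.1")
v6 = ipaddress.ip_address("2001:db8::1")

# Ordering comparisons across IP versions are rejected with TypeError.
try:
    v4 < v6
    comparable = True
except TypeError:
    comparable = False


def mixed_key(addr):
    """Hypothetical key: group by IP version, then by numeric value."""
    return (addr.version, int(addr))


mixed = [
    ipaddress.ip_address(text)
    for text in ("::1", "192.0.2.1", "10.0.0.1", "2001:db8::1")
]
ordered = sorted(mixed, key=mixed_key)
print(comparable, [str(addr) for addr in ordered])
```

Callers who need a mixed list sorted have to pick a convention like this
themselves, which is the consequence of the library not inventing one.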

working through changes/issues brought up by David and Martin.



From pje at  Mon Aug 31 23:52:08 2009
From: pje at (P.J. Eby)
Date: Mon, 31 Aug 2009 17:52:08 -0400
Subject: [Python-Dev] how important is setting co_filename for a module
 being imported to what __file__ is set to?
In-Reply-To: <>
References: <>
Message-ID: <>

At 01:39 PM 8/31/2009 -0700, Brett Cannon wrote:
>On Mon, Aug 31, 2009 at 12:59, Brett Cannon<brett at> wrote:
> > On Mon, Aug 31, 2009 at 12:27, Antoine Pitrou<solipsis at> wrote:
> >> Brett Cannon <brett <at>> writes:
> >>>
> >>> I will plan to take this approach then;
> >>> will track all of this. Since this is
> >>> a 3.2 thing I am not going to rush to implement this.
> >>
> >> I still don't understand what the point is of this complicated approach
> >> (adding an argument to marshal.load()) compared to the simple and obvious
> >> approach (making co_filename mutable).
> >
> > If we add the argument to marshal.load* we can eventually drop the
> > file location string from marshal data entirely by requiring people to
> > specify the filename to use when the code object is created. Making
> > co_filename mutable simply doesn't allow for this case unless we
> > decide a default value should be used instead.
> >
>I should also mention that I am +0 on the marshal.load* change. I
>could be convinced to try to pursue a mutable co_filename direction,
>but considering the BDFL likes the marshal.load* approach and it opens
>up the possibility of compacting the marshal format I am leaning
>towards sticking with this initial direction.

Why not just try the code I posted earlier, that doesn't need a 
mutable attribute OR an API change?

From brett at  Mon Aug 31 23:57:49 2009
From: brett at (Brett Cannon)
Date: Mon, 31 Aug 2009 14:57:49 -0700
Subject: [Python-Dev] how important is setting co_filename for a module
	being imported to what __file__ is set to?
In-Reply-To: <>
References: <> 
Message-ID: <>

On Mon, Aug 31, 2009 at 14:52, P.J. Eby<pje at> wrote:
> At 01:39 PM 8/31/2009 -0700, Brett Cannon wrote:
>> On Mon, Aug 31, 2009 at 12:59, Brett Cannon<brett at> wrote:
>> > On Mon, Aug 31, 2009 at 12:27, Antoine Pitrou<solipsis at>
>> > wrote:
>> >> Brett Cannon <brett <at>> writes:
>> >>>
>> >>> I will plan to take this approach then;
>> >>> will track all of this. Since this is
>> >>> a 3.2 thing I am not going to rush to implement this.
>> >>
>> >> I still don't understand what the point is of this complicated approach
>> >> (adding an argument to marshal.load()) compared to the simple and obvious
>> >> approach (making co_filename mutable).
>> >
>> > If we add the argument to marshal.load* we can eventually drop the
>> > file location string from marshal data entirely by requiring people to
>> > specify the filename to use when the code object is created. Making
>> > co_filename mutable simply doesn't allow for this case unless we
>> > decide a default value should be used instead.
>> >
>> I should also mention that I am +0 on the marshal.load* change. I
>> could be convinced to try to pursue a mutable co_filename direction,
>> but considering the BDFL likes the marshal.load* approach and it opens
>> up the possibility of compacting the marshal format I am leaning
>> towards sticking with this initial direction.
> Why not just try the code I posted earlier, that doesn't need a mutable
> attribute OR an API change?

Ignoring that 'new' is not in Python 3.x (luckily 'types' is), I want
a proper solution that doesn't require reconstructing every code
object that I happen to import.
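
To make the cost of "reconstructing every code object" concrete, here is a
minimal sketch of what such a rebuild involves. This is not the code P.J. Eby
posted (it is not reproduced in this message). It assumes Python 3.8 or later,
where code objects have a replace() method that postdates this thread, and
with_filename and the paths are hypothetical names chosen for the example.

```python
import types


def with_filename(code, filename):
    """Return a copy of `code`, and of any nested code objects, that
    reports `filename` as co_filename. The original is left untouched."""
    new_consts = tuple(
        with_filename(const, filename)
        if isinstance(const, types.CodeType)
        else const
        for const in code.co_consts
    )
    return code.replace(co_filename=filename, co_consts=new_consts)


source = "def f():\n    return 1\n"
original = compile(source, "/build/tmp/mod.py", "exec")
fixed = with_filename(original, "/installed/mod.py")

nested = [c for c in fixed.co_consts if isinstance(c, types.CodeType)]
print(fixed.co_filename, [c.co_filename for c in nested])
```

Every function, class body, and comprehension in the module has its own code
object, so the walk has to copy the whole tree on each import.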


From fdrake at  Mon Aug 31 23:04:25 2009
From: fdrake at (Fred Drake)
Date: Mon, 31 Aug 2009 17:04:25 -0400
Subject: [Python-Dev] how important is setting co_filename for a
	module	being imported to what __file__ is set to?
In-Reply-To: <>
References: <>
Message-ID: <>

On Aug 31, 2009, at 4:47 PM, Antoine Pitrou wrote:
> I am really not opinionated on this one. I was just pointing out  
> that choosing a
> non-obvious solution generally requires good reasons to do so. The  
> marshal
> format compaction sounds like premature optimization, since nobody  
> seems to have
> formulated such a request.

Every time I've been bitten by the wrong co_filename values (usually
from tracebacks), changing the way marshal creates code objects to use
a value passed in has been the thing that made the most sense to me.

The feature request that's involved here, getting correct co_filename
values, can be implemented in different ways, sure.  This particular
change produces the least impact because it *doesn't* change
the mutability of code objects.
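
As a small illustration of the underlying behaviour (a sketch, with made-up
paths): the filename given at compile time is stored in the marshalled data
and comes back unchanged on load, and the attribute cannot simply be assigned
afterwards.

```python
import marshal

# The path is recorded in the code object when the source is compiled.
code = compile("x = 1\n", "/build/tmp/mod.py", "exec")

# A marshal round trip (as happens with .pyc files) preserves that path,
# even if the file has since been moved somewhere else.
restored = marshal.loads(marshal.dumps(code))
print(restored.co_filename)

# co_filename is read-only, so it cannot be corrected in place.
try:
    restored.co_filename = "/installed/mod.py"
    assignment_allowed = True
except (AttributeError, TypeError):
    assignment_allowed = False
print(assignment_allowed)
```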

I for one appreciate that, mostly because I'm wary that making
code objects mutable in this way could have unexpected side effects in
some library.


Fred Drake   <fdrake at>