From alexandre at peadrop.com  Sun Sep  2 19:14:45 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Sun, 2 Sep 2007 13:14:45 -0400
Subject: [Python-Dev] Avoiding cascading test failures
In-Reply-To: <43aa6ff70708282015q56699bebk4bc52a38749d121e@mail.gmail.com>
References: <acd65fa20708221644y2ab376bemd4d550ff43fdd91@mail.gmail.com>
	<43aa6ff70708282015q56699bebk4bc52a38749d121e@mail.gmail.com>
Message-ID: <acd65fa20709021014r5785463cq72e2e08c985b1085@mail.gmail.com>

On 8/28/07, Collin Winter <collinw at gmail.com> wrote:
> On 8/22/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> > When I was fixing tests failing in the py3k branch, I found the number
> > of duplicate failures annoying. Often, a single bug in an important
> > method or function caused a large number of testcases to fail. So, I
> > thought of a simple mechanism for avoiding such cascading failures.
> >
> > My solution is to add a notion of dependency to testcases. A typical
> > usage would look like this:
> >
> >     @depends('test_getvalue')
> >     def test_writelines(self):
> >         ...
> >         memio.writelines([buf] * 100)
> >         self.assertEqual(memio.getvalue(), buf * 100)
> >         ...
>
> This definitely seems like a neat idea. Some thoughts:
>
> * How do you deal with dependencies that cross test modules? Say
> test A depends on test B, how do we know whether it's worthwhile
> to run A if B hasn't been run yet? It looks like you run the test
> anyway (I haven't studied the code closely), but that doesn't
> seem ideal.

I am not sure what you mean by "test modules". Do you mean module in
the Python sense, or like a test-case class?

> * This might be implemented in the wrong place. For example, the [x
> for x in dir(self) if x.startswith('test')] you do is most certainly
> better-placed in a custom TestLoader implementation.

That certainly is a good suggestion. I am not sure yet how I will
implement my idea in the unittest module. However, I am pretty sure that
it will be quite different from my prototype.

> But despite that, I think it's a cool idea and worth pursuing. Could
> you set up a branch (probably of py3k) so we can see how this plays
> out in the large?

Sure. I need to finish merging pickle and cPickle for Py3k before
tackling this project, though.
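
A minimal sketch of how such a dependency decorator could work, written
against today's unittest (it relies on TestCase.skipTest, which did not
exist in 2007). This is not the prototype discussed in this thread; the
module-level _failed set, the convention of marking prerequisites with a
bare @depends(), and the run_example() helper are all assumptions made
for illustration.

```python
import functools
import unittest

# (class name, test name) pairs that have failed so far in this run.
_failed = set()


def depends(*names):
    """Skip the decorated test if any of the named tests already failed.

    A test that others depend on must itself be decorated (a bare
    @depends() is enough) so that its failure gets recorded.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self):
            cls = type(self).__name__
            broken = [n for n in names if (cls, n) in _failed]
            if broken:
                self.skipTest("prerequisite failed: " + ", ".join(broken))
            try:
                func(self)
            except unittest.SkipTest:
                raise  # a skip is not a failure
            except Exception:
                _failed.add((cls, func.__name__))
                raise
        return wrapper
    return decorator


def run_example():
    """Run a two-test case where the prerequisite is deliberately broken."""
    class ExampleTest(unittest.TestCase):
        @depends()
        def test_getvalue(self):
            self.assertEqual(1, 2)  # deliberately failing prerequisite

        @depends('test_getvalue')
        def test_writelines(self):
            self.assertEqual(1, 1)  # not reached: skipped instead

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The sketch only works because the default loader runs test methods in
alphabetical order, so test_getvalue happens to run before
test_writelines. A real implementation would have to order the tests by
their dependencies.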

-- Alexandre

From ryan.freckleton at gmail.com  Mon Sep  3 06:34:26 2007
From: ryan.freckleton at gmail.com (Ryan Freckleton)
Date: Sun, 2 Sep 2007 22:34:26 -0600
Subject: [Python-Dev] Product function patch [issue 1093]
Message-ID: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>

Hello,

At one time Guido mentioned adding a built-in product() function to
cover some of the remaining use cases of the built-in reduce(). I
don't know if this function is still wanted or needed, but I've
created an implementation with tests and documentation at
http://bugs.python.org/issue1093 .

If it is still wanted, could someone review it and give me feedback on it?

Thanks,
-- 
=====
--Ryan E. Freckleton

From martin at v.loewis.de  Mon Sep  3 07:27:08 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 03 Sep 2007 07:27:08 +0200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
Message-ID: <46DB9B2C.7010202@v.loewis.de>

> At one time Guido mentioned adding a built-in product() function to
> cover some of the remaining use cases of the built-in reduce(). 

What is the use case for product()?

Regards,
Martin

From skip at pobox.com  Mon Sep  3 14:24:30 2007
From: skip at pobox.com (skip at pobox.com)
Date: Mon, 3 Sep 2007 07:24:30 -0500
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DB9B2C.7010202@v.loewis.de>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
	<46DB9B2C.7010202@v.loewis.de>
Message-ID: <18139.64766.602709.442409@montanaro.dyndns.org>


    >> At one time Guido mentioned adding a built-in product() function to
    >> cover some of the remaining use cases of the built-in reduce().

    Martin> What is the use case for product()?

As I recall, there were basically two uses of reduce(), to sum a series or
(less frequently) to take the product of a series.  sum() obviously takes
care of the first use case.  product() would take care of the second.
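
For reference, the reduce() idiom that a product() function would
replace can be sketched as follows. The name product and its start
parameter are chosen here to mirror sum() and are assumptions, not the
signature of the patch in issue 1093.

```python
import functools
import operator


def product(iterable, start=1):
    """Multiply the items of iterable together, beginning with start."""
    return functools.reduce(operator.mul, iterable, start)
```

Since Python 3.8 the standard library offers math.prod() for the same
job.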

Skip


From guido at python.org  Mon Sep  3 16:37:59 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 3 Sep 2007 07:37:59 -0700
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <18139.64766.602709.442409@montanaro.dyndns.org>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
	<46DB9B2C.7010202@v.loewis.de>
	<18139.64766.602709.442409@montanaro.dyndns.org>
Message-ID: <ca471dc20709030737w55c11f61j6d5b84601499ea26@mail.gmail.com>

Actually, if you use Google code search, you'll find that multiplying
the numbers in a list doesn't have much use at all. After summing
numbers, joining strings is by far the most common usage -- which is
much better done with the str.join() method.
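
A small sketch of the two ways to join strings mentioned above; the
function names are made up for illustration. The reduce() version may
copy the accumulated string at every step, whereas str.join() builds
the result in one pass.

```python
import functools
import operator


def join_with_reduce(parts):
    # Repeated "+" creates a new intermediate string at each step.
    return functools.reduce(operator.add, parts, "")


def join_with_method(parts):
    # str.join() sizes and builds the result once.
    return "".join(parts)
```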

(PS. I rejected the issue; product() was proposed and rejected when
sum() was originally proposed and accepted, and I don't see anything
to change my mind.)

On 9/3/07, skip at pobox.com <skip at pobox.com> wrote:
>
>     >> At one time Guido mentioned adding a built-in product() function to
>     >> cover some of the remaining use cases of the built-in reduce().
>
>     Martin> What is the use case for product()?
>
> As I recall, there were basically two uses of reduce(), to sum a series or
> (less frequently) to take the product of a series.  sum() obviously takes
> care of the first use case.  product() would take care of the second.
>
> Skip
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From orsenthil at gmail.com  Mon Sep  3 20:47:48 2007
From: orsenthil at gmail.com (O.R.Senthil Kumaran)
Date: Tue, 4 Sep 2007 00:17:48 +0530
Subject: [Python-Dev] Roundup issue mails "Do not thread!"
Message-ID: <20070903184748.GB5632@gmail.com>

Hi all,
Has anyone observed the missing "email-threads" issue with Roundup bug tracker email? Is there any workaround for it?
I use mutt and find that Roundup issue000xx mails are not being threaded.
It has nothing to do with my settings, I believe.
The issue000xxx emails might not have In-Reply-To: or References: headers.

<MuttFAQ>

Why are some msgs threaded and others not?
You have some msgs which don't have correct

 In-Reply-To:
 References:

headers (or not set at all) and you've turned on

 $strict_threads

What do "->", "-?-" and "*>" mean in thread trees?
When you turn off

 $strict_threads

msgs with similar subjects get grouped together. In
...
</MuttFAQ>

-- 
O.R.Senthil Kumaran
http://uthcode.sarovar.org

From ty.newton at copperchipgames.com  Tue Sep  4 00:20:29 2007
From: ty.newton at copperchipgames.com (Ty Newton)
Date: Tue, 04 Sep 2007 08:20:29 +1000
Subject: [Python-Dev] Porting information
Message-ID: <46DC88AD.5060001@copperchipgames.com>

Hi,
I'm looking into porting CPython to native C# (not like IronPython) so 
that it can be used in game software on the XBox360: integrated with the 
indie development tool XNA Game Studio Express.

I am looking for some guidance on how to approach this in the most 
effective way.

I've started by looking at the parser portion of the code.  However I am 
not certain this is the best place to start.  Since there are so many 
ports I assume there is a well trodden path to completing this kind of 
task.  Any suggestions would be greatly appreciated.

I would prefer to break the task into portions that can be verified 
(tested for correctness) independently or as a stack (one on top of the 
next).  That way I can catch errors early and have more confidence in 
what I am creating.

When I looked through the test suites they all seem to be written in 
Python.  Is there a test suite for the core of CPython i.e. before the C 
code can interpret Python code?


Thanks,
Ty

From greg.ewing at canterbury.ac.nz  Tue Sep  4 00:59:30 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 04 Sep 2007 10:59:30 +1200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
Message-ID: <46DC91D2.7060407@canterbury.ac.nz>

Ryan Freckleton wrote:
> At one time Guido mentioned adding a built-in product() function to
> cover some of the remaining use cases of the built-in reduce().

Speaking of such things, I was thinking the other day
that it might be useful to have somewhere in the stdlib
a full set of functions for doing elementwise operations
and reductions on the built-in array type.

This would make it possible for one to do efficient
bulk arithmetic when the need arises from time to time
without having to pull in a heavyweight dependency
such as Numeric or numpy.

--
Greg

From martin at v.loewis.de  Tue Sep  4 04:21:25 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 04 Sep 2007 04:21:25 +0200
Subject: [Python-Dev] Roundup issue mails "Do not thread!"
In-Reply-To: <20070903184748.GB5632@gmail.com>
References: <20070903184748.GB5632@gmail.com>
Message-ID: <46DCC125.8040109@v.loewis.de>

> The issue000xxx emails might not have In-Reply-To: or References: headers.

If messages are entered through the web interface, they won't have these
headers.

Regards,
Martin

From orsenthil at gmail.com  Tue Sep  4 04:49:43 2007
From: orsenthil at gmail.com (O.R.Senthil Kumaran)
Date: Tue, 4 Sep 2007 08:19:43 +0530
Subject: [Python-Dev] Roundup issue mails "Do not thread!"
In-Reply-To: <46DCC125.8040109@v.loewis.de>
References: <20070903184748.GB5632@gmail.com> <46DCC125.8040109@v.loewis.de>
Message-ID: <20070904024943.GA3605@gmail.com>

* "Martin v. Löwis" <martin at v.loewis.de> [2007-09-04 04:21:25]:

> > The issue000xxx emails might not have In-Reply-To: or References: headers.
> 
> If messages are entered through the web interface, they won't have these
> headers.

Then I should file a bug/feature request for Roundup. How are others keeping track? Whenever I open an issue after analyzing the email message, I find that it has already been discussed and its state has changed; I had missed the further emails on the same issue because they were not threaded.

Thanks,

-- 
O.R.Senthil Kumaran
http://uthcode.sarovar.org

From guido at python.org  Tue Sep  4 04:57:49 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 3 Sep 2007 19:57:49 -0700
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DC91D2.7060407@canterbury.ac.nz>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
	<46DC91D2.7060407@canterbury.ac.nz>
Message-ID: <ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>

On 9/3/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Speaking of such things, I was thinking the other day
> that it might be useful to have somewhere in the stdlib
> a full set of functions for doing elementwise operations
> and reductions on the built-in array type.
>
> This would make it possible for one to do efficient
> bulk arithmetic when the need arises from time to time
> without having to pull in a heavyweight dependency
> such as Numeric or numpy.

But what's the point, given that numpy already exists? Wouldn't you
just be redoing the work that numpy has already done?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Tue Sep  4 05:13:59 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 04 Sep 2007 05:13:59 +0200
Subject: [Python-Dev] Roundup issue mails "Do not thread!"
In-Reply-To: <20070904024943.GA3605@gmail.com>
References: <20070903184748.GB5632@gmail.com> <46DCC125.8040109@v.loewis.de>
	<20070904024943.GA3605@gmail.com>
Message-ID: <46DCCD77.7090300@v.loewis.de>

> Then I should file a bug/feature request for Roundup.

Please consider what you are asking for. How precisely should
roundup set the In-reply-to header? It won't know what message
this is a reply to, or whether it is a reply at all.

> How are others keeping track? Whenever I open an issue after analyzing the email
> message, I find that it has already been discussed and its state has changed; I
> had missed the further emails on the same issue because they were not threaded.

My email tool has better threading than yours, I guess. IceDove
(Thunderbird) will thread the messages by subject also. The non-threaded
ones get displayed on the second level, appearing in reply to the
original message (or, rather, the youngest message with the same
subject - just as if the message mentioned in In-Reply-To has already
been deleted).
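
The fallback described above can be sketched roughly as follows. This
is not how Thunderbird or mutt implement threading; find_parent and
normalized_subject are hypothetical helpers, and the prefix-stripping
rule is a simplification.

```python
import re
from email import message_from_string

# Leading "Re:" / "Fwd:" markers, possibly repeated.
_PREFIX = re.compile(r'^\s*((re|fwd?)\s*:\s*)+', re.IGNORECASE)


def normalized_subject(msg):
    """Return the subject without reply/forward prefixes, lower-cased."""
    return _PREFIX.sub('', msg.get('Subject', '')).strip().lower()


def find_parent(msg, earlier):
    """Return the Message-ID of msg's parent among earlier, or None.

    earlier is a list of already-seen messages, oldest first.
    """
    known = {m['Message-ID'] for m in earlier}
    # Prefer explicit threading headers: In-Reply-To first, then the
    # References list from its last (most recent) entry backwards.
    candidates = (msg.get('In-Reply-To', '').split() +
                  msg.get('References', '').split()[::-1])
    for ref in candidates:
        if ref in known:
            return ref
    # No usable header: fall back to the youngest message with the
    # same subject.
    subject = normalized_subject(msg)
    for m in reversed(earlier):
        if normalized_subject(m) == subject:
            return m['Message-ID']
    return None
```

A message entered through the tracker's web interface carries neither
header, so only the subject fallback can attach it to a thread.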

Regards,
Martin


From martin at v.loewis.de  Tue Sep  4 05:33:10 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 04 Sep 2007 05:33:10 +0200
Subject: [Python-Dev] Porting information
In-Reply-To: <46DC88AD.5060001@copperchipgames.com>
References: <46DC88AD.5060001@copperchipgames.com>
Message-ID: <46DCD1F6.2040906@v.loewis.de>

> I've started by looking at the parser portion of the code.  However I am 
> not certain this is the best place to start.  Since there are so many 
> ports I assume there is a well trodden path to completing this kind of 
> task.

I believe this assumption is wrong. There are not many ports, only a
handful (or less - Jython, IronPython, PyPy). While Jython and
IronPython may have similar implementation strategies, I would expect
that PyPy took an entirely different approach.

In any case, there certainly is a step that you apparently failed
to perform as the very first step: set some explicit goals. What
kind of compatibility do you want to achieve in your port, what
other goals would you like to follow?

IOW, why is IronPython not what you want (it *is* a port of CPython
to C#, in a sense), and why is the C# support in PyPy not good enough
for you?

> I would prefer to break the task into portions that can be verified 
> (tested for correctness) independently or as a stack (one on top of the 
> next).  That way I can catch errors early and have more confidence in 
> what I am creating.

As I don't know what you want to achieve, it is difficult to tell
you what steps to take.

I assume your implementation would be similar to CPython in that
it uses the same byte code format. So one path would be to ignore
the compiler altogether, and assume that the byte code format is given,
i.e. start by porting ceval.c.
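
To get a feel for what a port of ceval.c has to execute, the dis module
in the standard library can list the byte code of a function. The
sketch below uses the modern dis.get_instructions() API; the add and
opnames functions are made up for illustration, and the opcode names
differ between Python versions, so no particular listing is shown.

```python
import dis


def add(a, b):
    return a + b


def opnames(func):
    """Return the names of the opcodes the interpreter loop runs for func."""
    return [instr.opname for instr in dis.get_instructions(func)]
```

Calling dis.dis(add) prints the same instructions as a human-readable
listing.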

I'm not sure whether you also want to provide the same low-level
API (i.e. whether you want to provide "Embedding and Extending");
it surely can't be the *same* API, since yours will be C#, whereas
CPython's is, well, C. If you implement ceval.c, you will find
quickly that you need much of the Objects folder, so implementing
the 10 or so most important objects would be the natural starting
point (type, int, string, tuple, dict, frame, code, class, method -
assuming you would target Python 1.5 first, i.e. no bool, cell,
descr, gen, iter, weakref, unicode, object).

> When I looked through the test suites they all seem to be written in 
> Python.  Is there a test suite for the core of CPython i.e. before the C 
> code can interpret Python code?

Yes and no. The core Python is tested through compilation - if it
compiles without warnings on the relevant compilers, it is considered
good enough to run the Python test suite. For selected features of
the interpreter, there are specific tests, in particular test_capi.

The core of CPython (compiler, objects, builtins) is then tested
through Python code.

Regards,
Martin

From greg.ewing at canterbury.ac.nz  Tue Sep  4 11:18:43 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 04 Sep 2007 21:18:43 +1200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
	<46DC91D2.7060407@canterbury.ac.nz>
	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>
Message-ID: <46DD22F3.6070701@canterbury.ac.nz>

Guido van Rossum wrote:
> But what's the point, given that numpy already exists? Wouldn't you
> just be redoing the work that numpy has already done?

Sometimes I just want to do something simple like
adding two vectors together, and it seems unreasonable
to add the whole of numpy as a dependency just to
get that.

Currently Python has built-in ways of doing arithmetic,
and built-in ways of storing arrays of numbers efficiently,
but no built-in way of doing arithmetic on arrays of
numbers efficiently.
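
To illustrate the gap: with the stdlib alone, adding two arrays looks
roughly like the sketch below. The elementwise helper is hypothetical.
It keeps the storage compact, but the arithmetic still runs as a
Python-level loop, one item at a time.

```python
import operator
from array import array


def elementwise(op, a, b):
    """Apply the binary function op to corresponding items of two arrays."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    # map() calls op once per pair of items; nothing is vectorised.
    return array(a.typecode, map(op, a, b))
```

For example, elementwise(operator.add, x, y) returns a new array with
the typecode of x.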

I'd like to see some of the core machinery of numpy moved
into the Python stdlib, and numpy refactored so that it
builds on that. Then there wouldn't be duplication.

--
Greg

From steve at shrogers.com  Tue Sep  4 13:57:30 2007
From: steve at shrogers.com (Steven H. Rogers)
Date: Tue, 04 Sep 2007 05:57:30 -0600
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DD22F3.6070701@canterbury.ac.nz>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>	<46DC91D2.7060407@canterbury.ac.nz>	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>
	<46DD22F3.6070701@canterbury.ac.nz>
Message-ID: <46DD482A.1000801@shrogers.com>

Greg Ewing wrote:
> Guido van Rossum wrote:
>   
>> But what's the point, given that numpy already exists? Wouldn't you
>> just be redoing the work that numpy has already done?
>>     
>
> Sometimes I just want to do something simple like
> adding two vectors together, and it seems unreasonable
> to add the whole of numpy as a dependency just to
> get that. ...
>
> I'd like to see some of the core machinery of numpy moved
> into the Python stdlib, and numpy refactored so that it
> builds on that. Then there wouldn't be duplication.
>   
Concur.  Array processing would be a very practical addition to the 
standard library.  It's used extensively in engineering, finance, and 
the sciences.  It looks like they may find room in the OLPC XO for key 
subsets of NumPy and Matplotlib.  They want it both as a teaching 
resource and to optimize their software suite as a whole.  If they're 
successful, we'll have a lot of young pythoneers expecting this 
functionality.

# Steve

From martin at v.loewis.de  Tue Sep  4 14:54:49 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 04 Sep 2007 14:54:49 +0200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DD22F3.6070701@canterbury.ac.nz>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>	<46DC91D2.7060407@canterbury.ac.nz>	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>
	<46DD22F3.6070701@canterbury.ac.nz>
Message-ID: <46DD5599.9010003@v.loewis.de>

> I'd like to see some of the core machinery of numpy moved
> into the Python stdlib, and numpy refactored so that it
> builds on that. Then there wouldn't be duplication.

I think this requires a PEP, and explicit support from the
NumPy people.

Regards,
Martin

From guido at python.org  Tue Sep  4 16:38:58 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Sep 2007 07:38:58 -0700
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DD482A.1000801@shrogers.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
	<46DC91D2.7060407@canterbury.ac.nz>
	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>
	<46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com>
Message-ID: <ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>

On 9/4/07, Steven H. Rogers <steve at shrogers.com> wrote:
> Concur.  Array processing would be a very practical addition to the
> standard library.  It's used extensively in engineering, finance, and
> the sciences.  It looks like they may find room in the OLPC XO for key
> subsets of NumPy and Matplotlib.  They want it both as a teaching
> resource and to optimize their software suite as a whole.  If they're
> successful, we'll have a lot of young pythoneers expecting this
> functionality.

I still don't see why the standard library needs to be weighed down
with a competitor to numpy. Including a subset of numpy was considered
in the past, but it's hard to decide on the right subset. In the end
it was decided that numpy is too big to become a standard library.
Given all the gyrations it has gone through I definitely believe this
was the right decision.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at gmail.com  Tue Sep  4 16:52:29 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 05 Sep 2007 00:52:29 +1000
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DD5599.9010003@v.loewis.de>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>	<46DC91D2.7060407@canterbury.ac.nz>	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>	<46DD22F3.6070701@canterbury.ac.nz>
	<46DD5599.9010003@v.loewis.de>
Message-ID: <46DD712D.9080103@gmail.com>

Martin v. Löwis wrote:
>> I'd like to see some of the core machinery of numpy moved into the
>> Python stdlib, and numpy refactored so that it builds on that. Then
>> there wouldn't be duplication.
> 
> I think this requires a PEP, and explicit support from the NumPy
> people.

Travis has actually been working on this off-and-on for the last couple 
of years, including mentoring an SoC project last year. I believe PEP 
3118 (the revised buffer protocol) was one of the major outcomes - 
rather than having yet-another-array-type to copy data to and from in 
order to use different libraries, the focus moved to permitting better 
interoperability amongst the array types that already exist.

Once we support better interoperability at the data storage level, it 
will actually become *more* useful to have a simple multi-dimensional 
array type in the standard library as you could easily pass those 
objects to functions from more powerful array manipulation libraries as 
your needs become more complicated.
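
As it later turned out in Python 3, the revised buffer protocol is
visible from Python code through memoryview. The sketch below assumes a
one-dimensional writable buffer of numbers; double_in_place is a
made-up helper showing that a function can work on the storage of an
array.array without copying it.

```python
from array import array


def double_in_place(buf):
    """Double every item of a writable 1-D numeric buffer, without copying."""
    with memoryview(buf) as view:
        for i in range(len(view)):
            view[i] = view[i] * 2


data = array('i', [1, 2, 3])
double_in_place(data)  # modifies data itself through the shared buffer
```

The same function accepts any other object that exports a compatible
buffer, such as a bytearray.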

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From janssen at parc.com  Tue Sep  4 20:33:50 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 11:33:50 PDT
Subject: [Python-Dev] frozenset C API?
Message-ID: <07Sep4.113351pdt."57996"@synergy1.parc.xerox.com>

I'm looking at building a "frozenset" instance as a return value from
a C function, and the C API seems ridiculously clumsy.  Maybe I'm
misunderstanding it.  Apparently, I need to create a list object, then
pass that to PyFrozenSet_New(), then decref the list object.

Is that correct?

What I'd like is something more like

     PyFrozenSet_NEW(int) => PySetObject *
     PyFrozenSet_SET_ITEM(s, i, v)

Any idea why these aren't part of the API?

Bill

From guido at python.org  Tue Sep  4 20:37:58 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Sep 2007 11:37:58 -0700
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <-4762611594645938717@unknownmsgid>
References: <-4762611594645938717@unknownmsgid>
Message-ID: <ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>

I guess nobody has tried to create frozenset instances from C code
before. Almost everyone uses set anyway. What are you trying to do?

On 9/4/07, Bill Janssen <janssen at parc.com> wrote:
> I'm looking at building a "frozenset" instance as a return value from
> a C function, and the C API seems ridiculously clumsy.  Maybe I'm
> misunderstanding it.  Apparently, I need to create a list object, then
> pass that to PyFrozenSet_New(), then decref the list object.
>
> Is that correct?
>
> What I'd like is something more like
>
>      PyFrozenSet_NEW(int) => PySetObject *
>      PyFrozenSet_SET_ITEM(s, i, v)
>
> Any idea why these aren't part of the API?
>
> Bill


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Tue Sep  4 21:02:06 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 04 Sep 2007 21:02:06 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep4.113351pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep4.113351pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46DDABAE.9020004@v.loewis.de>

Bill Janssen schrieb:
> I'm looking at building a "frozenset" instance as a return value from
> a C function, and the C API seems ridiculously clumsy.  Maybe I'm
> misunderstanding it.  Apparently, I need to create a list object, then
> pass that to PyFrozenSet_New(), then decref the list object.
> 
> Is that correct?

Almost. It doesn't have to be a list - any iterable object would do.

Regards,
Martin

From ty.newton at copperchipgames.com  Tue Sep  4 21:07:41 2007
From: ty.newton at copperchipgames.com (Ty Newton)
Date: Wed, 05 Sep 2007 05:07:41 +1000
Subject: [Python-Dev] Compiling cpython2.5.1 in VS2005?
Message-ID: <46DDACFD.2080606@copperchipgames.com>

Hi,

I was building Python 2.5.1 in Visual Studio 2005 and noticed some 
problems with the instructions.

Can someone confirm this and update the readme file in the PCbuild8 
directory?  I don't yet have access to the repository.

This is what the readme.txt file says to do:
<snip>
All you need to do is open the workspace "pcbuild.sln" in VisualStudio 
2005, select the platform, select the Debug or Release setting (using 
"Solution Configuration" from the "Standard" toolbar"), and build the 
solution.

The proper order to build subprojects:

1) pythoncore (this builds the main Python DLL and library files,
                python25.{dll, lib} in Release mode)
               NOTE:  in previous releases, this subproject was
               named after the release number, e.g. python20.

2) python (this builds the main Python executable,
            python.exe in Release mode)
</snip>

This is my experience. DEBUG configuration:

When I select 'pythoncore' (right click) from the solution explorer, 
select 'project only', select 'build only pythoncore' I get this error 
report:

Warning	1	warning C4005: 'Yield' : macro redefinition	E:\Program 
Files\Microsoft Visual Studio 8\VC\PlatformSDK\include\winbase.h	57	

Warning	2	warning C4005: 'Yield' : macro redefinition	E:\Program 
Files\Microsoft Visual Studio 8\VC\PlatformSDK\include\winbase.h	57	

Error	3	fatal error C1083: Cannot open source file: '.\getbuildinfo2.c': 
No such file or directory	c1	


It looks like the project dependencies are not kicking in.  I assume 
this is caused by building the project instead of the solution.  So I 
did them manually.

First make_versioninfo project:
I select 'make_versioninfo' (right click) from the solution explorer, 
select 'project only', select 'build only make_versioninfo'.  This succeeds.

Second make_buildinfo project:
I select 'make_buildinfo' (right click) from the solution explorer, 
select 'project only', select 'build only make_buildinfo'.  This succeeds.

Finally I try to make pythoncore again:
I select 'pythoncore' (right click) from the solution explorer, select 
'project only', select 'build only pythoncore'.  This succeeds.

Now I build python and it also succeeds.


One last thing I noticed is if there are spaces in the path of the 
source files the compilation also fails.


Regards,
Ty

From python at rcn.com  Tue Sep  4 21:14:17 2007
From: python at rcn.com (Raymond Hettinger)
Date: Tue,  4 Sep 2007 15:14:17 -0400 (EDT)
Subject: [Python-Dev] frozenset C API?
Message-ID: <20070904151417.AFJ20377@ms10.lnh.mail.rcn.net>

You can create a frozenset from any iterable using PyFrozenSet_New().

If you don't have an iterable and want to build up the frozenset one element at a time, the approach is to create a regular set (or some other mutable container), add to it, then convert it to a frozenset when you're done:

  s = PySet_New(NULL);
  PySet_Add(s, obj1);
  PySet_Add(s, obj2);
  PySet_Add(s, obj3);
  f = PyFrozenSet_New(s);
  Py_DECREF(s);

That approach is similar to what you do with tuples in pure Python.  You either build them from an iterable "t = tuple(iterable)" or you build up a mutable object one element at a time and convert it all at once:

  s = []
  s.append(obj1)
  s.append(obj2)
  t = tuple(s)

The API you propose doesn't work because sets and frozensets are not indexed like tuples and lists.  Accordingly, sets and frozensets have a C API that is more like dictionaries.  Since dictionaries are not indexable, they also cannot have an API like the one you propose:

     PyDict_NEW(int) => PyDictObject *
     PyDict_SET_ITEM(s, index, key, value)

If you find all this really annoying and need to fashion a small frozenset with a few known objects, consider using the abstract API:

    f = PyObject_CallFunction((PyObject *)&PyFrozenSet_Type, "[OOO]", obj1, obj2, obj3);

That will roll the whole process up into one line.

Hope this was helpful,


Raymond


---------------------------------------------------------------
From: Bill Janssen <janssen at parc.com>
Subject: [Python-Dev] frozenset C API?
To: python-dev at python.org

I'm looking at building a "frozenset" instance as a return value from
a C function, and the C API seems ridiculously clumsy.  Maybe I'm
misunderstanding it.  Apparently, I need to create a list object, then
pass that to PyFrozenSet_New(), then decref the list object.

Is that correct?

What I'd like is something more like

     PyFrozenSet_NEW(int) => PySetObject *
     PyFrozenSet_SET_ITEM(s, i, v)

Any idea why these aren't part of the API?

From janssen at parc.com  Tue Sep  4 21:21:37 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 12:21:37 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
Message-ID: <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>

I'm working on issue 1583946.  Nagle pointed out that each DN (the
"subject" and "issuer" fields in a certificate) may have multiple
values for the same attribute name, and I haven't been able to rule
this out yet.  X.509 DNs are sets of X.500 attributes, and X.500
attributes may be either single-valued or multiple-valued.  I haven't
found anything in the X.509 standard that prohibits multiple-valued
attributes (yet -- I'm still looking), so I'm working on an
alternative to using dicts to represent the set of attributes in the
certificate that's returned from ssl.sslsocket.getpeercert().
"frozenset" seems the most appropriate -- it's a non-ordered immutable
set of attributes.  Could use a tuple, but (1) that implies order,
and (2) using set operations on the attribute set would be handy to
test for various things, particularly "issubset" and "issuperset".
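
A sketch of the representation being considered, with made-up attribute
names and values. This is not the format that getpeercert() returns,
and make_dn is a hypothetical helper. Each attribute is a (name, value)
pair, so a name may appear more than once, which a dict cannot express.

```python
def make_dn(pairs):
    """Build an unordered, immutable set of (name, value) attribute pairs."""
    return frozenset(pairs)


subject = make_dn([
    ("organizationName", "Example Org"),
    ("organizationalUnitName", "Research"),
    ("organizationalUnitName", "Engineering"),  # same name, second value
    ("commonName", "www.example.org"),
])

required = make_dn([
    ("organizationName", "Example Org"),
    ("organizationalUnitName", "Research"),
])

# Set operations answer "does the certificate carry these attributes?"
has_required = required.issubset(subject)
```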

I think frozenset is quite analogous to tuple at this level, and I
suggest that a similar set of C construction functions would be a good
thing.

Bill

> I guess nobody has tried to create frozenset instances from C code
> before. Almost everyone uses set anyway. What are you trying to do?
> 
> On 9/4/07, Bill Janssen <janssen at parc.com> wrote:
> > I'm looking at building a "frozenset" instance as a return value from
> > a C function, and the C API seems ridiculously clumsy.  Maybe I'm
> > misunderstanding it.  Apparently, I need to create a list object, then
> > pass that to PyFrozenSet_New(), then decref the list object.
> >
> > Is that correct?
> >
> > What I'd like is something more like
> >
> >      PyFrozenSet_NEW(int) => PySetObject *
> >      PyFrozenSet_SET_ITEM(s, i, v)
> >
> > Any idea why these aren't part of the API?
> >
> > Bill
> > _______________________________________________
> > Python-Dev mailing list
> > Python-Dev at python.org
> > http://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
> >
> 
> 
> -- 
> --Guido van Rossum (home page: http://www.python.org/~guido/)


From janssen at parc.com  Tue Sep  4 21:31:09 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 12:31:09 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <20070904151417.AFJ20377@ms10.lnh.mail.rcn.net> 
References: <20070904151417.AFJ20377@ms10.lnh.mail.rcn.net>
Message-ID: <07Sep4.123111pdt."57996"@synergy1.parc.xerox.com>

Raymond, thanks for the note.

> You can create a frozenset from any iterable using PyFrozenSet_New().
> 
> If you don't have an iterable and want to build-up the frozenset one element at a time, the approach is to create a regular set (or some other mutable container), add to it, then convert it to a frozenset when you're done:
> 
>   s = PySet_New(NULL);
>   PySet_Add(s, obj1);
>   PySet_Add(s, obj2);
>   PySet_Add(s, obj3);
>   f = PyFrozenSet_New(s);
>   Py_DECREF(s);

This is essentially the same thing I mentioned, except using a set
instead of a list as the iterable.

I'm just a tad annoyed at the fact that I know at set creation time
exactly how many elements it's going to have, and this procedure
strikes me as a somewhat inefficient way to create that set.  Just
tickles my "C inefficiency" funnybone a bit :-).

> The API you propose doesn't work because sets and frozensets are not
> indexed like tuples and lists.  Accordingly, sets and frozensets have
> a C API that is more like dictionaries.  Since dictionaries are not
> indexable, they also cannot have an API like the one you propose:
> 
>      PyDict_NEW(int) => PySetObject *
>      PyDict_SET_ITEM(s, index, key, value)

Didn't really mean to propose "PyDict_SET_ITEM(s, index, key, value)",
should have been

       PyDict_SET_ITEM(s, index, value)

But your point is still well taken.  How about this one, though:

     PyDict_NEW(int) => PySetObject *
     PyDict_ADD(s, value)

ADD would just stick value in the next empty slot (and steal its
reference).

Bill

From janssen at parc.com  Tue Sep  4 22:11:11 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 13:11:11 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep4.123111pdt."57996"@synergy1.parc.xerox.com> 
References: <20070904151417.AFJ20377@ms10.lnh.mail.rcn.net>
	<07Sep4.123111pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep4.131116pdt."57996"@synergy1.parc.xerox.com>

> But your point is still well taken.  How about this one, though:
> 
>      PyDict_NEW(int) => PySetObject *
>      PyDict_ADD(s, value)
> 
> ADD would just stick value in the next empty slot (and steal its
> reference).

Sorry, I meant to say

     PyFrozenSet_NEW(int) => PySetObject *
     PyFrozenSet_ADD(s, value)

Bill

From python at rcn.com  Tue Sep  4 22:19:45 2007
From: python at rcn.com (Raymond Hettinger)
Date: Tue,  4 Sep 2007 16:19:45 -0400 (EDT)
Subject: [Python-Dev] frozenset C API?
Message-ID: <20070904161945.AFJ44412@ms10.lnh.mail.rcn.net>

[Bill Janssen]
> 
> How about this one, though:
>
>    PyDict_NEW(int) => PySetObject *
>    PyDict_ADD(s, value)
>
> ADD would just stick value in the next 
> empty slot (and steal its reference).

Dicts, sets and frozensets are implemented as hash tables, not as arrays, so the above suggestion doesn't make any sense to me.  The location of the "next empty slot" depends on the key associated with the value being added (btw, where is the "key" handled in your proposed API?).  Consequently, the PyDict_New(int) step would have no way to know where to create the n empty slots (since their location is determined by the hash value of the keys). That is one reason that the tuple/list API differs from the set/frozenset/dict API.
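
To illustrate the point, here is a toy open-addressing table in Python. It is only a sketch: the insert() helper and the linear-probing scheme are assumptions made for illustration, not CPython's actual set implementation, and it assumes the table never fills up. It shows that the slot an element lands in is derived from its hash rather than from the order of insertion:

```python
def insert(table, key):
    """Place key in a toy open-addressing table and return the slot used."""
    size = len(table)
    i = hash(key) % size            # the starting slot comes from the hash
    while table[i] is not None and table[i] != key:
        i = (i + 1) % size          # linear probing on collision (toy scheme)
    table[i] = key
    return i

table = [None] * 8
# Small ints hash to themselves on CPython, so the slots are 10 % 8, 3 % 8, 17 % 8.
slots = [insert(table, k) for k in (10, 3, 17)]
```

The elements are inserted first, second and third, yet they end up in slots 2, 3 and 1, so preallocating "n slots" and filling them in order does not describe what the table does.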


Raymond

From martin at v.loewis.de  Tue Sep  4 22:55:38 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 04 Sep 2007 22:55:38 +0200
Subject: [Python-Dev] Compiling cpython2.5.1 in VS2005?
In-Reply-To: <46DDACFD.2080606@copperchipgames.com>
References: <46DDACFD.2080606@copperchipgames.com>
Message-ID: <46DDC64A.9010700@v.loewis.de>

> Can someone confirm this and update the readme file in the PCbuild8 
> directory?  I don't yet have access to the repository.

Please provide patches instead, and post them on bugs.python.org.

Regards,
Martin

From greg.ewing at canterbury.ac.nz  Tue Sep  4 22:58:07 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 05 Sep 2007 08:58:07 +1200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DD5599.9010003@v.loewis.de>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
	<46DC91D2.7060407@canterbury.ac.nz>
	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>
	<46DD22F3.6070701@canterbury.ac.nz> <46DD5599.9010003@v.loewis.de>
Message-ID: <46DDC6DF.1080307@canterbury.ac.nz>

Martin v. Löwis wrote:
> I think this requires a PEP, and explicit support from the
> NumPy people.

Someone who knows more about numpy's internals would
be needed to figure out what the details should be
like in order to be usable by numpy. But I could write
a PEP about how what I have in mind would look from
the Python level.

--
Greg

From martin at v.loewis.de  Tue Sep  4 23:26:20 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 04 Sep 2007 23:26:20 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
References: <-4762611594645938717@unknownmsgid>	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46DDCD7C.40004@v.loewis.de>

> I'm working on issue 1583946.  Nagle pointed out that each DN (the
> "subject" and "issuer" fields in a certificate) may have multiple
> values for the same attribute name, and I haven't been able to rule
> this out yet.

This is indeed common. In particular, DN= and OU= often occur multiple
times.

> X.509 DNs are sets of X.500 attributes, and X.500
> attributes may be either single-valued or multiple-valued.

Conceptually perhaps (although I doubt that). Practically, Name is

Name ::= CHOICE { RDNSequence }

RDNSequence ::= SEQUENCE OF RelativeDistinguishedName

RelativeDistinguishedName ::=
     SET OF AttributeTypeAndValue

 AttributeTypeAndValue ::= SEQUENCE {
     type     AttributeType,
     value    AttributeValue }

So it's a sequence of sets of key/value pairs. If you want to
have the same type twice, you have two options: either make
multiple RDNs, each single-valued, or make a single RDN,
with multiple kv-pairs.

IIUC, the intention of the multi-valued RDNs is that you have
an entity described by multiple attributes. For example,
relative to O=Foo, neither GN=Bill nor SN=Janssen might correctly
identify a person. So you would create O=Foo,GN=Bill+SN=Janssen.

That's allowed, but not really common - instead, people both
a) use CN as a unique identifier, and b) put separate
attributes for a single object into separate RDNs, as if
email=janssen at parc.com was a subnode in the DIT relative
to CN="Bill Janssen".

> I haven't
> found anything in the X.509 standard that prohibits multiple-valued
> attributes (yet -- I'm still looking), so I'm working on an
> alternative to using dicts to represent the set of attributes in the
> certificate that's returned from ssl.sslsocket.getpeercert().

Conceptually, it should be a list (order *is* relevant). It can
then be debated whether the RDN can be represented as a dictionary;
my understanding is that the intention of RDNs is that the AttributeType
is unique within an RDN (but I may be wrong).
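
A rough Python model of that structure, as a sketch only: the attribute names and the rdn_as_dict() helper are made up for illustration and are not part of the ssl module. The Name is a list of RDNs, each RDN is a frozenset of (type, value) pairs, and an RDN converts to a dict only if its types are unique:

```python
# O=Foo, GN=Bill+SN=Janssen modelled as a sequence of sets of pairs.
name = [
    frozenset([("O", "Foo")]),
    frozenset([("GN", "Bill"), ("SN", "Janssen")]),   # one multi-valued RDN
]

def rdn_as_dict(rdn):
    """Convert one RDN to a dict, refusing if a type occurs more than once."""
    d = dict(rdn)
    if len(d) != len(rdn):
        raise ValueError("attribute type repeated within one RDN")
    return d

person = rdn_as_dict(name[1])
```

If the same type did occur twice inside one RDN, rdn_as_dict() would raise instead of silently dropping a value, which is the case a plain dict representation cannot express.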

Regards,
Martin

From greg.ewing at canterbury.ac.nz  Tue Sep  4 23:26:32 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 05 Sep 2007 09:26:32 +1200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
	<46DC91D2.7060407@canterbury.ac.nz>
	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>
	<46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com>
	<ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>
Message-ID: <46DDCD88.7080009@canterbury.ac.nz>

Guido van Rossum wrote:
> I still don't see why the standard library needs to be weighed down
> with a competitor to numpy.

The way to get things done efficiently with an interpreted
language is for the language or its libraries to provide
primitives that work on large chunks of data at once, and
can be combined in flexible ways.

Python provides many such primitives for working with
strings -- the string methods, regexps, etc. But it doesn't
provide *any* for numbers, and that strikes me as an odd
gap in functionality.

What I have in mind would be quite small, so it wouldn't
"weigh down" the stdlib. You could think of it as an
extension to the operator module that turns it into
something useful. :-)

And, as I said, if it's designed so that numpy can build
on it, then it needn't be competing with numpy.

> Including a subset of numpy was considered
> in the past, but it's hard to decide on the right subset.

What I'm thinking of wouldn't be a "subset" of numpy, in
the sense that it wouldn't necessarily share any of the
numpy API from the Python perspective. All it would
provide is the minimum necessary primitives to get the
grunt work done.

I'm thinking of having a bunch of functions like

   add_elementwise(src1, src2, dst, start, chunk, stride)

where src1, src2 and dst are anything supporting the
new buffer protocol. That should be sufficient to support
something with a numpy-like API, I think.
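
A pure-Python sketch of what such a primitive might do. The message does not define the parameters, so this assumes that start is the first index, chunk the number of elements processed, and stride the step between them; the real thing would presumably be written in C, and memoryview merely stands in for the buffer protocol here:

```python
from array import array

def add_elementwise(src1, src2, dst, start, chunk, stride):
    """Set dst[i] = src1[i] + src2[i] for `chunk` indices, beginning at
    `start` and advancing by `stride` (assumed semantics)."""
    a, b, out = memoryview(src1), memoryview(src2), memoryview(dst)
    for k in range(chunk):
        i = start + k * stride
        out[i] = a[i] + b[i]

x = array('d', [1.0, 2.0, 3.0, 4.0])
y = array('d', [10.0, 20.0, 30.0, 40.0])
z = array('d', [0.0] * 4)
add_elementwise(x, y, z, 0, 2, 2)   # touches indices 0 and 2 only
```

Any object exporting a compatible buffer could be passed in place of the arrays, which is the property a numpy-like layer would rely on.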

--
Greg

From greg.ewing at canterbury.ac.nz  Tue Sep  4 23:30:25 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 05 Sep 2007 09:30:25 +1200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DD712D.9080103@gmail.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
	<46DC91D2.7060407@canterbury.ac.nz>
	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>
	<46DD22F3.6070701@canterbury.ac.nz> <46DD5599.9010003@v.loewis.de>
	<46DD712D.9080103@gmail.com>
Message-ID: <46DDCE71.7090608@canterbury.ac.nz>

Nick Coghlan wrote:
> Travis has actually been working on this off-and-on for the last couple 
> of years,

Well, yes, but that's concentrating on a different aspect
of things -- the data storage.

My proposal concerns what you can *do* with the data,
independent of the way it's stored. My idea and Travis's
would complement each other, I think.

--
Greg

From martin at v.loewis.de  Tue Sep  4 23:42:27 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 04 Sep 2007 23:42:27 +0200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DDCD88.7080009@canterbury.ac.nz>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>	<46DC91D2.7060407@canterbury.ac.nz>	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>	<46DD22F3.6070701@canterbury.ac.nz>
	<46DD482A.1000801@shrogers.com>	<ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>
	<46DDCD88.7080009@canterbury.ac.nz>
Message-ID: <46DDD143.3030205@v.loewis.de>

> What I have in mind would be quite small, so it wouldn't
> "weigh down" the stdlib.

If it's a builtin, it certainly would. Every builtin weighs
down the library, as it clutters the global(est) namespace.

> I'm thinking of having a bunch of functions like
> 
>    add_elementwise(src1, src2, dst, start, chunk, stride)
> 
> where src1, src2 and dst are anything supporting the
> new buffer protocol. That should be sufficient to support
> something with a numpy-like API, I think.

This sounds like a topic for python-ideas.

Regards,
Martin

From hasan.diwan at gmail.com  Tue Sep  4 23:43:04 2007
From: hasan.diwan at gmail.com (Hasan Diwan)
Date: Tue, 4 Sep 2007 14:43:04 -0700
Subject: [Python-Dev] Math.sqrt(-1) -- nan or ValueError?
Message-ID: <2cda2fc90709041443v225562c5taf3112d341652a42@mail.gmail.com>

I'm trying to fix a failing unit test in revision 57974. The test in
question claims that math.sqrt(-1) should raise ValueError; the code itself
gives "nan" as a result for that expression. I can modify the test and
therefore have it pass, but I'm not sure if an exception would be more
appropriate. I'd be happy for some direction here. Many thanks!

-- 
Cheers,
Hasan Diwan <hasan.diwan at gmail.com>

From ty.newton at copperchipgames.com  Tue Sep  4 23:46:33 2007
From: ty.newton at copperchipgames.com (Ty Newton)
Date: Wed, 05 Sep 2007 07:46:33 +1000
Subject: [Python-Dev] Compiling cpython2.5.1 in VS2005?
In-Reply-To: <46DDC64A.9010700@v.loewis.de>
References: <46DDACFD.2080606@copperchipgames.com>
	<46DDC64A.9010700@v.loewis.de>
Message-ID: <46DDD239.5050706@copperchipgames.com>

oh, sorry.  I'll do that.

Ty

Martin v. Löwis wrote:
>> Can someone confirm this and update the readme file in the PCbuild8 
>> directory?  I don't yet have access to the repository.
> 
> Please provide patches instead, and post them on bugs.python.org.
> 
> Regards,
> Martin
> 
> 

From guido at python.org  Tue Sep  4 23:55:28 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Sep 2007 14:55:28 -0700
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DDCD88.7080009@canterbury.ac.nz>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
	<46DC91D2.7060407@canterbury.ac.nz>
	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>
	<46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com>
	<ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>
	<46DDCD88.7080009@canterbury.ac.nz>
Message-ID: <ca471dc20709041455j515c887am350134216159848@mail.gmail.com>

By all means do write up a PEP -- it's hard to generalize from that one example.

On 9/4/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
> > I still don't see why the standard library needs to be weighed down
> > with a competitor to numpy.
>
> The way to get things done efficiently with an interpreted
> language is for the language or its libraries to provide
> primitives that work on large chunks of data at once, and
> can be combined in flexible ways.
>
> Python provides many such primitives for working with
> strings -- the string methods, regexps, etc. But it doesn't
> provide *any* for numbers, and that strikes me as an odd
> gap in functionality.
>
> What I have in mind would be quite small, so it wouldn't
> "weigh down" the stdlib. You could think of it as an
> extension to the operator module that turns it into
> something useful. :-)
>
> And, as I said, if it's designed so that numpy can build
> on it, then it needn't be competing with numpy.
>
> > Including a subset of numpy was considered
> > in the past, but it's hard to decide on the right subset.
>
> What I'm thinking of wouldn't be a "subset" of numpy, in
> the sense that it wouldn't necessarily share any of the
> numpy API from the Python perspective. All it would
> provide is the minimum necessary primitives to get the
> grunt work done.
>
> I'm thinking of having a bunch of functions like
>
>    add_elementwise(src1, src2, dst, start, chunk, stride)
>
> where src1, src2 and dst are anything supporting the
> new buffer protocol. That should be sufficient to support
> something with a numpy-like API, I think.
>
> --
> Greg
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Sep  4 23:58:27 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Sep 2007 14:58:27 -0700
Subject: [Python-Dev] Math.sqrt(-1) -- nan or ValueError?
In-Reply-To: <2cda2fc90709041443v225562c5taf3112d341652a42@mail.gmail.com>
References: <2cda2fc90709041443v225562c5taf3112d341652a42@mail.gmail.com>
Message-ID: <ca471dc20709041458l10170c91g215844778199fe13@mail.gmail.com>

Is this on OSX? That test has been failing (because on that platform
sqrt(-1) returns nan instead of raising ValueError) for years -- but
the test is only run when run in verbose mode, which mostly hides the
issue.  Have you read the comment for the test?

On 9/4/07, Hasan Diwan <hasan.diwan at gmail.com> wrote:
> I'm trying to fix a failing unit test in revision 57974. The test in
> question claims that math.sqrt(-1) should raise ValueError; the code itself
> gives "nan" as a result for that expression. I can modify the test and
> therefore have it pass, but I'm not sure if an exception would be more
> appropriate. I'd be happy for some direction here. Many thanks!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From hasan.diwan at gmail.com  Wed Sep  5 00:13:50 2007
From: hasan.diwan at gmail.com (Hasan Diwan)
Date: Tue, 4 Sep 2007 15:13:50 -0700
Subject: [Python-Dev] Math.sqrt(-1) -- nan or ValueError?
In-Reply-To: <ca471dc20709041458l10170c91g215844778199fe13@mail.gmail.com>
References: <2cda2fc90709041443v225562c5taf3112d341652a42@mail.gmail.com>
	<ca471dc20709041458l10170c91g215844778199fe13@mail.gmail.com>
Message-ID: <2cda2fc90709041513g7f67ec10l29c94c1d8fc4c0d7@mail.gmail.com>

On 04/09/07, Guido van Rossum <guido at python.org> wrote:
>
> Is this on OSX? That test has been failing (because on that platform
> sqrt(-1) returns nan instead of raising ValueError) for years -- but
> the test is only run when run in verbose mode, which mostly hides the
> issue.  Have you read the comment for the test?


Indeed, I am on OSX. Yes, I have read the comment for the test. Would the
following pseudocode be an acceptable fix for the problem:
if sys.platform == 'darwin' and math.sqrt(-1) == nan:
     return
else:
  try:
     x = math.sqrt(-1)
  except ValueError:
     pass
  ...
or should I just not bother?
-- 
Cheers,
Hasan Diwan <hasan.diwan at gmail.com>

From janssen at parc.com  Wed Sep  5 00:21:10 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 15:21:10 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <46DDCD7C.40004@v.loewis.de> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
Message-ID: <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>

> > X.509 DNs are sets of X.500 attributes, and X.500
> > attributes may be either single-valued or multiple-valued.
> 
> Conceptually perhaps (although I doubt that).

I got that from David Chadwick's book at http://sec.cs.kent.ac.uk/x500book/.

``An attribute comprises an attribute type and one or more attribute values.''

The question is, how would a multiple-valued attribute be represented
in a certificate Name?  I'm presuming it would appear as multiple
attributes with the same "type", but different values.

> Conceptually, it should be a list (order *is* relevant). It can
> then be debated whether the RDN can be represented as a dictionary;
> my understanding is that the intention of RDNs is that the AttributeType
> is unique within an RDN (but I may be wrong).

> Name ::= CHOICE { RDNSequence }
> 
> RDNSequence ::= SEQUENCE OF RelativeDistinguishedName
> 
> RelativeDistinguishedName ::=
>      SET OF AttributeTypeAndValue
> 
>  AttributeTypeAndValue ::= SEQUENCE {
>      type     AttributeType,
>      value    AttributeValue }

Order is important in the directory tree, but not (I think) in the DN;
that name is just an unordered set of attributes, because the
hierarchy information has already been lost (the RDN elements cannot
be distinguished from each other using only the internal certificate
information).

In any case, it certainly sounds to me as if there can be multiple
instances of AttributeTypeAndValue with the same "type" field in a
single Name.  So I'll represent them as tuples, which will preserve
the order in which they occur in the certificate, and make the value
immutable.  Applications which need them as sets can create their
own frozensets from that tuple.
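
A sketch of how that might look from Python. The attribute names and values below are made up, and the exact shape getpeercert() will return is not settled in this thread:

```python
# Hypothetical subject Name: a tuple of (type, value) pairs in certificate order.
subject = (
    ("organizationName", "Foo"),
    ("organizationalUnitName", "Research"),
    ("organizationalUnitName", "Security"),   # same type twice is fine in a tuple
    ("commonName", "Bill Janssen"),
)

# An application that wants set operations builds its own frozenset.
attrs = frozenset(subject)
required = frozenset([("organizationName", "Foo")])
has_required = required.issubset(attrs)

# Order and repeated types are still available from the tuple itself.
ou_values = [value for kind, value in subject if kind == "organizationalUnitName"]
```

The tuple keeps both organizationalUnitName entries in order, which a dict keyed on the attribute type could not do.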

Bill

From janssen at parc.com  Wed Sep  5 00:26:56 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 15:26:56 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <20070904161945.AFJ44412@ms10.lnh.mail.rcn.net> 
References: <20070904161945.AFJ44412@ms10.lnh.mail.rcn.net>
Message-ID: <07Sep4.152702pdt."57996"@synergy1.parc.xerox.com>

> Dicts, sets and frozenset are implemented as hash tables, not as arrays,

I see, thanks.

> The location of the "next empty slot" depends on a the key
> associated with the value being added (btw, where is the "key" handled
> in your proposed API?).

What key?  It's a set, not a mapping.  The value is the key.

Bill

From guido at python.org  Wed Sep  5 00:45:15 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Sep 2007 15:45:15 -0700
Subject: [Python-Dev] Math.sqrt(-1) -- nan or ValueError?
In-Reply-To: <2cda2fc90709041513g7f67ec10l29c94c1d8fc4c0d7@mail.gmail.com>
References: <2cda2fc90709041443v225562c5taf3112d341652a42@mail.gmail.com>
	<ca471dc20709041458l10170c91g215844778199fe13@mail.gmail.com>
	<2cda2fc90709041513g7f67ec10l29c94c1d8fc4c0d7@mail.gmail.com>
Message-ID: <ca471dc20709041545y46f8f6cfj65bc6662bd2d9710@mail.gmail.com>

I think it's better for the test to fail, to indicate that there's an
unresolved problem on the platform.
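
For what it's worth, a check along these lines reports the platform problem instead of hiding it. This is a sketch only: sqrt_raises_on_negative() is a made-up helper, not the code in test_math. Note that NaN compares unequal to itself, so an "== nan" comparison as in the pseudocode quoted below could never be true; "x != x" or math.isnan() is the usual test:

```python
import math

def sqrt_raises_on_negative(sqrt=math.sqrt):
    """Return True if sqrt(-1) raises ValueError, False if it returns NaN."""
    try:
        x = sqrt(-1)
    except ValueError:
        return True
    if x != x:                      # only NaN is unequal to itself
        return False
    raise AssertionError("unexpected result: %r" % (x,))

ok = sqrt_raises_on_negative()
# Simulate a platform whose libm returns NaN instead of signalling an error.
broken = sqrt_raises_on_negative(lambda v: float("nan"))
```

A test built on this would assert that the helper returns True, and so would keep failing on a platform where sqrt(-1) quietly returns NaN.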

On 9/4/07, Hasan Diwan <hasan.diwan at gmail.com> wrote:
> On 04/09/07, Guido van Rossum <guido at python.org> wrote:
> > Is this on OSX? That test has been failing (because on that platform
> > sqrt(-1) returns nan instead of raising ValueError) for years -- but
> > the test is only run when run in verbose mode, which mostly hides the
> > issue.  Have you read the comment for the test?
>
> Indeed, I am on OSX. Yes, I have read the comment for the test. Would the
> following pseudocode be an acceptable fix for the problem:
> if sys.platform == 'darwin' and math.sqrt(-1) == nan:
>       return
> else:
>   try:
>      x = math.sqrt(-1)
>   except ValueError:
>      pass
>   ...
> or should I just not bother?
> --
> Cheers,
>
> Hasan Diwan < hasan.diwan at gmail.com>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ty.newton at copperchipgames.com  Wed Sep  5 02:45:52 2007
From: ty.newton at copperchipgames.com (Ty Newton)
Date: Wed, 05 Sep 2007 10:45:52 +1000
Subject: [Python-Dev] Porting information
In-Reply-To: <46DCD1F6.2040906@v.loewis.de>
References: <46DC88AD.5060001@copperchipgames.com>
	<46DCD1F6.2040906@v.loewis.de>
Message-ID: <46DDFC40.2000808@copperchipgames.com>

Thanks Martin,

Martin v. Löwis wrote:
>> I've started by looking at the parser portion of the code.  However I am 
>> not certain this is the best place to start.  Since there are so many 
>> ports I assume there is a well trodden path to completing this kind of 
>> task.
> 
> I believe this assumption is wrong. There are not many ports, only a
> handful (or less - Jython, IronPython, PyPy). While Jython and
> IronPython may have similar implementation strategies, I would expect
> that PyPy took an entirely different approach.
> 
> In any case, there certainly is a step that you apparently failed
> to perform as the very first step: set some explicit goals. What
> kind of compatibility do you want to achieve in your port, what
> other goals would you like to follow?
> 
I thought I'd try and keep my message short so I decided not to go into
the explicit objectives.  At the most basic it is the ability for
developers to run compiled Python as part of the game code.  The next 
step up from that is allowing Python source code to execute and be 
modified in a 'simple' interactive coding tool: allowing for 'tweaking' 
code to be implemented outside of the game engine team.

Principal constraint:
Microsoft support for independent development on the 360 is only 
provided through the use of their slimmed down .Net compact framework 
and the XNA Game Studio Express development environment (C# only).  This 
allows Microsoft to implement security within the tool chain and 
deployment pipeline to sandbox strictly.

> IOW, why is IronPython not what you want (it *is* a port of CPython
> to C#, in a sense), and why is the C# support in PyPy not good enough
> for you?
> 
The impact, to this project, of the reduced API and strict sandboxing in
the 360 dev environment is that Python implementations like IronPython are
not feasible.  IronPython uses the reflection capabilities of C# to
interpret directly to CLR.  Without reflection IronPython simply cannot
operate.  Unfortunately the 360 API does not include reflection
functionality.

I had a look into PyPy and concluded that it could produce a result that
would operate; however, I was less certain about integrating it into a
development tool chain for the 360.  It seems more likely that a
'C#Python' would result in a cleaner development environment - much like 
the embedded inclusion of Lua scripting in many games software.

>> I would prefer to break the task into portions that can be verified 
>> (tested for correctness) independently or as a stack (one on top of the 
>> next).  That way I can catch errors early and have more confidence in 
>> what I am creating.
> 
> As I don't know what you want to achieve, it is difficult to tell
> you what steps to take.
>
> I assume your implementation would be similar to CPython in that
> it uses the same byte code format. So one path would be to ignore
> the compiler entirely, and assume that the byte code format is given,
> i.e. start by porting ceval.c.
> 
> I'm not sure whether you also want to provide the same low-level
> API (i.e. whether you want to provide "Embedding and Extending");
> it surely can't be the *same* API, since yours will be C#, whereas
> CPython's is, well, C. If you implement ceval.c, you will find
> quickly that you need much of the Objects folder, so implementing
> the 10 or so most important objects would be the natural starting
> point (type, int, string, tuple, dict, frame, code, class, method -
> assuming you would target Python 1.5 first, i.e. no bool, cell,
> descr, gen, iter, weakref, unicode, object).
> 
>> When I looked through the test suites they all seem to be written in 
>> Python.  Is there a test suite for the core of CPython i.e. before the C 
>> code can interpret Python code?
> 
> Yes and no. The core Python is tested through compilation - if it
> compiles without warnings on the relevant compilers, it is considered
> good enough to run the Python test suite. For selected features of
> the interpreter, there are specific tests, in particular test_capi.
> 
> The core of CPython (compiler, objects, builtins) is then tested
> through Python code.
> 
This seems like a sensible way to start since the test harness needs a 
Python interpreter.  Although it seems counter-intuitive to build the 
bytecode interpreter so that I can test the bytecode compiler...

> Regards,
> Martin
> 
> 

Thanks for the advice Martin.

Regards,
Ty

From steve at shrogers.com  Wed Sep  5 03:39:42 2007
From: steve at shrogers.com (Steven H. Rogers)
Date: Tue, 04 Sep 2007 19:39:42 -0600
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>	
	<46DC91D2.7060407@canterbury.ac.nz>	
	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>	
	<46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com>
	<ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>
Message-ID: <46DE08DE.4090503@shrogers.com>

Guido van Rossum wrote:
> I still don't see why the standard library needs to be weighed down
> with a competitor to numpy. Including a subset of numpy was considered
> in the past, but it's hard to decide on the right subset. In the end
> it was decided that numpy is too big to become a standard library.
> Given all the gyrations it has gone through I definitely believe this
> was the right decision.

A competitor to NumPy would be counter-productive, but including a core 
subset in the standard library that NumPy could be built upon would add 
valuable functionality to Python out of the box.  It was probably the 
best decision to not include NumPy when it was previously considered, 
but I think it should be reconsidered for Python 3.x.  While defining 
the right subset to include has its difficulties, I believe it can be 
done.  What would be a reasonable target size for inclusion in the 
standard library?

# Steve


From steve at shrogers.com  Wed Sep  5 03:56:23 2007
From: steve at shrogers.com (Steven H. Rogers)
Date: Tue, 04 Sep 2007 19:56:23 -0600
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DDC6DF.1080307@canterbury.ac.nz>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>	<46DC91D2.7060407@canterbury.ac.nz>	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>	<46DD22F3.6070701@canterbury.ac.nz>
	<46DD5599.9010003@v.loewis.de> <46DDC6DF.1080307@canterbury.ac.nz>
Message-ID: <46DE0CC7.3040503@shrogers.com>

Greg Ewing wrote:
> Martin v. Löwis wrote:
>> I think this requires a PEP, and explicit support from the
>> NumPy people.
>
> Someone who knows more about numpy's internals would
> be needed to figure out what the details should be
> like in order to be usable by numpy. But I could write
> a PEP about how what I have in mind would look from
> the Python level.
>
I'm confident that the NumPy developers would support this in 
principle.  If you want help with the PEP, I'm willing to help.



From guido at python.org  Wed Sep  5 04:03:51 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Sep 2007 19:03:51 -0700
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DE08DE.4090503@shrogers.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
	<46DC91D2.7060407@canterbury.ac.nz>
	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>
	<46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com>
	<ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>
	<46DE08DE.4090503@shrogers.com>
Message-ID: <ca471dc20709041903l3b7898a6pd5cf2cd450b05b81@mail.gmail.com>

On 9/4/07, Steven H. Rogers <steve at shrogers.com> wrote:
> Guido van Rossum wrote:
> > I still don't see why the standard library needs to be weighed down
> > with a competitor to numpy. Including a subset of numpy was considered
> > in the past, but it's hard to decide on the right subset. In the end
> > it was decided that numpy is too big to become a standard library.
> > Given all the gyrations it has gone through I definitely believe this
> > was the right decision.
> A competitor to NumPy would be counter-productive, but including a core
> subset in the standard library that NumPy could be built upon would add
> valuable functionality to Python out of the box.  It was probably the
> best decision to not include NumPy when it was previously considered,
> but I think it should be reconsidered for Python 3.x.  While defining
> the right subset to include has its difficulties, I believe it can be
> done.  What would be a reasonable target size for inclusion in the
> standard library?

What makes 3.0 so special? Additions to the stdlib can be considered
at any feature release. Frankly, 3.0 is already so loaded with new
features (and removals) that I'm not sure it's worth piling this onto
it.

That said, I would much rather argue with a detailed PEP than with yet
another suggestion that we do something. I am already doing enough --
it's up to some other folks to get together and produce a proposal.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steve at shrogers.com  Wed Sep  5 04:35:41 2007
From: steve at shrogers.com (Steven H. Rogers)
Date: Tue, 04 Sep 2007 20:35:41 -0600
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <ca471dc20709041903l3b7898a6pd5cf2cd450b05b81@mail.gmail.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>	
	<46DC91D2.7060407@canterbury.ac.nz>	
	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>	
	<46DD22F3.6070701@canterbury.ac.nz>
	<46DD482A.1000801@shrogers.com>	
	<ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>	
	<46DE08DE.4090503@shrogers.com>
	<ca471dc20709041903l3b7898a6pd5cf2cd450b05b81@mail.gmail.com>
Message-ID: <46DE15FD.9000801@shrogers.com>

Guido van Rossum wrote:
> What makes 3.0 so special? Additions to the stdlib can be considered
> at any feature release. Frankly, 3.0 is already so loaded with new
> features (and removals) that I'm not sure it's worth piling this onto
> it.
>   
I actually wrote 3.x, not 3.0.  I agree that it makes no sense to add 
anything more to 3.0.



From robert.kern at gmail.com  Wed Sep  5 04:45:45 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 04 Sep 2007 21:45:45 -0500
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <ca471dc20709041903l3b7898a6pd5cf2cd450b05b81@mail.gmail.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>	<46DC91D2.7060407@canterbury.ac.nz>	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>	<46DD22F3.6070701@canterbury.ac.nz>
	<46DD482A.1000801@shrogers.com>	<ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>	<46DE08DE.4090503@shrogers.com>
	<ca471dc20709041903l3b7898a6pd5cf2cd450b05b81@mail.gmail.com>
Message-ID: <fbl590$8j7$1@sea.gmane.org>

Guido van Rossum wrote:
> On 9/4/07, Steven H. Rogers <steve at shrogers.com> wrote:
>> Guido van Rossum wrote:
>>> I still don't see why the standard library needs to be weighed down
>>> with a competitor to numpy. Including a subset of numpy was considered
>>> in the past, but it's hard to decide on the right subset. In the end
>>> it was decided that numpy is too big to become a standard library module.
>>> Given all the gyrations it has gone through I definitely believe this
>>> was the right decision.
>> A competitor to NumPy would be counter-productive, but including a core
>> subset in the standard library that NumPy could be built upon would add
>> valuable functionality to Python out of the box.  It was probably the
>> best decision to not include NumPy when it was previously considered,
>> but I think it should be reconsidered for Python 3.x.  While defining
>> the right subset to include has its difficulties, I believe it can be
>> done.  What would be a reasonable target size for inclusion in the
>> standard library?
> 
> What makes 3.0 so special? Additions to the stdlib can be considered
> at any feature release.

The 3.x compatibility break (however alleviated by 2to3) makes a nice clean
cutoff. The numpy that works on Pythons 3.x would essentially be a port from the
current numpy. Consequently, we could modify the numpy for Pythons 3.x to always
rely on the stdlib API to build on top of. We couldn't do that for the version
targeted to Pythons 2.x because we could only rely on its presence for 2.6+. I
don't mind maintaining two versions of numpy, one for Python 2.x and one for
3.x, but I don't care to maintain three.

I invite Greg and Steven and whoever else is interested to discuss ideas for the
PEP on numpy-discussion. I'm skeptical, seeing what currently has been
suggested, but some more details could easily allay that.

  http://projects.scipy.org/mailman/listinfo/numpy-discussion

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From janssen at parc.com  Wed Sep  5 04:58:11 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 19:58:11 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep4.195817pdt."57996"@synergy1.parc.xerox.com>

> In any case, it certainly sounds to me as if there can be multiple
> instances of AttributeTypeAndValue with the same "type" field in a
> single Name.  So I'll represent them as tuples, which will preserve
> the order in which they occur in the certificate, and make the value
> immutable.  Applications which need them as sets can create their
> own frozensets from that tuple.

Here's an example of the new format:

  {'issuer': (('countryName', u'US'),
              ('organizationName', u'VeriSign, Inc.'),
              ('organizationalUnitName', u'VeriSign Trust Network'),
              ('organizationalUnitName',
               u'Terms of use at https://www.verisign.com/rpa (c)06'),
              ('commonName',
               u'VeriSign Class 3 Extended Validation SSL SGC CA')),
   'notAfter': 'May  8 23:59:59 2009 GMT',
   'notBefore': 'May  9 00:00:00 2007 GMT',
   'subject': (('serialNumber', u'2497886'),
               ('1.3.6.1.4.1.311.60.2.1.3', u'US'),
               ('1.3.6.1.4.1.311.60.2.1.2', u'Delaware'),
               ('countryName', u'US'),
               ('postalCode', u'94043'),
               ('stateOrProvinceName', u'California'),
               ('localityName', u'Mountain View'),
               ('streetAddress', u'487 East Middlefield Road'),
               ('organizationName', u'VeriSign, Inc.'),
               ('organizationalUnitName', u'Production Security Services'),
               ('organizationalUnitName',
                u'Terms of use at www.verisign.com/rpa (c)06'),
               ('commonName', u'www.verisign.com')),
   'version': 2}
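
As a rough sketch of how an application might consume this format (the
abbreviated `cert` dict and the helper name `values_for` are made up for
illustration and are not part of the SSL module), repeated attribute types
stay in certificate order, and a frozenset can be built on demand:

```python
# Hypothetical certificate dict in the tuple-of-pairs format shown above
# (abbreviated, and using plain strings instead of u'' literals).
cert = {
    'subject': (
        ('countryName', 'US'),
        ('organizationName', 'VeriSign, Inc.'),
        ('organizationalUnitName', 'Production Security Services'),
        ('organizationalUnitName',
         'Terms of use at www.verisign.com/rpa (c)06'),
        ('commonName', 'www.verisign.com'),
    ),
}


def values_for(name, attr_type):
    """Return every value recorded for attr_type, in certificate order."""
    return tuple(value for (key, value) in name if key == attr_type)


# An application that wants set semantics can build its own frozenset.
subject_set = frozenset(cert['subject'])

print(values_for(cert['subject'], 'organizationalUnitName'))
print(('countryName', 'US') in subject_set)
```

The tuple keeps both organizationalUnitName entries and their order; the
frozenset trades that order for membership tests.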

Bill

From guido at python.org  Wed Sep  5 05:18:36 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Sep 2007 20:18:36 -0700
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <fbl590$8j7$1@sea.gmane.org>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
	<46DC91D2.7060407@canterbury.ac.nz>
	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>
	<46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com>
	<ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>
	<46DE08DE.4090503@shrogers.com>
	<ca471dc20709041903l3b7898a6pd5cf2cd450b05b81@mail.gmail.com>
	<fbl590$8j7$1@sea.gmane.org>
Message-ID: <ca471dc20709042018v16f97c98j26ec49d302b426fa@mail.gmail.com>

On 9/4/07, Robert Kern <robert.kern at gmail.com> wrote:
> The 3.x compatibility break (however alleviated by 2to3) makes a nice clean
> cutoff. The numpy that works on Pythons 3.x would essentially be a port from the
> current numpy. Consequently, we could modify the numpy for Pythons 3.x to always
> rely on the stdlib API to build on top of. We couldn't do that for the version
> targeted to Pythons 2.x because we could only rely on its presence for 2.6+. I
> don't mind maintaining two versions of numpy, one for Python 2.x and one for
> 3.x, but I don't care to maintain three.

I just had a discussion with Glyph "Twisted" Lefkowitz about this. He
warns that if every project using Python uses 3.0's incompatibility as
an excuse to make their own library/package/project incompatible as
well, we will end up with total pandemonium (my paraphrase). I think
he has a good point -- we shouldn't be injecting any more instability
into the world than absolutely necessary.

In any case, the rift is more likely to be between 2.5 and 2.6, since
2.6 will provide backports of most 3.0 features (though without some
of the accompanying cleanups, in order to also provide strong
backwards compatibility).

To be honest, I also doubt the viability of designing and implementing
something that would satisfy Greg Ewing's goals *and* be stable enough
in the standard library, in under a year.  But as I said before, I
don't see much point in arguing much further until I see the PEP. I
may yet be convinced, but it will have to be a good design and a
well-argued proposal.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steve at shrogers.com  Wed Sep  5 05:17:43 2007
From: steve at shrogers.com (Steven H. Rogers)
Date: Tue, 04 Sep 2007 21:17:43 -0600
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <fbl590$8j7$1@sea.gmane.org>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>	<46DC91D2.7060407@canterbury.ac.nz>	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>	<46DD22F3.6070701@canterbury.ac.nz>	<46DD482A.1000801@shrogers.com>	<ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>	<46DE08DE.4090503@shrogers.com>	<ca471dc20709041903l3b7898a6pd5cf2cd450b05b81@mail.gmail.com>
	<fbl590$8j7$1@sea.gmane.org>
Message-ID: <46DE1FD7.7060706@shrogers.com>

Robert Kern wrote:
> I invite Greg and Steven and whoever else is interested to discuss ideas for the
> PEP on numpy-discussion. I'm skeptical, seeing what currently has been
> suggested, but some more details could easily allay that.
>
>   http://projects.scipy.org/mailman/listinfo/numpy-discussion
>   
Accepted, that's probably the best place for this to continue.  Greg's 
suggestion sounds plausible to me, but needs to be fleshed out.

From robert.kern at gmail.com  Wed Sep  5 05:28:46 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 04 Sep 2007 22:28:46 -0500
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <ca471dc20709042018v16f97c98j26ec49d302b426fa@mail.gmail.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>	<46DC91D2.7060407@canterbury.ac.nz>	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>	<46DD22F3.6070701@canterbury.ac.nz>
	<46DD482A.1000801@shrogers.com>	<ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>	<46DE08DE.4090503@shrogers.com>	<ca471dc20709041903l3b7898a6pd5cf2cd450b05b81@mail.gmail.com>	<fbl590$8j7$1@sea.gmane.org>
	<ca471dc20709042018v16f97c98j26ec49d302b426fa@mail.gmail.com>
Message-ID: <fbl7pl$el9$1@sea.gmane.org>

Guido van Rossum wrote:
> On 9/4/07, Robert Kern <robert.kern at gmail.com> wrote:
>> The 3.x compatibility break (however alleviated by 2to3) makes a nice clean
>> cutoff. The numpy that works on Pythons 3.x would essentially be a port from the
>> current numpy. Consequently, we could modify the numpy for Pythons 3.x to always
>> rely on the stdlib API to build on top of. We couldn't do that for the version
>> targeted to Pythons 2.x because we could only rely on its presence for 2.6+. I
>> don't mind maintaining two versions of numpy, one for Python 2.x and one for
>> 3.x, but I don't care to maintain three.
> 
> I just had a discussion with Glyph "Twisted" Lefkowitz about this. He
> warns that if every project using Python uses 3.0's incompatibility as
> an excuse to make their own library/package/project incompatible as
> well, we will end up with total pandemonium (my paraphrase). I think
> he has a good point -- we shouldn't be injecting any more instability
> into the world than absolutely necessary.

I agree. I didn't mean to imply that the 3.x version of numpy would be
incompatible to users of it, just that the codebase that implements it will be
different, whether it is automatically or manually translated.

Of course, if the API is introduced in 3.(x>0), we end up with the same problem
I wanted to avoid. Ah well. See you on the flip side of the PEP.

-- 
Robert Kern


From martin at v.loewis.de  Wed Sep  5 07:25:12 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 05 Sep 2007 07:25:12 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
References: <-4762611594645938717@unknownmsgid>	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46DE3DB8.6000004@v.loewis.de>

>>> X.509 DNs are sets of X.500 attributes, and X.500
>>> attributes may be either single-valued or multiple-valued.
>> Conceptually perhaps (although I doubt that).
> 
> I got that from David Chadwick's book at http://sec.cs.kent.ac.uk/x500book/.
> 
> ``An attribute comprises an attribute type and one or more attribute values.''

Ah, ok. But then, the DN is not a *set* of such attributes, but a
sequence.

> The question is, how would a multiple-valued attribute be represented
> in a certificate Name?  I'm presuming it would appear as multiple
> attributes with the same "type", but different values.

Within a single RelativeDistinguishedName, yes.

> Order is important in the directory tree, but not (I think) in the DN;
> that name is just an unordered set of attributes, because the
> hierarchy information has already been lost (the RDN elements cannot
> be distinguished from each other using only the internal certificate
> information).

Hmm. The directory tree only exists through the order in the DN.
E.g., from

http://java.sun.com/products/jndi/tutorial/ldap/models/x500.html

"The X.500 namespace is hierarchical. An entry is unambiguously
identified by a distinguished name (DN). A distinguished name is the
concatenation of selected attributes from each entry, called the
relative distinguished name (RDN), in the tree along a path leading from
the root down to the named entry."

If the RDNs within a DN were not ordered, you would not get
a hierarchical tree, and you could not identify entries unambiguously.
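
A minimal sketch of that argument, assuming a DN is modelled as an ordered
tuple of single-valued RDNs with the root first (the model and the helper
name `path_from_root` are illustrative assumptions only):

```python
# A DN modelled as an ordered sequence of single-valued RDNs, root first.
dn = (('DC', 'org'), ('DC', 'python'), ('CN', 'foo'))


def path_from_root(dn):
    """List every entry on the path from the root down to the named entry."""
    return [dn[:depth] for depth in range(1, len(dn) + 1)]


# Reordering the RDNs names a different path through the tree ...
reordered = (('DC', 'python'), ('DC', 'org'), ('CN', 'foo'))
assert path_from_root(dn) != path_from_root(reordered)

# ... but as unordered sets the two names would be indistinguishable.
assert frozenset(dn) == frozenset(reordered)
```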

> In any case, it certainly sounds to me as if there can be multiple
> instances of AttributeTypeAndValue with the same "type" field in a
> single Name.  So I'll represent them as tuples, which will preserve
> the order in which they occur in the certificate, and make the value
> immutable.

Ok. I think this will still not support multi-valued RDNs properly, but
those are uncommon in PKI.

Regards,
Martin

From martin at v.loewis.de  Wed Sep  5 07:48:11 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 05 Sep 2007 07:48:11 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep4.195817pdt."57996"@synergy1.parc.xerox.com>
References: <-4762611594645938717@unknownmsgid>	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>	<46DDCD7C.40004@v.loewis.de>	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<07Sep4.195817pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46DE431B.20403@v.loewis.de>

> Here's an example of the new format:
> 
>   {'issuer': (('countryName', u'US'),
>               ('organizationName', u'VeriSign, Inc.'),
>               ('organizationalUnitName', u'VeriSign Trust Network'),
>               ('organizationalUnitName',
>                u'Terms of use at https://www.verisign.com/rpa (c)06'),
>               ('commonName',
>                u'VeriSign Class 3 Extended Validation SSL SGC CA')),

Can you please take a look at the attached certificates? How are they
represented? The DNs of these are structurally different, one being
/DC=org/DC=python/CN=foo/CN=bar and the other
/DC=org/DC=python/CN=foo2+CN=bar2

Regards,
Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ca1.crt
Type: application/x-x509-ca-cert
Size: 1008 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20070905/61187ed8/attachment-0002.crt 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ca2.crt
Type: application/x-x509-ca-cert
Size: 1008 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20070905/61187ed8/attachment-0003.crt 

From ndbecker2 at gmail.com  Wed Sep  5 14:57:06 2007
From: ndbecker2 at gmail.com (Neal Becker)
Date: Wed, 05 Sep 2007 08:57:06 -0400
Subject: [Python-Dev] python sphinx install?
Message-ID: <fbm937$ddo$1@sea.gmane.org>

I'm interested in trying out new style (python 2.6) documentation.  I see
we're using docutils + sphinx?

I did: svn co http://svn.python.org/projects/doctools/trunk/

How can I install this to try it with python-2.5?


From g.brandl at gmx.net  Wed Sep  5 15:05:55 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 05 Sep 2007 15:05:55 +0200
Subject: [Python-Dev] python sphinx install?
In-Reply-To: <fbm937$ddo$1@sea.gmane.org>
References: <fbm937$ddo$1@sea.gmane.org>
Message-ID: <fbm9jg$fd4$1@sea.gmane.org>

Neal Becker schrieb:
> I'm interested in trying out new style (python 2.6) documentation.  I see
> we're using docutils + sphinx?
> 
> I did: svn co http://svn.python.org/projects/doctools/trunk/
> 
> How can I install this to try it with python-2.5?

What do you want to try with Python 2.5?

If you want to build the Python 2.6/3.0 docs, it's easiest to check the
Python sources out from http://svn.python.org/projects/python/trunk, go to
the Doc directory and do "make html". This will check out sphinx and all
other needed libraries into Doc/tools and build the docs.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From alan.mcintyre at gmail.com  Wed Sep  5 15:09:21 2007
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Wed, 5 Sep 2007 09:09:21 -0400
Subject: [Python-Dev] x86 XP trunk failure
Message-ID: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com>

Hi all,

My build slave (http://www.python.org/dev/buildbot/trunk/x86%20XP%20trunk)
keeps failing because of a crash that appears to be in the bsddb
module.  I assume the master deems the slave to be lost because it's
sitting there waiting on me to make a choice on the "debug/abort"
dialog box.

I can provide details if anybody needs them.  I just figured somebody
might want to know that this is an actual build/test problem instead of
some kind of issue with the internet connection here.

Thanks,
Alan

From ndbecker2 at gmail.com  Wed Sep  5 15:18:29 2007
From: ndbecker2 at gmail.com (Neal Becker)
Date: Wed, 05 Sep 2007 09:18:29 -0400
Subject: [Python-Dev] python sphinx install?
References: <fbm937$ddo$1@sea.gmane.org> <fbm9jg$fd4$1@sea.gmane.org>
Message-ID: <fbmabb$i7e$1@sea.gmane.org>

Georg Brandl wrote:

> Neal Becker schrieb:
>> I'm interested in trying out new style (python 2.6) documentation.  I see
>> we're using docutils + sphinx?
>> 
>> I did: svn co http://svn.python.org/projects/doctools/trunk/
>> 
>> How can I install this to try it with python-2.5?
> 
> What do you want to try with Python 2.5?
> 
> If you want to build the Python 2.6/3.0 docs, it's easiest to check the
> Python sources out from http://svn.python.org/projects/python/trunk, go to
> the Doc directory and do "make html". This will check out sphinx and all
> other needed libraries into Doc/tools and build the docs.
> 
> Georg
> 

I want to document my own python code.  I figured I might as well start
using the new documentation system - but I'm using python-2.5.  I intend to
use epydoc.  I thought maybe I could just add sphinx to my docutils, but
maybe not?


From g.brandl at gmx.net  Wed Sep  5 15:30:24 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 05 Sep 2007 15:30:24 +0200
Subject: [Python-Dev] python sphinx install?
In-Reply-To: <fbmabb$i7e$1@sea.gmane.org>
References: <fbm937$ddo$1@sea.gmane.org> <fbm9jg$fd4$1@sea.gmane.org>
	<fbmabb$i7e$1@sea.gmane.org>
Message-ID: <fbmb1c$kaa$1@sea.gmane.org>

Neal Becker schrieb:
> Georg Brandl wrote:
> 
>> Neal Becker schrieb:
>>> I'm interested in trying out new style (python 2.6) documentation.  I see
>>> we're using docutils + sphinx?
>>> 
>>> I did: svn co http://svn.python.org/projects/doctools/trunk/
>>> 
>>> How can I install this to try it with python-2.5?
>> 
>> What do you want to try with Python 2.5?
>> 
>> If you want to build the Python 2.6/3.0 docs, it's easiest to check the
>> Python sources out from http://svn.python.org/projects/python/trunk, go to
>> the Doc directory and do "make html". This will check out sphinx and all
>> other needed libraries into Doc/tools and build the docs.
>> 
>> Georg
>> 
> 
> I want to document my own python code.  I figured I might as well start
> using the new documentation system - but I'm using python-2.5.  I intend to
> use epydoc.  I thought maybe I could just add sphinx to my docutils, but
> maybe not?

I see. Currently, sphinx is not ready to be used by other projects, at least
not in conjunction with tools like epydoc. (You should, however, be able to
create a rst document hierarchy like Python's and use sphinx for it.)

As soon as all needs for the Python documentation are fulfilled, I'll think
about how to make the toolset available for other projects.

Georg



From martin at v.loewis.de  Wed Sep  5 15:34:03 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 05 Sep 2007 15:34:03 +0200
Subject: [Python-Dev] x86 XP trunk failure
In-Reply-To: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com>
References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com>
Message-ID: <46DEB04B.3090500@v.loewis.de>

> My build slave (http://www.python.org/dev/buildbot/trunk/x86%20XP%20trunk)
> keeps failing because of a crash that appears to be in the bsddb
> module.  I assume the master deems the slave to be lost because it's
> sitting there waiting on me to make a choice on the "debug/abort"
> dialog box.

What branch, and for how long has this dialog been sitting around?

For crashes in 3.0, there should not be any such dialogs anymore, but
there may have been before I turned them off.

> I can provide details if anybody needs them.  I just figured somebody
> might want to know that this is an actual build/test problem instead of
> some kind of issue with the internet connection here.

Thanks. You can discard any such dialogs - most likely, they really were
from the 3.0 branch, which is known to crash in bsddb.

Regards,
Martin

From alan.mcintyre at gmail.com  Wed Sep  5 15:40:57 2007
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Wed, 5 Sep 2007 09:40:57 -0400
Subject: [Python-Dev] x86 XP trunk failure
In-Reply-To: <46DEB04B.3090500@v.loewis.de>
References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com>
	<46DEB04B.3090500@v.loewis.de>
Message-ID: <1d36917a0709050640y6fead745p6bac24f6f6bb520d@mail.gmail.com>

On 9/5/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > My build slave (http://www.python.org/dev/buildbot/trunk/x86%20XP%20trunk)
> > keeps failing because of a crash that appears to be in the bsddb
> > module.  I assume the master deems the slave to be lost because it's
> > sitting there waiting on me to make a choice on the "debug/abort"
> > dialog box.
>
> What branch, and for how long has this dialog been sitting around?

It's the trunk; at the moment the debugger is sitting there with
python_d.exe at a breakpoint.  The current instance of python_d being
debugged is only a day or so old.  I don't know when this problem
started happening, but I think it's been a while (it was happening for
all the visible builds on the dashboard when I first noticed it a day
or two ago).

> For crashes in 3.0, there should not be any such dialogs anymore, but
> there may have been before I turned them off.
>
> > I can provide details if anybody needs them.  I just figured somebody
> > might want to know that this is an actual build/test problem instead of
> > some kind of issue with the internet connection here.
>
> Thanks. You can discard any such dialogs - most likely, they really were
> from the 3.0 branch, which is known to crash in bsddb.

Ok.

From janssen at parc.com  Wed Sep  5 17:17:12 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 5 Sep 2007 08:17:12 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <46DE3DB8.6000004@v.loewis.de> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<46DE3DB8.6000004@v.loewis.de>
Message-ID: <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>

> > In any case, it certainly sounds to me as if there can be multiple
> > instances of AttributeTypeAndValue with the same "type" field in a
> > single Name.  So I'll represent them as tuples, which will preserve
> > the order in which they occur in the certificate, and make the value
> > immutable.
> 
> Ok. I think this will still not support multi-valued RDNs properly, but
> those are uncommon in PKI.

I'm not sure why not...  Can you say more?

Bill

From janssen at parc.com  Wed Sep  5 17:44:09 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 5 Sep 2007 08:44:09 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <46DE431B.20403@v.loewis.de> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<07Sep4.195817pdt."57996"@synergy1.parc.xerox.com>
	<46DE431B.20403@v.loewis.de>
Message-ID: <07Sep5.084418pdt."57996"@synergy1.parc.xerox.com>

>The DNs of these are structurally different, one being
>/DC=org/DC=python/CN=foo/CN=bar and the other
>/DC=org/DC=python/CN=foo2+CN=bar2

Ah, I see what you're driving at.

You can inspect them yourself by looking at the certs with openssl:

% openssl x509 -text -in attachment-0002.crt 
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            a9:29:70:b4:3a:72:27:5a
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: DC=org, DC=python, CN=foo, CN=bar
        Validity
            Not Before: Sep  5 05:38:20 2007 GMT
            Not After : Sep  4 05:38:20 2008 GMT
        Subject: DC=org, DC=python, CN=foo, CN=bar
        Subject Public Key Info
	...

% openssl x509 -text -in attachment-0003.crt 
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            82:0a:4f:36:0f:ab:1a:c3
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: DC=org, DC=python, CN=bar2, CN=foo2
        Validity
            Not Before: Sep  5 05:43:26 2007 GMT
            Not After : Sep  4 05:43:26 2008 GMT
        Subject: DC=org, DC=python, CN=bar2, CN=foo2
        Subject Public Key Info:

The hierarchy information does not appear to be preserved.

Bill


From janssen at parc.com  Wed Sep  5 17:48:04 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 5 Sep 2007 08:48:04 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <46DE431B.20403@v.loewis.de> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<07Sep4.195817pdt."57996"@synergy1.parc.xerox.com>
	<46DE431B.20403@v.loewis.de>
Message-ID: <07Sep5.084813pdt."57996"@synergy1.parc.xerox.com>

More succinctly:

% openssl x509 -subject -noout -in attachment-0002.crt 
subject= /DC=org/DC=python/CN=foo/CN=bar
% openssl x509 -subject -noout -in attachment-0003.crt 
subject= /DC=org/DC=python/CN=bar2/CN=foo2
% 

Bill

From martin at v.loewis.de  Wed Sep  5 17:49:10 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 05 Sep 2007 17:49:10 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<46DE3DB8.6000004@v.loewis.de>
	<07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46DECFF6.4040107@v.loewis.de>

>> Ok. I think this will still not support multi-valued RDNs properly, but
>> those are uncommon in PKI.
> 
> I'm not sure why not...  Can you say more?

See the example certificates. If you get (('cn','a'),('email','b')),
you can't tell whether that's two single-valued RDNs in a DN,
or one multi-valued RDN with two attribute/value pairs.
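
To make the ambiguity concrete, here is a minimal sketch. The nested-list
model of a DN and the helper name `flatten` are assumptions for illustration,
not anything the SSL module provides:

```python
# Two structurally different names, modelled as a sequence of RDNs where
# each RDN is a list of (type, value) pairs.
two_single_valued_rdns = [[('cn', 'a')], [('email', 'b')]]
one_multi_valued_rdn = [[('cn', 'a'), ('email', 'b')]]


def flatten(dn):
    """Collapse a DN into a flat tuple of pairs, dropping RDN boundaries."""
    return tuple(pair for rdn in dn for pair in rdn)


# Both names collapse to the same flat tuple, so the difference is lost.
assert flatten(two_single_valued_rdns) == flatten(one_multi_valued_rdn)
print(flatten(one_multi_valued_rdn))
```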

Regards,
Martin

From martin at v.loewis.de  Wed Sep  5 18:05:27 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 05 Sep 2007 18:05:27 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep5.084418pdt."57996"@synergy1.parc.xerox.com>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<07Sep4.195817pdt."57996"@synergy1.parc.xerox.com>
	<46DE431B.20403@v.loewis.de>
	<07Sep5.084418pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46DED3C7.3050308@v.loewis.de>

> The hierarchy information does not appear to be preserved.

But it only appears so. OpenSSL does not know how to render it
properly (hence I say it is not very common in PKI), but they
started supporting that when generating certificates, with the
-multivalue-rdn option for req, and if you do

openssl asn1parse -in ca1.crt

you see that they differ:

(ca1)
l=  17 cons: SEQUENCE
l=  10 prim: OBJECT            :domainComponent
l=   3 prim: IA5STRING         :org
l=  22 cons: SET
l=  20 cons: SEQUENCE
l=  10 prim: OBJECT            :domainComponent
l=   6 prim: IA5STRING         :python
l=  12 cons: SET
l=  10 cons: SEQUENCE
l=   3 prim: OBJECT            :commonName
l=   3 prim: PRINTABLESTRING   :foo
l=  12 cons: SET
l=  10 cons: SEQUENCE
l=   3 prim: OBJECT            :commonName
l=   3 prim: PRINTABLESTRING   :bar

(ca2)
l=  17 cons: SEQUENCE
l=  10 prim: OBJECT            :domainComponent
l=   3 prim: IA5STRING         :org
l=  22 cons: SET
l=  20 cons: SEQUENCE
l=  10 prim: OBJECT            :domainComponent
l=   6 prim: IA5STRING         :python
l=  26 cons: SET
l=  11 cons: SEQUENCE
l=   3 prim: OBJECT            :commonName
l=   4 prim: PRINTABLESTRING   :bar2
l=  11 cons: SEQUENCE
l=   3 prim: OBJECT            :commonName
l=   4 prim: PRINTABLESTRING   :foo2

In the first case, foo and bar are in different sets, in the
second case, they are in the same set.

For people concerned about security, that makes a difference.

If OpenSSL actually supports that in its APIs, my proposal
would be to make a multi-valued RDN a more-than-two-tuple,
e.g.

(('DC','org'),('DC','python'),('CN','bar2','CN','foo2'))

That would make it possible to distinguish the names (pun
intended), yet still not produce structural overhead for
the normal case of single-valued RDNs.
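
To illustrate, here is a minimal sketch of how a consumer could tell
the two cases apart under this flat representation. The helper name
rdn_attributes is hypothetical and not part of any existing module.

```python
def rdn_attributes(rdn):
    """Split one RDN tuple of alternating type/value items into pairs.

    Hypothetical helper: a single-valued RDN yields one pair, a
    multi-valued RDN yields several.
    """
    if len(rdn) % 2 != 0:
        raise ValueError("RDN must alternate attribute type and value")
    return [(rdn[i], rdn[i + 1]) for i in range(0, len(rdn), 2)]


# The example DN from the proposal above.
dn = (('DC', 'org'), ('DC', 'python'), ('CN', 'bar2', 'CN', 'foo2'))

for rdn in dn:
    pairs = rdn_attributes(rdn)
    kind = 'multi-valued' if len(pairs) > 1 else 'single-valued'
    print(kind, pairs)
```

Code that only cares about single-valued RDNs can keep unpacking
two-tuples and treat anything longer as unsupported.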

Regards,
Martin

From martin at v.loewis.de  Wed Sep  5 18:06:53 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Sep 2007 18:06:53 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep5.084813pdt."57996"@synergy1.parc.xerox.com>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<07Sep4.195817pdt."57996"@synergy1.parc.xerox.com>
	<46DE431B.20403@v.loewis.de>
	<07Sep5.084813pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46DED41D.8010506@v.loewis.de>

> % openssl x509 -subject -noout -in attachment-0002.crt 
> subject= /DC=org/DC=python/CN=foo/CN=bar
> % openssl x509 -subject -noout -in attachment-0003.crt 
> subject= /DC=org/DC=python/CN=bar2/CN=foo2

Well, that's the same bug that John Nagle complains about.
This output is incorrect.

Regards,
Martin

From janssen at parc.com  Wed Sep  5 18:12:34 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 5 Sep 2007 09:12:34 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <46DECFF6.4040107@v.loewis.de> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<46DE3DB8.6000004@v.loewis.de>
	<07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>
	<46DECFF6.4040107@v.loewis.de>
Message-ID: <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com>

> See the example certificates. If you get (('cn','a'),('email','b')),
> you can't tell whether that's two single-valued RDNs in a DN,
> or one multi-valued RDN with two attribute/value pairs.

Yup, got it.  I don't see a way in the OpenSSL library functions I'm
using (X509_NAME_ENTRY_get_object, X509_NAME_ENTRY_get_data) to
distinguish between different RDNs, but I'll take a look at the source
for X509_NAME_print_ex, which does seem to be able to do this.

Bill

From janssen at parc.com  Wed Sep  5 18:26:42 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 5 Sep 2007 09:26:42 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<46DE3DB8.6000004@v.loewis.de>
	<07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>
	<46DECFF6.4040107@v.loewis.de>
	<07Sep5.091238pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com>

> Yup, got it.  I don't see a way in the OpenSSL library functions I'm
> using (X509_NAME_ENTRY_get_object, X509_NAME_ENTRY_get_data) to
> distinguish between different RDNs, but I'll take a look at the source
> for X509_NAME_print_ex, which does seem to be able to do this.

There's a field on the X509_NAME_ENTRY struct which gives the level.
OK, I can make it a tuple (list of RDNs) of tuples (one for each RDN)
of tuples (one for each attribute in the RDN).  And maybe add a
flatten function to the ssl.py module :-).
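
A rough sketch of what such a flatten function might look like,
assuming the three-level layout just described. The name flatten and
its signature are only an assumption; nothing like it exists in ssl.py
yet.

```python
def flatten(name):
    """Collapse a tuple of RDNs into one flat tuple of (type, value) pairs.

    Each RDN is itself a tuple of (type, value) pairs. The RDN
    boundaries are deliberately thrown away here.
    """
    return tuple(attr for rdn in name for attr in rdn)


# Two single-valued RDNs followed by one multi-valued RDN.
nested = ((('DC', 'org'),),
          (('DC', 'python'),),
          (('CN', 'bar2'), ('CN', 'foo2')))

flat = flatten(nested)
print(flat)
```

The flat form is convenient for lookups, but it can no longer
distinguish a multi-valued RDN from two single-valued ones.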

Bill

From janssen at parc.com  Wed Sep  5 18:27:04 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 5 Sep 2007 09:27:04 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <46DED41D.8010506@v.loewis.de> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<07Sep4.195817pdt."57996"@synergy1.parc.xerox.com>
	<46DE431B.20403@v.loewis.de>
	<07Sep5.084813pdt."57996"@synergy1.parc.xerox.com>
	<46DED41D.8010506@v.loewis.de>
Message-ID: <07Sep5.092704pdt."57996"@synergy1.parc.xerox.com>

> Well, that's the same bug that John Nagle complains about.

Yes, I agree.

Bill

From martin at v.loewis.de  Wed Sep  5 19:24:35 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Sep 2007 19:24:35 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<46DE3DB8.6000004@v.loewis.de>
	<07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>
	<46DECFF6.4040107@v.loewis.de>
	<07Sep5.091238pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.092643pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46DEE653.3000909@v.loewis.de>

> There's a field on the X509_NAME_ENTRY struct which gives the level.
> OK, I can make it a tuple (list of RDNs) of tuples (one for each RDN)
> of tuples (one for each attribute in the RDN).  And maybe add a
> flatten function to the ssl.py module :-).
> 

See my other proposal as well. As nobody actually uses multi-valued
RDNs, an option would be to make a single tuple for each RDN, containing
all attributes, alternating type and value. Then, a single-valued
RDN would turn out as a key-value pair (two-tuple), and a multi-valued
RDN would have a length of 2*number-of-attributes.

As for accessor functions, I'd then rather see a get_attr_by_type,
returning a list of all values of attributes of that type, across
all RDNs in the DN (empty if no attribute was found). People would then
do

x = get_attr_by_type(subj, ssl.commonName)
if len(x) != 1:
   unsupported_certificate()
CN = x[0]
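
A minimal sketch of such an accessor over the flat per-RDN tuples,
under the assumption that attribute types are plain strings. The
constant ssl.commonName used above does not exist yet, so the sketch
passes 'CN' directly.

```python
def get_attr_by_type(name, attr_type):
    """Return all values of attr_type across all RDNs in the DN.

    Each RDN is a tuple alternating type and value, so a multi-valued
    RDN simply has more than two items. Returns [] if nothing matches.
    """
    values = []
    for rdn in name:
        # Types sit at even positions, values at the following odd ones.
        for i in range(0, len(rdn), 2):
            if rdn[i] == attr_type:
                values.append(rdn[i + 1])
    return values


subj = (('DC', 'org'), ('DC', 'python'), ('CN', 'bar2', 'CN', 'foo2'))
print(get_attr_by_type(subj, 'CN'))
print(get_attr_by_type(subj, 'O'))
```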

Regards,
Martin

From janssen at parc.com  Wed Sep  5 19:49:09 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 5 Sep 2007 10:49:09 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<46DE3DB8.6000004@v.loewis.de>
	<07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>
	<46DECFF6.4040107@v.loewis.de>
	<07Sep5.091238pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.092643pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep5.104910pdt."57996"@synergy1.parc.xerox.com>

> OK, I can make it a tuple (list of RDNs) of tuples (one for each RDN)
> of tuples (one for each attribute in the RDN).

Which gets us to this:

{'issuer': ((('countryName', u'US'),),
            (('stateOrProvinceName', u'Delaware'),),
            (('localityName', u'Wilmington'),),
            (('organizationName', u'Python Software Foundation'),),
            (('organizationalUnitName', u'SSL'),),
            (('commonName', u'somemachine.python.org'),)),
 'notAfter': 'Feb 16 16:54:50 2013 GMT',
 'notBefore': 'Aug 27 16:54:50 2007 GMT',
 'subject': ((('countryName', u'US'),),
             (('stateOrProvinceName', u'Delaware'),),
             (('localityName', u'Wilmington'),),
             (('organizationName', u'Python Software Foundation'),),
             (('organizationalUnitName', u'SSL'),),
             (('commonName', u'somemachine.python.org'),)),
 'version': 2}

and

{'issuer': ((('countryName', u'US'),),
            (('organizationName', u'VeriSign, Inc.'),),
            (('organizationalUnitName', u'VeriSign Trust Network'),),
            (('organizationalUnitName',
              u'Terms of use at https://www.verisign.com/rpa (c)06'),),
            (('commonName',
              u'VeriSign Class 3 Extended Validation SSL SGC CA'),)),
 'notAfter': 'May  8 23:59:59 2009 GMT',
 'notBefore': 'May  9 00:00:00 2007 GMT',
 'subject': ((('serialNumber', u'2497886'),),
             (('1.3.6.1.4.1.311.60.2.1.3', u'US'),),
             (('1.3.6.1.4.1.311.60.2.1.2', u'Delaware'),),
             (('countryName', u'US'),),
             (('postalCode', u'94043'),),
             (('stateOrProvinceName', u'California'),),
             (('localityName', u'Mountain View'),),
             (('streetAddress', u'487 East Middlefield Road'),),
             (('organizationName', u'VeriSign, Inc.'),),
             (('organizationalUnitName', u'Production Security Services'),),
             (('organizationalUnitName',
               u'Terms of use at www.verisign.com/rpa (c)06'),),
             (('commonName', u'www.verisign.com'),)),
 'version': 2}

Ugly, but accurate.  Or is it?  Do you really think that
"serialNumber" is at the top of a naming tree somewhere?

Bill

From martin at v.loewis.de  Wed Sep  5 20:31:27 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Sep 2007 20:31:27 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep5.104910pdt."57996"@synergy1.parc.xerox.com>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<46DE3DB8.6000004@v.loewis.de>
	<07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>
	<46DECFF6.4040107@v.loewis.de>
	<07Sep5.091238pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.092643pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.104910pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46DEF5FF.8040602@v.loewis.de>

>  'subject': ((('serialNumber', u'2497886'),),
>              (('1.3.6.1.4.1.311.60.2.1.3', u'US'),),
>              (('1.3.6.1.4.1.311.60.2.1.2', u'Delaware'),),
>              (('countryName', u'US'),),
>              (('postalCode', u'94043'),),
>              (('stateOrProvinceName', u'California'),),
>              (('localityName', u'Mountain View'),),
>              (('streetAddress', u'487 East Middlefield Road'),),
>              (('organizationName', u'VeriSign, Inc.'),),
>              (('organizationalUnitName', u'Production Security Services'),),
>              (('organizationalUnitName',
>                u'Terms of use at www.verisign.com/rpa (c)06'),),
>              (('commonName', u'www.verisign.com'),)),
>  'version': 2}
> 
> Ugly, but accurate.  Or is it?  Do you really think that
> "serialNumber" is at the top of a naming tree somewhere?

Firefox claims the same order. Too bad Verisign hasn't grasped
the concept of distinguished names :-(

Had they done it right, incorporationStateId, incorporationLocalityId,
streetAddress, localityName, postalCode would all have been in the
RDN with organizationName - they are all attributes of that
organization (or the address attributes perhaps belong to the OU).
Also, I doubt they have an organizationalUnit "Terms of use at ...".

Regards,
Martin

From skip at pobox.com  Wed Sep  5 20:47:17 2007
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 5 Sep 2007 13:47:17 -0500
Subject: [Python-Dev] Errors in the csv module reader/writer methods - new
	w/ change to rst?
Message-ID: <18142.63925.900009.894021@montanaro.dyndns.org>

I was just looking for some csv DictWriter examples for a colleague at work
and was myself confused by the apparent transformation which took place in
the Reader Objects and Writer Objects sections.  Each of the methods is now
prefixed by "csv.csvreader." or "csv.csvwriter."  Neither expression was
previously defined in that section.

Those undefined expressions are not in the old versions of the
documentation, e.g.:

    http://www.python.org/doc/2.4.4/lib/node634.html
    http://www.python.org/doc/2.4.4/lib/node635.html
    http://www.python.org/doc/2.5/lib/node265.html
    http://www.python.org/doc/2.5/lib/node266.html

I don't think they add anything useful to the documentation.  Were they
added just for the csv module (and possibly a few others) or was this a
change to the entire libref documentation?  That is, was this some sort of
policy change?  Can they be removed?

Skip


From janssen at parc.com  Wed Sep  5 20:58:16 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 5 Sep 2007 11:58:16 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <46DEF5FF.8040602@v.loewis.de> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<46DE3DB8.6000004@v.loewis.de>
	<07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>
	<46DECFF6.4040107@v.loewis.de>
	<07Sep5.091238pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.092643pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.104910pdt."57996"@synergy1.parc.xerox.com>
	<46DEF5FF.8040602@v.loewis.de>
Message-ID: <07Sep5.115820pdt."57996"@synergy1.parc.xerox.com>

All of this makes me think that some folks may want to do more
processing on certificates with more advanced tools, and for that they
will need access to the full bits of the certificate.  I'll add the
ability to retrieve that as well.

I'm wondering if I should try to pull some extension attributes out of
the cert, and add them to the dict, as well.  Like subjectAltName, for
instance.  Or should we just wait till someone wants it and files a
bug report?

Bill

From martin at v.loewis.de  Wed Sep  5 21:10:52 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Sep 2007 21:10:52 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep5.115820pdt."57996"@synergy1.parc.xerox.com>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<46DE3DB8.6000004@v.loewis.de>
	<07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>
	<46DECFF6.4040107@v.loewis.de>
	<07Sep5.091238pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.092643pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.104910pdt."57996"@synergy1.parc.xerox.com>
	<46DEF5FF.8040602@v.loewis.de>
	<07Sep5.115820pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46DEFF3C.90306@v.loewis.de>

> I'm wondering if I should try to pull some extension attributes out of
> the cert, and add them to the dict, as well.  Like subjectAltName, for
> instance.  Or should we just wait till someone wants it and files a
> bug report?

If you have the time and inclination to do that, feel free to. Covering
some of the most widely used extensions could be useful: subjectAltName,
key usage, extended key usage. If you set up a framework for that,
people will contribute others they like to see supported.

Regards,
Martin

From db3l.net at gmail.com  Wed Sep  5 21:48:04 2007
From: db3l.net at gmail.com (David Bolen)
Date: Wed, 05 Sep 2007 15:48:04 -0400
Subject: [Python-Dev] x86 XP trunk failure
References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com>
	<46DEB04B.3090500@v.loewis.de>
Message-ID: <m2ps0wdhcb.fsf@valheru.db3l.homeip.net>

"Martin v. Löwis" <martin at v.loewis.de> writes:

>> My build slave (http://www.python.org/dev/buildbot/trunk/x86%20XP%20trunk)
>> keeps failing because of a crash that appears to be in the bsddb
>> module.  I assume the master deems the slave to be lost because it's
>> sitting there waiting on me to make a choice on the "debug/abort"
>> dialog box.
>
> What branch, and for how long has this dialog been sitting around?
>
> For crashes in 3.0, there should not be any such dialogs anymore, but
> there may have been before I turned them off.

I think there may actually be an issue here, if only with the tests,
even though 3.0 does suppress the dialog.  I think I started noticing
this in the first build after bringing my buildbot online, so I think
on Sep 1.  I had manually done a build on Aug 28 (running the
buildbot batch file interactively) without the problem, but I haven't
been able to find any relevant source tree changes in that interval.
Re-fetching from that date has the problem, and I had blown away my
older tree when starting up the buildbot officially (of course :-().

At least for me, it's happening on 2.5 and trunk (hard to tell about
3.0, but that's dying without a dialog), so I thought it might have
been something backported.  But it also appears common to more
platforms than just Windows - it's just Windows that pops up that
dialog.

In my case, the actual dialog doesn't pop up until the end of the
tests, and it seems to be occurring only if test_bsddb3 has run during
the tests.  On other platforms, it just shows up as a warning message,
which doesn't serve to mark the tests as failing (e.g., OS X and
FreeBSD) - at the end of the tests you get a message of:

warning: DBTxn aborted in destructor.  No prior commit() or abort().

which I tracked back to an abort() call within the bsddb library as
final destruction is happening at Python exit. (When clearing the
test_bsddb module, and the bsddb wrapper tries to access a log file
related to an open transaction).  So perhaps there's an issue with how
one or more of the tests are constructed, or cleanup or something.  I
haven't narrowed it down further yet though.

As with Alan, more details are available as needed.  While it seems to
show up in the full test run on more platforms, I have a harder time
forcing it by just running test_bsddb3 on FreeBSD, for example, while I
get the dialog consistently on Windows.

-- David


From db3l.net at gmail.com  Wed Sep  5 22:25:05 2007
From: db3l.net at gmail.com (David Bolen)
Date: Wed, 05 Sep 2007 16:25:05 -0400
Subject: [Python-Dev] x86 XP trunk failure
References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com>
	<46DEB04B.3090500@v.loewis.de>
	<m2ps0wdhcb.fsf@valheru.db3l.homeip.net>
Message-ID: <m2ir6odfmm.fsf@valheru.db3l.homeip.net>

I previously wrote:

> (...)
> which I tracked back to an abort() call within the bsddb library as
> final destruction is happening at Python exit. (When clearing the
> test_bsddb module, and the bsddb wrapper tries to access a log file
> related to an open transaction).  (...)

For those more familiar with bsddb, it's the test_1413192.py module in
lib/bsddb/test that tickles the problem.  It should have been more
obvious, since I saw the 1413192 in the module name during exit
cleanup, but mentally ignored it as an internal identifier of some
sort.

The test module clearly leaves an open transaction, but also purges
its working directory, so maybe that's why the log file is missing.
But since the test was specifically against object destruction, I'm
not sure how best to restructure (maybe make env_name into a class
that only prunes the directory in __del__?  Although that would affect
GC and thus destruction order too).

This test has been around a bit, but the pruning of the directory was
backported recently, which is probably the source of the problems.

-- David


From greg.ewing at canterbury.ac.nz  Wed Sep  5 22:40:22 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 06 Sep 2007 08:40:22 +1200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <ca471dc20709041455j515c887am350134216159848@mail.gmail.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
	<46DC91D2.7060407@canterbury.ac.nz>
	<ca471dc20709031957s19ad2e9ey90a783a2cae099cc@mail.gmail.com>
	<46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com>
	<ca471dc20709040738l5b5af2cex66b2b48eeb263077@mail.gmail.com>
	<46DDCD88.7080009@canterbury.ac.nz>
	<ca471dc20709041455j515c887am350134216159848@mail.gmail.com>
Message-ID: <46DF1436.4060404@canterbury.ac.nz>

Guido van Rossum wrote:
> By all means do write up a PEP -- it's hard to generalize from that one example.

I'll write a PEP as soon as I get a chance. But the
generalisation is pretty straightforward -- just replicate
that signature for each of the binary operations.

--
Greg

From martin at v.loewis.de  Wed Sep  5 23:18:34 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 05 Sep 2007 23:18:34 +0200
Subject: [Python-Dev] x86 XP trunk failure
In-Reply-To: <m2ps0wdhcb.fsf@valheru.db3l.homeip.net>
References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com>	<46DEB04B.3090500@v.loewis.de>
	<m2ps0wdhcb.fsf@valheru.db3l.homeip.net>
Message-ID: <46DF1D2A.8090506@v.loewis.de>

> warning: DBTxn aborted in destructor.  No prior commit() or abort().

I have seen these as well. bsddb isn't very forgiving when you have
a Python exception inside a bsddb transaction, in the test suite.
IIRC, the exception will abort the transaction, then the unittest
fixture teardown will close the environment, and that will cause
a bsddb crash because something is getting released that does not
exist anymore. When I last looked at it, I did not see an easy way to
fix it; contributions are welcome.

Regards,
Martin

From db3l.net at gmail.com  Thu Sep  6 00:35:10 2007
From: db3l.net at gmail.com (David Bolen)
Date: Wed, 05 Sep 2007 18:35:10 -0400
Subject: [Python-Dev] x86 XP trunk failure
References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com>
	<46DEB04B.3090500@v.loewis.de>
	<m2ps0wdhcb.fsf@valheru.db3l.homeip.net>
	<46DF1D2A.8090506@v.loewis.de>
Message-ID: <m2ejhcd9lt.fsf@valheru.db3l.homeip.net>

"Martin v. Löwis" <martin at v.loewis.de> writes:

>> warning: DBTxn aborted in destructor.  No prior commit() or abort().
>
> I have seen these as well. bsddb isn't very forgiving when you have
> a Python exception inside a bsddb transaction, in the test suite.
> IIRC, the exception will abort the transaction, then the unittest
> fixture teardown will close the environment, and that will cause
> a bsddb crash because something is getting released that does not
> exist anymore. When I last looked at it, I did not see an easy way to
> fix it; contributions are welcome.

One thing I tried that seems to work fairly well for this case is to
encapsulate much of the module-level code in the test into a class
instance.  That way the module-level code can instantiate and destroy
the class instance rather than waiting for the interpreter exit for
the latter.
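
Roughly, the shape of that approach is as follows. This is a generic
sketch, not the actual test code: the Context class and its attributes
are made-up names, and a temporary directory stands in for the bsddb
environment and transaction.

```python
import os
import shutil
import tempfile


class Context:
    """Owns the working directory and anything opened inside it."""

    def __init__(self):
        self.env_dir = tempfile.mkdtemp()
        self.closed = False

    def close(self):
        # Release resources first, then remove the directory they live
        # in, so nothing is left to be finalized at interpreter exit.
        if not self.closed:
            if os.path.isdir(self.env_dir):
                shutil.rmtree(self.env_dir)
            self.closed = True


# Module-level code creates and destroys the instance explicitly.
context = Context()
try:
    pass  # the body of the test would run here
finally:
    context.close()
```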

It definitely resolves this current issue, but when I reverted the
changes to _bsddb.c that were originally made in conjunction with this
test, it still seemed to pass the test.  So I tried the reverted
module with the original test code and it still passes.

So I'm not entirely sure that the test is enforcing anything at this
point, or at least I'm not sure how to be absolutely positive that the
change will continue to enforce what the existing code used to test.

But I can open a ticket with the proposed changes if that would help.

-- David


From db3l.net at gmail.com  Thu Sep  6 01:01:43 2007
From: db3l.net at gmail.com (David Bolen)
Date: Wed, 05 Sep 2007 19:01:43 -0400
Subject: [Python-Dev] x86 XP trunk failure
References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com>
	<46DEB04B.3090500@v.loewis.de>
	<m2ps0wdhcb.fsf@valheru.db3l.homeip.net>
	<46DF1D2A.8090506@v.loewis.de>
	<m2ejhcd9lt.fsf@valheru.db3l.homeip.net>
Message-ID: <m2abs0d8dk.fsf@valheru.db3l.homeip.net>

I wrote:

> But I can open a ticket with the proposed changes if that would help.

Figure it can't hurt - I've created issue 1112 with the proposed patch
to the test_1413192.py module.

-- David


From brett at python.org  Thu Sep  6 01:38:06 2007
From: brett at python.org (Brett Cannon)
Date: Wed, 5 Sep 2007 16:38:06 -0700
Subject: [Python-Dev] Google spreadsheet to collaborate on backporting Py3K
	stuff to 2.6
Message-ID: <bbaeab100709051638i7fedb693qea9fe0079e8f7fd1@mail.gmail.com>

Neal, Anthony, Thomas W., and I have a spreadsheet that was started to
keep track of what needs to be done in 2.6 for Py3K transitioning:
http://spreadsheets.google.com/pub?key=pCKY4oaXnT81FrGo3ShGHGg .  I am
opening the spreadsheet up to everyone so that others can help
maintain it.

There is a sheet in the Python 3000 Tasks spreadsheet that should be
merged into this spreadsheet and then deleted.  If anyone wants to
help with that it would be great (once something has been moved from
"Python 3000 Tasks" to "Python 2 -> 3 transition" just delete it from
"Python 3000 Tasks").

Because Neal created this spreadsheet he is the only one who can open
editing to everyone.  If you would like to have edit abilities to the
spreadsheet just reply to this email saying you want an invite and I
will add you manually (and if you want a different address added just
say so).

-Brett

From janssen at parc.com  Thu Sep  6 05:03:58 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 5 Sep 2007 20:03:58 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <46DEFF3C.90306@v.loewis.de> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<46DE3DB8.6000004@v.loewis.de>
	<07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>
	<46DECFF6.4040107@v.loewis.de>
	<07Sep5.091238pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.092643pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.104910pdt."57996"@synergy1.parc.xerox.com>
	<46DEF5FF.8040602@v.loewis.de>
	<07Sep5.115820pdt."57996"@synergy1.parc.xerox.com>
	<46DEFF3C.90306@v.loewis.de>
Message-ID: <07Sep5.200401pdt."57996"@synergy1.parc.xerox.com>

> > I'm wondering if I should try to pull some extension attributes out of
> > the cert, and add them to the dict, as well.  Like subjectAltName, for
> > instance.  Or should we just wait till someone wants it and files a
> > bug report?
> 
> If you have the time and inclination to do that, feel free to. Covering
> some of the most widely used extensions could be useful: subjectAltName,
> key usage, extended key usage. If you set up a framework for that,
> people will contribute others they like to see supported.

It's actually easier to do all or nothing.  I'm tempted to just report
'critical' extensions.

Bill

From janssen at parc.com  Thu Sep  6 05:52:08 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 5 Sep 2007 20:52:08 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep5.200401pdt."57996"@synergy1.parc.xerox.com> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<46DE3DB8.6000004@v.loewis.de>
	<07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>
	<46DECFF6.4040107@v.loewis.de>
	<07Sep5.091238pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.092643pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.104910pdt."57996"@synergy1.parc.xerox.com>
	<46DEF5FF.8040602@v.loewis.de>
	<07Sep5.115820pdt."57996"@synergy1.parc.xerox.com>
	<46DEFF3C.90306@v.loewis.de>
	<07Sep5.200401pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep5.205217pdt."57996"@synergy1.parc.xerox.com>

> > > I'm wondering if I should try to pull some extension attributes out of
> > > the cert, and add them to the dict, as well.  Like subjectAltName, for
> > > instance.  Or should we just wait till someone wants it and files a
> > > bug report?
> > 
> > If you have the time and inclination to do that, feel free to. Covering
> > some of the most widely used extensions could be useful: subjectAltName,
> > key usage, extended key usage. If you set up a framework for that,
> > people will contribute others they like to see supported.
> 
> It's actually easier to do all or nothing.  I'm tempted to just report
> 'critical' extensions.

Simpler to provide them all, though I should note that the purpose of
the information provided here is mainly for authorization/accounting
purposes, not for "other" use of the certificate.  If that's desired,
they should pull the binary form of the certificate (there's an
interface for that), and use M2Crypto or PyOpenSSL to decode it in
general.  This certificate has already been validated; the issue is
how to get critical information to the app so it can make
authorization decisions (like subjectAltName when the subject field is
empty).  Reporting non-critical extensions like "extended key usage"
is nifty, but seems pointless.

Here's an example:

      {'extensions': {'Netscape Cert Type': u'SSL Server'},
       'issuer': ((('countryName', u'US'),),
                  (('stateOrProvinceName', u'Delaware'),),
                  (('localityName', u'Wilmington'),),
                  (('organizationName', u'Python Software Foundation'),),
                  (('organizationalUnitName', u'SSL'),),
                  (('commonName', u'somemachine.python.org'),)),
       'notAfter': 'Feb 16 16:54:50 2013 GMT',
       'notBefore': 'Aug 27 16:54:50 2007 GMT',
       'serialNumber': 'FFAA4ADBF570818D',
       'subject': ((('countryName', u'US'),),
                   (('stateOrProvinceName', u'Delaware'),),
                   (('localityName', u'Wilmington'),),
                   (('organizationName', u'Python Software Foundation'),),
                   (('organizationalUnitName', u'SSL'),),
                   (('commonName', u'somemachine.python.org'),)),
       'version': 3}

and

      {'extensions': {'1.3.6.1.5.5.7.1.12': u'',
                      'Authority Information Access': u'OCSP - URI:http://EVIntl-ocsp.verisign.com\n',
                      'X509v3 Authority Key Identifier': u'keyid:4E:43:C8:1D:76:EF:37:53:7A:4F:F2:58:6F:94:F3:38:E2:D5:BD:DF\n',
                      'X509v3 Basic Constraints': u'CA:FALSE',
                      'X509v3 CRL Distribution Points': u'URI:http://EVIntl-crl.verisign.com/EVIntl2006.crl\n',
                      'X509v3 Certificate Policies': u'Policy: 2.16.840.1.113733.1.7.23.6\n',
                      'X509v3 Extended Key Usage': u'TLS Web Server Authentication, TLS Web Client Authentication, Netscape Server Gated Crypto, Microsoft Server Gated Crypto',
                      'X509v3 Key Usage': u'Digital Signature, Key Encipherment',
                      'X509v3 Subject Key Identifier': u'F1:5A:89:93:55:47:4B:BA:51:F5:4E:E0:CB:16:55:F4:D7:CC:38:67'},
       'issuer': ((('countryName', u'US'),),
                  (('organizationName', u'VeriSign, Inc.'),),
                  (('organizationalUnitName', u'VeriSign Trust Network'),),
                  (('organizationalUnitName',
                    u'Terms of use at https://www.verisign.com/rpa (c)06'),),
                  (('commonName',
                    u'VeriSign Class 3 Extended Validation SSL SGC CA'),)),
       'notAfter': 'May  8 23:59:59 2009 GMT',
       'notBefore': 'May  9 00:00:00 2007 GMT',
       'serialNumber': '6A4AC31B3110E6EB48F0FC51A39A171F',
       'subject': ((('serialNumber', u'2497886'),),
                   (('1.3.6.1.4.1.311.60.2.1.3', u'US'),),
                   (('1.3.6.1.4.1.311.60.2.1.2', u'Delaware'),),
                   (('countryName', u'US'),),
                   (('postalCode', u'94043'),),
                   (('stateOrProvinceName', u'California'),),
                   (('localityName', u'Mountain View'),),
                   (('streetAddress', u'487 East Middlefield Road'),),
                   (('organizationName', u'VeriSign, Inc.'),),
                   (('organizationalUnitName', u'Production Security Services'),),
                   (('organizationalUnitName',
                     u'Terms of use at www.verisign.com/rpa (c)06'),),
                   (('commonName', u'www.verisign.com'),)),
       'version': 3}

Probably another thing that *should* be reported is the cipher used to
protect the information on the channel, so that the app can decide
whether it's strong enough for its taste.  (If it's not, it can
presumably reconnect using a different variant of SSL to try for a
better result, or decide not to use the server (or talk to the client)
at all.)

Bill

From martin at v.loewis.de  Thu Sep  6 08:46:50 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 06 Sep 2007 08:46:50 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep5.205217pdt."57996"@synergy1.parc.xerox.com>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<46DE3DB8.6000004@v.loewis.de>
	<07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>
	<46DECFF6.4040107@v.loewis.de>
	<07Sep5.091238pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.092643pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.104910pdt."57996"@synergy1.parc.xerox.com>
	<46DEF5FF.8040602@v.loewis.de>
	<07Sep5.115820pdt."57996"@synergy1.parc.xerox.com>
	<46DEFF3C.90306@v.loewis.de>
	<07Sep5.200401pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.205217pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46DFA25A.1070901@v.loewis.de>

>> It's actually easier to do all or nothing.  I'm tempted to just report
>> 'critical' extensions.
> 
> Simpler to provide them all

I very much doubt that, at least if you want to report decoded
information. Conceptually, there is an infinite number of extensions,
and when you are done, I can show you lots of certificates that
have extensions that you don't support.

> This certificate has already been validated; the issue is
> how to get critical information to the app so it can make
> authorization decisions (like subjectAltName when the subject field is
> empty)

>       {'extensions': {'1.3.6.1.5.5.7.1.12': u'',
>                       'Authority Information Access': u'OCSP - URI:http://EVIntl-ocsp.verisign.com\n',
>                       'X509v3 Authority Key Identifier': u'keyid:4E:43:C8:1D:76:EF:37:53:7A:4F:F2:58:6F:94:F3:38:E2:D5:BD:DF\n',
>                       'X509v3 Basic Constraints': u'CA:FALSE',
>                       'X509v3 CRL Distribution Points': u'URI:http://EVIntl-crl.verisign.com/EVIntl2006.crl\n',
>                       'X509v3 Certificate Policies': u'Policy: 2.16.840.1.113733.1.7.23.6\n',
>                       'X509v3 Extended Key Usage': u'TLS Web Server Authentication, TLS Web Client Authentication, Netscape Server Gated Crypto, Microsoft Server Gated Crypto',
>                       'X509v3 Key Usage': u'Digital Signature, Key Encipherment',
>                       'X509v3 Subject Key Identifier': u'F1:5A:89:93:55:47:4B:BA:51:F5:4E:E0:CB:16:55:F4:D7:CC:38:67'},

Hmm. In this certificate, none of the extensions you report have been
marked critical; they are all non-critical.

Also, you are reporting the logotype (1.3.6.1.5.5.7.1.12) incorrectly;
it's defined in RFC 3709, and it's definitely not an empty string in
the certificate you've used.

Regards,
Martin

From janssen at parc.com  Thu Sep  6 18:11:57 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 6 Sep 2007 09:11:57 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <46DFA25A.1070901@v.loewis.de> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>
	<46DDCD7C.40004@v.loewis.de>
	<07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>
	<46DE3DB8.6000004@v.loewis.de>
	<07Sep5.081717pdt."57996"@synergy1.parc.xerox.com>
	<46DECFF6.4040107@v.loewis.de>
	<07Sep5.091238pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.092643pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.104910pdt."57996"@synergy1.parc.xerox.com>
	<46DEF5FF.8040602@v.loewis.de>
	<07Sep5.115820pdt."57996"@synergy1.parc.xerox.com>
	<46DEFF3C.90306@v.loewis.de>
	<07Sep5.200401pdt."57996"@synergy1.parc.xerox.com>
	<07Sep5.205217pdt."57996"@synergy1.parc.xerox.com>
	<46DFA25A.1070901@v.loewis.de>
Message-ID: <07Sep6.091202pdt."57996"@synergy1.parc.xerox.com>

> I very much doubt that, at least if you want to report decoded
> information. Conceptually, there is an infinite number of extensions,
> and when you are done, I can show you lots of certificates that
> have extensions that you don't support.

I'm not going to decode anything; I'm just using the OpenSSL
functionality and providing whatever it provides.

> Hmm. In this certificate, none of the extensions you report have been
> marked critical; they are all non-critical.

That's what I meant by "simpler to show everything".

> Also, you are reporting the logotype (1.3.6.1.5.5.7.1.12) incorrectly;
> it's defined in RFC 3709, and it's definitely not an empty string in
> the certificate you've used.

Yes, I see.  I'll poke at the OpenSSL code harder :-).

Bill



From radix at twistedmatrix.com  Thu Sep  6 18:50:57 2007
From: radix at twistedmatrix.com (Christopher Armstrong)
Date: Thu, 6 Sep 2007 12:50:57 -0400
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <-1936579380892715012@unknownmsgid>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
Message-ID: <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>

On 9/5/07, Bill Janssen <janssen at parc.com> wrote:
> > It's actually easier to do all or nothing.  I'm tempted to just report
> > 'critical' extensions.
>
> Simpler to provide them all, though I should note that the purpose of
> the information provided here is mainly for authorization/accounting
> purposes, not for "other" use of the certificate.  If that's desired,
> they should pull the binary form of the certificate (there's an
> interface for that), and use M2Crypto or PyOpenSSL to decode it in
> general.  This certificate has already been validated; the issue is
> how to get critical information to the app so it can make
> authorization decisions (like subjectAltName when the subject field is
> empty).  Reporting non-critical extensions like "extended key usage"
> is nifty, but seems pointless.


RFC 2818

"""If a subjectAltName extension of type dNSName is present, that MUST
be used as the identity. Otherwise, the (most specific) Common Name
field in the Subject field of the certificate MUST be used. Although
the use of the Common Name is existing practice, it is deprecated and
Certification Authorities are encouraged to use the dNSName instead.
"""

This is from an explanation of how to do hostname verification when
doing HTTPS requests. HTTPS clients MUST do this in order to be
compliant. Is an HTTPS client not in your list of use cases?

"""In general, HTTP/TLS requests are generated by dereferencing a URI.
As a consequence, the hostname for the server is known to the client.
If the hostname is available, the client MUST check it against the
server's identity as presented in the server's Certificate message, in
order to prevent man-in-the-middle attacks."""

I really don't understand why you would not expose all data in the
certificate. It seems totally obvious. The data is there for a reason.
I want the subjectAltName. Probably other people want other stuff. Why
cripple it? Please include it all.

-- 
Christopher Armstrong
International Man of Twistery
http://radix.twistedmatrix.com/
http://twistedmatrix.com/
http://canonical.com/

From martin at v.loewis.de  Thu Sep  6 19:03:32 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 06 Sep 2007 19:03:32 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
References: <-4762611594645938717@unknownmsgid>	
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>	
	<46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>	
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>	
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
Message-ID: <46E032E4.6050300@v.loewis.de>

> I really don't understand why you would not expose all data in the
> certificate.

You mean, providing the entire certificate as a blob? That is planned
(although perhaps not implemented).

Or do you mean "expose all data in a structured manner". BECAUSE
IT'S NOT POSSIBLE. Sorry for shouting, but people don't ever get the
notion of "extension".

> It seems totally obvious. The data is there for a reason.
> I want the subjectAltName. Probably other people want other stuff. Why
> cripple it? Please include it all.

That's not possible. You can get the whole thing as a blob, and then
you have to decode it yourself if something you want is not decoded.

Regards,
Martin



From janssen at parc.com  Thu Sep  6 19:15:41 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 6 Sep 2007 10:15:41 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
Message-ID: <07Sep6.101542pdt."57996"@synergy1.parc.xerox.com>

> RFC 2818
> 
> """If a subjectAltName extension of type dNSName is present, that MUST
> be used as the identity. Otherwise, the (most specific) Common Name
> field in the Subject field of the certificate MUST be used. Although
> the use of the Common Name is existing practice, it is deprecated and
> Certification Authorities are encouraged to use the dNSName instead.
> """

Yes, subjectAltName is a big one.  But I think it may be the only
extension I'll expose.  The issue is that I don't see a generic way
of mapping extension X into Python data structure Y; each one needs to
be handled specially.  If you can see a way around this, please speak
up!

> I really don't understand why you would not expose all data in the
> certificate. It seems totally obvious. The data is there for a reason.
> I want the subjectAltName. Probably other people want other stuff. Why
> cripple it? Please include it all.

I intend to "include it all", by giving you a way to pull the full DER
form of the certificate into Python.  But a number of fields in the
certificate have nothing to do with authorization, like the signature,
which has already been used for validation.  So I don't intend to try
to convert them into Python-friendly forms.  Applications which want to
use that information already need to have a more powerful library, like
M2Crypto or PyOpenSSL, available; they can simply work with the DER form
of the certificate.

Bill

From janssen at parc.com  Thu Sep  6 19:25:39 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 6 Sep 2007 10:25:39 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
Message-ID: <07Sep6.102547pdt."57996"@synergy1.parc.xerox.com>

By the way, I think the hostname matching provisions of 2818 (which
is, after all, only an informational RFC, not a standard) are poorly
thought out.  Many machines have more hostnames than you can shake a
stick at, and often provide certs with the wrong hostname in them
(usually because they have no way to determine what the *right*
hostname is, from inside that machine).

Bill

From glyph at divmod.com  Thu Sep  6 19:31:55 2007
From: glyph at divmod.com (glyph at divmod.com)
Date: Thu, 06 Sep 2007 17:31:55 -0000
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <46E032E4.6050300@v.loewis.de>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<46E032E4.6050300@v.loewis.de>
Message-ID: <20070906173155.21185.610867885.divmod.xquotient.7040@joule.divmod.com>

On 05:03 pm, martin at v.loewis.de wrote:
>>I really don't understand why you would not expose all data in the
>>certificate.
>
>You mean, providing the entire certificate as a blob? That is planned
>(although perhaps not implemented).
>
>Or do you mean "expose all data in a structured manner". BECAUSE
>IT'S NOT POSSIBLE. Sorry for shouting, but people don't ever get the
>notion of "extension".

"structure" is a relative term.  A typical way to deal with extensions 
unknown to the implementation is to provide ways to deal with the 
*extension-specific* parts of the data in question, cf. 
http://java.sun.com/j2se/1.4.2/docs/api/java/security/cert/X509Extension.html

Exposing the entire certificate object as a blob so that some *other* 
library could parse it *again* seems like just giving up.

However, as to the specific issue of subjectAltName which Chris first 
mentioned: if HTTPS isn't an important specification to take into 
account while designing an SSL layer for Python, then I can't imagine 
what is.  subjectAltName should be directly supported regardless of how 
it deals with unknown extensions.

>>It seems totally obvious. The data is there for a reason.
>>I want the subjectAltName. Probably other people want other stuff. Why
>>cripple it? Please include it all.

>That's not possible. You can get the whole thing as a blob, and then
>you have to decode it yourself if something you want is not decoded.

Something very much like that is certainly possible, and has been done 
in numerous other places (including the Java implementation linked 
above).  Providing a semantically rich interface to every possible X509 
extension is of course ridiculous, but I don't think that's what anyone 
is actually proposing here.

From glyph at divmod.com  Thu Sep  6 19:45:18 2007
From: glyph at divmod.com (glyph at divmod.com)
Date: Thu, 06 Sep 2007 17:45:18 -0000
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep6.101542pdt."57996"@synergy1.parc.xerox.com>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<07Sep6.101542pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <20070906174518.21185.1342025567.divmod.xquotient.7082@joule.divmod.com>

On 05:15 pm, janssen at parc.com wrote:
>>RFC 2818
>>
>>"""If a subjectAltName extension of type dNSName is present, that MUST
>>be used as the identity. Otherwise, the (most specific) Common Name
>>field in the Subject field of the certificate MUST be used. Although
>>the use of the Common Name is existing practice, it is deprecated and
>>Certification Authorities are encouraged to use the dNSName instead.
>>"""

>Yes, subjectAltName is a big one.  But I think it may be the only
>extension I'll expose.  The issue is that I don't see a generic way
>of mapping extension X into Python data structure Y; each one needs to
>be handled specially.  If you can see a way around this, please speak
>up!

Well, I can't speak for Chris, but that will certainly make *me* happier :).

>I intend to "include it all", by giving you a way to pull the full DER
>form of the certificate into Python.  But a number of fields in the
>certificate have nothing to do with authorization, like the signature,
>which has already been used for validation.  So I don't intend to try
>to convert them into Python-friendly forms.  Applications which want to
>use that information already need to have a more powerful library, like
>M2Crypto or PyOpenSSL, available; they can simply work with the DER form
>of the certificate.

When you say "the full DER form", are you simply referring to the full 
blob, or a broken-down representation by key and by extension?

This raises the question: M2Crypto and PyOpenSSL already do what you're 
proposing to do, as far as I can tell, and are, as you say, "more 
powerful".  There are issues with each (and issues with the GNU TLS 
bindings too, which I notice you didn't mention...)

Speaking of issues, PyOpenSSL, for example, does not expose 
subjectAltName :).

This has been a long thread, so I may have missed posts where this was 
already discussed, but even if I'm repeating this, I think it deserves 
to be beaten to death.  *Why* are you trying to bring the number of 
(potentially buggy, incomplete) Python SSL bindings to 4, rather than 
adopting one of the existing ones and implementing a simple wrapper on 
top of it?  PyOpenSSL, in particular, is both a popular de-facto 
standard *and* almost completely unmaintained; python's standard library 
could absorb/improve it with little fuss.

From janssen at parc.com  Thu Sep  6 20:15:16 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 6 Sep 2007 11:15:16 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <20070906174518.21185.1342025567.divmod.xquotient.7082@joule.divmod.com>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<07Sep6.101542pdt."57996"@synergy1.parc.xerox.com>
	<20070906174518.21185.1342025567.divmod.xquotient.7082@joule.divmod.com>
Message-ID: <07Sep6.111518pdt."57996"@synergy1.parc.xerox.com>

> When you say "the full DER form", are you simply referring to the full 
> blob, or a broken-down representation by key and by extension?

The full blob.

> This begs the question: M2Crypto and PyOpenSSL already do what you're 
> proposing to do, as far as I can tell, and are, as you say, "more 
> powerful".

I'm trying to give the application the ability to do some level of
authorization without requiring either of those packages.  Like being
able to tell who's on the other side of the connection :-).  Right
now, I think the right fields to expose are

  "subject" (I see little point to exposing "issuer"),

  "notAfter" (you're always guaranteed to be after "notBefore", or the
  cert wouldn't validate, so I see little point to exposing that, but
  "notAfter" can be used after the connection has been established),

  subjectAltName if present,

  and perhaps the certificate's serial number.

I don't see how the other fields in the cert can be profitably used.
Anything else you want, you can pull over the DER blob and look into
it.
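
To illustrate how "notAfter" might be used once the connection is up,
here is a minimal sketch.  It assumes the dictionary layout and the
timestamp format seen in the samples earlier in this thread;
cert_expired is a hypothetical helper, not something the ssl module
provides.

```python
from datetime import datetime

# Format of the 'notAfter' strings in the sample dictionaries
# (assumed to be fixed; the day of month may be space-padded).
NOT_AFTER_FORMAT = '%b %d %H:%M:%S %Y GMT'


def cert_expired(cert, now):
    """Return True if `now` (a naive UTC datetime) is past notAfter."""
    not_after = datetime.strptime(cert['notAfter'], NOT_AFTER_FORMAT)
    return now > not_after


sample = {'notAfter': 'May  8 23:59:59 2009 GMT'}
print(cert_expired(sample, datetime(2008, 1, 1)))
print(cert_expired(sample, datetime(2009, 5, 9)))
```

A long-lived connection could repeat this check periodically, since
validation at handshake time says nothing about later expiry.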

> PyOpenSSL, in particular, is both a popular de-facto 
> standard *and* almost completely unmaintained; python's standard library 
> could absorb/improve it with little fuss.

Good idea, go for it!  A full wrapper for OpenSSL is beyond the scope
of my ambition; I'm simply trying to add a simple fix to what's
already in the standard library.

Bill






From radix at twistedmatrix.com  Thu Sep  6 20:18:21 2007
From: radix at twistedmatrix.com (Christopher Armstrong)
Date: Thu, 6 Sep 2007 14:18:21 -0400
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <46E032E4.6050300@v.loewis.de>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<46E032E4.6050300@v.loewis.de>
Message-ID: <60ed19d40709061118k2755bbfdkecb3eae94cf22f93@mail.gmail.com>

On 9/6/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> You mean, providing the entire certificate as a blob? That is planned
> (although perhaps not implemented).
>
> Or do you mean "expose all data in a structured manner". BECAUSE
> IT'S NOT POSSIBLE. Sorry for shouting, but people don't ever get the
> notion of "extension".
>
> > It seems totally obvious. The data is there for a reason.
> > I want the subjectAltName. Probably other people want other stuff. Why
> > cripple it? Please include it all.
>
> That's not possible. You can get the whole thing as a blob, and then
> you have to decode it yourself if something you want is not decoded.

Sorry, I guess I thought it was obvious. Please let me get at the
bytes of just the unknown-to-ssl-module extension without forcing me
to write an entire general ASN.1 certificate parser or use another
(incomplete) one. Many extensions have simple data in them that is
trivial to parse alone.
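
As a rough illustration of "trivial to parse alone", here is a sketch
of a tiny DER reader that pulls dNSName entries out of a
subjectAltName value.  It assumes the caller already has the raw
GeneralNames bytes of the extension (how the ssl module would hand
those over is exactly what is undecided here), and read_tlv and
dns_names are hypothetical names.

```python
def read_tlv(data, offset=0):
    """Read one DER tag-length-value triple starting at `offset`.

    Returns (tag, value_bytes, next_offset).  Only single-byte tags
    are handled in this sketch.
    """
    tag = data[offset]
    if tag & 0x1F == 0x1F:
        raise ValueError('multi-byte tags are not handled in this sketch')
    length = data[offset + 1]
    offset += 2
    if length & 0x80:
        # Long form: the low 7 bits give the number of length bytes.
        count = length & 0x7F
        if count == 0:
            raise ValueError('indefinite length is not allowed in DER')
        length = int.from_bytes(data[offset:offset + count], 'big')
        offset += count
    end = offset + length
    if end > len(data):
        raise ValueError('truncated value')
    return tag, data[offset:end], end


def dns_names(general_names):
    """Extract dNSName entries from DER-encoded GeneralNames bytes."""
    tag, body, _ = read_tlv(general_names)
    if tag != 0x30:  # SEQUENCE
        raise ValueError('expected a SEQUENCE')
    names = []
    pos = 0
    while pos < len(body):
        tag, value, pos = read_tlv(body, pos)
        if tag == 0x82:  # context tag [2], primitive: dNSName
            names.append(value.decode('ascii'))
    return names


# Hand-built input: SEQUENCE { [2] "www", [2] "dfw" }
encoded = (bytes([0x30, 0x0A, 0x82, 0x03]) + b'www' +
           bytes([0x82, 0x03]) + b'dfw')
print(dns_names(encoded))
```

Other GeneralName choices (IP addresses, URIs and so on) are simply
skipped by this sketch.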

-- 
Christopher Armstrong
International Man of Twistery
http://radix.twistedmatrix.com/
http://twistedmatrix.com/
http://canonical.com/

From guido at python.org  Thu Sep  6 23:10:44 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 6 Sep 2007 14:10:44 -0700
Subject: [Python-Dev] Google spreadsheet to collaborate on backporting
	Py3K stuff to 2.6
In-Reply-To: <bbaeab100709051638i7fedb693qea9fe0079e8f7fd1@mail.gmail.com>
References: <bbaeab100709051638i7fedb693qea9fe0079e8f7fd1@mail.gmail.com>
Message-ID: <ca471dc20709061410s50d7848j6fd0d6f92d665d61@mail.gmail.com>

I've transferred everything from my spreadsheet to Neal's.

On 9/5/07, Brett Cannon <brett at python.org> wrote:
> Neal, Anthony, Thomas W., and I have a spreadsheet that was started to
> keep track of what needs to be done in 2.6
> for Py3K transitioning:
> http://spreadsheets.google.com/pub?key=pCKY4oaXnT81FrGo3ShGHGg .  I am
> opening the spreadsheet up to everyone so that others can help
> maintain it.
>
> There is a sheet in the Python 3000 Tasks spreadsheet that should be
> merged into this spreadsheet and then deleted.  If anyone wants to
> help with that it would be great (once something has been moved from
> "Python 3000 Tasks" to "Python 2 -> 3 transition" just delete it from
> "Python 3000 Tasks").
>
> Because Neal created this spreadsheet he is the only one who can open
> editing to everyone.  If you would like to have edit abilities to the
> spreadsheet just reply to this email saying you want an invite and I
> will add you manually (and if you want a different address added just
> say so).
>
> -Brett


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From janssen at parc.com  Thu Sep  6 23:55:06 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 6 Sep 2007 14:55:06 PDT
Subject: [Python-Dev] Python access to data fields of SSL connection peer
	certificate
Message-ID: <07Sep6.145509pdt."57996"@synergy1.parc.xerox.com>

After a great deal of discussion, under the Subject line of "frozenset
C API?" (you may have missed it :-), I'm coming to the conclusion that
in revealing the fields of an SSL certificate, less is more.

From one of the messages in that thread:

  I'm trying to give the application the ability to do some level of
  authorization without requiring either of those packages.  Like being
  able to tell who's on the other side of the connection :-).  Right
  now, I think the right fields to expose are

    "subject" (I see little point to exposing "issuer"),

    "notAfter" (you're always guaranteed to be after "notBefore", or the
    cert wouldn't validate, so I see little point to exposing that, but
    "notAfter" can be used after the connection has been established),

    subjectAltName if present,

    and perhaps the certificate's serial number.

Remember that the cert has already been validated, so I don't see how
the other fields in the cert can be profitably used for authorization
and/or accounting, which is the purpose of this interface.  Anything
else you want, you can pull over the DER blob and look into it with
some other crypto package; I'll provide a way to pull the full binary
form of the certificate into Python as a bytes string (as soon as the
bytes API gets backported into the trunk).

Under those rules, the samples in the current documentation would look
like

{'notAfter': 'May  8 23:59:59 2009 GMT',
 'serialNumber': '6A4AC31B3110E6EB48F0FC51A39A171F',
 'subject': ((('serialNumber', u'2497886'),),
             (('1.3.6.1.4.1.311.60.2.1.3', u'US'),),
             (('1.3.6.1.4.1.311.60.2.1.2', u'Delaware'),),
             (('countryName', u'US'),),
             (('postalCode', u'94043'),),
             (('stateOrProvinceName', u'California'),),
             (('localityName', u'Mountain View'),),
             (('streetAddress', u'487 East Middlefield Road'),),
             (('organizationName', u'VeriSign, Inc.'),),
             (('organizationalUnitName', u'Production Security Services'),),
             (('organizationalUnitName',
               u'Terms of use at www.verisign.com/rpa (c)06'),),
             (('commonName', u'www.verisign.com'),))}

and

{'notAfter': 'Feb 16 16:54:50 2013 GMT',
 'serialNumber': 'FFAA4ADBF570818D',
 'subject': ((('countryName', u'US'),),
             (('stateOrProvinceName', u'Delaware'),),
             (('localityName', u'Wilmington'),),
             (('organizationName', u'Python Software Foundation'),),
             (('organizationalUnitName', u'SSL'),),
             (('commonName', u'somemachine.python.org'),))}

The server cert at https://www.dcl.hpi.uni-potsdam.de/ would look like

{'notAfter': 'Mar 17 13:02:27 2008 GMT',
 'serialNumber': '2567F168000300000678',
 'subject': ((('countryName', u'DE'),),
             (('stateOrProvinceName', u'Brandenburg'),),
             (('localityName', u'Potsdam'),),
             (('organizationName', u'Hasso-Plattner-Institut'),),
             (('organizationalUnitName', u'Operating Systems & Middleware'),),
             (('commonName', u'www.dcl.hpi.uni-potsdam.de'),)),
 'subjectAltName': ('DNS:www.dcl.hpi.uni-potsdam.de',
                    'DNS:www',
                    'DNS:dfw',
                    'DNS:dfw.dcl.hpi.uni-potsdam.de',
                    'IP Address:141.89.224.164')}

Thanks to Martin for suggesting it.
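
For what it's worth, here is a minimal sketch of how an application
might pick the identity to check from dictionaries shaped like the
ones above, following the RFC 2818 rule quoted in the earlier thread.
server_identities is a hypothetical helper; the sketch assumes the
'DNS:' prefix shown in the sample and treats the last commonName as
the most specific one.  Wildcard matching is not covered.

```python
def server_identities(cert):
    """Return the names a hostname should be compared against."""
    dns = [entry[len('DNS:'):]
           for entry in cert.get('subjectAltName', ())
           if entry.startswith('DNS:')]
    if dns:
        # Any dNSName entry takes precedence over the Common Name.
        return dns
    # Fall back to the last (assumed most specific) commonName.
    common_names = [value
                    for rdn in cert.get('subject', ())
                    for key, value in rdn
                    if key == 'commonName']
    return common_names[-1:]


cert = {'subject': ((('countryName', 'DE'),),
                    (('commonName', 'www.dcl.hpi.uni-potsdam.de'),)),
        'subjectAltName': ('DNS:www.dcl.hpi.uni-potsdam.de',
                           'DNS:www',
                           'IP Address:141.89.224.164')}
print(server_identities(cert))
```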

Bill

From brett at python.org  Fri Sep  7 04:34:42 2007
From: brett at python.org (Brett Cannon)
Date: Thu, 6 Sep 2007 19:34:42 -0700
Subject: [Python-Dev] PEP 362: Signature objects
Message-ID: <bbaeab100709061934p53f8e3if432c996eab67da1@mail.gmail.com>

Neal Becker over on python-3000 said that the Boost people could use
this.  Figured it was time to present it officially to the list to see
if I can get it added for 2.6/3.0.

The implementation in the sandbox works in both 2.6 and 3.0 out of the
box (no 2to3 necessary) so feel free to play with it.

---------------------------------------------------------

Abstract
========

Python has always supported powerful introspection capabilities,
including that for functions and methods (for the rest of this PEP the
word "function" refers to both functions and methods).  Taking a
function object, you can fully reconstruct the function's signature.
Unfortunately it is a little unruly having to look at all the
different attributes to pull together complete information for a
function's signature.
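
To make the "unruly" part concrete, here is a minimal sketch of
gathering signature information by hand.  It uses the Python 3
attribute names (``__code__``, ``__defaults__``, ``__kwdefaults__``);
in 2.x the equivalents are ``func_code`` and ``func_defaults``.  The
``describe`` helper is hypothetical and is not part of the proposal.

```python
import inspect


def describe(func):
    """Collect signature details from a function's scattered attributes."""
    code = func.__code__
    names = code.co_varnames
    n_pos = code.co_argcount
    n_kwonly = code.co_kwonlyargcount
    info = {
        'name': func.__name__,
        'positional': names[:n_pos],
        'keyword_only': names[n_pos:n_pos + n_kwonly],
        'defaults': func.__defaults__,
        'kw_defaults': func.__kwdefaults__,
        'var_args': '',
        'var_kw_args': '',
    }
    # *args and **kwargs names follow the named parameters in
    # co_varnames, and are only present if the matching flag is set.
    index = n_pos + n_kwonly
    if code.co_flags & inspect.CO_VARARGS:
        info['var_args'] = names[index]
        index += 1
    if code.co_flags & inspect.CO_VARKEYWORDS:
        info['var_kw_args'] = names[index]
    return info


def sample(a, b=2, *args, c, **kw):
    pass


print(describe(sample))
```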

This PEP proposes an object representation for function signatures.
This should help facilitate introspection on functions for various
uses (e.g., decorators).  The introspection information contains all
possible information about the parameters in a signature (including
Python 3.0 features).

This object, though, is not meant to replace existing ways of
introspection on a function's signature.  The current solutions are
there to make Python's execution work in an efficient manner.  The
proposed object representation is only meant to make it easier for
application code to query a function about its signature.


Signature Object
================

The overall signature of an object is represented by the Signature
object.  This object is to store a `Parameter object`_ for each
parameter in the signature.  It is also to store any information
about the function itself that is pertinent to the signature.

A Signature object has the following structure:

* name : str
    Name of the function.  This is not fully qualified because
    function objects for methods do not know the class they are
    contained within.  This makes functions and methods
    indistinguishable from one another when passed to decorators,
    preventing proper creation of a fully qualified name.
* var_args : str
    Name of the variable positional parameter (i.e., ``*args``), if
    present, or the empty string.
* var_kw_args : str
    Name of the variable keyword parameter (i.e., ``**kwargs``), if
    present, or the empty string.
* var_annotations: dict(str, object)
    Dict that contains the annotations for the variable parameters.
    The keys are the names of the variable parameters and the values
    are their annotations.  If an annotation does not exist for a
    variable parameter then the key does not exist in the dict.
* parameters : list(Parameter)
    List of the parameters of the function as represented by
    Parameter objects in the order of its definition (keyword-only
    arguments are in the order listed by ``code.co_varnames``).
* bind(\*args, \*\*kwargs) -> dict(str, Parameter)
    Create a mapping from arguments to parameters.  The keys are the
    names of the parameter that an argument maps to with the value
    being the value the parameter would have if this function was
    called with the given arguments.
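
As a rough illustration of what ``bind()`` is meant to do, the sketch
below uses ``inspect.signature()``, which was added to the standard
library in Python 3.3 under this PEP number.  That API differs in
details from the draft described here (it returns a BoundArguments
object rather than a plain dict), so this is an analogy and not the
sandbox implementation; ``bind_arguments`` is a hypothetical wrapper.

```python
import inspect


def bind_arguments(func, *args, **kwargs):
    """Map call arguments to parameter names without calling func."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    # Fill in defaults so every parameter that would have a value
    # in a real call shows up in the mapping.
    bound.apply_defaults()
    return dict(bound.arguments)


def connect(host, port=443, *, timeout):
    pass


print(bind_arguments(connect, 'example.org', timeout=5))
```

A call that could not succeed, such as omitting ``timeout`` here,
raises TypeError at bind time, which is what makes this useful in
decorators that want to validate arguments early.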

The Signature object is stored in the ``__signature__`` attribute of
a function.  When it is to be created is discussed in
`Open Issues`_.


Parameter Object
================

A function's signature is made up of several parameters.  The variety
of parameter kinds in Python is large and rich and continues to
grow.  Parameter objects represent any possible parameter.

Originally the plan was to represent parameters using a list of
parameter names on the Signature object along with various dicts keyed
on parameter names to disseminate the various pieces of information
one can know about a parameter.  But the decision was made to
incorporate all information about a parameter in a single object so
as to make extending the information easier.  This approach was
originally put forth by Talin and is the form preferred by Guido (as
discussed at the 2006 Google Sprint).

The structure of the Parameter object is:

* name : (str | tuple(str))
    The name of the parameter as a string if it is not a tuple.  If
    the argument is a tuple then a tuple of strings is used.
* position : int
    The position of the parameter within the signature of the
    function (zero-indexed).  For keyword-only parameters the position
    value is arbitrary while not conflicting with positional
    parameters.  The suggestion of setting the attribute to None or -1
    to represent keyword-only parameters was rejected to prevent
    variable type usage and as a possible point of errors,
    respectively.
* has_default : bool
    True if the parameter has a default value, else False.
* default_value : object
    The default value for the parameter, if present, else the
    attribute does not exist.  This is done so that the attribute is
    not accidentally used if no default value is set as any default
    value could be a legitimate default value itself.
* keyword_only : bool
    True if the parameter is keyword-only, else False.
* has_annotation : bool
    True if the parameter has an annotation, else False.
* annotation
    Set to the annotation for the parameter.  If ``has_annotation`` is
    False then the attribute does not exist to prevent accidental use.
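
The "attribute does not exist" convention above can be sketched as
follows.  This is a minimal stand-in and not the sandbox
implementation; the ``_MISSING`` sentinel and the constructor
signature are assumptions made for the example (the draft does not
specify a constructor).

```python
_MISSING = object()  # assumed sentinel meaning "not supplied"


class Parameter(object):
    """Hypothetical stand-in for the Parameter object described above."""

    def __init__(self, name, position, keyword_only=False,
                 default_value=_MISSING, annotation=_MISSING):
        self.name = name
        self.position = position
        self.keyword_only = keyword_only
        # The has_* flags are always present; the value attributes are
        # only set when there is something to store.
        self.has_default = default_value is not _MISSING
        if self.has_default:
            self.default_value = default_value
        self.has_annotation = annotation is not _MISSING
        if self.has_annotation:
            self.annotation = annotation


required = Parameter("x", 0)
optional = Parameter("y", 1, default_value=None)

print(hasattr(required, "default_value"))
print(optional.has_default, optional.default_value)
```

Because ``None`` is itself a legitimate default value, the sketch
cannot use it to mean "no default", which is the motivation given
above for leaving the attribute off entirely.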


Implementation
==============

An implementation can be found in Python's sandbox [#impl]_.
There is a function named ``signature()`` which
returns the value stored on the ``__signature__`` attribute if it
exists, else it creates the Signature object for the
function and sets ``__signature__``.  For methods this is stored
directly on the im_func function object since that is what decorators
work with.


Open Issues
===========

When to construct the Signature object?
---------------------------------------

The Signature object can be created in either an eager or a lazy
fashion.  In the eager situation, the object can be created during
creation of the function object.  In the lazy situation, one would
pass a function object to a function and that would generate the
Signature object and store it to ``__signature__`` if
needed, and then return the value of ``__signature__``.
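
A minimal sketch of the lazy variant, assuming a trivial stand-in
``Signature`` class that records only the function name (the real
object would carry the attributes listed earlier):

```python
class Signature(object):
    """Stand-in for illustration: records only the function name."""

    def __init__(self, func):
        self.name = func.__name__


def signature(func):
    """Return func.__signature__, creating and storing it on first use."""
    try:
        return func.__signature__
    except AttributeError:
        func.__signature__ = Signature(func)
        return func.__signature__


def example(a, b=2):
    pass


first = signature(example)
second = signature(example)
print(first is second)
```

The second call finds the stored object, so the cost of building the
Signature is paid only by code that actually asks for it.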


Should ``Signature.bind`` return Parameter objects as keys?
-----------------------------------------------------------

Instead of returning a dict with keys consisting of the name of the
parameters, would it be more useful to instead use Parameter
objects?  The name of the argument can easily be retrieved from the
key (and the name would be used as the hash for a Parameter object).


Provide a mapping of parameter name to Parameter object?
--------------------------------------------------------

While providing access to the parameters in order is handy, it might
also be beneficial to provide a way to retrieve Parameter objects from
a Signature object based on the parameter's name.  The style of
access chosen (sequential/iteration or mapping) will influence how the
parameters are stored internally and whether ``__getitem__`` accepts
strings or integers.

One possible compromise is to have ``__getitem__`` provide mapping
support and have ``__iter__`` return Parameter objects based on their
``position`` attribute.  This allows for getting the sequence of
Parameter objects easily by using the ``__iter__`` method on Signature
object along with the sequence constructor (e.g., ``list`` or
``tuple``).
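
The compromise could look roughly like this.  Both classes are
simplified stand-ins written for the example, and storing the
parameters in a dict keyed by name is just one possible internal
layout:

```python
class Parameter(object):
    """Stand-in carrying only the two attributes the example needs."""

    def __init__(self, name, position):
        self.name = name
        self.position = position


class Signature(object):
    def __init__(self, parameters):
        # Stored by name; ordering is recovered from ``position``.
        self._by_name = dict((p.name, p) for p in parameters)

    def __getitem__(self, name):
        # Mapping-style access by parameter name.
        return self._by_name[name]

    def __iter__(self):
        # Sequential access in definition order.
        return iter(sorted(self._by_name.values(),
                           key=lambda p: p.position))


sig = Signature([Parameter("b", 1), Parameter("a", 0)])
ordered = [p.name for p in list(sig)]
print(ordered)
print(sig["b"].position)
```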


Remove ``has_*`` attributes?
----------------------------

If an EAFP approach to the API is taken, both ``has_annotation`` and
``has_default`` are unneeded as the respective ``annotation`` and
``default_value`` attributes are simply not set.  It comes down to
whether to have an EAFP or an LBYL interface.


Have ``var_args`` and ``var_kw_args`` default to ``None``?
----------------------------------------------------------

It has been suggested by Fred Drake that these two attributes have a
value of ``None`` instead of empty strings when they do not exist.


Deprecate ``inspect.getargspec()`` and ``.formatargspec()``?
-------------------------------------------------------------

Since the Signature object replicates the use of ``getargspec()``
from the ``inspect`` module it might make sense to deprecate it in
2.6.  ``formatargspec()`` could also go if Signature objects gained a
``__str__`` representation.

The issue with that is that types such as ``int``, when used as
annotations, do not lend themselves to output (e.g.,
``"<type 'int'>"`` is the string representation for ``int``).  The
repr representation of types would need to change in order to make
this reasonable.
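
One way around the repr problem, short of changing the repr of types,
would be for the formatting code to special-case types.  The helper
below is hypothetical and only sketches the idea.  (On Python 3 the
repr of ``int`` is ``"<class 'int'>"``, which is no better suited to
a signature string.)

```python
def format_annotation(annotation):
    """Return a compact display string for an annotation.

    Hypothetical helper: types are shown by name, anything else by repr.
    """
    if isinstance(annotation, type):
        return annotation.__name__
    return repr(annotation)


print(repr(int))
print(format_annotation(int))
print(format_annotation("a string annotation"))
```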


References
==========

.. [#impl] pep362 directory in Python's sandbox
   (http://svn.python.org/view/sandbox/trunk/pep362/)


Copyright
=========

This document has been placed in the public domain.

From oliphant.travis at ieee.org  Fri Sep  7 07:18:07 2007
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri, 07 Sep 2007 00:18:07 -0500
Subject: [Python-Dev] Google spreadsheet to collaborate on backporting
 Py3K stuff to 2.6
In-Reply-To: <bbaeab100709051638i7fedb693qea9fe0079e8f7fd1@mail.gmail.com>
References: <bbaeab100709051638i7fedb693qea9fe0079e8f7fd1@mail.gmail.com>
Message-ID: <fbqmug$v7f$1@sea.gmane.org>

Brett Cannon wrote:
> Neal, Anthony, Thomas W., and I have a spreadsheet that was started to
> keep track of what needs to be done in 2.6
> for Py3K transitioning:
> http://spreadsheets.google.com/pub?key=pCKY4oaXnT81FrGo3ShGHGg .  I am
> opening the spreadsheet up to everyone so that others can help
> maintain it.
> 
> There is a sheet in the Python 3000 Tasks spreadsheet that should be
> merged into this spreadsheet and then deleted.  If anyone wants to
> help with that it would be great (once something has been moved from
> "Python 3000 Tasks" to "Python 2 -> 3 transition" just delete it from
> "Python 3000 Tasks").
> 
> Because Neal created this spreadsheet he is the only one who can open
> editing to everyone.  If you would like to have edit abilities to the
> spreadsheet just reply to this email saying you want an invite and I
> will add you manually (and if you want a different address added just
> say so).

I would like an invite.

Thanks.

-Travis


From janssen at parc.com  Fri Sep  7 22:55:57 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 7 Sep 2007 13:55:57 PDT
Subject: [Python-Dev] any tips on malloc debugging?
Message-ID: <07Sep7.135559pdt."57996"@synergy1.parc.xerox.com>

I've been expanding the SSL test suite, and found something like this
cropping up, not always, but maybe 30% of the time.  So I run it under
gdb, but the "szone_error" breakpoint never gets hit.  Any other
malloc debugging tips I should know about?

(gdb) info break
Num Type           Disp Enb Address    What
1   breakpoint     keep y   0x900f2e56 <szone_error+6>
(gdb) (gdb) run ./Lib/test/regrtest.py -R :4: -u all test_ssl
Starting program: /local/python/trunk/src/python.exe ./Lib/test/regrtest.py -R :4: -u all test_ssl
test_ssl
[...]
python.exe(22696,0xa000d000) malloc: *** error for object 0x650800: double free
python.exe(22696,0xa000d000) malloc: *** set a breakpoint in szone_error to debug
test test_ssl failed -- Traceback (most recent call last):
  File "/local/python/trunk/src/Lib/test/test_ssl.py", line 304, in testSSL3
    CERTFILE2, CERTFILE3)
  File "/local/python/trunk/src/Lib/test/test_ssl.py", line 203, in serverParamsTest
    raise test_support.TestFailed("Unexpected SSL error:  " + str(x))
TestFailed: Unexpected SSL error:  (8, '_ssl.c:394: EOF occurred in violation of protocol')

1 test failed:
    test_ssl
[23436 refs]

Program exited with code 01.
(gdb) 

Bill

From guido at python.org  Fri Sep  7 23:16:05 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 7 Sep 2007 14:16:05 -0700
Subject: [Python-Dev] any tips on malloc debugging?
In-Reply-To: <7018697645894217094@unknownmsgid>
References: <7018697645894217094@unknownmsgid>
Message-ID: <ca471dc20709071416o2b4f1b44o6e46768aeaba2997@mail.gmail.com>

I think there's a way to enable heavier malloc debugging than the
normal --with-pydebug. You'll have to enable it manually by editing
Python.h I believe. Though it may already be on if you define
Py_DEBUG. (Is WITH_PYMALLOC always on?) There may also be a libmalloc
that enables heavier debugging; the malloc man page would have info.

On 9/7/07, Bill Janssen <janssen at parc.com> wrote:
> I've been expanding the SSL test suite, and found something like this
> cropping up, not always, but maybe 30% of the time.  So I run it under
> gdb, but the "szone_error" breakpoint never gets hit.  Any other
> malloc debugging tips I should know about?
>
> (gdb) info break
> Num Type           Disp Enb Address    What
> 1   breakpoint     keep y   0x900f2e56 <szone_error+6>
> (gdb) (gdb) run ./Lib/test/regrtest.py -R :4: -u all test_ssl
> Starting program: /local/python/trunk/src/python.exe ./Lib/test/regrtest.py -R :4: -u all test_ssl
> test_ssl
> [...]
> python.exe(22696,0xa000d000) malloc: *** error for object 0x650800: double free
> python.exe(22696,0xa000d000) malloc: *** set a breakpoint in szone_error to debug
> test test_ssl failed -- Traceback (most recent call last):
>   File "/local/python/trunk/src/Lib/test/test_ssl.py", line 304, in testSSL3
>     CERTFILE2, CERTFILE3)
>   File "/local/python/trunk/src/Lib/test/test_ssl.py", line 203, in serverParamsTest
>     raise test_support.TestFailed("Unexpected SSL error:  " + str(x))
> TestFailed: Unexpected SSL error:  (8, '_ssl.c:394: EOF occurred in violation of protocol')
>
> 1 test failed:
>     test_ssl
> [23436 refs]
>
> Program exited with code 01.
> (gdb)
>
> Bill


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Fri Sep  7 23:19:52 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 07 Sep 2007 23:19:52 +0200
Subject: [Python-Dev] any tips on malloc debugging?
In-Reply-To: <07Sep7.135559pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep7.135559pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46E1C078.2030501@v.loewis.de>

> I've been expanding the SSL test suite, and found something like this
> cropping up, not always, but maybe 30% of the time.  So I run it under
> gdb, but the "szone_error" breakpoint never gets hit.  Any other
> malloc debugging tips I should know about?

Is this a --with-pydebug build? If not, it should be.

If that still does not give insights, I usually try valgrind
(although usually with little success).

Regards,
Martin

From jimjjewett at gmail.com  Fri Sep  7 23:43:59 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 7 Sep 2007 17:43:59 -0400
Subject: [Python-Dev] PEP 362: Signature objects
In-Reply-To: <fb6fbf560709071432y1de4061eoa1414476593a48b@mail.gmail.com>
References: <fb6fbf560709071432y1de4061eoa1414476593a48b@mail.gmail.com>
Message-ID: <fb6fbf560709071443x583c8f7di1f51714d8132b235@mail.gmail.com>

Brett Cannon wrote:

> A Signature object has the following structure attributes:

> * name : str
>     Name of the function.  This is not fully qualified because
>     function objects for methods do not know the class they are
>     contained within.  This makes functions and methods
>     indistinguishable from one another when passed to decorators,
>     preventing proper creation of a fully qualified name.

(1)  Would this change with the new static __class__ attribute used
for the new super?

(2)  What about functions without a name?  Do you want to say str or
NoneType, or is that assumed?

(3)  Is the Signature object live or frozen?  (name is writable ...
will the Signature object reflect the new name, or the name in use at
the time it was created?)

> * var_annotations: dict(str, object)
>     Dict that contains the annotations for the variable parameters.
>     The keys are of the variable parameter with values of the

Is there a special key for the "->" returns annotation, or is that
available as a separate property?

> The structure of the Parameter object is:

> * name : (str | tuple(str))
>     The name of the parameter as a string if it is not a tuple.  If
>     the argument is a tuple then a tuple of strings is used.

What is used for unnamed arguments (typically provided by C)?  I like
None, but I see the arguments for both "" and missing attribute.

> * position : int
>     The position of the parameter within the signature of the
>     function (zero-indexed).  For keyword-only parameters the position
>     value is arbitrary while not conflicting with positional
>     parameters.

Is this just a property/alias for signature.parameters.index(self) ?

What should a "parameter" object not associated with a specific
signature return?  -1, None, or missing attribute?

Is there a way to get the associated Signature, or is it "compiled
out" when the Signature and its child Parameters are first
constructed?  (I think the position property is the only attribute
that would use it, unless you want some of the other attributes --
like annotations -- to be live.)

...

I would also like to see a

 * value : object

attribute; this would be missing on most functions, but might be
filled in on a Signature representing a closure, or an execution
frame.


> When to construct the Signature object?
> ---------------------------------------

> The Signature object can either be created in an eager or lazy
> fashion.  In the eager situation, the object can be created during
> creation of the function object.

Since most code doesn't need it, I would expect it to be optimized out
at least as often as docstrings are.

>  In the lazy situation, one would
> pass a function object to a function and that would generate the
> Signature object and store it to ``__signature__`` if
> needed, and then return the value of ``__signature__``.

Why store it?  Do you expect many use cases to need the signature more
than once (but not to save it themselves)?

If there is a __signature__ attribute on an object, you have to specify
whether it can be replaced, which parts of it are writable, how that
will affect the function's own behavior, etc.  I also suspect it might
become a source of heisenbugs, like the "reference leaks" that were
really DUMMY items in a dict.

If the Signature is just a snapshot no longer attached to the original
function, then people won't expect changes to the Signature to affect
the callable.

> Should ``Signature.bind`` return Parameter objects as keys?

(see above) If a Signature is a snapshot (rather than a live part of
the function), then it might make more sense to just add a value
attribute to Parameter objects.

> Provide a mapping of parameter name to Parameter object?
> --------------------------------------------------------

> While providing access to the parameters in order is handy, it might
> also be beneficial to provide a way to retrieve Parameter objects from
> a Signature object based on the parameter's name.  Which style of
> access (sequential/iteration or mapping) will influence how the
> parameters are stored internally and whether __getitem__ accepts
> strings or integers.

I think it should accept both.

What storage mechanism to use is an internal detail that should be
left to the implementation.  I wouldn't expect Signature inspection to
be inside a tight loop anyhow, unless it were part of a Generic
Function dispatch engine ... and those authors (just PJE?) can
optimize on what they actually need.

> Remove ``has_*`` attributes?
> ----------------------------

> If an EAFP approach to the API is taken,

Please leave them; it is difficult to catch Exceptions in a list comprehension.
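
A small sketch of the difference, using a hypothetical stand-in
Parameter class that has only the attributes needed here:

```python
class Parameter(object):
    """Hypothetical stand-in with only the attributes used below."""

    def __init__(self, name, **extra):
        self.name = name
        self.has_default = "default_value" in extra
        if self.has_default:
            self.default_value = extra["default_value"]


params = [Parameter("a"), Parameter("b", default_value=None)]

# LBYL: the flag filters cleanly inside the comprehension.
defaults = [(p.name, p.default_value) for p in params if p.has_default]

# EAFP has no way to wrap one element of a comprehension in try/except;
# the usual workaround is a sentinel passed to getattr(), twice.
_missing = object()
same = [(p.name, getattr(p, "default_value", _missing)) for p in params
        if getattr(p, "default_value", _missing) is not _missing]

print(defaults)
```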

> Have ``var_args`` and ``var_kw_args`` default to ``None``?

Makes sense to me, particularly since it should probably be consistent
with function name, and that should probably be None.


-jJ

From janssen at parc.com  Fri Sep  7 23:44:21 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 7 Sep 2007 14:44:21 PDT
Subject: [Python-Dev] any tips on malloc debugging?
In-Reply-To: <46E1C078.2030501@v.loewis.de> 
References: <07Sep7.135559pdt."57996"@synergy1.parc.xerox.com>
	<46E1C078.2030501@v.loewis.de>
Message-ID: <07Sep7.144424pdt."57996"@synergy1.parc.xerox.com>

> Is this a --with-pydebug build? If not, it should be.

Yes.

> If that still does not give insights, I usually try valgrind
> (although usually with little success).

Actually, Google is your friend here.  The message in malloc is
misleading; set a breakpoint in malloc_printf instead.

Bill

From janssen at parc.com  Fri Sep  7 23:53:36 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 7 Sep 2007 14:53:36 PDT
Subject: [Python-Dev] OpenSSL thread safety when reading files?
Message-ID: <07Sep7.145340pdt."57996"@synergy1.parc.xerox.com>

I'm seeing a number of malloc (actually, free) errors, now that I'm
pounding on the OpenSSL server/client setup with lots of server
threads and client threads.  They all look like either

(gdb) bt
#0  0x9010b807 in malloc_printf ()
#1  0x900058ad in szone_free ()
#2  0x90005588 in free ()
#3  0x9194e508 in CRYPTO_free ()
#4  0x91993e77 in ERR_clear_error ()
#5  0x919b1884 in PEM_X509_INFO_read_bio ()
#6  0x9197a692 in X509_load_cert_crl_file ()
#7  0x9197a80e in by_file_ctrl ()
#8  0x919d6e2e in X509_STORE_load_locations ()
[...]

or (much more frequently)

(gdb) bt
#0  0x9010b807 in malloc_printf ()
#1  0x900058ad in szone_free ()
#2  0x90005588 in free ()
#3  0x9194e508 in CRYPTO_free ()
#4  0x91993e77 in ERR_clear_error ()
#5  0x949fcf11 in SSL_CTX_use_certificate_chain_file ()
[...]

Always in ERR_clear_error(), always from some frame that's reading a
certificate file for some purpose.

If I disable Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS around the
places where the C code reads the certificate files, all these free
errors go away.

ERR_clear_error() is supposed to be thread-safe; it operates on a
per-thread error state structure (which I make sure is initialized in
my C code).  But it sure looks like the client and server threads are
both working with the same error state.

Bill

From janssen at parc.com  Sat Sep  8 00:10:10 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 7 Sep 2007 15:10:10 PDT
Subject: [Python-Dev] OpenSSL thread safety when reading files?
In-Reply-To: <07Sep7.145340pdt."57996"@synergy1.parc.xerox.com> 
References: <07Sep7.145340pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep7.151015pdt."57996"@synergy1.parc.xerox.com>

> I'm seeing a number of malloc (actually, free) errors, now that I'm
> pounding on the OpenSSL server/client setup with lots of server
> threads and client threads.  They all look like either

The issue seems to be that we assume OpenSSL is thread-safe (that is,
we call Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS), but the _ssl.c
code never did what was necessary to support that assumption.  See
http://www.openssl.org/docs/crypto/threads.html#DESCRIPTION.

My analysis is that we need to add lock and unlock functions to the
OpenSSL initialization code we currently use, which looks like this:

	/* Init OpenSSL */
	SSL_load_error_strings();
	SSLeay_add_ssl_algorithms();

Or, just not allow threads, which seems wrong.

Bill

From trentm at activestate.com  Sat Sep  8 00:37:55 2007
From: trentm at activestate.com (Trent Mick)
Date: Fri, 07 Sep 2007 15:37:55 -0700
Subject: [Python-Dev] [PEPs]  Email addresses in PEPs?
In-Reply-To: <4335d2c40708201232s19b10c69ye44d39351a4da97d@mail.gmail.com>
References: <18121.47310.218893.540750@montanaro.dyndns.org>	<ca471dc20708201120o23d527d4lc62e01bc4fc13585@mail.gmail.com>	<bbaeab100708201216k213112e0u10b47013ec75837a@mail.gmail.com>
	<4335d2c40708201232s19b10c69ye44d39351a4da97d@mail.gmail.com>
Message-ID: <46E1D2C3.5030705@activestate.com>

David Goodger wrote:
> On 8/20/07, Brett Cannon <brett at python.org> wrote:
>> I believe email addresses are automatically obfuscated as part of the
>> HTML generation process, but one of the PEP editors can correct me if
>> I am wrong.
> 
> Yes, email addresses are obfuscated in PEPs.
> 
> For example, in PEPs 0 & 12, my address is encoded as
> "goodger&#32;&#97;t&#32;python.org" (the "@" is changed to " at " and
> further obfuscated from there).  More tricks could be played, but that
> would only decrease the usefulness of addresses for legitimate
> purposes.

If some would find it useful, here is a snippet of code that obfuscates 
email addresses for HTML as done by Markdown (a text-to-html markup 
translator). It randomly encodes each character as a hex or decimal HTML 
entity (roughly 10% raw, 45% hex, 45% dec).

The email still appears normally in the browser, but is pretty obtuse 
when slicing and dicing the raw HTML.

Would others find this useful in pep2html.py?


-------------------
from random import random

def _encode_email_address(addr):
     #  Input: an email address, e.g. "foo at example.com"
     #
     #  Output: the email address as a mailto link, with each character
     #      of the address encoded as either a decimal or hex entity, in
     #      the hopes of foiling most address harvesting spam bots. E.g.:
     #
     #    <a href="&#x6D;&#97;&#105;&#108;&#x74;&#111;:&#102;&#111;
     #       &#111;&#64;&#101;x&#x61;&#109;&#x70;&#108;&#x65;&#x2E;
     #       &#99;&#111;&#109;">&#102;&#111;&#111;&#64;&#101;x&#x61;
     #       &#109;&#x70;&#108;&#x65;&#x2E;&#99;&#111;&#109;</a>
     #
     #  Based on a filter by Matthew Wickline, posted to the BBEdit-Talk
     #  mailing list: <http://tinyurl.com/yu7ue>
     chars = [_xml_encode_email_char_at_random(ch)
              for ch in "mailto:" + addr]
     # Strip the mailto: from the visible part.
     addr = '<a href="%s">%s</a>' \
            % (''.join(chars), ''.join(chars[7:]))
     return addr

def _xml_encode_email_char_at_random(ch):
     r = random()
     # Roughly 10% raw, 45% hex, 45% dec.
     # '@' *must* be encoded. I [John Gruber] insist.
     if r > 0.9 and ch != "@":
         return ch
     elif r < 0.45:
         # The [1:] is to drop leading '0': 0x63 -> x63
         return '&#%s;' % hex(ord(ch))[1:]
     else:
         return '&#%s;' % ord(ch)
-------------------


-- 
Trent Mick
trentm at activestate.com

From janssen at parc.com  Sat Sep  8 00:56:08 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 7 Sep 2007 15:56:08 PDT
Subject: [Python-Dev] OpenSSL thread safety when reading files?
In-Reply-To: <07Sep7.151015pdt."57996"@synergy1.parc.xerox.com> 
References: <07Sep7.145340pdt."57996"@synergy1.parc.xerox.com>
	<07Sep7.151015pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep7.155615pdt."57996"@synergy1.parc.xerox.com>

> My analysis is that we need to add lock and unlock functions to the
> OpenSSL initialization code we currently use

Yep, this seems to fix the problem.  I'm now able to re-enable
Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS, and still get a clean
run:

(gdb) run
Starting program: /local/python/trunk/src/python.exe ./Lib/test/regrtest.py -R :4: -u all test_ssl
test_ssl
/local/python/trunk/src/Lib/test/test_ssl.py:247: DeprecationWarning: socket.ssl() is deprecated.  Use ssl.sslsocket() instead.
  ssl_sock = socket.ssl(s)
beginning 9 repetitions
123456789
.........
1 test OK.
[30009 refs]

Program exited normally.
(gdb)

Bill

From janssen at parc.com  Sat Sep  8 01:09:09 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 7 Sep 2007 16:09:09 PDT
Subject: [Python-Dev] working with Python threads from C extension module?
Message-ID: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com>

Reading through the C API documentation, I find:

``This is done so that dynamically loaded extensions compiled with
thread support enabled can be loaded by an interpreter that was
compiled with disabled thread support.''

I've currently got the set-up-SSL-threading code in _ssl.c surrounded
by a "#ifdef HAVE_THREAD" bracket.  It sounds like that might not be
sufficient.  It sounds like I need a runtime test for thread
availability, instead, like this:

#ifdef HAVE_THREAD
	if (PyEval_ThreadsInitialized())
		_setup_ssl_threads();
#endif

Seem right?

So what happens when someone loads the _ssl module, initializes the
threads, and tries to use SSL?  It's going to start failing again.  I
think I need my own version of Py_BEGIN_ALLOW_THREADS and
Py_END_ALLOW_THREADS, don't I?  Which also checks to see if the SSL
threading support has been initialized, in addition to the Python
threading support.  Something like

#define SSL_ALLOW_THREADS {if (_ssl_locks != NULL) { Py_BEGIN_ALLOW_THREADS }}
#define SSL_DISALLOW_THREADS {if (_ssl_locks != NULL) { Py_END_ALLOW_THREADS }}

Any comments?

Bill

From janssen at parc.com  Sat Sep  8 01:20:35 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 7 Sep 2007 16:20:35 PDT
Subject: [Python-Dev] working with Python threads from C extension
	module?
In-Reply-To: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com> 
References: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep7.162040pdt."57996"@synergy1.parc.xerox.com>

> #define SSL_ALLOW_THREADS {if (_ssl_locks != NULL) { Py_BEGIN_ALLOW_THREADS }}
> #define SSL_DISALLOW_THREADS {if (_ssl_locks != NULL) { Py_END_ALLOW_THREADS }}

I'd forgotten how convoluted Py_BEGIN_ALLOW_THREADS and
Py_END_ALLOW_THREADS were.  Anyone have any other suggestions about
how to do this?

Raise an error if loaded in a non-threaded environment, then used in a
threaded environment?  Dynamic initialization of threading?

Bill

From janssen at parc.com  Sat Sep  8 01:31:06 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 7 Sep 2007 16:31:06 PDT
Subject: [Python-Dev] working with Python threads from C extension
	module?
In-Reply-To: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com> 
References: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep7.163111pdt."57996"@synergy1.parc.xerox.com>

> So what happens when someone loads the _ssl module, initializes the
> threads, and tries to use SSL?  It's going to start failing again.  I

Which turns out to be exactly what test_ssl.py does.  I'm tempted
to have the _ssl module call PyEval_InitThreads().  Would that be kosher?

Bill

From guido at python.org  Sat Sep  8 01:46:13 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 7 Sep 2007 16:46:13 -0700
Subject: [Python-Dev] working with Python threads from C extension
	module?
In-Reply-To: <6121031531930291931@unknownmsgid>
References: <6121031531930291931@unknownmsgid>
Message-ID: <ca471dc20709071646l624635dfka455bd083a8232d1@mail.gmail.com>

Well, one shouldn't be bothering with threads unless the user intends
to create threads. So I think it's not kosher. Once threads are
initialized, everything runs a tad slower because the GIL
manipulations actually cost time (even if there are no other threads).

On 9/7/07, Bill Janssen <janssen at parc.com> wrote:
> > So what happens when someone loads the _ssl module, initializes the
> > threads, and tries to use SSL?  It's going to start failing again.  I
>
> Which turns out to be exactly what test_ssl.py does.  I'm tempted
> to have the _ssl module call PyEval_InitThreads().  Would that be kosher?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From janssen at parc.com  Sat Sep  8 01:57:40 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 7 Sep 2007 16:57:40 PDT
Subject: [Python-Dev] working with Python threads from C extension
	module?
In-Reply-To: <ca471dc20709071646l624635dfka455bd083a8232d1@mail.gmail.com> 
References: <6121031531930291931@unknownmsgid>
	<ca471dc20709071646l624635dfka455bd083a8232d1@mail.gmail.com>
Message-ID: <07Sep7.165746pdt."57996"@synergy1.parc.xerox.com>

> Well, one shouldn't be bothering with threads unless the user intends
> to create threads. So I think it's not kosher. Once threads are
> initialized, everything runs a tad slower because the GIL
> manipulations actually cost time (even if there are no other threads).

I think that doing it in _ssl.c might be OK; it would only happen when
the user loaded that extension module.  In any case, I'm going to do
it that way till we figure out a better solution.  The alternatives
right now are (1) let OpenSSL step all over itself (and potentially
other things), or (2) remove the Py_BEGIN_ALLOW_THREADS on SSL context
reads and writes.

> On 9/7/07, Bill Janssen <janssen at parc.com> wrote:
> > > So what happens when someone loads the _ssl module, initializes the
> > > threads, and tries to use SSL?  It's going to start failing again.  I
> >
> > Which turns out to be exactly what test_ssl.py does.  I'm tempted
> > to have the _ssl module call PyEval_InitThreads().  Would that be kosher?

The problem is the sequencing of the loading of the extension module,
compared to when the user gets around to initializing threading.  If
we want to keep it kosher, we need a way to hook into
PyEval_InitThreads() so that it will call the thread initialization
routines of other dynamically loaded libraries that have already been
loaded.  Or a way to have Py_BEGIN_ALLOW_THREADS take into account that
there may be more than one thread-dependent thing to check on.

Bill

From skip at pobox.com  Sat Sep  8 02:04:11 2007
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 7 Sep 2007 19:04:11 -0500
Subject: [Python-Dev] [PEPs]  Email addresses in PEPs?
In-Reply-To: <46E1D2C3.5030705@activestate.com>
References: <18121.47310.218893.540750@montanaro.dyndns.org>
	<ca471dc20708201120o23d527d4lc62e01bc4fc13585@mail.gmail.com>
	<bbaeab100708201216k213112e0u10b47013ec75837a@mail.gmail.com>
	<4335d2c40708201232s19b10c69ye44d39351a4da97d@mail.gmail.com>
	<46E1D2C3.5030705@activestate.com>
Message-ID: <18145.59131.323002.910688@montanaro.dyndns.org>


    Trent> If some would find it useful, here is a snippet of code that
    Trent> obfuscates email addresses for HTML as done by Markdown (a
    Trent> text-to-html markup translator). It randomly encodes each
    Trent> character as a hex or decimal HTML entity (roughly 10% raw, 45%
    Trent> hex, 45% dec).

Aren't most spammers' scrapers going to be intelligent enough by now
(several years since they first arrived on the scene) to "see through" these
sorts of common obfuscations?
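
The decoding step is certainly small.  A minimal sketch, assuming the
scraper has something like the ``html.unescape`` function found in the
standard library of later Python 3 releases; the sample string is
hand-encoded for illustration:

```python
import html

# A mix of hex entities, decimal entities and raw characters, in the
# style of the obfuscation described above.
encoded = "&#x66;&#111;o&#64;&#x65;xample&#46;com"

decoded = html.unescape(encoded)
print(decoded)
```

Any scraper that runs the page through an HTML parser before looking
for addresses gets this decoding for free.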

Skip

From brett at python.org  Sat Sep  8 02:59:35 2007
From: brett at python.org (Brett Cannon)
Date: Fri, 7 Sep 2007 17:59:35 -0700
Subject: [Python-Dev] PEP 362: Signature objects
In-Reply-To: <fb6fbf560709071443x583c8f7di1f51714d8132b235@mail.gmail.com>
References: <fb6fbf560709071432y1de4061eoa1414476593a48b@mail.gmail.com>
	<fb6fbf560709071443x583c8f7di1f51714d8132b235@mail.gmail.com>
Message-ID: <bbaeab100709071759lb891876ye8246fb424a3cd39@mail.gmail.com>

On 9/7/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > A Signature object has the following structure attributes:
>
> > * name : str
> >     Name of the function.  This is not fully qualified because
> >     function objects for methods do not know the class they are
> >     contained within.  This makes functions and methods
> >     indistinguishable from one another when passed to decorators,
> >     preventing proper creation of a fully qualified name.
>
> (1)  Would this change with the new static __class__ attribute used
> for the new super?
>

I don't know enough about the super implementation to know.  If you
can figure out the class from the function object alone then sure,
this can change.

> (2)  What about functions without a name?  Do you want to say str or
> NoneType, or is that assumed?
>

What functions don't have a name?  Even lambdas have the name '<lambda>'.
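
A quick check of that claim for pure-Python callables (it says nothing
about callables implemented in C):

```python
anonymous = lambda x: x


def named(x):
    return x


print(anonymous.__name__)
print(named.__name__)
```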

> (3)  Is the Signature object live or frozen?  (name is writable ...
> will the Signature object reflect the new name, or the name in use at
> the time it was created?)
>

They are currently one-time creation objects.  One could change it to
use properties and do the look up dynamically by caching the function
object.  But I currently have it implemented as all created in
__init__ and then just left alone.

> > * var_annotations: dict(str, object)
> >     Dict that contains the annotations for the variable parameters.
> >     The keys are of the variable parameter with values of the
>
> Is there a special key for the "->" returns annotation, or is that
> available as a separate property?
>

Oops, that didn't get into the PEP for some reason.  The Signature
object has ``has_annotation``/``annotation`` attributes for the
'return' annotation.


> > The structure of the Parameter object is:
>
> > * name : (str | tuple(str))
> >     The name of the parameter as a string if it is not a tuple.  If
> >     the argument is a tuple then a tuple of strings is used.
>
> What is used for unnamed arguments (typically provided by C)?  I like
> None, but I see the arguments for both "" and missing attribute.
>

It's open for debate.  I didn't even think about functions not having
__name__ set.  Basically whatever people want to go with for var_args
and var_kw_args.

> > * position : int
> >     The position of the parameter within the signature of the
> >     function (zero-indexed).  For keyword-only parameters the position
> >     value is arbitrary while not conflicting with positional
> >     parameters.
>
> Is this just a property/alias for signature.parameters.index(self) ?
>

Assuming that 'self' refers to some parameter, yes.

> What should a "parameter" object not associated with a specific
> signature return?  -1, None, or missing attribute?
>

This is not an option as it must be specified by the Parameter
constructor.  A Parameter object should not exist without belonging to
a Signature object.  That's why neither Signature nor Parameter have
their constructors specified; the signature() function is the only way
you should cause the construction of either object.

> Is there a way to get the associated Signature, or is it "compiled
> out" when the Signature and its child Parameters are first
> constructed?  (I think the position property is the only attribute
> that would use it, unless you want some of the other attributes --
> like annotations -- to be live.)

There is currently no way to work backwards from a Parameter object to
its parent Signature.  It could be added if people wanted.

>
> ...
>
> I would also like to see a
>
>  * value : object
>
> attribute; this would be missing on most functions, but might be
> filled in on a Signature representing a closure, or an execution
> frame.

What for?  How does either have bearing on the call signature of a function?

>
>
> > When to construct the Signature object?
> > ---------------------------------------
>
> > The Signature object can either be created in an eager or lazy
> > fashion.  In the eager situation, the object can be created during
> > creation of the function object.
>
> Since most code doesn't need it, I would expect it to be optimized out
> at least as often as docstrings are.
>
> >  In the lazy situation, one would
> > pass a function object to a function and that would generate the
> > Signature object and store it to ``__signature__`` if
> > needed, and then return the value of ``__signature__``.
>
> Why store it?  Do you expect many use cases to need the signature more
> than once (but not to save it themselves)?

Because you can use these with decorators to allow introspection redirection::

 def dec(fxn):
   def inner(*args, **kwargs):
       return fxn(*args, **kwargs)
   sig = signature(fxn)
   inner.__signature__ = sig
   return inner
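
For comparison, here is a runnable variant of the decorator above. It
assumes the signature() function is spelled inspect.signature(), which is
how the idea eventually shipped in later Python 3 releases; that function
honours a ``__signature__`` attribute when one is set. The function
area() is just an example.

```python
import inspect


def dec(fxn):
    def inner(*args, **kwargs):
        return fxn(*args, **kwargs)
    # Store the wrapped function's signature so introspection of
    # `inner` reports it instead of (*args, **kwargs).
    inner.__signature__ = inspect.signature(fxn)
    return inner


@dec
def area(width, height=1):
    return width * height


reported = str(inspect.signature(area))
```

Without the ``__signature__`` assignment, introspecting the decorated
function would only show the generic wrapper parameters.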

>
> If there is a __signature__ attribute on an object, you have to specify
> whether it can be replaced,

It can.

> which parts of it are writable,

Any of it.

> how that
> will affect the function's own behavior, etc.

It won't.

>  I also suspect it might
> become a source of heisenbugs, like the "reference leaks" that were
> really DUMMY items in a dict.
>
> If the Signature is just a snapshot no longer attached to the original
> function, then people won't expect changes to the Signature to affect
> the callable.
>

They are just snapshots unless people really want them to be live for
some reason.

> > Should ``Signature.bind`` return Parameter objects as keys?
>
> (see above) If a Signature is a snapshot (rather than a live part of
> the function), then it might make more sense to just add a value
> attribute to Parameter objects.
>

Why?  You might make several calls to bind() and thus setting what a
Parameter object would be bound to should be considered a temporary
thing.
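
A sketch of why the binding is per-call rather than stored on the
Parameter objects. It assumes the API as it later landed in the inspect
module (inspect.signature() and Signature.bind()), not the draft under
discussion; connect() is a made-up example function.

```python
import inspect


def connect(host, port=80, *, timeout=None):
    pass


sig = inspect.signature(connect)

# Each bind() call returns its own mapping; the Signature is untouched.
first = sig.bind("example.org")
second = sig.bind("example.net", 8080, timeout=5)

first_args = dict(first.arguments)
second_args = dict(second.arguments)
```

Two calls give two independent results, so nothing on the shared
Parameter objects needs to change between them.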

> > Provide a mapping of parameter name to Parameter object?
> > --------------------------------------------------------
>
> > While providing access to the parameters in order is handy, it might
> > also be beneficial to provide a way to retrieve Parameter objects from
> > a Signature object based on the parameter's name.  Which style of
> > access (sequential/iteration or mapping) will influence how the
> > parameters are stored internally and whether __getitem__ accepts
> > strings or integers.
>
> I think it should accept both.
>
> What storage mechanism to use is an internal detail that should be
> left to the implementation.  I wouldn't expect Signature inspection to
> be inside a tight loop anyhow, unless it were part of a Generic
> Function dispatch engine ... and those authors (just PJE?) can
> optimize on what they actually need.
>

I guess I can just try to do ``item.__index__()`` and if that triggers
an AttributeError assume it is a name.
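
One way that dispatch might look, as a sketch only. ParameterList is a
hypothetical container, not the PEP's Signature class, and it uses
operator.index() instead of calling ``__index__()`` directly, so the
exception to catch is TypeError rather than AttributeError.

```python
import operator


class ParameterList:
    """Hypothetical container; not the PEP's actual Signature class."""

    def __init__(self, names):
        self._names = list(names)

    def __getitem__(self, item):
        try:
            # Succeeds for ints and anything else defining __index__().
            position = operator.index(item)
        except TypeError:
            # Not integer-like, so treat it as a parameter name.
            position = self._names.index(item)
        return position, self._names[position]


params = ParameterList(["a", "b"])
by_position = params[1]
by_name = params["a"]
```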

> > Remove ``has_*`` attributes?
> > ----------------------------
>
> > If an EAFP approach to the API is taken,
>
> Please leave them; it is difficult to catch Exceptions in a list comprehension.
>

You can also just use hasattr() if needed.
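
For instance, hasattr() works as a filter inside a list comprehension.
The Parameter class below is a hypothetical stand-in in which the
default_value attribute exists only when a default was supplied; it is
not the PEP's implementation.

```python
class Parameter:
    """Hypothetical stand-in for illustration only."""

    def __init__(self, name, **kwargs):
        self.name = name
        # The attribute is simply absent when no default was given.
        if "default_value" in kwargs:
            self.default_value = kwargs["default_value"]


params = [
    Parameter("a"),
    Parameter("b", default_value=None),
    Parameter("c", default_value=3),
]

# No try/except needed inside the comprehension.
with_defaults = [p.name for p in params if hasattr(p, "default_value")]
```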

> > Have ``var_args`` and ``var_kw_args`` default to ``None``?
>
> Makes sense to me, particularly since it should probably be consistent
> with function name, and that should probably be None.

So another vote for None.

Thanks for the feedback, Jim!

From guido at python.org  Sat Sep  8 05:15:18 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 7 Sep 2007 20:15:18 -0700
Subject: [Python-Dev] PEP 362: Signature objects
In-Reply-To: <bbaeab100709071759lb891876ye8246fb424a3cd39@mail.gmail.com>
References: <fb6fbf560709071432y1de4061eoa1414476593a48b@mail.gmail.com>
	<fb6fbf560709071443x583c8f7di1f51714d8132b235@mail.gmail.com>
	<bbaeab100709071759lb891876ye8246fb424a3cd39@mail.gmail.com>
Message-ID: <ca471dc20709072015p22cc308bmc7fd29f884b61e1f@mail.gmail.com>

On 9/7/07, Brett Cannon <brett at python.org> wrote:
> On 9/7/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > > A Signature object has the following structure attributes:
> >
> > > * name : str
> > >     Name of the function.  This is not fully qualified because
> > >     function objects for methods do not know the class they are
> > >     contained within.  This makes functions and methods
> > >     indistinguishable from one another when passed to decorators,
> > >     preventing proper creation of a fully qualified name.
> >
> > (1)  Would this change with the new static __class__ attribute used
> > for the new super?
>
> I don't know enough about the super implementation to know.  If you
> can figure out the class from the function object alone then sure,
> this can change.

I don't think it'll work -- the __class__ variable is only available
*within* the function, not when one is introspecting the function
object. Also, it is only available for functions that reference
'super' (or __class__ directly). As __class__ is passed into the
function call as a "cell" variable (like references to variables from
outer scopes), its mere presence slows down the call somewhat, hence
it is only present when used. (BTW, it is not an attribute.)

BTW there's a good reason why functions don't have easier access to
the class in which they are defined: functions can easily be moved or
shared between classes. The __class__ variable only records the class
inside which the function is defined lexically, if any.
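
This can be observed in Python 3 as released. The sketch below uses
made-up classes Base and Child and looks at the code object's free
variables to see whether the ``__class__`` cell is present.

```python
class Base:
    def greet(self):
        return "base"


class Child(Base):
    def uses_super(self):
        return super().greet()

    def plain(self):
        return "child"


# The cell exists only for the method that references super().
with_cell = Child.uses_super.__code__.co_freevars
without_cell = Child.plain.__code__.co_freevars

# The cell holds the class in which the method was defined lexically.
cell_value = Child.uses_super.__closure__[0].cell_contents
```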

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From hrvoje.niksic at avl.com  Sat Sep  8 12:36:48 2007
From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=)
Date: Sat, 08 Sep 2007 12:36:48 +0200
Subject: [Python-Dev] working with Python threads from C
	extension	module?
In-Reply-To: <07Sep7.162040pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com>
	<07Sep7.162040pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <1189247808.11322.212.camel@localhost>

On Fri, 2007-09-07 at 16:20 -0700, Bill Janssen wrote:
> > #define SSL_ALLOW_THREADS {if (_ssl_locks != NULL) { Py_BEGIN_ALLOW_THREADS }}
> > #define SSL_DISALLOW_THREADS {if (_ssl_locks != NULL) { Py_END_ALLOW_THREADS }}
> 
> I'd forgotten how convoluted Py_BEGIN_ALLOW_THREADS and
> Py_END_ALLOW_THREADS were.  Anyone have any other suggestions about
> how to do this?

Be convoluted yourself and do this:

#define PySSL_BEGIN_ALLOW_THREADS { if (_ssl_locks) { Py_BEGIN_ALLOW_THREADS
#define PySSL_END_ALLOW_THREADS Py_END_ALLOW_THREADS } }

(Untested, but I think it should work.)



From martin at v.loewis.de  Sat Sep  8 14:41:38 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 08 Sep 2007 14:41:38 +0200
Subject: [Python-Dev] Buildbot upgraded to 0.7.5
Message-ID: <46E29882.8080707@v.loewis.de>

I just upgraded the buildbot master to 0.7.5. If you
see any problems, please let me know.

Neal: buildbot now supports reloading of configurations,
without interrupting builds. Try "buildbot reconfig" when
you make a change (certain changes would still require a
restart).

Regards,
Martin

From janssen at parc.com  Sat Sep  8 17:18:35 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 08:18:35 PDT
Subject: [Python-Dev] working with Python threads from C extension
	module?
In-Reply-To: <1189247808.11322.212.camel@localhost> 
References: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com>
	<07Sep7.162040pdt."57996"@synergy1.parc.xerox.com>
	<1189247808.11322.212.camel@localhost>
Message-ID: <07Sep8.081836pdt."57996"@synergy1.parc.xerox.com>

> Be convoluted yourself and do this:
> 
> #define PySSL_BEGIN_ALLOW_THREADS { if (_ssl_locks) { Py_BEGIN_ALLOW_THREADS
> #define PySSL_END_ALLOW_THREADS Py_END_ALLOW_THREADS } }
> 
> (Untested, but I think it should work.)

Yes, that had occurred to me.  We want the code inside the braces
still to run if the locks aren't held, so something more like

  #define PySSL_BEGIN_ALLOW_THREADS { \
			PyThreadState *_save;  \
			if (_ssl_locks_count>0) {_save = PyEval_SaveThread();}
  #define PySSL_BLOCK_THREADS	if (_ssl_locks_count>0){PyEval_RestoreThread(_save);}
  #define PySSL_UNBLOCK_THREADS	if (_ssl_locks_count>0){_save = PyEval_SaveThread();}
  #define PySSL_END_ALLOW_THREADS	if (_ssl_locks_count>0){PyEval_RestoreThread(_save);} \
		 }

would do the trick.  Unfortunately, this doesn't deal with the macro
behaviour.  The user has "turned on" threading; they expect reads and
writes to yield the GIL so that other threads can make progress.  But
the fact that threading has been "turned on" after the SSL module has
been initialized, means that threads don't work inside the SSL code.
So the user's understanding of the system will be broken.

No, I don't see any good way to fix this except to add a callback
chain inside PyThread_init_thread, which is run down when threads are
initialized.  Any module which needs to set up threads registers itself
on that chain, and gets called as part of PyThread_init_thread.  But
I'm far from the smartest person on this list :-), so perhaps someone
else will see a good solution.
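
The shape of that callback chain, modelled in Python purely to show the
control flow. Nothing like this exists in the interpreter; the class
ThreadInitHooks and its method names are invented for the sketch, and a
real version would live in C next to PyThread_init_thread.

```python
class ThreadInitHooks:
    """Hypothetical registry; CPython has no such API."""

    def __init__(self):
        self._hooks = []
        self.initialized = False

    def register(self, hook):
        # A module that registers late still gets its setup run.
        if self.initialized:
            hook()
        else:
            self._hooks.append(hook)

    def init_threads(self):
        # Run the chain exactly once, when threading is first turned on.
        if self.initialized:
            return
        self.initialized = True
        for hook in self._hooks:
            hook()
        self._hooks.clear()


events = []
hooks = ThreadInitHooks()
hooks.register(lambda: events.append("ssl locks set up"))
hooks.init_threads()
hooks.init_threads()          # second call does nothing
hooks.register(lambda: events.append("late module set up"))
```

With such a chain, a module initialized before threads were turned on
would still get a chance to set up its locks afterwards.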

This has got to be a problem with other extension modules linked to
libraries which have their own threading abstractions.

Bill

From gjcarneiro at gmail.com  Sat Sep  8 17:37:07 2007
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sat, 8 Sep 2007 16:37:07 +0100
Subject: [Python-Dev] working with Python threads from C extension
	module?
In-Reply-To: <7088780289868241160@unknownmsgid>
References: <1189247808.11322.212.camel@localhost>
	<7088780289868241160@unknownmsgid>
Message-ID: <a467ca4f0709080837v631ea322p6f078306b164cc9a@mail.gmail.com>

On 08/09/2007, Bill Janssen <janssen at parc.com> wrote:
>
> > Be convoluted yourself and do this:
> >
> > #define PySSL_BEGIN_ALLOW_THREADS { if (_ssl_locks) { Py_BEGIN_ALLOW_THREADS
> > #define PySSL_END_ALLOW_THREADS Py_END_ALLOW_THREADS } }
> >
> > (Untested, but I think it should work.)
>
> Yes, that had occurred to me.  We want the code inside the braces
> still to run if the locks aren't held, so something more like
>
>   #define PySSL_BEGIN_ALLOW_THREADS { \
>                         PyThreadState *_save;  \
>                         if (_ssl_locks_count>0) {_save = PyEval_SaveThread();}
>   #define PySSL_BLOCK_THREADS   if (_ssl_locks_count>0){PyEval_RestoreThread(_save);}
>   #define PySSL_UNBLOCK_THREADS if (_ssl_locks_count>0){_save = PyEval_SaveThread();}
>   #define PySSL_END_ALLOW_THREADS       if (_ssl_locks_count>0){PyEval_RestoreThread(_save);} \
>                  }
>
> would do the trick.  Unfortunately, this doesn't deal with the macro
> behaviour.  The user has "turned on" threading; they expect reads and
> writes to yield the GIL so that other threads can make progress.  But
> the fact that threading has been "turned on" after the SSL module has
> been initialized, means that threads don't work inside the SSL code.
> So the user's understanding of the system will be broken.
>
> No, I don't see any good way to fix this except to add a callback
> chain inside PyThread_init_thread, which is run down when threads are
> initialized.  Any module which needs to set up threads registers itself
> on that chain, and gets called as part of PyThread_init_thread.  But
> I'm far from the smartest person on this list :-), so perhaps someone
> else will see a good solution.


I think this is a helpful additional tool to solve threading problems.
Doesn't solve everything, but it certainly helps :-)

For instance, one thing it doesn't solve is when a library being wrapped can
be initialized with multithreading support, but only allows such
initialization as a very first API call; you can't initialize threading at
any arbitrary time during application runtime.  Unfortunately I don't think
there is any sane way to fix this problem :-(

> This has got to be a problem with other extension modules linked to
> libraries which have their own threading abstractions.


Yes.

Another problem is that Python extensions may not wish to incur the
performance penalty of Python threading calls.  For instance, pyorbit has
these macros:

#define pyorbit_gil_state_ensure() \
    (PyEval_ThreadsInitialized() ? (PyGILState_Ensure()) : 0)

#define pyorbit_gil_state_release(state) G_STMT_START { \
    if (PyEval_ThreadsInitialized())                    \
        PyGILState_Release(state);                      \
    } G_STMT_END

#define pyorbit_begin_allow_threads             \
    G_STMT_START {                              \
        PyThreadState *_save = NULL;            \
        if (PyEval_ThreadsInitialized())        \
            _save = PyEval_SaveThread();

#define pyorbit_end_allow_threads               \
        if (PyEval_ThreadsInitialized())        \
            PyEval_RestoreThread(_save);        \
    } G_STMT_END

They all call PyEval_ThreadsInitialized() before doing anything thread
related to save some performance.  The other reason to do it this way is
that the Python API calls themselves abort if they are called with threading
not initialized.  It would be nice if the upstream Python GIL macros were
more like pyorbit's and became no-ops when threading is not enabled.

-- 
Gustavo J. A. M. Carneiro
INESC Porto, Telecommunications and Multimedia Unit
"The universe is always one step beyond logic." -- Frank Herbert

From janssen at parc.com  Sat Sep  8 18:51:41 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 09:51:41 PDT
Subject: [Python-Dev] which SSL client protocols work with which server
	protocols?
Message-ID: <07Sep8.095142pdt."57996"@synergy1.parc.xerox.com>

I've now built a framework in test_ssl to test all client protocols
(SSL2, SSL3, SSL23, TLS1) against all server protocols, and here's
what I've come up with.  Servers are along the X axis, and clients are
on the Y axis.  "Yes" means that that client protocol can talk to that
server protocol.

	SSL2	SSL3	SSL23	TLS1
SSL2	yes	no	no	no
SSL3	yes	yes	yes	no
SSL23	no	no	yes	no
TLS1	no	no	yes	yes

I'm a bit surprised by the facts that (1) an SSL2 client can't connect
to an SSL23 server, and (2) an SSL23 client can *only* connect to an
SSL23 server.  Can anyone verify that these combos (the results of
testing with the Python framework) are indeed to be expected?

Bill


From janssen at parc.com  Sat Sep  8 20:41:42 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 11:41:42 PDT
Subject: [Python-Dev] working with Python threads from C extension
	module?
In-Reply-To: <07Sep8.081836pdt."57996"@synergy1.parc.xerox.com> 
References: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com>
	<07Sep7.162040pdt."57996"@synergy1.parc.xerox.com>
	<1189247808.11322.212.camel@localhost>
	<07Sep8.081836pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep8.114147pdt."57996"@synergy1.parc.xerox.com>

> This has got to be a problem with other extension modules linked to
> libraries which have their own threading abstractions.

Sure enough, sqlite3 simply assumes threads (won't build without
them), and turns them on if it's used (by calling
PyThread_get_thread_ident(), which in turn calls
PyThread_init_thread()).

Bill

From janssen at parc.com  Sat Sep  8 20:57:33 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 11:57:33 PDT
Subject: [Python-Dev] testing in a Python --without-threads build
Message-ID: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com>

I can't seem to run the regression tests in a --without-threads build.
Might be interesting to configure a buildbot this way to keep
ourselves honest.

Bill

From janssen at parc.com  Sat Sep  8 21:19:26 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 12:19:26 PDT
Subject: [Python-Dev] testing in a Python --without-threads build
In-Reply-To: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com> 
References: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep8.121928pdt."57996"@synergy1.parc.xerox.com>

> I can't seem to run the regression tests in a --without-threads build.
> Might be interesting to configure a buildbot this way to keep
> ourselves honest.

Because regrtest.py was importing test_socket_ssl without catching the
ImportError exception:

% ./python.exe ./Lib/test/regrtest.py test_socket_ssl
test_socket_ssl
test_socket_ssl skipped -- No module named thread
1 test skipped:
    test_socket_ssl
Traceback (most recent call last):
  File "./Lib/test/regrtest.py", line 1190, in <module>
    main()
  File "./Lib/test/regrtest.py", line 416, in main
    e = _ExpectedSkips()
  File "./Lib/test/regrtest.py", line 1111, in __init__
    from test import test_socket_ssl
  File "/local/python/trunk/src/Lib/test/test_socket_ssl.py", line 8, in <module>
    import threading
  File "/local/python/trunk/src/Lib/threading.py", line 6, in <module>
    import thread
ImportError: No module named thread
%

So, is this an "expected skip" or not?

Bill


From martin at v.loewis.de  Sat Sep  8 21:40:52 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 08 Sep 2007 21:40:52 +0200
Subject: [Python-Dev] testing in a Python --without-threads build
In-Reply-To: <07Sep8.121928pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com>
	<07Sep8.121928pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46E2FAC4.2030900@v.loewis.de>

>> I can't seem to run the regression tests in a --without-threads build.
>> Might be interesting to configure a buildbot this way to keep
>> ourselves honest.
> 
> Because regrtest.py was importing test_socket_ssl without catching the
> ImportError exception:

If that is the reason you cannot run it, then it seems it works just
fine. There is nothing wrong with tests getting skipped.

> So, is this an "expected skip" or not?

No. IIUC, "expected skips" are a platform property. For your platform,
support for threads is expected (whatever your platform is, as long as
it was built in this millennium).

Regards,
Martin

From janssen at parc.com  Sat Sep  8 22:16:47 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 13:16:47 PDT
Subject: [Python-Dev] testing in a Python --without-threads build
In-Reply-To: <46E2FAC4.2030900@v.loewis.de> 
References: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com>
	<07Sep8.121928pdt."57996"@synergy1.parc.xerox.com>
	<46E2FAC4.2030900@v.loewis.de>
Message-ID: <07Sep8.131648pdt."57996"@synergy1.parc.xerox.com>

> > Because regrtest.py was importing test_socket_ssl without catching the
> > ImportError exception:
> 
> If that is the reason you cannot run it, then it seems it works just
> fine. There is nothing wrong with tests getting skipped.

It wasn't getting skipped, it was crashing the regression testing harness.
test_unittest catches the ImportError, but this was imported directly
from regrtest.py.

> > So, is this an "expected skip" or not?
> 
> No. IIUC, "expected skips" are a platform property. For your platform,
> > support for threads is expected (whatever your platform is, as long as
> > it was built in this millennium).

OK.  I'll put in a check for this.  In fact, here's a patch:

Index: Lib/test/regrtest.py
===================================================================
--- Lib/test/regrtest.py	(revision 58052)
+++ Lib/test/regrtest.py	(working copy)
@@ -1108,7 +1108,6 @@
 class _ExpectedSkips:
     def __init__(self):
         import os.path
-        from test import test_socket_ssl
         from test import test_timeout
 
         self.valid = False
@@ -1122,8 +1121,13 @@
             if not os.path.supports_unicode_filenames:
                 self.expected.add('test_pep277')
 
-            if test_socket_ssl.skip_expected:
-                self.expected.add('test_socket_ssl')
+            try:
+                from test import test_socket_ssl
+            except ImportError:
+                pass
+            else:
+                if test_socket_ssl.skip_expected:
+                    self.expected.add('test_socket_ssl')
 
             if test_timeout.skip_expected:
                 self.expected.add('test_timeout')
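
The pattern in the patch above (attempt the import, ignore the module on
ImportError, consult it only in the success path) can be sketched as a
stand-alone helper. The function name expected_skips() and its argument
are made up here; unlike the patch, it reads skip_expected with getattr()
so modules lacking the attribute are tolerated.

```python
import importlib


def expected_skips(candidates):
    """Return names of modules that import cleanly and set skip_expected."""
    expected = set()
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The module cannot be imported in this build; leave it out
            # instead of letting the error escape to the caller.
            continue
        if getattr(module, "skip_expected", False):
            expected.add(name)
    return expected
```

The point is the same as in the patch: a failed import of one test module
should not bring down the code that is merely collecting skip hints.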


Bill

From janssen at parc.com  Sat Sep  8 22:28:22 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 13:28:22 PDT
Subject: [Python-Dev] what platforms require RAND_add() before using SSL?
Message-ID: <07Sep8.132823pdt."57996"@synergy1.parc.xerox.com>

There are some functions in _ssl.c for gathering randomness from a
daemon, and adding that randomness to the pseudo-random number
generator in SSL, before using SSL.  There's a note there saying that
"on some platform" this is necessary.  Anyone know which platforms?

Bill

From janssen at parc.com  Sat Sep  8 22:36:39 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 13:36:39 PDT
Subject: [Python-Dev] [Python-3000] 3.0 crypto
In-Reply-To: <07Sep8.123933pdt."58663"@synergy1.parc.xerox.com> 
References: <07Sep8.123933pdt."58663"@synergy1.parc.xerox.com>
Message-ID: <07Sep8.133648pdt."57996"@synergy1.parc.xerox.com>

> We're already linking against the OpenSSL EVP libraries for hashlib
> (and against the OpenSSL SSL libraries for the SSL support).  It
> wouldn't be hard to expose the EVP functions a bit more, essentially
> as hash functions that return long (and reversible) hashes:
> 
>    encryptor = opensslevp.encryptor("AES-256-CBC", ...maybe some options...)
>    encryptor.update(...some plaintext...)

Almost certainly this signature should be

    encryptor = opensslevp.encryptor("AES-256-CBC", KEY, ...options...)

and correspondingly

    decryptor = opensslevp.decryptor("AES-256-CBC", KEY, ...options...)


Bill

From janssen at parc.com  Sun Sep  9 02:19:30 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 17:19:30 PDT
Subject: [Python-Dev] can't run test_tcl remotely logged in on an OS X
	machine
Message-ID: <07Sep8.171933pdt."57996"@synergy1.parc.xerox.com>

"test_tcl" fails on me (OS X 10.4.10 on an Intel Mac, remotely logged
in via SSH and X Windows):

% test_tcl
2007-09-08 17:00:22.629 python.exe[4163] CFLog (0): CFMessagePort: bootstrap_register(): failed 1100 (0x44c), port = 0x3a03, name = 'Processes-0.58327041'
See /usr/include/servers/bootstrap_defs.h for the error codes.
2007-09-08 17:00:22.630 python.exe[4163] CFLog (99): CFMessagePortCreateLocal(): failed to name Mach port (Processes-0.58327041)
CFMessagePortCreateLocal failed (name = Processes-0.58327041 error = 0)
Abort
%

This is on the trunk.

Bill

From nick.bastin at gmail.com  Sun Sep  9 05:05:23 2007
From: nick.bastin at gmail.com (Nicholas Bastin)
Date: Sat, 8 Sep 2007 23:05:23 -0400
Subject: [Python-Dev] testing in a Python --without-threads build
In-Reply-To: <46E2FAC4.2030900@v.loewis.de>
References: <46E2FAC4.2030900@v.loewis.de>
Message-ID: <66d0a6e10709082005u4af353ebpbd0b7cd6c27db242@mail.gmail.com>

Might expected skips be based on your current configuration, rather
than on what someone statically decided would be appropriate for your
platform?  Every new release I have to go through the 'unexpected
skips' to determine that they're perfectly fine for how I configured
python.

It seems that we ought to provide a mechanism for querying python for
how the build was configured (although for non-unittest cases, failing
to import some modules is usually sufficient information - knowing why
they fail probably doesn't matter)
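
As a rough illustration of such a query, later Python versions ship a
sysconfig module that exposes values recorded at build time. The sketch
below assumes a POSIX-style build where a CONFIG_ARGS variable is
recorded; on other builds the lookup returns None. The helper names
configure_args() and built_with() are invented for the example.

```python
import sysconfig


def configure_args():
    """Return the recorded ./configure arguments, or None if unavailable."""
    return sysconfig.get_config_var("CONFIG_ARGS")


def built_with(flag):
    """Report whether `flag` appears in the recorded configure arguments."""
    args = configure_args()
    return args is not None and flag in args
```

A test harness could call something like built_with("--without-threads")
to decide whether a skipped test is expected for this particular build.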

On 9/8/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> >> I can't seem to run the regression tests in a --without-threads build.
> >> Might be interesting to configure a buildbot this way to keep
> >> ourselves honest.
> >
> > Because regrtest.py was importing test_socket_ssl without catching the
> > ImportError exception:
>
> If that is the reason you cannot run it, then it seems it works just
> fine. There is nothing wrong with tests getting skipped.
>
> > So, is this an "expected skip" or not?
>
> No. IIUC, "expected skips" are a platform property. For your platform,
> support for threads is expected (whatever your platform is, as long as
> it was built in this millennium).
>
> Regards,
> Martin

From martin at v.loewis.de  Sun Sep  9 09:38:29 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 09 Sep 2007 09:38:29 +0200
Subject: [Python-Dev] what platforms require RAND_add() before using SSL?
In-Reply-To: <07Sep8.132823pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep8.132823pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46E3A2F5.9020305@v.loewis.de>

> There are some functions in _ssl.c for gathering randomness from a
> daemon, and adding that randomness to the pseudo-random number
> generator in SSL, before using SSL.  There's a note there saying that
> "on some platform" this is necessary.  Anyone know which platforms?

In general, anything that does not have /dev/[u]random;
older Solaris releases and HP-UX in particular.

Regards,
Martin

From martin at v.loewis.de  Sun Sep  9 09:41:30 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 09 Sep 2007 09:41:30 +0200
Subject: [Python-Dev] can't run test_tcl remotely logged in on an OS X
 machine
In-Reply-To: <07Sep8.171933pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep8.171933pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46E3A3AA.6030103@v.loewis.de>

> "test_tcl" fails on me (OS X 10.4.10 on an Intel Mac, remotely logged
> in via SSH and X Windows):
> 
> % test_tcl
> 2007-09-08 17:00:22.629 python.exe[4163] CFLog (0): CFMessagePort: bootstrap_register(): failed 1100 (0x44c), port = 0x3a03, name = 'Processes-0.58327041'
> See /usr/include/servers/bootstrap_defs.h for the error codes.
> 2007-09-08 17:00:22.630 python.exe[4163] CFLog (99): CFMessagePortCreateLocal(): failed to name Mach port (Processes-0.58327041)
> CFMessagePortCreateLocal failed (name = Processes-0.58327041 error = 0)
> Abort
> %
> 
> This is on the trunk.

That's no surprise, I would say: it seems you link against TkAqua
(not X11 Tk); for that to work, you need a reference to WindowServer,
which won't be available when logged in through SSH.

Regards,
Martin

From giszo at nyomi.hu  Sun Sep  9 11:17:31 2007
From: giszo at nyomi.hu (Giszo)
Date: Sun, 9 Sep 2007 11:17:31 +0200
Subject: [Python-Dev] Porting python
Message-ID: <d812e3e6c33a211a36bbfef0c1caf7e8@89.132.25.148>

Hi!

I've tried to port Python (2.3.6 and 2.5.1) to my own OS. After a few hours
of work, the Python library compiles. When I try to run the compiled
executable, I get the error shown in the following screenshot:
http://giszo.lame.hu/jshot/screens/screen31.png

After a little debugging, I know that it fails while bootstrapping the
exceptions because the initializer function fails to get the "__builtin__"
module. After adding debug printfs to the bltinmodule.c init code, it looks
like the builtin module is initialized properly.

I'd like to ask for some help on where I should start checking the code to
fix the error.

Thanks!





From martin at v.loewis.de  Sun Sep  9 11:46:27 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 09 Sep 2007 11:46:27 +0200
Subject: [Python-Dev] Porting python
In-Reply-To: <d812e3e6c33a211a36bbfef0c1caf7e8@89.132.25.148>
References: <d812e3e6c33a211a36bbfef0c1caf7e8@89.132.25.148>
Message-ID: <46E3C0F3.6080701@v.loewis.de>

> I'd like to ask for some help on where I should start checking the code to
> fix the error.

Python searches possible candidate locations of the standard library for
a landmark, see getpath.c; currently, the landmark is os.py.

If it doesn't find the landmark, it complains.
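
The idea of a landmark search, reduced to a sketch. The real logic in
getpath.c is considerably more involved (prefixes, environment variables,
build directories); find_stdlib() and its arguments are made up for this
illustration, and only the os.py landmark comes from the description
above.

```python
import os


def find_stdlib(candidates, landmark="os.py"):
    """Return the first candidate directory containing the landmark file."""
    for directory in candidates:
        if os.path.isfile(os.path.join(directory, landmark)):
            return directory
    # No candidate held the landmark; the interpreter would complain here.
    return None
```

On a new port, checking which candidate directories are actually probed,
and whether os.py is reachable in any of them, is a reasonable first step.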

Regards,
Martin

From janssen at parc.com  Sun Sep  9 17:44:17 2007
From: janssen at parc.com (Bill Janssen)
Date: Sun, 9 Sep 2007 08:44:17 PDT
Subject: [Python-Dev] what platforms require RAND_add() before using SSL?
In-Reply-To: <46E3A2F5.9020305@v.loewis.de> 
References: <07Sep8.132823pdt."57996"@synergy1.parc.xerox.com>
	<46E3A2F5.9020305@v.loewis.de>
Message-ID: <07Sep9.084424pdt."57996"@synergy1.parc.xerox.com>

> > There are some functions in _ssl.c for gathering randomness from a
> > daemon, and adding that randomness to the pseudo-random number
> > generator in SSL, before using SSL.  There's a note there saying that
> > "on some platform" this is necessary.  Anyone know which platforms?
> 
> In general, anything that does not have /dev/[u]random;
> older Solaris releases and HP-UX in particular.

Thanks, I'll add that to the documentation.  Any ideas what the values
of the "entropy" parameter to RAND_add() are like, or how they are
derived?  I did a rapid skim of RFC 1750, but didn't see it there.
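
For orientation, this is how the call looks from Python in the ssl module
as it was later released: ssl.RAND_add() takes the bytes to mix in plus a
float giving a lower bound on the entropy they contain (so 0.0 is always a
safe, if pessimistic, value). The choice of os.urandom() as the source and
of 32 bytes is only an assumption for the example.

```python
import os
import ssl

# 32 bytes from the operating system's randomness source.
seed = os.urandom(32)

# The second argument is a lower-bound estimate of the entropy in `seed`;
# here every byte is assumed to be fully random.
ssl.RAND_add(seed, float(len(seed)))

# True once the generator considers itself adequately seeded.
seeded = ssl.RAND_status()
```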

Bill

From janssen at parc.com  Sun Sep  9 17:46:34 2007
From: janssen at parc.com (Bill Janssen)
Date: Sun, 9 Sep 2007 08:46:34 PDT
Subject: [Python-Dev] can't run test_tcl remotely logged in on an OS X
	machine
In-Reply-To: <46E3A3AA.6030103@v.loewis.de> 
References: <07Sep8.171933pdt."57996"@synergy1.parc.xerox.com>
	<46E3A3AA.6030103@v.loewis.de>
Message-ID: <07Sep9.084639pdt."57996"@synergy1.parc.xerox.com>

> > "test_tcl" fails on me (OS X 10.4.10 on an Intel Mac, remotely logged
> > in via SSH and X Windows):
> 
> That's no surprise, I would say: it seems you link against TkAqua
> (not X11 Tk); for that to work, you need a reference to WindowServer,
which won't be available when logged in through SSH.

Actually, I think it literally *is* a surprise; if it were truly "no
surprise", the testing harness would have caught it and moved on to
the other tests.  But if you mean, "no big deal", I agree.

Bill

From lukem at NetBSD.org  Mon Sep 10 01:54:30 2007
From: lukem at NetBSD.org (Luke Mewburn)
Date: Mon, 10 Sep 2007 09:54:30 +1000
Subject: [Python-Dev] Word size inconsistencies in C extension modules
Message-ID: <20070909235430.GV25031@mewburn.net>

Hi folks.

While working on an in-house application that uses the curses
module, we noticed that it didn't work as expected on an AIX system
(powerpc 64-bit big-endian LP64), using python 2.3.5.

On a hunch, I took a look through the _cursesmodule.c code and
noticed the use of PyArg_ParseTuple()'s "l" decoding mode to retrieve
a "long" from python into a C type (attr_t) that on AIX is an int.
On 64-bit LP64 platforms, sizeof(long) > sizeof(int), so this
doesn't quite work, especially on big-endian systems.
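
The effect can be imitated in pure Python with the struct module. This is
only a simulation under stated assumptions: an 8-byte long is written to an
address, and the 4-byte int that really lives there is then read back. The
function name store_long_read_int is made up for illustration.

```python
import struct


def store_long_read_int(value, byteorder):
    """Write `value` as an 8-byte long, then read the first 4 bytes as an int.

    Mimics decoding with "l" into storage that is only int-sized.
    """
    prefix = ">" if byteorder == "big" else "<"
    raw = struct.pack(prefix + "q", value)           # what the "l" decoder writes
    return struct.unpack(prefix + "i", raw[:4])[0]   # what the int variable holds


print(store_long_read_int(0x100, "little"))  # 256: low-order bytes come first
print(store_long_read_int(0x100, "big"))     # 0: only the high-order half is seen
```

In real C code the extra four bytes also land on whatever follows the
variable in memory, so little-endian platforms are not safe either; they
merely tend to hide the problem.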

Further research into curses shows that different platforms use a
different underlying C type for the attr_t type (int, unsigned int,
long, unsigned long), so changing the PyArg_ParseTuple() to using
the "i" decoding mode probably wasn't portable.

I documented this problem and provided a patch that fixes it against
the head of the svn trunk in http://bugs.python.org/issue1114
(because the problem appears to still exist in the latest code.)

My workaround was to use a separate explicit C "long" to decode
the value from python into, and then just assign that to the
final value and hope that the type promotion does the right thing
on the native platform.

My questions are:

 (a)    What's the "preferred" style in python extension modules
        of parsing a number from python into a C type, where the
        C type size may change on different platforms? 
        Is my method of guessing what the largest common size
        will be (long, unsigned long, ...), reading into that,
        and assigning to the final type, acceptable?

 (b)    Is there a desire to see the standard python C extension
        modules cleaned up to use the answer to (a), especially
        where said modules may be susceptible to the word
        size problems I mentioned?
        (64bit big-endian platforms such as powerpc and sparc64
        are good for detecting word-size lossage)


cheers,
Luke.

From janssen at parc.com  Mon Sep 10 03:41:32 2007
From: janssen at parc.com (Bill Janssen)
Date: Sun, 9 Sep 2007 18:41:32 PDT
Subject: [Python-Dev] tests expanded for SSL module -- other suggestions?
Message-ID: <07Sep9.184134pdt."57996"@synergy1.parc.xerox.com>

I'm looking for suggestions for other SSL module tests.  

Here's the result of running my (not yet checked-in) test_ssl.py
module in verbose mode.  I'm pretty happy with the codebase right now,
and barring other tests, I'm ready to check it in and start on the 3.x
patch (or perhaps the 2.3 package).

In the client/server tests, a new server thread is created for
each test.

In the STARTTLS test, several messages are exchanged in the clear,
then the client sends a STARTTLS message and after the server replies
"OK", initiates the TLS handshake.
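
The server side of that exchange can be modelled without any sockets or TLS
at all. The sketch below is not the code in test_ssl.py; the names
server_step and run_exchange are hypothetical, and the echo-in-lowercase
behaviour is inferred from the verbose log further down.

```python
def server_step(message, tls_active):
    """Return (reply, tls_active) for one message received by the server."""
    if message.strip() == "STARTTLS" and not tls_active:
        # The "OK" reply still goes out in the clear; the handshake follows it.
        return "OK\n", True
    return message.lower(), tls_active


def run_exchange(messages):
    """Feed messages to the server model and record each step."""
    tls_active = False
    transcript = []
    for message in messages:
        reply, tls_active = server_step(message, tls_active)
        transcript.append((message, reply, tls_active))
    return transcript


for step in run_exchange(["msg 1", "MSG 2", "STARTTLS", "MSG 3"]):
    print(step)
```

The third element of each tuple records whether the connection counts as
encrypted once that message has been handled.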

It would be nice to have an external HTTPS server on python.org
that could be used for an HTTPS connection test.  Is there one?

Bill

% ./python.exe ./Lib/test/regrtest.py -u all -v test_ssl
test_ssl
testCrucialConstants (test.test_ssl.BasicTests) ... ok
testParseCert (test.test_ssl.BasicTests) ... 
{'notAfter': 'Feb 16 16:54:50 2013 GMT',
 'subject': ((('countryName', u'US'),),
             (('stateOrProvinceName', u'Delaware'),),
             (('localityName', u'Wilmington'),),
             (('organizationName', u'Python Software Foundation'),),
             (('organizationalUnitName', u'SSL'),),
             (('commonName', u'somemachine.python.org'),))}
ok
testRAND (test.test_ssl.BasicTests) ... 
 RAND_status is 1 (sufficient randomness)
ok
testSSLconnect (test.test_ssl.BasicTests) ... ok
testEcho (test.test_ssl.ConnectedTests) ... 
 server:  new connection from ('127.0.0.1', 51840)
 server: connection cipher is now ('AES256-SHA', 'TLSv1/SSLv3', 256)
 client:  sending 'FOO\n'...
 server: read 'FOO\n', sending back 'foo\n'...
 client:  read 'foo\n'
 client:  closing connection.
 server: client closed connection
ok
testMalformedCert (test.test_ssl.ConnectedTests) ... ok
testMalformedKey (test.test_ssl.ConnectedTests) ... ok
testNULLcert (test.test_ssl.ConnectedTests) ... ok
testReadCert (test.test_ssl.ConnectedTests) ... 
{'notAfter': 'Feb 16 16:54:50 2013 GMT',
 'subject': ((('countryName', u'US'),),
             (('stateOrProvinceName', u'Delaware'),),
             (('localityName', u'Wilmington'),),
             (('organizationName', u'Python Software Foundation'),),
             (('organizationalUnitName', u'SSL'),),
             (('commonName', u'somemachine.python.org'),))}
Connection cipher is ('AES256-SHA', 'TLSv1/SSLv3', 256).
ok
testRudeShutdown (test.test_ssl.ConnectedTests) ... ok
testSSL2 (test.test_ssl.ConnectedTests) ... 
 SSLv2->SSLv2 CERT_NONE
 SSLv2->SSLv2 CERT_OPTIONAL
 SSLv2->SSLv2 CERT_REQUIRED
 SSLv23->SSLv2 CERT_NONE
 {SSLv3->SSLv2} CERT_NONE
 {TLSv1->SSLv2} CERT_NONE
ok
testSSL23 (test.test_ssl.ConnectedTests) ... 
 {SSLv2->SSLv23} CERT_NONE
 SSLv3->SSLv23 CERT_NONE
 SSLv23->SSLv23 CERT_NONE
 TLSv1->SSLv23 CERT_NONE
 {SSLv2->SSLv23} CERT_OPTIONAL
 SSLv3->SSLv23 CERT_OPTIONAL
 SSLv23->SSLv23 CERT_OPTIONAL
 TLSv1->SSLv23 CERT_OPTIONAL
 {SSLv2->SSLv23} CERT_REQUIRED
 SSLv3->SSLv23 CERT_REQUIRED
 SSLv23->SSLv23 CERT_REQUIRED
 TLSv1->SSLv23 CERT_REQUIRED
ok
testSSL3 (test.test_ssl.ConnectedTests) ... 
 SSLv3->SSLv3 CERT_NONE
 SSLv3->SSLv3 CERT_OPTIONAL
 SSLv3->SSLv3 CERT_REQUIRED
 {SSLv2->SSLv3} CERT_NONE
 {SSLv23->SSLv3} CERT_NONE
 {TLSv1->SSLv3} CERT_NONE
ok
testSTARTTLS (test.test_ssl.ConnectedTests) ... 
 client:  sending 'msg 1'...
 server:  new connection from ('127.0.0.1', 51870)
 server: read 'msg 1', sending back 'msg 1'...
 client:  read 'msg 1' from server
 client:  sending 'MSG 2'...
 server: read 'MSG 2', sending back 'msg 2'...
 client:  read 'msg 2' from server
 client:  sending 'STARTTLS'...
 server: read STARTTLS from client, sending OK...
 client:  read 'OK\n' from server, starting TLS...
 server: connection cipher is now ('AES256-SHA', 'TLSv1/SSLv3', 256)
 client:  sending 'MSG 3'...
 server: read 'MSG 3', sending back 'msg 3'...
 client:  read 'msg 3' from server
 client:  sending 'msg 4'...
 server: read 'msg 4', sending back 'msg 4'...
 client:  read 'msg 4' from server
 client:  closing connection.
 server: client closed connection
ok
testTLS1 (test.test_ssl.ConnectedTests) ... 
 TLSv1->TLSv1 CERT_NONE
 TLSv1->TLSv1 CERT_OPTIONAL
 TLSv1->TLSv1 CERT_REQUIRED
 {SSLv2->TLSv1} CERT_NONE
 {SSLv3->TLSv1} CERT_NONE
 {SSLv23->TLSv1} CERT_NONE
ok

----------------------------------------------------------------------
Ran 15 tests in 6.866s

OK
1 test OK.
CAUTION:  stdout isn't compared in verbose mode:
a test that passes in verbose mode may fail without it.
[23679 refs]

From pfdubois at gmail.com  Mon Sep 10 07:30:14 2007
From: pfdubois at gmail.com (Paul Dubois)
Date: Sun, 9 Sep 2007 22:30:14 -0700
Subject: [Python-Dev] summaries not arriving
Message-ID: <f74a6c2f0709092230x66d0bc68nf478dac3890d8b3e@mail.gmail.com>

The weekly summaries from the new bug tracker are disappearing somewhere
between the tracker and python-dev. My attempt to post one by hand was
rejected by python-dev-owner (Barry Warsaw?) without explanation. Perhaps he
has bounced the others; emails to python-dev-owner result in an automated
message suggesting that my mail may never be read so I don't know how to ask
him.

As a small boy I once knew wrote, I must not use bad words. (:->

Paul Dubois

From martin at v.loewis.de  Mon Sep 10 07:37:02 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 10 Sep 2007 07:37:02 +0200
Subject: [Python-Dev] Word size inconsistencies in C extension modules
In-Reply-To: <20070909235430.GV25031@mewburn.net>
References: <20070909235430.GV25031@mewburn.net>
Message-ID: <46E4D7FE.5090905@v.loewis.de>

>  (a)    What's the "preferred" style in python extension modules
>         of parsing a number from python into a C type, where the
>         C type size may change on different platforms? 
>         Is my method of guessing what the largest common size
>         will be (long, unsigned long, ...), reading into that,
>         and assigning to the final type, acceptable?

Yes, that's the best thing we have come up with. You then have
the issue of potential truncation on assignment: if the value
passed fits into a long (say) but not an attr_t, it would
be good if an error was raised. In the past, we have typically
coded that ValueError explicitly after the ParseTuple call.
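
The shape of such a check, sketched in Python for brevity (in an extension
module it would be C written after the ParseTuple call). The helper name
checked_narrow and its parameters are assumptions made for illustration.

```python
def checked_narrow(value, size_bytes, signed=True):
    """Return `value` if it fits in an integer type of `size_bytes` bytes.

    Raises ValueError instead of silently truncating on assignment.
    """
    bits = size_bytes * 8
    if signed:
        low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        low, high = 0, (1 << bits) - 1
    if not low <= value <= high:
        raise ValueError(
            "value %d does not fit in a %d-byte integer" % (value, size_bytes)
        )
    return value


print(checked_narrow(2**31 - 1, 4))   # fits in a 4-byte signed int
try:
    checked_narrow(2**31, 4)          # fits in an 8-byte long, not in the int
except ValueError as exc:
    print("rejected:", exc)
```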

In principle, it is possible to deal with these in ParseTuple.
To do so:
a) in configure.in, make a configure-time check to compute the
   size of the type, and possibly its signedness.
b) in _cursesmodule.c, make a conditional define of ATTR_T_FMT,
   which would be either "i" or "l" (or #error if it's neither
   the size of int nor the size of long). Then rely on string
   concatenation in using that define.
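
To make the selection logic in (b) concrete, here is a Python sketch that
uses ctypes to stand in for the configure-time size check. It is not a
substitute for the C preprocessor version; pick_format is a hypothetical
name, and the ":attron" suffix is only an example of a format string tail.

```python
import ctypes


def pick_format(type_size):
    """Choose the ParseTuple format unit for a type of the given size."""
    if type_size == ctypes.sizeof(ctypes.c_int):
        return "i"
    if type_size == ctypes.sizeof(ctypes.c_long):
        return "l"
    # Counterpart of the #error branch: neither int-sized nor long-sized.
    raise RuntimeError("no format unit for a %d-byte type" % type_size)


# Pretend the size check reported that attr_t is int-sized on this platform.
ATTR_T_FMT = pick_format(ctypes.sizeof(ctypes.c_int))

# In C this would be adjacent string literals; here it is explicit.
FORMAT = ATTR_T_FMT + ":attron"
print(FORMAT)
```

Signedness would need a second switch (for instance between signed and
unsigned format units), which this sketch leaves out.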

>  (b)    Is there a desire to see the standard python C extension
>         modules cleaned up to use the answer to (a), especially
>         where said modules may be susceptible to the word
>         size problems I mentioned?

Most certainly. There shouldn't be that many places left, though;
most have been fixed over the years already.

I have a GCC patch which checks for correctness of ParseTuple
calls (in terms of data size) if you are interested.

Regards,
Martin

From glyph at divmod.com  Mon Sep 10 07:58:52 2007
From: glyph at divmod.com (glyph at divmod.com)
Date: Mon, 10 Sep 2007 05:58:52 -0000
Subject: [Python-Dev] Design and direction of the SSL module (was Re:
	frozenset C API?)
In-Reply-To: <07Sep6.111518pdt."57996"@synergy1.parc.xerox.com>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<07Sep6.101542pdt."57996"@synergy1.parc.xerox.com>
	<20070906174518.21185.1342025567.divmod.xquotient.7082@joule.divmod.com>
	<07Sep6.111518pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <20070910055852.5579.246755534.divmod.xquotient.735@joule.divmod.com>

Sorry for the late response.  As always, I have a lot of other stuff 
going on at the moment, but I'm very interested in this subject.

On 6 Sep, 06:15 pm, janssen at parc.com wrote:
>>PyOpenSSL, in particular, is both a popular de-facto
>>standard *and* almost completely unmaintained; python's standard 
>>library
>>could absorb/improve it with little fuss.
>
>Good idea, go for it!  A full wrapper for OpenSSL is beyond the scope
>of my ambition; I'm simply trying to add a simple fix to what's
>already in the standard library.

I guess I'd like to know two things.  One, what *is* the scope of your 
ambition?  I feel silly for asking, because I am pretty sure that 
somewhere in the beginning of this thread I missed either a proposal, a 
PEP reference, or a ticket number, but I've poked around a little and I 
can't seem to find it.  Can you provide a reference, or describe what it 
is you're trying to do?

Two, what's the scope of "the" plans for the SSL module in general for 
Python?  I think I misinterpreted several things that you said as "the 
plan" rather than your own personal requirements: but if in reality, I 
can "go for it", I'd really like to help make the stdlib SSL module
a really good, full-featured OpenSSL implementation for Python so we
can have it elsewhere.  (If I recall correctly you mentioned you'd like 
to use it with earlier Python versions as well...?)

Many of the things that you recommend using another SSL library for,
like pulling out arbitrary extensions, are incredibly unwieldy or
flat-out broken in these libraries.  It's not that I mind going to a
different source for this functionality; it's that in many cases, there 
*isn't* another source :).  I think I might have said this already, but 
subjectAltName, for example, isn't exposed in any way by PyOpenSSL.

I didn't particularly want to start my own brand-new SSL wrapper 
project, and contributing to the actively-maintained stdlib 
implementation is a lot more appealing than forking the moribund 
PyOpenSSL.

However, even with lots of help on the maintenance, converting the 
current SSL module into a complete SSL library is a lot of work.  Here 
are the questions that I'd like answers to before starting to think 
seriously about it:

    * Is this idea even congruent with the overall goals of other 
developers interested in SSL for Python?  If not, I'm obviously barking 
up the wrong tree.
    * Would it be possible to distribute as a separate library?  (I think 
I remember Bill saying something about that already...)
    * When would such work have to be completed by to fit into the 2.6 
release?  (I just want a rough estimate, here.)
    * Should someone - and I guess by someone I mean me - write up a PEP 
describing this?

My own design for an SSL wrapper - although this is simply a Python layer
around PyOpenSSL - is here:

http://twistedmatrix.com/trac/browser/trunk/twisted/internet/_sslverify.py

This isn't really complete - in particular, the documentation is 
lacking, and it can't implement the stuff PyOpenSSL is missing - but I 
definitely like the idea of having objects for DNs, certificates, CRs, 
keys, key pairs, and the ubiquitous
certificate-plus-matching-private-key-in-one-file that you need to run an
HTTPS server :).  If I am going
to write a PEP, it will look a lot like that file.

_sslverify was originally designed for a system that does lots of 
automatic signing, so I am particularly interested in it being easy to 
implement a method like  PrivateCertificate.signCertificateRequest - 
it's always such a pain to get all the calls for signing a CR in any 
given library *just so*.

>>This begs the question: M2Crypto and PyOpenSSL already do what you're
>>proposing to do, as far as I can tell, and are, as you say, "more
>>powerful".

To clarify my point here, when I say that they "already do" what you're 
doing, what I mean is, they already wrap SSL, and you are trying to wrap 
SSL :).

>I'm trying to give the application the ability to do some level of
>authorization without requiring either of those packages.

I'd say "why wouldn't you want to require either of those packages?" but 
actually, I know why you wouldn't want to, and it's that they're bad. 
So, given that we don't want to require them, wouldn't it be nice if we 
didn't need to require them at all? :).

>Like being
>able to tell who's on the other side of the connection :-).  Right
>now, I think the right fields to expose are

I don't quite understand what you mean by "right" fields.  Right fields 
for what use case?  This definitely isn't "right" for what I want to use 
SSL for.

>  "subject" (I see little point to exposing "issuer"),

This is a good example of what I mean.  For HTTPS, the relationship 
between the subject and the issuer is moot, but in my own projects, the 
relationship is very interesting.  Specifically, properties of the 
issuer define what properties the subject may have, in the verification 
scheme for Vertex ( http://divmod.org/trac/wiki/DivmodVertex ).  (On the 
other hand, Vertex requires STARTTLS, so it itself can't be an *actual* 
use-case for this SSL library until it also starts supporting
mid-connection TLS startup.)

I can understand that you might not have use-cases for exposing these 
features, but your phrasing suggests that it would be a bad idea to 
expose them, not just that it's too much work.  Am I misinterpreting? 
Are you just saying it isn't worth the work at this point?

>  "notAfter" (you're always guaranteed to be after "notBefore", or the
>  cert wouldn't validate, so I see little point to exposing that, but
>  "notAfter" can be used after the connection has been established),

Wouldn't it be nice to know *why* the cert didn't validate?  To provide 
the user with a message including the notBefore date, in case their 
clock is set wrong or something?

>I don't see how the other fields in the cert can be profitably used.

The entire idea of "extensions" is pretty direct about the fact that the 
original implementor need not understand their profitable use :).

>>When you say "the full DER form", are you simply referring to the full
>>blob, or a broken-down representation by key and by extension?
>
>The full blob.

Obviously, I think the broken-down representation would be nicer :).

I know I'll have to wrangle with a bit of ASN.1 if I want to get 
anything useful out of most extensions, but if it's just the extension 
data there are a lot of cases where I think I could fake it.  Re-parsing 
the whole DER is going to require a real, full-on ASN.1 library.

From greg at krypto.org  Mon Sep 10 08:40:05 2007
From: greg at krypto.org (Gregory P. Smith)
Date: Sun, 9 Sep 2007 23:40:05 -0700
Subject: [Python-Dev] BerkeleyDB 4.6.19 is buggy and causes test_bsddb3 to
	hang
Message-ID: <52dc1c820709092340t39986c5er3a6d782409849a03@mail.gmail.com>

BerkeleyDB 4.6.19 is a buggy release: DB_HASH access method databases
can lock up the process.  This is why several of the bleeding edge distro
buildbots are timing out while running test_bsddb3.  I've created a simple C
test case and made sleepycat^Woracle aware of the problem.

I have a change in my sandbox to explicitly avoid linking with 4.6.19 but it
seems like committing it would just pollute setup.py with vague notions of
what versions of a specific library are bad.  I'd prefer to just disallow
use of libdb 4.6 completely in setup.py until oracle fixes this and we're
sure no OS release ships with 4.6.19.
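
For illustration only, the "disallow 4.6 entirely" rule amounts to a filter
on (major, minor). This is not the code in setup.py; the function name
is_usable_db_version and the tuple representation of versions are
assumptions.

```python
# Release series to refuse outright, as (major, minor).
DISALLOWED_SERIES = {(4, 6)}


def is_usable_db_version(version):
    """Return False for any libdb version in a disallowed series.

    `version` is a (major, minor, patch) tuple, e.g. (4, 6, 19).
    """
    return tuple(version[:2]) not in DISALLOWED_SERIES


print(is_usable_db_version((4, 5, 20)))  # True
print(is_usable_db_version((4, 6, 19)))  # False: the known-bad release
print(is_usable_db_version((4, 6, 21)))  # False: the whole series is excluded
```

Banning the series rather than the single patch level avoids having to
track which 4.6.x releases are affected.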

thoughts?

-gps


http://groups.google.com/group/comp.databases.berkeley-db/browse_thread/thread/abf12452613ca7ec

From lukem at NetBSD.org  Mon Sep 10 09:11:59 2007
From: lukem at NetBSD.org (Luke Mewburn)
Date: Mon, 10 Sep 2007 17:11:59 +1000
Subject: [Python-Dev] Word size inconsistencies in C extension modules
In-Reply-To: <46E4D7FE.5090905@v.loewis.de>
References: <20070909235430.GV25031@mewburn.net> <46E4D7FE.5090905@v.loewis.de>
Message-ID: <20070910071159.GA27320@mewburn.net>

On Mon, Sep 10, 2007 at 07:37:02AM +0200, "Martin v. Löwis" wrote:
  | In principle, it is possible to deal with these in ParseTuple.
  | To do so:
  | a) in configure.in, make a configure-time check to compute the
  |    size of the type, and possibly its signedness.
  | b) in _cursesmodule.c, make a conditional define of ATTR_T_FMT,
  |    which would be either "i" or "l" (or #error if it's neither
  |    the size of int nor the size of long). Then rely on string
  |    concatenation in using that define.

Are there some good examples in the Python source where
this technique has been used already?
Or were you proposing a cleaner solution that could be
experimented with?


  | I have a GCC patch which checks for correctness of ParseTuple
  | calls (in terms of data size) if you are interested.

Sounds like a useful variation of the standard -Wformat stuff.

This probably wouldn't have helped in the AIX situation I experienced
(because the IBM compiler was used in that situation), but it could
be useful on other BE LP64 platforms that are more gcc-friendly
(e.g., NetBSD/sparc64).

From martin at v.loewis.de  Mon Sep 10 09:54:08 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 10 Sep 2007 09:54:08 +0200
Subject: [Python-Dev] Word size inconsistencies in C extension modules
In-Reply-To: <20070910071159.GA27320@mewburn.net>
References: <20070909235430.GV25031@mewburn.net> <46E4D7FE.5090905@v.loewis.de>
	<20070910071159.GA27320@mewburn.net>
Message-ID: <46E4F820.1050204@v.loewis.de>

Luke Mewburn schrieb:
> On Mon, Sep 10, 2007 at 07:37:02AM +0200, "Martin v. Löwis" wrote:
>   | In principle, it is possible to deal with these in ParseTuple.
>   | To do so:
>   | a) in configure.in, make a configure-time check to compute the
>   |    size of the type, and possibly its signedness.
>   | b) in _cursesmodule.c, make a conditional define of ATTR_T_FMT,
>   |    which would be either "i" or "l" (or #error if it's neither
>   |    the size of int nor the size of long). Then rely on string
>   |    concatenation in using that define.
> 
> Are there some good examples in the Python source where
> this technique has been used already?

Not directly. A check for the size of a library type can be found
for fpos_t, but there, no ParseTuple depends on it. An example
for using variable formatters (though again not for ParseTuple)
is PY_FORMAT_SIZE_T.

> Or were you proposing a cleaner solution that could be
> experimented with?

More that, yes.

>   | I have a GCC patch which checks for correctness of ParseTuple
>   | calls (in terms of data size) if you are interested.
> 
> Sounds like a useful variation of the standard -Wformat stuff.

Indeed, it's an extension to it. Unfortunately, introducing new kinds
of formats is only possible by editing GCC (and then, the existing
framework is focussed on %-style patterns, so I had to bypass that
framework as well - but there were hooks for doing so).

Regards,
Martin


From barry at python.org  Mon Sep 10 13:14:30 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 10 Sep 2007 07:14:30 -0400
Subject: [Python-Dev] summaries not arriving
In-Reply-To: <f74a6c2f0709092230x66d0bc68nf478dac3890d8b3e@mail.gmail.com>
References: <f74a6c2f0709092230x66d0bc68nf478dac3890d8b3e@mail.gmail.com>
Message-ID: <48199DF6-10D5-413D-9DAD-F7F1FD849072@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 10, 2007, at 1:30 AM, Paul Dubois wrote:

> The weekly summaries from the new bug tracker are disappearing  
> somewhere
> between the tracker and python-dev. My attempt to post one by hand was
> rejected by python-dev-owner (Barry Warsaw?) without explanation.  
> Perhaps he
> has bounced the others; emails to python-dev-owner result in an  
> automated
> message suggesting that my mail may never be read so I don't know  
> how to ask
> him.

Nope, I didn't bounce them.  I don't /think/ they'll bounce  
automatically.  Can you forward a bounce message to me directly?

- -Barry


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRuUnF3EjvBPtnXfVAQKDvwQAnNbVXwY1Fc00TLLFrOffFwU+jRJVxSyJ
J0HV/+ssVuX85+LfM7kAsbwZSAWM0PkTVrhfldtbD5j6D0x0II/C5GX21fQ0V+pg
fyf9HQ9i1LSUe7TvvCyXGSI7d8snNBqBpsyQ2EakQ3OGlcMjILPVVmyVSDFd2mLr
Z2VbrlinB58=
=HohF
-----END PGP SIGNATURE-----

From anthony at ekit-inc.com  Mon Sep 10 07:43:09 2007
From: anthony at ekit-inc.com (Anthony Baxter)
Date: Mon, 10 Sep 2007 15:43:09 +1000
Subject: [Python-Dev] summaries not arriving
In-Reply-To: <f74a6c2f0709092230x66d0bc68nf478dac3890d8b3e@mail.gmail.com>
References: <f74a6c2f0709092230x66d0bc68nf478dac3890d8b3e@mail.gmail.com>
Message-ID: <200709101543.10576.anthony@ekit-inc.com>

On Monday 10 September 2007, Paul Dubois wrote:
> As a small boy I once knew wrote, I must not use bad words. (:->

It's OK to use them about Barry, though, surely?

*wave* Hi Barry.




-- 
Anthony Baxter, ekit.      anthony at ekit-inc.com     (03) 9674 7015
Level 3 The Teahouse, 28 Clarendon St, Sth Melbourne Australia 3205 

From barry at python.org  Mon Sep 10 13:39:25 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 10 Sep 2007 07:39:25 -0400
Subject: [Python-Dev] summaries not arriving
In-Reply-To: <200709101543.10576.anthony@ekit-inc.com>
References: <f74a6c2f0709092230x66d0bc68nf478dac3890d8b3e@mail.gmail.com>
	<200709101543.10576.anthony@ekit-inc.com>
Message-ID: <B258668C-3385-43EF-A13F-C00BA9CF495E@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 10, 2007, at 1:43 AM, Anthony Baxter wrote:

> On Monday 10 September 2007, Paul Dubois wrote:
>> As a small boy I once knew wrote, I must not use bad words. (:->
>
> It's OK to use them about Barry, though, surely?
>
> *wave* Hi Barry.

It's okay from /you/ Anthony, because it's the only way I know you  
still care.

baby-unicorn-hugs-ly y'rs,
- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRuUs7XEjvBPtnXfVAQJZSQP+JoApQY+tY4zkZDN2OlE+jFv8xdF0vqRW
LCK+p8yIQjlrkMC58c2CChvOsWTcH6tZMFAd0jK8d9q8NxyyN3tM7mbh25Rnm9fo
KC9uDt787fY8RpRC5YC+zEtM589Y6omL3S4XcqdkTS9UWg6S50e9EDkqrjKmE1gb
8/1LSynRnF8=
=W6ef
-----END PGP SIGNATURE-----

From martin at v.loewis.de  Mon Sep 10 16:21:39 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 10 Sep 2007 16:21:39 +0200
Subject: [Python-Dev] BerkeleyDB 4.6.19 is buggy and causes test_bsddb3
 to	hang
In-Reply-To: <52dc1c820709092340t39986c5er3a6d782409849a03@mail.gmail.com>
References: <52dc1c820709092340t39986c5er3a6d782409849a03@mail.gmail.com>
Message-ID: <46E552F3.10707@v.loewis.de>

> I have a change in my sandbox to explicitly avoid linking with 4.6.19
> but it seems like committing it would just pollute setup.py with vague
> notions of what versions of a specific library are bad.  I'd prefer to
> just disallow use of libdb 4.6 completely in setup.py until oracle fixes
> this and we're sure no OS release ships with 4.6.19.
> 
> thoughts?

That sounds like the right solution to me. We should review it when/if
a patch is available. "No OS release ships" is a difficult-to-test-for
condition, given things like Gentoo and Debian unstable that
simultaneously ship in dozens of versions (i.e. with "releases" every
two hours or more often). After a bug fix is available, and some time
has passed, I'd rather reallow 4.6.x, and put something into README
about this bug.

Thanks for investigating it.

Regards,
Martin

From janssen at parc.com  Mon Sep 10 17:37:14 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Sep 2007 08:37:14 PDT
Subject: [Python-Dev] Design and direction of the SSL module (was Re:
	frozenset C API?)
In-Reply-To: <20070910055852.5579.246755534.divmod.xquotient.735@joule.divmod.com>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<07Sep6.101542pdt."57996"@synergy1.parc.xerox.com>
	<20070906174518.21185.1342025567.divmod.xquotient.7082@joule.divmod.com>
	<07Sep6.111518pdt."57996"@synergy1.parc.xerox.com>
	<20070910055852.5579.246755534.divmod.xquotient.735@joule.divmod.com>
Message-ID: <07Sep10.083715pdt."57996"@synergy1.parc.xerox.com>

> One, what *is* the scope of your 
> ambition?  I feel silly for asking, because I am pretty sure that 
> somewhere in the beginning of this thread I missed either a proposal, a 
> PEP reference, or a ticket number, but I've poked around a little and I 
> can't seem to find it.  Can you provide a reference, or describe what it 
> is you're trying to do?

Sorry about that.  We kind of did this on the fly at the Python
sprint.

I was trying to fix two problems:  One, that the current socket.ssl
support didn't validate certificates, and two, that you couldn't do
server-side SSL with it.  I'm only interested in that aspect, and in
the simplest possible solution to those problems.  I don't want to
provide user validation callbacks, or arbitrary certificate decoding,
or general-purpose crypto, or support for building automatic CA
systems, or wrapping most of that great grab-bag of useful stuff
called OpenSSL.  Just fix the core issues with socket.ssl.

Along the way I've found a nasty little threading/malloc bug in the
existing code, and fixed that.  I've added real documentation for the
existing functionality.  I've gone around with you and Martin, mainly,
on what information to expose from the validated certificate, to
support authorization and accounting (the answer so far: "notAfter",
"subject", and "subjectAltName", if it's there).
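
As an illustration of what an application could do with just those fields,
the sketch below works on a dictionary shaped like the testParseCert output
posted earlier in this archive (with the u'' prefixes dropped). The helper
names subject_field and is_expired are hypothetical, and the date parsing
assumes an English/C locale for the month abbreviation.

```python
from datetime import datetime

# Shape copied from the verbose test output; values are the test certificate's.
cert = {
    'notAfter': 'Feb 16 16:54:50 2013 GMT',
    'subject': ((('countryName', 'US'),),
                (('stateOrProvinceName', 'Delaware'),),
                (('localityName', 'Wilmington'),),
                (('organizationName', 'Python Software Foundation'),),
                (('organizationalUnitName', 'SSL'),),
                (('commonName', 'somemachine.python.org'),)),
}


def subject_field(cert, name):
    """Return the first value of `name` in the subject, or None."""
    for rdn in cert['subject']:
        for key, value in rdn:
            if key == name:
                return value
    return None


def is_expired(cert, now):
    """Compare `now` (a naive UTC datetime) against the notAfter field."""
    not_after = datetime.strptime(cert['notAfter'], '%b %d %H:%M:%S %Y GMT')
    return now > not_after


print(subject_field(cert, 'commonName'))
print(is_expired(cert, datetime(2007, 9, 10)))
```

This is enough for simple authorization ("who is on the other side?") and
for noticing expiry on a long-lived connection.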

> I'd really like to help make the stdlib SSL module to 
> be a really good, full-featured OpenSSL implementation for Python so we 
> can have it elsewhere.

Well, remember, it's just a socket-layer wrapper for TLS, it's not an
"OpenSSL implementation", by which I suppose you mean a full wrapper
for OpenSSL, much like PyOpenSSL is supposed to be.  For that purpose,
doesn't it make more sense to extend/fix PyOpenSSL, rather than try
to grow the deliberately limited-purpose socket.ssl support into another
version of that?  Can't it be revived, if it is in fact moribund?

>     * Would it be possible to distribute as a separate library?  (I think 
> I remember Bill saying something about that already...)

Just to be clear, what you seem to want to work on and what I'm
working on seem to be two different things.  I plan to build a
back-port of the improved socket.ssl support as a standalone package
for 2.3 (because I need to use it on OS X 10.4).

> I'd say "why wouldn't you want to require either of those packages?" but 
> actually, I know why you wouldn't want to, and it's that they're bad. 

It's that they are too big and complicated to easily see how to fix.  But
that seems to be a side-effect of trying to wrap all of OpenSSL, which is
a big, evolving project.

> Wouldn't it be nice to know *why* the cert didn't validate?  To provide 

Yes, so I've put in a bit of work making sure the OpenSSL errors are
properly relayed back to the Python application.

> The entire idea of "extensions" is pretty direct about the fact that the 
> original implementor need not understand their profitable use :).

Not really.  Each extension is proposed, debated, and approved before
it's added to the spec for extensions.  My idea is that as support for
various extensions appears in OpenSSL, we can evaluate them and see if
they are worth supporting in Python.

> Specifically, properties of the 
> issuer define what properties the subject may have, in the verification 
> scheme for Vertex ( http://divmod.org/trac/wiki/DivmodVertex )

I didn't see a write-up of your scheme at that URL; can you point me
to a particular page in the Wiki which describes the use case?

I should point out that we're (actually, Greg Smith) also wrapping
another chunk of the OpenSSL library for hashing.  And last week I
suggested that we might wrap yet another chunk for doing cryptography.
This chunk-by-chunk approach might be a good way to go.  If a chunk
that did general X509 certificate munging did appear, I'd be happy
to change the SSL support to use it.

Bill


From janssen at parc.com  Mon Sep 10 19:30:54 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Sep 2007 10:30:54 PDT
Subject: [Python-Dev] which SSL client protocols work with which server
	protocols?
In-Reply-To: <07Sep8.095142pdt."57996"@synergy1.parc.xerox.com> 
References: <07Sep8.095142pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep10.103100pdt."57996"@synergy1.parc.xerox.com>

> I've now built a framework in test_ssl to test all client protocols
> (SSL2, SSL3, SSL23, TLS1) against all server protocols, and here's
> what I've come up with.  Servers are along the X axis, and clients are
> on the Y axis.  "Yes" means that that client protocol can talk to that
> server protocol.
> 
>         SSL2    SSL3    SSL23   TLS1
> SSL2    yes     no      no      no
> SSL3    yes     yes     yes     no
> SSL23   no      no      yes     no
> TLS1    no      no      yes     yes
> 
> I'm a bit surprised by the facts that (1) an SSL2 client can't connect
> to an SSL23 server, and (2) an SSL23 client can *only* connect to an
> SSL23 server.  Can anyone verify that these combos (the results of
> testing with the Python framework) are indeed to be expected?

Sure enough, in testing on my FC7 platform, which has a more modern
version of OpenSSL (0.9.8e instead of the older 0.9.7l platform I was
using), an SSL2 client *can* connect to an SSL23 server.  And I got
one of the above entries wrong: an SSL23 client can connect to an SSL2
server.

I guess in the test harness, I'll just note the discrepancy, but not
fail the test either way.  And I'll add a note to the documentation.
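
A minimal sketch of that bookkeeping, not the actual test_ssl code: the table is the one quoted above with the SSL23-client/SSL2-server entry corrected, and the names `BASELINE`, `VERSION_DEPENDENT` and `classify` are made up for illustration.

```python
# Rows are client protocols, columns are server protocols (in PROTOCOLS order).
PROTOCOLS = ("SSL2", "SSL3", "SSL23", "TLS1")

BASELINE = {
    "SSL2":  (True,  False, False, False),
    "SSL3":  (True,  True,  True,  False),
    "SSL23": (True,  False, True,  False),  # SSL2 entry corrected to "yes"
    "TLS1":  (False, False, True,  True),
}

# Pairs whose outcome was seen to differ between OpenSSL 0.9.7l and 0.9.8e.
VERSION_DEPENDENT = {("SSL2", "SSL23")}


def expected(client, server):
    """Return True if the baseline table says this pair should connect."""
    return BASELINE[client][PROTOCOLS.index(server)]


def classify(client, server, connected):
    """Return 'ok', 'note' (known discrepancy) or 'fail' for one result."""
    if connected == expected(client, server):
        return "ok"
    if (client, server) in VERSION_DEPENDENT:
        return "note"
    return "fail"
```

With this, an SSL2 client reaching an SSL23 server is reported as a "note" rather than a failure, while any other deviation from the table still fails.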

Bill

From pfdubois at gmail.com  Mon Sep 10 19:30:47 2007
From: pfdubois at gmail.com (Paul Dubois)
Date: Mon, 10 Sep 2007 10:30:47 -0700
Subject: [Python-Dev] Fwd: Summary of Tracker Issues
In-Reply-To: <f74a6c2f0709092004h7bd15ac2vb345b3c32b8f3ea1@mail.gmail.com>
References: <20070910030148.856EA78098@psf.upfronthosting.co.za>
	<f74a6c2f0709092004h7bd15ac2vb345b3c32b8f3ea1@mail.gmail.com>
Message-ID: <f74a6c2f0709101030n4cb84838w27ea76f772f72a3c@mail.gmail.com>

Something seems to have gone wrong with the automation of the weekly
reports. They were working, I think, until we went live. While I work on
finding out what the trouble is, here is a report covering the period since
we went live.

-- Paul Dubois

ACTIVITY SUMMARY (08/23/07 - 09/10/07)
Tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, simply click on the
issue ID. Do *not* respond to this message.

 1274 open (+64) / 11347 closed (+76) / 12621 total (+140)

 Average duration of open issues: 673 days.

 Median duration of open issues: 623 days.

 Open Issues Breakdown
    Status   Number  Change
    open       1271     +64
    pending       3      +0

Issues Created Or Reopened (140)

Title
    URL  Status  Date  Action  By

Patch to rename *Server modules to lower-case
    <http://bugs.python.org/issue1000>  08/23/07  created  jasonpjason
2to3 crashes on input files with no trailing newlines
    <http://bugs.python.org/issue1001>  CLOSED  08/23/07  created  adrianholovaty
Patch to rename HTMLParser module to lower_case
    <http://bugs.python.org/issue1002>  08/23/07  created  paulsmith
zipfile password fails validation
    <http://bugs.python.org/issue1003>  08/23/07  created  djw
MultiMethods with type annotations in 3000
    <http://bugs.python.org/issue1004>  08/23/07  created  jasonpjason
Patches to rename Queue module to queue
    <http://bugs.python.org/issue1005>  08/23/07  created  paulsmith
Refactor test_winreg.py to use unittest.
    <http://bugs.python.org/issue1006>  CLOSED  08/24/07  created  varmaa
[py3k] Fix dumbdbm, which fixes test_shelve (for me); instrument other t
    <http://bugs.python.org/issue1007>  CLOSED  08/24/07  created  larryhastings
Refactor test_signal.py to use unittest.
    <http://bugs.python.org/issue1008>  CLOSED  08/24/07  created  varmaa
Implementation of PEP 3101, Advanced String Formatting
    <http://bugs.python.org/issue1009>  CLOSED  08/24/07  created  eric.smith
Broken bug tracker url
    <http://bugs.python.org/issue1010>  CLOSED  08/24/07  created  nirs
Wrong documentation for rfc822.Message.getheader
    <http://bugs.python.org/issue1011>  CLOSED  08/24/07  created  nirs
Broken URL at Doc/install/index.rst
    <http://bugs.python.org/issue1012>  CLOSED  08/24/07  created  orsenthil
eval error
    <http://bugs.python.org/issue1013>  CLOSED  08/24/07  created  Rayfward
cgi: parse_qs and parse_qsl misbehave on empty strings
    <http://bugs.python.org/issue1014>  08/24/07  created  dljessup
[PATCH] Updated patch for rich dict view (dict().keys()) comparisons
    <http://bugs.python.org/issue1015>  CLOSED  08/24/07  created  keir
[PATCH] Updated fix for string to unicode fixes in time and datetime
    <http://bugs.python.org/issue1016>  CLOSED  08/24/07  created  ero.carrera
[PATCH] Add set operations (and, or, xor, subtract) to dict views
    <http://bugs.python.org/issue1017>  CLOSED  08/24/07  created  keir
server-side ssl support
    <http://bugs.python.org/issue1018>  CLOSED  08/25/07  created  gvanrossum
Cleanup pass on _curses and _curses_panel
    <http://bugs.python.org/issue1019>  08/25/07  created  larryhastings
pydoc doesn't work on pyexpat
    <http://bugs.python.org/issue1020>  08/25/07  created  nnorwitz
logging.basicConfig does not allow to set NOTSET level
    <http://bugs.python.org/issue1021>  08/25/07  created  viper
use bytes for code objects
    <http://bugs.python.org/issue1022>  08/25/07  created  nnorwitz
[PATCH] Unicode fixes in floatobject and moduleobject
    <http://bugs.python.org/issue1023>  CLOSED  08/25/07  created  ero.carrera
documentation for new SSL module
    <http://bugs.python.org/issue1024>  CLOSED  08/26/07  created  janssen
tracebacks from list comps (probably other comps) don't show full stack
    <http://bugs.python.org/issue1025>  08/26/07  created  nnorwitz
Backport ABC to 2.6
    <http://bugs.python.org/issue1026>  08/26/07  created  baranguren
uudecoding (uu.py) does not supprt base64, patch attached
    <http://bugs.python.org/issue1027>  08/26/07  created  dudek
Tkinter binding involving Control-spacebar raises unicode error
    <http://bugs.python.org/issue1028>  CLOSED  08/26/07  created  kbk
py3k: io.StringIO.getvalue() returns \r\n
    <http://bugs.python.org/issue1029>  CLOSED  08/26/07  created  amaury.forgeotdarc
py3k: Adapt _winreg.c to the new buffer API
    <http://bugs.python.org/issue1030>  CLOSED  08/26/07  created  amaury.forgeotdarc
py3k: compilation with VC2005
    <http://bugs.python.org/issue1031>  CLOSED  08/26/07  created  amaury.forgeotdarc
Improve the hackish runtime_library_dirs support for gcc
    <http://bugs.python.org/issue1032>  08/26/07  created  alexandre.vassalotti
Support for newline and encoding in tempfile module
    <http://bugs.python.org/issue1033>  CLOSED  08/27/07  created  hupp
[patch] Add 2to3 support for displaying warnings as Python comments
    <http://bugs.python.org/issue1034>  08/27/07  created  adrianholovaty
bytes buffer API needs to support PyBUF_LOCKDATA
    <http://bugs.python.org/issue1035>  08/27/07  created  gregory.p.smith
py3k _bsddb.c patch to use the new buffer API
    <http://bugs.python.org/issue1036>  CLOSED  08/27/07  created  gregory.p.smith
Ill-coded identifier crashes python when coding spec is utf-8
    <http://bugs.python.org/issue1037>  CLOSED  08/27/07  created  hyeshik.chang
[py3k] pdb does not work in python 3000
    <http://bugs.python.org/issue1038>  08/27/07  created  gregory.p.smith
Asssertion in Windows debug build
    <http://bugs.python.org/issue1039>  CLOSED  08/28/07  created  theller
Unicode problem with TZ
    <http://bugs.python.org/issue1040>  CLOSED  08/28/07  created  theller
io.py problems on Windows
    <http://bugs.python.org/issue1041>  CLOSED  08/28/07  created  theller
test_glob fails with UnicodeDecodeError
    <http://bugs.python.org/issue1042>  08/28/07  created  theller
test_builtin failure on Windows
    <http://bugs.python.org/issue1043>  CLOSED  08/28/07  created  theller
tarfile insecure pathname extraction
    <http://bugs.python.org/issue1044>  CLOSED  08/28/07  created  lars.gustaebel
Performance regression in 2.5
    <http://bugs.python.org/issue1045>  08/28/07  created  inducer
HTMLCalendar.formatyearpage not behaving as documented
    <http://bugs.python.org/issue1046>  CLOSED  08/28/07  created  inefab
py3k: corrections for test_subprocess on windows
    <http://bugs.python.org/issue1047>  08/28/07  created  amaury.forgeotdarc
py3k: correction for test_float on Windows
    <http://bugs.python.org/issue1048>  CLOSED  08/28/07  created  amaury.forgeotdarc
socket.socket.getsockname() has inconsistent UNIX/Windows behavior
    <http://bugs.python.org/issue1049>  CLOSED  08/28/07  created  janssen
py3k: correction for test_marshal on Windows
    <http://bugs.python.org/issue1050>  CLOSED  08/29/07  created  amaury.forgeotdarc
certificate in Lib/test/test_ssl.py expires in February 2013
    <http://bugs.python.org/issue1051>  08/29/07  created  janssen
SSL patch for Windows buildbots problem
    <http://bugs.python.org/issue1052>  CLOSED  08/29/07  created  janssen
bogus attributes reported in asyncore doc
    <http://bugs.python.org/issue1053>  08/29/07  created  billiejoex
scriptsinstall target fails in alternate build dir
    <http://bugs.python.org/issue1054>  08/29/07  created  skip.montanaro
argument parsing in datetime_strptime
    <http://bugs.python.org/issue1055>  CLOSED  08/29/07  created  loewis
test_cmd_line starts python without -E
    <http://bugs.python.org/issue1056>  08/29/07  created  twouters
Incorrect URL with webbrowser and firefox under Gnome
    <http://bugs.python.org/issue1057>  CLOSED  08/29/07  created  bingham
Code Example for 'property' bug
    <http://bugs.python.org/issue1058>  CLOSED  08/29/07  created  KennethLove
*args and **kwargs in function definitions
    <http://bugs.python.org/issue1059>  CLOSED  08/29/07  created  lars.gustaebel
zipfile cannot handle files larger than 2GB (inside archive)
    <http://bugs.python.org/issue1060>  08/29/07  created  Kevin Ar18
ABC caches should use weak refs
    <http://bugs.python.org/issue1061>  08/30/07  created  gvanrossum
nice to have a way to tell if a socket is bound
    <http://bugs.python.org/issue1062>  08/30/07  created  janssen
Small typo in properties example
    <http://bugs.python.org/issue1063>  CLOSED  08/30/07  created  cgrohmann
Test issue
    <http://bugs.python.org/issue1064>  CLOSED  08/30/07  created  loewis
ssl.py shouldn't change class names from 2.6 to 3.x
    <http://bugs.python.org/issue1065>  08/30/07  created  janssen
Implement PEPs 3109, 3134
    <http://bugs.python.org/issue1066>  CLOSED  08/30/07  created  collinwinter
test_smtplib failures (caused by asyncore)
    <http://bugs.python.org/issue1067>  09/01/07  reopened  gvanrossum
Documentation Updates for PEP 3101 string formatting
    <http://bugs.python.org/issue1068>  CLOSED  08/31/07  created  talin
invalid file encoding results in "SyntaxError: None"
    <http://bugs.python.org/issue1069>  CLOSED  08/31/07  created  georg.brandl
unicode identifiers in error messages
    <http://bugs.python.org/issue1070>  CLOSED  08/31/07  created  georg.brandl
unicode.translate() doesn't error out on invalid translation table
    <http://bugs.python.org/issue1071>  08/31/07  created  georg.brandl
Documentaion font size too small
    <http://bugs.python.org/issue1072>  CLOSED  08/31/07  created  nirs
Mysterious failure under Windows
    <http://bugs.python.org/issue1073>  CLOSED  08/31/07  created  akineko
python3.0-config script does not run on py3k
    <http://bugs.python.org/issue1074>  CLOSED  08/31/07  created  koen
py3k: Unicode error in os.stat on Windows
    <http://bugs.python.org/issue1075>  CLOSED  08/31/07  created  amaury.forgeotdarc
py3 patch: full Unicode version for winreg module
    <http://bugs.python.org/issue1076>  CLOSED  08/31/07  created  amaury.forgeotdarc
itertools missing, causes interactive help to break
    <http://bugs.python.org/issue1077>  09/01/07  created  mattrussell
cachersrc.py using tuple unpacking args
    <http://bugs.python.org/issue1078>  CLOSED  09/01/07  created  jinok
decode_header does not follow RFC 2047
    <http://bugs.python.org/issue1079>  09/01/07  created  kael
Search broken
    <http://bugs.python.org/issue1080>  CLOSED  09/01/07  created  nirs
file.seek allows float arguments
    <http://bugs.python.org/issue1081>  09/01/07  created  georg.brandl
platform system may be Windows or Microsoft since Vista
    <http://bugs.python.org/issue1082>  09/01/07  created  p.lavarre at ieee.org
Confusing error message when dividing timedelta using /
    <http://bugs.python.org/issue1083>  09/01/07  created  skip.montanaro
''.find() gives wrong result in Python built with ICC
    <http://bugs.python.org/issue1084>  CLOSED  09/01/07  created  sanders_muc
OS X 10.5.x Build Problems
    <http://bugs.python.org/issue1085>  CLOSED  09/02/07  created  noahgift
test_email failed
    <http://bugs.python.org/issue1086>  09/02/07  created  xyb
py3k os.popen result is not iterable, patch attached
    <http://bugs.python.org/issue1087>  09/02/07  created  carsten.haese
News page broken link to 3.0a1
    <http://bugs.python.org/issue1088>  CLOSED  09/02/07  created  grahamh
ever considered adding static typing to python?
    <http://bugs.python.org/issue1089>  CLOSED  09/02/07  created  adamjw
doctools/sphinx/web/application.py does not start on windows
    <http://bugs.python.org/issue1090>  CLOSED  09/03/07  created  osuchw
py3k Mac installation errors
    <http://bugs.python.org/issue1091>  CLOSED  09/03/07  created  hdiogenes
Unexpected results in Tutorial about Unicode
    <http://bugs.python.org/issue1092>  CLOSED  09/03/07  created  Viscaynot
product function patch
    <http://bugs.python.org/issue1093>  CLOSED  09/03/07  created  ryan.freckleton
TypeError in poplib.py
    <http://bugs.python.org/issue1094>  CLOSED  09/03/07  created  serge.julien
make install failed
    <http://bugs.python.org/issue1095>  CLOSED  09/03/07  created  akineko
Deeply recursive repr segfault
    <http://bugs.python.org/issue1096>  09/04/07  created  rhamphoryncus
input() should respect sys.stdin.encoding when in interactive mode
    <http://bugs.python.org/issue1097>  CLOSED  09/04/07  created  philyoon
decode_unicode doesn't nul-terminate
    <http://bugs.python.org/issue1098>  09/04/07  created  rhamphoryncus
Mac compile fails with pydebug and framework enabled
    <http://bugs.python.org/issue1099>  09/04/07  created  hdiogenes
Can't input non-ascii characters in interactive mode
    <http://bugs.python.org/issue1100>  CLOSED  09/04/07  created  philyoon
Is there just no PRINT statement any more? Or it just doesn't work.
    <http://bugs.python.org/issue1101>  CLOSED  09/06/07  reopened  loewis
Add support for _msi.Record.GetString() and _msi.Record.GetInteger()
    <http://bugs.python.org/issue1102>  09/04/07  created  atuining
Typo in dummy_threading documentation
    <http://bugs.python.org/issue1103>  CLOSED  09/04/07  created  dthomasset
msilib.SummaryInfo.GetProperty() truncates the string by one character
    <http://bugs.python.org/issue1104>  09/04/07  created  atuining
patch for readme.txt in PCbuild8
    <http://bugs.python.org/issue1105>  CLOSED  09/05/07  created  chipped
Error in random.shuffle
    <http://bugs.python.org/issue1106>  CLOSED  09/05/07  created  Viscaynot
2to3, lambda with non-tuple argument inside parenthesis
    <http://bugs.python.org/issue1107>  CLOSED  09/05/07  created  falsetru
Problem with doctest and decorated functions
    <http://bugs.python.org/issue1108>  09/05/07  created  danilo
Warning required when calling register() on an ABCMeta subclass
    <http://bugs.python.org/issue1109>  09/05/07  created  mark
Problems with the msi installer - python-3.0a1.msi
    <http://bugs.python.org/issue1110>  09/05/07  created  vbr
Users' directories information
    <http://bugs.python.org/issue1111>  CLOSED  09/05/07  created  uzytkownik
Test debug assertion in bsddb test_1413192.py
    <http://bugs.python.org/issue1112>  CLOSED  09/05/07  created  db3l
interrupt_main() fails to interrupt raw_input()
    <http://bugs.python.org/issue1113>  CLOSED  09/05/07  created  anand
_curses issues on 64-bit big-endian (e.g, AIX)
    <http://bugs.python.org/issue1114>  09/06/07  created  lukemewburn
Minor Change For Better cross compile
    <http://bugs.python.org/issue1115>  09/06/07  created  zengbo
reference in extending doc to non-existing file
    <http://bugs.python.org/issue1116>  CLOSED  09/06/07  created  anthon
Spurious warning about missing _sha256 and _sha512 when not needed
    <http://bugs.python.org/issue1117>  CLOSED  09/06/07  created  dripton
hashlib module fails with TypeError
    <http://bugs.python.org/issue1118>  CLOSED  09/06/07  created  dripton
Search index is messed up after partial rebuilding
    <http://bugs.python.org/issue1119>  CLOSED  09/06/07  created  lars.gustaebel
"make altinstall" installs pydoc, idle, smtpd.py with broken shebang lin
    <http://bugs.python.org/issue1120>  09/06/07  created  dripton
Document inspect.getfullargspec()
    <http://bugs.python.org/issue1121>  09/06/07  created  brett.cannon
PyTuple_Size and PyTuple_GET_SIZE return type documentation incorrect
    <http://bugs.python.org/issue1122>  09/07/07  created  gaul
split(None, maxsplit) does not strip whitespace correctly
    <http://bugs.python.org/issue1123>  09/07/07  created  nirs
Webchecker not parsing css "@import url"
    <http://bugs.python.org/issue1124>  09/07/07  created  ready.eddy
bytes.split shold have same interface as str.split, or different name
    <http://bugs.python.org/issue1125>  09/07/07  created  nirs
file.fileno and file.isatty() should be implementable by any file like o
    <http://bugs.python.org/issue1126>  09/07/07  created  nirs
No tests for inspect.getfullargspec()
    <http://bugs.python.org/issue1127>  09/07/07  created  brett.cannon
msilib.Directory.make_short only handles file names with a single dot in
    <http://bugs.python.org/issue1128>  09/07/07  created  atuining
OpenSSL detection broken for Python 3.0a1
    <http://bugs.python.org/issue1129>  CLOSED  09/07/07  created  pythonmeister
Idle - Save (buffer) - closes IDLE and does not save file (Windows XP)
    <http://bugs.python.org/issue1130>  09/08/07  created  infixum
Reference Manual: "for statement" links to "break statement"
    <http://bugs.python.org/issue1131>  09/08/07  created  Martoon
compile error in poplib.py
    <http://bugs.python.org/issue1132>  CLOSED  09/08/07  created  andre
python3.0-config raises SyntaxError
    <http://bugs.python.org/issue1133>  CLOSED  09/09/07  created  complex
Parsing a simple script eats all of your memory
    <http://bugs.python.org/issue1134>  09/09/07  created  complex
xview/yview of Tix.Grid is broken
    <http://bugs.python.org/issue1135>  09/09/07  created  ocean-city
Bdb documentation
    <http://bugs.python.org/issue1136>  09/09/07  created  arklad
pyexpat patch for changing buffer_size
    <http://bugs.python.org/issue1137>  09/09/07  created  AchimGaedke
Fixer needed for __future__ imports
    <http://bugs.python.org/issue1138>  09/09/07  created  collinwinter
Make python build with gcc-4.2 on OS X 10.4.9
    <http://bugs.python.org/issue1779871>  08/23/07  created  jyasskin

Issues Now Closed (188)

Title
    URL  By  Duration

2to3 crashes on input files with no trailing newlines
    <http://bugs.python.org/issue1001>  collinwinter  14 days
Refactor test_winreg.py to use unittest.
    <http://bugs.python.org/issue1006>  loewis  10 days
[py3k] Fix dumbdbm, which fixes test_shelve (for me); instrument other t
    <http://bugs.python.org/issue1007>  loewis  10 days
Refactor test_signal.py to use unittest.
    <http://bugs.python.org/issue1008>  georg.brandl  1 days
Implementation of PEP 3101, Advanced String Formatting
    <http://bugs.python.org/issue1009>  gvanrossum  6 days
Broken bug tracker url
    <http://bugs.python.org/issue1010>  georg.brandl  0 days
Wrong documentation for rfc822.Message.getheader
    <http://bugs.python.org/issue1011>  georg.brandl  0 days
Broken URL at Doc/install/index.rst
    <http://bugs.python.org/issue1012>  loewis  9 days
eval error
    <http://bugs.python.org/issue1013>  georg.brandl  0 days
[PATCH] Updated patch for rich dict view (dict().keys()) comparisons
    <http://bugs.python.org/issue1015>  loewis  9 days
[PATCH] Updated fix for string to unicode fixes in time and datetime
    <http://bugs.python.org/issue1016>  loewis  9 days
[PATCH] Add set operations (and, or, xor, subtract) to dict views
    <http://bugs.python.org/issue1017>  loewis  9 days
server-side ssl support
    <http://bugs.python.org/issue1018>  janssen  1 days
[PATCH] Unicode fixes in floatobject and moduleobject
    <http://bugs.python.org/issue1023>  loewis  8 days
documentation for new SSL module
    <http://bugs.python.org/issue1024>  gvanrossum  2 days
Tkinter binding involving Control-spacebar raises unicode error
    <http://bugs.python.org/issue1028>  kbk  0 days
py3k: io.StringIO.getvalue() returns \r\n
    <http://bugs.python.org/issue1029>  loewis  7 days
py3k: Adapt _winreg.c to the new buffer API
    <http://bugs.python.org/issue1030>  loewis  7 days
py3k: compilation with VC2005
    <http://bugs.python.org/issue1031>  nnorwitz  0 days
Support for newline and encoding in tempfile module
    <http://bugs.python.org/issue1033>  gvanrossum  1 days
py3k _bsddb.c patch to use the new buffer API
    <http://bugs.python.org/issue1036>  gregory.p.smith  1 days
Ill-coded identifier crashes python when coding spec is utf-8
    <http://bugs.python.org/issue1037>  gvanrossum  2 days
Asssertion in Windows debug build
    <http://bugs.python.org/issue1039>  theller  2 days
Unicode problem with TZ
    <http://bugs.python.org/issue1040>  loewis  2 days
io.py problems on Windows
    <http://bugs.python.org/issue1041>  gvanrossum  2 days
test_builtin failure on Windows
    <http://bugs.python.org/issue1043>  georg.brandl  6 days
tarfile insecure pathname extraction
    <http://bugs.python.org/issue1044>  lars.gustaebel  2 days
HTMLCalendar.formatyearpage not behaving as documented
    <http://bugs.python.org/issue1046>  doerwalter  0 days
py3k: correction for test_float on Windows
    <http://bugs.python.org/issue1048>  gvanrossum  1 days
socket.socket.getsockname() has inconsistent UNIX/Windows behavior
    <http://bugs.python.org/issue1049>  loewis  1 days
py3k: correction for test_marshal on Windows
    <http://bugs.python.org/issue1050>  gvanrossum  1 days
SSL patch for Windows buildbots problem
    <http://bugs.python.org/issue1052>  janssen  1 days
argument parsing in datetime_strptime
    <http://bugs.python.org/issue1055>  gvanrossum  1 days
Incorrect URL with webbrowser and firefox under Gnome
    <http://bugs.python.org/issue1057>  orsenthil  1 days
Code Example for 'property' bug
    <http://bugs.python.org/issue1058>  georg.brandl  0 days
*args and **kwargs in function definitions
    <http://bugs.python.org/issue1059>  georg.brandl  1 days
Small typo in properties example
    <http://bugs.python.org/issue1063>  georg.brandl  0 days
Test issue
    <http://bugs.python.org/issue1064>  georg.brandl  0 days
Implement PEPs 3109, 3134
    <http://bugs.python.org/issue1066>  collinwinter  0 days
Documentation Updates for PEP 3101 string formatting
    <http://bugs.python.org/issue1068>  georg.brandl  0 days
invalid file encoding results in "SyntaxError: None"
    <http://bugs.python.org/issue1069>  loewis  0 days
unicode identifiers in error messages
    <http://bugs.python.org/issue1070>  loewis  0 days
Documentaion font size too small
    <http://bugs.python.org/issue1072>  georg.brandl  1 days
Mysterious failure under Windows
    <http://bugs.python.org/issue1073>  loewis  0 days
python3.0-config script does not run on py3k
    <http://bugs.python.org/issue1074>  georg.brandl  0 days
py3k: Unicode error in os.stat on Windows
    <http://bugs.python.org/issue1075>  loewis  2 days
py3 patch: full Unicode version for winreg module
    <http://bugs.python.org/issue1076>  loewis  2 days
cachersrc.py using tuple unpacking args
    <http://bugs.python.org/issue1078>  georg.brandl  2 days
Search broken
    <http://bugs.python.org/issue1080>  georg.brandl  0 days
''.find() gives wrong result in Python built with ICC
    <http://bugs.python.org/issue1084>  sanders_muc  0 days
OS X 10.5.x Build Problems
    <http://bugs.python.org/issue1085>  loewis  0 days
News page broken link to 3.0a1
    <http://bugs.python.org/issue1088>  loewis  0 days
ever considered adding static typing to python?
    <http://bugs.python.org/issue1089>  loewis  0 days
doctools/sphinx/web/application.py does not start on windows
    <http://bugs.python.org/issue1090>  georg.brandl  0 days
py3k Mac installation errors
    <http://bugs.python.org/issue1091>  hdiogenes  0 days
Unexpected results in Tutorial about Unicode
    <http://bugs.python.org/issue1092>  georg.brandl  2 days
product function patch
    <http://bugs.python.org/issue1093>  gvanrossum  0 days
TypeError in poplib.py
    <http://bugs.python.org/issue1094>  gvanrossum  7 days
make install failed
    <http://bugs.python.org/issue1095>  georg.brandl  4 days
input() should respect sys.stdin.encoding when in interactive mode
    <http://bugs.python.org/issue1097>  loewis  0 days
Can't input non-ascii characters in interactive mode
    <http://bugs.python.org/issue1100>  philyoon  0 days
Is there just no PRINT statement any more? Or it just doesn't work.
    <http://bugs.python.org/issue1101>  gvanrossum  0 days
Typo in dummy_threading documentation
    <http://bugs.python.org/issue1103>  dthomasset  0 days
patch for readme.txt in PCbuild8
    <http://bugs.python.org/issue1105>  loewis  0 days
Error in random.shuffle
    <http://bugs.python.org/issue1106>  georg.brandl  0 days
2to3, lambda with non-tuple argument inside parenthesis
    <http://bugs.python.org/issue1107>  collinwinter  4 days
Users' directories information
    <http://bugs.python.org/issue1111>  uzytkownik  1 days
Test debug assertion in bsddb test_1413192.py
    <http://bugs.python.org/issue1112>  gregory.p.smith  1 days
interrupt_main() fails to interrupt raw_input()
    <http://bugs.python.org/issue1113>  anand  2 days
reference in extending doc to non-existing file
    <http://bugs.python.org/issue1116>  georg.brandl  0 days
Spurious warning about missing _sha256 and _sha512 when not needed
    <http://bugs.python.org/issue1117>  skip.montanaro  0 days
hashlib module fails with TypeError
    <http://bugs.python.org/issue1118>  georg.brandl  0 days
Search index is messed up after partial rebuilding
    <http://bugs.python.org/issue1119>  georg.brandl  0 days
OpenSSL detection broken for Python 3.0a1
    <http://bugs.python.org/issue1129>  georg.brandl  0 days
compile error in poplib.py
    <http://bugs.python.org/issue1132>  georg.brandl  1 days
python3.0-config raises SyntaxError
    <http://bugs.python.org/issue1133>  loewis  0 days
Need user-centered info for Windows users.
    <http://bugs.python.org/issue223599>  georg.brandl  2460 days
Codec naming scheme and aliasing support
    <http://bugs.python.org/issue225476>  lemburg  2445 days
exception item from mapped function
    <http://bugs.python.org/issue447143>  georg.brandl  2212 days
include SQL interface module
    <http://bugs.python.org/issue457493>  georg.brandl  2183 days
Add "eu#" parser marker
    <http://bugs.python.org/issue514532>  georg.brandl  2023 days
Missing docs for module imputil
    <http://bugs.python.org/issue515751>  jafo  2024 days
pydoc should respect __all__
    <http://bugs.python.org/issue527668>  skip.montanaro  1999 days
bsddb185 module needs iterators
    <http://bugs.python.org/issue533281>  gregory.p.smith  1981 days
cStringIO should provide a binary option
    <http://bugs.python.org/issue547537>  georg.brandl  1948 days
Docs in DocBook format
    <http://bugs.python.org/issue574465>  georg.brandl  1884 days
pygettext should be installed
    <http://bugs.python.org/issue642309>  skip.montanaro  1740 days
email: minimal header encoding
    <http://bugs.python.org/issue658407>  skip.montanaro  1708 days
raw_input defers alarm signal
    <http://bugs.python.org/issue685846>  georg.brandl  1666 days
Provide "plucker" format docs.
    <http://bugs.python.org/issue698900>  georg.brandl  1631 days
Port tests to unittest (Part 2)
    <http://bugs.python.org/issue736962>  georg.brandl  1564 days
Compile error messages and PEP-263
    <http://bugs.python.org/issue780725>  georg.brandl  1485 days
patch for build with read-only $srcdir
    <http://bugs.python.org/issue786737>  loewis  1486 days
robotparser interactively prompts for username and password
    <http://bugs.python.org/issue813986>  skip.montanaro  1430 days
Modules/Setup needs a suppress flag?
    <http://bugs.python.org/issue857888>  skip.montanaro  1358 days
configure links unnecessary library libdl
    <http://bugs.python.org/issue877124>  loewis  1322 days
"ez" format code for ParseTuple()
    <http://bugs.python.org/issue880951>  lemburg  1311 days
configure warning / sys/un.h: present but cannot be compiled
    <http://bugs.python.org/issue881765>  georg.brandl  1310 days
quopri encoding & Unicode
    <http://bugs.python.org/issue883466>  georg.brandl  1318 days
2.3.3 str & list still use __getslice__ / __setslice__
    <http://bugs.python.org/issue887042>  georg.brandl  1302 days
making the version of SSL configurable when creating sockets
    <http://bugs.python.org/issue889813>  janssen  1303 days
work around to compile \r\n file
    <http://bugs.python.org/issue924771>  georg.brandl  1244 days
SSL-ed sockets don't close correct?
    <http://bugs.python.org/issue978833>  loewis  1168 days
Adding missing ISO 8859 codecs, especially Thai
    <http://bugs.python.org/issue1001895>  lemburg  1116 days
socket.ssl should explain that it is a 2/3 connection
    <http://bugs.python.org/issue1027394>  janssen  1080 days
Use correct encoding for printing SyntaxErrors
    <http://bugs.python.org/issue1031213>  gvanrossum  1079 days
Irregular behavior of datetime.__str__()
    <http://bugs.python.org/issue1074462>  skip.montanaro  1008 days
Add 'update FAQ' to release checklist
    <http://bugs.python.org/issue1095328>  georg.brandl  962 days
Add SSL certificate validation
    <http://bugs.python.org/issue1114345>  janssen  944 days
enable time + timedelta
    <http://bugs.python.org/issue1118748>  skip.montanaro  935 days
Python Programming FAQ should be updated for Python 2.4
    <http://bugs.python.org/issue1119439>  georg.brandl  925 days
zipfile UnicodeDecodeError
    <http://bugs.python.org/issue1170311>  georg.brandl  888 days
Python and Turkish Locale
    <http://bugs.python.org/issue1193061>  georg.brandl  852 days
a bunch of infinite C recursions
    <http://bugs.python.org/issue1202533>  brett.cannon  844 days
add single html files
    <http://bugs.python.org/issue1209562>  georg.brandl  833 days
crash recursive __getattr__
    <http://bugs.python.org/issue1267884>  brett.cannon  744 days
tarfile: adding filed that use direct device addressing
    <http://bugs.python.org/issue1276378>  lars.gustaebel  724 days
python.sty correction - verbatim environment
    <http://bugs.python.org/issue1293788>  georg.brandl  705 days
python.sty: \py at sigline correction
    <http://bugs.python.org/issue1293790>  georg.brandl  705 days
Use 'seealso' to add examples to LibRef
    <http://bugs.python.org/issue1376361>  loewis  635 days
add more readline support
    <http://bugs.python.org/issue1388440>  loewis  621 days
2.3.5 source RPM install fails w/o tk-devel
    <http://bugs.python.org/issue1403221>  jafo  594 days
Inconsistency in Programming FAQ
    <http://bugs.python.org/issue1421839>  georg.brandl  568 days
Add .format() method to str and unicode
    <http://bugs.python.org/issue1463370>  loewis  514 days
datetime.time and datetime.timedelta
    <http://bugs.python.org/issue1487389>  skip.montanaro  478 days
winerror module
    <http://bugs.python.org/issue1505257>  loewis  450 days
Turkish Character
    <http://bugs.python.org/issue1528802>  georg.brandl  400 days
sqlite3 documentation on rowcount is contradictory
    <http://bugs.python.org/issue1573854>  georg.brandl  318 days
doctest simple usage recipe is misleading
    <http://bugs.python.org/issue1594966>  georg.brandl  285 days
specialcase simple sliceobj in tuple/str/unicode
    <http://bugs.python.org/issue1617682>  twouters  254 days
specialcase simple sliceobj in list (and bugfixes)
    <http://bugs.python.org/issue1617687>  twouters  254 days
Extended slicing for UserString
    <http://bugs.python.org/issue1617691>  twouters  253 days
Extended slicing for array objects
    <http://bugs.python.org/issue1617698>  twouters  253 days
slice-object support for ctypes Pointer/Array
    <http://bugs.python.org/issue1617699>  twouters  256 days
slice-object support for mmap
    <http://bugs.python.org/issue1617700>  twouters  253 days
extended slicing for structseq
    <http://bugs.python.org/issue1617701>  twouters  253 days
extended slicing for buffer objects
    <http://bugs.python.org/issue1617702>  twouters  253 days
sys.intern() 2to3 fixer
    <http://bugs.python.org/issue1619049>  collinwinter  261 days
webbrowser.open_new() suggestion
    <http://bugs.python.org/issue1624674>  georg.brandl  237 days
re module documentation on search/match is unclear
    <http://bugs.python.org/issue1625381>  georg.brandl  235 days
posixmodule.c leaks crypto context on Windows
    <http://bugs.python.org/issue1626801>  loewis  244 days
doc misleading in re.compile
    <http://bugs.python.org/issue1630515>  georg.brandl  227 days
Refactor test_class to use unittest lib
    <http://bugs.python.org/issue1671298>  georg.brandl  177 days
Add tests for pipes module (test_pipes)
    <http://bugs.python.org/issue1680959>  georg.brandl  169 days
xreload.py won't update class docstrings
    <http://bugs.python.org/issue1683288>  gvanrossum  164 days
Explain __method__ lookup semantics for new-style classes
    <http://bugs.python.org/issue1684991>  georg.brandl  168 days
descrintro: error describing __new__ behavior
    <http://bugs.python.org/issue1686597>  georg.brandl  154 days
os.path.join.__doc__ should mention absolute paths
    <http://bugs.python.org/issue1688564>  georg.brandl  150 days
Python 2.5 installer ended prematurely
    <http://bugs.python.org/issue1694155>  loewis  151 days
Bad documentation for existing imp methods
    <http://bugs.python.org/issue1694833>  georg.brandl  141 days
__getslice__ still used in built-in types
    <http://bugs.python.org/issue1697820>  georg.brandl  135 days
pickle example contains errors
    <http://bugs.python.org/issue1699759>  georg.brandl  133 days
Refactor test_frozen.py to use
unittest.<http://bugs.python.org/issue1703379>
georg.brandl  128 days   socket.error exceptions not subclass of
StandardError <http://bugs.python.org/issue1706815>  gregory.p.smith  138
days   generation errors in PDF-A4 tags for
lib.pdf<http://bugs.python.org/issue1707497>
georg.brandl  120 days   imp.find_module doc
ambiguity<http://bugs.python.org/issue1708326>
georg.brandl  134 days   run test_1565150(test_os.py) only on
NTFS<http://bugs.python.org/issue1709599> loewis 123 days   IDLE
hangs in popup method completion <http://bugs.python.org/issue1721890>
 kbk 96 days   bsddb.btopen
. del of record doesn't update index <http://bugs.python.org/issue1725856>
gregory.p.smith  90 days   -q (quiet) option for python
interpreter<http://bugs.python.org/issue1728488>
georg.brandl  86 days   _lsprof.c:ptrace_enter_call assumes PyErr_* is
clean<http://bugs.python.org/issue1733973> arigo 89 days
struct.Struct.size
is not documented <http://bugs.python.org/issue1734111>  georg.brandl  75
days   Add/Remove programs shows Martin v
L??wis<http://bugs.python.org/issue1737210> loewis 79 days   Add
reduce to functools in 2.6 <http://bugs.python.org/issue1739906>  gvanrossum
 69 days   Incorrect docs for optparse OptionParser
add_help_option<http://bugs.python.org/issue1742164>
georg.brandl  61 days   Pickling of exceptions
broken<http://bugs.python.org/issue1742889>
georg.brandl  59 days   Examples dropped from PDF version of SQLite
docs<http://bugs.python.org/issue1743846>
georg.brandl  58 days   Improve exception pickling
support<http://bugs.python.org/issue1744398>
georg.brandl  57 days   AMD64 installer does not place python25.dll in
system dir <http://bugs.python.org/issue1746880>  loewis  62 days
expanduser("~")
on Windows looks for HOME first <http://bugs.python.org/issue1749583>
georg.brandl  62 days   fixing 2.5.1 build with unicode and dynamic loading
disabled <http://bugs.python.org/issue1752175>  georg.brandl  43 days
 getaddrinfo
no longer used in httplib <http://bugs.python.org/issue1752332>
georg.brandl  43 days   chown() does not handle UID >
INT_MAX<http://bugs.python.org/issue1752703>
georg.brandl  42 days   struni: assertion in Windows debug
build<http://bugs.python.org/issue1753395>
georg.brandl  48 days   reference count discrepancy, PyErr_Print vs.
PyErr_Clear <http://bugs.python.org/issue1756389>  georg.brandl  36
days   utilize
2.5 try/except/finally in contextlib <http://bugs.python.org/issue1757118>
georg.brandl  35 days   Documentation of descriptors needs more
detail<http://bugs.python.org/issue1758696>
georg.brandl  32 days   The -m switch does not use the builtin __main__
module <http://bugs.python.org/issue1764407>  ncoghlan  25 days   setup.py
trashes LDFLAGS <http://bugs.python.org/issue1765375>  georg.brandl  23 days
  poll() returns "status code", not "return
code"<http://bugs.python.org/issue1766421>
georg.brandl  21 days   os.chmod failure<http://bugs.python.org/issue1767242>
georg.brandl  35 days   Byte code WITH_CLEANUP missing, MAKE_CLOSURE
wrong<http://bugs.python.org/issue1768121>
georg.brandl  18 days   Misc improvements for the io
module<http://bugs.python.org/issue1771364> gvanrossum 18 days   bsddb
can't use unicode keys <http://bugs.python.org/issue1771381>
gregory.p.smith  14 days   Unify __builtins__ ->
__builtin__<http://bugs.python.org/issue1774369> gvanrossum 15 days
tempfile.TemporaryFile
differs between platforms <http://bugs.python.org/issue1776696>
georg.brandl  6 days   Segfault closing a file from concurrent
threads<http://bugs.python.org/issue1778376>
georg.brandl  7 days   <a href="http://bugs.python.org/issue1779550"
target="_blank" onclic...

[Message clipped]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20070910/b058ae43/attachment-0001.htm 

From janssen at parc.com  Mon Sep 10 20:44:30 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Sep 2007 11:44:30 PDT
Subject: [Python-Dev] which SSL client protocols work with which server
	protocols?
In-Reply-To: <07Sep10.103100pdt."57996"@synergy1.parc.xerox.com> 
References: <07Sep8.095142pdt."57996"@synergy1.parc.xerox.com>
	<07Sep10.103100pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep10.114436pdt."57996"@synergy1.parc.xerox.com>

Here's the updated connection table:

            SSL2    SSL3    SSL23   TLS1
    SSL2    yes     no      yes     no
    SSL3    yes     yes     yes     no
    SSL23   yes     no      yes     no
    TLS1    no      no      yes     yes

Given this, I think the client-side default should be changed from
SSLv23 to SSLv3, and the server-side default should be SSLv23.

Bill

From trentm at activestate.com  Mon Sep 10 21:06:03 2007
From: trentm at activestate.com (Trent Mick)
Date: Mon, 10 Sep 2007 12:06:03 -0700
Subject: [Python-Dev] [PEPs]  Email addresses in PEPs?
In-Reply-To: <18145.59131.323002.910688@montanaro.dyndns.org>
References: <18121.47310.218893.540750@montanaro.dyndns.org>
	<ca471dc20708201120o23d527d4lc62e01bc4fc13585@mail.gmail.com>
	<bbaeab100708201216k213112e0u10b47013ec75837a@mail.gmail.com>
	<4335d2c40708201232s19b10c69ye44d39351a4da97d@mail.gmail.com>
	<46E1D2C3.5030705@activestate.com>
	<18145.59131.323002.910688@montanaro.dyndns.org>
Message-ID: <46E5959B.5060200@activestate.com>

skip at pobox.com wrote:
>     Trent> If some would find it useful, here is a snippet of code that
>     Trent> obfuscates email addresses for HTML as done by Markdown (a
>     Trent> text-to-html markup translator). It randomly encodes each
>     Trent> character as a hex or decimal HTML entity (roughly 10% raw, 45%
>     Trent> hex, 45% dec).
> 
> Aren't most spammers' scrapers going to be intelligent enough by now
> (several years since they first arrived on the scene) to "see through" these
> sorts of common obfuscations?

Perhaps, yes. No way of really knowing. <shrug/>
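
For readers who have not seen the technique, the obfuscation described in
the quoted text can be sketched roughly as follows. This is not Trent's
original snippet (it is not reproduced in this message) and not Markdown's
actual code; the name obfuscate_email and the exact thresholds are
assumptions chosen to match the "roughly 10% raw, 45% hex, 45% dec"
description.

```python
import html
import random

def obfuscate_email(address, rng=None):
    """Encode each character as raw text, a hex entity, or a decimal entity.

    Hypothetical helper: the 10/45/45 split mirrors the description in
    the thread, not any particular library's implementation.
    """
    if rng is None:
        rng = random.Random()
    pieces = []
    for ch in address:
        roll = rng.random()
        if roll < 0.10:
            pieces.append(html.escape(ch))      # ~10%: left readable
        elif roll < 0.55:
            pieces.append("&#x%X;" % ord(ch))   # ~45%: hex entity
        else:
            pieces.append("&#%d;" % ord(ch))    # ~45%: decimal entity
    return "".join(pieces)

encoded = obfuscate_email("someone@example.org", random.Random(0))
# Any entity decoder (a browser, or html.unescape) recovers the address.
decoded = html.unescape(encoded)
```

Since a single call to an entity decoder undoes the encoding, a scraper
that bothers to decode entities would presumably not be slowed down, which
is the concern raised in the quoted question.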

Trent

-- 
Trent Mick
trentm at activestate.com

From facundobatista at gmail.com  Mon Sep 10 22:25:11 2007
From: facundobatista at gmail.com (Facundo Batista)
Date: Mon, 10 Sep 2007 17:25:11 -0300
Subject: [Python-Dev] Python tickets summary
Message-ID: <e04bdf310709101325o1315adb0pebcb2f33d8313f35@mail.gmail.com>

People:

I modified my tool, which makes a summary of all the Python tickets
(I moved the source the info is taken from, from SF to our Roundup).

As a result, the summary is now, again, updated daily:

  http://www.taniquetil.com.ar/facundo/py_tickets.html


Enjoy it.

Regards,

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From janssen at parc.com  Tue Sep 11 02:20:14 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Sep 2007 17:20:14 PDT
Subject: [Python-Dev] Alpha/Tru64 buildbot and SSL compile
Message-ID: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com>

The Alpha/Tru64 buildbot seems to be having difficulty compiling
the _ssl.c file.  Looks like missing header files.  Anyone know what
the configuration of OpenSSL on that machine is like?

Bill

From janssen at parc.com  Tue Sep 11 02:59:00 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Sep 2007 17:59:00 PDT
Subject: [Python-Dev] Solaris 10 buildbot test_ssl failures
In-Reply-To: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com> 
References: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep10.175910pdt."57996"@synergy1.parc.xerox.com>

The Solaris 10 buildbot is complaining about test_ssl, and I think
it's because some of the functions in it use constants from the ssl
module at the top level, i.e.,

    def tryProtocolCombo (server_protocol,
                          client_protocol,
                          expectedToWork,
                          certsreqs=ssl.CERT_NONE):

Is this verboten?

Bill


From janssen at parc.com  Tue Sep 11 03:41:02 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Sep 2007 18:41:02 PDT
Subject: [Python-Dev] Design and direction of the SSL module (was Re:
	frozenset C API?)
In-Reply-To: <20070910055852.5579.246755534.divmod.xquotient.735@joule.divmod.com>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<07Sep6.101542pdt."57996"@synergy1.parc.xerox.com>
	<20070906174518.21185.1342025567.divmod.xquotient.7082@joule.divmod.com>
	<07Sep6.111518pdt."57996"@synergy1.parc.xerox.com>
	<20070910055852.5579.246755534.divmod.xquotient.735@joule.divmod.com>
Message-ID: <07Sep10.184105pdt."57996"@synergy1.parc.xerox.com>

By the way, if you're offering to help with this, there are a couple
of things I could use some help with.  I scratched my head a bit about
how to turn the "othername" possibility of a subjectAltName into a
Python data structure, using the OpenSSL C code, and finally gave up.
If you could provide a C function that does that, I'd be very grateful.

And there's a similar issue with the "permanent identifier" defined in
RFC 4043.  I don't see how to iterate over an ASN1 sequence using the
OpenSSL C code -- if you can figure out how to do that and provide a C
function to translate that field in a certificate into a Python data
structure, it would also be a great help.

Bill

From janssen at parc.com  Tue Sep 11 04:01:42 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Sep 2007 19:01:42 PDT
Subject: [Python-Dev] Solaris 10 buildbot test_ssl failures
In-Reply-To: <07Sep10.175910pdt."57996"@synergy1.parc.xerox.com> 
References: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com>
	<07Sep10.175910pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep10.190146pdt."57996"@synergy1.parc.xerox.com>

> The Solaris 10 buildbot is complaining about test_ssl, and I think
> it's because some of the functions in it use constants from the ssl
> module at the top level, i.e.,
> 
>     def tryProtocolCombo (server_protocol,
>                           client_protocol,
>                           expectedToWork,
>                           certsreqs=ssl.CERT_NONE):
> 
> Is this verboten?

Of course it is.  Yep, that was it.  Solaris 10 is green.
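
The underlying Python behaviour can be sketched as follows: a default
argument expression is evaluated once, when the "def" statement runs, so a
default that reaches into a module which failed to import breaks at
definition time, before any test is called. In this sketch missing_ssl is a
hypothetical stand-in for an unavailable ssl module, and define_eager and
define_lazy are illustrative names; this is not the actual change made to
test_ssl.

```python
missing_ssl = None  # hypothetical stand-in for an ssl module that is unavailable

def define_eager():
    # The default expression runs as part of "def", so this raises
    # AttributeError immediately, even though combo() is never called.
    def combo(certsreqs=missing_ssl.CERT_NONE):
        return certsreqs
    return combo

def define_lazy():
    # Using None as a sentinel defers the attribute lookup to call time,
    # so merely defining the function is always safe.
    def combo(certsreqs=None):
        if certsreqs is None:
            certsreqs = missing_ssl.CERT_NONE
        return certsreqs
    return combo
```

With the lazy form, a test module can still be imported (and then skipped
cleanly) on a build where the ssl module is not available.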

Bill

From aahz at pythoncraft.com  Tue Sep 11 05:01:46 2007
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 10 Sep 2007 20:01:46 -0700
Subject: [Python-Dev] testing in a Python --without-threads build
In-Reply-To: <46E2FAC4.2030900@v.loewis.de>
References: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com>
	<07Sep8.121928pdt."57996"@synergy1.parc.xerox.com>
	<46E2FAC4.2030900@v.loewis.de>
Message-ID: <20070911030146.GA28351@panix.com>

On Sat, Sep 08, 2007, "Martin v. Löwis" wrote:
>
> No. IIUC, "expected skips" are a platform property. For your platform,
> support for threads is expected (whatever your platform is as long as
> it was built in this millennium).

Really?  I thought NetBSD was still iffy WRT threading.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer
http://www.lysator.liu.se/c/ten-commandments.html

From martin at v.loewis.de  Tue Sep 11 07:36:20 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 11 Sep 2007 07:36:20 +0200
Subject: [Python-Dev] Alpha/Tru64 buildbot and SSL compile
In-Reply-To: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46E62954.8010208@v.loewis.de>

> The Alpha/Tru64 buildbot seems to be having difficulty compiling
> the _ssl.c file.  Looks like missing header files.  Anyone know what
> the configuration of OpenSSL on that machine is like?

Neal Norwitz and Ralf Grosse-Kunstleve have access to that machine.

Regards,
Martin

From anthony at ekit-inc.com  Tue Sep 11 07:42:01 2007
From: anthony at ekit-inc.com (Anthony Baxter)
Date: Tue, 11 Sep 2007 15:42:01 +1000
Subject: [Python-Dev] Alpha/Tru64 buildbot and SSL compile
In-Reply-To: <46E62954.8010208@v.loewis.de>
References: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com>
	<46E62954.8010208@v.loewis.de>
Message-ID: <200709111542.05696.anthony@ekit-inc.com>

On Tuesday 11 September 2007, Martin v. Löwis wrote:
> > The Alpha/Tru64 buildbot seems to be having difficulty
> > compiling the _ssl.c file.  Looks like missing header files. 
> > Anyone know what the configuration of OpenSSL on that machine
> > is like?
>
> Neal Norwitz and Ralf Grosse-Kunstleve have access to that
> machine.

Neal's on leave all this month, I believe.



-- 
Anthony Baxter, ekit.      anthony at ekit-inc.com     (03) 9674 7015
Level 3 The Teahouse, 28 Clarendon St, Sth Melbourne Australia 3205 

From martin at v.loewis.de  Tue Sep 11 07:43:16 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 11 Sep 2007 07:43:16 +0200
Subject: [Python-Dev] testing in a Python --without-threads build
In-Reply-To: <20070911030146.GA28351@panix.com>
References: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com>	<07Sep8.121928pdt."57996"@synergy1.parc.xerox.com>	<46E2FAC4.2030900@v.loewis.de>
	<20070911030146.GA28351@panix.com>
Message-ID: <46E62AF4.4040207@v.loewis.de>

>> No. IIUC, "expected skips" are a platform property. For your platform,
>> support for threads is expected (whatever your platform is as long as
>> it was built in this millennium).
> 
> Really?  I thought NetBSD was still iffy WRT threading.

Ah, right. Still, it seems that people expect that thread support is
available on NetBSD. The list of expected skips does not mention
test_thread for 'netbsd3' (it only does so for 'sco_sv3' and 'riscos')

Regards,
Martin

From janssen at parc.com  Tue Sep 11 07:59:02 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Sep 2007 22:59:02 PDT
Subject: [Python-Dev] Alpha/Tru64 buildbot and SSL compile
In-Reply-To: <200709111542.05696.anthony@ekit-inc.com> 
References: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com>
	<46E62954.8010208@v.loewis.de>
	<200709111542.05696.anthony@ekit-inc.com>
Message-ID: <07Sep10.225906pdt."57996"@synergy1.parc.xerox.com>

> > Neal Norwitz and Ralf Grosse-Kunstleve have access to that
> > machine.
> 
> Neal's on leave all this month, I believe.

Well, I'm not sure it's urgent.  Are there lots of Alphas still
running?  And Tru64 is in end-of-life mode.

Bill

From tleeuwenburg at gmail.com  Tue Sep 11 08:41:30 2007
From: tleeuwenburg at gmail.com (Tennessee Leeuwenburg)
Date: Tue, 11 Sep 2007 16:41:30 +1000
Subject: [Python-Dev] Compiling Python 2.5 and settinf UCS2 flag
Message-ID: <43c8685c0709102341o1f9323f5hc5525460cbd8b12a@mail.gmail.com>

Hi all,

I have an unusual use case in which some software I work on compiles a
version of Python for distribution. I'm not 100% across this as it's not at
all my area of responsibility, but I have been having some issues lately.

My hand-compiled version of Python 2.5 works just fine, and in turn uses a
hand-compiled Tcl/Tk with threading disabled.

The system then re-compiles Python2.5 for its own purposes. At this point,
it appears to ignore some of the options originally set using configure for
Python.

I have enough knowledge/control over the system to pass in some additional
compiler flags. I would like to try to force some behaviour normally set as
a flag to the configure script.

Is there a C compiler flag I can use to force the use of UCS2 unicode?

Thanks,
-Tennessee

From tulloss2 at uiuc.edu  Tue Sep 11 09:27:58 2007
From: tulloss2 at uiuc.edu (Justin Tulloss)
Date: Tue, 11 Sep 2007 02:27:58 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
Message-ID: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>

Hi,

I had a whole long email about exactly what I was doing, but I think I'll
get to the point instead. I'm trying to implement a python concurrency API
and would like to use cpython to do it. To do that, I would like to remove
the GIL.

So, since I'm new to interpreter hacking, some help would be appreciated.
I've listed what I think the GIL does; if you guys could add to this list or
refine it, I would very much appreciate it.

Roles of the GIL:
1. Some global interpreter state/modules are protected (where are these
globals at?)
2. When writing C extensions I can change the state of my python object
without worrying about synchronization
3. When writing C extensions I can change my own internal C state without
worrying about synchronization (unless I have other, non-python threads
running)

Does anyone know of a place where the GIL is required when not operating on
a python object? It seems like this would never happen, and would make
replacing the GIL somewhat easier.

I've only started looking at the code recently, so please forgive my
naivety. I'm still learning how the interpreter works on a high level, let
alone all the nitty gritty details!

Thanks,
Justin

From martin at v.loewis.de  Tue Sep 11 10:33:17 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 11 Sep 2007 10:33:17 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
Message-ID: <46E652CD.1070901@v.loewis.de>

> 1. Some global interpreter state/modules are protected (where are these
> globals at?)

It's the interpreter and thread state itself (pystate.h); for the thread
state, also _PyThreadState_Current. Then there is the GC state, in
particular "generations". There are various caches and counters also.

> 2. When writing C extensions I can change the state of my python object
> without worrying about synchronization
> 3. When writing C extensions I can change my own internal C state
> without worrying about synchronization (unless I have other, non-python
> threads running)
4. The builtin container types are protected by the GIL, and various
   other builtin objects
5. Reference counting is protected by the GIL
6. PyMalloc is protected by the GIL.
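
To make item 5 concrete, here is a small CPython-specific sketch of the
per-object reference count that the GIL keeps consistent. sys.getrefcount is
a real function; the helper name refcount_delta is made up for this
illustration. Every new reference bumps the counter, and without the GIL
two threads doing so at the same time would need some other form of
synchronization.

```python
import sys

def refcount_delta():
    """Return how much an object's refcount grows when a list holds it."""
    obj = object()
    before = sys.getrefcount(obj)
    holder = [obj]              # the list now owns one more reference
    after = sys.getrefcount(obj)
    del holder                  # dropping the list releases that reference
    return after - before
```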

> Does anyone know of a place where the GIL is required when not operating
> on a python object?

See 6 above, also (obviously) 1.

> I've only started looking at the code recently, so please forgive my
> naivety. I'm still learning how the interpreter works on a high level,
> let alone all the nitty gritty details!

Good luck!

Martin

From thomas at python.org  Tue Sep 11 11:33:22 2007
From: thomas at python.org (Thomas Wouters)
Date: Tue, 11 Sep 2007 11:33:22 +0200
Subject: [Python-Dev] Compiling Python 2.5 and settinf UCS2 flag
In-Reply-To: <43c8685c0709102341o1f9323f5hc5525460cbd8b12a@mail.gmail.com>
References: <43c8685c0709102341o1f9323f5hc5525460cbd8b12a@mail.gmail.com>
Message-ID: <9e804ac0709110233s3bba472ctc5e5f57044103215@mail.gmail.com>

On 9/11/07, Tennessee Leeuwenburg <tleeuwenburg at gmail.com> wrote:
>
> Hi all,
>
> I have an unusual use case in which some software I work on compiles a
> version of Python for distribution. I'm not 100% across this as it's not at
> all my area of responsibility, but I have been having some issues lately.
>
> My hand-compiled version of Python 2.5 works just fine, and in turn uses a
> hand-compiled Tcl/Tk with threading disabled.
>
> The system then re-compiles Python2.5 for its own purposes. At this point,
> it appears to ignore some of the options originally set using configure for
> Python.
>
> I have enough knowledge/control over the system to pass in some additional
> compiler flags. I would like to try to force some behaviour normally set as
> a flag to the configure script.
>
> Is there a C compiler flag I can use to force the use of UCS2 unicode?


This isn't really a python-dev question, more a python-list one. python-dev
is for the development of Python itself, not development with Python or of
your system ;)

The choice between UCS2 and UCS4 is made by configure, based on two things:
what you pass with the --enable-unicode argument (if anything), and what the
version of Tcl you're linking against seems to use. Tcl's choice is only
used if no explicit choice is given. configure also determines the proper
type for the actual UCS2/UCS4 data -- wchar_t, unsigned short or unsigned
long. Both choices are stored in pyconfig.h as is usual for configure. You
can't override them with C compiler flags, but you can edit pyconfig.h if
you can't change the configure flags. Keep in mind that you need to change
both of those (you probably want to just diff the two pyconfig.h files to
see what else is different).
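
One way to confirm which choice a given build ended up with is to ask the
interpreter itself. A minimal sketch, assuming only the standard
sys.maxunicode attribute (the helper name unicode_width is made up here):
narrow (UCS2) builds report 0xFFFF and wide (UCS4) builds report 0x10FFFF.
On Python 3.3 and later the distinction no longer exists and the value is
always 0x10FFFF.

```python
import sys

def unicode_width():
    """Return 'UCS2' for a narrow build and 'UCS4' for a wide one."""
    return "UCS2" if sys.maxunicode == 0xFFFF else "UCS4"
```

Running this in both the hand-compiled and the re-compiled interpreter
would show whether the two builds actually disagree.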

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From matt at pollenation.net  Tue Sep 11 13:59:51 2007
From: matt at pollenation.net (Matt Goodall)
Date: Tue, 11 Sep 2007 12:59:51 +0100
Subject: [Python-Dev] which SSL client protocols work with which server
 protocols?
In-Reply-To: <07Sep10.114436pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep8.095142pdt."57996"@synergy1.parc.xerox.com>	<07Sep10.103100pdt."57996"@synergy1.parc.xerox.com>
	<07Sep10.114436pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46E68337.5050307@pollenation.net>

Bill Janssen wrote:
> Here's the updated connection table:
> 
>             SSL2    SSL3    SSL23   TLS1
>     SSL2    yes     no      yes     no
>     SSL3    yes     yes     yes     no
>     SSL23   yes     no      yes     no
>     TLS1    no      no      yes     yes
> 
> Given this, I think the client-side default should be changed from
> SSLv23 to SSLv3, and the server-side default should be SSLv23.

I believe you are correct.

I did some experiments with this a while ago after hitting problems
connecting to some SSL servers although I can't remember the exact
results now.

More importantly, what you recommend is what Twisted does and I'd
believe them more than me any time ;-).

See Twisted's DefaultOpenSSLContextFactory [1] for the server side and
ClientContextFactory [2] for the client side.


Cheers, Matt


[1] DefaultOpenSSLContextFactory,
http://twistedmatrix.com/trac/browser/trunk/twisted/internet/ssl.py#L67

[2] ClientContextFactory,
http://twistedmatrix.com/trac/browser/trunk/twisted/internet/ssl.py#L102

-- 
Matt Goodall, Pollenation Internet Ltd
Technology House, 237 Lidgett Lane, Leeds LS17 6QR
Registered No 4382123
A member of the Brunswick MCL Group of Companies
w: http://www.pollenation.net/
e: matt at pollenation.net
t: +44 113 2252500

From aahz at pythoncraft.com  Tue Sep 11 15:12:54 2007
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 11 Sep 2007 06:12:54 -0700
Subject: [Python-Dev] testing in a Python --without-threads build
In-Reply-To: <46E62AF4.4040207@v.loewis.de>
References: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com>
	<07Sep8.121928pdt."57996"@synergy1.parc.xerox.com>
	<46E2FAC4.2030900@v.loewis.de> <20070911030146.GA28351@panix.com>
	<46E62AF4.4040207@v.loewis.de>
Message-ID: <20070911131254.GA21952@panix.com>

On Tue, Sep 11, 2007, "Martin v. Löwis" wrote:
>
>>> No. IIUC, "expected skips" are a platform property. For your platform,
>>> support for threads is expected (whatever your platform is as long as
>>> it was built in this millennium).
>> 
>> Really?  I thought NetBSD was still iffy WRT threading.
> 
> Ah, right. Still, it seems that people expect that thread support is
> available on NetBSD. The list of expected skips does not mention
> test_thread for 'netbsd3' (it only does so for 'sco_sv3' and 'riscos')

I'm assuming that's because NetBSD has threads; they just don't work.  So
we don't want to put NetBSD on the list of expected skips, so that we
find out when threads do work.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer
http://www.lysator.liu.se/c/ten-commandments.html

From aahz at pythoncraft.com  Tue Sep 11 15:18:36 2007
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 11 Sep 2007 06:18:36 -0700
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
Message-ID: <20070911131836.GB21952@panix.com>

On Tue, Sep 11, 2007, Justin Tulloss wrote:
>
> I had a whole long email about exactly what I was doing, but I think
> I'll get to the point instead. I'm trying to implement a python
> concurrency API and would like to use cpython to do it. To do that, I
> would like to remove the GIL.

You should review the work Greg Stein did to remove the GIL in 1.5.2;
although the interpreter core has changed considerably since then, I
believe the basic principles of the GIL are the same.  
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer
http://www.lysator.liu.se/c/ten-commandments.html

From ncoghlan at gmail.com  Tue Sep 11 16:38:45 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Sep 2007 00:38:45 +1000
Subject: [Python-Dev] Making directories and zip files executable
Message-ID: <46E6A875.9020208@gmail.com>

The local patch I have for PEP 366 is somewhat stale, and before I bring 
it up to date with SVN head, I'd like to close out the issue raised a 
while back regarding making zip files executable [1].

The original proposal was for a new command line switch, but PJE came up 
with a patch (attached to the roundup tracker item) that uses the 
existing import machinery to avoid the need for the extra command line 
switch (by checking if the argument is a valid sys.path entry before 
checking to see if it is an executable script).

I personally like the idea (and PJE's approach), and the performance 
impact on script startup time appears to be negligible (although I 
haven't performed any high precision measurements - I'm just using the 
Linux time utility on a short test script with and without the patch).
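
As an illustration of what the feature looks like from the user's side, the
sketch below builds a small zip archive and passes it to the interpreter as
if it were a script. It assumes the behaviour of released Python versions
that include this feature, where the archive's entry point is a top-level
__main__.py; that convention is not spelled out in this message, and the
helper name run_zip_demo is made up.

```python
import os
import subprocess
import sys
import tempfile
import zipfile

def run_zip_demo():
    """Create a zip with a __main__.py and execute it with this interpreter."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "app.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            # The interpreter looks for __main__.py at the archive root.
            zf.writestr("__main__.py", "print('hello from zip')\n")
        result = subprocess.run([sys.executable, archive],
                                capture_output=True, text=True, check=True)
        return result.stdout.strip()
```

The same invocation works for a directory containing a __main__.py, since
both are valid sys.path entries.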

Are there any objections to my committing this?

Cheers,
Nick.

[1] http://bugs.python.org/issue1739468

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From thomas at python.org  Tue Sep 11 16:54:30 2007
From: thomas at python.org (Thomas Wouters)
Date: Tue, 11 Sep 2007 16:54:30 +0200
Subject: [Python-Dev] [issue1056] test_cmd_line starts python without -E
In-Reply-To: <1189519439.77.0.401459254933.issue1056@psf.upfronthosting.co.za>
References: <1189519439.77.0.401459254933.issue1056@psf.upfronthosting.co.za>
Message-ID: <9e804ac0709110754h28912b2cr53863d46d9e2bf9f@mail.gmail.com>

On 9/11/07, Nick Coghlan <report at bugs.python.org> wrote:

> (Is the head still being merged to the py3k branch? Or does this need to
> be forward-ported manually?)


No worries, the trunk is still being merged to py3k. I doubt we'll ever stop
doing that, unless the trunk becomes py3k and 2.x development is done on a
branch (in which case I imagine we'd merge between those two in some way :-)


-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From tulloss2 at uiuc.edu  Tue Sep 11 17:07:34 2007
From: tulloss2 at uiuc.edu (Justin Tulloss)
Date: Tue, 11 Sep 2007 10:07:34 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E652CD.1070901@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
Message-ID: <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>

On 9/11/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>
> > 1. Some global interpreter state/modules are protected (where are these
> > globals at?)
>
> It's the interpreter and thread state itself (pystate.h), for the thread
> state, also _PyThreadState_Current. Then there is the GC state, in
> particular "generations". There are various caches and counters also.


Caches seem like they definitely might be a problem. Would you mind
expanding on this a little? What gets cached and why?

Justin

From skip at pobox.com  Tue Sep 11 17:21:07 2007
From: skip at pobox.com (skip at pobox.com)
Date: Tue, 11 Sep 2007 10:21:07 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
Message-ID: <18150.45667.738340.378354@montanaro.dyndns.org>


    Justin> Caches seem like they definitely might be a problem. Would you
    Justin> mind expanding on this a little? What gets cached and why?

I believe the integer free list falls into this category.

Skip

From martin at v.loewis.de  Tue Sep 11 17:50:00 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 11 Sep 2007 17:50:00 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>	
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
Message-ID: <46E6B928.1090603@v.loewis.de>

>     It's the interpreter and thread state itself (pystate.h), for the thread
>     state, also _PyThreadState_Current. Then there is the GC state, in
>     particular "generations". There are various caches and counters also.
> 
> 
> Caches seem like they definitely might be a problem. Would you mind
> expanding on this a little? What gets cached and why?

Depends on the Python version what precisely gets cached. Several types
preserve a pool of preallocated objects, to speed up allocation.
Examples are intobject.c (block_list, free_list), frameobject.c
(free_list), listobject.c (free_list), methodobject.c (free_list),
floatobject.c (block_list, free_list), classobject.c (free_list).
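
To make the free-list idea concrete, here is a minimal conceptual sketch in
Python. It is not the C code from intobject.c or frameobject.c; the names
Frame and FreeList and the max_size limit are invented for illustration. It
only shows why such a pool is shared mutable state that something (today,
the GIL) has to protect.

```python
class Frame:
    """Hypothetical stand-in for an object type that is costly to allocate."""
    __slots__ = ("payload",)

    def __init__(self):
        self.payload = None


class FreeList:
    """Keep a bounded pool of released objects and hand them out again."""

    def __init__(self, factory, max_size=8):
        self._factory = factory      # called when the pool is empty
        self._free = []              # the shared pool of reusable objects
        self._max_size = max_size    # cap so the pool cannot grow forever

    def acquire(self):
        # Reuse a pooled object if one is available, else allocate.
        if self._free:
            return self._free.pop()
        return self._factory()

    def release(self, obj):
        # Return the object to the pool unless the pool is already full.
        if len(self._free) < self._max_size:
            self._free.append(obj)


pool = FreeList(Frame, max_size=2)
first = pool.acquire()
pool.release(first)
second = pool.acquire()   # comes back out of the pool
```

Every acquire() and release() mutates the same list, so without a global
lock each such pool would need its own locking or a per-thread copy.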

Plus there are tons of variables caching string objects. From
classobject.c alone: getattrstr, setattrstr, delattrstr, docstr,
modstr, namestr, initstr, delstr, reprstr, strstr, hashstr, eqstr,
cmpstr, getitemstr, setitemstr, delitemstr, lenstr, iterstr, nextstr,
getslicestr, setslicestr, delslicestr, __contains__, all arguments
to UNARY, UNARY_FB, BINARY, BINARY_INPLACE (e.g. instance_neg,
instance_or, instance_ior), then cmp_obj, nonzerostr, indexstr.
(admittedly, classobject.c is extreme here).

There are probably more classes which I just forgot.

Regards,
Martin

From status at bugs.python.org  Tue Sep 11 18:22:59 2007
From: status at bugs.python.org (Tracker)
Date: Tue, 11 Sep 2007 16:22:59 +0000 (UTC)
Subject: [Python-Dev] Summary of Tracker Issues
Message-ID: <20070911162259.50E71782C1@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (09/04/07 - 09/11/07)
Tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 1274 open (+31) / 11356 closed (+16) / 12630 total (+47)

Average duration of open issues: 673 days.
Median duration of open issues: 635 days.

Open Issues Breakdown
   open  1271 (+31)
pending     3 ( +0)

Issues Created Or Reopened (47)
_______________________________

Is there just no PRINT statement any more? Or it just doesn't wo 09/06/07
CLOSED http://bugs.python.org/issue1101    reopened loewis                   

Add support for _msi.Record.GetString() and _msi.Record.GetInteg 09/04/07
       http://bugs.python.org/issue1102    created  atuining                 

Typo in dummy_threading documentation                            09/04/07
CLOSED http://bugs.python.org/issue1103    created  dthomasset               

msilib.SummaryInfo.GetProperty() truncates the string by one cha 09/04/07
       http://bugs.python.org/issue1104    created  atuining                 

patch for readme.txt in PCbuild8                                 09/05/07
CLOSED http://bugs.python.org/issue1105    created  chipped                  

Error in random.shuffle                                          09/05/07
CLOSED http://bugs.python.org/issue1106    created  Viscaynot                

2to3,   lambda with non-tuple argument inside parenthesis        09/05/07
CLOSED http://bugs.python.org/issue1107    created  falsetru                 

Problem with doctest and decorated functions                     09/05/07
       http://bugs.python.org/issue1108    created  danilo                   

Warning required when calling register() on an ABCMeta subclass  09/05/07
       http://bugs.python.org/issue1109    created  mark                     

Problems with the msi installer - python-3.0a1.msi               09/05/07
       http://bugs.python.org/issue1110    created  vbr                      

Users' directories information                                   09/05/07
CLOSED http://bugs.python.org/issue1111    created  uzytkownik               

Test debug assertion in bsddb test_1413192.py                    09/05/07
CLOSED http://bugs.python.org/issue1112    created  db3l                     

interrupt_main() fails to interrupt raw_input()                  09/05/07
CLOSED http://bugs.python.org/issue1113    created  anand                    

_curses issues on 64-bit big-endian (e.g, AIX)                   09/06/07
       http://bugs.python.org/issue1114    created  lukemewburn              

Minor Change For Better cross compile                            09/06/07
       http://bugs.python.org/issue1115    created  zengbo                   

reference in extending doc to non-existing file                  09/06/07
CLOSED http://bugs.python.org/issue1116    created  anthon                   

Spurious warning about missing _sha256 and _sha512 when not need 09/06/07
CLOSED http://bugs.python.org/issue1117    created  dripton                  

hashlib module fails with TypeError                              09/06/07
CLOSED http://bugs.python.org/issue1118    created  dripton                  

Search index is messed up after partial rebuilding               09/06/07
CLOSED http://bugs.python.org/issue1119    created  lars.gustaebel           

"make altinstall" installs pydoc, idle, smtpd.py with broken she 09/06/07
       http://bugs.python.org/issue1120    created  dripton                  

Document inspect.getfullargspec()                                09/06/07
       http://bugs.python.org/issue1121    created  brett.cannon             

PyTuple_Size and PyTuple_GET_SIZE return type documentation inco 09/07/07
       http://bugs.python.org/issue1122    created  gaul                     

split(None, maxsplit) does not strip whitespace correctly        09/07/07
       http://bugs.python.org/issue1123    created  nirs                     

Webchecker not parsing css "@import url"                         09/07/07
       http://bugs.python.org/issue1124    created  ready.eddy               

bytes.split shold have same interface as str.split, or different 09/07/07
CLOSED http://bugs.python.org/issue1125    created  nirs                     

file.fileno and file.isatty() should be implementable by any fil 09/07/07
       http://bugs.python.org/issue1126    created  nirs                     

No tests for inspect.getfullargspec()                            09/07/07
       http://bugs.python.org/issue1127    created  brett.cannon             

msilib.Directory.make_short only handles file names with a singl 09/07/07
       http://bugs.python.org/issue1128    created  atuining                 

OpenSSL detection broken for Python 3.0a1                        09/07/07
CLOSED http://bugs.python.org/issue1129    created  pythonmeister            

Idle - Save (buffer) - closes IDLE and does not save file (Windo 09/08/07
       http://bugs.python.org/issue1130    created  infixum                  

Reference Manual: "for statement" links to "break statement"     09/08/07
       http://bugs.python.org/issue1131    created  Martoon                  

compile error in poplib.py                                       09/08/07
CLOSED http://bugs.python.org/issue1132    created  andre                    

python3.0-config raises SyntaxError                              09/09/07
CLOSED http://bugs.python.org/issue1133    created  complex                  

Parsing a simple script eats all of your memory                  09/09/07
       http://bugs.python.org/issue1134    created  complex                  

xview/yview of Tix.Grid is broken                                09/09/07
       http://bugs.python.org/issue1135    created  ocean-city               

Bdb documentation                                                09/09/07
       http://bugs.python.org/issue1136    created  arklad                   

pyexpat patch for changing buffer_size                           09/09/07
       http://bugs.python.org/issue1137    created  AchimGaedke              

Fixer needed for __future__ imports                              09/09/07
       http://bugs.python.org/issue1138    created  collinwinter             

PyFile_Encoding should be PyFile_SetEncoding                     09/10/07
       http://bugs.python.org/issue1139    created  gagenellina              

re.sub returns str when processing empty unicode string          09/10/07
       http://bugs.python.org/issue1140    created  beda                     

reading large files                                              09/10/07
       http://bugs.python.org/issue1141    created  Richard.Christen at unice.fr

code sample showing errors reading large files with py 2.5/3.0   09/10/07
       http://bugs.python.org/issue1142    created  Richard.Christen at unice.fr

Update to latest ElementTree in Python 2.6                       09/11/07
       http://bugs.python.org/issue1143    created  effbot                   

parsermodule validation out of sync with Grammar                 09/11/07
       http://bugs.python.org/issue1144    created  dbinger                  

Allow str.join to join non-string types (as per PEP 3100)        09/11/07
       http://bugs.python.org/issue1145    created  thomas.lee               

TextWrap vs words 1-character shorter than the width             09/11/07
       http://bugs.python.org/issue1146    created  sam                      

string exceptions inconsistently deprecated/disabled             09/11/07
       http://bugs.python.org/issue1147    created  exarkun                  



Issues Now Closed (43)
______________________

2to3 crashes on input files with no trailing newlines             14 days
       http://bugs.python.org/issue1001    collinwinter             

Backport ABC to 2.6                                               16 days
       http://bugs.python.org/issue1026    baranguren               

test_cmd_line starts python without -E                            13 days
       http://bugs.python.org/issue1056    ncoghlan                 

ssl.py shouldn't change class names from 2.6 to 3.x               11 days
       http://bugs.python.org/issue1065    janssen                  

Unexpected results in Tutorial about Unicode                       2 days
       http://bugs.python.org/issue1092    georg.brandl             

TypeError in poplib.py                                             7 days
       http://bugs.python.org/issue1094    gvanrossum               

make install failed                                                4 days
       http://bugs.python.org/issue1095    georg.brandl             

Deeply recursive repr segfault                                     7 days
       http://bugs.python.org/issue1096    brett.cannon             

Is there just no PRINT statement any more? Or it just doesn't w    0 days
       http://bugs.python.org/issue1101    gvanrossum               

Typo in dummy_threading documentation                              0 days
       http://bugs.python.org/issue1103    dthomasset               

patch for readme.txt in PCbuild8                                   0 days
       http://bugs.python.org/issue1105    loewis                   

Error in random.shuffle                                            0 days
       http://bugs.python.org/issue1106    georg.brandl             

2to3,   lambda with non-tuple argument inside parenthesis          4 days
       http://bugs.python.org/issue1107    collinwinter             

Users' directories information                                     1 days
       http://bugs.python.org/issue1111    uzytkownik               

Test debug assertion in bsddb test_1413192.py                      1 days
       http://bugs.python.org/issue1112    gregory.p.smith          

interrupt_main() fails to interrupt raw_input()                    2 days
       http://bugs.python.org/issue1113    anand                    

reference in extending doc to non-existing file                    0 days
       http://bugs.python.org/issue1116    georg.brandl             

Spurious warning about missing _sha256 and _sha512 when not nee    0 days
       http://bugs.python.org/issue1117    skip.montanaro           

hashlib module fails with TypeError                                0 days
       http://bugs.python.org/issue1118    georg.brandl             

Search index is messed up after partial rebuilding                 0 days
       http://bugs.python.org/issue1119    georg.brandl             

bytes.split shold have same interface as str.split, or differen    4 days
       http://bugs.python.org/issue1125    gvanrossum               

OpenSSL detection broken for Python 3.0a1                          0 days
       http://bugs.python.org/issue1129    georg.brandl             

compile error in poplib.py                                         1 days
       http://bugs.python.org/issue1132    georg.brandl             

python3.0-config raises SyntaxError                                0 days
       http://bugs.python.org/issue1133    loewis                   

raw_input defers alarm signal                                   1666 days
       http://bugs.python.org/issue685846  georg.brandl             

support for server side transactions in _ssl                    1498 days
       http://bugs.python.org/issue783188  loewis                   

patch for build with read-only $srcdir                          1486 days
       http://bugs.python.org/issue786737  loewis                   

SSL-ed sockets don't close correct?                             1168 days
       http://bugs.python.org/issue978833  loewis                   

a bunch of infinite C recursions                                 844 days
       http://bugs.python.org/issue1202533 brett.cannon             

add single html files                                            833 days
       http://bugs.python.org/issue1209562 georg.brandl             

crash recursive __getattr__                                      744 days
       http://bugs.python.org/issue1267884 brett.cannon             

Traceback error when compiling Regex                             537 days
       http://bugs.python.org/issue1456280 brett.cannon             

winerror module                                                  450 days
       http://bugs.python.org/issue1505257 loewis                   

SSL "issuer" and "server" names cannot be parsed                 321 days
       http://bugs.python.org/issue1583946 janssen                  

Suggest a textlist() method for ElementTree                      291 days
       http://bugs.python.org/issue1602189 effbot                   

sys.intern() 2to3 fixer                                          261 days
       http://bugs.python.org/issue1619049 collinwinter             

Explain __method__ lookup semantics for new-style classes        168 days
       http://bugs.python.org/issue1684991 georg.brandl             

socket.error exceptions not subclass of StandardError            138 days
       http://bugs.python.org/issue1706815 gregory.p.smith          

imp.find_module doc ambiguity                                    134 days
       http://bugs.python.org/issue1708326 georg.brandl             

_lsprof.c:ptrace_enter_call assumes PyErr_* is clean              89 days
       http://bugs.python.org/issue1733973 arigo                    

expanduser("~") on Windows looks for HOME first                   62 days
       http://bugs.python.org/issue1749583 georg.brandl             

os.chmod failure                                                  35 days
       http://bugs.python.org/issue1767242 georg.brandl             

Binding <Control-space> fails                                     26 days
       http://bugs.python.org/issue1774736 loewis                   



Top Issues Most Discussed (10)
______________________________

 11 re.sub returns str when processing empty unicode string            1 days
open    http://bugs.python.org/issue1140   

 11 bytes.split shold have same interface as str.split, or differen    4 days
closed  http://bugs.python.org/issue1125   

  8 split(None, maxsplit) does not strip whitespace correctly          5 days
open    http://bugs.python.org/issue1123   

  7 Spurious warning about missing _sha256 and _sha512 when not nee    0 days
closed  http://bugs.python.org/issue1117   

  7 make install failed                                                4 days
closed  http://bugs.python.org/issue1095   

  6 reading large files                                                1 days
open    http://bugs.python.org/issue1141   

  5 code sample showing errors reading large files with py 2.5/3.0     1 days
open    http://bugs.python.org/issue1142   

  5 Parsing a simple script eats all of your memory                    3 days
open    http://bugs.python.org/issue1134   

  4 logging: delay_fh option and configuration kwargs                 41 days
open    http://bugs.python.org/issue1765140

  4 interrupt_main() fails to interrupt raw_input()                    2 days
closed  http://bugs.python.org/issue1113   




From guido at python.org  Tue Sep 11 19:48:57 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 11 Sep 2007 10:48:57 -0700
Subject: [Python-Dev] Making directories and zip files executable
In-Reply-To: <46E6A875.9020208@gmail.com>
References: <46E6A875.9020208@gmail.com>
Message-ID: <ca471dc20709111048t4bfcf1e5xbc7483cffd5490db@mail.gmail.com>

I could use a refresher on how PJE's patch solves Andy's problem.

On 9/11/07, Nick Coghlan <ncoghlan at gmail.com> wrote:
> The local patch I have for PEP 366 is somewhat stale, and before I bring
> it up to date with SVN head, I'd like to close out the issue raised a
> while back regarding making zip files executable [1].
>
> The original proposal was for a new command line switch, but PJE came up
> with a patch (attached to the roundup tracker item) that uses the
> existing import machinery to avoid the need for the extra command line
> switch (by checking if the argument is a valid sys.path entry before
> checking to see if it is an executable script).
>
> I personally like the idea (and PJE's approach), and the performance
> impact on script startup time appears to be negligible (although I
> haven't performed any high precision measurements - I'm just using the
> Linux time utility on a short test script with and without the patch).
>
> Are there any objections to my committing this?
>
> Cheers,
> Nick.
>
> [1] http://bugs.python.org/issue1739468
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> ---------------------------------------------------------------
>              http://www.boredomandlaziness.org
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From janssen at parc.com  Tue Sep 11 20:12:09 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 11 Sep 2007 11:12:09 PDT
Subject: [Python-Dev] adding a "test" fork to a setup.py package
Message-ID: <07Sep11.111218pdt."57996"@synergy1.parc.xerox.com>

I'm packaging up the SSL support for Python 2.3, and I'd like to be
able to include the unit test for it along with the package.  Ideally,
I'd like to be able to say

    % python setup.py build
    % python setup.py test

and have the regrtest.py module run my tests.  Any ideas (examples) of
how to do that?

Bill

From anomyo2 at gmail.com  Tue Sep 11 20:33:00 2007
From: anomyo2 at gmail.com (=?iso-8859-1?Q?Carlos_Mart=EDnez?=)
Date: Tue, 11 Sep 2007 20:33:00 +0200
Subject: [Python-Dev] Compiler Python
Message-ID: <000001c7f4a2$307fa720$917ef560$@com>

Hello

 

Does anyone know where I can obtain detailed information about the
Python compiler? (Table of tokens, list of grammar productions,
semantic restrictions...)

 

Thanks.


From barry at python.org  Tue Sep 11 20:38:44 2007
From: barry at python.org (Barry Warsaw)
Date: Tue, 11 Sep 2007 14:38:44 -0400
Subject: [Python-Dev] adding a "test" fork to a setup.py package
In-Reply-To: <07Sep11.111218pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep11.111218pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <E2C08A03-4BC3-420C-980D-75BA1C211FFA@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 11, 2007, at 2:12 PM, Bill Janssen wrote:

> I'm packaging up the SSL support for Python 2.3, and I'd like to be
> able to include the unit test for it along with the package.  Ideally,
> I'd like to be able to say
>
>     % python setup.py build
>     % python setup.py test
>
> and have the regrtest.py module run my tests.  Any ideas (examples) of
> how to do that?

The email package does something like this by having most of the  
tests in a subpackage of email, with a shim in Lib/test for regrtest.
The standalone package has a testall script, but that should really  
be converted to nosetests or some such.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRubgtXEjvBPtnXfVAQIGJgP/aeCOlybJj0sA3k6WOWMCOhugggLHHtO2
lu5v7hZZG5nqe5iApwxjbiylxvMRfRB6HS7dgEABx1D5OC3uFssn3kUzokfBtsxy
I/e4qYiTSCG3WZacqytAqmjKt3FkceIo+l6YKx29FjPlaoHHz0UzCJIdW9AuJp4a
Ussk9AOPIXo=
=FMDE
-----END PGP SIGNATURE-----

From tjreedy at udel.edu  Tue Sep 11 21:01:33 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 11 Sep 2007 15:01:33 -0400
Subject: [Python-Dev] Compiler Python
References: <000001c7f4a2$307fa720$917ef560$@com>
Message-ID: <fc6oma$63h$1@sea.gmane.org>


"Carlos Martínez" <anomyo2 at gmail.com> wrote in message
news:000001c7f4a2$307fa720$917ef560$@com...
| Does anyone know where I can obtain detailed information about the
| Python compiler? (Table of tokens, list of grammar productions,
| semantic restrictions...)

Ask this sort of question on the python-list mailing list or the
comp.lang.python or gmane.comp.python.general newsgroups.
And check the source code. 




From theller at ctypes.org  Tue Sep 11 21:04:31 2007
From: theller at ctypes.org (Thomas Heller)
Date: Tue, 11 Sep 2007 21:04:31 +0200
Subject: [Python-Dev] adding a "test" fork to a setup.py package
In-Reply-To: <E2C08A03-4BC3-420C-980D-75BA1C211FFA@python.org>
References: <07Sep11.111218pdt."57996"@synergy1.parc.xerox.com>
	<E2C08A03-4BC3-420C-980D-75BA1C211FFA@python.org>
Message-ID: <fc6orv$66l$1@sea.gmane.org>

Barry Warsaw schrieb:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On Sep 11, 2007, at 2:12 PM, Bill Janssen wrote:
> 
>> I'm packaging up the SSL support for Python 2.3, and I'd like to be
>> able to include the unit test for it along with the package.  Ideally,
>> I'd like to be able to say
>>
>>     % python setup.py build
>>     % python setup.py test
>>
>> and have the regrtest.py module run my tests.  Any ideas (examples) of
>> how to do that?
> 
> The email package does something like this by having most of the  
> tests in a subpackage of email, with a shim in Lib/test for regrtest.
> The standalone package has a testall script, but that should really  
> be converted to nosetests or some such.

ctypes does it in a similar way.  Tests are in the Lib/ctypes/test package.
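
A rough sketch of that shim arrangement, assuming a hypothetical package
whose test subpackage exposes a suite() function. RoundTripTest, suite and
the file names in the comments are made up for illustration; the only
convention taken from this thread is that regrtest calls a module-level
test_main() if one exists.

```python
import io
import unittest


# --- would live inside the package, e.g. mypkg/test/test_mypkg.py ---
# (hypothetical path; the real tests of a package go here)
class RoundTripTest(unittest.TestCase):
    def test_roundtrip(self):
        self.assertEqual("abc".encode("ascii").decode("ascii"), "abc")


def suite():
    """Collect the package's tests into one suite."""
    return unittest.defaultTestLoader.loadTestsFromTestCase(RoundTripTest)


# --- the shim, e.g. Lib/test/test_mypkg.py (hypothetical path) ---
# In a real layout this file would import suite from the package instead
# of defining it in the same module.
def test_main():
    """Entry point a regrtest-style runner would call."""
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite()).wasSuccessful()
```

The point of the split is that the tests ship with the package, while the
shim stays a few lines long and only forwards to them.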

Thomas


From pje at telecommunity.com  Tue Sep 11 21:19:46 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 11 Sep 2007 15:19:46 -0400
Subject: [Python-Dev] Making directories and zip files executable
In-Reply-To: <ca471dc20709111048t4bfcf1e5xbc7483cffd5490db@mail.gmail.co
 m>
References: <46E6A875.9020208@gmail.com>
	<ca471dc20709111048t4bfcf1e5xbc7483cffd5490db@mail.gmail.com>
Message-ID: <20070911191716.06F913A40D7@sparrow.telecommunity.com>

At 10:48 AM 9/11/2007 -0700, Guido van Rossum wrote:
>I could use a refresher on how PJE's patch solves Andy's problem.

It does the same thing, but with __main__ instead of __zipmain__, and 
without needing the -z.  So instead of "python -z foo.zip" you can 
just do "python foo.zip".  This means you can use a reasonably 
cross-platform #! header that invokes Python via "env".

If you tried to use -z with env, Andy's approach would either only 
work on BSDish platforms that support multiple options to a #! 
command, or else you'd have to ditch the "env" and hardcode the path
to Python.  So not needing a command-line option improves the 
effective portability/usability of the executable zip file.

Also, being able to run "python directory_to_be_zipped_later" also 
lets you test/develop your program without it needing to be zipped first.
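
As a small illustration of the behaviour described above, the sketch below
builds a zip file and a directory that each contain a __main__.py and runs
them by passing the path straight to the interpreter. It assumes a Python
release in which this feature is available (it is written for a current
Python 3); the names app.zip and app_dir are arbitrary.

```python
import os
import subprocess
import sys
import tempfile
import zipfile


def run_zip(workdir):
    """Create app.zip with a __main__.py and execute 'python app.zip'."""
    archive = os.path.join(workdir, "app.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("__main__.py", "print('hello from the zip')\n")
    result = subprocess.run([sys.executable, archive],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


def run_directory(workdir):
    """Create app_dir/__main__.py and execute 'python app_dir'."""
    appdir = os.path.join(workdir, "app_dir")
    os.mkdir(appdir)
    with open(os.path.join(appdir, "__main__.py"), "w") as f:
        f.write("print('hello from the directory')\n")
    result = subprocess.run([sys.executable, appdir],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_zip(tmp))
        print(run_directory(tmp))
```

Because no extra switch is involved, the same archive can carry a
"#!/usr/bin/env python" style header and be made executable directly.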


>On 9/11/07, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > The local patch I have for PEP 366 is somewhat stale, and before I bring
> > it up to date with SVN head, I'd like to close out the issue raised a
> > while back regarding making zip files executable [1].
> >
> > The original proposal was for a new command line switch, but PJE came up
> > with a patch (attached to the roundup tracker item) that uses the
> > existing import machinery to avoid the need for the extra command line
> > switch (by checking if the argument is a valid sys.path entry before
> > checking to see if it is an executable script).
> >
> > I personally like the idea (and PJE's approach), and the performance
> > impact on script startup time appears to be negligible (although I
> > haven't performed any high precision measurements - I'm just using the
> > Linux time utility on a short test script with and without the patch).
> >
> > Are there any objections to my committing this?
> >
> > Cheers,
> > Nick.
> >
> > [1] http://bugs.python.org/issue1739468


From pje at telecommunity.com  Tue Sep 11 21:23:04 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 11 Sep 2007 15:23:04 -0400
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.co
 m>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
Message-ID: <20070911192035.94CE33A40D7@sparrow.telecommunity.com>

At 10:07 AM 9/11/2007 -0500, Justin Tulloss wrote:


>On 9/11/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > 1. Some global interpreter state/modules are protected (where are these
> > globals at?)
>
>It's the interpreter and thread state itself (pystate.h), for the thread
>state, also _PyThreadState_Current. Then there is the GC state, in
>particular "generations". There are various caches and counters also.
>
>
>Caches seem like they definitely might be a problem. Would you mind 
>expanding on this a little? What gets cached and why?

It's not just caches and counters.  It's also every built-in type 
structure, builtin module, builtin function...  any Python object 
that's a built-in, period.  That includes things like None, True, and False.

Caches would include such things as the pre-created integers -100 
through 255, the 1-byte character strings for chr(0)-chr(255), and 
the interned strings cache, to name a few.
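
The interned strings cache can be observed from Python itself. The sketch
below uses sys.intern, which is the Python 3 spelling (in the Python 2
releases current in this thread it was the builtin intern()). The helper
name interned_pair is invented for the example.

```python
import sys


def interned_pair():
    """Build two equal strings at run time and intern both of them."""
    parts = ["global_", "interpreter_", "lock"]
    a = "".join(parts)   # built at run time, not a compile-time constant
    b = "".join(parts)   # an equal string built separately
    # intern() looks each string up in one process-wide table and returns
    # the single canonical object stored there for that value.
    return sys.intern(a), sys.intern(b)


x, y = interned_pair()
same_object = x is y     # both names refer to the one canonical string
```

That single table is exactly the kind of process-wide mutable state that
every thread touches, which is why it matters for any GIL removal.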

Most of these things I've mentioned are truly global, and not 
specific to an individual interpreter.


From brett at python.org  Tue Sep 11 21:30:40 2007
From: brett at python.org (Brett Cannon)
Date: Tue, 11 Sep 2007 12:30:40 -0700
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E6B928.1090603@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<46E6B928.1090603@v.loewis.de>
Message-ID: <bbaeab100709111230v6fab3d66w8eabf9623737b859@mail.gmail.com>

On 9/11/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> >     It's the interpreter and thread state itself (pystate.h), for the thread
> >     state, also _PyThreadState_Current. Then there is the GC state, in
> >     particular "generations". There are various caches and counters also.
> >
> >
> > Caches seem like they definitely might be a problem. Would you mind
> > expanding on this a little? What gets cached and why?
>
> Depends on the Python version what precisely gets cached. Several types
> preserve a pool of preallocated objects, to speed up allocation.
> Examples are intobject.c (block_list, free_list), frameobject.c
> (free_list), listobject.c (free_list), methodobject.c (free_list),
> floatobject.c (block_list, free_list), classobject.c (free_list).
>
> Plus there are tons of variables caching string objects. From
> classobject.c alone: getattrstr, setattrstr, delattrstr, docstr,
> modstr, namestr, initstr, delstr, reprstr, strstr, hashstr, eqstr,
> cmpstr, getitemstr, setitemstr, delitemstr, lenstr, iterstr, nextstr,
> getslicestr, setslicestr, delslicestr, __contains__, all arguments
> to UNARY, UNARY_FB, BINARY, BINARY_INPLACE (e.g. instance_neg,
> instance_or, instance_ior), then cmp_obj, nonzerostr, indexstr.
> (admittedly, classobject.c is extreme here).
>
> There are probably more classes which I just forgot.

We should probably document where all of these globals lists are
instead of relying on looking for all file level static declarations
or something.  Or would there be benefit to moving things like this to
the interpreter struct so that threads within a single interpreter
call are locked but interpreters can act much more independently?

-Brett

From janssen at parc.com  Tue Sep 11 22:02:36 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 11 Sep 2007 13:02:36 PDT
Subject: [Python-Dev] adding a "test" fork to a setup.py package
In-Reply-To: <E2C08A03-4BC3-420C-980D-75BA1C211FFA@python.org> 
References: <07Sep11.111218pdt."57996"@synergy1.parc.xerox.com>
	<E2C08A03-4BC3-420C-980D-75BA1C211FFA@python.org>
Message-ID: <07Sep11.130243pdt."57996"@synergy1.parc.xerox.com>

It's actually not bad.  I put the test code and the data files in a
"test" subdirectory of my distribution, then added the following to
the setup.py file:

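# imports this command relies on (assumed to sit at the top of setup.py)
import os, sys
from distutils.core import Command
from distutils.command.build import build

class Test (Command):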

    user_options = []

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run (self):

        """Run the regrtest module appropriately"""

        # figure out where the _ssl2 extension will be put
        b = build(self.distribution)
        b.initialize_options()
        b.finalize_options()
        extdir = os.path.abspath(b.build_platlib)

        # now set up the load path
        topdir = os.path.dirname(os.path.abspath(__file__))
        localtestdir = os.path.join(topdir, "test")
        sys.path.insert(0, topdir)        # for ssl package
        sys.path.insert(0, localtestdir)  # for test module
        sys.path.insert(0, extdir)        # for _ssl2 extension

        # make sure the network is enabled
        import test.test_support
        test.test_support.use_resources = ["network"]

        # and load the test and run it
        os.chdir(localtestdir)
        the_module = __import__("test_ssl", globals(), locals(), [])
        # Most tests run to completion simply as a side-effect of
        # being imported.  For the benefit of tests that can't run
        # that way (like test_threaded_import), explicitly invoke
        # their test_main() function (if it exists).
        indirect_test = getattr(the_module, "test_main", None)
        if indirect_test is not None:
            indirect_test()

and added 

      cmdclass={'test': Test},

to the setup call.  Irritating that you have to manually install the
test files as data_files.  Also irritating that data_files aren't
automatically added to the manifest, and that Test has to have null
initialize_options and finalize_options.  And that there's no easy
way to figure out where the build process left the extension.

Bill


From janssen at parc.com  Tue Sep 11 22:36:31 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 11 Sep 2007 13:36:31 PDT
Subject: [Python-Dev] re-using the Python setup.py file?
Message-ID: <07Sep11.133638pdt."57996"@synergy1.parc.xerox.com>

I see that the setup.py at the top level of the Python distribution
does a lot of things wrt sensing compiler options, etc, that I'd like
to re-use in my SSL setup.py distribution file.  I'm a bit curious
as to why this framework isn't in the distutils package?

Bill

From martin at v.loewis.de  Tue Sep 11 22:53:01 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 11 Sep 2007 22:53:01 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <20070911192035.94CE33A40D7@sparrow.telecommunity.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
Message-ID: <46E7002D.6050005@v.loewis.de>

> It's not just caches and counters.  It's also every built-in type
> structure, builtin module, builtin function...  any Python object that's
> a built-in, period.  That includes things like None, True, and False.

Sure - but those things don't get modified that often, except for their
reference count. In addition, they are objects, and Justin seems to
believe that things are easier if they are objects.

Regards,
Martin

From foom at fuhm.net  Tue Sep 11 22:54:58 2007
From: foom at fuhm.net (James Y Knight)
Date: Tue, 11 Sep 2007 16:54:58 -0400
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <bbaeab100709111230v6fab3d66w8eabf9623737b859@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<46E6B928.1090603@v.loewis.de>
	<bbaeab100709111230v6fab3d66w8eabf9623737b859@mail.gmail.com>
Message-ID: <C9805EA8-B9A7-43F6-9FBD-DB06E903136B@fuhm.net>

On Sep 11, 2007, at 3:30 PM, Brett Cannon wrote:
> We should probably document where all of these globals lists are
> instead of relying on looking for all file level static declarations
> or something.  Or would there be benefit to moving things like this to
> the interpreter struct so that threads within a single interpreter
> call are locked but interpreters can act much more independently?

This would be nice. It would be really nice if python was embeddable  
more like TCL: separate interpreters really are separate, and don't  
share state. That means basically everything has to be stored in an  
interp-specific data structure, not in static variables.

But this has been raised before, and was rejected as not worth the  
amount of work that would be required to achieve it. (it's certainly  
not worth it enough for *me* to do the work, so I can't blame anyone  
else for making the same determination.)

James

From martin at v.loewis.de  Tue Sep 11 23:00:34 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 11 Sep 2007 23:00:34 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <bbaeab100709111230v6fab3d66w8eabf9623737b859@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>	
	<46E652CD.1070901@v.loewis.de>	
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>	
	<46E6B928.1090603@v.loewis.de>
	<bbaeab100709111230v6fab3d66w8eabf9623737b859@mail.gmail.com>
Message-ID: <46E701F2.3060606@v.loewis.de>

> We should probably document where all of these globals lists are
> instead of relying on looking for all file level static declarations
> or something.

I'm not sure what would be gained here, except for people occasionally
(i.e. every three years) asking how they can best get rid of the GIL.

> Or would there be benefit to moving things like this to
> the interpreter struct so that threads within a single interpreter
> call are locked but interpreters can act much more independently?

The "multiple interpreter" feature doesn't quite work, and likely
won't for the foreseeable future; specifically, objects can easily
leak across interpreters. That's actually not a problem for immutable
objects, like strings, but this is where the global objects that PJE
mentions come into play: types, including exceptions. Making them
per-interpreter would probably break quite a lot of code.

As for the cached strings - it would be easy to make a global table
of these, e.g. calling them _PyS__init__, _PyS__add__, and so on.
These could be initialized at startup, simplifying the code that
uses them because they don't have to worry about failures.

Regards,
Martin

From janssen at parc.com  Tue Sep 11 23:19:36 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 11 Sep 2007 14:19:36 PDT
Subject: [Python-Dev] SSL package for Python 2.3 to 2.5
Message-ID: <07Sep11.141943pdt."57996"@synergy1.parc.xerox.com>

I've put up an initial source package at
http://www.parc.com/janssen/transient/ssl-1.0.tar.gz which I've tested
with Python 2.3.5 on Mac OS X 10.4.10 (Intel) and Python 2.5 on Fedora
Core 7.  Please send bugs you find to me at janssen at parc.com.

Try "python setup.py build", then "python setup.py test".

Bill

From greg at krypto.org  Tue Sep 11 23:43:45 2007
From: greg at krypto.org (Gregory P. Smith)
Date: Tue, 11 Sep 2007 14:43:45 -0700
Subject: [Python-Dev] re-using the Python setup.py file?
In-Reply-To: <-148172167956746104@unknownmsgid>
References: <-148172167956746104@unknownmsgid>
Message-ID: <52dc1c820709111443x54837f66i6f20b4f581962eff@mail.gmail.com>

On 9/11/07, Bill Janssen <janssen at parc.com> wrote:
>
> I see that the setup.py at the top level of the Python distribution
> does a lot of things wrt sensing compiler options, etc, that I'd like
> to re-use in my SSL setup.py distribution file.  I'm a bit curious
> as to why this framework isn't in the distutils package?


I suspect a combo of (a) nobody has done it yet and (b) many of the things
done there felt too hackish to the person writing them.  Regardless of (b)
I'd place my money on (a).

In maintaining external bsddb and hashlib module distributions for use on
older Pythons, I have so far just pasted code as appropriate to/from the
Python setup.py and the separate distribution ones.  Not ideal, but trivial,
since once settled upon, the setup code didn't change much.

From greg.ewing at canterbury.ac.nz  Wed Sep 12 00:55:20 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 12 Sep 2007 10:55:20 +1200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <20070911192035.94CE33A40D7@sparrow.telecommunity.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
Message-ID: <46E71CD8.5070608@canterbury.ac.nz>

Phillip J. Eby wrote:
> It's also every built-in type 
> structure, builtin module, builtin function...  any Python object 
> that's a built-in, period.

Where "built-in" in this context means anything implemented
in C (i.e. it includes extension modules).

--
Greg

From greg.ewing at canterbury.ac.nz  Wed Sep 12 01:20:31 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 12 Sep 2007 11:20:31 +1200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E7002D.6050005@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de>
Message-ID: <46E722BF.8000807@canterbury.ac.nz>

Martin v. Löwis wrote:
> Sure - but those things don't get modified that often, except for their
> reference count.

The reference count is the killer, though -- you have
to lock the object even to do that. And it happens
a LOT, to all objects, including immutable ones.

--
Greg

From janssen at parc.com  Wed Sep 12 02:30:01 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 11 Sep 2007 17:30:01 PDT
Subject: [Python-Dev] what versions of Python don't have the "addr" field in
	the socket object?
Message-ID: <07Sep11.173003pdt."57996"@synergy1.parc.xerox.com>

Looks like this change is bothering the error returns from my
backported SSL code.  I believe this is only in 2.5.1 and later -- can
anyone confirm that?

Bill

------------------------------------------------------------
r52906 | martin.v.loewis | 2006-12-03 03:23:45 -0800 (Sun, 03 Dec 2006) | 4 lines

Patch #1544279: Improve thread-safety of the socket module by moving
the sock_addr_t storage out of the socket object.
Will backport to 2.5.


From janssen at parc.com  Wed Sep 12 03:02:54 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 11 Sep 2007 18:02:54 PDT
Subject: [Python-Dev] SSL package for Python 2.3 to 2.5
In-Reply-To: <07Sep11.141943pdt."57996"@synergy1.parc.xerox.com> 
References: <07Sep11.141943pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep11.180303pdt."57996"@synergy1.parc.xerox.com>

> I've put up an initial source package at
> http://www.parc.com/janssen/transient/ssl-1.0.tar.gz which I've tested
> with Python 2.3.5 on Mac OS X 10.4.10 (Intel) and Python 2.5 on Fedora
> Core 7.  Please send bugs you find to me at janssen at parc.com.
> 
> Try "python setup.py build", then "python setup.py test".

There was a bug for 2.5.1 in the package (the socket data structure
changed with 2.5.1), so I've put up a different version,
http://www.parc.com/janssen/transient/ssl-1.1.tar.gz which I've tested
with 2.3.5 and 2.5.1 on OS X.

Same drill:  try "python setup.py build", then "python setup.py test".
Report bugs to janssen at parc.com.

Thanks to Collin Winter for reporting this problem with 2.5.1.

Bill

From aahz at pythoncraft.com  Wed Sep 12 03:34:12 2007
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 11 Sep 2007 18:34:12 -0700
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep6.102547pdt."57996"@synergy1.parc.xerox.com>
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<07Sep6.102547pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <20070912013412.GB14034@panix.com>

On Thu, Sep 06, 2007, Bill Janssen wrote:
>
> By the way, I think the hostname matching provisions of 2818 (which
> is, after all, only an informational RFC, not a standard) are poorly
> thought out.  Many machines have more hostnames than you can shake a
> stick at, and often provide certs with the wrong hostname in them
> (usually because they have no way to determine what the *right*
> hostname is, from inside that machine).

...which is why you pretty much need to have a canonical hostname mapped
to each IP you're using on a machine.  Basically, you need to map the
hostname you intend to use to an IP, then do reverse-DNS to find out
whether the hostname is in fact the canonical hostname.  If not, you're
using the wrong hostname on your cert.
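
The forward-then-reverse check described above can be sketched roughly as
follows.  This is only an illustration, not code from the SSL package: the
function name is made up, and the two resolver arguments are there so the
check can be exercised without touching real DNS.

```python
import socket


def cert_hostname_is_canonical(hostname,
                               forward=socket.gethostbyname,
                               reverse=socket.gethostbyaddr):
    """Return True if `hostname` is the canonical name of the IP it maps to.

    `forward` and `reverse` default to the real resolver calls; they are
    parameters only so that the logic can be tried with fake lookups.
    """
    ip = forward(hostname)                       # hostname -> IP
    canonical, _aliases, _addrs = reverse(ip)    # IP -> canonical hostname
    # Compare case-insensitively and ignore a trailing root dot.
    return canonical.rstrip(".").lower() == hostname.rstrip(".").lower()
```

With the defaults, a call such as cert_hostname_is_canonical("www.example.org")
does a real lookup, so the answer depends on how the local resolver and the
reverse zone are configured.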
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer
http://www.lysator.liu.se/c/ten-commandments.html

From martin at v.loewis.de  Wed Sep 12 09:30:10 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 12 Sep 2007 09:30:10 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <C9805EA8-B9A7-43F6-9FBD-DB06E903136B@fuhm.net>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>	<46E652CD.1070901@v.loewis.de>	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>	<46E6B928.1090603@v.loewis.de>	<bbaeab100709111230v6fab3d66w8eabf9623737b859@mail.gmail.com>
	<C9805EA8-B9A7-43F6-9FBD-DB06E903136B@fuhm.net>
Message-ID: <46E79582.2080300@v.loewis.de>

> But this has been raised before, and was rejected as not worth the  
> amount of work that would be required to achieve it.

In my understanding, there is an important difference between
"it was rejected", and "it was not done".

Regards,
Martin

From martin at v.loewis.de  Wed Sep 12 09:32:13 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 12 Sep 2007 09:32:13 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E722BF.8000807@canterbury.ac.nz>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>	<46E652CD.1070901@v.loewis.de>	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>	<46E7002D.6050005@v.loewis.de>
	<46E722BF.8000807@canterbury.ac.nz>
Message-ID: <46E795FD.1070103@v.loewis.de>

>> Sure - but those things don't get modified that often, except for their
>> reference count.
> 
> The reference count is the killer, though -- you have
> to lock the object even to do that. And it happens
> a LOT, to all objects, including immutable ones.

Now we are getting into details: you do NOT have to lock
an object to modify its reference count. An atomic
increment/decrement operation is enough.

Regards,
Martin

From martin at v.loewis.de  Wed Sep 12 09:36:30 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 12 Sep 2007 09:36:30 +0200
Subject: [Python-Dev] what versions of Python don't have the "addr"
 field in	the socket object?
In-Reply-To: <07Sep11.173003pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep11.173003pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46E796FE.3060409@v.loewis.de>

> I believe this is only in 2.5.1 and later -- can
> anyone confirm that?

That's correct.

Martin

From ncoghlan at gmail.com  Wed Sep 12 11:19:35 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Sep 2007 19:19:35 +1000
Subject: [Python-Dev] Making directories and zip files executable
In-Reply-To: <ca471dc20709111048t4bfcf1e5xbc7483cffd5490db@mail.gmail.com>
References: <46E6A875.9020208@gmail.com>
	<ca471dc20709111048t4bfcf1e5xbc7483cffd5490db@mail.gmail.com>
Message-ID: <46E7AF27.6090300@gmail.com>

Guido van Rossum wrote:
> I could use a refresher on how PJE's patch solves Andy's problem.

I'm not sure if you're asking about how you would execute a zip file 
after the patch has been applied, or about the mechanics of how the 
patch works. PJE's last post covered the former question, so I'll cover 
the gist of the latter.

The patch works by passing the script argument to the import machinery 
to see if it is recognised as a valid sys.path entry (i.e. either a 
directory or a zip file in a default Python installation). If it is, 
then add that location to the front of sys.path and use the -m switch 
support to execute the "__main__" module directly.

If the filename passed in isn't recognised as a sys.path entry, then it 
is executed as a script as normal.
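
A rough Python-level sketch of that decision follows.  It is not the patch
itself (which works inside the interpreter's startup code); the function
name is invented, and it assumes a recent Python 3, where path importers
provide find_spec() (for zip files that means 3.10 or later).

```python
import pkgutil
import runpy
import sys


def run_script_argument(path):
    """Run `path` as a sys.path entry with a __main__ module, else as a script."""
    # Ask the import path hooks whether they recognise the location.
    importer = pkgutil.get_importer(path)
    if importer is None:
        # Not a directory or zip file: execute it as an ordinary script.
        return runpy.run_path(path, run_name="__main__")

    spec = importer.find_spec("__main__")
    if spec is None or spec.loader is None:
        raise ImportError("no __main__ module found in %r" % (path,))
    code = spec.loader.get_code("__main__")

    namespace = {"__name__": "__main__", "__file__": spec.origin}
    sys.path.insert(0, path)          # the location goes to the front of sys.path
    try:
        exec(code, namespace)
    finally:
        # The real interpreter would leave the entry in place; it is removed
        # here only so the sketch has no lasting side effect.
        sys.path.remove(path)
    return namespace
```

Note that the test is whether the path hooks accept the location, not
whether it is already listed in sys.path.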

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From jason.orendorff at gmail.com  Wed Sep 12 16:47:33 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Wed, 12 Sep 2007 10:47:33 -0400
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E795FD.1070103@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
Message-ID: <bb8868b90709120747l1e5fbcdfjf02999e606be56d3@mail.gmail.com>

On 9/12/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Now we are getting into details: you do NOT have to lock
> an object to modify its reference count. An atomic
> increment/decrement operation is enough.

One could measure the performance hit incurred by using atomic
operations for refcounting by hacking a few macros -- right?

Deferred reference counting (DRC for short) might help...
http://www.memorymanagement.org/glossary/d.html#deferred.reference.counting

I can explain a little more how this works if anyone's interested.
DRC basically eliminates reference counting for locals--that is,
pointers from the stack to an object.  An object becomes refcounted
only when some other object gets a pointer to it.  The drawback is
that destructors aren't called quite as promptly as in true
refcounting.  (They're still called in the right order,
though--barring cycles, an object's destructor is called before its
children's destructors.)
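
To make the idea concrete, here is a toy model in Python.  It is purely
illustrative and says nothing about how Tamarin or CPython would implement
it: all names are invented, the "stack" is just a list, and collection is
triggered by hand.

```python
class Obj:
    """A heap object in the toy model."""

    def __init__(self, name):
        self.name = name
        self.heap_refs = 0    # counts references from other objects only
        self.children = []


class DeferredHeap:
    def __init__(self):
        self.stack = []       # local references; never counted
        self.zero_count = []  # objects that no other object points to
        self.finalized = []   # names, in the order destructors ran

    def new(self, name):
        obj = Obj(name)
        self.zero_count.append(obj)
        self.stack.append(obj)        # held by a local, so no refcount
        return obj

    def store(self, parent, child):
        """Another object gets a pointer: only now is the child counted."""
        parent.children.append(child)
        child.heap_refs += 1
        if child in self.zero_count:
            self.zero_count.remove(child)

    def drop_local(self, obj):
        self.stack.remove(obj)        # no refcount traffic at all

    def collect(self):
        """Destroy zero-count objects that the stack no longer references."""
        pending = [o for o in self.zero_count if o not in self.stack]
        while pending:
            obj = pending.pop(0)
            self.zero_count.remove(obj)
            self.finalized.append(obj.name)   # parent's destructor runs first
            for child in obj.children:
                child.heap_refs -= 1
                if child.heap_refs == 0:
                    self.zero_count.append(child)
                    if child not in self.stack:
                        pending.append(child)
            obj.children = []
```

Dropping the last local reference does nothing by itself; the object only
goes away at the next collect(), which is the "not quite as promptly" part.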

What counts as "stack" is up to the implementation; typically it means
"the C stack".  This could be used to eliminate most refcounting in C
code, although listobject.c would keep it.  The amount of per-platform
assembly code needed is surprisingly small (and won't change, once
you've written it--the Tamarin JavaScript VM does this).

You could go further and treat the Python f_locals and interpreter
stack as "stack". I think this would eliminate all refcounting in the
interpreter.  Of course, it complicates matters that f_locals is
actually an object visible from Python.

Just a thought, not a demand, please don't flame me,
-j

From skip at pobox.com  Wed Sep 12 17:31:59 2007
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 12 Sep 2007 10:31:59 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <bbaeab100709111230v6fab3d66w8eabf9623737b859@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<46E6B928.1090603@v.loewis.de>
	<bbaeab100709111230v6fab3d66w8eabf9623737b859@mail.gmail.com>
Message-ID: <18152.1647.855781.953782@montanaro.dyndns.org>


    Brett> We should probably document where all of these globals lists are
    Brett> instead of relying on looking for all file level static
    Brett> declarations or something.

I smell a wiki page.

Skip

    Brett> Or would there be benefit to moving things like this to the
    Brett> interpreter struct so that threads within a single interpreter
    Brett> call are locked but interpreters can act much more independently?

Would that simplify things all that much?  All containers would probably
still rely on the GIL.  Also, all objects rely on it to do reference count
increment/decrement as I recall.

Skip

From skip at pobox.com  Wed Sep 12 17:38:47 2007
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 12 Sep 2007 10:38:47 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E795FD.1070103@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
Message-ID: <18152.2055.258930.576257@montanaro.dyndns.org>


    Martin> Now we are getting into details: you do NOT have to lock an
    Martin> object to modify its reference count. An atomic
    Martin> increment/decrement operation is enough.

Implemented in asm I suspect?  For common CPUs this could just be part of
the normal Python distribution.  For uncommon ones this could use a lock
until someone gets around to writing the necessary couple lines of
assembler.

Skip

From janssen at parc.com  Wed Sep 12 20:03:24 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 12 Sep 2007 11:03:24 PDT
Subject: [Python-Dev] Windows package for new SSL package?
Message-ID: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com>

I can't figure out how to build a Windows package for ssl-1.1.tar.gz,
and probably don't have the tools to do it anyway.  I presume that
both a Windows machine and Visual Studio (because there's a C
extension) are required?

Anyone out there who's interested in the challenge?

It's at http://www.parc.com/janssen/transient/ssl-1.1.tar.gz.

Incidentally, there's no documentation in the package; instead, just
use the development documentation at
http://docs.python.org/dev/library/ssl.html.

Bill

From janssen at parc.com  Wed Sep 12 20:05:54 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 12 Sep 2007 11:05:54 PDT
Subject: [Python-Dev] SSL-protected server on python.org for testing?
Message-ID: <07Sep12.110600pdt."57996"@synergy1.parc.xerox.com>

The SSL tests currently use SSL-protected ports on gmail.com and
Verisign for testing.  That's not what they are for; I think we should
shift to using SSL-protected ports on python.org somewhere.  Are there
any HTTPS servers, or SSL-protected POP or IMAP servers, currently
running on python.org already that I could use?  The "use" is an SSL
handshake with the server, once or twice per test run.

Bill

From janssen at parc.com  Wed Sep 12 20:12:24 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 12 Sep 2007 11:12:24 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <20070912013412.GB14034@panix.com> 
References: <-4762611594645938717@unknownmsgid>
	<ca471dc20709041137s61697b41uc4a493725ae8c3b9@mail.gmail.com>
	<46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<07Sep6.102547pdt."57996"@synergy1.parc.xerox.com>
	<20070912013412.GB14034@panix.com>
Message-ID: <07Sep12.111225pdt."57996"@synergy1.parc.xerox.com>

> > By the way, I think the hostname matching provisions of 2818 (which
> > is, after all, only an informational RFC, not a standard) are poorly
> > thought out.  Many machines have more hostnames than you can shake a
> > stick at, and often provide certs with the wrong hostname in them
> > (usually because they have no way to determine what the *right*
> > hostname is, from inside that machine).
> 
> ...which is why you pretty much need to have a canonical hostname mapped
> to each IP you're using on a machine.  Basically, you need to map the
> hostname you intend to use to an IP, then do reverse-DNS to find out
> whether the hostname is in fact the canonical hostname.  If not, you're
> using the wrong hostname on your cert.

Yep.  The problem is having a particular service know which
certificate it should choose to use, and also to know when the network
connectivity has changed.  Usually, server ports are bound to wildcard
IP addresses, so that they can still be reached even if the network
connectivity changes (particularly true for servers running on
laptops, or the Python server I'm running on my iPhone).  The server
has no way of knowing which IP address the client knows it as, and no
way of knowing which of its multiple certificates to present, so that
the name in the cert will match the name the client thought it was
using.

Or am I wrong?  Is there some interface in the socket API which gives
this information?
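
For what it's worth, on a TCP socket bound to the wildcard address, calling
getsockname() on the *accepted* connection reports the concrete local IP
address the client connected to.  It does not reveal which hostname the
client used, so it only helps when certificates can be chosen per address.
A small loopback sketch (the function name is made up):

```python
import socket


def wildcard_vs_accepted_address(target="127.0.0.1"):
    """Return (listening address, address seen on the accepted connection)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("", 0))               # wildcard address, any free port
    server.listen(1)
    port = server.getsockname()[1]

    client = socket.create_connection((target, port), timeout=5)
    conn, _peer = server.accept()
    try:
        # The listening socket only knows the wildcard; the accepted
        # socket knows which local address this client actually reached.
        return server.getsockname()[0], conn.getsockname()[0]
    finally:
        conn.close()
        client.close()
        server.close()
```

On a typical machine this returns "0.0.0.0" for the listening socket and
the loopback address for the accepted one.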

Bill


From barry at python.org  Wed Sep 12 20:25:59 2007
From: barry at python.org (Barry Warsaw)
Date: Wed, 12 Sep 2007 14:25:59 -0400
Subject: [Python-Dev] SSL-protected server on python.org for testing?
In-Reply-To: <07Sep12.110600pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep12.110600pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <B124C7F2-0B23-4863-A902-21095659CC1A@python.org>

On Sep 12, 2007, at 2:05 PM, Bill Janssen wrote:

> The SSL tests currently use SSL-protected ports on gmail.com and
> Verisign for testing.  That's not what they are for; I think we should
> shift to using SSL-protected ports on python.org somewhere.  Are there
> any HTTPS servers, or SSL-protected POP or IMAP servers, currently
> running on python.org already that I could use?  The "use" is an SSL
> handshake with the server, once or twice per test run.

svn.python.org?

-Barry


From janssen at parc.com  Wed Sep 12 20:38:23 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 12 Sep 2007 11:38:23 PDT
Subject: [Python-Dev] SSL-protected server on python.org for testing?
In-Reply-To: <B124C7F2-0B23-4863-A902-21095659CC1A@python.org> 
References: <07Sep12.110600pdt."57996"@synergy1.parc.xerox.com>
	<B124C7F2-0B23-4863-A902-21095659CC1A@python.org>
Message-ID: <07Sep12.113830pdt."57996"@synergy1.parc.xerox.com>

Yes, port 443 on svn.python.org seems to work for this purpose.

Everyone OK with that?  If so, I'll change the SSL test code.

Bill

From guido at python.org  Wed Sep 12 20:39:36 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 12 Sep 2007 11:39:36 -0700
Subject: [Python-Dev] Making directories and zip files executable
In-Reply-To: <46E7AF27.6090300@gmail.com>
References: <46E6A875.9020208@gmail.com>
	<ca471dc20709111048t4bfcf1e5xbc7483cffd5490db@mail.gmail.com>
	<46E7AF27.6090300@gmail.com>
Message-ID: <ca471dc20709121139t38bbe8bds50931661ba3a67d2@mail.gmail.com>

On 9/12/07, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Guido van Rossum wrote:
> > I could use a refresher on how PJE's patch solves Andy's problem.
>
> I'm not sure if you're asking about how you would execute a zip file
> after the patch has been applied, or about the mechanics of how the
> patch works. PJE's last post covered the former question, so I'll cover
> the gist of the latter.
>
> The patch works by passing the script argument to the import machinery
> to see if it is recognised as a valid sys.path entry (i.e. either a
> directory or a zip file in a default Python installation).

Ah, this is the crux! I didn't understand Phillip's wording of "an
importable path". I still didn't understand your wording "recognised
as a valid sys.path entry"; both wordings suggest a link between
sys.argv[0] and the current value of sys.path, which isn't the case --
it is whether they are recognized by the "meta import hook"! This only
became clear after I re-read the patch with Phillip's and your words
in the back of my head.

I now like and approve of the patch, and said so on the tracker.

--Guido

> If it is,
> then add that location to the front of sys.path and use the -m switch
> support to execute the "__main__" module directly.
>
> If the filename passed in isn't recognised as a sys.path entry, then it
> is executed as a script as normal.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> ---------------------------------------------------------------
>              http://www.boredomandlaziness.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mhammond at skippinet.com.au  Thu Sep 13 01:22:48 2007
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Thu, 13 Sep 2007 09:22:48 +1000
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <01bf01c7f593$d7beecc0$873cc640$@com.au>

> I can't figure out how to build a Windows package for ssl-1.1.tar.gz,
> and probably don't have the tools to do it anyway.  I presume that
> both a Windows machine and Visual Studio (because there's a C
> extension) is required?
> 
> Anyone out there who's interested in the challenge?
> 
> It's at http://www.parc.com/janssen/transient/ssl-1.1.tar.gz.
> 

I had a bit of a look at this.  I think I managed to get it building:
 
* find_ssl() is a long way from working on Windows.  Python itself uses magic
to locate an SSL directory in the main Python directory's parent.  On my
system, this is c:\src\openssl-0.9.7e, but obviously that could be almost
anywhere, and with almost any name.  See PCBuild\build_ssl.py and
PCBuild\_ssl.mak for the gory details.  I'm not sure how you would like to
approach this (insist on an environment variable for the top-level SSL dir
name?), but in the meantime I hacked find_ssl() to:

    ssl_incs = [r"\src\openssl-0.9.7e\inc32",]
    ssl_libs = [r"\src\openssl-0.9.7e\out32"]
    return ssl_incs, ssl_libs, ["libeay32", "ssleay32", "gdi32", "wsock32"] 

* The call to find_ssl() appears to discard the 3rd param:

ssl_incs, ssl_libs, libs = find_ssl()
...
      ext_modules=[Extension('ssl._ssl2', ['ssl/_ssl2.c'],
                             include_dirs = ssl_incs + [socket_inc],
                             library_dirs = ssl_libs,
                             libraries = ['ssl', 'crypto'],
                             depends = ['ssl/socketmodule.h'])],

The 'libraries =' line probably means to pass 'libs' rather than the
literal.

* The "depends = ['ssl/socketmodule.h']" fails for me - no header of that
name exists in the ssl directory in your archive.
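
One way the environment-variable idea from the first point could look, as a
sketch only: the variable name OPENSSL_DIR and the function name are
invented here, and the inc32/out32 layout and library names are simply the
ones from the hack above.

```python
import os


def find_ssl_windows(environ=os.environ, var="OPENSSL_DIR"):
    """Locate OpenSSL headers and libraries from an environment variable.

    `var` is a hypothetical variable naming the top-level OpenSSL source
    directory, for example c:\\src\\openssl-0.9.7e.
    """
    top = environ.get(var)
    if not top:
        raise RuntimeError("please set %s to the OpenSSL source directory" % var)
    ssl_incs = [os.path.join(top, "inc32")]
    ssl_libs = [os.path.join(top, "out32")]
    libs = ["libeay32", "ssleay32", "gdi32", "wsock32"]
    return ssl_incs, ssl_libs, libs
```

The third return value is what would then be passed on, i.e.
"libraries = libs" in the Extension call rather than the literal list.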

After those changes I was able to get it built and tested:
"""
Ran 15 tests in 3.157s

OK
"""

Hope this helps,

Mark


From janssen at parc.com  Thu Sep 13 03:59:54 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 12 Sep 2007 18:59:54 PDT
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <01bf01c7f593$d7beecc0$873cc640$@com.au> 
References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com>
	<01bf01c7f593$d7beecc0$873cc640$@com.au>
Message-ID: <07Sep12.190001pdt."57996"@synergy1.parc.xerox.com>

Thanks, Mark (and David, who replied to me personally).  I'll update
the setup.py files with your suggestions and do a 1.2 (with more
metadata in it, too).  Looks like the functionality is working for
people, even if the build is still a bit flakey.

Bill

From janssen at parc.com  Thu Sep 13 04:57:17 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 12 Sep 2007 19:57:17 PDT
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <01bf01c7f593$d7beecc0$873cc640$@com.au> 
References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com>
	<01bf01c7f593$d7beecc0$873cc640$@com.au>
Message-ID: <07Sep12.195721pdt."57996"@synergy1.parc.xerox.com>

> * find_ssl() is a long way from working on Windows.  Python itself uses magic
> to locate an SSL directory in the main Python directory's parent.  On my
> system, this is c:\src\openssl-0.9.7e, but obviously that could be almost
> anywhere, and with almost any name.  See PCBuild\build_ssl.py and
> PCBuild\_ssl.mak for the gory details.  I'm not sure how you would like to
> approach this (insist on an environment variable for the top-level SSL dir
> name?)

Can't we look in the registry for this?  We have a working Python;
perhaps we can just use a Windows-specific registry lookup to find
OpenSSL?  (I'm just blue-skying here; I have no clue how things work
on Windows.)

Bill

From mhammond at skippinet.com.au  Thu Sep 13 05:03:50 2007
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Thu, 13 Sep 2007 13:03:50 +1000
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <07Sep12.195721pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com>
	<01bf01c7f593$d7beecc0$873cc640$@com.au>
	<07Sep12.195721pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <01da01c7f5b2$cadca4b0$6095ee10$@com.au>

> > * find_ssl() is a long way from working on Windows.  Python itself uses magic
> > to locate an SSL directory in the main Python directory's parent.  On my
> > system, this is c:\src\openssl-0.9.7e, but obviously that could be almost
> > anywhere, and with almost any name.  See PCBuild\build_ssl.py and
> > PCBuild\_ssl.mak for the gory details.  I'm not sure how you would like to
> > approach this (insist on an environment variable for the top-level SSL dir
> > name?)
> 
> Can't we look in the registry for this?  We have a working Python;
> perhaps we can just use a Windows-specific registry lookup to find
> OpenSSL?  (I'm just blue-skying here; I have no clue how things work
> on Windows.)

Not really.  Python itself, when building _ssl, doesn't look for a binary
install of openssl, but instead a source directory and a working perl
interpreter so an openssl can be built suitable for linking with Python.
This source directory is just downloaded and unzipped - no registration
takes place, and any binaries that may be built are ignored (we just want
the .h and .lib files)

It might be possible to try and use build_ssl.py to locate the openssl
directory, but this will still require that someone building it has Python
built from source - I'm fairly sure that someone installing a Python binary
will not have build_ssl.py, nor are they likely to have a suitable openssl
directory or installation just "hanging around" either.

Mark


From janssen at parc.com  Thu Sep 13 05:27:58 2007
From: janssen at parc.com (Bill Janssen)
Date: Wed, 12 Sep 2007 20:27:58 PDT
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <01da01c7f5b2$cadca4b0$6095ee10$@com.au> 
References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com>
	<01bf01c7f593$d7beecc0$873cc640$@com.au>
	<07Sep12.195721pdt."57996"@synergy1.parc.xerox.com>
	<01da01c7f5b2$cadca4b0$6095ee10$@com.au>
Message-ID: <07Sep12.202807pdt."57996"@synergy1.parc.xerox.com>

> > Can't we look in the registry for this?  We have a working Python;
> > perhaps we can just use a Windows-specific registry lookup to find
> > OpenSSL?  (I'm just blue-skying here; I have no clue how things work
> > on Windows.)
> 
> Not really.  Python itself, when building _ssl, doesn't look for a binary
> install of openssl, but instead a source directory and a working perl
> interpreter so an openssl can be built suitable for linking with Python.
> This source directory is just downloaded and unzipped - no registration
> takes place, and any binaries that may be built are ignored (we just want
> the .h and .lib files)

In that case, I think your idea of just hard-coding a path is probably
the right thing to do.  I'll add a note that this is how you need to do
it if you are going to try "python setup.py build".  Presumably the
binary then built with "python setup.py bdist" will install on a Windows
machine regardless of where OpenSSL is installed?

Bill

From db3l.net at gmail.com  Thu Sep 13 05:41:57 2007
From: db3l.net at gmail.com (David Bolen)
Date: Wed, 12 Sep 2007 23:41:57 -0400
Subject: [Python-Dev] Windows package for new SSL package?
References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com>
	<01bf01c7f593$d7beecc0$873cc640$@com.au>
	<07Sep12.195721pdt."57996"@synergy1.parc.xerox.com>
	<01da01c7f5b2$cadca4b0$6095ee10$@com.au>
Message-ID: <m2veafp6yi.fsf@valheru.db3l.homeip.net>

"Mark Hammond" <mhammond at skippinet.com.au> writes:

> It might be possible to try and use build_ssl.py to locate the openssl
> directory, but this will still require that someone building it has Python
> built from source - I'm fairly sure that someone installing a Python binary
> will not have build_ssl.py, nor are they likely to have a suitable openssl
> directory or installation just "hanging around" either.

Yep - even if a Windows user has an appropriate development
environment in general (and can build most standalone extensions with
just a binary Python install), as you say the odds are pretty small
they'd have an OpenSSL source tree around, with libraries built.

At the same time, I suspect that only a small percentage of Windows
users will want to rebuild the extension - rather they'll just want a
binary installer, something not uncommon to be published for Windows
users of many extension modules.  So that pushes the problem upstream
a bit where having a Python development tree might be more common or
familiar.

Rather than a lot of complexity to cater to that small percentage, I'd
probably just make setup.py need an explicit configuration - editing,
or perhaps environment variable - for the location of the root of the
OpenSSL source tree.  As you say, there's no guaranteed way to find it
otherwise, although I suppose it might try checking relative to the
Python executable (along the same lines as build_ssl.py) in case it's
being built from within the source tree.
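
A minimal sketch of the explicit-configuration approach described above.
It assumes a hypothetical OPENSSL_ROOT environment variable and a
hypothetical helper named find_openssl_root(); neither is part of the
package's actual setup.py.  The fallback guesses that python.exe lives in
<tree>/PCbuild and that the OpenSSL source directory sits next to the
Python source tree, as in the c:\src\openssl-0.9.7e example earlier in
this thread.

```python
import os
import sys


def find_openssl_root(environ=None, executable=None):
    """Return the OpenSSL source root, or None if it cannot be determined."""
    environ = os.environ if environ is None else environ
    executable = sys.executable if executable is None else executable

    # 1. Explicit configuration wins.  OPENSSL_ROOT is a hypothetical name.
    root = environ.get("OPENSSL_ROOT")
    if root:
        return root if os.path.isdir(root) else None

    # 2. Fallback: assume <tree>/PCbuild/python.exe and look for an
    #    "openssl-*" directory in the parent of <tree>.
    tree = os.path.dirname(os.path.dirname(os.path.abspath(executable)))
    parent = os.path.dirname(tree)
    try:
        names = sorted(os.listdir(parent))
    except OSError:
        return None
    candidates = [
        os.path.join(parent, name)
        for name in names
        if name.lower().startswith("openssl-")
        and os.path.isdir(os.path.join(parent, name))
    ]
    # Crude guess: take the lexicographically last directory name.
    return candidates[-1] if candidates else None
```

A setup.py using something like this would presumably stop with a clear
error message when the function returns None, rather than guessing
further.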

Adding a comment noting that following the instructions to build Python
from source (or at least the standard _ssl module) will yield just such a
tree should be simple enough as a reference for those who need it.

The setup.py does also need to understand the different library names
(and required system libraries) to build properly under Windows, as
you've already highlighted, but that should be relatively easy to vary
by platform.
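
As a rough illustration of varying the link settings by platform, here is
a small sketch.  The function name openssl_link_settings() is
hypothetical, and the library names are assumptions based on the usual
naming of OpenSSL 0.9.x builds (ssleay32/libeay32 on Windows, ssl/crypto
elsewhere); the list of Windows system libraries in particular should be
checked against the actual build.

```python
import sys


def openssl_link_settings(platform=None):
    """Return assumed library names for linking against OpenSSL."""
    platform = sys.platform if platform is None else platform
    if platform == "win32":
        return {
            # Assumed names for a native Windows OpenSSL 0.9.x build.
            "libraries": ["ssleay32", "libeay32"],
            # Assumed system libraries needed by a static OpenSSL link.
            "system_libraries": ["ws2_32", "gdi32", "advapi32", "user32"],
        }
    # Typical Unix naming: libssl and libcrypto.
    return {"libraries": ["ssl", "crypto"], "system_libraries": []}
```

The two lists could then be concatenated and passed as the "libraries"
argument of the extension definition in setup.py.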

-- David


From aahz at pythoncraft.com  Thu Sep 13 06:26:06 2007
From: aahz at pythoncraft.com (Aahz)
Date: Wed, 12 Sep 2007 21:26:06 -0700
Subject: [Python-Dev] SSL certs
In-Reply-To: <07Sep12.111225pdt."57996"@synergy1.parc.xerox.com>
References: <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<07Sep6.102547pdt."57996"@synergy1.parc.xerox.com>
	<20070912013412.GB14034@panix.com>
	<07Sep12.111225pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <20070913042606.GB27547@panix.com>

On Wed, Sep 12, 2007, Bill Janssen wrote:
>
>>> By the way, I think the hostname matching provisions of 2818 (which
>>> is, after all, only an informational RFC, not a standard) are poorly
>>> thought out.  Many machines have more hostnames than you can shake a
>>> stick at, and often provide certs with the wrong hostname in them
>>> (usually because they have no way to determine what the *right*
>>> hostname is, from inside that machine).
>> 
>> ...which is why you pretty much need to have a canonical hostname mapped
>> to each IP you're using on a machine.  Basically, you need to map the
>> hostname you intend to use to an IP, then do reverse-DNS to find out
>> whether the hostname is in fact the canonical hostname.  If not, you're
>> using the wrong hostname on your cert.
> 
> Yep.  The problem is having a particular service know which
> certificate it should choose to use, and also to know when the network
> connectivity has changed.  Usually, server ports are bound to wildcard
> IP addresses, so that they can still be reached even if the network
> connectivity changes (particularly true for servers running on
> laptops, or the Python server I'm running on my iPhone).  The server
> has no way of knowing which IP address the client knows it as, and no
> way of knowing which of its multiple certificates to present, so that
> the name in the cert will match the name the client thought it was
> using.

My understanding is that the client tells the server which hostname it
wants to use; the server should then pass down that information.  That's
how virtual hosting works in the first place.  The only difference with
SSL is that the hostname must have a unique IP address, so that when the
client does a reverse DNS to validate the IP address presented by the SSL
certificate, it all comes together correctly.

There are, of course, wildcard certs; I don't understand how those work.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer
http://www.lysator.liu.se/c/ten-commandments.html

From surekap at gmail.com  Thu Sep 13 06:30:43 2007
From: surekap at gmail.com (Prateek Sureka)
Date: Thu, 13 Sep 2007 10:00:43 +0530
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <18152.2055.258930.576257@montanaro.dyndns.org>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
Message-ID: <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>

I was reading GvR's post on this and came up with a theory on how to  
tackle the problem.

I ended up putting it in a blog post.

http://www.brainwavelive.com/blog/index.php?/archives/12-Suggestion-for-removing-the-Python-Global-Interpreter-Lock.html
What do you think?

Prateek

On Sep 12, 2007, at 9:08 PM, skip at pobox.com wrote:

>
>     Martin> Now we are getting into details: you do NOT have to lock an
>     Martin> object to modify its reference count. An atomic
>     Martin> increment/decrement operation is enough.
>
> Implemented in asm I suspect?  For common CPUs this could just be part of
> the normal Python distribution.  For uncommon ones this could use a lock
> until someone gets around to writing the necessary couple lines of
> assembler.
>
> Skip

From martin at v.loewis.de  Thu Sep 13 06:42:18 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 13 Sep 2007 06:42:18 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
	<741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
Message-ID: <46E8BFAA.5090008@v.loewis.de>

> What do you think?

I think what you are describing is the situation of today,
except in a less-performant way. The kernel *already*
implements such a "synchronization server", except that
all CPUs can act as such. You write

"Since we are guaranteeing that synchronized code is running on a single
core, it is the equivalent of a lock at the cost of a context switch."

This is precisely what a lock costs today: a context switch.

Since the Python interpreter is synchronized all of the time, it
would completely run on the synchronization server all of the
time. As you identify, that single CPU might get overloaded, so
your scheme would give no benefits (since Python code could never
run in parallel), and only disadvantages (since multiple Python
interpreters today can run on multiple CPUs, but could not
anymore under your scheme).

Regards,
Martin

From db3l.net at gmail.com  Thu Sep 13 06:43:44 2007
From: db3l.net at gmail.com (David Bolen)
Date: Thu, 13 Sep 2007 00:43:44 -0400
Subject: [Python-Dev] Windows package for new SSL package?
References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com>
	<01bf01c7f593$d7beecc0$873cc640$@com.au>
	<07Sep12.195721pdt."57996"@synergy1.parc.xerox.com>
	<01da01c7f5b2$cadca4b0$6095ee10$@com.au>
	<07Sep12.202807pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <m2r6l3p43j.fsf@valheru.db3l.homeip.net>

Bill Janssen <janssen at parc.com> writes:

> In that case, I think your idea of just hard-coding a path is probably
> the right thing to do.  I'll add a note that this is how you need to do
> it if you are going to try "python setup.py build".  Presumably the
> binary then built with "python setup.py bdist" will install on a Windows
> machine regardless of where OpenSSL is installed?

Yes (though typically bdist_wininst for the Windows installer), but
perhaps not for the reason you think.

I think the small disconnect here is probably that there
really isn't an OpenSSL "installed" on the end user's machine.  Well,
there could be, but Python isn't using it.  The OpenSSL library is
statically linked as part of the _ssl.pyd module, as it will be with
your _ssl2.pyd module.  (That's also why there is no OpenSSL to "find"
in your setup even with Python installed - at least not any libraries
you can use).

In other words, both the standard and your extension module on Windows
bring along their own OpenSSL.

-- David


From tulloss2 at uiuc.edu  Thu Sep 13 09:08:35 2007
From: tulloss2 at uiuc.edu (Justin Tulloss)
Date: Thu, 13 Sep 2007 02:08:35 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
	<741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
Message-ID: <2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com>

> What do you think?
>

I'm going to have to agree with Martin here, although I'm not sure I
understand what you're saying entirely. Perhaps if you explained where the
benefits of this approach come from, it would clear up what you're thinking.


After a few days of thought, I'm starting to realize the difficulty of
maintaining compatibility with existing C extensions after removing the GIL.
The possible C-level side effects are very difficult to work around without
kernel or hardware level transaction support. I see a couple of approaches
that might work (though I probably haven't thought of everything).

1. Use message passing and transactions.
Put every module into its own tasklet that ends up getting owned by one
thread or another. Every call to an object that is owned by that module is
put into a module wide message queue and delivered sequentially to its
objects. All this does is serialize requests to objects implemented in C to
slightly mitigate the need to lock. Then use transactions to protect any
python object. You still have the problem of C side effects going unnoticed
(i.e. Thread A executes a function, the object sets C state in a certain way,
Thread B calls the same function and changes all the C state, then A reacts to
a return value that no longer reflects the actual state). So, this doesn't
actually work, but it's close, since python objects will remain consistent
w/transactions and conflicting C code won't execute simultaneously.

2. Do it Perl style.
Perl just spawns off multiple interpreters and doesn't share state between
them. This would require cleaning up what state belongs where, and probably
providing some global state lock free. For instance, all the numbers,
letters, and None are read only, so we could probably work out a way to
share them between threads. In fact, any python global could be read only
until it is written to. Then it belongs to the thread that wrote to it and
is updated in the other threads via some sort of cache-coherency protocol. I
haven't really wrapped my head around how C extensions would play with this
yet, but essentially code operating in different threads would be operating
on different copies of the modules. That seems fair to me.

3. Come up with an elegant way of handling multiple python processes. Of
course, this has some downsides. I don't really want to pickle python
objects around if I decide they need to be in another address space, which I
would probably occasionally need to do if I abstracted away the fact that a
bunch of interpreters had been spawned off.

4. Remove the GIL, use transactions for python objects, and adapt all
C-extensions to be thread safe. Woo.
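
The message-queue part of approach 1 can be sketched roughly as follows.
This is only an illustration under stated assumptions: ModuleMailbox is a
hypothetical name, the "module" is any set of plain Python callables, and
the transactional half of the idea is not modelled at all.  Every call is
placed on one queue and executed by a single worker thread, so calls into
the same module never overlap.

```python
import queue
import threading
from concurrent.futures import Future


class ModuleMailbox:
    """Run all calls aimed at one module sequentially on one thread."""

    def __init__(self):
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def call(self, func, *args, **kwargs):
        """Enqueue a call; the returned Future carries its result."""
        future = Future()
        self._queue.put((future, func, args, kwargs))
        return future

    def _run(self):
        # Deliver queued calls one at a time, in arrival order.
        while True:
            future, func, args, kwargs = self._queue.get()
            try:
                future.set_result(func(*args, **kwargs))
            except Exception as exc:
                future.set_exception(exc)
```

As noted in item 1, this only serializes individual calls.  A caller that
makes two calls in a row can still find that another thread's call ran in
between and changed the C-level state.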

I'll keep kicking around ideas for a while; hopefully they'll become more
refined as I explore the code more.

Justin

PS. A good paper on how hardware transactional memory could help us out:
http://www-faculty.cs.uiuc.edu/~zilles/papers/python_htm.dls2006.pdf
A few of you have probably read this already. Martin is even acknowledged,
but it was news to me!

From martin at v.loewis.de  Thu Sep 13 09:20:16 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 13 Sep 2007 09:20:16 +0200
Subject: [Python-Dev] SSL certs
In-Reply-To: <20070913042606.GB27547@panix.com>
References: <46DDCD7C.40004@v.loewis.de>
	<46DE3DB8.6000004@v.loewis.de>	<46DECFF6.4040107@v.loewis.de>
	<46DEF5FF.8040602@v.loewis.de>	<46DEFF3C.90306@v.loewis.de>
	<-1936579380892715012@unknownmsgid>	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>	<07Sep6.102547pdt."57996"@synergy1.parc.xerox.com>	<20070912013412.GB14034@panix.com>	<07Sep12.111225pdt."57996"@synergy1.parc.xerox.com>
	<20070913042606.GB27547@panix.com>
Message-ID: <46E8E4B0.60909@v.loewis.de>

> My understanding is that the client tells the server which hostname it
> wants to use; the server should then pass down that information.  That's
> how virtual hosting works in the first place.  The only difference with
> SSL is that the hostname must have a unique IP address, so that when the
> client does a reverse DNS to validate the IP address presented by the SSL
> certificate, it all comes together correctly.

Unfortunately, it does not quite work that way. The client tells the
server what hostname to use (in the Host: header) only *after* the SSL
connection has been established and certificates have been exchanged.
So the Host: header cannot be used for selecting what certificate to
present to the client.

*That* is the reason why people typically assume they have to have
different IP addresses for different SSL hosts: certificate selection
must be done based on IP address (which is already known before
the SSL handshaking starts). There is no need for the client to do
a reverse name lookup, and indeed, the client should *not* do a
reverse DNS lookup to check the server's identity. Instead, it should
check the host name it wants to talk to against the certificate.

However, there is an alternative to using multiple IP addresses:
one could also use multiple "subject alternative names", and create
a certificate that lists them all.

> There are, of course, wildcard certs; I don't understand how those work.

The same way: the client does *not* perform a reverse name lookup.
Instead, it just matches the hostname against the name in the
certificate; if the certificate is for *.python.org (say) and the
client wants to talk to pypi.python.org, it matches, and hostname
verification passes. It would also pass if the client wanted to
talk to cheeseshop.python.org, or wiki.python.org (which all have
the same IP address).
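
A minimal sketch of that matching step, in Python.  The function names
are hypothetical, and the rules are deliberately simplified: a wildcard
is only recognised as the entire leftmost label and covers exactly one
label.  Real implementations follow more detailed rules.

```python
def hostname_matches(hostname, cert_name):
    """Compare the name the client wants against one name in the cert."""
    host = hostname.lower().rstrip(".").split(".")
    pattern = cert_name.lower().rstrip(".").split(".")
    if len(host) != len(pattern):
        return False
    if pattern[0] == "*":
        # The wildcard stands in for exactly one non-empty leftmost label.
        return host[0] != "" and host[1:] == pattern[1:]
    return host == pattern


def certificate_matches(hostname, cert_names):
    """True if any name listed in the certificate matches the hostname."""
    return any(hostname_matches(hostname, name) for name in cert_names)
```

Note that no DNS lookup of any kind is involved: the only inputs are the
hostname the client intended to reach and the names in the certificate.
The second function covers the "subject alternative names" case, where
one certificate lists several names.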

Regards,
Martin

From lists at cheimes.de  Thu Sep 13 12:11:21 2007
From: lists at cheimes.de (Christian Heimes)
Date: Thu, 13 Sep 2007 12:11:21 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <20070911192035.94CE33A40D7@sparrow.telecommunity.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>	<46E652CD.1070901@v.loewis.de>	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
Message-ID: <fcb2cc$65i$1@sea.gmane.org>

Phillip J. Eby wrote:
> It's not just caches and counters.  It's also every built-in type 
> structure, builtin module, builtin function...  any Python object 
> that's a built-in, period.  That includes things like None, True, and False.
> 
> Caches would include such things as the pre-created integers -100 
> through 255, the 1-byte character strings for chr(0)-chr(255), and 
> the interned strings cache, to name a few.
> 
> Most of these things I've mentioned are truly global, and not 
> specific to an individual interpreter.

Pardon my ignorance but why does Python do reference counting for truly
global and static objects like None, True, False, small and cached
integers, sys and other builtins? If I understand it correctly these
objects are never garbage collected (at least they shouldn't be) until the
interpreter exits. Wouldn't it decrease the overhead and increase speed
when Py_INCREF and Py_DECREF are NOOPs for static and immutable objects?

Christian


From nd at perlig.de  Thu Sep 13 12:19:21 2007
From: nd at perlig.de (=?iso-8859-1?q?Andr=E9_Malo?=)
Date: Thu, 13 Sep 2007 12:19:21 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <fcb2cc$65i$1@sea.gmane.org>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org>
Message-ID: <200709131219.21152.nd@perlig.de>

* Christian Heimes wrote: 

> Pardon my ignorance but why does Python do reference counting for truly
> global and static objects like None, True, False, small and cached
> integers, sys and other builtins? If I understand it correctly these
> objects are never garbage collected (at least they shouldn't be) until the
> interpreter exits. Wouldn't it decrease the overhead and increase speed
> when Py_INCREF and Py_DECREF are NOOPs for static and immutable objects?

Checking what kind of object you have takes time, too. Right now, just
counting up or down is most likely faster than that check on every refcount
operation.

nd

From p.f.moore at gmail.com  Thu Sep 13 12:58:44 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 13 Sep 2007 11:58:44 +0100
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <m2veafp6yi.fsf@valheru.db3l.homeip.net>
References: <01bf01c7f593$d7beecc0$873cc640$@com.au>
	<01da01c7f5b2$cadca4b0$6095ee10$@com.au>
	<m2veafp6yi.fsf@valheru.db3l.homeip.net>
Message-ID: <79990c6b0709130358w54da4bag516d10ef301ed16a@mail.gmail.com>

On 13/09/2007, David Bolen <db3l.net at gmail.com> wrote:
> "Mark Hammond" <mhammond at skippinet.com.au> writes:
>
> > It might be possible to try and use build_ssl.py to locate the openssl
> > directory, but this will still require that someone building it has Python
> > built from source - I'm fairly sure that someone installing a Python binary
> > will not have build_ssl.py, nor are they likely to have a suitable openssl
> > directory or installation just "hanging around" either.
>
> Yep - even if a Windows user has an appropriate development
> environment in general (and can build most standalone extensions with
> just a binary Python install), as you say the odds are pretty small
> they'd have an OpenSSL source tree around, with libraries built.

It is possible to build extensions on Windows using the mingw gcc
toolchain. Users doing this may well have some or all of the gnuwin32
(http://gnuwin32.sf.net) utilities installed. Gnuwin32 includes
openssl (headers, link libraries, and DLLs).

It seems to me a perfectly reasonable option for someone wanting to
build the SSL extension to grab mingw and gnuwin32 openssl.

I tried building with this config last night, but didn't have the time
to deal with hacking the setup.py - I see someone else has covered
this. I'll have another go with the new version when I get a chance.

> At the same time, I suspect that only a small percentage of Windows
> users will want to rebuild the extension - rather they'll just want a
> binary installer, something not uncommon to be published for Windows
> users of many extension modules.  So that pushes the problem upstream
> a bit where having a Python development tree might be more common or
> familiar.

Agreed. I assume Windows binary builds will be published, so it's only
early adopters, or people who want to work with their own builds for
some other reason, who might care.

Paul.

From p.f.moore at gmail.com  Thu Sep 13 13:02:55 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 13 Sep 2007 12:02:55 +0100
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <m2r6l3p43j.fsf@valheru.db3l.homeip.net>
References: <01bf01c7f593$d7beecc0$873cc640$@com.au>
	<01da01c7f5b2$cadca4b0$6095ee10$@com.au>
	<m2r6l3p43j.fsf@valheru.db3l.homeip.net>
Message-ID: <79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com>

On 13/09/2007, David Bolen <db3l.net at gmail.com> wrote:
> Bill Janssen <janssen at parc.com> writes:
>
> > In that case, I think your idea of just hard-coding a path is probably
> > the right thing to do.  I'll add a note that this is how you need to do
> > it if you are going to try "python setup.py build".  Presumably the
> > binary then built with "python setup.py bdist" will install on a Windows
> > machine regardless of where OpenSSL is installed?
>
> Yes (though typically bdist_wininst for the Windows installer), but
> perhaps not for the reason you think.
>
> I think the small disconnect here is probably that there
> really isn't an OpenSSL "installed" on the end user's machine.  Well,
> there could be, but Python isn't using it.  The OpenSSL library is
> statically linked as part of the _ssl.pyd module, as it will be with
> your _ssl2.pyd module.  (That's also why there is no OpenSSL to "find"
> in your setup even with Python installed - at least not any libraries
> you can use).

That's not 100% true, is it? If I use mingw and Gnuwin32 openssl, I
believe the default is a dynamic link of openssl (it depends on the
import library used, and gnuwin32 supplies dynamic libs by default).
So the openssl DLLs need to be on the user's PATH for the extension
module to work.

> In other words, both the standard and your extension module on Windows
> bring along their own OpenSSL.

For the extension, you may need to (1) document that the user needs to
have the openssl DLLs on their PATH and possibly (1a) supply a zipfile
with the necessary DLLs as a supplemental download, or (2) arrange for
the openssl DLLs to be included in the extension installer, and
installed alongside the .pyd file.

Alternatively, it *may* be possible with setup.py magic to force a
static openssl link (but would that need hard coding for the gnuwin32
naming conventions?)

Paul.

From jon+python-dev at unequivocal.co.uk  Thu Sep 13 12:55:27 2007
From: jon+python-dev at unequivocal.co.uk (Jon Ribbens)
Date: Thu, 13 Sep 2007 11:55:27 +0100
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <200709131219.21152.nd@perlig.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
Message-ID: <20070913105527.GH32061@snowy.squish.net>

On Thu, Sep 13, 2007 at 12:19:21PM +0200, André Malo wrote:
> > Pardon my ignorance but why does Python do reference counting for truly
> > global and static objects like None, True, False, small and cached
> > integers, sys and other builtins? If I understand it correctly these
> > objects are never garbage collected (at least they shouldn't be) until the
> > interpreter exits. Wouldn't it decrease the overhead and increase speed
> > when Py_INCREF and Py_DECREF are NOOPs for static and immutable objects?
> 
> Checking what kind of object you have takes time, too. Right now, just
> counting up or down is most likely faster than that check on every refcount
> operation.

To put it another way, would it actually matter if the reference
counts for such objects became hopelessly wrong due to non-atomic
adjustments?

From db3l.net at gmail.com  Thu Sep 13 13:14:26 2007
From: db3l.net at gmail.com (David Bolen)
Date: Thu, 13 Sep 2007 07:14:26 -0400
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com>
References: <01bf01c7f593$d7beecc0$873cc640$@com.au>
	<01da01c7f5b2$cadca4b0$6095ee10$@com.au>
	<m2r6l3p43j.fsf@valheru.db3l.homeip.net>
	<79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com>
Message-ID: <9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com>

On 9/13/07, Paul Moore <p.f.moore at gmail.com> wrote:
> On 13/09/2007, David Bolen <db3l.net at gmail.com> wrote:

> > I think the small disconnect here is probably that there
> > really isn't an OpenSSL "installed" on the end user's machine.  Well,
> > there could be, but Python isn't using it.  The OpenSSL library is
> > statically linked as part of the _ssl.pyd module, as it will be with
> > your _ssl2.pyd module.  (That's also why there is no OpenSSL to "find"
> > in your setup even with Python installed - at least not any libraries
> > you can use).
>
> That's not 100% true, is it? If I use mingw and Gnuwin32 openssl, I
> believe the default is a dynamic link of openssl (it depends on the
> import library used, and gnuwin32 supplies dynamic libs by default).
> So the openssl DLLs need to be on the user's PATH for the extension
> module to work.

That's a fair point - my comments are all related to the standard
Python distribution and building extensions with the VS.NET compiler
(including the binary installer I had built for Bill).

> For the extension, you may need to (1) document that the user needs to
> have the openssl DLLs on their PATH and possibly (1a) supply a zipfile
> with the necessary DLLs as a supplemental download, or (2) arrange for
> the openssl DLLs to be included in the extension installer, and
> installed alongside the .pyd file.

If we're talking about the construction of binary Windows installers,
I'd just suggest that they get built as the built-in SSL module does,
including the static linking with a pure Windows OpenSSL build, which
is a bit simpler for the typical end user and has no other external
requirements.

Of course, that certainly doesn't stop a person who has already set up
their system for using mingw for extensions from doing their own
compilation, although it does raise a question as to whether the
setup.py would need some further adjustments to cover that case most
cleanly.

-- David

From martin at v.loewis.de  Thu Sep 13 13:15:39 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 13 Sep 2007 13:15:39 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <20070913105527.GH32061@snowy.squish.net>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>	<fcb2cc$65i$1@sea.gmane.org>
	<200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
Message-ID: <46E91BDB.7070601@v.loewis.de>

> To put it another way, would it actually matter if the reference
> counts for such objects became hopelessly wrong due to non-atomic
> adjustments?

If they drop to zero (which may happen due to non-atomic adjustments),
Python will try to release the static memory, which will crash the
malloc implementation.

Regards,
Martin

From jon+python-dev at unequivocal.co.uk  Thu Sep 13 13:55:38 2007
From: jon+python-dev at unequivocal.co.uk (Jon Ribbens)
Date: Thu, 13 Sep 2007 12:55:38 +0100
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E91BDB.7070601@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<46E91BDB.7070601@v.loewis.de>
Message-ID: <20070913115538.GJ32061@snowy.squish.net>

On Thu, Sep 13, 2007 at 01:15:39PM +0200, "Martin v. Löwis" wrote:
> > To put it another way, would it actually matter if the reference
> > counts for such objects became hopelessly wrong due to non-atomic
> > adjustments?
> 
> If they drop to zero (which may happen due to non-atomic adjustments),
> Python will try to release the static memory, which will crash the
> malloc implementation.

That could be avoided by a flag on the object which is checked in
free(). I'm just suggesting it as an alternative as it sounds like
it might be more efficient than either locking or avoiding having
reference counts on these objects (especially if the reference count
is initialised to MAX_INT/2 or whatever).

From p.f.moore at gmail.com  Thu Sep 13 14:21:47 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 13 Sep 2007 13:21:47 +0100
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com>
References: <01bf01c7f593$d7beecc0$873cc640$@com.au>
	<01da01c7f5b2$cadca4b0$6095ee10$@com.au>
	<m2r6l3p43j.fsf@valheru.db3l.homeip.net>
	<79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com>
	<9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com>
Message-ID: <79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com>

On 13/09/2007, David Bolen <db3l.net at gmail.com> wrote:
> That's a fair point - my comments are all related to the standard
> Python distribution and building extensions with the VS.NET compiler
> (including the binary installer I had built for Bill).
[...]
> If we're talking about the construction of binary Windows installers,
> I'd just suggest that they get built as the built-in SSL module does,
> including the static linking with a pure Windows OpenSSL build, which
> is a bit simpler for the typical end user and has no other external
> requirements.

OK. Building with mingw is a bit of a hobby horse of mine, as the
requirement for the (expensive) VS.NET compiler forces many users to
rely on binary builds. I know other alternatives can be made to work,
but it's often too much pain to bother. I'd much rather extensions
which *can* be built using free tools, support actually doing so out
of the box.

> Of course, that certainly doesn't stop a person who has already set up
> their system for using mingw for extensions from doing their own
> compilation, although it does raise a question as to whether the
> setup.py would need some further adjustments to cover that case most
> cleanly.

And that's my point. I'd rather work to ensure that mingw works out of
the box, than leave things requiring VS.NET for a clean build. It's
not relevant here, but I've certainly been in a situation with other
extensions where I can't upgrade a Python install because the
distributors of a particular extension haven't produced a build for
the new version yet (*cough* mod_python *cough*) - and I live in fear
of support for some extensions dying, as I'll then have to move away
from them or stop upgrading Python.

Anyway, philosophy aside, I'll try to make some time in the next few
days to get a working setup.py for the SSL package using mingw.
Hopefully, Bill will then integrate this and we'll have mingw as a
supported option.

Paul.

From skip at pobox.com  Thu Sep 13 15:26:50 2007
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 13 Sep 2007 08:26:50 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <20070913105527.GH32061@snowy.squish.net>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
Message-ID: <18153.15002.76898.448843@montanaro.dyndns.org>


    Jon> To put it another way, would it actually matter if the reference
    Jon> counts for such objects became hopelessly wrong due to non-atomic
    Jon> adjustments?

I believe this was suggested and tried by someone (within the last few
years).  It wasn't any benefit.  The costs of special-casing outweighed the
costs of uniform reference counting, not to mention the code got more
complex.  Or something like that.  Anyway, it didn't work.

Just thinking out loud here, what if ... we use atomic test-and-set to
handle reference counting (with a lock for those CPU architectures where we
haven't written the necessary assembler fragment), then implement a lock for
each mutable type and another for global state (thread state, interpreter
state, etc)?  Might that be close enough to free threading to provide some
benefits, but not so fine-grained that lock contention becomes a bottleneck?
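
To make the granularity concrete, here is a toy model in Python (not
interpreter code, just a sketch of the idea): one lock per mutable type
plus one lock for global state. The names TYPE_LOCKS, GLOBAL_STATE_LOCK
and locked_for are made up for illustration.

```python
import threading
from contextlib import contextmanager

# Hypothetical layout: one lock per mutable built-in type,
# plus a single lock for everything that counts as global state.
TYPE_LOCKS = {
    list: threading.Lock(),
    dict: threading.Lock(),
    set: threading.Lock(),
}
GLOBAL_STATE_LOCK = threading.Lock()


@contextmanager
def locked_for(obj):
    """Hold the lock guarding obj's type; fall back to the global-state lock."""
    lock = TYPE_LOCKS.get(type(obj), GLOBAL_STATE_LOCK)
    with lock:
        yield obj


def append_many(shared, n):
    for i in range(n):
        # Every list mutation in the process serializes on the one list lock.
        with locked_for(shared):
            shared.append(i)


shared_list = []
threads = [threading.Thread(target=append_many, args=(shared_list, 1000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Note that in this model two threads working on completely unrelated
lists still contend for the same lock, which is exactly the trade-off
the question above is about.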

Skip

From hrvoje.niksic at avl.com  Thu Sep 13 17:13:11 2007
From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=)
Date: Thu, 13 Sep 2007 17:13:11 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E91BDB.7070601@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<46E91BDB.7070601@v.loewis.de>
Message-ID: <1189696391.11322.275.camel@localhost>

On Thu, 2007-09-13 at 13:15 +0200, "Martin v. Löwis" wrote:
> > To put it another way, would it actually matter if the reference
> > counts for such objects became hopelessly wrong due to non-atomic
> > adjustments?
> 
> If they drop to zero (which may happen due to non-atomic adjustments),
> Python will try to release the static memory, which will crash the
> malloc implementation.

More precisely, Python will call the deallocator appropriate for the
object type.  If that deallocator does nothing, the object continues to
live.  Such objects could also start out with a refcount of sys.maxint
or so to ensure that calls to the no-op deallocator are unlikely.
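
A toy model of that behaviour, written in Python rather than C and not
reflecting CPython's real object layout (ToyObject, noop_dealloc and
friends are invented names; sys.maxsize stands in for sys.maxint on
current Pythons):

```python
import sys


class ToyObject:
    """Toy stand-in for a refcounted object."""

    def __init__(self, name, dealloc, refcount=1):
        self.name = name
        self.dealloc = dealloc      # deallocator chosen by the object's type
        self.refcount = refcount
        self.alive = True


def free_heap_object(obj):
    obj.alive = False               # heap objects really go away


def noop_dealloc(obj):
    pass                            # static objects ignore the request


def decref(obj):
    obj.refcount -= 1
    if obj.refcount == 0:
        obj.dealloc(obj)            # dispatch on the type's deallocator


heap_obj = ToyObject("heap", free_heap_object)
static_obj = ToyObject("static", noop_dealloc, refcount=sys.maxsize // 2)

decref(heap_obj)                    # count reaches zero: object is freed

static_obj.refcount = 1             # pretend racy updates corrupted the count
decref(static_obj)                  # count reaches zero: no-op, object survives
```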

The part I don't understand is how Python would know which objects are
global/static.  Testing for such a thing sounds like something that
would be slower than atomic incref/decref.



From surekap at gmail.com  Thu Sep 13 17:30:47 2007
From: surekap at gmail.com (Prateek Sureka)
Date: Thu, 13 Sep 2007 21:00:47 +0530
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E8BFAA.5090008@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
	<741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
	<46E8BFAA.5090008@v.loewis.de>
Message-ID: <EAA274DE-E0BF-4775-89C6-8763E9AECD66@gmail.com>


On Sep 13, 2007, at 10:12 AM, Martin v. Löwis wrote:

>> What do you think?
>
> I think what you are describing is the situation of today,
> except in a less-performant way. The kernel *already*
> implements such a "synchronization server", except that
> all CPUs can act as such. You write
>
> "Since we are guaranteeing that synchronized code is running on a  
> single
> core, it is the equivalent of a lock at the cost of a context switch."
>
> This is precisely what a lock costs today: a context switch.
>

Really? Wouldn't we save some memory allocation overhead (since in my
design, the "lock" is really just a simple kernel instruction as
opposed to a full-blown object), thereby lowering lock overhead (and
allowing us to go with finer-grained "locks")?

Since we're using an asynch message queue for the synch-server, it  
sounds like a standard lock-free algorithm.


> Since the Python interpreter is synchronized all of the time, it
> would completely run on the synchronization server all of the
> time. As you identify, that single CPU might get overloaded, so
> your scheme would give no benefits (since Python code could never
> run in parallel),

I think I neglected to mention that the locking would still need to  
be more fine grained - perhaps only do the context switch around  
refcounts (and the other places where the GIL is critical).
If we can do this in a way that allows simple list comprehensions to  
run in parallel, that would be really helpful (like a truly parallel  
map function).


> and only disadvantages (since multiple Python
> interpreters today can run on multiple CPUs, but could not
> anymore under your scheme).
>
Well, you could still run python code in parallel if you used  
multiple processes (each process having its own 'synchronization  
server'). Is that what you meant?


On Sep 13, 2007, at 12:38 PM, Justin Tulloss wrote:
>
> What do you think?
>
> I'm going to have to agree with Martin here, although I'm not sure  
> I understand what you're saying entirely. Perhaps if you explained  
> where the benefits of this approach come from, it would clear up  
> what you're thinking.

Well, my interpretation of the current problem is that removing the  
GIL has not been productive because of problems with lock contention  
on multi-core machines. Naturally, we need to make the locking more  
fine-grained to resolve this. Hopefully we can do so in a way that  
does not increase the lock overhead (hence my suggestion for a lock  
free approach using an asynch queue and a core as dedicated server).

If we can somehow guarantee all GC operations (which is why the GIL  
is needed in the first place) run on a single core, we get locking  
for free without actually having to have threads spinning.

regards,
Prateek

From martin at v.loewis.de  Thu Sep 13 17:55:24 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 13 Sep 2007 17:55:24 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <EAA274DE-E0BF-4775-89C6-8763E9AECD66@gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
	<741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
	<46E8BFAA.5090008@v.loewis.de>
	<EAA274DE-E0BF-4775-89C6-8763E9AECD66@gmail.com>
Message-ID: <46E95D6C.1060704@v.loewis.de>

>> "Since we are guaranteeing that synchronized code is running on a single
>> core, it is the equivalent of a lock at the cost of a context switch."
>>
>> This is precisely what a lock costs today: a context switch.
>>
> 
> Really? Wouldn't we save some memory allocation overhead (since in my
> design, the "lock" is a really just simple kernel instruction as opposed
> to a full blown object)

The GIL is a single variable, not larger than 50 Bytes or so. Locking
it requires no memory at all in user-space, and might require 8 bytes
or so per waiting thread in kernel-space.

> thereby lowering lock overhead

Why do you think "lock overhead" is related to memory consumption?

> Since we're using an asynch message queue for the synch-server, it
> sounds like a standard lock-free algorithm.

You lost me here. What are you trying to achieve? It's not the lock
that people complain about, but that Python runs serially most
of the time.

> I think I neglected to mention that the locking would still need to be
> more fine grained - perhaps only do the context switch around refcounts
> (and the other places where the GIL is critical).

I think this is the point where I need to say "good luck implementing
it".

> Well, my interpretation of the current problem is that removing the GIL
> has not been productive because of problems with lock contention on
> multi-core machines.

My guess is that this interpretation is wrong. It was reported that
there was a slowdown by a factor of 2 in a single-threaded application.
That can't be due to lock contention.

> If we can somehow guarantee all GC operations (which is why the GIL is
> needed in the first place)

No, unless we disagree on what a "GC operation" is.

Regards,
Martin


From janssen at parc.com  Thu Sep 13 18:04:00 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 13 Sep 2007 09:04:00 PDT
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <m2r6l3p43j.fsf@valheru.db3l.homeip.net> 
References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com>
	<01bf01c7f593$d7beecc0$873cc640$@com.au>
	<07Sep12.195721pdt."57996"@synergy1.parc.xerox.com>
	<01da01c7f5b2$cadca4b0$6095ee10$@com.au>
	<07Sep12.202807pdt."57996"@synergy1.parc.xerox.com>
	<m2r6l3p43j.fsf@valheru.db3l.homeip.net>
Message-ID: <07Sep13.090402pdt."57996"@synergy1.parc.xerox.com>

> In other words, both the standard and your extension module on Windows
> bring along their own OpenSSL.

I see -- thanks.

Bill

From janssen at parc.com  Thu Sep 13 18:08:23 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 13 Sep 2007 09:08:23 PDT
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com> 
References: <01bf01c7f593$d7beecc0$873cc640$@com.au>
	<01da01c7f5b2$cadca4b0$6095ee10$@com.au>
	<m2r6l3p43j.fsf@valheru.db3l.homeip.net>
	<79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com>
	<9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com>
	<79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com>
Message-ID: <07Sep13.090827pdt."57996"@synergy1.parc.xerox.com>

> Anyway, philosophy aside, I'll try to make some time in the next few
> days to get a working setup.py for the SSL package using mingw.
> Hopefully, Bill will then integrate this and we'll have mingw as a
> supported option.

I'll be happy to do that!

Bill

From surekap at gmail.com  Thu Sep 13 18:29:15 2007
From: surekap at gmail.com (Prateek Sureka)
Date: Thu, 13 Sep 2007 21:59:15 +0530
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E95D6C.1060704@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
	<741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
	<46E8BFAA.5090008@v.loewis.de>
	<EAA274DE-E0BF-4775-89C6-8763E9AECD66@gmail.com>
	<46E95D6C.1060704@v.loewis.de>
Message-ID: <E1DC5852-2294-487C-862F-472517493AAD@gmail.com>


On Sep 13, 2007, at 9:25 PM, Martin v. Löwis wrote:

>>> "Since we are guaranteeing that synchronized code is running on a  
>>> single
>>> core, it is the equivalent of a lock at the cost of a context  
>>> switch."
>>>
>>> This is precisely what a lock costs today: a context switch.
>>>
>>
>> Really? Wouldn't we save some memory allocation overhead (since in my
>> design, the "lock" is a really just simple kernel instruction as  
>> opposed
>> to a full blown object)
>
> The GIL is a single variable, not larger than 50 Bytes or so. Locking
> it requires no memory at all in user-space, and might require 8 bytes
> or so per waiting thread in kernel-space.
>
>> thereby lowering lock overhead
>
> Why do you think "lock overhead" is related to memory consumption?

Well, it can be one (or both) of two things - 1) memory consumption,  
2) cost of acquiring and releasing the locks (which you said is the  
same as a context switch).

Since we've also identified (according to GvR's post:
http://www.artima.com/weblogs/viewpost.jsp?thread=214235) that the slowdown
was 2x in a single threaded application (which couldn't be due to
lock contention), it must be due to lock overhead (unless the  
programming was otherwise faulty or there is something else about  
locks that I don't know about - Martin?). Hence I'm assuming that we  
need to reduce lock overhead. If acquiring and releasing locks (part  
of lock overhead) is a simple context switch (and I don't doubt you  
here), then the only remaining thing to optimize is memory operations  
related to lock objects.

>
>> Since we're using an asynch message queue for the synch-server, it
>> sounds like a standard lock-free algorithm.
>
> You lost me here. What are you trying to achieve? It's not the lock
> that people complain about, but that Python runs serially most
> of the time.

http://en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms#The_lock-free_approach

Specifically, I'm trying to achieve the approach using a "deposit
request".


>> I think I neglected to mention that the locking would still need  
>> to be
>> more fine grained - perhaps only do the context switch around  
>> refcounts
>> (and the other places where the GIL is critical).
>
> I think this is the point where I need to say "good luck implementing
> it".

I don't mean to be unhelpful. It's just that this discussion started
because people (not me - although I would definitely benefit) showed  
interest in removing the GIL.

>> Well, my interpretation of the current problem is that removing  
>> the GIL
>> has not been productive because of problems with lock contention on
>> multi-core machines.
>
> My guess is that this interpretation is wrong. It was reported that
> there was a slowdown by a factor of 2 in a single-threaded  
> application.
> That can't be due to lock contention.

I agree with your point Martin (see my analysis above). Regarding  
lock contention: I'm guessing that if single threaded applications  
are so badly affected, then the cumulative overhead on multithreaded  
applications will be even worse. So we need to reduce the overhead.  
But then since all Python code runs under the GIL - which is a pretty  
coarse lock, we have to make the new locking more fine-grained (which  
is what I think the original patch by Greg Stein did). I'm also  
guessing that if you do that then for each refcount you're going to  
have to acquire a lock... which happens *very* frequently (and I  
think by your earlier responses you concur). So any time multiple
threads access the same object, each of them will need to do an
incref/decref, e.g. if you access a global variable inside a for-loop
from multiple threads.

>> If we can somehow guarantee all GC operations (which is why the  
>> GIL is
>> needed in the first place)
>
> No, unless we disagree on what a "GC operation" is.

Ok. Other people know more about the specifics of the GIL than I do.  
However, the main issue with removing the GIL seems to be the  
reference counting algorithm. That is what I was alluding to. In any  
case, it is not relevant for the rest of the discussion.

regards,
Prateek

From martin at v.loewis.de  Thu Sep 13 18:51:42 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 13 Sep 2007 18:51:42 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <E1DC5852-2294-487C-862F-472517493AAD@gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
	<741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
	<46E8BFAA.5090008@v.loewis.de>
	<EAA274DE-E0BF-4775-89C6-8763E9AECD66@gmail.com>
	<46E95D6C.1060704@v.loewis.de>
	<E1DC5852-2294-487C-862F-472517493AAD@gmail.com>
Message-ID: <46E96A9E.8080305@v.loewis.de>

> http://www.artima.com/weblogs/viewpost.jsp?thread=214235) that the
> slowdown was 2x in a single threaded application (which couldn't be due
> to lock contention), it must be due to lock overhead (unless the
> programming was otherwise faulty or there is something else about locks
> that I don't know about - Martin?). Hence I'm assuming that we need to
> reduce lock overhead. If acquiring and releasing locks (part of lock
> overhead) is a simple context switch (and I don't doubt you here), then
> the only remaining thing to optimize is memory operations related to
> lock objects.

I think you are putting too many assumptions on top of each other.
It might also have been that the locks in the slow implementation
were too fine-grained, and that some performance could have been
regained by making them coarser again.

>> You lost me here. What are you trying to achieve? It's not the lock
>> that people complain about, but that Python runs serially most
>> of the time.
> 
> http://en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms#The_lock-free_approach

The asynchronous model assumes that the sender can continue to process
data without needing a reply. This is not true for the Python threading
model: if the thread needs access to some data structure, it really
needs to wait for the result of that access, because that's the
semantics of the operations.
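
To illustrate the difference with a toy model in Python (the names
server, deposit and lookup are invented for this sketch and are not part
of any proposal): a fire-and-forget request lets the sender continue,
while a request whose result is needed forces the sender to wait for the
round trip.

```python
import queue
import threading

requests = queue.Queue()
store = {}


def server():
    """Hypothetical 'synchronization server': the only thread touching store."""
    while True:
        op, key, value, reply = requests.get()
        if op == "stop":
            break
        if op == "set":
            store[key] = value          # nobody is waiting for an answer
        elif op == "get":
            reply.put(store.get(key))   # the caller is blocked until this arrives


def deposit(key, value):
    """Fire-and-forget: the sender continues immediately."""
    requests.put(("set", key, value, None))


def lookup(key):
    """Needs the result, so the sender must wait for the reply."""
    reply = queue.Queue(maxsize=1)
    requests.put(("get", key, None, reply))
    return reply.get()


worker = threading.Thread(target=server)
worker.start()
deposit("x", 41)
result = lookup("x") + 1    # cannot proceed without the value
requests.put(("stop", None, None, None))
worker.join()
```

Almost every operation the interpreter performs is of the lookup kind,
so in this model the sender spends its time waiting on the server.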

> Specifically, i'm trying to achieve the approach using a "deposit request".

For that to work, you need to produce a list of requests that can be
processed asynchronously. I can't see any in the Python interpreter.

> I'm also guessing that if
> you do that then for each refcount you're going to have to acquire a
> lock... which happens *very* frequently (and I think by your earlier
> responses you concur).

In that it occurs frequently - not in that you have to acquire a lock
to modify the refcount. You don't.

> Ok. Other people know more about the specifics of the GIL than I do.
> However, the main issue with removing the GIL seems to be the reference
> counting algorithm. 

It isn't. Reference counting could be done easily without the GIL.
It's rather the container objects, and the global variables, that
need protection.

Regards,
Martin

From rhamph at gmail.com  Thu Sep 13 19:08:40 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 13 Sep 2007 11:08:40 -0600
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <1189696391.11322.275.camel@localhost>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<46E91BDB.7070601@v.loewis.de> <1189696391.11322.275.camel@localhost>
Message-ID: <aac2c7cb0709131008s7692de15p5606d73b82cfd863@mail.gmail.com>

On 9/13/07, Hrvoje Nikšić <hrvoje.niksic at avl.com> wrote:
> On Thu, 2007-09-13 at 13:15 +0200, "Martin v. Löwis" wrote:
> > > To put it another way, would it actually matter if the reference
> > > counts for such objects became hopelessly wrong due to non-atomic
> > > adjustments?
> >
> > If they drop to zero (which may happen due to non-atomic adjustments),
> > Python will try to release the static memory, which will crash the
> > malloc implementation.
>
> More precisely, Python will call the deallocator appropriate for the
> object type.  If that deallocator does nothing, the object continues to
> live.  Such objects could also start out with a refcount of sys.maxint
> or so to ensure that calls to the no-op deallocator are unlikely.
>
> The part I don't understand is how Python would know which objects are
> global/static.  Testing for such a thing sounds like something that
> would be slower than atomic incref/decref.

I've explained my experiments here:
http://www.artima.com/forums/flat.jsp?forum=106&thread=214235&start=30&msRange=15#279978

Basically though, atomic incref/decref won't work.  Once you've got
two threads modifying the same location the costs skyrocket.  Even
without being properly atomic you'll get the same slowdown on x86
(whose cache coherency is fairly strict.)

The only two options are:
A) Don't modify an object on every incref/decref.  Deletion must be
delayed.  This lets you share (thread-safe) objects.
B) Don't share *any* objects.  This is a process model (even if
they're lightweight like erlang).  For the near future, it's much
easier to do this using real processes though.

Threading is much more powerful, but it remains to be proven that it
can be done efficiently.
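
For what it's worth, option A can be modelled roughly as follows. This
is a toy in Python, not the code from my experiments; DeferredRefcounts
and its methods are hypothetical names. Each thread records refcount
changes in a private buffer, and the shared count is only touched in a
separate merge step, which is also the only place deletion can happen.

```python
import threading
from collections import Counter


class DeferredRefcounts:
    """Toy model: threads never write to the shared count directly."""

    def __init__(self):
        self.counts = Counter()          # authoritative counts, touched only in flush()
        self.local = threading.local()   # per-thread pending deltas
        self.buffers = []                # every per-thread buffer, for flush()
        self.buffers_lock = threading.Lock()

    def _pending(self):
        if not hasattr(self.local, "deltas"):
            self.local.deltas = Counter()
            with self.buffers_lock:      # taken once per thread, not per incref
                self.buffers.append(self.local.deltas)
        return self.local.deltas

    def incref(self, name):
        self._pending()[name] += 1       # thread-private write

    def decref(self, name):
        self._pending()[name] -= 1

    def flush(self):
        """Merge pending deltas; call only while no other thread is running.

        Returns the names whose count reached zero (deletion is delayed
        until this point).
        """
        for deltas in self.buffers:
            self.counts.update(deltas)
            deltas.clear()
        dead = [name for name, n in self.counts.items() if n <= 0]
        for name in dead:
            del self.counts[name]
        return dead


rc = DeferredRefcounts()
rc.incref("obj")                 # the main thread holds one reference


def worker():
    rc.incref("obj")             # borrow and release
    rc.decref("obj")


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

first = rc.flush()               # still referenced by the main thread
rc.decref("obj")
second = rc.flush()              # count reached zero, reported as dead
```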

-- 
Adam Olsen, aka Rhamphoryncus

From janssen at parc.com  Thu Sep 13 19:15:32 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 13 Sep 2007 10:15:32 PDT
Subject: [Python-Dev] SSL certs
In-Reply-To: <46E8E4B0.60909@v.loewis.de> 
References: <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<07Sep6.102547pdt."57996"@synergy1.parc.xerox.com>
	<20070912013412.GB14034@panix.com>
	<07Sep12.111225pdt."57996"@synergy1.parc.xerox.com>
	<20070913042606.GB27547@panix.com> <46E8E4B0.60909@v.loewis.de>
Message-ID: <07Sep13.101532pdt."57996"@synergy1.parc.xerox.com>

> However, there is an alternative to using multiple IP addresses:
> one could also use multiple "subject alternative names", and create
> a certificate that lists them all.

Unfortunately, much of the client code that does the hostname
verification is wrapped up in gullible Web browsers or Java HTTPS
libraries that swallowed RFC 2818 whole, and not easily accessible by
applications.  Does any of it recognize and accept "subject
alternative name"?

It's possible to at least override the default Java client-side
hostname verification handling in a new application.  And Python is
lucky; because there was no client-side hostname verification
possible, RFC 2818 hasn't been plastered into the Python standard
library :-).
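
An application-level check could look something like the sketch below.
It assumes the certificate is available as a dict shaped like the
result of getpeercert() in Python's ssl module; hostname_matches_cert
and example_cert are hypothetical, and only exact DNS matches are
handled (no wildcards, no commonName fallback).

```python
def hostname_matches_cert(cert, hostname):
    """Return True if hostname appears as a DNS subjectAltName in cert."""
    wanted = hostname.lower().rstrip(".")
    for kind, value in cert.get("subjectAltName", ()):
        if kind == "DNS" and value.lower().rstrip(".") == wanted:
            return True
    return False


# Hypothetical certificate listing several names for one server.
example_cert = {
    "subject": ((("commonName", "www.example.org"),),),
    "subjectAltName": (
        ("DNS", "www.example.org"),
        ("DNS", "mail.example.org"),
    ),
}

ok = hostname_matches_cert(example_cert, "MAIL.example.org")
bad = hostname_matches_cert(example_cert, "evil.example.org")
```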

Bill

From martin at v.loewis.de  Thu Sep 13 19:18:43 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 13 Sep 2007 19:18:43 +0200
Subject: [Python-Dev] SSL certs
In-Reply-To: <07Sep13.101532pdt."57996"@synergy1.parc.xerox.com>
References: <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de>
	<46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de>
	<46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<07Sep6.102547pdt."57996"@synergy1.parc.xerox.com>
	<20070912013412.GB14034@panix.com>
	<07Sep12.111225pdt."57996"@synergy1.parc.xerox.com>
	<20070913042606.GB27547@panix.com> <46E8E4B0.60909@v.loewis.de>
	<07Sep13.101532pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46E970F3.2080304@v.loewis.de>

>> However, there is an alternative to using multiple IP addresses:
>> one could also use multiple "subject alternative names", and create
>> a certificate that lists them all.
> 
> Unfortunately, much of the client code that does the hostname
> verification is wrapped up in gullible Web browsers or Java HTTPS
> libraries that swallowed RFC 2818 whole, and not easily accessible by
> applications.  Does any of it recognize and accept "subject
> alternative name"?

Works fine with Firefox and MSIE.

Regards,
Martin

From jason.orendorff at gmail.com  Thu Sep 13 19:29:23 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Thu, 13 Sep 2007 13:29:23 -0400
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
	<741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
	<2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com>
Message-ID: <bb8868b90709131029p8545c4bxeed0f7294dea69b9@mail.gmail.com>

On 9/13/07, Justin Tulloss <tulloss2 at uiuc.edu> wrote:
> 1. Use message passing and transactions.  [...]
> 2. Do it perl style. [...]
> 3. Come up with an elegant way of handling multiple python processes. [...]
> 4. Remove the GIL, use transactions for python objects, [...]

The SpiderMonkey JavaScript engine takes a very different approach,
described here:
http://developer.mozilla.org/en/docs/SpiderMonkey_Internals:_Thread_Safety

The SpiderMonkey C API threading model should sound familiar:  C code
can assume that simple operations, like dictionary lookups, are atomic
and thread-safe.  C code must explicitly JS_SuspendRequest() before
doing blocking I/O or number-crunching (just like
Py_BEGIN_ALLOW_THREADS).  The main difference is that SpiderMonkey's
"requests" are not mutually exclusive, the way the GIL is.

SpiderMonkey does fine-grained locking for mutable objects to avoid
race conditions.  The clever bit is that SpiderMonkey's per-object
locking does *not* require a context switch or even an atomic
instruction, in the usual case where an object is *not* shared among
threads.  (Programs that embed SpiderMonkey therefore run faster if
they manage to ensure that threads share relatively few mutable
objects.  JavaScript doesn't have modules.)

Suppose Python went this route.  There would still have to be a
"stop-the-world" global lock, because the cycle collector won't work
if other threads are going about changing pointers.  (SpiderMonkey's
GC does the same thing.)  Retaining such a lock has another advantage:
this change could be completely backward-compatible to extensions.
Just use this global lock as the GIL when entering a non-thread-safe
extension (all existing extensions would be considered
non-thread-safe).

This means non-thread-safe extensions would be hoggish (but not much
worse than they are already!).  Making an existing extension
thread-safe would require some thought, but it wouldn't be terribly
hard.  In the simplest cases, the extension writer could just add a
flag to the type saying "ok, I'm thread-safe".

Refcounting is another major issue.  SpiderMonkey uses GC instead.
CPython would need to do atomic increfs/decrefs.  (Deferred
refcounting could mitigate the cost.)

The main drawback (aside from the amount of work) is the patent.
SpiderMonkey's license grants a worldwide, royalty-free license, but
not under the Python license.  I think this could be wrangled, if the
technical approach looks worthwhile.

-j

From janssen at parc.com  Thu Sep 13 19:55:36 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 13 Sep 2007 10:55:36 PDT
Subject: [Python-Dev] base64 -- should b64encode introduce line breaks?
Message-ID: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com>

I see that base64.b64encode and base64.standard_b64encode no longer
introduce line breaks into the output strings, as base64.encodestring
does.  Shouldn't there be an option on one of them to do this?

Bill

From janssen at parc.com  Thu Sep 13 20:43:01 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 13 Sep 2007 11:43:01 PDT
Subject: [Python-Dev] base64 -- should b64encode introduce line breaks?
In-Reply-To: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com> 
References: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep13.114308pdt."57996"@synergy1.parc.xerox.com>

> I see that base64.b64encode and base64.standard_b64encode no longer
> introduce line breaks into the output strings, as base64.encodestring
> does.  Shouldn't there be an option on one of them to do this?

See:

http://mail.python.org/pipermail/python-bugs-list/2001-October/007856.html

section 2.1 of http://www.faqs.org/rfcs/rfc3548.html

Perhaps adding MIME_b64encode() and PEM_b64encode() routines?  Or just
an optional parameter to standard_b64encode, called "max_line_length",
defaulting to 0, meaning no max?
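
A minimal sketch of what the optional-parameter variant might look
like, written for current Python 3. The wrapper name
b64encode_wrapped is hypothetical; only base64.b64encode is a real
library call, and max_line_length follows the name suggested above.

```python
import base64


def b64encode_wrapped(data, max_line_length=0):
    """Base64-encode data, breaking lines if max_line_length > 0."""
    encoded = base64.b64encode(data)
    if max_line_length <= 0 or not encoded:
        # 0 means "no max": return a single unbroken line.
        return encoded
    lines = [encoded[i:i + max_line_length]
             for i in range(0, len(encoded), max_line_length)]
    return b"\n".join(lines) + b"\n"


payload = b"x" * 100
one_line = b64encode_wrapped(payload)
mime_style = b64encode_wrapped(payload, 76)  # 76 is the MIME line limit
```

With the default of 0 the behaviour is unchanged, so existing callers
would not be affected.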

Bill



From facundobatista at gmail.com  Thu Sep 13 21:08:48 2007
From: facundobatista at gmail.com (Facundo Batista)
Date: Thu, 13 Sep 2007 16:08:48 -0300
Subject: [Python-Dev] Python tickets summary
In-Reply-To: <e04bdf310709101325o1315adb0pebcb2f33d8313f35@mail.gmail.com>
References: <e04bdf310709101325o1315adb0pebcb2f33d8313f35@mail.gmail.com>
Message-ID: <e04bdf310709131208v307bb273gd47a48779b4db61b@mail.gmail.com>

2007/9/10, Facundo Batista <facundobatista at gmail.com>:

> I modified my tool, which makes a summary of all the Python tickets
> (I moved the source where the info is taken from SF to our Roundup).
>
> As a result, the summary is now, again, updated daily:

Taking an idea from Jeff Rush, there are now separate listings by
ticket keyword.

This way, you can see only the Py3k tickets, or the patches, etc.

All the listings are accessible from the same pages; start here:

  http://www.taniquetil.com.ar/facundo/py_tickets.html

(remember to refresh)

Any ideas for improving these pages are welcome.

Regards,

--
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From draghuram at gmail.com  Thu Sep 13 21:14:41 2007
From: draghuram at gmail.com (Raghuram Devarakonda)
Date: Thu, 13 Sep 2007 15:14:41 -0400
Subject: [Python-Dev] [Tracker-discuss] Python tickets summary
In-Reply-To: <e04bdf310709131208v307bb273gd47a48779b4db61b@mail.gmail.com>
References: <e04bdf310709101325o1315adb0pebcb2f33d8313f35@mail.gmail.com>
	<e04bdf310709131208v307bb273gd47a48779b4db61b@mail.gmail.com>
Message-ID: <2c51ecee0709131214q14bcce2du1b3cb077088f063d@mail.gmail.com>

On 9/13/07, Facundo Batista <facundobatista at gmail.com> wrote:
>   http://www.taniquetil.com.ar/facundo/py_tickets.html

It looks like the column "Opened by" contains information for "Last
update by"  and vice versa. At least, that is the case with issue
1159.

Thanks,
Raghu

From rhamph at gmail.com  Thu Sep 13 21:43:28 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 13 Sep 2007 13:43:28 -0600
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709131145p5ef9aea6geeb3f6d03c8227c7@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<46E91BDB.7070601@v.loewis.de> <1189696391.11322.275.camel@localhost>
	<aac2c7cb0709131008s7692de15p5606d73b82cfd863@mail.gmail.com>
	<2cfeb93c0709131145p5ef9aea6geeb3f6d03c8227c7@mail.gmail.com>
Message-ID: <aac2c7cb0709131243q5cf9aa01o16e0a727106c2bcd@mail.gmail.com>

On 9/13/07, Justin Tulloss <jmtulloss at gmail.com> wrote:
>
>
> On 9/13/07, Adam Olsen <rhamph at gmail.com> wrote:
> >
> > Basically though, atomic incref/decref won't work.  Once you've got
> > two threads modifying the same location the costs skyrocket.  Even
> > without being properly atomic you'll get the same slowdown on x86
> > (whose cache coherency is fairly strict.)
>
>
> I'm a bit skeptical of the actual costs of atomic incref. For there to be
> contention, you would need to be modifying the same memory location
> at the exact same time. That seems unlikely to ever happen. We can't bank on
> it never happening, but an occasionally expensive operation is ok. After
> all, it's occasional.

That was my initial expectation too.  However, the incref *is* a
modification.  It's not simply an issue of the "exact same time", but
anything that causes the cache entries to bounce back and forth and
delay the rest of the pipeline.  If you have a simple loop like "for i
in range(count): 1.0+n", then the 1.0 literal will get shared between
threads, and the refcount will get hammered.
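
The sharing is easy to observe from Python itself. The sketch below
assumes CPython, where a literal is stored once in the function's
code object; the function and variable names are made up for the
example.

```python
import threading


def literal():
    # CPython stores this float once, in the function's code object.
    return 1.0


def worker(seen_ids):
    # Record the identity of the object this thread gets for the literal.
    seen_ids.append(id(literal()))


seen_ids = []
threads = [threading.Thread(target=worker, args=(seen_ids,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# One distinct id means every thread touched the same float object,
# and therefore the same reference count field.
all_threads_share_literal = len(set(seen_ids)) == 1
```

Every thread that runs the function increfs and decrefs that one
object, so its refcount field is written from all CPUs.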

Is it reasonable to expect that much sharing?  I think it is.
Literals are an obvious example, but there's also configuration data
passed between threads.  Pystone seems to have enough sharing to kill
performance.  And after all, isn't sharing the whole point (even in
the definition) of threads?

-- 
Adam Olsen, aka Rhamphoryncus

From p.f.moore at gmail.com  Thu Sep 13 21:46:10 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 13 Sep 2007 20:46:10 +0100
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <-3701875869749923189@unknownmsgid>
References: <01bf01c7f593$d7beecc0$873cc640$@com.au>
	<01da01c7f5b2$cadca4b0$6095ee10$@com.au>
	<m2r6l3p43j.fsf@valheru.db3l.homeip.net>
	<79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com>
	<9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com>
	<79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com>
	<-3701875869749923189@unknownmsgid>
Message-ID: <79990c6b0709131246k4d442362y2f684bd5342e1a06@mail.gmail.com>

On 13/09/2007, Bill Janssen <janssen at parc.com> wrote:
> > Anyway, philosophy aside, I'll try to make some time in the next few
> > days to get a working setup.py for the SSL package using mingw.
> > Hopefully, Bill will then integrate this and we'll have mingw as a
> > supported option.
>
> I'll be happy to do that!

OK, the following patch to setup.py works for mingw32. You need to set
2 variables -

1. The location where you installed gnuwin32
2. Whether you want a static or dynamic build

I've checked both versions on Python 2.5.1 and they pass all tests.
Static build is 670k, dynamic is 26k (but depends on the openssl DLLs
libssl32.dll and libeay32.dll).

Ideally, these should be settable via command line options or
something. Also, it would be nice to detect the use of MSVC and do
something equivalent (but presumably somewhat different), but I don't
know how to detect the type of compiler the user has selected :-(

Anyway, I hope it's useful. If nothing else, it offers a way for
people to build the module with free software on Windows.

I could build some Windows installers if you want, but I'd need to
download and install some extra versions of Python, so you'd have to
tell me which you want doing (and I can't offer to commit to doing
this on a regular basis...)

Paul.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mingw.patch
Type: application/octet-stream
Size: 2071 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20070913/4a0a3595/attachment-0001.obj 

From tulloss2 at uiuc.edu  Thu Sep 13 22:16:57 2007
From: tulloss2 at uiuc.edu (Justin Tulloss)
Date: Thu, 13 Sep 2007 15:16:57 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <bb8868b90709131029p8545c4bxeed0f7294dea69b9@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
	<741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
	<2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com>
	<bb8868b90709131029p8545c4bxeed0f7294dea69b9@mail.gmail.com>
Message-ID: <2cfeb93c0709131316t23297e4doee08d46601cbfb2c@mail.gmail.com>

On 9/13/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
>
> On 9/13/07, Justin Tulloss <tulloss2 at uiuc.edu> wrote:
> > 1. Use message passing and transactions.  [...]
> > 2. Do it perl style. [...]
> > 3. Come up with an elegant way of handling multiple python processes.
> [...]
> > 4. Remove the GIL, use transactions for python objects, [...]
>
> The SpiderMonkey JavaScript engine takes a very different approach,
> described here:
> http://developer.mozilla.org/en/docs/SpiderMonkey_Internals:_Thread_Safety


This is basically the same as what perl does, as far as I understand it.
There are differences, but they're not that substantial. It's basically the
idea of keeping all state separate and treating global access as a special
case. I think this is a pretty solid approach, since globals shouldn't be
accessed that often. What we would want to do differently is make sure that
read-only globals can be cheaply accessed from any thread. Otherwise we lose
the performance benefit of having them in the first place.

> Refcounting is another major issue.  SpiderMonkey uses GC instead.
> CPython would need to do atomic increfs/decrefs.  (Deferred
> refcounting could mitigate the cost.)


This is definitely something to think about. I don't really have an answer
straight off, but there are several things we could try.

> The main drawback (aside from the amount of work) is the patent.
> SpiderMonkey's license grants a worldwide, royalty-free license, but
> not under the Python license.  I think this could be wrangled, if the
> technical approach looks worthwhile.


I'm not sure this is an issue. It's not like we would be using the code,
just the patented algorithm. Any code we wrote to implement the algorithm
would of course be covered under the python license. I'm not a legal guy
though.

Justin

From janssen at parc.com  Thu Sep 13 22:24:01 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 13 Sep 2007 13:24:01 PDT
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <79990c6b0709131246k4d442362y2f684bd5342e1a06@mail.gmail.com> 
References: <01bf01c7f593$d7beecc0$873cc640$@com.au>
	<01da01c7f5b2$cadca4b0$6095ee10$@com.au>
	<m2r6l3p43j.fsf@valheru.db3l.homeip.net>
	<79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com>
	<9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com>
	<79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com>
	<-3701875869749923189@unknownmsgid>
	<79990c6b0709131246k4d442362y2f684bd5342e1a06@mail.gmail.com>
Message-ID: <07Sep13.132403pdt."57996"@synergy1.parc.xerox.com>

> I could build some Windows installers if you want, but I'd need to
> download and install some extra versions of Python, so you'd have to
> tell me which you want doing (and I can't offer to commit to doing
> this on a regular basis...)

Thanks, but let's wait till this settles down a bit (say, a week
passes without me saying anything about it :-).  Then I'll definitely
want both VS and mingw versions to upload to the Cheeseshop.  But
it's not quite ready yet.

Bill

From p.f.moore at gmail.com  Thu Sep 13 22:28:34 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 13 Sep 2007 21:28:34 +0100
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <7888070288790377344@unknownmsgid>
References: <01bf01c7f593$d7beecc0$873cc640$@com.au>
	<01da01c7f5b2$cadca4b0$6095ee10$@com.au>
	<m2r6l3p43j.fsf@valheru.db3l.homeip.net>
	<79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com>
	<9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com>
	<79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com>
	<-3701875869749923189@unknownmsgid>
	<79990c6b0709131246k4d442362y2f684bd5342e1a06@mail.gmail.com>
	<7888070288790377344@unknownmsgid>
Message-ID: <79990c6b0709131328s1d042881wf4a4c396b82b720d@mail.gmail.com>

On 13/09/2007, Bill Janssen <janssen at parc.com> wrote:
> > I could build some Windows installers if you want, but I'd need to
> > download and install some extra versions of Python, so you'd have to
> > tell me which you want doing (and I can't offer to commit to doing
> > this on a regular basis...)
>
> Thanks, but let's wait till this settles down a bit (say, a week
> passes without me saying anything about it :-).  Then I'll definitely
> want both VS and mingw versions to upload to the Cheeseshop.  But
> it's not quite ready yet.

OK, ignore my other message then (except as an indication that I can
build them when you're ready :-)).

You don't need VS and mingw binary installers, though - the mingw ones
will work for any Python (ignoring specialised custom builds, and
anyone doing one of them is probably capable of building the ssl
module!).

Paul.

From jmtulloss at gmail.com  Thu Sep 13 20:45:09 2007
From: jmtulloss at gmail.com (Justin Tulloss)
Date: Thu, 13 Sep 2007 13:45:09 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <aac2c7cb0709131008s7692de15p5606d73b82cfd863@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<46E91BDB.7070601@v.loewis.de> <1189696391.11322.275.camel@localhost>
	<aac2c7cb0709131008s7692de15p5606d73b82cfd863@mail.gmail.com>
Message-ID: <2cfeb93c0709131145p5ef9aea6geeb3f6d03c8227c7@mail.gmail.com>

On 9/13/07, Adam Olsen <rhamph at gmail.com> wrote:
>
>
> Basically though, atomic incref/decref won't work.  Once you've got
> two threads modifying the same location the costs skyrocket.  Even
> without being properly atomic you'll get the same slowdown on x86
> (whose cache coherency is fairly strict.)




I'm a bit skeptical of the actual costs of atomic incref. For there to be
contention, you would need to be modifying the same memory location
at the exact same time. That seems unlikely to ever happen. We can't bank on
it never happening, but an occasionally expensive operation is ok. After
all, it's occasional.

Justin

From greg.ewing at canterbury.ac.nz  Fri Sep 14 00:59:36 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Sep 2007 10:59:36 +1200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E795FD.1070103@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
Message-ID: <46E9C0D8.5050100@canterbury.ac.nz>

Martin v. Löwis wrote:

> Now we are getting into details: you do NOT have to lock
> an object to modify its reference count. An atomic
> increment/decrement operation is enough.

I stand corrected. But if it were as simple as that,
I think it would have been done by now. I got the
impression that this had already been tried, and it
was still too slow.

--
Greg

From mhammond at skippinet.com.au  Fri Sep 14 01:18:12 2007
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Fri, 14 Sep 2007 09:18:12 +1000
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <79990c6b0709131328s1d042881wf4a4c396b82b720d@mail.gmail.com>
References: <01bf01c7f593$d7beecc0$873cc640$@com.au>	<01da01c7f5b2$cadca4b0$6095ee10$@com.au>	<m2r6l3p43j.fsf@valheru.db3l.homeip.net>	<79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com>	<9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com>	<79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com>	<-3701875869749923189@unknownmsgid>	<79990c6b0709131246k4d442362y2f684bd5342e1a06@mail.gmail.com>	<7888070288790377344@unknownmsgid>
	<79990c6b0709131328s1d042881wf4a4c396b82b720d@mail.gmail.com>
Message-ID: <028f01c7f65c$671741b0$3545c510$@com.au>

> You don't need VS and mingw binary installers, though - the mingw ones
> will work for any Python (ignoring specialised custom builds, and
> anyone doing one of them is probably capable of building the ssl
> module!).

While I appreciate your points about building the extension with free tools,
wouldn't it be prudent to release binaries using the same compiler as Python
itself, assuming that option is available?  If I read this thread correctly,
a mingw build will rely on an openssl DLL being available or installed,
which would seem to be less desirable than the way it builds with the
openssl Python itself builds with.

Mark


From skip at pobox.com  Fri Sep 14 01:38:05 2007
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 13 Sep 2007 18:38:05 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <1189696391.11322.275.camel@localhost>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<46E91BDB.7070601@v.loewis.de>
	<1189696391.11322.275.camel@localhost>
Message-ID: <18153.51677.587497.132597@montanaro.dyndns.org>


    Hrvoje> More precisely, Python will call the deallocator appropriate for
    Hrvoje> the object type.  If that deallocator does nothing, the object
    Hrvoje> continues to live.  Such objects could also start out with a
    Hrvoje> refcount of sys.maxint or so to ensure that calls to the no-op
    Hrvoje> deallocator are unlikely.

Maybe sys.maxint/2?  You'd hate for the first incref to invoke the
deallocator even if it was a no-op.  
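
The arithmetic behind starting at half the maximum can be sketched as
follows. This is only a simulation in Python 3: it assumes a 64-bit
signed refcount field that wraps in two's-complement fashion, and
INT_MAX stands in for sys.maxint.

```python
BITS = 64                      # assumed width of the refcount field
INT_MAX = 2 ** (BITS - 1) - 1  # plays the role of sys.maxint


def as_signed(value):
    """Wrap value the way a two's-complement BITS-wide integer would."""
    value &= (1 << BITS) - 1
    return value - (1 << BITS) if value > INT_MAX else value


def incref(refcnt):
    return as_signed(refcnt + 1)


from_max = incref(INT_MAX)          # wraps around to a negative count
from_half = incref(INT_MAX // 2)    # still a large positive count
headroom = INT_MAX - INT_MAX // 2   # increfs available before wrapping
```

Starting at the midpoint leaves roughly the same distance to zero as
to overflow, so the count is equally tolerant of stray increfs and
decrefs.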

Skip

From jon+python-dev at unequivocal.co.uk  Fri Sep 14 03:01:22 2007
From: jon+python-dev at unequivocal.co.uk (Jon Ribbens)
Date: Fri, 14 Sep 2007 02:01:22 +0100
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <18153.51677.587497.132597@montanaro.dyndns.org>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<46E91BDB.7070601@v.loewis.de>
	<1189696391.11322.275.camel@localhost>
	<18153.51677.587497.132597@montanaro.dyndns.org>
Message-ID: <20070914010122.GN32061@snowy.squish.net>

On Thu, Sep 13, 2007 at 06:38:05PM -0500, skip at pobox.com wrote:
>     Hrvoje> More precisely, Python will call the deallocator appropriate for
>     Hrvoje> the object type.  If that deallocator does nothing, the object
>     Hrvoje> continues to live.  Such objects could also start out with a
>     Hrvoje> refcount of sys.maxint or so to ensure that calls to the no-op
>     Hrvoje> deallocator are unlikely.
> 
> Maybe sys.maxint/2?  You'd hate for the first incref to invoke the
> deallocator even if it was a no-op.  

I do believe I already suggested that ;-)

From greg.ewing at canterbury.ac.nz  Fri Sep 14 05:11:00 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Sep 2007 15:11:00 +1200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <fcb2cc$65i$1@sea.gmane.org>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org>
Message-ID: <46E9FBC4.1020901@canterbury.ac.nz>

Christian Heimes wrote:
> Pardon my ignorance but why does Python do reference counting for truly
> global and static objects

Because it would cost more time to check whether the
reference counting needed to be done than to just do
it anyway.

Remember that *most* refcount operations are on
non-global objects. Putting in a test would slow
all of them down, but only speed a few of them
up.
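
The trade-off can be put as a simple cost model. The numbers below
are invented purely for illustration: c is the cost of one refcount
operation, t the cost of the extra test, and p the fraction of
operations that hit global objects.

```python
def cost_without_check(c):
    """Average cost per operation when every refcount op is just done."""
    return c


def cost_with_check(c, t, p):
    """Every op pays the test; only non-global ops then pay the refcount."""
    return t + (1 - p) * c


c, t, p = 1.0, 0.5, 0.05          # hypothetical relative costs
plain = cost_without_check(c)
checked = cost_with_check(c, t, p)
break_even_test_cost = p * c      # the test only wins if t is below this
```

With these made-up figures the test would have to cost less than 5%
of a refcount operation before it paid for itself.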

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Fri Sep 14 05:15:23 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Sep 2007 15:15:23 +1200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <20070913105527.GH32061@snowy.squish.net>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
Message-ID: <46E9FCCB.1050105@canterbury.ac.nz>

Jon Ribbens wrote:
> To put it another way, would it actually matter if the reference
> counts for such objects became hopelessly wrong due to non-atomic
> adjustments?

Again, it would cost time to check whether you could
get away with doing non-atomic refcounting.

If you're thinking that no check would be needed because
only things like True, False and None would be shared
between threads, that's quite wrong. If the threads
are to communicate at all, they need to share some
kind of data somewhere.

Also keep in mind that there is one case of "wrong"
refcounting that would be disastrous, which is the
case where the refcount becomes zero prematurely.
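
To make the premature-zero case concrete, here is a deterministic
simulation of two non-atomic increfs interleaving. It uses no real
threads; the interleaving is written out by hand, and the Obj class
is a hypothetical stand-in for an object header.

```python
class Obj:
    def __init__(self):
        self.refcnt = 1  # one existing reference


obj = Obj()

# Two threads each take a new reference, but their non-atomic
# read-modify-write sequences interleave.
a_read = obj.refcnt        # thread A reads 1
b_read = obj.refcnt        # thread B reads 1
obj.refcnt = a_read + 1    # thread A writes 2
obj.refcnt = b_read + 1    # thread B writes 2; A's increment is lost

live_references = 3        # the original plus the two new ones
recorded = obj.refcnt      # the count now under-reports by one

# Two of the three references are dropped (these decrefs do not race).
obj.refcnt -= 1
obj.refcnt -= 1
live_references -= 2

freed_too_early = obj.refcnt == 0 and live_references > 0
```

One lost increment is enough: the count reaches zero while a
reference is still in use.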

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Fri Sep 14 05:19:04 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Sep 2007 15:19:04 +1200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <18153.15002.76898.448843@montanaro.dyndns.org>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<18153.15002.76898.448843@montanaro.dyndns.org>
Message-ID: <46E9FDA8.6010303@canterbury.ac.nz>

skip at pobox.com wrote:
> what if ... we use atomic test-and-set to
> handle reference counting (with a lock for those CPU architectures where we
> haven't written the necessary assembler fragment), then implement a lock for
> each mutable type and another for global state (thread state, interpreter
> state, etc)?

Could be worth a try. A first step might be to just implement
the atomic refcounting, and run that single-threaded to see
if it has terribly bad effects on performance.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Fri Sep 14 05:43:57 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Sep 2007 15:43:57 +1200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <EAA274DE-E0BF-4775-89C6-8763E9AECD66@gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
	<741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
	<46E8BFAA.5090008@v.loewis.de>
	<EAA274DE-E0BF-4775-89C6-8763E9AECD66@gmail.com>
Message-ID: <46EA037D.5030909@canterbury.ac.nz>

Prateek Sureka wrote:
> Naturally, we need to make the locking more  
> fine-grained to resolve this. Hopefully we can do so in a way that  
> does not increase the lock overhead (hence my suggestion for a lock  
> free approach using an asynch queue and a core as dedicated server).

What you don't seem to see is that this would have
no less overhead, and probably a lot *more*, than
a mutex or other standard synchronisation mechanism.
Certainly a lot more than an atomic instruction for
the incref/decref.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Fri Sep 14 05:55:30 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Sep 2007 15:55:30 +1200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <bb8868b90709131029p8545c4bxeed0f7294dea69b9@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E652CD.1070901@v.loewis.de>
	<2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
	<741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
	<2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com>
	<bb8868b90709131029p8545c4bxeed0f7294dea69b9@mail.gmail.com>
Message-ID: <46EA0632.5060603@canterbury.ac.nz>

Jason Orendorff wrote:
> The clever bit is that SpiderMonkey's per-object
> locking does *not* require a context switch or even an atomic
> instruction, in the usual case where an object is *not* shared among
> threads.

How does it tell whether an object is shared between
threads? That sounds like the really clever bit to me.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From tleeuwenburg at gmail.com  Fri Sep 14 06:13:12 2007
From: tleeuwenburg at gmail.com (Tennessee Leeuwenburg)
Date: Fri, 14 Sep 2007 14:13:12 +1000
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46EA0632.5060603@canterbury.ac.nz>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
	<741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
	<2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com>
	<bb8868b90709131029p8545c4bxeed0f7294dea69b9@mail.gmail.com>
	<46EA0632.5060603@canterbury.ac.nz>
Message-ID: <43c8685c0709132113u2282bb7bg848ed10c7d80f640@mail.gmail.com>

Pardon me for talking with no experience in such matters, but...

Okay, incrementing a reference counter is atomic, therefore the cheapest
possible operation. Is it possible to keep reference counting atomic in a
multi-thread model?

Could you do the following... let's consider two threads, "A" and "B". Each
time an object is created, a reference count is created in both "A" and "B".
Let's suppose "A" has a real reference and "B" has no reference really.
Couldn't the GC check two reference registers for a reference count? The
object would then be cleaned up only if both registers were 0.

To exploit multiple CPUs, you could have two persistent Python processes on
each CPU with its own mini-GIL. Object creation would then involve a call to
each process to create the reference and GC would involve checking each
process to see what their count is. However, it would mean that within each
process, threads could create additional references or remove references in
an atomic way.

In a single-CPU system, this would cost the same as it does currently, since I
think that situation would devolve to having just one place to check for
references. So it should be no more expensive on a single-CPU system.

In a two-CPU system, I'm no expert on the actual call overheads of object
creation and garbage collection, but logically it would double the effort of
object creation and destruction (all such operations now need to occur on
both processes) but would keep reference increments and decrements atomic.
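
A toy model of the idea, in Python 3, might look like this. The class
name and methods are hypothetical, and the sketch assumes each
reference is released by the same thread that took it, so that the
"all counters are zero" test described above is meaningful.

```python
class PerThreadRefcount:
    """One counter per thread; the object dies only when all are zero."""

    def __init__(self, n_threads):
        self.counts = [0] * n_threads

    def incref(self, thread_index):
        # Touches only this thread's counter, so no cross-thread contention.
        self.counts[thread_index] += 1

    def decref(self, thread_index):
        self.counts[thread_index] -= 1

    def collectable(self):
        # The collector has to look at every thread's counter.
        return all(count == 0 for count in self.counts)


rc = PerThreadRefcount(2)
rc.incref(0)                      # thread "A" takes a reference
after_a = rc.collectable()
rc.incref(1)                      # thread "B" takes one too
rc.decref(0)                      # "A" lets go, "B" still holds it
after_a_release = rc.collectable()
rc.decref(1)
after_b_release = rc.collectable()
```

The cheap part is the incref and decref; the price is that every
collection check has to visit every counter.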

Once again, I'm really sorry if I'm completely off-base since I have never
done any actual coding in this area, but I thought I'd make the suggestion
just in case it happened to have relevance.

Thanks,
-Tennessee

From tulloss2 at uiuc.edu  Fri Sep 14 07:10:34 2007
From: tulloss2 at uiuc.edu (Justin Tulloss)
Date: Fri, 14 Sep 2007 00:10:34 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46EA0632.5060603@canterbury.ac.nz>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
	<741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
	<2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com>
	<bb8868b90709131029p8545c4bxeed0f7294dea69b9@mail.gmail.com>
	<46EA0632.5060603@canterbury.ac.nz>
Message-ID: <2cfeb93c0709132210o4c5f6e56va0c9e2d9ebf1be27@mail.gmail.com>

On 9/13/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>
> Jason Orendorff wrote:
> > The clever bit is that SpiderMonkey's per-object
> > locking does *not* require a context switch or even an atomic
> > instruction, in the usual case where an object is *not* shared among
> > threads.
>
> How does it tell whether an object is shared between
> threads? That sounds like the really clever bit to me.


If you look at the article, they have a code sample.

Basically a global is "owned" by the first thread that touches it. That
thread can do whatever it wants with that global. If another thread wants to
touch the global, it locks everything to do so.
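
Loosely, the ownership idea as described here could be sketched like
this in Python 3. This is not SpiderMonkey's code: OwnedGlobal and
everything_lock are hypothetical names, and the ownership claim
itself is not made safe against races.

```python
import threading

everything_lock = threading.Lock()  # stand-in for "locks everything"


class OwnedGlobal:
    def __init__(self, value):
        self.value = value
        self.owner = None
        self.fast_writes = 0
        self.locked_writes = 0

    def write(self, value):
        me = threading.get_ident()
        if self.owner is None:
            # The first thread to touch the object becomes its owner.
            # (A real implementation would need to make this claim safely.)
            self.owner = me
        if self.owner == me:
            # Fast path: the owner writes without taking any lock.
            self.fast_writes += 1
            self.value = value
        else:
            # Slow path: any other thread must take the global lock.
            with everything_lock:
                self.locked_writes += 1
                self.value = value


g = OwnedGlobal(0)
g.write(1)                                   # main thread claims ownership
t = threading.Thread(target=g.write, args=(2,))
t.start()
t.join()
```

Objects that never leave their first thread stay on the fast path,
which is the common case the scheme is built around.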

This is a pretty good idea except that in Python there are so many globals
that all threads benefit from having access to. Luckily, except for their
reference counts, they're mostly read-only. Therefore, if we can work out
this reference count, we can probably use a similar concept.

Justin

From rhamph at gmail.com  Fri Sep 14 08:10:17 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 14 Sep 2007 00:10:17 -0600
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E9FDA8.6010303@canterbury.ac.nz>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<18153.15002.76898.448843@montanaro.dyndns.org>
	<46E9FDA8.6010303@canterbury.ac.nz>
Message-ID: <aac2c7cb0709132310v69d2bd41w2cd093e6ae34863@mail.gmail.com>

On 9/13/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> skip at pobox.com wrote:
> > what if ... we use atomic test-and-set to
> > handle reference counting (with a lock for those CPU architectures where we
> > haven't written the necessary assembler fragment), then implement a lock for
> > each mutable type and another for global state (thread state, interpreter
> > state, etc)?
>
> Could be worth a try. A first step might be to just implement
> the atomic refcounting, and run that single-threaded to see
> if it has terribly bad effects on performance.

I've done this experiment.  The cost was about 12% on my box.  Later, once I
had everything else set up so I could run two threads simultaneously, I
found much worse costs.  All those literals become shared objects that
create contention.

I'm now working on an approach that writes out refcounts in batches to
reduce contention.  The initial cost is much higher, but it scales
better too.  I've currently got it to just under 50% cost, meaning two
threads is a slight net gain.

-- 
Adam Olsen, aka Rhamphoryncus

From steve at holdenweb.com  Fri Sep 14 08:15:41 2007
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 14 Sep 2007 02:15:41 -0400
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <1189696391.11322.275.camel@localhost>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>	<fcb2cc$65i$1@sea.gmane.org>
	<200709131219.21152.nd@perlig.de>	<20070913105527.GH32061@snowy.squish.net>	<46E91BDB.7070601@v.loewis.de>
	<1189696391.11322.275.camel@localhost>
Message-ID: <fcd8ue$qqo$2@sea.gmane.org>

Hrvoje Nikšić wrote:
> On Thu, 2007-09-13 at 13:15 +0200, "Martin v. Löwis" wrote:
>>> To put it another way, would it actually matter if the reference
>>> counts for such objects became hopelessly wrong due to non-atomic
>>> adjustments?
>> If they drop to zero (which may happen due to non-atomic adjustments),
>> Python will try to release the static memory, which will crash the
>> malloc implementation.
> 
> More precisely, Python will call the deallocator appropriate for the
> object type.  If that deallocator does nothing, the object continues to
> live.  Such objects could also start out with a refcount of sys.maxint
> or so to ensure that calls to the no-op deallocator are unlikely.
> 
The thought of adding references is amusing. What happens when a 
refcount becomes negative by overflow? I know, I should read the source ...

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd           http://www.holdenweb.com
Skype: holdenweb      http://del.icio.us/steve.holden

Sorry, the dog ate my .sigline


From tulloss2 at uiuc.edu  Fri Sep 14 08:51:35 2007
From: tulloss2 at uiuc.edu (Justin Tulloss)
Date: Fri, 14 Sep 2007 01:51:35 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <aac2c7cb0709132310v69d2bd41w2cd093e6ae34863@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<18153.15002.76898.448843@montanaro.dyndns.org>
	<46E9FDA8.6010303@canterbury.ac.nz>
	<aac2c7cb0709132310v69d2bd41w2cd093e6ae34863@mail.gmail.com>
Message-ID: <2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com>

On 9/14/07, Adam Olsen <rhamph at gmail.com> wrote:

> > Could be worth a try. A first step might be to just implement
> > the atomic refcounting, and run that single-threaded to see
> > if it has terribly bad effects on performance.
>
> I've done this experiment.  It was about 12% on my box.  Later, once I
> had everything else set up so I could run two threads simultaneously, I
> found much worse costs.  All those literals become shared objects that
> create contention.


It's hard to argue with cold hard facts when all we have is raw speculation.
What do you think of a model where there is a global "thread count" that
keeps track of how many threads reference an object? Then there are
thread-specific reference counters for each object. When a thread's refcount
goes to 0, it decrefs the object's thread count. If you did this right,
hopefully there would only be cache updates when you update the thread
count, which will only be when a thread first references an object and when
it last references an object.

I mentioned this idea earlier and it's growing on me. Since you've actually
messed around with the code, do you think this would alleviate some of the
contention issues?
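
A rough sketch of what I mean, in Python rather than C and purely as an
illustration: the names (SharedObject, thread_count, alive) are
hypothetical, and threading.local stands in for whatever per-thread
storage a real implementation would use.

```python
import threading


class SharedObject:
    """Toy object with per-thread refcounts plus a global thread count."""

    def __init__(self):
        self.thread_count = 0                  # shared, but touched rarely
        self.thread_count_lock = threading.Lock()
        self.local = threading.local()         # holds each thread's own count
        self.alive = True

    def incref(self):
        count = getattr(self.local, "refcount", 0)
        if count == 0:
            # First reference from this thread: one of two shared writes.
            with self.thread_count_lock:
                self.thread_count += 1
        self.local.refcount = count + 1

    def decref(self):
        self.local.refcount -= 1
        if self.local.refcount == 0:
            # Last reference from this thread: the other shared write.
            with self.thread_count_lock:
                self.thread_count -= 1
                if self.thread_count == 0:
                    self.alive = False         # stand-in for deallocation
```

Every incref/decref in between only touches the calling thread's own
counter, which is where the hoped-for reduction in cache traffic would
come from.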

Justin

From hrvoje.niksic at avl.com  Fri Sep 14 09:25:24 2007
From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=)
Date: Fri, 14 Sep 2007 09:25:24 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <18153.51677.587497.132597@montanaro.dyndns.org>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<46E91BDB.7070601@v.loewis.de>
	<1189696391.11322.275.camel@localhost>
	<18153.51677.587497.132597@montanaro.dyndns.org>
Message-ID: <1189754724.11322.279.camel@localhost>

On Thu, 2007-09-13 at 18:38 -0500, skip at pobox.com wrote:
> Hrvoje> More precisely, Python will call the deallocator appropriate for
>     Hrvoje> the object type.  If that deallocator does nothing, the object
>     Hrvoje> continues to live.  Such objects could also start out with a
>     Hrvoje> refcount of sys.maxint or so to ensure that calls to the no-op
>     Hrvoje> deallocator are unlikely.
> 
> Maybe sys.maxint/2?  You'd hate for the first incref to invoke the
> deallocator even if it was a no-op.  

ob_refcnt is signed.  :-)
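
A quick way to see the consequence, as a sketch only: it assumes a modern
Python where sys.maxsize replaces sys.maxint, and it uses
ctypes.c_ssize_t as a stand-in for the signed Py_ssize_t type of
ob_refcnt. Strictly speaking, signed overflow is undefined in C; the
wraparound below is what ctypes does and what one typically observes.

```python
import ctypes
import sys

# Start at the largest value a signed, pointer-sized integer can hold.
start = ctypes.c_ssize_t(sys.maxsize)

# One more "incref": ctypes does no overflow checking, so the value wraps.
after_incref = ctypes.c_ssize_t(start.value + 1)

# Starting from half the maximum leaves plenty of headroom.
half_start = ctypes.c_ssize_t(sys.maxsize // 2 + 1)

print(after_incref.value < 0)   # the count has become negative
print(half_start.value > 0)     # still a positive count
```

So a count that starts at the maximum goes negative on the first incref
instead of wrapping to zero, and starting at half the maximum avoids the
problem for any realistic number of increfs.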



From martin at v.loewis.de  Fri Sep 14 14:24:12 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 14 Sep 2007 14:24:12 +0200
Subject: [Python-Dev] Daily Windows Installers
Message-ID: <46EA7D6C.8010600@v.loewis.de>

Together with David Bolen, I set up a series of buildbot
slaves that create an MSI installer from the 2.5, 2.6,
and 3.0 branches every day. The result files are available
from

http://www.python.org/dev/daily-msi/

The buildbot pages themselves are at

http://www.python.org/dev/buildbot/msi/

There are still some glitches with that installation
(in particular, the Microsoft help compiler seems to
crash occasionally).

If you find any problems with the MSI files themselves,
please report them to this list, or to the bug tracker.

Regards,
Martin

From facundobatista at gmail.com  Thu Sep 13 21:19:02 2007
From: facundobatista at gmail.com (Facundo Batista)
Date: Thu, 13 Sep 2007 16:19:02 -0300
Subject: [Python-Dev] Decimal news
Message-ID: <e04bdf310709131219n7dde0ef9q9aa69265b0d3affb@mail.gmail.com>

Hi people!

After some months, Decimal is now in the trunk again.

It's fully updated to the latest Cowlishaw specification and complies
with the latest test cases (from a few days ago, which even take into
consideration some feedback from us).

I want to thank Mark Dickinson very much; he did *a lot* of this
work, not only the math part (he's a mathematician himself), but also
a lot of cleaning and speeding up.

Now we will turn to the documentation, so that it is 100% OK
well before 2.6 arrives.

Py3 will come after that.

Regards,

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From g.brandl at gmx.net  Fri Sep 14 17:43:17 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 14 Sep 2007 17:43:17 +0200
Subject: [Python-Dev] Daily Windows Installers
In-Reply-To: <46EA7D6C.8010600@v.loewis.de>
References: <46EA7D6C.8010600@v.loewis.de>
Message-ID: <fcea66$a5i$1@sea.gmane.org>

Martin v. Löwis wrote:
> Together with David Bolen, I set up a series of buildbot
> slaves that create an MSI installer from the 2.5, 2.6,
> and 3.0 branches every day. The result files are available
> from
> 
> http://www.python.org/dev/daily-msi/
> 
> The buildbot pages themselves are at
> 
> http://www.python.org/dev/buildbot/msi/
> 
> There are still some glitches with that installation
> (in particular, the Microsoft help compiler seems to
> crash occasionally).

I hope this isn't due to the files that Sphinx creates.
I had a nasty crash with HTML Help Workshop when I generated
an "invalid" index file -- but this was reproducible of course.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From p.f.moore at gmail.com  Fri Sep 14 17:46:11 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 14 Sep 2007 16:46:11 +0100
Subject: [Python-Dev] Daily Windows Installers
In-Reply-To: <46EA7D6C.8010600@v.loewis.de>
References: <46EA7D6C.8010600@v.loewis.de>
Message-ID: <79990c6b0709140846g26a99c33hd39e60eb226edc1a@mail.gmail.com>

On 14/09/2007, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Together with David Bolen, I set up a series of buildbot
> slaves that create an MSI installer from the 2.5, 2.6,
> and 3.0 branches every day.

That's good news. Thanks for doing this.

Paul.

From rhamph at gmail.com  Fri Sep 14 18:33:09 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 14 Sep 2007 10:33:09 -0600
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<18153.15002.76898.448843@montanaro.dyndns.org>
	<46E9FDA8.6010303@canterbury.ac.nz>
	<aac2c7cb0709132310v69d2bd41w2cd093e6ae34863@mail.gmail.com>
	<2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com>
Message-ID: <aac2c7cb0709140933kdf28d40gb08debe3d3c5c81a@mail.gmail.com>

On 9/14/07, Justin Tulloss <tulloss2 at uiuc.edu> wrote:
>
> On 9/14/07, Adam Olsen <rhamph at gmail.com> wrote:
> > > Could be worth a try. A first step might be to just implement
> > > the atomic refcounting, and run that single-threaded to see
> > > if it has terribly bad effects on performance.
> >
> > I've done this experiment.  It was about 12% on my box.  Later, once I
> > had everything else set up so I could run two threads simultaneously, I
> > found much worse costs.  All those literals become shared objects that
> > create contention.
>
> It's hard to argue with cold hard facts when all we have is raw speculation.
> What do you think of a model where there is a global "thread count" that
> keeps track of how many threads reference an object? Then there are
> thread-specific reference counters for each object. When a thread's refcount
> goes to 0, it decrefs the object's thread count. If you did this right,
> hopefully there would only be cache updates when you update the thread
> count, which will only be when a thread first references an object and when
> it last references an object.
>
> I mentioned this idea earlier and it's growing on me. Since you've actually
> messed around with the code, do you think this would alleviate some of the
> contention issues?

There would be some poor worst-case behaviour.  In the case of
literals you'd start referencing them when you call a function, then
stop when the function returns.  Same for any shared data structure.

I think caching/buffering refcounts in general holds promise though.
My current approach uses a crude hash table as a cache and only
flushes when there's a collision or when the tracing GC starts up.  So
far I've only got about 50% of the normal performance, but that's with
90% or more scalability, and I'm hoping to keep improving it.
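
For anyone who wants a picture of the caching idea, here is a toy model
in Python. It is not my actual C code; the names (RefcountCache, adjust,
flush_all) and the fixed slot count are made up for illustration, and the
write-out step would need an atomic add or a lock in a real
implementation.

```python
class RefcountCache:
    """Per-thread cache of pending refcount deltas (toy model)."""

    def __init__(self, shared_counts, slots=8):
        self.shared = shared_counts      # dict: object key -> real refcount
        self.slots = [None] * slots      # each used slot: [key, pending_delta]
        self.flushes = 0                 # how often shared memory was written

    def adjust(self, key, delta):
        index = hash(key) % len(self.slots)
        entry = self.slots[index]
        if entry is not None and entry[0] != key:
            # Collision: write the old entry out before reusing the slot.
            self._write_out(entry)
            entry = None
        if entry is None:
            entry = self.slots[index] = [key, 0]
        entry[1] += delta

    def flush_all(self):
        # Stand-in for what happens when the tracing GC starts up.
        for index, entry in enumerate(self.slots):
            if entry is not None:
                self._write_out(entry)
                self.slots[index] = None

    def _write_out(self, entry):
        key, delta = entry
        if delta:
            # The only place that touches the shared refcount.
            self.shared[key] = self.shared.get(key, 0) + delta
            self.flushes += 1
```

Matching incref/decref pairs cancel out inside the cache and never reach
the shared count at all, which is the case that dominates for literals.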

-- 
Adam Olsen, aka Rhamphoryncus

From martin at v.loewis.de  Fri Sep 14 18:45:29 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 14 Sep 2007 18:45:29 +0200
Subject: [Python-Dev] Daily Windows Installers
In-Reply-To: <fcea66$a5i$1@sea.gmane.org>
References: <46EA7D6C.8010600@v.loewis.de> <fcea66$a5i$1@sea.gmane.org>
Message-ID: <46EABAA9.40407@v.loewis.de>

> I hope this isn't due to the files that Sphinx creates.
> I had a nasty crash with HTML Help Workshop when I generated
> an "invalid" index file -- but this was reproducible of course.

It's not clear what precisely the problem is, but yes, it
must have to do with the input :-) If you fixed that problem
fairly recently (within the last 48 hours), this may have been
the one we were seeing.

Unfortunately, this is again one of the Windows problems which
make buildbot on Windows so difficult: it brings up an error
window, and then hangs.

Regards,
Martin

From tonynelson at georgeanelson.com  Fri Sep 14 18:44:25 2007
From: tonynelson at georgeanelson.com (Tony Nelson)
Date: Fri, 14 Sep 2007 12:44:25 -0400
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<18153.15002.76898.448843@montanaro.dyndns.org>
	<46E9FDA8.6010303@canterbury.ac.nz>
	<aac2c7cb0709132310v69d2bd41w2cd093e6ae34863@mail.gmail.com>
	<2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com>
Message-ID: <p04330100c310567bfd17@[192.168.123.162]>

At 1:51 AM -0500 9/14/07, Justin Tulloss wrote:
>On 9/14/07, Adam Olsen <rhamph at gmail.com> wrote:
>
>> Could be worth a try. A first step might be to just implement
>> the atomic refcounting, and run that single-threaded to see
>> if it has terribly bad effects on performance.
>
>I've done this experiment.  It was about 12% on my box.  Later, once I
>had everything else set up so I could run two threads simultaneously, I
>found much worse costs.  All those literals become shared objects that
>create contention.
>
>
>It's hard to argue with cold hard facts when all we have is raw
>speculation. What do you think of a model where there is a global "thread
>count" that keeps track of how many threads reference an object? Then
>there are thread-specific reference counters for each object. When a
>thread's refcount goes to 0, it decrefs the object's thread count. If you
>did this right, hopefully there would only be cache updates when you
>update the thread count, which will only be when a thread first references
>an object and when it last references an object.

It's likely that cache line contention is the issue, so don't glom all the
different threads' refcounts for an object into one vector.  Keep each
thread's refcounts in a per-thread vector of objects, so only that thread
will cache that vector, or make refcounts so large that each will be in its
own cache line (usually 64 bytes, not too horrible for testing purposes).  I
don't know everything that would be required for separate vectors of
refcounts, but each object could contain its index into the vectors, which
would all be the same size (Go Virtual Memory!).
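
As a sketch of the layout only (Python lists say nothing about real cache
lines, and the names RefcountTable, capacity and slot are hypothetical),
the per-thread vectors might look like this:

```python
import threading


class RefcountTable:
    """One refcount vector per thread, all indexed by the same object slot."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.vectors = {}                # thread ident -> that thread's vector
        self.registry_lock = threading.Lock()

    def _my_vector(self):
        me = threading.get_ident()
        vector = self.vectors.get(me)
        if vector is None:
            # Rare slow path: register a vector for a new thread.
            with self.registry_lock:
                vector = self.vectors.setdefault(me, [0] * self.capacity)
        return vector

    def incref(self, slot):
        self._my_vector()[slot] += 1     # touches only this thread's vector

    def decref(self, slot):
        self._my_vector()[slot] -= 1

    def total(self, slot):
        # Summing across all threads is the expensive, infrequent part.
        with self.registry_lock:
            return sum(vector[slot] for vector in self.vectors.values())
```

Note that an individual thread's entry may go negative if an object is
referenced in one thread and released in another; only the sum across
threads is meaningful.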


>I mentioned this idea earlier and it's growing on me. Since you've
>actually messed around with the code, do you think this would alleviate
>some of the contention issues?
>
>Justin

Your idea can be combined with the maxint/2 initial refcount for
non-disposable objects, which should about eliminate thread-count updates
for them.
-- 
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson at georgeanelson.com>
      '                              <http://www.georgeanelson.com/>

From status at bugs.python.org  Fri Sep 14 19:36:49 2007
From: status at bugs.python.org (Tracker)
Date: Fri, 14 Sep 2007 17:36:49 +0000 (UTC)
Subject: [Python-Dev] Summary of Tracker Issues
Message-ID: <20070914173649.C02167815C@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (09/07/07 - 09/14/07)
Tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 1274 open (+24) / 11372 closed (+11) / 12646 total (+35)

Average duration of open issues: 672 days.
Median duration of open issues: 640 days.

Open Issues Breakdown
   open  1270 (+24)
pending     4 ( +0)

Issues Created Or Reopened (35)
_______________________________

OpenSSL detection broken for Python 3.0a1                        09/07/07
CLOSED http://bugs.python.org/issue1129    created  pythonmeister            

Idle - Save (buffer) - closes IDLE and does not save file (Windo 09/08/07
       http://bugs.python.org/issue1130    created  infixum                  

Reference Manual: "for statement" links to "break statement"     09/08/07
       http://bugs.python.org/issue1131    created  Martoon                  

compile error in poplib.py                                       09/08/07
CLOSED http://bugs.python.org/issue1132    created  andre                    

python3.0-config raises SyntaxError                              09/09/07
CLOSED http://bugs.python.org/issue1133    created  complex                  

Parsing a simple script eats all of your memory                  09/09/07
       http://bugs.python.org/issue1134    created  complex                  

xview/yview of Tix.Grid is broken                                09/09/07
       http://bugs.python.org/issue1135    created  ocean-city               

Bdb documentation                                                09/09/07
CLOSED http://bugs.python.org/issue1136    created  arklad                   

pyexpat patch for changing buffer_size                           09/09/07
       http://bugs.python.org/issue1137    created  AchimGaedke              

Fixer needed for __future__ imports                              09/09/07
       http://bugs.python.org/issue1138    created  collinwinter             

PyFile_Encoding should be PyFile_SetEncoding                     09/10/07
CLOSED http://bugs.python.org/issue1139    created  gagenellina              

re.sub returns str when processing empty unicode string          09/10/07
       http://bugs.python.org/issue1140    created  beda                     

reading large files                                              09/10/07
       http://bugs.python.org/issue1141    created  Richard.Christen at unice.fr

code sample showing errors reading large files with py 2.5/3.0   09/10/07
       http://bugs.python.org/issue1142    created  Richard.Christen at unice.fr

Update to latest ElementTree in Python 2.6                       09/11/07
       http://bugs.python.org/issue1143    created  effbot                   

parsermodule validation out of sync with Grammar                 09/11/07
       http://bugs.python.org/issue1144    created  dbinger                  

Allow str.join to join non-string types (as per PEP 3100)        09/11/07
       http://bugs.python.org/issue1145    created  thomas.lee               

TextWrap vs words 1-character shorter than the width             09/11/07
       http://bugs.python.org/issue1146    created  sam                      

string exceptions inconsistently deprecated/disabled             09/11/07
CLOSED http://bugs.python.org/issue1147    created  exarkun                  

TypeError on join - httplib mixing str and bytes                 09/11/07
CLOSED http://bugs.python.org/issue1148    created  eopadoan                 

fdopen does not work as expected                                 09/11/07
       http://bugs.python.org/issue1149    created  luis at luispedro.org       

Rename PyBUF_WRITEABLE to PyBUF_WRITABLE                         09/11/07
       http://bugs.python.org/issue1150    created  gvanrossum               

"TypeError: expected string, bytes found" instead of KeyboardInt 09/11/07
       http://bugs.python.org/issue1151    created  eopadoan                 

Bug in documentation for SimpleXMLRPCServer                      09/12/07
CLOSED http://bugs.python.org/issue1152    created  FrankMillman             

help(pickle) fails: unorderable types: type() < type()           09/12/07
CLOSED http://bugs.python.org/issue1153    created  Qrczak                   

Carbon.CF memory leak                                            09/12/07
CLOSED http://bugs.python.org/issue1154    created  hhas                     

Carbon.CF memory management problem                              09/12/07
       http://bugs.python.org/issue1155    created  hhas                     

Suggested change to _exit function description in os module docu 09/12/07
       http://bugs.python.org/issue1156    created  jtonsing                 

test_urllib2net fails on test_ftp                                09/12/07
       http://bugs.python.org/issue1157    created  gvanrossum               

%f format for datetime objects                                   09/13/07
       http://bugs.python.org/issue1158    created  skip.montanaro           

os.getenv() not updated after external module uses C putenv()    09/13/07
       http://bugs.python.org/issue1159    created  robert.ancell            

Medium size regexp crashes python                                09/13/07
       http://bugs.python.org/issue1160    created  ostkamp                  

Garbled chars in offending line of SyntaxError traceback         09/13/07
       http://bugs.python.org/issue1161    created  eopadoan                 

Python doesn't compile on Microsoft Visual Studio 2008 "Orcas" B 09/13/07
CLOSED http://bugs.python.org/issue1162    created  swaroopch                

Patch to make py3k/Lib/test/test_thread.py use unittest          09/13/07
       http://bugs.python.org/issue1163    created  JonoDiCarlo              



Issues Now Closed (33)
______________________

Backport ABC to 2.6                                               16 days
       http://bugs.python.org/issue1026    baranguren               

[py3k] pdb does not work in python 3000                           16 days
       http://bugs.python.org/issue1038    georg.brandl             

test_cmd_line starts python without -E                            13 days
       http://bugs.python.org/issue1056    ncoghlan                 

ssl.py shouldn't change class names from 2.6 to 3.x               11 days
       http://bugs.python.org/issue1065    janssen                  

TypeError in poplib.py                                             7 days
       http://bugs.python.org/issue1094    gvanrossum               

make install failed                                                4 days
       http://bugs.python.org/issue1095    georg.brandl             

Deeply recursive repr segfault                                     7 days
       http://bugs.python.org/issue1096    brett.cannon             

2to3,   lambda with non-tuple argument inside parenthesis          4 days
       http://bugs.python.org/issue1107    collinwinter             

"make altinstall" installs pydoc, idle, smtpd.py with broken sh    6 days
       http://bugs.python.org/issue1120    georg.brandl             

Document inspect.getfullargspec()                                  6 days
       http://bugs.python.org/issue1121    georg.brandl             

PyTuple_Size and PyTuple_GET_SIZE return type documentation inc    6 days
       http://bugs.python.org/issue1122    georg.brandl             

bytes.split shold have same interface as str.split, or differen    4 days
       http://bugs.python.org/issue1125    gvanrossum               

OpenSSL detection broken for Python 3.0a1                          0 days
       http://bugs.python.org/issue1129    georg.brandl             

compile error in poplib.py                                         1 days
       http://bugs.python.org/issue1132    georg.brandl             

python3.0-config raises SyntaxError                                0 days
       http://bugs.python.org/issue1133    loewis                   

Bdb documentation                                                  3 days
       http://bugs.python.org/issue1136    georg.brandl             

PyFile_Encoding should be PyFile_SetEncoding                       3 days
       http://bugs.python.org/issue1139    georg.brandl             

string exceptions inconsistently deprecated/disabled               0 days
       http://bugs.python.org/issue1147    brett.cannon             

TypeError on join - httplib mixing str and bytes                   1 days
       http://bugs.python.org/issue1148    gvanrossum               

Bug in documentation for SimpleXMLRPCServer                        1 days
       http://bugs.python.org/issue1152    georg.brandl             

help(pickle) fails: unorderable types: type() < type()             0 days
       http://bugs.python.org/issue1153    georg.brandl             

Carbon.CF memory leak                                              0 days
       http://bugs.python.org/issue1154    georg.brandl             

Python doesn't compile on Microsoft Visual Studio 2008 "Orcas"     1 days
       http://bugs.python.org/issue1162    georg.brandl             

time mod's timezone doesn't honor TZ var                        2114 days
       http://bugs.python.org/issue487331  brett.cannon             

asyncore file wrapper & os.error                                1988 days
       http://bugs.python.org/issue539444  brett.cannon             

support for server side transactions in _ssl                    1498 days
       http://bugs.python.org/issue783188  loewis                   

class property fset not working                                  842 days
       http://bugs.python.org/issue1207379 georg.brandl             

Traceback error when compiling Regex                             537 days
       http://bugs.python.org/issue1456280 brett.cannon             

NNTPS support in nntplib                                         402 days
       http://bugs.python.org/issue1535659 janssen                  

SSL "issuer" and "server" names cannot be parsed                 321 days
       http://bugs.python.org/issue1583946 janssen                  

Suggest a textlist() method for ElementTree                      291 days
       http://bugs.python.org/issue1602189 effbot                   

socket.error exceptions not subclass of StandardError            138 days
       http://bugs.python.org/issue1706815 gregory.p.smith          

Binding <Control-space> fails                                     26 days
       http://bugs.python.org/issue1774736 loewis                   



Top Issues Most Discussed (10)
______________________________

 11 re.sub returns str when processing empty unicode string            4 days
open    http://bugs.python.org/issue1140   

  9 os.getenv() not updated after external module uses C putenv()      1 days
open    http://bugs.python.org/issue1159   

  8 code sample showing errors reading large files with py 2.5/3.0     4 days
open    http://bugs.python.org/issue1142   

  8 reading large files                                                4 days
open    http://bugs.python.org/issue1141   

  6 Allow str.join to join non-string types (as per PEP 3100)          3 days
open    http://bugs.python.org/issue1145   

  5 %f format for datetime objects                                     2 days
open    http://bugs.python.org/issue1158   

  5 Parsing a simple script eats all of your memory                    6 days
open    http://bugs.python.org/issue1134   

  5 bytes.split shold have same interface as str.split, or differen    4 days
closed  http://bugs.python.org/issue1125   

  4 logging: delay_fh option and configuration kwargs                 44 days
open    http://bugs.python.org/issue1765140

  4 Python SEGFAULT on tuple.__repr__ and str()                      176 days
pending http://bugs.python.org/issue1686386




From barry at python.org  Fri Sep 14 20:34:44 2007
From: barry at python.org (Barry Warsaw)
Date: Fri, 14 Sep 2007 14:34:44 -0400
Subject: [Python-Dev] base64 -- should b64encode introduce line breaks?
In-Reply-To: <07Sep13.114308pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com>
	<07Sep13.114308pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <A11015C5-2F58-426F-BAC5-A71DE16C8416@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 13, 2007, at 2:43 PM, Bill Janssen wrote:

>> I see that base64.b64encode and base64.standard_b64encode no longer
>> introduce line breaks into the output strings, as base64.encodestring
>> does.  Shouldn't there be an option on one of them to do this?
>
> See:
>
> http://mail.python.org/pipermail/python-bugs-list/2001-October/007856.html
>
> section 2.1 of http://www.faqs.org/rfcs/rfc3548.html
>
> Perhaps adding MIME_b64encode() and PEM_b64encode() routines?  Or just
> an optional parameter to standard_b64encode, called "max_line_length",
> defaulting to 0, meaning no max?

It turns out to be inconvenient in other contexts to do the line  
splitting at this lower level, so I would prefer to leave the current  
methods as is (that means, no change in semantics or arguments).

I wouldn't necessarily be opposed to new functions that did the line  
splitting, but ideally, you could design an API that provided that  
behavior for any of the existing alternatives in the base64 module,  
without duplicating them all.  It's not clear to me how you'd do that  
though (or if it's worth it).

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCUAwUBRurURHEjvBPtnXfVAQLSZAP4pdn3lUPvVfqSl+RT4GBzYmL1uUTMrmJx
+lc+7SEaOj0sphfQbTmN9kKlwS2cJQ7UdZQzXM6t5+zlM+b4GRl6pA0CEk/M3PUI
VWs3JkxgMRQA0CoeF5AflLru7ZxEL7pYej88y9KPAZCQ7H6e0+b8TCr/6Qj0YiYw
c2eLfZoSAA==
=klKj
-----END PGP SIGNATURE-----

From tulloss2 at uiuc.edu  Fri Sep 14 21:13:47 2007
From: tulloss2 at uiuc.edu (Justin Tulloss)
Date: Fri, 14 Sep 2007 14:13:47 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <p04330100c310567bfd17@192.168.123.162>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<18153.15002.76898.448843@montanaro.dyndns.org>
	<46E9FDA8.6010303@canterbury.ac.nz>
	<aac2c7cb0709132310v69d2bd41w2cd093e6ae34863@mail.gmail.com>
	<2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com>
	<p04330100c310567bfd17@192.168.123.162>
Message-ID: <2cfeb93c0709141213t6727efack5db90dc706e2b95f@mail.gmail.com>

> Your idea can be combined with the maxint/2 initial refcount for
> non-disposable objects, which should about eliminate thread-count updates
> for them.

I don't really like the maxint/2 idea because it requires us to
differentiate between globals and everything else. Plus, it's a hack. I'd
like a more elegant solution if possible.

Justin

From janssen at parc.com  Fri Sep 14 21:20:32 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 14 Sep 2007 12:20:32 PDT
Subject: [Python-Dev] base64 -- should b64encode introduce line breaks?
In-Reply-To: <A11015C5-2F58-426F-BAC5-A71DE16C8416@python.org> 
References: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com>
	<07Sep13.114308pdt."57996"@synergy1.parc.xerox.com>
	<A11015C5-2F58-426F-BAC5-A71DE16C8416@python.org>
Message-ID: <07Sep14.122040pdt."57996"@synergy1.parc.xerox.com>

> >> I see that base64.b64encode and base64.standard_b64encode no longer
> >> introduce line breaks into the output strings, as base64.encodestring
> >> does.  Shouldn't there be an option on one of them to do this?
> >
> > See:
> >
> > http://mail.python.org/pipermail/python-bugs-list/2001-October/ 
> > 007856.html
> >
> > section 2.1 of http://www.faqs.org/rfcs/rfc3548.html
> >
> > Perhaps adding MIME_b64encode() and PEM_b64encode() routines?  Or just
> > an optional parameter to standard_b64encode, called "max_line_length",
> > defaulting to 0, meaning no max?
> 
> It turns out to be inconvenient in other contexts to do the line  
> splitting at this lower level, so I would prefer to leave the current  
> methods as is (that means, no change in semantics or arguments).
> 
> I wouldn't necessarily be opposed to new functions that did the line  
> splitting, but ideally, you could design an API that provided that  
> behavior for any of the existing alternatives in the base64 module,  
> without duplicating them all.  It's not clear to me how you'd do that  
> though (or if it's worth it).

I think that's probably right.  I just added the PEM line-wrapping to
the code in the ssl module.  Though I hate to keep adding
line-wrapping code here and there...  Perhaps just adding a utility
function, wrap_lines(), or some such to the module would suffice.

Bill

From db3l.net at gmail.com  Fri Sep 14 21:26:25 2007
From: db3l.net at gmail.com (David Bolen)
Date: Fri, 14 Sep 2007 15:26:25 -0400
Subject: [Python-Dev] Daily Windows Installers
References: <46EA7D6C.8010600@v.loewis.de> <fcea66$a5i$1@sea.gmane.org>
Message-ID: <m2zlzpnj4u.fsf@valheru.db3l.homeip.net>

Georg Brandl <g.brandl at gmx.net> writes:

> I hope this isn't due to the files that Sphinx creates.
> I had a nasty crash with HTML Help Workshop when I generated
> an "invalid" index file -- but this was reproducible of course.

The really annoying thing is that this only occurs (so far) in the 3.0
tree when run beneath the buildbot, although it seems consistent
there.  Using the same tree right after a crash, and running the same
build command interactively always seems to work fine.  I thought it
might be a stdout/console thing but redirecting the compiler's output
to a file still crashes.

I think, but can't prove, that it has parsed all the input files, since the
last bit of output even in verbose mode is still buffered in its
process when it crashes.

I did determine that genindex.html is being created with malformed
HTML (< and > in operators aren't being quoted as &lt; and &gt;), but
manually fixing that didn't resolve the crash.  And even in the 2.6
branch (which builds fine) genindex.html has erroneous uses of
"<protocol>" that aren't quoted either.
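
As a minimal illustration of the quoting that is missing, here is a sketch
using html.escape from the standard library of current Python 3. The sample
index entries are made up, and this is not necessarily the call Sphinx
itself uses when it writes genindex.html.

```python
from html import escape

# Hypothetical index entries of the kind that end up in genindex.html.
entries = ["<protocol>", "operator <", "operator >="]

# escape() replaces <, > and & with character references so the text
# is not parsed as markup by an HTML consumer.
quoted = [escape(entry) for entry in entries]

for raw, safe in zip(entries, quoted):
    print(raw, "->", safe)
```

An entry that passes through such a step reaches the help compiler as
text rather than as an unknown tag.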

For the moment I'm probably going to work to ensure we don't get the
pop-up box (which blocks the rest of the processing) so at least an
MSI can get created even if the chm is bad.

-- David


From exarkun at divmod.com  Fri Sep 14 21:30:49 2007
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Fri, 14 Sep 2007 15:30:49 -0400
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709141213t6727efack5db90dc706e2b95f@mail.gmail.com>
Message-ID: <20070914193049.8162.648500711.divmod.quotient.8674@ohm>

On Fri, 14 Sep 2007 14:13:47 -0500, Justin Tulloss <tulloss2 at uiuc.edu> wrote:
>Your idea can be combined with the maxint/2 initial refcount for
>> non-disposable objects, which should about eliminate thread-count updates
>> for them.
>> --
>>
>
> I don't really like the maxint/2 idea because it requires us to
>differentiate between globals and everything else. Plus, it's a hack. I'd
>like a more elegant solution if possible.

It's not really a solution either.  If your program runs for a couple
minutes and then exits, maybe it won't trigger some catastrophic behavior
from this hack, but if you have a long running process then you're almost
certain to be screwed over by this (it wouldn't even have to be *very*
long running - a month or two could do it on a 32bit platform).

Jean-Paul

From janssen at parc.com  Fri Sep 14 21:36:22 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 14 Sep 2007 12:36:22 PDT
Subject: [Python-Dev] SSL and asyncore update (and SSL_shutdown)
Message-ID: <07Sep14.123630pdt."57996"@synergy1.parc.xerox.com>

I've now got an HTTPS server (more importantly, one built on asyncore
and SocketServer and BaseHTTPServer), running in the test suite.

Also, I think that, for the moment, I'm going to take
ssl.ssl_shutdown() out of the library.  The state machine implemented
at the GoogleSprint really only does the client side of
SSL_shutdown(); it will take a bit more work for the server side.

I'll check this in this weekend, and update the 2.3-compatible package.

Bill

From barry at python.org  Fri Sep 14 21:54:23 2007
From: barry at python.org (Barry Warsaw)
Date: Fri, 14 Sep 2007 15:54:23 -0400
Subject: [Python-Dev] base64 -- should b64encode introduce line breaks?
In-Reply-To: <07Sep14.122040pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com>
	<07Sep13.114308pdt."57996"@synergy1.parc.xerox.com>
	<A11015C5-2F58-426F-BAC5-A71DE16C8416@python.org>
	<07Sep14.122040pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <5331CD54-34F4-41D9-8731-D7E0E4CF96C0@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 14, 2007, at 3:20 PM, Bill Janssen wrote:

> I think that's probably right.  I just added the PEM line-wrapping to
> the code in the ssl module.  Though I hate to keep adding
> line-wrapping code here and there...  Perhaps just adding a utility
> function, wrap_lines(), or some such to the module would suffice.

Does anything in textwrap already do the trick?  If not, that might  
be the best place to refactor similar code to.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRurm8HEjvBPtnXfVAQIGFwP9G/hYbZ/cR7n9X4NtKATFqm/Mp+q8SH3b
jFUEuvf/y0/0Ri6aKpC9QJzLNg+ZlgthmaYNRT488SXPplbB4mtysFbJg+A9x3d3
fi4rkqXnrvJt6Msqbti7wt6sGYZRisDveztuKM5Sh8t+die+55e3bZg7ght6Vyuk
+N6V9lg2/3A=
=iknB
-----END PGP SIGNATURE-----

From steve at holdenweb.com  Fri Sep 14 21:58:57 2007
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 14 Sep 2007 15:58:57 -0400
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <20070914193049.8162.648500711.divmod.quotient.8674@ohm>
References: <2cfeb93c0709141213t6727efack5db90dc706e2b95f@mail.gmail.com>
	<20070914193049.8162.648500711.divmod.quotient.8674@ohm>
Message-ID: <fcep65$2fo$1@sea.gmane.org>

Jean-Paul Calderone wrote:
> On Fri, 14 Sep 2007 14:13:47 -0500, Justin Tulloss <tulloss2 at uiuc.edu> wrote:
>> Your idea can be combined with the maxint/2 initial refcount for
>>> non-disposable objects, which should about eliminate thread-count updates
>>> for them.
>>> --
>>>
>> I don't really like the maxint/2 idea because it requires us to
>> differentiate between globals and everything else. Plus, it's a hack. I'd
>> like a more elegant solution if possible.
> 
> It's not really a solution either.  If your program runs for a couple
> minutes and then exits, maybe it won't trigger some catastrophic behavior
> from this hack, but if you have a long running process then you're almost
> certain to be screwed over by this (it wouldn't even have to be *very*
> long running - a month or two could do it on a 32bit platform).
> 
Could each class define the value to be added to or subtracted from the 
refcount? We'd only need a bit to store the value (since it would always 
be zero or one), but the execution time might increase quite a lot if 
there's no nifty way to conditionally add or subtract one.

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd           http://www.holdenweb.com
Skype: holdenweb      http://del.icio.us/steve.holden

Sorry, the dog ate my .sigline


From g.brandl at gmx.net  Fri Sep 14 22:37:47 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 14 Sep 2007 22:37:47 +0200
Subject: [Python-Dev] Daily Windows Installers
In-Reply-To: <m2zlzpnj4u.fsf@valheru.db3l.homeip.net>
References: <46EA7D6C.8010600@v.loewis.de> <fcea66$a5i$1@sea.gmane.org>
	<m2zlzpnj4u.fsf@valheru.db3l.homeip.net>
Message-ID: <fcereb$bei$1@sea.gmane.org>

David Bolen schrieb:
> Georg Brandl <g.brandl at gmx.net> writes:
> 
>> I hope this isn't due to the files that Sphinx creates.
>> I had a nasty crash with HTML Help Workshop when I generated
>> an "invalid" index file -- but this was reproducible of course.
> 
> The really annoying thing is that this only occurs (so far) in the 3.0
> tree when run beneath the buildbot, although it seems consistent
> there.  Using the same tree right after a crash, and running the same
> build command interactively always seems to work fine.  I thought it
> might be a stdout/console thing but redirecting the compiler's output
> to a file still crashes.
> 
> I think, but can't prove, that it has parsed all the input files, since the
> last bit of output even in verbose mode is still buffered in its
> process when it crashes.

Can't help you there; I just notice that this is the same point where
I saw "my" crash.

> I did determine that genindex.html is being created with malformed
> HTML (< and > in operators aren't being quoted as &lt; and &gt;), but
> manually fixing that didn't resolve the crash.  And even in the 2.6
> branch (which builds fine) genindex.html has erroneous uses of
> "<protocol>" that aren't quoted either.

Okay, I should really fix this. Added a todo-list item.

> For the moment I'm probably going to work to ensure we don't get the
> pop-up box (which blocks the rest of the processing) so at least an
> MSI can get created even if the chm is bad.

Thanks for handling this!

Georg


-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From tomerfiliba at gmail.com  Fri Sep 14 22:30:20 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Fri, 14 Sep 2007 20:30:20 -0000
Subject: [Python-Dev] import file extensions
Message-ID: <1189801820.747989.270020@d55g2000hsg.googlegroups.com>

a quick question: i'm working on a pythonic build system, where the build
scripts are plain python files. but i want to differentiate them from normal
python files (.py) by a different suffix (say .pyy), but then i can't import
them.

so i'm wondering, is there a quick way to just add another extension to
import mechanism? or do i have to write a fully fledged import hook?


-tomer


From guido at python.org  Fri Sep 14 23:11:41 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 14 Sep 2007 14:11:41 -0700
Subject: [Python-Dev] import file extensions
In-Reply-To: <1189801820.747989.270020@d55g2000hsg.googlegroups.com>
References: <1189801820.747989.270020@d55g2000hsg.googlegroups.com>
Message-ID: <ca471dc20709141411l7f9a02f6paabca56ced3cea06@mail.gmail.com>

I think you're looking for a PEP 302 style meta hook.
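
A rough sketch of what such a meta hook could look like, written against
the importlib API of current Python 3 rather than the 2007-era PEP 302
protocol. The class name PyyFinder, the temporary directory and the module
name demo_build_script are all made up for illustration, and the finder
handles top-level modules only.

```python
import importlib
import importlib.abc
import importlib.machinery
import importlib.util
import os
import sys
import tempfile


class PyyFinder(importlib.abc.MetaPathFinder):
    """Meta path finder that maps top-level module names to <name>.pyy files."""

    def __init__(self, directory):
        self.directory = directory

    def find_spec(self, fullname, path=None, target=None):
        if "." in fullname:
            return None  # this sketch ignores packages and submodules
        candidate = os.path.join(self.directory, fullname + ".pyy")
        if not os.path.isfile(candidate):
            return None  # not ours; let the next finder try
        loader = importlib.machinery.SourceFileLoader(fullname, candidate)
        return importlib.util.spec_from_loader(fullname, loader)


# Demo: write a throwaway build script and import it by name.
script_dir = tempfile.mkdtemp()
with open(os.path.join(script_dir, "demo_build_script.pyy"), "w") as f:
    f.write("TARGETS = ['all', 'clean']\n")

# Appended last, so ordinary .py modules still win over .pyy files.
sys.meta_path.append(PyyFinder(script_dir))
demo_build_script = importlib.import_module("demo_build_script")
print(demo_build_script.TARGETS)
```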

On 9/14/07, tomer filiba <tomerfiliba at gmail.com> wrote:
> a quick question: i'm working on a pythonic build system, where the build
> scripts are plain python files. but i want to differentiate them from normal
> python files (.py) by a different suffix (say .pyy), but then i can't import
> them.
>
> so i'm wondering, is there a quick way to just add another extension to
> import mechanism? or do i have to write a fully fledged import hook?
>
>
> -tomer


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Fri Sep 14 23:19:49 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 14 Sep 2007 23:19:49 +0200
Subject: [Python-Dev] import file extensions
In-Reply-To: <1189801820.747989.270020@d55g2000hsg.googlegroups.com>
References: <1189801820.747989.270020@d55g2000hsg.googlegroups.com>
Message-ID: <46EAFAF5.8030007@v.loewis.de>

> so i'm wondering, is there a quick way to just add another extension 
> to import mechanism? or do i have to write a fully fledged import
> hook?

[this question is off-topic for python-dev]

If recompiling Python is an option, the quick way is to edit the
interpreter, and add that extension.

If that is not an option, but it is an option to put all .pyy files
in a single directory, the quick way is to add an entry to
sys.path_hooks.

If that is also not an option, the quick way is to add an entry to
sys.meta_path.

The best way would be to not use import, but provide a separate
function (e.g. calling it "require").
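
A minimal sketch of the last option, assuming current Python 3 and its
importlib module. Only the name require comes from the suggestion above;
its signature, the sample file rules.pyy and its contents are hypothetical.

```python
import importlib.machinery
import importlib.util
import os
import tempfile


def require(path):
    """Execute the Python source file at `path` and return it as a module."""
    name = os.path.splitext(os.path.basename(path))[0]
    loader = importlib.machinery.SourceFileLoader(name, path)
    spec = importlib.util.spec_from_loader(name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)  # runs the file; sys.modules is left alone
    return module


# Demo with a throwaway .pyy file.
script_path = os.path.join(tempfile.mkdtemp(), "rules.pyy")
with open(script_path, "w") as f:
    f.write("def targets():\n    return ['all', 'clean']\n")

rules = require(script_path)
print(rules.targets())
```

Because the function takes a path and does not register anything in
sys.modules, build scripts loaded this way stay separate from ordinary
imports, which is the point of using a different suffix.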

Regards,
Martin


From fuzzyman at voidspace.org.uk  Fri Sep 14 23:17:06 2007
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 14 Sep 2007 22:17:06 +0100
Subject: [Python-Dev] [python] Re:  import file extensions
In-Reply-To: <ca471dc20709141411l7f9a02f6paabca56ced3cea06@mail.gmail.com>
References: <1189801820.747989.270020@d55g2000hsg.googlegroups.com>
	<ca471dc20709141411l7f9a02f6paabca56ced3cea06@mail.gmail.com>
Message-ID: <46EAFA52.8020103@voidspace.org.uk>

Guido van Rossum wrote:
> I think you're looking for a PEP 302 style meta hook.
>   

Or even execfile in a context...

Michael Foord
http://www.manning.com/foord
> On 9/14/07, tomer filiba <tomerfiliba at gmail.com> wrote:
>   
>> a quick question: i'm working on a pythonic build system, where the build
>> scripts are plain python files. but i want to differentiate them from normal
>> python files (.py) by a different suffix (say .pyy), but then i can't import
>> them.
>>
>> so i'm wondering, is there a quick way to just add another extension to
>> import mechanism? or do i have to write a fully fledged import hook?
>>
>>
>> -tomer


From tomerfiliba at gmail.com  Fri Sep 14 23:34:16 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Fri, 14 Sep 2007 23:34:16 +0200
Subject: [Python-Dev] import file extensions
In-Reply-To: <46EAFAF5.8030007@v.loewis.de>
References: <1189801820.747989.270020@d55g2000hsg.googlegroups.com>
	<46EAFAF5.8030007@v.loewis.de>
Message-ID: <1d85506f0709141434t4bc4ec89ub43bc66dde81f897@mail.gmail.com>

On 9/14/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> The best way would be to not use import, but provide a separate
> function (e.g. calling it "require").
>

yepp, that's probably the cleanest and quickest solution. i needed
to see all the  alternatives to realize this though. sorry.


-- 
An NCO and a Gentleman

From db3l.net at gmail.com  Fri Sep 14 23:51:26 2007
From: db3l.net at gmail.com (David Bolen)
Date: Fri, 14 Sep 2007 17:51:26 -0400
Subject: [Python-Dev] Daily Windows Installers
References: <46EA7D6C.8010600@v.loewis.de> <fcea66$a5i$1@sea.gmane.org>
	<m2zlzpnj4u.fsf@valheru.db3l.homeip.net>
	<fcereb$bei$1@sea.gmane.org>
Message-ID: <m2veacoqzl.fsf@valheru.db3l.homeip.net>

Georg Brandl <g.brandl at gmx.net> writes:

> David Bolen schrieb:
>> Georg Brandl <g.brandl at gmx.net> writes:
(...)
>> For the moment I'm probably going to work to ensure we don't get the
>> pop-up box (which blocks the rest of the processing) so at least an
>> MSI can get created even if the chm is bad.
>
> Thanks for handling this!

I hit it with a sledge-hammer and modified my build slave to disable
error boxes for anything it runs, so we'll get the 3.0 MSI now but
with a bad chm until it gets figured out.

-- David


From exarkun at divmod.com  Fri Sep 14 23:59:06 2007
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Fri, 14 Sep 2007 17:59:06 -0400
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <B4F514BD-06DD-4CD1-B4BC-0D6706BA0BB4@fuhm.net>
Message-ID: <20070914215906.8162.2036179390.divmod.quotient.8742@ohm>

On Fri, 14 Sep 2007 17:43:39 -0400, James Y Knight <foom at fuhm.net> wrote:
>
>On Sep 14, 2007, at 3:30 PM, Jean-Paul Calderone wrote:
>>On Fri, 14 Sep 2007 14:13:47 -0500, Justin Tulloss  <tulloss2 at uiuc.edu> 
>>wrote:
>>>Your idea can be combined with the maxint/2 initial refcount for
>>>>non-disposable objects, which should about eliminate thread-count 
>>>>updates
>>>>for them.
>>>>--
>>>
>>>I don't really like the maxint/2 idea because it requires us to
>>>differentiate between globals and everything else. Plus, it's a  hack. I'd
>>>like a more elegant solution if possible.
>>
>>It's not really a solution either.  If your program runs for a couple
>>minutes and then exits, maybe it won't trigger some catastrophic  behavior
>>from this hack, but if you have a long running process then you're  almost
>>certain to be screwed over by this (it wouldn't even have to be *very*
>>long running - a month or two could do it on a 32bit platform).
>
>Not true: the refcount becoming 0 only calls a dealloc function. For
>objects which are not deletable, the dealloc function should simply set the
>refcount back to maxint/2. Done.
>

So, e.g., replace the Py_FatalError in none_dealloc with an assignment to
ob_refcnt?  Good point, sounds like it could work (I'm pretty sure you
know more about deallocation in CPython than I :).
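
A toy model of that idea in pure Python, only to show the control flow.
It is not CPython code: ToyObject, IMMORTAL_START and the method names are
invented here, and sys.maxsize // 2 merely stands in for "maxint/2".

```python
import sys

IMMORTAL_START = sys.maxsize // 2   # stands in for "maxint/2"


class ToyObject:
    """Object with an explicit refcount and a dealloc hook."""

    def __init__(self, immortal=False):
        self.immortal = immortal
        self.refcnt = IMMORTAL_START if immortal else 1
        self.freed = False
        self.dealloc_calls = 0

    def decref(self):
        self.refcnt -= 1
        if self.refcnt == 0:
            self.dealloc()

    def dealloc(self):
        self.dealloc_calls += 1
        if self.immortal:
            # Refill the counter instead of freeing (or aborting).
            self.refcnt = IMMORTAL_START
        else:
            self.freed = True


# Pretend unbalanced decrefs have nearly drained an immortal object.
none_like = ToyObject(immortal=True)
none_like.refcnt = 3
for _ in range(3):
    none_like.decref()

# An ordinary object is freed when its only reference goes away.
temp = ToyObject()
temp.decref()

print(none_like.dealloc_calls, none_like.freed, temp.freed)
```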

Jean-Paul

From janssen at parc.com  Sat Sep 15 00:01:49 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 14 Sep 2007 15:01:49 PDT
Subject: [Python-Dev] base64 -- should b64encode introduce line breaks?
In-Reply-To: <5331CD54-34F4-41D9-8731-D7E0E4CF96C0@python.org> 
References: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com>
	<07Sep13.114308pdt."57996"@synergy1.parc.xerox.com>
	<A11015C5-2F58-426F-BAC5-A71DE16C8416@python.org>
	<07Sep14.122040pdt."57996"@synergy1.parc.xerox.com>
	<5331CD54-34F4-41D9-8731-D7E0E4CF96C0@python.org>
Message-ID: <07Sep14.150150pdt."57996"@synergy1.parc.xerox.com>

> Does anything in textwrap already do the trick?  If not, that might  
> be the best place to refactor similar code to.

Yes, textwrap.fill.  Thanks for pointing it out.
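
For illustration, a small sketch of that combination in current Python 3:
b64encode produces one long line and textwrap.fill breaks it up. The
sample payload is arbitrary, and 64 is used as the line width because that
is the width commonly used for PEM; neither is taken from the ssl module
code mentioned in this thread.

```python
import base64
import textwrap

payload = bytes(range(256))          # arbitrary sample data

# b64encode returns a single unbroken line.
encoded = base64.b64encode(payload).decode("ascii")

# The encoded text contains no whitespace, so fill() treats it as one
# long word and breaks it at exactly the requested width.
wrapped = textwrap.fill(encoded, 64)
lines = wrapped.split("\n")

print(wrapped)

# b64decode skips the newlines by default, so the data round-trips.
decoded = base64.b64decode(wrapped)
```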

Bill

From rhamph at gmail.com  Sat Sep 15 00:21:57 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 14 Sep 2007 16:21:57 -0600
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <20070914215906.8162.2036179390.divmod.quotient.8742@ohm>
References: <B4F514BD-06DD-4CD1-B4BC-0D6706BA0BB4@fuhm.net>
	<20070914215906.8162.2036179390.divmod.quotient.8742@ohm>
Message-ID: <aac2c7cb0709141521w43e1e059xc796c286ebc647a0@mail.gmail.com>

On 9/14/07, Jean-Paul Calderone <exarkun at divmod.com> wrote:
> On Fri, 14 Sep 2007 17:43:39 -0400, James Y Knight <foom at fuhm.net> wrote:
> >
> >On Sep 14, 2007, at 3:30 PM, Jean-Paul Calderone wrote:
> >>On Fri, 14 Sep 2007 14:13:47 -0500, Justin Tulloss  <tulloss2 at uiuc.edu>
> >>wrote:
> >>>Your idea can be combined with the maxint/2 initial refcount for
> >>>>non-disposable objects, which should about eliminate thread-count
> >>>>updates
> >>>>for them.
> >>>>--
> >>>
> >>>I don't really like the maxint/2 idea because it requires us to
> >>>differentiate between globals and everything else. Plus, it's a  hack. I'd
> >>>like a more elegant solution if possible.
> >>
> >>It's not really a solution either.  If your program runs for a couple
> >>minutes and then exits, maybe it won't trigger some catastrophic  behavior
> >>from this hack, but if you have a long running process then you're  almost
> >>certain to be screwed over by this (it wouldn't even have to be *very*
> >>long running - a month or two could do it on a 32bit platform).
> >
> >Not true: the refcount becoming 0 only calls a dealloc function.. For
> >objects which are not deletable, the dealloc function should simply  set the
> >refcount back to maxint/2. Done.
> >
>
> So, eg, replace the Py_FatalError in none_dealloc with an assignment to
> ob_refcnt?  Good point, sounds like it could work (I'm pretty sure you
> know more about deallocation in CPython than I :).

As I've said, this is all moot.  The cache coherence protocols on x86
mean this will be nearly as slow as proper atomic refcounting, and
will not scale if multiple threads regularly touch the object.  My
experience is that they will touch it regularly.

-- 
Adam Olsen, aka Rhamphoryncus

From greg.ewing at canterbury.ac.nz  Sat Sep 15 00:23:39 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 15 Sep 2007 10:23:39 +1200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>
	<fcb2cc$65i$1@sea.gmane.org> <200709131219.21152.nd@perlig.de>
	<20070913105527.GH32061@snowy.squish.net>
	<18153.15002.76898.448843@montanaro.dyndns.org>
	<46E9FDA8.6010303@canterbury.ac.nz>
	<aac2c7cb0709132310v69d2bd41w2cd093e6ae34863@mail.gmail.com>
	<2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com>
Message-ID: <46EB09EB.7070600@canterbury.ac.nz>

Justin Tulloss wrote:
> 
> What do you think of a model where there is a global 
> "thread count" that keeps track of how many threads reference an object? 

I've thought about that sort of thing before. The problem
is how you keep track of how many threads reference an
object, without introducing far more overhead than
you're trying to eliminate.

> Then there are thread-specific reference counters for each object.

What happens when a new thread comes into existence? Do
you go through all existing objects and add another element
to their refcount arrays?

--
Greg

From barry at python.org  Sat Sep 15 00:37:57 2007
From: barry at python.org (Barry Warsaw)
Date: Fri, 14 Sep 2007 18:37:57 -0400
Subject: [Python-Dev] base64 -- should b64encode introduce line breaks?
In-Reply-To: <07Sep14.150150pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com>
	<07Sep13.114308pdt."57996"@synergy1.parc.xerox.com>
	<A11015C5-2F58-426F-BAC5-A71DE16C8416@python.org>
	<07Sep14.122040pdt."57996"@synergy1.parc.xerox.com>
	<5331CD54-34F4-41D9-8731-D7E0E4CF96C0@python.org>
	<07Sep14.150150pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <29E110FC-FC99-40C6-8775-085B29EEC0A9@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 14, 2007, at 6:01 PM, Bill Janssen wrote:

>> Does anything in textwrap already do the trick?  If not, that might
>> be the best place to refactor similar code to.
>
> Yes, textwrap.fill.  Thanks for pointing it out.

/me tries to remember that for Py3k's email package. ;)

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRusNRnEjvBPtnXfVAQIpNgP/SosDYX/GDMolxcv3U2WrGzMjQa+gd8ai
J/Oaw2vdSf8H84eU9ziKaWHQtK0obS9XrnUTLUDyfAKObNffZVvldG1KUV9vAKhr
3JuNJ3xiIk7RKXdkKd5mA7SXXqRd80NVN26Za0H8bkl16mhdpZM7OqJmhaIkCkXr
AJtjJP5esWQ=
=v/C8
-----END PGP SIGNATURE-----

From tonynelson at georgeanelson.com  Sat Sep 15 00:55:01 2007
From: tonynelson at georgeanelson.com (Tony Nelson)
Date: Fri, 14 Sep 2007 18:55:01 -0400
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <20070914193049.8162.648500711.divmod.quotient.8674@ohm>
References: <20070914193049.8162.648500711.divmod.quotient.8674@ohm>
Message-ID: <p04330102c310be90745d@[192.168.123.162]>

At 3:30 PM -0400 9/14/07, Jean-Paul Calderone wrote:
>On Fri, 14 Sep 2007 14:13:47 -0500, Justin Tulloss <tulloss2 at uiuc.edu> wrote:
>>Your idea can be combined with the maxint/2 initial refcount for
>>> non-disposable objects, which should about eliminate thread-count updates
>>> for them.
>>> --
>>>
>>
>> I don't really like the maxint/2 idea because it requires us to
>>differentiate between globals and everything else. Plus, it's a hack. I'd
>>like a more elegant solution if possible.
>
>It's not really a solution either.  If your program runs for a couple
>minutes and then exits, maybe it won't trigger some catastrophic behavior
>from this hack, but if you have a long running process then you're almost
>certain to be screwed over by this (it wouldn't even have to be *very*
>long running - a month or two could do it on a 32bit platform).

I don't think either of you understand what setting the initial refcount to
maxint/2 for global objects in a thread's refcount vector would do.  It has
/no/ effect on refcounting.  It only prevents the refcount from becoming
zero for objects that can never be released, but which would always have a
zero thread refcount on thread exit, which would cause a useless and
frequent thread count decrement for the object.  As the object can never be
released, its thread count would be initially non-zero, so the thread count
won't be made zero when the thread refcount becomes zero.  The thread count
is shared in the object.  The thread refcount is per thread, and should not
be shared, even at the physical cache line level, if good performance is
desired.

When a new thread is created, part of the thread state would be the
refcount vector.  Hopefully it would mostly be just VM magic, but the
initial part of the vector would contain the immortal objects' refcount,
and those would be set to maxint/2.  Or 1, for that matter.
-- 
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson at georgeanelson.com>
      '                              <http://www.georgeanelson.com/>

From jon+python-dev at unequivocal.co.uk  Sat Sep 15 02:50:58 2007
From: jon+python-dev at unequivocal.co.uk (Jon Ribbens)
Date: Sat, 15 Sep 2007 01:50:58 +0100
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <20070914193049.8162.648500711.divmod.quotient.8674@ohm>
References: <2cfeb93c0709141213t6727efack5db90dc706e2b95f@mail.gmail.com>
	<20070914193049.8162.648500711.divmod.quotient.8674@ohm>
Message-ID: <20070915005058.GS32061@snowy.squish.net>

On Fri, Sep 14, 2007 at 03:30:49PM -0400, Jean-Paul Calderone wrote:
> > I don't really like the maxint/2 idea because it requires us to
> >differentiate between globals and everything else. Plus, it's a hack. I'd
> >like a more elegant solution if possible.
> 
> It's not really a solution either.  If your program runs for a couple
> minutes and then exits, maybe it won't trigger some catastrophic behavior
> from this hack, but if you have a long running process then you're almost
> certain to be screwed over by this

You misunderstand - the point of the 'maxint/2' thing isn't to prevent
something from happening at all, it's to prevent it from happening
*frequently*.

From talin at acm.org  Sat Sep 15 07:48:40 2007
From: talin at acm.org (Talin)
Date: Fri, 14 Sep 2007 22:48:40 -0700
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <aac2c7cb0709132310v69d2bd41w2cd093e6ae34863@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>	<20070911192035.94CE33A40D7@sparrow.telecommunity.com>	<fcb2cc$65i$1@sea.gmane.org>
	<200709131219.21152.nd@perlig.de>	<20070913105527.GH32061@snowy.squish.net>	<18153.15002.76898.448843@montanaro.dyndns.org>	<46E9FDA8.6010303@canterbury.ac.nz>
	<aac2c7cb0709132310v69d2bd41w2cd093e6ae34863@mail.gmail.com>
Message-ID: <46EB7238.5040104@acm.org>

Adam Olsen wrote:
> I'm now working on an approach that writes out refcounts in batches to
> reduce contention.  The initial cost is much higher, but it scales
> better too.  I've currently got it to just under 50% cost, meaning two
> threads is a slight net gain.

http://www.research.ibm.com/people/d/dfb/publications.html

Look at the various papers on 'Recycler'.

The way it works is that for each thread, there is an addref buffer and 
a decref buffer. The buffers are arrays of pointers. Each time a 
reference is addref'd, it's appended to the addref buffer, likewise for 
decref. When a buffer gets full, it is added to a queue and then a new 
buffer is allocated.

There is a background thread that actually applies the refcounts from 
the buffers and frees the objects. Since this background thread is the 
only thread that ever touches the actual refcount field of the object, 
there's no need for locking.
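
A simplified sketch of the buffering scheme as described in this message,
using Python threads and a queue. It is based only on the description
here, not on the Recycler papers, and it leaves out whatever ordering or
epoch handling the real design needs before an object may be freed. The
names MutatorLog, BUFFER_SIZE and collector are made up, and objects are
represented by plain string ids.

```python
import queue
import threading

BUFFER_SIZE = 4      # hypothetical; tiny so that buffers fill quickly
_DONE = object()     # sentinel telling the collector to stop


class MutatorLog:
    """Per-thread addref/decref buffers, handed to a queue when full."""

    def __init__(self, work_queue):
        self.work_queue = work_queue
        self.addrefs = []
        self.decrefs = []

    def addref(self, obj_id):
        self.addrefs.append(obj_id)
        if len(self.addrefs) >= BUFFER_SIZE:
            self.work_queue.put((+1, self.addrefs))
            self.addrefs = []            # start a fresh buffer

    def decref(self, obj_id):
        self.decrefs.append(obj_id)
        if len(self.decrefs) >= BUFFER_SIZE:
            self.work_queue.put((-1, self.decrefs))
            self.decrefs = []

    def flush(self):
        if self.addrefs:
            self.work_queue.put((+1, self.addrefs))
            self.addrefs = []
        if self.decrefs:
            self.work_queue.put((-1, self.decrefs))
            self.decrefs = []


def collector(work_queue, refcounts):
    """Only this thread ever touches the real refcounts."""
    while True:
        item = work_queue.get()
        if item is _DONE:
            break
        delta, buffer = item
        for obj_id in buffer:
            refcounts[obj_id] = refcounts.get(obj_id, 0) + delta


def mutator(work_queue, obj_ids, rounds):
    log = MutatorLog(work_queue)
    for _ in range(rounds):
        for obj_id in obj_ids:
            log.addref(obj_id)
        for obj_id in obj_ids:
            log.decref(obj_id)
    log.addref(obj_ids[0])               # keep one extra reference to the first id
    log.flush()


work_queue = queue.Queue()
refcounts = {}
collector_thread = threading.Thread(target=collector, args=(work_queue, refcounts))
collector_thread.start()

mutators = [threading.Thread(target=mutator, args=(work_queue, ["a", "b"], 10))
            for _ in range(3)]
for t in mutators:
    t.start()
for t in mutators:
    t.join()

work_queue.put(_DONE)                    # queued after every mutator buffer
collector_thread.join()
print(refcounts)
```

The mutator threads only append to their own lists; the one shared
structure they touch is the queue, and only when a buffer is handed over.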

-- Talin

From jmtulloss at gmail.com  Fri Sep 14 07:08:13 2007
From: jmtulloss at gmail.com (Justin Tulloss)
Date: Fri, 14 Sep 2007 00:08:13 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <43c8685c0709132113u2282bb7bg848ed10c7d80f640@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
	<46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz>
	<46E795FD.1070103@v.loewis.de>
	<18152.2055.258930.576257@montanaro.dyndns.org>
	<741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
	<2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com>
	<bb8868b90709131029p8545c4bxeed0f7294dea69b9@mail.gmail.com>
	<46EA0632.5060603@canterbury.ac.nz>
	<43c8685c0709132113u2282bb7bg848ed10c7d80f640@mail.gmail.com>
Message-ID: <2cfeb93c0709132208g5198439ds3415c0f9a7689174@mail.gmail.com>

I'm not sure I understand entirely what you're saying, but it sounds like
you want multiple reference counts. A reference count per thread might not
be a bad idea, but I can't think of how it would work without locks. If
every object has an array of reference counts, then the GC would need to
lock that array to check to see if they're all 0. That means the
incref/decref operations would need to acquire this lock or risk messing up
the GC.

Perhaps you could have something where you have a reference count per thread
and then a thread count per object. Then you would only need to lock the
thread count for the first and last reference a thread makes to an object.
Once there are no threads referencing an object, it's obviously safe for
cleanup. Of course, I'm not convinced atomic ops are really so expensive you
can't have every thread doing it at once, but Adam says that the caches will
be thrashed if we have a bunch of threads continuously updating the same
memory address. I can see the possibility. Perhaps once we have a version
that actually demonstrates this thrashing, we can alleviate it with some
sort of multiple reference count scheme.
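
A toy model of the "reference count per thread plus thread count per
object" scheme from the paragraph above, written in Python purely to show
where the lock is taken. SharedObject, incref, decref and the
lock_acquisitions counter are invented for this sketch; a real
implementation would live in C inside the interpreter and would look quite
different.

```python
import threading


class SharedObject:
    """Toy object carrying only the shared per-object thread count."""

    def __init__(self):
        self.thread_count = 0                 # threads holding >= 1 reference
        self.thread_count_lock = threading.Lock()
        self.lock_acquisitions = 0            # instrumentation for the demo


_local = threading.local()


def _my_counts():
    """Return this thread's private {id(obj): refcount} table."""
    if not hasattr(_local, "counts"):
        _local.counts = {}
    return _local.counts


def incref(obj):
    counts = _my_counts()
    current = counts.get(id(obj), 0)
    if current == 0:
        # First reference from this thread: touch the shared count.
        with obj.thread_count_lock:
            obj.thread_count += 1
            obj.lock_acquisitions += 1
    counts[id(obj)] = current + 1


def decref(obj):
    """Drop one reference; return True if no thread references obj any more."""
    counts = _my_counts()
    remaining = counts[id(obj)] - 1
    if remaining > 0:
        counts[id(obj)] = remaining
        return False
    del counts[id(obj)]
    # Last reference from this thread: touch the shared count.
    with obj.thread_count_lock:
        obj.thread_count -= 1
        obj.lock_acquisitions += 1
        return obj.thread_count == 0


def worker(obj, n):
    for _ in range(n):
        incref(obj)
    for _ in range(n):
        decref(obj)


obj = SharedObject()
incref(obj)                                   # main thread keeps obj alive
threads = [threading.Thread(target=worker, args=(obj, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
reclaimable = decref(obj)                     # main thread lets go last
print(obj.lock_acquisitions, reclaimable)
```

In this run the four workers perform 8000 reference operations between
them, but the lock is taken only twice per thread, on the first and last
reference.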

Justin

On 9/13/07, Tennessee Leeuwenburg <tleeuwenburg at gmail.com> wrote:
>
> Pardon me for talking with no experience in such matters, but...
>
> Okay, incrementing a reference counter is atomic, therefore the cheapest
> possible operation. Is it possible to keep reference counting atomic in a
> multi-thread model?
>
> Could you do the following... let's consider two threads, "A" and "B".
> Each time an object is created, a reference count is created in both "A" and
> "B". Let's suppose "A" has a real reference and "B" has no reference really.
> Couldn't the GC check two reference registers for a reference count? The
> object would then be cleaned up only if both registers were 0.
>
> To exploit multiple CPUs, you could have two persistent Python processes
> on each CPU with its own mini-GIL. Object creation would then involve a call
> to each process to create the reference and GC would involve checking each
> process to see what their count is. However, it would mean that within each
> process, threads could create additional references or remove references in
> an atomic way.
>
> In a single-CPU system, this would be the same cost as currently, since I
> think that situation would devolve to having just one place to check for
> references. That suggests it would be no more expensive for a single-CPU
> system.
>
> In a two-CPU system, I'm no expert on the actual call overheads of
> object creation and garbage collection, but logically it would double the
> effort of object creation and destruction (all such operations now need to
> occur on both processes) but would keep reference increments and decrements
> atomic.
>
> Once again, I'm really sorry if I'm completely off-base since I have never
> done any actual coding in this area, but I thought I'd make the suggestion
> just in case it happened to have relevance.
>
> Thanks,
> -Tennessee

From Martin.Drautzburg at web.de  Fri Sep 14 22:48:15 2007
From: Martin.Drautzburg at web.de (Martin Drautzburg)
Date: Fri, 14 Sep 2007 22:48:15 +0200
Subject: [Python-Dev] How to pickle class derived from c++ extension
Message-ID: <200709142248.16539.Martin.Drautzburg@web.de>

I understand that I can pickle an extension class written in C/C++ by providing
a __reduce__() method along with __getstate__()/__setstate__(). While I still
haven't gotten this to work, my main question is:

How could I possibly pickle an object of a Python class which is derived from
the C++ extension?

It seems that I can define 

	>>> class Bar(list):
	...      pass

and add more attributes

	>>> l=Bar()
	>>> l.x=11

and __reduce__() will show the "x" attribute

	>>> l.__reduce__()
	(<function _reconstructor at 0xb7e2cf0c>, (<class '__main__.Bar'>, 
<type 'list'>, []), {'x': 11})

But this does not seem to work with my extension class Foo. I defined a 
__getstate__() method and __reduce__() indeed shows me some state.  But if I 
create a derived class Bar on the Python side and an object bar as an 
instance of that class, and add an "x" attribute to that bar object, then 
__reduce__ing that object shows nothing about the "x" attribute.

This is in a way understandable, as __reduce__() eventually just calls
__getstate__() and the only implementation it can find is in my Foo extension
class, which knows nothing about the Bar derived class, let alone its "x"
attribute.

I would like to have __reduce__() do it the Python way as far as it can get,
and then call some magic method of my C++ class to pickle the "C++ part" of
an object. Is there a way to achieve this? The "list" class seems to have
something that my Foo class does not have. What is this?

Or of course, if there is a better way to pickle objects of classes which are
derived from C++ extensions, I'd be happy to hear about it.
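
One possible approach, sketched in pure Python rather than taken from any
real extension: let the base type's __reduce__() return a state that
combines the "C++ part" with the instance __dict__, and split the two again
in __setstate__(). In this sketch Foo is a hypothetical stand-in for the
extension type, __slots__ is used only to mimic data that lives outside the
instance __dict__, and the "value" key is an illustrative name.

```python
import pickle


class Foo:
    """Pure-Python stand-in for the C++ extension type (hypothetical)."""

    __slots__ = ("_cpp_value",)  # mimics state stored outside __dict__

    def __init__(self, value=0):
        self._cpp_value = value

    def __reduce__(self):
        cpp_state = {"value": self._cpp_value}
        # Subclasses defined in Python have a __dict__; Foo itself does not.
        py_state = dict(getattr(self, "__dict__", {}))
        # Rebuild with type(self)() so that subclasses round-trip as themselves.
        # This assumes the constructor can be called without arguments.
        return (type(self), (), (cpp_state, py_state))

    def __setstate__(self, state):
        cpp_state, py_state = state
        self._cpp_value = cpp_state["value"]
        if py_state:
            self.__dict__.update(py_state)


class Bar(Foo):
    """Python-side subclass that adds ordinary attributes."""


bar = Bar(5)
bar.x = 11
clone = pickle.loads(pickle.dumps(bar))
print(type(clone).__name__, clone._cpp_value, clone.x)
```

A real C/C++ type would provide the same two methods through its method
table; the point is only that the base class, not the subclass, takes care
of including the instance __dict__ in the state.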



From aahz at pythoncraft.com  Sat Sep 15 18:50:44 2007
From: aahz at pythoncraft.com (Aahz)
Date: Sat, 15 Sep 2007 09:50:44 -0700
Subject: [Python-Dev] How to pickle class derived from c++ extension
In-Reply-To: <200709142248.16539.Martin.Drautzburg@web.de>
References: <200709142248.16539.Martin.Drautzburg@web.de>
Message-ID: <20070915165044.GA17750@panix.com>

On Fri, Sep 14, 2007, Martin Drautzburg wrote:
>
> I understand that I can pickle an extension class written
> in C/C++ by providing a __reduce__() method along with
> __getstate__()/__setstate__(). While I still haven't gotten this to
> work, my main question is:
>
> How could I possibly pickle an object of a python class which is
> derived from the C++ extension?

python-dev is not an appropriate place to ask questions about writing
your own applications.  I suggest the C++-sig or capi-sig lists or the
comp.lang.python newsgroup.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The best way to get information on Usenet is not to ask a question, but
to post the wrong information.

From pfdubois at gmail.com  Sun Sep 16 04:46:11 2007
From: pfdubois at gmail.com (Paul Dubois)
Date: Sat, 15 Sep 2007 19:46:11 -0700
Subject: [Python-Dev] Eric Raymond account on bug tracker locked
Message-ID: <f74a6c2f0709151946v756eb17bla7ec0a18b83ba4@mail.gmail.com>

Eric Raymond (esr)'s account on bugs.python.org has been misused. Since this
may mean that his password on sf.net is also compromised, I cannot trust
that address to notify him. I have changed the password to prevent further
mischief. If someone knows a bona-fide way to contact him let me know it and
I'll inform him, if he doesn't see this himself.

From guido at python.org  Sun Sep 16 16:51:42 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 16 Sep 2007 07:51:42 -0700
Subject: [Python-Dev] Eric Raymond account on bug tracker locked
In-Reply-To: <f74a6c2f0709151946v756eb17bla7ec0a18b83ba4@mail.gmail.com>
References: <f74a6c2f0709151946v756eb17bla7ec0a18b83ba4@mail.gmail.com>
Message-ID: <ca471dc20709160751s333843e1xb99dbbe623e24db1@mail.gmail.com>

He's probably still esr at thyrsus.com. But he has long stopped being an
active developer so I doubt that informing him matters much.

On 9/15/07, Paul Dubois <pfdubois at gmail.com> wrote:
> Eric Raymond (esr)'s account on bugs.python.org has been misused. Since this
> may mean that his password on sf.net is also compromised, I cannot trust
> that address to notify him. I have changed the password to prevent further
> mischief. If someone knows a bona-fide way to contact him let me know it and
> I'll inform him, if he doesn't see this himself.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From janssen at parc.com  Mon Sep 17 01:01:29 2007
From: janssen at parc.com (Bill Janssen)
Date: Sun, 16 Sep 2007 16:01:29 PDT
Subject: [Python-Dev] 'text' mode rears its ugly head again
Message-ID: <07Sep16.160132pdt."57996"@synergy1.parc.xerox.com>

I've checked in the asyncore SSL patch, and the Windows buildbots are
failing on the HTTPS test.  I believe it's due to this insane
differentiation between text files and binary files, a bad
idea introduced by Windows and perpetuated (apparently) by Python.  I
can't believe this wasn't eliminated in py3k!

Anyway, I think what's going on is that the two data blobs the test
compares, one read from a file opened with "open(filename, 'r')", and
the other a data stream read from an HTTP response "file" returned
from urllib.urlopen(), have different line-endings.  Of course, this
only matters on Windows; on UNIX, the faux differentiation doesn't
exist.
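
A small sketch of the kind of mismatch described above. It is written for
current Python 3, where text mode translates line endings on read on every
platform, so the exact behaviour differs from the 2007 interpreters
discussed here; read_both_ways() is a hypothetical helper name.

```python
import os
import tempfile


def read_both_ways(raw):
    """Write raw bytes to a temp file, then read them in binary and text mode."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        with open(path, "rb") as f:   # binary mode: bytes come back unchanged
            as_bytes = f.read()
        with open(path, "r", encoding="ascii") as f:  # text mode: newlines translated
            as_text = f.read()
    finally:
        os.remove(path)
    return as_bytes, as_text


as_bytes, as_text = read_both_ways(b"line one\r\nline two\r\n")
print(repr(as_bytes))
print(repr(as_text))
```

Comparing data read one way against data read the other way fails unless
both sides are opened in the same mode.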

Bill


From janssen at parc.com  Mon Sep 17 02:16:51 2007
From: janssen at parc.com (Bill Janssen)
Date: Sun, 16 Sep 2007 17:16:51 PDT
Subject: [Python-Dev] 'text' mode rears its ugly head again
In-Reply-To: <07Sep16.160132pdt."57996"@synergy1.parc.xerox.com> 
References: <07Sep16.160132pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep16.171657pdt."57996"@synergy1.parc.xerox.com>

> I've checked in the asyncore SSL patch, and the Windows buildbots are
> failing on the HTTPS test.

Fixed.

Bill

From ncoghlan at gmail.com  Mon Sep 17 12:53:08 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 17 Sep 2007 20:53:08 +1000
Subject: [Python-Dev] 'text' mode rears its ugly head again
In-Reply-To: <07Sep16.160132pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep16.160132pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46EE5C94.2050008@gmail.com>

Bill Janssen wrote:
> I've checked in the asyncore SSL patch, and the Windows buildbots are
> failing on the HTTPS test.  I believe it's due to this insane
> differentiation between text files and binary files, a bad
> idea introduced by Windows and perpetuated (apparently) by Python.  I
> can't believe this wasn't eliminated in py3k!

The binary/text distinction is being increased in Py3k rather than 
reduced (the API for binary files uses bytes, the API for text files 
uses Unicode strings).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From orsenthil at gmail.com  Mon Sep 17 18:37:36 2007
From: orsenthil at gmail.com (O.R.Senthil Kumaran)
Date: Mon, 17 Sep 2007 22:07:36 +0530
Subject: [Python-Dev] IPv6 hostname resolution using socket.getaddrinfo()
Message-ID: <20070917163736.GA5434@gmail.com>

To get the hostname, I can use socket.gethostbyname(), but that has an
inherent limitation in that it does not support IPv6 name resolution, so
getaddrinfo() should be used instead.

Looking up the socket.getaddrinfo() documentation, I come to know that

The getaddrinfo() function returns a list of 5-tuples with the following
structure:

(family, socktype, proto, canonname, sockaddr)

family, socktype, proto are all integers and are meant to be passed to
the socket() function. canonname is a string representing the canonical
name of the host. It can be a numeric IPv4/v6 address when AI_CANONNAME
is specified for a numeric host.

With this information, if I try something like this:

>>> for res in socket.getaddrinfo('goofy.goofy.com', None,
	socket.AI_CANONNAME):

        print res

(2, 1, 6, '', ('10.98.1.6', 0))
(2, 2, 17, '', ('10.98.1.6', 0))
(2, 3, 0, '', ('10.98.1.6', 0))

In the output, I see that the canonname is always blank (''). I am not
getting the IPv4 or IPv6 address as a result of using getaddrinfo().

Am I making any mistake?

What I am trying to write is a replacement function for
socket.gethostbyname(hostname) which will work for both IPv4 and IPv6 (and
then make changes in urllib2 to support that).

# return hostbyname for either IPv4 or IPv6 address. Common function.

def ipv6_gethostbyname(hostname):
    for res in socket.getaddrinfo(hostname, None,
                                  socket.AI_CANONNAME):
        fa, socktype, proto, canonname, sa = res
    return canonname

The above function does not seem to work. It returns only a blank value.

Any help/ pointers? 

-- 
O.R.Senthil Kumaran
http://uthcode.sarovar.org

From janssen at parc.com  Mon Sep 17 20:08:27 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 17 Sep 2007 11:08:27 PDT
Subject: [Python-Dev] 'text' mode rears its ugly head again
In-Reply-To: <46EE5C94.2050008@gmail.com> 
References: <07Sep16.160132pdt."57996"@synergy1.parc.xerox.com>
	<46EE5C94.2050008@gmail.com>
Message-ID: <07Sep17.110832pdt."57996"@synergy1.parc.xerox.com>

> > differentiation between text files and binary files, a bad
> > idea introduced by Windows and perpetuated (apparently) by Python.  I
> > can't believe this wasn't eliminated in py3k!
> 
> The binary/text distinction is being increased in Py3k rather than 
> reduced (the API for binary files uses bytes, the API for text files 
> uses Unicode strings).

Actually, it's not so much the differentiation that bothers me, as it
is the default of assuming "text".  I think the default should be
"binary", and getting the file in "text" mode should require extra
effort.  It should be 'rt', not 'rb' -- an extra qualifier for text
mode, not for binary mode.  That would eliminate a lot of the little
bugs like this one that crop up in ports to the ineffable assemblage
that is Windows.

Bill

From facundobatista at gmail.com  Mon Sep 17 22:58:44 2007
From: facundobatista at gmail.com (Facundo Batista)
Date: Mon, 17 Sep 2007 17:58:44 -0300
Subject: [Python-Dev] Hash to longs, and Decimal
Message-ID: <e04bdf310709171358o606ceb21i989bd063fd93737a@mail.gmail.com>

Hi everybody!

In the Tracker Issue...

  http://bugs.python.org/issue1772851

... Mark Dickinson came up with a patch that alters, in a rare corner case,
how the hash of a long integer is calculated.

This allows changes in Decimal that lead to a better hashing behaviour
for big, big, really big numbers.

The patch applies cleanly, all the tests pass ok (Mark also provided
more tests for the hash function).

I won't commit this right now; I'll delay the change for a couple of
days in case somebody wants to take a look at it.

Thanks!

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From skip at pobox.com  Mon Sep 17 23:05:15 2007
From: skip at pobox.com (skip at pobox.com)
Date: Mon, 17 Sep 2007 16:05:15 -0500
Subject: [Python-Dev] IPv6 hostname resolution using socket.getaddrinfo()
In-Reply-To: <20070917163736.GA5434@gmail.com>
References: <20070917163736.GA5434@gmail.com>
Message-ID: <18158.60427.849358.838885@montanaro.dyndns.org>


    Senthil> To get the hostname, I can use socket.gethostbyname() but that
    Senthil> has an inherent limitation wherein does it not support IPv6
    Senthil> name resolution, and getaddrinfo() should be used instead.
    ...

For those who would ask Senthil to take this to comp.lang.python, he already
did and got no response.  He's working on fixes to urllib2, so this seems to
me to be a python-dev question and I suggested he post here.  I tried it
with 2.5, 2.6 and 3.0 and got blanks for the canonical name as well.
Hopefully someone with more network-fu can steer him in the right direction.

Skip

From guido at python.org  Mon Sep 17 23:17:19 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Sep 2007 14:17:19 -0700
Subject: [Python-Dev] Hash to longs, and Decimal
In-Reply-To: <e04bdf310709171358o606ceb21i989bd063fd93737a@mail.gmail.com>
References: <e04bdf310709171358o606ceb21i989bd063fd93737a@mail.gmail.com>
Message-ID: <ca471dc20709171417o21a9a1b9va0f9d199b894e698@mail.gmail.com>

Seems a fine idea. I don't have the time for a code review but I'll
leave that up to you all.

--Guido

On 9/17/07, Facundo Batista <facundobatista at gmail.com> wrote:
> Hi everybody!
>
> In the Tracker Issue...
>
>   http://bugs.python.org/issue1772851
>
> ... Mark Dickinson came with a patch that alters in a very corner case
> how the hash is calculated to a long integer.
>
> This allows changes in Decimal that lead to a better hashing behaviour
> for big, big, really big numbers.
>
> The patch applies cleanly, all the tests pass ok (Mark also provided
> more tests for the hash function).
>
> I won't commit this right now; I'll delay the change for a couple of
> days in case somebody wants to take a look at it.
>
> Thanks!
>
> --
> .    Facundo
>
> Blog: http://www.taniquetil.com.ar/plog/
> PyAr: http://www.python.org/ar/


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From dickinsm at gmail.com  Mon Sep 17 23:50:45 2007
From: dickinsm at gmail.com (Mark Dickinson)
Date: Mon, 17 Sep 2007 17:50:45 -0400
Subject: [Python-Dev] Hash to longs, and Decimal
In-Reply-To: <e04bdf310709171358o606ceb21i989bd063fd93737a@mail.gmail.com>
References: <e04bdf310709171358o606ceb21i989bd063fd93737a@mail.gmail.com>
Message-ID: <5c6f2a5d0709171450p7feb91f5oc55371824a53108e@mail.gmail.com>

On 9/17/07, Facundo Batista <facundobatista at gmail.com> wrote:
>
> In the Tracker Issue...
>
>   http://bugs.python.org/issue1772851
>
> ... Mark Dickinson came with a patch that alters in a very corner case
> how the hash is calculated to a long integer.
>

Much as I'd like this patch to be applied, I feel compelled to point out
that it does have a significant(?) downside:  it slows down hashing of large
integers to some degree.

On my machine (Dual Xeon 2.8GHz/SuSE Linux 10.2/gcc 4.1 with -O3), using
timeit.timeit('hash(n)') to get timings, the new hash function takes 70%
more time for 1000-digit integers and 20% more for 100-digit integers, but
has no measurable performance impact for small (int-sized) longs.  I don't
know how significant this performance hit is in the larger scheme of things.
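
For anyone who wants to repeat the measurement, a minimal sketch follows. It
uses the timeit API of current Python 3 (the globals argument did not exist
in 2007), time_hash() is a hypothetical helper name, and the numbers it
prints depend entirely on the machine and interpreter.

```python
import timeit


def time_hash(digits, number=10_000):
    """Time `number` calls of hash() on an integer with `digits` decimal digits."""
    n = 10 ** (digits - 1)  # smallest integer with that many digits
    return timeit.timeit("hash(n)", globals={"n": n}, number=number)


for digits in (5, 100, 1000):
    print(digits, time_hash(digits))
```

Running it against a patched and an unpatched build gives the kind of
relative comparison quoted above.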

Mark

From martin at v.loewis.de  Tue Sep 18 00:00:53 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 18 Sep 2007 00:00:53 +0200
Subject: [Python-Dev] IPv6 hostname resolution using socket.getaddrinfo()
In-Reply-To: <20070917163736.GA5434@gmail.com>
References: <20070917163736.GA5434@gmail.com>
Message-ID: <46EEF915.9060301@v.loewis.de>

> Any help/ pointers? 

Did you read the man page of getaddrinfo, or the RFC?

Regards,
Martin

From trentm at activestate.com  Tue Sep 18 00:13:51 2007
From: trentm at activestate.com (Trent Mick)
Date: Mon, 17 Sep 2007 15:13:51 -0700
Subject: [Python-Dev] Daily Windows Installers
In-Reply-To: <m2veacoqzl.fsf@valheru.db3l.homeip.net>
References: <46EA7D6C.8010600@v.loewis.de>
	<fcea66$a5i$1@sea.gmane.org>	<m2zlzpnj4u.fsf@valheru.db3l.homeip.net>	<fcereb$bei$1@sea.gmane.org>
	<m2veacoqzl.fsf@valheru.db3l.homeip.net>
Message-ID: <46EEFC1F.5020004@activestate.com>

David Bolen wrote:
> I hit it with a sledge-hammer and modified my build slave to disable
> error boxes for anything it runs, so we'll get the 3.0 MSI now but
> with a bad chm until it gets figured out.

How do you tell Windows to do that?

Trent

-- 
Trent Mick
trentm at activestate.com

From db3l.net at gmail.com  Tue Sep 18 00:19:07 2007
From: db3l.net at gmail.com (David Bolen)
Date: Mon, 17 Sep 2007 18:19:07 -0400
Subject: [Python-Dev] Daily Windows Installers
In-Reply-To: <46EEFC1F.5020004@activestate.com>
References: <46EA7D6C.8010600@v.loewis.de> <fcea66$a5i$1@sea.gmane.org>
	<m2zlzpnj4u.fsf@valheru.db3l.homeip.net> <fcereb$bei$1@sea.gmane.org>
	<m2veacoqzl.fsf@valheru.db3l.homeip.net>
	<46EEFC1F.5020004@activestate.com>
Message-ID: <9f94e2360709171519m14c1bbd5uda975ed704974230@mail.gmail.com>

On 9/17/07, Trent Mick <trentm at activestate.com> wrote:

> How do you tell Windows to do that?

Via the SetErrorMode call.

Since the Windows buildbot already uses the win32 extensions, I just
used the existing win32api wrapper (although going through ctypes is very
easy too).  In my case I just surrounded the reactor.spawnProcess call
in buildbot/slave/commands.py with:

old_err_mode = win32api.SetErrorMode(7)
and
win32api.SetErrorMode(old_err_mode)

I suppose I should really tweak that to 0x8007 rather than just 7 to
include missing file dialogs (like when a removable device is not
available).
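
For reference, the ctypes route could look roughly like the sketch below. It
is not the code used on the build slave: suppress_error_dialogs() and
QUIET_MODE are hypothetical names, the flag values are the SEM_* constants
from the Windows headers, and on non-Windows platforms the context manager
deliberately does nothing.

```python
import contextlib
import ctypes
import sys

# SEM_* flag values from the Windows SDK headers.
SEM_FAILCRITICALERRORS = 0x0001
SEM_NOGPFAULTERRORBOX = 0x0002
SEM_NOALIGNMENTFAULTEXCEPT = 0x0004
SEM_NOOPENFILEERRORBOX = 0x8000

# 0x8007: the three low flags (7) plus the missing-file dialog flag.
QUIET_MODE = (SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX
              | SEM_NOALIGNMENTFAULTEXCEPT | SEM_NOOPENFILEERRORBOX)


@contextlib.contextmanager
def suppress_error_dialogs(mode=QUIET_MODE):
    """Disable Windows error boxes inside the block, then restore the old mode."""
    if sys.platform != "win32":
        yield  # nothing to do outside Windows
        return
    kernel32 = ctypes.windll.kernel32
    old_mode = kernel32.SetErrorMode(mode)
    try:
        yield
    finally:
        kernel32.SetErrorMode(old_mode)
```

The process-spawning call would then go inside a
"with suppress_error_dialogs():" block.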

Since the error mode is inherited by child processes (unless
explicitly overridden in the CreateProcess call), this effectively
covers the primary child process and any others it may spawn during
execution, so it works even though buildbot uses an intermediate
command interpreter to execute whatever command is requested.

We had a bit of discussion about this recently on the py3k devel list,
regarding failures in the Python buildbot tests and more local changes
within Python itself.

-- David

From Blinston_Fernandes at Dell.com  Tue Sep 18 04:14:46 2007
From: Blinston_Fernandes at Dell.com (Blinston_Fernandes at Dell.com)
Date: Tue, 18 Sep 2007 07:44:46 +0530
Subject: [Python-Dev] IPv6 hostname resolution using socket.getaddrinfo()
In-Reply-To: <20070917163736.GA5434@gmail.com>
References: <20070917163736.GA5434@gmail.com>
Message-ID: <C38B725A049843478F1DA8F6A2448CDBC3F15E@blrx3m12.blr.amer.dell.com>

On python2.4.1

>>> socket.getaddrinfo('www.python.org', None, socket.AF_INET,
socket.SOCK_DGRAM, socket.IPPROTO_IP, socket.AI_CANONNAME)
[(2, 2, 17, 'dinsdale.python.org', ('82.94.237.218', 0))]
>>>

Blinston.
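
The call above returns the canonical name because AI_CANONNAME is passed in
the sixth (flags) position. In the original snippet it was passed third,
where getaddrinfo() expects the address family; on Linux, for example,
AI_CANONNAME and AF_INET both have the value 2, which would explain the
IPv4-only results with an empty name. A minimal sketch of a lookup helper
along those lines follows; resolve_host() is a hypothetical name, and only
the first result is used because the canonical name is normally filled in
for the first entry only.

```python
import socket


def resolve_host(hostname):
    """Return (canonical_name, address) from the first getaddrinfo() result."""
    results = socket.getaddrinfo(
        hostname, None,
        socket.AF_UNSPEC,     # family: accept both IPv4 and IPv6
        socket.SOCK_STREAM,   # socktype: avoid one entry per socket type
        0,                    # proto: any
        socket.AI_CANONNAME,  # flags: must be the sixth argument
    )
    family, socktype, proto, canonname, sockaddr = results[0]
    # sockaddr is (host, port) for IPv4 and (host, port, flowinfo, scope_id)
    # for IPv6, so the address string is the first element in both cases.
    return canonname, sockaddr[0]
```

Calling resolve_host('www.python.org') should then give a non-empty
canonical name together with an IPv4 or IPv6 address, depending on what the
resolver returns first.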

-----Original Message-----
From: python-dev-bounces+blinston_fernandes=dell.com at python.org
[mailto:python-dev-bounces+blinston_fernandes=dell.com at python.org] On
Behalf Of O.R.Senthil Kumaran
Sent: Monday, September 17, 2007 10:08 PM
To: python-dev at python.org
Subject: [Python-Dev] IPv6 hostname resolution using
socket.getaddrinfo()

To get the hostname, I can use socket.gethostbyname() but that has an
inherent limitation wherein does it not support IPv6 name resolution,
and
getaddrinfo() should be used instead.

Looking up the socket.getaddrinfo() documentation, I come to know that

The getaddrinfo() function returns a list of 5-tuples with the following
structure:

(family, socktype, proto, canonname, sockaddr)

family, socktype, proto are all integer and are meant to be passed to
the socket() function. canonname is a string representing the canonical
name of the host. It can be a numeric IPv4/v6 address when AI_CANONNAME
is specified for a numeric host.

With this information, if I try something like this:

>>> for res in socket.getaddrinfo('goofy.goofy.com', None,
	socket.AI_CANONNAME):

        print res

(2, 1, 6, '', ('10.98.1.6', 0))
(2, 2, 17, '', ('10.98.1.6', 0))
(2, 3, 0, '', ('10.98.1.6', 0))

In the output, I see that the canonname is always blank (''). I am not
getting the IPv4 or IPv6 address as a result of using getaddrinfo().

Am I making any mistake?

What i am trying is a replacement function for
socket.gethostbyname(hostname) which will work for both IPv4 and IPv6
(and make changes in urllib2 to support that)

# return hostbyname for either IPv4 or IPv6 address. Common function.

def ipv6_gethostbyname(hostname):
        for res in socket.getaddrinfo(hostname,None,
		socket.AI_CANONNAME):
                fa, socktype, proto, canonname, sa = res
        return canonname

The above function does not seem to work. It returns blank value only.

Any help/ pointers? 

--
O.R.Senthil Kumaran
http://uthcode.sarovar.org

From ksankar at doubleclix.net  Tue Sep 18 20:45:25 2007
From: ksankar at doubleclix.net (Krishna Sankar)
Date: Tue, 18 Sep 2007 11:45:25 -0700
Subject: [Python-Dev] Exploration PEP : Concurrency for moderately massive
 (4 to 32 cores) multi-core architectures
Message-ID: <46F01CC5.3000603@doubleclix.net>

Folks,
   As a follow-up to the py3k discussions started by Bruce and Guido, I 
pinged Brett and he suggested I submit an exploratory proposal. Would 
appreciate insights, wisdom, the good, the bad and the ugly.

A)    Does it make sense ?
B)    Which application sets should we consider in designing the 
interfaces and implementations
C)    In this proposal, parallelism and concurrency are used in an 
interchangeable fashion. Thoughts ?
D)    Please suggest pertinent links, discussions and insights.
E)    I have kept the proposal to a minimum to start the discussions and 
to explore if this is the right thing to do. Collaboratively, as we 
zero-in on one or two approaches, the idea is to expand it to a crisp 
and clear PEP. Need to do some more formatting as well.
Cheers
<k/>
P.S : I had sent this to python-ideas a couple of days ago and received 
two comments (Thanks Leonardo, Thanks Adam) I haven't incorporated their 
comments yet. Folks who are on both lists, pardon me for the spam.
------------------------------------------------------------------------------------------------------------ 

PEP: xxxxxxxx
Title: Concurrency for moderately massive (4 to 32 cores) multi-core 
architectures
Version: $Revision$
Last-Modified: $Date$
Author: Krishna Sankar <ksankar (at) doubleclix.net>
Status: Wandering ! (as in "Not all those who wander are lost ..." 
-J.R.R.Tolkien)
Type: Process
Content-Type: text/x-rst
Created: 15-Sep-2007

Abstract
--------
This proposal aims at leveraging multi-core capability as an embedded
mechanism in Python. The question is not whether Python is slow or fast,
but one of performance and control of parallelism/concurrency in a
moderately massive parallelism world. The target is 4 to 32 cores. The
proposal advocates two mechanisms - one for task parallelism and another
for data-intensive parallelism. Scientific computing and web 2.0
frameworks are the foremost users for this proposal. Other applications
would benefit as well.

Rationale
---------
Multicore architectures need no introductions and their ubiquity is 
evident. It is imperative that Python has one or more standard ways of 
leveraging multi-core architectures. OTOH, traditional thread based 
concurrency and lock based exclusions are becoming more and more 
difficult to program correctly.

First of all, the question is not whether py is slow or fast but the
performance of a system written in py. That means the ability to leverage
multi-core architectures, as well as control: control in terms of things
like the ability to pin one process/task to a core or to pin one or more
homogeneous tasks to specific cores, as well as not waiting for a global
lock and similar primitives. (Before anybody jumps to a conclusion, this
is not about the GIL by any means ;o))

Second, it is clear that we need a good solution (not THE solution) for 
moderately massive parallelism in multi-core architectures (i.e. 8-32 
cores). Share-nothing might not be optimal; we need some form of memory
sharing, not just copying all data via messages. Maybe functional
programming based on the blackboard pattern would work, who knows.

I have seen saturated systems that still show only ~25% CPU utilization
(on a 4-core system!). That is because we didn't leverage multiple cores
and parallelism. So while py3k will not be slow, the lack of a cohesive
multi-core strategy will show up in system performance and byte us
later (pun intended!).

At least, in my mind, this is not an exercise about exposing locks and 
mutexes or threads in Python. I do believe that the GIL will be 
refactored to more granularity in the coming months (similar to the 
Global Locks in Linux) and most probably we will get microThreads et al. 
As we all know, architecture is constraining as well as liberating. The 
language primitives influence greatly how we think about a problem.

In the discussions, Guido is right in insisting on speed, and Bruce is 
right in asking for language constructs. Without pragmatic speed, folks 
won't use it; same is the case without the required constructs. Both are 
barriers to adoption. We have an opportunity to offer a solution for 
multi-core architectures and let us seize it - we will rush in where 
angels fear to tread!

Programming Models
------------------
There are at least 3 possible paradigms

A. Conventional threading model
B. Functional model, Erlang being the most appropriate
C. Some form of limited shared memory model (message passing but pass
   pointers, blackboard model)
D. Others, like Transactional Memory [2]

There is enough literature out there, so I do not plan to explain these
here. (<KS> Do we need more explanation? </KS>)

Pragmatic proposal
------------------
May I suggest we embed two primitives in Python 3K:
A)    A functional style share-nothing set of interfaces (and 
implementations thereof) - provides  the task parallelism/concurrency 
capability, "small messages, big computations" as Joe Armstrong calls it[3]
B)    A limited shared memory based model for data intensive parallelism

Most probably this would be part of stdlib. While Guido is almost right 
in saying that this is a (std)library problem, it is not fully so. We 
would need a few primitives from the underlying PVM substrate. Possibly 
one reason for Guido's position is the lack of clarity as to what needs 
to be changed and why. IMHO, just saying take GIL off does not solve the 
problem either.

The Zen of Python parallelism
-----------------------------
I draw inspiration from the very timely article by James Reinders in DDJ
[1]. It embodies what we should be doing, viz.:
1. Refactor the problem into parallel tasks. We cannot help if the
   domain is sequential
2. Program to abstraction & program chores not cores. Writing correct
   programs using raw threads et al is difficult. Let the underlying
   substrate decide how best to optimize
3. Design for scale
4. Have an option to turn concurrency off, for debugging
5. Declarative parallelism based mechanisms (?)

Related Efforts
---------------
The good news is there are at least 2 or 3 paradigms with 
implementations and rough benchmarks.
Parallel python http://www.artima.com/weblogs/viewpost.jsp?thread=214303
http://cheeseshop.python.org/pypi/parallel
Processing http://cheeseshop.python.org/pypi/processing
http://code.google.com/p/papyros/

Discussions
-----------
There are at least four thread sets (pardon the pun !) I am aware of:
1. The GIL discussions in python-dev and Guido's blog on GIL 
http://www.artima.com/weblogs/viewpost.jsp?thread=214235
2. The py3k topics started by Bruce 
http://www.artima.com/weblogs/viewpost.jsp?thread=214112, response by 
Guido http://www.artima.com/weblogs/viewpost.jsp?thread=214325 and reply 
to reply by Bruce http://www.artima.com/weblogs/viewpost.jsp?thread=214480
3. Python and concurrency 
http://mail.python.org/pipermail/python-ideas/2007-March/000338.html
4. Adam's reply in python-ideas 
http://mail.python.org/pipermail/python-ideas/2007-September/000972.html

References
----------
[1]http://www.ddj.com/architect/201804248
[2]Transaction 
http://acmqueue.com/modules.php?name=Content&pa=showpage&pid=444
[3]Programming Erlang by Joe Armstrong



From guido at python.org  Tue Sep 18 21:15:40 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Sep 2007 12:15:40 -0700
Subject: [Python-Dev] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <46F01CC5.3000603@doubleclix.net>
References: <46F01CC5.3000603@doubleclix.net>
Message-ID: <ca471dc20709181215w30de4634sb98dc5e02e34383f@mail.gmail.com>

On 9/18/07, Krishna Sankar <ksankar at doubleclix.net> wrote:
> Folks,
>    As a follow-up to the py3k discussions started by Bruce and Guido, I
> pinged Brett and he suggested I submit an exploratory proposal. Would
> appreciate insights, wisdom, the good, the bad and the ugly.
>
> A)    Does it make sense ?
> B)    Which application sets should we consider in designing the
> interfaces and implementations
> C)    In this proposal, parallelism and concurrency are used in an
> interchangeable fashion. Thoughts ?
> D)    Please suggest pertinent links, discussions and insights.
> E)    I have kept the proposal to a minimum to start the discussions and
> to explore if this is the right thing to do. Collaboratively, as we
> zero-in on one or two approaches, the idea is to expand it to a crisp
> and clear PEP. Need to do some more formatting as well.

I'd say it is a little light on specific proposals. The only section
that actually proposes anything is this:

> Pragmatic proposal
> ------------------
> May I suggest we embed two primitives in Python 3K:
> A)    A functional style share-nothing set of interfaces (and
> implementations thereof) - provides  the task parallelism/concurrency
> capability, "small messages, big computations" as Joe Armstrong calls it[3]
> B)    A limited shared memory based model for data intensive parallelism
>
> Most probably this would be part of stdlib. While Guido is almost right
> in saying that this is a (std)library problem, it is not fully so. We
> would need a few primitives from the underlying PVM substrate. Possibly
> one reason for Guido's position is the lack of clarity as to what needs
> to be changed and why. IMHO, just saying take GIL off does not solve the
> problem either.

Before I can meaningfully comment I think I'd like to hear more about
what specifically you are thinking of.

I don't mind the necessary changes to the PVM. I would like to see how
this affects existing C extensions, though.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From aahz at pythoncraft.com  Tue Sep 18 21:26:14 2007
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 18 Sep 2007 12:26:14 -0700
Subject: [Python-Dev] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <46F01CC5.3000603@doubleclix.net>
References: <46F01CC5.3000603@doubleclix.net>
Message-ID: <20070918192614.GA6757@panix.com>

On Tue, Sep 18, 2007, Krishna Sankar wrote:
>
>    As a follow-up to the py3k discussions started by Bruce and Guido, I 
> pinged Brett and he suggested I submit an exploratory proposal. Would 
> appreciate insights, wisdom, the good, the bad and the ugly.

This should probably start in python-ideas.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The best way to get information on Usenet is not to ask a question, but
to post the wrong information.

From tulloss2 at uiuc.edu  Tue Sep 18 21:54:43 2007
From: tulloss2 at uiuc.edu (Justin Tulloss)
Date: Tue, 18 Sep 2007 14:54:43 -0500
Subject: [Python-Dev] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <46F01CC5.3000603@doubleclix.net>
References: <46F01CC5.3000603@doubleclix.net>
Message-ID: <2cfeb93c0709181254v50f26800h3d484b50e208db67@mail.gmail.com>

On 9/18/07, Krishna Sankar <ksankar at doubleclix.net> wrote:
>
> Folks,
>    As a follow-up to the py3k discussions started by Bruce and Guido, I
> pinged Brett and he suggested I submit an exploratory proposal. Would
> appreciate insights, wisdom, the good, the bad and the ugly.


I am currently working on parallelizing python as an undergraduate
independent study. I plan on first removing the GIL with as little overall
effect as possible and then implementing a task-oriented threading API on
top, probably based on Stackless (since they already do a great job with
concurrency in a single thread).

If you're interested in all the details, I'd be happy to share. I haven't
gotten far yet (the semester just started!), but I feel that actually
implementing these things would be the best way to get a PEP through.

Justin

From weilawei at gmail.com  Tue Sep 18 23:46:58 2007
From: weilawei at gmail.com (Rob Crowther)
Date: Tue, 18 Sep 2007 17:46:58 -0400
Subject: [Python-Dev] Extending Python 3000
Message-ID: <e8452f7c0709181446o8e52ad4pabf0078f723bc7c1@mail.gmail.com>

I'm attempting to wrap the GNU MP mpf_t type and related functions, as I
need the extra precision and speed for a package I'm writing. The available
wrappers are either unmaintained or undocumented and incompatible with
Python 3000. Following the docs in the Python 3000 section of the website,
I've started off with this:

*mpfmodule.c*


#include <Python.h>
#include <stdio.h>
#include <gmp.h>

typedef struct {
    PyObject_HEAD
    mpf_t ob_val;
} MPFObject;

static PyTypeObject MPFType = {
    PyObject_HEAD_INIT(NULL)
    0,                         /*ob_size*/
    "mpf.MPF",                 /*tp_name*/
    sizeof(MPFObject),         /*tp_basicsize*/
    0,                         /*tp_itemsize*/
    0,                         /*tp_dealloc*/
    0,                         /*tp_print*/
    0,                         /*tp_getattr*/
    0,                         /*tp_setattr*/
    0,                         /*tp_compare*/
    0,                         /*tp_repr*/
    0,                         /*tp_as_number*/
    0,                         /*tp_as_sequence*/
    0,                         /*tp_as_mapping*/
    0,                         /*tp_hash */
    0,                         /*tp_call*/
    0,                         /*tp_str*/
    0,                         /*tp_getattro*/
    0,                         /*tp_setattro*/
    0,                         /*tp_as_buffer*/
    Py_TPFLAGS_DEFAULT,        /*tp_flags*/
    "GNU MP mpf_t objects",    /* tp_doc */
};

static PyMethodDef mpf_methods[] = {
    {NULL}
};

#ifndef PyMODINIT_FUNC
#define PyMODINIT_FUNC void
#endif

PyMODINIT_FUNC initmpf(void) {
    PyObject* m;

    MPFType.tp_new = PyType_GenericNew;
    if (PyType_Ready(&MPFType) < 0)
        return;

    m = Py_InitModule3("mpf", mpf_methods,
                       "Wrapper around GNU MP mpf_t and related methods");

    Py_INCREF(&MPFType);
    PyModule_AddObject(m, "MPF", (PyObject *) &MPFType);
}



Upon running my setup.py script, it gives me numerous warnings. These do not
occur if I attempt to use Python 2.5. It also works fine under Python 2.5.


weilawei at archeron:~/Code/mpf$ python setup.py build
running build
running build_ext
building 'mpf' extension
gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes
-fPIC -DMAJOR_VERSION=1 -DMINOR_VERSION=0 -I/usr/local/include
-I/usr/local/include/python3.0 -c mpfmodule.c -o build/temp.linux-i686-3.0
/mpfmodule.o
mpfmodule.c:11: warning: missing braces around initializer
mpfmodule.c:11: warning: (near initialization for 'MPFType.ob_base.ob_base')
mpfmodule.c:13: warning: initialization makes integer from pointer without a cast
mpfmodule.c:32: warning: initialization from incompatible pointer type
gcc -pthread -shared build/temp.linux-i686-3.0/mpfmodule.o -L/usr/local/lib
-lgmp -o build/lib.linux-i686-3.0/mpf.so


Inside the Python interpreter:

weilawei at archeron:~/Code/mpf/build/lib.linux-i686-3.0$ python
Python 3.0a1 (py3k, Sep 15 2007, 00:33:44)
[GCC 4.1.3 20070831 (prerelease) (Ubuntu 4.1.2-16ubuntu1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mpf
>>> test = mpf.MPF()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
*MemoryError*
>>> mpf.MPF
*Segmentation fault*



Pointers as to what has changed and what I need to do to compile an
extension for Py3k would be very much appreciated. Thank you all in advance
for your time.

- Rob

From alexandre at peadrop.com  Wed Sep 19 00:09:00 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 18 Sep 2007 18:09:00 -0400
Subject: [Python-Dev] Extending Python 3000
In-Reply-To: <e8452f7c0709181446o8e52ad4pabf0078f723bc7c1@mail.gmail.com>
References: <e8452f7c0709181446o8e52ad4pabf0078f723bc7c1@mail.gmail.com>
Message-ID: <acd65fa20709181509l6baa3f6at4c38db09dc15c0db@mail.gmail.com>

PyObject_HEAD was changed in Py3k to make it conform to C's strict
aliasing rules (See PEP 3123 [1]).

In your code, you need to change:

    static PyTypeObject MPFType = {
        PyObject_HEAD_INIT(NULL)
        0,                         /*ob_size*/
        ...
    };

to this:

    static PyTypeObject MPFType = {
        PyVarObject_HEAD_INIT(NULL, 0)
        ...
    };

Good luck,
-- Alexandre

[1]: http://www.python.org/dev/peps/pep-3123/

From thomas at python.org  Wed Sep 19 02:02:59 2007
From: thomas at python.org (Thomas Wouters)
Date: Tue, 18 Sep 2007 17:02:59 -0700
Subject: [Python-Dev] SSL certs
In-Reply-To: <-6753447202579215070@unknownmsgid>
References: <46DDCD7C.40004@v.loewis.de> <46DECFF6.4040107@v.loewis.de>
	<46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de>
	<-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<20070912013412.GB14034@panix.com> <20070913042606.GB27547@panix.com>
	<46E8E4B0.60909@v.loewis.de> <-6753447202579215070@unknownmsgid>
Message-ID: <9e804ac0709181702v25bffd77laef8cc70127d5e11@mail.gmail.com>

On 9/13/07, Bill Janssen <janssen at parc.com> wrote:
>
> > However, there is an alternative to using multiple IP addresses:
> > one could also use multiple "subject alternative names", and create
> > a certificate that lists them all.
>
> Unfortunately, much of the client code that does the hostname
> verification is wrapped up in gullible Web browsers or Java HTTPS
> libraries that swallowed RFC 2818 whole, and not easily accessible by
> applications.  Does any of it recognize and accept "subject
> alternative name"?


For what it's worth, when I last looked at this (a year or so ago), only a
few fringe browsers on mobile phones had issues with accepting our wildcard
certificate, and some of those only because they didn't trust the root
authority.

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From thomas at python.org  Wed Sep 19 02:19:25 2007
From: thomas at python.org (Thomas Wouters)
Date: Tue, 18 Sep 2007 17:19:25 -0700
Subject: [Python-Dev] Decimal news
In-Reply-To: <e04bdf310709131219n7dde0ef9q9aa69265b0d3affb@mail.gmail.com>
References: <e04bdf310709131219n7dde0ef9q9aa69265b0d3affb@mail.gmail.com>
Message-ID: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com>

On 9/13/07, Facundo Batista <facundobatista at gmail.com> wrote:
>
> Hi people!
>
> After some months, Decimal is now in the trunk again.
>
> It's fully updated to the latest Cowlishaw specification, and
> complying with the latest test cases (from a few days ago, which even
> take in consideration some feedback from ours).
>
> I want to thank so much to Mark Dickinson, who made *a lot* of this
> work, not only the math part (he's a mathematician himself), but also
> a lot of cleaning and speeding up.
>
> Now we will put our hands in the documentation, for it to be 100% OK
> way before 2.6 arrives.
>
> Py3 will come after that.


Unfortunately, that's not how it works :-) If you check something into the
trunk, it will be merged into Py3k sooner or later. I may ask the original
submitter for assistance if it's incredibly hard to figure out the changes,
but so far, I only had to do that with the SSL changes. The decimal changes
are being merged as I write this (tests running now.) Is there anything in
particular that needs to be done for decimal in Py3k, besides renaming
__div__ to __truediv__?
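
A minimal sketch of the rename in question, assuming Python 3 operator
semantics, where "/" dispatches to __truediv__ and "//" to __floordiv__
(there is no __div__ slot any more). Ratio is a hypothetical toy class
used only to show the dispatch; it is not part of the decimal module:

```python
from decimal import Decimal


class Ratio:
    """Hypothetical toy numeric type used only to show operator dispatch."""

    def __init__(self, value):
        self.value = Decimal(value)

    def __truediv__(self, other):
        # Called for "a / b" under Python 3 (formerly __div__ in Python 2).
        return Ratio(self.value / other.value)

    def __floordiv__(self, other):
        # Called for "a // b"; unchanged between Python 2 and 3.
        return Ratio(self.value // other.value)


quotient = Ratio("1") / Ratio("4")
floored = Ratio("7") // Ratio("2")
print(quotient.value, floored.value)
```

A class that only defines __div__ would raise TypeError on "/" under
these semantics, which is why the rename matters for the merge.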

If you re-eally need to check something into the trunk that re-eally must
not be merged into py3k, but you're afraid it's not going to be obvious to
the merger, please record the change as 'merged' using "svnmerge merge -M
-r<revision>". Please take care when picking the revision ;) You can also
just email me or someone else you see doing merges, as I doubt this will be
a common occurrence.

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From orsenthil at gmail.com  Wed Sep 19 03:27:40 2007
From: orsenthil at gmail.com (O.R.Senthil Kumaran)
Date: Wed, 19 Sep 2007 06:57:40 +0530
Subject: [Python-Dev] IPv6 hostname resolution using socket.getaddrinfo()
In-Reply-To: <C38B725A049843478F1DA8F6A2448CDBC3F15E@blrx3m12.blr.amer.dell.com>
References: <20070917163736.GA5434@gmail.com>
	<C38B725A049843478F1DA8F6A2448CDBC3F15E@blrx3m12.blr.amer.dell.com>
Message-ID: <20070919012740.GA3410@gmail.com>

* Blinston_Fernandes at Dell.com <Blinston_Fernandes at Dell.com> :

> On python2.4.1
> 
> >>> socket.getaddrinfo('www.python.org', None, socket.AF_INET,
> socket.SOCK_DGRAM, socket.IPPROTO_IP, socket.AI_CANONNAME)
> [(2, 2, 17, 'dinsdale.python.org', ('82.94.237.218', 0))]
> >>>
> 
> Blinston.


Thanks a lot, Blinston. That helped.
I just have to take care of the socket.AF_INET6 family for IPv6 now.

>>> socket.getaddrinfo('localhost', None, socket.AF_INET6, socket.SOCK_DGRAM, socket.IPPROTO_IP, socket.AI_CANONNAME)
[(10, 2, 17, 'localhost', ('fe80::219:5bff:fefd:6270', 0, 0, 0))]

I shall do a little more research on the flags and see whether the
documentation needs an update, because the current text only covers the
case where AI_CANONNAME is set.
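
For reference, a minimal sketch of a lookup helper with the arguments in
the positions getaddrinfo() expects (host, port, family, socktype, proto,
flags). In the call quoted below, AI_CANONNAME is the third positional
argument, so it is read as the address family rather than as a flag.
Python 3 syntax is assumed, and resolve_addresses and canonical_name are
hypothetical names chosen for illustration:

```python
import socket


def resolve_addresses(hostname, family=socket.AF_UNSPEC):
    """Return the unique IPv4/IPv6 address strings for hostname.

    AF_UNSPEC asks the resolver for both address families; the address
    itself is always the first element of the sockaddr tuple.
    """
    infos = socket.getaddrinfo(hostname, None, family, socket.SOCK_STREAM)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        address = sockaddr[0]
        if address not in addresses:
            addresses.append(address)
    return addresses


def canonical_name(hostname, family=socket.AF_UNSPEC):
    """Return the canonical name reported for hostname.

    AI_CANONNAME must be passed as the flags argument (sixth position);
    the name is only filled in on the first result.
    """
    infos = socket.getaddrinfo(hostname, None, family, socket.SOCK_STREAM,
                               0, socket.AI_CANONNAME)
    return infos[0][3]


print(resolve_addresses("127.0.0.1"))
```

What canonical_name() returns depends on the local resolver
configuration, so no particular output is assumed for it here.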


Thank you. :)
Senthil



> 
> -----Original Message-----
> From: python-dev-bounces+blinston_fernandes=dell.com at python.org
> [mailto:python-dev-bounces+blinston_fernandes=dell.com at python.org] On
> Behalf Of O.R.Senthil Kumaran
> Sent: Monday, September 17, 2007 10:08 PM
> To: python-dev at python.org
> Subject: [Python-Dev] IPv6 hostname resolution using
> socket.getaddrinfo()
> 
> To get the hostname, I can use socket.gethostbyname() but that has an
> inherent limitation wherein does it not support IPv6 name resolution,
> and
> getaddrinfo() should be used instead.
> 
> Looking up the socket.getaddrinfo() documentation, I come to know that
> 
> The getaddrinfo() function returns a list of 5-tuples with the following
> structure:
> 
> (family, socktype, proto, canonname, sockaddr)
> 
> family, socktype, proto are all integer and are meant to be passed to
> the socket() function. canonname is a string representing the canonical
> name of the host. It can be a numeric IPv4/v6 address when AI_CANONNAME
> is specified for a numeric host.
> 
> With this information, if I try something like this:
> 
> >>> for res in socket.getaddrinfo('goofy.goofy.com', None,
> 	socket.AI_CANONNAME):
> 
>         print res
> 
> (2, 1, 6, '', ('10.98.1.6', 0))
> (2, 2, 17, '', ('10.98.1.6', 0))
> (2, 3, 0, '', ('10.98.1.6', 0))
> 
> In the output, I see the cannoname to be always blank ''. I am not
> getting the IPv4 or IPv6 address as a result of using getaddrinfo().
> 
> Am I making any mistake?
> 
> What i am trying is a replacement function for
> socket.gethostbyname(hostname) which will work for both IPv4 and IPv6
> (and make changes in urllib2 to support that)
> 
> # return hostbyname for either IPv4 or IPv6 address. Common function.
> 
> def ipv6_gethostbyname(hostname):
>         for res in socket.getaddrinfo(hostname,None,
> 		socket.AI_CANONNAME):
>                 fa, socktype, proto, canonname, sa = res
>         return cannoname
> 
> The above function does not seem to work. It returns blank value only.
> 
> Any help/ pointers? 
> 
> --
> O.R.Senthil Kumaran
> http://uthcode.sarovar.org

-- 
O.R.Senthil Kumaran
http://uthcode.sarovar.org

From janssen at parc.com  Wed Sep 19 03:29:23 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 18 Sep 2007 18:29:23 PDT
Subject: [Python-Dev] SSL certs
In-Reply-To: <9e804ac0709181702v25bffd77laef8cc70127d5e11@mail.gmail.com> 
References: <46DDCD7C.40004@v.loewis.de> <46DECFF6.4040107@v.loewis.de>
	<46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de>
	<-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<20070912013412.GB14034@panix.com>
	<20070913042606.GB27547@panix.com> <46E8E4B0.60909@v.loewis.de>
	<-6753447202579215070@unknownmsgid>
	<9e804ac0709181702v25bffd77laef8cc70127d5e11@mail.gmail.com>
Message-ID: <07Sep18.182929pdt."57996"@synergy1.parc.xerox.com>

I guess something we should think about is whether to introduce RFC
2818 hostname checking into urllib.urlopen() and similar utilities.
Presumably one would add an optional arg specifying a file full of
root certs to validate against, and if that arg was present, would
retrieve the hostname info from the validated cert, and do the
client-side check.
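
A rough sketch of the idea, written against the ssl module as it exists
in current Python 3 rather than anything available at the time of this
thread. build_verifying_context and open_verified are hypothetical names,
and cafile plays the role of the optional root-certificate argument:

```python
import ssl
import urllib.request


def build_verifying_context(cafile=None):
    """Return an SSLContext that validates the peer certificate and hostname.

    cafile is an optional path to a PEM file of root certificates; when it
    is None the platform's default trust store is used.
    """
    context = ssl.create_default_context(cafile=cafile)
    # create_default_context() already enables both checks for client use;
    # they are set explicitly here only to make the intent visible.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


def open_verified(url, cafile=None):
    """Open an https URL, failing if the certificate or hostname is invalid."""
    return urllib.request.urlopen(url, context=build_verifying_context(cafile))
```

With this shape, callers that never pass cafile still get verification
against the system roots, which may or may not be the default one wants.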

Bill

From ksankar at doubleclix.net  Wed Sep 19 04:36:25 2007
From: ksankar at doubleclix.net (Krishna Sankar)
Date: Tue, 18 Sep 2007 19:36:25 -0700
Subject: [Python-Dev] Exploration PEP : Concurrency for moderately
 massive (4 to 32 cores) multi-core architectures
In-Reply-To: <ca471dc20709181215w30de4634sb98dc5e02e34383f@mail.gmail.com>
References: <46F01CC5.3000603@doubleclix.net>
	<ca471dc20709181215w30de4634sb98dc5e02e34383f@mail.gmail.com>
Message-ID: <46F08B29.6050206@doubleclix.net>

Guido,
    The vagueness is deliberate, to keep the options open until we have
some form of convergence. Parallelism/concurrency is a vast and important
domain, so I do not want to develop a hasty proposal. But I did want to
start thinking in terms of concrete proposals, not pontificating, hence the
"pragmatic" section.
    Happy to hear that you are open to PVM changes. It will not be asked 
unless and until we all are crisp about it.
Cheers
<k/>
   
Guido van Rossum wrote:
> On 9/18/07, Krishna Sankar <ksankar at doubleclix.net> wrote:
>   
>> Folks,
>>    As a follow-up to the py3k discussions started by Bruce and Guido, I
>> pinged Brett and he suggested I submit an exploratory proposal. Would
>> appreciate insights, wisdom, the good, the bad and the ugly.
>>
>> A)    Does it make sense ?
>> B)    Which application sets should we consider in designing the
>> interfaces and implementations
>> C)    In this proposal, parallelism and concurrency are used in an
>> interchangeable fashion. Thoughts ?
>> D)    Please suggest pertinent links, discussions and insights.
>> E)    I have kept the proposal to a minimum to start the discussions and
>> to explore if this is the right thing to do. Collaboratively, as we
>> zero-in on one or two approaches, the idea is to expand it to a crisp
>> and clear PEP. Need to do some more formatting as well.
>>     
>
> I'd say it is a little light on specific proposals. The only section
> that actually proposes anything is this:
>
>   
>> Pragmatic proposal
>> ------------------
>> May I suggest we embed two primitives in Python 3K:
>> A)    A functional style share-nothing set of interfaces (and
>> implementations thereof) - provides  the task parallelism/concurrency
>> capability, "small messages, big computations" as Joe Armstrong calls it[3]
>> B)    A limited shared memory based model for data intensive parallelism
>>
>> Most probably this would be part of stdlib. While Guido is almost right
>> in saying that this is a (std)library problem, it is not fully so. We
>> would need a few primitives from the underlying PVM substrate. Possibly
>> one reason for Guido's position is the lack of clarity as to what needs
>> to be changed and why. IMHO, just saying take GIL off does not solve the
>> problem either.
>>     
>
> Before I can meaningfully comment I think I'd like to hear more about
> what specifically you are thinking of.
>
> I don't mind the necessary changes to the PVM. I do like to see how
> this affects existing C extensions though.
>
>   


From ksankar at doubleclix.net  Wed Sep 19 04:43:18 2007
From: ksankar at doubleclix.net (Krishna Sankar)
Date: Tue, 18 Sep 2007 19:43:18 -0700
Subject: [Python-Dev] Exploration PEP : Concurrency for moderately
 massive (4 to 32 cores) multi-core architectures
In-Reply-To: <2cfeb93c0709181254v50f26800h3d484b50e208db67@mail.gmail.com>
References: <46F01CC5.3000603@doubleclix.net>
	<2cfeb93c0709181254v50f26800h3d484b50e208db67@mail.gmail.com>
Message-ID: <46F08CC6.2060208@doubleclix.net>

Justin,
    Yep, trying out an implementation is a good way. Please share your 
thoughts as and when you are ready.
Cheers & good luck
<k/>
Justin Tulloss wrote:
>
>
> On 9/18/07, *Krishna Sankar* <ksankar at doubleclix.net 
> <mailto:ksankar at doubleclix.net>> wrote:
>
>     Folks,
>        As a follow-up to the py3k discussions started by Bruce and
>     Guido, I
>     pinged Brett and he suggested I submit an exploratory proposal. Would
>     appreciate insights, wisdom, the good, the bad and the ugly.
>
>
> I am currently working on parallelizing python as an undergraduate 
> independent study. I plan on first removing the GIL with as little 
> overall effect as possible and then implementing a task-oriented 
> threading API on top, probably based on Stackless (since they already 
> do a great job with concurrency in a single thread).
>
> If you're interested in all the details, I'd be happy to share. I 
> haven't gotten far yet (the semester just started!), but I feel that 
> actually implementing these things would be the best way to get a PEP 
> through.
>
> Justin
>
>


From guido at python.org  Wed Sep 19 05:24:28 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Sep 2007 20:24:28 -0700
Subject: [Python-Dev] Exploration PEP : Concurrency for moderately
	massive (4 to 32 cores) multi-core architectures
In-Reply-To: <46F08B29.6050206@doubleclix.net>
References: <46F01CC5.3000603@doubleclix.net>
	<ca471dc20709181215w30de4634sb98dc5e02e34383f@mail.gmail.com>
	<46F08B29.6050206@doubleclix.net>
Message-ID: <ca471dc20709182024v48c0aa37wc1637f711103e8b3@mail.gmail.com>

On 9/18/07, Krishna Sankar <ksankar at doubleclix.net> wrote:
>     The vagueness is deliberate, to keep the options open until we have
> some form of convergence. Parallelism/concurrency is a vast and important
> domain, so I do not want to develop a hasty proposal. But I did want to
> start thinking in terms of concrete proposals, not pontificating, hence the
> "pragmatic" section.

As long as it's this vague it doesn't deserve to be called a PEP
though. PEPs can't be vague, they must make specific proposals. As
long as this is intentionally half-baked it belongs back in
python-ideas and there's no point in pretending to be writing a "PEP".

>     Happy to hear that you are open to PVM changes. It will not be asked
> unless and until we all are crisp about it.
> Cheers
> <k/>
>
> Guido van Rossum wrote:
> > On 9/18/07, Krishna Sankar <ksankar at doubleclix.net> wrote:
> >
> >> Folks,
> >>    As a follow-up to the py3k discussions started by Bruce and Guido, I
> >> pinged Brett and he suggested I submit an exploratory proposal. Would
> >> appreciate insights, wisdom, the good, the bad and the ugly.
> >>
> >> A)    Does it make sense ?
> >> B)    Which application sets should we consider in designing the
> >> interfaces and implementations
> >> C)    In this proposal, parallelism and concurrency are used in an
> >> interchangeable fashion. Thoughts ?
> >> D)    Please suggest pertinent links, discussions and insights.
> >> E)    I have kept the proposal to a minimum to start the discussions and
> >> to explore if this is the right thing to do. Collaboratively, as we
> >> zero-in on one or two approaches, the idea is to expand it to a crisp
> >> and clear PEP. Need to do some more formatting as well.
> >>
> >
> > I'd say it is a little light on specific proposals. The only section
> > that actually proposes anything is this:
> >
> >
> >> Pragmatic proposal
> >> ------------------
> >> May I suggest we embed two primitives in Python 3K:
> >> A)    A functional style share-nothing set of interfaces (and
> >> implementations thereof) - provides  the task parallelism/concurrency
> >> capability, "small messages, big computations" as Joe Armstrong calls it[3]
> >> B)    A limited shared memory based model for data intensive parallelism
> >>
> >> Most probably this would be part of stdlib. While Guido is almost right
> >> in saying that this is a (std)library problem, it is not fully so. We
> >> would need a few primitives from the underlying PVM substrate. Possibly
> >> one reason for Guido's position is the lack of clarity as to what needs
> >> to be changed and why. IMHO, just saying take GIL off does not solve the
> >> problem either.
> >>
> >
> > Before I can meaningfully comment I think I'd like to hear more about
> > what specifically you are thinking of.
> >
> > I don't mind the necessary changes to the PVM. I do like to see how
> > this affects existing C extensions though.
> >
> >
>
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ksankar at doubleclix.net  Wed Sep 19 05:43:20 2007
From: ksankar at doubleclix.net (Krishna Sankar)
Date: Tue, 18 Sep 2007 20:43:20 -0700
Subject: [Python-Dev] Exploration PEP : Concurrency for moderately
 massive (4 to 32 cores) multi-core architectures
In-Reply-To: <ca471dc20709182024v48c0aa37wc1637f711103e8b3@mail.gmail.com>
References: <46F01CC5.3000603@doubleclix.net>	
	<ca471dc20709181215w30de4634sb98dc5e02e34383f@mail.gmail.com>	
	<46F08B29.6050206@doubleclix.net>
	<ca471dc20709182024v48c0aa37wc1637f711103e8b3@mail.gmail.com>
Message-ID: <46F09AD8.10005@doubleclix.net>

Agreed, it is not a PEP yet. Hence the word "Exploration" in front of it.
This is a domain which needs some discussion before developing a good
PEP. Maybe I should call it a PEPlet ;o)
Cheers
<k/>
Guido van Rossum wrote:
> On 9/18/07, Krishna Sankar <ksankar at doubleclix.net> wrote:
>   
>>     The vagueness is deliberate, to keep the options open until we have
>> some form of convergence. Parallelism/concurrency is a vast and important
>> domain, so I do not want to develop a hasty proposal. But I did want to
>> start thinking in terms of concrete proposals, not pontificating, hence the
>> "pragmatic" section.
>>     
>
> As long as it's this vague it doesn't deserve to be called a PEP
> though. PEPs can't be vague, they must make specific proposals. As
> long as this is intentionally half-baked it belongs back in
> python-ideas and there's no point in pretending to be writing a "PEP".
>
>   
>>     Happy to hear that you are open to PVM changes. It will not be asked
>> unless and until we all are crisp about it.
>> Cheers
>> <k/>
>>
>> Guido van Rossum wrote:
>>     
>>> On 9/18/07, Krishna Sankar <ksankar at doubleclix.net> wrote:
>>>
>>>       
>>>> Folks,
>>>>    As a follow-up to the py3k discussions started by Bruce and Guido, I
>>>> pinged Brett and he suggested I submit an exploratory proposal. Would
>>>> appreciate insights, wisdom, the good, the bad and the ugly.
>>>>
>>>> A)    Does it make sense ?
>>>> B)    Which application sets should we consider in designing the
>>>> interfaces and implementations
>>>> C)    In this proposal, parallelism and concurrency are used in an
>>>> interchangeable fashion. Thoughts ?
>>>> D)    Please suggest pertinent links, discussions and insights.
>>>> E)    I have kept the proposal to a minimum to start the discussions and
>>>> to explore if this is the right thing to do. Collaboratively, as we
>>>> zero-in on one or two approaches, the idea is to expand it to a crisp
>>>> and clear PEP. Need to do some more formatting as well.
>>>>
>>>>         
>>> I'd say it is a little light on specific proposals. The only section
>>> that actually proposes anything is this:
>>>
>>>
>>>       
>>>> Pragmatic proposal
>>>> ------------------
>>>> May I suggest we embed two primitives in Python 3K:
>>>> A)    A functional style share-nothing set of interfaces (and
>>>> implementations thereof) - provides  the task parallelism/concurrency
>>>> capability, "small messages, big computations" as Joe Armstrong calls it[3]
>>>> B)    A limited shared memory based model for data intensive parallelism
>>>>
>>>> Most probably this would be part of stdlib. While Guido is almost right
>>>> in saying that this is a (std)library problem, it is not fully so. We
>>>> would need a few primitives from the underlying PVM substrate. Possibly
>>>> one reason for Guido's position is the lack of clarity as to what needs
>>>> to be changed and why. IMHO, just saying take GIL off does not solve the
>>>> problem either.
>>>>
>>>>         
>>> Before I can meaningfully comment I think I'd like to hear more about
>>> what specifically you are thinking of.
>>>
>>> I don't mind the necessary changes to the PVM. I do like to see how
>>> this affects existing C extensions though.
>>>
>>>
>>>       
>>     
>
>
>   


From jdsw2002 at yahoo.com  Wed Sep 19 10:40:30 2007
From: jdsw2002 at yahoo.com (jd)
Date: Wed, 19 Sep 2007 01:40:30 -0700 (PDT)
Subject: [Python-Dev] Pygtk app and hangs.
Message-ID: <183106.76731.qm@web35813.mail.mud.yahoo.com>

Hi
  I have a non-trivial pygtk app that is running into
hangs/freezes.

Overall, here is what the program looks like:

gobject.threads_init()

gtk.main within threads_enter/threads_leave

All UI operations happen in the main thread.
Some callbacks create UIWorker threads. A UI worker
thread does some work and then calls gobject.idle_add
to schedule a function that updates the UI.

I have a timer that uses gobject.timeout_add.

The idle callbacks and timeout callbacks are wrapped in
threads_enter/threads_leave.

I also use a library that creates its own threads and does
socket operations.
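
The worker-to-main-thread hand-off described above can be sketched with
the standard library alone. This is not PyGTK code: the hypothetical
MainLoop class below only imitates the role of gobject.idle_add (queue a
callback from any thread, run it on the thread that owns the loop), and
Python 3 syntax is assumed:

```python
import queue
import threading


class MainLoop:
    """Tiny stand-in for a GUI main loop (hypothetical, not a PyGTK API)."""

    def __init__(self):
        self._pending = queue.Queue()

    def idle_add(self, callback, *args):
        # Safe to call from any thread: it only enqueues the callback.
        self._pending.put((callback, args))

    def run(self, expected):
        # Runs queued callbacks on the calling ("main") thread only.
        handled = 0
        while handled < expected:
            callback, args = self._pending.get()
            callback(*args)
            handled += 1


def run_demo(jobs):
    loop = MainLoop()
    updates = []
    main_thread = threading.current_thread()

    def update_ui(label):
        # "UI" state is touched from the main thread only.
        assert threading.current_thread() is main_thread
        updates.append(label)

    def worker(n):
        result = n * n  # the slow work happens off the main thread
        loop.idle_add(update_ui, "job %d -> %d" % (n, result))

    threads = [threading.Thread(target=worker, args=(n,)) for n in jobs]
    for thread in threads:
        thread.start()
    loop.run(expected=len(jobs))
    for thread in threads:
        thread.join()
    return sorted(updates)


print(run_demo([1, 2, 3]))
```

The point of the pattern is that worker threads never touch UI state
directly; they only post callbacks for the loop's own thread to run.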

Q. Is there anything I missed, or any suggestions? Is there a
comprehensive list/scheme on how to write a multi-threaded pygtk
app?

Q. I tried to set up a debug build of Python 2.5, but it fails
with

  import gtk
  File "/usr/lib/python2.5/site-packages/gtk-2.0/gtk/__init__.py", line 38, in <module>
    import gobject as _gobject
  File "/usr/lib/python2.5/site-packages/gtk-2.0/gobject/__init__.py", line 30, in <module>
    from _gobject import *
ImportError: /usr/lib/python2.5/site-packages/gtk-2.0/gobject/_gobject.so: undefined symbol: PyUnicodeUCS4_FromObject

What special switch do I need to give to the configure
while building python ? 
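
The undefined symbol PyUnicodeUCS4_FromObject suggests that _gobject.so was
compiled against a wide-Unicode (UCS4) interpreter while the debug build is a
narrow (UCS2) one. Python 2.x selects the width at build time with the
configure option --enable-unicode=ucs4. A small check, to be run under each
interpreter, shows which kind of build it is; unicode_build_width is a
hypothetical helper name.

```python
import sys


def unicode_build_width():
    """Return 'UCS4' for a wide-Unicode build and 'UCS2' for a narrow one."""
    # Narrow 2.x builds report sys.maxunicode == 0xFFFF; wide builds
    # report 0x10FFFF. Python 3.3 and later always report 0x10FFFF.
    return "UCS4" if sys.maxunicode > 0xFFFF else "UCS2"


print(unicode_build_width())
```

If the system interpreter reports UCS4 and the debug interpreter reports
UCS2, extension modules built for one cannot be loaded by the other.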

Q. I have attached thread dumps. Any input on what
might be going on?

Q. Does event processing for modal dialog boxes happen
in the main thread?

Sorry for sending this to both lists. But the app is
pygtk while the stack *seems* fairly clean (other than
the main thread).

Thanks a ton, in advance.
/Jd


       
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gdb_hang_problem
Type: application/octet-stream
Size: 18347 bytes
Desc: 130815649-gdb_hang_problem
Url : http://mail.python.org/pipermail/python-dev/attachments/20070919/11299660/attachment-0001.obj 

From steve at shrogers.com  Wed Sep 19 14:39:33 2007
From: steve at shrogers.com (Steven H. Rogers)
Date: Wed, 19 Sep 2007 06:39:33 -0600
Subject: [Python-Dev] Exploration PEP : Concurrency for moderately
 massive (4 to 32 cores) multi-core architectures
In-Reply-To: <46F01CC5.3000603@doubleclix.net>
References: <46F01CC5.3000603@doubleclix.net>
Message-ID: <46F11885.7000406@shrogers.com>

Krishna Sankar wrote:
> Folks,
>    As a follow-up to the py3k discussions started by Bruce and Guido, I 
> pinged Brett and he suggested I submit an exploratory proposal. Would 
> appreciate insights, wisdom, the good, the bad and the ugly.
>
> A)    Does it make sense ?
> B)    Which application sets should we consider in designing the 
> interfaces and implementations
> C)    In this proposal, parallelism and concurrency are used in an 
> interchangeable fashion. Thoughts ?
> D)    Please suggest pertinent links, discussions and insights.
> E)    I have kept the proposal to a minimum to start the discussions and 
> to explore if this is the right thing to do. Collaboratively, as we 
> zero-in on one or two approaches, the idea is to expand it to a crisp 
> and clear PEP. Need to do some more formatting as well.
> Cheers
> <k/>
> P.S : I had sent this to python-ideas couple of days ago and received 
> two comments (Thanks Leonardo, Thanks Adam) I haven't incorporated their 
> comments yet. Folks who are on both lists, pardon me for the spam.
# Proto-PEP elided.

Other than number of cores, you don't mention hardware architecture.  I 
presume that you're thinking of symmetric multiprocessor architectures.  
If so, this should be explicit.  You should also consider that SMP may 
not be the predominant multi-core architecture in the future; the Cell 
processor has one general purpose processor and eight more specialized 
processors.  You might not want to limit the PEP to 32 cores, I know of 
startups working on 40 and 64 core chips.

Shared memory may be necessary for good performance, but it doesn't have 
to be exposed at the language level.  While Erlang has strictly message 
passing semantics, I believe that it uses shared memory in the low level 
implementation.

# Steve




From facundobatista at gmail.com  Wed Sep 19 15:01:32 2007
From: facundobatista at gmail.com (Facundo Batista)
Date: Wed, 19 Sep 2007 10:01:32 -0300
Subject: [Python-Dev] Decimal news
In-Reply-To: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com>
References: <e04bdf310709131219n7dde0ef9q9aa69265b0d3affb@mail.gmail.com>
	<9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com>
Message-ID: <e04bdf310709190601g664c2cd4w4e8b2f9a0c441d51@mail.gmail.com>

2007/9/18, Thomas Wouters <thomas at python.org>:

> Unfortunately, that's not how it works :-) If you check something into the
> trunk, it will be merged into Py3k sooner or later. I may ask the original
> submitter for assistance if it's incredibly hard to figure out the changes,
> but so far, I only had to do that with the SSL changes. The decimal changes
> are being merged as I write this (tests running now.) Is there anything in
> particular that needs to be done for decimal in Py3k, besides renaming
> __div__ to __truediv__?
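
For readers unfamiliar with the rename mentioned above, here is a minimal
sketch. Money is a hypothetical class made up for the illustration and is not
taken from the decimal module.

```python
class Money:
    """Hypothetical numeric type used only to illustrate the rename."""

    def __init__(self, cents):
        self.cents = cents

    def __truediv__(self, other):
        # Called for `a / b` in Py3k, and in 2.x when the module uses
        # `from __future__ import division`.
        return Money(self.cents / other)

    # Classic 2.x division (no __future__ import) looks up __div__ instead;
    # the alias lets the same source work on both lines of development.
    __div__ = __truediv__


quarter = Money(500) / 4
print(quarter.cents)
```

In Py3k only __truediv__ is consulted for the / operator, so a class that
defines just __div__ stops supporting division after the move.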

There isn't anything really special to do; I had that plan only because
I didn't know how the mechanism worked ;)

It'd be great if all the changes that I'm making to Decimal are
automatically, at some point, merged into Py3k (I guess by using the
conversion tool).

But at some point, the two code bases may start to diverge, because
Py3k-specific optimizations could be done there... but this could
happen in a year or two ;).

So, how is this handled? Until when can I expect changes in the
trunk to be merged into Py3k?

Thank you very much!

Regards,

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From thomas at python.org  Wed Sep 19 16:30:18 2007
From: thomas at python.org (Thomas Wouters)
Date: Wed, 19 Sep 2007 07:30:18 -0700
Subject: [Python-Dev] Decimal news
In-Reply-To: <e04bdf310709190601g664c2cd4w4e8b2f9a0c441d51@mail.gmail.com>
References: <e04bdf310709131219n7dde0ef9q9aa69265b0d3affb@mail.gmail.com>
	<9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com>
	<e04bdf310709190601g664c2cd4w4e8b2f9a0c441d51@mail.gmail.com>
Message-ID: <9e804ac0709190730m2ff4a290x7601c74d18e06237@mail.gmail.com>

On 9/19/07, Facundo Batista <facundobatista at gmail.com> wrote:
>
> 2007/9/18, Thomas Wouters <thomas at python.org>:
>
> > Unfortunately, that's not how it works :-) If you check something into the
> > trunk, it will be merged into Py3k sooner or later. I may ask the original
> > submitter for assistance if it's incredibly hard to figure out the changes,
> > but so far, I only had to do that with the SSL changes. The decimal changes
> > are being merged as I write this (tests running now.) Is there anything in
> > particular that needs to be done for decimal in Py3k, besides renaming
> > __div__ to __truediv__?
>
> There isn't anything really special to do; I had that plan only because
> I didn't know how the mechanism worked ;)
>
> It'd be great if all the changes that I'm making to Decimal are
> automatically, at some point, merged into Py3k (I guess by using the
> conversion tool).


I don't usually have to use the 2to3 tool, but sometimes, yes.

> But at some point, the two code bases may start to diverge, because
> Py3k-specific optimizations could be done there... but this could
> happen in a year or two ;).
>
> So, how is this handled? Until when can I expect changes in the
> trunk to be merged into Py3k?


Until you hear otherwise :) You can commit py3k-specific changes to the py3k
branch; the merges shouldn't lose them. (Of course, mistakes in merging are
possible, which is why tests are good :) If I do base the merge on the 2to3
output of the trunk version, I'll be careful not to lose changes made in the
branch.

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From facundobatista at gmail.com  Wed Sep 19 16:33:48 2007
From: facundobatista at gmail.com (Facundo Batista)
Date: Wed, 19 Sep 2007 11:33:48 -0300
Subject: [Python-Dev] Decimal news
In-Reply-To: <9e804ac0709190730m2ff4a290x7601c74d18e06237@mail.gmail.com>
References: <e04bdf310709131219n7dde0ef9q9aa69265b0d3affb@mail.gmail.com>
	<9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com>
	<e04bdf310709190601g664c2cd4w4e8b2f9a0c441d51@mail.gmail.com>
	<9e804ac0709190730m2ff4a290x7601c74d18e06237@mail.gmail.com>
Message-ID: <e04bdf310709190733j21ee7194y18a129a5c6572897@mail.gmail.com>

2007/9/19, Thomas Wouters <thomas at python.org>:

> > So, how is this handled? Until when can I expect changes in the
> > trunk to be merged into Py3k?
>
> Until you hear otherwise :) You can commit py3k-specific changes to the py3k
> branch; the merges shouldn't lose them. (Of course, mistakes in merging are
> possible, which is why tests are good :) If I do base the merge on the 2to3
> output of the trunk version, I'll be careful not to lose changes made in the
> branch.

Ok, thank you very much!!

Regards,

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From aahz at pythoncraft.com  Wed Sep 19 17:13:07 2007
From: aahz at pythoncraft.com (Aahz)
Date: Wed, 19 Sep 2007 08:13:07 -0700
Subject: [Python-Dev] Pygtk app and hangs.
In-Reply-To: <183106.76731.qm@web35813.mail.mud.yahoo.com>
References: <183106.76731.qm@web35813.mail.mud.yahoo.com>
Message-ID: <20070919151307.GA7802@panix.com>

On Wed, Sep 19, 2007, jd wrote:
>
>   I have a non-trivial pygtk app running into hangs/freezes.

python-dev is not an appropriate place to ask for help with debugging
your programs.  It is only for people working on the Python package
itself.  Please use the pygtk list (which you already did) or the
newsgroup comp.lang.python.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The best way to get information on Usenet is not to ask a question, but
to post the wrong information.

From status at bugs.python.org  Wed Sep 19 20:14:03 2007
From: status at bugs.python.org (Tracker)
Date: Wed, 19 Sep 2007 18:14:03 +0000 (UTC)
Subject: [Python-Dev] Summary of Tracker Issues
Message-ID: <20070919181403.346B9782C1@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (09/12/07 - 09/19/07)
Tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 1266 open (+13) / 11396 closed (+11) / 12662 total (+24)

Open issues with patches:   408

Average duration of open issues: 678 days.
Median duration of open issues: 643 days.

Open Issues Breakdown
   open  1262 (+13)
pending     4 ( +0)

Issues Created Or Reopened (24)
_______________________________

Suggested change to _exit function description in os module docu 09/12/07
CLOSED http://bugs.python.org/issue1156    created  jtonsing                 
                                                                               

test_urllib2net fails on test_ftp                                09/12/07
       http://bugs.python.org/issue1157    created  gvanrossum               
                                                                               

%f format for datetime objects                                   09/13/07
       http://bugs.python.org/issue1158    created  skip.montanaro           
       py3k, patch                                                             

os.getenv() not updated after external module uses C putenv()    09/13/07
       http://bugs.python.org/issue1159    created  robert.ancell            
                                                                               

Medium size regexp crashes python                                09/13/07
       http://bugs.python.org/issue1160    created  ostkamp                  
                                                                               

Garbled chars in offending line of SyntaxError traceback         09/13/07
       http://bugs.python.org/issue1161    created  eopadoan                 
                                                                               

Python doesn't compile on Microsoft Visual Studio 2008 "Orcas" B 09/13/07
CLOSED http://bugs.python.org/issue1162    created  swaroopch                
                                                                               

Patch to make py3k/Lib/test/test_thread.py use unittest          09/13/07
       http://bugs.python.org/issue1163    created  JonoDiCarlo              
       patch                                                                   

tp_print slots don't release the GIL                             09/14/07
CLOSED http://bugs.python.org/issue1164    created  arigo                    
       patch                                                                   

Should itertools.count work for arbitrary integers?              09/15/07
       http://bugs.python.org/issue1165    created  eopadoan                 
       py3k                                                                    

NameError when calling malloc                                    09/15/07
CLOSED http://bugs.python.org/issue1166    created  esr                      
                                                                               

gdbm/ndbm 1.8.1+ needs libgdbm_compat.so                         09/16/07
       http://bugs.python.org/issue1167    created  ikelly                   
       patch                                                                   

complex arithmetic: strange results with "imag"                  09/16/07
CLOSED http://bugs.python.org/issue1168    created  newman                   
                                                                               

Option -OO doesn't remove docstrings from functions              09/16/07
CLOSED http://bugs.python.org/issue1169    created  piro                     
       patch                                                                   

shlex have problems with parsing unicode                         09/17/07
       http://bugs.python.org/issue1170    created  dexen                    
                                                                               

allow subclassing of bytes type                                  09/17/07
       http://bugs.python.org/issue1171    created  mfenniak                 
       py3k, patch                                                             

Documentation for done attribute of FieldStorage class           09/17/07
CLOSED http://bugs.python.org/issue1172    created  bkline                   
       patch                                                                   

yield expressions not documented in Language Reference           09/17/07
CLOSED http://bugs.python.org/issue1173    created  dangyogi                 
                                                                               

new generator methods not documented in Library Reference        09/17/07
CLOSED http://bugs.python.org/issue1174    created  dangyogi                 
                                                                               

.readline() has bug WRT nonblocking files                        09/18/07
CLOSED http://bugs.python.org/issue1175    created  ajb                      
                                                                               

str.split() takes no keyword arguments (Should this be expected? 09/18/07
       http://bugs.python.org/issue1176    created  sergioc                  
                                                                               

urllib* 20x responses not OK?                                    09/19/07
CLOSED http://bugs.python.org/issue1177    reopened jafo                     
       patch                                                                   

IDLE - add "paste code" functionality                            09/18/07
       http://bugs.python.org/issue1178    created  taleinat                 
       patch                                                                   

[CVE-2007-4965] Integer overflow in imageop module               09/19/07
       http://bugs.python.org/issue1179    created  cartman                  
                                                                               



Issues Now Closed (34)
______________________

cgi:  parse_qs and parse_qsl misbehave on empty strings            24 days
       http://bugs.python.org/issue1014    gvanrossum               
                                                                               

[py3k] pdb does not work in python 3000                            16 days
       http://bugs.python.org/issue1038    georg.brandl             
       py3k                                                                    

platform system may be Windows or Microsoft since Vista            17 days
       http://bugs.python.org/issue1082    p.lavarre at ieee.org       
       patch                                                                   

"make altinstall" installs pydoc, idle, smtpd.py with broken she    6 days
       http://bugs.python.org/issue1120    georg.brandl             
                                                                               

Document inspect.getfullargspec()                                   6 days
       http://bugs.python.org/issue1121    georg.brandl             
       py3k                                                                    

split(None, maxsplit) does not strip whitespace correctly          12 days
       http://bugs.python.org/issue1123    nirs                     
                                                                               

file.fileno and file.isatty() should be implementable by any fil   11 days
       http://bugs.python.org/issue1126    jafo                     
                                                                               

Reference Manual: "for statement" links to "break statement"       10 days
       http://bugs.python.org/issue1131    georg.brandl             
                                                                               

re.sub returns str when processing empty unicode string             7 days
       http://bugs.python.org/issue1140    jafo                     
                                                                               

reading large files                                                 8 days
       http://bugs.python.org/issue1141    jafo                     
                                                                               

TypeError on join - httplib mixing str and bytes                    1 days
       http://bugs.python.org/issue1148    gvanrossum               
                                                                               

fdopen does not work as expected                                    6 days
       http://bugs.python.org/issue1149    jafo                     
                                                                               

Rename PyBUF_WRITEABLE to PyBUF_WRITABLE                            6 days
       http://bugs.python.org/issue1150    jafo                     
       patch                                                                   

help(pickle) fails: unorderable types: type() < type()              0 days
       http://bugs.python.org/issue1153    georg.brandl             
                                                                               

Carbon.CF memory leak                                               0 days
       http://bugs.python.org/issue1154    georg.brandl             
                                                                               

Suggested change to _exit function description in os module docu    2 days
       http://bugs.python.org/issue1156    jtonsing                 
                                                                               

Python doesn't compile on Microsoft Visual Studio 2008 "Orcas" B    1 days
       http://bugs.python.org/issue1162    georg.brandl             
                                                                               

tp_print slots don't release the GIL                                2 days
       http://bugs.python.org/issue1164    brett.cannon             
       patch                                                                   

NameError when calling malloc                                       0 days
       http://bugs.python.org/issue1166    loewis                   
                                                                               

complex arithmetic: strange results with "imag"                     0 days
       http://bugs.python.org/issue1168    georg.brandl             
                                                                               

Option -OO doesn't remove docstrings from functions                 3 days
       http://bugs.python.org/issue1169    georg.brandl             
       patch                                                                   

Documentation for done attribute of FieldStorage class              1 days
       http://bugs.python.org/issue1172    jafo                     
       patch                                                                   

yield expressions not documented in Language Reference              0 days
       http://bugs.python.org/issue1173    georg.brandl             
                                                                               

new generator methods not documented in Library Reference           0 days
       http://bugs.python.org/issue1174    georg.brandl             
                                                                               

.readline() has bug WRT nonblocking files                           1 days
       http://bugs.python.org/issue1175    gvanrossum               
                                                                               

urllib* 20x responses not OK?                                       0 days
       http://bugs.python.org/issue1177    facundobatista           
       patch                                                                   

time mod's timezone doesn't honor TZ var                         2114 days
       http://bugs.python.org/issue487331  brett.cannon             
                                                                               

asyncore file wrapper & os.error                                 1988 days
       http://bugs.python.org/issue539444  brett.cannon             
                                                                               

long file name support broken in windows                         1983 days
       http://bugs.python.org/issue542314  mhammond                 
                                                                               

urllib2 raises exception with non-200 success codes.             1193 days
       http://bugs.python.org/issue971965  georg.brandl             
                                                                               

class property fset not working                                   842 days
       http://bugs.python.org/issue1207379 georg.brandl             
                                                                               

Reading with bz2.BZ2File() returns one garbage character          306 days
       http://bugs.python.org/issue1597011 jafo                     
                                                                               

Decimal and long hash, compatibly and efficiently                  38 days
       http://bugs.python.org/issue1772851 facundobatista           
       patch                                                                   

ctypes on Solaris                                                  25 days
       http://bugs.python.org/issue1777530 theller                  
                                                                               



Top Issues Most Discussed (10)
______________________________

 12 platform system may be Windows or Microsoft since Vista           17 days
closed  http://bugs.python.org/issue1082   

  9 tp_print slots don't release the GIL                               2 days
closed  http://bugs.python.org/issue1164   

  9 os.getenv() not updated after external module uses C putenv()      6 days
open    http://bugs.python.org/issue1159   

  9 %f format for datetime objects                                     7 days
open    http://bugs.python.org/issue1158   

  7 urllib* 20x responses not OK?                                      0 days
closed  http://bugs.python.org/issue1177   

  6 Optimizations for cgi.FieldStorage methods                       399 days
open    http://bugs.python.org/issue1541463

  6 .readline() has bug WRT nonblocking files                          1 days
closed  http://bugs.python.org/issue1175   

  6 Allow str.join to join non-string types (as per PEP 3100)          8 days
open    http://bugs.python.org/issue1145   

  5 Decimal and long hash, compatibly and efficiently                 38 days
closed  http://bugs.python.org/issue1772851

  5 Documentation for done attribute of FieldStorage class             1 days
closed  http://bugs.python.org/issue1172   




From facundobatista at gmail.com  Wed Sep 19 22:41:49 2007
From: facundobatista at gmail.com (Facundo Batista)
Date: Wed, 19 Sep 2007 17:41:49 -0300
Subject: [Python-Dev] Python tickets summary
In-Reply-To: <e04bdf310709101325o1315adb0pebcb2f33d8313f35@mail.gmail.com>
References: <e04bdf310709101325o1315adb0pebcb2f33d8313f35@mail.gmail.com>
Message-ID: <e04bdf310709191341l289c034fjb4bf6d6b88ae4ef9@mail.gmail.com>

2007/9/10, Facundo Batista <facundobatista at gmail.com>:

> I modified my tool, which makes a summary of all the Python tickets
> (I switched the source the info is taken from, from SF to our Roundup).

Based on an idea from Dennis Benzinger, the temporal bars now show the
moments when each comment was made, so it's easy to see the "rhythm"
of the ticket activity:

  http://www.taniquetil.com.ar/facundo/py_tickets.html

Regards,

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From ksankar at doubleclix.net  Wed Sep 19 23:00:48 2007
From: ksankar at doubleclix.net (Krishna Sankar)
Date: Wed, 19 Sep 2007 14:00:48 -0700
Subject: [Python-Dev] Exploration PEP : Concurrency for moderately
 massive (4 to 32 cores) multi-core architectures
In-Reply-To: <46F11885.7000406@shrogers.com>
References: <46F01CC5.3000603@doubleclix.net> <46F11885.7000406@shrogers.com>
Message-ID: <46F18E00.40700@doubleclix.net>

Steve,

    Thanks.
    a)   Yep, SMP for now. Agreed on the need for asymmetric 
architectures like the Cell processor. We need to start somewhere and 
can then extend to more exotic realms.
    b)   Yep, we need to scale to an arbitrary number of cores. But as a 
start, I wanted to differentiate this from massive parallelism.
    c)   Yep, we can have message passing semantics at the interface 
level and then share the memory underneath (even optimize with the 
copy-on-write pattern). I was thinking that we would need to cross 
process space; for example, federate 8 separate py processes (on an 
8 core machine) and have a shared data path between them, based on 
shared memory allocated at configuration time.
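
A rough sketch of what message passing semantics at the interface level could
look like. This is not part of the proposal: worker and run_federation are
hypothetical names, and threads stand in for the separate py processes only
so that the example is self-contained. Under the GIL this gives no real
parallelism; it illustrates the share-nothing interface, not the performance.

```python
import queue
import threading


def worker(inbox, outbox):
    """Share-nothing worker: all state is local, all I/O is messages."""
    while True:
        msg = inbox.get()
        if msg is None:                  # sentinel: no more work
            break
        task_id, n = msg                 # small message in
        total = sum(i * i for i in range(n))   # big computation
        outbox.put((task_id, total))     # small message out


def run_federation(tasks, cores=8):
    inboxes = [queue.Queue() for _ in range(cores)]
    outbox = queue.Queue()
    workers = [threading.Thread(target=worker, args=(inbox, outbox))
               for inbox in inboxes]
    for w in workers:
        w.start()
    # Round-robin the tasks over the workers' private inboxes.
    for task_id, n in enumerate(tasks):
        inboxes[task_id % cores].put((task_id, n))
    for inbox in inboxes:
        inbox.put(None)
    for w in workers:
        w.join()
    # All workers have finished, so the outbox is complete and stable.
    results = {}
    while not outbox.empty():
        task_id, total = outbox.get()
        results[task_id] = total
    return results
```

Because callers only ever see queues, the transport underneath could later
be swapped for shared memory between processes without changing the
interface, which is the separation point c) argues for.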

Cheers
<k/>
Steven H. Rogers wrote:
> Krishna Sankar wrote:
>   
>> Folks,
>>    As a follow-up to the py3k discussions started by Bruce and Guido, I 
>> pinged Brett and he suggested I submit an exploratory proposal. Would 
>> appreciate insights, wisdom, the good, the bad and the ugly.
>>
>> A)    Does it make sense ?
>> B)    Which application sets should we consider in designing the 
>> interfaces and implementations
>> C)    In this proposal, parallelism and concurrency are used in an 
>> interchangeable fashion. Thoughts ?
>> D)    Please suggest pertinent links, discussions and insights.
>> E)    I have kept the proposal to a minimum to start the discussions and 
>> to explore if this is the right thing to do. Collaboratively, as we 
>> zero-in on one or two approaches, the idea is to expand it to a crisp 
>> and clear PEP. Need to do some more formatting as well.
>> Cheers
>> <k/>
>> P.S : I had sent this to python-ideas couple of days ago and received 
>> two comments (Thanks Leonardo, Thanks Adam) I haven't incorporated their 
>> comments yet. Folks who are on both lists, pardon me for the spam.
>>     
> # Proto-PEP elided.
>
> Other than number of cores, you don't mention hardware architecture.  I 
> presume that you're thinking of symmetric multiprocessor architectures.  
> If so, this should be explicit.  You should also consider that SMP may 
> not be the predominant multi-core architecture in the future; the Cell 
> processor has one general purpose processor and eight more specialized 
> processors.  You might not want to limit the PEP to 32 cores, I know of 
> startups working on 40 and 64 core chips.
>
> Shared memory may be necessary for good performance, but it doesn't have 
> to be exposed at the language level.  While Erlang has strictly message 
> passing semantics, I believe that it uses shared memory in the low level 
> implementation.
>
> # Steve


From rrr at ronadam.com  Thu Sep 20 01:55:41 2007
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 19 Sep 2007 18:55:41 -0500
Subject: [Python-Dev] Python tickets summary
In-Reply-To: <e04bdf310709191341l289c034fjb4bf6d6b88ae4ef9@mail.gmail.com>
References: <e04bdf310709101325o1315adb0pebcb2f33d8313f35@mail.gmail.com>
	<e04bdf310709191341l289c034fjb4bf6d6b88ae4ef9@mail.gmail.com>
Message-ID: <46F1B6FD.70007@ronadam.com>



Facundo Batista wrote:
> 2007/9/10, Facundo Batista <facundobatista at gmail.com>:
> 
>> I modified my tool, which makes a summary of all the Python tickets
>> (I switched the source the info is taken from, from SF to our Roundup).
> 
> Based on an idea from Dennis Benzinger, the temporal bars now show the
> moments when each comment was made, so it's easy to see the "rhythm"
> of the ticket activity:
> 
>   http://www.taniquetil.com.ar/facundo/py_tickets.html

Looks good. :-)

I noticed that there is a background of light blue between marks.  That is 
hard to see on my computer because it is so close to the grey tone.

Also shouldn't the light blue background bar extend all the way to the end 
for all open items?

Cheers,
   Ron

From janssen at parc.com  Thu Sep 20 19:29:30 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 20 Sep 2007 10:29:30 PDT
Subject: [Python-Dev] SSL module backport package ready for more testing
Message-ID: <07Sep20.102937pdt."57996"@synergy1.parc.xerox.com>

I've posted an sdist version of the 'ssl' module for Pythons 2.3.5 to
2.5.x, at http://www.parc.com/janssen/transient/ssl-1.3.tar.gz.  I think
this is 'gold master', but before I upload it to the Cheeseshop, I'd like
to get more testing on a broader variety of platforms.

The intent of this package is to allow code development with older versions
of Python that will continue to work on Python 2.6 and 3.x.

To build,

   python setup.py build

To test,

   python setup.py test

I'd appreciate feedback on testing results; please send to
janssen at parc.com.

Thanks!

Bill

From alex.neundorf at kitware.com  Thu Sep 20 22:30:36 2007
From: alex.neundorf at kitware.com (Alexander Neundorf)
Date: Thu, 20 Sep 2007 16:30:36 -0400
Subject: [Python-Dev] Building Python with CMake
In-Reply-To: <200708301628.57127.alex.neundorf@kitware.com>
References: <200707131359.17030.alex.neundorf@kitware.com>
	<f78mad$876$1@sea.gmane.org>
	<200708301628.57127.alex.neundorf@kitware.com>
Message-ID: <200709201630.36733.alex.neundorf@kitware.com>

Hi,

On Thursday 30 August 2007 16:28, Alexander Neundorf wrote:
...
> The cmake files for building python are now in a cvs repository:
> http://www.cmake.org/cgi-bin/viewcvs.cgi/Utilities/CMakeBuildForPython/?root=ParaView3
>
> This is inside the ParaView3 repository:
> http://www.paraview.org/New/download.html
>
> I used them today to build Python from svn trunk.
>
> I'll add some documentation how to use them, how to get them and what works
> and what doesn't work tomorrow.

Ok, it took a bit longer.
The wiki page is here:
http://paraview.org/ParaView3/index.php/BuildingPythonWithCMake

With the cmake files from cvs you can build Python svn, which will become 
Python 2.6.
I use it for Linux, IBM BlueGene/L and Cray XT3 (in both cases for the 
compute nodes, not the front end nodes).

It works also for Windows, but I didn't take the time to check that all the 
configure checks deliver the correct results, so I just reused the premade 
pyconfig.h there.

Most modules are built now. For every module you can select whether to build 
it statically or dynamically or not at all. Source and binary packages can be 
created using "make packages".

These files don't conflict with any files in Python svn, so if somebody is 
interested, adding them to Python svn shouldn't cause any problems.

Bye
Alex

P.S. due to moving I'll be mainly offline in the next weeks


From steven.bethard at gmail.com  Thu Sep 20 22:58:54 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu, 20 Sep 2007 14:58:54 -0600
Subject: [Python-Dev] Building Python with CMake
In-Reply-To: <200709201630.36733.alex.neundorf@kitware.com>
References: <200707131359.17030.alex.neundorf@kitware.com>
	<f78mad$876$1@sea.gmane.org>
	<200708301628.57127.alex.neundorf@kitware.com>
	<200709201630.36733.alex.neundorf@kitware.com>
Message-ID: <d11dcfba0709201358v7499845er448cef97bf73a33e@mail.gmail.com>

On 9/20/07, Alexander Neundorf <alex.neundorf at kitware.com> wrote:
> On Thursday 30 August 2007 16:28, Alexander Neundorf wrote:
> ...
> > The cmake files for building python are now in a cvs repository:
> > http://www.cmake.org/cgi-bin/viewcvs.cgi/Utilities/CMakeBuildForPython/?root=ParaView3

Thanks for your work on this!  That page seems to require a login.
Any chance you could post it to something like::

    http://wiki.python.org/moin/BuildingPythonWithCMake

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From alex.neundorf at kitware.com  Thu Sep 20 23:08:35 2007
From: alex.neundorf at kitware.com (Alexander Neundorf)
Date: Thu, 20 Sep 2007 17:08:35 -0400
Subject: [Python-Dev] Building Python with CMake
In-Reply-To: <d11dcfba0709201358v7499845er448cef97bf73a33e@mail.gmail.com>
References: <200707131359.17030.alex.neundorf@kitware.com>
	<200709201630.36733.alex.neundorf@kitware.com>
	<d11dcfba0709201358v7499845er448cef97bf73a33e@mail.gmail.com>
Message-ID: <200709201708.36006.alex.neundorf@kitware.com>

On Thursday 20 September 2007 16:58, Steven Bethard wrote:
> On 9/20/07, Alexander Neundorf <alex.neundorf at kitware.com> wrote:
> > On Thursday 30 August 2007 16:28, Alexander Neundorf wrote:
> > ...
> >
> > > The cmake files for building python are now in a cvs repository:
> > > http://www.cmake.org/cgi-bin/viewcvs.cgi/Utilities/CMakeBuildForPython/?root=ParaView3
>
> Thanks for your work on this!  That page seems to require a login.
> Any chance you could post it to something like::
>
>     http://wiki.python.org/moin/BuildingPythonWithCMake

I guess I need a login there too, so I put it somewhere where I already have 
one:
http://www.cmake.org/Wiki/BuildingPythonWithCMake

Alex

From rrr at ronadam.com  Fri Sep 21 00:03:19 2007
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 20 Sep 2007 17:03:19 -0500
Subject: [Python-Dev] Better unittest  failures
Message-ID: <46F2EE27.7020507@ronadam.com>



The value of unittest tests is not in how well they pass, but in how well 
they fail.

While looking at possibly helping with the str_uni branch when that was 
going on I found that in some cases unittest failure results can take a 
little bit (or a lot) of work to figure out just what was failing, where 
and why.

While helping Eric test the new format function and class I came up with a 
partial solution which may be a basis for further improvements.  Eric told 
me it did help quite a bit.  So I think it's worth looking into.

Since we were running over a hundred different options over several 
different implementations to make sure they all passed and failed in the 
same way, we were using data based test cases so we could easily test the 
same data with each version.  Unfortunately that has the drawback that the 
traceback doesn't show what data was used when testing exceptions.

Additionally, when something did fail it was not always obvious what was 
failing and why.


One of the conclusions I came to is it would be better if tests did not 
raise standard python exceptions unless the test itself has a problem.  By 
having tests raise special *Test_Only* exceptions, it can make the output 
of the test very much clearer.

Here are the added Test_Only Exceptions.  These would only be in the 
unittest module to catch the following situations.

      Wrong_Result_Returned
      Unexpected_Exception_Raised
      No_Exception_Raised
      Wrong_Exception_Raised

And two new functions that use them.

      assertTestReturns(expect, test, message)
      assertTestRaises(expect, test, message)


These additions would not affect any existing tests.  Using them requires 
the code under test to be wrapped in a function with no arguments.  The 
format is the same for both assertTestReturns and assertTestRaises.

      for data in testdata:
         expect, a, b, c = data
         def test():
             return foo(a, b, c)
         assertTestReturns(expect, test, repr(data))



Replacing all existing tests with this form isn't reasonable but adding 
this as an option for those who want to use it is very easy to do.
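
A minimal sketch of how the two methods could look.  The attachment was 
scrubbed from the archive, so this is a guess reconstructed from the 
tracebacks further down, not the attached code: the base class 
TestOnlyFailure and the class name DataTestCase are made up for 
illustration, and the spelling is present-day Python 3.  The one point it 
relies on is that unittest reports anything derived from AssertionError as 
a "fail" rather than an "error".

```python
import unittest


class TestOnlyFailure(AssertionError):
    """Hypothetical common base for the four test-only exceptions.

    Deriving from AssertionError makes unittest report them as FAIL
    rather than ERROR.
    """

    def __init__(self, value, ref=""):
        AssertionError.__init__(self, value, ref)
        self.value = value  # the wrong result, or the exception that was caught
        self.ref = ref      # free-form text identifying the test data

    def __str__(self):
        return "%r\n\nReference:\n%s" % (self.value, self.ref)


class Wrong_Result_Returned(TestOnlyFailure):
    pass


class Unexpected_Exception_Raised(TestOnlyFailure):
    pass


class No_Exception_Raised(TestOnlyFailure):
    pass


class Wrong_Exception_Raised(TestOnlyFailure):
    pass


class DataTestCase(unittest.TestCase):
    """Hypothetical TestCase subclass carrying the two new assert methods."""

    def assertTestReturns(self, expect, test, ref=""):
        # 'test' is a function of no arguments wrapping the code under test.
        try:
            result = test()
        except Exception as exc:
            raise Unexpected_Exception_Raised(exc, ref)
        if result != expect:
            raise Wrong_Result_Returned(result, ref)

    def assertTestRaises(self, expect, test, ref=""):
        try:
            result = test()
        except expect:
            return  # the expected exception was raised: the test passes
        except Exception as exc:
            raise Wrong_Exception_Raised(exc, ref)
        raise No_Exception_Raised(result, ref)
```

Passing repr(data) as the last argument is what puts the "Reference:" 
line into the failure report.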

The test file I used to generate the following output is attached.


Cheers,
    Ron




###
#
#  Test output using standard assertEquals and assertRaises.
#

   * The data has the form [(ref#, expect, args, kwds), ...]

   * The ref# is there to help find the failing test in situations where
you may have dozens of almost identical data items.  It's not required but 
helpful to have.

   * I didn't include actual bad testcase tests in these examples, but if 
some generated exceptions similar to those of the failing tests, I think 
it could add a bit more confusion to the situation than the not too 
confusing example here.



$ python ut_test.py
EEFFFFFF
======================================================================
ERROR: test_A (__main__.test1_normal_failures)
----------------------------------------------------------------------
Traceback (most recent call last):
   File "ut_test.py", line 100, in test_A
     result = some_function(*args, **kwds)
   File "ut_test.py", line 62, in some_function
     baz = kwds['baz']
KeyError: 'baz'

#
#  This fails as a test "error" instead of a test "fail".
#  What were args and kwds here?
#

======================================================================
ERROR: test_B (__main__.test1_normal_failures)
----------------------------------------------------------------------
Traceback (most recent call last):
   File "ut_test.py", line 108, in test_B
     self.assertRaises(expect, test, args, kwds)
   File "unittest.py", line 320, in failUnlessRaises
     callableObj(*args, **kwargs)
   File "ut_test.py", line 107, in test
     return some_function(*args, **kwds)
   File "ut_test.py", line 62, in some_function
     baz = kwds['baz']
KeyError: 'baz'

#
#  Same as above.  Fails as a test "error", unknown argument
#  values for some_function().
#


======================================================================
FAIL: test_C (__main__.test1_normal_failures)
----------------------------------------------------------------------
Traceback (most recent call last):
   File "ut_test.py", line 114, in test_C
     self.assertRaises(expect, test, args, kwds)
AssertionError: KeyError not raised

#
#  What were the args and kwds values?
#


======================================================================
FAIL: test_D (__main__.test1_normal_failures)
----------------------------------------------------------------------
Traceback (most recent call last):
   File "ut_test.py", line 120, in test_D
     repr((n, expect, args, kwds)))
AssertionError: (8, ('Total baz:', 4), [1, 2], {'baz': 'Total baz:'})

#
#  This one is ok.
#





###
#
#   Test output using the added methods and test only exceptions with
#   the same test data.
#

    * Test errors only occur on actual test "errors".

    * The reason for the fail is explained in all cases for test "fails".

    * The only time you get an actual python exception is when the test
itself has a problem.  Otherwise you get a test_exception that
refers to the exception in the actual code.


======================================================================
FAIL: test_A (__main__.test2_new_failures)
----------------------------------------------------------------------
Traceback (most recent call last):
   File "ut_test.py", line 131, in test_A
     repr((n, expect, args, kwds)))
   File "ut_test.py", line 36, in assertTestReturns
     result = test()
   File "ut_test.py", line 129, in test
     return some_function(*args, **kwds)
   File "ut_test.py", line 62, in some_function
     baz = kwds['baz']
Unexpected_Exception_Raised: KeyError('baz',)

Reference:
(2, ('Total baz:', 3), [1, 2], {'raz': 'Total baz:'})

======================================================================
FAIL: test_B (__main__.test2_new_failures)
----------------------------------------------------------------------
Traceback (most recent call last):
   File "ut_test.py", line 138, in test_B
     repr((n, expect, args, kwds)))
   File "ut_test.py", line 45, in assertTestRaises
     result = test()
   File "ut_test.py", line 136, in test
     return some_function(*args, **kwds)
   File "ut_test.py", line 62, in some_function
     baz = kwds['baz']
Wrong_Exception_Raised: KeyError('baz',)

Reference:
(4, <type 'exceptions.IndexError'>, [1, 2], {'raz': 'Total baz:'})

======================================================================
FAIL: test_C (__main__.test2_new_failures)
----------------------------------------------------------------------
Traceback (most recent call last):
   File "ut_test.py", line 145, in test_C
     repr((n, expect, args, kwds)))
   File "ut_test.py", line 52, in assertTestRaises
     raise self.No_Exception_Raised(result, ref)
No_Exception_Raised: returned -> ('Total baz:', 3)

Reference:
(6, <type 'exceptions.KeyError'>, [1, 2], {'baz': 'Total baz:'})

======================================================================
FAIL: test_D (__main__.test2_new_failures)
----------------------------------------------------------------------
Traceback (most recent call last):
   File "ut_test.py", line 152, in test_D
     repr((n, expect, args, kwds)))
   File "ut_test.py", line 41, in assertTestReturns
     raise self.Wrong_Result_Returned(result, ref)
Wrong_Result_Returned: ('Total baz:', 3)

Reference:
(8, ('Total baz:', 4), [1, 2], {'baz': 'Total baz:'})

----------------------------------------------------------------------
Ran 8 tests in 0.004s

FAILED (failures=6, errors=2)



-------------- next part --------------
A non-text attachment was scrubbed...
Name: ut_test.py
Type: text/x-python
Size: 4850 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20070920/134d74ee/attachment.py 

From status at bugs.python.org  Fri Sep 21 19:37:10 2007
From: status at bugs.python.org (Tracker)
Date: Fri, 21 Sep 2007 17:37:10 +0000 (UTC)
Subject: [Python-Dev] Summary of Tracker Issues
Message-ID: <20070921173710.78B38782C1@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (09/14/07 - 09/21/07)
Tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 1269 open (+12) / 11400 closed (+11) / 12669 total (+23)

Open issues with patches:   412

Average duration of open issues: 678 days.
Median duration of open issues: 655 days.

Open Issues Breakdown
   open  1264 (+12)
pending     5 ( +0)

Issues Created Or Reopened (23)
_______________________________

tp_print slots don't release the GIL                             09/14/07
CLOSED http://bugs.python.org/issue1164    created  arigo                    
       patch                                                                   

Should itertools.count work for arbitrary integers?              09/15/07
       http://bugs.python.org/issue1165    created  eopadoan                 
       py3k                                                                    

NameError when calling malloc                                    09/15/07
CLOSED http://bugs.python.org/issue1166    created  esr                      
                                                                               

gdbm/ndbm 1.8.1+ needs libgdbm_compat.so                         09/16/07
       http://bugs.python.org/issue1167    created  ikelly                   
       patch                                                                   

complex arithmetic: strange results with "imag"                  09/16/07
CLOSED http://bugs.python.org/issue1168    created  newman                   
                                                                               

Option -OO doesn't remove docstrings from functions              09/16/07
CLOSED http://bugs.python.org/issue1169    created  piro                     
       patch                                                                   

shlex have problems with parsing unicode                         09/17/07
       http://bugs.python.org/issue1170    created  dexen                    
                                                                               

allow subclassing of bytes type                                  09/17/07
       http://bugs.python.org/issue1171    created  mfenniak                 
       py3k, patch                                                             

Documentation for done attribute of FieldStorage class           09/17/07
CLOSED http://bugs.python.org/issue1172    created  bkline                   
       patch                                                                   

yield expressions not documented in Language Reference           09/17/07
CLOSED http://bugs.python.org/issue1173    created  dangyogi                 
                                                                               

new generator methods not documented in Library Reference        09/17/07
CLOSED http://bugs.python.org/issue1174    created  dangyogi                 
                                                                               

.readline() has bug WRT nonblocking files                        09/18/07
CLOSED http://bugs.python.org/issue1175    created  ajb                      
                                                                               

str.split() takes no keyword arguments (Should this be expected? 09/18/07
CLOSED http://bugs.python.org/issue1176    created  sergioc                  
                                                                               

urllib* 20x responses not OK?                                    09/19/07
CLOSED http://bugs.python.org/issue1177    reopened jafo                     
       patch                                                                   

IDLE - add "paste code" functionality                            09/18/07
       http://bugs.python.org/issue1178    created  taleinat                 
       patch                                                                   

[CVE-2007-4965] Integer overflow in imageop module               09/19/07
       http://bugs.python.org/issue1179    created  cartman                  
                                                                               

Option to ignore or substitute ~/.pydistutils.cfg                09/19/07
       http://bugs.python.org/issue1180    created  hoffman                  
                                                                               

Redefine clear() for os.environ to use unsetenv() if possible    09/19/07
CLOSED http://bugs.python.org/issue1181    created  horcicka                 
       patch                                                                   

Paticular decimal mod operation wrongly output NaN.              09/20/07
       http://bugs.python.org/issue1182    created  ocean-city               
                                                                               

race in SocketServer.ForkingMixIn.collect_children               09/20/07
       http://bugs.python.org/issue1183    created  dripton                  
       patch                                                                   

test fixes for immutable bytes change                            09/20/07
       http://bugs.python.org/issue1184    created  hupp                     
       py3k, patch                                                             

py3k: Completely remove nb_coerce slot                           09/20/07
       http://bugs.python.org/issue1185    created  amaury.forgeotdarc       
       patch                                                                   

optparse documentation: -- being collapsed to - in HTML          09/21/07
       http://bugs.python.org/issue1186    created  hoffman                  
                                                                               



Issues Now Closed (29)
______________________

cgi:  parse_qs and parse_qsl misbehave on empty strings            24 days
       http://bugs.python.org/issue1014    gvanrossum               
                                                                               

platform system may be Windows or Microsoft since Vista            17 days
       http://bugs.python.org/issue1082    p.lavarre at ieee.org       
       patch                                                                   

split(None, maxsplit) does not strip whitespace correctly          12 days
       http://bugs.python.org/issue1123    nirs                     
                                                                               

file.fileno and file.isatty() should be implementable by any fil   11 days
       http://bugs.python.org/issue1126    jafo                     
                                                                               

Reference Manual: "for statement" links to "break statement"       10 days
       http://bugs.python.org/issue1131    georg.brandl             
                                                                               

re.sub returns str when processing empty unicode string             7 days
       http://bugs.python.org/issue1140    jafo                     
                                                                               

reading large files                                                 8 days
       http://bugs.python.org/issue1141    jafo                     
                                                                               

fdopen does not work as expected                                    6 days
       http://bugs.python.org/issue1149    jafo                     
                                                                               

Rename PyBUF_WRITEABLE to PyBUF_WRITABLE                            6 days
       http://bugs.python.org/issue1150    jafo                     
       patch                                                                   

Suggested change to _exit function description in os module docu    2 days
       http://bugs.python.org/issue1156    jtonsing                 
                                                                               

tp_print slots don't release the GIL                                2 days
       http://bugs.python.org/issue1164    brett.cannon             
       patch                                                                   

NameError when calling malloc                                       0 days
       http://bugs.python.org/issue1166    loewis                   
                                                                               

complex arithmetic: strange results with "imag"                     0 days
       http://bugs.python.org/issue1168    georg.brandl             
                                                                               

Option -OO doesn't remove docstrings from functions                 3 days
       http://bugs.python.org/issue1169    georg.brandl             
       patch                                                                   

Documentation for done attribute of FieldStorage class              1 days
       http://bugs.python.org/issue1172    jafo                     
       patch                                                                   

yield expressions not documented in Language Reference              0 days
       http://bugs.python.org/issue1173    georg.brandl             
                                                                               

new generator methods not documented in Library Reference           0 days
       http://bugs.python.org/issue1174    georg.brandl             
                                                                               

.readline() has bug WRT nonblocking files                           1 days
       http://bugs.python.org/issue1175    gvanrossum               
                                                                               

str.split() takes no keyword arguments (Should this be expected?    2 days
       http://bugs.python.org/issue1176    sergioc                  
                                                                               

urllib* 20x responses not OK?                                       0 days
       http://bugs.python.org/issue1177    facundobatista           
       patch                                                                   

Redefine clear() for os.environ to use unsetenv() if possible       1 days
       http://bugs.python.org/issue1181    georg.brandl             
       patch                                                                   

Need Windows os.link() support                                   2146 days
       http://bugs.python.org/issue478407  jafo                     
       patch                                                                   

long file name support broken in windows                         1983 days
       http://bugs.python.org/issue542314  mhammond                 
                                                                               

urllib2 raises exception with non-200 success codes.             1193 days
       http://bugs.python.org/issue971965  georg.brandl             
                                                                               

Optimizations for cgi.FieldStorage methods                        400 days
       http://bugs.python.org/issue1541463 georg.brandl             
       patch                                                                   

Reading with bz2.BZ2File() returns one garbage character          306 days
       http://bugs.python.org/issue1597011 jafo                     
                                                                               

UnicodeError in compileall if "make install" is run before "make  154 days
       http://bugs.python.org/issue1704287 jafo                     
       patch                                                                   

Decimal and long hash, compatibly and efficiently                  38 days
       http://bugs.python.org/issue1772851 facundobatista           
       patch                                                                   

ctypes on Solaris                                                  25 days
       http://bugs.python.org/issue1777530 theller                  
                                                                               



Top Issues Most Discussed (10)
______________________________

 12 platform system may be Windows or Microsoft since Vista           17 days
closed  http://bugs.python.org/issue1082   

  9 [CVE-2007-4965] Integer overflow in imageop module                 3 days
open    http://bugs.python.org/issue1179   

  9 tp_print slots don't release the GIL                               2 days
closed  http://bugs.python.org/issue1164   

  7 Optimizations for cgi.FieldStorage methods                       400 days
closed  http://bugs.python.org/issue1541463

  7 urllib* 20x responses not OK?                                      0 days
closed  http://bugs.python.org/issue1177   

  6 .readline() has bug WRT nonblocking files                          1 days
closed  http://bugs.python.org/issue1175   

  5 Documentation for done attribute of FieldStorage class             1 days
closed  http://bugs.python.org/issue1172   

  4 64/32-bit issue when unpickling random.Random                    115 days
open    http://bugs.python.org/issue1727780

  4 UnicodeError in compileall if "make install" is run before "mak  154 days
closed  http://bugs.python.org/issue1704287

  4 Redefine clear() for os.environ to use unsetenv() if possible      1 days
closed  http://bugs.python.org/issue1181   




From scav at blueyonder.co.uk  Wed Sep 26 10:42:28 2007
From: scav at blueyonder.co.uk (scav at blueyonder.co.uk)
Date: Wed, 26 Sep 2007 09:42:28 +0100 (BST)
Subject: [Python-Dev] Python 3.0a documentation
Message-ID: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82>

I'd like to help out cleaning up the Python3.0 documentation.  There are a
lot of little leftovers from 2.x that are no longer true. (mentions of
long, callable() etc.)

Ideally (especially in the tutorial), we should only refer to 3.0 features
and syntax, and keep the special cases and "other ways to do it" to a
minimum.

Before I dive in and start submitting patches, what does everyone else
think?  How much reference to previous python versions should be left in? 
Does it make sense to keep notes of the nature of "since version 2.3 ..."
when there is an intentional discontinuity at 3.0?

Peter Harris




From guido at python.org  Wed Sep 26 16:27:50 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 26 Sep 2007 07:27:50 -0700
Subject: [Python-Dev] Python 3.0a documentation
In-Reply-To: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82>
References: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82>
Message-ID: <ca471dc20709260727j647b0a51p12a9ac9df85c5fd7@mail.gmail.com>

I fully support removing all historic references from the 3.0 language
manual. Please do help out! You can just start putting patches ("svn
diff") into bugs.python.org; typically Georg gets to these very
quickly. Do use subversion, not the distributed tarball (which was out
of date by the time it was uploaded to python.org. :-).

--Guido

On 9/26/07, scav at blueyonder.co.uk <scav at blueyonder.co.uk> wrote:
> I'd like to help out cleaning up the Python3.0 documentation.  There are a
> lot of little leftovers from 2.x that are no longer true. (mentions of
> long, callable() etc.)
>
> Ideally (especially in the tutorial), we should only refer to 3.0 features
> and syntax, and keep the special cases and "other ways to do it" to a
> minimum.
>
> Before I dive in and start submitting patches, what does everyone else
> think?  How much reference to previous python versions should be left in?
> Does it make sense to keep notes of the nature of "since version 2.3 ..."
> when there is an intentional discontinuity at 3.0?
>
> Peter Harris


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From phd at phd.pp.ru  Wed Sep 26 17:24:08 2007
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Wed, 26 Sep 2007 19:24:08 +0400
Subject: [Python-Dev] Iterating over objects of unknown length
Message-ID: <20070926152408.GA24021@phd.pp.ru>

Hello!

   (This seems like a "developing with Python" question and partially it is
but please read on.)

   I have a class that represents SQL queries. Instances of the class can
be iterated over. As an SQL query doesn't know in advance if it will
produce any row the class doesn't implement __len__(). Moreover, users of
the class sometimes write

if sqlQuery:
   for row in sqlQuery: ...
else:
   # no rows

which is a bug (the query doesn't know if it's True or False; to find it
out the user has to execute the query by trying to iterate over it). To
prevent users from writing such code the class implements __nonzero__()
that always raises an exception.
   Unfortunately, I found some libraries test the object in boolean context
before iterating over it and that, of course, triggers the exception from
__nonzero__().
   Even worse, some libraries test the object in boolean context regardless
of iterating over it. For example, the logging module (this is where my
question becomes "developing for Python") triggers the exception in such a
simple case:

logging.debug("Query: %s", sqlQuery)

   Funny, the code

logging.debug("Query: %s, another: %s", sqlQuery, another_value)

   doesn't trigger the exception. This is due to the code in
logging/__init__.py:

        if args and (len(args) == 1) and args[0] and (type(args[0]) == types.DictType):
            args = args[0]

(class LogRecord, method __init__). "and args[0]" triggers the exception.

   My questions are:

1. Should I consider this a bug in the logging module (and other libraries)
   and submit patches?
2. Or should I stop raising exceptions in __nonzero__()?

   In this particular case with logging the fix is simple - do "and args[0]"
after the type check.
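
   To make the difference concrete, here is a small sketch in present-day
Python 3 spelling (__bool__ instead of __nonzero__). StrictQuery, unpack_old
and unpack_new are made-up names; the two functions only imitate the
condition quoted above and are not the actual logging code.

```python
class StrictQuery:
    """Stand-in for the SQL query class: iterable, but refuses truth tests."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __iter__(self):
        return iter(self._rows)

    def __bool__(self):
        raise TypeError("truth value of a query is undefined; iterate instead")

    __nonzero__ = __bool__  # the Python 2 name for the same hook


def unpack_old(args):
    # Same order as the quoted condition: args[0] is truth-tested first,
    # whatever its type.
    if args and (len(args) == 1) and args[0] and (type(args[0]) == dict):
        args = args[0]
    return args


def unpack_new(args):
    # Type check first, so only real dicts are ever truth-tested.
    if args and (len(args) == 1) and (type(args[0]) == dict) and args[0]:
        args = args[0]
    return args
```

   With a single StrictQuery argument unpack_old() hits the exception from
__bool__(), while unpack_new() returns the tuple unchanged; with two
arguments neither function touches args[0] at all.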

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From guido at python.org  Wed Sep 26 18:29:10 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 26 Sep 2007 09:29:10 -0700
Subject: [Python-Dev] Iterating over objects of unknown length
In-Reply-To: <20070926152408.GA24021@phd.pp.ru>
References: <20070926152408.GA24021@phd.pp.ru>
Message-ID: <ca471dc20709260929h281ed9fp3e0d8dd0df2d775a@mail.gmail.com>

The logging code looks archaic: IMO it should be:

  if args and len(args) == 1 and isinstance(args[0], dict) and args[0]:

But I also fail to see why you would be so draconian as to disallow
truth testing of a query altogether. Your query looks like an
iterator. There are tons of other iterators in the language, library
and 3rd party code, and it would be madness to try to fix all of them
in the way you suggest just because some users don't get the concept
of iterators.
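
For the "no rows" case itself, one way to handle it without truth-testing
the query is to pull the first row and chain it back in front of the
rest. This is only a sketch in present-day Python 3 spelling, and
rows_or_none is a made-up helper name, not anything from the standard
library:

```python
import itertools


def rows_or_none(query):
    """Return an iterator over all rows, or None if the query yields nothing."""
    it = iter(query)
    try:
        first = next(it)
    except StopIteration:
        return None
    # Put the row we peeked at back in front of the remaining rows.
    return itertools.chain([first], it)


rows = rows_or_none(iter([]))  # stand-in for an empty query
if rows is None:
    print("no rows")
else:
    for row in rows:
        print(row)
```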

So I'm for #1 *and* #2.

--Guido

On 9/26/07, Oleg Broytmann <phd at phd.pp.ru> wrote:
> Hello!
>
>    (This seems like a "developing with Python" question and partially it is
> but please read on.)
>
>    I have a class that represents SQL queries. Instances of the class can
> be iterated over. As an SQL query doesn't know in advance if it will
> produce any row the class doesn't implement __len__(). Moreover, users of
> the class sometimes write
>
> if sqlQuery:
>    for row in sqlQuery: ...
> else:
>    # no rows
>
> which is a bug (the query doesn't know if it's True or False; to find it
> out the user has to execute the query by trying to iterate over it). To
> prevent users from writing such code the class implements __nonzero__()
> that always raises an exception.
>    Unfortunately, I found some libraries test the object in boolean context
> before iterating over it and that, of course, triggers the exception from
> __nonzero__().
>    Even worse, some libraries test the object in boolean context regardless
> of iterating over it. For example, logging module (this is where my
> question becomes "developing for Python") triggers the exception in such a
> simple case:
>
> logging.debug("Query: %s", sqlQuery)
>
>    Funny, the code
>
> logging.debug("Query: %s, another: %s", sqlQuery, another_value)
>
>    doesn't trigger the exception. This is due to the code in
> logging/__init__.py:
>
>         if args and (len(args) == 1) and args[0] and (type(args[0]) == types.DictType):
>             args = args[0]
>
> (class LogRecord, method __init__). "and args[0]" triggers the exception.
>
>    My questions are:
>
> 1. Should I consider this a bug in the logging module (and other libraries)
>    and submit patches?
> 2. Or should I stop raising exceptions in __nonzero__()?
>
>    In this particular case with logging the fix is simple - do "and args[0]"
> after type check.
>
> Oleg.
> --
>      Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
>            Programmers don't die, they just GOSUB without RETURN.


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Wed Sep 26 18:33:33 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 26 Sep 2007 12:33:33 -0400
Subject: [Python-Dev] Iterating over objects of unknown length
In-Reply-To: <20070926152408.GA24021@phd.pp.ru>
References: <20070926152408.GA24021@phd.pp.ru>
Message-ID: <20070926163100.5A1303A4045@sparrow.telecommunity.com>

At 07:24 PM 9/26/2007 +0400, Oleg Broytmann wrote:
>Hello!
>
>    (This seems like a "developing with Python" question and partially it is
>but please read on.)
>
>    I have a class that represents SQL queries. Instances of the class can
>be iterated over. As an SQL query doesn't know in advance if it will
>produce any row the class doesn't implement __len__(). Moreover, users of
>the class sometimes write
>
>if sqlQuery:
>    for row in sqlQuery: ...
>else:
>    # no rows

This isn't consistent with iterators; e.g.:

 >>> x=iter([])
 >>> if x: print "yes"
...
yes

ISTM that you should be returning "True" from __nonzero__, since you 
don't implement len().


>1. Should I consider this a bug in the logging module (and other libraries)
>    and submit patches?
>2. Or should I stop raising exceptions in __nonzero__()?

#2 - Python objects should always be __nonzero__, unless they are 
empty containers, zeros, or otherwise specifically False.  It's 
reasonable for libraries to expect that truth-testing an object is always safe.
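
A minimal sketch of option #2, assuming a hypothetical Query class (not
the poster's actual code): truth testing always succeeds, and a caller
that cares about "no rows" finds out by iterating.

```python
class Query:
    """Hypothetical query object: iterable, always true, no __len__."""

    def __init__(self, rows):
        self._rows = rows  # stands in for rows fetched from a database

    def __iter__(self):
        return iter(self._rows)

    def __bool__(self):  # spelled __nonzero__ in Python 2
        return True


def describe(query):
    # Emptiness is discovered by iterating, not by truth testing.
    count = 0
    for count, _row in enumerate(query, 1):
        pass
    return "no rows" if count == 0 else "%d row(s)" % count
```

Libraries that do "if obj:" before using the object keep working, and
the empty case is still handled explicitly where it matters.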


From skip at pobox.com  Wed Sep 26 18:34:45 2007
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 26 Sep 2007 11:34:45 -0500
Subject: [Python-Dev] Python 3.0a documentation
In-Reply-To: <ca471dc20709260727j647b0a51p12a9ac9df85c5fd7@mail.gmail.com>
References: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82>
	<ca471dc20709260727j647b0a51p12a9ac9df85c5fd7@mail.gmail.com>
Message-ID: <18170.35365.126292.41202@montanaro.dyndns.org>


    Guido> I fully support removing all historic references from the 3.0
    Guido> language manual.

By historic I assume you mean references to 2.x modules, classes, functions,
etc which are no longer present.  One thing I would suggest is that the more
recent versionadded strings be kept.  At the very least, if something is
going to be new in 2.6, keep that.  Maybe also keep the 2.5 versionadded
references.  Older references can probably be deleted.

Skip

From guido at python.org  Wed Sep 26 18:37:17 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 26 Sep 2007 09:37:17 -0700
Subject: [Python-Dev] Python 3.0a documentation
In-Reply-To: <18170.35365.126292.41202@montanaro.dyndns.org>
References: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82>
	<ca471dc20709260727j647b0a51p12a9ac9df85c5fd7@mail.gmail.com>
	<18170.35365.126292.41202@montanaro.dyndns.org>
Message-ID: <ca471dc20709260937q6f0d6cb3ha74a978b620f6a26@mail.gmail.com>

On 9/26/07, skip at pobox.com <skip at pobox.com> wrote:
>
>     Guido> I fully support removing all historic references from the 3.0
>     Guido> language manual.
>
> By historic I assume you mean references to 2.x modules, classes, functions,
> etc which are no longer present.  One thing I would suggest is that the more
> recent versionadded strings be kept.  At the very least, if something is
> going to be new in 2.6, keep that.  Maybe also keep the 2.5 versionadded
> references.  Older references can probably be deleted.

In the 2.x docs, all versionadded strings should stay. But IMO in the
3.0 docs we should get rid of them all. If you want compatibility
information, look at the 2.6 docs (those should also mention things
that are changing in 3.0).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From phd at phd.pp.ru  Wed Sep 26 18:37:28 2007
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Wed, 26 Sep 2007 20:37:28 +0400
Subject: [Python-Dev] Iterating over objects of unknown length
In-Reply-To: <ca471dc20709260929h281ed9fp3e0d8dd0df2d775a@mail.gmail.com>
References: <20070926152408.GA24021@phd.pp.ru>
	<ca471dc20709260929h281ed9fp3e0d8dd0df2d775a@mail.gmail.com>
Message-ID: <20070926163728.GB26579@phd.pp.ru>

On Wed, Sep 26, 2007 at 09:29:10AM -0700, Guido van Rossum wrote:
> But I also fail to see why you would be so draconian as to disallow
> truth testing of a query altogether. Your query looks like an
> iterator. There are tons of other iterators in the language, library
> and 3rd party code, and it would be madness to try to fix all of them
> in the way you suggest just because some users don't get the concept
> of iterators.

   It seems I didn't get it myself:

On Wed, Sep 26, 2007 at 12:33:33PM -0400, Phillip J. Eby wrote:
> This isn't consistent with iterators; e.g.:
> 
> >>> x=iter([])
> >>> if x: print "yes"
> ...
> yes

On Wed, Sep 26, 2007 at 09:29:10AM -0700, Guido van Rossum wrote:
> So I'm for #1 *and* #2.

   I see now. Thank you!

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From martin at v.loewis.de  Wed Sep 26 20:14:09 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 26 Sep 2007 20:14:09 +0200
Subject: [Python-Dev] Python 3.0a documentation
In-Reply-To: <ca471dc20709260937q6f0d6cb3ha74a978b620f6a26@mail.gmail.com>
References: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82>	<ca471dc20709260727j647b0a51p12a9ac9df85c5fd7@mail.gmail.com>	<18170.35365.126292.41202@montanaro.dyndns.org>
	<ca471dc20709260937q6f0d6cb3ha74a978b620f6a26@mail.gmail.com>
Message-ID: <46FAA171.2040104@v.loewis.de>

> In the 2.x docs, all versionadded strings should stay. But IMO in the
> 3.0 docs we should get rid of them all. If you want compatibility
> information, look at the 2.6 docs (those should also mention things
> that are changing in 3.0).

I agree. People who target 3.x need to test anyway if they also want
to support some 2.x version (if that is possible at all), so it does
not help them to know what Python version introduced a certain feature
they use.

Regards,
Martin

From g.brandl at gmx.net  Wed Sep 26 20:31:04 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 26 Sep 2007 20:31:04 +0200
Subject: [Python-Dev] Python 3.0a documentation
In-Reply-To: <46FAA171.2040104@v.loewis.de>
References: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82>	<ca471dc20709260727j647b0a51p12a9ac9df85c5fd7@mail.gmail.com>	<18170.35365.126292.41202@montanaro.dyndns.org>	<ca471dc20709260937q6f0d6cb3ha74a978b620f6a26@mail.gmail.com>
	<46FAA171.2040104@v.loewis.de>
Message-ID: <fde8h5$tc$1@sea.gmane.org>

Martin v. Löwis wrote:
>> In the 2.x docs, all versionadded strings should stay. But IMO in the
>> 3.0 docs we should get rid of them all. If you want compatibility
>> information, look at the 2.6 docs (those should also mention things
>> that are changing in 3.0).
> 
> I agree. People who target 3.x need to test anyway if they also want
> to support some 2.x version (if that is possible at all), so it does
> not help them to know what Python version introduced a certain feature
> they use.

Also, it has already been done, and would be painful to undo :)

Georg


-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From dinov at exchange.microsoft.com  Wed Sep 26 22:23:58 2007
From: dinov at exchange.microsoft.com (Dino Viehland)
Date: Wed, 26 Sep 2007 13:23:58 -0700
Subject: [Python-Dev] New lines, carriage returns, and Windows
Message-ID: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>

We ran into an interesting user-reported issue w/ IronPython and the way Python writes to files and I thought I'd get python-dev's opinion.

When writing a string in text mode that contains \r\n we both write \r\r\n because the default write mode is to replace \n with \r\n.  This works great as long as you stay within an entirely Python world.  Because Python uses \n for everything internally you'll never end up writing out a \r\n that gets transformed into a \r\r\n.  But when interoperating with other native code (or .NET code in our case) it's fairly easy to be exposed to a string which contains \r\n.  Ultimately we see odd behavior when round tripping the contents of a multi-line text box through a file.

So today users have to be aware of the fact that Python internally always uses \n.  They also need to be aware of any APIs that they call that might return a string with an embedded \r\n inside of them and transform the string back into the Python version.

It could be argued that there's little value in doing the simple transformation from \r\n -> \r\r\n.  Ultimately that creates a file that has line endings which aren't good on any platform.  On the other hand it could also be argued that Python defines new-lines as \n and there should be no deviation from that.  And doing so could be considered a slippery slope, first file deals with it, and next the standard libraries, etc...  Finally this might break some apps and if we changed IronPython to behave differently we could introduce incompatibilities which we don't want.

So I'm curious: Is there a reason this behavior is useful that I'm missing?  Would there be a possibility (or objections to) making \r\n be untransformed in the Py3k timeframe?  Or should we just tell our users to open files in binary mode? :)

From fuzzyman at voidspace.org.uk  Wed Sep 26 22:42:09 2007
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 26 Sep 2007 21:42:09 +0100
Subject: [Python-Dev] [python]  New lines, carriage returns, and Windows
In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
Message-ID: <46FAC421.3020809@voidspace.org.uk>

Dino Viehland wrote:
> We ran into an interesting user-reported issue w/ IronPython and the way Python writes to files and I thought I'd get python-dev's opinion.
>
> When writing a string in text mode that contains \r\n we both write \r\r\n because the default write mode is to replace \n with \r\n.  This works great as long as you stay within an entirely Python world.  Because Python uses \n for everything internally you'll never end up writing out a \r\n that gets transformed into a \r\r\n.  But when interoperating with other native code (or .NET code in our case) it's fairly easy to be exposed to a string which contains \r\n.  Ultimately we see odd behavior when round tripping the contents of a multi-line text box through a file.
>
> So today users have to be aware of the fact that Python internally always uses \n.  They also need to be aware of any APIs that they call that might return a string with an embedded \r\n inside of them and transform the string back into the Python version.
>
> It could be argued that there's little value in doing the simple transformation from \r\n -> \r\r\n.  Ultimately that creates a file that has line endings which aren't good on any platform.  On the other hand it could also be argued that Python defines new-lines as \n and there should be no deviation from that.  And doing so could be considered a slippery slope, first file deals with it, and next the standard libraries, etc...  Finally this might break some apps and if we changed IronPython to behave differently we could introduce incompatibilities which we don't want.
>
> So I'm curious: Is there a reason this behavior is useful that I'm missing?  Would there be a possibility (or objections to) making \r\n be untransformed in the Py3k timeframe?  Or should we just tell our users to open files in binary mode? :)
>   

It is normal when working with Windows interaction in the Python world 
to be aware that you might receive strings with '\r\n' in and do the 
conversion yourself.

We come across this a great deal with Resolver (when working with multi 
line text boxes for example) and quite happily replace '\r\n' with '\n' 
and vice versa as needed. As a developer who uses both Python and 
IronPython I say that this isn't a problem. I may be wrong or outvoted 
of course...

Michael
http://www.manning.com/foord




From martin at v.loewis.de  Thu Sep 27 00:00:44 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 27 Sep 2007 00:00:44 +0200
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
Message-ID: <46FAD68C.5030702@v.loewis.de>

> This works great as long as you stay within an entirely Python world.
> Because Python uses \n for everything internally

I think you misunderstand fairly significantly how this all works
together. Python does not use \n "for everything internally". Python
is well capable of representing \r separately, and does so if you
ask it to.

> So I'm curious: Is there a reason this behavior is useful that I'm
> missing? 

I think you are missing how it works in the first place (or else
you failed to communicate to me what precise behavior you find
puzzling).

Regards,
Martin


From guido at python.org  Thu Sep 27 00:04:26 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 26 Sep 2007 15:04:26 -0700
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
Message-ID: <ca471dc20709261504k13e5fddfp6a2d253b0776381c@mail.gmail.com>

On 9/26/07, Dino Viehland <dinov at exchange.microsoft.com> wrote:
> We ran into an interesting user-reported issue w/ IronPython and the way Python writes to files and I thought I'd get python-dev's opinion.
>
> When writing a string in text mode that contains \r\n we both write \r\r\n because the default write mode is to replace \n with \r\n.  This works great as long as you stay within an entirely Python world.  Because Python uses \n for everything internally you'll never end up writing out a \r\n that gets transformed into a \r\r\n.  But when interoperating with other native code (or .NET code in our case) it's fairly easy to be exposed to a string which contains \r\n.  Ultimately we see odd behavior when round tripping the contents of a multi-line text box through a file.
>
> So today users have to be aware of the fact that Python internally always uses \n.  They also need to be aware of any APIs that they call that might return a string with an embedded \r\n inside of them and transform the string back into the Python version.
>
> It could be argued that there's little value in doing the simple transformation from \r\n -> \r\r\n.  Ultimately that creates a file that has line endings which aren't good on any platform.  On the other hand it could also be argued that Python defines new-lines as \n and there should be no deviation from that.  And doing so could be considered a slippery slope, first file deals with it, and next the standard libraries, etc...  Finally this might break some apps and if we changed IronPython to behave differently we could introduce incompatibilities which we don't want.
>
> So I'm curious: Is there a reason this behavior is useful that I'm missing?

No, it is simply the way Microsoft's C stdio library works. :-(

A simple workaround would be to apply s.replace('\r', '') before
writing anything of course.
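
A rough sketch of the doubling and of the workaround. It uses the
newline argument of Python 3's open() to force Windows-style translation
on any platform; the helper names are made up for illustration:

```python
import os
import tempfile


def write_text_crlf(text):
    """Write text in CRLF-translating text mode; return the raw bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # newline="\r\n" translates every "\n" to "\r\n" on write,
        # which mimics Windows text mode regardless of platform.
        with open(path, "w", encoding="ascii", newline="\r\n") as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


def strip_cr(text):
    # The workaround mentioned above: drop every "\r" before writing.
    return text.replace("\r", "")
```

Writing "a\r\nb\n" directly produces the doubled "\r\r\n" after "a";
passing the string through strip_cr() first yields clean "\r\n" line
endings throughout.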

> Would there be a possibility (or objections to) making \r\n be untransformed in the Py3k timeframe?  Or should we just tell our users to open files in binary mode? :)

Py3k supports a number of different ways of working with newlines for
text files, but not (yet) one that leaves \r\n alone while translating
a lone \n into \r\n. It may not be too late to change that though.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From dinov at exchange.microsoft.com  Thu Sep 27 00:09:04 2007
From: dinov at exchange.microsoft.com (Dino Viehland)
Date: Wed, 26 Sep 2007 15:09:04 -0700
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <46FAD68C.5030702@v.loewis.de>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<46FAD68C.5030702@v.loewis.de>
Message-ID: <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>

My understanding is that users can write code that uses only \n and Python will write the end-of-line character(s) that are appropriate for the platform when writing to a file.  That's what I meant by uses \n for everything internally.

But if you write \r\n to a file Python completely ignores the presence of the \r and transforms the \n into a \r\n anyway, hence the \r\r in the resulting stream.  My last question is simply does anyone find writing \r\r\n when the original string contained \r\n a useful behavior - personally I don't see how it is.

But Guido's response makes this sound like it's a problem w/ VC++ stdio implementation and not something that Python is explicitly doing.  Anyway, it might be useful to have a text-mode file that you can write \r\n to and only get \r\n in the resulting file.  But if the general sentiment is s.replace('\r', '') is the way to go we can advise our users of the behavior when interoperating w/ APIs that return \r\n in strings.



From fuzzyman at voidspace.org.uk  Thu Sep 27 00:14:58 2007
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 26 Sep 2007 23:14:58 +0100
Subject: [Python-Dev] [python] Re:  New lines, carriage returns,
	and Windows
In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>	<46FAD68C.5030702@v.loewis.de>
	<7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
Message-ID: <46FAD9E2.6000103@voidspace.org.uk>

Dino Viehland wrote:
> My understanding is that users can write code that uses only \n and Python will write the end-of-line character(s) that are appropriate for the platform when writing to a file.  That's what I meant by uses \n for everything internally.
>
> But if you write \r\n to a file Python completely ignores the presence of the \r and transforms the \n into a \r\n anyway, hence the \r\r in the resulting stream.  My last question is simply does anyone find writing \r\r\n when the original string contained \r\n a useful behavior - personally I don't see how it is.
>
> But Guido's response makes this sound like it's a problem w/ VC++ stdio implementation and not something that Python is explicitly doing.  Anyway, it might be useful to have a text-mode file that you can write \r\n to and only get \r\n in the resulting file.  But if the general sentiment is s.replace('\r', '') is the way to go we can advise our users of the behavior when interoperating w/ APIs that return \r\n in strings.
>   

We always do replace('\r\n','\n') but same difference...

Michael



From dinov at exchange.microsoft.com  Thu Sep 27 00:17:01 2007
From: dinov at exchange.microsoft.com (Dino Viehland)
Date: Wed, 26 Sep 2007 15:17:01 -0700
Subject: [Python-Dev] [python] Re:  New lines, carriage returns,
 and Windows
In-Reply-To: <46FAD9E2.6000103@voidspace.org.uk>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<46FAD68C.5030702@v.loewis.de>
	<7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<46FAD9E2.6000103@voidspace.org.uk>
Message-ID: <7AD436E4270DD54A94238001769C2227CCBD18CBBE@DF-GRTDANE-MSG.exchange.corp.microsoft.com>

And if this is fine for you, given that you may have the largest WinForms / IronPython code base, I tend to think the replace may be reasonable.  But we have had someone get surprised by this behavior.


From fuzzyman at voidspace.org.uk  Thu Sep 27 00:22:52 2007
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 26 Sep 2007 23:22:52 +0100
Subject: [Python-Dev] [python] Re:  New lines, carriage returns,
	and Windows
In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CBBE@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>	<46FAD68C.5030702@v.loewis.de>
	<7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<46FAD9E2.6000103@voidspace.org.uk>
	<7AD436E4270DD54A94238001769C2227CCBD18CBBE@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
Message-ID: <46FADBBC.5070407@voidspace.org.uk>

Dino Viehland wrote:
> And if this is fine for you, given that you may have the largest WinForms / IronPython code base, I tend to think the replace may be reasonable.  But we have had someone get surprised by this behavior.
>   

It is a slight impedance mismatch between Python and Windows - but isn't 
restricted to IronPython, so changing Python semantics doesn't seem like 
the right answer.

Alternatively a more intelligent text mode (that writes '\n' as '\r\n' 
and '\r\n' as '\r\n' on Windows) doesn't sound like *such* a bad idea - 
but you will still get caught out by this. A string read in text mode  
will read '\r\n' as '\n'. Setting this on a winforms component will 
still do the wrong thing. Better to be aware of the difference and use 
binary mode.
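
A small sketch of what such a "more intelligent" translation could look
like, assuming hypothetical helpers applied by hand around binary-mode
I/O (nothing here is an existing file mode): only a '\n' that is not
already preceded by '\r' is expanded.

```python
import re

# Match a "\n" that is not already preceded by "\r".
_LONE_LF = re.compile(r"(?<!\r)\n")


def to_crlf(text):
    """Hypothetical helper: expand lone LF to CRLF, leave CRLF alone."""
    return _LONE_LF.sub("\r\n", text)


def to_lf(text):
    """Opposite direction, as a text-mode read would do on Windows."""
    return text.replace("\r\n", "\n")
```

Note that to_lf(to_crlf(s)) always comes back with bare '\n', so a
string headed for a WinForms control still needs an explicit to_crlf()
call, which is the mismatch described above.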

Michael



From jimjjewett at gmail.com  Thu Sep 27 02:35:36 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 26 Sep 2007 20:35:36 -0400
Subject: [Python-Dev] urllib exception compatibility
Message-ID: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>

urllib goes to some trouble to ensure that it raises IOError,
even when the underlying exception comes from another module.[*]  I'm
wondering if it would make sense to just have those modules'
exceptions inherit from IOError.

In particular, should socket.error, ftplib.Error and
httplib.HTTPException (used in Py3K) inherit from IOError?
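
As a rough illustration of the idea, using made-up exception names
rather than the real stdlib classes: once module-specific errors derive
from IOError, callers can catch IOError without any re-wrapping.

```python
class FTPError(IOError):
    """Hypothetical module-level error that inherits from IOError."""


class HTTPError(IOError):
    """Hypothetical HTTP error, likewise derived from IOError."""


def fetch(kind):
    # Stand-in for library code that raises its own exception type.
    if kind == "ftp":
        raise FTPError("ftp error")
    if kind == "http":
        raise HTTPError("http error")
    return "ok"


def safe_fetch(kind):
    # A single except clause covers every IOError subclass.
    try:
        return fetch(kind)
    except IOError as exc:
        return "failed: %s" % exc
```

The original exception type and traceback survive, so code that wants
the more specific class can still catch it.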

I'm also wondering whether it would be acceptable to change the
details of the exceptions.  For example, could

            raise IOError, ('ftp error', msg), sys.exc_info()[2]

be reworded, or is there too much risk that someone is checking
for an "errno" of 'ftp error'?


[*]  This isn't a heavily tested path; some actually fail with a
TypeError since 2.5, because IOError no longer accepts argument tuples
longer than 3.  http://bugs.python.org/issue1209  Fortunately, this
makes me less worried about changing the contents of the specific
attributes to something more useful...

-jJ

From greg.ewing at canterbury.ac.nz  Thu Sep 27 03:33:47 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 27 Sep 2007 13:33:47 +1200
Subject: [Python-Dev] Iterating over objects of unknown length
In-Reply-To: <20070926152408.GA24021@phd.pp.ru>
References: <20070926152408.GA24021@phd.pp.ru>
Message-ID: <46FB087B.8080107@canterbury.ac.nz>

Oleg Broytmann wrote:
> Hello!
> 
>    (This seems like a "developing with Python" question and partially it is
> but please read on.)
> 
>    I have a class that represents SQL queries. Instances of the class can
> be iterated over. ... users of
> the class sometimes write
> 
> if sqlQuery:
>    for row in sqlQuery: ...
> else:
>    # no rows
> 
> To prevent users from writing such code the class implements __nonzero__()
> that always raises an exception.

I'm not sure I like that idea. It's common practice to write
'if x:' as a shorthand for 'if x is not None:' when it's known
that x is an object that doesn't have a notion of emptiness.
A __nonzero__ that always raises an exception just to spite
you interferes with that.

Another thing is that any code doing "if x" to test for
emptiness is clearly expecting x to be a sequence, *not*
an iterator, and you've violated the contract by passing
it one. This is what you may be running into with the libraries
you mention.

Generally I think it's a bad idea to try to protect people
from themselves when doing so can interfere with legitimate
usage.
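
A small sketch of the conflict described above, in Python 3 spelling
(__bool__ rather than __nonzero__). Query, StrictQuery and the helper
names are made up for illustration; they are not the class under
discussion.

```python
class Query:
    """Hypothetical iterable with no cheap way to know its length."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __iter__(self):
        return iter(self._rows)


class StrictQuery(Query):
    def __bool__(self):  # spelled __nonzero__ in Python 2
        raise TypeError("truth value of a query is undefined")


def describe(query=None):
    # 'if query:' used as shorthand for 'if query is not None:'.
    if query:
        return "have a query"
    return "no query"


def count_rows(query):
    # Emptiness is discovered by iterating, not by truth-testing.
    count = 0
    for _row in query:
        count += 1
    return count
```

describe() works with a plain Query (any object without __bool__ or
__len__ is true) but blows up with a StrictQuery, even though the caller
only meant to test for None. count_rows() works with both.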

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From guido at python.org  Thu Sep 27 03:52:45 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 26 Sep 2007 18:52:45 -0700
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
Message-ID: <ca471dc20709261852y65c11270k5396cbf41c4ebc6c@mail.gmail.com>

Shouldn't these all inherit from EnvironmentError?

Or should EnvironmentError and IOError be the same thing perhaps?

--Guido

On 9/26/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> urllib goes to some trouble to ensure that it raises IOError,
> even when the underlying exception comes from another module.[*]  I'm
> wondering if it would make sense to just have those modules'
> exceptions inherit from IOError.
>
> In particular, should socket.error, ftp.Error and
> httplib.HTTPException (used in Py3K) inherit from IOError?
>
> I'm also wondering whether it would be acceptable to change the
> details of the exceptions.  For example, could
>
>             raise IOError, ('ftp error', msg), sys.exc_info()[2]
>
> be reworded, or is there too much risk that someone is checking
> for an "errno" of 'ftp error'?
>
>
> [*]  This isn't a heavily tested path; some actually fail with a
> TypeError since 2.5, because IOError no longer accepts argument tuples
> longer than 3.  http://bugs.python.org/issue1209  Fortunately, this
> makes me less worried about changing the contents of the specific
> attributes to something more useful...
>
> -jJ


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From phd at phd.pp.ru  Thu Sep 27 04:04:08 2007
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Thu, 27 Sep 2007 06:04:08 +0400
Subject: [Python-Dev] Iterating over objects of unknown length
In-Reply-To: <46FB087B.8080107@canterbury.ac.nz>
References: <20070926152408.GA24021@phd.pp.ru>
	<46FB087B.8080107@canterbury.ac.nz>
Message-ID: <20070927020408.GC4287@phd.pp.ru>

On Thu, Sep 27, 2007 at 01:33:47PM +1200, Greg Ewing wrote:
> Oleg Broytmann wrote:
> >if sqlQuery:
> >   for row in sqlQuery: ...
> >else:
> >   # no rows
> >
> >To prevent users from writing such code the class implements __nonzero__()
> >that always raises an exception.
> 
> I'm not sure I like that idea. It's common practice to write
> 'if x:' as a shorthand for 'if x is not None:' when it's known
> that x is an object that doesn't have a notion of emptiness.
> Another thing is that any code doing "if x" to test for
> emptiness is clearly expecting x to be a sequence, *not*
> an iterator, and you've violated the contract by passing
> it one. This is what you may be running into with the libraries
> you mention.

   In most cases the code in those libraries is, to use Mr. van Rossum's
word, "archaic". It was developed for old versions of Python (long before
Python got the iterator protocol). I will file bug reports and patches
(I have filed one about logging/__init__.py) to allow developers to either
fix the code or document the fact that the code really requires a finite
sequence.
   Unfortunately, now that my code no longer raises an exception, it will be
harder to spot the buggy libraries.

> Generally I think it's a bad idea to try to protect people
> from themselves when doing so can interfere with legitimate
> usage.

   I agree. I admitted on the mailing list that it was my design mistake. The
offending __nonzero__ was removed from SVN today.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From greg.ewing at canterbury.ac.nz  Thu Sep 27 04:04:13 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 27 Sep 2007 14:04:13 +1200
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
Message-ID: <46FB0F9D.6010303@canterbury.ac.nz>

Dino Viehland wrote:
> When writing a string in text mode that contains \r\n we both write \r\r\n

Maybe there should be a universal newlines mode defined for
output as well as input, which translates any of "\r", "\n"
or "\r\n" into the platform line ending.

Although I suspect that a string containing "\r\n" is going
to cause more problems for Python applications than this.
E.g. consider what happens when you try to split a string
on newlines.
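
To make the splitting problem concrete, here is a short sketch. The
normalize_newlines() helper is a made-up name showing one way to do the
"any of \r, \n or \r\n" translation by hand.

```python
import re


def normalize_newlines(text, eol="\n"):
    # Try "\r\n" first so it counts as one line ending, not two.
    return re.sub(r"\r\n|\r|\n", lambda match: eol, text)


text = "one\r\ntwo\r\nthree"

naive = text.split("\n")        # each piece keeps a stray "\r"
robust = text.splitlines()      # understands all three conventions
```

Here naive is ['one\r', 'two\r', 'three'] while robust is
['one', 'two', 'three'].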

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From skip at pobox.com  Thu Sep 27 04:46:07 2007
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 26 Sep 2007 21:46:07 -0500
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <46FB0F9D.6010303@canterbury.ac.nz>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<46FB0F9D.6010303@canterbury.ac.nz>
Message-ID: <18171.6511.526695.684154@montanaro.dyndns.org>


    Greg> Maybe there should be a universal newlines mode defined for output
    Greg> as well as input, which translates any of "\r", "\n" or "\r\n"
    Greg> into the platform line ending.

I thought that's the way it was supposed to work, but it clearly doesn't:

    >>> f = open("test.txt", "wt")
    >>> f.write("a\rb\rnc\n")
    7
    >>> f.close()
    >>> open("test.txt", "rb").read()
    b'a\rb\rnc\n'

I'd be open to such a change.  Principle of least surprise?
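
The same effect can be reproduced on any platform with the io module as
it exists in current Python 3, by forcing the line ending instead of
relying on the platform default. This is only a sketch; written_bytes()
is a made-up helper name.

```python
import io


def written_bytes(text, newline):
    """Write text through a text layer and return the raw bytes."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    stream.write(text)
    stream.detach()          # flushes and leaves 'raw' open
    return raw.getvalue()


# Only "\n" is translated on output; "\r" passes through untouched,
# so an existing "\r\n" turns into "\r\r\n".
windows_style = written_bytes("a\rb\r\nc\n", "\r\n")
unix_style = written_bytes("a\rb\r\nc\n", "\n")
```

With newline="\r\n" the result is b'a\rb\r\r\nc\r\n'; with newline="\n"
the text is written unchanged.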

Skip

From greg.ewing at canterbury.ac.nz  Thu Sep 27 04:54:51 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 27 Sep 2007 14:54:51 +1200
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
Message-ID: <46FB1B7B.1020001@canterbury.ac.nz>

Jim Jewett wrote:
> In particular, should socket.error, ftp.Error and
> httplib.HTTPException (used in Py3K) inherit from IOError?

I'd say that if they incorporate a C library result code they
should inherit from IOError, or if they incorporate a system
call result code they should inherit from OSError. Otherwise
they should inherit from EnvironmentError.

I don't think there's any point in simply catching one of
these and re-wrapping it in the library's own exception
class, but if such wrapping is done, it should inherit
from EnvironmentError as well.

It's convenient to be able to catch EnvironmentError and
get anything that is caused by circumstances outside the
program's control.
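
A sketch of what that convenience looks like from the caller's side.
TransferError and flaky_transfer() are hypothetical names for a library's
own exception and an operation that fails for external reasons.

```python
class TransferError(EnvironmentError):
    """Hypothetical library exception for a failed transfer."""


def flaky_transfer():
    raise TransferError("connection reset by peer")


def run(action):
    try:
        action()
    except EnvironmentError as err:
        # One handler covers file, OS and library-level failures.
        return "environment problem: %s" % err
    return "ok"
```

run(flaky_transfer) reports the problem, while a programming error such
as a ValueError would still propagate to the caller.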

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From martin at v.loewis.de  Thu Sep 27 07:24:41 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 27 Sep 2007 07:24:41 +0200
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>	<46FAD68C.5030702@v.loewis.de>
	<7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
Message-ID: <46FB3E99.9030100@v.loewis.de>

> My understanding is that users can write code that uses only \n and
> Python will write the end-of-line character(s) that are appropriate
> for the platform when writing to a file.  That's what I meant by uses
> \n for everything internally.

Here you misunderstand - that's only the case when the file is opened
in text mode. If the file is opened in binary mode, and you write \n,
then it writes just a single byte (0xA).

> But if you write \r\n to a file Python completely ignores the
> presence of the \r and transforms the \n into a \r\n anyway, hence
> the \r\r in the resulting stream.  My last question is simply does
> anyone find writing \r\r\n when the original string contained \r\n a
> useful behavior - personally I don't see how it is.

That's just for consistency.

> But Guido's response makes this sound like it's a problem w/ VC++
> stdio implementation and not something that Python is explicitly
> doing.

That's correct - it's the notion of "text mode" for file IO.

> Anyway, it might be useful to have a text-mode file that
> you can write \r\n to and only get \r\n in the resulting file.

This I don't understand. Why don't you just use binary mode then?
At least for Python 2.x, the *only* difference between text and
binary mode is the treatment of line endings.

For Python 3, things will be different as the distinction goes
further; the precise API for line endings is still being considered
there.
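
For comparison, a sketch against the open() that current Python 3
provides, where the newline argument controls translation. Passing
newline="" turns translation off for a text-mode file, which gives the
"write \r\n, get \r\n" behaviour asked about; binary mode does the same
for bytes. The roundtrip() helper is a made-up name.

```python
import os
import tempfile


def roundtrip(data, mode, **kwargs):
    """Write data to a temporary file and return the bytes on disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, mode, **kwargs) as f:
            f.write(data)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


binary = roundtrip(b"x\r\ny\n", "wb")
untranslated_text = roundtrip("x\r\ny\n", "w", newline="")
```

Both calls leave the line endings exactly as written, on every platform.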

Regards,
Martin


From guido at python.org  Thu Sep 27 16:32:44 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 27 Sep 2007 07:32:44 -0700
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <46FB1B7B.1020001@canterbury.ac.nz>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
Message-ID: <ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>

How about making IOError, OSError and EnvironmentError all aliases for
the same thing? The distinction is really worthless historical
baggage.
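
For readers checking this against a current interpreter: in Python 3.3
and later (PEP 3151) the names are in fact aliases of one class. The
sketch below only inspects the builtins; same_class() and
catch_as_oserror() are made-up helper names.

```python
import socket


def same_class(*classes):
    # True when every name is bound to the very same class object.
    return all(cls is classes[0] for cls in classes)


aliases = same_class(OSError, IOError, EnvironmentError, socket.error)


def catch_as_oserror():
    try:
        raise IOError("disk full")
    except OSError as err:
        # The handler matches because IOError *is* OSError.
        return type(err).__name__
```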

On 9/26/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Jim Jewett wrote:
> > In particular, should socket.error, ftp.Error and
> > httplib.HTTPException (used in Py3K) inherit from IOError?
>
> I'd say that if they incorporate a C library result code they
> should inherit from IOError, or if they incorporate a system
> call result code they should inherit from OSError. Otherwise
> they should inherit from EnvironmentError.
>
> I don't think there's any point in simply catching one of
> these and re-wrapping it in the library's own exception
> class, but if such wrapping is done, it should inherit
> from EnvironmentError as well.
>
> It's convenient to be able to catch EnvironmentError and
> get anything that is caused by circumstances outside the
> program's control.
>
> --
> Greg Ewing, Computer Science Dept, +--------------------------------------+
> University of Canterbury,          | Carpe post meridiem!                 |
> Christchurch, New Zealand          | (I'm not a morning person.)          |
> greg.ewing at canterbury.ac.nz        +--------------------------------------+


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From dinov at exchange.microsoft.com  Thu Sep 27 18:49:07 2007
From: dinov at exchange.microsoft.com (Dino Viehland)
Date: Thu, 27 Sep 2007 09:49:07 -0700
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <46FB3E99.9030100@v.loewis.de>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<46FAD68C.5030702@v.loewis.de>
	<7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<46FB3E99.9030100@v.loewis.de>
Message-ID: <7AD436E4270DD54A94238001769C2227CCBD18CD3B@DF-GRTDANE-MSG.exchange.corp.microsoft.com>

> This I don't understand. Why don't you just use binary mode then?
> At least for Python 2.x, the *only* difference between text and
> binary mode is the treatment of line endings.

That just flips the problem to the other side.  Now if I have a
Python library that I'm mixing w/ .NET code I need to be sure to
transform the line endings, but now from \n -> \r\n (and hopefully
they'd detect the new-line style they should use so it works
correctly on Mono on *nix or Silverlight on OS X as well).



From guido at python.org  Thu Sep 27 19:35:18 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 27 Sep 2007 10:35:18 -0700
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <18171.6511.526695.684154@montanaro.dyndns.org>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<46FB0F9D.6010303@canterbury.ac.nz>
	<18171.6511.526695.684154@montanaro.dyndns.org>
Message-ID: <ca471dc20709271035h29948358xe6bba5466edb1963@mail.gmail.com>

[moving to python-3000]

The symmetry isn't as strong as you suggest, but I agree it would be a
useful feature. Would you mind filing a Py3k feature request so we
don't forget?

A proposal for an API given the existing newlines=... parameter
(described in detail in PEP 3116) would be even better.

And a patch would be best, but you know that. :-)

--Guido

On 9/26/07, skip at pobox.com <skip at pobox.com> wrote:
>
>     Greg> Maybe there should be a universal newlines mode defined for output
>     Greg> as well as input, which translates any of "\r", "\n" or "\r\n"
>     Greg> into the platform line ending.
>
> I thought that's the way it was supposed to work, but it clearly doesn't:
>
>     >>> f = open("test.txt", "wt")
>     >>> f.write("a\rb\rnc\n")
>     7
>     >>> f.close()
>     >>> open("test.txt", "rb").read()
>     b'a\rb\rnc\n'
>
> I'd be open to such a change.  Principle of least surprise?
>
> Skip


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at krypto.org  Thu Sep 27 20:09:45 2007
From: greg at krypto.org (Gregory P. Smith)
Date: Thu, 27 Sep 2007 11:09:45 -0700
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
Message-ID: <52dc1c820709271109j7d95b63chad74d2b615dd792e@mail.gmail.com>

On 9/27/07, Guido van Rossum <guido at python.org> wrote:
>
> How about making IOError, OSError and EnvironmentError all aliases for
> the same thing? The distinction is really worthless historical
> baggage.
>

+1 on that.

From brett at python.org  Thu Sep 27 22:23:58 2007
From: brett at python.org (Brett Cannon)
Date: Thu, 27 Sep 2007 13:23:58 -0700
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
Message-ID: <bbaeab100709271323g4e25c5e8i1d343e402536b6c8@mail.gmail.com>

On 9/27/07, Guido van Rossum <guido at python.org> wrote:
> How about making IOError, OSError and EnvironmentError all aliases for
> the same thing? The distinction is really worthless historical
> baggage.
>

+1 from me.

Should OSError and IOError become aliases to EnvironmentError?  I
assume WindowsError and VMSError will just directly subclass
whichever exception sticks around.

And should we bother with a PendingDeprecationWarning for IOError or
OSError?  Or just have a Py3K warning for them and not worry about
their removal in the 2.x series and just let 2to3 handle the
transition?

-Brett

From dalcinl at gmail.com  Fri Sep 28 00:18:09 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 27 Sep 2007 19:18:09 -0300
Subject: [Python-Dev] building with -Wwrite-strings
Message-ID: <e7ba66e40709271518v1f10cdd4h8643f74048bf0b82@mail.gmail.com>

I'm trying to build Python (2.6) with the GCC option -Wwrite-strings.

1 - Is there any interest in this?

2 - What should I do for the very common pattern (taken from int_new):
   static char *kwlist[] = {"x", "base", 0};

I was able to remove all the warnings in Objects/*, except those related to (2).


-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From graham.horler at gmail.com  Fri Sep 28 00:35:36 2007
From: graham.horler at gmail.com (Graham Horler)
Date: Thu, 27 Sep 2007 23:35:36 +0100
Subject: [Python-Dev] urllib exception compatibility
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
	<bbaeab100709271323g4e25c5e8i1d343e402536b6c8@mail.gmail.com>
Message-ID: <63jqje$2dsube@venus.eclipse.kcom.com>

On 27 Sep 2007, 21:23:58, Brett Cannon wrote:
> Should OSError and IOError become aliases to EnvironmentError?  I
> assume WindowsError and VMSError will just directly subclass which
> ever exception sticks around.
> 
> And should we bother with a PendingDeprecationWarning for IOError or
> OSError?  Or just have a Py3K warning for them and not worry about
> their removal in the 2.x series and just let 2to3 handle the
> transition?

Am I missing something?  I thought Py3K was supposed to throw backwards
compatibility to the wind in favor of doing the "Right Thing".

If so, can't we lose the proposed OSError and IOError aliases altogether,
and just keep EnvironmentError?

Perhaps "EnvironmentError" is a bit long to type in all the places OSError
and IOError are used; I personally like the look of OSError and IOError better
in my code.  I vote for a shorter name for EnvironmentError, e.g. EMError.

just my 2c, Graham


From guido at python.org  Fri Sep 28 00:59:12 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 27 Sep 2007 15:59:12 -0700
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <63jqje$2dsube@venus.eclipse.kcom.com>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
	<bbaeab100709271323g4e25c5e8i1d343e402536b6c8@mail.gmail.com>
	<63jqje$2dsube@venus.eclipse.kcom.com>
Message-ID: <ca471dc20709271559l142d63ci1cf00966e3c70b79@mail.gmail.com>

I'd be happy to make them all IOError. 2to3 can clean this up.

On 9/27/07, Graham Horler <graham.horler at gmail.com> wrote:
> On 27 Sep 2007, 21:23:58, Brett Cannon wrote:
> > Should OSError and IOError become aliases to EnvironmentError?  I
> > assume WindowsError and VMSError will just directly subclass which
> > ever exception sticks around.
> >
> > And should we bother with a PendingDeprecationWarning for IOError or
> > OSError?  Or just have a Py3K warning for them and not worry about
> > their removal in the 2.x series and just let 2to3 handle the
> > transition?
>
> Am I missing something?  I thought Py3K was supposed to throw backwards
> compatibility to the wind in favor of doing the "Right Thing".
>
> If so, can't we lose the proposed OSError and IOError aliases altogether,
> and just keep EnvironmentError?
>
> Perhaps "EnvironmentError" is a bit long to type in all the places OSError
> and IOError are used, I personally like the look of OSError and IOError better
> in my code.  I vote for a shorter name for EnvironmentError, e.g. EMError.
>
> just my 2c, Graham
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bjourne at gmail.com  Fri Sep 28 01:35:27 2007
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Fri, 28 Sep 2007 01:35:27 +0200
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
Message-ID: <740c3aec0709271635i4186cb73y84f905882f446f67@mail.gmail.com>

On 9/27/07, Guido van Rossum <guido at python.org> wrote:
> How about making IOError, OSError and EnvironmentError all aliases for
> the same thing? The distinction is really worthless historical
> baggage.

Wouldn't it also be nice to have some subclasses of IOError like
FileNotFoundError, IOPermissionError and EOFError? Often that would be
easier than having to use the errno attribute to find out the exact
cause.
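
A sketch contrasting the two styles. The errno version works as
described in this thread; the subclass version relies on
FileNotFoundError and PermissionError, which exist in Python 3.3 and
later. The function names and the "missing.txt" path are made up for
illustration.

```python
import errno
import os
import tempfile


def classify_by_errno(path):
    try:
        open(path).close()
    except IOError as err:
        if err.errno == errno.ENOENT:
            return "not found"
        if err.errno in (errno.EACCES, errno.EPERM):
            return "permission denied"
        raise
    return "ok"


def classify_by_subclass(path):
    try:
        open(path).close()
    except FileNotFoundError:
        return "not found"
    except PermissionError:
        return "permission denied"
    return "ok"


def demo():
    # A fresh temporary directory cannot contain this file.
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "missing.txt")
        return classify_by_errno(path), classify_by_subclass(path)
```

Both functions give the same answer; the second simply moves the errno
comparison into the except clause.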

-- 
mvh Björn

From guido at python.org  Fri Sep 28 01:41:57 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 27 Sep 2007 16:41:57 -0700
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <740c3aec0709271635i4186cb73y84f905882f446f67@mail.gmail.com>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
	<740c3aec0709271635i4186cb73y84f905882f446f67@mail.gmail.com>
Message-ID: <ca471dc20709271641l2fdcd6e4y3d964c48c9badac9@mail.gmail.com>

I suspect that the use case for those errors is far less than you think.

On 9/27/07, BJörn Lindqvist <bjourne at gmail.com> wrote:
> On 9/27/07, Guido van Rossum <guido at python.org> wrote:
> > How about making IOError, OSError and EnvironmentError all aliases for
> > the same thing? The distinction is really worthless historical
> > baggage.
>
> Wouldn't it also be nice to have some subclasses of IOError like
> FileNotFoundError, IOPermissionError and EOFError? Often that would be
> easier than having to use the errno attribute to find out the exact
> cause.
>
> --
> mvh Björn
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From eric+python-dev at trueblade.com  Fri Sep 28 02:39:28 2007
From: eric+python-dev at trueblade.com (Eric Smith)
Date: Thu, 27 Sep 2007 20:39:28 -0400
Subject: [Python-Dev] Decimal news
In-Reply-To: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com>
References: <e04bdf310709131219n7dde0ef9q9aa69265b0d3affb@mail.gmail.com>
	<9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com>
Message-ID: <46FC4D40.4090808@trueblade.com>

Thomas Wouters wrote:

> Unfortunately, that's not how it works :-) If you check something into 
> the trunk, it will be merged into Py3k sooner or later. I may ask the 
> original submitter for assistance if it's incredibly hard to figure out 
> the changes, but so far, I only had to do that with the SSL changes. The 
> decimal changes are being merged as I write this (tests running now.) Is 
> there anything in particular that needs to be done for decimal in Py3k, 
> besides renaming __div__ to __truediv__?
> 
> If you re-eally need to check something into the trunk that re-eally 
> must not be merged into py3k, but you're afraid it's not going to be 
> obvious to the merger, please record the change as 'merged' using 
> "svnmerge merge -M -r<revision>". Please take care when picking the 
> revision ;) You can also just email me or someone else you see doing 
> merges, as I doubt this will be a common occurrence.

I'm getting ready to port my PEP 3101 implementation (str.format() and
friends) from py3k back to 2.6.  How do I make it obvious that this is
something that doesn't need to be ported to py3k?  I'm not sure what
"obvious to the merger" means.  Is a "backported" checkin comment good
enough?  With any luck this will be done with a single checkin to the
trunk, and another checkin to py3k so that the implementations can
remain identical.

I just want to make sure I don't make life more difficult than necessary
for the folks doing the very valuable merge process.

Eric.



From greg.ewing at canterbury.ac.nz  Fri Sep 28 03:30:58 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 28 Sep 2007 13:30:58 +1200
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
Message-ID: <46FC5952.7070707@canterbury.ac.nz>

Guido van Rossum wrote:
> How about making IOError, OSError and EnvironmentError all aliases for
> the same thing? The distinction is really worthless historical
> baggage.

To my mind, the distinction is that IOError and OSError
have an attribute for the error code, and the code found
there has a well-defined meaning (C library error code
and system call error code respectively), whereas
EnvironmentError is more general.

While it might be possible to merge them all together
on Unix-like systems, that wouldn't necessarily be
true on all platforms -- the IOError and OSError codes
might belong to different domains. Although I suppose
you could have another attribute to distinguish them
if necessary.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From thomas at python.org  Fri Sep 28 03:32:41 2007
From: thomas at python.org (Thomas Wouters)
Date: Thu, 27 Sep 2007 18:32:41 -0700
Subject: [Python-Dev] Decimal news
In-Reply-To: <46FC4D40.4090808@trueblade.com>
References: <e04bdf310709131219n7dde0ef9q9aa69265b0d3affb@mail.gmail.com>
	<9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com>
	<46FC4D40.4090808@trueblade.com>
Message-ID: <9e804ac0709271832u306af234i13f656a35672c4ce@mail.gmail.com>

On 9/27/07, Eric Smith <eric+python-dev at trueblade.com> wrote:
>
> Thomas Wouters wrote:
>
> > Unfortunately, that's not how it works :-) If you check something into
> > the trunk, it will be merged into Py3k sooner or later. I may ask the
> > original submitter for assistance if it's incredibly hard to figure out
> > the changes, but so far, I only had to do that with the SSL changes. The
> > decimal changes are being merged as I write this (tests running now.) Is
> > there anything in particular that needs to be done for decimal in Py3k,
> > besides renaming __div__ to __truediv__?
> >
> > If you re-eally need to check something into the trunk that re-eally
> > must not be merged into py3k, but you're afraid it's not going to be
> > obvious to the merger, please record the change as 'merged' using
> > "svnmerge merge -M -r<revision>". Please take care when picking the
> > revision ;) You can also just email me or someone else you see doing
> > merges, as I doubt this will be a common occurrence.
>
> I'm getting ready to port my PEP 3101 implementation (str.format() and
> friends) from py3k back to 2.6.  How do I make it obvious that this is
> something that doesn't need to be ported to py3k?  I'm not sure what
> "obvious to the merger" means.  Is a "backported" checkin comment good
> enough?  With any luck this will be done with a single checkin to the
> trunk, and another checkin to py3k so that the implementations can
> remain identical.


Funny, just a few hours ago Guido mentioned we (meaning I) should write this
up in a PEP :) I'll do that in the next few weeks.

Obvious to the merger means whatever the merger expects it to mean ;) Yes,
checkin comments are good. If an automatic merge fails, and the code isn't
straightforward to merge from just looking at the files, looking at the
actual changes in both branches is the next step. If the comment says
'backport from py3k' (preferably with which version was backported), that
makes it easy to ignore the whole change (but perhaps not later checkins.)
After you backport, maintenance should be done in the trunk, not the py3k
branch (except of course, for parts that don't apply to the trunk.)

> I just want to make sure I don't make life more difficult than necessary
> for the folks doing the very valuable merge process.


As long as you commit any given thing only once, it's pretty easy to work
out. As soon as you find yourself (more than once) committing things to py3k
and then realizing it should go to the trunk, you may be making life harder.
I appreciate that you're careful about this though, thanks :)


-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From greg.ewing at canterbury.ac.nz  Fri Sep 28 03:45:01 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 28 Sep 2007 13:45:01 +1200
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CD3B@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<46FAD68C.5030702@v.loewis.de>
	<7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<46FB3E99.9030100@v.loewis.de>
	<7AD436E4270DD54A94238001769C2227CCBD18CD3B@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
Message-ID: <46FC5C9D.3030803@canterbury.ac.nz>

Dino Viehland wrote:
>>Why don't you just use binary mode then?
> 
> That just flips the problem to the other side.

Seems to me that IronPython really needs two string
types, "Python string" and ".NET string", with automatic
conversion when crossing a boundary between Python
code and .NET code.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg at krypto.org  Fri Sep 28 06:58:05 2007
From: greg at krypto.org (Gregory P. Smith)
Date: Thu, 27 Sep 2007 21:58:05 -0700
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <ca471dc20709271559l142d63ci1cf00966e3c70b79@mail.gmail.com>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
	<bbaeab100709271323g4e25c5e8i1d343e402536b6c8@mail.gmail.com>
	<63jqje$2dsube@venus.eclipse.kcom.com>
	<ca471dc20709271559l142d63ci1cf00966e3c70b79@mail.gmail.com>
Message-ID: <52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com>

Is IOError the right name to use?  OSError is raised for things that are
not IO, such as subprocess, dlopen, system.

Nobody likes typing out EnvironmentError, and people dislike the suggestion
of EMError, so should it just be OSError?  errno values are after all OS specific.

-gps

On 9/27/07, Guido van Rossum <guido at python.org> wrote:
>
> I'd be happy to make them all IOError. 2to3 can clean this up.
>
> On 9/27/07, Graham Horler <graham.horler at gmail.com> wrote:
> > On 27 Sep 2007, 21:23:58, Brett Cannon wrote:
> > > Should OSError and IOError become aliases to EnvironmentError?  I
> > > assume WindowsError and VMSError will just directly subclass which
> > > ever exception sticks around.
> > >
> > > And should we bother with a PendingDeprecationWarning for IOError or
> > > OSError?  Or just have a Py3K warning for them and not worry about
> > > their removal in the 2.x series and just let 2to3 handle the
> > > transition?
> >
> > Am I missing something, as I thought Py2K was supposed to throw backwards
> > compatibility to the wind in favor of doing the "Right Thing"?
> >
> > If so, can't we lose the proposed OSError and IOError aliases altogether,
> > and just keep EnvironmentError?
> >
> > Perhaps "EnvironmentError" is a bit long to type in all the places OSError
> > and IOError are used; I personally like the look of OSError and IOError
> > better in my code.  I vote for a shorter name for EnvironmentError, e.g.
> > EMError.
> >
> > just my 2c, Graham
> >
> > >
> > > -Brett
> > > _______________________________________________
> > > Python-Dev mailing list
> > > Python-Dev at python.org
> > > http://mail.python.org/mailman/listinfo/python-dev
> > > Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/graham.horler%40gmail.com
> >
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>

From greg.ewing at canterbury.ac.nz  Fri Sep 28 07:50:17 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 28 Sep 2007 17:50:17 +1200
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
	<bbaeab100709271323g4e25c5e8i1d343e402536b6c8@mail.gmail.com>
	<63jqje$2dsube@venus.eclipse.kcom.com>
	<ca471dc20709271559l142d63ci1cf00966e3c70b79@mail.gmail.com>
	<52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com>
Message-ID: <46FC9619.6020400@canterbury.ac.nz>

Gregory P. Smith wrote:
> Is IOError the right name to use?  OSError is raised for things that
> are not IO, such as subprocess, dlopen, system.

The trouble with either of these is that the class
of errors we're talking about doesn't necessarily come
directly from the OS or I/O library.

Often I raise my own EnvironmentError instances for
things which don't have any associated OS error code
but are nonetheless environment-related, such as an
error in a file format.

I don't reuse IOError or OSError because I feel as
though I ought to supply an errno with these, but
there isn't any.

I suppose we could pick one of these and make it
official that it's okay to instantiate it without
an errno. But it's hard to decide which one,
because they both sound too narrow in scope.
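
A minimal sketch of the pattern described above: an environment-related
error (here a bad file format) raised with no errno at all. The names
FileFormatError and parse_header are hypothetical, chosen only for
illustration. The sketch assumes present-day Python 3 (3.3 or later),
where IOError and EnvironmentError are aliases of OSError; that
unification postdates this thread.

```python
class FileFormatError(EnvironmentError):
    """Hypothetical error for environment problems with no OS error code."""


def parse_header(data):
    """Hypothetical parser: rejects data lacking a made-up MAGIC prefix."""
    if not data.startswith(b"MAGIC"):
        # Only a message is supplied; no errno is available or needed.
        raise FileFormatError("missing MAGIC header")
    return data[5:]


try:
    parse_header(b"garbage")
except EnvironmentError as exc:
    caught = exc

# Instantiating with a single argument leaves errno unset (None).
errno_value = caught.errno
message = str(caught)
```

Catching EnvironmentError (or OSError) still works, and callers that
inspect errno simply see None.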

I don't like EMError either, btw. Maybe EnvError?
Although that sounds like it has something to do
with the unix environment variables.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From rhamph at gmail.com  Fri Sep 28 08:03:17 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 28 Sep 2007 00:03:17 -0600
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <46FC9619.6020400@canterbury.ac.nz>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
	<bbaeab100709271323g4e25c5e8i1d343e402536b6c8@mail.gmail.com>
	<63jqje$2dsube@venus.eclipse.kcom.com>
	<ca471dc20709271559l142d63ci1cf00966e3c70b79@mail.gmail.com>
	<52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com>
	<46FC9619.6020400@canterbury.ac.nz>
Message-ID: <aac2c7cb0709272303g683e7893ifb3022c33355c4fa@mail.gmail.com>

On 9/27/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Gregory P. Smith wrote:
> > Is IOError the right name to use?  OSError is raised for things that
> > are not IO, such as subprocess, dlopen, system.
>
> The trouble with either of these is that the class
> of errors we're talking about doesn't necessarily come
> directly from the OS or I/O library.
>
> Often I raise my own EnvironmentError instances for
> things which don't have any associated OS error code
> but are nonetheless environment-related, such as an
> error in a file format.
>
> I don't reuse IOError or OSError because I feel as
> though I ought to supply an errno with these, but
> there isn't any.
>
> I suppose we could pick one of these and make it
> official that it's okay to instantiate it without
> an errno. But it's hard to decide which one,
> because they both sound too narrow in scope.
>
> I don't like EMError either, btw. Maybe EnvError?
> Although that sounds like it has something to do
> with the unix environment variables.

ExternalError?  Pretty vague though.

-- 
Adam Olsen, aka Rhamphoryncus

From theller at ctypes.org  Fri Sep 28 13:24:42 2007
From: theller at ctypes.org (Thomas Heller)
Date: Fri, 28 Sep 2007 13:24:42 +0200
Subject: [Python-Dev] Decimal news
In-Reply-To: <9e804ac0709271832u306af234i13f656a35672c4ce@mail.gmail.com>
References: <e04bdf310709131219n7dde0ef9q9aa69265b0d3affb@mail.gmail.com>	<9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com>	<46FC4D40.4090808@trueblade.com>
	<9e804ac0709271832u306af234i13f656a35672c4ce@mail.gmail.com>
Message-ID: <fdio9q$9ra$1@sea.gmane.org>

Thomas Wouters schrieb:
> On 9/27/07, Eric Smith <eric+python-dev at trueblade.com> wrote:
>>
>> Thomas Wouters wrote:
>>
>> > Unfortunately, that's not how it works :-) If you check something into
>> > the trunk, it will be merged into Py3k sooner or later. I may ask the
>> > original submitter for assistance if it's incredibly hard to figure out
>> > the changes, but so far, I only had to do that with the SSL changes. The
>> > decimal changes are being merged as I write this (tests running now.) Is
>> > there anything in particular that needs to be done for decimal in Py3k,
>> > besides renaming __div__ to __truediv__?
>> >
>> > If you re-eally need to check something into the trunk that re-eally
>> > must not be merged into py3k, but you're afraid it's not going to be
>> > obvious to the merger, please record the change as 'merged' using
>> > "svnmerge merge -M -r<revision>". Please take care when picking the
>> > revision ;) You can also just email me or someone else you see doing
>> > merges, as I doubt this will be a common occurrence.

I think that the 'svnmerge block -r<revision>' command should be used.  Or not?

Thomas


From g.brandl at gmx.net  Fri Sep 28 15:42:09 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 28 Sep 2007 15:42:09 +0200
Subject: [Python-Dev] Python 3.0a documentation
In-Reply-To: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82>
References: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82>
Message-ID: <fdj0be$8vf$1@sea.gmane.org>

scav at blueyonder.co.uk schrieb:
> I'd like to help out cleaning up the Python3.0 documentation.  There are a
> lot of little leftovers from 2.x that are no longer true. (mentions of
> long, callable() etc.)

I've applied the first four patches, thank you!

Georg


-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From dangyogi at gmail.com  Sun Sep 23 02:03:30 2007
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Sat, 22 Sep 2007 20:03:30 -0400
Subject: [Python-Dev] Adding concat function to itertools
Message-ID: <46F5AD52.5040407@gmail.com>

An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20070922/daea9fb4/attachment-0001.htm 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: itertoolsmodule.c
Type: text/x-csrc
Size: 62269 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20070922/daea9fb4/attachment-0001.c 

From bioinformed at gmail.com  Fri Sep 28 17:50:27 2007
From: bioinformed at gmail.com (Kevin Jacobs <jacobs@bioinformed.com>)
Date: Fri, 28 Sep 2007 11:50:27 -0400
Subject: [Python-Dev] Adding concat function to itertools
In-Reply-To: <46F5AD52.5040407@gmail.com>
References: <46F5AD52.5040407@gmail.com>
Message-ID: <2e1434c10709280850j513ddf6s6eaf748a1d1cc90@mail.gmail.com>

On 9/22/07, Bruce Frederiksen <dangyogi at gmail.com> wrote:
>
>  I've added a new function to itertools called 'concat'.  This function is
> much like *chain*, but takes all of the iterables as a single argument.
>

I've needed this once or twice, though my implementation was called
'starchain', in line with 'starmap'.  I'm not a big fan of either name,
though -- 'chainstar' and 'mapstar' are only marginally better (though it
makes me want to come up with 'saw' and 'chainsaw' functions).  Nor can I
comment on the general applicability of such a function, other than to say
that it was useful in some of my applications that utilize iterators of
iterators of indeterminate length.

-Kevin

From brett at python.org  Fri Sep 28 19:07:40 2007
From: brett at python.org (Brett Cannon)
Date: Fri, 28 Sep 2007 10:07:40 -0700
Subject: [Python-Dev] Adding concat function to itertools
In-Reply-To: <46F5AD52.5040407@gmail.com>
References: <46F5AD52.5040407@gmail.com>
Message-ID: <bbaeab100709281007w5879534dub799a391d85d689e@mail.gmail.com>

On 9/22/07, Bruce Frederiksen <dangyogi at gmail.com> wrote:
>
>  I've added a new function to itertools called 'concat'.  This function is
> much like chain, but takes all of the iterables as a single argument.  Thus
> concat(some_iterables) is logically equivalent to chain(*some_iterables);
> the difference being that chain(*some_iterables) results in some_iterables
> being fully expanded before the call to chain, while concat(some_iterables)
> only iterates on some_iterables as needed.  This makes concat more
> attractive when some_iterables is either expensive to expand or "infinite"
> in length.
>
>  Thus, concat(iterable) is like:
>
>  def concat(iterables):
>      for it in iterables:
>          for element in it:
>              yield element
>
>
>
>  I've attached an updated itertoolsmodule.c file to this email with concat
> added to it.  This was based on the 2.5.1 source.
>
>  I ask that this be considered for adoption into standard python.
>
>  Thanks in advance!
>
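
A small sketch of the laziness difference described in the quoted
proposal, assuming an "infinite" source of iterables. The generator
numbered_rows is hypothetical and exists only for this illustration;
concat is the pure-Python equivalent quoted above, not the attached C
implementation. The last line uses itertools.chain.from_iterable, which
was added in Python 2.6 (after this thread) and behaves the same way.

```python
from itertools import chain, count, islice


def concat(iterables):
    """Lazily yield each element of each iterable in `iterables`."""
    for it in iterables:
        for element in it:
            yield element


def numbered_rows():
    """Hypothetical endless source of iterables: (0, 0), (1, 1), ..."""
    for n in count():
        yield (n, n)


# concat() pulls iterables from the source only as they are needed, so
# taking the first five elements of an endless source terminates.
first_five = list(islice(concat(numbered_rows()), 5))

# chain(*numbered_rows()) would never return here, because argument
# unpacking has to exhaust the generator before chain() is even called.
same_five = list(islice(chain.from_iterable(numbered_rows()), 5))
```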

Best thing to do is to put this up on the bug tracker
(bugs.python.org) and assign it to Raymond Hettinger as itertools is
his baby.

-Brett

From status at bugs.python.org  Fri Sep 28 19:37:06 2007
From: status at bugs.python.org (Tracker)
Date: Fri, 28 Sep 2007 17:37:06 +0000 (UTC)
Subject: [Python-Dev] Summary of Tracker Issues
Message-ID: <20070928173706.BFD71782DC@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (09/21/07 - 09/28/07)
Tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 1278 open (+16) / 11424 closed (+17) / 12702 total (+33)

Open issues with patches:   415

Average duration of open issues: 680 days.
Median duration of open issues: 670 days.

Open Issues Breakdown
   open  1273 (+16)
pending     5 ( +0)

Issues Created Or Reopened (33)
_______________________________

pipe fd handling issues in subprocess.py on POSIX                09/21/07
       http://bugs.python.org/issue1187    created  anissen                  
       patch                                                                   

universal newlines doesn't identify CRLF during tell()           09/22/07
CLOSED http://bugs.python.org/issue1188    created  pjenvey                  
                                                                               

Documentation for tp_as_number tp_as_sequence tp_as_mapping      09/22/07
CLOSED http://bugs.python.org/issue1189    created  amaury.forgeotdarc       
       patch                                                                   

Windows rants& sugestions.                                       09/22/07
CLOSED http://bugs.python.org/issue1190    created  wolfstar359              
                                                                               

bsddb does not build with bsddb lib v3.1.                        09/22/07
       http://bugs.python.org/issue1191    created  giraffedata              
                                                                               

Python 3 documents crash Firefox                                 09/24/07
CLOSED http://bugs.python.org/issue1192    created  rtmq                     
                                                                               

os.system() encoding bug on Windows                              09/24/07
       http://bugs.python.org/issue1193    created  r_mosaic                 
                                                                               

The reduce() documentation is lost in Python 3.0a1               09/24/07
CLOSED http://bugs.python.org/issue1194    created  r_mosaic                 
                                                                               

Problems on Linux with Ctrl-D and Ctrl-C during raw_input        09/24/07
       http://bugs.python.org/issue1195    created  Rebecca                  
                                                                               

int() documentation does not specify default radix               09/24/07
CLOSED http://bugs.python.org/issue1196    created  tcdelaney                
                                                                               

logging: formatter does not accept %(funcName)s properly         09/24/07
CLOSED http://bugs.python.org/issue1197    created  CM                       
                                                                               

Test of 2to3 component auditor                                   09/24/07
CLOSED http://bugs.python.org/issue1198    created  dubois                   
                                                                               

Documentation for tp_as_number... version 2.6                    09/25/07
       http://bugs.python.org/issue1199    created  amaury.forgeotdarc       
       patch                                                                   

Allow array.array to be parsed by the t# format unit.            09/25/07
       http://bugs.python.org/issue1200    created  jyasskin                 
                                                                               

Error in array concept                                           09/25/07
CLOSED http://bugs.python.org/issue1201    created  zip                      
                                                                               

zlib.crc32() and adler32() return value                          09/25/07
       http://bugs.python.org/issue1202    created  arigo                    
                                                                               

ctypes doesn't work on Mac with --disable-toolbox-glue           09/25/07
       http://bugs.python.org/issue1203    created  janssen                  
                                                                               

readline configuration for shared libs w/o curses dependencies   09/25/07
       http://bugs.python.org/issue1204    created  mbeachy                  
       patch                                                                   

urllib fail to read URL contents, urllib2 crash Python           09/26/07
       http://bugs.python.org/issue1205    created  cosoleto                 
                                                                               

logging/__init__.py                                              09/26/07
CLOSED http://bugs.python.org/issue1206    created  phd                      
       patch                                                                   

Load tests from path (patch included)                            09/26/07
       http://bugs.python.org/issue1207    created  bluebird                 
                                                                               

Match object should be guaranteed to always be true              09/26/07
CLOSED http://bugs.python.org/issue1208    created  horcicka                 
                                                                               

IOError won't accept tuples longer than 3                        09/27/07
CLOSED http://bugs.python.org/issue1209    created  jimjjewett               
                                                                               

imaplib does not run under Python 3                              09/27/07
       http://bugs.python.org/issue1210    created  rtmq                     
                                                                               

cleanup patch for 3.0 tutorial/interpreter.rst                   09/27/07
CLOSED http://bugs.python.org/issue1211    created  scav                     
       patch                                                                   

3.0 tutorial/introduction.rst mentions 'long'                    09/27/07
CLOSED http://bugs.python.org/issue1212    created  scav                     
       patch                                                                   

3.0 tutorial/classes.rst patch                                   09/27/07
CLOSED http://bugs.python.org/issue1213    created  scav                     
       patch                                                                   

Timeout in CGIXMLRPCRequestHandler under IIS                     09/27/07
       http://bugs.python.org/issue1214    created  steenie                  
       patch                                                                   

Python hang when catching a segfault                             09/27/07
       http://bugs.python.org/issue1215    created  tebeka                   
                                                                               

Python2.5.1 fails to compile under  VC.NET2002 ( 7.0 )           09/27/07
       http://bugs.python.org/issue1216    created  kartlee                  
                                                                               

infinite loop in re module                                       09/27/07
CLOSED http://bugs.python.org/issue1217    created  andresriancho            
                                                                               

Restrict Google search to docs when in the docs subtree?         09/27/07
       http://bugs.python.org/issue1218    created  skip.montanaro           
                                                                               

3.0 library/stdtypes.rst patch                                   09/28/07
CLOSED http://bugs.python.org/issue1219    created  scav                     
       patch                                                                   



Issues Now Closed (26)
______________________

logging.basicConfig does not allow to set NOTSET level             33 days
       http://bugs.python.org/issue1021    vsajip                   
                                                                               

Test issue                                                         23 days
       http://bugs.python.org/issue1064    loewis                   
                                                                               

Allow str.join to join non-string types (as per PEP 3100)          16 days
       http://bugs.python.org/issue1145    gvanrossum               
       patch                                                                   

urllib* 20x responses not OK?                                       5 days
       http://bugs.python.org/issue1177    georg.brandl             
       patch                                                                   

py3k: Completely remove nb_coerce slot                              1 days
       http://bugs.python.org/issue1185    gvanrossum               
       patch                                                                   

optparse documentation: -- being collapsed to - in HTML             3 days
       http://bugs.python.org/issue1186    georg.brandl             
                                                                               

universal newlines doesn't identify CRLF during tell()              1 days
       http://bugs.python.org/issue1188    gvanrossum               
                                                                               

Documentation for tp_as_number tp_as_sequence tp_as_mapping         3 days
       http://bugs.python.org/issue1189    georg.brandl             
       patch                                                                   

Windows rants& sugestions.                                          0 days
       http://bugs.python.org/issue1190    loewis                   
                                                                               

Python 3 documents crash Firefox                                    0 days
       http://bugs.python.org/issue1192    loewis                   
                                                                               

The reduce() documentation is lost in Python 3.0a1                  0 days
       http://bugs.python.org/issue1194    georg.brandl             
                                                                               

int() documentation does not specify default radix                  0 days
       http://bugs.python.org/issue1196    georg.brandl             
                                                                               

logging: formatter does not accept %(funcName)s properly            1 days
       http://bugs.python.org/issue1197    vsajip                   
                                                                               

Test of 2to3 component auditor                                      0 days
       http://bugs.python.org/issue1198    dubois                   
                                                                               

Error in array concept                                              0 days
       http://bugs.python.org/issue1201    gvanrossum               
                                                                               

logging/__init__.py                                                 1 days
       http://bugs.python.org/issue1206    vsajip                   
       patch                                                                   

Match object should be guaranteed to always be true                 0 days
       http://bugs.python.org/issue1208    georg.brandl             
                                                                               

IOError won't accept tuples longer than 3                           0 days
       http://bugs.python.org/issue1209    georg.brandl             
                                                                               

cleanup patch for 3.0 tutorial/interpreter.rst                      1 days
       http://bugs.python.org/issue1211    georg.brandl             
       patch                                                                   

3.0 tutorial/introduction.rst mentions 'long'                       1 days
       http://bugs.python.org/issue1212    georg.brandl             
       patch                                                                   

3.0 tutorial/classes.rst patch                                      1 days
       http://bugs.python.org/issue1213    georg.brandl             
       patch                                                                   

infinite loop in re module                                          0 days
       http://bugs.python.org/issue1217    brett.cannon             
                                                                               

3.0 library/stdtypes.rst patch                                      0 days
       http://bugs.python.org/issue1219    georg.brandl             
       patch                                                                   

syslog syscall support for SysLogLogger                           147 days
       http://bugs.python.org/issue1711603 luke-jr                  
       patch                                                                   

RotatingFileHandler.doRollover behave wrong vs. log4j's            77 days
       http://bugs.python.org/issue1752539 vsajip                   
                                                                               

logging.FileHandler may throw exception in flush()                 63 days
       http://bugs.python.org/issue1760556 vsajip                   
                                                                               



Top Issues Most Discussed (6)
_____________________________

  5 urllib fail to read URL contents, urllib2 crash Python             2 days
open    http://bugs.python.org/issue1205   

  5 ctypes doesn't work on Mac with --disable-toolbox-glue             3 days
open    http://bugs.python.org/issue1203   

  5 Patch to rename HTMLParser module to lower_case                   36 days
open    http://bugs.python.org/issue1002   

  4 infinite loop in re module                                         0 days
closed  http://bugs.python.org/issue1217   

  4 optparse documentation: -- being collapsed to - in HTML            3 days
closed  http://bugs.python.org/issue1186   

  3 Paticular decimal mod operation wrongly output NaN.                8 days
open    http://bugs.python.org/issue1182   




From python at rcn.com  Fri Sep 28 19:45:19 2007
From: python at rcn.com (Raymond Hettinger)
Date: Fri, 28 Sep 2007 10:45:19 -0700
Subject: [Python-Dev] Adding concat function to itertools
References: <46F5AD52.5040407@gmail.com>
	<bbaeab100709281007w5879534dub799a391d85d689e@mail.gmail.com>
Message-ID: <001e01c801f7$56bdfa60$69196b0a@RaymondLaptop1>

[Bruce Frederiksen]
>>  I've added a new function to itertools called 'concat'.  This function is
>> much like chain, but takes all of the iterables as a single argument.

Any practical use cases or is this just a theoretical improvement?

For Py2.x, I'm not willing to unnecessarily expand the module.
However, for Py3k, I'm open to changing the signature for chain().


Raymond

From p.f.moore at gmail.com  Fri Sep 28 20:28:10 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 28 Sep 2007 19:28:10 +0100
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<46FAD68C.5030702@v.loewis.de>
	<7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
Message-ID: <79990c6b0709281128j71e2ff6ep3601c6ffbb270836@mail.gmail.com>

On 26/09/2007, Dino Viehland <dinov at exchange.microsoft.com> wrote:
> My understanding is that users can write code that uses only \n and Python will write the
> end-of-line character(s) that are appropriate for the platform when writing to a file.  That's
> what I meant by uses \n for everything internally.

OK, so far so good - although I'm not *quite* sure there's a
self-consistent definition of "code that only uses \n". I'll assume
you mean code that has a concept of lines, that lines never contain
anything other than text (specifically, neither \r nor \n can appear in
a line, I'll punt on whether other weird stuff like form feed is
legal), and that whenever your code needs to write data to a file, it
writes lines with \n alone between them.

> But if you write \r\n to a file Python completely ignores the presence of the \r and
> transforms the \n into a \r\n anyway, hence the \r\r in the resulting stream.  My last
> question is simply does anyone find writing \r\r\n when the original string contained \r\n a
> useful behavior - personally I don't see how it is.

In the above model, lines can't contain \r and between lines you only
ever write \n - so where did the \r\n come from?
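
The \r\r\n effect can be reproduced on any platform with a short
sketch. It assumes present-day Python 3's io module, which postdates
this thread, and uses the newline argument of io.TextIOWrapper to force
Windows-style translation; the helper name written_bytes is
hypothetical.

```python
import io


def written_bytes(text, newline):
    """Return the bytes a text-mode stream emits for `text`."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    return raw.getvalue()


# With newline="\r\n", every "\n" written becomes "\r\n"; a "\r" that
# is already in the string is passed through untouched, giving \r\r\n.
translated = written_bytes("spam\r\n", newline="\r\n")

# With newline="", no translation happens on write.
untranslated = written_bytes("spam\r\n", newline="")
```

The second call shows one way to get a text stream that writes \r\n
exactly as given, under the same assumption about the Python version.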

If you receive what you think are lines from an outside source, and
they contain \r, then you didn't sanity check your data.

If you receive a block of raw (effectively binary!) data which you
want to translate into your model, it's up to you how you cut it up
into lines.

If you read data using one of Python's text modes, it's up to you to
understand how it works.

> But Guido's response makes this sound like it's a problem w/ VC++ stdio implementation
> and not something that Python is explicitly doing.

I'm not sure it's a CRT issue. Certainly the \r\n vs \n confusion
comes from the CRT - the underlying OS (just like Unix!!!!) only deals
in files as streams of bytes. But ultimately, "lines" are an
abstraction in your code. All the CRT (and Python) do is help (or
maybe hinder) you with the "normal" cases.

> Anyway, it might be useful to have a text-mode file that you can write \r\n to and only
> get \r\n in the resulting file.

I can't comment on that, other than to say that if you better defined
the semantic model (lines, how things are encoded/decoded to files,
etc, somewhat like I tried to above) it would be more obvious what use
case this was trying to address.

> But if the general sentiment is that s.replace('\r', '') is the way to go, we can advise our users
> of the behavior when interoperating w/ APIs that return \r\n in strings.

I'd say users of the relevant APIs need to understand how the APIs
represent "lines", so that they can convert the received data to their
program's model of lines. Of course, that probably corresponds to
something like s.replace('\r','') or likely more correctly data_lines
= s.split('\r\n'). A "rule of thumb" that doesn't make it clear that
the concept of "line" has 2 different binary representations in 2
different areas (data back from APIs vs data from files) is likely to
ultimately lead to mistakes and confusion.
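
A rough sketch of the difference between the two conversions, assuming an API that separates lines with \r\n. The function names to_lines and to_file_text are hypothetical, chosen only for this illustration.

```python
def to_lines(api_text):
    # Convert text from an API that uses "\r\n" between lines into the
    # program's own model: a list of lines with no line terminators.
    return api_text.split("\r\n")

def to_file_text(lines):
    # Join lines the way the text-file model expects: "\n" alone.
    return "\n".join(lines)

api_text = "first\r\nsecond\r\nthird"
lines = to_lines(api_text)
file_text = to_file_text(lines)

# The cheap alternative gives the same answer for well-formed input...
stripped = api_text.replace("\r", "")

# ...but the two differ once a lone "\r" appears inside a line:
odd = "col1\rcol2\r\nnext"
odd_split = to_file_text(to_lines(odd))   # lone "\r" is kept
odd_stripped = odd.replace("\r", "")      # lone "\r" is silently dropped
```

Note that split() leaves a trailing empty string when the input ends with \r\n, so whether \r\n is a separator or a terminator in the API is one more thing the model has to decide.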

If you think this is bad, wait until you have to deal with Unicode
issues like what *encoding* the data is being supplied to you in.
Makes guessing newline conventions seem simple (at least to this
parochial English-speaker :-)) Although as this is IronPython, you may
already have that covered...

Paul.

PS In real life, you often just want a cheap and cheerful answer. For
that, "strip out spurious \r characters" may be fine.

From stephen at xemacs.org  Fri Sep 28 22:11:33 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 29 Sep 2007 05:11:33 +0900
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <46FC9619.6020400@canterbury.ac.nz>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
	<bbaeab100709271323g4e25c5e8i1d343e402536b6c8@mail.gmail.com>
	<63jqje$2dsube@venus.eclipse.kcom.com>
	<ca471dc20709271559l142d63ci1cf00966e3c70b79@mail.gmail.com>
	<52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com>
	<46FC9619.6020400@canterbury.ac.nz>
Message-ID: <87ejgible2.fsf@uwakimon.sk.tsukuba.ac.jp>

Greg Ewing writes:
 > Gregory P. Smith wrote:
 > > Is IOError the right name to use?  OSError is raised for things that
 > > are not IO such as subprocess, dlopen, system.
 > 
 > The trouble with either of these is that the class
 > of errors we're talking about don't necessarily come
 > directly from the OS or I/O library.

Agree, but I think this is a case where practicality beats purity.

+1 for OSError.


From guido at python.org  Fri Sep 28 22:27:38 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 28 Sep 2007 13:27:38 -0700
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <87ejgible2.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
	<bbaeab100709271323g4e25c5e8i1d343e402536b6c8@mail.gmail.com>
	<63jqje$2dsube@venus.eclipse.kcom.com>
	<ca471dc20709271559l142d63ci1cf00966e3c70b79@mail.gmail.com>
	<52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com>
	<46FC9619.6020400@canterbury.ac.nz>
	<87ejgible2.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <ca471dc20709281327r5b40da9du879fdd235120ae5a@mail.gmail.com>

On 9/28/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Greg Ewing writes:
>  > Gregory P. Smith wrote:
>  > > Is IOError the right name to use?  OSError is raised for things that
>  > > are not IO such as subprocess, dlopen, system.
>  >
>  > The trouble with either of these is that the class
>  > of errors we're talking about don't necessarily come
>  > directly from the OS or I/O library.
>
> Agree, but I think this is a case where practicality beats purity.
>
> +1 for OSError.

The OS is a somewhat troublesome abstraction boundary. I/O is a more
general concept (and PPBP). +1 for IOError.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Fri Sep 28 23:09:54 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 28 Sep 2007 23:09:54 +0200
Subject: [Python-Dev] building with -Wwrite-strings
In-Reply-To: <e7ba66e40709271518v1f10cdd4h8643f74048bf0b82@mail.gmail.com>
References: <e7ba66e40709271518v1f10cdd4h8643f74048bf0b82@mail.gmail.com>
Message-ID: <46FD6DA2.1060107@v.loewis.de>

> I'm trying to build Python (2.6) with the GCC option -Wwrite-strings.
>
> 1 - Is there any interest in this?

It might be nice to have, but will certainly come at a cost. So feel
free to try this out; at the end, we might agree that this change is
too intrusive.

> 2 - What should I do for the very common (taken from int_new):
>    static char *kwlist[] = {"x", "base", 0};

What's wrong with

static const char *kwlist[] = {"x", "base", 0};

Regards,
Martin

From mike.klaas at gmail.com  Fri Sep 28 23:40:08 2007
From: mike.klaas at gmail.com (Mike Klaas)
Date: Fri, 28 Sep 2007 14:40:08 -0700
Subject: [Python-Dev] Adding concat function to itertools
In-Reply-To: <001e01c801f7$56bdfa60$69196b0a@RaymondLaptop1>
References: <46F5AD52.5040407@gmail.com>
	<bbaeab100709281007w5879534dub799a391d85d689e@mail.gmail.com>
	<001e01c801f7$56bdfa60$69196b0a@RaymondLaptop1>
Message-ID: <E49CA551-5E48-41B6-872A-0163882D70EF@gmail.com>

On 28-Sep-07, at 10:45 AM, Raymond Hettinger wrote:

> [Bruce Frederiksen]
>>>  I've added a new function to itertools called 'concat'.  This  
>>> function is
>>> much like chain, but takes all of the iterables as a single  
>>> argument.
>
> Any practical use cases or is this just a theoretical improvement?
>
> For Py2.x, I'm not willing to unnecessarily expand the module.
> However, for Py3k, I'm open to changing the signature for chain().

For me, a fraction of chain() uses are of the * variety:

d = defaultdict(list)
allvals = chain(*d.values())

return chain(*imap(cache.__getitem__, keylist))

Interestingly, they seem to all have something to do with dictionary  
values() that are themselves iterable.
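
A minimal sketch of what such a concat() might look like in pure Python, next to the chain(*...) spelling it would replace. This is not Bruce's patch; the generator below is only an assumption about the intended behaviour.

```python
from collections import defaultdict
from itertools import chain

def concat(iterables):
    # Hypothetical concat(): like chain(), but takes the iterables as a
    # single argument and pulls them from it lazily, one at a time.
    for iterable in iterables:
        for item in iterable:
            yield item

d = defaultdict(list)
d["a"].extend([1, 2])
d["b"].append(3)

# The * form unpacks d.values() into an argument tuple up front.
via_star = sorted(chain(*d.values()))

# The single-argument form never builds that tuple.
via_concat = sorted(concat(d.values()))
```

The practical difference shows up when the outer iterable is itself large or lazy, since chain(*x) has to exhaust x before producing anything. Later Python releases expose the single-argument behaviour as itertools.chain.from_iterable().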

-Mike



From python at rcn.com  Fri Sep 28 23:53:42 2007
From: python at rcn.com (Raymond Hettinger)
Date: Fri, 28 Sep 2007 14:53:42 -0700
Subject: [Python-Dev] Adding concat function to itertools
References: <46F5AD52.5040407@gmail.com>
	<bbaeab100709281007w5879534dub799a391d85d689e@mail.gmail.com>
	<001e01c801f7$56bdfa60$69196b0a@RaymondLaptop1>
	<E49CA551-5E48-41B6-872A-0163882D70EF@gmail.com>
Message-ID: <002301c8021a$0a05d9e0$69196b0a@RaymondLaptop1>

 [Bruce Frederiksen]
>>>>  I've added a new function to itertools called 'concat'.  This  
>>>> function is
>>>> much like chain, but takes all of the iterables as a single  
>>>> argument.

[Raymond]
>> Any practical use cases or is this just a theoretical improvement?
>>
>> For Py2.x, I'm not willing to unnecessarily expand the module.
>> However, for Py3k, I'm open to changing the signature for chain().

[Mike]
> For me, a fraction of chain() uses are of the * variety:
> 
> d = defaultdict(list)
> allvals = chain(*d.values())
> 
> return chain(*imap(cache.__getitem__, keylist))
> 
> Interestingly, they seem to all have something to do with dictionary  
> values() that are themselves iterable.

I see.  These are instances of a recurring general use case of
chain() as a one-level flattener.

Will give consideration to changing the signature of chain() for Py3.0.
Besides the concat() variation using a single iterable input, another
alternative is the min()/max() style signature where one input is
interpreted as iterable and multiple arguments as comprising an
input tuple.
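
To make the min()/max() style alternative concrete, here is a rough sketch under the assumption that a lone argument is always treated as the iterable of iterables. The name flexible_chain is made up; nothing like it exists in itertools.

```python
def flexible_chain(*args):
    # min()/max() style: one argument is read as an iterable of iterables,
    # two or more arguments are read as the iterables themselves.
    iterables = args[0] if len(args) == 1 else args
    for iterable in iterables:
        for item in iterable:
            yield item

several = list(flexible_chain([1, 2], [3]))     # multiple arguments
single = list(flexible_chain([[1, 2], [3]]))    # one iterable of iterables
```

The cost of this signature is the single-argument case: flexible_chain([1, 2]) raises TypeError here because 1 is not iterable, whereas today's chain([1, 2]) simply yields 1 and 2.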


Raymond

From djm at mindrot.org  Sat Sep 29 00:09:32 2007
From: djm at mindrot.org (Damien Miller)
Date: Sat, 29 Sep 2007 08:09:32 +1000 (EST)
Subject: [Python-Dev] Adding concat function to itertools
In-Reply-To: <002301c8021a$0a05d9e0$69196b0a@RaymondLaptop1>
References: <46F5AD52.5040407@gmail.com>
	<bbaeab100709281007w5879534dub799a391d85d689e@mail.gmail.com>
	<001e01c801f7$56bdfa60$69196b0a@RaymondLaptop1>
	<E49CA551-5E48-41B6-872A-0163882D70EF@gmail.com>
	<002301c8021a$0a05d9e0$69196b0a@RaymondLaptop1>
Message-ID: <Pine.BSO.4.64.0709290807260.21668@fuyu.mindrot.org>

On Fri, 28 Sep 2007, Raymond Hettinger wrote:

> > Interestingly, they seem to all have something to do with dictionary  
> > values() that are themselves iterable.
> 
> I see.  These are instances of a recurring general use case of
> chain() as a one-level flattener.
> 
> Will give consideration to changing the signature of chain() for Py3.0.
> Besides the concat() variation using a single iterable input, another
> alternative is the min()/max() style signature where one input is
> interpreted as iterable and multiple arguments as comprising an
> input tuple.

Has anyone considered making the iterator __add__ operator perform
something similar to chain? I.e.

list(a + b) => [a0, a1, ..., an, b0, b1, ..., bn]

(where "a" and "b" are iterables)
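
Since iterators do not share a common base class, one way to experiment with this idea is a small wrapper. The sketch below is only that: the class name Chainable is hypothetical, and it assumes the + operator should behave like itertools.chain().

```python
from itertools import chain

class Chainable:
    # Hypothetical wrapper that gives any iterable a chain-like "+".
    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __add__(self, other):
        # Lazily yield everything from self, then everything from other.
        return Chainable(chain(self, other))

a = Chainable([1, 2, 3])
b = iter("xy")
combined = list(a + b)
```

Because the result is itself a Chainable, (a + b) + c also works. Like chain(), the sum consumes its operands, so a is exhausted once the combined iterator has been run.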

-d

From brett at python.org  Sat Sep 29 02:23:24 2007
From: brett at python.org (Brett Cannon)
Date: Fri, 28 Sep 2007 17:23:24 -0700
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <ca471dc20709281327r5b40da9du879fdd235120ae5a@mail.gmail.com>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<46FB1B7B.1020001@canterbury.ac.nz>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
	<bbaeab100709271323g4e25c5e8i1d343e402536b6c8@mail.gmail.com>
	<63jqje$2dsube@venus.eclipse.kcom.com>
	<ca471dc20709271559l142d63ci1cf00966e3c70b79@mail.gmail.com>
	<52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com>
	<46FC9619.6020400@canterbury.ac.nz>
	<87ejgible2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<ca471dc20709281327r5b40da9du879fdd235120ae5a@mail.gmail.com>
Message-ID: <bbaeab100709281723t19db6addq27e19238727a4ce@mail.gmail.com>

On 9/28/07, Guido van Rossum <guido at python.org> wrote:
> On 9/28/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> > Greg Ewing writes:
> >  > Gregory P. Smith wrote:
> >  > > Is IOError the right name to use?  OSError is raised for things that
> >  > > are not IO such as subprocess, dlopen, system.
> >  >
> >  > The trouble with either of these is that the class
> >  > of errors we're talking about don't necessarily come
> >  > directly from the OS or I/O library.
> >
> > Agree, but I think this is a case where practicality beats purity.
> >
> > +1 for OSError.
>
> The OS is a somewhat troublesome abstraction boundary. I/O is a more
> general concept (and PPBP). +1 for IOError.

What is PPBP?

-Brett

From tjreedy at udel.edu  Sat Sep 29 03:43:27 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 28 Sep 2007 21:43:27 -0400
Subject: [Python-Dev] Adding concat function to itertools
References: <46F5AD52.5040407@gmail.com>
Message-ID: <fdkak0$aeh$1@sea.gmane.org>


"Bruce Frederiksen" <dangyogi at gmail.com> wrote in message 
news:46F5AD52.5040407 at gmail.com...

A 64K attachment.  Please do not do such a worse-than-useless thing again.
Especially when only 1K is original.




From guido at python.org  Sat Sep 29 05:10:52 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 28 Sep 2007 20:10:52 -0700
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To: <bbaeab100709281723t19db6addq27e19238727a4ce@mail.gmail.com>
References: <fb6fbf560709261735n64fe7790y198410e8d318eb92@mail.gmail.com>
	<ca471dc20709270732y5f04dbb2vacb6137698d7262c@mail.gmail.com>
	<bbaeab100709271323g4e25c5e8i1d343e402536b6c8@mail.gmail.com>
	<63jqje$2dsube@venus.eclipse.kcom.com>
	<ca471dc20709271559l142d63ci1cf00966e3c70b79@mail.gmail.com>
	<52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com>
	<46FC9619.6020400@canterbury.ac.nz>
	<87ejgible2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<ca471dc20709281327r5b40da9du879fdd235120ae5a@mail.gmail.com>
	<bbaeab100709281723t19db6addq27e19238727a4ce@mail.gmail.com>
Message-ID: <ca471dc20709282010g4c2842e1h71d5193fd0472af9@mail.gmail.com>

On 9/28/07, Brett Cannon <brett at python.org> wrote:
> On 9/28/07, Guido van Rossum <guido at python.org> wrote:
> > On 9/28/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> > > Greg Ewing writes:
> > >  > Gregory P. Smith wrote:
> > >  > > Is IOError the right name to use?  OSError is raised for things that
> > >  > > are not IO such as subprocess, dlopen, system.
> > >  >
> > >  > The trouble with either of these is that the class
> > >  > of errors we're talking about don't necessarily come
> > >  > directly from the OS or I/O library.
> > >
> > > Agree, but I think this is a case where practicality beats purity.
> > >
> > > +1 for OSError.
> >
> > The OS is a somewhat troublesome abstraction boundary. I/O is a more
> > general concept (and PPBP). +1 for IOError.
>
> What is PPBP?

Typo for PBP : Practicality Beats Purity. :)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nmm1 at cus.cam.ac.uk  Sat Sep 29 11:25:29 2007
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Sat, 29 Sep 2007 10:25:29 +0100
Subject: [Python-Dev] New lines, carriage returns, and Windows
Message-ID: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk>

"Paul Moore" <p.f.moore at gmail.com> wrote:
> 
> OK, so far so good - although I'm not *quite* sure there's a
> self-consistent definition of "code that only uses \n". I'll assume
> you mean code that has a concept of lines, that lines never contain
> anything other than text (specifically, neither \r nor \n can appear in
> a line, I'll punt on whether other weird stuff like form feed is
> legal), and that whenever your code needs to write data to a file, it
> writes lines with \n alone between them.

I won't.  There are a few of us still left who know how this started,
and here is a simplified description.

Unix was a computer scientist's workbench, and made no attempt to be
general.  In particular, its text datastream model was appropriate
for the important devices of the day - teletypes and similar.  So
far, so good.  But what was forgotten later is that the model does
NOT extend to other systems and, in particular, made no sense on the
record-oriented models generally used by mainframes (see Fortran for
an example).

When C was standardised, this was fudged.  I tried to get it improved,
but it is one of the many things I failed to do.  The handling of
ALL of the control characters in text I/O is non-portable (even \t,
despite what the standard says), and you have to follow the system's
constraints if things are to work.  Unfortunately, the kludging that
the compiler does to map C to the operating system confuses things
still further - though it is essential.

Now, BCPL was an ancestor of C, but always was a more portable
language (i.e. it didn't start with a specific operating system in
mind), and used/uses a rather better model.  In this, line separators
are atomic - e.g. '\f' is newline-with-form-feed and '\r' is
"newline-with-overprinting".  Now, THAT model is more generic.
Not fully generic, of course, but it would cater for all of Unix,
CPM and its derivatives (yes, Microsoft), MacOS and most mainframes
(with some reservations).

So, until and unless Python chooses to define its own I/O model,
these problems will continue to arise.  Whether this one is a simple
bug or an avoidable feature, I can't say without looking harder,
but bugs are often caused by attempting to implement impossible
or confusing specifications.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From guido at python.org  Sat Sep 29 17:07:18 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 29 Sep 2007 08:07:18 -0700
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk>
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk>
Message-ID: <ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>

On 9/29/07, Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
> "Paul Moore" <p.f.moore at gmail.com> wrote:
> >
> > OK, so far so good - although I'm not *quite* sure there's a
> > self-consistent definition of "code that only uses \n". I'll assume
> > you mean code that has a concept of lines, that lines never contain
> > anything other than text (specifically, neither \r nor \n can appear in
> > a line, I'll punt on whether other weird stuff like form feed is
> > legal), and that whenever your code needs to write data to a file, it
> > writes lines with \n alone between them.
>
> I won't.  There are a few of us still left who know how this started,
> and here is a simplified description.
>
> Unix was a computer scientist's workbench, and made no attempt to be
> general.  In particular, its text datastream model was appropriate
> for the important devices of the day - teletypes and similar.  So
> far, so good.  But what was forgotten later is that the model does
> NOT extend to other systems and, in particular, made no sense on the
> record-oriented models generally used by mainframes (see Fortran for
> an example).
>
> When C was standardised, this was fudged.  I tried to get it improved,
> but it is one of the many things I failed to do.  The handling of
> ALL of the control characters in text I/O is non-portable (even \t,
> despite what the standard says), and you have to follow the system's
> constraints if things are to work.  Unfortunately, the kludging that
> the compiler does to map C to the operating system confuses things
> still further - though it is essential.
>
> Now, BCPL was an ancestor of C, but always was a more portable
> language (i.e. it didn't start with a specific operating system in
> mind), and used/uses a rather better model.  In this, line separators
> are atomic - e.g. '\f' is newline-with-form-feed and '\r' is
> "newline-with-overprinting".  Now, THAT model is more generic.
> Not fully generic, of course, but it would cater for all of Unix,
> CPM and its derivatives (yes, Microsoft), MacOS and most mainframes
> (with some reservations).
>
> So, until and unless Python chooses to define its own I/O model,
> these problems will continue to arise.  Whether this one is a simple
> bug or an avoidable feature, I can't say without looking harder,
> but bugs are often caused by attempting to implement impossible
> or confusing specifications.

Have you looked at Py3k at all, especially PEP 3116 (new I/O)?

Python *does* have its own I/O model. There are binary files and text
files. For binary files, you write bytes and the semantic model is
that of an array of bytes; byte indices are seek positions.

For text files, the contents are considered to be Unicode, encoded as
bytes in a binary file. So a text file always has an underlying binary
file. Two translations take place, both of which have defaults varying
by platform. One translation is encoding Unicode text into bytes upon
output, and decoding bytes to Unicode text upon input. This can use
any encoding supported by the encodings package.
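
As a small illustration of the first translation, the sketch below layers a text file over an in-memory binary file. It assumes the io module as it exists in Python 3, picks UTF-8 explicitly rather than relying on the platform default, and passes newline="" so that only the encoding step is in play.

```python
import io

# The underlying binary file: an array of bytes.
raw = io.BytesIO()

# The text file on top of it. The encoding is chosen explicitly here.
text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
text.write("na\u00efve\n")       # six Unicode characters
text.flush()
encoded = raw.getvalue()         # what actually landed in the binary file

# Reading goes the other way: bytes are decoded back to Unicode text.
decoded = io.TextIOWrapper(io.BytesIO(encoded),
                           encoding="utf-8", newline="").read()
```

The one non-ASCII character occupies two bytes in the binary file, so the text is six characters long while the underlying file holds seven bytes.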

The other translation deals with line endings. Upon input, any of
\r\n, \r, or \n is translated to a single \n by default (this is the
"universal newlines" algorithm from Python 2.x). This can be tweaked
or disabled. Upon output, \n is translated into a platform specific
string chosen from \r\n, \r, or \n. This can also be disabled or
overridden. Note that \r, when written, is never treated specially; if
you want special processing for \r on output, you can write your own
translation layer.
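
A sketch of the line-ending rules just described, using the newline argument of io.TextIOWrapper as found in Python 3. The helper names read_text and write_text are made up for this example, and the platform-specific defaults are replaced by explicit values so the result is the same everywhere.

```python
import io

def read_text(data, newline=None):
    # Decode bytes through a text layer; newline=None selects the
    # "universal newlines" behaviour on input.
    return io.TextIOWrapper(io.BytesIO(data), encoding="ascii",
                            newline=newline).read()

def write_text(text, newline):
    # Encode text through a text layer and return the bytes written.
    raw = io.BytesIO()
    f = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    f.write(text)
    f.flush()
    return raw.getvalue()

mixed = b"a\r\nb\rc\n"
translated = read_text(mixed)                 # all three endings become \n
untranslated = read_text(mixed, newline="")   # input translation disabled

as_crlf = write_text("x\n", "\r\n")           # \n becomes \r\n on output
as_cr = write_text("x\n", "\r")               # \n becomes \r on output
as_is = write_text("x\n", "")                 # output translation disabled
lone_cr = write_text("x\ry\n", "\r\n")        # \r itself is never touched
```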

That's all. There is nothing unimplementable or confusing in these
specifications.

Python doesn't care about record I/O on legacy OSes; it does care
about variability found in practice between popular OSes.

Note that \r, \n and friends in Python 3000 are either ASCII (in bytes
literals) or Unicode (in text literals). Again, no support for legacy
systems that don't use ASCII or a superset.

Legacy OSes are called that for a reason.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From thomas at python.org  Sat Sep 29 17:26:35 2007
From: thomas at python.org (Thomas Wouters)
Date: Sat, 29 Sep 2007 17:26:35 +0200
Subject: [Python-Dev] Decimal news
In-Reply-To: <fdio9q$9ra$1@sea.gmane.org>
References: <e04bdf310709131219n7dde0ef9q9aa69265b0d3affb@mail.gmail.com>
	<9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com>
	<46FC4D40.4090808@trueblade.com>
	<9e804ac0709271832u306af234i13f656a35672c4ce@mail.gmail.com>
	<fdio9q$9ra$1@sea.gmane.org>
Message-ID: <9e804ac0709290826g69b7aedk31167d9d81cefd1b@mail.gmail.com>

On 9/28/07, Thomas Heller <theller at ctypes.org> wrote:
>
> Thomas Wouters schrieb:
> >> > If you re-eally need to check something into the trunk that re-eally
> >> > must not be merged into py3k, but you're afraid it's not going to be
> >> > obvious to the merger, please record the change as 'merged' using
> >> > "svnmerge merge -M -r<revision>". Please take care when picking the
> >> > revision ;) You can also just email me or someone else you see doing
> >> > merges, as I doubt this will be a common occurrence.
>
> I think that the 'svnmerge block -r<revision>' command should be used.  Or
> not?


If you're comfortable with using svnmerge yourself, sure. If you're worried
that you might mess up the state of the branch, you can leave it up to us
(me.)


-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From fuzzyman at voidspace.org.uk  Sat Sep 29 17:30:26 2007
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 29 Sep 2007 16:30:26 +0100
Subject: [Python-Dev] [python] Re:  New lines, carriage returns,
	and Windows
In-Reply-To: <ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk>
	<ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>
Message-ID: <46FE6F92.40601@voidspace.org.uk>

Guido van Rossum wrote:
> [snip..]
> Python *does* have its own I/O model. There are binary files and text
> files. For binary files, you write bytes and the semantic model is
> that of an array of bytes; byte indices are seek positions.
>
> For text files, the contents are considered to be Unicode, encoded as
> bytes in a binary file. So a text file always has an underlying binary
> file. Two translations take place, both of which have defaults varying
> by platform. One translation is encoding Unicode text into bytes upon
> output, and decoding bytes to Unicode text upon input. This can use
> any encoding supported by the encodings package.
>
> The other translation deals with line endings. Upon input, any of
> \r\n, \r, or \n is translated to a single \n by default (this is the
> "universal newlines" algorithm from Python 2.x). This can be tweaked
> or disabled. Upon output, \n is translated into a platform specific
> string chosen from \r\n, \r, or \n. This can also be disabled or
> overridden. Note that \r, when written, is never treated specially; if
> you want special processing for \r on output, you can write your own
> translation layer.
>   
So the question is, that when a string containing '\r\n' is written to a 
file in text mode on a Windows platform, should it be written with the 
encoded representation of '\r\n' or '\r\r\n'?

Purity would dictate the latter and practicality the former (IMO)...

However, that would mean that round tripping a string would change it 
('\r\n' would be written as '\r\n' and then read as '\n') - on the other 
hand (particularly given that we are treating the data as text and not a 
binary blob) I don't see how writing '\r\r\n' would ever actually be 
useful in text.

+1 on just writing '\r\n' from me.

Michael Foord
http://www.manning.com/foord


> That's all. There is nothing unimplementable or confusing in these
> specifications.
>
> Python doesn't care about record I/O on legacy OSes; it does care
> about variability found in practice between popular OSes.
>
> Note that \r, \n and friends in Python 3000 are either ASCII (in bytes
> literals) or Unicode (in text literals). Again, no support for legacy
> systems that don't use ASCII or a superset.
>
> Legacy OSes are called that for a reason.
>
>   


From python at rcn.com  Sat Sep 29 18:46:58 2007
From: python at rcn.com (Raymond Hettinger)
Date: Sat, 29 Sep 2007 09:46:58 -0700
Subject: [Python-Dev] Decimal news
In-Reply-To: <9e804ac0709290826g69b7aedk31167d9d81cefd1b@mail.gmail.com>
References: <e04bdf310709131219n7dde0ef9q9aa69265b0d3affb@mail.gmail.com>
	<9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com>
	<46FC4D40.4090808@trueblade.com>
	<9e804ac0709271832u306af234i13f656a35672c4ce@mail.gmail.com>
	<fdio9q$9ra$1@sea.gmane.org>
	<9e804ac0709290826g69b7aedk31167d9d81cefd1b@mail.gmail.com>
Message-ID: <14790F8C-DCFB-43E7-BA28-1AB3EF80EEFC@rcn.com>

If the differences are few, I prefer that you insert some conditionals  
that attach different functions based on the version number. That way  
we can keep a single version of the source that works on all of the  
pythons.
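
The pattern I have in mind looks roughly like the sketch below. The helper is_text is purely illustrative and is not taken from the decimal module; the point is only that the version check runs once, at import time, and the rest of the source stays identical.

```python
import sys

# Pick the implementation once, based on the running interpreter.
if sys.version_info[0] >= 3:
    def is_text(obj):
        # Python 3: str is the only text type.
        return isinstance(obj, str)
else:
    def is_text(obj):
        # Python 2: both str and unicode derive from basestring.
        return isinstance(obj, basestring)  # noqa: name exists only on 2.x

# Callers use is_text() without caring which branch defined it.
checks = [is_text("abc"), is_text(42)]
```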

Raymond

On Sep 29, 2007, at 8:26 AM, "Thomas Wouters" <thomas at python.org> wrote:

>
>
> On 9/28/07, Thomas Heller <theller at ctypes.org> wrote:
> Thomas Wouters schrieb:
> >> > If you re-eally need to check something into the trunk that re-eally
> >> > must not be merged into py3k, but you're afraid it's not going to be
> >> > obvious to the merger, please record the change as 'merged' using
> >> > "svnmerge merge -M -r<revision>". Please take care when picking the
> >> > revision ;) You can also just email me or someone else you see doing
> >> > merges, as I doubt this will be a common occurrence.
>
> I think that the 'svnmerge block -r<revision>' command should be  
> used.  Or not?
>
> If you're comfortable with using svnmerge yourself, sure. If you're  
> worried that you might mess up the state of the branch, you can  
> leave it up to us (me.)
>
>
> -- 
> Thomas Wouters <thomas at python.org>
>
> Hi! I'm a .signature virus! copy me into your .signature file to  
> help me spread!

From tjreedy at udel.edu  Sat Sep 29 20:30:59 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 29 Sep 2007 14:30:59 -0400
Subject: [Python-Dev] [python] Re:  New lines, carriage returns,
	and Windows
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk><ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>
	<46FE6F92.40601@voidspace.org.uk>
Message-ID: <fdm5l4$rok$1@sea.gmane.org>


"Michael Foord" <fuzzyman at voidspace.org.uk> wrote in message 
news:46FE6F92.40601 at voidspace.org.uk...
| Guido van Rossum wrote:

[snip first part of nice summary of Python i/o model]

| > The other translation deals with line endings. Upon input, any of
| > \r\n, \r, or \n is translated to a single \n by default (this is the
| > "universal newlines" algorithm from Python 2.x). This can be tweaked
| > or disabled. Upon output, \n is translated into a platform specific
| > string chosen from \r\n, \r, or \n. This can also be disabled or
| > overridden. Note that \r, when written, is never treated specially; if
| > you want special processing for \r on output, you can write your own
| > translation layer.

| So the question is, that when a string containing '\r\n' is written to a
| file in text mode on a Windows platform, should it be written with the
| encoded representation of '\r\n' or '\r\r\n'?

I think Guido pretty clearly said that on output, the default behavior is 
that \r is nothing special.  If you want a special case exception, write a 
special case translator. +1 from me.

To propose otherwise is to propose that the default semantic meaning of 
Python text objects depend on the platform that it might be 
output-translated for.  I believe the point of universal newline support 
was to get away from this.

| Purity would dictate the latter and practicality the former (IMO)...

I disagree.  Special case exceptions complicate both learnability and code 
readability and maintainability.  Simplicity is practicality.  The symmetry 
of 'platform-line-endings =input> \n =output> platform-line-endings' is both
pure and practical.

| However, that would mean that round tripping a string would change it
| ('\r\n' would be written as '\r\n' and then read as '\n')

Whereas \r\r\n would be read back as \r\n, which is what should happen. 
Round-trip-ability is practical to me.

| - on the other
| hand (particularly given that we are treating the data as text and not a
| binary blob) I don't see how writing '\r\r\n' would ever actually be
| useful in text.

There are two normal ways for internal Python text to have \r\n:
1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the 
same platform).
2. Intentionally put there by a programmer.  If s/he also chooses default \n
translation on output, \r<translation of \n> is correct.

That leaves
1. Bugs due to ignorance or accident.  These should be repaired.
2. Other special situations, which can be handled by disabling, overriding, 
and layering the defaults.  This seems enough flexibility to me.
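
As an illustration of the layering option, a special-case translator can be a very thin wrapper around an ordinary text file. The class below is a hypothetical sketch, not anything in the standard library; it assumes Python 3's io module and uses newline="\r\n" to imitate the Windows default.

```python
import io

class CRLFCollapsingWriter:
    # Hypothetical translation layer: collapse "\r\n" to "\n" before the
    # text file applies its own "\n" translation on output.
    def __init__(self, stream):
        self._stream = stream

    def write(self, text):
        # Limitation of this sketch: a "\r\n" pair split across two
        # write() calls is not collapsed.
        return self._stream.write(text.replace("\r\n", "\n"))

    def flush(self):
        self._stream.flush()

raw = io.BytesIO()
layered = CRLFCollapsingWriter(
    io.TextIOWrapper(raw, encoding="ascii", newline="\r\n"))
layered.write("already has CRLF\r\n")
layered.write("plain newline\n")
layered.flush()
result = raw.getvalue()
```

Both lines end up with a single \r\n in the output, while the default text file underneath keeps its simple rule that \r is nothing special.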

Terry Jan Reedy





From fuzzyman at voidspace.org.uk  Sat Sep 29 20:35:53 2007
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 29 Sep 2007 19:35:53 +0100
Subject: [Python-Dev] [python] Re:  New lines, carriage returns,
	and Windows
In-Reply-To: <fdm5l4$rok$1@sea.gmane.org>
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk><ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>	<46FE6F92.40601@voidspace.org.uk>
	<fdm5l4$rok$1@sea.gmane.org>
Message-ID: <46FE9B09.8000800@voidspace.org.uk>

Terry Reedy wrote:
> "Michael Foord" <fuzzyman at voidspace.org.uk> wrote in message 
> news:46FE6F92.40601 at voidspace.org.uk...
> | Guido van Rossum wrote:
>
> [snip first part of nice summary of Python i/o model]
>
> | > The other translation deals with line endings. Upon input, any of
> | > \r\n, \r, or \n is translated to a single \n by default (this is the
> | > "universal newlines" algorithm from Python 2.x). This can be tweaked
> | > or disabled. Upon output, \n is translated into a platform specific
> | > string chosen from \r\n, \r, or \n. This can also be disabled or
> | > overridden. Note that \r, when written, is never treated specially; if
> | > you want special processing for \r on output, you can write your own
> | > translation layer.
>
> | So the question is, that when a string containing '\r\n' is written to a
> | file in text mode on a Windows platform, should it be written with the
> | encoded representation of '\r\n' or '\r\r\n'?
>
> I think Guido pretty clearly said that on output, the default behavior is 
> that \r is nothing special.  If you want a special case exception, write a 
> special case translator. +1 from me.
>
> To propose otherwise is to propose that the default semantic meaning of 
> Python text objects depend on the platform that it might be 
> output-translated for.  I believe the point of universal newline support 
> was to get away from this.
>
> | Purity would dictate the latter and practicality the former (IMO)...
>
> I disagree.  Special case exceptions complicate both learnability and code 
> readability and maintainability.  Simplicity is practicality.  The symmetry 
> of 'platform-line-endings =input> \n =output> platform-line-endings' is both 
> pure and practical.
>
> | However, that would mean that round tripping a string would change it
> | ('\r\n' would be written as '\r\n' and then read as '\n')
>
> Whereas \r\r\n would be read back as \r\n, which is what should happen. 
> Round-trip-ability is practical to me.
>
> | - on the other
> | hand (particularly given that we are treating the data as text and not a
> | binary blob) I don't see how writing '\r\r\n' would ever actually be
> | useful in text.
>
> There are two normal ways for internal Python text to have \r\n:
> 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the 
> same platform).
> 2. Intentionally put there by a programmer.  If s/he also chooses default \n 
> translation on output, \r<translation of \n> is correct.
>   
Actually, I usually get these strings from Windows UI components. A file 
containing '\r\n' is read in with '\r\n' being translated to '\n'. New 
user input is added containing '\r\n' line endings. The file is written 
out and now contains a mix of '\r\n' and '\r\r\n'.
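
The effect can be reproduced on any platform. The following is a minimal
sketch, assuming Python 3's io module; the helper name write_as_windows_text
is made up for illustration, and newline='\r\n' is used to force the
Windows-style output translation rather than relying on the platform default.

```python
import io

def write_as_windows_text(text):
    # Text layer over an in-memory binary file; every '\n' written is
    # translated to '\r\n', as the default text mode does on Windows.
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # leave the BytesIO alone when the wrapper goes away
    return data

from_file = "line one\n"    # was '\r\n' on disk, translated to '\n' on input
from_ui = "line two\r\n"    # came from a UI component with '\r\n' endings

print(write_as_windows_text(from_file + from_ui))
# b'line one\r\nline two\r\r\n'
```

The '\r' that was already in the UI string is passed through untouched, and
only the '\n' after it is expanded, which is where the '\r\r\n' comes from.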

Michael



From steven.bethard at gmail.com  Sat Sep 29 20:42:38 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 29 Sep 2007 12:42:38 -0600
Subject: [Python-Dev] [python] Re: New lines, carriage returns,
	and Windows
In-Reply-To: <46FE9B09.8000800@voidspace.org.uk>
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk>
	<ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>
	<46FE6F92.40601@voidspace.org.uk> <fdm5l4$rok$1@sea.gmane.org>
	<46FE9B09.8000800@voidspace.org.uk>
Message-ID: <d11dcfba0709291142u1fb3aa68vf81d1125e95e6a90@mail.gmail.com>

On 9/29/07, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> Terry Reedy wrote:
> > There are two normal ways for internal Python text to have \r\n:
> > 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
> > same platform).
> > 2. Intentionally put there by a programmer.  If s/he also chooses default \n
> > translation on output, \r<translation of \n> is correct.
> >
> Actually, I usually get these strings from Windows UI components. A file
> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
> user input is added containing '\r\n' line endings. The file is written
> out and now contains a mix of '\r\n' and '\r\r\n'.

Out of curiosity, why don't the Python wrappers for your Windows UI
components do the appropriate '\r\n' -> '\n' conversions?

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From fuzzyman at voidspace.org.uk  Sat Sep 29 20:47:20 2007
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 29 Sep 2007 19:47:20 +0100
Subject: [Python-Dev] [python] Re: New lines, carriage returns,
	and Windows
In-Reply-To: <d11dcfba0709291142u1fb3aa68vf81d1125e95e6a90@mail.gmail.com>
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk>	
	<ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>	
	<46FE6F92.40601@voidspace.org.uk> <fdm5l4$rok$1@sea.gmane.org>	
	<46FE9B09.8000800@voidspace.org.uk>
	<d11dcfba0709291142u1fb3aa68vf81d1125e95e6a90@mail.gmail.com>
Message-ID: <46FE9DB8.9000604@voidspace.org.uk>

Steven Bethard wrote:
> On 9/29/07, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
>   
>> Terry Reedy wrote:
>>     
>>> There are two normal ways for internal Python text to have \r\n:
>>> 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
>>> same platform).
>>> 2. Intentionally put there by a programmer.  If s/he also chooses default \n
>>> translation on output, \r<translation of \n> is correct.
>>>
>>>       
>> Actually, I usually get these strings from Windows UI components. A file
>> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
>> user input is added containing '\r\n' line endings. The file is written
>> out and now contains a mix of '\r\n' and '\r\r\n'.
>>     
>
> Out of curiosity, why don't the Python wrappers for your Windows UI
> components do the appropriate '\r\n' -> '\n' conversions?
>   

One of the great things about IronPython is that you don't *need* any 
wrappers - you access .NET objects natively (which in fact wrap the 
lower level win32 API) - and the .NET APIs are usually not as bad as you 
probably assume. ;-)

You just have to be aware that line endings are '\r\n'. I'm not sure how 
or if pywin32 handles this.

Michael

> STeVe
>   


From nmm1 at cus.cam.ac.uk  Sat Sep 29 20:48:20 2007
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Sat, 29 Sep 2007 19:48:20 +0100
Subject: [Python-Dev] New lines, carriage returns, and Windows
Message-ID: <E1IbhMS-0003Ht-Ox@draco.cus.cam.ac.uk>

"Guido van Rossum" <guido at python.org> wrote:
> 
> Have you looked at Py3k at all, especially PEP 3116 (new I/O)?

No.

> Python *does* have its own I/O model. There are binary files and text
> files. For binary files, you write bytes and the semantic model is
> that of an array of bytes; byte indices are seek positions.

That is the same model as C and Unix.  It is text files that we are
discussing.

> For text files, the contents is considered to be Unicode, encoded as
> bytes in a binary file. So a text file always has an underlying binary
> file. Two translations take place, both of which have defaults varying
> by platform. One translation is encoding Unicode text into bytes upon
> output, and decoding bytes to Unicode text upon input. This can use
> any encoding supported by the encodings package.

The character code isn't the issue here, and is almost completely
irrelevant.

> The other translation deals with line endings. Upon input, any of
> \r\n, \r, or \n is translated to a single \n by default (this is the
> "universal newlines" algorithm from Python 2.x). This can be tweaked
> or disabled. Upon output, \n is translated into a platform specific
> string chosen from \r\n, \r, or \n. This can also be disabled or
> overridden. Note that \r, when written, is never treated specially; if
> you want special processing for \r on output, you can write your own
> translation layer.

Grrk.  That's the problem.  You don't get back what you have written,
for a start, which isn't nice.  There are other issues, too.

> That's all. There is nothing unimplementable or confusing in these
> specifications.

Nothing unimplementable, I agree.  Nothing confusing?  Not in the
experience of the users I have dealt with.

> Python doesn't care about record I/O on legacy OSes; it does care
> about variability found in practice between popular OSes.

As a short-term solution, that is fine.  But I have seen the wheel
turn a couple of times in 40 years, and expect it to continue after
I am safely 6' under ....

> Note that \r, \n and friends in Python 3000 are either ASCII (in bytes
> literals) or Unicode (in text literals). Again, no support for legacy
> systems that don't use ASCII or a superset.

That's not a problem.  I don't see that changing in the foreseeable
future.

> Legacy OSes are called that for a reason.

Well, I remember when the text I/O model that C, Unix and Python
use WAS a feature of legacy OSs :-)

Seriously.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From steven.bethard at gmail.com  Sat Sep 29 20:59:28 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 29 Sep 2007 12:59:28 -0600
Subject: [Python-Dev] [python] Re: New lines, carriage returns,
	and Windows
In-Reply-To: <46FE9DB8.9000604@voidspace.org.uk>
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk>
	<ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>
	<46FE6F92.40601@voidspace.org.uk> <fdm5l4$rok$1@sea.gmane.org>
	<46FE9B09.8000800@voidspace.org.uk>
	<d11dcfba0709291142u1fb3aa68vf81d1125e95e6a90@mail.gmail.com>
	<46FE9DB8.9000604@voidspace.org.uk>
Message-ID: <d11dcfba0709291159s570d8899gf0745237948613e0@mail.gmail.com>

On 9/29/07, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> Steven Bethard wrote:
> > On 9/29/07, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> >
> >> Terry Reedy wrote:
> >>
> >>> There are two normal ways for internal Python text to have \r\n:
> >>> 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
> >>> same platform).
> > >>> 2. Intentionally put there by a programmer.  If s/he also chooses default \n
> >>> translation on output, \r<translation of \n> is correct.
> >>>
> >>>
> >> Actually, I usually get these strings from Windows UI components. A file
> >> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
> >> user input is added containing '\r\n' line endings. The file is written
> >> out and now contains a mix of '\r\n' and '\r\r\n'.
> >
> > Out of curiosity, why don't the Python wrappers for your Windows UI
> > components do the appropriate '\r\n' -> '\n' conversions?
>
> One of the great things about IronPython is that you don't *need* any
> wrappers - you access .NET objects natively (which in fact wrap the
> lower level win32 API) - and the .NET APIs are usually not as bad as you
> probably assume. ;-)
>
> You just have to be aware that line endings are '\r\n'.

Ahh, I see.  So all the .NET components function like Python 3.0's
io.open(..., newline='\n'), where no translation of \n (to or from
\r\n) is performed.
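
For comparison, here is a rough sketch of how the newline argument changes
what is seen, assuming Python 3's io module. The helpers read_text and
write_text are hypothetical names, and in-memory buffers stand in for real
files.

```python
import io

def read_text(data, newline):
    # Decode bytes through a text layer with the given newline setting.
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=newline)
    return wrapper.read()

def write_text(text, newline):
    # Encode text through a text layer with the given newline setting.
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()
    return data

data = b"one\r\ntwo\n"
universal = read_text(data, None)            # '\r\n' is folded to '\n'
untouched = read_text(data, "\n")            # '\r\n' is left as it is
written = write_text("one\r\ntwo\n", "\n")   # nothing is translated on output
```

With newline='\n' the text layer only decodes and encodes; the '\r'
characters come through in both directions.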

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From fuzzyman at voidspace.org.uk  Sat Sep 29 21:19:24 2007
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 29 Sep 2007 20:19:24 +0100
Subject: [Python-Dev] [python] Re: New lines, carriage returns,
	and Windows
In-Reply-To: <d11dcfba0709291159s570d8899gf0745237948613e0@mail.gmail.com>
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk>	
	<ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>	
	<46FE6F92.40601@voidspace.org.uk> <fdm5l4$rok$1@sea.gmane.org>	
	<46FE9B09.8000800@voidspace.org.uk>	
	<d11dcfba0709291142u1fb3aa68vf81d1125e95e6a90@mail.gmail.com>	
	<46FE9DB8.9000604@voidspace.org.uk>
	<d11dcfba0709291159s570d8899gf0745237948613e0@mail.gmail.com>
Message-ID: <46FEA53C.4070406@voidspace.org.uk>

Steven Bethard wrote:
> On 9/29/07, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
>   
>> Steven Bethard wrote:
>>     
>>> On 9/29/07, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
>>>
>>>       
>>>> Terry Reedy wrote:
>>>>
>>>>         
>>>>> There are two normal ways for internal Python text to have \r\n:
>>>>> 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
>>>>> same platform).
>>>>> 2. Intentionally put there by a programmer.  If s/he also chooses default \n
>>>>> translation on output, \r<translation of \n> is correct.
>>>>>
>>>>>
>>>>>           
>>>> Actually, I usually get these strings from Windows UI components. A file
>>>> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
>>>> user input is added containing '\r\n' line endings. The file is written
>>>> out and now contains a mix of '\r\n' and '\r\r\n'.
>>>>         
>>> Out of curiosity, why don't the Python wrappers for your Windows UI
>>> components do the appropriate '\r\n' -> '\n' conversions?
>>>       
>> One of the great things about IronPython is that you don't *need* any
>> wrappers - you access .NET objects natively (which in fact wrap the
>> lower level win32 API) - and the .NET APIs are usually not as bad as you
>> probably assume. ;-)
>>
>> You just have to be aware that line endings are '\r\n'.
>>     
>
> Ahh, I see.  So all the .NET components function like Python 3.0's
> io.open(..., newline='\n'), where no translation of \n (to or from
> \r\n) is performed.
>   

Effectively yes. Although for Python compatibility, opening a file in 
text mode using the python 'open' or 'file' will behave in the usual way.

Michael

> STeVe
>   


From p.f.moore at gmail.com  Sat Sep 29 21:47:35 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Sat, 29 Sep 2007 20:47:35 +0100
Subject: [Python-Dev] [python] Re: New lines, carriage returns,
	and Windows
In-Reply-To: <46FEA53C.4070406@voidspace.org.uk>
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk>
	<ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>
	<46FE6F92.40601@voidspace.org.uk> <fdm5l4$rok$1@sea.gmane.org>
	<46FE9B09.8000800@voidspace.org.uk>
	<d11dcfba0709291142u1fb3aa68vf81d1125e95e6a90@mail.gmail.com>
	<46FE9DB8.9000604@voidspace.org.uk>
	<d11dcfba0709291159s570d8899gf0745237948613e0@mail.gmail.com>
	<46FEA53C.4070406@voidspace.org.uk>
Message-ID: <79990c6b0709291247i45cc37cdl9a0b56b29053bbe6@mail.gmail.com>

>>> Actually, I usually get these strings from Windows UI components. A file
>>> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
>>> user input is added containing '\r\n' line endings. The file is written
>>> out and now contains a mix of '\r\n' and '\r\r\n'.
>>>
>> Out of curiosity, why don't the Python wrappers for your Windows UI
>> components do the appropriate '\r\n' -> '\n' conversions?
>>
> One of the great things about IronPython is that you don't *need* any
> wrappers - you access .NET objects natively (which in fact wrap the
> lower level win32 API) - and the .NET APIs are usually not as bad as you
> probably assume. ;-)

Given the current lengthy discussion about newline translation, maybe
it isn't such a great thing :-)

Seriously, you do need a wrapper in this particular case - to convert
the .NET line ending convention to Python's. The issue here is that
such a wrapper is so trivial, that it's usually easier to simply do
the translation with ad hoc .replace('\r\n', '\n') calls. The problem
comes when you accidentally forget a translation - then you get the
clash between the .NET (\r\n) and Python (\n) models. But of course,
the solution in that case is to simply add the omitted translation,
not to change Python's IO model.

Of course, all this grand theory is just that - theory. In my case, it
helped me understand what's going on, but that's all. For real life
code, you just add the appropriate replace() calls. Whether theory
helps you keep track of where replace() is needed, or whether you just
know, doesn't really matter much.
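
Such a wrapper could be as small as the sketch below. The function names
from_dotnet and to_dotnet are invented for illustration and are not part of
IronPython or any library.

```python
def from_dotnet(text):
    # Text arriving from a .NET component uses '\r\n'; Python code expects '\n'.
    return text.replace("\r\n", "\n")

def to_dotnet(text):
    # Normalise first, so a string that already has '\r\n' is not
    # turned into '\r\r\n' by the second replace.
    return text.replace("\r\n", "\n").replace("\n", "\r\n")

ui_text = "first\r\nsecond\r\n"
internal = from_dotnet(ui_text)     # 'first\nsecond\n'
round_trip = to_dotnet(internal)    # 'first\r\nsecond\r\n'
```

Normalising inside to_dotnet is a design choice: it makes the call safe to
apply twice, at the cost of not preserving a deliberate lone '\r\n'.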

But regardless - the Python IO model doesn't need changing. (Not even
2.x, and the py3k model is even better in this regard).

Paul.

From tjreedy at udel.edu  Sat Sep 29 23:53:49 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 29 Sep 2007 17:53:49 -0400
Subject: [Python-Dev] [python] Re:  New lines, carriage returns,
	and Windows
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk><ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>	<46FE6F92.40601@voidspace.org.uk><fdm5l4$rok$1@sea.gmane.org>
	<46FE9B09.8000800@voidspace.org.uk>
Message-ID: <fdmhhe$pt7$1@sea.gmane.org>


"Michael Foord" <fuzzyman at voidspace.org.uk> wrote in message 
news:46FE9B09.8000800 at voidspace.org.uk...
| Terry Reedy wrote:
| > There are two normal ways for internal Python text to have \r\n:
| > 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
| > same platform).
| > 2. Intentionally put there by a programmer.  If s/he also chooses default 
\n
| > translation on output, \r<translation of \n> is correct.
| >
| Actually, I usually get these strings from Windows UI components. A file
| containing '\r\n' is read in with '\r\n' being translated to '\n'. New
| user input is added containing '\r\n' line endings. The file is written
| out and now contains a mix of '\r\n' and '\r\r\n'.

I covered this in the part you snipped:

"2. Other special situations, which can be handled by disabling, 
overriding,
and layering the defaults.  This seems enough flexibility to me."

While mixing input like this may seem 'normal' to you, I believe it is 
'special'
considering the total Python community.  I can think of at least 4 decent 
solutions, depending on the details of the input and what you do with it.

tjr




From greg.ewing at canterbury.ac.nz  Sun Sep 30 01:46:19 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 30 Sep 2007 11:46:19 +1200
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk>
	<ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>
Message-ID: <46FEE3CB.1010002@canterbury.ac.nz>

On 9/29/07, Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:

> Now, BCPL was an ancestor of C, but always was a more portable
> language (i.e. it didn't start with a specific operating system in
> mind), and used/uses a rather better model.  In this, line separators
> are atomic - e.g. '\f' is newline-with-form-feed and '\r' is
> "newline-with-overprinting".

I don't see how this is different from Unix/C "\n" being
an atomic newline character.

If you're saying that BCPL is better because it defines
standard semantics for more control characters than just
"\n", that may be true, but C is doing about the best it
can with "\n" as far as I can see, given all the crazy
things that different OSes want to do with line endings.

In any case, the problem which started all this isn't
really an I/O problem at all, it's a mismatch between
the world of Python strings which use "\n" and .NET
library code expecting strings which use "\r\n".

The correct thing to do with that is to translate whenever
a string crosses a boundary between Python code and
.NET code. This is something that ought to be done
automatically by the Python/.NET interfacing machinery,
maybe by having a different type for .NET strings.

--
Greg

From greg.ewing at canterbury.ac.nz  Sun Sep 30 02:22:19 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 30 Sep 2007 12:22:19 +1200
Subject: [Python-Dev] [python] Re: New lines, carriage returns,
	and Windows
In-Reply-To: <46FE9DB8.9000604@voidspace.org.uk>
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk>
	<ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>
	<46FE6F92.40601@voidspace.org.uk> <fdm5l4$rok$1@sea.gmane.org>
	<46FE9B09.8000800@voidspace.org.uk>
	<d11dcfba0709291142u1fb3aa68vf81d1125e95e6a90@mail.gmail.com>
	<46FE9DB8.9000604@voidspace.org.uk>
Message-ID: <46FEEC3B.90006@canterbury.ac.nz>

Michael Foord wrote:
> One of the great things about IronPython is that you don't *need* any 
> wrappers - you access .NET objects natively

But it seems that you really *do* need wrappers to
deal with the line endings problem, whether they're
provided automatically or you do it yourself manually.

This is reminiscent of the C-string vs. Pascal-string
fiasco when Apple switched from Pascal to C as their
main application programming language. Some development
environments provided glue code that did the translation
automatically; others required you to do it yourself,
which was a huge nuisance.

--
Greg

From greg.ewing at canterbury.ac.nz  Sun Sep 30 02:30:45 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 30 Sep 2007 12:30:45 +1200
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <E1IbhMS-0003Ht-Ox@draco.cus.cam.ac.uk>
References: <E1IbhMS-0003Ht-Ox@draco.cus.cam.ac.uk>
Message-ID: <46FEEE35.7060007@canterbury.ac.nz>

Nick Maclaren wrote:
> Grrk.  That's the problem.  You don't get back what you have written

You do as long as you *don't* use universal newlines mode
for reading. This is the best that can be done, because
universal newlines are inherently ambiguous.

If you want universal newlines, you just have to accept
that you can't also have \r characters meaning something
other than newlines in your files. This is true regardless
of what programming language or I/O model is being used.
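
A small sketch of the ambiguity, assuming Python 3's io module (the helper
names are made up): two different byte streams become indistinguishable once
read with universal newlines, while newline='' keeps them apart.

```python
import io

def read_universal(data):
    # newline=None: any of '\r\n', '\r' or '\n' is returned as '\n'.
    return io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=None).read()

def read_untranslated(data):
    # newline='': line endings are recognised but returned unchanged.
    return io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline="").read()

overprint = b"abc\r___\n"   # '\r' meant as "return to column 0 and overprint"
two_lines = b"abc\n___\n"   # two ordinary lines

same = read_universal(overprint) == read_universal(two_lines)
kept = read_untranslated(overprint)
```

Here same is True, so the intent behind the lone '\r' cannot be recovered
from the translated text; kept still contains it.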

--
Greg

From nmm1 at cus.cam.ac.uk  Sun Sep 30 11:34:58 2007
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Sun, 30 Sep 2007 10:34:58 +0100
Subject: [Python-Dev] [python] Re: New lines, carriage returns,
	and Windows
Message-ID: <E1IbvCU-0000Jv-Pb@draco.cus.cam.ac.uk>

Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
> > Grrk.  That's the problem.  You don't get back what you have written
> 
> You do as long as you *don't* use universal newlines mode
> for reading. This is the best that can be done, because
> universal newlines are inherently ambiguous.

I don't know PRECISELY what you mean by "universal newlines mode",
and this issue is all about the details, so any response would
merely enhance the confusion.

> If you want universal newlines, you just have to accept
> that you can't also have \r characters meaning something
> other than newlines in your files. This is true regardless
> of what programming language or I/O model is being used.

No, that is not true, and I have used more than one model where
it wasn't.  Let's stick to models where newlines are special
characters - I prefer the ones where they are not, but that is
by the way.

Model 1:  certain characters can be used only in combination.
E.g. \f must occur immediately before (or after) a \n, which
it modifies.  \r is either a newline-with-overprint or must be
associated with a \n.  In both cases, only ONE of the alternatives
is permitted in the chosen model - the other use then becomes an
error (and raises an exception).

Model 2: (BCPL) there are a variety of newline characters, \n for
plain newline, \f for newline-with-form-feed and \r for newline-
with-overprint.  ALL cause a newline, with the associated property.
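
As a rough illustration only, Model 2 as seen by the program might be
sketched like this in Python. The names KINDS and split_records are
hypothetical, and this is not how BCPL itself implements it.

```python
import re

# Assumed labels for the three line terminators described above.
KINDS = {"\n": "plain", "\f": "form-feed", "\r": "overprint"}

def split_records(text):
    # Every one of '\n', '\f' and '\r' ends a line; the terminator is
    # kept as a property of that line rather than as loose data.
    # Unterminated trailing text is ignored in this sketch.
    records = []
    for match in re.finditer(r"([^\n\f\r]*)([\n\f\r])", text):
        records.append((match.group(1), KINDS[match.group(2)]))
    return records

records = split_records("title\nbold\rbold\fnext page\n")
```

Each record is a (line, kind) pair, so no control character is ever left
free-standing inside a line.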

Note that the above is what the program sees - what is written
to the outside world and how input is read is another matter.

But I can assure you, from my own and many other people's experience,
that neither of the above models causes the confusion being shown by
the postings in this thread.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From nmm1 at cus.cam.ac.uk  Sun Sep 30 11:49:56 2007
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Sun, 30 Sep 2007 10:49:56 +0100
Subject: [Python-Dev] New lines, carriage returns, and Windows
Message-ID: <E1IbvQy-0000RO-Cj@draco.cus.cam.ac.uk>

Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
> I don't see how this is different from Unix/C "\n" being
> an atomic newline character.

Have you used systems with the I/O models I referred to (or ones
with newlines being out-of-bound data)?

> If you're saying that BCPL is better because it defines
> standard semantics for more control characters than just
> "\n", that may be true, but C is doing about the best it
> can with "\n" as far as I can see, given all the crazy
> things that different OSes want to do with line endings.

I am afraid that you are wrong - see my other posting for how
to do it better.  Look, I have implemented both of those two models
on systems that are FAR more different than most people can imagine.
Both work, and neither causes confusion.  The C/Unix/Python one does.

> In any case, the problem which started all this isn't
> really an I/O problem at all, it's a mismatch between
> the world of Python strings which use "\n" and .NET
> library code expecting strings which use "\r\n".

That's an I/O problem :-)

> The correct thing to do with that is to translate whenever
> a string crosses a boundary between Python code and
> .NET code. This is something that ought to be done
> automatically by the Python/.NET interfacing machinery,
> maybe by having a different type for .NET strings.

Agreed.  But the REASON it causes trouble is the inconsistency
in the basic C/Unix/Python text I/O model.  Let's consider just
\f, \r and \n, and a few questions:

    Exactly what does a free-standing \f mean?

    Does \n\f\n mean starting at the top of a page or one line down?

    How do \r and \f interact with line-buffering?  Think about
MacOS here.

I could go on, but those are enough to indicate that the problem
is insoluble.  The answer "Undefined but not even explicitly
discouraged" is a recipe for confusion.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From skip at pobox.com  Sun Sep 30 15:28:28 2007
From: skip at pobox.com (skip at pobox.com)
Date: Sun, 30 Sep 2007 08:28:28 -0500
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <ca471dc20709271035h29948358xe6bba5466edb1963@mail.gmail.com>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<46FB0F9D.6010303@canterbury.ac.nz>
	<18171.6511.526695.684154@montanaro.dyndns.org>
	<ca471dc20709271035h29948358xe6bba5466edb1963@mail.gmail.com>
Message-ID: <18175.42108.40732.660470@montanaro.dyndns.org>


    Greg> Maybe there should be a universal newlines mode defined for output
    Greg> as well as input, which translates any of "\r", "\n" or "\r\n"
    Greg> into the platform line ending.

    Skip> I'd be open to such a change.  Principle of least surprise?

    Guido> The symmetry isn't as strong as you suggest, but I agree it would
    Guido> be a useful feature. Would you mind filing a Py3k feature request
    Guido> so we don't forget?

    Guido> A proposal for an API given the existing newlines=... parameter
    Guido> (described in detail in PEP 3116) would be even better.

I've been thinking about this some more (in lieu of actually writing up any
sort of proposal ;-) and I'm not so sure it would be all that useful.  If
you've opened a file in text mode you should only be writing newlines as
'\n' anyway.  If you want to translate a text file imported from another
system to use the current system's line ending just open both the input and
output files in text mode.
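
A minimal sketch of that conversion, assuming Python 3's io module and using
in-memory buffers instead of real files. The function name convert_newlines
is made up, and the target ending is passed explicitly so the result does not
depend on the platform running it.

```python
import io

def convert_newlines(data, target):
    # Input side: universal newlines, so '\r', '\n' and '\r\n' all become '\n'.
    reader = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=None)
    text = reader.read()

    # Output side: each '\n' is written as the requested line ending.
    raw = io.BytesIO()
    writer = io.TextIOWrapper(raw, encoding="ascii", newline=target)
    writer.write(text)
    writer.flush()
    result = raw.getvalue()
    writer.detach()
    return result

mixed = b"mac\rdos\r\nunix\n"
as_dos = convert_newlines(mixed, "\r\n")
as_unix = convert_newlines(mixed, "\n")
```

With real files the same thing is two open() calls in text mode, one for
reading and one for writing.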

With universal newlines mode for output, should writing '\r\n' result in one
or two newlines (or one-and-a-half)?  Depending on the platform you can
argue that it should write out '\r\r', '\r\n\r\n' or '\n\n' or if on Windows
that it should be left alone as '\r\n'.  There is, of course, the current
'\r\r\n' behavior as well.  I don't think there's obviously one best answer.
If you want to do something esoteric, open the file in binary mode and do
whatever you like.

Skip

From nmm1 at cus.cam.ac.uk  Sun Sep 30 15:38:12 2007
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Sun, 30 Sep 2007 14:38:12 +0100
Subject: [Python-Dev] New lines, carriage returns, and Windows
Message-ID: <E1Ibyzs-0001x2-4S@draco.cus.cam.ac.uk>

skip at pobox.com wrote:
> 
> I've been thinking about this some more (in lieu of actually writing up any
> sort of proposal ;-) and I'm not so sure it would be all that useful.  If
> you've opened a file in text mode you should only be writing newlines as
> '\n' anyway.  If you want to translate a text file imported from another
> system to use the current system's line ending just open both the input and
> output files in text mode.

I.e. at least \r, \f and \v are discouraged - i.e. system-dependent,
at best.  That works.

> With universal newlines mode for output, should writing '\r\n' result in one
> or two newlines (or one-and-a-half)?  Depending on the platform you can
> argue that it should write out '\r\r', '\r\n\r\n' or '\n\n' or if on Windows
> that it should be left alone as '\r\n'.  There is, of course, the current
> '\r\r\n' behavior as well.  I don't think there's obviously one best answer.

Quite.  And it has nothing to do with the format the outside system
uses - your first question is purely a matter of what the semantics
of the Python program are.  The question applies as much to zOS as
to any of the systems Python supports.

> If you want to do something esoteric, open the file in binary mode and do
> whatever you like.

Er, no.  That's the Unix mistake.  It works, provided two things are
true:

    1) You don't need to write portable formatting.

    2) The 'outside system' uses the control characters of a byte
stream for formatting.

Let's skip (1) - but (2) is universally true, nowadays, isn't it?
Er, no.  Consider reading and writing to an X window (NOT an xterm).
Such formatting is out-of-band (sorry, I used out-of-bound in a
previous posting).

Ouch.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From skip at pobox.com  Sun Sep 30 15:39:42 2007
From: skip at pobox.com (skip at pobox.com)
Date: Sun, 30 Sep 2007 08:39:42 -0500
Subject: [Python-Dev] [python] Re:  New lines, carriage returns,
 and        Windows
In-Reply-To: <46FE9B09.8000800@voidspace.org.uk>
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk>
	<ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>
	<46FE6F92.40601@voidspace.org.uk> <fdm5l4$rok$1@sea.gmane.org>
	<46FE9B09.8000800@voidspace.org.uk>
Message-ID: <18175.42782.873060.154910@montanaro.dyndns.org>


    Michael> Actually, I usually get these strings from Windows UI
    Michael> components. A file containing '\r\n' is read in with '\r\n'
    Michael> being translated to '\n'. New user input is added containing
    Michael> '\r\n' line endings. The file is written out and now contains a
    Michael> mix of '\r\n' and '\r\r\n'.

So you need a translation layer between the UI component and your code.
Treat the component as a text file and perform the desired mapping.  Yes?

Skip

From fuzzyman at voidspace.org.uk  Sun Sep 30 15:49:56 2007
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sun, 30 Sep 2007 14:49:56 +0100
Subject: [Python-Dev] [python] Re:  New lines, carriage returns,
 and        Windows
In-Reply-To: <18175.42782.873060.154910@montanaro.dyndns.org>
References: <E1IbYZl-0000pk-HS@libra.cus.cam.ac.uk>
	<ca471dc20709290807l351b87d7k2dd651e5ef18a83c@mail.gmail.com>
	<46FE6F92.40601@voidspace.org.uk> <fdm5l4$rok$1@sea.gmane.org>
	<46FE9B09.8000800@voidspace.org.uk>
	<18175.42782.873060.154910@montanaro.dyndns.org>
Message-ID: <46FFA984.9060602@voidspace.org.uk>

skip at pobox.com wrote:
>     Michael> Actually, I usually get these strings from Windows UI
>     Michael> components. A file containing '\r\n' is read in with '\r\n'
>     Michael> being translated to '\n'. New user input is added containing
>     Michael> '\r\n' line endings. The file is written out and now contains a
>     Michael> mix of '\r\n' and '\r\r\n'.
>
> So you need a translation layer between the UI component and your code.
> Treat the component as a text file and perform the desired mapping.  Yes?
>
>   

Actually the problem was reported by one of the IronPython developers on 
behalf of another user. We stick to using the .NET file I/O and so don't 
have a problem. The only time it is an issue for us is our tests, where 
we have string literals in our test code (where new lines are obviously 
'\n') and we do a manual 'replace'. Not very difficult.

It is just slightly ironic that the time Python 'gets it wrong' (for 
some value of wrong) is when you are using text mode for I/O :-)

Michael

> Skip
>
>   


From nmm1 at cus.cam.ac.uk  Sun Sep 30 16:12:00 2007
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Sun, 30 Sep 2007 15:12:00 +0100
Subject: [Python-Dev] [python] Re: New lines, carriage returns,
	and Windows
Message-ID: <E1IbzWa-0005qt-Cv@virgo.cus.cam.ac.uk>

Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> skip at pobox.com wrote:
> 
> >     Michael> Actually, I usually get these strings from Windows UI
> >     Michael> components. A file containing '\r\n' is read in with '\r\n'
> >     Michael> being translated to '\n'. New user input is added containing
> >     Michael> '\r\n' line endings. The file is written out and now contains a
> >     Michael> mix of '\r\n' and '\r\r\n'.
> >       
> > So you need a translation layer between the UI component and your code.
> > Treat the component as a text file and perform the desired mapping.  Yes?
> 
> Actually the problem was reported by one of the IronPython developers on 
> behalf of another user. We stick to using the .NET file I/O and so don't 
> have a problem. The only time it is an issue for us is our tests, where 
> we have string literals in our test code (where new lines are obviously 
> '\n') and we do a manual 'replace'. Not very difficult.
> 
> It is just slightly ironic that the time Python 'gets it wrong' (for 
> some value of wrong) is when you are using text mode for I/O :-)

Plus ca change, ....

That has been the problem for as long as I have been using the byte
stream model (nearly 40 years now).  Provided that you can get
control, OR there are well-defined semantics, you can sort things
out.  The semantics "we define only the trivial case, and the
programmer must do something arcane, undefined and system-dependent
for the rest" means that it is impossible for an interface to do
the 'right' translation unless it knows what each side of it is
assuming.

As I say, there are solutions.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679