From fredrik@pythonware.com Wed Feb 23 09:55:05 2005 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 23 Feb 2005 10:55:05 +0100 Subject: [Python-Dev] RE: Nested scopes resolution -- you can breathe again! References: Message-ID: <01c301c5198d$c6bcc3f0$0900a8c0@SPIFF> Mikael Olofsson wrote: > There really is a time machine. So I guess I can get the full Python 3k > functionality by doing > > from __future__ import * I wouldn't do that: it imports both "warnings_are_errors" and "from_import_star_is_evil", and we've found that it's impossible to catch ParadoxErrors in a platform independent way. Cheers /F From abo at minkirri.apana.org.au Tue Feb 1 00:30:05 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Tue Feb 1 00:30:46 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: References: <000d01c50744$b2395700$fe26a044@oemcomputer> Message-ID: <1107214205.3719.23.camel@schizo> On Mon, 2005-01-31 at 15:16 -0500, Nathan Binkert wrote: > > Wouldn't it be nicer to have a facility that let you send messages > > between processes and manage concurrency properly instead? You'll need > > most of this anyway to do multithreading sanely, and the benefit to the > > multiple process model is that you can scale to multiple machines, not > > just processors. For brokering data between processes on the same > > machine, you can use mapped memory if you can't afford to copy it > > around, which gives you basically all the benefits of threads with > > fewer pitfalls. > > I don't think this is an answered problem. There are plenty of > researchers on both sides of this fence. It is not been proven at all > that threads are a bad model. > > http://capriccio.cs.berkeley.edu/pubs/threads-hotos-2003.pdf or even > http://www.python.org/~jeremy/weblog/030912.html These are both threads vs events discussions (ie, threads vs an async-event handler loop). This has nearly nothing to do with multiple CPU utilisation. The real discussion for multiple CPU utilisation is threads vs processes. Once again, my knowledge of this is old and possibly out of date, but threads do not scale well on multiple CPU's because threads use shared memory between each thread. Multiple CPU hardware _can_ have physically shared memory, but it is hardware hell keeping CPU caches in sync etc. It is much easier to build a multi-CPU machine with separate memory for each CPU, and high speed communication channels between each CPU. I suspect most modern multi-CPU's use this architecture. Assuming they have the separate-memory architecture, you get much better CPU utilisation if you design your program as separate processes communicating together, not threads sharing memory. In fact, it wouldn't surprise me if most Operating Systems that support threads don't support distributing threads over multiple CPU's at all. A quick google search revealed this; http://www.heise.de/ct/english/98/13/140/ Keeping in mind the high overheads of sharing memory between CPU's, the discussion about threads at this url seems to confirm; threads with shared memory are hard to distribute over multiple CPU's. Different OS's and/or thread implementations have tried (or just outright rejected) different ways of doing it, to varying degrees of success. IMHO, the fact that QNX doesn't distribute threads speaks volumes. 
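To make the separate-processes model concrete, here is a minimal, purely illustrative sketch (added here as an aside, not part of the original message) of two processes cooperating by passing a message over a pipe, using only os.fork and os.pipe on a POSIX system:

import os

def worker(write_fd):
    # Child process: do some independent work and send the result
    # back over the pipe as a string.
    total = 0
    for i in xrange(100000):
        total += i * i
    os.write(write_fd, str(total))
    os._exit(0)

read_fd, write_fd = os.pipe()
pid = os.fork()
if pid == 0:
    # Child: only needs the write end of the pipe.
    os.close(read_fd)
    worker(write_fd)
else:
    # Parent: read the child's result and wait for it to exit.
    os.close(write_fd)
    result = os.read(read_fd, 1024)
    os.waitpid(pid, 0)
    print "child computed", result

Each process runs its own interpreter with its own GIL, so the two halves really can execute on separate CPUs; the only sharing is the explicit message written to the pipe.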
-- Donovan Baarda http://minkirri.apana.org.au/~abo/ From abo at minkirri.apana.org.au Tue Feb 1 03:06:34 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Tue Feb 1 03:07:22 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: <1107214205.3719.23.camel@schizo> References: <000d01c50744$b2395700$fe26a044@oemcomputer> <1107214205.3719.23.camel@schizo> Message-ID: <1107223595.3719.36.camel@schizo> On Tue, 2005-02-01 at 10:30 +1100, Donovan Baarda wrote: > On Mon, 2005-01-31 at 15:16 -0500, Nathan Binkert wrote: > > > Wouldn't it be nicer to have a facility that let you send messages > > > between processes and manage concurrency properly instead? You'll need [...] > A quick google search revealed this; > > http://www.heise.de/ct/english/98/13/140/ > > Keeping in mind the high overheads of sharing memory between CPU's, the > discussion about threads at this url seems to confirm; threads with > shared memory are hard to distribute over multiple CPU's. Different OS's > and/or thread implementations have tried (or just outright rejected) > different ways of doing it, to varying degrees of success. IMHO, the > fact that QNX doesn't distribute threads speaks volumes. Sorry for replying to my reply, but I forgot the bit that brings it all back On Topic :-) The belief that the opcode granularity thread-switch driven by the GIL is the cause of Python's threads being non-distributable is only half true. Since OS's don't distribute threads well, any attempts to "Fix Python's Threading" in an attempt to make its threads distributable is a waste of time. The only thing that this might achieve would be to reduce the latency on thread switches, maybe allowing faster response to OS events like signals. However, the complexity introduced would cause more problems than it would fix, and could easily result in worse performance, not better. -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From p.f.moore at gmail.com Tue Feb 1 10:18:04 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Tue Feb 1 10:18:07 2005 Subject: [Python-Dev] python-dev Summary for 2004-12-16 through 2004-12-31 [draft] In-Reply-To: <41FEAAEC.5080805@ocf.berkeley.edu> References: <41FEAAEC.5080805@ocf.berkeley.edu> Message-ID: <79990c6b050201011871e86ce3@mail.gmail.com> On Mon, 31 Jan 2005 14:02:20 -0800, Brett C. wrote: > 2.5 was released just before the time this summary covers so most stuff was on bug > fixes discovered after the release. Give Guido the time machine keys back! I assume you meant 2.4, or is this a blatant attempt to get back ahead of schedule with summaries? :-) Paul. PS If you look in this month's python-dev archives, you'll see evidence of /F's last attempt to steal the time machine, with a message posted from the "far future" of Feb 23rd, 2005. He clearly stalled the machine, as he posted from an alternate reality. Let this be a warning! From cedric.dev at tele2.fr Tue Feb 1 13:20:10 2005 From: cedric.dev at tele2.fr (cedric paille) Date: Tue Feb 1 12:24:14 2005 Subject: [Python-Dev] Python reference count question Message-ID: <007701c50858$5ff4cc30$90010d0a@umanis.com> Hi all, i'm working on an app that embed python 2.3 with Gnu/Linux, and i'd like to have some precisions: I'm making python's modules to extend my application's functions with a built in script editor. At now all works very well, but i'd like to know if i'm not forgetting some references inc/dec.... 
Here is a portion of my code:

static PyObject *
Scene_GetNodeGraph(PyObject *self, PyObject *args)
{
    NodeGraph* Ng = NodeGraph::GetInstance();
    std::vector NodG;
    Ng->GetNodeGraph(NodG);
    PyObject* List = PyList_New(NodG.size());
    PyObject* Str = NULL;
    std::vector::iterator it = NodG.begin();
    int i = 0;
    for (;it != NodG.end();it++)
    {
        Str = PyString_FromString(it->AsChar());
        PyList_SetItem(List,i,Str);
        i++;
    }
    return List;
}

Can someone take a look at this and tell me if i must add some inc/decref ? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050201/20c06f0b/attachment.html From steve at holdenweb.com Tue Feb 1 16:32:43 2005 From: steve at holdenweb.com (Steve Holden) Date: Tue Feb 1 16:38:13 2005 Subject: [Python-Dev] Database import problems Message-ID: <41FFA11B.6000807@holdenweb.com> I wonder if there is a developer with MySQL or sqlite and the appropriate Python interface module who can help me to understand a problem I'm experiencing trying to use PEP 302-style import hooks. Basically I suspect we've either got an import bug or (more likely IMHO) a documentation bug, but I don't want to file on sf until I know exactly what the problem is, and I'm reluctant to use too much bandwidth on python-dev, which I know to be a busy list. The background is visible in the Python-list archives starting at http://mail.python.org/pipermail/python-list/2005-January/262148.html Of course it's possible that a savvy developer can just tell me what the problem is by reading that thread. If not, being a bear of little brain I need help from someone who is used to running debugging interpreters and can see exactly what's going on - my debugging system is fine for Python source, but has no insight into the interpreter code itself. Since I'm not currently subscribed to python-dev an email response (or, better, a follow-up on the c.l.py thread) would be appreciated if you can solve this problem. I'm happy to send full code off-list (or on-list, come to that) to anybody who can assist. regards Steve -- Meet the Python developers and your c.l.py favorites Come to PyCon!!!! http://www.python.org/pycon/2005/ Steve Holden http://www.holdenweb.com/ From gvanrossum at gmail.com Tue Feb 1 16:49:41 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Feb 1 16:49:52 2005 Subject: [Python-Dev] Database import problems In-Reply-To: <41FFA11B.6000807@holdenweb.com> References: <41FFA11B.6000807@holdenweb.com> Message-ID: On Tue, 01 Feb 2005 10:32:43 -0500, Steve Holden wrote: > I wonder if there is a developer with MySQL or sqlite and the > appropriate Python interface module who can help me to understand a > problem I'm experiencing trying to use PEP 302-style import hooks. [...] I sent Steve a private reply pointing out the line "sys.modules['path'] = path" in os.py. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Tue Feb 1 16:50:12 2005 From: aahz at pythoncraft.com (Aahz) Date: Tue Feb 1 16:50:14 2005 Subject: [Python-Dev] Python reference count question In-Reply-To: <007701c50858$5ff4cc30$90010d0a@umanis.com> References: <007701c50858$5ff4cc30$90010d0a@umanis.com> Message-ID: <20050201155012.GA14254@panix.com> On Tue, Feb 01, 2005, cedric paille wrote: > > Hi all, i'm working on an app that embed python 2.3 with Gnu/Linux, > and i'd like to have some precisions: python-dev is for the core developers to discuss bugs and patches.
Please use comp.lang.python for questions about using Python. Thanks. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "Given that C++ has pointers and typecasts, it's really hard to have a serious conversation about type safety with a C++ programmer and keep a straight face. It's kind of like having a guy who juggles chainsaws wearing body armor arguing with a guy who juggles rubber chickens wearing a T-shirt about who's in more danger." --Roy Smith, c.l.py, 2004.05.23 From ndbecker2 at verizon.net Tue Feb 1 17:11:37 2005 From: ndbecker2 at verizon.net (Neal Becker) Date: Tue Feb 1 17:37:25 2005 Subject: [Python-Dev] complex I/O problem Message-ID: If I call "print" on a complex value, I may get this: '(2+2j)' But this is not acceptable as input: complex ('(2+2j)') Traceback (most recent call last): File "", line 1, in ? ValueError: complex() arg is a malformed string Whatever format is used for output should be accepted as input! From amk at amk.ca Tue Feb 1 18:16:10 2005 From: amk at amk.ca (A.M. Kuchling) Date: Tue Feb 1 18:18:14 2005 Subject: [Python-Dev] complex I/O problem In-Reply-To: References: Message-ID: <20050201171610.GA10114@rogue.amk.ca> On Tue, Feb 01, 2005 at 11:11:37AM -0500, Neal Becker wrote: > complex ('(2+2j)') > Traceback (most recent call last): > File "", line 1, in ? > ValueError: complex() arg is a malformed string > > Whatever format is used for output should be accepted as input! This isn't true in general; it's not true of strings, for example, nor of files. Parsing complex numbers would be pretty complicated, because it would have to accept '(2+2j)', '2+2j', '3e-6j', and perhaps even '4j+3'. It seems easier to just use eval() than to make complex() implement an entire mini-parser. --amk From gvanrossum at gmail.com Tue Feb 1 18:27:45 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Feb 1 18:27:51 2005 Subject: [Python-Dev] complex I/O problem In-Reply-To: <20050201171610.GA10114@rogue.amk.ca> References: <20050201171610.GA10114@rogue.amk.ca> Message-ID: On Tue, 1 Feb 2005 12:16:10 -0500, A.M. Kuchling wrote: > On Tue, Feb 01, 2005 at 11:11:37AM -0500, Neal Becker wrote: > > complex ('(2+2j)') > > Traceback (most recent call last): > > File "", line 1, in ? > > ValueError: complex() arg is a malformed string > > > > Whatever format is used for output should be accepted as input! > > This isn't true in general; it's not true of strings, for example, nor > of files. Parsing complex numbers would be pretty complicated, > because it would have to accept '(2+2j)', '2+2j', '3e-6j', and perhaps > even '4j+3'. It seems easier to just use eval() than to make > complex() implement an entire mini-parser. Well, complex('2+2j') works, so it's not that far... But the rules are different: - There's no requirement whatsoever for str(); it can be whatever makes the most sense for the type. - For repr(), if at all possible, eval(repr(x)) == x should hold, in a suitable environment (you may have to import certain things in the namespace). If this can't be made true, repr(x) should be of the form <...>. - If there's no need for str() and repr() to be different, let str(x) == repr(x). So I think complex() is just fine. 
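To make those rules concrete, here is a rough 2.4-style session (added purely as an illustration, not part of the original message):

>>> x = 2 + 2j
>>> repr(x)
'(2+2j)'
>>> eval(repr(x)) == x        # the repr() round-trip works via eval()
True
>>> complex('2+2j')           # the constructor accepts the plain form...
(2+2j)
>>> complex(repr(x))          # ...but not the parenthesised repr() form
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: complex() arg is a malformed string

In other words, the repr() contract is satisfied through eval(), even though the constructor itself only understands a subset of the forms that str() and repr() can produce.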
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jcarlson at uci.edu Tue Feb 1 18:31:22 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue Feb 1 18:33:45 2005 Subject: [Python-Dev] complex I/O problem In-Reply-To: <20050201171610.GA10114@rogue.amk.ca> References: <20050201171610.GA10114@rogue.amk.ca> Message-ID: <20050201092927.48FF.JCARLSON@uci.edu> "A.M. Kuchling" wrote: > > On Tue, Feb 01, 2005 at 11:11:37AM -0500, Neal Becker wrote: > > complex ('(2+2j)') > > Traceback (most recent call last): > > File "", line 1, in ? > > ValueError: complex() arg is a malformed string > > > > Whatever format is used for output should be accepted as input! > > This isn't true in general; it's not true of strings, for example, nor > of files. Parsing complex numbers would be pretty complicated, > because it would have to accept '(2+2j)', '2+2j', '3e-6j', and perhaps > even '4j+3'. It seems easier to just use eval() than to make > complex() implement an entire mini-parser. Which brings up the fact that while some things are able to make the eval(str(obj)) loop, more are able to make the eval(repr(obj)) loop (like strings themselves...). - Josiah From fdrake at acm.org Tue Feb 1 21:06:17 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue Feb 1 21:06:43 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up functioncalls) In-Reply-To: <1107198504.4185.5.camel@localhost> References: <1107190265.41fe61f9caab8@mcherm.com> <1107198504.4185.5.camel@localhost> Message-ID: <200502011506.17223.fdrake@acm.org> On Monday 31 January 2005 14:08, Glyph Lefkowitz wrote: > As it stands, this idiom works most of the time, and if an EMFILE errno > triggered the GC, it would always work. That might help things on Unix, but I don't think that's meaningful. Windows is much more sensitive to files being closed, and the refcount solution supports that more effectively than delayed garbage collection strategies. With the current approach, you can delete the file right away after releasing the last reference to the open file object, even on Windows. You can't do that with delayed GC since Windows will be convinced that the file is still open and refuse to let you delete it. To fix that, you'd have to trigger GC from the failed removal operation and try again. I think we'd find there are a lot more operations that need that support than we'd like to think. -Fred -- Fred L. Drake, Jr. From theller at python.net Tue Feb 1 21:17:17 2005 From: theller at python.net (Thomas Heller) Date: Tue Feb 1 21:15:41 2005 Subject: [Python-Dev] Is msvcr71.dll re-redistributable? Message-ID: <4qgwf31u.fsf@python.net> The 2.4 python.org installer installs msvcr71.dll on the target system. If someone uses py2exe or a similar tool to create a frozen application, is he allowed to redistribute this msvcr71.dll to other users together with his application or not, even if he doesn't own MSVC? This was asked on the py2exe users list, but I could not answer this question. Googling for msvcr71.dll finds some site which offer to download it, and they pretend that they are not violating any license, but I wasn't able to find definite words from MS about that. Thanks, Thomas From bac at OCF.Berkeley.EDU Tue Feb 1 23:22:45 2005 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Tue Feb 1 23:22:53 2005 Subject: [Python-Dev] python-dev Summary for 2004-12-16 through 2004-12-31 [draft] In-Reply-To: <79990c6b050201011871e86ce3@mail.gmail.com> References: <41FEAAEC.5080805@ocf.berkeley.edu> <79990c6b050201011871e86ce3@mail.gmail.com> Message-ID: <42000135.8030701@ocf.berkeley.edu> Paul Moore wrote: > On Mon, 31 Jan 2005 14:02:20 -0800, Brett C. wrote: > >>2.5 was released just before the time this summary covers so most stuff was on bug >>fixes discovered after the release. > > > Give Guido the time machine keys back! > Fine, but I was going to go back in time, win the lottery, and give so much money to the PSF that a bunch of people were going to work on Python full-time for the rest of their lives. It's your fault, Paul, that isn't going to happen now. =) > I assume you meant 2.4, or is this a blatant attempt to get back ahead > of schedule with summaries? :-) > =) No, it's a typo. Problem of always using and working on 2.5 but having to remember when I am dealing with older versions. > Paul. > > PS If you look in this month's python-dev archives, you'll see > evidence of /F's last attempt to steal the time machine, with a > message posted from the "far future" of Feb 23rd, 2005. He clearly > stalled the machine, as he posted from an alternate reality. Let this > be a warning! Will actually be nice to finally not have to automatically skip the first line in the archive page thanks to that funky email. -Brett From mike at skew.org Wed Feb 2 01:50:51 2005 From: mike at skew.org (Mike Brown) Date: Wed Feb 2 01:51:01 2005 Subject: [Python-Dev] mimetypes and _winreg In-Reply-To: <40CB5684.2090609@garthy.com> Message-ID: <200502020050.j120opQW020156@chilled.skew.org> Following up on this 12 Jun 2004 post... Garth wrote: > Thomas Heller wrote: > >Mike Brown writes: > >>I thought it would be nice to try to improve the mimetypes module by having > >>it, on Windows, query the Registry to get the mapping of filename extensions > >>to media types, since the mimetypes code currently just blindly checks > >>posix-specific paths for httpd-style mapping files. However, it seems that the > >>way to get mappings from the Windows registry is excessively slow in Python. > >> > >>I'm told that the reason has to do with the limited subset of APIs that are > >>exposed in the _winreg module. I think it is that EnumKey(key, index) is > >>querying for the entire list of subkeys for the given key every time you call > >>it. Or something. Whatever the situation is, the code I tried below is way > >>slower than I think it ought to be. > >> > >>Does anyone have any suggestions (besides "write it in C")? Could _winreg > >>possibly be improved to provide an iterator or better interface to get the > >>subkeys? (or certain ones? There are a lot of keys under HKEY_CLASSES_ROOT, > >>and I only need the ones that start with a period). > > > >See this post I made some time ago: > > > > > >>Should I file this as a feature request? > > > >If you still think it should be changed in the core, you should work on > >a patch. > > > I could file a patch if no one else is looking at it. The solution would > be to use RegEnumKeyEx and remove RegQueryInfoKey. This loses > compatability with win16 which I guess is ok. > > Garth I would say it looks like no one else was looking at it, and Garth apparently didn't submit a patch. It's beyond my means to come up with a patch myself. Would someone be willing to take a look at it? Sorry, but I really want access to registry subkeys to stop being so dog-slow. 
:) Thanks for taking a look, -Mike From vwehren at home.nl Wed Feb 2 06:30:04 2005 From: vwehren at home.nl (Vincent Wehren) Date: Wed Feb 2 06:30:05 2005 Subject: [Python-Dev] Is msvcr71.dll re-redistributable? In-Reply-To: <4qgwf31u.fsf@python.net> References: <4qgwf31u.fsf@python.net> Message-ID: <4200655C.2030305@home.nl> Thomas Heller wrote: > The 2.4 python.org installer installs msvcr71.dll on the target system. > > If someone uses py2exe or a similar tool to create a frozen application, > is he allowed to redistribute this msvcr71.dll to other users together > with his application or not, even if he doesn't own MSVC? According to the EULA, you may distribute anything listed in redist.txt: """2.2 Redistributable Code-General. Microsoft grants you a nonexclusive, royalty-free right to reproduce and distribute the object code form of any portion of the Software listed in REDIST.TXT ("Redistributable Code"). For general redistribution requirements for Redistributable Code, see Section 3.1, below.""" So the right to distribute is coupled to the a) the EULA and b) redist.txt. (As a side note, the Microsoft Visual C++ Toolkit 2003 for example contains NO redistributables per redist.txt). In the case of not owning a compiler at all, chances seem pretty slim you have any rights to distribute anything. -- Vincent Wehren > > This was asked on the py2exe users list, but I could not answer this > question. Googling for msvcr71.dll finds some site which offer to > download it, and they pretend that they are not violating any license, > but I wasn't able to find definite words from MS about that. > > Thanks, > > Thomas > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/vwehren%40home.nl > From t-meyer at ihug.co.nz Wed Feb 2 09:38:00 2005 From: t-meyer at ihug.co.nz (Tony Meyer) Date: Wed Feb 2 09:38:47 2005 Subject: [Python-Dev] Is msvcr71.dll re-redistributable? In-Reply-To: Message-ID: [Thanks for bringing this up, BTW, Thomas]. [Thomas Heller] >> The 2.4 python.org installer installs msvcr71.dll on the >> target system. >> >> If someone uses py2exe or a similar tool to create a frozen >> application, is he allowed to redistribute this msvcr71.dll >> to other users together with his application or not, even if >> he doesn't own MSVC? [Vincent Wehren] > According to the EULA, Is that the EULA of MS VC++? > you may distribute anything listed in redist.txt: And, just to be clear, mscvr71.dll is in redist.txt? > """2.2 Redistributable Code-General. Microsoft grants you a > nonexclusive, royalty-free right to reproduce and distribute > the object code form of any portion of the Software listed in > REDIST.TXT ("Redistributable Code"). For general redistribution > requirements for Redistributable Code, see Section 3.1, below.""" Is it legit to redistribute an EULA? If so, would you mind sending me a copy of this (off-list)? > So the right to distribute is coupled to the a) the EULA and b) > redist.txt. (As a side note, the Microsoft Visual C++ Toolkit > 2003 for example contains NO redistributables per redist.txt). I'm not that familiar with the names of all these things. Is the "Microsoft Visual C++ Toolkit 2003" the free thing that you can get? > In the case of not owning a compiler at all, chances seem pretty slim > you have any rights to distribute anything. Well, I 'own' a copy of gcc, which is a compiler . 
Can anyone here suggest a way to get around this? As a specific example: the SpamBayes distribution includes a py2exe binary, and it would be nice (although not essential) to build this with 2.4. However, at the moment my name goes down as the release manager, and I don't have (AFAICT) a licence to redistribute msvcr71.dl. Should people in this situation just stick with 2.3 or buy a copy of a MS compiler? =Tony.Meyer From ajm at flonidan.dk Wed Feb 2 11:38:05 2005 From: ajm at flonidan.dk (Anders J. Munch) Date: Wed Feb 2 11:38:20 2005 Subject: [Python-Dev] Is msvcr71.dll re-redistributable? Message-ID: <6D9E824FA10BD411BE95000629EE2EC3C6DE48@FLONIDAN-MAIL> >From Tony Meyer [mailto:t-meyer@ihug.co.nz]: > Can anyone here suggest a way to get around this? As a specific > example: the SpamBayes distribution includes a py2exe binary, and it > would be nice (although not essential) to build this with 2.4. > However, at the moment my name goes down as the release manager, and > I don't have (AFAICT) a licence to redistribute msvcr71.dl. Instead of redistributing msvcr71.dll on your own volition, help someone else distribute it: 1. John X. Programmer buys the product, agrees to the EULA and puts the DLL up for download, with the explicit and stated intent of distributing it to anyone who needs it. 2. You, being the nice person you are, decide to help John X. Programmer. You do that by including msvcr71.dll in your software distribution. After all, the users of your software needs it. As you are merely aiding John X. Programmer in performing the redistribution that is within his rights to do, there is no need for anyone to be granted any additional rights, and specifically you do not need to agree to the EULA. Unless the EULA contains specific language to forbid such multi-stage open-ended redistribution, I'd say you can just re-redistribute away. but-then-I-am-not-a-lawyer-ly y'rs, Anders From theller at python.net Wed Feb 2 12:05:32 2005 From: theller at python.net (Thomas Heller) Date: Wed Feb 2 12:04:02 2005 Subject: [Python-Dev] mimetypes and _winreg In-Reply-To: <200502020050.j120opQW020156@chilled.skew.org> (Mike Brown's message of "Tue, 1 Feb 2005 17:50:51 -0700 (MST)") References: <200502020050.j120opQW020156@chilled.skew.org> Message-ID: Mike Brown writes: > Following up on this 12 Jun 2004 post... > > Garth wrote: >> Thomas Heller wrote: >> >Mike Brown writes: >> >>I thought it would be nice to try to improve the mimetypes module by having >> >>it, on Windows, query the Registry to get the mapping of filename extensions >> >>to media types, since the mimetypes code currently just blindly checks >> >>posix-specific paths for httpd-style mapping files. However, it seems that the >> >>way to get mappings from the Windows registry is excessively slow in Python. >> >> >> >>I'm told that the reason has to do with the limited subset of APIs that are >> >>exposed in the _winreg module. I think it is that EnumKey(key, index) is >> >>querying for the entire list of subkeys for the given key every time you call >> >>it. Or something. Whatever the situation is, the code I tried below is way >> >>slower than I think it ought to be. >> >> >> >>Does anyone have any suggestions (besides "write it in C")? Could _winreg >> >>possibly be improved to provide an iterator or better interface to get the >> >>subkeys? (or certain ones? There are a lot of keys under HKEY_CLASSES_ROOT, >> >>and I only need the ones that start with a period). 
>> > >> >See this post I made some time ago: >> > >> > >> >>Should I file this as a feature request? >> > >> >If you still think it should be changed in the core, you should work on >> >a patch. >> > >> I could file a patch if no one else is looking at it. The solution would >> be to use RegEnumKeyEx and remove RegQueryInfoKey. This loses >> compatability with win16 which I guess is ok. >> >> Garth > > I would say it looks like no one else was looking at it, and Garth apparently > didn't submit a patch. It's beyond my means to come up with a patch myself. > Would someone be willing to take a look at it? There is a patch, but, as so often, work on it has stalled. http://www.python.org/sf/977553 Thomas From stephen at xemacs.org Wed Feb 2 13:40:23 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed Feb 2 13:40:40 2005 Subject: [Python-Dev] Is msvcr71.dll re-redistributable? In-Reply-To: <6D9E824FA10BD411BE95000629EE2EC3C6DE48@FLONIDAN-MAIL> (Anders J. Munch's message of "Wed, 2 Feb 2005 11:38:05 +0100") References: <6D9E824FA10BD411BE95000629EE2EC3C6DE48@FLONIDAN-MAIL> Message-ID: <87wttr3zk8.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Anders" == Anders J Munch writes: Anders> Unless the EULA contains specific language to forbid such Anders> multi-stage open-ended redistribution, I'd say you can Anders> just re-redistribute away. Anders> but-then-I-am-not-a-lawyer-ly y'rs, Anders I am not either, but in matters like this it works the other way around: all rights not _explicitly_ granted are reserved. Somebody had better ask a real lawyer; in theory, you could be putting downstream users who share with their friends at risk. -- Institute of Policy and Planning Sciences http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From vwehren at home.nl Wed Feb 2 18:27:47 2005 From: vwehren at home.nl (Vincent Wehren) Date: Wed Feb 2 18:27:48 2005 Subject: [Python-Dev] Is msvcr71.dll re-redistributable? In-Reply-To: References: Message-ID: <42010D93.7010002@home.nl> Tony Meyer wrote: > [Thanks for bringing this up, BTW, Thomas]. > > [Thomas Heller] > > > [Vincent Wehren] > >>According to the EULA, > > > Is that the EULA of MS VC++? The full text of the EULA for Visual C++ Toolkit 2003 can be found at http://msdn.microsoft.com/visualc/vctoolkit2003/eula.aspx For VS.NET: http://proprietary.clendons.co.nz/licenses/eula/VisualStudiodotnetEnterpriseArchitect2002-eula.htm > >>you may distribute anything listed in redist.txt: > > > And, just to be clear, mscvr71.dll is in redist.txt? Not in the free toolkit; in the $-version it must be. > I'm not that familiar with the names of all these things. Is the "Microsoft > Visual C++ Toolkit 2003" the free thing that you can get? Yep. >>In the case of not owning a compiler at all, chances seem pretty slim >>you have any rights to distribute anything. > > > Well, I 'own' a copy of gcc, which is a compiler . > > Can anyone here suggest a way to get around this? As a specific example: > the SpamBayes distribution includes a py2exe binary, and it would be nice > (although not essential) to build this with 2.4. However, at the moment my > name goes down as the release manager, and I don't have (AFAICT) a licence > to redistribute msvcr71.dl. Okay: thinking about this for a bit longer: it is the Python interpreter that needs msvcr71.dll, right. You need the python interpreter for py2exe. 
The distributor of Python is allowed to redistribute msvcr71.dll, and you are acting as re-distributor for the Python interpreter (to end users) and the EULA never even cares for/applies to the frozen binary... -- Vincent Wehren > > Should people in this situation just stick with 2.3 or buy a copy of a MS > compiler? > > =Tony.Meyer > > From nhodgson at bigpond.net.au Wed Feb 2 22:04:40 2005 From: nhodgson at bigpond.net.au (Neil Hodgson) Date: Wed Feb 2 22:04:49 2005 Subject: [Python-Dev] Is msvcr71.dll re-redistributable? References: <6D9E824FA10BD411BE95000629EE2EC3C6DE48@FLONIDAN-MAIL> Message-ID: <001e01c5096a$cfbe8480$214c8890@neil> Anders J. Munch: > 1. John X. Programmer buys the product, agrees to the EULA and puts > the DLL up for download, with the explicit and stated intent of > distributing it to anyone who needs it. Disallowed in 3.1(a): # you agree: ... to distribute the Redistributables only ... in # conjunction with and as a part of a software application # product developed by you that adds significant and primary # functionality to the Redistributables > Unless the EULA contains specific language to forbid such multi-stage > open-ended redistribution, I'd say you can just re-redistribute away. Lawyers think like lawyers much better than developers do. Neil From theller at python.net Wed Feb 2 22:16:10 2005 From: theller at python.net (Thomas Heller) Date: Wed Feb 2 22:14:41 2005 Subject: [Python-Dev] Is msvcr71.dll re-redistributable? In-Reply-To: <001e01c5096a$cfbe8480$214c8890@neil> (Neil Hodgson's message of "Thu, 3 Feb 2005 08:04:40 +1100") References: <6D9E824FA10BD411BE95000629EE2EC3C6DE48@FLONIDAN-MAIL> <001e01c5096a$cfbe8480$214c8890@neil> Message-ID: <3bwed5np.fsf@python.net> "Neil Hodgson" writes: > Anders J. Munch: > >> 1. John X. Programmer buys the product, agrees to the EULA and puts >> the DLL up for download, with the explicit and stated intent of >> distributing it to anyone who needs it. > > Disallowed in 3.1(a): > # you agree: ... to distribute the Redistributables only ... in > # conjunction with and as a part of a software application > # product developed by you that adds significant and primary > # functionality to the Redistributables > All this pretty much subsumes what I was thinking. The only question that remains is: why are there some sites like http://www.dll-files.com/ which offer this and other MS dlls for download? For the spambayes binary, maybe there should be another person adding the msvcr71.dll to the distribution that Tony builds? Someone who has a MSVC license, and also is developer on the spambayes project? Thomas From tim.peters at gmail.com Wed Feb 2 23:12:36 2005 From: tim.peters at gmail.com (Tim Peters) Date: Wed Feb 2 23:12:39 2005 Subject: [Python-Dev] Is msvcr71.dll re-redistributable? In-Reply-To: <3bwed5np.fsf@python.net> References: <6D9E824FA10BD411BE95000629EE2EC3C6DE48@FLONIDAN-MAIL> <001e01c5096a$cfbe8480$214c8890@neil> <3bwed5np.fsf@python.net> Message-ID: <1f7befae050202141243ecc3a2@mail.gmail.com> [Thomas Heller] > ... > For the spambayes binary, maybe there should be another person adding > the msvcr71.dll to the distribution that Tony builds? Someone who has a > MSVC license, and also is developer on the spambayes project? To the best of my knowledge, Tony is distributing my duly licensed copy of msvcr71.dll with spambayes. And so long as I remain totally ignorant of what Tony actually does, that will remain my best knowledge. Win-win . 
From noamraph at gmail.com Wed Feb 2 23:55:31 2005 From: noamraph at gmail.com (Noam Raphael) Date: Wed Feb 2 23:56:13 2005 Subject: [Python-Dev] A proposal: built in support for abstract methods Message-ID: Hello, I would like to suggest a new method decorator: abstractmethod. I'm definitely not the only one who've thought about it, but I discussed this on c.l.py, and came to think that it's a nice idea. An even Pythonic! This has nothing to do with type checking and adaptation - or, to be more precise, it may be combined with them, but it will live happily without them. I don't understand these issues a great deal. What was my situations? I had to write a few classes, all with the same interface but with a different implementation, that were meant to work inside some infrastructure. The specific class that would be used would be selected by what exactly the user wanted. Some methods of these classes were exactly the same in all of the classes, so naturally, I wrote a base class with an implementation of these methods. But then came the question: and what about the other methods? I wanted to document that they should exist in all the classes of that family, and that they should do XYZ; otherwise, they won't fit the infrastructure. So I wrote something like: def get_changed(self): """This method should return the changed keys since last call.""" raise NotImplementedError But I wasn't happy about it. I thought that @abstractmethod def get_changed(self): """This methods should ...""" would have been nicer. Why? 1. "Beautiful is better than ugly." - Who was talking here about errors? I just wanted to say what the method should do! 2. "Explicit is better than implicit." - This is really the issue. I *meant* to declare that a method should be implemented in subclasses, and what it should do, but I *was* actually defining a method which raises NotImplementedError when called with no arguments. I am used to understanding NotImplementedError as "We should really implement this some day, when we have the time", not as "In order to be a proud subclass of BaseClass, you should implement this method". 3. "There should be one-- and preferably only one --obvious way to do it." - I could have written this in a few other ways: def get_changed(self): """This method should return the changed keys since last call. PURE VIRTUAL. """ def get_changed(self): """This method should return the changed keys since last call.""" raise NotImplementedError, "get_changed is an abstract methods. Subclasses of BaseClass should implement it." What's good about the last example is that when the exception occurs, it would be easier to find the problem. What's bad about it, is that it's completely redundent, and very long to write. Ok. Now another thing: I want classes that contain abstractmethods be uninstantiable. One (and the main) reason is that instantiating that class of mine doesn't make sense. It doesn't know how to do anything useful, and doesn't represent any consistent object that you can have instances of. The other reason is that it will help the programmer to find out quickly methods he forgot to implement in his subclasses. You may say that it suits "Errors should never pass silently." The basic reason why I think this is fitting is that abstract classes are something which is natural when creating class hierarchies; usually, when I write a method, all subclasses must inherit it, or implement another version with a compatible behaviour. 
Sometimes there is no standard behaviour, so all subclasses must choose the second option. This concept is already in use in Python's standard library today! "basestring" was created as the base class of "str" and "unicode". What I'm proposing is just to make this possible also in code written in Python. George Sakkis has posted a very nice Python implementation of this: http://groups-beta.google.com/group/comp.lang.python/msg/597e9ffa7b1f709b To summarize, I think that abstract methods are simply not regular functions, since by definition they don't specify actions, and so they deserves an object of their own. And if it helps with testing the subclasses - then why not? What do you say? Noam From t-meyer at ihug.co.nz Thu Feb 3 01:23:39 2005 From: t-meyer at ihug.co.nz (Tony Meyer) Date: Thu Feb 3 01:23:45 2005 Subject: [Python-Dev] Is msvcr71.dll re-redistributable? In-Reply-To: Message-ID: [Thomas Heller] >> For the spambayes binary, maybe there should be another >> person adding the msvcr71.dll to the distribution that Tony >> builds? Someone who has a MSVC license, and also is developer >> on the spambayes project? [Tim Peters] > To the best of my knowledge, Tony is distributing my duly > licensed copy of msvcr71.dll with spambayes. And so long as > I remain totally ignorant of what Tony actually does, that > will remain my best knowledge. Win-win . That solves the specific SpamBayes problem. It still seems like this is somewhat of a PITA for people wanting to build frozen Windows apps with Python 2.4, though. OTOH, I can't personally think of anything (apart from the it'll-never-fly go back to VC6 solution or the bound-to-be-terrible static linking solution) that the Python developers can do about it. (Well, there's that chap from Microsoft at PyCon, right? How about one of you convince him to convince Microsoft to give all Python developers a licence to redistribute msvcr71.dll? ). BTW, this bit of the EULA isn't great: ""(iii) to distribute the Licensee Software containing the Redistributables pursuant to an end user license agreement (which may be "break-the-seal", "click-wrap" or signed), with terms no less protective than those contained in this EULA;""" The PSF licence is probably somewhat less protective than that one. I suppose the PSF licence really applies to the source, though, and not the built binary. Or something like that :) (Users giving the software directly to someone else, rather than downloading from the official site, is probably covered by: """You also agree not to permit further distribution of the Redistributables by your end users except you may permit further redistribution of the Redistributables by your distributors to your end-user customers if your distributors only distribute the Redistributables in conjunction with, and as part of, the Licensee Software and you and your distributors comply with all other terms of this EULA.""" Where the users become our redistributors.) =Tony.Meyer From bac at OCF.Berkeley.EDU Thu Feb 3 01:58:19 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Feb 3 01:58:33 2005 Subject: [Python-Dev] redux: fractional seconds in strptime In-Reply-To: <41E83EB8.8060405@ocf.berkeley.edu> References: <16870.61059.451494.303971@montanaro.dyndns.org> <41E74790.60108@ocf.berkeley.edu> <16871.37525.981821.580939@montanaro.dyndns.org> <41E80995.5030901@ocf.berkeley.edu> <16872.3770.25143.582154@montanaro.dyndns.org> <41E83EB8.8060405@ocf.berkeley.edu> Message-ID: <4201772B.90601@ocf.berkeley.edu> Everyone went silent on this topic. 
Does this mean people just stopped caring (which I doubt since I know Skip wants this bad enough to bring it up every so often)? Was it the issue of symmetry with strftime? I am willing to add this (albeit the simple way I proposed in my last email on this thread) but I obviously don't want to bother if no one wants it or likes my proposed solution. -Brett From pje at telecommunity.com Thu Feb 3 02:01:36 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Feb 3 01:59:26 2005 Subject: [Python-Dev] Is msvcr71.dll re-redistributable? In-Reply-To: References: Message-ID: <5.1.1.6.0.20050202195401.038c04d0@mail.telecommunity.com> At 01:23 PM 2/3/05 +1300, Tony Meyer wrote: >(Users giving the software directly to someone else, rather than downloading >from the official site, is probably covered by: > >"""You also agree not to permit further distribution of the Redistributables >by your end users except you may permit further redistribution of the >Redistributables by your distributors to your end-user customers if your >distributors only distribute the Redistributables in conjunction with, and >as part of, the Licensee Software and you and your distributors comply with >all other terms of this EULA.""" > >Where the users become our redistributors.) Sounds like this puts all Python users in the clear, since Python is the Licensee Software in that case. So, anybody can distribute msvcr71 as "part of" Python. OTOH, the other wording sounds like Python itself has to have a click-wrap, tear-open, or signature EULA! IOW, the EULA appears to prohibit free distribution of the runtime with a program that has no EULA. So, in an amusing turn of events, the EULA actually appears to forbid the current offering of Python for Windows, since it does not have such a EULA. This is a much bigger worry than the original question. If we're actually allowed to distribute Python with the runtime at all, then py2exe and such are perfectly safe, since it's in conjunction with permitted redistribution. If distribution of the runtime is not allowed, on the other hand, then use of MSVC 7 for Python becomes altogether impossible without adding some kind of click-wrap licensing scheme. From t-meyer at ihug.co.nz Thu Feb 3 02:32:11 2005 From: t-meyer at ihug.co.nz (Tony Meyer) Date: Thu Feb 3 02:32:12 2005 Subject: [Python-Dev] Is msvcr71.dll re-redistributable? In-Reply-To: Message-ID: (I should point out the thread that starts here, too: in case anyone isn't aware of it). > Sounds like this puts all Python users in the clear, since > Python is the Licensee Software in that case. So, anybody can > distribute msvcr71 as "part of" Python. I guess it would really take a lawyer (well, probably several) to say whether distributing a frozen application is distributing Python or not. > OTOH, the other wording sounds like Python itself has to have > a click-wrap, tear-open, or signature EULA! IOW, the EULA > appears to prohibit free distribution of the runtime with a > program that has no EULA. > > So, in an amusing turn of events, the EULA actually appears > to forbid the current offering of Python for Windows, since > it does not have such a EULA. I presume that adding a "click-wrap" EULA to the Python .msi would not be difficult. Lots of other .msi's have "click-wrap" licenses, so there must be some sample code that can be used. The license is already in the distribution, it would just be displayed at an additional time. 
The EULA has to be no less restrictive than the MSVC one (presumably only in relation to the bits of MSVC that are being redistributed), so I guess a section at the end of the PSF license that duplicates the relevant bits of the MSVC one would work. (Of course, IANAL). =Tony.Meyer From tjreedy at udel.edu Thu Feb 3 03:11:31 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Thu Feb 3 03:11:51 2005 Subject: [Python-Dev] Re: Is msvcr71.dll re-redistributable? References: <5.1.1.6.0.20050202195401.038c04d0@mail.telecommunity.com> Message-ID: "Phillip J. Eby" wrote in message news:5.1.1.6.0.20050202195401.038c04d0@mail.telecommunity.com... > So, in an amusing turn of events, the EULA actually appears to forbid the > current offering of Python for Windows, since it does not have such a > EULA. Except of course that MS gave Python developers several copies of its newest compiler specifically for the purpose of compiling the Windows distribution. It would be nice to get a clear English statement from MS. I have dealt with the legalese in property sales agreements, lease agreements, and normal software licenses, but the quoted EULA snippets are the most obscure by far. Terry J. Reedy From anthony at interlink.com.au Thu Feb 3 11:30:57 2005 From: anthony at interlink.com.au (anthony@interlink.com.au) Date: Thu Feb 3 11:31:21 2005 Subject: [Python-Dev] Returned mail: Data format error Message-ID: <20050203103119.28BFB1E4003@bag.python.org> Your message was not delivered due to the following reason: Your message could not be delivered because the destination computer was unreachable within the allowed queue period. The amount of time a message is queued before it is returned depends on local configura- tion parameters. Most likely there is a network problem that prevented delivery, but it is also possible that the computer is turned off, or does not have a mail system running right now. Your message could not be delivered within 7 days: Mail server 190.102.237.222 is not responding. The following recipients could not receive this message: Please reply to postmaster@interlink.com.au if you feel this message to be in error. From skip at pobox.com Thu Feb 3 14:12:30 2005 From: skip at pobox.com (Skip Montanaro) Date: Thu Feb 3 14:12:02 2005 Subject: [Python-Dev] redux: fractional seconds in strptime In-Reply-To: <4201772B.90601@ocf.berkeley.edu> References: <16870.61059.451494.303971@montanaro.dyndns.org> <41E74790.60108@ocf.berkeley.edu> <16871.37525.981821.580939@montanaro.dyndns.org> <41E80995.5030901@ocf.berkeley.edu> <16872.3770.25143.582154@montanaro.dyndns.org> <41E83EB8.8060405@ocf.berkeley.edu> <4201772B.90601@ocf.berkeley.edu> Message-ID: <16898.9022.505916.761977@montanaro.dyndns.org> Brett> Everyone went silent on this topic. Does this mean people just Brett> stopped caring (which I doubt since I know Skip wants this bad Brett> enough to bring it up every so often)? Was it the issue of Brett> symmetry with strftime? I have a patch to do strptime() fractional seconds, but stumbled on the reverse direction (making strftime() accept fractional seconds). I'll submit a patch with what I have later today. I have to catch a train just now. Skip From Jack.Jansen at cwi.nl Thu Feb 3 15:15:37 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Thu Feb 3 15:16:08 2005 Subject: [Python-Dev] Is msvcr71.dll re-redistributable? 
In-Reply-To: <5.1.1.6.0.20050202195401.038c04d0@mail.telecommunity.com> References: <5.1.1.6.0.20050202195401.038c04d0@mail.telecommunity.com> Message-ID: On 3 Feb 2005, at 02:01, Phillip J. Eby wrote: > Sounds like this puts all Python users in the clear, since Python is > the Licensee Software in that case. So, anybody can distribute > msvcr71 as "part of" Python. > > OTOH, the other wording sounds like Python itself has to have a > click-wrap, tear-open, or signature EULA! IOW, the EULA appears to > prohibit free distribution of the runtime with a program that has no > EULA. > > So, in an amusing turn of events, the EULA actually appears to forbid > the current offering of Python for Windows, since it does not have > such a EULA. That was also my conclusion last year:-( But at least Python can still be distributed without msvcr71, putting the burden of obtaining it on the end user, because of Python's license. In another project we're using GPL, and careful reading (disclaimer: IANAL) has not convinced me that GPL and the EULA are compatible. Actually, I have this vague feeling that the MSVC 7 EULA (plus the fact that MS isn't shipping msvcr71.dll with Windows) might have been drafted specifically to be incompatible with the clause in GPL that doesn't allow you to link against third party libraries unless they're part of the OS. What we've done in that project is link with msvcr71.dll, but not include it in the installer. I think that we could (theoretically) still be dragged into court by the FSF, but at least not by Microsoft. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From gvanrossum at gmail.com Thu Feb 3 16:03:24 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Feb 3 16:03:29 2005 Subject: [Python-Dev] Wanted: members for Python Security Response Team Message-ID: If you read BugTraq, python-announce or the Daily Python URL today, you would have noticed a Python Security Advisory. (If you missed it: http://www.python.org/security/PSF-2005-001/ .) This was the first one issued in this form, but I'm sure it won't be the last one. Until now, we haven't had any infrastructure for this type of thing. In this particular case, the original discoverer first asked on c.l.py for advice on how to proceed, which yielded only unhelpful referrals to SF or python-dev. Then he wrote the authors of the affected module. Fredrik was so kind to forward it to me, and I happened to have time to deal with it. (Hey, I work for a security company, so I would have *made* time if I had to.) But I may not always be that responsive -- I could be busy, or traveling, or people might not think of mailing me. I believe it would be better if there was a "response team" for such situations. The response team would normally not have to do anything; they wouldn't have to be actively looking for security bugs, for example. But anyone with a (suspected) security problem related to Python would be able to email the team (e.g. security at python.org), trusting that the information would be kept confidential until a patch is developed; the response team would then investigate the problem and decide on an appropriate response. I want to be on the team; Barry also works for a security company and I hope he'll want to join (he can also make up a better acronym :-); I hope at least one person from the release team can be involved, e.g. 
Anthony; and I would like to see some more volunteers involved to have a good spread of availability and expertise. (How about a Windows user?) If you want to be on the team, send email to me *personally*. For discussion about the team's responsibilities and procedures, please follow up here. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Thu Feb 3 17:01:02 2005 From: skip at pobox.com (Skip Montanaro) Date: Thu Feb 3 17:00:21 2005 Subject: [Python-Dev] Wanted: members for Python Security Response Team In-Reply-To: References: Message-ID: <16898.19134.658304.948731@montanaro.dyndns.org> Guido> For discussion about the team's responsibilities and procedures, Guido> please follow up here. I noticed the checkins. I think there is one other necessary output: source patches against all the affected versions need to be made available so people can apply the patch to an existing installed version without needing to upgrade. Skip From gvanrossum at gmail.com Thu Feb 3 17:25:09 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Feb 3 17:25:42 2005 Subject: [Python-Dev] Wanted: members for Python Security Response Team In-Reply-To: <16898.19134.658304.948731@montanaro.dyndns.org> References: <16898.19134.658304.948731@montanaro.dyndns.org> Message-ID: > I noticed the checkins. I think there is one other necessary output: source > patches against all the affected versions need to be made available so > people can apply the patch to an existing installed version without needing > to upgrade. Patches for 2.2, 2.3 and 2.4 are on the website (python.org/security/PSF-2005-001/ has links). The module didn't exist before 2.2. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From 2004b at usenet.alexanderweb.de Thu Feb 3 17:36:59 2005 From: 2004b at usenet.alexanderweb.de (Alexander Schremmer) Date: Thu Feb 3 17:54:01 2005 Subject: [Python-Dev] Re: Is msvcr71.dll re-redistributable? References: <4qgwf31u.fsf@python.net> Message-ID: On Tue, 01 Feb 2005 21:17:17 +0100, Thomas Heller wrote: > The 2.4 python.org installer installs msvcr71.dll on the target system. > > If someone uses py2exe or a similar tool to create a frozen application, > is he allowed to redistribute this msvcr71.dll to other users together > with his application or not, even if he doesn't own MSVC? How about statically compiling the code? Then you do not need to distribute the runtime library. It should not make a big difference for the rather large file python24.dll Kind regards, Alexander From theller at python.net Thu Feb 3 19:37:40 2005 From: theller at python.net (Thomas Heller) Date: Thu Feb 3 19:36:15 2005 Subject: [Python-Dev] Re: Is msvcr71.dll re-redistributable? In-Reply-To: (Alexander Schremmer's message of "Thu, 3 Feb 2005 17:36:59 +0100") References: <4qgwf31u.fsf@python.net> Message-ID: <7jlpbibv.fsf@python.net> Alexander Schremmer <2004b@usenet.alexanderweb.de> writes: > On Tue, 01 Feb 2005 21:17:17 +0100, Thomas Heller wrote: > >> The 2.4 python.org installer installs msvcr71.dll on the target system. >> >> If someone uses py2exe or a similar tool to create a frozen application, >> is he allowed to redistribute this msvcr71.dll to other users together >> with his application or not, even if he doesn't own MSVC? > > How about statically compiling the code? Then you do not need to distribute > the runtime library. 
It should not make a big difference for the rather
> large file python24.dll

This would not work since each binary extension for Python 2.4 uses the dll runtime lib.

Thomas

From bac at OCF.Berkeley.EDU Fri Feb 4 02:39:39 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Fri Feb 4 02:39:48 2005 Subject: [Python-Dev] python-dev Summary for 2005-01-01 through 2005-01-15 [draft] Message-ID: <4202D25B.9030808@ocf.berkeley.edu>

Wow, another summary out the same week as the previous one! Perk of keeping things short and to the point. Then again keeping them this simple and short begs the question of whether the summaries are worth it still at that point. Regardless, probably will send this one out Saturday or Sunday so corrections need to get in by then.

-------------------------------------

=====================
Summary Announcements
=====================

PyCon_ will be upon us come late March! Still time to plan to go.

A warning on the thoroughness of this summary is in order. While trying to delete a single thread of email I managed to accidentally delete my entire python-dev mailbox. I did the best I could to retrieve the emails but it's possible I didn't resuscitate all of my emails, so I may have overlooked something.

.. _PyCon: http://www.pycon.org/

=======
Summary
=======

-------------
PEP movements
-------------

.. tip:: PEP updates by email are available as a topic from the `Python-checkins`_ mailing list.

`PEP 246`_ was a major topic of discussion during the time period covered by this summary. This all stemmed from `Guido's blog`_ entries on optional type checking. This led to a huge discussion on many aspects of protocols, interfaces, and adaptation and the broadening of this author's vocabulary to include "Liskov violation". "Monkey typing" also became a new term to know thanks to Phillip J. Eby's proto-PEP on the topic (found at http://peak.telecommunity.com/DevCenter/MonkeyTyping). Stemming from the phrase "monkey see, monkey do", it's Phillip's version of taking PEP 246 logically farther (I think; the whole thing is more than my currently burned-out-on-school brain can handle right now).

.. _Python-checkins: http://mail.python.org/mailman/listinfo/python-checkins
.. _PEP 246: http://www.python.org/peps/pep-0246.html
.. _Guido's blog: http://www.artima.com/weblogs/index.jsp?blogger=guido

Contributing threads:

- `getattr and __mro__ <>`__
- `Son of PEP 246, redux <>`__
- `PEP 246: lossless and stateless <>`__
- `PEP 246: LiskovViolation as a name <>`__
- `"Monkey Typing" pre-PEP, partial draft <>`__

------------------------------------------------------------------------------------
Optional type checking: how to inadvertently cause a flame war worse than decorators
------------------------------------------------------------------------------------

`Guido's blog`_ had comments on the idea of adding optional static type checking to Python. While just comments in a blog, it caused a massive response from people, mostly negative from what I gathered. After Guido discussed things some more it culminated in a blog entry found at http://www.artima.com/weblogs/viewpost.jsp?thread=87182 that lays out what his actual plans are. I highly recommend reading it since it suggests adding optional run-time type checking for function arguments along with some other proposals. All of this led to `PEP 246`_ getting updated. For some more details on that see the `PEP movements`_ section of this summary.
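For readers who have not followed the blog entries, a very rough sketch of what "optional run-time type checking for function arguments" could look like with 2.4 decorators (an illustration made up for this summary, not Guido's actual proposal)::

    def expects(*types):
        # Hypothetical decorator: check positional argument types at call time.
        def decorate(func):
            def wrapper(*args):
                for arg, wanted in zip(args, types):
                    if not isinstance(arg, wanted):
                        raise TypeError("%r is not a %s" % (arg, wanted.__name__))
                return func(*args)
            return wrapper
        return decorate

    @expects(int, basestring)
    def repeat(n, text):
        return text * n

Nothing like this is being added to the language yet; the decorator is only meant to show the general shape such checks might take.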
And if there is a lesson to be learned from all of this, it's that when Alex Martelli and Phillip J. Eby start a technical discussion it's going to be long, in-depth, complex, and lead to my inbox being brimming in python-dev email. ------------------------------ Let's get the AST branch done! ------------------------------ Guido posted an email to the list stating he would like to to make progress towards integrating "things like type inferencing, integrating PyChecker, or optional static type checking" into Python. In order to make that easier he put out a request that people work on the AST branch and finish it. For those that don't know about Python's back-end, the compiler as it stands now takes the parse tree from the parser and emits bytecode directly from that. This is far from optimal since the parse tree is more verbose than needed and it is not the easiest thing to work with. The AST branch attempts to fix this by taking a more traditional approach to compiling. This means the parse tree is used to generate an AST (abstract syntax tree; and even more technically could be considered a control flow graph in view of how it is implemented) which in turn is used to emit bytecode. The AST itself is much easier to work with when compared to the parse tree; better to know you are working with an 'if' guard thanks to it being an 'if' node in the AST than checking if the parse tree statement you are working with starts with 'if' and ends with a ':'. While all of this sounds great, the issue is the AST branch is not finished yet. It is not entirely far off, but new features from 2.4 (decorators and generator expressions) need to be added along with more bug fixing and clean up. This means the AST branch is going to get finished for 2.5 somehow. But help is needed. While the usual suspects who have previously contributed to the branch are hoping to finish it, more help is always appreciated. If you care to get involved, check out the AST branch (tagged as 'ast-branch' in CVS; see the `python-dev FAQ`_ on how to do a tagged branch checkout), read Python/compile.txt and just dive in! There will also be a sprint on the AST branch at PyCon. .. _python-dev FAQ: http://www.python.org/dev/devfaq.html Contributing threads: - `Please help complete the AST branch <>`__ - `Will ASTbranch compile on windows yet? <>`__ - `ast branch pragmatics <>`__ - `Re: [Python-checkins] python/dist/src/Python pythonrun.c, 2.161.2.15, 2.161.2.16 <>`__ -------------------------------- Ditching unbound methods in Py3k -------------------------------- Guido suggested removing unbound methods from Python since their usefulness of checking their first argument and other slight differences from functions just didn't seem worth keeping around and complicating the language. So the idea seems sound. But then people with uses for the extra information kept in unbound methods (im_func and im_self) popped up. To make the long thread short, enough people stepped up mentioning uses they had for the information for Guido to retract the suggestion in the name of backwards compatibility. But unbound methods are now on the list of things to go in Python 3000. 
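For reference, this is the behaviour and the extra information that would go away; an unbound method knows which class it came from and type-checks its first argument (illustrative Python 2.x session)::

    >>> class C(object):
    ...     def meth(self):
    ...         return 42
    ...
    >>> C.meth
    <unbound method C.meth>
    >>> C.meth.im_func, C.meth.im_class, C.meth.im_self
    (<function meth at 0x...>, <class '__main__.C'>, None)
    >>> C.meth(3.14)
    Traceback (most recent call last):
      ...
    TypeError: unbound method meth() must be called with C instance as first argument (got float instance instead)

It is this class reference and first-argument check that would disappear if C.meth simply returned the plain function.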
Contributing threads: - `Let's get rid of unbound methods <>`__ - `Getting rid of unbound methods: patch available <>`__ - `PEP 246 - concrete assistance to developers of new adapter classes <>`__ ------------------------------------------ Getting exceptions to be new-style classes ------------------------------------------ A patch to allow exceptions to be new-style classes is currently at http://www.python.org/1104669 . The plan is to get that patch in order, apply it, and as long as a ton of code does not break from exceptions moving from classic to new-style classes it will be made permanent in 2.5 . This in no way touches on the major changes as touched upon in a `previous summary `__ which will need a PEP to get the hierarchy cleaned up and discuss any possible changes to bar 'except' statements. Contributing threads: - `Exceptions *must*? be old-style classes? <>`__ =============== Skipped Threads =============== - Mac questions - 2.3.5 schedule, and something I'd like to get in - csv module TODO list - an idea for improving struct.unpack api - Minor change to behaviour of csv module - PATCH/RFC for AF_NETLINK support - logging class submission - Recent IBM Patent releases - frame.f_locals is writable - redux: fractional seconds in strptime - Darwin's realloc(...) implementation never shrinks allocations From kbk at shore.net Fri Feb 4 05:41:59 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Fri Feb 4 05:42:08 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200502040442.j144fxhi015740@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 284 open ( +4) / 2748 closed ( +1) / 3032 total ( +5) Bugs : 804 open ( +1) / 4812 closed (+13) / 5616 total (+14) RFE : 167 open ( +0) / 142 closed ( +1) / 309 total ( +1) New / Reopened Patches ______________________ Patch for Lib/bsddb/__init__.py to work with modulefinder (2005-01-31) http://python.org/sf/1112812 opened by Tony Meyer New tutorial tests in test_generators.py (2005-01-31) http://python.org/sf/1113421 opened by Francis Girard Add SSL certificate validation (2005-02-01) http://python.org/sf/1114345 opened by James Eagan support PY_LONGLONG in structmember (2005-02-02) http://python.org/sf/1115086 opened by Sam Rushing Add SSL certificate validation (2005-02-03) http://python.org/sf/1115631 opened by James Eagan Patches Closed ______________ Make history recall a-cyclic (2004-03-11) http://python.org/sf/914546 closed by kbk New / Reopened Bugs ___________________ Cannot ./configure on FC3 with gcc 3.4.2 (2005-01-26) CLOSED http://python.org/sf/1110007 reopened by liturgist cgi.FieldStorage memory usage can spike in line-oriented ops (2005-01-30) http://python.org/sf/1112549 opened by Chris McDonough patch 1079734 broke cgi.FieldStorage w/ multipart post req. (2005-01-31) http://python.org/sf/1112856 opened by Irmen de Jong ioctl has problems on 64 bit machines (2005-01-31) http://python.org/sf/1112949 opened by Stephen Norris move_file()'s return value when dry_run=1 unclear (2005-01-31) http://python.org/sf/1112955 opened by Eelis Please add do-while guard to Py_DECREF etc. 
(2005-01-31) http://python.org/sf/1113244 opened by Richard Kettlewell OSATerminology still semi-broken (2005-01-31) http://python.org/sf/1113328 opened by has document {m} regex matcher wrt empty matches (2005-01-31) http://python.org/sf/1113484 opened by Wummel keywords in keyword_arguments not possible (2005-02-01) CLOSED http://python.org/sf/1113984 opened by Christoph Zwerschke inicode.decode (2005-02-01) CLOSED http://python.org/sf/1114093 opened by Manlio Perillo copy.py bug (2005-02-02) http://python.org/sf/1114776 opened by Vincenzo Di Somma webbrowser doesn't start default Gnome browser by default (2005-02-02) http://python.org/sf/1114929 opened by Jeremy Sanders eval ! (2005-02-02) CLOSED http://python.org/sf/1115039 opened by Andrew Collier Built-in compile function with PEP 0263 encoding bug (2005-02-03) http://python.org/sf/1115379 opened by Christoph Zwerschke os.path.splitext don't handle unix hidden file correctly (2005-02-04) http://python.org/sf/1115886 opened by Jeong-Min Lee Bugs Closed ___________ broken link in tkinter docs (2005-01-24) http://python.org/sf/1108490 closed by jlgijsbers recursion core dumps (2005-01-26) http://python.org/sf/1110055 closed by tim_one install_lib fails under Python 2.1 (2004-11-02) http://python.org/sf/1058960 closed by loewis Double __init__.py executing (2004-06-22) http://python.org/sf/977250 closed by loewis Cannot ./configure on FC3 with gcc 3.4.2 (2005-01-26) http://python.org/sf/1110007 closed by liturgist IDLE hangs due to subprocess (2004-12-28) http://python.org/sf/1092225 closed by kbk Empty curses module is loaded in win32 (2004-07-12) http://python.org/sf/989333 closed by tebeka Tab / Space Configuration Does Not Work in IDLE (2003-08-05) http://python.org/sf/783887 closed by kbk Negative numbers to os.read() cause segfault (2004-12-01) http://python.org/sf/1077106 closed by mwh Time module missing from latest module index (2005-01-25) http://python.org/sf/1109523 closed by montanaro keywords in keyword_arguments not possible (2005-02-01) http://python.org/sf/1113984 closed by rhettinger unicode.decode (2005-02-01) http://python.org/sf/1114093 closed by lemburg eval ! (2005-02-02) http://python.org/sf/1115039 closed by rhettinger New / Reopened RFE __________________ All Statements Should Have Return Values (Syntax Proposal) (2005-02-01) CLOSED http://python.org/sf/1114404 opened by Lenny Domnitser RFE Closed __________ All Statements Should Have Return Values (Syntax Proposal) (2005-02-01) http://python.org/sf/1114404 closed by goodger From burt at dfki.de Fri Feb 4 15:04:33 2005 From: burt at dfki.de (burt@dfki.de) Date: Fri Feb 4 15:04:36 2005 Subject: [Python-Dev] JOB OPENING: Implementor for Python and Search Message-ID: <87fz0ch15a.fsf@dfki.uni-sb.de> I hope posting job vacancies does not violate established list netiquette. The job in question is mainly to do with the PyPy EU project. -- Alastair --- ---- Alastair Burt German Centre for AI (DFKI), Stuhlsatzenhausweg 3 Saarbruecken 66123, Germany Email: burt@dfki.de Tel: +49 681 302 2565 Fax: +49 681 302 5338 DFKI-LT - Job Opening The German Research Center for Artificial Intelligence (DFKI GmbH) is seeking for its Language Technology Lab a researcher/software developer with a strong background in Computer Science, who is interested in working on industrial R&D in the area of language implementation and support for the semantic web. The contract will be for approx. two years, with extensions being subject to availability of funding. 
Description of work The successful candidate will carry on research and collaborate on software design and implementation in the following areas: - Investigation of search in constraint and logic programming languages. - Investigation of query languages for the semantic web. - Design of conceptual framework for search in Python. - Implementation of search in Python using the facilities offered by PyPy. - Application of the new search functionality to support queries in ontology driven web sites. Requirements: - Good programming skills, particularly in the Python programming language. - Knowledge of semantic web technologies. - Excellent communication skills in English. - Working with high motivation in a team. Additional assets: - Knowledge of the implementation of Python. - Knowledge of constraint programming. Additional Information DFKI GmbH is located on the campus of Saarland University in Saarbr?cken, Germany. The university's research groups and curricula in the fields of Computational Linguistics and Computer Science are internationally renowned. The LT-Lab offers excellent working conditions in a well-established research group. The position provides opportunities to collaborate in a variety of international projects. The competitive salary is calculated according to qualifications based on DFKI GmbH scales. The successful candidate will have opportunities for improving their qualification. Please send your electronic application (preferably in PDF format) to lt-jobs@dfki.de, referring to job opening No. 200501, not later than February 15, 2005. A meaningful application should include a cover letter, a CV, a brief summary of research interests, a statement of interest in the position offered, and contact information for three references. From fdrake at acm.org Fri Feb 4 17:06:39 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri Feb 4 17:06:54 2005 Subject: [Python-Dev] JOB OPENING: Implementor for Python and Search In-Reply-To: <87fz0ch15a.fsf@dfki.uni-sb.de> References: <87fz0ch15a.fsf@dfki.uni-sb.de> Message-ID: <200502041106.22317.fdrake@acm.org> On Friday 04 February 2005 09:04, burt@dfki.de wrote: > I hope posting job vacancies does not violate established list > netiquette. The job in question is mainly to do with the PyPy EU project. There's a Python Job Board on python.org; see http://www.python.org/Jobs-howto.html for information on posting opportunities there. -Fred -- Fred L. Drake, Jr. From skip at pobox.com Fri Feb 4 17:31:05 2005 From: skip at pobox.com (Skip Montanaro) Date: Fri Feb 4 17:31:09 2005 Subject: [Python-Dev] JOB OPENING: Implementor for Python and Search In-Reply-To: <87fz0ch15a.fsf@dfki.uni-sb.de> References: <87fz0ch15a.fsf@dfki.uni-sb.de> Message-ID: <16899.41801.358248.884554@montanaro.dyndns.org> Alastair> I hope posting job vacancies does not violate established list Alistair> netiquette. The job in question is mainly to do with the PyPy Alistair> EU project. Not a huge faux pas, but you will get much better exposure by submitting it to the Python Job Board. Details on posting vacancies can be found here: http://www.python.org/Jobs-howto.html -- Skip Montanaro skip@mojam.com http://www.mojam.com/ From gvanrossum at gmail.com Fri Feb 4 19:46:52 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Feb 4 19:46:59 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Misc NEWS, 1.1237, 1.1238 In-Reply-To: References: Message-ID: [jhylton@users.sourceforge.net] > Log Message: > Add NEWS item about future parser bug. 
Give back the time machine! -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jhylton at gmail.com Fri Feb 4 20:00:25 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Fri Feb 4 20:00:29 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Misc NEWS, 1.1237, 1.1238 In-Reply-To: References: Message-ID: On Fri, 4 Feb 2005 10:46:52 -0800, Guido van Rossum wrote: > [jhylton@users.sourceforge.net] > > Log Message: > > Add NEWS item about future parser bug. > > Give back the time machine! I already will have by the time you needed it. Jeremy From Jack.Jansen at cwi.nl Sat Feb 5 00:46:11 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Sat Feb 5 00:46:18 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Misc NEWS, 1.1237, 1.1238 In-Reply-To: References: Message-ID: On 4-feb-05, at 20:00, Jeremy Hylton wrote: >>> Add NEWS item about future parser bug. >> >> Give back the time machine! > > I already will have by the time you needed it. I knew this was going to happen one day. (And now we should all be getting out our copies of the HHGTTG and work out the horrible future past conditional tense and such. It's probably having shall been in book 2). -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From jjl at pobox.com Sat Feb 5 00:57:23 2005 From: jjl at pobox.com (John J Lee) Date: Sat Feb 5 01:00:25 2005 Subject: [Python-Dev] cookielib patch Message-ID: Anyone like to commit 1028908? Patch was written by module author (me), including an important doc warning re (lack of) thread safety which I mistakenly thought had got into 2.4.0. John From jjl at pobox.com Sat Feb 5 01:06:37 2005 From: jjl at pobox.com (John J Lee) Date: Sat Feb 5 01:09:38 2005 Subject: [Python-Dev] Wanted: members for Python Security Response Team In-Reply-To: References: Message-ID: On Thu, 3 Feb 2005, Guido van Rossum wrote: [...] > hope at least one person from the release team can be involved, e.g. [...] Guido, from python-announce list: [...] > Python 2.3.5 will be released from www.python.org within a few days > containing a fix for this issue. Python 2.4.1 will be released later > this month containing the same fix. Patches for Python 2.2, 2.3 and > 2.4 are also immediately available: [...] Hope this question isn't too dumb: How will Python releases made in response to security bugs be done: will they just include the security fix (rather than being taken from CVS HEAD), without the usual alpha / beta testing cycle? Or what...? John From anthony at interlink.com.au Sat Feb 5 07:43:17 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Sat Feb 5 07:43:26 2005 Subject: [Python-Dev] 2.3.5 and 2.4.1 release plans Message-ID: <200502051743.18393.anthony@interlink.com.au> Ok, so here's the state of play: 2.3.5 is currently aimed for next Tuesday, but there's an outstanding issue - the new copy code appears to have broken something, see www.python.org/sf/1114776 for the gory details. I'm completely out of time this weekend to look into it too closely - if someone has 1/2 an hour and wants to do some triage on the bug, I'd appreciate it, a great deal. I'm currently thinking about a 2.4.1 around the 23td of Feb - Martin and Fred, does this work for you? There's a bunch of backporting that should probably happen for that - I will try to get some time to do this in the next week or so. -- Anthony Baxter It's never too late to have a happy childhood. 
From anthony at interlink.com.au Sat Feb 5 07:44:32 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Sat Feb 5 07:44:44 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python future.c, 2.14, 2.15 In-Reply-To: References: Message-ID: <200502051744.32779.anthony@interlink.com.au> On Saturday 05 February 2005 05:38, jhylton@users.sourceforge.net wrote: > Fix bug that allowed future statements virtually anywhere in a module. > > If we exit via the break here, we need to set ff_last_lineno or > FUTURE_POSSIBLE() will remain true. The bug affected statements > containing a variety of expressions, but not all expressions. It has > been present since Python 2.2. While this is undoubtedly a bug fix, I'm not sure that it should be backported - it will break people's code that is "working" now (albeit in a faulty way). What do people think? -- Anthony Baxter It's never too late to have a happy childhood. From python at rcn.com Sat Feb 5 08:31:26 2005 From: python at rcn.com (Raymond Hettinger) Date: Sat Feb 5 08:35:13 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python future.c, 2.14, 2.15 In-Reply-To: <200502051744.32779.anthony@interlink.com.au> Message-ID: <001701c50b54$b3c94e40$2c10c797@oemcomputer> [Anthony] > While this is undoubtedly a bug fix, I'm not sure that it should be > backported - it will break people's code that is "working" now (albeit > in a faulty way). What do people think? I concur -- the balance of risks is towards the patch causing more harm than good. Raymond From aleax at aleax.it Sat Feb 5 09:06:53 2005 From: aleax at aleax.it (Alex Martelli) Date: Sat Feb 5 09:06:51 2005 Subject: [Python-Dev] 2.3.5 and 2.4.1 release plans In-Reply-To: <200502051743.18393.anthony@interlink.com.au> References: <200502051743.18393.anthony@interlink.com.au> Message-ID: <346c27eb4c05f81a6e089b28d19079b5@aleax.it> On 2005 Feb 05, at 07:43, Anthony Baxter wrote: > Ok, so here's the state of play: 2.3.5 is currently aimed for next > Tuesday, > but there's an outstanding issue - the new copy code appears to have > broken something, see www.python.org/sf/1114776 for the gory details. > I'm completely out of time this weekend to look into it too closely - > if > someone has 1/2 an hour and wants to do some triage on the bug, I'd > appreciate it, a great deal. Done: the issue is easy to fix but not to reproduce, and I'd like to reproduce it so as to fix the unit tests, which currently don't catch the problem. The problem boils down to: deepcopying an instance of a type that doesn't have an __mro__ (and is not one of the many types explicitly recorded in the _deepcopy_dispatch dictionary, such as types.ClassType, types.InstanceType, etc, etc). The easy fix: instead of cls.__mro__ use inspect.getmro which deals with that specifically. Before I commit the fix: can anybody help out with an example of a type anywhere in the standard library that should be deepcopyable, used to be deepcopyable in 2.3.4, isn't one of those which get explicitly recorded in copy._deepcopy_dispatch, AND doesn't have an __mro__? Even the _testcapi.Copyable type magically grows an __mro__; I'm not sure how to MAKE a type w/o one... 
Thanks, Alex From jhylton at gmail.com Sat Feb 5 16:49:13 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Sat Feb 5 16:49:16 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python future.c, 2.14, 2.15 In-Reply-To: <001701c50b54$b3c94e40$2c10c797@oemcomputer> References: <200502051744.32779.anthony@interlink.com.au> <001701c50b54$b3c94e40$2c10c797@oemcomputer> Message-ID: On Sat, 5 Feb 2005 02:31:26 -0500, Raymond Hettinger wrote: > [Anthony] > > While this is undoubtedly a bug fix, I'm not sure that it should be > > backported - it will break people's code that is "working" now (albeit > > in a faulty way). What do people think? > > I concur -- the balance of risks is towards the patch causing more harm > than good. I would not backport it to Python 2.3. People have been using it for a long time. I'd be inclined to backport it to Python 2.4, which is still relatively new. If someone has buggy code, an upgrade is going to cause a problem for them at some point. Given how unlikely the risk is -- particularly given that division is the only useful future now -- I'd say the risk is acceptable for Python 2.4.1. (Unlike, say, Python 2.4.2.) Jeremy From aleax at aleax.it Sat Feb 5 17:01:18 2005 From: aleax at aleax.it (Alex Martelli) Date: Sat Feb 5 17:01:18 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python future.c, 2.14, 2.15 In-Reply-To: References: <200502051744.32779.anthony@interlink.com.au> <001701c50b54$b3c94e40$2c10c797@oemcomputer> Message-ID: <22e5d82ed8314b0280871488c3d75356@aleax.it> On 2005 Feb 05, at 16:49, Jeremy Hylton wrote: > On Sat, 5 Feb 2005 02:31:26 -0500, Raymond Hettinger > wrote: >> [Anthony] >>> While this is undoubtedly a bug fix, I'm not sure that it should be >>> backported - it will break people's code that is "working" now >>> (albeit >>> in a faulty way). What do people think? >> >> I concur -- the balance of risks is towards the patch causing more >> harm >> than good. > > I would not backport it to Python 2.3. People have been using it for > a long time. I'd be inclined to backport it to Python 2.4, which is > still relatively new. If someone has buggy code, an upgrade is going > to cause a problem for them at some point. Given how unlikely the > risk is -- particularly given that division is the only useful future > now -- I'd say the risk is acceptable for Python 2.4.1. (Unlike, say, > Python 2.4.2.) +1 on having the fix in 2.4.1 but not in 2.3.5 -- exactly for the reasons Jeremy is giving. Alex From gvanrossum at gmail.com Sat Feb 5 17:02:46 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat Feb 5 17:02:49 2005 Subject: [Python-Dev] Wanted: members for Python Security Response Team In-Reply-To: References: Message-ID: > How will Python releases made in response to security bugs be done: will > they just include the security fix (rather than being taken from CVS > HEAD), without the usual alpha / beta testing cycle? Or what...? Depends where you get the release. *Vendors* (ActiveState, Red Hat, Ubuntu, Debian, etc.) typically release a new version that has *just* the fix; they have the infrastructure in place to do this sort of thing quickly and to let their customers benefit quickly. On python.org, however, we tend to take the maintenance branch for a particular version (e.g. 2.3.x or 2.4.x), add the fix, and accellerate the release. For example, we'll release 2.3.5 next week, and 2.4.1 probably some time this month. 
(In addition, of course, we publish the raw patch; also, we might end up making exceptions and/or start following the vendors' example in some or all cases). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Sat Feb 5 21:31:34 2005 From: skip at pobox.com (Skip Montanaro) Date: Sat Feb 5 21:31:06 2005 Subject: [Python-Dev] Wanted: members for Python Security Response Team In-Reply-To: References: Message-ID: <16901.11558.650334.340590@montanaro.dyndns.org> >> How will Python releases made in response to security bugs be done: >> will they just include the security fix (rather than being taken from >> CVS HEAD), without the usual alpha / beta testing cycle? Or what...? Guido> On python.org, however, we tend to take the maintenance branch Guido> for a particular version (e.g. 2.3.x or 2.4.x), add the fix, and Guido> accellerate the release. Would it be possible to release a 2.3.4a that has just the fix over and above the released version? In this case it turns out that the fix nearly coincided with the release of 2.3.5 and 2.4.1. Would you do an accelerated release if this had come up right after they were released? Skip From python at rcn.com Sat Feb 5 21:44:34 2005 From: python at rcn.com (Raymond Hettinger) Date: Sat Feb 5 21:48:29 2005 Subject: [Python-Dev] Wanted: members for Python Security Response Team In-Reply-To: <16901.11558.650334.340590@montanaro.dyndns.org> Message-ID: <001b01c50bc3$81f3e460$fa01a044@oemcomputer> > Would it be possible to release a 2.3.4a that has just the fix over and > above the released version? In this case it turns out that the fix nearly > coincided with the release of 2.3.5 and 2.4.1. Would you do an > accelerated > release if this had come up right after they were released? Just go to 2.3.6. No need to add a further complication to the numbering scheme. Raymond From tjreedy at udel.edu Sat Feb 5 23:22:49 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Sat Feb 5 23:23:08 2005 Subject: [Python-Dev] Re: Wanted: members for Python Security Response Team References: <16901.11558.650334.340590@montanaro.dyndns.org> <001b01c50bc3$81f3e460$fa01a044@oemcomputer> Message-ID: "Raymond Hettinger" wrote in message news:001b01c50bc3$81f3e460$fa01a044@oemcomputer... >> Would it be possible to release a 2.3.4a that has just the fix over > and >> above the released version? In this case it turns out that the fix > nearly >> coincided with the release of 2.3.5 and 2.4.1. Would you do an >> accelerated >> release if this had come up right after they were released? > Just go to 2.3.6. No need to add a further complication to the > numbering scheme. As I remember, 2.3.1 was precedent for this -- a quick fix-one-critical-item release about a week after 2.3. Perhaps Python.org should have a release-announcement-only mailing list for people who would not get the news any other way. And/or perhaps final release announcements and security warnings could be made on the various Python-application mail lists if not so done already. Terry J. 
Reedy From ncoghlan at iinet.net.au Sun Feb 6 03:31:54 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sun Feb 6 03:32:00 2005 Subject: [Python-Dev] Re: Wanted: members for Python Security Response Team In-Reply-To: References: <16901.11558.650334.340590@montanaro.dyndns.org> <001b01c50bc3$81f3e460$fa01a044@oemcomputer> Message-ID: <4205819A.4000108@iinet.net.au> Terry Reedy wrote: > Perhaps Python.org should have a release-announcement-only mailing list for > people who would not get the news any other way. And/or perhaps final > release announcements and security warnings could be made on the various > Python-application mail lists if not so done already. Alternately, could some topics be set up on the existing lists? (ala the new PEP topic for the checkins list). Regards, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From tim.peters at gmail.com Sun Feb 6 08:34:00 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sun Feb 6 08:34:04 2005 Subject: [Python-Dev] 2.3.5 and 2.4.1 release plans In-Reply-To: <346c27eb4c05f81a6e089b28d19079b5@aleax.it> References: <200502051743.18393.anthony@interlink.com.au> <346c27eb4c05f81a6e089b28d19079b5@aleax.it> Message-ID: <1f7befae05020523344e36fb3e@mail.gmail.com> [Anthony Baxter] >> Ok, so here's the state of play: 2.3.5 is currently aimed for next >> Tuesday, but there's an outstanding issue - the new copy code appears >> to have broken something, see www.python.org/sf/1114776 for the gory >> details. ... [Alex Martelli] > The problem boils down to: deepcopying an instance of a type that > doesn't have an __mro__ (and is not one of the many types explicitly > recorded in the _deepcopy_dispatch dictionary, such as types.ClassType, > types.InstanceType, etc, etc). > > The easy fix: instead of cls.__mro__ use inspect.getmro which deals > with that specifically. > > Before I commit the fix: can anybody help out with an example of a type > anywhere in the standard library that should be deepcopyable, used to > be deepcopyable in 2.3.4, isn't one of those which get explicitly > recorded in copy._deepcopy_dispatch, AND doesn't have an __mro__? Even > the _testcapi.Copyable type magically grows an __mro__; I'm not sure > how to MAKE a type w/o one... Since the original bug report came from Zopeland, chances are good (although the report is too vague to be sure) that the problem involves ExtensionClass. That's complicated C code in Zope predating new-style classes, making it possible to build Python-class-like objects in C code under old Pythons. In general, EC-derived classes don't play well with newer Python features (well, at least not until Zope 2.8, where ExtensionClass is recoded as a new-style Python class -- but still keeping some semantics from old-style classes ... ). Anyway, I expect that instances of any EC-derived class would have the problem in the bug report. For example, the base Persistent class in ZODB 3.2.5 is an ExtensionClass: $ \python23\python.exe Python 2.3.5c1 (#61, Jan 25 2005, 19:52:06) [MSC v.1200 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import ZODB # don't ask -- it's necessary to import this first >>> from Persistence import Persistent >>> p = Persistent() >>> import copy >>> copy.deepcopy(p) # deepcopy() barfs on __mro__ Traceback (most recent call last): File "", line 1, in ? 
File "C:\Python23\lib\copy.py", line 200, in deepcopy copier = _getspecial(cls, "__deepcopy__") File "C:\Python23\lib\copy.py", line 66, in _getspecial for basecls in cls.__mro__: AttributeError: __mro__ >>> copy.copy(p) # copy() does too Traceback (most recent call last): File "", line 1, in ? File "C:\Python23\lib\copy.py", line 86, in copy copier = _getspecial(cls, "__copy__") File "C:\Python23\lib\copy.py", line 66, in _getspecial for basecls in cls.__mro__: AttributeError: __mro__ Unsure whether this is enough, but at least inspect.getmro() isn't phased by an EC-derived class: >>> inspect.getmro(Persistent) (,) More info from the bug report filer is really needed. A problem is that this stuff doesn't appear "to work" under Python 2.3.4 either: $ ../Python-2.3.4/python Python 2.3.4 (#1, Aug 9 2004, 17:15:36) [GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import ZODB >>> from Persistence import Persistent >>> p = Persistent() >>> import copy >>> copy.deepcopy(p) Traceback (most recent call last): File "", line 1, in ? File "/home/tim/Python-2.3.4/Lib/copy.py", line 206, in deepcopy y = _reconstruct(x, rv, 1, memo) File "/home/tim/Python-2.3.4/Lib/copy.py", line 338, in _reconstruct y = callable(*args) TypeError: ExtensionClass object argument after * must be a sequence >>> copy.copy(p) Traceback (most recent call last): File "", line 1, in ? File "/home/tim/Python-2.3.4/Lib/copy.py", line 95, in copy return _reconstruct(x, rv, 0) File "/home/tim/Python-2.3.4/Lib/copy.py", line 338, in _reconstruct y = callable(*args) TypeError: ExtensionClass object argument after * must be a sequence >>> From aleax at aleax.it Sun Feb 6 09:07:30 2005 From: aleax at aleax.it (Alex Martelli) Date: Sun Feb 6 09:07:30 2005 Subject: [Python-Dev] 2.3.5 and 2.4.1 release plans In-Reply-To: <1f7befae05020523344e36fb3e@mail.gmail.com> References: <200502051743.18393.anthony@interlink.com.au> <346c27eb4c05f81a6e089b28d19079b5@aleax.it> <1f7befae05020523344e36fb3e@mail.gmail.com> Message-ID: On 2005 Feb 06, at 08:34, Tim Peters wrote: ... >> The easy fix: instead of cls.__mro__ use inspect.getmro which deals >> with that specifically. ... > Since the original bug report came from Zopeland, chances are good > (although the report is too vague to be sure) that the problem > involves ExtensionClass. That's complicated C code in Zope predating True, of course. Still, any type w/o an __mro__ that's not recorded in the dispatch table will tickle the same bug -- give the same traceback, at least (if the original submitter would then proceed to tickle more bugs once this one's solved, I can't know, of course -- but this one does need fixing). > Unsure whether this is enough, but at least inspect.getmro() isn't > phased by an EC-derived class: I'm pretty sure it's enough -- at least for SOME "types w/o __mro__". Thanks to a suggestion from John Lenton on c.l.py, I was able to make a unit test based on: class C(type): def __getattribute__(self, attr): if attr == '__mro__': raise AttributeError, "What, *me*, a __mro__? Nevah!" return super(C, self).__getattribute__(attr) class D(object): __metaclass__ = C Cheating, maybe, but it does show that the 2.3.5rc1 copy.py breaks and moving to inspect.mro repairs the break, which is all one really asks of a tiny unit test;-). So, I've committed test and fix on the 2.3 maintenance branch and marked the bug as fixed. 
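Written out as a runnable snippet (Python 2.x), the trick is just a metaclass that refuses to reveal __mro__; inspect.getmro() copes by falling back to walking __bases__:

    import inspect

    class C(type):
        def __getattribute__(self, attr):
            if attr == '__mro__':
                raise AttributeError("What, *me*, a __mro__?  Nevah!")
            return super(C, self).__getattribute__(attr)

    class D(object):
        __metaclass__ = C

    # D.__mro__ raises AttributeError, so code that blindly walks
    # cls.__mro__ (as the copy.py in the 2.3.5 release candidate did)
    # blows up here, while inspect.getmro(D) still returns (D, object)
    # by falling back to __bases__.
    print inspect.getmro(D)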
(Hmmmm, is it only me, or is sourceforce bug browsing broken for bugs with 7-digits numbers? This one was 1114776 -- first one w/a 7-digit number I had yet seen -- and in no way could I get the browser to list it, it kept listing only 6-digit ones...). Alex From skip at pobox.com Sun Feb 6 17:49:05 2005 From: skip at pobox.com (Skip Montanaro) Date: Sun Feb 6 17:49:12 2005 Subject: [Python-Dev] list of constants -> tuple of constants In-Reply-To: References: Message-ID: <16902.19073.787609.523027@montanaro.dyndns.org> In a python-checkins message, Raymond stated: Raymond> Replace list of constants with tuples of constants. I understand the motivation here (the peephole optimizer can convert a tuple of constants into a single constant that need not be constructed over and over), but is the effort worth the cost of changing the logical nature of the data structures used? If lists are conceptually like vectors or arrays in other languages and tuples are like C structs or Pascal records, then by converting from list to tuple form you've somehow muddied the data structure water just to take advantage of tuples' immutability. Wouldn't it be better to have the peephole optimizer recognize the throwaway nature of lists in these contexts: for elt in [1, 2, 4, 8, 16]: ... if foo in [list, tuple]: ... (anywhere a list of constants immediately follows the "in" or "not in" keywords) and convert them into constants? The cases you converted all matched that usage. Skip From gvanrossum at gmail.com Sun Feb 6 17:54:58 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun Feb 6 17:55:03 2005 Subject: [Python-Dev] list of constants -> tuple of constants In-Reply-To: <16902.19073.787609.523027@montanaro.dyndns.org> References: <16902.19073.787609.523027@montanaro.dyndns.org> Message-ID: On Sun, 6 Feb 2005 10:49:05 -0600, Skip Montanaro wrote: > > In a python-checkins message, Raymond stated: > > Raymond> Replace list of constants with tuples of constants. > > I understand the motivation here (the peephole optimizer can convert a tuple > of constants into a single constant that need not be constructed over and > over), but is the effort worth the cost of changing the logical nature of > the data structures used? If lists are conceptually like vectors or arrays > in other languages and tuples are like C structs or Pascal records, then by > converting from list to tuple form you've somehow muddied the data structure > water just to take advantage of tuples' immutability. > > Wouldn't it be better to have the peephole optimizer recognize the throwaway > nature of lists in these contexts: > > for elt in [1, 2, 4, 8, 16]: > ... > > if foo in [list, tuple]: > ... > > (anywhere a list of constants immediately follows the "in" or "not in" > keywords) and convert them into constants? The cases you converted all > matched that usage. I'm with Skip, *unless* the change is in a PROVEN TIME-CRITICAL PIECE OF CODE. Let's not hand-micro-optimize code just because we can. 
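For reference, the difference being discussed is easy to see with dis (Python 2.4, illustrative; exact bytecode varies by version):

    import dis

    # With a tuple literal, the 2.4 peepholer folds the constants into a
    # single prebuilt tuple, loaded with one LOAD_CONST before the
    # COMPARE_OP 'in'.
    dis.dis(compile("x in (1, 2, 3)", "<example>", "eval"))

    # With a list literal, each element is loaded separately and a
    # BUILD_LIST runs on every evaluation, so the list is rebuilt each
    # time -- which is what the proposed peephole rule would avoid.
    dis.dis(compile("x in [1, 2, 3]", "<example>", "eval"))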
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From nnorwitz at gmail.com Sun Feb 6 19:05:46 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun Feb 6 19:05:51 2005 Subject: [Python-Dev] list of constants -> tuple of constants In-Reply-To: <16902.19073.787609.523027@montanaro.dyndns.org> References: <16902.19073.787609.523027@montanaro.dyndns.org> Message-ID: On Sun, 6 Feb 2005 10:49:05 -0600, Skip Montanaro wrote: > > Wouldn't it be better to have the peephole optimizer recognize the throwaway > nature of lists in these contexts: > > for elt in [1, 2, 4, 8, 16]: > ... > > if foo in [list, tuple]: > ... > > (anywhere a list of constants immediately follows the "in" or "not in" > keywords) and convert them into constants? The cases you converted all > matched that usage. I think I implemented this once. I'll try to see if I can find a patch. It wasn't too difficult, but I'm not sure if the patch was clean. Neal From python at rcn.com Sun Feb 6 19:03:56 2005 From: python at rcn.com (Raymond Hettinger) Date: Sun Feb 6 19:07:42 2005 Subject: [Python-Dev] list of constants -> tuple of constants In-Reply-To: <16902.19073.787609.523027@montanaro.dyndns.org> Message-ID: <001001c50c76$3a298140$8abb9d8d@oemcomputer> [Skip] > If lists are conceptually like vectors or > arrays > in other languages and tuples are like C structs or Pascal records, then > by > converting from list to tuple form you've somehow muddied the data > structure > water just to take advantage of tuples' immutability. In the context of literals used with the "in" operator, practices are widely divergent within the standard library and within the tutorial. Even within a single module, there were arbitrary switches between "x in [1,2,3]" and "x in (1,2,3)" and "x in 1,2,3". It seems that the list-as-arrays-tuple-as-records guideline is not meaningful or applicable in the context of the "in" operator. Proscribing tuple.__contains__ and tuple.__iter__ carrys the notion a bit too far. > Wouldn't it be better to have the peephole optimizer recognize the > throwaway > nature of lists That's a good idea. Implementing it will be more straight-forward after the AST branch gets completed. Raymond From python at rcn.com Sun Feb 6 19:15:30 2005 From: python at rcn.com (Raymond Hettinger) Date: Sun Feb 6 19:19:16 2005 Subject: [Python-Dev] list of constants -> tuple of constants In-Reply-To: Message-ID: <001101c50c77$d7cff9a0$8abb9d8d@oemcomputer> [Neal] > I think I implemented this once. I'll try to see if I can find a > patch. It wasn't too difficult, but I'm not sure if the patch was > clean. If the opportunity arises, another worthwhile peepholer buildout would be to recognize if-elif chains that can be transformed to a single lookup and dispatch (see MAL's note in pep 275). Raymond From python at rcn.com Sun Feb 6 19:42:24 2005 From: python at rcn.com (Raymond Hettinger) Date: Sun Feb 6 19:46:11 2005 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib/test test_copy.py, 1.11.8.1, 1.11.8.2 In-Reply-To: Message-ID: <001701c50c7b$9a25b3c0$8abb9d8d@oemcomputer> > Modified Files: > Tag: release23-maint > test_copy.py > Log Message: > fix bug 1114776 Don't forget release24-maint. 
Raymond From skip at pobox.com Sun Feb 6 20:13:51 2005 From: skip at pobox.com (Skip Montanaro) Date: Sun Feb 6 20:13:57 2005 Subject: [Python-Dev] list of constants -> tuple of constants In-Reply-To: <001001c50c76$3a298140$8abb9d8d@oemcomputer> References: <16902.19073.787609.523027@montanaro.dyndns.org> <001001c50c76$3a298140$8abb9d8d@oemcomputer> Message-ID: <16902.27759.886169.551985@montanaro.dyndns.org> Raymond> [Skip] >> If lists are conceptually like vectors or arrays in other languages >> and tuples are like C structs or Pascal records, then by converting >> from list to tuple form you've somehow muddied the data structure >> water just to take advantage of tuples' immutability. Raymond> In the context of literals used with the "in" operator, Raymond> practices are widely divergent within the standard library and Raymond> within the tutorial. Then perhaps we should strive to make the standard library and tutorial more consistent. Answers to questions on c.l.py often advocate the standard library as a good source for example code. Raymond> It seems that the list-as-arrays-tuple-as-records guideline is Raymond> not meaningful or applicable in the context of the "in" Raymond> operator. Proscribing tuple.__contains__ and tuple.__iter__ Raymond> carrys the notion a bit too far. I agree that the presence of __contains__ and __iter__ kind of blurs the distinction between the concept of sequence and struct. Skip From raymond.hettinger at verizon.net Mon Feb 7 08:21:33 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Mon Feb 7 08:25:37 2005 Subject: [Python-Dev] Other library updates Message-ID: <000501c50ce5$a70b31e0$b806a044@oemcomputer> Any objections to replacing the likes of types.IntType and types.ListType with int and list? Raymond From doko at cs.tu-berlin.de Mon Feb 7 14:36:32 2005 From: doko at cs.tu-berlin.de (Matthias Klose) Date: Mon Feb 7 14:36:39 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <1107726549.20128.12.camel@localhost> References: <1107726549.20128.12.camel@localhost> Message-ID: <16903.28384.621922.349@gargle.gargle.HOWL> A Debian user pointed out (http://bugs.debian.org/293932), that the current license for the Python profiler is not conforming to the DFSG (Debian free software guidelines). http://www.python.org/doc/current/lib/node829.html states "This permission is explicitly restricted to the copying and modification of the software to remain in Python, compiled Python, or other languages (such as C) wherein the modified or derived code is exclusively imported into a Python module." The DFSG, http://www.debian.org/doc/debian-policy/ch-archive.html#s-dfsg, third paragraph state: "Derived Works The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software." - Does somebody knows about the history of this license, why it is more restricted than the Python license? - Is there a chance to change the license for these two modules (profile.py, pstats.py)? The md5.h/md5c.c files allow "copy and use", but no modification of the files. There are some alternative implementations, i.e. in glibc, openssl, so a replacement should be sage. Any other requirements when considering a replacement? 
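Whichever implementation ends up underneath, the interface it would have to keep providing is small (illustrative session, using the RFC 1321 "abc" test vector):

    import md5

    m = md5.new()        # md5.md5() is an alias
    m.update("abc")
    assert m.hexdigest() == "900150983cd24fb0d6963f7d28e17f72"
    assert len(m.digest()) == 16     # raw 16-byte digest
    m2 = m.copy()        # independent copy of the hashing state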
Matthias From aleax at aleax.it Mon Feb 7 14:49:56 2005 From: aleax at aleax.it (Alex Martelli) Date: Mon Feb 7 14:50:02 2005 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib/test test_copy.py, 1.11.8.1, 1.11.8.2 In-Reply-To: <001701c50c7b$9a25b3c0$8abb9d8d@oemcomputer> References: <001701c50c7b$9a25b3c0$8abb9d8d@oemcomputer> Message-ID: On 2005 Feb 06, at 19:42, Raymond Hettinger wrote: >> Modified Files: >> Tag: release23-maint >> test_copy.py >> Log Message: >> fix bug 1114776 > > Don't forget release24-maint. Done -- but the maintenance branch of 2.4 has a problem right now: it doesn't pass unit tests, specifically test_os (I checked right after a cvs up and before doing any changes, of course). This appears to be connected to: mapping_tests.py being very strict (or something) and demanding that some mapping be able to update itself from a ``simple dictionary'' that's not iterable and does not have an .items method either; while the _Environ class in os.py appears to make some reasonable demands from the argument to its .update method. I'm not _sure_ which side of the dispute is in the right, so I haven't changed anything there (even though committing anything with unit tests broken makes my teeth grit). I do admit that this kind of issue makes a good case for more formalized interfaces...;-) Alex From skip at pobox.com Mon Feb 7 14:58:24 2005 From: skip at pobox.com (Skip Montanaro) Date: Mon Feb 7 14:58:29 2005 Subject: [Python-Dev] Other library updates In-Reply-To: <000501c50ce5$a70b31e0$b806a044@oemcomputer> References: <000501c50ce5$a70b31e0$b806a044@oemcomputer> Message-ID: <16903.29696.379798.207105@montanaro.dyndns.org> Raymond> Any objections to replacing the likes of types.IntType and Raymond> types.ListType with int and list? +1 Skip From gvanrossum at gmail.com Mon Feb 7 17:31:08 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Feb 7 17:31:18 2005 Subject: [Python-Dev] Other library updates In-Reply-To: <000501c50ce5$a70b31e0$b806a044@oemcomputer> References: <000501c50ce5$a70b31e0$b806a044@oemcomputer> Message-ID: > Any objections to replacing the likes of types.IntType and > types.ListType with int and list? I presume in isinstance tests etc.? In general the procedure for modernizing source code is not to touch it unless you're reviewing or editing the whole module (or at least part of it) anyway. This would be a good occasion to see if perhaps the tests you find are formulated too narrowly -- e.g. isinstance(x, int) should almost always be isinstance(x, (int, long)), and isinstance(x, list) is also often a poorly written test for "sequence-ness". -- --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Mon Feb 7 17:52:55 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon Feb 7 18:02:07 2005 Subject: [Python-Dev] Re: python/dist/src/Lib DocXMLRPCServer.py, 1.4, 1.5 cookielib.py, 1.6, 1.7 copy.py, 1.43, 1.44 optparse.py, 1.12, 1.13 pickle.py, 1.160, 1.161 subprocess.py, 1.13, 1.14 unittest.py, 1.37, 1.38 xmlrpclib.py, 1.36, 1.37 References: Message-ID: > Reduce the usage of the types module. > Index: xmlrpclib.py > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Lib/xmlrpclib.py,v # Notes: # this version is designed to work with Python 2.1 or newer. 
> - dispatch[IntType] = dump_int > + dispatch[int] = dump_int $ python2.1 >>> type(0) == int 0 >>> type([]) == list 0 >>> type({}) == dict Traceback (most recent call last): File "", line 1, in ? NameError: name 'dict' is not defined From raymond.hettinger at verizon.net Tue Feb 8 06:44:45 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue Feb 8 06:48:52 2005 Subject: [Python-Dev] test_codecs failing Message-ID: <000501c50da1$56c25800$52b79d8d@oemcomputer> The most recent test_codecs check-in (1.19) is failing on a MSCV6.0 compilation running on WinMe: ---------------------------------------------------------------------- Ran 35 tests in 1.430s FAILED (failures=1) Traceback (most recent call last): File "\py25\lib\test\test_codecs.py", line 786, in ? test_main() File "\py25\lib\test\test_codecs.py", line 781, in test_main BasicStrTest File "C:\PY25\lib\test\test_support.py", line 290, in run_unittest run_suite(suite, testclass) File "C:\PY25\lib\test\test_support.py", line 275, in run_suite raise TestFailed(err) test.test_support.TestFailed: Traceback (most recent call last): File "\py25\lib\test\test_codecs.py", line 165, in test_badbom self.assertRaises(UnicodeError, f.read) AssertionError: UnicodeError not raised C:\pydev>python Python 2.5a0 (#46, Feb 7 2005, 21:37:18) [MSC v.1200 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. Raymond From anthony at interlink.com.au Tue Feb 8 06:53:04 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue Feb 8 06:54:14 2005 Subject: [Python-Dev] BRANCH FREEZE for 2.3.5 Message-ID: <200502081653.05122.anthony@interlink.com.au> Can people stay off the release23-maint branch while we cut 2.3.5 (final), starting in about 5 hours time (say, around 1200 UTC). Thanks, Anthony -- Anthony Baxter It's never too late to have a happy childhood. From fredrik at pythonware.com Tue Feb 8 09:43:13 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Feb 8 09:43:29 2005 Subject: [Python-Dev] Re: python/dist/src/Lib rfc822.py,1.78,1.79 References: Message-ID: rhettinger@users.sourceforge.net wrote: > @@ -399,9 +393,8 @@ > del self[name] # Won't fail if it doesn't exist > self.dict[name.lower()] = value > text = name + ": " + value > - lines = text.split("\n") > - for line in lines: > - self.headers.append(line + "\n") > + self.headers.extend(text.splitlines(True)) > + self.headers.append('\n') and you're 100% sure that the change in how things are stored in headers won't affect any existing code? (the docstring says that headers contain a list of lines, which is no longer true) From mdehoon at ims.u-tokyo.ac.jp Tue Feb 8 10:08:52 2005 From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon) Date: Tue Feb 8 10:04:48 2005 Subject: [Python-Dev] Patch review [ 981773 ] crach link c++ extension by mingw Message-ID: <420881A4.3000601@ims.u-tokyo.ac.jp> Patch review [ 981773 ] crach link c++ extension by mingw When building a C++ extension for Windows using MinGW, the linking would fail due to an incorrect link command. The patch contains a solution for this problem. I could reproduce this bug with Python 2.3.5c1, but in Python 2.4 it seems to have been fixed. Using this Python version: '2.4 (#60, Nov 30 2004, 11:49:19) [MSC v.1310 32 bit (Intel)]' a C++ extension compiled and linked correctly with MinGW. So I think this patch is no longer needed (except if we want to back-port it to 2.3.5, which I doubt). --Michiel. 
-- Michiel de Hoon, Assistant Professor University of Tokyo, Institute of Medical Science Human Genome Center 4-6-1 Shirokane-dai, Minato-ku Tokyo 108-8639 Japan http://bonsai.ims.u-tokyo.ac.jp/~mdehoon From walter at livinglogic.de Tue Feb 8 11:11:31 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Feb 8 11:11:34 2005 Subject: [Python-Dev] test_codecs failing In-Reply-To: <000501c50da1$56c25800$52b79d8d@oemcomputer> References: <000501c50da1$56c25800$52b79d8d@oemcomputer> Message-ID: <42089053.7090904@livinglogic.de> Raymond Hettinger wrote: > The most recent test_codecs check-in (1.19) is failing on a MSCV6.0 > compilation running on WinMe: > > ---------------------------------------------------------------------- > Ran 35 tests in 1.430s > > FAILED (failures=1) > Traceback (most recent call last): > [...] > test.test_support.TestFailed: Traceback (most recent call last): > File "\py25\lib\test\test_codecs.py", line 165, in test_badbom > self.assertRaises(UnicodeError, f.read) > AssertionError: UnicodeError not raised Fixed. But the question remains: Why does a StreamWriter have a read() method? Bye, Walter D?rwald From mal at egenix.com Tue Feb 8 11:34:32 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Feb 8 11:34:36 2005 Subject: [Python-Dev] test_codecs failing In-Reply-To: <42089053.7090904@livinglogic.de> References: <000501c50da1$56c25800$52b79d8d@oemcomputer> <42089053.7090904@livinglogic.de> Message-ID: <420895B8.6050300@egenix.com> Walter D?rwald wrote: > Raymond Hettinger wrote: > >> The most recent test_codecs check-in (1.19) is failing on a MSCV6.0 >> compilation running on WinMe: >> >> ---------------------------------------------------------------------- >> Ran 35 tests in 1.430s >> >> FAILED (failures=1) >> Traceback (most recent call last): > > > [...] > >> test.test_support.TestFailed: Traceback (most recent call last): >> File "\py25\lib\test\test_codecs.py", line 165, in test_badbom >> self.assertRaises(UnicodeError, f.read) >> AssertionError: UnicodeError not raised > > > Fixed. But the question remains: Why does a StreamWriter have > a read() method? It inherits that method from the underlying stream - just as all other methods and attributes that the stream defines and which are not overridden by the StreamWriter methods. This approach was taken to make it possible to user StreamWriter (and StreamReader) instance as drop-in replacement in situations where the application normally expects a file-like object. Note that a file opened in write mode also exposes a read() method. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 08 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From fredrik at pythonware.com Tue Feb 8 10:10:49 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Feb 8 16:26:33 2005 Subject: [Python-Dev] Re: python/dist/src/Lib rfc822.py,1.78,1.79 References: Message-ID: >> @@ -399,9 +393,8 @@ >> del self[name] # Won't fail if it doesn't exist >> self.dict[name.lower()] = value >> text = name + ": " + value >> - lines = text.split("\n") >> - for line in lines: >> - self.headers.append(line + "\n") >> + self.headers.extend(text.splitlines(True)) >> + self.headers.append('\n') > > and you're 100% sure that the change in how things are stored > in headers won't affect any existing code? > > (the docstring says that headers contain a list of lines, which is no > longer true) and the module documentation says: Each line contains a trailing newline. The blank line terminating the headers is not contained in the list. which is no longer true (unless I'm missing something here) From gvanrossum at gmail.com Tue Feb 8 16:35:17 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Feb 8 16:35:21 2005 Subject: [Python-Dev] Re: python/dist/src/Lib rfc822.py,1.78,1.79 In-Reply-To: References: Message-ID: On Tue, 8 Feb 2005 10:10:49 +0100, Fredrik Lundh wrote: > > >> @@ -399,9 +393,8 @@ > >> del self[name] # Won't fail if it doesn't exist > >> self.dict[name.lower()] = value > >> text = name + ": " + value > >> - lines = text.split("\n") > >> - for line in lines: > >> - self.headers.append(line + "\n") > >> + self.headers.extend(text.splitlines(True)) > >> + self.headers.append('\n') > > > > and you're 100% sure that the change in how things are stored > > in headers won't affect any existing code? > > > > (the docstring says that headers contain a list of lines, which is no > > longer true) > > and the module documentation says: > > Each line contains a trailing newline. The blank line terminating > the headers is not contained in the list. > > which is no longer true (unless I'm missing something here) This would have been caught if there was a unit test validating what the documentation says. Why aren't there unit tests for this code? I think we need to raise the bar for "wholistic" improvements to a module: first write a unit test if there isn't already one (and if there is one, make sure that it tests all documented behavior), *then* refactor. Yes, this would be less fun. It's not supposed to be fun. It's supposed to avoid breaking code. Raymond, please roll back that change until this is taken care of. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Tue Feb 8 17:59:23 2005 From: barry at python.org (Barry Warsaw) Date: Tue Feb 8 17:59:27 2005 Subject: [Python-Dev] Re: python/dist/src/Lib rfc822.py,1.78,1.79 In-Reply-To: References: Message-ID: <1107881963.19011.18.camel@geddy.wooz.org> On Tue, 2005-02-08 at 10:35, Guido van Rossum wrote: > This would have been caught if there was a unit test validating what > the documentation says. Why aren't there unit tests for this code? I > think we need to raise the bar for "wholistic" improvements to a > module: first write a unit test if there isn't already one (and if > there is one, make sure that it tests all documented behavior), *then* > refactor. Yes, this would be less fun. It's not supposed to be fun. > It's supposed to avoid breaking code. +1. This module is used in so many place, you really have to take the documented interface seriously (not that you shouldn't otherwise, of course). 
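The behavioural difference being pointed out is easy to see by hand: with a folded header the old code stored every line with a trailing newline, while the new code stores the continuation line bare and appends a separate '\n' entry (illustrative session):

    >>> text = "Subject: hello\n world"        # a folded header line
    >>> [line + "\n" for line in text.split("\n")]      # old behaviour
    ['Subject: hello\n', ' world\n']
    >>> text.splitlines(True) + ['\n']                  # new behaviour
    ['Subject: hello\n', ' world', '\n']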
I suspect even the undocumented current semantics are relied on in many place. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050208/6bed0e61/attachment.pgp From greg at electricrain.com Tue Feb 8 20:52:43 2005 From: greg at electricrain.com (Gregory P. Smith) Date: Tue Feb 8 20:52:51 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <16903.28384.621922.349@gargle.gargle.HOWL> References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> Message-ID: <20050208195243.GD10650@zot.electricrain.com> > The md5.h/md5c.c files allow "copy and use", but no modification of > the files. There are some alternative implementations, i.e. in glibc, > openssl, so a replacement should be sage. Any other requirements when > considering a replacement? > > Matthias I believe the "plan" for md5 and sha1 and such is to use the much faster openssl versions "in the future" (based on a long thread debating future interfaces to such things on python-dev last summer). That'll sidestep any tedious license issue and give a better implementation at the same time. i don't believe anyone has taken the time to make such a patch yet. -g From tim.peters at gmail.com Tue Feb 8 21:37:50 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue Feb 8 21:37:53 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <16903.28384.621922.349@gargle.gargle.HOWL> References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> Message-ID: <1f7befae05020812377c72de26@mail.gmail.com> [Matthias Klose] > A Debian user pointed out (http://bugs.debian.org/293932), that the > current license for the Python profiler is not conforming to the DFSG > (Debian free software guidelines). > > http://www.python.org/doc/current/lib/node829.html states > > "This permission is explicitly restricted to the copying and > modification of the software to remain in Python, compiled Python, > or other languages (such as C) wherein the modified or derived code > is exclusively imported into a Python module." ... > - Does somebody knows about the history of this license, why it is > more restricted than the Python license? Simply because that's the license Jim Roskind slapped on it when he contributed this code 10 years ago. I imagine (but don't know) that Guido looked at it, thought "hmm -- shouldn't be a problem for Python's users", and so accepted it. > - Is there a chance to change the license for these two modules > (profile.py, pstats.py)? Not unless some remnant of InfoSeek Corp can be found, since they're the copyright holder (their work, their license). Alas, Jim Roskind hasn't been seen in the Python world this century. OTOH, if InfoSeek has vanished, it's unlikely they'll be suing anyone. Given how Python-specific profile.py and pstats.py are, it's hard for me to imagine anyone wanting to make a derivative that isn't imported into a Python module. In that respect it seems like a license clause that forbids you to run the software while the tip of your tongue is licking the back of your own neck. Still, if that matters, perhaps Debian will need to leave these modules out. Bold users will still be able to grab them from any number of other places. 
From jhylton at gmail.com Tue Feb 8 21:52:29 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Tue Feb 8 21:52:32 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <1f7befae05020812377c72de26@mail.gmail.com> References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <1f7befae05020812377c72de26@mail.gmail.com> Message-ID: Maybe some ambitious PSF activitst could contact Roskind and Steve Kirsch and see if they know who at Disney to talk to... Or maybe the Disney guys who were at PyCon last year could help. Jeremy On Tue, 8 Feb 2005 15:37:50 -0500, Tim Peters wrote: > [Matthias Klose] > > A Debian user pointed out (http://bugs.debian.org/293932), that the > > current license for the Python profiler is not conforming to the DFSG > > (Debian free software guidelines). > > > > http://www.python.org/doc/current/lib/node829.html states > > > > "This permission is explicitly restricted to the copying and > > modification of the software to remain in Python, compiled Python, > > or other languages (such as C) wherein the modified or derived code > > is exclusively imported into a Python module." > ... > > - Does somebody knows about the history of this license, why it is > > more restricted than the Python license? > > Simply because that's the license Jim Roskind slapped on it when he > contributed this code 10 years ago. I imagine (but don't know) that > Guido looked at it, thought "hmm -- shouldn't be a problem for > Python's users", and so accepted it. > > > - Is there a chance to change the license for these two modules > > (profile.py, pstats.py)? > > Not unless some remnant of InfoSeek Corp can be found, since they're > the copyright holder (their work, their license). Alas, Jim Roskind > hasn't been seen in the Python world this century. > > OTOH, if InfoSeek has vanished, it's unlikely they'll be suing anyone. > Given how Python-specific profile.py and pstats.py are, it's hard for > me to imagine anyone wanting to make a derivative that isn't imported > into a Python module. In that respect it seems like a license clause > that forbids you to run the software while the tip of your tongue is > licking the back of your own neck. > > Still, if that matters, perhaps Debian will need to leave these > modules out. Bold users will still be able to grab them from > any number of other places. > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From martin at v.loewis.de Tue Feb 8 22:35:28 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Feb 8 22:35:12 2005 Subject: [Python-Dev] 2.3.5 and 2.4.1 release plans In-Reply-To: <200502051743.18393.anthony@interlink.com.au> References: <200502051743.18393.anthony@interlink.com.au> Message-ID: <420930A0.5080808@v.loewis.de> Anthony Baxter wrote: > I'm currently thinking about a 2.4.1 around the 23td of Feb - Martin and > Fred, does this work for you? Yes. I will need to test whether my replacement of VB scripts in the installer with native DLLs works even on W95; I'm confident to complete this next week (already have the W95 machine installed). 
Regards, Martin From anthony at python.org Wed Feb 9 08:27:49 2005 From: anthony at python.org (Anthony Baxter) Date: Wed Feb 9 08:28:22 2005 Subject: [Python-Dev] RELEASED Python 2.3.5, final Message-ID: <200502091827.56277.anthony@python.org> On behalf of the Python development team and the Python community, I'm happy to announce the release of Python 2.3.5 (final). Python 2.3.5 is a bug-fix release. See the release notes at the website (also available as Misc/NEWS in the source distribution) for details of the bugs squished in this release. Python 2.3.5 contains an important security fix for SimpleXMLRPCServer - for more, see the announcement of PSF-2005-001 at: http://www.python.org/security/PSF-2005-001/ Python 2.3.5 is the last planned release in the Python 2.3 series, and is being released for those people who still need to run Python 2.3. Python 2.4 is a newer release, and should be preferred if possible. From here, bugfix releases are switching to the Python 2.4 branch - 2.4.1 will be the next Python release. For more information on Python 2.3.5, including download links for various platforms, release notes, and known issues, please see: http://www.python.org/2.3.5 Highlights of this new release include: - Bug fixes. According to the release notes, more than 50 bugs have been fixed, including a couple of bugs that could cause Python to crash. Highlights of the previous major Python release (2.3) are available from the Python 2.3 page, at http://www.python.org/2.3/highlights.html Enjoy the new release, Anthony Anthony Baxter anthony@python.org Python Release Manager (on behalf of the entire python-dev team) From trentm at ActiveState.com Wed Feb 9 19:01:52 2005 From: trentm at ActiveState.com (Trent Mick) Date: Wed Feb 9 19:04:09 2005 Subject: [Python-Dev] update copyright date in PC/python_nt.rc? Message-ID: <420A5010.5030008@activestate.com> Howdy, The copyright date was updated to 2005 in Python/getcopyright.c. Should the same be done in PC/python_nt.rc? Or perhaps, is there any reason python_nt.rc should NOT be updated? Cheers, Trent -- Trent Mick trentm@activestate.com From bjourne at gmail.com Wed Feb 9 20:20:16 2005 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Wed Feb 9 20:27:32 2005 Subject: [Python-Dev] Patch review: [ 1098732 ] Enhance tracebacks and stack traces with vars Message-ID: <740c3aec050209112069d8c328@mail.gmail.com> I'd like to help develop Python for fun and profit and I've heard that posting patch reviews to python-dev is a good way to contribute. So here goes: PATCH REVIEW: [ 1098732 ] Skip Montanaro has written a patch which makes it so that you can inspect variable values in tracebacks. IMHO, it is a brilliant idea and can make debugging quite a lot easier. However, I'm not so fond of the way that he has implemented it; it needs work. He basically outputs all names in all stackframes all the way up to the top, which makes the traceback look way too cluttered. He has also implemented it as a hook to sys.excepthook; I would like it to be the default way in which tracebacks are printed, or at least activated by a command line switch to Python. What does everyone else think? Does Skip's idea have any merit?
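For anyone who has not opened the patch: the general technique being reviewed can be sketched in a few lines. This is not Skip's implementation, just a minimal standalone example of a sys.excepthook that prints the ordinary traceback and then dumps the local variables of every frame in it:

import sys, traceback

def verbose_excepthook(etype, value, tb):
    # print the normal traceback first
    traceback.print_exception(etype, value, tb)
    # then walk the traceback and dump each frame's locals
    cur = tb
    while cur is not None:
        frame = cur.tb_frame
        print >> sys.stderr, "    locals of %s:" % frame.f_code.co_name
        for name, val in frame.f_locals.items():
            print >> sys.stderr, "        %s = %r" % (name, val)
        cur = cur.tb_next

sys.excepthook = verbose_excepthook

Even this toy version shows why the output gets cluttered quickly: every name in every frame is repr()'d, whether or not it is relevant to the failure.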
http://sourceforge.net/tracker/index.php?func=detail&aid=1098732&group_id=5470&atid=305470 -- mvh Bj?rn From pje at telecommunity.com Wed Feb 9 20:43:04 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Feb 9 20:40:58 2005 Subject: [Python-Dev] Patch review: [ 1098732 ] Enhance tracebacks and stack traces with vars In-Reply-To: <740c3aec050209112069d8c328@mail.gmail.com> Message-ID: <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> At 08:20 PM 2/9/05 +0100, BJ?rn Lindqvist wrote: >Does Skip's idea have >any merit? Yes, but not as a default behavior. Many people already consider the fact that tracebacks display file paths to be a potential security problem. If anything, the default traceback display should have less information, not more. (E.g., display module __name__ instead of the code's __file__). Also note that the stdlib already has a cgitb module that does this sort of display for CGI scripts, so the technique isn't new, and cgitb provides a good example for people to create their own advanced traceback formatters with. If there were another command line option added to Python for this, I'd personally prefer it be an option to enter the debugger when a terminal traceback is printed. Currently, I use 'python -i' so that I get an interpreter prompt, then use 'import pdb; pdb.pm()' to enter the debugger at the point where the error occurred. One can then print whatever local variables are desired, go up and down the stack, list code, and even perform calculations on the values on the stack. About the only place I can think of where such an extremely verbose traceback would be useful and safe, is inside of unit tests. I believe that the py.test package uses traceback introspection of this kind in order to display relevant values when an assertion fails. So, it might be useful in the context of a unit test error report to get some of that information, but even there, there is a question of how much is relevant for display. From tim.peters at gmail.com Wed Feb 9 21:30:34 2005 From: tim.peters at gmail.com (Tim Peters) Date: Wed Feb 9 21:31:10 2005 Subject: [Python-Dev] update copyright date in PC/python_nt.rc? In-Reply-To: <420A5010.5030008@activestate.com> References: <420A5010.5030008@activestate.com> Message-ID: <1f7befae05020912302f782316@mail.gmail.com> [Trent Mick] > The copyright date was updated to 2005 in Python/getcopyright.c. Should > the same be done in PC/python_nt.rc? Yes. > Or perhaps, is there any reason python_nt.rc should NOT be updated? Only reason I can think of is your inexcusable laziness for not having done it yourself . From trentm at ActiveState.com Wed Feb 9 22:07:04 2005 From: trentm at ActiveState.com (Trent Mick) Date: Wed Feb 9 22:09:26 2005 Subject: [Python-Dev] update copyright date in PC/python_nt.rc? In-Reply-To: <1f7befae05020912302f782316@mail.gmail.com> References: <420A5010.5030008@activestate.com> <1f7befae05020912302f782316@mail.gmail.com> Message-ID: <420A7B78.6030606@activestate.com> > Only reason I can think of is your inexcusable laziness for not having > done it yourself . Done. I'd ask whether I should backport this to release23-maint... but then I'd have to reason whether there is any point given that a 2.3.6 is unlikely. And I'd have to ask Anthony. and... enh. 
Trent -- Trent Mick trentm@activestate.com From oliphant at ee.byu.edu Wed Feb 9 22:43:34 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Feb 9 22:43:37 2005 Subject: [Python-Dev] Clarification sought about including a multidimensional array object into Python core Message-ID: <420A8406.4020808@ee.byu.edu> There has recently been some much-needed discussion on the numpy-discussions list run by sourceforge regarding the state of the multidimensional array objects available for Python. It is desired by many that there be a single multidimensional array object in the Python core to facilitate data transfer and interfacing between multiple packages. I am a co-author of the current PEP regarding inclusion of the multidimensional array object into the core. However, that PEP is sorely outdated. Currently there are two multidimensional array objects that are in use in the Python community: Numeric --- original arrayobject created by Jim Hugunin and many others. Has been developed and used for 10 years. An upgrade that adds the features of numarray but maintains the same basic structure of Numeric called Numeric3 is in development and will be ready for more wide-spread use in a couple of weeks. Numarray --- in development for about 3 years. It was billed by some as a replacement for Numeric,. While introducing some new features, it still has not covered the full feature set that Numeric had making it impossible for all Numeric users to use it. In addition, it is still unacceptably slow for many operations that Numeric does well. Scientific users will always have to install more packages in order to use Python for their purposes. However, there is still the desire that the basic array object would be common among all Python users. To assist in writing a new PEP, we need clarification from Guido and others involved regarding 1) What specifically about Numeric prevented it from being acceptable as an addition to the Python core. 2) Are there any fixed requirements (other than coding style) before an arrayobject would be accepted into the Python core. Thanks for your comments. I think they will help the discussion currently taking place. -Travis Oliphant From bac at OCF.Berkeley.EDU Wed Feb 9 22:59:35 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Feb 9 22:59:57 2005 Subject: [Python-Dev] discourage patch reviews to the list? (was: Patch review: [ 1098732 ]) In-Reply-To: <740c3aec050209112069d8c328@mail.gmail.com> References: <740c3aec050209112069d8c328@mail.gmail.com> Message-ID: <420A87C7.7030102@ocf.berkeley.edu> BJ?rn Lindqvist wrote: > I'd like to help develop Python for fun and profit and I've heard that > posting patch reviews to python-dev is a good way to contribute. So > here goes: > Are we actually promoting this? I am fine with people doing this when they have done five reviews and want their specific patch looked at (personally I prefer when people do it in a single email, but I can live with individual ones). But if people don't have that in mind, should we not be encouraging this? I mean it seems to be defeating the purpose of SF and having the various mailing lists that send out updates on SF posts. 
-Brett From gvanrossum at gmail.com Wed Feb 9 23:45:18 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Feb 9 23:45:59 2005 Subject: [Python-Dev] Clarification sought about including a multidimensional array object into Python core In-Reply-To: <420A8406.4020808@ee.byu.edu> References: <420A8406.4020808@ee.byu.edu> Message-ID: > 1) What specifically about Numeric prevented it from being acceptable as > an addition to the Python core. It's very long ago, I believe that the authors themselves didn't think it was good enough. It certainly had a very hackish coding style. Numarray was supposed to fix all that. I'm sorry to hear that it hasn't (yet) reached the maturity you find necessary. > 2) Are there any fixed requirements (other than coding style) before an > arrayobject would be accepted into the Python core. The intended user community must accept the code as "best-of-breed". It seems that the Num* community has some work to do in this respect. Also (this applies to all code) the code must be stable enough that the typical Python release cycle (about 18 months between feature releases) doesn't cause problems. Finally there must be someone willing to be responsible for maintenance of the code. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Thu Feb 10 00:10:01 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Feb 10 00:10:03 2005 Subject: [Python-Dev] discourage patch reviews to the list? In-Reply-To: <420A87C7.7030102@ocf.berkeley.edu> References: <740c3aec050209112069d8c328@mail.gmail.com> <420A87C7.7030102@ocf.berkeley.edu> Message-ID: <420A9849.6020304@v.loewis.de> Brett C. wrote: > But if people don't have that in mind, should we not be encouraging > this? I mean it seems to be defeating the purpose of SF and having the > various mailing lists that send out updates on SF posts. Clearly, the comment should *also* go to SF - posting it to python-dev may mean it gets lost eventually (in particular, when somebody gets to look at the patch). Bj?rn did post his comment to SF, and a summary to python-dev. I personally think this is a good strategy: it puts focus on things that should be worked on. Let me explain why I think that these patches should be worked on: - it might be that the analysis of the patch suggests that the patch should be rejected, as-is. If so, it has a good chance to be closed *right away* with somebody with write privileges to the tracker, if he agrees with the analysis taken. People who care can follow the link in the email message, and see that the patch was closed. People who don't care can quickly grasp this is a patch review, and delete the message. - it might be that the analysis suggests changes. Posting it to python-dev gives the submitter of the patch a chance to challenge the review. If somebody thinks the requested changes are unecessary, they will comment. People actually prefer to discuss questionable requests for changes on the mailing list, instead of discussing them in the SF tracker. - it might be that the analysis recommend acceptance. Again, it might be that this can trigger a quick action by some committer - anybody else can safely ignore the message. However, *some* committer should take *some* action on the patch - one day or the other. Having the right to commit is a privilege, but it is also an obligation. The patch needs to be eventually looked at, and decided upon. Somebody already did the majority of the work, and suggested an action. 
It should be easy to decide whether this action is agreeable or not (unless the review is flawed, in which case the reviewer should be told about this). To put it the other way 'round: should we only discuss changes on python-dev which *don't* have patches on SF???? I don't think so. Furthermore, this strategy exposes the reviewer. A reviewer is somebody who will potentially get write access to the tracker, and perhaps CVS write access. A reviewer who wants to contribute in this way regularly clearly needs to gain the trust of other contributors, and posting smart, valuable, objective, balanced reviews on contributed patches is an excellent way to gain such trust (likewise, posting reviews which turn out to be flawed is a way to find out that the reviewer still needs to learn things before he can be trusted). Regards, Martin P.S. These remarks are mostly of general nature - I haven't actually studied yet Bj?rn's review (but I leave it in my inbox so I can get back to it next week). From martin at v.loewis.de Thu Feb 10 00:21:08 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Feb 10 00:21:10 2005 Subject: [Python-Dev] Patch review: [ 1098732 ] Enhance tracebacks and stack traces with vars In-Reply-To: <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> References: <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> Message-ID: <420A9AE4.5090000@v.loewis.de> Phillip J. Eby wrote: > Yes, but not as a default behavior. Many people already consider the > fact that tracebacks display file paths to be a potential security > problem. If anything, the default traceback display should have less > information, not more. (E.g., display module __name__ instead of the > code's __file__). Notice that this patch does not change the exception printing behaviour of Python at all. It just changes the implementation of traceback.print_exception, so it only affects code that actually uses this function. Furthermore, it only affects code that uses this function and is *changed* to supply the argument True for print_args. > Also note that the stdlib already has a cgitb module that does this sort > of display for CGI scripts, so the technique isn't new, and cgitb > provides a good example for people to create their own advanced > traceback formatters with. Sure. However, if this is frequently needed (outside the context of CGI), it would sure be helpful if the traceback module supported it. > If there were another command line option added to Python for this, I'd > personally prefer it be an option to enter the debugger when a terminal > traceback is printed. Currently, I use 'python -i' so that I get an > interpreter prompt, then use 'import pdb; pdb.pm()' to enter the > debugger at the point where the error occurred. With the patch, you would have to add an explicit try/except into your code, to supply True for print_args (or set a sys.excepthook, as Skip suggests in his patch readme). 
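To make the opt-in nature concrete, the usage pattern described here would look roughly like this (print_args is the flag added by Skip's proposed patch, not an argument the stock traceback module accepts, and main() is just a placeholder):

import sys, traceback

def main():
    pass  # application code goes here

try:
    main()
except Exception:
    etype, value, tb = sys.exc_info()
    # print_args=True is the patch's new argument, not standard library API
    traceback.print_exception(etype, value, tb, print_args=True)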
Regards, Martin From mwh at python.net Thu Feb 10 00:22:59 2005 From: mwh at python.net (Michael Hudson) Date: Thu Feb 10 00:23:01 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python compile.c, 2.340, 2.341 In-Reply-To: (rhettinger@users.sourceforge.net's message of "Sun, 06 Feb 2005 14:05:44 -0800") References: Message-ID: <2m65115ne4.fsf@starship.python.net> rhettinger@users.sourceforge.net writes: > Update of /cvsroot/python/python/dist/src/Python > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv26507/Python > > Modified Files: > compile.c > Log Message: > Transform "x in (1,2,3)" to "x in frozenset([1,2,3])". > > Inspired by Skip's idea to recognize the throw-away nature of sequences > in this context and to transform their type to one with better performance. This breaks code: >>> [] in (1,) Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: list objects are unhashable (and so breaks test_email -- is no one else running the test suite?). It's a cute idea, but IMHO violates the principle of least surprise too much. Cheers, mwh -- ZAPHOD: Who are you? ROOSTA: A friend. ZAPHOD: Oh yeah? Anyone's friend in particular, or just generally well-disposed to people? -- HHGttG, Episode 7 From mwh at python.net Thu Feb 10 00:25:54 2005 From: mwh at python.net (Michael Hudson) Date: Thu Feb 10 00:25:56 2005 Subject: [Python-Dev] Patch review: [ 1098732 ] Enhance tracebacks and stack traces with vars In-Reply-To: <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> (Phillip J. Eby's message of "Wed, 09 Feb 2005 14:43:04 -0500") References: <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> Message-ID: <2m1xbp5n99.fsf@starship.python.net> "Phillip J. Eby" writes: > At 08:20 PM 2/9/05 +0100, BJörn Lindqvist wrote: >>Does Skip's idea have >>any merit? > > Yes, but not as a default behavior. Many people already consider the > fact that tracebacks display file paths to be a potential security > problem. If anything, the default traceback display should have less > information, not more. (E.g., display module __name__ instead of the > code's __file__). Oh, come on. Making tracebacks less useful to protect people who accidentally spray them across the internet seems absurd. Would you like them not to show source, either? Cheers, mwh -- Many of the posts you see on Usenet are actually from moths. You can tell which posters they are by their attraction to the flames. -- Internet Oracularity #1279-06 From martin at v.loewis.de Thu Feb 10 00:26:56 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Feb 10 00:26:57 2005 Subject: [Python-Dev] Clarification sought about including a multidimensional array object into Python core In-Reply-To: <420A8406.4020808@ee.byu.edu> References: <420A8406.4020808@ee.byu.edu> Message-ID: <420A9C40.4060306@v.loewis.de> Travis Oliphant wrote: > I am a co-author of the current PEP regarding inclusion of the > multidimensional array object into the core. However, that PEP is > sorely outdated. [...] > 1) What specifically about Numeric prevented it from being acceptable as > an addition to the Python core. > 2) Are there any fixed requirements (other than coding style) before an > arrayobject would be accepted into the Python core. I think you answered these questions yourself. If a PEP is sorely outdated after only 3 years of its life, there clearly is something wrong with the PEP.
Python language features will have to live 10 years or so before they can be considered outdated, and then another 20 years before they can be removed (look at string exceptions as an example). So if it is still not clear what kind of API would be adequate after all these years, it is best (IMO) to wait a few more years for somebody to show up with a good solution to the problem (which I admit I don't understand). Regards, Martin From bob at redivi.com Thu Feb 10 00:40:08 2005 From: bob at redivi.com (Bob Ippolito) Date: Thu Feb 10 00:40:21 2005 Subject: [Python-Dev] Patch review: [ 1098732 ] Enhance tracebacks and stack traces with vars In-Reply-To: <2m1xbp5n99.fsf@starship.python.net> References: <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> <2m1xbp5n99.fsf@starship.python.net> Message-ID: On Feb 9, 2005, at 6:25 PM, Michael Hudson wrote: > "Phillip J. Eby" writes: > >> At 08:20 PM 2/9/05 +0100, BJörn Lindqvist wrote: >>> Does Skip's idea have >>> any merit? >> >> Yes, but not as a default behavior. Many people already consider the >> fact that tracebacks display file paths to be a potential security >> problem. If anything, the default traceback display should have less >> information, not more. (E.g., display module __name__ instead of the >> code's __file__). > > Oh, come on. Making tracebacks less useful to protect people who > accidentally spray them across the internet seems absurd. Would you > like them not to show source, either? On Mac OS X the paths to the files are so long as to make the tracebacks really ugly and *less* usable. I certainly wouldn't mind if __name__ showed up instead of __file__. I have a "pywhich" script that shows me the file given a name that I use: (note that modulegraph.util.imp_find_module is like imp.find_module but it will walk the packages to find the actual module and it only returns the filename)

#!/usr/bin/env python
import sys, os
from modulegraph.util import imp_find_module

for module in sys.argv[1:]:
    path, oext = os.path.splitext(imp_find_module(module)[1])
    for ext in ('.py', oext):
        if os.path.exists(path + ext):
            print path + ext
            break

-bob From gvanrossum at gmail.com Thu Feb 10 00:53:56 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Feb 10 00:54:01 2005 Subject: [Python-Dev] Patch review: [ 1098732 ] Enhance tracebacks and stack traces with vars In-Reply-To: References: <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> <2m1xbp5n99.fsf@starship.python.net> Message-ID: > > Oh, come on. Making tracebacks less useful to protect people who > > accidentally spray them across the internet seems absurd. Would you > > like them not to show source, either? My response exactly. > On Mac OS X the paths to the files are so long as to make the > tracebacks really ugly and *less* usable. I certainly wouldn't mind if > __name__ showed up instead of __file__. I have a "pywhich" script that > shows me the file given a name that I use: Well, sorry, but not everybody is as smart as you, and having the file name rather than the module name there helps debugging important sys.path issues. It wouldn't be the first time that someone has a hacked version of a standard module tucked away in a directory that happens to land on the path, and seeing the pathname is then a lot more productive than the module name.
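As an aside for readers debugging exactly that situation: the quickest way to see which copy of a module actually got imported is its __file__ attribute; for example (the path shown here is made up):

>>> import rfc822
>>> rfc822.__file__
'/usr/local/lib/python2.4/rfc822.py'

If that prints a path you don't expect, something earlier on sys.path is shadowing the standard module.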
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From oliphant at ee.byu.edu Thu Feb 10 00:54:29 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Feb 10 00:54:37 2005 Subject: [Python-Dev] Clarification sought about including a multidimensional array object into Python core In-Reply-To: <420A9C40.4060306@v.loewis.de> References: <420A8406.4020808@ee.byu.edu> <420A9C40.4060306@v.loewis.de> Message-ID: <420AA2B5.2060801@ee.byu.edu> Martin v. L?wis wrote: > Travis Oliphant wrote: > >> I am a co-author of the current PEP regarding inclusion of the >> multidimensional array object into the core. However, that PEP is >> sorely outdated. > > [...] > >> 1) What specifically about Numeric prevented it from being acceptable >> as an addition to the Python core. >> 2) Are there any fixed requirements (other than coding style) before >> an arrayobject would be accepted into the Python core. > > > I think you answered these questions yourself. If a PEP is sorely > outdated after only 3 years of its life, there clearly is something > wrong with the PEP. Exactly, the PEP does not reflect the reality of what anybody wants in the core. It needs modification, or replacment. Can I just do that? Or do I need permission from Barrett and others who has only a passing interest in this anymore. > Python language features will have to live > 10 years or so before they can be considered outdated, and then > another 20 years before they can be removed (look at string > exceptions as an example). I think you misunderstood my meaning. For example Numeric has lived 10 years with very few changes. It seems to me it is rather stable. > > So if it is still not clear what kind of API would be adequate > after all these years, it is best (IMO) to wait a few more years > for somebody to show up with a good solution to the problem > (which I admit I don't understand). It actually is pretty clear to many. There have been a wide variety of modules written on top of Numeric and Numarray. Most of the rough spots around the edges have been ironed out. Our arguments now are about packaging other code living on top of an arrayobject. Thanks for your help, -Travis From pje at telecommunity.com Thu Feb 10 01:11:48 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Feb 10 01:09:45 2005 Subject: [Python-Dev] Patch review: [ 1098732 ] Enhance tracebacks and stack traces with vars In-Reply-To: <420A9AE4.5090000@v.loewis.de> References: <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050209191027.03ea87f0@mail.telecommunity.com> At 12:21 AM 2/10/05 +0100, Martin v. L?wis wrote: >Phillip J. Eby wrote: >>Yes, but not as a default behavior. Many people already consider the >>fact that tracebacks display file paths to be a potential security >>problem. If anything, the default traceback display should have less >>information, not more. (E.g., display module __name__ instead of the >>code's __file__). > >Notice that this patch does not change the exception printing behaviour >of Python at all. It just changes the implementation of >traceback.print_exception, so it only affects code that actually uses >this function. Furthermore, it only affects code that uses this function >and is *changed* to supply the argument True for print_args. I was just responding to the OP, who was advocating it for Python default behavior, or behavior controlled by the command line. 
That's why I said, "Yes, but not as a default behavior." From david.ascher at gmail.com Thu Feb 10 01:12:26 2005 From: david.ascher at gmail.com (David Ascher) Date: Thu Feb 10 01:12:30 2005 Subject: [Python-Dev] Clarification sought about including a multidimensional array object into Python core In-Reply-To: References: <420A8406.4020808@ee.byu.edu> Message-ID: On Wed, 9 Feb 2005 14:45:18 -0800, Guido van Rossum wrote: > The intended user community must accept the code as "best-of-breed". > It seems that the Num* community has some work to do in this respect. I've not followed the num* discussion in quite a while, but my impression back then was that there wasn't "one" such community. Instead, the technical differences in the approaches required in specific fields, regarding things like the relative importance of memory profiles, speed, error handling, willingness to require modern C++ compilers, etc. made practical compromises quite tricky. I would love to be proven wrong. --david From pje at telecommunity.com Thu Feb 10 01:15:48 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Feb 10 01:13:44 2005 Subject: [Python-Dev] Patch review: [ 1098732 ] Enhance tracebacks and stack traces with vars In-Reply-To: <2m1xbp5n99.fsf@starship.python.net> References: <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050209191225.03eaeec0@mail.telecommunity.com> At 11:25 PM 2/9/05 +0000, Michael Hudson wrote: >"Phillip J. Eby" writes: > > > At 08:20 PM 2/9/05 +0100, BJ?rn Lindqvist wrote: > >>Does Skip's idea have > >>any merit? > > > > Yes, but not as a default behavior. Many people already consider the > > fact that tracebacks display file paths to be a potential security > > problem. If anything, the default traceback display should have less > > information, not more. (E.g., display module __name__ instead of the > > code's __file__). > >Oh, come on. Making tracebacks less useful to protect people who >accidentally spray them across the internet seems absurd. Would you >like them not to show source, either? I said that many people considered that to be the case, not that I did. ;) I'd personally prefer to read module names than filenames, so I guess I should've mentioned that. :) Of course, Guido has previously answered the filename vs. modulename question (years ago in fact), so it was moot even before I mentioned it. For some reason it slipped my mind at the time, though. From oliphant at ee.byu.edu Thu Feb 10 01:34:59 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Feb 10 01:35:04 2005 Subject: [Python-Dev] Clarification sought about including a multidimensional array object into Python core In-Reply-To: References: <420A8406.4020808@ee.byu.edu> Message-ID: <420AAC33.807@ee.byu.edu> David Ascher wrote: >I've not followed the num* discussion in quite a while, but my >impression back then was that there wasn't "one" such community. >Instead, the technical differences in the approaches required in >specific fields, regarding things like the relative importance of >memory profiles, speed, error handling, willingness to require modern >C++ compilers, etc. made practical compromises quite tricky. > > I really appreciate comments from those who remember some of the old discussions. There are indeed some different needs. Most of this, however, is in the ufunc object (how do you do math with the arrays). 
And, a lot of this has been ameliorated with the new concepts of error modes that numarray introduced. There is less argumentation over the basic array object as a memory structure. The biggest argument right now is the design of the object: i.e. a mixture of Python and C (numarray) versus a C-only object (Numeric3). In other words, what I'm saying is that in terms of how the array object should be structure, a lot is known. What is more controversial is should the design be built upon Numarray's object structure (a mixture of Python and C), or on Numeric's --- all in C -Travis From martin at v.loewis.de Thu Feb 10 01:39:09 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Feb 10 01:39:11 2005 Subject: [Python-Dev] Patch review: [ 1098732 ] Enhance tracebacks and stack traces with vars In-Reply-To: <5.1.1.6.0.20050209191027.03ea87f0@mail.telecommunity.com> References: <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> <5.1.1.6.0.20050209191027.03ea87f0@mail.telecommunity.com> Message-ID: <420AAD2D.1060300@v.loewis.de> Phillip J. Eby wrote: > I was just responding to the OP, who was advocating it for Python > default behavior, or behavior controlled by the command line. That's > why I said, "Yes, but not as a default behavior." I wasn't sure how to interpret the message - I could not find out whether you have looked at the patch, and agreed with it, or whether you merely read the OP's summary of the patch. Regards, Martin From martin at v.loewis.de Thu Feb 10 01:49:33 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Feb 10 01:49:35 2005 Subject: [Python-Dev] Clarification sought about including a multidimensional array object into Python core In-Reply-To: <420AA2B5.2060801@ee.byu.edu> References: <420A8406.4020808@ee.byu.edu> <420A9C40.4060306@v.loewis.de> <420AA2B5.2060801@ee.byu.edu> Message-ID: <420AAF9D.6090303@v.loewis.de> Travis Oliphant wrote: > Exactly, the PEP does not reflect the reality of what anybody wants in > the core. It needs modification, or replacment. Can I just do that? My understanding is this: you can, and you should. You are the author of the PEP (together with Paul Barrett), and the PEP is still in Draft status (with a Python-Version of 2.2). Until the PEP is Accepted or Rejected status, you can make any changes to it that you want. It would be nice if you would track the Post-History section, and perhaps a History section at the end, pointing out that the PEP got completely restructured at some point. > Or do I need permission from Barrett and others who has only a passing > interest in this anymore. According to PEP 1, you could ask Barrett for a complete takeover, to remove him from the Authors list. If he agrees, there would be no problem to change that list after so much time has passed. > I think you misunderstood my meaning. For example Numeric has lived 10 > years with very few changes. It seems to me it is rather stable. I probably misunderstand something. If Numeric has been stable for 10 years, why is not good (no need to answer here - an answer in the PEP would be appreciated)? If there is something new to replace it, how stable is that? 
Regards, Martin From martin at v.loewis.de Thu Feb 10 01:53:24 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Feb 10 01:53:25 2005 Subject: [Python-Dev] Clarification sought about including a multidimensional array object into Python core In-Reply-To: <420AAC33.807@ee.byu.edu> References: <420A8406.4020808@ee.byu.edu> <420AAC33.807@ee.byu.edu> Message-ID: <420AB084.1000008@v.loewis.de> Travis Oliphant wrote: > In other words, what I'm saying is that in terms of how the array object > should be structure, a lot is known. What is more controversial is > should the design be built upon Numarray's object structure (a mixture > of Python and C), or on Numeric's --- all in C To me, this sounds like an implementation detail. I'm sure it is an important detail, as I understand all of this is mostly done for performance reasons. The PEP should list the options, include criteria for selection, and then propose a choice. People can then discuss whether the list of options is complete (if not, you need to extend it), whether the criteria are agreed (they might be not, and there might be difficult consensus, which the PEP should point out), and whether the choice is the right one given the criteria (there should be no debate about this - everybody should agree factually that the choice meets the criteria best). Regards, Martin From bac at OCF.Berkeley.EDU Thu Feb 10 02:25:14 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Feb 10 02:25:28 2005 Subject: [Python-Dev] discourage patch reviews to the list? In-Reply-To: <420A9849.6020304@v.loewis.de> References: <740c3aec050209112069d8c328@mail.gmail.com> <420A87C7.7030102@ocf.berkeley.edu> <420A9849.6020304@v.loewis.de> Message-ID: <420AB7FA.3040106@ocf.berkeley.edu> Martin v. L?wis wrote: > Brett C. wrote: > > But if people don't have that in mind, should we not be encouraging > >> this? I mean it seems to be defeating the purpose of SF and having >> the various mailing lists that send out updates on SF posts. [SNIP] > Bj?rn did post his comment to SF, and a summary to python-dev. I > personally think this is a good strategy: it puts focus on things > that should be worked on. > > Let me explain why I think that these patches should be worked on: > - it might be that the analysis of the patch suggests that the patch > should be rejected, as-is. [SNIP] > - it might be that the analysis suggests changes. [SNIP] > - it might be that the analysis recommend acceptance. [SNIP] All valid points, but I also don't want people to suddenly start posting one-liners or bug posts. I guess it comes down to a signal-to-noise ratio and if the level of signal we are currently getting will hold. If we say it is okay for people to send in patch reviews *only* and not notifications of new patches, bug reports, or bug reviews, then I can handle it. > To put it the other way 'round: should we only discuss changes on > python-dev which *don't* have patches on SF???? I don't think > so. > And neither do I. I just don't want a ton of random emails on python-dev that really belong in the SF tracker instead. Reason why we don't tend to take direct bug reports in email unless there is a question over semantics. > Furthermore, this strategy exposes the reviewer. A reviewer is > somebody who will potentially get write access to the tracker, > and perhaps CVS write access. 
A reviewer who wants to contribute > in this way regularly clearly needs to gain the trust of other > contributors, and posting smart, valuable, objective, balanced > reviews on contributed patches is an excellent way to gain such > trust (likewise, posting reviews which turn out to be flawed > is a way to find out that the reviewer still needs to learn > things before he can be trusted). > That is a very good point. Guess I am softening on my rejection to this. =) If people in general agree to this idea of having people post patch reviews to python-dev I will update the dev intro essay to reflect all of this. I will also add a mention about the 5-1 patch review deal. [SNIP] > P.S. These remarks are mostly of general nature - I haven't > actually studied yet Bj?rn's review (but I leave it in my > inbox so I can get back to it next week). Same here. I didn't mean to single out Bj?rn in any way. He just happened to trigger an email out of me. =) -Brett From paul at pfdubois.com Thu Feb 10 02:30:16 2005 From: paul at pfdubois.com (Paul F. Dubois) Date: Thu Feb 10 02:30:19 2005 Subject: [Python-Dev] Numeric life as I see it In-Reply-To: <420AB084.1000008@v.loewis.de> References: <420A8406.4020808@ee.byu.edu> <420AAC33.807@ee.byu.edu> <420AB084.1000008@v.loewis.de> Message-ID: <420AB928.3090004@pfdubois.com> Martin v. L?wis wrote: The PEP should list the options, include criteria > for selection, and then propose a choice. People can then discuss > whether the list of options is complete (if not, you need to extend > it), whether the criteria are agreed (they might be not, and there > might be difficult consensus, which the PEP should point out), and > whether the choice is the right one given the criteria (there should > be no debate about this - everybody should agree factually that the > choice meets the criteria best). > Unrealistic. I think it is undisputed that there are people with irreconcilably different needs. Frankly, we spent many, many months on the design of Numeric and it represents a set of compromises already. However, the one thing it wouldn't compromise on was speed, even at the expense of safety. A community exists that cannot live with this compromise. We were told that the Python core could also not live with that compromise. Over the years there was pressure to add safety, convenience, flexibility, etc., all sometimes incompatible with speed. Numarray represents in some sense the set of compromises in that direction, besides its technical innovations. Numeric / Numeric3 represents the need for speed camp. I think it is reasonable to suppose that the need for speed piece can be wrapped suitably by the need for safety-flexibility-convenience facilities. I believe that hope underlies Travis' plan. The Nummies (the official set of developers) thought that the Numeric code base was an unsuitable basis for further development. There was no dissent about that at least. My idea was to get something like what Travis is now doing done to replace it. I felt it important to get myself out of the picture after five years as the lead developer especially since my day job had ceased to involve using Numeric. However, removing my cork from the bottle released the unresolved pressure between these two camps. My plan for transition failed. I thought I had consensus on the goal and in fact it wasn't really there. Everyone is perfectly good-willed and clever and trying hard to "all just get along", but the goal was lost. Eric Raymond should write a book about it called "Bumbled Bazaar". 
I hope everyone will still try to achieve that goal. Interoperability of all the Numeric-related software (including supporting a 'default' plotting package) is required. Aside: While I am at it, let me reiterate what I have said to the other developers privately: there is NO value to inheriting from the array class. Don't try to achieve that capability if it costs anything, even just effort, because it buys you nothing. Those of you who keep remarking on this as if it would simply haven't thought it through IMHO. It sounds so intellectually appealing that David Ascher and I had a version of Numeric that almost did it before we realized our folly. From martin at v.loewis.de Thu Feb 10 03:30:01 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Feb 10 03:30:03 2005 Subject: [Python-Dev] discourage patch reviews to the list? In-Reply-To: <420AB7FA.3040106@ocf.berkeley.edu> References: <740c3aec050209112069d8c328@mail.gmail.com> <420A87C7.7030102@ocf.berkeley.edu> <420A9849.6020304@v.loewis.de> <420AB7FA.3040106@ocf.berkeley.edu> Message-ID: <420AC729.6070804@v.loewis.de> Brett C. wrote: > All valid points, but I also don't want people to suddenly start posting > one-liners or bug posts. I agree that keeping the noise level low is desirable; I hope this will come out naturally when we start commenting on high-noise remarks. For example, I would have no problems telling somebody who says "me too" on a feature request that he should go away and come back with an implementation of the requested feature. I would still apply the "standard" conventions of python-dev: that you should be fairly knowledgable about the things you are talking about before posting. > I guess it comes down to a signal-to-noise ratio and if the level of > signal we are currently getting will hold. If we say it is okay for > people to send in patch reviews *only* and not notifications of new > patches, bug reports, or bug reviews, then I can handle it. People do tend to notify about patches from time to time, especially when they are committers, and want to weigh in their reputation to advance peer review of the proposed changes. Other people who notify about new patches they made will continue to get my "5 for 1" offer which actually triggered this new interest in contributing-by-reviewing. Another reason not to post patches to python-dev is message size for modem users although I'm doubtful how valid this rationale is these days, given ADSL, spam, HTML mails, and everything... > And neither do I. I just don't want a ton of random emails on > python-dev that really belong in the SF tracker instead. Reason why we > don't tend to take direct bug reports in email unless there is a > question over semantics. I certainly don't want to see random comments on python-dev, either (and I do see random comments come in bursts, and have to choose to ignore entire threads because of that. I don't have to write python-dev summaries, though :-) I disagree with the primary reason not to take bug reports on python-dev, however: bug reports in email get lost if not immediately processed; usage of a tracker is necessary to actually "keep track". So this kind of bug management is the primary reason for the tracker, not that we want to keep random users out of python-dev (although this is a convenient side effect). 
Regards, Martin From skip at pobox.com Thu Feb 10 04:44:05 2005 From: skip at pobox.com (Skip Montanaro) Date: Thu Feb 10 04:44:25 2005 Subject: [Python-Dev] Patch review: [ 1098732 ] Enhance tracebacks and stack traces with vars In-Reply-To: <5.1.1.6.0.20050209191027.03ea87f0@mail.telecommunity.com> References: <5.1.1.6.0.20050209143415.030e1750@mail.telecommunity.com> <5.1.1.6.0.20050209191027.03ea87f0@mail.telecommunity.com> Message-ID: <16906.55429.990644.712145@montanaro.dyndns.org> Phillip> I was just responding to the OP, who was advocating it for Phillip> Python default behavior, or behavior controlled by the command Phillip> line. That's why I said, "Yes, but not as a default behavior." My original intent was that it would probably not fly as default behavior. I'm not sure I would always want that behavior either. I would like it for long-running daemons that crash while unattended (places where "python -i" wouldn't really help). Skip From oliphant at ee.byu.edu Thu Feb 10 05:09:52 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Feb 10 05:09:56 2005 Subject: [Python-Dev] Re: Numeric life as I see it In-Reply-To: <420AB928.3090004@pfdubois.com> References: <420A8406.4020808@ee.byu.edu> <420AAC33.807@ee.byu.edu> <420AB084.1000008@v.loewis.de> <420AB928.3090004@pfdubois.com> Message-ID: <420ADE90.9050304@ee.byu.edu> > Martin v. L?wis wrote: > The PEP should list the options, include criteria > >> for selection, and then propose a choice. People can then discuss >> whether the list of options is complete (if not, you need to extend >> it), whether the criteria are agreed (they might be not, and there >> might be difficult consensus, which the PEP should point out), and >> whether the choice is the right one given the criteria (there should >> be no debate about this - everybody should agree factually that the >> choice meets the criteria best). >> > > Unrealistic. I think it is undisputed that there are people with > irreconcilably different needs. Frankly, we spent many, many months on > the design of Numeric and it represents a set of compromises already. > However, the one thing it wouldn't compromise on was speed, even at > the expense of safety. A community exists that cannot live with this > compromise. We were told that the Python core could also not live with > that compromise. I'm not sure I agree. The ufuncobject is the only place where this concern existed (should we trip OverFlow, ZeroDivision, etc. errors durring array math). Numarray introduced and implemented the concept of error modes that can be pushed and popped. I believe this is the right solution for the ufuncobject. One question we are pursuing is could the arrayobject get into the core without a particular ufunc object. Most see this as sub-optimal, but maybe it is the only way. > > Over the years there was pressure to add safety, convenience, > flexibility, etc., all sometimes incompatible with speed. Numarray > represents in some sense the set of compromises in that direction, > besides its technical innovations. Numeric / Numeric3 represents the > need for speed camp. I don't see numarray as representing this at all. To me, numarray represents the desire to have more flexible array types (specifically record arrays and maybe character arrays). I personally don't see Numeric3 as in any kind of "need for speed" camp either. I've never liked this distinction, because I don't think it represents a true dichotomy. 
To me, the differences between Numeric3 and numarray are currently more "architectural" than implementational. Perhaps you are referring to the fact that because numarray has several portions written in Python it is "more flexible" or "more convenient" but slower for small arrays. If you are saying that then I guess Numeric3 is a "need for speed" implementation, and I apologize for not understanding. > > I think it is reasonable to suppose that the need for speed piece can > be wrapped suitably by the need for safety-flexibility-convenience > facilities. I believe that hope underlies Travis' plan. If the "safety-flexibility-convenience" facilities can be specified, then I'm all for one implementation. Numeric3 design goals do not go against any of these ideas intentionally. > > The Nummies (the official set of developers) thought that the Numeric > code base was an unsuitable basis for further development. There was > no dissent about that at least. My idea was to get something like what > Travis is now doing done to replace it. I felt it important to get > myself out of the picture after five years as the lead developer > especially since my day job had ceased to involve using Numeric. Some of the parts needed to be re-written, but I didn't think that meant moving away from the goal to have a single C-type that is the arrayobject. During this process Python 2.2 came out and allowed sub-classing from C-types. As Perry mentioned, and I think needs to be emphasized again, this changed things as any benefit from having a Python-class for the final basic array type disappeared --- beyond ease of prototyping and testing. > > However, removing my cork from the bottle released the unresolved > pressure between these two camps. My plan for transition failed. I > thought I had consensus on the goal and in fact it wasn't really > there. Everyone is perfectly good-willed and clever and trying hard to > "all just get along", but the goal was lost. Eric Raymond should > write a book about it called "Bumbled Bazaar". This is an accurate description. Fortunately, I don't think any ill-will exists (assuming I haven't created any with my recent activities). I do want to "get-along." I just don't want to be silent when there are issues that I think I understand. > > I hope everyone will still try to achieve that goal. Interoperability > of all the Numeric-related software (including supporting a 'default' > plotting package) is required. Utopia is always out of reach :-) > Aside: While I am at it, let me reiterate what I have said to the > other developers privately: there is NO value to inheriting from the > array class. Don't try to achieve that capability if it costs > anything, even just effort, because it buys you nothing. Those of you > who keep remarking on this as if it would simply haven't thought it > through IMHO. It sounds so intellectually appealing that David Ascher > and I had a version of Numeric that almost did it before we realized > our folly. > I appreciate some of what Paul is saying here, but I'm not fully convinced that this is still true with Python 2.2 and up new-style c-types. The concerns seem to be over the fact that you have to re-implement everything in the sub-class because the base-class will always return one of its objects instead of a sub-class object. It seems to me, however, that if the C methods use the object type alloc function when creating new objects then some of this problem is avoided (i.e. 
if the method is called with a sub-class type passed in, then a sub-class type gets set). Have you looked at how Python now allows sub-classing in C? I'm not an expert here, but it seems like a lot of the problems you were discussing have been ameliorated. There are probably still issues, but.... I will know more when I seen what happens with a Matrix Object inheriting from a Python C-array object. I'm wondering if anyone else with more knowledge about new-style c-types could help here and show me the error of my thinking. -Travis From gvanrossum at gmail.com Thu Feb 10 05:36:39 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Feb 10 05:36:44 2005 Subject: [Python-Dev] Re: Numeric life as I see it In-Reply-To: <420ADE90.9050304@ee.byu.edu> References: <420A8406.4020808@ee.byu.edu> <420AAC33.807@ee.byu.edu> <420AB084.1000008@v.loewis.de> <420AB928.3090004@pfdubois.com> <420ADE90.9050304@ee.byu.edu> Message-ID: [Paul] > > Aside: While I am at it, let me reiterate what I have said to the > > other developers privately: there is NO value to inheriting from the > > array class. Don't try to achieve that capability if it costs > > anything, even just effort, because it buys you nothing. Those of you > > who keep remarking on this as if it would simply haven't thought it > > through IMHO. It sounds so intellectually appealing that David Ascher > > and I had a version of Numeric that almost did it before we realized > > our folly. [Travis] > I appreciate some of what Paul is saying here, but I'm not fully > convinced that this is still true with Python 2.2 and up new-style > c-types. The concerns seem to be over the fact that you have to > re-implement everything in the sub-class because the base-class will > always return one of its objects instead of a sub-class object. > It seems to me, however, that if the C methods use the object type > alloc function when creating new objects then some of this problem is > avoided (i.e. if the method is called with a sub-class type passed in, > then a sub-class type gets set). This would severely constrain the __new__ method of the subclass. > Have you looked at how Python now allows sub-classing in C? I'm not an > expert here, but it seems like a lot of the problems you were discussing > have been ameliorated. There are probably still issues, but.... > > I will know more when I seen what happens with a Matrix Object > inheriting from a Python C-array object. And why would a Matrix need to inherit from a C-array? Wouldn't it make more sense from an OO POV for the Matrix to *have* a C-array without *being* one? > I'm wondering if anyone else with more knowledge about new-style c-types > could help here and show me the error of my thinking. I'm trying... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From oliphant at ee.byu.edu Thu Feb 10 06:02:11 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Feb 10 06:02:15 2005 Subject: [Numpy-discussion] Re: [Python-Dev] Re: Numeric life as I see it In-Reply-To: References: <420A8406.4020808@ee.byu.edu> <420AAC33.807@ee.byu.edu> <420AB084.1000008@v.loewis.de> <420AB928.3090004@pfdubois.com> <420ADE90.9050304@ee.byu.edu> Message-ID: <420AEAD3.9030705@ee.byu.edu> >[Travis] > > >>I appreciate some of what Paul is saying here, but I'm not fully >>convinced that this is still true with Python 2.2 and up new-style >>c-types. 
The concerns seem to be over the fact that you have to >>re-implement everything in the sub-class because the base-class will >>always return one of its objects instead of a sub-class object. >>It seems to me, however, that if the C methods use the object type >>alloc function when creating new objects then some of this problem is >>avoided (i.e. if the method is called with a sub-class type passed in, >>then a sub-class type gets set). >> >> > >This would severely constrain the __new__ method of the subclass. > > I obviously don't understand the intricacies here, so fortunately it's not a key issue for me because I'm not betting the farm on being able to inherit from the arrayobject. But, it is apparent that I don't understand all the issues. >>Have you looked at how Python now allows sub-classing in C? I'm not an >>expert here, but it seems like a lot of the problems you were discussing >>have been ameliorated. There are probably still issues, but.... >> >>I will know more when I seen what happens with a Matrix Object >>inheriting from a Python C-array object. >> >> > >And why would a Matrix need to inherit from a C-array? Wouldn't it >make more sense from an OO POV for the Matrix to *have* a C-array >without *being* one? > > The only reason I'm thinking of here is to have it inherit from the C-array many of the default methods without having to implement them all itself. I think Paul is saying that this never works with C-types like arrays, and I guess from your comments you agree with him. The only real reason for wanting to construct a separate Matrix object is the need to overload the * operation to do matrix multiplication instead of element-by-element multiplication. -Travis From david.ascher at gmail.com Thu Feb 10 06:50:26 2005 From: david.ascher at gmail.com (David Ascher) Date: Thu Feb 10 06:50:29 2005 Subject: [Numpy-discussion] Re: [Python-Dev] Re: Numeric life as I see it In-Reply-To: <420AEAD3.9030705@ee.byu.edu> References: <420A8406.4020808@ee.byu.edu> <420AAC33.807@ee.byu.edu> <420AB084.1000008@v.loewis.de> <420AB928.3090004@pfdubois.com> <420ADE90.9050304@ee.byu.edu> <420AEAD3.9030705@ee.byu.edu> Message-ID: On Wed, 09 Feb 2005 22:02:11 -0700, Travis Oliphant wrote: GvR: >And why would a Matrix need to inherit from a C-array? Wouldn't it >make more sense from an OO POV for the Matrix to *have* a C-array >without *being* one? Travis: > The only reason I'm thinking of here is to have it inherit from the > C-array many of the default methods without having to implement them all > itself. I think Paul is saying that this never works with C-types like > arrays, and I guess from your comments you agree with him. > > The only real reason for wanting to construct a separate Matrix object > is the need to overload the * operation to do matrix multiplication > instead of element-by-element multiplication. This is dredging stuff up from years (and layers and layers of new memories =), but I think that what Paul was referring to was in fact independent of implementation language. The basic problem, IIRC, had to do with the classic (it turns out) problem of confusing the need for reuse of implementation bits with interface inheritance. We always felt that things that people felt were "array-like" (Matrices, missing value arrays, etc.) _should_ inherit from array, and that (much like you're saying), it would save work. 
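The point Paul and Guido have been making, that methods inherited from a C base type keep handing back base-type instances, is easy to reproduce in pure Python. The sketch below is not from the thread; "Matrix" is hypothetical and plain list stands in for the array type:

    # Illustrative only: list plays the role of the C array type.
    class Matrix(list):
        def __mul__(self, other):
            # the one operation we bother to re-implement
            return Matrix([x * y for x, y in zip(self, other)])

    m = Matrix([1, 2, 3])
    print type(m[0:2])   # <type 'list'>  (slicing is inherited and returns a plain list)
    print type(m + m)    # <type 'list'>  (concatenation too)
    print type(m * m)    # <class '__main__.Matrix'>  (only because __mul__ was overridden)

Every method whose result should stay a Matrix needs the same treatment, which is the wrapping cost being debated here.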
In practice, however, there were a few problems (again, from lousy memory), all boiling down to the fact that the array object implemenation implies interfaces that weren't actually applicable to the others. The biggest problems had to do with the fact that when you do subclassing, you end up in a nasty combinatorial problem when you wanted to figure out what operand1 operator operand2 means, if operand1 is a derivative and operand2 is a different derivative. In other words, if you multiply a matrix with a missingvalues array, what should you do? Having a common inheritance means you need to _stop_ default behaviors from happening, to avoid meaningless results. It gets worse with function calls that take "array-like objects" as arguments. A lot of this may be resolvable with the recent notions of adaptation and more formalized interfaces. In the meantime, I would, like Paul, recommend that you separate the interface-bound type aspects (which is what Python classes are in fact!) from the implementation sharing. This may be obvious to everyone, and if so, sorry. --david From oliphant at ee.byu.edu Thu Feb 10 10:30:22 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Feb 10 10:30:38 2005 Subject: [Python-Dev] Re: [Numpy-discussion] Re: Numeric life as I see it In-Reply-To: <1c3044466186480f55ef45d2c977731b@laposte.net> References: <420A8406.4020808@ee.byu.edu> <420AAC33.807@ee.byu.edu> <420AB084.1000008@v.loewis.de> <420AB928.3090004@pfdubois.com> <420ADE90.9050304@ee.byu.edu> <1c3044466186480f55ef45d2c977731b@laposte.net> Message-ID: <420B29AE.8030701@ee.byu.edu> >> One question we are pursuing is could the arrayobject get into the >> core without a particular ufunc object. Most see this as >> sub-optimal, but maybe it is the only way. > > > Since all the artithmetic operations are in ufunc that would be > suboptimal solution, but indeed still a workable one. I think replacing basic number operations of the arrayobject should simple, so perhaps a default ufunc object could be worked out for inclusion. > >> I appreciate some of what Paul is saying here, but I'm not fully >> convinced that this is still true with Python 2.2 and up new-style >> c-types. The concerns seem to be over the fact that you have to >> re-implement everything in the sub-class because the base-class will >> always return one of its objects instead of a sub-class object. > > > I'd say that such discussions should be postponed until someone > proposes a good use for subclassing arrays. Matrices are not one, in > my opinion. > Agreed. It is is not critical to what I am doing, and I obviously need more understanding before tackling such things. Numeric3 uses the new c-type largely because of the nice getsets table which is separate from the methods table. This replaces the rather ugly C-functions getattr and setattr. -Travis From p.f.moore at gmail.com Thu Feb 10 10:40:21 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Thu Feb 10 10:40:24 2005 Subject: [Python-Dev] discourage patch reviews to the list? In-Reply-To: <420AB7FA.3040106@ocf.berkeley.edu> References: <740c3aec050209112069d8c328@mail.gmail.com> <420A87C7.7030102@ocf.berkeley.edu> <420A9849.6020304@v.loewis.de> <420AB7FA.3040106@ocf.berkeley.edu> Message-ID: <79990c6b05021001407626182@mail.gmail.com> On Wed, 09 Feb 2005 17:25:14 -0800, Brett C. wrote: > All valid points, but I also don't want people to suddenly start posting > one-liners or bug posts. 
> > I guess it comes down to a signal-to-noise ratio and if the level of signal we > are currently getting will hold. If we say it is okay for people to send in > patch reviews *only* and not notifications of new patches, bug reports, or bug > reviews, then I can handle it. Having done some reviews (admittedly for the 5-for-1 deal) I do like seeing patch reviews appear on python-dev. As they are meant to be reviews, this implies a certain level of effort expended, and quality in the response. I agree with Martin that detail comments should go in the tracker - a posting can summarise to an extent, but should be enough to let python-dev readers know if they can act on the review. It's nice to see new contributors doing good work to help Python, and I assume they like the chance to feel like they are "participating" by posting helpful contributions to python-dev. IMHO, the tracker doesn't give this same feeling of "contributing". Also, review postings encourage others to do the same - I know I did my reviews after having seen someone else post a set of reviews. It made me think "hey, I could do that!" I'm sure there are other lurkers on python-dev who could be encouraged to assist in the same way. Having said this, I'd suggest that if people intend to review multiple patches, they post a summary covering a number of patches at a time. Paul. From jimjjewett at gmail.com Thu Feb 10 19:51:43 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu Feb 10 19:51:48 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python compile.c, 2.343, 2.344 In-Reply-To: References: Message-ID: On Wed, 09 Feb 2005 17:42:41 -0800, rhettinger@users.sourceforge.net wrote: > Update of /cvsroot/python/python/dist/src/Python > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv31172 > > Modified Files: > compile.c > Log Message: > Remove the set conversion which didn't work with: [] in (0,) Why is this a problem? If there were *any* unhashable objects in the container, then the compiler would have bailed on the initial set-conversion. If there aren't any unhashable values, then the (unhashable) item being checked is not in the set. ==> Return False. Are you worried about unhashable objects (as item) which compare == to something that is hashable (in container)? Custom rich compares can already confuse the "in" tests. Or is the problem that guarding against/trapping this case is somehow so expensive that it overrides the expected savings? -jJ From tim.peters at gmail.com Thu Feb 10 20:09:34 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Feb 10 20:09:39 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib xmlrpclib.py, 1.38, 1.39 In-Reply-To: References: Message-ID: <1f7befae0502101109161da0d1@mail.gmail.com> [fdrake@users.sourceforge.net] > Modified Files: > xmlrpclib.py > Log Message: > accept datetime.datetime instances when marshalling; > dateTime.iso8601 elements still unmarshal into xmlrpclib.DateTime objects > > Index: xmlrpclib.py ... > + if datetime and isinstance(value, datetime.datetime): > + self.value = value.strftime("%Y%m%dT%H:%M:%S") > + return ... [and similarly later] ... Fred, is there a reason to avoid datetime.datetime's .isoformat() method here? Like so: >>> import datetime >>> print datetime.datetime(2005, 2, 10, 14, 0, 8).isoformat() 2005-02-10T14:00:08 A possible downside is that you'll also get fractional seconds if the instance records a non-zero .microseconds value. From fdrake at acm.org Thu Feb 10 20:23:59 2005 From: fdrake at acm.org (Fred L. 
Drake, Jr.) Date: Thu Feb 10 20:24:11 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib xmlrpclib.py, 1.38, 1.39 In-Reply-To: <1f7befae0502101109161da0d1@mail.gmail.com> References: <1f7befae0502101109161da0d1@mail.gmail.com> Message-ID: <200502101423.59995.fdrake@acm.org> On Thursday 10 February 2005 14:09, Tim Peters wrote: > Fred, is there a reason to avoid datetime.datetime's .isoformat() > method here? Like so: Yes. The XML-RPC spec is quite vague. It claims that the dates are in ISO 8601 format, but doesn't say anything more about it. The example shows a string without hyphens (but with colons), so I stuck with eactly that. > A possible downside is that you'll also get fractional seconds if the > instance records a non-zero .microseconds value. There's nothing in the XML-RPC spec about the resolution of time, so, again, I'd rather be conservative in what we generate. -Fred -- Fred L. Drake, Jr. From tim.peters at gmail.com Thu Feb 10 20:44:21 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Feb 10 20:44:24 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib xmlrpclib.py, 1.38, 1.39 In-Reply-To: <200502101423.59995.fdrake@acm.org> References: <1f7befae0502101109161da0d1@mail.gmail.com> <200502101423.59995.fdrake@acm.org> Message-ID: <1f7befae050210114446dee240@mail.gmail.com> [Tim] >> Fred, is there a reason to avoid datetime.datetime's .isoformat() >> method here? Like so: > Yes. The XML-RPC spec is quite vague. It claims that the dates are in ISO > 8601 format, but doesn't say anything more about it. The example shows a > string without hyphens (but with colons), so I stuck with eactly that. Well, then since that isn't ISO 8601 format, it would be nice to have a comment explaining why it's claiming to be anyway <0.5 wink>. >> A possible downside is that you'll also get fractional seconds if the >> instance records a non-zero .microseconds value. > There's nothing in the XML-RPC spec about the resolution of time, so, again, > I'd rather be conservative in what we generate. dt.replace(microsecond=0).isoformat() suffices for that much. Tack on .replace('-', '') to do the whole job. From mwh at python.net Thu Feb 10 20:54:13 2005 From: mwh at python.net (Michael Hudson) Date: Thu Feb 10 20:54:15 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python compile.c, 2.343, 2.344 In-Reply-To: (Jim Jewett's message of "Thu, 10 Feb 2005 13:51:43 -0500") References: Message-ID: <2mmzuc42e2.fsf@starship.python.net> Jim Jewett writes: > On Wed, 09 Feb 2005 17:42:41 -0800, rhettinger@users.sourceforge.net > wrote: >> Update of /cvsroot/python/python/dist/src/Python >> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv31172 >> >> Modified Files: >> compile.c >> Log Message: >> Remove the set conversion which didn't work with: [] in (0,) > > Why is this a problem? It broke the test suite... > If there were *any* unhashable objects in the container, then the > compiler would have bailed on the initial set-conversion. Also, the RHS wouldn't have been a tuple of constants, as far as the compiler saw it. > If there aren't any unhashable values, then the (unhashable) item > being checked is not in the set. ==> Return False. This would seem to require changing the frozenset implementation. I don't know if the option of unhashable implying returning false from frozenset.__contains__() was considered at the time it was implemented but it doesn't feel right to me. 
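For reference, the asymmetry being discussed looks like this at a Python 2.4 prompt (a minimal reconstruction, not quoted from the thread):

    >>> [] in (0,)              # tuple membership only needs == comparisons
    False
    >>> [] in frozenset((0,))   # frozenset must hash the key first
    Traceback (most recent call last):
      ...
    TypeError: list objects are unhashable

So an "in" test that the compiler silently rewrites from a constant tuple to a frozenset can start raising where the original expression simply returned False.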
> Are you worried about unhashable objects (as item) which > compare == to something that is hashable (in container)? > Custom rich compares can already confuse the "in" tests. This was a concern of mine, yes. Although any custom object (particularly an unhashable one!) that compares equal to something so simple as an integer, string or tuple seems bad design, I'm not sure that's the point. > Or is the problem that guarding against/trapping this case is > somehow so expensive that it overrides the expected savings? If you want to compile the expression x in (1,2,3) to contain the moral equivalent of a try:except: block, I'd question your sanity. Cheers, mwh -- > It might get my attention if you'd spin around in your chair, > spoke in tongues, and puked jets of green goblin goo. I can arrange for this. ;-) -- Barry Warsaw & Fred Drake From fdrake at acm.org Thu Feb 10 21:16:37 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu Feb 10 21:16:42 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib xmlrpclib.py, 1.38, 1.39 In-Reply-To: <1f7befae050210114446dee240@mail.gmail.com> References: <200502101423.59995.fdrake@acm.org> <1f7befae050210114446dee240@mail.gmail.com> Message-ID: <200502101516.37550.fdrake@acm.org> On Thursday 10 February 2005 14:44, Tim Peters wrote: > Well, then since that isn't ISO 8601 format, it would be nice to have > a comment explaining why it's claiming to be anyway <0.5 wink>. Hmm, that's right (ISO 8601:2000, section 5.4.2). Sigh. > dt.replace(microsecond=0).isoformat() > > suffices for that much. Tack on .replace('-', '') to do the whole job. Yep, that would work too. -Fred -- Fred L. Drake, Jr. From fdrake at acm.org Thu Feb 10 21:32:14 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu Feb 10 21:32:21 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib xmlrpclib.py, 1.38, 1.39 In-Reply-To: <1f7befae050210114446dee240@mail.gmail.com> References: <200502101423.59995.fdrake@acm.org> <1f7befae050210114446dee240@mail.gmail.com> Message-ID: <200502101532.14964.fdrake@acm.org> On Thursday 10 February 2005 14:44, Tim Peters wrote: > Well, then since that isn't ISO 8601 format, it would be nice to have > a comment explaining why it's claiming to be anyway <0.5 wink>. I've posted a note on the XML-RPC list about this. There doesn't seem to be anything that describes the range of what's accepted and produced by the various XML-RPC libraries, but I've not looked hard for it. -Fred -- Fred L. Drake, Jr. From jjl at pobox.com Thu Feb 10 23:30:23 2005 From: jjl at pobox.com (John J Lee) Date: Thu Feb 10 23:34:20 2005 Subject: [Python-Dev] Patches for cookielib bugs, for 2.4.1 Message-ID: Hope these can get in before 2.4.1. All include unit tests. http://python.org/sf/1117339 cookielib and cookies with special names http://python.org/sf/1117454 cookielib.LWPCookieJar incorrectly loads value-less cookies http://python.org/sf/1117398 cookielib LWPCookieJar and MozillaCookieJar exceptions John From pinard at iro.umontreal.ca Fri Feb 11 00:00:04 2005 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Fri Feb 11 00:00:32 2005 Subject: [Python-Dev] discourage patch reviews to the list? 
In-Reply-To: <420AC729.6070804@v.loewis.de> References: <740c3aec050209112069d8c328@mail.gmail.com> <420A87C7.7030102@ocf.berkeley.edu> <420A9849.6020304@v.loewis.de> <420AB7FA.3040106@ocf.berkeley.edu> <420AC729.6070804@v.loewis.de> Message-ID: <20050210230004.GA17095@phenix.progiciels-bpi.ca> [Martin von L?wis] > I disagree with the primary reason not to take bug reports on > python-dev, however: bug reports in email get lost if not immediately > processed; usage of a tracker is necessary to actually "keep > track". Some developers and users appreciate bug trackers, or at least are able to stand them. Others, at least like me, just hate them. When a developer replies to one of my emails, asking me that I use the bug tracker, my email was surely not lost, since the developer is replying to it. That developer could have used the bug tracker himself, the way he sees fit, instead of inviting me to do it. In fact, a developer asking me to use the tracker of the day is trying to educate me into using it. Or maybe he knows that using the tracker is uneasy and is trying to spare himself some disgust. Or maybe he is consciously trying to turn me down :-). I do not buy the argument of the fear of emails being lost. Actually, almost all of my emails reporting bugs received a reply in one form or another, so developers do see them. If a developer wants to use a bug tracker, then nice, good for him. For one, trackers merely tell me that I should get a life and do nicer things than reporting bugs. In any case, Python has plenty of users, and others will contribute anyway. So, after all, why should I? > So this kind of bug management is the primary reason for the tracker, > not that we want to keep random users out of python-dev (although this > is a convenient side effect). Hey, that's good! Trackers may act like a randomiser! :-) -- Fran?ois Pinard http://pinard.progiciels-bpi.ca From david.ascher at gmail.com Fri Feb 11 00:03:29 2005 From: david.ascher at gmail.com (David Ascher) Date: Fri Feb 11 00:03:32 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib xmlrpclib.py, 1.38, 1.39 In-Reply-To: <200502101532.14964.fdrake@acm.org> References: <200502101423.59995.fdrake@acm.org> <1f7befae050210114446dee240@mail.gmail.com> <200502101532.14964.fdrake@acm.org> Message-ID: On Thu, 10 Feb 2005 15:32:14 -0500, Fred L. Drake, Jr. wrote: > On Thursday 10 February 2005 14:44, Tim Peters wrote: > > Well, then since that isn't ISO 8601 format, it would be nice to have > > a comment explaining why it's claiming to be anyway <0.5 wink>. > > I've posted a note on the XML-RPC list about this. There doesn't seem to be > anything that describes the range of what's accepted and produced by the > various XML-RPC libraries, but I've not looked hard for it. Is there any surprise here? =) From tim.peters at gmail.com Fri Feb 11 00:27:01 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Feb 11 00:28:49 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib xmlrpclib.py, 1.38, 1.39 In-Reply-To: <200502101516.37550.fdrake@acm.org> References: <200502101423.59995.fdrake@acm.org> <1f7befae050210114446dee240@mail.gmail.com> <200502101516.37550.fdrake@acm.org> Message-ID: <1f7befae05021015277972d295@mail.gmail.com> [Tim] >> Well, then since that isn't ISO 8601 format, it would be nice to have >> a comment explaining why it's claiming to be anyway <0.5 wink>. [Fred] > Hmm, that's right (ISO 8601:2000, section 5.4.2). Sigh. Ain't your fault. 
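To make the formats concrete, here is a small sketch (not from the thread; it just extends Tim's example timestamp with a microsecond value) comparing what the checked-in xmlrpclib code emits, what isoformat() gives, and Tim's suggested combination:

    import datetime
    dt = datetime.datetime(2005, 2, 10, 14, 0, 8, 123456)

    dt.strftime("%Y%m%dT%H:%M:%S")
    # '20050210T14:00:08'            (what the checked-in xmlrpclib code produces)
    dt.isoformat()
    # '2005-02-10T14:00:08.123456'   (ISO 8601, fractional seconds included)
    dt.replace(microsecond=0).isoformat().replace('-', '')
    # '20050210T14:00:08'            (Tim's suggestion: same compact form, via isoformat)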
I didn't remember that I had seen the XML-RPC spec before, in conjunction with its crazy rules for representing floats. It's a very vague doc indeed. Anyway, some quick googling strongly suggests that many XML-RPC implementors don't know anything about 8601 either, and accept/produce only the non-8601 format inferred from the single example in "the spec". Heh -- kids . From bjourne at gmail.com Fri Feb 11 01:15:18 2005 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Fri Feb 11 01:18:50 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python compile.c, 2.343, 2.344 In-Reply-To: References: Message-ID: <740c3aec05021016151c0de340@mail.gmail.com> > On Wed, 09 Feb 2005 17:42:41 -0800, rhettinger@users.sourceforge.net > wrote: > > Update of /cvsroot/python/python/dist/src/Python > > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv31172 > > > > Modified Files: > > compile.c > > Log Message: > > Remove the set conversion which didn't work with: [] in (0,) > > Why is this a problem? If there were *any* unhashable objects > in the container, then the compiler would have bailed on the > initial set-conversion. >>> [] in frozenset(["hi", "ho"]) Traceback (most recent call last): File "", line 1, in ? TypeError: list objects are unhashable The compiler do bail out when there are unhashable objects outside the tuple, but not if the LHS is unhashable. I believe that is because internally frozenset uses a dict and it does something similar to d.has_key([]) in this case. It should be trivial for the compiler to also check the LHS for hashability I think. That is also why the email unit test failed - LHS was unhashable but the RHS was hashable. There is a patch for that (1119016) at SF but that may no longer be needed. -- mvh Bj?rn From python at rcn.com Fri Feb 11 01:59:25 2005 From: python at rcn.com (Raymond Hettinger) Date: Fri Feb 11 02:03:22 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Pythoncompile.c, 2.343, 2.344 In-Reply-To: <740c3aec05021016151c0de340@mail.gmail.com> Message-ID: <004101c50fd4$ee8c9080$83f9cc97@oemcomputer> [Raymond] > > > Remove the set conversion which didn't work with: [] in (0,) [Jim] > > Why is this a problem? If there were *any* unhashable objects > > in the container, then the compiler would have bailed on the > > initial set-conversion. > > >>> [] in frozenset(["hi", "ho"]) > Traceback (most recent call last): > File "", line 1, in ? > TypeError: list objects are unhashable [Bjorn] > The compiler do bail out when there are unhashable objects outside the > tuple, but not if the LHS is unhashable. I believe that is because > internally frozenset uses a dict and it does something similar to > d.has_key([]) in this case. It should be trivial for the compiler to > also check the LHS for hashability I think. > > That is also why the email unit test failed - LHS was unhashable but > the RHS was hashable. There is a patch for that (1119016) at SF but > that may no longer be needed. Right, that patch only fixes a symptom. Also, the compiler cannot check the hashability of the search key because it is likely not known at compile time (i.e. x in (1,2,3) where x is a function argument). For the time being, the set conversion concept was removed entirely. 
To go forward with it at some point, it will need a fast search type other than frozenset, something like:

    class FastSearchTuple(tuple):
        """Tuple lookalike that has O(1) search time if both the key and
        tuple elements are hashable; otherwise it reverts to an O(n)
        linear search.

        Used by compile.c for 'in' tests on tuples of constants.
        """
        def __init__(self, data):
            try:
                self.dict = dict.fromkeys(data)
            except TypeError:
                self.dict = None
        def __contains__(self, key):
            try:
                return key in self.dict
            except TypeError:
                # Unhashable key (or no dict could be built): fall back
                # to a linear scan of the tuple itself.
                return tuple.__contains__(self, key)

Raymond Hettinger
From abo at minkirri.apana.org.au Fri Feb 11 03:15:47 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Fri Feb 11 03:16:32 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <20050208195243.GD10650@zot.electricrain.com> References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <20050208195243.GD10650@zot.electricrain.com> Message-ID: <1108088147.3753.51.camel@schizo> On Tue, 2005-02-08 at 11:52 -0800, Gregory P. Smith wrote:
> > The md5.h/md5c.c files allow "copy and use", but no modification of
> > the files. There are some alternative implementations, i.e. in glibc,
> > openssl, so a replacement should be safe. Any other requirements when
> > considering a replacement?

One thing to consider is "degree of difficulty" :-)

> > Matthias
>
> I believe the "plan" for md5 and sha1 and such is to use the much
> faster openssl versions "in the future" (based on a long thread
> debating future interfaces to such things on python-dev last summer).
> That'll sidestep any tedious license issue and give a better
> implementation at the same time. i don't believe anyone has taken the
> time to make such a patch yet.

I wasn't around for that discussion. There are two viable replacements for the RSA implementation currently used;

    libmd
    openssl

The libmd implementation is by Colin Plumb and has the licence: "This code is in the public domain; do with it what you wish." The API is identical to the RSA implementation and the BSD world's libmd, and hence is a drop-in replacement. This implementation is faster than the RSA implementation.

The openssl implementation has an Apache-style license. The API is almost the same but slightly different to the RSA API, so it would require a little bit of work to make it fit. This implementation is the fastest currently available, as it includes many platform-specific optimisations for a large range of platforms.

Currently md5c.c is included in the Python sources. The libmd implementation has a drop-in replacement for md5c.c. The openssl implementation is a complicated tangle of Makefile-expanded template code that would be harder to include in the Python sources.

In the Linux world, openssl is starting to become ubiquitous, so not including it and statically or even dynamically linking against it is feasible. However, using Python in other lands will probably require something to be included.

Long term, I think openssl is the way to go. Short term, libmd is a painless replacement that gets around the licensing issues. I have been using the libmd API stuff for md4 in librsync, and am looking at migrating to the openssl API. If people hassle me, I could probably do the openssl API migration for Python, but I'm not sure what the best approach would be for including the source in the Python sources. FWIW, I also have an md4sum module and md4c.c implementation that I'm happy to contribute to Python (done for pysync).
-- Donovan Baarda http://minkirri.apana.org.au/~abo/ From bob at redivi.com Fri Feb 11 03:30:59 2005 From: bob at redivi.com (Bob Ippolito) Date: Fri Feb 11 03:31:06 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <1108088147.3753.51.camel@schizo> References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> Message-ID: On Feb 10, 2005, at 9:15 PM, Donovan Baarda wrote: > On Tue, 2005-02-08 at 11:52 -0800, Gregory P. Smith wrote: >>> The md5.h/md5c.c files allow "copy and use", but no modification of >>> the files. There are some alternative implementations, i.e. in glibc, >>> openssl, so a replacement should be sage. Any other requirements when >>> considering a replacement? > > One thing to consider is "degree of difficulty" :-) > >>> Matthias >> >> I believe the "plan" for md5 and sha1 and such is to use the much >> faster openssl versions "in the future" (based on a long thread >> debating future interfaces to such things on python-dev last summer). >> That'll sidestep any tedious license issue and give a better >> implementation at the same time. i don't believe anyone has taken the >> time to make such a patch yet. > > I wasn't around for that discussion. There are two viable replacements > for the RSA implementation currently used; > > libmd > openssl . -- > In the Linux world, openssl is starting to become ubiquitous, so not > including it and statically or even dynamically linking against it is > feasible. However, using Python in other lands will probably require > something to be included. > > Long term, I think openssl is the way to go. Short term, libmd is a > painless replacement that gets around the licencing issues. OpenSSL is also ubiquitous on Mac OS X (as a shared lib): Mac OS X 10.2.8 has OpenSSL 0.9.6i Feb 19 2003 Mac OS X 10.3.8 has OpenSSL 0.9.7b 10 Apr 2003 One possible alternative would be to bring in something like PyOpenSSL and just rewrite the md5 (and sha?) extensions as Python modules that use that API. -bob From abo at minkirri.apana.org.au Fri Feb 11 03:50:48 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Fri Feb 11 03:51:25 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> Message-ID: <1108090248.3753.53.camel@schizo> On Thu, 2005-02-10 at 21:30 -0500, Bob Ippolito wrote: > On Feb 10, 2005, at 9:15 PM, Donovan Baarda wrote: > > > On Tue, 2005-02-08 at 11:52 -0800, Gregory P. Smith wrote: [...] > One possible alternative would be to bring in something like PyOpenSSL > and just rewrite the md5 (and sha?) > extensions as Python modules that use that API. Only problem with this, is pyopenssl doesn't yet include any mdX or sha modules. 
-- Donovan Baarda http://minkirri.apana.org.au/~abo/ From bob at redivi.com Fri Feb 11 05:13:55 2005 From: bob at redivi.com (Bob Ippolito) Date: Fri Feb 11 05:14:17 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <1108090248.3753.53.camel@schizo> References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> Message-ID: <226e9c65e562f9b0439333053036fef3@redivi.com> On Feb 10, 2005, at 9:50 PM, Donovan Baarda wrote: > On Thu, 2005-02-10 at 21:30 -0500, Bob Ippolito wrote: >> On Feb 10, 2005, at 9:15 PM, Donovan Baarda wrote: >> >>> On Tue, 2005-02-08 at 11:52 -0800, Gregory P. Smith wrote: > [...] >> One possible alternative would be to bring in something like PyOpenSSL >> and just rewrite the md5 (and >> sha?) >> extensions as Python modules that use that API. > > Only problem with this, is pyopenssl doesn't yet include any mdX or sha > modules. My bad, how about M2Crypto then? This one supports message digests and is more license compatible with Python to boot. -bob From abo at minkirri.apana.org.au Fri Feb 11 07:15:39 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Fri Feb 11 07:16:20 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <226e9c65e562f9b0439333053036fef3@redivi.com> References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> Message-ID: <1108102539.3753.87.camel@schizo> On Thu, 2005-02-10 at 23:13 -0500, Bob Ippolito wrote: > On Feb 10, 2005, at 9:50 PM, Donovan Baarda wrote: > > > On Thu, 2005-02-10 at 21:30 -0500, Bob Ippolito wrote: [...] > > Only problem with this, is pyopenssl doesn't yet include any mdX or sha > > modules. > > My bad, how about M2Crypto > then? This one supports message digests and is more license compatible > with Python to boot. [...] This one does have md5 support, but the Python API is rather different from the current python md5sum API. It hooks into the slightly higher level MVP openssl layer, rather than the lower level md5 layer. Hooking into the MVP layer pretty much requires including all the openssl message digest implementations (which may or may not be a good idea). It also uses SWIG to generate the extension module. I don't think anything else in Python itself uses SWIG, so starting to use it would introduce a "Build Dependency". I think it would be cleaner and simpler to modify the existing md5module.c to use the openssl md5 layer API (this is just a search/replace to change the function names). The bigger problem is deciding what/how/whether to include the openssl md5 implementation sources so that win32 can use them. 
-- Donovan Baarda http://minkirri.apana.org.au/~abo/ From abo at minkirri.apana.org.au Fri Feb 11 07:52:20 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Fri Feb 11 07:52:57 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <1108102539.3753.87.camel@schizo> References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> Message-ID: <1108104740.3753.91.camel@schizo> On Fri, 2005-02-11 at 17:15 +1100, Donovan Baarda wrote: [...] > I think it would be cleaner and simpler to modify the existing > md5module.c to use the openssl md5 layer API (this is just a > search/replace to change the function names). The bigger problem is > deciding what/how/whether to include the openssl md5 implementation > sources so that win32 can use them. Thinking about it, probably the best way is to include the libmd md5c.c modified to use the openssl API, and then use configure to check for and use openssl if it is available. That way win32 could use the provided md5c.c, and other platforms could use the faster openssl. -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From doko at cs.tu-berlin.de Fri Feb 11 12:55:02 2005 From: doko at cs.tu-berlin.de (Matthias Klose) Date: Fri Feb 11 12:55:26 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <1108088147.3753.51.camel@schizo> References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> Message-ID: <16908.40214.287358.160325@gargle.gargle.HOWL> Donovan Baarda writes: > On Tue, 2005-02-08 at 11:52 -0800, Gregory P. Smith wrote: > > > The md5.h/md5c.c files allow "copy and use", but no modification of > > > the files. There are some alternative implementations, i.e. in glibc, > > > openssl, so a replacement should be sage. Any other requirements when > > > considering a replacement? > > One thing to consider is "degree of difficulty" :-) > > > > Matthias > > > > I believe the "plan" for md5 and sha1 and such is to use the much > > faster openssl versions "in the future" (based on a long thread > > debating future interfaces to such things on python-dev last summer). > > That'll sidestep any tedious license issue and give a better > > implementation at the same time. i don't believe anyone has taken the > > time to make such a patch yet. > > I wasn't around for that discussion. There are two viable replacements > for the RSA implementation currently used; > > libmd > openssl . > > The libmd implementation is by Colin Plumb and has the licence; "This > code is in the public domain; do with it what you wish." The API is > identical to the RSA implementation and BSD world's libmd and hence is a > drop in replacement. This implementation is faster than the RSA > implementation. > [...] > > Currently md5c.c is included in the python sources. The libmd > implementation has a drop in replacement for md5c.c. The openssl > implementation is a complicated tangle of Makefile expanded template > code that would be harder to include in the Python sources. I would prefer that one as a short term solution. Patch at #1118602. 
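Whichever C implementation ends up underneath (the libmd drop-in of the patch above, or OpenSSL later), the Python-level contract it has to preserve is small. A quick sanity check, not from the thread and using only documented md5-module behaviour, might look like:

    import md5

    one_shot = md5.new("Nobody inspects the spammish repetition")
    incremental = md5.new()
    incremental.update("Nobody inspects")
    incremental.update(" the spammish repetition")

    assert md5.digest_size == 16                       # 16-byte (128-bit) digests
    assert len(one_shot.digest()) == 16
    assert one_shot.hexdigest() == incremental.hexdigest()            # incremental == one-shot
    assert incremental.copy().hexdigest() == incremental.hexdigest()  # copy() preserves state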
From doko at cs.tu-berlin.de Fri Feb 11 13:04:38 2005 From: doko at cs.tu-berlin.de (Matthias Klose) Date: Fri Feb 11 13:04:56 2005 Subject: Bug#293932: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <1f7befae05020812377c72de26@mail.gmail.com> Message-ID: <16908.40790.23812.274563@gargle.gargle.HOWL> Jeremy Hylton writes: > Maybe some ambitious PSF activitst could contact Roskind and Steve > Kirsch and see if they know who at Disney to talk to... Or maybe the > Disney guys who were at PyCon last year could help. please could somebody give me a contact address? Matthias From jhylton at gmail.com Fri Feb 11 13:35:18 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Fri Feb 11 13:35:21 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <16908.40214.287358.160325@gargle.gargle.HOWL> References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> <16908.40214.287358.160325@gargle.gargle.HOWL> Message-ID: On Fri, 11 Feb 2005 12:55:02 +0100, Matthias Klose wrote: > > Currently md5c.c is included in the python sources. The libmd > > implementation has a drop in replacement for md5c.c. The openssl > > implementation is a complicated tangle of Makefile expanded template > > code that would be harder to include in the Python sources. > > I would prefer that one as a short term solution. Patch at #1118602. Unfortunately a license that says it is in the public domain is unacceptable (and should be for Debian, too). That is to say, it's not possible for someone to claim that something they produce is in the public domain. See http://www.linuxjournal.com/article/6225 Jeremy From skip at pobox.com Fri Feb 11 13:54:32 2005 From: skip at pobox.com (Skip Montanaro) Date: Fri Feb 11 13:54:44 2005 Subject: Bug#293932: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <16908.40790.23812.274563@gargle.gargle.HOWL> References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <1f7befae05020812377c72de26@mail.gmail.com> <16908.40790.23812.274563@gargle.gargle.HOWL> Message-ID: <16908.43784.902706.197167@montanaro.dyndns.org> >> Maybe some ambitious PSF activitst could contact Roskind and Steve >> Kirsch and see if they know who at Disney to talk to... Or maybe the >> Disney guys who were at PyCon last year could help. Matthias> please could somebody give me a contact address? Steve's easy enough to get ahold of: http://www.skirsch.com/ (He even still has a UltraSeek-powered search of his site. ;-) Search Kirsch's site for Jim Roskind returned jar@netscape.com but that was dated 31 Oct 2000. An abstract for a talk at University of Arizona in late 2003 sort of implied he was still at Netscape then ... maybe... Skip From greg at electricrain.com Fri Feb 11 18:51:18 2005 From: greg at electricrain.com (Gregory P. 
Smith) Date: Fri Feb 11 18:51:26 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <1108102539.3753.87.camel@schizo> References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> Message-ID: <20050211175118.GC25441@zot.electricrain.com> > I think it would be cleaner and simpler to modify the existing > md5module.c to use the openssl md5 layer API (this is just a > search/replace to change the function names). The bigger problem is > deciding what/how/whether to include the openssl md5 implementation > sources so that win32 can use them. yes, that is all i was suggesting. win32 python is already linked against openssl for the socket module ssl support, having the md5 and sha1 modules depend on openssl should not cause a problem. -greg From trentm at ActiveState.com Fri Feb 11 19:37:15 2005 From: trentm at ActiveState.com (Trent Mick) Date: Fri Feb 11 19:39:35 2005 Subject: [Python-Dev] ViewCVS on SourceForge is broken Message-ID: <420CFB5B.7030007@activestate.com> Has anyone else noticed that viewcvs is broken on SF? > [trentm@booboo ~] > $ curl -D tmp/headers http://cvs.sourceforge.net/viewcvs.py/python > > > 502 Bad Gateway > >

Bad Gateway

>

> The proxy server received an invalid response from an upstream server.
>

> > [trentm@booboo ~] > $ cat tmp/headers > HTTP/1.1 502 Bad Gateway > Date: Fri, 11 Feb 2005 18:38:25 GMT > Server: Apache/2.0.40 (Red Hat Linux) > Content-Length: 232 > Connection: close > Content-Type: text/html; charset=iso-8859-1 Or is this just me? It is also broken for other projects for me -- e.g. 'pywin32'. Cheers, Trent -- Trent Mick trentm@activestate.com From tim.peters at gmail.com Fri Feb 11 20:14:30 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Feb 11 20:14:33 2005 Subject: [Python-Dev] ViewCVS on SourceForge is broken In-Reply-To: <420CFB5B.7030007@activestate.com> References: <420CFB5B.7030007@activestate.com> Message-ID: <1f7befae05021111143c346e3@mail.gmail.com> [Trent Mick] > Has anyone else noticed that viewcvs is broken on SF? It failed the same way from Virginia just now. I suppose that's your reward for kindly updating the Python copyright . The good news is that you can use this lull in your Python work to contribute to ZODB development! ViewCVS at zope.org is always happy to see you: http://svn.zope.org/ZODB/trunk/ From theller at python.net Fri Feb 11 20:20:57 2005 From: theller at python.net (Thomas Heller) Date: Fri Feb 11 20:19:24 2005 Subject: [Python-Dev] ViewCVS on SourceForge is broken In-Reply-To: <1f7befae05021111143c346e3@mail.gmail.com> (Tim Peters's message of "Fri, 11 Feb 2005 14:14:30 -0500") References: <420CFB5B.7030007@activestate.com> <1f7befae05021111143c346e3@mail.gmail.com> Message-ID: <7jleewdi.fsf@python.net> Tim Peters writes: > [Trent Mick] >> Has anyone else noticed that viewcvs is broken on SF? > > It failed the same way from Virginia just now. I suppose that's your > reward for kindly updating the Python copyright . > The failure lasts already for several days: http://sourceforge.net/docman/display_doc.php?docid=2352&group_id=1#1107968334 Thomas From tim.peters at gmail.com Fri Feb 11 20:24:51 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Feb 11 20:24:54 2005 Subject: [Python-Dev] ViewCVS on SourceForge is broken In-Reply-To: <7jleewdi.fsf@python.net> References: <420CFB5B.7030007@activestate.com> <1f7befae05021111143c346e3@mail.gmail.com> <7jleewdi.fsf@python.net> Message-ID: <1f7befae05021111246ca3c616@mail.gmail.com> [Thomas Heller] Jeez Louise! As of 2005-02-09 there is an outage of anonymous CVS (tarballs, pserver-based CVS and ViewCVS) for projects whose UNIX names start with the letters m, n, p, q, t, y and z. We are currently working on resolving this issue. So that means it wouldn't even do us any good to rename the project to Thomas, Trent, Mick, Tim, Peters, or ZPython either! All right. Heller 2.5, here we come. From theller at python.net Fri Feb 11 20:27:11 2005 From: theller at python.net (Thomas Heller) Date: Fri Feb 11 20:25:39 2005 Subject: [Python-Dev] ViewCVS on SourceForge is broken In-Reply-To: <1f7befae05021111143c346e3@mail.gmail.com> (Tim Peters's message of "Fri, 11 Feb 2005 14:14:30 -0500") References: <420CFB5B.7030007@activestate.com> <1f7befae05021111143c346e3@mail.gmail.com> Message-ID: <1xbmew34.fsf@python.net> Tim Peters writes: > [Trent Mick] >> Has anyone else noticed that viewcvs is broken on SF? > > It failed the same way from Virginia just now. I suppose that's your > reward for kindly updating the Python copyright . > > The good news is that you can use this lull in your Python work to > contribute to ZODB development! 
ViewCVS at zope.org is always happy > to see you: > > http://svn.zope.org/ZODB/trunk/ Thomas Heller writes: > The failure lasts already for several days: > > http://sourceforge.net/docman/display_doc.php?docid=2352&group_id=1#1107968334 "As of 2005-02-09 there is an outage of anonymous CVS (tarballs, pserver-based CVS and ViewCVS) for projects whose UNIX names start with the letters m, n, p, q, t, y and z." As you can see, both projects with names starting with 'p' and 'z' are affected, so may I suggest to contribute to *ctypes* instead of zope ;-) Thomas From mcherm at mcherm.com Fri Feb 11 21:03:29 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Fri Feb 11 21:03:39 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c Message-ID: <1108152209.420d0f91e312c@mcherm.com> Jeremy writes: > Unfortunately a license that says it is in the public domain is > unacceptable (and should be for Debian, too). That is to say, it's > not possible for someone to claim that something they produce is in > the public domain. See http://www.linuxjournal.com/article/6225 Not quite true. It would be a bit off-topic to discuss on this list so I will simply point you to: http://creativecommons.org/license/publicdomain-2 ...which is specifically designed for the US legal system. It _IS_ possible for someone to produce something in the public domain, it just isn't as easy as some people think (just saying it doesn't necessarily make it so (at least under US law)) and it may not be a good idea. I would expect that if something truly WERE in the public domain, then it would be acceptable for Python (and for Debian too, for that matter). I can't comment on whether this applies to libmd. -- Michael Chermside From tim.peters at gmail.com Fri Feb 11 21:46:00 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Feb 11 21:46:03 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <1108152209.420d0f91e312c@mcherm.com> References: <1108152209.420d0f91e312c@mcherm.com> Message-ID: <1f7befae0502111246244647c9@mail.gmail.com> [Jeremy Hylton] >> Unfortunately a license that says it is in the public domain is >> unacceptable (and should be for Debian, too). That is to say, it's >> not possible for someone to claim that something they produce is in >> the public domain. See http://www.linuxjournal.com/article/6225 [Michael Chermside] > Not quite true. It would be a bit off-topic to discuss on this list > so I will simply point you to: > > http://creativecommons.org/license/publicdomain-2 > > ...which is specifically designed for the US legal system. It _IS_ > possible for someone to produce something in the public domain, it > just isn't as easy as some people think (just saying it doesn't > necessarily make it so (at least under US law)) and it may not be > a good idea. The article Jeremy pointed at was written by the Python Software Foundation's occasional legal counsel, and he disagrees. While I would love to believe that copyright law isn't this bizarre, I can't recommend going against the best legal advice the PSF was willing to pay for . Note that Creative Commons doesn't recommend that you do either; from their FAQ: Can I use a Creative Commons license for software? In theory, yes, but it is not in your best interest. We strongly encourage you to use one of the very good software licenses available today. (The Free Software Foundation and the Open Source Initiative stand out as resources for such licenses.) 
> I would expect that if something truly WERE in the public domain, > then it would be acceptable for Python (and for Debian too, for > that matter). So would I, but according to Larry there isn't such a thing (excepting software written by the US Government; and for other software you might be thinking about today, maybe in about a century if the author lets their copyright lapse). If Larry is correct, it isn't legally possible for an individual in the US to disclaim copyright, regardless what they may say or sign. The danger then is that accepting software that purports to be free of copyright can come back to bite you, if the author later changes their mind (from your POV; the claim is that from US law's POV, nothing has actually changed, since the author never actually gave up copyright to begin with). The very fact that this argument exists underscores the desirability of only accepting software with an explicit license, spelling out the copyright holder's intents wrt distribution, modification, etc. Then you're just in legal mud, instead of legal quicksand. From pje at telecommunity.com Fri Feb 11 23:59:33 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Feb 11 23:57:10 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <1f7befae0502111246244647c9@mail.gmail.com> References: <1108152209.420d0f91e312c@mcherm.com> <1108152209.420d0f91e312c@mcherm.com> Message-ID: <5.1.1.6.0.20050211172834.03c16e10@mail.telecommunity.com> At 03:46 PM 2/11/05 -0500, Tim Peters wrote: >If Larry is correct, it isn't legally possible for an individual in >the US to disclaim copyright, regardless what they may say or sign. >The danger then is that accepting software that purports to be free of >copyright can come back to bite you, if the author later changes their >mind (from your POV; the claim is that from US law's POV, nothing has >actually changed, since the author never actually gave up copyright to >begin with). > >The very fact that this argument exists underscores the desirability >of only accepting software with an explicit license, spelling out the >copyright holder's intents wrt distribution, modification, etc. Then >you're just in legal mud, instead of legal quicksand. And as long as we're flailing about in a substance which may include, but is not limited to, mud and/or quicksand or other flailing-suitable legal substances, it should be pointed out that even though software presented by its owner to be in the public domain is technically still copyright by that individual, the odds of them successfully prosecuting a copyright enforcement action might be significantly narrowed, due to the doctrine of promissory estoppel. Promissory estoppel is basically the idea that one-sided promises *are* enforceable when somebody reasonably relies on them and is injured by the withdrawal. IBM, for example, has pled in its defense against SCO that SCO's distribution of its so-called proprietary code under the GPL constituted a reasonable promise that others were free to use the code under the terms of the GPL, and that IBM further relied on that promise. Ergo, they are claiming, SCO's promise is enforceable by law. Of course, SCO v. IBM hasn't had any judgments yet, certainly not on that subject, and maybe never will. But it's important to know that the law *does* have some principles like this that allow overriding the more egregiously insane aspects of the law. 
:) Oh, also, if somebody decides to back out on their dedication to the public domain, and you can show that they did it on purpose, then that's "unclean hands" and possibly "copyright abuse" as well. Just to muddy up the waters a little bit. :) Obviously, the PSF should follow its own lawyer's advice, but it seemed to me that the point of Mr. Rosen's article was more to advise people releasing software to use a license that allows them to disclaim warranties. I personally can't see how taking the reasonable interpretation of a public domain declaration can lead to any difficulties, but then, IANAL. I'm surprised, however, that he didn't even touch on promissory estoppel, if there is some reason he believes that the doctrine wouldn't apply to a software license. Heck, I was under the impression that free copyright licenses in general got their effect by way of promissory estoppel, since such licenses are always one-sided promises. The GPL in particular makes an explicit point of this, even though it doesn't use the words "promissory estoppel". The point is that the law doesn't allow you to copy, so the license is your defense against a charge of copyright infringement. Therefore, even Rosen's so-called "Give it away" license is enforceable, in the sense that the licensor should be barred from taking action against someone taking the license at face value. Rosen also says, "Under basic contract law, a gift cannot be enforced. The donor can retract his gift at any time, for any reason". If this were true, I could give you a watch for Christmas and then sue you to make you give it back, so I'm not sure what he's getting at here. But again, IANAL, certainly not a famous one like Mr. Rosen. I *am* most curious to know why his article seems to imply that a promise not to sue someone for copyright infringement isn't a valid defense against such a suit, because that would seem to imply that *no* free software license is valid, including the GPL or the PSF license! (Surely those "gifts" can be retracted too, no?) From abo at minkirri.apana.org.au Sat Feb 12 00:11:01 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Sat Feb 12 00:11:16 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> Message-ID: <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> G'day again, From: "Gregory P. Smith" > > I think it would be cleaner and simpler to modify the existing > > md5module.c to use the openssl md5 layer API (this is just a > > search/replace to change the function names). The bigger problem is > > deciding what/how/whether to include the openssl md5 implementation > > sources so that win32 can use them. > > yes, that is all i was suggesting. > > win32 python is already linked against openssl for the socket module > ssl support, having the md5 and sha1 modules depend on openssl should > not cause a problem. IANAL... I have too much common sense, so I won't argue licences :-) So is openssl already included in the Python sources, or is it just a dependency? I had a quick look and couldn't find it so it must be a dependency. Given that Python is already dependant on openssl, it makes sense to change md5sum to use it. 
I have a feeling that openssl internally uses md5, so this way we wont link against two different md5sum implementations. ---------------------------------------------------------------- Donovan Baarda http://minkirri.apana.org.au/~abo/ ---------------------------------------------------------------- From martin at v.loewis.de Sat Feb 12 00:57:40 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Feb 12 00:57:44 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <5.1.1.6.0.20050211172834.03c16e10@mail.telecommunity.com> References: <1108152209.420d0f91e312c@mcherm.com> <1108152209.420d0f91e312c@mcherm.com> <5.1.1.6.0.20050211172834.03c16e10@mail.telecommunity.com> Message-ID: <420D4674.4040804@v.loewis.de> Phillip J. Eby wrote: > I personally can't see how taking the reasonable interpretation of a > public domain declaration can lead to any difficulties, but then, > IANAL. The ultimate question is whether we could legally relicense such code under the Python license, ie. remove the PD declaration, and attach the Python license to it. I'm sure somebody would come along and claim "you cannot do that, and because you did, I cannot use your code, because it is not legally trustworthy"; people would say the same if the PD declaration would stay around. It is important for us that our users (including our commercial users) trust that Python has a clear legal track record. For such users, it is irrelevant whether you think that a litigation of the actual copyright holder would have any chance to stand in court, or whether such action is even likely. So for some users, replacing RSA-copyrighted-and-licensed code with PD-declared-and-unlicensed code makes Python less trustworthy. Clearly, for Debian, it is exactly the other way 'round. So I have rejected the patch, preserving the status quo, until a properly licensed open source implementation of md5 arrives. Until then, Debian will have to patch Python. > But again, IANAL, certainly not a famous one like Mr. Rosen. I *am* > most curious to know why his article seems to imply that a promise not > to sue someone for copyright infringement isn't a valid defense against > such a suit It might be, but that is irrelevant for open source projects that include contributions. Either they don't care too much about such things, in which case anything remotely "free" would be acceptable, or they are very nit-picking, in which case you need a good record for any contribution you ever received. Regards, Martin From pje at telecommunity.com Sat Feb 12 01:25:35 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Feb 12 01:23:11 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <420D4674.4040804@v.loewis.de> References: <5.1.1.6.0.20050211172834.03c16e10@mail.telecommunity.com> <1108152209.420d0f91e312c@mcherm.com> <1108152209.420d0f91e312c@mcherm.com> <5.1.1.6.0.20050211172834.03c16e10@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050211191840.03814ec0@mail.telecommunity.com> At 12:57 AM 2/12/05 +0100, Martin v. L?wis wrote: >Phillip J. Eby wrote: >>I personally can't see how taking the reasonable interpretation of a >>public domain declaration can lead to any difficulties, but then, IANAL. > >The ultimate question is whether we could legally relicense such >code under the Python license, ie. remove the PD declaration, and >attach the Python license to it. 
I'm sure somebody would come along >and claim "you cannot do that, and because you did, I cannot use >your code, because it is not legally trustworthy"; people would >say the same if the PD declaration would stay around. Right, but now we've moved off the legality and into marketing, which is an even less sane subject in some ways. The law at least has certain checks and balances built into it, but in marketing, people's irrationality knows no bounds. ;) >It might be, but that is irrelevant for open source projects that >include contributions. Either they don't care too much about such >things, in which case anything remotely "free" would be acceptable, >or they are very nit-picking, in which case you need a good record >for any contribution you ever received. Isn't the PSF somewhere in between? I mean, in theory we are supposed to be tracking stuff, but in practice there's no contributor agreement for CVS committers ala Zope Corp.'s approach. So in some sense right now, Python depends largely on the implied promise of its contributors to license their contributions under the same terms as Python. ISTM that if somebody's lawyer is worried about whether Python contains pseudo-public domain code, they should be downright horrified by the absence of a paper trail on the rest. But IANAM (I Am Not A Marketer), either. :) From martin at v.loewis.de Sat Feb 12 02:09:05 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Feb 12 02:09:08 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <5.1.1.6.0.20050211191840.03814ec0@mail.telecommunity.com> References: <5.1.1.6.0.20050211172834.03c16e10@mail.telecommunity.com> <1108152209.420d0f91e312c@mcherm.com> <1108152209.420d0f91e312c@mcherm.com> <5.1.1.6.0.20050211172834.03c16e10@mail.telecommunity.com> <5.1.1.6.0.20050211191840.03814ec0@mail.telecommunity.com> Message-ID: <420D5731.8020702@v.loewis.de> Phillip J. Eby wrote: > Isn't the PSF somewhere in between? I mean, in theory we are supposed > to be tracking stuff, but in practice there's no contributor agreement > for CVS committers ala Zope Corp.'s approach. That is not true, see http://www.python.org/psf/contrib.html We certainly don't have forms from all contributors, yet, but we are working on it. > So in some sense right > now, Python depends largely on the implied promise of its contributors > to license their contributions under the same terms as Python. ISTM > that if somebody's lawyer is worried about whether Python contains > pseudo-public domain code, they should be downright horrified by the > absence of a paper trail on the rest. But IANAM (I Am Not A Marketer), > either. :) And indeed, they are horrified. Right now, we can tell them we are working on it - so I would like to see that any change that we make to improve the PSF's legal standing. Adding code which was put into the "public domain" makes it worse (atleast in the specific case - we are clearly allowed to do what we do with the current md5 code; for the newly-proposed code, it is not so clear, even if you think it is likely we would win in court). 
Regards, Martin From bob at redivi.com Sat Feb 12 02:38:18 2005 From: bob at redivi.com (Bob Ippolito) Date: Sat Feb 12 02:38:33 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> Message-ID: <5d300838ef9716aeaae53579ab1f7733@redivi.com> On Feb 11, 2005, at 6:11 PM, Donovan Baarda wrote: > G'day again, > > From: "Gregory P. Smith" >>> I think it would be cleaner and simpler to modify the existing >>> md5module.c to use the openssl md5 layer API (this is just a >>> search/replace to change the function names). The bigger problem is >>> deciding what/how/whether to include the openssl md5 implementation >>> sources so that win32 can use them. >> >> yes, that is all i was suggesting. >> >> win32 python is already linked against openssl for the socket module >> ssl support, having the md5 and sha1 modules depend on openssl should >> not cause a problem. > > IANAL... I have too much common sense, so I won't argue licences :-) > > So is openssl already included in the Python sources, or is it just a > dependency? I had a quick look and couldn't find it so it must be a > dependency. > > Given that Python is already dependant on openssl, it makes sense to > change > md5sum to use it. I have a feeling that openssl internally uses md5, > so this > way we wont link against two different md5sum implementations. It is an optional dependency that is used when present (read: not just win32). The sources are not included with Python. OpenSSL does internally have an implementation of md5 (and sha1, among other things). -bob From pje at telecommunity.com Sat Feb 12 03:28:43 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Feb 12 03:26:19 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <420D5731.8020702@v.loewis.de> References: <5.1.1.6.0.20050211191840.03814ec0@mail.telecommunity.com> <5.1.1.6.0.20050211172834.03c16e10@mail.telecommunity.com> <1108152209.420d0f91e312c@mcherm.com> <1108152209.420d0f91e312c@mcherm.com> <5.1.1.6.0.20050211172834.03c16e10@mail.telecommunity.com> <5.1.1.6.0.20050211191840.03814ec0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050211212759.03db5b30@mail.telecommunity.com> At 02:09 AM 2/12/05 +0100, Martin v. L?wis wrote: >Phillip J. Eby wrote: >>Isn't the PSF somewhere in between? I mean, in theory we are supposed to >>be tracking stuff, but in practice there's no contributor agreement for >>CVS committers ala Zope Corp.'s approach. > >That is not true, see > >http://www.python.org/psf/contrib.html > >We certainly don't have forms from all contributors, yet, but we >are working on it. > >>So in some sense right now, Python depends largely on the implied promise >>of its contributors to license their contributions under the same terms >>as Python. ISTM that if somebody's lawyer is worried about whether >>Python contains pseudo-public domain code, they should be downright >>horrified by the absence of a paper trail on the rest. But IANAM (I Am >>Not A Marketer), either. :) > >And indeed, they are horrified. 
Right now, we can tell them we are >working on it - so I would like to see that any change that we make >to improve the PSF's legal standing. Adding code which was put into >the "public domain" makes it worse (atleast in the specific case - >we are clearly allowed to do what we do with the current md5 code; >for the newly-proposed code, it is not so clear, even if you think >it is likely we would win in court). Thanks for the clarifications. From abo at minkirri.apana.org.au Sat Feb 12 03:54:27 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Sat Feb 12 03:54:37 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> Message-ID: <013501c510ae$2abd7360$24ed0ccb@apana.org.au> G'day, From: "Bob Ippolito" > On Feb 11, 2005, at 6:11 PM, Donovan Baarda wrote: [...] > > Given that Python is already dependant on openssl, it makes sense to > > change > > md5sum to use it. I have a feeling that openssl internally uses md5, > > so this > > way we wont link against two different md5sum implementations. > > It is an optional dependency that is used when present (read: not just > win32). The sources are not included with Python. Are there any potential problems with making the md5sum module availability "optional" in the same way as this? > OpenSSL does internally have an implementation of md5 (and sha1, among > other things). Yeah, I know, that's why it could be used for the md5sum module :-) What I meant was a Python application using ssl sockets and the md5sum module will effectively have two different md5sum implementations in memory. Using the openssl md5sum for the md5sum module will make it "leaner", as well as faster. ---------------------------------------------------------------- Donovan Baarda http://minkirri.apana.org.au/~abo/ ---------------------------------------------------------------- From tjreedy at udel.edu Sat Feb 12 07:40:36 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Sat Feb 12 07:40:52 2005 Subject: [Python-Dev] Re: license issues with profiler.py and md5.h/md5c.c References: <5.1.1.6.0.20050211172834.03c16e10@mail.telecommunity.com><1108152209.420d0f91e312c@mcherm.com><1108152209.420d0f91e312c@mcherm.com><5.1.1.6.0.20050211172834.03c16e10@mail.telecommunity.com><5.1.1.6.0.20050211191840.03814ec0@mail.telecommunity.com> <420D5731.8020702@v.loewis.de> Message-ID: ""Martin v. Löwis"" wrote in message news:420D5731.8020702@v.loewis.de... > http://www.python.org/psf/contrib.html After reading this page and pages linked thereto, I get the impression that you are only asking for contributor forms from contributors of original material (such as module or manual section) and not from submitters of suggestions (via news,mail) or patches (via sourceforge). Correct? Seems sensible to me that contributing via a public suggestion box constitutes permission to use the suggestion. Terry J. Reedy From amk at amk.ca Sat Feb 12 14:37:21 2005 From: amk at amk.ca (A.M. 
Kuchling) Date: Sat Feb 12 14:40:04 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <013501c510ae$2abd7360$24ed0ccb@apana.org.au> References: <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> Message-ID: <20050212133721.GA13429@rogue.amk.ca> On Sat, Feb 12, 2005 at 01:54:27PM +1100, Donovan Baarda wrote: > Are there any potential problems with making the md5sum module availability > "optional" in the same way as this? The md5 module has been a standard module for a long time; making it optional in the next version of Python isn't possible. We'd have to require OpenSSL to compile Python. I'm happy to replace the MD5 and/or SHA implementations with other code, provided other code with a suitable license can be found. --amk From barry at python.org Sat Feb 12 15:06:12 2005 From: barry at python.org (Barry Warsaw) Date: Sat Feb 12 15:06:14 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <20050212133721.GA13429@rogue.amk.ca> References: <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> <20050212133721.GA13429@rogue.amk.ca> Message-ID: <1108217172.20404.37.camel@presto.wooz.org> On Sat, 2005-02-12 at 08:37, A.M. Kuchling wrote: > The md5 module has been a standard module for a long time; making it > optional in the next version of Python isn't possible. We'd have to > require OpenSSL to compile Python. I totally agree. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050212/74657c79/attachment.pgp From rkern at ucsd.edu Sat Feb 12 15:11:17 2005 From: rkern at ucsd.edu (Robert Kern) Date: Sat Feb 12 15:11:43 2005 Subject: [Python-Dev] Re: license issues with profiler.py and md5.h/md5c.c In-Reply-To: <20050212133721.GA13429@rogue.amk.ca> References: <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> <20050212133721.GA13429@rogue.amk.ca> Message-ID: A.M. Kuchling wrote: > On Sat, Feb 12, 2005 at 01:54:27PM +1100, Donovan Baarda wrote: > >>Are there any potential problems with making the md5sum module availability >>"optional" in the same way as this? > > > The md5 module has been a standard module for a long time; making it > optional in the next version of Python isn't possible. We'd have to > require OpenSSL to compile Python. 
> > I'm happy to replace the MD5 and/or SHA implementations with other > code, provided other code with a suitable license can be found. How about this one: http://sourceforge.net/project/showfiles.php?group_id=42360 From an API standpoint, it's trivially different from the one currently in Python. From md5.c: /* Copyright (C) 1999, 2000, 2002 Aladdin Enterprises. All rights reserved. This software is provided 'as-is', without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software. Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions: 1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required. 2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software. 3. This notice may not be removed or altered from any source distribution. L. Peter Deutsch ghost@aladdin.com */ /* $Id: md5.c,v 1.6 2002/04/13 19:20:28 lpd Exp $ */ /* Independent implementation of MD5 (RFC 1321). This code implements the MD5 Algorithm defined in RFC 1321, whose text is available at http://www.ietf.org/rfc/rfc1321.txt The code is derived from the text of the RFC, including the test suite (section A.5) but excluding the rest of Appendix A. It does not include any code or documentation that is identified in the RFC as being copyrighted. [etc.] -- Robert Kern rkern@ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From aahz at pythoncraft.com Sat Feb 12 15:53:26 2005 From: aahz at pythoncraft.com (Aahz) Date: Sat Feb 12 15:53:29 2005 Subject: [Python-Dev] Re: license issues with profiler.py and md5.h/md5c.c In-Reply-To: References: <420D5731.8020702@v.loewis.de> Message-ID: <20050212145326.GA7836@panix.com> On Sat, Feb 12, 2005, Terry Reedy wrote: > ""Martin v. Löwis"" wrote in message > news:420D5731.8020702@v.loewis.de... >> >> http://www.python.org/psf/contrib.html > > After reading this page and pages linked thereto, I get the impression that > you are only asking for contributor forms from contributors of original > material (such as module or manual section) and not from submitters of > suggestions (via news,mail) or patches (via sourceforge). Correct? Half-correct: patches constitute "work" and should also require a contrib agreement. But we're probably not going to press the point until we get contrib agreements from all CVS committers. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR From tjreedy at udel.edu Sat Feb 12 21:30:42 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Sat Feb 12 21:30:59 2005 Subject: [Python-Dev] Re: Re: license issues with profiler.py and md5.h/md5c.c References: <420D5731.8020702@v.loewis.de> <20050212145326.GA7836@panix.com> Message-ID: "Aahz" wrote in message news:20050212145326.GA7836@panix.com... 
On Sat, Feb 12, 2005, Terry Reedy wrote: >>> http://www.python.org/psf/contrib.html >> After reading this page and pages linked thereto, I get the impression >> that >> you are only asking for contributor forms from contributors of original >> material (such as module or manual section) and not from submitters of >> suggestions (via news,mail) or patches (via sourceforge). Correct? > Half-correct: patches constitute "work" and should also require a > contrib agreement. As I remember, my impression was based on the suggested procedure of first copywrite one's work and then license it under one of two acceptible "original licenses". This makes sense for a whole module, but hardly for most patches, to the point of being nonsense for a patch of one word, as some of mine have been (in text form, with the actual diff being prepared by the committer). This is not to deny that editing -- finding the exact place to insert or change a word is "work" -- but to say that it is work of a different sort from original authorship. So, if the lawyer thinks patches should also have a contrib agreement, then I strongly recommend a separate blanket agreement that covers all patches one ever contributes as one ongoing work. > But we're probably not going to press the point > until we get contrib agreements from all CVS committers. Even though I am not such, I would happily fill and fax a blanket patch agreement were that deemed to be helpful. Terry J. Reedy From greg at electricrain.com Sat Feb 12 22:04:02 2005 From: greg at electricrain.com (Gregory P. Smith) Date: Sat Feb 12 22:04:08 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <20050212133721.GA13429@rogue.amk.ca> References: <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> <20050212133721.GA13429@rogue.amk.ca> Message-ID: <20050212210402.GE25441@zot.electricrain.com> On Sat, Feb 12, 2005 at 08:37:21AM -0500, A.M. Kuchling wrote: > On Sat, Feb 12, 2005 at 01:54:27PM +1100, Donovan Baarda wrote: > > Are there any potential problems with making the md5sum module availability > > "optional" in the same way as this? > > The md5 module has been a standard module for a long time; making it > optional in the next version of Python isn't possible. We'd have to > require OpenSSL to compile Python. > > I'm happy to replace the MD5 and/or SHA implementations with other > code, provided other code with a suitable license can be found. > agreed. it can not be made optional. What I'd prefer (and will do if i find the time) is to have the md5 and sha1 module use OpenSSLs implementations when available. Falling back to their built in ones when openssl isn't present. That way its always there but uses the much faster optimized openssl algorithms when they exist. -g From david.ascher at gmail.com Sat Feb 12 22:42:01 2005 From: david.ascher at gmail.com (David Ascher) Date: Sat Feb 12 22:42:05 2005 Subject: [Python-Dev] Jim Roskind Message-ID: I contacted Jim Roskind re: the profiler code. i said: I'm a strong supporter of Opensource software, but I'm probably not going to be able to help you very much. I could be much more helpful with understanding the code or its use ;-). To summarize what I'll say: I don't own the rights to this stuff. ... 
but I don't believe there are any patents that I was ever involved with that might encumber this work. I would note that my profiler code is really very rarely used in commercial products, and it is much more typically used by developers (I guess a developer toolkit, if sold, would use it). I'm pretty delighted that the code has found so much use by developers over the years. As I noted in the intro to the documentation, I had only been coding in Python for 3 weeks when I wrote it. On the positive side, it exposed many weaknesses in many developer's code (including our own at InfoSeek), as well as in core Python code (subtle bugs in the interpreter) that surely helped everyone. Even though I was a newbie, It was VERY carefully crafted,, and I'd expect that it would take a fair amount of effort to reproduce it (and that is is probably why it has not been changed much... or at least no one told me when they changed/fixed it ;-) ). With regard to why I probably can't help much..... First off, InfoSeek (holder of the copyright) was bought by Disney, and I don't know what if anything has eventually become of the tradename. There is a chance that Disney owns the rights... and I have no idea who to ask there :-/. Second, I took a look at the Copyright, and it sure seems pretty permissive. I'm amazed if folks want something more permissive. This is what I found on the web for it: Copyright ? 1994, by InfoSeek Corporation, all rights reserved. Written by James Roskind.10.1 Permission to use, copy, modify, and distribute this Python software and its associated documentation for any purpose (subject to the restriction in the following sentence) without fee is hereby granted, provided that the above copyright notice appears in all copies, and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of InfoSeek not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission. This permission is explicitly restricted to the copying and modification of the software to remain in Python, compiled Python, or other languages (such as C) wherein the modified or derived code is exclusively imported into a Python module. INFOSEEK CORPORATION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL INFOSEEK CORPORATION BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. As I recall, I probably personally created the terms of the above license. I used a similar license on my C/C++ grammar, and Infoseek just added a bunch of wording to be sure that they were not at risk, and that their name would not be used in vain (or in advertising material). I think they were also interested in limiting its use to Python.... but I don't think that is a concern that would bother you. I read the link you directed me to, and its primary focus seemed ot be on patents for related or included technology. I don't believe that infoseek applied for or got any patents in this area (and certainly if they did so without my name, it would probably invalidate the patent), and I'm sure I didn't get any patents in this area at Netscape/AOL. In fact I don't think I got any patents back in 1994 or 1995. 
My only prior patent dated back to about 1983 (a hardware patent) that has since expired. I have some patents since (roughly) 1995, and even though I don't think any of them relate to profiling (though some did relate to languages, or more specifically, security in languages), I wouldn't want to mess with assigning rights to any of those patents, as they belong to AOL/Netscape. Here again, to my knowledge, none of my patents relate in any way to this area (profiling). Sadly, if they did, I would not have the right to assign them. I'm sure you're just doing your job, and following through by dotting all the I's and crossing all T's. My suggestion is to (as you said) work around the issue. You could always re-write the code from scratch, as the approaches are not rocket science and are pretty thoroughly explained. I wouldn't suggest it unless you are desperate. If I were you, I'd wait for a license problem to emerge (which I don't believe will ever happen). Hope that helps, Jim David Ascher wrote on 2/11/2005, 8:57 PM: > Dear Jim -- > > David Ascher here, writing to you on behalf of the Python Software > Foundation. Someone recently pointed to your copyright statement in > Python's standard library (profile.py, if you recall, way back from > '94). Apparently there are some issues re: the specific terms of the > license you picked. We can probably find ways of working around those > issues but I was wondering if you'd be willing to relicense the code > under a different license, as per http://www.python.org/psf/contrib.html > > I don't really know if we need to worry about the current owners of > InfoSeek, whoever that may be. You'd know better. From david.ascher at gmail.com Sat Feb 12 22:45:54 2005 From: david.ascher at gmail.com (David Ascher) Date: Sat Feb 12 22:45:57 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <1f7befae05020812377c72de26@mail.gmail.com> Message-ID: On Tue, 8 Feb 2005 15:52:29 -0500, Jeremy Hylton wrote: > Maybe some ambitious PSF activitst could contact Roskind and Steve > Kirsch and see if they know who at Disney to talk to... Or maybe the > Disney guys who were at PyCon last year could help. I contacted Jim. His response follows: --- I'm a strong supporter of Opensource software, but I'm probably not going to be able to help you very much. I could be much more helpful with understanding the code or its use ;-). To summarize what I'll say: I don't own the rights to this stuff. ... but I don't believe there are any patents that I was ever involved with that might encumber this work. I would note that my profiler code is really very rarely used in commercial products, and it is much more typically used by developers (I guess a developer toolkit, if sold, would use it). I'm pretty delighted that the code has found so much use by developers over the years. As I noted in the intro to the documentation, I had only been coding in Python for 3 weeks when I wrote it. On the positive side, it exposed many weaknesses in many developer's code (including our own at InfoSeek), as well as in core Python code (subtle bugs in the interpreter) that surely helped everyone. Even though I was a newbie, It was VERY carefully crafted,, and I'd expect that it would take a fair amount of effort to reproduce it (and that is is probably why it has not been changed much... or at least no one told me when they changed/fixed it ;-) ). 
With regard to why I probably can't help much..... First off, InfoSeek (holder of the copyright) was bought by Disney, and I don't know what if anything has eventually become of the tradename. There is a chance that Disney owns the rights... and I have no idea who to ask there :-/. Second, I took a look at the Copyright, and it sure seems pretty permissive. I'm amazed if folks want something more permissive. This is what I found on the web for it: Copyright ? 1994, by InfoSeek Corporation, all rights reserved. Written by James Roskind.10.1 Permission to use, copy, modify, and distribute this Python software and its associated documentation for any purpose (subject to the restriction in the following sentence) without fee is hereby granted, provided that the above copyright notice appears in all copies, and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of InfoSeek not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission. This permission is explicitly restricted to the copying and modification of the software to remain in Python, compiled Python, or other languages (such as C) wherein the modified or derived code is exclusively imported into a Python module. INFOSEEK CORPORATION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL INFOSEEK CORPORATION BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. As I recall, I probably personally created the terms of the above license. I used a similar license on my C/C++ grammar, and Infoseek just added a bunch of wording to be sure that they were not at risk, and that their name would not be used in vain (or in advertising material). I think they were also interested in limiting its use to Python.... but I don't think that is a concern that would bother you. I read the link you directed me to, and its primary focus seemed ot be on patents for related or included technology. I don't believe that infoseek applied for or got any patents in this area (and certainly if they did so without my name, it would probably invalidate the patent), and I'm sure I didn't get any patents in this area at Netscape/AOL. In fact I don't think I got any patents back in 1994 or 1995. My only prior patent dated back to about 1983 (a hardware patent) that has since expired. I have some patents since (roughly) 1995, and even though I don't think any of them relate to profiling (though some did relate to languages, or more specifically, security in languages), I wouldn't want to mess with assigning rights to any of those patents, as they belong to AOL/Netscape. Here again, to my knowledge, none of my patents relate in any way to this area (profiling). Sadly, if they did, I would not have the right to assign them. I'm sure you're just doing your job, and following through by dotting all the I's and crossing all T's. My suggestion is to (as you said) work around the issue. You could always re-write the code from scratch, as the approaches are not rocket science and are pretty thoroughly explained. I wouldn't suggest it unless you are desperate. 
If I were you, I'd wait for a license problem to emerge (which I don't believe will ever happen). --- FWIW, I agree. Personnally, I think that if Debian has a problem with the above, it's their problem to deal with, not Python's. --david From rkern at ucsd.edu Sun Feb 13 00:24:27 2005 From: rkern at ucsd.edu (Robert Kern) Date: Sun Feb 13 00:24:50 2005 Subject: [Python-Dev] Re: license issues with profiler.py and md5.h/md5c.c In-Reply-To: References: <1107726549.20128.12.camel@localhost> <16903.28384.621922.349@gargle.gargle.HOWL> <1f7befae05020812377c72de26@mail.gmail.com> Message-ID: David Ascher wrote: > FWIW, I agree. Personnally, I think that if Debian has a problem with > the above, it's their problem to deal with, not Python's. The OSI may also have a problem with the license if they were to be made aware of it. See section 8 of the Open Source Definition: """8. License Must Not Be Specific to a Product The rights attached to the program must not depend on the program's being part of a particular software distribution. If the program is extracted from that distribution and used or distributed within the terms of the program's license, all parties to whom the program is redistributed should have the same rights as those that are granted in conjunction with the original software distribution. """ I'm not entirely sure if this affects the PSF's use of OSI's trademark. IANAL. TINLA. -- Robert Kern rkern@ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From greg at electricrain.com Sun Feb 13 02:35:35 2005 From: greg at electricrain.com (Gregory P. Smith) Date: Sun Feb 13 02:35:39 2005 Subject: [Python-Dev] Re: OpenSSL sha module / license issues with md5.h/md5c.c In-Reply-To: <013501c510ae$2abd7360$24ed0ccb@apana.org.au> References: <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> Message-ID: <20050213013535.GF25441@zot.electricrain.com> I've created an OpenSSL version of the sha module. trivial to modify to be a md5 module. Its a first version with cleanup to be done and such. being managed in the SF patch manager: https://sourceforge.net/tracker/?func=detail&aid=1121611&group_id=5470&atid=305470 enjoy. i'll do more cleanup and work on it soon. From martin at v.loewis.de Sun Feb 13 20:38:47 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Feb 13 20:38:50 2005 Subject: [Python-Dev] Re: Re: license issues with profiler.py and md5.h/md5c.c In-Reply-To: References: <420D5731.8020702@v.loewis.de> <20050212145326.GA7836@panix.com> Message-ID: <420FACC7.9020502@v.loewis.de> Terry Reedy wrote: > As I remember, my impression was based on the suggested procedure of first > copywrite one's work and then license it under one of two acceptible > "original licenses". This makes sense for a whole module, but hardly for > most patches, to the point of being nonsense for a patch of one word, as > some of mine have been (in text form, with the actual diff being prepared > by the committer). To my understanding, there is no way to "copyright one's work" - in the terminology of Larry Rosen (and I guess U.S. copyright law), "copyright subsists". I.e. 
the creator of some work has copyright, whether he wants it or not. Now, the question is, what precisely constitutes "work"? To my understanding, modifying an existing work creates derivative work; he who creates the derivative work first needs a license to do so, and then owns the title of the derivative work. There is, of course, the issue of trivial changes - "nobody could have it done differently". However, I understand that the bar for trivial changes is very, very low; I understand that even putting a comment into the change indicating what the change was already makes this original work. Nobody is obliged to phrase the comment in precisely the same way, so this specific wording of the comment is original work of the contributor, who needs to license the change to us. > So, if the lawyer thinks patches should also have a contrib agreement, then > I strongly recommend a separate blanket agreement that covers all patches > one ever contributes as one ongoing work. Our contributor's form is such a blanket agreement. You fill it out once, and then you indicate, in each patch, that this patch falls under the agreement you sent in earlier. > Even though I am not such, I would happily fill and fax a blanket patch > agreement were that deemed to be helpful. When we have sufficient coverage from committers, I will move on to people in Misc/ACKS. You can just go ahead and send in the form right away. Regards, Martin From abo at minkirri.apana.org.au Mon Feb 14 01:02:23 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Mon Feb 14 01:03:02 2005 Subject: [Python-Dev] Re: OpenSSL sha module / license issues with md5.h/md5c.c In-Reply-To: <20050213013535.GF25441@zot.electricrain.com> References: <20050208195243.GD10650@zot.electricrain.com> <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> <20050213013535.GF25441@zot.electricrain.com> Message-ID: <1108339344.3768.24.camel@schizo> On Sat, 2005-02-12 at 17:35 -0800, Gregory P. Smith wrote: > I've created an OpenSSL version of the sha module. trivial to modify > to be a md5 module. Its a first version with cleanup to be done and > such. being managed in the SF patch manager: > > https://sourceforge.net/tracker/?func=detail&aid=1121611&group_id=5470&atid=305470 > > enjoy. i'll do more cleanup and work on it soon. Hmmm. I see the patch entry, but it seems to be missing the actual patch. Did you code this from scratch, or did you base it on the current md5module.c? Is it using the openssl sha interface, or the higher level EVP interface? The reason I ask is it would be pretty trivial to modify md5module.c to use the openssl API for any digest, and would be less risk than fresh-coding one. 
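Since this thread keeps coming back to "use OpenSSL's digest when it's there, fall back to bundled code otherwise", here is a rough Python-level sketch of that selection pattern. This is an editorial illustration, not anyone's actual patch: `_openssl_md5` is a made-up placeholder name for an OpenSSL-backed extension module (no module of that name exists), and the fallback branch simply uses today's stdlib `md5` module, so the caller-visible interface stays the same either way.

    # Sketch only: _openssl_md5 is a hypothetical stand-in for an
    # OpenSSL-backed extension; it does not exist.  The fallback is the
    # real stdlib md5 module, so new()/update()/hexdigest() work the same
    # whichever branch is taken.
    try:
        from _openssl_md5 import new as md5_new    # fast path, if available
    except ImportError:
        from md5 import new as md5_new             # bundled fallback

    h = md5_new()
    h.update("hello, ")
    h.update("world")
    print h.hexdigest()        # 32-character hex digest
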
-- Donovan Baarda http://minkirri.apana.org.au/~abo/ From abo at minkirri.apana.org.au Mon Feb 14 01:19:34 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Mon Feb 14 01:20:12 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <20050212210402.GE25441@zot.electricrain.com> References: <1108088147.3753.51.camel@schizo> <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> <20050212133721.GA13429@rogue.amk.ca> <20050212210402.GE25441@zot.electricrain.com> Message-ID: <1108340374.3768.33.camel@schizo> G'day, On Sat, 2005-02-12 at 13:04 -0800, Gregory P. Smith wrote: > On Sat, Feb 12, 2005 at 08:37:21AM -0500, A.M. Kuchling wrote: > > On Sat, Feb 12, 2005 at 01:54:27PM +1100, Donovan Baarda wrote: > > > Are there any potential problems with making the md5sum module availability > > > "optional" in the same way as this? > > > > The md5 module has been a standard module for a long time; making it > > optional in the next version of Python isn't possible. We'd have to > > require OpenSSL to compile Python. > > > > I'm happy to replace the MD5 and/or SHA implementations with other > > code, provided other code with a suitable license can be found. > > > > agreed. it can not be made optional. What I'd prefer (and will do if > i find the time) is to have the md5 and sha1 module use OpenSSLs > implementations when available. Falling back to their built in ones > when openssl isn't present. That way its always there but uses the > much faster optimized openssl algorithms when they exist. So we need a fallback md5 implementation for when openssl is not available. The RSA implementation is not usable because it has an unsuitable license. Looking at this licence again, I'm not sure what the problem is. It allows you to freely modify, distribute, etc, with the only limit you must retain the RSA licence blurb. The libmd implementation cannot be used because the author tried to give it away unconditionally, and the lawyers say you can't. (dumb! dumb! dumb! someone needs to figure out a way to systematically get around this kind of stupidity, perhaps have someone in a less legally stupid country claim and re-license free code). The libmd5-rfc sourceforge project implementation looks OK. It needs to be modified to have an API identical to openssl (rename structures/functions). Then setup.py needs to be modified to use openssl if available, or fallback to the provided libmd5-rfc implementation. The SHA module is a bit different... it includes a built in SHA implementation. It might pay to strip out the implementation and give it an openssl-like API, then make shamodule.c a use it, or openssl if available. Greg Smith might have already done much of this... -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From greg at electricrain.com Mon Feb 14 01:21:54 2005 From: greg at electricrain.com (Gregory P. 
Smith) Date: Mon Feb 14 01:21:59 2005 Subject: [Python-Dev] Re: OpenSSL sha module / license issues with md5.h/md5c.c In-Reply-To: <1108339344.3768.24.camel@schizo> References: <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> <20050213013535.GF25441@zot.electricrain.com> <1108339344.3768.24.camel@schizo> Message-ID: <20050214002154.GI25441@zot.electricrain.com> On Mon, Feb 14, 2005 at 11:02:23AM +1100, Donovan Baarda wrote: > On Sat, 2005-02-12 at 17:35 -0800, Gregory P. Smith wrote: > > I've created an OpenSSL version of the sha module. trivial to modify > > to be a md5 module. Its a first version with cleanup to be done and > > such. being managed in the SF patch manager: > > > > https://sourceforge.net/tracker/?func=detail&aid=1121611&group_id=5470&atid=305470 > > > > enjoy. i'll do more cleanup and work on it soon. > > Hmmm. I see the patch entry, but it seems to be missing the actual > patch. > > Did you code this from scratch, or did you base it on the current > md5module.c? Is it using the openssl sha interface, or the higher level > EVP interface? > > The reason I ask is it would be pretty trivial to modify md5module.c to > use the openssl API for any digest, and would be less risk than > fresh-coding one. Ugh. Sourceforge ignored it on the patch submission. i've attached it properly now. This initial version is derived from shamodule.c which does not have any license issues. it is currently only meant as an example of how easy it is to use the openssl hashing interface. I'm taking it an turning it into a generic openssl hash wrapper that'll do md5 sha1 and anything else. -g From ncoghlan at iinet.net.au Mon Feb 14 03:26:44 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Mon Feb 14 03:27:57 2005 Subject: [Python-Dev] A hybrid C & Python implementation for itertools Message-ID: <42100C64.5090001@iinet.net.au> I can't really imagine Raymond liking this idea, and I have a feeling the idea has been shot down before. However, I can't persuade Google to tell me anything about such an occasion, so here goes anyway. . . The utilities in the itertools module can easily be composed to provide additional useful functionality (e.g. the itertools recipes given in the documentation [1]). However, having to recode these every time you need them, or arranging access to a utility module can be a pain for application programming in some corporate environments [2]. The lack of builtin support also leads to many variations on a theme, only some of which actually work properly, or which work, but in subtly different ways [3]. On the other hand, it really isn't worth the effort to code these algorithms in C for the current itertools module. If itertools was a hybrid module, the handy 3-4 liners could go in the Python section, with the heavy lifting done by the underlying C module. The Python equivalents to the current C code could also be placed in the hybrid module (as happens with some of the other hybrid modules in the library). An alternative approach is based on an idea from Alex Martelli [4]. As Alex points out, itertools is currently more about *creating* iterators than it is about consuming them (the only function desription that doesn't start with 'Make an iterator' is itertools.tee and that starts with 'Return n independent iterators'). 
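For concreteness, here are a few of the 3-4 liners being discussed, lightly adapted from the recipes section of the itertools documentation [1]: one producer (pairwise) and two consumers (take, quantify).

    from itertools import islice, imap, izip, tee

    def take(n, seq):
        "Return the first n items of seq as a list."
        return list(islice(seq, n))

    def quantify(seq, pred=bool):
        "Count how many times the predicate is true in the sequence."
        return sum(imap(pred, seq))

    def pairwise(iterable):
        "s -> (s0,s1), (s1,s2), (s2,s3), ..."
        a, b = tee(iterable)
        try:
            b.next()       # advance the second iterator by one element
        except StopIteration:
            pass
        return izip(a, b)

    print take(3, "abcdef")              # ['a', 'b', 'c']
    print quantify([0, 1, 2, 0, 3])      # 3 true values
    print list(pairwise([1, 2, 3, 4]))   # [(1, 2), (2, 3), (3, 4)]
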
Alex's idea would involve adding a module with a new name that is focused on *consuming* iterators (IOW, extending the available standard accumulators beyond the existing min(), max() and sum() without further populating the builtins). The downside of the latter proposal is that the recipes in the itertools documentation relate both to producing *and* consuming iterators, so a new module would leave the question of where to put the handy iterator producers. Regards, Nick. [1] http://www.python.org/dev/doc/devel/lib/itertools-recipes.html [2] http://mail.python.org/pipermail/python-list/2005-February/266310.html [3] http://mail.python.org/pipermail/python-list/2005-February/266311.html [4] http://groups-beta.google.com/group/comp.lang.python/msg/a76b4c2caf6c435c -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From python at rcn.com Mon Feb 14 05:07:10 2005 From: python at rcn.com (Raymond Hettinger) Date: Mon Feb 14 05:11:03 2005 Subject: [Python-Dev] A hybrid C & Python implementation for itertools References: <42100C64.5090001@iinet.net.au> Message-ID: <006e01c5124a$a81ed540$5e2dc797@oemcomputer> [Nick Coghlan] > If itertools was a hybrid module, the handy 3-4 liners could go in the Python > section, with the heavy lifting done by the underlying C module. The Python > equivalents to the current C code could also be placed in the hybrid module (as > happens with some of the other hybrid modules in the library). Both of those ideas likely reflect the future direction of itertools. FWIW, the historical reasons for keeping the derived tools in the docs were: * Not casting them in stone too early so they could be updated and refined at will. * They had more value as a teaching tool (showing how basic tools could be combined) than as stand-alone tools. * Adding more tools makes the whole toolset harder to use. * When an itertool solution is not immediately obvious, then a generator solution is likely to be easier to write and more understandable. Your two alternate partitioning recipes provide an excellent case in point. * Several of the derived tools do not arise often in practice. For example, I've never used tabulate(), nth(), pairwise(), or repeatfunc(). > Alex's idea would involve adding a module with a new name that is > focused on *consuming* iterators (IOW, extending the available standard > accumulators beyond the existing min(), max() and sum() without further > populating the builtins). That would be nice. From the existing itertool recipes, good candidates would include take(), all(), any(), no(), and quantify(). Raymond From just at letterror.com Mon Feb 14 10:23:03 2005 From: just at letterror.com (Just van Rossum) Date: Mon Feb 14 10:23:06 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc/lib libimp.tex, 1.36, 1.36.2.1 libsite.tex, 1.26, 1.26.4.1 libtempfile.tex, 1.22, 1.22.4.1 libos.tex, 1.146.2.1, 1.146.2.2 In-Reply-To: Message-ID: bcannon@users.sourceforge.net wrote: > \begin{datadesc}{PY_RESOURCE} > -The module was found as a Macintosh resource. This value can only be > -returned on a Macintosh. > +The module was found as a Mac OS 9 resource. This value can only be > +returned on a Mac OS 9 or earlier Macintosh. > \end{datadesc} not entirely true: it's limited to the sa called "OS9" version of MacPython, which happily runs natively on OSX as a Carbon app... 
Just From troels at thule.no Mon Feb 14 15:03:22 2005 From: troels at thule.no (Troels Walsted Hansen) Date: Mon Feb 14 15:03:28 2005 Subject: [Python-Dev] builtin_id() returns negative numbers Message-ID: <4210AFAA.9060108@thule.no> Hi all, The Python binding in libxml2 uses the following code for __repr__(): class xmlNode(xmlCore): def __init__(self, _obj=None): self._o = None xmlCore.__init__(self, _obj=_obj) def __repr__(self): return "" % (self.name, id (self)) With Python 2.3.4 I'm seeing warnings like the one below: :2357: FutureWarning: %u/%o/%x/%X of negative int will return a signed string in Python 2.4 and up I believe this is caused by the memory address having the sign bit set, causing builtin_id() to return a negative integer. I grepped around in the Python standard library and found a rather awkward work-around that seems to be slowly propagating to various module using the "'%x' % id(self)" idiom: Lib/asyncore.py: # On some systems (RH10) id() can be a negative number. # work around this. MAX = 2L*sys.maxint+1 return '<%s at %#x>' % (' '.join(status), id(self)&MAX) $ grep -r 'can be a negative number' * Lib/asyncore.py: # On some systems (RH10) id() can be a negative number. Lib/repr.py: # On some systems (RH10) id() can be a negative number. Lib/tarfile.py: # On some systems (RH10) id() can be a negative number. Lib/test/test_repr.py: # On some systems (RH10) id() can be a negative number. Lib/xml/dom/minidom.py: # On some systems (RH10) id() can be a negative number. There are many modules that do not have this work-around in Python 2.3.4. Wouldn't it be more elegant to make builtin_id() return an unsigned long integer? Is the performance impact too great? A long integer is used on platforms where SIZEOF_VOID_P > SIZEOF_LONG (most 64 bit platforms?), so all Python code must be prepared to handle it already... Troels From tim.peters at gmail.com Mon Feb 14 16:41:35 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Feb 14 16:41:37 2005 Subject: [Python-Dev] builtin_id() returns negative numbers In-Reply-To: <4210AFAA.9060108@thule.no> References: <4210AFAA.9060108@thule.no> Message-ID: <1f7befae050214074122b715a@mail.gmail.com> [Troels Walsted Hansen] > The Python binding in libxml2 uses the following code for __repr__(): > > class xmlNode(xmlCore): > def __init__(self, _obj=None): > self._o = None > xmlCore.__init__(self, _obj=_obj) > > def __repr__(self): > return "" % (self.name, id (self)) > > With Python 2.3.4 I'm seeing warnings like the one below: > :2357: FutureWarning: %u/%o/%x/%X of negative int > will return a signed string in Python 2.4 and up > > I believe this is caused by the memory address having the sign bit set, > causing builtin_id() to return a negative integer. Yes, that's right. > I grepped around in the Python standard library and found a rather > awkward work-around that seems to be slowly propagating to various > module using the "'%x' % id(self)" idiom: No, it's not propagating any more: I see that none of these exist in 2.4: > Lib/asyncore.py: > # On some systems (RH10) id() can be a negative number. > # work around this. > MAX = 2L*sys.maxint+1 > return '<%s at %#x>' % (' '.join(status), id(self)&MAX) > > $ grep -r 'can be a negative number' * > Lib/asyncore.py: # On some systems (RH10) id() can be a negative > number. > Lib/repr.py: # On some systems (RH10) id() can be a negative > number. > Lib/tarfile.py: # On some systems (RH10) id() can be a negative > number. 
> Lib/test/test_repr.py: # On some systems (RH10) id() can be a > negative number. > Lib/xml/dom/minidom.py: # On some systems (RH10) id() can be a > negative number. > > There are many modules that do not have this work-around in Python 2.3.4. Not sure, but it looks like this stuff was ripped out in 2.4 simply because 2.4 no longer produces a FutureWarning in these cases. That doesn't address that the output changed, or that the output for a negative id() produced by %x under 2.4 is probably surprising to most. > Wouldn't it be more elegant to make builtin_id() return an unsigned > long integer? I think so. This is the function ZODB 3.3 uses, BTW: # Addresses can "look negative" on some boxes, some of the time. If you # feed a "negative address" to an %x format, Python 2.3 displays it as # unsigned, but produces a FutureWarning, because Python 2.4 will display # it as signed. So when you want to prodce an address, use positive_id() to # obtain it. def positive_id(obj): """Return id(obj) as a non-negative integer.""" result = id(obj) if result < 0: # This is a puzzle: there's no way to know the natural width of # addresses on this box (in particular, there's no necessary # relation to sys.maxint). Try 32 bits first (and on a 32-bit # box, adding 2**32 gives a positive number with the same hex # representation as the original result). result += 1L << 32 if result < 0: # Undo that, and try 64 bits. result -= 1L << 32 result += 1L << 64 assert result >= 0 # else addresses are fatter than 64 bits return result The gives a non-negative result regardless of Python version and (almost) regardless of platform (the `assert` hasn't triggered on any ZODB 3.3 platform yet). > Is the performance impact too great? For some app, somewhere, maybe. It's a tradeoff. The very widespread practice of embedding %x output from id() favors getting rid of the sign issue, IMO. > A long integer is used on platforms where SIZEOF_VOID_P > SIZEOF_LONG > (most 64 bit platforms?), Win64 is probably the only major (meaning likely to be popular among Python users) platform where sizeof(void*) > sizeof(long). > so all Python code must be prepared to handle it already... In theory . From foom at fuhm.net Mon Feb 14 17:33:13 2005 From: foom at fuhm.net (James Y Knight) Date: Mon Feb 14 17:33:25 2005 Subject: [Python-Dev] builtin_id() returns negative numbers In-Reply-To: <1f7befae050214074122b715a@mail.gmail.com> References: <4210AFAA.9060108@thule.no> <1f7befae050214074122b715a@mail.gmail.com> Message-ID: <1F0A5980-7EA6-11D9-9DB9-000A95A50FB2@fuhm.net> On Feb 14, 2005, at 10:41 AM, Tim Peters wrote: >> Wouldn't it be more elegant to make builtin_id() return an unsigned >> long integer? > > I think so. This is the function ZODB 3.3 uses, BTW: > > def positive_id(obj): > """Return id(obj) as a non-negative integer.""" > [...] I think it'd be nice to change it, too. Twisted also uses a similar function. However, last time this topic came up, this Tim Peters guy argued against it. ;) Quoting http://mail.python.org/pipermail/python-dev/2004-November/050049.html: > Python doesn't promise to return a postive integer for id(), although > it may have been nicer if it did. It's dangerous to change that now, > because some code does depend on the "32 bit-ness as a signed integer" > accident of CPython's id() implementation on 32-bit machines. For > example, code using struct.pack(), or code using one of ZODB's > specialized int-key BTree types with id's as keys. 
James From tim.peters at gmail.com Mon Feb 14 18:30:46 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Feb 14 18:30:49 2005 Subject: [Python-Dev] builtin_id() returns negative numbers In-Reply-To: <1F0A5980-7EA6-11D9-9DB9-000A95A50FB2@fuhm.net> References: <4210AFAA.9060108@thule.no> <1f7befae050214074122b715a@mail.gmail.com> <1F0A5980-7EA6-11D9-9DB9-000A95A50FB2@fuhm.net> Message-ID: <1f7befae05021409307ab36a15@mail.gmail.com> [James Y Knight] > I think it'd be nice to change it, too. Twisted also uses a similar > function. > > However, last time this topic came up, this Tim Peters guy argued > against it. ;) > > Quoting > http://mail.python.org/pipermail/python-dev/2004-November/050049.html: > >> Python doesn't promise to return a postive integer for id(), although >> it may have been nicer if it did. It's dangerous to change that now, >> because some code does depend on the "32 bit-ness as a signed integer" >> accident of CPython's id() implementation on 32-bit machines. For >> example, code using struct.pack(), or code using one of ZODB's >> specialized int-key BTree types with id's as keys. Yup, it's still a tradeoff, and it's still dangerous (as any change in visible behavior is). It's especially unfortunate that since "%x" % id(obj) does produce different output in 2.4 than in 2.3 when id(obj) < 0, we would change that output _again_ in 2.5 if id(obj) grew a new non-negative promise. That is, the best time to do this would have been for 2.4. Maybe it's just a wart we have to live with now; OTOH, the docs explicitly warn that id() may return a long, so any code relying on "short int"-ness has always been relying on an implementation quirk. From jcarlson at uci.edu Mon Feb 14 18:29:57 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon Feb 14 18:32:21 2005 Subject: [Python-Dev] builtin_id() returns negative numbers In-Reply-To: <1F0A5980-7EA6-11D9-9DB9-000A95A50FB2@fuhm.net> References: <1f7befae050214074122b715a@mail.gmail.com> <1F0A5980-7EA6-11D9-9DB9-000A95A50FB2@fuhm.net> Message-ID: <20050214092543.36F0.JCARLSON@uci.edu> James Y Knight wrote: > > > On Feb 14, 2005, at 10:41 AM, Tim Peters wrote: > > >> Wouldn't it be more elegant to make builtin_id() return an unsigned > >> long integer? > > > > I think so. This is the function ZODB 3.3 uses, BTW: > > > > def positive_id(obj): > > """Return id(obj) as a non-negative integer.""" > > [...] > > I think it'd be nice to change it, too. Twisted also uses a similar > function. > > However, last time this topic came up, this Tim Peters guy argued > against it. ;) > > Quoting > http://mail.python.org/pipermail/python-dev/2004-November/050049.html: > > > Python doesn't promise to return a postive integer for id(), although > > it may have been nicer if it did. It's dangerous to change that now, > > because some code does depend on the "32 bit-ness as a signed integer" > > accident of CPython's id() implementation on 32-bit machines. For > > example, code using struct.pack(), or code using one of ZODB's > > specialized int-key BTree types with id's as keys. All Tim was saying is that you can't /change/ builtin_id() because of backwards compatibiliity with Zope and struct.pack(). You are free to create a positive_id() function, and request its inclusion into builtins (low probability; people don't like doing that). Heck, you are even free to drop it in your local site.py implementation. But changing the current function is probably a no-no. 
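As a concrete illustration of the "keep it local" option described above: a small helper along the following lines sidesteps the signedness question entirely. The name hex_id is mine, not anything in the stdlib; the mask idiom is the same one asyncore and friends already use.

    # Minimal local helper, in the spirit of "drop it in your own utility
    # module" -- not a proposed builtin.
    import sys

    _ADDR_MASK = 2L * sys.maxint + 1      # all-ones bit pattern for a C long

    def hex_id(obj):
        """Return id(obj) formatted as an unsigned hex address."""
        return '%#x' % (id(obj) & _ADDR_MASK)

    # e.g. in a __repr__:
    #     return '<%s at %s>' % (self.__class__.__name__, hex_id(self))
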
- Josiah From tim.peters at gmail.com Mon Feb 14 20:13:57 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Feb 14 20:14:01 2005 Subject: [Python-Dev] Re: [Zope] Windows Low Fragementation Heap yields speedup of ~15% In-Reply-To: References: Message-ID: <1f7befae050214111319abbda@mail.gmail.com> [Gfeller Martin] > I'm running a large Zope application on a 1x1GHz CPU 1GB mem > Window XP Prof machine using Zope 2.7.3 and Py 2.3.4 > The application typically builds large lists by appending > and extending them. That's historically been an especially bad case for Windows systems, although the behavior varied across specific Windows flavors. Python has changed lots of things over time to improve it, including yet another twist on list-reallocation strategy new in Python 2.4. > We regularly observed that using a given functionality a > second time using the same process was much slower (50%) > than when it ran the first time after startup. Heh. On Win98SE, the _first_ time you ran pystone after rebooting the machine, it ran twice as fast as the second (or third, fourth, ...) time you tried it. The only way I ever found to get back the original speed without a reboot was to run a different process in-between that allocated almost all physical memory in one giant chunk. Presumably that convinced Win98SE to throw away its fragmented heap and start over again. > This behavior greatly improved with Python 2.3 (thanks > to the improved Python object allocator, I presume). The page you reference later describes a scheme that's (at least superficially) a lot like pymalloc uses for "small objects". In effect, pymalloc takes over buckets 1-32 in the table. > Nevertheless, I tried to convert the heap used by Python > to a Windows Low Fragmentation Heap (available on XP > and 2003 Server). This improved the overall run time > of a typical CPU-intensive report by about 15% > (overall run time is in the 5 minutes range), with the > same memory consumption. > > I consider 15% significant enough to let you know about it. Yes, and thank you. FYI, Python doesn't call any of the Win32 heap functions directly; the behavior it sees is inherited from whatever Microsoft's C implementation uses to support C's malloc()/realloc()/free(). pymalloc requests 256KB at a time from the platform malloc, and carves it up itself, so pymalloc isn't affected by LFH (LFH punts on requests over 16KB, much as pymalloc punts on requests over 256 bytes). But "large objects" (including list guts) don't go thru pymalloc to begin with, so as long as your list guts fit in 16KB, LFH could make a real difference to how they behave. Well, actually, it's probably more the case that LFH gives a boost by keeping small objects _out_ of the general heap. Then growing a giant list doesn't bump into gazillions of small objects. > For information about the Low Fragmentation Heap, see > > > Best regards, > Martin > > PS: Since I don't speak C, I used ctypes to convert all > heaps in the process to LFH (I don't know how to determine > which one is the C heap). It's the one consuming all the time . 
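Martin's full message appears later in this digest; as a rough idea of the ctypes approach he describes (converting every heap in the process to the Low Fragmentation Heap), something along these lines should work on XP/2003 with the third-party ctypes package of that era (ctypes is not in the 2.3 standard library). The constant 2 is the documented HeapCompatibilityInformation value that selects the LFH; treat this as an untested illustration, not a recipe:

# Hypothetical sketch (Windows XP / Server 2003 only): switch every heap in the
# current process to the Low Fragmentation Heap via the Win32 API.
import ctypes

kernel32 = ctypes.windll.kernel32
HeapCompatibilityInformation = 0      # information class for HeapSetInformation
HEAP_LFH = ctypes.c_ulong(2)          # 2 selects the Low Fragmentation Heap

def enable_lfh_everywhere():
    count = kernel32.GetProcessHeaps(0, None)   # ask how many heaps exist
    heaps = (ctypes.c_void_p * count)()
    kernel32.GetProcessHeaps(count, heaps)      # fetch their handles
    converted = 0
    for handle in heaps:
        ok = kernel32.HeapSetInformation(handle, HeapCompatibilityInformation,
                                         ctypes.byref(HEAP_LFH),
                                         ctypes.sizeof(HEAP_LFH))
        converted += bool(ok)
    return converted

HeapSetInformation can legitimately fail on heaps that cannot be converted (for example when a debug heap is active), which is why the sketch just counts successes rather than asserting on each call.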
From tismer at stackless.com Tue Feb 15 01:38:43 2005 From: tismer at stackless.com (Christian Tismer) Date: Tue Feb 15 01:38:36 2005 Subject: [Python-Dev] Ann: PyPy Sprint before PYCON 2005 in Washington Message-ID: <42114493.8050006@stackless.com> PyPy Sprint before PYCON 2005 in Washington ------------------------------------------- In the four days from 19th March till 22th March (inclusive) the PyPy team will host a sprint on their new Python-in-Python implementation. The PyPy project was granted funding by the European Union as part of its Sixth Framework Program, and is now on track to produce a stackless Python-in-Python Just-in-Time Compiler by December 2006. Our Python implementation, released under the MIT/BSD license, already provides new levels of flexibility and extensibility at the core interpreter and object implementation level. Armin Rigo and Holger Krekel will also give talks about PyPy and the separate py.test tool (used to perform various kinds of testing in PyPy) during the conference. Naturally, we are eager to see how the other re-implementation of Python, namely IronPython, is doing and to explore collaboration possibilities. Of course, that will depend on the degree of openness that Microsoft wants to employ. The Pycon2005 sprint is going to focus on reaching compatibility with CPython (currently we target version 2.3.4) for our PyPy version running on top of CPython. One goal of the sprint is to pass 60% or more of the unmodified regression tests of mainline CPython. It will thus be a great way to get to know CPython and PyPy better at the same time! Other possible work areas include: - translation to C to get a first working lower-level representation of the interpreter "specified in Python" - integrating and implementing a full parser/compiler chain written in Python maybe already targetting the new AST-branch of mainline CPython - fixing various remaining issues that will come up while trying to reach the compatibility goal - integrate or code pure python implementations of some Python modules currently written in C. - whatever issues you come up with! (please tell us before hand so we can better plan introductions etc.pp.) Besides core developers, Bea D?ring will be present to help improving and document our sprint and agile development process. We are going to give tutorials about PyPy's basic concepts and provide help to newcomers usually by pairing them with experienced pypythonistas. However, we kindly ask newcomers to be present on the first day's morning (19th of March) of the sprint to be able to get everyone a smooth start into the sprint. So far most newcomers had few problems in getting a good start into our codebase. However, it is good to have the following preparational points in mind: - some experience with programming in the Python language and interest to dive deeper - subscription to pypy-dev and pypy-sprint at http://codespeak.net/pypy/index.cgi?lists - have a subversion-client, Pygame and graphviz installed on the machine you bring to the sprint. - have a look at our current documentation, especially the architecture and getting-started documents under http://codespeak.net/pypy/index.cgi?doc The pypy-dev and pypy-sprint lists are also the contact points for raising questions and suggesting and discussing sprint topics beforehand. We are on #pypy on irc.freenode.net most of the time. Please don't hesitate to contact us or introduce yourself and your interests! 
Logistics --------- Organizational details will be posted to pypy-sprint and are or will be available in the Pycon2005-Sprint wiki here: http://www.python.org/moin/PyConDC2005/Sprints Registration ------------ send mail to pypy-sprint@codespeak.net, stating the days you can be present and any specific interests if applicable. Registered Participants ----------------------- all days: Jacob Hall?n Armin Rigo Holger Krekel Samuele Pedroni Anders Chrigstr?m Bea D?ring Christian Tismer Richard Emslie -- Christian Tismer :^) tismerysoft GmbH : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9A : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 802 86 56 mobile +49 173 24 18 776 fax +49 30 80 90 57 05 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From ncoghlan at iinet.net.au Tue Feb 15 10:43:30 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Feb 15 10:43:33 2005 Subject: [Python-Dev] builtin_id() returns negative numbers In-Reply-To: <20050214092543.36F0.JCARLSON@uci.edu> References: <1f7befae050214074122b715a@mail.gmail.com> <1F0A5980-7EA6-11D9-9DB9-000A95A50FB2@fuhm.net> <20050214092543.36F0.JCARLSON@uci.edu> Message-ID: <4211C442.3010001@iinet.net.au> Josiah Carlson wrote: >>Quoting >>http://mail.python.org/pipermail/python-dev/2004-November/050049.html: >> >> >>>Python doesn't promise to return a postive integer for id(), although >>>it may have been nicer if it did. It's dangerous to change that now, >>>because some code does depend on the "32 bit-ness as a signed integer" >>>accident of CPython's id() implementation on 32-bit machines. For >>>example, code using struct.pack(), or code using one of ZODB's >>>specialized int-key BTree types with id's as keys. > > > All Tim was saying is that you can't /change/ builtin_id() because of > backwards compatibiliity with Zope and struct.pack(). You are free to > create a positive_id() function, and request its inclusion into builtins > (low probability; people don't like doing that). Heck, you are even free > to drop it in your local site.py implementation. But changing the > current function is probably a no-no. There's always the traditional response to "want to fix it but can't due to backwards compatibility": a keyword argument that defaults to False. Cheers, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From fredrik at pythonware.com Tue Feb 15 10:56:58 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Feb 15 10:57:26 2005 Subject: [Python-Dev] Re: builtin_id() returns negative numbers References: <4210AFAA.9060108@thule.no><1f7befae050214074122b715a@mail.gmail.com> <1F0A5980-7EA6-11D9-9DB9-000A95A50FB2@fuhm.net> Message-ID: James Y Knight wrote: > However, last time this topic came up, this Tim Peters guy argued against it. ;) > > Quoting http://mail.python.org/pipermail/python-dev/2004-November/050049.html: > >> Python doesn't promise to return a postive integer for id(), although >> it may have been nicer if it did. It's dangerous to change that now, >> because some code does depend on the "32 bit-ness as a signed integer" >> accident of CPython's id() implementation on 32-bit machines. For >> example, code using struct.pack(), or code using one of ZODB's >> specialized int-key BTree types with id's as keys. 
can anyone explain the struct.pack and ZODB use cases? the first one doesn't make sense to me, and the other relies on Python *not* behaving as documented (which is worse than relying on undocumented behaviour, imo). From fredrik at pythonware.com Tue Feb 15 13:47:35 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Feb 15 13:47:51 2005 Subject: [Python-Dev] pymalloc on 2.1.3 Message-ID: does anyone remember if there were any big changes in pymalloc between the 2.1 series (where it was introduced) and 2.3 (where it was enabled by default). or in other words, is the 2.1.3 pymalloc stable enough for production use? (we're having serious memory fragmentation problems on a 2.1.3 system, and while I can patch/rebuild the interpreter if necessary, we cannot update the system right now...) From mwh at python.net Tue Feb 15 13:58:19 2005 From: mwh at python.net (Michael Hudson) Date: Tue Feb 15 13:58:22 2005 Subject: [Python-Dev] pymalloc on 2.1.3 In-Reply-To: (Fredrik Lundh's message of "Tue, 15 Feb 2005 13:47:35 +0100") References: Message-ID: <2mmzu60yl0.fsf@starship.python.net> "Fredrik Lundh" writes: > does anyone remember if there were any big changes in pymalloc between > the 2.1 series (where it was introduced) and 2.3 (where it was enabled by > default). Yes. (Was it really 2.1? Time flies!) > or in other words, is the 2.1.3 pymalloc stable enough for production use? Well, Tim posted ways of making it crash, but I don't know how likely they are to occur in non-malicious code. "cvs log Objects/obmalloc.c" might enlighten, or at least give an idea which months of the python-dev archive to search. Cheers, mwh -- this "I hate c++" is so old it's as old as C++, yes -- from Twisted.Quotes From tim.peters at gmail.com Tue Feb 15 15:50:02 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue Feb 15 15:50:05 2005 Subject: [Python-Dev] Re: builtin_id() returns negative numbers In-Reply-To: References: <4210AFAA.9060108@thule.no> <1f7befae050214074122b715a@mail.gmail.com> <1F0A5980-7EA6-11D9-9DB9-000A95A50FB2@fuhm.net> Message-ID: <1f7befae05021506507964d814@mail.gmail.com> [Fredrik Lundh] > can anyone explain the struct.pack and ZODB use cases? the first one > doesn't make sense to me, Not deep and surely not common, just possible. If you're on a 32-bit box and doing struct.pack("...i...", ... id(obj) ...), it in fact cannot fail now (no, that isn't guaranteed by the docs, it's just an implementation reality), but would fail if id() ever returned a positive long with the same bit pattern as a negative 32-bit int ("OverflowError: long int too large to convert to int").. > and the other relies on Python *not* behaving as documented (which is worse > than relying on undocumented behaviour, imo). I don't know what you think the problem with ZODB's integer-flavored keys might be, then. The problem I'm thinking of is that by "integer-flavored" they really mean *C* int, not Python integer (which is C long). They're delicate enough that way that they already don't work right on most current 64-bit boxes whenever the value of a Python int doesn't in fact fit in the platform's C int: http://collector.zope.org/Zope/1592 If id() returned a long in some cases on 32-bit boxes, then code using id() as key (in an II or IO tree) or value (in an II or OI) tree would stop working. 
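To make the struct.pack case concrete, an editor's illustration (it assumes a 32-bit box and Python 2.3/2.4 semantics; the address value is made up):

# Illustration of the struct.pack("i", id(obj)) failure mode described above.
import struct

addr = 0xBFFFF000L                                # bit pattern of a high address
print repr(struct.pack("i", addr - (1L << 32)))   # the negative int id() returns today: packs fine
try:
    struct.pack("i", addr)                        # a positive long with the same bits
except OverflowError, e:
    print e                                       # long int too large to convert to int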
Again, the Python docs didn't guarantee this would work, and the int-flavored BTrees have 64-bit box bugs in their handling of integers, but the id()-as-key-or-value case has nevertheless worked wholly reliably until now on 32-bit boxes. Any change in visible behavior has the potential to break code -- that shouldn't be controversial, because it's so obvious, and so relentlessly proved in real life. It's a tradeoff. I've said I'm in favor of taking away the sign issue for id() in this case, although I'm not going to claim that no code will break as a result, and I'd be a lot more positive about it if we could use the time machine to change this behavior for 2.4. From tim.peters at gmail.com Tue Feb 15 16:19:01 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue Feb 15 16:19:04 2005 Subject: [Python-Dev] pymalloc on 2.1.3 In-Reply-To: References: Message-ID: <1f7befae0502150719a24607d@mail.gmail.com> [Fredrik Lundh] > does anyone remember if there were any big changes in pymalloc between > the 2.1 series (where it was introduced) and 2.3 (where it was enabled by > default). Yes, huge -- few original lines survived exactly, although many survived in intent. > or in other words, is the 2.1.3 pymalloc stable enough for production use? Different question entirely . It _was_ used in production by some people, and happily so. Major differences: + 2.1 used a probabilistic scheme for guessing whether addresses passed to it were obtained from pymalloc or from the system malloc. It was easy for a malicous pure-Python program to corrupt pymalloc and/or malloc internals as a result, leading to things like segfaults, and even sneaky ways to mutate the Python bytecode stream. It's extremely unlikely that a non- malicious program could bump into these. + Horrid hackery went into 2.3's version to cater to broken extension modules that called PyMem functions without holding the GIL. 2.1's may not be as thread-safe in these cases. + 2.1's only fields requests up to 64 bytes, 2.3's up to 256 bytes. Changes in the dict implementation, and new-style classes, for 2.3 made it a pragmatic necessity to boost the limit for 2.3. > (we're having serious memory fragmentation problems on a 2.1.3 system, > and while I can patch/rebuild the interpreter if necessary, we cannot update > the system right now...) I'd give it a shot -- pymalloc has always been very effective at handling large numbers of small objects gracefully. The meaning of "small" got 4x bigger since 2.1, which appeared to be a pure win, but 64 bytes was enough under 2.1 that most small instance dicts fit. From mwh at python.net Tue Feb 15 16:49:49 2005 From: mwh at python.net (Michael Hudson) Date: Tue Feb 15 16:49:51 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: <2mu0pebo6u.fsf@starship.python.net> (Michael Hudson's message of "Tue, 18 Jan 2005 18:13:29 +0000") References: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> <2mbrboca5r.fsf@starship.python.net> <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com> <41EC38DE.8080603@v.loewis.de> <2my8eqbrk2.fsf@starship.python.net> <2mu0pebo6u.fsf@starship.python.net> Message-ID: <2mfyzx257m.fsf@starship.python.net> Michael Hudson writes: > Michael Hudson writes: > >> I hope to have a new patch (which makes PyExc_Exception new-style, but >> allows arbitrary old-style classes as exceptions) "soon". It may even >> pass bits of "make test" :) > > Done: http://www.python.org/sf/1104669 Now I think it's really done, apart from documentation. 
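In rough terms, and looking ahead to the design described just below, the intended behaviour of patch python.org/sf/1104669 can be sketched like this (an editor's illustration, not code from the patch; it also runs on a stock 2.4, where Exception is still a classic class):

# Sketch of the raise rules described below: Exception becomes new-style,
# classic classes stay raisable, and string exceptions are deprecated.
class NewStyleError(Exception):   # new-style once Exception itself is new-style
    pass

class ClassicError:               # classic class, still acceptable to raise
    pass

for exc in (NewStyleError("boom"), ClassicError()):
    try:
        raise exc
    except:                       # bare except catches both flavours
        pass

# raise "spam"  would still work, but draws the DeprecationWarning mentioned below.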
My design decision was to make Exception new-style. Things can be raised if they are instances of old-style classes or instances of Exception. If this meets with general agreement, I'd like to check the above patch in. It will break some highly introspective code, so it's IMO best to get it in early in the 2.5 cycle. The other option is to keep Exception old-style but allow new-style subclasses, but I think all this will do is break the above mentioned introspective code in a quieter way... The patch also updates the PendingDeprecationWarning on raising a string exception to a full DeprecationWarning (something that should be done anyway). Cheers, mwh -- python py.py ~/Source/python/dist/src/Lib/test/pystone.py Pystone(1.1) time for 5000 passes = 19129.1 This machine benchmarks at 0.261381 pystones/second From gvanrossum at gmail.com Tue Feb 15 19:55:53 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Feb 15 19:56:12 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: <2mfyzx257m.fsf@starship.python.net> References: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> <2mbrboca5r.fsf@starship.python.net> <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com> <41EC38DE.8080603@v.loewis.de> <2my8eqbrk2.fsf@starship.python.net> <2mu0pebo6u.fsf@starship.python.net> <2mfyzx257m.fsf@starship.python.net> Message-ID: > My design decision was to make Exception new-style. Things can be > raised if they are instances of old-style classes or instances of > Exception. If this meets with general agreement, I'd like to check > the above patch in. I like it, but didn't you forget to mention that strings can still be raised? I think we can't break that (but we can insert a deprecation warning for this in 2.5 so we can hopefully deprecate it in 2.6, or 2.7 at the latest). > The patch also updates the PendingDeprecationWarning on raising a > string exception to a full DeprecationWarning (something that should > be done anyway). What I said. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh at python.net Tue Feb 15 20:27:23 2005 From: mwh at python.net (Michael Hudson) Date: Tue Feb 15 20:27:25 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: (Guido van Rossum's message of "Tue, 15 Feb 2005 10:55:53 -0800") References: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> <2mbrboca5r.fsf@starship.python.net> <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com> <41EC38DE.8080603@v.loewis.de> <2my8eqbrk2.fsf@starship.python.net> <2mu0pebo6u.fsf@starship.python.net> <2mfyzx257m.fsf@starship.python.net> Message-ID: <2mbral1v50.fsf@starship.python.net> Guido van Rossum writes: >> My design decision was to make Exception new-style. Things can be >> raised if they are instances of old-style classes or instances of >> Exception. If this meets with general agreement, I'd like to check >> the above patch in. > > I like it, but didn't you forget to mention that strings can still be > raised? I think we can't break that (but we can insert a deprecation > warning for this in 2.5 so we can hopefully deprecate it in 2.6, or > 2.7 at the latest). I try to forget that as much as possible :) >> The patch also updates the PendingDeprecationWarning on raising a >> string exception to a full DeprecationWarning (something that should >> be done anyway). > > What I said. :-) :) I'll try to bash the documentation into shape next. Cheers, mwh -- please realize that the Common Lisp community is more than 40 years old. 
collectively, the community has already been where every clueless newbie will be going for the next three years. so relax, please. -- Erik Naggum, comp.lang.lisp From ejones at uwaterloo.ca Tue Feb 15 22:39:38 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Tue Feb 15 22:39:32 2005 Subject: [Python-Dev] Memory Allocator Part 2: Did I get it right? Message-ID: <8b28704b4465e03002fc70db5facedb6@uwaterloo.ca> After I finally understood what thread-safety guarantees the Python memory allocator needs to provide, I went and did some hard thinking about the code this afternoon. I believe that my modifications provide the same guarantees that the original version did. I do need to declare the arenas array to be volatile, and leak the array when resizing it. Please correct me if I am wrong, but the situation that needs to be supported is this: While one thread holds the GIL, any other thread can call PyObject_Free with a pointer that was returned by the system malloc. The following situation is *not* supported: While one thread holds the GIL, another thread calls PyObject_Free with a pointer that was returned by PyObject_Malloc. I'm hoping that I got things a little better this time around. I've submitted my updated patch to the patch tracker. For reference, I've included links to SourceForge and the previous thread. Thank you, Evan Jones Previous thread: http://mail.python.org/pipermail/python-dev/2005-January/051255.html Patch location: http://sourceforge.net/tracker/index.php? func=detail&aid=1123430&group_id=5470&atid=305470 From tim.peters at gmail.com Tue Feb 15 23:52:02 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue Feb 15 23:52:05 2005 Subject: [Python-Dev] Memory Allocator Part 2: Did I get it right? In-Reply-To: <8b28704b4465e03002fc70db5facedb6@uwaterloo.ca> References: <8b28704b4465e03002fc70db5facedb6@uwaterloo.ca> Message-ID: <1f7befae05021514524d0a35ec@mail.gmail.com> [Evan Jones] > After I finally understood what thread-safety guarantees the Python > memory allocator needs to provide, I went and did some hard thinking > about the code this afternoon. I believe that my modifications provide > the same guarantees that the original version did. I do need to declare > the arenas array to be volatile, and leak the array when resizing it. > Please correct me if I am wrong, but the situation that needs to be > supported is this: As I said before, I don't think we need to support this any more. More, I think we should not -- the support code is excruciatingly subtle, it wasted plenty of your time trying to keep it working, and if we keep it in it's going to continue to waste time over the coming years (for example, in the short term, it will waste my time reviewing it). > While one thread holds the GIL, any other thread can call PyObject_Free > with a pointer that was returned by the system malloc. What _was_ supported was more generally that any number of threads could call PyObject_Free with pointers that were returned by the system malloc/realloc at the same time as a single thread, holding the GIL, was doing anything whatsoever (including executing any code inside obmalloc.c) Although that's a misleading way of expressing the actual intent; more on that below. > The following situation is *not* supported: > > While one thread holds the GIL, another thread calls PyObject_Free with > a pointer that was returned by PyObject_Malloc. Right, that was never supported (and I doubt it could be without introducing a new mutex in obmalloc.c). 
> I'm hoping that I got things a little better this time around. I've > submitted my updated patch to the patch tracker. For reference, I've > included links to SourceForge and the previous thread. > > Thank you, Thank you! I probably can't make time to review anything before this weekend. I will try to then. I expect it would be easier if you ripped out the horrid support for PyObject_Free abuse; in a sane world, the release-build PyMem_FREE, PyMem_Del, and PyMem_DEL would expand to "free" instead of to "PyObject_FREE" (via changes to pymem.h). IOW, it was never the _intent_ that people be able to call PyObject_Free without holding the GIL. The need for that came from a different problem, that old code sometimes mixed calls to PyObject_New with calls to PyMem_DEL (or PyMem_FREE or PyMem_Del). It's for that latter reason that PyMem_DEL (and its synonyms) were changed to expand to PyObject_Free. This shouldn't be supported anymore. Because it _was_ supported, there was no way to tell whether PyObject_Free was being called because (a) we were catering to long-obsolete but once-loved code that called PyMem_DEL while holding the GIL and with a pointer obtained by PyObject_New; or, (b) somebody was calling PyMem_Del (etc) with a non-object pointer they had obtained from PyMem_New, or from the system malloc directly. It was never legit to do #a without holding the GIL. It was clear as mud whether it was legit to do #b without holding the GIL. If PyMem_Del (etc) change to expand to "free" in a release build, then #b can remain clear as mud without harming anyone. Nobody should be doing #a anymore. If someone still is, "tough luck -- fix it, you've had years of warning" is easy for me to live with at this stage. I suppose the other consideration is that already-compiled extension modules on non-Windows(*) systems will, if they're not recompiled, continue to call PyObject_Free everywhere they had a PyMem_Del/DEL/FREE call. If such code is calling it without holding the GIL, and obmalloc.c stops trying to support this insanity, then they're going to grow some thread races they woudn't have if they did recompile (to get such call sites remapped to the system free). I don't really care about that either: it's a general rule that virtually all Python API functions must be called with the GIL held, and there was never an exception in the docs for the PyMem_ family. (*) Windows is immune simply because the Windows Python is set up in such a way that you always have to recompile extension modules when Python's minor version number (the j in i.j.k) gets bumped. From ejones at uwaterloo.ca Wed Feb 16 04:02:53 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Wed Feb 16 04:04:05 2005 Subject: [Python-Dev] Memory Allocator Part 2: Did I get it right? In-Reply-To: <1f7befae05021514524d0a35ec@mail.gmail.com> References: <8b28704b4465e03002fc70db5facedb6@uwaterloo.ca> <1f7befae05021514524d0a35ec@mail.gmail.com> Message-ID: <4c0d14b0b08390d046e1220b6f360745@uwaterloo.ca> On Feb 15, 2005, at 17:52, Tim Peters wrote: > As I said before, I don't think we need to support this any more. > More, I think we should not -- the support code is excruciatingly > subtle, it wasted plenty of your time trying to keep it working, and > if we keep it in it's going to continue to waste time over the coming > years (for example, in the short term, it will waste my time reviewing > it). I do not have nearly enough experience in the Python world to evaluate this decision. 
I've only been programming in Python for about two years now, and as I am sure you are aware, this is my first patch that I have submitted to Python. I don't really know my way around the Python internals, beyond writing basic extensions in C. Martin's opinion is clearly the opposite of yours. Basically, the debate seems to boil down to maintaining backwards compatibility at the cost of making the code in obmalloc.c harder to understand. The particular case that is being supported could definitely be viewed as a "bug" in the code that using obmalloc. It also likely is quite rare. However, until now it has been supported, so it is hard to judge exactly how much code would be affected. It would definitely be a minor barrier to moving to Python 2.5. Is there some sort of consensus that is possible on this issue? >> While one thread holds the GIL, any other thread can call >> PyObject_Free >> with a pointer that was returned by the system malloc. > What _was_ supported was more generally that > > any number of threads could call PyObject_Free with pointers that > were > returned by the system malloc/realloc > > at the same time as > > a single thread, holding the GIL, was doing anything whatsoever > (including > executing any code inside obmalloc.c) Okay, good, that is what I have assumed. > Although that's a misleading way of expressing the actual intent; more > on that below. That's fine. It may be a misleading description of the intent, but it is an accurate description of the required behaviour. At least I hope it is. > I expect it would be easier if you > ripped out the horrid support for PyObject_Free abuse; in a sane > world, the release-build PyMem_FREE, PyMem_Del, and PyMem_DEL would > expand to "free" instead of to "PyObject_FREE" (via changes to > pymem.h). It turns out that basically the only thing that would change would be removing the "volatile" specifiers from two of the global variables, plus it would remove about 100 lines of comments. :) The "work" was basically just hurting my brain trying to reason about the concurrency issues, not changing code. > It was never legit to do #a without holding the GIL. It was clear as > mud whether it was legit to do #b without holding the GIL. If > PyMem_Del (etc) change to expand to "free" in a release build, then #b > can remain clear as mud without harming anyone. Nobody should be > doing #a anymore. If someone still is, "tough luck -- fix it, you've > had years of warning" is easy for me to live with at this stage. Hmm... The issue is that case #a may not be an easy problem to diagnose: Some implementations of free() will happily do nothing if they are passed a pointer they know nothing about. This would just result in a memory leak. Other implementations of free() can output a warning or crash in this case, which would make it trivial to locate. > I suppose the other consideration is that already-compiled extension > modules on non-Windows(*) systems will, if they're not recompiled, > continue to call PyObject_Free everywhere they had a > PyMem_Del/DEL/FREE call. Is it guaranteed that extension modules will be binary compatible with future Python releases? I didn't think this was the case. Thanks for the feedback, Evan Jones From tim.peters at gmail.com Wed Feb 16 05:26:18 2005 From: tim.peters at gmail.com (Tim Peters) Date: Wed Feb 16 05:26:22 2005 Subject: [Python-Dev] Memory Allocator Part 2: Did I get it right? 
In-Reply-To: <4c0d14b0b08390d046e1220b6f360745@uwaterloo.ca> References: <8b28704b4465e03002fc70db5facedb6@uwaterloo.ca> <1f7befae05021514524d0a35ec@mail.gmail.com> <4c0d14b0b08390d046e1220b6f360745@uwaterloo.ca> Message-ID: <1f7befae05021520263d77a2a3@mail.gmail.com> [Tim Peters] >> As I said before, I don't think we need to support this any more. >> More, I think we should not -- the support code is excruciatingly >> subtle, it wasted plenty of your time trying to keep it working, and >> if we keep it in it's going to continue to waste time over the coming >> years (for example, in the short term, it will waste my time reviewing >> it). [Evan Jones] > I do not have nearly enough experience in the Python world to evaluate > this decision. I've only been programming in Python for about two years > now, and as I am sure you are aware, this is my first patch that I have > submitted to Python. I don't really know my way around the Python > internals, beyond writing basic extensions in C. Martin's opinion is > clearly the opposite of yours. ? This is all I recall Martin saying about this: http://mail.python.org/pipermail/python-dev/2005-January/051265.html I'm not certain it is acceptable to make this assumption. Why is it not possible to use the same approach that was previously used (i.e. leak the arenas array)? Do you have something else in mind? I'll talk with Martin about it if he still wants to. Martin, this miserable code must die! > Basically, the debate seems to boil down to maintaining backwards > compatibility at the cost of making the code in obmalloc.c harder to > understand. The "let it leak to avoid thread problems" cruft is arguably the single most obscure bit of coding in Python's code base. I created it, so I get to say that . Even 100 lines of comments aren't enough to make it clear, as you've discovered. I've lost track of how many hours of my life have been pissed away explaining it, and its consequences (like how come this or that memory-checking program complains about the memory leak it causes), and the historical madness that gave rise to it in the beginning. I've had enough of it -- the only purpose this part ever had was to protect against C code that wasn't playing by the rules anyway. BFD. There are many ways to provoke segfaults with C code that breaks the rules, and there's just not anything that special about this way _except_ that I added objectionable (even at the time) hacks to preserve this kind of broken C code until authors had time to fix it. Time's up. > The particular case that is being supported could definitely be viewed > as a "bug" in the code that using obmalloc. It also likely is quite rare. > However, until now it has been supported, so it is hard to judge exactly > how much code would be affected. People spent many hours searching for affected code when it first went in, and only found a few examples then, in obscure extension modules. It's unlikely usage has grown. The hack was put it in for the dubious benefit of the few examples that were found then. > It would definitely be a minor barrier to moving to Python 2.5. That's in part what python-dev is for. Of course nobody here has code that will break -- but the majority of high-use extension modules are maintained by people who read this list, so that's not as empty as it sounds. It's also what alpha and beta releases are for. Fear of change isn't a good enough reason to maintain this code. > Is there some sort of consensus that is possible on this issue? 
Absolutely, provided it matches my view <0.5 wink>. Rip it out, and if alpha/beta testing suggests that's a disaster, _maybe_ put it back in. ... > It turns out that basically the only thing that would change would be > removing the "volatile" specifiers from two of the global variables, > plus it would remove about 100 lines of comments. :) The "work" was > basically just hurting my brain trying to reason about the concurrency > issues, not changing code. And the brain of everyone else who ever bumps into this. There's a high probability that if this code actually doesn't work (can you produce a formal proof of correctness for it? I can't -- and I tried), nothing can be done to repair it; and code this outrageously delicate has a decent chance of being buggy no matter how many people stare at it (overlooking that you + me isn't that many). You also mentioned before that removing the "volatile"s may have given a speed boost, and that's believable. I mentioned above the unending costs in explanations, and nuisance gripes from memory-integrity tools about the deliberate leaks. There are many kinds of ongoing costs here, and no _intended_ benefit anymore (it certainly wasn't my intent to cater to buggy C code forever). >> It was never legit to do #a without holding the GIL. It was clear as >> mud whether it was legit to do #b without holding the GIL. If >> PyMem_Del (etc) change to expand to "free" in a release build, then #b >> can remain clear as mud without harming anyone. Nobody should be >> doing #a anymore. If someone still is, "tough luck -- fix it, you've >> had years of warning" is easy for me to live with at this stage. > Hmm... The issue is that case #a may not be an easy problem to > diagnose: Many errors in C code are difficult to diagnose. That's life. Mixing a PyObject call with a PyMem call is obvious now "by eyeball", so if there is such code still out there, and it blows up, an experienced eye has a good chance of spotting the error at once. ' > Some implementations of free() will happily do nothing if > they are passed a pointer they know nothing about. This would just > result in a memory leak. Other implementations of free() can output a > warning or crash in this case, which would make it trivial to locate. I expect most implementations of free() would end up corrupting memory state, leading to no symptoms or to disastrous symptoms, from 0 to a googol cycles after the mistake was made. Errors in using malloc/free are often nightmares to debug. We're not trying to make coding in C pleasant here -- which is good, because that's unachievable . >> I suppose the other consideration is that already-compiled extension >> modules on non-Windows(*) systems will, if they're not recompiled, >> continue to call PyObject_Free everywhere they had a >> PyMem_Del/DEL/FREE call. > Is it guaranteed that extension modules will be binary compatible with > future Python releases? I didn't think this was the case. Nope, that's not guarantfeed. There's a magic number (PYTHON_API_VERSION) that changes whenever the Python C API undergoes an incompatible change, and binary compatibility is guaranteed across releases if that doesn't change. The then-current value of PYTHON_API_VERSION gets compiled into extensions, by virtue of the module-initialization macro their initialization function has to call. 
The guts of that function are in the Python core (Py_InitModule4()), which raises this warning if the passed-in version doesn't match the current version: "Python C API version mismatch for module %.100s:\ This Python has API version %d, module %.100s has version %d."; This is _just_ a warning, though. Perhaps unfortunately for Python's users, Guido learned long ago that most API mismatches don't actually matter for his own code . For example, the C API officially changed when the signature of PyFrame_New() changed in 2001 -- but almost no extension modules call that function. Similarly, if we change PyMem_Del (etc) to map to the system free(), PYTHON_API_VERSION should be bumped for Python 2.5 -- but many people will ignore the mismatch warning, and again it will probably make no difference (if there's code still out there that calls PyMem_DEL (etc) without holding the GIL, I don't know about it). From kbk at shore.net Wed Feb 16 06:32:36 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Wed Feb 16 06:32:48 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200502160532.j1G5Wahi031058@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 298 open (+14) / 2754 closed ( +6) / 3052 total (+20) Bugs : 823 open (+19) / 4829 closed (+17) / 5652 total (+36) RFE : 168 open ( +1) / 144 closed ( +2) / 312 total ( +3) New / Reopened Patches ______________________ date.strptime and time.strptime as well (2005-02-04) http://python.org/sf/1116362 opened by Josh NameError in cookielib domain check (2005-02-04) CLOSED http://python.org/sf/1116583 opened by Chad Miller Minor improvement on 1116583 (2005-02-06) http://python.org/sf/1117114 opened by John J Lee cookielib and cookies with special names (2005-02-06) http://python.org/sf/1117339 opened by John J Lee cookielib LWPCookieJar and MozillaCookieJar exceptions (2005-02-06) http://python.org/sf/1117398 opened by John J Lee cookielib.LWPCookieJar incorrectly loads value-less cookies (2005-02-06) http://python.org/sf/1117454 opened by John J Lee urllib2 .getheaders attribute error (2005-02-07) http://python.org/sf/1117588 opened by Wummel replace md5 impl. with one having a more free license (2005-02-07) CLOSED http://python.org/sf/1117961 opened by Matthias Klose unknown locale: lt_LT (patch) (2005-02-08) http://python.org/sf/1118341 opened by Nerijus Baliunas Fix crash in xmlprase_GetInputContext in pyexpat.c (2005-02-08) http://python.org/sf/1118602 opened by Mathieu Fenniak enable time + timedelta (2005-02-08) http://python.org/sf/1118748 opened by Josh fix for a bug in Header.__unicode__() (2005-02-09) CLOSED http://python.org/sf/1119016 opened by Bj?rn Lindqvist python -c readlink()s and stat()s '-c' (2005-02-09) http://python.org/sf/1119423 opened by Brian Foley patches to compile for AIX 4.1.x (2005-02-09) http://python.org/sf/1119626 opened by Stuart D. Gathman better datetime support for xmlrpclib (2005-02-10) http://python.org/sf/1120353 opened by Fred L. Drake, Jr. ZipFile.open - read-only file-like obj for files in archive (2005-02-11) http://python.org/sf/1121142 opened by Alan McIntyre Reference count bug fix (2005-02-12) http://python.org/sf/1121234 opened by Michiel de Hoon sha and md5 modules should use OpenSSL when possible (2005-02-12) http://python.org/sf/1121611 opened by Gregory P. 
Smith Python memory allocator: Free memory (2005-02-15) http://python.org/sf/1123430 opened by Evan Jones Patches Closed ______________ Add SSL certificate validation (2005-02-03) http://python.org/sf/1115631 closed by noonian NameError in cookielib domain check (2005-02-04) http://python.org/sf/1116583 closed by rhettinger replace md5 impl. with one having a more free license (2005-02-07) http://python.org/sf/1117961 closed by loewis fix for a bug in Header.__unicode__() (2005-02-09) http://python.org/sf/1119016 closed by sonderblade time.tzset() not built on Solaris (2005-01-04) http://python.org/sf/1096244 closed by bcannon OSATerminology extension fix (2004-06-25) http://python.org/sf/979784 closed by jackjansen New / Reopened Bugs ___________________ xmlrpclib: wrong decoding in '_stringify' (2005-02-04) CLOSED http://python.org/sf/1115989 opened by Dieter Maurer Prefix search is filesystem-centric (2005-02-04) http://python.org/sf/1116520 opened by Steve Holden Wrong match with regex, non-greedy problem (2005-02-05) CLOSED http://python.org/sf/1116571 opened by rengel Solaris 10 fails to compile complexobject.c (2005-02-04) http://python.org/sf/1116722 opened by Case Van Horsen Dictionary Evaluation Issue (2005-02-05) http://python.org/sf/1117048 opened by WalterBrunswick Typo in list.sort() documentation (2005-02-06) CLOSED http://python.org/sf/1117063 opened by Viktor Ferenczi sgmllib.SGMLParser (2005-02-06) CLOSED http://python.org/sf/1117302 opened by Paul Birnie SimpleHTTPServer and mimetypes: almost together (2005-02-06) http://python.org/sf/1117556 opened by Matthew L Daniel os.path.exists returns false negatives in MAC environments. (2005-02-07) http://python.org/sf/1117601 opened by Stephen Bennett profiler: Bad return and Bad call errors with exceptions (2005-02-06) http://python.org/sf/1117670 opened by Matthew Mueller "in" operator bug ? 
(2005-02-07) CLOSED http://python.org/sf/1117757 opened by Andrea Bolzonella BSDDB openhash (2005-02-07) http://python.org/sf/1117761 opened by Andrea Bolzonella lists coupled (2005-02-07) CLOSED http://python.org/sf/1118101 opened by chopf Error in representation of complex numbers(again) (2005-02-09) http://python.org/sf/1118729 opened by George Yoshida builtin file() vanishes (2005-02-08) CLOSED http://python.org/sf/1118977 opened by Barry Alan Scott Docs for set() omit constructor (2005-02-09) CLOSED http://python.org/sf/1119282 opened by Kent Johnson curses.initscr - initscr exit w/o env(TERM) set (2005-02-09) http://python.org/sf/1119331 opened by Jacob Lilly xrange() builtin accepts keyword arg silently (2005-02-09) http://python.org/sf/1119418 opened by Martin Blais Python Programming FAQ should be updated for Python 2.4 (2005-02-09) http://python.org/sf/1119439 opened by Michael Hoffman ScrolledText allows Frame.bbox to hide Text.bbox (2005-02-09) http://python.org/sf/1119673 opened by Drew Perttula list extend() accepts args besides lists (2005-02-09) CLOSED http://python.org/sf/1119700 opened by Dan Everhart Static library incompatible with nptl (2005-02-10) http://python.org/sf/1119860 opened by daniel Static library incompatible with nptl (2005-02-10) CLOSED http://python.org/sf/1119866 opened by daniel Python 2.4.0 crashes with a segfault, EXAMPLE ATTACHED (2005-02-11) http://python.org/sf/1120452 opened by Viktor Ferenczi bug in unichr() documentation (2005-02-11) http://python.org/sf/1120777 opened by Marko Kreen Problem in join function definition (2005-02-11) CLOSED http://python.org/sf/1120862 opened by yseb file seek error (2005-02-11) CLOSED http://python.org/sf/1121152 opened by Richard Lawhorn Python24.dll crashes, EXAMPLE ATTACHED (2005-02-12) http://python.org/sf/1121201 opened by Viktor Ferenczi zip incorrectly and incompletely documented (2005-02-12) http://python.org/sf/1121416 opened by Alan Decorated functions are unpickleable (2005-02-12) CLOSED http://python.org/sf/1121475 opened by S Joshua Swamidass distutils.dir_utils not unicode compatible (2005-02-12) http://python.org/sf/1121494 opened by Morten Lied Johansen subprocess example missing "stdout=PIPE" (2005-02-12) http://python.org/sf/1121579 opened by Monte Davidoff SMTPHandler argument misdescribed (2005-02-13) http://python.org/sf/1121875 opened by Peter marshal may crash on truncated input (2005-02-14) http://python.org/sf/1122301 opened by Fredrik Lundh incorrect handle of declaration in markupbase (2005-02-14) http://python.org/sf/1122916 opened by Wai Yip Tung Typo in Curses-Function doc (2005-02-15) http://python.org/sf/1123268 opened by Aaron C. Spike test_peepholer failing on HEAD (2005-02-15) CLOSED http://python.org/sf/1123354 opened by Tim Peters add SHA256/384/512 to lib (2005-02-16) http://python.org/sf/1123660 opened by paul rubin Bugs Closed ___________ xmlrpclib: wrong decoding in '_stringify' (2005-02-04) http://python.org/sf/1115989 closed by fdrake Wrong match with regex, non-greedy problem (2005-02-05) http://python.org/sf/1116571 closed by effbot Typo in list.sort() documentation (2005-02-05) http://python.org/sf/1117063 closed by rhettinger sgmllib.SGMLParser (2005-02-06) http://python.org/sf/1117302 closed by effbot PyThreadState_SetAsyncExc segfault (2004-11-18) http://python.org/sf/1069160 closed by gvanrossum "in" operator bug ? 
(2005-02-07) http://python.org/sf/1117757 closed by tim_one lists coupled (2005-02-07) http://python.org/sf/1118101 closed by tim_one builtin file() vanishes (2005-02-09) http://python.org/sf/1118977 closed by loewis Docs for set() omit constructor (2005-02-09) http://python.org/sf/1119282 closed by rhettinger list extend() accepts args besides lists (2005-02-09) http://python.org/sf/1119700 closed by rhettinger Static library incompatible with nptl (2005-02-10) http://python.org/sf/1119866 closed by ekloef Problem in join function definition (2005-02-11) http://python.org/sf/1120862 closed by rhettinger file seek error (2005-02-11) http://python.org/sf/1121152 closed by tim_one Decorated functions are unpickleable (2005-02-12) http://python.org/sf/1121475 closed by bcannon "Macintosh" references in the docs need to be checked. (2005-01-04) http://python.org/sf/1095802 closed by bcannon RE '*.?' cores if len of found string exceeds 10000 (2004-10-26) http://python.org/sf/1054564 closed by effbot missing mappings in locale tables (2002-10-09) http://python.org/sf/620739 closed by effbot test_peepholer failing on HEAD (2005-02-15) http://python.org/sf/1123354 closed by tim_one New / Reopened RFE __________________ urllib.urlopen should put the http-error-code in .info() (2005-02-07) http://python.org/sf/1117751 opened by Robert Kiendl Option to force variables to be declared (2005-02-14) http://python.org/sf/1122279 opened by Zac Evans Line Numbers (2005-02-14) http://python.org/sf/1122532 opened by Egon Frerich RFE Closed __________ commands.mkarg function should be public (2001-12-04) http://python.org/sf/489106 closed by donut Missing socketpair() function. (2002-06-12) http://python.org/sf/567969 closed by grahamh From martin at v.loewis.de Wed Feb 16 08:50:51 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Feb 16 08:50:55 2005 Subject: [Python-Dev] Memory Allocator Part 2: Did I get it right? In-Reply-To: <1f7befae05021520263d77a2a3@mail.gmail.com> References: <8b28704b4465e03002fc70db5facedb6@uwaterloo.ca> <1f7befae05021514524d0a35ec@mail.gmail.com> <4c0d14b0b08390d046e1220b6f360745@uwaterloo.ca> <1f7befae05021520263d77a2a3@mail.gmail.com> Message-ID: <4212FB5B.1030209@v.loewis.de> Tim Peters wrote: > I'm not certain it is acceptable to make this assumption. Why is it > not possible to use the same approach that was previously used (i.e. > leak the arenas array)? > > Do you have something else in mind? I'll talk with Martin about it if > he still wants to. Martin, this miserable code must die! That's fine with me. I meant what I said: "I'm not certain". The patch original claimed that it cannot possibly preserve this feature, and I felt that this claim was incorrect - indeed, Evan then understood the feature, and made it possible. I can personally accept breaking the code that still relies on the invalid APIs. The only problem is that it is really hard to determine whether some code *does* violate the API usage. Regards, Martin From konrad.hinsen at laposte.net Thu Feb 10 09:38:40 2005 From: konrad.hinsen at laposte.net (konrad.hinsen@laposte.net) Date: Wed Feb 16 14:20:17 2005 Subject: [Numpy-discussion] Re: [Python-Dev] Re: Numeric life as I see it In-Reply-To: References: <420A8406.4020808@ee.byu.edu> <420AAC33.807@ee.byu.edu> <420AB084.1000008@v.loewis.de> <420AB928.3090004@pfdubois.com> <420ADE90.9050304@ee.byu.edu> Message-ID: On 10.02.2005, at 05:36, Guido van Rossum wrote: > And why would a Matrix need to inherit from a C-array? 
Wouldn't it > make more sense from an OO POV for the Matrix to *have* a C-array > without *being* one? Definitely. Most array operations make no sense on matrices. And matrices are limited to two dimensions. Making Matrix a subclass of Array would be inheritance for implementation while removing 90% of the interface. On the other hand, a Matrix object is perfectly defined by its behaviour and independent of its implementation. One could perfectly well implement one using Python lists or dictionaries, even though that would be pointless from a performance point of view. Konrad. -- ------------------------------------------------------------------------ ------- Konrad Hinsen Laboratoire Leon Brillouin, CEA Saclay, 91191 Gif-sur-Yvette Cedex, France Tel.: +33-1 69 08 79 25 Fax: +33-1 69 08 82 61 E-Mail: khinsen@cea.fr ------------------------------------------------------------------------ ------- From konrad.hinsen at laposte.net Thu Feb 10 09:45:28 2005 From: konrad.hinsen at laposte.net (konrad.hinsen@laposte.net) Date: Wed Feb 16 14:20:18 2005 Subject: [Python-Dev] Re: [Numpy-discussion] Re: Numeric life as I see it In-Reply-To: <420ADE90.9050304@ee.byu.edu> References: <420A8406.4020808@ee.byu.edu> <420AAC33.807@ee.byu.edu> <420AB084.1000008@v.loewis.de> <420AB928.3090004@pfdubois.com> <420ADE90.9050304@ee.byu.edu> Message-ID: <1c3044466186480f55ef45d2c977731b@laposte.net> On 10.02.2005, at 05:09, Travis Oliphant wrote: > I'm not sure I agree. The ufuncobject is the only place where this > concern existed (should we trip OverFlow, ZeroDivision, etc. errors > durring array math). Numarray introduced and implemented the concept > of error modes that can be pushed and popped. I believe this is the > right solution for the ufuncobject. Indeed. Note also that the ufunc stuff is less critical to agree on than the array data structure. Anyone unhappy with ufuncs could write their own module and use it instead. It would be the data structure and its access rules that fix the structure of all the code that uses it, so that's what needs to be acceptable to everyone. > One question we are pursuing is could the arrayobject get into the > core without a particular ufunc object. Most see this as > sub-optimal, but maybe it is the only way. Since all the artithmetic operations are in ufunc that would be suboptimal solution, but indeed still a workable one. > I appreciate some of what Paul is saying here, but I'm not fully > convinced that this is still true with Python 2.2 and up new-style > c-types. The concerns seem to be over the fact that you have to > re-implement everything in the sub-class because the base-class will > always return one of its objects instead of a sub-class object. I'd say that such discussions should be postponed until someone proposes a good use for subclassing arrays. Matrices are not one, in my opinion. Konrad. 
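As a toy illustration of the has-a design being argued for here (an editor's sketch built on plain nested lists, since the point is independent of any particular array implementation; none of the names are proposed API):

# "Matrix HAS an array": the matrix owns 2-D storage (nested lists here) and
# exposes only matrix-shaped behaviour, instead of inheriting the full
# N-dimensional array interface.
class Matrix:
    def __init__(self, rows):
        self._data = [list(row) for row in rows]    # wrapped storage

    def shape(self):
        return len(self._data), len(self._data[0])

    def __mul__(self, other):                       # matrix product only
        n, m = self.shape()
        m2, p = other.shape()
        if m != m2:
            raise ValueError("incompatible shapes")
        rows = []
        for i in range(n):
            rows.append([sum([self._data[i][k] * other._data[k][j]
                              for k in range(m)]) for j in range(p)])
        return Matrix(rows)

m = Matrix([[1, 2], [3, 4]])
print (m * m)._data        # [[7, 10], [15, 22]]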
-- ------------------------------------------------------------------------ ------- Konrad Hinsen Laboratoire Leon Brillouin, CEA Saclay, 91191 Gif-sur-Yvette Cedex, France Tel.: +33-1 69 08 79 25 Fax: +33-1 69 08 82 61 E-Mail: khinsen@cea.fr ------------------------------------------------------------------------ ------- From verveer at embl.de Thu Feb 10 10:53:10 2005 From: verveer at embl.de (Peter Verveer) Date: Wed Feb 16 14:20:20 2005 Subject: [Python-Dev] Re: [Numpy-discussion] Re: Numeric life as I see it In-Reply-To: <420B29AE.8030701@ee.byu.edu> References: <420A8406.4020808@ee.byu.edu> <420AAC33.807@ee.byu.edu> <420AB084.1000008@v.loewis.de> <420AB928.3090004@pfdubois.com> <420ADE90.9050304@ee.byu.edu> <1c3044466186480f55ef45d2c977731b@laposte.net> <420B29AE.8030701@ee.byu.edu> Message-ID: <50ac60a36c2add7d708dc02d8bf623a3@embl.de> On Feb 10, 2005, at 10:30 AM, Travis Oliphant wrote: > >>> One question we are pursuing is could the arrayobject get into the >>> core without a particular ufunc object. Most see this as >>> sub-optimal, but maybe it is the only way. >> >> >> Since all the artithmetic operations are in ufunc that would be >> suboptimal solution, but indeed still a workable one. > > > I think replacing basic number operations of the arrayobject should > simple, so perhaps a default ufunc object could be worked out for > inclusion. I agree, getting it in the core is among others, intended to give it broad access, not just to hard-core numeric people. For many uses (including many of my simpler scripts) you don't need the more exotic functionality of ufuncs. You could just do with implementing the standard math functions, possibly leaving out things like reduce. That would be very easy to implement. > >> >>> I appreciate some of what Paul is saying here, but I'm not fully >>> convinced that this is still true with Python 2.2 and up new-style >>> c-types. The concerns seem to be over the fact that you have to >>> re-implement everything in the sub-class because the base-class will >>> always return one of its objects instead of a sub-class object. >> >> >> I'd say that such discussions should be postponed until someone >> proposes a good use for subclassing arrays. Matrices are not one, in >> my opinion. >> > Agreed. It is is not critical to what I am doing, and I obviously > need more understanding before tackling such things. Numeric3 uses > the new c-type largely because of the nice getsets table which is > separate from the methods table. This replaces the rather ugly > C-functions getattr and setattr. I would agree that sub-classing arrays might not be worth the trouble. Peter From perry at stsci.edu Thu Feb 10 16:21:24 2005 From: perry at stsci.edu (Perry Greenfield) Date: Wed Feb 16 14:20:21 2005 Subject: [Python-Dev] RE: [Numpy-discussion] Numeric life as I see it In-Reply-To: <420AB928.3090004@pfdubois.com> Message-ID: Paul Dubois wrote: > > Aside: While I am at it, let me reiterate what I have said to the other > developers privately: there is NO value to inheriting from the array > class. Don't try to achieve that capability if it costs anything, even > just effort, because it buys you nothing. Those of you who keep > remarking on this as if it would simply haven't thought it through IMHO. > It sounds so intellectually appealing that David Ascher and I had a > version of Numeric that almost did it before we realized our folly. 
> To be contrarian, we did find great benefit (at least initially) for inheritance for developing the record array and character array classes since they share so many structural operations (indexing, slicing, transposes, concatenation, etc.) with numeric arrays. It's possible that the approach that Travis is considering doesn't need to use inheritance to accomplish this (I don't know enough about the details yet), but it sure did save a lot of duplication of implementation. I do understand what you are getting at. Any numerical array inheritance generally forces one to reimplement all ufuncs and such, and that does make it less useful in that case (though I still wonder if it still isn't better than the alternatives) Perry Greenfield From nick at ilm.com Fri Feb 11 23:32:15 2005 From: nick at ilm.com (Nick Rasmussen) Date: Wed Feb 16 14:20:22 2005 Subject: [Python-Dev] subclassing PyCFunction_Type Message-ID: <20050211223215.GS14902@ewok.lucasdigital.com> tommy said that this would be the best place to ask this question.... I'm trying to get functions wrapped via boost to show up as builtin types so that pydoc includes them when documenting the module containing them. Right now boost python functions are created using a PyTypeObject such that when inspect.isbuiltin does: return isinstance(object, types.BuiltinFunctionType) isintance returns 0. Initially I had just modified a local pydoc to document all functions with unknown source modules (since the module can't be deduced from non-python functions), but I figured that the right fix was to get boost::python functions to correctly show up as builtins, so I tried setting PyCFunction_Type as the boost function type object's tp_base, which worked fine for me using linux on amd64, but when my patch was tried out on other platforms, it ran into regression test failures: http://mail.python.org/pipermail/c++-sig/2005-February/008545.html So I have some questions: Should boost::python functions be modified in some way to show up as builtin function types or is the right fix really to patch pydoc? Is PyCFunction_Type intended to be subclassable? I noticed that it does not have Py_TPFLAGS_BASETYPE set in its tp_flags. Also, PyCFunction_Type has Py_TPFLAGS_HAVE_GC, and as the assertion failures in the testsuite seemed to be centered around object allocation/ garbage collection, so is there something related to subclassing a gc-aware class that needs to be happening (currently the boost type object doesn't support garbage collection). If subclassing PyCFunction_Type isn't the right way to make these functions be considered as builtin functions, what is? -nick From apolinejuliet at yahoo.com Mon Feb 14 04:31:40 2005 From: apolinejuliet at yahoo.com (apoline juliet obina) Date: Wed Feb 16 14:20:24 2005 Subject: [Python-Dev] Py2.3.1 Message-ID: <20050214033140.60072.qmail@web30707.mail.mud.yahoo.com> iis it "pydos" ? your net add?/ --------------------------------- Yahoo! Messenger - Communicate instantly..."Ping" your friends today! Download Messenger Now -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20050214/ca6c95d6/attachment.htm From Martin.Gfeller at comit.ch Mon Feb 14 19:41:51 2005 From: Martin.Gfeller at comit.ch (Gfeller Martin) Date: Wed Feb 16 14:20:25 2005 Subject: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15% Message-ID: Dear all, I'm running a large Zope application on a 1x1GHz CPU 1GB mem Window XP Prof machine using Zope 2.7.3 and Py 2.3.4 The application typically builds large lists by appending and extending them. We regularly observed that using a given functionality a second time using the same process was much slower (50%) than when it ran the first time after startup. This behavior greatly improved with Python 2.3 (thanks to the improved Python object allocator, I presume). Nevertheless, I tried to convert the heap used by Python to a Windows Low Fragmentation Heap (available on XP and 2003 Server). This improved the overall run time of a typical CPU-intensive report by about 15% (overall run time is in the 5 minutes range), with the same memory consumption. I consider 15% significant enough to let you know about it. For information about the Low Fragmentation Heap, see http://msdn.microsoft.com/library/default.asp?url=/library/en-us/memory/base/low_fragmentation_heap.asp Best regards, Martin PS: Since I don't speak C, I used ctypes to convert all heaps in the process to LFH (I don't know how to determine which one is the C heap). ________________________ COMIT AG Risk Management Systems Pflanzschulstrasse 7 CH-8004 Z?rich Telefon +41 (44) 1 298 92 84 http://www.comit.ch http://www.quantax.com - Quantax Trading and Risk System From leogah at spamcop.net Mon Feb 14 23:35:31 2005 From: leogah at spamcop.net (Richard Brodie) Date: Wed Feb 16 14:20:26 2005 Subject: [Python-Dev] builtin_id() returns negative numbers Message-ID: <000701c512e5$7de81660$af0189c3@oemcomputer> > Maybe it's just a wart we have to live with now; OTOH, > the docs explicitly warn that id() may return a long, so any code > relying on "short int"-ness has always been relying on an > implementation quirk. Well, the docs say that %x does unsigned conversion, so they've been relying on an implementation quirk as well ;) Would it be practical to add new conversion syntax to string interpolation? Like, for example, %p as an unsigned hex number the same size as (void *). Otherwise, unless I misunderstand integer unification, one would just have to strike the distinction between, say, %d and %u. From mwh at python.net Wed Feb 16 14:33:28 2005 From: mwh at python.net (Michael Hudson) Date: Wed Feb 16 14:33:31 2005 Subject: [Python-Dev] subclassing PyCFunction_Type In-Reply-To: <20050211223215.GS14902@ewok.lucasdigital.com> (Nick Rasmussen's message of "Fri, 11 Feb 2005 14:32:15 -0800") References: <20050211223215.GS14902@ewok.lucasdigital.com> Message-ID: <2m4qgc1vfb.fsf@starship.python.net> Nick Rasmussen writes: [five days ago] > Should boost::python functions be modified in some way to show > up as builtin function types or is the right fix really to patch > pydoc? My heart leans towards the latter. > Is PyCFunction_Type intended to be subclassable? Doesn't look like it, does it? :) More seriosly, "no". Cheers, mwh -- ARTHUR: Don't ask me how it works or I'll start to whimper. -- The Hitch-Hikers Guide to the Galaxy, Episode 11 From pje at telecommunity.com Wed Feb 16 17:02:18 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Wed Feb 16 17:00:32 2005 Subject: [Python-Dev] subclassing PyCFunction_Type In-Reply-To: <20050211223215.GS14902@ewok.lucasdigital.com> Message-ID: <5.1.1.6.0.20050216110025.02fb7e70@mail.telecommunity.com> At 02:32 PM 2/11/05 -0800, Nick Rasmussen wrote: >tommy said that this would be the best place to ask >this question.... > >I'm trying to get functions wrapped via boost to show >up as builtin types so that pydoc includes them when >documenting the module containing them. Right now >boost python functions are created using a PyTypeObject >such that when inspect.isbuiltin does: > > return isinstance(object, types.BuiltinFunctionType) FYI, this may not be the "right" way to do this, but since 2.3 'isinstance()' looks at an object's __class__ rather than its type(), so you could perhaps include a '__class__' descriptor in your method type that returns BuiltinFunctionType and see if that works. It's a kludge, but it might let your code work with existing versions of Python. From bob at redivi.com Wed Feb 16 17:26:34 2005 From: bob at redivi.com (Bob Ippolito) Date: Wed Feb 16 17:26:43 2005 Subject: [Python-Dev] subclassing PyCFunction_Type In-Reply-To: <5.1.1.6.0.20050216110025.02fb7e70@mail.telecommunity.com> References: <5.1.1.6.0.20050216110025.02fb7e70@mail.telecommunity.com> Message-ID: <5614e00fb134b968fa76a1896c456f4a@redivi.com> On Feb 16, 2005, at 11:02, Phillip J. Eby wrote: > At 02:32 PM 2/11/05 -0800, Nick Rasmussen wrote: >> tommy said that this would be the best place to ask >> this question.... >> >> I'm trying to get functions wrapped via boost to show >> up as builtin types so that pydoc includes them when >> documenting the module containing them. Right now >> boost python functions are created using a PyTypeObject >> such that when inspect.isbuiltin does: >> >> return isinstance(object, types.BuiltinFunctionType) > > FYI, this may not be the "right" way to do this, but since 2.3 > 'isinstance()' looks at an object's __class__ rather than its type(), > so you could perhaps include a '__class__' descriptor in your method > type that returns BuiltinFunctionType and see if that works. > > It's a kludge, but it might let your code work with existing versions > of Python. It works in Python 2.3.0: import types class FakeBuiltin(object): __doc__ = property(lambda self: self.doc) __name__ = property(lambda self: self.name) __self__ = property(lambda self: None) __class__ = property(lambda self: types.BuiltinFunctionType) def __init__(self, name, doc): self.name = name self.doc = doc >>> help(FakeBuiltin("name", "name(foo, bar, baz) -> rval")) Help on built-in function name: name(...) name(foo, bar, baz) -> rval -bob From pje at telecommunity.com Wed Feb 16 17:43:51 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Feb 16 17:42:04 2005 Subject: [Python-Dev] subclassing PyCFunction_Type In-Reply-To: <5614e00fb134b968fa76a1896c456f4a@redivi.com> References: <5.1.1.6.0.20050216110025.02fb7e70@mail.telecommunity.com> <5.1.1.6.0.20050216110025.02fb7e70@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050216114230.037364a0@mail.telecommunity.com> At 11:26 AM 2/16/05 -0500, Bob Ippolito wrote: > >>> help(FakeBuiltin("name", "name(foo, bar, baz) -> rval")) >Help on built-in function name: > >name(...) > name(foo, bar, baz) -> rval If you wanted to be even more ambitious, you could return FunctionType and have a fake func_code so pydoc will be able to see the argument signature directly. 
:) From bob at redivi.com Wed Feb 16 17:52:56 2005 From: bob at redivi.com (Bob Ippolito) Date: Wed Feb 16 17:53:11 2005 Subject: [Python-Dev] subclassing PyCFunction_Type In-Reply-To: <5.1.1.6.0.20050216114230.037364a0@mail.telecommunity.com> References: <5.1.1.6.0.20050216110025.02fb7e70@mail.telecommunity.com> <5.1.1.6.0.20050216110025.02fb7e70@mail.telecommunity.com> <5.1.1.6.0.20050216114230.037364a0@mail.telecommunity.com> Message-ID: <640f0846671b73a92939648d278e4861@redivi.com> On Feb 16, 2005, at 11:43, Phillip J. Eby wrote: > At 11:26 AM 2/16/05 -0500, Bob Ippolito wrote: >> >>> help(FakeBuiltin("name", "name(foo, bar, baz) -> rval")) >> Help on built-in function name: >> >> name(...) >> name(foo, bar, baz) -> rval > > If you wanted to be even more ambitious, you could return FunctionType > and have a fake func_code so pydoc will be able to see the argument > signature directly. :) I was thinking that too, but I didn't have the energy to code it in an email :) -bob From fredrik at pythonware.com Wed Feb 16 21:08:14 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Feb 16 21:19:07 2005 Subject: [Python-Dev] string find(substring) vs. substring in string Message-ID: any special reason why "in" is faster if the substring is found, but a lot slower if it's not in there? timeit -s "s = 'not there'*100" "s.find('not there') != -1" 1000000 loops, best of 3: 0.749 usec per loop timeit -s "s = 'not there'*100" "'not there' in s" 10000000 loops, best of 3: 0.122 usec per loop timeit -s "s = 'not the xyz'*100" "s.find('not there') != -1" 100000 loops, best of 3: 7.03 usec per loop timeit -s "s = 'not the xyz'*100" "'not there' in s" 10000 loops, best of 3: 25.9 usec per loop ps. btw, it's about time we did something about this: timeit -s "s = 'not the xyz'*100" -s "import re; p = re.compile('not there')" "p.search(s)" 100000 loops, best of 3: 5.72 usec per loop From FBatista at uniFON.com.ar Wed Feb 16 21:23:59 2005 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Wed Feb 16 21:28:28 2005 Subject: [Python-Dev] string find(substring) vs. substring in string Message-ID: [Fredrik Lundh] #- any special reason why "in" is faster if the substring is found, but #- a lot slower if it's not in there? Maybe because it stops searching when it finds it? The time seems to be very dependant of the position of the first match: fbatista@pytonisa ~/ota> python /usr/local/lib/python2.3/timeit.py -s "s = 'not there'*100" "'not there' in s" 1000000 loops, best of 3: 0.222 usec per loop fbatista@pytonisa ~/ota> python /usr/local/lib/python2.3/timeit.py -s "s = 'blah blah'*20 + 'not there'*100" "'not there' in s" 100000 loops, best of 3: 5.54 usec per loop fbatista@pytonisa ~/ota> python /usr/local/lib/python2.3/timeit.py -s "s = 'blah blah'*40 + 'not there'*100" "'not there' in s" 100000 loops, best of 3: 10.8 usec per loop . Facundo Bit?cora De Vuelo: http://www.taniquetil.com.ar/plog PyAr - Python Argentina: http://pyar.decode.com.ar/ -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050216/e799aff5/attachment.html From mike at skew.org Wed Feb 16 21:34:16 2005 From: mike at skew.org (Mike Brown) Date: Wed Feb 16 21:34:18 2005 Subject: [Python-Dev] string find(substring) vs. substring in string In-Reply-To: Message-ID: <200502162034.j1GKYGBU067236@chilled.skew.org> Fredrik Lundh wrote: > any special reason why "in" is faster if the substring is found, but > a lot slower if it's not in there? 
Just guessing here, but in general I would think that it would stop searching as soon as it found it, whereas until then, it keeps looking, which takes more time. But I would also hope that it would be smart enough to know that it doesn't need to look past the 2nd character in 'not the xyz' when it is searching for 'not there' (due to the lengths of the sequences). From amk at amk.ca Wed Feb 16 21:54:31 2005 From: amk at amk.ca (A.M. Kuchling) Date: Wed Feb 16 21:57:23 2005 Subject: [Python-Dev] string find(substring) vs. substring in string In-Reply-To: <200502162034.j1GKYGBU067236@chilled.skew.org> References: <200502162034.j1GKYGBU067236@chilled.skew.org> Message-ID: <20050216205431.GA8873@rogue.amk.ca> On Wed, Feb 16, 2005 at 01:34:16PM -0700, Mike Brown wrote: > time. But I would also hope that it would be smart enough to know that it > doesn't need to look past the 2nd character in 'not the xyz' when it is > searching for 'not there' (due to the lengths of the sequences). Assuming stringobject.c:string_contains is the right function, the code looks like this: size = PyString_GET_SIZE(el); rhs = PyString_AS_STRING(el); lhs = PyString_AS_STRING(a); /* optimize for a single character */ if (size == 1) return memchr(lhs, *rhs, PyString_GET_SIZE(a)) != NULL; end = lhs + (PyString_GET_SIZE(a) - size); while (lhs <= end) { if (memcmp(lhs++, rhs, size) == 0) return 1; } So it's doing a zillion memcmp()s. I don't think there's a more efficient way to do this with ANSI C; memmem() is a GNU extension that searches for blocks of memory. Perhaps saving some memcmps by writing if ((*lhs == *rhs) && memcmp(lhs++, rhs, size) == 0) would help. --amk From gvanrossum at gmail.com Wed Feb 16 22:03:10 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Feb 16 22:03:13 2005 Subject: [Python-Dev] string find(substring) vs. substring in string In-Reply-To: <20050216205431.GA8873@rogue.amk.ca> References: <200502162034.j1GKYGBU067236@chilled.skew.org> <20050216205431.GA8873@rogue.amk.ca> Message-ID: > Assuming stringobject.c:string_contains is the right function, the > code looks like this: > > size = PyString_GET_SIZE(el); > rhs = PyString_AS_STRING(el); > lhs = PyString_AS_STRING(a); > > /* optimize for a single character */ > if (size == 1) > return memchr(lhs, *rhs, PyString_GET_SIZE(a)) != NULL; > > end = lhs + (PyString_GET_SIZE(a) - size); > while (lhs <= end) { > if (memcmp(lhs++, rhs, size) == 0) > return 1; > } > > So it's doing a zillion memcmp()s. I don't think there's a more > efficient way to do this with ANSI C; memmem() is a GNU extension that > searches for blocks of memory. Perhaps saving some memcmps by writing > > if ((*lhs == *rhs) && memcmp(lhs++, rhs, size) == 0) > > would help. Which is exactly how s.find() wins this race. (I guess it loses when it's found by having to do the "find" lookup.) Maybe string_contains should just call string_find_internal()? And then there's the question of how the re module gets to be faster still; I suppose it doesn't bother with memcmp() at all. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From irmen at xs4all.nl Wed Feb 16 22:08:36 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Wed Feb 16 22:08:38 2005 Subject: [Python-Dev] string find(substring) vs. 
substring in string In-Reply-To: <200502162034.j1GKYGBU067236@chilled.skew.org> References: <200502162034.j1GKYGBU067236@chilled.skew.org> Message-ID: <4213B654.7070901@xs4all.nl> Mike Brown wrote: > Fredrik Lundh wrote: > >>any special reason why "in" is faster if the substring is found, but >>a lot slower if it's not in there? > > > Just guessing here, but in general I would think that it would stop searching > as soon as it found it, whereas until then, it keeps looking, which takes more > time. But I would also hope that it would be smart enough to know that it > doesn't need to look past the 2nd character in 'not the xyz' when it is > searching for 'not there' (due to the lengths of the sequences). There's the Boyer-Moore string search algorithm which is allegedly much faster than a simplistic scanning approach, and I also found this: http://portal.acm.org/citation.cfm?id=79184 So perhaps there's room for improvement :) --Irmen From fredrik at pythonware.com Wed Feb 16 22:19:20 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Feb 16 22:19:13 2005 Subject: [Python-Dev] Re: string find(substring) vs. substring in string References: <200502162034.j1GKYGBU067236@chilled.skew.org> <20050216205431.GA8873@rogue.amk.ca> Message-ID: A.M. Kuchling wrote: >> time. But I would also hope that it would be smart enough to know that it >> doesn't need to look past the 2nd character in 'not the xyz' when it is >> searching for 'not there' (due to the lengths of the sequences). > > Assuming stringobject.c:string_contains is the right function, the > code looks like this: > > size = PyString_GET_SIZE(el); > rhs = PyString_AS_STRING(el); > lhs = PyString_AS_STRING(a); > > /* optimize for a single character */ > if (size == 1) > return memchr(lhs, *rhs, PyString_GET_SIZE(a)) != NULL; > > end = lhs + (PyString_GET_SIZE(a) - size); > while (lhs <= end) { > if (memcmp(lhs++, rhs, size) == 0) > return 1; > } > > So it's doing a zillion memcmp()s. I don't think there's a more > efficient way to do this with ANSI C; memmem() is a GNU extension that > searches for blocks of memory. oops. so whoever implemented contains didn't even bother to look at the find implementation... (which uses the same brute-force algorithm, but a better implementation...) > Perhaps saving some memcmps by writing > > if ((*lhs == *rhs) && memcmp(lhs++, rhs, size) == 0) > > would help. memcmp still compiles to REP CMPB on many x86 compilers, and the setup overhead for memcmp sucks on modern x86 hardware; it's usually better to write your own bytewise comparision... (and the fact that we're still brute-force search algorithms in "find" is a bit embarrassing -- note that RE outperforms "in" by a factor of five.... guess it's time to finish the split/replace parts of stringlib and produce a patch... ;-) From fredrik at pythonware.com Wed Feb 16 22:23:03 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Feb 16 22:33:56 2005 Subject: [Python-Dev] Re: string find(substring) vs. substring in string References: <200502162034.j1GKYGBU067236@chilled.skew.org> Message-ID: Mike Brown wrote: >> any special reason why "in" is faster if the substring is found, but >> a lot slower if it's not in there? > > Just guessing here, but in general I would think that it would stop searching > as soon as it found it, whereas until then, it keeps looking, which takes more > time. the point was that string.find does the same thing, but is much faster in the "no match" case. 
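To make the behaviour being compared here concrete, the following is a tiny pure-Python model of the brute-force scan that both string_contains and string.find perform; the function name and the Python rendering are illustrative only (the real code is the C loop quoted above).

    def naive_contains(haystack, needle):
        # Slide the needle over every offset and bail out on the first hit.
        # The early return is why the "match" case in the benchmark is cheap,
        # while the "no match" case grinds through the whole haystack: O(n*m).
        n, m = len(haystack), len(needle)
        for i in range(n - m + 1):
            if haystack[i:i+m] == needle:
                return True
        return False

Both operations follow this pattern; the gap in the timings comes from the per-offset work (find's cheap first-character check versus a full memcmp() at every position) plus the extra method lookup on the find side.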
> But I would also hope that it would be smart enough to know that it > doesn't need to look past the 2nd character in 'not the xyz' when it is > searching for 'not there' (due to the lengths of the sequences). note that the target string was "not the xyz"*100, so the search algorithm surely has to look past the second character ;-) (btw, the benchmark was taken from jim hugunin's ironpython talk, and seems to be carefully designed to kill performance also for more advanced algorithms -- including boyer-moore) From fredrik at pythonware.com Wed Feb 16 22:50:55 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Feb 16 22:50:53 2005 Subject: [Python-Dev] Re: string find(substring) vs. substring in string References: <200502162034.j1GKYGBU067236@chilled.skew.org><20050216205431.GA8873@rogue.amk.ca> Message-ID: Guido van Rossum wrote: > Which is exactly how s.find() wins this race. (I guess it loses when > it's found by having to do the "find" lookup.) Maybe string_contains > should just call string_find_internal()? I somehow suspected that "in" did some extra work in case the "find" failed; guess I should have looked at the code instead... I didn't really expect anyone to use a bad implementation of a brute-force algorithm (O(nm)) when the library already contained a reasonably good version of the same algorithm. > And then there's the question of how the re module gets to be faster > still; I suppose it doesn't bother with memcmp() at all. the benchmark cheats (a bit) -- it builds a state machine (KMP-style) in "compile", and uses that to search in O(n) time. that approach won't fly for "in" and find, of course, but it's definitely possible to make them run a lot faster than RE (i.e. O(n/m) for most cases)... but refactoring the contains code to use find_internal sounds like a good first step. any takers? From tim.peters at gmail.com Wed Feb 16 22:55:27 2005 From: tim.peters at gmail.com (Tim Peters) Date: Wed Feb 16 22:55:49 2005 Subject: [Python-Dev] 2.4 func.__name__ breakage Message-ID: <1f7befae05021613553afaaa2f@mail.gmail.com> Rev 2.66 of funcobject.c made func.__name__ writable for the first time. That's great, but the patch also introduced what I'm pretty sure was an unintended incompatibility: after 2.66, func.__name__ was no longer *readable* in restricted execution mode. I can't think of a good reason to restrict reading func.__name__, and it looks like this part of the change was an accident. So, unless someone objects soon, I intend to restore that func.__name__ is readable regardless of execution mode (but will continue to be unwritable in restricted execution mode). Objections? Tres Seaver filed a bug report (some Zope tests fail under 2.4 because of this): http://www.python.org/sf/1124295 From raymond.hettinger at verizon.net Wed Feb 16 23:06:54 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed Feb 16 23:11:46 2005 Subject: [Python-Dev] Re: string find(substring) vs. substring in string Message-ID: <000001c51473$df4717a0$8d2acb97@oemcomputer> > but refactoring the contains code to use find_internal sounds like a good > first step.? any takers? > > ? I'm up for it. ? Raymond Hettinger From fredrik at pythonware.com Wed Feb 16 23:10:40 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Feb 16 23:11:52 2005 Subject: [Python-Dev] Re: string find(substring) vs. 
substring in string References: <200502162034.j1GKYGBU067236@chilled.skew.org><20050216205431.GA8873@rogue.amk.ca> Message-ID: > memcmp still compiles to REP CMPB on many x86 compilers, and the setup > overhead for memcmp sucks on modern x86 hardware make that "compiles to REPE CMPSB" and "the setup overhead for REPE CMPSB" From Scott.Daniels at Acm.Org Wed Feb 16 23:00:54 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Wed Feb 16 23:12:18 2005 Subject: [Python-Dev] Re: string find(substring) vs. substring in string In-Reply-To: <4213B654.7070901@xs4all.nl> References: <200502162034.j1GKYGBU067236@chilled.skew.org> <4213B654.7070901@xs4all.nl> Message-ID: Irmen de Jong wrote: > There's the Boyer-Moore string search algorithm which is > allegedly much faster than a simplistic scanning approach, > and I also found this: http://portal.acm.org/citation.cfm?id=79184 > So perhaps there's room for improvement :) The problem is setup vs. run. If the question is 'ab in 'rabcd', Boyer-Moore and other fancy searches will be swamped with prep time. In Fred's comparison with re, he does the re.compile(...) outside of the timing loop. You need to decide what the common case is. The longer the thing you are searching in, the more one-time-only overhead you can afford to reduce the per-search-character cost. --Scott David Daniels Scott.Daniels@Acm.Org From gvanrossum at gmail.com Wed Feb 16 23:16:08 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Feb 16 23:16:11 2005 Subject: [Python-Dev] Re: string find(substring) vs. substring in string In-Reply-To: References: <200502162034.j1GKYGBU067236@chilled.skew.org> <4213B654.7070901@xs4all.nl> Message-ID: > The longer the thing you are searching in, the more one-time-only > overhead you can afford to reduce the per-search-character cost. Only if you don't find it close to the start. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From Scott.Daniels at Acm.Org Wed Feb 16 23:19:20 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Wed Feb 16 23:33:23 2005 Subject: [Python-Dev] Re: string find(substring) vs. substring in string In-Reply-To: References: <200502162034.j1GKYGBU067236@chilled.skew.org> Message-ID: Fredrik Lundh wrote: > (btw, the benchmark was taken from jim hugunin's ironpython talk, and > seems to be carefully designed to kill performance also for more advanced > algorithms -- including boyer-moore) Looking for "not there" in "not the xyz"*100 using Boyer-Moore should do about 300 probes once the table is set (the underscores below): not the xyznot the xyznot the xyz... not ther_ not the__ not ther_ not the__ not ther_ ... -- Scott David Daniels Scott.Daniels@Acm.Org From fredrik at pythonware.com Thu Feb 17 00:10:29 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu Feb 17 00:16:13 2005 Subject: [Python-Dev] Re: string find(substring) vs. substring in string References: <200502162034.j1GKYGBU067236@chilled.skew.org> Message-ID: Scott David Daniels wrote: > Looking for "not there" in "not the xyz"*100 using Boyer-Moore should do > about 300 probes once the table is set (the underscores below): > > not the xyznot the xyznot the xyz... > not ther_ > not the__ > not ther_ > not the__ > not ther_ > ... yup; it gets into a 9/2/9/2 rut. tweak the pattern a little, and you get better results for BM. ("kill" is of course an understatement, but BM usually works better. but it still needs a sizeof(alphabet) table, so you can pretty much forget about it if you want to support unicode...) 
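For readers who have not met Boyer-Moore-style searching, here is a rough Python sketch of the Horspool variant, a simplified cousin of the full Boyer-Moore that Scott traced above. It is illustrative only -- the function name is made up, it is not CPython code, and a real version would be C with a dense skip table.

    def horspool_find(text, pattern):
        # Compare right-to-left; on a mismatch, skip ahead based on the text
        # character aligned with the pattern's last position.  The dict plays
        # the role of the per-character skip table -- the sizeof(alphabet)
        # table mentioned above, which is what makes this awkward for Unicode.
        m, n = len(pattern), len(text)
        if m == 0:
            return 0
        skip = {}
        for i in range(m - 1):
            skip[pattern[i]] = m - 1 - i
        i = m - 1                         # text index aligned with pattern[-1]
        while i < n:
            j, k = m - 1, i
            while j >= 0 and text[k] == pattern[j]:
                j -= 1
                k -= 1
            if j < 0:
                return k + 1              # full match; return its start offset
            i += skip.get(text[i], m)     # characters not in the pattern skip m
        return -1

Nothing here is tuned; it exists only to make the skip-table argument concrete.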
From martin at v.loewis.de Thu Feb 17 00:42:05 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Feb 17 00:42:09 2005 Subject: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15% In-Reply-To: References: Message-ID: <4213DA4D.8090502@v.loewis.de> Gfeller Martin wrote: > Nevertheless, I tried to convert the heap used by Python > to a Windows Low Fragmentation Heap (available on XP > and 2003 Server). This improved the overall run time > of a typical CPU-intensive report by about 15% > (overall run time is in the 5 minutes range), with the > same memory consumption. I must admit that I'm surprised. I would have expected that most allocations in Python go through obmalloc, so the heap would only see "large" allocations. It would be interesting to find out, in your application, why it is still an improvement to use the low-fragmentation heaps. Regards, Martin From allison at sumeru.stanford.EDU Thu Feb 17 01:06:24 2005 From: allison at sumeru.stanford.EDU (Dennis Allison) Date: Thu Feb 17 01:06:31 2005 Subject: [Python-Dev] string find(substring) vs. substring in string In-Reply-To: <4213B654.7070901@xs4all.nl> Message-ID: Boyer-Moore and variants need a bit of preprocessing on the pattern which makes them great for long patterns but more costly for short ones. On Wed, 16 Feb 2005, Irmen de Jong wrote: > Mike Brown wrote: > > Fredrik Lundh wrote: > > > >>any special reason why "in" is faster if the substring is found, but > >>a lot slower if it's not in there? > > > > > > Just guessing here, but in general I would think that it would stop searching > > as soon as it found it, whereas until then, it keeps looking, which takes more > > time. But I would also hope that it would be smart enough to know that it > > doesn't need to look past the 2nd character in 'not the xyz' when it is > > searching for 'not there' (due to the lengths of the sequences). > > There's the Boyer-Moore string search algorithm which is > allegedly much faster than a simplistic scanning approach, > and I also found this: http://portal.acm.org/citation.cfm?id=79184 > So perhaps there's room for improvement :) > > --Irmen > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/allison%40sumeru.stanford.edu > From ejones at uwaterloo.ca Thu Feb 17 02:26:16 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Thu Feb 17 02:26:22 2005 Subject: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15% In-Reply-To: <4213DA4D.8090502@v.loewis.de> References: <4213DA4D.8090502@v.loewis.de> Message-ID: On Feb 16, 2005, at 18:42, Martin v. L?wis wrote: > I must admit that I'm surprised. I would have expected > that most allocations in Python go through obmalloc, so > the heap would only see "large" allocations. > > It would be interesting to find out, in your application, > why it is still an improvement to use the low-fragmentation > heaps. Hmm... This is an excellent point. A grep through the Python source code shows that the following files call the native system malloc (I've excluded a few obviously platform specific files). A quick visual inspection shows that most of these are using it to allocate some sort of array or string, so it likely *should* go through the system malloc. Gfeller, any idea if you are using any of the modules on this list? 
If so, it would be pretty easy to try converting them to call the obmalloc functions instead, and see how that affects the performance. Evan Jones Demo/pysvr/pysvr.c Modules/_bsddb.c Modules/_curses_panel.c Modules/_cursesmodule.c Modules/_hotshot.c Modules/_sre.c Modules/audioop.c Modules/bsddbmodule.c Modules/cPickle.c Modules/cStringIO.c Modules/getaddrinfo.c Modules/main.c Modules/pyexpat.c Modules/readline.c Modules/regexpr.c Modules/rgbimgmodule.c Modules/svmodule.c Modules/timemodule.c Modules/zlibmodule.c PC/getpathp.c Python/strdup.c Python/thread.c From greg.ewing at canterbury.ac.nz Thu Feb 17 03:27:09 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu Feb 17 03:27:24 2005 Subject: [Python-Dev] builtin_id() returns negative numbers In-Reply-To: <000701c512e5$7de81660$af0189c3@oemcomputer> References: <000701c512e5$7de81660$af0189c3@oemcomputer> Message-ID: <421400FD.8090303@canterbury.ac.nz> Richard Brodie wrote: > > Otherwise, unless I misunderstand integer unification, one would > just have to strike the distinction between, say, %d and %u. Couldn't that be done anyway? The distinction really only makes sense in C, where there's no way of knowing whether the value is signed or unsigned otherwise. In Python the value itself knows whether it's signed or not. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From gvanrossum at gmail.com Thu Feb 17 07:22:40 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Feb 17 07:22:43 2005 Subject: [Python-Dev] builtin_id() returns negative numbers In-Reply-To: <421400FD.8090303@canterbury.ac.nz> References: <000701c512e5$7de81660$af0189c3@oemcomputer> <421400FD.8090303@canterbury.ac.nz> Message-ID: > > Otherwise, unless I misunderstand integer unification, one would > > just have to strike the distinction between, say, %d and %u. > > Couldn't that be done anyway? The distinction really only > makes sense in C, where there's no way of knowing whether > the value is signed or unsigned otherwise. In Python the > value itself knows whether it's signed or not. The time machine is at your service: in Python 2.4 there's no difference. That's integer unification for you! -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at electricrain.com Thu Feb 17 07:53:30 2005 From: greg at electricrain.com (Gregory P. Smith) Date: Thu Feb 17 07:53:51 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <1108340374.3768.33.camel@schizo> References: <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> <20050212133721.GA13429@rogue.amk.ca> <20050212210402.GE25441@zot.electricrain.com> <1108340374.3768.33.camel@schizo> Message-ID: <20050217065330.GP25441@zot.electricrain.com> fyi - i've updated the python sha1/md5 openssl patch. it now replaces the entire sha and md5 modules with a generic hashes module that gives access to all of the hash algorithms supported by OpenSSL (including appropriate legacy interface wrappers and falling back to the old code when compiled without openssl). 
https://sourceforge.net/tracker/index.php?func=detail&aid=1121611&group_id=5470&atid=305470 I don't quite like the module name 'hashes' that i chose for the generic interface (too close to the builtin hash() function). Other suggestions on a module name? 'digest' comes to mind. -greg From fredrik at pythonware.com Thu Feb 17 10:12:19 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu Feb 17 10:12:16 2005 Subject: [Python-Dev] Re: license issues with profiler.py and md5.h/md5c.c References: <1108090248.3753.53.camel@schizo><226e9c65e562f9b0439333053036fef3@redivi.com><1108102539.3753.87.camel@schizo><20050211175118.GC25441@zot.electricrain.com><00c701c5108e$f3d0b930$24ed0ccb@apana.org.au><5d300838ef9716aeaae53579ab1f7733@redivi.com><013501c510ae$2abd7360$24ed0ccb@apana.org.au><20050212133721.GA13429@rogue.amk.ca><20050212210402.GE25441@zot.electricrain.com><1108340374.3768.33.camel@schizo> <20050217065330.GP25441@zot.electricrain.com> Message-ID: "Gregory P. Smith" wrote: > I don't quite like the module name 'hashes' that i chose for the > generic interface (too close to the builtin hash() function). Other > suggestions on a module name? 'digest' comes to mind. hashtools, hashlib, and _hash are common names for helper modules like this. (you still provide md5 and sha wrappers, I hope) From mwh at python.net Thu Feb 17 11:51:35 2005 From: mwh at python.net (Michael Hudson) Date: Thu Feb 17 11:51:37 2005 Subject: [Python-Dev] 2.4 func.__name__ breakage In-Reply-To: <1f7befae05021613553afaaa2f@mail.gmail.com> (Tim Peters's message of "Wed, 16 Feb 2005 16:55:27 -0500") References: <1f7befae05021613553afaaa2f@mail.gmail.com> Message-ID: <2mzmy3zcg8.fsf@starship.python.net> Tim Peters writes: > Rev 2.66 of funcobject.c made func.__name__ writable for the first > time. That's great, but the patch also introduced what I'm pretty > sure was an unintended incompatibility: after 2.66, func.__name__ was > no longer *readable* in restricted execution mode. Yeah, my bad. > I can't think of a good reason to restrict reading func.__name__, > and it looks like this part of the change was an accident. So, > unless someone objects soon, I intend to restore that func.__name__ > is readable regardless of execution mode (but will continue to be > unwritable in restricted execution mode). > > Objections? Well, I fixed it on reading the bug report and before getting to python-dev mail :) Sorry if this duplicated your work, but hey, it was only a two line change... Cheers, mwh -- The only problem with Microsoft is they just have no taste. -- Steve Jobs, (From _Triumph of the Nerds_ PBS special) and quoted by Aahz on comp.lang.python From astrand at lysator.liu.se Thu Feb 17 13:22:03 2005 From: astrand at lysator.liu.se (Peter Astrand) Date: Thu Feb 17 13:22:14 2005 Subject: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd) Message-ID: I'd like to have your opinion on this bug. Personally, I'd prefer to keep test_no_leaking as it is, but if you think otherwise... One thing that actually can motivate that test_subprocess takes 20% of the overall time is that this test is a good generic Python stress test - this test might catch some other startup race condition, for example. 
Regards, ?strand ---------- Forwarded message ---------- Date: Thu, 17 Feb 2005 04:09:33 -0800 From: SourceForge.net To: noreply@sourceforge.net Subject: [ python-Bugs-1124637 ] test_subprocess is far too slow Bugs item #1124637, was opened at 2005-02-17 11:10 Message generated for change (Comment added) made by mwh You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1124637&group_id=5470 Category: Python Library Group: Python 2.4 Status: Open Resolution: None Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Peter ?strand (astrand) Summary: test_subprocess is far too slow Initial Comment: test_subprocess takes multiple minutes. I'm pretty sure it's "test_no_leaking". It should either be sped up or only tested when some -u argument is passed to regrtest. ---------------------------------------------------------------------- >Comment By: Michael Hudson (mwh) Date: 2005-02-17 12:09 Message: Logged In: YES user_id=6656 Bog standard linux pc -- p3 933, 384 megs of ram. "$ time ./python ../Lib/test/regrtest.py test_subprocess" reports 2 minutes 7. This is a debug build, a release build might be quicker. A run of the entire test suite takes a hair over nine minutes, so 20-odd % of the time seems to be test_subprocess. It also takes ages on my old-ish ibook (600 Mhz G3, also 384 megs of ram), but that's at home and I can't time it. ---------------------------------------------------------------------- Comment By: Peter ?strand (astrand) Date: 2005-02-17 11:50 Message: Logged In: YES user_id=344921 Tell me a bit about your type of OS and hardware. On my machine (P4 2.66 GHz with Linux), the test takes 28 seconds. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1124637&group_id=5470 From ncoghlan at iinet.net.au Thu Feb 17 15:15:46 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Thu Feb 17 15:15:50 2005 Subject: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd) In-Reply-To: References: Message-ID: <4214A712.6090107@iinet.net.au> Peter Astrand wrote: > I'd like to have your opinion on this bug. Personally, I'd prefer to keep > test_no_leaking as it is, but if you think otherwise... > > One thing that actually can motivate that test_subprocess takes 20% of the > overall time is that this test is a good generic Python stress test - this > test might catch some other startup race condition, for example. test_decimal has a short version which tests basic functionality and always runs, but enabling -udecimal also runs the specification tests (which take a fair bit longer). So keeping the basic subprocess tests unconditional, and running the long ones only if -uall or -usubprocess are given would seem reasonable. Cheers, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From fredrik at pythonware.com Thu Feb 17 15:19:24 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu Feb 17 15:19:58 2005 Subject: [Python-Dev] Re: [ python-Bugs-1124637 ] test_subprocess is far tooslow (fwd) References: <4214A712.6090107@iinet.net.au> Message-ID: Nick Coghlan wrote: >> One thing that actually can motivate that test_subprocess takes 20% of the >> overall time is that this test is a good generic Python stress test - this >> test might catch some other startup race condition, for example. 
> > test_decimal has a short version which tests basic functionality and always runs, but > enabling -udecimal also runs the specification tests (which take a fair bit longer). > > So keeping the basic subprocess tests unconditional, and running the long ones only if -uall > or -usubprocess are given would seem reasonable. does anyone ever use the -u options when running tests? From mwh at python.net Thu Feb 17 15:30:06 2005 From: mwh at python.net (Michael Hudson) Date: Thu Feb 17 15:30:41 2005 Subject: [Python-Dev] Re: [ python-Bugs-1124637 ] test_subprocess is far tooslow (fwd) In-Reply-To: (Fredrik Lundh's message of "Thu, 17 Feb 2005 15:19:24 +0100") References: <4214A712.6090107@iinet.net.au> Message-ID: <2mll9nz2c1.fsf@starship.python.net> "Fredrik Lundh" writes: > Nick Coghlan wrote: > >>> One thing that actually can motivate that test_subprocess takes 20% of the >>> overall time is that this test is a good generic Python stress test - this >>> test might catch some other startup race condition, for example. >> >> test_decimal has a short version which tests basic functionality and always runs, but >> enabling -udecimal also runs the specification tests (which take a fair bit longer). >> >> So keeping the basic subprocess tests unconditional, and running the long ones only if -uall >> or -usubprocess are given would seem reasonable. > > does anyone ever use the -u options when running tests? Yes, occasionally. Esp. with test_compiler a testall run is an overnight job but I try to do it every now and again. Cheers, mwh -- If design space weren't so vast, and the good solutions so small a portion of it, programming would be a lot easier. -- maney, comp.lang.python From tim.peters at gmail.com Thu Feb 17 15:43:20 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Feb 17 15:43:55 2005 Subject: [Python-Dev] 2.4 func.__name__ breakage In-Reply-To: <2mzmy3zcg8.fsf@starship.python.net> References: <1f7befae05021613553afaaa2f@mail.gmail.com> <2mzmy3zcg8.fsf@starship.python.net> Message-ID: <1f7befae050217064337532915@mail.gmail.com> [Michael Hudson] > ... > Well, I fixed it on reading the bug report and before getting to > python-dev mail :) Sorry if this duplicated your work, but hey, it was > only a two line change... Na, the real work was tracking it down in the bowels of Zope's C-coded security machinery -- we'll let you do that part next time . Did you add a test to ensure this remains fixed? A NEWS blurb (at least for 2.4.1 -- the test failures under 2.4 are very visible in the Zope world, due to auto-generated test runner failure reports)? From tim.peters at gmail.com Thu Feb 17 15:43:20 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Feb 17 15:45:14 2005 Subject: [Python-Dev] 2.4 func.__name__ breakage In-Reply-To: <2mzmy3zcg8.fsf@starship.python.net> References: <1f7befae05021613553afaaa2f@mail.gmail.com> <2mzmy3zcg8.fsf@starship.python.net> Message-ID: <1f7befae050217064337532915@mail.gmail.com> [Michael Hudson] > ... > Well, I fixed it on reading the bug report and before getting to > python-dev mail :) Sorry if this duplicated your work, but hey, it was > only a two line change... Na, the real work was tracking it down in the bowels of Zope's C-coded security machinery -- we'll let you do that part next time . Did you add a test to ensure this remains fixed? A NEWS blurb (at least for 2.4.1 -- the test failures under 2.4 are very visible in the Zope world, due to auto-generated test runner failure reports)? 
From tim.peters at gmail.com Thu Feb 17 15:56:14 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Feb 17 15:56:16 2005 Subject: [Python-Dev] 2.4 func.__name__ breakage In-Reply-To: <2mzmy3zcg8.fsf@starship.python.net> References: <1f7befae05021613553afaaa2f@mail.gmail.com> <2mzmy3zcg8.fsf@starship.python.net> Message-ID: <1f7befae05021706564914b901@mail.gmail.com> [Michael Hudson] > ... > Well, I fixed it on reading the bug report and before getting to > python-dev mail :) Sorry if this duplicated your work, but hey, it was > only a two line change... Na, the real work was tracking it down in the bowels of Zope's C-coded security machinery -- we'll let you do that part next time . Did you add a test to ensure this remains fixed? A NEWS blurb (at least for 2.4.1 -- the test failures under 2.4 are visible in the Zope world, due to auto-generated test runner failure reports; alas, this is in a new test, and 2.4 worked fine with the Zope tests as they were when 2.4 was released)? From mwh at python.net Thu Feb 17 15:55:23 2005 From: mwh at python.net (Michael Hudson) Date: Thu Feb 17 16:15:42 2005 Subject: [Python-Dev] 2.4 func.__name__ breakage In-Reply-To: <1f7befae050217064337532915@mail.gmail.com> (Tim Peters's message of "Thu, 17 Feb 2005 09:43:20 -0500") References: <1f7befae05021613553afaaa2f@mail.gmail.com> <2mzmy3zcg8.fsf@starship.python.net> <1f7befae050217064337532915@mail.gmail.com> Message-ID: <2mfyzvz15w.fsf@starship.python.net> Tim Peters writes: > [Michael Hudson] >> ... >> Well, I fixed it on reading the bug report and before getting to >> python-dev mail :) Sorry if this duplicated your work, but hey, it was >> only a two line change... > > Na, the real work was tracking it down in the bowels of Zope's C-coded > security machinery -- we'll let you do that part next time . > > Did you add a test to ensure this remains fixed? Yup. > A NEWS blurb (at least for 2.4.1 -- the test failures under 2.4 are > very visible in the Zope world, due to auto-generated test runner > failure reports)? No, I'll do that now. I'm not very good at remembering NEWS blurbs... Cheers, mwh -- 6. The code definitely is not portable - it will produce incorrect results if run from the surface of Mars. -- James Bonfield, http://www.ioccc.org/2000/rince.hint From tim.peters at gmail.com Thu Feb 17 16:17:22 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Feb 17 16:17:27 2005 Subject: [Python-Dev] Re: [ python-Bugs-1124637 ] test_subprocess is far tooslow (fwd) In-Reply-To: References: <4214A712.6090107@iinet.net.au> Message-ID: <1f7befae05021707171476f540@mail.gmail.com> [Fredrik Lundh] > does anyone ever use the -u options when running tests? Yes -- I routinely do -uall, under both release and debug builds, but only on Windows. WinXP in particular seems to do a good job when hyper-threading is available -- running the tests doesn't slow down anything else I'm doing, except during the disk-intensive tests (test_largefile is a major pig on Windows). From anthony at interlink.com.au Thu Feb 17 16:24:35 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu Feb 17 16:25:11 2005 Subject: [Python-Dev] Re: [ python-Bugs-1124637 ] test_subprocess is far tooslow (fwd) In-Reply-To: References: <4214A712.6090107@iinet.net.au> Message-ID: <200502180224.36851.anthony@interlink.com.au> On Friday 18 February 2005 01:19, Fredrik Lundh wrote: > > does anyone ever use the -u options when running tests? 
I use "make testall" (which invokes with -uall) regularly, and turn on specific options when they're testing something I'm working with. -- Anthony Baxter It's never too late to have a happy childhood. From tim.peters at gmail.com Thu Feb 17 16:25:50 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Feb 17 16:25:53 2005 Subject: [Python-Dev] 2.4 func.__name__ breakage In-Reply-To: <2mfyzvz15w.fsf@starship.python.net> References: <1f7befae05021613553afaaa2f@mail.gmail.com> <2mzmy3zcg8.fsf@starship.python.net> <1f7befae050217064337532915@mail.gmail.com> <2mfyzvz15w.fsf@starship.python.net> Message-ID: <1f7befae05021707252136573e@mail.gmail.com> [sorry for the near-duplicate msgs -- looks like gmail lied when it claimed the first msg was still in "draft" status] >> Did you add a test to ensure this remains fixed? [mwh] > Yup. Bless you. Did you attach a contributor agreement and mark the test as being contributed under said contributor agreement, adjacent to your valid copyright notice ? >> A NEWS blurb ...? > No, I'll do that now. I'm not very good at remembering NEWS blurbs... LOL -- sorry, I'm just imagining what NEWS would look like if we required a contributor-agreement notification on each blurb. I appreciate your work here, and will try to find a drug to counteract the ones I appear to have overdosed on this morning ... From mwh at python.net Thu Feb 17 16:29:12 2005 From: mwh at python.net (Michael Hudson) Date: Thu Feb 17 16:29:14 2005 Subject: [Python-Dev] 2.4 func.__name__ breakage In-Reply-To: <1f7befae05021707252136573e@mail.gmail.com> (Tim Peters's message of "Thu, 17 Feb 2005 10:25:50 -0500") References: <1f7befae05021613553afaaa2f@mail.gmail.com> <2mzmy3zcg8.fsf@starship.python.net> <1f7befae050217064337532915@mail.gmail.com> <2mfyzvz15w.fsf@starship.python.net> <1f7befae05021707252136573e@mail.gmail.com> Message-ID: <2m8y5nyzlj.fsf@starship.python.net> Tim Peters writes: > [sorry for the near-duplicate msgs -- looks like gmail lied when it claimed the > first msg was still in "draft" status] > >>> Did you add a test to ensure this remains fixed? > > [mwh] >> Yup. > > Bless you. Did you attach a contributor agreement and mark the test > as being contributed under said contributor agreement, adjacent to > your valid copyright notice ? Fortunately 2 lines < 25 lines, so I think I'm safe on this one :) Cheers, mwh -- glyph: I don't know anything about reality. -- from Twisted.Quotes From gvanrossum at gmail.com Thu Feb 17 16:30:58 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Feb 17 16:31:00 2005 Subject: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd) In-Reply-To: References: Message-ID: > I'd like to have your opinion on this bug. Personally, I'd prefer to keep > test_no_leaking as it is, but if you think otherwise... > > One thing that actually can motivate that test_subprocess takes 20% of the > overall time is that this test is a good generic Python stress test - this > test might catch some other startup race condition, for example. A suite of unit tests is a precious thing. We want to test as much as we can, and as thoroughly as possible; but at the same time we want the test to run reasonably fast. If the test takes too long, human nature being what it is, this will actually cause less thorough testing because developers don't feel like running the test suite after each small change, and then we get frequent problems where someone breaks the build because they couldn't wait to run the unit test. 
(For example, where I work we have a Java test suite that takes 25 minutes to run. The build is broken on a daily basis by developers (including me) who make a small change and check it in believing it won't break anything.) The Python test suite already has a way (the -u flag) to distinguish between "regular" broad-coverage testing and deep coverage for specific (or all) areas. Let's keep the really long-running tests out of the regular test suite. There used to be a farm of machines that did nothing but run the test suite ("snake-farm"). This seems to have stopped (it was run by volunteers at a Swedish university). Maybe we should revive such an effort, and make sure it runs with -u all. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From astrand at lysator.liu.se Thu Feb 17 16:52:12 2005 From: astrand at lysator.liu.se (Peter Astrand) Date: Thu Feb 17 16:52:24 2005 Subject: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd) In-Reply-To: References: Message-ID: On Thu, 17 Feb 2005, Guido van Rossum wrote: > > I'd like to have your opinion on this bug. Personally, I'd prefer to keep > > test_no_leaking as it is, but if you think otherwise... > A suite of unit tests is a precious thing. We want to test as much as > we can, and as thoroughly as possible; but at the same time we want > the test to run reasonably fast. If the test takes too long, human > nature being what it is, this will actually cause less thorough > testing because developers don't feel like running the test suite > after each small change, and then we get frequent problems where Good point. > The Python test suite already has a way (the -u flag) to distinguish > between "regular" broad-coverage testing and deep coverage for > specific (or all) areas. Let's keep the really long-running tests out > of the regular test suite. I'm convinced. Is this easy to implement? Anyone interested in doing this? > There used to be a farm of machines that did nothing but run the test > suite ("snake-farm"). This seems to have stopped (it was run by > volunteers at a Swedish university). Maybe we should revive such an > effort, and make sure it runs with -u all. Yes, Snake Farm is/was a project at "Lysator", an academic computer society located at Linkoping University. As you can tell from my mail address, I'm a member as well. I haven't been involved in the Snake Farm project, though. /Peter ?strand From python at rcn.com Thu Feb 17 17:02:54 2005 From: python at rcn.com (Raymond Hettinger) Date: Thu Feb 17 17:06:54 2005 Subject: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far tooslow (fwd) In-Reply-To: Message-ID: <002301c5150a$24760de0$3bbd2c81@oemcomputer> > Let's keep the really long-running tests out > of the regular test suite. For test_subprocess, consider adopting the technique used by test_decimal. When -u decimal is not specified, a small random selection of the resource intensive tests are run. That way, all of the tests eventually get run even if no one is routinely using -u all. Raymond From skip at pobox.com Thu Feb 17 17:19:35 2005 From: skip at pobox.com (Skip Montanaro) Date: Thu Feb 17 17:17:40 2005 Subject: [Python-Dev] Five review rule on the /dev/ page? Message-ID: <16916.50199.723442.36695@montanaro.dyndns.org> I am frantically trying to get ready to be out of town for a week of vacation. Someone sent me some patches for datetime and asked me to look at them. I begged off but referred him to http://www.python.org/dev/ and made mention of the five patch review idea. 
Can someone make sure that's explained on the /dev/ site? Thx, Skip From walter at livinglogic.de Thu Feb 17 17:22:25 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu Feb 17 17:22:28 2005 Subject: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd) In-Reply-To: References: Message-ID: <4214C4C1.5070309@livinglogic.de> Guido van Rossum wrote: > [...] > There used to be a farm of machines that did nothing but run the test > suite ("snake-farm"). This seems to have stopped (it was run by > volunteers at a Swedish university). Maybe we should revive such an > effort, and make sure it runs with -u all. I've changed the job that produces the data for http://coverage.livinglogic.de/ to run python Lib/test/regrtest.py -uall -T -N Unfortunately this job currently produces only coverage info, the output of the test suite is thrown away. It should be easy to fix this, so that the output gets put into the database. Bye, Walter D?rwald From mwh at python.net Thu Feb 17 18:11:19 2005 From: mwh at python.net (Michael Hudson) Date: Thu Feb 17 18:11:22 2005 Subject: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far tooslow (fwd) In-Reply-To: <002301c5150a$24760de0$3bbd2c81@oemcomputer> (Raymond Hettinger's message of "Thu, 17 Feb 2005 11:02:54 -0500") References: <002301c5150a$24760de0$3bbd2c81@oemcomputer> Message-ID: <2m3bvvyuvc.fsf@starship.python.net> "Raymond Hettinger" writes: >> Let's keep the really long-running tests out >> of the regular test suite. > > For test_subprocess, consider adopting the technique used by > test_decimal. When -u decimal is not specified, a small random > selection of the resource intensive tests are run. That way, all of the > tests eventually get run even if no one is routinely using -u all. I do like this strategy but I don't think it applies to this test -- it has to try to create more than 'ulimit -n' processes, if I understand it correctly. Which makes me think there might be other ways to write the test if the resource module is available... Cheers, mwh -- 34. The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information. -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From tim.peters at gmail.com Thu Feb 17 18:26:36 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Feb 17 18:26:40 2005 Subject: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far tooslow (fwd) In-Reply-To: <2m3bvvyuvc.fsf@starship.python.net> References: <002301c5150a$24760de0$3bbd2c81@oemcomputer> <2m3bvvyuvc.fsf@starship.python.net> Message-ID: <1f7befae05021709266fbc542d@mail.gmail.com> [Raymond Hettinger] >> For test_subprocess, consider adopting the technique used by >> test_decimal. When -u decimal is not specified, a small random >> selection of the resource intensive tests are run. That way, all of the >> tests eventually get run even if no one is routinely using -u all. [Michael Hudson] > I do like this strategy but I don't think it applies to this test -- > it has to try to create more than 'ulimit -n' processes, if I > understand it correctly. Which makes me think there might be other > ways to write the test if the resource module is available... Aha! That explains why test_subprocess runs so much faster on Windows despite that Windows process-creation time is measured in geological eras: test_no_leaking special-cases Windows to do only 65 iterations instead of 1026. 
It's easy to put that under control of a -u option instead; e.g., instead of max_handles = 1026 if mswindows: max_handles = 65 just use 1026 all the time, and stuff, e.g., if not test_support.is_resource_enabled("subprocess"): return at the start of test_no_leaking(). From aahz at pythoncraft.com Thu Feb 17 18:33:46 2005 From: aahz at pythoncraft.com (Aahz) Date: Thu Feb 17 18:33:50 2005 Subject: [Python-Dev] Five review rule on the /dev/ page? In-Reply-To: <16916.50199.723442.36695@montanaro.dyndns.org> References: <16916.50199.723442.36695@montanaro.dyndns.org> Message-ID: <20050217173346.GB18117@panix.com> On Thu, Feb 17, 2005, Skip Montanaro wrote: > > I am frantically trying to get ready to be out of town for a > week of vacation. Someone sent me some patches for datetime > and asked me to look at them. I begged off but referred him to > http://www.python.org/dev/ and made mention of the five patch review > idea. Can someone make sure that's explained on the /dev/ site? This should go into Brett's survey of the Python dev process, not as official documentation. It's simply an offer made by some of the prominent members of python-dev. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR From arigo at tunes.org Thu Feb 17 19:11:19 2005 From: arigo at tunes.org (Armin Rigo) Date: Thu Feb 17 19:14:50 2005 Subject: [Python-Dev] builtin_id() returns negative numbers In-Reply-To: <1f7befae050214074122b715a@mail.gmail.com> References: <4210AFAA.9060108@thule.no> <1f7befae050214074122b715a@mail.gmail.com> Message-ID: <20050217181119.GA3055@vicky.ecs.soton.ac.uk> Hi Tim, On Mon, Feb 14, 2005 at 10:41:35AM -0500, Tim Peters wrote: > # This is a puzzle: there's no way to know the natural width of > # addresses on this box (in particular, there's no necessary > # relation to sys.maxint). Isn't this natural width nowadays available as: 256 ** struct.calcsize('P') ? Armin From tim.peters at gmail.com Thu Feb 17 19:44:11 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Feb 17 19:44:16 2005 Subject: [Python-Dev] builtin_id() returns negative numbers In-Reply-To: <20050217181119.GA3055@vicky.ecs.soton.ac.uk> References: <4210AFAA.9060108@thule.no> <1f7befae050214074122b715a@mail.gmail.com> <20050217181119.GA3055@vicky.ecs.soton.ac.uk> Message-ID: <1f7befae050217104431312214@mail.gmail.com> [Tim Peters] >> # This is a puzzle: there's no way to know the natural width of >> # addresses on this box (in particular, there's no necessary >> # relation to sys.maxint). [Armin Rigo] > Isn't this natural width nowadays available as: > > 256 ** struct.calcsize('P') > > ? Looks right to me -- cool! I never used struct's 'P' format because it always appeared useless to me: even if I could ship pointers across processes or boxes, there's not much I could do with them after getting integers back from unpack(). But silly me! I'm sure Guido put it there anticipating the need for calcsize('P') when making a positive_id() function in Python. Now if you'll just sign and fax a Zope contributor agreement, I'll upgrade ZODB to use this slick trick . From fredrik at pythonware.com Thu Feb 17 21:21:38 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu Feb 17 21:21:43 2005 Subject: [Python-Dev] Re: Re: string find(substring) vs. 
substring in string References: <000001c51473$df4717a0$8d2acb97@oemcomputer> Message-ID: Raymond Hettinger wrote: > > but refactoring the contains code to use find_internal sounds like a good > > first step. any takers? > > I'm up for it. excellent! just fyi, unless my benchmark is mistaken, the Unicode implementation has the same problem: str in -> 25.8 µsec per loop unicode in -> 26.8 µsec per loop str.find() -> 6.73 µsec per loop unicode.find() -> 7.24 µsec per loop oddly enough, if I change the target string so it doesn't contain any partial matches at all, unicode.find() wins the race: str in -> 24.5 µsec per loop unicode in -> 24.6 µsec per loop str.find() -> 2.86 µsec per loop unicode.find() -> 2.16 µsec per loop From bac at OCF.Berkeley.EDU Thu Feb 17 21:22:29 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Feb 17 21:22:44 2005 Subject: [Python-Dev] Five review rule on the /dev/ page? In-Reply-To: <20050217173346.GB18117@panix.com> References: <16916.50199.723442.36695@montanaro.dyndns.org> <20050217173346.GB18117@panix.com> Message-ID: <4214FD05.7020203@ocf.berkeley.edu> [removed pydotorg from people receiving this email] Aahz wrote: > On Thu, Feb 17, 2005, Skip Montanaro wrote: > >>I am frantically trying to get ready to be out of town for a >>week of vacation. Someone sent me some patches for datetime >>and asked me to look at them. I begged off but referred him to >>http://www.python.org/dev/ and made mention of the five patch review >>idea. Can someone make sure that's explained on the /dev/ site? > > > This should go into Brett's survey of the Python dev process, not as > official documentation. It's simply an offer made by some of the > prominent members of python-dev. I am planning on adding that blurb in there. Actually, while I have everyone's attention, I might as well throw an idea out there about sprucing up yet again the docs on contributing. I was thinking of taking the current dev intro and have it just explain how things basically work around here. So the doc would become more of just a high-level overview of how we dev the language. But I would cut out the helping out section and spin that into another doc that would go into some more detail on how to make a contribution. So this would specify in more detail how to report a bug, how to comment on one, etc. (same goes for patches). This is where I would stick the 5-for-1 deal. Lastly, write up a doc that covers what one with CVS checkin rights needs to do when checking in code. So how one goes about getting checkin rights, getting initial checkins OK'ed by others, and then the usual steps taken for a checkin. Sound worth it to people? Not really needed so go back and do your homework, Brett? What? -Brett From Jack.Jansen at cwi.nl Thu Feb 17 21:46:03 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Thu Feb 17 21:46:03 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc/lib libimp.tex, 1.36, 1.36.2.1 libsite.tex, 1.26, 1.26.4.1 libtempfile.tex, 1.22, 1.22.4.1 libos.tex, 1.146.2.1, 1.146.2.2 In-Reply-To: References: Message-ID: On 14-feb-05, at 10:23, Just van Rossum wrote: > bcannon@users.sourceforge.net wrote: > >> \begin{datadesc}{PY_RESOURCE} >> -The module was found as a Macintosh resource. This value can only be >> -returned on a Macintosh. >> +The module was found as a Mac OS 9 resource. This value can only be >> +returned on a Mac OS 9 or earlier Macintosh. 
>> \end{datadesc} > > not entirely true: it's limited to the sa called "OS9" version of > MacPython, which happily runs natively on OSX as a Carbon app... But as of 2.4 there's no such thing as MacPython-OS9 any more. But as the constant is still in there I thought it best to document it. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From walter at livinglogic.de Thu Feb 17 23:22:20 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Thu Feb 17 23:22:22 2005 Subject: [Python-Dev] Negative indices in UserString.MutableString Message-ID: <1543.84.56.105.228.1108678940.squirrel@isar.livinglogic.de> Currently UserString.MutableString does not support negative indices: >>> import UserString >>> UserString.MutableString("foo")[-1] = "bar" Traceback (most recent call last): File "", line 1, in ? File "/home/Python-test/dist/src/Lib/UserString.py", line 149, in __setitem__ if index < 0 or index >= len(self.data): raise IndexError IndexError Should this be fixed so that negative value are treated as being relative to the end? Bye, Walter D?rwald From aahz at pythoncraft.com Thu Feb 17 23:23:36 2005 From: aahz at pythoncraft.com (Aahz) Date: Thu Feb 17 23:23:37 2005 Subject: [Python-Dev] Negative indices in UserString.MutableString In-Reply-To: <1543.84.56.105.228.1108678940.squirrel@isar.livinglogic.de> References: <1543.84.56.105.228.1108678940.squirrel@isar.livinglogic.de> Message-ID: <20050217222336.GA18285@panix.com> On Thu, Feb 17, 2005, Walter D?rwald wrote: > > Currently UserString.MutableString does not support negative indices: > > >>> import UserString > >>> UserString.MutableString("foo")[-1] = "bar" > Traceback (most recent call last): > File "", line 1, in ? > File "/home/Python-test/dist/src/Lib/UserString.py", line 149, in __setitem__ > if index < 0 or index >= len(self.data): raise IndexError > IndexError > > Should this be fixed so that negative value are treated as being > relative to the end? Yup! As usual, patches welcome. (Yes, I'm comfortable channeling Guido here.) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR From greg.ewing at canterbury.ac.nz Fri Feb 18 02:58:46 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Feb 18 02:59:05 2005 Subject: [Python-Dev] builtin_id() returns negative numbers In-Reply-To: <1f7befae050217104431312214@mail.gmail.com> References: <4210AFAA.9060108@thule.no> <1f7befae050214074122b715a@mail.gmail.com> <20050217181119.GA3055@vicky.ecs.soton.ac.uk> <1f7befae050217104431312214@mail.gmail.com> Message-ID: <42154BD6.4030001@canterbury.ac.nz> Tim Peters wrote: > Looks right to me -- cool! I never used struct's 'P' format because > it always appeared useless to me: But silly me! I'm sure Guido > put it there anticipating the need for calcsize('P') when making a > positive_id() function in Python. Smells like more time machine activity to me. Any minute now you'll find there's suddenly a positive_id() builtin that's been there ever since 1.3 or so. And the 'P' format, then always never having just become useful, will have unappeared... 
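For reference, a minimal sketch of the positive_id() helper being joked about here, built on Armin's 256 ** struct.calcsize('P') trick; the name is Tim's, not an actual builtin.

    import struct

    # Number of distinct values a native pointer can take, per Armin's trick.
    _ADDRESS_SPACE = 256 ** struct.calcsize('P')

    def positive_id(obj):
        """Return id(obj) as a non-negative integer (sketch only)."""
        result = id(obj)
        if result < 0:
            # id() can come back negative when the address has the sign bit
            # set; shift it into the unsigned range.
            result += _ADDRESS_SPACE
        return result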
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From nick at ilm.com Thu Feb 17 00:56:24 2005 From: nick at ilm.com (Nick Rasmussen) Date: Fri Feb 18 03:04:18 2005 Subject: [Python-Dev] subclassing PyCFunction_Type In-Reply-To: <5614e00fb134b968fa76a1896c456f4a@redivi.com> References: <5.1.1.6.0.20050216110025.02fb7e70@mail.telecommunity.com> <5614e00fb134b968fa76a1896c456f4a@redivi.com> Message-ID: <20050216235624.GO17806@ewok.lucasdigital.com> On Wed, 16 Feb 2005, Bob Ippolito wrote: > > On Feb 16, 2005, at 11:02, Phillip J. Eby wrote: > > >At 02:32 PM 2/11/05 -0800, Nick Rasmussen wrote: > >>tommy said that this would be the best place to ask > >>this question.... > >> > >>I'm trying to get functions wrapped via boost to show > >>up as builtin types so that pydoc includes them when > >>documenting the module containing them. Right now > >>boost python functions are created using a PyTypeObject > >>such that when inspect.isbuiltin does: > >> > >> return isinstance(object, types.BuiltinFunctionType) > > > >FYI, this may not be the "right" way to do this, but since 2.3 > >'isinstance()' looks at an object's __class__ rather than its type(), > >so you could perhaps include a '__class__' descriptor in your method > >type that returns BuiltinFunctionType and see if that works. > > > >It's a kludge, but it might let your code work with existing versions > >of Python. > > It works in Python 2.3.0: > That seemed to do the trick for me as well, I'll run it past the boost::python folks and see what they think. many thanks -nick From maalanen at ra.abo.fi Thu Feb 17 17:30:27 2005 From: maalanen at ra.abo.fi (Marcus Alanen) Date: Fri Feb 18 03:04:20 2005 Subject: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd) In-Reply-To: References: Message-ID: <4214C6A3.1000806@ra.abo.fi> Guido van Rossum wrote: > The Python test suite already has a way (the -u flag) to distinguish > between "regular" broad-coverage testing and deep coverage for > specific (or all) areas. Let's keep the really long-running tests out > of the regular test suite. > > There used to be a farm of machines that did nothing but run the test > suite ("snake-farm"). This seems to have stopped (it was run by > volunteers at a Swedish university). Maybe we should revive such an > effort, and make sure it runs with -u all. Hello Guido and everybody else, I hacked together a simple distributed unittest runner for our projects. Requirements are a NFS-mounted home directory across the slave nodes and SSH-based "automatic" authentication, i.e. no passwords or passphrases necessary. It officially works-for-me for around three hosts (see below) so that cuts the time down basically to a third (real-life example ~600 seconds to ~200 seconds, so it does work :-). It also supports "serialized tests", i.e. tests that must be run one after the other and cannot be run in parallel. http://mde.abo.fi/tools/disttest/ Comes with some problems; my blurb from advogato.org: """ Disttest is a distributed unittesting runner. You simply set the DISTTEST_HOSTS variable to a space-separated list of hostnames to connect to using SSH, and then run "disttest". The nodes must all have the same filesystem (usually an NFS-mounted /home) and have the Disttest program installed. 
You even gain a bit with just one computer by setting the variable to "localhost localhost". :-) There are currently two annoying problem with it, though. For some reason, 1) the unittest program connecting to the X server sometimes fails to provide the correct authentication, and 2) sometimes the actual connection to the X server can't be established. I think these are related to 1) congestion on the shared .Xauthority file, and 2) a too small listen() queue on the forwarding port by the SSH daemon. Both problems show up when using too many (over 4?) hosts, which is the whole point of the program! Sigh. """ Error checking probably bad. Anyway, feel free to check it out, modify, comment or anything. We're thinking of checking the assumptions in the blurb above, but no timetable is set. My guess is that the NFS-mounted home directory is the showstopper and people usually don't have lot's of machines hanging around, but that's for you to decide. Disclaimer: I don't know anything of CPython development nor of the tests in the CPython test suite. ;-) Best regards, and a big thank you for Python, Marcus From Martin.Gfeller at comit.ch Thu Feb 17 19:34:50 2005 From: Martin.Gfeller at comit.ch (Gfeller Martin) Date: Fri Feb 18 03:04:22 2005 Subject: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15% Message-ID: Hi, what immediately comes to mind are Modules/cPickle.c and Modules/cStringIO.c, which (I believe) are heavily used by ZODB (which in turn is heavily used by the application). The lists also get fairly large, although not huge - up to typically 50000 (complex) objects in the tests I've measured. As I said, I don't speak C, so I can only speculate - do the lists at some point grow beyond the upper limit of obmalloc, but are handled by the LFH (which has a higher upper limit, if I understood Tim Peters correctly)? Best regards, Martin -----Original Message----- From: Evan Jones [mailto:ejones@uwaterloo.ca] Sent: Thursday, 17 Feb 2005 02:26 To: Python Dev Cc: Gfeller Martin; Martin v. L?wis Subject: Re: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15% On Feb 16, 2005, at 18:42, Martin v. L?wis wrote: > I must admit that I'm surprised. I would have expected > that most allocations in Python go through obmalloc, so > the heap would only see "large" allocations. > > It would be interesting to find out, in your application, > why it is still an improvement to use the low-fragmentation > heaps. Hmm... This is an excellent point. A grep through the Python source code shows that the following files call the native system malloc (I've excluded a few obviously platform specific files). A quick visual inspection shows that most of these are using it to allocate some sort of array or string, so it likely *should* go through the system malloc. Gfeller, any idea if you are using any of the modules on this list? If so, it would be pretty easy to try converting them to call the obmalloc functions instead, and see how that affects the performance. 
Evan Jones Demo/pysvr/pysvr.c Modules/_bsddb.c Modules/_curses_panel.c Modules/_cursesmodule.c Modules/_hotshot.c Modules/_sre.c Modules/audioop.c Modules/bsddbmodule.c Modules/cPickle.c Modules/cStringIO.c Modules/getaddrinfo.c Modules/main.c Modules/pyexpat.c Modules/readline.c Modules/regexpr.c Modules/rgbimgmodule.c Modules/svmodule.c Modules/timemodule.c Modules/zlibmodule.c PC/getpathp.c Python/strdup.c Python/thread.c From tim.peters at gmail.com Fri Feb 18 04:38:08 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Feb 18 04:38:14 2005 Subject: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15% In-Reply-To: References: Message-ID: <1f7befae050217193863ffc028@mail.gmail.com> [Gfeller Martin] > what immediately comes to mind are Modules/cPickle.c and > Modules/cStringIO.c, which (I believe) are heavily used by ZODB (which in turn > is heavily used by the application). I probably guessed right the first time : LFH doesn't help with the lists directly, but helps indirectly by keeping smaller objects out of the general heap where the list guts actually live. Say we have a general heap with a memory map like this, meaning a contiguous range of available memory, where 'f' means a block is free. The units of the block don't really matter, maybe one 'f' is one byte, maybe one 'f' is 4MB -- it's all the same in the end: fffffffffffffffffffffffffffffffffffffffffffffff Now you allocate a relatively big object (like the guts of a large list), and it's assigned a contiguous range of blocks marked 'b': bbbbbbbbbbbbbbbffffffffffffffffffffffffffffffff Then you allocate a small object, marked 's': bbbbbbbbbbbbbbbsfffffffffffffffffffffffffffffff The you want to grow the big object. Oops! It can't extend the block of b's in-place, because 's' is in the way. Instead it has to copy the whole darn thing: fffffffffffffffsbbbbbbbbbbbbbbbffffffffffffffff But if 's' is allocated from some _other_ heap, then the big object can grow in-place, and that's much more efficient than copying the whole thing. obmalloc has two primary effects: it manages a large number of very small (<= 256 bytes) memory chunks very efficiently, but it _also_ helps larger objects indirectly, by keeping the very small objects out of the platform C malloc's way. LFH appears to be an extension of the same basic idea, raising the "small object" limit to 16KB. Now note that pymalloc and LFH are *bad* ideas for objects that want to grow. pymalloc and LFH segregate the memory they manage into blocks of different sizes. For example, pymalloc keeps a list of free blocks each of which is exactly 64 bytes long. Taking a 64-byte block out of that list, or putting it back in, is very efficient. But if an object that uses a 64-byte block wants to grow, pymalloc can _never_ grow it in-place, it always has to copy it. That's a cost that comes with segregating memory by size, and for that reason Python deliberately doesn't use pymalloc in several cases where objects are expected to grow over time. One thing to take from that is that LFH can't be helping list-growing in a direct way either, if LFH (as seems likely) also needs to copy objects that grow in order to keep its internal memory segregated by size. The indirect benefit is still available, though: LFH may be helping simply by keeping smaller objects out of the general heap's hair. > The lists also get fairly large, although not huge - up to typically 50000 > (complex) objects in the tests I've measured. That's much larger than LFH can handle. Its limit is 16KB. 
A Python list with 50K elements requires a contiguous chunk of 200KB on a 32-bit machine to hold the list guts. > As I said, I don't speak C, so I can only speculate - do the lists at some point >grow beyond the upper limit of obmalloc, but are handled by the LFH (which has a > higher upper limit, if I understood Tim Peters correctly)? A Python list object comprises two separately allocated pieces of memory. First is a list header, a small piece of memory of fixed size, independent of len(list). The list header is always obtained from obmalloc; LFH will never be involved with that, and neither will the system malloc. The list header has a pointer to a separate piece of memory, which contains the guts of a list, a contiguous vector of len(list) pionters (to Python objects). For a list of length n, this needs 4*n bytes on a 32-bit box. obmalloc never manages that space, and for the reason given above: we expect that list guts may grow, and obmalloc is meant for fixed-size chunks of memory. So the list guts will get handled by LFH, until the list needs more than 4K entries (hitting the 16KB LFH limit). Until then, LFH probably wastes time by copying growing list guts from size class to size class. Then the list guts finally get copied to the general heap, and stay there. I'm afraid the only you can know for sure is by obtaining detailed memory maps and analyzing them. From abo at minkirri.apana.org.au Fri Feb 18 05:09:51 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Fri Feb 18 05:10:35 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <20050217065330.GP25441@zot.electricrain.com> References: <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> <20050212133721.GA13429@rogue.amk.ca> <20050212210402.GE25441@zot.electricrain.com> <1108340374.3768.33.camel@schizo> <20050217065330.GP25441@zot.electricrain.com> Message-ID: <1108699791.3758.98.camel@schizo> On Wed, 2005-02-16 at 22:53 -0800, Gregory P. Smith wrote: > fyi - i've updated the python sha1/md5 openssl patch. it now replaces > the entire sha and md5 modules with a generic hashes module that gives > access to all of the hash algorithms supported by OpenSSL (including > appropriate legacy interface wrappers and falling back to the old code > when compiled without openssl). > > https://sourceforge.net/tracker/index.php?func=detail&aid=1121611&group_id=5470&atid=305470 > > I don't quite like the module name 'hashes' that i chose for the > generic interface (too close to the builtin hash() function). Other > suggestions on a module name? 'digest' comes to mind. I just had a quick look, and have these comments (psedo patch review?). Apologies for the noise on the list... DESCRIPTION =========== This patch keeps the current md5c.c, md5module.c files and adds the following; _hashopenssl.c, hashes.py, md5.py, sha.py. The old md5 and sha extension modules get replaced by hashes.py, md5.py, and sha.py python modules that leverage off _hash (openssl) or _md5 and _sha (no openssl) extension modules. The new _hash extension module "wraps" the high level openssl EVP interface, which uses a string parameter to indicate what type of message digest algorithm to use. 
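Roughly, the layout described here amounts to something like the following sketch (module and function names are taken from the description above, so treat them as assumptions; the real patch differs in detail):

    # hashes.py -- sketch of the wrapper-with-fallback arrangement described
    try:
        import _hash                      # OpenSSL-backed EVP wrapper
    except ImportError:
        _hash = None
        import _md5, _sha                 # the renamed legacy extension modules

    def new(name, string=""):
        if _hash is not None:
            return _hash.new(name, string)
        if name == "md5":
            return _md5.new(string)
        if name in ("sha", "sha1"):
            return _sha.new(string)
        raise ValueError("unsupported hash type: %r" % (name,))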
The advantage of this is it makes all openssl supported digests available, and if openssl adds more, we get them for free. A disadvantage of this is it is an abstraction level above the actual md5 and sha implementations, and this may add overheads. These overheads are probably negligible compared to the actual implementation speedups. The new _md5 and _sha extension modules are simply re-named versions of the old md5 and sha modules. The hashes.py module acts as an import wrapper for _hash, and falls back to using _md5 and _sha modules if _hash is not available. It provides an EVP style API (string hash name parameter), that supports only md5 and sha hashes if openssl is not available. The new md5.py and sha.py modules simply use hash.py. COMMENTS ======== The introduction of a "hashes" module with a new API that supports many different digests (provided openssl is available) is extending Python, not just "fixing the licenses" of md5 and sha modules. If all we wanted to do was fix the md5 module, a simpler solution would be to change the md5c.c API to match openssl's implementation, and make md5module.c use it, conditionally compiling against md5c.c or linking against openssl in setup.py. A similar approach could be used for sha, but would require stripping the sha implementation out of shamodule.c I am mildly of concerned about the namespace/filespace clutter introduced by this implementation... it feels unnecessary, as does the tangled dependencies between them. With openssl, hashes.py duplicates the functionality of _hash. Without openssl, md5.py and sha.py duplicate _md5 and _sha, via a roundabout route through hash.py. The python wrappers seem overly complicated, with things like def new(name, string=None): if string: return _hash.new(name) else: return _hash.new.(name,string) being common where the following would suffice; def new(name,string=""): return _hash.new(name,string) I think this is because _hash.new() uses an optional string parameter, but I have a feeling a C update with a zero length string is faster than this Python if. If it was a concern, the C implementation could check the value of the string length before calling update. Given the convenience methods for different hashes in hashes.py (which incidentally look like they are only available when _hash is not available... something else that needs fixing), the md5.py module could be simply coded as; from hashes import md5 new = md5 Despite all these nit-picks, it looks pretty good. It is orders of magnitude better than any of the other non-existent solutions, including the one I didn't code :-) -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From raymond.hettinger at verizon.net Fri Feb 18 07:53:37 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri Feb 18 07:57:43 2005 Subject: [Python-Dev] Prospective Peephole Transformation Message-ID: <000c01c51586$92c7dd60$3a01a044@oemcomputer> Based on some ideas from Skip, I had tried transforming the likes of "x in (1,2,3)" into "x in frozenset([1,2,3])". When applicable, it substantially simplified the generated code and converted the O(n) lookup into an O(1) step. There were substantial savings even if the set contained only a single entry. When disassembled, the bytecode is not only much shorter, it is also much more readable (corresponding almost directly to the original source). The problem with the transformation was that it didn't handle the case where x was non-hashable and it would raise a TypeError instead of returning False as it should. 
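The wrinkle in a few lines (not code from the patch, just the semantics at issue): membership in a tuple only needs __eq__, while membership in a set hashes the left operand first.

    x = []                                # a list is unhashable
    assert (x in (1, 2, 3)) is False      # tuple 'in' just compares -> False
    try:
        x in frozenset([1, 2, 3])         # set 'in' hashes x first ...
    except TypeError:
        pass                              # ... so the naive rewrite raises instead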
That situation arose once in the email module's test suite. To get it to work, I would have to introduce a frozenset subtype: class Searchset(frozenset): def __contains__(self, element): try: return frozenset.__contains__(self, element) except TypeError: return False Then, the transformation would be "x in Searchset([1, 2, 3])". Since the new Searchset object goes in the constant table, marshal would have to be taught how to save and restore the object. This is a more complicated than the original frozenset version of the patch, so I would like to get feedback on whether you guys think it is worth it. Raymond Hettinger From fredrik at pythonware.com Fri Feb 18 09:18:31 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri Feb 18 09:18:40 2005 Subject: [Python-Dev] Re: Prospective Peephole Transformation References: <000c01c51586$92c7dd60$3a01a044@oemcomputer> Message-ID: Raymond Hettinger wrote: > Based on some ideas from Skip, I had tried transforming the likes of "x > in (1,2,3)" into "x in frozenset([1,2,3])". When applicable, it > substantially simplified the generated code and converted the O(n) > lookup into an O(1) step. There were substantial savings even if the > set contained only a single entry. savings in what? time or bytecode size? constructed micro-benchmarks, or examples from real-life code? do we have any statistics on real-life "n" values? From martin at v.loewis.de Fri Feb 18 10:06:24 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri Feb 18 10:06:28 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <1108699791.3758.98.camel@schizo> References: <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> <20050212133721.GA13429@rogue.amk.ca> <20050212210402.GE25441@zot.electricrain.com> <1108340374.3768.33.camel@schizo> <20050217065330.GP25441@zot.electricrain.com> <1108699791.3758.98.camel@schizo> Message-ID: <4215B010.2090600@v.loewis.de> Donovan Baarda wrote: > This patch keeps the current md5c.c, md5module.c files and adds the > following; _hashopenssl.c, hashes.py, md5.py, sha.py. [...] > If all we wanted to do was fix the md5 module If we want to fix the licensing issues with the md5 module, this patch does not help at all, as it keeps the current md5 module (along with its licensing issues). So any patch to solve the problem will need to delete the code with the questionable license. Then, the approach in the patch breaks the promise that the md5 module is always there. It would require that OpenSSL is always there - a promise that we cannot make (IMO). Regards, Martin From arigo at tunes.org Fri Feb 18 12:36:08 2005 From: arigo at tunes.org (Armin Rigo) Date: Fri Feb 18 12:39:37 2005 Subject: [Python-Dev] builtin_id() returns negative numbers In-Reply-To: <1f7befae050217104431312214@mail.gmail.com> References: <4210AFAA.9060108@thule.no> <1f7befae050214074122b715a@mail.gmail.com> <20050217181119.GA3055@vicky.ecs.soton.ac.uk> <1f7befae050217104431312214@mail.gmail.com> Message-ID: <20050218113608.GB25496@vicky.ecs.soton.ac.uk> Hi Tim, On Thu, Feb 17, 2005 at 01:44:11PM -0500, Tim Peters wrote: > > 256 ** struct.calcsize('P') > > Now if you'll just sign and fax a Zope contributor agreement, I'll > upgrade ZODB to use this slick trick . 
I hereby donate this line of code to the public domain :-) Armin From skip at pobox.com Fri Feb 18 15:41:42 2005 From: skip at pobox.com (Skip Montanaro) Date: Fri Feb 18 15:39:15 2005 Subject: [Python-Dev] Five review rule on the /dev/ page? In-Reply-To: <20050217173346.GB18117@panix.com> References: <16916.50199.723442.36695@montanaro.dyndns.org> <20050217173346.GB18117@panix.com> Message-ID: <16917.65190.515241.199460@montanaro.dyndns.org> aahz> This should go into Brett's survey of the Python dev process, not aahz> as official documentation. It's simply an offer made by some of aahz> the prominent members of python-dev. As long as it's referred to from www.python.org/dev that's fine. Skip From skip at pobox.com Fri Feb 18 15:57:39 2005 From: skip at pobox.com (Skip Montanaro) Date: Fri Feb 18 15:55:29 2005 Subject: [Python-Dev] Re: Prospective Peephole Transformation In-Reply-To: References: <000c01c51586$92c7dd60$3a01a044@oemcomputer> Message-ID: <16918.611.903084.183700@montanaro.dyndns.org> >> Based on some ideas from Skip, I had tried transforming the likes of >> "x in (1,2,3)" into "x in frozenset([1,2,3])".... Fredrik> savings in what? time or bytecode size? constructed Fredrik> micro-benchmarks, or examples from real-life code? Fredrik> do we have any statistics on real-life "n" values? My original suggestion wasn't based on performance issues. It was based on the notion of tuples-as-records and lists-as-arrays. Raymond had originally gone through the code and changed for x in [1,2,3]: to for x in (1,2,3): I suggested that since the standard library code is commonly used as an example of basic Python principles (that's probably not the right word), it should uphold that ideal tuple/list distinction. Raymond then translated for x in [1,2,3]: to for x in frozenset([1,2,3]): I'm unclear why the list in "for x in [1,2,3]" or "if x not in [1,2,3]" can't fairly easily be recognized as a constant and just be placed in the constants array. The bytecode would show n LOAD_CONST opcodes followed by BUILD_LIST then either a COMPARE_OP (in the test case) or GET_ITER+FOR_ITER (in the for loop case). I think the optimizer should be able to recognize both constructs fairly easily. I don't know if that would provide a performance increase or not. I was after separation of functionality between tuples and lists. Skip From python at rcn.com Fri Feb 18 15:58:10 2005 From: python at rcn.com (Raymond Hettinger) Date: Fri Feb 18 16:02:09 2005 Subject: [Python-Dev] Re: Prospective Peephole Transformation In-Reply-To: <16918.611.903084.183700@montanaro.dyndns.org> Message-ID: <000001c515ca$4378e260$803cc797@oemcomputer> > I'm unclear why the list in "for x in [1,2,3]" or "if x not in [1,2,3]" > can't fairly easily be recognized as a constant and just be placed in the > constants array. That part got done (at least for the if-statement). The question is whether the type transformation idea should be carried a step further so that a single step search operation replaces the linear search. 
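One way to put numbers on that question with the timeit module (an illustrative measurement; concrete command-line timings appear elsewhere in this thread):

    import timeit

    setup = "x = 'pdf'; s = frozenset(['xml', 'css', 'html'])"
    # Linear scan of a constant tuple vs. a single hashed lookup.
    linear = min(timeit.Timer("x in ('xml', 'css', 'html')", setup).repeat(5, 100000))
    hashed = min(timeit.Timer("x in s", setup).repeat(5, 100000))
    # 'linear' grows with how many items are compared before a hit (or a miss);
    # 'hashed' stays roughly flat, which is the point of the proposed step.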
Raymond From irmen at xs4all.nl Fri Feb 18 15:36:15 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Fri Feb 18 16:02:14 2005 Subject: [Python-Dev] Re: Prospective Peephole Transformation In-Reply-To: <16918.611.903084.183700@montanaro.dyndns.org> References: <000c01c51586$92c7dd60$3a01a044@oemcomputer> <16918.611.903084.183700@montanaro.dyndns.org> Message-ID: <4215FD5F.4040605@xs4all.nl> Skip Montanaro wrote: > I suggested that since the standard library code is commonly used as an > example of basic Python principles (that's probably not the right word), it > should uphold that ideal tuple/list distinction. Raymond then translated > > for x in [1,2,3]: > > to > > for x in frozenset([1,2,3]): I may be missing something here (didn't follow the whole thread) but those two are not functionally equal. The docstring on frozenset sais "Build an immutable unordered collection." So there's no guarantee that the elements will return from the frozenset iterator in the order that you constructed the frozenset with, right? --Irmen From python at rcn.com Fri Feb 18 16:15:04 2005 From: python at rcn.com (Raymond Hettinger) Date: Fri Feb 18 16:19:03 2005 Subject: [Python-Dev] Re: Prospective Peephole Transformation In-Reply-To: <4215FD5F.4040605@xs4all.nl> Message-ID: <000101c515cc$9f96d0a0$803cc797@oemcomputer> > > Raymond then > translated > > > > for x in [1,2,3]: > > > > to > > > > for x in frozenset([1,2,3]): That's not right. for-statements are not touched. > I may be missing something here (didn't follow the whole thread) but > those two are not functionally equal. > The docstring on frozenset sais "Build an immutable unordered collection." > So there's no guarantee that the elements will return from the > frozenset iterator in the order that you constructed the frozenset with, > right? Only contains expressions are translated: "if x in [1,2,3]" currently turns into: "if x in (1,2,3)" and I'm proposing that it go one step further: "if x in Seachset([1,2,3])" where Search set is a frozenset subtype that doesn't require x to be hashable. Also, the transformation would only happen when the contents of the search are all constants. Raymond From pje at telecommunity.com Fri Feb 18 16:36:43 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Feb 18 16:34:03 2005 Subject: [Python-Dev] Re: Prospective Peephole Transformation In-Reply-To: <000101c515cc$9f96d0a0$803cc797@oemcomputer> References: <4215FD5F.4040605@xs4all.nl> Message-ID: <5.1.1.6.0.20050218103403.03869990@mail.telecommunity.com> At 10:15 AM 2/18/05 -0500, Raymond Hettinger wrote: >Only contains expressions are translated: > > "if x in [1,2,3]" > >currently turns into: > > "if x in (1,2,3)" > >and I'm proposing that it go one step further: > > "if x in Seachset([1,2,3])" ISTM that whenever I use a constant in-list like that, it's almost always with just a few (<4) items, so it doesn't seem worth the extra effort (especially disrupting the marshal module) just to squeeze out those extra two comparisons and replace them with a hashing operation. From fredrik at pythonware.com Fri Feb 18 16:45:32 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri Feb 18 16:45:45 2005 Subject: [Python-Dev] Re: Prospective Peephole Transformation References: <4215FD5F.4040605@xs4all.nl> <000101c515cc$9f96d0a0$803cc797@oemcomputer> <5.1.1.6.0.20050218103403.03869990@mail.telecommunity.com> Message-ID: Phillip J. 
Eby wrote: >>Only contains expressions are translated: >> >> "if x in [1,2,3]" >> >>currently turns into: >> >> "if x in (1,2,3)" >> >>and I'm proposing that it go one step further: >> >> "if x in Seachset([1,2,3])" > > ISTM that whenever I use a constant in-list like that, it's almost always with just a few (<4) > items, so it doesn't seem worth the extra effort (especially disrupting the marshal module) just > to squeeze out those extra two comparisons and replace them with a hashing operation. it could be worth expanding them to "if x == 1 or x == 2 or x == 3:" though... C:\>timeit -s "a = 1" "if a in (1, 2, 3): pass" 10000000 loops, best of 3: 0.11 usec per loop C:\>timeit -s "a = 1" "if a == 1 or a == 2 or a == 3: pass" 10000000 loops, best of 3: 0.0691 usec per loop C:\>timeit -s "a = 2" "if a == 1 or a == 2 or a == 3: pass" 10000000 loops, best of 3: 0.123 usec per loop C:\>timeit -s "a = 2" "if a in (1, 2, 3): pass" 10000000 loops, best of 3: 0.143 usec per loop C:\>timeit -s "a = 3" "if a == 1 or a == 2 or a == 3: pass" 10000000 loops, best of 3: 0.187 usec per loop C:\>timeit -s "a = 3" "if a in (1, 2, 3): pass" 1000000 loops, best of 3: 0.197 usec per loop C:\>timeit -s "a = 4" "if a in (1, 2, 3): pass" 1000000 loops, best of 3: 0.225 usec per loop C:\>timeit -s "a = 4" "if a == 1 or a == 2 or a == 3: pass" 10000000 loops, best of 3: 0.161 usec per loop From skip at pobox.com Fri Feb 18 17:03:28 2005 From: skip at pobox.com (Skip Montanaro) Date: Fri Feb 18 17:00:59 2005 Subject: [Python-Dev] Re: Prospective Peephole Transformation In-Reply-To: <000101c515cc$9f96d0a0$803cc797@oemcomputer> References: <4215FD5F.4040605@xs4all.nl> <000101c515cc$9f96d0a0$803cc797@oemcomputer> Message-ID: <16918.4560.171364.66303@montanaro.dyndns.org> >> > Raymond then >> translated >> > >> > for x in [1,2,3]: >> > >> > to >> > >> > for x in frozenset([1,2,3]): Raymond> That's not right. for-statements are not touched. Thanks for the correction. My apologies for the misstep. Skip From pje at telecommunity.com Fri Feb 18 17:42:51 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Feb 18 17:40:12 2005 Subject: [Python-Dev] Re: Prospective Peephole Transformation In-Reply-To: References: <4215FD5F.4040605@xs4all.nl> <000101c515cc$9f96d0a0$803cc797@oemcomputer> <5.1.1.6.0.20050218103403.03869990@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050218113820.02f83870@mail.telecommunity.com> At 04:45 PM 2/18/05 +0100, Fredrik Lundh wrote: >Phillip J. Eby wrote: > > >>Only contains expressions are translated: > >> > >> "if x in [1,2,3]" > >> > >>currently turns into: > >> > >> "if x in (1,2,3)" > >> > >>and I'm proposing that it go one step further: > >> > >> "if x in Seachset([1,2,3])" > > > > ISTM that whenever I use a constant in-list like that, it's almost > always with just a few (<4) > > items, so it doesn't seem worth the extra effort (especially disrupting > the marshal module) just > > to squeeze out those extra two comparisons and replace them with a > hashing operation. > >it could be worth expanding them to > > "if x == 1 or x == 2 or x == 3:" > >though... 
> >C:\>timeit -s "a = 1" "if a in (1, 2, 3): pass" >10000000 loops, best of 3: 0.11 usec per loop >C:\>timeit -s "a = 1" "if a == 1 or a == 2 or a == 3: pass" >10000000 loops, best of 3: 0.0691 usec per loop > >C:\>timeit -s "a = 2" "if a == 1 or a == 2 or a == 3: pass" >10000000 loops, best of 3: 0.123 usec per loop >C:\>timeit -s "a = 2" "if a in (1, 2, 3): pass" >10000000 loops, best of 3: 0.143 usec per loop > >C:\>timeit -s "a = 3" "if a == 1 or a == 2 or a == 3: pass" >10000000 loops, best of 3: 0.187 usec per loop >C:\>timeit -s "a = 3" "if a in (1, 2, 3): pass" >1000000 loops, best of 3: 0.197 usec per loop > >C:\>timeit -s "a = 4" "if a in (1, 2, 3): pass" >1000000 loops, best of 3: 0.225 usec per loop >C:\>timeit -s "a = 4" "if a == 1 or a == 2 or a == 3: pass" >10000000 loops, best of 3: 0.161 usec per loop > > Were these timings done with the code that turns (1,2,3) into a constant? Also, I presume that these timings still include extra LOAD_FAST operations that could be replaced with DUP_TOP in the actual expansion, although I don't know how much difference that would make in practice, since saving the argument fetch might be offset by the need to swap and pop at the end. From fredrik at pythonware.com Fri Feb 18 17:52:08 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri Feb 18 17:52:16 2005 Subject: [Python-Dev] Re: Re: Prospective Peephole Transformation References: <4215FD5F.4040605@xs4all.nl><000101c515cc$9f96d0a0$803cc797@oemcomputer><5.1.1.6.0.20050218103403.03869990@mail.telecommunity.com> <5.1.1.6.0.20050218113820.02f83870@mail.telecommunity.com> Message-ID: Phillip J. Eby wrote: > Were these timings done with the code that turns (1,2,3) into a constant? I used a stock 2.4 from python.org, which seems to do this (for tuples, not for lists). > Also, I presume that these timings still include extra LOAD_FAST operations that could be replaced > with DUP_TOP in the actual expansion, although I don't know how much difference that would make in > practice, since saving the argument fetch might be offset by the need to swap and pop at the end. here's the disassembly: >>> dis.dis(compile("if a in (1, 2, 3): pass", "", "exec")) 1 0 LOAD_NAME 0 (a) 3 LOAD_CONST 4 ((1, 2, 3)) 6 COMPARE_OP 6 (in) 9 JUMP_IF_FALSE 4 (to 16) 12 POP_TOP 13 JUMP_FORWARD 1 (to 17) >> 16 POP_TOP >> 17 LOAD_CONST 3 (None) 20 RETURN_VALUE >>> dis.dis(compile("if a == 1 or a == 2 or a == 3: pass", "", "exec")) 1 0 LOAD_NAME 0 (a) 3 LOAD_CONST 0 (1) 6 COMPARE_OP 2 (==) 9 JUMP_IF_TRUE 26 (to 38) 12 POP_TOP 13 LOAD_NAME 0 (a) 16 LOAD_CONST 1 (2) 19 COMPARE_OP 2 (==) 22 JUMP_IF_TRUE 13 (to 38) 25 POP_TOP 26 LOAD_NAME 0 (a) 29 LOAD_CONST 2 (3) 32 COMPARE_OP 2 (==) 35 JUMP_IF_FALSE 4 (to 42) >> 38 POP_TOP 39 JUMP_FORWARD 1 (to 43) >> 42 POP_TOP >> 43 LOAD_CONST 3 (None) 46 RETURN_VALUE From pje at telecommunity.com Fri Feb 18 18:09:29 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Feb 18 18:06:50 2005 Subject: [Python-Dev] Re: Re: Prospective Peephole Transformation In-Reply-To: References: <4215FD5F.4040605@xs4all.nl> <000101c515cc$9f96d0a0$803cc797@oemcomputer> <5.1.1.6.0.20050218103403.03869990@mail.telecommunity.com> <5.1.1.6.0.20050218113820.02f83870@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050218120310.03c70510@mail.telecommunity.com> At 05:52 PM 2/18/05 +0100, Fredrik Lundh wrote: >Phillip J. Eby wrote: > > > Were these timings done with the code that turns (1,2,3) into a constant? 
> >I used a stock 2.4 from python.org, which seems to do this (for tuples, >not for lists). > > > Also, I presume that these timings still include extra LOAD_FAST > operations that could be replaced > > with DUP_TOP in the actual expansion, although I don't know how much > difference that would make in > > practice, since saving the argument fetch might be offset by the need > to swap and pop at the end. > >here's the disassembly: FYI, that's not a dissassembly of what timeit was actually timing; see 'template' in timeit.py. As a practical matter, the only difference would probably be the use of LOAD_FAST instead of LOAD_NAME, as timeit runs the code in a function body. But whatever. Still, it's rather interesting that tuple.__contains__ appears slower than a series of LOAD_CONST and "==" operations, considering that the tuple should be doing basically the same thing, only without bytecode fetch-and-decode overhead. Maybe it's tuple.__contains__ that needs optimizing here? From fredrik at pythonware.com Fri Feb 18 18:12:50 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri Feb 18 18:12:40 2005 Subject: [Python-Dev] Re: Re: Re: Prospective Peephole Transformation References: <4215FD5F.4040605@xs4all.nl><000101c515cc$9f96d0a0$803cc797@oemcomputer><5.1.1.6.0.20050218103403.03869990@mail.telecommunity.com><5.1.1.6.0.20050218113820.02f83870@mail.telecommunity.com> <5.1.1.6.0.20050218120310.03c70510@mail.telecommunity.com> Message-ID: Phillip J. Eby wrote: >>here's the disassembly: > > FYI, that's not a dissassembly of what timeit was actually timing; see 'template' in timeit.py. > As a practical matter, the only difference would probably be the use of LOAD_FAST instead of > LOAD_NAME, as > timeit runs the code in a function body. >>> def f1(a): ... if a in (1, 2, 3): ... pass ... >>> def f2(a): ... if a == 1 or a == 2 or a == 3: ... pass ... >>> dis.dis(f1) 2 0 LOAD_FAST 0 (a) 3 LOAD_CONST 4 ((1, 2, 3)) 6 COMPARE_OP 6 (in) 9 JUMP_IF_FALSE 4 (to 16) 12 POP_TOP 3 13 JUMP_FORWARD 1 (to 17) >> 16 POP_TOP >> 17 LOAD_CONST 0 (None) 20 RETURN_VALUE >>> >>> dis.dis(f2) 2 0 LOAD_FAST 0 (a) 3 LOAD_CONST 1 (1) 6 COMPARE_OP 2 (==) 9 JUMP_IF_TRUE 26 (to 38) 12 POP_TOP 13 LOAD_FAST 0 (a) 16 LOAD_CONST 2 (2) 19 COMPARE_OP 2 (==) 22 JUMP_IF_TRUE 13 (to 38) 25 POP_TOP 26 LOAD_FAST 0 (a) 29 LOAD_CONST 3 (3) 32 COMPARE_OP 2 (==) 35 JUMP_IF_FALSE 4 (to 42) >> 38 POP_TOP 3 39 JUMP_FORWARD 1 (to 43) >> 42 POP_TOP >> 43 LOAD_CONST 0 (None) 46 RETURN_VALUE > Still, it's rather interesting that tuple.__contains__ appears slower than a series of LOAD_CONST > and "==" operations, considering that the tuple should be doing basically the same thing, only > without bytecode fetch-and-decode overhead. Maybe it's tuple.__contains__ that needs optimizing > here? wouldn't be the first time... From jimjjewett at gmail.com Fri Feb 18 20:10:05 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri Feb 18 20:10:09 2005 Subject: [Python-Dev] Prospective Peephole Transformation Message-ID: Raymond Hettinger: > tried transforming the likes of "x in (1,2,3)" into "x in frozenset([1,2,3])". >... There were substantial savings even if the set contained only a single entry. >... where x was non-hashable and it would raise a TypeError instead of > returning False as it should. I read the objection as saying that it should not return False, because an unhashable object might pretend it is equal to a hashable one in the set. 
""" class Searchset(frozenset): def __contains__(self, element): try: return frozenset.__contains__(self, element) except TypeError: return False """ So instead of return False it should be return x in frozenset.__iter__() This would be a net loss if there were many unhashable x. You could restrict the iteration to x that implement a custom __eq__, if you ensured that none of the SearchSet elements do... but it starts to get uglier and less general. Raymond has already look at http://www.python.org/sf/1141428, which contains some test case patches to enforce this implicit "sequences always use __eq__; only mappings can short-circuit on __hash__" contract. -jJ From mal at egenix.com Fri Feb 18 21:57:16 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Fri Feb 18 21:57:22 2005 Subject: [Python-Dev] Prospective Peephole Transformation In-Reply-To: <000c01c51586$92c7dd60$3a01a044@oemcomputer> References: <000c01c51586$92c7dd60$3a01a044@oemcomputer> Message-ID: <421656AC.6010602@egenix.com> Raymond Hettinger wrote: > Based on some ideas from Skip, I had tried transforming the likes of "x > in (1,2,3)" into "x in frozenset([1,2,3])". When applicable, it > substantially simplified the generated code and converted the O(n) > lookup into an O(1) step. There were substantial savings even if the > set contained only a single entry. When disassembled, the bytecode is > not only much shorter, it is also much more readable (corresponding > almost directly to the original source). > > The problem with the transformation was that it didn't handle the case > where x was non-hashable and it would raise a TypeError instead of > returning False as it should. That situation arose once in the email > module's test suite. > > To get it to work, I would have to introduce a frozenset subtype: > > class Searchset(frozenset): > def __contains__(self, element): > try: > return frozenset.__contains__(self, element) > except TypeError: > return False > > Then, the transformation would be "x in Searchset([1, 2, 3])". Since > the new Searchset object goes in the constant table, marshal would have > to be taught how to save and restore the object. > > This is a more complicated than the original frozenset version of the > patch, so I would like to get feedback on whether you guys think it is > worth it. Wouldn't it help a lot more if the compiler would detect that (1,2,3) is immutable and convert it into a constant at compile time ?! The next step would then be to have Python roll out these loops (in -O mode). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 18 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From oliphant at ee.byu.edu Fri Feb 18 22:12:53 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 18 22:12:56 2005 Subject: [Python-Dev] Fixing _PyEval_SliceIndex so that integer-like objects can be used Message-ID: <42165A55.3000609@ee.byu.edu> Hello again, There is a great discussion going on the numpy list regarding a proposed PEP for multidimensional arrays that is in the works. During this discussion as resurfaced regarding slicing with objects that are not IntegerType objects but that have a tp_as_number->nb_int method to convert to an int. 
Would it be possible to change _PyEval_SliceIndex in ceval.c so that rather than throwing an error if the indexing object is not an integer, the code first checks to see if the object has a tp_as_number->nb_int method and calls it instead. If this is acceptable, it is an easy patch. Thanks, -Travis Oliphant From gvanrossum at gmail.com Fri Feb 18 22:28:34 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Feb 18 22:28:39 2005 Subject: [Python-Dev] Fixing _PyEval_SliceIndex so that integer-like objects can be used In-Reply-To: <42165A55.3000609@ee.byu.edu> References: <42165A55.3000609@ee.byu.edu> Message-ID: > Would it be possible to change > > _PyEval_SliceIndex in ceval.c > > so that rather than throwing an error if the indexing object is not an > integer, the code first checks to see if the object has a > tp_as_number->nb_int method and calls it instead. I don't think this is the right solution; since float has that method, it would allow floats to be used as slice indices, but that's not supposed to work (to protect yourself against irreproducible results due to rounding errors). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Fri Feb 18 22:31:47 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Fri Feb 18 22:31:58 2005 Subject: [Python-Dev] Fixing _PyEval_SliceIndex so that integer-like objects can be used In-Reply-To: <42165A55.3000609@ee.byu.edu> References: <42165A55.3000609@ee.byu.edu> Message-ID: <42165EC3.6010209@ocf.berkeley.edu> Travis Oliphant wrote: > Hello again, > > There is a great discussion going on the numpy list regarding a proposed > PEP for multidimensional arrays that is in the works. > > During this discussion as resurfaced regarding slicing with objects that > are not IntegerType objects but that > have a tp_as_number->nb_int method to convert to an int. > Would it be possible to change > > _PyEval_SliceIndex in ceval.c > > so that rather than throwing an error if the indexing object is not an > integer, the code first checks to see if the object has a > tp_as_number->nb_int method and calls it instead. > You would also have to change apply_slice() since that also has a guard for checking the slice arguments are either NULL, int, or long objects. But I am +1 with it since the guard is already there for ints and longs to handle those properly and thus the common case does not slow down in any way. As long as it also accepts Python objects that define __int__ and not just C types that have the nb_int slot defined I am okay with this idea. -Brett From oliphant at ee.byu.edu Fri Feb 18 22:35:43 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 18 22:35:46 2005 Subject: [Python-Dev] Fix _PyEval_SliceIndex (Take two) Message-ID: <42165FAF.8080703@ee.byu.edu> (More readable second paragraph) Hello again, There is a great discussion going on the numpy list regarding a proposed PEP for multidimensional arrays that is in the works. During this discussion a problem has resurfaced regarding slicing with objects that are not IntegerType objects but that have a tp_as_number->nb_int method. Would it be possible to change _PyEval_SliceIndex in ceval.c so that rather than raising an exception if the indexing object is not an integer, the code first checks to see if the object has a tp_as_number->nb_int method and trys it before raising an exception. If this is acceptable, it is an easy patch. 
Thanks, -Travis Oliphant From david.ascher at gmail.com Fri Feb 18 22:36:31 2005 From: david.ascher at gmail.com (David Ascher) Date: Fri Feb 18 22:36:34 2005 Subject: [Python-Dev] Fixing _PyEval_SliceIndex so that integer-like objects can be used In-Reply-To: References: <42165A55.3000609@ee.byu.edu> Message-ID: On Fri, 18 Feb 2005 13:28:34 -0800, Guido van Rossum wrote: > > Would it be possible to change > > > > _PyEval_SliceIndex in ceval.c > > > > so that rather than throwing an error if the indexing object is not an > > integer, the code first checks to see if the object has a > > tp_as_number->nb_int method and calls it instead. > > I don't think this is the right solution; since float has that method, > it would allow floats to be used as slice indices, but that's not > supposed to work (to protect yourself against irreproducible results > due to rounding errors). I wonder if floats are the special case here, not "integer like objects". I've never been particularly happy about the confusion between the two roles of int() and it's C equivalents, i.e. casting and conversion. From gvanrossum at gmail.com Fri Feb 18 22:48:16 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Feb 18 22:48:55 2005 Subject: [Python-Dev] Fixing _PyEval_SliceIndex so that integer-like objects can be used In-Reply-To: References: <42165A55.3000609@ee.byu.edu> Message-ID: [Travis] > > > Would it be possible to change > > > > > > _PyEval_SliceIndex in ceval.c > > > > > > so that rather than throwing an error if the indexing object is not an > > > integer, the code first checks to see if the object has a > > > tp_as_number->nb_int method and calls it instead. [Guido] > > I don't think this is the right solution; since float has that method, > > it would allow floats to be used as slice indices, but that's not > > supposed to work (to protect yourself against irreproducible results > > due to rounding errors). [David] > I wonder if floats are the special case here, not "integer like objects". > > I've never been particularly happy about the confusion between the two > roles of int() and it's C equivalents, i.e. casting and conversion. You're right, that's the crux of the matter; I unfortunately copied a design mistake from C here. In Python 3000 I'd like to change this so that floats have a __trunc__() method to return an integer (invokable via trunc(x)). But in Python 2.x, we can't be sure that floats are the *only* exception -- surely people who are implementing their own "float-like" classes are copying float's example and implementing __int__ to mean the same thing. For example, the new decimal class in Python 2.4 has a converting/truncating __int__ method. (And despite being decimal, it's no less approximate than float; decimal is *not* an exact numerical type.) So I still think it's unsafe (in Python 2.x) to accept __int__ in the way Travis proposes. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From bob at redivi.com Fri Feb 18 22:54:25 2005 From: bob at redivi.com (Bob Ippolito) Date: Fri Feb 18 22:54:28 2005 Subject: [Python-Dev] Fixing _PyEval_SliceIndex so that integer-like objects can be used In-Reply-To: References: <42165A55.3000609@ee.byu.edu> Message-ID: <80cd5d26efaff4232b909b0567fb5ea3@redivi.com> On Feb 18, 2005, at 4:36 PM, David Ascher wrote: > On Fri, 18 Feb 2005 13:28:34 -0800, Guido van Rossum > wrote: >>> Would it be possible to change >>> >>> _PyEval_SliceIndex in ceval.c >>> >>> so that rather than throwing an error if the indexing object is not >>> an >>> integer, the code first checks to see if the object has a >>> tp_as_number->nb_int method and calls it instead. >> >> I don't think this is the right solution; since float has that method, >> it would allow floats to be used as slice indices, but that's not >> supposed to work (to protect yourself against irreproducible results >> due to rounding errors). > > I wonder if floats are the special case here, not "integer like > objects". > > I've never been particularly happy about the confusion between the two > roles of int() and it's C equivalents, i.e. casting and conversion. All of the __special__ methods for this purpose seem to be usable only for conversion, not casting (__str__, __unicode__, etc.). The only way I've found to pass for a particular value type is to subclass one. We do this a lot in PyObjC. It ends up being a net win anyway, because you get free implementations of all the relevant methods, at the expense of having two copies of the value. The fact that these proxy objects are no longer visible-from-Python subclasses of Objective-C objects isn't really a big deal in our case, because the canonical Objective-C way to checking inheritance still work. The wrapper types use an attribute protocol for casting (__pyobjc_object__), and delegate to this object with __getattr__. >>> from Foundation import * >>> one = NSNumber.numberWithInt_(1) >>> type(one).mro() [, , ] >>> isinstance(one, NSNumber) False >>> isinstance(one.__pyobjc_object__, NSNumber) True >>> one.isKindOfClass_(NSNumber) 1 >>> type(one) >>> type(one.__pyobjc_object__) -bob From ejones at uwaterloo.ca Fri Feb 18 22:58:36 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Fri Feb 18 22:59:25 2005 Subject: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15% In-Reply-To: <1f7befae050217193863ffc028@mail.gmail.com> References: <1f7befae050217193863ffc028@mail.gmail.com> Message-ID: On Thu, 2005-02-17 at 22:38, Tim Peters wrote: > Then you allocate a small object, marked 's': > > bbbbbbbbbbbbbbbsfffffffffffffffffffffffffffffff Isn't the whole point of obmalloc is that we don't want to allocate "s" on the heap, since it is small? I guess "s" could be an object that might potentially grow. > One thing to take from that is that LFH can't be helping list-growing > in a direct way either, if LFH (as seems likely) also needs to copy > objects that grow in order to keep its internal memory segregated by > size. The indirect benefit is still available, though: LFH may be > helping simply by keeping smaller objects out of the general heap's > hair. So then wouldn't this mean that there would have to be some sort of small object being allocated via the system malloc that is causing the poor behaviour? As you mention, I wouldn't think it would be list objects, since resizing lists using LFH should be *worse*. 
That would actually be something that is worth verifying, however. It could be that the Windows LFH is extra clever? > I'm afraid the only you can know for sure is by obtaining detailed > memory maps and analyzing them. Well, it would also be useful to find out what code is calling the system malloc. This would make it easy to examine the code and see if it should be calling obmalloc or the system malloc. Any good ideas for easily obtaining this information? I imagine that some profilers must be able to produce a complete call graph? Evan Jones From ejones at uwaterloo.ca Fri Feb 18 23:07:46 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Fri Feb 18 23:12:10 2005 Subject: [Python-Dev] Memory Allocator Part 2: Did I get it right? In-Reply-To: <4212FB5B.1030209@v.loewis.de> References: <8b28704b4465e03002fc70db5facedb6@uwaterloo.ca> <1f7befae05021514524d0a35ec@mail.gmail.com> <4c0d14b0b08390d046e1220b6f360745@uwaterloo.ca> <1f7befae05021520263d77a2a3@mail.gmail.com> <4212FB5B.1030209@v.loewis.de> Message-ID: Sorry for taking so long to get back to this thread, it has been one of those weeks for me. On Feb 16, 2005, at 2:50, Martin v. L?wis wrote: > Evan then understood the feature, and made it possible. This is very true: it was a very useful exercise. > I can personally accept breaking the code that still relies on the > invalid APIs. The only problem is that it is really hard to determine > whether some code *does* violate the API usage. Great. Please ignore the patch on SourceForge for a little while. I'll produce a "revision 3" this weekend, without the compatibility hack. Evan Jones From python at rcn.com Fri Feb 18 23:09:19 2005 From: python at rcn.com (Raymond Hettinger) Date: Fri Feb 18 23:13:23 2005 Subject: [Python-Dev] Prospective Peephole Transformation In-Reply-To: <421656AC.6010602@egenix.com> Message-ID: <001401c51606$7ec6cda0$803cc797@oemcomputer> > Wouldn't it help a lot more if the compiler would detect that > (1,2,3) is immutable and convert it into a constant at > compile time ?! Yes. We've already gotten it to that point: Python 2.5a0 (#46, Feb 15 2005, 19:11:35) [MSC v.1200 32 bit (Intel)] on win32 >>> import dis >>> dis.dis(compile('x in ("xml", "html", "css")', '', 'eval')) 0 0 LOAD_NAME 0 (x) 3 LOAD_CONST 3 (('xml', 'html', 'css')) 6 COMPARE_OP 6 (in) 9 RETURN_VALUE The question is whether to go a step further to replace the linear search with a single hashed lookup: 0 0 LOAD_NAME 0 (x) 3 LOAD_CONST 3 (searchset(['xml', 'html', 'css'])) 6 COMPARE_OP 6 (in) 9 RETURN_VALUE This situation seems to arise often in source code. You can see the cases in the standard library with: grep 'in ("' *.py The transformation is easy to make at compile time. The part holding me back is the introduction of searchset as a frozenset subtype and teaching marshal how to put it a pyc file. FWIW, some sample timings are included below (using frozenset to approximate what searchset would do). The summary is that the tuple search takes .49usec plus .12usec for each item searched until a match is found. The frozenset lookup takes a constant .53 usec. 
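At the user level, much of that win is already available by hoisting a frozenset constant out of the hot path; a small sketch (the names are made up, and this is the manual equivalent rather than the proposed compiler transformation):

    _MARKUP = frozenset(['xml', 'html', 'css'])    # built once, at import time

    def is_markup(extension):
        # constant-time hashed lookup instead of a linear scan of a tuple
        return extension in _MARKUP

The point of a compile-time searchset is that code written in the plain x in ("xml", "html", "css") style would get this behaviour without the manual hoisting.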
Raymond ------------------------------------------------------------------------ C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='xml'" "x in s" 1000000 loops, best of 9: 0.49 usec per loop C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='css'" "x in s" 1000000 loops, best of 9: 0.621 usec per loop C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='html'" "x in s" 1000000 loops, best of 9: 0.747 usec per loop C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='pdf'" "x in s" 100000 loops, best of 9: 0.851 usec per loop C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s "x='xml'" "x in s" 1000000 loops, best of 9: 0.529 usec per loop C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s "x='css'" "x in s" 1000000 loops, best of 9: 0.522 usec per loop C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s "x='html'" "x in s" 1000000 loops, best of 9: 0.53 usec per loop C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s "x='pdf'" "x in s" 1000000 loops, best of 9: 0.523 usec per loop From oliphant at ee.byu.edu Fri Feb 18 23:40:54 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 18 23:40:57 2005 Subject: [Python-Dev] Fixing _PyEval_SliceIndex so that integer-like objects can be used In-Reply-To: References: <42165A55.3000609@ee.byu.edu> Message-ID: <42166EF6.7010600@ee.byu.edu> Guido van Rossum wrote: >>Would it be possible to change >> >>_PyEval_SliceIndex in ceval.c >> >>so that rather than throwing an error if the indexing object is not an >>integer, the code first checks to see if the object has a >>tp_as_number->nb_int method and calls it instead. >> >> > >I don't think this is the right solution; since float has that method, >it would allow floats to be used as slice indices, > > O.K., then how about if arrayobjects can make it in the core, then a check for a rank-0 integer-type arrayobject is allowed before raising an exception? -Travis From tim.peters at gmail.com Fri Feb 18 23:51:37 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Feb 18 23:51:40 2005 Subject: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15% In-Reply-To: References: <1f7befae050217193863ffc028@mail.gmail.com> Message-ID: <1f7befae050218145157bd81c9@mail.gmail.com> [Tim Peters] ... >> Then you allocate a small object, marked 's': >> >> bbbbbbbbbbbbbbbsfffffffffffffffffffffffffffffff [Evan Jones] > Isn't the whole point of obmalloc No, because it doesn't matter what follows that introduction: obmalloc has several points, including exploiting the GIL, heuristics aiming at reusing memory while it's still high in the memory heirarchy, almost never touching a piece of memory until it's actually needed, and so on. > is that we don't want to allocate "s" on the heap, since it is small? That's one of obmalloc's goals, yes. But "small" is a relative adjective, not absolute. Because we're primarily talking about LFH here, the natural meaning for "small" in _this_ thread is < 16KB, which is much larger than "small" means to obmalloc. The memory-map example applies just well to LFH as to obmalloc, by changing which meaning for "small" you have in mind. > I guess "s" could be an object that might potentially grow. For example, list guts in Python are never handled by obmalloc, although the small fixed-size list _header_ object is always handled by obmalloc. 
>> One thing to take from that is that LFH can't be helping list-growing >> in a direct way either, if LFH (as seems likely) also needs to copy >> objects that grow in order to keep its internal memory segregated by >> size. The indirect benefit is still available, though: LFH may be >> helping simply by keeping smaller objects out of the general heap's >> hair. > So then wouldn't this mean that there would have to be some sort of > small object being allocated via the system malloc that is causing the > poor behaviour? Yes. For example, a 300-character string could do it (that's not small to obmalloc, but is to LFH). Strings produced by pickling are very often that large, and especially in Zope (which uses pickles extensively under the covers -- reading and writing persistent objects in Zope all involve pickle strings). > As you mention, I wouldn't think it would be list objects, since resizing > lists using LFH should be *worse*. Until they get to LFH's boundary for "small", and we have only the vaguest idea what Martin's app does here -- we know it grows lists containing 50K elements in the end, and ... well, that's all I really know about it . A well-known trick is applicable in that case, if Martin thinks it's worth the bother: grow the list to its final size once, at the start (overestimating if you don't know for sure). Then instead of appending, keep an index to the next free slot, same as you'd do in C. Then the list guts never move, so if that doesn't yield the same kind of speedup without using LFH, list copying wasn't actually the culprit to begin with. > That would actually be something that is worth verifying, however. Not worth the time to me -- Windows is closed-source, and I'm too old to enjoy staring at binary disassemblies any more. Besides, list guts can't stay in LFH after the list exceeds 4K elements. If list-copying costs are significant here, they're far more likely to be due to copying lists over 4K elements than under -- copying a list takes O(len(list)) time. So the realloc() strategy used by LFH _probably_ isn't of _primary)_ interest here. > It could be that the Windows LFH is extra clever? Sure -- that I doubt it moves Heaven & Earth to cater to reallocs is just educated guessing. I wrote my first production heap manager at Cray Research, around 1979 . > ... > Well, it would also be useful to find out what code is calling the > system malloc. This would make it easy to examine the code and see if > it should be calling obmalloc or the system malloc. Any good ideas for > easily obtaining this information? I imagine that some profilers must > be able to produce a complete call graph? Windows supports extensive facilities for analyzing heap usage, even from an external process that attaches to the process you want to analyze. Ditto for profiling. But it's not easy, and I don't know of any free tools that are of real help. If someone were motivated enough, it would probably be easiest to run Martin's app on a Linux box, and use the free Linux tools to analyze it. 
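In code, the trick Tim describes amounts to something like the following sketch (build_preallocated and produce are made-up names standing in for the real workload):

    def build_preallocated(n, produce):
        """Fill a list of known (or overestimated) final size without growing it."""
        result = [None] * n        # allocate the list guts once, up front
        i = 0
        for item in produce():
            result[i] = item       # fill by index instead of append()
            i += 1
        del result[i:]             # trim the overestimate, if any
        return result

If this version shows the same ~15% speedup on the ordinary heap, list reallocation really was the culprit; if it doesn't, the copying suspicion was misplaced.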
From david.ascher at gmail.com Sat Feb 19 00:08:24 2005 From: david.ascher at gmail.com (David Ascher) Date: Sat Feb 19 00:08:34 2005 Subject: [Python-Dev] Fixing _PyEval_SliceIndex so that integer-like objects can be used In-Reply-To: <42166EF6.7010600@ee.byu.edu> References: <42165A55.3000609@ee.byu.edu> <42166EF6.7010600@ee.byu.edu> Message-ID: On Fri, 18 Feb 2005 15:40:54 -0700, Travis Oliphant wrote: > Guido van Rossum wrote: > > >>Would it be possible to change > >> > >>_PyEval_SliceIndex in ceval.c > >> > >>so that rather than throwing an error if the indexing object is not an > >>integer, the code first checks to see if the object has a > >>tp_as_number->nb_int method and calls it instead. > >> > >> > > > >I don't think this is the right solution; since float has that method, > >it would allow floats to be used as slice indices, > > > > > O.K., > > then how about if arrayobjects can make it in the core, then a check for > a rank-0 integer-type > arrayobject is allowed before raising an exception? Following up on Bob's point, maybe making rank-0 integer type arrayobjects inherit from int has some mileage? Somewhat weird, but... From mal at egenix.com Sat Feb 19 00:42:35 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Sat Feb 19 00:42:42 2005 Subject: [Python-Dev] Prospective Peephole Transformation In-Reply-To: <001401c51606$7ec6cda0$803cc797@oemcomputer> References: <001401c51606$7ec6cda0$803cc797@oemcomputer> Message-ID: <42167D6B.9020606@egenix.com> Raymond Hettinger wrote: >>Wouldn't it help a lot more if the compiler would detect that >>(1,2,3) is immutable and convert it into a constant at >>compile time ?! > > > Yes. We've already gotten it to that point: > > Python 2.5a0 (#46, Feb 15 2005, 19:11:35) [MSC v.1200 32 bit (Intel)] on > win32 > >>>>import dis >>>>dis.dis(compile('x in ("xml", "html", "css")', '', 'eval')) > > 0 0 LOAD_NAME 0 (x) > 3 LOAD_CONST 3 (('xml', 'html', 'css')) > 6 COMPARE_OP 6 (in) > 9 RETURN_VALUE Cool. Does that work for all tuples in the program ? > The question is whether to go a step further to replace the linear > search with a single hashed lookup: > > 0 0 LOAD_NAME 0 (x) > 3 LOAD_CONST 3 (searchset(['xml', 'html', > 'css'])) > 6 COMPARE_OP 6 (in) > 9 RETURN_VALUE > > This situation seems to arise often in source code. You can see the > cases in the standard library with: grep 'in ("' *.py I did a search on our code and Python's std lib. It turns out that by far most such usages use either 2 or 3 values in the tuple. If you look at the types of the values, the most common usages are strings and integers. I'd assume that you'll get somewhat different results from your benchmark if you had integers in the tuple. > The transformation is easy to make at compile time. The part holding me > back is the introduction of searchset as a frozenset subtype and > teaching marshal how to put it a pyc file. Hmm, what if you'd teach tuples to do faster contains lookups for string or integer only content, e.g. by introducing sub-types for string-only and integer-only tuples ?! > FWIW, some sample timings are included below (using frozenset to > approximate what searchset would do). The summary is that the tuple > search takes .49usec plus .12usec for each item searched until a match > is found. The frozenset lookup takes a constant .53 usec. 
> > > > Raymond > > > > ------------------------------------------------------------------------ > > C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='xml'" > "x in s" > 1000000 loops, best of 9: 0.49 usec per loop > > C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='css'" > "x in s" > 1000000 loops, best of 9: 0.621 usec per loop > > C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='html'" > "x in s" > 1000000 loops, best of 9: 0.747 usec per loop > > C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='pdf'" > "x in s" > 100000 loops, best of 9: 0.851 usec per loop > > C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s > "x='xml'" "x in s" > 1000000 loops, best of 9: 0.529 usec per loop > > C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s > "x='css'" "x in s" > 1000000 loops, best of 9: 0.522 usec per loop > > C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s > "x='html'" "x in s" > 1000000 loops, best of 9: 0.53 usec per loop > > C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s > "x='pdf'" "x in s" > 1000000 loops, best of 9: 0.523 usec per loop -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 19 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From gvanrossum at gmail.com Sat Feb 19 00:49:44 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat Feb 19 00:49:47 2005 Subject: [Python-Dev] Fixing _PyEval_SliceIndex so that integer-like objects can be used In-Reply-To: References: <42165A55.3000609@ee.byu.edu> <42166EF6.7010600@ee.byu.edu> Message-ID: [Travis] > > then how about if arrayobjects can make it in the core, then a check for > > a rank-0 integer-type > > arrayobject is allowed before raising an exception? Sure, *if* you can get the premise accepted. [David] > Following up on Bob's point, maybe making rank-0 integer type > arrayobjects inherit from int has some mileage? Somewhat weird, > but... Hm, currently inheriting from int would imply that the C-level memory lay-out of the object is an extension of the built-in int type. That's probably too much of a constraint. But perhaps somehow rank-0-integer-array and int could be the same type? I don't think it would hurt too badly if an int had a method to find out its rank as an array. And I assume you can't iterate over a rank-0 array, right? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ejones at uwaterloo.ca Sat Feb 19 01:10:55 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Sat Feb 19 01:10:51 2005 Subject: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15% In-Reply-To: <1f7befae050218145157bd81c9@mail.gmail.com> References: <1f7befae050217193863ffc028@mail.gmail.com> <1f7befae050218145157bd81c9@mail.gmail.com> Message-ID: On Feb 18, 2005, at 17:51, Tim Peters wrote: > grow the list to its final size once, at the start (overestimating if > you don't know for sure). Then instead of appending, keep an index to > the next free slot, same as you'd do in C. 
Then the list guts never > move, so if that doesn't yield the same kind of speedup without using > LFH, list copying wasn't actually the culprit to begin with. If this *does* improve the performance of his application by 15%, that would strongly argue for an addition to the list API similar to Java's ArrayList.ensureCapacity or the STL's vector::reserve. Since the list implementation already maintains separate ints for the list array size and the list occupied size, this would really just expose this implementation detail to Python. I don't like revealing the implementation in this fashion, but if it does make a significant performance difference, it could be worth it. http://java.sun.com/j2se/1.5.0/docs/api/java/util/ ArrayList.html#ensureCapacity(int) http://www.sgi.com/tech/stl/Vector.html#4 Evan Jones From tim.peters at gmail.com Sat Feb 19 02:43:06 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sat Feb 19 02:43:10 2005 Subject: [Python-Dev] Re: Re: Re: Prospective Peephole Transformation In-Reply-To: References: <4215FD5F.4040605@xs4all.nl> <000101c515cc$9f96d0a0$803cc797@oemcomputer> <5.1.1.6.0.20050218103403.03869990@mail.telecommunity.com> <5.1.1.6.0.20050218113820.02f83870@mail.telecommunity.com> <5.1.1.6.0.20050218120310.03c70510@mail.telecommunity.com> Message-ID: <1f7befae050218174345e029e8@mail.gmail.com> [Phillip J. Eby] >> Still, it's rather interesting that tuple.__contains__ appears slower than a >> series of LOAD_CONST and "==" operations, considering that the tuple >> should be doing basically the same thing, only> without bytecode fetch-and- >> decode overhead. Maybe it's tuple.__contains__ that needs optimizing >> here? [Fredrik Lundh] > wouldn't be the first time... How soon we forget . Fredrik introduced a pile of optimizations special-casing the snot out of small integers into ceval.c a long time ago, like this in COMPARE_OP: case COMPARE_OP: w = POP(); v = TOP(); if (PyInt_CheckExact(w) && PyInt_CheckExact(v)) { /* INLINE: cmp(int, int) */ register long a, b; register int res; a = PyInt_AS_LONG(v); b = PyInt_AS_LONG(w); switch (oparg) { case PyCmp_LT: res = a < b; break; case PyCmp_LE: res = a <= b; break; case PyCmp_EQ: res = a == b; break; case PyCmp_NE: res = a != b; break; case PyCmp_GT: res = a > b; break; case PyCmp_GE: res = a >= b; break; case PyCmp_IS: res = v == w; break; case PyCmp_IS_NOT: res = v != w; break; default: goto slow_compare; } x = res ? Py_True : Py_False; Py_INCREF(x); } else { slow_compare: x = cmp_outcome(oparg, v, w); } That's a hell of a lot faster than tuple comparison's deferral to PyObject_RichCompareBool can be, even if we inlined the same blob inside the latter (then we'd still have the additional overhead of calling PyObject_RichCompareBool). As-is, PyObject_RichCompareBool() has to do (relatively) significant work just to out find which concrete comparision implementation to call. As a result, "i == j" in Python source code, when i and j are little ints, is much faster than comparing i and j via any other route in Python. That's mostly really good, IMO -- /F's int optimizations are of major value in real life. Context-dependent optimizations make code performance less predictable too -- that's life. 
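The practical effect on the membership-test discussion is easy to measure directly; a sketch with timeit (no numbers quoted here, since they vary by machine and build):

    import timeit

    setup = "x = 2"
    for stmt in ("x in (1, 2, 3)", "x == 1 or x == 2 or x == 3"):
        best = min(timeit.Timer(stmt, setup).repeat(3, 1000000))
        print stmt, best

For small ints the or-chain gets the inlined COMPARE_OP path on every comparison, which is what makes the expansion attractive in the integer-only case.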
From python at rcn.com Sat Feb 19 02:41:24 2005 From: python at rcn.com (Raymond Hettinger) Date: Sat Feb 19 02:45:29 2005 Subject: [Python-Dev] Prospective Peephole Transformation In-Reply-To: <42167D6B.9020606@egenix.com> Message-ID: <002401c51624$1f0ff3a0$803cc797@oemcomputer> > >>Wouldn't it help a lot more if the compiler would detect that > >>(1,2,3) is immutable and convert it into a constant at > >>compile time ?! > > > > > > Yes. We've already gotten it to that point: . . . > > Cool. Does that work for all tuples in the program ? It is limited to just tuples of constants (strings, ints, floats, complex, None, and other tuples). Also, it is limited in its ability to detect a nesting like: a=((1,2),(3,4)). One other limitation is that floats like -0.23 are not recognized as constants because the initial compilation still produces a UNARY_NEGATIVE operation: >>> dis.dis(compile('-0.23', '', 'eval')) 0 0 LOAD_CONST 0 (0.23000000000000001) 3 UNARY_NEGATIVE 4 RETURN_VALUE > I did a search on our code and Python's std lib. It turns > out that by far most such usages use either 2 or 3 > values in the tuple. If you look at the types of the > values, the most common usages are strings and integers. Right, those are the most common cases. The linear searches are ubiquitous. Here's a small selection: if comptype not in ('NONE', 'ULAW', 'ALAW', 'G722') return tail.lower() in (".py", ".pyw") assert n in (2, 3, 4, 5) if value[2] in ('F','n','N') if sectName in ("temp", "cdata", "ignore", "include", "rcdata") if not decode or encoding in ('', '7bit', '8bit', 'binary'): if (code in (301, 302, 303, 307) and m in ("GET", "HEAD") Unfortunately, there are several common patterns that are skipped because rarely changed globals/builtins cannot be treated as constants: if isinstance(x, (int, float, complex)): # types are not constants if op in (ROT_TWO, POP_TOP, LOAD_FAST): # global consts from opcode.py except (TypeError, KeyError, IndexError): # builtins are not constant > I'd assume that you'll get somewhat different results > from your benchmark if you had integers in the tuple. Nope, the results are substantially the same give or take 2usec. > Hmm, what if you'd teach tuples to do faster contains lookups for > string or integer only content, e.g. by introducing sub-types for > string-only and integer-only tuples ?! For a linear search, tuples are already pretty darned good and leave room for only microscopic O(n) improvements. The bigger win comes from using a better algorithm and data structure -- hashing beats linear search hands-down. The constant search time is faster for all n>1, resulting in much improved scalability. No tweaking of tuple.__contains__() can match it. Sets are the right data structure for fast membership testing. I would love for sets to be used internally while letting users continue to write the clean looking code shown above. Raymond From tim.peters at gmail.com Sat Feb 19 03:06:45 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sat Feb 19 03:06:48 2005 Subject: [Python-Dev] Prospective Peephole Transformation In-Reply-To: <000c01c51586$92c7dd60$3a01a044@oemcomputer> References: <000c01c51586$92c7dd60$3a01a044@oemcomputer> Message-ID: <1f7befae050218180668dad506@mail.gmail.com> [Raymond Hettinger] > ... > The problem with the transformation was that it didn't handle the case > where x was non-hashable and it would raise a TypeError instead of > returning False as it should. I'm very glad you introduced the optimization of building small constant tuples at compile-time. 
IMO, that was a pure win. I don't like this one, though. The meaning of "x in (c1, c2, ..., c_n)" is "x == c1 or x == c2 or ... or x == c_n", and a transformation that doesn't behave exactly like the latter in all cases is simply wrong. Even if x isn't hashable, it could still be of a type that implements __eq__, and where x.__eq__(c_i) returned True for some i, and then False is plainly the wrong result. It could also be that x is of a type that is hashable, but where x.__hash__() raises TypeError at this point in the code. That could be for good or bad (bug) reasons, but suppressing the TypeError and converting into False would be a bad thing regardless. > That situation arose once in the email module's test suite. I don't even care if no code in the standard library triggered a problem here: the transformation isn't semantically correct on the face of it. If we knew the type of x at compile-time, then sure, in most (almost all) cases we could know it was a safe transformation (and even without the hack to turn TypeError into False). But we don't know now, so the worst case has to be assumed: can't do this one now. Maybe someday, though. From tim.peters at gmail.com Sat Feb 19 03:24:55 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sat Feb 19 03:24:59 2005 Subject: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15% In-Reply-To: References: <1f7befae050217193863ffc028@mail.gmail.com> <1f7befae050218145157bd81c9@mail.gmail.com> Message-ID: <1f7befae050218182444fb7413@mail.gmail.com> [Tim Peters] >> grow the list to its final size once, at the start (overestimating if >> you don't know for sure). Then instead of appending, keep an index to >> the next free slot, same as you'd do in C. Then the list guts never >> move, so if that doesn't yield the same kind of speedup without using >> LFH, list copying wasn't actually the culprit to begin with. [Evan Jones] > If this *does* improve the performance of his application by 15%, that > would strongly argue for an addition to the list API similar to Java's > ArrayList.ensureCapacity or the STL's vector::reserve. Since the > list implementation already maintains separate ints for the list array > size and the list occupied size, this would really just expose this > implementation detail to Python. I don't like revealing the > implementation in this fashion, but if it does make a significant > performance difference, it could be worth it. That's a happy thought! It was first suggested for Python in 1991 , but before Python 2.4 the list implementation didn't have separate members for current size and capacity, so "can't get there from here" was the only response. It still wouldn't be trivial, because nothing in listobject.c now believes the allocated size ever needs to be preserved, and all len()-changing list operations ensure that "not too much" overallocation remains (see list_resize() in listobject.c for details). But let's see whether it would help first. 
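Back on the peephole sub-thread, the semantic trap Tim points out is easy to demonstrate with a made-up class (a sketch, not the actual case from the email test suite):

    class Chatty(list):                # lists are unhashable, but __eq__ still runs
        def __eq__(self, other):
            return other == "html"     # equal to exactly one of the constants

    x = Chatty()
    x in ("xml", "html", "css")             # True, per the or-chain semantics
    x in frozenset(["xml", "html", "css"])  # raises TypeError: unhashable

Under the proposed rewrite the first expression would behave like the second: it would raise TypeError, or, with the suppress-and-return-False fix, answer False -- and either way the correct result, True, is lost.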
From ncoghlan at iinet.net.au Sat Feb 19 05:46:32 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat Feb 19 05:46:38 2005 Subject: [Python-Dev] Proposal for a module to deal with hashing In-Reply-To: <20050217065330.GP25441@zot.electricrain.com> References: <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> <20050212133721.GA13429@rogue.amk.ca> <20050212210402.GE25441@zot.electricrain.com> <1108340374.3768.33.camel@schizo> <20050217065330.GP25441@zot.electricrain.com> Message-ID: <4216C4A8.9060408@iinet.net.au> Gregory P. Smith wrote: > fyi - i've updated the python sha1/md5 openssl patch. it now replaces > the entire sha and md5 modules with a generic hashes module that gives > access to all of the hash algorithms supported by OpenSSL (including > appropriate legacy interface wrappers and falling back to the old code > when compiled without openssl). > > https://sourceforge.net/tracker/index.php?func=detail&aid=1121611&group_id=5470&atid=305470 > > I don't quite like the module name 'hashes' that i chose for the > generic interface (too close to the builtin hash() function). Other > suggestions on a module name? 'digest' comes to mind. 'hashtools' and 'hashlib' would both have precedents in the standard library (itertools and urllib, for example). It occurs to me that such a module would provide a way to fix the bug with incorrectly hashable instances of new-style classes: Py> class C: ... def __eq__(self, other): return True ... Py> hash(C()) Traceback (most recent call last): File "", line 1, in ? TypeError: unhashable instance Py> class C(object): ... def __eq__(self, other): return True ... Py> hash(C()) 10357232 Guido wanted to fix this by eliminating object.__hash__, but that caused problems for Jython. If I remember that discussion correctly, the problem was that, in Jython, the default hash is _not_ simply hash(id(obj)) the way it is in CPython, so Python code needs a way to get access to the default implementation. A hashtools.default_hash that worked like the current object.__hash__ would seem to provide such a spelling, and allow object.__hash__ to be removed (fixing the above bug). Cheers, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From python at rcn.com Sat Feb 19 05:47:01 2005 From: python at rcn.com (Raymond Hettinger) Date: Sat Feb 19 05:54:07 2005 Subject: [Python-Dev] Prospective Peephole Transformation In-Reply-To: <1f7befae050218180668dad506@mail.gmail.com> Message-ID: <002d01c5163e$3184d720$803cc797@oemcomputer> > I'm very glad you introduced the optimization of building small > constant tuples at compile-time. IMO, that was a pure win. It's been out in the wild for a while now with no issues. I'm somewhat happy with it. > the transformation isn't semantically correct on the > face of it. Well that's the end of that. What we really need is a clean syntax for specifying a constant frozenset without compiler transformations of tuples. That would have the further advantage of letting builtins and globals be used as element values. 
if isinstance(x, {int, float, complex}): if opcode in {REPEAT, MIN_REPEAT, MAX_REPEAT}: if (code in {301, 302, 303, 307} and m in {"GET", "HEAD"}: if op in (ROT_TWO, POP_TOP, LOAD_FAST) Perhaps something other notation would be better but the idea is basically the same. Raymond From ncoghlan at iinet.net.au Sat Feb 19 06:03:27 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat Feb 19 06:03:54 2005 Subject: [Python-Dev] Requesting that a class be a new-style class Message-ID: <4216C89F.3040400@iinet.net.au> This is something I've typed way too many times: Py> class C(): File "", line 1 class C(): ^ SyntaxError: invalid syntax It's the asymmetry with functions that gets to me - defining a function with no arguments still requires parentheses in the definition statement, but defining a class with no bases requires the parentheses to be omitted. Which leads in to the real question: Does this *really* need to be a syntax error? Or could it be used as an easier way to spell "class C(object):"? Then, in Python 3K, simply drop support for omitting the parentheses from class definitions - require inheriting from ClassicClass instead. This would also have the benefit that the elimination of defaulting to classic classes would cause a syntax error rather than subtle changes in behaviour. Cheers, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From abo at minkirri.apana.org.au Sat Feb 19 06:18:00 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Sat Feb 19 06:18:10 2005 Subject: [Python-Dev] builtin_id() returns negative numbers References: <4210AFAA.9060108@thule.no><1f7befae050214074122b715a@mail.gmail.com><20050217181119.GA3055@vicky.ecs.soton.ac.uk><1f7befae050217104431312214@mail.gmail.com> <20050218113608.GB25496@vicky.ecs.soton.ac.uk> Message-ID: <024f01c51642$612a6c70$24ed0ccb@apana.org.au> From: "Armin Rigo" > Hi Tim, > > > On Thu, Feb 17, 2005 at 01:44:11PM -0500, Tim Peters wrote: > > > 256 ** struct.calcsize('P') > > > > Now if you'll just sign and fax a Zope contributor agreement, I'll > > upgrade ZODB to use this slick trick . > > I hereby donate this line of code to the public domain :-) Damn... we can't use it then! Seriously, on the Python lists there has been a discussion rejecting an md5sum implementation because the author "donated it to the public domain". Apparently lawyers have decided that you can't give code away. Intellectual charity is illegal :-) ---------------------------------------------------------------- Donovan Baarda http://minkirri.apana.org.au/~abo/ ---------------------------------------------------------------- From abo at minkirri.apana.org.au Sat Feb 19 06:38:36 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Sat Feb 19 06:38:48 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c References: <1108090248.3753.53.camel@schizo> <226e9c65e562f9b0439333053036fef3@redivi.com> <1108102539.3753.87.camel@schizo> <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> <20050212133721.GA13429@rogue.amk.ca> <20050212210402.GE25441@zot.electricrain.com> <1108340374.3768.33.camel@schizo> <20050217065330.GP25441@zot.electricrain.com> <1108699791.3758.98.camel@schizo> <4215B010.2090600@v.loewis.de> Message-ID: <027b01c51645$42262dc0$24ed0ccb@apana.org.au> From: "Martin v. 
L?wis" > Donovan Baarda wrote: > > This patch keeps the current md5c.c, md5module.c files and adds the > > following; _hashopenssl.c, hashes.py, md5.py, sha.py. > [...] > > If all we wanted to do was fix the md5 module > > If we want to fix the licensing issues with the md5 module, this patch > does not help at all, as it keeps the current md5 module (along with > its licensing issues). So any patch to solve the problem will need > to delete the code with the questionable license. It maybe half fixes it in that if Python is happy with the RSA one, they can continue to include it, and if Debian is unhappy with it, they can remove it and build against openssl. It doesn't fully fix the license problem. It is still worth considering because it doesn't make it worse, and it does allow Python to use much faster implementations and support other digest algorithms when openssl is available. > Then, the approach in the patch breaks the promise that the md5 module > is always there. It would require that OpenSSL is always there - a > promise that we cannot make (IMO). It would be better if found an alternative md5c.c. I found one that was the libmd implementation that someone mildly tweaked and then slapped an LGPL on. I have a feeling that would make the lawyers tremble more than the "public domain" libmd one, unless they are happy that someone else is prepared to wear the grief for slapping a LGPL onto something public domain. Probably the best at the moment is the sourceforge one, which is listed as having a "zlib/libpng licence". ---------------------------------------------------------------- Donovan Baarda http://minkirri.apana.org.au/~abo/ ---------------------------------------------------------------- From greg at electricrain.com Sat Feb 19 07:46:32 2005 From: greg at electricrain.com (Gregory P. Smith) Date: Sat Feb 19 07:46:35 2005 Subject: [Python-Dev] license issues with profiler.py and md5.h/md5c.c In-Reply-To: <4215B010.2090600@v.loewis.de> References: <20050211175118.GC25441@zot.electricrain.com> <00c701c5108e$f3d0b930$24ed0ccb@apana.org.au> <5d300838ef9716aeaae53579ab1f7733@redivi.com> <013501c510ae$2abd7360$24ed0ccb@apana.org.au> <20050212133721.GA13429@rogue.amk.ca> <20050212210402.GE25441@zot.electricrain.com> <1108340374.3768.33.camel@schizo> <20050217065330.GP25441@zot.electricrain.com> <1108699791.3758.98.camel@schizo> <4215B010.2090600@v.loewis.de> Message-ID: <20050219064632.GF14279@zot.electricrain.com> On Fri, Feb 18, 2005 at 10:06:24AM +0100, "Martin v. L?wis" wrote: > Donovan Baarda wrote: > >This patch keeps the current md5c.c, md5module.c files and adds the > >following; _hashopenssl.c, hashes.py, md5.py, sha.py. > [...] > >If all we wanted to do was fix the md5 module > > If we want to fix the licensing issues with the md5 module, this patch > does not help at all, as it keeps the current md5 module (along with > its licensing issues). So any patch to solve the problem will need > to delete the code with the questionable license. > > Then, the approach in the patch breaks the promise that the md5 module > is always there. It would require that OpenSSL is always there - a > promise that we cannot make (IMO). I'm aware of that. My goals are primarily to get a good openssl based hashes/digest module going to be used instead of the built in implementations when openssl available because openssl is -so- much faster. Fixing the debian instigated md5 licensing issue is secondary and is something I'll get to later on after i work on the fun stuff. 
And as Donovan has said, the patch already does present debian with the option of dropping that md5 module and using the openssl derived one instead if they're desperate. based on laziness winning and the issue being so minor i hope they just wait for a patch from me that replaces the md5c.c with one of the acceptably licensed ones for their 2.3/2.4 packages. -g From aleax at aleax.it Sat Feb 19 08:55:44 2005 From: aleax at aleax.it (Alex Martelli) Date: Sat Feb 19 08:55:48 2005 Subject: [Python-Dev] Requesting that a class be a new-style class In-Reply-To: <4216C89F.3040400@iinet.net.au> References: <4216C89F.3040400@iinet.net.au> Message-ID: <03a3f1153caf34d2d087fcc240486a24@aleax.it> On 2005 Feb 19, at 06:03, Nick Coghlan wrote: > This is something I've typed way too many times: > > Py> class C(): > File "", line 1 > class C(): > ^ > SyntaxError: invalid syntax > > It's the asymmetry with functions that gets to me - defining a > function with no arguments still requires parentheses in the > definition statement, but defining a class with no bases requires the > parentheses to be omitted. Seconded. It's always irked me enough that it's the only ``apology'' for Python syntax you'll see in the Nutshell -- top of p. 71, "The syntax of the class statement has a small, tricky difference from that of the def statement" etc. > Which leads in to the real question: Does this *really* need to be a > syntax error? Or could it be used as an easier way to spell "class > C(object):"? -0 ... instinctively, I dread the task of explaining / teaching about the rationale for this somewhat kludgy transitional solution [[empty parentheses may be written OR omitted, with large difference in meaning, not very related to other cases of such parentheses]], even though I think you're right that it would make the future transition to 3.0 somewhat safer. Alex From python at rcn.com Sat Feb 19 09:01:14 2005 From: python at rcn.com (Raymond Hettinger) Date: Sat Feb 19 09:08:54 2005 Subject: [Python-Dev] Requesting that a class be a new-style class References: <4216C89F.3040400@iinet.net.au> <03a3f1153caf34d2d087fcc240486a24@aleax.it> Message-ID: <000101c51659$b2f79e80$afbb9d8d@oemcomputer> > > This is something I've typed way too many times: > > > > Py> class C(): > > File "", line 1 > > class C(): > > ^ > > SyntaxError: invalid syntax > > > > It's the asymmetry with functions that gets to me - defining a > > function with no arguments still requires parentheses in the > > definition statement, but defining a class with no bases requires the > > parentheses to be omitted. > > Seconded. It's always irked me enough that it's the only ``apology'' > for Python syntax you'll see in the Nutshell -- top of p. 71, "The > syntax of the class statement has a small, tricky difference from that > of the def statement" etc. +1 For me, this would come-up when experimenting with mixins. Adding and removing a mixin usually entailed a corresponding change to the parentheses. Raymond From michael.walter at gmail.com Sat Feb 19 09:12:50 2005 From: michael.walter at gmail.com (Michael Walter) Date: Sat Feb 19 09:12:54 2005 Subject: [Python-Dev] Requesting that a class be a new-style class In-Reply-To: <000101c51659$b2f79e80$afbb9d8d@oemcomputer> References: <4216C89F.3040400@iinet.net.au> <03a3f1153caf34d2d087fcc240486a24@aleax.it> <000101c51659$b2f79e80$afbb9d8d@oemcomputer> Message-ID: <877e9a1705021900123c6f0ce2@mail.gmail.com> But... only as an additional option, not as a replacement, right? 
Michael On Sat, 19 Feb 2005 03:01:14 -0500, Raymond Hettinger wrote: > > > This is something I've typed way too many times: > > > > > > Py> class C(): > > > File "", line 1 > > > class C(): > > > ^ > > > SyntaxError: invalid syntax > > > > > > It's the asymmetry with functions that gets to me - defining a > > > function with no arguments still requires parentheses in the > > > definition statement, but defining a class with no bases requires the > > > parentheses to be omitted. > > > > Seconded. It's always irked me enough that it's the only ``apology'' > > for Python syntax you'll see in the Nutshell -- top of p. 71, "The > > syntax of the class statement has a small, tricky difference from that > > of the def statement" etc. > > +1 For me, this would come-up when experimenting with mixins. Adding and removing a mixin usually entailed a corresponding > change to the parentheses. > > > Raymond > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/michael.walter%40gmail.com > From fredrik at pythonware.com Sat Feb 19 10:33:59 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Feb 19 10:33:57 2005 Subject: [Python-Dev] Re: Prospective Peephole Transformation References: <4215FD5F.4040605@xs4all.nl><000101c515cc$9f96d0a0$803cc797@oemcomputer><5.1.1.6.0.20050218103403.03869990@mail.telecommunity.com><5.1.1.6.0.20050218113820.02f83870@mail.telecommunity.com><5.1.1.6.0.20050218120310.03c70510@mail.telecommunity.com> <1f7befae050218174345e029e8@mail.gmail.com> Message-ID: Tim Peters wrote: > [Fredrik Lundh] >> wouldn't be the first time... > > How soon we forget . oh, that was in the dark ages of Python 1.4. I've rebooted myself many times since then... > Fredrik introduced a pile of optimizations special-casing the snot out > of small integers into ceval.c a long time ago iirc, you claimed that after a couple of major optimizations had been added, "there's no single optimization left that can speed up pystone by more than X%", so I came up with an "(X+2)%" optimization. you should do that more often ;-) > As a result, "i == j" in Python source code, when i and j are little > ints, is much faster than comparing i and j via any other route in > Python. which explains why my "in" vs. "or" tests showed good results for integers, but not for strings... I'd say that this explains why it would still make sense to let the code generator change "x in (a, b, c)" to "x == a or x == b or x == c", as long as a, b, and c are all integers. (see my earlier timeit results) From fredrik at pythonware.com Sat Feb 19 10:40:16 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Feb 19 10:40:11 2005 Subject: [Python-Dev] Re: builtin_id() returns negative numbers References: <4210AFAA.9060108@thule.no><1f7befae050214074122b715a@mail.gmail.com><20050217181119.GA3055@vicky.ecs.soton.ac.uk><1f7befae050217104431312214@mail.gmail.com><20050218113608.GB25496@vicky.ecs.soton.ac.uk> <024f01c51642$612a6c70$24ed0ccb@apana.org.au> Message-ID: Donovan Baarda wrote: > Apparently lawyers have decided that you can't give code away. Intellectual > charity is illegal :-) what else would a lawyer say? do you really expect lawyers to admit that there are ways to do things that don't involve lawyers? 
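The integer-only case /F describes is easy enough to count in a given code base; a throwaway sketch, nothing rigorous:

    import re, glob

    pattern = re.compile(r'\bin\s*\(\s*\d+\s*(?:,\s*\d+\s*)+\)')
    total = 0
    for name in glob.glob('*.py'):
        total += len(pattern.findall(open(name).read()))
    print total, 'integer-tuple membership tests'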
From martin at v.loewis.de Sat Feb 19 11:47:13 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Feb 19 11:47:15 2005 Subject: [Python-Dev] Re: Prospective Peephole Transformation In-Reply-To: References: <4215FD5F.4040605@xs4all.nl><000101c515cc$9f96d0a0$803cc797@oemcomputer><5.1.1.6.0.20050218103403.03869990@mail.telecommunity.com><5.1.1.6.0.20050218113820.02f83870@mail.telecommunity.com><5.1.1.6.0.20050218120310.03c70510@mail.telecommunity.com> <1f7befae050218174345e029e8@mail.gmail.com> Message-ID: <42171931.4020600@v.loewis.de> Fredrik Lundh wrote: > I'd say that this explains why it would still make sense to let the code generator change > "x in (a, b, c)" to "x == a or x == b or x == c", as long as a, b, and c are all integers. How often does that happen in real code? Regards, Martin From martin at v.loewis.de Sat Feb 19 11:54:06 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Feb 19 11:54:09 2005 Subject: [Python-Dev] builtin_id() returns negative numbers In-Reply-To: <024f01c51642$612a6c70$24ed0ccb@apana.org.au> References: <4210AFAA.9060108@thule.no><1f7befae050214074122b715a@mail.gmail.com><20050217181119.GA3055@vicky.ecs.soton.ac.uk><1f7befae050217104431312214@mail.gmail.com> <20050218113608.GB25496@vicky.ecs.soton.ac.uk> <024f01c51642$612a6c70$24ed0ccb@apana.org.au> Message-ID: <42171ACE.9020502@v.loewis.de> Donovan Baarda wrote: > Seriously, on the Python lists there has been a discussion rejecting an > md5sum implementation because the author "donated it to the public domain". > Apparently lawyers have decided that you can't give code away. Intellectual > charity is illegal :-) Despite the smiley: It is not illegal - it just does not have any legal effect. Just by saying "I am the chancellor of Germany", it does not make you the chancellor of Germany; instead, you need to go through the election processes. Likewise, saying "the public can have my code" does not make it so. Instead, you have to formulate a license that permits the public to do with the code what you think it should be allowed to do. Most people who've used the term "public domain" in the past didn't really care whether they still have the copyright - what they wanted to say is that anybody can use their work for any purpose. Regards, Martin From mal at egenix.com Sat Feb 19 13:06:37 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Sat Feb 19 13:06:40 2005 Subject: [Python-Dev] Prospective Peephole Transformation In-Reply-To: <002401c51624$1f0ff3a0$803cc797@oemcomputer> References: <002401c51624$1f0ff3a0$803cc797@oemcomputer> Message-ID: <42172BCD.2010807@egenix.com> Raymond Hettinger wrote: >>Hmm, what if you'd teach tuples to do faster contains lookups for >>string or integer only content, e.g. by introducing sub-types for >>string-only and integer-only tuples ?! > > > For a linear search, tuples are already pretty darned good and leave > room for only microscopic O(n) improvements. The bigger win comes from > using a better algorithm and data structure -- hashing beats linear > search hands-down. The constant search time is faster for all n>1, > resulting in much improved scalability. No tweaking of > tuple.__contains__() can match it. > > Sets are the right data structure for fast membership testing. I would > love for sets to be used internally while letting users continue to > write the clean looking code shown above. 
That's what I was thinking off: if the compiler can detect the constant nature and the use of a common type, it could set a flag in the tuple type telling it about this feature. The tuple could then convert the tuple contents to a set internally and when the __contains__ hook is first called and use the set for the lookup. Alternatively, you could use a sub-type for a few common cases. In either case you would have to teach marshal how to treat the extra bit of information. The user won't notice all this in the Python program and can continue to write clean code (in some cases, even cleaner code than before - I usually use the keyword hack to force certain things into the locals at module load time, but would love to get rid off this). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 19 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From aahz at pythoncraft.com Sat Feb 19 16:11:46 2005 From: aahz at pythoncraft.com (Aahz) Date: Sat Feb 19 16:11:48 2005 Subject: [Python-Dev] Re: Prospective Peephole Transformation In-Reply-To: <42171931.4020600@v.loewis.de> References: <1f7befae050218174345e029e8@mail.gmail.com> <42171931.4020600@v.loewis.de> Message-ID: <20050219151146.GA4837@panix.com> On Sat, Feb 19, 2005, "Martin v. L?wis" wrote: > Fredrik Lundh wrote: >> >>I'd say that this explains why it would still make sense to let the code >>generator change >>"x in (a, b, c)" to "x == a or x == b or x == c", as long as a, b, and c >>are all integers. > > How often does that happen in real code? Dunno how often, but I was working on some code at my company yesterday that did that -- we use a lot of ints to indicate options. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR From mwh at python.net Sat Feb 19 21:27:13 2005 From: mwh at python.net (Michael Hudson) Date: Sat Feb 19 21:27:16 2005 Subject: [Python-Dev] Requesting that a class be a new-style class In-Reply-To: <4216C89F.3040400@iinet.net.au> (Nick Coghlan's message of "Sat, 19 Feb 2005 15:03:27 +1000") References: <4216C89F.3040400@iinet.net.au> Message-ID: <2mpsywxplq.fsf@starship.python.net> Nick Coghlan writes: > This is something I've typed way too many times: > > Py> class C(): > File "", line 1 > class C(): > ^ > SyntaxError: invalid syntax > > It's the asymmetry with functions that gets to me - defining a > function with no arguments still requires parentheses in the > definition statement, but defining a class with no bases requires the > parentheses to be omitted. Yeah, this has annoyed me for ages too. However! You obviously haven't read Misc/HISTORY recently enough :) The surprising thing is that "class C():" used to work (in fact before 0.9.4 the parens mandatory). It became a syntax error in 0.9.9, seemingly because Guido was peeved that people hadn't updated all their old code to the new syntax. I wonder if he'd like to try that trick again today :) I'd still vote for it to be changed. 
> Which leads in to the real question: Does this *really* need to be a > syntax error? Or could it be used as an easier way to spell "class > C(object):"? -1. Too magical, too opaque. > Then, in Python 3K, simply drop support for omitting the parentheses > from class definitions - require inheriting from ClassicClass > instead. HISTORY repeats itself... Cheers, mwh -- [Perl] combines all the worst aspects of C and Lisp: a billion different sublanguages in one monolithic executable. It combines the power of C with the readability of PostScript. -- Jamie Zawinski From reinhold-birkenfeld-nospam at wolke7.net Sun Feb 20 00:26:36 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Sun Feb 20 00:26:09 2005 Subject: [Python-Dev] Some old patches Message-ID: Hello, this time working up some of the patches with beards: - #751943 Adds the display of the line number to cgitb stack traces even when the source code is not available to cgitb. This makes sense in the case that the source is lying around somewhere else. However, the original patch generates a link to "file://?" on the occasion that the source file name is not known. I have created a new patch (#1144549) that fixes this, and also renames all local variables "file" in cgitb to avoid builtin shadowing. - #749830 Allows the mmap call on UNIX to be supplied a length argument of 0 to mmap the whole file (which is already implemented on Windows). However, the patch doesn't apply on current CVS, so I made a new patch (#1144555) that does. Recommend apply, unless this may cause problems on some Unices which I don't know about. - #547176 Allows the rlcompleter to complete on [] item access (constructs like sim[0]. could then be completed). As comments in the patch point out, this easily leads to execution of arbitrary code via __getitem__, which is IMHO a too big side effect of completing (though IPython does this). Recommend reject. - #645894 Allows the use of resource.getrusage time values for profile.py, which results in better timing resolution on FreeBSD. However, this may lead to worse timing resolution on other OS, so perhaps the patch should be changed to be restricted to this particular platform. - #697613 -- bug #670311 This handles the problem that python -i exits on SystemExit exceptions by introducting two new API functions. While it works for me, I am not sure whether this is too much overhead for fixing a glitch no one else complained about. - #802188 This adds a specific error message for invalid tokens after a '\' used as line continuation. While it may be helpful when the invalid token is whitespace, Python usually shows the exact location of the invalid token, so you can examine this line and find the error. On the other hand, the patch is no big deal, so if a specific error message is welcome, it may as well be applied. Enough for today... and best of all: I have no patch which I want to promote! 
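For the mmap item (#749830, refreshed as #1144555), the behaviour being requested is just the Windows-style whole-file mapping on Unix as well; with the patch applied, a sketch like this works (the file name is made up):

    import mmap, os

    f = open('data.bin', 'r+b')
    m = mmap.mmap(f.fileno(), 0)     # length 0: map the whole file
    assert len(m) == os.fstat(f.fileno()).st_size
    m.close()
    f.close()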
Reinhold From gvanrossum at gmail.com Sun Feb 20 02:08:09 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun Feb 20 02:08:15 2005 Subject: [Python-Dev] Requesting that a class be a new-style class In-Reply-To: <2mpsywxplq.fsf@starship.python.net> References: <4216C89F.3040400@iinet.net.au> <2mpsywxplq.fsf@starship.python.net> Message-ID: > > This is something I've typed way too many times: > > > > Py> class C(): > > File "", line 1 > > class C(): > > ^ > > SyntaxError: invalid syntax > > > > It's the asymmetry with functions that gets to me - defining a > > function with no arguments still requires parentheses in the > > definition statement, but defining a class with no bases requires the > > parentheses to be omitted. It's fine to fix this in 2.5. I guess I can add this to my list of early oopsies -- although to the very bottom. :-) It's *not* fine to make C() mean C(object). (We already have enough other ways to declaring new-style classes.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at iinet.net.au Sun Feb 20 03:13:25 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sun Feb 20 03:13:31 2005 Subject: [Python-Dev] Requesting that a class be a new-style class In-Reply-To: References: <4216C89F.3040400@iinet.net.au> <2mpsywxplq.fsf@starship.python.net> Message-ID: <4217F245.2020004@iinet.net.au> Guido van Rossum wrote: >>>This is something I've typed way too many times: >>> >>>Py> class C(): >>> File "", line 1 >>> class C(): >>> ^ >>>SyntaxError: invalid syntax >>> >>>It's the asymmetry with functions that gets to me - defining a >>>function with no arguments still requires parentheses in the >>>definition statement, but defining a class with no bases requires the >>>parentheses to be omitted. > > > It's fine to fix this in 2.5. I guess I can add this to my list of > early oopsies -- although to the very bottom. :-) > > It's *not* fine to make C() mean C(object). (We already have enough > other ways to declaring new-style classes.) > Fair enough - the magnitude of the semantic difference between "class C:" and "class C():" bothered me a little, too. I'll just have to remember that I can put "__metaclass__ == type" at the top of modules :) Cheers, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From jack at performancedrivers.com Sun Feb 20 04:35:38 2005 From: jack at performancedrivers.com (Jack Diederich) Date: Sun Feb 20 04:35:42 2005 Subject: [Python-Dev] Requesting that a class be a new-style class In-Reply-To: <4217F245.2020004@iinet.net.au> References: <4216C89F.3040400@iinet.net.au> <2mpsywxplq.fsf@starship.python.net> <4217F245.2020004@iinet.net.au> Message-ID: <20050220033538.GF9263@performancedrivers.com> On Sun, Feb 20, 2005 at 12:13:25PM +1000, Nick Coghlan wrote: > Guido van Rossum wrote: > >>>This is something I've typed way too many times: > >>> > >>>Py> class C(): > >>> File "", line 1 > >>> class C(): > >>> ^ > >>>SyntaxError: invalid syntax > >>> > >>>It's the asymmetry with functions that gets to me - defining a > >>>function with no arguments still requires parentheses in the > >>>definition statement, but defining a class with no bases requires the > >>>parentheses to be omitted. > > > > > >It's fine to fix this in 2.5. I guess I can add this to my list of > >early oopsies -- although to the very bottom. :-) > > > >It's *not* fine to make C() mean C(object). 
(We already have enough > >other ways to declaring new-style classes.) > > > > Fair enough - the magnitude of the semantic difference between "class C:" > and "class C():" bothered me a little, too. I'll just have to remember that > I can put "__metaclass__ == type" at the top of modules :) I always use new style classes so I only have to remember one set of behaviors. "__metaclass__ = type" is warty, it has the "action at a distance" problem that decorators solve for functions. I didn't dig into the C but does having 'type' as metaclass guarantee the same behavior as inheriting 'object' or does object provide something type doesn't? *wince* Py3k? Faster please[*]. -Jack * a US-ism of a conservative bent, loosely translated as "change for the better? I'll get behind that." From python at rcn.com Sun Feb 20 04:46:40 2005 From: python at rcn.com (Raymond Hettinger) Date: Sun Feb 20 04:51:42 2005 Subject: [Python-Dev] Requesting that a class be a new-style class In-Reply-To: Message-ID: <001301c516fe$ed674700$f33ec797@oemcomputer> > > > This is something I've typed way too many times: > > > > > > Py> class C(): > > > File "", line 1 > > > class C(): > > > ^ > > > SyntaxError: invalid syntax > > > > > > It's the asymmetry with functions that gets to me - defining a > > > function with no arguments still requires parentheses in the > > > definition statement, but defining a class with no bases requires the > > > parentheses to be omitted. > > It's fine to fix this in 2.5. Yea! Raymond From raymond.hettinger at verizon.net Sun Feb 20 05:20:25 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sun Feb 20 05:24:29 2005 Subject: [Python-Dev] UserString Message-ID: <000001c51703$80f97520$f33ec797@oemcomputer> I noticed that UserString objects have methods that do not accept other UserString objects as arguments: >>> from UserString import UserString >>> UserString('slartibartfast').count(UserString('a')) Traceback (most recent call last): File "", line 1, in -toplevel- UserString('slartibartfast').count(UserString('a')) File "C:\PY24\lib\UserString.py", line 66, in count return self.data.count(sub, start, end) TypeError: expected a character buffer object >>> UserString('abc') in UserString('abcde') Traceback (most recent call last): File "", line 1, in -toplevel- UserString('abc') in UserString('abcde') File "C:\PY24\lib\UserString.py", line 35, in __contains__ return char in self.data TypeError: 'in ' requires string as left operand This sort of thing is easy to test for and easy to fix. The question is whether we care about updating this module anymore or is it a relic. Also, is the use case one that we care about. AFAICT, this has never come up before. Raymond From gvanrossum at gmail.com Sun Feb 20 06:33:31 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun Feb 20 06:33:38 2005 Subject: [Python-Dev] Requesting that a class be a new-style class In-Reply-To: <20050220033538.GF9263@performancedrivers.com> References: <4216C89F.3040400@iinet.net.au> <2mpsywxplq.fsf@starship.python.net> <4217F245.2020004@iinet.net.au> <20050220033538.GF9263@performancedrivers.com> Message-ID: > I didn't dig into the C but does having 'type' > as metaclass guarantee the same behavior as inheriting 'object' or does object > provide something type doesn't? *wince* No, they're equivalent. __metaclass__ = type cause the base class to be object, and a base class of object causes the metaclass to be type. But I agree wholeheartedly: class C(object): is much preferred. 
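A small check of that equivalence (Python 2.x semantics, written with asserts so no session-dependent output is involved): both spellings give a new-style class whose only base is object and whose metaclass is type.

    class A:
        __metaclass__ = type

    class B(object):
        pass

    assert A.__bases__ == (object,)
    assert B.__bases__ == (object,)
    assert type(A) is type(B) is type
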
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax at aleax.it Sun Feb 20 09:15:25 2005 From: aleax at aleax.it (Alex Martelli) Date: Sun Feb 20 09:15:29 2005 Subject: [Python-Dev] Requesting that a class be a new-style class In-Reply-To: <20050220033538.GF9263@performancedrivers.com> References: <4216C89F.3040400@iinet.net.au> <2mpsywxplq.fsf@starship.python.net> <4217F245.2020004@iinet.net.au> <20050220033538.GF9263@performancedrivers.com> Message-ID: <243fad4f779b2c979e1aa71fd866cda1@aleax.it> On 2005 Feb 20, at 04:35, Jack Diederich wrote: > I always use new style classes so I only have to remember one set of > behaviors. I agree: that's reason #1 I recommend always using new-style whenever I teach / tutor / mentor in Python nowadays. > "__metaclass__ = type" is warty, it has the "action at a distance" > problem that > decorators solve for functions. I disagree. I view it as akin to a "from __future__ import" except that -- since the compiler doesn't need-to-know, as typeclass-picking happens at runtime -- it was accomplished by less magical and more flexible means. > I didn't dig into the C but does having 'type' > as metaclass guarantee the same behavior as inheriting 'object' or > does object > provide something type doesn't? *wince* I believe the former holds, since for example: >>> class X: __metaclass__ = type ... >>> X.__bases__ (,) If you're making a newstyle class with an oldstyle base, it's different: >>> class Y: pass ... >>> class X(Y): __metaclass__ = type ... Traceback (most recent call last): File "", line 1, in ? TypeError: Error when calling the metaclass bases a new-style class can't have only classic bases in this case, you do need to inherit object explicitly: >>> class X(Y, object): pass ... >>> X.__bases__ (, ) >>> type(X) This is because types.ClassType turns somersaults to enable this: in this latter construct, Python's mechanisms determine ClassType as the metaclass (it's the metaclass of the first base class), but then ClassType in turn sniffs around for another metaclass to delegate to, among the supplied bases, and having found one washes its hands of the whole business;-). Alex From aleax at aleax.it Sun Feb 20 09:32:35 2005 From: aleax at aleax.it (Alex Martelli) Date: Sun Feb 20 09:32:43 2005 Subject: [Python-Dev] UserString In-Reply-To: <000001c51703$80f97520$f33ec797@oemcomputer> References: <000001c51703$80f97520$f33ec797@oemcomputer> Message-ID: <0f5201ccd99380eeac0400da69d6d9f7@aleax.it> On 2005 Feb 20, at 05:20, Raymond Hettinger wrote: ... > This sort of thing is easy to test for and easy to fix. The question > is > whether we care about updating this module anymore or is it a relic. > Also, is the use case one that we care about. AFAICT, this has never > come up before. I did have some issues w/UserString at a client's, but that was connected to some code doing type-checking (and was fixed by injecting basestring as a base of the client's subclass of UserString and ensuring the type-checking always used isinstance and basestring). My two cents: a *mixin* to make it easy to emulate full-fledged strings would be almost as precious as your DictMixin (ones to emulate lists, sets, files [w/buffering], ..., might be even more useful). The point is all of these rich interfaces have a lot of redundancy and a mixin can provide all methods generically based on a few fundamental methods, which can be quite useful, just like DictMixin. 
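As a concrete, purely illustrative sketch of that idea -- the names StringMixin and Tag are invented here, this is not an existing module -- a subclass would only have to supply __str__ and the mixin would derive the rest of the str interface from it, much as DictMixin does for mappings:

    class StringMixin:
        def _as_str(self):
            return str(self)               # the one "fundamental" operation
        def __len__(self):
            return len(self._as_str())
        def __contains__(self, sub):
            return self._as_str().find(str(sub)) >= 0
        def count(self, sub, *args):
            return self._as_str().count(str(sub), *args)
        def upper(self):
            return self._as_str().upper()
        def lower(self):
            return self._as_str().lower()
        # ... and so on for the rest of the str interface

    class Tag(StringMixin):
        def __init__(self, text):
            self.text = text
        def __str__(self):
            return self.text

    print Tag('slartibartfast').count(Tag('a'))   # 3
    print Tag('abc') in Tag('abcde')              # True

Because the mixin converts to a real str before delegating, it never hands a wrapper object to a str method, so the TypeErrors shown earlier in the thread don't arise.
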
But a complete emulation of strings (etc) is mostly of "didactical" use, a sort of checklist to help ensure one implements all methods, not really useful for new code "in production"; at least, I haven't found such uses recently. The above-mentioned client's class was an attempt to join RE functionality to strings and was a rather messy hack anyway, for example (perhaps prompted by client's previous familiarity with Perl, I'm not sure); at any rate, the client should probably have subclassed str or unicode if he really wanted that hack. I can't think of a GOOD use for UserString (etc) since subclassing str (etc) was allowed in 2.2 or at least since a few loose ends about newstyle classes were neatly tied up in 2.3. If we do decide "it is a relic, no more updates" perhaps some indication of deprecation would be warranted. ((In any case, I do think the mixins would be useful)). Alex From mwh at python.net Sun Feb 20 10:38:29 2005 From: mwh at python.net (Michael Hudson) Date: Sun Feb 20 10:38:31 2005 Subject: [Python-Dev] Requesting that a class be a new-style class In-Reply-To: <243fad4f779b2c979e1aa71fd866cda1@aleax.it> (Alex Martelli's message of "Sun, 20 Feb 2005 09:15:25 +0100") References: <4216C89F.3040400@iinet.net.au> <2mpsywxplq.fsf@starship.python.net> <4217F245.2020004@iinet.net.au> <20050220033538.GF9263@performancedrivers.com> <243fad4f779b2c979e1aa71fd866cda1@aleax.it> Message-ID: <2mvf8nwoyy.fsf@starship.python.net> Alex Martelli writes: > On 2005 Feb 20, at 04:35, Jack Diederich wrote: > >> I didn't dig into the C but does having 'type' >> as metaclass guarantee the same behavior as inheriting 'object' or >> does object >> provide something type doesn't? *wince* > > I believe the former holds, since for example: I was going to say that 'type(object) is type' is everything you need to know, but you also need the bit of code in type_new that replaces an empty bases tuple with (object,) -- but class C: __metaclass__ = Type and class C(object): pass produce identical classes. > This is because types.ClassType turns somersaults to enable this: in > this latter construct, Python's mechanisms determine ClassType as the > metaclass (it's the metaclass of the first base class), but then > ClassType in turn sniffs around for another metaclass to delegate to, > among the supplied bases, and having found one washes its hands of the > whole business;-). It's also notable that type_new does exactly the same thing! Cheers, mwh -- Jokes around here tend to get followed by implementations. -- from Twisted.Quotes From fredrik at pythonware.com Sun Feb 20 13:07:17 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun Feb 20 13:07:21 2005 Subject: [Python-Dev] Re: Re: Prospective Peephole Transformation References: <4215FD5F.4040605@xs4all.nl><000101c515cc$9f96d0a0$803cc797@oemcomputer><5.1.1.6.0.20050218103403.03869990@mail.telecommunity.com><5.1.1.6.0.20050218113820.02f83870@mail.telecommunity.com><5.1.1.6.0.20050218120310.03c70510@mail.telecommunity.com> <1f7befae050218174345e029e8@mail.gmail.com> <42171931.4020600@v.loewis.de> Message-ID: Martin v. Löwis wrote: >> I'd say that this explains why it would still make sense to let the code generator change >> "x in (a, b, c)" to "x == a or x == b or x == c", as long as a, b, and c are all integers. > > How often does that happen in real code? 
don't know, but it happens: [fredrik@brain Python-2.4]$ grep "if.*in *([0-9]" Lib/*.py Lib/BaseHTTPServer.py: if self.command != 'HEAD' and code >= 200 and code not in (204, 304): Lib/asyncore.py: if err in (0, EISCONN): Lib/mimify.py: if len(args) not in (0, 1, 2): Lib/sunau.py: if nchannels not in (1, 2, 4): Lib/sunau.py: if sampwidth not in (1, 2, 4): Lib/urllib2.py: if code not in (200, 206): Lib/urllib2.py: if (code in (301, 302, 303, 307) and m in ("GET", "HEAD") Lib/whichdb.py: if magic in (0x00061561, 0x61150600): Lib/whichdb.py: if magic in (0x00061561, 0x61150600): [fredrik@brain Python-2.4]$ grep "if.*in *\[[0-9]" Lib/*.py Lib/decimal.py: if value[0] not in [0,1]: Lib/smtplib.py: if code not in [235, 503]: judging from the standard library, "string in string tuple/list" is a lot more common. From raymond.hettinger at verizon.net Sun Feb 20 16:39:24 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sun Feb 20 16:43:33 2005 Subject: [Python-Dev] Store x Load x --> DupStore Message-ID: <000101c51762$5b8369e0$7c1cc797@oemcomputer> Any objections to new peephole transformation that merges a store/load pair into a single step? There is a tested patch at: www.python.org/sf/1144842 It folds the two steps into a new opcode. In the case of store_name/load_name, it saves one three byte instruction, a trip around the eval-loop, two stack mutations, a incref/decref pair, a dictionary lookup, and an error check (for the lookup). While it acts like a dup followed by a store, it is implemented more simply as a store that doesn't pop the stack. The transformation is broadly applicable and occurs thousands of times in the standard library and test suite. Raymond Hettinger From gvanrossum at gmail.com Sun Feb 20 17:06:28 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun Feb 20 17:06:33 2005 Subject: [Python-Dev] UserString In-Reply-To: <0f5201ccd99380eeac0400da69d6d9f7@aleax.it> References: <000001c51703$80f97520$f33ec797@oemcomputer> <0f5201ccd99380eeac0400da69d6d9f7@aleax.it> Message-ID: [Alex] > I did have some issues w/UserString at a client's, but that was > connected to some code doing type-checking (and was fixed by injecting > basestring as a base of the client's subclass of UserString and > ensuring the type-checking always used isinstance and basestring). Oh, bah. That's not what basestring was for. I can't blame you or your client, but my *intention* was that basestring would *only* be the base of the two *real* built-in string types (str and unicode). The reason for its existence was that some low-level built-in (or extension) operations only accept those two *real* string types and consequently some user code might want to validate ("look before you leap") its own arguments if those eventually ended up being passed to aforementioned low-level built-in code. My intention was always that UserString and other string-like objects would explicitly *not* inherit from basestring. Of course, my intention was lost, your client used basestring to mean "any string-ish object", got away with it because they weren't using any of those low-level built-ins, and you had to comply rather than explain it to them. Sounds like a good reason to add interfaces to the language. 
:-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Sun Feb 20 17:17:15 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun Feb 20 17:17:17 2005 Subject: [Python-Dev] Store x Load x --> DupStore In-Reply-To: <000101c51762$5b8369e0$7c1cc797@oemcomputer> References: <000101c51762$5b8369e0$7c1cc797@oemcomputer> Message-ID: > Any objections to new peephole transformation that merges a store/load > pair into a single step? > > There is a tested patch at: www.python.org/sf/1144842 > > It folds the two steps into a new opcode. In the case of > store_name/load_name, it saves one three byte instruction, a trip around > the eval-loop, two stack mutations, a incref/decref pair, a dictionary > lookup, and an error check (for the lookup). While it acts like a dup > followed by a store, it is implemented more simply as a store that > doesn't pop the stack. The transformation is broadly applicable and > occurs thousands of times in the standard library and test suite. What exactly are you trying to accomplish? Do you have examples of code that would be sped up measurably by this transformation? Does anybody care about those speedups even if they *are* measurable? I'm concerned that there's too much hacking of the VM going on with too little benefit. The VM used to be relatively simple code that many people could easily understand. The benefit of that was that new language features could be implemented relatively easily even by relatively inexperienced developers. All that seems to be lost, and I fear that the end result is going to be a calcified VM that's only 10% faster than the original, since we appear to have reached the land of diminishing returns here. I don't see any concentrated efforts trying to figure out where the biggest pain is and how to relieve it; rather, it looks as if the easiest targets are being approached. Now, if these were low-hanging fruit, I'd happily agree, but I'm not so sure that they are all that valuable. Where are the attempts to speed up function/method calls? That's an area where we could *really* use a breakthrough... Eventually we'll need a radically different approach, maybe PyPy, maybe Starkiller. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax at aleax.it Sun Feb 20 17:41:31 2005 From: aleax at aleax.it (Alex Martelli) Date: Sun Feb 20 17:41:36 2005 Subject: [Python-Dev] UserString In-Reply-To: References: <000001c51703$80f97520$f33ec797@oemcomputer> <0f5201ccd99380eeac0400da69d6d9f7@aleax.it> Message-ID: On 2005 Feb 20, at 17:06, Guido van Rossum wrote: > [Alex] >> I did have some issues w/UserString at a client's, but that was >> connected to some code doing type-checking (and was fixed by injecting >> basestring as a base of the client's subclass of UserString and >> ensuring the type-checking always used isinstance and basestring). > > Oh, bah. That's not what basestring was for. I can't blame you or your > client, but my *intention* was that basestring would *only* be the > base of the two *real* built-in string types (str and unicode). The > reason for its existence was that some low-level built-in (or > extension) operations only accept those two *real* string types and > consequently some user code might want to validate ("look before you > leap") its own arguments if those eventually ended up being passed to > aforementioned low-level built-in code. My intention was always that > UserString and other string-like objects would explicitly *not* > inherit from basestring. 
Of course, my intention was lost, your client > used basestring to mean "any string-ish object", got away with it > because they weren't using any of those low-level built-ins, and you > had to comply rather than explain it to them. I would gladly have explained, if I had understood your design intent correctly at the time (whether the explanation would have done much good is another issue); but I'm afraid I didn't. Now I do (thanks for explaining!) though I'm not sure what can be done in retrospect to communicate it more widely. The need to check "is this thingy here string-like" is sort of frequent, because strings are sequences which, when iterated on, yield sequences (strings of length 1) which, when iterated on, yield sequences ad infinitum. Strings are sequences but more often than not one wants to treat them as "scalars" instead. isinstance and basestring allow that frequently needed check so nicely, that, if they're not intended for it, they're an "attractive nuisance" legally;-). The need to make stringlike thingies emerges both for bad reasons (e.g., I never liked that client's "string cum re" perloidism) and good ones (e.g., easing the interfacing with external frameworks that have their own stringythings, such as Qt's QtString); and checking if something is stringlike is also frequent, as per previous para. Darn... > Sounds like a good reason to add interfaces to the language. :-) If an interface must be usable to say "is this string-like?" it will have to be untyped, I guess, and the .translate method will be a small problem (one-argument for unicode, two-args for str, and very different argument semantics) -- don't recall offhand if there are other such nonpolymorphic methods there. Alex From pje at telecommunity.com Sun Feb 20 18:37:41 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Feb 20 18:34:59 2005 Subject: [Python-Dev] Store x Load x --> DupStore In-Reply-To: References: <000101c51762$5b8369e0$7c1cc797@oemcomputer> <000101c51762$5b8369e0$7c1cc797@oemcomputer> Message-ID: <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> At 08:17 AM 2/20/05 -0800, Guido van Rossum wrote: >Where are the attempts to speed up function/method calls? That's an >area where we could *really* use a breakthrough... Amen! So what happened to Armin's pre-allocated frame patch? Did that get into 2.4? Also, does anybody know where all the time goes in a function call, anyway? I assume that some of the pieces are: * tuple/dict allocation for arguments (but some of this is bypassed on the fast branch for Python-to-Python calls, right?) * frame allocation and setup (but Armin's patch was supposed to eliminate most of this whenever a function isn't being used re-entrantly) * argument "parsing" (check number of args, map kwargs to their positions, etc.; but isn't some of this already fast-pathed for Python-to-Python calls?) I suppose the fast branch fixes don't help special methods like __getitem__ et al, since those don't go through the fast branch, but I don't think those are the majority of function calls. And whatever happened to CALL_METHOD? Do we need a tp_callmethod that takes an argument array, length, and keywords, so that we can skip instancemethod allocation in the common case of calling a method directly? From pje at telecommunity.com Sun Feb 20 18:15:44 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Sun Feb 20 18:35:09 2005 Subject: [Python-Dev] Requesting that a class be a new-style class In-Reply-To: <243fad4f779b2c979e1aa71fd866cda1@aleax.it> References: <20050220033538.GF9263@performancedrivers.com> <4216C89F.3040400@iinet.net.au> <2mpsywxplq.fsf@starship.python.net> <4217F245.2020004@iinet.net.au> <20050220033538.GF9263@performancedrivers.com> Message-ID: <5.1.1.6.0.20050220121233.021107a0@mail.telecommunity.com> At 09:15 AM 2/20/05 +0100, Alex Martelli wrote: >This is because types.ClassType turns somersaults to enable this: in this >latter construct, Python's mechanisms determine ClassType as the metaclass >(it's the metaclass of the first base class), but then ClassType in turn >sniffs around for another metaclass to delegate to, among the supplied >bases, and having found one washes its hands of the whole business;-). To be pedantic, the actual algorithm in 2.2+ has nothing to do with the first base class; that's the pre-2.2 algorithm. The 2.2 algorithm looks for the most-derived metaclass of the base classes, and simply ignores classic bases altogether. From martin at v.loewis.de Sun Feb 20 18:41:19 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Feb 20 18:41:22 2005 Subject: [Python-Dev] Store x Load x --> DupStore In-Reply-To: References: <000101c51762$5b8369e0$7c1cc797@oemcomputer> Message-ID: <4218CBBF.8030400@v.loewis.de> Guido van Rossum wrote: > I'm concerned that there's too much hacking of the VM going on with > too little benefit. I completely agree. It would be so much more useful if people tried to fix the bugs that have been reported. Regards, Martin From mwh at python.net Sun Feb 20 19:38:39 2005 From: mwh at python.net (Michael Hudson) Date: Sun Feb 20 19:38:40 2005 Subject: [Python-Dev] Store x Load x --> DupStore In-Reply-To: (Guido van Rossum's message of "Sun, 20 Feb 2005 08:17:15 -0800") References: <000101c51762$5b8369e0$7c1cc797@oemcomputer> Message-ID: <2mr7jbvzyo.fsf@starship.python.net> Guido van Rossum writes: >> Any objections to new peephole transformation that merges a store/load >> pair into a single step? >> >> There is a tested patch at: www.python.org/sf/1144842 >> >> It folds the two steps into a new opcode. In the case of >> store_name/load_name, it saves one three byte instruction, a trip around >> the eval-loop, two stack mutations, a incref/decref pair, a dictionary >> lookup, and an error check (for the lookup). While it acts like a dup >> followed by a store, it is implemented more simply as a store that >> doesn't pop the stack. The transformation is broadly applicable and >> occurs thousands of times in the standard library and test suite. I'm still a little curious as to what code creates such opcodes... > What exactly are you trying to accomplish? Do you have examples of > code that would be sped up measurably by this transformation? Does > anybody care about those speedups even if they *are* measurable? > > I'm concerned that there's too much hacking of the VM going on with > too little benefit. The VM used to be relatively simple code that many > people could easily understand. The benefit of that was that new > language features could be implemented relatively easily even by > relatively inexperienced developers. All that seems to be lost, and I > fear that the end result is going to be a calcified VM that's only 10% > faster than the original, since we appear to have reached the land of > diminishing returns here. 
In the case of the bytecode optimizer, I'm not sure this is a fair accusation. Even if you don't understand it, you can ignore it and not have your understanding of the rest of the VM affected (I'm not sure that compile.c has ever been "easily understood" in any case :). > I don't see any concentrated efforts trying to figure out where the > biggest pain is and how to relieve it; rather, it looks as if the > easiest targets are being approached. Now, if these were low-hanging > fruit, I'd happily agree, but I'm not so sure that they are all that > valuable. I think some of the peepholer's work are pure wins -- x,y = y,x unpacking and the creation of constant tuples certainly spring to mind. If Raymond wants to spend his time on this stuff, that's his choice. I don't think the obfuscation cost is all that high. > Where are the attempts to speed up function/method calls? That's an > area where we could *really* use a breakthrough... The problem is that it's hard! > Eventually we'll need a radically different approach, maybe PyPy, > maybe Starkiller. Yup. Cheers, mwh -- Gevalia is undrinkable low-octane see-through only slightly roasted bilge water. Compared to .us coffee it is quite drinkable. -- M?ns Nilsson, asr From mwh at python.net Sun Feb 20 20:00:13 2005 From: mwh at python.net (Michael Hudson) Date: Sun Feb 20 20:00:30 2005 Subject: [Python-Dev] Store x Load x --> DupStore In-Reply-To: <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> (Phillip J. Eby's message of "Sun, 20 Feb 2005 12:37:41 -0500") References: <000101c51762$5b8369e0$7c1cc797@oemcomputer> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> Message-ID: <2mmztzvyyq.fsf@starship.python.net> "Phillip J. Eby" writes: > At 08:17 AM 2/20/05 -0800, Guido van Rossum wrote: >>Where are the attempts to speed up function/method calls? That's an >>area where we could *really* use a breakthrough... > > Amen! > > So what happened to Armin's pre-allocated frame patch? Did that get into 2.4? No, because it slows down recursive function calls, or functions that happen to be called at the same time in different threads. Fixing *that* would require things like code specific frame free-lists and that's getting a bit convoluted and might waste quite a lot of memory. Eliminating the blockstack would be nice (esp. if it's enough to get frames small enough that they get allocated by PyMalloc) but this seemed to be tricky too (or at least Armin, Samuele and I spent a cuple of hours yakking about it on IRC and didn't come up with a clear approach). Dynamically allocating the blockstack would be simpler, and might acheive a similar win. (This is all from memory, I haven't thought about specifics in a while). > Also, does anybody know where all the time goes in a function call, > anyway? I did once... > I assume that some of the pieces are: > > * tuple/dict allocation for arguments (but some of this is bypassed on > the fast branch for Python-to-Python calls, right?) All of it, in easy cases. ISTR that the fast path could be a little wider -- it bails when the called function has default arguments, but I think this case could be handled easily enough. > * frame allocation and setup (but Armin's patch was supposed to > eliminate most of this whenever a function isn't being used > re-entrantly) Ah, you remember the wart :) I think even with the patch, frame setup is a significant amount of work. Why are frames so big? 
> * argument "parsing" (check number of args, map kwargs to their > positions, etc.; but isn't some of this already fast-pathed for > Python-to-Python calls?) Yes. With some effort you could probably avoid a copy (and incref) of the arguments from the callers to the callees stack area. BFD. > I suppose the fast branch fixes don't help special methods like > __getitem__ et al, since those don't go through the fast branch, but I > don't think those are the majority of function calls. Indeed. I suspect this fails the effort/benefit test, but I could be wrong. > And whatever happened to CALL_METHOD? It didn't work as an optimization, as far as I remember. I think the patch is on SF somewhere. Or is a branch in CVS? Oh, it's patch #709744. > Do we need a tp_callmethod that takes an argument array, length, and > keywords, so that we can skip instancemethod allocation in the > common case of calling a method directly? Hmm, didn't think of that, and I don't think it's how the CALL_ATTR attempt worked. I presume it would need to take a method name too :) I already have a patch that does this for regular function calls (it's a rearrangement/refactoring not an optimization though). Cheers, mwh -- I think perhaps we should have electoral collages and construct our representatives entirely of little bits of cloth and papier mache. -- Owen Dunn, ucam.chat, from his review of the year From bac at OCF.Berkeley.EDU Sun Feb 20 20:41:03 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sun Feb 20 20:41:12 2005 Subject: [Python-Dev] Store x Load x --> DupStore In-Reply-To: <2mmztzvyyq.fsf@starship.python.net> References: <000101c51762$5b8369e0$7c1cc797@oemcomputer> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <2mmztzvyyq.fsf@starship.python.net> Message-ID: <4218E7CF.1020208@ocf.berkeley.edu> Michael Hudson wrote: > "Phillip J. Eby" writes: [SNIP] >>And whatever happened to CALL_METHOD? > > > It didn't work as an optimization, as far as I remember. I think the > patch is on SF somewhere. Or is a branch in CVS? Oh, it's patch > #709744. > > >>Do we need a tp_callmethod that takes an argument array, length, and >>keywords, so that we can skip instancemethod allocation in the >>common case of calling a method directly? > > > Hmm, didn't think of that, and I don't think it's how the CALL_ATTR > attempt worked. I presume it would need to take a method name too :) > CALL_ATTR basically replaced ``LOAD_ATTR; CALL_FUNCTION`` with a single opcode. Idea was that the function creation by the LOAD_ATTR was a wasted step so might as well just skip it and call the method directly. Problem was the work required to support both classic and new-style classes. Now I have not looked at the code since it was written back at PyCon 2003 and I was a total newbie to the core's C code at that point and I think Thomas said it had been two years since he did any major core hacking. In other words it could possibly have been done better. =) -Brett From pje at telecommunity.com Sun Feb 20 21:22:00 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Feb 20 21:19:19 2005 Subject: [Python-Dev] Store x Load x --> DupStore In-Reply-To: <2mr7jbvzyo.fsf@starship.python.net> References: <000101c51762$5b8369e0$7c1cc797@oemcomputer> Message-ID: <5.1.1.6.0.20050220150416.029b3960@mail.telecommunity.com> At 06:38 PM 2/20/05 +0000, Michael Hudson wrote: > >> It folds the two steps into a new opcode. 
In the case of > >> store_name/load_name, it saves one three byte instruction, a trip around > >> the eval-loop, two stack mutations, a incref/decref pair, a dictionary > >> lookup, and an error check (for the lookup). While it acts like a dup > >> followed by a store, it is implemented more simply as a store that > >> doesn't pop the stack. The transformation is broadly applicable and > >> occurs thousands of times in the standard library and test suite. > >I'm still a little curious as to what code creates such opcodes... A simple STORE+LOAD case: >>> dis.dis(compile("x=1; y=x*2","?","exec")) 1 0 LOAD_CONST 0 (1) 3 STORE_NAME 0 (x) 6 LOAD_NAME 0 (x) 9 LOAD_CONST 1 (2) 12 BINARY_MULTIPLY 13 STORE_NAME 1 (y) 16 LOAD_CONST 2 (None) 19 RETURN_VALUE And a simple DUP+STORE case: >>> dis.dis(compile("x=y=1","?","exec")) 1 0 LOAD_CONST 0 (1) 3 DUP_TOP 4 STORE_NAME 0 (x) 7 STORE_NAME 1 (y) 10 LOAD_CONST 1 (None) 13 RETURN_VALUE Of course, I'm not sure how commonly this sort of code occurs in places where it makes a difference to anything. Function call overhead continues to be Python's most damaging performance issue, because it makes it expensive to use abstraction. Here's a thought. Suppose we split frames into an "object" part and a "struct" part, with the object part being just a pointer to the struct part, and a flag indicating whether the struct part is stack-allocated or malloc'ed. This would let us stack-allocate the bulk of the frame structure, but still have a frame "object" to pass around. On exit from the C routine that stack-allocated the frame struct, we check to see if the frame object has a refcount>1, and if so, malloc a permanent home for the frame struct and update the frame object's struct pointer and flag. In this way, frame allocation overhead could be reduced to the cost of an alloca, or just incorporated into the stack frame setup of the C routine itself, allowing the entire struct to be treated as "local variables" from a C perspective (which might benefit performance on architectures that reserve a register for local variable access). Of course, this would slow down exception handling and other scenarios that result in extra references to a frame object, but if the OS malloc is the slow part of frame allocation (frame objects are too large for pymalloc), then perhaps it would be a net win. On the other hand, this approach would definitely use more stack space per calling level. From pje at telecommunity.com Sun Feb 20 21:56:26 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Feb 20 21:53:45 2005 Subject: [Python-Dev] Store x Load x --> DupStore In-Reply-To: <2mmztzvyyq.fsf@starship.python.net> References: <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050220152217.029bb650@mail.telecommunity.com> At 07:00 PM 2/20/05 +0000, Michael Hudson wrote: >"Phillip J. Eby" writes: > > > At 08:17 AM 2/20/05 -0800, Guido van Rossum wrote: > >>Where are the attempts to speed up function/method calls? That's an > >>area where we could *really* use a breakthrough... > > > > Amen! > > > > So what happened to Armin's pre-allocated frame patch? Did that get > into 2.4? > >No, because it slows down recursive function calls, or functions that >happen to be called at the same time in different threads. 
Fixing >*that* would require things like code specific frame free-lists and >that's getting a bit convoluted and might waste quite a lot of memory. Ah. I thought it was just going to fall back to the normal case if the pre-allocated frame wasn't available (i.e., didn't have a refcount of 1). >Eliminating the blockstack would be nice (esp. if it's enough to get >frames small enough that they get allocated by PyMalloc) but this >seemed to be tricky too (or at least Armin, Samuele and I spent a >cuple of hours yakking about it on IRC and didn't come up with a clear >approach). Dynamically allocating the blockstack would be simpler, >and might acheive a similar win. (This is all from memory, I haven't >thought about specifics in a while). I'm not very familiar with the operation of the block stack, but why does it need to be a stack? For exception handling purposes, wouldn't it suffice to know the offset of the current handler, and have an opcode to set the current handler location? And for "for" loops, couldn't an anonymous local be used to hold the loop iterator instead of using a stack variable? Hm, actually I think I see the answer; in the case of module-level code there can be no "anonymous local variables" the way there can in functions. Hmm. I guess you'd need to also have a "reset stack to level X" opcode, then, and both it and the set-handler opcode would have to be placed at every destination of a jump that crosses block boundaries. It's not clear how big a win that is, due to the added opcodes even on non-error paths. Hey, wait a minute... all the block stack data is static, isn't it? I mean, the contents of the block stack at any point in a code string could be determined statically, by examination of the bytecode, couldn't it? If that's the case, then perhaps we could design a pre-computed data structure similar to co_lnotab that would be used by the evaluator in place of the blockstack. Of course, I may be talking through my hat here, as I have very little experience with how the blockstack works. However, if this idea makes sense, then perhaps it could actually speed up non-error paths as well (except perhaps for the 'return' statement), at the cost of a larger code structure and compiler complexity. But, if it also means that frames can be allocated faster (e.g. via pymalloc), it might be worth it, just like getting rid of SET_LINENO turned out to be a net win. >All of it, in easy cases. ISTR that the fast path could be a little >wider -- it bails when the called function has default arguments, but >I think this case could be handled easily enough. When it has *any* default arguments, or only when it doesn't have values to supply for them? >Why are frames so big? Because there are CO_MAXBLOCKS * 12 bytes in there for the block stack. If there was no need for that, frames could perhaps be allocated via pymalloc. They only have around 100 bytes or so in them, apart from the blockstack and locals/value stack. > > Do we need a tp_callmethod that takes an argument array, length, and > > keywords, so that we can skip instancemethod allocation in the > > common case of calling a method directly? > >Hmm, didn't think of that, and I don't think it's how the CALL_ATTR >attempt worked. I presume it would need to take a method name too :) Er, yeah, I thought that was obvious. :) From pje at telecommunity.com Sun Feb 20 22:34:50 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Sun Feb 20 22:32:10 2005 Subject: [Python-Dev] Eliminating the block stack (was Re: Store x Load x --> DupStore) In-Reply-To: <5.1.1.6.0.20050220152217.029bb650@mail.telecommunity.com> References: <2mmztzvyyq.fsf@starship.python.net> <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050220160300.02e8bc30@mail.telecommunity.com> At 03:56 PM 2/20/05 -0500, Phillip J. Eby wrote: >At 07:00 PM 2/20/05 +0000, Michael Hudson wrote: >>Eliminating the blockstack would be nice (esp. if it's enough to get >>frames small enough that they get allocated by PyMalloc) but this >>seemed to be tricky too (or at least Armin, Samuele and I spent a >>cuple of hours yakking about it on IRC and didn't come up with a clear >>approach). Dynamically allocating the blockstack would be simpler, >>and might acheive a similar win. (This is all from memory, I haven't >>thought about specifics in a while). I think I have an idea how to do it in a (relatively) simple fashion; see if you can find a hole in it: * Change the PyTryBlock struct to include an additional member, 'int b_prev', that refers to the previous block in a chain * Change the compiler's emission of SETUP_* opcodes, so that instead of a PyTryBlock being added to the blockstack at interpretation time, it's added to the end of a 'co_blktree' block array at compile time, with its 'b_prev' pointing to the current "top" of the block stack. Instead of the SETUP_* argument being the handler offset, have it be the index of the just-added blocktree entry. * Replace f_blockstack and f_iblock with 'int f_iblktree', and change PyFrame_BlockSetup() to set this equal to the SETUP_* argument, and PyFrame_BlockPop() to use this as an index into the code's co_blktree to retrieve the needed values. PyFrame_BlockPop() would then set f_iblktree equal to the "popped" block's 'b_prev' member, thus "popping" the block from this virtual stack. (Note, by the way, that the blocktree could actually be created as a post-processing step of the current compilation process, by a loop that scans the bytecode and tracks the current stack and blockstack levels, and then replaces the SETUP_* opcodes' arguments. This might be a simpler option than trying to change the compiler to do it along the way.) Can anybody see any flaws in this concept? As far as I can tell it just generates all possible block stack states at compile time, but doesn't change block semantics in the least, and it scarcely touches the eval loop. It seems like it could drop the size of frames enough to let them use pymalloc instead of the OS malloc, at the cost of a 16 bytes per block increase in the size of code objects. (And of course the necessary changes to 'marshal' and 'dis' as well as the compiler and eval loop.) (More precisely, frames whose f_nlocals + f_stacksize is 40 or less, would be 256 bytes or less, and therefore pymalloc-able. However, this should cover all but the most complex functions.) From mwh at python.net Sun Feb 20 22:54:43 2005 From: mwh at python.net (Michael Hudson) Date: Sun Feb 20 22:54:46 2005 Subject: [Python-Dev] Store x Load x --> DupStore In-Reply-To: <5.1.1.6.0.20050220152217.029bb650@mail.telecommunity.com> (Phillip J. 
Eby's message of "Sun, 20 Feb 2005 15:56:26 -0500") References: <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <5.1.1.6.0.20050220152217.029bb650@mail.telecommunity.com> Message-ID: <2m8y5ix5gc.fsf@starship.python.net> "Phillip J. Eby" writes: > At 07:00 PM 2/20/05 +0000, Michael Hudson wrote: >>"Phillip J. Eby" writes: >> >> > At 08:17 AM 2/20/05 -0800, Guido van Rossum wrote: >> >>Where are the attempts to speed up function/method calls? That's an >> >>area where we could *really* use a breakthrough... >> > >> > Amen! >> > >> > So what happened to Armin's pre-allocated frame patch? Did that >> get into 2.4? >> >>No, because it slows down recursive function calls, or functions that >>happen to be called at the same time in different threads. Fixing >>*that* would require things like code specific frame free-lists and >>that's getting a bit convoluted and might waste quite a lot of memory. > > Ah. I thought it was just going to fall back to the normal case if > the pre-allocated frame wasn't available (i.e., didn't have a refcount > of 1). Well, I don't think that's the test, but that might work. Someone should try it :) (I'm trying something else currently). >>Eliminating the blockstack would be nice (esp. if it's enough to get >>frames small enough that they get allocated by PyMalloc) but this >>seemed to be tricky too (or at least Armin, Samuele and I spent a >>cuple of hours yakking about it on IRC and didn't come up with a clear >>approach). Dynamically allocating the blockstack would be simpler, >>and might acheive a similar win. (This is all from memory, I haven't >>thought about specifics in a while). > > I'm not very familiar with the operation of the block stack, but why > does it need to be a stack? Finally blocks are the problem, I think. > For exception handling purposes, wouldn't it suffice to know the > offset of the current handler, and have an opcode to set the current > handler location? And for "for" loops, couldn't an anonymous local > be used to hold the loop iterator instead of using a stack variable? > Hm, actually I think I see the answer; in the case of module-level > code there can be no "anonymous local variables" the way there can in > functions. Hmm. I don't think this is the killer blow. I can't remember the details and it's too late to think about them, so I'm going to wait and see if Samuele replies :) >>All of it, in easy cases. ISTR that the fast path could be a little >>wider -- it bails when the called function has default arguments, but >>I think this case could be handled easily enough. > > When it has *any* default arguments, or only when it doesn't have > values to supply for them? When it has *any*, I think. I also think this is easy to change. >>Why are frames so big? > > Because there are CO_MAXBLOCKS * 12 bytes in there for the block > stack. If there was no need for that, frames could perhaps be > allocated via pymalloc. They only have around 100 bytes or so in > them, apart from the blockstack and locals/value stack. What I'm trying is allocating the blockstack separately and see if two pymallocs are cheaper than one malloc. >> > Do we need a tp_callmethod that takes an argument array, length, and >> > keywords, so that we can skip instancemethod allocation in the >> > common case of calling a method directly? 
>> >>Hmm, didn't think of that, and I don't think it's how the CALL_ATTR >>attempt worked. I presume it would need to take a method name too :) > > Er, yeah, I thought that was obvious. :) Someone should try this too :) Cheers, mwh -- It is never worth a first class man's time to express a majority opinion. By definition, there are plenty of others to do that. -- G. H. Hardy From greg.ewing at canterbury.ac.nz Mon Feb 21 03:14:13 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon Feb 21 03:14:29 2005 Subject: [Python-Dev] Eliminating the block stack (was Re: Store x Load x --> DupStore) In-Reply-To: <5.1.1.6.0.20050220160300.02e8bc30@mail.telecommunity.com> References: <2mmztzvyyq.fsf@starship.python.net> <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <5.1.1.6.0.20050220160300.02e8bc30@mail.telecommunity.com> Message-ID: <421943F5.7080408@canterbury.ac.nz> Phillip J. Eby wrote: > At 03:56 PM 2/20/05 -0500, Phillip J. Eby wrote: > >> At 07:00 PM 2/20/05 +0000, Michael Hudson wrote: >> >>> Eliminating the blockstack would be nice (esp. if it's enough to get >>> frames small enough that they get allocated by PyMalloc) Someone might like to take a look at the way Pyrex generates C code for try-except and try-finally blocks. It manages to get (what I hope is) the same effect using local variables and gotos. It doesn't have to deal with a stack pointer, but I think that should just be a compiler-determinable adjustment to be done when jumping to an outer block. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Mon Feb 21 04:32:11 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon Feb 21 04:32:27 2005 Subject: [Python-Dev] Store x Load x --> DupStore In-Reply-To: <5.1.1.6.0.20050220152217.029bb650@mail.telecommunity.com> References: <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <5.1.1.6.0.20050220152217.029bb650@mail.telecommunity.com> Message-ID: <4219563B.8080503@canterbury.ac.nz> Phillip J. Eby wrote: > Hm, actually I think I see the answer; in the case of module-level code > there can be no "anonymous local variables" the way there can in > functions. Why not? There's still a frame object associated with the call of the anonymous function holding the module's top-level code. The compiler can allocate locals in that frame, even if the user's code can't. > I guess you'd need to also have a "reset stack to > level X" opcode, then, and both it and the set-handler opcode would have > to be placed at every destination of a jump that crosses block > boundaries. It's not clear how big a win that is, due to the added > opcodes even on non-error paths. Only exceptions and break statements would require stack pointer adjustment, and they're relatively rare. I don't think an extra opcode in those cases would make much of a difference. 
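For anyone who wants to stare at the machinery under discussion, a quick way to see the block stack in action (the exact disassembly is version-dependent, so none is reproduced here): SETUP_LOOP and SETUP_FINALLY push PyTryBlock entries at run time, and POP_BLOCK / END_FINALLY pop them again.

    import dis

    def f(seq):
        total = 0
        try:
            for x in seq:          # SETUP_LOOP ... POP_BLOCK
                total += x
        finally:                   # SETUP_FINALLY ... END_FINALLY
            print "done"
        return total

    dis.dis(f)
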
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Mon Feb 21 04:32:25 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon Feb 21 04:32:43 2005 Subject: [Python-Dev] UserString In-Reply-To: References: <000001c51703$80f97520$f33ec797@oemcomputer> <0f5201ccd99380eeac0400da69d6d9f7@aleax.it> Message-ID: <42195649.3030400@canterbury.ac.nz> Alex Martelli wrote: > > On 2005 Feb 20, at 17:06, Guido van Rossum wrote: > >> Oh, bah. That's not what basestring was for. I can't blame you or your >> client, but my *intention* was that basestring would *only* be the >> base of the two *real* built-in string types (str and unicode). I think all this just reinforces the notion that LBYL is a bad idea! > The need to check "is this thingy here string-like" is sort of frequent, > because strings are sequences which, when iterated on, yield sequences > (strings of length 1) which, when iterated on, yield sequences ad > infinitum. Yes, this characteristic of strings is unfortunate because it tends to make some degree of LBYLing unavoidable. I don't think the right solution is to try to come up with safe ways of doing LBYL on strings, though, at least not in the long term. Maybe in Python 3000 this could be fixed by making strings *not* be sequences. They would be sliceable, but *not* indexable or iterable. If you wanted to iterate over their chars, you would have to say 'for c in s.chars()' or something. Then you would be able to test whether something is sequence-like by the presence of __getitem__ or __iter__ methods, without getting tripped up by strings. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From pje at telecommunity.com Mon Feb 21 04:41:09 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Feb 21 04:38:29 2005 Subject: [Python-Dev] Store x Load x --> DupStore In-Reply-To: <4219563B.8080503@canterbury.ac.nz> References: <5.1.1.6.0.20050220152217.029bb650@mail.telecommunity.com> <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <5.1.1.6.0.20050220152217.029bb650@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050220223833.02e8dc80@mail.telecommunity.com> At 04:32 PM 2/21/05 +1300, Greg Ewing wrote: >Phillip J. Eby wrote: > >>Hm, actually I think I see the answer; in the case of module-level code >>there can be no "anonymous local variables" the way there can in functions. > >Why not? There's still a frame object associated with the call >of the anonymous function holding the module's top-level code. >The compiler can allocate locals in that frame, even if the >user's code can't. That's a good point, but if you look at my "eliminating the block stack" post, you'll see that there's a simpler way to potentially get rid of the block stack, where "simpler" means "simpler changes in fewer places". From pje at telecommunity.com Mon Feb 21 04:44:44 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Mon Feb 21 04:42:05 2005 Subject: [Python-Dev] UserString In-Reply-To: <42195649.3030400@canterbury.ac.nz> References: <000001c51703$80f97520$f33ec797@oemcomputer> <0f5201ccd99380eeac0400da69d6d9f7@aleax.it> Message-ID: <5.1.1.6.0.20050220224135.02e90ad0@mail.telecommunity.com> At 04:32 PM 2/21/05 +1300, Greg Ewing wrote: >Alex Martelli wrote: >>The need to check "is this thingy here string-like" is sort of frequent, >>because strings are sequences which, when iterated on, yield sequences >>(strings of length 1) which, when iterated on, yield sequences ad infinitum. > >Yes, this characteristic of strings is unfortunate because it >tends to make some degree of LBYLing unavoidable. FWIW, the trick I usually use to deal with this aspect of strings in recursive algorithms is to check whether the current item of an iteration is the same object I'm iterating over; if so, I know I've descended into a string. It doesn't catch it on the first recursion level of course (unless it was a 1-character string to start with), but it's a quick-and-dirty way to EAFP such algorithms. From gvanrossum at gmail.com Mon Feb 21 04:42:34 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Feb 21 04:42:37 2005 Subject: [Python-Dev] UserString In-Reply-To: <42195649.3030400@canterbury.ac.nz> References: <000001c51703$80f97520$f33ec797@oemcomputer> <0f5201ccd99380eeac0400da69d6d9f7@aleax.it> <42195649.3030400@canterbury.ac.nz> Message-ID: > >> Oh, bah. That's not what basestring was for. I can't blame you or your > >> client, but my *intention* was that basestring would *only* be the > >> base of the two *real* built-in string types (str and unicode). > > I think all this just reinforces the notion that LBYL is > a bad idea! In this case, perhaps; but in general? (And I think there's a legitimate desire to sometimes special-case string-like things, e.g. consider a function that takes either a stream or a filename argument.) Anyway, can you explain why LBYL is bad? > > The need to check "is this thingy here string-like" is sort of frequent, > > because strings are sequences which, when iterated on, yield sequences > > (strings of length 1) which, when iterated on, yield sequences ad > > infinitum. > > Yes, this characteristic of strings is unfortunate because it > tends to make some degree of LBYLing unavoidable. I don't > think the right solution is to try to come up with safe ways > of doing LBYL on strings, though, at least not in the long > term. > > Maybe in Python 3000 this could be fixed by making strings *not* > be sequences. They would be sliceable, but *not* indexable or > iterable. If you wanted to iterate over their chars, you > would have to say 'for c in s.chars()' or something. > > Then you would be able to test whether something is sequence-like > by the presence of __getitem__ or __iter__ methods, without > getting tripped up by strings. There would be other ways to get out of this dilemma; we could introduce a char type, for example. Also, strings might be recognizable by other means, e.g. the presence of a lower() method or some other characteristic method that doesn't apply to sequence in general. (To Alex: leaving transform() out of the string interface seems to me the simplest solution.) 
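A hedged sketch of that stream-or-filename pattern (the function name and behaviour are made up for illustration); a duck-typed variant would test hasattr(source, 'read') instead of using isinstance:

    def word_count(source):
        if isinstance(source, basestring):   # a real str/unicode: a filename
            stream = open(source)
            opened_here = True
        else:                                # anything else: assume a readable stream
            stream = source
            opened_here = False
        try:
            return sum(len(line.split()) for line in stream)
        finally:
            if opened_here:
                stream.close()
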
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Mon Feb 21 04:47:08 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Feb 21 04:47:11 2005 Subject: [Python-Dev] Eliminating the block stack (was Re: Store x Load x --> DupStore) In-Reply-To: <5.1.1.6.0.20050220160300.02e8bc30@mail.telecommunity.com> References: <2mmztzvyyq.fsf@starship.python.net> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <5.1.1.6.0.20050220152217.029bb650@mail.telecommunity.com> <5.1.1.6.0.20050220160300.02e8bc30@mail.telecommunity.com> Message-ID: > >>Eliminating the blockstack would be nice (esp. if it's enough to get > >>frames small enough that they get allocated by PyMalloc) but this > >>seemed to be tricky too (or at least Armin, Samuele and I spent a > >>cuple of hours yakking about it on IRC and didn't come up with a clear > >>approach). Dynamically allocating the blockstack would be simpler, > >>and might acheive a similar win. (This is all from memory, I haven't > >>thought about specifics in a while). I don't know if this helps, but since I invented the block stack around 1990, I believe I recall the main reason to make it dynamic was to simplify code generation, not because it is inherently dynamic. At the time an extra run-time data structure seemed to require less coding than an extra compile-time data structure. The same argument got me using dicts for locals; that was clearly a bottleneck and eliminated long ago, but I think we should be able to lose the block stack now, too. Somewhat ironically, eliminating the block stack will reduce the stack frame size, while eliminating the dict for locals added to it. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax at aleax.it Mon Feb 21 08:06:37 2005 From: aleax at aleax.it (Alex Martelli) Date: Mon Feb 21 08:06:43 2005 Subject: [Python-Dev] UserString In-Reply-To: References: <000001c51703$80f97520$f33ec797@oemcomputer> <0f5201ccd99380eeac0400da69d6d9f7@aleax.it> <42195649.3030400@canterbury.ac.nz> Message-ID: <89b4ed0afdf4a58a4425a588bdbb1965@aleax.it> On 2005 Feb 21, at 04:42, Guido van Rossum wrote: >>>> Oh, bah. That's not what basestring was for. I can't blame you or >>>> your >>>> client, but my *intention* was that basestring would *only* be the >>>> base of the two *real* built-in string types (str and unicode). >> >> I think all this just reinforces the notion that LBYL is >> a bad idea! > > In this case, perhaps; but in general? (And I think there's a > legitimate desire to sometimes special-case string-like things, e.g. > consider a function that takes either a stream or a filename > argument.) > > Anyway, can you explain why LBYL is bad? In the general case, it's bad because of a combination of issues. It may violate "once, and only once!" -- the operations one needs to check may basicaly duplicate the operations one then wants to perform. Apart from wasted effort, it may happen that the situation changes between the look and the leap (on an external file, or due perhaps to threading or other reentrancy). It's often hard in the look to cover exactly the set of prereq's you need for the leap -- e.g. I've often seen code such as if i < len(foo): foo[i] = 24 which breaks for i<-len(foo); the first time this happens the guard's changed to 0<=i> Then you would be able to test whether something is sequence-like >> by the presence of __getitem__ or __iter__ methods, without >> getting tripped up by strings. 
> > There would be other ways to get out of this dilemma; we could > introduce a char type, for example. Also, strings might be > recognizable by other means, e.g. the presence of a lower() method or > some other characteristic method that doesn't apply to sequence in > general. Sure, there would many possibilities. > (To Alex: leaving transform() out of the string interface seems to me > the simplest solution.) I guess you mean translate. Yes, that would probably be simplest. Alex From mwh at python.net Mon Feb 21 10:00:11 2005 From: mwh at python.net (Michael Hudson) Date: Mon Feb 21 10:00:13 2005 Subject: [Python-Dev] Store x Load x --> DupStore In-Reply-To: <2m8y5ix5gc.fsf@starship.python.net> (Michael Hudson's message of "Sun, 20 Feb 2005 21:54:43 +0000") References: <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <000101c51762$5b8369e0$7c1cc797@oemcomputer> <5.1.1.6.0.20050220122401.028a4e50@mail.telecommunity.com> <5.1.1.6.0.20050220152217.029bb650@mail.telecommunity.com> <2m8y5ix5gc.fsf@starship.python.net> Message-ID: <2m1xbawan8.fsf@starship.python.net> Michael Hudson writes: >> Because there are CO_MAXBLOCKS * 12 bytes in there for the block >> stack. If there was no need for that, frames could perhaps be >> allocated via pymalloc.
They only have around 100 bytes or so in >> them, apart from the blockstack and locals/value stack. > > What I'm trying is allocating the blockstack separately and see if two > pymallocs are cheaper than one malloc. This makes no difference at all, of course -- once timeit or pystone gets going the code path that actually allocates a new frame as opposed to popping one off the free list simply never gets executed. Duh! Cheers, mwh (and despite what the sigmonster implies, I wasn't drunk last night :) -- This is an off-the-top-of-the-head-and-not-quite-sober suggestion, so is probably technically laughable. I'll see how embarrassed I feel tomorrow morning. -- Patrick Gosling, ucam.comp.misc From z_axis at 163.com Mon Feb 21 14:54:33 2005 From: z_axis at 163.com (z-axis) Date: Mon Feb 21 14:49:38 2005 Subject: [Python-Dev] Re: Welcome to the "Python-Dev" mailing list Message-ID: <20050221134936.909271E4003@bag.python.org> Hi, friends. I am a Python newbie, but I have used Java for about 5 years. When I saw Python introduced in a famous magazine called <> in China, I was immediately absorbed by its pretty code. I hope I can use Python to do real development. Regards! ======== 2005-02-21 14:28:00 you wrote: ======== Welcome to the Python-Dev@python.org mailing list! If you are a new subscriber, please take the time to introduce yourself briefly in your first post. It is appreciated if you lurk around for a while before posting! :-) Additional information on Python's development process can be found in the Python Developer's Guide: http://www.python.org/dev/ To post to this list, send your email to: python-dev@python.org General information about the mailing list is at: http://mail.python.org/mailman/listinfo/python-dev If you ever want to unsubscribe or change your options (eg, switch to or from digest mode, change your password, etc.), visit your subscription page at: http://mail.python.org/mailman/options/python-dev/z_axis%40163.com You can also make such adjustments via email by sending a message to: Python-Dev-request@python.org with the word `help' in the subject or body (don't include the quotes), and you will get back a message with instructions. You must know your password to change your options (including changing the password, itself) or to unsubscribe. It is: zpython999 Normally, Mailman will remind you of your python.org mailing list passwords once every month, although you can disable this if you prefer. This reminder will also include instructions on how to unsubscribe or change your account options. There is also a button on your options page that will email your current password to you. = = = = = = = = = = = = = = = = = = = = = = Regards! z-axis z_axis@163.com 2005-02-21 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050221/e537f44e/attachment.htm From gvanrossum at gmail.com Mon Feb 21 17:15:47 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Feb 21 17:15:51 2005 Subject: [Python-Dev] UserString In-Reply-To: <89b4ed0afdf4a58a4425a588bdbb1965@aleax.it> References: <000001c51703$80f97520$f33ec797@oemcomputer> <0f5201ccd99380eeac0400da69d6d9f7@aleax.it> <42195649.3030400@canterbury.ac.nz> <89b4ed0afdf4a58a4425a588bdbb1965@aleax.it> Message-ID: > > Anyway, can you explain why LBYL is bad? > > In the general case, it's bad because of a combination of issues. It > may violate "once, and only once!"
-- the operations one needs to check > may basically duplicate the operations one then wants to perform. Apart > from wasted effort, it may happen that the situation changes between > the look and the leap (on an external file, or due perhaps to threading > or other reentrancy). It's often hard in the look to cover exactly the > set of prereq's you need for the leap -- e.g. I've often seen code such > as > if i < len(foo): > foo[i] = 24 > which breaks for i<-len(foo); the first time this happens the guard's > changed to 0<=i<len(foo), breaking usage w/negative index; finally it stabilizes to the correct check, > -len(foo)<=i<len(foo), the same check that Python performs again when you then use foo[i]... just > cluttering code. The intermediate Pythonista who's learned to code > "try: foo[i]=24 // except IndexError: pass" is much better off than the > one who's still striving to LBYL as he had (e.g.) when using C. > > Etc -- this is all very general and generic. Right. There are plenty of examples where LBYL is better, e.g. because there are too many different exceptions to catch, or they occur in too many places. One of my favorites is creating a directory if it doesn't already exist; I always use this LBYL-ish pattern: if not os.path.exists(dn): try: os.makedirs(dn) except os.error, err: ...log the error... because the specific exception for "it already exists" is quite subtle to pull out of the os.error structure. Taken to the extreme, the "LBYL is bad" meme would be an argument against my optional type checking proposal, which I doubt is what you want. So, I'd like to take a much more balanced view on LBYL. > I had convinced myself that strings were a special case worth singling > out, via isinstance and basestring, just as (say) dictionaries are > singled out quite differently by methods such as get... I may well have > been too superficial in this conclusion. I think there are lots of situations where the desire to special-case strings is legitimate. > >> Then you would be able to test whether something is sequence-like > >> by the presence of __getitem__ or __iter__ methods, without > >> getting tripped up by strings. > > > > There would be other ways to get out of this dilemma; we could > > introduce a char type, for example. Also, strings might be > > recognizable by other means, e.g. the presence of a lower() method or > > some other characteristic method that doesn't apply to sequence in > > general. > > Sure, there would many possibilities. > > > (To Alex: leaving transform() out of the string interface seems to me > > the simplest solution.) > > I guess you mean translate. Yes, that would probably be simplest. Right. BTW, there's *still* no sign of a PEP 246 rewrite. Maybe someone could offer Clark a hand? (Last time I inquired he was recovering from a week of illness.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Mon Feb 21 22:24:32 2005 From: python at rcn.com (Raymond Hettinger) Date: Mon Feb 21 22:28:35 2005 Subject: [Python-Dev] Store x Load x --> DupStore In-Reply-To: Message-ID: <000a01c5185b$bc999700$f61ac797@oemcomputer> > Where are the attempts to speed up function/method calls? That's an > area where we could *really* use a breakthrough... At one time you had entertained treating some of the builtin calls as fixed. Is that something you want to go forward with? It would entail a "from __future__" and transition period.
It would not be hard to take code like "return len(alist)" and transform it from: 2 0 LOAD_GLOBAL 0 (len) 3 LOAD_FAST 0 (alist) 6 CALL_FUNCTION 1 9 RETURN_VALUE to: 2 0 LOAD_FAST 0 (alist) 3 OBJECT_LEN 4 RETURN_VALUE Some functions already have a custom opcode that cannot be used unless we freeze the meaning of the function name: repr --> UNARY_CONVERT --> PyObject_Repr iter --> GET_ITER --> PyObject_GetIter Alternately, functions could be served by a table of known, fixed functions: 2 0 LOAD_FAST 0 (alist) 3 CALL_DEDICATED 0 (PyObject_Len) 6 RETURN_VALUE where the dispatch table is something like: [PyObject_Len, PyObject_Repr, PyObject_IsInstance, PyObject_IsTrue, PyObject_GetIter, ...]. Of course, none of these offer a big boost and there is some loss of dynamic behavior. Raymond From barry at python.org Tue Feb 22 03:50:01 2005 From: barry at python.org (Barry Warsaw) Date: Tue Feb 22 03:50:17 2005 Subject: [Python-Dev] UserString In-Reply-To: References: <000001c51703$80f97520$f33ec797@oemcomputer> <0f5201ccd99380eeac0400da69d6d9f7@aleax.it> <42195649.3030400@canterbury.ac.nz> <89b4ed0afdf4a58a4425a588bdbb1965@aleax.it> Message-ID: <1109040601.25187.170.camel@presto.wooz.org> On Mon, 2005-02-21 at 11:15, Guido van Rossum wrote: > Right. There are plenty of examples where LBYL is better, e.g. because > there are too many different exceptions to catch, or they occur in too > many places. One of my favorites is creating a directory if it doesn't > already exist; I always use this LBYL-ish pattern: > > if not os.path.exists(dn): > try: > os.makedirs(dn) > except os.error, err: > ...log the error... > > because the specific exception for "it already exists" is quite subtle > to pull out of the os.error structure. Really? I do this kind of thing all the time: import os import errno try: os.makedirs(dn) except OSError, e: if e.errno <> errno.EEXIST: raise -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050221/ae2d9387/attachment.pgp From quarl at NOSPAM.quarl.org Tue Feb 22 02:41:38 2005 From: quarl at NOSPAM.quarl.org (Karl Chen) Date: Tue Feb 22 07:34:34 2005 Subject: [Python-Dev] textwrap wordsep_re Message-ID: Hi, textwrap.fill() is awesome. Except when the string to wrap contains dates -- which I would like not to be broken. In general I think wordsep_re can be smarter about what it decides are hyphenated words. For example, this code: print textwrap.fill('aaaaaaaaaa 2005-02-21', 18) produces: aaaaaaaaaa 2005- 02-21 A slightly tweaked wordsep_re: textwrap.TextWrapper.wordsep_re = \ re.compile(r'(\s+|' # any whitespace r'[^\s\w]*\w+[a-zA-Z]-(?=[a-zA-Z]\w+)|' # hyphenated words r'(?<=[\w\!\"\'\&\.\,\?])-{2,}(?=\w))') # em-dash print textwrap.fill('aaaaaaaaaa 2005-02-21', 18) behaves better: aaaaaaaaaa 2005-02-21 What do you think about changing the default wordsep_re? 
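(As an aside, and not part of Karl's proposal: the same pattern can also be tried out on a single TextWrapper instance, which leaves the module-wide default alone; the regex below is simply the one quoted above.)

    import re
    import textwrap

    wrapper = textwrap.TextWrapper(width=18)
    # Override the word-splitting pattern on this instance only,
    # rather than patching the class attribute for every caller.
    wrapper.wordsep_re = re.compile(
        r'(\s+|'                                  # any whitespace
        r'[^\s\w]*\w+[a-zA-Z]-(?=[a-zA-Z]\w+)|'   # hyphenated words
        r'(?<=[\w\!\"\'\&\.\,\?])-{2,}(?=\w))')   # em-dash
    print wrapper.fill('aaaaaaaaaa 2005-02-21')
    # aaaaaaaaaa
    # 2005-02-21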
-- Karl 2005-02-21 17:39 From aahz at pythoncraft.com Tue Feb 22 15:35:06 2005 From: aahz at pythoncraft.com (Aahz) Date: Tue Feb 22 15:35:10 2005 Subject: [Python-Dev] textwrap wordsep_re In-Reply-To: References: Message-ID: <20050222143506.GA27893@panix.com> On Mon, Feb 21, 2005, Karl Chen wrote: > > A slightly tweaked wordsep_re: > textwrap.TextWrapper.wordsep_re = \ > re.compile(r'(\s+|' # any whitespace > r'[^\s\w]*\w+[a-zA-Z]-(?=[a-zA-Z]\w+)|' # hyphenated words > r'(?<=[\w\!\"\'\&\.\,\?])-{2,}(?=\w))') # em-dash > print textwrap.fill('aaaaaaaaaa 2005-02-21', 18) > behaves better: > aaaaaaaaaa > 2005-02-21 > > What do you think about changing the default wordsep_re? Please post a patch to SF. If you're not familiar with the process, take a look at http://www.python.org/dev/dev_intro.html Another thing: I don't know whether you'll get this in direct e-mail; it's considered a bit rude for python-dev to use munged addresses. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR From gvanrossum at gmail.com Tue Feb 22 17:16:52 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Feb 22 17:16:57 2005 Subject: [Python-Dev] UserString In-Reply-To: <1109040601.25187.170.camel@presto.wooz.org> References: <000001c51703$80f97520$f33ec797@oemcomputer> <0f5201ccd99380eeac0400da69d6d9f7@aleax.it> <42195649.3030400@canterbury.ac.nz> <89b4ed0afdf4a58a4425a588bdbb1965@aleax.it> <1109040601.25187.170.camel@presto.wooz.org> Message-ID: > Really? I do this kind of thing all the time: > > import os > import errno > try: > os.makedirs(dn) > except OSError, e: > if e.errno <> errno.EEXIST: > raise You have a lot more faith in the errno module than I do. Are you sure the same error codes work on all platforms where Python works? It's also not exactly readable (except for old Unix hacks). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From david.ascher at gmail.com Tue Feb 22 17:20:47 2005 From: david.ascher at gmail.com (David Ascher) Date: Tue Feb 22 17:20:50 2005 Subject: [Python-Dev] UserString In-Reply-To: References: <000001c51703$80f97520$f33ec797@oemcomputer> <0f5201ccd99380eeac0400da69d6d9f7@aleax.it> <42195649.3030400@canterbury.ac.nz> <89b4ed0afdf4a58a4425a588bdbb1965@aleax.it> <1109040601.25187.170.camel@presto.wooz.org> Message-ID: On Tue, 22 Feb 2005 08:16:52 -0800, Guido van Rossum wrote: > > Really? I do this kind of thing all the time: > > > > import os > > import errno > > try: > > os.makedirs(dn) > > except OSError, e: > > if e.errno <> errno.EEXIST: > > raise > > You have a lot more faith in the errno module than I do. Are you sure > the same error codes work on all platforms where Python works? It's > also not exactly readable (except for old Unix hacks). Agreed. In general, I often wish in production code (especially in not-100% Python systems) that Python did a better job of at the very least documenting what kinds of exceptions were raised by what function calls. Otherwise you end up with what are effectively blanket try/except statements way too often for my taste. 
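(An aside, not from the thread: one way to sidestep the errno-portability worry while keeping Barry's narrow except clause is to check the post-condition instead of interpreting the error code; a sketch of such a hypothetical helper follows.)

    import os

    def ensure_dir(dn):
        # Try the operation first (EAFP); on failure, verify the
        # directory really isn't there before propagating, instead of
        # relying on a specific errno value being portable.
        try:
            os.makedirs(dn)
        except os.error:
            if not os.path.isdir(dn):
                raise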
--da From andymac at bullseye.apana.org.au Tue Feb 22 13:13:08 2005 From: andymac at bullseye.apana.org.au (Andrew MacIntyre) Date: Tue Feb 22 19:19:49 2005 Subject: [Python-Dev] Re: Prospective Peephole Transformation In-Reply-To: References: <4215FD5F.4040605@xs4all.nl> <000101c515cc$9f96d0a0$803cc797@oemcomputer> <5.1.1.6.0.20050218103403.03869990@mail.telecommunity.com> Message-ID: <421B21D4.5050306@bullseye.apana.org.au> Fredrik Lundh wrote: > it could be worth expanding them to > > "if x == 1 or x == 2 or x == 3:" > > though... > > C:\>timeit -s "a = 1" "if a in (1, 2, 3): pass" > 10000000 loops, best of 3: 0.11 usec per loop > C:\>timeit -s "a = 1" "if a == 1 or a == 2 or a == 3: pass" > 10000000 loops, best of 3: 0.0691 usec per loop > > C:\>timeit -s "a = 2" "if a == 1 or a == 2 or a == 3: pass" > 10000000 loops, best of 3: 0.123 usec per loop > C:\>timeit -s "a = 2" "if a in (1, 2, 3): pass" > 10000000 loops, best of 3: 0.143 usec per loop > > C:\>timeit -s "a = 3" "if a == 1 or a == 2 or a == 3: pass" > 10000000 loops, best of 3: 0.187 usec per loop > C:\>timeit -s "a = 3" "if a in (1, 2, 3): pass" > 1000000 loops, best of 3: 0.197 usec per loop > > C:\>timeit -s "a = 4" "if a in (1, 2, 3): pass" > 1000000 loops, best of 3: 0.225 usec per loop > C:\>timeit -s "a = 4" "if a == 1 or a == 2 or a == 3: pass" > 10000000 loops, best of 3: 0.161 usec per loop Out of curiousity I ran /F's tests on my FreeBSD 4.8 box with a recent checkout: $ ./python Lib/timeit.py -s "a = 1" "if a in (1, 2, 3): pass" 1000000 loops, best of 3: 0.247 usec per loop $ ./python Lib/timeit.py -s "a = 1" "if a == 1 or a == 2 or a == 3: pass" 1000000 loops, best of 3: 0.225 usec per loop $ ./python Lib/timeit.py -s "a = 2" "if a in (1, 2, 3): pass" 1000000 loops, best of 3: 0.343 usec per loop $ ./python Lib/timeit.py -s "a = 2" "if a == 1 or a == 2 or a == 3: pass" 1000000 loops, best of 3: 0.353 usec per loop $ ./python Lib/timeit.py -s "a = 3" "if a in (1, 2, 3): pass" 1000000 loops, best of 3: 0.415 usec per loop $ ./python Lib/timeit.py -s "a = 3" "if a == 1 or a == 2 or a == 3: pass" 1000000 loops, best of 3: 0.457 usec per loop $ ./python Lib/timeit.py -s "a = 4" "if a in (1, 2, 3): pass" 1000000 loops, best of 3: 0.467 usec per loop $ ./python Lib/timeit.py -s "a = 4" "if a == 1 or a == 2 or a == 3: pass" 1000000 loops, best of 3: 0.488 usec per loop I then applied this patch: --- Objects/tupleobject.c.orig Fri Jun 11 05:28:08 2004 +++ Objects/tupleobject.c Tue Feb 22 22:10:18 2005 @@ -298,6 +298,11 @@ int i, cmp; for (i = 0, cmp = 0 ; cmp == 0 && i < a->ob_size; ++i) + cmp = (PyTuple_GET_ITEM(a, i) == el); + if (cmp) + return cmp; + + for (i = 0, cmp = 0 ; cmp == 0 && i < a->ob_size; ++i) cmp = PyObject_RichCompareBool(el, PyTuple_GET_ITEM(a, i), Py_EQ); return cmp; Re-running the tests yielded: $ ./python Lib/timeit.py -s "a = 1" "if a in (1, 2, 3): pass" 1000000 loops, best of 3: 0.234 usec per loop $ ./python Lib/timeit.py -s "a = 1" "if a == 1 or a == 2 or a == 3: pass" 1000000 loops, best of 3: 0.228 usec per loop $ ./python Lib/timeit.py -s "a = 2" "if a in (1, 2, 3): pass" 1000000 loops, best of 3: 0.239 usec per loop $ ./python Lib/timeit.py -s "a = 2" "if a == 1 or a == 2 or a == 3: pass" 1000000 loops, best of 3: 0.36 usec per loop $ ./python Lib/timeit.py -s "a = 3" "if a in (1, 2, 3): pass" 1000000 loops, best of 3: 0.241 usec per loop $ ./python Lib/timeit.py -s "a = 3" "if a == 1 or a == 2 or a == 3: pass" 1000000 loops, best of 3: 0.469 usec per loop $ ./python Lib/timeit.py -s "a = 4" 
"if a in (1, 2, 3): pass" 1000000 loops, best of 3: 0.475 usec per loop $ ./python Lib/timeit.py -s "a = 4" "if a == 1 or a == 2 or a == 3: pass" 1000000 loops, best of 3: 0.489 usec per loop ------------------------------------------------------------------------- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au (pref) | Snail: PO Box 370 andymac@pcug.org.au (alt) | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From quarl at cs.berkeley.edu Mon Feb 21 12:39:41 2005 From: quarl at cs.berkeley.edu (Karl Chen) Date: Tue Feb 22 20:00:13 2005 Subject: [Python-Dev] textwrap.py wordsep_re Message-ID: Hi, textwrap.fill() is awesome. Except when the string to wrap contains dates -- which I would like not to be filled. In general I think wordsep_re can be smarter about what it decides are hyphenated words. For example, this code: print textwrap.fill('aaaaaaaaaa 2005-02-21', 18) produces: aaaaaaaaaa 2005- 02-21 A slightly tweaked wordsep_re: textwrap.TextWrapper.wordsep_re =\ re.compile(r'(\s+|' # any whitespace r'[^\s\w]*\w+[a-zA-Z]-(?=[a-zA-Z]\w+)|' # hyphenated words r'(?<=[\w\!\"\'\&\.\,\?])-{2,}(?=\w))') # em-dash print textwrap.fill('aaaaaaaaaa 2005-02-21', 18) behaves better: aaaaaaaaaa 2005-02-21 What do you think about changing the default wordsep_re? -- Karl 2005-02-21 03:32 From michel at dialnetwork.com Wed Feb 23 03:04:34 2005 From: michel at dialnetwork.com (Michel Pelletier) Date: Wed Feb 23 00:24:07 2005 Subject: [Python-Dev] UserString In-Reply-To: <20050222110123.608C41E403C@bag.python.org> References: <20050222110123.608C41E403C@bag.python.org> Message-ID: <200502221804.34808.michel@dialnetwork.com> On Tuesday 22 February 2005 03:01 am, Guido wrote: > > BTW, there's *still* no sign from a PEP 246 rewrite. Maybe someone > could offer Clark a hand? (Last time I inquired he was recovering from > a week of illness.) Last summer Alex, Clark, Phillip and I swapped a few emails about reviving the 245/246 drive and submitting a plan for a PSF grant. I was pushing the effort and then had to lamely drop out due to a new job. This is good grant material for someone which leads to my question, when will the next cycle of PSF grants happen? I'm not volunteering and I won't have the bandwidth to participate, but if there are other starving souls out there willing to do the heavy lifting to help Alex it could get done quickly within the PSFs own framework for advancing the language. -Michel From andrewm at object-craft.com.au Wed Feb 23 01:14:45 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Wed Feb 23 01:14:34 2005 Subject: [Python-Dev] UserString In-Reply-To: References: <000001c51703$80f97520$f33ec797@oemcomputer> <0f5201ccd99380eeac0400da69d6d9f7@aleax.it> <42195649.3030400@canterbury.ac.nz> <89b4ed0afdf4a58a4425a588bdbb1965@aleax.it> <1109040601.25187.170.camel@presto.wooz.org> Message-ID: <20050223001445.DB6583C889@coffee.object-craft.com.au> >> if e.errno <> errno.EEXIST: >> raise > >You have a lot more faith in the errno module than I do. Are you sure >the same error codes work on all platforms where Python works? It's >also not exactly readable (except for old Unix hacks). On the other hand, LBYL in this context can result in race conditions and security vulnerabilities. "os.makedirs" is already a composite of many system calls, so all bets are off anyway, but for simpler operations that result in an atomic system call, this is important. 
-- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From tim.peters at gmail.com Wed Feb 23 03:57:22 2005 From: tim.peters at gmail.com (Tim Peters) Date: Wed Feb 23 03:57:25 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python compile.c, 2.344, 2.345 In-Reply-To: References: Message-ID: <1f7befae050222185758fdd46e@mail.gmail.com> [rhettinger@users.sourceforge.net] > Modified Files: > compile.c > Log Message: > Teach the peepholer to fold unary operations on constants. > > Afterwards, -0.5 loads in a single step and no longer requires a runtime > UNARY_NEGATIVE operation. Aargh. The compiler already folded in a leading minus for ints, and exempting floats from this was deliberate. Stick this in a file: import math print math.atan2(-0.0, -0.0) If you run that directly, a decent 754-conforming libm will display an approximation to -pi (-3.14...; this is the required result in C99 if its optional 754 support is implemented, and even MSVC has done this all along). But if you import the same module from a .pyc or .pyo, now on the HEAD it prints 0.0 instead. In 2.4 it still prints -pi. I often say that all behavior in the presence of infinities, NaNs, and signed zeroes is undefined in CPython, and that's strictly true (just _try_ to find reassuring words about any of those cases in the Python docs ). But it's still the case that we (meaning mostly me) strive to preserve sensible 754 semantics when it's reasonably possible to do so. Not even gonzo-optimizing Fortran compilers will convert -0.0 to 0.0 anymore, precisely because it's not semantically neutral. In this case, it's marshal that drops the sign bit of a float 0 on the floor, so surprises result if and only if you run from a precompiled Python module now. I don't think you need to revert the whole patch, but -0.0 must be left alone (or marshal taught to preserve the sign of a float 0.0 -- but then you have the problem of _detecting_ the sign of a float 0.0, and nothing in standard C89 can do so). Even in 754-land, it's OK to fold in the sign for non-zero float literals (-x is always unexceptional in 754 unless x is a signaling NaN, and there are no signaling NaN literals; and the sign bit of any finite float except zero is already preserved by marshal). From kbk at shore.net Wed Feb 23 05:19:55 2005 From: kbk at shore.net (Kurt B. 
Kaiser) Date: Wed Feb 23 05:20:51 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200502230419.j1N4Jthi005718@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 308 open (+10) / 2755 closed ( +1) / 3063 total (+11) Bugs : 838 open (+15) / 4834 closed ( +5) / 5672 total (+20) RFE : 168 open ( +0) / 148 closed ( +4) / 316 total ( +4) New / Reopened Patches ______________________ do not add directory of sys.argv[0] into sys.path (2004-05-02) http://python.org/sf/946373 reopened by wrobell isapi.samples.advanced.py fix (2005-02-17) http://python.org/sf/1126187 opened by Philippe Kirsanov more __contains__ tests (2005-02-17) http://python.org/sf/1141428 opened by Jim Jewett Fix to allow urllib2 digest auth to talk to livejournal.com (2005-02-18) http://python.org/sf/1143695 opened by Benno Rice Add IEEE Float support to wave.py (2005-02-19) http://python.org/sf/1144504 opened by Ben Schwartz cgitb: make more usable for 'binary-only' sw (new patch) (2005-02-19) http://python.org/sf/1144549 opened by Reinhold Birkenfeld allow UNIX mmap size to default to current file size (new) (2005-02-19) http://python.org/sf/1144555 opened by Reinhold Birkenfeld Make OpenerDirector instances pickle-able (2005-02-20) http://python.org/sf/1144636 opened by John J Lee webbrowser.Netscape.open bug fix (2005-02-20) http://python.org/sf/1144816 opened by Pernici Mario Replace store/load pair with a single new opcode (2005-02-20) http://python.org/sf/1144842 opened by Raymond Hettinger Remove some invariant conditions and assert in ceval (2005-02-20) http://python.org/sf/1145039 opened by Neal Norwitz Patches Closed ______________ date.strptime and time.strptime as well (2005-02-04) http://python.org/sf/1116362 closed by josh-sf New / Reopened Bugs ___________________ attempting to use urllib2 on some URLs fails starting on 2.4 (2005-02-16) http://python.org/sf/1123695 opened by Stephan Sokolow descrintro describes __new__ and __init__ behavior wrong (2005-02-15) http://python.org/sf/1123716 opened by Steven Bethard gensuitemodule.processfile fails (2005-02-16) http://python.org/sf/1123727 opened by Jurjen N.E. Bos PyDateTime_FromDateAndTime documented as PyDate_FromDateAndT (2005-02-16) CLOSED http://python.org/sf/1124278 opened by smilechaser Function's __name__ no longer accessible in restricted mode (2005-02-16) CLOSED http://python.org/sf/1124295 opened by Tres Seaver Python24.dll crashes, EXAMPLE ATTACHED (2005-02-12) CLOSED http://python.org/sf/1121201 reopened by complex IDLE line wrapping (2005-02-16) CLOSED http://python.org/sf/1124503 opened by Chris Rebert test_os fails on 2.4 (2005-02-17) CLOSED http://python.org/sf/1124513 reopened by doerwalter test_os fails on 2.4 (2005-02-16) CLOSED http://python.org/sf/1124513 opened by Brett Cannon test_subprocess is far too slow (2005-02-17) http://python.org/sf/1124637 opened by Michael Hudson Math mode not well handled in \documentclass{howto} (2005-02-17) http://python.org/sf/1124692 opened by Daniele Varrazzo GetStdHandle in interactive GUI (2005-02-17) http://python.org/sf/1124861 opened by davids subprocess.py Errors with IDLE (2005-02-17) http://python.org/sf/1126208 opened by Kurt B. Kaiser subprocesss module retains older license header (2005-02-17) http://python.org/sf/1138653 opened by Tres Seaver Python syntax is not so XML friendly! 
(2005-02-18) CLOSED http://python.org/sf/1143855 opened by Colbert Philippe inspect.getsource() breakage in 2.4 (2005-02-18) http://python.org/sf/1143895 opened by Armin Rigo future warning in commets (2005-02-18) http://python.org/sf/1144057 opened by Grzegorz Makarewicz reload() is broken for C extension objects (2005-02-19) http://python.org/sf/1144263 opened by Matthew G. Knepley htmllib quote parse error within a