From reinhold-birkenfeld-nospam at wolke7.net  Sat Jan  1 03:19:03 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Sat Jan  1 03:22:53 2005
Subject: [Python-Dev] Patch Reviewing
Message-ID: <cr51an$jup$1@sea.gmane.org>

Hello,

just felt a little bored and tried to review a few (no-brainer) patches.

Here are the results:

* Patch #1051395

  Minor fix in Lib/locale.py: the docs say that the function
  _parse_localename returns a tuple, but one code path returns a list.

  The patch fixes this by adding tuple(); recommending apply.
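  The kind of one-liner involved can be sketched like this (a hypothetical,
  simplified stand-in, not the actual Lib/locale.py source): a function
  documented to return a tuple had one code path that returned a list.

```python
def parse_localename(localename):
    """Return (language code, encoding) as a tuple."""
    code = localename.lower()
    if '.' in code:
        langname, encoding = code.split('.')[:2]
        return langname, encoding            # already a tuple
    # Before the fix, this path effectively returned a list;
    # wrapping it in tuple() makes the return type consistent.
    return tuple([code, None])

assert isinstance(parse_localename('en_US.ISO8859-1'), tuple)
assert parse_localename('C') == ('c', None)
```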

* Patch #1046831

  Minor fix in Lib/distutils/sysconfig.py: it defines a function to
  retrieve the Python version but does not use it everywhere; the patch
  fixes this. Recommending apply.

* Patch #751031

  Adds recognizing JPEG-EXIF files (produced by digicams) to imghdr.py.
  Recommending apply.

* Patch #712317

  Fixes URL parsing in urlparse for URLs such as http://foo?bar. It
  splits at '?', assigning 'foo' to netloc and 'bar' to query instead
  of 'foo?bar' to netloc. Recommending apply.
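  The behavior the patch produces matches what the modern urllib.parse module
  does today (shown here with the Python 3 name for illustration):

```python
from urllib.parse import urlparse

# The '?' terminates the network location, so 'foo' becomes the
# netloc and 'bar' the query string:
parts = urlparse('http://foo?bar')
assert parts.scheme == 'http'
assert parts.netloc == 'foo'
assert parts.query == 'bar'
```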

regards,
Reinhold

From bac at OCF.Berkeley.EDU  Sat Jan  1 04:11:43 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Sat Jan  1 04:11:59 2005
Subject: [Python-Dev] python-dev Summary for 2004-11-16 through 2004-11-30
	[draft]
Message-ID: <41D614EF.1090804@ocf.berkeley.edu>

With school starting up again Monday and New Year's being tomorrow, I don't 
plan to send this out until Tuesday.

Hope everyone has a good New Year's.

-Brett

-----------------------------------

=====================
Summary Announcements
=====================
PyCon_ is coming up!  It will be held March 23-25 in Washington, DC, and 
registration is now open at http://www.python.org/pycon/2005/register.html 
for credit card users (you can pay by check as well; see the conference's 
general info page).

.. _PyCon: http://www.python.org/pycon/2005/

=========
Summaries
=========
---------------------------------------------
Would you like the source with your function?
---------------------------------------------
Would you like all functions and classes to contain a __pycode__ attribute 
holding a string of the source code used to compile that code object?  Well, 
that very idea was proposed.  You would use a command-line switch to turn the 
feature on, so as to remove the memory and performance overhead for the 
default case of not needing it.

Some might ask why this is needed when inspect.getsource and its ilk exist. 
The perk is that __pycode__ would always exist, while inspect.getsource makes 
a best attempt but cannot guarantee that it will have the source.
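A small illustration of the gap: a function compiled from a string has no
file behind it, so inspect.getsource cannot recover its text, whereas the
proposed __pycode__ would carry the source along with the object.

```python
import inspect

# Compile a function from a string; there is no source file to read.
source = "def built_from_string():\n    return 42\n"
namespace = {}
exec(source, namespace)

try:
    inspect.getsource(namespace['built_from_string'])
    recovered = True
except OSError:          # "could not get source code"
    recovered = False

assert recovered is False
assert namespace['built_from_string']() == 42
```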

Beyond a suggested name change to __source__, various people have suggested 
very different uses.  Some see it as a convenient way to easily save 
interpreter work and thus not lose any nice code snippet developed 
interactively.  Others see a more programmatic use (such as AOP "advice" 
injection).  The two are rather different, and the thread ended with the 
suggestion that a PEP be written specifying the intended use case, to make 
sure that need is properly met.

Contributing threads:
   - `__pycode__ extension <>`__

===============
Skipped Threads
===============
- PEP 310 Status
- python 2.3.5 release?
       look for 2.3.5 possibly in January
- Current CVS, Cygwin and "make test"
- syntactic shortcut - unpack to variably sized list
       mostly discussed `last summary`_
- Python 2.4, MS .NET 1.1 and distutils
- Trouble installing 2.4
- Looking for authoritative documentation on packages, import & ihooks
       no docs exist, but feel free to write some!  =)
- String literal concatenation & docstrings
       literal string concatenation only works if the newline separating the
       strings is not significant to the parser
- print "%X" % id(object()) not so nice
       does 'id' need to return only a positive?  No, but it would be nice.
- Bug in PyLocale_strcoll
- Multilib strikes back
- File encodings
       file.write does not work with Unicode strings; you have to encode them
       to ASCII on your own
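The string-literal concatenation note above boils down to this: adjacent
literals are merged at compile time, but only when the parser treats the
intervening newline as insignificant (e.g. inside parentheses):

```python
# Adjacent string literals separated by an "insignificant" newline
# (here, inside parentheses) are concatenated at compile time:
msg = ("Hello, "
       "world")
assert msg == "Hello, world"
```
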
From python at rcn.com  Sat Jan  1 03:53:27 2005
From: python at rcn.com (Raymond Hettinger)
Date: Sat Jan  1 04:33:09 2005
Subject: [Python-Dev] Patch Reviewing
References: <cr51an$jup$1@sea.gmane.org>
Message-ID: <000201c4efb2$2e8b1320$43facc97@oemcomputer>

[Reinhold Birkenfeld]
> just felt a little bored and tried to review a few (no-brainer) patches.

Thanks, please assign to me and I'll apply them.


Raymond Hettinger
From kbk at shore.net  Sat Jan  1 05:55:36 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Sat Jan  1 05:55:47 2005
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200501010455.j014taqo000992@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  261 open ( +4) /  2718 closed ( +3) /  2979 total ( +7)
Bugs    :  801 open ( -6) /  4733 closed (+16) /  5534 total (+10)
RFE     :  165 open ( +2) /   139 closed ( +0) /   304 total ( +2)

New / Reopened Patches
______________________

Patch for bug 999042.  (2004-12-23)
       http://python.org/sf/1090482  opened by  Darek Suchojad

_AEModule.c patch  (2004-12-25)
       http://python.org/sf/1090958  opened by  has

py-compile DESTDIR support to compile in correct paths  (2004-12-27)
CLOSED http://python.org/sf/1091679  opened by  Thomas Vander Stichele

Refactoring Python/import.c  (2004-12-30)
       http://python.org/sf/1093253  opened by  Thomas Heller

socket leak in SocketServer  (2004-12-30)
       http://python.org/sf/1093468  opened by  Shannon -jj Behrens

sanity check for readline remove/replace  (2004-12-30)
       http://python.org/sf/1093585  opened by  DSM

miscellaneous doc typos  (2004-12-31)
CLOSED http://python.org/sf/1093896  opened by  DSM

Patches Closed
______________

Avoid calling tp_compare with different types  (2004-07-22)
       http://python.org/sf/995939  closed by  arigo

py-compile DESTDIR support to compile in correct paths  (2004-12-27)
       http://python.org/sf/1091679  closed by  jafo

miscellaneous doc typos  (2004-12-31)
       http://python.org/sf/1093896  closed by  rhettinger

New / Reopened Bugs
___________________

presentation typo in lib: 6.21.4.2 How callbacks are called  (2004-12-23)
       http://python.org/sf/1090139  reopened by  jlgijsbers

input from numeric pad always dropped when numlock off  (2004-11-27)
       http://python.org/sf/1074333  reopened by  kbk

minor bug in what's new > decorators  (2004-12-26)
CLOSED http://python.org/sf/1091302  opened by  vincent wehren

A large block of commands after an "if" cannot be   (2003-03-28)
       http://python.org/sf/711268  reopened by  facundobatista

DESTROOTed frameworkinstall fails  (2004-12-26)
CLOSED http://python.org/sf/1091468  opened by  Jack Jansen

No need to fix  (2004-12-27)
CLOSED http://python.org/sf/1091634  opened by  Bertram Scharpf

garbage collector still documented as optional  (2004-12-27)
       http://python.org/sf/1091740  opened by  Gregory H. Ball

IDLE hangs due to subprocess  (2004-12-28)
       http://python.org/sf/1092225  opened by  ZACK

slice [0:] default is len-1 not len  (2004-12-28)
CLOSED http://python.org/sf/1092240  opened by  Robert Phillips

Memory leak in socket.py on Mac OS X 10.3  (2004-12-28)
       http://python.org/sf/1092502  opened by  bacchusrx

os.remove fails on win32 with read-only file  (2004-12-29)
       http://python.org/sf/1092701  opened by  Joshua Weage

Make Generators Pickle-able  (2004-12-29)
       http://python.org/sf/1092962  opened by  Jayson Vantuyl

distutils/tests not installed  (2004-12-30)
       http://python.org/sf/1093173  opened by  Armin Rigo

mapitags.PROP_TAG() doesn't account for new longs  (2004-12-30)
       http://python.org/sf/1093389  opened by  Joe Hildebrand

Bugs Closed
___________

presentation typo in lib: 6.21.4.2 How callbacks are called  (2004-12-22)
       http://python.org/sf/1090139  closed by  rhettinger

Memory leaks?  (2004-10-16)
       http://python.org/sf/1048495  closed by  rhettinger

_bsddb segfault  (2004-07-15)
       http://python.org/sf/991754  closed by  dcjim

coercion results used dangerously  (2004-06-26)
       http://python.org/sf/980352  closed by  arigo

exec scoping problem  (2004-12-22)
       http://python.org/sf/1089978  closed by  arigo

_DummyThread() objects not freed from threading._active map  (2004-12-22)
       http://python.org/sf/1089632  closed by  bcannon

Mac Library Modules 1.1.1 Bad Info  (2004-12-14)
       http://python.org/sf/1085300  closed by  bcannon

minor bug in what's new > decorators  (2004-12-26)
       http://python.org/sf/1091302  closed by  montanaro

A large block of commands after an "if" cannot be   (2003-03-28)
       http://python.org/sf/711268  closed by  bcannon

Failed assert in stringobject.c  (2003-05-14)
       http://python.org/sf/737947  closed by  facundobatista

DESTROOTed frameworkinstall fails  (2004-12-26)
       http://python.org/sf/1091468  closed by  jackjansen

nturl2path.url2pathname() mishandles ///  (2002-12-07)
       http://python.org/sf/649961  closed by  mike_j_brown

No need to fix  (2004-12-27)
       http://python.org/sf/1091634  closed by  mwh

2.4a3: unhelpful error message from distutils  (2004-09-03)
       http://python.org/sf/1021756  closed by  effbot

BuildApplication includes many unneeded modules  (2004-12-01)
       http://python.org/sf/1076492  closed by  jackjansen

slice [0:] default is len-1 not len  (2004-12-28)
       http://python.org/sf/1092240  closed by  jlgijsbers

truncated gzip file triggers zlibmodule segfault  (2004-12-10)
       http://python.org/sf/1083110  closed by  akuchling

os.ttyname() accepts wrong arguments  (2004-12-07)
       http://python.org/sf/1080713  closed by  akuchling

New / Reopened RFE
__________________

Distutils needs a way *not* to install files  (2004-12-28)
       http://python.org/sf/1092365  opened by  Mike Orr

From bob at redivi.com  Sun Jan  2 04:40:35 2005
From: bob at redivi.com (Bob Ippolito)
Date: Sun Jan  2 04:40:41 2005
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/OSX
	fixapplepython23.py, 1.2, 1.3
In-Reply-To: <E1CkroZ-0003lT-Gz@sc8-pr-cvs1.sourceforge.net>
References: <E1CkroZ-0003lT-Gz@sc8-pr-cvs1.sourceforge.net>
Message-ID: <0FFA1732-5C70-11D9-BBAB-000A9567635C@redivi.com>

On Jan 1, 2005, at 5:33 PM, jackjansen@users.sourceforge.net wrote:

> Update of /cvsroot/python/python/dist/src/Mac/OSX
> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv14408
>
> Modified Files:
> 	fixapplepython23.py
> Log Message:
> Create the wrapper scripts for gcc/g++ too.
>
> +SCRIPT="""#!/bin/sh
> +export MACOSX_DEPLOYMENT_TARGET=10.3
> +exec %s "${@}"

This script should check to see if MACOSX_DEPLOYMENT_TARGET is already 
set.  If I have some reason to set MACOSX_DEPLOYMENT_TARGET=10.4 for 
compilation (say I'm compiling an extension that requires 10.4 
features) then I'm going to have some serious problems with this fix.
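In Python terms, the guard being suggested looks roughly like this
(effective_target is a hypothetical name; the real wrapper is a shell
script, so this is only a sketch of the logic):

```python
def effective_target(environ, default="10.3"):
    # Honor a caller-supplied MACOSX_DEPLOYMENT_TARGET and only fall
    # back to 10.3 when it is unset -- the check the wrapper lacks.
    return environ.get("MACOSX_DEPLOYMENT_TARGET", default)

assert effective_target({}) == "10.3"
assert effective_target({"MACOSX_DEPLOYMENT_TARGET": "10.4"}) == "10.4"
```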

-bob

From Jack.Jansen at cwi.nl  Sun Jan  2 22:28:22 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Sun Jan  2 22:28:11 2005
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/OSX
	fixapplepython23.py, 1.2, 1.3
In-Reply-To: <0FFA1732-5C70-11D9-BBAB-000A9567635C@redivi.com>
References: <E1CkroZ-0003lT-Gz@sc8-pr-cvs1.sourceforge.net>
	<0FFA1732-5C70-11D9-BBAB-000A9567635C@redivi.com>
Message-ID: <3A6CE662-5D05-11D9-81BB-000D934FF6B4@cwi.nl>


On 2-jan-05, at 4:40, Bob Ippolito wrote:
>> +SCRIPT="""#!/bin/sh
>> +export MACOSX_DEPLOYMENT_TARGET=10.3
>> +exec %s "${@}"
>
> This script should check to see if MACOSX_DEPLOYMENT_TARGET is already 
> set.  If I have some reason to set MACOSX_DEPLOYMENT_TARGET=10.4 for 
> compilation (say I'm compiling an extension that requires 10.4 
> features) then I'm going to have some serious problems with this fix.

I was going to do that, but then I thought it didn't make any sense, 
because this script is *only* used in the context of Apple-provided 
Python 2.3. And setting MACOSX_DEPLOYMENT_TARGET to anything other than 
10.3 (be it lower or higher) while compiling an extension for Apple's 
2.3 is going to produce disappointing results anyway.

But, if I've missed a use case, please enlighten me.
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

From bob at redivi.com  Sun Jan  2 22:35:16 2005
From: bob at redivi.com (Bob Ippolito)
Date: Sun Jan  2 22:35:24 2005
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/OSX
	fixapplepython23.py, 1.2, 1.3
In-Reply-To: <3A6CE662-5D05-11D9-81BB-000D934FF6B4@cwi.nl>
References: <E1CkroZ-0003lT-Gz@sc8-pr-cvs1.sourceforge.net>
	<0FFA1732-5C70-11D9-BBAB-000A9567635C@redivi.com>
	<3A6CE662-5D05-11D9-81BB-000D934FF6B4@cwi.nl>
Message-ID: <3192B720-5D06-11D9-8981-000A9567635C@redivi.com>


On Jan 2, 2005, at 4:28 PM, Jack Jansen wrote:

>
> On 2-jan-05, at 4:40, Bob Ippolito wrote:
>>> +SCRIPT="""#!/bin/sh
>>> +export MACOSX_DEPLOYMENT_TARGET=10.3
>>> +exec %s "${@}"
>>
>> This script should check to see if MACOSX_DEPLOYMENT_TARGET is 
>> already set.  If I have some reason to set 
>> MACOSX_DEPLOYMENT_TARGET=10.4 for compilation (say I'm compiling an 
>> extension that requires 10.4 features) then I'm going to have some 
>> serious problems with this fix.
>
> I was going to do that, but then I thought it didn't make any sense, 
> because this script is *only* used in the context of Apple-provided 
> Python 2.3. And setting MACOSX_DEPLOYMENT_TARGET to anything other 
> than 10.3 (be it lower or higher) while compiling an extension for 
> Apple's 2.3 is going to produce disappointing results anyway.
>
> But, if I've missed a use case, please enlighten me.

You're right, of course.  I had realized that I was commenting on the 
fixpython script after I had replied, but my concern is still 
applicable to whatever solution is used for Python 2.4.1.  Anything 
lower than 10.3 is of course an error, in either case.

-bob

From Jack.Jansen at cwi.nl  Mon Jan  3 00:16:07 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Mon Jan  3 00:16:02 2005
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/OSX
	fixapplepython23.py, 1.2, 1.3
In-Reply-To: <3192B720-5D06-11D9-8981-000A9567635C@redivi.com>
References: <E1CkroZ-0003lT-Gz@sc8-pr-cvs1.sourceforge.net>
	<0FFA1732-5C70-11D9-BBAB-000A9567635C@redivi.com>
	<3A6CE662-5D05-11D9-81BB-000D934FF6B4@cwi.nl>
	<3192B720-5D06-11D9-8981-000A9567635C@redivi.com>
Message-ID: <47FBE1CA-5D14-11D9-81BB-000D934FF6B4@cwi.nl>


On 2-jan-05, at 22:35, Bob Ippolito wrote:
>> On 2-jan-05, at 4:40, Bob Ippolito wrote:
>>>> +SCRIPT="""#!/bin/sh
>>>> +export MACOSX_DEPLOYMENT_TARGET=10.3
>>>> +exec %s "${@}"
>>>
>>> This script should check to see if MACOSX_DEPLOYMENT_TARGET is 
>>> already set.  If I have some reason to set 
>>> MACOSX_DEPLOYMENT_TARGET=10.4 for compilation (say I'm compiling an 
>>> extension that requires 10.4 features) then I'm going to have some 
>>> serious problems with this fix.
>>
>> I was going to do that, but then I thought it didn't make any sense, 
>> because this script is *only* used in the context of Apple-provided 
>> Python 2.3. And setting MACOSX_DEPLOYMENT_TARGET to anything other 
>> than 10.3 (be it lower or higher) while compiling an extension for 
>> Apple's 2.3 is going to produce disappointing results anyway.
>>
>> But, if I've missed a use case, please enlighten me.
>
> You're right, of course.  I had realized that I was commenting on the 
> fixpython script after I had replied, but my concern is still 
> applicable to whatever solution is used for Python 2.4.1.  Anything 
> lower than 10.3 is of course an error, in either case.

2.4.1 will install this fix into Apple-installed Python 2.3 (if 
applicable, i.e. if you're installing 2.4.1 on 10.3), but for its own 
use it will have the newer distutils, which understands that it needs 
to pick up MACOSX_DEPLOYMENT_TARGET from the Makefile, so it'll never 
see these scripts.
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

From bob at redivi.com  Mon Jan  3 03:43:32 2005
From: bob at redivi.com (Bob Ippolito)
Date: Mon Jan  3 03:43:43 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks
	allocations
Message-ID: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>

Quite a few notable places in the Python sources expect realloc(...) to 
relinquish some memory if the requested size is smaller than the 
currently allocated size.  This is definitely not true on Darwin, and 
possibly other platforms.  I have tested this on OpenBSD and Linux, and 
the implementations on these platforms do appear to relinquish memory, 
but I didn't read the implementation.  I haven't been able to find any 
documentation that states that realloc should make this guarantee, but 
I figure Darwin does this as an "optimization" and because Darwin 
probably can't resize mmap'ed memory (at least it can't from Python, 
but this probably means it doesn't have this capability at all).

It is possible to "fix" this for Darwin, because you can ask the 
default malloc zone how big a particular allocation is, and how big an 
allocation of a given size will actually be (see: <malloc/malloc.h>).  
The obvious place to put this would be PyObject_Realloc, because this 
is at least called by _PyString_Resize (which will fix 
<http://python.org/sf/1092502>).

Should I write up a patch that "fixes" this?  I guess the best thing to 
do would be to determine whether the fix should be used at runtime, by 
allocating a meg or so, resizing it to 1 byte, and see if the size of 
the allocation changes.  If the size of the allocation does change, 
then the system realloc can be trusted to do what Python expects it to 
do, otherwise realloc should be done "cleanly" by allocating a new 
block (returning the original on failure, because it's good enough and 
some places in Python seem to expect that shrink will never fail), 
memcpy, free, return new block.
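The runtime probe described above can be sketched with the allocator calls
stubbed out (realloc_shrinks and probe are hypothetical names; a real probe
would call malloc/realloc and the platform's malloc_size or
malloc_usable_size):

```python
def realloc_shrinks(probe, start=1 << 20):
    """probe(old_size, new_size) -> reported allocation size after a
    shrinking realloc.  True if the allocator relinquished memory."""
    return probe(start, 1) < start

# Darwin-like allocator: the block keeps its original size.
assert realloc_shrinks(lambda old, new: old) is False
# An allocator that returns memory: the block really gets smaller.
assert realloc_shrinks(lambda old, new: new) is True
```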

I wrote up a small hack that does this realloc indirection to CVS 
trunk, and it doesn't seem to cause any measurable difference in 
pystone performance.

Note that all versions of Darwin that I've looked at (6.x, 7.x, and 
8.0b1 corresponding to publicly available WWDC 2004 Tiger code) have 
this "issue", but it might go away by Mac OS X 10.4 or some later 
release.

This URL points to the sf bug and Darwin 7.7's realloc(...) 
implementation: 
http://bob.pythonmac.org/archives/2005/01/01/realloc-doesnt/

-bob

From tim.peters at gmail.com  Mon Jan  3 06:13:22 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Mon Jan  3 06:13:25 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks
	allocations
In-Reply-To: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
Message-ID: <1f7befae05010221134a94eccd@mail.gmail.com>

[Bob Ippolito]
> Quite a few notable places in the Python sources expect realloc(...) to
> relinquish some memory if the requested size is smaller than the
> currently allocated size.

I don't know what "relinquish some memory" means.  If it means
something like "returns memory to the OS, so that the reported process
size shrinks", then no, nothing in Python ever assumes that.  That's
simply because "returns memory to the OS" and "process size" aren't
concepts in the C standard, and so nothing can be said about them in
general -- not in theory, and neither in practice, because platforms
(OS+libc combos) vary so widely in behavior here.

As a pragmatic matter, I *expect* that a production-quality realloc()
implementation will at least be able to reuse released memory,
provided that the amount released is at least half the amount
originally malloc()'ed (and, e.g., reasonable buddy systems may not be
able to do better than that).

> This is definitely not true on Darwin, and possibly other platforms.  I have tested
> this on OpenBSD and Linux, and the implementations on these platforms do
> appear to relinquish memory,

As above, don't know what this means.

> but I didn't read the implementation.  I haven't been able to find any
> documentation that states that realloc should make this guarantee,

realloc() guarantees very little; it certainly doesn't guarantee
anything, e.g., about OS interactions or process sizes.

> but I figure Darwin does this as an "optimization" and because Darwin
> probably can't resize mmap'ed memory (at least it can't from Python,
> but this probably means it doesn't have this capability at all).
>
> It is possible to "fix" this for Darwin,

I don't understand what's "broken".  Small objects go thru Python's
own allocator, which has its own realloc policies and its own
peculiarities (chiefly that pymalloc never free()s any memory
allocated for small objects).

> because you can ask the default malloc zone how big a particular
> allocation is, and how big an allocation of a given size will actually
> be (see: <malloc/malloc.h>).
> The obvious place to put this would be PyObject_Realloc, because this
> is at least called by _PyString_Resize (which will fix
> <http://python.org/sf/1092502>).

The diagnosis in the bug report seems to leave it pointing at
socket.py's _fileobject.read(), although I suspect the real cause is
in socketmodule.c's sock_recv().  We've had other reports of various
problems when people pass absurdly large values to socket recv().  A
better fix here would probably amount to rewriting sock_recv() to
refuse to pass enormous numbers to the platform recv() (it appears
that many platform recv() implementations simply don't expect a recv()
argument to be much bigger than the native network buffer size, and
screw up when that's not so).

> Should I write up a patch that "fixes" this?  I guess the best thing to
> do would be to determine whether the fix should be used at runtime, by
> allocating a meg or so, resizing it to 1 byte, and see if the size of
> the allocation changes.  If the size of the allocation does change,
> then the system realloc can be trusted to do what Python expects it to
> do, otherwise realloc should be done "cleanly" by allocating a new
> block (returning the original on failure, because it's good enough and
> some places in Python seem to expect that shrink will never fail),

Yup, that assumption (that a non-growing realloc can't fail) is all
over the place.

> memcpy, free, return new block.
>
> I wrote up a small hack that does this realloc indirection to CVS
> trunk, and it doesn't seem to cause any measurable difference in
> pystone performance.
> 
> Note that all versions of Darwin that I've looked at (6.x, 7.x, and
> 8.0b1 corresponding to publicly available WWDC 2004 Tiger code) have
> this "issue", but it might go away by Mac OS X 10.4 or some later
> release.
> 
> This URL points to the sf bug and Darwin 7.7's realloc(...)
> implementation:
> http://bob.pythonmac.org/archives/2005/01/01/realloc-doesnt/

It would be good to rewrite sock_recv() more defensively in any case. 
Best I can tell, this implementation of realloc() is
standard-conforming but uniquely brain dead in its downsize behavior. 
I don't expect the latter will last (as you say on your page,
"probably plenty of other software" also makes the same pragmatic
assumptions about realloc downsize behavior), so I'm not keen to gunk
up Python to worm around it.
From bob at redivi.com  Mon Jan  3 07:08:24 2005
From: bob at redivi.com (Bob Ippolito)
Date: Mon Jan  3 07:08:37 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks
	allocations
In-Reply-To: <1f7befae05010221134a94eccd@mail.gmail.com>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
Message-ID: <E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>

On Jan 3, 2005, at 12:13 AM, Tim Peters wrote:

> [Bob Ippolito]
>> Quite a few notable places in the Python sources expect realloc(...) 
>> to
>> relinquish some memory if the requested size is smaller than the
>> currently allocated size.
>
> I don't know what "relinquish some memory" means.  If it means
> something like "returns memory to the OS, so that the reported process
> size shrinks", then no, nothing in Python ever assumes that.  That's
> simply because "returns memory to the OS" and "process size" aren't
> concepts in the C standard, and so nothing can be said about them in
> general -- not in theory, and neither in practice, because platforms
> (OS+libc combos) vary so widely in behavior here.
>
> As a pragmatic matter, I *expect* that a production-quality realloc()
> implementation will at least be able to reuse released memory,
> provided that the amount released is at least half the amount
> originally malloc()'ed (and, e.g., reasonable buddy systems may not be
> able to do better than that).

This is what I meant by relinquish (c/o merriam-webster):
     a : to stop holding physically : RELEASE <slowly relinquished his 
grip on the bar>
     b : to give over possession or control of : YIELD <few leaders 
willingly relinquish power>

Your expectation is not correct for Darwin's memory allocation scheme.  
It seems that Darwin creates allocations of immutable size.  The only 
way ANY part of an allocation will ever be used by ANYTHING else is if 
free() is called with that allocation.  free() can be called either 
explicitly, or implicitly by calling realloc() with a size larger than 
the size of the allocation.  In that case, it will create a new 
allocation of at least the requested size, copy the contents of the 
original allocation into the new allocation (probably with 
copy-on-write pages if it's large enough, so it might be cheap), and 
free() the allocation.  In the case where realloc() specifies a size 
that is not greater than the allocation's size, it will simply return 
the given allocation and cause no side-effects whatsoever.

Was this a good decision?  Probably not!  However, it is our (in the "I 
know you use Windows but I am not the only one that uses Mac OS X" 
sense) problem so long as Darwin is a supported platform, because it is 
highly unlikely that Apple will backport any "fix" to the allocator 
unless we can prove it has some security implications in software 
shipped with their OS.  I attempted to look for some easy ones by 
performing a quick audit of Apache, OpenSSH, and OpenSSL.  
Unfortunately, their developers did not share your expectation.  I 
found one sprintf-like routine in Apache that could be affected by this 
behavior, and one instance of immutable string creation in Apple's 
CoreFoundation CFString implementation, but I have yet to find an easy 
way to exploit this behavior from the outside.  I should probably be 
looking at PHP and Perl instead ;)

>> but I figure Darwin does this as an "optimization" and because Darwin
>> probably can't resize mmap'ed memory (at least it can't from Python,
>> but this probably means it doesn't have this capability at all).
>>
>> It is possible to "fix" this for Darwin,
>
> I don't understand what's "broken".  Small objects go thru Python's
> own allocator, which has its own realloc policies and its own
> peculiarities (chiefly that pymalloc never free()s any memory
> allocated for small objects).

What's broken is that there are several places in Python that seem to 
assume that you can allocate a large chunk of memory, and make it 
smaller in some meaningful way with realloc(...).  This is not true 
with Darwin.  You are right about small objects.  They don't matter 
because they're small, and because they're handled by Python's 
allocator.

>> because you can ask the default malloc zone how big a particular
>> allocation is, and how big an allocation of a given size will actually
>> be (see: <malloc/malloc.h>).
>> The obvious place to put this would be PyObject_Realloc, because this
>> is at least called by _PyString_Resize (which will fix
>> <http://python.org/sf/1092502>).
>
> The diagnosis in the bug report seems to leave it pointing at
> socket.py's _fileobject.read(), although I suspect the real cause is
> in socketmodule.c's sock_recv().  We've had other reports of various
> problems when people pass absurdly large values to socket recv().  A
> better fix here would probably amount to rewriting sock_recv() to
> refuse to pass enormous numbers to the platform recv() (it appears
> that many platform recv() implementations simply don't expect a recv()
> argument to be much bigger than the native network buffer size, and
> screw up when that's not so).

You are correct.  The real cause is in sock_recv(), and/or 
_PyString_Resize(), depending on how you look at it.

>> Note that all versions of Darwin that I've looked at (6.x, 7.x, and
>> 8.0b1 corresponding to publicly available WWDC 2004 Tiger code) have
>> this "issue", but it might go away by Mac OS X 10.4 or some later
>> release.
>
> It would be good to rewrite sock_recv() more defensively in any case.
> Best I can tell, this implementation of realloc() is
> standard-conforming but uniquely brain dead in its downsize behavior.

Presumably this can happen at other places (including third party 
extensions), so a better place to do this might be _PyString_Resize().  
list_resize() is another reasonable place to put this.  I'm sure there 
are other places that use realloc() too, and the majority of them do 
this through obmalloc.  So maybe instead of trying to track down all 
the places where this can manifest, we should just "gunk up" Python and 
patch PyObject_Realloc()?  Since we are both pretty confident that 
other allocators aren't like Darwin, this "gunk" can be #ifdef'ed to 
the __APPLE__ case.

> I don't expect the latter will last (as you say on your page,
> "probably plenty of other software" also makes the same pragmatic
> assumptions about realloc downsize behavior), so I'm not keen to gunk
> up Python to worm around it.

As I said above, I haven't yet found any other software that makes the 
same kind of realloc() assumptions that Python does.  I'm sure I'll 
find something, but what's important to me is that Python works well on 
Mac OS X, so something should happen.  If we can't prove that Apple's 
allocation strategy is a security flaw in some service that ships with 
the OS, any improvements to this strategy are very unlikely to be 
backported to current versions of Mac OS X.

-bob

From tim.peters at gmail.com  Mon Jan  3 08:16:34 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Mon Jan  3 08:16:54 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks
	allocations
In-Reply-To: <E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
	<E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
Message-ID: <1f7befae050102231638b0d39d@mail.gmail.com>

[Bob Ippolito]
> ...
> Your expectation is not correct for Darwin's memory allocation scheme.
> It seems that Darwin creates allocations of immutable size.  The only
> way ANY part of an allocation will ever be used by ANYTHING else is if
> free() is called with that allocation.

Ya, I understood that.  My conclusion was that Darwin's realloc()
implementation isn't production-quality.  So it goes.

>  free() can be called either explicitly, or implicitly by calling realloc() with
> a size larger than the size of the allocation.  In that case, it will create a new
> allocation of at least the requested size, copy the contents of the
> original allocation into the new allocation (probably with
> copy-on-write pages if it's large enough, so it might be cheap), and
> free() the allocation.

Really?  Another near-universal "quality of implementation"
expectation is that a growing realloc() will strive to extend
in-place.  Like realloc(malloc(1000000), 1000001).  For example, the
theoretical guarantee that one-at-a-time list.append() has amortized
linear time doesn't depend on that, but pragmatically it's greatly
helped by a reasonable growing realloc() implementation.

>  In the case where realloc() specifies a size that is not greater than the
> allocation's size, it will simply return the given allocation and cause no side-
> effects whatsoever.
>
> Was this a good decision?  Probably not!

Sounds more like a bug (or two) to me than "a decision", but I don't know.

>  However, it is our (in the "I know you use Windows but I am not the only
> one that uses Mac OS X" sense) problem so long as Darwin is a supported
> platform, because it is highly unlikely that Apple will backport any "fix" to
> the allocator unless we can prove it has some security implications in
> software shipped with their OS. ...

Is there any known case where Python performs poorly on this OS, for
this reason, other than the "pass giant numbers to recv() and then
shrink the string because we didn't get anywhere near that many bytes"
case?  Claiming rampant performance problems should require evidence
too <wink>.

...
> Presumably this can happen at other places (including third party
> extensions), so a better place to do this might be _PyString_Resize().
> list_resize() is another reasonable place to put this.  I'm sure there
> are other places that use realloc() too, and the majority of them do
> this through obmalloc.  So maybe instead of trying to track down all
> the places where this can manifest, we should just "gunk up" Python and
> patch PyObject_Realloc()?

There is no "choke point" for allocations in Python -- some places
call the system realloc() directly.  Maybe the latter matter on Darwin
too, but maybe they don't.  The scope of this hack spreads if they do.
 I have no idea how often realloc() is called directly by 3rd-party
extension modules.  It's called directly a lot in Zope's C code, but
AFAICT only to grow vectors, never to shrink them.
> Since we are both pretty confident that other allocators aren't like Darwin,
> this "gunk" can be #ifdef'ed to the __APPLE__ case.

#ifdef's are a last resort:  they almost never go away, so they
complicate the code forever after, and typically stick around for
years even after the platform problems they intended to address have
been fixed.  For obvious reasons, they're also an endless source of
platform-specific bugs.

Note that pymalloc already does a memcpy+free when in
PyObject_Realloc(p, n) p was obtained from the system malloc or
realloc but n is small enough to meet the "small object" threshold
(pymalloc "takes over" small blocks that result from a
PyObject_Realloc()).  That's a reasonable strategy *because* n is
always small in such cases.  If you're going to extend this strategy
to n of arbitrary size, then you may also create new performance
problems for some apps on Darwin (copying n bytes can get arbitrarily
expensive).

> ...
>  I'm sure I'll find something, but what's important to me is that Python
> works well on Mac OS X, so something should happen.

I agree the socket-abuse case should be fiddled, and for more reasons
than just Darwin's realloc() quirks.  I don't know that there are
actual problems on Darwin broader than that case (and I'm not
challenging you to contrive one, I'm asking whether realloc() quirks
are suspected in any other case that's known).  Part of what you
demonstrated when you said that pystone didn't slow down when you
fiddled stuff is that pystone also didn't speed up.  I also don't know
that the memcpy+free wormaround is actually going to help more than it
hurts overall.  Yes, in the socket-abuse case, where the program
routinely malloc()s strings millions of bytes larger than the socket
can deliver, it would obviously help.  That's not typically program
behavior (however typical it may be of that specific app).  More
typical is shrinking a long list one element at a time, in which case
about half the list remaining would get memcpy'd from time to time
where such copies never get made today.

IOW, there's no straightforward pure win here.

From gvanrossum at gmail.com  Mon Jan  3 08:17:59 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon Jan  3 08:18:18 2005
Subject: [Python-Dev] Zipfile needs?
In-Reply-To: <41D1B0C6.8040208@ocf.berkeley.edu>
References: <cqq8mc$3g9$1@sea.gmane.org> <41D1B0C6.8040208@ocf.berkeley.edu>
Message-ID: <ca471dc205010223171b9b2e15@mail.gmail.com>

> Encryption/decryption support.  Will most likely require a C extension since
> the algorithm relies on ints (or longs, don't remember) wrapping around when
> the value becomes too large.

You may want to do this in C for speed, but C-style int wrapping is
easily done by doing something like "x = x & 0xFFFFFFFFL" at crucial
points in the code (for unsigned 32-bit ints) with an additional "if x
& 0x80000000L: x -= 0x100000000L" to simulate signed 32-bit ints.
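Concretely, the masking trick looks like this (a sketch in modern Python without the `L` long-literal suffix; the helper names and the PKZIP-style multiplier in the usage line are illustrative, not the actual zipfile code):

```python
def u32(x):
    """Wrap x to an unsigned 32-bit value, as C unsigned int would."""
    return x & 0xFFFFFFFF

def s32(x):
    """Wrap x to a signed 32-bit value, as C int would."""
    x &= 0xFFFFFFFF
    if x & 0x80000000:
        x -= 0x100000000
    return x

# One place this matters: a PKZIP-style "update keys" step, where a
# running key would otherwise grow without bound.  key0 and the
# multiplier here are just stand-ins.
key0 = 0x12345678
key0 = u32(key0 * 134775813 + 1)   # stays within 32 bits
```

Applying `u32()` only at the points where C would overflow keeps the pure-Python version correct, at the cost of the masking overhead Guido mentions.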

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bob at redivi.com  Mon Jan  3 16:48:00 2005
From: bob at redivi.com (Bob Ippolito)
Date: Mon Jan  3 16:48:14 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks
	allocations
In-Reply-To: <1f7befae050102231638b0d39d@mail.gmail.com>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
	<E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050102231638b0d39d@mail.gmail.com>
Message-ID: <D8729D50-5D9E-11D9-8981-000A9567635C@redivi.com>

On Jan 3, 2005, at 2:16 AM, Tim Peters wrote:

> [Bob Ippolito]
>> ...
>> Your expectation is not correct for Darwin's memory allocation scheme.
>> It seems that Darwin creates allocations of immutable size.  The only
>> way ANY part of an allocation will ever be used by ANYTHING else is if
>> free() is called with that allocation.
>
> Ya, I understood that.  My conclusion was that Darwin's realloc()
> implementation isn't production-quality.  So it goes.

Whatever that means.

>>  free() can be called either explicitly, or implicitly by calling
>> realloc() with a size larger than the size of the allocation.  In that
>> case, it will create a new allocation of at least the requested size,
>> copy the contents of the original allocation into the new allocation
>> (probably with copy-on-write pages if it's large enough, so it might be
>> cheap), and free() the allocation.
>
> Really?  Another near-universal "quality of implementation"
> expectation is that a growing realloc() will strive to extend
> in-place.  Like realloc(malloc(1000000), 1000001).  For example, the
> theoretical guarantee that one-at-a-time list.append() has amortized
> linear time doesn't depend on that, but pragmatically it's greatly
> helped by a reasonable growing realloc() implementation.

I said that it created allocations of fixed size, not that it created 
allocations of exactly the size you asked it to.  Yes, it will extend 
in-place for many cases, including the given.

>>  In the case where realloc() specifies a size that is not greater than
>> the allocation's size, it will simply return the given allocation and
>> cause no side-effects whatsoever.
>>
>> Was this a good decision?  Probably not!
>
> Sounds more like a bug (or two) to me than "a decision", but I don't 
> know.

You said yourself that it is standards compliant ;)  I have filed it as 
a bug, but it is probably unlikely to be backported to current versions 
of Mac OS X unless a case can be made that it is indeed a security 
flaw.

>>  However, it is our (in the "I know you use Windows but I am not the
>> only one that uses Mac OS X" sense) problem so long as Darwin is a
>> supported platform, because it is highly unlikely that Apple will
>> backport any "fix" to the allocator unless we can prove it has some
>> security implications in software shipped with their OS. ...
>
> Is there any known case where Python performs poorly on this OS, for
> this reason, other than the "pass giant numbers to recv() and then
> shrink the string because we didn't get anywhere near that many bytes"
> case?  Claiming rampant performance problems should require evidence
> too <wink>.

Known case?  No.  Do I want to search Python application-space to find 
one?  No.

>> Presumably this can happen at other places (including third party
>> extensions), so a better place to do this might be _PyString_Resize().
>> list_resize() is another reasonable place to put this.  I'm sure there
>> are other places that use realloc() too, and the majority of them do
>> this through obmalloc.  So maybe instead of trying to track down all
>> the places where this can manifest, we should just "gunk up" Python
>> and patch PyObject_Realloc()?
>
> There is no "choke point" for allocations in Python -- some places
> call the system realloc() directly.  Maybe the latter matter on Darwin
> too, but maybe they don't.  The scope of this hack spreads if they do.
>  I have no idea how often realloc() is called directly by 3rd-party
> extension modules.  It's called directly a lot in Zope's C code, but
> AFAICT only to grow vectors, never to shrink them.

In the case of Python, "some places" means "nowhere relevant".  Four 
standard library extension modules relevant to the platform use realloc 
directly:

_sre
     Uses realloc only to grow buffers.
cPickle
     Uses realloc only to grow buffers.
cStringIO
     Uses realloc only to grow buffers.
regexpr
     Uses realloc only to grow buffers.

If Zope doesn't use the allocator that Python gives it, then it can 
deal with its own problems.  I would expect most extensions to use 
Python's allocator.

>> Since we are both pretty confident that other allocators aren't like 
>> Darwin,
>> this "gunk" can be #ifdef'ed to the __APPLE__ case.
>
> #ifdef's are a last resort:  they almost never go away, so they
> complicate the code forever after, and typically stick around for
> years even after the platform problems they intended to address have
> been fixed.  For obvious reasons, they're also an endless source of
> platform-specific bugs.

They're also the only good way to deal with platform-specific 
inconsistencies.  In this specific case, it's not even possible to 
determine if a particular allocator implementation is stupid or not 
without at least using a platform-allocator-specific function to query 
the size reserved by a given allocation.

> Note that pymalloc already does a memcpy+free when in
> PyObject_Realloc(p, n) p was obtained from the system malloc or
> realloc but n is small enough to meet the "small object" threshold
> (pymalloc "takes over" small blocks that result from a
> PyObject_Realloc()).  That's a reasonable strategy *because* n is
> always small in such cases.  If you're going to extend this strategy
> to n of arbitrary size, then you may also create new performance
> problems for some apps on Darwin (copying n bytes can get arbitrarily
> expensive).

There's obviously a tradeoff between copying lots of bytes and having 
lots of memory go to waste.  That should be taken into consideration 
when considering how many pages could be returned to the allocator.  
Note that we can ask the allocator how much memory an allocation has 
actually reserved (which is usually somewhat larger than the amount you 
asked it for) and how much memory an allocation will reserve for a 
given size.  An allocation resize wouldn't even show up as smaller 
unless at least one page would be freed (for sufficiently large 
allocations anyway, the minimum granularity is 16 bytes because it 
guarantees that alignment).  Obviously if you have a lot of pages 
anyway, one page isn't a big deal, so we would probably only resort to 
free()/memcpy() if some fair percentage of the total pages used by the 
allocation could be rescued.

If it does end up causing some real performance problems anyway, 
there's always deeper hacks like using vm_copy(), a Darwin specific 
function which will do copy-on-write instead (which only makes sense if 
the allocation is big enough for this to actually be a performance 
improvement).
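The policy sketched above might look like this (all names, the page size, and the quarter-of-the-pages threshold are hypothetical choices for illustration; the real logic would be C inside PyObject_Realloc()):

```python
PAGE_SIZE = 4096  # assumed; the real value is platform-specific

def pages(nbytes):
    """Number of whole pages needed to hold nbytes."""
    return (nbytes + PAGE_SIZE - 1) // PAGE_SIZE

def should_copy_on_shrink(reserved, requested, min_fraction=0.25):
    """Hypothetical policy: on a shrinking realloc, copy+free only when
    at least min_fraction of the pages backing the allocation would be
    returned to the system; otherwise leave the block in place and
    tolerate the waste."""
    freed = pages(reserved) - pages(requested)
    if freed < 1:
        return False    # not even one page would be released
    return freed >= min_fraction * pages(reserved)
```

Under this rule a 10 MB buffer cut back to 1 MB triggers the copy, while trimming a few bytes off a one-page allocation never does, which addresses Tim's worry about gratuitous large memcpy()s.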

>> ...
>>  I'm sure I'll find something, but what's important to me is that
>> Python works well on Mac OS X, so something should happen.
>
> I agree the socket-abuse case should be fiddled, and for more reasons
> than just Darwin's realloc() quirks.  I don't know that there are
> actual problems on Darwin broader than that case (and I'm not
> challenging you to contrive one, I'm asking whether realloc() quirks
> are suspected in any other case that's known).  Part of what you
> demonstrated when you said that pystone didn't slow down when you
> fiddled stuff is that pystone also didn't speed up.  I also don't know
> that the memcpy+free wormaround is actually going to help more than it
> hurts overall.  Yes, in the socket-abuse case, where the program
> routinely malloc()s strings millions of bytes larger than the socket
> can deliver, it would obviously help.  That's not typically program
> behavior (however typical it may be of that specific app).  More
> typical is shrinking a long list one element at a time, in which case
> about half the list remaining would get memcpy'd from time to time
> where such copies never get made today.

I do not yet know of another specific case where Darwin's realloc() 
implementation causes a problem.

The list case would certainly be a loss with current behavior if the 
list gets extremely large at some point, but then becomes small and 
stays that way for a long period of time.

> IOW, there's no straightforward pure win here.

Well at least we have a nice bug to report to Apple, whether or not we 
do something about it ourselves.

-bob

From gvanrossum at gmail.com  Mon Jan  3 17:15:24 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon Jan  3 17:15:27 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks
	allocations
In-Reply-To: <D8729D50-5D9E-11D9-8981-000A9567635C@redivi.com>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
	<E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050102231638b0d39d@mail.gmail.com>
	<D8729D50-5D9E-11D9-8981-000A9567635C@redivi.com>
Message-ID: <ca471dc205010308152e6f2bfb@mail.gmail.com>

Coming late to this thread.

I don't see the point of lying awake at night worrying about potential
memory losses unless you've heard someone complain about it. As Tim
has been trying to explain, there are plenty of other things in Python
that we *could* speed up if there was a need; since every speedup
uglifies the code somewhat, we'd end up with very ugly code if we did
them all. Remember, don't optimize prematurely.

Here's one theoretical reason why even with socket.recv() it probably
doesn't matter in practice: the overallocated string will usually be
freed as soon as the data has been parsed from it, and this will free
the overallocation as well!

OTOH, if you want to do more research, checking the usage patterns for
StringRealloc and TupleRealloc would be useful. I could imagine code
in either that makes a copy if the new size is less than some fraction
of the old size. Most code that I recall writing using these tends to
start with a guaranteed-to-fit overallocation, and a single resize at
the end.
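The overallocate-once, resize-once pattern Guido describes can be sketched in Python (the NUL-doubling transformation here is just an illustrative stand-in for whatever the real code computes):

```python
def escape_nuls(data):
    """Write into a buffer pre-sized for the worst case (every byte is
    a NUL and gets doubled), then cut back once at the end -- the
    single-resize pattern used with PyString_Resize in C."""
    out = bytearray(2 * len(data))  # guaranteed-to-fit overallocation
    n = 0
    for b in data:
        out[n] = b
        n += 1
        if b == 0:      # double each NUL byte
            out[n] = 0
            n += 1
    return bytes(out[:n])           # one final cut-back
```

With this shape the only shrinking "realloc" happens once, at the end, which is why such code rarely suffers from a poor shrinking-realloc implementation.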

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bob at redivi.com  Mon Jan  3 17:30:14 2005
From: bob at redivi.com (Bob Ippolito)
Date: Mon Jan  3 17:30:26 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks
	allocations
In-Reply-To: <ca471dc205010308152e6f2bfb@mail.gmail.com>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
	<E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050102231638b0d39d@mail.gmail.com>
	<D8729D50-5D9E-11D9-8981-000A9567635C@redivi.com>
	<ca471dc205010308152e6f2bfb@mail.gmail.com>
Message-ID: <BEDF7E01-5DA4-11D9-8981-000A9567635C@redivi.com>


On Jan 3, 2005, at 11:15 AM, Guido van Rossum wrote:

> Coming late to this thread.
>
> I don't see the point of lying awake at night worrying about potential
> memory losses unless you've heard someone complain about it. As Tim
> has been trying to explain, there are plenty of other things in Python
> that we *could* speed up if there was a need; since every speedup
> uglifies the code somewhat, we'd end up with very ugly code if we did
> them all. Remember, don't optimize prematurely.

We *have* had someone complain about it: http://python.org/sf/1092502

> Here's one theoretical reason why even with socket.recv() it probably
> doesn't matter in practice: the overallocated string will usually be
> freed as soon as the data has been parsed from it, and this will free
> the overallocation as well!

That depends on how socket.recv is used.  Sometimes, a list of strings 
is used rather than a cStringIO (or equivalent), which can cause 
problems (see above referenced bug).

-bob

From Scott.Daniels at Acm.Org  Mon Jan  3 18:07:00 2005
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Mon Jan  3 18:05:36 2005
Subject: [Python-Dev] Re: Zipfile needs?
In-Reply-To: <41D1B0C6.8040208@ocf.berkeley.edu>
References: <cqq8mc$3g9$1@sea.gmane.org> <41D1B0C6.8040208@ocf.berkeley.edu>
Message-ID: <crbu0p$847$1@sea.gmane.org>

Brett C. wrote:
> Scott David Daniels wrote:
> 
>> I'm hoping to add BZIP2 compression to zipfile for 2.5.  My primary
>> motivation is that Project Gutenberg seems to be starting to use BZIP2
>> compression for some of its zips.  What other wish list things do
>> people around here have for zipfile?  I thought I'd collect input here
>> and make a PEP.
> Encryption/decryption support.  Will most likely require a C extension 
> since the algorithm relies on ints (or longs, don't remember) wrapping 
> around when the value becomes too large.

I'm trying to use byte-block streams (iterators taking iterables) as
the basic structure of getting data in and out.  I think the encryption/
decryption can then be plugged in at the right point.  If it can be set
up properly, you can import the encryption separately and connect it to
zipfiles with a call.  Would this address what you want?  I believe
there is an issue actually building in the encryption/decryption in
terms of redistribution.

-- 
-- Scott David Daniels
Scott.Daniels@Acm.Org

From bacchusrx at skorga.org  Mon Jan  3 21:23:50 2005
From: bacchusrx at skorga.org (bacchusrx)
Date: Mon Jan  3 21:23:58 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never
	shrinks allocations
In-Reply-To: <1f7befae050102231638b0d39d@mail.gmail.com>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
	<E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050102231638b0d39d@mail.gmail.com>
Message-ID: <20050103202350.GA17165@skorga.org>

On Thu, Jan 01, 1970 at 12:00:00AM +0000, Tim Peters wrote:
> Is there any known case where Python performs poorly on this OS, for
> this reason, other than the "pass giant numbers to recv() and then
> shrink the string because we didn't get anywhere near that many bytes"
> case?
> 
> [...]
> 
> I agree the socket-abuse case should be fiddled, and for more reasons
> than just Darwin's realloc() quirks. [...] Yes, in the socket-abuse
> case, where the program routinely malloc()s strings millions of bytes
> larger than the socket can deliver, it would obviously help.  That's
> not typically program behavior (however typical it may be of that
> specific app).

Note that, with respect to http://python.org/sf/1092502, the author of
the (original) program was using the documented interface to a file
object.  It's _fileobject.read() that decides to ask for huge numbers of
bytes from recv() (specifically, in the max(self._rbufsize, left)
condition). Patched to use a fixed recv_size, you of course sidestep the
realloc() nastiness in this particular case.
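The fixed-recv_size approach amounts to something like the following (a sketch, not the actual _fileobject.read() code; the function name and default chunk size are illustrative):

```python
import socket

def read_exactly(sock, n, recv_size=8192):
    """Read exactly n bytes from sock using a fixed, modest recv()
    size, instead of passing the whole outstanding count to recv()
    (which invites a huge malloc followed by a shrinking realloc)."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(min(recv_size, remaining))
        if not chunk:
            raise EOFError("connection closed with %d bytes left" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Each string the loop creates is at most recv_size bytes, so even a realloc() that never shrinks can only waste a bounded amount per chunk.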

bacchusrx.

From bob at redivi.com  Mon Jan  3 21:55:19 2005
From: bob at redivi.com (Bob Ippolito)
Date: Mon Jan  3 21:55:29 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks
	allocations
In-Reply-To: <20050103202350.GA17165@skorga.org>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
	<E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050102231638b0d39d@mail.gmail.com>
	<20050103202350.GA17165@skorga.org>
Message-ID: <C729F7B4-5DC9-11D9-ACB4-000A9567635C@redivi.com>

On Jan 3, 2005, at 3:23 PM, bacchusrx wrote:

> On Thu, Jan 01, 1970 at 12:00:00AM +0000, Tim Peters wrote:
>> Is there any known case where Python performs poorly on this OS, for
>> this reason, other than the "pass giant numbers to recv() and then
>> shrink the string because we didn't get anywhere near that many bytes"
>> case?
>>
>> [...]
>>
>> I agree the socket-abuse case should be fiddled, and for more reasons
>> than just Darwin's realloc() quirks. [...] Yes, in the socket-abuse
>> case, where the program routinely malloc()s strings millions of bytes
>> larger than the socket can deliver, it would obviously help.  That's
>> not typically program behavior (however typical it may be of that
>> specific app).
>
> Note that, with respect to http://python.org/sf/1092502, the author of
> the (original) program was using the documented interface to a file
> object.  It's _fileobject.read() that decides to ask for huge numbers of
> bytes from recv() (specifically, in the max(self._rbufsize, left)
> condition). Patched to use a fixed recv_size, you of course sidestep the
> realloc() nastiness in this particular case.

While using a reasonably sized recv_size is a good idea, using a 
smaller request size simply means that it's less likely that the 
strings will be significantly resized.  It is still highly likely they 
*will* be resized and that doesn't solve the problem that 
over-allocated strings will persist until the entire request is 
fulfilled.

For example, receiving 1 byte chunks (if that's even possible) would 
exacerbate the issue even for a small request size.  If you asked for 8 
MB with a request size of 1024 bytes, and received it in 1 byte chunks, 
you would need a minimum of an impossible ~16 GB to satisfy that 
request (minimum ~8 GB to collect the strings, minimum ~8 GB to 
concatenate them) as opposed to the Python-optimal case of ~16 MB when 
always using compact representations.

Using cStringIO instead of a list of potentially over-allocated strings 
would actually have such Python-optimal memory usage characteristics on 
all platforms.
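The two accumulation strategies contrast like this (io.BytesIO stands in here for the cStringIO of the day; the function names are illustrative):

```python
import io

def collect_with_list(chunks):
    """List-of-strings accumulation: every chunk object (and any
    over-allocation it carries) stays alive until the final join, so
    peak memory is roughly double the data size even in the best case."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)
    return b"".join(parts)

def collect_with_buffer(chunks):
    """Single growing buffer: each chunk is copied in and can be freed
    immediately, so peak memory stays near the size of the data
    actually received."""
    buf = io.BytesIO()
    for chunk in chunks:
        buf.write(chunk)
    return buf.getvalue()
```

Both return the same bytes; the difference is purely in how long the intermediate chunk objects (over-allocated or not) must be kept alive.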

-bob

From bsder at mail.allcaps.org  Mon Jan  3 22:26:12 2005
From: bsder at mail.allcaps.org (Andrew P. Lentvorski, Jr.)
Date: Mon Jan  3 22:26:14 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks
	allocations
In-Reply-To: <1f7befae050102231638b0d39d@mail.gmail.com>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
	<E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050102231638b0d39d@mail.gmail.com>
Message-ID: <1776760A-5DCE-11D9-A1A7-000A95C874EE@mail.allcaps.org>


On Jan 2, 2005, at 11:16 PM, Tim Peters wrote:

> [Bob Ippolito]
>>  However, it is our (in the "I know you use Windows but I am not the
>> only one that uses Mac OS X" sense) problem so long as Darwin is a
>> supported platform, because it is highly unlikely that Apple will
>> backport any "fix" to the allocator unless we can prove it has some
>> security implications in software shipped with their OS. ...
>
> Is there any known case where Python performs poorly on this OS, for
> this reason, other than the "pass giant numbers to recv() and then
> shrink the string because we didn't get anywhere near that many bytes"
> case?  Claiming rampant performance problems should require evidence
> too <wink>.

Possibly.  When using the stock btdownloadcurses.py from bitconjurer.org,
I occasionally see a memory thrash on OS X.

Normally I have to be in a mode where I am aggregating lots of small
connections (10Kbps or less uploads) into a large download (10Mbps
transfer rate on a >500MB file).  When the file completes, Python sends
OS X into a long-lasting spinning ball of death.  It will emerge after
about 10 minutes or so.

I do not see this same behavior on Linux or FreeBSD.  I never filed a bug
because I can't reliably reproduce it (it is dependent upon the upload
characteristics of the torrent swarm).  However, it seems to fit the
bug and diagnosis.

-a

From bac at OCF.Berkeley.EDU  Mon Jan  3 22:42:38 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Mon Jan  3 22:43:02 2005
Subject: [Python-Dev] Re: Zipfile needs?
In-Reply-To: <crbu0p$847$1@sea.gmane.org>
References: <cqq8mc$3g9$1@sea.gmane.org> <41D1B0C6.8040208@ocf.berkeley.edu>
	<crbu0p$847$1@sea.gmane.org>
Message-ID: <41D9BC4E.5020709@ocf.berkeley.edu>

Scott David Daniels wrote:
> Brett C. wrote:
> 
>> Scott David Daniels wrote:
>>
>>> I'm hoping to add BZIP2 compression to zipfile for 2.5.  My primary
>>> motivation is that Project Gutenberg seems to be starting to use BZIP2
>>> compression for some of its zips.  What other wish list things do
>>> people around here have for zipfile?  I thought I'd collect input here
>>> and make a PEP.
>>
>> Encryption/decryption support.  Will most likely require a C extension 
>> since the algorithm relies on ints (or longs, don't remember) wrapping 
>> around when the value becomes too large.
> 
> 
> I'm trying to use byte-block streams (iterators taking iterables) as
> the basic structure of getting data in and out.  I think the encryption/
> decryption can then be plugged in at the right point.  If it can be set
> up properly, you can import the encryption separately and connect it to
> zipfiles with a call.  Would this address what you want?  I believe
> there is an issue actually building in the encryption/decryption in
> terms of redistribution.
> 

Possibly.  Encryption is part of the PKZIP spec so I was just thinking of 
covering that, not adding external encryption support.  It really is not overly 
complex stuff, just will want to do it in C for speed probably as Guido 
suggested (but, as always, I would profile that first to see if performance is 
really that bad).

-Brett

From tim.peters at gmail.com  Mon Jan  3 22:49:29 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Mon Jan  3 22:49:34 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks
	allocations
In-Reply-To: <D8729D50-5D9E-11D9-8981-000A9567635C@redivi.com>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
	<E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050102231638b0d39d@mail.gmail.com>
	<D8729D50-5D9E-11D9-8981-000A9567635C@redivi.com>
Message-ID: <1f7befae050103134942ab1696@mail.gmail.com>

[Tim Peters]
>> Ya, I understood that.  My conclusion was that Darwin's realloc()
>> implementation isn't production-quality.  So it goes.

[Bob Ippolito]
> Whatever that means.

Well, it means what it said.  The C standard says nothing about
performance metrics of any kind, and a production-quality
implementation of C requires very much more than just meeting what the
standard requires.  The phrase "quality of implementation" is used in
the C Rationale (but not in the standard proper) to cover all such
issues.  realloc() pragmatics are quality-of-implementation issues;
the accuracy of fp arithmetic is another (e.g., if you get back -666.0
from the C 1.0 + 2.0, there's nothing in the standard to justify a
complaint).

>>>  free() can be called either explicitly, or implicitly by calling
>>> realloc() with a size larger than the size of the allocation.

From later comments feigning outrage <wink>, I take it that "the size
of the allocation" here does not mean the specific number the user
passed to the previous malloc/realloc call, but means whatever amount
of address space the implementation decided to use internally.  Sorry,
but I assumed it meant the former at first.

...

>>> Was this a good decision?  Probably not!

>> Sounds more like a bug (or two) to me than "a decision", but I don't
>> know.

> You said yourself that it is standards compliant ;)  I have filed it as
> a bug, but it is probably unlikely to be backported to current versions
> of Mac OS X unless a case can be made that it is indeed a security
> flaw.

That's plausible.  If you showed me a case where Python's list.sort()
took cubic time, I'd certainly consider that to be "a bug", despite
that nothing promises better behavior.  If I wrote a malloc subsystem
and somebody pointed out "did you know that when I malloc 1024**2+1
bytes, and then realloc(1), I lose the other megabyte forever?", I'd
consider that to be "a bug" too (because, docs be damned, I wouldn't
intentionally design a malloc subsystem with such behavior; and
pymalloc does in fact copy bytes on a shrinking realloc in blocks it
controls, whenever at least a quarter of the space is given back --
and it didn't at the start, and I considered that to be "a bug" when
it was pointed out).

> ...
> Known case?  No.  Do I want to search Python application-space to find
> one?  No.

Serious problems on a platform are usually well-known to users on that
platform.  For example, it was well-known that Python's list-growing
strategy as of a few years ago fragmented address space horribly on
Win9X.  This was a C quality-of-implementation issue specific to that
platform.  It was eventually resolved by improving the list-growing
strategy on all platforms -- although it's still the case that Win9X
does worse on list-growing than other platforms, it's no longer a
disaster for most list-growing apps on Win9X.

If there's a problem with "overallocate then realloc() to cut back" on
Darwin that affects many apps, then I'd expect Darwin users to know
about that already -- lots of people have used Python on Macs since
Python's beginning, "mysterious slowdowns" and "mysterious bloat" get
noticed, and Darwin has been around for a while.

...

>> There is no "choke point" for allocations in Python -- some places
>> call the system realloc() directly.  Maybe the latter matter on Darwin
>> too, but maybe they don't.  The scope of this hack spreads if they do.

...

> In the case of Python, "some places" means "nowhere relevant".  Four
> standard library extension modules relevant to the platform use realloc
> directly:
> 
> _sre
>     Uses realloc only to grow buffers.
> cPickle
>     Uses realloc only to grow buffers.
> cStringIO
>     Uses realloc only to grow buffers.
> regexpr
>     Uses realloc only to grow buffers.

Good!

> If Zope doesn't use the allocator that Python gives it, then it can
> deal with its own problems.  I would expect most extensions to use
> Python's allocator.

I don't know.

...
 
> They're [#ifdef's] also the only good way to deal with platform-specific
> inconsistencies.  In this specific case, it's not even possible to
> determine if a particular allocator implementation is stupid or not
> without at least using a platform-allocator-specific function to query
> the size reserved by a given allocation.

We've had bad experience on several platforms when passing large
numbers to recv().  If that were addressed, it's unclear that Darwin
realloc() behavior would remain a real issue.  OTOH, it is clear that
*just* worming around Darwin realloc() behavior won't help other
platforms with problems in the same *immediate* area of bug 1092502. 
Gross over-allocation followed by a shrinking realloc() just isn't
common in Python.  sock_recv() is an exceptionally bad case.  More
typical is, e.g., fileobject.c's get_line(), where if "a line" exceeds
100 characters the buffer keeps growing by 25% until there's enough
room, then it's cut back once at the end.  That typical use for
shrinking realloc() just isn't going to be implicated in a real
problem -- the over-allocation is always minor.
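
The sizing pattern described above can be sketched in Python; the constants mirror the ones mentioned (a 100-character initial buffer, 25% growth), but the function is purely illustrative, not fileobject.c's actual code:

```python
def growth_sizes(line_len, start=100):
    """Successive buffer sizes while reading one line of line_len bytes:
    grow by 25% until the line fits, then shrink once at the end."""
    sizes = [start]
    while sizes[-1] < line_len:
        sizes.append(sizes[-1] + sizes[-1] // 4)   # growing realloc, +25%
    sizes.append(line_len)                         # one shrinking realloc
    return sizes

# For a 1000-character line the buffer peaks at 1151 bytes -- the
# over-allocation handed back by the final shrink is always under 25%.
sizes = growth_sizes(1000)
assert sizes[-1] == 1000
assert max(sizes) < 1250
```

Contrast with sock_recv(), which allocates the full requested size up front and may hand back megabytes in a single shrinking realloc().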


> ...
> There's obviously a tradeoff between copying lots of bytes and having
> lots of memory go to waste.  That should be taken into consideration
> when considering how many pages could be returned to the allocator.
> Note that we can ask the allocator how much memory an allocation has
> actually reserved (which is usually somewhat larger than the amount you
> asked it for) and how much memory an allocation will reserve for a
> given size.  An allocation resize wouldn't even show up as smaller
> unless at least one page would be freed (for sufficiently large
> allocations anyway, the minimum granularity is 16 bytes because it
> guarantees that alignment).  Obviously if you have a lot of pages
> anyway, one page isn't a big deal, so we would probably only resort to
> free()/memcpy() if some fair percentage of the total pages used by the
> allocation could be rescued.
> 
> If it does end up causing some real performance problems anyway,
> there's always deeper hacks like using vm_copy(), a Darwin specific
> function which will do copy-on-write instead (which only makes sense if
> the allocation is big enough for this to actually be a performance
> improvement).

As above, I'm skeptical that there's a general problem worth
addressing here, and am still under the possible illusion that the Mac
developers will eventually change their realloc()'s behavior anyway. 
If you're convinced it's worth the bother, go for it.  If you do, I
strongly hope that it keys off a new platform-neutral symbol (say,
Py_SHRINKING_REALLOC_COPIES) and avoids Darwin-specific implementation
code.  Then if it turns out that it is a broad problem (across apps or
across platforms), everyone can benefit.  PyObject_Realloc() seems the
best place to put it.  Unfortunately, for blocks obtained from the
system malloc(), there is no portable way to find out how much excess
was allocated in a release-build Python, so "avoids Darwin-specific
implementation code" may be impossible to achieve.  The more it
*can't* be used on any platform other than this flavor of Darwin, the
more inclined I am to advise just fixing the immediate problem
(sock_recv's potentially unbounded over-allocation).
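
A sketch of the copy-on-shrink heuristic in Python (the SHRINKING_REALLOC_COPIES flag here is hypothetical, standing in for the proposed Py_SHRINKING_REALLOC_COPIES symbol; the real logic would live in C, e.g. in PyObject_Realloc()):

```python
SHRINKING_REALLOC_COPIES = True  # assumption: set per-platform at build time

def resize_strategy(old_size, new_size):
    """Decide how to carry out a resize on a platform whose realloc()
    never returns memory on a shrink."""
    if not SHRINKING_REALLOC_COPIES or new_size >= old_size:
        return "realloc"       # growing (or trusted platform): plain realloc
    if old_size - new_size >= old_size // 4:
        return "copy"          # malloc + memcpy + free reclaims the excess
    return "realloc"           # minor shrink: copying isn't worth it

# sock_recv's pathological case: a megabyte cut back to a few bytes.
assert resize_strategy(1024 * 1024, 1) == "copy"
# get_line's typical case: over-allocation under 25%, left alone.
assert resize_strategy(1151, 1000) == "realloc"
```

This mirrors what pymalloc already does for blocks it controls (copy when at least a quarter of the space is given back); the open problem noted above is that for blocks obtained from the system malloc() there is no portable way to learn the real excess.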
From bacchusrx at skorga.org  Mon Jan  3 22:52:25 2005
From: bacchusrx at skorga.org (bacchusrx)
Date: Mon Jan  3 22:52:37 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never
	shrinks allocations
In-Reply-To: <C729F7B4-5DC9-11D9-ACB4-000A9567635C@redivi.com>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
	<E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050102231638b0d39d@mail.gmail.com>
	<20050103202350.GA17165@skorga.org>
	<C729F7B4-5DC9-11D9-ACB4-000A9567635C@redivi.com>
Message-ID: <20050103215225.GA17273@skorga.org>

On Mon, Jan 03, 2005 at 03:55:19PM -0500, Bob Ippolito wrote:
> >Note that, with respect to http://python.org/sf/1092502, the author
> >of the (original) program was using the documented interface to a
> >file object.  It's _fileobject.read() that decides to ask for huge
> >numbers of bytes from recv() (specifically, in the
> >max(self._rbufsize, left) condition). Patched to use a fixed
> >recv_size, you of course sidestep the realloc() nastiness in this
> >particular case.
> 
> While using a reasonably sized recv_size is a good idea, using a
> smaller request size simply means that it's less likely that the
> strings will be significantly resized.  It is still highly likely they
> *will* be resized and that doesn't solve the problem that
> over-allocated strings will persist until the entire request is
> fulfilled.

You're right. I should have said, "you're more likely to get away with
it." The underlying issue still exists. My point is that the problem is
not analogous to the guy who tried to read 2GB directly from a socket
(as in http://python.org/sf/756104). 
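
The fix being discussed amounts to reading in bounded chunks instead of asking recv() for the whole remainder at once; a Python sketch (the 8 KiB chunk size is an illustrative assumption, and io.BytesIO stands in for a socket):

```python
import io

def read_all(stream, total, chunk_size=8192):
    """Read up to `total` bytes in fixed-size chunks, so no single
    allocation is ever much larger than chunk_size."""
    parts = []
    left = total
    while left > 0:
        data = stream.read(min(chunk_size, left))
        if not data:          # EOF / connection closed early
            break
        parts.append(data)
        left -= len(data)
    return b"".join(parts)

src = io.BytesIO(b"x" * 20000)
assert len(read_all(src, 20000)) == 20000
```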

Googling for MemoryError exceptions, you can find a number of spurious
problems on Darwin that are probably due to this bug: SpamBayes for
instance, or the thread at

http://mail.python.org/pipermail/python-list/2004-November/250625.html

bacchusrx.
From bob at redivi.com  Mon Jan  3 23:40:37 2005
From: bob at redivi.com (Bob Ippolito)
Date: Mon Jan  3 23:40:48 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks
	allocations
In-Reply-To: <1f7befae050103134942ab1696@mail.gmail.com>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
	<E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050102231638b0d39d@mail.gmail.com>
	<D8729D50-5D9E-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050103134942ab1696@mail.gmail.com>
Message-ID: <7D3892F0-5DD8-11D9-ACB4-000A9567635C@redivi.com>

On Jan 3, 2005, at 4:49 PM, Tim Peters wrote:

> [Tim Peters]
>>> Ya, I understood that.  My conclusion was that Darwin's realloc()
>>> implementation isn't production-quality.  So it goes.
>
> [Bob Ippolito]
>> Whatever that means.
>
> Well, it means what it said.  The C standard says nothing about
> performance metrics of any kind, and a production-quality
> implementation of C requires very much more than just meeting what the
> standard requires.  The phrase "quality of implementation" is used in
> the C Rationale (but not in the standard proper) to cover all such
> issues.  realloc() pragmatics are quality-of-implementation issues;
> the accuracy of fp arithmetic is another (e.g., if you get back -666.0
> from the C 1.0 + 2.0, there's nothing in the standard to justify a
> complaint).
>
>>>>  free() can be called either explicitly, or implicitly by calling
>>>> realloc() with a size larger than the size of the allocation.
>
> From later comments feigning outrage <wink>, I take it that "the size
> of the allocation" here does not mean the specific number the user
> passed to the previous malloc/realloc call, but means whatever amount
> of address space the implementation decided to use internally.  Sorry,
> but I assumed it meant the former at first.

Sorry for the confusion.

>>>> Was this a good decision?  Probably not!
>
>>> Sounds more like a bug (or two) to me than "a decision", but I don't
>>> know.
>
>> You said yourself that it is standards compliant ;)  I have filed it
>> as a bug, but it is probably unlikely to be backported to current
>> versions of Mac OS X unless a case can be made that it is indeed a
>> security flaw.
>
> That's plausible.  If you showed me a case where Python's list.sort()
> took cubic time, I'd certainly consider that to be "a bug", despite
> that nothing promises better behavior.  If I wrote a malloc subsystem
> and somebody pointed out "did you know that when I malloc 1024**2+1
> bytes, and then realloc(1), I lose the other megabyte forever?", I'd
> consider that to be "a bug" too (because, docs be damned, I wouldn't
> intentionally design a malloc subsystem with such behavior; and
> pymalloc does in fact copy bytes on a shrinking realloc in blocks it
> controls, whenever at least a quarter of the space is given back --
> and it didn't at the start, and I considered that to be "a bug" when
> it was pointed out).

I wouldn't equate "until free() is called" with "forever".  But yes, I 
consider it a bug just as you do, and have reported it appropriately.  
Practically, since it exists in Mac OS X 10.2 and Mac OS X 10.3, and 
may not ever be fixed, we should at least consider it.

>> ...
>> Known case?  No.  Do I want to search Python application-space to find
>> one?  No.
>
> Serious problems on a platform are usually well-known to users on that
> platform.  For example, it was well-known that Python's list-growing
> strategy as of a few years ago fragmented address space horribly on
> Win9X.  This was a C quality-of-implementation issue specific to that
> platform.  It was eventually resolved by improving the list-growing
> strategy on all platforms -- although it's still the case that Win9X
> does worse on list-growing than other platforms, it's no longer a
> disaster for most list-growing apps on Win9X.

It does take a long time to figure such weird behavior out though.  I 
would have to guess that most Python users on Darwin have been 
at it for less than 3 years.

The number of people using Python on Darwin who have written or used 
code that exercises this scenario, and who are determined enough to 
track this sort of thing down, is probably very small.

> If there's a problem with "overallocate then realloc() to cut back" on
> Darwin that affects many apps, then I'd expect Darwin users to know
> about that already -- lots of people have used Python on Macs since
> Python's beginning, "mysterious slowdowns" and "mysterious bloat" get
> noticed, and Darwin has been around for a while.

Most people on Mac OS X have a lot of memory, and Mac OS X generally 
does a good job about swapping in and out without causing much of a 
problem, so I'm personally not very surprised that it could go 
unnoticed this long.

Google says:
Results 1 - 10 of about 1,150 for (darwin OR Mac OR "OS X") AND 
MemoryError AND Python.
Results 1 - 10 of about 942 for malloc vm_allocate failed. (0.73 
seconds)

Of course, in both cases, not all of these can be attributed to 
realloc()'s implementation, but I'm sure some of them can, especially 
the Python ones!

>> They're [#ifdef's] also the only good way to deal with
>> platform-specific inconsistencies.  In this specific case, it's not
>> even possible to
>> determine if a particular allocator implementation is stupid or not
>> without at least using a platform-allocator-specific function to query
>> the size reserved by a given allocation.
>
> We've had bad experience on several platforms when passing large
> numbers to recv().  If that were addressed, it's unclear that Darwin
> realloc() behavior would remain a real issue.  OTOH, it is clear that
> *just* worming around Darwin realloc() behavior won't help other
> platforms with problems in the same *immediate* area of bug 1092502.
> Gross over-allocation followed by a shrinking realloc() just isn't
> common in Python.  sock_recv() is an exceptionally bad case.  More
> typical is, e.g., fileobject.c's get_line(), where if "a line" exceeds
> 100 characters the buffer keeps growing by 25% until there's enough
> room, then it's cut back once at the end.  That typical use for
> shrinking realloc() just isn't going to be implicated in a real
> problem -- the over-allocation is always minor.

What about for list objects that are big at some point, then 
progressively shrink, but happen to stick around for a while?  An 
"event queue" that got clogged for some reason and then became stable?  
Dictionaries?  Of course these potential problems are a lot less likely 
to happen.

>> ...
>> There's obviously a tradeoff between copying lots of bytes and having
>> lots of memory go to waste.  That should be taken into consideration
>> when considering how many pages could be returned to the allocator.
>> Note that we can ask the allocator how much memory an allocation has
>> actually reserved (which is usually somewhat larger than the amount
>> you asked it for) and how much memory an allocation will reserve for a
>> given size.  An allocation resize wouldn't even show up as smaller
>> unless at least one page would be freed (for sufficiently large
>> allocations anyway, the minimum granularity is 16 bytes because it
>> guarantees that alignment).  Obviously if you have a lot of pages
>> anyway, one page isn't a big deal, so we would probably only resort to
>> free()/memcpy() if some fair percentage of the total pages used by the
>> allocation could be rescued.
>>
>> If it does end up causing some real performance problems anyway,
>> there's always deeper hacks like using vm_copy(), a Darwin specific
>> function which will do copy-on-write instead (which only makes sense
>> if the allocation is big enough for this to actually be a performance
>> improvement).
>
> As above, I'm skeptical that there's a general problem worth
> addressing here, and am still under the possible illusion that the Mac
> developers will eventually change their realloc()'s behavior anyway.
> If you're convinced it's worth the bother, go for it.  If you do, I
> strongly hope that it keys off a new platform-neutral symbol (say,
> Py_SHRINKING_REALLOC_COPIES) and avoids Darwin-specific implementation
> code.  Then if it turns out that it is a broad problem (across apps or
> across platforms), everyone can benefit.  PyObject_Realloc() seems the
> best place to put it.  Unfortunately, for blocks obtained from the
> system malloc(), there is no portable way to find out how much excess
> was allocated in a release-build Python, so "avoids Darwin-specific
> implementation code" may be impossible to achieve.  The more it
> *can't* be used on any platform other than this flavor of Darwin, the
> more inclined I am to advise just fixing the immediate problem
> (sock_recv's potentially unbounded over-allocation).

I'm pretty sure this kind of malloc functionality is very specific to 
Darwin and does not carry over to any other BSD.  An intelligent 
implementation requires equivalents of malloc_size() and 
malloc_good_size().  Unfortunately, despite the man page, 
malloc_good_size() is not declared in <malloc/malloc.h>; however, there 
is another, declared, way to get at that functionality (by poking into 
the malloc_introspection_t struct of the malloc_default_zone()).

-bob

From martin at v.loewis.de  Mon Jan  3 23:46:52 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon Jan  3 23:46:51 2005
Subject: [Python-Dev] Re: Zipfile needs?
In-Reply-To: <crbu0p$847$1@sea.gmane.org>
References: <cqq8mc$3g9$1@sea.gmane.org> <41D1B0C6.8040208@ocf.berkeley.edu>
	<crbu0p$847$1@sea.gmane.org>
Message-ID: <41D9CB5C.8090305@v.loewis.de>

Scott David Daniels wrote:
> I believe
> there is an issue actually building in the encryption/decryption in
> terms of redistribution.

Submitters should not worry about this too much. The issue primarily
exists in the U.S., and there are now (U.S.) official procedures to
deal with them, and the PSF can and does follow these procedures.

Regards,
Martin
From tdelaney at avaya.com  Tue Jan  4 00:25:23 2005
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Tue Jan  4 00:25:34 2005
Subject: [Python-Dev] Out-of-date FAQs
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE721224@au3010avexu1.global.avaya.com>

While grabbing the link to the copyright restrictions FAQ (for someone
on python-list) I noticed a few out-of-date FAQ entries - specifically,
"most stable version" and "Why doesn't list.sort() return the sorted
list?". Bug reports have been submitted (and acted on - Raymond, you
work too fast ;)

I think it's important that the FAQs be up-to-date with the latest
idioms, etc, so as I have the time available I intend to review all the
existing FAQs that I'm qualified for.

As a general rule, when an idiom has changed, do we want to state both
the 2.4 idiom as well as the 2.3 idiom? In the case of list.sort(), that
would mean having both:

    for key in sorted(dict.iterkeys()):
        ...do whatever with dict[key]...

and

    keys = dict.keys()
    keys.sort()
    for key in keys:
        ...do whatever with dict[key]...

Tim Delaney
From irmen at xs4all.nl  Tue Jan  4 00:30:24 2005
From: irmen at xs4all.nl (Irmen de Jong)
Date: Tue Jan  4 00:30:27 2005
Subject: [Python-Dev] Small fix for windows.tex
Message-ID: <41D9D590.2020006@xs4all.nl>

The current cvs docs failed to build for me, because of a small
misspelling in the windows.tex file. Here is a patch:

Index: Doc/ext/windows.tex
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/ext/windows.tex,v
retrieving revision 1.10
diff -u -r1.10 windows.tex
--- Doc/ext/windows.tex	30 Dec 2004 10:44:32 -0000	1.10
+++ Doc/ext/windows.tex	3 Jan 2005 23:28:20 -0000
@@ -163,8 +163,8 @@
      click OK.  (Inserting them one by one is fine too.)

      Now open the \menuselection{Project \sub spam properties} dialog.
-    You only need to change a few settings.  Make sure \guilable{All
-    Configurations} is selected from the \guilable{Settings for:}
+    You only need to change a few settings.  Make sure \guilabel{All
+    Configurations} is selected from the \guilabel{Settings for:}
      dropdown list.  Select the C/\Cpp{} tab.  Choose the General
      category in the popup menu at the top.  Type the following text in
      the entry box labeled \guilabel{Additional Include Directories}:



--Irmen
From shane.holloway at ieee.org  Tue Jan  4 00:30:11 2005
From: shane.holloway at ieee.org (Shane Holloway (IEEE))
Date: Tue Jan  4 00:30:38 2005
Subject: [Python-Dev] Zipfile needs?
In-Reply-To: <cqq8mc$3g9$1@sea.gmane.org>
References: <cqq8mc$3g9$1@sea.gmane.org>
Message-ID: <41D9D583.6060400@ieee.org>

Scott David Daniels wrote:
> What other wish list things do people around here have for zipfile?  I thought I'd collect input here
> and make a PEP.

I was working on a project based around modifying zip files, and found 
that python just doesn't implement that part.  I'd like to see the 
ability to remove a file in the archive, as well as "write over" a file 
already in the archive.

It's a tall order, but you asked.  ;)

Thanks,
-Shane Holloway
From martin at v.loewis.de  Tue Jan  4 00:42:54 2005
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue Jan  4 00:42:50 2005
Subject: [Python-Dev] Small fix for windows.tex
In-Reply-To: <41D9D590.2020006@xs4all.nl>
References: <41D9D590.2020006@xs4all.nl>
Message-ID: <41D9D87E.9050501@v.loewis.de>

Irmen de Jong wrote:
> The current cvs docs failed to build for me, because of a small
> misspelling in the windows.tex file. Here is a patch:

Thanks, fixed.

Martin
From aahz at pythoncraft.com  Tue Jan  4 01:13:14 2005
From: aahz at pythoncraft.com (Aahz)
Date: Tue Jan  4 01:13:17 2005
Subject: [Python-Dev] Out-of-date FAQs
In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DE721224@au3010avexu1.global.avaya.com>
References: <338366A6D2E2CA4C9DAEAE652E12A1DE721224@au3010avexu1.global.avaya.com>
Message-ID: <20050104001314.GA17136@panix.com>

On Tue, Jan 04, 2005, Delaney, Timothy C (Timothy) wrote:
>
> As a general rule, when an idiom has changed, do we want to state both
> the 2.4 idiom as well as the 2.3 idiom? In the case of list.sort(), that
> would mean having both:
> 
>     for key in sorted(dict.iterkeys()):
>         ...do whatever with dict[key]...
> 
> and
> 
>     keys = dict.keys()
>     keys.sort()
>     for key in keys:
>         ...do whatever with dict[key]...

Yes.  Until last July, the company I work for was still using 1.5.2.
Our current version is 2.2.  I think that the FAQ should be usable for
anyone with a "reasonably current" version of Python, say at least two
major versions.  IOW, answers should continue to work with 2.2 during
the lifetime of 2.4.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing."  --Alan Perlis
From tdelaney at avaya.com  Tue Jan  4 01:26:17 2005
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Tue Jan  4 01:26:25 2005
Subject: [Python-Dev] Out-of-date FAQs
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE02520253@au3010avexu1.global.avaya.com>

Aahz wrote:

> Yes.  Until last July, the company I work for was still using 1.5.2.
> Our current version is 2.2.  I think that the FAQ should be usable for
> anyone with a "reasonably current" version of Python, say at least two
> major versions.  IOW, answers should continue to work with 2.2 during
> the lifetime of 2.4.

That seems reasonable to me.

Tim Delaney
From tim.peters at gmail.com  Tue Jan  4 02:38:08 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue Jan  4 02:38:11 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks
	allocations
In-Reply-To: <7D3892F0-5DD8-11D9-ACB4-000A9567635C@redivi.com>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
	<E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050102231638b0d39d@mail.gmail.com>
	<D8729D50-5D9E-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050103134942ab1696@mail.gmail.com>
	<7D3892F0-5DD8-11D9-ACB4-000A9567635C@redivi.com>
Message-ID: <1f7befae05010317387667d8b1@mail.gmail.com>

[Bob Ippolito]
> ...
> What about for list objects that are big at some point, then
> progressively shrink, but happen to stick around for a while?  An
> "event queue" that got clogged for some reason and then became stable?

It's less plausible that we're going to see a lot of these
simultaneously alive.  It's possible, of course.  Note that if we do,
fiddling PyObject_Realloc() won't help:  list resizing goes thru the
PyMem_RESIZE() macro, which calls the platform realloc() directly in a
release build (BTW, I suspect that when you were looking for realloc()
calls, you were looking for the string "realloc(" -- but that's not
the only spelling; we don't even have alphabetical choke points
<wink>).

The list object itself goes thru Python's small-object allocator,
which makes sense because a list object has a small fixed size
independent of list length.  Space for list elements is allocated
separately from the list object, and talks to the platform
malloc/free/realloc directly (via how the PyMem_XYZ macros resolve in
release builds).

> Dictionaries?

They're not a potential problem here -- dict resizing (whether growing
or shrinking) always proceeds by allocating new space for the dict
guts, copying over elements from the original space, then freeing the
original space.  This is because the hash slot assigned to a key can
change when the table size changes, and keeping collision chains
straight is a real bitch if you try to do it in-place.  IOW, there are
implementation reasons why CPython dicts will probably never use
realloc().
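
A minimal illustration of why: the slot a key maps to depends on the table size (assuming power-of-two tables masked the way CPython's dicts do it), so after a resize every key must be re-inserted into the new table anyway:

```python
def slot(key, table_size):
    # CPython dicts keep table sizes at powers of two and mask the hash.
    return hash(key) & (table_size - 1)

# Growing the table from 8 to 32 slots moves most keys to new slots,
# which is why the guts are rebuilt rather than realloc()'ed in place.
moved = [k for k in range(64) if slot(k, 8) != slot(k, 32)]
assert moved
```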

> Of course these potential problems are a lot less likely to happen.

I think so.

Guido's suggestion to look at PyString_Resize (etc) instead could be a
good one, since those methods know both the number of thingies (bytes,
list elements, tuple elements, ...) currently allocated and the number
of thingies being asked for.  That could be exploited by a portable
heuristic (like malloc+memcpy+free if the new number of thingies is at
least a quarter less than the old number of thingies, else let realloc
(however spelled) exercise its own judgment).  Since list_resize()
doesn't go thru pymalloc, that's the only clear way to worm around
realloc() quirks for lists.
From gvanrossum at gmail.com  Tue Jan  4 02:42:54 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan  4 02:42:57 2005
Subject: [Python-Dev] Please help complete the AST branch
Message-ID: <ca471dc2050103174221d7442f@mail.gmail.com>

The AST branch has been "nearly complete" for several Python versions
now. The last time a serious effort was made was in May I believe, but
it wasn't enough to merge the code back into 2.4, alas.

It would be a real shame if this code was abandoned. If we're going to
make progress with things like type inferencing, integrating
PyChecker, or optional static type checking (see my blog on Artima --
I just finished rambling part II), the AST branch would be a much
better starting point than the current mainline bytecode compiler.
(Arguably, the compiler package, written in Python, would make an even
better start for prototyping, but I don't expect that it will ever be
fast enough to be Python's only bytecode compiler.)

So, I'm pleading. Please, someone, either from the established crew of
developers or a new volunteer (or more than one!), try to help out to
complete the work on the AST branch and merge it into 2.5.

I wish I could do this myself, and I *am* committed to more time for
Python than last year, but I think I should try to focus on language
design issues more than implementation issues. (Although I  haven't
heard what Larry Wall has been told -- apparently the Perl developers
don't want Larry writing code any more. :-)

Please, anyone? Raymond? Neil? Facundo? Brett?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From bac at OCF.Berkeley.EDU  Tue Jan  4 03:02:52 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Tue Jan  4 03:03:18 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <ca471dc2050103174221d7442f@mail.gmail.com>
References: <ca471dc2050103174221d7442f@mail.gmail.com>
Message-ID: <41D9F94C.3020005@ocf.berkeley.edu>

Guido van Rossum wrote:
> The AST branch has been "nearly complete" for several Python versions
> now. The last time a serious effort was made was in May I believe, but
> it wasn't enough to merge the code back into 2.4, alas.
> 
> It would be a real shame if this code was abandoned.
[SNIP]
> So, I'm pleading. Please, someone, either from the established crew of
> developers or a new volunteer (or more than one!), try to help out to
> complete the work on the AST branch and merge it into 2.5.
> 
[SNIP]
> Please, anyone? Raymond? Neil? Facundo? Brett?
> 

Funny you should send this out today.  I just did some jiggling with my 
schedule so I could take the undergrad language back-end course this quarter. 
This led to me needing to take a grad-level projects class in Spring.  And what 
was the first suggestion my professor had for that course credit in Spring?

Finish the AST branch.  I am dedicated to finishing the AST branch as soon as 
my thesis is finished, class credit or no.  I just can't delve into that large 
of a project until I get my school stuff in order.  But if I get to do it for 
my class credit I will be able to dedicate 4 units of work to it a week (about 
8 hours minimum).

Plus there is the running tradition of sprinting on the AST branch at PyCon.  I 
was planning on shedding my bug fixing drive at PyCon this year and sprinting 
with (hopefully) Jeremy, Neal, Tim, and Neil on the AST branch as a prep for 
working on it afterwards for my class credit.

Although if someone can start sooner, then by all means go for it!  I can find 
something else to get credit for (such as finishing my monster of a paper 
comparing Python to Java; 34 single-spaced pages just covering paradigm support 
and the standard libraries so far).  And obviously help would be great since it 
isn't a puny codebase (4,000 lines so far for the CST->AST and AST->bytecode code).

If anyone would like to see the current code, check out ast-branch from CVS 
(read the dev FAQ on how to check out a branch from CVS).  Read 
Python/compile.txt for an overview of how the thing works and such.

It will get done, just don't push for a 2.5 release within a month.  =)

-Brett
From jepler at unpythonic.net  Tue Jan  4 03:19:09 2005
From: jepler at unpythonic.net (Jeff Epler)
Date: Tue Jan  4 03:19:12 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <41D9F94C.3020005@ocf.berkeley.edu>
References: <ca471dc2050103174221d7442f@mail.gmail.com>
	<41D9F94C.3020005@ocf.berkeley.edu>
Message-ID: <20050104021909.GB11833@unpythonic.net>

On Mon, Jan 03, 2005 at 06:02:52PM -0800, Brett C. wrote:
> Although if someone can start sooner, then by all means go for it!
> And obviously help would be great since it isn't a puny codebase
> (4,000 lines so far for the CST->AST and AST->bytecode code).

And obviously knowing a little more about the AST branch would be
helpful for those considering helping.

Is there any relatively up-to-date document about ast-branch?  googling
about it turned up some pypy stuff from 2003, and I didn't look much
further.

I just built the ast-branch for fun, and "make test" mostly worked.
    8 tests failed:
        test_builtin test_dis test_generators test_inspect test_pep263
        test_scope test_symtable test_trace
    6 skips unexpected on linux2:
        test_csv test_hotshot test_bsddb test_parser test_logging
        test_email
I haven't looked at any of the failures in detail, but at least
test_bsddb is due to missing development libs on this system

One more thing:  The software I work on by day has python scripting.
One part of that functionality is a tree display of a script.  I'm not
actively involved with this part of the software (yet).  Any comments on
whether ast-branch could be construed as helping make this kind of
functionality work better, faster, or easier?  The code we use currently
is based on a modified version of the parser which includes comment
information, so we need to be aware of changes in this area anyhow.

(on the other hand, I won't hold my breath for permission to do this
on the clock, because of our own release scheduling I have other
projects on my plate now, and a version of our software that uses a
post-2.3 Python is years away)

Jeff
From jhylton at gmail.com  Tue Jan  4 05:03:33 2005
From: jhylton at gmail.com (Jeremy Hylton)
Date: Tue Jan  4 05:03:36 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <41D9F94C.3020005@ocf.berkeley.edu>
References: <ca471dc2050103174221d7442f@mail.gmail.com>
	<41D9F94C.3020005@ocf.berkeley.edu>
Message-ID: <e8bf7a530501032003575615d@mail.gmail.com>

On Mon, 03 Jan 2005 18:02:52 -0800, Brett C. <bac@ocf.berkeley.edu> wrote:
> Plus there is the running tradition of sprinting on the AST branch at PyCon.  I
> was planning on shedding my bug fixing drive at PyCon this year and sprinting
> with (hopefully) Jeremy, Neal, Tim, and Neil on the AST branch as a prep for
> working on it afterwards for my class credit.

I'd like to sprint on it before PyCon; we'll have to see what my
schedule allows.

> If anyone would like to see the current code, check out ast-branch from CVS
> (read the dev FAQ on how to check out a branch from CVS).  Read
> Python/compile.txt for an overview of how the thing works and such.
> 
> It will get done, just don't push for a 2.5 release within a month.  =)

I think the branch is in an awkward state, because of the new features
added to Python 2.4 after the AST branch work ceased.  The ast branch
doesn't handle generator expressions or decorators; extending the ast
to support them would be a good first step.

There are also the simple logistical questions of integrating changes.
 Since most of the AST branch changes are confined to a few files, I
suspect the best thing to do is to merge all the changes from the head
except for compile.c.  I haven't done a major CVS branch integrate in
at least nine months; if someone feels more comfortable with that, it
would also be a good step.

Perhaps interested parties should take up the discussion on the
compiler-sig.  I think we can recover the state of last May's effort
pretty quickly, and I can help outline the remaining work even if I
can't help much.  (Although I hope I can help, too.)

Jeremy
From t-meyer at ihug.co.nz  Tue Jan  4 05:17:03 2005
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Tue Jan  4 05:17:40 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <ECBA357DDED63B4995F5C1F5CBE5B1E801C10A93@its-xchg4.massey.ac.nz>
Message-ID: <ECBA357DDED63B4995F5C1F5CBE5B1E8FBCBD2@its-xchg4.massey.ac.nz>

> Perhaps interested parties should take up the discussion on 
> the compiler-sig.

This isn't listed in the 'currently active' SIGs list on
<http://python.org/sigs/> - is it still active, or will it now be?  If so,
perhaps it should be added to the list?

By 'discussion on', do you mean via the wiki at
<http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/PythonAST>?

=Tony.Meyer

From theller at python.net  Tue Jan  4 11:00:15 2005
From: theller at python.net (Thomas Heller)
Date: Tue Jan  4 10:58:51 2005
Subject: [Python-Dev] Mac questions
Message-ID: <u0px4iq8.fsf@python.net>

I'm working on refactoring Python/import.c, currently the case_ok()
function.

I was wondering about these lines:
  /* new-fangled macintosh (macosx) */
  #elif defined(__MACH__) && defined(__APPLE__) && defined(HAVE_DIRENT_H)

Is this for Mac OSX? Does the Mac have a case insensitive file system
(my experiments on the SF compile farm say no)?

And finally: Is there any other way to find the true spelling of a file
other than a linear search with opendir()/readdir()/closedir() ?
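
For reference, here is roughly what that linear search looks like,
sketched in Python (the helper name is made up; case_ok() does the
equivalent in C):

```python
import os

def true_spelling(dirname, name):
    """Linear scan: return the on-disk spelling of `name` inside
    `dirname`, comparing case-insensitively -- what the
    opendir()/readdir()/closedir() loop amounts to."""
    wanted = name.lower()
    for entry in os.listdir(dirname):   # one readdir() pass
        if entry.lower() == wanted:
            return entry
    return None
```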

Thomas

From bob at redivi.com  Tue Jan  4 11:41:03 2005
From: bob at redivi.com (Bob Ippolito)
Date: Tue Jan  4 11:41:12 2005
Subject: [Python-Dev] Mac questions
In-Reply-To: <u0px4iq8.fsf@python.net>
References: <u0px4iq8.fsf@python.net>
Message-ID: <21E1D0D2-5E3D-11D9-A787-000A9567635C@redivi.com>

On Jan 4, 2005, at 5:00 AM, Thomas Heller wrote:

> I'm working on refactoring Python/import.c, currently the case_ok()
> function.
>
> I was wondering about these lines:
>   /* new-fangled macintosh (macosx) */
>   #elif defined(__MACH__) && defined(__APPLE__) && 
> defined(HAVE_DIRENT_H)
>
> Is this for Mac OSX? Does the Mac have a case insensitive file system
> (my experiments on the SF compile farm say no)?

Yes, this tests positive for Mac OS X (and probably other variants of 
Darwin).
Yes, Mac OS X uses a case preserving but insensitive file system by 
default (HFS+), but has case sensitive file systems (UFS, and a case 
sensitive version of HFS+, NFS, etc.).  The SF compile farm may use one 
of these alternative file systems, probably NFS if anything.

> And finally: Is there any other way to find the true spelling of a file
> other than a linear search with opendir()/readdir()/closedir() ?

Yes, definitely.  I'm positive you can do this with CoreServices, but 
I'm not sure it's portable to Darwin (not Mac OS X).  I'm sure there is 
some Darwin-compatible way of doing it, but I don't know it off the top 
of my head.  I'll try to remember to look into it if nobody else finds 
it first.

-bob

From Jack.Jansen at cwi.nl  Tue Jan  4 11:56:09 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Tue Jan  4 11:54:47 2005
Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks
	allocations
In-Reply-To: <7D3892F0-5DD8-11D9-ACB4-000A9567635C@redivi.com>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
	<E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050102231638b0d39d@mail.gmail.com>
	<D8729D50-5D9E-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050103134942ab1696@mail.gmail.com>
	<7D3892F0-5DD8-11D9-ACB4-000A9567635C@redivi.com>
Message-ID: <3DA96FE4-5E3F-11D9-A0C3-000A958D1666@cwi.nl>


On 3 Jan 2005, at 23:40, Bob Ippolito wrote:
> Most people on Mac OS X have a lot of memory, and Mac OS X generally 
> does a good job about swapping in and out without causing much of a 
> problem, so I'm personally not very surprised that it could go 
> unnoticed this long.

*Except* when you're low on free disk space. 10.2 and before were
really bad with this, usually hanging the machine; 10.3 is better, but
it's still pretty bad compared to other unixen. It probably has
something to do with the way OSX overcommits memory and swapspace, for
which it apparently uses a different algorithm than FreeBSD or Linux.

I wouldn't be surprised if the bittorrent problem report in this thread 
was due to being low on diskspace. And that could also be true for the 
original error report that sparked this discussion.
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

From bob at redivi.com  Tue Jan  4 12:25:46 2005
From: bob at redivi.com (Bob Ippolito)
Date: Tue Jan  4 12:25:54 2005
Subject: [Pythonmac-SIG] Re: [Python-Dev] Darwin's realloc(...)
	implementation never shrinks allocations
In-Reply-To: <3DA96FE4-5E3F-11D9-A0C3-000A958D1666@cwi.nl>
References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com>
	<1f7befae05010221134a94eccd@mail.gmail.com>
	<E0C19636-5D4D-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050102231638b0d39d@mail.gmail.com>
	<D8729D50-5D9E-11D9-8981-000A9567635C@redivi.com>
	<1f7befae050103134942ab1696@mail.gmail.com>
	<7D3892F0-5DD8-11D9-ACB4-000A9567635C@redivi.com>
	<3DA96FE4-5E3F-11D9-A0C3-000A958D1666@cwi.nl>
Message-ID: <60DD4D4B-5E43-11D9-A787-000A9567635C@redivi.com>

On Jan 4, 2005, at 5:56 AM, Jack Jansen wrote:

> On 3 Jan 2005, at 23:40, Bob Ippolito wrote:
>> Most people on Mac OS X have a lot of memory, and Mac OS X generally 
>> does a good job about swapping in and out without causing much of a 
>> problem, so I'm personally not very surprised that it could go 
>> unnoticed this long.
>
> *Except* when you're low on free disk space. 10.2 and before were 
> really bad with this, usually hanging the machine, 10.3 is better but 
> it's still pretty bad when compared to other unixen. It probably has 
> something to do with the way OSX overcommits memory and swapspace, for 
> which it apparently uses a different algorithm than FreeBSD or Linux.
>
> I wouldn't be surprised if the bittorrent problem report in this 
> thread was due to being low on diskspace. And that could also be true 
> for the original error report that sparked this discussion.

I was able to trigger this bug with a considerable amount of free disk 
space using a laptop that has 1GB of RAM, although I did have to 
increase the buffer size from the given example quite a bit to get it 
to fail.  After all, a 32-bit process can't have more than 4 GB of 
addressable memory.  I am pretty sure that OS X is never supposed to 
overcommit memory.  The disk thrashing probably has a lot to do with 
the fact that Mac OS X will grow and shrink its swap based on demand, 
rather than having a fixed size swap partition as is common on other 
unixen.  I've never seen the problem myself, though.

 From what I remember about Linux, its malloc implementation merely 
increases the address space of a process.  The actual allocation will 
happen when you try and access the memory, and if it's overcommitted 
things will fail in a bad way.

-bob

From Jack.Jansen at cwi.nl  Tue Jan  4 13:42:26 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Tue Jan  4 13:42:49 2005
Subject: [Python-Dev] Mac questions
In-Reply-To: <21E1D0D2-5E3D-11D9-A787-000A9567635C@redivi.com>
References: <u0px4iq8.fsf@python.net>
	<21E1D0D2-5E3D-11D9-A787-000A9567635C@redivi.com>
Message-ID: <16D0A778-5E4E-11D9-8F0D-000A958D1666@cwi.nl>


On 4 Jan 2005, at 11:41, Bob Ippolito wrote:
>> And finally: Is there any other way to find the true spelling of a 
>> file
>> other than a linear search with opendir()/readdir()/closedir() ?
>
> Yes, definitely.  I'm positive you can do this with CoreServices, but 
> I'm not sure it's portable to Darwin (not Mac OS X).  I'm sure there 
> is some Darwin-compatible way of doing it, but I don't know it off the 
> top of my head.  I'll try to remember to look into it if nobody else 
> finds it first.

I haven't used pure darwin, but I assume it has support for FSRefs, 
right? Then you could use FSPathMakeRef() to turn the filename into an 
FSRef, and then FSGetCatalogInfo() to get the true filename.
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

From barry at python.org  Tue Jan  4 13:43:28 2005
From: barry at python.org (Barry Warsaw)
Date: Tue Jan  4 13:43:32 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <ECBA357DDED63B4995F5C1F5CBE5B1E8FBCBD2@its-xchg4.massey.ac.nz>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E8FBCBD2@its-xchg4.massey.ac.nz>
Message-ID: <1104842608.3227.60.camel@presto.wooz.org>

On Mon, 2005-01-03 at 23:17, Tony Meyer wrote:
> > Perhaps interested parties should take up the discussion on 
> > the compiler-sig.
> 
> This isn't listed in the 'currently active' SIGs list on
> <http://python.org/sigs/> - is it still active, or will it now be?  If so,
> perhaps it should be added to the list?
> 
> By 'discussion on', do you mean via the wiki at
> <http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/PythonAST>?

If compiler-sig is where ASTers want to hang out, I'd be happy to
resurrect it.

-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20050104/720764ad/attachment.pgp
From bob at redivi.com  Tue Jan  4 13:56:34 2005
From: bob at redivi.com (Bob Ippolito)
Date: Tue Jan  4 13:56:43 2005
Subject: [Python-Dev] Mac questions
In-Reply-To: <16D0A778-5E4E-11D9-8F0D-000A958D1666@cwi.nl>
References: <u0px4iq8.fsf@python.net>
	<21E1D0D2-5E3D-11D9-A787-000A9567635C@redivi.com>
	<16D0A778-5E4E-11D9-8F0D-000A958D1666@cwi.nl>
Message-ID: <0FE43B78-5E50-11D9-A950-000A9567635C@redivi.com>


On Jan 4, 2005, at 7:42 AM, Jack Jansen wrote:

>
> On 4 Jan 2005, at 11:41, Bob Ippolito wrote:
>>> And finally: Is there any other way to find the true spelling of a 
>>> file
>>> other than a linear search with opendir()/readdir()/closedir() ?
>>
>> Yes, definitely.  I'm positive you can do this with CoreServices, but 
>> I'm not sure it's portable to Darwin (not Mac OS X).  I'm sure there 
>> is some Darwin-compatible way of doing it, but I don't know it off 
>> the top of my head.  I'll try to remember to look into it if nobody 
>> else finds it first.
>
> I haven't used pure darwin, but I assume it has support for FSRefs, 
> right? Then you could use FSPathMakeRef() to turn the filename into an 
> FSRef, and then FSGetCatalogInfo() to get the true filename.

I believe your assumption is wrong.  CoreServices is not open source, 
and this looks like it confirms my suspicion:

(from <CoreFoundation/CFURL.h>)

#if !defined(DARWIN)

struct FSRef;

CF_EXPORT
CFURLRef CFURLCreateFromFSRef(CFAllocatorRef allocator, const struct 
FSRef *fsRef);

CF_EXPORT
Boolean CFURLGetFSRef(CFURLRef url, struct FSRef *fsRef);
#endif /* !DARWIN */

-bob

From jhylton at gmail.com  Tue Jan  4 14:25:18 2005
From: jhylton at gmail.com (Jeremy Hylton)
Date: Tue Jan  4 14:25:21 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <1104842608.3227.60.camel@presto.wooz.org>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E8FBCBD2@its-xchg4.massey.ac.nz>
	<1104842608.3227.60.camel@presto.wooz.org>
Message-ID: <e8bf7a530501040525640fb674@mail.gmail.com>

The list archives look like they are mostly full of spam, but it's
also the only list we've used to discuss the ast work.  I haven't
really worried whether the sig was "active," as long as the list was
around.  I don't mind if you want to resurrect it.  Is there some way
to delete the spam from the archives?

By "discussion on" I meant a discussion of the remaining work.  I'm
not sure why you quoted just that part.  I was suggesting that there
is an ongoing discussion that should continue on the compiler-sig.

Jeremy



On Tue, 04 Jan 2005 07:43:28 -0500, Barry Warsaw <barry@python.org> wrote:
> On Mon, 2005-01-03 at 23:17, Tony Meyer wrote:
> > > Perhaps interested parties should take up the discussion on
> > > the compiler-sig.
> >
> > This isn't listed in the 'currently active' SIGs list on
> > <http://python.org/sigs/> - is it still active, or will it now be?  If so,
> > perhaps it should be added to the list?
> >
> > By 'discussion on', do you mean via the wiki at
> > <http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/PythonAST>?
> 
> If compiler-sig is where ASTers want to hang out, I'd be happy to
> resurrect it.
> 
> -Barry
> 
> 
>
From gvanrossum at gmail.com  Tue Jan  4 16:31:30 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan  4 16:57:21 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <e8bf7a530501040525640fb674@mail.gmail.com>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E8FBCBD2@its-xchg4.massey.ac.nz>
	<1104842608.3227.60.camel@presto.wooz.org>
	<e8bf7a530501040525640fb674@mail.gmail.com>
Message-ID: <ca471dc20501040731142eccf2@mail.gmail.com>

>I was suggesting that there
> is an ongoing discussion that should continue on the compiler-sig.

I'd be fine with keeping this on python-dev too.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From jhylton at gmail.com  Tue Jan  4 17:17:33 2005
From: jhylton at gmail.com (Jeremy Hylton)
Date: Tue Jan  4 17:17:36 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <ca471dc20501040731142eccf2@mail.gmail.com>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E8FBCBD2@its-xchg4.massey.ac.nz>
	<1104842608.3227.60.camel@presto.wooz.org>
	<e8bf7a530501040525640fb674@mail.gmail.com>
	<ca471dc20501040731142eccf2@mail.gmail.com>
Message-ID: <e8bf7a5305010408177786b70@mail.gmail.com>

That's fine with me.  We had taken it to the compiler-sig when it
wasn't clear there was interest in the ast branch :-).

Jeremy


On Tue, 4 Jan 2005 07:31:30 -0800, Guido van Rossum
<gvanrossum@gmail.com> wrote:
> >I was suggesting that there
> > is an ongoing discussion that should continue on the compiler-sig.
> 
> I'd be fine with keeping this on python-dev too.
> 
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
From barry at python.org  Tue Jan  4 19:13:57 2005
From: barry at python.org (Barry Warsaw)
Date: Tue Jan  4 19:14:07 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <e8bf7a5305010408177786b70@mail.gmail.com>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E8FBCBD2@its-xchg4.massey.ac.nz>
	<1104842608.3227.60.camel@presto.wooz.org>
	<e8bf7a530501040525640fb674@mail.gmail.com>
	<ca471dc20501040731142eccf2@mail.gmail.com>
	<e8bf7a5305010408177786b70@mail.gmail.com>
Message-ID: <1104862406.12499.6.camel@geddy.wooz.org>

On Tue, 2005-01-04 at 11:17, Jeremy Hylton wrote:
> That's fine with me.  We had taken it to the compiler-sig when it
> wasn't clear there was interest in the ast branch :-).

Ok, then I'll leave compiler-sig where it is.
-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20050104/d634ed53/attachment.pgp
From gvanrossum at gmail.com  Tue Jan  4 19:17:06 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan  4 19:17:09 2005
Subject: [Python-Dev] Fwd: Thank You! :)
In-Reply-To: <011f01c4f289$e6b41760$3a0a000a@entereza.com>
References: <011f01c4f289$e6b41760$3a0a000a@entereza.com>
Message-ID: <ca471dc205010410174a600715@mail.gmail.com>

This really goes to all python-dev folks!


---------- Forwarded message ----------
From: Erik Johnson <ej@wellkeeper.com>
Date: Tue, 4 Jan 2005 11:19:15 -0700
Subject: Thank You! :)
To: guido@python.org


 
    You probably get a number of messages like this, but here is mine... 
  
    My name is Erik Johnson, and I work in Albuquerque, NM for a small
company called WellKeeper, Inc. We do remote oil & gas well
monitoring, and I am using Python as a replacement for Perl & PHP,
both for supporting dynamic web pages as well as driving a number of
non-web-based servers and data processors.
  
    I just wanted to take a moment and say "Thank you!" to you, Guido,
and your team for developing Python and then so generously sharing it
with the world. I know it must be a pretty thankless job sometimes. I
am still a neophyte Python hacker (Pythonista?), but I have been
pretty impressed with Python so far, and am looking forward to
learning Python better and accomplishing more with it in the near and
not too distant future.
  
    So... thanks again, Happy New Year, and best wishes to you, your
family, and your Python team for 2005!  (I hope you will pass these
good wishes along to your team.)
  
Sincerely, 
Erik Johnson 

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From skip at pobox.com  Tue Jan  4 19:23:12 2005
From: skip at pobox.com (Skip Montanaro)
Date: Tue Jan  4 19:23:04 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <ca471dc20501040731142eccf2@mail.gmail.com>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E8FBCBD2@its-xchg4.massey.ac.nz>
	<1104842608.3227.60.camel@presto.wooz.org>
	<e8bf7a530501040525640fb674@mail.gmail.com>
	<ca471dc20501040731142eccf2@mail.gmail.com>
Message-ID: <16858.57104.116033.996927@montanaro.dyndns.org>


    >> I was suggesting that there is an ongoing discussion that should
    >> continue on the compiler-sig.

    Guido> I'd be fine with keeping this on python-dev too.

+1 for a number of reasons:

    * It's more visible and would potentially get more people interested in
      what's happening (and maybe participate)

    * The python-dev list archives are searched regularly by a number of
      people not on the list (more external visibility/involvement)

    * Brett would probably include progress reports in his python-dev
      summary (again, more external visibility/involvement)

    * Who really feels the need to subscribe to yet another mailing list?

Skip
From bob at redivi.com  Tue Jan  4 19:25:03 2005
From: bob at redivi.com (Bob Ippolito)
Date: Tue Jan  4 19:25:09 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <16858.57104.116033.996927@montanaro.dyndns.org>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E8FBCBD2@its-xchg4.massey.ac.nz>
	<1104842608.3227.60.camel@presto.wooz.org>
	<e8bf7a530501040525640fb674@mail.gmail.com>
	<ca471dc20501040731142eccf2@mail.gmail.com>
	<16858.57104.116033.996927@montanaro.dyndns.org>
Message-ID: <F35ECDD0-5E7D-11D9-A950-000A9567635C@redivi.com>


On Jan 4, 2005, at 1:23 PM, Skip Montanaro wrote:

>
>>> I was suggesting that there is an ongoing discussion that should
>>> continue on the compiler-sig.
>
>     Guido> I'd be fine with keeping this on python-dev too.
>
> +1 for a number of reasons:
>
>     * It's more visible and would potentially get more people 
> interested in
>       what's happening (and maybe participate)
>
>     * The python-dev list archives are searched regularly by a number 
> of
>       people not on the list (more external visibility/involvement)
>
>     * Brett would probably include progress reports in his python-dev
>       summary (again, more external visibility/involvement)
>
>     * Who really feels the need to subscribe to yet another mailing 
> list?

+1 for the same reasons

-bob

From gvanrossum at gmail.com  Tue Jan  4 19:28:03 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan  4 19:28:06 2005
Subject: [Python-Dev] Let's get rid of unbound methods
Message-ID: <ca471dc2050104102814be915b@mail.gmail.com>

In my blog I wrote:

Let's get rid of unbound methods. When class C defines a method f, C.f
should just return the function object, not an unbound method that
behaves almost, but not quite, the same as that function object. The
extra type checking on the first argument that unbound methods are
supposed to provide is not useful in practice (I can't remember that
it ever caught a bug in my code) and sometimes you have to work around
it; it complicates function attribute access; and the overloading of
unbound and bound methods on the same object type is confusing. Also,
the type checking offered is wrong, because it checks for subclassing
rather than for duck typing.
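
A sketch of the semantics the proposal asks for (illustrative Python,
not current 2.4 behavior, where C.f would be an unbound method):

```python
import types

class C(object):
    def f(self):
        return "f(%r)" % (self,)

# Proposed: C.f is just the function object, with no wrapper around it.
assert isinstance(C.f, types.FunctionType)

# So the first argument is not type-checked against C; any object the
# body can handle is acceptable (duck typing).
assert C.f(42) == "f(42)"
```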

This is a really simple change to begin with:

*** funcobject.c	28 Oct 2004 16:32:00 -0000	2.67
--- funcobject.c	4 Jan 2005 18:23:42 -0000
***************
*** 564,571 ****
  static PyObject *
  func_descr_get(PyObject *func, PyObject *obj, PyObject *type)
  {
! 	if (obj == Py_None)
! 		obj = NULL;
  	return PyMethod_New(func, obj, type);
  }
  
--- 564,573 ----
  static PyObject *
  func_descr_get(PyObject *func, PyObject *obj, PyObject *type)
  {
! 	if (obj == NULL || obj == Py_None) {
! 		Py_INCREF(func);
! 		return func;
! 	}
  	return PyMethod_New(func, obj, type);
  }
  
There are some test suite failures but I suspect they all have to do
with checking this behavior.

Of course, more changes would be needed: docs, the test suite, and
some simplifications to the instance method object implementation in
classobject.c.

Does anyone think this is a bad idea? Anyone want to run with it?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From jim at zope.com  Tue Jan  4 19:36:03 2005
From: jim at zope.com (Jim Fulton)
Date: Tue Jan  4 19:36:07 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <ca471dc2050104102814be915b@mail.gmail.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
Message-ID: <41DAE213.9070906@zope.com>

Guido van Rossum wrote:
> In my blog I wrote:
> 
> Let's get rid of unbound methods. When class C defines a method f, C.f
> should just return the function object, not an unbound method that
> behaves almost, but not quite, the same as that function object. The
> extra type checking on the first argument that unbound methods are
> supposed to provide is not useful in practice (I can't remember that
> it ever caught a bug in my code) and sometimes you have to work around
> it; it complicates function attribute access;

I think this is probably a good thing as it potentially avoids
some unintentional aliasing.

 > and the overloading of
> unbound and bound methods on the same object type is confusing. Also,
> the type checking offered is wrong, because it checks for subclassing
> rather than for duck typing.

duck typing?

> This is a really simple change to begin with:
> 
> *** funcobject.c	28 Oct 2004 16:32:00 -0000	2.67
> --- funcobject.c	4 Jan 2005 18:23:42 -0000
> ***************
> *** 564,571 ****
>   static PyObject *
>   func_descr_get(PyObject *func, PyObject *obj, PyObject *type)
>   {
> ! 	if (obj == Py_None)
> ! 		obj = NULL;
>   	return PyMethod_New(func, obj, type);
>   }
>   
> --- 564,573 ----
>   static PyObject *
>   func_descr_get(PyObject *func, PyObject *obj, PyObject *type)
>   {
> ! 	if (obj == NULL || obj == Py_None) {
> ! 		Py_INCREF(func);
> ! 		return func;
> ! 	}
>   	return PyMethod_New(func, obj, type);
>   }
>   
> There are some test suite failures but I suspect they all have to do
> with checking this behavior.
> 
> Of course, more changes would be needed: docs, the test suite, and
> some simplifications to the instance method object implementation in
> classobject.c.
> 
> Does anyone think this is a bad idea?

It *feels* very disruptive to me, but I'm probably wrong.
We'll still need unbound builtin methods, so the concept won't
go away. In fact, the change would mean that the behavior between
builtin methods and python methods would become more inconsistent.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
From bob at redivi.com  Tue Jan  4 19:39:44 2005
From: bob at redivi.com (Bob Ippolito)
Date: Tue Jan  4 19:39:58 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <ca471dc2050104102814be915b@mail.gmail.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
Message-ID: <00C9B1BE-5E80-11D9-A950-000A9567635C@redivi.com>


On Jan 4, 2005, at 1:28 PM, Guido van Rossum wrote:

> Let's get rid of unbound methods. When class C defines a method f, C.f
> should just return the function object, not an unbound method that
> behaves almost, but not quite, the same as that function object. The
> extra type checking on the first argument that unbound methods are
> supposed to provide is not useful in practice (I can't remember that
> it ever caught a bug in my code) and sometimes you have to work around
> it; it complicates function attribute access; and the overloading of
> unbound and bound methods on the same object type is confusing. Also,
> the type checking offered is wrong, because it checks for subclassing
> rather than for duck typing.

+1

I like this idea.  It may have some effect on current versions of 
PyObjC though, because we really do care about what self is in order to 
prevent crashes.  This is not a discouragement; we are already using 
custom descriptors and a metaclass, so it won't be a problem to do this 
ourselves if we are not doing it already.  I'll try and find some time 
later in the week to play with this patch to see if it does break 
PyObjC or not.  If it breaks PyObjC, I can make sure that PyObjC 1.3
will be compatible with such a runtime change, as we're due for a
refactoring in that area anyway.

-bob

From jack at performancedrivers.com  Tue Jan  4 19:42:17 2005
From: jack at performancedrivers.com (Jack Diederich)
Date: Tue Jan  4 19:42:21 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <ca471dc2050104102814be915b@mail.gmail.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
Message-ID: <20050104184217.GJ1404@performancedrivers.com>

On Tue, Jan 04, 2005 at 10:28:03AM -0800, Guido van Rossum wrote:
> In my blog I wrote:
> 
> Let's get rid of unbound methods. When class C defines a method f, C.f
> should just return the function object, not an unbound method that
> behaves almost, but not quite, the same as that function object. The
> extra type checking on the first argument that unbound methods are
> supposed to provide is not useful in practice (I can't remember that
> it ever caught a bug in my code) and sometimes you have to work around
> it; it complicates function attribute access; and the overloading of
> unbound and bound methods on the same object type is confusing. Also,
> the type checking offered is wrong, because it checks for subclassing
> rather than for duck typing.
> 
> Does anyone think this is a bad idea? Anyone want to run with it?
> 
I like the idea, it means I can get rid of this[1]

func = getattr(cls, 'do_command', None)
setattr(cls, 'do_command', staticmethod(func.im_func)) # don't let anyone on c.l.py see this

.. or at least change the comment *grin*,

-Jack

[1] http://cvs.sourceforge.net/viewcvs.py/lyntin/lyntin40/sandbox/leantin/mudcommands.py?view=auto
From aahz at pythoncraft.com  Tue Jan  4 19:47:06 2005
From: aahz at pythoncraft.com (Aahz)
Date: Tue Jan  4 19:47:08 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <41DAE213.9070906@zope.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<41DAE213.9070906@zope.com>
Message-ID: <20050104184706.GA3466@panix.com>

On Tue, Jan 04, 2005, Jim Fulton wrote:
> Guido van Rossum wrote:
>>
>> and the overloading of
>>unbound and bound methods on the same object type is confusing. Also,
>>the type checking offered is wrong, because it checks for subclassing
>>rather than for duck typing.
> 
> duck typing?

"If it looks like a duck and quacks like a duck, it must be a duck."

Python is often referred to as having duck typing because even without
formal interface declarations, good practice mostly depends on
conformant interfaces rather than subclassing to determine an object's
type.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing."  --Alan Perlis
From pje at telecommunity.com  Tue Jan  4 19:48:24 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan  4 19:48:18 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <ca471dc2050104102814be915b@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050104133943.02b69d60@mail.telecommunity.com>

At 10:28 AM 1/4/05 -0800, Guido van Rossum wrote:

>Of course, more changes would be needed: docs, the test suite, and
>some simplifications to the instance method object implementation in
>classobject.c.
>
>Does anyone think this is a bad idea?

Code that currently does 'aClass.aMethod.im_func' in order to access the 
function object would break, as would code that inspects 'im_self' to 
determine whether a method is a class or instance method.  (Although code 
of the latter sort would already break with static methods, I suppose.)

Cursory skimming of the first 100 Google hits for 'im_func' seems to show 
at least half a dozen instances of the first type of code, though.  Such 
code would also be in the difficult position of having to do things two 
ways in order to be both forward and backward compatible.
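
For what it's worth, such code could stay compatible either way with a
fallback like this (a sketch, assuming im_func simply disappears under
the proposal rather than changing meaning):

```python
class C(object):
    def method(self):
        return "ok"

attr = C.method

# Unbound method (today): unwrap the function via im_func.
# Plain function (under the proposal): im_func is absent, use it as-is.
func = getattr(attr, 'im_func', attr)

# Either way, func is the underlying function object and can be called
# with an explicit first argument.
assert func(C()) == "ok"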

Also, I seem to recall once having relied on the behavior of a 
dynamically-created unbound method (via new.instancemethod) in order to 
create a descriptor of some sort.  But I don't remember where or when I did 
it or whether I still care.  :)

From aleax at aleax.it  Tue Jan  4 19:48:49 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan  4 19:48:54 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <e8bf7a5305010408177786b70@mail.gmail.com>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E8FBCBD2@its-xchg4.massey.ac.nz>
	<1104842608.3227.60.camel@presto.wooz.org>
	<e8bf7a530501040525640fb674@mail.gmail.com>
	<ca471dc20501040731142eccf2@mail.gmail.com>
	<e8bf7a5305010408177786b70@mail.gmail.com>
Message-ID: <45D8D02A-5E81-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 04, at 17:17, Jeremy Hylton wrote:

> That's fine with me.  We had taken it to the compiler-sig when it
> wasn't clear there was interest in the ast branch :-).

Speaking for myself, I have a burning interest in the AST branch 
(though I can't seem to get it correctly downloaded so far, I guess 
it's just my usual CVS-clumsiness and I'll soon find out what I'm doing 
wrong & fix it), and if I could follow the discussion right here on 
python-dev that would sure be convenient (now that I've finally put the 
2nd ed of the Cookbook to bed and am finally reading python-dev again 
after all these months -- almost caught up with recent traffic 
too;-)...


Alex

From pje at telecommunity.com  Tue Jan  4 19:51:42 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan  4 19:51:37 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <41DAE213.9070906@zope.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<ca471dc2050104102814be915b@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050104134847.02b6bec0@mail.telecommunity.com>

At 01:36 PM 1/4/05 -0500, Jim Fulton wrote:
>duck typing?

AKA latent typing or, "if it walks like a duck and quacks like a duck, it 
must be a duck."  Or, more pythonically:

    if hasattr(ob,"quack") and hasattr(ob,"duckwalk"):
         # it's a duck

This is as distinct from both 'if isinstance(ob,Duck)' and 'if 
implements(ob,IDuck)'.  That is, "duck typing" is determining an object's 
type by inspection of its method/attribute signature rather than by 
explicit relationship to some type object.
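
Spelled out as running code (the class names are made up for
illustration):

```python
class Duck(object):
    def quack(self):
        return "quack"

class Robot(object):          # unrelated to Duck in the class hierarchy
    def quack(self):
        return "beep"

def make_it_quack(ob):
    # Duck typing: all we require is that ob has a quack() method;
    # no isinstance(ob, Duck) check is involved.
    return ob.quack()

assert make_it_quack(Duck()) == "quack"
assert make_it_quack(Robot()) == "beep"
```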


From olsongt at verizon.net  Tue Jan  4 17:30:22 2005
From: olsongt at verizon.net (olsongt@verizon.net)
Date: Tue Jan  4 19:55:11 2005
Subject: [Python-Dev] Will ASTbranch compile on windows yet?
Message-ID: <20050104163022.RWCC10436.out012.verizon.net@outgoing.verizon.net>

I submitted patch "[ 742621 ] ast-branch: msvc project sync" in the
VC6.0 days.  There were some required changes to headers as well as
the project files.  It had discouraged me in the past when Jeremy made
calls for help on the ast-branch and I wasn't even sure if the source
was in a compilable state when I checked it out.  I'm sure it has
discouraged other Windows programmers as well.

-Grant

From tim.peters at gmail.com  Tue Jan  4 19:57:10 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue Jan  4 19:57:13 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <ca471dc2050104102814be915b@mail.gmail.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
Message-ID: <1f7befae05010410576effd024@mail.gmail.com>

[Guido]
> In my blog I wrote:
> 
> Let's get rid of unbound methods. When class C defines a method
> f, C.f should just return the function object, not an unbound
> method that behaves almost, but not quite, the same as that
> function object. The extra type checking on the first argument that
> unbound methods are supposed to provide is not useful in practice
> (I can't remember that it ever caught a bug in my code)

Really?  Unbound methods are used most often (IME) to call a
base-class method from a subclass, like my_base.the_method(self, ...).
 It's especially easy to forget to write `self, ` there, and the
exception msg then is quite focused because of that extra bit of type
checking.  Otherwise I expect we'd see a more-mysterious
AttributeError or TypeError when the base method got around to trying
to do something with the bogus `self` passed to it.

I could live with that, though.
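A sketch of that failure mode (class and method names invented): with the upcall pattern, a forgotten `self` shifts every argument by one, and if C.f returned a plain function the complaint would be about argument counts rather than the unbound method's focused type check:

```python
class Base:
    def greet(self, name):
        return "hello, " + name

class Child(Base):
    def greet(self, name):
        # the upcall pattern: explicit base class, explicit self
        return Base.greet(self, name).upper()

class Broken(Base):
    def greet(self, name):
        return Base.greet(name)  # oops -- forgot `self, `

print(Child().greet("world"))  # HELLO, WORLD

try:
    Broken().greet("world")
    err = None
except TypeError as exc:
    err = str(exc)  # an argument-count complaint mentioning greet()
print(err)
```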

> and sometimes you have to work around it;

For me, 0 times in ... what? ... about 14 years <wink>.

> it complicates function attribute access; and the overloading of
> unbound and bound methods on the same object type is
> confusing.

Yup, it is a complication, without a compelling use case I know of. 
Across the Python, Zope2 and Zope3 code bases, types.UnboundMethodType
is defined once and used once (believe it or not, in unittest.py).

From tim.peters at gmail.com  Tue Jan  4 20:08:34 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue Jan  4 20:08:38 2005
Subject: [Python-Dev] Will ASTbranch compile on windows yet?
In-Reply-To: <20050104163022.RWCC10436.out012.verizon.net@outgoing.verizon.net>
References: <20050104163022.RWCC10436.out012.verizon.net@outgoing.verizon.net>
Message-ID: <1f7befae05010411082bd35aab@mail.gmail.com>

[olsongt@verizon.net]
> I submitted patch "[ 742621 ] ast-branch: msvc project sync" in
> the VC6.0 days.  There were some required changes to headers
> as well as the project files.  It had discouraged me in the past
> when Jeremy made calls for help on the astbranch and I wasn't
> even sure if the source was in a compilable state when I checked
> it out.  I'm sure it has discouraged other windows programmers
> as well.

I'd be surprised if it compiled on Windows now, as I don't think any
Windows users have been working on that branch.

At the last (2004) PyCon, I was going to participate in the annual AST
sprint again, but it was so far from working on Windows then I gave up
(and joined the close-bugs/patches sprint instead).

I don't have time to join the current crusade.  If there's pent-up
interest among Windows users, it would be good to say which
compiler(s) you can use, since I expect not everyone can deal with VC
7.1 (e.g., I think Raymond Hettinger is limited to VC 6; and you said
you worked up a VC 6 patch, but didn't say whether you could use 7.1
now).

From jim at zope.com  Tue Jan  4 20:12:39 2005
From: jim at zope.com (Jim Fulton)
Date: Tue Jan  4 20:12:43 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <5.1.1.6.0.20050104133943.02b69d60@mail.telecommunity.com>
References: <5.1.1.6.0.20050104133943.02b69d60@mail.telecommunity.com>
Message-ID: <41DAEAA7.2040706@zope.com>

Phillip J. Eby wrote:
> At 10:28 AM 1/4/05 -0800, Guido van Rossum wrote:
> 
>> Of course, more changes would be needed: docs, the test suite, and
>> some simplifications to the instance method object implementation in
>> classobject.c.
>>
>> Does anyone think this is a bad idea?
> 
> 
> Code that currently does 'aClass.aMethod.im_func' in order to access the 
> function object would break, as would code that inspects 'im_self' to 
> determine whether a method is a class or instance method.  (Although 
> code of the latter sort would already break with static methods, I 
> suppose.)

Code of the latter sort wouldn't break with the change. We'd still
have bound methods.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From gvanrossum at gmail.com  Tue Jan  4 20:40:30 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan  4 20:40:33 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <41DAE213.9070906@zope.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<41DAE213.9070906@zope.com>
Message-ID: <ca471dc2050104114026228858@mail.gmail.com>

[Jim]
> We'll still need unbound builtin methods, so the concept won't
> go away. In fact, the change would mean that the behavior between
> builtin methods and python methods would become more inconsistent.

Actually, unbound builtin methods are a different type than bound
builtin methods:

>>> type(list.append)
<type 'method_descriptor'>
>>> type([].append)
<type 'builtin_function_or_method'>
>>> 

Compare this to the same thing for a method on a user-defined class:

>>> type(C.foo)
<type 'instancemethod'>
>>> type(C().foo)
<type 'instancemethod'>

(The 'instancemethod' type knows whether it is a bound or unbound
method by checking whether im_self is set.)

[Phillip]
> Code that currently does 'aClass.aMethod.im_func' in order to access the
> function object would break, as would code that inspects 'im_self' to
> determine whether a method is a class or instance method.  (Although code
> of the latter sort would already break with static methods, I suppose.)

Right. (But I think you're using the terminology in a confused way --
im_self distinguishes between bound and unbound methods. Class methods
are a different beast.)

I guess for backwards compatibility, function objects could implement
dummy im_func and im_self attributes (im_func returning itself and
im_self returning None), while issuing a warning that this is a
deprecated feature.
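A minimal sketch of that compatibility idea (purely hypothetical -- shown here as a wrapper class, not as a change to the real function type):

```python
import warnings

class compat_function:
    """Wraps a function; im_func returns the wrapper itself and
    im_self returns None, with a DeprecationWarning on each access."""
    def __init__(self, func):
        self._func = func

    def __call__(self, *args, **kwargs):
        return self._func(*args, **kwargs)

    @property
    def im_func(self):
        warnings.warn("im_func on plain functions is deprecated",
                      DeprecationWarning, stacklevel=2)
        return self

    @property
    def im_self(self):
        warnings.warn("im_self on plain functions is deprecated",
                      DeprecationWarning, stacklevel=2)
        return None

f = compat_function(lambda x: x + 1)
print(f(1))  # 2
```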

[Tim]
> Really?  Unbound methods are used most often (IME) to call a
> base-class method from a subclass, like my_base.the_method(self, ...).
>  It's especially easy to forget to write `self, ` there, and the
> exception msg then is quite focused because of that extra bit of type
> checking.  Otherwise I expect we'd see a more-mysterious
> AttributeError or TypeError when the base method got around to trying
> to do something with the bogus `self` passed to it.

Hm, I hadn't thought of this.

> I could live with that, though.

Most cases would be complaints about argument counts (it gets hairier
when there are default args so the arg count is variable). Ironically,
I get those all the time these days due to the reverse error: using
super() but forgetting *not* to pass self!

> Across the Python, Zope2 and Zope3 code bases, types.UnboundMethodType
> is defined once and used once (believe it or not, in unittest.py).

But that might be because BoundMethodType is the same type object...

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From python at rcn.com  Tue Jan  4 20:38:29 2005
From: python at rcn.com (Raymond Hettinger)
Date: Tue Jan  4 20:42:15 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <41DAE213.9070906@zope.com>
Message-ID: <003901c4f294$ff8e1640$e841fea9@oemcomputer>

[Guido van Rossum]
> > Let's get rid of unbound methods. 

+1



[Jim Fulton]
> duck typing?

Requiring a specific interface instead of a specific type.



[Guido]
> > Does anyone think this is a bad idea?
[Jim]
> It *feels* very disruptive to me, but I'm probably wrong.
> We'll still need unbound builtin methods, so the concept won't
> go away. In fact, the change would mean that the behavior between
> builtin methods and python methods would become more inconsistent.

The type change would be disruptive and guaranteed to break some code.
Also, it would partially breakdown the distinction between functions and
methods.

The behavior, on the other hand, would remain essentially the same (sans
type checking).



Raymond

From jim at zope.com  Tue Jan  4 20:44:43 2005
From: jim at zope.com (Jim Fulton)
Date: Tue Jan  4 20:44:47 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <ca471dc2050104114026228858@mail.gmail.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>	
	<41DAE213.9070906@zope.com>
	<ca471dc2050104114026228858@mail.gmail.com>
Message-ID: <41DAF22B.6030605@zope.com>

Guido van Rossum wrote:
> [Jim]
> 
>>We'll still need unbound builtin methods, so the concept won't
>>go away. In fact, the change would mean that the behavior between
>>builtin methods and python methods would become more inconsistent.
> 
> 
> Actually, unbound builtin methods are a different type than bound
> builtin methods:

Of course, but conceptually they are similar.  You would still
encounter the concept if you got an unbound builtin method.


Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From exarkun at divmod.com  Tue Jan  4 21:02:06 2005
From: exarkun at divmod.com (Jp Calderone)
Date: Tue Jan  4 21:02:09 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <ca471dc2050104102814be915b@mail.gmail.com>
Message-ID: <20050104200206.25734.345337731.divmod.quotient.921@ohm>

On Tue, 4 Jan 2005 10:28:03 -0800, Guido van Rossum <gvanrossum@gmail.com> wrote:
>In my blog I wrote:
> 
> Let's get rid of unbound methods. When class C defines a method f, C.f
> should just return the function object, not an unbound method that
> behaves almost, but not quite, the same as that function object. The
> extra type checking on the first argument that unbound methods are
> supposed to provide is not useful in practice (I can't remember that
> it ever caught a bug in my code) and sometimes you have to work around
> it; it complicates function attribute access; and the overloading of
> unbound and bound methods on the same object type is confusing. Also,
> the type checking offered is wrong, because it checks for subclassing
> rather than for duck typing.
> 

  This would make pickling (or any serialization mechanism) of 
`Class.method' based on name next to impossible.  Right now, with
the appropriate support, this works:

    >>> import pickle
    >>> class Foo:
    ...     def bar(self): pass
    ... 
    >>> pickle.loads(pickle.dumps(Foo.bar))
    <unbound method Foo.bar>
    >>> 

  I don't see how it could if Foo.bar were just a function object.

  Jp

From exarkun at divmod.com  Tue Jan  4 21:15:00 2005
From: exarkun at divmod.com (Jp Calderone)
Date: Tue Jan  4 21:15:05 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <20050104200206.25734.345337731.divmod.quotient.921@ohm>
Message-ID: <20050104201500.25734.946201879.divmod.quotient.934@ohm>

On Tue, 04 Jan 2005 20:02:06 GMT, Jp Calderone <exarkun@divmod.com> wrote:
>On Tue, 4 Jan 2005 10:28:03 -0800, Guido van Rossum <gvanrossum@gmail.com> wrote:
> >In my blog I wrote:
> > 
> > Let's get rid of unbound methods. When class C defines a method f, C.f
> > should just return the function object, not an unbound method that
> > behaves almost, but not quite, the same as that function object. The
> > extra type checking on the first argument that unbound methods are
> > supposed to provide is not useful in practice (I can't remember that
> > it ever caught a bug in my code) and sometimes you have to work around
> > it; it complicates function attribute access; and the overloading of
> > unbound and bound methods on the same object type is confusing. Also,
> > the type checking offered is wrong, because it checks for subclassing
> > rather than for duck typing.
> > 
> 
>   This would make pickling (or any serialization mechanism) of 
> `Class.method' based on name next to impossible.  Right now, with
> the appropriate support, this works:

  It occurs to me that perhaps I was not clear enough here.  

  What I mean is that it is possible to serialize unbound methods 
currently, because they refer to both their own name, the name of 
their class object, and thus indirectly to the module in which they 
are defined.

  If looking up a method on a class object instead returns a function, 
then the class is no longer knowable, and most likely the function will
not have a unique name which can be used to allow a reference to it to 
be serialized.

  In particular, I don't see how one will be able to write something 
equivalent to this:

    import new, copy_reg, types

    def pickleMethod(method):
        return unpickleMethod, (method.im_func.__name__,
                                method.im_self,
                                method.im_class)

    def unpickleMethod(im_name, im_self, im_class):
        unbound = getattr(im_class, im_name)
        if im_self is None:
            return unbound
        return new.instancemethod(unbound.im_func,
                                  im_self,
                                  im_class)

    copy_reg.pickle(types.MethodType, 
                    pickleMethod, 
                    unpickleMethod)

  But perhaps I am just overlooking the obvious.

  Jp

From gvanrossum at gmail.com  Tue Jan  4 21:18:15 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan  4 21:18:18 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <20050104200206.25734.345337731.divmod.quotient.921@ohm>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<20050104200206.25734.345337731.divmod.quotient.921@ohm>
Message-ID: <ca471dc205010412183879fe01@mail.gmail.com>

[me]
> > Actually, unbound builtin methods are a different type than bound
> > builtin methods:

[Jim]
> Of course, but conceptually they are similar.  You would still
> encounter the concept if you got an unbound builtin method.

Well, these are all just implementation details. They really are all
just callables.

[Jp]
>   This would make pickling (or any serialization mechanism) of
> `Class.method' based on name next to impossible.  Right now, with
> the appropriate support, this works:
> 
>     >>> import pickle
>     >>> class Foo:
>     ...     def bar(self): pass
>     ...
>     >>> pickle.loads(pickle.dumps(Foo.bar))
>     <unbound method Foo.bar>
>     >>>
> 
>   I don't see how it could if Foo.bar were just a function object.

Is this a purely theoretical objection or are you actually aware of
anyone doing this? Anyway, that approach is pretty limited -- how
would you do it for static and class methods, or methods wrapped by
other decorators?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From exarkun at divmod.com  Tue Jan  4 21:27:37 2005
From: exarkun at divmod.com (Jp Calderone)
Date: Tue Jan  4 21:27:41 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <ca471dc205010412183879fe01@mail.gmail.com>
Message-ID: <20050104202737.25734.1950245396.divmod.quotient.945@ohm>

On Tue, 4 Jan 2005 12:18:15 -0800, Guido van Rossum <gvanrossum@gmail.com> wrote:
>[me]
> > > Actually, unbound builtin methods are a different type than bound
> > > builtin methods:
> 
> [Jim]
> > Of course, but conceptually they are similar.  You would still
> > encounter the concept if you got an unbound builtin method.
> 
> Well, these are all just implementation details. They really are all
> just callables.
> 
> [Jp]
> >   This would make pickling (or any serialization mechanism) of
> > `Class.method' based on name next to impossible.  Right now, with
> > the appropriate support, this works:
> > 
> >     >>> import pickle
> >     >>> class Foo:
> >     ...     def bar(self): pass
> >     ...
> >     >>> pickle.loads(pickle.dumps(Foo.bar))
> >     <unbound method Foo.bar>
> >     >>>
> > 
> >   I don't see how it could if Foo.bar were just a function object.
> 
> Is this a purely theoretical objection or are you actually aware of
> anyone doing this? Anyway, that approach is pretty limited -- how
> would you do it for static and class methods, or methods wrapped by
> other decorators?

  It's not a feature I often depend on; however, I have made use of 
it on occasion.  Twisted supports serializing unbound methods 
this way, primarily to enhance the usability of tap files (a feature 
whereby an application is configured by constructing a Python object 
graph and then pickled to a file to later be loaded and run).

  "Objection" may be too strong a word for my stance here, I just 
wanted to point out another potentially incompatible behavior change.
I can't think of any software which I am currently developing or 
maintaining that benefits from this feature; it just seems 
unfortunate to further complicate the already unpleasant business 
of serialization.

  Jp

From pje at telecommunity.com  Tue Jan  4 21:31:57 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan  4 21:31:54 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <ca471dc2050104114026228858@mail.gmail.com>
References: <41DAE213.9070906@zope.com>
	<ca471dc2050104102814be915b@mail.gmail.com>
	<41DAE213.9070906@zope.com>
Message-ID: <5.1.1.6.0.20050104152023.02db7cc0@mail.telecommunity.com>

At 11:40 AM 1/4/05 -0800, Guido van Rossum wrote:
>[Jim]
> > We'll still need unbound builtin methods, so the concept won't
> > go away. In fact, the change would mean that the behavior between
> > builtin methods and python methods would become more inconsistent.
>
>Actually, unbound builtin methods are a different type than bound
>builtin methods:
>
> >>> type(list.append)
><type 'method_descriptor'>
> >>> type([].append)
><type 'builtin_function_or_method'>
> >>>
>
>Compare this to the same thing for a method on a user-defined class:
>
> >>> type(C.foo)
><type 'instancemethod'>
> >>> type(C().foo)
><type 'instancemethod'>
>
>(The 'instancemethod' type knows whether it is a bound or unbound
>method by checking whether im_self is set.)
>
>[Phillip]
> > Code that currently does 'aClass.aMethod.im_func' in order to access the
> > function object would break, as would code that inspects 'im_self' to
> > determine whether a method is a class or instance method.  (Although code
> > of the latter sort would already break with static methods, I suppose.)
>
>Right. (But I think you're using the terminology in a confused way --
>im_self distinguishes between bound and unbound methods. Class methods
>are a different beast.)

IIUC, when you do 'SomeClass.aMethod', if 'aMethod' is a classmethod, then 
you will receive a bound method with an im_self of 'SomeClass'.  So, if you 
are introspecting items listed in 'dir(SomeClass)', this will be your only 
clue that 'aMethod' is a class method.  Similarly, the fact that you get an 
unbound method object if 'aMethod' is an instance method allows you to 
distinguish it from a static method (if the object is a function).

That is, I'm saying that code that looks at the type and attributes of 
'aMethod' as retrieved from 'SomeClass' will now not be able to distinguish 
between a static method and an instance method, because both will return a 
function instance.

However, the 'inspect' module uses __dict__ rather than getattr to get at 
least some attributes, so it doesn't rely on this property.
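A small sketch of that __dict__-based approach (class and method names invented): the raw class dictionary holds the distinguishable descriptors, independent of what attribute access returns:

```python
import types

class SomeClass:
    def inst_method(self): pass

    @classmethod
    def cls_method(cls): pass

    @staticmethod
    def static_method(): pass

# __dict__ bypasses the descriptor protocol, so the three kinds
# of methods remain distinguishable here.
raw = SomeClass.__dict__
print(isinstance(raw["inst_method"], types.FunctionType))  # True
print(isinstance(raw["cls_method"], classmethod))          # True
print(isinstance(raw["static_method"], staticmethod))      # True
```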


>I guess for backwards compatibility, function objects could implement
>dummy im_func and im_self attributes (im_func returning itself and
>im_self returning None), while issuing a warning that this is a
>deprecated feature.

+1 on this part if the proposal goes through.

On the proposal as a whole, I'm -0, as I'm not quite clear on what this is 
going to simplify enough to justify the various semantic impacts such as 
upcalls, pickling, etc.  Method objects will still have to exist, so ISTM 
that this is only going to streamline the "__get__(None,type)" branch of 
functions' descriptor code, and the check for "im_self is None" in the 
__call__ of method objects.  (And maybe some eval loop shortcuts for 
calling methods?)

From bac at OCF.Berkeley.EDU  Tue Jan  4 22:11:54 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Tue Jan  4 22:12:16 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <45D8D02A-5E81-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E8FBCBD2@its-xchg4.massey.ac.nz>	<1104842608.3227.60.camel@presto.wooz.org>	<e8bf7a530501040525640fb674@mail.gmail.com>	<ca471dc20501040731142eccf2@mail.gmail.com>	<e8bf7a5305010408177786b70@mail.gmail.com>
	<45D8D02A-5E81-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <41DB069A.8030406@ocf.berkeley.edu>

Alex Martelli wrote:
> 
> On 2005 Jan 04, at 17:17, Jeremy Hylton wrote:
> 
>> That's fine with me.  We had taken it to the compiler-sig when it
>> wasn't clear there was interest in the ast branch :-).
> 
> 
> Speaking for myself, I have a burning interest in the AST branch (though 
> I can't seem to get it correctly downloaded so far, I guess it's just my 
> usual CVS-clumsiness and I'll soon find out what I'm doing wrong & fix 
> it)

See http://www.python.org/dev/devfaq.html#how-can-i-check-out-a-tagged-branch 
on how to do a checkout of a tagged branch.

-Brett

From bac at OCF.Berkeley.EDU  Tue Jan  4 22:50:28 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Tue Jan  4 22:50:39 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <20050104021909.GB11833@unpythonic.net>
References: <ca471dc2050103174221d7442f@mail.gmail.com>	<41D9F94C.3020005@ocf.berkeley.edu>
	<20050104021909.GB11833@unpythonic.net>
Message-ID: <41DB0FA4.3070405@ocf.berkeley.edu>

Jeff Epler wrote:
> On Mon, Jan 03, 2005 at 06:02:52PM -0800, Brett C. wrote:
> 
>>Although if someone can start sooner, then by all means, go for it!
>>And obviously help would be great since it isn't a puny codebase
>>(4,000 lines so far for the CST->AST and AST->bytecode code).
> 
> 
> And obviously knowing a little more about the AST branch would be
> helpful for those considering helping.
> 
> Is there any relatively up-to-date document about ast-branch?  googling
> about it turned up some pypy stuff from 2003, and I didn't look much
> further.
> 

Beyond the text file Python/compile.txt in CVS, nope.  I have tried to flesh 
that doc out as well as I could to explain how it all works.

If it doesn't answer all your questions then just ask here on python-dev (as 
the rest of this thread seems to have agreed upon).  I will do my best to make 
sure any info that needs to work its way back into the doc gets checked in.

> I just built the ast-branch for fun, and "make test" mostly worked.
>     8 tests failed:
>         test_builtin test_dis test_generators test_inspect test_pep263
>         test_scope test_symtable test_trace
>     6 skips unexpected on linux2:
>         test_csv test_hotshot test_bsddb test_parser test_logging
>         test_email
> I haven't looked at any of the failures in detail, but at least
> test_bsddb is due to missing development libs on this system
> 
> One more thing:  The software I work on by day has python scripting.
> One part of that functionality is a tree display of a script.  I'm not
> actively involved with this part of the software (yet).  Any comments on
> whether ast-branch could be construed as helping make this kind of
> functionality work better, faster, or easier?  The code we use currently
> is based on a modified version of the parser which includes comment
> information, so we need to be aware of changes in this area anyhow.
> 

If by tree you mean execution paths, then yes, eventually.  When the back-end 
is finished the hope is to be able to export the AST to Python objects and thus 
have it usable in Python.  You could use the AST representation to display your 
tree.

-Brett

From jhylton at gmail.com  Tue Jan  4 22:54:28 2005
From: jhylton at gmail.com (Jeremy Hylton)
Date: Tue Jan  4 22:54:31 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <41DB0FA4.3070405@ocf.berkeley.edu>
References: <ca471dc2050103174221d7442f@mail.gmail.com>
	<41D9F94C.3020005@ocf.berkeley.edu>
	<20050104021909.GB11833@unpythonic.net>
	<41DB0FA4.3070405@ocf.berkeley.edu>
Message-ID: <e8bf7a5305010413541cd17a02@mail.gmail.com>

Does anyone want to volunteer to integrate the current head to the
branch?  I think that's a pretty important near-term step.

Jeremy

From Jack.Jansen at cwi.nl  Wed Jan  5 00:01:34 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Wed Jan  5 00:01:25 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <ca471dc2050104102814be915b@mail.gmail.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
Message-ID: <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl>


On 4-jan-05, at 19:28, Guido van Rossum wrote:
>  The
> extra type checking on the first argument that unbound methods are
> supposed to provide is not useful in practice (I can't remember that
> it ever caught a bug in my code)

It caught bugs for me a couple of times. If I remember correctly I was 
calling methods of something that was supposed to be a mixin class but 
I forgot to actually list the mixin as a base. But I don't think that's 
a serious enough issue alone to keep the unbound method type.

But I'm more worried about losing the other information in an unbound 
method, specifically im_class. I would guess that info is useful to 
class browsers and such, or are there other ways to get at that?
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

From jack at uitdesloot.nl  Tue Jan  4 23:34:58 2005
From: jack at uitdesloot.nl (Jack Jansen)
Date: Wed Jan  5 00:04:54 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
Message-ID: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>

First question: what is the Python 2.3.5 release schedule and who is 
responsible?

Second question: I thought this info was in a PEP somewhere, but I 
could only find PEPs on major releases; should I have found this info 
somewhere?

And now the question that matters: there's some stuff I'd really like 
to get into 2.3.5, but it involves changes to configure, the Makefile 
and distutils, so because it's fairly extensive I thought I'd ask 
before just committing it.

The problem we're trying to solve is that due to the way Apple's 
framework architecture works newer versions of frameworks are preferred 
(at link time, and sometimes even at runtime) over older ones. That's 
fine for most uses of frameworks, but not when linking a Python 
extension against the Python framework: if you build an extension with 
Python 2.3 to later load it into 2.3 you don't want that framework to 
be linked against 2.4.

Now there's a way around this, from MacOSX 10.3 onwards, and that is 
not to link against the framework at all, but link with "-undefined 
dynamic_lookup". This will link the extension in a way similar to what 
other Unix systems do: any undefined externals are looked up when the 
extension is dynamically loaded. But because this feature only works 
with the dynamic loader from 10.3 or later you must have the 
environment variable MACOSX_DEPLOYMENT_TARGET set to 10.3 or higher 
when you build the extension, otherwise the linker will complain.

We've solved this issue for the trunk and we can solve it for 2.4.1: if 
MACOSX_DEPLOYMENT_TARGET isn't set and we're on 10.3 we force it to 
10.3. Moreover, when it is 10.3 or higher (possibly after being forced) 
we use the dynamic_lookup way of linking extensions. We also record the 
value of MACOSX_DEPLOYMENT_TARGET in the Makefile, and distutils picks 
it up later and sets the environment variable again.
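A rough sketch of that decision logic in Python (the function name and its structure are mine, not the actual configure/distutils code):

```python
def choose_link_args(osx_version, env):
    """Pick extension link flags; env stands in for os.environ."""
    target = env.get("MACOSX_DEPLOYMENT_TARGET")
    if target is None and osx_version >= (10, 3):
        # force the deployment target when unset on 10.3 or later
        target = "10.3"
        env["MACOSX_DEPLOYMENT_TARGET"] = target
    if target is not None and tuple(map(int, target.split("."))) >= (10, 3):
        # 10.3+: don't link against the framework at all
        return ["-undefined", "dynamic_lookup"]
    # pre-10.3: fall back to linking against the Python framework
    return ["-framework", "Python"]

env = {}
print(choose_link_args((10, 3), env))   # ['-undefined', 'dynamic_lookup']
print(env["MACOSX_DEPLOYMENT_TARGET"])  # 10.3
print(choose_link_args((10, 2), {}))    # ['-framework', 'Python']
```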

We even have a hack to fix Apple-installed Python 2.3 in place by 
mucking with lib/config/Makefile, which we can do because 
Apple-installed Python 2.3 will obviously only be run on 10.3. And we 
check whether this hack is needed when you install a later Python 
version on 10.3.

That leaves Python 2.3.5 itself. The best fix here would be to backport 
the 2.4.1 solution: configure.in 1.456 and 1.478, 
distutils/sysconfig.py 1.59 and 1.62, Makefile.pre.in 1.144. Note that 
though the build procedure for extensions will change it doesn't affect 
binary compatibility: both types of extensions are loadable by both 
types of interpreters.

I think this is all safe, and these patches shouldn't affect any system 
other than MacOSX, but I'm a bit reluctant to fiddle with the build 
procedure for a micro-release, so that's why I'm asking.
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

From bob at redivi.com  Wed Jan  5 00:26:27 2005
From: bob at redivi.com (Bob Ippolito)
Date: Wed Jan  5 00:26:49 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl>
Message-ID: <0EB97E30-5EA8-11D9-96B0-000A9567635C@redivi.com>


On Jan 4, 2005, at 6:01 PM, Jack Jansen wrote:

>
> On 4-jan-05, at 19:28, Guido van Rossum wrote:
>>  The
>> extra type checking on the first argument that unbound methods are
>> supposed to provide is not useful in practice (I can't remember that
>> it ever caught a bug in my code)
>
> It caught bugs for me a couple of times. If I remember correctly I was 
> calling methods of something that was supposed to be a mixin class but 
> I forgot to actually list the mixin as a base. But I don't think 
> that's a serious enough issue alone to keep the unbound method type.
>
> But I'm more worried about losing the other information in an unbound 
> method, specifically im_class. I would guess that info is useful to 
> class browsers and such, or are there other ways to get at that?

For a class browser, presumably, you would start at the class and then 
find the methods.  Starting from some class and walking the mro, you 
can inspect the dicts along the way and you'll find everything and know 
where it came from.
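For instance, a browser could recover the defining class by walking the MRO directly, without ever touching im_class (a minimal sketch; the function and class names here are made up):

```python
def methods_by_class(cls):
    # Walk the MRO and record the first class whose __dict__ defines
    # each callable name -- roughly what im_class would have told you.
    origin = {}
    for klass in cls.__mro__:
        for name, value in vars(klass).items():
            if callable(value) and name not in origin:
                origin[name] = klass
    return origin

class A(object):
    def f(self):
        pass

class B(A):
    def g(self):
        pass
```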

-bob

From martin at v.loewis.de  Wed Jan  5 00:54:02 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Jan  5 00:53:56 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
Message-ID: <41DB2C9A.4070800@v.loewis.de>

Jack Jansen wrote:
> First question: what is the Python 2.3.5 release schedule and who is 
> responsible?

Last I heard it is going to be released "in January", and Anthony Baxter
is the release manager.

> Second question: I thought this info was in a PEP somewhere, but I could 
> only find PEPs on major releases, should I have found this info somewhere?

By following python-dev, or in a python-dev summary, e.g.

http://www.python.org/dev/summary/2004-11-01_2004-11-15.html

> The problem we're trying to solve is that due to the way Apple's 
> framework architecture works newer versions of frameworks are preferred 
> (at link time, and sometimes even at runtime) over older ones.

Can you elaborate on that somewhat? According to

http://developer.apple.com/documentation/MacOSX/Conceptual/BPFrameworks/Concepts/VersionInformation.html

there are major and minor versions of frameworks. I would think that
every Python minor (2.x) release should produce a new major framework
version of the Python framework. Then, there would be no problem.

Why does this not work?

> I think this is all safe, and these patches shouldn't affect any system 
> other than MacOSX, but I'm a bit reluctant to fiddle with the build 
> procedure for a micro-release, so that's why I'm asking.

This is ultimately for the release manager to decide. My personal
feeling is that it is ok to fiddle with the build procedure. I'm
more concerned that the approach taken might be "wrong", in the
sense that it uses a stack of hacks and work-arounds for problems
which Apple envisions to be solved differently. That would be bad,
because it might make an implementation of the "proper" solution
more difficult.

Regards,
Martin
From bob at redivi.com  Wed Jan  5 01:08:54 2005
From: bob at redivi.com (Bob Ippolito)
Date: Wed Jan  5 01:09:05 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <41DB2C9A.4070800@v.loewis.de>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
Message-ID: <FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>


On Jan 4, 2005, at 6:54 PM, Martin v. L?wis wrote:

> Jack Jansen wrote:
>> First question: what is the Python 2.3.5 release schedule and who is  
>> responsible?
>
> Last I heard it is going to be released "in January", and Anthony  
> Baxter
> is the release manager.
>
>> Second question: I thought this info was in a PEP somewhere, but I  
>> could only find PEPs on major releases, should I have found this info  
>> somewhere?
>
> By following python-dev, or in a python-dev summary, e.g.
>
> http://www.python.org/dev/summary/2004-11-01_2004-11-15.html
>
>> The problem we're trying to solve is that due to the way Apple's  
>> framework architecture works newer versions of frameworks are  
>> preferred (at link time, and sometimes even at runtime) over older  
>> ones.
>
> Can you elaborate on that somewhat? According to
>
> http://developer.apple.com/documentation/MacOSX/Conceptual/ 
> BPFrameworks/Concepts/VersionInformation.html
>
> there are major and minor versions of frameworks. I would think that
> every Python minor (2.x) release should produce a new major framework
> version of the Python framework. Then, there would be no problem.
>
> Why does this not work?

It doesn't for reasons I care not to explain in depth, again.  Search  
the pythonmac-sig archives for longer explanations.  The gist is that  
you specifically do not want to link directly to the framework at all  
when building extensions.  These patches are required to do that  
correctly.

>> I think this is all safe, and these patches shouldn't affect any  
>> system other than MacOSX, but I'm a bit reluctant to fiddle with the  
>> build procedure for a micro-release, so that's why I'm asking.
>
> This is ultimately for the release manager to decide. My personal
> feeling is that it is ok to fiddle with the build procedure. I'm
> more concerned that the approach taken might be "wrong", in the
> sense that it uses a stack of hacks and work-arounds for problems
> which Apple envisions to be solved differently. That would be bad,
> because it might make an implementation of the "proper" solution
> more difficult.

This is not the wrong way to do it.

-bob

From kbk at shore.net  Wed Jan  5 01:57:28 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed Jan  5 01:58:07 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <ca471dc20501040731142eccf2@mail.gmail.com> (Guido van Rossum's
	message of "Tue, 4 Jan 2005 07:31:30 -0800")
References: <ECBA357DDED63B4995F5C1F5CBE5B1E8FBCBD2@its-xchg4.massey.ac.nz>
	<1104842608.3227.60.camel@presto.wooz.org>
	<e8bf7a530501040525640fb674@mail.gmail.com>
	<ca471dc20501040731142eccf2@mail.gmail.com>
Message-ID: <87hdlwae13.fsf@hydra.bayview.thirdcreek.com>

Guido van Rossum <gvanrossum@gmail.com> writes:

> I'd be fine with keeping this on python-dev too.

Maybe tag the Subject: with [AST] when starting a thread?

-- 
KBK
From jcarlson at uci.edu  Wed Jan  5 02:18:30 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Jan  5 02:27:55 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <1f7befae05010410576effd024@mail.gmail.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
Message-ID: <20050104154707.927B.JCARLSON@uci.edu>


Tim Peters <tim.peters@gmail.com> wrote:
> Guido wrote:
> > Let's get rid of unbound methods. When class C defines a method
[snip]
> Really?  Unbound methods are used most often (IME) to call a
> base-class method from a subclass, like my_base.the_method(self, ...).
>  It's especially easy to forget to write `self, ` there, and the
> exception msg then is quite focused because of that extra bit of type
> checking.  Otherwise I expect we'd see a more-mysterious
> AttributeError or TypeError when the base method got around to trying
> to do something with the bogus `self` passed to it.

Agreed.  While it seems that super() is the 'modern paradigm' for this,
I have been using base.method(self, ...) for years now, and have been
quite happy with it.  After attempting to convert my code to use the
super() paradigm, and having difficulty, I discovered James Knight's
"Python's Super Considered Harmful" (available at
http://www.ai.mit.edu/people/jknight/super-harmful/ ), wherein I
discovered how super really worked (I should have read the documentation
in the first place), and reverted my changes to the base.method version.


> I could live with that, though.

I could live with it too, but I would probably use an equivalent of the
following (with actual type checking):

def mysuper(typ, obj):
    lm = list(obj.__class__.__mro__)
    indx = lm.index(typ)
    if indx == 0:
        return obj
    return super(lm[indx-1], obj)


All in all, I'm -0.  I don't desire to replace all of my base.method
with mysuper(base, obj).method, but if I must sacrifice convenience for
the sake of making Python 2.5's implementation simpler, I guess I'll
deal with it. My familiarity with grep's regular expressions leaves
something to be desired, so I don't know how often base.method(self,...) is
or is not used in the standard library.

 - Josiah

From gvanrossum at gmail.com  Wed Jan  5 03:02:17 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan  5 03:02:20 2005
Subject: [Python-Dev] super() harmful?
In-Reply-To: <20050104154707.927B.JCARLSON@uci.edu>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
Message-ID: <ca471dc205010418022db8c838@mail.gmail.com>

[Josiah]
> Agreed.  While it seems that super() is the 'modern paradigm' for this,
> I have been using base.method(self, ...) for years now, and have been
> quite happy with it.  After attempting to convert my code to use the
> super() paradigm, and having difficulty, I discovered James Knight's
> "Python's Super Considered Harmful" (available at
> http://www.ai.mit.edu/people/jknight/super-harmful/ ), wherein I
> discovered how super really worked (I should have read the documentation
> in the first place), and reverted my changes to the base.method version.

I think that James Y Knight's page misrepresents the issue. Quoting:

"""
Note that the __init__ method is not special -- the same thing happens
with any method, I just use __init__ because it is the method that most
often needs to be overridden in many classes in the hierarchy.
"""

But __init__ *is* special, in that it is okay for a subclass __init__
(or __new__) to have a different signature than the base class
__init__; this is not true for other methods. If you change a regular
method's signature, you would break Liskov substitutability (i.e.,
your subclass instance wouldn't be acceptable where a base class
instance would be acceptable).

Super is intended for classes that are designed with method cooperation in
mind, so I agree with the best practices in James's Conclusion:

"""
    * Use it consistently, and document that you use it,
      as it is part of the external interface for your class, like it or not.
    * Never call super with anything but the exact arguments you received,
      unless you really know what you're doing.
    * When you use it on methods whose acceptable arguments can be
      altered on a subclass via addition of more optional arguments,
      always accept *args, **kw, and call super like
      "super(MyClass, self).currentmethod(alltheargsideclared, *args,
**kwargs)".
      If you don't do this, forbid addition of optional arguments in subclasses.
    * Never use positional arguments in __init__ or __new__.
      Always use keyword args, and always call them as keywords,
      and always pass all keywords on to super.
"""

But that's not the same as calling it harmful. :-(
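As a hypothetical illustration of those guidelines (all class names invented), a cooperative hierarchy might look like:

```python
class Base(object):
    def __init__(self, **kw):
        # Root of the cooperative chain: absorb leftover keywords so any
        # subclass can safely pass its own keywords up via super.
        super(Base, self).__init__()
        self.extra = kw

class ColorMixin(Base):
    def __init__(self, color="red", **kw):
        # Consume our keyword, forward the rest -- never positionally.
        super(ColorMixin, self).__init__(**kw)
        self.color = color

class Widget(ColorMixin):
    def __init__(self, size=1, **kw):
        super(Widget, self).__init__(**kw)
        self.size = size

w = Widget(size=3, color="blue")
```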

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From tim.peters at gmail.com  Wed Jan  5 03:07:31 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Wed Jan  5 03:07:35 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <20050104154707.927B.JCARLSON@uci.edu>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
Message-ID: <1f7befae050104180711743ebd@mail.gmail.com>

[Tim Peters]
>> ...  Unbound methods are used most often (IME) to call a
>> base-class method from a subclass, like
>> my_base.the_method(self, ...).
>>  It's especially easy to forget to write `self, ` there, and the
>> exception msg then is quite focused because of that extra bit of
>> type checking.  Otherwise I expect we'd see a more-mysterious
>> AttributeError or TypeError when the base method got around to
>> trying to do something with the bogus `self` passed to it.

[Josiah Carlson]
> Agreed.

Well, it's not that easy to agree with.  Guido replied that most such
cases would raise an argument-count-mismatch exception instead.  I
expect that's because he stopped working on Zope code, so actually
thinks it's odd again to see a gazillion methods like:

class Registerer(my_base):
    def register(*args, **kws):
        my_base.register(*args, **kws)

I bet he even presumes that if you chase such chains long enough,
you'll eventually find a register() method *somewhere* that actually
uses its arguments <wink>.

> While it seems that super() is the 'modern paradigm' for this,
> I have been using base.method(self, ...) for years now, and have
> been quite happy with it.  After attempting to convert my code to
> use the super() paradigm, and having difficulty, I discovered James
> Knight's "Python's Super Considered Harmful" (available at
> http://www.ai.mit.edu/people/jknight/super-harmful/ ), wherein I
> discovered how super really worked (I should have read the
> documentation in the first place), and reverted my changes to the
> base.method version.

How did super() get into this discussion?  I don't think I've ever
used it myself, but I avoid fancy inheritance graphs in "my own" code,
so can live with anything.

> I could live with it too, but I would probably use an equivalent of the
> following (with actual type checking):
>
> def mysuper(typ, obj):
>    lm = list(obj.__class__.__mro__)
>    indx = lm.index(typ)
>    if indx == 0:
>        return obj
>    return super(lm[indx-1], obj)
>
> All in all, I'm -0.  I don't desire to replace all of my base.method
> with mysuper(base, obj).method, but if I must sacrifice
> convenience for the sake of making Python 2.5's implementation
> simpler, I guess I'll deal with it. My familiarity with grep's regular
> expressions leaves something to be desired, so I don't know how
> often base.method(self,...) is or is not used in the standard library.

I think there may be a misunderstanding here.  Guido isn't proposing
that base.method(self, ...) would stop working -- it would still work
fine.  The result of base.method would still be a callable object:  it
would no longer be of an "unbound method" type (it would just be a
function), and wouldn't do special checking on the first argument
passed to it anymore, but base.method(self, ...) would still invoke
the base class method.  You wouldn't need to rewrite anything (unless
you're doing heavy-magic introspection, picking callables apart).
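A tiny sketch of what stays valid either way (class names invented): the explicit base-class call works whether Base.greet is an unbound method or, under the proposal, a plain function.

```python
class Base(object):
    def greet(self):
        return "hello from %s" % type(self).__name__

class Sub(Base):
    def greet(self):
        # Explicit base-class call; under the proposal Base.greet is
        # just a function, but this call site is unchanged.
        return Base.greet(self) + " via Sub"

s = Sub()
```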
From bob at redivi.com  Wed Jan  5 04:12:59 2005
From: bob at redivi.com (Bob Ippolito)
Date: Wed Jan  5 04:13:11 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <20050104154707.927B.JCARLSON@uci.edu>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
Message-ID: <B41D4282-5EC7-11D9-9DC0-000A9567635C@redivi.com>

On Jan 4, 2005, at 8:18 PM, Josiah Carlson wrote:

>
> Tim Peters <tim.peters@gmail.com> wrote:
>> Guido wrote:
>>> Let's get rid of unbound methods. When class C defines a method
> [snip]
>> Really?  Unbound methods are used most often (IME) to call a
>> base-class method from a subclass, like my_base.the_method(self, ...).
>>  It's especially easy to forget to write `self, ` there, and the
>> exception msg then is quite focused because of that extra bit of type
>> checking.  Otherwise I expect we'd see a more-mysterious
>> AttributeError or TypeError when the base method got around to trying
>> to do something with the bogus `self` passed to it.
>
> Agreed.  While it seems that super() is the 'modern paradigm' for this,
> I have been using base.method(self, ...) for years now, and have been
> quite happy with it.  After attempting to convert my code to use the
> super() paradigm, and having difficulty, I discovered James Knight's
> "Python's Super Considered Harmful" (available at
> http://www.ai.mit.edu/people/jknight/super-harmful/ ), wherein I
> discovered how super really worked (I should have read the documentation
> in the first place), and reverted my changes to the base.method 
> version.

How does removing the difference between unbound methods and 
base.method(self, ...) break anything at all if it was correct code in 
the first place?  As far as I can tell, all it does is remove any 
restriction on what "self" is allowed to be.

On another note -
I don't agree with the "super considered harmful" rant at all.  Yes, 
when you're using __init__ and __new__ of varying signatures in a 
complex class hierarchy, initialization is going to be one hell of a 
problem -- no matter which syntax you use.  All super is doing is 
taking the responsibility of calculating the MRO away from you, and it 
works awfully well for the general case where a method of a given name 
has the same signature and the class hierarchies are not insane.  If 
you have a class hierarchy where this is a problem, it's probably 
pretty fragile to begin with, and you should think about making it 
simpler.

-bob

From barry at python.org  Wed Jan  5 04:42:43 2005
From: barry at python.org (Barry Warsaw)
Date: Wed Jan  5 04:42:47 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl>
Message-ID: <1104896563.16766.19.camel@geddy.wooz.org>

On Tue, 2005-01-04 at 18:01, Jack Jansen wrote:

> But I'm more worried about losing the other information in an unbound 
> method, specifically im_class. I would guess that info is useful to 
> class browsers and such, or are there other ways to get at that?

That would be my worry too.  OTOH, we have function attributes now, so
why couldn't we just stuff the class on the function's im_class
attribute?  Who'd be the wiser?  (Could the same be done for im_self and
im_func for backwards compatibility?)
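A sketch of the idea (the im_class attribute here is set by hand, purely for illustration; nothing in Python does this automatically):

```python
def area(self):
    return self.w * self.h

class Rect(object):
    def __init__(self, w, h):
        self.w, self.h = w, h

Rect.area = area
# Hypothetical: if unbound methods went away, the defining class could
# be recorded as an ordinary function attribute for introspection tools.
area.im_class = Rect
```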

quack-quack-ly y'rs,
-Barry

From jcarlson at uci.edu  Wed Jan  5 07:28:37 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Jan  5 07:37:05 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <1f7befae050104180711743ebd@mail.gmail.com>
References: <20050104154707.927B.JCARLSON@uci.edu>
	<1f7befae050104180711743ebd@mail.gmail.com>
Message-ID: <20050104220744.927E.JCARLSON@uci.edu>


Tim Peters <tim.peters@gmail.com> wrote:
> 
> [Tim Peters]
> >> ...  Unbound methods are used most often (IME) to call a
> >> base-class method from a subclass, like
> >> my_base.the_method(self, ...).
> >>  It's especially easy to forget to write `self, ` there, and the
> >> exception msg then is quite focused because of that extra bit of
> >> type checking.  Otherwise I expect we'd see a more-mysterious
> >> AttributeError or TypeError when the base method got around to
> >> trying to do something with the bogus `self` passed to it.
> 
> [Josiah Carlson]
> > Agreed.
> 
> Well, it's not that easy to agree with.  Guido replied that most such
> cases would raise an argument-count-mismatch exception instead.  I
> expect that's because he stopped working on Zope code, so actually
> thinks it's odd again to see a gazillion methods like:
> 
> class Registerer(my_base):
>     def register(*args, **kws):
>         my_base.register(*args, **kws)
> 
> I bet he even presumes that if you chase such chains long enough,
> you'll eventually find a register() method *somewhere* that actually
> uses its arguments <wink>.

If type checking is important, one can always add it using decorators. 
Then again, I would be willing to wager that most people wouldn't add it
due to laziness, until it bites them for more than a few hours worth of
debugging time.


> > While it seems that super() is the 'modern pradigm' for this,
> > I have been using base.method(self, ...) for years now, and have
> > been quite happy with it.  After attempting to convert my code to
> > use the super() paradigm, and having difficulty, I discovered James
> > Knight's "Python's Super Considered Harmful" (available at
> > http://www.ai.mit.edu/people/jknight/super-harmful/ ), wherein I
> > discovered how super really worked (I should have read the
> > documentation in the first place), and reverted my changes to the
> > base.method version.
> 
> How did super() get into this discussion?  I don't think I've ever
> used it myself, but I avoid fancy inheritance graphs in "my own" code,
> so can live with anything.

It was my misunderstanding of your statement in regards to base.method. 
I had thought that base.method(self, ...) would stop working, and
attempted to discover how one would be able to get the equivalent back,
regardless of the inheritance graph.


> > I could live with it too, but I would probably use an equivalent of the
> > following (with actual type checking):
> >
> > def mysuper(typ, obj):
> >    lm = list(obj.__class__.__mro__)
> >    indx = lm.index(typ)
> >    if indx == 0:
> >        return obj
> >    return super(lm[indx-1], obj)
> >
> > All in all, I'm -0.  I don't desire to replace all of my base.method
> > with mysuper(base, obj).method, but if I must sacrifice
> > convenience for the sake of making Python 2.5's implementation
> > simpler, I guess I'll deal with it. My familiarity with grep's regular
> > expressions leaves something to be desired, so I don't know how
> > often base.method(self,...) is or is not used in the standard library.
> 
> I think there may be a misunderstanding here.  Guido isn't proposing
> that base.method(self, ...) would stop working -- it would still work
> fine.  The result of base.method would still be a callable object:  it
> would no longer be of an "unbound method" type (it would just be a
> function), and wouldn't do special checking on the first argument
> passed to it anymore, but base.method(self, ...) would still invoke
> the base class method.  You wouldn't need to rewrite anything (unless
> you're doing heavy-magic introspection, picking callables apart).

Indeed, there was a misunderstanding on my part.  I misunderstood your
discussion of base.method(self, ...) to mean that such things would stop
working.  My apologies.


 - Josiah

From andrewm at object-craft.com.au  Wed Jan  5 08:06:43 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Wed Jan  5 08:06:41 2005
Subject: [Python-Dev] csv module TODO list
Message-ID: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>

There's a bunch of jobs we (CSV module maintainers) have been putting
off - attached is a list (in no particular order): 

* unicode support (this will probably uglify the code considerably).

* 8 bit transparency (specifically, allow \0 characters in source string
  and as delimiters, etc).

* Reader and universal newlines don't interact well, reader doesn't
  honour Dialect's lineterminator setting. All outstanding bug IDs
  (789519, 944890, 967934 and 1072404) are related to this - it's 
  a difficult problem and further discussion is needed.

* compare PEP-305 and library reference manual to the module as implemented
  and either document the differences or correct them.

* Address or document Francis Avila's issues as mentioned in this posting:

    http://www.google.com.au/groups?selm=vsb89q1d3n5qb1%40corp.supernews.com

* Several blogs complain that the CSV module is no good for parsing
  strings. Suggest making it clearer in the documentation that the reader
  accepts an iterable, rather than a file, and document why an iterable
  (as opposed to a string) is necessary (multi-line records with embedded
  newlines). We could also provide an interface that parses a single
  string (or the old Object Craft interface) for those that really feel
  the need. See:

    http://radio.weblogs.com/0124960/2003/09/12.html
    http://zephyrfalcon.org/weblog/arch_d7_2003_09_06.html#e335

* Compatibility API for old Object Craft CSV module?

    http://mechanicalcat.net/cgi-bin/log/2003/08/18

  For example: "from csv.legacy import reader" or something.

* Pure python implementation? 

* Some CSV-like formats consider a quoted field a string, and an unquoted
  field a number - consider supporting this in the Reader and Writer. See:

    http://radio.weblogs.com/0124960/2004/04/23.html

* Add line number and record number counters to reader object?

* it's possible to get the csv parser to suck the whole source file
  into memory with an unmatched quote character. Need to limit size of
  internal buffer.
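On the "parsing strings" point above, the existing reader already handles it, since any iterable of lines will do:

```python
import csv

# csv.reader() accepts any iterable of lines, not just a file object,
# so an in-memory string parses fine when wrapped in a list (multi-line
# records with embedded newlines are why a bare string isn't accepted).
rows = list(csv.reader(['1,"x,y",3']))
```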

Also, review comments from Neal Norwitz, 22 Mar 2003 (some of these should
already have been addressed):

* remove TODO comment at top of file--it's empty
* is CSV going to be maintained outside the python tree?
  If not, remove the 2.2 compatibility macros for:
         PyDoc_STR, PyDoc_STRVAR, PyMODINIT_FUNC, etc.
* inline the following functions since they are used only in one place
        get_string, set_string, get_nullchar_as_None, set_nullchar_as_None,
        join_reset (maybe)
* rather than use PyErr_BadArgument, should you use assert?
        (first example, Dialect_set_quoting, line 218)
* is it necessary to have Dialect_methods, can you use 0 for tp_methods?
* remove commented out code (PyMem_DEL) on line 261
        Have you used valgrind on the test to find memory overwrites/leaks?
* PyString_AsString()[0] on line 331 could return NULL in which case
        you are dereferencing a NULL pointer
* not sure why there are casts on 0 pointers
        lines 383-393, 733-743, 1144-1154, 1164-1165
* Reader_getiter() can be removed and use PyObject_SelfIter()
* I think you need PyErr_NoMemory() before returning on line 768, 1178
* is PyString_AsString(self->dialect->lineterminator) on line 994
        guaranteed not to return NULL?  If not, it could crash by
        passing to memmove.
* PyString_AsString() can return NULL on line 1048 and 1063, 
        the result is passed to join_append()
* iteratable should be iterable?  (line 1088)
* why doesn't csv_writerows() have a docstring?  csv_writerow does
* any PyUnicode_* methods should be protected with #ifdef Py_USING_UNICODE
* csv_unregister_dialect, csv_get_dialect could use METH_O 
        so you don't need to use PyArg_ParseTuple
* in init_csv, recommend using 
        PyModule_AddIntConstant and PyModule_AddStringConstant
        where appropriate

Also, review comments from Jeremy Hylton, 10 Apr 2003:

    I've been reviewing extension modules looking for C types that should
    participate in garbage collection.  I think the csv ReaderObj and
    WriterObj should participate.  The ReaderObj it contains a reference to
    input_iter that could be an arbitrary Python object.  The iterator
    object could well participate in a cycle that refers to the ReaderObj.
    The WriterObj has a reference to a writeline callable, which could well
    be a method of an object that also points to the WriterObj.

    The Dialect object appears to be safe, because the only PyObject * it
    refers should be a string.  Safe until someone creates an insane string
    subclass <0.4 wink>.

    Also, an unrelated comment about the code, the lineterminator of the
    Dialect is managed by a collection of little helper functions like
    get_string, set_string, etc.  This code appears to be excessively
    general; since they're called only once, it seems clearer to inline the
    logic directly in the get/set methods for the lineterminator.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From skip at pobox.com  Wed Jan  5 08:33:04 2005
From: skip at pobox.com (Skip Montanaro)
Date: Wed Jan  5 08:33:17 2005
Subject: [Python-Dev] Re: [Csv] csv module TODO list
In-Reply-To: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
Message-ID: <16859.38960.9935.682429@montanaro.dyndns.org>


    Andrew> There's a bunch of jobs we (CSV module maintainers) have been
    Andrew> putting off - attached is a list (in no particular order):

    ...

In addition, it occurred to me this evening that there's functionality in
the csv module I don't think anybody uses.  For example, you can register
CSV dialects by name, then pass in the string name instead of the dialect
class.  I'd be in favor of scrapping list_dialects, register_dialect and
unregister_dialect altogether.  While they are probably trivial little
functions I don't think they add much if anything to the implementation and
just complicate the _csv extension module slightly.  I'm also not aware that
anyone really uses the Sniffer class, though it does provide some useful
functionality should you need to analyze random CSV files.
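For context, the registry under discussion lets callers pass a string name in place of a dialect class (dialect values below chosen arbitrarily):

```python
import csv

class TabDialect(csv.Dialect):
    # All attributes must be supplied when subclassing csv.Dialect.
    delimiter = "\t"
    quotechar = '"'
    doublequote = True
    skipinitialspace = False
    lineterminator = "\r\n"
    quoting = csv.QUOTE_MINIMAL

csv.register_dialect("tabs", TabDialect)
rows = list(csv.reader(["1\t2\t3"], dialect="tabs"))
csv.unregister_dialect("tabs")
```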

Skip
From olsongt at verizon.net  Wed Jan  5 08:22:32 2005
From: olsongt at verizon.net (olsongt@verizon.net)
Date: Wed Jan  5 08:34:09 2005
Subject: [Python-Dev] Will ASTbranch compile on windows yet?
Message-ID: <20050105072232.VHEV24088.out009.verizon.net@outgoing.verizon.net>

[TIM]
> 
> I don't have time to join the current crusade.  If there's pent-up
> interest among Windows users, it would be good to say which
> compiler(s) you can use, since I expect not everyone can deal with VC
> 7.1 (e.g., I think Raymond Hettinger is limited to VC 6; and you said
> you worked up a VC 6 patch, but didn't say whether you could use 7.1
> now).
> 

I've attached an updated patch that gets things working against current cvs.  This also includes some fixes for typos that appear to have slipped through gcc and may have caused obscure bugs on *nix as well.

I'll gladly fix the MSVC 7.1 project files after someone with commit privileges merges changes from HEAD as Jeremy requested.

Any windows users building based on this patch would also need to run the 'asdl_c.py' utility manually right now before compiling.  Something like:

    C:\Src\ast-branch\dist\src\Parser>asdl_c.py -h ..\Include -c ..\Python Python.asdl

I'll get a proper fix in for MSVC 7.1, but don't feel like dealing with it for the obsolete 6.0 project files.

-Grant

From andrewm at object-craft.com.au  Wed Jan  5 08:55:06 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Wed Jan  5 08:55:01 2005
Subject: [Python-Dev] Re: [Csv] csv module TODO list 
In-Reply-To: <16859.38960.9935.682429@montanaro.dyndns.org> 
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<16859.38960.9935.682429@montanaro.dyndns.org>
Message-ID: <20050105075506.314C93C8E5@coffee.object-craft.com.au>

>    Andrew> There's a bunch of jobs we (CSV module maintainers) have been
>    Andrew> putting off - attached is a list (in no particular order):
>    ...
>
>In addition, it occurred to me this evening that there's functionality in
>the csv module I don't think anybody uses.  

It's very difficult to say for sure that nobody is using it once it's
released to the world.

>For example, you can register CSV dialects by name, then pass in the
>string name instead of the dialect class.  I'd be in favor of scrapping
>list_dialects, register_dialect and unregister_dialect altogether.  While
>they are probably trivial little functions I don't think they add much if
>anything to the implementation and just complicate the _csv extension
>module slightly.  

Yes, in hindsight, they're not really necessary, although I'm sure we
had some motivation for them initially. That said, they're there now,
and they shouldn't require much maintenance.
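[Editor's note: for readers unfamiliar with the feature under discussion, here is a minimal sketch of the named-dialect API (the 'pipes' dialect name is invented for illustration):]

```python
import csv
import io

# Register a dialect under a name, then refer to it by that name
# instead of passing a Dialect class.
csv.register_dialect('pipes', delimiter='|')

rows = list(csv.reader(io.StringIO('a|b|c\r\n'), dialect='pipes'))
assert rows == [['a', 'b', 'c']]

# The helper functions proposed for scrapping:
assert 'pipes' in csv.list_dialects()
csv.unregister_dialect('pipes')
assert 'pipes' not in csv.list_dialects()
```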

>I'm also not aware that anyone really uses the Sniffer class, though it
>does provide some useful functionality should you need to analyze random
>CSV files.

The comment I get repeatedly is that they don't use it because it's
"too magic/scary". That's as it should be. But if it didn't exist,
then someone would be requesting we add it... 8-)
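[Editor's note: as a concrete illustration (sample data invented), the Sniffer guesses a dialect from a sample of the file:]

```python
import csv

sample = "name;age;city\r\nGuido;49;Amsterdam\r\nTim;50;Minneapolis\r\n"

sniffer = csv.Sniffer()
dialect = sniffer.sniff(sample)   # guesses delimiter, quoting, etc.
assert dialect.delimiter == ';'

# It can also take a guess at whether the first row is a header.
assert isinstance(sniffer.has_header(sample), bool)
```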

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From martin at v.loewis.de  Wed Jan  5 09:33:13 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Jan  5 09:33:09 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
Message-ID: <41DBA649.3080008@v.loewis.de>

Bob Ippolito wrote:
> It doesn't for reasons I care not to explain in depth, again.  Search  
> the pythonmac-sig archives for longer explanations.  The gist is that  
> you specifically do not want to link directly to the framework at all  
> when building extensions.

Because an Apple-built extension then may pick up a user-installed
Python? Why can this problem not be solved by adding -F options,
as Jack Jansen proposed?

> This is not the wrong way to do it.

I'm not convinced.

Regards,
Martin
From martin at v.loewis.de  Wed Jan  5 09:39:44 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Jan  5 09:39:37 2005
Subject: [Python-Dev] csv module TODO list
In-Reply-To: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
Message-ID: <41DBA7D0.80101@v.loewis.de>

Andrew McNamara wrote:
> There's a bunch of jobs we (CSV module maintainers) have been putting
> off - attached is a list (in no particular order): 
> 
> * unicode support (this will probably uglify the code considerably).

Can you please elaborate on that? What needs to be done, and how is
that going to be done? It might be possible to avoid considerable
uglification.

Regards,
Martin
From mal at egenix.com  Wed Jan  5 10:10:30 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed Jan  5 10:10:33 2005
Subject: [Python-Dev] csv module TODO list
In-Reply-To: <41DBA7D0.80101@v.loewis.de>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<41DBA7D0.80101@v.loewis.de>
Message-ID: <41DBAF06.6020401@egenix.com>

Martin v. Löwis wrote:
> Andrew McNamara wrote:
> 
>> There's a bunch of jobs we (CSV module maintainers) have been putting
>> off - attached is a list (in no particular order):
>> * unicode support (this will probably uglify the code considerably).
> 
> 
> Can you please elaborate on that? What needs to be done, and how is
> that going to be done? It might be possible to avoid considerable
> uglification.

Indeed. The trick is to convert to Unicode early and to use Unicode
literals instead of string literals in the code.

Note that the only real-life Unicode format in use is UTF-16
(with BOM mark) written by Excel. Note that there's no standard
for specifying the encoding in CSV files, so this is also the only
feasible format.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 05 2005)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From ronaldoussoren at mac.com  Wed Jan  5 10:19:09 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Wed Jan  5 10:19:12 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <41DBA649.3080008@v.loewis.de>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<41DBA649.3080008@v.loewis.de>
Message-ID: <DAD9DD96-5EFA-11D9-85EE-000D93AD379E@mac.com>


On 5-jan-05, at 9:33, Martin v. Löwis wrote:

> Bob Ippolito wrote:
>> It doesn't for reasons I care not to explain in depth, again.  Search 
>>  the pythonmac-sig archives for longer explanations.  The gist is 
>> that  you specifically do not want to link directly to the framework 
>> at all  when building extensions.
>
> Because an Apple-built extension then may pick up a user-installed
> Python? Why can this problem not be solved by adding -F options,
> as Jack Jansen proposed?

It gets worse when you have a user-installed python 2.3 and a 
user-installed python 2.4. Those will both be installed as 
/Library/Frameworks/Python.framework. This means that you cannot use 
the -F flag to select which one you want to link to: '-framework 
Python' will only link against whichever Python was installed most 
recently.

This is an issue on Mac OS X 10.2.

Ronald
From andrewm at object-craft.com.au  Wed Jan  5 10:34:14 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Wed Jan  5 10:34:11 2005
Subject: [Python-Dev] csv module TODO list 
In-Reply-To: <41DBAF06.6020401@egenix.com> 
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com>
Message-ID: <20050105093414.00DFF3C8E5@coffee.object-craft.com.au>

>> Andrew McNamara wrote:
>>> There's a bunch of jobs we (CSV module maintainers) have been putting
>>> off - attached is a list (in no particular order):
>>> * unicode support (this will probably uglify the code considerably).
>> 
>Martin v. Löwis wrote:
>> Can you please elaborate on that? What needs to be done, and how is
>> that going to be done? It might be possible to avoid considerable
>> uglification.

I'm not altogether sure there. The parsing state machine is all written in
C, and deals with signed chars - I expect we'll need two versions of that
(or one version that's compiled twice using pre-processor macros). Quite
a large job. Suggestions gratefully received.

M.-A. Lemburg wrote:
>Indeed. The trick is to convert to Unicode early and to use Unicode
>literals instead of string literals in the code.

Yes, although it would be nice to also retain the 8-bit versions as well.

>Note that the only real-life Unicode format in use is UTF-16
>(with BOM mark) written by Excel. Note that there's no standard
>for specifying the encoding in CSV files, so this is also the only
>feasible format.

Yes - that's part of the problem I hadn't really thought about yet - the
csv module currently interacts directly with files as iterators, but it's 
clear that we'll need to decode as we go.
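[Editor's note: the "decode as we go" idea can be sketched as a thin wrapper (shown here with modern io/text APIs for brevity; a 2005-era version would use codecs.getreader instead):]

```python
import csv
import io

def unicode_csv_reader(byte_stream, encoding='utf-16', **kwargs):
    """Lazily decode an encoded byte stream and feed it to csv.reader.

    The wrapper decodes incrementally, so a multi-gigabyte file is
    never decoded in one call in memory.
    """
    text_stream = io.TextIOWrapper(byte_stream, encoding=encoding,
                                   newline='')
    return csv.reader(text_stream, **kwargs)

# A small UTF-16 payload with a BOM, as Excel writes it.
payload = 'a,b\r\n1,2\r\n'.encode('utf-16')  # codec prepends the BOM
rows = list(unicode_csv_reader(io.BytesIO(payload)))
assert rows == [['a', 'b'], ['1', '2']]
```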

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From mal at egenix.com  Wed Jan  5 10:44:40 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed Jan  5 10:44:43 2005
Subject: [Python-Dev] csv module TODO list
In-Reply-To: <20050105093414.00DFF3C8E5@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>	<41DBA7D0.80101@v.loewis.de>
	<41DBAF06.6020401@egenix.com>
	<20050105093414.00DFF3C8E5@coffee.object-craft.com.au>
Message-ID: <41DBB708.5030501@egenix.com>

Andrew McNamara wrote:
>>>Andrew McNamara wrote:
>>>
>>>>There's a bunch of jobs we (CSV module maintainers) have been putting
>>>>off - attached is a list (in no particular order):
>>>>* unicode support (this will probably uglify the code considerably).
>>>
>>Martin v. Löwis wrote:
>>
>>>Can you please elaborate on that? What needs to be done, and how is
>>>that going to be done? It might be possible to avoid considerable
>>>uglification.
> 
> 
> I'm not altogether sure there. The parsing state machine is all written in
> C, and deals with signed chars - I expect we'll need two versions of that
> (or one version that's compiled twice using pre-processor macros). Quite
> a large job. Suggestions gratefully received.
> 
> M.-A. Lemburg wrote:
> 
>>Indeed. The trick is to convert to Unicode early and to use Unicode
>>literals instead of string literals in the code.
> 
> 
> Yes, although it would be nice to also retain the 8-bit versions as well.

You can do so by using latin-1 as default encoding. Works great !

>>Note that the only real-life Unicode format in use is UTF-16
>>(with BOM mark) written by Excel. Note that there's no standard
>>for specifying the encoding in CSV files, so this is also the only
>>feasible format.
> 
> Yes - that's part of the problem I hadn't really thought about yet - the
> csv module currently interacts directly with files as iterators, but it's 
> clear that we'll need to decode as we go.

Depends on your needs: CSV files tend to be small enough
to do the decoding in one call in memory.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 05 2005)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From andrewm at object-craft.com.au  Wed Jan  5 11:03:25 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Wed Jan  5 11:03:20 2005
Subject: [Python-Dev] csv module TODO list 
In-Reply-To: <41DBB708.5030501@egenix.com> 
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com>
	<20050105093414.00DFF3C8E5@coffee.object-craft.com.au>
	<41DBB708.5030501@egenix.com>
Message-ID: <20050105100325.A220D3C8E5@coffee.object-craft.com.au>

>> Yes, although it would be nice to also retain the 8-bit versions as well.
>
>You can do so by using latin-1 as default encoding. Works great !

Yep, although that means we wear the cost of decoding and encoding for
all 8 bit input.

What does the _sre.c code do?

>Depends on your needs: CSV files tend to be small enough
>to do the decoding in one call in memory.

We are routinely dealing with multi-gigabyte csv files - which is why the
original 2001 vintage csv module was written as a C state machine. 

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From mal at egenix.com  Wed Jan  5 11:16:50 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed Jan  5 11:16:54 2005
Subject: [Python-Dev] csv module TODO list
In-Reply-To: <20050105100325.A220D3C8E5@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>	<41DBA7D0.80101@v.loewis.de>
	<41DBAF06.6020401@egenix.com>	<20050105093414.00DFF3C8E5@coffee.object-craft.com.au>	<41DBB708.5030501@egenix.com>
	<20050105100325.A220D3C8E5@coffee.object-craft.com.au>
Message-ID: <41DBBE92.4070106@egenix.com>

Andrew McNamara wrote:
>>>Yes, although it would be nice to also retain the 8-bit versions as well.
>>
>>You can do so by using latin-1 as default encoding. Works great !
> 
> Yep, although that means we wear the cost of decoding and encoding for
> all 8 bit input.

Right, but it makes the code very clean and straightforward.
Again, it depends on what you need. If performance is critical
then you probably need a C version written using the same trick
as _sre.c...

> What does the _sre.c code do?

It comes in two versions: one for 8-bit the other for Unicode.

>>Depends on your needs: CSV files tend to be small enough
>>to do the decoding in one call in memory.
> 
> We are routinely dealing with multi-gigabyte csv files - which is why the
> original 2001 vintage csv module was written as a C state machine. 

I see, but are you sure that the typical Python user will have
the same requirements to make it worth the effort (and
complexity) ?

I've written a few CSV parsers and writers myself over the years
and the requirements were different every time, in terms
of being flexible in the parsing phase, the interfaces and
the performance needs. Haven't yet found a one fits all
solution and don't really expect to any more :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 05 2005)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From andrewm at object-craft.com.au  Wed Jan  5 11:33:05 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Wed Jan  5 11:33:00 2005
Subject: [Python-Dev] csv module TODO list 
In-Reply-To: <41DBBE92.4070106@egenix.com> 
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com>
	<20050105093414.00DFF3C8E5@coffee.object-craft.com.au>
	<41DBB708.5030501@egenix.com>
	<20050105100325.A220D3C8E5@coffee.object-craft.com.au>
	<41DBBE92.4070106@egenix.com>
Message-ID: <20050105103305.AD80B3C8E5@coffee.object-craft.com.au>

>> Yep, although that means we wear the cost of decoding and encoding for
>> all 8 bit input.
>
>Right, but it makes the code very clean and straightforward.

I agree it makes for a very clean solution, and 99% of the time I'd
choose that option.

>Again, it depends on what you need. If performance is critical
>then you probably need a C version written using the same trick
>as _sre.c...
>
>> What does the _sre.c code do?
>
>It comes in two versions: one for 8-bit the other for Unicode.

That's what I thought. I think the motivations here are similar to those
that drove the _sre developers.

>> We are routinely dealing with multi-gigabyte csv files - which is why the
>> original 2001 vintage csv module was written as a C state machine. 
>
>I see, but are you sure that the typical Python user will have
>the same requirements to make it worth the effort (and
>complexity) ?

This is open source, so I scratch my own itch (and that of my employers) - 
we need fast csv parsing more than we need unicode... 8-)

Okay, assuming we go the "produce two versions via evil macro tricks"
path, it's still not quite the same situation as _sre.c, which only has
to deal with the internal unicode representation.

One way to approach this would be to add an "encoding" keyword argument
to the readers and writers. If given, the parser would decode the input
stream to the internal representation before passing it through the
unicode state machine, which would yield tuples of unicode objects.

That leaves us with a bit of a problem where the source is already unicode
(eg, a list of unicode strings)... hmm.
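[Editor's note: a rough sketch of that proposed interface (hypothetical; nothing like this exists in the module):]

```python
import csv

def reader(source, encoding=None, **kwargs):
    """If `encoding` is given, decode each line of input before it
    reaches the parser; otherwise assume the source already yields
    unicode (e.g. a list of unicode strings) and pass it through."""
    if encoding is not None:
        source = (line.decode(encoding) for line in source)
    return csv.reader(source, **kwargs)

# Byte input, decoded via the hypothetical keyword...
assert list(reader([b'x,y\r\n'], encoding='utf-8')) == [['x', 'y']]
# ...and input that is already unicode, passed through untouched.
assert list(reader(['x,y\r\n'])) == [['x', 'y']]
```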

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From andrewm at object-craft.com.au  Wed Jan  5 12:08:49 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Wed Jan  5 12:08:43 2005
Subject: [Python-Dev] Re: [Csv] csv module TODO list 
In-Reply-To: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> 
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
Message-ID: <20050105110849.CBA843C8E5@coffee.object-craft.com.au>

>Also, review comments from Neal Norwitz, 22 Mar 2003 (some of these should
>already have been addressed):

I should apologise to Neal here for not replying to him at the time.

Okay, going through the issues Neal raised...

>* remove TODO comment at top of file--it's empty

Was fixed.

>* is CSV going to be maintained outside the python tree?
>  If not, remove the 2.2 compatibility macros for:
>         PyDoc_STR, PyDoc_STRVAR, PyMODINIT_FUNC, etc.

Does anyone think we should continue to maintain this 2.2 compatibility?

>* inline the following functions since they are used only in one place
>        get_string, set_string, get_nullchar_as_None, set_nullchar_as_None,
>        join_reset (maybe)

It was done that way as I felt we would be adding more getters and
setters to the dialect object in future.

>* rather than use PyErr_BadArgument, should you use assert?
>        (first example, Dialect_set_quoting, line 218)

You mean C assert()? I don't think I'm really following you here -
where would the type of the object be checked in a way the user could
recover from?

>* is it necessary to have Dialect_methods, can you use 0 for tp_methods?

I was assuming I would need to add methods at some point (in fact, I did
have methods, but removed them).

>* remove commented out code (PyMem_DEL) on line 261
>        Have you used valgrind on the test to find memory overwrites/leaks?

No, valgrind wasn't used.

>* PyString_AsString()[0] on line 331 could return NULL in which case
>        you are dereferencing a NULL pointer

Was fixed.

>* not sure why there are casts on 0 pointers
>        lines 383-393, 733-743, 1144-1154, 1164-1165

To make it easier when the time comes to add one of those members.

>* Reader_getiter() can be removed and use PyObject_SelfIter()

Okay, wasn't aware of PyObject_SelfIter - will fix.

>* I think you need PyErr_NoMemory() before returning on line 768, 1178

The examples I looked at in the Python core didn't do this - are you sure?
(now lines 832 and 1280). 

>* is PyString_AsString(self->dialect->lineterminator) on line 994
>        guaranteed not to return NULL?  If not, it could crash by
>        passing to memmove.
>* PyString_AsString() can return NULL on line 1048 and 1063, 
>        the result is passed to join_append()

Looking at the PyString_AsString implementation, it looks safe (we ensure
it's really a string elsewhere)?

>* iteratable should be iterable?  (line 1088)

Sorry, I don't know what you're getting at here? (now line 1162).

>* why doesn't csv_writerows() have a docstring?  csv_writerow does

Was fixed.

>* any PyUnicode_* methods should be protected with #ifdef Py_USING_UNICODE

Was fixed.

>* csv_unregister_dialect, csv_get_dialect could use METH_O 
>        so you don't need to use PyArg_ParseTuple

Was fixed.

>* in init_csv, recommend using 
>        PyModule_AddIntConstant and PyModule_AddStringConstant
>        where appropriate

Was fixed.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From aleax at aleax.it  Wed Jan  5 12:11:37 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan  5 12:11:42 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <1104896563.16766.19.camel@geddy.wooz.org>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl>
	<1104896563.16766.19.camel@geddy.wooz.org>
Message-ID: <91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 05, at 04:42, Barry Warsaw wrote:

> On Tue, 2005-01-04 at 18:01, Jack Jansen wrote:
>
>> But I'm more worried about losing the other information in an unbound
>> method, specifically im_class. I would guess that info is useful to
>> class browsers and such, or are there other ways to get at that?
>
> That would be my worry too.  OTOH, we have function attributes now, so
> why couldn't we just stuff the class on the function's im_class
> attribute?  Who'd be the wiser?  (Could the same be done for im_self 
> and
> im_func for backwards compatibility?)

Hmmm, seems to me we'd need copies of the function object for this 
purpose:

def f(*a): pass
class C(object): pass
class D(object): pass
C.f = D.f = f

If now we want C.f.im_class to differ from D.f.im_class then we need f 
to get copied implicitly when it's assigned to C.f (or, of course, when 
C.f is accessed... but THAT might be substantial overhead).  OK, I 
guess, as long as we don't expect any further attribute setting on f to 
affect C.f or D.f (and I don't know of any real use case where that 
would be needed).
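[Editor's note: for what it's worth, Alex's example behaves unambiguously once unbound methods are removed, as Guido proposes (and as eventually happened in Python 3): class-level access simply returns the shared function object, so there is no per-class im_class to copy at all:]

```python
def f(*a): pass

class C(object): pass
class D(object): pass
C.f = D.f = f

# With unbound methods gone, C.f and D.f are the very same function
# object, so there is no per-class attribute to disagree about.
assert C.f is D.f is f
```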


Alex

From aleax at aleax.it  Wed Jan  5 12:28:39 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan  5 12:28:45 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <443EF94C-5EA5-11D9-BB20-000D934FF6B4@cwi.nl>
References: <443EF94C-5EA5-11D9-BB20-000D934FF6B4@cwi.nl>
Message-ID: <F21B7352-5F0C-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 05, at 00:06, Jack Jansen wrote:
    ...
> We've solved this issue for the trunk and we can solve it for 2.4.1: 
> if MACOSX_DEPLOYMENT_TARGET isn't set and we're on 10.3 we force it to 
> 10.3. Moreover, when it is 10.3 or higher (possibly after being 
> forced) we use the dynamic_lookup way of linking

Not having followed Python/Mac developments closely (my fault, sigh), I 
would like to understand what this would imply for the forthcoming 10.4 
("Tiger") release of MacOS -- and that in turn depends, I assume, on 
what Python release will come with it.  Anybody who's under 
nondisclosure should of course keep mum, but can somebody help e.g. by 
telling me what Python is included in the current "development 
previews" versions of Tiger?  I'm not gonna spend $500 to become a 
highly-ranked enough "apple developer" to get those previews.  
Considering Apple's habitual timings, I'm sort of resigned to us being 
stuck with 2.3 for Tiger, but I would at least hope they'd get as late 
a 2.3.* as they can.  So, assuming Tiger's Python is going to be, say, 
2.3.4 or 2.3.5, would the change you're proposing make it MORE 
attractive to Apple to go for 2.3.5, LESS so, or is it indifferent from 
their POV...?

Thanks in advance for any help in getting the tradeoffs about this 
clearer in my mind!


Alex

From ronaldoussoren at mac.com  Wed Jan  5 12:40:07 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Wed Jan  5 12:40:13 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <F21B7352-5F0C-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <443EF94C-5EA5-11D9-BB20-000D934FF6B4@cwi.nl>
	<F21B7352-5F0C-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <8C4DEDD2-5F0E-11D9-85EE-000D93AD379E@mac.com>


On 5-jan-05, at 12:28, Alex Martelli wrote:

>
> On 2005 Jan 05, at 00:06, Jack Jansen wrote:
>    ...
>> We've solved this issue for the trunk and we can solve it for 2.4.1: 
>> if MACOSX_DEPLOYMENT_TARGET isn't set and we're on 10.3 we force it 
>> to 10.3. Moreover, when it is 10.3 or higher (possibly after being 
>> forced) we use the dynamic_lookup way of linking
>
> Not having followed Python/Mac developments closely (my fault, sigh), 
> I would like to understand what this would imply for the forthcoming 
> 10.4 ("Tiger") release of MacOS -- and that in turn depends, I assume, 
> on what Python release will come with it.  Anybody who's under 
> nondisclosure should of course keep mum, but can somebody help e.g. by 
> telling me what Python is included in the current "development 
> previews" versions of Tiger?

The Tiger that was released at WWDC included a patched version of 
Python 2.3.3. See: 
http://www.opensource.apple.com/darwinsource/WWDC2004/.

Ronald

From aleax at aleax.it  Wed Jan  5 13:02:53 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan  5 13:02:59 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <41DB069A.8030406@ocf.berkeley.edu>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E8FBCBD2@its-xchg4.massey.ac.nz>	<1104842608.3227.60.camel@presto.wooz.org>	<e8bf7a530501040525640fb674@mail.gmail.com>	<ca471dc20501040731142eccf2@mail.gmail.com>	<e8bf7a5305010408177786b70@mail.gmail.com>
	<45D8D02A-5E81-11D9-ADA4-000A95EFAE9E@aleax.it>
	<41DB069A.8030406@ocf.berkeley.edu>
Message-ID: <BAEE6219-5F11-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 04, at 22:11, Brett C. wrote:
    ...
>> Speaking for myself, I have a burning interest in the AST branch  
>> (though I can't seem to get it correctly downloaded so far, I guess  
>> it's just my usual CVS-clumsiness and I'll soon find out what I'm  
>> doing wrong & fix it)
>
> See  
> http://www.python.org/dev/devfaq.html#how-can-i-check-out-a-tagged- 
> branch on how to do a checkout of a tagged branch.

Done!  Believe it or not, I _had_ already tried following those very  
instructions -- and I kept omitting the word 'python' at the end and/or  
misspelling the tag as ast_branch (while it wants a dash, NOT an  
underscore...).  I guess having the instructions recommended to me  
again prompted me to double-check character by character extra carefully,  
so, thanks!-)


Alex

From andrewm at object-craft.com.au  Wed Jan  5 13:29:11 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Wed Jan  5 13:29:05 2005
Subject: [Python-Dev] Re: csv module TODO list 
In-Reply-To: <20050105121921.GB24030@idi.ntnu.no> 
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<16859.38960.9935.682429@montanaro.dyndns.org>
	<20050105075506.314C93C8E5@coffee.object-craft.com.au>
	<20050105121921.GB24030@idi.ntnu.no>
Message-ID: <20050105122911.83EE93C8E5@coffee.object-craft.com.au>

>Quite a while ago I posted some material to the csv-list about
>problems using the csv module on Unix-style colon-separated files --
>it just doesn't deal properly with backslash escaping and is quite
>useless for this kind of file. I seem to recall the general view was
>that it wasn't intended for this kind of thing -- only the sort of csv
>that Microsoft Excel outputs/inputs, but if I am mistaken about this,
>perhaps fixing this issue might be put on the TODO-list? I'll be happy
>to re-send or summarize the relevant emails, if needed.

I think a related issue was included in my TODO list:

>* Address or document Francis Avila's issues as mentioned in this posting:
>
>    http://www.google.com.au/groups?selm=vsb89q1d3n5qb1%40corp.supernews.com

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From aleax at aleax.it  Wed Jan  5 13:37:50 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan  5 13:37:56 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <8C4DEDD2-5F0E-11D9-85EE-000D93AD379E@mac.com>
References: <443EF94C-5EA5-11D9-BB20-000D934FF6B4@cwi.nl>
	<F21B7352-5F0C-11D9-ADA4-000A95EFAE9E@aleax.it>
	<8C4DEDD2-5F0E-11D9-85EE-000D93AD379E@mac.com>
Message-ID: <9CD489F6-5F16-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 05, at 12:40, Ronald Oussoren wrote:
    ...
> The Tiger that was released at WWDC included a patched version of 
> Python 2.3.3. See: 
> http://www.opensource.apple.com/darwinsource/WWDC2004/.

Thanks!  So, since WWDC was on June 28 and 2.3.4 had been released on 
May 27, we get some first sense of the speed or lack thereof of 2.3.x 
releases' entrance in Tiger's previews...


Alex

From mwh at python.net  Wed Jan  5 14:49:05 2005
From: mwh at python.net (Michael Hudson)
Date: Wed Jan  5 14:49:07 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <41DBA649.3080008@v.loewis.de> (
	=?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Wed,
	05 Jan 2005 09:33:13 +0100")
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<41DBA649.3080008@v.loewis.de>
Message-ID: <2mzmzogf5a.fsf@starship.python.net>

"Martin v. Löwis" <martin@v.loewis.de> writes:

> Bob Ippolito wrote:
>> It doesn't for reasons I care not to explain in depth, again.
>> Search  the pythonmac-sig archives for longer explanations.  The
>> gist is that  you specifically do not want to link directly to the
>> framework at all  when building extensions.
>
> Because an Apple-built extension then may pick up a user-installed
> Python? Why can this problem not be solved by adding -F options,
> as Jack Jansen proposed?
>
>> This is not the wrong way to do it.
>
> I'm not convinced.

Martin, can you please believe that Jack, Bob, Ronald et al know what
they are talking about here?

Cheers,
mwh

-- 
  Q: Isn't it okay to just read Slashdot for the links?
  A: No. Reading Slashdot for the links is like having "just one hit"
     off the crack pipe.
     -- http://www.cs.washington.edu/homes/klee/misc/slashdot.html#faq
From bob at redivi.com  Wed Jan  5 16:18:08 2005
From: bob at redivi.com (Bob Ippolito)
Date: Wed Jan  5 16:18:18 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <41DBA649.3080008@v.loewis.de>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<41DBA649.3080008@v.loewis.de>
Message-ID: <019B7FCB-5F2D-11D9-9DC0-000A9567635C@redivi.com>


On Jan 5, 2005, at 3:33 AM, Martin v. Löwis wrote:

> Bob Ippolito wrote:
>> It doesn't for reasons I care not to explain in depth, again.  Search 
>>  the pythonmac-sig archives for longer explanations.  The gist is 
>> that  you specifically do not want to link directly to the framework 
>> at all  when building extensions.
>
> Because an Apple-built extension then may pick up a user-installed
> Python? Why can this problem not be solved by adding -F options,
> as Jack Jansen proposed?
>
>> This is not the wrong way to do it.
>
> I'm not convinced.

Then you haven't done the appropriate research by searching 
pythonmac-sig.  Do you even own a Mac?

-bob

From glyph at divmod.com  Wed Jan  5 16:37:16 2005
From: glyph at divmod.com (Glyph Lefkowitz)
Date: Wed Jan  5 16:35:24 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <B41D4282-5EC7-11D9-9DC0-000A9567635C@redivi.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
	<B41D4282-5EC7-11D9-9DC0-000A9567635C@redivi.com>
Message-ID: <1104939436.5854.25.camel@localhost>

On Tue, 2005-01-04 at 22:12 -0500, Bob Ippolito wrote:
> If you have a class hierarchy where this is a problem, it's probably 
> pretty fragile to begin with, and you should think about making it 
> simpler.

I agree with James's rant almost entirely, but I like super() anyway.  I
think it is an indication not of a new weakness of super(), but of a
long-standing weakness of __init__.

One approach I have taken in order to avoid copiously over-documenting
every super() using class is to decouple different phases of
initialization by making __init__ as simple as possible (setting a few
attributes, resisting the temptation to calculate things), and then
providing class methods like '.fromString' or '.forUnserialize' that
create instances that have been completely constructed for a particular
purpose.  That way the signatures are much more likely to line up across
inheritance hierarchies.  Perhaps this should be a suggested "best
practice" when using super() as well?
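[Editor's note: a minimal sketch of that pattern (class and method names invented for illustration):]

```python
class Message:
    def __init__(self, sender, body):
        # __init__ stays trivial: just bind attributes, compute nothing.
        self.sender = sender
        self.body = body

    @classmethod
    def fromString(cls, raw):
        # Alternate constructor for one particular purpose: parsing a
        # serialized form, then delegating to the simple __init__.
        sender, _, body = raw.partition(': ')
        return cls(sender, body)

msg = Message.fromString('guido: hello world')
assert (msg.sender, msg.body) == ('guido', 'hello world')
```

Because every subclass's `__init__` keeps the same trivial signature, the classmethod constructors line up cleanly across an inheritance hierarchy that uses super().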


From glyph at divmod.com  Wed Jan  5 16:41:30 2005
From: glyph at divmod.com (Glyph Lefkowitz)
Date: Wed Jan  5 16:39:37 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl>
	<1104896563.16766.19.camel@geddy.wooz.org>
	<91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <1104939690.5854.30.camel@localhost>

On Wed, 2005-01-05 at 12:11 +0100, Alex Martelli wrote:

> Hmmm, seems to me we'd need copies of the function object for this 
> purpose:

For the stated use-case of serialization, only one copy would be
necessary, and besides - even *I* don't use idioms as weird as the one
you are suggesting very often ;).

I think it would be reasonable to assign im_class only to functions
defined in class scope.  The only serialization that would break in that
case is if your example had a 'del f' at the end.


From arigo at tunes.org  Wed Jan  5 17:10:45 2005
From: arigo at tunes.org (Armin Rigo)
Date: Wed Jan  5 17:21:42 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <41DAF22B.6030605@zope.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<41DAE213.9070906@zope.com>
	<ca471dc2050104114026228858@mail.gmail.com>
	<41DAF22B.6030605@zope.com>
Message-ID: <20050105161045.GA19431@vicky.ecs.soton.ac.uk>

Hi Jim,

On Tue, Jan 04, 2005 at 02:44:43PM -0500, Jim Fulton wrote:
> >Actually, unbound builtin methods are a different type than bound
> >builtin methods:
> 
> Of course, but conceptually they are similar.  You would still
> encounter the concept if you got an unbound builtin method.

There are no such things as unbound builtin methods:

>>> list.append is list.__dict__['append']
True

In other words 'list.append' just returns exactly the same object as stored in
the list type's dict.  Guido's proposal is to make Python methods behave in
the same way.
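A quick interactive check of the identity Armin is pointing out, which holds for built-in types:

```python
# Attribute access on a built-in type hands back the very object
# stored in the type's __dict__ -- there is no unbound-method wrapper.
assert list.append is list.__dict__['append']
assert dict.get is dict.__dict__['get']
```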


Armin
From seojiwon at gmail.com  Wed Jan  5 17:32:44 2005
From: seojiwon at gmail.com (Jiwon Seo)
Date: Wed Jan  5 17:32:47 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <e8bf7a530501032003575615d@mail.gmail.com>
References: <ca471dc2050103174221d7442f@mail.gmail.com>
	<41D9F94C.3020005@ocf.berkeley.edu>
	<e8bf7a530501032003575615d@mail.gmail.com>
Message-ID: <b008462b050105083256c1c0e4@mail.gmail.com>

I'd like to help here on the AST branch, if it's not too late.
(I'm especially interested in the generator expression part.)

If I want to volunteer, do I just begin to work with it? Or do I need
to read something or discuss with someone?

Thanks.

Jiwon.

On Mon, 3 Jan 2005 23:03:33 -0500, Jeremy Hylton <jhylton@gmail.com> wrote:
> On Mon, 03 Jan 2005 18:02:52 -0800, Brett C. <bac@ocf.berkeley.edu> wrote:
> > Plus there is the running tradition of sprinting on the AST branch at PyCon.  I
> > was planning on shedding my bug fixing drive at PyCon this year and sprinting
> > with (hopefully) Jeremy, Neal, Tim, and Neil on the AST branch as a prep for
> > working on it afterwards for my class credit.
> 
> I'd like to sprint on it before PyCon; we'll have to see what my
> schedule allows.
> 
> > If anyone would like to see the current code, check out ast-branch from CVS
> > (read the dev FAQ on how to check out a branch from CVS).  Read
> > Python/compile.txt for an overview of how the thing works and such.
> >
> > It will get done, just don't push for a 2.5 release within a month.  =)
> 
> I think the branch is in an awkward state, because of the new features
> added to Python 2.4 after the AST branch work ceased.  The ast branch
> doesn't handle generator expressions or decorators; extending the ast
> to support them would be a good first step.
> 
> There are also the simple logistical questions of integrating changes.
> Since most of the AST branch changes are confined to a few files, I
> suspect the best thing to do is to merge all the changes from the head
> except for compile.c.  I haven't done a major CVS branch integrate in
> at least nine months; if someone feels more comfortable with that, it
> would also be a good step.
> 
> Perhaps interested parties should take up the discussion on the
> compiler-sig.  I think we can recover the state of last May's effort
> pretty quickly, and I can help outline the remaining work even if I
> can't help much.  (Although I hope I can help, too.)
> 
> Jeremy
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/seojiwon%40gmail.com
>
From arigo at tunes.org  Wed Jan  5 17:30:06 2005
From: arigo at tunes.org (Armin Rigo)
Date: Wed Jan  5 17:40:57 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <ca471dc2050104102814be915b@mail.gmail.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
Message-ID: <20050105163006.GB19431@vicky.ecs.soton.ac.uk>

Hi Guido,

On Tue, Jan 04, 2005 at 10:28:03AM -0800, Guido van Rossum wrote:
> Let's get rid of unbound methods.

Is there any other use case for 'C.x' not returning the same as
'appropriate_super_class_of_C.__dict__["x"]' ?  I guess it's too late now but
it would have been nice if user-defined __get__() methods had the more obvious
signature (self, instance) instead of (self, instance_or_None, cls=None).  
Given the amount of potential breakage people already pointed out I guess
it is not reasonable to change that.
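For reference, a tiny made-up descriptor spelling out the signature in question; instance is None exactly when the attribute is fetched from the class rather than from an instance:

```python
class Probe(object):
    # The actual descriptor protocol: __get__(self, instance, owner),
    # with instance=None for class-level access -- the very asymmetry
    # being discussed here.
    def __get__(self, instance, owner=None):
        if instance is None:
            return 'class access via %s' % owner.__name__
        return 'instance access'

class C(object):
    attr = Probe()

assert C.attr == 'class access via C'
assert C().attr == 'instance access'
```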


Armin
From jhylton at gmail.com  Wed Jan  5 17:42:57 2005
From: jhylton at gmail.com (Jeremy Hylton)
Date: Wed Jan  5 17:43:00 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <b008462b050105083256c1c0e4@mail.gmail.com>
References: <ca471dc2050103174221d7442f@mail.gmail.com>
	<41D9F94C.3020005@ocf.berkeley.edu>
	<e8bf7a530501032003575615d@mail.gmail.com>
	<b008462b050105083256c1c0e4@mail.gmail.com>
Message-ID: <e8bf7a530501050842affb328@mail.gmail.com>

On Thu, 6 Jan 2005 01:32:44 +0900, Jiwon Seo <seojiwon@gmail.com> wrote:
> I'd like to help here on the AST branch, if it's not too late.
> (I'm especially interested in the generator expression part.)

Great!  It's not too late.

> If I want to volunteer, do I just begin to work with it? Or do I need
> to read something or discuss with someone?

The file Python/compile.txt on the ast-branch has a brief overview of
the project:

http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/Attic/compile.txt?rev=1.1.2.8&only_with_tag=ast-branch&view=auto

Jeremy
From shane.holloway at ieee.org  Wed Jan  5 17:44:31 2005
From: shane.holloway at ieee.org (Shane Holloway (IEEE))
Date: Wed Jan  5 17:45:07 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <ca471dc2050104102814be915b@mail.gmail.com>	<94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl>	<1104896563.16766.19.camel@geddy.wooz.org>
	<91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <41DC196F.8070400@ieee.org>



Alex Martelli wrote:
> def f(*a): pass
> class C(object): pass
> class D(object): pass
> C.f = D.f = f
> 
> If now we want C.f.im_class to differ from D.f.im_class then we need
> f to get copied implicitly when it's assigned to C.f (or, of course,
> when C.f is accessed... but THAT might be substantial overhead). OK,
> I guess, as long as we don't expect any further attribute setting on
> f to affect C.f or D.f (and I don't know of any real use case where
> that would be needed).

You'd have to do a copy anyway, because f() is still a module-level
callable entity. I also agree with Glyph that im_class should only
really be set in the case of methods defined within the class block.


Also, interestingly, removing unbound methods makes another thing possible.

     class A(object):
         def foo(self): pass

     class B(object):
         foo = A.foo

     class C(object):
         pass
     C.foo = A.foo


I'd really like to avoid making copies of functions for the sake of
reload() and edit-and-continue functionality. Currently we can track
down everything that has a reference to foo, and replace it with newfoo.
With copies, this would be more difficult.

Thanks,
-Shane
From jhylton at gmail.com  Wed Jan  5 17:49:05 2005
From: jhylton at gmail.com (Jeremy Hylton)
Date: Wed Jan  5 17:49:08 2005
Subject: [Python-Dev] ast branch pragmatics
Message-ID: <e8bf7a53050105084947600ab0@mail.gmail.com>

The existing ast-branch mostly works, but it does not include most of
the new features of Python 2.4.  There is a substantial integration
effort, perhaps an easy one for someone who does a lot of CVS branch merges.
(In particular, the head has already been merged to this branch once.)

I think it would be easier to create a new branch from the current
head, integrate the small number of changed files from ast-branch, and
work with that branch instead.  The idea is that it's an end-run
around doing an automatic CVS merge and relying on someone to manually
merge the changes.

At the same time, since there is a groundswell of support for
finishing the AST work, I'd like to propose that we stop making
compiler / bytecode changes until it is done.  Every change to
compile.c or the bytecode ends up creating a new incompatibility that
needs to be merged.

If these two plans sound good, I'll get started on the new branch.

Jeremy
From foom at fuhm.net  Wed Jan  5 17:55:54 2005
From: foom at fuhm.net (James Y Knight)
Date: Wed Jan  5 17:56:01 2005
Subject: [Python-Dev] Re: super() harmful?
In-Reply-To: <ca471dc205010418022db8c838@mail.gmail.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
	<ca471dc205010418022db8c838@mail.gmail.com>
Message-ID: <A9AD2A6E-5F3A-11D9-8D68-000A95A50FB2@fuhm.net>

I'm not sure why super got dragged into this, but...

On Jan 4, 2005, at 9:02 PM, Guido van Rossum wrote:
> I think that James Y Knight's page misrepresents the issue. Quoting:

> But __init__ *is* special, in that it is okay for a subclass __init__
> (or __new__) to have a different signature than the base class
> __init__; this is not true for other methods. If you change a regular
> method's signature, you would break Liskov substitutability (i.e.,
> your subclass instance wouldn't be acceptable where a base class
> instance would be acceptable).

You're right, some issues do apply to __init__ alone. However, two 
important ones do not:

The issue of mixing super() and explicit calls to the superclass's 
method occur with any method. (Thus making it difficult/impossible for 
a framework to convert to using super without breaking client code that 
subclasses).

Adding optional arguments to one branch of the inheritance tree, but 
not another, or adding different optional args in both branches. 
(breaks unless you always pass optional args as keywordargs, and all 
methods take **kwargs and pass that on to super).
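A sketch of the **kwargs discipline described above (class names are made up): every cooperative method accepts and forwards keyword arguments, so an optional arg added in one branch survives the trip through the other.

```python
class Base(object):
    def save(self, **kw):
        # end of the cooperative chain
        return ['Base']

class Audited(Base):
    def save(self, audit=False, **kw):
        # optional arg added in this branch only
        return ['Audited(audit=%s)' % audit] + super(Audited, self).save(**kw)

class Versioned(Base):
    def save(self, version=None, **kw):
        # a different optional arg added in the other branch
        return ['Versioned(version=%s)' % version] + super(Versioned, self).save(**kw)

class Both(Audited, Versioned):
    pass

# Keywords must be passed as keywords for this to work at all:
trail = Both().save(audit=True, version=2)
```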

> Super is intended for uses that are designed with method cooperation in
> mind, so I agree with the best practices in James's Conclusion:
> [[omitted]]
> But that's not the same as calling it harmful. :-(

The 'harmfulness' comes from people being confused by, and misusing 
super, because it is so very very easy to do so, and so very hard to 
use correctly.

From what I can tell, it is mostly used incorrectly. *Especially* uses 
in __init__ or __new__. Many people seem to use super in their __init__ 
methods thinking that it'll magically improve something (like perhaps 
making multiple inheritance trees that include their class work 
better), only to just cause a different set of problems for multiple 
inheritance trees, instead, because they don't realize they need to 
follow those recommendations.

Here's another page that says much the same thing, but from the 
viewpoint of recommending the use of super and showing you all the 
hoops to use it right:
http://wiki.osafoundation.org/bin/view/Chandler/UsingSuper

James

PS, I wrote that page last pycon but never got around to finishing it 
up and therefore never really publicly announced it. But I told some 
people about it and then they kept asking me for the URL so I linked 
to it, and well, then google found it of course, so I guess it's public 
now. ;)
From barry at python.org  Wed Jan  5 18:26:38 2005
From: barry at python.org (Barry Warsaw)
Date: Wed Jan  5 18:26:56 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <1104939436.5854.25.camel@localhost>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
	<B41D4282-5EC7-11D9-9DC0-000A9567635C@redivi.com>
	<1104939436.5854.25.camel@localhost>
Message-ID: <1104945997.32311.8.camel@geddy.wooz.org>

On Wed, 2005-01-05 at 10:37, Glyph Lefkowitz wrote:

> One approach I have taken in order to avoid copiously over-documenting
> every super() using class is to decouple different phases of
> initialization by making __init__ as simple as possible (setting a few
> attributes, resisting the temptation to calculate things), and then
> providing class methods like '.fromString' or '.forUnserialize' that
> create instances that have been completely constructed for a particular
> purpose.  That way the signatures are much more likely to line up across
> inheritance hierarchies.  Perhaps this should be a suggested "best
> practice" when using super() as well?

Yep, I've done the same thing.  It's definitely a good practice.

-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20050105/f78f1142/attachment.pgp
From barry at python.org  Wed Jan  5 18:29:01 2005
From: barry at python.org (Barry Warsaw)
Date: Wed Jan  5 18:29:10 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <1104939690.5854.30.camel@localhost>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl>
	<1104896563.16766.19.camel@geddy.wooz.org>
	<91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it>
	<1104939690.5854.30.camel@localhost>
Message-ID: <1104946141.32311.12.camel@geddy.wooz.org>

On Wed, 2005-01-05 at 10:41, Glyph Lefkowitz wrote:

> I think it would be reasonable to assign im_class only to functions
> defined in class scope.  The only serialization that would break in that
> case is if your example had a 'del f' at the end.

+1.  If you're doing something funkier, then you can set that attribute
yourself.

-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20050105/7f041dd9/attachment-0001.pgp
From ark-mlist at att.net  Wed Jan  5 18:33:09 2005
From: ark-mlist at att.net (Andrew Koenig)
Date: Wed Jan  5 18:32:53 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <41DAE213.9070906@zope.com>
Message-ID: <001d01c4f34c$a02f8770$6402a8c0@arkdesktop>

> duck typing?

That's the Australian pronunciation of "duct taping".


From fumanchu at amor.org  Wed Jan  5 18:38:52 2005
From: fumanchu at amor.org (Robert Brewer)
Date: Wed Jan  5 18:41:36 2005
Subject: [Python-Dev] Re: [Csv] csv module TODO list
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E33980EE@exchange.hqamor.amorhq.net>

Skip Montanaro wrote:
>     Andrew> There's a bunch of jobs we (CSV module maintainers) have been
>     Andrew> putting off - attached is a list (in no particular order):
> 
>     ...
> 
> In addition, it occurred to me this evening that there's 
> functionality in the csv module I don't think anybody uses.
> ...
> I'm also not aware that anyone really uses the Sniffer class,
> though it does provide some useful functionality should you
> need to analyze random CSV files.

I used Sniffer quite heavily for my last contract. The client had
multiple multigig csv's which needed deduplicating, but they were all
from different sources and therefore in different formats. It would have
cost me many more hours without the Sniffer. Please keep it. <:)


Robert Brewer
MIS
Amor Ministries
fumanchu@amor.org
From jim at zope.com  Wed Jan  5 18:53:19 2005
From: jim at zope.com (Jim Fulton)
Date: Wed Jan  5 18:53:25 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <20050105161045.GA19431@vicky.ecs.soton.ac.uk>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<41DAE213.9070906@zope.com>
	<ca471dc2050104114026228858@mail.gmail.com>
	<41DAF22B.6030605@zope.com>
	<20050105161045.GA19431@vicky.ecs.soton.ac.uk>
Message-ID: <41DC298F.6040802@zope.com>

Armin Rigo wrote:
> Hi Jim,
> 
> On Tue, Jan 04, 2005 at 02:44:43PM -0500, Jim Fulton wrote:
> 
>>>Actually, unbound builtin methods are a different type than bound
>>>builtin methods:
>>
>>Of course, but conceptually they are similar.  You would still
>>encounter the concept if you got an unbound builtin method.
> 
> 
> There are no such things as unbound builtin methods:
> 
> 
>>>>list.append is list.__dict__['append']
> 
> True
> 
> In other words 'list.append' just returns exactly the same object as stored in
> the list type's dict.  Guido's proposal is to make Python methods behave in
> the same way.

OK, interesting.

I'm sold then.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
From pje at telecommunity.com  Wed Jan  5 19:03:42 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan  5 19:03:58 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <20050105163006.GB19431@vicky.ecs.soton.ac.uk>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<ca471dc2050104102814be915b@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050105130143.02abfec0@mail.telecommunity.com>

At 04:30 PM 1/5/05 +0000, Armin Rigo wrote:
>Hi Guido,
>
>On Tue, Jan 04, 2005 at 10:28:03AM -0800, Guido van Rossum wrote:
> > Let's get rid of unbound methods.
>
>Is there any other use case for 'C.x' not returning the same as
>'appropriate_super_class_of_C.__dict__["x"]' ?

Er, classmethod would be one; a rather important one at that.
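Concretely (a minimal illustration, not from the thread): for a classmethod, the object stored in the class __dict__ is the classmethod wrapper itself, while attribute access returns a method already bound to the class.

```python
class C(object):
    @classmethod
    def make(cls):
        return cls()

# The raw __dict__ entry is the classmethod object itself...
assert isinstance(C.__dict__['make'], classmethod)
# ...while C.make is a method bound to the class, a different object.
assert C.make is not C.__dict__['make']
assert isinstance(C.make(), C)
```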

From tjreedy at udel.edu  Wed Jan  5 19:04:07 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed Jan  5 19:04:16 2005
Subject: [Python-Dev] Re: Please help complete the AST branch
References: <ca471dc2050103174221d7442f@mail.gmail.com><41D9F94C.3020005@ocf.berkeley.edu><e8bf7a530501032003575615d@mail.gmail.com><b008462b050105083256c1c0e4@mail.gmail.com>
	<e8bf7a530501050842affb328@mail.gmail.com>
Message-ID: <crha6n$tgf$1@sea.gmane.org>


"Jeremy Hylton" <jhylton@gmail.com> wrote in message 
news:e8bf7a530501050842affb328@mail.gmail.com...
> The file Python/compile.txt on the ast-branch has a brief overview of
> the project:

http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/Attic/compile.txt?rev=1.1.2.8&only_with_tag=ast-branch&view=auto

Clicking on the above gave me:

(502)  Bad Gateway
The proxy server received an invalid response from an upstream server

??? Perhaps it is a temporary glitch on SF's backend cvs server.

Terry J. Reedy



From pje at telecommunity.com  Wed Jan  5 19:04:35 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan  5 19:04:52 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <1104946141.32311.12.camel@geddy.wooz.org>
References: <1104939690.5854.30.camel@localhost>
	<ca471dc2050104102814be915b@mail.gmail.com>
	<94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl>
	<1104896563.16766.19.camel@geddy.wooz.org>
	<91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it>
	<1104939690.5854.30.camel@localhost>
Message-ID: <5.1.1.6.0.20050105130351.02ac2e60@mail.telecommunity.com>

At 12:29 PM 1/5/05 -0500, Barry Warsaw wrote:
>On Wed, 2005-01-05 at 10:41, Glyph Lefkowitz wrote:
>
> > I think it would be reasonable to assign im_class only to functions
> > defined in class scope.  The only serialization that would break in that
> > case is if your example had a 'del f' at the end.
>
>+1.  If you're doing something funkier, then you can set that attribute
>yourself.
>
>-Barry

Um, isn't all this stuff going to be more complicated and spread out over 
more of the code than just leaving unbound methods in place?

From gvanrossum at gmail.com  Wed Jan  5 19:10:32 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan  5 19:10:36 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <5.1.1.6.0.20050105130351.02ac2e60@mail.telecommunity.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl>
	<1104896563.16766.19.camel@geddy.wooz.org>
	<91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it>
	<1104939690.5854.30.camel@localhost>
	<1104946141.32311.12.camel@geddy.wooz.org>
	<5.1.1.6.0.20050105130351.02ac2e60@mail.telecommunity.com>
Message-ID: <ca471dc2050105101056546b48@mail.gmail.com>

> Um, isn't all this stuff going to be more complicated and spread out over
> more of the code than just leaving unbound methods in place?

Well, in an early version of Python it was as simple as I'd like it to
be again: the instancemethod type was only used for bound methods
(hence the name) and C.f would return the same function object as
C.__dict__["f"]. Apart from backwards compatibility with all the code
that has grown cruft to deal with the fact that C.f is not a function
object, I still see no reason why the current state of affairs is
better.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From kbk at shore.net  Wed Jan  5 19:20:51 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed Jan  5 19:21:11 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <e8bf7a5305010413541cd17a02@mail.gmail.com> (Jeremy Hylton's
	message of "Tue, 4 Jan 2005 16:54:28 -0500")
References: <ca471dc2050103174221d7442f@mail.gmail.com>
	<41D9F94C.3020005@ocf.berkeley.edu>
	<20050104021909.GB11833@unpythonic.net>
	<41DB0FA4.3070405@ocf.berkeley.edu>
	<e8bf7a5305010413541cd17a02@mail.gmail.com>
Message-ID: <87d5wjagak.fsf@hydra.bayview.thirdcreek.com>

Jeremy Hylton <jhylton@gmail.com> writes:

> Does anyone want to volunteer to integrate the current head to the
> branch?  I think that's a pretty important near-term step.

I'll take a shot at it.

I see the following:

2216 changes:

1428 modifications w/o conflict
399 adds
360 removes
29 conflicts

Major conflict:
Python/compile.c                (Probably not merged during 1st merge)
Lib/test/test_compile.py        (ditto)
Lib/test/test_os.py             (AST?)
Lib/test/test_re.py             (AST?)


Major conflict probably not AST related:
Lib/test/test_bool.py
Lib/test/test_urllib.py
Lib/test/output/test_profile
Python/pythonrun.c (check brackets!)

Other issues: need local -kk to avoid another 80 conflicts due to the priceless
              keyword expansion, have to watch out for binary files like IDLE icons.

              ViewCVS is down, slows things up.

I'm going to tag the trunk: mrg_to_ast-branch_05JAN05

-- 
KBK
From gvanrossum at gmail.com  Wed Jan  5 19:23:01 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan  5 19:23:04 2005
Subject: [Python-Dev] Re: super() harmful?
In-Reply-To: <A9AD2A6E-5F3A-11D9-8D68-000A95A50FB2@fuhm.net>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
	<ca471dc205010418022db8c838@mail.gmail.com>
	<A9AD2A6E-5F3A-11D9-8D68-000A95A50FB2@fuhm.net>
Message-ID: <ca471dc2050105102328387030@mail.gmail.com>

> The issue of mixing super() and explicit calls to the superclass's
> method occur with any method. (Thus making it difficult/impossible for
> a framework to convert to using super without breaking client code that
> subclasses).

Well, client classes which are leaves of the class tree can still
safely use BaseClass.thisMethod(self, args) -- it's only classes that
are written to be extended that must all be converted to using
super(). So I'm not sure how you think your clients are breaking.

> Adding optional arguments to one branch of the inheritance tree, but
> not another, or adding different optional args in both branches.
> (breaks unless you always pass optional args as keywordargs, and all
> methods take **kwargs and pass that on to super).

But that breaks anyway; I don't see how using the old
Base.method(self, args) approach makes this easier, *unless* you are
using single inheritance. If you're expecting single inheritance
anyway, why bother with super()?

> > Super is intended for uses that are designed with method cooperation in
> > mind, so I agree with the best practices in James's Conclusion:
> > [[omitted]]
> > But that's not the same as calling it harmful. :-(
> 
> The 'harmfulness' comes from people being confused by, and misusing
> super, because it is so very very easy to do so, and so very hard to
> use correctly.

And using multiple inheritance the old way was not confusing? Surely
you are joking.

> From what I can tell, it is mostly used incorrectly. *Especially* uses
> in __init__ or __new__. Many people seem to use super in their __init__
> methods thinking that it'll magically improve something (like perhaps
> making multiple inheritance trees that include their class work
> better), only to just cause a different set of problems for multiple
> inheritance trees, instead, because they don't realize they need to
> follow those recommendations.

If they're happy with single inheritance, let them use super()
incorrectly. It works, and that's what counts. Their code didn't work
right with multiple inheritance before, it still doesn't. Some people
just are uncomfortable with calling Base.method(self, ...) and feel
super is "more correct". Let them.

> Here's another page that says much the same thing, but from the
> viewpoint of recommending the use of super and showing you all the
> hoops to use it right:
> http://wiki.osafoundation.org/bin/view/Chandler/UsingSuper

The problem isn't caused by super but by multiple inheritance.

> James
> 
> PS, I wrote that page last pycon but never got around to finishing it
> up and therefore never really publicly announced it. But I told some
> people about it and then they kept asking me for the URL so I linked
> to it, and well, then google found it of course, so I guess it's public
> now. ;)

Doesn't mean you can't fix it. :)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From gvanrossum at gmail.com  Wed Jan  5 19:23:49 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan  5 19:23:53 2005
Subject: [Python-Dev] ast branch pragmatics
In-Reply-To: <e8bf7a53050105084947600ab0@mail.gmail.com>
References: <e8bf7a53050105084947600ab0@mail.gmail.com>
Message-ID: <ca471dc205010510233f0b0b67@mail.gmail.com>

> I think it would be easier to create a new branch from the current
> head, integrate the small number of changed files from ast-branch, and
> work with that branch instead.  The idea is that it's an end-run
> around doing an automatic CVS merge and relying on someone to manually
> merge the changes.
> 
> At the same time, since there is a groundswell of support for
> finishing the AST work, I'd like to propose that we stop making
> compiler / bytecode changes until it is done.  Every change to
> compile.c or the bytecode ends up creating a new incompatibility that
> needs to be merged.
> 
> If these two plans sound good, I'll get started on the new branch.

+1

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From python at rcn.com  Wed Jan  5 19:28:11 2005
From: python at rcn.com (Raymond Hettinger)
Date: Wed Jan  5 19:31:22 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <87d5wjagak.fsf@hydra.bayview.thirdcreek.com>
Message-ID: <001501c4f354$500c43c0$e841fea9@oemcomputer>

Would it be helpful for me to move the peepholer out of compile.c into a
separate source file?


Raymond Hettinger

From kbk at shore.net  Wed Jan  5 19:35:24 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed Jan  5 19:35:39 2005
Subject: [Python-Dev] ast branch pragmatics
In-Reply-To: <e8bf7a53050105084947600ab0@mail.gmail.com> (Jeremy Hylton's
	message of "Wed, 5 Jan 2005 11:49:05 -0500")
References: <e8bf7a53050105084947600ab0@mail.gmail.com>
Message-ID: <878y77afmb.fsf@hydra.bayview.thirdcreek.com>

Jeremy Hylton <jhylton@gmail.com> writes:

> The existing ast-branch mostly works, but it does not include most of
> the new features of Python 2.4.  There is a substantial integration
> effort, perhaps an easy one for someone who does a lot of CVS branch merges.
> (In particular, the head has already been merged to this branch once.)
>
> I think it would be easier to create a new branch from the current
> head, integrate the small number of changed files from ast-branch, and
> work with that branch instead.  The idea is that it's an end-run
> around doing an automatic CVS merge and relying on someone to manually
> merge the changes.
>
> At the same time, since there is a groundswell of support for
> finishing the AST work, I'd like to propose that we stop making
> compiler / bytecode changes until it is done.  Every change to
> compile.c or the bytecode ends up creating a new incompatibility that
> needs to be merged.
>
> If these two plans sound good, I'll get started on the new branch.

Hm, I saw this after making my previous post.

Well, you can see from that post that it's a bit of work, but not
overwhelming.

You have a better feel for how much change was made on ast-branch and
how complete the previous merge was.  So, you decide: if you want me
to do the merge, I can. But ast-branch-2 sounds OK, also.

-- 
KBK
From magnus at hetland.org  Wed Jan  5 13:19:21 2005
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Wed Jan  5 19:48:51 2005
Subject: [Python-Dev] Re: csv module TODO list
In-Reply-To: <20050105075506.314C93C8E5@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<16859.38960.9935.682429@montanaro.dyndns.org>
	<20050105075506.314C93C8E5@coffee.object-craft.com.au>
Message-ID: <20050105121921.GB24030@idi.ntnu.no>

Quite a while ago I posted some material to the csv-list about
problems using the csv module on Unix-style colon-separated files --
it just doesn't deal properly with backslash escaping and is quite
useless for this kind of file. I seem to recall the general view was
that it wasn't intended for this kind of thing -- only the sort of csv
that Microsoft Excel outputs/inputs, but if I am mistaken about this,
perhaps fixing this issue might be put on the TODO-list? I'll be happy
to re-send or summarize the relevant emails, if needed.

-- 
Magnus Lie Hetland       Fallen flower I see / Returning to its branch
http://hetland.org       Ah! a butterfly.           [Arakida Moritake]
From jhylton at gmail.com  Wed Jan  5 19:54:03 2005
From: jhylton at gmail.com (Jeremy Hylton)
Date: Wed Jan  5 19:54:06 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <001501c4f354$500c43c0$e841fea9@oemcomputer>
References: <87d5wjagak.fsf@hydra.bayview.thirdcreek.com>
	<001501c4f354$500c43c0$e841fea9@oemcomputer>
Message-ID: <e8bf7a53050105105466f305fc@mail.gmail.com>

On Wed, 5 Jan 2005 13:28:11 -0500, Raymond Hettinger <python@rcn.com> wrote:
> Would it be helpful for me to move the peepholer out of compile.c into a
> separate source file?

It doesn't really matter.  There are two reasons.  1) We've been
working on the new compiler code in newcompile.c, rather than
compile.c.  When it is finished, we'll replace compile.c with
newcompile.c, but it was helpful to have both around at first.
2) Peephole optimizations would be done on the basic block
intermediate representation rather than code objects.  So we'll need
to rewrite it anyway to use the new IR.
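A rough illustration of why the rewrite is natural (this is a made-up IR, not the actual newcompile.c structures): a peephole pass over an instruction list for one basic block, rather than over a finished code object's bytecode string.

```python
def fold_add(block):
    """Fold LOAD_CONST a; LOAD_CONST b; BINARY_ADD into one LOAD_CONST.

    A block is just a list of (opcode, argument) pairs -- a stand-in
    for a basic-block intermediate representation.
    """
    out = []
    for op, arg in block:
        if (op == 'BINARY_ADD' and len(out) >= 2
                and out[-1][0] == out[-2][0] == 'LOAD_CONST'):
            b = out.pop()[1]
            a = out.pop()[1]
            out.append(('LOAD_CONST', a + b))
        else:
            out.append((op, arg))
    return out

block = [('LOAD_CONST', 1), ('LOAD_CONST', 2),
         ('BINARY_ADD', None), ('RETURN_VALUE', None)]
```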

Jeremy
From jhylton at gmail.com  Wed Jan  5 19:58:02 2005
From: jhylton at gmail.com (Jeremy Hylton)
Date: Wed Jan  5 19:58:05 2005
Subject: [Python-Dev] Please help complete the AST branch
In-Reply-To: <87d5wjagak.fsf@hydra.bayview.thirdcreek.com>
References: <ca471dc2050103174221d7442f@mail.gmail.com>
	<41D9F94C.3020005@ocf.berkeley.edu>
	<20050104021909.GB11833@unpythonic.net>
	<41DB0FA4.3070405@ocf.berkeley.edu>
	<e8bf7a5305010413541cd17a02@mail.gmail.com>
	<87d5wjagak.fsf@hydra.bayview.thirdcreek.com>
Message-ID: <e8bf7a5305010510583adcfb0@mail.gmail.com>

On Wed, 05 Jan 2005 13:20:51 -0500, Kurt B. Kaiser <kbk@shore.net> wrote:
> Jeremy Hylton <jhylton@gmail.com> writes:
> 
> > Does anyone want to volunteer to integrate the current head to the
> > branch?  I think that's a pretty important near-term step.
> 
> I'll take a shot at it.

Great!  I say this after reading your other message in response to my
suggestion to create a new branch.  If you can manage to do the
integration, it's simpler for everyone to stick to a single branch. 
(For example, there will be no opportunity for someone to work on the
wrong branch.)
 
> 29 conflicts

Oh.  That's not as bad as I expected.
 
> Major conflict:
> Python/compile.c                (Probably not merged during 1st merge)

I think that's right.  I didn't merge any of the changes, then.

> Lib/test/test_compile.c         (ditto)

Probably.

> Lib/test/test_os.py             (AST?)
> Lib/test/test_re.py             (AST?)

I wonder if these two were edited to work around some bugs in early
versions of newcompile.c.  You could check the revision history.  If
that's the case, it's safe to drop the changes.

> Major conflict probably not AST related:
> Lib/test/test_bool.py
> Lib/test/test_urllib.py
> Lib/test/output/test_profile
> Python/pythonrun.c (check brackets!)

There are actually a lot of AST-related changes in pythonrun.c,
because it is the gunk between files and stdin and the actual compiler
and runtime.
 
Jeremy
From martin at v.loewis.de  Wed Jan  5 22:15:48 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Jan  5 22:15:41 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <019B7FCB-5F2D-11D9-9DC0-000A9567635C@redivi.com>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<41DBA649.3080008@v.loewis.de>
	<019B7FCB-5F2D-11D9-9DC0-000A9567635C@redivi.com>
Message-ID: <41DC5904.4070507@v.loewis.de>

Bob Ippolito wrote:
> Then you haven't done the appropriate research by searching 
> pythonmac-sig.  

Hmm.

 > Do you even own a Mac?

Do I have to, in order to understand the issues?

But to answer your question: yes, I do.

Regards,
Martin
From tjreedy at udel.edu  Wed Jan  5 22:24:14 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed Jan  5 22:24:24 2005
Subject: [Python-Dev] Re: Please help complete the AST branch
References: <ca471dc2050103174221d7442f@mail.gmail.com><41D9F94C.3020005@ocf.berkeley.edu><e8bf7a530501032003575615d@mail.gmail.com><b008462b050105083256c1c0e4@mail.gmail.com><e8bf7a530501050842affb328@mail.gmail.com>
	<crha6n$tgf$1@sea.gmane.org>
Message-ID: <crhltu$4g4$1@sea.gmane.org>


"Terry Reedy" <tjreedy@udel.edu> wrote in message 
news:crha6n$tgf$1@sea.gmane.org...
http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/Attic/compile.txt?rev=1.1.2.8&only_with_tag=ast-branch&view=auto
>
> Clicking on the above gave me:
>
> (502)  Bad Gateway
> The proxy server received an invalid response from an upstream server
>
> ??? Perhaps it is a temporary glitch on SF's backend cvs server.

Seems so, working now.



From kbk at shore.net  Wed Jan  5 22:30:34 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed Jan  5 22:31:01 2005
Subject: [Python-Dev] Please help complete the AST branch
References: <ca471dc2050103174221d7442f@mail.gmail.com>
	<41D9F94C.3020005@ocf.berkeley.edu>
	<20050104021909.GB11833@unpythonic.net>
	<41DB0FA4.3070405@ocf.berkeley.edu>
	<e8bf7a5305010413541cd17a02@mail.gmail.com>
	<87d5wjagak.fsf@hydra.bayview.thirdcreek.com>
	<e8bf7a5305010510583adcfb0@mail.gmail.com>
Message-ID: <87llb78sxx.fsf@hydra.bayview.thirdcreek.com>

Jeremy Hylton <jhylton@gmail.com> writes:

>> 29 conflicts
>
> Oh.  That's not as bad as I expected.

Proceeding....

>> Major conflict:
>> Python/compile.c                (Probably not merged during 1st merge)
>
> I think that's right.  I didn't merge any of the changes, then.
>
>> Lib/test/test_compile.c         (ditto)
>
> Probably.

So maybe it's not necessary to merge these two; just leave them behind? 
That would lighten the load quite a bit.

-- 
KBK
From bob at redivi.com  Wed Jan  5 22:39:11 2005
From: bob at redivi.com (Bob Ippolito)
Date: Wed Jan  5 22:39:19 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <41DC5904.4070507@v.loewis.de>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<41DBA649.3080008@v.loewis.de>
	<019B7FCB-5F2D-11D9-9DC0-000A9567635C@redivi.com>
	<41DC5904.4070507@v.loewis.de>
Message-ID: <3CAA6728-5F62-11D9-AB1C-000A95BA5446@redivi.com>


On Jan 5, 2005, at 16:15, Martin v. Löwis wrote:

> Bob Ippolito wrote:
>> Then you haven't done the appropriate research by searching 
>> pythonmac-sig.
>
> Hmm.
>
> > Do you even own a Mac?
>
> Do I have to, in order to understand the issues?
>
> But to answer your question: yes, I do.

Well, this issue has been discussed over and over again on 
pythonmac-sig over the past year or so (perhaps as far back as the 
10.3.0 release).  I do not have time at the moment to summarize, but 
the solution proposed is sane and there is no known better way.  If you 
take a look at the WWDC2004 sources for Python, a similar patch is 
applied by Apple.  However, Apple's patch breaks (at least) C++ 
compilation and SciPy's distutils extension for compiling Fortran due 
to distutils' stupidity.

-bob

From martin at v.loewis.de  Wed Jan  5 22:58:04 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Jan  5 22:57:58 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <DAD9DD96-5EFA-11D9-85EE-000D93AD379E@mac.com>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<41DBA649.3080008@v.loewis.de>
	<DAD9DD96-5EFA-11D9-85EE-000D93AD379E@mac.com>
Message-ID: <41DC62EC.6060608@v.loewis.de>

Ronald Oussoren wrote:

> It gets worse when you have a user-installed python 2.3 and a 
> user-installed python 2.4. Those will be both be installed as 
> /Library/Frameworks/Python.framework. 

Yes, but one is installed in Versions/2.3, and the other in
Versions/2.4.

> This means that you cannot use the 
> -F flag to select which one you want to link to, '-framework Python' 
> will only link to the python that was installed the latest.

What about using -F /Library/Frameworks/Python.framework/Versions/2.3?
Or, would there be a different way to specify the version of a
framework when linking, in addition to -F? What about

   -framework Python,/Versions/2.3

I could not find a specification of how the suffix in -framework is meant
to work - perhaps it could be used here?

Regards,
Martin
From martin at v.loewis.de  Wed Jan  5 23:00:26 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Jan  5 23:00:20 2005
Subject: [Python-Dev] csv module TODO list
In-Reply-To: <20050105093414.00DFF3C8E5@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com>
	<20050105093414.00DFF3C8E5@coffee.object-craft.com.au>
Message-ID: <41DC637A.5050105@v.loewis.de>

Andrew McNamara wrote:
>>>Can you please elaborate on that? What needs to be done, and how is
>>>that going to be done? It might be possible to avoid considerable
>>>uglification.
> 
> 
> I'm not altogether sure there. The parsing state machine is all written in
> C, and deals with signed chars - I expect we'll need two versions of that
> (or one version that's compiled twice using pre-processor macros). Quite
> a large job. Suggestions gratefully received.

I'm still trying to understand what *needs* to be done - I would move to
how this is done only later. What APIs should be extended/changed, and
in what way?

Regards,
Martin
From bob at redivi.com  Wed Jan  5 23:06:10 2005
From: bob at redivi.com (Bob Ippolito)
Date: Wed Jan  5 23:06:17 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <41DC62EC.6060608@v.loewis.de>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<41DBA649.3080008@v.loewis.de>
	<DAD9DD96-5EFA-11D9-85EE-000D93AD379E@mac.com>
	<41DC62EC.6060608@v.loewis.de>
Message-ID: <01B2E2DC-5F66-11D9-AB1C-000A95BA5446@redivi.com>


On Jan 5, 2005, at 16:58, Martin v. Löwis wrote:

> Ronald Oussoren wrote:
>
>> It gets worse when you have a user-installed python 2.3 and a 
>> user-installed python 2.4. Those will be both be installed as 
>> /Library/Frameworks/Python.framework.
>
> Yes, but one is installed in Versions/2.3, and the other in
> Versions/2.4.
>
>> This means that you cannot use the -F flag to select which one you 
>> want to link to, '-framework Python' will only link to the python 
>> that was installed the latest.
>
> What about using -F /Library/Frameworks/Python.framework/Versions/2.3?
> Or, would there be a different way to specify the version of a
> framework when linking, in addition to -F? What about
>
>   -framework Python,/Versions/2.3

Nope.  The only way to link to a non-current framework version is to 
forego any linker searching and specify the dyld file directly, i.e. 
/Library/Frameworks/Python.framework/Versions/2.3/Python.  The gcc 
toolchain does not in any way whatsoever understand versioned 
frameworks, period.
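
In command-line terms, the two options reduce to something like this (illustrative invocations, not exact build lines):

```shell
# Let the linker search: "-framework Python" always resolves through
# the Current symlink, whatever version that happens to point at.
cc -bundle -o ext.so ext.o -framework Python

# Bypass the search entirely: name the versioned framework dylib
# directly, as Bob describes (path is illustrative).
cc -bundle -o ext.so ext.o \
    /Library/Frameworks/Python.framework/Versions/2.3/Python
```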

> I could not find a specification of how the suffix in -framework is meant
> to work - perhaps it could be used here?

dylib suffixes are used for having separate versions of the dylib 
(debug, profile, etc.).  It is NOT for general production use, ever.

-bob

From martin at v.loewis.de  Wed Jan  5 23:19:40 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Jan  5 23:19:33 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <2mzmzogf5a.fsf@starship.python.net>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>	<41DB2C9A.4070800@v.loewis.de>	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>	<41DBA649.3080008@v.loewis.de>
	<2mzmzogf5a.fsf@starship.python.net>
Message-ID: <41DC67FC.8020703@v.loewis.de>

Michael Hudson wrote:
> Martin, can you please believe that Jack, Bob, Ronald et al know what
> they are talking about here?

I find that really hard to believe, because it contradicts what I
think Apple wants me to believe. I'm willing to follow a series of
statements that I can confirm to be facts somehow (e.g. "As TechNote
XY says, OSX has a bug in that it loads the Current version at run-time,
no matter what version the binary says should be used"). I'm not really
willing to believe a statement without any kind of proof - regardless
who made that statement. "Read the mailing lists" is no proof.

If I was to accept anything said without doubt, Jack would not have
needed to post his message in the first place - he expressed his
opinion that he believed the changes to be appropriate. It was his
doubt that triggered mine. I am not going to interfere with the
changes -- it's just that I want to understand them.

Kind regards,
Martin
From martin at v.loewis.de  Wed Jan  5 23:38:26 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Jan  5 23:38:19 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <01B2E2DC-5F66-11D9-AB1C-000A95BA5446@redivi.com>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>	<41DB2C9A.4070800@v.loewis.de>	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>	<41DBA649.3080008@v.loewis.de>	<DAD9DD96-5EFA-11D9-85EE-000D93AD379E@mac.com>	<41DC62EC.6060608@v.loewis.de>
	<01B2E2DC-5F66-11D9-AB1C-000A95BA5446@redivi.com>
Message-ID: <41DC6C62.90101@v.loewis.de>

Bob Ippolito wrote:
> Nope.  The only way to link to a non-current framework version is to 
> forego any linker searching and specify the dyld file directly, i.e. 
> /Library/Frameworks/Python.framework/Versions/2.3/Python.  The gcc 
> toolchain does not in any way whatsoever understand versioned 
> frameworks, period.

I see. I wish you had told me right from the beginning.

Regards,
Martin
From arigo at tunes.org  Wed Jan  5 23:39:18 2005
From: arigo at tunes.org (Armin Rigo)
Date: Wed Jan  5 23:50:12 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <5.1.1.6.0.20050105130143.02abfec0@mail.telecommunity.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<ca471dc2050104102814be915b@mail.gmail.com>
	<5.1.1.6.0.20050105130143.02abfec0@mail.telecommunity.com>
Message-ID: <20050105223918.GA26613@vicky.ecs.soton.ac.uk>

Hi Phillip,

On Wed, Jan 05, 2005 at 01:03:42PM -0500, Phillip J. Eby wrote:
> >Is there any other use case for 'C.x' not returning the same as
> >'appropriate_super_class_of_C.__dict__["x"]' ?
> 
> Er, classmethod would be one; a rather important one at that.

Oups.  Right, sorry.


Armin
From foom at fuhm.net  Thu Jan  6 00:00:38 2005
From: foom at fuhm.net (James Y Knight)
Date: Thu Jan  6 00:00:41 2005
Subject: [Python-Dev] Re: super() harmful?
In-Reply-To: <ca471dc2050105102328387030@mail.gmail.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
	<ca471dc205010418022db8c838@mail.gmail.com>
	<A9AD2A6E-5F3A-11D9-8D68-000A95A50FB2@fuhm.net>
	<ca471dc2050105102328387030@mail.gmail.com>
Message-ID: <9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net>

On Jan 5, 2005, at 1:23 PM, Guido van Rossum wrote:
>> The issue of mixing super() and explicit calls to the superclass's
>> method occur with any method. (Thus making it difficult/impossible for
>> a framework to convert to using super without breaking client code 
>> that
>> subclasses).
>
> Well, client classes which are leaves of the class tree can still
> safely use BaseClass.thisMethod(self, args) -- it's only classes that
> are written to be extended that must all be converted to using
> super(). So I'm not sure how you think your clients are breaking.

See the section "Subclasses must use super if their superclasses do". 
This is particularly a big issue with __init__.

>> Adding optional arguments to one branch of the inheritance tree, but
>> not another, or adding different optional args in both branches.
>> (breaks unless you always pass optional args as keywordargs, and all
>> methods take **kwargs and pass that on to super).
>
> But that breaks anyway; I don't see how using the old
> Base.method(self, args) approach makes this easier, *unless* you are
> using single inheritance. If you're expecting single inheritance
> anyway, why bother with super()?

There is a distinction between simple multiple inheritance, which did 
work in the old system vs. multiple inheritance in a diamond structure 
which did not work in the old system. However, consider something like 
the following (ignore the Interface/implements bit if you want. It's 
just to point out a common situation where two classes can 
independently implement the same method without having a common 
superclass):

class IFrob(Interface):
   def frob():
     """Frob the knob"""

class A:
   implements(IFrob)
   def frob(self, foo=False):
     print "A.frob(foo=%r)"%foo

class B:
   implements(IFrob)
   def frob(self, bar=False):
     print "B.frob(bar=%r)"%bar

class C(A,B):
   def m(self, foo=False, bar=False):
     A.m(self, foo=foo)
     B.m(self, bar=bar)
     print "C.frob(foo=%r, bar=%r)"%(foo,bar)

Now, how do you write that to use super? Here's what I come up with:

class IFrob(Interface):
   def frob():
     """Frob the knob"""

class A(object):
   implements(IFrob)
   def frob(self, foo=False, *args, **kwargs):
     try:
       f = super(A, self).frob
     except AttributeError:
       pass
     else:
       f(foo=foo, *args, **kwargs)
     print "A.frob(foo=%r)"%foo

class B(object):
   implements(IFrob)
   def frob(self, bar=False, *args, **kwargs):
     try:
       f = super(B, self).frob
     except AttributeError:
       pass
     else:
       f(bar=bar, *args, **kwargs)
     print "B.frob(bar=%r)"%bar

class C(A,B):
   def frob(self, foo=False, bar=False, *args, **kwargs):
     super(C, self).frob(foo, bar, *args, **kwargs)
     print "C.frob(foo=%r, bar=%r)"%(foo,bar)

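James's sketch above can be made runnable roughly like this (the Interface/implements machinery is dropped, since it isn't needed for the super() mechanics; the getattr dance replaces the try/except, and the call order follows the MRO):

```python
# Cooperative version: each frob() forwards leftover keyword arguments
# up the MRO before doing its own work, so every implementation runs
# exactly once even under multiple inheritance.

class A(object):
    def frob(self, foo=False, **kwargs):
        sup = getattr(super(A, self), "frob", None)  # next class in MRO, if any
        if sup is not None:
            sup(**kwargs)
        print("A.frob(foo=%r)" % foo)

class B(object):
    def frob(self, bar=False, **kwargs):
        sup = getattr(super(B, self), "frob", None)
        if sup is not None:
            sup(**kwargs)
        print("B.frob(bar=%r)" % bar)

class C(A, B):
    def frob(self, foo=False, bar=False):
        super(C, self).frob(foo=foo, bar=bar)
        print("C.frob(foo=%r, bar=%r)" % (foo, bar))

C().frob(foo=True, bar=True)   # prints B's line, then A's, then C's
```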


> And using multiple inheritance the old way was not confusing? Surely
> you are joking.

It was pretty simple until you start having diamond structures. Then 
it's complicated. Now, don't get me wrong, I think that MRO-calculating 
mechanism really is "the right thing", in the abstract. I just think 
the way it works out as implemented in python is really confusing and 
it's easy to be worse off with it than without it.

> If they're happy with single inheritance, let them use super()
> incorrectly. It works, and that's what counts. Their code didn't work
> right with multiple inheritance before, it still doesn't. Some people
> just are uncomfortable with calling Base.method(self, ...) and feel
> super is "more correct". Let them.

Their code worked right in M-I without diamonds before. Now it likely 
doesn't work in M-I at all.

James

From bob at redivi.com  Thu Jan  6 00:14:19 2005
From: bob at redivi.com (Bob Ippolito)
Date: Thu Jan  6 00:14:28 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <41DC6C62.90101@v.loewis.de>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>	<41DB2C9A.4070800@v.loewis.de>	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>	<41DBA649.3080008@v.loewis.de>	<DAD9DD96-5EFA-11D9-85EE-000D93AD379E@mac.com>	<41DC62EC.6060608@v.loewis.de>
	<01B2E2DC-5F66-11D9-AB1C-000A95BA5446@redivi.com>
	<41DC6C62.90101@v.loewis.de>
Message-ID: <8735A624-5F6F-11D9-AB1C-000A95BA5446@redivi.com>


On Jan 5, 2005, at 17:38, Martin v. Löwis wrote:

> Bob Ippolito wrote:
>> Nope.  The only way to link to a non-current framework version is to  
>> forego any linker searching and specify the dyld file directly, i.e.  
>> /Library/Frameworks/Python.framework/Versions/2.3/Python.  The gcc  
>> toolchain does not in any way whatsoever understand versioned  
>> frameworks, period.
>
> I see. I wish you had told me right from the beginning.

That is only part of the reason for these changes (concurrent Python  
2.3 and Python 2.4 in the same location), and is fringe enough that I  
wasn't even thinking of it at the time.

I just dug up some information I had written on this particular topic  
but never published, if you're interested:
http://bob.pythonmac.org/archives/2005/01/05/versioned-frameworks-considered-harmful/

-bob

From Jack.Jansen at cwi.nl  Thu Jan  6 00:21:58 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Thu Jan  6 00:21:46 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
Message-ID: <98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl>

[Grmpf. I should check which account I use before pressing send. Here  
goes again]

On 5-jan-05, at 1:08, Bob Ippolito wrote:
>>> The problem we're trying to solve is that due to the way Apple's  
>>> framework architecture works newer versions of frameworks are  
>>> preferred (at link time, and sometimes even at runtime) over older  
>>> ones.
>>
>> Can you elaborate on that somewhat? According to
>>
>> http://developer.apple.com/documentation/MacOSX/Conceptual/BPFrameworks/Concepts/VersionInformation.html
>>
>> there are major and minor versions of frameworks. I would think that
>> every Python minor (2.x) release should produce a new major framework
>> version of the Python framework. Then, there would be no problem.
>>
>> Why does this not work?
>
> It doesn't for reasons I care not to explain in depth, again.

But I do care:-) Specifically because I trust the crowd here to come up  
with good ideas (even if they're not Mac users:-).

Ronald already explained most of the problem, what it boils down to is  
that  multiple versions of a framework can live in a single location.  
For most applications that's better than the old MacOS9 architecture  
(which I believe is pretty similar to the Windows dll architecture)  
because you can ship a single foo.framework that contains both version  
1.2 and 1.3. There's also a symlink "Current" that will point to 1.3.  
At build time the linker will pick the version pointed at by "Current",  
but in the file it will record the actual version number. Hence, if you  
ship this framework new development will link to the newest version,  
but older programs will still load the older one.
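
The layout Jack describes looks roughly like this on disk (an illustrative sketch, not an actual listing):

```shell
# Python.framework/
#   Versions/
#     2.3/Python        # the 2.3 dylib
#     2.4/Python        # the 2.4 dylib
#     Current -> 2.4    # symlink; the linker follows this at build
#                       # time, but the linked binary records the
#                       # concrete version it resolved to
```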

When I did the framework python design I overlooked the fact that an  
older Python would have no way to specify that an extension would have  
to link against its own, old, framework, because on MacOS9 this wasn't  
a problem (the two had different filenames).

As an aside, I also overlooked the fact that a Python framework
residing in /System could be overridden by one in /Library, since in
2.3 we linked frameworks by relative pathname; I simply didn't
envision any Python living in /System for some time to come. The -F
option could solve that problem, but not the problem of 2.3 and 2.4
both living in /Library.

The "new" solution is basically to go back to the Unix way of building  
an extension: link it against nothing and sort things out at runtime.  
Not my personal preference, but at least we know that loading an  
extension into one Python won't bring in a fresh copy of a different  
interpreter or anything horrible like that.
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma  
Goldman

From gvanrossum at gmail.com  Thu Jan  6 00:36:02 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Jan  6 00:36:05 2005
Subject: [Python-Dev] Re: super() harmful?
In-Reply-To: <9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
	<ca471dc205010418022db8c838@mail.gmail.com>
	<A9AD2A6E-5F3A-11D9-8D68-000A95A50FB2@fuhm.net>
	<ca471dc2050105102328387030@mail.gmail.com>
	<9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net>
Message-ID: <ca471dc20501051536c4fd618@mail.gmail.com>

On Wed, 5 Jan 2005 18:00:38 -0500, James Y Knight <foom@fuhm.net> wrote:
> On Jan 5, 2005, at 1:23 PM, Guido van Rossum wrote:
> >> The issue of mixing super() and explicit calls to the superclass's
> >> method occur with any method. (Thus making it difficult/impossible for
> >> a framework to convert to using super without breaking client code
> >> that subclasses).
> >
> > Well, client classes which are leaves of the class tree can still
> > safely use BaseClass.thisMethod(self, args) -- it's only classes that
> > are written to be extended that must all be converted to using
> > super(). So I'm not sure how you think your clients are breaking.
> 
> See the section "Subclasses must use super if their superclasses do".
> This is particularly a big issue with __init__.

I see. I was thinking about subclassing a single class, you are
talking about subclassing multiple bases. Subclassing two or more
classes is *always* very subtle. Before 2.2 and super(), the only sane
way to do that was to have all except one base class be written as a
mix-in class for a specific base class (or family of base classes).

The idea of calling both __init__ methods doesn't work if there's a
diamond; if there *is* a diamond (or could be one), using super() is
the only sane solution.
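
Guido's point can be sketched with a minimal diamond (illustrative class names): with super(), the shared base's __init__ runs exactly once, whereas explicit Base.__init__ calls from Leaf would run it twice.

```python
# Diamond: Leaf -> (Left, Right) -> Root.  Each __init__ records its
# class and delegates via super(); the MRO guarantees Root runs once.

class Root(object):
    calls = []
    def __init__(self):
        Root.calls.append("Root")
        super(Root, self).__init__()

class Left(Root):
    def __init__(self):
        Root.calls.append("Left")
        super(Left, self).__init__()

class Right(Root):
    def __init__(self):
        Root.calls.append("Right")
        super(Right, self).__init__()

class Leaf(Left, Right):
    def __init__(self):
        Root.calls.append("Leaf")
        super(Leaf, self).__init__()

Leaf()
print(Root.calls)  # ['Leaf', 'Left', 'Right', 'Root']
```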

> >> Adding optional arguments to one branch of the inheritance tree, but
> >> not another, or adding different optional args in both branches.
> >> (breaks unless you always pass optional args as keywordargs, and all
> >> methods take **kwargs and pass that on to super).
> >
> > But that breaks anyway; I don't see how using the old
> > Base.method(self, args) approach makes this easier, *unless* you are
> > using single inheritance. If you're expecting single inheritance
> > anyway, why bother with super()?
> 
> There is a distinction between simple multiple inheritance, which did
> work in the old system

Barely; see above.

> vs. multiple inheritance in a diamond structure
> which did not work in the old system. However, consider something like
> the following (ignore the Interface/implements bit if you want. It's
> just to point out a common situation where two classes can
> independently implement the same method without having a common
> superclass):
> 
> class IFrob(Interface):
>    def frob():
>      """Frob the knob"""
> 
> class A:
>    implements(IFrob)
>    def frob(self, foo=False):
>      print "A.frob(foo=%r)"%foo
> 
> class B:
>    implements(IFrob)
>    def frob(self, bar=False):
>      print "B.frob(bar=%r)"%bar
> 
> class C(A,B):
>    def m(self, foo=False, bar=False):
[I presume you meant frob instead of m here]
>      A.m(self, foo=foo)
>      B.m(self, bar=bar)
>      print "C.frob(foo=%r, bar=%r)"%(foo,bar)
> 
> Now, how do you write that to use super?

The problem isn't in super(), the problem is that the classes A and B
aren't written cooperatively, so attempting to combine them using
multiple inheritance is asking for trouble. You'd be better off making
C a container class that has separate A and B instances.

> > And using multiple inheritance the old way was not confusing? Surely
> > you are joking.
> 
> It was pretty simple until you start having diamond structures. Then
> it's complicated. Now, don't get me wrong, I think that MRO-calculating
> mechanism really is "the right thing", in the abstract. I just think
> the way it works out as implemented in python is really confusing and
> it's easy to be worse off with it than without it.

So then don't use it. You couldn't have diamonds at all before 2.2.
With *care* and *understanding* you can do the right thing in 2.2 and
beyond.

I'm getting tired of super() being blamed for the problems inherent to
cooperative multiple inheritance. super() is the tool that you need to
solve a hairy problem; but don't blame super() for the problem's
hairiness.

> > If they're happy with single inheritance, let them use super()
> > incorrectly. It works, and that's what counts. Their code didn't work
> > right with multiple inheritance before, it still doesn't. Some people
> > just are uncomfortable with calling Base.method(self, ...) and feel
> > super is "more correct". Let them.
> 
> Their code worked right in M-I without diamonds before. Now it likely
> doesn't work in M-I at all.

If you have a framework with classes written using the old paradigm
that a subclass must call the __init__ (or frob) method of each of its
superclasses, you can't change your framework to use super() instead
while maintaining backwards compatibility. If you didn't realize that
before you made the change and then got bitten by it, tough.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From martin at v.loewis.de  Thu Jan  6 00:46:32 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Jan  6 00:46:25 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <8735A624-5F6F-11D9-AB1C-000A95BA5446@redivi.com>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>	<41DB2C9A.4070800@v.loewis.de>	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>	<41DBA649.3080008@v.loewis.de>	<DAD9DD96-5EFA-11D9-85EE-000D93AD379E@mac.com>	<41DC62EC.6060608@v.loewis.de>
	<01B2E2DC-5F66-11D9-AB1C-000A95BA5446@redivi.com>
	<41DC6C62.90101@v.loewis.de>
	<8735A624-5F6F-11D9-AB1C-000A95BA5446@redivi.com>
Message-ID: <41DC7C58.6040006@v.loewis.de>

Bob Ippolito wrote:
> I just dug up some information I had written on this particular topic  
> but never published, if you're interested:
> http://bob.pythonmac.org/archives/2005/01/05/versioned-frameworks-considered-harmful/

Interesting. I don't get the part why "-undefined dynamic_lookup"
is a good idea (and this is indeed what bothered me most to begin with).
As you say, explicitly specifying the target .dylib should work as
well, and it also does not require 10.3.

Regards,
Martin
From martin at v.loewis.de  Thu Jan  6 00:49:52 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Jan  6 00:49:45 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>	<41DB2C9A.4070800@v.loewis.de>	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl>
Message-ID: <41DC7D20.4000901@v.loewis.de>

Jack Jansen wrote:
> But I do care:-) Specifically because I trust the crowd here to come up  
> with good ideas (even if they're not Mac users:-).

Thanks a lot.

> The "new" solution is basically to go back to the Unix way of building  
> an extension: link it against nothing and sort things out at runtime.  
> Not my personal preference, but at least we know that loading an  
> extension into one Python won't bring in a fresh copy of a different  
> interpreter or anything horrible like that.

This sounds good, except that it only works on OS X 10.3, right?
What about older versions?

Regards,
Martin
From andrewm at object-craft.com.au  Thu Jan  6 02:10:55 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Thu Jan  6 02:11:02 2005
Subject: [Python-Dev] csv module TODO list 
In-Reply-To: <41DC637A.5050105@v.loewis.de> 
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com>
	<20050105093414.00DFF3C8E5@coffee.object-craft.com.au>
	<41DC637A.5050105@v.loewis.de>
Message-ID: <20050106011055.001163C8E5@coffee.object-craft.com.au>

>>>>Can you please elaborate on that? What needs to be done, and how is
>>>>that going to be done? It might be possible to avoid considerable
>>>>uglification.
>> 
>> I'm not altogether sure there. The parsing state machine is all written in
>> C, and deals with signed chars - I expect we'll need two versions of that
>> (or one version that's compiled twice using pre-processor macros). Quite
>> a large job. Suggestions gratefully received.
>
>I'm still trying to understand what *needs* to be done - I would move to
>how this is done only later. What APIs should be extended/changed, and
>in what way?

That's certainly the first step, and I have to admit that I don't have
a clear idea at this time - the unicode issue has been in the "too hard"
basket since we started.

Marc-Andre Lemburg mentioned that he has encountered UTF-16 encoded csv
files, so a reasonable starting point would be the ability to read and
parse, as well as the ability to generate, one of these.

The reader interface currently returns a row at a time, consuming as many
lines from the supplied iterable (with the most common iterable being
a file). This suggests to me that we will need an optional "encoding"
argument to the reader constructor, and that the reader will need to
decode the source lines. That said, I'm hardly a unicode expert, so I
may be overlooking something (could a utf-16 encoded character span a
line break, for example?).  The writer interface probably should have
similar facilities.
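
The decode-first approach Andrew describes might be sketched like this (the function name and signature are illustrative, not the csv module's actual API; decoding the whole buffer before splitting sidesteps the worry about a multi-byte character spanning a line boundary):

```python
import csv
import io

def unicode_csv_rows(raw_bytes, encoding="utf-16"):
    # Decode the entire buffer first, so no multi-byte sequence can be
    # split at a line break, then feed text lines to the parser.  Here
    # csv.reader stands in for the hypothetical unicode-aware parser.
    text = raw_bytes.decode(encoding)
    return list(csv.reader(io.StringIO(text)))

raw = "name,year\nguido,1991\n".encode("utf-16")
print(unicode_csv_rows(raw))  # [['name', 'year'], ['guido', '1991']]
```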

However - a number of people have complained about the "iterator"
interface, wanting to supply strings (the iterable is necessary because a
CSV row can span multiple lines). It's also conceivable that the source
lines could already be unicode objects.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From andrewm at object-craft.com.au  Thu Jan  6 03:03:08 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Thu Jan  6 03:03:13 2005
Subject: [Csv] Re: [Python-Dev] csv module TODO list 
In-Reply-To: <20050106011055.001163C8E5@coffee.object-craft.com.au> 
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com>
	<20050105093414.00DFF3C8E5@coffee.object-craft.com.au>
	<41DC637A.5050105@v.loewis.de>
	<20050106011055.001163C8E5@coffee.object-craft.com.au>
Message-ID: <20050106020308.EBE5A3C8E5@coffee.object-craft.com.au>

>>I'm still trying to understand what *needs* to be done - I would move to
>>how this is done only later. What APIs should be extended/changed, and
>>in what way?
[...]
>The reader interface currently returns a row at a time, consuming as many
>lines from the supplied iterable (with the most common iterable being
>a file). This suggests to me that we will need an optional "encoding"
>argument to the reader constructor, and that the reader will need to
>decode the source lines. That said, I'm hardly a unicode expert, so I
>may be overlooking something (could a utf-16 encoded character span a
>line break, for example).  The writer interface probably should have
>similar facilities.

Ah - I see that the codecs module provides an EncodedFile class - better
to use this than add encoding/decoding cruft to the csv module.
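
A sketch of that arrangement in today's terms, with io.TextIOWrapper
standing in for the EncodedFile-style decoding layer (the sample bytes
and column names here are invented for illustration):

```python
import csv
import io

# Illustrative only: decode the byte stream first, then hand text lines
# to csv.reader, rather than teaching the parser itself about encodings.
raw = 'name,value\r\nspam,"1,5"\r\n'.encode('utf-16')

# The decoding layer; newline='' leaves line-ending handling to the csv
# parser, as the csv documentation asks.
text = io.TextIOWrapper(io.BytesIO(raw), encoding='utf-16', newline='')

rows = list(csv.reader(text))
```

Because the wrapper decodes the byte stream as a whole, a multi-byte
character spanning a raw line boundary is not a problem at this layer.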

So, do we duplicate the current reader and writer as UnicodeReader and
UnicodeWriter (how else do we know to use the unicode parser)? What about
the "dialects"? I guess if a dialect uses no unicode strings, it can be
applied to the current parser, but if it does include unicode strings,
then the parser would need to raise an exception.

The DictReader and DictWriter classes will probably need matching
UnicodeDictReader/UnicodeDictWriter versions (use common base class,
just specify alternate parser).

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From ilya at bluefir.net  Thu Jan  6 06:27:16 2005
From: ilya at bluefir.net (Ilya Sandler)
Date: Thu Jan  6 06:24:30 2005
Subject: [Python-Dev] an idea for improving struct.unpack api 
Message-ID: <Pine.LNX.4.58.0501042046120.678@bagira>


A problem:

The current struct.unpack API works well for unpacking C structures, where
everything is usually unpacked at once, but it becomes inconvenient when
unpacking binary files, where things often have to be unpacked field by
field. Then one has to keep track of offsets, slice the strings, call
struct.calcsize(), etc.

E.g. with the current API, unpacking a record which consists of a
header followed by a variable number of items would go like this:

 hdr_fmt = "iiii"
 item_fmt = "IIII"
 item_size = calcsize(item_fmt)
 hdr_size = calcsize(hdr_fmt)
 hdr = unpack(hdr_fmt, rec[0:hdr_size])  # rec is the record to unpack
 offset = hdr_size
 for i in range(hdr[0]):  # assume 1st field of header is a counter
     item = unpack(item_fmt, rec[offset:offset+item_size])
     offset += item_size

which is quite inconvenient...


A solution:

We could have an optional offset argument for

unpack(format, buffer, offset=None)

the offset argument is an object which contains a single integer field
which gets incremented inside unpack() to point to the next byte.

so with a new API the above code could be written as

 offset = struct.Offset(0)
 hdr = unpack("iiii", rec, offset)
 for i in range(hdr[0]):
     item = unpack("IIII", rec, offset)

When an offset argument is provided, unpack() should allow some bytes to
be left unpacked at the end of the buffer.


Does this suggestion make sense? Any better ideas?
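
For comparison, struct as it ships today provides
unpack_from(format, buffer, offset=0), which removes the slicing and
calcsize-driven copying, although the caller still advances the offset
by hand. The record above might be handled like this (the sample data
is invented for illustration):

```python
import struct

hdr_fmt = "<iiii"   # "<" pins byte order and sizes for the sample data
item_fmt = "<IIII"
hdr_size = struct.calcsize(hdr_fmt)
item_size = struct.calcsize(item_fmt)

# Invented sample record: a header whose first field counts the items.
rec = (struct.pack(hdr_fmt, 2, 0, 0, 0)
       + struct.pack(item_fmt, 1, 2, 3, 4)
       + struct.pack(item_fmt, 5, 6, 7, 8))

hdr = struct.unpack_from(hdr_fmt, rec, 0)
offset = hdr_size
items = []
for i in range(hdr[0]):
    items.append(struct.unpack_from(item_fmt, rec, offset))
    offset += item_size
```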

Ilya


From bob at redivi.com  Thu Jan  6 08:29:23 2005
From: bob at redivi.com (Bob Ippolito)
Date: Thu Jan  6 08:29:36 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <41DC7D20.4000901@v.loewis.de>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>	<41DB2C9A.4070800@v.loewis.de>	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl>
	<41DC7D20.4000901@v.loewis.de>
Message-ID: <B03BCC84-5FB4-11D9-AB1C-000A95BA5446@redivi.com>

On Jan 5, 2005, at 18:49, Martin v. Löwis wrote:

> Jack Jansen wrote:
>> The "new" solution is basically to go back to the Unix way of 
>> building  an extension: link it against nothing and sort things out 
>> at runtime.  Not my personal preference, but at least we know that 
>> loading an  extension into one Python won't bring in a fresh copy of 
>> a different  interpreter or anything horrible like that.
>
> This sounds good, except that it only works on OS X 10.3, right?
> What about older versions?

Older versions do not support this feature and have to deal with the 
way things are as-is.  Mac OS X 10.2 is the only supported version that 
suffers this consequence; I don't think anyone has supported Python on 
Mac OS X 10.1 in quite some time.

-bob

From bob at redivi.com  Thu Jan  6 08:31:45 2005
From: bob at redivi.com (Bob Ippolito)
Date: Thu Jan  6 08:31:50 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <41DC7C58.6040006@v.loewis.de>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>	<41DB2C9A.4070800@v.loewis.de>	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>	<41DBA649.3080008@v.loewis.de>	<DAD9DD96-5EFA-11D9-85EE-000D93AD379E@mac.com>	<41DC62EC.6060608@v.loewis.de>
	<01B2E2DC-5F66-11D9-AB1C-000A95BA5446@redivi.com>
	<41DC6C62.90101@v.loewis.de>
	<8735A624-5F6F-11D9-AB1C-000A95BA5446@redivi.com>
	<41DC7C58.6040006@v.loewis.de>
Message-ID: <04A1624E-5FB5-11D9-AB1C-000A95BA5446@redivi.com>

On Jan 5, 2005, at 18:46, Martin v. Löwis wrote:

> Bob Ippolito wrote:
>> I just dug up some information I had written on this particular topic  
>>  but never published, if you're interested:
>> http://bob.pythonmac.org/archives/2005/01/05/versioned-frameworks-considered-harmful/
>
> Interesting. I don't get the part why "-undefined dynamic_lookup"
> is a good idea (and this is indeed what bothered me most to begin  
> with).
> As you say, explicitly specifying the target .dylib should work as
> well, and it also does not require 10.3.

Without -undefined dynamic_lookup, your Python extensions are bound to  
a specific Python installation location (i.e. the system 2.3.0 and a  
user-installed 2.3.4).  This tends to be quite a problem.  With  
-undefined dynamic_lookup, they are not.

Just search for "version mismatch" on pythonmac-sig:
http://www.google.com/search?q=%22version+mismatch%22+pythonmac-sig+site:mail.python.org&ie=UTF-8&oe=UTF-8

-bob

From python at rcn.com  Thu Jan  6 08:33:39 2005
From: python at rcn.com (Raymond Hettinger)
Date: Thu Jan  6 08:36:52 2005
Subject: [Python-Dev] an idea for improving struct.unpack api 
In-Reply-To: <Pine.LNX.4.58.0501042046120.678@bagira>
Message-ID: <002701c4f3c2$0a7d3800$e841fea9@oemcomputer>

[Ilya Sandler]
> A problem:
> 
> The current struct.unpack api works well for unpacking C-structures
where
> everything is usually unpacked at once, but it
> becomes  inconvenient when unpacking binary files where things
> often have to be unpacked field by field. Then one has to keep track
> of offsets, slice the strings,call struct.calcsize(), etc...

Yes.  That bites.


> Eg. with a current api unpacking  of a record which consists of a
> header followed by a variable  number of items would go like this
> 
>  hdr_fmt="iiii"
>  item_fmt="IIII"
>  item_size=calcsize(item_fmt)
>  hdr_size=calcsize(hdr_fmt)
>  hdr=unpack(hdr_fmt, rec[0:hdr_size]) #rec is the record to unpack
>  offset=hdr_size
>  for i in range(hdr[0]): #assume 1st field of header is a counter
>    item=unpack( item_fmt, rec[ offset: offset+item_size])
>    offset+=item_size
> 
> which is quite inconvenient...
> 
> 
> A  solution:
> 
> We could have an optional offset argument for
> 
> unpack(format, buffer, offset=None)
> 
> the offset argument is an object which contains a single integer field
> which gets incremented inside unpack() to point to the next byte.
> 
> so with a new API the above code could be written as
> 
>  offset=struct.Offset(0)
>  hdr=unpack("iiii", rec, offset)
>  for i in range(hdr[0]):
>     item=unpack( "IIII", rec, offset)
> 
> When an offset argument is provided, unpack() should allow some bytes
to
> be left unpacked at the end of the buffer..
> 
> 
> Does this suggestion make sense? Any better ideas?

Rather than alter struct.unpack(), I suggest making a separate class
that tracks the offset and encapsulates some of the logic that typically
surrounds unpacking:

    r = StructReader(rec)
    hdr = r('iiii')
    for item in r.getgroups('IIII', times=hdr[0]):
       . . .

It would be especially nice if it handled the more complex case where
the next offset is determined in-part by the data being read (see the
example in section 11.3 of the tutorial):

    r = StructReader(open('myfile.zip', 'rb'))
    for i in range(3):                  # show the first 3 file headers
        fields = r.getgroup('LLLHH', offset=14)
        crc32, comp_size, uncomp_size, filenamesize, extra_size = fields
        filename = r.getgroup('c', offset=16, times=filenamesize)
        extra = r.getgroup('c', times=extra_size)
        r.advance(comp_size)
        print filename, hex(crc32), comp_size, uncomp_size

If you come up with something, I suggest posting it as an ASPN recipe
and then announcing it on comp.lang.python.  That ought to generate some
good feedback based on other people's real world issues with
struct.unpack().
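
One possible shape for such a class, offered only as a sketch (the
method names follow the snippets above; the rest is guesswork), built
on struct.unpack_from so the offset bookkeeping lives in one place:

```python
import struct

class StructReader:
    """Sketch of the suggested wrapper; tracks the offset internally."""

    def __init__(self, data, offset=0):
        self.data = data
        self.offset = offset

    def __call__(self, fmt):
        # Unpack one group at the current offset and advance past it.
        values = struct.unpack_from(fmt, self.data, self.offset)
        self.offset += struct.calcsize(fmt)
        return values

    def getgroups(self, fmt, times):
        # Yield `times` consecutive groups of the same format.
        for _ in range(times):
            yield self(fmt)

    def advance(self, nbytes):
        # Skip bytes we don't care to unpack (e.g. compressed payload).
        self.offset += nbytes
```

Usage matches the first snippet: read the header, then iterate over the
counted item groups.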


Raymond Hettinger

From foom at fuhm.net  Thu Jan  6 08:46:11 2005
From: foom at fuhm.net (James Y Knight)
Date: Thu Jan  6 08:46:23 2005
Subject: [Python-Dev] Re: super() harmful?
In-Reply-To: <ca471dc20501051536c4fd618@mail.gmail.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
	<ca471dc205010418022db8c838@mail.gmail.com>
	<A9AD2A6E-5F3A-11D9-8D68-000A95A50FB2@fuhm.net>
	<ca471dc2050105102328387030@mail.gmail.com>
	<9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net>
	<ca471dc20501051536c4fd618@mail.gmail.com>
Message-ID: <091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net>

On Jan 5, 2005, at 6:36 PM, Guido van Rossum wrote:
> The idea of calling both __init__ methods doesn't work if there's a
> diamond; if there *is* a diamond (or could be one), using super() is
> the only sane solution.

Very true.

> So then don't use it. You couldn't have diamonds at all before 2.2.
> With *care* and *understanding* you can do the right thing in 2.2 and
> beyond.
>
> I'm getting tired of super() being blamed for the problems inherent to
> cooperative multiple inheritance. super() is the tool that you need to
> solve a hairy problem; but don't blame super() for the problem's
> hairiness.

Please notice that I'm talking about concrete, real issues, not just a 
"super is bad!" rant. These are initially non-obvious (to me, at least) 
things that will actually happen in real code and that you actually do 
need to watch out for if you use super.

Yes. It is a hard problem. However, the issues I talk about are not 
issues with the functionality and theory of calling the next method in 
an MRO, they are issues with the combination of MROs, the 
implementation of MRO-calling in python (via "super"), and current 
practices in writing python code. They are not inherent in cooperative 
multiple inheritance, but occur mostly because of its late addition to 
python, and the cumbersome way in which you have to invoke super.

I wrote up the page as part of an investigation into converting Twisted 
to use super. I thought it would be a good idea to do the conversion, 
but others told me it would be a bad idea for backwards compatibility 
reasons. I did not believe, at first, and conducted experiments. In the 
end, I concluded that it is not possible, because of the issues with 
mixing the new and old paradigm.

> If you have a framework with classes written using the old paradigm
> that a subclass must call the __init__ (or frob) method of each of its
> superclasses, you can't change your framework to use super() instead
> while maintaining backwards compatibility.

Yep, that's what I said, too.

> If you didn't realize that
> before you made the change and then got bitten by it, tough.

Luckily, I didn't get bitten by it because I figured out the 
consequences and wrote a webpage about them before making an incorrect 
code change.

Leaving behind the backwards compatibility issues...

In order to make super really nice, it should be easier to use right. 
Again, the two major issues that cause problems are: 1) having to 
declare every method with *args, **kwargs, and having to pass those and 
all the arguments you take explicitly to super, and 2) that 
traditionally __init__ is called with positional arguments.

To fix #1, it would be really nice if you could write code something 
like the following snippet. Notice especially here that the 'bar' 
argument gets passed through C.__init__ and A.__init__, into 
D.__init__, without the previous two having to do anything about it. 
However, if you ask me to detail how this could *possibly* *ever* work 
in python, I have no idea. Probably the answer is that it can't.

class A(object):
     def __init__(self):
         print "A"
         next_method

class B(object):
     def __init__(self):
         print "B"
         next_method

class C(A):
     def __init__(self, foo):
         print "C","foo=",foo
         next_method
         self.foo=foo

class D(B):
     def __init__(self, bar):
         print "D", "bar=",bar
         next_method
         self.bar=bar

class E(C,D):
     def __init__(self, foo, bar):
         print "E"
         next_method

class E2(C,D):
     """Even worse, not defining __init__ should work right too."""

E(foo=10, bar=20)
E2(foo=10, bar=20)
# Yet, these ought to result in a TypeError because the quaz keyword
# isn't recognized by any __init__ method on any class in the hierarchy
# above E/E2:
E(foo=10, bar=20, quaz=5)
E2(foo=10, bar=20, quaz=5)
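
For reference, the *args/**kwargs discipline that issue #1 complains
about, written out runnably (class names from the snippet above, bodies
simplified, call order recorded in a list instead of printed), does
deliver the behaviour asked for here, including the TypeError once an
unrecognized 'quaz' keyword reaches object.__init__, at the cost of
exactly the boilerplate being criticized:

```python
calls = []

class A(object):
    def __init__(self, **kwargs):
        calls.append("A")
        super(A, self).__init__(**kwargs)

class B(object):
    def __init__(self, **kwargs):
        calls.append("B")
        super(B, self).__init__(**kwargs)

class C(A):
    def __init__(self, foo, **kwargs):
        calls.append("C")
        super(C, self).__init__(**kwargs)
        self.foo = foo

class D(B):
    def __init__(self, bar, **kwargs):
        calls.append("D")
        super(D, self).__init__(**kwargs)
        self.bar = bar

class E(C, D):
    """No __init__ needed; the MRO (E, C, A, D, B, object) handles it."""

# 'bar' flows through C and A untouched, into D, with no extra code.
e = E(foo=10, bar=20)
```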

James

From aleax at aleax.it  Thu Jan  6 09:33:20 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan  6 09:33:24 2005
Subject: [Python-Dev] an idea for improving struct.unpack api 
In-Reply-To: <Pine.LNX.4.58.0501042046120.678@bagira>
References: <Pine.LNX.4.58.0501042046120.678@bagira>
Message-ID: <9EF1E97A-5FBD-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 06, at 06:27, Ilya Sandler wrote:
    ...
> We could have an optional offset argument for
>
> unpack(format, buffer, offset=None)

I do agree on one concept here: when a function wants a string argument 
S, and the value for that string argument S is likely to come from some 
other bigger string Z as a subset Z[O:O+L], being able to optionally 
specify Z, O and L (or the endpoint, O+L), rather than having to do the 
slicing, can be a simplification and a substantial speedup.

When I had this kind of problem in the past I approached it with the 
buffer built-in.  Say I've slurped in a whole not-too-huge binary file 
into `data', and now need to unpack several pieces of it from different 
offsets; rather than:
     somestuff = struct.unpack(fmt, data[offs:offs+struct.calcsize(fmt)])
I can use:
     somestuff = struct.unpack(fmt, buffer(data, offs, 
struct.calcsize(fmt)))
as a kind of "virtual slicing".  Besides the vague-to-me "impending 
deprecation" state of the buffer builtin, there is some advantage, but 
it's a bit modest.  If I could pass data and offs directly to 
struct.unpack and thus avoid churning of one-use readonly buffer 
objects I'd probably be happier.


As for "passing offset implies the length is calcsize(fmt)" 
sub-concept, I find that slightly more controversial.  It's convenient, 
but somewhat ambiguous; in other cases (e.g. string methods) passing a 
start/offset and no end/length means to go to the end.  Maybe something 
more explicit, such as a length= parameter with a default of None 
(meaning "go to the end") but which can be explicitly passed as -1 to 
mean "use calcsize internally", might go down better.


As for the next part:

> the offset argument is an object which contains a single integer field
> which gets incremented inside unpack() to point to the next byte.

...I find this just too "magical".  It's only useful when you're 
specifically unpacking data bytes that are compactly back to back (no 
"filler" e.g. for alignment purposes) and pays some conceptual price -- 
introducing a new specialized type to play the role of "mutable int" 
and having an argument mutated, which is not usual in Python's library.

> so with a new API the above code could be written as
>
>  offset=struct.Offset(0)
>  hdr=unpack("iiii", rec, offset)
>  for i in range(hdr[0]):
>     item=unpack( "IIII", rec, offset)
>
> When an offset argument is provided, unpack() should allow some bytes 
> to
> be left unpacked at the end of the buffer..
>
> Does this suggestion make sense? Any better ideas?

All in all, I suspect that something like...:

# out of the record-by-record loop:
hdrsize = struct.calcsize(hdr_fmt)
itemsize = struct.calcsize(item_fmt)
reclen = length_of_each_record

# loop record by record
while True:
     rec = binfile.read(reclen)
     if not rec:
         break
     hdr = struct.unpack(hdr_fmt, rec, 0, hdrsize)
     for offs in itertools.islice(xrange(hdrsize, reclen, itemsize), 
hdr[0]):
         item = struct.unpack(item_fmt, rec, offs, itemsize)
         # process item

might be a better compromise.  More verbose, because more explicit, of 
course.  And if you do this kind of thing often, easy to encapsulate in 
a generator with 4 parameters -- the two formats (header and item), the 
record length, and the binfile -- just yield the hdr first, then each 
struct.unpack result from the inner loop.
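
Such a generator might look like the following sketch (the four
parameters follow the prose above; struct.unpack_from stands in for the
proposed offset/length arguments):

```python
import struct

def iter_records(binfile, hdr_fmt, item_fmt, reclen):
    """Yield the header tuple of each record, then each of its items."""
    hdr_size = struct.calcsize(hdr_fmt)
    item_size = struct.calcsize(item_fmt)
    while True:
        rec = binfile.read(reclen)
        if not rec:
            break
        hdr = struct.unpack_from(hdr_fmt, rec, 0)
        yield hdr
        # The first header field counts the items, as in the earlier example.
        for i in range(hdr[0]):
            yield struct.unpack_from(item_fmt, rec, hdr_size + i * item_size)
```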

Having the offset and length parameters to struct.unpack might still be 
a performance gain worth pursuing (of course, we'd need some 
performance measurements from real-life use cases) even though from the 
point of view of code simplicity, in this example, there appears to be 
little or no gain wrt slicing rec[offs:offs+itemsize] or using 
buffer(rec, offs, itemsize).


Alex

From anthony at interlink.com.au  Thu Jan  6 11:28:26 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu Jan  6 11:28:21 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <9EF1E97A-5FBD-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <Pine.LNX.4.58.0501042046120.678@bagira>
	<9EF1E97A-5FBD-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <200501062128.28905.anthony@interlink.com.au>

My take on this:

    struct.pack/struct.unpack is already one of my least-favourite parts 
    of the stdlib. Of the modules I use regularly, I pretty much only ever
    have to go back and re-read the struct (and re) documentation because
    they just won't fit in my brain. Adding additional complexity to them 
    seems like a net loss to me. 

    I'd _love_ to find the time to write a sane replacement for struct - as
    well as the current use case, I'd also like it to handle things like
    attribute-length-value 3-tuples nicely (where you get a fixed field 
    which identifies the attribute, a fixed field which specifies the value
    length, and a value of 'length' bytes). Almost all sane network protocols
    (i.e. those written before the plague of pointy brackets) use this in
    some way.

    I'd much rather specify the format as something like a tuple of values -
    (INT, UINT, INT, STRING) (where INT &c are objects defined in the
    struct module). This also then allows users to specify their own formats
    if they have a particular need for something.

Anthony
-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From p.f.moore at gmail.com  Thu Jan  6 12:38:39 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu Jan  6 12:38:41 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <200501062128.28905.anthony@interlink.com.au>
References: <Pine.LNX.4.58.0501042046120.678@bagira>
	<9EF1E97A-5FBD-11D9-ADA4-000A95EFAE9E@aleax.it>
	<200501062128.28905.anthony@interlink.com.au>
Message-ID: <79990c6b050106033816e8ea25@mail.gmail.com>

On Thu, 6 Jan 2005 21:28:26 +1100, Anthony Baxter
<anthony@interlink.com.au> wrote:
> My take on this:
> 
>     struct.pack/struct.unpack is already one of my least-favourite parts
>     of the stdlib. Of the modules I use regularly, I pretty much only ever
>     have to go back and re-read the struct (and re) documentation because
>     they just won't fit in my brain. Adding additional complexity to them
>     seems like a net loss to me.

Have you looked at Thomas Heller's ctypes? Ignoring the FFI stuff, it
has a fairly comprehensive interface for defining and using C
structure types. A simple example:

>>> class POINT(Structure):
...    _fields_ = [('x', c_int), ('y', c_int)]
...
>>> p = POINT(1,2)
>>> p.x, p.y
(1, 2)
>>> str(buffer(p))
'\x01\x00\x00\x00\x02\x00\x00\x00'

To convert *from* a byte string is messier, but not too bad:

>>> s = str(buffer(p))
>>> s
'\x01\x00\x00\x00\x02\x00\x00\x00'
>>> p2 = POINT()
>>> ctypes.memmove(p2, s, ctypes.sizeof(POINT))
14688904
>>> p2.x, p2.y
(1, 2)

It might even be possible to get Thomas to add a small helper
classmethod to ctypes types, something like

    POINT.unpack(str, offset=0, length=None)

which does the equivalent of

    def unpack(cls, str, offset=0, length=None):
        if length is None:
            length=sizeof(cls)
        b = buffer(str, offset, length)
        new = cls()
        ctypes.memmove(new, b, length)
        return new
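
Current ctypes does in fact provide a classmethod along these lines,
from_buffer_copy(source, offset=0), with the length fixed at
sizeof(cls); a round trip on the POINT example looks like this:

```python
import ctypes
import struct

class POINT(ctypes.Structure):
    _fields_ = [('x', ctypes.c_int), ('y', ctypes.c_int)]

# Native-endian bytes, matching the in-memory layout of POINT.
raw = struct.pack('ii', 1, 2)

p = POINT.from_buffer_copy(raw)  # bytes -> structure
round_trip = bytes(p)            # structure -> bytes again
```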

>     I'd _love_ to find the time to write a sane replacement for struct - as
>     well as the current use case, I'd also like it to handle things like
>     attribute-length-value 3-tuples nicely (where you get a fixed field
>     which identifies the attribute, a fixed field which specifies the value
>     length, and a value of 'length' bytes). Almost all sane network protocols
>     (i.e. those written before the plague of pointy brackets) use this in
>     some way.

I'm not sure ctypes handles that, mainly because I don't think C does
(without the usual trick of defining the last field as fixed length)

Paul.
From theller at python.net  Thu Jan  6 13:22:52 2005
From: theller at python.net (Thomas Heller)
Date: Thu Jan  6 13:21:32 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <79990c6b050106033816e8ea25@mail.gmail.com> (Paul Moore's
	message of "Thu, 6 Jan 2005 11:38:39 +0000")
References: <Pine.LNX.4.58.0501042046120.678@bagira>
	<9EF1E97A-5FBD-11D9-ADA4-000A95EFAE9E@aleax.it>
	<200501062128.28905.anthony@interlink.com.au>
	<79990c6b050106033816e8ea25@mail.gmail.com>
Message-ID: <4qhuzqzn.fsf@python.net>

Paul Moore <p.f.moore@gmail.com> writes:

> On Thu, 6 Jan 2005 21:28:26 +1100, Anthony Baxter
> <anthony@interlink.com.au> wrote:
>> My take on this:
>> 
>>     struct.pack/struct.unpack is already one of my least-favourite parts
>>     of the stdlib. Of the modules I use regularly, I pretty much only ever
>>     have to go back and re-read the struct (and re) documentation because
>>     they just won't fit in my brain. Adding additional complexity to them
>>     seems like a net loss to me.
>
> Have you looked at Thomas Heller's ctypes? Ignoring the FFI stuff, it
> has a fairly comprehensive interface for defining and using C
> structure types. A simple example:
>
>>>> class POINT(Structure):
> ...    _fields_ = [('x', c_int), ('y', c_int)]
> ...
>>>> p = POINT(1,2)
>>>> p.x, p.y
> (1, 2)
>>>> str(buffer(p))
> '\x01\x00\x00\x00\x02\x00\x00\x00'
>
> To convert *from* a byte string is messier, but not too bad:
[...]

For reading structures from files, the undocumented (*) readinto
method is very nice. An example:

class IMAGE_DOS_HEADER(Structure):
    ....
class IMAGE_NT_HEADERS(Structure):
    ....

class PEReader(object):
    def read_image(self, pathname):
        ################
        # the MSDOS header
        image = open(pathname, "rb")
        self.dos_header = IMAGE_DOS_HEADER()
        image.readinto(self.dos_header)

        ################
        # The PE header
        image.seek(self.dos_header.e_lfanew)
        self.nt_headers = IMAGE_NT_HEADERS()
        image.readinto(self.nt_headers)
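
A self-contained miniature of the same pattern (the structure fields
here are an invented stand-in for IMAGE_DOS_HEADER, and io.BytesIO
plays the open file):

```python
import ctypes
import io

class DOSHeaderStub(ctypes.Structure):
    # Invented two-field stand-in for a real on-disk header.
    _fields_ = [('e_magic', ctypes.c_uint16), ('e_lfanew', ctypes.c_uint32)]

# Fake "file" containing one header, serialized from an instance.
src = io.BytesIO(bytes(DOSHeaderStub(0x5A4D, 128)))

hdr = DOSHeaderStub()
src.readinto(hdr)   # fills the structure's memory directly, no unpack step
```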


> It might even be possible to get Thomas to add a small helper
> classmethod to ctypes types, something like
>
>     POINT.unpack(str, offset=0, length=None)

Maybe, but I would prefer the unbeloved buffer object (*) as argument,
because it has builtin offset and length.

> which does the equivalent of
>
>     def unpack(cls, str, offset=0, length=None):
>         if length is None:
>             length=sizeof(cls)
>         b = buffer(str, offset, length)
>         new = cls()
>         ctypes.memmove(new, b, length)
>         return new
>
>>     I'd _love_ to find the time to write a sane replacement for struct - as
>>     well as the current use case, I'd also like it to handle things like
>>     attribute-length-value 3-tuples nicely (where you get a fixed field
>>     which identifies the attribute, a fixed field which specifies the value
>>     length, and a value of 'length' bytes). Almost all sane network protocols
>>     (i.e. those written before the plague of pointy brackets) use this in
>>     some way.
>
> I'm not sure ctypes handles that, mainly because I don't think C does
> (without the usual trick of defining the last field as fixed length)

Correct.

(*) Which brings me to the questions I have had in my mind for quite
some time: why is readinto undocumented, and what about the status of
the buffer object? Do the recent fixes to the buffer object change its
status?

Thomas

From ncoghlan at iinet.net.au  Thu Jan  6 13:53:48 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Thu Jan  6 13:53:52 2005
Subject: [Python-Dev] Subscribing to PEP updates
Message-ID: <41DD34DC.5010005@iinet.net.au>

Someone asked on python-list about getting notifications of changes to PEP's.

As a low-effort solution, would it be possible to add a Sourceforge mailing list 
hook just for checkins to the nondist/peps directory?

Call it python-pep-updates or some such beast. If I remember correctly how 
checkin notifications work, the updates would even come with automatic diffs :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From Jack.Jansen at cwi.nl  Thu Jan  6 14:04:34 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Thu Jan  6 14:04:46 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <41DC7D20.4000901@v.loewis.de>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>	<41DB2C9A.4070800@v.loewis.de>	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl>
	<41DC7D20.4000901@v.loewis.de>
Message-ID: <834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl>


On 6 Jan 2005, at 00:49, Martin v. Löwis wrote:
>> The "new" solution is basically to go back to the Unix way of 
>> building  an extension: link it against nothing and sort things out 
>> at runtime.  Not my personal preference, but at least we know that 
>> loading an  extension into one Python won't bring in a fresh copy of 
>> a different  interpreter or anything horrible like that.
>
> This sounds good, except that it only works on OS X 10.3, right?
> What about older versions?

10.3 or later. For older OSX releases (either because you build Python 
on 10.2 or earlier, or because you've set MACOSX_DEPLOYMENT_TARGET to a 
value of 10.2 or less) we use the old behaviour of linking with 
"-framework Python".
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

From mwh at python.net  Thu Jan  6 14:17:40 2005
From: mwh at python.net (Michael Hudson)
Date: Thu Jan  6 14:17:42 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <Pine.LNX.4.58.0501042046120.678@bagira> (Ilya Sandler's
	message of "Wed, 5 Jan 2005 21:27:16 -0800 (PST)")
References: <Pine.LNX.4.58.0501042046120.678@bagira>
Message-ID: <2mr7kyhf2j.fsf@starship.python.net>

Ilya Sandler <ilya@bluefir.net> writes:

> A problem:
>
> The current struct.unpack api works well for unpacking C-structures where
> everything is usually unpacked at once, but it
> becomes  inconvenient when unpacking binary files where things
> often have to be unpacked field by field. Then one has to keep track
> of offsets, slice the strings,call struct.calcsize(), etc...

IMO (and E), struct.unpack is the primitive atop which something more
sensible is built.  I've certainly tried to build that more sensible
thing at least once, but haven't ever got to the point of believing what
I had would be applicable to the general case... maybe it's time to
write such a thing for the standard library.

Cheers,
mwh

-- 
  ARTHUR:  Ford, you're turning into a penguin, stop it.
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 2
From goodger at python.org  Thu Jan  6 15:01:42 2005
From: goodger at python.org (David Goodger)
Date: Thu Jan  6 15:01:55 2005
Subject: [Python-Dev] Re: Subscribing to PEP updates
In-Reply-To: <41DD34DC.5010005@iinet.net.au>
References: <41DD34DC.5010005@iinet.net.au>
Message-ID: <41DD44C6.4010909@python.org>

[Nick Coghlan]
> Someone asked on python-list about getting notifications of changes to
> PEP's.
>
> As a low-effort solution, would it be possible to add a Sourceforge
> mailing list hook just for checkins to the nondist/peps directory?

-0

Probably possible, but not no-effort, so even if it gets a favorable
reaction someone needs to do some work.  Why not just subscribe to
python-checkins and filter out everything *but* nondist/peps?  As PEP
editor, that's what I do (although I filter manually/visually, since
I'm also interested in other checkins).

--
David Goodger <http://python.net/~goodger>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 253 bytes
Desc: OpenPGP digital signature
Url : http://mail.python.org/pipermail/python-dev/attachments/20050106/adad03d4/signature.pgp
From gjc at inescporto.pt  Thu Jan  6 16:21:36 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Thu Jan  6 16:22:08 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <2mr7kyhf2j.fsf@starship.python.net>
References: <Pine.LNX.4.58.0501042046120.678@bagira>
	<2mr7kyhf2j.fsf@starship.python.net>
Message-ID: <1105024896.25031.12.camel@localhost>

On Thu, 2005-01-06 at 13:17 +0000, Michael Hudson wrote:
> Ilya Sandler <ilya@bluefir.net> writes:
> 
> > A problem:
> >
> > The current struct.unpack api works well for unpacking C-structures where
> > everything is usually unpacked at once, but it
> > becomes  inconvenient when unpacking binary files where things
> > often have to be unpacked field by field. Then one has to keep track
> > of offsets, slice the strings,call struct.calcsize(), etc...
> 
> IMO (and E), struct.unpack is the primitive atop which something more
> sensible is built.  I've certainly tried to build that more sensible
> thing at least once, but haven't ever got the point of believing what
> I had would be applicable to the general case... maybe it's time to
> write such a thing for the standard library.

  I've been using this simple wrapper:

def stream_unpack(stream, format):
	return struct.unpack(format, stream.read(struct.calcsize(format)))

  It works with file-like objects, such as file, StringIO,
socket.makefile(), etc.  Working with streams is useful because
sometimes you don't know how much you need to read to decode a message
in advance.
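As an illustrative sketch (not from the original mail: ``BytesIO`` stands in
for a real file or socket stream, and the field names are invented), the
wrapper removes all offset bookkeeping when decoding field by field:

```python
import struct
from io import BytesIO  # stand-in for file, StringIO, socket.makefile(), ...

def stream_unpack(stream, format):
    # Read exactly the bytes the format requires, then unpack them --
    # no manual offsets, slicing, or calcsize() calls at the call site.
    return struct.unpack(format, stream.read(struct.calcsize(format)))

# Decode a tiny "message" field by field, as when parsing a binary file.
stream = BytesIO(struct.pack('<IH', 1234, 56))
(magic,) = stream_unpack(stream, '<I')    # little-endian unsigned int
(count,) = stream_unpack(stream, '<H')    # little-endian unsigned short
```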

  Regards.

> 
> Cheers,
> mwh
> 
-- 
Gustavo J. A. M. Carneiro
<gjc@inescporto.pt> <gustavo@users.sourceforge.net>
The universe is always one step beyond logic.
From martin at v.loewis.de  Thu Jan  6 17:05:05 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Jan  6 17:04:59 2005
Subject: [Python-Dev] csv module TODO list
In-Reply-To: <20050106011055.001163C8E5@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com>
	<20050105093414.00DFF3C8E5@coffee.object-craft.com.au>
	<41DC637A.5050105@v.loewis.de>
	<20050106011055.001163C8E5@coffee.object-craft.com.au>
Message-ID: <41DD61B1.1030507@v.loewis.de>

Andrew McNamara wrote:
> Marc-Andre Lemburg mentioned that he has encountered UTF-16 encoded csv
> files, so a reasonable starting point would be the ability to read and
> parse, as well as the ability to generate, one of these.

I see. That would be reasonable, indeed. Notice that this is not so much
a "Unicode issue", but more an "encoding" issue. If you solve the
"arbitrary encodings" problem, you solve UTF-16 as a side effect.

> The reader interface currently returns a row at a time, consuming as many
> lines from the supplied iterable (with the most common iterable being
> a file). This suggests to me that we will need an optional "encoding"
> argument to the reader constructor, and that the reader will need to
> decode the source lines.

Ok. In this context, I see two possible implementation strategies:
1. Implement the csv module two times: once for bytes, and once for
    Unicode characters. It is likely that the source code would be
    the same for each case; you just need to make sure the "Dialect
    and Formatting Parameters" change their width accordingly.
    If you use the SRE approach, you would do

    #define CSV_ITEM_T char
    #define CSV_NAME_PREFIX byte_
    #include "csvimpl.c"
    #undef CSV_ITEM_T
    #undef CSV_NAME_PREFIX
    #define CSV_ITEM_T Py_UNICODE
    #define CSV_NAME_PREFIX unicode_
    #include "csvimpl.c"

2. Use just the existing _csv module, and represent non-byte encodings
    as UTF-8. This will work as long as the delimiters and other markup
    characters have always a single byte in UTF-8, which is the case
    for "':\, as well as for \r and \n. Then, when processing using
    an explicit encoding, first convert the input into Unicode objects.
    Then encode the Unicode objects into UTF-8, and pass it to _csv.
    For the results you get back, convert each element back from UTF-8
    to a Unicode object.

This could be implemented as

def reader(f, encoding=None):
    if encoding is None: return _csv.reader(f)
    enc, dec, stream_reader, stream_writer = codecs.lookup(encoding)
    utf8_enc, utf8_dec, utf8_r, utf8_w = codecs.lookup("UTF-8")
    # Make a recoder which can only read: decode the input encoding,
    # re-encode as UTF-8 before handing lines to _csv
    utf8_stream = codecs.StreamRecoder(f, utf8_enc, None, stream_reader, None)
    csv_reader = _csv.reader(utf8_stream)
    # For performance reasons, map_result could be implemented in C
    def map_result(t):
        result = [None]*len(t)
        for i, val in enumerate(t):
            result[i] = utf8_dec(val)[0]  # decoder returns (value, length)
        return tuple(result)
    return itertools.imap(map_result, csv_reader)
# This code is untested

This approach has the disadvantage of performing three recodings:
from input charset to Unicode, from Unicode to UTF-8, from UTF-8
to Unicode. One could:
- skip the initial recoding if the encoding is already known
   to be _csv-safe (i.e. if it is a pure ASCII superset).
   This would be valid for ASCII, iso-8859-n, UTF-8, ...
- offer the user to keep the results in the input encoding,
   instead of always returning Unicode objects.

Apart from this disadvantage, I think this gives people what they want:
they can specify the encoding of the input, and they get the results not
only csv-separated, but also unicode-decoded. This is the same approach
that is used for Python source code encodings: the source is first
recoded into UTF-8, then parsed, then recoded back.

> That said, I'm hardly a unicode expert, so I
> may be overlooking something (could a utf-16 encoded character span a
> line break, for example).

This cannot happen: \r, in UTF-16, is also 2 bytes (0D 00, if UTF-16LE).
There is the issue that Unicode has additional line break characters,
but that is probably irrelevant here.
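A quick sketch confirming that claim (bytes literals and the codec name as
spelled in modern Python; not part of the original mail):

```python
# '\r' in UTF-16LE is the two-byte code unit 0D 00, so a CR cannot be
# half of some other character's code unit at a line boundary (for
# characters in the Basic Multilingual Plane).
cr = u'\r'.encode('utf-16-le')
```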

Regards,
Martin
From ajm at flonidan.dk  Thu Jan  6 17:22:12 2005
From: ajm at flonidan.dk (Anders J. Munch)
Date: Thu Jan  6 17:22:37 2005
Subject: [Python-Dev] csv module TODO list 
Message-ID: <6D9E824FA10BD411BE95000629EE2EC3C6DE3C@FLONIDAN-MAIL>

Andrew McNamara wrote:
> 
> I'm not altogether sure there. The parsing state machine is all
> written in C, and deals with signed chars - I expect we'll need two
> versions of that (or one version that's compiled twice using
> pre-processor macros). Quite a large job. Suggestions gratefully
> received.

How about using UTF-8 internally?  Change nothing in _csv.c, but in
csv.py encode/decode any unicode strings into UTF-8 on the way to/from
_csv.  File-like objects passed in by the user can be wrapped in
proxies that take care of encoding and decoding user strings, as well
as trans-coding between UTF-8 and the user's chosen file encoding.

All that coding work may slow things down, but your original fast _csv
module will still be there when you need it.

- Anders
From mchermside at ingdirect.com  Thu Jan  6 17:33:54 2005
From: mchermside at ingdirect.com (Chermside, Michael)
Date: Thu Jan  6 17:33:58 2005
Subject: [Python-Dev] Re: Subscribing to PEP updates
Message-ID: <0CFFADBB825C6249A26FDF11C1772AE101EDEB49@ingdexj1.ingdirect.com>

> Why not just subscribe to
> python-checkins and filter out everything *but* nondist/peps?

But there are lots of people who might be interested in
following PEP updates but not other checkins. Pretty
much anyone who considers themselves a "user" of Python
not a developer. Perhaps they don't even know C. That's a
lot to filter through for such people. (After all, I
sure HOPE that only a small fraction of checkins are for
PEPs not code.)

I'm +0 on it... but I'll mention that if such a list were
created I'd subscribe. So maybe that's +0.2 instead.

-- Michael Chermside




From gvanrossum at gmail.com  Thu Jan  6 18:13:54 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Jan  6 18:13:57 2005
Subject: [Python-Dev] Re: super() harmful?
In-Reply-To: <091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
	<ca471dc205010418022db8c838@mail.gmail.com>
	<A9AD2A6E-5F3A-11D9-8D68-000A95A50FB2@fuhm.net>
	<ca471dc2050105102328387030@mail.gmail.com>
	<9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net>
	<ca471dc20501051536c4fd618@mail.gmail.com>
	<091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net>
Message-ID: <ca471dc2050106091357a5c36b@mail.gmail.com>

> Please notice that I'm talking about concrete, real issues, not just a
> "super is bad!" rant.

Then why is the title "Python's Super Considered Harmful" ???

Here's my final offer.  Change the title to something like "Multiple
Inheritance Pitfalls in Python" and nobody will get hurt.

> They are not inherent in cooperative
> multiple inheritance, but occur mostly because of its late addition to python,

Would you rather not have seen it (== cooperative inheritance) added at all?

> and the cumbersome way in which you have to invoke super.

Given Python's dynamic nature I couldn't think of a way to make it
less cumbersome. I see you tried (see below) and couldn't either. At
this point I tend to say "put up or shut up."

> I wrote up the page as part of an investigation into converting Twisted
> to use super. I thought it would be a good idea to do the conversion,
> but others told me it would be a bad idea for backwards compatibility
> reasons. I did not believe, at first, and conducted experiments. In the
> end, I concluded that it is not possible, because of the issues with
> mixing the new and old paradigm.

So it has nothing to do with the new paradigm, just with backwards
compatibility. I appreciate those issues (more than you'll ever know)
but I don't see why you should try to discourage others from using the
new paradigm, which is what your article appears to do.

> Leaving behind the backwards compatibility issues...
> 
> In order to make super really nice, it should be easier to use right.
> Again, the two major issues that cause problems are: 1) having to
> declare every method with *args, **kwargs, and having to pass those and
> all the arguments you take explicitly to super,

That's only an issue with __init__ or with code written without
cooperative MI in mind. When using cooperative MI, you shouldn't
redefine method signatures, and all is well.

> and 2) that
> traditionally __init__ is called with positional arguments.

Cooperative MI doesn't have a really good solution for __init__.
Defining and calling __init__ only with keyword arguments is a good
solution. But griping about "traditionally" is a backwards
compatibility issue, which you said you were leaving behind.

> To fix #1, it would be really nice if you could write code something
> like the following snippet. Notice especially here that the 'bar'
> argument gets passed through C.__init__ and A.__init__, into
> D.__init__, without the previous two having to do anything about it.
> However, if you ask me to detail how this could *possibly* *ever* work
> in python, I have no idea. Probably the answer is that it can't.

Exactly. What is your next_method statement supposed to do?

No need to reply except when you've changed the article. I'm tired of
the allegations.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From foom at fuhm.net  Thu Jan  6 18:22:36 2005
From: foom at fuhm.net (James Y Knight)
Date: Thu Jan  6 18:22:34 2005
Subject: [Python-Dev] buffer objects [was: an idea for improving
	struct.unpack api]
In-Reply-To: <4qhuzqzn.fsf@python.net>
References: <Pine.LNX.4.58.0501042046120.678@bagira>
	<9EF1E97A-5FBD-11D9-ADA4-000A95EFAE9E@aleax.it>
	<200501062128.28905.anthony@interlink.com.au>
	<79990c6b050106033816e8ea25@mail.gmail.com>
	<4qhuzqzn.fsf@python.net>
Message-ID: <8F37C34F-6007-11D9-8D68-000A95A50FB2@fuhm.net>

On Jan 6, 2005, at 7:22 AM, Thomas Heller wrote:
> (*) Which brings me to the questions I have in my mind for quite some
> time: Why is readinto undocumented, and what about the status of the
> buffer object: do the recent fixes to the buffer object change it's
> status?

I, for one, would be very unhappy if the byte buffer object were to go 
away. It's quite useful.

I didn't even realize readinto existed. It'd be great to add more of 
them. os.readinto for reading from fds and socket.socket.recvinto for 
reading from sockets. Is there any reason the writable buffer interface 
isn't exposed to python-land?

James

From pje at telecommunity.com  Thu Jan  6 18:44:58 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan  6 18:46:05 2005
Subject: [Python-Dev] Re: super() harmful?
In-Reply-To: <091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net>
References: <ca471dc20501051536c4fd618@mail.gmail.com>
	<ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
	<ca471dc205010418022db8c838@mail.gmail.com>
	<A9AD2A6E-5F3A-11D9-8D68-000A95A50FB2@fuhm.net>
	<ca471dc2050105102328387030@mail.gmail.com>
	<9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net>
	<ca471dc20501051536c4fd618@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050106122714.02135920@mail.telecommunity.com>

At 02:46 AM 1/6/05 -0500, James Y Knight wrote:
>To fix #1, it would be really nice if you could write code something like 
>the following snippet. Notice especially here that the 'bar' argument gets 
>passed through C.__init__ and A.__init__, into D.__init__, without the 
>previous two having to do anything about it. However, if you ask me to 
>detail how this could *possibly* *ever* work in python, I have no idea. 
>Probably the answer is that it can't.
>
>class A(object):
>     def __init__(self):
>         print "A"
>         next_method
>
>class B(object):
>     def __init__(self):
>         print "B"
>         next_method

Not efficiently, no, but it's *possible*.  Just write a 'next_method()' 
routine that walks the frame stack and self's MRO, looking for a 
match.  You know the method name from f_code.co_name, and you can check 
each class' __dict__ until you find a function or classmethod object whose 
code is f_code.  If not, move up to the next frame and try again.    Once 
you know the class that the function comes from, you can figure out the 
"next" method, and pull its args from the calling frame's args, walking 
backward to other calls on the same object, until you find all the args you 
need.  Oh, and don't forget to make sure that you're inspecting frames that 
have the same 'self' object.

Of course, the result would be a hideous evil ugly hack that should never 
see the light of day, but you could *do* it, if you *really really* wanted 
to.  And if you wrote it in C, it might be only 50 or 100 times slower than 
super().  :)
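A rough sketch of the frame-walking part of that idea (the
``find_defining_class`` name and all details are invented here; it covers
only locating the defining class, not reconstructing arguments):

```python
import sys

def find_defining_class(obj, frame=None):
    """Walk the frame stack, matching each frame's code object against
    methods found in obj's MRO, to find the class whose method is
    currently executing -- the starting point a next_method would need."""
    frame = frame or sys._getframe(1)
    while frame is not None:
        code = frame.f_code
        for cls in type(obj).__mro__:
            func = cls.__dict__.get(code.co_name)
            # unwrap classmethod/staticmethod wrappers if present
            func = getattr(func, '__func__', func)
            if getattr(func, '__code__', None) is code:
                return cls
        frame = frame.f_back
    return None

class A(object):
    def where(self):
        return find_defining_class(self)

class B(A):
    pass

defining = B().where()  # the method executing is defined on A
```

As Phillip says, this is a hideous hack: it is slow, and it breaks down with
decorated methods or duplicated code objects.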

From barry at python.org  Thu Jan  6 19:01:51 2005
From: barry at python.org (Barry Warsaw)
Date: Thu Jan  6 19:02:02 2005
Subject: [Python-Dev] Re: Subscribing to PEP updates
In-Reply-To: <0CFFADBB825C6249A26FDF11C1772AE101EDEB49@ingdexj1.ingdirect.com>
References: <0CFFADBB825C6249A26FDF11C1772AE101EDEB49@ingdexj1.ingdirect.com>
Message-ID: <1105034511.10728.3.camel@geddy.wooz.org>

On Thu, 2005-01-06 at 11:33, Chermside, Michael wrote:
> > Why not just subscribe to
> > python-checkins and filter out everything *but* nondist/peps?
> 
> But there are lots of people who might be interested in
> following PEP updates but not other checkins. Pretty
> much anyone who considers themselves a "user" of Python
> not a developer. Perhaps they don't even know C. That's a
> lot to filter through for such people. (After all, I
> sure HOPE that only a small fraction of checkins are for
> PEPs not code.)
> 
> I'm +0 on it... but I'll mention that if such a list were
> created I'd subscribe. So maybe that's +0.2 instead.

As an experiment, I just added a PEP topic to the python-checkins
mailing list.  You could subscribe to this list and just select the PEP
topic (which matches the regex "PEP" in the Subject header or first few
lines of the body).

Give it a shot and let's see if that does the trick.
-Barry

From tjreedy at udel.edu  Thu Jan  6 20:16:23 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu Jan  6 20:16:38 2005
Subject: [Python-Dev] Re: super() harmful?
References: <ca471dc2050104102814be915b@mail.gmail.com><1f7befae05010410576effd024@mail.gmail.com><20050104154707.927B.JCARLSON@uci.edu><ca471dc205010418022db8c838@mail.gmail.com><A9AD2A6E-5F3A-11D9-8D68-000A95A50FB2@fuhm.net><ca471dc2050105102328387030@mail.gmail.com><9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net><ca471dc20501051536c4fd618@mail.gmail.com>
	<091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net>
Message-ID: <crk2qf$o74$1@sea.gmane.org>


"James Y Knight" <foom@fuhm.net> wrote in message 
news:091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net...
> Please notice that I'm talking about concrete, real issues, not just a 
> "super is bad!" rant.

Umm, James, come on.  Let's be really real and concrete ;-).

Your title "Python's Super Considered Harmful" is an obvious reference to 
and takeoff on Dijkstra's influential polemic "Goto Considered Harmful".

To me, the obvious message therefore is that super(), like goto, is an 
ill-conceived monstrosity that warps peoples' minds and should be banished. 
I can also see a slight dig at Guido for introducing such a thing decades 
after Dijkstra taught us to know better.

If that is your summary message for me, fine.  If not, try something else. 
The title of a piece is part of its message -- especially when it has an 
intelligible meaning.  For people who read the title in, for instance, a 
clp post (as I did), but don't follow the link and read what is behind the 
title (which I did do), the title *is* the message.

Terry J. Reedy



From janssen at parc.com  Thu Jan  6 20:25:34 2005
From: janssen at parc.com (Bill Janssen)
Date: Thu Jan  6 20:25:49 2005
Subject: [Python-Dev] Re: super() harmful? 
In-Reply-To: Your message of "Thu, 06 Jan 2005 09:13:54 PST."
	<ca471dc2050106091357a5c36b@mail.gmail.com> 
Message-ID: <05Jan6.112539pst."58617"@synergy1.parc.xerox.com>

> Then why is the title "Python's Super Considered Harmful" ???
> 
> Here's my final offer.  Change the title to something like "Multiple
> Inheritance Pitfalls in Python" and nobody will get hurt.

Or better yet, considering the recent thread on Python marketing,
"Multiple Inheritance Mastery in Python" :-).

Bill
From bac at OCF.Berkeley.EDU  Thu Jan  6 20:29:45 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Jan  6 20:30:05 2005
Subject: [Python-Dev] Subscribing to PEP updates
In-Reply-To: <41DD34DC.5010005@iinet.net.au>
References: <41DD34DC.5010005@iinet.net.au>
Message-ID: <41DD91A9.1000901@ocf.berkeley.edu>

Nick Coghlan wrote:
> Someone asked on python-list about getting notifications of changes to 
> PEP's.
> 
> As a low-effort solution, would it be possible to add a Sourceforge 
> mailing list hook just for checkins to the nondist/peps directory?
> 
> Call it python-pep-updates or some such beast. If I remember how checkin 
> notifications work correctly, the updates would even come with automatic 
> diffs :)
> 

Probably not frequent or comprehensive enough, but I try to always have at 
least a single news item that clumps all PEP updates that python-dev gets 
notified about.

-Brett
From bob at redivi.com  Thu Jan  6 20:38:56 2005
From: bob at redivi.com (Bob Ippolito)
Date: Thu Jan  6 20:39:04 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <2mr7kyhf2j.fsf@starship.python.net>
References: <Pine.LNX.4.58.0501042046120.678@bagira>
	<2mr7kyhf2j.fsf@starship.python.net>
Message-ID: <9AB33D7A-601A-11D9-AB1C-000A95BA5446@redivi.com>


On Jan 6, 2005, at 8:17, Michael Hudson wrote:

> Ilya Sandler <ilya@bluefir.net> writes:
>
>> A problem:
>>
>> The current struct.unpack api works well for unpacking C-structures 
>> where
>> everything is usually unpacked at once, but it
>> becomes  inconvenient when unpacking binary files where things
>> often have to be unpacked field by field. Then one has to keep track
>> of offsets, slice the strings,call struct.calcsize(), etc...
>
> IMO (and E), struct.unpack is the primitive atop which something more
> sensible is built.  I've certainly tried to build that more sensible
> thing at least once, but haven't ever got the point of believing what
> I had would be applicable to the general case... maybe it's time to
> write such a thing for the standard library.

This is my ctypes-like attempt at a high-level interface for struct.  
It works well for me in macholib:  
http://svn.red-bean.com/bob/py2app/trunk/src/macholib/ptypes.py

-bob

From tim.peters at gmail.com  Thu Jan  6 20:47:35 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Jan  6 20:47:37 2005
Subject: [Python-Dev] Re: super() harmful?
In-Reply-To: <4494809200119597707@unknownmsgid>
References: <ca471dc2050106091357a5c36b@mail.gmail.com>
	<4494809200119597707@unknownmsgid>
Message-ID: <1f7befae05010611474d76bebd@mail.gmail.com>

[Guido]
>> Then why is the title "Python's Super Considered Harmful" ???
>>
>> Here's my final offer.  Change the title to something like "Multiple
>> Inheritance Pitfalls in Python" and nobody will get hurt.

[Bill Janssen]
> Or better yet, considering the recent thread on Python marketing,
> "Multiple Inheritance Mastery in Python" :-).

I'm sorry, but that's not good marketing -- it contains big words, and
putting the brand name last is ineffective.  How about

    Python's Super() is Super -- Over 1528.7% Faster than C!

BTW, it's important that fractional percentages end with an odd digit.
 Research shows that if the last digit is even, 34.1% of consumers
tend to suspect the number was made up.
From bac at OCF.Berkeley.EDU  Thu Jan  6 20:50:22 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Jan  6 20:50:40 2005
Subject: [Python-Dev] proto-pep: How to change CPython's bytecode
In-Reply-To: <41CC7F67.9070009@ocf.berkeley.edu>
References: <41CC7F67.9070009@ocf.berkeley.edu>
Message-ID: <41DD967E.3040405@ocf.berkeley.edu>

OK, latest update with all suggest revisions (mention this is for CPython, 
section for known previous bytecode work).

If no one has any revisions I will submit to David for official PEP acceptance 
this weekend.

----------------------------------

PEP: XXX
Title: How to change CPython's bytecode
Version: $Revision: 1.4 $
Last-Modified: $Date: 2003/09/22 04:51:50 $
Author: Brett Cannon <brett@python.org>
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: XX-XXX-XXXX
Post-History: XX-XXX-XXXX


Abstract
========

Python source code is compiled down to something called bytecode.  This
bytecode must implement enough semantics to perform the actions required by the
Language Reference [#lang-ref]_.  As such, knowing how to properly add, remove,
or change the bytecode is important when changing the abilities of the Python
language.
This PEP covers how to accomplish this in the CPython implementation of the
language (referred to simply as "Python" for the rest of this PEP).


Rationale
=========

While changing Python's bytecode is not a frequent occurrence, it still happens.
Having the required steps documented in a single location should make
experimentation with the bytecode easier since it is not necessarily obvious
what the steps are to change the bytecode.

This PEP, paired with PEP 306 [#PEP-306]_, should provide enough basic
guidelines for handling any changes performed to the Python language itself in
terms of syntactic changes that introduce new semantics.


Checklist
=========

This is a rough checklist of the files that need to change and how they are
involved with the bytecode.  All paths are given relative to
``/cvsroot/python/dist/src`` in CVS.  This list should not be considered
exhaustive nor to cover all possible situations.

- ``Include/opcode.h``
	This include file lists all known opcodes and associates each opcode
	name with a unique number.  When adding a new opcode it is important
	to take note of the ``HAVE_ARGUMENT`` value; every opcode with a
	value greater than or equal to ``HAVE_ARGUMENT`` is expected to take
	an argument.

- ``Lib/opcode.py``
	Lists all of the opcodes and their associated value.  Used by the dis
	module [#dis]_ to map bytecode values to their names.

- ``Python/ceval.c``
	Contains the main interpreter loop.  Code to handle the evaluation of
	an opcode goes here.

- ``Python/compile.c``
	All emitting of bytecode occurs here, so this file must be altered
	for a new opcode to actually be generated.

- ``Lib/compiler/pyassem.py``, ``Lib/compiler/pycodegen.py``
	The 'compiler' package [#compiler]_ needs to be altered to also reflect
	any changes to the bytecode.

- ``Doc/lib/libdis.tex``
	The documentation [#opcode-list]_ for the dis module contains a
	complete list of all the opcodes.

- ``Python/import.c``
	Defines the magic number (named ``MAGIC``) used in .pyc files to
	detect whether the bytecode they contain matches that used by the
	running version of Python.  This number needs to be changed so that
	the running interpreter does not try to execute bytecode that it
	does not know about.
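The name/number mapping from the checklist above can be inspected from Python
itself; a small sketch (a modern CPython is assumed):

```python
import opcode

# Lib/opcode.py exposes the same name <-> number mapping that
# Include/opcode.h defines, along with the HAVE_ARGUMENT cutoff:
# opcodes numbered at or above it are expected to take an argument.
load_const = opcode.opmap['LOAD_CONST']
name_back = opcode.opname[load_const]
takes_argument = load_const >= opcode.HAVE_ARGUMENT
```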


Suggestions for bytecode development
====================================

A few things can be done to make sure that development goes smoothly when
experimenting with Python's bytecode.  One is to delete all .py(c|o) files
after each semantic change to ``Python/compile.c``.  That way all files will
be recompiled with any bytecode changes.
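That cleanup step can be sketched as a small helper (the ``remove_bytecode``
name and the walk-and-delete approach are illustrative, not part of the PEP):

```python
import os

def remove_bytecode(root):
    """Delete every .pyc/.pyo file under root so that each module is
    recompiled with the changed bytecode on next import."""
    removed = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(('.pyc', '.pyo')):
                os.remove(os.path.join(dirpath, name))
                removed += 1
    return removed
```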

Make sure to run the entire testing suite [#test-suite]_.  Since the
``regrtest.py`` driver recompiles all source code before a test is run, it
acts as a good check that no existing semantics are broken.

Running parrotbench [#parrotbench]_ is also a good way to make sure existing
semantics are not broken; this benchmark is practically a compliance test.


Previous experiments
====================
Skip Montanaro presented a paper on a peephole optimizer at a Python workshop
[#skip-peephole]_.

Michael Hudson has a non-active SourceForge project named Bytecodehacks
[#Bytecodehacks]_ that provides functionality for playing with bytecode
directly.


References
==========

.. [#lang-ref] Python Language Reference, van Rossum & Drake
    (http://docs.python.org/ref/ref.html)

.. [#PEP-306] PEP 306, How to Change Python's Grammar, Hudson
    (http://www.python.org/peps/pep-0306.html)

.. [#dis] dis Module
    (http://docs.python.org/lib/module-dis.html)

.. [#test-suite] 'test' package
    (http://docs.python.org/lib/module-test.html)

.. [#parrotbench] Parrotbench
    (ftp://ftp.python.org/pub/python/parrotbench/parrotbench.tgz,
    http://mail.python.org/pipermail/python-dev/2003-December/041527.html)

.. [#opcode-list] Python Byte Code Instructions
    (http://docs.python.org/lib/bytecodes.html)

.. [#skip-peephole]
    http://www.foretec.com/python/workshops/1998-11/proceedings/papers/montanaro/montanaro.html

.. [#Bytecodehacks]
    http://bytecodehacks.sourceforge.net/bch-docs/bch/index.html


Copyright
=========

This document has been placed in the public domain.



..
    Local Variables:
    mode: indented-text
    indent-tabs-mode: nil
    sentence-end-double-space: t
    fill-column: 70
    End:

From ronaldoussoren at mac.com  Thu Jan  6 20:59:30 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu Jan  6 20:59:33 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl>
	<41DC7D20.4000901@v.loewis.de>
	<834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl>
Message-ID: <7A747A97-601D-11D9-85EE-000D93AD379E@mac.com>


On 6-jan-05, at 14:04, Jack Jansen wrote:

>
> On 6 Jan 2005, at 00:49, Martin v. L?wis wrote:
>>> The "new" solution is basically to go back to the Unix way of 
>>> building  an extension: link it against nothing and sort things out 
>>> at runtime.  Not my personal preference, but at least we know that 
>>> loading an  extension into one Python won't bring in a fresh copy of 
>>> a different  interpreter or anything horrible like that.
>>
>> This sounds good, except that it only works on OS X 10.3, right?
>> What about older versions?
>
> 10.3 or later. For older OSX releases (either because you build Python 
> on 10.2 or earlier, or because you've set MACOSX_DEPLOYMENT_TARGET to 
> a value of 10.2 or less) we use the old behaviour of linking with 
> "-framework Python".

Wouldn't it be better to link with the actual dylib inside the 
framework on 10.2? Otherwise you can no longer build 2.3 extensions 
after you've installed 2.4.

Ronald
From martin at v.loewis.de  Thu Jan  6 21:03:39 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Jan  6 21:03:31 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <7A747A97-601D-11D9-85EE-000D93AD379E@mac.com>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>	<41DB2C9A.4070800@v.loewis.de>	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>	<98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl>	<41DC7D20.4000901@v.loewis.de>	<834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl>
	<7A747A97-601D-11D9-85EE-000D93AD379E@mac.com>
Message-ID: <41DD999B.4040206@v.loewis.de>

Ronald Oussoren wrote:
> Wouldn't it be better to link with the actual dylib inside the framework 
> on 10.2? Otherwise you can no longer build 2.3 extensions after you've 
> installed 2.4.

That's what I thought, too.

Regards,
Martin
From bob at redivi.com  Thu Jan  6 21:03:39 2005
From: bob at redivi.com (Bob Ippolito)
Date: Thu Jan  6 21:03:57 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <7A747A97-601D-11D9-85EE-000D93AD379E@mac.com>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl>
	<41DC7D20.4000901@v.loewis.de>
	<834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl>
	<7A747A97-601D-11D9-85EE-000D93AD379E@mac.com>
Message-ID: <0E9F643A-601E-11D9-AB1C-000A95BA5446@redivi.com>


On Jan 6, 2005, at 14:59, Ronald Oussoren wrote:

>
> On 6-jan-05, at 14:04, Jack Jansen wrote:
>
>>
>> On 6 Jan 2005, at 00:49, Martin v. L?wis wrote:
>>>> The "new" solution is basically to go back to the Unix way of 
>>>> building  an extension: link it against nothing and sort things out 
>>>> at runtime.  Not my personal preference, but at least we know that 
>>>> loading an  extension into one Python won't bring in a fresh copy 
>>>> of a different  interpreter or anything horrible like that.
>>>
>>> This sounds good, except that it only works on OS X 10.3, right?
>>> What about older versions?
>>
>> 10.3 or later. For older OSX releases (either because you build 
>> Python on 10.2 or earlier, or because you've set 
>> MACOSX_DEPLOYMENT_TARGET to a value of 10.2 or less) we use the old 
>> behaviour of linking with "-framework Python".
>
> Wouldn't it be better to link with the actual dylib inside the 
> framework on 10.2? Otherwise you can no longer build 2.3 extensions 
> after you've installed 2.4.

It would certainly be better to do this for 10.2.

-bob

From martin at v.loewis.de  Thu Jan  6 21:12:40 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Jan  6 21:12:32 2005
Subject: [Python-Dev] Changing the default value of stat_float_times
Message-ID: <41DD9BB8.5060206@v.loewis.de>

When support for floating-point stat times was added in 2.3,
it was the plan that this should eventually become the default.
Does anybody object if I change the default now, for Python 2.5?
Applications which then break can globally change it back, with

os.stat_float_times(False)

Regards,
Martin
From aleax at aleax.it  Thu Jan  6 21:46:39 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan  6 21:46:45 2005
Subject: [Python-Dev] Re: super() harmful?
In-Reply-To: <crk2qf$o74$1@sea.gmane.org>
References: <ca471dc2050104102814be915b@mail.gmail.com><1f7befae05010410576effd024@mail.gmail.com><20050104154707.927B.JCARLSON@uci.edu><ca471dc205010418022db8c838@mail.gmail.com><A9AD2A6E-5F3A-11D9-8D68-000A95A50FB2@fuhm.net><ca471dc2050105102328387030@mail.gmail.com><9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net><ca471dc20501051536c4fd618@mail.gmail.com>
	<091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net>
	<crk2qf$o74$1@sea.gmane.org>
Message-ID: <10835172-6024-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 06, at 20:16, Terry Reedy wrote:

>
> "James Y Knight" <foom@fuhm.net> wrote in message
> news:091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net...
>> Please notice that I'm talking about concrete, real issues, not just a
>> "super is bad!" rant.
>
> Umm, James, come on.  Let's be really real and concrete ;-).
>
> Your title "Python's Super Considered Harmful" is an obvious reference 
> to
> and takeoff on Dijkstra's influential polemic "Goto Considered 
> Harmful".

...or any other of the 345,000 google hits on "considered harmful"...?-)

<http://www.meyerweb.com/eric/comment/chech.html>


Alex

From m.bless at gmx.de  Thu Jan  6 22:40:36 2005
From: m.bless at gmx.de (Martin Bless)
Date: Thu Jan  6 22:53:19 2005
Subject: [Python-Dev] Re: an idea for improving struct.unpack api
References: <Pine.LNX.4.58.0501042046120.678@bagira>
Message-ID: <gtbrt0dr8j41rj7q7gjj6vrg1tt6p57uaj@4ax.com>

On Wed, 5 Jan 2005 21:27:16 -0800 (PST), Ilya Sandler
<ilya@bluefir.net> wrote:

>The current struct.unpack api works well for unpacking C-structures where
>everything is usually unpacked at once, but it
>becomes  inconvenient when unpacking binary files where things
>often have to be unpacked field by field.

It may be helpful to remember Sam Rushing's NPSTRUCT extension, which
accompanied the Calldll module of that time (2001). Still available
from

http://www.nightmare.com/~rushing/dynwin/

mb - Martin

From tdelaney at avaya.com  Thu Jan  6 23:45:55 2005
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Thu Jan  6 23:46:09 2005
Subject: [Python-Dev] Re: super() harmful?
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE721229@au3010avexu1.global.avaya.com>

Guido van Rossum wrote:

>> and the cumbersome way in which you have to invoke super.
> 
> Given Python's dynamic nature I couldn't think of a way to make it
> less cumbersome. I see you tried (see below) and couldn't either. At
> this point I tend to say "put up or shut up."

Well, there's my autosuper recipe you've seen before:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/286195

which does basically what Philip describes ...

>>> import autosuper
>>>
>>> class A (autosuper.autosuper):
...     def test (self, a):
...         print 'A.test: %s' % (a,)
...
>>> class B (A):
...     def test (self, a):
...         print 'B.test: %s' % (a,)
...         self.super(a + 1)
...
>>> class C (A):
...     def test (self, a):
...         print 'C.test: %s' % (a,)
...         self.super.test(a + 1)
...
>>> class D (B, C):
...     def test (self, a):
...         print 'D.test: %s' % (a,)
...         self.super(a + 1)
...
>>> D().test(1)
D.test: 1
B.test: 2
C.test: 3
A.test: 4

It uses sys._getframe() of course ...

Tim Delaney
From skip at pobox.com  Wed Jan  5 21:21:18 2005
From: skip at pobox.com (Skip Montanaro)
Date: Fri Jan  7 01:23:51 2005
Subject: [Python-Dev] Re: csv module TODO list
In-Reply-To: <20050105121921.GB24030@idi.ntnu.no>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<16859.38960.9935.682429@montanaro.dyndns.org>
	<20050105075506.314C93C8E5@coffee.object-craft.com.au>
	<20050105121921.GB24030@idi.ntnu.no>
Message-ID: <16860.19518.824788.613286@montanaro.dyndns.org>


    Magnus> Quite a while ago I posted some material to the csv-list about
    Magnus> problems using the csv module on Unix-style colon-separated
    Magnus> files -- it just doesn't deal properly with backslash escaping
    Magnus> and is quite useless for this kind of file. I seem to recall the
    Magnus> general view was that it wasn't intended for this kind of thing
    Magnus> -- only the sort of csv that Microsoft Excel outputs/inputs, 

Yes, that's my recollection as well.  It's possible that we can extend the
interpretation of the escape char.

    Magnus> I'll be happy to re-send or summarize the relevant emails, if
    Magnus> needed.

Yes, that would be helpful.  Can you send me an example (three or four
lines) of the sort of file it won't grok?

Skip
From skip at pobox.com  Wed Jan  5 20:34:09 2005
From: skip at pobox.com (Skip Montanaro)
Date: Fri Jan  7 01:23:54 2005
Subject: [Python-Dev] Re: [Csv] csv module TODO list 
In-Reply-To: <20050105110849.CBA843C8E5@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<20050105110849.CBA843C8E5@coffee.object-craft.com.au>
Message-ID: <16860.16689.695012.975520@montanaro.dyndns.org>


    >> * is CSV going to be maintained outside the python tree?
    >> If not, remove the 2.2 compatibility macros for: PyDoc_STR,
    >> PyDoc_STRVAR, PyMODINIT_FUNC, etc.

    Andrew> Does anyone think we should continue to maintain this 2.2
    Andrew> compatibility?

With the release of 2.4, 2.2 has officially dropped off the radar screen,
right? (Zero probability of a 2.2.n+1 release now, though the probability was
vanishingly small before.)  I'd say toss it.  Do just that in a single
checkin so someone who's interested can do a simple cvs diff to yield
an initial patch file for external maintenance of that feature.

    >> * inline the following functions since they are used only in one
    >> place get_string, set_string, get_nullchar_as_None,
    >> set_nullchar_as_None, join_reset (maybe)

    Andrew> It was done that way as I felt we would be adding more getters
    Andrew> and setters to the dialect object in future.

The only new dialect attribute I envision is an encoding attribute.

    >> * is it necessary to have Dialect_methods, can you use 0 for tp_methods?

    Andrew> I was assuming I would need to add methods at some point (in
    Andrew> fact, I did have methods, but removed them).

Dialect objects are really just data containers, right?  I don't see that
they would need any methods.

    >> * remove commented out code (PyMem_DEL) on line 261
    >> Have you used valgrind on the test to find memory overwrites/leaks?

    Andrew> No, valgrind wasn't used.

I have it here at work.  I'll try to find a few minutes to run the csv tests
under valgrind's control.

Skip
From tjreedy at udel.edu  Fri Jan  7 05:23:31 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri Jan  7 05:23:35 2005
Subject: [Python-Dev] Re: Re: super() harmful?
References: <ca471dc2050104102814be915b@mail.gmail.com><1f7befae05010410576effd024@mail.gmail.com><20050104154707.927B.JCARLSON@uci.edu><ca471dc205010418022db8c838@mail.gmail.com><A9AD2A6E-5F3A-11D9-8D68-000A95A50FB2@fuhm.net><ca471dc2050105102328387030@mail.gmail.com><9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net><ca471dc20501051536c4fd618@mail.gmail.com><091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net><crk2qf$o74$1@sea.gmane.org>
	<10835172-6024-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <crl2s1$bu2$1@sea.gmane.org>


"Alex Martelli" <aleax@aleax.it> wrote in message 
news:10835172-6024-11D9-ADA4-000A95EFAE9E@aleax.it...
>
> On 2005 Jan 06, at 20:16, Terry Reedy wrote:
>
>> [Knight's] title "Python's Super Considered Harmful" is an obvious 
>> reference to
>> and takeoff on Dijkstra's influential polemic "Go To Statement 
>> Considered Harmful". http://www.acm.org/classics/oct95/
[title corrected from original posting and link added]

> ...or any other of the 345,000 google hits on "considered harmful"...?-)

Restricting the search space to 'Titles of computer science articles' would 
reduce the number of hits considerably.  Many things have been considered 
harmful at some time in almost every field of human endeavor.  However, 
according to

Eric Meyer, "Considered Harmful" Essays Considered Harmful
> <http://www.meyerweb.com/eric/comment/chech.html>

even that restriction would lead to thousands of hits inspired directly or 
indirectly by Niklaus Wirth's title for Dijkstra's Letter to the Editor. 
Thanks for the link.

Terry J. Reedy



From andrewm at object-craft.com.au  Fri Jan  7 07:13:22 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Fri Jan  7 07:13:24 2005
Subject: [Python-Dev] Re: [Csv] csv module TODO list 
In-Reply-To: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> 
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
Message-ID: <20050107061322.A6A563C8E5@coffee.object-craft.com.au>

>There's a bunch of jobs we (CSV module maintainers) have been putting
>off - attached is a list (in no particular order): 
[...]
>Also, review comments from Jeremy Hylton, 10 Apr 2003:
>
>    I've been reviewing extension modules looking for C types that should
>    participate in garbage collection.  I think the csv ReaderObj and
>    WriterObj should participate.  The ReaderObj contains a reference to
>    input_iter that could be an arbitrary Python object.  The iterator
>    object could well participate in a cycle that refers to the ReaderObj.
>    The WriterObj has a reference to a writeline callable, which could well
>    be a method of an object that also points to the WriterObj.

I finally got around to looking at this, only to realise Jeremy did the
work back in Apr 2003 (thanks). One question, however - the GC doco in
the Python/C API seems to suggest to me that PyObject_GC_Track should be
called on the newly minted object prior to returning from the initialiser
(and correspondingly PyObject_GC_UnTrack should be called prior to
dismantling). This isn't being done in the module as it stands. Is the
module wrong, or is my understanding of the reference manual incorrect?

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From andrewm at object-craft.com.au  Fri Jan  7 08:54:54 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Fri Jan  7 08:55:02 2005
Subject: [Python-Dev] Minor change to behaviour of csv module
Message-ID: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au>

I'm considering a change to the csv module that could potentially break
some obscure uses of the module (but CSV files usually quote, rather
than escape, so the most common uses aren't affected).

Currently, with a non-default escapechar='\\', input like:

    field one,field \
    two,field three

Returns:

    ["field one", "field \\\ntwo", "field three"]

In the 2.5 series, I propose changing this to return:

    ["field one", "field \ntwo", "field three"]

Is this reasonable? Is the old behaviour desirable in any way (we could
add a switch to enable the new behaviour, but I feel that would only
allow the confusion to continue)?
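For reference, the difference is easy to observe by feeding the reader the
input above; a small sketch (which result you get depends on which behaviour
your csv module implements):

```python
import csv
import io

# The example input: an escaped newline inside the second field.
data = "field one,field \\\ntwo,field three\n"
rows = list(csv.reader(io.StringIO(data), escapechar="\\"))
print(rows[0])
# Old behaviour keeps the escape char in the field data;
# the proposed behaviour strips it, leaving just the newline.
```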

BTW, some of my other changes have changed the exceptions raised when
bad arguments were passed to the reader and writer factory functions - 
previously, the exceptions were semi-random, including TypeError,
AttributeError and csv.Error - they should now almost always be TypeError
(like most other argument passing errors). I can't see this being a
problem, but I'm prepared to listen to arguments.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From bob at redivi.com  Fri Jan  7 11:08:52 2005
From: bob at redivi.com (Bob Ippolito)
Date: Fri Jan  7 11:09:10 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <0E9F643A-601E-11D9-AB1C-000A95BA5446@redivi.com>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl>
	<41DC7D20.4000901@v.loewis.de>
	<834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl>
	<7A747A97-601D-11D9-85EE-000D93AD379E@mac.com>
	<0E9F643A-601E-11D9-AB1C-000A95BA5446@redivi.com>
Message-ID: <21A975DC-6094-11D9-922D-000A95BA5446@redivi.com>


On Jan 6, 2005, at 15:03, Bob Ippolito wrote:

>
> On Jan 6, 2005, at 14:59, Ronald Oussoren wrote:
>
>>
>> On 6-jan-05, at 14:04, Jack Jansen wrote:
>>
>>>
>>> On 6 Jan 2005, at 00:49, Martin v. Löwis wrote:
>>>>> The "new" solution is basically to go back to the Unix way of 
>>>>> building  an extension: link it against nothing and sort things 
>>>>> out at runtime.  Not my personal preference, but at least we know 
>>>>> that loading an  extension into one Python won't bring in a fresh 
>>>>> copy of a different  interpreter or anything horrible like that.
>>>>
>>>> This sounds good, except that it only works on OS X 10.3, right?
>>>> What about older versions?
>>>
>>> 10.3 or later. For older OSX releases (either because you build 
>>> Python on 10.2 or earlier, or because you've set 
>>> MACOSX_DEPLOYMENT_TARGET to a value of 10.2 or less) we use the old 
>>> behaviour of linking with "-framework Python".
>>
>> Wouldn't it be better to link with the actual dylib inside the 
>> framework on 10.2? Otherwise you can no longer build 2.3 extensions 
>> after you've installed 2.4.
>
> It would certainly be better to do this for 10.2.

This patch implements the proposed direct framework linking:
http://python.org/sf/1097739

-bob

From andrewm at object-craft.com.au  Fri Jan  7 13:06:23 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Fri Jan  7 13:06:23 2005
Subject: [Python-Dev] Minor change to behaviour of csv module 
In-Reply-To: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au> 
References: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au>
Message-ID: <20050107120623.EC0673C8E5@coffee.object-craft.com.au>

>I'm considering a change to the csv module that could potentially break
>some obscure uses of the module (but CSV files usually quote, rather
>than escape, so the most common uses aren't affected).
>
>Currently, with a non-default escapechar='\\', input like:
>
>    field one,field \
>    two,field three
>
>Returns:
>
>    ["field one", "field \\\ntwo", "field three"]
>
>In the 2.5 series, I propose changing this to return:
>
>    ["field one", "field \ntwo", "field three"]
>
>Is this reasonable? Is the old behaviour desirable in any way (we could
>add a switch to enable the new behaviour, but I feel that would only
>allow the confusion to continue)?

Thinking about this further, I suspect we have to retain the current
behaviour, as broken as it is, as the default: it's conceivable that
someone somewhere is post-processing the result to remove the backslashes,
and if we fix the csv module, we'll break their code.

Note that PEP-305 had nothing to say about escaping, nor does the module
reference manual.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From Jack.Jansen at cwi.nl  Fri Jan  7 14:05:39 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Fri Jan  7 14:06:00 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <21A975DC-6094-11D9-922D-000A95BA5446@redivi.com>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<41DB2C9A.4070800@v.loewis.de>
	<FCBFE0D5-5EAD-11D9-96B0-000A9567635C@redivi.com>
	<98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl>
	<41DC7D20.4000901@v.loewis.de>
	<834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl>
	<7A747A97-601D-11D9-85EE-000D93AD379E@mac.com>
	<0E9F643A-601E-11D9-AB1C-000A95BA5446@redivi.com>
	<21A975DC-6094-11D9-922D-000A95BA5446@redivi.com>
Message-ID: <D3E9B6E7-60AC-11D9-AE74-000A958D1666@cwi.nl>


On 7 Jan 2005, at 11:08, Bob Ippolito wrote:
>>>> 10.3 or later. For older OSX releases (either because you build 
>>>> Python on 10.2 or earlier, or because you've set 
>>>> MACOSX_DEPLOYMENT_TARGET to a value of 10.2 or less) we use the old 
>>>> behaviour of linking with "-framework Python".
>>>
>>> Wouldn't it be better to link with the actual dylib inside the 
>>> framework on 10.2? Otherwise you can no longer build 2.3 extensions 
>>> after you've installed 2.4.
>>
>> It would certainly be better to do this for 10.2.
>
> This patch implements the proposed direct framework linking:
> http://python.org/sf/1097739

Looks good, I'll incorporate it. And as I haven't heard of any 
showstoppers for the -undefined dynamic_lookup (and Anthony seems to be 
offline this week) I'll put that in too.
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

From magnus at hetland.org  Fri Jan  7 14:38:17 2005
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Fri Jan  7 14:38:32 2005
Subject: [Python-Dev] Re: [Csv] Minor change to behaviour of csv module
In-Reply-To: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au>
References: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au>
Message-ID: <20050107133817.GB5503@idi.ntnu.no>

Andrew McNamara <andrewm@object-craft.com.au>:
>
[snip]
> Currently, with a non-default escapechar='\\', input like:
> 
>     field one,field \
>     two,field three
> 
> Returns:
> 
>     ["field one", "field \\\ntwo", "field three"]
> 
> In the 2.5 series, I propose changing this to return:
> 
>     ["field one", "field \ntwo", "field three"]

IMO this is the *only* reasonable behaviour. I don't understand why
the escape character should be left in; this is one of the reasons why
UNIX-style colon-separated values don't work with the current module.

If one wanted the first version, one would (I presume) write

   field one,field \\\
   two,field three
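The colon-separated case is easy to exercise the same way; a minimal sketch
with an invented passwd-style line:

```python
import csv
import io

line = "root:x:0:0:root:/root:/bin/bash\n"
# Parse with ':' as the delimiter and backslash escaping enabled.
fields = next(csv.reader(io.StringIO(line), delimiter=":", escapechar="\\"))
print(fields)  # ['root', 'x', '0', '0', 'root', '/root', '/bin/bash']
```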

-- 
Magnus Lie Hetland       Fallen flower I see / Returning to its branch
http://hetland.org       Ah! a butterfly.           [Arakida Moritake]
From mcherm at mcherm.com  Fri Jan  7 14:45:20 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Fri Jan  7 14:45:53 2005
Subject: [Python-Dev] Re: [Csv] Minor change to behaviour of csv module
Message-ID: <1105105520.41de927049442@mcherm.com>

Andrew explains that in the CSV module, escape characters are not
properly removed.

Magnus writes:
> IMO this is the *only* reasonable behaviour. I don't understand why
> the escape character should be left in; this is one of the reasons why
> UNIX-style colon-separated values don't work with the current module.

Andrew writes back later:
> Thinking about this further, I suspect we have to retain the current
> behaviour, as broken as it is, as the default: it's conceivable that
> someone somewhere is post-processing the result to remove the backslashes,
> and if we fix the csv module, we'll break their code.

I'm with Magnus on this. No one has 4-year-old code using the CSV module.
The existing behavior is just simply WRONG. Sure, of course we should
try to maintain backward compatibility, but surely SOME cases don't
require it, right? Can't we treat this misbehavior as an outright bug?

-- Michael Chermside

From aleax at aleax.it  Fri Jan  7 14:51:52 2005
From: aleax at aleax.it (Alex Martelli)
Date: Fri Jan  7 14:52:03 2005
Subject: [Python-Dev] Re: [Csv] Minor change to behaviour of csv module
In-Reply-To: <1105105520.41de927049442@mcherm.com>
References: <1105105520.41de927049442@mcherm.com>
Message-ID: <48F57F83-60B3-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 07, at 14:45, Michael Chermside wrote:

> Andrew explains that in the CSV module, escape characters are not
> properly removed.
>
> Magnus writes:
>> IMO this is the *only* reasonable behaviour. I don't understand why
>> the escape character should be left in; this is one of the reasons why
>> UNIX-style colon-separated values don't work with the current module.
>
> Andrew writes back later:
>> Thinking about this further, I suspect we have to retain the current
>> behaviour, as broken as it is, as the default: it's conceivable that
>> someone somewhere is post-processing the result to remove the 
>> backslashes,
>> and if we fix the csv module, we'll break their code.
>
> I'm with Magnus on this. No one has 4 year old code using the CSV 
> module.
> The existing behavior is just simply WRONG. Sure, of course we should
> try to maintain backward compatibility, but surely SOME cases don't
> require it, right? Can't we treat this misbehavior as an outright bug?

+1 -- the nonremoval of escape characters smells like a bug to me, too.


Alex

From mwh at python.net  Fri Jan  7 15:07:21 2005
From: mwh at python.net (Michael Hudson)
Date: Fri Jan  7 15:28:37 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <9AB33D7A-601A-11D9-AB1C-000A95BA5446@redivi.com> (Bob
	Ippolito's message of "Thu, 6 Jan 2005 14:38:56 -0500")
References: <Pine.LNX.4.58.0501042046120.678@bagira>
	<2mr7kyhf2j.fsf@starship.python.net>
	<9AB33D7A-601A-11D9-AB1C-000A95BA5446@redivi.com>
Message-ID: <2mmzvlgwo6.fsf@starship.python.net>

Bob Ippolito <bob@redivi.com> writes:

> On Jan 6, 2005, at 8:17, Michael Hudson wrote:
>
>> Ilya Sandler <ilya@bluefir.net> writes:
>>
>>> A problem:
>>>
>>> The current struct.unpack api works well for unpacking C-structures
>>> where
>>> everything is usually unpacked at once, but it
>>> becomes  inconvenient when unpacking binary files where things
>>> often have to be unpacked field by field. Then one has to keep track
>>> of offsets, slice the strings, call struct.calcsize(), etc...
>>
>> IMO (and E), struct.unpack is the primitive atop which something more
>> sensible is built.  I've certainly tried to build that more sensible
>> thing at least once, but haven't ever got the point of believing what
>> I had would be applicable to the general case... maybe it's time to
>> write such a thing for the standard library.
>
> This is my ctypes-like attempt at a high-level interface for struct.
> It works well for me in macholib:
> http://svn.red-bean.com/bob/py2app/trunk/src/macholib/ptypes.py

Unsurprisingly, that's fairly similar to mine :)
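Such a wrapper needn't be large; a minimal sketch of the idea (names invented
for illustration, not either of our actual implementations):

```python
import struct

class StructReader:
    """Tiny cursor over a bytes object, unpacking field by field."""

    def __init__(self, data, offset=0):
        self.data = data
        self.offset = offset

    def read(self, fmt):
        # Unpack at the current offset, then advance past the field(s).
        values = struct.unpack_from(fmt, self.data, self.offset)
        self.offset += struct.calcsize(fmt)
        # Unwrap single-value reads for convenience.
        return values[0] if len(values) == 1 else values

# Two little-endian fields read without any manual offset arithmetic.
r = StructReader(struct.pack("<HI", 7, 1000))
print(r.read("<H"), r.read("<I"))  # 7 1000
```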

Cheers,
mwh

-- 
  If trees could scream, would we be so cavalier about cutting them
  down? We might, if they screamed all the time, for no good reason.
                                                        -- Jack Handey
From theller at python.net  Fri Jan  7 15:33:52 2005
From: theller at python.net (Thomas Heller)
Date: Fri Jan  7 15:32:35 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <2mmzvlgwo6.fsf@starship.python.net> (Michael Hudson's message
	of "Fri, 07 Jan 2005 14:07:21 +0000")
References: <Pine.LNX.4.58.0501042046120.678@bagira>
	<2mr7kyhf2j.fsf@starship.python.net>
	<9AB33D7A-601A-11D9-AB1C-000A95BA5446@redivi.com>
	<2mmzvlgwo6.fsf@starship.python.net>
Message-ID: <llb5ux4f.fsf@python.net>

> Bob Ippolito <bob@redivi.com> writes:

>> This is my ctypes-like attempt at a high-level interface for struct.
>> It works well for me in macholib:
>> http://svn.red-bean.com/bob/py2app/trunk/src/macholib/ptypes.py

Michael Hudson <mwh@python.net> writes:
>
> Unsurprisingly, that's fairly similar to mine :)

So, why don't you both use the original?
ctypes works on the mac, too ;-)

Thomas

From bob at redivi.com  Fri Jan  7 15:41:27 2005
From: bob at redivi.com (Bob Ippolito)
Date: Fri Jan  7 15:41:33 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <llb5ux4f.fsf@python.net>
References: <Pine.LNX.4.58.0501042046120.678@bagira>
	<2mr7kyhf2j.fsf@starship.python.net>
	<9AB33D7A-601A-11D9-AB1C-000A95BA5446@redivi.com>
	<2mmzvlgwo6.fsf@starship.python.net> <llb5ux4f.fsf@python.net>
Message-ID: <36326B12-60BA-11D9-B08A-000A9567635C@redivi.com>

On Jan 7, 2005, at 9:33 AM, Thomas Heller wrote:

>> Bob Ippolito <bob@redivi.com> writes:
>
>>> This is my ctypes-like attempt at a high-level interface for struct.
>>> It works well for me in macholib:
>>> http://svn.red-bean.com/bob/py2app/trunk/src/macholib/ptypes.py
>
> Michael Hudson <mwh@python.net> writes:
>>
>> Unsurprisingly, that's fairly similar to mine :)
>
> So, why don't you both use the original?
> ctypes works on the mac, too ;-)

I did use the original for the prototype of macholib!  Then I wrote a 
version in pure python to eliminate the compiler dependency and ended 
up adding way more features than I actually needed (variable length 
nested structures and stuff like that).  Eventually, I scaled it back 
to this so that it was easier to maintain and so that I could make some 
performance optimizations (well as many as you can make with the struct 
module).

-bob

From mwh at python.net  Fri Jan  7 15:57:25 2005
From: mwh at python.net (Michael Hudson)
Date: Fri Jan  7 15:57:27 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <llb5ux4f.fsf@python.net> (Thomas Heller's message of "Fri, 07
	Jan 2005 15:33:52 +0100")
References: <Pine.LNX.4.58.0501042046120.678@bagira>
	<2mr7kyhf2j.fsf@starship.python.net>
	<9AB33D7A-601A-11D9-AB1C-000A95BA5446@redivi.com>
	<2mmzvlgwo6.fsf@starship.python.net> <llb5ux4f.fsf@python.net>
Message-ID: <2mis69gucq.fsf@starship.python.net>

Thomas Heller <theller@python.net> writes:

>> Bob Ippolito <bob@redivi.com> writes:
>
>>> This is my ctypes-like attempt at a high-level interface for struct.
>>> It works well for me in macholib:
>>> http://svn.red-bean.com/bob/py2app/trunk/src/macholib/ptypes.py
>
> Michael Hudson <mwh@python.net> writes:
>>
>> Unsurprisingly, that's fairly similar to mine :)
>
> So, why don't you both use the original?
> ctypes works on the mac, too ;-)

Well, I probably wrote mine before ctypes worked on the Mac... and
certainly when I was away from internet access.  I guess I should look
at ctypes' interface, at least...

Cheers,
mwh

-- 
  I located the link but haven't bothered to re-read the article,
  preferring to post nonsense to usenet before checking my facts.
                                      -- Ben Wolfson, comp.lang.python
From ncoghlan at iinet.net.au  Fri Jan  7 16:05:24 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Fri Jan  7 16:05:28 2005
Subject: [Python-Dev] Re: Subscribing to PEP updates
In-Reply-To: <1105034511.10728.3.camel@geddy.wooz.org>
References: <0CFFADBB825C6249A26FDF11C1772AE101EDEB49@ingdexj1.ingdirect.com>
	<1105034511.10728.3.camel@geddy.wooz.org>
Message-ID: <41DEA534.9090400@iinet.net.au>

Barry Warsaw wrote:
> As an experiment, I just added a PEP topic to the python-checkins
> mailing list.  You could subscribe to this list and just select the PEP
> topic (which matches the regex "PEP" in the Subject header or first few
> lines of the body).
> 
> Give it a shot and let's see if that does the trick.

Neat - I've subscribed to that topic now :)

Do you mind if I suggest this to interested people on c.l.p?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From barry at python.org  Fri Jan  7 16:09:16 2005
From: barry at python.org (Barry Warsaw)
Date: Fri Jan  7 16:09:21 2005
Subject: [Python-Dev] Re: Subscribing to PEP updates
In-Reply-To: <41DEA534.9090400@iinet.net.au>
References: <0CFFADBB825C6249A26FDF11C1772AE101EDEB49@ingdexj1.ingdirect.com>
	<1105034511.10728.3.camel@geddy.wooz.org>
	<41DEA534.9090400@iinet.net.au>
Message-ID: <1105110556.26433.57.camel@geddy.wooz.org>

On Fri, 2005-01-07 at 10:05, Nick Coghlan wrote:
> Barry Warsaw wrote:
> > As an experiment, I just added a PEP topic to the python-checkins
> > mailing list.  You could subscribe to this list and just select the PEP
> > topic (which matches the regex "PEP" in the Subject header or first few
> > lines of the body).
> > 
> > Give it a shot and let's see if that does the trick.
> 
> Neat - I've subscribed to that topic now :)
> 
> Do you mind if I suggest this to interested people on c.l.p?

Please do (he says, hoping it works :).
-Barry

From FBatista at uniFON.com.ar  Fri Jan  7 16:40:27 2005
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Fri Jan  7 16:43:17 2005
Subject: [Python-Dev] Re: Subscribing to PEP updates
Message-ID: <A128D751272CD411BC9200508BC2194D053C7E39@escpl.tcp.com.ar>

[Barry Warsaw]

> As an experiment, I just added a PEP topic to the python-checkins
> mailing list.  You could subscribe to this list and just select the PEP
> topic (which matches the regex "PEP" in the Subject header or first few
> lines of the body).
> 
> Give it a shot and let's see if that does the trick.

Can the defaults be configured? Because now the config is this:

- Which topic categories would you like to subscribe to? (default: "pep" not
checked)

- Do you want to receive messages that do not match any topic filter?
(default: No)

So, the following happens (it happened to me, heh): you subscribe to the list,
and confirm the registration, and you never get a message unless you change
this.

Thanks.

.    Facundo

Bitácora De Vuelo: http://www.taniquetil.com.ar/plog
PyAr - Python Argentina: http://pyar.decode.com.ar/


From tim.peters at gmail.com  Fri Jan  7 17:00:42 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Fri Jan  7 17:00:45 2005
Subject: [Python-Dev] Re: [Csv] csv module TODO list
In-Reply-To: <20050107061322.A6A563C8E5@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<20050107061322.A6A563C8E5@coffee.object-craft.com.au>
Message-ID: <1f7befae05010708005275e23d@mail.gmail.com>

[Andrew McNamara]
>> Also, review comments from Jeremy Hylton, 10 Apr 2003:
>>
>>    I've been reviewing extension modules looking for C types that should
>>    participate in garbage collection.  I think the csv ReaderObj and
>>    WriterObj should participate.  The ReaderObj contains a reference to
>>    input_iter that could be an arbitrary Python object.  The iterator
>>    object could well participate in a cycle that refers to the ReaderObj.
>>    The WriterObj has a reference to a writeline callable, which could well
>>    be a method of an object that also points to the WriterObj.

> I finally got around to looking at this, only to realise Jeremy did the
> work back in Apr 2003 (thanks). One question, however - the GC doco in
> the Python/C API seems to suggest to me that PyObject_GC_Track should be
> called on the newly minted object prior to returning from the initialiser
> (and correspondingly PyObject_GC_UnTrack should be called prior to
> dismantling). This isn't being done in the module as it stands. Is the
> module wrong, or is my understanding of the reference manual incorrect?

The purpose of "tracking" and "untracking" is to let cyclic gc know
when it (respectively) is and isn't safe to call an object's
tp_traverse method.  Primarily, when an object is first created at the
C level, it may contain NULLs or heap trash in pointer slots, and then
the object's tp_traverse could segfault if it were called while the
object remained in an insane (wrt tp_traverse) state.  Similarly,
cleanup actions in the tp_dealloc may make a tp_traverse-sane object
tp_traverse-insane, so tp_dealloc should untrack the object before
that occurs.

If tracking is never done, then the object effectively never
participates in cyclic gc:  its tp_traverse will never get called, and
it will effectively act as an external root (keeping itself and
everything reachable from it alive).  So, yes, track it during
construction, but not before all the members referenced by its
tp_traverse are in a sane state.  Putting the track call "at the end"
of the constructor is usually best practice.

tp_dealloc should untrack it then.  In a debug build, that will
assert-fail if the object hasn't actually been tracked. 
PyObject_GC_Del will untrack it for you (if it's still tracked), but
it's risky to rely on that --  it's too easy to forget that Py_DECREFs
on contained objects can end up executing arbitrary Python code (via
__del__ and weakref callbacks, and via allowing other threads to run),
which can in turn trigger a round of cyclic gc *while* your tp_dealloc
is still running.  So it's safest to untrack the object very early in
tp_dealloc.

I doubt this happens in the csv module, but an untrack/track pair
should also be put around any block of method code that temporarily
puts the object into a tp_traverse-insane state and that contains any
C API calls that may end up triggering cyclic gc.  That's very rare.
From sjoerd at acm.org  Fri Jan  7 17:01:31 2005
From: sjoerd at acm.org (Sjoerd Mullender)
Date: Fri Jan  7 17:01:42 2005
Subject: [Python-Dev] Re: Subscribing to PEP updates
In-Reply-To: <A128D751272CD411BC9200508BC2194D053C7E39@escpl.tcp.com.ar>
References: <A128D751272CD411BC9200508BC2194D053C7E39@escpl.tcp.com.ar>
Message-ID: <41DEB25B.7040202@acm.org>

Batista, Facundo wrote:
> [Barry Warsaw]
>
>  > As an experiment, I just added a PEP topic to the python-checkins
>  > mailing list.  You could subscribe to this list and just select the PEP
>  > topic (which matches the regex "PEP" in the Subject header or first few
>  > lines of the body).
>  >
>  > Give it a shot and let's see if that does the trick.
>
> Can the defaults be configured? Because now the config is this:
>
> - Which topic categories would you like to subscribe to? (default: "pep"
> not checked)
>
> - Do you want to receive messages that do not match any topic filter?
> (default: No)
>
> So the following happens (it happened to me, heh): you subscribe to
> the list, confirm the registration, and then you never get a message
> unless you change this.

However, there is an additional line in the description:
"If no topics of interest are selected above, then you will receive
every message sent to the mailing list."
In other words, don't check any topics, and you get everything.  If you
*do* check a topic, you only get the messages belonging to that topic.
This seems to me a reasonable default.

--
Sjoerd Mullender <sjoerd@acm.org>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 374 bytes
Desc: OpenPGP digital signature
Url : http://mail.python.org/pipermail/python-dev/attachments/20050107/7807b763/signature.pgp
From foom at fuhm.net  Fri Jan  7 17:51:37 2005
From: foom at fuhm.net (James Y Knight)
Date: Fri Jan  7 17:51:35 2005
Subject: [Python-Dev] Re: super() harmful?
In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DE721229@au3010avexu1.global.avaya.com>
References: <338366A6D2E2CA4C9DAEAE652E12A1DE721229@au3010avexu1.global.avaya.com>
Message-ID: <653D5CE4-60CC-11D9-8D68-000A95A50FB2@fuhm.net>

On Jan 6, 2005, at 5:45 PM, Delaney, Timothy C (Timothy) wrote:
> Well, there's my autosuper recipe you've seen before:
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/286195
>
> which does basically what Philip describes ...

You missed the most important part of the example -- the automatic 
argument passing through unknowing methods.

James

From skip at pobox.com  Fri Jan  7 17:09:13 2005
From: skip at pobox.com (Skip Montanaro)
Date: Fri Jan  7 18:10:27 2005
Subject: [Python-Dev] Re: [Csv] Minor change to behaviour of csv module
In-Reply-To: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au>
References: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au>
Message-ID: <16862.46121.778915.968964@montanaro.dyndns.org>


    Andrew> I'm considering a change to the csv module that could
    Andrew> potentially break some obscure uses of the module (but CSV files
    Andrew> usually quote, rather than escape, so the most common uses
    Andrew> aren't affected).

I'm with the other respondents.  This looks like a bug that should be
squashed.

Skip
From ncoghlan at iinet.net.au  Fri Jan  7 18:30:37 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Fri Jan  7 18:30:41 2005
Subject: [Python-Dev] Re: Subscribing to PEP updates
In-Reply-To: <1105110556.26433.57.camel@geddy.wooz.org>
References: <0CFFADBB825C6249A26FDF11C1772AE101EDEB49@ingdexj1.ingdirect.com>	
	<1105034511.10728.3.camel@geddy.wooz.org>
	<41DEA534.9090400@iinet.net.au>
	<1105110556.26433.57.camel@geddy.wooz.org>
Message-ID: <41DEC73D.4090908@iinet.net.au>

Barry Warsaw wrote:
> Please do (he says, hoping it works :).

Speaking of which... care to poke PEP 0 or one of the other PEPs? There are 
probably a couple of PEPs which could be moved from 'Open' to 'Accepted' or 
from 'Accepted' to 'Implemented' to try it out.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From foom at fuhm.net  Fri Jan  7 18:45:53 2005
From: foom at fuhm.net (James Y Knight)
Date: Fri Jan  7 18:45:53 2005
Subject: [Python-Dev] Re: super() harmful?
In-Reply-To: <ca471dc2050106091357a5c36b@mail.gmail.com>
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
	<ca471dc205010418022db8c838@mail.gmail.com>
	<A9AD2A6E-5F3A-11D9-8D68-000A95A50FB2@fuhm.net>
	<ca471dc2050105102328387030@mail.gmail.com>
	<9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net>
	<ca471dc20501051536c4fd618@mail.gmail.com>
	<091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net>
	<ca471dc2050106091357a5c36b@mail.gmail.com>
Message-ID: <F9EF8F18-60D3-11D9-8D68-000A95A50FB2@fuhm.net>

On Jan 6, 2005, at 12:13 PM, Guido van Rossum wrote:
> So it has nothing to do with the new paradigm, just with backwards
> compatibility. I appreciate those issues (more than you'll ever know)
> but I don't see why you should try to discourage others from using the
> new paradigm, which is what your article appears to do.

This is where I'm coming from:
In my own code, it is very rare to have diamond inheritance structures. 
And when they do occur, it is rarer still that both sides need to 
cooperatively override a method. Given that, super has no necessary 
advantage. And it has disadvantages:
- Backwards compatibility issues
- Going along with that, inadvertent mixing of paradigms (you have to 
remember which classes you use super with and which you don't, or your 
code might have hard-to-find errors).
- Take your choice of: a) inability to add optional arguments to your 
methods, or b) having to use *args, **kwargs on every method and call 
super with those.
- Having to try/catch AttributeErrors from super if you use interfaces 
instead of a base class to define the methods in use.

So, I am indeed attempting to discourage people from using it, despite 
its importance. And I am also trying to educate people about what they 
need to do if they have a case where using it is necessary, or if they 
just decide I'm full of crap and want to use it anyway.
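The cooperative pattern under discussion — every method taking *args/**kwargs and forwarding them all to super — looks roughly like this (a minimal sketch with made-up classes, not code from the thread):

```python
class A(object):
    def f(self, *args, **kwargs):
        return ['A']

class B(A):
    def f(self, *args, **kwargs):
        return ['B'] + super(B, self).f(*args, **kwargs)

class C(A):
    def f(self, *args, **kwargs):
        return ['C'] + super(C, self).f(*args, **kwargs)

class D(B, C):
    def f(self, *args, **kwargs):
        return ['D'] + super(D, self).f(*args, **kwargs)

# Diamond D -> B -> C -> A: each method must forward everything it
# received, even arguments it does not itself use, or a later class
# in the MRO silently loses them.
assert D().f() == ['D', 'B', 'C', 'A']
```

This is exactly the "*args, **kwargs on every method" cost listed above: drop the forwarding in any one class and the chain breaks.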

>> In order to make super really nice, it should be easier to use right.
>> Again, the two major issues that cause problems are: 1) having to
>> declare every method with *args, **kwargs, and having to pass those 
>> and
>> all the arguments you take explicitly to super,
>
> That's only an issue with __init__ or with code written without
> cooperative MI in mind. When using cooperative MI, you shouldn't
> redefine method signatures, and all is well.

I have two issues with that statement. Firstly, it's often quite useful 
to be able to add optional arguments to methods. Secondly, that's not a 
property of cooperative MI, but one of cooperative MI in Python.

As a counterpoint, with Dylan, you can add optional keyword arguments 
to a method as long as the generic was defined with the notation #key 
(specifying that it will accept keyword arguments at all). This is of 
course even true in a single inheritance situation like in the example 
below.

Now please don't misunderstand me, here. I'm not at all trying to say 
that Python sucks because it's not Dylan. I don't even particularly 
like Dylan, but it does have a number of good ideas. Additionally, 
Python and Dylan differ in fundamental ways: Python has classes and 
inheritance, Dylan has generic functions/multimethods. Dylan is (I 
believe) generally whole-program-at-a-time compiled/optimized, Python 
is not. So, I think a solution for python would have to be 
fundamentally different as well. But anyways, an example of what I'm 
talking about:

define generic g (arg1 :: <number>, #key);
define method g (arg1 :: <number>, #key)
   format-out("number.g\n");
end method g;

define method g (arg1 :: <rational>, #key base :: <integer> = 10)
   next-method();
   format-out("rational.g %d\n", base);
end method g;

define method g (arg1 :: <integer>, #key)
   next-method();
   format-out("integer.g\n");
end method g;

// Prints:
// number.g
// rational.g 1
// integer.g
g(1, base: 1);

// Produces: Error: Unrecognized keyword (base) as the second argument 
in call of g
g(1.0, base: 1);

> Cooperative MI doesn't have a really good solution for __init__.
> Defining and calling __init__ only with keyword arguments is a good
> solution. But griping about "traditionally" is a backwards
> compatibility issue, which you said you were leaving behind.

Well, kind of. In my mind, it was a different kind of issue, as it 
isn't solved by everyone moving over to using super. As nearly all the 
code that currently uses super does so without using keyword arguments 
for __init__, I considered it not so much a backwards-compatibility 
issue as a matter of re-educating users, the same as the requirement 
for passing along all your arguments.
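The keyword-only __init__ convention Guido refers to can be sketched like this (hypothetical classes; each __init__ consumes its own keywords and forwards the rest):

```python
class Base(object):
    def __init__(self, **kwargs):
        # End of the cooperative chain; forwards leftovers to object.
        super(Base, self).__init__(**kwargs)

class X(Base):
    def __init__(self, x=None, **kwargs):
        self.x = x
        super(X, self).__init__(**kwargs)

class Y(Base):
    def __init__(self, y=None, **kwargs):
        self.y = y
        super(Y, self).__init__(**kwargs)

class Z(X, Y):
    pass

# Each class picks off only the keywords it knows about:
z = Z(x=1, y=2)
assert (z.x, z.y) == (1, 2)
```

Positional arguments cannot be routed this way, which is why the convention is keyword-only.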

> Exactly. What is your next_method statement supposed to do?

Well that's easy. It's supposed to call the next function in the MRO 
with _all_ the arguments passed along, even the ones that the current 
function didn't explicitly ask for. I was afraid you might ask a hard 
question, like: if E2 inherits C's __init__, how the heck is it 
supposed to manage to take two arguments nonetheless. That one I 
*really* don't have an answer for.

> No need to reply except when you've changed the article. I'm tired of
> the allegations.

Sigh.

James

From edcjones at erols.com  Fri Jan  7 20:44:51 2005
From: edcjones at erols.com (Edward C. Jones)
Date: Fri Jan  7 20:43:21 2005
Subject: [Python-Dev] Concurrency and Python
Message-ID: <41DEE6B3.8020006@erols.com>

Today's Slashdot 
(http://slashdot.org/articles/05/01/07/158236.shtml?tid=137) points to: 
"The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in 
Software" by Herb Sutter at 
"http://www.gotw.ca/publications/concurrency-ddj.htm".

Is Python a suitable language for concurrent programming? Should Python 
be a good language for concurrent programming? Python nicely satisfies 
several user needs now including teaching beginners, scripting, 
algorithm development, non time-critical code, and wrapping libraries. 
Which of these users will be needing concurrency? What is the state of 
programming theory for concurrency?
From jhylton at gmail.com  Fri Jan  7 21:37:50 2005
From: jhylton at gmail.com (Jeremy Hylton)
Date: Fri Jan  7 21:37:53 2005
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python
	pythonrun.c, 2.161.2.15, 2.161.2.16
In-Reply-To: <E1CmxrV-0006V1-N8@sc8-pr-cvs1.sourceforge.net>
References: <E1CmxrV-0006V1-N8@sc8-pr-cvs1.sourceforge.net>
Message-ID: <e8bf7a5305010712375ca9a373@mail.gmail.com>

How's the merge going?

And if I haven't already said thanks, then, thanks for doing it!

Jeremy
From foom at fuhm.net  Fri Jan  7 21:56:55 2005
From: foom at fuhm.net (James Y Knight)
Date: Fri Jan  7 21:56:59 2005
Subject: [Python-Dev] Concurrency and Python
In-Reply-To: <41DEE6B3.8020006@erols.com>
References: <41DEE6B3.8020006@erols.com>
Message-ID: <AA4E6B82-60EE-11D9-8D68-000A95A50FB2@fuhm.net>

On Jan 7, 2005, at 2:44 PM, Edward C. Jones wrote:
> Is Python a suitable language for concurrent programming?

Depends on what you mean. Python is not very good for shared-memory  
concurrent programming (that is, threads). The library doesn't have  
enough good abstractions for locks/synchronization/etc., and, of  
course, there is the big issue of CPython allowing only one thread to  
execute bytecode at a time.

At the moment, threads are the fad, but I don't believe that will  
scale very well. As you scale up the number of CPUs, the amount of  
time wasted on memory synchronization goes up correspondingly, until  
you're wasting more time on memory consistency than doing actual work.

Thus, I expect the trend to be more towards async message passing  
architectures (that is, multiple processes each with their own memory),  
instead, and I think Python is about as good for that as any existing  
language. Which is to say: reasonable, but not insanely great.
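The message-passing style favored here can be sketched with queues (threads stand in for separate processes purely to keep the sketch self-contained; real isolation would use multiple processes):

```python
import threading
import queue

def worker(inbox, outbox):
    # The worker shares no mutable state with the caller;
    # it only exchanges messages through the queues.
    msg = inbox.get()
    outbox.put(msg * 2)

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()
inbox.put(21)
result = outbox.get()   # blocks until the worker replies
t.join()
assert result == 42
```

Because all communication goes through explicit messages, the same structure maps directly onto multiple processes with their own memory.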

> What is the state of programming theory for concurrency?

For an example of the kind of new language being developed around an  
asynchronous message-passing model, see IBM's poorly-named "X10"  
language. I saw a talk on it and thought it sounded very promising.  
What it adds over the usual message-passing system is an easier way to  
name and access remote data and to spawn parallel activities that  
operate on that data. The part about arrays of data spread out over a  
number of different "places" (roughly, a CPU and its own memory) and  
how to operate on them I found especially interesting.

I tried to find their project website, but since their name conflicts  
with the home automation system, it's hard to google for. Or perhaps  
they don't have a website.

Short summary information:
http://www.csail.mit.edu/events/eventcalendar/calendar.php?show=event&id=131

Talk slides:
http://www.cs.ualberta.ca/~amaral/cascon/CDP04/slides/sarkar.pdf

More talk slides, and a video:
http://www.research.ibm.com/vee04/video.html#sarkar
"Vivek Sarkar, Language and Virtual Machine Challenges for Large-Scale  
Parallel Systems"

James

From kbk at shore.net  Fri Jan  7 22:18:11 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Fri Jan  7 22:18:43 2005
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python
	pythonrun.c, 2.161.2.15, 2.161.2.16
In-Reply-To: <e8bf7a5305010712375ca9a373@mail.gmail.com> (Jeremy Hylton's
	message of "Fri, 7 Jan 2005 15:37:50 -0500")
References: <E1CmxrV-0006V1-N8@sc8-pr-cvs1.sourceforge.net>
	<e8bf7a5305010712375ca9a373@mail.gmail.com>
Message-ID: <87brc17xbg.fsf@hydra.bayview.thirdcreek.com>

Jeremy Hylton <jhylton@gmail.com> writes:

> How's the merge going?

Looks like it's done.  I tagged ast-branch when I finished:

merged_from_MAIN_07JAN05

Right now I'm trying to get Python-ast.c to compile.  It wasn't
modified by the merge, so there's some other issue.

> And if I haven't already said thanks, then, thanks for doing it!

You're welcome!  I volunteer to keep ast-branch synched; how often
do you want to do it?

-- 
KBK
From olsongt at verizon.net  Fri Jan  7 22:37:47 2005
From: olsongt at verizon.net (olsongt@verizon.net)
Date: Fri Jan  7 22:37:50 2005
Subject: [Python-Dev] Re: [Python-checkins]
	python/dist/src/Python	pythonrun.c, 2.161.2.15, 2.161.2.16
Message-ID: <20050107213748.KMJJ28362.out005.verizon.net@outgoing.verizon.net>


> 
> From: kbk@shore.net (Kurt B. Kaiser)
> Date: 2005/01/07 Fri PM 09:18:11 GMT
> To: python-dev@python.org
> Subject: Re: [Python-Dev] Re: [Python-checkins] python/dist/src/Python
> 	pythonrun.c, 2.161.2.15, 2.161.2.16
> 
> Jeremy Hylton <jhylton@gmail.com> writes:
> 
> > How's the merge going?
> 
> Looks like it's done.  I tagged ast-branch when I finished:
> 
> merged_from_MAIN_07JAN05
> 
> Right now I'm trying to get Python-ast.c to compile.  It wasn't
> modified by the merge, so there's some other issue.
> 

Python-ast.c should be autogenerated in the make process by asdl_c.py.  There are still some bugs in it.  The fix I think you need is posted.  A full diff against the current Python-ast.c is attached to patch 742621.

@@ -1310,7 +1310,7 @@
                 free_expr(o->v.Repr.value);
                 break;
         case Num_kind:
-                Py_DECREF(o->v.Num.n);
+                free_expr(o->v.Num.n);
                 break;
         case Str_kind:
                 Py_DECREF(o->v.Str.s);

From jhylton at gmail.com  Fri Jan  7 22:43:56 2005
From: jhylton at gmail.com (Jeremy Hylton)
Date: Fri Jan  7 22:43:59 2005
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python
	pythonrun.c, 2.161.2.15, 2.161.2.16
In-Reply-To: <87brc17xbg.fsf@hydra.bayview.thirdcreek.com>
References: <E1CmxrV-0006V1-N8@sc8-pr-cvs1.sourceforge.net>
	<e8bf7a5305010712375ca9a373@mail.gmail.com>
	<87brc17xbg.fsf@hydra.bayview.thirdcreek.com>
Message-ID: <e8bf7a5305010713434d2b9323@mail.gmail.com>

On Fri, 07 Jan 2005 16:18:11 -0500, Kurt B. Kaiser <kbk@shore.net> wrote:
> Looks like it's done.  I tagged ast-branch when I finished:
> 
> merged_from_MAIN_07JAN05
> 
> Right now I'm trying to get Python-ast.c to compile.  It wasn't
> modified by the merge, so there's some other issue.

I'm getting a compilation failure in symtable.c:

 gcc -pthread -c -fno-strict-aliasing -g -Wall -Wstrict-prototypes -I.
-I../Include   -DPy_BUILD_CORE -o Python/symtable.o
../Python/symtable.c
../Python/symtable.c: In function `symtable_new':
../Python/symtable.c:193: structure has no member named `st_tmpname'

Do you see that?

There is this one ugly corner of Python-ast.c.  There's a routine that
expects to take a pointer to a node, but instead gets passed an int. 
The generated code is bogus, and I haven't decided if it needs to be
worried about.  You need to manually edit the generated code to add a
cast.

> > And if I haven't already said thanks, then, thanks for doing it!
> 
> You're welcome!  I volunteer to keep ast-branch synch'd, how often
> do you want to do it?

I don't think we'll need to merge again.  This last merge got all the
language changes that were made for 2.4.  Since we've agreed to a
moratorium on more compiler/bytecode changes, we shouldn't need to
merge from the head again.

Jeremy
From kbk at shore.net  Sat Jan  8 01:14:00 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Sat Jan  8 01:14:25 2005
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python
	pythonrun.c, 2.161.2.15, 2.161.2.16
In-Reply-To: <e8bf7a5305010713434d2b9323@mail.gmail.com> (Jeremy Hylton's
	message of "Fri, 7 Jan 2005 16:43:56 -0500")
References: <E1CmxrV-0006V1-N8@sc8-pr-cvs1.sourceforge.net>
	<e8bf7a5305010712375ca9a373@mail.gmail.com>
	<87brc17xbg.fsf@hydra.bayview.thirdcreek.com>
	<e8bf7a5305010713434d2b9323@mail.gmail.com>
Message-ID: <877jmo93qv.fsf@hydra.bayview.thirdcreek.com>

Jeremy Hylton <jhylton@gmail.com> writes:

> ../Python/symtable.c:193: structure has no member named `st_tmpname'
>
> Do you see that?

Yeah, the merge eliminated it from the symtable struct in symtable.h.
You moved it to symtable_entry at rev 2.12 in MAIN :-)

I'll research it.

Apparently my build differs enough so that I'm still stuck in
Python-ast.c (once I had fixed pythonrun.c).

> There is this one ugly corner of Python-ast.c.  There's a routine
> that expects to take a pointer to a node, but instead gets passed an
> int.  The generated code is bogus, and I haven't decided if it needs
> to be worried about.  You need to manually edit the generated code to
> add a cast.

OK, I was looking in that direction.  Problem is with cmpop stuff.
Three hard errors when compiling.

OpenBSD.

[...]

> I don't think we'll need to merge again.  This last merge got all the
> language changes that were made for 2.4.  Since we've agreed to a
> moratorium on more compiler/bytecode changes, we shouldn't need to
> merge from the head again.

Is the plan to merge ast-branch to MAIN?  If so, it's a little tricky
since all the changes to MAIN are on ast-branch.  So just before the
final merge we need to merge MAIN to ast-branch once more and then
merge the diff from HEAD to ast-branch back to MAIN.  Or something
like that.

-- 
KBK
From ilya at bluefir.net  Sat Jan  8 04:40:18 2005
From: ilya at bluefir.net (Ilya Sandler)
Date: Sat Jan  8 04:37:35 2005
Subject: [Python-Dev] an idea for improving struct.unpack api 
In-Reply-To: <002701c4f3c2$0a7d3800$e841fea9@oemcomputer>
References: <002701c4f3c2$0a7d3800$e841fea9@oemcomputer>
Message-ID: <Pine.LNX.4.58.0501071758200.3423@bagira>

I will try to respond to all comments at once.

But first a clarification:
   - I am not trying to design a high-level API on top of the existing
     struct.unpack, and
   - I am not trying to design a replacement for struct.unpack
     (if I were to replace struct.unpack(), I would probably go along
     the lines of the StructReader suggested by Raymond).

I view the struct module as a low-level (un)packing library on top of
which more complex stuff can be built, and I am simply suggesting a way
to improve this low-level functionality...


> > We could have an optional offset argument for
> >
> > unpack(format, buffer, offset=None)
> >
> > the offset argument is an object which contains a single integer field
> > which gets incremented inside unpack() to point to the next byte.

> As for "passing offset implies the length is calcsize(fmt)" sub-concept,
> I find that slightly more controversial.  It's convenient,
> but somewhat ambiguous; in other cases (e.g. string methods) passing a
> start/offset and no end/length means to go to the end.

I am not sure I agree: in most cases a starting offset and no
length/end just means: "start whatever you are doing at this offset and
stop whenever you are happy."

At least that's the way I have always thought about functions like
string.find() and friends.

Suggested struct.unpack() change seems to fit this mental model very well

>> the offset argument is an object which contains a single integer field
>> which gets incremented inside unpack() to point to the next byte.

> I find this just too "magical".

Why would it be magical? There is no guessing of user intentions
involved. The function simply returns/uses an extra piece of
information if the user asks for it. And the function already computes
this piece of information anyway...


> It's only useful when you're specifically unpacking data bytes that are
>  compactly back to back (no "filler" e.g. for alignment purposes)

Yes, but it's a very common case when dealing with binary file formats.

E.g. I just looked at the xdrlib.py code and it seems that almost every
invocation of struct.unpack would shrink from 3 lines to 1 line of code
(        i = self.__pos
        self.__pos = j = i+4
        data = self.__buf[i:j]
        return struct.unpack('>l', data)[0]

would become:
        return struct.unpack('>l', self.__buf, self.__pos)[0]
)
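The proposed mutable-offset API can be sketched in pure Python (Offset and unpack_at are hypothetical names for illustration; struct.unpack itself is unchanged):

```python
import struct

class Offset:
    """Hypothetical mutable offset holder, as proposed in the thread."""
    def __init__(self, value=0):
        self.value = value

def unpack_at(fmt, buf, offset=None):
    """Sketch of the proposed unpack(format, buffer, offset) behaviour."""
    if offset is None:
        return struct.unpack(fmt, buf)
    size = struct.calcsize(fmt)
    fields = struct.unpack(fmt, buf[offset.value:offset.value + size])
    offset.value += size          # the offset object is advanced in place
    return fields

buf = struct.pack('>l', 42) + struct.pack('>l', 7)
pos = Offset(0)
assert unpack_at('>l', buf, pos) == (42,)
assert unpack_at('>l', buf, pos) == (7,)   # no slicing or calcsize by hand
```

For comparison, later Python releases (2.5+) grew struct.unpack_from(fmt, buffer, offset), which takes a plain integer offset but leaves the bookkeeping to the caller.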


There are probably other places in the stdlib which would benefit from
this api, and the stdlib does not even deal with binary files that much...

>and pays some conceptual price -- introducing a new specialized type
> to play the role of "mutable int"

but the user does not have to pay anything if he does not need it! The
change is backward compatible. (Note that supporting just int offsets
would eliminate slicing, but it would not eliminate other annoyances;
it's possible to support both Offset and int args, but is it worth the
hassle?)

> and having an argument mutated, which is not usual in Python's library.

Actually, it's so common that we simply stop noticing it :-)
Eg. when we call a superclass's method:
  SuperClass.__init__(self)

So, while I agree that there is an element of unusualness in the
suggested unpack() API, this element seems pretty small to me

> All in all, I suspect that something like:
> hdrsize = struct.calcsize(hdr_fmt)
> itemsize = struct.calcsize(item_fmt)
> reclen = length_of_each_record
> rec = binfile.read(reclen)
> hdr = struct.unpack(hdr_fmt, rec, 0, hdrsize)
>     for offs in itertools.islice(xrange(hdrsize, reclen, itemsize), hdr[0]):
>         item = struct.unpack(item_fmt, rec, offs, itemsize)
>         # process item
> might be a better compromise

I think I again disagree: your example is almost as verbose as the
current unpack() api, you still need to call calcsize() explicitly, and
I don't think there is any chance of gaining any noticeable
performance benefit. Too little gain to bother with any changes...


> struct.pack/struct.unpack is already one of my least-favourite parts
> of the stdlib.  Of the modules I use regularly, I pretty much only ever
> have to go back and re-read the struct (and re) documentation because
> they just won't fit in my brain. Adding additional complexity to them
> seems like a net loss to me.

Net loss to the end programmer? But if he does not need the new
functionality he does not have to use it! In fact, I started by
providing an example of how the new api makes client code simpler.


> I'd much rather specify the format as something like a tuple of values -
>    (INT, UINT, INT, STRING) (where INT &c are objects defined in the
>    struct module). This also then allows users to specify their own formats
>    if they have a particular need for something

I don't disagree, but I think it's orthogonal to offset issue


Ilya




On Thu, 6 Jan 2005, Raymond Hettinger wrote:

> [Ilya Sandler]
> > A problem:
> >
> > The current struct.unpack api works well for unpacking C-structures
> > where everything is usually unpacked at once, but it
> > becomes  inconvenient when unpacking binary files where things
> > often have to be unpacked field by field. Then one has to keep track
> > of offsets, slice the strings,call struct.calcsize(), etc...
>
> Yes.  That bites.
>
>
> > Eg. with a current api unpacking  of a record which consists of a
> > header followed by a variable  number of items would go like this
> >
> >  hdr_fmt="iiii"
> >  item_fmt="IIII"
> >  item_size=calcsize(item_fmt)
> >  hdr_size=calcsize(hdr_fmt)
> >  hdr=unpack(hdr_fmt, rec[0:hdr_size]) #rec is the record to unpack
> >  offset=hdr_size
> >  for i in range(hdr[0]): #assume 1st field of header is a counter
> >    item=unpack( item_fmt, rec[ offset: offset+item_size])
> >    offset+=item_size
> >
> > which is quite inconvenient...
> >
> >
> > A  solution:
> >
> > We could have an optional offset argument for
> >
> > unpack(format, buffer, offset=None)
> >
> > the offset argument is an object which contains a single integer field
> > which gets incremented inside unpack() to point to the next byte.
> >
> > so with a new API the above code could be written as
> >
> >  offset=struct.Offset(0)
> >  hdr=unpack("iiii", offset)
> >  for i in range(hdr[0]):
> >     item=unpack( "IIII", rec, offset)
> >
> > When an offset argument is provided, unpack() should allow some bytes
> > to be left unpacked at the end of the buffer..
> >
> >
> > Does this suggestion make sense? Any better ideas?
>
> Rather than alter struct.unpack(), I suggest making a separate class
> that tracks the offset and encapsulates some of the logic that typically
> surrounds unpacking:
>
>     r = StructReader(rec)
>     hdr = r('iiii')
>     for item in r.getgroups('IIII', times=rec[0]):
>        . . .
>
> It would be especially nice if it handled the more complex case where
> the next offset is determined in-part by the data being read (see the
> example in section 11.3 of the tutorial):
>
>     r = StructReader(open('myfile.zip', 'rb'))
>     for i in range(3):                  # show the first 3 file headers
>         fields = r.getgroup('LLLHH', offset=14)
>         crc32, comp_size, uncomp_size, filenamesize, extra_size = fields
>         filename = g.getgroup('c', offset=16, times=filenamesize)
>         extra = g.getgroup('c', times=extra_size)
>         r.advance(comp_size)
>         print filename, hex(crc32), comp_size, uncomp_size
>
> If you come up with something, I suggest posting it as an ASPN recipe
> and then announcing it on comp.lang.python.  That ought to generate some
> good feedback based on other people's real world issues with
> struct.unpack().
>
>
> Raymond Hettinger
>
>
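A minimal version of the StructReader idea quoted above might look like this (a hypothetical sketch; the constructor signature and getgroups are guesses at the proposed API, not an existing class):

```python
import struct

class StructReader:
    """Minimal sketch of the suggested reader object (hypothetical API)."""
    def __init__(self, data):
        self.data = data
        self.pos = 0

    def __call__(self, fmt):
        # Unpack one format group at the current position and advance.
        size = struct.calcsize(fmt)
        fields = struct.unpack(fmt, self.data[self.pos:self.pos + size])
        self.pos += size
        return fields

    def getgroups(self, fmt, times):
        return [self(fmt) for _ in range(times)]

# A record: a 4-int header whose first field counts the items that follow.
rec = struct.pack('iiii', 2, 0, 0, 0) + struct.pack('IIII', 1, 2, 3, 4) * 2
r = StructReader(rec)
hdr = r('iiii')
groups = r.getgroups('IIII', times=hdr[0])
assert hdr == (2, 0, 0, 0)
assert groups == [(1, 2, 3, 4), (1, 2, 3, 4)]
```

The reader object keeps the offset state, so the calling code never touches calcsize() or slicing directly.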
From ncoghlan at iinet.net.au  Sat Jan  8 05:50:47 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Sat Jan  8 05:50:51 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <Pine.LNX.4.58.0501042046120.678@bagira>
References: <Pine.LNX.4.58.0501042046120.678@bagira>
Message-ID: <41DF66A7.60800@iinet.net.au>

Ilya Sandler wrote:
>     item=unpack( "IIII", rec, offset)

How about making offset a standard integer, and change the signature to return a 
tuple when it is used:

   item = unpack(format, rec) # Full unpacking
   offset = 0
   item, offset = unpack(format, rec, offset) # Partial unpacking

The second item in the returned tuple being the offset of the first byte after 
the end of the unpacked item.
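This variant can be sketched as a thin wrapper over struct (unpack_partial is a hypothetical name; the stdlib function keeps its current signature):

```python
import struct

def unpack_partial(fmt, rec, offset=None):
    # Full unpacking when no offset is given, as today:
    if offset is None:
        return struct.unpack(fmt, rec)
    # Partial unpacking: also return the offset just past the bytes consumed.
    size = struct.calcsize(fmt)
    return struct.unpack(fmt, rec[offset:offset + size]), offset + size

rec = struct.pack('ii', 1, 2) + struct.pack('ii', 3, 4)
item, offset = unpack_partial('ii', rec, 0)
item2, offset = unpack_partial('ii', rec, offset)
assert item == (1, 2) and item2 == (3, 4)
```

Because the offset is a plain integer threaded through return values, no mutable helper object is needed, at the cost of a signature that changes shape depending on the arguments.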

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From python at rcn.com  Sat Jan  8 06:20:43 2005
From: python at rcn.com (Raymond Hettinger)
Date: Sat Jan  8 06:24:03 2005
Subject: [Python-Dev] an idea for improving struct.unpack api 
In-Reply-To: <Pine.LNX.4.58.0501071758200.3423@bagira>
Message-ID: <000a01c4f541$cd8a78a0$86b89d8d@oemcomputer>

[Ilya Sandler]
> I view the struct module as a low-level (un)packing library on top of
> which more complex stuff can be built, and I am simply suggesting a way
> to improve this low-level functionality...
> > > We could have an optional offset argument for
> > >
> > > unpack(format, buffer, offset=None)
> > >
> > > the offset argument is an object which contains a single integer field
> > > which gets incremented inside unpack() to point to the next byte.

-1 on any modification of the existing unpack() function.  It is already
at its outer limits of complexity.  Attaching a stateful tracking object
(needing its own constructor and API) is not an improvement IMO.  Also,
I find the proposed "offset" object to be conceptually difficult to
follow for anything other than the simplest case -- for anything else,
it will make designing, reviewing, and debugging more difficult than it
is now.

In contrast, code built using the StructReader proposal leads to more
flexible, readable code.  Experience with the csv module points to
reader objects being a better solution.



[Nick Coghlan]
> How about making offset a standard integer, and change the signature
> to return a tuple when it is used:
> 
>    item = unpack(format, rec) # Full unpacking
>    offset = 0
>    item, offset = unpack(format, rec, offset) # Partial unpacking
> 
> The second item in the returned tuple being the offset of the first
> byte after the end of the unpacked item

Using standard integers helps improve the proposal by making the
operation less obscure.  But having the signature change is bad; create
a separate function instead:

    item, offset = unpack_here(format, rec, offset)

One other wrinkle is that "item" is itself a tuple and the whole thing
looks odd if unpacked:

    ((var0, var1, var2, var3), offset) = unpack_here(fmtstr, rec, offset)




Raymond

From ilya at bluefir.net  Sat Jan  8 06:37:36 2005
From: ilya at bluefir.net (Ilya Sandler)
Date: Sat Jan  8 06:34:52 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <41DF66A7.60800@iinet.net.au>
References: <Pine.LNX.4.58.0501042046120.678@bagira>
	<41DF66A7.60800@iinet.net.au>
Message-ID: <Pine.LNX.4.58.0501072123240.3423@bagira>

> How about making offset a standard integer, and change the signature to
> return a tuple when it is used:
>  item, offset = unpack(format, rec, offset) # Partial unpacking

Well, it would work well when unpack results are assigned to individual
vars:

   x,y,offset=unpack( "ii", rec, offset)

but it gets more complicated if you have something like:
   coords=unpack("10i", rec)

How would you pass/return offsets here? As an extra element in coords?
   coords=unpack("10i", rec, offset)
   offset=coords.pop()

But that would be counterintuitive and somewhat inconvenient..

Ilya




On Sat, 8 Jan 2005, Nick Coghlan wrote:

> Ilya Sandler wrote:
> >     item=unpack( "IIII", rec, offset)
>
> How about making offset a standard integer, and change the signature to return a
> tuple when it is used:
>
>    item = unpack(format, rec) # Full unpacking
>    offset = 0
>    item, offset = unpack(format, rec, offset) # Partial unpacking
>
> The second item in the returned tuple being the offset of the first byte after
> the end of the unpacked item.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
> ---------------------------------------------------------------
>              http://boredomandlaziness.skystorm.net
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ilya%40bluefir.net
>
From kbk at shore.net  Sat Jan  8 07:33:23 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Sat Jan  8 07:34:41 2005
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python
	pythonrun.c, 2.161.2.15, 2.161.2.16
In-Reply-To: <877jmo93qv.fsf@hydra.bayview.thirdcreek.com> (Kurt B. Kaiser's
	message of "Fri, 07 Jan 2005 19:14:00 -0500")
References: <E1CmxrV-0006V1-N8@sc8-pr-cvs1.sourceforge.net>
	<e8bf7a5305010712375ca9a373@mail.gmail.com>
	<87brc17xbg.fsf@hydra.bayview.thirdcreek.com>
	<e8bf7a5305010713434d2b9323@mail.gmail.com>
	<877jmo93qv.fsf@hydra.bayview.thirdcreek.com>
Message-ID: <873bxc8m6k.fsf@hydra.bayview.thirdcreek.com>

kbk@shore.net (Kurt B. Kaiser) writes:
>  [JH]
>> ../Python/symtable.c:193: structure has no member named `st_tmpname'
>>
>> Do you see that?
>
> Yeah, the merge eliminated it from the symtable struct in symtable.h.
> You moved it to symtable_entry at rev 2.12 in MAIN :-)
>
> I'll research it.

I think it would be more efficient if you tackled it since almost
all the work is in compile.c ==> newcompile.c

The relevant changes are

compile.c 2.286
symtable.h 2.12
symtable.c 2.11

www.python.org/sf/734869

> Apparently my build differs enough so that I'm still stuck in
> Python-ast.c (once I had fixed pythonrun.c).

I resolved all the errors/warnings and diffed against the repository.
I was astonished to see the same changes, slightly different, being
replaced by mine.  Those were /your/ tweaks.

Apparently the $(AST_H) $(AST_C): target ran and Python-ast.c was
recreated (without the changes).  It's not clear to me how/why that
happened.  I did start with a clean checkout, but it seems that the
target only runs if Python-ast.c and/or its .h are missing (they
should have been in the checkout), or older than Python.asdl, which
they are not.  I don't see them in the .cvsignore.

Very amusing.

-- 
KBK
From ncoghlan at iinet.net.au  Sat Jan  8 08:15:23 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Sat Jan  8 08:15:27 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <Pine.LNX.4.58.0501072123240.3423@bagira>
References: <Pine.LNX.4.58.0501042046120.678@bagira>	<41DF66A7.60800@iinet.net.au>
	<Pine.LNX.4.58.0501072123240.3423@bagira>
Message-ID: <41DF888B.2020208@iinet.net.au>

Ilya Sandler wrote:
>>How about making offset a standard integer, and change the signature to
>>return a tuple when it is used:
>> item, offset = unpack(format, rec, offset) # Partial unpacking
> 
> 
> Well, it would work well when unpack results are assigned to individual
> vars:
> 
>    x,y,offset=unpack( "ii", rec, offset)
> 
> but it gets more complicated if you have something like:
>    coords=unpack("10i", rec)
> 
> How would you pass/return offsets here? As an extra element in coords?
>    coords=unpack("10i", rec, offset)
>    offset=coords.pop()
> 
> But that would be counterintuitive and somewhat inconvenient..

I was thinking more along the lines of returning a 2-tuple with the 'normal' 
result of unpack as the first element:

   coords, offset = unpack("ii", rec, offset)
   x, y = coords

Raymond's suggestion of a separate function like 'unpack_here' is probably a 
good one, as magically changing function signatures is evil. Something like:

def unpack_here(format, record, offset = 0):
   end = offset + calcsize(format)
   return (unpack(format, record[offset:end]), end)

Presumably, a C version could avoid the slicing and hence be significantly more 
efficient.

Yes, the return type is a little clumsy, but it should still make it easier to 
write more efficient higher-level APIs that unpack the structure a piece at a time.
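As a quick illustration of how that sketch would be used to walk a record
piece by piece (names as in the sketch above; unpack_here is a proposal here,
not an actual struct API):

```python
import struct
from struct import calcsize, unpack

def unpack_here(format, record, offset=0):
    # Nick's sketch: unpack one format's worth of bytes starting at offset,
    # returning (items, offset of the first byte past the unpacked item)
    end = offset + calcsize(format)
    return (unpack(format, record[offset:end]), end)

rec = struct.pack("ii", 3, 4) + struct.pack("ii", 5, 6)
coords, offset = unpack_here("ii", rec)          # first pair
more, offset = unpack_here("ii", rec, offset)    # continue where we left off
```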

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From p.f.moore at gmail.com  Sat Jan  8 12:09:47 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Sat Jan  8 12:09:51 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <Pine.LNX.4.58.0501071758200.3423@bagira>
References: <002701c4f3c2$0a7d3800$e841fea9@oemcomputer>
	<Pine.LNX.4.58.0501071758200.3423@bagira>
Message-ID: <79990c6b05010803092b1570d1@mail.gmail.com>

On Fri, 7 Jan 2005 19:40:18 -0800 (PST), Ilya Sandler <ilya@bluefir.net> wrote:
> Eg. I just looked at xdrlib.py code and it seems that almost every
> invocation of struct._unpack would shrink from 3 lines to 1 line of code
> 
> (        i = self.__pos
>         self.__pos = j = i+4
>         data = self.__buf[i:j]
>         return struct.unpack('>l', data)[0]
> 
> would become:
>         return struct.unpack('>l', self.__buf, self.__pos)[0]
> )

FWIW, I could read and understand your original code without any
problems, whereas in the second version I would completely miss the
fact that self.__pos is updated, precisely because mutating arguments
are very rare in Python functions.

OTOH, Nick's idea of returning a tuple with the new offset might make
your example shorter without sacrificing readability:

    result, newpos = struct.unpack('>l', self.__buf, self.__pos)
    self.__pos = newpos # retained "newpos" for readability...
    return result

A third possibility - rather than "magically" adding an additional
return value because you supply a position, you could have a "where am
I?" format symbol (say & by analogy with the C "address of" operator).
Then you'd say

    result, newpos = struct.unpack('>l&', self.__buf, self.__pos)

Please be aware, I don't have a need myself for this feature - my
interest is as a potential reader of others' code...

Paul.
From anthony at interlink.com.au  Sat Jan  8 14:05:15 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Sat Jan  8 14:05:04 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <D3E9B6E7-60AC-11D9-AE74-000A958D1666@cwi.nl>
References: <DD45B9F1-5EA0-11D9-BB20-000D934FF6B4@uitdesloot.nl>
	<21A975DC-6094-11D9-922D-000A95BA5446@redivi.com>
	<D3E9B6E7-60AC-11D9-AE74-000A958D1666@cwi.nl>
Message-ID: <200501090005.15921.anthony@interlink.com.au>

On Saturday 08 January 2005 00:05, Jack Jansen wrote:
> > This patch implements the proposed direct framework linking:
> > http://python.org/sf/1097739
>
> Looks good, I'll incorporate it. And as I haven't heard of any
> showstoppers for the -undefined dynamic_lookup (and Anthony seems to be
> offline this week) I'll put that in too.

Sorry, I've been busy on other projects for the last couple of weeks,
and email's backed up to an alarming degree.

Currently I'm thinking of a 2.3.5 sometime around the 20th or so. I'll
have a better idea next week, once I've been back at work for a couple
of days and I've seen what stuff's backed up awaiting my time. 

At the moment I'm thinking of a 2.4.1 in maybe early March. The only
really outstanding bugfix is the marshal one, afaik.

Anthony
-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From gvanrossum at gmail.com  Sat Jan  8 17:52:31 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat Jan  8 17:52:34 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <79990c6b05010803092b1570d1@mail.gmail.com>
References: <002701c4f3c2$0a7d3800$e841fea9@oemcomputer>
	<Pine.LNX.4.58.0501071758200.3423@bagira>
	<79990c6b05010803092b1570d1@mail.gmail.com>
Message-ID: <ca471dc205010808523924c575@mail.gmail.com>

First, let me say two things:

(a) A higher-level API can and should be constructed which acts like a
(binary) stream but has additional methods for reading and writing
values using struct format codes (or, preferably, somewhat
higher-level type names, as suggested). Instances of this API should
be constructable from a stream or from a "buffer" (e.g. a string).

(b) -1 on Ilya's idea of having a special object that acts as an
input-output integer; it is too unpythonic (no matter your objection).

[Paul Moore]
> OTOH, Nick's idea of returning a tuple with the new offset might make
> your example shorter without sacrificing readability:
> 
>     result, newpos = struct.unpack('>l', self.__buf, self.__pos)
>     self.__pos = newpos # retained "newpos" for readability...
>     return result

This is okay, except I don't want to overload this on unpack() --
let's pick a different function name like unpack_at().

> A third possibility - rather than "magically" adding an additional
> return value because you supply a position, you could have a "where am
> I?" format symbol (say & by analogy with the C "address of" operator).
> Then you'd say
> 
>     result, newpos = struct.unpack('>l&', self.__buf, self.__pos)
> 
> Please be aware, I don't have a need myself for this feature - my
> interest is as a potential reader of others' code...

I think that adding more magical format characters is probably not
doing the readers of this code a service.

I do like the idea of not introducing an extra level of tuple to
accommodate the position return value but instead make it the last
item in the tuple when using unpack_at().

Then the definition would be:

def unpack_at(fmt, buf, pos):
    size = calcsize(fmt)
    end = pos + size
    data = buf[pos:end]
    if len(data) < size:
        raise struct.error("not enough data for format")
    # if data is too long, that would be a bug in buf[pos:end] and
    # would cause an error below
    ret = unpack(fmt, data)
    ret = ret + (end,)
    return ret
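Exercising that definition on a small buffer, purely for illustration
(unpack_at is the proposed name, not an existing struct function):

```python
import struct
from struct import calcsize, unpack

def unpack_at(fmt, buf, pos):
    # Guido's sketch: unpack at pos, appending the end offset to the tuple
    size = calcsize(fmt)
    end = pos + size
    data = buf[pos:end]
    if len(data) < size:
        raise struct.error("not enough data for format")
    return unpack(fmt, data) + (end,)

buf = struct.pack(">l", 42) + struct.pack(">l", 7)
value, pos = unpack_at(">l", buf, 0)    # -> 42, position 4
value2, pos = unpack_at(">l", buf, pos) # -> 7, position 8
```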

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From m.bless at gmx.de  Sat Jan  8 18:40:33 2005
From: m.bless at gmx.de (Martin Bless)
Date: Sat Jan  8 18:40:26 2005
Subject: [Python-Dev] Re: csv module TODO list
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
Message-ID: <3o50u0tmv1hbpt71jkre94n32q38cdpbdb@4ax.com>

I'd love to see a 'split' and a 'join' function in the csv module to
just convert between string and list without having to bother about
files. 

Something like

csv.split(aStr [, dialect='excel'[, fmtparam]])  -> list object

and

csv.join(aList[, dialect='excel'[, fmtparam]]) -> str object

Feasible?
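Such helpers could be sketched on top of the existing reader/writer objects;
csv_split and csv_join below are hypothetical names for illustration, not an
actual csv module API:

```python
import csv
import io

def csv_split(a_str, dialect='excel', **fmtparams):
    # hypothetical helper: parse one CSV record from a plain string
    return next(csv.reader(io.StringIO(a_str), dialect=dialect, **fmtparams))

def csv_join(a_list, dialect='excel', **fmtparams):
    # hypothetical helper: format a list as a single CSV record string
    out = io.StringIO()
    csv.writer(out, dialect=dialect, **fmtparams).writerow(a_list)
    return out.getvalue().rstrip('\r\n')
```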

mb - Martin




From kbk at shore.net  Sat Jan  8 20:15:36 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Sat Jan  8 20:16:04 2005
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200501081915.j08JFaqo021328@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  267 open ( +6) /  2727 closed ( +9) /  2994 total (+15)
Bugs    :  798 open ( -3) /  4748 closed (+15) /  5546 total (+12)
RFE     :  165 open ( +0) /   140 closed ( +1) /   305 total ( +1)

New / Reopened Patches
______________________

Remove witty comment in pydoc.py  (2005-01-01)
CLOSED http://python.org/sf/1094007  opened by  Reinhold Birkenfeld

Docs for file() vs open()  (2005-01-01)
CLOSED http://python.org/sf/1094011  opened by  Reinhold Birkenfeld

Improvements for shutil.copytree()  (2005-01-01)
CLOSED http://python.org/sf/1094015  opened by  Reinhold Birkenfeld

xml.dom.minidom.Node.replaceChild(obj, x, x) removes child x  (2005-01-01)
       http://python.org/sf/1094164  opened by  Felix Rabe

os.py: base class _Environ on dict instead of UserDict  (2005-01-02)
       http://python.org/sf/1094387  opened by  Matthias Klose

add Bunch type to collections module  (2005-01-02)
       http://python.org/sf/1094542  opened by  Steven Bethard

self.button.pack() in tkinter.tex example  (2005-01-03)
       http://python.org/sf/1094815  opened by  [N/A]

fixes urllib2 digest to allow arbitrary methods  (2005-01-03)
       http://python.org/sf/1095362  opened by  John Reese

Argument passing from /usr/bin/idle2.3 to idle.py  (2003-11-30)
       http://python.org/sf/851459  reopened by  jafo

fix for trivial flatten bug in astgen  (2005-01-04)
       http://python.org/sf/1095541  opened by  DSM

exclude CVS conflict files in sdist command  (2005-01-04)
       http://python.org/sf/1095784  opened by  Wummel

Fix for wm_iconbitmap to allow .ico files under Windows.  (2005-01-05)
       http://python.org/sf/1096231  opened by  John Fouhy

Info Associated with Merge to AST  (2005-01-07)
       http://python.org/sf/1097671  opened by  Kurt B. Kaiser

Direct framework linking for MACOSX_DEPLOYMENT_TARGET < 10.3  (2005-01-07)
       http://python.org/sf/1097739  opened by  Bob Ippolito

Encoding for Code Page 273 used by EBCDIC Germany Austria  (2005-01-07)
       http://python.org/sf/1097797  opened by  Michael Bierenfeld

Patches Closed
______________

locale.getdefaultlocale does not return tuple in some OS  (2004-10-21)
       http://python.org/sf/1051395  closed by  rhettinger

imghdr -- identify JPEGs in EXIF format  (2003-06-08)
       http://python.org/sf/751031  closed by  rhettinger

Remove witty comment in pydoc.py  (2005-01-01)
       http://python.org/sf/1094007  closed by  rhettinger

Docs for file() vs open()  (2005-01-01)
       http://python.org/sf/1094011  closed by  rhettinger

Improvements for shutil.copytree()  (2005-01-01)
       http://python.org/sf/1094015  closed by  jlgijsbers

a new subprocess.call which raises an error on non-zero rc  (2004-11-23)
       http://python.org/sf/1071764  closed by  astrand

Argument passing from /usr/bin/idle2.3 to idle.py  (2003-11-30)
       http://python.org/sf/851459  closed by  jafo

@decorators, including classes  (2004-08-12)
       http://python.org/sf/1007991  closed by  jackdied

Convert glob.glob to generator-based DFS  (2004-04-27)
       http://python.org/sf/943206  closed by  jlgijsbers

Make cgi.py use email instead of rfc822 or mimetools  (2004-12-06)
       http://python.org/sf/1079734  closed by  jlgijsbers

New / Reopened Bugs
___________________

marshal.dumps('hello',0) "Access violation"  (2005-01-03)
CLOSED http://python.org/sf/1094960  opened by  Mark Brophy

General FAQ - incorrect "most stable version"  (2005-01-03)
       http://python.org/sf/1095328  opened by  Tim Delaney

Python FAQ: list.sort() out of date  (2005-01-03)
CLOSED http://python.org/sf/1095342  opened by  Tim Delaney

Bug In Python  (2005-01-04)
CLOSED http://python.org/sf/1095789  opened by  JastheAce

"Macintosh" references in the docs need to be checked.  (2005-01-04)
       http://python.org/sf/1095802  opened by  Jack Jansen

The doc for DictProxy is missing  (2005-01-04)
       http://python.org/sf/1095821  opened by  Colin J. Williams

Apple-installed Python fails to build extensions  (2005-01-04)
       http://python.org/sf/1095822  opened by  Jack Jansen

time.tzset() not built on Solaris  (2005-01-05)
       http://python.org/sf/1096244  opened by  Gregory Bond

sys.__stdout__ doco isn't discouraging enough  (2005-01-05)
       http://python.org/sf/1096310  opened by  Just van Rossum

_DummyThread() objects not freed from threading._active map  (2004-12-22)
       http://python.org/sf/1089632  reopened by  saravanand

Example needed in os.stat()  (2005-01-06)
CLOSED http://python.org/sf/1097229  opened by  Facundo Batista

SimpleHTTPServer sends wrong Content-Length header  (2005-01-06)
       http://python.org/sf/1097597  opened by  David Schachter

urllib2 doesn't handle urls without a scheme  (2005-01-07)
       http://python.org/sf/1097834  opened by  Jack Jansen

getsource and getsourcelines in the inspect module  (2005-01-07)
CLOSED http://python.org/sf/1098134  opened by  Björn Lindqvist

mailbox should use email not rfc822  (2003-06-19)
       http://python.org/sf/756982  reopened by  jlgijsbers

typo in "Python Tutorial": 1. Whetting your appetite  (2005-01-08)
       http://python.org/sf/1098497  opened by  Ludootje

Bugs Closed
___________

Don't define _SGAPI on IRIX  (2003-04-27)
       http://python.org/sf/728330  closed by  loewis

python24.msi  install error  (2004-12-01)
       http://python.org/sf/1076500  closed by  loewis

garbage collector still documented as optional  (2004-12-27)
       http://python.org/sf/1091740  closed by  rhettinger

marshal.dumps('hello',0) "Access violation"  (2005-01-03)
       http://python.org/sf/1094960  closed by  rhettinger

General FAQ: list.sort() out of date  (2005-01-04)
       http://python.org/sf/1095342  closed by  jlgijsbers

Bug In Python  (2005-01-04)
       http://python.org/sf/1095789  closed by  rhettinger

test_macostools fails when running from source  (2004-07-16)
       http://python.org/sf/992185  closed by  jackjansen

Sate/Save typo in Mac/scripts/BuildApplication.py  (2004-12-01)
       http://python.org/sf/1076490  closed by  jackjansen

_DummyThread() objects not freed from threading._active map  (2004-12-22)
       http://python.org/sf/1089632  closed by  bcannon

Example needed in os.stat()  (2005-01-06)
       http://python.org/sf/1097229  closed by  facundobatista

gethostbyaddr on redhat for multiple hostnames  (2004-12-14)
       http://python.org/sf/1085069  closed by  loewis

Unable to see Python binary  (2004-12-10)
       http://python.org/sf/1082874  closed by  loewis

Change in signal function in the signal module  (2004-12-10)
       http://python.org/sf/1083177  closed by  akuchling

test_descr fails on win2k  (2004-07-12)
       http://python.org/sf/989337  closed by  rhettinger

test_imp failure  (2004-07-15)
       http://python.org/sf/991708  closed by  rhettinger

getsource and getsourcelines in the inspect module  (2005-01-07)
       http://python.org/sf/1098134  closed by  jlgijsbers

crash (SEGV) in Py_EndInterpreter()  (2002-11-17)
       http://python.org/sf/639611  closed by  jlgijsbers

shutil.copytree copies stat of files, but not of dirs  (2004-10-18)
       http://python.org/sf/1048878  closed by  jlgijsbers

shutil.copytree uses os.mkdir instead of os.mkdirs  (2004-06-19)
       http://python.org/sf/975763  closed by  jlgijsbers

RFE Closed
__________

optparse .error() should print options list  (2004-12-22)
       http://python.org/sf/1089955  closed by  gward

From skip at pobox.com  Sat Jan  8 21:45:25 2005
From: skip at pobox.com (Skip Montanaro)
Date: Sat Jan  8 21:45:30 2005
Subject: [Python-Dev] os.removedirs() vs. shutil.rmtree()
Message-ID: <16864.18021.476235.551214@montanaro.dyndns.org>

Is there a reason the standard library needs both os.removedirs and
shutil.rmtree?  They seem awful similar to me (I can see they aren't really
identical).  Ditto for os.renames and shutil.move.  Presuming they are all
really needed, is there some reason they don't all belong in the same
module?

Skip
From jlg at dds.nl  Sat Jan  8 22:05:29 2005
From: jlg at dds.nl (Johannes Gijsbers)
Date: Sat Jan  8 22:02:10 2005
Subject: [Python-Dev] os.removedirs() vs. shutil.rmtree()
In-Reply-To: <16864.18021.476235.551214@montanaro.dyndns.org>
References: <16864.18021.476235.551214@montanaro.dyndns.org>
Message-ID: <20050108210529.GA29102@authsmtp.dds.nl>

On Sat, Jan 08, 2005 at 02:45:25PM -0600, Skip Montanaro wrote:
> Is there a reason the standard library needs both os.removedirs and
> shutil.rmtree?  They seem awful similar to me (I can see they aren't really
> identical).  Ditto for os.renames and shutil.move.  Presuming they are all
> really needed, is there some reason they don't all belong in the same
> module?

os.removedirs() only removes directories; it will fail to remove a
non-empty directory, for example. It also doesn't have the
ignore_errors/onerror arguments [1]. os.renames() differs from
shutil.move() in that it also creates intermediate directories (and
deletes any left empty).
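The behavioral difference shows up in a few lines; a sketch using temporary
directories:

```python
import os
import shutil
import tempfile

# os.removedirs prunes a chain of *empty* directories, leaf first.
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "a", "b"))
os.removedirs(os.path.join(base, "a", "b"))  # removes b, a, and base itself
removed_empty_chain = not os.path.exists(base)

# With any file present, os.removedirs fails where shutil.rmtree succeeds.
base = tempfile.mkdtemp()
with open(os.path.join(base, "file.txt"), "w") as f:
    f.write("data")
try:
    os.removedirs(base)  # raises: directory not empty
    removedirs_failed = False
except OSError:
    removedirs_failed = True
shutil.rmtree(base)  # deletes the tree, contents and all
rmtree_succeeded = not os.path.exists(base)
```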

So they're not identical, but I do agree they should be consolidated
and moved into one module. I'd say shutil, both because the os
module is already awfully crowded, and because these functions are
"high-level operations on files and collections of files" rather
than "a more portable way of using operating system dependent
functionality [...]".

Johannes

[1] That may actually be a good thing, though. It was a pain to keep
those working backwards-compatibly when shutil.rmtree was recently
rewritten.
From irmen at xs4all.nl  Sun Jan  9 04:11:11 2005
From: irmen at xs4all.nl (Irmen de Jong)
Date: Sun Jan  9 04:11:13 2005
Subject: [Python-Dev] Possible bug in codecs readline? It breaks lines apart.
Message-ID: <41E0A0CF.1070502@xs4all.nl>

Hello
using current cvs Python on Linux, I observe this weird
behavior of the readline() method on file-like objects
returned from the codecs module:

[irmen@atlantis ypage]$ cat testfile1.txt
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
offending line: ladfj askldfj klasdj fskla dfzaskdj fasklfj laskd 
fjasklfzzzzaa%whereisthis!!!
next line.
[irmen@atlantis ypage]$ cat testfile2.txt
aaaaaaaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbbbbbbbbb
stillokay:bbbbxx
broken!!!!badbad
againokay.
[irmen@atlantis ypage]$ cat bug.py
import codecs
for name in ("testfile1.txt","testfile2.txt"):
     f=codecs.open(name,encoding="iso-8859-1")  # precise encoding doesn't matter
     print "----",name,"----"
     for line in f:
         print "LINE:"+repr(line)
[irmen@atlantis ypage]$ python25 bug.py
---- testfile1.txt ----
LINE:u'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy\r\n'
LINE:u'offendi'
LINE:u'ng line: ladfj askldfj klasdj fskla dfzaskdj fasklfj laskd fjasklfzzzzaa'
LINE:u'%whereisthis!!!\r\n'
LINE:u'next line.\r\n'
---- testfile2.txt ----
LINE:u'aaaaaaaaaaaaaaaaaaaaaaaa\n'
LINE:u'bbbbbbbbbbbbbbbbbbbbbbbb\n'
LINE:u'stillokay:bbbbxx\n'
LINE:u'broke'
LINE:u'n!!!!badbad\n'
LINE:u'againokay.\n'
[irmen@atlantis ypage]$


See how it breaks certain lines in half?
It only happens when a certain encoding is used, so regular
file objects behave as expected. Also, readlines() works fine.

Python 2.3.4 and Python 2.4 do not have this problem.

Am I missing something or is this a bug? Thanks!

--Irmen
From s.percivall at chello.se  Sun Jan  9 04:38:53 2005
From: s.percivall at chello.se (Simon Percivall)
Date: Sun Jan  9 04:38:56 2005
Subject: [Python-Dev] Possible bug in codecs readline? It breaks lines
	apart.
In-Reply-To: <41E0A0CF.1070502@xs4all.nl>
References: <41E0A0CF.1070502@xs4all.nl>
Message-ID: <FBED068F-61EF-11D9-8C16-0003934AD54A@chello.se>

On 2005-01-09, at 04.11, Irmen de Jong wrote:
> Hello
> using current cvs Python on Linux, I observe this weird
> behavior of the readline() method on file-like objects
> returned from the codecs module:
>
> [...]
>
> See how it breaks certain lines in half?
> It only happens when a certain encoding is used, so regular
> file objects behave as expected. Also, readlines() works fine.
>
> Python 2.3.4 and Python 2.4 do not have this problem.
>
> Am I missing something or is this a bug? Thanks!

It looks like the readline method broke at revision 1.36 of codecs.py,
when it was modified, yes.

//Simon

From irmen at xs4all.nl  Sun Jan  9 17:49:29 2005
From: irmen at xs4all.nl (Irmen de Jong)
Date: Sun Jan  9 17:49:30 2005
Subject: [Python-Dev] Possible bug in codecs readline? It breaks lines
	apart.
In-Reply-To: <FBED068F-61EF-11D9-8C16-0003934AD54A@chello.se>
References: <41E0A0CF.1070502@xs4all.nl>
	<FBED068F-61EF-11D9-8C16-0003934AD54A@chello.se>
Message-ID: <41E16099.9080804@xs4all.nl>

Simon Percivall wrote:

> It looks like the readline method broke at revision 1.36 of codecs.py,
> when it was modified, yes.

Okay. I've created a bug report 1098990: codec readline() splits lines apart

--Irmen
From ilya at bluefir.net  Sun Jan  9 21:19:42 2005
From: ilya at bluefir.net (Ilya Sandler)
Date: Sun Jan  9 21:16:52 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <ca471dc205010808523924c575@mail.gmail.com>
References: <002701c4f3c2$0a7d3800$e841fea9@oemcomputer> 
	<Pine.LNX.4.58.0501071758200.3423@bagira>
	<79990c6b05010803092b1570d1@mail.gmail.com>
	<ca471dc205010808523924c575@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0501090945590.669@bagira>

> (a) A higher-level API can and should be constructed which acts like a
> (binary) stream but has additional methods for reading and writing
> values using struct format codes (or, preferably, somewhat
> higher-level type names, as suggested). Instances of this API should
> be constructable from a stream or from a "buffer" (e.g. a string).


Ok, I think it's getting much bigger than what I was initially aiming for
;-)...

One more comment though regarding unpack_at

> Then the definition would be:
>
> def unpack_at(fmt, buf, pos):
>     size = calcsize(fmt)
>     end = pos + size
>     data = buf[pos:end]
>     if len(data) < size:
>         raise struct.error("not enough data for format")
>     ret = unpack(fmt, data)
>     ret = ret + (end,)
>     return ret

While I see the usefulness of this, I think it's too limited, e.g.
  result = unpack_at(fmt, buf, offset)
  offset = result[-1]
feels quite unnatural...
So my feeling is that adding this new API is not worth the trouble.
Especially if there are plans for anything higher level...

Instead, I would suggest that even a very limited initial
implementation of a StructReader()-like object, as suggested by Raymond,
would be more useful...

class StructReader: # or maybe call it Unpacker?
    def __init__(self, buf):
        self._buf = buf
        self._offset = 0
    def unpack(self, format):
        """unpack at current offset, advance internal offset
        accordingly"""
        size = struct.calcsize(format)
        ret = struct.unpack(format, self._buf[self._offset:self._offset+size])
        self._offset += size
        return ret
    # or maybe just make _offset public??
    def tell(self):
        "return current offset"
        return self._offset
    def seek(self, offset, whence=0):
        "set current offset"
        self._offset = offset

This solves the original offset-tracking problem completely (at least as
far as inconvenience is concerned; improving unpack() performance
would require the struct reader to be written in C), while allowing the
rest to be added later.

E.g. the original "hdr + variable number of data items" code would
look like:

 buf=StructReader(rec)
 hdr=buf.unpack("iiii")
 for i in range(hdr[0]):
    item=buf.unpack( "IIII")
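A self-contained, runnable version of that example (using struct.unpack_from
to avoid the slicing; StructReader is the name assumed in this thread, not an
existing API):

```python
import struct

class StructReader:
    """Minimal offset-tracking unpacker, as sketched in the thread."""
    def __init__(self, buf):
        self._buf = buf
        self._offset = 0
    def unpack(self, format):
        # unpack at the current offset, then advance past the consumed bytes
        ret = struct.unpack_from(format, self._buf, self._offset)
        self._offset += struct.calcsize(format)
        return ret

# A four-int header whose first field is the item count, then the items.
rec = struct.pack("iiii", 2, 0, 0, 0) + struct.pack("IIII", 1, 2, 3, 4) * 2
buf = StructReader(rec)
hdr = buf.unpack("iiii")
items = [buf.unpack("IIII") for _ in range(hdr[0])]
```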


Ilya


PS with unpack_at() this code would look like:

 offset = 0
 hdr = unpack_at("iiii", rec, offset)
 offset = hdr[-1]
 for i in range(hdr[0]):
    item = unpack_at("IIII", rec, offset)
    offset = item[-1]




On Sat, 8 Jan 2005, Guido van Rossum wrote:

> First, let me say two things:
>
> (a) A higher-level API can and should be constructed which acts like a
> (binary) stream but has additional methods for reading and writing
> values using struct format codes (or, preferably, somewhat
> higher-level type names, as suggested). Instances of this API should
> be constructable from a stream or from a "buffer" (e.g. a string).
>
> (b) -1 on Ilya's idea of having a special object that acts as an
> input-output integer; it is too unpythonic (no matter your objection).
>
> [Paul Moore]
> > OTOH, Nick's idea of returning a tuple with the new offset might make
> > your example shorter without sacrificing readability:
> >
> >     result, newpos = struct.unpack('>l', self.__buf, self.__pos)
> >     self.__pos = newpos # retained "newpos" for readability...
> >     return result
>
> This is okay, except I don't want to overload this on unpack() --
> let's pick a different function name like unpack_at().
>
> > A third possibility - rather than "magically" adding an additional
> > return value because you supply a position, you could have a "where am
> > I?" format symbol (say & by analogy with the C "address of" operator).
> > Then you'd say
> >
> >     result, newpos = struct.unpack('>l&', self.__buf, self.__pos)
> >
> > Please be aware, I don't have a need myself for this feature - my
> > interest is as a potential reader of others' code...
>
> I think that adding more magical format characters is probably not
> doing the readers of this code a service.
>
> I do like the idea of not introducing an extra level of tuple to
> accommodate the position return value but instead make it the last
> item in the tuple when using unpack_at().
>
> Then the definition would be:
>
> def unpack_at(fmt, buf, pos):
>     size = calcsize(fmt)
>     end = pos + size
>     data = buf[pos:end]
>     if len(data) < size:
>         raise struct.error("not enough data for format")
>     # if data is too long, that would be a bug in buf[pos:end] and
>     # would cause an error below
>     ret = unpack(fmt, data)
>     ret = ret + (end,)
>     return ret
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
From irmen at xs4all.nl  Sun Jan  9 21:42:20 2005
From: irmen at xs4all.nl (Irmen de Jong)
Date: Sun Jan  9 21:42:20 2005
Subject: [Python-Dev] Possible bug in codecs readline? It breaks lines
	apart.
In-Reply-To: <41E16099.9080804@xs4all.nl>
References: <41E0A0CF.1070502@xs4all.nl>	<FBED068F-61EF-11D9-8C16-0003934AD54A@chello.se>
	<41E16099.9080804@xs4all.nl>
Message-ID: <41E1972C.5010807@xs4all.nl>

> Okay. I've created a bug report 1098990: codec readline() splits lines 
> apart

Btw, I've set it to group Python 2.5, is that correct?
Or should bugs that relate to the current CVS trunk have no group?
Thx
Irmen.
From andrewm at object-craft.com.au  Mon Jan 10 00:37:17 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Mon Jan 10 00:37:22 2005
Subject: [Python-Dev] Re: csv module TODO list 
In-Reply-To: <3o50u0tmv1hbpt71jkre94n32q38cdpbdb@4ax.com> 
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<3o50u0tmv1hbpt71jkre94n32q38cdpbdb@4ax.com>
Message-ID: <20050109233717.8A3C33C8E5@coffee.object-craft.com.au>

>I'd love to see a 'split' and a 'join' function in the csv module to
>just convert between string and list without having to bother about
>files. 
>
>Something like
>
>csv.split(aStr [, dialect='excel'[, fmtparam]])  -> list object
>
>and
>
>csv.join(aList, e[, dialect='excel'[, fmtparam]]) -> str object
>
>Feasible?

Yes, it's feasible, although newlines can be embedded within fields
of a CSV record, hence the use of the iterator, rather than working with
strings. In your example above, if the parser gets to the end of the
string and finds it's still within a field, I'd propose just raising
an exception.

No promises, however - I only have a finite amount of time to work on
this at the moment.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From python at rcn.com  Mon Jan 10 01:19:03 2005
From: python at rcn.com (Raymond Hettinger)
Date: Mon Jan 10 01:22:23 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: <Pine.LNX.4.58.0501090945590.669@bagira>
Message-ID: <001b01c4f6a9$fdf033e0$e841fea9@oemcomputer>

> Instead, I would suggest that even a very limited initial
> implementation of StructReader() like object suggested by Raymond
> would be more useful...

I have a draft patch also.
Let's work out improvements off-list (perhaps on ASPN).
Feel free to email me directly.


Raymond

From andrewm at object-craft.com.au  Mon Jan 10 01:40:06 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Mon Jan 10 01:40:10 2005
Subject: [Python-Dev] Re: [Csv] Minor change to behaviour of csv module 
In-Reply-To: <48F57F83-60B3-11D9-ADA4-000A95EFAE9E@aleax.it> 
References: <1105105520.41de927049442@mcherm.com>
	<48F57F83-60B3-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <20050110004006.88CB63C8E5@coffee.object-craft.com.au>

>> Andrew explains that in the CSV module, escape characters are not
>> properly removed.
>>
>> Magnus writes:
>>> IMO this is the *only* reasonable behaviour. I don't understand why
>>> the escape character should be left in; this is one of the reason why
>>> UNIX-style colon-separated values don't work with the current module.
>>
>> Andrew writes back later:
>>> Thinking about this further, I suspect we have to retain the current
>>> behaviour, as broken as it is, as the default: it's conceivable that
>>> someone somewhere is post-processing the result to remove the 
>>> backslashes,
>>> and if we fix the csv module, we'll break their code.
>>
>> I'm with Magnus on this. No one has 4 year old code using the CSV 
>> module.
>> The existing behavior is just simply WRONG. Sure, of course we should
>> try to maintain backward compatibility, but surely SOME cases don't
>> require it, right? Can't we treat this misbehavior as an outright bug?
>
>+1 -- the nonremoval of escape characters smells like a bug to me, too.

Okay, I'm glad the community agrees (less work, less crustification).

For what it's worth, it wasn't a bug so much as a misfeature. I was
explicitly adding the escape character back in. The intention was to
make the feature more forgiving of users who accidentally set the escape
character - in other words, only special (quoting, escaping, field
delimiter) characters received special treatment. With the benefit of
hindsight, that was an inadequately considered choice.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From andrewm at object-craft.com.au  Mon Jan 10 05:44:41 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Mon Jan 10 05:44:45 2005
Subject: [Python-Dev] csv module and universal newlines
In-Reply-To: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> 
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
Message-ID: <20050110044441.250103C889@coffee.object-craft.com.au>

This item, from the TODO list, has been bugging me for a while:

>* Reader and universal newlines don't interact well, reader doesn't
>  honour Dialect's lineterminator setting. All outstanding bug id's
>  (789519, 944890, 967934 and 1072404) are related to this - it's 
>  a difficult problem and further discussion is needed.

The csv parser consumes lines from an iterator, but it also has its own
idea of end-of-line conventions, which are currently only used by the
writer, not the reader, which is a source of much confusion. The writer,
by default, also attempts to emit a \r\n sequence, which results in more
confusion unless the file is opened in binary mode.
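
The writer's \r\n default is easy to demonstrate (io.StringIO stands in
for a real file here, so no newline translation occurs; with a text-mode
file on some platforms the terminator would be translated again, giving
\r\r\n - hence the binary-mode advice):

```python
import csv
import io

# The default 'excel' dialect uses lineterminator '\r\n'.
buf = io.StringIO()
csv.writer(buf).writerow(['a', 'b'])
assert buf.getvalue() == 'a,b\r\n'
```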

I'm looking for suggestions for how we can mitigate these problems
(without breaking things for existing users).

The standard file iterator includes the end-of-line characters in the
returned string. One potential solution, then, is to ignore the line
chunking done by the file iterator, and logically concatenate the source
lines until the csv parser's idea of lineterminator is seen - but this
negates the benefits of using an iterator.
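
That first idea can be sketched as a naive rechunking generator
(hypothetical helper; it deliberately ignores the harder problem of a
lineterminator appearing inside a quoted field, which the real parser
would have to track):

```python
def rechunk(lines, lineterminator='\r\n'):
    # Ignore the chunking done by the source iterator: logically
    # concatenate its output and yield one logical record per
    # occurrence of the dialect's lineterminator.
    buf = ''
    for chunk in lines:
        buf += chunk
        while lineterminator in buf:
            record, buf = buf.split(lineterminator, 1)
            yield record
    if buf:
        yield buf  # trailing data with no final terminator

# A field with an embedded bare '\n' is no longer split in two:
src = iter(['a,"x\n', 'y",b\r\n', 'c,d\r\n'])
assert list(rechunk(src)) == ['a,"x\ny",b', 'c,d']
```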

Another option might be to provide a new interface that relies on a
file-like object being supplied. The lineterminator character would only
be used with this interface, with the current interface falling back to
using only \n. Rather a drastic solution.

Any other ideas?

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From mcherm at mcherm.com  Mon Jan 10 15:40:10 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Mon Jan 10 15:40:13 2005
Subject: [Python-Dev] Re: Subscribing to PEP updates
Message-ID: <1105368010.41e293ca97620@mcherm.com>

Barry writes:
> As an experiment, I just added a PEP topic to the python-checkins
> mailing list.  You could subscribe to this list and just select the PEP
> topic (which matches the regex "PEP" in the Subject header or first few
> lines of the body).
>
> Give it a shot and let's see if that does the trick.

I just got notification of the change to PEP 246 (and I haven't received
other checkin notifications), so I guess I can report that this is
working.

Thanks, Barry. Should we now mention this on c.l.py for others who
may be interested?

-- Michael Chermside

From aleax at aleax.it  Mon Jan 10 15:42:11 2005
From: aleax at aleax.it (Alex Martelli)
Date: Mon Jan 10 16:08:58 2005
Subject: [Python-Dev] PEP 246, redux
Message-ID: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>

I had been promising to rewrite PEP 246 to incorporate the last several 
years' worth of discussions &c about it, and Guido's recent "stop the 
flames" artima blog post finally pushed me to complete the work.  
Feedback is of course welcome, so I thought I had better repost it 
here, rather than relying on would-be commenters to get it from CVS... 
I'm also specifically CC'ing Clark, the co-author, since he wasn't 
involved in this rewrite and of course I owe it to him to change or 
clearly attribute to myself anything he doesn't like to have "under his 
own name"!


Thanks,

Alex


PEP: 246
Title: Object Adaptation
Version: $Revision: 1.6 $
Author: aleax@aleax.it (Alex Martelli),
     cce@clarkevans.com (Clark C. Evans)
Status: Draft
Type: Standards Track
Created: 21-Mar-2001
Python-Version: 2.5
Post-History: 29-Mar-2001, 10-Jan-2005


Abstract

     This proposal puts forth an extensible cooperative mechanism for
     the adaptation of an incoming object to a context which expects an
     object supporting a specific protocol (say a specific type, class,
     or interface).

     This proposal provides a built-in "adapt" function that, for any
     object X and any protocol Y, can be used to ask the Python
     environment for a version of X compliant with Y.  Behind the
     scenes, the mechanism asks object X: "Are you now, or do you know
     how to wrap yourself to provide, a supporter of protocol Y?".
     And, if this request fails, the function then asks protocol Y:
     "Does object X support you, or do you know how to wrap it to
     obtain such a supporter?"  This duality is important, because
     protocols can be developed after objects are, or vice-versa, and
     this PEP lets either case be supported non-invasively with regard
     to the pre-existing component[s].

     Lastly, if neither the object nor the protocol know about each
     other, the mechanism may check a registry of adapter factories,
     where callables able to adapt certain objects to certain protocols
     can be registered dynamically.  This part of the proposal is
     optional: the same effect could be obtained by ensuring that
     certain kinds of protocols and/or objects can accept dynamic
     registration of adapter factories, for example via suitable custom
     metaclasses.  However, this optional part allows adaptation to be
     made more flexible and powerful in a way that is not invasive to
     either protocols or other objects, thereby gaining for adaptation
     much the same kind of advantage that Python standard library's
     "copy_reg" module offers for serialization and persistence.

     This proposal does not specifically constrain what a protocol
     _is_, what "compliance to a protocol" exactly _means_, nor what
     precisely a wrapper is supposed to do.  These omissions are
     intended to leave this proposal compatible with both existing
     categories of protocols, such as the existing system of types and
     classes, as well as the many concepts for "interfaces" as such
     which have been proposed or implemented for Python, such as the
     one in PEP 245 [1], the one in Zope3 [2], or the ones discussed in
     the BDFL's Artima blog in late 2004 and early 2005 [3].  However,
     some reflections on these subjects, intended to be suggestive and
     not normative, are also included.


Motivation

     Currently there is no standardized mechanism in Python for
     checking if an object supports a particular protocol.  Typically,
     existence of certain methods, particularly special methods such as
     __getitem__, is used as an indicator of support for a particular
     protocol.  This technique works well for a few specific protocols
     blessed by the BDFL (Benevolent Dictator for Life).  The same can
     be said for the alternative technique based on checking
     'isinstance' (the built-in class "basestring" exists specifically
     to let you use 'isinstance' to check if an object "is something
     like a string").  Neither approach is easily and generally
     extensible to other protocols, defined by applications and third
     party frameworks, outside of the standard Python core.

     Even more important than checking if an object already supports a
     given protocol can be the task of obtaining a suitable adapter
     (wrapper or proxy) for the object, if the support is not already
     there.  For example, a string does not support the file protocol,
     but you can wrap it into a StringIO instance to obtain an object
     which does support that protocol and gets its data from the string
     it wraps; that way, you can pass the string (suitably wrapped) to
     subsystems which require as their arguments objects that are
     readable as files.  Unfortunately, there is currently no general,
     standardized way to automate this extremely important kind of
     "adaptation by wrapping" operation.

     Typically, today, when you pass objects to a context expecting a
     particular protocol, either the object knows about the context and
     provides its own wrapper or the context knows about the object and
     wraps it appropriately.  The difficulty with these approaches is
     that such adaptations are one-offs: they are not centralized in a
     single place of the user's code, and are not executed with a common
     technique.  This lack of standardization increases code
     duplication with the same adapter occurring in more than one place
     or it encourages classes to be re-written instead of adapted.  In
     either case, maintainability suffers.

     It would be very nice to have a standard function that can be
     called upon to verify an object's compliance with a particular
     protocol and provide for a wrapper if one is readily available --
     all without having to hunt through each library's documentation
     for the incantation appropriate to that particular, specific case.


Requirements

     When considering an object's compliance with a protocol, there are
     several cases to be examined:

     a) When the protocol is a type or class, and the object has
        exactly that type or is an instance of exactly that class (not
        a subclass).  In this case, compliance is automatic.

     b) When the object knows about the protocol, and either considers
        itself compliant, or knows how to wrap itself suitably.

     c) When the protocol knows about the object, and either the object
        already complies or the protocol knows how to suitably wrap the
        object.

     d) When the protocol is a type or class, and the object is a
        member of a subclass.  This is distinct from the first case (a)
        above, since inheritance (unfortunately) does not necessarily
        imply substitutability, and thus must be handled carefully.

     e) When the context knows about the object and the protocol and
        knows how to adapt the object so that the required protocol is
        satisfied.  This could use an adapter registry or similar
        approaches.

     The fourth case above is subtle.  A break of substitutability can
     occur when a subclass changes a method's signature, or restricts
     the domains accepted for a method's argument ("co-variance" on
     arguments types), or extends the co-domain to include return
     values which the base class may never produce ("contra-variance"
     on return types).  While compliance based on class inheritance
     _should_ be automatic, this proposal allows an object to signal
     that it is not compliant with a base class protocol.

     If Python gains some standard "official" mechanism for interfaces,
     however, then the "fast-path" case (a) can and should be extended
     to the protocol being an interface, and the object an instance of
     a type or class claiming compliance with that interface.  For
     example, if the "interface" keyword discussed in [3] is adopted
     into Python, the "fast path" of case (a) could be used, since
     instantiable classes implementing an interface would not be
     allowed to break substitutability.


Specification

     This proposal introduces a new built-in function, adapt(), which
     is the basis for supporting these requirements.

     The adapt() function has three parameters:

     - `obj', the object to be adapted

     - `protocol', the protocol requested of the object

     - `alternate', an optional object to return if the object could
       not be adapted

     A successful result of the adapt() function returns either the
     object passed `obj', if the object is already compliant with the
     protocol, or a secondary object `wrapper', which provides a view
     of the object compliant with the protocol.  The definition of
     wrapper is deliberately vague, and a wrapper is allowed to be a
     full object with its own state if necessary.  However, the design
     intention is that an adaptation wrapper should hold a reference to
     the original object it wraps, plus (if needed) a minimum of extra
     state which it cannot delegate to the wrapped object.

     An excellent example of adaptation wrapper is an instance of
     StringIO which adapts an incoming string to be read as if it was a
     textfile: the wrapper holds a reference to the string, but deals
     by itself with the "current point of reading" (from _where_ in the
     wrapped string will the characters for the next, e.g., "readline"
     call come from), because it cannot delegate it to the wrapped
     object (a string has no concept of "current point of reading" nor
     anything else even remotely related to that concept).
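
That division of state is easy to see concretely (io.StringIO is used
here as the modern spelling of StringIO.StringIO):

```python
import io

s = "abc\ndef\n"
w = io.StringIO(s)              # wrapper holds a reference to the text...
assert w.readline() == "abc\n"
assert w.tell() == 4            # ...plus its own "current point of reading",
assert w.readline() == "def\n"  # state the plain string itself cannot carry
```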

     A failure to adapt the object to the protocol raises an
     AdaptationError (which is a subclass of TypeError), unless the
     alternate parameter is used, in which case the alternate argument
     is returned instead.

     To enable the first case listed in the requirements, the adapt()
     function first checks to see if the object's type or the object's
     class are identical to the protocol.  If so, then the adapt()
     function returns the object directly without further ado.

     To enable the second case, when the object knows about the
     protocol, the object must have a __conform__() method.  This
     optional method takes two arguments:

     - `self', the object being adapted

     - `protocol', the protocol requested

     Just like any other special method in today's Python, __conform__
     is meant to be taken from the object's class, not from the object
     itself (for all objects, except instances of "classic classes" as
     long as we must still support the latter).  This enables a
     possible 'tp_conform' slot to be added to Python's type objects in
     the future, if desired.

     The object may return itself as the result of __conform__ to
     indicate compliance.  Alternatively, the object also has the
     option of returning a wrapper object compliant with the protocol.
     If the object knows it is not compliant although it belongs to a
     type which is a subclass of the protocol, then __conform__ should
     raise a LiskovViolation exception (a subclass of AdaptationError).
     Finally, if the object cannot determine its compliance, it should
     return None to enable the remaining mechanisms.  If __conform__
     raises any other exception, "adapt" just propagates it.

     To enable the third case, when the protocol knows about the
     object, the protocol must have an __adapt__() method.  This
     optional method takes two arguments:

     - `self', the protocol requested

     - `obj', the object being adapted

     If the protocol finds the object to be compliant, it can return
     obj directly.  Alternatively, the method may return a wrapper
     compliant with the protocol.  If the protocol knows the object is
     not compliant although it belongs to a type which is a subclass of
     the protocol, then __adapt__ should raise a LiskovViolation
     exception (a subclass of AdaptationError).  Finally, when
     compliance cannot be determined, this method should return None to
     enable the remaining mechanisms.  If __adapt__ raises any other
     exception, "adapt" just propagates it.

     The fourth case, when the object's class is a sub-class of the
     protocol, is handled by the built-in adapt() function.  Under
     normal circumstances, if "isinstance(object, protocol)" then
     adapt() returns the object directly.  However, if the object is
     not substitutable, either the __conform__() or __adapt__()
     methods, as mentioned above, may raise a LiskovViolation (a
     subclass of AdaptationError) to prevent this default behavior.

     If none of the first four mechanisms worked, as a last-ditch
     attempt, 'adapt' falls back to checking a registry of adapter
     factories, indexed by the protocol and the type of `obj', to meet
     the fifth case.  Adapter factories may be dynamically registered
     and removed from that registry to provide "third party adaptation"
     of objects and protocols that have no knowledge of each other, in
     a way that is not invasive to either the object or the protocols.


Intended Use

     The typical intended use of adapt is in code which has received
     some object X "from the outside", either as an argument or as the
     result of calling some function, and needs to use that object
     according to a certain protocol Y.  A "protocol" such as Y is
     meant to indicate an interface, usually enriched with some
     semantics constraints (such as are typically used in the "design
     by contract" approach), and often also some pragmatical
     expectation (such as "the running time of a certain operation
     should be no worse than O(N)", or the like); this proposal does
     not specify how protocols are designed as such, nor how or whether
     compliance to a protocol is checked, nor what the consequences may
     be of claiming compliance but not actually delivering it (lack of
     "syntactic" compliance -- names and signatures of methods -- will
     often lead to exceptions being raised; lack of "semantic"
     compliance may lead to subtle and perhaps occasional errors
     [imagine a method claiming to be threadsafe but being in fact
     subject to some subtle race condition, for example]; lack of
     "pragmatic" compliance will generally lead to code that runs
     ``correctly'', but too slowly for practical use, or sometimes to
     exhaustion of resources such as memory or disk space).

     When protocol Y is a concrete type or class, compliance to it is
     intended to mean that an object allows all of the operations that
     could be performed on instances of Y, with "comparable" semantics
     and pragmatics.  For example, a hypothetical object X that is a
     singly-linked list should not claim compliance with protocol
     'list', even if it implements all of list's methods: the fact that
     indexing X[n] takes time O(n), while the same operation would be
     O(1) on a list, makes a difference.  On the other hand, an
     instance of StringIO.StringIO does comply with protocol 'file',
     even though some operations (such as those of module 'marshal')
     may not allow substituting one for the other because they perform
     explicit type-checks: such type-checks are "beyond the pale" from
     the point of view of protocol compliance.

     While this convention makes it feasible to use a concrete type or
     class as a protocol for purposes of this proposal, such use will
     often not be optimal.  Rarely will the code calling 'adapt' need
     ALL of the features of a certain concrete type, particularly for
     such rich types as file, list, dict; rarely can all those features
     be provided by a wrapper with good pragmatics, as well as syntax
     and semantics that are really the same as a concrete type's.

     Rather, once this proposal is accepted, a design effort needs to
     start to identify the essential characteristics of those protocols
     which are currently used in Python, particularly within the
     standard library, and to formalize them using some kind of
     "interface" construct (not necessarily requiring any new syntax: a
     simple custom metaclass would let us get started, and the results
     of the effort could later be migrated to whatever "interface"
     construct is eventually accepted into the Python language).  With
     such a palette of more formally designed protocols, the code using
     'adapt' will be able to ask for, say, adaptation into "a filelike
     object that is readable and seekable", or whatever else it
     specifically needs with some decent level of "granularity", rather
     than too-generically asking for compliance to the 'file' protocol.

     Adaptation is NOT "casting".  When object X itself does not
     conform to protocol Y, adapting X to Y means using some kind of
     wrapper object Z, which holds a reference to X, and implements
     whatever operation Y requires, mostly by delegating to X in
     appropriate ways.  For example, if X is a string and Y is 'file',
     the proper way to adapt X to Y is to make a StringIO(X), *NOT* to
     call file(X) [which would try to open a file named by X].
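
The contrast, in modern spelling (io.StringIO in place of StringIO,
open() in place of file()):

```python
import io

x = "first line\nsecond line\n"

# Adapting the string to the file protocol: wrap it, keeping a
# reference to x, and delegate the "file-ness" to the wrapper.
f = io.StringIO(x)
assert f.readline() == "first line\n"

# "Casting" - open(x) - would instead try to open a file *named* by
# the string's contents, which is not adaptation at all.
```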

     Numeric types and protocols may need to be an exception to this
     "adaptation is not casting" mantra, however.


Guido's "Optional Static Typing: Stop the Flames" Blog Entry

     A typical simple use case of adaptation would be:

         def f(X):
             X = adapt(X, Y)
             # continue by using X according to protocol Y

     In [4], the BDFL has proposed introducing the syntax:

         def f(X: Y):
             # continue by using X according to protocol Y

     to be a handy shortcut for exactly this typical use of adapt, and,
     as a basis for experimentation until the parser has been modified
     to accept this new syntax, a semantically equivalent decorator:

         @arguments(Y)
         def f(X):
             # continue by using X according to protocol Y

     These BDFL ideas are fully compatible with this proposal, as are
     other of Guido's suggestions in the same blog.
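
A minimal sketch of such a decorator, under the simplifying assumption
that adapt() merely passes through already-compliant objects (the full
adapt() is given in the reference implementation below):

```python
def adapt(obj, protocol):
    # Trivial stand-in for the PEP's adapt(): succeed only when obj
    # is already an instance of protocol.
    if isinstance(obj, protocol):
        return obj
    raise TypeError("cannot adapt %r to %r" % (obj, protocol))

def arguments(*protocols):
    # Hypothetical @arguments decorator with the semantics described
    # above: adapt each positional argument before calling f.
    def decorate(f):
        def wrapper(*args):
            return f(*(adapt(a, p) for a, p in zip(args, protocols)))
        return wrapper
    return decorate

@arguments(str)
def shout(x):
    return x.upper()

assert shout("hi") == "HI"
```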



Reference Implementation and Test Cases

     The following reference implementation does not deal with classic
     classes: it considers only new-style classes.  If classic classes
     need to be supported, the additions should be pretty clear, though
     a bit messy (x.__class__ vs type(x), getting boundmethods directly
     from the object rather than from the type, and so on).

     -----------------------------------------------------------------
     adapt.py
     -----------------------------------------------------------------
     class AdaptationError(TypeError):
         pass
     class LiskovViolation(AdaptationError):
         pass

     _adapter_factory_registry = {}

     def registerAdapterFactory(objtype, protocol, factory):
         _adapter_factory_registry[objtype, protocol] = factory

     def unregisterAdapterFactory(objtype, protocol):
         del _adapter_factory_registry[objtype, protocol]

     def _adapt_by_registry(obj, protocol, alternate):
         factory = _adapter_factory_registry.get((type(obj), protocol))
         if factory is None:
             adapter = alternate
         else:
             adapter = factory(obj, protocol, alternate)
         if adapter is AdaptationError:
             raise AdaptationError
         else:
             return adapter


     def adapt(obj, protocol, alternate=AdaptationError):

         t = type(obj)

         # (a) first check to see if object has the exact protocol
         if t is protocol:
              return obj

         try:
             # (b) next check if t.__conform__ exists & likes protocol
             conform = getattr(t, '__conform__', None)
             if conform is not None:
                 result = conform(obj, protocol)
                 if result is not None:
                     return result

             # (c) then check if protocol.__adapt__ exists & likes obj
             adapt = getattr(type(protocol), '__adapt__', None)
             if adapt is not None:
                 result = adapt(protocol, obj)
                 if result is not None:
                     return result
         except LiskovViolation:
             pass
         else:
             # (d) check if object is instance of protocol
             if isinstance(obj, protocol):
                 return obj

         # (e) last chance: try the registry
         return _adapt_by_registry(obj, protocol, alternate)

     -----------------------------------------------------------------
     test.py
     -----------------------------------------------------------------
     from adapt import AdaptationError, LiskovViolation, adapt
     from adapt import registerAdapterFactory, unregisterAdapterFactory
     import doctest

     class A(object):
         '''
         >>> a = A()
         >>> a is adapt(a, A)   # case (a)
         True
         '''

     class B(A):
         '''
         >>> b = B()
         >>> b is adapt(b, A)   # case (d)
         True
         '''

     class C(object):
         '''
         >>> c = C()
         >>> c is adapt(c, B)   # case (b)
         True
         >>> c is adapt(c, A)   # a failure case
         Traceback (most recent call last):
             ...
         AdaptationError
         '''
         def __conform__(self, protocol):
             if protocol is B:
                 return self

     class D(C):
         '''
         >>> d = D()
         >>> d is adapt(d, D)   # case (a)
         True
         >>> d is adapt(d, C)   # case (d) explicitly blocked
         Traceback (most recent call last):
             ...
         AdaptationError
         '''
         def __conform__(self, protocol):
             if protocol is C:
                 raise LiskovViolation

     class MetaAdaptingProtocol(type):
         def __adapt__(cls, obj):
             return cls.adapt(obj)

     class AdaptingProtocol:
         __metaclass__ = MetaAdaptingProtocol
         @classmethod
         def adapt(cls, obj):
             pass

     class E(AdaptingProtocol):
         '''
         >>> a = A()
         >>> a is adapt(a, E)   # case (c)
         True
          >>> b = B()
         >>> b is adapt(b, E)   # case (c)
         True
         >>> c = C()
         >>> c is adapt(c, E)   # a failure case
         Traceback (most recent call last):
             ...
         AdaptationError
         '''
         @classmethod
         def adapt(cls, obj):
             if isinstance(obj, A):
                 return obj

     class F(object):
         pass

     def adapt_F_to_A(obj, protocol, alternate):
         if isinstance(obj, F) and issubclass(protocol, A):
             return obj
         else:
             return alternate

     def test_registry():
         '''
         >>> f = F()
         >>> f is adapt(f, A)   # a failure case
         Traceback (most recent call last):
             ...
         AdaptationError
         >>> registerAdapterFactory(F, A, adapt_F_to_A)
         >>> f is adapt(f, A)   # case (e)
         True
         >>> unregisterAdapterFactory(F, A)
         >>> f is adapt(f, A)   # a failure case again
         Traceback (most recent call last):
             ...
         AdaptationError
         >>> registerAdapterFactory(F, A, adapt_F_to_A)
         '''

     doctest.testmod()


Relationship To Microsoft's QueryInterface

     Although this proposal has some similarities to Microsoft's (COM)
     QueryInterface, it differs in a number of aspects.

     First, adaptation in this proposal is bi-directional, allowing the
     interface (protocol) to be queried as well, which gives more
     dynamic abilities (more Pythonic).  Second, there is no special
     "IUnknown" interface which can be used to check or obtain the
     original unwrapped object identity, although this could be
     proposed as one of those "special" blessed interface protocol
     identifiers.  Third, with QueryInterface, once an object supports
     a particular interface it must always thereafter support this
     interface; this proposal makes no such guarantee, since, in
     particular, adapter factories can be dynamically added to the
     registry and removed again later.

     Fourth, implementations of Microsoft's QueryInterface must support
     a kind of equivalence relation -- they must be reflexive,
     symmetrical, and transitive, in specific senses.  The equivalent
     conditions for protocol adaptation according to this proposal
     would also represent desirable properties:

         # given, to start with, a successful adaptation:
         X_as_Y = adapt(X, Y)

         # reflexive:
         assert adapt(X_as_Y, Y) is X_as_Y

         # transitive:
         X_as_Z = adapt(X, Z, None)
         X_as_Y_as_Z = adapt(X_as_Y, Z, None)
         assert (X_as_Y_as_Z is None) == (X_as_Z is None)

         # symmetrical:
         X_as_Z_as_Y = adapt(X_as_Z, Y, None)
         assert (X_as_Y_as_Z is None) == (X_as_Z_as_Y is None)

     However, while these properties are desirable, it may not be
     possible to guarantee them in all cases.  QueryInterface can
     impose their equivalents because it dictates, to some extent, how
     objects, interfaces, and adapters are to be coded; this proposal
     is meant to be non-invasive, usable to "retrofit" adaptation
     between two frameworks coded in mutual ignorance of each other,
     without having to modify either framework.

     Transitivity of adaptation is in fact somewhat controversial, as
     is the relationship (if any) between adaptation and inheritance.

     The latter would not be controversial if we knew that inheritance
     always implies Liskov substitutability, which, unfortunately we
     don't.  If some special form, such as the interfaces proposed in
     [4], could indeed ensure Liskov substitutability, then for that
     kind of inheritance, only, we could perhaps assert that if X
     conforms to Y and Y inherits from Z then X conforms to Z... but
     only if substitutability was taken in a very strong sense to
     include semantics and pragmatics, which seems doubtful.  (For what
     it's worth: in QueryInterface, inheritance does not require nor
     imply conformance).  This proposal does not include any "strong"
     effects of inheritance, beyond the small ones specifically
     detailed above.

     Similarly, transitivity might imply multiple "internal" adaptation
     passes to get the result of adapt(X, Z) via some intermediate Y,
     intrinsically like adapt(adapt(X, Y), Z), for some suitable and
     automatically chosen Y.  Again, this may perhaps be feasible under
     suitably strong constraints, but the practical implications of
     such a scheme are still unclear to this proposal's authors.  Thus,
     this proposal does not include any automatic or implicit
     transitivity of adaptation, under whatever circumstances.

     For an implementation of the original version of this proposal
     which performs more advanced processing in terms of transitivity,
     and of the effects of inheritance, see Phillip J. Eby's
     PyProtocols [5].  The documentation accompanying PyProtocols is
     well worth studying for its considerations on how adapters should
     be coded and used, and on how adaptation can remove any need for
     typechecking in application code.


Questions and Answers

     Q:  What benefit does this proposal provide?

     A:  The typical Python programmer is an integrator, someone who is
         connecting components from various suppliers.  Often, to
         interface between these components, one needs intermediate
         adapters.  Usually the burden falls upon the programmer to
         study the interface exposed by one component and required by
         another, determine if they are directly compatible, or develop
         an adapter.  Sometimes a supplier may even include the
         appropriate adapter, but even then searching for the adapter
         and figuring out how to deploy the adapter takes time.

         This technique enables suppliers to work with each other
         directly, by implementing __conform__ or __adapt__ as
         necessary.  This frees the integrator from making their own
         adapters.  In essence, this allows the components to have a
         simple dialogue among themselves.  The integrator simply
         connects one component to another, and if the types don't
         automatically match, an adapting mechanism is built in.

         Moreover, thanks to the adapter registry, a "fourth party" may
         supply adapters to allow interoperation of frameworks which
         are totally unaware of each other, non-invasively, and without
         requiring the integrator to do anything more than install the
         appropriate adapter factories in the registry at start-up.

         As long as libraries and frameworks cooperate with the
         adaptation infrastructure proposed here (essentially by
         defining and using protocols appropriately, and calling
         'adapt' as needed on arguments received and results of
         call-back factory functions), the integrator's work thereby
         becomes much simpler.

         For example, consider SAX1 and SAX2 interfaces: there is an
         adapter required to switch between them.  Normally, the
         programmer must be aware of this; however, with this
         adaptation proposal in place, this is no longer the case --
         indeed, thanks to the adapter registry, this need may be
         removed even if the framework supplying SAX1 and the one
         requiring SAX2 are unaware of each other.
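     The mechanism sketched in this answer can be made concrete with
     some hypothetical code.  This is NOT the proposal's reference
     implementation: it ignores classic classes and metaclass
     subtleties, assumes `__adapt__` is written as a classmethod, and
     simplifies the interaction between LiskovViolation and the
     registry.

```python
_marker = object()

class AdaptationError(TypeError):
    pass

class LiskovViolation(AdaptationError):
    pass

# registry of third-party adapter factories, keyed by (type, protocol)
_adapter_registry = {}

def register_adapter(obj_type, protocol, factory):
    """Let a 'fourth party' wire two mutually ignorant frameworks."""
    _adapter_registry[obj_type, protocol] = factory

def adapt(obj, protocol, alternate=_marker):
    try:
        # 1. ask the object, via its class, whether it conforms
        conform = getattr(type(obj), '__conform__', None)
        if conform is not None:
            result = conform(obj, protocol)
            if result is not None:
                return result
        # 2. ask the protocol whether it recognizes the object
        #    (assumed here to be written as a classmethod)
        adapt_hook = getattr(protocol, '__adapt__', None)
        if adapt_hook is not None:
            result = adapt_hook(obj)
            if result is not None:
                return result
        # 3. inheritance implies compliance, unless one of the hooks
        #    above raised LiskovViolation to opt out
        if isinstance(obj, protocol):
            return obj
    except LiskovViolation:
        pass
    # 4. last-ditch: consult registered third-party adapter factories
    factory = _adapter_registry.get((type(obj), protocol))
    if factory is not None:
        return factory(obj)
    if alternate is not _marker:
        return alternate
    raise AdaptationError("can't adapt %r to %r" % (obj, protocol))
```

     An integrator then just calls adapt(component, Protocol) and lets
     the components, or a registered factory, sort out the details.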


     Q:  Why does this have to be built-in, can't it be standalone?

     A:  Yes, it does work standalone.  However, if it is built-in, it
         has a greater chance of usage.  The value of this proposal is
         primarily in standardization: having libraries and frameworks
         coming from different suppliers, including the Python standard
         library, use a single approach to adaptation.  Furthermore:

         0.  The mechanism is by its very nature a singleton.

         1.  If used frequently, it will be much faster as a built-in.

         2.  It is extensible and unassuming.

         3.  Once 'adapt' is built-in, it can support syntax extensions
             and even be of some help to a type inference system.


     Q:  Why the verbs __conform__ and __adapt__?

     A:  conform, verb intransitive
             1. To correspond in form or character; be similar.
             2. To act or be in accord or agreement; comply.
             3. To act in accordance with current customs or modes.

         adapt, verb transitive
             1. To make suitable to or fit for a specific use or
                situation.

         Source:  The American Heritage Dictionary of the English
                  Language, Third Edition


Backwards Compatibility

     There should be no problem with backwards compatibility unless
     someone had used the special names __conform__ or __adapt__ in
     other ways, but this seems unlikely, and, in any case, user code
     should never use special names for non-standard purposes.

     This proposal could be implemented and tested without changes to
     the interpreter.


Credits

     This proposal was created in large part by the feedback of the
     talented individuals on the main Python mailing lists and the
     type-sig list.  To name specific contributors (with apologies if
     we missed anyone!), besides the proposal's authors: the main
     suggestions for the proposal's first versions came from Paul
     Prescod, with significant feedback from Robin Thomas, and we also
     borrowed ideas from Marcin 'Qrczak' Kowalczyk and Carlos Ribeiro.

     Other contributors (via comments) include Michel Pelletier, Jeremy
     Hylton, Aahz Maruch, Fredrik Lundh, Rainer Deyke, Timothy Delaney,
     and Huaiyu Zhu.  The current version owes a lot to discussions
     with (among others) Phillip J. Eby, Guido van Rossum, Bruce Eckel,
     Jim Fulton, and Ka-Ping Yee, and to study and reflection of their
     proposals, implementations, and documentation about use and
     adaptation of interfaces and protocols in Python.


References and Footnotes

     [1] PEP 245, Python Interface Syntax, Pelletier
         http://www.python.org/peps/pep-0245.html

     [2] http://www.zope.org/Wikis/Interfaces/FrontPage

     [3] http://www.artima.com/weblogs/index.jsp?blogger=guido

     [4] http://www.artima.com/weblogs/viewpost.jsp?thread=87182

     [5] http://peak.telecommunity.com/PyProtocols.html


Copyright

     This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

From FBatista at uniFON.com.ar  Mon Jan 10 16:08:03 2005
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Mon Jan 10 16:10:28 2005
Subject: [Python-Dev] os.removedirs() vs. shutil.rmtree()
Message-ID: <A128D751272CD411BC9200508BC2194D053C7E55@escpl.tcp.com.ar>

[Johannes Gijsbers]

#- So they're not identical, but I do agree they should be consolidated
#- and moved into one module. I'd say shutil, both because the os
#- module is already awfully crowded, and because these functions are
#- "high-level operations on files and collections of files" rather
#- than "a more portable way of using operating system dependent
#- functionality [...]".

+1.

We should keep these "should change this way" notes for when we restructure
the std lib. Is there already a wiki somewhere for this?

.    Facundo

Bitácora De Vuelo: http://www.taniquetil.com.ar/plog
PyAr - Python Argentina: http://pyar.decode.com.ar/


From barry at python.org  Mon Jan 10 16:13:13 2005
From: barry at python.org (Barry Warsaw)
Date: Mon Jan 10 16:13:21 2005
Subject: [Python-Dev] Re: Subscribing to PEP updates
In-Reply-To: <1105368010.41e293ca97620@mcherm.com>
References: <1105368010.41e293ca97620@mcherm.com>
Message-ID: <1105369993.29934.4.camel@geddy.wooz.org>

On Mon, 2005-01-10 at 09:40, Michael Chermside wrote:
> Barry writes:
> > As an experiment, I just added a PEP topic to the python-checkins
> > mailing list.  You could subscribe to this list and just select the PEP
> > topic (which matches the regex "PEP" in the Subject header or first few
> > lines of the body).
> >
> > Give it a shot and let's see if that does the trick.
> 
> I just got notification of the change to PEP 246 (and I haven't received
> other checkin notifications), so I guess I can report that this is
> working.

Excellent!

> Thanks, Barry. Should we now mention this on c.l.py for others who
> may be interested?

Sure, I think that would be great.  Thanks.
-Barry

From gvanrossum at gmail.com  Mon Jan 10 16:46:39 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon Jan 10 16:46:43 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <ca471dc2050110074614f5edc3@mail.gmail.com>

> I had been promising to rewrite PEP 246 to incorporate the last several
> years' worth of discussions &c about it, and Guido's recent "stop the
> flames" artima blog post finally pushed me to complete the work.
> Feedback is of course welcome, so I thought I had better repost it
> here, rather than relying on would-be commenters to get it from CVS...

Thanks for doing this, Alex! I have yet to read the whole thing [will
attempt to do so later today] but the few snippets I caught make me feel
this is a big step forward.

I'm wondering if someone could do a similar thing for PEP 245,
interfaces syntax? Alex hinted that it's a couple of rounds behind the
developments in Zope and Twisted. I'm personally not keen on needing
*two* new keywords (interface and implements) so I hope that whoever
does the rewrite could add a section on the advantages and
disadvantages of the 'implements' keyword (my simplistic alternative
proposal is to simply include interfaces in the list of bases in the
class statement; the metaclass can then sort it out).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From pje at telecommunity.com  Mon Jan 10 18:43:44 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 10 18:42:32 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>

At 03:42 PM 1/10/05 +0100, Alex Martelli wrote:
>     The fourth case above is subtle.  A break of substitutability can
>     occur when a subclass changes a method's signature, or restricts
>     the domains accepted for a method's argument ("co-variance" on
>     arguments types), or extends the co-domain to include return
>     values which the base class may never produce ("contra-variance"
>     on return types).  While compliance based on class inheritance
>     _should_ be automatic, this proposal allows an object to signal
>     that it is not compliant with a base class protocol.

-1 if this introduces a performance penalty to a wide range of adaptations 
(i.e. those using abstract base classes), just to support people who want 
to create deliberate Liskov violations.  I personally don't think that we 
should pander to Liskov violators, especially since Guido seems to be 
saying that there will be some kind of interface objects available in 
future Pythons.


>     Just like any other special method in today's Python, __conform__
>     is meant to be taken from the object's class, not from the object
>     itself (for all objects, except instances of "classic classes" as
>     long as we must still support the latter).  This enables a
>     possible 'tp_conform' slot to be added to Python's type objects in
>     the future, if desired.

One note here: Zope and PEAK sometimes use interfaces that a function or 
module may implement.  PyProtocols' implementation does this by adding a 
__conform__ object to the function's dictionary so that the function can 
conform to a particular signature.  If and when __conform__ becomes 
tp_conform, this may not be necessary any more, at least for functions, 
because there will probably be some way for an interface to tell if the 
function at least conforms to the appropriate signature.  But for modules 
this will still be an issue.

I am not saying we shouldn't have a tp_conform; just suggesting that it may 
be appropriate for functions and modules (as well as classic classes) to 
have their tp_conform delegate back to self.__dict__['__conform__'] instead 
of a null implementation.



>     The object may return itself as the result of __conform__ to
>     indicate compliance.  Alternatively, the object also has the
>     option of returning a wrapper object compliant with the protocol.
>     If the object knows it is not compliant although it belongs to a
>     type which is a subclass of the protocol, then __conform__ should
>     raise a LiskovViolation exception (a subclass of AdaptationError).
>     Finally, if the object cannot determine its compliance, it should
>     return None to enable the remaining mechanisms.  If __conform__
>     raises any other exception, "adapt" just propagates it.
>
>     To enable the third case, when the protocol knows about the
>     object, the protocol must have an __adapt__() method.  This
>     optional method takes two arguments:
>
>     - `self', the protocol requested
>
>     - `obj', the object being adapted
>
>     If the protocol finds the object to be compliant, it can return
>     obj directly.  Alternatively, the method may return a wrapper
>     compliant with the protocol.  If the protocol knows the object is
>     not compliant although it belongs to a type which is a subclass of
>     the protocol, then __adapt__ should raise a LiskovViolation
>     exception (a subclass of AdaptationError).  Finally, when
>     compliance cannot be determined, this method should return None to
>     enable the remaining mechanisms.  If __adapt__ raises any other
>     exception, "adapt" just propagates it.
>     The fourth case, when the object's class is a sub-class of the
>     protocol, is handled by the built-in adapt() function.  Under
>     normal circumstances, if "isinstance(object, protocol)" then
>     adapt() returns the object directly.  However, if the object is
>     not substitutable, either the __conform__() or __adapt__()
>     methods, as mentioned above, may raise a LiskovViolation (a
>     subclass of AdaptationError) to prevent this default behavior.

I don't see the benefit of LiskovViolation, or of doing the exact type 
check vs. the loose check.  What is the use case for these?  Is it to allow 
subclasses to say, "Hey I'm not my superclass?"  It's also a bit confusing 
to say that if the routines "raise any other exceptions" they're 
propagated.  Are you saying that LiskovViolation is *not* propagated?



>     If none of the first four mechanisms worked, as a last-ditch
>     attempt, 'adapt' falls back to checking a registry of adapter
>     factories, indexed by the protocol and the type of `obj', to meet
>     the fifth case.  Adapter factories may be dynamically registered
>     and removed from that registry to provide "third party adaptation"
>     of objects and protocols that have no knowledge of each other, in
>     a way that is not invasive to either the object or the protocols.

This should either be fleshed out to a concrete proposal, or 
dropped.  There are many details that would need to be answered, such as 
whether "type" includes subtypes and whether it really means type or 
__class__.  (Note that isinstance() now uses __class__, allowing proxy 
objects to lie about their class; the adaptation system should support this 
too, and both the Zope and PyProtocols interface systems and PyProtocols' 
generic functions support it.)

One other issue: it's not possible to have standalone interoperable PEP 246 
implementations using a registry, unless there's a standardized place to 
put it, and a specification for how it gets there.  Otherwise, if someone 
is using both say Zope and PEAK in the same application, they would have to 
take care to register adaptations in both places.  This is actually a 
pretty minor issue since in practice both frameworks' interfaces handle 
adaptation, so there is no *need* for this extra registry in such cases.



>     Adaptation is NOT "casting".  When object X itself does not
>     conform to protocol Y, adapting X to Y means using some kind of
>     wrapper object Z, which holds a reference to X, and implements
>     whatever operation Y requires, mostly by delegating to X in
>     appropriate ways.  For example, if X is a string and Y is 'file',
>     the proper way to adapt X to Y is to make a StringIO(X), *NOT* to
>     call file(X) [which would try to open a file named by X].
>
>     Numeric types and protocols may need to be an exception to this
>     "adaptation is not casting" mantra, however.

The issue isn't that adaptation isn't casting; why would casting a string 
to a file mean that you should open that filename?  I don't think that 
"adaptation isn't casting" is enough to explain appropriate use of 
adaptation.  For example, I think it's quite valid to adapt a filename to a 
*factory* for opening files, or a string to a "file designator".  However, 
it doesn't make any sense (to me at least) to adapt from a file designator 
to a file, which IMO is the reason it's wrong to adapt from a string to a 
file in the way you suggest.  However, casting doesn't come into it 
anywhere that I can see.

If I were going to say anything about that case, I'd say that adaptation 
should not be "lossy"; adapting from a designator to a file loses 
information like what mode the file should be opened in.  (Similarly, I 
don't see adapting from float to int; if you want a cast to int, cast 
it.)  Or to put it another way, adaptability should imply substitutability: 
a string may be used as a filename, a filename may be used to designate a 
file.  But a filename cannot be used as a file; that makes no sense.


>Reference Implementation and Test Cases
>
>     The following reference implementation does not deal with classic
>     classes: it considers only new-style classes.  If classic classes
>     need to be supported, the additions should be pretty clear, though
>     a bit messy (x.__class__ vs type(x), getting boundmethods directly
>     from the object rather than from the type, and so on).

Please base a reference implementation off of either Zope or PyProtocols' 
field-tested implementations which deal correctly with __class__ vs. 
type(), and can detect whether they're calling a __conform__ or __adapt__ 
at the wrong metaclass level, etc.  Then, if there is a reasonable use case 
for LiskovViolation and the new type checking rules that justifies adding 
them, let's do so.



>     Transitivity of adaptation is in fact somewhat controversial, as
>     is the relationship (if any) between adaptation and inheritance.

The issue is simply this: what is substitutability?  If you say that 
interface B is substitutable for A, and C is substitutable for B, then C 
*must* be substitutable for A, or we have inadequately defined 
"substitutability".

If adaptation is intended to denote substitutability, then there can be 
absolutely no question that it is transitive, or else it is not possible to 
have any meaning for interface inheritance!

Thus, the controversies are: 1) whether adaptation should be required to 
indicate substitutability (and I think that your own presentation of the 
string->file example supports this), and 2) whether the adaptation system 
should automatically provide an A when provided with a C.  Existing 
implementations of interfaces for Python all do this where interface C is a 
subclass of A.  However, they differ as to whether *all* adaptation should 
indicate substitutability.  The Zope and Twisted designers believe that 
adaptation should not be required to imply substitutability, and that only 
interface and implementation inheritance imply 
substitutability.  (Although, as you point out, the latter is not always 
the case.)

PyProtocols OTOH believes that *all* adaptation must imply 
substitutability; non-substitutable adaptation or inheritance is a design 
error: "adaptation abuse", if you will.  So, in the PyProtocols view, it 
would never make sense to define an adaptation from float or decimal to 
integer that would permit loss of precision.  If you did define such an 
adaptation, it must refuse to adapt a float or decimal with a fractional 
part, since the number would no longer be substitutable if data loss occurred.
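Under that view, such an adapter might look like this (a sketch with
hypothetical names; the point is that it refuses any value it cannot
represent exactly):

```python
class AdaptationError(TypeError):
    pass

def adapt_float_to_int(value):
    """Adapt a float to int only when no precision would be lost."""
    result = int(value)
    if result != value:
        # a fractional part means the int could not substitute for
        # the float: refuse rather than silently lose data
        raise AdaptationError("lossy adaptation of %r to int" % (value,))
    return result
```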

Of course, this is a separate issue from automatic transitive adaptation, 
in the sense that even if you agree that adaptation must imply 
substitutability, you can still disagree as to whether automatically 
locating a multi-step adaptation is desirable enough to be worth 
implementing.  However, if substitutability is guaranteed, then such 
multi-step adaptation cannot result in anything "controversial" occurring.


>     The latter would not be controversial if we knew that inheritance
>     always implies Liskov substitutability, which, unfortunately we
>     don't.  If some special form, such as the interfaces proposed in
>     [4], could indeed ensure Liskov substitutability, then for that
>     kind of inheritance, only, we could perhaps assert that if X
>     conforms to Y and Y inherits from Z then X conforms to Z... but
>     only if substitutability was taken in a very strong sense to
>     include semantics and pragmatics, which seems doubtful.

As a practical matter, all of the existing interface systems (Zope, 
PyProtocols, and even the defunct Twisted implementation) treat interface 
inheritance as guaranteeing substitutability for the base interface, and do 
so transitively.

However, it seems to me to be a common programming error among people new 
to interfaces to inherit from an interface when they intend to *require* 
the base interface's functionality, rather than *offer* the base 
interface's functionality.  It may be worthwhile to address this issue in 
the design of "standard" interfaces for Python.

This educational issue regarding substitutability is I believe inherent to 
the concept of interfaces, however, and does not go away simply by making 
non-inheritance adaptation non-transitive in the implementation.  It may, 
however, make it take longer for people to encounter the issue, thereby 
slowing their learning process.  ;)



>Backwards Compatibility
>
>     There should be no problem with backwards compatibility unless
>     someone had used the special names __conform__ or __adapt__ in
>     other ways, but this seems unlikely, and, in any case, user code
>     should never use special names for non-standard purposes.

Production implementations of the old version of PEP 246 exist, so the 
changes in semantics you've proposed may introduce backward compatibility 
issues.  More specifically, some field code may not work correctly with 
your proposed reference implementation, in the sense that code that worked 
with Zope or PyProtocols before, may not work with the reference 
implementation's adapt(), resulting in failure of adaptation where success 
occurred before, or in exceptions raised where no exception was raised before.

From pje at telecommunity.com  Mon Jan 10 18:59:10 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 10 18:57:57 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
References: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <5.1.1.6.0.20050110125415.02f28ec0@mail.telecommunity.com>

At 12:43 PM 1/10/05 -0500, Phillip J. Eby wrote:
>As a practical matter, all of the existing interface systems (Zope, 
>PyProtocols, and even the defunct Twisted implementation) treat interface 
>inheritance as guaranteeing substitutability for the base interface, and 
>do so transitively.

An additional data point, by the way: the Eclipse Java IDE has an 
adaptation system that works very much like PEP 246 does, and it appears 
that in a future release they intend to support automatic adapter 
transitivity, so as to avoid requiring each provider of an interface to 
"provide O(n^2) adapters when writing the nth version of an 
interface."  IOW, their current release is transitive only for interface 
inheritance ala Zope or Twisted; their future release will be transitive 
for adapter chains ala PyProtocols.

From cce at clarkevans.com  Mon Jan 10 19:19:22 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Mon Jan 10 19:19:25 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
References: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
Message-ID: <20050110181922.GC47082@prometheusresearch.com>

Alex,

  This is wonderful work, thank you for keeping the ball in the air;
  I'm honored to keep my name as a co-author -- kinda like a free lunch.

Phillip,

  Once again, thank you!  Without PyProtocols and your advocacy,
  this proposal might have been buried in the historical bit-bucket.

On Mon, Jan 10, 2005 at 12:43:44PM -0500, Phillip J. Eby wrote:
| -1 if this introduces a performance penalty to a wide range of 
| adaptations (i.e. those using abstract base classes), just to support 
| people who want to create deliberate Liskov violations.  I personally 
| don't think that we should pander to Liskov violators, especially since 
| Guido seems to be saying that there will be some kind of interface 
| objects available in future Pythons.

I particularly like Alex's Liskov violation error; although it is
not hugely common, it does happen, and there should be a way for a 
class to indicate that it's only being used for implementation.

Perhaps... if the class doesn't have a __conform__ method, then its
adaptation is automatic (that is, only the class can raise this
case).  The rationale for only enabling one of the two paths is that
the base class would have been in-place before the derived class was
created; therefore, it is highly unlikely that __adapt__ would ever
be of help.  Therefore, there might be a performance penalty, but it'd 
be really small, simply checking to see if the slot is filled in.
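A rough sketch of that one-sided check (hypothetical names; only the
isinstance case is shown):

```python
class AdaptationError(TypeError):
    pass

class LiskovViolation(AdaptationError):
    pass

def adapt_isinstance_case(obj, protocol):
    """Hypothetical handling of the 'obj is-an instance of protocol'
    case only: inheritance implies compliance unless the class fills
    the __conform__ slot and uses it to raise LiskovViolation."""
    assert isinstance(obj, protocol)
    conform = getattr(type(obj), '__conform__', None)
    if conform is None:
        return obj  # empty slot: the check costs almost nothing
    result = conform(obj, protocol)  # may raise LiskovViolation
    return obj if result is None else result
```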

Best,

Clark

From pje at telecommunity.com  Mon Jan 10 19:34:59 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 10 19:33:48 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <20050110181922.GC47082@prometheusresearch.com>
References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com>

At 01:19 PM 1/10/05 -0500, Clark C. Evans wrote:
>Alex,
>
>   This is wonderful work, thank you for keeping the ball in the air;
>   I'm honored to keep my name as a co-author -- kinda like a free lunch.
>
>Phillip,
>
>   Once again, thank you!  Without PyProtocols and your advocacy,
>   this proposal might have been buried in the historical bit-bucket.
>
>On Mon, Jan 10, 2005 at 12:43:44PM -0500, Phillip J. Eby wrote:
>| -1 if this introduces a performance penalty to a wide range of
>| adaptations (i.e. those using abstract base classes), just to support
>| people who want to create deliberate Liskov violations.  I personally
>| don't think that we should pander to Liskov violators, especially since
>| Guido seems to be saying that there will be some kind of interface
>| objects available in future Pythons.
>
>I particularly like Alex's Liskov violation error; although it is
>not hugely common, it does happen, and there should be a way for a
>class to indicate that it's only being used for implementation.
>
>Perhaps... if the class doesn't have a __conform__ method, then its
>adaptation is automatic (that is, only the class can raise this
>case).  The rationale for only enabling one of the two paths is that
>the base class would have been in-place before the derived class was
>created; therefore, it is highly unlikely that __adapt__ would ever
>be of help.  Therefore, there might be a performance penalty, but it'd
>be really small, simply checking to see if the slot is filled in.

The performance penalty I was talking about was for using an abstract base 
class, in a subclass with a __conform__ method for conformance to other 
protocols.  In this case, __conform__ will be uselessly called every time 
the object is adapted to the abstract base class.

IMO it's more desirable to support abstract base classes than to allow 
classes to "opt out" of inheritance when testing conformance to a base 
class.  If you don't have an "is-a" relationship to your base class, you 
should be using delegation, not inheritance.  (E.g. 'set' has-a 'dict', not 
'set' is-a 'dict', so 'adapt(set,dict)' should fail, at least on the basis 
of isinstance checking.)
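The has-a alternative I'm advocating might be sketched like this (a
hypothetical Bag class, purely illustrative):

```python
class Bag:
    """Bag HAS-A dict; it does not claim IS-A dict, so an
    isinstance-based adapt(bag, dict) would rightly fail."""
    def __init__(self):
        self._counts = {}              # delegation, not inheritance

    def add(self, item):
        self._counts[item] = self._counts.get(item, 0) + 1

    def count(self, item):
        return self._counts.get(item, 0)
```

Because Bag delegates rather than inherits, it exposes only the
operations that make sense for it, and no __conform__ hack is needed to
disclaim dict-hood.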

The other problem with a Liskov opt-out is that you have to explicitly do a 
fair amount of work to create a LiskovViolation-raising subclass; that work 
would be better spent migrating to use delegation instead of inheritance, 
which would also be cleaner and more comprehensible code than writing a 
__conform__ hack to announce your bad style in having chosen to use 
inheritance where delegation is more appropriate.  ;)

This latter problem is actually much worse than the performance issue, 
which was just my initial impression.  Now that I've thought about it some 
more, I think I'm against supporting Liskov violations even if it were 
somehow *faster*.  :)

From aleax at aleax.it  Mon Jan 10 19:42:11 2005
From: aleax at aleax.it (Alex Martelli)
Date: Mon Jan 10 19:42:16 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
Message-ID: <5691D454-6337-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 10, at 18:43, Phillip J. Eby wrote:
    ...
> At 03:42 PM 1/10/05 +0100, Alex Martelli wrote:
>>     The fourth case above is subtle.  A break of substitutability can
>>     occur when a subclass changes a method's signature, or restricts
>>     the domains accepted for a method's argument ("co-variance" on
>>     arguments types), or extends the co-domain to include return
>>     values which the base class may never produce ("contra-variance"
>>     on return types).  While compliance based on class inheritance
>>     _should_ be automatic, this proposal allows an object to signal
>>     that it is not compliant with a base class protocol.
>
> -1 if this introduces a performance penalty to a wide range of 
> adaptations (i.e. those using abstract base classes), just to support 
> people who want to create deliberate Liskov violations.  I personally 
> don't think that we should pander to Liskov violators, especially 
> since Guido seems to be saying that there will be some kind of 
> interface objects available in future Pythons.

If interfaces can ensure against Liskov violations in instances of 
their subclasses, then they can follow the "case (a)" fast path, sure.  
Inheriting from an interface (in Guido's current proposal, as per his 
Artima blog) is a serious commitment from the inheritor's part; 
inheriting from an ordinary type, in real-world current practice, need 
not be -- too many cases of assumed covariance, for example, are around 
in the wild, to leave NO recourse in such cases and just assume 
compliance.


>>     Just like any other special method in today's Python, __conform__
>>     is meant to be taken from the object's class, not from the object
>>     itself (for all objects, except instances of "classic classes" as
>>     long as we must still support the latter).  This enables a
>>     possible 'tp_conform' slot to be added to Python's type objects in
>>     the future, if desired.
>
> One note here: Zope and PEAK sometimes use interfaces that a function 
> or module may implement.  PyProtocols' implementation does this by 
> adding a __conform__ object to the function's dictionary so that the 
> function can conform to a particular signature.  If and when 
> __conform__ becomes tp_conform, this may not be necessary any more, at 
> least for functions, because there will probably be some way for an 
> interface to tell if the function at least conforms to the appropriate 
> signature.  But for modules this will still be an issue.
>
> I am not saying we shouldn't have a tp_conform; just suggesting that 
> it may be appropriate for functions and modules (as well as classic 
> classes) to have their tp_conform delegate back to 
> self.__dict__['__conform__'] instead of a null implementation.

I have not considered conformance of such objects as functions or 
modules; if that is important, I need to add it to the reference 
implementation in the PEP.  I'm reluctant to just get __conform__ from 
the object, though; it leads to all sort of issues with a *class* 
conforming vs its *instances*, etc.  Maybe Guido can Pronounce a little 
on this sub-issue...


> I don't see the benefit of LiskovViolation, or of doing the exact type 
> check vs. the loose check.  What is the use case for these?  Is it to 
> allow subclasses to say, "Hey I'm not my superclass?"  It's also a bit 
> confusing to say that if the routines "raise any other exceptions" 
> they're propagated.  Are you saying that LiskovViolation is *not* 
> propagated?

Indeed I am -- I thought that was very clearly expressed!  
LiskovViolation means to skip the loose isinstance check, but it STILL 
allows explicitly registered adapter factories a chance (if somebody 
registers such an adapter factory, presumably they've coded a suitable 
adapter object type to deal with some deuced Liskov violation, see...). 
  On the other hand, if some random exception occurs in __conform__ or 
__adapt__, that's a bug somewhere, so the exception propagates in order 
to help debugging.  The previous version treated TypeError specially, 
but I think (on the basis of just playing around a bit, admittedly) 
that offers no real added value and sometimes will hide bugs.
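The control flow being described can be sketched like so (heavily simplified from the PEP's reference implementation; `__adapt__` and other details elided, and the registry shape is an assumption):

```python
class AdaptationError(TypeError):
    pass

class LiskovViolation(AdaptationError):
    pass

_registry = {}   # maps (protocol, type) -> adapter factory

def adapt(obj, protocol, registry=_registry):
    """Sketch of the dispatch order: __conform__, then the loose
    isinstance check (unless vetoed), then registered factories."""
    skip_isinstance = False
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        try:
            result = conform(obj, protocol)
            if result is not None:
                return result
        except LiskovViolation:
            # caught here: it only disables the loose isinstance check;
            # any *other* exception propagates, since that's a bug
            skip_isinstance = True
    if not skip_isinstance and isinstance(obj, protocol):
        return obj
    factory = registry.get((protocol, type(obj)))
    if factory is not None:
        # explicitly registered factories still get their chance
        return factory(obj)
    raise AdaptationError("can't adapt %r to %r" % (obj, protocol))
```

Note how a registered adapter factory still fires for a Liskov-violating subclass, exactly as described above.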


>>     If none of the first four mechanisms worked, as a last-ditch
>>     attempt, 'adapt' falls back to checking a registry of adapter
>>     factories, indexed by the protocol and the type of `obj', to meet
>>     the fifth case.  Adapter factories may be dynamically registered
>>     and removed from that registry to provide "third party adaptation"
>>     of objects and protocols that have no knowledge of each other, in
>>     a way that is not invasive to either the object or the protocols.
>
> This should either be fleshed out to a concrete proposal, or dropped.  
> There are many details that would need to be answered, such as whether 
> "type" includes subtypes and whether it really means type or 
> __class__.  (Note that isinstance() now uses __class__, allowing proxy 
> objects to lie about their class; the adaptation system should support 
> this too, and both the Zope and PyProtocols interface systems and 
> PyProtocols' generic functions support it.)

I disagree: I think the strawman-level proposal as fleshed out in the 
pep's reference implementation is far better than nothing.  I mention 
the issue of subtypes explicitly later, including why the pep does NOT 
do anything special with them -- the reference implementation deals 
with specific types.  And I use type(X) consistently, explicitly 
mentioning in the reference implementation that old-style classes are 
not covered.

I didn't know about the "let the object lie" quirk in isinstance.  If 
that quirk is indeed an intended design feature, rather than an 
implementation 'oops', it might perhaps be worth documenting it more 
clearly; I do not find that clearly spelled out in the place I'd expect 
it to be, namely <http://docs.python.org/lib/built-in-funcs.html> under 
'isinstance'.  If the "let the object lie" quirk is indeed a 
designed-in feature, then, I agree, using x.__class__ rather than 
type(x) is mandatory in the PEP and its reference implementation; 
however, I'll wait for confirmation of design intent before I change 
the PEP accordingly.
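For the record, the quirk is easy to demonstrate (a minimal sketch; real security proxies do a great deal more):

```python
class Proxy:
    """A minimal proxy that 'lies' about its class."""
    def __init__(self, target):
        self._target = target

    @property
    def __class__(self):
        # isinstance() consults __class__, so the lie is believed
        return type(self._target)

    def __getattr__(self, name):
        # delegate everything else to the proxied object
        return getattr(self._target, name)

p = Proxy([1, 2, 3])
assert isinstance(p, list)     # the lie is believed...
assert type(p) is not list     # ...but type() is not fooled
```

Which is precisely why the choice between x.__class__ and type(x) matters for the PEP.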

> One other issue: it's not possible to have standalone interoperable 
> PEP 246 implementations using a registry, unless there's a 
> standardized place to put it, and a specification for how it gets 
> there.  Otherwise, if someone is using both say Zope and PEAK in the 
> same application, they would have to take care to register adaptations 
> in both places.  This is actually a pretty minor issue since in 
> practice both frameworks' interfaces handle adaptation, so there is no 
> *need* for this extra registry in such cases.

I'm not sure I understand this issue, so I'm sure glad it's "pretty 
minor".

>>     Adaptation is NOT "casting".  When object X itself does not
>>     conform to protocol Y, adapting X to Y means using some kind of
>>     wrapper object Z, which holds a reference to X, and implements
>>     whatever operation Y requires, mostly by delegating to X in
>>     appropriate ways.  For example, if X is a string and Y is 'file',
>>     the proper way to adapt X to Y is to make a StringIO(X), *NOT* to
>>     call file(X) [which would try to open a file named by X].
>>
>>     Numeric types and protocols may need to be an exception to this
>>     "adaptation is not casting" mantra, however.
>
> The issue isn't that adaptation isn't casting; why would casting a 
> string to a file mean that you should open that filename?

Because, in most contexts, "casting" object X to type Y means calling 
Y(X).
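In those terms, the PEP's string-to-file example contrasts the two like this (using the modern io.StringIO spelling of the StringIO the PEP mentions):

```python
from io import StringIO

s = "some data"

# "Casting" in the Y(X) sense would be open(s) / file(s): it would try
# to open a file *named* "some data" -- almost certainly not intended.

# Adaptation instead wraps s in an object implementing the file
# protocol, delegating to the string's contents:
f = StringIO(s)
assert f.read() == "some data"
```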

>   I don't think that "adaptation isn't casting" is enough to explain 
> appropriate use of adaptation.  For example, I think it's quite valid 
> to adapt a filename to a *factory* for opening files, or a string to a 
> "file designator".  However, it doesn't make any sense (to me at 
> least) to adapt from a file designator to a file, which IMO is the 
> reason it's wrong to adapt from a string to a file in the way you 
> suggest.  However, casting doesn't come into it
> anywhere that I can see.

Maybe we're using different definitions of "casting"?

> If I were going to say anything about that case, I'd say that 
> adaptation should not be "lossy"; adapting from a designator to a file 
> loses information like what mode the file should be opened in.  
> (Similarly, I don't see adapting from float to int; if you want a cast 
> to int, cast it.)  Or to put it another way, adaptability should imply 
> substitutability: a string may be used as a filename, a filename may 
> be used to designate a file.  But a filename cannot be used as a file; 
> that makes no sense.

I don't understand this "other way" -- nor, to be honest, what you 
"would say" earlier, either.  I think it's pretty normal for adaptation 
to be "lossy" -- to rely on some but not all of the information in the 
original object: that's the "facade" design pattern, after all.  It 
doesn't mean that some info in the original object is lost forever, 
since the original object need not be altered; it just means that not 
ALL of the info that's in the original object is used in the adapter -- 
and, what's wrong with that?!

For example, say that I have some immutable "record" types.  One, type 
Person, defined in some framework X, has a huge lot of immutable data 
fields, including firstName, middleName, lastName, and many, many 
others.  Another, type Employee, defined in some separate framework Y 
(that has no knowledge of X, and vice versa), has fewer data fields, and 
in particular one called 'fullName' which is supposed to be a string 
such as 'Firstname M. Lastname'.  I would like to register an adapter 
factory from type Person to protocol Employee.  Since we said Person 
has many more data fields, adaptation will be "lossy" -- it will look 
upon Employee essentially as a "facade" (a simplified-interface) for 
Person.

Given the immutability, we MIGHT as well 'cast' here...:

def adapt_Person_to_Employee(person, protocol, alternate):
     assert issubclass(protocol, Y.Employee)
     return protocol(fullName='%s %s. %s' % (
         person.firstName, person.middleName[0],
         person.lastName))  # ...plus whatever other fields protocol needs

although the canonical approach would be to make a wrapper:

class adapt_Person_to_Employee(object):
     def __init__(self, person, protocol, alternate):
         assert issubclass(protocol, Y.Employee)
         self.p = person
     def getFullName(self):
         return '%s %s. %s' % (
             self.p.firstName, self.p.middleName[0], self.p.lastName)
     fullName = property(getFullName)

which would be more general (work fine even for a mutable Person).
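To make the wrapper version self-contained (stub Person type and names invented for the example; the registration step is left out):

```python
class Person:
    """Stand-in for framework X's immutable-or-not record type."""
    def __init__(self, firstName, middleName, lastName):
        self.firstName = firstName
        self.middleName = middleName
        self.lastName = lastName

class PersonAsEmployee:
    """Facade adapter: exposes only what the Employee protocol needs."""
    def __init__(self, person):
        self.p = person

    @property
    def fullName(self):
        return '%s %s. %s' % (
            self.p.firstName, self.p.middleName[0], self.p.lastName)

emp = PersonAsEmployee(Person('Ada', 'King', 'Lovelace'))
assert emp.fullName == 'Ada K. Lovelace'
```

Because the adapter holds a reference rather than a snapshot, later mutation of the Person shows through the facade, which is the "more general" property claimed above.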

So, can you please explain your objections to what I said about 
adapting vs casting in terms of this example?  Do you think the 
example, or some variation thereof, should go in the PEP?


>> Reference Implementation and Test Cases
>>
>>     The following reference implementation does not deal with classic
>>     classes: it considers only new-style classes.  If classic classes
>>     need to be supported, the additions should be pretty clear, though
>>     a bit messy (x.__class__ vs type(x), getting boundmethods directly
>>     from the object rather than from the type, and so on).
>
> Please base a reference implementation off of either Zope or 
> PyProtocols' field-tested implementations which deal correctly with 
> __class__ vs. type(), and can detect whether they're calling a 
> __conform__ or __adapt__ at the wrong metaclass level, etc.  Then, if 
> there is a reasonable use case for LiskovViolation and the new type 
> checking rules that justifies adding them, let's do so.

I think that if a PEP includes a reference implementation, it should be 
self-contained rather than require some other huge package.  If you can 
critique specific problems in the reference implementation, I'll be 
very grateful and eager to correct them.

>>     Transitivity of adaptation is in fact somewhat controversial, as
>>     is the relationship (if any) between adaptation and inheritance.
>
> The issue is simply this: what is substitutability?  If you say that 
> interface B is substitutable for A, and C is substitutable for B, then 
> C *must* be substitutable for A, or we have inadequately defined 
> "substitutability".

Not necessarily, depending on the pragmatics involved.

> If adaptation is intended to denote substitutability, then there can 
> be absolutely no question that it is transitive, or else it is not 
> possible to have any meaning for interface inheritance!

If interface inheritance is intended to express ensured 
substitutability (finessing pragmatics), fine.  I'm not willing to 
commit to that meaning in the PEP.
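Operationally, the transitivity being debated amounts to composing adapters; a sketch (the factories here are toy stand-ins, not a proposed API):

```python
def compose_adapters(a_to_b, b_to_c):
    """Build an A->C adapter factory from A->B and B->C factories."""
    def a_to_c(obj):
        return b_to_c(a_to_b(obj))
    return a_to_c

# Whether adapt() should ever synthesize such compositions implicitly
# is exactly the controversy: each hop may be lossy, and two
# individually acceptable losses can compound into an unacceptable one.
a_to_b = lambda s: int(s)      # toy "A -> B": str to int
b_to_c = lambda n: float(n)    # toy "B -> C": int to float
a_to_c = compose_adapters(a_to_b, b_to_c)
assert a_to_c("3") == 3.0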

Dinnertime -- I'd better send this already-long answer, and deal with 
the highly controversial remaining issues later.


Thanks, BTW, for your highly detailed feedback.


Alex

From mwh at python.net  Mon Jan 10 19:53:33 2005
From: mwh at python.net (Michael Hudson)
Date: Mon Jan 10 19:53:35 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5691D454-6337-11D9-ADA4-000A95EFAE9E@aleax.it> (Alex
	Martelli's message of "Mon, 10 Jan 2005 19:42:11 +0100")
References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5691D454-6337-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <2moefxf74i.fsf@starship.python.net>

Alex Martelli <aleax@aleax.it> writes:

> I didn't know about the "let the object lie" quirk in isinstance.  If
> that quirk is indeed an intended design feature, rather than an
> implementation 'oops', it might perhaps be worth documenting it more
> clearly; I do not find that clearly spelled out in the place I'd
> expect it to be, namely
> <http://docs.python.org/lib/built-in-funcs.html> under 'isinstance'.

Were you not at the PyPy sprint where bugs in some __getattr__ method
caused infinite recursion in isinstance's code attempting to
access __class__?  The isinstance code then silently eats the error,
so we had (a) a massive slowdown and (b) isinstance failing in an
"impossible" way.  A clue was that if you ran the code on OS X with
its silly default stack limits the code dumped core instead of going
slowly insane.

This is one quirk I'm not likely to forget in a hurry...
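The shape of that bug can be reconstructed in miniature (modern Pythons surface a RecursionError rather than silently eating the error inside isinstance, but the buggy-__getattr__ pattern is the same):

```python
class Broken:
    def __getattr__(self, name):
        # Buggy: unconditionally looks the attribute up on self again,
        # so every missing attribute recurses without bound.
        return getattr(self, name)

b = Broken()
try:
    b.anything
except RecursionError:
    pass  # in old CPython, isinstance() quietly swallowed this instead
```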

Cheers,
mwh

-- 
  If trees could scream, would we be so cavalier about cutting them
  down? We might, if they screamed all the time, for no good reason.
                                                        -- Jack Handey
From cce at clarkevans.com  Mon Jan 10 20:27:23 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Mon Jan 10 20:27:26 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com>
References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com>
Message-ID: <20050110192723.GA94340@prometheusresearch.com>

On Mon, Jan 10, 2005 at 01:34:59PM -0500, Phillip J. Eby wrote:
| The performance penalty I was talking about was for using an abstract 
| base class, in a subclass with a __conform__ method for conformance to 
| other protocols.  In this case, __conform__ will be uselessly called 
| every time the object is adapted to the abstract base class.

*nod*

If this proposal was "packaged" with an "interface" mechanism, would
this address your concern?  In this scenario, there are two cases:

  - Older classes will most likely not have a __conform__ method.
  - Newer classes will use the 'interface' mechanism.

In this scenario, there isn't a performance penalty for the 
usual case; and for migration purposes, a flag could be added
to disable the checking.

Best,

Clark
From michel at dialnetwork.com  Tue Jan 11 01:16:04 2005
From: michel at dialnetwork.com (Michel Pelletier)
Date: Mon Jan 10 22:46:41 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <20050110175818.429661E4002@bag.python.org>
References: <20050110175818.429661E4002@bag.python.org>
Message-ID: <200501101616.05123.michel@dialnetwork.com>

On Monday 10 January 2005 09:58 am, python-dev-request@python.org wrote:

> Message: 3
> Date: Mon, 10 Jan 2005 07:46:39 -0800
> From: Guido van Rossum <gvanrossum@gmail.com>
> Subject: Re: [Python-Dev] PEP 246, redux
> To: Alex Martelli <aleax@aleax.it>
> Cc: "Clark C.Evans" <cce@clarkevans.com>, Python Dev
> 	<python-dev@python.org>
> Message-ID: <ca471dc2050110074614f5edc3@mail.gmail.com>
> Content-Type: text/plain; charset=US-ASCII
>
> > I had been promising to rewrite PEP 246 to incorporate the last several
> > years' worth of discussions &c about it, and Guido's recent "stop the
> > flames" artima blog post finally pushed me to complete the work.
> > Feedback is of course welcome, so I thought I had better repost it
> > here, rather than relying on would-be commenters to get it from CVS...
>
> Thanks for doing this, Alex! I yet have to read the whole thing [will
> attempt do so later today] but the few snippets I caught make me feel
> this is a big step forward.

Me too!  I didn't realize, the first time 246 came around, how important 
adaptation was and how interfaces just aren't as useful without it.

>
> I'm wondering if someone could do a similar thing for PEP 245,
> interfaces syntax? Alex hinted that it's a couple of rounds behind the
> developments in Zope and Twisted. 

Nothing implements 245, which is just about the syntax.  I intended to write 
another PEP describing an implementation -- at the time, Jim's original 
straw-man -- which I'm glad I didn't do, as it would have been a waste of 
time.  Had I written that document, it would now be a couple of rounds behind 
Zope and Twisted.  But as it stands, nothing need be based on 245.

> I'm personally not keen on needing 
> *two* new keywords (interface and implements) so I hope that whoever
> does the rewrite could add a section on the advantages and
> disadvantages of the 'implements' keyword (my simplistic alternative
> proposal is to simply include interfaces in the list of bases in the
> class statement; the metaclass can then sort it out).

I like implements, but any spelling works for me.  "implements" strikes me as 
an elegant counterpart to "interface" and risks minimal breakage.  Can we 
still import and say "implements()" for b/w compatibility and for those of us 
who do want an explicit statement like that?

-Michel
From m.bless at gmx.de  Mon Jan 10 23:04:25 2005
From: m.bless at gmx.de (Martin Bless)
Date: Mon Jan 10 23:04:11 2005
Subject: [Python-Dev] Re: csv module TODO list
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<3o50u0tmv1hbpt71jkre94n32q38cdpbdb@4ax.com>
	<20050109233717.8A3C33C8E5@coffee.object-craft.com.au>
Message-ID: <0ns5u0t0jihihultsjulm5nicosb02c6uj@4ax.com>

On Mon, 10 Jan 2005 10:37:17 +1100, Andrew McNamara
<andrewm@object-craft.com.au> wrote:

>>csv.join(aList, e[, dialect='excel'[, fmtparam]]) -> str object

Oops, should have been

csv.join(aList [, dialect='excel'[, fmtparam]]) -> str object

>Yes, it's feasible,

Good!

>although newlines can be embedded within fields
>of a CSV record, hence the use of the iterator, rather than working with
>strings.

In my use cases newlines usually don't come into play. It would be ok
for me if they were treated as any other char.

> In your example above, if the parser gets to the end of the
>string and finds it's still within a field, I'd propose just raising
>an exception.

Yes, that seems to be "the right answer".

>No promises, however - I only have a finite amount of time to work on
>this at the moment.

Sure!

To my mind, these "intelligent split and join" functions would most
naturally be string methods.  I can see that - considering the
conceivable variety of dialects - this can't be done.  One more reason
to have 'split' and 'join' available from the csv module!
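In the meantime, the proposed csv.join/csv.split can be approximated on top of the existing csv.writer/csv.reader (a sketch only; the real proposal would live in the module and accept the full fmtparams):

```python
import csv
import io

def csv_join(row, dialect='excel'):
    """Join a list of fields into one CSV record (sans line terminator)."""
    buf = io.StringIO()
    csv.writer(buf, dialect=dialect).writerow(row)
    return buf.getvalue().rstrip('\r\n')

def csv_split(line, dialect='excel'):
    """Split one CSV record back into a list of fields."""
    return next(csv.reader([line], dialect=dialect))
```

As discussed above, a record whose quoted field runs past the end of the string is the case where split would have to raise.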

mb - Martin

From pje at telecommunity.com  Mon Jan 10 23:12:40 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 10 23:11:28 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <200501101616.05123.michel@dialnetwork.com>
References: <20050110175818.429661E4002@bag.python.org>
	<20050110175818.429661E4002@bag.python.org>
Message-ID: <5.1.1.6.0.20050110170408.039f12e0@mail.telecommunity.com>

At 04:16 PM 1/10/05 -0800, Michel Pelletier wrote:
> > From: Guido van Rossum <gvanrossum@gmail.com>
> > Subject: Re: [Python-Dev] PEP 246, redux
> >
> > I'm wondering if someone could do a similar thing for PEP 245,
> > interfaces syntax? Alex hinted that it's a couple of rounds behind the
> > developments in Zope and Twisted.
>
>Nothing implements 245, which is just about the syntax,

The comment Guido's alluding to was mine; I was referring to PEP 245's use 
of '__implements__', and the difference between what a "class implements" 
and an "instance provides".  Twisted and Zope's early implementations just 
looked for ob.__implements__, which leads to issues with distinguishing 
what a "class provides" from what its "instances provide".
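The conflation is easy to see in a minimal sketch (attribute name __implements__ as in the early Zope/Twisted style; the helper is invented for illustration):

```python
class IFoo:
    pass

class Thing:
    __implements__ = (IFoo,)   # intended meaning: *instances* provide IFoo

def naive_provides(ob, iface):
    """Naive lookup in the old ob.__implements__ style."""
    return iface in getattr(ob, '__implements__', ())

t = Thing()
assert naive_provides(t, IFoo)       # correct: the instance provides IFoo
assert naive_provides(Thing, IFoo)   # also True -- but wrong: the *class*
                                     # doesn't provide IFoo, it implements
                                     # it; plain attribute lookup can't
                                     # tell the two levels apart
```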

So, I was specifically saying that this aspect of PEP 245 (and Guido's 
basing a Python interface implementation thereon) should be re-examined in 
the light of current practices that avoid this issue.  (I don't actually 
know what Zope currently does; it was changed after I had moved to using 
PyProtocols.  But the PyProtocols test suite tests that Zope does in fact 
have correct behavior for instances versus classes, because it's needed to 
exercise the PyProtocols-Zope interop tests.)


>I like implements, but any spelling works for me.  "implements" strikes me as
>an elegant counterpart to "interface" and risks minimal breakage.  Can we
>still import and say "implements()" for b/w compatibility and for those of us
>who do want an explicit statement like that?

If I understand Guido's proposal correctly, it should be possible to make a 
backward-compatible 'implements()' declaration function.  Maybe not *easy*, 
but certainly possible.

From theller at python.net  Mon Jan 10 23:15:38 2005
From: theller at python.net (Thomas Heller)
Date: Mon Jan 10 23:14:16 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it> (Alex
	Martelli's message of "Mon, 10 Jan 2005 15:42:11 +0100")
References: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <oefxgcc5.fsf@python.net>

Alex Martelli <aleax@aleax.it> writes:

> PEP: 246
> Title: Object Adaptation

Minor nit (or not?): You could provide a pointer to the Liskov
substitution principle, for those readers who aren't too familiar with
the term.

Besides, the text mentions three times that LiskovViolation is a
subclass of AdaptationError (plus once in the ref impl section).

Thomas

From pje at telecommunity.com  Mon Jan 10 23:19:02 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 10 23:17:50 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <20050110192723.GA94340@prometheusresearch.com>
References: <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050110171247.039f35d0@mail.telecommunity.com>

At 02:27 PM 1/10/05 -0500, Clark C. Evans wrote:
>If this proposal was "packaged" with an "interface" mechanism, would
>this address your concern?  In this scenario, there are two cases:
>
>   - Older classes will most likely not have a __conform__ method.
>   - Newer classes will use the 'interface' mechanism.
>
>In this scenario, there isn't a performance penalty for the
>usual case; and for migration purposes, a flag could be added
>to disable the checking.

As I said, after more thought, I'm actually less concerned about the 
performance than I am about even remotely encouraging the combination of 
Liskov violation *and* concrete adaptation targets.  But, if "after the 
dust settles" it turns out this is going to be supported after all, then we 
can worry about the performance if need be.

Note, however, that your statements actually support the idea of *not* 
adding a special case for Liskov violators.  If newer code uses interfaces, 
the Liskov-violation mechanism is useless.  If older code doesn't have 
__conform__, it cannot possibly *use* the Liskov-violation mechanism.

So, if neither old code nor new code will use the mechanism, why have it?  :)

From pje at telecommunity.com  Mon Jan 10 22:38:55 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 10 23:28:38 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5691D454-6337-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>

At 07:42 PM 1/10/05 +0100, Alex Martelli wrote:

>On 2005 Jan 10, at 18:43, Phillip J. Eby wrote:
>    ...
>>At 03:42 PM 1/10/05 +0100, Alex Martelli wrote:
>>>     The fourth case above is subtle.  A break of substitutability can
>>>     occur when a subclass changes a method's signature, or restricts
>>>     the domains accepted for a method's argument ("co-variance" on
>>>     arguments types), or extends the co-domain to include return
>>>     values which the base class may never produce ("contra-variance"
>>>     on return types).  While compliance based on class inheritance
>>>     _should_ be automatic, this proposal allows an object to signal
>>>     that it is not compliant with a base class protocol.
>>
>>-1 if this introduces a performance penalty to a wide range of 
>>adaptations (i.e. those using abstract base classes), just to support 
>>people who want to create deliberate Liskov violations.  I personally 
>>don't think that we should pander to Liskov violators, especially since 
>>Guido seems to be saying that there will be some kind of interface 
>>objects available in future Pythons.
>
>If interfaces can ensure against Liskov violations in instances of their 
>subclasses, then they can follow the "case (a)" fast path, sure.
>Inheriting from an interface (in Guido's current proposal, as per his 
>Artima blog) is a serious commitment from the inheritor's part; inheriting 
>from an ordinary type, in real-world current practice, need not be -- too 
>many cases of assumed covariance, for example, are around in the wild, to 
>leave NO recourse in such cases and just assume compliance.

I understand that, sure.  But I don't understand why we should add 
complexity to PEP 246 to support not one but *two* bad practices: 1) 
implementing Liskov violations and 2) adapting to concrete classes.  It is 
only if you are doing *both* of these that this extra feature is needed.

If it were to support some kind of backward compatibility, that would be 
understandable.  However, in practice, I don't know of anybody using 
adapt(x,ConcreteClass), and even if they did, the person subclassing 
ConcreteClass will need to change their subclass to raise LiskovViolation, 
so why not just switch to delegation?

Anyway, it seems to me a bad idea to add complexity to support this 
case.  Do you have a more specific example of a situation in which a Liskov 
violation coupled to concrete class adaptation is a good idea?  Or am I 
missing something here?



>>I am not saying we shouldn't have a tp_conform; just suggesting that it 
>>may be appropriate for functions and modules (as well as classic classes) 
>>to have their tp_conform delegate back to self.__dict__['__conform__'] 
>>instead of a null implementation.
>
>I have not considered conformance of such objects as functions or modules; 
>if that is important,

It's used in at least Zope and PEAK; I don't know if it's in use in Twisted.


>  I need to add it to the reference implementation in the PEP.  I'm 
> reluctant to just get __conform__ from the object, though; it leads to 
> all sort of issues with a *class* conforming vs its *instances*, 
> etc.  Maybe Guido can Pronounce a little on this sub-issue...

Actually, if you looked at the field-tested implementations of the old PEP 
246, they actually have code that deals with this issue effectively, by 
recognizing TypeError when raised by attempting to invoke __adapt__ or 
__conform__ with the wrong number of arguments or argument types.  (The 
traceback for such errors does not include a frame for the called method, 
versus a TypeError raised *within* the function, which does have such a 
frame.  AFAIK, this technique should be compatible with any Python 
implementation that has traceback objects and does signature validation in 
its "native" code rather than in a new Python frame.)
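That technique can be sketched in a few lines (simplified; the field-tested implementations are more careful about descriptor retrieval and argument types):

```python
import sys

def call_hook(hook, *args):
    """Invoke a __conform__/__adapt__-style hook, distinguishing a
    wrong-signature call from a genuine TypeError raised inside it."""
    try:
        return hook(*args)
    except TypeError:
        tb = sys.exc_info()[2]
        # If the callee's frame never ran, the traceback stops at this
        # frame: the TypeError came from the call itself (bad signature,
        # e.g. a hook fetched from the wrong meta level), so treat the
        # hook as inapplicable rather than buggy.
        if tb.tb_next is None:
            return None
        raise  # raised *inside* the hook: a real bug, let it propagate

def wrong_sig():            # takes no args: calling it with one fails
    pass

def buggy(protocol):        # raises inside its own frame
    raise TypeError('real bug')

assert call_hook(wrong_sig, object) is None
```

The design choice here is exactly the one described above: signature validation happens in native code before a new Python frame exists, so the traceback depth reveals where the TypeError originated.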


>>I don't see the benefit of LiskovViolation, or of doing the exact type 
>>check vs. the loose check.  What is the use case for these?  Is it to 
>>allow subclasses to say, "Hey I'm not my superclass?"  It's also a bit 
>>confusing to say that if the routines "raise any other exceptions" 
>>they're propagated.  Are you saying that LiskovViolation is *not* propagated?
>
>Indeed I am -- I thought that was very clearly expressed!

The PEP just said that it would be raised by __conform__ or __adapt__, not 
that it would be caught by adapt() or that it would be used to control the 
behavior in that way.  Re-reading, I see that you do mention it much 
farther down.  But at the point where __conform__ and __adapt__ are 
explained, it has not been explained that adapt() should catch the error or 
do anything special with it.  It is simply implied by the "to prevent this 
default behavior" at the end of the section.  If this approach is accepted, 
the description should be made explicit, because for me at least it 
required a retroactive re-interpretation of the earlier part of the spec.


>The previous version treated TypeError specially, but I think (on the 
>basis of just playing around a bit, admittedly) that offers no real added 
>value and sometimes will hide bugs.

See http://peak.telecommunity.com/protocol_ref/node9.html for an analysis 
of the old PEP 246 TypeError behavior, and the changes made by 
PyProtocols and Zope to deal with the situation better, while still 
respecting the fact that __conform__ and __adapt__ may be retrieved from 
the wrong "meta level" of descriptor.

Your new proposal does not actually fix this problem in the absence of 
tp_conform/tp_adapt slots; it merely substitutes possible confusion at the 
metaclass/class level for confusion at the class/instance level.  The only 
way to actually fix this is to detect when you have called the wrong level, 
and that is what the PyProtocols and Zope implementations of "old PEP 246" 
do.  (PyProtocols also introduces a special descriptor for methods defined 
on metaclasses, to help avoid creating this possible confusion in the first 
place, but that is a separate issue.)


>>>     If none of the first four mechanisms worked, as a last-ditch
>>>     attempt, 'adapt' falls back to checking a registry of adapter
>>>     factories, indexed by the protocol and the type of `obj', to meet
>>>     the fifth case.  Adapter factories may be dynamically registered
>>>     and removed from that registry to provide "third party adaptation"
>>>     of objects and protocols that have no knowledge of each other, in
>>>     a way that is not invasive to either the object or the protocols.
>>
>>This should either be fleshed out to a concrete proposal, or dropped.
>>There are many details that would need to be answered, such as whether 
>>"type" includes subtypes and whether it really means type or 
>>__class__.  (Note that isinstance() now uses __class__, allowing proxy 
>>objects to lie about their class; the adaptation system should support 
>>this too, and both the Zope and PyProtocols interface systems and 
>>PyProtocols' generic functions support it.)
>
>I disagree: I think the strawman-level proposal as fleshed out in the 
>pep's reference implementation is far better than nothing.

I'm not proposing to flesh out the functionality, just the specification; 
it should not be necessary to read the reference implementation and try to 
infer intent from it.  What part is implementation accident, and what is 
supposed to be the specification?  That's all I'm talking about here.  As 
currently written, the proposal is just, "we should have a registry", and 
is not precise enough to allow someone to implement it based strictly on 
the specification.
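
To make "we should have a registry" concrete, here is the kind of minimal 
sketch I mean (all names hypothetical; the questions the spec would have to 
settle are marked in comments):

```python
# Hypothetical sketch of a PEP 246 adapter-factory registry.
# Unspecified in the current draft: should lookup walk the MRO
# (i.e. cover subtypes), and should it key on type(obj) or obj.__class__?
_registry = {}

def register_adapter(cls, protocol, factory):
    """Register 'factory' to adapt instances of 'cls' to 'protocol'."""
    _registry[cls, protocol] = factory

def registry_adapt(obj, protocol):
    """Last-ditch lookup: exact type only, as in the reference implementation."""
    factory = _registry.get((type(obj), protocol))
    if factory is None:
        raise NotImplementedError("can't adapt", obj, protocol)
    return factory(obj, protocol)
```

Even this toy version forces the issue: two cooperating implementations must 
agree on every one of these choices, or registrations made in one are 
invisible to the other.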


>   I mention the issue of subtypes explicitly later, including why the pep 
> does NOT do anything special with them -- the reference implementation 
> deals with specific types.  And I use type(X) consistently, explicitly 
> mentioning in the reference implementation that old-style classes are not 
> covered.

As a practical matter, classic classes exist and are useful, and PEP 246 
implementations already exist that work with them.  Dropping that 
functionality is a major step backward for PEP 246, IMO.


>I didn't know about the "let the object lie" quirk in isinstance.  If that 
>quirk is indeed an intended design feature,

It is; it's in one of the "what's new" feature highlights for either 2.3 or 
2.4, I forget which.  It was intended to allow proxy objects (like security 
proxies in Zope 3) to pretend to be an instance of the class they are proxying.
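
For readers who haven't seen the quirk, it is easy to demonstrate (sketch):

```python
class Secret(object):
    pass

class Proxy(object):
    """A proxy that lies about its class, as Zope 3 security proxies do."""
    def __init__(self, target):
        self._target = target
    @property
    def __class__(self):
        # isinstance() consults __class__, so the proxy passes for its target
        return type(self._target)

p = Proxy(Secret())
assert isinstance(p, Secret)   # the lie works
assert type(p) is Proxy        # but type() is not fooled
```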


>>One other issue: it's not possible to have standalone interoperable PEP 
>>246 implementations using a registry, unless there's a standardized place 
>>to put it, and a specification for how it gets there.  Otherwise, if 
>>someone is using both say Zope and PEAK in the same application, they 
>>would have to take care to register adaptations in both places.  This is 
>>actually a pretty minor issue since in practice both frameworks' 
>>interfaces handle adaptation, so there is no *need* for this extra 
>>registry in such cases.
>
>I'm not sure I understand this issue, so I'm sure glad it's "pretty minor".

All I was saying is that if you have two 'adapt()' implementations, each 
using its own registry, you have a possible interoperability problem.  Two 
'adapt()' implementations that conform strictly to PEP 246 *without* a 
registry are interoperable because their behavior is the same.

As a practical matter, all this means is that standalone PEP 246 
implementations for older versions of Python either shouldn't implement a 
registry, or they need to have a standard place to put it that they can 
share with each other, and it needs to be implemented the same way.  (This 
is one reason I think the registry specification should be more formal; it 
may be necessary for existing PEP 246 implementations to be 
forward-compatible with the spec as implemented in later Python versions.)


>>The issue isn't that adaptation isn't casting; why would casting a string 
>>to a file mean that you should open that filename?
>
>Because, in most contexts, "casting" object X to type Y means calling Y(X).

Ah; I had not seen that called "casting" in Python, at least not to my 
immediate recollection.  However, if that is what you mean, then why not 
say it?  :)


>Maybe we're using different definitions of "casting"?

I'm most accustomed to the C and Java definitions of casting, so that's 
probably why I can't see how it relates at all.  :)


>>If I were going to say anything about that case, I'd say that adaptation 
>>should not be "lossy"; adapting from a designator to a file loses 
>>information like what mode the file should be opened in.
>>(Similarly, I don't see adapting from float to int; if you want a cast to 
>>int, cast it.)  Or to put it another way, adaptability should imply 
>>substitutability: a string may be used as a filename, a filename may be 
>>used to designate a file.  But a filename cannot be used as a file; that 
>>makes no sense.
>
>I don't understand this "other way" -- nor, to be honest, what you "would 
>say" earlier, either.  I think it's pretty normal for adaptation to be 
>"lossy" -- to rely on some but not all of the information in the original 
>object: that's the "facade" design pattern, after all.  It doesn't mean 
>that some info in the original object is lost forever, since the original 
>object need not be altered; it just means that not ALL of the info that's 
>in the original object is used in the adapter -- and, what's wrong with that?!

I think we're using different definitions of "lossy", too.  I mean that 
defining an adaptation relationship between two types when there is more 
than one "sensible" way to get from one to the other is "lossy" of 
semantics/user choice.  If I have a file designator (such as a filename), I 
can choose how to open it.  If I adapt directly from string to file by way 
of filename, I lose this choice (it is "lossy" adaptation).

Here's a better way of phrasing it (I hope): adaptation should be 
unambiguous.  There should only be one sensible way to interpret a thing as 
implementing a particular interface, otherwise, adaptation itself has no 
meaning.  Whether an adaptation adds or subtracts behavior, it does not 
really change the underlying *intended* meaning of a thing, or else it is 
not really adaptation.  Adapting 12.0 to 12 does not change the meaning of 
the value, but adapting from 12.1 to 12 does.
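
In code, the rule I'm proposing is something like this (hypothetical sketch):

```python
def adapt_to_int(value):
    # Non-lossy: adapting 12.0 to 12 preserves the value's meaning.
    # Lossy: adapting 12.1 to 12 would silently change it, so refuse
    # and make the caller convert explicitly with int(value).
    result = int(value)
    if result != value:
        raise TypeError("lossy adaptation refused: %r" % (value,))
    return result
```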

Does that make more sense?  I think that some people start using adaptation 
and want to use it for all kinds of crazy things because it seems 
cool.  However, it takes a while to see that adaptation is just about 
removing unnecessary accidents-of-incompatibility; it's not a license to 
transform arbitrary things into arbitrary things.  There has to be some 
*meaning* to a particular adaptation, or the whole concept rapidly 
degenerates into an undifferentiated mess.

(Or else, you decide to "fix" it by disallowing transitive adaptation, 
which IMO is like cutting off your hand because it hurts when you punch a 
brick wall.  Stop punching brick walls (i.e. using semantic-lossy 
adaptations), and the problem goes away.  But I realize that I'm in the 
minority here with regards to this opinion.)


>For example, say that I have some immutable "record" types.  One, type 
>Person, defined in some framework X, has a huge lot of immutable data 
>fields, including firstName, middleName, lastName, and many, many 
>others.  Another, type Employee, defined in some separate framework Y 
>(that has no knowledge of X, and vice versa), has fewer data fields, and in 
>particular one called 'fullName' which is supposed to be a string such as 
>'Firstname M. Lastname'.  I would like to register an adapter factory from 
>type Person to protocol Employee.  Since we said Person has many more 
>data fields, adaptation will be "lossy" -- it will look upon Employee 
>essentially as a "facade" (a simplified-interface) for Person.

But it doesn't change the *meaning*.  I realize that "meaning" is not an 
easy concept to pin down into a nice formal definition.  I'm just saying 
that adaptation is about semantics-preserving transformations, otherwise 
you could just tack an arbitrary object on to something and call it an 
adapter.  Adapters should be about exposing an object's *existing 
semantics* in terms of a different interface, whether the interface is a 
subset or superset of the original object's interface.  However, they 
should not add or remove arbitrary semantics that are not part of the 
difference in interfaces.

For example, adding a "current position" to a string to get a StringIO is a 
difference that is reflected in the difference in interface: a StringIO 
*is* just a string of characters with a current position that can be used 
in place of slicing.

But interpreting a string as a *file* doesn't make sense because of added 
semantics that have to be "made up", and are not merely interpreting the 
string's semantics "as a" file.  I suppose you could say that this is 
"noisy" adaptation rather than "lossy".  That is, to treat a string as a 
file by using it as a filename, you have to make up things that aren't 
present in the string.  (Versus the StringIO, where there's a sensible 
interpretation of a string "as a" StringIO.)
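
For the record, the StringIO case in a few lines (using the modern 
io.StringIO for illustration; the StringIO module behaves the same for this 
purpose):

```python
import io

s = "hello world"
f = io.StringIO(s)           # sensible "as a" interpretation: the string's
assert f.read(5) == "hello"  # own characters, plus a current position
f.seek(6)
assert f.read() == "world"

# By contrast, open(s) would treat the *contents* of the string as a
# filename -- made-up semantics that aren't in the string itself.
```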

IOW, adaptation is all about "as a" relationships from concrete objects to 
abstract roles, and between abstract roles.  Although one may colloquially 
speak of using a screwdriver "as a" hammer, this is not the case in 
adaptation.  One may use a screwdriver "as a" pounder-of-nails.  The 
difference is that a hammer might also be usable "as a" 
remover-of-nails.  Therefore, there is no general "as a" relationship 
between pounder-of-nails and remover-of-nails, even though a hammer is 
usable "as" either one.  Thus, it does not make sense to say that a 
screwdriver is usable "as a" hammer, because this would imply it's also 
usable to remove nails.

This is why I don't believe it makes sense in the general case to adapt to 
concrete classes; such classes usually have many roles where they are 
usable.  I think the main difference in your position and mine is that I 
think one should adapt primarily to interfaces, and interface-to-interface 
adaptation should be reserved for non-lossy, non-noisy adapters.  Where if 
I understand the opposing position correctly, it is instead that one should 
avoid transitivity so that loss and noise do not accumulate too badly.


>So, can you please explain your objections to what I said about adapting 
>vs casting in terms of this example?  Do you think the example, or some 
>variation thereof, should go in the PEP?

I'm not sure I see how that helps.  I think it might be more useful to say 
that adaptation is not *conversion*, which is not the same thing (IME) as 
casting.  Casting in C and Java does not actually "convert" anything; it 
simply treats a value or object as if it were of a different type.  ISTM 
that bringing casting into the terminology just complicates the picture, 
because e.g. casting in Java actually corresponds to the subset of PEP 246 
adaptation for cases where adapt() returns the original object or raises an 
error.  (That is, if adapt() could only ever return the original object or 
raise an error, it would be precisely equivalent to Java casting, if I 
understand it correctly.)  Thus, at least with regard to object casting in 
Java, adaptation is a superset, and saying that it's not casting is just 
confusing.
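
That subset is small enough to write down (sketch):

```python
def cast_like_adapt(obj, protocol):
    # The subset of PEP 246 adaptation that corresponds to a Java object
    # cast: either the object already is-a protocol instance (return it
    # unchanged), or the "cast" fails with an error.  No adapter is ever
    # manufactured.
    if isinstance(obj, protocol):
        return obj
    raise TypeError("cannot cast %r to %s" % (obj, protocol.__name__))
```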


>>>Reference Implementation and Test Cases
>>>
>>>     The following reference implementation does not deal with classic
>>>     classes: it considers only new-style classes.  If classic classes
>>>     need to be supported, the additions should be pretty clear, though
>>>     a bit messy (x.__class__ vs type(x), getting boundmethods directly
>>>     from the object rather than from the type, and so on).
>>
>>Please base a reference implementation off of either Zope or PyProtocols' 
>>field-tested implementations which deal correctly with __class__ vs. 
>>type(), and can detect whether they're calling a __conform__ or __adapt__ 
>>at the wrong metaclass level, etc.  Then, if there is a reasonable use 
>>case for LiskovViolation and the new type checking rules that justifies 
>>adding them, let's do so.
>
>I think that if a PEP includes a reference implementation, it should be 
>self-contained rather than require some other huge package.  If you can 
>critique specific problems in the reference implementation, I'll be very 
>grateful and eager to correct them.

Sure, I've got some above (e.g. your implementation will raise a spurious 
TypeError if it calls an __adapt__ or __conform__ at the wrong metaclass 
level, and getting them from the type does *not* fix this issue, it just 
bumps it up by one metalevel).  I wasn't proposing you pull in either whole 
package, though; just adapt() itself.  Here's the existing Python one from 
PyProtocols (there's also a more low-level one using the Python/C API, but 
it's probably not appropriate for the spec):

from sys import exc_info
from types import ClassType
ClassTypes = type, ClassType

_marker = object()   # unique sentinel meaning "no default supplied"

class AdaptationFailure(TypeError):
     """Raised when no adaptation is found and no default was given."""

def adapt(obj, protocol, default=_marker):

     """PEP 246-alike: Adapt 'obj' to 'protocol'; return 'default'

     if no implementation is found.  If 'default' is not supplied,
     'AdaptationFailure' is raised instead."""

     if isinstance(protocol,ClassTypes) and isinstance(obj,protocol):
         return obj

     try:
         _conform = obj.__conform__
     except AttributeError:
         pass
     else:
         try:
             result = _conform(protocol)
             if result is not None:
                 return result
         except TypeError:
             if exc_info()[2].tb_next is not None:
                 raise
     try:
         _adapt = protocol.__adapt__
     except AttributeError:
         pass
     else:
         try:
             result = _adapt(obj)
             if result is not None:
                 return result
         except TypeError:
             if exc_info()[2].tb_next is not None:
                 raise

     if default is _marker:
         raise AdaptationFailure("Can't adapt", obj, protocol)

     return default

Obviously, some changes would need to be made to implement your newly 
proposed functionality, but this one does support classic classes, modules, 
and functions, and it has neither the TypeError-hiding problem of the 
original PEP 246 nor the TypeError-raising problem of your new version.
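
For illustration, here is the shape of a __conform__-based adapter under 
this protocol (self-contained sketch; all names made up):

```python
class IReadable:
    """A marker 'protocol' object; adapt() only needs identity, not content."""

class UpperCaseFile:
    def __init__(self, text):
        self.text = text
    def __conform__(self, protocol):
        # The object decides how to present itself for a given protocol.
        if protocol is IReadable:
            return self.text.upper()
        return None   # not this protocol; adapt() keeps looking

obj = UpperCaseFile("hi")
assert obj.__conform__(IReadable) == "HI"
assert obj.__conform__(object) is None
```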


>>>     Transitivity of adaptation is in fact somewhat controversial, as
>>>     is the relationship (if any) between adaptation and inheritance.
>>
>>The issue is simply this: what is substitutability?  If you say that 
>>interface B is substitutable for A, and C is substitutable for B, then C 
>>*must* be substitutable for A, or we have inadequately defined 
>>"substitutability".
>
>Not necessarily, depending on the pragmatics involved.

In that case, I generally prefer to be explicit and use conversion rather 
than using adaptation.  For example, if I really mean to truncate the 
fractional part of a number, I believe it's then appropriate to use 
'int(someNumber)' and make it clear that I'm intentionally using a lossy 
conversion rather than simply treating a number "as an" integer without 
changing its meaning.


>Thanks, BTW, for your highly detailed feedback.

No problem; talking this out helps me clarify my own thoughts on these 
matters.  I haven't had much occasion to clarify these matters, and when 
they come up, it's usually in the context of arguing some specific 
inappropriate use of adaptation, so I can easily present an alternative 
that makes sense in that context.  This discussion is helping me clarify 
the general principle, since I have to try to argue the general case, not 
just N specific cases.  :)

From bob at redivi.com  Mon Jan 10 23:42:52 2005
From: bob at redivi.com (Bob Ippolito)
Date: Mon Jan 10 23:43:05 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
Message-ID: <F67C1459-6358-11D9-AF07-000A95BA5446@redivi.com>


On Jan 10, 2005, at 16:38, Phillip J. Eby wrote:

> At 07:42 PM 1/10/05 +0100, Alex Martelli wrote:
>
>> On 2005 Jan 10, at 18:43, Phillip J. Eby wrote:
>>    ...
>>> I am not saying we shouldn't have a tp_conform; just suggesting that 
>>> it may be appropriate for functions and modules (as well as classic 
>>> classes) to have their tp_conform delegate back to 
>>> self.__dict__['__conform__'] instead of a null implementation.
>>
>> I have not considered conformance of such objects as functions or 
>> modules; if that is important,
>
> It's used in at least Zope and PEAK; I don't know if it's in use in 
> Twisted.

SVN trunk of Twisted (what will be 2.0) uses zope.interface.  It still 
has the older stuff implemented as a wrapper on top of zope.interface, 
but I think the guideline is to just use zope.interface directly for 
new code dependent on Twisted 2.0.

-bob

From pje at telecommunity.com  Mon Jan 10 23:59:40 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 10 23:58:30 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <F67C1459-6358-11D9-AF07-000A95BA5446@redivi.com>
References: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050110175734.02dfcda0@mail.telecommunity.com>

At 05:42 PM 1/10/05 -0500, Bob Ippolito wrote:

>On Jan 10, 2005, at 16:38, Phillip J. Eby wrote:
>
>>At 07:42 PM 1/10/05 +0100, Alex Martelli wrote:
>>
>>>On 2005 Jan 10, at 18:43, Phillip J. Eby wrote:
>>>    ...
>>>>I am not saying we shouldn't have a tp_conform; just suggesting that it 
>>>>may be appropriate for functions and modules (as well as classic 
>>>>classes) to have their tp_conform delegate back to 
>>>>self.__dict__['__conform__'] instead of a null implementation.
>>>
>>>I have not considered conformance of such objects as functions or 
>>>modules; if that is important,
>>
>>It's used in at least Zope and PEAK; I don't know if it's in use in Twisted.
>
>SVN trunk of Twisted (what will be 2.0) uses zope.interface.

What I meant was, I don't know if Twisted actually *uses* interface 
declarations for modules and functions.  It has the ability to do so, 
certainly.  I was just saying I didn't know if the ability is actually used.

PEAK uses some interfaces for functions, but I don't think I've ever used 
them for modules, and can think of only one place in PEAK where it would 
make sense to declare a module as supporting an interface.  Zope policy is 
to use interfaces for *everything*, though, including documenting the 
interface provided by modules.

From phil at secdev.org  Mon Jan 10 17:17:49 2005
From: phil at secdev.org (Philippe Biondi)
Date: Tue Jan 11 01:53:33 2005
Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support
Message-ID: <Pine.LNX.4.44.0501101659370.7440-100000@deneb.home.phil>

Hi,

I've done a small patch to use linux AF_NETLINK sockets (see below).
Please comment!

Is there a reason for recvmsg() and sendmsg() not to be implemented
yet in socketmodule ?


The autoconf integration has not been regenerated, but this patch
to configure.in should be OK:

--- configure.in.ori    2005-01-10 17:09:32.000000000 +0100
+++ configure.in        2005-01-06 18:53:18.000000000 +0100
@@ -967,7 +967,7 @@
 sys/audioio.h sys/bsdtty.h sys/file.h sys/loadavg.h sys/lock.h sys/mkdev.h \
 sys/modem.h \
 sys/param.h sys/poll.h sys/select.h sys/socket.h sys/time.h sys/times.h \
-sys/un.h sys/utsname.h sys/wait.h pty.h libutil.h \
+sys/un.h linux/netlink.h sys/utsname.h sys/wait.h pty.h libutil.h \
 sys/resource.h netpacket/packet.h sysexits.h bluetooth.h \
 bluetooth/bluetooth.h)
 AC_HEADER_DIRENT
--- pyconfig.h.ori      2005-01-10 17:11:11.000000000 +0100
+++ pyconfig.h  2005-01-06 19:27:33.000000000 +0100
@@ -559,6 +559,9 @@
 /* Define to 1 if you have the <sys/un.h> header file. */
 #define HAVE_SYS_UN_H 1

+/* Define to 1 if you have the <linux/netlink.h> header file. */
+#define HAVE_LINUX_NETLINK_H 1
+
 /* Define to 1 if you have the <sys/utsname.h> header file. */
 #define HAVE_SYS_UTSNAME_H 1



--- socketmodule.h.ori  2005-01-07 19:25:18.000000000 +0100
+++ socketmodule.h      2005-01-06 18:20:54.000000000 +0100
@@ -32,6 +32,12 @@
 # undef AF_UNIX
 #endif

+#ifdef HAVE_LINUX_NETLINK_H
+# include <linux/netlink.h>
+#else
+#  undef AF_NETLINK
+#endif
+
 #ifdef HAVE_BLUETOOTH_BLUETOOTH_H
 #include <bluetooth/bluetooth.h>
 #include <bluetooth/rfcomm.h>
@@ -87,6 +93,9 @@ typedef struct {
 #ifdef AF_UNIX
                struct sockaddr_un un;
 #endif
+#ifdef AF_NETLINK
+               struct sockaddr_nl nl;
+#endif
 #ifdef ENABLE_IPV6
                struct sockaddr_in6 in6;
                struct sockaddr_storage storage;
--- socketmodule.c.ori  2005-01-07 19:25:19.000000000 +0100
+++ socketmodule.c      2005-01-10 17:04:38.000000000 +0100
@@ -948,6 +948,14 @@ makesockaddr(int sockfd, struct sockaddr
        }
 #endif /* AF_UNIX */

+#if defined(AF_NETLINK)
+       case AF_NETLINK:
+       {
+               struct sockaddr_nl *a = (struct sockaddr_nl *) addr;
+               return Py_BuildValue("ii", a->nl_pid, a->nl_groups);
+       }
+#endif /* AF_NETLINK */
+
 #ifdef ENABLE_IPV6
        case AF_INET6:
        {
@@ -1084,6 +1092,31 @@ getsockaddrarg(PySocketSockObject *s, Py
        }
 #endif /* AF_UNIX */

+#if defined(AF_NETLINK)
+       case AF_NETLINK:
+       {
+               struct sockaddr_nl* addr;
+               int pid, groups;
+               addr = (struct sockaddr_nl *)&(s->sock_addr).nl;
+               if (!PyTuple_Check(args)) {
+                       PyErr_Format(
+                               PyExc_TypeError,
+                               "getsockaddrarg: "
+                               "AF_NETLINK address must be tuple, not %.500s",
+                               args->ob_type->tp_name);
+                       return 0;
+               }
+               if (!PyArg_ParseTuple(args, "II", &pid, &groups))
+                       return 0;
+               addr->nl_family = AF_NETLINK;
+               addr->nl_pid = pid;
+               addr->nl_groups = groups;
+               *addr_ret = (struct sockaddr *) addr;
+               *len_ret = sizeof(*addr);
+               return 1;
+       }
+#endif
+
        case AF_INET:
        {
                struct sockaddr_in* addr;
@@ -1280,6 +1313,13 @@ getsockaddrlen(PySocketSockObject *s, so
                return 1;
        }
 #endif /* AF_UNIX */
+#if defined(AF_NETLINK)
+       case AF_NETLINK:
+       {
+               *len_ret = sizeof (struct sockaddr_nl);
+               return 1;
+       }
+#endif

        case AF_INET:
        {
@@ -3938,8 +3978,20 @@ init_socket(void)
        PyModule_AddIntConstant(m, "AF_KEY", AF_KEY);
 #endif
 #ifdef AF_NETLINK
-       /*  */
+       /* Netlink socket */
        PyModule_AddIntConstant(m, "AF_NETLINK", AF_NETLINK);
+       PyModule_AddIntConstant(m, "NETLINK_ROUTE", NETLINK_ROUTE);
+       PyModule_AddIntConstant(m, "NETLINK_SKIP", NETLINK_SKIP);
+       PyModule_AddIntConstant(m, "NETLINK_USERSOCK", NETLINK_USERSOCK);
+       PyModule_AddIntConstant(m, "NETLINK_FIREWALL", NETLINK_FIREWALL);
+       PyModule_AddIntConstant(m, "NETLINK_TCPDIAG", NETLINK_TCPDIAG);
+       PyModule_AddIntConstant(m, "NETLINK_NFLOG", NETLINK_NFLOG);
+       PyModule_AddIntConstant(m, "NETLINK_XFRM", NETLINK_XFRM);
+       PyModule_AddIntConstant(m, "NETLINK_ARPD", NETLINK_ARPD);
+       PyModule_AddIntConstant(m, "NETLINK_ROUTE6", NETLINK_ROUTE6);
+       PyModule_AddIntConstant(m, "NETLINK_IP6_FW", NETLINK_IP6_FW);
+       PyModule_AddIntConstant(m, "NETLINK_DNRTMSG", NETLINK_DNRTMSG);
+       PyModule_AddIntConstant(m, "NETLINK_TAPBASE", NETLINK_TAPBASE);
 #endif
 #ifdef AF_ROUTE
        /* Alias to emulate 4.4BSD */
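
For what it's worth, user code under this patch might look like the 
following (Linux-only sketch; the (pid, groups) tuple address format 
follows the makesockaddr/getsockaddrarg changes above):

```python
import socket

# Sketch: open a netlink route socket and bind it.
# Guarded so it is a no-op on platforms without AF_NETLINK.
if hasattr(socket, "AF_NETLINK"):
    s = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                      socket.NETLINK_ROUTE)
    s.bind((0, 0))                 # pid 0: let the kernel assign one
    pid, groups = s.getsockname()  # address comes back as a (pid, groups) tuple
    s.close()
```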


-- 
Philippe Biondi <phil@ secdev.org>      SecDev.org
Security Consultant/R&D                 http://www.secdev.org
PGP KeyID:3D9A43E2  FingerPrint:C40A772533730E39330DC0985EE8FF5F3D9A43E2

From neal at metaslash.com  Tue Jan 11 00:31:26 2005
From: neal at metaslash.com (Neal Norwitz)
Date: Tue Jan 11 01:53:34 2005
Subject: [Python-Dev] Re: [Csv] csv module TODO list
In-Reply-To: <20050105110849.CBA843C8E5@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<20050105110849.CBA843C8E5@coffee.object-craft.com.au>
Message-ID: <20050110233126.GA14363@janus.swcomplete.com>

On Wed, Jan 05, 2005 at 10:08:49PM +1100, Andrew McNamara wrote:
> >Also, review comments from Neal Norwitz, 22 Mar 2003 (some of these should
> >already have been addressed):
> 
> I should apologise to Neal here for not replying to him at the time.

Hey, I'm impressed you got to them.  :-) I completely forgot about it.

> >* rather than use PyErr_BadArgument, should you use assert?
> >        (first example, Dialect_set_quoting, line 218)
> 
> You mean C assert()? I don't think I'm really following you here -
> where would the type of the object be checked in a way the user could
> recover from?

IIRC, I meant C assert().  This goes back to a discussion a long time
ago about what is the preferred way to handle invalid arguments.
I doubt it's important to change.

> >* I think you need PyErr_NoMemory() before returning on line 768, 1178
> 
> The examples I looked at in the Python core didn't do this - are you sure?
> (now lines 832 and 1280). 

Originally, they were a plain PyObject_NEW().  Now they are a
PyObject_GC_New() so it seems no further change is necessary.

> >* is PyString_AsString(self->dialect->lineterminator) on line 994
> >        guaranteed not to return NULL?  If not, it could crash by
> >        passing to memmove.
> >* PyString_AsString() can return NULL on line 1048 and 1063, 
> >        the result is passed to join_append()
> 
> Looking at the PyString_AsString implementation, it looks safe (we ensure
> it's really a string elsewhere)?

Ok.  Then it should be fine.  I spot checked lineterminator and it
looked ok.

> >* iteratable should be iterable?  (line 1088)
> 
> Sorry, I don't know what you're getting at here? (now line 1162).

Heh, I had to read that twice myself.  It was a typo (assuming
I wasn't completely wrong)--an extra "at", but it doesn't exist
any longer.

I don't think there are any changes remaining to be done from my
original code review.

BTW, I always try to run valgrind before a release, especially
major releases.

Neal
From dw-python.org at botanicus.net  Tue Jan 11 02:32:52 2005
From: dw-python.org at botanicus.net (David Wilson)
Date: Tue Jan 11 02:32:56 2005
Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support
In-Reply-To: <Pine.LNX.4.44.0501101659370.7440-100000@deneb.home.phil>
References: <Pine.LNX.4.44.0501101659370.7440-100000@deneb.home.phil>
Message-ID: <20050111013252.GA216@thailand.botanicus.net>

On Mon, Jan 10, 2005 at 05:17:49PM +0100, Philippe Biondi wrote:

> I've done a small patch to use linux AF_NETLINK sockets (see below).
> Please comment!

As of 2.6.10, a very useful new netlink family was merged -
NETLINK_KOBJECT_UEVENT. I'd imagine quite a lot of interest from Python
developers for NETLINK support will come from this new interface in the
coming years.

    http://lwn.net/Articles/101210/
    http://lkml.org/lkml/2004/9/10/315
    http://vrfy.org/projects/kevents/
    http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.10

I would like to see (optional?) support for this before your patch is
merged. I have a long-term interest in a Python-based service control /
init replacement / system management application, for use in specialised
environments. I could definitely use this. :)

Thanks,


David.

-- 
Harmless - and in its harmlessness, diabolical.
    -- The Mold Of Yancy (Philip K. Dick)
From martin at v.loewis.de  Tue Jan 11 08:54:42 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue Jan 11 08:54:41 2005
Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support
In-Reply-To: <Pine.LNX.4.44.0501101659370.7440-100000@deneb.home.phil>
References: <Pine.LNX.4.44.0501101659370.7440-100000@deneb.home.phil>
Message-ID: <41E38642.9080108@v.loewis.de>

Philippe Biondi wrote:
> I've done a small patch to use linux AF_NETLINK sockets (see below).
> Please comment!

I have a high-level comment - python-dev is normally the wrong place
for patches; please submit them to sf.net/projects/python instead.

Apart from that, the patch looks fine.

> Is there a reason for recvmsg() and sendmsg() not to be implemented
> yet in socketmodule ?

I'm not sure what you mean by "implemented": these functions are
implemented by the operating system, not in the socketmodule.

If you ask "why are they not exposed to Python yet?": There has been no
need to do so, so far. What do I get with recvmsg that I cannot get with
recv/recvfrom just as well?

Regards,
Martin
From aleax at aleax.it  Tue Jan 11 10:34:08 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 10:34:16 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
Message-ID: <F1743E00-63B3-11D9-ADA4-000A95EFAE9E@aleax.it>

The volume of these discussions is (as expected) growing beyond any 
reasonable bounds; I hope the BDFL can find time to read them but I'm 
starting to doubt he will.  Since obviously we're not going to convince 
each other, and it seems to me we're at least getting close to 
pinpointing our differences, maybe we should try to jointly develop an 
"executive summary" of our differences and briefly stated pros and cons 
-- a PEP is _supposed_ to have such a section, after all.  Anyway, for 
now, here goes another LONG mail...:

On 2005 Jan 10, at 22:38, Phillip J. Eby wrote:
    ...
>> If interfaces can ensure against Liskov violations in instances of 
>> their subclasses, then they can follow the "case (a)" fast path, 
>> sure.
>> Inheriting from an interface (in Guido's current proposal, as per his 
>> Artima blog) is a serious commitment from the inheritor's part; 
>> inheriting from an ordinary type, in real-world current practice, 
>> need not be -- too many cases of assumed covariance, for example, are 
>> around in the wild, to leave NO recourse in such cases and just 
>> assume compliance.
>
> I understand that, sure.  But I don't understand why we should add 
> complexity to PEP 246 to support not one but *two* bad practices: 1) 
> implementing Liskov violations and 2) adapting to concrete classes.  
> It is only if you are doing *both* of these that this extra feature is 
> needed.

s/support/deal with/ .  If I was designing a "greenfield" system, I'd 
love nothing better than making a serious split between concrete 
classes (directly instantiable), abstract classes (subclassable), and 
protocols (adaptable-to).  Scott Meyers's suggestion, in one of his 
Effective C++ books, to never subclass a concrete class, has much to 
recommend itself, in particular.  But in the real world people do want 
to subclass concrete classes, just as much as they want covariance AND, 
I'll bet, adapting to classes, too, not just to interfaces seen as a 
separate category from classes.  I think PEP 246 should deal with the 
real world.  If the BDFL thinks otherwise, and believes that in this 
case Python should impose best practices rather than pragmatically deal 
with the way people's minds (as opposed to type-system maths) appear to 
work, I'll be glad to recast the PEP in that light.

Contrary to what you state, adapting to concrete (instantiable) classes 
rather than abstract (not directly instantiable) ones is not necessary 
to make the mechanism required, by the way.  Consider an abstract class 
such as:

class Abstract(object):
     def tp1(self):
         ''' template method 1 (calls hook method 1) '''
         return self.hook1()
     def tp2(self):
         ''' template method 2 (calls hook method 2) '''
         return self.hook2()
     def hook1(self): raise NotImplementedError
     def hook2(self): raise NotImplementedError

One could subclass it just to get tp1...:

class Dubious(Abstract):
     def hook1(self): ''' implementing just hook1 '''

Now, instantiating d=Dubious() is dubious practice, but, absent 
specific checks, it "works", as long as only d.hook1() and d.tp1() are 
ever called -- never d.hook2() nor d.tp2().

I would like adapt(d, Abstract) to fail.  I'm not claiming that the 
ability to have a __conform__ method in Dubious to specifically block 
this adaptation is anywhere like a perfect solution, mind you -- it 
does require some change to the source of Dubious, for example.  I'm 
just saying that I think it's better than nothing, and that is where we 
seem to disagree.
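Concretely, the blocking could look like this minimal sketch -- where 
LiskovViolation and the toy adapt() are stand-ins for the PEP's 
proposed machinery, not existing stdlib names:

```python
class LiskovViolation(Exception):
    """Stand-in for PEP 246's proposed LiskovViolation exception."""

class Abstract(object):
    def tp1(self):
        return self.hook1()   # template method 1 calls hook method 1
    def hook1(self):
        raise NotImplementedError
    def hook2(self):
        raise NotImplementedError

class Dubious(Abstract):
    def hook1(self):
        return 'hook1'
    def __conform__(self, protocol):
        # this subclassing is implementation reuse only, not IS-A:
        # explicitly refuse adaptation to Abstract
        if issubclass(protocol, Abstract):
            raise LiskovViolation

def adapt(obj, protocol):
    # toy adapt(): honor __conform__, else fall back to isinstance
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        try:
            result = conform(obj, protocol)
        except LiskovViolation:
            raise TypeError('adaptation refused: Liskov violation')
        if result is not None:
            return result
    if isinstance(obj, protocol):
        return obj
    raise TypeError('cannot adapt')
```

With this in place, d = Dubious() still works for d.tp1(), but 
adapt(d, Abstract) fails instead of silently succeeding via isinstance.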

> If it were to support some kind of backward compatibility, that would 
> be understandable.  However, in practice, I don't know of anybody 
> using adapt(x,ConcreteClass), and even if they did, the person 
> subclassing ConcreteClass will need to change their subclass to raise 
> LiskovViolation, so why not just switch to delegation?

Because delegation doesn't give you _easy_ access to Template Method 
design patterns which may well be the one reason you're subclassing 
Abstract in the first place.  Template Method hinges on a method calling some 
self.dothis(), self.dothat() hook methods; to get it via delegation 
rather than inheritance requires a more complicated arrangement where 
that 'self' actually belongs to a "private" concrete class which 
delegates some things back to the "real" class.  In practice, 
inheritance as a means of code reuse (rather than as a pristine 
Liskovian assertion of purity) is quite popular because of that.  C++ 
essentially acknowledges this fact by allowing _private_ inheritance, 
essentially meaning "I'm reusing that code but don't really mean to 
assert IS-A"; the effects of private inheritance could be simulated by 
delegation to a private auxiliary class, but the extra indirections and 
complications aren't negligible costs in terms of code complexity and 
maintainability.  In Python, we don't distinguish between private 
inheritance (to just reuse code) and ordinary inheritance (assumed to 
imply Liskov sub.), but that doesn't make the need go away.  The 
__conform__ raising LiskovViolation could be seen as a way to give the 
subclass the ability to say "this inheritance here is ``private'', an 
issue of implementation only and not Liskov compliant".

Maybe the ability to ``fake'' __class__ can help, but right now I don't 
see how, because setting __class__ isn't fake at all -- it really 
affects object behavior and type:

 >>> class A(object):
...   def x(self): print "A"
...
 >>> a = A()
 >>> class B(object):
...   def x(self): print "B"
...
 >>> a.__class__ = B
 >>> a.x()
B
 >>> type(a)
<class '__main__.B'>
 >>>

So, it doesn't seem to offer a way to fake out isinstance only, without 
otherwise affecting behavior.

> Anyway, it seems to me a bad idea to add complexity to support this 
> case.  Do you have a more specific example of a situation in which a 
> Liskov violation coupled to concrete class adaptation is a good idea?  
> Or am I missing something here?

I can give no example at all in which adapting to a concrete class is a 
_good_ idea, and I tried to indicate that in the PEP.  I just believe 
that if adaptation does not offer the possibility of using concrete 
classes as protocols, but rather requires the usage as protocols of 
some specially blessed 'interface' objects or whatever, then PEP 246 
will never fly, (a) because it would then require waiting for the 
interface thingies to appear, and (b) because people will find it 
pragmatically useful to just reuse the same classes as protocols too, 
and too limiting to have to design protocols specifically instead.  So, 
I see the ability to adapt to concrete (or, for that matter, abstract) 
classes as a "practicality beats purity" idea, needed to deal with the 
real world and the way people's minds work.

In practice we need covariance at least until a perfect system of 
parameterized interfaces is in place, and you can see from the Artima 
discussion just how difficult that is.  I want to reuse (say) DictMixin 
on my mappings which restrict keys to be strings, for example, even 
though such a mapping does not conform to an unrestricted, 
unparameterized Mapping protocol.
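For illustration, such a restricted mapping might look like this sketch 
(DictMixin lives in Python 2's UserDict module; here MutableMapping 
stands in for the same mixin role, the exact mixin being incidental to 
the point):

```python
from collections.abc import MutableMapping  # plays DictMixin's role

class StrKeyedMap(MutableMapping):
    """A mapping restricted to string keys: it reuses all the mixin's
    derived methods (update, setdefault, ...) even though it does NOT
    conform to the unrestricted Mapping protocol."""
    def __init__(self):
        self._data = {}
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError('keys must be strings')  # covariant restriction
        self._data[key] = value
    def __getitem__(self, key):
        return self._data[key]
    def __delitem__(self, key):
        del self._data[key]
    def __iter__(self):
        return iter(self._data)
    def __len__(self):
        return len(self._data)
```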


>>  I need to add it to the reference implementation in the PEP.  I'm 
>> reluctant to just get __conform__ from the object, though; it leads 
>> to all sort of issues with a *class* conforming vs its *instances*, 
>> etc.  Maybe Guido can Pronounce a little on this sub-issue...
>
> Actually, if you looked at the field-tested implementations of the old 
> PEP 246, they actually have code that deals with this issue 
> effectively, by recognizing TypeError when raised by attempting to 
> invoke __adapt__ or __conform__ with the wrong number of arguments or 
> argument types.  (The traceback for such errors does not include a 
> frame for the called method, versus a TypeError raised *within* the 
> function, which does have such a frame.  AFAIK, this technique should 
> be compatible with any Python implementation that has traceback 
> objects and does signature validation in its "native" code rather than 
> in a new Python frame.)
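The frame-inspection trick being described can be sketched as follows 
(a hypothetical reconstruction, NOT the actual PyProtocols code; it 
relies on CPython raising signature TypeErrors before the callee's 
frame is ever created):

```python
import sys

def safe_call(func, *args):
    """Call func(*args); swallow a TypeError that came from the call
    itself (wrong signature: the callee's frame was never entered, so
    the traceback has no frame for it), but propagate a TypeError
    raised *inside* the callee."""
    try:
        return func(*args)
    except TypeError:
        tb = sys.exc_info()[2]
        if tb.tb_next is None:
            # the call never entered func: signature mismatch
            return None
        raise
```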

I do not like the idea of making 'adapt' such a special case compared 
with other built-in functions which internally call special methods.  
What I mean is, for example:

 >>> class H(object):
...   def __hash__(self): return 23
...
 >>> h = H()
 >>> h.__hash__ = lambda: 42
 >>> hash(h)
23

For hash, and all kinds of other built-in functions and operations, it 
*does not matter* whether instance h has its own per-instance __hash__ 
-- H.__hash__ is what gets called anyway.  Making adapt work 
differently gives me the shivers.

Moreover, the BDFL is thinking of removing the "unbound method" concept 
and having such accesses as Aclass.somemethod return just a plain 
function.  The internal typechecks done by unbound methods, on which 
such techniques as you mention may depend, might therefore be about to 
go away; this doesn't make it look nice to depend on them in a 
reference implementation.

If Guido pronounces otherwise, I'll gladly change the reference 
implementation accordingly (or remove said reference implementation, as 
you appear to suggest elsewhere), but unless and until this happens, 
I'm not convinced.


>>> I don't see the benefit of LiskovViolation, or of doing the exact 
>>> type check vs. the loose check.  What is the use case for these?  Is 
>>> it to allow subclasses to say, "Hey I'm not my superclass?"  It's 
>>> also a bit confusing to say that if the routines "raise any other 
>>> exceptions" they're propagated.  Are you saying that LiskovViolation 
>>> is *not* propagated?
>>
>> Indeed I am -- I thought that was very clearly expressed!
>
> The PEP just said that it would be raised by __conform__ or __adapt__, 
> not that it would be caught by adapt() or that it would be used to 
> control the behavior in that way.  Re-reading, I see that you do 
> mention it much farther down.  But at the point where __conform__ and 
> __adapt__ are explained, it has not been explained that adapt() should 
> catch the error or do anything special with it.  It is simply implied 
> by the "to prevent this default behavior" at the end of the section.  
> If this approach is accepted, the description should be made explicit, 
> because for me at least it required a retroactive re-interpretation 
> of the earlier part of the spec.

OK, I'll add more repetition to the specs, trying to make them more 
"sequentially readable", even though they were already criticized 
because they do repeat some aspects more than once.


>> The previous version treated TypeError specially, but I think (on the 
>> basis of just playing around a bit, admittedly) that offers no real 
>> added value and sometimes will hide bugs.
>
> See http://peak.telecommunity.com/protocol_ref/node9.html for an 
> analysis of the old PEP 246 TypeError behavior, and the changes made 
> by PyProtocols and Zope to deal with the situation better, while 
> still respecting the fact that __conform__ and __adapt__ may be 
> retrieved from the wrong "meta level" of descriptor.

I've read that, and I'm not convinced, see above.

> Your new proposal does not actually fix this problem in the absence of 
> tp_conform/tp_adapt slots; it merely substitutes possible confusion at 
> the metaclass/class level for confusion at the class/instance level.  
> The only way to actually fix this is to detect when you have called 
> the wrong level, and that is what the PyProtocols and Zope 
> implementations of "old PEP 246" do.  (PyProtocols also introduces a 
> special descriptor for methods defined on metaclasses, to help avoid 
> creating this possible confusion in the first place, but that is a 
> separate issue.)

Can you give an example of "confusion at metaclass/class level"?  I 
can't see it.


>>> This should either be fleshed out to a concrete proposal, or dropped.
>>> There are many details that would need to be answered, such as 
>>> whether "type" includes subtypes and whether it really means type or 
>>> __class__.  (Note that isinstance() now uses __class__, allowing 
>>> proxy objects to lie about their class; the adaptation system should 
>>> support this too, and both the Zope and PyProtocols interface 
>>> systems and PyProtocols' generic functions support it.)
>>
>> I disagree: I think the strawman-level proposal as fleshed out in the 
>> pep's reference implementation is far better than nothing.
>
> I'm not proposing to flesh out the functionality, just the 
> specification; it should not be necessary to read the reference 
> implementation and try to infer intent from it.  What part is 
> implementation accident, and what is supposed to be the specification? 
>  That's all I'm talking about here.  As currently written, the 
> proposal is just, "we should have a registry", and is not precise 
> enough to allow someone to implement it based strictly on the 
> specification.

Wasn't Python supposed to be executable pseudocode, and isn't 
pseudocode an acceptable way to express specs?-)  Ah well, I see your 
point, so that may well require more repetitious expansion, too.

>>   I mention the issue of subtypes explicitly later, including why the 
>> pep does NOT do anything special with them -- the reference 
>> implementation deals with specific types.  And I use type(X) 
>> consistently, explicitly mentioning in the reference implementation 
>> that old-style classes are not covered.
>
> As a practical matter, classic classes exist and are useful, and PEP 
> 246 implementations already exist that work with them.  Dropping that 
> functionality is a major step backward for PEP 246, IMO.

I disagree that entirely new features of Python (as opposed to external 
third party add-ons) should add complications to deal with old-style 
classes.  Heh, shades of the "metatype conflict removal" recipe 
discussion a couple months ago, right?-)  But then that recipe WAS a 
"third-party add-on".  If Python grew an intrinsic way to deal with 
metaclass conflicts, I'd be DELIGHTED if it didn't work for old-style 
classes, as long as this simplified it.

Basically, we both agree that adaptation must accept some complication 
to deal with practical real-world issues that are gonna stay around, we 
just disagree on what those issues are.  You appear to think old-style 
classes will stay around and need to be supported by new core Python 
functionality, while I think they can be pensioned off; you appear to 
think that programmers' minds will miraculously shift into a mode where 
they don't need covariance or other Liskov violations, and programmers 
will happily extract the protocol-ish aspects of their classes into 
neat pristine protocol objects rather than trying to double-use the 
classes as protocols too, while I think human nature won't budge much 
on this respect in the near future.

I have, I hope, amply clarified the roots of our disagreements, so we 
can wait for BDFL input before the needed PEP 246 rewrites.  If his 
opinions are much closer to yours than to mine, then perhaps the best 
next step would be to add you as the first author of the PEP and let 
you perform the next rewrite -- would you be OK with that?

>> I didn't know about the "let the object lie" quirk in isinstance.  If 
>> that quirk is indeed an intended design feature,
>
> It is; it's in one of the "what's new" feature highlights for either 
> 2.3 or 2.4, I forget which.  It was intended to allow proxy objects 
> (like security proxies in Zope 3) to pretend to be an instance of the 
> class they are proxying.

I just grepped through whatsnew23.tex and whatsnew24.tex and could not 
find it.  Can you please help me find the exact spot?  Thanks!

>>> The issue isn't that adaptation isn't casting; why would casting a 
>>> string to a file mean that you should open that filename?
>>
>> Because, in most contexts, "casting" object X to type Y means calling 
>> Y(X).
>
> Ah; I had not seen that called "casting" in Python, at least not to my 
> immediate recollection.  However, if that is what you mean, then why 
> not say it?  :)

What _have_ you seen called "casting" in Python?

>> Maybe we're using different definitions of "casting"?
>
> I'm most accustomed to the C and Java definitions of casting, so 
> that's probably why I can't see how it relates at all.  :)

Well, in C++ you can call (int)x or int(x) with the same semantics -- 
they're both casts.  In C or Java you must use the former syntax, in 
Python the latter, but they still relate.


>>> If I were going to say anything about that case, I'd say that 
>>> adaptation should not be "lossy"; adapting from a designator to a 
>>> file loses information like what mode the file should be opened in.
>>> (Similarly, I don't see adapting from float to int; if you want a 
>>> cast to int, cast it.)  Or to put it another way, adaptability 
>>> should imply substitutability: a string may be used as a filename, a 
>>> filename may be used to designate a file.  But a filename cannot be 
>>> used as a file; that makes no sense.
>>
>> I don't understand this "other way" -- nor, to be honest, what you 
>> "would say" earlier, either.  I think it's pretty normal for 
>> adaptation to be "lossy" -- to rely on some but not all of the 
>> information in the original object: that's the "facade" design 
>> pattern, after all.  It doesn't mean that some info in the original 
>> object is lost forever, since the original object need not be 
>> altered; it just means that not ALL of the info that's in the 
>> original object is used in the adapter -- and, what's wrong with that?!
>
> I think we're using different definitions of "lossy", too.  I mean 
> that defining an adaptation relationship between two types when there 
> is more than one "sensible" way to get from one to the other is 
> "lossy" of semantics/user choice.  If I have a file designator (such 
> as a filename), I can choose how to open it.  If I adapt directly from 
> string to file by way of filename, I lose this choice (it is "lossy" 
> adaptation).

You could have specified some options (such as the mode) but they took 
their default value instead ('r' in this case).  What's ``lossy'' about 
accepting defaults?!

The adjective "lossy" is overwhelmingly often used in describing 
compression, and in that context it means, can every bit of the 
original be recovered (then the compression is lossless) or not (then 
it's lossy).  I can't easily find "lossy" used elsewhere than in 
compression, it's not even in American Heritage.  Still, when you 
describe a transformation such as 12.3 -> 12 as "lossy", the analogy is 
quite clear to me.  When you so describe the transformation 'foo.txt' 
-> file('foo.txt'), you've lost me completely: every bit of the 
original IS still there, as the .name attribute of the file object, so 
by no stretch of the imagination can I see the "lossiness" -- what bits 
of information are LOST?

I'm not just belaboring a term, I think the concept is very important, 
see later.

>
> Here's a better way of phrasing it (I hope): adaptation should be 
> unambiguous.  There should only be one sensible way to interpret a 
> thing as implementing a particular interface, otherwise, adaptation 
> itself has no meaning.  Whether an adaptation adds or subtracts 
> behavior, it does not really change the underlying *intended* meaning 
> of a thing, or else it is not really adaptation.  Adapting 12.0 to 12 
> does not change the meaning of the value, but adapting from 12.1 to 12 
> does.
>
> Does that make more sense?  I think that some people start using 
> adaptation and want to use

Definitely more sense than 'lossy', but that's only because the latter 
didn't make any sense to me at all (when stretched to include, e.g., 
opening files).  Again, see later.

> it for all kinds of crazy things because it seems cool.  However, it 
> takes a while to see that adaptation is just about removing 
> unnecessary accidents-of-incompatibility; it's not a license to 
> transform arbitrary things into arbitrary things.  There has to be 
> some *meaning* to a particular adaptation, or the whole concept 
> rapidly degenerates into an undifferentiated mess.

We agree, philosophically.  Not sure how the PEP could be enriched to 
get this across.  We still disagree, pragmatically, see later.

> (Or else, you decide to "fix" it by disallowing transitive adaptation, 
> which IMO is like cutting off your hand because it hurts when you 
> punch a brick wall.  Stop punching brick walls (i.e. using 
> semantic-lossy adaptations), and the problem goes away.  But I realize 
> that I'm in the minority here with regards to this opinion.)

I'm not so sure about your being in the minority, having never read for 
example Guido's opinion in the matter.  But, let's take an example of 
Facade.  (Here's the 'later' I kept pointing to;-).

I have three data types / protocols: LotsOfInfo has a bazillion data 
fields, including personFirstName, personMiddleName, personLastName, 
...
PersonName has just two data fields, theFirstName and theLastName.
FullName has three, itsFirst, itsMiddle, itsLast.

The adaptation between such types/protocols has meaning: drop/ignore 
redundant fields, rename relevant fields, make up missing ones by some 
convention (empty strings if they have to be strings, None to mean "I 
dunno" like SQL NULL, etc).  But, this *IS* lossy in some cases, in the 
normal sense: through the facade (simplified interface) I can't access 
ALL of the bits in the original (information-richer).

Adapting LotsOfInfo -> PersonName is fine; so is LotsOfInfo -> 
FullName.

Adapting PersonName -> FullName is iffy, because I don't have the 
deuced middlename information.  But that's what NULL aka None is for, 
so if that's allowed, I can survive.

But going from LotsOfInfo to FullName transitively, by way of 
PersonName, cannot give the same result as going directly -- the middle 
name info disappears, because there HAS been a "lossy" step.
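Spelled out as code (the record types from above; the adapter 
factories are hypothetical registry entries, and the sample name is 
made up for the demo):

```python
class LotsOfInfo(object):
    def __init__(self, first, middle, last):
        self.personFirstName = first
        self.personMiddleName = middle
        self.personLastName = last

class PersonName(object):
    def __init__(self, first, last):
        self.theFirstName = first
        self.theLastName = last

class FullName(object):
    def __init__(self, first, middle, last):
        self.itsFirst = first
        self.itsMiddle = middle
        self.itsLast = last

def lots_to_person(x):    # Facade: drops the middle name
    return PersonName(x.personFirstName, x.personLastName)

def person_to_full(x):    # must make up the missing field: NULL-ish None
    return FullName(x.theFirstName, None, x.theLastName)

def lots_to_full(x):      # the direct adapter keeps everything needed
    return FullName(x.personFirstName, x.personMiddleName,
                    x.personLastName)

loi = LotsOfInfo('Anna', 'Maria', 'Rossi')
direct = lots_to_full(loi)
transitive = person_to_full(lots_to_person(loi))
# direct.itsMiddle == 'Maria', but transitive.itsMiddle is None:
# the lossy intermediate step silently dropped the information
```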

So the issue of "lossy" DOES matter, and I think you muddy things up 
when you try to apply it to a string -> file adaptation ``by casting'' 
(opening the file thus named).

Forbidding lossy adaptation means forbidding facade here; not being 
allowed to get adaptation from a rich source of information when what's 
needed is a subset of that info with some renaming and perhaps mixing.  
I would not like that *AT ALL*; I believe it's unacceptable.

Forbidding indications of "I don't know" comparable to SQL's NULL (thus 
forbidding the adaptation PersonName -> FullName) might make the whole 
scheme incompatible with the common use of relational databases and the 
like -- probably not acceptable, either.

Allowing both lossy adaptations, NULLs, _and_ transitivity inevitably 
leads sooner or later to ACCIDENTAL info loss -- the proper adapter to 
go directly LotsOfInfo -> FullName was not registered, and instead of 
getting an exception to point out that error, your program limps along 
having accidentally dropped a piece of information, here the 
middle-name.

So, I'd like to disallow transitivity.


>> For example, say that I have some immutable "record" types.  One, 
>> type Person, defined in some framework X, has a huge lot of immutable 
>> data fields, including firstName, middleName, lastName, and many, 
>> many others.  Another, type Employee, defines in some separate 
>> framework Y (that has no knowlege of X, and viceversa), has fewer 
>> data fields, and in particular one called 'fullName' which is 
>> supposed to be a string such as 'Firstname M. Lastname'.  I would 
>> like to register an adapter factory from type Person to protocol 
>> Employeee.  Since we said Person has many more data fields, 
>> adaptation will be "lossy" -- it will look upon Employee essentially 
>> as a "facade" (a simplified-interface) for Person.
>
> But it doesn't change the *meaning*.  I realize that "meaning" is not 
> an easy concept to pin down into a nice formal definition.  I'm just 
> saying that adaptation is about semantics-preserving transformations, 
> otherwise you could just tack an arbitrary object on to something and 
> call it an adapter.  Adapters should be about exposing an object's 
> *existing semantics*
> in terms of a different interface, whether the interface is a subset 
> or superset of the original object's interface.  However, they should 
> not add or remove arbitrary semantics that are not part of the 
> difference in interfaces.

OK, but then 12.3 -> 12 should be OK, since the loss of the fractional 
part IS part of the difference in interfaces, right?  And yet it 
doesn't SMELL like adaptation to me -- which is why I tried to push the 
issue away with the specific disclaimer about numbers.

> For example, adding a "current position" to a string to get a StringIO 
> is a difference that is reflected in the difference in interface: a 
> StringIO *is* just a string of characters with a current position that 
> can be used in place of slicing.
>
> But interpreting a string as a *file* doesn't make sense because of 
> added semantics that have to be "made up", and are not merely 
> interpreting the string's semantics "as a" file.  I suppose you could 
> say that this is "noisy" adaptation rather than "lossy".  That is, to 
> treat a string as a file by using it as a filename, you have to make 
> up things that aren't present in the string.  (Versus the StringIO, 
> where there's a sensible interpretation of a string "as a" StringIO.)
>
> IOW, adaptation is all about "as a" relationships from concrete 
> objects to abstract roles, and between abstract roles.  Although one 
> may colloquially speak of using a screwdriver "as a" hammer, this is 
> not the case in adaptation.  One may use a screwdriver "as a" 
> pounder-of-nails.  The difference is that a hammer might also be 
> usable "as a" remover-of-nails.  Therefore, there is no general "as a" 
> relationship between pounder-of-nails and remover-of-nails, even 
> though a hammer is usable "as" either one.  Thus, it does not make 
> sense to say that a screwdriver is usable "as a" hammer, because this 
> would imply it's also usable to remove nails.

I like the "as a" -- but it can't ignore Facade, I think.

>
> This is why I don't believe it makes sense in the general case to 
> adapt to concrete classes; such classes usually have many roles where 
> they are usable.  I think the main difference in your position and 
> mine is that I think one should adapt primarily to interfaces, and

I fully agree here.  I see the need to adapt to things that aren't 
protocols as an unpleasant reality we have to (heh) adapt to, not ideal 
by any means.

> interface-to-interface adaptation should be reserved for non-lossy, 
> non-noisy adapters.

No Facade, no NULLs?  Yes, we disagree about this one: I believe 
adaptation that occurs by showing just a subset of the info, with 
renaming etc, is absolutely fine (Facade); and adaptation by using an 
allowed NULL (say None) to mean "missing information", when going to a 
"wider" interface, is not pleasant but is sometimes indispensable in 
the real world -- that's why SQL works in the real world, even though 
SQL beginners and a few purists hate NULLs with a vengeance.

> Where if I understand the opposing position correctly, it is instead 
> that one should avoid transitivity so that loss and noise do not 
> accumulate too badly.

In a sense -- but that has nothing to do with concrete classes etc, in 
this context.  All of the "records"-like datatypes I'm using around 
here may perfectly well be as interfacey as you please, as long as 
interfaces/protocols let you access attributes property-like, and if 
they don't just transliterate to getThis, getThat, getTheOther, no big 
deal.

The points are rather that adaptation that "loses" (actually "hides") 
some information is something we MUST have; and adaptation that 
supplies "I don't know" markers (NULL-like) for some missing 
information, where that's allowed, is really very desirable.  Call this 
lossy and noisy if you wish, we still can't do without.

Transitivity is a nice convenience, IF it could be something that an 
adapter EXPLICITLY claims rather than something just happening by 
default.  I might live with it, grudgingly, if it was the default with 
some nice easy way to turn it off; my problem with that is -- even if 
90% of the cases could afford to be transitive, people will routinely 
forget to mark the other 10% and mysterious, hard-to-find bugs will 
result.  The identical objection can be raised about the 
LiskovViolation mechanism, which is why I say it's not perfect by any 
stretch of the imagination, btw (I just think SOME mechanism to turn 
off the default is needed and can't think of a better one yet).

In PyProtocols docs you specifically warn against adapting from an 
adapter... yet that's what transitivity intrinsically does!
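One hypothetical shape for such an explicit claim (a sketch of my own, 
not anything that exists in the PEP or in PyProtocols): adapters are 
registered with a flag, and composition happens only through adapters 
that claim to be transitivity-safe.

```python
# (from_type, to_type) -> (adapter_factory, claims_transitivity)
_registry = {}

def register(from_t, to_t, factory, transitive=False):
    _registry[(from_t, to_t)] = (factory, transitive)

def adapt(obj, to_t):
    from_t = type(obj)
    entry = _registry.get((from_t, to_t))
    if entry is not None:
        return entry[0](obj)        # a direct adapter always wins
    # compose through ONE intermediate hop, but only when BOTH
    # adapters have EXPLICITLY claimed transitivity
    for (a, mid), (f1, t1) in _registry.items():
        if a is from_t and t1:
            entry2 = _registry.get((mid, to_t))
            if entry2 is not None and entry2[1]:
                return entry2[0](f1(obj))
    raise TypeError('cannot adapt %r to %r' % (from_t, to_t))
```

An adapter registered without transitive=True can then never be 
silently composed into a longer chain, so the "forgotten 10%" fails 
loudly instead of limping along.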


>> So, can you please explain your objections to what I said about 
>> adapting vs casting in terms of this example?  Do you think the 
>> example, or some variation thereof, should go in the PEP?
>
> I'm not sure I see how that helps.  I think it might be more useful to 
> say that adaptation is not *conversion*, which is not the same thing 
> (IME) as casting.  Casting in C and Java does not actually "convert" 
> anything; it simply treats a value or object as if it were of a

Uh?  (int)12.34  DOES "convert" to the integer 12, creating an entirely 
new object.  SOME casting does convert, other casting doesn't (C++ clears up 
this mess by introducing many separate casts such as reinterpret_cast 
when you specifically want reinterpretation of bits, etc, etc, but for 
backwards compatibility keeps supporting the mess too).

> different type.  ISTM that bringing casting into the terminology just 
> complicates the picture, because e.g. casting in Java actually 
> corresponds to the subset of PEP 246 adaptation for cases where 
> adapt() returns the original object or raises an error.  (That is, if 
> adapt() could only ever return the original object or raise an error, 
> it would be precisely equivalent to Java casting, if I understand it 
> correctly.)  Thus, at least with regard to object casting in Java, 
> adaptation is a superset, and saying that it's not casting is just 
> confusing.

OK, I'll try to rephrase that part.  Obviously "casting" is too 
overloaded.


> Obviously, some changes would need to be made to implement your newly 
> proposed functionality, but this one does support classic classes, 
> modules, and functions, and it has neither the TypeError-hiding 
> problem of the original PEP 246 nor the TypeError-raising problem of 
> your new version.

...but it DOES break the normal semantics of relationship between 
builtins and special methods, as I exemplified above with hash and 
__hash__.


>>>>     Transitivity of adaptation is in fact somewhat controversial, as
>>>>     is the relationship (if any) between adaptation and inheritance.
>>>
>>> The issue is simply this: what is substitutability?  If you say that 
>>> interface B is substitutable for A, and C is substitutable for B, 
>>> then C *must* be substitutable for A, or we have inadequately 
>>> defined "substitutability".
>>
>> Not necessarily, depending on the pragmatics involved.
>
> In that case, I generally prefer to be explicit and use conversion 
> rather than using adaptation.  For example, if I really mean to 
> truncate the fractional part of a number, I believe it's then 
> appropriate to use 'int(someNumber)' and make it clear that I'm 
> intentionally using a lossy conversion rather than simply treating a 
> number "as an" integer without changing its meaning.

That's how it feels to me FOR NUMBERS, but I can't generalize the 
feeling to the general case of facade between "records" with many 
fields of information, see above.


Alex

From aleax at aleax.it  Tue Jan 11 10:39:30 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 10:39:40 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050110171247.039f35d0@mail.telecommunity.com>
References: <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com>
	<5.1.1.6.0.20050110171247.039f35d0@mail.telecommunity.com>
Message-ID: <B179CB22-63B4-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 10, at 23:19, Phillip J. Eby wrote:
    ...
> As I said, after more thought, I'm actually less concerned about the 
> performance than I am about even remotely encouraging the combination 
> of Liskov violation *and* concrete adaptation

As per other msg, abstract classes have just the same issues as 
concrete ones.

If Ka-Ping Yee's idea (per Artima comments on the BDFL's blog) of having 
interfaces supply template methods too ever flies, the issue would 
arise there, too (a BDFL comment with a -1 suggests it won't fly, 
though).

> targets.  But, if "after the dust settles" it turns out this is going 
> to be supported after all, then we can worry about the performance if 
> need be.
>
> Note, however, that your statements actually support the idea of *not* 
> adding a special case for Liskov violators.  If newer code uses 
> interfaces, the Liskov-violation mechanism is useless.  If older code 
> doesn't have __conform__, it cannot possibly *use* the 
> Liskov-violation mechanism.

Adding __conform__ to a class to raise a LiskovViolation when needed is 
a TINY change compared to the refactoring needed to use 
template-methods without subclassing.


Alex

From aleax at aleax.it  Tue Jan 11 10:40:35 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 10:40:39 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <oefxgcc5.fsf@python.net>
References: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
	<oefxgcc5.fsf@python.net>
Message-ID: <D8107DBC-63B4-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 10, at 23:15, Thomas Heller wrote:

> Alex Martelli <aleax@aleax.it> writes:
>
>> PEP: 246
>> Title: Object Adaptation
>
> Minor nit (or not?): You could provide a pointer to the Liskov
> substitution principle, for those readers that aren't too familiar with
> that term.

Excellent idea, thanks.

> Besides, the text mentions three times that LiskovViolation is a
> subclass of AdaptionError (plus once in the ref impl section).

Always hard to strike a balance between what's repeated and what isn't, 
I'll try to get a better one on this point on the next edit.


Alex

From aleax at aleax.it  Tue Jan 11 10:59:07 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 10:59:12 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com>
References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com>
Message-ID: <6EA3D69F-63B7-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 10, at 19:34, Phillip J. Eby wrote:
    ...
> IMO it's more desirable to support abstract base classes than to allow 
> classes to "opt out" of inheritance when testing conformance to a base 
> class.  If you don't have an "is-a" relationship to your base class, 
> you should be using delegation, not inheritance.  (E.g. 'set' has-a 
> 'dict', not 'set' is-a 'dict', so 'adapt(set,dict)' should fail, at 
> least on the basis of isinstance checking.)

C++'s private inheritance explicitly acknowledges how HANDY subclassing 
can be for pure implementation purposes; we don't have private 
inheritance but that doesn't mean subclassing becomes any less handy;-)

> The other problem with a Liskov opt-out is that you have to explicitly 
> do a fair amount of work to create a LiskovViolation-raising subclass;

A TINY amount of work.  Specifically, starting from:

class X(Abstract):
     """ most of X omitted """
     def hook1(self):
         """ implement hook1 so as to get tm1 from Abstract """
         if self.foo() > self.bar():
             self.baz()
         else:
             self.fie = self.flup - self.flum
         return self.zap()

all you have to do is ADD
     def __conform__(self, protocol):
         if issubclass(protocol, Abstract):
             raise LiskovViolation

that's all.

(See my big post about what Abstract is, with template methods tm1 and 
tm2 respectively using hook methods hook1 and hook2: X doesn't 
implement hook2).
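To make the mechanics concrete, here is a toy sketch (a simplified stand-in for the PEP 246 reference implementation, reusing the Abstract/X names from the example above) of an adapt() that honors the LiskovViolation opt-out:

```python
class AdaptationError(TypeError):
    pass

class LiskovViolation(AdaptationError):
    pass

def adapt(obj, protocol):
    # Simplified PEP 246 flow: __conform__ gets a chance before isinstance.
    skip_isinstance = False
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        try:
            result = conform(obj, protocol)
        except LiskovViolation:
            skip_isinstance = True   # class opted out of is-a adaptation
        else:
            if result is not None:
                return result
    if not skip_isinstance and isinstance(obj, protocol):
        return obj
    raise AdaptationError("can't adapt %r to %r" % (obj, protocol))

class Abstract(object):
    def tm1(self):
        return self.hook1()

class Y(Abstract):          # ordinary subclass: is-a adaptation works
    def hook1(self):
        return "hook1"

class X(Abstract):          # inherits for implementation reuse only
    def hook1(self):
        return "hook1"
    def __conform__(self, protocol):
        if issubclass(protocol, Abstract):
            raise LiskovViolation
```

With this, adapt(Y(), Abstract) returns the instance itself, while adapt(X(), Abstract) fails even though isinstance(X(), Abstract) is true.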

>  that work would be better spent migrating to use delegation instead 
> of inheritance, which would also be cleaner and more comprehensible 
> code than writing a __conform__ hack to announce your bad style in 
> having chosen to use inheritance where delegation is more appropriate. 
>  ;)

The amount of effort is MUCH vaster.  Essentially RECODE everything so 
that something like:

class X(object):
     """ most of X omitted """
     class PrivateAuxiliaryClass(Abstract):
         def __init__(self, x):
             self.x = x
         def hook1(self):
             return self.x.hook1()
     def __init__(self):
         self.pac = self.PrivateAuxiliaryClass(self)
         # rest of X.__init__ omitted
     def tm1(self):
         return self.pac.tm1()

this isn't just a tiny band-aid to say "I really wish the language had 
private inheritance because I'm using Abstract as a base just for code 
reuse" -- it's a rich and complex restructuring, and in fact it's just 
the beginning; now you have a deuced reference loop between each 
instance x of X, and its x.pac, so you'll probably want to pull in 
weakref, too, to avoid giving too much work to the cyclical garbage 
collector.
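One way to sidestep that reference loop, as a sketch (the weakref detail is an editorial assumption, not part of the original example), is to hand the auxiliary object only a weak reference back to its owner:

```python
import weakref

class Abstract(object):
    def tm1(self):
        return "tm1 using " + self.hook1()

class X(object):
    class PrivateAuxiliaryClass(Abstract):
        def __init__(self, x):
            # weak, so x and x.pac don't form a strong cycle
            self._x = weakref.ref(x)
        def hook1(self):
            return self._x().hook1()
    def __init__(self):
        self.pac = self.PrivateAuxiliaryClass(self)
    def hook1(self):
        return "hook1"
    def tm1(self):
        return self.pac.tm1()
```

This keeps the cyclical garbage collector out of the picture, but the per-call delegation overhead Alex mentions remains.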

Basically, rephrasing private inheritance with containment and 
delegation is a lot of messy work, and results in far more complicated 
structures.  And instead of paying the tiny price of a __conform__ call 
at adaptation time, you pay the price of delegating calls over and over 
at each x.tm1() call, so it's unlikely performance will improve.

By pushing Liskov conformance without supporting "private inheritance" 
or its equivalent, you're really pushing people to use much more 
complicated and sophisticated structures of objects than "private 
inheritance" affords when properly used... and the LAST thing OO 
programmers need is any encouragement towards more complicated 
structures!-)


Alex

From aleax at aleax.it  Tue Jan 11 11:01:29 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 11:01:35 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050110125415.02f28ec0@mail.telecommunity.com>
References: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050110125415.02f28ec0@mail.telecommunity.com>
Message-ID: <C3B4E544-63B7-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 10, at 18:59, Phillip J. Eby wrote:

> At 12:43 PM 1/10/05 -0500, Phillip J. Eby wrote:
>> As a practical matter, all of the existing interface systems (Zope, 
>> PyProtocols, and even the defunct Twisted implementation) treat 
>> interface inheritance as guaranteeing substitutability for the base 
>> interface, and do so transitively.
>
> An additional data point, by the way: the Eclipse Java IDE has an 
> adaptation system that works very much like PEP 246 does, and it 
> appears that in a future release they intend to support automatic 
> adapter transitivity, so as to avoid requiring each provider of an 
> interface to "provide O(n^2) adapters when writing the nth version of 
> an interface."  IOW, their current release is transitive only for 
> interface inheritance ala Zope or Twisted; their future release will 
> be transitive for adapter chains ala PyProtocols.

This is definitely relevant prior art, so thanks for pointing it out.  
If interfaces change so often that 'n' can become worryingly high, this 
is a valid concern.  In my world, though, published interfaces do NOT 
change as often as to require such remedies;-).


Alex

From aleax at aleax.it  Tue Jan 11 11:59:06 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 11:59:15 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <C3B4E544-63B7-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050110125415.02f28ec0@mail.telecommunity.com>
	<C3B4E544-63B7-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <D04614ED-63BF-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 11, at 11:01, Alex Martelli wrote:

>
> On 2005 Jan 10, at 18:59, Phillip J. Eby wrote:
>
>> At 12:43 PM 1/10/05 -0500, Phillip J. Eby wrote:
>>> As a practical matter, all of the existing interface systems (Zope, 
>>> PyProtocols, and even the defunct Twisted implementation) treat 
>>> interface inheritance as guaranteeing substitutability for the base 
>>> interface, and do so transitively.
>>
>> An additional data point, by the way: the Eclipse Java IDE has an 
>> adaptation system that works very much like PEP 246 does, and it 
>> appears that in a future release they intend to support automatic 
>> adapter transitivity, so as to avoid requiring each provider of an 
>> interface to "provide O(n^2) adapters when writing the nth version of 
>> an interface."  IOW, their current release is transitive only for 
>> interface inheritance ala Zope or Twisted; their future release will 
>> be transitive for adapter chains ala PyProtocols.
>
> This is definitely relevant prior art, so thanks for pointing it out.  
> If interfaces change so often that 'n' can become worryingly high, 
> this is a valid concern.  In my world, though, published interfaces do 
> NOT change as often as to require such remedies;-).

...that was a bit too flippant -- I apologize.  It DOES happen that 
interfaces keep changing, and other situations where "adapter-chain 
transitivity" is quite handy do, absolutely!, occur, too.  Reflecting 
on Microsoft's QI (QueryInterface), based on a very strong injunction 
against changing interfaces and yet mandating transitivity, points that 
out -- that's prior art, too, and a LOT more of it than Eclipse can 
accumulate any time soon, considering how long COM has been at the 
heart of Microsoft's components strategy, how many millions of 
programmers have used or abused it.  Still, QI's set of constraints, 
amounting to a full-fledged equivalence relationship among all the 
"adapters" for a single underlying "object", is, I fear, stronger than 
we can impose (so it may be that Eclipse is a better parallel, but I 
know little of it while COM is in my bones, so that's what I keep 
thinking of;-).

So, I see transitivity as a nice thing to have _IF_ it's something that 
gets explicitly asserted for a certain adapter -- if the adapter has to 
explicitly state to the system that it "isn't lossy" (maybe), or "isn't 
noisy" (perhaps more useful), or something like that... some amount of 
reassurance about the adapter that makes it fully safe to use in such a 
chain.

Maybe it might suffice to let an adapter which IS 'lossy' (or, more 
likely, one that is 'noisy') state the fact.  I'm always reluctant by 
instinct to default to convenient but risky behavior, trusting 
programmers to explicitly assert otherwise when needed; but in many 
ways this kind of design is a part of Python and works fine (_with_ the 
BDFL's fine nose/instinct for making the right compromise between 
convenience and safety in each case, of course).

I'm still pondering the "don't adapt an adapter" suggestion, which 
seems a sound one, and yet also seems to be, intrinsically, what 
transitivity-by-chaining does.  Note that QI does not suffer from this, 
because it lets you get the underlying object identity (IUnknown) from 
any interface adapter.  Maybe, just maybe, we should also consider that 
-- a distinguished protocol bereft of any real substance but acting as 
a flag for "real unadapted object identity".  Perhaps we could use 
'object' for that, at least if the flow of logic in 'adapt' stays as in 
the current PEP 246 draft (i.e., __conform__ is given a chance before 
isinstance triggers -- so, all adapters could __conform__ to object by 
returning the underlying object being adapted, while other objects 
without such a feature in __conform__ would end up with 'adapt(x, 
object) is x').  Or, if object in this role turns out to be confusing, 
IDentity (;-) or some other specially designed protocol.

If we had this ability to "get at the underlying object" we could at 
least write clearer axioms about what transitivity must mean, as well 
as, help out with the "adapting an adapter" problems.  E.g., imagine:

def f(x: IFoo, y: IFoo):
     if x is y: ...

that wouldn't work if adapt(x, IFoo) returns a separate adapter each 
time, which is the most likely situation (think, again, of str->file 
adaptation by StringIO wrapping); but recovering underlying identities 
by "adapt(x, object) is adapt(y, object)" would work.
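A toy sketch of that identity-recovery idea (StringSource and adapt_to_object are invented names for illustration, not PEP 246 API):

```python
class StringSource(object):
    """Hypothetical adapter wrapping a str to offer file-like reads."""
    def __init__(self, s):
        self._s = s
        self._pos = 0
    def read(self, n=-1):
        if n < 0:
            n = len(self._s) - self._pos
        chunk = self._s[self._pos:self._pos + n]
        self._pos += len(chunk)
        return chunk
    def __conform__(self, protocol):
        if protocol is object:
            return self._s          # reveal the underlying object

def adapt_to_object(x):
    # the 'IUnknown-like' identity recovery discussed above
    conform = getattr(type(x), '__conform__', None)
    if conform is not None:
        r = conform(x, object)
        if r is not None:
            return r
    return x
```

Two distinct adapters around the same string then compare unequal by identity, but their adapt_to_object() results are the same object.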


I don't think that IUnknown or an equivalent, per se, can do *instead* 
of the need to have an adapter explicitly state it's non-noisy (or vice 
versa).  Besides the need to check for object identity, which is pretty rare 
except when writing axioms, invariants or pre/post-conds;-), the 
IUnknown equivalent would perhaps be more of a conceptual/philosophical 
'prop' than a practically useful feature -- while I see the ability to 
block unintended consequences of inheritance and transitivity (or even 
better, state explicitly when those consequences are wanted, even if 
that should be 90% of the time...) as practically very, VERY useful, 
even if "conceptually" or "philosophically" dubious.



Alex

From arigo at tunes.org  Tue Jan 11 13:41:57 2005
From: arigo at tunes.org (Armin Rigo)
Date: Tue Jan 11 13:53:17 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
Message-ID: <20050111124157.GA16642@vicky.ecs.soton.ac.uk>

Hi Phillip,

On Mon, Jan 10, 2005 at 04:38:55PM -0500, Phillip J. Eby wrote:
> Your new proposal does not actually fix this problem in the absence of 
> tp_conform/tp_adapt slots; it merely substitutes possible confusion at the 
> metaclass/class level for confusion at the class/instance level.

I think that what Alex has in mind is that the __adapt__() and __conform__()
methods should work just like all other special methods for new-style classes.  
The confusion comes from the fact that the reference implementation doesn't do 
that.  It should be fixed by replacing:

   conform = getattr(type(obj), '__conform__', None)

with:

   for basecls in type(obj).__mro__:
       if '__conform__' in basecls.__dict__:
           conform = basecls.__dict__['__conform__']
           break
   else:
       conform = None  # not found

and the same for '__adapt__'.

The point about tp_xxx slots is that when implemented in C with slots, you get
the latter (correct) effect for free.  This is how metaconfusion is avoided in
post-2.2 Python.  Using getattr() for that is essentially broken.  Trying to
call the method and catching TypeErrors seems pretty fragile -- e.g. if you
are calling a __conform__() which is implemented in C you won't get a Python
frame in the traceback either.
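A small demonstration of the "metaconfusion" being avoided (Python 3 syntax; in 2.x one would set __metaclass__ instead):

```python
class Meta(type):
    def __conform__(cls, protocol):
        # meant for adapting the *class object itself*, not its instances
        return None

class C(object, metaclass=Meta):
    pass

obj = C()

# The getattr-style lookup leaks the metaclass method in, bound to C,
# so calling it as conform(obj, protocol) would misfire:
found_by_getattr = getattr(type(obj), '__conform__', None)

# The MRO walk only sees methods defined on C or its bases,
# which is what slot lookup does for free in C:
def lookup_special(obj, name):
    for basecls in type(obj).__mro__:
        if name in basecls.__dict__:
            return basecls.__dict__[name]
    return None

found_by_mro = lookup_special(obj, '__conform__')
```

Here found_by_getattr is the metaclass method while found_by_mro is correctly None.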


A bientot,

Armin
From exarkun at divmod.com  Tue Jan 11 15:14:13 2005
From: exarkun at divmod.com (Jp Calderone)
Date: Tue Jan 11 15:14:18 2005
Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support
In-Reply-To: <41E38642.9080108@v.loewis.de>
Message-ID: <20050111141413.32125.2127123498.divmod.quotient.6072@ohm>

On Tue, 11 Jan 2005 08:54:42 +0100, "\"Martin v. Löwis\"" <martin@v.loewis.de> wrote:
>Philippe Biondi wrote:
> > I've done a small patch to use linux AF_NETLINK sockets (see below).
> > Please comment!
> 
> I have a high-level comment - python-dev is normally the wrong place
> for patches; please submit them to sf.net/projects/python instead.
> 
> Apart from that, the patch looks fine.
> 
> > Is there a reason for recvmsg() and sendmsg() not to be implemented
> > yet in socketmodule ?
> 
> I'm not sure what you mean by "implemented": these functions are
> implemented by the operating system, not in the socketmodule.
> 
> If you ask "why are they not exposed to Python yet?": There has been no
> need to do so, so far. What do I get with recvmsg that I cannot get with
> recv/recvfrom just as well?

  Everything that recvmsg() does.  recv() and recvfrom() give you 
"regular" bytes - data sent to the socket using send() or sendto().  
recvmsg() gives you messages - data sent to the socket using sendmsg().  
There is no way to receive messages by using recv() or recvfrom() (and 
no way to send them using send() or sendto()).

  Inversely, I believe send() and recv() can be implemented in terms of
sendmsg() and recvmsg().  Perhaps we should get rid of socket.send() and
socket.recv()? <wink>

  Other things that sendmsg() and recvmsg() can do include passing file 
descriptors, receiving notice of OOB TCP data, peek at bytes from the 
kernel's buffer without actually reading them, and implement 
scatter/gather IO functions (although you'd probably just want to wrap 
and use writev() and readv() instead).
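For the record, sendmsg() and recvmsg() did eventually land in the stdlib socket module (in Python 3.3); a sketch of the file-descriptor-passing use case over a Unix socket pair (Unix-only):

```python
import array
import os
import socket

def send_fd(sock, fd):
    # One dummy data byte plus an SCM_RIGHTS ancillary message.
    fds = array.array('i', [fd])
    sock.sendmsg([b'x'], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])

def recv_fd(sock):
    itemsize = array.array('i').itemsize
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(itemsize))
    level, ctype, data = ancdata[0]
    assert level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS
    fds = array.array('i')
    fds.frombytes(data[:len(data) - (len(data) % itemsize)])
    return fds[0]

a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
r, w = os.pipe()
send_fd(a, w)              # pass the pipe's write end across the socket
received = recv_fd(b)      # the kernel dup'd the descriptor for us
os.write(received, b'hello')
result = os.read(r, 5)
for fd in (received, r, w):
    os.close(fd)
a.close()
b.close()
```

Nothing like this is expressible with send()/recv() alone, which is exactly Jp's point.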

  Jp
From exarkun at divmod.com  Tue Jan 11 15:15:23 2005
From: exarkun at divmod.com (Jp Calderone)
Date: Tue Jan 11 15:15:26 2005
Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support
In-Reply-To: <20050111013252.GA216@thailand.botanicus.net>
Message-ID: <20050111141523.32125.1902928401.divmod.quotient.6074@ohm>

On Tue, 11 Jan 2005 01:32:52 +0000, David Wilson <dw-python.org@botanicus.net> wrote:
>On Mon, Jan 10, 2005 at 05:17:49PM +0100, Philippe Biondi wrote:
> 
> > I've done a small patch to use linux AF_NETLINK sockets (see below).
> > Please comment!
> 
> As of 2.6.10, a very useful new netlink family was merged -
> NETLINK_KOBJECT_UEVENT. I'd imagine quite a lot of interest from Python
> developers for NETLINK support will come from this new interface in the
> coming years.
> 
> [snip]
> 
> I would like to see (optional?) support for this before your patch is
> merged. I have a long-term interest in a Python-based service control /
> init replacement / system management application, for use in specialised
> environments. I could definately use this. :)

  Useful indeed, but I'm not sure why basic NETLINK support should be 
held up for it?

  Jp
From phil at secdev.org  Tue Jan 11 09:45:09 2005
From: phil at secdev.org (Philippe Biondi)
Date: Tue Jan 11 16:13:26 2005
Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support
In-Reply-To: <41E38642.9080108@v.loewis.de>
Message-ID: <Pine.LNX.4.44.0501110940261.7440-100000@deneb.home.phil>

On Tue, 11 Jan 2005, "Martin v. Löwis" wrote:

> Philippe Biondi wrote:
> > I've done a small patch to use linux AF_NETLINK sockets (see below).
> > Please comment!
>
> I have a high-level comment - python-dev is normally the wrong place
> for patches; please submit them to sf.net/projects/python instead.

OK, I'll post it here.

>
> Apart from that, the patch looks fine.

Fine!

>
> > Is there a reason for recvmsg() and sendmsg() not to be implemented
> > yet in socketmodule ?
>
> I'm not sure what you mean by "implemented": these functions are
> implemented by the operating system, not in the socketmodule.
>
> If you ask "why are they not exposed to Python yet?": There has been no
> need to do so, so far. What do I get with recvmsg that I cannot get with
> recv/recvfrom just as well?

You can have access to ancillary messages. You can, for example, transmit
credentials or file descriptors through Unix sockets, which is very
interesting for privilege separation.


-- 
Philippe Biondi <phil@ secdev.org>      SecDev.org
Security Consultant/R&D                 http://www.secdev.org
PGP KeyID:3D9A43E2  FingerPrint:C40A772533730E39330DC0985EE8FF5F3D9A43E2

From pje at telecommunity.com  Tue Jan 11 16:34:20 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 16:33:16 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <20050111124157.GA16642@vicky.ecs.soton.ac.uk>
References: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050111102823.03a8f8c0@mail.telecommunity.com>

At 12:41 PM 1/11/05 +0000, Armin Rigo wrote:
>The point about tp_xxx slots is that when implemented in C with slots, you get
>the latter (correct) effect for free.  This is how metaconfusion is avoided in
>post-2.2 Python.  Using getattr() for that is essentially broken.  Trying to
>call the method and catching TypeErrors seems pretty fragile -- e.g. if you
>are calling a __conform__() which is implemented in C you won't get a Python
>frame in the traceback either.

An excellent point.  The issue hasn't come up before now, though, because 
there aren't any __conform__ methods written in C in the field that I know 
of.  Presumably, if there are any added to CPython in future, it will be 
because there's a tp_conform slot and it's needed for built-in types, in 
which case the problem is again moot for the implementation.

(FYI, C methods implemented in Pyrex add a dummy frame to the traceback 
such that you see the file and line number of the original Pyrex source 
code.  Very handy for debugging.)

Anyway, I agree that your version of the code should be used to form the 
reference implementation, since the purpose of the reference implementation 
is to show the complete required semantics.

From aleax at aleax.it  Tue Jan 11 17:52:57 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 17:53:05 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050111102823.03a8f8c0@mail.telecommunity.com>
References: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050111102823.03a8f8c0@mail.telecommunity.com>
Message-ID: <3F03D3E4-63F1-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 11, at 16:34, Phillip J. Eby wrote:
    ...
> Anyway, I agree that your version of the code should be used to form 
> the reference implementation, since the purpose of the reference 
> implementation is to show the complete required semantics.

Great, one point at last on which we fully agree -- thanks Armin!-)

I was waiting for BDFL feedback before editing the PEP again, but if 
none is forthcoming I guess at some point I'll go ahead and at least do 
the edits that are apparently not controversial, like this one.  I'd 
like to have a summary of controversial points and short pro and con 
args, too, but I'm not unbiased enough to write it all by myself...;-)


Alex

From mcherm at mcherm.com  Tue Jan 11 18:27:26 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue Jan 11 18:27:27 2005
Subject: [Python-Dev] PEP 246, redux
Message-ID: <1105464446.41e40c7e27114@mcherm.com>

Phillip:

I think you must inhabit a far more perfect world than I do.

You say, for instance, that:
> ...-1 if this introduces a performance penalty [...] just to
> support people who want to create deliberate Liskov violations.
> I personally don't think that we should pander to Liskov
> violators

... but in my world, people violate Liskov all the time, even
in languages that attempt (unsuccessfully) to enforce it. [1]

You say that:
> I think one should adapt primarily to interfaces, and
> interface-to-interface adaptation should be reserved for
> non-lossy, non-noisy adapters.

... but in my world, half the time I'm using adaptation to
correct for the fact that someone else's poorly-written
code requests some class where it should have just used
an interface.

You seem to inhabit a world in which transitivity of adaptation
can be enforced. But in my world, people occasionally misuse
adaptation because they think they know what they're doing
or because they're in a big hurry and it's the most convenient
tool at hand.

I wish I lived in your world, but I don't.

-- Michael Chermside

[1] - Except for Eiffel. Eiffel seems to do a pretty good job
   of enforcing it.

From stephan.stapel at web.de  Tue Jan 11 18:26:56 2005
From: stephan.stapel at web.de (Stephan Stapel)
Date: Tue Jan 11 18:28:06 2005
Subject: [Python-Dev] logging class submission
Message-ID: <41E40C60.1090707@web.de>

Dear people on the dev list!

I hope that this is the right environment to post my submission request 
(I'm new to the scene).

I have modified the RotatingFileHandler of the logging module to create 
a daily rolling file handler.

As it works quite well, I would like to suggest its inclusion in the 
standard logging module of Python. I know that the code is quite trivial, 
but the class solves a problem with RotatingFileHandler: you don't know 
in which file to find a certain log entry. By using dates within the log 
file names, one can determine exactly which log file to inspect when 
searching for specific errors.

I hope you like the code and/or point out improvements, and finally 
move it into the logging module.

cheers,

Stephan


Here comes the code:


# Copyright 2004-2005 by Stephan Stapel <stephan.stapel@web.de>.
# All Rights Reserved.
#
# Permission to use, copy, modify, and distribute this software and its
# documentation for any purpose and without fee is hereby granted,
# provided that the above copyright notice appear in all copies and that
# both that copyright notice and this permission notice appear in
# supporting documentation, and that the name of Stephan Stapel
# not be used in advertising or publicity pertaining to distribution
# of the software without specific, written prior permission.
#
# STEPHAN STAPEL DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
# INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO
# EVENT SHALL STEPHAN STAPEL BE LIABLE FOR ANY SPECIAL, INDIRECT OR
# CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF
# USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR
# OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
# PERFORMANCE OF THIS SOFTWARE.

import logging
from datetime import date
import string

class DailyRollingFileHandler(logging.FileHandler):
     """
     The class is based on the standard RotatingFileHandler class from the
     official logging module.

     It rolls over each day, thus one log file per day is created, in the
     form myapp-2005-01-07.log, myapp-2005-01-08.log etc.
     """

     def __init__(self, filename, mode="a"):
         """
         Open the specified file and use it as the stream for logging.

         Rollover occurs whenever the day changes.

         The names of the log files each contain the date when they were
         created. Thus if the filename "myapp.log" was used, the log files
         each have look like myapp-2005-01-07.log etc.

         The date is inserted at the position of the last '.' in the
         filename if any, or simply appended to the given name if no dot
         was present.
         """
         self.currentDay   = date.today()
         # create the logfile name parts (base part and extension)
         nameparts     = string.split(string.strip(filename), ".")
         self.filestub = ""
         self.fileext  = ""

         # remove empty items
         while nameparts.count("") > 0:
             nameparts.remove("")

         if len(nameparts) < 2:
             self.filestub = nameparts[0]
         else:
             # construct the filename
             for part in nameparts[0:-2]:
                 self.filestub += part + "."
             self.filestub += nameparts[-2]
             self.fileext   = "." + nameparts[-1]

         logging.FileHandler.__init__(self, self.getFilename(), mode)

     def getFilename(self):
         return (self.filestub + "-" + self.currentDay.isoformat() +
                 self.fileext)

     def doRollover(self):
         """
         Do a rollover, as described in __init__().
         """

         self.stream.close()
         self.currentDay   = date.today()
         self.baseFilename = self.getFilename()
         self.stream = open(self.baseFilename, "w")

     def emit(self, record):
         """
         Emit a record.

         Output the record to the file, catering for rollover as described
         in doRollover().
         """
         self.stream.seek(0, 2)  # due to non-posix-compliant Windows feature
         if date.today() != self.currentDay:
             self.doRollover()
         logging.FileHandler.emit(self, record)
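As an aside, the standard logging.handlers module already offers TimedRotatingFileHandler, which covers much the same ground (rollover at a chosen interval, rotated files stamped with a date suffix); a minimal sketch:

```python
import logging
import logging.handlers
import os
import tempfile

logdir = tempfile.mkdtemp()
path = os.path.join(logdir, "myapp.log")

# Roll over at midnight, keep a week of date-stamped backups.
handler = logging.handlers.TimedRotatingFileHandler(
    path, when="midnight", backupCount=7)
logger = logging.getLogger("myapp-demo")
logger.addHandler(handler)
logger.propagate = False
logger.warning("something happened")
handler.close()

with open(path) as f:
    content = f.read()
```

The difference from the class above is that the *active* file keeps the plain name and only rotated files get the date suffix.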
From FBatista at uniFON.com.ar  Tue Jan 11 18:38:00 2005
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Tue Jan 11 18:40:40 2005
Subject: [Python-Dev] logging class submission
Message-ID: <A128D751272CD411BC9200508BC2194D053C7E72@escpl.tcp.com.ar>

[Stephan Stapel]

#- # Copyright 2004-2005 by Stephan Stapel <stephan.stapel@web.de>. All 
#- Rights Reserved.
#- #
#- # Permission to use, copy, modify, and distribute this 
#- software and its
#- # documentation for any purpose and without fee is hereby granted,
#- # provided that the above copyright notice appear in all 
#- copies and that
#- # both that copyright notice and this permission notice appear in
#- # supporting documentation, and that the name of Stephan Stapel
#- # not be used in advertising or publicity pertaining to distribution
#- # of the software without specific, written prior permission.

Is there a license issue here?

.    Facundo

Bitácora De Vuelo: http://www.taniquetil.com.ar/plog
PyAr - Python Argentina: http://pyar.decode.com.ar/


From aleax at aleax.it  Tue Jan 11 18:43:48 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 18:43:56 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <1105464446.41e40c7e27114@mcherm.com>
References: <1105464446.41e40c7e27114@mcherm.com>
Message-ID: <59332A98-63F8-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 11, at 18:27, Michael Chermside wrote:
    ...
> ... but in my world, people violate Liskov all the time, even
> in languages that attempt (unsuccessfully) to enforce it. [1]
    ...
> [1] - Except for Eiffel. Eiffel seems to do a pretty good job
>    of enforcing it.

...has Eiffel stopped its heroic efforts to support covariance...?  
It's been years since I last looked seriously into Eiffel (it was one 
of the languages we considered as a successor to Fortran and C as main 
application language, at my previous employer), but at that time that 
was one of the main differences between Eiffel (then commercial-only) 
and its imitator (freeware) Sather: Sather succumbed to mathematical 
type-theory and enforced contravariance, Eiffel still tried to pander 
to how the human mind works by allowing covariance (which implies a 
Liskov violation and is probably the main serious reason for it) and 
striving horrendously to shoehorn it in.  So what's the score now...?


Alex

From pje at telecommunity.com  Tue Jan 11 18:54:36 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 18:53:34 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <F1743E00-63B3-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>

At 10:34 AM 1/11/05 +0100, Alex Martelli wrote:
>The volume of these discussions is (as expected) growing beyond any 
>reasonable bounds; I hope the BDFL can find time to read them but I'm 
>starting to doubt he will.  Since obviously we're not going to convince 
>each other, and it seems to me we're at least getting close to pinpointing 
>our differences, maybe we should try to jointly develop an "executive 
>summary" of our differences and briefly stated pros and cons -- a PEP is 
>_supposed_ to have such a section, after all.

Yes, hopefully we will have sufficient convergence to do that soon.  For 
example, I'm going to stop arguing against the use case for Liskov 
violation, and try looking at alternative implementations.  If those don't 
work out, I'll stop objecting to that item altogether.



>the effects of private inheritance could be simulated by delegation to a 
>private auxiliary class, but the extra indirections and complications 
>aren't negligible costs in terms of code complexity and maintainability.

Ah.  Well, in PEAK, delegation of methods or even read-only attributes is 
trivial:

      class SomeObj(object):

          meth1 = meth2 = meth3 = binding.Delegate('_delegatee')

          _delegatee = binding.Make(OtherClass)


This class will create a private instance of OtherClass for a given SomeObj 
instance the first time meth1, meth2, or meth3 are retrieved from that 
instance.

I bring this up not to say that people should use PEAK for this, just 
explaining why my perspective was biased; I'm so used to doing this that I 
tend to forget it's nontrivial if you don't already have these sorts of 
descriptors available.
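For readers without PEAK at hand, a plain-Python sketch of what such a Delegate descriptor could look like (binding.Delegate's real implementation differs, and binding.Make is not reproduced here):

```python
class Delegate(object):
    """Descriptor forwarding an attribute lookup to another attribute."""
    def __init__(self, delegatee_name):
        self.delegatee_name = delegatee_name
    def __set_name__(self, owner, name):
        self.name = name            # remember which attribute we stand for
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(getattr(obj, self.delegatee_name), self.name)

class Other(object):
    def meth1(self):
        return "meth1"
    def meth2(self):
        return "meth2"

class SomeObj(object):
    meth1 = Delegate('_delegatee')
    meth2 = Delegate('_delegatee')
    def __init__(self):
        self._delegatee = Other()
```

Note that each delegated name needs its own descriptor instance here, since __set_name__ stores a single name per instance.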



>Maybe the ability to ``fake'' __class__ can help, but right now I don't 
>see how, because setting __class__ isn't fake at all -- it really affects 
>object behavior and type:
>
>...
>
>So, it doesn't seem to offer a way to fake out isinstance only, without 
>otherwise affecting behavior.

Python 2.3.4 (#53, May 25 2004, 21:17:02) [MSC v.1200 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.

 >>> class Phony(object):
         def getClass(self): return Dummy
         __class__ = property(getClass)


 >>> class Dummy: pass

 >>> Phony().__class__
<class __main__.Dummy at 0x00F4CE70>
 >>> isinstance(Phony(),Dummy)
True

Unfortunately, this still doesn't really help, because isinstance() seems 
to apply to a union of __class__ and type:

 >>> isinstance(Phony(),Phony)
True

So, lying about __class__ doesn't fix the issue because you're still 
considered isinstance, unless adapt() just uses __class__ and doesn't use 
isinstance().


>I can give no example at all in which adapting to a concrete class is a 
>_good_ idea, and I tried to indicate that in the PEP.  I just believe that 
>if adaptation does not offer the possibility of using concrete classes as 
>protocols, but rather requires the usage as protocols of some specially 
>blessed 'interface' objects or whatever, then PEP 246 will never fly, (a) 
>because it would then require waiting for the interface thingies to 
>appear, and (b) because people will find it pragmatically useful to just 
>reuse the same classes as protocols too, and too limiting to have to 
>design protocols specifically instead.

Okay, I strongly disagree on this point, because there are people using 
zope.interface and PyProtocols today, and they are *not* using concrete 
classes.  If PEP 246 were to go into Python 2.5 without interface types, 
all that would change is that Zope and PyProtocols would check to see if 
there is an adapt() in builtins and, if not, install their own version.

PEP 246 would certainly be more useful *with* some kind of interface type, 
but Guido has strongly implied that PEP 246 won't be going in *without* 
some kind of interface type, so it seems to me academic to say that PEP 246 
needs adaptation to concrete types based on isinstance().

In fact, maybe we should drop isinstance() from PEP 246 altogether, and 
only use __conform__ and __adapt__ to implement adaptation.  Thus, to say 
that you conform to a concrete type, you have to implement __conform__.  If 
this is done, then an abstract base used as an interface can have a 
__conform__ that answers 'self' for the abstract base used as a protocol, 
and a Liskov-violating subclass can return 'None' for the abstract 
base.  Inheritance of __conform__ will do the rest.

This approach allows concrete classes and Liskov violations, but simplifies 
adapt() since it drops the need for isinstance and for the Liskov 
exception.  Further, we could have a default object.__conform__ that does 
the isinstance check.  Then, a Liskov-violating subclass just overrides 
that __conform__ to block the inheritance it wants to block.

This approach can't work with a separately-distributed PEP 246 
implementation, but it should work quite well for a built-in implementation 
and it's backward compatible with the semantics expected by "old" PEP 246 
implementations.  It means that all objects will have a tp_conform slot 
that will have to be called, but in most cases it's just going to be a 
roundabout way of calling isinstance.
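The proposed scheme could be sketched like this (the names AdaptationFailure and Conformable are made up for the sketch; Conformable stands in for the suggested default object.__conform__):

```python
class AdaptationFailure(TypeError):
    """Hypothetical: raised when no adaptation path exists."""

def adapt(obj, protocol):
    # Simplified adapt(): no isinstance() call and no LiskovViolation
    # handling; everything funnels through __conform__ / __adapt__.
    conform = getattr(obj, '__conform__', None)
    if conform is not None:
        result = conform(protocol)
        if result is not None:
            return result
    adapter = getattr(protocol, '__adapt__', None)
    if adapter is not None:
        result = adapter(obj)
        if result is not None:
            return result
    raise AdaptationFailure(obj, protocol)

class Conformable:
    # Stand-in for the proposed default object.__conform__:
    # a plain isinstance check.
    def __conform__(self, protocol):
        if isinstance(self, protocol):
            return self
        return None

class Abstract(Conformable):
    pass

class LiskovViolator(Abstract):
    # Refuses to be used as its Abstract base; everything else is
    # inherited from the default __conform__.
    def __conform__(self, protocol):
        if protocol is Abstract:
            return None
        return super().__conform__(protocol)
```

With this in place, adapt(LiskovViolator(), LiskovViolator) still succeeds via the inherited isinstance check, while adapt(LiskovViolator(), Abstract) fails cleanly with no special-casing inside adapt() itself.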


>For hash, and all kinds of other built-in functions and operations, it 
>*does not matter* whether instance h has its own per-instance __hash__ -- 
>H.__hash__ is what gets called anyway.  Making adapt work differently 
>gives me the shivers.

It's only different because of metaclasses and the absence of 
tp_conform/tp_adapt issues (assuming the function and module use cases are 
taken care of by having their tp_conform slots invoke 
self.__dict__['__conform__'] first).

Anyway, if you adapt a *class* that defines __conform__, you really want to 
be invoking the *metaclass* __conform__.  See Armin Rigo's post re: 
"metaconfusion" as he calls it.


>>The PEP just said that it would be raised by __conform__ or __adapt__, 
>>not that it would be caught by adapt() or that it would be used to 
>>control the behavior in that way.  Re-reading, I see that you do mention 
>>it much farther down.  But at the point where __conform__ and __adapt__ 
>>are explained, it has not been explained that adapt() should catch the 
>>error or do anything special with it.  It is simply implied by the "to 
>>prevent this default behavior" at the end of the section.
>>If this approach is accepted, the description should be made explicit, 
>>because for me at least it required a retroactive re-interpretation of 
>>the earlier part of the spec.
>
>OK, I'll add more repetition to the specs, trying to make it more 
>"sequentially readable", even though they were already criticized because 
>they do repeat some aspects more than once.

It might not be necessary if we agree that the isinstance check should be 
moved to an object.__conform__ method, and there is no longer a need for a 
LiskovViolation error to exist.


>Basically, we both agree that adaptation must accept some complication to 
>deal with practical real-world issues that are gonna stay around, we just 
>disagree on what those issues are.  You appear to think old-style classes 
>will stay around and need to be supported by new core Python 
>functionality, while I think they can be pensioned off;

Currently, exceptions must be classic classes.  Do you want to disallow 
adaptation of exceptions?  Are you proposing that ClassType.tp_conform not 
invoke self.__conform__?  I don't see any benefit to omitting that 
functionality.


>  you appear to think that programmers' minds will miraculously shift into 
> a mode where they don't need covariance or other Liskov violations, and 
> programmers will happily extract the protocol-ish aspects of their 
> classes into neat pristine protocol objects rather than trying to 
> double-use the classes as protocols too, while I think human nature won't 
> budge much on this respect in the near future.

Well, they're doing it now with Zope and PyProtocols, so it didn't seem 
like such a big assumption to me.  :)



>Having, I hope, amply clarified the roots of our disagreements, so we can 
>wait for BDFL input before the needed PEP 246 rewrites.  If his opinions 
>are much closer to yours than to mine, then perhaps the best next step 
>would be to add you as the first author of the PEP and let you perform the 
>next rewrite -- would you be OK with that?

Sure, although I think that if you're willing to not *object* to classic 
class support, and if we reach agreement on the other issues, it might not 
be necessary.


>>>I didn't know about the "let the object lie" quirk in isinstance.  If 
>>>that quirk is indeed an intended design feature,
>>
>>It is; it's in one of the "what's new" feature highlights for either 2.3 
>>or 2.4, I forget which.  It was intended to allow proxy objects (like 
>>security proxies in Zope 3) to pretend to be an instance of the class 
>>they are proxying.
>
>I just grepped through whatsnew23.tex and whatsnew24.tex and could not 
>find it.  Can you please help me find the exact spot?  Thanks!

Googling "isinstance __class__" returns this as the first hit:

http://mail.python.org/pipermail/python-bugs-list/2003-February/016098.html

Adding "2.3 new" to the query returns this:

http://www.python.org/2.3/highlights.html

which is the "highlights" document I alluded to.



>What _have_ you seen called "casting" in Python?

Er, I haven't seen anything called casting in Python, which is why I was 
confused.  :)


>>>Maybe we're using different definitions of "casting"?
>>
>>I'm most accustomed to the C and Java definitions of casting, so that's 
>>probably why I can't see how it relates at all.  :)
>
>Well, in C++ you can call (int)x or int(x) with the same semantics -- 
>they're both casts.  In C or Java you must use the former syntax, in 
>Python the latter, but they still relate.

Okay, but if you get your definition of "cast" from C and Java then what 
C++ and Python do are *conversion*, not casting, and what PEP 246 does *is* 
"casting".

That's why I think there should be no mention of "casting" in the PEP 
unless you explicitly mention what language you're talking about -- and 
Python shouldn't be a candidate language.  I've been trying to Google 
references to type casting in Python, and have so far mainly found 
arguments that Python does not have casting, and one that further asserts 
that even in C++, "conversion by constructor is not considered a 
cast."  Also, "cast" is of relatively recent vintage in Python 
documentation; outside of the C API and optional static typing discussions, 
it seems to have made its debut in a presentation about Python 2.2's 
changing 'int' and 'str' to type objects.

So, IMO the term has too many uses to add any clarification; it confused me 
because I thought that in C++ the things you're talking about were called 
"conversions", not casts.


>You could have specified some options (such as the mode) but they took 
>their default value instead ('r' in this case).  What's ``lossy'' about 
>accepting defaults?!

Because it means you're making stuff up and tacking it onto the object, not 
"adapting" the object.  As discussed later, this would probably be better 
called "noisy" adaptation than "lossy".


>The adjective "lossy" is overwhelmingly often used in describing 
>compression, and in that context it means, can every bit of the original 
>be recovered (then the compression is lossless) or not (then it's 
>lossy).  I can't easily find "lossy" used elsewhere than in compression, 
>it's not even in American Heritage.  Still, when you describe a 
>transformation such as 12.3 -> 12 as "lossy", the analogy is quite clear 
>to me.  When you so describe the transformation 'foo.txt' -> 
>file('foo.txt'), you've lost me completely: every bit of the original IS 
>still there, as the .name attribute of the file object, so by no stretch 
>of the imagination can I see the "lossiness" -- what bits of information 
>are LOST?

Right, "noisy" is a better word for this; let's move on.


>>it for all kinds of crazy things because it seems cool.  However, it 
>>takes a while to see that adaptation is just about removing unnecessary 
>>accidents-of-incompatibility; it's not a license to transform arbitrary 
>>things into arbitrary things.  There has to be some *meaning* to a 
>>particular adaptation, or the whole concept rapidly degenerates into an 
>>undifferentiated mess.
>
>We agree, philosophically.  Not sure how the PEP could be enriched to get 
>this across.

A few examples of "good" vs. "bad" adaptation might suffice, if each is 
accompanied by a brief justification for its classification.  The 
filename/file thing is a good one, int/float or decimal/float is good 
too.  We should present "bad" first, then show how to fix the example to 
accomplish the intent in a good way.  (Like filename->file factory + 
file->file factory, explicit type conversion for precision-losing 
conversion, etc.)


>>(Or else, you decide to "fix" it by disallowing transitive adaptation, 
>>which IMO is like cutting off your hand because it hurts when you punch a 
>>brick wall.  Stop punching brick walls (i.e. using semantic-lossy 
>>adaptations), and the problem goes away.  But I realize that I'm in the 
>>minority here with regards to this opinion.)
>
>I'm not so sure about your being in the minority, having never read for 
>example Guido's opinion in the matter.

I don't know if he has one; I mean that Jim Fulton, Glyph Lefkowitz, and 
yourself have been outspoken about the "potential danger" of transitive 
adaptation, apparently based on experience with other systems.  (Which 
seems to me a lot like the "potential danger" of whitespace that people 
speak of based on bad experiences with Make or Fortran.)  There have been 
comparatively few people who have been outspoken about the virtues of 
transitive adaptation, perhaps because for those who use it, it seems quite 
natural.  (I have seen one blog post by someone that was like, "What do you 
mean those other systems aren't transitive?  I thought that was the whole 
point of adaptation.  How else would you do it?")


>But, let's take an example of Facade.  (Here's the 'later' I kept pointing 
>to;-).
>
>I have three data types / protocols: LotsOfInfo has a bazillion data 
>fields, including personFirstName, personMiddleName, personLastName, ...
>PersonName has just two data fields, theFirstName and theLastName.
>FullName has three, itsFirst, itsMiddle, itsLast.
>
>The adaptation between such types/protocols has meaning: drop/ignore 
>redundant fields, rename relevant fields, make up missing ones by some 
>convention (empty strings if they have to be strings, None to mean "I 
>dunno" like SQL NULL, etc).  But, this *IS* lossy in some cases, in the 
>normal sense: through the facade (simplified interface) I can't access ALL 
>of the bits in the original (information-richer).
>
>Adapting LotsOfInfo -> PersonName is fine; so is LotsOfInfo -> FullName.
>
>Adapting PersonName -> FullName is iffy, because I don't have the deuced 
>middlename information.  But that's what NULL aka None is for, so if 
>that's allowed, I can survive.
>
>But going from LotsOfInfo to FullName transitively, by way of PersonName, 
>cannot give the same result as going directly -- the middle name info 
>disappears, because there HAS been a "lossy" step.

Certainly it is preferable to go direct if it's possible, which is why 
PyProtocols always converges to the "shortest adapter path".  However, if 
you did *not* have a direct adaptation available from LotsOfInfo to 
FullName, would it not be *preferable* to have some adaptation than none?

The second point is that conversion from PersonName->FullName is only 
correct if FullName allows "I don't know" as a valid answer for the middle 
name.  If that's *not* the case, then such a conversion is "noisy" because 
it is pretending to know the middle name, when that isn't possible.
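The lossy transitive chain can be sketched with three plain classes and hand-written adapter functions (names taken from the example above; the adapter-registry machinery that would compose the path automatically is elided):

```python
class LotsOfInfo:
    def __init__(self):
        self.personFirstName = 'Ada'
        self.personMiddleName = 'King'
        self.personLastName = 'Lovelace'

class PersonName:
    def __init__(self, first, last):
        self.theFirstName = first
        self.theLastName = last

class FullName:
    def __init__(self, first, middle, last):
        self.itsFirst = first
        self.itsMiddle = middle
        self.itsLast = last

# Direct adaptation keeps the middle name.
def info_to_fullname(info):
    return FullName(info.personFirstName, info.personMiddleName,
                    info.personLastName)

# The transitive path composes these two adapters and silently drops
# the middle name along the way, substituting a NULL-like None.
def info_to_personname(info):
    return PersonName(info.personFirstName, info.personLastName)

def personname_to_fullname(pn):
    return FullName(pn.theFirstName, None, pn.theLastName)
```

Whether the transitive result is acceptable hinges exactly on the point above: if FullName treats itsMiddle as optional, None is a legitimate answer; if not, the composed path is producing a value it has no right to claim.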


>So the issue of "lossy" DOES matter, and I think you muddy things up when 
>you try to apply it to a string -> file adaptation ``by casting'' (opening 
>the file thus named).

Right; as I keep saying, that isn't adaptation, it's conversion.  The 
closest adaptation you can get for the intent is to adapt a string to a 
file *factory*, that can then be used to open a file.


>Forbidding lossy adaptation means forbidding facade here; not being 
>allowed to get adaptation from a rich source of information when what's 
>needed is a subset of that info with some renaming and perhaps mixing.

No, it means it's a bad idea to have implicit conversions that result in 
unintended data loss or "making up" things to fill out data the original 
data doesn't have.  You should explicitly state that you mean to get rid of 
things, or what things you want to make up.

By the way, the analogy you're drawing between loss of floating point 
precision and dropping fields from information about a person isn't valid 
for the definition of "lossy" I'm struggling to clarify.  A floating point 
number is an atomic value, but facts about a person are not made atomic 
simply by storing them in the same object.  So, separating those facts or 
using only some of them does not lose any relevant semantics.



>Forbidding indications of "I don't know" comparable to SQL's NULL (thus 
>forbidding the adaptation PersonName -> FullName) might make the whole 
>scheme incompatible with the common use of relational databases and the 
>like -- probably not acceptable, either.

If your target protocol allows for "I don't know", then consumers of that 
protocol must be willing to accept "I don't know" for an answer, in which 
case everything is fine.  It's *faking* when you don't know, and the target 
protocol does *not* allow for not knowing, that is a problem.  ("Noisy" 
adaptation.)


>Allowing both lossy adaptations, NULLs, _and_ transitivity inevitably 
>leads sooner or later to ACCIDENTAL info loss -- the proper adapter to go 
>directly LotsOfInfo -> FullName was not registered, and instead of getting 
>an exception to point out that error, your program limps along having 
>accidentally dropped a piece of information, here the middle-name.

But in this case you have explicitly designed a protocol that does not 
guarantee that you get all the required information!  If the information is 
in fact required, why did you allow it to be null?  This makes no sense to me.


>OK, but then 12.3 -> 12 should be OK, since the loss of the fractionary 
>part IS part of the difference in interfaces, right?  And yet it doesn't 
>SMELL like adaptation to me -- which is why I tried to push the issue away 
>with the specific disclaimer about numbers.

The semantics of 12.3 are atomic.  Let us say it represents some real-world 
measurement, 12.3 inches perhaps.  In the real world, are those .3 inches 
somehow separable from the 12?  That makes no sense.


>>IOW, adaptation is all about "as a" relationships from concrete objects 
>>to abstract roles, and between abstract roles.  Although one may 
>>colloquially speak of using a screwdriver "as a" hammer, this is not the 
>>case in adaptation.  One may use a screwdriver "as a" 
>>pounder-of-nails.  The difference is that a hammer might also be usable 
>>"as a" remover-of-nails.  Therefore, there is no general "as a" 
>>relationship between pounder-of-nails and remover-of-nails, even though a 
>>hammer is usable "as" either one.  Thus, it does not make sense to say 
>>that a screwdriver is usable "as a" hammer, because this would imply it's 
>>also usable to remove nails.
>
>I like the "as a" -- but it can't ignore Facade, I think.

I don't think it's a problem, because 1) your example at least represents 
facts with relatively independent semantics: you *can* separate a first 
name from a last name, even though they belong to the same person.  And 2) 
if a target protocol has optional aspects, then lossy adaptation to it is 
okay by definition.  Conversely, if the aspect is *not* optional, then 
lossy adaptation to it is not acceptable.  I don't think there can really 
be a middle ground; you have to decide whether the information is required 
or not.  If you have a protocol whose semantics cannot provide the required 
target semantics, then you should explicitly perform the loss or addition 
of information, rather than doing so implicitly via adaptation.


>>interface-to-interface adaptation should be reserved for non-lossy, 
>>non-noisy adapters.
>
>No Facade, no NULLs?  Yes, we disagree about this one: I believe 
>adaptation that occurs by showing just a subset of the info, with renaming 
>etc, is absolutely fine (Facade); and adaptation by using an allowed NULL 
>(say None) to mean "missing information", when going to a "wider" 
>interface, is not pleasant but is sometimes indispensable in the real 
>world -- that's why SQL works in the real world, even though SQL beginners 
>and a few purists hate NULLs with a vengeance.

If you allow for nulls, that's fine -- just be prepared to get 
them.  Real-world databases also have NOT NULL columns for this reason.  :)



>The points are rather that adaptation that "loses" (actually "hides") some 
>information is something we MUST have;

Agreed.


>  and adaptation that supplies "I don't know" markers (NULL-like) for some 
> missing information, where that's allowed, is really very desirable.

Also agreed, emphasizing "where that's allowed".  The point is, if it's 
allowed, it's not a problem, is it?


>   Call this lossy and noisy if you wish, we still can't do without.

No; it's noisy only if the target requires a value and the source has no 
reasonable way to supply it, requiring you to make something up.  And 
leaving out independent semantics (like first name vs. last name) isn't 
lossy IMO.


>Transitivity is a nice convenience, IF it could be something that an 
>adapter EXPLICITLY claims rather than something just happening by 
>default.  I might live with it, grudgingly, if it was the default with 
>some nice easy way to turn it off; my problem with that is -- even if 90% 
>of the cases could afford to be transitive, people will routinely forget 
>to mark the other 10% and mysterious, hard-to-find bugs will result.

Actually, in the cases where I have mistakenly defined a lossy or noisy 
adaptation, my experience has been that it blows up very rapidly and 
obviously, often because PyProtocols will detect an adapter ambiguity (two 
adaptation paths of equal length), and it detects this at adapter 
registration time, not adaptation time.

However, the more *common* source of a transitivity problem in my 
experience is in *interface inheritance*, not oddball adapters.  As I 
mentioned previously, a common error is to derive an interface from an 
interface you require, rather than one you intend your new interface to 
provide.  In the presence of inheritance transitivity (which I have not 
heard you argue against), this means that you may provide something you 
don't intend, and therefore allow your interface to be used for something 
that you didn't intend to guarantee.

Anyway, this problem manifests when you try to adapt something to the base 
interface, and it works when it really shouldn't.  It's more difficult to 
track down than it ought to be, because looking at the base interface won't 
tell you anything, and the derived interface might be buried deep in a base 
class of the concrete object.

But there's no way to positively prevent this class of bugs without 
prohibiting interface inheritance, which is the most common source of 
adaptation transitivity bugs in my experience.


>In PyProtocols docs you specifically warn against adapting from an 
>adapter... yet that's what transitivity intrinsically does!

I warn against *not keeping an original object*, because the original 
object may be adaptable to things that an adapter is *not*.  This is 
because we don't have an 'IUnknown' to recover the original object, not 
because of transitivity.


>>In that case, I generally prefer to be explicit and use conversion rather 
>>than using adaptation.  For example, if I really mean to truncate the 
>>fractional part of a number, I believe it's then appropriate to use 
>>'int(someNumber)' and make it clear that I'm intentionally using a lossy 
>>conversion rather than simply treating a number "as an" integer without 
>>changing its meaning.
>
>That's how it feels to me FOR NUMBERS, but I can't generalize the feeling 
>to the general case of facade between "records" with many fields of 
>information, see above.

Then perhaps we have made some progress; "records" are typically a 
collection of facts with independent semantics, while a number is an atomic 
value.  Facts taken in isolation do not alter their semantics, but dropping 
precision from a value does.


So, to summarize my thoughts from this post:

* Replacing LiskovViolation is possible by dropping type/isinstance checks 
from adapt(), and adding an isinstance check to object.__conform__; Liskov 
violators then override __conform__ in their class to return None when 
asked to conform to a protocol they wish to reject, and return 
super().__conform__ for all other cases.  This achieves your use case while 
simplifying both the implementation and the usage.

* Classic class support is a must; exceptions are still required to be 
classic, and even if they weren't in 2.5, backward compatibility should be 
provided for at least one release.

* Lossy/noisy refer to removing or adding dependent semantics, not 
independent semantics, so facade-ish adaptation is not lossy or noisy.

* If a target protocol permits NULL, then adaptation that supplies NULL is 
not noisy or lossy.  If it is NOT NULL, then adaptation that supplies NULL 
is just plain wrong.  Either way, there is no issue with transitivity, 
because either it's allowed or it isn't.  (If NULLs aren't allowed, then 
you should be explicit when you make things up, and not do it implicitly 
via adaptation.)

* In my experience, incorrectly deriving an interface from another is the 
most common source of unintended adaptation side-effects, not adapter 
composition.


From FBatista at uniFON.com.ar  Tue Jan 11 18:58:32 2005
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Tue Jan 11 19:00:59 2005
Subject: [Python-Dev] logging class submission
Message-ID: <A128D751272CD411BC9200508BC2194D053C7E74@escpl.tcp.com.ar>

[Stephan Stapel]

#- > There's a license issue here? 
#- 
#- I was given the advice to use this license. If this license 
#- prohibits inclusion into Python, how should I re-license the code?

I was just asking. Who gave you the advice?

.    Facundo

Bitácora De Vuelo: http://www.taniquetil.com.ar/plog
PyAr - Python Argentina: http://pyar.decode.com.ar/


From pje at telecommunity.com  Tue Jan 11 19:03:18 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 19:02:15 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <6EA3D69F-63B7-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050111125528.028f69e0@mail.telecommunity.com>

At 10:59 AM 1/11/05 +0100, Alex Martelli wrote:
>all you have to do is ADD
>     def __conform__(self, protocol):
>         if issubclass(protocol, Abstract):
>             raise LiskovViolation
>
>that's all.

That will raise a TypeError if protocol is not a class or type, so this 
could probably serve as an example of how difficult it is to write a good 
Liskov-violating __conform__.  :)

Actually, there's another problem with it; if you do this:

     class Y(X): pass
     class Z(Y): pass

then 'adapt(Z(),Y)' will now fail because of a Liskov violation.  It should 
really check for 'protocol is Abstract' or 'protocol in (Abstract,..)' in 
order to avoid this issue.
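A sketch of the corrected check (class names follow the thread; the adapt() machinery is omitted):

```python
class LiskovViolation(TypeError):
    """Hypothetical exception name from the PEP 246 draft."""

class Abstract:
    pass

class X(Abstract):
    def __conform__(self, protocol):
        # Identity test instead of issubclass(): no TypeError when the
        # protocol isn't a class at all, and subclasses of X (Y and Z
        # below) can still be adapted to one another.
        if protocol is Abstract:
            raise LiskovViolation
        return None  # defer to the default adaptation machinery

class Y(X): pass
class Z(Y): pass
```

Here adapting a Z instance to Y falls through to the default behaviour rather than tripping the Liskov check, and passing a non-class protocol no longer blows up with an unrelated TypeError.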


>Basically, rephrasing private inheritance with containment and delegation 
>is a lot of messy work, and results in far more complicated 
>structures.  And instead of paying the tiny price of a __conform__ call at 
>adaptation time, you pay the price of delegating calls over and over at 
>each x.tm1() call, so it's unlikely performance will improve.

Well, as I mentioned in my other post, such inheritance is a lot simpler 
with PEAK, so I've probably forgotten how hard it is if you're not using 
PEAK.  :)  PEAK also caches the delegated methods in the instance's 
__dict__, so there's virtually no performance penalty after the first access.

Again, not an argument that others should use PEAK, just an explanation as 
to why I missed this point; I've been using PEAK's delegation features for 
quite some time and so tend to think of delegation as something relatively 
trivial.

From barry at python.org  Tue Jan 11 19:09:02 2005
From: barry at python.org (Barry Warsaw)
Date: Tue Jan 11 19:09:10 2005
Subject: [Python-Dev] logging class submission
In-Reply-To: <A128D751272CD411BC9200508BC2194D053C7E74@escpl.tcp.com.ar>
References: <A128D751272CD411BC9200508BC2194D053C7E74@escpl.tcp.com.ar>
Message-ID: <1105466942.18590.23.camel@geddy.wooz.org>

On Tue, 2005-01-11 at 12:58, Batista, Facundo wrote:
> [Stephan Stapel]
> 
> #- > There's a license issue here? 
> #- 
> #- I was given the advice to use this license. If this license 
> #- prohibits inclusion into Python, how should I re-license the code?
> 
> I was just asking. Who gave you the advice?

Here's a link to the PSF contribution form:

http://www.python.org/psf/contrib.html

This contains links to the recommended licenses for software that might
be included in Python.

-Barry

From stephan.stapel at web.de  Tue Jan 11 19:12:29 2005
From: stephan.stapel at web.de (Stephan Stapel)
Date: Tue Jan 11 19:12:33 2005
Subject: [Python-Dev] logging class submission
Message-ID: <1490506001@web.de>

> I was just asking. Who gave you the advice?

Someone in a German Python forum. I'll change the license ASAP.

I'm just curious, but do I really have to use the contributor agreement etc.? I mean I'm just trying to submit a small class, no big framework.

cheers,

Stephan

From pje at telecommunity.com  Tue Jan 11 19:20:14 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 19:19:10 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <1105464446.41e40c7e27114@mcherm.com>
Message-ID: <5.1.1.6.0.20050111131211.0293eec0@mail.telecommunity.com>

At 09:27 AM 1/11/05 -0800, Michael Chermside wrote:
>Phillip:
>
>I think you must inhabit a far more perfect world than I do.
>
>You say, for instance, that:
> > ...-1 if this introduces a performance penalty [...] just to
> > support people who want to create deliberate Liskov violations.
> > I personally don't think that we should pander to Liskov
> > violators

I've since dropped both the performance objection and the objection to 
supporting Liskov violation; in a more recent post I've proposed an 
alternative algorithm for allowing it, that has a simpler implementation.


>You say that:
> > I think one should adapt primarily to interfaces, and
> > interface-to-interface adaptation should be reserved for
> > non-lossy, non-noisy adapters.
>
>... but in my world, half the time I'm using adaptation to
>correct for the fact that someone else's poorly-written
>code requests some class where it should have just used
>an interface.

PEP 246 adaptation?  Or are you talking about some other language?  (I ask 
out of curiosity.)

I agree that if it's possible to adapt to concrete types, people will do 
so.  However, I think we all agree that this isn't a great idea and should 
still be considered bad style.  That's not the same thing as saying it 
should be forbidden, and I haven't said it should be forbidden.


>You seem to inhabit a world in which transitivity of adaptation
>can be enforced. But in my world, people occasionally misuse
>adaptation because they think they know what they're doing
>or because they're in a big hurry and it's the most convenient
>tool at hand.

How is this different from abuse of *any* language feature that you're then 
forced to work around?  Are you saying we should not provide a feature 
because *some* people will abuse the feature?  I don't understand.

If you allow interface inheritance, you're just as susceptible to an 
invalid adaptation path, and in my experience this is more likely to bite 
you unintentionally, mainly because interface inheritance works differently 
than class inheritance (which of course is used more often).  Do you want 
to prohibit interface inheritance, too?

From pje at telecommunity.com  Tue Jan 11 19:32:53 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 19:31:53 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <D04614ED-63BF-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <C3B4E544-63B7-11D9-ADA4-000A95EFAE9E@aleax.it>
	<CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050110125415.02f28ec0@mail.telecommunity.com>
	<C3B4E544-63B7-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <5.1.1.6.0.20050111130333.028f5820@mail.telecommunity.com>

At 11:59 AM 1/11/05 +0100, Alex Martelli wrote:
>On 2005 Jan 11, at 11:01, Alex Martelli wrote:
>>On 2005 Jan 10, at 18:59, Phillip J. Eby wrote:
>>>At 12:43 PM 1/10/05 -0500, Phillip J. Eby wrote:
>>>>As a practical matter, all of the existing interface systems (Zope, 
>>>>PyProtocols, and even the defunct Twisted implementation) treat 
>>>>interface inheritance as guaranteeing substitutability for the base 
>>>>interface, and do so transitively.
>>>
>>>An additional data point, by the way: the Eclipse Java IDE has an 
>>>adaptation system that works very much like PEP 246 does, and it appears 
>>>that in a future release they intend to support automatic adapter 
>>>transitivity, so as to avoid requiring each provider of an interface to 
>>>"provide O(n^2) adapters when writing the nth version of an 
>>>interface."  IOW, their current release is transitive only for interface 
>>>inheritance ala Zope or Twisted; their future release will be transitive 
>>>for adapter chains ala PyProtocols.
>>
>>This is definitely relevant prior art, so thanks for pointing it out.
>>If interfaces change so often that 'n' can become worryingly high, this 
>>is a valid concern.  In my world, though, published interfaces do NOT 
>>change as often as to require such remedies;-).

FWIW, I don't believe that by "nth version" the original author was 
referring to changed versions of the same interface; he was instead perhaps 
trying to say that having N interfaces that adapt to IFoo, when IFoo has M 
interfaces that it can be adapted to, means that N*M adapters are required 
in all, if adapter composition isn't possible.


>"adapters" for a single underlying "object", is, I fear, stronger than we 
>can impose (so it may be that Eclipse is a better parallel, but I know 
>little of it while COM is in my bones, so that's what I keep thinking of;-).

Fair enough.  I think Eclipse's *implementation* maps fairly directly onto 
PEP 246, except that __conform__ is replaced by a 'getAdapter()' method, 
and an AdapterManager is used to look up adapters in place of both 
__adapt__ and the PEP 246 registry.  So, it is much closer to PEP 246 than 
COM, in that in COM all adaptation is managed by the object, and it cannot 
be externally adapted.  (At least, that was the case the last time I looked 
at COM, many years ago; maybe it has changed since?)


>So, I see transitivity as a nice thing to have _IF_ it's something that 
>gets explicitly asserted for a certain adapter -- if the adapter has to 
>explicitly state to the system that it "isn't lossy" (maybe), or "isn't 
>noisy" (perhaps more useful), or something like that... some amount of 
>reassurance about the adapter that makes it fully safe to use in such a chain.

Maybe, although I think in our other thread we may be converging on 
definitions of lossy and noisy that are such we can agree that it's not 
really a problem.  (I hope.)


>Maybe it might suffice to let an adapter which IS 'lossy' (or, more 
>likely, one that is 'noisy') state the fact.

I don't see a valid use case for implementing such a thing as an 
automatically-invoked adapter.


>   I'm always reluctant by instinct to default to convenient but risky 
> behavior, trusting programmers to explicitly assert otherwise when 
> needed; but in many ways this kind of design is a part of Python and 
> works fine (_with_ the BDFL's fine nose/instinct for making the right 
> compromise between convenience and safety in each case, of course).

Proposal: let adaptation implemented via __conform__ be nontransitive, and 
adaptation via __adapt__ or the adapter registry be transitive.

This would mean that lossy or noisy adapters could be implemented as an 
implicitly-executed explicit conversion, but only directly on a particular 
concrete class and its subclasses, thereby further limiting the scope and 
impact of a lossy or noisy adapter.

Also, think about this: technically, if you implement lossy or noisy 
adaptation in __conform__, it *isn't* lossy or noisy, because you have to 
do it in the class -- which means that as the class' author, you have 
decreed it to have such semantics.  However, if you are a third party, you 
will have to explicitly invoke the lossy or noisy adapter.

IOW, if you globally register an adapter (with either the interface or the 
global registry), you are guaranteeing that your adaptation is not lossy or 
noisy.  Otherwise, you need to put it in __conform__ or use it explicitly.
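A minimal sketch of that convention (all names below are invented for illustration; none of this is from any PEP 246 draft): registering an adapter globally is read as the author's pledge that the adaptation is neither lossy nor noisy, so the registry only ever holds "clean" adapters.

```python
# Invented toy registry: registering here asserts a non-lossy,
# non-noisy adaptation; anything else belongs in __conform__ or
# must be invoked explicitly by the caller.
_global_registry = {}

def register_global(frm, to, adapter):
    # the act of registering is the "cleanliness" guarantee
    _global_registry[(frm, to)] = adapter

def lookup(frm, to):
    return _global_registry.get((frm, to))

class IName: pass
class IGreeting: pass

register_global(IName, IGreeting, lambda name: "Hello, %s" % name)

adapter = lookup(IName, IGreeting)
assert adapter("Alex") == "Hello, Alex"
```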


>I'm still pondering the "don't adapt an adapter" suggestion, which seems a 
>sound one, and yet also seems to be, intrinsically, what 
>transitivity-by-chaining does.  Note that QI does not suffer from this, 
>because it lets you get the underlying object identity (IUnknown) from any 
>interface adapter.  Maybe, just maybe, we should also consider that -- a 
>distinguished protocol bereft of any real substance but acting as a flag 
>for "real unadapted object identity".  Perhaps we could use 'object' for 
>that, at least if the flow of logic in 'adapt' stays as in the current PEP 
>246 draft (i.e., __conform__ is given a chance before isinstance triggers 
>-- so, all adapters could __conform__ to object by returning the 
>underlying object being adapted, while other objects without such a 
>feature in __conform__ would end up with 'adapt(x, object) is x').  Or, if 
>object in this role turns out to be confusing, IDentity (;-) or some other 
>specially designed protocol.

It's a nice idea; the only problem I see is how far down it goes.  Any 
adapter composition implies that adapters need to know whether to also call 
adapt(x,object) on their adaptee.

From mcherm at mcherm.com  Tue Jan 11 19:47:13 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue Jan 11 19:47:15 2005
Subject: [Python-Dev] PEP 246, redux
Message-ID: <1105469233.41e41f315092a@mcherm.com>

I wrote:
> >... but in my world, half the time I'm using adaptation to
> >correct for the fact that someone else's poorly-written
> >code requests some class where it should have just used
> >an interface.

Phillip replies:
> PEP 246 adaptation?  Or are you talking about some other language?
> (I ask out of curiosity.)

Well, it's partly just a rhetorical device here. I mean PEP 246
adaptation, but (unlike you!) I'm not actually using it yet (aside
from playing around to try things out); really, I'm just guessing
how I WOULD be using it if it were part of core Python.

> I agree that if it's possible to adapt to concrete types, people will do
> so.  However, I think we all agree that this isn't a great idea and should
> still be considered bad style.

I'd agree except for the case where I am trying to pass an object
into code which is misbehaving. If we do add type declarations that
trigger an adapt() call, then people WILL write poor code which
declares concrete types, and I will find myself writing __conform__
methods to work around it. In this case, I'm the one making use of
adaptation (the original author was just expecting a TypeError), but
what I'm doing isn't (IMO) bad style.
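As an illustration of that workaround (hypothetical names throughout, and the adapt() below is a bare-bones stand-in for the PEP 246 function): a third party's __conform__ can satisfy someone else's concrete-type declaration by returning a wrapper that subclasses the demanded type.

```python
class TheirFileLike:                  # the concrete type they declared
    def read(self):
        raise NotImplementedError

class _Wrapper(TheirFileLike):
    # adapter supplied by us, the third party
    def __init__(self, data):
        self.data = data
    def read(self):
        return self.data.text

class MyData:
    def __init__(self, text):
        self.text = text
    def __conform__(self, protocol):
        if protocol is TheirFileLike:
            return _Wrapper(self)     # adapt instead of raising TypeError
        return None

def adapt(obj, protocol):
    # minimal stand-in for PEP 246 adapt()
    result = obj.__conform__(protocol)
    if result is None:
        raise TypeError("cannot adapt")
    return result

adapted = adapt(MyData("hello"), TheirFileLike)
assert isinstance(adapted, TheirFileLike)
assert adapted.read() == "hello"
```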

> >You seem to inhabit a world in which transitivity of adaptation
> >can be enforced. But in my world, people occasionally misuse
> >adaptation because they think they know what they're doing
> >or because they're in a big hurry and it's the most convenient
> >tool at hand.
>
> How is this different from abuse of *any* language feature that you're
> then forced to work around?  Are you saying we should not provide a
> feature because *some* people will abuse the feature?  I don't
> understand.

If we're just recommending that people design for transitivity, then I
don't have a problem (although see Alex's fairly good point illustrated
with LotsOfInfo, PersonName, and FullName -- I found it convincing).
But I was under the impression that the point of transitivity was to
make it "required", then automatically walk chains of adaptations. Then
I fear one case of misused adaptation could "poison" my entire adaptation
mechanism. The N^2 explosion of pairwise-only adapters scares me less,
because I think in most real situations N will be small.

> If you allow interface inheritance, you're just as susceptible to an
> invalid adaptation path, and in my experience this is more likely to
> bite you unintentionally, mainly because interface inheritance works
> differently than class inheritance (which of course is used more
> often).  Do you want to prohibit interface inheritance, too?

Hmm. Sounds like you're making a point here that's important, but which
I don't quite get. Can you elaborate? I certainly hadn't intended to
prohibit interface inheritance... how exactly does it "bite" one?

-- Michael Chermside



From cce at clarkevans.com  Tue Jan 11 19:50:20 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Tue Jan 11 19:50:22 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>
References: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>
Message-ID: <20050111185020.GA28966@prometheusresearch.com>

On Tue, Jan 11, 2005 at 12:54:36PM -0500, Phillip J. Eby wrote:
| * Replacing LiskovViolation is possible by dropping type/isinstance 
| checks from adapt(), and adding an isinstance check to 
| object.__conform__; Liskov violators then override __conform__ in their 
| class to return None when asked to conform to a protocol they wish to 
| reject, and return super().__conform__ for all other cases.  This 
| achieves your use case while simplifying both the implementation and the 
| usage.

I'd rather not assume that class inheritance implies substitutability,
unless the class is "marked" as an interface (assuming that one 
doesn't have interfaces).  I'd like it to be explicit -- a bit of a
nudge to remind a developer to verify substitutability is a good
thing. In this scenario, a LiskovViolation exception isn't needed
(aside, I don't see the rationale for the exception: to prevent
third party adapters?). Could we make a boilerplate __conform__
which enables class-based substitutability a well-known decorator?
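Such a decorator could be as simple as the following sketch (invented names; it merely installs the isinstance-based __conform__ boilerplate on classes that explicitly opt in, so substitutability is declared rather than assumed):

```python
def substitutable(cls):
    # hypothetical decorator: opting a class into
    # "inheritance implies substitutability"
    def __conform__(self, protocol):
        if isinstance(protocol, type) and isinstance(self, protocol):
            return self
        return None
    cls.__conform__ = __conform__
    return cls

@substitutable
class Mapping:
    def __getitem__(self, key):
        raise KeyError(key)

m = Mapping()
assert m.__conform__(Mapping) is m     # conforms to its own class
assert m.__conform__(int) is None      # no claim for unrelated types
```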

| * In my experience, incorrectly deriving an interface from another is the 
| most common source of unintended adaptation side-effects, not adapter 
| composition

It'd be nice if interfaces had a way to specify a test-suite that
could be run against a component which claims to be compliant.   For
example, it could provide invalid inputs and assert that the proper
errors are returned, etc.

Best,

Clark


-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From DavidA at ActiveState.com  Tue Jan 11 19:59:26 2005
From: DavidA at ActiveState.com (David Ascher)
Date: Tue Jan 11 20:01:43 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <D8107DBC-63B4-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>	<oefxgcc5.fsf@python.net>
	<D8107DBC-63B4-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <41E4220E.3080507@ActiveState.com>

Alex Martelli wrote:
> 
> On 2005 Jan 10, at 23:15, Thomas Heller wrote:
> 
>> Alex Martelli <aleax@aleax.it> writes:
>>
>>> PEP: 246
>>> Title: Object Adaptation
>>
>>
>> Minor nit (or not?): You could provide a pointer to the Liskov
>> substitution principle, for those reader that aren't too familiar with
>> that term.
> 
> 
> Excellent idea, thanks.

Terminology point: I know that LiskovViolation is technically correct, 
but I'd really prefer it if exception names (which are sometimes all 
users get to see) were more informative for people w/o deep technical 
background.  Would that be possible?

--david
From mcherm at mcherm.com  Tue Jan 11 20:26:26 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue Jan 11 20:26:26 2005
Subject: [Python-Dev] PEP 246, redux
Message-ID: <1105471586.41e42862b9a39@mcherm.com>

David Ascher writes:
> Terminology point: I know that LiskovViolation is technically correct,
> but I'd really prefer it if exception names (which are sometimes all
> users get to see) were more informative for people w/o deep technical
> background.  Would that be possible?

I don't see how. Googling on Liskov immediately brings up clear
and understandable descriptions of the principle that's being violated.
I can't imagine summarizing the issue more concisely than that! What
would you suggest? Including better explanations in the documentation
is a must, but "LiskovViolation" in the exception name seems unbeatably
clear and concise.

-- Michael Chermside

From pje at telecommunity.com  Tue Jan 11 20:44:29 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 20:43:28 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <1105469233.41e41f315092a@mcherm.com>
Message-ID: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>

At 10:47 AM 1/11/05 -0800, Michael Chermside wrote:
>I'd agree except for the case where I am trying to pass an object
>into code which is misbehaving. If we do add type declarations that
>trigger an adapt() call, then people WILL write poor code which
>declares concrete types, and I will find myself writing __conform__
>methods to work around it. In this case, I'm the one making use of
>adaptation (the original author was just expecting a TypeError), but
>what I'm doing isn't (IMO) bad style.

Agreed.  However, assuming that you're declaring a "clean" adaptation, 
perhaps it should be registered with the global registry rather than 
implemented in __conform__, which would be less work for you.


>If we're just recommending that people design for transitivity, then I
>don't have a problem (although see Alex's fairly good point illustrated
>with LotsOfInfo, PersonName, and FullName -- I found it convincing).

It's a bit misleading, however; if the target protocol allows for "nulls", 
then it's allowed to have nulls.  If it doesn't allow nulls, then the 
adaptation is broken.  Either way, it seems to me to work out; you just 
have to decide which way you want it.


>But I was under the impression that the point of transitivity was to
>make it "required", then automatically walk chains of adaptations.

I don't have a problem with making some part of the adaptation process 
avoid transitivity, such as hand-implemented __conform__ methods.


>  Then
>I fear one case of misused adaptation could "poison" my entire adaptation
>mechanism. The N^2 explosion of pairwise-only adapters scares me less,
>because I think in most real situations N will be small.

Well, Eclipse is a pretty good example of a large N, and I know that both 
Twisted and Zope developers have occasionally felt the need to do 
"double-dip" adaptation in order to work around the absence of transitive 
adapter composition in their adaptation systems.


> > If you allow interface inheritance, you're just as susceptible to an
> > invalid adaptation path, and in my experience this is more likely to
> > bite you unintentionally, mainly because interface inheritance works
> > differently than class inheritance (which of course is used more
> > often).  Do you want to prohibit interface inheritance, too?
>
>Hmm. Sounds like you're making a point here that's important, but which
>I don't quite get. Can you elaborate? I certainly hadn't intended to
>prohibit interface inheritance... how exactly does it "bite" one?

If you derive an interface from another interface, this is supposed to mean 
that your derived interface promises to uphold all the promises of the base 
interface.  That is, your derived interface is always usable where the base 
interface is required.

However, oftentimes one mistakenly derives an interface from another while 
meaning that the base interface is *required* by the derived interface, 
which is similar in meaning but subtly different.  Here, you mean to say, 
"IDerived has all of the requirements of IBase", but you have instead said, 
"You can use IDerived wherever IBase is desired".

But now, suppose that you have class Foo, which has an adapter defined to 
IDerived, and which is looked up for you by IDerived.__adapt__ and 
IBase.__adapt__.  Then, if you pass a Foo instance to a function that 
expects an *IBase*, then the function will end up with an IDerived.
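A toy reproduction of that trap (the registry and __adapt__ machinery below are invented stand-ins for Zope/PyProtocols internals, kept deliberately naive): a registration for the derived interface is found by the base interface's lookup.

```python
# registry keyed by (concrete class, interface) -> adapter factory
registry = {}

class Interface:
    @classmethod
    def __adapt__(cls, obj):
        # naive lookup: a registration for a *derived* interface is
        # treated as satisfying the base -- the behavior described
        # above as biting you when inheritance really meant
        # "requires" rather than "is usable as"
        for (klass, iface), adapter in registry.items():
            if isinstance(obj, klass) and issubclass(iface, cls):
                return adapter(obj)
        return None

class IBase(Interface): pass
class IDerived(IBase): pass

class Foo: pass

registry[(Foo, IDerived)] = lambda obj: ("derived-adapter", obj)

# A function asking only for IBase ends up with the IDerived adapter:
result = IBase.__adapt__(Foo())
assert result[0] == "derived-adapter"
```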

Sometimes this is not at all what you want, at which point I normally go 
back and copy the relevant methods from IBase to IDerived and remove the 
inheritance relationship.

This problem exists in Zope's adaptation system as well as in 
PyProtocols.  I have found that I am far less likely to have an adaptation 
problem from defining a questionable adapter, than I am to have one from 
wrongly-used inheritance.  I am now more careful about the inheritance, but 
it's difficult because intuitively an interface defines a *requirement*, so 
it seems logical to inherit from an interface in order to add requirements!

Now, in the case where both an IBase and an IDerived adapter exist, Zope 
and PyProtocols prefer to use the IBase adapter when an IBase is 
requested.  But this doesn't address the problem case, which is where there 
is no IBase-only adaptation.

From pje at telecommunity.com  Tue Jan 11 20:48:39 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 20:47:37 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <20050111185020.GA28966@prometheusresearch.com>
References: <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com>

At 01:50 PM 1/11/05 -0500, Clark C. Evans wrote:
>On Tue, Jan 11, 2005 at 12:54:36PM -0500, Phillip J. Eby wrote:
>| * Replacing LiskovViolation is possible by dropping type/isinstance
>| checks from adapt(), and adding an isinstance check to
>| object.__conform__; Liskov violators then override __conform__ in their
>| class to return None when asked to conform to a protocol they wish to
>| reject, and return super().__conform__ for all other cases.  This
>| achieves your use case while simplifying both the implementation and the
>| usage.
>
>I'd rather not assume that class inheritance implies substitutability,

Hm, you should take that up with Alex then, since that is what his current 
PEP 246 draft does.  :)  Actually, the earlier drafts did that too, so I'm 
not sure why you want to change this now.

What I've actually suggested here allows for 
inheritance=substitutability as the default, but also makes it trivially 
changeable for any given inheritance hierarchy by overriding __conform__ at 
the base of that hierarchy, and without introducing a special exception 
class to do it.

From aleax at aleax.it  Tue Jan 11 21:10:00 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 21:10:05 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com>
References: <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>
	<5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com>
Message-ID: <C5EFE1FA-640C-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 11, at 20:48, Phillip J. Eby wrote:
    ...
>> I'd rather not assume that class inheritance implies substitutability,
>
> Hm, you should take that up with Alex then, since that is what his 
> current PEP 246 draft does.  :)  Actually, the earlier drafts did that 
> too, so I'm not sure why you want to change this now.
>
> What I've actually suggested here allows for 
> inheritance=substitutability as the default, but also makes it 
> trivially changeable for any given inheritance hierarchy by overriding 
> __conform__ at the base of that hierarchy, and without introducing a 
> special exception class to do it.

The base of the hierarchy has no idea of which subclasses follow or 
break Liskov substitutability.  It's just silly to place the check there. 
  Moreover, having to change the base class is more invasive than being 
able to do it in the derived class: typically the author of the derived 
class is taking the base class from some library and does not want to 
change that library -- changing the derived class is not ideal, but 
still way better.


Alex

From aleax at aleax.it  Tue Jan 11 21:23:02 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 21:23:07 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
Message-ID: <9832132B-640E-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 11, at 20:44, Phillip J. Eby wrote:
    ...
>> If we're just recommending that people design for transitivity, then I
>> don't have a problem (although see Alex's fairly good point 
>> illustrated
>> with LotsOfInfo, PersonName, and FullName -- I found it convincing).
>
> It's a bit misleading, however; if the target protocol allows for 
> "nulls", then it's allowed to have nulls.  If it doesn't allow nulls, 
> then the adaptation is broken.  Either way, it seems to me to work 
> out, you just have to decide which way you want it.

NULLs are allowed, but *PRAGMATICALLY* they shouldn't be used except 
where there's no alternative.

Is the concept of *PRAGMATICS* so deucedly HARD for all of you 
eggheads?!  Maybe a full-credits course in linguistics should be 
mandatory for CS majors, or wherever you got your sheepskin[s].

In terms of syntax and semantics, a TCP/IP stack which just dropped all 
packets instantly would be compliant with the standards.  No GUARANTEE 
that any given packet will be delivered is ever written down anywhere, 
after all.

The reason such a TCP/IP stack would NOT be a SENSIBLE implementation 
of the standards is PRAGMATICS.  The stack is supposed to do a 
best-effort ATTEMPT to deliver packets, dammit!  That may be hard to 
formalize mathematically, but it makes all the difference in the world 
between a silly joke and a real-world tool.


My best example of pragmatics in linguistics: if I state...:

""" I never strangle python-dev posters with the initials PJE in months 
with an "R" in them """

I am saying nothing that is false or incorrect or misleading, in terms 
of syntax and semantics.  This assertion is grammatically correct and 
semantically true.

Does this mean you should worry come May...?  Not necessarily, because 
the assertion is _pragmatically_ dubious.  *PRAGMATICALLY*, in all 
natural languages, when I state "I never do X under condition Y" 
there's an implication that "condition Y" DOES have something to do 
with the case -- that if condition Y DOESN'T hold then my assurance 
about not doing X weakens.  If condition Y has nothing to do with my 
doing or not doing X, then by the PRAGMATICS of natural language I'm 
NOT supposed to juxtapose the two things -- even though both 
syntactically and semantically it's perfectly correct to do so.

Network protocol specs, programming language, libraries, etc, have 
pragmatics, too.  They're way harder to formalize, but that doesn't 
mean they can be blithely ignored in the real world.


Yes, you're ALLOWED to stuff with NULL any field that isn't explicitly 
specified as NOT NULL.

But you should ONLY do so when the information is REALLY missing, NOT 
when you've lost it along the way because you've implemented 
adapter-chain transitivity: dropping information which you COULD have 
preserved with a bit more care (==without transitivity) is a violation 
of PRAGMATICS, of the BEST-EFFORT implication, just as it would be to 
drop packets once in a while in a TCP/IP stack due to some silly 
programming bug which was passed silently.


Alex

From pje at telecommunity.com  Tue Jan 11 21:30:19 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 21:29:18 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <C5EFE1FA-640C-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com>
	<5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>
	<5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050111152405.028ec210@mail.telecommunity.com>

At 09:10 PM 1/11/05 +0100, Alex Martelli wrote:

>On 2005 Jan 11, at 20:48, Phillip J. Eby wrote:
>    ...
>>>I'd rather not assume that class inheritance implies substitutability,
>>
>>Hm, you should take that up with Alex then, since that is what his 
>>current PEP 246 draft does.  :)  Actually, the earlier drafts did that 
>>too, so I'm not sure why you want to change this now.
>>
>>What I've actually suggested here allows for 
>>inheritance=substitutability as the default, but also makes it trivially 
>>changeable for any given inheritance hierarchy by overriding __conform__ 
>>at the base of that hierarchy, and without introducing a special 
>>exception class to do it.
>
>The base of the hierarchy has no idea of which subclasses follow or break 
>Liskov subtitutability.  It's just silly to site the check 
>there.  Moreover, having to change the base class is more invasive than 
>being able to do it in the derived class: typically the author of the 
>derived class is taking the base class from some library and does not want 
>to change that library -- changing the derived class is not ideal, but 
>still way better.

Stop; you're responding to something I didn't propose!  (Maybe you're 
reading these posts in reverse order, and haven't seen the actual proposal 
yet?)

Clark said he didn't want to assume substitutability; I was pointing out 
that he could choose to not assume that, if he wished, by implementing an 
appropriate __conform__ at the base of his hierarchy.  This is entirely 
unrelated to deliberate Liskov violation, and is in any case not possible 
with your original proposal.  I don't agree with Clark's use case, but my 
proposal supports it as a possibility, and yours does not.

To implement a Liskov violation with my proposal, you do exactly the same 
as with your proposal, *except* that you can simply return None instead of 
raising an exception, and the logic for adapt() is more straightforward.

From pje at telecommunity.com  Tue Jan 11 21:53:08 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 21:52:16 2005
Subject: [Python-Dev] Concrete proposals for PEP 246
Message-ID: <5.1.1.6.0.20050111144851.009d2ec0@mail.telecommunity.com>

To save everyone from having to wade through the lengthy discussions 
between Alex and I, and to avoid putting all the summarization burden on 
Alex, I thought I would start a new thread listing my concrete proposals 
for PEP 246 changes, and summarizing my understanding of the current 
agreements and disagreements we have.  (Alex, please correct me if I have 
misrepresented your position in any respects.)


First, I propose to allow explicitly-declared Liskov violations, but using 
a different mechanism than the one Alex has proposed.  Specifically, I wish 
to remove all type or isinstance checking from the 'adapt()' 
function.  But, the 'object' type should have a default __conform__ that is 
equivalent to:

     class object:
         def __conform__(self,protocol):
             if isinstance(protocol,ClassTypes) and isinstance(self,protocol):
                 return self
             return None

and is inherited by all object types, including types defined by extension 
modules, unless overridden.

This approach provides solid backward compatibility with the previous 
version of PEP 246, as long as any user-implemented __conform__ methods 
call their superclass __conform__.  But, it also allows Liskov-violating 
classes to simply return 'None' to indicate their refusal to conform to a 
base class interface, instead of having to raise a special exception.  I 
think that, all else being equal, this is a simpler implementation approach 
to providing the feature, which Alex and others have convinced me is a 
valid (if personally somewhat distasteful) use case.
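For concreteness, a runnable sketch of this first proposal (with a Base class standing in for 'object', since the builtin can't be patched in a sketch, and a bare-bones adapt() with no type checking of its own):

```python
class Base:
    # stands in for the proposed default object.__conform__
    def __conform__(self, protocol):
        if isinstance(protocol, type) and isinstance(self, protocol):
            return self
        return None

def adapt(obj, protocol, default=None):
    # no isinstance checks here: conformance is entirely delegated
    result = obj.__conform__(protocol)
    if result is not None:
        return result
    adapt_hook = getattr(protocol, '__adapt__', None)
    if adapt_hook is not None:
        result = adapt_hook(obj)
        if result is not None:
            return result
    return default

class Conforming(Base):
    pass

class LiskovViolator(Base):
    def __conform__(self, protocol):
        if protocol is Base:            # refuse to be used as a Base
            return None
        return super().__conform__(protocol)

c = Conforming()
assert adapt(c, Base) is c                      # inherited default applies
assert adapt(LiskovViolator(), Base) is None    # explicit refusal, no exception
```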


Second: Alex has agreed to drop the "cast" terminology, since its meanings 
in various languages are too diverse to be illuminating.  We have instead 
begun creeping towards agreement on some concepts such as "lossy" or 
"noisy" conversions.  I think we can also drop my original "lossy" term, 
because "noisy" can also imply information loss and better explains the 
real issue anyway.

A "noisy" conversion, then, is one where the conversion "makes up" 
information that was not implicitly present in the original, or drops 
information that *alters the semantics of the information that is 
retained*.  (This phrasing gets around Alex's LotsOfInfo example/objection, 
while still covering loss of numeric precision; mere narrowing and renaming 
of attributes/methods does not alter the semantics of the retained 
information.)

Adaptation is not recommended as a mechanism for noisy conversions, because 
implicit changes to semantics are a bad idea.  Note that this is actually 
independent of any transitivity issues -- implicit noisy conversion is just 
a bad idea to start with, which is why 'someList[1.2]' raises a TypeError 
rather than implicitly converting 1.2 to an integer!  If you *mean* to drop 
the .2, you should say so, by explicitly converting to an integer.

(However, it *might* be acceptable to implicitly convert 1.0 to an integer; 
I don't currently have a strong opinion either way on that issue, other 
than to note that the conversion is not "noisy" in that case.)

Anyway, I think that the current level of consensus between Alex and me 
on the above is now such that his comparison to casting could now be 
replaced by some discussion of noisy vs. non-noisy (faithful? 
high-fidelity?) conversion, and the fact that adaptation is suitable only 
for the latter, supplemented by some examples of noisy conversion use cases 
and how to transform them into non-noisy constructs.  The string->file vs. 
string->file_factory example is a particularly good one, I think, because 
it shows how to address a common, practical issue.


Third: (Proposed) The PEP should explicitly support classic classes, or 
else there is no way to adapt exception instances.  (Yes, I have actually 
done this; peak.web adapts exception instances to obtain appropriate 
handlers, for example.)


Fourth: The principal issue from the original discussion that remains open 
at this time is determining specific policies or recommendations for 
addressing various forms of transitivity, which we have delved into a 
little bit.  (By "open" I just mean that there is no proposal for this
issue currently on the table, not to imply that my proposals are not also 
"open" in the sense of awaiting consensus.)

Anyway, the kinds of transitivity we're discussing are:

   1. interface inheritance transitivity (i.e. adapt(x,IBase) if 
adapt(x,IDerived) and IDerived inherits from IBase)

   2. adapter composition transitivity (i.e. adapt(x,ISome) if
adapt(x,IOther) and there is a general-purpose IOther->ISome adapter
available).
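
A toy illustration of the two kinds (the interface names and adapter functions here are invented for the example; they are not PEP 246, Zope, or PyProtocols API):

```python
class IBase: pass
class IDerived(IBase): pass

# kind 1: interface inheritance -- an object adaptable to IDerived
# is considered adaptable to IBase, because IDerived subclasses IBase
assert issubclass(IDerived, IBase)

# kind 2: adapter composition -- given registered adapters
# f: X -> IOther and g: IOther -> ISome, the system can synthesize
# their composition to satisfy adapt(x, ISome)
f = lambda x: ("IOther", x)
g = lambda y: ("ISome", y)
composed = lambda x: g(f(x))
assert composed(42) == ("ISome", ("IOther", 42))
```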

These are mostly issues for the design of an interface system (implementing 
__adapt__) and for the design of a global adapter registry.  I don't think 
it's practical to implement either kind of transitivity on the __conform__ 
side, at least not for hand-written __conform__ methods.

To summarize current PEP 246 implementations' choices on this issue, Zope 
implements type 1 transitivity, but not type 2; PyProtocols implements 
both.  Both Zope and PyProtocols allow for individual objects to assert 
compliance with an interface that their class does not claim compliance 
with, and to use this assertion as a basis for adaptation.

In the case of PyProtocols, this is handled by adding a per-instance 
__conform__, but Zope has a separate concept of declaring what interfaces 
an instance "provides", distinct from what it is "adaptable 
to".  PyProtocols in contrast considers "provides" to be the same as 
"adaptable to with no adapter", i.e. a trivial special case of adaptation 
rather than a distinct concept.

I have also asserted that in practice I have encountered more problems with 
type 1 transitivity than with type 2, because of the strong temptation to 
derive an interface to avoid duplicating methods.  In other words, 
inappropriate use of interface inheritance produces roughly the same effect 
as introducing a noisy adapter into a type 2 adapter mesh, but IMO it's 
much easier to do innocently and accidentally, as it doesn't even require 
that you write an adapter!

From pje at telecommunity.com  Tue Jan 11 22:08:13 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 22:07:13 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <9832132B-640E-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>

At 09:23 PM 1/11/05 +0100, Alex Martelli wrote:
>Is the concept of *PRAGMATICS* so deucedly HARD for all of your eggheads?!

Hmm.  Pot, meet kettle.  :)


>Yes, you're ALLOWED to stuff with NULL any field that isn't explicitly 
>specified as NOT NULL.
>
>But you should ONLY do so when the information is REALLY missing, NOT when 
>you've lost it along the way because you've implemented adapter-chain 
>transitivity: dropping information which you COULD have preserved with a 
>bit more care (==without transitivity) is a violation of PRAGMATICS, of 
>the BEST-EFFORT implication, just as it would be to drop packets once in a 
>while in a TCP/IP stack due to some silly programming bug which was passed 
>silently.

This is again a misleading analogy.  You are comparing end-to-end with 
point-to-point.  I am saying that if you have a point-to-point connection 
that drops all packets of a particular kind, you should not put it into 
your network, unless you know that an alternate route exists that can 
ensure those packets get through.  Otherwise, you are breaking the network.

Thus, I am saying that PRAGMATICALLY, it is silly to create a cable that 
drops all ACK packets, for example, and then plug it into your 
network.  And especially, it's silly to turn around that as a reason that 
one should only use end-to-end leased lines, because that packet forwarding 
business is dangerously unreliable!

As far as I can tell, you are arguing that you should never use packet 
forwarding for communication, because somebody might have a router 
somewhere that drops packets.  While I am arguing that if a router is known 
to drop packets incorrectly, the router is broken and should be removed 
from the network, or else bypassed via another route.  And, in the cases 
where you have a "leased line" direct from point A to point B, your routers 
should be smart enough to use that route in place of forwarding from A to C 
to D to B, or whatever.

From fredrik at pythonware.com  Tue Jan 11 23:20:59 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue Jan 11 23:20:58 2005
Subject: [Python-Dev] copy confusion
Message-ID: <cs1jfs$p3d$1@sea.gmane.org>

back in Python 2.1 (and before), an object could define how copy.copy should
work simply by defining a __copy__ method.  here's the relevant portion:

    ...
    try:
        copierfunction = _copy_dispatch[type(x)]
    except KeyError:
        try:
            copier = x.__copy__
        except AttributeError:
            raise error, \
                  "un(shallow)copyable object of type %s" % type(x)
        y = copier()
    ...

I recently discovered that this feature has disappeared in 2.3 and 2.4.
instead of looking for an instance method, the code now looks at the object's
type:

    ...

    cls = type(x)

    copier = _copy_dispatch.get(cls)
    if copier:
        return copier(x)

    copier = getattr(cls, "__copy__", None)
    if copier:
        return copier(x)

    ...

(copy.deepcopy still seems to be able to use __deepcopy__ hooks, though)

is this a bug, or a feature of the revised copy/pickle design?   (the code in
copy_reg/copy/pickle might be among the more convoluted pieces of python
coding that I have ever seen...  and what's that smiley doing in copy.py?)

and if it's a bug, does the fact that nobody reported this for 2.3 indicate that
I'm the only one using this feature?  is there a better way to control copying
that I should use instead?
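
The type-based lookup can be demonstrated in a current interpreter (a small sketch; the class names are invented for the example):

```python
import copy

class WithClassCopy:
    def __copy__(self):
        # a class-level hook is found via type(x), so copy.copy honors it
        return "class-level copy"

assert copy.copy(WithClassCopy()) == "class-level copy"

class Plain:
    pass

p = Plain()
# a per-instance hook is ignored, because the lookup goes through type(p)
p.__copy__ = lambda: "instance-level copy"
result = copy.copy(p)
assert result != "instance-level copy"
assert isinstance(result, Plain)   # an ordinary shallow copy is made instead
```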

</F> 



From pje at telecommunity.com  Tue Jan 11 23:39:34 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 23:38:36 2005
Subject: [Python-Dev] copy confusion
In-Reply-To: <cs1jfs$p3d$1@sea.gmane.org>
Message-ID: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>

At 11:20 PM 1/11/05 +0100, Fredrik Lundh wrote:
>I recently discovered that this feature has disappeared in 2.3 and 2.4.  in-
>stead of looking for an instance method, the code now looks at the object's
>type:
>
>     ...
>
>     cls = type(x)
>
>     copier = _copy_dispatch.get(cls)
>     if copier:
>         return copier(x)
>
>     copier = getattr(cls, "__copy__", None)
>     if copier:
>         return copier(x)
>
>     ...
>
>(copy.deepcopy still seems to be able to use __deepcopy__ hooks, though)
>
>is this a bug, or a feature of the revised copy/pickle design?

Looks like a bug to me; it breaks the behavior of classic classes, since 
type(classicInstance) returns InstanceType.

However, it also looks like it might have been introduced to fix the 
possibility that calling '__copy__' on a new-style class with a custom 
metaclass would result in ending up with an unbound method.  (Similar to 
the "metaconfusion" issue being recently discussed for PEP 246.)

ISTM the way to fix both issues is to switch to using x.__class__ in 
preference to type(x) to retrieve the __copy__ method from, although this 
still allows for metaconfusion at higher metaclass levels.

Maybe we need a getclassattr to deal with this issue, since I gather from 
Armin's post that this problem has popped up in other places besides here and 
PEP 246.
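
A getclassattr along those lines might look like this (purely a sketch of the idea; the name comes from the message above, the implementation is assumed, and the example uses modern metaclass syntax):

```python
def getclassattr(obj, name, default=None):
    # search the class's MRO dictionaries directly; a plain
    # getattr(type(obj), name) could instead find a method that the
    # *metaclass* defines (the "metaconfusion" problem)
    for klass in type(obj).__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    return default

class Meta(type):
    def __copy__(cls):
        return "copies the class object, not its instances"

class C(metaclass=Meta):
    pass

x = C()
# plain getattr on the class finds Meta.__copy__, bound to C -- metaconfusion
assert getattr(type(x), "__copy__", None) is not None
# the MRO search correctly reports that C itself defines no __copy__
assert getclassattr(x, "__copy__") is None
```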

(Follow-up: Guido's checkin comment for the change suggests it was actually 
done as a performance enhancement while adding a related feature (copy_reg 
integration), rather than as a fix for possible metaconfusion, even though 
it also has that effect.)

From aleax at aleax.it  Tue Jan 11 23:50:55 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 23:51:01 2005
Subject: [Python-Dev] copy confusion
In-Reply-To: <cs1jfs$p3d$1@sea.gmane.org>
References: <cs1jfs$p3d$1@sea.gmane.org>
Message-ID: <40628469-6423-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 11, at 23:20, Fredrik Lundh wrote:

> back in Python 2.1 (and before), an object could define how copy.copy 
> should
> work simply by definining a __copy__ method.  here's the relevant 
> portion:
>
>     ...
>     try:
>         copierfunction = _copy_dispatch[type(x)]
>     except KeyError:
>         try:
>             copier = x.__copy__
>         except AttributeError:
>             raise error, \
>                   "un(shallow)copyable object of type %s" % type(x)
>         y = copier()
>     ...
>
> I recently discovered that this feature has disappeared in 2.3 and 
> 2.4.  in-
> stead of looking for an instance method, the code now looks at the 
> object's
> type:

Hmmm, yes, we were discussing this general issue as part of the huge 
recent thread about pep 246.  In the new-style object model, special 
methods are supposed to be looked up on the type, not on the object; 
otherwise, having a class with special methods would be a problem -- 
are the methods meant to apply to the class object itself, or to its 
instances?

However, apparently, the code you quote is doing it wrong:

>     cls = type(x)
>
>     copier = _copy_dispatch.get(cls)
>     if copier:
>         return copier(x)
>
>     copier = getattr(cls, "__copy__", None)
>     if copier:
>         return copier(x)

...because getattr is apparently the wrong way to go about it (e.g., it 
could get the '__copy__' from type(cls), which would be mistaken).  
Please see Armin Rigo's only recent post to Python-Dev for the way it 
should apparently be done instead -- assuming Armin is right (he 
generally is), there should be plenty of bugs in copy.py (ones that 
emerge when you're using custom metaclasses &c -- are you doing that?).

Still, if you're using an instance of an old-style class, the lookup in 
_copy_dispatch should be on types.InstanceType -- is that what you're 
trying to copy, an instance of an old-style class?


> (copy.deepcopy still seems to be able to use __deepcopy__ hooks, 
> though)

It starts with a peek into a dispatch dictionary for the type of the 
object, too, just like shallow copy does.  What's the type of what 
you're trying to copy?


> is this a bug, or a feature of the revised copy/pickle design?   (the 
> code in
> copy_reg/copy/pickle might be among the more convoluted pieces of 
> python
> coding that I ever seen...  and what's that smiley doing in copy.py?)
>
> and if it's a bug, does the fact that nobody reported this for 2.3 
> indicate that
> I'm the only one using this feature?  is there a better way to control 
> copying
> that I should use instead?

When I can, I use __getstate__ and __setstate__, simply because they 
seem clear and flexible to me (usable for copying, deep copying, and 
pickling).  But that doesn't mean __copy__ or __deepcopy__ should be 
left broken, of course.
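
One way to use that pair (a sketch; Point is an invented example class):

```python
import copy
import pickle

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __getstate__(self):
        # one state definition serves copy, deepcopy, and pickle alike
        return (self.x, self.y)
    def __setstate__(self, state):
        self.x, self.y = state

p = Point(1, 2)

c = copy.copy(p)
assert (c.x, c.y) == (1, 2)

d = copy.deepcopy(p)
assert (d.x, d.y) == (1, 2)

r = pickle.loads(pickle.dumps(p))
assert (r.x, r.y) == (1, 2)
```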

Although there are features of design intent here, it does appear to me 
there may be bugs too (if the getattr on the type is wrong); in this 
case it's worrisome, not just that nobody else reported problems, but 
also that the unit tests didn't catch them...:-(


Alex

From aleax at aleax.it  Tue Jan 11 23:56:26 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 23:56:31 2005
Subject: [Python-Dev] copy confusion
In-Reply-To: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
Message-ID: <05E1B32A-6424-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 11, at 23:39, Phillip J. Eby wrote:
    ...
>>     cls = type(x)
>>
>>     copier = _copy_dispatch.get(cls)
>>     if copier:
>>         return copier(x)
    ...
>> this a bug, or a feature of the revised copy/pickle design?
>
> Looks like a bug to me; it breaks the behavior of classic classes, 
> since type(classicInstance) returns InstanceType.

It doesn't, because types.InstanceType is a key in _copy_dispatch and 
gets a function that implements old-style class behavior.

> However, it also looks like it might have been introduced to fix the 
> possibility that calling '__copy__' on a new-style class with a custom 
> metaclass would result in ending up with an unbound method.  (Similar 
> to the "metaconfusion" issue being recently discussed for PEP 246.)
>
> ISTM the way to fix both issues is to switch to using x.__class__ in 
> preference to type(x) to retrieve the __copy__ method from, although 
> this still allows for metaconfusion at higher metaclass levels.

What "both issues"?  There's only one issue, it seems to me -- one of 
metaconfusion.

Besides, getattr(x.__class__, '__copy__') would not give backwards 
compatibility if x is an old-style instance -- it would miss the 
per-instance x.__copy__ if any.  Fortunately, _copy_dispatch deals with 
that.  So changing from type(x) to x.__class__ seems useless.

> Maybe we need a getclassattr to deal with this issue, since I gather 
> from Armin's post that this problem has popped up other places besides 
> here and PEP 246.

Apparently we do: a bug in a reference implementation in a draft PEP is 
one thing, one that lives so long in a key module of the standard 
library is quite another.

> (Follow-up: Guido's checkin comment for the change suggests it was 
> actually done as a performance enhancement while adding a related 
> feature (copy_reg integration), rather than as a fix for possible 
> metaconfusion, even though it also has that effect.)

OK, but if Armin is correct about the code in the reference 
implementation of pep 246, and I think he is, this is still a bug in 
copy.py (though probably not the specific one that bit /f).


Alex

From gvanrossum at gmail.com  Tue Jan 11 23:58:08 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan 11 23:58:12 2005
Subject: [Python-Dev] copy confusion
In-Reply-To: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
References: <cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
Message-ID: <ca471dc205011114583df08bbf@mail.gmail.com>

[Fredrik]
> >I recently discovered that this feature has disappeared in 2.3 and 2.4.  in-
> >stead of looking for an instance method, the code now looks at the object's
> >type:
> >
> >     ...
> >
> >     cls = type(x)
> >
> >     copier = _copy_dispatch.get(cls)
> >     if copier:
> >         return copier(x)
> >
> >     copier = getattr(cls, "__copy__", None)
> >     if copier:
> >         return copier(x)
> >
> >     ...
> >
> >(copy.deepcopy still seems to be able to use __deepcopy__ hooks, though)
> >
> >is this a bug, or a feature of the revised copy/pickle design?

[Phillip]
> Looks like a bug to me; it breaks the behavior of classic classes, since
> type(classicInstance) returns InstanceType.

I'm not so sure. I can't seem to break this for classic classes.

The only thing this intends to break, and then only for new-style
classes, is the ability to have __copy__ be an instance variable
(whose value should be a callable without arguments) -- it must be a
method on the class. This is the same thing that I've done for all
built-in operations (__add__, __getitem__ etc.).

> However, it also looks like it might have been introduced to fix the
> possibility that calling '__copy__' on a new-style class with a custom
> metaclass would result in ending up with an unbound method.  (Similar to
> the "metaconfusion" issue being recently discussed for PEP 246.)

Sorry, my head just exploded. :-(

I think I did this change (for all slots) to make the operations more
efficient by avoiding dict lookups. It does have the desirable
property of not confusing a class's attributes with its metaclass's
attributes, but only as long as you use the operation's native syntax
(e.g. x[y]) rather than the nominally "equivalent" method call (e.g.
x.__getitem__(y)).
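
The distinction Guido describes is easy to demonstrate (a minimal sketch with an invented class):

```python
class Box:
    def __getitem__(self, key):
        return key * 2

b = Box()
assert b[3] == 6

# a per-instance __getitem__ is ignored by the native syntax,
# which looks the slot up on type(b)...
b.__getitem__ = lambda key: "instance"
assert b[3] == 6

# ...but the nominally "equivalent" explicit method call finds the
# instance attribute first
assert b.__getitem__(3) == "instance"
```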

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From aleax at aleax.it  Wed Jan 12 00:09:17 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 00:09:21 2005
Subject: [Python-Dev] copy confusion
In-Reply-To: <ca471dc205011114583df08bbf@mail.gmail.com>
References: <cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
Message-ID: <D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 11, at 23:58, Guido van Rossum wrote:
    ...
>>>     cls = type(x)
>>>     copier = _copy_dispatch.get(cls)
>>>     if copier:
>>>         return copier(x)
    ...
>>> is this a bug, or a feature of the revised copy/pickle design?
>
> [Phillip]
>> Looks like a bug to me; it breaks the behavior of classic classes, 
>> since
>> type(classicInstance) returns InstanceType.
>
> I'm not so sure. I can't seem to break this for classic classes.

You can't, _copy_dispatch deals with those.

> The only thing this intends to break, and then only for new-style
> classes, is the ability to have __copy__ be an instance variable
> (whose value should be a callable without arguments) -- it must be a
> method on the class. This is the same thing that I've done for all
> built-in operations (__add__, __getitem__ etc.).

And a wonderful idea it is.

>> However, it also looks like it might have been introduced to fix the
>> possibility that calling '__copy__' on a new-style class with a custom
>> metaclass would result in ending up with an unbound method.  (Similar 
>> to
>> the "metaconfusion" issue being recently discussed for PEP 246.)
>
> Sorry, my head just exploded. :-(
>
> I think I did this change (for all slots) to make the operations more
> efficient by avoiding dict lookups. It does have the desirable
> property of not confusing a class's attributes with its metaclass's
> attributes, but only as long as you use the operation's native syntax
> (e.g. x[y]) rather than the nominally "equivalent" method call (e.g.
> x.__getitem__(y)).

Unfortunately, we do have a problem with the code in copy.py:

class MetaCopyableClass(type):
     def __copy__(cls):
         """ code to copy CLASSES of this metaclass """
     # etc, etc, snipped

class CopyableClass:
     __metaclass__ = MetaCopyableClass
     # rest of class snipped

x = CopyableClass()

import copy
y = copy.copy(x)


kallisti:/tmp alex$ python x.py
Traceback (most recent call last):
   File "x.py", line 14, in ?
     y = copy.copy(x)
   File "/usr/local/lib/python2.4/copy.py", line 79, in copy
     return copier(x)
TypeError: __copy__() takes exactly 1 argument (2 given)
kallisti:/tmp alex$

See?  copy.copy(x) ends up using MetaCopyableClass.__copy__ -- because 
of a getattr on CopyableClass for '__copy__', which gets the 
BOUND-METHOD defined in the metaclass, with im_self being 
CopyableClass.

I had exactly the same metabug in the pep 246 reference implementation, 
Armin Rigo showed how to fix it in his only recent post.


Alex


From DavidA at ActiveState.com  Wed Jan 12 00:13:43 2005
From: DavidA at ActiveState.com (David Ascher)
Date: Wed Jan 12 00:15:33 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <1105471586.41e42862b9a39@mcherm.com>
References: <1105471586.41e42862b9a39@mcherm.com>
Message-ID: <41E45DA7.1030302@ActiveState.com>

Michael Chermside wrote:
> David Ascher writes:
> 
>>Terminology point: I know that LiskovViolation is technically correct,
>>but I'd really prefer it if exception names (which are sometimes all
>>users get to see) were more informative for people w/o deep technical
>>background.  Would that be possible?
> 
> 
> I don't see how. Googling on Liskov immediately brings up clear
> and understandable descriptions of the principle that's being violated.
> I can't imagine summarizing the issue more concisely than that! What
> would you suggest? Including better explanations in the documentation
> is a must, but "LiskovViolation" in the exception name seems unbeatably
> clear and concise.

Clearly, I disagree.

My point is that it'd be nice if we could come up with an exception name 
which could be grokkable without requiring 1) Google, 2) relatively 
high-level understanding of type theory.

Googling on Liskov brings up things like:

http://c2.com/cgi/wiki?LiskovSubstitutionPrinciple

"""What is wanted here is something like the following substitution 
property: If for each object o1 of type S there is an object o2 of type 
T such that for all programs P defined in terms of T, the behavior of P 
is unchanged when o1 is substituted for o2 then S is a subtype of T." - 
BarbaraLiskov, Data Abstraction and Hierarchy, SIGPLAN Notices, 23,5 
(May, 1988)."""

If you think that that is clear and understandable to the majority of 
the Python community, you clearly have a different perspective on that 
community.  I have (almost) no doubt that all Python-dev'ers understand 
it, but maybe we should ask someone like Anna Ravenscroft or Mark Lutz 
if they think it'd be appropriate from a novice user's POV.  I'm quite 
sure that the experts could understand a more pedestrian name, and quite 
sure that the reverse isn't true.

I also think that the term "violation" isn't necessarily the best word 
to add to the Python namespace, when error or exception would do just fine.

In addition, to say that it's unbeatably clear without a discussion of 
alternatives (or if I've missed it, please let me know) seems weird.

The point is broader, though -- when I get my turn in the time machine, 
I'll lobby for replacing NameError with UndefinedVariable or something 
similar (or more useful still).  The former is confusing to novices, and 
while it can be learned, that's yet another bit of learning which is, 
IMO, unnecessary, even though it may be technically more correct.

--david ascher
From gvanrossum at gmail.com  Wed Jan 12 00:20:12 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan 12 00:20:33 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <41E45DA7.1030302@ActiveState.com>
References: <1105471586.41e42862b9a39@mcherm.com>
	<41E45DA7.1030302@ActiveState.com>
Message-ID: <ca471dc2050111152034af1be7@mail.gmail.com>

> My point is that it'd be nice if we could come up with an exception name
> which could be grokkable without requiring 1) Google, 2) relatively
> high-level understanding of type theory.

How about SubstitutabilityError?

> The point is broader, though -- when I get my turn in the time machine,
> I'll lobby for replacing NameError with UndefinedVariable or something
> similar (or more useful still).  The former is confusing to novices, and
> while it can be learned, that's yet another bit of learning which is,
> IMO, unnecessary, even though it may be technically more correct.

We did that for UnboundLocalError, which subclasses NameError. Perhaps
we can rename NameError to UnboundVariableError (and add NameError as
an alias for b/w compat).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From DavidA at ActiveState.com  Wed Jan 12 00:21:29 2005
From: DavidA at ActiveState.com (David Ascher)
Date: Wed Jan 12 00:23:12 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <ca471dc2050111152034af1be7@mail.gmail.com>
References: <1105471586.41e42862b9a39@mcherm.com>	
	<41E45DA7.1030302@ActiveState.com>
	<ca471dc2050111152034af1be7@mail.gmail.com>
Message-ID: <41E45F79.609@ActiveState.com>

Guido van Rossum wrote:
>>My point is that it'd be nice if we could come up with an exception name
>>which could be grokkable without requiring 1) Google, 2) relatively
>>high-level understanding of type theory.
> 
> 
> How about SubstitutabilityError?

That would be far, far better, yes.

> We did that for UnboundLocalError, which subclasses NameError. Perhaps
> we can rename NameError to UnboundVariableError (and add NameError as
> an alias for b/w compat).

Sure, although (and here I'm pushing it, I know, and I should have 
argued it way back then), the notion of 'unbound' is possibly too 
low-level still.  'Unknown' would probably carry much more meaning to 
those people who most need it.

But yes, you're catching my drift.

--david
From pje at telecommunity.com  Wed Jan 12 00:26:59 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 00:26:02 2005
Subject: [Python-Dev] copy confusion
In-Reply-To: <05E1B32A-6424-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050111182510.032a14b0@mail.telecommunity.com>

At 11:56 PM 1/11/05 +0100, Alex Martelli wrote:
>What "both issues"?  There's only one issue, it seems to me -- one of 
>metaconfusion.

I was relying on Fredrik's report of a problem with the code; that is the 
other "issue" I referred to.

From fredrik at pythonware.com  Wed Jan 12 00:30:20 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Jan 12 00:30:22 2005
Subject: [Python-Dev] Re: copy confusion
References: <cs1jfs$p3d$1@sea.gmane.org><5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
Message-ID: <cs1nhu$1jc$1@sea.gmane.org>

Guido van Rossum wrote:

> The only thing this intends to break /.../

it breaks classic C types:

>>> import cElementTree
>>> x = cElementTree.Element("tag")
>>> x
<Element 'tag' at 00B4BA30>
>>> x.__copy__
<built-in method __copy__ of Element object at 0x00B4BA30>
>>> x.__copy__()
<Element 'tag' at 00B4BA68>
>>> import copy
>>> y = copy.copy(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\python24\lib\copy.py", line 93, in copy
    raise Error("un(shallow)copyable object of type %s" % cls)
copy.Error: un(shallow)copyable object of type <type 'Element'>
>>> dir(x)
['__copy__', '__deepcopy__', 'append', 'clear', 'find', 'findall', 'findtext',
'get', 'getchildren', 'getiterator', 'insert', 'items', 'keys', 'makeelement', 'set']
>>> dir(type(x))
['__class__', '__delattr__', '__delitem__', '__delslice__', '__doc__',
'__getattribute__', '__getitem__', '__getslice__', '__hash__', '__init__',
'__len__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__', '__setitem__', '__setslice__', '__str__']

(and of course, custom C types are the only case where I've ever used
__copy__; the default behavior has worked just fine for all other cases)

for cElementTree, I've worked around this with an ugly __reduce__ hack,
but that doesn't feel right...

</F> 



From aleax at aleax.it  Wed Jan 12 00:33:22 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 00:33:27 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
Message-ID: <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 11, at 22:08, Phillip J. Eby wrote:
    ...
>> Yes, you're ALLOWED to stuff with NULL any field that isn't 
>> explicitly specified as NOT NULL.
>>
>> But you should ONLY do so when the information is REALLY missing, NOT 
>> when you've lost it along the way because you've implemented 
>> adapter-chain transitivity: dropping information which you COULD have 
>> preserved with a bit more care (==without transitivity) is a 
>> violation of PRAGMATICS, of the BEST-EFFORT implication, just as it 
>> would be to drop packets once in a while in a TCP/IP stack due to 
>> some silly programming bug which was passed silently.
>
> This is again a misleading analogy.  You are comparing end-to-end with 
> point-to-point.  I am saying that if you have a point-to-point 
> connection that drops all packets of a particular kind, you should not 
> put it into your network, unless you know that an alternate route 
> exists that can ensure those packets get through.  Otherwise, you are 
> breaking the network.

But adaptation is not transmission!  It's PERFECTLY acceptable for an 
adapter to facade: to show LESS information in the adapted object than 
was in the original.  It's PERFECTLY acceptable for an adapter to say 
"this piece information is not known" when it's adapting an object for 
which that information, indeed, is not known.  It's only CONJOINING the 
two perfectly acceptable adapters, as transitivity by adapter chain 
would do automatically, that you end up with a situation that is 
pragmatically undesirable: asserting that some piece of information is 
not known, when the information IS indeed available -- just not by the 
route automatically taken by the transitivity-system.

What happened here is not that either of the adapters registered is 
wrong: each does its job in the best way it can.  The programming 
error, which transitivity hides (degrading the quality of information 
resulting from the system -- a subtle kind of degradation that will be 
VERY hard to unearth), is simply that the programmer forgot to register 
the direct adapter.  Without transitivity, the programmer's mistake 
emerges easily and immediately; transitivity hides the mistake.

By imposing transitivity, you're essentially asserting that, if a 
programmer forgets to code and register an A -> C direct adapter, this 
is never a problem, as long as A -> B and B -> C adapters are 
registered, because A -> B -> C will give results just as good as the 
direct A -> C would have, so there's absolutely no reason to trouble 
the programmer about the trivial detail that transitivity is being 
used.

At the same time, if I understand correctly, you're ALSO saying that if 
two other adapters exist, A -> Z and Z -> C, *THEN* it's an error, 
because you don't know when adapting A -> C whether to go via B or via 
Z.  Well, if you consistently believe what I state in the previous 
paragraph, then this is just weird: since you're implicitly asserting 
that any old A->?->C transitive adaptation is just as good as a direct 
A->C, why should you worry about there being more than one such 2-step 
adaptation available?  Roll the dice to pick one and just proceed.

Please note that in the last paragraph I'm mostly trying to "reason by 
absurd": I do NOT believe one can sensibly assert in the general case 
that A->?->C is just as good as A->C, without imposing FAR stronger 
constraints on adaptation than we possibly can (QI gets away with it 
because, designed from scratch, it can and does impose such 
constraints, essentially that all interfaces "belong" to ONE single 
object -- no independent 3rd party adaptation, which may be a bigger 
loss than the constraints gain, actually).

I'm willing to compromise to the extent of letting any given adaptation 
somehow STATE, EXPLICITLY, "this adaptation is lossless and perfect, 
and can be used as a part of transitive chains of adaptation without 
any cost whatsoever".  If we do that, though, the adaptation system 
should trust this assertion, so if there are two possibilities of equal 
minimal length, such as A->B->C or A->Z->C, with all the steps being 
declared lossless and perfect, then it SHOULD just pick one by whatever 
criterion, since both will be equally perfect anyway -- so maybe my 
reasoning by absurd wasn't totally absurd after all;-).
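
The compromise could be prototyped with a tiny registry in which only adapters explicitly declared lossless participate in chains (everything here, names included, is an illustrative sketch, not PEP 246 API):

```python
# (src, dst) -> (adapter callable, declared-lossless flag)
registry = {}

def register(src, dst, func, lossless=False):
    registry[(src, dst)] = (func, lossless)

def adapt(obj, src, dst):
    # an explicitly registered direct adapter always wins
    if (src, dst) in registry:
        return registry[(src, dst)][0](obj)
    # otherwise allow one intermediate hop, but only through
    # adapters that declared themselves lossless and perfect
    for (a, b), (f1, ok1) in registry.items():
        if a == src and ok1:
            f2, ok2 = registry.get((b, dst), (None, False))
            if f2 is not None and ok2:
                return f2(f1(obj))
    raise TypeError("no adaptation path from %r to %r" % (src, dst))

register("A", "B", lambda x: x + ["via B"], lossless=True)
register("B", "C", lambda x: x + ["to C"], lossless=True)
register("A", "Z", lambda x: x + ["via Z"])   # lossy: never used in a chain

# transitive adaptation goes through the lossless chain only
assert adapt([], "A", "C") == ["via B", "to C"]

# a direct adapter, once registered, takes precedence over any chain
register("A", "C", lambda x: x + ["direct"])
assert adapt([], "A", "C") == ["direct"]
```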

Would this compromise be acceptable to you?


Alex

From cce at clarkevans.com  Wed Jan 12 00:38:58 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Wed Jan 12 00:39:00 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050111152405.028ec210@mail.telecommunity.com>
References: <5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com>
	<5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>
	<5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com>
	<5.1.1.6.0.20050111152405.028ec210@mail.telecommunity.com>
Message-ID: <20050111233858.GB88115@prometheusresearch.com>

On Tue, Jan 11, 2005 at 03:30:19PM -0500, Phillip J. Eby wrote:
| Clark said he didn't want to assume substitutability; I was pointing out 
| that he could choose to not assume that, if he wished, by implementing an 
| appropriate __conform__ at the base of his hierarchy. 

Oh, that's sufficient.  If someone making a base class wants to assert
that derived classes should check compliance (rather than having it
automagic), then they can do this.  Good enough!

| I don't agree with Clark's use case, but my 
| proposal supports it as a possibility, and yours does not.

It was a straw-man; and I admit, not a particularly compelling one.

| To implement a Liskov violation with my proposal, you do exactly the same 
| as with your proposal, *except* that you can simply return None instead 
| of raising an exception, and the logic for adapt() is more 
| straightforward.

I think I prefer just returning None rather than raising a
specific exception.  The semantics are different: None implies that
other adaptation mechanisms (like a registry) could be tried, while
LiskovException implies that processing halts and no further 
adaptation techniques are to be used.  In this case, None is 
the better choice since it would enable third parties to register 
a wrapper.

Overall, I think both you and Alex are now proposing essentially
the same thing... no?

Best,

Clark

-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From pje at telecommunity.com  Wed Jan 12 00:40:18 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 00:39:22 2005
Subject: [Python-Dev] copy confusion
In-Reply-To: <ca471dc205011114583df08bbf@mail.gmail.com>
References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050111182710.032a10c0@mail.telecommunity.com>

At 02:58 PM 1/11/05 -0800, Guido van Rossum wrote:
>[Phillip]
> > Looks like a bug to me; it breaks the behavior of classic classes, since
> > type(classicInstance) returns InstanceType.
>
>I'm not so sure. I can't seem to break this for classic classes.

Sorry; I was extrapolating from what I thought was Fredrik's description of 
this behavior as a bug, and from examining the history of the code that he 
referenced.  I saw that the current version of that code had evolved 
directly from a version that was retrieving instance.__copy__; I therefore 
assumed that the loss-of-feature Fredrik was reporting was that.

That is, I thought that the problem he was experiencing was that classic 
classes no longer supported __copy__ because this code had changed.  I 
guess I should have looked at other lines of code besides the ones he 
pointed out; sorry about that.  :(


>The only thing this intends to break, and then only for new-style
>classes, is the ability to have __copy__ be an instance variable
>(whose value should be a callable without arguments) -- it must be a
>method on the class. This is the same thing that I've done for all
>built-in operations (__add__, __getitem__ etc.).

Presumably, this is the actual feature loss that Fredrik's describing; i.e. 
lack of per-instance __copy__ on new-style classes.  That would make more 
sense.



> > However, it also looks like it might have been introduced to fix the
> > possibility that calling '__copy__' on a new-style class with a custom
> > metaclass would result in ending up with an unbound method.  (Similar to
> > the "metaconfusion" issue being recently discussed for PEP 246.)
>
>Sorry, my head just exploded. :-(

The issue is that for special attributes (like __copy__, __conform__, etc.) 
that do not have a corresponding type slot, using getattr() is not 
sufficient to obtain slot-like behavior.  This is because 
'aType.__special__' may refer to a __special__ intended for *instances* of 
'aType', instead of the __special__ for aType.

As Armin points out, the only way to fully emulate type slot behavior for 
unslotted special attributes is to perform a search of the __dict__ of each 
type in the MRO of the type of the object for which you wish to obtain the 
special attribute.

So, in this specific case, __copy__ does not have a type slot, so it is 
impossible using getattr (or simple attribute access) to guarantee that you 
are retrieving the correct version of __copy__ in the presence of metaclasses.

This is what Alex and I dubbed "metaconfusion" in discussion of the same 
issue for PEP 246's __adapt__ and __conform__ methods; until they have 
tp_adapt and tp_conform slots, they can have this same problem.
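To make the confusion concrete, here is a small demonstration (using today's metaclass syntax for brevity; the class names are invented):

```python
class Meta(type):
    def __copy__(cls):
        # meant for copying *classes* (instances of Meta), not
        # instances of those classes
        return cls

class Widget(metaclass=Meta):
    pass

w = Widget()

# Plain attribute access on the type finds the metaclass method, already
# bound to Widget -- the wrong target for copying an instance:
found = getattr(type(w), '__copy__', None)
assert found is not None          # getattr is fooled
assert found() is Widget          # ...and it copies the class, not w

# Searching each __dict__ along the MRO correctly reports that Widget
# itself defines no __copy__:
assert not any('__copy__' in klass.__dict__
               for klass in type(w).__mro__)
```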

Alex and I also just speculated that perhaps the stdlib should include a 
function that can do this, so that stdlib modules that define unslotted 
special attributes (such as __copy__) can ensure they work correctly in the 
presence of metaclasses.

From DavidA at ActiveState.com  Wed Jan 12 00:51:33 2005
From: DavidA at ActiveState.com (David Ascher)
Date: Wed Jan 12 00:53:17 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <ca471dc2050111152034af1be7@mail.gmail.com>
References: <1105471586.41e42862b9a39@mcherm.com>	
	<41E45DA7.1030302@ActiveState.com>
	<ca471dc2050111152034af1be7@mail.gmail.com>
Message-ID: <41E46685.2040603@ActiveState.com>

Guido van Rossum wrote:

>>The point is broader, though -- when I get my turn in the time machine,
>>I'll lobby for replacing NameError with UndefinedVariable or something

Strange, my blog reading just hit upon

http://blogs.zdnet.com/open-source/index.php?p=93

...
"Perhaps as open source developers are making their resolutions for 
2005, they could add human-readable error codes to their list?
"

:-)

--david
From pje at telecommunity.com  Wed Jan 12 01:17:23 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 01:16:29 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <20050111233858.GB88115@prometheusresearch.com>
References: <5.1.1.6.0.20050111152405.028ec210@mail.telecommunity.com>
	<5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com>
	<5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>
	<5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com>
	<5.1.1.6.0.20050111152405.028ec210@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050111191422.03b16de0@mail.telecommunity.com>

At 06:38 PM 1/11/05 -0500, Clark C. Evans wrote:
>| To implement a Liskov violation with my proposal, you do exactly the same
>| as with your proposal, *except* that you can simply return None instead
>| of raising an exception, and the logic for adapt() is more
>| straightforward.
>
>I think I prefer just returning None rather than raising a
>specific exception.  The semantics are different: None implies that
>other adaptation mechanisms (like a registry) could be tried, while
>LiskovException implies that processing halts and no further
>adaptation techniques are to be used.  In this case, None is 
>the better choice since it would enable third parties to register 
>a wrapper.
>
>Overall, I think both you and Alex are now proposing essentially
>the same thing... no?

Yes; I'm just proposing shuffling the invocation of things around a bit in 
order to avoid the need for an exception, and in the process slightly 
increasing the number of possible customizations.

Not that I care about those customizations as such; I just would like to 
simplify the protocol.  I suppose there's some educational benefit in 
making somebody explicitly declare that they're a Liskov violator, but it 
seems that if we're going to support it, it should be simple.

From pje at telecommunity.com  Wed Jan 12 01:29:15 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 01:28:19 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050111184644.02bf8ce0@mail.telecommunity.com>

At 12:33 AM 1/12/05 +0100, Alex Martelli wrote:
>But adaptation is not transmission!  It's PERFECTLY acceptable for an 
>adapter to facade: to show LESS information in the adapted object than was 
>in the original.

It's also true that it's acceptable for a router to choose not to forward 
packets, e.g. for security reasons, QoS, etc.  My point was that you seem 
to be using this to conclude that multihop packet forwarding is a bad idea 
in the general case, and that's what doesn't make any sense to me.

More to the point, the error in your example isn't the filtering-out of 
information, it's the adding of a NULL back in.  If NULLs are questionable 
for the target interface, this is not in general a candidate for implicit 
adaptation IMO -- *whether or not transitivity is involved*.

Let's look at the reverse of the float-to-int case for a better 
example.  Should I be able to implicitly adapt a float to a Decimal?  No, 
because I might be making up precision that isn't there.  May I explicitly 
convert a float to a decimal, if I know what I'm doing?  Yes, of 
course.  Just don't expect Python to guess for you.

This is very much like your example; adding a NULL middle name seems to me 
almost exactly like going from float to Decimal with spurious 
precision.  If you know what you're doing, it's certainly allowable to do 
it explicitly, but Python should not do it implicitly.

Thus, my argument is that an adapter like this should never be made part of 
the adapter system, even if there's no transitivity.  However, if you agree 
that such an adapter shouldn't be implicit, then it logically follows that 
there is no problem with allowing transitivity, except of course that 
people may sometimes break the rule.

However, I think we now have an actual opportunity to codify 
sensible adaptation practices, such that the violations will be 
infrequent.  It's also possible that we may be able to define some sort of 
restricted implicitness so that somebody making a noisy adapter can 
implicitly adapt in a limited context.


>What happened here is not that either of the adapters registered is wrong: 
>each does its job in the best way it can.  The programming error, which 
>transitivity hides (degrading the quality of information resulting from 
>the system -- a subtle kind of degradation that will be VERY hard to 
>unearth), is simply that the programmer forgot to register the direct 
>adapter.  Without transitivity, the programmer's mistake emerges easily 
>and immediately; transitivity hides the mistake.

Where we differ is that I believe that if the signal degradation over a 
path isn't acceptable, it shouldn't be made an implicit part of the 
network; it should be an explicitly forced route instead.

Note by the way, that the signal degradation in your example comes from a 
broken adapter: it isn't valid to make up data if you want real data.  It's 
PRAGMATIC, as you say, to make up the data when you don't have a choice, 
but this does not mean it should be AUTOMATIC.

So, we both believe in restricting the automatic and implicit nature of 
adaptation.  The difference is that I'm saying you should be explicit when 
you do something questionable, and you are saying you should be explicit 
when you're doing something that is *not*.  Perhaps as you previously 
suggested, this really is the aspect where we need the BDFL to be the 
tiebreaker.


>At the same time, if I understand correctly, you're ALSO saying that if 
>two other adapters exist, A -> Z and Z -> C, *THEN* it's an error, because 
>you don't know when adapting A -> C whether to go via B or via Z.  Well, 
>if you consistently believe what I state in the previous paragraph, then 
>this is just weird: since you're implicitly asserting that any old A->?->C 
>transitive adaptation is just as good as a direct A->C, why should you 
>worry about there being more than one such 2-step adaptation available?

Because such ambiguities are usually an indication of some *other* error, 
often in the area of interface inheritance transitivity; they rarely occur 
as a direct result of implementing two versions of the same adapter (at 
least in my experience).


>I'm willing to compromise to the extent of letting any given adaptation 
>somehow STATE, EXPLICITLY, "this adaptation is lossless and perfect, and 
>can be used as a part of transitive chains of adaptation without any cost 
>whatsoever".  If we do that, though, the adaptation system should trust 
>this assertion, so if there are two possibilities of equal minimal length, 
>such as A->B->C or A->Z->C, with all the steps being declared lossless and 
>perfect, then it SHOULD just pick one by whatever criterion, since both 
>will be equally perfect anyway -- so maybe my reasoning by absurd wasn't 
>totally absurd after all;-).

Here's the part I don't think you're seeing: interface inheritance 
transitivity has this *exact* same problem, and it's *far* easier to 
stumble into it, assuming you don't start out by declaring adapters that we 
both agree are insane (like filename-to-file).  If your argument is valid 
for adapters, however, then the only logical conclusion is that we cannot 
permit an adapter for a derived interface to be returned when a base 
interface is requested.  Is this your position as well?

From python at rcn.com  Wed Jan 12 01:27:09 2005
From: python at rcn.com (Raymond Hettinger)
Date: Wed Jan 12 01:30:30 2005
Subject: [Python-Dev] Re: [Csv] csv module TODO list
In-Reply-To: <20050110233126.GA14363@janus.swcomplete.com>
Message-ID: <000401c4f83d$7432db40$e841fea9@oemcomputer>

Would the csv module be a good place to add a DBF reader and writer?  

Dbase's dbf file format is one of the oldest, simplest, and most common
database interchange formats.   It can be a good alternative to CSV as a
means of sharing data with pre-existing, non-Python apps.

On the plus side, it has a precise spec, can preserve numeric and date
types, has guaranteed round-trip equivalence, and does not have weird
escape rules.  On the minus side, strings are limited to ASCII without
NULs and the fields are fixed length.
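For reference, the fixed part of a dBASE III header is simple enough to unpack with struct (a quick sketch, not the ASPN recipe; the field names are my own):

```python
import struct

def read_dbf_header(data):
    # dBASE III fixed header, little-endian:
    #   byte 0       version
    #   bytes 1-3    date of last update (year-1900, month, day)
    #   bytes 4-7    number of records  (uint32)
    #   bytes 8-9    header size in bytes (uint16)
    #   bytes 10-11  record size in bytes (uint16)
    version, yy, mm, dd, nrec, hdrsize, recsize = struct.unpack(
        '<4BLHH', data[:12])
    return {'version': version,
            'last_update': (1900 + yy, mm, dd),
            'records': nrec,
            'header_size': hdrsize,
            'record_size': recsize}
```

The field descriptors (32 bytes each) follow immediately after these twelve bytes plus reserved padding, up to header_size.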

I've posted a draft on ASPN.  It interoperates well with the rest of the
CSV module because it also accepts/returns a list of fieldnames and a
sequence of records.

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/362715



Raymond Hettinger

From python at rcn.com  Wed Jan 12 01:52:53 2005
From: python at rcn.com (Raymond Hettinger)
Date: Wed Jan 12 01:56:13 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050111184644.02bf8ce0@mail.telecommunity.com>
Message-ID: <000501c4f841$0c7874c0$e841fea9@oemcomputer>

> Thus, my argument is that an adapter like this should never be made
part
> of
> the adapter system, even if there's no transitivity.  However, if you
> agree
> that such an adapter shouldn't be implicit, then it logically follows
that
> there is no problem with allowing transitivity, except of course that
> people may sometimes break the rule.

At some point, the PEP should be extended to include a list of best
practices and anti-patterns for using adapters.  I don't find issues of
transitivity and implicit conversion to be immediately obvious.

Also, it is not clear to me how or if existing manual adaptation practices
should change.  For example, if I need a file-like interface to a
string, I currently wrap it with StringIO.  How will that change in the
future?  By an explicit adapt/conform pair?  Or by strings knowing how
to conform to file-like requests?


Raymond Hettinger

From andrewm at object-craft.com.au  Wed Jan 12 01:57:16 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Wed Jan 12 01:57:21 2005
Subject: [Python-Dev] Re: [Csv] csv module TODO list 
In-Reply-To: <000401c4f83d$7432db40$e841fea9@oemcomputer> 
References: <000401c4f83d$7432db40$e841fea9@oemcomputer>
Message-ID: <20050112005716.848393C889@coffee.object-craft.com.au>

>Would the csv module be a good place to add a DBF reader and writer?  

I would have thought it would make sense as its own module (in the same
way that we have separate modules that present a common interface for
the different databases), or am I missing something?

I'd certainly like to see a DBF parser in python - reading and writing odd
file formats is bread-and-butter for us contractors... 8-)

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From roeland.rengelink at chello.nl  Wed Jan 12 02:48:43 2005
From: roeland.rengelink at chello.nl (Roeland Rengelink)
Date: Wed Jan 12 03:00:24 2005
Subject: [Python-Dev] PEP 246, redux
Message-ID: <41E481FB.5020003@chello.nl>

I'm trying to understand the relation between Guido's posts on optional 
  static typing and PEP 245 (interfaces) and 246 (adaptation). I have a 
  couple of questions.

PEP 245 proposes to introduce a fundamental distinction between type and 
  interface. However, 245 only introduces a syntax for interfaces, and 
says very little about the semantics of interfaces. (Basically only that 
  if X implements Y then implements(X, Y) will return True). The 
semantics of interfaces are currently only implied by PEP 246, and by 
Guido's  posts referring to 246.

Unfortunately PEP 246 explicitly refuses to decide that protocols are 
245-style interfaces. Therefore, it is not clear to me how acceptance of 
  245 would impact 246.  Specifically, what would be the difference 
between:

x = adapt(obj, a_245_style_interface)

x = adapt(obj, a_protocol_type)

and, if there is no difference, what would the use-case of interfaces be?

Put another way: explicit interfaces and adaptation-based typing seem to 
  be about introducing rigor (dynamic, not static) to Python. Yet, PEP 
245 and 246 seems to go out of their way to give interfaces and 
adaptation as little baggage as possible. So, where is the rigor going 
to come from?

On the one hand this seems very Pythonic - introduce a new feature with 
as little baggage as possible, and see where it evolves from there. Let 
the rigor flow, not from the restrictions of the language, but from the 
  expressive power of the language.

On the other hand: why not, at least:

- explore in 245 how the semantics of interfaces might introduce rigor 
into the language. It would be particularly illuminating to find out in 
what way implementing an interface differs from deriving from an ABC 
and  in what way an interface hierarchy differs semantically from a 
hierarchy  of ABCs

- rewrite 246 under the assumption that 245 (including semantics) has 
been accepted

I would volunteer, but, for those of you who hadn't noticed yet, I don't 
know what I'm talking about.

Cheers,

Roeland Rengelink



From pje at telecommunity.com  Wed Jan 12 04:06:08 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 04:05:14 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <000501c4f841$0c7874c0$e841fea9@oemcomputer>
References: <5.1.1.6.0.20050111184644.02bf8ce0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050111215448.039197b0@mail.telecommunity.com>

At 07:52 PM 1/11/05 -0500, Raymond Hettinger wrote:
>Also, it is not clear to me how or if existing manual adaptation practices
>should change.  For example, if I need a file-like interface to a
>string, I currently wrap it with StringIO.  How will that change in the
>future?  By an explicit adapt/conform pair?  Or by strings knowing how
>to conform to file-like requests?

The goal here is to be able to specify that a function parameter takes, 
e.g. a "readable stream", and that you should be able to either explicitly 
wrap in a StringIO to satisfy this, or *possibly* just pass a string and 
have it work automatically.

If the latter is the case, there are a variety of possible ways it might 
work.  str.__conform__ might recognize the "readable stream" interface, or 
the __adapt__ method of the "readable stream" interface could recognize 
'str'.  Or, Alex's new proposed global type registry might contain an entry 
for 'str,readableStream'.  Which of these scenarios is preferred very 
much depends on a lot of things, like who defined the "readable stream" 
interface, and whether anybody has registered an adapter for it!
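The three possibilities can be sketched as a single lookup order (all names are invented for illustration, and plain getattr is used for brevity, glossing over the metaconfusion issue discussed elsewhere in this thread):

```python
import io

_registry = {}  # hypothetical global registry: (type, protocol) -> adapter

def register(tp, protocol, adapter):
    _registry[(tp, protocol)] = adapter

def adapt(obj, protocol, default=None):
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:                 # 1. ask the object's type
        result = conform(obj, protocol)
        if result is not None:
            return result
    adapt_hook = getattr(protocol, '__adapt__', None)
    if adapt_hook is not None:              # 2. ask the protocol
        result = adapt_hook(obj)
        if result is not None:
            return result
    adapter = _registry.get((type(obj), protocol))
    if adapter is not None:                 # 3. consult the registry
        return adapter(obj)
    return default

class ReadableStream:                       # stand-in protocol marker
    pass

# we can't add a __conform__ method to the built-in str type, which is
# exactly the point above -- but a registry entry works fine:
register(str, ReadableStream, io.StringIO)
stream = adapt("some text", ReadableStream)
assert stream.read() == "some text"
```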

PyProtocols tries to answer this question by allowing you to register 
adapters with interfaces, and then the interface's __adapt__ method will do 
the actual adaptation.  Zope does something similar, at least in that it 
uses the interface's __adapt__ method, but that method actually uses a 
global registry.

Neither PyProtocols nor Zope makes much use of actually implementing 
hand-coded __conform__ or __adapt__ methods, as it's too much trouble for 
something that's so inherently declarative anyway, and only the creator of 
the object class or the interface's type has any ability to define 
adapters that way.  Given that built-in types are often handy sources of 
adaptation (e.g. str-to-StringIO in your example), it isn't practical in 
present-day Python to add a __conform__ method to the str type!

Thus, in the general case it just seems easier to use a per-interface or 
global registry for most normal adaptation, rather than using 
__conform__.  However, having __conform__ exist is a nice "out" for 
implementing unusual custom requirements (like easy dynamic conformance), 
so I don't think it should be removed.

From gvanrossum at gmail.com  Wed Jan 12 05:11:16 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan 12 05:11:19 2005
Subject: [Python-Dev] copy confusion
In-Reply-To: <D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <ca471dc2050111201174b86218@mail.gmail.com>

> Unfortunately, we do have a problem with the code in copy.py:
> 
> class MetaCopyableClass(type):
>      def __copy__(cls):
>          """ code to copy CLASSES of this metaclass """
>      # etc, etc, snipped
> 
> class CopyableClass:
>      __metaclass__ = MetaCopyableClass
>      # rest of class snipped
> 
> x = CopyableClass()
> 
> import copy
> y = copy.copy(x)
> 
> kallisti:/tmp alex$ python x.py
> Traceback (most recent call last):
>    File "x.py", line 14, in ?
>      y = copy.copy(x)
>    File "/usr/local/lib/python2.4/copy.py", line 79, in copy
>      return copier(x)
> TypeError: __copy__() takes exactly 1 argument (2 given)
> kallisti:/tmp alex$
> 
> See?  copy.copy(x) ends up using MetaCopyableClass.__copy__ -- because
> of a getattr on CopyableClass for '__copy__', which gets the
> BOUND-METHOD defined in the metaclass, with im_self being
> CopyableClass.
> 
> I had exactly the same metabug in the pep 246 reference implementation,
> Armin Rigo showed how to fix it in his only recent post.

Don't recall seeing that, but if you or he can fix this without
breaking other stuff, it's clear you should go ahead. (This worked in
2.2, FWIW; it broke in 2.3.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From mdehoon at ims.u-tokyo.ac.jp  Wed Jan 12 08:38:30 2005
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Wed Jan 12 08:37:12 2005
Subject: [Python-Dev] PyOS_InputHook and threads
Message-ID: <41E4D3F6.4070807@ims.u-tokyo.ac.jp>

I have started writing a patch that replaces PyOS_InputHook with 
PyOS_AddInputHook and PyOS_RemoveInputHook. I am a bit confused though on how 
hook functions are supposed to work with threads.

PyOS_InputHook is a pointer to a hook function, which can be defined for example 
in a C extension module.

When Python is running in a single thread, PyOS_InputHook is called ten times 
per second while Python is waiting for the user to type something. This is 
achieved by setting readline's rl_event_hook function to PyOS_InputHook.

When Python uses multiple threads, each thread has its own PyOS_InputHook (I am 
not sure if this was intended). However, with IDLE I noticed that the subprocess 
thread doesn't call its PyOS_InputHook. In IDLE (if I understand correctly how 
it works), one thread takes care of the GUI and the interaction with the user, 
while another thread executes the user's commands. If an extension module sets 
PyOS_InputHook, the PyOS_InputHook belonging to this second thread is set. 
However, this PyOS_InputHook does not get called. Is this simply an oversight? 
What would be a suitable place to add the call to PyOS_InputHook? In other 
words, where does the second thread go idle?

--Michiel.



> On Thu, Dec 09, 2004, Michiel Jan Laurens de Hoon wrote:
> 
>>>
>>> My suggestion is therefore to replace PyOS_InputHook by two functions
>>> PyOS_AddInputHook and PyOS_RemoveInputHook, and let Python keep track of
>>> which hooks are installed. This way, an extension module can add a hook
>>> function without having to worry about other extension modules trying
>>> to use the same hook.
>>> 
>>> Any comments? Would I need to submit a PEP for this proposal?
> 
> 
> Because this is only for the C API, your best bet is to write a patch
> and submit it to SF.  If people whine or it gets rejected, then write a
> PEP.


-- 
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon

From aleax at aleax.it  Wed Jan 12 09:03:59 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 09:04:07 2005
Subject: getting special from type,
	not instance (was Re: [Python-Dev] copy confusion)
In-Reply-To: <ca471dc2050111201174b86218@mail.gmail.com>
References: <cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050111201174b86218@mail.gmail.com>
Message-ID: <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>

Since this bug isn't the cause of Fredrik's problem, I'm changing the 
subject (and keeping discussion of the specific problem that Fredrik 
uncovered under the original subject).

On 2005 Jan 12, at 05:11, Guido van Rossum wrote:
    ...
>> I had exactly the same metabug in the pep 246 reference 
>> implementation,
>> Armin Rigo showed how to fix it in his only recent post.
>
> Don't recall seeing that, but if you or he can fix this without
> breaking other stuff, it's clear you should go ahead. (This worked in
> 2.2, FWIW; it broke in 2.3.)

Armin's fix was to change:

    conform = getattr(type(obj), '__conform__', None)

into:

    for basecls in type(obj).__mro__:
        if '__conform__' in basecls.__dict__:
            conform = basecls.__dict__['__conform__']
            break
    else:
        conform = None  # not found

I have only cursorily examined the rest of the standard library, but it 
seems to me there may be a few other places where getattr is being used 
on a type for this purpose, such as pprint.py, which has a couple of 
occurrences of
     r = getattr(typ, "__repr__", None)

Since this very same replacement is needed in more than one place for 
"get the following special attribute from the type of the object", it 
seems that a function to do it should be introduced in one place and 
used from where it's needed:

def get_from_first_dict(dicts, name, default=None):
    for adict in dicts:
        try:
            return adict[name]
        except KeyError:
            pass
    return default

to be called, e.g. in the above example with '__conform__', as:

conform = get_from_first_dict(
               (basecls.__dict__ for basecls in type(obj).__mro__),
               '__conform__'
           )

The needed function could of course be made less general, by giving 
more work to the function and less work to the caller, all the way down 
to:

def getspecial(obj, name, default=None):
    for basecls in type(obj).__mro__:
        try:
            return basecls.__dict__[name]
        except KeyError:
            pass
    return default

to be called, e.g. in the above example with '__conform__', as:

conform = getspecial(obj, '__conform__')

This has the advantage of not needing the genexp, so it's usable to 
implement the fix in 2.3.5 as well as in 2.4.1.  Moreover, it can 
specialcase old-style class instances to provide the backwards 
compatible behavior, if desired -- that doesn't matter (but doesn't 
hurt) to fix the bug in copy.py, because in that case old-style 
instances have already been special-cased, and it might help to 
avoid breaking anything in other similar bugfixes, so that's what I 
would suggest:

def getspecial(obj, name, default=None):
    if isinstance(obj, types.InstanceType):
        return getattr(obj, name, default)
    for basecls in type(obj).__mro__:
        try:
            return basecls.__dict__[name]
        except KeyError:
            pass
    return default

The tradeoff between using type(obj) and obj.__class__ isn't 
crystal-clear to me, but since the latter might apparently be faked by 
some proxy to survive isinstance calls, type(obj) appears to me to be 
right.


Where in the standard library to place this function is not clear to me 
either.  Since it's going into bugfix-only releases, I assume it 
shouldn't be "published".  Maybe having it as copy._getspecial (i.e. 
with a private name) is best, as long as it's OK to introduce some 
coupling by having (e.g.) pprint call copy._getspecial too.

Performance might be a problem, but the few bugfix locations where a 
getattr would be replaced by this getspecial don't seem to be hotspots, 
so maybe we don't need to worry about it for 2.3 and 2.4 (it might be 
nice to have this functionality "published" in 2.5, of course, and then 
it should probably be made fast).

Feedback welcome -- the actual patch will doubtless be tiny, but it 
would be nice to have it right the first time (as it needs to go into 
both the 2.3 and 2.4 bugfix branches and the 2.5 head).


Alex

From aleax at aleax.it  Wed Jan 12 10:52:10 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 10:52:20 2005
Subject: [Python-Dev] Re: copy confusion
In-Reply-To: <cs1nhu$1jc$1@sea.gmane.org>
References: <cs1jfs$p3d$1@sea.gmane.org><5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<cs1nhu$1jc$1@sea.gmane.org>
Message-ID: <A0A8A124-647F-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 00:30, Fredrik Lundh wrote:

> Guido van Rossum wrote:
>
>> The only thing this intends to break /.../
>
> it breaks classic C types:

True!!!  And it only breaks copy, NOT deepcopy, because of the 
following difference between the two functions in copy.py...:

def deepcopy(x, memo=None, _nil=[]):
    ...
    cls = type(x)
    copier = _deepcopy_dispatch.get(cls)
    if copier:
        y = copier(x, memo)
    else:
        try:
            issc = issubclass(cls, type)
        except TypeError: # cls is not a class (old Boost; see SF #502085)
            issc = 0
        if issc:
            y = _deepcopy_atomic(x, memo)
        else:
            copier = getattr(x, "__deepcopy__", None)

Now:

 >>> x = cElementTree.Element("tag")
 >>> cls = type(x)
 >>> issubclass(cls, type)
False

therefore, copy.deepcopy ends up doing the getattr of '__deepcopy__' on 
x and lives happily ever after.  Function copy.copy does NOT do that 
issubclass check, which is why it breaks Fredrik's code.
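As a sketch of the behavior being argued for -- which already works with a 
plain Python class, since the breakage was specific to classic C types -- 
defining the hooks makes copy.copy and copy.deepcopy use them symmetrically:

```python
import copy

class Node:
    """Plain-Python stand-in for a custom C type with copy hooks."""
    def __init__(self, tag):
        self.tag = tag
    def __copy__(self):
        # Custom shallow-copy hook: what copy.copy() should find and call.
        return Node(self.tag + "-copied")
    def __deepcopy__(self, memo):
        # Custom deep-copy hook, analogous to the working deepcopy path.
        return Node(self.tag + "-deepcopied")

n = Node("tag")
print(copy.copy(n).tag)      # tag-copied
print(copy.deepcopy(n).tag)  # tag-deepcopied
```

The proposed fix simply extends this symmetry to types where
issubclass(type(x), type) is False.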


> (and of course,  custom C types is the only case where I've ever used
> __copy__; the default behavior has worked just fine for all other 
> cases)
>
> for cElementTree, I've worked around this with an ugly __reduce__ hack,
> but that doesn't feel right...

I think you're entirely correct and that we SHOULD bugfix copy.py so 
that function copy, just like function deepcopy, does the getattr from 
the object when not issubclass(cls, type).

The comment suggests that the check is there only for strange cases such 
as "old Boost" (presumably Boost.Python in some previous incarnation), 
but it appears to me that it's working fine for your custom C type and 
that it would work just as well for __copy__ as it seems to do for 
__deepcopy__.

The fix, again, should be a tiny patch -- and it seems to me that we 
should have it for 2.3.5 as well as for 2.4.1 and the HEAD.


Alex

From jim at zope.com  Wed Jan 12 13:45:50 2005
From: jim at zope.com (Jim Fulton)
Date: Wed Jan 12 13:45:56 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <20050111185020.GA28966@prometheusresearch.com>
References: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>	<5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>
	<20050111185020.GA28966@prometheusresearch.com>
Message-ID: <41E51BFE.2090008@zope.com>

Clark C. Evans wrote:
> On Tue, Jan 11, 2005 at 12:54:36PM -0500, Phillip J. Eby wrote:
...
> | * In my experience, incorrectly deriving an interface from another is the 
> | most common source of unintended adaptation side-effects, not adapter 
> | composition
> 
> It'd be nice if interfaces had a way to specify a test-suite that
> could be run against a component which claims to be compliant.   For
> example, it could provide invalid inputs and assert that the proper
> errors are returned, etc.

We've tried this in Zope 3 with very limited success.  In fact,
so far, our attempts have caused more pain than they're worth.
The problem is that interfaces are usually abstract enough that
it's very difficult to write generic tests.  For example,
many objects implement mapping protocols, but place restrictions
on the values stored.  It's hard to provide generic tests that don't
require lots of inconvenient hooks.  There are exceptions of course.
Our ZODB storage tests use a generic storage-interface test, but this
is possible because the ZODB storage interfaces are extremely
concrete.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From p.f.moore at gmail.com  Wed Jan 12 14:44:50 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed Jan 12 14:44:53 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <79990c6b05011205445ea4af76@mail.gmail.com>

On Wed, 12 Jan 2005 00:33:22 +0100, Alex Martelli <aleax@aleax.it> wrote:
> But adaptation is not transmission!  It's PERFECTLY acceptable for an
> adapter to facade: to show LESS information in the adapted object than
> was in the original.  It's PERFECTLY acceptable for an adapter to say
> "this piece of information is not known" when it's adapting an object for
> which that information, indeed, is not known.  It's only CONJOINING the
> two perfectly acceptable adapters, as transitivity by adapter chain
> would do automatically, that you end up with a situation that is
> pragmatically undesirable: asserting that some piece of information is
> not known, when the information IS indeed available -- just not by the
> route automatically taken by the transitivity-system.

[Risking putting my head above the parapet here :-)]

If you have adaptations A->B, B->C, and A->C, I would assume that the
system would automatically use the direct A->C route rather than
A->B->C. I understand that this is what PyProtocols does.

Are you mistakenly thinking that shortest-possible-route semantics
aren't used? Maybe the PEP should explicitly require such semantics.
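The shortest-route rule can be sketched as a breadth-first search over a
registry of one-step adapters (names and API purely illustrative, not
PyProtocols' actual machinery):

```python
from collections import deque

registry = {}  # (src, dst) -> adapter function

def register(src, dst, fn):
    registry[(src, dst)] = fn

def find_path(src, dst):
    """Breadth-first search over registered adapters: returns the
    shortest chain, so a direct src->dst edge always beats src->X->dst."""
    queue = deque([(src, [])])
    seen = {src}
    while queue:
        cur, path = queue.popleft()
        if cur == dst:
            return path
        for (s, d), fn in registry.items():
            if s == cur and d not in seen:
                seen.add(d)
                queue.append((d, path + [fn]))
    return None  # no adaptation possible

register('A', 'B', lambda x: x + '->B')
register('B', 'C', lambda x: x + '->C')
register('A', 'C', lambda x: x + '->C(direct)')

obj = 'A'
for step in find_path('A', 'C'):
    obj = step(obj)
print(obj)  # A->C(direct): the one-step route wins over A->B->C
```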

If I'm missing the point here, I apologise. But I get the feeling that
something's getting lost in the discussions.

Paul.
From p.f.moore at gmail.com  Wed Jan 12 15:00:20 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed Jan 12 15:00:23 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <79990c6b05011206001a5a3805@mail.gmail.com>

On Wed, 12 Jan 2005 00:33:22 +0100, Alex Martelli <aleax@aleax.it> wrote:
> By imposing transitivity, you're essentially asserting that, if a
> programmer forgets to code and register an A -> C direct adapter, this
> is never a problem, as long as A -> B and B -> C adapters are
> registered, because A -> B -> C will give results just as good as the
> direct A -> C would have, so there's absolutely no reason to trouble
> the programmer about the trivial detail that transitivity is being
> used.
[...]
> paragraph, then this is just weird: since you're implicitly asserting
> that any old A->?->C transitive adaptation is just as good as a direct
> A->C, why should you worry about there being more than one such 2-step
> adaptation available?  Roll the dice to pick one and just proceed.

I know this is out-of-context picking, but I don't think I've ever
seen anyone state that A->?->C is "just as good as" a direct A->C. I
would have thought it self-evident that a shorter adaptation path is
always better. And specifically, I know that Philip has stated that
PyProtocols applies a shorter-is-better algorithm.

Having pointed this out, I'll go back to lurking. You two are doing a
great job of converging on something so far, so I'll let you get on
with it.

Paul.
From aleax at aleax.it  Wed Jan 12 15:06:49 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 15:06:53 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <79990c6b05011205445ea4af76@mail.gmail.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
Message-ID: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 14:44, Paul Moore wrote:

> On Wed, 12 Jan 2005 00:33:22 +0100, Alex Martelli <aleax@aleax.it> 
> wrote:
>> But adaptation is not transmission!  It's PERFECTLY acceptable for an
>> adapter to facade: to show LESS information in the adapted object than
>> was in the original.  It's PERFECTLY acceptable for an adapter to say
>> "this piece of information is not known" when it's adapting an object for
>> which that information, indeed, is not known.  It's only CONJOINING 
>> the
>> two perfectly acceptable adapters, as transitivity by adapter chain
>> would do automatically, that you end up with a situation that is
>> pragmatically undesirable: asserting that some piece of information is
>> not known, when the information IS indeed available -- just not by the
>> route automatically taken by the transitivity-system.
>
> [Risking putting my head above the parapet here :-)]
>
> If you have adaptations A->B, B->C, and A->C, I would assume that the
> system would automatically use the direct A->C route rather than
> A->B->C. I understand that this is what PyProtocols does.

Yes, it is.

> Are you mistakenly thinking that shortest-possible-route semantics
> aren't used? Maybe the PEP should explicitly require such semantics.

No, I'm not.  I'm saying that if, by mistake, the programmer has NOT 
registered the A->C adapter (which would be easily coded and work 
perfectly), then thanks to transitivity, instead of a clear and simple 
error message leading to immediate diagnosis of the error, they'll get 
a subtle unnecessary degradation of information and resulting reduction 
in information quality.

PyProtocols' author claims this can't happen, because registering the 
adapters A->B and B->C amounts to claiming that each is lossless and 
perfect.  However, inconsistently with that stance, I believe that 
PyProtocols does give an error message if it finds two adaptation paths 
of equal minimum length, say A->B->C and A->Z->C.  If each adaptation 
step is truly believed to be lossless and perfect, it's inconsistent to 
consider the existence of two equal-length paths an error... either 
path should be perfect, so just picking one of them should be a 
perfectly correct strategy.
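The equal-length tie Alex describes can be made concrete with a toy path
enumerator (illustrative only; real registries do this incrementally):

```python
def all_paths(edges, src, dst, path=None):
    """Enumerate every simple (cycle-free) chain of one-step adapters."""
    path = path or [src]
    if src == dst:
        yield path
        return
    for s, d in edges:
        if s == src and d not in path:
            yield from all_paths(edges, d, dst, path + [d])

# Two equally short routes from A to C, neither registered directly:
edges = [('A', 'B'), ('B', 'C'), ('A', 'Z'), ('Z', 'C')]
paths = list(all_paths(edges, 'A', 'C'))
shortest = min(len(p) for p in paths)
candidates = [p for p in paths if len(p) == shortest]
ambiguous = len(candidates) > 1
print(candidates)   # [['A', 'B', 'C'], ['A', 'Z', 'C']]
print(ambiguous)    # True: the registry must pick arbitrarily or complain
```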

> If I'm missing the point here, I apologise. But I get the feeling that
> something's getting lost in the discussions.

The discussions on this subject always and invariably get extremely 
long (and often somewhat heated, too), so it's quite possible that a 
lot is getting lost along the way, particularly to any other reader 
besides the two duelists.  Thus, thanks for focusing on one point that 
might well be missed by other readers (though not by either PJE or 
me;-) and giving me a chance to clarify it!


Alex

From marktrussell at btopenworld.com  Wed Jan 12 15:27:08 2005
From: marktrussell at btopenworld.com (Mark Russell)
Date: Wed Jan 12 15:27:11 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <1105540027.5326.19.camel@localhost>

I strongly prefer *not* to have A->B and B->C automatically used to
construct A->C.  Explicit is better than implicit, if in doubt don't
guess, etc etc.

So I'd support:

    - As a first cut, no automatic transitive adaptation

    - Later, and only if experience shows there's a need for it,
      add a way to say  "this adaptor can be used as part of a
      transitive chain"

Mark Russell

From aleax at aleax.it  Wed Jan 12 15:48:41 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 15:48:45 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <79990c6b05011206001a5a3805@mail.gmail.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011206001a5a3805@mail.gmail.com>
Message-ID: <0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 15:00, Paul Moore wrote:

> On Wed, 12 Jan 2005 00:33:22 +0100, Alex Martelli <aleax@aleax.it> 
> wrote:
>> By imposing transitivity, you're essentially asserting that, if a
>> programmer forgets to code and register an A -> C direct adapter, this
>> is never a problem, as long as A -> B and B -> C adapters are
>> registered, because A -> B -> C will give results just as good as the
>> direct A -> C would have, so there's absolutely no reason to trouble
>> the programmer about the trivial detail that transitivity is being
>> used.
> [...]
>> paragraph, then this is just weird: since you're implicitly asserting
>> that any old A->?->C transitive adaptation is just as good as a direct
>> A->C, why should you worry about there being more than one such 2-step
>> adaptation available?  Roll the dice to pick one and just proceed.
>
> I know this is out-of-context picking, but I don't think I've ever
> seen anyone state that A->?->C is "just as good as" a direct A->C. I
> would have thought it self-evident that a shorter adaptation path is
> always better. And specifically, I know that Philip has stated that
> PyProtocols applies a shorter-is-better algorithm.

Yes, he has.  If A->C is registered as a direct adaptation, it gets 
used and everybody lives happily ever after.  The controversy comes 
when A->C is *NOT* registered as a direct adaptation.

If there is no degradation of information quality, etc, at any 
intermediate step, picking the shortest path is still sensible because 
of likely performance consideration.  Each adaptation step might put 
some kind of wrapper/proxy/adapter object in the mix, delegate calls, 
etc.  Of course, it's possible that some such wrappers are coded much 
more tightly &c, so that in fact some roundabout A -> X1 -> X2 -> C 
would actually perform better than either A -> B -> C or A -> Z 
-> C, but using one of the shortest available paths appears to be a 
reasonable heuristic for what, if one "assumes away" any degradation, 
is after all a minor issue.

Demanding that the set of paths of minimal available length has exactly 
one element is strange, though, IF one is assuming that all adaptation 
paths are exactly equivalent except at most for secondary issues of 
performance (which are only adjudicated by the simplest heuristic: if 
those issues were NOT considered minor/secondary, then a more 
sophisticated scheme would be warranted, e.g. by letting the programmer 
associate a cost to each step, picking the lowest-cost path, AND 
letting the caller of adapt() also specify the maximal acceptable cost 
or at least obtain the cost associated with the chosen path).
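That "more sophisticated scheme" is just weighted shortest-path search; a
hypothetical version using Dijkstra's algorithm with per-adapter costs and a
caller-supplied maximum acceptable cost:

```python
import heapq

def cheapest_path(edges, src, dst, max_cost=float('inf')):
    """edges: {(src, dst): cost}.  Returns (total_cost, node_path) for the
    lowest-cost adaptation chain, or None if nothing fits under max_cost."""
    heap = [(0, src, [src])]
    best = {}
    while heap:
        cost, cur, path = heapq.heappop(heap)
        if cost > max_cost:
            return None  # cheapest remaining candidate already too costly
        if cur == dst:
            return cost, path
        if best.get(cur, float('inf')) <= cost:
            continue  # already reached this node more cheaply
        best[cur] = cost
        for (s, d), c in edges.items():
            if s == cur:
                heapq.heappush(heap, (cost + c, d, path + [d]))
    return None

# The programmer-assigned costs make the 2-step chain beat the direct
# (but expensive) adapter:
edges = {('A', 'B'): 1, ('B', 'C'): 1, ('A', 'C'): 5}
print(cheapest_path(edges, 'A', 'C'))  # (2, ['A', 'B', 'C'])
```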

Personally, I disagree with having transitivity at all, unless perhaps 
it be restricted to adaptations specifically and explicitly stated to 
be "perfect and lossless"; PJE claims that ALL adaptations MUST, 
ALWAYS, be "perfect and lossless" -- essentially, it seems to me, he 
_has_ to claim that, to defend transitivity being applied 
automatically, relentlessly, NON-optionally, NON-selectively (but then 
the idea of giving an error when two or more shortest-paths have the 
same length becomes dubious).

Even C++ at least lets a poor beleaguered programmer assert that a 
conversion (C++ does not have adaptation, but it does have conversion) 
is _EXPLICIT_, meaning that it only applies as a single isolated step 
and NOT as a part of one of those automatic transitive 
conversion-chains which so often produce amazing, hell-to-debug 
results.  That's the wrong default (explicit should be the default, 
"usable transitively" should need to be asserted outright), explainable 
by the usual historical and backwards-compatibility reasons (just like 
having methods default to non-virtual, etc, etc), but at least it's 
THERE -- a way to stop transitivity and restore sanity.  I have not yet 
seen PJE willing to compromise on this point -- having two categories 
or grades of adaptations, one "perfect, lossless, noiseless, 
impeccable" usable transitively and one "sublunar" NOT usable 
transitively.  ((If he was, we could still argue on which one should be 
the default;-))


Much the same applies to inheritance, BTW, which as PJE has pointed out 
a few times also induces transitivity in adaptation, and, according to 
him, is a more likely cause of bugs than chains of adapters (I have no 
reason to doubt that any way to induce transitivity without very 
clearly stating that one WANTS that effect can be extremely bug-prone).

So, yes, I'd _also_ love to have two grades of inheritance, one of the 
"total commitment" kind (implying transitivity and whatever), and one 
more earthly a la ``I'm just doing some convenient reuse, leave me 
alone''.  Here, regretfully, I'll admit C++ has the advantage, since 
``private inheritance'' is exactly that inferior, implementation-only 
kind (I'm perfectly happy with Python NOT having private methods nor 
attributes, but private INHERITANCE sometimes I still miss;-).  Ah 
well, can't have everything.  While I hope we can offer some lifesaver 
to those poor practicing programmers whose inheritance-structures 
aren't always perfect and pristine, if the only way to treat 
interface-inheritance is the Hyperliskovian one, ah well, we'll 
survive.
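Lacking private inheritance, Python can approximate the "convenient reuse, 
no commitment" grade with composition and delegation -- behavior is reused 
without any substitutability claim (a sketch):

```python
class Engine:
    def start(self):
        return "vroom"

class Car:
    """Reuses Engine's behavior without subclassing it, so no
    isinstance/adaptation relationship to Engine is asserted."""
    def __init__(self):
        self._engine = Engine()
    def start(self):
        # Delegate rather than inherit: the "private inheritance" effect.
        return self._engine.start()

car = Car()
print(car.start())              # vroom
print(isinstance(car, Engine))  # False: no commitment made
```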

BTW, Microsoft's COM's interfaces ONLY have the "inferior" kind of 
inheritance.  You can say that interface ISub inherits from IBas: this 
means that ISub has all the same methods as IBas with the same 
signatures, plus it may have other methods; it does *NOT* mean that 
anything implementing ISub must also implement IBas, nor that a 
QueryInterface on an ISub asking for an IBas must succeed, or anything 
of that kind.  In many years of COM practice I have NEVER found this 
issue to be a limitation -- it works just fine.  I do not know CORBA 
anywhere as well as I do COM, but, doesn't CORBA interface inheritance, 
per OMG's standards, also work that way?


Alex

From pje at telecommunity.com  Wed Jan 12 16:12:26 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 16:12:30 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <1105540027.5326.19.camel@localhost>
References: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>

At 02:27 PM 1/12/05 +0000, Mark Russell wrote:
>I strongly prefer *not* to have A->B and B->C automatically used to
>construct A->C.  Explicit is better than implicit, if in doubt don't
>guess, etc etc.
>
>So I'd support:
>
>     - As a first cut, no automatic transitive adaptation
>
>     - Later, and only if experience shows there's a need for it,

Well, by the experience of the people who use it, there is a need, so it's 
already "later".  :)  And at least my experience *also* shows that 
transitive interface inheritance with adaptation is much easier to 
accidentally screw up than transitive adapter composition -- despite the 
fact that nobody objects to the former.

But if you'd like to compare the two approaches pragmatically, try using 
both zope.interface and PyProtocols, and see what sort of issues you run 
into.  They have pretty much identical interface syntax, and you can use 
the PyProtocols declaration API and 'adapt' function to do interface 
declarations for either Zope interfaces or PyProtocols interfaces -- and 
the adaptation semantics follow Zope if you're using Zope interfaces.  So, 
you can literally flip between the two by changing where you import the 
'Interface' class from.

Both Zope and PyProtocols support the previous draft of PEP 246; the new 
draft adds only two new features:

* Ability for a class to opt out of the 'isinstance()' check for a base 
class (i.e., for a class to say it's not substitutable for its base class, 
for Alex's "private inheritance" use case)

* Ability to have a global type->protocol adapter registry

Anyway, I'm honestly curious as to whether anybody can find a real 
situation where transitive adapter composition is an *actual* problem, as 
opposed to a theoretical one.  I've heard a lot of people talk about what a 
bad idea it is, but I haven't heard any of them say they actually tried 
it.  Conversely, I've also heard from people who *have* tried it, and liked 
it.  However, at this point I have no way to know if this dichotomy is just 
a reflection of the fact that people who don't like the idea don't try it, 
and the people who either like the idea or don't care are open to trying it.

The other thing that really blows my mind is that the people who object to 
the idea don't get that transitive interface inheritance can produce the 
exact same problem, and that it's more likely to happen in actual 
*practice* than the composition problem is in theory.

As for the issue of what should and shouldn't exist in Python, it doesn't 
really matter; PEP 246 doesn't (and can't!) *prohibit* transitive 
adaptation.  However, I do strongly object to the spreading of theoretical 
FUD about a practical, useful technique, much as I would object to people 
saying that "using significant whitespace is braindead" who had never tried 
actually using Python.  The theoretical problems with transitive adapter 
composition are in my experience just as rare as whitespace-eating 
nanoviruses from outer space.


From gvanrossum at gmail.com  Wed Jan 12 16:26:36 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan 12 16:26:40 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <ca471dc205011207261a8432c@mail.gmail.com>

[Alex]
> I'm saying that if, by mistake, the programmer has NOT
> registered the A->C adapter (which would be easily coded and work
> perfectly), then thanks to transitivity, instead of a clear and simple
> error message leading to immediate diagnosis of the error, they'll get
> a subtle unnecessary degradation of information and resulting reduction
> in information quality.

I understand, but I would think that there are just as many examples
of cases where having to register a trivial A->C adapter is much more
of a pain than it's worth; especially if there are a number of A->B
pairs and a number of B->C pairs, the number of additional A->C pairs
needed could be bewildering.

But I would like to see some input from people with C++ experience.
C++ goes to great lengths to pick automatic conversions (which perhaps
aren't quite the same as adaptations but close enough for this
comparison to work) and combine them. *In practice*, is this a benefit
or a liability?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From aleax at aleax.it  Wed Jan 12 16:36:35 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 16:36:41 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
References: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
Message-ID: <BE11893E-64AF-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 16:12, Phillip J. Eby wrote:

> At 02:27 PM 1/12/05 +0000, Mark Russell wrote:
>> I strongly prefer *not* to have A->B and B->C automatically used to
>> construct A->C.  Explicit is better than implicit, if in doubt don't
>> guess, etc etc.
>>
>> So I'd support:
>>
>>     - As a first cut, no automatic transitive adaptation
>>
>>     - Later, and only if experience shows there's a need for it,
>
> Well, by the experience of the people who use it, there is a need, so 
> it's already "later".  :)  And at least my experience *also* shows 
> that transitive interface inheritance with adaptation is much easier 
> to accidentally screw up than transitive adapter composition -- 
> despite the fact that nobody objects to the former.

A-hem -- I *grumble* about the former (and more generally the fact that 
inheritance is taken as so deucedly *committal*:-).  If it doesn't 
really count as a "complaint" it's only because I doubt I can do 
anything about it and I don't like tilting at windmills.  But, I _DO_ 
remember Microsoft's COM, with inheritance of interface *NOT* implying 
anything whatsoever (except the fact that the inheriting one has all 
the methods of the inherited one with the same signature, w/o having to 
copy and paste, plus of course you can add more) -- I remember that 
idea with fondness, as I do many other features of a component system 
that, while definitely not without defects, was in many respects a 
definite improvement over its successors.

> The other thing that really blows my mind is that the people who 
> object to the idea don't get that transitive interface inheritance can 
> produce the exact same problem, and it's more likely to happen in 
> actual *practice*, than it is in theory.

Believe me, I'm perfectly glad to believe that [a] implied transitivity 
in any form, and [b] hypercommittal inheritance, cause HUGE lots of 
problems; and to take your word that the combination is PARTICULARLY 
bug-prone in practice.  It's just that I doubt I can do anything much 
to help the world avoid that particular blight.

> As for the issue of what should and shouldn't exist in Python, it 
> doesn't really matter; PEP 246 doesn't (and can't!) *prohibit* 
> transitive adaptation.  However, I do strongly object to the spreading 
> of theoretical FUD about a practical, useful technique, much as I 
> would object to people saying that "using significant whitespace is 
> braindead" who had never tried actually using Python.  The theoretical 
> problems with transitive adapter composition are in my experience just 
> as rare as whitespace-eating nanoviruses from outer space.

Well, I'm not going to start real-life work on a big and complicated 
system (the kind where such problems would emerge) relying on a 
technique I'm already dubious about, if I have any say in the matter, 
so of course I'm unlikely to gain much real-life experience -- I'm 
quite content, unless somebody should be willing to pay me adequately 
for my work yet choose to ignore my advice in the matter;-), to rely on 
imperfect analogies with other experiences based on other kinds of 
unwanted and unwarranted but uncontrollable and unstoppable 
applications of transitivity by underlying systems and frameworks.

I already know -- you told us so -- that if I had transitivity as you 
wish it (uncontrollable, unstoppable, always-on) I could not any more 
write and register a perfectly reasonable adapter which fills in with a 
NULL an optional field in the adapted-to interface, without facing 
undetected degradation of information quality by that adapter being 
invisibly, uncontrollably chained up with another -- no error message, 
no nothing, no way to stop this -- just because a direct adapter wasn't 
correctly written and registered.  Just this "detail", for me, is 
reason enough to avoid using any framework that imposes such 
noncontrollable transitivity, if I possibly can.


Alex

From theller at python.net  Wed Jan 12 16:44:57 2005
From: theller at python.net (Thomas Heller)
Date: Wed Jan 12 16:43:35 2005
Subject: getattr and __mro__ (was Re: [Python-Dev] PEP 246, redux)
In-Reply-To: <20050111124157.GA16642@vicky.ecs.soton.ac.uk> (Armin Rigo's
	message of "Tue, 11 Jan 2005 12:41:57 +0000")
References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<20050111124157.GA16642@vicky.ecs.soton.ac.uk>
Message-ID: <fz16hcsm.fsf_-_@python.net>

Armin Rigo <arigo@tunes.org> writes:

> ... is that the __adapt__() and __conform__() methods should work just
> like all other special methods for new-style classes.  The confusion
> comes from the fact that the reference implementation doesn't do that.
> It should be fixed by replacing:
>
>    conform = getattr(type(obj), '__conform__', None)
>
> with:
>
>    for basecls in type(obj).__mro__:
>        if '__conform__' in basecls.__dict__:
>            conform = basecls.__dict__['__conform__']
>            break
>    else:
>        # not found
>
> and the same for '__adapt__'.
>
> The point about tp_xxx slots is that when implemented in C with slots, you get
> the latter (correct) effect for free.  This is how metaconfusion is avoided in
> post-2.2 Python.  Using getattr() for that is essentially broken.  Trying to
> call the method and catching TypeErrors seems pretty fragile -- e.g. if you
> are calling a __conform__() which is implemented in C you won't get a Python
> frame in the traceback either.

I'm confused.  Do you mean that

   getattr(obj, "somemethod")(...)

does something different than

   obj.somemethod(...)

with new style class instances?  Doesn't getattr search the __dict__'s
along the __mro__ list?
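For concreteness, the two lookups differ on instance attributes:
getattr(obj, name) also consults the instance __dict__ (and __getattr__
hooks), while an __mro__ scan sees only the classes -- exactly the
distinction that matters for special-method lookup:

```python
class Widget:
    def __conform__(self, protocol):
        return "class-level"

def mro_lookup(obj, name):
    # Special-method style lookup: classes only, instance dict ignored.
    for basecls in type(obj).__mro__:
        if name in basecls.__dict__:
            return basecls.__dict__[name]
    return None

w = Widget()
w.__conform__ = lambda protocol: "instance-level"  # shadow on the instance

print(getattr(w, '__conform__')('P'))        # instance-level
print(mro_lookup(w, '__conform__')(w, 'P'))  # class-level
```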

Thomas

From gvanrossum at gmail.com  Wed Jan 12 16:45:55 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan 12 16:46:02 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011206001a5a3805@mail.gmail.com>
	<0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <ca471dc2050112074562485f50@mail.gmail.com>

[Alex]
> Of course, it's possible that some such wrappers are coded much
> more tighter &c, so that in fact some roundabout A -> X1 -> X2 -> C
> would actually be better performing than either A -> B -> C or A -> Z
> -> C, but using one of the shortest available paths appears to be a
> reasonable heuristic for what, if one "assumes away" any degradation,
> is after all a minor issue.

I would think that the main reason for preferring the shortest path is
the two degenerate cases, A->A (no adaptation necessary) and A->C (a
direct adapter is available). These are always preferable over longer
possibilities.

> Demanding that the set of paths of minimal available length has exactly
> one element is strange, though,

I think you're over-emphasizing this point (in several messages);
somehow you sound a bit like you're triumphant over having found a bug
in your opponent's reasoning.

[...]
> So, yes, I'd _also_ love to have two grades of inheritance, one of the
> "total commitment" kind (implying transitivity and whatever), and one
> more earthly a la ``I'm just doing some convenient reuse, leave me
> alone''.

I'll bet that the list of situations where occasionally you wish you
had more control over Python's behavior is a lot longer than that, and
I think that if we started implementing that wish list (or anybody's
wish list), we would soon find that we had destroyed Python's charming
simplicity.

My personal POV here: even when you break Liskov in subtle ways, there
are lots of situations where assuming substitutability has no ill
effects, so I'm happy to pretend that a subclass is always a subtype
of all of its base classes, (and their base classes, etc.). If it
isn't, you can always provide an explicit adapter to rectify things.

As an example where a subclass that isn't a subtype can be used
successfully, consider a base class that defines addition to instances
of the same class. Now consider a subclass that overrides addition to
only handle addition to instances of that same subclass; this is a
Liskov violation. Now suppose the base class also has a factory
function that produces new instances, and the subclass overrides this
to produce new instances of the subclass. Then a function designed to
take an instance of the base class and return the sum of the instances
produced by calling the factory method a few times will work perfectly
with a subclass instance as argument. Concrete:

class B:
    def add(self, other: B) -> B: ...
    def factory(self) -> B: ...

class C(B):
    def add(self, other: C) -> C: ... # "other: C" violates Liskov
    def factory(self) -> C: ...

def foo(x: B) -> B:
    x1 = x.factory()
    x2 = x.factory()
    return x1.add(x2)

This code works fine in today's python if one leaves the type
declarations out. I don't think anybody is served by forbidding it.
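For reference, here is the same example with the type declarations dropped and the elided method bodies filled in minimally (the bodies are my own, just enough to make it runnable):

```python
class B:
    def add(self, other):
        # intended contract: other is any B, result is a B
        return B()

    def factory(self):
        return B()

class C(B):
    def add(self, other):
        # narrowed to C only -- the Liskov violation
        if not isinstance(other, C):
            raise TypeError("C.add only accepts C instances")
        return C()

    def factory(self):
        return C()

def foo(x):
    # written against B, but both operands come from x's own factory,
    # so a C argument never triggers the narrowed precondition
    x1 = x.factory()
    x2 = x.factory()
    return x1.add(x2)

print(type(foo(B())).__name__)  # B
print(type(foo(C())).__name__)  # C
```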

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From gvanrossum at gmail.com  Wed Jan 12 16:49:20 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan 12 16:49:25 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<1105540027.5326.19.camel@localhost>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
Message-ID: <ca471dc205011207491db5457@mail.gmail.com>

[Phillip]
> As for the issue of what should and shouldn't exist in Python, it doesn't
> really matter; PEP 246 doesn't (and can't!) *prohibit* transitive
> adaptation.

Really? Then isn't it underspecified? I'd think that by the time we
actually implement PEP 246 in the Python core, this part of the
semantics should be specified (at least the default behavior, even if
there are hooks to change this).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From aleax at aleax.it  Wed Jan 12 16:50:55 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 16:51:00 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <ca471dc205011207261a8432c@mail.gmail.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
Message-ID: <BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 16:26, Guido van Rossum wrote:
    ...
> [Alex]
>> I'm saying that if, by mistake, the programmer has NOT
>> registered the A->C adapter (which would be easily coded and work
>> perfectly), then thanks to transitivity, instead of a clear and simple
>> error message leading to immediate diagnosis of the error, they'll get
>> a subtle unnecessary degradation of information and resulting 
>> reduction
>> in information quality.
>
> I understand, but I would think that there are just as many examples
> of cases where having to register a trivial A->C adapter is much more
> of a pain than it's worth; especially if there are a number of A->B
> pairs and a number of B->C pairs, the number of additional A->C pairs
> needed could be bewildering.

Hm?  For any A and B there can be only one A->B adapter registered.  Do 
you mean a number of A->B1, B1->C1 ; A->B2, B2->C2; etc?  Because if it 
was B1->C and B2->C, as I understand the transitivity of PyProtocols, 
it would be considered an error.

> But I would like to see some input from people with C++ experience.

Here I am, at your service.  I've done, taught, mentored, etc, much 
more C++ than Python in my life.  I was technical leader for the whole 
C -> C++ migration of a SW house which at that time had more than 100 
programmers (just as I had earlier been for the Fortran -> C migration 
back a few years previously, with around 30 programmers): I taught 
internal courses, seminars and workshops on C++, its differences from 
C, OO programming and design, Design Patterns, and later generic 
programming, the STL, and so on, and so forth.  I mentored a lot of 
people (particularly small groups of people that would later go and 
teach/mentor the others), pair-programmed in the most critical 
migrations across the breadth of that SW house's software base, etc, 
etc.  FWIW, having aced Brainbench's C++ tests (I was evaluating them 
to see if it would help us select among candidates claiming C++ 
skills), I was invited to serve for a while as one of their "Most 
Valued Professionals" (MVPs) for C++, and although I had concluded that 
for that SW house's purposes the tests weren't all that useful, I did, 
trying to see if I could help make them better (more suitable to test 
_real-world_ skills and less biased in favour of people with that 
"language-lawyer" or "library-packrat" kind of mentality I have, which 
is more useful in tests than out in the real world).

I hope I can qualify as a C++ expert by any definition.

> C++ goes to great lengths to pick automatic conversions (which perhaps
> aren't quite the same as adaptations but close enough for this
> comparison to work)

I agree with you, though I believe PJE doesn't (he doesn't accept my 
experience with such conversions as a valid reason for me to be afraid 
of "close enough for this comparison" adaptations).

>  and combine them. *In practice*, is this a benefit
> or a liability?

It's in the running for the coveted "Alex's worst nightmare" prize, 
with a few other features of C++ - alternatively put, the prize for 
"reason making Alex happiest to have switched to Python and _almost_ 
managed to forget C++ save when he wakes up screaming in the middle of 
the night";-).


Alex

From gvanrossum at gmail.com  Wed Jan 12 16:52:15 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan 12 16:52:18 2005
Subject: getattr and __mro__ (was Re: [Python-Dev] PEP 246, redux)
In-Reply-To: <fz16hcsm.fsf_-_@python.net>
References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<20050111124157.GA16642@vicky.ecs.soton.ac.uk>
	<fz16hcsm.fsf_-_@python.net>
Message-ID: <ca471dc2050112075221c5ee09@mail.gmail.com>

[Armin]
> > ... is that the __adapt__() and __conform__() methods should work just
> > like all other special methods for new-style classes.  The confusion
> > comes from the fact that the reference implementation doesn't do that.
> > It should be fixed by replacing:
> >
> >    conform = getattr(type(obj), '__conform__', None)
> >
> > with:
> >
> >    for basecls in type(obj).__mro__:
> >        if '__conform__' in basecls.__dict__:
> >            conform = basecls.__dict__['__conform__']
> >            break
> >    else:
> >        # not found
> >
> > and the same for '__adapt__'.
> >
> > The point about tp_xxx slots is that when implemented in C with slots, you get
> > the latter (correct) effect for free.  This is how metaconfusion is avoided in
> > post-2.2 Python.  Using getattr() for that is essentially broken.  Trying to
> > call the method and catching TypeErrors seems pretty fragile -- e.g. if you
> > are calling a __conform__() which is implemented in C you won't get a Python
> > frame in the traceback either.

[Thomas]
> I'm confused.  Do you mean that
> 
>    getattr(obj, "somemethod")(...)
> 
> does something different than
> 
>    obj.somemethod(...)
> 
> with new style class instances?  Doesn't getattr search the __dict__'s
> along the __mro__ list?

No, he's referring to the (perhaps not widely advertised) fact that

    obj[X]

is not quite the same as

    obj.__getitem__(X)

since the explicit method invocation will find
obj.__dict__["__getitem__"] if it exists but the operator syntax will
start the search with obj.__class__.__dict__.
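A quick demonstration of that difference (the class is made up; shadowing a special method in the instance __dict__ is contrived, but it shows exactly where each lookup starts):

```python
class Seq(object):
    def __getitem__(self, i):
        return i * 2

s = Seq()
print(s[3])               # 6 -- operator syntax starts at type(s)

# Shadow the method in the instance __dict__:
s.__getitem__ = lambda i: -1
print(s.__getitem__(3))   # -1 -- explicit lookup finds the instance attribute
print(s[3])               # 6 -- operator syntax still ignores it
```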

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From aleax at aleax.it  Wed Jan 12 16:52:53 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 16:52:59 2005
Subject: getattr and __mro__ (was Re: [Python-Dev] PEP 246, redux)
In-Reply-To: <fz16hcsm.fsf_-_@python.net>
References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
	<5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
	<20050111124157.GA16642@vicky.ecs.soton.ac.uk>
	<fz16hcsm.fsf_-_@python.net>
Message-ID: <052FE6C7-64B2-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 16:44, Thomas Heller wrote:
    ...
>>    conform = getattr(type(obj), '__conform__', None)
    ...
> I'm confused.  Do you mean that
>
>    getattr(obj, "somemethod")(...)
>
> does something different than
>
>    obj.somemethod(...)
>
> with new style class instances?  Doesn't getattr search the __dict__'s
> along the __mro__ list?

Yes, but getattr(obj, ... ALSO searches obj itself, which is what we're 
trying to avoid here.

getattr(type(obj), ... OTOH has a DIFFERENT problem -- it ALSO searches 
type(type(obj)), the metaclass, which we DON'T want.
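The metaclass leak is easy to reproduce (Meta and Obj are made-up names, and the loop below is just the __mro__-scanning approach under discussion, in modern class syntax):

```python
class Meta(type):
    def __conform__(cls, protocol):
        # meant to make *classes* adaptable, not their instances
        return "class-level conform"

class Obj(metaclass=Meta):
    pass

obj = Obj()

# getattr on type(obj) also searches the metaclass -- the unwanted hit:
print(getattr(type(obj), '__conform__', None) is not None)  # True

# Scanning type(obj).__mro__ only sees Obj and object, so nothing is found,
# which is the desired behaviour here:
conform = None
for basecls in type(obj).__mro__:
    if '__conform__' in basecls.__dict__:
        conform = basecls.__dict__['__conform__']
        break
print(conform)  # None
```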


Alex

From aleax at aleax.it  Wed Jan 12 17:02:11 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 17:02:16 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <ca471dc2050112074562485f50@mail.gmail.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011206001a5a3805@mail.gmail.com>
	<0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050112074562485f50@mail.gmail.com>
Message-ID: <51767A6A-64B3-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 16:45, Guido van Rossum wrote:

> My personal POV here: even when you break Liskov in subtle ways, there
> are lots of situations where assuming substitutability has no ill
> effects, so I'm happy to pretend that a subclass is always a subtype
> of all of its base classes, (and their base classes, etc.). If it
> isn't, you can always provide an explicit adapter to rectify things.

Ah, this is the crucial point: an explicit adapter must take precedence 
over substitutability that is assumed by subclassing.  From my POV this 
does just as well as any other kind of explicit control about whether 
subclassing implies substitutability.

In retrospect, that's the same strategy as in copy.py: *FIRST*, check 
the registry -- if something is found in the registry, THAT takes 
precedence.  *THEN*, only for cases where the registry doesn't give an 
answer, proceed with other steps and checks and sub-strategies.

So, I think PEP 246 should specify that the step now called (e) 
[checking the registry] comes FIRST; then, an isinstance step 
[currently split between (a) and (d)], then __conform__ and __adapt__ 
steps [currently called (b) and (c)].  Checking the registry is after 
all very fast: make the 2-tuple (type(obj), protocol), use it to index 
into the registry -- period.  So, it's probably not worth complicating 
the semantics at all just to "fast path" the common case.
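Alex's proposed ordering can be sketched roughly like this (the registry shape and every name here are invented for illustration; the __conform__ and __adapt__ steps are elided):

```python
_registry = {}  # maps (type, protocol) pairs to adapter callables

def register(cls, protocol, adapter):
    _registry[(cls, protocol)] = adapter

def adapt(obj, protocol):
    # Step (e) first: an explicitly registered adapter takes precedence.
    adapter = _registry.get((type(obj), protocol))
    if adapter is not None:
        return adapter(obj)
    # Then (a)/(d): substitutability assumed from subclassing.
    if isinstance(protocol, type) and isinstance(obj, protocol):
        return obj
    # Steps (b) __conform__ and (c) __adapt__ would follow here.
    raise TypeError("can't adapt %r to %s" % (obj, protocol))

# The registry entry wins even where isinstance would also succeed:
register(bool, int, lambda flag: 1 if flag else 0)
print(adapt(True, int))  # 1 -- registry beats the isinstance path
```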

I intend to restructure pep246 at next rewrite to reflect this "obvious 
once thought of" idea, and thanks, Guido, for providing it.


Alex

From aleax at aleax.it  Wed Jan 12 17:04:42 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 17:04:48 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <ca471dc205011207491db5457@mail.gmail.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<1105540027.5326.19.camel@localhost>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<ca471dc205011207491db5457@mail.gmail.com>
Message-ID: <ABABF62A-64B3-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 16:49, Guido van Rossum wrote:

> [Phillip]
>> As for the issue of what should and shouldn't exist in Python, it 
>> doesn't
>> really matter; PEP 246 doesn't (and can't!) *prohibit* transitive
>> adaptation.
>
> Really? Then isn't it underspecified? I'd think that by the time we
> actually implement PEP 246 in the Python core, this part of the
> semantics should be specified (at least the default behavior, even if
> there are hooks to change this).

Very good point -- thanks Phillip and Guido jointly for pointing this 
out.


Alex

From dw at botanicus.net  Wed Jan 12 17:26:06 2005
From: dw at botanicus.net (David Wilson)
Date: Wed Jan 12 17:26:10 2005
Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support
In-Reply-To: <20050111141523.32125.1902928401.divmod.quotient.6074@ohm>
References: <20050111013252.GA216@thailand.botanicus.net>
	<20050111141523.32125.1902928401.divmod.quotient.6074@ohm>
Message-ID: <20050112162606.GA48911@thailand.botanicus.net>

On Tue, Jan 11, 2005 at 02:15:23PM +0000, Jp Calderone wrote:

> > I would like to see (optional?) support for this before your patch is
> > merged. I have a long-term interest in a Python-based service control /
> > init replacement / system management application, for use in specialised
> > environments. I could definitely use this. :)
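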
> 
>   Useful indeed, but I'm not sure why basic NETLINK support should be 
> held up for it?

Point taken. I don't recall why I thought special code would be required
for this.

I was thinking a little more about how support might be added for older
kernels. No harm can be done by compiling in the constant, and it
doesn't cost much. How about:

    #include <linux/netlink.h>
    ...

    #ifndef NETLINK_KOBJECT_UEVENT
    #define NETLINK_KOBJECT_UEVENT  15
    #endif

    /* Code assuming build host supports KOBJECT_UEVENT. */


That type of thing.

Cheers,


David.

-- 
... do you think I'm going to waste my time trying to pin physical
interpretations upon every optical illusion of our instruments? Since when
is the evidence of our senses any match for the clear light of reason?
    -- Cutie, in Asimov's Reason
From pje at telecommunity.com  Wed Jan 12 17:40:43 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 17:39:55 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <BE11893E-64AF-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com>

At 04:36 PM 1/12/05 +0100, Alex Martelli wrote:
>I already know -- you told us so -- that if I had transitivity as you wish 
>it (uncontrollable, unstoppable, always-on) I could not any more write and 
>register a perfectly reasonable adapter which fills in with a NULL an 
>optional field in the adapted-to interface, without facing undetected 
>degradation of information quality by that adapter being invisibly, 
>uncontrollably chained up with another -- no error message, no nothing, no 
>way to stop this -- just because a direct adapter wasn't correctly written 
>and registered.

But why would you *want* to do this, instead of just explicitly 
converting?  That's what I don't understand.  If I were writing such a 
converter, I wouldn't want to register it for ANY implicit conversion, even 
if it was non-transitive!

From aleax at aleax.it  Wed Jan 12 18:18:47 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 18:18:54 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com>
References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com>
Message-ID: <04EFD92E-64BE-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 17:40, Phillip J. Eby wrote:

> At 04:36 PM 1/12/05 +0100, Alex Martelli wrote:
>> I already know -- you told us so -- that if I had transitivity as you 
>> wish it (uncontrollable, unstoppable, always-on) I could not any more 
>> write and register a perfectly reasonable adapter which fills in with 
>> a NULL an optional field in the adapted-to interface, without facing 
>> undetected degradation of information quality by that adapter being 
>> invisibly, uncontrollably chained up with another -- no error 
>> message, no nothing, no way to stop this -- just because a direct 
>> adapter wasn't correctly written and registered.
>
> But why would you *want* to do this, instead of just explicitly 
> converting?  That's what I don't understand.  If I were writing such a 
> converter, I wouldn't want to register it for ANY implicit conversion, 
> even if it was non-transitive!

Say I have an SQL DB with a table such as:

CREATE TABLE fullname (
     first VARCHAR(50) NOT NULL,
     middle VARCHAR(50),
     last VARCHAR(50) NOT NULL,
     -- snipped other information fields
)

Now, I need to record a lot of names of people, which I get from a vast 
variety of sources, so they come in as different types.  No problem: 
I'll just adapt each person-holding type to an interface which offers 
first, middle and last names (as well as other information fields, here 
snipped), using None to mean I don't know the middle name for a given 
person (that's what NULL means, after all: "information unknown" or the 
like; the fact that fullname.middle is allowed to be NULL indicates 
that, while it's of course BETTER to have that information, it's not a 
semantic violation if that information just can't be obtained nohow).

All of my types which hold info on people can at least supply first and 
last names; some but not all can supply middle names.  Fine, no 
problem: I can adapt them all with suitable adapters anyway, 
noninvasively, without having to typecheck, typeswitch, or any other 
horror.  Ah, the magic of adaptation!  So, I define an interface -- say 
with arbitrary syntax:

interface IFullname:
     first: str
     middle: str or None
     last: str
     # snipped other information fields

and my function to write a data record is just:

def writeFullname(person: IFullname):
     # do the writing


So, I have another interface in a similar vein, perhaps to map to/from 
some LDAP and similar servers which provide a slightly different set of 
info fields:

interface IPerson:
     firstName: str
     lastName: str
     userid: str
     # snipped other stuff

I have some data about people coming in from LDAP and the like, which I 
want to record in that SQL DB -- the incoming data is held in types 
that implement IPerson, so I write an adapter IPerson -> IFullname for 
the purpose.
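In plain Python (the interface syntax above is arbitrary), such an adapter might look like this; note the middle field deliberately filled with None, which is the lossy-but-legitimate behaviour under discussion:

```python
class PersonAsFullname:
    """Adapt an IPerson-style object to IFullname-style attributes.

    IPerson carries no middle name, so `middle` is None (SQL NULL:
    "information unknown").
    """
    def __init__(self, person):
        self._person = person

    @property
    def first(self):
        return self._person.firstName

    @property
    def middle(self):
        return None  # not available from IPerson

    @property
    def last(self):
        return self._person.lastName
```

writeFullname could then be handed a PersonAsFullname wrapper around any IPerson object, with no typechecking at the call site.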


If the datatypes are immutable, conversion is as good as adaptation 
here, as I mentioned ever since the first mail in which I sketched this 
case, many megabytes back.  But adaptation I can get automatically 
WITHOUT typechecking on what exactly is the concrete type I'm having to 
write (send to LDAP, whatver) this time -- a crucial advantage of 
adaptation, as you mention in the PyProtocols docs.  Besides, maybe in 
some cases some of those attributes are in fact properties that get 
computed at runtime, fetched from a slow link if and only if they're 
first required, whatever, or even, very simply, some datatype is 
mutable and I need to ensure I'm dealing with the current state of the 
object/record.  So, I'm not sure why you appear to argue for conversion 
against adaptation, or explicit typechecking against the avoidance 
thereof which is such a big part of adapt's role in life.


Alex

From mcherm at mcherm.com  Wed Jan 12 18:46:46 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed Jan 12 18:47:07 2005
Subject: [Python-Dev] PEP 246, redux
Message-ID: <1105552006.41e562862bf84@mcherm.com>

This is a collection of responses to various things that don't appear
to have been resolved yet:

Phillip writes:
> if a target protocol has optional aspects, then lossy adaptation to it is
> okay by definition.  Conversely, if the aspect is *not* optional, then
> lossy adaptation to it is not acceptable.  I don't think there can really
> be a middle ground; you have to decide whether the information is required
> or not.

I disagree. To belabor Alex's example, suppose LotsOfInfo has first, middle,
and last names; PersonName has first and last, and FullName has first,
middle initial and last. FullName's __doc__ specifically states that if
the middle name is not available or the individual does not have a middle
name, then "None" is acceptable.

Converting LotsOfInfo to FullName via PersonName results in None for the
middle name. But that's just not right... it _should_ be filling in the
middle initial because that information IS available. It's technically
correct in a theoretical sort of a way (FullName never PROMISES that
middle_initial will be available), but it's wrong in a my-program-doesn't-
work-right sort of way, because it HAS the information available yet
doesn't use it.

You're probably going to say "okay, then register a LotsOfInfo->FullName
converter", and I agree. But if no such converter is registered, I
would rather have a TypeError than an automatic conversion which produces
incorrect results. I can explicitly silence it by registering a trivial
converter:

    def adapt_LotsOfInfo_to_FullName(lots_of_info):
        person_name = adapt(lots_of_info, PersonName)
        return adapt(person_name, FullName)

but if it just worked, I could only discover that by being clever enough
to think of writing a unit test for the middle name.
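The scenario is easy to make concrete (all class and function names here are illustrative; dicts stand in for the adapted objects):

```python
class LotsOfInfo:
    def __init__(self, first, middle, last):
        self.first, self.middle, self.last = first, middle, last

def info_to_person(info):
    # LotsOfInfo -> PersonName: first and last only, middle dropped by design
    return {"first": info.first, "last": info.last}

def person_to_fullname(person):
    # PersonName -> FullName: middle is optional, so None is formally fine
    return {"first": person["first"], "middle": None, "last": person["last"]}

def info_to_fullname(info):
    # the direct adapter keeps the information that IS available
    return {"first": info.first, "middle": info.middle, "last": info.last}

info = LotsOfInfo("Ada", "King", "Lovelace")
print(person_to_fullname(info_to_person(info))["middle"])  # None: silently lost
print(info_to_fullname(info)["middle"])                    # King
```

The chained result is formally valid against FullName yet strictly worse than what the direct adapter produces from the same input.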

------------------

Elsewhere, Phillip writes:
> If you derive an interface from another interface, this is supposed to mean
> that your derived interface promises to uphold all the promises of the base
> interface.  That is, your derived interface is always usable where the base
> interface is required.
>
> However, oftentimes one mistakenly derives an interface from another while
> meaning that the base interface is *required* by the derived interface,
> which is similar in meaning but subtly different.  Here, you mean to say,
> "IDerived has all of the requirements of IBase", but you have instead said,
> "You can use IDerived wherever IBase is desired".

Okay, that's beginning to make sense to me.

> it's difficult because intuitively an interface defines a *requirement*, so
> it seems logical to inherit from an interface in order to add requirements!

Yes... I would fall into this trap as well until I'd been burned a few times.

------------------

Alex summarizes nicely:
> Personally, I disagree with having transitivity at all, unless perhaps
> it be restricted to adaptations specifically and explicitly stated to
> be "perfect and lossless"; PJE claims that ALL adaptations MUST,
> ALWAYS, be "perfect and lossless" -- essentially, it seems to me, he
> _has_ to claim that, to defend transitivity being applied
> automatically, relentlessly, NON-optionally, NON-selectively
      [...]
> Much the same applies to inheritance, BTW, which as PJE has pointed out
> a few times also induces transitivity in adaptation, and, according to
> him, is a more likely cause of bugs than chains of adapters

But Alex goes on to say that perhaps we need two grades of adaptations
(perfect and real-world) and two grades of interface inheritance (perfect
and otherwise) so that the transitivity can be (automatically) invoked
only for the perfect ones. That feels to me like excessive complexity:
why not just prohibit transitivity? What, after all, is the COST of
prohibiting transitivity?

For the first case (adapter chains) the cost is a N^2 explosion in the
number of adapters needed. I said I thought that N would be small, but
Phillip (who knows what he's talking about, don't get me wrong here)
says that it's big enough to be mildly annoying at times to Twisted
and Eclipse developers.

For the second case (interface inheritance), I haven't yet thought
through clearly how it affects things... in fact, it sort of seems like
there's no good way to _prevent_ "transitivity" in this case short
of prohibiting interface inheritance entirely. And, as Phillip points
out to me (see above) this is a more common type of error.

Gee... I'm understanding the problem a little better, but elegant
solutions are still escaping me.

-- Michael Chermside

From Scott.Daniels at Acm.Org  Wed Jan 12 18:52:54 2005
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Wed Jan 12 18:51:49 2005
Subject: [Python-Dev] Recent IBM Patent releases
Message-ID: <cs3o2p$4t1$1@sea.gmane.org>

IBM has recently released 500 patents for use in opensource code.

     http://www.ibm.com/ibm/licensing/patents/pledgedpatents.pdf

     "...In order to foster innovation and avoid the possibility that a
     party will take advantage of this pledge and then assert patents or
     other intellectual property rights of its own against Open Source
     Software, thereby limiting the freedom of IBM or any other Open
     Source developer to create innovative software programs, the
     commitment not to assert any of these 500 U.S. patents and all
     counterparts of these patents issued in other countries is
     irrevocable except that IBM reserves the right to terminate this
     patent pledge and commitment only with regard to any party who files
     a lawsuit asserting patents or other intellectual property rights
     against Open Source Software."

Since this includes patents on compression and encryption stuff, we
will definitely be faced with deciding on whether to allow use of these
patents in the main Python library.

Somebody was worried about BSD-style licenses on Groklaw, and said,

     "Yes, you can use this patent in the free version... but if you
     close the code, you're violating IBM's Patents, and they WILL come
     after you. Think of what would have happened if IBM had held a
     patent that was used in the FreeBSD TCP/IP stack?  Microsoft used
     it as the base of the Windows NT TCP/IP stack. IBM could then sue
     Microsoft for patent violations."

To which he got the following reply:

     "Sorry, but that's not correct. That specific question was asked
     on the IBM con-call about this announcement. i.e. if there were a
     commercial product that was a derived work of an open source
     project that used these royalty-free patents, what would happen?

     "IBM answered that, so long as the commercial derived work followed
     the terms of the open source license agreement, there was no
     problem. (So IBM is fine with a commercial product based on an
     open source BSD project making use of these patents.)"

This means to me we can put these in Python's library, but it is
definitely something to start deciding now.

-- Scott David Daniels
Scott.Daniels@Acm.Org

From vinay_sajip at yahoo.co.uk  Wed Jan 12 18:21:34 2005
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Wed Jan 12 18:53:19 2005
Subject: [Python-Dev] Re: logging class submission
In-Reply-To: <1490506001@web.de>
References: <1490506001@web.de>
Message-ID: <41E55C9E.1050409@yahoo.co.uk>

There is already a TimedRotatingFileHandler which will do backups on a
schedule, including daily.

The correct way of doing what you want is to submit a patch via
SourceForge. If the patch is accepted, then your code will end up in Python.

Thanks,


Vinay Sajip

From pje at telecommunity.com  Wed Jan 12 18:58:38 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 18:57:51 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <04EFD92E-64BE-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050112125438.03d48270@mail.telecommunity.com>

At 06:18 PM 1/12/05 +0100, Alex Martelli wrote:

>On 2005 Jan 12, at 17:40, Phillip J. Eby wrote:
>
>>At 04:36 PM 1/12/05 +0100, Alex Martelli wrote:
>>>I already know -- you told us so -- that if I had transitivity as you 
>>>wish it (uncontrollable, unstoppable, always-on) I could not any more 
>>>write and register a perfectly reasonable adapter which fills in with a 
>>>NULL an optional field in the adapted-to interface, without facing 
>>>undetected degradation of information quality by that adapter being 
>>>invisibly, uncontrollably chained up with another -- no error message, 
>>>no nothing, no way to stop this -- just because a direct adapter wasn't 
>>>correctly written and registered.
>>
>>But why would you *want* to do this, instead of just explicitly 
>>converting?  That's what I don't understand.  If I were writing such a 
>>converter, I wouldn't want to register it for ANY implicit conversion, 
>>even if it was non-transitive!
>
>[snip lots of stuff]
>I have some data about people coming in from LDAP and the like, which I 
>want to record in that SQL DB -- the incoming data is held in types that 
>implement IPerson, so I write an adapter IPerson -> IFullname for the purpose.

This doesn't answer my question.  Obviously it makes sense to adapt in this 
fashion, but not IMPLICITLY and AUTOMATICALLY.  That's the distinction I'm 
trying to make.  I have no issue with writing an adapter like 
'PersonAsFullName' for this use case; I just don't think you should 
*register* it for automatic use any time you pass a Person to something 
that takes a FullName.


>   So, I'm not sure why you appear to argue for conversion against 
> adaptation, or explicit typechecking against the avoidance thereof which 
> is such a big part of adapt's role in life.

Okay, I see where we are not communicating; where I've been saying 
"conversion", you are taking this to mean, "don't write an adapter", but 
what I mean is "don't *register* the adapter for implicit adaptation; 
explicitly use it in the place where you need it."



From skip at pobox.com  Wed Jan 12 02:48:50 2005
From: skip at pobox.com (Skip Montanaro)
Date: Wed Jan 12 18:59:04 2005
Subject: [Python-Dev] Re: [Csv] csv module TODO list
In-Reply-To: <000401c4f83d$7432db40$e841fea9@oemcomputer>
References: <20050110233126.GA14363@janus.swcomplete.com>
	<000401c4f83d$7432db40$e841fea9@oemcomputer>
Message-ID: <16868.33282.507253.969557@montanaro.dyndns.org>


    Raymond> Would the csv module be a good place to add a DBF reader and
    Raymond> writer?

Not really.

    Raymond> I've posted a draft on ASPN.  It interoperates well with the
    Raymond> rest of the CSV module because it also accepts/returns a list
    Raymond> of fieldnames and a sequence of records.

    Raymond> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/362715

Just clean it up (do the doco thing) and check it in as dbf.py with reader
and writer functions.

I see your modus operandi at work: code something in Python then switch to C
when nobody's looking. ;-)

Skip

From gvanrossum at gmail.com  Wed Jan 12 18:59:13 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan 12 18:59:16 2005
Subject: getting special from type,
	not instance (was Re: [Python-Dev] copy confusion)
In-Reply-To: <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050111201174b86218@mail.gmail.com>
	<83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <ca471dc20501120959737d1935@mail.gmail.com>

[Alex]
> Armin's fix was to change:
> 
>     conform = getattr(type(obj), '__conform__', None)
> 
> into:
> 
>     for basecls in type(obj).__mro__:
>         if '__conform__' in basecls.__dict__:
>             conform = basecls.__dict__['__conform__']
>             break
>     else:
>         # not found
> 
> I have only cursorily examined the rest of the standard library, but it
> seems to me there may be a few other places where getattr is being used
> on a type for this purpose, such as pprint.py which has a couple of
> occurrences of
>      r = getattr(typ, "__repr__", None)
[And then proceeds to propose a new API to improve the situation]

I wonder if the following solution wouldn't be more useful (since less
code will have to be changed).

The descriptor for __getattr__ and other special attributes could
claim to be a "data descriptor" which means that it gets first pick
*even if there's also a matching entry in the instance __dict__*.
Quick illustrative example:

>>> class C(object):
     foo = property(lambda self: 42)   # a property is always a "data descriptor"
    
>>> a = C()
>>> a.foo
42
>>> a.__dict__["foo"] = "hello"
>>> a.foo
42
>>> 

Normal methods are not data descriptors, so they can be overridden by
something in __dict__; but it makes some sense that for methods
implementing special operations like __getitem__ or __copy__, where
the instance __dict__ is already skipped when the operation is invoked
using its special syntax, it should also be skipped by explicit
attribute access (whether getattr(x, "__getitem__") or x.__getitem__
-- these are entirely equivalent).

We would need to introduce a new decorator so that classes overriding
these methods can also make those methods "data descriptors", and so
that users can define their own methods with this special behavior
(this would be needed for __copy__, probably).
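To make the idea concrete, here is a rough sketch of what such a decorator could look like (the name and details are purely illustrative, not part of any proposal in this thread). Giving the wrapper a __set__ method is what makes it a data descriptor, so the class-level method wins even when a same-named key is planted in the instance __dict__:

```python
class hard_method(object):
    """Make a method a data ("hard") descriptor: defining __set__
    means it shadows any same-named entry in the instance __dict__."""
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self.func
        return self.func.__get__(obj, objtype)
    def __set__(self, obj, value):
        raise AttributeError("read-only special method")

class C(object):
    @hard_method
    def __copy__(self):
        return "class version"

c = C()
c.__dict__["__copy__"] = lambda: "instance version"
result = c.__copy__()   # the hard descriptor wins: "class version"
```

Note that an ordinary (non-decorated) method in the same position would be shadowed by the instance __dict__ entry, which is exactly the behavior being discussed.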

I don't think this will cause any backwards compatibility problems --
since putting a __getitem__ in an instance __dict__ doesn't override
the x[y] syntax, it's unlikely that anybody would be using this.
"Ordinary" methods will still be overridable.

PS. The term "data descriptor" now feels odd, perhaps we can say "hard
descriptors" instead. Hard descriptors have a __set__ method in
addition to a __get__ method (though the __set__ method may always
raise an exception, to implement a read-only attribute).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From skip at pobox.com  Wed Jan 12 02:59:22 2005
From: skip at pobox.com (Skip Montanaro)
Date: Wed Jan 12 18:59:40 2005
Subject: [Python-Dev] Re: [Csv] csv module and universal newlines
In-Reply-To: <20050110044441.250103C889@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<20050110044441.250103C889@coffee.object-craft.com.au>
Message-ID: <16868.33914.837771.954739@montanaro.dyndns.org>


    Andrew> The csv parser consumes lines from an iterator, but it also has
    Andrew> its own idea of end-of-line conventions, which are currently
    Andrew> only used by the writer, not the reader, which is a source of
    Andrew> much confusion. The writer, by default, also attempts to emit a
    Andrew> \r\n sequence, which results in more confusion unless the file
    Andrew> is opened in binary mode.

    Andrew> I'm looking for suggestions for how we can mitigate these
    Andrew> problems (without breaking things for existing users).

You can argue that reading csv data from/writing csv data to a file on
Windows if the file isn't opened in binary mode is an error.  Perhaps we
should enforce that in situations where it matters.  Would this be a start?

    terminators = {"darwin": "\r",
                   "win32": "\r\n"}

    if (dialect.lineterminator != terminators.get(sys.platform, "\n") and
       "b" not in getattr(f, "mode", "b")):
       raise IOError, ("%s not opened in binary mode" %
                       getattr(f, "name", "???"))

The elements of the postulated terminators dictionary may already exist
somewhere within the sys or os modules (if not, perhaps they should be
added).  The idea of the check is to enforce binary mode on those objects
that support a mode if the desired line terminator doesn't match the
platform's line terminator.

Skip
From skip at pobox.com  Wed Jan 12 02:39:00 2005
From: skip at pobox.com (Skip Montanaro)
Date: Wed Jan 12 18:59:44 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <41E45DA7.1030302@ActiveState.com>
References: <1105471586.41e42862b9a39@mcherm.com>
	<41E45DA7.1030302@ActiveState.com>
Message-ID: <16868.32692.589702.66263@montanaro.dyndns.org>


    >>> Terminology point: I know that LiskovViolation is technically
    >>> correct, but I'd really prefer it if exception names (which are
    >>> sometimes all users get to see) were more informative for people w/o
    >>> deep technical background.  Would that be possible?
    >> 
    >> I don't see how. Googling on Liskov immediately brings up clear and
    >> understandable descriptions of the principle that's being violated.
    >> I can't imagine summarizing the issue more concisely than that! What
    >> would you suggest? Including better explanations in the documentation
    >> is a must, but "LiskovViolation" in the exception name seems
    >> unbeatably clear and concise.

    David> Clearly, I disagree.

I had never heard the term before and consulted the Google oracle as well.
I found this more readable definition:

    Functions that use pointers or references to base classes must be able to
    use objects of derived classes without knowing it.

here:

    http://www.compulink.co.uk/~querrid/STANDARD/lsp.htm

Of course, the situations in which a Liskov violation can occur can be a bit
subtle.
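For instance, the textbook rectangle/square case (my example, not from the thread) shows how a subclass can satisfy the letter of the base interface while quietly breaking an assumption callers rely on:

```python
# A subtle Liskov violation: Square strengthens an invariant
# (width == height) that Rectangle's callers never agreed to.
class Rectangle(object):
    def __init__(self, width, height):
        self.width, self.height = width, height
    def set_width(self, w):
        self.width = w
    def area(self):
        return self.width * self.height

class Square(Rectangle):
    def __init__(self, side):
        Rectangle.__init__(self, side, side)
    def set_width(self, w):           # keeps the square square...
        self.width = self.height = w  # ...but changes base-class behavior

def double_width(rect):
    # Written against Rectangle: expects the height to stay fixed.
    old_area = rect.area()
    rect.set_width(rect.width * 2)
    return rect.area() == old_area * 2

ok_for_rectangle = double_width(Rectangle(3, 4))  # holds for the base class
ok_for_square = double_width(Square(3))           # fails for the subclass
```

Code written against Rectangle silently misbehaves when handed a Square, even though Square "is a" Rectangle in the inheritance sense.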

    David> My point is that it'd be nice if we could come up with an
    David> exception name which could be grokkable without requiring 1)
    David> Google, 2) relatively high-level understanding of type theory.

I suspect if there was a succinct way to convey the concept in two or three
words it would already be in common use.  The alternative seems to be to
make sure it's properly docstringed and added to the tutorial's glossary:

    >>> help(lv.LiskovViolation)
    Help on class LiskovViolation in module lv:

    class LiskovViolation(exceptions.Exception)
     |  Functions that use pointers or references to base classes must be
     |  able to use objects of derived classes without knowing it.
     |  
     ...

I suspect there's something to be said for exposing the user base to a
little bit of software engineering terminology every now and then.  A couple
years ago I suspect most of us had never heard of list comprehensions, and
we all bat the term about without a second thought now.

Skip
From skip at pobox.com  Wed Jan 12 19:02:36 2005
From: skip at pobox.com (Skip Montanaro)
Date: Wed Jan 12 19:02:57 2005
Subject: [Python-Dev] Recent IBM Patent releases
In-Reply-To: <cs3o2p$4t1$1@sea.gmane.org>
References: <cs3o2p$4t1$1@sea.gmane.org>
Message-ID: <16869.26172.317064.537812@montanaro.dyndns.org>


    Scott> Since this includes patents on compression and encryption stuff,
    Scott> we will definitely be faced with deciding on whether to allow use
    Scott> of these patents in the main Python library.

Who is going to decide if a particular library would be affected by one or
more of the 500 patents IBM released?

Skip
From aleax at aleax.it  Wed Jan 12 19:04:04 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 19:04:11 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112125438.03d48270@mail.telecommunity.com>
References: <5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com>
	<5.1.1.6.0.20050112125438.03d48270@mail.telecommunity.com>
Message-ID: <5841288E-64C4-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 18:58, Phillip J. Eby wrote:
    ...
>> I have some data about people coming in from LDAP and the like, which 
>> I want to record in that SQL DB -- the incoming data is held in types 
>> that implement IPerson, so I write an adapter IPerson -> IFullname 
>> for the purpose.
>
> This doesn't answer my question.  Obviously it makes sense to adapt in 
> this fashion, but not IMPLICITLY and AUTOMATICALLY.  That's the 
> distinction I'm trying to make.  I have no issue with writing an 
> adapter like 'PersonAsFullName' for this use case; I just don't think 
> you should *register* it for automatic use any time you pass a Person 
> to something that takes a FullName.

I'm adapting incoming data that can be of any of a huge variety of 
concrete types with different interfaces.  *** I DO NOT WANT TO 
TYPECHECK THE INCOMING DATA *** to know what adapter or converter to 
apply -- *** THAT'S THE WHOLE POINT *** of PEP 246.  I can't believe 
we're misunderstanding each other about this -- there MUST be 
miscommunication going on!


>>   So, I'm not sure why you appear to argue for conversion against 
>> adaptation, or explicit typechecking against the avoidance thereof 
>> which is such a big part of adapt's role in life.
>
> Okay, I see where we are not communicating; where I've been saying 
> "conversion", you are taking this to mean, "don't write an adapter", 
> but what I mean is "don't *register* the adapter for implicit 
> adaptation; explicitly use it in the place where you need it.

"Adaptation is not conversion" is how I THOUGHT we had agreed to 
rephrase my unfortunate "adaptation is not casting" -- so if you're 
using conversion to mean adaptation, I'm nonplussed.

Needing to be explicit and therefore to typecheck/typeswitch to 
pick which adapter to apply is just what I don't *WANT* to do, what I 
don't want *ANYBODY* to have to do EVER, and the very reason I'm 
spending time and energy on PEP 246.  So, how would you propose I know 
which adapter I need, without spreading typechecks all over my 
bedraggled *CODE*?!?!


Alex

From mcherm at mcherm.com  Wed Jan 12 19:08:20 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed Jan 12 19:08:24 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
Message-ID: <1105553300.41e56794d1fc5@mcherm.com>

I wrote:
> I don't see how [LiskovViolation could have a more descriptive name].
> Googling on Liskov immediately brings up clear and understandable
> descriptions of the principle


David writes:
> Clearly, I disagree. [...]

Skip writes:
> I had never heard the term before and consulted the Google oracle as well.

This must be one of those cases where I am misled by my background...
I thought of Liskov substitution principle as a piece of basic CS
background that everyone learned in school (or from the net, or wherever
they learned programming). Clearly, that's not true.

Guido writes:
> How about SubstitutabilityError?

It would be less precise and informative to ME but apparently more so
to a beginner. Obviously, we should support the beginner!

-- Michael Chermside

From Scott.Daniels at Acm.Org  Wed Jan 12 19:13:12 2005
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Wed Jan 12 19:11:55 2005
Subject: [Python-Dev] Re: Recent IBM Patent releases
In-Reply-To: <16869.26172.317064.537812@montanaro.dyndns.org>
References: <cs3o2p$4t1$1@sea.gmane.org>
	<16869.26172.317064.537812@montanaro.dyndns.org>
Message-ID: <cs3p8r$8pc$1@sea.gmane.org>

Skip Montanaro wrote:
> Who is going to decide if a particular library would be affected by one or
> more of the 500 patents IBM released?
> 
> Skip

I am thinking more along the lines of, "our policy on accepting new code
[will/will not] be to allow new submissions which use some of that
patented code."  I believe our current policy is that the author
warrants that the code is his/her own work and not encumbered by
any patent.


-- Scott David Daniels
Scott.Daniels@Acm.Org

From gvanrossum at gmail.com  Wed Jan 12 19:16:14 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan 12 19:16:17 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <ca471dc205011210166c14e3f4@mail.gmail.com>

> > [Alex]
> >> I'm saying that if, by mistake, the programmer has NOT
> >> registered the A->C adapter (which would be easily coded and work
> >> perfectly), then thanks to transitivity, instead of a clear and simple
> >> error message leading to immediate diagnosis of the error, they'll get
> >> a subtle unnecessary degradation of information and resulting
> >> reduction
> >> in information quality.

[Guido]
> > I understand, but I would think that there are just as many examples
> > of cases where having to register a trivial A->C adapter is much more
> > of a pain than it's worth; especially if there are a number of A->B
> > pairs and a number of B->C pairs, the number of additional A->C pairs
> > needed could be bewildering.

[Alex]
> Hm?

I meant if there were multiple A's. For every Ai that has an Ai->B you
would also have to register a trivial Ai->C. And if there were
multiple C's (B->C1, B->C2, ...) then the number of extra adaptors to
register would be the number of A's *times* the number of C's, in
addition to the sum of those numbers for the "atomic" adaptors (Ai->B,
B->Cj).

> > But I would like to see some input from people with C++ experience.
> 
> Here I am, at your service.
[...]
> It's in the running for the coveted "Alex's worst nightmare" prize,

Aha. This explains why you feel so strongly about it.

But now, since I am still in favor of automatic "combined" adaptation
*as a last resort*, I ask you to consider that Python is not C++, and
that perhaps we can make the experience in Python better than it was
in C++. Perhaps allowing more control over when automatic adaptation
is acceptable?

For example, interface B (or perhaps this should be a property of the
adapter for B->C?) might be marked so as to allow or disallow its
consideration when looking for multi-step adaptations. We could even
make the default "don't consider", so only people who have to deal
with the multiple A's and/or multiple C's all adaptable via the same B
could save themselves some typing by turning it on.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From walter at livinglogic.de  Wed Jan 12 19:18:04 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed Jan 12 19:18:07 2005
Subject: [Python-Dev] Feed style codec API
Message-ID: <41E569DC.2000407@livinglogic.de>

Now that Python 2.4 is out the door (and the problems with
StreamReader.readline() are hopefully fixed), I'd like to bring
up the topic of a feed style codec API again. A feed style API
would make it possible to use stateful encoding/decoding where
the data is not available as a stream.

Two examples:

- xml.sax.xmlreader.IncrementalParser: Here the client passes raw
   XML data to the parser in multiple calls to the feed() method.
   If the parser wants to use Python codecs machinery, it has to
   wrap a stream interface around the data passed to the feed()
   method.
- WSGI (PEP 333) specifies that the web application returns the
   fragments of the resulting webpage as an iterator. If this result
   is unicode that must be encoded, we have the same problem: it must
   be wrapped in a stream interface.

The simplest solution is to add a feed() method both to StreamReader
and StreamWriter, that takes the state of the codec into account,
but doesn't use the stream. This can be done by simply moving a
few lines of code into separate methods. I've uploaded a patch to
Sourceforge: #1101097.

There are other open issues with the codec changes: unicode-escape,
UTF-7, the CJK codecs and probably a few others don't support
decoding incomplete input yet (although AFAICR the functionality
is mostly there in the CJK codecs).
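To illustrate the behavior such a feed-style method needs (the snippet uses an incremental UTF-8 decoder object purely for illustration; the exact feed()/decode() API is what the patch is meant to settle):

```python
import codecs

# A stateful decoder must buffer an incomplete multibyte sequence
# between calls instead of raising or emitting garbage.
dec = codecs.getincrementaldecoder("utf-8")()

data = u"h\u00e9llo".encode("utf-8")  # '\xc3\xa9' encodes the single char e-acute
first, second = data[:2], data[2:]    # split inside the two-byte sequence

out = dec.decode(first)                # yields u'h'; the lone '\xc3' is buffered
out += dec.decode(second, final=True)  # completes the sequence
```

A stream-based reader gets this buffering for free because it controls how much it reads; the feed-style API has to keep the partial sequence as decoder state across calls.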

Bye,
    Walter Dörwald
From pje at telecommunity.com  Wed Jan 12 19:30:01 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 19:29:15 2005
Subject: getting special from type, not instance (was Re:
	[Python-Dev] copy confusion)
In-Reply-To: <ca471dc20501120959737d1935@mail.gmail.com>
References: <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>
	<cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050111201174b86218@mail.gmail.com>
	<83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <5.1.1.6.0.20050112132041.02f92e60@mail.telecommunity.com>

At 09:59 AM 1/12/05 -0800, Guido van Rossum wrote:
>We would need to introduce a new decorator so that classes overriding
>these methods can also make those methods "data descriptors", and so
>that users can define their own methods with this special behavior
>(this would be needed for __copy__, probably).

I have used this technique as a workaround in PyProtocols for __conform__ 
and __adapt__, so I know it works.  In fact, here's the implementation:

def metamethod(func):
    """Wrapper for metaclass method that might be confused w/instance method"""
    return property(lambda ob: func.__get__(ob, ob.__class__))

Basically, if I implement a non-slotted special method (like __copy__ or 
__getstate__) on a metaclass, I usually wrap it with this decorator in 
order to avoid the "metaconfusion" issue.
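A small self-contained illustration of the confusion and the fix (shown in present-day class syntax; in 2.x one would write __metaclass__ = Meta):

```python
def metamethod(func):
    """Wrapper for a metaclass method that might be confused with an
    instance method of the same name."""
    return property(lambda ob: func.__get__(ob, ob.__class__))

class Meta(type):
    @metamethod
    def __copy__(cls):
        return "copied the class object"

class C(object, metaclass=Meta):   # spelled __metaclass__ = Meta in 2.x
    def __copy__(self):
        return "copied an instance"

# The property on Meta is a data descriptor, so C.__copy__ resolves to
# the metaclass method rather than the unbound instance method in C:
class_result = C.__copy__()
instance_result = C().__copy__()
```

Without the wrapper, C.__copy__ would pick up the method defined in the class body (meant for instances), which is the "metaconfusion" being described.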

I didn't mention it because the technique had gotten somehow tagged in my 
brain as a temporary workaround until __conform__ and __adapt__ get slots 
of their own, rather than a general fix for metaconfusion.  Also, I wrote 
the above when Python 2.2 was still pretty new, so its applicability was 
somewhat limited by the fact that 2.2 didn't allow super() to work with 
data descriptors.  So, if you used it, you couldn't use super.  This 
probably isn't a problem any more, although I'm still using the super-alike 
that I wrote as a workaround, so I don't know for sure.  :)


>PS. The term "data descriptor" now feels odd, perhaps we can say "hard
>descriptors" instead. Hard descriptors have a __set__ method in
>addition to a __get__ method (though the __set__ method may always
>raise an exception, to implement a read-only attribute).

I think I'd prefer "Override descriptor" to "Hard descriptor", but either 
is better than data descriptor.  Presumably there will need to be backward 
compatibility macros in the C API, though, e.g. for PyDescriptor_IsData or
whatever it's currently called.

From pje at telecommunity.com  Wed Jan 12 19:33:20 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 19:32:35 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <ca471dc205011207261a8432c@mail.gmail.com>
References: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <5.1.1.6.0.20050112105656.03a7e180@mail.telecommunity.com>

At 07:26 AM 1/12/05 -0800, Guido van Rossum wrote:
>[Alex]
> > I'm saying that if, by mistake, the programmer has NOT
> > registered the A->C adapter (which would be easily coded and work
> > perfectly), then thanks to transitivity, instead of a clear and simple
> > error message leading to immediate diagnosis of the error, they'll get
> > a subtle unnecessary degradation of information and resulting reduction
> > in information quality.
>
>I understand, but I would think that there are just as many examples
>of cases where having to register a trivial A->C adapter is much more
>of a pain than it's worth; especially if there are a number of A->B
>pairs and a number of B->C pairs, the number of additional A->C pairs
>needed could be bewildering.

Alex has suggested that there be a way to indicate that an adapter is 
noiseless and therefore suitable for transitivity.

I, on the other hand, would prefer having a way to declare that an adapter 
is *unsuitable* for transitivity, in the event that you feel the need to 
introduce an "imperfect" implicit adapter (versus using an explicit 
conversion).

So, in principle we agree, but we differ regarding what's a better 
*default*, and this is probably where you will be asked to Pronounce, 
because as Alex says, this *is* controversial.

However, I think there is another way to look at it, in which *both* can be 
the default...

Look at it this way.  If you create an adapter class or function to do some 
kind of adaptation, it is not inherently transitive, and you have to 
explicitly invoke it.  (This is certainly the case today!)

Now, let us say that you then register that adapter to perform implicit 
adaptation -- and by default, such a registration says that you are happy 
with it being used implicitly, so it will be used whenever you ask for its 
target interface or some further adaptation thereof.

So, here we have a situation where in some sense, BOTH approaches are the 
"default", so in theory, both sides should be happy.

However, as I understand it, Alex is *not* happy with this, because he 
wants to be able to register a noisy adapter for *implicit* use, but only 
in the case where it is used directly.

The real question, then, is "What is that use case good for?"  And I don't 
have an answer to that question, because it's Alex's use case.

I'm proposing an approach that has two useful extremes: be noisy and 
explicit, or clean and implicit.  Alex seems to want to also add the middle 
ground of "noisy but implicit", and I think this is a bad idea because it 
will lead precisely to the same problems as it does in C++!

Python as it exists today tends to support the proposition that noisy 
adaptation or conversion should not be implicit, since trying to do 
'someList[1.2]' raises a TypeError, rather than silently truncating.  The 
number of steps between 'float' and 'int' in some adaptation graph has 
absolutely nothing to do with it; even if there is only one step, doing 
this kind of conversion or adaptation implicitly is just a bad idea.
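That behavior is easy to verify at the prompt:

```python
# Indexing refuses the one-step float -> int "adaptation" rather than
# silently truncating 1.2 to 1:
try:
    [10, 20, 30][1.2]
    refused = False
except TypeError:
    refused = True
```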

From pje at telecommunity.com  Wed Jan 12 19:46:22 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 19:45:37 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5841288E-64C4-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050112125438.03d48270@mail.telecommunity.com>
	<5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com>
	<5.1.1.6.0.20050112125438.03d48270@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050112133516.02f961c0@mail.telecommunity.com>

At 07:04 PM 1/12/05 +0100, Alex Martelli wrote:

>On 2005 Jan 12, at 18:58, Phillip J. Eby wrote:
>    ...
>>>I have some data about people coming in from LDAP and the like, which I 
>>>want to record in that SQL DB -- the incoming data is held in types that 
>>>implement IPerson, so I write an adapter IPerson -> IFullname for the purpose.
>>
>>This doesn't answer my question.  Obviously it makes sense to adapt in 
>>this fashion, but not IMPLICITLY and AUTOMATICALLY.  That's the 
>>distinction I'm trying to make.  I have no issue with writing an adapter 
>>like 'PersonAsFullName' for this use case; I just don't think you should 
>>*register* it for automatic use any time you pass a Person to something 
>>that takes a FullName.
>
>I'm adapting incoming data that can be of any of a huge variety of 
>concrete types with different interfaces.  *** I DO NOT WANT TO TYPECHECK 
>THE INCOMING DATA *** to know what adapter or converter to apply -- *** 
>THAT'S THE WHOLE POINT *** of PEP 246.  I can't believe we're 
>misunderstanding each other about this -- there MUST be miscommunication 
>going on!

Indeed!

Let me try and be more specific about my assumptions.  What *I* would do in 
your described scenario is *not* register a general-purpose IPerson -> 
IFullName adapter, because in the general case this is a lossy adaptation.

However, for some *concrete* types an adaptation to IFullName must 
necessarily have a NULL middle name; therefore, I would define a *concrete 
adaptation* from those types to IFullName, *not* an adaptation from IPerson 
-> IFullName.  This still allows for transitive adapter composition from 
IFullName on to other interfaces, if need be.

IOW, the standard for "purity" in adapting from a concrete type to an 
interface can be much lower than for adapting from interface to 
interface.  An interface-to-interface adapter is promising that it can 
adapt any possible implementation of the first interface to the second 
interface, and that it's always suitable for doing so.  IMO, if you're not 
willing to make that commitment, you shouldn't define an 
interface-to-interface adapter.
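A minimal sketch of the distinction (the registry API below is invented for illustration; PEP 246's actual protocol uses __conform__/__adapt__): a lossy adapter gets registered for one concrete type, where the NULL middle name is an informed local decision rather than silent information loss:

```python
_adapters = {}   # (concrete type, target interface name) -> adapter callable

def register(source_type, target, adapter):
    _adapters[(source_type, target)] = adapter

def adapt(obj, target):
    # Look up an adapter keyed on the object's *concrete* type.
    adapter = _adapters.get((type(obj), target))
    if adapter is None:
        raise TypeError("cannot adapt %r to %s" % (obj, target))
    return adapter(obj)

class LDAPPerson(object):
    def __init__(self, first, last):
        self.first, self.last = first, last

# Concrete-type adapter: LDAPPerson records carry no middle name, so
# filling in None is a deliberate choice for this one type -- unlike a
# blanket IPerson -> IFullName adapter, which would promise the same
# for every possible IPerson implementation.
register(LDAPPerson, "IFullName", lambda p: (p.first, None, p.last))

full = adapt(LDAPPerson("Grace", "Hopper"), "IFullName")
```

Objects of unregistered types still fail loudly, so no interface-to-interface promise has been made.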

Hm.  Maybe we actually *agree*, and are effectively only arguing 
terminology?  That would be funny, yet sad.  :)


>>>   So, I'm not sure why you appear to argue for conversion against 
>>> adaptation, or explicit typechecking against the avoidance thereof 
>>> which is such a big part of adapt's role in life.
>>
>>Okay, I see where we are not communicating; where I've been saying 
>>"conversion", you are taking this to mean, "don't write an adapter", but 
>>what I mean is "don't *register* the adapter for implicit adaptation; 
>>explicitly use it in the place where you need it.
>
>"Adaptation is not conversion" is how I THOUGHT we had agreed to rephrase 
>my unfortunate "adaptation is not casting" -- so if you're using 
>conversion to mean adaptation, I'm nonplussed.

Sorry; I think of "noisy adaptation" as being "conversion" -- i.e. I have 
one mental bucket for all conversion/adaptation scenarios that aren't "pure 
as-a" relationships.


>Needing to be explicit and therefore to typechecking/typeswitching to pick 
>which adapter to apply is just what I don't *WANT* to do, what I don't 
>want *ANYBODY* to have to do EVER, and the very reason I'm spending time 
>and energy on PEP 246.  So, how would you propose I know which adapter I 
>need, without spreading typechecks all over my bedraggled *CODE*?!?!

See above.  I thought this was obvious; sorry for the confusion.

I think we may be getting close to a breakthrough, though; so let's hang in 
there a bit longer.

From foom at fuhm.net  Wed Jan 12 19:47:30 2005
From: foom at fuhm.net (James Y Knight)
Date: Wed Jan 12 19:47:34 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011206001a5a3805@mail.gmail.com>
	<0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <69F7EF0F-64CA-11D9-86E3-000A95A50FB2@fuhm.net>

I'd just like to share a use case for transitive adaption that I've 
just run into (I'm using Zope.Interface which does not support it). 
Make of this what you will, I just thought an actual example where 
transitive adaption is actually necessary might be useful to the 
discussion.

I have a framework with an interface 'IResource'. A number of classes 
implement this interface. I introduce a new version of the API, with an 
'INewResource', so I register a wrapping adapter from 
IResource->INewResource in the global adapter registry.

This all works fine as long as all the old code returns objects that 
directly provide IResource. However, the issue is that code working 
with the old API *doesn't* just do that, it may also have returned 
something that is adaptable to IResource. Therefore, calling 
INewResource(obj) will fail, because there is no direct adapter from 
obj to INewResource, only obj->IResource and IResource->INewResource.

Now, I can't just add the extra adapters from obj->INewResource myself, 
because the adapter from obj->IResource is in client code, 
compatibility with which is needed. So, as far as I can tell, I have 
two options:
1) everywhere I want to adapt to INewResource, do a dance:

resource = INewResource(result, None)
if resource is not None:
    return resource

resource = IResource(result, None)
if resource is not None:
    return INewResource(resource)
else:
    raise TypeError("blah")

2) Make a custom __adapt__ on INewResource to do similar thing. This 
seems somewhat difficult with zope.interface (need two classes) but 
does work.

class INewResource(zope.interface.Interface):
     pass

class SpecialAdaptInterfaceClass(zope.interface.InterfaceClass):
     def __adapt__(self, result):
         resource = zope.interface.InterfaceClass.__adapt__(self, result)
         if resource is not None:
             return resource

         resource = IResource(result, None)
         if resource is not None:
             return INewResource(resource)

INewResource.__class__ = SpecialAdaptInterfaceClass



I chose #2. In any case, it certainly looks doable, even with a 
non-transitive adaptation system, but it's somewhat irritating. 
Especially if you end up needing to do that kind of thing often.

James

From aahz at pythoncraft.com  Wed Jan 12 19:47:38 2005
From: aahz at pythoncraft.com (Aahz)
Date: Wed Jan 12 19:47:40 2005
Subject: getting special from type,
	not instance (was Re: [Python-Dev] copy confusion)
In-Reply-To: <ca471dc20501120959737d1935@mail.gmail.com>
References: <cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050111201174b86218@mail.gmail.com>
	<83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc20501120959737d1935@mail.gmail.com>
Message-ID: <20050112184738.GB26104@panix.com>

On Wed, Jan 12, 2005, Guido van Rossum wrote:
>
> PS. The term "data descriptor" now feels odd, perhaps we can say "hard
> descriptors" instead. Hard descriptors have a __set__ method in
> addition to a __get__ method (though the __set__ method may always
> raise an exception, to implement a read-only attribute).

I'd prefer "property descriptor" since that's the primary use case.
"Getset descriptor" also works for me.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing."  --Alan Perlis
From pje at telecommunity.com  Wed Jan 12 20:06:51 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 20:06:06 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <ca471dc205011210166c14e3f4@mail.gmail.com>
References: <BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>

At 10:16 AM 1/12/05 -0800, Guido van Rossum wrote:
>For example, interface B (or perhaps this should be a property of the
>adapter for B->C?) might be marked so as to allow or disallow its
>consideration when looking for multi-step adaptations. We could even
>make the default "don't consider", so only people who have to deal
>with the multiple A's and/or multiple C's all adaptable via the same B
>could save themselves some typing by turning it on.

Another possibility; I've realized from Alex's last mail that there's a 
piece of my reasoning that I haven't been including, and now I can actually 
explain it clearly (I hope).  In my view, there are at least two kinds of 
adapters, with different fidelity requirements/difficulty:

    class     -> interface  ("lo-fi" is okay)
    interface -> interface  (It better be perfect!)

If you cannot guarantee that your interface-to-interface adapter is the 
absolute best way to adapt *any* implementation of the source interface, 
you should *not* treat it as an interface-to-interface adapter, but rather 
as a class-to-interface adapter for the specific classes that need 
it.  And, if transitivity exists, it is now restricted to a sensible subset 
of the possible paths.

I believe that this difference is why I don't run into Alex's problems in 
practice; when I encounter a use case like his, I may write the same 
adapter, but I'll usually register it as an adapter from 
class-to-interface, if I need to register it for implicit adaptation at all.
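The distinction can be made concrete with a toy registry. Everything here is invented for illustration; it is not the PyProtocols or zope.interface API:

```python
class_adapters = {}   # (concrete class, target interface) -> factory
iface_adapters = {}   # (source interface, target interface) -> factory

def register_class_adapter(cls, target, factory):
    # "lo-fi" is fine here: the adapter applies to this one class only
    class_adapters[(cls, target)] = factory

def register_iface_adapter(source, target, factory):
    # must be the absolute best adaptation for *any* source implementation
    iface_adapters[(source, target)] = factory

def adapt(obj, target, provides=()):
    """provides: the interfaces obj declares it implements."""
    if target in provides:
        return obj
    factory = class_adapters.get((type(obj), target))
    if factory is not None:
        return factory(obj)
    for source in provides:
        factory = iface_adapters.get((source, target))
        if factory is not None:
            return factory(obj)
    raise TypeError("cannot adapt %r to %s" % (obj, target))
```

Registering a "lossy" adapter through register_class_adapter keeps it scoped to the classes that actually need it, so a transitive search over interface-to-interface adapters never picks it up for anything else.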

Also note that the fact that it's harder to write a solid 
interface-to-interface adapter naturally leads to my experience that 
transitivity problems occur more often via interface inheritance.  This is 
because interface inheritance as implemented by both Zope and PyProtocols 
is equivalent to defining an interface-to-interface adapter, but with no 
implementation!  Obviously, if it is already hard to write a good 
interface-to-interface adapter, then it must be even harder when you have 
to do it with no actual code!  :)

I think maybe this gets us a little bit closer to having a unified (or at 
least unifiable) view on the problem area.  If Alex agrees that 
class-to-interface adaptation is an acceptable solution for limiting the 
transitivity of noisy adaptation while still allowing some degree of 
implicitness, then maybe we have a winner.

(Btw, the fact that Zope and Twisted's interface systems initially 
implemented *only* interface-to-interface adaptation may have also led to 
their conceptualizing transitivity as unsafe, since they didn't have the 
option of using class-to-interface adapters as a way to deal with more 
narrowly-applicable adaptations.)

From aleax at aleax.it  Wed Jan 12 20:31:25 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 20:31:32 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <ca471dc205011210166c14e3f4@mail.gmail.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011210166c14e3f4@mail.gmail.com>
Message-ID: <8C3AB4AC-64D0-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 19:16, Guido van Rossum wrote:
    ...
> [Alex]
>> Hm?
>
> I meant if there were multiple A's. For every Ai that has an Ai->B you
> would also have to register a trivial Ai->C. And if there were
> multiple C's (B->C1, B->C2, ...) then the number of extra adaptors to
> register would be the number of A's *times* the number of C's, in
> addition to the sum of those numbers for the "atomic" adaptors (Ai->B,
> B->Cj).

Ah, OK, I get it now, thanks.


> But now, since I am still in favor of automatic "combined" adaptation
> *as a last resort*, I ask you to consider that Python is not C++, and
> that perhaps we can make the experience in Python better than it was
> in C++. Perhaps allowing more control over when automatic adaptation
> is acceptable?

Yes, that would be necessary to achieve parity with C++, which does now 
have the 'explicit' keyword (to state that a conversion must not be 
used automatically as a step in a constructed chain); the default remains 
"acceptable in automatic chains" for historical and backwards 
compatibility reasons.

> For example, interface B (or perhaps this should be a property of the
> adapter for B->C?) might be marked so as to allow or disallow its
> consideration when looking for multi-step adaptations. We could even
> make the default "don't consider", so only people who have to deal
> with the multiple A's and/or multiple C's all adaptable via the same B
> could save themselves some typing by turning it on.

Yes, this idea you propose seems to me to be a very reasonable 
compromise: one can get the convenience of automatic chains of 
adaptations but only when the adaptations involved are explicitly 
asserted to be OK for that.  I think that the property (of being OK for 
automatic/implicit/chained/transitive use) should definitely be one of 
the adaptation rather than of an interface, btw.


Alex

From pje at telecommunity.com  Wed Jan 12 20:35:02 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 20:34:17 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <ca471dc205011207491db5457@mail.gmail.com>
References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<1105540027.5326.19.camel@localhost>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050112142027.03d54910@mail.telecommunity.com>

At 07:49 AM 1/12/05 -0800, Guido van Rossum wrote:
>[Phillip]
> > As for the issue of what should and shouldn't exist in Python, it doesn't
> > really matter; PEP 246 doesn't (and can't!) *prohibit* transitive
> > adaptation.
>
>Really? Then isn't it underspecified?

No; I just meant that:

1. With its current hooks, implementing transitivity is trivial; 
PyProtocols' interface objects have an __adapt__ that does the transitive 
lookup.  So, as currently written, this is perfectly acceptable in PEP 246.

2. Given Python's overall flexibility, there's really no way to *stop* 
anybody from implementing it short of burying the whole thing in C and 
providing no way to access it from Python.  And then somebody can still 
implement an extension module.  ;)


>  I'd think that by the time we
>actually implement PEP 246 in the Python core, this part of the
>semantics should be specified (at least the default behavior, even if
>there are hooks to change this).

The default behavior *is* specified: it's just specified as "whatever you 
want".  :)  What Alex and I are really arguing about is what should be the 
"one obvious way to do it", and implicitly, what Python interfaces should do.

Really, the whole transitivity argument is moot for PEP 246 itself; PEP 246 
doesn't really care, because anybody can do whatever they want with 
it.  It's Python's "standard" interface implementation that cares; should 
its __adapt__ be transitive, and if so, how transitive?  (PEP 246's global 
registry could be transitive, I suppose, but it's only needed for 
adaptation to a concrete type, and I only ever adapt to interfaces, so I 
don't have any experience with what somebody might or might not want for 
that case.)

Really, the only open proposals remaining (i.e. not yet accepted/rejected 
by Alex) for actually *changing* PEP 246 that I know of at this point are:

1. my suggestion for how to handle the LiskovViolation use case by 
returning None instead of raising a special exception

2. that classic classes be supported, since the old version of PEP 246 
supported them and because it would make exceptions unadaptable otherwise.

The rest of our discussion at this point is just pre-arguing a 
not-yet-written PEP for how Python interfaces should handle adaptation.  :)

From pje at telecommunity.com  Wed Jan 12 20:39:09 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 20:38:23 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <1105552006.41e562862bf84@mcherm.com>
Message-ID: <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com>

At 09:46 AM 1/12/05 -0800, Michael Chermside wrote:
>This is a collection of responses to various things that don't appear
>to have been resolved yet:
>
>Phillip writes:
> > if a target protocol has optional aspects, then lossy adaptation to it is
> > okay by definition.  Conversely, if the aspect is *not* optional, then
> > lossy adaptation to it is not acceptable.  I don't think there can really
> > be a middle ground; you have to decide whether the information is required
> > or not.
>
>I disagree. To belabor Alex's example, suppose LotsOfInfo has first, middle,
>and last names; PersonName has first and last, and FullName has first,
>middle initial and last. FullName's __doc__ specifically states that if
>the middle name is not available or the individual does not have a middle
>name, then "None" is acceptable.

The error, IMO, is in registering an interface-to-interface adapter from 
PersonName to FullName; at best, it should be explicitly registered only 
for concrete classes that lack some way to provide a middle name.

If you don't want to lose data implicitly, don't register an implicit 
adaptation that loses data.


>You're probably going to say "okay, then register a LotsOfInfo->FullName
>converter", and I agree. But if no such converter is registered, I
>would rather have a TypeError than an automatic conversion which produces
>incorrect results.

Then don't register a data-losing adapter for implicit adaptation for any 
possible input source; only the specific input sources that you need it for.



> > it's difficult because intuitively an interface defines a *requirement*, so
> > it seems logical to inherit from an interface in order to add requirements!
>
>Yes... I would fall into this trap as well until I'd been burned a few times.

It's burned me more than just a few times, and I *still* sometimes make it 
if I'm not paying attention.  It's just too easy to make the mistake.  So, 
I'm actually open to considering dropping interface inheritance.

For adapters, I think it's much harder to make this mistake because you 
have more time to think about whether your adapter is universal or not, and 
you can always err on the safe side.  In truth, I believe I much more 
frequently implement class-to-interface adapters than 
interface-to-interface ones.  I can always go back later and declare the 
adapter as interface-to-interface if I want, so there's no harm in starting 
them out as class-to-interface adapters.


>Gee... I'm understanding the problem a little better, but elegant
>solutions are still escaping me.

My solution is to use class-to-interface adaptation for most adaptation, 
and interface-to-interface adaptation only when the adaptation can be 
considered "correct by definition".  It seems to work for me.

From pje at telecommunity.com  Wed Jan 12 20:51:16 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 20:50:38 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <79990c6b05011206001a5a3805@mail.gmail.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011206001a5a3805@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050112101324.03a77090@mail.telecommunity.com>

At 03:48 PM 1/12/05 +0100, Alex Martelli wrote:
>Demanding that the set of paths of minimal available length has exactly 
>one element is strange, though, IF one is assuming that all adaptation 
>paths are exactly equivalent except at most for secondary issues of 
>performance (which are only adjudicated by the simplest heuristic: if 
>those issues were NOT considered minor/secondary, then a more 
>sophisticated scheme would be warranted, e.g. by letting the programmer 
>associate a cost to each step, picking the lowest-cost path, AND letting 
>the caller of adapt() also specify the maximal acceptable cost or at least 
>obtain the cost associated with the chosen path).

There's a very simple reason.  If one is using only non-noisy adapters, 
there is absolutely no reason to ever define more than one adapter between 
the *same* two points.  If you do, then somebody is doing something 
redundant, and there is a possibility for error.  In practice, a package or 
library that declares two interfaces should provide the adapter between 
them, if a sensible one can exist.  For two separate packages, ordinarily 
one is the client and needs to adapt from one of its own implementations or 
interfaces to a foreign interface, or vice versa, and in either case the 
client should be the registrant for the adapters.

Bridging between two foreign packages is the only case in which there is an 
actual possibility of having two packages attempt to bridge the exact same 
interfaces or implementations, and this case is very rare, at least at 
present.  Given the extreme rarity of this legitimate situation where two 
parties have independently created adapter paths of the same length and 
number of adapters between two points, I considered it better to consider 
the situation an error, because in the majority of these bridging cases, 
the current author is the one who created at least one of the bridges, in 
which case he now knows that he is creating a redundant adapter that he 
need no longer maintain.

The very rarest of all scenarios would be that the developer is using two 
different packages that both bridge the same items between two *other* 
packages.  This is the only scenario I can think of where there would be a 
duplication that the current developer could not easily control, and the 
only one where PyProtocols' current policy would create a problem for the 
developer, requiring them to explicitly work around the issue by declaring 
an artificially "better" adapter path to resolve the ambiguity.

As far as I can tell, this scenario will remain entirely theoretical until 
there are at least two packages out there with interfaces that need 
bridging, and two more packages exist that do the bridging, that someone 
might want to use at the same time.  I think that this will take a while.  :)

In the meantime, all other adapter ambiguities are suggestive of a possible 
programming or design error, such as using interface inheritance to denote 
what an interface requires instead of what it provides, incorrectly 
claiming that something is a universal (interface-to-interface) adapter 
when it is only suitable for certain concrete classes, etc.



>Personally, I disagree with having transitivity at all, unless perhaps it 
>be restricted to adaptations specifically and explicitly stated to be 
>"perfect and lossless"; PJE claims that ALL adaptations MUST, ALWAYS, be 
>"perfect and lossless" -- essentially, it seems to me, he _has_ to claim 
>that, to defend transitivity being applied automatically, relentlessly, 
>NON-optionally, NON-selectively (but then the idea of giving an error when 
>two or more shortest-paths have the same length becomes dubious).

No, it follows directly from the premise.  If adapters are non-noisy, why 
do you need more than one adapter chain of equal length between two 
points?  If you have such a condition, you have a redundancy at the least, 
and more likely a programming error -- surely BOTH of those adapters are 
not correct, unless you have that excruciatingly-rare case I mentioned above.
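The premise can be checked mechanically. A sketch of a shortest-path lookup that treats two equal-length paths to the target as an error (invented code, not PyProtocols' actual implementation) looks like:

```python
from collections import deque

def shortest_adapter_path(edges, src, dst):
    """Breadth-first search over a registry of one-step adapters
    (edges: dict mapping node -> set of directly reachable nodes).
    Raises TypeError when two distinct shortest paths tie at the target,
    mirroring the ambiguity check described above."""
    paths = {src: [src]}
    frontier = deque([src])
    found = []
    while frontier and not found:
        # Expand one BFS level at a time so same-length ties surface together.
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for nxt in edges.get(node, ()):
                if nxt == dst:
                    found.append(paths[node] + [nxt])
                elif nxt not in paths:
                    paths[nxt] = paths[node] + [nxt]
                    frontier.append(nxt)
    if len(found) > 1:
        raise TypeError("ambiguous adaptation, equal-length paths: %r" % found)
    if not found:
        raise TypeError("no adapter path from %r to %r" % (src, dst))
    return found[0]
```

With A->B1->C and A->B2->C both registered, the lookup raises; with a single chain, it returns the path.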



>BTW, Microsoft's COM's interfaces ONLY have the "inferior" kind of 
>inheritance.  You can say that interface ISub inherits from IBas: this 
>means that ISub has all the same methods as IBas with the same signatures, 
>plus it may have other methods; it does *NOT* mean that anything 
>implementing ISub must also implement IBas, nor that a QueryInterface on 
>an ISub asking for an IBas must succeed, or anything of that kind.  In 
>many years of COM practice I have NEVER found this issue to be a 
>limitation -- it works just fine.

I'm actually open to at least considering dropping interface inheritance 
transitivity, due to its actual problems in practice.  Fewer than half of 
the interfaces in PEAK do any inheritance, so having to explicitly declare 
that one interface implies another isn't a big deal.

Such a practice might seem very strange to Java programmers, however, since 
it means that if you declare (in Python) a method to take IBas, it will not 
accept an ISub, unless the object has explicitly declared that it supports 
both.  (Whereas in Java it suffices for the class to declare that it 
supports ISub.)


From cce at clarkevans.com  Wed Jan 12 20:57:11 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Wed Jan 12 20:57:15 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <ca471dc205011210166c14e3f4@mail.gmail.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011210166c14e3f4@mail.gmail.com>
Message-ID: <20050112195711.GA1813@prometheusresearch.com>

On Wed, Jan 12, 2005 at 10:16:14AM -0800, Guido van Rossum wrote:
| But now, since I am still in favor of automatic "combined" adaptation
| *as a last resort*, I ask you to consider that Python is not C++, and
| that perhaps we can make the experience in Python better than it was
| in C++. Perhaps allowing more control over when automatic adaptation
| is acceptable?
| 
| For example, interface B (or perhaps this should be a property of the
| adapter for B->C?) might be marked so as to allow or disallow its
| consideration when looking for multi-step adaptations. We could even
| make the default "don't consider", so only people who have to deal
| with the multiple A's and/or multiple C's all adaptable via the same B
| could save themselves some typing by turning it on.

How about not allowing transitive adaptation, by default, and
then providing two techniques to help the user cope:

  - raise an AdaptIsTransitive(AdaptationError) exception when
    an adaptation has failed, but there exists a A->C pathway
    using an intermediate B 

  - add a flag to adapt, allowTransitive, which defaults to False

This way new developers don't accidentally shoot their foot off, as
Alex warns; however, the price for doing this sort of thing is small.
The AdaptIsTransitive error could even explain the problem with a
dynamic error message like:

  "You've tried to adapt a LDAPName to a FirstName, but no
   direct translation exists.  There is an indirect translation
   using FullName:  LDAPName -> FullName -> FirstName.  If you'd
   like to use this intermediate object, simply call adapt()
   with allowTransitive = True"
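A sketch of what such an adapt() could look like (all names here are invented for illustration, and only two-step paths are checked):

```python
registry = {}    # (source, target) -> adapter callable

class AdaptationError(TypeError):
    pass

class AdaptIsTransitive(AdaptationError):
    def __init__(self, path):
        self.path = path
        names = " -> ".join(getattr(p, "__name__", str(p)) for p in path)
        super().__init__("no direct adaptation; an indirect path exists: %s. "
                         "Call adapt() with allowTransitive=True to use it."
                         % names)

def adapt(obj, target, allowTransitive=False):
    src = type(obj)
    if (src, target) in registry:
        return registry[(src, target)](obj)
    # Look for a two-step path through some intermediate B.
    for (a, mid), first in list(registry.items()):
        if a is src and (mid, target) in registry:
            if not allowTransitive:
                raise AdaptIsTransitive([src, mid, target])
            return registry[(mid, target)](first(obj))
    raise AdaptationError("cannot adapt %r to %s" % (obj, target))
```

With LDAPName->FullName and FullName->FirstName registered, a plain adapt() call raises the explanatory exception; passing allowTransitive=True composes the two adapters.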

Clark
From aleax at aleax.it  Wed Jan 12 20:59:28 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 20:59:34 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
References: <BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
Message-ID: <777FF1CC-64D4-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 20:06, Phillip J. Eby wrote:

> At 10:16 AM 1/12/05 -0800, Guido van Rossum wrote:
>> For example, interface B (or perhaps this should be a property of the
>> adapter for B->C?) might be marked so as to allow or disallow its
>> consideration when looking for multi-step adaptations. We could even
>> make the default "don't consider", so only people who have to deal
>> with the multiple A's and/or multiple C's all adaptable via the same B
>> could save themselves some typing by turning it on.
>
> Another possibility; I've realized from Alex's last mail that there's 
> a piece of my reasoning that I haven't been including, and now I can 
> actually explain it clearly (I hope).  In my view, there are at least 
> two kinds of adapters, with different fidelity 
> requirements/difficulty:
>
>    class     -> interface  ("lo-fi" is okay)
>    interface -> interface  (It better be perfect!)
>
> If you cannot guarantee that your interface-to-interface adapter is 
> the absolute best way to adapt *any* implementation of the source 
> interface, you should *not* treat it as an interface-to-interface 
> adapter, but rather as a class-to-interface adapter for the specific 
> classes that need it.  And, if transitivity exists, it is now 
> restricted to a sensible subset of the possible paths.

Even though Guido claimed I have been belaboring the following point, I 
do think it's crucial and I still haven't seen you answer it.  If *any* 
I1->I2 adapter, by the very fact of its existence, asserts it's the 
*absolute best way* to adapt ANY implementation of I1 into I2; then why 
should the existence of two equal-length shortest paths A->I1->I2 and 
A->I3->I2 be considered a problem in any sense?  Pick either, at random 
or by whatever rule: you trust that they're BOTH the absolute best, so 
they must be absolutely identical anyway.

If you agree that this is the only sensible behavior, and that PyProtocols' 
current behavior (TypeError for two paths of equal length save in a few 
special cases) should change accordingly, then I guess I can accept your 
stance that providing 
adaptation between interfaces implies the strongest possible degree of 
commitment to perfection, and that this new conception of *absolute 
best way* entirely and totally replaces previous weaker and more 
sensible descriptions, such as for example in 
<http://peak.telecommunity.com/protocol_ref/proto-implication.html> 
that shorter chains "are less likely to be a ``lossy'' conversion".  
``less likely'' and ``absolute best way'' just can't coexist.  Two 
"absolute best ways" to do the same thing are exactly equally likely to 
be ``lossy'': that likelihood is ZERO, if "absolute" means anything.
((Preferring shorter chains as a heuristic for faster ones may be very 
reasonable approach if performance is a secondary consideration, as 
I've already mentioned; if performance were more important than that, 
then other ``costs'' besides the extreme NO_ADAPTER_NEEDED [[0 cost]] 
and DOES_NOT_SUPPORT [[infinite cost]] should be accepted, and the 
minimal-cost path ensured -- I do not think any such complication is 
warranted)).

> I think maybe this gets us a little bit closer to having a unified (or 
> at least unifiable) view on the problem area.  If Alex agrees that 
> class-to-interface adaptation is an acceptable solution for limiting 
> the transitivity of noisy adaptation while still allowing some degree 
> of implicitness, then maybe we have a winner.

If you agree that it cannot be an error to have two separate paths of 
"absolute best ways" (thus equally perfect) of equal length, then I can 
accept your stance that one must ensure the "absolute best way" each 
time one codes and registers an I -> I adapter (and each time one 
interface inherits another interface, apparently); I can then do half 
the rewrite of the PEP 246 draft (the changes already mentioned and 
roughly agreed) and turn it over to you as new first author to complete 
with the transitivity details &c.

If there is any doubt whatsoever marring that perfection, that 
"absolute best way", then I fear we're back at square one.


Alex

From pje at telecommunity.com  Wed Jan 12 21:03:28 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 21:02:44 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <51767A6A-64B3-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <ca471dc2050112074562485f50@mail.gmail.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011206001a5a3805@mail.gmail.com>
	<0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050112074562485f50@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050112145313.02eb26b0@mail.telecommunity.com>

At 05:02 PM 1/12/05 +0100, Alex Martelli wrote:
>So, I think PEP 246 should specify that the step now called (e) [checking 
>the registry] comes FIRST; then, an isinstance step [currently split 
>between (a) and (d)], then __conform__ and __adapt__ steps [currently 
>called (b) and (c)].

One question, and one suggestion.

The question: should the registry support explicitly declaring that a 
particular adaptation should *not* be used, thus pre-empting later phases 
entirely?  This would allow for the possibility of speeding lookups by 
caching, as well as the option to "opt out" of specific adaptations, which 
some folks seem to want.  ;)

The suggestion: rather than checking isinstance() in adapt(), define 
object.__conform__ such that it does the isinstance() check.  Then, Liskov 
violation is simply a matter of returning 'None' from __conform__ instead 
of raising a special error.


>   Checking the registry is after all very fast: make the 2-tuple 
> (type(obj), protocol), use it to index into the registry -- period.  So, 
> it's probably not worth complicating the semantics at all just to "fast 
> path" the common case.

Okay, one more suggestion/idea:

$ timeit -s "d={}; d[1,2]=None" "d[1,2]"
1000000 loops, best of 3: 1.65 usec per loop

$ timeit -s "d={}; d[1]={2:None}" "d[1][2]"
1000000 loops, best of 3: 0.798 usec per loop

This seems to suggest that using nested dictionaries could be faster under 
some circumstances than creating the two-tuple to do the lookup.  Of 
course, these are trivially-sized dictionaries and this is also measuring 
Python bytecode speed, not what would happen in C.  But it suggests that 
more investigation might be in order.
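The shell measurements above can also be scripted with the timeit module; absolute numbers vary by machine, and on modern interpreters the ranking itself may differ (constant tuples can be folded away):

```python
import timeit

# Script the tuple-key vs nested-dict comparison from the shell session above.
tuple_key = timeit.timeit("d[1, 2]", setup="d = {}; d[1, 2] = None",
                          number=100000)
nested = timeit.timeit("d[1][2]", setup="d = {}; d[1] = {2: None}",
                       number=100000)
print("tuple key: %.4fs   nested dicts: %.4fs" % (tuple_key, nested))
```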

From skip at pobox.com  Wed Jan 12 21:03:30 2005
From: skip at pobox.com (Skip Montanaro)
Date: Wed Jan 12 21:03:59 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <1105553300.41e56794d1fc5@mcherm.com>
References: <1105553300.41e56794d1fc5@mcherm.com>
Message-ID: <16869.33426.883395.345417@montanaro.dyndns.org>


    Michael> This must be one of those cases where I am misled by my
    Michael> background...  I thought of Liskov substitution principle as a
    Michael> piece of basic CS background that everyone learned in school
    Michael> (or from the net, or wherever they learned
    Michael> programming). Clearly, that's not true.

Note that some us were long out of school by the time Barbara Liskov first
published the idea (in 1988 according to
http://c2.com/cgi/wiki?LiskovSubstitutionPrinciple).  Also, since it
pertains to OO programming it was probably not taught widely until the
mid-90s.  That means a fair number of people will have never heard about it.

    Michael> Guido writes:
    >> How about SubstitutabilityError?

I don't think that's any better.  At the very least, people can Google for
"Liskov violation" to educate themselves.  I'm not sure that the results of
a Google search for "Substitutability Error" will be any clearer.

    Michael> It would be less precise and informative to ME but apparently
    Michael> more so to a beginner. Obviously, we should support the
    Michael> beginner!

I don't think that's appropriate in this case.  Liskov violation is
something precise.  I don't think that changing what you call it will help
beginners understand it any better in this case.  I say leave it as it is and
make sure it's properly documented.

Skip

From aleax at aleax.it  Wed Jan 12 21:05:54 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 21:05:59 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com>
References: <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com>
Message-ID: <5DAAD430-64D5-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 20:39, Phillip J. Eby wrote:
    ...
>> > it's difficult because intuitively an interface defines a 
>> *requirement*, so
>> > it seems logical to inherit from an interface in order to add 
>> requirements!
>>
>> Yes... I would fall into this trap as well until I'd been burned a 
>> few times.
>
> It's burned me more than just a few times, and I *still* sometimes 
> make it if I'm not paying attention.  It's just too easy to make the 
> mistake.  So, I'm actually open to considering dropping interface 
> inheritance.

What about accepting Microsoft's QueryInterface precedent for this?  I 
know that "MS" is a dirty word to many, but I did like much of what 
they did in COM, personally.  The QI precedent would be: you can 
inherit interface from interface, but that does NOT intrinsically imply 
substitutability -- it just means the inheriting interface has all the 
methods of the one being subclassed, with the same signatures, without 
having to do a nasty copy-and-paste.  Of course, one presumably could 
use NO_ADAPTER_NEEDED to easily (but explicitly: that makes a 
difference!) implement the common case in which the inheriting 
interface DOES want to assert that it's perfectly / losslessly / etc 
substitutable for the one being inherited.
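A toy model of the QI precedent makes the point: subclassing the interface shares signatures, but a query for the base interface fails until an explicit NO_ADAPTER_NEEDED declaration is made (all names invented for illustration):

```python
def NO_ADAPTER_NEEDED(obj):
    # Identity "adapter": explicitly asserts lossless substitutability.
    return obj

adapters = {}   # (source interface, target interface) -> adapter

class IBas: pass
class ISub(IBas): pass   # shares IBas's method signatures, nothing more

def query_interface(obj, iface, provides):
    """QI-style lookup: ISub inheriting from IBas does NOT let an ISub
    provider answer for IBas unless that is declared explicitly."""
    if iface in provides:
        return obj
    for src in provides:
        if (src, iface) in adapters:
            return adapters[(src, iface)](obj)
    raise TypeError("interface %s not supported" % iface.__name__)
```

Only after `adapters[(ISub, IBas)] = NO_ADAPTER_NEEDED` does a query for IBas on an ISub provider succeed, which is the explicit-but-easy declaration described above.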


Alex

From pje at telecommunity.com  Wed Jan 12 21:14:08 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 21:13:25 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5DAAD430-64D5-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com>
	<5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050112150933.03378630@mail.telecommunity.com>

At 09:05 PM 1/12/05 +0100, Alex Martelli wrote:

>On 2005 Jan 12, at 20:39, Phillip J. Eby wrote:
>>It's burned me more than just a few times, and I *still* sometimes make 
>>it if I'm not paying attention.  It's just too easy to make the 
>>mistake.  So, I'm actually open to considering dropping interface inheritance.
>
>What about accepting Microsoft's QueryInterface precedent for this?  I 
>know that "MS" is a dirty word to many, but I did like much of what they 
>did in COM, personally.  The QI precedent would be: you can inherit 
>interface from interface, but that does NOT intrinsically imply 
>substitutability -- it just means the inheriting interface has all the 
>methods of the one being subclassed, with the same signatures, without 
>having to do a nasty copy-and-paste.  Of course, one presumably could use 
>NO_ADAPTER_NEEDED to easily (but explicitly: that makes a difference!) 
>implement the common case in which the inheriting interface DOES want to 
>assert that it's perfectly / losslessly / etc substitutable for the one 
>being inherited.

Well, you and I may agree to this, but we can't agree on behalf of 
everybody else who hasn't been bitten by this problem, I'm afraid.

I checked PEAK and about 62 out of 150 interfaces inherited from anything 
else; it would not be a big deal to explicitly do the NO_ADAPTER_NEEDED 
thing, especially since PyProtocols has an 'advise' keyword that does the 
declaration, anyway; inheritance is just a shortcut for that declaration 
when you are using only one kind of interface, so the "explicit" way of 
defining NO_ADAPTER_NEEDED between two interfaces has only gotten used when 
mixing Zope or Twisted interfaces w/PyProtocols.

Anyway, I'm at least +0 on dropping this; the reservation is just because I 
don't think everybody else will agree with this, and don't want to be 
appearing to imply that consensus between you and me implies any sort of 
community consensus on this point.  That is, the adaptation from "Alex and 
Phillip agree" to "community agrees" is noisy at best!  ;)

From carribeiro at gmail.com  Wed Jan 12 21:21:38 2005
From: carribeiro at gmail.com (Carlos Ribeiro)
Date: Wed Jan 12 21:21:41 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <ca471dc205011210166c14e3f4@mail.gmail.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011210166c14e3f4@mail.gmail.com>
Message-ID: <864d370905011212213cf3aa16@mail.gmail.com>

On Wed, 12 Jan 2005 10:16:14 -0800, Guido van Rossum
<gvanrossum@gmail.com> wrote:
> But now, since I am still in favor of automatic "combined" adaptation
> *as a last resort*, I ask you to consider that Python is not C++, and
> that perhaps we can make the experience in Python better than it was
> in C++. Perhaps allowing more control over when automatic adaptation
> is acceptable?
> 
> For example, inteface B (or perhaps this should be a property of the
> adapter for B->C?) might be marked so as to allow or disallow its
> consideration when looking for multi-step adaptations. We could even
> make the default "don't consider", so only people who have to deal
> with the multiple A's and/or multiple C's all adaptable via the same B
> could save themselves some typing by turning it on.

+1. BTW, I _do_ use adaptation, including the 'lossy' one described in
this scenario (where the mapping is imperfect, or incomplete). So
having some way to tell the adaptation framework that a particular
adapter is not suited to use in a transitive chain is a good thing
IMHO. Generally speaking, anything that puts some control in the
hands of the programmer - as long as it does not stand between the
programmer and the problem - is good.

-- 
Carlos Ribeiro
Consultoria em Projetos
blog: http://rascunhosrotos.blogspot.com
blog: http://pythonnotes.blogspot.com
mail: carribeiro@gmail.com
mail: carribeiro@yahoo.com
From kbk at shore.net  Wed Jan 12 21:26:33 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed Jan 12 21:27:05 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <16869.33426.883395.345417@montanaro.dyndns.org> (Skip
	Montanaro's message of "Wed, 12 Jan 2005 14:03:30 -0600")
References: <1105553300.41e56794d1fc5@mcherm.com>
	<16869.33426.883395.345417@montanaro.dyndns.org>
Message-ID: <87oefu1jie.fsf@hydra.bayview.thirdcreek.com>

Skip Montanaro <skip@pobox.com> writes:

> I don't think that's appropriate in this case.  Liskov violation is
> something precise.  I don't think that changing what you call it will help
> beginners understand it any better in this case.  I say leave it as is and
> make sure it's properly documented.

+1
-- 
KBK
From just at letterror.com  Wed Jan 12 21:27:25 2005
From: just at letterror.com (Just van Rossum)
Date: Wed Jan 12 21:27:43 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <16869.33426.883395.345417@montanaro.dyndns.org>
Message-ID: <r01050400-1037-61B80D9A64D811D9BA7D003065D5E7E4@[10.0.0.23]>

Skip Montanaro wrote:

> 
>     Michael> This must be one of those cases where I am misled by my
>     Michael> background...  I thought of Liskov substitution principle 
>     Michael> as a piece of basic CS background that everyone learned 
>     Michael> in school (or from the net, or wherever they learned
>     Michael> programming). Clearly, that's not true.
> 
> Note that some us were long out of school by the time Barbara Liskov
> first published the idea (in 1988 according to
> http://c2.com/cgi/wiki?LiskovSubstitutionPrinciple).  Also, since it
> pertains to OO programming it was probably not taught widely until
> the mid-90s.  That means a fair number of people will have never
> heard about it.
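For readers without that background, the classic rectangle/square illustration of a Liskov violation (a generic sketch, not anything from PEP 246) shows why substitutability is more than sharing signatures:

```python
class Rectangle:
    def __init__(self, w, h):
        self.w, self.h = w, h
    def set_width(self, w):
        self.w = w
    def area(self):
        return self.w * self.h

class Square(Rectangle):
    def __init__(self, side):
        Rectangle.__init__(self, side, side)
    def set_width(self, w):
        # Keeps the square square, but silently breaks the contract
        # Rectangle's callers rely on (set_width leaves height alone).
        self.w = self.h = w

def stretch(rect):
    """Callers assume set_width does not touch the height."""
    rect.set_width(10)
    return rect.area()

# stretch(Rectangle(2, 5)) == 50, but stretch(Square(5)) == 100:
# Square is not substitutable for Rectangle here.
```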

...and then there are those Python users who have no formal CS
background at all. Python is used quite a bit by people who's main job
is not programming.

<sidebar>I'm one of those, and whatever I know about CS, I owe it mostly
to the Python community. I learned an awful lot just by hanging out on
various Python mailing lists.</sidebar>

>     Michael> Guido writes:
>     >> How about SubstitutabilityError?
> 
> I don't think that's any better.  At the very least, people can
> Google for "Liskov violation" to educate themselves.  I'm not sure
> that the results of a Google search for "Subtitutability Error" will
> be any clearer.

Well, with a bit of luck Google will point to the Python documentation
then...

Just
From aleax at aleax.it  Wed Jan 12 21:30:58 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 21:31:08 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112101324.03a77090@mail.telecommunity.com>
References: <79990c6b05011206001a5a3805@mail.gmail.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011206001a5a3805@mail.gmail.com>
	<5.1.1.6.0.20050112101324.03a77090@mail.telecommunity.com>
Message-ID: <DE3F5BF3-64D8-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 20:51, Phillip J. Eby wrote:
    ...
> There's a very simple reason.  If one is using only non-noisy 
> adapters, there is absolutely no reason to ever define more than one 
> adapter between the *same* two points.  If you do,

...but there's no harm whatsoever done, either.  If I have four 
interfaces I use regularly, A, B, C, D, and I have the need to adapt 
A->B, A->C, B->D, C->D, with every one of these four adaptations being 
"the absolute best way" (as you stated all interface adaptations must 
be), then why should that be at all a problem?  Maybe sometimes someone 
will need to adapt A->D, fine -- again, no harm whatsoever, IF 
everything is as perfect as it MUST be for transitivity to apply 
unconditionally.

Put it another way: say I have the first three of these adaptations, 
only, so everything is hunky-dory.  Now I come upon a situation where I 
need C->D, fine, I add it: where's the error, if every one of the four 
adaptations is just perfect?
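The diamond described here can be sketched with a toy registry (all of this is invented for illustration; PyProtocols' real algorithm tracks adapter costs, not just hop counts):

```python
from collections import deque

# Four "perfect" adapters: A->B, A->C, B->D, C->D.
EDGES = {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")}

def shortest_chains(src, dst):
    """Breadth-first search for ALL shortest adapter chains src -> dst."""
    paths, found_len = [], None
    queue = deque([[src]])
    while queue:
        path = queue.popleft()
        if found_len is not None and len(path) > found_len:
            break                       # longer than the shortest hit
        if path[-1] == dst:
            found_len = len(path)
            paths.append(path)
            continue
        for a, b in EDGES:
            if a == path[-1] and b not in path:
                queue.append(path + [b])
    return paths
```

With the four edges above, `shortest_chains("A", "D")` finds both A->B->D and A->C->D: harmless to pick either under the "all adapters are perfect" premise, but a tie that PyProtocols' ambiguity check would flag as an error.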

I admit I can't sharply follow your gyrations about what's in what 
package, who wrote what, and why the fact that interfaces and 
adaptation (particularly transitive adaptation) are NOT widespread at 
all so far (are only used by early adopters, on average heads and 
shoulders above the average Python coder) makes it MORE important to 
provide an error in a case that, by the premises, cannot be an error 
(would it be LESS important to provide the error if everybody and their 
cousin were interfacing and adapting with exuberance...?).  All I can
see is:

1. if an interface adapter must ABSOLUTELY be perfect, transitivity is 
fine, but the error makes no sense
2. if the error makes sense (or the assertion about "less likely to be 
lossy" makes any sense, etc etc), then transitivity is NOT fine -- 
adapters can be imperfect, and there is NO way to state that they are, 
one just gets an error message if one is SO DEUCEDLY LUCKY as to have 
created in the course of one's bumbling two shortest-paths of the same 
length

I suspect [2] holds.  But you're the one with experience, so if you 
stake that on [1], and the "absolute best way" unconditional assertion, 
then, fine, I guess, as per my previous message.  But the combination 
of "absolute best way" _AND_ an error when somebody adds C->D is, in my 
opinion, self-contradictory: experience or not, I can't support 
asserting something and its contrary at the same time.

> then somebody is doing something redundant, and there is a possibility 
> for error.  In

Not at all: each of the four above-listed adaptations may be needed to 
perform an unrelated adapt(...) operation.  How can you claim that set 
of four adaptations is REDUNDANT, when adding a FIFTH one (a direct 
A->D) would make it fine again per your rules?  This is the first time 
I've heard an implied claim that redundancy is something that can be 
eliminated by ADDING something, without taking anything away.



>> Personally, I disagree with having transitivity at all, unless 
>> perhaps it be restricted to adaptations specifically and explicitly 
>> stated to be "perfect and lossless"; PJE claims that ALL adaptations 
>> MUST, ALWAYS, be "perfect and lossless" -- essentially, it seems to 
>> me, he _has_ to claim that, to defend transitivity being applied 
>> automatically, relentlessly, NON-optionally, NON-selectively (but 
>> then the idea of giving an error when two or more shortest-paths have 
>> the same length becomes dubious).
>
> No, it follows directly from the premise.  If adapters are non-noisy, 
> why do you need more than one adapter chain of equal length between 
> two points?  If you have such a condition, you

I don't NEED the chain, but I may well need each step; and by the 
premise of "absolute best way" which you maintain, it must be innocuous 
if the separate steps I need end up producing more than one chain -- 
what difference can it make?!

> have a redundancy at the least, and more likely a programming error -- 
> surely BOTH of those adapters are not correct, unless you have that 
> excruciatingly-rare case I mentioned above.

Each of the FOUR adapters coded can be absolutely perfect.  Thus, the 
composite adapters which your beloved transitivity builds will also be 
perfect, and it will be absolutely harmless to pick one of them at 
random.


>> BTW, Microsoft's COM's interfaces ONLY have the "inferior" kind of 
>> inheritance.  You can say that interface ISub inherits from IBas: 
>> this means that ISub has all the same methods as IBas with the same 
>> signatures, plus it may have other methods; it does *NOT* mean that 
>> anything implementing ISub must also implement IBas, nor that a 
>> QueryInterface on an ISub asking for an IBas must succeed, or 
>> anything of that kind.  In many years of COM practice I have NEVER 
>> found this issue to be a limitation -- it works just fine.
>
> I'm actually open to at least considering dropping interface 
> inheritance transitivity, due to its actual problems in practice.  
> Fewer than half of the interfaces in PEAK do any inheritance, so 
> having to explicitly declare that one interface implies another isn't 
> a big deal.

Now that is something I'd really love, as per my previous msg.

> Such a practice might seem very strange to Java programmers, however, 
> since it means that if you declare (in Python) a method to take IBas, 
> it will not accept an ISub, unless the object has explicitly declared 
> that it supports both.  (Whereas in Java it suffices for the class to 
> declare that it supports ISub.)

Often the author of ISub will be able to declare support for IBas as 
well as inheriting (widening) of it; when that is not possible, the 
Java programmer, although surprised, will most likely be better off for 
having to be a tad more explicit.


Alex

From pje at telecommunity.com  Wed Jan 12 21:42:32 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 21:41:51 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <777FF1CC-64D4-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>

At 08:59 PM 1/12/05 +0100, Alex Martelli wrote:
>Even though Guido claimed I have been belaboring the following point, I do 
>think it's crucial and I still haven't seen you answer it.

My post on that probably crossed with this post of yours; it contains an 
excruciating analysis of why I chose to consider such paths 
dubious.  However, I'll briefly answer your specific questions 
here.  (Well, briefly for ME! ;) )


>   If *any* I1->I2 adapter, by the very fact of its existence, asserts 
> it's the *absolute best way* to adapt ANY implementation of I1 into I2; 
> then why should the existence of two equal-length shortest paths 
> A->I1->I2 and A->I3->I2 be considered a problem in any sense?  Pick 
> either, at random or by whatever rule: you trust that they're BOTH the 
> absolute best, so they must be absolutely identical anyway.

Because if you have asserted that it is the absolute best, why did you 
write *another* one that's equally good?  This suggests that at least one 
of the paths you ended up with was unintentional: created, for example, via 
inappropriate use of interface inheritance.

Anyway, the other post has a detailed analysis for all the circumstances I 
can think of where you *might* have such a set of ambiguous adapter paths, 
and why it's excruciatingly rare that you would not in fact care when such 
a situation existed, and why the error is therefore valuable in pointing 
out the (almost certainly unintended) duplication.


>If you agree that this is the only sensible behavior, and PyProtocols' 
>current behavior (TypeError for two paths of equal length save in a few 
>special cases), then I guess I can accept your stance that providing 
>adaptation between interfaces implies the strongest possible degree of 
>commitment to perfection, and that this new conception of *absolute best 
>way* entirely and totally replaces previous weaker and more sensible 
>descriptions, such as for example in 
><http://peak.telecommunity.com/protocol_ref/proto-implication.html> that 
>shorter chains "are less likely to be a ``lossy'' conversion".
>``less likely'' and ``absolute best way'' just can't coexist.  Two 
>"absolute best ways" to do the same thing are exactly equally likely to be 
>``lossy'': that likelihood is ZERO, if "absolute" means anything.

First off, as a result of our conversations here, I'm aware that the 
PyProtocols documentation needs updating; it was based on some of my 
*earliest* thinking about adaptation, before I realized how critical the 
distinction between class-to-interface and interface-to-interface 
adaptation really was.  And, my later thinking has only really been 
properly explained (even to my satisfaction!) during this 
discussion.  Indeed, there are lots of things I know now about when to 
adapt and when not to, that I had only the faintest idea of when I 
originally wrote the documentation.

Second, if the error PyProtocols produces became a problem in practice, it 
could potentially be downgraded to a warning, or even disabled 
entirely.  However, my experience with it has been that the *real* reason 
to flag adapter ambiguities is that they usually reveal some *other* 
problem, that would be much harder to find otherwise.


>((Preferring shorter chains as a heuristic for faster ones may be very 
>reasonable approach if performance is a secondary consideration, as I've 
>already mentioned; if performance were more important than that, then 
>other ``costs'' besides the extreme NO_ADAPTER_NEEDED [[0 cost]] and 
>DOES_NOT_SUPPORT [[infinite cost]] should be accepted, and the 
>minimal-cost path ensured -- I do not think any such complication is 
>warranted)).

Actually, the nature of the transitive algorithm PyProtocols uses is that 
it must track these running costs and pass them around anyway, so it is 
always possible to call one of its primitive APIs to force a certain cost 
consideration.  However, I have never actually had to use it, and I 
discourage others from playing with it, because I think the need to use it 
would be highly indicative of some other problem, like inappropriate use of 
adaptation or at least of I-to-I relationships.


>If you agree that it cannot be an error to have two separate paths of 
>"absolute best ways" (thus equally perfect) of equal length, then I can 
>accept your stance that one must ensure the "absolute best way" each time 
>one codes and registers an I -> I adapter (and each time one interface 
>inherits another interface, apparently); I can then do half the rewrite of 
>the PEP 246 draft (the changes already mentioned and roughly agreed) and 
>turn it over to you as new first author to complete with the transitivity 
>details &c.
>
>If there is any doubt whatsoever marring that perfection, that "absolute 
>best way", then I fear we're back at square one.

The only doubt is that somebody may have *erroneously* created a duplicate 
adapter, or an unintended duplicate path via a NO_ADAPTER_NEEDED link (e.g. 
by declaring that a class implements an interface directly, or interface 
inheritance).  Thus, even though such an adapter is technically correct and 
acceptable, it is only so if that's what you really *meant* to do.

But, if the adapters are *really* perfect, then by definition you are 
wasting your time defining "more than one way to do it", so it probably 
means you are making *some* kind of mistake, even if it's only the mistake 
of duplicating effort needlessly!  More likely, however, it means you have 
made some other mistake, like inappropriate interface inheritance.

At least 9 times out of 10, when I receive an ambiguous adapter path error, 
it's because I just added some kind of NO_ADAPTER_NEEDED link: either 
class-implements-interface, or interface-subclasses-interface, and I did so 
without having thought about the consequences of having that path.  The 
error tells me, "hey, you need to think about what you're doing here and be 
more explicit about what is going on, because there are some broader 
implications to what you just did."

It's not always immediately obvious how to fix it, but it's almost always 
obvious that I actually *have* done something wrong, as soon as I think 
about it; it's not just a spurious error.

Anyway, hopefully this post and the other one will be convincing that 
considering ambiguity to be an error *reinforces* the idea of I-to-I 
perfection, rather than undermining it.  (After all, if you've written a 
perfect one, and there's already one there, then either one of you is 
mistaken, or you are wasting your time writing one!)

From gvanrossum at gmail.com  Wed Jan 12 22:15:20 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan 12 22:15:26 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <20050112195711.GA1813@prometheusresearch.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011210166c14e3f4@mail.gmail.com>
	<20050112195711.GA1813@prometheusresearch.com>
Message-ID: <ca471dc20501121315227e3a89@mail.gmail.com>

[Clark]
>   - add a flag to adapt, allowTransitive, which defaults to False

That wouldn't work very well when most adapt() calls are invoked
implicitly through signature declarations (per my blog's proposal).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From pje at telecommunity.com  Wed Jan 12 22:50:52 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 22:50:11 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <DE3F5BF3-64D8-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050112101324.03a77090@mail.telecommunity.com>
	<79990c6b05011206001a5a3805@mail.gmail.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011206001a5a3805@mail.gmail.com>
	<5.1.1.6.0.20050112101324.03a77090@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050112154647.0327ca40@mail.telecommunity.com>

At 09:30 PM 1/12/05 +0100, Alex Martelli wrote:

>On 2005 Jan 12, at 20:51, Phillip J. Eby wrote:
>    ...
>>There's a very simple reason.  If one is using only non-noisy adapters, 
>>there is absolutely no reason to ever define more than one adapter 
>>between the *same* two points.  If you do,
>
>...but there's no harm whatsoever done, either.  If I have four interfaces 
>I use regularly, A, B, C, D, and I have the need to adapt A->B, A->C, 
>B->D, C->D, with every one of these four adaptations being "the absolute 
>best way" (as you stated all interface adaptations must be), then why 
>should that be at all a problem?

It isn't a problem, but *only* if A is an interface.  If it's a concrete 
class, then A->B and A->C are not "perfect" adapters, so it *can* make a 
difference which one you pick, and you should be explicit.

However, implementing an algorithm to ignore only interface-to-interface 
ambiguity is more complex than just hollering whenever *any* ambiguity is 
found.  Also, people make mistakes and may have declared something they 
didn't mean to.  The cost to occasionally be a bit more explicit is IMO 
outweighed by the benefit of catching bugs that might otherwise go 
unnoticed, but produce an ambiguity as a side-effect of the buggy part.

It's *possible* that you'd still catch almost as many bugs if you ignored 
pure I-to-I diamonds, but I don't feel entirely comfortable about giving up 
that extra bit of protection, especially since it would make the checker 
*more* complex to try to *not* warn about that situation.

Also, in general I'm wary of introducing non-determinism into a system's 
behavior.  I consider keeping e.g. the first path declared or the last path 
declared to be a form of non-determinism because it makes the system 
sensitive to trivial things like the order of import statements.  The 
current algorithm alerts you to this non-determinism.

Perhaps it would be simplest for Python's interface system to issue a 
warning about ambiguities, but allow execution to proceed?


>(would it be LESS important to provide the error if everybody and their 
>cousin were interfacing and adapting with exuberance...?)

Only in the use case where two people might legitimately create the same 
adapter, but neither of them can stop using their adapter in favor of the 
other person's, thus forcing them to work around the error.

Or, in the case where lots of people try to define adapter diamonds and 
don't want to go to the trouble of having their program behave 
deterministically.  :)


>1. if an interface adapter must ABSOLUTELY be perfect, transitivity is 
>fine, but the error makes no sense

The error only makes no sense if we assume that the human(s) really *mean* 
to be ambiguous.  Ambiguity suggests, however, that something *else* may be 
wrong.


>I suspect [2] holds.  But you're the one with experience, so if you stake 
>that on [1], and the "absolute best way" unconditional assertion, then, 
>fine, I guess, as per my previous message.  But the combination of 
>"absolute best way" _AND_ an error when somebody adds C->D is, in my 
>opinion, self-contradictory: experience or not, I can't support asserting 
>something and its contrary at the same time.

It's not contrary; it's a warning that "Are you sure you want to waste time 
writing another way to do the same thing when there's already a perfectly 
valid way to do it with a comparable number of adaptation steps 
involved?  Maybe your adapter is better-performing or less buggy in some 
way, but I'm just a machine so how would I know?  Please tell me which of 
these adapters is the *really* right one to use, thanks."  (Assuming that 
the machine is tactful enough to leave out mentioning that maybe you just 
made a mistake and declared the adapter between the wrong two points, you 
silly human you.)



>How can you claim that set of four adaptations is REDUNDANT, when adding a 
>FIFTH one (a direct A->D) would make it fine again per your rules?  This 
>is the first time I've heard an implied claim that redundancy is something 
>that can be eliminated by ADDING something, without taking anything away.

PyProtocols doesn't say the situation is redundant, it says it's 
*ambiguous*, which implies a *possible* redundancy.  I'm also saying that 
the ambiguity is nearly always (for me) an indicator of a *real* problem, 
not merely a not-so-explicit diamond.


>I don't NEED the chain, but I may well need each step; and by the premise 
>of "absolute best way" which you maintain, it must be innocuous if the 
>separate steps I need end up producing more than one chain -- what 
>difference can it make?!

Fair enough; however I think that in the event that the system must make 
such a choice, it must at *least* warn about non-deterministic 
behavior.  Even if you are claiming perfect adaptation, that doesn't 
necessarily mean you are correct in your claim!


>Each of the FOUR adapters coded can be absolutely perfect.  Thus, the 
>composite adapters which your beloved transitivity builds will also be 
>perfect, and it will be absolutely harmless to pick one of them at random.

Right, but the point of my examples was that in all but one extremely rare 
scenario, a *real*  ambiguity of this type is trivial to fix by being 
explicit.  *But*, it's more often the case that this ambiguity reflects an 
actual problem or error of some kind (at least IME to date), than that it 
indicates a harmless adapter diamond.  And when you attempt to make the 
path more explicit, you then discover what that other mistake was.  So, 
sometimes you are "wasting time" declaring that extra explicitness, and 
sometimes you save time because of it.  Whether this tradeoff is right for 
everybody, I can't say; it's a little bit like static typing, but OTOH ISTM 
that it comes up much less often than static typing errors do, and it only 
has to be fixed once for each diamond.  (i.e., it doesn't propagate into 
every possible aspect of your program.)

From ianb at colorstudy.com  Wed Jan 12 23:07:37 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed Jan 12 23:07:59 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
References: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>	<79990c6b05011205445ea4af76@mail.gmail.com>	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
Message-ID: <41E59FA9.4050605@colorstudy.com>

Phillip J. Eby wrote:
> Anyway, I'm honestly curious as to whether anybody can find a real 
> situation where transitive adapter composition is an *actual* problem, 
> as opposed to a theoretical one.  I've heard a lot of people talk about 
> what a bad idea it is, but I haven't heard any of them say they actually 
> tried it.  Conversely, I've also heard from people who *have* tried it, 
> and liked it.  However, at this point I have no way to know if this 
> dichotomy is just a reflection of the fact that people who don't like 
> the idea don't try it, and the people who either like the idea or don't 
> care are open to trying it.

I haven't read through the entire thread yet, so forgive me if I'm 
redundant.

One case occurred to me with the discussion of strings and files, i.e., 
adapting from a string to a file.  Let's say an IReadableFile, since 
files are too ambiguous.

Consider the case where we are using a path object, like Jason 
Orendorff's or py.path.  It seems quite reasonable and unambiguous that 
a string could be adapted to such a path object.  It also seems quite 
reasonable and unambiguous that a path object could be adapted to an 
IReadableFile by opening the file at the given path.  It's also quite 
unambiguous that a string could be adapted to a StringIO object, though 
I'm not sure it's reasonable.  In fact, it seems like an annoying but 
entirely possible case that some library would register such an adapter, 
and mess things up globally for everyone who didn't want such an 
adaptation to occur!  But that's an aside.  The problem is with the 
first example, where two seemingly innocuous adapters (string->path, 
path->IReadableFile) allow a new adaptation that could cause all sorts 
of problems (string->IReadableFile).
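That composition can be made concrete (the classes and adapters below are invented for illustration; no real library API is implied): two individually reasonable adapters combine into a string->IReadableFile adaptation that treats a *content* string as a *filename*.

```python
import io

class Path(object):
    def __init__(self, s):
        self.s = s

def str_to_path(s):        # innocuous on its own: a string names a path
    return Path(s)

def path_to_file(p):       # innocuous on its own: "open" the file at p
    return io.StringIO("<contents of %s>" % p.s)

# A transitive registry would compose the two silently; nobody
# registered string->IReadableFile on purpose, yet it now exists:
def implied_str_to_file(s):
    return path_to_file(str_to_path(s))
```

A function expecting file-like *content* that receives the string `"hello world"` would end up trying to open a file named "hello world" instead of wrapping the string in a StringIO.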

Ideally, if I had code that was looking for a file object and I wanted 
to accept filenames, I'd want to try to adapt to file, and if that 
failed I'd try to adapt to the path object and then from there to the 
file object.  Or if I wanted it to take strings (that represented 
content) or file-like objects, I'd adapt to a file object and if that 
failed I'd adapt to a string, then convert to a StringIO object.  A 
two-step adaptation encodes specific intention that it seems transitive 
adaptation would be blind to.
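The explicit two-step lookup described here might look like this (the `adapt()` registry and interface names are invented for illustration): the caller spells out "file first, then via a path", instead of leaving the chain to a transitive search.

```python
import io

_registry = {}   # (source type, target interface name) -> adapter

def register(src, target, fn):
    _registry[(src, target)] = fn

def adapt(obj, target):
    fn = _registry.get((type(obj), target))
    return fn(obj) if fn else None

class Path(object):
    def __init__(self, s):
        self.s = s

register(str, "IPath", Path)
register(Path, "IReadableFile",
         lambda p: io.StringIO("contents of " + p.s))

def as_readable(obj):
    f = adapt(obj, "IReadableFile")
    if f is not None:                    # already file-like
        return f
    p = adapt(obj, "IPath")
    if p is not None:                    # explicit two-step: via a path
        return adapt(p, "IReadableFile")
    raise TypeError("cannot make a readable file from %r" % (obj,))
```

The intent is visible at the call site: strings are accepted *because* they name paths, not because some registry happens to contain a string->file chain.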

As I think these things through, I'm realizing that registered 
adapters really should be 100% accurate (i.e., no information loss, 
complete substitutability), because a registered adapter that seems 
pragmatically useful in one place could mess up unrelated code, since 
registered adapters have global effects.  Perhaps transitivity seems 
dangerous because that has the potential to dramatically increase the 
global effects of those registered adapters.

-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
From steven.bethard at gmail.com  Wed Jan 12 23:19:09 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed Jan 12 23:19:11 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <41E59FA9.4050605@colorstudy.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
Message-ID: <d11dcfba05011214194adabad1@mail.gmail.com>

On Wed, 12 Jan 2005 16:07:37 -0600, Ian Bicking <ianb@colorstudy.com> wrote:
> One case occurred to me with the discussion of strings and files, i.e.,
> adapting from a string to a file.  Let's say an IReadableFile, since
> files are too ambiguous.
>
> Consider the case where we are using a path object, like Jason
> Orendorff's or py.path.  It seems quite reasonable and unambiguous that
> a string could be adapted to such a path object.  It also seems quite
> reasonable and unambiguous that a path object could be adapted to an
> IReadableFile by opening the file at the given path.

This strikes me as a strange use of adaptation -- I don't see how a
string can act-as-a path object, or how a path object can act-as-a
file.  I see that you might be able to *create* a path object from-a
string, or a file from-a path object, but IMHO this falls more into
the category of object construction than object adaptation...

Are these the sorts of things we can expect people to be doing with
adaptation?  Or is it really intended mainly for the act-as-a behavior
adaptation?  Or is in really intended mainly for the act-as-a behavior
that I had assumed...?

Steve
-- 
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy
From carribeiro at gmail.com  Wed Jan 12 23:26:09 2005
From: carribeiro at gmail.com (Carlos Ribeiro)
Date: Wed Jan 12 23:26:11 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <41E59FA9.4050605@colorstudy.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
Message-ID: <864d370905011214267edda37e@mail.gmail.com>

On Wed, 12 Jan 2005 16:07:37 -0600, Ian Bicking <ianb@colorstudy.com> wrote:
> As I think these things through, I'm realizing that registered
> adapters really should be 100% accurate (i.e., no information loss,
> complete substitutability), because a registered adapter that seems
> pragmatically useful in one place could mess up unrelated code, since
> registered adapters have global effects.  Perhaps transitivity seems
> dangerous because that has the potential to dramatically increase the
> global effects of those registered adapters.

To put it quite bluntly: many people never bother to implement the
_full_ interface of something if all they need is a half baked
implementation. For example, I may get away with a sequence-like
object in many situations without slice support in getitem, or a dict
with some of the iteration methods. Call it laziness, but this is
known to happen quite often, and Python is quite forgiving in this
respect. Add the global scope of the adapter registry & transitivity
to this and things may become much harder to debug...

...but on the other hand, transitivity is a powerful tool in the hands
of an expert programmer, and allows one to write much shorter & cleaner
code. Some balance is needed.

-- 
Carlos Ribeiro
Consultoria em Projetos
blog: http://rascunhosrotos.blogspot.com
blog: http://pythonnotes.blogspot.com
mail: carribeiro@gmail.com
mail: carribeiro@yahoo.com
From cce at clarkevans.com  Wed Jan 12 23:54:46 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Wed Jan 12 23:54:49 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <41E59FA9.4050605@colorstudy.com>
References: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
Message-ID: <20050112225446.GA43203@prometheusresearch.com>

On Wed, Jan 12, 2005 at 04:07:37PM -0600, Ian Bicking wrote:
| A two-step adaptation encodes specific intention that it seems transitive 
| adaptation would be blind to.

Exactly.  Nice example, Ian. To parrot your example a bit more
concretely, the problem happens when you get two different 
adaptation paths. 

   String -> PathName -> File
   String -> StringIO -> File

Originally, Python may ship with the String->StringIO and
StringIO->File adapters pre-loaded, and if my code was reliant upon
this transitive chain, the following will work just wonderfully,

    def parse(file: File):
        ...

    parse("helloworld")

by parsing "helloworld" content via a StringIO intermediate object.  But
then, let's say a new component "pathutils" registers another adapter pair:

   String->PathName and PathName->File

This ambiguity causes a few problems:

  - How does one determine which adapter path to use?
  - If a different path is picked, what sort of subtle bugs occur?
  - If the default path isn't what you want, how do you specify 
    the other path?
 
I think Phillip's suggestion is the only reasonable one here: ambiguous
cases are an error; ask the user to register the adapter they need, or
do a specific cast when calling parse().

| As I think these things through, I'm realizing that registered 
| adapters really should be 100% accurate (i.e., no information loss, 
| complete substitutability), because a registered adapter that seems 
| pragmatically useful in one place could mess up unrelated code, since 
| registered adapters have global effects.

I think this isn't all that useful; it's unrealistic to assume that
adapters are always perfect.   If transitive adaptation is even
permitted, it should be unambiguous.  Demanding that adaptation is
100% perfect is a matter of perspective.  I think String->StringIO
and StringIO->File are perfectly pure.

| Perhaps transitivity seems dangerous because that has the potential to 
| dramatically increase the global effects of those registered adapters.

I'd prefer,
  
    1. adaptation to _not_ be transitive (be explicit)

    2. a simple mechanism for a user to register an explicit
       adaptation path from a source to a destination:

           adapt.path(String,PathName,File)

       to go from String->File, using PathName as an intermediate.

    3. an error message, AdaptationError, to list all possible
       adaptation paths:

          Could not convert 'String' object to 'File' because
          there is no suitable adapter.  Please consider an
          explicit conversion, or register a composite adapter
          with one of the following paths:

             adapt.path(String,PathName,File)
             adapt.path(String,StringIO,File)

    4. raise an exception when _registering_ a 'path' which would
       conflict with any existing adapter:
       
          "Could not complete adapt.path(String,PathName,File) 
           since an existing direct adapter from String to Path
           already exists."
           
          "Could not complete adapt.path(String,PathName,File)
           since an existing path String->StringIO->File is
           already registered".

I'd rather have the latter error occur when "importing" modules
rather than at run-time.  This way, the exception is pinned on
the correct library developer.
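A rough sketch of how such a registration-time conflict check might look (`AdapterRegistry` and its `path()` method are hypothetical, modeled on the `adapt.path()` calls above):

```python
class AdaptationConflict(Exception):
    pass

class AdapterRegistry:
    # Hypothetical sketch of the proposal: composite adaptation paths
    # are registered explicitly, and a second route between the same
    # (source, destination) pair fails loudly at registration time,
    # i.e. at import time, pinning the error on the right library.
    def __init__(self):
        self._routes = {}

    def path(self, *steps):
        src, dst = steps[0], steps[-1]
        if (src, dst) in self._routes:
            raise AdaptationConflict(
                "a route from %s to %s is already registered: %s"
                % (src, dst, "->".join(self._routes[(src, dst)])))
        self._routes[(src, dst)] = steps

reg = AdapterRegistry()
reg.path("String", "StringIO", "File")   # e.g. shipped with Python
try:
    reg.path("String", "PathName", "File")  # e.g. added by "pathutils"
except AdaptationConflict:
    conflict = True  # the ambiguity is rejected, not silently resolved
```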

Best,

Clark
From andrewm at object-craft.com.au  Wed Jan 12 23:55:25 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Wed Jan 12 23:55:32 2005
Subject: [Python-Dev] Re: [Csv] csv module and universal newlines 
In-Reply-To: <16868.33914.837771.954739@montanaro.dyndns.org> 
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<20050110044441.250103C889@coffee.object-craft.com.au>
	<16868.33914.837771.954739@montanaro.dyndns.org>
Message-ID: <20050112225525.236BE3C889@coffee.object-craft.com.au>

>You can argue that reading csv data from/writing csv data to a file on
>Windows if the file isn't opened in binary mode is an error.  Perhaps we
>should enforce that in situations where it matters.  Would this be a start?
>
>    terminators = {"darwin": "\r",
>                   "win32": "\r\n"}
>
>    if (dialect.lineterminator != terminators.get(sys.platform, "\n") and
>       "b" not in getattr(f, "mode", "b")):
>       raise IOError, ("%s not opened in binary mode" %
>                       getattr(f, "name", "???"))
>
>The elements of the postulated terminators dictionary may already exist
>somewhere within the sys or os modules (if not, perhaps they should be
>added).  The idea of the check is to enforce binary mode on those objects
>that support a mode if the desired line terminator doesn't match the
>platform's line terminator.

Where that falls down, I think, is where you want to read an alien
file - in fact, under unix, most of the CSV files I read use \r\n for
end-of-line.

Also, I *really* don't like the idea of looking for a mode attribute
on the supplied iterator - it feels like a layering violation. We've
advertised the fact that it's an iterator, so we shouldn't be using
anything but the iterator protocol.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From Jack.Jansen at cwi.nl  Thu Jan 13 00:02:39 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Thu Jan 13 00:02:23 2005
Subject: [Python-Dev] Re: [Csv] csv module and universal newlines
In-Reply-To: <16868.33914.837771.954739@montanaro.dyndns.org>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<20050110044441.250103C889@coffee.object-craft.com.au>
	<16868.33914.837771.954739@montanaro.dyndns.org>
Message-ID: <0E6093F4-64EE-11D9-B7C6-000D934FF6B4@cwi.nl>


On 12-jan-05, at 2:59, Skip Montanaro wrote:
>     terminators = {"darwin": "\r",
>                    "win32": "\r\n"}
>
>     if (dialect.lineterminator != terminators.get(sys.platform, "\n") 
> and
>        "b" not in getattr(f, "mode", "b")):
>        raise IOError, ("%s not opened in binary mode" %
>                        getattr(f, "name", "???"))

On MacOSX you really want universal newlines. CSV files produced by 
older software (such as AppleWorks) will have \r line terminators, but 
lots of other programs will have files with normal \n terminators.
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

From pje at telecommunity.com  Thu Jan 13 00:06:34 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 00:05:53 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <d11dcfba05011214194adabad1@mail.gmail.com>
References: <41E59FA9.4050605@colorstudy.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
Message-ID: <5.1.1.6.0.20050112180414.02f886a0@mail.telecommunity.com>

At 03:19 PM 1/12/05 -0700, Steven Bethard wrote:
>On Wed, 12 Jan 2005 16:07:37 -0600, Ian Bicking <ianb@colorstudy.com> wrote:
> > One case occurred to me with the discussion of strings and files, i.e.,
> > adapting from a string to a file.  Let's say an IReadableFile, since
> > files are too ambiguous.
> >
> > Consider the case where we are using a path object, like Jason
> > Orendorff's or py.path.  It seems quite reasonable and unambiguous that
> > a string could be adapted to such a path object.  It also seems quite
> > reasonable and unambiguous that a path object could be adapted to an
> > IReadableFile by opening the file at the given path.
>
>This strikes me as a strange use of adaptation -- I don't see how a
>string can act-as-a path object, or how a path object can act-as-a
>file.

I see the former, but not the latter.  A string certainly can act-as-a path 
object; there are numerous stdlib functions that take a string and then use 
it "as a" path object.  In principle, a future version of Python might take 
path objects for these operations, and automatically adapt strings to them.

But a path can't act as a file; that indeed makes no sense.

From pje at telecommunity.com  Thu Jan 13 00:09:31 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 00:08:52 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <41E59FA9.4050605@colorstudy.com>
References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>

At 04:07 PM 1/12/05 -0600, Ian Bicking wrote:
>It also seems quite reasonable and unambiguous that a path object could be 
>adapted to an IReadableFile by opening the file at the given path.

Not if you think of adaptation as an "as-a" relationship, like using a 
screwdriver "as a" hammer (really an IPounderOfNails, or some such).  It 
makes no sense to use a path "as a" readable file, so this particular 
adaptation is bogus.


>The problem is with the first example, where two seemingly innocuous 
>adapters (string->path, path->IReadableFile) allow a new adaptation that 
>could cause all sorts of problems (string->IReadableFile).

Two problems with this thought:

1) path->IReadableFile is bogus

2) even if you had path->IReadableFile, you're not broken unless you extend 
transitivity to pass through concrete target types (which I really don't 
recommend)


>Ideally, if I had code that was looking for a file object and I wanted to 
>accept filenames, I'd want to try to adapt to file, and if that failed I'd 
>try to adapt to the path object and then from there to the file object.

There are two reasonable ways to accomplish this.  You can have code that 
expects an open stream -- in which case what's the harm in wrapping 
"open()" around the value you pass if you want it to be opened?  OR, you 
can have code that expects an "openable stream", in which case you can pass 
it any of these:

1. an already-open stream (that then adapts to an object with a trivial 
'open()' method),
2. a path object that implements "openable stream"
3. a string that adapts to "openable stream" by conversion to a path object

The only thing you can't implicitly pass in that case is a 
string-to-be-a-StringIO; you have to explicitly make it a StringIO.

In *either* case, you can have a string adapt to either a path object or to 
a StringIO; you just can't have both then come back to a common interface.



>As I think these things through, I'm realizing that registered adapters 
>really should be 100% accurate (i.e., no information loss, complete 
>substitutability), because a registered adapter that seems pragmatically 
>useful in one place could mess up unrelated code, since registered 
>adapters have global effects.  Perhaps transitivity seems dangerous 
>because that has the potential to dramatically increase the global effects 
>of those registered adapters.

However, if you:

1) have transitivity only for interface-to-interface relationships 
(allowing only one class-to-interface link at the start of the path), and

2) use adaptation only for "as a" relationships, not to represent 
operations on objects

you avoid these problems.  For example, avoiding the one adapter you 
presented that's not "as a", the adapter diamond becomes a triangle.

The longer the discussion goes on, however, the more I realize that like 
the internet, transitivity depends on the continued goodwill of your 
neighbors, and it only takes one fool to ruin things for a lot of 
people.  On the other hand, I also hate the idea of having to kludge 
workarounds like the one James Knight was doing, in order to get a simple 
adaptation to work.

From sxanth at cs.teiath.gr  Thu Jan 13 10:26:51 2005
From: sxanth at cs.teiath.gr (stelios xanthakis)
Date: Thu Jan 13 00:19:13 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <16869.33426.883395.345417@montanaro.dyndns.org>
References: <1105553300.41e56794d1fc5@mcherm.com>
	<16869.33426.883395.345417@montanaro.dyndns.org>
Message-ID: <41E63EDB.40008@cs.teiath.gr>

Skip Montanaro wrote:

>    Michael> Guido writes:
>    >> How about SubstitutabilityError?
>
>I don't think that's any better.  At the very least, people can Google for
>"Liskov violation" to educate themselves.  I'm not sure that the results of
>a Google search for "Subtitutability Error" will be any clearer
>
...

>
>I don't think that's appropriate in this case.  Liskov violation is
>something precise.  I don't think that changing what you call it will help
>beginners understand it any better in this case.  I say leave it as it and
>make sure it's properly documented.
>
>  
>

Yes, but in order to run into a Liskov violation, one has to use
extreme OOP features (as I understand from the ongoing discussion, of
which, honestly, I understand nothing :).  So it's not as if it will
happen often, and when it does happen it will make sense to the
architects who built such complex things.

+1 on SubstitutabilityError or something easier, not least because some
people really don't care who Liskov is, what he/she discovered, or
whether that same thing would have been discovered two months later by
somebody else anyway.


St.

From tjreedy at udel.edu  Thu Jan 13 00:22:47 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu Jan 13 00:23:01 2005
Subject: [Python-Dev] Re: Recent IBM Patent releases
References: <cs3o2p$4t1$1@sea.gmane.org>
Message-ID: <cs4bg7$jd5$1@sea.gmane.org>


"Scott David Daniels" <Scott.Daniels@Acm.Org>
> IBM has recently released 500 patents for use in opensource code.
>
>     http://www.ibm.com/ibm/licensing/patents/pledgedpatents.pdf
>
>     "...In order to foster innovation and avoid the possibility that a
>     party will take advantage of this pledge and then assert patents or
>     other intellectual property rights of its own against Open Source
>     Software, thereby limiting the freedom of IBM or any other Open
>     Source developer to create innovative software programs, the
>     commitment not to assert any of these 500 U.S. patents and all
>     counterparts of these patents issued in other countries is
>     irrevocable except that IBM reserves the right to terminate this
>     patent pledge and commitment only with regard to any party who files
>     a lawsuit asserting patents or other intellectual property rights
>     against Open Source Software."

The exception is, of course, aimed for now at SCO and its ridiculous 
lawsuit against IBM with respect to Linux.

from another post
> I believe our current policy is that the author warrants that the code
> is his/her own work and not encumbered by any patent.

Without a qualifier such as 'To the best of my knowledge', the latter is an 
impossible warrant both practically, for an individual author without 
$1000s to spend on a patent search, and legally.  Legally, there is no 
answer until the statute of limitations runs out or until there is an 
after-the-fact final answer provided by the court system.

Terry J. Reedy



From fredrik at pythonware.com  Thu Jan 13 00:32:14 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Jan 13 00:32:12 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
References: <16869.33426.883395.345417@montanaro.dyndns.org>
	<r01050400-1037-61B80D9A64D811D9BA7D003065D5E7E4@[10.0.0.23]>
Message-ID: <cs4c1f$qhk$1@sea.gmane.org>

Just van Rossum wrote:

> ...and then there are those Python users who have no formal CS
> background at all. Python is used quite a bit by people who's main job
> is not programming.

...and among us who do programming as a main job, I can assure you that
I'm not the only one who, if told by a computer that something I did was
a LSP violation, would take that computer out in the backyard and shoot
it.  Or at least hit it with a shovel, or something.

</F> 



From pje at telecommunity.com  Thu Jan 13 01:49:06 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 01:48:27 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <16869.33426.883395.345417@montanaro.dyndns.org>
References: <1105553300.41e56794d1fc5@mcherm.com>
	<1105553300.41e56794d1fc5@mcherm.com>
Message-ID: <5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com>

At 02:03 PM 1/12/05 -0600, Skip Montanaro wrote:

>I don't think that's appropriate in this case.  Liskov violation is
>something precise.  I don't think that changing what you call it will help
>beginners understand it any better in this case.  I say leave it as it and
>make sure it's properly documented.

Actually, the whole discussion is kind of backwards; you should never *get* 
a Liskov violation error, because it's raised strictly for control flow 
inside of __conform__ and caught by adapt().  So the *only* way you can see 
this error is if you call __conform__ directly, and somebody added code 
like this:

     raise LiskovViolation

So, it's not something you need to worry about a newbie seeing.  The *real* 
problem with the name is knowing that you need to use it in the first place!

IMO, it's simpler to handle this use case by letting __conform__ return 
None, since this allows people to follow the One Obvious Way to not conform 
to a particular protocol.

Then, there isn't a need to even worry about the exception name in the 
first place, either...
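A sketch of the return-None convention (hypothetical `adapt` and protocol names here, not the actual PEP 246 reference code):

```python
class IStack:
    # Hypothetical protocol marker.
    pass

class Sequence:
    def __conform__(self, protocol):
        # Returning None is the One Obvious Way to decline a
        # protocol -- no LiskovViolation-style exception needed.
        if protocol is IStack:
            return None
        return self

def adapt(obj, protocol, default=None):
    # Sketch of adapt() under this scheme: a None result from
    # __conform__ simply means "does not conform".
    conform = getattr(type(obj), "__conform__", None)
    if conform is not None:
        result = conform(obj, protocol)
        if result is not None:
            return result
    return default

s = Sequence()
```

The non-conformance case then needs no special exception class, and hence no debate over its name.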


From steven.bethard at gmail.com  Thu Jan 13 01:54:41 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu Jan 13 01:54:44 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com>
References: <1105553300.41e56794d1fc5@mcherm.com>
	<16869.33426.883395.345417@montanaro.dyndns.org>
	<5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com>
Message-ID: <d11dcfba0501121654659390a7@mail.gmail.com>

On Wed, 12 Jan 2005 19:49:06 -0500, Phillip J. Eby
<pje@telecommunity.com> wrote:
> So the *only* way you can see
> this error is if you call __conform__ directly, and somebody added code
> like this:
> 
>      raise LiskovViolation
> 
> So, it's not something you need to worry about a newbie seeing.  The *real*
> problem with the name is knowing that you need to use it in the first place!
> 
> IMO, it's simpler to handle this use case by letting __conform__ return
> None, since this allows people to follow the One Obvious Way to not conform
> to a particular protocol.

Not that my opinion counts for much =), but returning None does seem
much simpler to me.  I also haven't seen any arguments against this
route of handling protocol nonconformance...  Is there a particular
advantage to the exception-raising scheme?

Steve
-- 
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy
From pje at telecommunity.com  Thu Jan 13 02:18:48 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 02:18:09 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <d11dcfba0501121654659390a7@mail.gmail.com>
References: <5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com>
	<1105553300.41e56794d1fc5@mcherm.com>
	<16869.33426.883395.345417@montanaro.dyndns.org>
	<5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050112201718.02f85760@mail.telecommunity.com>

At 05:54 PM 1/12/05 -0700, Steven Bethard wrote:
>Not that my opinion counts for much =), but returning None does seem
>much simpler to me.  I also haven't seen any arguments against this
>route of handling protocol nonconformance...  Is there a particular
>advantage to the exception-raising scheme?

Only if there's any objection to giving the 'object' type a default 
__conform__ method that returns 'self' if 'isinstance(protocol,ClassTypes) 
and isinstance(self,protocol)'.

From skip at pobox.com  Thu Jan 13 03:36:54 2005
From: skip at pobox.com (Skip Montanaro)
Date: Thu Jan 13 03:46:27 2005
Subject: [Python-Dev] Re: [Csv] csv module and universal newlines 
In-Reply-To: <20050112225525.236BE3C889@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<20050110044441.250103C889@coffee.object-craft.com.au>
	<16868.33914.837771.954739@montanaro.dyndns.org>
	<20050112225525.236BE3C889@coffee.object-craft.com.au>
Message-ID: <16869.57030.306263.612202@montanaro.dyndns.org>


    >> The idea of the check is to enforce binary mode on those objects that
    >> support a mode if the desired line terminator doesn't match the
    >> platform's line terminator.

    Andrew> Where that falls down, I think, is where you want to read an
    Andrew> alien file - in fact, under unix, most of the CSV files I read
    Andrew> use \r\n for end-of-line.

Well, you can either require 'b' in that situation or "know" that 'b' isn't
needed on Unix systems.

    Andrew> Also, I *really* don't like the idea of looking for a mode
    Andrew> attribute on the supplied iterator - it feels like a layering
    Andrew> violation. We've advertised the fact that it's an iterator, so
    Andrew> we shouldn't be using anything but the iterator protocol.

The fundamental problem is that the iterator protocol on files is designed
for use only with text mode (or universal newline mode, but that's just as
much of a problem in this context).  I think you either have to abandon the
iterator protocol or peek under the iterator's covers to make sure it reads
and writes in binary mode.  Right now, people on windows create writers like
this

    writer = csv.writer(open("somefile", "w"))

and are confused when their csv files contain blank lines.  I think the
reader and writer objects have to at least emit a warning when they discover
a source or destination that violates the requirements.
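For illustration with today's csv module: the writer itself emits the dialect's `\r\n` terminator, so a destination that *also* translates newlines (a text-mode file on Windows) doubles them into `\r\r\n`, which readers display as the blank lines described above. `StringIO` stands in for a non-translating, binary-mode destination:

```python
import csv
import io

# A non-translating destination, like a file opened in binary mode.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["a", "b"])
writer.writerow(["c", "d"])

# The writer already produced \r\n terminators on its own; any
# further newline translation by the file object corrupts the output.
data = buf.getvalue()
```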

Skip
From skip at pobox.com  Thu Jan 13 03:39:41 2005
From: skip at pobox.com (Skip Montanaro)
Date: Thu Jan 13 03:46:35 2005
Subject: [Python-Dev] Re: [Csv] csv module and universal newlines
In-Reply-To: <0E6093F4-64EE-11D9-B7C6-000D934FF6B4@cwi.nl>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<20050110044441.250103C889@coffee.object-craft.com.au>
	<16868.33914.837771.954739@montanaro.dyndns.org>
	<0E6093F4-64EE-11D9-B7C6-000D934FF6B4@cwi.nl>
Message-ID: <16869.57197.95323.656027@montanaro.dyndns.org>

    Jack> On MacOSX you really want universal newlines. CSV files produced
    Jack> by older software (such as AppleWorks) will have \r line
    Jack> terminators, but lots of other programs will have files with
    Jack> normal \n terminators.

Won't work.  You have to be able to write a Windows csv file on any
platform.  Binary mode is the only way to get that.

Skip

From bob at redivi.com  Thu Jan 13 03:56:05 2005
From: bob at redivi.com (Bob Ippolito)
Date: Thu Jan 13 03:56:11 2005
Subject: [Python-Dev] Re: [Csv] csv module and universal newlines
In-Reply-To: <16869.57197.95323.656027@montanaro.dyndns.org>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<20050110044441.250103C889@coffee.object-craft.com.au>
	<16868.33914.837771.954739@montanaro.dyndns.org>
	<0E6093F4-64EE-11D9-B7C6-000D934FF6B4@cwi.nl>
	<16869.57197.95323.656027@montanaro.dyndns.org>
Message-ID: <AB07A6FA-650E-11D9-B569-000A95BA5446@redivi.com>


On Jan 12, 2005, at 21:39, Skip Montanaro wrote:

>     Jack> On MacOSX you really want universal newlines. CSV files 
> produced
>     Jack> by older software (such as AppleWorks) will have \r line
>     Jack> terminators, but lots of other programs will have files with
>     Jack> normal \n terminators.
>
> Won't work.  You have to be able to write a Windows csv file on any
> platform.  Binary mode is the only way to get that.

Isn't universal newlines only used for reading?

I have had no problems using the csv module for reading files with 
universal newlines by opening the file myself or providing an iterator.

Unicode, on the other hand, I have had problems with.

-bob

From pje at telecommunity.com  Thu Jan 13 03:57:07 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 03:56:33 2005
Subject: [Python-Dev] Son of PEP 246, redux
In-Reply-To: <20050112225446.GA43203@prometheusresearch.com>
References: <41E59FA9.4050605@colorstudy.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
Message-ID: <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>

This is a pretty long post; it starts out as discussion of educational 
issues highlighted by Clark and Ian, but then it takes the motivation for 
PEP 246 in an entirely new direction -- possibly one that could be more 
intuitive than interfaces and adapters as they are currently viewed in 
Zope/Twisted/PEAK etc., and maybe one that could be a much better fit with 
Guido's type declaration ideas.  OTOH, everybody may hate the idea and 
think it's stupid, or if they like it, then Alex may want to strangle me 
for allowing doubt about PEP 246 to re-enter Guido's head.  Either way, 
somebody's going to be unhappy. <wink>


At 05:54 PM 1/12/05 -0500, Clark C. Evans wrote:

>    String -> PathName -> File
>    String -> StringIO -> File

Okay, after reading yours and Ian's posts and thinking about them some 
more, I've learned some really interesting things.

First, adapter abuse is *extremely* attractive to someone new to the 
concept -- so from here on out I'm going to forget about the idea that we 
can teach people to avoid this solely by telling them "the right way to do 
it" up front.

The second, much subtler point I noticed from your posts, was that *adapter 
abuse tends to sooner or later result in adapter diamonds*.

And that is particularly interesting because the way that I learned how NOT 
to abuse adapters, was by getting slapped upside the head by PyProtocols 
pointing out when adapter diamonds had resulted!

Now, that's not because I'm a genius who put the error in because I 
realized that adapter abuse causes diamonds.  I didn't really understand 
adapter abuse until *after* I got enough errors to be able to have a good 
intuition about what "as a" really means.

Now, I'm not claiming that adapter abuse inevitably results in a detectable 
ambiguity, and certainly not that it does so instantaneously.  I'm also not 
claiming that some ambiguities reported by PyProtocols might not be 
perfectly harmless.  So, adaptation ambiguity is a lot like a PyChecker 
warning: it might be a horrible problem, or it might be that you are just 
doing something a little unusual.

But the thing I find interesting is that, even with just the diamonds I 
ended up creating on my own, I was able to infer an intuitive concept of 
"as a", even though I hadn't fully verbalized the concepts prior to this 
lengthy debate with Alex forcing me to single-step through my thought 
processes.

What that suggests to me is that it might well be safe enough in practice 
to let new users of adaptation whack their hand with the mallet now and 
then, given that *now* it's possible to give a much better explanation of 
"as a" than it was before.

Also, consider this...  The larger the adapter network, the 
*greater* the probability that adapter abuse will create an ambiguity -- 
which could mean faster learning.

If the ambiguity error is easily looked up in documentation that explains 
the as-a concept and the intended working of adaptation, so much the 
better.  But in the worst case of a false alarm (the ambiguity was 
harmless), you just resolve the ambiguity and move on.


>Originally, Python may ship with the String->StringIO and
>StringIO->File adapters pre-loaded, and if my code was reliant upon
>this transitive chain, the following will work just wonderfully,
>
>     def parse(file: File):
>         ...
>
>     parse("helloworld")
>
>by parsing "helloworld" content via a StringIO intermediate object.  But
>then, let's say a new component "pathutils" registers another adapter pair:
>
>    String->PathName and PathName->File
>
>This ambiguity causes a few problems:
>
>   - How does one determine which adapter path to use?
>   - If a different path is picked, what sort of subtle bugs occur?
>   - If the default path isn't what you want, how do you specify
>     the other path?

The *real* problem here isn't the ambiguity; it's that PathName->File is 
"adapter abuse".  However, the fact that it results in an ambiguity is a 
useful clue to fixing the problem.  Each time I sat down with one of these 
detected ambiguities, I learned better how to define sensible interfaces 
and meaningful adaptation.  I would not have learned these things by simply 
not having transitive adaptation.
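The diamond Clark describes above is easy to reproduce in miniature. The sketch below is purely illustrative (the `register`/`adapt` names and string-keyed "types" are invented, not PyProtocols' real API): it finds every adapter chain between two types and treats more than one path as exactly the kind of ambiguity error discussed here.

```python
from collections import defaultdict

_adapters = defaultdict(dict)   # {source: {target: adapter callable}}

def register(source, target, adapter):
    _adapters[source][target] = adapter

def find_paths(source, target, seen=()):
    """Return every adapter chain leading from source to target."""
    if source == target:
        return [[]]
    paths = []
    for mid, adapter in list(_adapters[source].items()):
        if mid in seen:
            continue                      # avoid cycles
        for rest in find_paths(mid, target, seen + (source,)):
            paths.append([adapter] + rest)
    return paths

def adapt(obj, source, target):
    paths = find_paths(source, target)
    if not paths:
        raise TypeError("no adaptation path from %r to %r" % (source, target))
    if len(paths) > 1:
        # the "adapter diamond": more than one way to get there
        raise TypeError("ambiguous adaptation: %d paths from %r to %r"
                        % (len(paths), source, target))
    for step in paths[0]:
        obj = step(obj)
    return obj
```

Registering Clark's second pair (String->PathName plus PathName->File) immediately turns a previously working String->File chain into an error -- the "mallet" effect described above.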


>| As I think these things through, I'm realizing that registered
>| adaptators really should be 100% accurate (i.e., no information loss,
>| complete substitutability), because a registered adapter that seems
>| pragmatically useful in one place could mess up unrelated code, since
>| registered adapters have global effects.
>
>I think this isn't all that useful; it's unrealistic to assume that
>adapters are always perfect.   If transitive adaptation is even
>permitted, it should be unambiguous.  Demanding that adaption is
>100% perfect is a matter of perspective.  I think String->StringIO
>and StringIO->File are perfectly pure.

The next thing that I realized from your posts is that there's another 
education issue for people who haven't used adaptation, and that's just how 
precisely interfaces need to be specified.

For example, we've all been talking about StringIO like it means something, 
but really we need to talk about whether it's being used to read or write 
or both.  There's a reason why PEAK and Zope tend to have interface names 
like 'IComponentFactory' and 'IStreamSource' and other oddball names you'd 
normally not give to a concrete class.  An interface has to be really 
specific -- in the degenerate case an interface can end up being just one 
method.  In fact, I think that something like 10-15% of interfaces in PEAK 
have only one method; I don't know if it's that high for Zope and Twisted, 
although I do know that small interfaces (5 or fewer methods) are pretty 
normal.

What this also suggests to me is that maybe adaptation and interfaces are 
the wrong solution to the problems we've been trying to solve with them -- 
adding more objects to solve the problems created by having lots of 
objects.  :)

As a contrasting example, consider the Dylan language.  The Dylan concept 
of a "protocol" is a set of generic functions that can be called on any 
number of object types.  This is just like an interface, but 
inside-out...  maybe you could call it an "outerface".  :)

The basic idea is that a file protocol would consist of functions like 
'read(stream,byteCount)'.  If you implement a new file-like type, you "add 
a method" to the 'read' generic function that implements 'read' for your 
type.  If a type already exists that you'd like to use 'read' with, you can 
implement the new method yourself.
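In Python terms, this protocol-as-generic-functions idea can be approximated with `functools.singledispatch` (which postdates this thread); the `read` operation below and both of its implementations are illustrative only:

```python
from functools import singledispatch
import io

# 'read' is a free-standing operation, not a method of any interface.
@singledispatch
def read(stream, byte_count):
    raise TypeError("no read() implementation for %r" % type(stream))

# The author of a file-like type "adds a method" to the operation:
@read.register
def _(stream: io.StringIO, byte_count: int):
    return stream.read(byte_count)

# A third party can teach 'read' about a pre-existing, unrelated type:
@read.register
def _(stream: list, byte_count: int):
    chunk = stream[:byte_count]
    del stream[:byte_count]
    return chunk
```

Note that neither type declares an interface; each simply gains an implementation of the one operation it supports.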

There are some important ramifications there.  First, there's no 
requirement to implement a complete interface; the system is already 
reduced to *operations* rather than interfaces.  Second, a different choice 
of method names isn't a reason to need more interfaces and adapters.  As 
more implementations of some basic idea (like stream-ness) exist, it 
becomes more and more natural to *share* common generic functions and put 
them in the stdlib, even without any concrete implementation for them, 
because they now form a standard "meeting point" for other libraries.

Third, Ka-Ping Yee has been arguing that Python should be able to define 
interfaces that contain abstract implementation.  Well, generic functions 
can actually *do* this in a straightforward fashion; just define the 
default implementation of that operation as delegating to other 
operations.  There still needs to be some way to "bottom out" so you don't 
end up with endless recursive delegation -- although you could perhaps just 
catch the recursion error and inspect the traceback to tell the user, "must 
implement one of these operations for type X".  (And this could perhaps be 
done automatically if you can declare that this delegating implementation 
is an "abstract method".)

Fourth, and this is *really* interesting (but also rather lengthy to 
explain)...  if all functions are generic (just using a fast-path for the 
nominal case of only one implementation), then you can actually construct 
adapters automatically, knowing precisely when an operation is "safe".

Let me explain.  Suppose that we have a type, SomeType.  It doesn't matter 
if this type is concrete or an interface, we really don't care.  The point 
is that this type defines some operations, and there is an outside 
operation 'foo' that relies on some set of those operations.

We then have OtherType, a concrete type we want to pass to 'foo'.  All we 
need in order to make it work, is *extend the generic functions in SomeType 
with methods that take a different 'self' type*!  Then, the operation 
'adapt(instOfOtherType,SomeType)' can assemble a simple proxy containing 
methods for just the generic functions that have an implementation 
available for OtherType.

The result of this is that now any type can be the basis for an interface, 
which is very intuitive.  That is, I can say, "implement file.read()" for 
my object, and somebody who has an argument declared as "file" will be able 
to use my object as long as they only need the operations I've 
implemented.  However, unlike using method names alone, we have unambiguous 
semantics, because all operations are grounded in some fixed type or 
location of definition that specifies the *meaning* of that operation.

Another benefit of this approach is that it lessens the need for transitive 
adaptation, because over time people converge towards using common 
operations, rather than continually reinventing new ones.  In this 
approach, all "adaptation" is endpoint to endpoint, but there are rarely 
any actual adapters involved, unless a set of related operations actually 
requires keeping some state.  Instead, you simply define an implementation 
of an operation for some concrete type.

I'm running out of time to explore this idea further, alas.  Up to this 
point, what I'm proposing would work *beautifully* for adaptations that 
don't require the adapter to add state to the underlying object, and ought 
to be intuitively obvious, given an appropriate syntax.  E.g.:

     class StringIO:

         def read(self, bytes) implements file.read:
             # etc...

could be used to indicate the simple case where you are conforming to an 
existing operation definition.  A third-party definition of the same thing 
might look like this:

     def file.read(self: StringIO, bytes):
         return self.read(bytes)

Assuming, of course, that that's the syntax for adding an implementation to 
an existing operation.

Hm.  You know, I think the stateful adapter problem could be solved too, if 
*properties* were also operations.  For example, if 'file.fileno' was 
implemented as a set of three generic functions (get/set/delete), then you 
could maybe do something like:


    class socket:

        # internally declare that our fileno has the semantics
        # of file.fileno:

        fileno: int implements file.fileno

or maybe just:

     class socket implements file:
         ...

could be shorthand for saying that anything with the same name as what's in 
'file' has the same semantics.  OTOH, that could break between Python 
versions if a new operation were added to 'file', so maybe as verbose as 
the blow-by-blow declarations are, they'd be safer semantically.

Anyway, if we were a third party externally declaring the correspondence 
between socket.fileno and file.fileno, we could say:

    # declare how to get a file.fileno for a socket instance

    def file.fileno.__get__(self: socket):
        return self.fileno

Now, there isn't any need to have a separate "adapter" to store additional 
state; with appropriate name mangling it can be stored in the unadapted 
object, if you like.

This isn't a fully thought-out proposal; it's all a fairly 
spur-of-the-moment idea.  I've been playing with generic functions for a 
while now, but only recently started doing any "heavy lifting" with 
them.  However, in one instance, I refactored a PEAK module from being 400+ 
lines of implementation (plus 8 interfaces and lots of adaptation) down to 
just 140 lines implementation and one interface -- with the interface being 
pure documentation.  And the end result was more flexible than the original 
code.  So since then I've been considering whether adaptation is really the 
be-all end-all for this sort of thing, and Clark and Ian's posts made me 
start thinking about it even more seriously.

(One interesting data point: the number of languages with some kind of 
pattern matching, "guards" or other generic function variants seems to be 
growing, while Java (via Eclipse) is the only other system I know of that 
has anything remotely like PEP 246.)

So maybe the *real* answer here is that we should be looking at solutions 
that might prevent the problems that adapters are meant to solve, from 
arising in the first place!  Generic functions might be a good place to 
look for one, although the downside is that they might make Python look 
like a whole new language.  OTOH, type declarations might do that anyway.

A big plus, by the way, of the generic function approach is that it does 
away with the requirement for interfaces altogether, except as a semantic 
grouping of operations.  Lots of people dislike interfaces, and after all 
this discussion about how perfect interface-to-interface adaptation has to 
be, I'm personally becoming a lot less enamored with interfaces too!

In general, Python seems to like to let "natural instinct" prevail.  What 
could be more natural than saying "this is how to implement a such-and-such 
method like what that other guy's got"?  It ain't transitive, but if 
everybody tends to converge on a common "other guy" to define stuff in 
terms of (like 'file' in the stdlib), then you don't *need* transitivity in 
the long run, except for fairly specialized situations like pluggable IDE's 
(e.g. Eclipse) that need to dynamically connect chains between different 
plugins.  Even there, the need could be minimized by most operations 
grounding in "official" abstract types.  And abstract methods -- like a 
'file.readline()' implementation for any object that supports 'file.read()' 
-- could possibly take care of most of the rest.
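The "'file.readline()' in terms of 'file.read()'" idea sketches easily with the same generic-function machinery (again using `functools.singledispatch`, which postdates this thread; all names here are illustrative):

```python
from functools import singledispatch
import io

@singledispatch
def read(stream, n):
    raise TypeError("no read() implementation for %r" % type(stream))

# Abstract default: readline() is defined once, in terms of read(), so
# any type that implements read() gets readline() for free.  (A real
# version would also detect endless mutual recursion, as discussed
# earlier; this sketch omits that.)
@singledispatch
def readline(stream):
    chars = []
    while True:
        ch = read(stream, 1)
        chars.append(ch)
        if not ch or ch == "\n":
            return "".join(chars)

@read.register
def _(stream: io.StringIO, n: int):
    return stream.read(n)
```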

Generic functions are undoubtedly more complex to implement than PEP 246 
adaptation.  My generic function implementation comprises 3323 lines of 
Python, and it actually *uses* PEP 246 adaptation internally for many 
things, although with more work it could probably do without it.

However, almost half of those lines of code are consumed by a mini-compiler 
and mini-interpreter for Python expressions; a built-in implementation of 
generic functions might be able to get away without having those parts, or 
at least not so many of them.  Also, my implementation supports full 
predicate dispatch, not just multimethod dispatch, so there's probably even 
more code that could be eliminated if it was decided not to do the whole 
nine yards.

Back on the downside, this looks like an invitation to another "language 
vs. stdlib" debate, since PEP 246 in and of itself is pure library.  OTOH, 
Guido's changing the language to add type declarations anyway, and generic 
functions are an excellent use case for them.  Since he's going to be 
flamed for changing the language anyway, he might as well be hanged for a 
sheep as for a goat.  :)

Oh, and back on the upside again, it *might* be easier to implement actual 
type checking with this technique than with PEP 246, because if I write a 
method expecting a 'file' and somebody calls it with a 'Foo' instance, I 
can maybe now look at the file operations actually used by the method, and 
then see if there's an implementation for e.g. 'file.read' defined anywhere 
for 'Foo'.  And, comparable type checking algorithms are more likely to 
already exist for other languages that include generic functions, than to 
exist for PEP 246-style adaptation.

Okay, I'm really out of time now.  Hate to dump this in as a possible 
spoiler on PEP 246, because I was just as excited as Alex about the 
possibility of it going in.  But this whole debate has made me even less 
enamored of adaptation, and more interested in finding a cleaner, more 
intuitive way to do it.

From andrewm at object-craft.com.au  Thu Jan 13 04:21:41 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Thu Jan 13 04:21:44 2005
Subject: [Python-Dev] Re: [Csv] csv module and universal newlines 
In-Reply-To: <AB07A6FA-650E-11D9-B569-000A95BA5446@redivi.com> 
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<20050110044441.250103C889@coffee.object-craft.com.au>
	<16868.33914.837771.954739@montanaro.dyndns.org>
	<0E6093F4-64EE-11D9-B7C6-000D934FF6B4@cwi.nl>
	<16869.57197.95323.656027@montanaro.dyndns.org>
	<AB07A6FA-650E-11D9-B569-000A95BA5446@redivi.com>
Message-ID: <20050113032141.78EB13C889@coffee.object-craft.com.au>

>Isn't universal newlines only used for reading?

That's right. And the CSV reader has its own version of universal newlines
anyway (from the Py1.5 days).

>I have had no problems using the csv module for reading files with 
>universal newlines by opening the file myself or providing an iterator.

Neither have I, funnily enough.

>Unicode, on the other hand, I have had problems with.

Ah, so somebody does want it then? Good to hear. Hard to get motivated
to make radical changes without feedback.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
From ianb at colorstudy.com  Thu Jan 13 04:50:14 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu Jan 13 04:50:15 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>	<79990c6b05011205445ea4af76@mail.gmail.com>	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
Message-ID: <41E5EFF6.9090408@colorstudy.com>

Phillip J. Eby wrote:
> At 04:07 PM 1/12/05 -0600, Ian Bicking wrote:
> 
>> It also seems quite reasonable and unambiguous that a path object 
>> could be adapted to a IReadableFile by opening the file at the given 
>> path.
> 
> 
> Not if you think of adaptation as an "as-a" relationship, like using a 
> screwdriver "as a" hammer (really an IPounderOfNails, or some such).  It 
> makes no sense to use a path "as a" readable file, so this particular 
> adaptation is bogus.

I started to realize that in a now-aborted reply to Steven, when my 
defense of the path->IReadableFile adaptation started making less sense. 
  It's *still* not intuitively incorrect to me, but there are a couple of 
things I can think of...

(a) After you adapt the path to the file, with the side effect of 
opening a file, it's unclear who is responsible for closing it.
(b) The file object clearly has state the path object doesn't have, like 
a file position.
(c) You can't go adapting the path object to a file whenever you 
want, because of those side effects.

So those are some more practical reasons that it *now* seems bad to me, 
but that wasn't my immediate intuition, and I could have happily written 
out all the necessary code without countering that intuition.  In fact, 
I've misused adaptation before (I think), though in different ways, and 
those mistakes haven't particularly improved my intuition on the 
matter.  If you can't learn from mistakes, how can you learn?

One way is with principles and rules, even if they are flawed or 
incomplete.  Perhaps avoiding adaptation diamonds is one such rule; it 
may not be necessarily and absolutely a bad thing that there is a 
diamond, but it is often enough a sign of problems elsewhere that it may 
be best to internalize that belief anyway.  Avoiding diamonds alone 
isn't enough of a rule, but maybe it's a start.

-- 
Ian Bicking  /  ianb@colorstudy.com  / http://blog.ianbicking.org
From pje at telecommunity.com  Thu Jan 13 05:19:32 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 05:17:58 2005
Subject: [Python-Dev] Son of PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
References: <20050112225446.GA43203@prometheusresearch.com>
	<41E59FA9.4050605@colorstudy.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
Message-ID: <5.1.1.6.0.20050112225358.0307b160@mail.telecommunity.com>

At 09:57 PM 1/12/05 -0500, Phillip J. Eby wrote:
>     class StringIO:
>
>         def read(self, bytes) implements file.read:
>             # etc...
>
>could be used to indicate the simple case where you are conforming to an 
>existing operation definition.  A third-party definition, of the same 
>thing might look like this:
>
>     def file.read(self: StringIO, bytes):
>         return self.read(bytes)
>
>Assuming, of course, that that's the syntax for adding an implementation 
>to an existing operation.

After some more thought, I think this approach:

1. Might not actually need generic functions to be implemented.  I need to 
think some more about properties and Ka-Ping Yee's abstract method idea, to 
make sure they can be made to work without "real" generic functions, but a 
basic version of this approach should be implementable with just a handful 
of dictionaries and decorators.

2. Can be prototyped in today's Python, whether generic functions are used 
or not (but the decorator syntax might be ugly, and the decorator 
implementations might be hacky)

3. May still have some rough bits with respect to subclassing & Liskov; I 
need to work through that part some more.  My preliminary impression is 
that it might be safe to consider inherited (but not overridden) methods as 
being the same logical operation.  That imposes some burden on subclassers 
to redeclare compatibility on overridden methods, but OTOH would be 
typesafe by default.

4. Might be somewhat more tedious to declare adaptations with, than it 
currently is with tools like PyProtocols.

Anyway, the non-generic-function implementation would be to have 'adapt()' 
generate (and cache!) an adapter class by going through all the methods of 
the target class and then looking them up in its 'implements' registry, 
while walking up the source class' __mro__ to find the most-specific 
implementation for that type (while checking for 
overridden-but-not-declared methods along the way).  There would be no 
__conform__ or __adapt__ hooks needed.
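A bare-bones version of that algorithm might look like the following (every name here -- `implements`, `adapt`, the registry layout -- is a guess at the strawman being described, not working PEAK or Zope code):

```python
_impls = {}    # {(operation name, target type, source class): function}
_cache = {}    # {(target type, source class): generated adapter class}

def implements(op_name, target, source):
    """Declare func as the implementation of target.op_name for source."""
    def decorator(func):
        _impls[(op_name, target, source)] = func
        return func
    return decorator

def adapt(obj, target):
    key = (target, type(obj))
    adapter_cls = _cache.get(key)
    if adapter_cls is None:
        methods = {}
        # for every public operation of the target class...
        for name in [n for n in vars(target) if not n.startswith('_')]:
            # ...walk the source's __mro__ for the most-specific impl
            for cls in type(obj).__mro__:
                impl = _impls.get((name, target, cls))
                if impl is not None:
                    methods[name] = impl
                    break
        ns = {'__init__':
                  lambda self, subject: setattr(self, '_subject', subject)}
        for name, func in methods.items():
            ns[name] = (lambda f: lambda self, *a: f(self._subject, *a))(func)
        adapter_cls = type('Adapted' + target.__name__, (), ns)
        _cache[key] = adapter_cls          # generate once, cache forever
    return adapter_cls(obj)

class File:
    """Stand-in for the operations of the stdlib 'file' type."""
    def read(self, n):
        raise NotImplementedError

@implements('read', File, str)
def read_str(s, n):
    return s[:n]
```

Calling `adapt("helloworld", File).read(5)` then yields the first five characters, via the cached generated adapter class.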

Interestingly, C# requires you to declare when you are intentionally 
overriding a base class method, in order to avoid accidentally overriding a 
new method added to a base class later.  This concept actually contains a 
germ of the same idea, requiring overrides to specify that they still 
conform to the base class' operations.

Maybe this weekend I'll be able to spend some time on whipping up some sort 
of prototype, and hopefully that will answer some of my open 
questions.  It'll also be interesting to see if I can actually use the 
technique directly on existing interfaces and adaptation, i.e. get some 
degree of PyProtocols backward-compatibility.  It might also be possible to 
get backward-compatibility for Zope too.  In each case, the backward 
compatibility mechanism would be to change the adapter/interface 
declaration APIs to be equivalent to assertions about all the operations 
defined in a particular interface, against the concrete class you're 
claiming implements the interface.

However, for both PEAK and Zope, it would likely be desirable to migrate 
any interfaces like "mapping object" to be based off of operations in e.g. 
the 'dict' type rather than rolling their own IReadMapping and such.

From cce at clarkevans.com  Thu Jan 13 05:26:06 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Thu Jan 13 05:26:08 2005
Subject: [Python-Dev] Son of PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
References: <41E59FA9.4050605@colorstudy.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
	<5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
Message-ID: <20050113042605.GA58003@prometheusresearch.com>

Phillip,

In my mind, the driving use-case for PEP 246 was to allow casual
programmers to plug components together and have it 'just work'; it
does this by enabling the component vendors to carry on a discussion
via __adapt__ and __conform__ to work together.  I was not picturing
that your average developer would be using this sort of thing.

On Wed, Jan 12, 2005 at 09:57:07PM -0500, Phillip J. Eby wrote:
| First, adapter abuse is *extremely* attractive to someone new to the 
| concept -- so from here on out I'm going to forget about the idea that we 
| can teach people to avoid this solely by telling them "the right way to 
| do it" up front.
| 
| The second, much subtler point I noticed from your posts, was that 
| *adapter abuse tends to sooner or later result in adapter diamonds*.

However, I'd like to assert that these cases emerge when you have a
registry /w automatic transitive adaptation.  These problems can be
avoided quite easily by:

   - not doing transitive adaptation automatically

   - making it an error to register more than one adapter from 
     A to Z at any given time; in effect, ban diamonds from ever
     being created

   - make it easy for a user to construct and register an adapter
     from A to Z, via an intermediate X,

        adapt.registerTransitive(A,X,Z)
     
   - if an adaptation from A to Z isn't possible, give a very
     meaningful error listing the possible pathways by which one
     could build a 'transitive adaptation' path, perhaps even
     showing the command that will do it:

         adapt.registerTransitive(A,B,C,Z)
         adapt.registerTransitive(A,Q,Z)
         adapt.registerTransitive(A,X,Z)

The results of this operation:

   - most component vendors will use __adapt__ and __conform__
     rather than use the 'higher-precedence' registry; therefore,
     transitive adaptation isn't that common to start with

   - if two libraries register incompatible adapter chains
     during the 'import' of the module, then it will be an
     error that the casual developer will associate with 
     the module, and not with their code
     
   - casual users are given a nice message, like

      "Cannot automatically convert a String to a File. 
     
       Perhaps you should do a manual conversion of your String
       to a File.  Alternatively, there happen to be two adaptation
       paths which could do this for you, but you have to explicitly
       enable the pathway which matches your intent:
       
       To convert a String to a File via StringIO, call:
       adapt.registerTransitive(String,StringIO,File)

       To convert a String to a File via FileName, call:
       adapt.registerTransitive(String,FileName,File)"
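A minimal sketch of that scheme (the function names follow the strawman `adapt.registerTransitive` API above, but everything here is illustrative): direct registration only, at most one adapter per (source, target) pair, and multi-step chains built solely on explicit request.

```python
_registry = {}   # {(source, target): adapter callable}

def register(source, target, adapter):
    if (source, target) in _registry:
        # ban diamonds: at most one adapter per (source, target) pair
        raise ValueError("an adapter from %r to %r is already registered"
                         % (source, target))
    _registry[(source, target)] = adapter

def registerTransitive(*types):
    """Explicitly compose existing single-step adapters along a path."""
    steps = [_registry[(a, b)] for a, b in zip(types, types[1:])]
    def chained(obj):
        for step in steps:
            obj = step(obj)
        return obj
    register(types[0], types[-1], chained)

def adapt(obj, source, target):
    try:
        adapter = _registry[(source, target)]
    except KeyError:
        raise TypeError(
            "Cannot automatically convert %r to %r.  If an intermediate "
            "type X exists, call registerTransitive(%r, X, %r) to enable "
            "that pathway explicitly." % (source, target, source, target))
    return adapter(obj)
```

With this shape, two libraries registering competing String->File chains fail loudly at import time, and the casual user only ever sees the explicit registerTransitive call, matching the error message sketched above.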

| What that suggests to me is that it might well be safe enough in practice 
| to let new users of adaptation whack their hand with the mallet now and 
| then, given that *now* it's possible to give a much better explanation of 
| "as a" than it was before.

By disabling (the quite dangerous?) transitive adaptation, one could
guide the user along to the result they require without having them
shoot themselves in the foot first.

| What this also suggests to me is that maybe adaptation and interfaces are 
| the wrong solution to the problems we've been trying to solve with them 
| -- adding more objects to solve the problems created by having lots of 
| objects.  :)

I didn't see how the rest of your post -- in particular, Dylan's
protocols -- was much different from a mixin/abstract-base-class. Regardless,
getting back to the main goal I had when writing PEP 246 -- your
alternative proposal still doesn't seem to provide a mechanism for
component developers to have a dialogue with one another to connect
components without involving the application programmer. 

Cheers!

Clark
From pje at telecommunity.com  Thu Jan 13 05:48:47 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 05:47:13 2005
Subject: [Python-Dev] Son of PEP 246, redux
In-Reply-To: <20050113042605.GA58003@prometheusresearch.com>
References: <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
	<5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com>

At 11:26 PM 1/12/05 -0500, Clark C. Evans wrote:
>Regardless,
>getting back to the main goal I had when writing PEP 246 -- your
>alternative proposal still doesn't seem to provide a mechanism for
>component developers to have a dialogue with one another to connect
>components without involving the application programmer.

Eh?  You still have adapt(); you still have adapters.  The only difference 
is that I've specified a way to not need "interfaces" - instead interfaces 
can be defined in terms of individual operations, and those operations can 
be initially defined by an abstract base, concrete class, or an "interface" 
object.  Oh, and you don't have to write adapter *classes* - you write 
adapting *methods* for individual operations.  This can be done by the 
original author of a class or by a third party -- just like with PEP 246.

From michael.walter at gmail.com  Thu Jan 13 06:01:14 2005
From: michael.walter at gmail.com (Michael Walter)
Date: Thu Jan 13 06:01:17 2005
Subject: [Python-Dev] Son of PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
	<5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
	<20050113042605.GA58003@prometheusresearch.com>
	<5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com>
Message-ID: <877e9a1705011221013c9de8f7@mail.gmail.com>

> instead interfaces can be defined in terms of individual operations, and 
> those operations can be initially defined by an abstract base, concrete 
> class, or an "interface" object.
I think this is quite problematic in the sense that it will force many
dummy interfaces to be created. At least without type inference, this
is a no-no.

Consider: In order to type a function like:

def f(x):
  # ...
  x.foo()
  # ...

...so that type violations can be detected before the real action
takes place, you would need to create a dummy interface as in:

interface XAsFUsesIt:
  def foo():
    pass

def f(x : XAsFUsesIt):
  # ...

...or you would want type inference (which at compile time types x as
"a thing which has a 'nullary' foo() function") and a type system like
System CT.

The former appears cumbersome (as it would really have to be done for
every function), the latter too NIMPY-ish <wink>. What am I missing?

Sleepingly yours,
Michael


On Wed, 12 Jan 2005 23:48:47 -0500, Phillip J. Eby
<pje@telecommunity.com> wrote:
> At 11:26 PM 1/12/05 -0500, Clark C. Evans wrote:
> >Regardless,
> >getting back to the main goal I had when writing PEP 246 -- your
> >alternative proposal still doesn't seem to provide a mechanism for
> >component developers to have a dialogue with one another to connect
> >components without involving the application programmer.
> 
> Eh?  You still have adapt(); you still have adapters.  The only difference
> is that I've specified a way to not need "interfaces" - instead interfaces
> can be defined in terms of individual operations, and those operations can
> be initially defined by an abstract base, concrete class, or an "interface"
> object.  Oh, and you don't have to write adapter *classes* - you write
> adapting *methods* for individual operations.  This can be done by the
> original author of a class or by a third party -- just like with PEP 246.
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/michael.walter%40gmail.com
>
From pje at telecommunity.com  Thu Jan 13 07:04:01 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 07:02:28 2005
Subject: [Python-Dev] Son of PEP 246, redux
In-Reply-To: <877e9a1705011221013c9de8f7@mail.gmail.com>
References: <5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
	<5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
	<20050113042605.GA58003@prometheusresearch.com>
	<5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050113002459.02bb61c0@mail.telecommunity.com>

At 12:01 AM 1/13/05 -0500, Michael Walter wrote:
>What am I missing?

The fact that this is a type-declaration issue, and has nothing to do with 
*how* types are checked.

Note that I'm only proposing:

1) a possible replacement for PEP 246 that leaves 'adapt()' as a function, 
but uses a different internal implementation,

2) a very specific notion of what an operation is, that doesn't require an 
interface to exist if there is already some concrete type that the 
interface would be an abstraction of,

3) a strawman syntax for declaring the relationship between operations

In other words, compared to the previous state of things, this should 
actually require *fewer* interfaces to accomplish the same use cases, and 
it doesn't require Python to have a built-in notion of "interface", because 
the primitive notion is an operation, not an interface.

Oh, and I think I've now figured out how to define a type-safe version of 
Ping's "abstract operations" concept that can play in the 
non-generic-function implementation, but I really need some sleep, so I 
might be hallucinating the solution.  :)

Anyway, so far it seems like it can all be done with a handful of decorators:

@implements(base_operation, for_type=None)
   (for_type is the "adapt *from*" type, defaulting to the enclosing class 
if used inside a class body)

@override
   (means the method is overriding the one in a base class, keeping the 
same operation correspondence(s) defined for the method in the base class)

@abstract(base_operation, *required_operations)
   (indicates that this implementation of base_operation requires the 
ability to use the specified required_operations on a target instance.  The 
adapter machinery can then "safely fail" if the operations aren't 
available, or if it detects a cycle between mutually-recursive abstract 
operations that don't have a non-abstract implementation.  An abstract 
method can be used to perform the operation on any object that provides the 
required operations, however.)

Anyway, from the information provided by these decorators, you can generate 
adapter classes for any operation-based interfaces.  I don't have a planned 
syntax or API for defining attribute correspondences as yet, but it should 
be possible to treat them internally as a get/set/del operation triplet, 
and then just wrap them in a descriptor on the adapter class.

By the way, saying "generate" makes it sound more complex than it is: just 
a subclass of 'object' with a single slot that points to the wrapped source 
object, and contains simple descriptors for each available operation of the 
"protocol" type that call the method implementations, passing in the 
wrapped object.  So really "generate" means, "populate a dictionary with 
descriptors and then call 'type(name,(object,),theDict)'".
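A minimal modern-Python sketch of that "generate" step (hypothetical helper and names, not Phillip's actual code) might look like:

```python
def make_adapter(name, operations):
    """Build an adapter class: a single slot pointing to the wrapped
    subject, plus a simple descriptor per operation, as described above.
    'operations' maps method names to implementation functions taking
    the wrapped subject as their first argument."""
    def make_descriptor(impl):
        class OpDescriptor:
            def __get__(self, adapter, objtype=None):
                if adapter is None:
                    return self
                # bind the implementation to the wrapped subject
                return lambda *args, **kw: impl(adapter._subject, *args, **kw)
        return OpDescriptor()

    def __init__(self, subject):
        self._subject = subject

    the_dict = {'__slots__': ('_subject',), '__init__': __init__}
    for op_name, impl in operations.items():
        the_dict[op_name] = make_descriptor(impl)
    # the whole "generation" step really is just a type() call
    return type(name, (object,), the_dict)

# example use (hypothetical classes):
class Source:
    def __init__(self, data):
        self.data = data

Reader = make_adapter('Reader', {'read': lambda subj: subj.data})
```

Calling `Reader(Source("hello")).read()` then delegates through the single `_subject` slot to the wrapped object.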

A side effect of this approach, by the way, is that since adapters are 
*never* composed (transitively or otherwise), we can *always* get back to 
the "original" object.  So, in theory we could actually have 
'adapt(x,object)' always convert back to the original unwrapped object, if 
we needed it.  Likewise, adapting an already-adapted object can be safe 
because the adapter machinery knows when it's dealing with one of its own 
adapters, and can unwrap it before rewrapping it with a new adapter.

Oh, btw, it should at least produce a warning to declare multiple 
implementations for the same operation and source type, if not an outright 
error.  Since there's no implicit transitivity in this system (either 
there's a registered implementation for something or there isn't), there's 
no other form of ambiguity besides dual declarations of a point-to-point 
adaptation.

Hm.  You know, this also solves the interface inheritance problem; under 
this scheme, if you inherit an operation from a base interface, it doesn't 
mean that you provide the base interface.

Oh, actually, you can still also do interface adaptation in a somewhat more 
restrictive form; you can declare abstract operations for the target 
interface in terms of operations in the base interface.  But it's much more 
controlled because you never stack adapters on adapters, and the system can 
tell at adaptation time what operations are and aren't actually available.

Even more interesting: Alex's "loss of middle name" example can't be 
recreated in this system as a problem, at least if I'm still thinking 
clearly.  But I'm probably not, so I'm going to bed now.  :)

From michael.walter at gmail.com  Thu Jan 13 07:23:38 2005
From: michael.walter at gmail.com (Michael Walter)
Date: Thu Jan 13 07:23:41 2005
Subject: [Python-Dev] Son of PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050113002459.02bb61c0@mail.telecommunity.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
	<5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
	<20050113042605.GA58003@prometheusresearch.com>
	<5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com>
	<877e9a1705011221013c9de8f7@mail.gmail.com>
	<5.1.1.6.0.20050113002459.02bb61c0@mail.telecommunity.com>
Message-ID: <877e9a17050112222376178511@mail.gmail.com>

On Thu, 13 Jan 2005 01:04:01 -0500, Phillip J. Eby
<pje@telecommunity.com> wrote:
> At 12:01 AM 1/13/05 -0500, Michael Walter wrote:
> >What am I missing?
> 
> The fact that this is a type-declaration issue, and has nothing to do with
> *how* types are checked.
I was talking about how you declare such types, sir :] (see the
interface pseudo code sample -- maybe my reference to type inference
led you to think the opposite.)

> In other words, compared to the previous state of things, this should
> actually require *fewer* interfaces to accomplish the same use cases, and
> it doesn't require Python to have a built-in notion of "interface", because
> the primitive notion is an operation, not an interface.
Yepyep, but *how* do you declare types now? Can you quickly type the
function def f(x): x.read() without needing an interface like
``interface x_of_f: def read(): pass`` or a decorator like @foo(x.read)?
I've no idea what you mean, really :o)

Michael
From aleax at aleax.it  Thu Jan 13 08:50:54 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan 13 08:50:59 2005
Subject: [Python-Dev] Son of PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
References: <41E59FA9.4050605@colorstudy.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
	<5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
Message-ID: <DA4603AC-6537-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 13, at 03:57, Phillip J. Eby wrote:

> Okay, I'm really out of time now.  Hate to dump this in as a possible 
> spoiler on PEP 246, because I was just as excited as Alex about the 
> possibility of it going in.  But this whole debate has made me even 
> less enamored of adaptation, and more interested in finding a cleaner, 
> more intuitive way to do it.

Perfectly reasonable, of course.  Doubts about the class / inheritance 
/ interface / instance / method / ... "canon" as the OTW to do OOP are 
almost as old as that canon itself, and have evolved along the years, 
producing many interesting counterexamples and variations, and I fully 
share your interest in them.  Adaptation is rather ``ensconced'' in 
that canon, and the conceptual and practical issues of IS-A which 
pervade the canon are all reflected in the new ``(can be automatically 
adapted to be used) AS-A'' which adaptation introduces.  If adaptation 
cannot survive some vigorous critical appraisal, it's much better to 
air the issues now than later.

Your proposals are novel and interesting.  They also go WAY deeper into 
a critical reappraisal of the whole object model of Python, which has 
always been quite reasonably close to the above-mentioned "canon" and 
indeed has been getting _more_ so, rather than less, since 2.2 (albeit 
in a uniquely Pythonical way, as is Python's wont -- but not 
conceptually, nor, mostly, practically, all that VERY far from canonic 
OOP).  Moreover, your proposals are at a very early stage and no doubt 
need a lot more experience, discussion, maturation, and give-and-take.

Further, you have indicated that, far from _conflicting_ with PEP 246, 
your new ideas can grow alongside and on top of it -- if I read you 
correctly, you have prototyped some variations of them using PEP 246 
for implementation, you have some ideas of how 'adapt' could in turn be 
recast by using your new ideas as conceptual and practical foundations, 
etc, etc.

So, I think the best course of action at this time might be for me to 
edit PEP 246 to reflect some of this enormously voluminous discussion, 
including points of contention (it's part of a PEP's job to also 
indicate points of dissent, after all); and I think you should get a 
new PEP number to use for your new ideas, and develop them on that 
separate PEP, say PEP XYZ.  Knowing that a rethink of the whole 
object-model and related canon is going on at the same time should help 
me keep PEP 246 reasonably minimal and spare, very much in the spirit 
of YAGNI -- as few features as possible, for now.

If Guido, in consequence, decides to completely block 246's progress 
while waiting for the Copernican Revolution of your new PEP XYZ to 
mature, so be it -- his ``nose'' will no doubt be the best guide to him 
on the matter.  But I hope that, in the same pragmatic and minimalist 
spirit as his "stop the flames" Artima post -- proposing minimalistic 
interfaces and adaptation syntax as a starting point, while yet keeping 
as a background reflection the rich and complicated possibilities of 
parameterized types &c as discussed in his previous Artima entries -- 
he'll still give a minimalistic PEP 246 the go-ahead so that 
widespread, real-world experimentation with adaptation and his other 
proposals can proceed, and give many Pythonistas some practical 
experience which will make future discussions and developments much 
sounder-based and productive.


So, what do you think -- does this new plan of action sound reasonable 
to you?


Alex

From aleax at aleax.it  Thu Jan 13 09:00:19 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan 13 09:00:25 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <5.1.1.6.0.20050112201718.02f85760@mail.telecommunity.com>
References: <5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com>
	<1105553300.41e56794d1fc5@mcherm.com>
	<16869.33426.883395.345417@montanaro.dyndns.org>
	<5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com>
	<5.1.1.6.0.20050112201718.02f85760@mail.telecommunity.com>
Message-ID: <2AFC1C53-6539-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 13, at 02:18, Phillip J. Eby wrote:

> At 05:54 PM 1/12/05 -0700, Steven Bethard wrote:
>> Not that my opinion counts for much =), but returning None does seem
>> much simpler to me.  I also haven't seen any arguments against this
>> route of handling protocol nonconformance...  Is there a particular
>> advantage to the exception-raising scheme?
>
> Only if there's any objection to giving the 'object' type a default 
> __conform__ method that returns 'self' if 
> 'isinstance(protocol,ClassTypes) and isinstance(self,protocol)'.

In the spirit of minimalism in which I propose to rewrite PEP 246 (as 
per my latest post: make a simple, noninvasive, unassuming PEP 246 
while new ``copernical revolution'' ideas which you proposed mature in 
another PEP), I'd rather not make a change to built-in ``object''  a 
prereq for PEP 246; so, I think the reference implementation should 
avoid assuming such changes, if it's at all possible to avoid them 
(while, no doubt, indicating the desirability of such changes for 
simplification and acceleration).

Incidentally, "get this specialmethod from the type (with specialcasing 
for classic classes &c)" is a primitive that PEP 246 needs as much as, 
say, copy.py needs it.  In the light of the recent discussions of how 
to fix copy.py etc, I'm unsure about what to assume there, in a rewrite 
of PEP 246: that getattr(obj, '__aspecial__', None) always does the 
right thing via special descriptors, that I must spell everything out, 
or, what else...?
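A sketch of such a primitive (hypothetical name and shape; new-style classes only, omitting the classic-class special-casing mentioned above) that looks a special method up on the type, skipping the instance __dict__:

```python
def get_special(obj, name, default=None):
    """Fetch a special method from the type of obj, ignoring any
    per-instance override in obj.__dict__ -- the lookup rule that
    x[y] already uses for __getitem__."""
    for cls in type(obj).__mro__:
        if name in cls.__dict__:
            attr = cls.__dict__[name]
            if hasattr(attr, '__get__'):
                return attr.__get__(obj, type(obj))  # bind the descriptor
            return attr
    return default

class Copyable:
    def __copy__(self):
        return 'class copy'

c = Copyable()
# a per-instance override, which plain getattr() would find:
c.__dict__['__copy__'] = lambda: 'instance copy'
```

Here `getattr(c, '__copy__')()` returns the instance override, while `get_special(c, '__copy__')()` returns the class version, which is the behavior copy.py and PEP 246 both want.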

If anybody has advice or feedback on these points, it will be welcome!


Alex

From arigo at tunes.org  Thu Jan 13 11:16:33 2005
From: arigo at tunes.org (Armin Rigo)
Date: Thu Jan 13 11:28:04 2005
Subject: getting special from type,
	not instance (was Re: [Python-Dev] copy confusion)
In-Reply-To: <ca471dc20501120959737d1935@mail.gmail.com>
References: <cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050111201174b86218@mail.gmail.com>
	<83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc20501120959737d1935@mail.gmail.com>
Message-ID: <20050113101633.GA5193@vicky.ecs.soton.ac.uk>

Hi Guido,

On Wed, Jan 12, 2005 at 09:59:13AM -0800, Guido van Rossum wrote:
> The descriptor for __getattr__ and other special attributes could
> claim to be a "data descriptor"

This has the nice effect that x[y] and x.__getitem__(y) would again be
equivalent, which looks good.

On the other hand, I fear that if there is a standard "metamethod" decorator
(named after Phillip's one), it will be misused.  Reading the documentation
will probably leave most programmers with the feeling "it's something magical
to put on methods with __ in their names", and it won't be long before someone
notices that you can put this decorator everywhere in your classes (because it
won't break most programs) and gain a tiny performance improvement.

I guess that a name-based hack in type_new() to turn all __*__() methods into
data descriptors would be even more obscure?

Finally, I wonder if turning all methods whatsoever into data descriptors
(ouch! don't hit!) would be justifiable by the feeling that it's often bad
style and confusing to override a method in an instance (as opposed to
defining a method in an instance when there is none on the class).  
(Supporting this claim: Psyco does this simplifying hypothesis for performance
reasons and I didn't see yet a bug report for this.)

In all cases, I'm +1 on seeing built-in method objects (PyMethodDescr_Type)  
become data descriptors ("classy descriptors?" :-).


Armin
From aleax at aleax.it  Thu Jan 13 11:31:30 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan 13 11:31:39 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>
References: <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
	<5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>
Message-ID: <49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 21:42, Phillip J. Eby wrote:
    ...
> Anyway, hopefully this post and the other one will be convincing that 
> considering ambiguity to be an error *reinforces* the idea of I-to-I 
> perfection, rather than undermining it.  (After all, if you've written 
> a perfect one, and there's already one there, then either one of you 
> is mistaken, or you are wasting your time writing one!)

I'd just like to point out, as apparently conceded in your "fair 
enough" sentence in another mail, that all of this talk of "wasting 
your time writing" is completely unfounded.  Since that "fair enough" 
of yours was deeply buried somewhere inside this huge conversation, 
some readers might miss the fact that your numerous repetitions of the 
similar concept in different words are just invalid, because, to recap:

Given four interfaces A, B, C, D, there may be need of each of the 
single steps A->B, A->C, B->D, C->D.  Writing each of these four 
adapters can IN NO WAY be considered "wasting your time writing one", 
because there is no way a set of just three out of the four can be used 
to produce the fourth one.

The only "redundancy" comes strictly because of transitivity being 
imposed automatically: at the moment the fourth one of these four 
needed adapters gets registered, there appear to be two same-length 
minimal paths A->x->D (x in {B, C}).  But inferring _from this 
consequence of transitivity_ that there's ANYTHING wrong with any of 
the four needed adapters is a big unwarranted logical jump -- IF one 
really trusted all interface->interface adapters to be perfect, as is 
needed to justify transitivity and as you claim here gets "reinforced" 
(?!).

Thinking of it as "redundancy" is further shown to be fallacious 
because the only solution, if each of those 4 adapters is necessary, is 
to write and register a FIFTH one, A->D directly, even if one has no 
interest whatsoever in A->D adaptation, just to shut up the error or 
warning (as you say, there may be some vague analogy to static typing 
here, albeit in a marginal corner of the stage rather than smack in the 
spotlight;-).
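The four-adapter situation can be made concrete with a toy registry (illustrative names only, not PEP 246 machinery):

```python
# four point-to-point adapters, each genuinely needed in its own right
adapters = {('A', 'B'): 'a2b', ('A', 'C'): 'a2c',
            ('B', 'D'): 'b2d', ('C', 'D'): 'c2d'}

def all_paths(src, dst, visited=frozenset()):
    """Yield every chain of adapter names leading from src to dst."""
    if src == dst:
        yield ()
        return
    for (s, d), name in adapters.items():
        if s == src and d not in visited:
            for rest in all_paths(d, dst, visited | {src}):
                yield (name,) + rest

# registering the fourth adapter creates two equally short A->D routes,
# even though no single one of the four adapters is redundant
routes = sorted(all_paths('A', 'D'))
```

`routes` comes out as `[('a2b', 'b2d'), ('a2c', 'c2d')]`: the ambiguity appears only as a consequence of imposed transitivity, not of any fault in the adapters themselves.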

Yes, there is (lato sensu) "non-determinism" involved, just like in, 
say:
     for k in d:
         print k
for a Python dictionary d - depending on how d was constructed and 
modified during its lifetime (which may in turn depend on what order 
modules were imported, etc), this produces different outputs.  Such 
non-determinism may occasionally give some problems to unwary 
programmers (who could e.g. expect d1==d2 <--> repr(d1)==repr(d2) when 
keys and values have unique repr's: the right-pointing half of this 
implication doesn't hold, so using repr(d) to stand in for d when you 
need, e.g., a set of dictionaries, is not quite sufficient); such 
problems at the margin appear to be generally considered acceptable, 
though.

It seems to me that you do at least feel some unease at the whole 
arrangement, given that you say "this whole debate has made me even 
less enamored of adaptation", as it's not clear to me that any _other_ 
aspect of "this whole debate" was quite as problematic (e.g. issues 
such as "how to best get a special method from class rather than 
instance" -- while needing to be resolved for adaptation just as much 
as for copy.py etc -- hardly seem likely to have been the ones 
prompting you to go looking for "a cleaner, more intuitive way to do 
it" outside of the canonical, widespread approach to OOP).


Anyway -- I'm pointing out that what to put in a rewrite of PEP 246 as 
a result of all this is anything but obvious at this point, at least to 
me.


Alex

From p.f.moore at gmail.com  Thu Jan 13 11:35:39 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu Jan 13 11:35:42 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <41E5EFF6.9090408@colorstudy.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
Message-ID: <79990c6b05011302352cbd41de@mail.gmail.com>

On Wed, 12 Jan 2005 21:50:14 -0600, Ian Bicking <ianb@colorstudy.com> wrote:
> Phillip J. Eby wrote:
> > At 04:07 PM 1/12/05 -0600, Ian Bicking wrote:
> >
> >> It also seems quite reasonable and unambiguous that a path object
> >> could be adapted to a IReadableFile by opening the file at the given
> >> path.
> >
> >
> > Not if you think of adaptation as an "as-a" relationship, like using a
> > screwdriver "as a" hammer (really an IPounderOfNails, or some such).  It
> > makes no sense to use a path "as a" readable file, so this particular
> > adaptation is bogus.
> 
> I started to realize that in a now-aborted reply to Steven, when my
> defense of the path->IReadableFile adaptation started making less sense.

I think I'm getting a clearer picture here (at last!)

One thing I feel is key is the fact that adaptation is a *tool*, and
as such will be used in different ways by different people. That is
not a bad thing, even if it does mean that some people will abuse the
tool.

Now, a lot of the talk has referred to "implicit" adaptation. I'm
still struggling to understand how that concept applies in practice,
beyond the case of adaptation chains - at some level, all adaptation
is "explicit", insofar as it is triggered by an adapt() call.

James Knight's example (which seemed to get lost in the discussion, or
at least no-one commented on it) brought up a new point for me, namely
the fact that it's the library writer who creates interfaces, and
calls adapt(), but it's the library *user* who says what classes
support (can be adapted to) what interface. I hadn't focused on the
different people involved before this point.

Now, if we have a transitive case A->B->C, where A is written by "the
user", and C is part of "the library" and library code calls
adapt(x,C) where x is a variable which the user supplies as an object
of type A, then WHO IS RESPONSIBLE FOR B???? And does it matter, and
if it does, then what are the differences?

As I write this, being careful *not* to talk in terms of "interfaces"
and "classes", I start to see Philip's point - in my mind, A (written
by the user) is a class, and C (part of the library) is an
"interface". So the answer to the question above about B is that it
depends on whether B is an interface or a class - and the sensible
transitivity rules could easily (I don't have the experience to
decide) depend on whether B is a class or an interface.

BUT, and again, Philip has made this point, I can't reason about
interfaces in the context of PEP 246, because interfaces aren't
defined there. So PEP 246 can't make a clear statement about
transitivity, precisely because it doesn't define interfaces. But does
this harm PEP 246? I'm not sure.

>   It's *still* not intuitively incorrect to me, but there's a couple
> things I can think of...
> 
> (a) After you adapted the path to the file, and have a side-effect of
> opening a file, it's unclear who is responsible for closing it.
> (b) The file object clearly has state the path object doesn't have, like
> a file position.
> (c) You can't  go adapting the path object to a file whenever you
> wanted, because of those side effects.

In the context of my example above, I was assuming that C was an
"interface" (whatever that might be). Here, you're talking about
adapting to a file (a concrete class), which I find to be a much
muddier concept.

This is very much a "best practices" type of issue, though. I don't
see PEP 246 mandating that you *cannot* adapt to concrete classes, but
I can see that it's a dangerous thing to do.

Even the string->path adaptation could be considered suspect. Rather,
you "should" be defining an IPath *interface*, with operations such as
join, basename, and maybe open. Then, the path class would have a
trivial adaptation to IPath, and adapting a string to an IPath would
likely do so by constructing a path object from the string. From a
practical point of view, the IPath interface adds nothing over
adapting direct to the path class, but for the purposes of clarity,
documentation, separation of concepts, etc, I can see the value.

> So those are some more practical reasons that it *now* seems bad to me,
> but that wasn't my immediate intuition, and I could have happily written
> out all the necessary code without countering that intuition.  In fact,
> I've misused adaptation before (I think) though in different ways, and
> those mistakes haven't particularly improved my intuition on the
> matter.  If you can't learn from mistakes, how can you learn?
> 
> One way is with principles and rules, even if they are flawed or
> incomplete.  Perhaps avoiding adaptation diamonds is one such rule; it
> may not be necessarily and absolutely a bad thing that there is a
> diamond, but it is often enough a sign of problems elsewhere that it may
> be best to internalize that belief anyway.  Avoiding diamonds alone
> isn't enough of a rule, but maybe it's a start.

Some mistakes are easier to avoid if you have the correct conceptual
framework. I suspect that interfaces are the conceptual framework
which make adaptation fall into place. If so, then PEP 246, and
adaptation per se, is always going to be hard to reason about for
people without a background in interfaces.

Hmm. I think I just disqualified myself from making any meaningful comments :-)

Paul.
From aleax at aleax.it  Thu Jan 13 11:47:59 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan 13 11:48:03 2005
Subject: getting special from type,
	not instance (was Re: [Python-Dev] copy confusion)
In-Reply-To: <ca471dc20501120959737d1935@mail.gmail.com>
References: <cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050111201174b86218@mail.gmail.com>
	<83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc20501120959737d1935@mail.gmail.com>
Message-ID: <97350DCE-6550-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 12, at 18:59, Guido van Rossum wrote:
    ...
> [Alex]
>> Armin's fix was to change:
    ...
> [And then proceeds to propose a new API to improve the situation]
>
> I wonder if the following solution wouldn't be more useful (since less
> code will have to be changed).
>
> The descriptor for __getattr__ and other special attributes could
> claim to be a "data descriptor" which means that it gets first pick
> *even if there's also a matching entry in the instance __dict__*.
    ...
> Normal methods are not data descriptors, so they can be overridden by
> something in __dict__; but it makes some sense that for methods
> implementing special operations like __getitem__ or __copy__, where
> the instance __dict__ is already skipped when the operation is invoked
> using its special syntax, it should also be skipped by explicit
> attribute access (whether getattr(x, "__getitem__") or x.__getitem__
> -- these are entirely equivalent).

A very nice idea for how to proceed in the future, and I think 
definitely the right solution for Python 2.5.  But maybe we need to 
think about a bugfix for 2.3/2.4, too.

> We would need to introduce a new decorator so that classes overriding
> these methods can also make those methods "data descriptors", and so
> that users can define their own methods with this special behavior
> (this would be needed for __copy__, probably).
>
> I don't think this will cause any backwards compatibility problems --
> since putting a __getitem__ in an instance __dict__ doesn't override
> the x[y] syntax, it's unlikely that anybody would be using this.

...in new-style classes, yes.  And classic types and old-style classes 
would keep behaving the old-way (with per-instance override) so the bug 
that bit the effbot would disappear... in Python 2.5.  But the bug is 
there in 2.3 and 2.4, and it seems to me we should still find a fix 
that is applicable there, even though the fix won't need to get into 
the 2.5 head, just the 2.3 and 2.4 bugfix branches.

> "Ordinary" methods will still be overridable.
>
> PS. The term "data descriptor" now feels odd, perhaps we can say "hard
> descriptors" instead. Hard descriptors have a __set__ method in
> addition to a __get__ method (though the __set__ method may always
> raise an exception, to implement a read-only attribute).

Good terminology point, and indeed explaining the ``data'' in "data 
descriptor" has always been a problem.  "Hard" or "Get-Set" descriptors 
or other terminology yet will make explanation easier; to pick the best 
terminology we should also think of the antonym, since ``non-data'' 
won't apply any more ("soft descriptors", "get-only descriptors", ...). 
  ``strong'' descriptors having a __set__, and ``weak'' ones not having 
it, is another possibility.
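Whatever the terminology ends up being, the precedence difference under discussion is easy to demonstrate:

```python
class Hard:                      # "data"/"hard": defines __set__ too
    def __get__(self, obj, objtype=None):
        return 'from descriptor'
    def __set__(self, obj, value):
        raise AttributeError('read-only')

class Soft:                      # "non-data"/"soft": __get__ only
    def __get__(self, obj, objtype=None):
        return 'from descriptor'

class C:
    hard = Hard()
    soft = Soft()

c = C()
c.__dict__['hard'] = 'from instance'
c.__dict__['soft'] = 'from instance'
# c.hard -> 'from descriptor' (the hard descriptor gets first pick)
# c.soft -> 'from instance'   (the instance __dict__ overrides)
```

This is exactly why making __copy__ and friends behave like hard descriptors would stop per-instance overrides from shadowing them.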


But back to the bugfix for copy.py (and I believe at least pprint.py 
too, though of course that's more marginal than copy.py!) in 2.3 and 
2.4: am I correct that this new descriptor idea is too big/invasive for 
this bugfix, and thus we should still be considering localized changes 
(to copy.py and pprint.py) via a function copy._get_special (or 
whatever) in 2.3.5 and 2.4.1?

This small, local, minimally invasive change to copy.py would go well 
with the other one we need (as per my latest post with
Subject: 	Re: [Python-Dev] Re: copy confusion
	Date: 	2005 January 12 10:52:10 CET
) -- having the check for issubclass(cls, type) in copy.copy() just as 
we have it in copy.deepcopy() and for the same reason (which is a bit 
wider than the comment in copy.deepcopy about old versions of Boost 
might suggest).


Alex

From ncoghlan at iinet.net.au  Thu Jan 13 13:18:27 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Thu Jan 13 13:18:32 2005
Subject: [Python-Dev] Son of PEP 246, redux
In-Reply-To: <877e9a17050112222376178511@mail.gmail.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>	<79990c6b05011205445ea4af76@mail.gmail.com>	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>	<41E59FA9.4050605@colorstudy.com>	<5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>	<20050113042605.GA58003@prometheusresearch.com>	<5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com>	<877e9a1705011221013c9de8f7@mail.gmail.com>	<5.1.1.6.0.20050113002459.02bb61c0@mail.telecommunity.com>
	<877e9a17050112222376178511@mail.gmail.com>
Message-ID: <41E66713.2060305@iinet.net.au>

Michael Walter wrote:
> Yepyep, but *how* do you declare types now? Can you quickly type the function
> def f(x): x.read() without needing an interface like ``interface x_of_f: def
> read(): pass`` or a decorator like @foo(x.read)? I've no idea what you
> mean, really :o)

Why would something like

   def f(x):
     x.read()

do any type checking at all?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From skip at pobox.com  Thu Jan 13 03:45:41 2005
From: skip at pobox.com (Skip Montanaro)
Date: Thu Jan 13 13:59:57 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <41E63EDB.40008@cs.teiath.gr>
References: <1105553300.41e56794d1fc5@mcherm.com>
	<16869.33426.883395.345417@montanaro.dyndns.org>
	<41E63EDB.40008@cs.teiath.gr>
Message-ID: <16869.57557.795447.53311@montanaro.dyndns.org>


    stelios> Yes but in order to fall into a Liskov Violation, one will have
    stelios> to use extreme OOP features (as I understand from the ongoing
    stelios> discussion for which, honestly, I understand nothing:). 

The first example here:

    http://www.compulink.co.uk/~querrid/STANDARD/lsp.htm

Looks pretty un-extreme to me.  It may not be detectable without the PEP 246
stuff, but I suspect it's pretty common.
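For reference, the canonical un-extreme Liskov violation (whether or not it is the exact one behind that link) is the square/rectangle pair:

```python
class Rectangle:
    def __init__(self, w, h):
        self.w, self.h = w, h
    def set_width(self, w):
        self.w = w
    def area(self):
        return self.w * self.h

class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)
    def set_width(self, w):
        # keeps the square invariant, but breaks code written for Rectangle
        self.w = self.h = w

def stretch(r):
    """Holds for any true Rectangle: doubling the width doubles the area."""
    old = r.area()
    r.set_width(r.w * 2)
    return r.area() == old * 2
```

stretch() succeeds for a Rectangle but fails for a Square: Square is-a Rectangle syntactically, not behaviorally, with no "extreme OOP features" in sight.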

Skip
From pje at telecommunity.com  Thu Jan 13 14:52:21 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 14:50:47 2005
Subject: [Python-Dev] Son of PEP 246, redux
In-Reply-To: <41E66713.2060305@iinet.net.au>
References: <877e9a17050112222376178511@mail.gmail.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
	<5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
	<20050113042605.GA58003@prometheusresearch.com>
	<5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com>
	<877e9a1705011221013c9de8f7@mail.gmail.com>
	<5.1.1.6.0.20050113002459.02bb61c0@mail.telecommunity.com>
	<877e9a17050112222376178511@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050113084947.020f9020@mail.telecommunity.com>

At 10:18 PM 1/13/05 +1000, Nick Coghlan wrote:
>Michael Walter wrote:
>>Yepyep, but *how* you declare types now? Can you quickly type the function
>>def f(x): x.read()? without needing an interface interface x_of_f: def
>>read(): pass or a decorator like @foo(x.read)? I've no idea what you
>>mean, really :o)
>
>Why would something like
>
>   def f(x):
>     x.read()
>
>do any type checking at all?

It wouldn't.  The idea is to make this:

    def f(x:file):
        x.read()

automatically find a method declared '@implements(file.read,X)' where X is 
in x.__class__.__mro__ (or the equivalent of MRO if x.__class__ is classic).
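A rough sketch of what this lookup could look like (the `implements` decorator, the string-keyed registry, and the helper names here are assumptions for illustration, not the proposed implementation):

```python
# Rough sketch of the lookup described above.  The registry layout,
# the 'implements' decorator, and the helper names are assumptions
# for illustration -- this is not the proposed implementation.

_implementations = {}   # (abstract operation, concrete class) -> function

def implements(abstract_op, cls):
    """Declare the decorated function as cls's version of abstract_op."""
    def decorator(func):
        _implementations[(abstract_op, cls)] = func
        return func
    return decorator

def find_implementation(abstract_op, obj):
    """Walk obj's class MRO looking for a declared implementation."""
    for klass in type(obj).__mro__:
        impl = _implementations.get((abstract_op, klass))
        if impl is not None:
            return impl
    raise TypeError("no %r implementation for %s"
                    % (abstract_op, type(obj).__name__))

class Duck:
    pass

# Duck "reads like a file" without inheriting from any file type
@implements('file.read', Duck)
def duck_read(self):
    return "quack"
```

Because the search walks `type(obj).__mro__`, subclasses of `Duck` pick up the declaration automatically, mirroring ordinary method inheritance.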

From ncoghlan at iinet.net.au  Thu Jan 13 15:30:17 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Thu Jan 13 15:30:22 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112150933.03378630@mail.telecommunity.com>
References: <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com>	<5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com>
	<5.1.1.6.0.20050112150933.03378630@mail.telecommunity.com>
Message-ID: <41E685F9.2010606@iinet.net.au>

Phillip J. Eby wrote:
> Anyway, I'm at least +0 on dropping this; the reservation is just 
> because I don't think everybody else will agree with this, and don't 
> want to be appearing to imply that consensus between you and me implies 
> any sort of community consensus on this point.  That is, the adaptation 
> from "Alex and Phillip agree" to "community agrees" is noisy at best!  ;)

You seem to be doing a pretty good job of covering the bases, though. . .

Anyway, I'd like to know if the consensus I think you've reached is the one the 
pair of you think you've reached :)

That is, with A being our starting class, C being a target class, and F being a 
target interface, the legal adaptation chains are:
   # Class to class
   A->C
   # Class to interface, possibly via other interfaces
   A(->F)*->F

With a lookup sequence of:
   1. Check the global registry for direct adaptations
   2. Ask the object via __conform__
   3a. Check using isinstance() unless 2 raised LiskovViolation
   3b. Nothing, since object.__conform__ does an isinstance() check
   4. Ask the interface via __adapt__
   5. Look for transitive chains of interfaces in the global registry.

3a & 3b are the current differing answers to the question of who should be 
checking for inheritance - the adaptation machinery or the __conform__ method.

If classes wish to adapt to things which their parents adapt to, they must 
delegate to their parent's __conform__ method as needed (or simply not override 
__conform__). The ONLY automatic adaptation links are those that allow a subtype 
to be used in place of its parent type, and this can be overridden using 
__conform__. (FWIW, this point about 'adapting to things my parent can adapt to' 
by delegating in __conform__ inclines me in favour of option 3b for handling 
subtyping. However, I can appreciate wanting to keep the PEP free of proposing 
any changes to the core - perhaps mention both, and leave the decision to the BDFL?)
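The lookup sequence listed above (taking the 3a branch) can be sketched roughly as follows; this loosely follows PEP 246's reference pseudocode, with the registry and error handling simplified as assumptions for illustration:

```python
# Rough sketch of the lookup sequence above, taking the "3a" branch.
# Loosely modelled on PEP 246's reference pseudocode; the registry
# and the error handling are simplified assumptions.

class LiskovViolation(TypeError):
    """Raised by __conform__ to suppress the isinstance() fallback."""

_registry = {}   # (concrete class, protocol) -> adapter factory

def adapt(obj, protocol):
    # 1. Check the global registry for a direct adaptation
    factory = _registry.get((type(obj), protocol))
    if factory is not None:
        return factory(obj)
    # 2. Ask the object via __conform__ (looked up on the type)
    skip_isinstance = False
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        try:
            result = conform(obj, protocol)
            if result is not None:
                return result
        except LiskovViolation:
            skip_isinstance = True
    # 3a. isinstance() check, unless step 2 raised LiskovViolation
    if not skip_isinstance and isinstance(obj, protocol):
        return obj
    # 4. Ask the protocol via __adapt__ (looked up on its type)
    adapt_hook = getattr(type(protocol), '__adapt__', None)
    if adapt_hook is not None:
        result = adapt_hook(protocol, obj)
        if result is not None:
            return result
    # (step 5 -- searching for transitive chains -- is omitted here)
    raise TypeError("can't adapt %r to %r" % (obj, protocol))

class FileLike:
    pass

# an explicitly registered str -> FileLike adaptation
_registry[(str, FileLike)] = lambda s: ('wrapped', s)
```

Note how option 3b would simply delete the inline `isinstance()` check and rely on a default `object.__conform__` to perform it in step 2 instead.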

One question - is the presence of __adapt__ enough to mark something as an 
interface in the opinion of the adaptation machinery (for purposes of step 5)?

Second question - will there be something like __conformant__ and __conforming__ 
to allow classes and interfaces to provide additional information in the 
transitive search in step 5?

Or are both of these questions more in PEP 245 territory?

Cheers,
Nick.
Almost sent this to c.l.p by mistake . . .

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From pje at telecommunity.com  Thu Jan 13 15:32:49 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 15:31:15 2005
Subject: [Python-Dev] Son of PEP 246, redux
In-Reply-To: <DA4603AC-6537-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
	<5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050113085248.020ffec0@mail.telecommunity.com>

At 08:50 AM 1/13/05 +0100, Alex Martelli wrote:
>Your proposals are novel and interesting.  They also go WAY deeper into a 
>critical reappraisal of the whole object model of Python, which has always 
>been quite reasonably close to the above-mentioned "canon" and indeed has 
>been getting _more_ so, rather than less, since 2.2 (albeit in a uniquely 
>Pythonical way, as is Python's wont -- but not conceptually, nor, mostly, 
>practically, all that VERY far from canonic OOP).

Actually, the whole generic function thing was just a way to break out of 
the typing problems the Python community has struggled with for 
years.  Every attempt to bring typing or interfaces to Python has run 
aground on simple, practical concepts like, "what is the abstract type of 
dict?  file?"

In essence, the answer is that Python's object model *already* has a 
concept of type...  duck typing!  Basically, I'm proposing a way to 
"formalize" duck typing...  when you say you want a 'file', then when you 
call the object's read method, it "reads like a file".

This isn't a critical reappraisal of Python's current object model at 
all!  It's a reappraisal of the ways we've been trying (and largely 
failing) to make it fit into the mold of other languages' object models.


>  Moreover, your proposals are at a very early stage and no doubt need a 
> lot more experience, discussion, maturation, and give-and-take.

Agreed.  In particular, operations on numeric types are the main area that 
needs conceptual work; most of the rest is implementation details.  I hope 
to spend some time prototyping an implementation this weekend (I start a 
new contract today, so I'll be busy with "real work" till then).

Nonetheless, the idea is so exciting I could barely sleep, and even as I 
woke this morning my head was spinning with comparisons of adaptation 
networks and operation networks, finding them isomorphic no matter what I 
tried.

One really exciting part is that this concept basically allows you to write 
"good" adapters, while making it very difficult or impossible to write 
"bad" ones!  Since it forces adapters to have no per-adapter state, it 
almost literally forces you to only create "as-a" adapters.  For example, 
if you try to define file operations on a string, you're dead in the water 
before you even start: strings have no place to *store* any state.  So, you 
can't adapt an immutable into a mutable.  You *can*, however, add extra 
state to a mutable, but it has to be per-object state, not per-adapter 
state.  (Coincidentally, this eliminates the need for PyProtocols' somewhat 
kludgy concept of "sticky" adapters.)
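The "no place to store per-adapter state" constraint can be illustrated with a tiny sketch (all names made up): an adapter whose only slot is the object it wraps cannot accumulate its own state, so anything extra has to live on the adapted object.

```python
# Tiny illustration of the constraint described above (names made up):
# the adapter's only slot is the object it wraps, so any extra state
# must live on the wrapped object itself, never on the adapter.

class Buffer:
    """A mutable object being adapted."""
    def __init__(self, data):
        self.data = data

class AsFileLike:
    __slots__ = ('subject',)    # no __dict__: nowhere for per-adapter state

    def __init__(self, subject):
        self.subject = subject

    def read(self):
        return self.subject.data    # pure delegation: "reads like a file"

buf = Buffer("payload")
view1, view2 = AsFileLike(buf), AsFileLike(buf)
# Two adapters are just two pointers to interfaces on the same object:
assert view1.read() == view2.read() == "payload"
# view1.position = 0 would raise AttributeError: no per-adapter state.
```

This is why such an adapter can only ever be an "as-a" view: anything it could remember, it must remember on `subject`, where every other view of the object will see it too.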

As a result of this, this adapter model is shaped like a superset of COM - 
you can adapt an object as many times as you like, and the adapters you get 
are basically "pointers to interfaces" on the same object, with no adapter 
composition.  And adapt(x,object) can always give you back the "original" 
object.

This model also has some potential to improve performance: adapter classes 
have all their methods in one dictionary, so there's no __mro__ scan, and 
they have no instance dictionary, so there's no __dict__ lookup.  This 
should mean faster lookups of methods that would otherwise be inherited, 
even without any special interpreter support.  And it also allows a 
possible fast-path opcode to be used for calling methods on a type-declared 
parameter or variable, possibly eventually streamlined to a vtable-like 
structure eliminating any dictionary lookups at all (except at function 
definition time, to bind names in the code object to vtable offsets 
obtained from the types being bound to the function signature).  But of 
course all that is much further down the road.

Anyway, I suppose the *really* exciting thing about all this is how *many* 
different problems the approach seems to address.  :)

(Like automatically detecting certain classes of Liskov violations, for 
example.  And not having to create adapter classes by hand.  And being able 
to support Ping's abstract operations.  Etc., etc., etc.)


>So, I think the best course of action at this time might be for me to edit 
>PEP 246 to reflect some of this enormously voluminous discussion, 
>including points of contention (it's part of a PEP's job to also indicate 
>points of dissent, after all); and I think you should get a new PEP number 
>to use for your new ideas, and develop them on that separate PEP, say PEP 
>XYZ.  Knowing that a rethink of the whole object-model and related canon 
>is going on at the same time should help me keep PEP 246 reasonably 
>minimal and spare, very much in the spirit of YAGNI -- as few features as 
>possible, for now.

Sounds good to me.


>If Guido, in consequence, decides to completely block 246's progress while 
>waiting for the Copernican Revolution of your new PEP XYZ to mature, so be 
>it -- his ``nose'' will no doubt be the best guide to him on the matter.

It's not that big of a revolution, really.  This should make Python *more* 
like Python, not less.


>   But I hope that, in the same pragmatic and minimalist spirit as his 
> "stop the flames" Artima post -- proposing minimalistic interfaces and 
> adaptation syntax as a starting point, while yet keeping as a background 
> reflection the rich and complicated possibilities of parameterized types 
> &c as discussed in his previous Artima entries -- he'll still give a 
> minimalistic PEP 246 the go-ahead so that widespread, real-world 
> experimentation with adaptation and his other proposals can proceed, and 
> give many Pythonistas some practical experience which will make future 
> discussions and developments much sounder-based and productive.

Well, as a practical matter, every current interface package for Python 
already implements the "old" PEP 246, so nothing stops people from 
experimenting with it now, any more than in the past.

But, since I think that my approach can simply be implemented as a more 
sophisticated version of PEP 246's new global adapter registry, there is no 
real reason for the PEPs to actually *conflict*.  The only area of 
potential conflict is that I think __conform__ and __adapt__ might be able 
to wither away completely in the presence of a suitably powerful registry.


>So, what do you think -- does this new plan of action sound reasonable to you?

Yes.  I'll prototype and PEP, unless somebody else gets as excited as I am 
and does it first.  ;)  I'll probably call it "Duck Typing and Adaptation" 
or some such.

From michael.walter at gmail.com  Thu Jan 13 15:33:17 2005
From: michael.walter at gmail.com (Michael Walter)
Date: Thu Jan 13 15:33:20 2005
Subject: [Python-Dev] Son of PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050113084947.020f9020@mail.telecommunity.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<41E59FA9.4050605@colorstudy.com>
	<5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com>
	<20050113042605.GA58003@prometheusresearch.com>
	<5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com>
	<877e9a1705011221013c9de8f7@mail.gmail.com>
	<5.1.1.6.0.20050113002459.02bb61c0@mail.telecommunity.com>
	<877e9a17050112222376178511@mail.gmail.com>
	<41E66713.2060305@iinet.net.au>
	<5.1.1.6.0.20050113084947.020f9020@mail.telecommunity.com>
Message-ID: <877e9a17050113063374b133ca@mail.gmail.com>

Ahhh, there we go, so "file" is the type you declare. That's all I was
asking for; I thought you were thinking in a different/"more sophisticated"
direction (because what "f" actually wants is not a file, but a "thing
which has a read() like a file" -- I thought one would like to manifest
that in the type instead of implicitly in the code). Your concept is
cool, tho :-)

Michael


On Thu, 13 Jan 2005 08:52:21 -0500, Phillip J. Eby
<pje@telecommunity.com> wrote:
> At 10:18 PM 1/13/05 +1000, Nick Coghlan wrote:
> >Michael Walter wrote:
> >>Yepyep, but *how* you declare types now? Can you quickly type the function
> >>def f(x): x.read()? without needing an interface interface x_of_f: def
> >>read(): pass or a decorator like @foo(x.read)? I've no idea what you
> >>mean, really :o)
> >
> >Why would something like
> >
> >   def f(x):
> >     x.read()
> >
> >do any type checking at all?
> 
> It wouldn't.  The idea is to make this:
> 
>     def f(x:file):
>         x.read()
> 
> automatically find a method declared '@implements(file.read,X)' where X is
> in x.__class__.__mro__ (or the equivalent of MRO if x.__class__ is classic).
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/michael.walter%40gmail.com
>
From cce at clarkevans.com  Thu Jan 13 15:34:21 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Thu Jan 13 15:34:28 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <79990c6b05011302352cbd41de@mail.gmail.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
Message-ID: <20050113143421.GA39649@prometheusresearch.com>

On Thu, Jan 13, 2005 at 10:35:39AM +0000, Paul Moore wrote:
| One thing I feel is key is the fact that adaptation is a *tool*, and
| as such will be used in different ways by different people. That is
| not a bad thing, even if it does mean that some people will abuse the tool.
| 
| Now, a lot of the talk has referred to "implicit" adaptation. I'm
| still struggling to understand how that concept applies in practice,
| beyond the case of adaptation chains - at some level, all adaptation
| is "explicit", insofar as it is triggered by an adapt() call.

The 'implicit' adaptation refers to the automagical construction of
composite adapters, assuming that a 'transitive' property holds. I've
seen nothing in this thread to explain why this is so valuable or why
it shouldn't be explicit; on the contrary, most of the "problems
with adapt()" seem to stem from this aggressive extension of what
was proposed. Automatic construction of adapter chains is _not_ part
of the original PEP 246, and I hope it remains that way.  I've
outlined in several posts how this case could be made easy for an
application developer to handle:

  - transitive adapters should always be explicit
  - it should be an error to have more than one adapter
    from A to Z in the registry
  - when adaptation fails, an informative error message can
    tell the application developer of possible "chains"
    which could work
  - registration of transitive adapters can be a simple command
    that application developers use:  adapt.transitive(from=A,to=Z,via=M)

| James Knight's example (which seemed to get lost in the discussion, or
| at least no-one commented on it) brought up a new point for me, namely
| the fact that it's the library writer who creates interfaces, and
| calls adapt(), but it's the library *user* who says what classes
| support (can be adapted to) what interface. I hadn't focused on the
| different people involved before this point.

I'd say the more common pattern is three players: the framework
builder, the component builder, and the application designer.  Adaptation
provides a mechanism for the framework builder (via __adapt__) and
the component builder (via __conform__) to work together without
involving the application designer.

The 'registry' idea (which was not explored in the PEP) emerges from
the need, albeit limited, for the application developer who is
plugging a component into a framework to have some say in the
process.  I think that any actions taken by the user, by registering
an adapter, should be explicit.

The 'diamond' problem discussed by Phillip has only confirmed this
belief.  You don't want the adapt() system going around assuming
transitivity.  However, if the application developer is certain that
a conversion path from A to Z going through B and/or Y will work,
then it should be easy for them to specify this adaptation path.

| Now, if we have a transitive case A->B->C, where A is written by "the
| user", and C is part of "the library" and library code calls
| adapt(x,C) where x is a variable which the user supplies as an object
| of type A, then WHO IS RESPONSIBLE FOR B???? And does it matter, and
| if it does, then what are the differences?

Great question.  But I'd like to rephrase: C is probably a framework,
A and B are probably components; and we assume that either the framework
or component developers have enabled A->B and B->C.  If the user wishes
to make an adapter from A->C, assuming no (or, for his purposes, acceptable)
information loss from A to C through B, then this is his/her choice.  However,
it shouldn't be done by the framework or component developers unless it is
a perfect adaptation, and it certainly shouldn't be automagic.

I don't think who owns B is particularly more important than A or C.

| As I write this, being careful *not* to talk interms of "interfaces"
| and "classes", I start to see Philip's point - in my mind, A (written
| by the user) is a class, and C (part of the library) is an
| "interface". So the answer to the question above about B is that it
| depends on whether B is an interface or a class - and the sensible
| transitivity rules could easily (I don't have the experience to
| decide) depend on whether B is a class or an interface.

I'd like to say that _any_ transitivity rule should be explicit; there
is a point where you make it easy for the programmer, but for heaven's
sake, let's not try to do their job.

| BUT, and again, Philip has made this point, I can't reason about
| interfaces in the context of PEP 246, because interfaces aren't
| defined there. So PEP 246 can't make a clear statement about
| transitivity, precisely because it doesn't define interfaces. But does
| this harm PEP 246? I'm not sure.

Well, PEP 246 should be edited, IMHO, to assert that all 'implicit'
adaptations are out of scope, and that if they are supported, it should
be under the direct control of the application developer.

-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From pje at telecommunity.com  Thu Jan 13 15:36:38 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 15:35:03 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <2AFC1C53-6539-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050112201718.02f85760@mail.telecommunity.com>
	<5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com>
	<1105553300.41e56794d1fc5@mcherm.com>
	<16869.33426.883395.345417@montanaro.dyndns.org>
	<5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com>
	<5.1.1.6.0.20050112201718.02f85760@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050113093338.03e58c20@mail.telecommunity.com>

At 09:00 AM 1/13/05 +0100, Alex Martelli wrote:
>Incidentally, "get this specialmethod from the type (with specialcasing 
>for classic classes &c)" is a primitive that PEP 246 needs as much as, 
>say, copy.py needs it.  In the light of the recent discussions of how to 
>fix copy.py etc, I'm unsure about what to assume there, in a rewrite of 
>PEP 246: that getattr(obj, '__aspecial__', None) always does the right 
>thing via special descriptors, that I must spell everything out, or, what 
>else...?

I think you can make it a condition that metaclasses with __conform__ or 
__adapt__ must use a data descriptor like my "metamethod" decorator.  Then, 
there is no metaconfusion since metaconfusion requires a metaclass to 
exist, and you're requiring that in that case, they must use a descriptor 
to avoid the problem.
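A simplified sketch of such a 'metamethod' data descriptor (an assumption of the idea, not PyProtocols' actual implementation): because it defines __set__, it is a *data* descriptor, so the lookup on the type cannot be shadowed, and attempts to overwrite it are rejected.

```python
# Simplified sketch of a 'metamethod' data descriptor (an assumption
# of the idea, not PyProtocols' actual implementation).  Defining
# __set__ makes this a *data* descriptor, so lookups on the type
# cannot be shadowed and assignments through instances are rejected.

class metamethod:
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self.func
        return self.func.__get__(obj, objtype)

    def __set__(self, obj, value):
        raise AttributeError("metamethod cannot be overridden")

class Meta(type):
    @metamethod
    def __conform__(cls, protocol):
        return None          # placeholder behaviour for the sketch

class C(metaclass=Meta):
    pass
```

Here `C.__conform__` reliably resolves to the metaclass's descriptor, and `C.__conform__ = ...` raises AttributeError, which is exactly what prevents the metaconfusion the thread describes.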

From pje at telecommunity.com  Thu Jan 13 15:41:31 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 15:39:56 2005
Subject: getting special from type, not instance (was Re:
	[Python-Dev] copy confusion)
In-Reply-To: <20050113101633.GA5193@vicky.ecs.soton.ac.uk>
References: <ca471dc20501120959737d1935@mail.gmail.com>
	<cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050111201174b86218@mail.gmail.com>
	<83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc20501120959737d1935@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050113093750.03e5a2d0@mail.telecommunity.com>

At 10:16 AM 1/13/05 +0000, Armin Rigo wrote:
>On the other hand, I fear that if there is a standard "metamethod" decorator
>(named after Phillip's one), it will be misused.  Reading the documentation
>will probably leave most programmers with the feeling "it's something magical
>to put on methods with __ in their names",

Possible solution: have it break when it's used in a non-subtype of 
'type'.  That is to say, when it's not used in a metaclass.


>Finally, I wonder if turning all methods whatsoever into data descriptors
>(ouch! don't hit!) would be justifiable by the feeling that it's often bad
>style and confusing to override a method in an instance (as opposed to
>defining a method in an instance when there is none on the class).

Hm.  I look at this the opposite way: sometimes it's nice to provide a 
default version of a callable that's supposed to be stuck on the object 
later, just like it's nice to have a default initial value for a variable 
supplied by the type.  I don't think that doing away with this feature for 
non-special methods is a step forwards.


>In all cases, I'm +1 on seeing built-in method objects (PyMethodDescr_Type)
>become data descriptors ("classy descriptors?" :-).

Heh.  :)

From pje at telecommunity.com  Thu Jan 13 16:08:10 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 16:06:36 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>
	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
	<5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com>

At 11:31 AM 1/13/05 +0100, Alex Martelli wrote:

>On 2005 Jan 12, at 21:42, Phillip J. Eby wrote:
>    ...
>>Anyway, hopefully this post and the other one will be convincing that 
>>considering ambiguity to be an error *reinforces* the idea of I-to-I 
>>perfection, rather than undermining it.  (After all, if you've written a 
>>perfect one, and there's already one there, then either one of you is 
>>mistaken, or you are wasting your time writing one!)
>
>I'd just like to point out, as apparently conceded in your "fair enough" 
>sentence in another mail, that all of this talk of "wasting your time 
>writing" is completely unfounded.

The above refers to two I-to-I adapters between the same two points, not 
the addition of an adapter that creates an adaptation diamond.  I have 
indeed agreed that an "innocent" adapter diamond can trigger a "false 
alarm" from PyProtocols with respect to duplication.


>  Since that "fair enough" of yours was deeply buried somewhere inside 
> this huge conversation, some readers might miss the fact that your 
> numerous repetitions of the similar concept in different words are just 
> invalid, because, to recap:
>
>Given four interfaces A, B, C, D, there may be need of each of the single 
>steps A->B, A->C, B->D, C->D.  Writing each of these four adapters can IN 
>NO WAY be considered "wasting your time writing one", because there is no 
>way a set of just three out of the four can be used to produce the fourth one.

Right, I agreed to this.  However, the fact that a test can produce false 
positives does not in and of itself mean that it's "just invalid" -- the 
question is how often the result is *useful*.


>It seems to me that you do at least feel some unease at the whole 
>arrangement, given that you say "this whole debate has made me even less 
>enamored of adaptation", as

Not exactly, because my experience to date has been that false alarms are 
exceedingly rare and I have yet to experience a *useful* adapter diamond of 
the type you've described in an actual real-life interface, as opposed to a 
made up group of A,B,C,D.    So, my unease in relation to the adapter 
diamond issue is only that I can't say for absolutely certain that if PEP 
246 use became widespread, the problem of accidental I-to-I adapter 
diamonds might not become much more common than it is now.


>it's not clear to me that any _other_ aspect of "this whole debate" was 
>quite as problematic (e.g. issues such as "how to best get a special 
>method from class rather than instance" -- while needing to be resolved 
>for adaptation just as much as for copy.py etc -- hardly seem likely to 
>have been the ones prompting you to go looking for "a cleaner, more 
>intuitive way to do it" outside of the canonical, widespread approach to OOP).

No, the part that made me seek another solution is the reactions of people 
who were relatively fresh to the debate and the concepts of adaptation, 
interfaces, etc. in Python.  The fact that virtually every single one of 
them immediately reached for what developers who were more "seasoned" in 
this concept thought of as "adapter abuse", meant to me that:

1) Any solution that relied on people doing the right thing right out of 
the gate wasn't going to work

2) The true "Pythonic" solution would be one that requires the least 
learning of interfaces, covariance, contravariance, Liskov, all that other 
stuff

3) And to be Pythonic, it would have to provide only one "obvious way to do 
it", and that way should be in some sense the "right" way, making it at 
least a little harder to do something silly.

In other words, I concluded that we "seasoned" developers might be right 
about what adaptation is supposed to be, but that our mere presentation of 
the ideas wasn't going to sway real users.

So no, my goal wasn't to fix the adapter diamond problem per se, although I 
believe that the equivalent concept in duck adaptation (overlapping 
abstract methods) will *also* be able to trivially ignore adapter diamonds 
and still warn about meaningful ambiguities.

Instead, the goal was to make it so that people who try to abuse adapters 
will quickly discover that they *can't*, and second, as soon as they ask 
how, the obvious answer will be, "well, that's because you're doing a type 
conversion.  You need to create an instance of the thing you want, because 
you can't use that thing "as a" such-and-such."  And then they will go, 
"Ah, yes, I see...  that makes sense," and then go and sin no more.

With the previous PEP, people could create all sorts of subtle problems in 
their code (with or without transitivity!) and have no direct indicator of 
a problem.  Clark and Ian made me realize this with their string/file/path 
discussions -- *nobody* is safe from implicit adaptation if adaptation 
actually creates new objects with independent state!  An adapter's state 
needs to be kept with the original object, or not at all, and most of the 
time "not at all" is the correct answer.


>Anyway -- I'm pointing out that what to put in a rewrite of PEP 246 as a 
>result of all this is anything but obvious at this point, at least to me.

LOL.  Me either!

From carribeiro at gmail.com  Thu Jan 13 16:13:51 2005
From: carribeiro at gmail.com (Carlos Ribeiro)
Date: Thu Jan 13 16:13:55 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
	<5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>
	<49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com>
Message-ID: <864d3709050113071350454789@mail.gmail.com>

On Thu, 13 Jan 2005 10:08:10 -0500, Phillip J. Eby
<pje@telecommunity.com> wrote:
> With the previous PEP, people could create all sorts of subtle problems in
> their code (with or without transitivity!) and have no direct indicator of
> a problem.  Clark and Ian made me realize this with their string/file/path
> discussions -- *nobody* is safe from implicit adaptation if adaptation
> actually creates new objects with independent state!  An adapter's state
> needs to be kept with the original object, or not at all, and most of the
> time "not at all" is the correct answer.

+1, especially for the last sentence. An adapter with local state is
not an adapter anymore! It's funny how difficult it is to get this...
but it's obvious once stated.

-- 
Carlos Ribeiro
Consultoria em Projetos
blog: http://rascunhosrotos.blogspot.com
blog: http://pythonnotes.blogspot.com
mail: carribeiro@gmail.com
mail: carribeiro@yahoo.com
From pje at telecommunity.com  Thu Jan 13 16:17:49 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 16:16:15 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <79990c6b05011302352cbd41de@mail.gmail.com>
References: <41E5EFF6.9090408@colorstudy.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
Message-ID: <5.1.1.6.0.20050113100908.020fc920@mail.telecommunity.com>

At 10:35 AM 1/13/05 +0000, Paul Moore wrote:
>Now, a lot of the talk has referred to "implicit" adaptation. I'm
>still struggling to understand how that concept applies in practice,
>beyond the case of adaptation chains - at some level, all adaptation
>is "explicit", insofar as it is triggered by an adapt() call.

It's "implicit" in that the caller of the code that contains the adapt() 
call carries no visible indication that adaptation will take place.


> >   It's *still* not intuitively incorrect to me, but there's a couple
> > things I can think of...
> >
> > (a) After you adapted the path to the file, and have a side-effect of
> > opening a file, it's unclear who is responsible for closing it.
> > (b) The file object clearly has state the path object doesn't have, like
> > a file position.
> > (c) You can't  go adapting the path object to a file whenever you
> > wanted, because of those side effects.
>
>In the context of my example above, I was assuming that C was an
>"interface" (whatever that might be). Here, you're talking about
>adapting to a file (a concrete class), which I find to be a much
>muddier concept.
>
>This is very much a "best practices" type of issue, though. I don't
>see PEP 246 mandating that you *cannot* adapt to concrete classes, but
>I can see that it's a dangerous thing to do.
>
>Even the string->path adaptation could be considered suspect. Rather,
>you "should" be defining an IPath *interface*, with operations such as
>join, basename, and maybe open. Then, the path class would have a
>trivial adaptation to IPath, and adapting a string to an IPath would
>likely do so by constructing a path object from the string. From a
>practical point of view, the IPath interface adds nothing over
>adapting direct to the path class, but for the purposes of clarity,
>documentation, separation of concepts, etc, I can see the value.

This confusion was another reason for the "Duck-Typing Adaptation" 
proposal; it's perfectly fine to take a 'path' class and "duck-type" an 
interface from it: i.e. when you adapt to 'path', then if you call 
'basename' on the object, you will either:

1. Invoke a method that someone has claimed is semantically equivalent to 
path.basename, OR

2. Get a TypeError indicating that the object you're using doesn't have 
such an operation available.

In effect, this is the duck-typing version of a Java cast: it's more 
dynamic because it doesn't require you to implement all operations "up 
front", and also because third parties can implement the operations and add 
them, and because you can define abstract operations that can implement 
operations "in terms of" other operations.
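To make the "duck-typing cast" concrete, here is a tiny illustrative sketch (the name `duck_adapt` is hypothetical, not from any proposal): either you get back an object known to support the named operations, or you get a TypeError up front.

```python
def duck_adapt(obj, protocol_methods):
    """Duck-typing 'cast': verify obj offers the named operations.

    Returns obj unchanged if every operation is present; otherwise
    raises TypeError -- no wrapper object, hence no adapter state.
    """
    missing = [name for name in protocol_methods
               if not callable(getattr(obj, name, None))]
    if missing:
        raise TypeError("object does not support: %s" % ", ".join(missing))
    return obj


class FakePath:
    def basename(self):
        return "fake"

p = duck_adapt(FakePath(), ["basename"])  # succeeds, returns the same object
print(p.basename())  # -> fake
```

Unlike a Java cast, nothing here depends on declared inheritance: any object that happens to provide the operations passes.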


>Some mistakes are easier to avoid if you have the correct conceptual
>framework. I suspect that interfaces are the conceptual framework
>which make adaptation fall into place. If so, then PEP 246, and
>adaptation per se, is always going to be hard to reason about for
>people without a background in interfaces.

Exactly, and that's a problem -- so, I think I've invented (or reinvented, 
one never knows) the concept of a "duck interface", that requires no 
special background to understand or use, because (for example) it has no 
inheritance except normal inheritance, and involves no "adapter classes" 
anywhere.  Therefore, the reasoning you already apply to ordinary Python 
classes "just works".  (Versus e.g. the interface-logic of Zope and 
PyProtocols, which is *not* ordinary Python inheritance.)


>Hmm. I think I just disqualified myself from making any meaningful 
>comments :-)

And I just requalified you.  Feel free to continue commenting.  :)

From pje at telecommunity.com  Thu Jan 13 16:26:54 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 16:25:20 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <20050113143421.GA39649@prometheusresearch.com>
References: <79990c6b05011302352cbd41de@mail.gmail.com>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050113102018.020f8860@mail.telecommunity.com>

At 09:34 AM 1/13/05 -0500, Clark C. Evans wrote:
>On Thu, Jan 13, 2005 at 10:35:39AM +0000, Paul Moore wrote:
>| One thing I feel is key is the fact that adaptation is a *tool*, and
>| as such will be used in different ways by different people. That is
>| not a bad thing, even if it does mean that some people will abuse the tool.
>|
>| Now, a lot of the talk has referred to "implicit" adaptation. I'm
>| still struggling to understand how that concept applies in practice,
>| beyond the case of adaptation chains - at some level, all adaptation
>| is "explicit", insofar as it is triggered by an adapt() call.
>
>The 'implicit' adaptation refers to the automagical construction of
>composite adapters assuming that a 'transitive' property holds.

Maybe some folks are using the term that way; I use it to mean that in this 
code:

    someOb.something(aFoo)

'aFoo' may be "implicitly adapted" because the 'something' method has a 
type declaration on the parameter.

Further, 'something' might call another method with another type 
declaration, passing the adapted version of 'foo', which results in you 
possibly getting implicit transitive adaptation *anyway*, without having 
intended it.

Also, if adapters have per-adapter state, and 'someOb.something()' is 
expecting 'aFoo' to keep some state it puts there across calls to methods 
of 'someOb', then this code won't work correctly.

All of these things are "implicit adaptation" issues, IMO, and exist even 
without PyProtocols-style transitivity.  "Duck adaptation" solves these 
issues by prohibiting per-adapter state and by making adaptation 
order-insensitive.  (I.e. adapt(adapt(a,B),C) should always produce the 
same result as adapt(a,C).)
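The per-adapter-state failure mode can be shown in a few lines (illustrative only; `adapt` and `Adapter` here are hypothetical stand-ins, not PEP 246 machinery): state stashed on an adapter silently disappears the next time the same object is adapted.

```python
class Adapter:
    """An adapter that (wrongly) carries its own state."""
    def __init__(self, subject):
        self.subject = subject

def adapt(obj, cls):
    # Naive adaptation: build a fresh wrapper each time.
    return obj if isinstance(obj, cls) else cls(obj)

class Foo:
    pass

f = Foo()
a1 = adapt(f, Adapter)
a1.counter = 1           # framework stashes state on the adapter...
a2 = adapt(f, Adapter)   # ...but a second adaptation yields a fresh adapter
print(hasattr(a2, "counter"))  # -> False: the state is gone
```

This is exactly why "keep adapter state with the original object, or not at all" makes the result order-insensitive.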

From mchermside at ingdirect.com  Thu Jan 13 16:42:44 2005
From: mchermside at ingdirect.com (Chermside, Michael)
Date: Thu Jan 13 16:42:50 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
Message-ID: <0CFFADBB825C6249A26FDF11C1772AE101F6861E@ingdexj1.ingdirect.com>

Phillip writes:
> IMO, it's simpler to handle this use case by letting __conform__ return
> None, since this allows people to follow the One Obvious Way to not
> conform to a particular protocol.
>
> Then, there isn't a need to even worry about the exception name in the
> first place, either...

+1. Writing a default __conform__ for object is reasonable.

Alex writes:
> I'd rather not make a change to built-in ``object'' a prereq for PEP 246

Why not? Seems quite reasonable. Before __conform__ existed, there
wasn't one for object; now that it exists, object needs one.

-- Michael Chermside



This email may contain confidential or privileged information. If you believe you have received the message in error, please notify the sender and delete the message without copying or disclosing it.

From ncoghlan at iinet.net.au  Thu Jan 13 17:03:57 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Thu Jan 13 17:04:02 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <864d3709050113071350454789@mail.gmail.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>	<79990c6b05011205445ea4af76@mail.gmail.com>	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>	<ca471dc205011207261a8432c@mail.gmail.com>	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>	<5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>	<49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it>	<5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com>
	<864d3709050113071350454789@mail.gmail.com>
Message-ID: <41E69BED.9050508@iinet.net.au>

Carlos Ribeiro wrote:
> On Thu, 13 Jan 2005 10:08:10 -0500, Phillip J. Eby
> <pje@telecommunity.com> wrote:
> 
>>With the previous PEP, people could create all sorts of subtle problems in
>>their code (with or without transitivity!) and have no direct indicator of
>>a problem.  Clark and Ian made me realize this with their string/file/path
>>discussions -- *nobody* is safe from implicit adaptation if adaptation
>>actually creates new objects with independent state!  An adapter's state
>>needs to be kept with the original object, or not at all, and most of the
>>time "not at all" is the correct answer.
> 
> 
> +1, especially for the last sentence. An adapter with local state is
> not an adapter anymore! It's funny how difficult it is to get this...
> but it's obvious once stated.

+lots

Now that it's been stated, I think this is similar to where implicit type 
conversions in C++ go wrong, and to the extent that PEP 246 aligns with 
those... *shudder*.

I've also learned from this discussion just how wrong my own ideas about how to 
safely use adaptation were. Most Python programmers aren't going to have the 
benefit of listening to some smart people work through the various issues in public.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From p.f.moore at gmail.com  Thu Jan 13 17:19:20 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu Jan 13 17:19:24 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050113102018.020f8860@mail.telecommunity.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5.1.1.6.0.20050113102018.020f8860@mail.telecommunity.com>
Message-ID: <79990c6b050113081924bcf274@mail.gmail.com>

On Thu, 13 Jan 2005 10:26:54 -0500, Phillip J. Eby
<pje@telecommunity.com> wrote:
> At 09:34 AM 1/13/05 -0500, Clark C. Evans wrote:
> >On Thu, Jan 13, 2005 at 10:35:39AM +0000, Paul Moore wrote:
> >| One thing I feel is key is the fact that adaptation is a *tool*, and
> >| as such will be used in different ways by different people. That is
> >| not a bad thing, even if it does mean that some people will abuse the tool.
> >|
> >| Now, a lot of the talk has referred to "implicit" adaptation. I'm
> >| still struggling to understand how that concept applies in practice,
> >| beyond the case of adaptation chains - at some level, all adaptation
> >| is "explicit", insofar as it is triggered by an adapt() call.
> >
> >The 'implicit' adaptation refers to the automagical construction of
> >composite adapters assuming that a 'transitive' property holds.
> 
> Maybe some folks are using the term that way; I use it to mean that in this
> code:
> 
>     someOb.something(aFoo)
> 
> 'aFoo' may be "implicitly adapted" because the 'something' method has a
> type declaration on the parameter.

Whoa! At this point in time, parameters do not have type declarations,
and PEP 246 does NOTHING to change that.

In terms of Python *now* you are saying that if someOb.something is
defined like so:

    def something(self, f):   # a method on someOb's class
        adapted_f = adapt(f, ISomethingOrOther)
        ...

then aFoo is being "implicitly adapted". I'm sorry, but this seems to
me to be a completely bogus argument. The caller of someOb.something
has no right to know what goes on internal to the method. I would
assume that the documented interface of someOb.something would state
that its parameter "must be adaptable to ISomethingOrOther" - but
there's nothing implicit going on here, beyond the entirely sensible
presumption that if a method requires an argument to be adaptable,
it's because it plans on adapting it!

And as a user of someOb.something, I would be *entirely* comfortable
with being given the responsibility of ensuring that relevant
adaptations exist.
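For reference, the explicit lookup being discussed can be sketched in a few lines (a simplified, illustrative rendering of the PEP 246 idea -- `IGreeter` and `Greeter` are made-up names, and real PEP 246 also handles isinstance and Liskov subtleties omitted here):

```python
class IGreeter:
    """A protocol object with an __adapt__ hook."""
    @staticmethod
    def __adapt__(obj):
        # Accept any object that provides the one required operation.
        if hasattr(obj, "greet"):
            return obj
        return None

def adapt(obj, protocol):
    """Simplified adapt(): ask the object via __conform__, then the
    protocol via __adapt__, else fail loudly."""
    conform = getattr(type(obj), "__conform__", None)
    if conform is not None:
        result = conform(obj, protocol)
        if result is not None:
            return result
    adapt_hook = getattr(protocol, "__adapt__", None)
    if adapt_hook is not None:
        result = adapt_hook(obj)
        if result is not None:
            return result
    raise TypeError("cannot adapt %r to %r" % (obj, protocol))

class Greeter:
    def greet(self):
        return "hello"

g = adapt(Greeter(), IGreeter)  # succeeds via IGreeter.__adapt__
print(g.greet())  # -> hello
```

Nothing here happens behind the caller's back: the adapt() call sits right in the method that documents the requirement.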

> Further, 'something' might call another method with another type
> declaration, passing the adapted version of 'foo', which results in you
> possibly getting implicit transitive adaptation *anyway*, without having
> intended it.

So you think it's reasonable for someOb.something to pass adapted_f on
to another function? I don't - it should pass f on. OK, that's a major
disadvantage of Guido's type declaration proposal - the loss of the
original object - but raise that with Guido, not with PEP 246. I'd
suspect that one answer from the POV of Guido's proposal would be a
way of introspecting the original object - but even that would be
horribly dangerous because of the significant change in semantics of
an argument when a type declaration is added. (This may kill
type-declaration-as-adaptation, so it's certainly serious, but *not*
in terms of PEP 246).

> Also, if adapters have per-adapter state, and 'someOb.something()' is
> expecting 'aFoo' to keep some state it puts there across calls to methods
> of 'someOb', then this code won't work correctly.

That one makes my brain hurt. But you've already said that per-adapter
state is bad, so maybe the pain is good for me :-)

> All of these things are "implicit adaptation" issues, IMO, and exist even
> withoutPyProtocols-style transitivity.

I'd certainly not characterise them as "implicit adaptation" issues
(except for the type declaration one, which doesn't apply to Python as
it is now), and personally (as I hope I've explained above) I don't
see them as PEP 246 issues, either.

> ... "Duck adaptation" solves these issues ...

... and may indeed be a better way of using Guido's type declarations
than making them define implicit adaptation. So I do support a
separate PEP for this. But I suspect its implementation timescale will
be longer than that of PEP 246...

Paul.
From cce at clarkevans.com  Thu Jan 13 17:57:02 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Thu Jan 13 17:57:04 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <41E69BED.9050508@iinet.net.au>
References: <79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
	<5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>
	<49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com>
	<864d3709050113071350454789@mail.gmail.com>
	<41E69BED.9050508@iinet.net.au>
Message-ID: <20050113165701.GC14084@prometheusresearch.com>

On Fri, Jan 14, 2005 at 02:03:57AM +1000, Nick Coghlan wrote:
| Carlos Ribeiro wrote:
| > On Thu, 13 Jan 2005 10:08:10 -0500, Phillip J. Eby wrote:
| > > With the previous PEP, people could create all sorts of subtle problems 
| > > in their code (with or without transitivity!) and have no direct 
| > > indicator of a problem.  Clark and Ian made me realize this with their 
| > > string/file/path discussions -- *nobody* is safe from implicit 
| > > adaptation if adaptation actually creates new objects with independent
| > > state!  An adapter's state needs to be kept with the original object,
| > > or not at all, and most of the time "not at all" is the correct answer.
| >
| >+1, especially for the last sentence. An adapter with local state is
| >not an adapter anymore! It's funny how difficult it is to get this...
| >but it's obvious once stated.
| 
| +lots

-1

There is nothing wrong with an adapter from String to File, one
which adds the current read position in its state.  No adapter is a
perfect translation -- or you wouldn't need them in the first place.
An adapter, by default, does just that: it wraps the object to make
it compliant with another interface.  To disallow it from having
local state is like taking the wheels off a car and expecting it
to be useful (in some cases it is, for ice fishing, but that's
another and rather oblique story).
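The String-to-File adapter being defended is easy to picture (a sketch using the modern `io` module; the function name is made up): the read position necessarily lives in the adapter, because a str has nowhere to keep it.

```python
import io

def as_readable_file(s):
    """Explicit str -> readable-file adapter.

    The read position is deliberately local adapter state: the
    underlying string is immutable and cannot carry a cursor.
    """
    return io.StringIO(s)

f = as_readable_file("line1\nline2\n")
print(f.readline())  # -> line1
print(f.readline())  # -> line2
```

Used explicitly like this, the state is harmless; the trouble only starts when such an adapter is constructed implicitly and repeatedly behind the caller's back.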

Ian stated the issue properly: adapters bring with them an intent,
which cannot, in a general way, be expressed in code.  Therefore,
combining adapters haphazardly will, of course, get you into
trouble.  The solution is simple -- don't do that.  PEP 246 should
at the very least remain silent on this issue; it should not
encourage or specify automagic transitive adaptation.  If users
blow their foot off by their own actions, they will be able to
track it down and learn; if the system shoots their foot off via
some automatic transitive adaptation, well, that's another issue.

| Now that it's been stated, I think this is similar to where implicit type 
| conversions in C++ go wrong, and to the extent that PEP 246 aligns with 
| those... *shudder*.

PEP 246 doesn't align at all with this problem. 

Best,

Clark

-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From gvanrossum at gmail.com  Thu Jan 13 18:02:10 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Jan 13 18:02:12 2005
Subject: getting special from type,
	not instance (was Re: [Python-Dev] copy confusion)
In-Reply-To: <20050113101633.GA5193@vicky.ecs.soton.ac.uk>
References: <cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050111201174b86218@mail.gmail.com>
	<83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc20501120959737d1935@mail.gmail.com>
	<20050113101633.GA5193@vicky.ecs.soton.ac.uk>
Message-ID: <ca471dc205011309023203847f@mail.gmail.com>

> > The descriptor for __getattr__ and other special attributes could
> > claim to be a "data descriptor"
> 
> This has the nice effect that x[y] and x.__getitem__(y) would again be
> equivalent, which looks good.
> 
> On the other hand, I fear that if there is a standard "metamethod" decorator
> (named after Phillip's one), it will be misused.  Reading the documentation
> will probably leave most programmers with the feeling "it's something magical
> to put on methods with __ in their names", and it won't be long before someone
> notices that you can put this decorator everywhere in your classes (because it
> won't break most programs) and gain a tiny performance improvement.
> 
> I guess that a name-based hack in type_new() to turn all __*__() methods into
> data descriptors would be even more obscure?

To the contrary, I just realized that this would in fact be the right
approach. In particular, any *descriptor* named __*__ would be
considered a "data descriptor". Non-descriptors with such names can
still be overridden in the instance __dict__ (I believe this is used
by Zope).

> Finally, I wonder if turning all methods whatsoever into data descriptors
> (ouch! don't hit!) would be justifiable by the feeling that it's often bad
> style and confusing to override a method in an instance (as opposed to
> defining a method in an instance when there is none on the class).
> (Supporting this claim: Psyco does this simplifying hypothesis for performance
> reasons and I didn't see yet a bug report for this.)

Alas, it's a documented feature that you can override a (regular)
method by placing an appropriate callable in the instance __dict__.
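The distinction in play can be demonstrated directly (a small sketch of standard descriptor-lookup semantics, not of any proposed change):

```python
class DataDescr:
    """A data descriptor: defines both __get__ and __set__."""
    def __get__(self, obj, objtype=None):
        return "from data descriptor"
    def __set__(self, obj, value):
        raise AttributeError("read-only")

class C:
    d = DataDescr()
    def m(self):                     # plain functions are non-data descriptors
        return "from class method"

c = C()
c.__dict__["m"] = lambda: "from instance dict"  # shadows the method
print(c.m())   # -> from instance dict: non-data descriptors lose to __dict__

c.__dict__["d"] = "shadow attempt"
print(c.d)     # -> from data descriptor: data descriptors win over __dict__
```

So making the __*__ slots behave like data descriptors would stop instance-level overrides for them, while regular methods would keep the documented override behavior.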

> In all cases, I'm +1 on seeing built-in method objects (PyMethodDescr_Type)
> become data descriptors ("classy descriptors?" :-).

Let's do override descriptors.

And please, someone fix copy.py in 2.3 and 2.4.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From cce at clarkevans.com  Thu Jan 13 18:09:15 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Thu Jan 13 18:09:18 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <41E685F9.2010606@iinet.net.au>
References: <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com>
	<5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com>
	<5.1.1.6.0.20050112150933.03378630@mail.telecommunity.com>
	<41E685F9.2010606@iinet.net.au>
Message-ID: <20050113170915.GD14084@prometheusresearch.com>

On Fri, Jan 14, 2005 at 12:30:17AM +1000, Nick Coghlan wrote:
| Anyway, I'd like to know if the consensus I think you've reached is the 
| one the pair of you think you've reached :)

This stated position is not the current PEP status, and it's
not a consensus position I share.

| That is, with A being our starting class, C being a target class, and F 
| being a target interface, the legal adaptation chains are:
|   # Class to class
|   A->C
|   # Class to interface, possibly via other interfaces
|   A(->F)*->F

PEP 246 should not talk about legal or illegal adaptation chains;
all adaptation chains should be explicit, and if they are explicit,
the programmer who specified them has made them legal.

| With a lookup sequence of:
|   1. Check the global registry for direct adaptations
|   2. Ask the object via __conform__
|   3a. Check using isinstance() unless 2 raised LiskovViolation
|   3b. Nothing, since object.__conform__ does an isinstance() check
|   4. Ask the interface via __adapt__

These are OK up to 4.

|   5. Look for transitive chains of interfaces in the global registry.

No! No! No!  Perhaps...

    5. Raise an AdaptationFailed error, which includes the protocol
       which is being asked for.  This error message _could_ also
       include a list of possible adaptation chains from the global
       registry, but this is just a suggestion.
       
| 3a & 3b are the current differing answers to the question of who should 
| be checking for inheritance - the adaptation machinery or the __conform__ 
| method.

Correct.  I think either method is OK, and prefer Phillip's approach.

Best,

Clark


-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From aleax at aleax.it  Thu Jan 13 18:27:08 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan 13 18:27:14 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <20050113143421.GA39649@prometheusresearch.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
Message-ID: <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 13, at 15:34, Clark C. Evans wrote:
    ...
> The 'implicit' adaptation refers to the automagical construction of
> composite adapters assuming that a 'transitive' property holds. I've
> seen nothing in this thread to explain why this is so valuable, why

Let me play devil's advocate: I _have_ seen explanations of why 
transitive adaptation can be convenient -- the most direct one being an 
example by Guido which came in two parts, the second one a 
clarification which came in response to my request about the first one. 
  To summarize it: say we have N concrete classes A1, A2, ... AN which 
all implement interface I.
Now we want to integrate into the system function f1, which requires an 
argument with interface J1, i.e.
     def f1(x):
         x = adapt(x, J1)
         ...
or in Guido's new notation equivalently
     def f1(x: J1):
         ...
and also f2, ..., fM, requiring an argument with interface J2, ..., JM 
respectively.

Without transitivity, we need to code and register M*N adapters.  WITH 
transitivity, we only need M: I->J1, I->J2, ..., I->JM.
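The M-versus-M*N saving can be sketched with a toy registry (names and one-level chain search are my own illustration, not PEP 246 or PyProtocols machinery):

```python
# Registry of direct adapters keyed by (from_protocol, to_protocol).
registry = {}

def register(src, dst, func):
    registry[(src, dst)] = func

def adapt_transitive(obj, src, dst):
    """Use a direct adapter if one exists, else compose a two-step
    chain through any intermediate protocol (one hop of transitivity)."""
    if (src, dst) in registry:
        return registry[(src, dst)](obj)
    for (a, mid), first in registry.items():
        if a == src and (mid, dst) in registry:
            return registry[(mid, dst)](first(obj))
    raise TypeError("no adaptation path from %s to %s" % (src, dst))

# Each concrete class registers one adapter to I; each target Jk needs
# only one adapter I->Jk, instead of a direct adapter per (class, Jk) pair.
register("A1", "I", lambda x: ("I", x))
register("I", "J1", lambda x: ("J1", x))
print(adapt_transitive("payload", "A1", "J1"))  # -> ('J1', ('I', 'payload'))
```

The composed chain A1->I->J1 is exactly the "automagical" construction under discussion: convenient when the I->Jk adapters are lossless, hazardous when they are not.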

The convenience of this is undeniable; and (all other things being 
equal) convenience raises productivity and thus is valuable.

James Knight gave a real-life example, although, since no combinatorial 
explosion was involved, the extra convenience that he missed in 
transitivity was minor compared to the potential for it when the N*M 
issue should arise.

> it shouldn't be explicit,

On this point I'm partly with you: I do not see any real loss of 
convenience in requiring that an adapter which is so perfect and 
lossless as to be usable in transitivity chains be explicitly so 
registered/defined/marked.  E.g., provide a
     registerAdapter_TRANSITIVITY_SUITABLE(X, Y)
entry in addition to the standard registerAdapter which does not supply 
transitivity (or equivalently an optional suitable_for_transitivity 
argument to registerAdapter defaulting to False, etc, etc).
In terms of "should" as opposed to convenience, though, the argument is 
that interface to interface adapters SHOULD always, inherently be 
suitable for transitive chains because there is NO reason, EVER, under 
ANY circumstances, to have such adapters be less than perfect, 
lossless, noiseless, etc, etc.  I am not entirely convinced of this but 
"devil's advocacy wise" I could probably argue for it: for the hapless 
programmers' own good, they should be forced to think very VERY 
carefully of what they're doing, etc, etc.  Yeah, I know, I don't sound 
convincing because I'm not all that convinced myself;-).

>  and on the contrary, most of the "problems
> with adapt()" seem to stem from this aggressive extension of what
> was proposed: Automatic construction of adapter chains is _not_ part

Fair enough, except that it's not just chains of explicitly registered 
adapters: interface inheritance has just the same issues, indeed, in 
PJE's experience, MORE so, because no code is interposed -- if by 
inheriting an interface you're asserting 100% no-problem 
substitutability, the resulting "transitivity" may thus well give 
problems (PJE and I even came close to agreeing that MS COM's 
QueryInterface idea that interface inheritance does NOT implicitly and 
undeniably assert substitutability is very practical, nice, and 
usable...).

> of the original PEP 246 and I hope it remains that way.   I've
> outlined in several posts how this case could be made easy for a
> application developer to do:
>
>   - transitive adapters should always be explicit

What about "registered explicitly as being suitable for transitivity", 
would that suffice?

>   - it should be an error to have more than one adapter
>     from A to Z in the registry

OK, I think.  There has to be a way to replace an adapter with another, 
I think, but it might be fair to require that this be done in two 
steps:
     unregister the old adapter, THEN immediately
     register the new one
so that trying to register an adapter for some A->Z pair which already 
has one is an error.  Replacing adapters feels like a rare enough 
operation that the tiny inconvenience should not be a problem, it 
appears to me.

>   - when adaptation fails, an informative error message can
>     tell the application developer of possible "chains"
>     which could work

Nice, if not too much work.

>   - registration of transitive adapters can be a simple command
>     application developers use:  adapt.transitive(from=A, to=Z, via=M)

OK, convenient if feasible.


Considering all of your proposals, I'm wondering: would you be willing 
to author the next needed draft of the PEP in the minimal spirit I was 
proposing?  Since I wrote the last round, you might do a better job of 
editing the next, and if more rounds are needed we could keep 
alternating (perhaps using private mail to exchange partially edited 
drafts, too)...

> The 'registry' idea (which was not explored in the PEP) emerges from
> the need, albeit limited, for the application developer who is
> plugging a component into a framework, to have some say in the
> process.  I think that any actions taken by the user, by registering
> an adapter, should be explicit.

Jim Fulton is VERY keen to have registration of adapters happen "behind 
the scenes" at startup, starting from some kind of declarative form (a 
configuration textfile or the like), given the deployment needs of 
Zope3 -- that shouldn't be a problem, it seems to me (if we have a way 
to to explicit registrations, zope3 can have a startup component that 
finds configuration files and does the registration calls based on that 
'declarative' form).

This potentially opens the door to N-players scenarios for N>3, but, 
like going from 3-tier to N-tier applications, that's not as huge a 
jump as that from N==2 to N==3;-).


> | BUT, and again, Philip has made this point, I can't reason about
> | interfaces in the context of PEP 246, because interfaces aren't
> | defined there. So PEP 246 can't make a clear statement about
> | transitivity, precisely because it doesn't define interfaces. But 
> does
> | this harm PEP 246? I'm not sure.
>
> Well, PEP 246 should be edited, IMHO, to assert that all 'implicit'
> adaptions are out-of-scope, and if they are supported should be done
> so under the direct control of the application developer.

So, are you willing to do that round of editing to PEP 246...?  I'll 
then do the NEXT one, which will still undoubtedly be needed...


Alex

From aleax at aleax.it  Thu Jan 13 18:35:52 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan 13 18:35:57 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com>
References: <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>
	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
	<5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>
	<5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com>
Message-ID: <928A1C83-6589-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 13, at 16:08, Phillip J. Eby wrote:

> this with their string/file/path discussions -- *nobody* is safe from 
> implicit adaptation if adaptation actually creates new objects with 
> independent state!  An adapter's state needs to be kept with the 
> original object, or not at all, and most of the time "not at all" is 
> the correct answer.

So, no way to wrap a str with a StringIO to adapt to 
"IReadableFile"...?  Ouch:-(  That was one of my favourite trivial use 
cases...

>> Anyway -- I'm pointing out that what to put in a rewrite of PEP 246 
>> as a result of all this is anything but obvious at this point, at 
>> least to me.
>
> LOL.  Me either!

...so let's hope Clark has clearer ideas, as it appears he does: as per 
a previous msg, I've asked him if he could be the one doing the next 
round of edits, instead of me...


Alex

From pje at telecommunity.com  Thu Jan 13 18:38:46 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 13 18:37:13 2005
Subject: getting special from type, not instance (was Re:
	[Python-Dev] copy confusion)
In-Reply-To: <ca471dc205011309023203847f@mail.gmail.com>
References: <20050113101633.GA5193@vicky.ecs.soton.ac.uk>
	<cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050111201174b86218@mail.gmail.com>
	<83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc20501120959737d1935@mail.gmail.com>
	<20050113101633.GA5193@vicky.ecs.soton.ac.uk>
Message-ID: <5.1.1.6.0.20050113123715.030839b0@mail.telecommunity.com>

At 09:02 AM 1/13/05 -0800, Guido van Rossum wrote:
>[Armin]
> > I guess that a name-based hack in type_new() to turn all __*__() 
> > methods into
> > data descriptors would be even more obscure?
>
>To the contrary, I just realized that this would in fact be the right
>approach. In particular, any *descriptor* named __*__ would be
>considered a "data descriptor". Non-descriptors with such names can
>still be overridden in the instance __dict__ (I believe this is used
>by Zope).

It should check that the __*__-named thing isn't *already* an override 
descriptor, though.
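(The distinction at stake, for anyone following along -- a plain-Python illustration, not part of any patch:)

```python
class NonData:
    # Only __get__: a non-data descriptor.
    def __get__(self, obj, owner=None):
        return "from class"

class Data(NonData):
    # Adding __set__ makes it a data ("override") descriptor.
    def __set__(self, obj, value):
        raise AttributeError("read-only")

class C:
    nd = NonData()
    d = Data()

c = C()
c.__dict__['nd'] = "from instance"
c.__dict__['d'] = "from instance"
print(c.nd)  # 'from instance': instance __dict__ shadows a non-data descriptor
print(c.d)   # 'from class':    a data descriptor wins over the instance __dict__
```

Making __*__-named methods data descriptors would give them the second behavior.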

From aleax at aleax.it  Thu Jan 13 18:43:16 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan 13 18:43:24 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <864d3709050113071350454789@mail.gmail.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
	<5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>
	<49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com>
	<864d3709050113071350454789@mail.gmail.com>
Message-ID: <9AD5E532-658A-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 13, at 16:13, Carlos Ribeiro wrote:
    ...
| > +1, especially for the last sentence. An adapter with local state is
| > not an adapter anymore! It's funny how difficult it is to get this...
> but it's obvious once stated.

...?  A StringIO instance adapting a string to be used as a 
readable file is not an adapter?!  It's definitely a pristine example of 
the Adapter Design Pattern (per the GoF), anyway... and partly because of 
that I think it SHOULD be just fine as an ``adapter''... honestly I 
fail to see what's wrong with the poor StringIO instance keeping the 
"we have read as far as HERE" index as its "local state" (imagine a 
readonlyStringIO if you want, just to make for simpler concerns).

Or, consider a View in a Model-View-Controller arrangement; can't we 
have THAT as an adapter either, because (while getting most data from 
the Model) it must still record a few presentation-only details 
locally, such as, say, the font to use?

I'm not sure I'm all that enthusiastic about this crucial aspect of 
PJE's new "more pythonic than Python" [r]evolution, if it's being 
presented correctly here.


Alex

From cce at clarkevans.com  Thu Jan 13 18:46:43 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Thu Jan 13 18:46:45 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <ca471dc20501121315227e3a89@mail.gmail.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011210166c14e3f4@mail.gmail.com>
	<20050112195711.GA1813@prometheusresearch.com>
	<ca471dc20501121315227e3a89@mail.gmail.com>
Message-ID: <20050113174643.GB35655@prometheusresearch.com>

On Wed, Jan 12, 2005 at 01:15:20PM -0800, Guido van Rossum wrote:
| [Clark]
| >   - add a flag to adapt, allowTransitive, which defaults to False
| 
| That wouldn't work very well when most adapt() calls are invoked
| implicitly through signature declarations (per my blog's proposal).

Understood.  This was a side-suggestion -- not the main thrust of my
response.  I'm writing to convince you that automatic "combined"
adaptation, even as a last resort, is a bad idea.  It should be
manual, but we can provide mechanisms that let application developers 
specify combined adapters easily.

On Wed, Jan 12, 2005 at 02:57:11PM -0500, Clark C. Evans wrote:
| On Wed, Jan 12, 2005 at 10:16:14AM -0800, Guido van Rossum wrote:
| | But now, since I am still in favor of automatic "combined" adaptation
| | *as a last resort*

A few problems with automatic "combined" adaptation:

  1. Handling the case of multiple adaptation pathways is one issue;
     how do you choose? There isn't a good cost algorithm, since the
     goodness of an adapter depends largely on the programmer's need.

  2. Importing or commenting out the import of a module that may seem
     to have little bearing on a given chunk of code could cause
     subtle changes in behavior or adaptation errors, as a new path
     becomes available, or a previously working path is disabled.
    
  3. The technique causes people to want to say what is and isn't
     an adapter -- when this choice should be solely up to the
     appropriate developers.  I'd rather not have to standardize
     that FileName -> File is a _bad_ adaptation, but File -> String
     is a good adaptation.  Or whatever is in vogue that year.

  4. It's overly complicated for what it does.  I assert that this is
     a very minor use case. When transitive adaptation is needed,
     an explicit registration of an adapter can be made simple.

My current suggestion to make 'transitive adaptation' easy for an 
application builder (one putting together components) has a few
small parts:

  - If an adaptation is not found, raise an error, but list in
    the error message two additional things: (a) what possible
    adaptation paths exist, and (b) how to register one of
    these paths in their module.
   
  - A simple method to register an adaptation path; the error message
    above can even give the exact line needed:
  
       adapt.registerPath(src=A, to=C, via=B)
   
  - Make it an error to register more than one adapter from A
    to C, so that conflicts can be detected.  Also, registrations
    could be 'module specific', or local, so that adapters used 
    by a library need not necessarily be global.
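(A rough sketch of how such a registry could behave; registerAdapter, registerPath, and adapt are this proposal's hypothetical names, not an existing API:)

```python
_adapters = {}  # (src, dst) -> adapter factory

def registerAdapter(src, dst, factory):
    # At most one adapter per (src, dst) pair, so conflicts are detected.
    if (src, dst) in _adapters:
        raise ValueError("adapter %s->%s already registered" % (src, dst))
    _adapters[(src, dst)] = factory

def registerPath(*hops):
    # registerPath(A, B, C) explicitly blesses the composite A->B->C
    # and registers it as the single A->C adapter.
    def composed(obj):
        for src, dst in zip(hops, hops[1:]):
            obj = _adapters[(src, dst)](obj)
        return obj
    registerAdapter(hops[0], hops[-1], composed)

def adapt(obj, dst):
    try:
        factory = _adapters[(type(obj), dst)]
    except KeyError:
        raise TypeError("cannot adapt %r to %s" % (obj, dst))
    return factory(obj)

# Toy demonstration with builtin types standing in for interfaces:
registerAdapter(int, str, str)      # int -> str
registerAdapter(str, float, float)  # str -> float
registerPath(int, str, float)       # explicit transitive int -> float
print(adapt(3, float))              # 3.0
```

The key point is that the transitive path is registered by a human, never synthesized automatically.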
    
In general, I think registries suffer all sorts of namespace and
scoping issues, which is why I had proposed __conform__ and __adapt__.
Extending the registry mechanism with automatic 'transitive' adapters 
makes things even worse.
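(For reference, the registry-free core of the original proposal fits in a dozen lines -- a simplified sketch; the actual PEP 246 text specifies more checks and error handling:)

```python
def adapt(obj, protocol, default=None):
    # 1. Trivial case: the object already satisfies the protocol.
    if isinstance(protocol, type) and isinstance(obj, protocol):
        return obj
    # 2. Ask the object (via its type) whether it can conform.
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        adapted = conform(obj, protocol)
        if adapted is not None:
            return adapted
    # 3. Ask the protocol whether it can adapt the object.
    adapt_hook = getattr(protocol, '__adapt__', None)
    if adapt_hook is not None:
        adapted = adapt_hook(obj)
        if adapted is not None:
            return adapted
    if default is not None:
        return default
    raise TypeError("cannot adapt %r to %r" % (obj, protocol))

class AsInt:
    # A toy protocol that knows how to adapt objects to itself.
    @classmethod
    def __adapt__(cls, obj):
        try:
            return int(obj)
        except (TypeError, ValueError):
            return None

print(adapt("42", AsInt))  # 42
```

Note there is no registry here at all, hence no namespace or scoping question: the two hooks live on the parties involved.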

Cheers,

Clark
-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From aleax at aleax.it  Thu Jan 13 18:58:36 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan 13 18:58:43 2005
Subject: getting special from type,
	not instance (was Re: [Python-Dev] copy confusion)
In-Reply-To: <ca471dc205011309023203847f@mail.gmail.com>
References: <cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050111201174b86218@mail.gmail.com>
	<83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc20501120959737d1935@mail.gmail.com>
	<20050113101633.GA5193@vicky.ecs.soton.ac.uk>
	<ca471dc205011309023203847f@mail.gmail.com>
Message-ID: <BF4DF506-658C-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 13, at 18:02, Guido van Rossum wrote:
    ...
>> In all cases, I'm +1 on seeing built-in method objects 
>> (PyMethodDescr_Type)
>> become data descriptors ("classy descriptors?" :-).
>
> Let's do override descriptors.

A Pronouncement!!!

> And please, someone fix copy.py in 2.3 and 2.4.

Sure -- what way, though?  The way I proposed in my last post about it?


Alex

From cce at clarkevans.com  Thu Jan 13 19:21:42 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Thu Jan 13 19:21:46 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <20050113182142.GC35655@prometheusresearch.com>


On Thu, Jan 13, 2005 at 06:27:08PM +0100, Alex Martelli wrote:
| >The 'implicit' adaptation refers to the automagical construction of
| >composite adapters assuming that a 'transitive' property holds. I've
| >seen nothing in this thread to explain why this is so valuable, why 
| 
| Let me play devil's advocate: I _have_ seen explanations of why 
| transitive adaptation can be convenient -- the most direct one being an 
| example by Guido which came in two parts, the second one a 
| clarification which came in response to my request about the first one. 

hypothetical pseudocode ;)

| To summarize it: say we have N concrete classes A1, A2, ... AN which 
| all implement interface I.
| Now we want to integrate into the system function f1, which requires an 
| argument with interface J1, i.e.
|     def f1(x):
|         x = adapt(x, J1)
|         ...
| or in Guido's new notation equivalently
|     def f1(x: J1):
|         ...
| and also f2, ..., fM, requiring an argument with interface J2, ..., JM 
| respectively.
| 
| Without transitivity, we need to code and register M*N adapters. 

Are you _sure_ you have M*N adapters here?  But even so,

  for j in (J1,J2,J3,J4,...,JM)
    for i in (I1,I2,...,IN):
      register(j,i)

| WITH transitivity, we only need M: I->J1, I->J2, ..., I->JM.

Without transitivity, a given programmer, in a given module will
probably only use a few of these permutations; and in each case,
you can argue that the developer should be aware of the 'automatic'
conversions that are going on.  Imagine an application developer
plugging a component into a framework and getting this error:

    """Adaption Error
   
    Could not convert A1 to a J1.   There are two adaptation pathways
    which you could register to do this conversion for you:
   
    # A1->I1 followed by I1->J1
    adapt.registerPath((A1,I1),(I1,J1))
    
    # A1->X3 followed by X3 -> PQ followed by PQ -> J1
    adapt.registerPath((A1,X3),(X3,PQ),(PQ,J1))
    """

The other issue with registries (and why I avoided them in the original
PEP) is that they often require a scoping; in this case, the path taken
by one module might be different from the one needed by another.

| The convenience of this is undeniable; and (all other things being 
| equal) convenience raises productivity and thus is valuable.

It also hides assumptions.  If you are doing adaptation paths

| James Knight gave a real-life example, although, since no combinatorial 
| explosion was involved, the extra convenience that he missed in 
| transitivity was minor compared to the potential for it when the N*M 
| issue should arise.

Right.  And that's more like it.

| >it shouldn't be explicit,
| 
| On this point I'm partly with you: I do not see any real loss of 
| convenience in requiring that an adapter which is so perfect and 
| lossless as to be usable in transitivity chains be explicitly so 
| registered/defined/marker.  E.g., provide a
|     registerAdapter_TRANSITIVITY_SUITABLE(X, Y)
| entry in addition to the standard registerAdapter which does not supply 
| transitivity (or equivalently an optional suitable_for_transitivity 
| argument to registerAdapter defaulting to False, etc, etc).

Ok.  I just think you all are solving a problem that doesn't exist,
and in the process hurting the more common use case:

   A component developer X and a framework developer Y both 
   have stuff that an application developer A is putting
   together.  The goal is for A to not worry about _how_ the
   components and the framework fit; to automatically "find"
   the glue code.

The assertion that you can layer glue... is, well, tenuous at best.

| In terms of "should" as opposed to convenience, though, the argument is 
| that interface to interface adapters SHOULD always, inherently be 
| suitable for transitive chains because there is NO reason, EVER, under 
| ANY circumstances, to have such adapters be less than perfect, 
| lossless, noiseless, etc, etc. 

I strongly disagree; the most useful adapters are the ones that
discard unneeded information.  The big picture above, where you're
plugging components into the framework will in most cases be lossy
-- or the frameworks / components would be identical and you woudn't
want to hook them up. Frankly, I think the whole idea of "perfect
adapters" is just, well, arrogant.  

| > and on the contrary, most of the "problems
| >with adapt()" seem to stem from this aggressive extension of what
| >was proposed: Automatic construction of adapter chains is _not_ part
| 
| Fair enough, except that it's not just chains of explicitly registered 
| adapters: interface inheritance has just the same issues, indeed, in 
| PJE's experience, MORE so, because no code is interposed -- if by 
| inheriting an interface you're asserting 100% no-problem 
| substitutability, the resulting "transitivity" may thus well give 
| problems (PJE and I even came close to agreeing that MS COM's 
| QueryInterface idea that interface inheritance does NOT implicitly and 
| undeniably assert substitutability is very practical, nice, and 
| usable...).
| 
| >of the original PEP 246 and I hope it remains that way.   I've
| >outlined in several posts how this case could be made easy for a
| >application developer to do:
| >
| >  - transitive adapters should always be explicit
| 
| What about "registered explicitly as being suitable for transitivity", 
| would that suffice?

I suppose so.  But I think it is a bad idea for a few reasons:

  1. it seems to add complexity without a real-world justification,
     let's go without it; and add it in a later version if it turns
     out to be as valuable as people think
     
  2. different adapters have different intents, and while a given
     adapter may be perfect in one situation, it may royally
     screw up in another; users of systems often break interfaces
     to meet immediate needs.  In your strawman I can think of
     several such twists-and-turns that an "obviously perfect"
     adapter would fail to handle:
     
       - In the 'structure' variety (where the middle name is
         not necessarily placed in the middle), someone decides
         to store one's title... because, well, the slot is
         there and they need to store this information
         
       - In the 'ordered' variety, "John P. Smith", you might
         have "Murata Makoto".  If you thought Makoto was the
         last name... you'd be wrong.
        
In short, unless a human is giving the 'ok' to an adapter's
use, be it the application, framework, or component developer,
then I'd expect wacko bugs.

| >The 'registry' idea (which was not explored in the PEP) emerges from
| >the need, albeit limited, for the application developer who is
| >plugging a component into a framework, to have some say in the
| >process.  I think that any actions taken by the user, by registering
| >an adapter, should be explicit.
| 
| Jim Fulton is VERY keen to have registration of adapters happen "behind 
| the scenes" at startup, starting from some kind of declarative form (a 
| configuration textfile or the like), given the deployment needs of 
| Zope3 -- that shouldn't be a problem, it seems to me (if we have a way 
| to to explicit registrations, zope3 can have a startup component that 
| finds configuration files and does the registration calls based on that 
| 'declarative' form).

That's fine.  

| This potentially opens the door to N-players scenarios for N>3, but, 
| like going from 3-tier to N-tier applications, that's not as huge a 
| jump as that from N==2 to N==3;-).

The problem with registries is that often times scope is needed;
just because my module wants to use this adaptation path, doesn't 
mean your module will make the same choice.  I avoided registries
in the first pass of the draft to avoid this issue.  So, if we
are going to add registries, then namespaces for the registries
need to also be discussed.

| So, are you willing to do that round of editing to PEP 246...?  I'll 
| then do the NEXT one, which will still undoubtedly be needed...

I could take a whack at it this weekend.

Best,

clark

From Scott.Daniels at Acm.Org  Thu Jan 13 19:51:17 2005
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Thu Jan 13 19:49:56 2005
Subject: [Python-Dev] Re: Recent IBM Patent releases
In-Reply-To: <cs4bg7$jd5$1@sea.gmane.org>
References: <cs3o2p$4t1$1@sea.gmane.org> <cs4bg7$jd5$1@sea.gmane.org>
Message-ID: <cs6fs9$l8j$1@sea.gmane.org>

Terry Reedy wrote:
> "Scott David Daniels" <Scott.Daniels@Acm.Org>
>>I believe our current policy is that the author warrants that the code
>>is his/her own work and not encumbered by any patent.
> 
> Without a qualifier such as 'To the best of my knowledge', the latter is an 
> impossible warrant both practically, for an individual author without 
> $1000s to spend on a patent search, and legally.  Legally, there is no 
> answer until the statute of limitations runs out or until there is an 
> after-the-fact final answer provided by the court system.

Absolutely.  I should have written that in the first place.  I was
trying to generate a little discussion about a particular case (the
released IBM patents) where we might want to say, "for these patents,
feel free to include code based on them."  My understanding is that
we will remove patented code if we get notice that it _is_ patented,
and that we strive to not put any patented code into the source.

--Scott David Daniels
Scott.Daniels@Acm.Org

From aleax at aleax.it  Thu Jan 13 20:40:56 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan 13 20:41:01 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <20050113182142.GC35655@prometheusresearch.com>
References: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
Message-ID: <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 13, at 19:21, Clark C. Evans wrote:
    ...
> Are you _sure_ you have M*N adapters here?  But even so,

Yep.

>   for j in (J1,J2,J3,J4,...,JM)
>     for i in (I1,I2,...,IN):
>       register(j,i)

Uh?  WHAT are you registering for each j->i...?

> The other issue with registries (and why I avoided them in the 
> original
> PEP) is that they often require a scoping; in this case, the path taken
> by one module might be different from the one needed by another.

I think that should not be supported, just like different modules 
cannot register different ways to copy.copy(X) for the same X.  One 
size had better fit all, be it a single specific adapter or potentially 
a path thereof.

> | The convenience of this is undeniable; and (all other things being
> | equal) convenience raises productivity and thus is valuable.
>
> It also hides assumptions.  If you are doing adaptation paths


Not sure if it hides them very deeply, but yes, there may be some 
aspects of "information hiding" -- which is not necessarily a bad 
thing.

> Ok.  I just think you all are solving a problem that doesn't exist,

Apparently, the existence of the problem is attested by the experience 
of the Eclipse developers (who are, specifically, adapting plugins: 
Eclipse being among the chief examples of plugin-based architecture... 
definitely an N-players scenario).


> and in the process hurting a the more common use case:
>
>    A component developer X and a framework developer Y both
>    have stuff that an application developer A is putting
>    together.  The goal is for A to not worry about _how_ the
>    components and the framework fit; to automatically "find"
>    the glue code.
>
> The assertion that you can layer glue... is, well, tenuous at best.

If you ever did any glueing (of the traditional kind, e.g. in 
historical-furniture restoration, as opposed to relatively new miracle 
glues) you know you typically DO layer glue -- one layer upon one of 
the pieces of wood you're glueing; one layer on the other; let those 
two dry a bit; then, you glue the two together with a third layer in 
the middle.  Of course it takes skill (which is why although I know the 
theory when I have any old valuable piece of furniture needing such 
restoration I have recourse to professionals;-) to avoid the 
easy-to-make mistake of getting the glue too thick (or uneven, etc).

I'm quite ready to consider the risk of having too-thick combined 
layers of glue resulting from adaptation (particularly but not 
exclusively with transitivity): indeed PJE's new ideas may be seen as a 
novel way to restart-from-scratch and minimize glue thickness in the 
overall resulting concoction.  But the optional ability for 
particularly skilled glue-layers to have that extra layer which makes 
everything better should perhaps not be discounted.  Although, 
considering PJE's new just-started effort, it may well be wisest for 
PEP 246 to stick to a minimalist attitude -- leave open the possibility 
of future additions or alterations but only specify that minimal core 
of functionality which we all _know_ is needed.


> | In terms of "should" as opposed to convenience, though, the argument 
> is
> | that interface to interface adapters SHOULD always, inherently be
> | suitable for transitive chains because there is NO reason, EVER, 
> under
> | ANY circumstances, to have such adapters be less than perfect,
> | lossless, noiseless, etc, etc.
>
> I strongly disagree; the most useful adapters are the ones that
> discard unneeded information.

The Facade design pattern?  It's useful, but I disagree that it's "most 
useful" when compared to general Adapter.  My favourite example is 
wrapping a str into a StringIO to make a filelike readable object -- 
that doesn't discard anything, it *adds* a "current reading point" 
state variable (which makes me dubious of the new "no per-state 
adapter" craze, which WOULD be just fine if it was true that discarding 
unneeded info -- facading -- is really the main use case).

>   The big picture above, where you're
> plugging components into the framework will in most cases be lossy
> -- or the frameworks / components would be identical and you wouldn't
> want to hook them up. Frankly, I think the whole idea of "perfect
> adapters" is just, well, arrogant.

So please explain what's imperfect in wrapping a str into a StringIO?


> | What about "registered explicitly as being suitable for 
> transitivity",
> | would that suffice?
>
> I suppose so.  But I think it is a bad idea for a few reasons:
>
>   1. it seems to add complexity without a real-world justification,
>      let's go without it; and add it in a later version if it turns
>      out to be as valuable as people think

Particularly in the light of PJE's newest ideas, being spare and 
minimal in PEP 246 does sound good, as long as we're not shutting and 
bolting doors against future improvements.


>   2. different adapters have different intents, and I think a given
>      adapter may be perfect in one situation, it may royally
>      screw up in another; users of systems often break interfaces
>      to meet immediate needs.  In your strawman I can think of
>      several such twists-and-turns that an "obviously perfect"
>      adapter would fail to handle:
>
>        - In the 'structure' variety (where the middle name is
>          not necessarily placed in the middle), someone decides
>          to store one's title... because, well, the slot is
>          there and they need to store this information
>
>        - In the 'ordered' variety, "John P. Smith", you might
>          have "Murata Makoto".  If you thought Makoto was the
>          last name... you'd be wrong.

If you've ever looked into "quality of data" issues in huge databases, 
you know that these are two (out of thousands) typical problems -- but 
not problems in _adaptation_, in fact.

> In short, unless a human is giving the 'ok' to an adapter's
> use, be it the application, framework, or component developer,
> then I'd expect wacko bugs.

A lot of the data quality problems in huge databases come exactly from 
humans -- data entry issues, form design issues, ... all the way to 
schema-design issues.  I don't see why, discussing a data-quality 
problem, you'd think that having a human give the OK would help compared 
with having a formalized rule (e.g. a database constraint) do it.


> | This potentially opens the door to N-players scenarios for N>3, but,
> | like going from 3-tier to N-tier applications, that's not as huge a
> | jump as that from N==2 to N==3;-).
>
> The problem with registries is that often times scope is needed;
> just because my module wants to use this adaptation path, doesn't
> mean your module will make the same choice.  I avoided registries
> in the first pass of the draft to avoid this issue.  So, if we
> are going to add registries, then namespaces for the registries
> need to also be discussed.

If _one_ registry of how to copy/serialize things is good enough for 
copy.py / copy_reg.py / pickle.py / ..., in the light of minimalism we 
should specify only one for PEP 246 too.


> | So, are you willing to do that round of editing to PEP 246...?  I'll
> | then do the NEXT one, which will still undoubtedly be needed...
>
> I could take a whack at it this weekend.

Great!  I assume you have copies of all relevant mails since they all 
went around this mailing list, but if you need anything just holler, 
including asking me privately about anything that might be unclear or 
ambiguous or whatever -- I'll be around all weekend except Sunday night 
(italian time -- afternoon US time;-).


Thanks,

Alex

From shane.holloway at ieee.org  Thu Jan 13 21:42:58 2005
From: shane.holloway at ieee.org (Shane Holloway (IEEE))
Date: Thu Jan 13 21:43:27 2005
Subject: [Python-Dev] frame.f_locals is writable
Message-ID: <41E6DD52.2080109@ieee.org>

For a little background, I'm working on making edit-and-continue 
support in Python a little more robust.  So, to replace references to 
unmodifiable types like tuples and bound methods (instance or class), I 
iterate over gc.get_referrers().

So, I'm working on frame types, and wrote this code::

     def replaceFrame(self, ref, oldValue, newValue):
         # Swap out every local in the frame that is the old object.
         for name, value in ref.f_locals.items():
             if value is oldValue:
                 ref.f_locals[name] = newValue
                 # f_locals is rebuilt from the frame's fast locals on
                 # every access, so the write is lost and this fires:
                 assert ref.f_locals[name] is newValue


But unfortunately, the assert fires.  f_locals accepts the write, but 
the change never reaches the frame's actual locals.  I did a bit of 
searching on Google Groups, and found references to a desire for 
Smalltalk-like "swap" functionality using a similar approach, but no 
further ideas or solutions.
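(The snapshot behavior is easy to demonstrate without touching frame objects at all, since locals() in a function goes through the same fast-locals machinery in CPython:)

```python
def demo():
    x = 1
    d = locals()   # a snapshot dict built from the frame's fast locals
    d['x'] = 99    # mutates only the snapshot...
    return x       # ...the real local variable is untouched

print(demo())  # 1, not 99
```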

While I fully expect the smack of "don't do that", this 
functionality would be very useful for debugging long-running 
applications.  Is this possible to implement in CPython and ports?  Is 
there an optimization reason to not do this?

At worst, if this isn't possible, direct references in the stack will be 
wrong above the reload call, and corrected on the invocation of the 
function.  This is a subtle issue with reloading code, and can be 
documented.  And at best, if there is an effective way to replace it, 
the system can be changed to a consistent state even in the stack, and 
I can rejoice.  Even if I have to wait until 2.5.  ;)

Thanks for your time!
-Shane Holloway
From cce at clarkevans.com  Thu Jan 13 22:08:01 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Thu Jan 13 22:08:05 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <20050113210801.GA49652@prometheusresearch.com>

On Thu, Jan 13, 2005 at 08:40:56PM +0100, Alex Martelli wrote:
| >The other issue with registries (and why I avoided them in the 
| >original) is that they often require a scoping; in this case, 
| >the path taken by one module might be different from the one 
| >needed by another.
| 
| I think that should not be supported, just like different modules 
| cannot register different ways to copy.copy(X) for the same X.  One 
| size had better fit all, be it a single specific adapter or potentially 
| a path thereof.

Sounds good.

| >Ok.  I just think you all are solving a problem that doesn't exist,
| 
| Apparently, the existence of the problem is attested by the experience 
| of the Eclipse developers (who are, specifically, adapting plugins: 
| Eclipse being among the chief examples of plugin-based architecture... 
| definitely an N-players scenario).

Some specific examples from Eclipse developers would help then,
especially ones that argue strongly for automagical transitive
adaptation.  That is, ones where an alternative approach that
is not automatic is clearly inferior.

| >   A component developer X and a framework developer Y both
| >   have stuff that an application developer A is putting
| >   together.  The goal is for A to not worry about _how_ the
| >   components and the framework fit; to automatically "find"
| >   the glue code.
... 
| I'm quite ready to consider the risk of having too-thick combined 
| layers of glue resulting from adaptation (particularly but not 
| exclusively with transitivity): indeed PJE's new ideas may be seen as a 
| novel way to restart-from-scratch and minimize glue thickness in the 
| overall resulting concoction.

I do like PJE's idea, since it seems to focus on declaring individual 
functions rather than on sets of functions; but I'm still unclear
what problem it is trying to solve.

| But the optional ability for 
| particularly skilled glue-layers to have that extra layer which makes 
| everything better should perhaps not be discounted.  Although, 
| considering PJE's new just-started effort, it may well be wisest for 
| PEP 246 to stick to a minimalist attitude -- leave open the possibility 
| of future additions or alterations but only specify that minimal core 
| of functionality which we all _know_ is needed.

I'd rather not be pushing for a powerful registry mechanism unless
we have solid evidence that the value it provides outweighs the costs
that it incurs.

| >I strongly disagree; the most useful adapters are the ones that
| >discard unneeded information.
| 
| The Facade design pattern?  It's useful, but I disagree that it's "most 
| useful" when compared to general Adapter

My qualification was not very well placed.  That said, I don't see any 
reason why a facade can't also be asked for via the adapt() mechanism.

| So please explain what's imperfect in wrapping a str into a StringIO?

It adds information, and it implies a mutability which the underlying
object does not have.  In short, it's quite a different animal from a String,
which is why String->StringIO is a great example for an adapter.

| >| What about "registered explicitly as being suitable for transitivity",
| >| would that suffice?
| >
| >I suppose so.  But I think it is a bad idea for a few reasons:
| >
| >  1. it seems to add complexity without a real-world justification,
| >     let's go without it; and add it in a later version if it turns
| >     out to be as valuable as people think
| 
| Particularly in the light of PJE's newest ideas, being spare and 
| minimal in PEP 246 does sound good, as long as we're not shutting and 
| bolting doors against future improvements.

Agreed!

| >  2. different adapters have different intents...
|
| If you've ever looked into "quality of data" issues in huge databases, 
| you know that these are two (out of thousands) typical problems -- but 
| not problems in _adaptation_, in fact.

I deal with these issues all of the time; but what I'm trying to
express with the example is that someone may _think_ that they are
writing a perfect adapter; but they may be wrong, on a number of
levels.  It's not so much to say what is good, but rather to 
challenge the notion of a 'perfect adapter'. 

| >In short, unless a human is giving the 'ok' to an adapter's
| >use, be it the application, framework, or component developer,
| >then I'd expect wacko bugs.
| 
| A lot of the data quality problems in huge databases come exactly from 
| humans -- data entry issues, form design issues, ... all the way to 
| schema-design issues.  I don't see why, discussing a data-quality 
| problem, you'd think that having a human OK whatsoever would help wrt 
| having a formalized rule (e.g. a database constraint) do it.

The point I was trying to make is that automatically constructing
adapters isn't a great idea unless you have someone who can vouch
for their usefulness.  In other words, I picture this as a physics
story problem, where a bunch of numbers are given with units. While
the units may keep you "in check", just randomly combining figures
with the right units can give you the wrong answer.

| >| So, are you willing to do that round of editing to PEP 246...?  I'll
| >| then to the NEXT one which will still undoubtedly be needed...
| >
| >I could take a whack at it this weekend.
| 
| Great!  I assume you have copies of all relevant mails since they all 
| went around this mailing list, but if you need anything just holler, 
| including asking me privately about anything that might be unclear or 
| ambiguous or whatever -- I'll be around all weekend except Sunday night 
| (italian time -- afternoon US time;-).

Ok.

Best,

Clark
-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From psoberoi at gmail.com  Thu Jan 13 22:43:53 2005
From: psoberoi at gmail.com (Paramjit Oberoi)
Date: Thu Jan 13 22:43:56 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <e443ad0e0501131343347d5cf5@mail.gmail.com>

On Thu, 13 Jan 2005 20:40:56 +0100, Alex Martelli <aleax@aleax.it> wrote:
>
> So please explain what's imperfect in wrapping a str into a StringIO?

If I understand Philip's argument correctly, the problem is this:

def print_next_line(f: file):
    print f.readline()

s = "line 1\n" "line 2"

print_next_line(s)
print_next_line(s)

This will print "line 1" twice.
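A runnable re-rendering of the pitfall (modern Python; the implicit adaptation is simulated by an explicit isinstance check, and the helper name is made up):

```python
from io import StringIO

def next_line(f):
    # Simulate the implicit adaptation: a plain str gets a *fresh*
    # StringIO wrapper on every call, so the read position never advances.
    if isinstance(f, str):
        f = StringIO(f)
    return f.readline()

s = "line 1\nline 2"
print(next_line(s))  # line 1
print(next_line(s))  # line 1 again, not line 2
```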

-param
From bac at OCF.Berkeley.EDU  Thu Jan 13 22:50:24 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Jan 13 22:50:31 2005
Subject: [Python-Dev] frame.f_locals is writable
In-Reply-To: <41E6DD52.2080109@ieee.org>
References: <41E6DD52.2080109@ieee.org>
Message-ID: <41E6ED20.50103@ocf.berkeley.edu>

Shane Holloway (IEEE) wrote:
> For a little background, I'm working on making an edit and continue 
> support in python a little more robust.  So, in replacing references to 
> unmodifiable types like tuples and bound-methods (instance or class), I 
> iterate over gc.get_referrers.
> 
> So, I'm working on frame types, and wrote this code::
> 
>     def replaceFrame(self, ref, oldValue, newValue):
>         for name, value in ref.f_locals.items():
>             if value is oldValue:
>                 ref.f_locals[name] = newValue
>                 assert ref.f_locals[name] is newValue
> 
> 
> But unfortunately, the assert fires.  f_locals is writable, but not 
> modifiable.  I did a bit of searching on Google Groups, and found 
> references to a desire for smalltalk like "swap" functionality using a 
> similar approach, but no further ideas or solutions.
> 
> While I am full well expecting the smack of "don't do that", this 
> functionality would be very useful for debugging long-running 
> applications.  Is this possible to implement in CPython and ports?  Is 
> there an optimization reason to not do this?
> 

So it would be doable, but it is not brain-dead simple if you want to keep the 
interface of a dict.  Locals, in the frame, are an array of PyObjects (see 
PyFrameObject->f_localsplus).  When you request f_locals that returns a dict 
that was created by a function that takes the array, traverses it, and creates 
a dict with the proper names (using PyFrameObject->f_code->co_varnames for the 
array offset -> name mapping).  The resulting dict gets stored in 
PyFrameObject->f_locals.  So it is writable as you discovered since it is just 
a dict, but it is not used in Python/ceval.c except for IMPORT_STAR; changes 
are just never even considered.  The details for all of this can be found in 
Objects/frameobject.c:PyFrame_FastToLocals() .

The interesting thing is that there is a corresponding PyFrame_LocalsToFast() 
function that seems to do what you want; it takes the dict in 
PyFrameObject->f_locals and propagates the changes into 
PyFrameObject->f_localsplus (or at least seems to; don't have time to stare at 
the code long enough to make sure it does that exactly).  So the functionality 
is there (and is in the API even).  It just isn't called explicitly except in 
two points in Python/ceval.c where you can't get at it.  =)

Making changes to f_locals actually matter would require one of two 
things.  The first is coming up with a proxy object that is stored in 
f_locals instead of a dict and dynamically grabs everything from 
f_localsplus as needed.  That would suck for performance and be a pain 
for keeping the dict API.  So you can count that out.

The other option would be to add a function that either directly 
modifies single values in f_localsplus, takes a dict and propagates 
the values, or just calls PyFrame_LocalsToFast() .
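To make the snapshot behavior concrete, here is a small hypothetical demo.  (Note for later readers: PEP 667, in CPython 3.13, changed this so that f_locals is a write-through proxy and the write does stick.)

```python
import sys

def demo():
    x = 1
    frame = sys._getframe()
    # On CPython up to 3.12 this mutates only the snapshot dict;
    # the fast-locals slot backing x is untouched, so demo() returns 1.
    # (Since CPython 3.13 / PEP 667, f_locals is a write-through proxy
    # and the assignment does reach x, returning 99.)
    frame.f_locals['x'] = 99
    return x

print(demo())
```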

Personally I am against this, but that is because you would single-handedly 
ruin my master's thesis and invalidate any possible type inferencing one can do 
in Python without some semantic change.  But then again my thesis shows that the 
amount of type inferencing possible is not worth the code complexity, so it isn't 
totally devastating.  =)

And you are right, "don't do that".  =)

Back to the putrid, boggy marsh of JavaLand for me...

-Brett
From aleax at aleax.it  Thu Jan 13 22:59:53 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan 13 22:59:59 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <e443ad0e0501131343347d5cf5@mail.gmail.com>
References: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
Message-ID: <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 13, at 22:43, Paramjit Oberoi wrote:

> On Thu, 13 Jan 2005 20:40:56 +0100, Alex Martelli <aleax@aleax.it> 
> wrote:
>>
>> So please explain what's imperfect in wrapping a str into a StringIO?
>
> If I understand Philip's argument correctly, the problem is this:
>
> def print_next_line(f: file):
>     print f.readline()
>
> s = "line 1\n" "line 2"
>
> print_next_line(s)
> print_next_line(s)
>
> This will print "line 1" twice.

Ah!  A very clear example, thanks.  Essentially equivalent to saying 
that adapting a list to an iterator ``rewinds'' each time the 
``adaptation'' is performed, if one mistakenly thinks of iter(L) as 
providing an _adapter_:

def print_next_item(it: iterator):
     print it.next()

L = ['item 1', 'item 2']

print_next_item(L)
print_next_item(L)


Funny that the problem was obvious to me for the list->iterator issue 
and yet I was so oblivious to it for the str->readablefile one.  OK, 
this does show that (at least some) classical cases of Adapter Design 
Pattern are unsuitable for implicit adaptation (in a language with 
mutation -- much like, say, a square IS-A rectangle if a language does 
not allow mutation, but isn't if the language DOES allow it).


Thanks!

Alex

From p.f.moore at gmail.com  Thu Jan 13 23:10:49 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu Jan 13 23:10:52 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <e443ad0e0501131343347d5cf5@mail.gmail.com>
References: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
Message-ID: <79990c6b05011314102399f2a3@mail.gmail.com>

On Thu, 13 Jan 2005 13:43:53 -0800, Paramjit Oberoi <psoberoi@gmail.com> wrote:
> On Thu, 13 Jan 2005 20:40:56 +0100, Alex Martelli <aleax@aleax.it> wrote:
> >
> > So please explain what's imperfect in wrapping a str into a StringIO?
> 
> If I understand Philip's argument correctly, the problem is this:
> 
> def print_next_line(f: file):
>     print f.readline()
> 
> s = "line 1\n" "line 2"
> 
> print_next_line(s)
> print_next_line(s)
> 
> This will print "line 1" twice.

Nice example!

The real subtlety here is that

f = adapt(s, StringIO)
print_next_line(f)
print_next_line(f)

*does* work - the implication is that for the original example to
work, adapt(s, StringIO) needs to not only return *a* wrapper, but to
return *the same wrapper* every time. Which may well break *other*
uses, which expect a "new" wrapper each time.

But the other thing that this tends to make me believe even more
strongly is that using Guido's type notation for adaptation is a bad
thing.

def print_next_line(f):
    ff = adapt(f, file)
    print ff.readline()

Here, the explicit adaptation step in the definition of the function
feels to me a little more obviously a "wrapping" operation which may
reinitialise the adapter - and would raise warning bells in my mind if
I thought of it in terms of a string->StringIO adapter.

Add this to the inability to recover the original object (for
readaptation, or passing on as an argument to another function), and
I'm very concerned about Guido's type notation being used as an
abbreviation for adaptation...

Paul.
From gvanrossum at gmail.com  Fri Jan 14 00:11:43 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Jan 14 00:11:46 2005
Subject: getting special from type,
	not instance (was Re: [Python-Dev] copy confusion)
In-Reply-To: <BF4DF506-658C-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050111201174b86218@mail.gmail.com>
	<83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc20501120959737d1935@mail.gmail.com>
	<20050113101633.GA5193@vicky.ecs.soton.ac.uk>
	<ca471dc205011309023203847f@mail.gmail.com>
	<BF4DF506-658C-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <ca471dc205011315114553e369@mail.gmail.com>

> > Let's do override descriptors.
> 
> A Pronouncement!!!
> 
> > And please, someone fix copy.py in 2.3 and 2.4.
> 
> Sure -- what way, though?  The way I proposed in my last post about it?

This would do it, right? (From your first post in this conversation
according to gmail:)

> Armin's fix was to change:
>
>     conform = getattr(type(obj), '__conform__', None)
>
> into:
>
>     for basecls in type(obj).__mro__:
>         if '__conform__' in basecls.__dict__:
>             conform = basecls.__dict__['__conform__']
>             break
>     else:
>         # not found

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From aleax at aleax.it  Fri Jan 14 00:26:02 2005
From: aleax at aleax.it (Alex Martelli)
Date: Fri Jan 14 00:26:07 2005
Subject: getting special from type,
	not instance (was Re: [Python-Dev] copy confusion)
In-Reply-To: <ca471dc205011315114553e369@mail.gmail.com>
References: <cs1jfs$p3d$1@sea.gmane.org>
	<5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
	<ca471dc205011114583df08bbf@mail.gmail.com>
	<D13DEDB2-6425-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc2050111201174b86218@mail.gmail.com>
	<83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc20501120959737d1935@mail.gmail.com>
	<20050113101633.GA5193@vicky.ecs.soton.ac.uk>
	<ca471dc205011309023203847f@mail.gmail.com>
	<BF4DF506-658C-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011315114553e369@mail.gmail.com>
Message-ID: <7D1C69B7-65BA-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 14, at 00:11, Guido van Rossum wrote:

>>> Let's do override descriptors.
>>
>> A Pronouncement!!!
>>
>>> And please, someone fix copy.py in 2.3 and 2.4.
>>
>> Sure -- what way, though?  The way I proposed in my last post about 
>> it?
>
> This would do it, right? (From your first post in this conversation
> according to gmail:)
>
>> Armin's fix was to change:
>>
>>     conform = getattr(type(obj), '__conform__', None)
>>
>> into:
>>
>>     for basecls in type(obj).__mro__:
>>         if '__conform__' in basecls.__dict__:
>>             conform = basecls.__dict__['__conform__']
>>             break
>>     else:
>>         # not found

Yes, the code could be expanded inline each time it's needed (for 
__copy__, __getstate__, and all other special methods copy.py needs to 
get-from-the-type).  It does seem better to write it once as a private 
function of copy.py, though.

Plus, to fix the effbot's bug, we need to have in function copy() a 
test about object type that currently is in deepcopy() [[for the 
commented purpose of fixing a problem with Boost's old version -- but 
it works to make deepcopy work in the effbot's case too]] but not in 
copy().  Lastly, the tests should also be enriched to make sure they 
catch the bug (no doc change needed, it seems to me).

I can do it this weekend if the general approach is OK, since Clark has 
kindly agreed to do the next rewrite of PEP 246;-).


Alex

From cce at clarkevans.com  Fri Jan 14 02:03:07 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Fri Jan 14 02:03:10 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <20050114010307.GA51446@prometheusresearch.com>

Ok.  I think we have identified two sorts of restrictions on the
sorts of adaptations one may want to have:

  `stateless'  the adaptation may only provide a result which
               does not maintain its own state
               
  `lossless'   the adaptation preserves all information available
               in the original object, it may not discard state

If we determined that these were the 'big-ones', we could possibly
allow for the signature of the adapt request to be parameterized with 
these two designations, with the default to accept any sort of adapter:

   adapt(object, protocol, alternative = None, 
         stateless = False, lossless = False)

   __conform__(self, protocol, stateless, lossless)

   __adapt__(self, object, stateless, lossless)

Then, Guido's 'Optional Static Typing',

     def f(X: Y):
         pass

   would be equivalent to

      def f(X):
          X = adapt(X, Y, stateless=True, lossless=True)

In other words, while calling adapt directly would allow for any adapter; 
using the 'Static Typing' short-cut one would be asking for adapters
which are both stateless and lossless.  Since __conform__ and __adapt__
would sprout two new arguments, it would make those writing adapters 
think a bit more about the kind of adapter that they are providing.
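One plausible way the parameterized signature could dispatch (__conform__, then __adapt__, then a plain isinstance check); the body here is an assumption for illustration, not part of the proposal:

```python
def adapt(obj, protocol, alternative=None, stateless=False, lossless=False):
    # 1. ask the object's type whether it can conform to the protocol
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        result = conform(obj, protocol, stateless, lossless)
        if result is not None:
            return result
    # 2. ask the protocol whether it can adapt the object
    adapt_hook = getattr(protocol, '__adapt__', None)
    if adapt_hook is not None:
        result = adapt_hook(protocol, obj, stateless, lossless)
        if result is not None:
            return result
    # 3. trivial case: the object already satisfies the protocol
    if isinstance(protocol, type) and isinstance(obj, protocol):
        return obj
    return alternative
```

With no hooks defined, adapt('abc', str) simply returns the string, and a non-conforming object falls through to the alternative.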

Furthermore, perhaps composite adapters can be automatically generated
from 'transitive' adapters (that is, those which are both stateless
and lossless).  But adaptations which were not stateless and lossless
would not be used (by default) in an automatic adapter construction.

Your thoughts?

Clark

-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From bob at redivi.com  Fri Jan 14 02:08:43 2005
From: bob at redivi.com (Bob Ippolito)
Date: Fri Jan 14 02:08:47 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114010307.GA51446@prometheusresearch.com>
References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
Message-ID: <D581AC5C-65C8-11D9-8F98-000A95BA5446@redivi.com>


On Jan 13, 2005, at 20:03, Clark C. Evans wrote:

> Ok.  I think we have identified two sorts of restrictions on the
> sorts of adaptations one may want to have:
>
>   `stateless'  the adaptation may only provide a result which
>                does not maintain its own state
>
>   `lossless'   the adaptation preserves all information available
>                in the original object, it may not discard state
>
> If we determined that these were the 'big-ones', we could possibly
> allow for the signature of the adapt request to be parameterized with
> these two designations, with the default to accept any sort of adapter:
>
>    adapt(object, protocol, alternative = None,
>          stateless = False, lossless = False)
>
>    __conform__(self, protocol, stateless, lossless)
>
>    __adapt__(self, object, stateless, lossless)
>
> Then, Guido's 'Optional Static Typing',
>
>      def f(X: Y):
>          pass
>
>    would be equivalent to
>
>       def f(X):
>           X = adapt(X, Y, stateless=True, lossless=True)
>
> In other words, while calling adapt directly would allow for any 
> adapter;
> using the 'Static Typing' short-cut one would be asking for adapters
> which are both stateless and lossless.  Since __conform__ and __adapt__
> would sprout two new arguments, it would make those writing adapters
> think a bit more about the kind of adapter that they are providing.
>
> Furthermore, perhaps composite adapters can be automatically generated
> from 'transitive' adapters (that is, those which are both stateless
> and lossless).  But adaptations which were not stateless and lossless
> would not be used (by default) in an automatic adapter construction.
>
> Your thoughts?

In some cases, such as when you plan to consume the whole thing in one 
function call, you wouldn't care so much if it's stateless.

-bob

From gvanrossum at gmail.com  Fri Jan 14 02:52:10 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Jan 14 02:52:14 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114010307.GA51446@prometheusresearch.com>
References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
Message-ID: <ca471dc2050113175217585406@mail.gmail.com>

> Then, Guido's 'Optional Static Typing',
> 
>      def f(X: Y):
>          pass
> 
>    would be equivalent to
> 
>       def f(X):
>           X = adapt(X, Y, stateless=True, lossless=True)
> 
> In other words, while calling adapt directly would allow for any adapter;
> using the 'Static Typing' short-cut one would be asking for adapters
> which are both stateless and lossless.  Since __conform__ and __adapt__
> would sprout two new arguments, it would make those writing adapters
> think a bit more about the kind of adapter that they are providing.

This may solve the current raging argument, but IMO it would make the
optional signature declaration less useful, because there's no way to
accept other kinds of adapters. I'd be happier if def f(X: Y) implied X
= adapt(X, Y).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From python at rcn.com  Fri Jan 14 02:54:33 2005
From: python at rcn.com (Raymond Hettinger)
Date: Fri Jan 14 02:58:29 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114010307.GA51446@prometheusresearch.com>
Message-ID: <000d01c4f9dc$138633a0$e841fea9@oemcomputer>

> Ok.  I think we have identified two sorts of restrictions on the
> sorts of adaptations one may want to have:
> 
>   `stateless'  the adaptation may only provide a result which
>                does not maintain its own state
> 
>   `lossless'   the adaptation preserves all information available
>                in the original object, it may not discard state

+1 on having a provision for adapters to provide some meta-information
about themselves.  With these two key properties identified at the
outset, adapt calls can be made a bit more intelligent (or at least less
prone to weirdness).

There is some merit to establishing these properties right away rather
than trying to retrofit adapters after they've been in the wild for a
while.

> Since __conform__ and __adapt__
> would sprout two new arguments, it would make those writing adapters
> think a bit more about the kind of adapter that they are providing.

Using optional arguments may not be the most elegant or extensible
approach.  Perhaps a registry table or adapter attributes would fare
better.



Raymond Hettinger

From cce at clarkevans.com  Fri Jan 14 03:02:54 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Fri Jan 14 03:02:56 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <D581AC5C-65C8-11D9-8F98-000A95BA5446@redivi.com>
References: <41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<D581AC5C-65C8-11D9-8F98-000A95BA5446@redivi.com>
Message-ID: <20050114020254.GA87169@prometheusresearch.com>

On Thu, Jan 13, 2005 at 08:08:43PM -0500, Bob Ippolito wrote:
| >Ok.  I think we have identified two sorts of restrictions on the
| >sorts of adaptations one may want to have:
| >
| >  `stateless'  the adaptation may only provide a result which
| >               does not maintain its own state
| >
| >  `lossless'   the adaptation preserves all information available
| >               in the original object, it may not discard state
| >
| >If we determined that these were the 'big-ones', we could possibly
| >allow for the signature of the adapt request to be parameterized with
| >these two designations, with the default to accept any sort of adapter:
| >
| >   adapt(object, protocol, alternative = None,
| >         stateless = False, lossless = False)
| >
| >Then, Guido's 'Optional Static Typing',
| >
| >     def f(X: Y):
| >         pass
| >
| >   would be equivalent to
| >
| >      def f(X):
| >          X = adapt(X,Y, stateless = True, lossless = True)
..
| 
| In some cases, such as when you plan to consume the whole thing in one 
| function call, you wouldn't care so much if it's stateless.

etrepum,
               True                             False

  stateless    adapter may not add              adapter may have its
               state beyond that already        own state, if it wishes
               provided by the object           but additional state is
                                                not required


  lossless     adapter must preserve and        adapter may discard 
               give all information which       information if it wishes
               the underlying object has


So, in this case, if your consumer doesn't care if the adapter is
stateless or not, just call adapt(), which defaults to the case
that you wish.

Is this a better explanation?  Or is this whole idea too convoluted?

Best,

Clark
From foom at fuhm.net  Fri Jan 14 03:41:13 2005
From: foom at fuhm.net (James Y Knight)
Date: Fri Jan 14 03:41:13 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <20050113174643.GB35655@prometheusresearch.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011210166c14e3f4@mail.gmail.com>
	<20050112195711.GA1813@prometheusresearch.com>
	<ca471dc20501121315227e3a89@mail.gmail.com>
	<20050113174643.GB35655@prometheusresearch.com>
Message-ID: <C15E6034-65D5-11D9-B092-000A95A50FB2@fuhm.net>

On Jan 13, 2005, at 12:46 PM, Clark C. Evans wrote:
> My current suggestion to make 'transitive adaption' easy for a
> application builder (one putting togeher components) has a few
> small parts:
>
>   - If an adaptation is not found, raise an error, but list in
>     the error message two additional things: (a) what possible
>     adaptation paths exist, and (b) how to register one of
>     these paths in their module.
>
>   - A simple method to register an adaption path, the error message
>     above can even give the exact line needed,
>
>        adapt.registerPath(from=A,to=C,via=B)

I'd just like to note that this won't solve my use case for transitive 
adaptation. To keep backwards compatibility, I can't depend on the 
application developer to register an adapter path from A through 
IResource to INewResource. Trying to adapt A to INewResource needs to 
just work. I can't register the path either, because I (the framework 
author) don't know anything about A.

A solution that would work is if I have to explicitly declare the 
adapter from IResource to INewResource as 'safe', as long as I don't 
also have to declare the adapter from A to IResource as 'safe'. (That 
is, I suppose -- in a transitive adapter chain, all except one adapter 
in the chain would have to be declared 'safe').

I don't know whether or not it's worthwhile to have this encoded in the 
framework, as it is clearly possible to do it on my own in any case. 
I'll leave that for others to debate. :)

James

From DavidA at ActiveState.com  Fri Jan 14 04:08:05 2005
From: DavidA at ActiveState.com (David Ascher)
Date: Fri Jan 14 04:08:11 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>	<79990c6b05011205445ea4af76@mail.gmail.com>	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>	<ca471dc205011207261a8432c@mail.gmail.com>	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>	<5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>
	<49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <41E73795.2070505@ActiveState.com>

Alex Martelli wrote:

> Yes, there is (lato sensu) "non-determinism" involved, just like in, say:
>     for k in d:
>         print k

Wow, it took more than the average amount of googling to figure out that 
lato sensu means "broadly speaking", and occurs as "sensu lato" with a 
1:2 ratio.

I learned something today! ;-)

--david
From skip at pobox.com  Thu Jan 13 22:56:19 2005
From: skip at pobox.com (Skip Montanaro)
Date: Fri Jan 14 04:52:31 2005
Subject: [Python-Dev] redux: fractional seconds in strptime
Message-ID: <16870.61059.451494.303971@montanaro.dyndns.org>

A couple months ago I proposed (maybe in a SF bug report) that
time.strptime() grow some way to parse time strings containing fractional
seconds based on my experience with the logging module.  I've hit that
stumbling block again, this time in parsing files with timestamps that were
generated using datetime.time objects.  I hacked around it again (in
miserable fashion), but I really think this shortcoming should be addressed.

A couple possibilities come to mind:

    1. Extend the %S format token to accept simple decimals that match
       the re pattern "[0-9]+(?:\.[0-9]+)?".

    2. Add a new token that accepts decimals as above to avoid overloading
       the meaning of %S.

    3. Add a token that matches integers corresponding to fractional parts.
       The Perl DateTime module uses %N to match nanoseconds (wanna bet that
       was added by a physicist?).  Arbitrary other units can be specified
       by sticking a number between the "%" and the "N".  I didn't see an
       example, but I presume "%6N" would match integers that are
       interpreted as microseconds.

The advantage of the third choice is that you can use anything as the
"decimal" point.  The logging module separates seconds from their fractional
part with a comma for some reason.  (I live in the USofA where decimal
points are usually represented by a period.  I would be in favor of
replacing the comma with a locale-specific decimal point in a future version
of the logging module.)  I'm not sure I like the optional exponent thing in
Perl's DateTime module but it does make it easy to interpret integers
representing fractions of a second when they occur without a decimal point
to tell you where it is.
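The kind of workaround being hacked around today looks roughly like the sketch below: split any fraction off before handing the rest to time.strptime(). The function name and separator handling are illustrative only, not part of any of the proposals above; the comma case is the logging module's separator.

```python
import time

def strptime_frac(stamp, fmt="%Y-%m-%d %H:%M:%S", seps=".,"):
    """Hypothetical workaround: time.strptime() has no directive for
    fractional seconds, so peel the fraction off first and parse the
    remainder normally.  Accepts '.' or ',' as the separator."""
    for sep in seps:
        if sep in stamp:
            stamp, frac = stamp.rsplit(sep, 1)
            return time.strptime(stamp, fmt), float("0." + frac)
    return time.strptime(stamp, fmt), 0.0

parsed, fraction = strptime_frac("2005-01-13 22:56:19.304")
# parsed is an ordinary time tuple (tm_sec == 19); fraction carries 0.304
```

The fractional part has to be returned separately precisely because, as noted below in the thread, time tuples have no slot for it.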

I'm open to suggestions and will be happy to implement whatever is agreed
to.


Skip
From bac at OCF.Berkeley.EDU  Fri Jan 14 05:16:16 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Jan 14 05:16:26 2005
Subject: [Python-Dev] redux: fractional seconds in strptime
In-Reply-To: <16870.61059.451494.303971@montanaro.dyndns.org>
References: <16870.61059.451494.303971@montanaro.dyndns.org>
Message-ID: <41E74790.60108@ocf.berkeley.edu>

Skip Montanaro wrote:
> A couple months ago I proposed (maybe in a SF bug report)

http://www.python.org/sf/1006786

  that
> time.strptime() grow some way to parse time strings containing fractional
> seconds based on my experience with the logging module.  I've hit that
> stumbling block again, this time in parsing files with timestamps that were
> generated using datetime.time objects.  I hacked around it again (in
> miserable fashion), but I really think this shortcoming should be addressed.
> 
> A couple possibilities come to mind:
> 
>     1. Extend the %S format token to accept simple decimals that match
>        the re pattern "[0-9]+(?:\.[0-9]+)?".
> 
>     2. Add a new token that accepts decimals as above to avoid overloading
>        the meaning of %S.
> 
>     3. Add a token that matches integers corresponding to fractional parts.
>        The Perl DateTime module uses %N to match nanoseconds (wanna bet that
>        was added by a physicist?).  Arbitrary other units can be specified
>        by sticking a number between the "%" and the "N".  I didn't see an
>        example, but I presume "%6N" would match integers that are
>        interpreted as microseconds.
> 

The problem I have always had with this proposal is that the value is 
worthless: time tuples do not have a slot for fractional seconds.  Yes, it 
could possibly be changed to return a float for seconds, but that could 
possibly break things.

My vote is that if something is added it be like %N but without the optional 
digit count.  This allows any separator to be used while still 
consuming the digits.  It also doesn't suddenly add optional args which are not 
supported for any other directive.

-Brett
From pje at telecommunity.com  Fri Jan 14 05:50:37 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Jan 14 05:49:01 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114010307.GA51446@prometheusresearch.com>
References: <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <5.1.1.6.0.20050113232719.033223a0@mail.telecommunity.com>

At 08:03 PM 1/13/05 -0500, Clark C. Evans wrote:
>Ok.  I think we have identified two sorts of restrictions on the
>sorts of adaptations one may want to have:
>
>   `stateless'  the adaptation may only provide a result which
>                does not maintain its own state
>
>   `lossless'   the adaptation preserves all information available
>                in the original object, it may not discard state

'lossless' isn't really a good term for non-noisy.  The key is that a 
"noisy" adapter is one that alters the precision of the information it 
provides, by either claiming greater precision than is actually present, or 
by losing precision that was present in the meaning of the data.  (I.e., 
truncating 12.3 to 12 loses precision, but dropping the middle name field 
of a name doesn't because the first and last name are independent from the 
middle name).

Anyway, being non-noisy is only a prerequisite for interface-to-interface 
adapters, because they're claiming to be suitable for all possible 
implementations of the source interface.

'statelessness', on the other hand, is primarily useful as a guide to 
whether what you're building is really an "as-a" adapter.  If an adapter 
has per-adapter state, it's an extremely good indication that it's actually 
a *decorator* (in GoF pattern terminology).

In GoF, an Adapter simply converts one interface to another, it doesn't 
implement new functionality.  A decorator, on the other hand, is used to 
"add responsibilities to individual objects dynamically and transparently, 
that is, without affecting other objects."

In fact, as far as I can tell from the GoF book, you can't *have* multiple 
adapter instances for a given object in their definition of the "adapter 
pattern".  IOW, there's no per-adapter state, and their examples never 
suggest the idea that the adapter pattern is intended to add any 
per-adaptee state, either.

So, by their terminology, PEP 246 is a mechanism for dynamically selecting 
and obtaining *decorators*, not adapters.  As if people weren't already 
confused enough about decorators.  :)

Anyway, for type declaration, IMO statelessness is the key criterion.  Type 
declaration "wants" to have true adapters (which can maintain object 
identity), not decorators (which are distinct objects from the things they 
add functionality to).


>In other words, while calling adapt directly would allow for any adapter;
>using the 'Static Typing' short-cut one would be asking for adapters
>which are both stateless and lossless.  Since __conform__ and __adapt__
>would sprout two new arguments, it would make those writing adapters
>think a bit more about the kind of adapter that they are providing.

Unfortunately, in practice this will just lead to people ignoring the 
arguments, because 1) it's easier and 2) it will make their code work with 
type declarations!  So, it won't actually produce any useful effect.

From cce at clarkevans.com  Fri Jan 14 06:00:52 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Fri Jan 14 06:00:55 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050113232719.033223a0@mail.telecommunity.com>
References: <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050113232719.033223a0@mail.telecommunity.com>
Message-ID: <20050114050052.GB93742@prometheusresearch.com>

On Thu, Jan 13, 2005 at 11:50:37PM -0500, Phillip J. Eby wrote:
| 'lossless' isn't really a good term for non-noisy.  The key is that a 
| "noisy" adapter is one that alters the precision of the information it 
| provides, by either claiming greater precision than is actually present, 
| or by losing precision that was present in the meaning of the data.  

Noisy doesn't cut it -- my PC fan is noisy.  In computer science,
noisy usually refers to a flag on an object that tells it to spew
debug output...


| 'statelessness', on the other hand, is primarily useful as a guide to 
| whether what you're building is really an "as-a" adapter.  If an adapter 
| has per-adapter state, it's an extremely good indication that it's 
| actually a *decorator* (in GoF pattern terminology).

GoF is very nice, but I'm using a much broader definition of 'adapt':

   To make suitable to or fit for a specific use or situation
   
By this definition, decorators and facades are both kinds of adapters.

| Anyway, for type declaration, IMO statelessness is the key criterion.  
| Type declaration "wants" to have true adapters (which can maintain object 
| identity), not decorators (which are distinct objects from the things 
| they add functionality to).

Stateful adapters are very useful, and the value of PEP 246 is 
significantly reduced without allowing them.

| Unfortunately, in practice this will just lead to people ignoring the 
| arguments, because 1) it's easier and 2) it will make their code work 
| with type declarations!  So, it won't actually produce any useful effect.

Hmm.

Best,

Clark
From pje at telecommunity.com  Fri Jan 14 06:11:10 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Jan 14 06:09:35 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <ca471dc2050113175217585406@mail.gmail.com>
References: <20050114010307.GA51446@prometheusresearch.com>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
Message-ID: <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>

At 05:52 PM 1/13/05 -0800, Guido van Rossum wrote:
>This may solve the current raging argument, but IMO it would make the
>optional signature declaration less useful, because there's no way to
>accept other kinds of adapters. I'd be happier if def f(X: Y) implied X
>= adapt(X, Y).

The problem is that type declarations really want more guarantees about 
object identity and state than an unrestricted adapt() can provide, 
including sane behavior when objects are passed into the same or different 
functions repeatedly.  See this short post by Paul Moore:

http://mail.python.org/pipermail/python-dev/2005-January/051020.html

It presents some simple examples that show how non-deterministic adaptation 
can be in the presence of stateful adapters created "implicitly" by type 
declaration.  It suggests that just avoiding transitive interface adapters 
may not be sufficient to escape C++ish pitfalls.

Even if you're *very* careful, your seemingly safe setup can be blown just 
by one routine passing its argument to another routine, possibly causing an 
adapter to be adapted.  This is a serious pitfall because today when you 
'adapt' you can also access the "original" object -- you have to first 
*have* it, in order to *adapt* it.  But type declarations using adapt() 
prevents you from ever *seeing* the original object within a function.  So, 
it's *really* unsafe in a way that explicitly calling 'adapt()' is 
not.  You might be passing an adapter to another function, and then that 
function's signature might adapt it again, or perhaps just fail because you 
have to adapt from the original object.

Clark's proposal isn't going to solve this issue for PEP 246, alas.  In 
order to guarantee safety of adaptive type declarations, the implementation 
strategy *must* be able to guarantee that 1) adapters do not have state of 
their own, and 2) adapting an already-adapted object re-adapts the original 
rather than creating a new adapter.  This is what the monkey-typing PEP and 
prototype implementation are intended to address.

(This doesn't mean that explicit adapt() still isn't a useful thing, it 
just means that using it for type declarations is a bad idea in ways that 
we didn't realize until after the "great debate".)

From pje at telecommunity.com  Fri Jan 14 06:28:07 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Jan 14 06:26:33 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114050052.GB93742@prometheusresearch.com>
References: <5.1.1.6.0.20050113232719.033223a0@mail.telecommunity.com>
	<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com>
	<41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050113232719.033223a0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050114001345.02b251e0@mail.telecommunity.com>

At 12:00 AM 1/14/05 -0500, Clark C. Evans wrote:
>On Thu, Jan 13, 2005 at 11:50:37PM -0500, Phillip J. Eby wrote:
>| 'lossless' isn't really a good term for non-noisy.  The key is that a
>| "noisy" adapter is one that alters the precision of the information it
>| provides, by either claiming greater precision than is actually present,
>| or by losing precision that was present in the meaning of the data.
>
>Noisy doesn't cut it -- my PC fan is noisy.  In computer science,
>noisy usually refers to a flag on an object that tells it to spew
>debug output...

Come up with a better name, then.  Precision-munging?  :)


>| Anyway, for type declaration, IMO statelessness is the key criterion.
>| Type declaration "wants" to have true adapters (which can maintain object
>| identity), not decorators (which are distinct objects from the things
>| they add functionality to).
>
>Stateful adapters are very useful, and the value of PEP 246 is
>significantly reduced without allowing them.

Absolutely.  But that doesn't mean type declarations are the right choice 
for PEP 246.  Look at this code:

     def foo(self, bar:Baz):
         bar.whack(self)
         self.spam.fling(bar)

Does this code produce transitive adaptation (i.e. adapt an 
adapter)?  Can't tell?  Me neither.  :)  It depends on what type 
spam.fling() declares its parameter to be, and whether the caller of foo() 
passed in an object that needed an adapter to Baz.

The problem here is that *all* of the arguments you and Alex and others 
raised in the last few days against unconstrained transitive adaptation 
apply in spades to type declarations.  My argument was that if 
well-designed and properly used, transitivity could be quite safe, but even 
I agreed that uncontrolled semi-random adapter composition was madness.

Unfortunately, having type declarations do adapt() introduces the potential 
for precisely this sort of uncontrolled semi-random adapter madness in 
seemingly harmless code.

Now compare to *this* code:

     def foo(self, bar):
         adapt(bar,Baz).whack(self)
         self.spam.fling(bar)

It's obvious that the above does not introduce a transitive adaptation; at 
least if it was passed an "original" object, then it will pass on that 
original object to spam.fling().

So, explicit use of PEP 246 doesn't introduce this problem, but type 
declarations do.  With type declarations you can never even *know* if you 
have the "original object" or not, let alone get it if you don't have it.
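For readers following along, the explicit adapt() being contrasted here is PEP 246's two-hook protocol. The sketch below is a deliberate simplification of what the PEP specifies (it omits registries and some error handling), with illustrative demo classes:

```python
def adapt(obj, protocol, alternate=TypeError):
    """Simplified sketch of PEP 246's adapt(): ask the object first
    (__conform__), then the protocol (__adapt__)."""
    if isinstance(protocol, type) and isinstance(obj, protocol):
        return obj  # already suitable: the original object passes through
    conform = getattr(type(obj), "__conform__", None)
    if conform is not None:
        result = conform(obj, protocol)
        if result is not None:
            return result
    adapt_hook = getattr(protocol, "__adapt__", None)
    if adapt_hook is not None:
        result = adapt_hook(obj)
        if result is not None:
            return result
    if alternate is TypeError:
        raise TypeError("can't adapt %r to protocol %r" % (obj, protocol))
    return alternate

# Hypothetical demo types, echoing the Baz example above:
class Baz:
    def whack(self):
        return "whacked"

class BazAdapter:
    def __init__(self, wrapped):
        self.wrapped = wrapped          # explicit handle on the original
    def whack(self):
        return "adapted " + type(self.wrapped).__name__

class Stick:
    def __conform__(self, protocol):
        if protocol is Baz:
            return BazAdapter(self)

adapt(Baz(), Baz)            # the original object, unwrapped
adapt(Stick(), Baz).whack()  # a BazAdapter wrapping the Stick
```

Note that the caller of adapt() still holds the original object; it is the implicit adapt() in a type declaration that takes that handle away.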

From gvanrossum at gmail.com  Fri Jan 14 07:20:40 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Jan 14 07:20:43 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
Message-ID: <ca471dc205011322205f4d28ec@mail.gmail.com>

[Guido]
> >This may solve the current raging argument, but IMO it would make the
> >optional signature declaration less useful, because there's no way to
> >accept other kinds of adapters. I'd be happier if def f(X: Y) implied X
> >= adapt(X, Y).

[Phillip]
> The problem is that type declarations really want more guarantees about
> object identity and state than an unrestricted adapt() can provide,

I'm not so sure. When I hear "guarantee" I think of compile-time
checking, and I thought that was a no-no.

> including sane behavior when objects are passed into the same or different
> functions repeatedly.  See this short post by Paul Moore:
> 
> http://mail.python.org/pipermail/python-dev/2005-January/051020.html

Hm. Maybe that post points out that adapters that add state are bad,
period. I have to say that the example of adapting a string to a file
using StringIO() is questionable. Another possible adaptation from a
string to a file would be open(), and in fact I know a couple of
existing APIs in the Python core (and elsewhere) that take either a
string or a file, and interpret the string as a filename. Operations
that are customarily done with string data or a file typically use two
different function/method names, for example pickle.load and
pickle.loads.

But I'd be just as happy if an API taking either a string or a file
(stream) should be declared as taking the union of IString and
IStream; adapting to a union isn't that hard to define (I think
someone gave an example somewhere already).

OK, so what am I saying here (rambling really): my gut tells me that I
still like argument declarations to imply adapt(), but that adapters
should be written to be stateless. (I'm not so sure I care about
lossless.)

Are there real-life uses of stateful adapters that would be thrown out
by this requirement?
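One concrete way to see the issue (an illustrative sketch, not code from the thread): the string-to-file adaptation via StringIO mentioned above is stateful, because the StringIO carries its own read position. Implicit adaptation builds a fresh adapter on every call, so that state is silently discarded:

```python
import io

def read_prefix(stream_or_str, n=5):
    # Implicit adaptation: a fresh StringIO (a *stateful* adapter) is
    # created on every call, so its read position never advances.
    if isinstance(stream_or_str, str):
        stream = io.StringIO(stream_or_str)
    else:
        stream = stream_or_str
    return stream.read(n)

s = "hello world"
read_prefix(s)          # 'hello'
read_prefix(s)          # 'hello' again -- the adapter's state was lost

f = io.StringIO(s)      # explicit, caller-managed adapter
read_prefix(f)          # 'hello'
read_prefix(f)          # ' worl' -- state survives between calls
```

A stateless adapter would behave identically either way, which is the distinction being argued for.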

> Even if you're *very* careful, your seemingly safe setup can be blown just
> by one routine passing its argument to another routine, possibly causing an
> adapter to be adapted.  This is a serious pitfall because today when you
> 'adapt' you can also access the "original" object -- you have to first
> *have* it, in order to *adapt* it.

How often is this used, though? I can imagine all sorts of problems if
you mix access to the original object and to the adapter.

> But type declarations using adapt()
> prevents you from ever *seeing* the original object within a function.  So,
> it's *really* unsafe in a way that explicitly calling 'adapt()' is
> not.  You might be passing an adapter to another function, and then that
> function's signature might adapt it again, or perhaps just fail because you
> have to adapt from the original object.

Real-life example, please?

I can see plenty of cases where this could happen with explicit
adaptation too, for example f1 takes an argument and adapts it, then
calls f2 with the adapted value, which calls f3, which adapts it to
something else. Where is f3 going to get the original object?

I wonder if a number of these cases are isomorphic to the hypothetical
adaptation from a float to an int using the int() constructor -- no
matter how we end up defining adaptation, that should *not* happen,
and neither should adaptation from numbers to strings using str(), or
from strings to numbers using int() or float().

But the solution IMO is not to weigh down adapt(), but to agree, as a
user community, not to create such "bad" adapters, period. OTOH there
may be specific cases where the conventions of a particular
application or domain make stateful or otherwise naughty adapters
useful, and everybody understands the consequences and limitations.
Sort of the way that NumPy defines slices as views on the original
data, even though lists define slices as copies of the original data;
you have to know what you are doing with the NumPy slices but the
NumPy users don't seem to have a problem with that. (I think.)

> Clark's proposal isn't going to solve this issue for PEP 246, alas.  In
> order to guarantee safety of adaptive type declarations, the implementation
> strategy *must* be able to guarantee that 1) adapters do not have state of
> their own, and 2) adapting an already-adapted object re-adapts the original
> rather than creating a new adapter.  This is what the monkey-typing PEP and
> prototype implementation are intended to address.

Guarantees again. I think it's hard to provide these, and it feels
unpythonic. (2) feels weird too -- almost as if it were to require
that float(int(3.14)) should return 3.14. That ain't gonna happen.

> (This doesn't mean that explicit adapt() still isn't a useful thing, it
> just means that using it for type declarations is a bad idea in ways that
> we didn't realize until after the "great debate".)

Or maybe we shouldn't try to guarantee so much and instead define
simple, "Pythonic" semantics and live with the warts, just as we do
with mutable defaults and a whole slew of other cases where Python
makes a choice rooted in what is easy to explain and implement (for
example allowing non-Liskovian subclasses). Adherence to a particular
theory about programming is not very Pythonic; doing something that
superficially resembles what other languages are doing but actually
uses a much more dynamic mechanism is (for example storing instance
variables in a dict, or defining assignment as name binding rather
than value copying).

My 0.02 MB,

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From pje at telecommunity.com  Fri Jan 14 08:38:05 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Jan 14 08:36:33 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <ca471dc205011322205f4d28ec@mail.gmail.com>
References: <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com>

At 10:20 PM 1/13/05 -0800, Guido van Rossum wrote:
>[Guido]
> > >This may solve the current raging argument, but IMO it would make the
> > >optional signature declaration less useful, because there's no way to
> > >accept other kinds of adapters. I'd be happier if def f(X: Y) implied X
> > >= adapt(X, Y).
>
>[Phillip]
> > The problem is that type declarations really want more guarantees about
> > object identity and state than an unrestricted adapt() can provide,
>
>I'm not so sure. When I hear "guarantee" I think of compile-time
>checking, and I thought that was a no-no.

No, it's not compile-time based, it's totally at runtime.  I mean that if 
the implementation of 'adapt()' *generates* the adapter (cached of course 
for source/target type pairs), it can trivially guarantee that the adapter 
is stateless.  Quick demo (strawman syntax) of declaring adapters...

First, a type declaring that its 'read' method has the semantics of 
'file.read':

     class SomeKindOfStream:
         def read(self, byteCount) like file.read:
             ...

Second, third-party code adapting a string iterator to a readable file:

     def read(self, byteCount) like file.read for type(iter("")):
         # self is a string iterator here, implement read()
         # in terms of its .next()

And third, some standalone code implementing an "abstract" dict.update 
method for any source object that supports a method that's "like" 
dict.__setitem__:

     def update_anything(self:dict, other:dict) like dict.update for object:
         for k,v in other.items(): self[k] = v

Each of these examples registers the function as an implementation of the 
"file.read" operation for the appropriate type.  When you want to build an 
adapter from SomeKindOfStream or from a string iterator to the "file" type, 
you just access the 'file' type's descriptors, and look up the 
implementation registered for that descriptor for the source type 
(SomeKindOfStream or string-iter).  If there is no implementation 
registered for a particular descriptor of 'file', you leave the 
corresponding attribute off of the adapter class, resulting in a class 
representing the subset of 'file' that can be obtained for the source class.

The result is that you generate a simple adapter class whose only state is 
a read-only slot pointing to the adapted object, and descriptors that bind 
the registered implementations to that object.  That is, the descriptor 
returns a bound instancemethod with an im_self of the original object, not 
the adapter.  (Thus the implementation never even gets a reference to the 
adapter, unless 'self' in the method is declared of the same type as the 
adapter, which would be the case for an abstract method like 'readline()' 
being implemented in terms of 'read'.)

Anyway, it's therefore trivially "guaranteed" to be stateless (in the same 
way that an 'int' is "guaranteed" to be immutable), and the implementation 
is also "guaranteed" to be able to always get back the "original" object.
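A toy rendition of the generated adapter described above (hypothetical names; the real machinery sketched in the monkey-typing work is more involved): the adapter's only state is a read-only slot holding the subject, and each registered operation is bound to the original object, so any state lives with the subject and re-adaptation is harmless:

```python
def build_adapter(registry):
    """registry maps operation names to functions whose first argument is
    the *original* object (an illustrative registration scheme)."""
    class Adapter(object):
        __slots__ = ("__subject__",)   # the adapter's only state
        def __init__(self, subject):
            object.__setattr__(self, "__subject__", subject)
    for name, impl in registry.items():
        # Descriptor binding the implementation to the original object,
        # never to the adapter itself.
        setattr(Adapter, name,
                property(lambda self, impl=impl:
                         lambda *args: impl(self.__subject__, *args)))
    return Adapter

# Adapt a character iterator to a (partial) readable-file interface:
def read_impl(it, n):
    return "".join(c for _, c in zip(range(n), it))

FileLike = build_adapter({"read": read_impl})
chars = iter("hello")
f1 = FileLike(chars)
f1.read(2)            # 'he'
f2 = FileLike(chars)  # "re-adapting" the same subject...
f2.read(2)            # 'll' -- the position lives in the subject,
                      # not in either adapter
```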

Defining adaptation in terms of adapting operations also solves another 
common problem with interface mechanisms for Python: the dreaded "mapping 
interface" and "file-like object" problem.  Really, being able to 
*incompletely* implement an interface is often quite useful in practice, so 
this "monkey see, monkey do" typing ditches the whole concept of a complete 
interface in favor of "explicit duck typing".  You're just declaring "how 
can X act 'like' a duck" -- emulating behaviors of another type rather than 
converting structure.


>Are there real-life uses of stateful adapters that would be thrown out
>by this requirement?

Think about this: if an adapter has independent state, that means it has a 
particular scope of applicability.  You're going to keep the adapter and 
then throw it away at some point, like you do with an iterator.  If it has 
no state, or only state that lives in the original object (by tacking 
annotations onto it), then it has a common lifetime with the original object.

If it has state, then, you have to explicitly manage that state; you can't 
do that if the only way to create an adapter is to pass it into some other 
function that does the adapting, unless all it's going to do is return the 
adapter back to you!

Thus, stateful adapters *must* be explicitly adapted by the code that needs 
to manage the state.

This is why I say that PEP 246 is fine, but type declarations need a more 
restrictive version.  PEP 246 provides a nice way to *find* stateful 
adapters, it just shouldn't do it for function arguments.


> > Even if you're *very* careful, your seemingly safe setup can be blown just
> > by one routine passing its argument to another routine, possibly causing an
> > adapter to be adapted.  This is a serious pitfall because today when you
> > 'adapt' you can also access the "original" object -- you have to first
> > *have* it, in order to *adapt* it.
>
>How often is this used, though? I can imagine all sorts of problems if
>you mix access to the original object and to the adapter.

Right - and early adopters of PEP 246 are warned about this, either from 
the PEP or PyProtocols docs.  The PyProtocols docs early on have dire 
warnings about not forwarding adapted objects to other functions unless you 
already know the other method needs only the interface you adapted to 
already.  However, with type declarations, you may never receive the 
original object.



> > But type declarations using adapt()
> > prevents you from ever *seeing* the original object within a function.  So,
> > it's *really* unsafe in a way that explicitly calling 'adapt()' is
> > not.  You might be passing an adapter to another function, and then that
> > function's signature might adapt it again, or perhaps just fail because you
> > have to adapt from the original object.
>
>Real-life example, please?

If you mean, an example of code that's currently using adapt() that I'd 
have changed to use type declaration instead and then broken something, 
I'll have to look for one and get back to you.  I have a gut feel/vague 
recollection that there are some, but I don't know how many.

The problem is that the effect is inherently non-local; you can't look at a 
piece of code using type declarations and have a clue as to whether there's 
even *potentially* a problem there.


>I can see plenty of cases where this could happen with explicit
>adaptation too, for example f1 takes an argument and adapts it, then
>calls f2 with the adapted value, which calls f3, which adapts it to
>something else. Where is f3 going to get the original object?

PyProtocols warns people not to do this in the docs, but it can't do 
anything about enforcing it.



>But the solution IMO is not to weigh down adapt(), but to agree, as a
>user community, not to create such "bad" adapters, period.

Maybe.  The thing that inspired me to come up with a new approach is that 
"bad" adapters are just *sooo* tempting; many of the adapters that we're 
just beginning to realize are "bad", were ones that Alex and I both 
initially thought were okay.  Making the system such that you get "safe" 
adapters by default removes the temptation, and provides a learning 
opportunity to explain why the caller needs to manage the state when 
creating a stateful adapter.  PEP 246 still allows you to leave it implicit 
how you get the adapter, but it still should be created explicitly by the 
code that needs to manage its lifetime.


>  OTOH there
>may be specific cases where the conventions of a particular
>application or domain make stateful or otherwise naughty adapters
>useful, and everybody understands the consequences and limitations.

Right; and I think that in those cases, it's the *caller* that needs to 
(explicitly) adapt, not the callee, because it's the caller that knows the 
lifetime for which the adapter needs to exist.


> > Clark's proposal isn't going to solve this issue for PEP 246, alas.  In
> > order to guarantee safety of adaptive type declarations, the implementation
> > strategy *must* be able to guarantee that 1) adapters do not have state of
> > their own, and 2) adapting an already-adapted object re-adapts the original
> > rather than creating a new adapter.  This is what the monkey-typing PEP and
> > prototype implementation are intended to address.
>
>Guarantees again. I think it's hard to provide these, and it feels
>unpythonic.

Well, right now Python provides lots of guarantees, like that numbers are 
immutable.  It would be no big deal to guarantee immutable adapters, if 
Python supplies the adapter type for you.


>(2) feels weird too -- almost as if it were to require
>that float(int(3.14)) should return 3.14. That ain't gonna happen.

No, but 'int_wrapper(3.14).original_object' is trivial.

The point is that adaptation should just always return a wrapper of a type 
that's immutable and has a pointer to the original object.

If you prefer, call these characteristics "implementation requirements" 
rather than guarantees.  :)
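
A rough sketch of the wrapper Phillip seems to have in mind (the names
AdapterWrapper and adapt() here are invented for illustration; this is not
PEP 246 or monkey-typing code): the adapter type is immutable, carries only
a pointer to the original object, and re-adapting an adapted object
re-adapts the original.

```python
class AdapterWrapper(object):
    """Hypothetical immutable adapter: no state of its own, only a
    pointer back to the original object."""
    __slots__ = ('original_object',)

    def __init__(self, original):
        # __setattr__ is blocked below, so go through object directly.
        object.__setattr__(self, 'original_object', original)

    def __setattr__(self, name, value):
        # Reject mutation, so the adapter cannot accumulate state.
        raise AttributeError("adapters are immutable")


def adapt(obj, wrapper_type=AdapterWrapper):
    # Adapting an already-adapted object re-adapts the *original*,
    # rather than stacking a wrapper on a wrapper.
    if isinstance(obj, AdapterWrapper):
        obj = obj.original_object
    return wrapper_type(obj)
```

Under these two rules, 'adapt(adapt(x)).original_object is x' holds, which
is the trivial 'int_wrapper(3.14).original_object' property in action.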


>Or maybe we shouldn't try to guarantee so much and instead define
>simple, "Pythonic" semantics and live with the warts, just as we do
>with mutable defaults and a whole slew of other cases where Python
>makes a choice rooted in what is easy to explain and implement (for
>example allowing non-Liskovian subclasses). Adherence to a particular
>theory about programming is not very Pythonic; doing something that
>superficially resembles what other languages are doing but actually
>uses a much more dynamic mechanism is (for example storing instance
>variables in a dict, or defining assignment as name binding rather
>than value copying).

Obviously the word "guarantee" hit a hot button; please don't let it 
obscure the actual merit of the approach, which does not involve any sort 
of compile-time checking.  Heck, it doesn't even have interfaces!

From aleax at aleax.it  Fri Jan 14 08:50:27 2005
From: aleax at aleax.it (Alex Martelli)
Date: Fri Jan 14 08:50:34 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <41E73795.2070505@ActiveState.com>
References: <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>	<79990c6b05011205445ea4af76@mail.gmail.com>	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>	<ca471dc205011207261a8432c@mail.gmail.com>	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>	<5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>
	<49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it>
	<41E73795.2070505@ActiveState.com>
Message-ID: <F49A587A-6600-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 14, at 04:08, David Ascher wrote:

> Alex Martelli wrote:
>
>> Yes, there is (lato sensu) "non-determinism" involved, just like in, 
>> say:
>>     for k in d:
>>         print k
>
> Wow, it took more than the average amount of googling to figure out 
> that lato sensu means "broadly speaking",

Ooops -- sorry; I wouldn't have imagined Brazilian hits would swamp the 
google hits to that extent, mostly qualifying post-grad courses and the 
like... seems to be an idiom there for that.

>  and occurs as "sensu lato" with a 1:2 ratio.

In Latin as she was spoken word order is very free, but the issue here 
is that _in taxonomy specifically_ (which was the way I intended the 
form!) the "sensu lato" order vastly predominates.  Very exhaustive 
discussion of this word order choice in taxonomy at 
<http://www.forum-one.org/new-1967018-4338.html>, btw (mostly about 
"sensu stricto", the antonym).

> I learned something today! ;-)

Me too: about Brazilian idiom, and about preferred word-order use in 
Aquinas and Bonaventura.

Also, a reflection: taxonomy, the classification of things (living 
beings, rocks, legal precedents, ...) into categories, is a discipline 
with many, many centuries of experience behind it.  I think it is 
telling that taxonomists found out they require _two_ kinds of 
``inheritance'' to do their job (no doubt there are all kind of 
_nuances_, but specialized technical wording exists for two kinds: 
"strict-sense" and "broad-sense")... they need to be able to assert 
that "A is a B _broadly speaking_" (or specifically "_strictly 
speaking_") so often that they evolved specific terminology.  Let's 
hope it doesn't take OOP many centuries to accept that both "stricto 
sensu inheritance" (Liskovianly-correct) AND "lato sensu inheritance" 
are needed to do _our_ jobs!-)


Alex

From just at letterror.com  Fri Jan 14 10:09:26 2005
From: just at letterror.com (Just van Rossum)
Date: Fri Jan 14 10:09:33 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <ca471dc205011322205f4d28ec@mail.gmail.com>
Message-ID: <r01050400-1037-FEE97ADA660B11D9AEA9003065D5E7E4@[10.0.0.23]>

Guido van Rossum wrote:

> Are there real-life uses of stateful adapters that would be thrown out
> by this requirement?

Here are two interfaces we're using in a project:

  http://just.letterror.com/ltrwiki/PenProtocol (aka "SegmentPen")
  http://just.letterror.com/ltrwiki/PointPen

They're both abstractions for drawing glyphs (characters from a font).
Sometimes the former is more practical and sometimes the latter. We
really need both interfaces. Yet they can't be adapted without keeping
some state in the adapter.

Implicit adaptations may be dangerous here, but I'm not so sure I care.
In my particular use case, it will be very rare that people want to do

    funcTakingPointPen(segmentPen)
    otherFuncTakingPointPen(segmentPen)

I don't think it will be a problem in general that my adapter carries a
bit of state, and if it _does_ become a problem, it's easy to work around.
It's not dissimilar to file.readline() vs. file.next(): sure, it's not
pretty that file iteration doesn't work nicely with readline(), but all
bug reports about that get closed as "won't fix" ;-). It's something you
can easily learn to live with.

That said, I don't think implicit adaptation from str to file is a good
idea... Python (the std lib, really) shouldn't use "dangerous" adapters
for implicit adaptation, but that doesn't mean it should be impossible
to do so anyway.

[ ... ]
> But the solution IMO is not to weigh down adapt(), but to agree, as a
> user community, not to create such "bad" adapters, period. OTOH there
> may be specific cases where the conventions of a particular
> application or domain make stateful or otherwise naughty adapters
> useful, and everybody understands the consequences and limitations.
> Sort of the way that NumPy defines slices as views on the original
> data, even though lists define slices as copies of the original data;
> you have to know what you are doing with the NumPy slices but the
> NumPy users don't seem to have a problem with that. (I think.)
[ ... ]
> Guarantees again. I think it's hard to provide these, and it feels
> unpythonic.
[ ... ]
> Or maybe we shouldn't try to guarantee so much and instead define
> simple, "Pythonic" semantics and live with the warts, just as we do
> with mutable defaults and a whole slew of other cases where Python
> makes a choice rooted in what is easy to explain and implement (for
> example allowing non-Liskovian subclasses). Adherence to a particular
> theory about programming is not very Pythonic; doing something that
> superficially resembles what other languages are doing but actually
> uses a much more dynamic mechanism is (for example storing instance
> variables in a dict, or defining assignment as name binding rather
> than value copying).

Yes, yes and yes!

Just
From arigo at tunes.org  Fri Jan 14 10:47:15 2005
From: arigo at tunes.org (Armin Rigo)
Date: Fri Jan 14 10:58:46 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <ca471dc205011322205f4d28ec@mail.gmail.com>
References: <20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
Message-ID: <20050114094715.GA21852@vicky.ecs.soton.ac.uk>

Hi Guido,

On Thu, Jan 13, 2005 at 10:20:40PM -0800, Guido van Rossum wrote:
> Hm. Maybe that post points out that adapters that add state are bad,
> period. I have to say that the example of adapting a string to a file
> using StringIO() is questionable. Another possible adaptation from a
> string to a file would be open()

I have some theory about why adapting a string to a file in any way is
questionable, but why adapting some more specific class to some other class
usually "feels right", and about what "lossy" means.  In my opinion a
user-defined class or interface mixes two notions: a "concept" meaningful for
the programmer that the instances represent, and the "interface" provided to
manipulate it.  Adaptation works well at the "concept" level without all the
hassles of information loss and surprises of transitive adaptation.  The
problems show up in the cases where a single concrete interface doesn't
obviously map to a single "concept".  For example, strings "mean" very
different concepts in various contexts, e.g. a file name, a URL, the byte
content of a document, or the pickled representation of something.  Containers
have a similar problem.  This suggests that only concrete objects which are
expected to encode a *single* concept should be used for adaptation.

Note that the theory -- for which I have an old draft at
http://arigo.tunes.org/semantic_models.html -- suggests that it is possible to
be more precise about various levels of concepts encoding each others, like a
string standing for the name of a file itself encoding an image; but I'm not
proposing anything similar here, just suggesting a way to realize what kind of
adaptation is problematic.

> may be specific cases where the conventions of a particular
> application or domain make stateful or otherwise naughty adapters
> useful, and everybody understands the consequences and limitations.

Note that it may be useful to be able to register some adapters in "local"
registries instead of the single global one, to avoid all kinds of unexpected
global effects.  For example something along the lines of (but nicer than):

  my_registry = AdapterRegister()
  my_registry.register(...)
  my_registry.adapt(x, y)   # direct use

  __adaptregistry__ = my_registry
  def f(x as y):   # implicit use of the module-local registry
      stuff

This would allow a module to provide the str->StringIO or str->file conversion
locally.
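
A minimal sketch of what such a local registry might look like (the names
AdapterRegister/register/adapt are Armin's strawman, not an actual API, and
io.StringIO stands in for the StringIO module of the time):

```python
from io import StringIO  # stand-in for the 2005-era StringIO module


class AdapterRegister:
    """Toy per-module adapter registry; hypothetical API."""

    def __init__(self):
        self._adapters = {}

    def register(self, from_type, to_protocol, adapter):
        self._adapters[(from_type, to_protocol)] = adapter

    def adapt(self, obj, protocol):
        try:
            adapter = self._adapters[(type(obj), protocol)]
        except KeyError:
            raise TypeError("no local adapter from %r to %r"
                            % (type(obj), protocol))
        return adapter(obj)


# A module could register the str -> file-like conversion locally,
# without polluting any global registry:
my_registry = AdapterRegister()
my_registry.register(str, 'file-like', StringIO)
f = my_registry.adapt("some content", 'file-like')
```

The point is scoping: the questionable str->file adaptation is visible only
to code that opts into this registry.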


A bientot,

Armin.
From carribeiro at gmail.com  Fri Jan 14 11:40:52 2005
From: carribeiro at gmail.com (Carlos Ribeiro)
Date: Fri Jan 14 11:40:54 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <F49A587A-6600-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc205011207261a8432c@mail.gmail.com>
	<BE7288E8-64B1-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>
	<5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com>
	<49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it>
	<41E73795.2070505@ActiveState.com>
	<F49A587A-6600-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <864d37090501140240628ec1f2@mail.gmail.com>

On Fri, 14 Jan 2005 08:50:27 +0100, Alex Martelli <aleax@aleax.it> wrote:
> Ooops -- sorry; I wouldn't have imagined Brazilian hits would swamp the
> google hits to that extent, mostly qualifying post-grad courses and the
> like... seems to be an idiom there for that.

'Lato sensu' is used to indicate short post-graduate level courses
that don't give one any recognized degree such as 'MSc', or 'master'.
It's pretty much like a specialization course on some specific area,
usually offered by small private universities. It's like a fever
around here - everyone does it just to add something to the resume - and
it has spawned an entire branch of the educational industry (and yeah,
'industry' is the best word for it).

Some schools refer to traditional post-graduate courses as 'stricto
sensu'. I don't have the slightest idea where they got this
naming from. It's also amazing how many hits you'll get for the wrong
spellings: 'latu sensu' & 'strictu sensu', mostly from Brazil, and also
from some Spanish-speaking countries.

> Also, a reflection: taxonomy, the classification of things (living
> beings, rocks, legal precedents, ...) into categories, is a discipline
> with many, many centuries of experience behind it.  I think it is
> telling that taxonomists found out they require _two_ kinds of
> ``inheritance'' to do their job (no doubt there are all kind of
> _nuances_, but specialized technical wording exists for two kinds:
> "strict-sense" and "broad-sense")... they need to be able to assert
> that "A is a B _broadly speaking_" (or specifically "_strictly
> speaking_") so often that they evolved specific terminology.  Let's
> hope it doesn't take OOP many centuries to accept that both "stricto
> sensu inheritance" (Liskovianly-correct) AND "lato sensu inheritance"
> are needed to do _our_ jobs!-)

Good point!

-- 
Carlos Ribeiro
Consultoria em Projetos
blog: http://rascunhosrotos.blogspot.com
blog: http://pythonnotes.blogspot.com
mail: carribeiro@gmail.com
mail: carribeiro@yahoo.com
From skip at pobox.com  Fri Jan 14 10:36:21 2005
From: skip at pobox.com (Skip Montanaro)
Date: Fri Jan 14 12:01:38 2005
Subject: [Python-Dev] redux: fractional seconds in strptime
In-Reply-To: <41E74790.60108@ocf.berkeley.edu>
References: <16870.61059.451494.303971@montanaro.dyndns.org>
	<41E74790.60108@ocf.berkeley.edu>
Message-ID: <16871.37525.981821.580939@montanaro.dyndns.org>


    Brett> The problem I have always had with this proposal is that the
    Brett> value is worthless: time tuples do not have a slot for fractional
    Brett> seconds.  Yes, it could possibly be changed to return a float for
    Brett> seconds, but that could possibly break things.

Actually, time.strptime() returns a struct_time object.  Would it be
possible to extend %S to parse floats then add a microseconds (or whatever)
field to struct_time objects that is available by attribute only?  In Py3k
it could worm its way into the tuple representation somehow (either as a new
field or by returning seconds as a float).

    Brett> My vote is that if something is added it be like %N but without
    Brett> the optional digit count.  This allows any separator to
    Brett> be used while still consuming the digits.  It also doesn't
    Brett> suddenly add optional args which are not supported for any other
    Brett> directive.

I realize the %4N notation is distasteful, but without it I think you will
have trouble parsing something like

    13:02:00.704

What would be the format string?  %H:%M:%S.%N would be incorrect.  It
works if you allow the digit notation: %H:%M:%S.%3N
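
(As a point of comparison, the datetime module's strptime handles this very
example with a %f directive that consumes one to six fractional digits
without an explicit count -- note this is the datetime API, not
time.strptime:)

```python
from datetime import datetime

# %f accepts one to six digits of fractional seconds, scaled to
# microseconds: ".704" parses as 704000.
t = datetime.strptime("13:02:00.704", "%H:%M:%S.%f")
```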

I think that, except for the logging module, presentation of fractions of a
second would almost always use the locale-specific decimal point, so if
that problem is fixed, extending %S to understand floating point seconds
would be reasonable.

Skip

From aleax at aleax.it  Fri Jan 14 12:40:41 2005
From: aleax at aleax.it (Alex Martelli)
Date: Fri Jan 14 12:40:46 2005
Subject: [Python-Dev] redux: fractional seconds in strptime
In-Reply-To: <16871.37525.981821.580939@montanaro.dyndns.org>
References: <16870.61059.451494.303971@montanaro.dyndns.org>
	<41E74790.60108@ocf.berkeley.edu>
	<16871.37525.981821.580939@montanaro.dyndns.org>
Message-ID: <1E3B87D3-6621-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 14, at 10:36, Skip Montanaro wrote:

>
>     Brett> The problem I have always had with this proposal is that the
>     Brett> value is worthless, time tuples do not have a slot for 
> fractional
>     Brett> seconds.  Yes, it could possibly be changed to return a 
> float for
>     Brett> seconds, but that could possibly break things.
>
> Actually, time.strptime() returns a struct_time object.  Would it be
> possible to extend %S to parse floats then add a microseconds (or 
> whatever)
> field to struct_time objects that is available by attribute only?  In 
> Py3k
> it could worm its way into the tuple representation somehow (either as 
> a new
> field or by returning seconds as a float).

+1 -- I never liked the idea that 'time tuples' lost fractions of a 
second.  On platforms where that's sensible and not too hard, 
time.time() could also -- unobtrusively and backwards compatibly -- set 
that same attribute.  I wonder if, where the attribute's real value is 
unknown, it should be None (a correct indication of "I dunno") or 0.0 
(maybe handier); instinctively, I would prefer None.

"Available by attribute only" is probably sensible, overall, but maybe 
strftime should make available whatever formatting item[s] strptime may 
grow to support fractions of a second; and one such item (distinct from 
%S for guaranteed backwards compatibility) should be "seconds and 
fraction, with [[presumably, locale-specific]] decimal point inside".


Alex

From barry at python.org  Fri Jan 14 12:46:09 2005
From: barry at python.org (Barry Warsaw)
Date: Fri Jan 14 12:46:12 2005
Subject: [Python-Dev] redux: fractional seconds in strptime
In-Reply-To: <1E3B87D3-6621-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <16870.61059.451494.303971@montanaro.dyndns.org>
	<41E74790.60108@ocf.berkeley.edu>
	<16871.37525.981821.580939@montanaro.dyndns.org>
	<1E3B87D3-6621-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <1105703168.12628.21.camel@presto.wooz.org>

On Fri, 2005-01-14 at 06:40, Alex Martelli wrote:

> +1 -- I never liked the idea that 'time tuples' lost fractions of a 
> second.  On platforms where that's sensible and not too hard, 
> time.time() could also -- unobtrusively and backwards compatibly -- set 
> that same attribute.  I wonder if, where the attribute's real value is 
> unknown, it should be None (a correct indication of "I dunno") or 0.0 
> (maybe handier); instinctively, I would prefer None.

None feels better.  I've always thought it was kind of icky for
datetimes to use microseconds=0 to decide whether to print the
fractional second part or not for isoformat(), e.g.:

>>> import datetime
>>> now = datetime.datetime.now()
>>> now.isoformat()
'2005-01-14T06:44:18.013832'
>>> now.replace(microsecond=0).isoformat()
'2005-01-14T06:44:18'

-Barry

From marktrussell at btopenworld.com  Fri Jan 14 13:24:24 2005
From: marktrussell at btopenworld.com (Mark Russell)
Date: Fri Jan 14 13:24:27 2005
Subject: [Python-Dev] redux: fractional seconds in strptime
In-Reply-To: <16871.37525.981821.580939@montanaro.dyndns.org>
References: <16870.61059.451494.303971@montanaro.dyndns.org>
	<41E74790.60108@ocf.berkeley.edu>
	<16871.37525.981821.580939@montanaro.dyndns.org>
Message-ID: <1105705464.5494.9.camel@localhost>

On Fri, 2005-01-14 at 09:36, Skip Montanaro wrote:
> Actually, time.strptime() returns a struct_time object.  Would it be
> possible to extend %S to parse floats then add a microseconds (or whatever)
> field to struct_time objects that is available by attribute only?

+1 for adding a microseconds field to struct_time, but I'd also like to
see an integer-only way of parsing fractional seconds in time.strptime. 
Using floating point makes it harder to support exact comparison of
timestamps (an issue I recently ran into when writing unit tests for
code storing timestamps in a database).

My vote is for %<digit>N producing a microseconds field.

Mark Russell

From cce at clarkevans.com  Fri Jan 14 14:30:44 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Fri Jan 14 14:30:46 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <000d01c4f9dc$138633a0$e841fea9@oemcomputer>
References: <20050114010307.GA51446@prometheusresearch.com>
	<000d01c4f9dc$138633a0$e841fea9@oemcomputer>
Message-ID: <20050114133044.GA37099@prometheusresearch.com>

On Thu, Jan 13, 2005 at 08:54:33PM -0500, Raymond Hettinger wrote:
| > Since __conform__ and __adapt__
| > would sprout two new arguments, it would make those writing adapters
| > think a bit more about the kind of adapter that they are providing.
| 
| Using optional arguments may not be the most elegant or extensible
| approach.  Perhaps a registry table or adapter attributes would fare
| better.

I'm not sure how either of these would work since the adapt()
function could return `self`.  Adapter attributes wouldn't work in
that case (or would they?), and since adapters could be given
dynamically by __adapt__ or __conform__ a registry isn't all that
appropriate.   Perhaps we could just pass around a single **kwargs?

Best,

Clark
From cce at clarkevans.com  Fri Jan 14 15:19:39 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Fri Jan 14 15:19:41 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
References: <41E5EFF6.9090408@colorstudy.com>
	<79990c6b05011302352cbd41de@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
Message-ID: <20050114141939.GB37099@prometheusresearch.com>

On Fri, Jan 14, 2005 at 12:11:10AM -0500, Phillip J. Eby wrote:
| Clark's proposal isn't going to solve this issue for PEP 246, alas.  In 
| order to guarantee safety of adaptive type declarations, the 
| implementation strategy *must* be able to guarantee that 1) adapters do 
| not have state of their own, and 2) adapting an already-adapted object 
| re-adapts the original rather than creating a new adapter.

1. Following Raymond's idea for allowing adaption to reflect more
   arbitrary properties (which can be used to provide restrictions
   on the kinds of adapters expected):
   
       adapt(object, protocol, default = False, **properties)

          Request adaptation, where the result matches a set of
          properties, such as 'lossless', 'stateless'.

       __conform__(self, protocol, **properties)
       __adapt__(self, object, **properties)
      
          Conform/adapt but optionally parameterized by a set of
          restrictions.  The **properties can be used to inform
          the adaptation.
     
       register(from, to, adapter = None, predicate = None)

          Register an adaptation path from one protocol to another,
          optionally providing an adapter.  If no adapter is provided, 
          then adapt(from,to,**properties) is used when adapting. If
          a predicate is provided, then the adaptation path is
          available only if predicate(**properties) returns True.

2. Perhaps if we just provide a mechanism for an adapter to specify
   that it's OK to be used "implicitly" via the declaration syntax?

       def fun(x: Y):
           ...

   is equivalent to,

       def fun(x):
           x = adapt(x, Y, declaration = True)

On Thu, Jan 13, 2005 at 05:52:10PM -0800, Guido van Rossum wrote:
| This may solve the current raging argument, but IMO it would 
| make the optional signature declaration less useful, because 
| there's no way to accept other kind of adapters. I'd be happier
| if def f(X: Y) implied X = adapt(X, Y).

Ideally, yes.  However, some adapters may want to explicitly disable
their usage in this context -- so some differentiation is warranted.
This 'revised' proposal puts the burden on the adapter (or its
registration) to specify that it shouldn't be used in this context.
I'm carefully using 'declaration' as the restriction, not 'stateless'.  
One may have a stateful adapter which is most appropriate to be used
in declarations (see Armin's insightful post).

Furthermore, the 'full' version of adapt() where argument 'restrictions'
can be specified could be done via a decorator syntax:

  @adapt(x, Y, **properties)

I hope this helps.

P.S. Clearly there is much information to be captured in this
thread and put into the PEP (mostly as appendix material); keep
posting good ideas, problems, opinions, whatever -- I will summarize
over this weekend.  

-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From ncoghlan at iinet.net.au  Fri Jan 14 15:24:15 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Fri Jan 14 15:24:19 2005
Subject: [Python-Dev] frame.f_locals is writable
In-Reply-To: <41E6DD52.2080109@ieee.org>
References: <41E6DD52.2080109@ieee.org>
Message-ID: <41E7D60F.9000208@iinet.net.au>

Shane Holloway (IEEE) wrote:
> For a little background, I'm working on making an edit and continue 
> support in python a little more robust.  So, in replacing references to 
> unmodifiable types like tuples and bound-methods (instance or class), I 
> iterate over gc.get_referrers.
> 
> So, I'm working on frame types, and wrote this code::
> 
>     def replaceFrame(self, ref, oldValue, newValue):
>         for name, value in ref.f_locals.items():
>             if value is oldValue:
>                 ref.f_locals[name] = newValue
>                 assert ref.f_locals[name] is newValue
> 

FWIW, this should work:

     def replaceFrame(self, ref, oldValue, newValue):
         for name, value in ref.f_locals.items():
             if value is oldValue:
                 exec "ref.f_locals[name] = newValue"
                 assert ref.f_locals[name] is newValue

And, no, you don't have to tell me that this is an evil hack. I already know 
that, since I discovered it earlier this evening by poking around in the C 
source code for PyFrame_LocalsToFast and then looking to see what code calls 
that function :)

Cheers,
Nick

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From cce at clarkevans.com  Fri Jan 14 15:39:33 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Fri Jan 14 15:39:36 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114094715.GA21852@vicky.ecs.soton.ac.uk>
References: <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<20050114094715.GA21852@vicky.ecs.soton.ac.uk>
Message-ID: <20050114143933.GC37099@prometheusresearch.com>

On Fri, Jan 14, 2005 at 09:47:15AM +0000, Armin Rigo wrote:
| In my opinion a user-defined class or interface mixes two notions: a
| "concept" meaningful for the programmer that the instances
| represent, and the "interface" provided to manipulate it.
...
| This suggests that only concrete objects which are expected to
| encode a *single* concept should be used for adaptation.

So, in this view of the world, the adapter from FileName to File _is_ 
appropriate, but the adapter from String to FileName isn't?

  def checkSecurity(filename: FileName):
      ...
 
Hmm.  I'd like to be able to pass in a String here, and use that
String->FileName adapter.  So, there isn't a problem yet; although
String is vague in a sense, it doesn't hurt to specialize it 
in the context that I have in mind.

   def checkContent(file: File):
       ... look for well known viruses ...

   def checkSecurity(filename: FileName):
       ... look for nasty path information ...
       return checkContent(filename)
       
Even this is _ok_ since the conceptual jump is specified by the
programmer between the two stages.  The problem happens when
one does...

    checkContent("is-this-a-filename-or-is-this-content")
    
This is where we run into issues: when an adapter which 'specializes'
the content is used implicitly in a transitive adaptation chain.

| Note that it may be useful to be able to register some adapaters 
| in "local"  registeries instead of the single global one, to avoid 
| all kinds of unexpected global effects.

Nice...

Best,

Clark

-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From pje at telecommunity.com  Fri Jan 14 16:07:00 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Jan 14 16:05:25 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <r01050400-1037-FEE97ADA660B11D9AEA9003065D5E7E4@[10.0.0.23]>
References: <ca471dc205011322205f4d28ec@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>

At 10:09 AM 1/14/05 +0100, Just van Rossum wrote:
>Guido van Rossum wrote:
>
> > Are there real-life uses of stateful adapters that would be thrown out
> > by this requirement?
>
>Here are two interfaces we're using in a project:
>
>   http://just.letterror.com/ltrwiki/PenProtocol (aka "SegmentPen")
>   http://just.letterror.com/ltrwiki/PointPen
>
>They're both abstractions for drawing glyphs (characters from a font).
>Sometimes the former is more practical and sometimes the latter. We
>really need both interfaces. Yet they can't be adapted without keeping
>some state in the adapter.

Maybe I'm missing something, but for those interfaces, isn't it okay to 
keep the state in the *adapted* object here?  In other words, if PointPen 
just added some private attributes to store the extra data?


>Implicit adaptations may be dangerous here, but I'm not so sure I care.
>In my particular use case, it will be very rare that people want to do
>
>     funcTakingPointPen(segmentPen)
>     otherFuncTakingPointPen(segmentPen)

But if the extra state were stored on the segmentPen rather than the 
adapter, this would work correctly, wouldn't it?  Whereas with it stored in 
an adapter, it wouldn't.
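
To illustrate the suggestion (with toy pen classes invented here, not the
real PenProtocol API): if the buffered state lives on the adapted
SegmentPen as a private attribute, two independently created adapters see
the same state, so the two-calls-in-a-row case works.

```python
class SegmentPen:
    """Toy stand-in for a segment-based pen; not the real protocol."""
    pass


class PointPenAdapter:
    """Adapter that keeps its working state on the *adapted* object
    (a private attribute), so the adapter itself holds no state."""

    def __init__(self, segment_pen):
        self._pen = segment_pen
        if not hasattr(segment_pen, '_pointpen_buffer'):
            segment_pen._pointpen_buffer = []

    def add_point(self, pt):
        self._pen._pointpen_buffer.append(pt)


pen = SegmentPen()
PointPenAdapter(pen).add_point((0, 0))
PointPenAdapter(pen).add_point((1, 1))   # a *second* adapter instance
# Both adapters appended to the same buffer on the adapted pen.
```

Whether the real pen interfaces can tolerate this kind of attribute
grafting is exactly the question at issue; this only shows the mechanics.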

From pje at telecommunity.com  Fri Jan 14 16:22:36 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Jan 14 16:21:02 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114094715.GA21852@vicky.ecs.soton.ac.uk>
References: <ca471dc205011322205f4d28ec@mail.gmail.com>
	<20050113143421.GA39649@prometheusresearch.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com>

At 09:47 AM 1/14/05 +0000, Armin Rigo wrote:
>For example, strings "mean" very
>different concepts in various contexts, e.g. a file name, an url, the byte
>content a document, or the pickled representation of something.

Note that this is solvable in practice by the author of a method or 
framework choosing to define an interface that they accept, and then 
pre-defining the adaptation from string to that interface.  So, what a 
string "means" in that context is pre-defined.

The interpretation problem for strings comes only when a third party 
attempts to define adaptation from a string to a context that takes some 
more generic interface.


>This would allow a module to provide the str->StringIO or str->file conversion
>locally.

It also works for the module to define a target interface and register an 
adapter to that, and introduces less complexity into the adaptation system.

From just at letterror.com  Fri Jan 14 17:27:53 2005
From: just at letterror.com (Just van Rossum)
Date: Fri Jan 14 17:27:59 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
Message-ID: <r01050400-1037-3F051851664911D9AEA9003065D5E7E4@[10.0.0.23]>

Phillip J. Eby wrote:

> At 10:09 AM 1/14/05 +0100, Just van Rossum wrote:
> >Guido van Rossum wrote:
> >
> > > Are there real-life uses of stateful adapters that would be
> > > thrown out by this requirement?
> >
> >Here are two interfaces we're using in a project:
> >
> >   http://just.letterror.com/ltrwiki/PenProtocol (aka "SegmentPen")
> >   http://just.letterror.com/ltrwiki/PointPen
> >
> >They're both abstractions for drawing glyphs (characters from a
> >font). Sometimes the former is more practical and sometimes the
> >latter. We really need both interfaces. Yet they can't be adapted
> >without keeping some state in the adapter.
> 
> Maybe I'm missing something, but for those interfaces, isn't it okay
> to keep the state in the *adapted* object here?  In other words, if
> PointPen just added some private attributes to store the extra data?
> 
> 
> >Implicit adaptations may be dangerous here, but I'm not so sure I
> >care. In my particular use case, it will be very rare that people
> >want to do
> >
> >     funcTakingPointPen(segmentPen)
> >     otherFuncTakingPointPen(segmentPen)
> 
> But if the extra state were stored on the segmentPen rather than the
> adapter, this would work correctly, wouldn't it?  Whereas with it
> stored in an adapter, it wouldn't.

Are you saying the adapter could just hijack some attrs on the adapted
object? Or are you saying the adapted object should be aware of the
adapter? Neither sounds right, so I hope I'm misunderstanding...

Just
From gvanrossum at gmail.com  Fri Jan 14 17:32:42 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Jan 14 17:32:45 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com>
References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com>
Message-ID: <ca471dc2050114083242b9fb48@mail.gmail.com>

[Phillip]
> Quick demo (strawman syntax) of declaring adapters...
> 
> First, a type declaring that its 'read' method has the semantics of
> 'file.read':
> 
>      class SomeKindOfStream:
>          def read(self, byteCount) like file.read:
>              ...
[and more like this]

Sorry, this is dead in the water. I have no desire to add syntax
complexities like this to satisfy some kind of theoretically nice
property.

> Second, third-party code adapting a string iterator to a readable file:

We need to pick a better example; I like Armin's hypothesis that
adapting strings to more specific things is abuse of the adapter
concept (paraphrased).

> >Are there real-life uses of stateful adapters that would be thrown out
> >by this requirement?
> 
> Think about this:
[...]

No, I asked for a real-life example. Just provided one, and I'm
satisfied that stateful adapters can be useful.

["proof" omitted] 
> Thus, stateful adapters *must* be explicitly adapted by the code that needs
> to manage the state.

This doesn't prove it at all to me.

> This is why I say that PEP 246 is fine, but type declarations need a more
> restrictive version.  PEP 246 provides a nice way to *find* stateful
> adapters, it just shouldn't do it for function arguments.

You haven't proven that for me. The example quoted earlier involving
print_next_line() does nothing to prove it, since it's a bad use of
adaptation for a different reason: string -> file adaptation is abuse.

> >But the solution IMO is not to weigh down adapt(), but to agree, as a
> >user community, not to create such "bad" adapters, period.
> 
> Maybe.  The thing that inspired me to come up with a new approach is that
> "bad" adapters are just *sooo* tempting; many of the adapters that we're
> just beginning to realize are "bad", were ones that Alex and I both
> initially thought were okay.

One of my hesitations about adding adapt() and interfaces to the core
language has always been that it would change the "flavor" of much of
the Python programming we do and that we'd have to relearn how to
write good code. There are other places in Python where it can be
tempting to use its features in a way that can easily cause trouble
(the extremely dynamic nature of the language is always tempting); we
tend not to invent new syntax to fix this but instead develop idioms
that avoid the issues.

IOW, I don't want to make it syntactically impossible to write bad
adapters, but we'll have to develop a set of guidelines for writing
good adapters. I don't believe for a second that all stateful adapters
are bad, even though I expect that stateless lossless adapters are
always good.

I like Armin's hypothesis better.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From arigo at tunes.org  Fri Jan 14 17:39:00 2005
From: arigo at tunes.org (Armin Rigo)
Date: Fri Jan 14 17:50:29 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com>
References: <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com>
Message-ID: <20050114163900.GA21005@vicky.ecs.soton.ac.uk>

Hi Phillip,

On Fri, Jan 14, 2005 at 10:22:36AM -0500, Phillip J. Eby wrote:
> Note that this is solvable in practice by the author of a method or 
> framework choosing to define an interface that they accept, and then 
> pre-defining the adaptation from string to that interface.  So, what a 
> string "means" in that context is pre-defined.

I'm trying to reserve the usage of "interface" to something more concrete: the
concrete ways we have to manipulate a given object (typically a set of methods
including some unwritten expectations).  We might be talking about the same
thing, then, but just to check: I'm making sense of the above paragraph in two
steps.  First, I read it with "interface" replaced by "concept": the author of
the method chooses what concepts the input arguments carry: a file or a file
name, for example.  Then he chooses which particular interface he'd like to
access the input arguments through: if it's a file, then it's probably via the
standard file-like methods; if it's a file name, then it's probably as a
string.  It's important to do it in two steps, even if in practice a lot of
concepts typically come with a single associated interface (both together,
they are a "duck type").

The programmer using existing methods also goes through two steps: first, he
considers the "intuitive" signature of the method, which includes a reasonable
name and conceptual arguments.  Then he traditionally has to care about the
precise interface that the callee expects.  For example, he knows that in some
specific situation he wants to use marshal.load[s](something), but he has to
check precisely which interface the function expects for 'something': a file
name string, a file-like object, a real file, a content string?

Adaptation should make the latter part more automatic, and nothing more.  
Ideally, both the caller and the callee know (and write down) that the
function's argument is a "reference to some kind of file stuff", a very
general concept; then they can independently specify which concrete object
they expect and provide, e.g. "a string naming a file", "a file-like object",
"a string containing the data".

What I see in most arguments about adaptation/conversion/cast is some kind of
confusion that would make us believe that the concrete interface (or even
worse the formal one) fully defines what underlying concepts they represent.  
It is true only for end-user application-specific classes.

> The interpretation problem for strings comes only when a third party 
> attempts to define adaptation from a string to a context that takes some 
> more generic interface.

In the above example, there is nothing in the general concept that helps the
caller to guess how a plain string will be interpreted, or symmetrically that
helps the callee to guess what an incoming plain string means.  In my opinion
this should fail, in favor of something more explicit.  It's already a problem
without any third party.

> >(...) conversion locally.
> 
> It also works for the module to define a target interface and register an 
> adapter to that, and introduces less complexity into the adaptation system.

Makes sense, but my fear is that people will soon register generic adapters
all around...  debugging nightmares!


Armin
From shane.holloway at ieee.org  Fri Jan 14 18:17:23 2005
From: shane.holloway at ieee.org (Shane Holloway (IEEE))
Date: Fri Jan 14 18:17:54 2005
Subject: [Python-Dev] frame.f_locals is writable
In-Reply-To: <41E6ED20.50103@ocf.berkeley.edu>
References: <41E6DD52.2080109@ieee.org> <41E6ED20.50103@ocf.berkeley.edu>
Message-ID: <41E7FEA3.10101@ieee.org>

Brett C. wrote:
> Other option would be to add a function that either directly modified
>  single values in f_localsplus, a function that takes a dict and 
> propogates the values, or a function that just calls 
> PyFrame_LocalsToFast() .
Brett!!  Thanks for looking this up!  With a little help from ctypes, I 
was able to call PyFrame_LocalsToFast, and it works wonderfully!  Maybe 
this method could be added to the frame type itself?


> Personally I am against this, but that is because you would 
> single-handedly ruin my master's thesis and invalidate any possible
> type inferencing one can do in Python without some semantic change.
> But then again my thesis shows that amount of type inferencing is not
> worth the code complexity so it isn't totally devastating. =)
Well, at least in theory this only allows the developer to replace a 
variable with a better (hopefully) version of a class that is very 
similar...  <wink>


> And you are right, "don't do that".  =)

I'm going to only remember this trick in the light of development tools.
Really!  This magic is WAY too deep for a library.  The only use for
it that I could really see is a Smalltalk-like swap method.


Thanks again for your help!
-Shane

From shane.holloway at ieee.org  Fri Jan 14 18:19:46 2005
From: shane.holloway at ieee.org (Shane Holloway (IEEE))
Date: Fri Jan 14 18:20:20 2005
Subject: [Python-Dev] frame.f_locals is writable
In-Reply-To: <41E7D60F.9000208@iinet.net.au>
References: <41E6DD52.2080109@ieee.org> <41E7D60F.9000208@iinet.net.au>
Message-ID: <41E7FF32.2030502@ieee.org>

> FWIW, this should work:
> 
>     def replaceFrame(self, ref, oldValue, newValue):
>         for name, value in ref.f_locals.items():
>             if value is oldValue:
>                 exec "ref.f_locals[name] = newValue"
>                 assert ref.f_locals[name] is newValue
> 
> And, no, you don't have to tell me that this is an evil hack. I already 
> know that, since I discovered it earlier this evening by poking around 
> in the C source code for PyFrame_LocalsToFast and then looking to see 
> what code calls that function :)

Yes.  After poking around in Google with PyFrame_LocalsToFast, I found 
some other links to people doing that.  I implemented a direct call 
using ctypes to make the code explicit about what's happening.  I'm just 
glad it is possible now.  Works fine in both 2.3 and 2.4.

Thanks,
-Shane
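
For the archives, here is a minimal sketch of the ctypes call being
described (CPython-specific; PyFrame_LocalsToFast is an internal API, the
helper name is made up for illustration, and none of this is guaranteed
across interpreter versions):

```python
import ctypes
import sys

def set_local(frame, name, value):
    # CPython-only hack: write through the f_locals snapshot, then ask the
    # interpreter to copy the dict back into the frame's fast-locals array.
    frame.f_locals[name] = value
    ctypes.pythonapi.PyFrame_LocalsToFast(
        ctypes.py_object(frame), ctypes.c_int(0))

def demo():
    x = 1
    # Rebind our own local 'x' from "outside" via the frame object.
    set_local(sys._getframe(), "x", 42)
    return x

print(demo())
```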
From pje at telecommunity.com  Fri Jan 14 18:28:16 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Jan 14 18:26:41 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <ca471dc2050114083242b9fb48@mail.gmail.com>
References: <5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com>

At 08:32 AM 1/14/05 -0800, Guido van Rossum wrote:
>I have no desire to add syntax
>complexities like this to satisfy some kind of theoretically nice
>property.

Whether it's syntax or a decorator, it allows you to create stateless 
adapters without needing to write individual adapter *classes*, or even 
having an explicit notion of an "interface" to adapt to.  That is, it makes 
it very easy to write a "good" adapter; you can do it without even 
trying.  The point isn't to make it impossible to write a "bad" adapter, 
it's to make it more attractive to write a good one.

Also, btw, it's not a "theoretically nice" property.  I just tried making 
PyProtocols 'Adapter' class immutable, and reran PEAK's unit tests, 
exercising over 100 adapter classes.  *Three* had state.  All were trivial 
caching, not per-adapter state.

However, even had they *not* been trivial caching, this suggests that there 
are a *lot* of use cases for stateless adapters, and that means that a 
trivial 'like' decorator can make it very easy to write stateless adapters.

What I'm suggesting is effectively replacing PEP 246's global registry with 
one that can generate a stateless adapter from individual operation 
declarations.  But it can still fall back on __conform__ and __adapt__ if 
there aren't any declarations, and we could also require adapt() to return 
the same adapter instance if an adapter is stateful.


>No, I asked for a real-life example. Just provided one, and I'm
>satisfied that stateful adapters can be useful.

But his example doesn't require *per-adapter* state, just 
per-original-object state.  As long as there's a clean way to support that, 
his example still works -- and in fact it works *better*, because then that 
"rare" case he spoke of will work just fine without even thinking about it.

Therefore, I think we should make it easy for stateful adapters to link 
their state to the adapted object, not the adapter instance.  This better 
matches most people's intuitive mental model of adaptation, as judged by 
the comments of people in this discussion who were new to adaptation.  If 
adapt() promised to return the same (stateful) adapter instance each time, 
then Just's "rare" example would work nicely, without a bug.


>One of my hesitations about adding adapt() and interfaces to the core
>language has always been that it would change the "flavor" of much of
>the Python programming we do and that we'd have to relearn how to
>write good code.

Exactly!  I came up with the monkey typing idea specifically to address 
this very issue, because the PEP discussion has shown that it is hard to 
learn to write good adapters, and very easy to be tempted to write bad 
ones.  If there is a very easy way to write good adapters, then it will be 
more attractive to learn about it.  If you have to do a little bit more to 
get per-object state, and then it's hardest of all to get per-adapter 
state, the model is a good match to the frequency of those use cases.

Even better, it avoids creating the concept of an interface, except that 
you want something "like" a file or a dictionary.  It's the first Python 
"interface" proposal I know of that can actually spell the loose notion of 
"file-like" in a concretely useful way!

I think the concept can be extended slightly to work with stateful 
(per-object) adapters, though I'll have to give it some thought and 
prototyping.


>I don't believe for a second that all stateful adapters
>are bad,

Neither do I.  It's *per-adapter-instance* state that's bad, or at least 
that no good use cases have yet been shown for.  If we can make it easy to 
have *per-adapted-object* state, or guarantee "same-adapter return", then 
that's even better.

For example, if there were a weak reference dictionary mapping objects to 
their (stateful) adapters, then adapt() could always return the same 
adapter instance for a given source object, thus guaranteeing a single state.

Of course, this would also imply that adapt() needs to know that an adapter 
is stateful, so that it doesn't keep around lots of trivial stateless 
adapters.  Thus, there should be a little more effort required to create 
this kind of adapter (i.e., you need to say that it's stateful).

By the way, I've encountered the need for *this* kind of stateful adapter 
more than once;
PyProtocols has a notion of a StickyAdapter, that keeps per-adapted-object 
state, which is sometimes needed because you can't hold on to the "same 
adapter" for some reason.  The StickyAdapter attaches itself to the 
original object, such that when you adapt that object again, you always get 
the same StickyAdapter instance.  In basically all the use cases I found 
where there's a *really* stateful adapter, I'm using a StickyAdapter, not 
trying to have per-adapter-instance state.

So, what I'm suggesting is that we make it ridiculously easy for somebody 
to create adapters that either have no state, or that have "sticky" state, 
and make it obscure at best to create one that has per-adapter-instance 
state, because nobody has yet presented an example of per-adapter-instance 
state that wasn't either 1) clearly abuse or 2) would be problematic if 
adapt() always returned the same adapter instance.
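
A rough sketch of the weak-reference registry idea (all names hypothetical,
not PEP 246's actual protocol): adapt() consults a WeakKeyDictionary so a
given source object always yields the same stateful adapter, and the adapter
keeps only a weak back-reference so the registry doesn't pin the subject
alive:

```python
import weakref

# Maps each source object to its single "sticky" adapter instance.
_adapters = weakref.WeakKeyDictionary()

def adapt_sticky(obj, adapter_factory):
    """Always return the same adapter instance for a given source object."""
    try:
        return _adapters[obj]
    except KeyError:
        adapter = _adapters[obj] = adapter_factory(obj)
        return adapter

class SegmentPen:
    """Stand-in for the pen example from the thread."""

class PointPenAdapter:
    def __init__(self, pen):
        # A weak back-reference avoids a strong cycle through the registry
        # (dict value -> adapter -> subject would otherwise leak).
        self._pen = weakref.ref(pen)
        self.current_path = []  # per-adapted-object state lives here

pen = SegmentPen()
a1 = adapt_sticky(pen, PointPenAdapter)
a2 = adapt_sticky(pen, PointPenAdapter)
assert a1 is a2  # both callers see one adapter, hence one shared state
```

This makes Just's "rare" case of adapting the same pen twice behave as a
naive reader would expect, since both adaptations share one state.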

From pje at telecommunity.com  Fri Jan 14 18:36:12 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Jan 14 18:34:40 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114163900.GA21005@vicky.ecs.soton.ac.uk>
References: <5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com>
	<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050114122923.03b55cd0@mail.telecommunity.com>

At 04:39 PM 1/14/05 +0000, Armin Rigo wrote:
>Ideally, both the caller and the callee know (and write down) that the
>function's argument is a "reference to some kind of file stuff", a very
>general concept; then they can independently specify which concrete object
>they expect and provide, e.g. "a string naming a file", "a file-like object",
>"a string containing the data".

Yes, exactly!  That's what I mean by "one use case, one interface".  But as 
you say, that's because we don't currently have a way to separate these ideas.

So, in developing with PyProtocols, I create a new interface for each 
concept, possibly allowing adapters for some other interface to supply 
default implementations for that concept.  But, for things like strings and 
such, I define direct adapters to the new concept, so that they override 
any "generic" adapters as you call them.

So, I have a path that looks like:

concreteType -> functionalInterface -> conceptInterface

Except that there's also a shorter concreteType -> conceptInterface path 
for various types like string, thus providing 
context-sensitivity.  (Interestingly, strings are the *most* common 
instance of this situation, as they're one of the most "open to 
interpretation" objects you can have!)


> > It also works for the module to define a target interface and register an
> > adapter to that, and introduces less complexity into the adaptation system.
>
>Makes sense, but my fear is that people will soon register generic adapters
>all around...  debugging nightmares!

Well, if you have "interface per concept", you *have* a context; the 
context is the concept itself.

From cce at clarkevans.com  Fri Jan 14 18:41:32 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Fri Jan 14 18:41:39 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114163900.GA21005@vicky.ecs.soton.ac.uk>
References: <20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com>
	<20050114163900.GA21005@vicky.ecs.soton.ac.uk>
Message-ID: <20050114174132.GA46344@prometheusresearch.com>

On Fri, Jan 14, 2005 at 04:39:00PM +0000, Armin Rigo wrote:
| I'm trying to reserve the usage of "interface" to something more 
| concrete: the concrete ways we have to manipulate a given object 
| (typically a set of methods including some unwritten expectations).  

I'd say that a programmer interface intends to encapsulate both the
'concept' and the 'signature'.  The concept is indicated by the
names of the function declarations and fields, the signature by the
position and type of arguments.

| Adaptation should make [passing data between conceptually equivalent 
| interfaces?] more automatic, and nothing more.  Ideally, both the caller 
| and the callee know (and write down) that the function's argument is a 
| "reference to some kind of file stuff", a very general concept; then they 
| can independently specify which concrete object they expect and provide,
| e.g. "a string naming a file", "a file-like object", "a string containing 
| the data".

But it is quite difficult to know when two interfaces are conceptually
equivalent...

| What I see in most arguments about adaptation/conversion/cast is some kind
| of confusion that would make us believe that the concrete interface (or 
| even worse the formal one) fully defines what underlying concepts they
| represent.  It is true only for end-user application-specific classes.

It seems your distinction comes down to defining 'best practice' for
when you define an adapter... and when you don't.   Perhaps we don't
need to qualify the adapters that exist, as much as make them
transparent to the programmer.  A bad adapter will most likely be
detected _after_ a weird bug has happened.  Perhaps the adapt()
framework can provide meaningful information in these cases.

Imagine enhancing the stack-trace with additional information about
what adaptations were made; 

    Traceback (most recent call last):
       File "xxx", line 1, in foo
         Adapting x to File
       File "yyy", line 384, in bar
         Adapting x to FileName
       etc.

| In the above example, there is nothing in the general concept that helps 
| the caller to guess how a plain string will be interpreted, or 
| symmetrically that helps the callee to guess what an incoming plain
| string means.  In my opinion this should fail, in favor of something 
| more explicit.  It's already a problem without any third party.

How can we express your thoughts so that they fit into a narrative
describing how adapt() should and should not be used?  If you could
respond by re-posting your idea with the 'average python programmer'
as your audience it would help me quite a bit when summarizing your
contribution to the thread.

Best,

Clark
From cce at clarkevans.com  Fri Jan 14 18:41:50 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Fri Jan 14 18:41:53 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com>
References: <20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com>
	<5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com>
Message-ID: <20050114174149.GB21254@prometheusresearch.com>

On Fri, Jan 14, 2005 at 12:28:16PM -0500, Phillip J. Eby wrote:
| At 08:32 AM 1/14/05 -0800, Guido van Rossum wrote:
| >I have no desire to add syntax
| >complexities like this to satisfy some kind of theoretically nice
| >property.
| 
| Whether it's syntax or a decorator, it allows you to create stateless 
| adapters without needing to write individual adapter *classes*, or even 
| having an explicit notion of an "interface" to adapt to.  That is, it 
| makes it very easy to write a "good" adapter; you can do it without even 
| trying.  The point isn't to make it impossible to write a "bad" adapter, 
| it's to make it more attractive to write a good one.

Phillip, 

May I suggest that you write this up as a PEP?  Being dead in the
water isn't always fatal.  Right now your ideas are still very
fuzzy, and by forcing yourself to come up with a narrative, semantics
section, minimal implementation, and examples, you will go a long way
toward both refining your idea and allowing others to better
understand what you're proposing.

Cheers,

Clark
From just at letterror.com  Fri Jan 14 18:56:52 2005
From: just at letterror.com (Just van Rossum)
Date: Fri Jan 14 18:56:57 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com>
Message-ID: <r01050400-1037-ACCCDAA0665511D9AEA9003065D5E7E4@[10.0.0.23]>

Phillip J. Eby wrote:

> For example, if there were a weak reference dictionary mapping
> objects to their (stateful) adapters, then adapt() could always
> return the same adapter instance for a given source object, thus
> guaranteeing a single state.

Wouldn't that tie the lifetime of the adapter object to that of the
source object?

Possibly naive question: is using adaptation to go from iterable to
iterator abuse? That would be a clear example of per-adapter state.

Just
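
Just's closing question can be made concrete: the iteration protocol is
itself shaped like a stateful per-adapter adaptation, since each call to
iter() manufactures a fresh object whose only state is a cursor. A toy
sketch (class names invented for illustration):

```python
class Range:
    """Toy iterable, standing in for any adaptable source object."""
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        # "Adapting" the iterable to the iterator interface: every call
        # returns a brand-new adapter with its own per-adapter state.
        return RangeIterator(self)

class RangeIterator:
    def __init__(self, r):
        self.r = r
        self.i = 0  # the per-adapter state: a cursor
    def __iter__(self):
        return self
    def __next__(self):
        if self.i >= self.r.n:
            raise StopIteration
        self.i += 1
        return self.i - 1

r = Range(3)
i1, i2 = iter(r), iter(r)
next(i1)              # advancing one adapter...
assert next(i2) == 0  # ...does not move the other's cursor
```

If adapt() promised same-adapter return here, two iterations over one
iterable would interfere, which is exactly why this case is interesting.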
From bac at OCF.Berkeley.EDU  Fri Jan 14 19:04:05 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Jan 14 19:04:18 2005
Subject: [Python-Dev] redux: fractional seconds in strptime
In-Reply-To: <16871.37525.981821.580939@montanaro.dyndns.org>
References: <16870.61059.451494.303971@montanaro.dyndns.org>	<41E74790.60108@ocf.berkeley.edu>
	<16871.37525.981821.580939@montanaro.dyndns.org>
Message-ID: <41E80995.5030901@ocf.berkeley.edu>

Skip Montanaro wrote:
>     Brett> The problem I have always had with this proposal is that the
>     Brett> value is worthless, time tuples do not have a slot for fractional
>     Brett> seconds.  Yes, it could possibly be changed to return a float for
>     Brett> seconds, but that could possibly break things.
> 
> Actually, time.strptime() returns a struct_time object.  Would it be
> possible to extend %S to parse floats then add a microseconds (or whatever)
> field to struct_time objects that is available by attribute only?  In Py3k
> it could worm its way into the tuple representation somehow (either as a new
> field or by returning seconds as a float).
> 

Right, it's a struct_time object; just force of habit to call it a time tuple.

And I technically don't see why a fractional second attribute could not be 
added that is not represented in the tuple.  But I personally would like to see 
struct_tm eliminated in Py3k and replaced with datetime usage.  My wish is to 
have the 'time' module stripped down to only the bare essentials that just 
don't fit in datetime and push everyone to use datetime for most things.

>     Brett> My vote is that if something is added it be like %N but without
>     Brett> the optional optional digit count.  This allows any separator to
>     Brett> be used while still consuming the digits.  It also doesn't
>     Brett> suddenly add optional args which are not supported for any other
>     Brett> directive.
> 
> I realize the %4N notation is distasteful, but without it I think you will
> have trouble parsing something like
> 
>     13:02:00.704
> 
> What would be the format string?  %H:%M:%S.%N would be incorrect.

Why is that incorrect?

-Brett
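
For comparison, the datetime module's strptime (where a fractional-seconds
directive did eventually land, spelled %f) parses the disputed example
without any %4N-style digit-count modifier:

```python
from datetime import datetime

# %f consumes one to six fractional-second digits, so ".704" needs no
# explicit width; the '.' separator in the format string delimits it.
t = datetime.strptime("13:02:00.704", "%H:%M:%S.%f")
print(t.hour, t.minute, t.second, t.microsecond)
```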
From aahz at pythoncraft.com  Fri Jan 14 19:11:01 2005
From: aahz at pythoncraft.com (Aahz)
Date: Fri Jan 14 19:11:03 2005
Subject: [Python-Dev] redux: fractional seconds in strptime
In-Reply-To: <41E80995.5030901@ocf.berkeley.edu>
References: <16870.61059.451494.303971@montanaro.dyndns.org>
	<41E74790.60108@ocf.berkeley.edu>
	<16871.37525.981821.580939@montanaro.dyndns.org>
	<41E80995.5030901@ocf.berkeley.edu>
Message-ID: <20050114181101.GB21486@panix.com>

On Fri, Jan 14, 2005, Brett C. wrote:
>
> Right, it's a struct_time object; just force of habit to call it a time 
> tuple.
> 
> And I technically don't see why a fractional second attribute could not be 
> added that is not represented in the tuple.  But I personally would like to 
> see struct_tm eliminated in Py3k and replaced with datetime usage.  My wish 
> is to have the 'time' module stripped down to only the bare essentials that 
> just don't fit in datetime and push everyone to use datetime for most 
> things.

Because of people doing things like

year, month, day, hour, min, sec, junk, junk, junk = time.localtime()
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing."  --Alan Perlis
From michel at dialnetwork.com  Fri Jan 14 19:02:39 2005
From: michel at dialnetwork.com (Michel Pelletier)
Date: Fri Jan 14 19:18:50 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114090955.0BCD41E400D@bag.python.org>
References: <20050114090955.0BCD41E400D@bag.python.org>
Message-ID: <200501141002.39525.michel@dialnetwork.com>

> Date: Fri, 14 Jan 2005 02:38:05 -0500
> From: "Phillip J. Eby" <pje@telecommunity.com>
> Subject: Re: [Python-Dev] PEP 246: lossless and stateless
> To: guido@python.org
> Cc: "Clark C. Evans" <cce@clarkevans.com>, python-dev@python.org
> Message-ID: <5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com>
> Content-Type: text/plain; charset="us-ascii"; format=flowed
> 
> Each of these examples registers the function as an implementation of the 
> "file.read" operation for the appropriate type. ?When you want to build an 
> adapter from SomeKindOfStream or from a string iterator to the "file" type, 
> you just access the 'file' type's descriptors, and look up the 
> implementation registered for that descriptor for the source type 
> (SomeKindOfStream or string-iter).  If there is no implementation 
> registered for a particular descriptor of 'file', you leave the 
> corresponding attribute off of the adapter class, resulting in a class 
> representing the subset of 'file' that can be obtained for the source class.
> 
> The result is that you generate a simple adapter class whose only state is 
> a read-only slot pointing to the adapted object, and descriptors that bind 
> the registered implementations to that object.  That is, the descriptor 
> returns a bound instancemethod with an im_self of the original object, not 
> the adapter.  (Thus the implementation never even gets a reference to the 
> adapter, unless 'self' in the method is declared of the same type as the 
> adapter, which would be the case for an abstract method like 'readline()' 
> being implemented in terms of 'read'.)
> 
> Anyway, it's therefore trivially "guaranteed" to be stateless (in the same 
> way that an 'int' is "guaranteed" to be immutable), and the implementation 
> is also "guaranteed" to be able to always get back the "original" object.
> 
> Defining adaptation in terms of adapting operations also solves another 
> common problem with interface mechanisms for Python: the dreaded "mapping 
> interface" and "file-like object" problem. ?Really, being able to 
> *incompletely* implement an interface is often quite useful in practice, so 
> this "monkey see, monkey do" typing ditches the whole concept of a complete 
> interface in favor of "explicit duck typing".  You're just declaring "how 
> can X act 'like' a duck" -- emulating behaviors of another type rather than 
> converting structure.

I get it!  Your last description didn't quite sink in but this one does and 
I've been thinking about this quite a bit, and I like it.  I'm starting to 
see how it nicely sidesteps the problems discussed in the thread so far. 

Partial implementation of interfaces (read: implementing only the operations 
you care about on a method-by-method basis instead of an entire interface) 
really is very useful and feels quite pythonic to me.  After all, in most 
cases of substitutability in Python (in my experience), it's not the *type* 
you do anything with, but that type's operations.  Does anyone know of any 
other languages that take this "operational" approach to solving the 
substitutability problem?

There seem to be some downsides vs. interfaces (I think), such as the lack of 
the "it's documentation too" aspect; I find Zope 3's interfaces.py modules 
the best way to learn about it.  But again, the upside is no complex 
interface relationships just to define the subtle variations of "mapping", 
and users can always just say help(file.read). 

Another thing I see used fairly commonly are marker interfaces.  While I'm not 
sure of their overall usefulness, I don't see how they can be done using your 
operational scheme.  Maybe that means they were a bad idea in the first 
place.

I also think this is easier for beginners to understand: instead of "you have 
to implement this interface, look at it over here, that's the 'file' 
interface, now you implement that in your object and you better do it all 
right", you just tell them "call your method 'read' and say it's 'like 
file.read' and your thing will work where any file can be read".

-Michel
From skip at pobox.com  Fri Jan 14 19:26:02 2005
From: skip at pobox.com (Skip Montanaro)
Date: Fri Jan 14 19:26:15 2005
Subject: [Python-Dev] redux: fractional seconds in strptime
In-Reply-To: <41E80995.5030901@ocf.berkeley.edu>
References: <16870.61059.451494.303971@montanaro.dyndns.org>
	<41E74790.60108@ocf.berkeley.edu>
	<16871.37525.981821.580939@montanaro.dyndns.org>
	<41E80995.5030901@ocf.berkeley.edu>
Message-ID: <16872.3770.25143.582154@montanaro.dyndns.org>


    >> I realize the %4N notation is distasteful, but without it I think you
    >> will have trouble parsing something like
    >> 
    >> 13:02:00.704
    >> 
    >> What would be the format string?  %H:%M:%S.%N would be incorrect.

    Brett> Why is that incorrect?

Because "704" represents the number of milliseconds, not the number of
nanoseconds.

I'm sure that in some applications people are interested in extremely short
time scales.  Writing out hours, minutes and seconds when all you are
concerned with are small fractions of seconds (think high energy physics)
would be a waste.  In those situations log entries like

    704 saw proton
    705 proton hit neutron
    706 saw electron headed toward Saturn

might make perfect sense.  Parsing the time field entirely within
time.strptime would be at least clumsy if you couldn't tell it the scale of
the numbers you're dealing with.  Parsing with %N, %3N or %6N would give
different values (nanoseconds, milliseconds or microseconds).
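In the absence of any fractional-second directive in today's time.strptime, the scale really does have to come from somewhere.  A workaround sketch (the helper name is mine, not a proposed API): split off the fraction and scale it by its digit count by hand.

```python
import time

def parse_with_fraction(s, fmt="%H:%M:%S"):
    # Hypothetical helper: strptime has no fractional-second directive,
    # so split off the suffix and infer its scale from its length.
    if "." in s:
        main, digits = s.split(".", 1)
        fraction = int(digits) / 10.0 ** len(digits)  # "704" -> 0.704 s
    else:
        main, fraction = s, 0.0
    return time.strptime(main, fmt), fraction

parsed, frac = parse_with_fraction("13:02:00.704")
# parsed.tm_hour == 13, frac == 0.704
```

This is exactly the bookkeeping that a %N-with-width directive would move into strptime itself.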

Skip

From aleax at aleax.it  Fri Jan 14 19:29:53 2005
From: aleax at aleax.it (Alex Martelli)
Date: Fri Jan 14 19:29:58 2005
Subject: [Python-Dev] redux: fractional seconds in strptime
In-Reply-To: <20050114181101.GB21486@panix.com>
References: <16870.61059.451494.303971@montanaro.dyndns.org>
	<41E74790.60108@ocf.berkeley.edu>
	<16871.37525.981821.580939@montanaro.dyndns.org>
	<41E80995.5030901@ocf.berkeley.edu>
	<20050114181101.GB21486@panix.com>
Message-ID: <48A86BD2-665A-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 14, at 19:11, Aahz wrote:

> On Fri, Jan 14, 2005, Brett C. wrote:
>>
>> Right, it's a struct_time object; just force of habit to call it a 
>> time
>> tuple.
>>
>> And I technically don't see why a fractional second attribute could 
>> not be
>> added that is not represented in the tuple.  But I personally would 
>> like to
>> see struct_tm eliminated in Py3k and replaced with datetime usage.  
>> My wish
>> is to have the 'time' module stripped down to only the bare 
>> essentials that
>> just don't fit in datetime and push everyone to use datetime for most
>> things.
>
> Because of people doing things like
>
> year, month, day, hour, min, sec, junk, junk, junk = time.localtime()

And why would that be a problem?  It would keep working just like 
today, assuming you're answering the "don't see why" part.  From the 
start, we discussed fractional seconds being available only as an 
ATTRIBUTE of a struct_time, not an ITEM (== iteration over a struct_time 
will keep working just like now).
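The attribute-vs-item point can be checked directly today: struct_time already exposes named attributes alongside its fixed nine-item tuple form, so an attribute that is not part of the tuple would leave unpacking untouched.  A quick illustration:

```python
import time

t = time.localtime()
# Tuple unpacking sees exactly the nine classic fields...
year, month, day, hour, minute, sec, wday, yday, isdst = t
assert len(tuple(t)) == 9
# ...while named attributes work independently of iteration.
assert year == t.tm_year and sec == t.tm_sec
```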


Alex

From aahz at pythoncraft.com  Fri Jan 14 19:39:43 2005
From: aahz at pythoncraft.com (Aahz)
Date: Fri Jan 14 19:39:46 2005
Subject: [Python-Dev] redux: fractional seconds in strptime
In-Reply-To: <48A86BD2-665A-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <16870.61059.451494.303971@montanaro.dyndns.org>
	<41E74790.60108@ocf.berkeley.edu>
	<16871.37525.981821.580939@montanaro.dyndns.org>
	<41E80995.5030901@ocf.berkeley.edu>
	<20050114181101.GB21486@panix.com>
	<48A86BD2-665A-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <20050114183943.GA10564@panix.com>

On Fri, Jan 14, 2005, Alex Martelli wrote:
> On 2005 Jan 14, at 19:11, Aahz wrote:
>>On Fri, Jan 14, 2005, Brett C. wrote:
>>>
>>>Right, it's a struct_time object; just force of habit to call it a
>>>time tuple.
>>>
>>>And I technically don't see why a fractional second attribute could
>>>not be added that is not represented in the tuple.  But I personally
>>>would like to see struct_tm eliminated in Py3k and replaced with
>>>datetime usage.  My wish is to have the 'time' module stripped down
>>>to only the bare essentials that just don't fit in datetime and push
>>>everyone to use datetime for most things.
>>
>>Because of people doing things like
>>
>>year, month, day, hour, min, sec, junk, junk, junk = time.localtime()
>
> And why would that be a problem?  It would keep working just like
> today, assuming you're answering the "don't see why" part.  From the
> start, we discussed fractional seconds being available only as an
> ATTRIBUTE of a struct_time, not an ITEM (== iteration over a struct_time
> will keep working just like now).

Uh, I missed the second "not" in Brett's first sentence of second
paragraph.  Never mind! </litella>
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing."  --Alan Perlis
From cce at clarkevans.com  Fri Jan 14 20:25:11 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Fri Jan 14 20:25:14 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <200501141002.39525.michel@dialnetwork.com>
References: <20050114090955.0BCD41E400D@bag.python.org>
	<200501141002.39525.michel@dialnetwork.com>
Message-ID: <20050114192511.GD21254@prometheusresearch.com>

On Fri, Jan 14, 2005 at 10:02:39AM -0800, Michel Pelletier wrote:
| Phillip J. Eby wrote:
| > The result is that you generate a simple adapter class whose 
| > only state is a read-only slot pointing to the adapted object,
| > and descriptors that bind the registered implementations to that object.

it has only the functions in the interface, plus the adaptee; all
requests through the functions are forwarded on to their equivalent
in the adaptee; sounds a lot like the adapter pattern ;)

| I get it!  Your last description didn't quite sink in but this one does 
| and I've been thinking about this quite a bit, and I like it.  I'm 
| starting to see how it nicely sidesteps the problems discussed in 
| the thread so far. 

I'm not sure what else this mechanism provides, besides limiting
adapters so that they cannot maintain their own state.

| Does anyone know of any other languages that take this "operational" 
| approach to solving the substitutability problem?

Microsoft's COM?

| I also think this is easier for beginners to understand, instead of 
| "you have to implement this interface, look at it over here, 
| that's the "file" interface, now you implement that in your object
| and you better do it all right" you just tell them "call your 
| method 'read' and say its 'like file.read' and your thing will work 
| where any file can be read.

A tangible example would perhaps explain it better... 

Looking forward to the PEP,

Clark
From aleax at aleax.it  Fri Jan 14 21:36:36 2005
From: aleax at aleax.it (Alex Martelli)
Date: Fri Jan 14 21:36:39 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114192511.GD21254@prometheusresearch.com>
References: <20050114090955.0BCD41E400D@bag.python.org>
	<200501141002.39525.michel@dialnetwork.com>
	<20050114192511.GD21254@prometheusresearch.com>
Message-ID: <FC4E60B2-666B-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 14, at 20:25, Clark C. Evans wrote:

> | Does anyone know of any other languages that take this "operational"
> | aproach to solving the substitutability problem?
>
> Microsoft's COM?

I don't see the parallel: COM (QueryInterface) is strictly 
by-interface, not by-method, and has many other differences.


Alex

From pje at telecommunity.com  Fri Jan 14 21:48:01 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Jan 14 21:46:28 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <r01050400-1037-ACCCDAA0665511D9AEA9003065D5E7E4@[10.0.0.23
 ]>
References: <5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050114153236.0335fc50@mail.telecommunity.com>

At 06:56 PM 1/14/05 +0100, Just van Rossum wrote:
>Phillip J. Eby wrote:
>
> > For example, if there were a weak reference dictionary mapping
> > objects to their (stateful) adapters, then adapt() could always
> > return the same adapter instance for a given source object, thus
> > guaranteeing a single state.
>
>Wouldn't that tie the lifetime of the adapter object to that of the
>source object?

Well, you also need to keep the object alive if the adapter is still 
hanging around.  I'll get to implementation details and alternatives in the 
PEP.


>Possibly naive question: is using adaptation to go from iterable to
>iterator abuse? That would be a clear example of per-adapter state.

I don't know if it's abuse per se, but I do know that specifying whether a 
routine takes an iterable or can accept an iterator is often something 
important to point out, and it's a requirement that back-propagates through 
code, forcing explicit management of the iterator's state.

So, if you were going to do some kind of adaptation with iterators, it 
would be much more useful IMO to adapt the *other* way, to turn an iterator 
into a reiterable.  Coincidentally, a reiterable would create per-object 
state.  :)

In other words, if you *did* consider iterators to be adaptation, it seems 
to me an example of wanting to be explicit about when the adapter gets 
created, if its state is per-adapter.  And the reverse scenario 
(iterator->reiterable) is an example of adaptation where shared state could 
solve a problem for you if it's done implicitly.  (E.g. by declaring that 
you take a reiterable, but allowing people to pass in iterators.)
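The iterator-to-reiterable direction can be sketched like this (a caching wrapper of my own devising, not from the PEP; note it is not safe for interleaved iteration):

```python
class Reiterable:
    """Wrap a one-shot iterator so it can be iterated repeatedly."""

    def __init__(self, iterator):
        self._iterator = iterator
        self._cache = []

    def __iter__(self):
        # Replay already-seen items, then pull (and cache) fresh ones.
        for item in self._cache:
            yield item
        for item in self._iterator:
            self._cache.append(item)
            yield item

r = Reiterable(iter([1, 2, 3]))
assert list(r) == [1, 2, 3]
assert list(r) == [1, 2, 3]   # underlying iterator exhausted; cache replays
```

The cache is exactly the per-object state Phillip describes: it belongs with the source iterator, not with any particular consumer.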


From pje at telecommunity.com  Fri Jan 14 21:50:03 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Jan 14 21:48:29 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114174149.GB21254@prometheusresearch.com>
References: <5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com>
	<20050113182142.GC35655@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com>
	<5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050114154836.03479c90@mail.telecommunity.com>

At 12:41 PM 1/14/05 -0500, Clark C. Evans wrote:
>May I suggest that you write this up as a PEP?

Already committed to it for this weekend, but my statement was buried in a 
deep thread between Alex and me, so you might've missed it.

From pje at telecommunity.com  Fri Jan 14 22:12:48 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Jan 14 22:11:31 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <200501141002.39525.michel@dialnetwork.com>
References: <20050114090955.0BCD41E400D@bag.python.org>
	<20050114090955.0BCD41E400D@bag.python.org>
Message-ID: <5.1.1.6.0.20050114155150.0347e140@mail.telecommunity.com>

At 10:02 AM 1/14/05 -0800, Michel Pelletier wrote:
>I get it!

Thanks for the positive feedback, I was getting worried that I had perhaps 
gone quite insane during the great debate.  :)


>   Your last description didn't quite sink in but this one does and
>I've been thinking about this quite a bit, and I like it.  I'm starting to
>see how it nicely sidesteps the problems discussed in the thread so far.

Good, I'll try to cherry-pick from that post when writing the PEP.


>Does anyone know of any
>other languages that take this "operational" aproach to solving the
>substitutability problem?

This can be viewed as a straightforward extension of the COM or Java type 
models in two specific ways:

1) You can implement an interface incompletely, but still receive a partial 
adapter
2) Third parties can supply implementations of individual operations

Everything else is pretty much like COM pointers to interfaces, or Java 
casting.

Of course, as a consequence of #1, you also have to declare conformance 
per-operation rather than per-interface, but some syntactic sugar for 
declaring a block of methods would be helpful.  But that's a detail; 
declaring support for an interface in COM or Java is just "like" 
automatically adding all the individual "like" declarations.
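A toy sketch of such per-operation "like" declarations (every name here is hypothetical, not the PEP's API): implementations are keyed on a (target descriptor, source type) pair, and the adapter class is assembled from whatever happens to be registered, so partial adapters fall out for free.

```python
_registry = {}

class FileLike:
    """Stand-in abstract target type."""
    def read(self, n=-1):
        """Read up to n characters."""

def like(target_op, source_type):
    """Declare the decorated function as source_type's version of target_op."""
    def decorator(func):
        _registry[(target_op, source_type)] = func
        return func
    return decorator

def adapt(obj, target):
    """Assemble a (possibly partial) adapter from registered operations."""
    members = {}
    for name, descriptor in vars(target).items():
        impl = _registry.get((descriptor, type(obj)))
        if impl is not None:
            # Bind the implementation to the adaptee, not the adapter.
            members[name] = (lambda f: lambda self, *a: f(self._subject, *a))(impl)
    adapter = type("Adapter", (), members)()
    adapter._subject = obj
    return adapter

class StringSource:
    def __init__(self, data):
        self.data = data

@like(FileLike.read, StringSource)
def read_string(src, n=-1):
    out, src.data = (src.data, "") if n < 0 else (src.data[:n], src.data[n:])
    return out

f = adapt(StringSource("hello"), FileLike)
assert f.read(2) == "he"
assert f.read() == "llo"
```

Operations of FileLike with no registered implementation for StringSource are simply left off the adapter, which is the "partial adapter" behavior described above.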

Alternatively, you can look at this as a dumbed-down version of protocols 
or typeclasses in functional languages that use generic or polymorphic 
operations as the basis of their type system.  E.g. in Haskell a 
"typeclass" categorizes types by common operations that are available to 
them.  For example the 'Ord' typeclass represents types that have ordering 
via operations like <, >, and so forth.  However, you don't go and declare 
that a type is in the 'Ord' typeclass, what you do is *implement* those 
operations (which may be by defining how to call some other operation the 
type has) and the type is automatically then considered to be in the typeclass.

(At least, that's my understanding as a non-Haskell developer who's skimmed 
exactly one tutorial on the language!  I could be totally misinterpreting 
what I read.)

Anyway, all of these systems were inspirations, if that's what you're asking.


>There seem to be some downsides vs. interfaces (I think) the lack of "it's
>documentation too" aspect, I find zope 3 interfaces.py modules the best way
>to learn about it, but again the upside is, no complex interface
>relationships just to define the subtle variations of "mapping" and users can
>always just say help(file.read).

It doesn't *stop* you from using interfaces of whatever stripe for 
documentation, though.  The target type can be abstract.  All that's 
required is that it *be* a type (and that restriction might be loosen-able 
via an adapter!) and that it have descriptors that will indicate the 
callable operations.  So Zope interfaces still work: the descriptor that 
something is "like" can perfectly well be an empty function with a 
docstring, as it is in a Zope or PyProtocols interface.


>Another thing I see used fairly commonly are marker interfaces.  While I'm 
>not
>sure of their overall usefulness I don't see how they can be done using your
>operational scheme.

Add an operation to them, or an attribute like 'isFoo'.  Then declare an 
implementation that returns true, if the appropriate object state 
matches.  (I presume you're talking about Zope's per-instance marker 
interfaces that come and go based on object state.)


>   Maybe that means they were a bad idea in the first
>place.

Probably so! But they can still be done, if you really need one.  You just 
have to recast it in terms of some kind of operation or attribute.


>I also think this is easier for beginners to understand, instead of "you have
>to implement this interface, look at it over here, that's the "file"
>interface, now you implement that in your object and you better do it all
>right" you just tell them "call your method 'read' and say its 'like
>file.read' and your thing will work where any file can be read.

You don't even need to call it read; you could use the word "read" in the 
non-English language of your choice; any code that wants a "file" will 
still be able to invoke it using "read".  (And English speakers will at 
least know they're looking at code that's "like" file.read.)

From pje at telecommunity.com  Fri Jan 14 22:29:28 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Jan 14 22:27:54 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114192511.GD21254@prometheusresearch.com>
References: <200501141002.39525.michel@dialnetwork.com>
	<20050114090955.0BCD41E400D@bag.python.org>
	<200501141002.39525.michel@dialnetwork.com>
Message-ID: <5.1.1.6.0.20050114161325.0347ac20@mail.telecommunity.com>

At 02:25 PM 1/14/05 -0500, Clark C. Evans wrote:
>I'm not sure what else this mechanism provides; besides limiting
>adapters so that they cannot maintain their own state.

* No need to write adapter classes for stateless adapters; just declare methods

* Allows partial adapters to be written for e.g. "file-like" objects 
without creating lots of mini-interfaces and somehow relating them all

* No need to explain the concept of "interface" to somebody who just knows 
that the routine they're calling needs a "file" and they need to make their 
object "work like" a file in some way.  (That is, more supportive of 
"programming for everybody")

* Supports using either concrete or abstract types as effective interfaces

* Doesn't require us to create explicit interfaces for the entire stdlib, 
if saying something's "like" an existing abstract or concrete type suffices!

* Supports abstract operations like "dict.update" that can automatically 
flesh out partial adapters (i.e., if you have an object with an operation 
"like" dict.__setitem__, then a generic dict.update can be used to complete 
your adaptation)

* Doesn't require anybody to write __conform__ or __adapt__ methods in 
order to get started with adaptation.

This is really more of a replacement for PEP 245 than 246 in some ways, but 
of course it relates to 246 also, since the idea would basically be to 
integrate it with the "global registry" described in 246.  In other words, 
"like" declarations should populate the global registry, and in such a way 
that state is unified for (per-object) stateful adapters.

From bac at OCF.Berkeley.EDU  Fri Jan 14 22:50:48 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Jan 14 22:51:02 2005
Subject: [Python-Dev] redux: fractional seconds in strptime
In-Reply-To: <16872.3770.25143.582154@montanaro.dyndns.org>
References: <16870.61059.451494.303971@montanaro.dyndns.org>	<41E74790.60108@ocf.berkeley.edu>	<16871.37525.981821.580939@montanaro.dyndns.org>	<41E80995.5030901@ocf.berkeley.edu>
	<16872.3770.25143.582154@montanaro.dyndns.org>
Message-ID: <41E83EB8.8060405@ocf.berkeley.edu>

Skip Montanaro wrote:
>     >> I realize the %4N notation is distasteful, but without it I think you
>     >> will have trouble parsing something like
>     >> 
>     >> 13:02:00.704
>     >> 
>     >> What would be the format string?  %H:%M:%S.%N would be incorrect.
> 
>     Brett> Why is that incorrect?
> 
> Because "704" represents the number of milliseconds, not the number of
> nanoseconds.
> 
> I'm sure that in some applications people are interested in extremely short
> time scales.  Writing out hours, minutes and seconds when all you are
> concerned with are small fractions of seconds (think high energy physics)
> would be a waste.  In those situations log entries like
> 
>     704 saw proton
>     705 proton hit neutron
>     706 saw electron headed toward Saturn
> 
> might make perfect sense.  Parsing the time field entirely within
> time.strptime would be at least clumsy if you couldn't tell it the scale of
> the numbers you're dealing with.  Parsing with %N, %3N or %6N would give
> different values (nanoseconds, milliseconds or microseconds).
> 

Fine, but couldn't you also do a pass over the data after extraction to get to 
the actual result you want (so parse, and take the millisecond value and 
multiply by the proper scale)?  This feels like it is YAGNI, or at least KISS. 
  If you want to handle milliseconds because of the logging module, fine.  But 
trying to deal with all possible time parsing possibilities is painful and 
usually not needed.

Personally I am more inclined to add a new directive that acts as %S but allows 
for an optional decimal point, comma, or the current locale's separator if it 
isn't one of those two, which will handle the logging package's optional decimal 
output (r'\d+([,.%s]\d+)?' % locale.localeconv()['decimal_point']).  It also 
doesn't break any existing code.
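The pattern Brett sketches can be tried out directly (with the decimal point escaped, in case a locale's separator were a regex metacharacter):

```python
import locale
import re

locale.setlocale(locale.LC_ALL, "C")
dec = re.escape(locale.localeconv()["decimal_point"])
# Seconds with an optional fraction after '.', ',' or the locale separator.
seconds = re.compile(r"\d+([,.%s]\d+)?" % dec)

assert seconds.match("00.704").group() == "00.704"
assert seconds.match("00,704").group() == "00,704"
assert seconds.match("42").group() == "42"
```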

And an issue I forgot to mention in all of this: it will break symmetry with 
time.strftime().  If symmetry is kept, then an extra step will need to be 
handled in strftime, since whatever solution we choose will no longer match 
the C spec.

-Brett
From glyph at divmod.com  Sat Jan 15 01:02:52 2005
From: glyph at divmod.com (Glyph Lefkowitz)
Date: Sat Jan 15 00:58:47 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
References: <ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
Message-ID: <1105747372.13655.93.camel@localhost>

On Fri, 2005-01-14 at 10:07 -0500, Phillip J. Eby wrote: 

> Maybe I'm missing something, but for those interfaces, isn't it okay to 
> keep the state in the *adapted* object here?  In other words, if PointPen 
> just added some private attributes to store the extra data?

I have been following this discussion with quite a lot of interest, and
I have to confess that a lot of what's being discussed is confusing me.
I use stateful adapters quite a bit - Twisted has long had a concept of
"sticky" adapters (they are called "persistent" in Twisted, but I think
I prefer "sticky").  Sometimes my persistent adapters are sticky,
sometimes not.  Just's example of iter() as an adaptation is a good
example of a non-sticky stateful adaptation, but this example I found
interesting, because it seems that the value-judgement of stateless
adapters as "good" is distorting design practices to make other
mistakes, just to remove state from adapters.  I can't understand why
PJE thinks - and why there seems to be a general consensus emerging -
that stateless adapters are intrinsically better.

For the sake of argument, let's say that SegmentPen is a C type, which
does not have a __dict__, and that PointPen is a Python adapter for it,
in a different project.

Now, we have nowhere to hide PointPen's state on SegmentPen - and why
were we trying to in the first place?  It's a horrible breach of
encapsulation.  The whole *point* of adapters is to convert between
*different* interfaces, not merely to rename methods on the same
interface, or to add extra methods that work on the same data.  To me,
"different interfaces" means that the actual meaning of the operations
is different - sometimes subtly, sometimes dramatically.  There has to
be enough information in one interface to get started on the
implementation of another, but the fact that such information is
necessary doesn't mean it is sufficient.  It doesn't mean that there is
enough information in the original object to provide a complete
implementation of a different interface.

If there were enough information, why not just implement all of your
interfaces on the original class?  In the case of our hypothetical
cSegmentPen, we *already* have to modify the implementation of the
original class to satisfy the needs of a "stateless" adapter.  When
you're modifying cSegmentPen, why not just add the methods that you
wanted in the first place?

Here's another example: I have a business logic class which lives in an
object database, typically used for a web application.  I convert this
into a desktop application.  Now, I want to adapt IBusinessThunk to
IGtkUIPlug.  In the process of doing so, I have to create a GTK widget,
loaded out of some sort of resource file, and put it on the screen.  I
have to register event handlers which are associated with that adapter.

The IBusinessThunk interface doesn't specify a __dict__ attribute as
part of the interface, or the ability to set arbitrary attributes.  And
nor should it!  It is stored in an indexed database where every
attribute has to be declared, maybe, or perhaps it uses Pickle and
sticking a GTK widget into its representation would make it
un-pickleable.  Maybe it's using an O/R mapper which loses state that is
not explicitly declared or explicitly touch()ed.  There are a variety of
problems which using it in this unsupported way might create, but as the
implementor of a IGtkUIPlug, I should be concerned *only* with what
IBusinessThunk provides, which is .embezzle()
and .checkFundsAvailable().  I am not writing an adapter from
DBBusinessThunkImpl, after all, and perhaps I am receiving a test
implementation that works entirely differently.

This example gets to the heart of what makes interfaces useful to me -
model/view separation.  Although one might be hard pressed to call some
of the things I use adaptation for "views", the idea of mediated access
from a user, or from network protocol, or from some internal code acting
on behalf of a user is the overwhelming majority of my use-cases.

Most of the other use-cases I can think of are like the one James
mentions, where we really are using adaptation to shuffle around some
method names and provide simple glossing over totally isomorphic
functionality to provide backwards (or sideways, in the case of
almost-identical libraries provided on different platforms or
environments) compatibility.

For these reasons I would vastly prefer it if transitivity were declared
as a property of the *adaptation*, not of the adapter or the registry or
to be inferred from various vaguely-defined properties like
"losslessness" or "statelessness".  I am also concerned about any
proposal which introduces transitivity-based errors at adaptation time
rather than at registration time, because by then it is definitely too
late to do anything about it.

I wish I had a better suggestion, but I'm still struggling through the
rest of the thread :).

From bob at redivi.com  Sat Jan 15 01:14:37 2005
From: bob at redivi.com (Bob Ippolito)
Date: Sat Jan 15 01:14:44 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <1105747372.13655.93.camel@localhost>
References: <ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
	<1105747372.13655.93.camel@localhost>
Message-ID: <7163A634-668A-11D9-A02F-000A95BA5446@redivi.com>


On Jan 14, 2005, at 19:02, Glyph Lefkowitz wrote:

> On Fri, 2005-01-14 at 10:07 -0500, Phillip J. Eby wrote:
>
>> Maybe I'm missing something, but for those interfaces, isn't it okay 
>> to
>> keep the state in the *adapted* object here?  In other words, if 
>> PointPen
>> just added some private attributes to store the extra data?
>
> Here's another example: I have a business logic class which lives in an
> object database, typically used for a web application.  I convert this
> into a desktop application.  Now, I want to adapt IBusinessThunk to
> IGtkUIPlug.  In the process of doing so, I have to create a GTK widget,
> loaded out of some sort of resource file, and put it on the screen.  I
> have to register event handlers which are associated with that adapter.
>
> The IBusinessThunk interface doesn't specify a __dict__ attribute as
> part of the interface, or the ability to set arbitrary attributes.  And
> nor should it!  It is stored in an indexed database where every
> attribute has to be declared, maybe, or perhaps it uses Pickle and
> sticking a GTK widget into its representation would make it
> un-pickleable.  Maybe it's using an O/R mapper which loses state that 
> is
> not explicitly declared or explicitly touch()ed.  There are a variety 
> of
> problems which using it in this unsupported way might create, but as 
> the
> implementor of a IGtkUIPlug, I should be concerned *only* with what
> IBusinessThunk provides, which is .embezzle()
> and .checkFundsAvailable().  I am not writing an adapter from
> DBBusinessThunkImpl, after all, and perhaps I am receiving a test
> implementation that works entirely differently.
>
> This example gets to the heart of what makes interfaces useful to me -
> model/view separation.  Although one might be hard pressed to call some
> of the things I use adaptation for "views", the idea of mediated access
> from a user, or from network protocol, or from some internal code 
> acting
> on behalf of a user is the overwhelming majority of my use-cases.

I think the idea is that it's "better" to have an adapter from 
IBusinessThunk -> IGtkUIPlugFactory, which you can use to *create* a 
stateful object that complies with the IGtkUIPlug interface.

This way, you are explicitly creating something entirely new (derived 
from something else) with its own lifecycle and state and it should be 
managed accordingly.  This is clearly not simply putting a shell around 
an IBusinessThunk that says "act like this right now".
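A minimal sketch of the factory approach described above; every name here (BusinessThunk, GtkUIPlug, GtkUIPlugFactory) is hypothetical, standing in for the interfaces under discussion:

```python
class BusinessThunk:
    """Stand-in for an object providing IBusinessThunk."""
    def embezzle(self):
        pass
    def checkFundsAvailable(self):
        return True

class GtkUIPlug:
    """The stateful object complying (notionally) with IGtkUIPlug."""
    def __init__(self, thunk):
        self.thunk = thunk
        self.handlers = []             # per-plug state lives here

class GtkUIPlugFactory:
    """Stateless adapter: IBusinessThunk -> IGtkUIPlugFactory."""
    def __init__(self, thunk):
        self.thunk = thunk
    def __call__(self):
        return GtkUIPlug(self.thunk)   # each call: a new plug, new state

thunk = BusinessThunk()
factory = GtkUIPlugFactory(thunk)      # the (stateless) adaptation step
plug_a, plug_b = factory(), factory()  # lifecycle controlled explicitly
assert plug_a is not plug_b
```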

-bob

From steven.bethard at gmail.com  Sat Jan 15 01:37:13 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat Jan 15 01:37:16 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <1105747372.13655.93.camel@localhost>
References: <ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
	<1105747372.13655.93.camel@localhost>
Message-ID: <d11dcfba05011416373057fae1@mail.gmail.com>

On Fri, 14 Jan 2005 19:02:52 -0500, Glyph Lefkowitz <glyph@divmod.com> wrote:
> On Fri, 2005-01-14 at 10:07 -0500, Phillip J. Eby wrote:
>
> > Maybe I'm missing something, but for those interfaces, isn't it okay to
> > keep the state in the *adapted* object here?  In other words, if PointPen
> > just added some private attributes to store the extra data?
>
> I have been following this discussion with quite a lot of interest, and
> I have to confess that a lot of what's being discussed is confusing me.
> I use stateful adapters quite a bit - Twisted has long had a concept of
> "sticky" adapters (they are called "persistent" in Twisted, but I think
> I prefer "sticky").  Sometimes my persistent adapters are sticky,
> sometimes not.  Just's example of iter() as an adaptation is a good
> example of a non-sticky stateful adaptation, but this example I found
> interesting, because it seems that the value-judgement of stateless
> adapters as "good" is distorting design practices to make other
> mistakes, just to remove state from adapters.  I can't understand why
> PJE thinks - and why there seems to be a general consensus emerging -
> that stateless adapters are intrinsically better.

My feeling here was not that people thought that stateless adapters
were in general intrinsically better -- just when the adaptation was
going to be done implicitly (e.g. by type declarations).

When no state is involved, adapting an object multiple times can be
guaranteed to produce the same adapted object, so if this happens
implicitly, it's not a big deal.  When state is involved, _some_
decisions have to be made, and it seems like those decisions should be
made explicitly...

Steve
--
You can wordify anything if you just verb it.
       --- Bucky Katt, Get Fuzzy
From pje at telecommunity.com  Sat Jan 15 02:06:22 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat Jan 15 02:04:48 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <d11dcfba05011416373057fae1@mail.gmail.com>
References: <1105747372.13655.93.camel@localhost>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
	<1105747372.13655.93.camel@localhost>
Message-ID: <5.1.1.6.0.20050114195846.030bd6d0@mail.telecommunity.com>

At 05:37 PM 1/14/05 -0700, Steven Bethard wrote:
>On Fri, 14 Jan 2005 19:02:52 -0500, Glyph Lefkowitz <glyph@divmod.com> wrote:
> > On Fri, 2005-01-14 at 10:07 -0500, Phillip J. Eby wrote:
> >
> > > Maybe I'm missing something, but for those interfaces, isn't it okay to
> > > keep the state in the *adapted* object here?  In other words, if PointPen
> > > just added some private attributes to store the extra data?
> >
> > I have been following this discussion with quite a lot of interest, and
> > I have to confess that a lot of what's being discussed is confusing me.
> > I use stateful adapters quite a bit - Twisted has long had a concept of
> > "sticky" adapters (they are called "persistent" in Twisted, but I think
> > I prefer "sticky").  Sometimes my persistent adapters are sticky,
> > sometimes not.  Just's example of iter() as an adaptation is a good
> > example of a non-sticky stateful adaptation, but this example I found
> > interesting, because it seems that the value-judgement of stateless
> > adapters as "good" is distorting design practices to make other
> > mistakes, just to remove state from adapters.  I can't understand why
> > PJE thinks - and why there seems to be a general consensus emerging -
> > that stateless adapters are intrinsically better.
>
>My feeling here was not that people thought that stateless adapters
>were in general intrinsically better -- just when the adaptation was
>going to be done implicitly (e.g. by type declarations).

Yes, exactly. :)


>When no state is involved, adapting an object multiple times can be
>guaranteed to produce the same adapted object, so if this happens
>implicitly, it's not a big deal.  When state is involved, _some_
>decisions have to be made, and it seems like those decisions should be
>made explicitly...

At last someone has been able to produce a concise summary of my insane 
ramblings.  :)

Yes, this is precisely the key: implicit adaptation should always return an 
adapter with the "same" state (for some sensible meaning of "same"), 
because otherwise control of an important aspect of the system's behavior 
is too widely distributed to be able to easily tell for sure what's going 
on.  It also produces the side-effect issue of possibly introducing 
transitive adaptation, and again, that property is widely distributed and 
hard to "see".

Explicit adaptation to add per-adapter state is just fine; it's only 
*implicit* "non-sticky stateful" adaptation that creates issues.  Thus, the 
PEP I'm working on focuses on making it super-easy to make stateless and 
sticky stateful adapters with a bare minimum of declarations and interfaces 
and such.


From pje at telecommunity.com  Sat Jan 15 02:30:04 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat Jan 15 02:28:32 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <1105747372.13655.93.camel@localhost>
References: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com>

At 07:02 PM 1/14/05 -0500, Glyph Lefkowitz wrote:
>For the sake of argument, let's say that SegmentPen is a C type, which
>does not have a __dict__, and that PointPen is a Python adapter for it,
>in a different project.

There are multiple implementation alternatives possible here; it isn't 
necessary that the state be hidden there.  The point is that, given the 
same SegmentPen, we want to get the same PointPen each time we *implicitly* 
adapt, in order to avoid violating the "naive" developer's mental model of 
what adaptation is -- i.e. an extension of the object's state, not a new 
object with independent state.

One possible alternative implementation is to use a dictionary from object 
id to a 'weakref(ob),state' tuple, with the weakref set up to remove the 
entry when 'ob' goes away.  Adapters would then have a pointer to their 
state object and a pointer to the adaptee.  As long as an adapter lives, 
the adaptee lives, so the state remains valid.  Or, if no adapters remain, 
but the adaptee still lives, then so does the state which can be 
resurrected when a new adapter is requested.  It's too bad Python doesn't 
have some sort of deallocation hook you could use to get notified when an 
object goes away.  Oh well.

Anyway, as you and I have both pointed out, sticky adaptation is an 
important use case; when you need it, you really need it.


>This example gets to the heart of what makes interfaces useful to me -
>model/view separation.  Although one might be hard pressed to call some
>of the things I use adaptation for "views", the idea of mediated access
>from a user, or from network protocol, or from some internal code acting
>on behalf of a user is the overwhelming majority of my use-cases.

If every time you pass a "model" to something that expects a "view", you 
get a new "view" instance being created, things are going to get mighty 
confusing, mighty fast.  In contrast, explicit adaptation with 
'adapt(model,IView)' or 'IView(model)' allows you to explicitly control the 
lifecycle of the view (or views!) you want to create.

Guido currently thinks that type declaration should be implemented as 
'adapt(model,IView)'; I think that maybe it should be restricted (if only 
by considerations of "good style") to adapters that are sticky or 
stateless, reserving per-state adaptation for explicit creation via today's 
'adapt()' or 'IFoo(ob)' APIs.


>I wish I had a better suggestion, but I'm still struggling through the
>rest of the thread :).

I'll be starting work on the PEP soon, maybe I'll have a rough draft of at 
least the first few sections ready to post tonight so everybody can get 
started on ripping them to pieces.  The sooner I know about the holes, the 
sooner I can fix 'em.  Or alternatively, the sooner Guido shoots it down, 
the less work I have to do on the PEP.  :)

From ncoghlan at iinet.net.au  Sat Jan 15 04:18:25 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Sat Jan 15 04:18:30 2005
Subject: [Python-Dev] frame.f_locals is writable
In-Reply-To: <41E7FF32.2030502@ieee.org>
References: <41E6DD52.2080109@ieee.org> <41E7D60F.9000208@iinet.net.au>
	<41E7FF32.2030502@ieee.org>
Message-ID: <41E88B81.3040303@iinet.net.au>

Shane Holloway (IEEE) wrote:
> Yes.  After poking around in Google with PyFrame_LocalsToFast, I found 
> some other links to people doing that.  I implemented a direct call 
> using ctypes to make the code explicit about what's happening.  I'm just 
> glad it is possible now.  Works fine in both 2.3 and 2.4.

I realised after posting that the exec-based hack only works for poking values 
into the _current_ frame's locals, so my trick wouldn't have done what you 
needed, anyway.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From glyph at divmod.com  Sat Jan 15 07:45:05 2005
From: glyph at divmod.com (Glyph Lefkowitz)
Date: Sat Jan 15 07:40:59 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <7163A634-668A-11D9-A02F-000A95BA5446@redivi.com>
References: <ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
	<1105747372.13655.93.camel@localhost>
	<7163A634-668A-11D9-A02F-000A95BA5446@redivi.com>
Message-ID: <1105771505.13655.96.camel@localhost>

On Fri, 2005-01-14 at 19:14 -0500, Bob Ippolito wrote:

> I think the idea is that it's "better" to have an adapter from 
> IBusinessThunk -> IGtkUIPlugFactory, which you can use to *create* a 
> stateful object that complies with the IGtkUIPlug interface.
> 
> This way, you are explicitly creating something entirely new (derived 
> from something else) with its own lifecycle and state and it should be 
> managed accordingly.  This is clearly not simply putting a shell around 
> an IBusinessThunk that says "act like this right now".

Yes.  This is exactly what I meant to say.

Maybe there are 2 entirely different use-cases for adaptation, and we
shouldn't be trying to confuse the two, or conflate them into one
system?  I am going to go have a look at PEAK next, to see why there are
so many stateless adapters there.

From aleax at aleax.it  Sat Jan 15 10:30:25 2005
From: aleax at aleax.it (Alex Martelli)
Date: Sat Jan 15 10:30:32 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <1105747372.13655.93.camel@localhost>
References: <ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
	<1105747372.13655.93.camel@localhost>
Message-ID: <16679EEB-66D8-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 15, at 01:02, Glyph Lefkowitz wrote:
     ...
> Now, we have nowhere to hide PointPen's state on SegmentPen - and why
> were we trying to in the first place?  It's a horrible breach of
> encapsulation.  The whole *point* of adapters is to convert between
> *different* interfaces, not merely to rename methods on the same
> interface, or to add extra methods that work on the same data.  To me,

A common implementation technique, when you'd love to associate some 
extra data to an object, but can't rely on the object having a __dict__ 
to let you do that conveniently, is to have an auxiliary dict of 
bunches of extra data, keyed by object's id().  It's a bit messier, in 
that you have to deal with cleanup issues when the object goes away, as 
well as suffer an extra indirectness; but in many use cases it's quite 
workable.  I don't see doing something like
     myauxdict[id(obj)] = {'foo': 'bar'}
as "terribly invasive", and therefore neither do I see
     obj.myauxfoo = 'bar'
as any more invasive -- just two implementation techniques for the same 
task with somewhat different tradeoffs.  The task, associating extra 
data with obj without changing obj type's source, won't just go away.

Incidentally, the realization of this equivalence was a key step in my 
very early acceptance of Python.  In the first few days, the concept 
"some external code might add an attribute to obj -- encapsulation 
breach!" made me wary; then CLICK, the first time I had to associate 
extra data to an object and realized the alleged ``breach'' was just a 
handy implementation help for the task I needed anyway, I started 
feeling much better about it.

Adapter use cases exist for all three structures:

1. the adapter just needs to change method names and signatures or 
combine existing methods of the object, no state additions;
2. the adapter needs to add some per-object state, which must be shared 
among different adapters which may simultaneously exist on the same 
object;
3. the adapter needs to add some per-adapter state, which must be 
distinct among different adapters which may simultaneously exist on the 
same object.

Case [1] is simplest because you don't have to wonder whether [2] or 
[3] are better, which may be why it's being thought of as "best".  Case 
[3] may be dubious when we talk about AUTOMATIC adaptation, because in 
[3] making and using two separate adapters has very different semantics 
from making just one adapter and using it twice.  When you build the 
adapter explicitly of course you have full control and hopefully 
awareness of that.  For example, in Model/View, clearly you want 
multiple views on the same model and each view may well need some 
presentation data of its own; if you think of it as adaptation, it's 
definitely a [3].  But do we really want _automatic_ adaptation -- 
passing a Model to a function which expects a View, and having some 
kind of default presentation data be used to make a default view on it?
That, I guess, is the dubious part.
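To make the cases concrete, here is a tiny sketch contrasting [2] and [3] (all class names invented for illustration):

```python
class Model:
    """Bare geometry-only model."""
    def __init__(self):
        self.geometry = 'cube'

class SharedLighting:
    """Case [2]: extra state hung on the object, shared by all adapters."""
    def __init__(self, model):
        self.model = model
        if not hasattr(model, '_lighting'):
            model._lighting = {'ambient': 0.5}
        self.lighting = model._lighting

class View:
    """Case [3]: extra state private to each adapter instance."""
    def __init__(self, model):
        self.model = model
        self.camera = {'distance': 10.0}

m = Model()
s1, s2 = SharedLighting(m), SharedLighting(m)
s1.lighting['ambient'] = 0.9
assert s2.lighting['ambient'] == 0.9     # [2]: one lighting per model

v1, v2 = View(m), View(m)
v1.camera['distance'] = 2.0
assert v2.camera['distance'] == 10.0     # [3]: one camera per view
```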


> "different interfaces" means that the actual meaning of the operations
> is different - sometimes subtly, sometimes dramatically.  There has to
> be enough information in one interface to get started on the
> implementation of another, but the fact that such information is
> necessary doesn't mean it is sufficient.  It doesn't mean that there is
> enough information in the original object to provide a complete
> implementation of a different interface.
>
> If there were enough information, why not just implement all of your
> interfaces on the original class?  In the case of our hypothetical
> cSegmentPen, we *already* have to modify the implementation of the
> original class to satisfy the needs of a "stateless" adapter.  When
> you're modifying cSegmentPen, why not just add the methods that you
> wanted in the first place?

Reason #1: because the author of the cSegmentPen code cannot assume he 
or she knows about all the interfaces to which a cSegmentPen might be 
adapted a la [3].  If he or she provides a __dict__, or makes 
cSegmentPen weakly referenceable, all [3]-like adaptation needs are 
covered at one (heh heh) stroke.


> Here's another example: I have a business logic class which lives in an
> object database, typically used for a web application.  I convert this
> into a desktop application.  Now, I want to adapt IBusinessThunk to
> IGtkUIPlug.  In the process of doing so, I have to create a GTK widget,
> loaded out of some sort of resource file, and put it on the screen.  I
> have to register event handlers which are associated with that adapter.

OK, a typical case of model/view and thus a [3].  The issue is whether 
you want adaptation to be automatic or explicit, in such cases.

> Most of the other use-cases I can think of are like the one James
> mentions, where we really are using adaptation to shuffle around some
> method names and provide simple glossing over totally isomorphic
> functionality to provide backwards (or sideways, in the case of
> almost-identical libraries provided on different platforms or
> environments) compatibility.

And what's wrong with that?  Those are the "case [1]" adapters, and 
they're very useful.

I guess this boils down to the issue that you don't think there are use 
cases for [2], where the extra state is needed but it had better be 
per-object, shared among adapters, and not per-adapter, distinct for 
each adapter.

Well, one example in the model/view area comes from 3d modeling for 
mechanical engineering: the model is a complex collection of solids 
which only deal with geometrical properties, the views are "cameras" 
rendering scenes onto windows on the screen.  Each view has some modest 
state of its own (camera distance, angles, screen coordinates), but 
also there are some presentation data -- alien to the model itself, 
which only has geometry -- which are required to be shared among views, 
such as lighting information and surface texturing.  One approach would 
be to first wrap the bare-model into an enriched-model, once only; and 
adapt only the enriched model to the views.  If it's important to have 
different sets of views of the same geometry with different lighting 
&c, it's the only way to go; but sometimes the functional requirement 
is exactly the reverse -- ensure there is never any discrepancy among 
the lighting, texturing etc of the views over the same (geometrical) 
model.  Nothing particularly wrong, then, in having the bunch of 
information that is the "enriched model" (lighting &c) be known only to 
the views but directly associated with the geometry-model.

> For these reasons I would vastly prefer it if transitivity were 
> declared
> as a property of the *adaptation*, not of the adapter or the registry 
> or
> to be inferred from various vaguely-defined properties like
> "losslessness" or "statelessness".  I am also concerned about any
> proposal which introduces transitivity-based errors at adaptation time
> rather than at registration time, because by then it is definitely too
> late to do anything about it.

Fair enough, but for Guido's suggested syntax of "def f(X:Y):..." 
meaning X=adapt(X,Y) at function entry, the issue is how that 
particular "default/implicit" adaptation should behave -- is it always 
allowed to be transitive, never, only when Y is an interface and not a 
class, or under what specific set of constraints?


Alex

From aleax at aleax.it  Sat Jan 15 10:35:30 2005
From: aleax at aleax.it (Alex Martelli)
Date: Sat Jan 15 10:35:33 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com>
References: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
	<5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com>
Message-ID: <CBDEAA0A-66D8-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 15, at 02:30, Phillip J. Eby wrote:

> is requested.  It's too bad Python doesn't have some sort of 
> deallocation hook you could use to get notified when an object goes 
> away.  Oh well.

For weakly referenceable objects, it does.  Giving one to other objects 
would be almost isomorphic to making every object weakly referenceable, 
wouldn't it?  Or am I missing something...?
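For the record, on CPython (where refcounting makes collection deterministic) the weakref callback does behave like such a per-object deallocation hook:

```python
import weakref

class Adaptee:
    pass

died = []
obj = Adaptee()
# The callback fires when obj is finalized -- a deallocation hook,
# available for any weakly referenceable object.
ref = weakref.ref(obj, lambda r: died.append(True))
del obj                      # CPython: refcount hits zero immediately
assert died == [True]
assert ref() is None         # the weak reference is now dead
```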


Alex

From just at letterror.com  Sat Jan 15 10:39:03 2005
From: just at letterror.com (Just van Rossum)
Date: Sat Jan 15 10:39:11 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com>
Message-ID: <r01050400-1037-4C90012C66D911D9AEA9003065D5E7E4@[10.0.0.23]>

Phillip J. Eby wrote:

> At 07:02 PM 1/14/05 -0500, Glyph Lefkowitz wrote:
> >For the sake of argument, let's say that SegmentPen is a C type,
> >which does not have a __dict__, and that PointPen is a Python
> >adapter for it, in a different project.
> 
> There are multiple implementation alternatives possible here; it
> isn't necessary that the state be hidden there.  The point is that,
> given the same SegmentPen, we want to get the same PointPen each time
> we *implicitly* adapt, in order to avoid violating the "naive"
> developer's mental model of what adaptation is -- i.e. an extension
> of the object's state, not a new object with independent state.
> 
> One possible alternative implementation is to use a dictionary from
> object id to a 'weakref(ob),state' tuple, with the weakref set up to
> remove the entry when 'ob' goes away.  Adapters would then have a
> pointer to their state object and a pointer to the adaptee.  As long
> as an adapter lives, the adaptee lives, so the state remains valid. 
> Or, if no adapters remain, but the adaptee still lives, then so does
> the state which can be resurrected when a new adapter is requested. 
> It's too bad Python doesn't have some sort of deallocation hook you
> could use to get notified when an object goes away.  Oh well.

That sounds extremely complicated as opposed to just storing the state
where it most logically belongs: on the adapter. And all that to work
around a problem that I'm not convinced needs solving or even exists. At
the very least *I* don't care about it in my use case.

> Anyway, as you and I have both pointed out, sticky adaptation is an 
> important use case; when you need it, you really need it.

Maybe I missed it, but was there an example posted of when "sticky
adaptation" is needed?

It's not at all clear to me that "sticky" behavior is the best default
behavior, even with implicit adaptation. Would anyone in their right
mind expect the following to return [0, 1, 2, 3, 4, 5] instead of [0, 1,
2, 0, 1, 2]?

  >>> from itertools import *
  >>> seq = range(10)
  >>> list(chain(islice(seq, 3), islice(seq, 3)))
  [0, 1, 2, 0, 1, 2]
  >>> 

Just
From p.f.moore at gmail.com  Sat Jan 15 14:20:37 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Sat Jan 15 14:20:41 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050114195846.030bd6d0@mail.telecommunity.com>
References: <ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
	<1105747372.13655.93.camel@localhost>
	<d11dcfba05011416373057fae1@mail.gmail.com>
	<5.1.1.6.0.20050114195846.030bd6d0@mail.telecommunity.com>
Message-ID: <79990c6b050115052024b2208a@mail.gmail.com>

On Fri, 14 Jan 2005 20:06:22 -0500, Phillip J. Eby
<pje@telecommunity.com> wrote:
> >My feeling here was not that people thought that stateless adapters
> >were in general intrinsically better -- just when the adaptation was
> >going to be done implicitly (e.g. by type declarations).
> 
> Yes, exactly. :)

In which case, given that there is no concept in PEP 246 of implicit
adaptation, can we please make a clear separation of this discussion
from PEP 246? (The current version of the PEP makes no mention of
transitive adaptation, as optional or required behaviour, which is the
only other example of implicit adaptation I can think of).

I think there are the following distinct threads of discussion going
on at the moment:

* Details of what should be in PEP 246
* Discussions spinning off from Guido's type-declaration-as-adaptation proposal
* Discussion of what counts as a "good" adapter
* Phillip's new generic function / duck typing proposals

Is that even close to others' understanding?

Just trying to keep my brain from exploding :-)

Paul.
From pje at telecommunity.com  Sat Jan 15 16:49:52 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat Jan 15 16:48:20 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <r01050400-1037-4C90012C66D911D9AEA9003065D5E7E4@[10.0.0.23
 ]>
References: <5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050115104405.04052d70@mail.telecommunity.com>

At 10:39 AM 1/15/05 +0100, Just van Rossum wrote:
>That sounds extremely complicated as opposed to just storing the state
>where it most logically belongs: on the adapter.

Oh, the state will be on the adapter all right.  It's just that for type 
declarations, I'm saying the system should return the *same* adapter each time.

>  And all that to work
>around a problem that I'm not convinced needs solving or even exists. At
>the very least *I* don't care about it in my use case.
>
> > Anyway, as you and I have both pointed out, sticky adaptation is an
> > important use case; when you need it, you really need it.
>
>Maybe I missed it, but was there an example posted of when "sticky
>adaptation" is needed?

No; but Glyph and I have independent use cases for them.  Here's one of 
mine: code generation from a UML or MOF model.  The model classes can't 
contain methods or data for doing code generation, unless you want to cram 
every possible kind of code generation into them.  The simple thing to do 
is to adapt them to a PythonCodeGenerator or an SQLCodeGenerator or 
what-have-you, and to do so stickily.  (Because a code generator may need 
to walk over quite a bit of the structure while keeping state for different 
things being generated.)

You *could* keep state in an external dictionary, of course, but it's much 
easier to use sticky adapters.
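A hypothetical sketch of that sticky lookup (names like PythonCodeGenerator are placeholders, not a real API):

```python
import weakref

# "Sticky" adaptation: the same generator instance comes back every
# time a given model is adapted, so walk state survives across calls.
_generators = weakref.WeakKeyDictionary()   # model -> its one generator

class PythonCodeGenerator:
    def __init__(self, model):
        self.model = weakref.proxy(model)   # avoid keeping the model alive
        self.emitted = []                   # accumulated generation state

def generator_for(model):
    """Sticky lookup: one generator per model, created on first request."""
    gen = _generators.get(model)
    if gen is None:
        gen = _generators[model] = PythonCodeGenerator(model)
    return gen

class UMLClass:
    pass

m = UMLClass()
assert generator_for(m) is generator_for(m)   # sticky: same adapter back
```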


>It's not at all clear to me that "sticky" behavior is the best default
>behavior, even with implicit adaptation. Would anyone in their right
>mind expect the following to return [0, 1, 2, 3, 4, 5] instead of [0, 1,
>2, 0, 1, 2]?
>
>   >>> from itertools import *
>   >>> seq = range(10)
>   >>> list(chain(islice(seq, 3), islice(seq, 3)))
>   [0, 1, 2, 0, 1, 2]
>   >>>

I don't understand why you think it would.  What does islice have to do 
with adaptation?

From pje at telecommunity.com  Sat Jan 15 16:57:29 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat Jan 15 16:55:55 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <79990c6b050115052024b2208a@mail.gmail.com>
References: <5.1.1.6.0.20050114195846.030bd6d0@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
	<1105747372.13655.93.camel@localhost>
	<d11dcfba05011416373057fae1@mail.gmail.com>
	<5.1.1.6.0.20050114195846.030bd6d0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050115105020.04055d90@mail.telecommunity.com>

At 01:20 PM 1/15/05 +0000, Paul Moore wrote:
>I think there are the following distinct threads of discussion going
>on at the moment:
>
>* Details of what should be in PEP 246
>* Discussions spinning off from Guido's type-declaration-as-adaptation 
>proposal

My understanding was that the first needed to be considered in context of 
the second, since it was the second which gave an implicit blessing to the 
first.  PEP 246 had languished in relative obscurity for a long time until 
Guido's blessing it for type declarations brought it back into the 
spotlight.  So, I thought it important to frame its discussion in terms of 
its use for type declaration.


>* Discussion of what counts as a "good" adapter

Alex was originally trying to add to PEP 246 some recommendations regarding 
"good" vs. "bad" adaptation, so this is actually part of "what should be in 
PEP 246"


>* Phillip's new generic function / duck typing proposals

And of course this one is an attempt to unify everything and replace PEP 
245 (not 246) with a hopefully more pythonic way of defining interfaces and 
adapters.  I hope to define a "relatively safe" subset of PEP 246 for type 
declarations that can be done automatically by Python, in a way that's also 
conceptually compatible with COM and Java casting (possibly making Jython 
and IronPython's lives a little easier re: type declarations).

From just at letterror.com  Sat Jan 15 17:32:42 2005
From: just at letterror.com (Just van Rossum)
Date: Sat Jan 15 17:32:47 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050115104405.04052d70@mail.telecommunity.com>
Message-ID: <r01050400-1037-156C4A67671311D9AEA9003065D5E7E4@[10.0.0.23]>

Phillip J. Eby wrote:

> >It's not at all clear to me that "sticky" behavior is the best
> >default behavior, even with implicit adaptation. Would anyone in
> >their right mind expect the following to return [0, 1, 2, 3, 4, 5]
> >instead of [0, 1, 2, 0, 1, 2]?
> >
> >   >>> from itertools import *
> >   >>> seq = range(10)
> >   >>> list(chain(islice(seq, 3), islice(seq, 3)))
> >   [0, 1, 2, 0, 1, 2]
> >   >>>
> 
> I don't understand why you think it would.  What does islice have to
> do with adaptation?

islice() takes an iterator, yet I give it a sequence. It calls
iter(seq), which I see as a form of adaptation (maybe you don't). Sticky
adaptation would not be appropriate here, even though the adaptation is
implicit.

Just
From tjreedy at udel.edu  Sat Jan 15 17:59:54 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat Jan 15 18:00:02 2005
Subject: [Python-Dev] Re: Re: PEP 246: LiskovViolation as a name
References: <1105553300.41e56794d1fc5@mcherm.com><16869.33426.883395.345417@montanaro.dyndns.org><41E63EDB.40008@cs.teiath.gr>
	<16869.57557.795447.53311@montanaro.dyndns.org>
Message-ID: <csbi69$3a4$1@sea.gmane.org>


"Skip Montanaro" <skip@pobox.com> wrote in message 
news:16869.57557.795447.53311@montanaro.dyndns.org...
> The first example here:
>    http://www.compulink.co.uk/~querrid/STANDARD/lsp.htm
> Looks pretty un-extreme to me.

To both summarize and flesh out the square-rectangle example:
Q. Is a square 'properly' a rectangle?  A. Depends on 'square' and
'rectangle'.
* A static, mathematical square is a static, mathematical rectangle just
fine, once width and height are aliased (adapted?) to edge.  The only
'behaviors' are to report size and possibly derived quantities like
diagonal and area.
* Similarly, a dynamic, zoomable square is a zoomable rectangle.
* But a square cannot 'properly' be a fully dynamic rectangle that can
mutate to a dissimilar shape, and must when just one dimension is changed
-- unless shape mutation is allowed to fail or unless the square is
allowed to mutate itself into a rectangle.

So it seems easily possible to introduce Liskov violations when adding 
behavior to a general superclass.
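The mutable case is the classic violation; in Python it looks something like:

```python
class Rectangle:
    def __init__(self, w, h):
        self.w, self.h = w, h
    def set_width(self, w):       # contract: height is left untouched
        self.w = w
    def area(self):
        return self.w * self.h

class Square(Rectangle):
    def __init__(self, edge):
        super().__init__(edge, edge)
    def set_width(self, w):       # must mutate both to stay a square
        self.w = self.h = w

def stretch(rect):
    """Client code written against Rectangle's contract."""
    rect.set_width(10)
    return rect.area()

assert stretch(Rectangle(2, 3)) == 30   # height untouched, as promised
assert stretch(Square(3)) == 100        # surprise: height changed too
```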

Terry J. Reedy



From pje at telecommunity.com  Sat Jan 15 18:06:36 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat Jan 15 18:05:03 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <r01050400-1037-156C4A67671311D9AEA9003065D5E7E4@[10.0.0.23
 ]>
References: <5.1.1.6.0.20050115104405.04052d70@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050115120034.03727d90@mail.telecommunity.com>

At 05:32 PM 1/15/05 +0100, Just van Rossum wrote:
>Phillip J. Eby wrote:
>
> > >It's not at all clear to me that "sticky" behavior is the best
> > >default behavior, even with implicit adoptation. Would anyone in
> > >their right mind expect the following to return [0, 1, 2, 3, 4, 5]
> > >instead of [0, 1, 2, 0, 1, 2]?
> > >
> > >   >>> from itertools import *
> > >   >>> seq = range(10)
> > >   >>> list(chain(islice(seq, 3), islice(seq, 3)))
> > >   [0, 1, 2, 0, 1, 2]
> > >   >>>
> >
> > I don't understand why you think it would.  What does islice have to
> > do with adaptation?
>
>islice() takes an iterator, yet I give it a sequence.

No, it takes an *iterable*, both practically and according to its 
documentation:

 >>> help(itertools.islice)
Help on class islice in module itertools:

class islice(__builtin__.object)
  |  islice(iterable, [start,] stop [, step]) --> islice object
  |
  | ... [snip rest]

If you think about the iterator and iterable protocols a bit, you'll see 
that normally the adaptation goes the *other* way: you can pass an iterator 
to something that expects an iterable, as long as it doesn't need 
reiterability.
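The direction of that compatibility, and Just's original surprise, can both
be checked directly (Python 3 shown, where range() is a re-iterable
sequence; treat it as an illustration rather than the exact 2005 session):

```python
from itertools import chain, islice

seq = range(10)        # an iterable (re-iterable sequence)
it = iter(seq)         # an iterator -- itself iterable, but consumable

# islice() accepts either, since it only requires iter() to succeed:
assert list(islice(seq, 3)) == [0, 1, 2]
assert list(islice(it, 3)) == [0, 1, 2]

# Two islices over the same *sequence* each start over (Just's example):
assert list(chain(islice(seq, 3), islice(seq, 3))) == [0, 1, 2, 0, 1, 2]

# Two islices over the same *iterator* continue where the first stopped:
it = iter(seq)
assert list(chain(islice(it, 3), islice(it, 3))) == [0, 1, 2, 3, 4, 5]
```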

From pje at telecommunity.com  Sat Jan 15 18:25:15 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat Jan 15 18:23:42 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <CBDEAA0A-66D8-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com>
	<5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com>
	<5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050115105740.040511e0@mail.telecommunity.com>

At 10:35 AM 1/15/05 +0100, Alex Martelli wrote:

>On 2005 Jan 15, at 02:30, Phillip J. Eby wrote:
>
>>is requested.  It's too bad Python doesn't have some sort of deallocation 
>>hook you could use to get notified when an object goes away.  Oh well.
>
>For weakly referenceable objects, it does.  Giving one to other objects 
>would be almost isomorphic to making every object weakly referenceable, 
>wouldn't it?  Or am I missing something...?

I meant if there were some way to listen for a particular object's 
deallocation, like sticking all the pointers you were interested in into a
big dictionary with callbacks and having a callback run whenever an 
object's refcount reaches zero.  It's doubtless completely impractical, 
however.  I think we can probably live with only weak-referenceable objects 
being seamlessly sticky, if that's a word.  :)

Actually, I've just gotten to the part of the PEP where I have to deal with 
stateful adapters and state retention, and I think I'm going to use this 
terminology for the three kinds of adapters:

* operations (no adapter class needed)

* extenders (operations + a consistent state that conceptually adds state 
to the base object rather than creating an object w/separate lifetime)

* "volatile", "inconsistent", or "disposable" adapters (state may be lost 
or multiplied if passed to different routines)

The idea is to make it really easy to make any of these, but for the last 
category you should have to explicitly declare that you *want* volatility 
(or at least that you are willing to accept it, if the target type is not 
weak-referenceable).

In this way, all three kinds of adaptation may be allowed, but it takes one 
extra step to create a potentially "bad" adapter.  Right now, people often 
create volatile adapters even if what they want is an extender ("sticky 
adapter"), because it's more work to make a functioning extender, not 
because they actually want volatility.

So, let's reverse that and make it easier to create extenders than it is to 
create volatile adapters.  And, since in some cases an extender won't be 
possible even when it's what you want, we could go ahead and allow type 
declarations to make them, as long as the creator has specified that 
they're volatile.

Meanwhile, all three kinds of adapters should avoid accidental implicit 
transitivity by only adapting the "original object".  (Unless, again, there 
is some explicit choice to do otherwise.)  This makes the type declaration 
system a straightforward extension of the COM QueryInterface and Java 
casting models, where an object's "true identity" is always preserved 
regardless of which interface you access its operations through.
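The "seamlessly sticky for weak-referenceable objects" idea can be sketched
with a weak-keyed cache (illustrative only -- not PEP 246's actual
mechanism; all the names here are made up):

```python
import weakref

class Marked:
    """A toy adapter that adds a bit of state to its adaptee."""
    def __init__(self, obj):
        self.obj = obj
        self.marks = 0

# Weak-keyed cache: the adapter lives exactly as long as its adaptee,
# and re-adapting the same object returns the *same* adapter ("sticky").
_adapters = weakref.WeakKeyDictionary()

def sticky_adapt(obj):
    try:
        return _adapters[obj]
    except KeyError:
        adapter = _adapters[obj] = Marked(obj)
        return adapter

class Thing:            # weak-referenceable, so usable as a weak key
    pass

t = Thing()
a = sticky_adapt(t)
a.marks += 1
assert sticky_adapt(t) is a        # adapter state is not silently lost
assert sticky_adapt(t).marks == 1
```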

From s.percivall at chello.se  Sat Jan 15 22:48:17 2005
From: s.percivall at chello.se (Simon Percivall)
Date: Sat Jan 15 22:48:21 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050115120034.03727d90@mail.telecommunity.com>
References: <5.1.1.6.0.20050115104405.04052d70@mail.telecommunity.com>
	<5.1.1.6.0.20050115120034.03727d90@mail.telecommunity.com>
Message-ID: <2A7F1150-673F-11D9-B46A-0003934AD54A@chello.se>

On 2005-01-15, at 18.06, Phillip J. Eby wrote:
> At 05:32 PM 1/15/05 +0100, Just van Rossum wrote:
>> Phillip J. Eby wrote:
>>
>> > >It's not at all clear to me that "sticky" behavior is the best
>> > >default behavior, even with implicit adoptation. Would anyone in
>> > >their right mind expect the following to return [0, 1, 2, 3, 4, 5]
>> > >instead of [0, 1, 2, 0, 1, 2]?
>> > >
>> > >   >>> from itertools import *
>> > >   >>> seq = range(10)
>> > >   >>> list(chain(islice(seq, 3), islice(seq, 3)))
>> > >   [0, 1, 2, 0, 1, 2]
>> > >   >>>
>> >
>> > I don't understand why you think it would.  What does islice have to
>> > do with adaptation?
>>
>> islice() takes an iterator, yet I give it a sequence.
>
> No, it takes an *iterable*, both practically and according to its 
> documentation:

But it _does_ perform an implicit adaptation, via PyObject_GetIter. A 
list has no next()-method, but iter(list()) does.

//Simon

From pje at telecommunity.com  Sat Jan 15 23:23:56 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat Jan 15 23:22:24 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <2A7F1150-673F-11D9-B46A-0003934AD54A@chello.se>
References: <5.1.1.6.0.20050115120034.03727d90@mail.telecommunity.com>
	<5.1.1.6.0.20050115104405.04052d70@mail.telecommunity.com>
	<5.1.1.6.0.20050115120034.03727d90@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050115171358.02f98510@mail.telecommunity.com>

At 10:48 PM 1/15/05 +0100, Simon Percivall wrote:
>On 2005-01-15, at 18.06, Phillip J. Eby wrote:
>>At 05:32 PM 1/15/05 +0100, Just van Rossum wrote:
>>>Phillip J. Eby wrote:
>>>
>>> > >It's not at all clear to me that "sticky" behavior is the best
>>> > >default behavior, even with implicit adoptation. Would anyone in
>>> > >their right mind expect the following to return [0, 1, 2, 3, 4, 5]
>>> > >instead of [0, 1, 2, 0, 1, 2]?
>>> > >
>>> > >   >>> from itertools import *
>>> > >   >>> seq = range(10)
>>> > >   >>> list(chain(islice(seq, 3), islice(seq, 3)))
>>> > >   [0, 1, 2, 0, 1, 2]
>>> > >   >>>
>>> >
>>> > I don't understand why you think it would.  What does islice have to
>>> > do with adaptation?
>>>
>>>islice() takes an iterator, yet I give it a sequence.
>>
>>No, it takes an *iterable*, both practically and according to its 
>>documentation:
>
>But it _does_ perform an implicit adaptation, via PyObject_GetIter.

First, that's not implicit.  Second, it's not adaptation, 
either.  PyObject_GetIter invokes the '__iter__' method of its target -- a 
method that is part of the *iterable* interface.  It has to have something 
that's *already* iterable; it can't "adapt" a non-iterable into an iterable.

Further, if calling a method of an interface that you already have in order 
to get another object that you don't is adaptation, then what *isn't* 
adaptation?  Is it adaptation when you call 'next()' on an iterator? Are 
you then "adapting" the iterator to its next yielded value?

No?  Why not?  It's a special method of the "iterator" interface, just like 
__iter__ is a special method of the "iterable" interface.

So, I can't see how you can call one adaptation, but not the other.  My 
conclusion: neither one is adaptation.


>  A list has no next()-method, but iter(list()) does.

But a list has an __iter__ method, so therefore it's an iterable.  That's 
what defines an iterable: it has an __iter__ method.  It would only be 
adaptation if lists *didn't* have an __iter__ method.

From just at letterror.com  Sat Jan 15 23:50:39 2005
From: just at letterror.com (Just van Rossum)
Date: Sat Jan 15 23:50:43 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050115171358.02f98510@mail.telecommunity.com>
Message-ID: <r01050400-1037-E1D04844674711D98C3B003065D5E7E4@[10.0.0.23]>

Phillip J. Eby wrote:

> >But it _does_ perform an implicit adaptation, via PyObject_GetIter.
> 
> First, that's not implicit.  Second, it's not adaptation, either. 
> PyObject_GetIter invokes the '__iter__' method of its target -- a
> method that is part of the *iterable* interface.  It has to have
> something that's *already* iterable; it can't "adapt" a non-iterable
> into an iterable.
> 
> Further, if calling a method of an interface that you already have in
> order to get another object that you don't is adaptation, then what
> *isn't* adaptation?  Is it adaptation when you call 'next()' on an
> iterator? Are you then "adapting" the iterator to its next yielded
> value?

That's one (contrived) way of looking at it. Another is that

  y = iter(x)

adapts the iterable protocol to the iterator protocol. I don't (yet) see
why a bit of state disqualifies this from being called adaptation.

> No?  Why not?  It's a special method of the "iterator" interface,
> just like __iter__ is a special method of the "iterable" interface.

The difference is that the result of .next() doesn't have a specified
interface.

> So, I can't see how you can call one adaptation, but not the other. 
> My conclusion: neither one is adaptation.

Maybe...

Just
From s.percivall at chello.se  Sun Jan 16 00:02:20 2005
From: s.percivall at chello.se (Simon Percivall)
Date: Sun Jan 16 00:02:23 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <r01050400-1037-E1D04844674711D98C3B003065D5E7E4@[10.0.0.23]>
References: <r01050400-1037-E1D04844674711D98C3B003065D5E7E4@[10.0.0.23]>
Message-ID: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>

On 2005-01-15, at 23.50, Just van Rossum wrote:
> Phillip J. Eby wrote:
>
>>> But it _does_ perform an implicit adaptation, via PyObject_GetIter.
>>
>> First, that's not implicit.  Second, it's not adaptation, either.
>> PyObject_GetIter invokes the '__iter__' method of its target -- a
>> method that is part of the *iterable* interface.  It has to have
>> something that's *already* iterable; it can't "adapt" a non-iterable
>> into an iterable.
>>
>> Further, if calling a method of an interface that you already have in
>> order to get another object that you don't is adaptation, then what
>> *isn't* adaptation?  Is it adaptation when you call 'next()' on an
>> iterator? Are you then "adapting" the iterator to its next yielded
>> value?
>
> That's one (contrived) way of looking at it. Another is that
>
>   y = iter(x)
>
> adapts the iterable protocol to the iterator protocol.

Especially since an iterable can also be an object without an __iter__
method but with a __getitem__ method. Calling __iter__ might get an
iterator, but calling __getitem__ does not. That seems like adaptation.
No? It's still not clear to me, as this shows, exactly what counts as
what in this game.
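The __getitem__ fallback is easy to demonstrate (a sketch; Squares is a
made-up class):

```python
# An old-style "sequence iterable": no __iter__, only __getitem__.
class Squares:
    def __getitem__(self, i):
        if i >= 4:
            raise IndexError
        return i * i

s = Squares()
assert not hasattr(s, '__iter__')
# iter() falls back to driving __getitem__ from 0 until IndexError:
assert list(iter(s)) == [0, 1, 4, 9]
```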

//Simon

From foom at fuhm.net  Sun Jan 16 02:13:26 2005
From: foom at fuhm.net (James Y Knight)
Date: Sun Jan 16 02:14:47 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
References: <r01050400-1037-E1D04844674711D98C3B003065D5E7E4@[10.0.0.23]>
	<82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
Message-ID: <D2D9DA90-675B-11D9-B76D-000A95A50FB2@fuhm.net>


On Jan 15, 2005, at 6:02 PM, Simon Percivall wrote:

> On 2005-01-15, at 23.50, Just van Rossum wrote:
>> Phillip J. Eby wrote:
>>
>>>> But it _does_ perform an implicit adaptation, via PyObject_GetIter.
>>>
>>> First, that's not implicit.  Second, it's not adaptation, either.
>>> PyObject_GetIter invokes the '__iter__' method of its target -- a
>>> method that is part of the *iterable* interface.  It has to have
>>> something that's *already* iterable; it can't "adapt" a non-iterable
>>> into an iterable.
>>>
>>> Further, if calling a method of an interface that you already have in
>>> order to get another object that you don't is adaptation, then what
>>> *isn't* adaptation?  Is it adaptation when you call 'next()' on an
>>> iterator? Are you then "adapting" the iterator to its next yielded
>>> value?
>>
>> That's one (contrived) way of looking at it. Another is that
>>
>>   y = iter(x)
>>
>> adapts the iterable protocol to the iterator protocol.
>
> Especially since an iterable can also be an object without an __iter__
> method but with a __getitem__ method. Calling __iter__ might get an
> iterator, but calling __getitem__ does not. That seems like adaptation.
> No? It's still not clear to me, as this shows, exactly what counts as
> what in this game.

I think that's wrong. To spell iter() in an adapter/interface world, 
I'd spell iter(obj) as:
   adapt(obj, IIterable).iterator()

Then, list, tuple, dict objects would specify that they implement 
IIterable. There is a default adapter from object->IIterable which 
provides a .iterator() method which creates an iterator that uses 
__getitem__ on the adaptee.

In my opinion, adapters provide a different view of an object. I can 
see treating list "as a" iterable, but not "as a" iterator.
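That spelling can be made runnable as a sketch; adapt(), IIterable, and the
adapter classes below are all hypothetical stand-ins, not a real PEP 246
implementation:

```python
class IIterable:
    """Marker for the hypothetical 'iterable' protocol."""

class GetitemIterable:
    """Default object->IIterable adapter: drive __getitem__ until
    IndexError, the way Python's own fallback iteration works."""
    def __init__(self, obj):
        self.obj = obj
    def iterator(self):
        i = 0
        while True:
            try:
                yield self.obj[i]
            except IndexError:
                return
            i += 1

class DeclaredIterable:
    """Adapter for objects that already provide __iter__."""
    def __init__(self, obj):
        self.obj = obj
    def iterator(self):
        return iter(self.obj)

def adapt(obj, protocol):
    assert protocol is IIterable       # toy dispatch: one protocol only
    if hasattr(obj, '__iter__'):
        return DeclaredIterable(obj)
    return GetitemIterable(obj)

class Seq:                             # a __getitem__-only object
    def __getitem__(self, i):
        if i >= 3:
            raise IndexError
        return i

assert list(adapt([1, 2], IIterable).iterator()) == [1, 2]
assert list(adapt(Seq(), IIterable).iterator()) == [0, 1, 2]
```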

James

From jimjjewett at gmail.com  Sat Jan 15 01:20:31 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sun Jan 16 02:32:32 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
Message-ID: <fb6fbf560501141620eff6d85@mail.gmail.com>

Phillip J. Eby wrote (in
http://mail.python.org/pipermail/python-dev/2005-January/050854.html)

> * Classic class support is a must; exceptions are still required to be 
> classic, and even if they weren't in 2.5, backward compatibility should be 
> provided for at least one release.

The base of the Exception hierarchy happens to be a classic class.
But why are they "required" to be classic?

More to the point, is this a bug, a missing feature, or just a bug in 
the documentation for not mentioning the restriction?

You can inherit from both Exception and object.  (Though it turns out
you can't raise the result.)  My first try with google failed to produce an
explanation -- and I'm still not sure I understand, beyond "it doesn't
happen to work at the moment."  Neither the documentation nor the 
tutorial mention this restriction.

http://docs.python.org/lib/module-exceptions.html
http://docs.python.org/tut/node10.html#SECTION0010500000000000000000
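The restriction described above is easy to reproduce; the class definition
succeeds, and only the raise failed at the time (new-style exceptions only
became legal once Exception itself went new-style, in Python 2.5):

```python
# You *can* define the class...
class NewStyleError(Exception, object):
    pass

assert issubclass(NewStyleError, Exception)
assert issubclass(NewStyleError, object)

# ...but under Python 2.4 the next line failed with
# "TypeError: exceptions must be classes, instances, or strings":
# raise NewStyleError()
```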

I didn't find any references to this restriction in exceptions.c.  I did find
some code implying this in errors.c and ceval.c, but that wouldn't have 
caught my eye if I weren't specifically looking for it *after* having just
read the discussion about (rejected) PEP 317.  

-jJ
From gvanrossum at gmail.com  Sun Jan 16 02:57:53 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun Jan 16 02:57:56 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <fb6fbf560501141620eff6d85@mail.gmail.com>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
Message-ID: <ca471dc205011517576655bc17@mail.gmail.com>

> The base of the Exception hierarchy happens to be a classic class.
> But why are they "required" to be classic?
> 
> More to the point, is this a bug, a missing feature, or just a bug in
> the documentation for not mentioning the restriction?

It's an unfortunate feature; it should be mentioned in the docs; it
should also be fixed, but fixing it isn't easy (believe me, or it
would have been fixed in Python 2.2).

To be honest, I don't recall the exact reasons why this wasn't fixed
in 2.2; I believe it has something to do with the problem of
distinguishing between string and class exceptions, and between the
various forms of raise statements.

I think the main ambiguity is raise "abc", which could be considered
short for raise str, "abc", but that would be incompatible with except
"abc". I also think that the right way out of there is to simply
hardcode a check that says that raise "abc" raises a string exception
and raising any other instance raises a class exception. But there's a
lot of code that has to be changed.
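For reference, this ambiguity was eventually resolved by deprecating and
then removing string exceptions altogether; in current Python the raise
below simply fails at runtime:

```python
# A string is no longer a raisable exception:
ok = False
try:
    raise "abc"
except TypeError:      # "exceptions must derive from BaseException"
    ok = True
assert ok

# ...whereas any Exception subclass works with both raise and except:
class Abc(Exception):
    pass

try:
    raise Abc()
except Abc:
    pass
```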

It's been suggested that all exceptions should inherit from Exception,
but this would break tons of existing code, so we shouldn't enforce
that until 3.0. (Is there a PEP for this? I think there should be.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From pje at telecommunity.com  Sun Jan 16 03:06:51 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Jan 16 03:05:19 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <r01050400-1037-E1D04844674711D98C3B003065D5E7E4@[10.0.0.23
 ]>
References: <5.1.1.6.0.20050115171358.02f98510@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050115202610.034c3c10@mail.telecommunity.com>

At 11:50 PM 1/15/05 +0100, Just van Rossum wrote:
>Phillip J. Eby wrote:
>
> > >But it _does_ perform an implicit adaptation, via PyObject_GetIter.
> >
> > First, that's not implicit.  Second, it's not adaptation, either.
> > PyObject_GetIter invokes the '__iter__' method of its target -- a
> > method that is part of the *iterable* interface.  It has to have
> > something that's *already* iterable; it can't "adapt" a non-iterable
> > into an iterable.
> >
> > Further, if calling a method of an interface that you already have in
> > order to get another object that you don't is adaptation, then what
> > *isn't* adaptation?  Is it adaptation when you call 'next()' on an
> > iterator? Are you then "adapting" the iterator to its next yielded
> > value?
>
>That's one (contrived) way of looking at it. Another is that
>
>   y = iter(x)
>
>adapts the iterable protocol to the iterator protocol. I don't (yet) see
>why a bit of state disqualifies this from being called adaptation.

Well, if you go by the GoF "Design Patterns" book, this is actually what's 
called an "Abstract Factory":

    "Abstract Factory: Provide an interface for creating ... related or 
dependent objects without specifying their concrete classes."

So, 'iter()' is an abstract factory that creates an iterator without 
needing to specify the concrete class of iterator you want.  This is a much 
closer fit for what's happening than the GoF description of "Adapter":

    "Adapter: Convert the interface of a class into another interface 
clients expect.  Adapter lets classes work together that couldn't otherwise 
because of incompatible interfaces."

IMO, it's quite "contrived" to try and squeeze iteration into this concept, 
compared to simply saying that 'iter()' is an abstract factory that creates 
"related or dependent objects".

While it has been pointed out that the GoF book is not handed down from 
heaven or anything, its terminology is certainly widely used to describe 
certain patterns of programming.  If you read their full description of the 
adapter pattern, nothing in it is about automatically getting an adapter 
based on an interface.  It's just about the idea of *using* an adapter that 
you already have, and it's strongly implied that you only use one adapter 
for a given source and destination that need adapting, not create lots of 
instances all over the place.

So really, PEP 246 'adapt()' (like 'iter()') is more about the Abstract 
Factory pattern.  It just happens in the case of PEP 246 that it's an 
Abstract Factory that *can* create adapters, but it's not restricted to 
handing out *just* adapters.  It can also be used to create views, 
iterators, and whatever else you like.  But that's precisely what makes it 
problematic for use as a type declaration mechanism, because you run the 
risk of it serving up entirely new objects that aren't just interface 
transformers.  And of course, that's why I think that you should have to 
declare that you really want to use it for type declarations, if in fact 
it's allowed at all.  Explicit use of 'adapt()', on the other hand, can 
safely create whatever objects you want.

Oh, one other thing -- distinguishing between "adapters" and merely 
"related" objects allows you to distinguish whether you should adapt the 
object or what it wraps.  A "related" object (like an iterator) is a 
separate object, so it's safe to adapt it to other things.  An actual 
*adapter* is not a separate object, it's an extension of the object it 
wraps.  So, it should not be re-adapted when adapting again; instead the 
underlying object should be adapted.

So, while I support in principle all the use cases for "adaptation" 
(so-called) that have been discussed here, I think it's important to refine 
our terminology to distinguish between GoF "adapters" and "things you might 
want to create with an abstract factory", because they have different 
requirements and support different use cases.

We have gotten a little bogged down by our comparisons of "good" and "bad" 
adapters; perhaps to move forward we should distinguish between "adapters" 
and "views", and say that an iterator is an example of a view: you may have 
more than one view on the same thing, and although a view depends on the 
thing it "views", it doesn't really "convert an interface"; it provides 
distinct functionality on a per-view basis.

Currently, PEP 246 'adapt()' is used "in the field" to create both adapters 
and views, because 1) it's convenient, and 2) it can.  :)  However, for 
type declarations, I think it's important to distinguish between the two, 
to avoid implicit creation of additional views.

A view needs to be managed within the scope that it applies to.  By that, I 
mean for example that a 'for' loop creates an iterator view and then 
manages it within the scope of the loop.  However, if you need the iterator 
to remain valid outside the 'for' loop, you may need to first call 'iter()' 
to get an explicit iterator you can hold on to.

Similarly, if you have a file that you are reading things from by calling 
routines and passing in the file, you don't want to pass each of those 
routines a filename and have them implicitly open the file; they won't be 
reading from it sequentially then.  So, again, you have to manage the view 
by opening a file or creating a StringIO or whatever.

Granted that there are some scenarios where implicit view creation will do 
exactly the right thing, introducing it also opens the opportunity for it 
to go very badly.  Today's PEP 246 implementations are as easy to use as 
'iter()', so why not use them explicitly when you need a view?

From s.percivall at chello.se  Sun Jan 16 03:07:23 2005
From: s.percivall at chello.se (Simon Percivall)
Date: Sun Jan 16 03:07:26 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <ca471dc205011517576655bc17@mail.gmail.com>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
	<ca471dc205011517576655bc17@mail.gmail.com>
Message-ID: <5C6D82A6-6763-11D9-BC9B-0003934AD54A@chello.se>

On 2005-01-16, at 02.57, Guido van Rossum wrote:
> It's been suggested that all exceptions should inherit from Exception,
> but this would break tons of existing code, so we shouldn't enforce
> that until 3.0. (Is there a PEP for this? I think there should be.)

What would happen if Exception were made a new-style class, inheritance
from Exception were enforced for all new-style exceptions, and all
old-style exceptions were allowed as before?  Am I wrong in assuming
that only the most esoteric exceptions inheriting from Exception would
break by Exception becoming new-style?

//Simon

From pje at telecommunity.com  Sun Jan 16 03:17:45 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Jan 16 03:16:15 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <D2D9DA90-675B-11D9-B76D-000A95A50FB2@fuhm.net>
References: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
	<r01050400-1037-E1D04844674711D98C3B003065D5E7E4@[10.0.0.23]>
	<82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
Message-ID: <5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com>

At 08:13 PM 1/15/05 -0500, James Y Knight wrote:

>On Jan 15, 2005, at 6:02 PM, Simon Percivall wrote:
>
>>On 2005-01-15, at 23.50, Just van Rossum wrote:
>>>Phillip J. Eby wrote:
>>>
>>>>>But it _does_ perform an implicit adaptation, via PyObject_GetIter.
>>>>
>>>>First, that's not implicit.  Second, it's not adaptation, either.
>>>>PyObject_GetIter invokes the '__iter__' method of its target -- a
>>>>method that is part of the *iterable* interface.  It has to have
>>>>something that's *already* iterable; it can't "adapt" a non-iterable
>>>>into an iterable.
>>>>
>>>>Further, if calling a method of an interface that you already have in
>>>>order to get another object that you don't is adaptation, then what
>>>>*isn't* adaptation?  Is it adaptation when you call 'next()' on an
>>>>iterator? Are you then "adapting" the iterator to its next yielded
>>>>value?
>>>
>>>That's one (contrived) way of looking at it. Another is that
>>>
>>>   y = iter(x)
>>>
>>>adapts the iterable protocol to the iterator protocol.
>>
>>Especially since an iterable can also be an object without an __iter__
>>method but with a __getitem__ method. Calling __iter__ might get an
>>iterator, but calling __getitem__ does not. That seems like adaptation.
>>No? It's still not clear to me, as this shows, exactly what counts as
>>what in this game.
>
>I think that's wrong. To spell iter() in an adapter/interface world, I'd 
>spell iter(obj) as:
>   adapt(obj, IIterable).iterator()
>
>Then, list, tuple, dict objects would specify that they implement 
>IIterable. There is a default adapter from object->IIterable which 
>provides a .iterator() method which creates an iterator that uses 
>__getitem__ on the adaptee.
>
>In my opinion, adapters provide a different view of an object. I can see 
>treating list "as a" iterable, but not "as a" iterator.

Uh oh.  I just used "view" to describe an iterator as a view on an 
iterable, as distinct from an adapter that adapts a sequence so that it's 
iterable.  :)

I.e., using "view" in the MVC sense where a given Model might have multiple 
independent Views.

We really need to clean up our terminology somehow, and I may need to 
rewrite some parts of my PEP-in-progress.  I had been using the term 
"volatile adapter" for what I'd written so far, but by the time I got to 
the part where I had to explain how to actually *make* volatile adapters, I 
realized that I was right before: they aren't adapters just because PEP 246 
'adapt()' can be used to create them.  They're just something *else* that's 
convenient to create with 'adapt()' besides adapters.  Calling them even 
"volatile adapters" just confuses them with "real" adapters.

On the *other* hand, maybe we should just call GoF adapters "extenders" 
(since they extend the base object with a new interface or extended 
functionality, but aren't really separate objects) and these other things 
like iterators and views should be called "accessories", which implies you 
have lots of them and although they "accessorize" an object, they are 
themselves individual objects.  (Whereas an extender becomes conceptually 
"part of" the thing it extends.)

It's then also clearer that it makes no sense to have a type declaration 
ever cause you to end up with a new accessory, as opposed to an extender 
that's at least figuratively always there.

What do y'all think?  Is that a better way to distinguish kinds of 
"adapters"?  (I.e. extenders versus accessories)  Or does somebody have 
better words we can use?

From pje at telecommunity.com  Sun Jan 16 03:22:28 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Jan 16 03:20:56 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <ca471dc205011517576655bc17@mail.gmail.com>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
	<fb6fbf560501141620eff6d85@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050115211847.034ce0f0@mail.telecommunity.com>

At 05:57 PM 1/15/05 -0800, Guido van Rossum wrote:
>It's been suggested that all exceptions should inherit from Exception,
>but this would break tons of existing code, so we shouldn't enforce
>that until 3.0. (Is there a PEP for this? I think there should be.)

Couldn't we require new-style exceptions to inherit from Exception?  Since 
there are no new-style exceptions that work now, this can't break existing 
code.  Then, the code path is just something like:

     if isinstance(ob,Exception):
         # it's an exception, use its type

     else:
         # all the other tests done now

This way, the other tests that would be ambiguous wrt new-style classes can 
be skipped, but non-Exception classic classes would still be handled by the 
existing checks.

Or am I missing something?

From cce at clarkevans.com  Sun Jan 16 05:04:24 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Sun Jan 16 05:04:27 2005
Subject: [Python-Dev] PEP 246, Feedback Request
In-Reply-To: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <20050116040424.GA76191@prometheusresearch.com>

I started to edit the PEP, but found that we really don't have any
consensus on a great many items. The following is a bunch of topics,
and a proposed handling of those topics.  The bulk of this comes from
a phone chat I had with Alex this past afternoon, and other items
come from my understanding of the mailing list, or prior conversations
with Phillip, among others.  It's a strawman.

I'd really very much like feedback on each topic, preferably only
one post per person summarizing your position/suggestions.  I'd
rather not have a run-away discussion on this post.

---
-
  topic: a glossary
  overview:
    It seems that we are having difficulty with words that have shifting
    definitions.  The next PEP edit will need to add a glossary that 
    nails down some meanings of these words.  Following are a few 
    proposed terms/meanings.
  proposal:
  - protocol means any object, usually a type or class or interface,
    which guides the construction of an adapter
  - adaptee is the object which is to be adapted, the original object
  - adaptee-class refers to the adaptee's class
  - adapter refers to the result of adapting an adaptee to a protocol
  - factory refers to a function, f(adaptee) -> adapter, where 
    the resulting adapter complies with a given protocol
  feedback:
    Much help is needed here; either respond to this thread with
    your words and definitions, or email them directly to Clark 
    and he will use your feedback when creating the PEP's glossary.
-
  topic: a registry mechanism
  overview:
    It has become very clear from the conversations the past few
    days that a registry is absolutely needed for the kind of adapt()
    behavior people are currently using in Zope, Twisted, and Peak.
  proposal:
  - The PEP will define a simple and flexible registry mechanism.
  - The registry will be a mapping from a (adaptee-class, protocol) 
    pair to a corresponding factory.
  - Only one active registration per pair (see below)
  feedback:
    We welcome/encourage experiences and concrete suggestions from
    existing registries.  Our goal is to be minimal, extensible,
    and sufficient.  See other topics for more specific concerns
    before you comment on this more general topic.
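
  A minimal sketch of such a registry (names are illustrative, not the
  proposed API): a plain mapping from (adaptee-class, protocol) to a
  factory, with an MRO walk so that base-class registrations cover
  subclasses:

```python
# (adaptee-class, protocol) -> factory; one active registration per pair.
_registry = {}

def register(adaptee_class, protocol, factory):
    _registry[(adaptee_class, protocol)] = factory

def lookup(adaptee, protocol):
    # Walk the adaptee's MRO so a registration on a base class applies.
    for klass in type(adaptee).__mro__:
        factory = _registry.get((klass, protocol))
        if factory is not None:
            return factory(adaptee)
    return None

class IText:
    """A hypothetical protocol object."""

register(int, IText, lambda obj: str(obj))
assert lookup(42, IText) == '42'
assert lookup(True, IText) == 'True'   # bool subclasses int: MRO hit
assert lookup(3.5, IText) is None      # nothing registered for float
```
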
-
  topic: should 'object' be impacted by PEP 246
  overview:
    The semantics of exceptions depend on whether 'object' is given a
    default __conform__ method (which would do isinstance), in which
    case, returning None in a subclass could be used to prevent
    Liskov violations. However, by requiring a change to 'object',
    it may hinder adoption or slow-down testing.
  proposal:
  - We will not ask/require changes to `object'.
  - Liskov violations will be managed via the registry, see below.
  - This is probably faster for isinstance cases?
  feedback:
    If you really think we should move isinstance() into
    object.__conform__, then here is your chance to make a final
    stand. ;)
-
  topic: adaption stages
  overview:
    There are several stages for adaptation.  It was recommended
    that the 'registry' be the first stop in the chain. 
  proposal:
  - First, the registry is checked for a suitable adapter
  - Second, isinstance() is checked; if the adaptee is an instance
    of the protocol, adaptation ends and the adaptee is returned.
  - Third, __conform__ on the adaptee is called with the given
    protocol being requested.
  - Fourth, __adapt__ on the protocol is called, with the given
    adaptee.
  feedback:
    This is largely dependent upon the previous topic, but if something
    isn't obvious (modulo exceptions below), please say something.
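    The four stages can be sketched like so (present-day Python; the
    `lookup` parameter stands in for a registry query, and all names
    are illustrative):

```python
def adapt(adaptee, protocol, lookup=lambda obj, proto: None):
    # 1. The registry is checked first for a suitable adapter factory.
    factory = lookup(adaptee, protocol)
    if factory is not None:
        return factory(adaptee)
    # 2. isinstance(): the adaptee already satisfies the protocol,
    #    so adaptation ends and the adaptee itself is returned.
    if isinstance(protocol, type) and isinstance(adaptee, protocol):
        return adaptee
    # 3. Ask the adaptee, via __conform__, for the requested protocol.
    conform = getattr(adaptee, '__conform__', None)
    if conform is not None:
        result = conform(protocol)
        if result is not None:
            return result
    # 4. Ask the protocol, via __adapt__, to adapt the adaptee.
    adapt_hook = getattr(protocol, '__adapt__', None)
    if adapt_hook is not None:
        result = adapt_hook(adaptee)
        if result is not None:
            return result
    return None

# An int already "is an" int, so stage 2 returns it unchanged:
print(adapt(5, int))   # -> 5
```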
-
  topic: module vs built-in
  overview:
    Since we will be adding a registry, exceptions, and other items,
    it probably makes sense to use a module for 'adapt'.
  proposal:
  - PEP 246 will ask for an `adapt' module, with an `adapt' function.
  - The registry will be contained in this module, 'adapt.register'
  - The `adapt' module can provide commonly-used adapter factories,
    such as adapt.Identity.
  - With a standardized signature, frameworks can provide their own
    'local' registry/adapt overrides.
  feedback:
    Please discuss the merits of a module approach, and if having
    local registries is important (or worth the added complexity).
    Additional suggestions on how the module should be structured
    are welcome. 
-
  topic: exception handling
  overview:
    How should adaptation stages progress, and how should exceptions
    be handled?  There were problems with swallowed TypeError
    exceptions in the 2001 version of the PEP; in this version, type
    errors are not swallowed.
  proposal:
  - The 'adapt' module will define an adapt.AdaptException(TypeError).
  - At any stage of adaptation, if None is returned, the adaptation
    continues to the next stage.
  - Any exception other than adapt.AdaptException(TypeError)
    causes the adapt() call to fail, and the exception to be 
    raised to the caller of adapt(); the 'default' value is not
    returned in this case.
  - At any stage of adaption, if adapt.AdaptException(TypeError) is
    raised, then the adaptation process stops, as if None had been
    returned from each stage.
  - If all adaption stages return None, there are two cases.  If the
    call to adapt() had a 'default' value, then this is returned;
    otherwise, an adapt.AdaptException is raised.
  feedback:
    I think this is the same as the current PEP, and different
    from the first PEP.  Comments?  Anything that was missed?
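    A sketch of the proposed semantics (present-day Python;
    AdaptException is the name from the proposal, while the `stages`
    parameter and `_MISSING` sentinel are illustrative):

```python
class AdaptException(TypeError):
    pass

_MISSING = object()   # sentinel: distinguishes "no default" from default=None

def adapt(adaptee, protocol, stages, default=_MISSING):
    for stage in stages:
        try:
            result = stage(adaptee, protocol)
        except AdaptException:
            break          # stop: as if every stage had returned None
        # Any other exception propagates to the caller of adapt();
        # the 'default' value is not returned in that case.
        if result is not None:
            return result  # this stage produced an adapter
        # None: continue to the next stage
    if default is not _MISSING:
        return default
    raise AdaptException("can't adapt %r to %r" % (adaptee, protocol))

# All stages decline, but a default was supplied:
print(adapt(1, str, [lambda a, p: None], default="fallback"))  # -> fallback
```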
-
  topic: transitivity
  overview:
    A case for allowing A->C to work when A->B and B->C are
    available; an equally compelling case to forbid this was also
    given.  There are numerous reasons for not allowing transitive
    adapters, mostly that 'lossy' adapters or 'stateful' adapters
    are usually the problem cases.  However, a hard-and-fast rule
    for knowing when transitivity exists wasn't found.
  proposal:
  - When registering an adapter factory, from A->B, an additional
    flag 'transitive' will be available.  
  - This flag defaults to False, so specific care is needed when
    registering adapters which one considers to be transitive.
  - If there exist two adapter factories, X: A->B, and Y: B->C,
    the path factory Z: A->C will be considered registered
    if and only if both X and Y were registered as 'transitive'.
  - It is an error for a registration to cause two path factories
    from A to C to be constructed; thus the registry will never have a
    case where two transitive adaptation paths exist at a single time.
  - An explicit registration always takes precedence over a
    transitive path.
  - One can also register a None factory from A->B for the
    purpose of marking it transitive.  In this circumstance,
    the composite adapter is built through __conform__ and
    __adapt__.  The None registration is just a placeholder
    to signal that a given path exists.
  feedback:
    I'm looking for warts in this plan, and verification whether
    something like this has been done elsewhere -- comments on how
    well it works.  Alternative approaches?
-
  topic: substitutability
  overview:
    There is a problem with the default isinstance() behavior when
    someone derives a class from another to re-use implementation,
    but with a different 'concept'.  A mechanism to disable
    isinstance() is needed for this particular case.
  proposal:
  - The 'adapt' module will define a 'LiskovAdaptionError', which
    has as a text description something like:
     "Although the given class '%s' derives from '%s', it has been
     marked as not being substitutable; although it is a subclass,
     the intent has changed so that one should not assume an 'is-a'
     relationship." % (adaptee.__class__, protocol)
  - The 'adapt' module will provide a 'NotSubstitutable' adaptation
    factory, which, by default, raises LiskovAdaptionError.
  - If someone is concerned that their subclass should not be 
    adapted to the superclass automatically, they should register
    the NotSubstitutable adapter to the superclass, recursively.
  feedback:
    I'm not sure how this would work for the adaptee-class's
    grandparent; perhaps a helper function that recursively
    marks superclasses is needed?  Other comments?
-
  topic: declaration (aka Guido's syntax) and intrinsic adaption
  overview:
    Guido would like his type declaration syntax (see blog entry) to
    be equivalent to a call to adapt() without any additional
    arguments.  However, not all adapters should be created in the
    context of a declaration -- some should be created more
    explicitly.  We propose a mechanism where an adapter factory can
    register itself as not suitable for the declaration syntax.
  proposal:
  - The adapt.register method has an optional argument, 'intrinsic',
    that defaults to True.
  - The adapt() function has an optional argument, 'intrinsic_only',
    which defaults to True and is thus the default for the declaration
    syntax.
  - If an adapter factory is registered with intrinsic = False, then
    it is _not_ used by default calls to adapt().
  - adapt( , intrinsic_only = False) will enable both sorts of adapters,
    intrinsic or not; enabling the use of adapters which should not
    be used by default in a declaration syntax.
  - all adapters created through __conform__ and __adapt__ are
    by default intrinsic since this parameter is not part of the
    function signature
  feedback:
    This is the simplest solution I heard on the list; the word
    'intrinsic' was given by Alex.  Is there a better word?  Should
    we even worry about this case?  Any other ways to view this issue?
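    A minimal sketch of the flag's effect (present-day Python; the
    registry shape and names are illustrative):

```python
# (adaptee_class, protocol) -> (factory, intrinsic_flag)
_registry = {}

def register(adaptee_class, protocol, factory, intrinsic=True):
    _registry[(adaptee_class, protocol)] = (factory, intrinsic)

def adapt(adaptee, protocol, intrinsic_only=True):
    entry = _registry.get((type(adaptee), protocol))
    if entry is None:
        return None
    factory, intrinsic = entry
    if intrinsic_only and not intrinsic:
        return None   # not suitable for declaration-style adaptation
    return factory(adaptee)

class View: pass

register(int, View, lambda n: "view-of-%d" % n, intrinsic=False)
print(adapt(3, View))                         # -> None
print(adapt(3, View, intrinsic_only=False))   # -> view-of-3
```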
-
  topic: adaptee (aka origin)
  overview:
    There was discussion as to how to get back to the original
    object from an adapter.  Is this in scope of PEP 246?
  proposal:
  - we specify an __adaptee__ property, to be optionally implemented
    by an adapter that provides a reference to its adaptee
  - the adapt.register method has an optional argument, 'adaptee',
    that defaults to False; if it is True, adapt() calls will stuff
    away into a weak-reference mapping from adapter to adaptee.
  - an adapt.adaptee(adaptor) function which returns the given
    adaptee for the adaptor; this first checks the weak-reference
    table, and then checks for an __adaptee__ property
  feedback:
    Is this useful, worth the complexity?
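    The weak-reference bookkeeping might be sketched as follows
    (present-day Python; `remember` is an invented helper standing in
    for what adapt() would do internally):

```python
import weakref

# adapter -> adaptee; entries vanish when the adapter is collected,
# so the table never keeps adapters (or their adaptees) alive.
_adaptee_map = weakref.WeakKeyDictionary()

def remember(adapter, adaptee):
    _adaptee_map[adapter] = adaptee

def adaptee(adapter):
    # First the weak-reference table, then the __adaptee__ property.
    obj = _adaptee_map.get(adapter)
    if obj is not None:
        return obj
    return getattr(adapter, '__adaptee__', None)

class Original: pass
class Adapter: pass

orig, adpt = Original(), Adapter()
remember(adpt, orig)
print(adaptee(adpt) is orig)   # -> True
```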
-
  topic: sticky
  overview:
    Sticky adapters, that is, ones where there is only one instance
    per adaptee, are a common use case.  Should the registry of PEP 246
    provide this feature?
  proposal:
  - the adapt.register method has an optional argument, 'sticky',
    that defaults to False
  - if the given adapter factory is marked sticky, then a call
    to adapt() will first check to see if a given adapter (keyed
    by protocol) has been created for the adaptee; if so, then
    that adapter is returned, otherwise the factory is asked to
    produce an adapter and that adapter is cached.
  feedback:
    Is this useful, worth the complexity?  It seems like an easy
    operation.  The advantage to this approach (over each factory
    inheriting from a StickyFactory) is that registry queries can be
    done, to list sticky adapters and other bookkeeping chores.
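    The caching side of this is indeed an easy operation; a sketch in
    present-day Python (names and the weak-keyed cache layout are
    illustrative):

```python
import weakref

# adaptee -> {protocol: adapter}; weak keys keep the cache from
# pinning adaptees in memory.
_sticky_cache = weakref.WeakKeyDictionary()

def adapt_sticky(adaptee, protocol, factory):
    per_object = _sticky_cache.setdefault(adaptee, {})
    if protocol not in per_object:
        # First request: ask the factory, then cache the adapter.
        per_object[protocol] = factory(adaptee)
    return per_object[protocol]

class Target: pass
class Thing: pass

thing = Thing()
a1 = adapt_sticky(thing, Target, lambda obj: object())
a2 = adapt_sticky(thing, Target, lambda obj: object())
print(a1 is a2)   # -> True: one adapter instance per adaptee
```

    Keeping the cache in the registry, rather than in each factory,
    is what makes the bookkeeping queries mentioned above possible.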

Ok.  That's it.

Cheers,

Clark
--
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From pje at telecommunity.com  Sun Jan 16 05:36:00 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Jan 16 05:34:28 2005
Subject: [Python-Dev] PEP 246, Feedback Request
In-Reply-To: <20050116040424.GA76191@prometheusresearch.com>
References: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
	<CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <5.1.1.6.0.20050115231609.02fa4230@mail.telecommunity.com>

At 11:04 PM 1/15/05 -0500, Clark C. Evans wrote:
>   topic: a glossary
>   overview:
>     It seems that we are having difficulty with words that have shifting
>     definitions.  The next PEP edit will need to add a glossary that
>     nails down some meanings of these words.  Following are a few
>     proposed terms/meanings.

It would also be helpful to distinguish between 1-to-1 "as a" adapters, and 
1-to-many "view" adapters.  There isn't a really good terminology for this, 
but it's important at least as it relates to type declarations.


>   - Any exception other than adapt.AdaptException(TypeError)
>     causes the adapt() call to fail, and the exception to be
>     raised to the caller of adapt(); the 'default' value is not
>     returned in this case.
>   - At any stage of adaption, if adapt.AdaptException(TypeError) is
>     raised, then the adaptation process stops, as if None had been
>     returned from each stage.
>   - If all adaption stages return None, there are two cases.  If the
>     call to adapt() had a 'default' value, then this is returned;
>     otherwise, an adapt.AdaptException is raised.

-1; This allows unrelated AdaptExceptions to end up being silently 
caught.  These need to be two different exceptions if you want to support 
stages being able to "veto" adaptation.  Perhaps you should have a distinct 
VetoAdaptation error to support that use case.


>   topic: transitivity
>   ...
>   proposal:
>   ...
>   feedback:
>     I'm looking for warts in this plan, and verification if
>     something like this has been done -- comments how well
>     it works.  Alternative approaches?

I'll try to think some more about this one later, but I didn't see any 
obvious problems at first glance.


>   topic: declaration (aka Guido's syntax) and intrinsic adaption
>   overview:
>     Guido would like his type declaration syntax (see blog entry) to
>     be equivalent to a call to adapt() without any additional
>     arguments.  However, not all adapters should be created in the
>     context of a declaration -- some should be created more
>     explicitly.  We propose a mechanism where an adapter factory can
>     register itself as not suitable for the declaration syntax.

It would be much safer to have the reverse be the default; i.e., it should 
take special action to declare an adapter as being *suitable* for use with 
type declarations.

IOW, sticky intrinsic adapters should be the default, and volatile 
accessories should take an extra action to make them usable with type 
declarations.


>   feedback:
>     This is the simplest solution I heard on the list; the word
>     'intrinsic' was given by Alex.  Is there a better word?

Sadly, no.  I've been playing with words like "extender", "mask", 
"personality" etc. to try and find a name for a thing you only reasonably 
have one of, versus things you can have many of like "accessory", "add-on", 
etc.


>   topic: adaptee (aka origin)
>   overview:
>     There was discussion as to how to get back to the original
>     object from an adapter.  Is this in scope of PEP 246?
>   proposal:
>   - we specify an __adaptee__ property, to be optionally implemented
>     by an adapter that provides a reference adaptee
>   - the adapt.register method has an optional argument, 'adaptee',
>     that defaults to False; if it is True, adapt() calls will stuff
>     away into a weak-reference mapping from adapter to adaptee.
>   - an adapt.adaptee(adaptor) function which returns the given
>     adaptee for the adaptor; this first checks the weak-reference
>     table, and then checks for an __adaptee_
>   feedback:
>     Is this useful, worth the complexity?

This is tied directly to intrinsicness and stickiness.  If you are
intrinsic, you *must* have __adaptee__, so that adapt can re-adapt you 
safely.  If you are intrinsic, you *must* be stateless or 
sticky.  (Stateless can be considered an empty special case of 
"sticky")  So, you might be able to combine a lot of these options to make 
the interface cleaner.

Think of it this way: if the adapter is intrinsic, it's just a 
"personality" of the underlying object.  So you don't want to re-adapt a 
personality, instead you re-adapt the "original object".

But for a non-intrinsic adapter, the adapter is an independent object only 
incidentally related to the original adaptee, so it is now an "original 
object" of its own.



>   topic: sticky
>   overview:
>     Sticky adapters, that is, ones where there is only one instance
>     per adaptee is a common use case.  Should the registry of PEP 246
>     provide this feature?

Ideally, yes.


>   proposal:
>   - the adapt.register method has an optional argument, 'sticky',
>     that defaults to False

Make it default to whatever the 'intrinsic' setting is, because the only 
time you don't care for an intrinsic adapter is if the adapter is 
completely stateless.  Or, better yet, call it 'volatile' or something and 
default to False.  (I.e., you have to say you're willing to have it
volatile.)

If you get all of these features, it's going to come mighty close to the 
functionality I've written up in my PEP; the primary difference is that 
mine also includes a more concrete notion of "interface" and defines a way 
to create intrinsic adapter factories automatically, without having to 
write adapter classes.  For volatile/accessory adapters, you still have to 
write the classes, but that's sort of the point of such adapters.

From pje at telecommunity.com  Sun Jan 16 05:57:39 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Jan 16 05:56:08 2005
Subject: [Python-Dev] "Monkey Typing" pre-PEP, partial draft
Message-ID: <5.1.1.6.0.20050115235444.02f246f0@mail.telecommunity.com>

I just attempted to post the Monkey Typing draft pre-PEP, but it bounced 
due to being just barely over the size limit for the list.  :)

So, I'm just posting the preamble and abstract here for now, and a link to 
a Wiki page with the full text.  I hope the moderator will approve the 
actual posting soon so that replies can quote from the text.

=== original message ===

This is only a partial first draft, but the Motivation section nonetheless 
attempts to briefly summarize huge portions of the various discussions 
regarding adaptation, and to coin a hopefully more useful terminology than 
some of our older working adjectives like "sticky" and "stateless" and 
such.  And the specification gets as far as defining a simple 
decorator-based syntax for creating operational (prev. "stateless") and 
extension (prev. "per-object stateful") adapters.

I stopped when I got to the API for declaring volatile (prev. per-adapter 
stateful) adapters, and for enabling them to be used with type 
declarations, because Clark's post on his revisions-in-progress seem to 
indicate that this can probably be handled within the scope of PEP 246 
itself.  As such, this PEP should then be viewed more as an attempt to 
formulate how "intrinsic" adapters can be defined in Python code, without 
the need to manually create adapter classes for the majority of 
type-compatibility and "extension" use cases.  In other words, the 
implementation described herein could probably become part of the front-end 
for the PEP 246 adapter registry.

Feedback and corrections (e.g. if I've repeated myself somewhere, spelling, 
etc.) would be greatly appreciated.  This uses ReST markup heavily, so if 
you'd prefer to read an HTML version, please see:

http://peak.telecommunity.com/DevCenter/MonkeyTyping

But I'd prefer that corrections/discussion quote the relevant section so I 
know what parts you're talking about.  Also, if you find a place where a 
more concrete example would be helpful, please consider submitting one that 
I can add.  Thanks!


PEP: XXX
Title: "Monkey Typing" for Agile Type Declarations
Version: $Revision: X.XX $
Last-Modified: $Date: 2003/09/22 04:51:50 $
Author: Phillip J. Eby <pje@telecommunity.com>
Status: Draft
Type: Standards Track
Python-Version: 2.5
Content-Type: text/x-rst
Created: 15-Jan-2005
Post-History: 15-Jan-2005

Abstract
========

Python has always had "duck typing": a way of implicitly defining types by
the methods an object provides.  The name comes from the saying, "if it walks
like a duck and quacks like a duck, it must *be* a duck".  Duck typing has
enormous practical benefits for small and prototype systems.  For very large
frameworks, however, or applications that comprise multiple frameworks, some
limitations of duck typing can begin to show.

This PEP proposes an extension to "duck typing" called "monkey typing", that
preserves most of the benefits of duck typing, while adding new features to
enhance inter-library and inter-framework compatibility.  The name comes from
the saying, "Monkey see, monkey do", because monkey typing works by stating
how one object type may *mimic* specific behaviors of another object type.

Monkey typing can also potentially form the basis for more sophisticated type
analysis and improved program performance, as it is essentially a simplified
form of concepts that are also found in languages like Dylan and Haskell.  It
is also a straightforward extension of Java casting and COM's QueryInterface,
which should make it easier to represent those type systems' behaviors within
Python as well.

[see the web page above for the remaining text]

From aleax at aleax.it  Sun Jan 16 09:23:26 2005
From: aleax at aleax.it (Alex Martelli)
Date: Sun Jan 16 09:23:33 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com>
References: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
	<r01050400-1037-E1D04844674711D98C3B003065D5E7E4@[10.0.0.23]>
	<82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
	<5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com>
Message-ID: <E530176C-6797-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 16, at 03:17, Phillip J. Eby wrote:
    ...
> Uh oh.  I just used "view" to describe an iterator as a view on an 
> iterable, as distinct from an adapter that adapts a sequence so that 
> it's iterable.  :)
>
> I.e., using "view" in the MVC sense where a given Model might have 
> multiple independent Views.

I think that in order to do that you need to draw a distinction between 
two categories of iterables: so, again, a problem of terminology, but 
one connected to a conceptual difference.

An iterator IS-AN iterable: it has __iter__.  However, it can't have 
"multiple independent views"... except maybe if you use itertools.tee 
for that purpose.

Other iterables are, well, ``re-iterables'': each call to their 
__iter__ makes a new fresh iterator, and using that iterator won't 
alter the iterable's state.  In this case, viewing multiple iterators 
on the same re-iterable as akin to views on a model seems quite OK.

I can't think of any 3rd case -- an iterable that's not an iterator 
(__iter__ does not return self) but neither is it seamlessly 
re-iterable.  Perhaps the ``file'' built-in type as it was in 2.2 
suffered that problem, but it was a design problem, and is now fixed.
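The distinction can be shown concretely (in present-day Python):

```python
import itertools

data = [1, 2, 3]           # a re-iterable
it = iter(data)            # an iterator

print(iter(it) is it)              # -> True: an iterator returns itself
print(iter(data) is iter(data))    # -> False: a fresh iterator each call

# Consuming the iterator alters its own state, not the re-iterable's:
print(list(it))            # -> [1, 2, 3]
print(list(it))            # -> []   (exhausted)
print(list(data))          # -> [1, 2, 3]   (unaffected)

# itertools.tee gives a single iterator "multiple independent views":
a, b = itertools.tee(iter(data))
print(list(a), list(b))    # -> [1, 2, 3] [1, 2, 3]
```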


Alex

From robey at lag.net  Sun Jan 16 10:21:51 2005
From: robey at lag.net (Robey Pointer)
Date: Sun Jan 16 10:22:26 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <ca471dc205011517576655bc17@mail.gmail.com>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
	<ca471dc205011517576655bc17@mail.gmail.com>
Message-ID: <41EA322F.1080304@lag.net>

Guido van Rossum wrote:

>>The base of the Exception hierarchy happens to be a classic class.
>>But why are they "required" to be classic?
>>
>>More to the point, is this a bug, a missing feature, or just a bug in
>>the documentation for not mentioning the restriction?
>>    
>>
>
>It's an unfortunate feature; it should be mentioned in the docs; it
>should also be fixed, but fixing it isn't easy (believe me, or it
>would have been fixed in Python 2.2).
>
>To be honest, I don't recall the exact reasons why this wasn't fixed
>in 2.2; I believe it has something to do with the problem of
>distinguishing between string and class exception, and between the
>various forms of raise statements.
>
>I think the main ambiguity is raise "abc", which could be considered
>short for raise str, "abc", but that would be incompatible with except
>"abc". I also think that the right way out of there is to simply
>hardcode a check that says that raise "abc" raises a string exception
>and raising any other instance raises a class exception. But there's a
>lot of code that has to be changed.
>
>It's been suggested that all exceptions should inherit from Exception,
>but this would break tons of existing code, so we shouldn't enforce
>that until 3.0. (Is there a PEP for this? I think there should be.)
>  
>

There's actually a bug open on the fact that exceptions can't be 
new-style classes:

https://sourceforge.net/tracker/?func=detail&atid=105470&aid=518846&group_id=5470

I added some comments to try to stir it up but there ended up being a 
lot of confusion and I don't think I helped much.  The problem is that 
people want to solve the larger issues (raising strings, wanting to 
force all exceptions to be new-style, etc) but those all have long-term 
solutions, while the current bug just languishes.

robey

From martin at v.loewis.de  Sun Jan 16 10:27:12 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Jan 16 10:27:16 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <5C6D82A6-6763-11D9-BC9B-0003934AD54A@chello.se>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>	<ca471dc205011517576655bc17@mail.gmail.com>
	<5C6D82A6-6763-11D9-BC9B-0003934AD54A@chello.se>
Message-ID: <41EA3370.7000204@v.loewis.de>

Simon Percivall wrote:
> What would happen if Exception were made a new-style class, enforce
> inheritance from Exception for all new-style exceptions, and allow all
> old-style exceptions as before.

string exceptions would break.

In addition, code may break which assumes that exceptions are classic
instances, e.g. that they are picklable, have an __dict__, and so on.

 > Am I wrong in assuming that only the
> most esoteric exceptions inheriting from Exception would break by
> Exception becoming new-style?

Yes, I think so.

Regards,
Martin
From martin at v.loewis.de  Sun Jan 16 10:28:51 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Jan 16 10:28:54 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <5.1.1.6.0.20050115211847.034ce0f0@mail.telecommunity.com>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>	<fb6fbf560501141620eff6d85@mail.gmail.com>
	<5.1.1.6.0.20050115211847.034ce0f0@mail.telecommunity.com>
Message-ID: <41EA33D3.8080102@v.loewis.de>

Phillip J. Eby wrote:
> Couldn't we require new-style exceptions to inherit from Exception?  
> Since there are no new-style exceptions that work now, this can't break 
> existing code.

This would require making Exception a new-style class, right?
This, in itself, could break existing code.

Regards,
Martin
From aleax at aleax.it  Sun Jan 16 10:38:16 2005
From: aleax at aleax.it (Alex Martelli)
Date: Sun Jan 16 10:38:22 2005
Subject: [Python-Dev] how to test behavior wrt an extension type?
Message-ID: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it>

copy.py, as recently discussed starting from a post by /F, has two 
kinds of misbehaviors since 2.3 (possibly 2.2, haven't checked), both 
connected to instance/type/metatype confusion (where do special methods 
come from? in classic classes and types, from the instance, which may 
delegate to the type/class; in newstype one, from the class/type which 
_must not_ delegate to the metaclass): type/metatype confusion, and 
misbehavior with instances of extension types.

So, as per discussion here, I have prepared a patch (to the maintenance 
branch of 2.3, to start with) which adds unit tests to highlight these 
issues, and fixes them in copy.py.  This patch should go in the 
maintenance of 2.3 and 2.4, but in 2.5 a different approach based on 
new special descriptors for special methods is envisaged (though 
keeping compatibility with classic extension types may also require 
some patching to copy.py along the lines of my patch).

Problem: to write unit tests showing that the current copy.py 
misbehaves with a classic extension type, I need a classic extension 
type which defines __copy__ and __deepcopy__ just like /F's 
cElementTree does.  So, I made one: a small trycopy.c and accompanying 
setup.py whose only purpose in life is checking that instances of a 
classic type get copied correctly, both shallowly and deeply.  But now 
-- where do I commit this extension type, so that the unit tests in 
test_copy.py can do their job...?

Right now I've finessed the issue by having a try/except ImportError in 
the two relevant unit tests (for copy and deepcopy) -- if the trycopy 
module is not available I just don't test how its instances behave 
under deep or shallow copying.  However, if I just commit or send the 
patch as-is, without putting trycopy.c and its setup.py somewhere, then 
I'm basically doing a fix without unit tests to back it up, from the 
point of view of anybody but myself.

I do not know what the recommended practice is for this kind of issues, 
so, I'm asking for guidance (and specifically asking Anthony since my 
case deals with 2.3 and 2.4 maintenance and he's release manager for 
both, but, of course, everybody's welcome to help!).  Surely this can't 
be the first case in which a bug got triggered only by a certain 
behavior in an extension type, but I couldn't find precedents.  Ideas, 
suggestions, ...?


Alex

From aleax at aleax.it  Sun Jan 16 10:44:14 2005
From: aleax at aleax.it (Alex Martelli)
Date: Sun Jan 16 10:44:20 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <41EA3370.7000204@v.loewis.de>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>	<ca471dc205011517576655bc17@mail.gmail.com>
	<5C6D82A6-6763-11D9-BC9B-0003934AD54A@chello.se>
	<41EA3370.7000204@v.loewis.de>
Message-ID: <2EEBC8AF-67A3-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 16, at 10:27, Martin v. L?wis wrote:

> Simon Percivall wrote:
>> What would happen if Exception were made a new-style class, enforce
>> inheritance from Exception for all new-style exceptions, and allow all
>> old-style exceptions as before.
>
> string exceptions would break.

Couldn't we just specialcase strings specifically, to keep 
grandfathering them in?

> In addition, code may break which assumes that exceptions are classic
> instances, e.g. that they are picklable, have an __dict__, and so on.

There would be no problem giving the new
class Exception(object): ...
a __dict__ and the ability to get pickled, particularly since both come 
by default.

The "and so on" would presumably refer to whether special methods 
should be looked up on the instance or the type.  But as I understand 
the question (raised in the threads about copy.py) the planned solution 
is to make special methods their own kind of descriptors, so even that 
esoteric issue could well be finessed.

> > Am I wrong in assuming that only the
>> most esoteric exceptions inheriting from Exception would break by
>> Exception becoming new-style?
>
> Yes, I think so.

It seems to me that if the new-style Exception is made very normally 
and strings are grandfathered in, we ARE down to esoteric breakage 
cases (potentially fixable by those new magic descriptors as above for 
special methods).


Alex

From aleax at aleax.it  Sun Jan 16 10:47:34 2005
From: aleax at aleax.it (Alex Martelli)
Date: Sun Jan 16 10:47:38 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <41EA33D3.8080102@v.loewis.de>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>	<fb6fbf560501141620eff6d85@mail.gmail.com>
	<5.1.1.6.0.20050115211847.034ce0f0@mail.telecommunity.com>
	<41EA33D3.8080102@v.loewis.de>
Message-ID: <A5D2B02C-67A3-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 16, at 10:28, Martin v. L?wis wrote:

> Phillip J. Eby wrote:
>> Couldn't we require new-style exceptions to inherit from Exception?  
>> Since there are no new-style exceptions that work now, this can't 
>> break existing code.
>
> This would require to make Exception a new-style class, right?

Not necessarily, since Python supports multiple inheritance:

class MyException(Exception, object): .....

there -- a newstyle exception class inheriting from oldstyle Exception. 
  (ClassType goes to quite some trouble to allow this, getting the 
metaclass from _following_ bases if any).

Without inheritance you might similarly say:

class AnotherOne(Exception):
     __metaclass__ = type
     ...

> This, in itself, could break existing code.

Not necessarily, see my previous post.  But anyway, PJE's proposal is 
less invasive than making Exception itself newstyle.


Alex

From martin at v.loewis.de  Sun Jan 16 11:05:20 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Jan 16 11:05:23 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <2EEBC8AF-67A3-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>	<ca471dc205011517576655bc17@mail.gmail.com>
	<5C6D82A6-6763-11D9-BC9B-0003934AD54A@chello.se>
	<41EA3370.7000204@v.loewis.de>
	<2EEBC8AF-67A3-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <41EA3C60.1060801@v.loewis.de>

Alex Martelli wrote:
>>> What would happen if Exception were made a new-style class, enforce
>>> inheritance from Exception for all new-style exceptions, and allow all
>>> old-style exceptions as before.
>>
>>
>> string exceptions would break.
> 
> 
> Couldn't we just specialcase strings specifically, to keep 
> grandfathering them in?

Sure. That just wouldn't be the change that Simon described, anymore.

You don't specify in which way you would like to specialcase strings.
Two alternatives are possible:
1. Throwing strings is still allowed, and to catch them, you need
    the identical string (i.e. the current behaviour)
2. Throwing strings is allowed, and they can be caught by either
    the identical string, or by catching str
In the context of Simon's proposal, the first alternative would
be more meaningful, I guess.

> The "and so on" would presumably refer to whether special methods should 
> be looked up on the instance or the type.  

Perhaps.  The fact that type(exc) changes might also cause problems.

> It seems to me that if the new-style Exception is made very normally and 
> strings are grandfathered in, we ARE down to esoteric breakage cases 
> (potentially fixable by those new magic descriptors as above for 
> special methods).

This would be worth a try. Does anybody have a patch to implement it?

Regards,
Martin
From python at rcn.com  Sun Jan 16 11:17:52 2005
From: python at rcn.com (Raymond Hettinger)
Date: Sun Jan 16 11:21:45 2005
Subject: [Python-Dev] how to test behavior wrt an extension type?
In-Reply-To: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <000201c4fbb4$b31995e0$68fdcc97@oemcomputer>

[Alex]
> So, as per discussion here, I have prepared a patch (to the
> maintenance
> branch of 2.3, to start with) which adds unit tests to highlight these
> issues, and fixes them in copy.py.  This patch should go in the
> maintenance of 2.3 and 2.4, but in 2.5 a different approach based on
> new special descriptors for special methods is envisaged (though
> keeping compatibility with classic extension types may also require
> some patching to copy.py along the lines of my patch).

For Py2.5, do you have in mind changing something other than copy.py?
If so, please outline your plan.  I hope you're not planning on wrapping
all special method access as descriptor look-ups -- that would be a
somewhat radical change.


Raymond

From fredrik at pythonware.com  Sun Jan 16 12:03:42 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun Jan 16 12:03:34 2005
Subject: [Python-Dev] Re: how to test behavior wrt an extension type?
References: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <csdhm0$ude$1@sea.gmane.org>

Alex Martelli wrote:

> Problem: to write unit tests showing that the current copy.py misbehaves with a classic extension 
> type, I need a classic extension type which defines __copy__ and __deepcopy__ just like /F's 
> cElementTree does.  So, I made one: a small trycopy.c and accompanying setup.py whose only purpose 
> in life is checking that instances of a classic type get copied correctly, both shallowly and 
> deeply.  But now -- where do I commit this extension type, so that the unit tests in test_copy.py 
> can do their job...?

Modules/_testcapimodule.c ?

(I'm using the C API to define an extension type, after all...)

</F> 



From aleax at aleax.it  Sun Jan 16 12:37:33 2005
From: aleax at aleax.it (Alex Martelli)
Date: Sun Jan 16 12:37:39 2005
Subject: [Python-Dev] how to test behavior wrt an extension type?
In-Reply-To: <000201c4fbb4$b31995e0$68fdcc97@oemcomputer>
References: <000201c4fbb4$b31995e0$68fdcc97@oemcomputer>
Message-ID: <0365E8C6-67B3-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 16, at 11:17, Raymond Hettinger wrote:

> [Alex]
>> So, as per discussion here, I have prepared a patch (to the
>> maintenance
>> branch of 2.3, to start with) which adds unit tests to highlight these
>> issues, and fixes them in copy.py.  This patch should go in the
>> maintenance of 2.3 and 2.4, but in 2.5 a different approach based on
>> new special descriptors for special methods is envisaged (though
>> keeping compatibility with classic extension types may also require
>> some patching to copy.py along the lines of my patch).
>
> For Py2.5, do you have in mind changing something other than copy.py?
> If so, please outline your plan.  I hope you're not planning on wrapping
> all special method access as descriptor look-ups -- that would be a
> somewhat radical change.

The overall plan does appear to be exactly the "somewhat radical 
change" which you hope is not being proposed, except it's not my plan 
-- it's Guido's.  Quoting his first relevant post on the subject:
'''
	From: 	  gvanrossum@gmail.com
	Subject: 	Re: getting special from type, not instance (was Re: 
[Python-Dev] copy confusion)
	Date: 	2005 January 12 18:59:13 CET
       ...
I wonder if the following solution wouldn't be more useful (since less
code will have to be changed).

The descriptor for __getattr__ and other special attributes could
claim to be a "data descriptor" which means that it gets first pick
*even if there's also a matching entry in the instance __dict__*.
Quick illustrative example:

>>> class C(object):
...     foo = property(lambda self: 42)  # a property is always a "data descriptor"
...
>>> a = C()
>>> a.foo
42
>>> a.__dict__["foo"] = "hello"
>>> a.foo
42
>>>

Normal methods are not data descriptors, so they can be overridden by
something in __dict__; but it makes some sense that for methods
implementing special operations like __getitem__ or __copy__, where
the instance __dict__ is already skipped when the operation is invoked
using its special syntax, it should also be skipped by explicit
attribute access (whether getattr(x, "__getitem__") or x.__getitem__
-- these are entirely equivalent).

We would need to introduce a new decorator so that classes overriding
these methods can also make those methods "data descriptors", and so
that users can define their own methods with this special behavior
(this would be needed for __copy__, probably).

I don't think this will cause any backwards compatibility problems --
since putting a __getitem__ in an instance __dict__ doesn't override
the x[y] syntax, it's unlikely that anybody would be using this.
"Ordinary" methods will still be overridable.

PS. The term "data descriptor" now feels odd, perhaps we can say "hard
descriptors" instead. Hard descriptors have a __set__ method in
addition to a __get__ method (though the __set__ method may always
raise an exception, to implement a read-only attribute).
'''
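For concreteness, the behaviour described in the quoted post can be
sketched as follows (the ``DataMethod`` name and decorator are purely
illustrative, not a proposed API):

```python
class DataMethod(object):
    # Illustrative only: wraps a function as a *data* descriptor (it
    # defines __set__ as well as __get__), so an entry in the instance
    # __dict__ can no longer shadow the method.
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self.func
        # functions are descriptors; this binds func to obj
        return self.func.__get__(obj, objtype)

    def __set__(self, obj, value):
        # a read-only attribute, as the PS in the quoted post suggests
        raise AttributeError("read-only special method")

class C(object):
    @DataMethod
    def __copy__(self):
        return "copied"

c = C()
c.__dict__["__copy__"] = lambda: "shadowed"  # ignored by attribute lookup
print(c.__copy__())  # the data descriptor wins: prints "copied"
```
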

All following discussion was, I believe, in the same thread, mostly 
among Guido, Phillip and Armin.  I'm focusing on getting copy.py fixed 
in 2.3 and 2.4, w/o any plan yet to implement Guido's idea.  If you 
dislike Guido's idea (which Phillip, Armin and I all liked, in 
different degrees), it might be best for you to read that other thread 
and explain the issues there, I think.


Alex

From aleax at aleax.it  Sun Jan 16 12:39:49 2005
From: aleax at aleax.it (Alex Martelli)
Date: Sun Jan 16 12:39:53 2005
Subject: [Python-Dev] Re: how to test behavior wrt an extension type?
In-Reply-To: <csdhm0$ude$1@sea.gmane.org>
References: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it>
	<csdhm0$ude$1@sea.gmane.org>
Message-ID: <540CDCFE-67B3-11D9-ADA4-000A95EFAE9E@aleax.it>


On 2005 Jan 16, at 12:03, Fredrik Lundh wrote:

> Alex Martelli wrote:
>
>> Problem: to write unit tests showing that the current copy.py 
>> misbehaves with a classic extension
>> type, I need a classic extension type which defines __copy__ and 
>> __deepcopy__ just like /F's
>> cElementTree does.  So, I made one: a small trycopy.c and 
>> accompanying setup.py whose only purpose
>> in life is checking that instances of a classic type get copied 
>> correctly, both shallowly and
>> deeply.  But now -- where do I commit this extension type, so that 
>> the unit tests in test_copy.py
>> can do their job...?
>
> Modules/_testcapimodule.c ?
>
> (I'm using the C api to define an extension type, after all...)

Fine with me, if there are no objections I'll alter the patch 
accordingly and submit it that way.


Thanks,

Alex

From pje at telecommunity.com  Sun Jan 16 16:18:21 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Jan 16 16:16:50 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <41EA33D3.8080102@v.loewis.de>
References: <5.1.1.6.0.20050115211847.034ce0f0@mail.telecommunity.com>
	<fb6fbf560501141620eff6d85@mail.gmail.com>
	<fb6fbf560501141620eff6d85@mail.gmail.com>
	<5.1.1.6.0.20050115211847.034ce0f0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050116101727.030e9a30@mail.telecommunity.com>

At 10:28 AM 1/16/05 +0100, Martin v. Löwis wrote:
>Phillip J. Eby wrote:
>>Couldn't we require new-style exceptions to inherit from Exception?
>>Since there are no new-style exceptions that work now, this can't break 
>>existing code.
>
>This would require to make Exception a new-style class, right?

>>> class MyException(Exception, object): pass

>>>

Not as far as I can see, no.

From irmen at xs4all.nl  Sun Jan 16 17:08:54 2005
From: irmen at xs4all.nl (Irmen de Jong)
Date: Sun Jan 16 17:08:52 2005
Subject: [Python-Dev] a bunch of Patch reviews
Message-ID: <41EA9196.1020709@xs4all.nl>

Hello
I've looked at one bug and a bunch of patches and
added a comment to them:

(bug) [ 1102649 ] pickle files should be opened in binary mode
Added a comment about a possible different solution

[ 946207 ] Non-blocking Socket Server
Useless, what are the mixins for? Recommend close

[ 756021 ] Allow socket.inet_aton('255.255.255.255') on Windows
Looks good but added suggestion about when to test for special case

[ 740827 ] add urldecode() method to urllib
I think it's better to group these things into urlparse

[ 579435 ] Shadow Password Support Module
Would be nice to have; I recently couldn't do the user
authentication I wanted, based on the users' Unix passwords

[ 1093468 ] socket leak in SocketServer
Trivial and looks harmless, but don't the sockets
get garbage collected once the request is done?

[ 1049151 ] adding bool support to xdrlib.py
Simple patch and 2.4 is out now, so...



It would be nice if somebody could have a look at my
own patches or help me a bit with them:

[ 1102879 ] Fix for 926423: socket timeouts + Ctrl-C don't play nice
[ 1103213 ] Adding the missing socket.recvall() method
[ 1103350 ] send/recv SEGMENT_SIZE should be used more in socketmodule
[ 1062014 ] fix for 764437 AF_UNIX socket special linux socket names
[ 1062060 ] fix for 1016880 urllib.urlretrieve silently truncates dwnld

Some of them come from the last Python Bug Day, see
http://www.python.org/moin/PythonBugDayStatus


Thank you !

Regards,

--Irmen de Jong
From gvanrossum at gmail.com  Sun Jan 16 18:15:35 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun Jan 16 18:15:38 2005
Subject: [Python-Dev] PEP 246: let's reset
In-Reply-To: <E530176C-6797-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <r01050400-1037-E1D04844674711D98C3B003065D5E7E4@10.0.0.23>
	<82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
	<5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com>
	<E530176C-6797-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <ca471dc20501160915778b4eca@mail.gmail.com>

The various PEP 246 threads are dead AFAIC -- I won't ever have the
time to read them in full length, and because I haven't followed them
I don't get much of the discussion that's still going on.

I hear that Clark and Alex are going to do a revision of the PEP; I'm
looking forward to the results.

In the mean time, here's a proposal to reduce the worries about
implicit adaptation: let's not do it!

Someone posted a new suggestion to my blog: it would be good if an
optimizing compiler (or a lazy one) would be allowed to ignore all
type declarations, and the program should behave the same
(except for things like catching TypeError). This rules out using
adapt() for type declarations, and we're back to pure typechecking.
Given the many and various issues with automatic adaptation
(transitivity, lossiness, statelessness, and apparently more still)
that might be a better approach.

Typechecking can be trivially defined in terms of adaptation:

def typecheck(x, T):
    y = adapt(x, T)
    if y is x:
        return y
    raise TypeError("...")
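A runnable version of this relationship, with a minimal stand-in for
the PEP 246 ``adapt()`` (no adapter registry, just a pass-through
check; purely illustrative):

```python
def adapt(x, T):
    # Minimal stand-in for PEP 246 adapt(): pass through objects that
    # already conform to the type, refuse everything else.
    if isinstance(x, T):
        return x
    raise TypeError("cannot adapt %r to %s" % (x, T.__name__))

def typecheck(x, T):
    # Pure typechecking is adaptation that must hand back the
    # original object unchanged.
    y = adapt(x, T)
    if y is x:
        return y
    raise TypeError("adaptation substituted a different object")

print(typecheck(42, int))  # prints 42
```
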

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From pje at telecommunity.com  Sun Jan 16 19:00:27 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Jan 16 18:58:58 2005
Subject: [Python-Dev] PEP 246: let's reset
In-Reply-To: <ca471dc20501160915778b4eca@mail.gmail.com>
References: <E530176C-6797-11D9-ADA4-000A95EFAE9E@aleax.it>
	<r01050400-1037-E1D04844674711D98C3B003065D5E7E4@10.0.0.23>
	<82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
	<5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com>
	<E530176C-6797-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com>

At 09:15 AM 1/16/05 -0800, Guido van Rossum wrote:
>Given the many and various issues with automatic adaptation
>(transitivity, lossiness, statelessness, and apparently more still)
>that might be a better approach.

Actually, I think Clark, Alex, and I are rapidly converging on a relatively 
simple common model to explain all this stuff, with only two kinds of 
adaptation covering everything we've discussed to date in a reasonable 
way.  My most recent version of my pre-PEP (not yet posted) explains the 
two kinds of adaptation in this way:

"""One type is the "extender", whose purpose is to extend the capability of 
an object or allow it to masquerade as another type of object.  An 
"extender" is not truly an object unto itself, merely a kind of "alternate 
personality" for the object it adapts.  For example, a power transformer 
might be considered an "extender" for a power outlet, because it allows the 
power to be used with different devices than it would otherwise be usable for.

By contrast, an "independent adapter" is an object that provides entirely 
different capabilities from the object it adapts, and therefore is truly an 
object in its own right.  While it only makes sense to have one extender of 
a given type for a given base object, you may have as many instances of an 
independent adapter as you like for the same base object.

For example, Python iterators are independent adapters, as are views in a 
model-view-controller framework, since each iterable may have many 
iterators in existence, each with its own independent state.  Resuming the 
previous analogy of a power outlet, you may consider independent adapters 
to be like appliances: you can plug more than one lamp into the same 
outlet, and different lamps may be on or off at a given point in 
time.  Many appliances may come and go over the lifetime of the power 
outlet -- there is no inherent connection between them because the 
appliances are independent objects rather than mere extensions of the power 
outlet."""
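The iterator example from the quoted passage can be demonstrated
directly (an illustrative aside, not part of the pre-PEP text):

```python
# Each iterator is an "independent adapter" over the same base object:
# it carries its own state, and many may coexist for one iterable.
base = [1, 2, 3]
it_a = iter(base)
it_b = iter(base)
next(it_a)
next(it_a)
assert next(it_b) == 1  # it_b's position is independent of it_a's
assert next(it_a) == 3
```
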

I then go on to propose that extenders be automatically allowed for use 
with type declaration, but that independent adapters should require 
additional red tape (e.g. an option when registering) to do so.  (An 
explicit 'adapt()' call should allow either kind of adapter, 
however.)  Meanwhile, no adapt() call should adapt an extender; it should 
instead adapt the extended object.  Clark and Alex have proposed changes to 
PEP 246 that would support these proposals within the scope of the 
'adapt()' system, and I have pre-PEPped an add-on to their system that 
allows extenders to be automatically assembled from "@like" decorators 
sprinkled over methods or extension routines.  My proposal also does away 
with the need to have a special interface type to support extender-style 
adaptation.  (I.e., it could supersede PEP 245, because interfaces can then
simply be abstract classes or just "like" concrete classes.)

From pje at telecommunity.com  Sun Jan 16 19:11:14 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Jan 16 19:09:45 2005
Subject: [Python-Dev] Updated Monkey Typing pre-PEP
Message-ID: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com>

I've revised the draft today to simplify the terminology, discussing only 
two broad classes of adapters.  Since Clark's pending proposals for PEP 246 
align well with the concept of "extenders" vs. "independent adapters", I've 
refocused my PEP to focus exclusively on adding support for "extenders", 
since PEP 246 already provides everything needed for independent adapters.

The new draft is here:
http://peak.telecommunity.com/DevCenter/MonkeyTyping

And you can view diffs from the previous version(s) here:
http://peak.telecommunity.com/DevCenter/MonkeyTyping?action=info

From kbk at shore.net  Sun Jan 16 20:24:37 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Sun Jan 16 20:25:07 2005
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200501161924.j0GJObqo029011@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  272 open ( +5) /  2737 closed (+10) /  3009 total (+15)
Bugs    :  793 open ( -5) /  4777 closed (+29) /  5570 total (+24)
RFE     :  165 open ( +0) /   141 closed ( +1) /   306 total ( +1)

New / Reopened Patches
______________________

Enhance tracebacks and stack traces with vars  (2005-01-08)
       http://python.org/sf/1098732  opened by  Skip Montanaro

Single-line option to pygettext.py  (2005-01-09)
       http://python.org/sf/1098749  opened by  Martin Blais

improved smtp connect debugging  (2005-01-11)
CLOSED http://python.org/sf/1100140  opened by  Wummel

Log gc times when DEBUG_STATS set  (2005-01-11)
       http://python.org/sf/1100294  opened by  Skip Montanaro

deepcopying listlike and dictlike objects  (2005-01-12)
       http://python.org/sf/1100562  opened by  Björn Lindqvist

ast-branch: fix for coredump from new import grammar  (2005-01-11)
       http://python.org/sf/1100563  opened by  logistix

datetime.strptime constructor added  (2005-01-12)
       http://python.org/sf/1100942  opened by  Josh

Feed style codec API  (2005-01-12)
       http://python.org/sf/1101097  opened by  Walter Dörwald

Patch for potential buffer overrun in tokenizer.c  (2005-01-13)
       http://python.org/sf/1101726  opened by  Greg Chapman

ast-branch: hacks so asdl_c.py generates compilable code  (2005-01-14)
       http://python.org/sf/1102710  opened by  logistix

Fix for 926423: socket timeouts + Ctrl-C don't play nice  (2005-01-15)
       http://python.org/sf/1102879  opened by  Irmen de Jong

Boxing up PyDECREF correctly  (2005-01-15)
CLOSED http://python.org/sf/1103046  opened by  Norbert Nemec

AF_NETLINK sockets basic support  (2005-01-15)
       http://python.org/sf/1103116  opened by  Philippe Biondi

Adding the missing socket.recvall() method  (2005-01-16)
       http://python.org/sf/1103213  opened by  Irmen de Jong

tarfile.py: fix for bug #1100429  (2005-01-16)
       http://python.org/sf/1103407  opened by  Lars Gustäbel

Patches Closed
______________

pydoc data descriptor unification  (2004-04-17)
       http://python.org/sf/936774  closed by  jlgijsbers

xml.dom missing API docs (bugs 1010196, 1013525)  (2004-10-21)
       http://python.org/sf/1051321  closed by  jlgijsbers

Fix for bug 1017546  (2004-08-27)
       http://python.org/sf/1017550  closed by  jlgijsbers

fixes urllib2 digest to allow arbitrary methods  (2005-01-04)
       http://python.org/sf/1095362  closed by  jlgijsbers

Bug fix 548176: urlparse('http://foo?blah') errs  (2003-03-30)
       http://python.org/sf/712317  closed by  jlgijsbers

bug fix 702858: deepcopying reflexive objects  (2003-03-22)
       http://python.org/sf/707900  closed by  jlgijsbers

minor codeop fixes  (2003-05-15)
       http://python.org/sf/737999  closed by  jlgijsbers

SimpleHTTPServer reports wrong content-length for text files  (2003-11-10)
       http://python.org/sf/839496  closed by  jlgijsbers

improved smtp connect debugging  (2005-01-11)
       http://python.org/sf/1100140  closed by  jlgijsbers

Boxing up PyDECREF correctly  (2005-01-15)
       http://python.org/sf/1103046  closed by  rhettinger

New / Reopened Bugs
___________________

socket.setdefaulttimeout() breaks smtplib.starttls()  (2005-01-08)
       http://python.org/sf/1098618  opened by  Matthew Cowles

set objects cannot be marshalled  (2005-01-09)
CLOSED http://python.org/sf/1098985  opened by  Gregory H. Ball

codec readline() splits lines apart  (2005-01-09)
CLOSED http://python.org/sf/1098990  opened by  Irmen de Jong

Optik OptionParse important undocumented option  (2005-01-10)
       http://python.org/sf/1099324  opened by  ncouture

refman doesn't know about universal newlines  (2005-01-10)
       http://python.org/sf/1099363  opened by  Jack Jansen

raw_input() displays wrong unicode prompt  (2005-01-10)
       http://python.org/sf/1099364  opened by  Petr Prikryl

tempfile files not types.FileType  (2005-01-10)
CLOSED http://python.org/sf/1099516  opened by  Frans van Nieuwenhoven

copy.deepcopy barfs when copying a class derived from dict  (2005-01-10)
       http://python.org/sf/1099746  opened by  Doug Winter

Cross-site scripting on BaseHTTPServer  (2005-01-11)
       http://python.org/sf/1100201  opened by  Paul Johnston

Scripts started with CGIHTTPServer: missing cgi environment  (2005-01-11)
       http://python.org/sf/1100235  opened by  pacote

Frame does not receive configure event on move  (2005-01-11)
       http://python.org/sf/1100366  opened by  Anand Kameswaran

Wrong "type()" syntax in docs  (2005-01-11)
       http://python.org/sf/1100368  opened by  Facundo Batista

TarFile iteration can break (on Windows) if file has links  (2005-01-11)
       http://python.org/sf/1100429  opened by  Greg Chapman

Python Interpreter shell is crashed   (2005-01-12)
       http://python.org/sf/1100673  opened by  abhishek

test_fcntl fails on netbsd2  (2005-01-12)
       http://python.org/sf/1101233  opened by  Mike Howard

test_shutil fails on NetBSD 2.0  (2005-01-12)
CLOSED http://python.org/sf/1101236  opened by  Mike Howard

dict subclass breaks cPickle noload()  (2005-01-13)
       http://python.org/sf/1101399  opened by  Neil Schemenauer

popen3 on windows loses environment variables  (2005-01-13)
       http://python.org/sf/1101667  opened by  June Kim

popen4/cygwin ssh hangs  (2005-01-13)
       http://python.org/sf/1101756  opened by  Ph.E

% operator bug  (2005-01-14)
CLOSED http://python.org/sf/1102141  opened by  ChrisF

rfc822 Deprecated since release 2.3?  (2005-01-14)
       http://python.org/sf/1102469  opened by  Wai Yip Tung

pickle files should be opened in binary mode  (2005-01-15)
       http://python.org/sf/1102649  opened by  John Machin

Incorrect RFC 2231 decoding  (2005-01-15)
       http://python.org/sf/1102973  opened by  Barry A. Warsaw

raw_input problem with readline and UTF8  (2005-01-15)
       http://python.org/sf/1103023  opened by  Casey Crabb

send/recv SEGMENT_SIZE should be used more in socketmodule  (2005-01-16)
       http://python.org/sf/1103350  opened by  Irmen de Jong

Bugs Closed
___________

typo in "Python Tutorial": 1. Whetting your appetite  (2005-01-08)
       http://python.org/sf/1098497  closed by  jlgijsbers

xml.dom documentation omits hasAttribute, hasAttributeNS  (2004-08-16)
       http://python.org/sf/1010196  closed by  jlgijsbers

xml.dom documentation omits createDocument, ...DocumentType  (2004-08-21)
       http://python.org/sf/1013525  closed by  jlgijsbers

Documentation of DOMImplmentation lacking  (2004-07-15)
       http://python.org/sf/991805  closed by  jlgijsbers

wrong documentation for popen2  (2004-01-29)
       http://python.org/sf/886619  closed by  jlgijsbers

test_inspect.py fails to clean up upon failure  (2004-08-27)
       http://python.org/sf/1017546  closed by  jlgijsbers

weird/buggy inspect.getsource behaviour  (2003-07-11)
       http://python.org/sf/769569  closed by  jlgijsbers

SimpleHTTPServer sends wrong Content-Length header  (2005-01-07)
       http://python.org/sf/1097597  closed by  jlgijsbers

urllib2: improper capitalization of headers  (2004-07-19)
       http://python.org/sf/994101  closed by  jlgijsbers

urlparse doesn't handle host?bla  (2002-04-24)
       http://python.org/sf/548176  closed by  jlgijsbers

set objects cannot be marshalled  (2005-01-09)
       http://python.org/sf/1098985  closed by  rhettinger

codec readline() splits lines apart  (2005-01-09)
       http://python.org/sf/1098990  closed by  doerwalter

tempfile files not types.FileType  (2005-01-10)
       http://python.org/sf/1099516  closed by  rhettinger

for lin in file: file.tell() tells wrong  (2002-11-29)
       http://python.org/sf/645594  closed by  facundobatista

Py_Main() does not perform to spec  (2003-01-21)
       http://python.org/sf/672035  closed by  facundobatista

Incorrect permissions set in lib-dynload.  (2003-02-04)
       http://python.org/sf/680379  closed by  facundobatista

Apple-installed Python fails to build extensions  (2005-01-04)
       http://python.org/sf/1095822  closed by  jackjansen

test_shutil fails on NetBSD 2.0  (2005-01-12)
       http://python.org/sf/1101236  closed by  jlgijsbers

CSV reader does not parse Mac line endings  (2003-08-16)
       http://python.org/sf/789519  closed by  andrewmcnamara

Bugs in _csv module - lineterminator  (2004-11-24)
       http://python.org/sf/1072404  closed by  andrewmcnamara

% operator bug  (2005-01-14)
       http://python.org/sf/1102141  closed by  rhettinger

test_atexit fails in directories with spaces  (2003-03-18)
       http://python.org/sf/705792  closed by  facundobatista

SEEK_{SET,CUR,END} missing in 2.2.2  (2003-03-29)
       http://python.org/sf/711830  closed by  loewis

CGIHTTPServer cannot manage cgi in sub directories  (2003-07-28)
       http://python.org/sf/778804  closed by  facundobatista

double symlinking corrupts sys.path[0]  (2003-08-24)
       http://python.org/sf/794291  closed by  facundobatista

popen3 under threads reports different stderr results  (2003-12-09)
       http://python.org/sf/856706  closed by  facundobatista

Signals discard one level of exception handling  (2003-07-15)
       http://python.org/sf/771429  closed by  facundobatista

build does not respect --prefix  (2002-10-27)
       http://python.org/sf/629345  closed by  facundobatista

urllib2 proxyhandle won't work.  (2001-11-30)
       http://python.org/sf/487471  closed by  facundobatista

RFE Closed
__________

popen does not like filenames with spaces  (2003-07-20)
       http://python.org/sf/774546  closed by  rhettinger

From pje at telecommunity.com  Sun Jan 16 05:51:59 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 17 00:14:57 2005
Subject: [Python-Dev] "Monkey Typing" pre-PEP, partial draft
Message-ID: <5.1.1.6.0.20050115170350.020fe6a0@mail.telecommunity.com>

This is only a partial first draft, but the Motivation section nonetheless 
attempts to briefly summarize huge portions of the various discussions 
regarding adaptation, and to coin a hopefully more useful terminology than 
some of our older working adjectives like "sticky" and "stateless" and 
such.  And the specification gets as far as defining a simple 
decorator-based syntax for creating operational (prev. "stateless") and 
extension (prev. "per-object stateful") adapters.

I stopped when I got to the API for declaring volatile (prev. per-adapter 
stateful) adapters, and for enabling them to be used with type 
declarations, because Clark's post on his revisions-in-progress seem to 
indicate that this can probably be handled within the scope of PEP 246 
itself.  As such, this PEP should then be viewed more as an attempt to 
formulate how "intrinsic" adapters can be defined in Python code, without 
the need to manually create adapter classes for the majority of 
type-compatibility and "extension" use cases.  In other words, the 
implementation described herein could probably become part of the front-end 
for the PEP 246 adapter registry.

Feedback and corrections (e.g. if I've repeated myself somewhere, spelling, 
etc.) would be greatly appreciated.  This uses ReST markup heavily, so if 
you'd prefer to read an HTML version, please see:

http://peak.telecommunity.com/DevCenter/MonkeyTyping

But I'd prefer that corrections/discussion quote the relevant section so I 
know what parts you're talking about.  Also, if you find a place where a 
more concrete example would be helpful, please consider submitting one that 
I can add.  Thanks!


PEP: XXX
Title: "Monkey Typing" for Agile Type Declarations
Version: $Revision: X.XX $
Last-Modified: $Date: 2003/09/22 04:51:50 $
Author: Phillip J. Eby <pje@telecommunity.com>
Status: Draft
Type: Standards Track
Python-Version: 2.5
Content-Type: text/x-rst
Created: 15-Jan-2005
Post-History: 15-Jan-2005


Abstract
========

Python has always had "duck typing": a way of implicitly defining types by
the methods an object provides.  The name comes from the saying, "if it walks
like a duck and quacks like a duck, it must *be* a duck".  Duck typing has
enormous practical benefits for small and prototype systems.  For very large
frameworks, however, or applications that comprise multiple frameworks, some
limitations of duck typing can begin to show.

This PEP proposes an extension to "duck typing" called "monkey typing" that
preserves most of the benefits of duck typing, while adding new features to
enhance inter-library and inter-framework compatibility.  The name comes from
the saying, "Monkey see, monkey do", because monkey typing works by stating
how one object type may *mimic* specific behaviors of another object type.

Monkey typing can also potentially form the basis for more sophisticated type
analysis and improved program performance, as it is essentially a simplified
form of concepts that are also found in languages like Dylan and Haskell.  It
is also a straightforward extension of Java casting and COM's QueryInterface,
which should make it easier to represent those type systems' behaviors within
Python as well.


Motivation
==========

Many interface and static type declaration mechanisms have been proposed for
Python over the years, but few have met with great success.  As Guido has
said recently [1]_:

     One of my hesitations about adding adapt() and interfaces to the core
     language has always been that it would change the "flavor" of much of
     the Python programming we do and that we'd have to relearn how to
     write good code.

Even for widely-used Python interface systems (such as the one provided by
Zope), interfaces and adapters seem to require this change in "flavor", and
can require a fair amount of learning in order to use them well and avoid
various potential pitfalls inherent in their use.

Thus, spurred by a discussion on PEP 246 and its possible use for optional
type declarations in Python [2]_, this PEP is an attempt to propose a semantic
basis for optional type declarations that retains the "flavor" of Python, and
prevents users from having to "relearn how to write good code" in order to use
the new features successfully.

Of course, given the number of previous failed attempts to create a type
declaration system for Python, this PEP is an act of extreme optimism, and it
will not be altogether surprising if it, too, ultimately fails.  However, if
only because the record of its failure will be useful to the community, it is
worth at least making an attempt.  (It would also not be altogether surprising
if this PEP results in the ironic twist of convincing Guido not to include
type declarations in Python at all!)

Although this PEP will attempt to make adaptation easy, safe, and flexible,
the discussion of *how* it will do that must necessarily delve into many
detailed aspects of different use cases for adaptation, and the possible
pitfalls thereof.

It's important to understand, however, that developers do *not* need to
understand more than a tiny fraction of what is in this PEP, in order to
effectively use the features it proposes.  Otherwise, you may gain the
impression that this proposal is overly complex for the benefits it
provides, even though virtually none of that complexity is visible to the
developer making use of the proposed facilities.  That is, the value of this
PEP's implementation lies in how much of this PEP will *not* need to be thought
about by a developer using it!

Therefore, if you would prefer an uncorrupted "developer first impression"
of the proposal, please skip the remainder of this Motivation and proceed
directly to the `Specification`_ section, which presents the usage and
implementation.  However, if you've been involved in the Python-Dev discussion
regarding PEP 246, you probably already know too much about the subject to have
an uncorrupted first impression, so you should instead read the rest of this
Motivation and check that I have not misrepresented your point of view before
proceeding to the Specification.  :)


Why Adaptation for Type Declarations?
-------------------------------------

As Guido acknowledged in his optional static typing proposals, having type
declarations check argument types based purely on concrete type or conformance
to interfaces would stifle much of Python's agility and flexibility.  However,
if type declarations are used instead to *adapt* objects to an interface
expected by the receiver, Python's flexibility could in fact be *improved* by
type declarations.

PEP 246 presents a basic implementation model for automatically finding an
appropriate adapter so that one type can conform to the interface of another.
However, in recent discussions on the Python developers' mailing list, it
came out that there were many open issues about what sort of adapters would
be useful (or dangerous) in the context of type declarations.

PEP 246 was originally proposed for an explicit adaptation model where an
``adapt()`` function is called to retrieve an "adapter".  So, in this model the
adapting code potentially has access to both the "original" object and the
adapted version of the object.  Also, PEP 246 permitted either the caller of a
function or the called function to perform the adaptation, meaning that the
scope and lifetime of the resulting adapter could be explicitly controlled in a
straightforward way.

By contrast, type declarations would perform adaptation at the boundary between
caller and callee, making it impossible for the caller to control the adapter's
lifetime, or for the callee to obtain the "original" object.

Many options for reducing or controlling these effects were discussed.  By and
large, it is possible for an adapter author to address these issues with due
care and attention.  However, it also became clear from the discussion that
people who are new to adaptation are often eager to use it for things that
lead rather directly to potentially problematic adapter behaviors.

Also, by the very nature of ubiquitous adaptation via type declarations, these
potentially problematic behaviors can spread throughout a program, and just
because one developer did not create a problematic adaptation, it does not mean
he or she will be immune to the effects of those created by others.

So, rather than attempt to make all possible Python developers "relearn how to
write good code", this PEP seeks to make the safer forms of adaptation easier
to learn and use than the less-safe forms.  (Which is the reverse of the
current situation, where less-safe adapters are often easier to write
than some safer ones!)



Kinds of Adaptation
-------------------

Specifically, the three forms of type adaptation we will discuss here are:

Operational Conformance
     Providing operations required by a target interface, using the
     operations and state available on the adapted type.  This is the
     simplest category of adaptation, because it introduces no new
     state information.  It is simply a specification of how an instance
     of one type can be adapted to act as if it were an instance of
     another type.

Extension/Extender
     The same as operational conformance, but with additional required state.
     This extra state, however, "belongs" to the original object, in the sense
     that it should exist as long as the original object exists.  An extension,
     in other words, is intended to extend the capabilities of the original
     object when needed, not to be an independently created object with its
     own lifetime.  Each time an adapter is requested for the target interface,
     an extension instance with the "same" state should be returned.

Volatile/View/Accessory
     Volatile adapters are used to provide functionality that may require
     multiple independent adapters for the same adapted object.  For example,
     a "view" in a model-view-controller (MVC) framework can be seen as a
     volatile adapter on a model, because more than one view may exist for the
     same model, with each view having its own independent state (such as
     window position, etc.).

Volatile adaptation is not an ideal match for type declaration, because it
is often important to explicitly control when each new volatile adapter is
created, and to whom it is being passed.  For example, in an MVC framework
one would not normally wish to pass a model to methods expecting views,
and wind up having new views created (e.g. windows opened) automatically!

Naturally, there *are* cases where opening a new window for some model object
*is* what you want.  However, using an implicit adaptation (via type
declaration) also means that passing a model to *any* method expecting a view
would result in this happening.  So, it is generally better to have the methods
that desire this behavior explicitly request it, e.g. by calling the PEP 246
``adapt()`` function, rather than having it happen implicitly by way of a type
declaration.
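
To make the contrast concrete, here is a minimal sketch of explicit adaptation.
The ``adapt()`` name follows PEP 246, but the registry and factory machinery
shown here are invented for illustration only:

```python
# Hypothetical sketch: an explicit adapt() call keeps the creation of each
# volatile adapter (such as a view) under the caller's control.
_adapters = {}

def register_adapter(from_type, to_type, factory):
    _adapters[(from_type, to_type)] = factory

def adapt(obj, protocol):
    """Return obj unchanged if it already conforms, else build an adapter."""
    if isinstance(obj, protocol):
        return obj
    try:
        factory = _adapters[(type(obj), protocol)]
    except KeyError:
        raise TypeError("cannot adapt %r to %s" % (obj, protocol.__name__))
    return factory(obj)

class Model:
    pass

class View:
    def __init__(self, model):
        self.model = model

register_adapter(Model, View, View)

m = Model()
v1 = adapt(m, View)
v2 = adapt(m, View)
# Each explicit call creates a new, independent view -- exactly the behavior
# one would *not* want happening implicitly at every type-declared call site.
assert v1 is not v2 and v1.model is v2.model
```

Because each call site invokes ``adapt()`` deliberately, the programmer decides
exactly when a new view comes into existence, which is the property lost when
adaptation happens implicitly at a type-declaration boundary.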

So, this PEP seeks to:

1. Make it easy to define operational and extension adapters

2. Make it possible to define volatile adapters, but only by explicitly
   declaring them as such in the adapter's definition.

3. Make it possible to have a type declaration result in creation of a
   volatile adapter, but only by explicitly declaring in the adapter's
   definition that type declarations are allowed to implicitly create
   instances.

By doing this, the language can gently steer developers away from
unintentionally creating adapters whose implicit behavior is difficult to
understand, or is not as they intended, by making it easier to do safer
forms of adaptation, and suggesting (via declaration requirements) that other
forms may need a bit more thought to use correctly.


Adapter Composition
-------------------

One other issue that was discussed heavily on Python-Dev regarding PEP 246
was adapter composition.  That is, adapting an already-adapted object.
Many people spoke out against implicit adapter composition (which was referred
to as transitive adaptation), because it introduces potentially unpredictable
emergent behavior.  That is, a local change to a program could have unintended
effects at a more global scale.

Using adaptation for type declarations can produce unintended adapter
composition.  Take this code, for example::

     def foo(bar: Baz):
         whack(bar)

     def whack(ping: Whee):
         ping.pong()

If a ``Baz`` instance is passed to ``foo()``, it is not wrapped in an adapter,
but is then passed to ``whack()``, which must then adapt it to the ``Whee``
type.  However, if an instance of a different type is passed to ``foo()``, then
``foo()`` will receive an adapter to make that object act like a ``Baz``
instance.  This adapter is then passed to ``whack()``, which further adapts
it to a ``Whee`` instance, thereby composing a second adapter onto the first,
or perhaps failing with a type error because there is no adapter available
to adapt the already-adapted object.  (There can be other side effects as well,
such as when attempting to compare implicitly adapted objects or use them as
dictionary keys.)
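
The side effects mentioned above can be demonstrated with a small sketch
(the adapter classes here are invented for this example; the real proposal
generates them automatically):

```python
# Sketch: why implicit adapter composition is troublesome.  Once an object
# is wrapped twice, it no longer hashes or compares like the original, so
# dictionary lookups and comparisons silently break.
class BazAdapter:
    def __init__(self, obj):
        self.obj = obj

class WheeAdapter:
    def __init__(self, obj):
        self.obj = obj
    def pong(self):
        return "pong"

thing = object()               # some non-Baz object passed to foo()
as_baz = BazAdapter(thing)     # adapter created at foo()'s boundary
as_whee = WheeAdapter(as_baz)  # composed at whack()'s boundary

lookup = {thing: "original"}   # keyed by the original object
assert as_whee not in lookup   # the composed adapter can't be found
assert as_whee.obj.obj is thing  # the original is now two layers down
```

Retrieving the original object before re-adapting, as this proposal suggests,
would keep the wrapping a single layer deep no matter how many boundaries the
object crosses.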

Therefore, this proposal seeks to have adaptation performed via type
declarations avoid implicit adapter composition, by never adapting an
operational or extension adapter.  Instead, the original object will be
retrieved from the adapter, and then adapted to the new target interface.

Volatile adapters, however, are independent objects from the object they
adapt, so they must always be considered an "original object" in their own
right.  (So, volatile adapters are also more volatile than other adapters with
respect to transitive adaptation.)  However, since volatile adapters
must be declared as such, and require an additional declaration to allow them
to be implicitly created, the developer at least has some warning that their
behavior will be more difficult to predict in the presence of type
declarations.


Interfaces vs. Duck Typing
--------------------------

An "interface" is generally recognized as a collection of operations that an
object may perform, or that may be performed on it.  Type declarations are then
used in many languages to indicate what interface is required of an object
that is supplied to a routine, or what interface is provided by the routine's
return value(s).

The problem with this concept is that interface implementations are typically
expected to be complete.  In Java, for example, you cannot say that your
class implements an interface unless you actually add all of the required
methods, even if some of them aren't needed in your program yet.

A second problem with this is that incompatible interfaces tend to proliferate
among libraries and frameworks, even when they deal with the same basic
concepts and operations.  Just the fact that people might choose different
names for otherwise-identical operations makes it considerably less likely that
two interfaces will be compatible with each other!

There are two missing things here:

1. Just because you want to have an object of a given type (interface) doesn't
   mean you will use all possible operations on it.

2. It'd be really nice to be able to map operations from one interface onto
   another, without having to write wrapper classes and possibly having to
   write dummy implementations for operations you don't need, and perhaps
   can't even implement at all!

On the other hand, the *idea* of an interface as a collection of operations
isn't a bad idea.  And if you're the one *using* the interface's operations,
it's a convenient way to do it.  This proposal seeks to retain this useful
property, while ditching much of the "baggage" that otherwise comes with it.

What we would like to do, then, is allow any object that can perform operations
"like" those of a target interface, to be used as if it were an object of the
type that the interface suggests.

As an example, consider the notion of a "file-like" object, which is often
referred to in the discussion of Python programs.  It basically means, "an
object that has methods whose semantics roughly correspond to the same-named
methods of the built-in ``file`` type."

It does *not* mean that the object must be an instance of a subclass of
``file``, or that it must be of a class that declares it "implements the
``file`` interface".  It simply means that the object's *namespace* mirrors the
*meaning* of a ``file`` instance's namespace.  In a phrase, it is
"duck typing": if it walks like a duck and quacks like a duck, it must *be*
a duck.

Traditional interface systems, however, rapidly break down when you attempt
to apply them to this concept.  One repeatedly used measuring stick for
proposed Python interface systems has been, "How do I say I want a file-like
object?"  To date, no proposed interface system for Python (that this author
knows about, anyway) has had a good answer for this question, because they
have all been based on completely implementing the operations defined by an
interface object, distinct from the concrete ``file`` type.

Note, however, that this alienation between "file-like" interfaces and the
``file`` type leads to a proliferation of incompatible interfaces being
created by different packages, each declaring a different subset of the total
operations provided by the ``file`` type.  This then leads further to the need
to somehow reconcile the incompatibilities between these diverse interfaces.

Therefore, in this proposal we will turn both of those assumptions upside down,
by proposing to declare conformance to *individual operations* of a target
type, whether the type is concrete or abstract.  That is, one may define the
notion of "file-like" without reference to any interface at all, by simply
declaring that certain operations on an object are "like" the operations
provided by the ``file`` type.

This idea will (hopefully) better match the uncorrupted intuition of a Python
programmer who has not yet adopted traditional static interface concepts, or
of a Python programmer who rebels against the limitations of those concepts (as
many Python developers do).  And, the approach corresponds fairly closely to
concepts in other languages with more sophisticated type systems (like Haskell
typeclasses or Dylan protocols), while still being a straightforward extension
of more rigid type systems like those of Java or Microsoft's COM (Component
Object Model).

This PEP directly competes with PEP 245, which proposes a syntax for
Python interfaces.  If some form of this proposal is accepted, it
would be unnecessary for a special interface type or syntax to be added to
Python, since normal classes and partially or completely abstract classes will
be routinely usable as interfaces.  Some packages or frameworks, of course, may
have additional requirements for interface features, but they can use
metaclasses to implement such enhanced interfaces without impeding their
ability to be used as interfaces by this PEP's adaptation system.


Specification
=============

For "file-like" objects, the standard library already has a type which may
form the basis for compatible interfacing between packages; if each package
denotes the relationship between its types' operations and the operations
of the ``file`` type, then those packages can accept other packages' objects
as parameters declared as requiring a ``file`` instance.

However, the standard library cannot contain base versions of all possible
operations for which multiple implementations might exist, so different
packages are bound to create different renderings of the same basic operations.
For example, one package's ``Duck`` class might have ``walk()`` and ``quack()``
methods, where another package might have a ``Mallard`` class (a kind of duck)
with ``waddle()`` and ``honk()`` methods.  And perhaps another package might
have a class with ``moveLeftLeg()`` and ``moveRightLeg()`` methods that must
be combined in order to offer an operation equivalent to ``Duck.walk()``.

Assuming that the package containing ``Duck`` has a function like this (using
Guido's proposed optional typing syntax [2]_)::

     def walkTheDuck(duck: Duck):
         duck.walk()

This function expects a ``Duck`` instance, but what if we wish to use a
``Mallard`` from the other package?

The simple answer is to allow Python programs to explicitly state that an
operation (i.e. function or method) of one type has semantics that roughly
correspond to those of an operation possessed by a different type.  That is,
we want to be able to say that ``Mallard.waddle()`` is "like" the method
``Duck.walk()``.  (For our examples, we'll use decorators to declare this
"like"-ness, but of course Python's syntax could also be extended if desired.)

If we are the author of the ``Mallard`` class, we can declare our compatibility
like this::

     class Mallard(Waterfowl):

         @like(Duck.walk)
         def waddle(self):
             # walk like a duck!
             ...

This is an example of declaring the similarity *inside* the class to be
adapted.  In many cases, however, you can't do this because you don't control
the implementation of the class you want to use, or even if you do, you don't
wish to introduce a dependency on the foreign package.

In that case, you can create what we'll call an "external operation", which
is just a function that's declared outside the class it applies to.  It's
almost identical to the "internal operation" we declared inside the ``Mallard``
class, but it has to call the ``waddle()`` method, since it doesn't also
implement waddling::

     @like(Duck.walk, for_type=Mallard)
     def duckwalk_by_waddling(self):
         self.waddle()

Whichever way the operation correspondence is registered, we should now be
able to successfully call ``walkTheDuck(Mallard())``.  Python will then
automatically create a "proxy" or "adapter" object that wraps the ``Mallard``
instance with a ``Duck``-like interface.  That adapter will have a ``walk()``
method that is just a renamed version of the ``Mallard`` instance's
``waddle()`` method (or of the ``duckwalk_by_waddling`` external operation).

For any methods of ``Duck`` that have no corresponding ``Mallard`` operation,
the adapter will omit that attribute, thereby maintaining backward
compatibility with code that uses attribute introspection or traps
``AttributeError`` to control optional behaviors.  In other words, if we have
a ``MuteMallard`` class that has no ability to ``quack()``, but has an
operation corresponding to ``walk()``, we can still safely pass its instances
to ``walkTheDuck()``, but if we pass a ``MuteMallard`` to a routine that
tries to make it ``quack``, that routine will get an ``AttributeError``.


Adapter Creation
----------------

Note, however, that even though a different adapter class is needed for
different adapted types, it is not necessary to create an adapter class "from
scratch" every time a ``Mallard`` is used as a ``Duck``.  Instead, the
implementation need only create a ``MallardAsDuck`` adapter class once, and
then cache it for repeated uses.  Adapter instances can also be quite small in
size, because in the general case they only need to contain a reference to the
object instance that they are adapting.  (Except for "extension" adapters,
which need storage for their added "state" attributes.  More on this later,
in the section on `Adapters That Extend`_, below.)

In order to be able to create these adapter classes, we need to be able to
determine the correspondence between the target ``Duck`` operations, and
operations for a ``Mallard``.  This is done by traversing the ``Duck``
operation namespace, and retrieving methods and attribute descriptors.  These
descriptors are then looked up in a registry keyed by descriptor (method or
property) and source type (``Mallard``).  The found operation is then placed
in the adapter class' namespace under the name given to it by the ``Duck``
type.

So, as we go through the ``Duck`` methods, we find a ``walk()`` method
descriptor, and we look into a registry for the key ``(Duck.walk,Mallard)``.
(Note that this is keyed by the actual ``Duck.walk`` method, not by the *name*
``"Duck.walk"``.  This means that an operation inherited unchanged by a
subclass of ``Duck`` can reuse operations declared "like" that operation.)

If we find the entry, ``duckwalk_by_waddling`` (the function object, not its
name), then we simply place that object in the adapter class' dictionary under
the name ``"walk"``, wrapped in a descriptor that substitutes the original
object as the method's ``self`` parameter.  Thus, when the function is invoked
via an adapter instance's ``walk()`` method, it will receive the adapted
``Mallard`` as its ``self``, and thus be able to call the ``waddle()``
operation.
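
The registry lookup and adapter-class construction just described can be
sketched as follows.  The ``like`` decorator name comes from this proposal,
but the registry layout and the ``build_adapter`` helper are simplifications
of this sketch's own (external operations only, no ``__mro__`` walk):

```python
# Minimal sketch of a (descriptor, source_type) -> implementation registry
# and the adapter class built from it.
registry = {}

def like(target_op, for_type=None):
    def decorator(func):
        registry[(target_op, for_type)] = func  # keyed by the method object
        return func
    return decorator

class Duck:
    def walk(self):
        return "walking"
    def quack(self):
        return "quack"

class Mallard:
    def waddle(self):
        return "waddling"

@like(Duck.walk, for_type=Mallard)
def duckwalk_by_waddling(self):
    return self.waddle()

def build_adapter(target, source):
    ns = {"__init__": lambda self, obj: setattr(self, "_obj", obj)}
    found = False
    for name, descr in vars(target).items():
        impl = registry.get((descr, source))
        if impl is not None:
            found = True
            # substitute the *adapted* object as the method's ``self``
            ns[name] = (lambda f: lambda self, *args: f(self._obj, *args))(impl)
    if not found:
        raise TypeError("no %s operations for %s"
                        % (target.__name__, source.__name__))
    return type("%sAs%s" % (source.__name__, target.__name__), (), ns)

MallardAsDuck = build_adapter(Duck, Mallard)
duck = MallardAsDuck(Mallard())
assert duck.walk() == "waddling"   # renamed via the registry
assert not hasattr(duck, "quack")  # no correspondence declared -> omitted
```

Note how keying on the ``Duck.walk`` method object itself, rather than the
name ``"walk"``, is what lets subclasses of ``Duck`` that inherit the method
unchanged reuse the same registration.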

However, operations declared in a class work somewhat differently.  If
we directly declared that ``waddle()`` is "like" ``Duck.walk`` in the body
of the ``Mallard`` class, then the ``@like`` decorator will register the method
name ``"waddle"`` as the operation in the registry.  So, we would then look up
that name on the source type in order to implement the operation on the
adapter.  For the ``Mallard`` class, this doesn't make any difference, but if
we were adapting a subclass of ``Mallard`` this would allow us to pick up the
subclass' implementation of ``waddle()`` instead.

So, we have our ``walk()`` method; now let's add a ``quack()`` method.
But wait, we haven't declared one for ``Mallard``, so there's no entry for
``(Duck.quack,Mallard)`` in our registry.  So, we proceed through
the ``__mro__`` (method resolution order) of ``Mallard`` in order to see if
there is an operation corresponding to ``quack`` that ``Mallard`` inherited
from one of its base classes.  If no method is found, we simply do not put
anything in the adapter class for a ``"quack"`` method, which will cause
an ``AttributeError`` if somebody tries to call it.

Finally, if our attempt at creating an adapter winds up having *no* operations
specific to the ``Duck`` type, then a ``TypeError`` is raised.  Thus if we had
passed an instance of ``Pig`` to the ``walkTheDuck`` function, and ``Pig``
had no methods corresponding to any ``Duck`` methods, this would result in
a ``TypeError`` -- even if the ``Pig`` type has a method named ``walk()``! --
because we haven't said anywhere that a pig walks like a duck.

Of course, if all we wanted was for ``walkTheDuck`` to accept any object
with a method *named* ``walk()``, we could've left off the type declaration
in the first place!  The purpose of the type declaration is to say that we
*only* want objects that claim to walk like ducks, assuming that they walk
at all.

This approach is not perfect, of course.  If we passed in a ``LeglessDuck``
to ``walkTheDuck()``, it is not going to work, even though it will pass the
``Duck`` type check (because it can still ``quack()`` like a ``Duck``).
However, as with normal Python "duck typing", it suffices to run the program
to find that error.  The key here is that type declarations should facilitate
using *different* objects, perhaps provided by other authors following
different naming conventions or using different operation granularities.


Inheritance
-----------

By default, this system assumes that subclasses are "substitutable" for their
base classes.  That is, we assume that a method of a given name in a subclass
is "like" (i.e. is substitutable for) the correspondingly-named method in a
base class.  However, sometimes this is *not* the case; a subclass may have
stricter requirements on routine parameters.  For example, suppose we have
a ``Mallard`` subclass like this one::

     class SpeedyMallard(Mallard):
         def waddle(self, speed):
             # waddle at given speed
             ...

This class is *not* substitutable for Mallard, because it requires an extra
parameter for the ``waddle()`` method.  In this case, the system should *not*
consider ``SpeedyMallard.waddle`` to be "like" ``Mallard.waddle``, and it
therefore should not be usable as a ``Duck.walk`` operation.  In other words,
when inheriting an operation definition from a base class, the subclass'
operation signature must be checked against that of the base class, and
rejected if it is not compatible.  (Where "compatible" means that the subclass
method will accept as many arguments as the base class method will, and that
any extra arguments taken by the subclass method are optional ones.)
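
The compatibility check sketched above could be approximated with the stdlib
``inspect`` module; the details here are illustrative only, and a real
implementation would need to handle more parameter kinds:

```python
# Sketch: reject subclass overrides that add *required* parameters,
# since such methods are not substitutable for the base class version.
import inspect

def is_substitutable(sub_method, base_method):
    """True if sub_method accepts every call that base_method accepts."""
    sub = inspect.signature(sub_method)
    base = inspect.signature(base_method)
    for name, param in sub.parameters.items():
        if name in base.parameters:
            continue
        if (param.default is inspect.Parameter.empty
                and param.kind not in (inspect.Parameter.VAR_POSITIONAL,
                                       inspect.Parameter.VAR_KEYWORD)):
            return False  # extra mandatory argument: not substitutable
    return True

class Mallard:
    def waddle(self):
        return "waddling"

class SpeedyMallard(Mallard):
    def waddle(self, speed):  # extra *required* parameter
        return "waddling at %s" % speed

assert is_substitutable(Mallard.waddle, Mallard.waddle)
assert not is_substitutable(SpeedyMallard.waddle, Mallard.waddle)
```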

Note that Python cannot tell, however, if a subclass changes the *meaning*
of an operation, without changing its name or signature.  Doing so is arguably
bad style, of course, but it could easily be supported anyway by using an
additional decorator, perhaps something like ``@unlike(Mallard.waddle)`` to
claim that no operation correspondences should remain, or perhaps
``@unlike(Duck.walk)`` to indicate that only that operation no longer applies.

In any case, when a substitutability error like this occurs, it should ideally
give the developer an error message that explains what is happening, perhaps
something like "waddle() signature changed in class SpeedyMallard, but
replacement operation for Duck.walk has not been defined."  This error can then be
silenced with an explicit ``@unlike`` decorator (or by a standalone ``unlike``
call if the class cannot be changed).


External Operations and Method Dependencies
-------------------------------------------

So far, we've been dealing only with simple examples of method renaming, so
let's now look at more complex integration needs.  For example, the Python
``dict`` type allows you to set one item at a time (using ``__setitem__``) or
to set multiple items using ``update()``.  If you have an object that you'd
like to pass to a routine accepting "dictionary-like" objects, what if your
object only has a ``__setitem__`` operation but the routine wants to use
``update()``?

As you may recall, we follow the source type's ``__mro__`` to look for an
operation possibly "inherited" from a base class.  This means that
it's possible to register an "external operation" under
``(dict.update,object)`` that implements a dictionary-like ``update()`` method
by repeatedly calling ``__setitem__``.  We can do so like this::

     @like(dict.update, for_type=object, needs=[dict.__setitem__])
     def do_update(self:dict, other:dict):
         for key,value in other.items():
             self[key] = value

Thus, if a given type doesn't have a more specific implementation of
``dict.update``, then types that implement a ``dict.__setitem__`` method can
automatically have this ``update()`` method added to their ``dict`` adapter
class.  While building the adapter class, we simply keep track of the needed
operations, and remove any operations with unmet or circular dependencies.
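
The pruning step can be sketched as a simple least-fixpoint computation over
operation names (the function name and data layout here are this sketch's
own, not part of the proposal):

```python
# Sketch: keep only operations whose `needs` ground out in operations the
# adapted type supports directly.  This automatically drops operations with
# unmet dependencies *and* mutually circular ones.
def prune_operations(external_ops, native_ops):
    """external_ops: {op_name: set of needed op names}; native_ops: set."""
    grounded = set(native_ops)
    changed = True
    while changed:
        changed = False
        for name, needs in external_ops.items():
            if name not in grounded and needs <= grounded:
                grounded.add(name)
                changed = True
    return {name for name in external_ops if name in grounded}

# update() can be synthesized from a natively supported __setitem__ ...
assert prune_operations({"update": {"__setitem__"}},
                        {"__setitem__"}) == {"update"}

# ... but mutually recursive definitions with no native support are dropped.
assert prune_operations({"update": {"__setitem__"},
                         "__setitem__": {"update"}}, set()) == set()
```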

By the way, even though technically the ``needs`` argument to ``@like`` could
be omitted since the information is present in the method body, it's actually
helpful for documentation purposes to present the external operation's
requirements up-front.

However, if the programmer fails to accurately state the method's needs, the
result will either be an ``AttributeError`` at a deeper point in the code, or
a stack overflow exception caused by looping between mutually recursive
operations.  (E.g. if an external ``dict.__setitem__`` is defined in terms of
``dict.update``, and a particular adapted type supports neither operation
directly.)  Neither of these ways of revealing the error is particularly
problematic, and is easily fixed when discovered, so ``needs`` is still
intended more for the reader of the code than for the adaptation system.

By the way, if we look again at one of our earliest examples, where we
externally declared a method correspondence from ``Mallard.waddle`` to
``Duck.walk``::

     @like(Duck.walk, for_type=Mallard)
     def walk_like_a_duck(self):
         self.waddle()

we can see that this is actually an external operation being declared; it's
just that we didn't give the (optional) full declarations::

     @like(Duck.walk, for_type=Mallard, needs=[Mallard.waddle])
     def walk_like_a_duck(self:Mallard):
         self.waddle()

When you register an external operation, the actual function object given is
registered, because the operation doesn't correspond to a method on the
adapted type.  In contrast, "internal operations" declared within the adapted
type cause the method *name* to be registered, so that subclasses can inherit
the "likeness" of the base class' methods.


Adapters That Extend
--------------------

One big difference between external operations and ones created within a
class, is that a class' internal operations can easily add extra attributes
if needed.  An external operation, however, is not in a good position to do
that.  It *could* just stick additional attributes onto the original
object, but this would be considered bad style at best, even if it used
mangled attribute names to avoid collisions with other external
operations' attributes.

So let's look at an example of how to handle adaptation that needs more
state information than is available in the adapted object.  Suppose, for
example, we have a new ``DuckDodgers`` class, representing a duck who is
also a test pilot.  He can therefore be used as a rocket-powered vehicle by
strapping on a ``JetPack``, which we can have happen automatically::

     @like(Rocket.launch, for_type=DuckDodgers, using=JetPack)
     def launch(jetpack, self):
         jetpack.activate()
         print "Up, up, and away!"

The type given as the ``using`` parameter must be instantiable without
arguments.  That is, ``JetPack()`` must create a valid instance.  When
a ``DuckDodgers`` instance is being used as a ``Rocket`` instance, and this
``launch`` method is invoked, it will attempt to create a ``JetPack``
instance for the ``DuckDodgers`` instance (if one has not already been
created and cached).

The same ``JetPack`` will be used for all external operations that request to
use a ``JetPack`` for that specific ``DuckDodgers`` instance.  (Which only
makes sense, because Dodgers can wear only one jet pack at a time, and adding
more jet packs will not allow him to fly to several places at once!)

It's also necessary to keep reusing the *same* ``JetPack`` instance for a
given ``DuckDodgers`` instance, even if it is adapted many times to different
rocketry-related interfaces.  Otherwise, we might create a new ``JetPack``
during flight, which would then be confused about how much fuel it had or
whether it was currently in flight!

This pattern of adaptation is referred to in the `Motivation`_ section as
"extension" or "extender" adaptation, because it allows you to dynamically
extend the capabilities of an existing class at runtime, as opposed to just
recasting its existing operations in a form that's compatible with another
type.  In this case, the ``JetPack`` is the extension, and our ``launch``
method defines part of the adapter.

Note, by the way, that ``JetPack`` is a completely independent class here.  It
does not have to know anything about ``DuckDodgers`` or its use as an adapter,
nor does ``DuckDodgers`` need to know about ``JetPack``.  In fact, neither
object should be given a reference to the other, or this will create a
circularity that may be difficult to garbage collect.  Python's adaptation
machinery will use a weak-key dictionary mapping from adapted objects to their
"extensions", so that our ``JetPack`` instance will hang around until the
associated ``DuckDodgers`` instance goes away.

Then, when external operations using ``JetPack`` are invoked, they simply
request a ``JetPack`` instance from this dictionary, for the given
``DuckDodgers`` instance, and then the operation is invoked with references
to both objects.
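
The extension cache just described can be sketched with the stdlib
``weakref`` module; the ``get_extension`` helper is invented here for
illustration:

```python
# Sketch: a weak-key dictionary keyed by the adapted instance, so each
# DuckDodgers gets exactly one JetPack, and the JetPack can be collected
# once its owner goes away.
import weakref

_extensions = weakref.WeakKeyDictionary()

def get_extension(obj, ext_type):
    """Return the single cached ext_type instance for obj."""
    per_object = _extensions.setdefault(obj, {})
    if ext_type not in per_object:
        per_object[ext_type] = ext_type()  # must be instantiable w/o args
    return per_object[ext_type]

class JetPack:
    def __init__(self):
        self.fuel = 100

class DuckDodgers:
    pass

pilot = DuckDodgers()
pack = get_extension(pilot, JetPack)
pack.fuel -= 10
# The same JetPack comes back on every adaptation of this instance, so its
# state (e.g. remaining fuel) stays consistent across adaptations:
assert get_extension(pilot, JetPack).fuel == 90
```

Because neither class holds a direct reference to the other, no reference
cycle is created, and the mapping itself does not keep the adapted object
alive.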

Of course, this mechanism is not available for adapting types whose instances
cannot be weak-referenced, such as strings and integers.  If you need to extend
such a type, you must fall back to using a volatile adapter, even if you would
prefer to have state that remains consistent across adaptations.  (See the
`Volatile Adaptation`_ section below.)


Using Multiple Extenders
------------------------

Each external operation can use a different ``using`` type to store its state.
For example, a ``DuckDodgers`` instance might be able to be used as a
``Soldier``, provided that he has a ``RayGun``::

     @like(Soldier.fight, for_type=DuckDodgers, using=RayGun)
     def fight(raygun, self, enemy:Martian):
         while enemy.isAlive():
             raygun.fireAt(enemy)

In the event that two operations covering a given ``for_type`` type have
``using`` types with a common base class (other than ``object``), the
most-derived type is used for both operations.  This rule ensures that
extenders do not end up with more than one copy of the same state, divided
between a base type and a derived type.

Notice that our examples of ``using=JetPack`` and ``using=RayGun`` do not
interact, as long as ``RayGun`` and ``JetPack`` do not share a common base
class other than ``object``.  However, if we had defined one operation
``using=JetPack`` and another as ``using=HypersonicJetPack``, then both
operations would receive a ``HypersonicJetPack`` if ``HypersonicJetPack`` is
a subclass of ``JetPack``.  This ensures that we don't end up with two jet
packs, but instead use the best jetpack possible for the operations we're
going to perform.

However, if we *also* have an operation using a ``BrokenJetPack``, and that's
also a subclass of ``JetPack``, then we have a conflict, because there's no
way to reconcile a ``HypersonicJetPack`` with a ``BrokenJetPack``, without
first creating a ``BrokenHypersonicJetPack`` that derives from both, and using
it in at least one of the operations.

If it is not possible to determine a single "most-derived" type among a set of
operations for a given adapted type, then an error is raised, similar to that
raised by when deriving a class from classes with incompatible metaclasses.
As with that kind of error, this error can be resolved just by adding another
``using`` type that inherits from the conflicting types.
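
The "single most-derived ``using`` type" rule above can be sketched as a
small resolution function (all names here are hypothetical; this is only an
illustration of the rule, not part of the proposed API):

```python
def resolve_using(using_types):
    """Collapse ``using`` types that share a non-object base class into
    their single most-derived type, or raise on an unresolvable conflict
    (analogous to a metaclass conflict)."""
    result = []
    for t in using_types:
        for i, r in enumerate(result):
            if issubclass(t, r):
                result[i] = t        # t is more derived; it wins
                break
            if issubclass(r, t):
                break                # r is already more derived
            if any(b is not object and b in r.__mro__ for b in t.__mro__):
                # Shared base, but neither type derives from the other:
                # no most-derived type exists.
                raise TypeError("conflicting 'using' types: %r vs %r"
                                % (r, t))
        else:
            result.append(t)         # unrelated type; kept separately
    return result
```

With ``JetPack``/``HypersonicJetPack`` this yields the hypersonic pack for
both operations, while ``HypersonicJetPack``/``BrokenJetPack`` raises.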


Volatile Adaptation
-------------------

Volatile adapters are not the same thing as operational adapters or extenders.
Indeed, some strongly question whether they should be called "adapters" at all,
because to do so weakens the term.  For example, in the model-view-controller
pattern, does it make sense to call a view an "adapter"?  What about iterators?
Are they "adapters", too?  At some point, one is reduced to calling any object
an adapter, as long as it mainly performs operations on one other object.  This
seems like a questionable practice, and it's a much broader term than is used
in the context of the GoF "Adapter Pattern" [3]_.

Indeed, it could be argued that these other "adapters" are actually extensions
of the GoF "Abstract Factory" pattern [4]_.  An Abstract Factory is a way
of creating an object whose interface is known, but whose concrete type is not.
PEP 246 adaptation can basically be viewed as an all-purpose Abstract Factory
that takes a source object and a destination interface.  This is a valuable
tool for many purposes, but it is not really the same thing as adaptation.

Shortly after I began writing this section, Clark Evans posted a request for
feedback on changes to PEP 246, which suggests that PEP 246 will provide adequate
solutions of its own for defining volatile adapters, including options for
declaring an adapter volatile, and whether it is safe for use with type
declarations.  So, for now, this PEP will assume that volatile adapters will
fall strictly under the jurisdiction of PEP 246, leaving this PEP to deal
only with the previously-covered styles of adaptation that are by definition
safe for use with type declarations.  (Because they only cast an object in
a different role, rather than creating an independent object.)


Miscellaneous
-------------

XXX property get/set/del as three "operations"

XXX binary operators

XXX level-confusing operators: comparison, repr/str, equality/hashing

XXX other special methods


Backward Compatibility
======================

XXX explain Java cast and COM QueryInterface as proper subsets of adaptation


Reference Implementation
========================

TODO


Acknowledgments
===============

Many thanks to Alex Martelli, Clark Evans, and the many others who participated
in the Great Adaptation Debate of 2005.  Special thanks also go to folks like
Ian Bicking, Paramjit Oberoi, Steven Bethard, Carlos Ribeiro, Glyph Lefkowitz
and others whose brief comments in a single message sometimes provided more
insight than could be found in a megabyte or two of debate between myself
and Alex; this PEP would not have been possible without all of your input.
Last, but not least, Ka-Ping Yee is to be thanked for pushing the idea of
"partially abstract" interfaces, for which idea I have here attempted to
specify a practical implementation.

Oh, and finally, an extra special thanks to Guido for not banning me from
the Python-Dev list when Alex and I were posting megabytes of
adapter-related discussion each day.  ;)


References
==========

.. [1] Guido's Python-Dev posting on "PEP 246: lossless and stateless"
    (http://mail.python.org/pipermail/python-dev/2005-January/051053.html)

.. [2] Optional Static Typing -- Stop the Flames!
    (http://www.artima.com/weblogs/viewpost.jsp?thread=87182)

.. [3] XXX Adapter Pattern

.. [4] XXX Abstract Factory Pattern


Copyright
=========

This document has been placed in the public domain.



..
    Local Variables:
    mode: indented-text
    indent-tabs-mode: nil
    sentence-end-double-space: t
    fill-column: 70
    End:

From pje at telecommunity.com  Mon Jan 17 00:46:08 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 17 00:44:39 2005
Subject: [Python-Dev] "Monkey Typing" pre-PEP, partial draft
In-Reply-To: <5.1.1.6.0.20050115170350.020fe6a0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050116184546.030afc00@mail.telecommunity.com>

Oops.  I forgot to cancel this posting; it's an older version.

At 11:51 PM 1/15/05 -0500, Phillip J. Eby wrote:
>This is only a partial first draft, but the Motivation section nonetheless 
>attempts to briefly summarize huge portions of the various discussions 
>regarding adaptation, and to coin a hopefully more useful terminology than 
>some of our older working adjectives like "sticky" and "stateless" and 
>such.  And the specification gets as far as defining a simple 
>decorator-based syntax for creating operational (prev. "stateless") and 
>extension (prev. "per-object stateful") adapters.

From kbk at shore.net  Mon Jan 17 05:01:19 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Mon Jan 17 05:02:00 2005
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python
	pythonrun.c, 2.161.2.15, 2.161.2.16
References: <E1CmxrV-0006V1-N8@sc8-pr-cvs1.sourceforge.net>
	<e8bf7a5305010712375ca9a373@mail.gmail.com>
	<87brc17xbg.fsf@hydra.bayview.thirdcreek.com>
	<e8bf7a5305010713434d2b9323@mail.gmail.com>
	<877jmo93qv.fsf@hydra.bayview.thirdcreek.com>
	<873bxc8m6k.fsf@hydra.bayview.thirdcreek.com>
Message-ID: <87llasvh4g.fsf@hydra.bayview.thirdcreek.com>

kbk@shore.net (Kurt B. Kaiser) writes:

> kbk@shore.net (Kurt B. Kaiser) writes:
>>  [JH]
>>> ../Python/symtable.c:193: structure has no member named `st_tmpname'
>>>
>>> Do you see that?
>>
>> Yeah, the merge eliminated it from the symtable struct in symtable.h.
>> You moved it to symtable_entry at rev 2.12 in MAIN :-)

[...]

I checked in a change which adds the st_tmpname element back to
symtable.  Temporary until someone gets time to evaluate the situation.

[...]

> Apparently the $(AST_H) $(AST_C): target ran and Python-ast.c was
> recreated (without the changes).  It's not clear to me how/why that
> happened.  I did start with a clean checkout, but it seems that the
> target only runs if Python-ast.c and/or its .h are missing (they
> should have been in the checkout), or older than Python.asdl, which
> they are not.  I don't see them in the .cvsignore.

I believe the problem was caused by the fact that the dates in the
local tree aren't the repository dates, so it happened that
Parser/Python.asdl had a newer date than Python-ast.[ch].  I did a
clean install on my Debian system and got around the issue by touching
Python-ast.[c,h] before the build.  IMO ASDLGEN should be a .PHONY target,
run manually as needed by the AST developer.  Otherwise there will be
no end of trouble when people try to build from CVS after the merge.

Absent objection, I'll check in such a change.

== 
The tree compiles, but there is a segfault when make tries to run
Python on setup.py.  Failure occurs when trying to import site.py

==
Neal has fixed the import issue and several others!!  I was bitten by the
Python.asdl / Python-ast.[ch] timing again when updating to his changes....

Branch now builds and python can be started.  There are a number of
test failures remaining.

-- 
KBK
From gvanrossum at gmail.com  Mon Jan 17 06:42:33 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon Jan 17 06:42:36 2005
Subject: [Python-Dev] PEP 246, Feedback Request
In-Reply-To: <20050116040424.GA76191@prometheusresearch.com>
References: <CFD6B9BC-6315-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050116040424.GA76191@prometheusresearch.com>
Message-ID: <ca471dc205011621421fce464c@mail.gmail.com>

>   - protocol means any object, usually a type or class or interface,
>     which guides the construction of an adapter

Then what do we call the abstract *concept* of a protocol?

> - adaptee-class refers to the adaptee's class

Please make it explicit that this is a.__class__, not type(a).

>   - factory refers to a function, f(adaptee) -> adapter, where
>     the resulting adapter complies with a given protocol

Make this adapter factory -- factory by itself is too commonly used.

>   - First, the registry is checked for a suitable adapter

How about checking whether adaptee.__class__ is equal to the protocol
even before this?  It would be perverse to declare an adapter from a
protocol to itself that wasn't the identity adapter.

>   - PEP 246 will ask for a `adapt' module, with an `adapt' function.

Please don't give both the same name.  This practice has caused enough
problems in the past.  The module can be called adaptation, or
adapting (cf. threading; but it doesn't feel right so I guess
adaptation is better).

>   - At any stage of adaptation, if None is returned, the adaptation
>     continues to the next stage.

Maybe use NotImplemented instead?  I could imagine that occasionally
None would be a valid adapter.  (And what do we do when asked adapt
None to a protocol?  I think it should be left alone but not
considered an error.)
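
(A sketch of the convention being suggested -- names are hypothetical; each
stage signals "no opinion" with NotImplemented so that None remains a legal
adapter value:

```python
def adapt_via_stages(obj, protocol, stages):
    # Each stage is a callable(obj, protocol).  Returning NotImplemented
    # (rather than None) hands off to the next stage, so a stage that
    # legitimately produces None as the adapter still terminates the
    # search.
    for stage in stages:
        result = stage(obj, protocol)
        if result is not NotImplemented:
            return result
    raise TypeError("cannot adapt %r to %r" % (obj, protocol))
```
)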

>   - At any stage of adaptation, if adapt.AdaptException(TypeError) is
>     raised, then the adaptation process stops, as if None had been
>     returned from each stage.

Why are there two ways to signal an error?  TOOWTDI!

>   - One can also register a None factory from A->B for the
>     purpose of marking it transitive.  In this circumstance,
>     the composite adapter is built through __conform__ and
>     __adapt__.  The None registration is just a place holder
>     to signal that a given path exists.

Sounds overkill; the None feels too magical.  An explicit adapter
can't be too difficult to come up with?

>     There is a problem with the default isinstance() behavior when
>     someone derives a class from another to re-use implementation,
>     but with a different 'concept'.  A mechanism to disable
>     isinstance() is needed for this particular case.

Do we really have to care about this case?  Has someone found that
things go harebrained when this case is not handled?  It sounds very
much like a theoretical problem only.  I don't mean that subclasses
that reuse implementation without being substitutable are rare (I've
seen and written plenty); I mean that I don't expect this to cause
additional problems due to incorrect adaptation.

>     Guido would like his type declaration syntax (see blog entry) to
>     be equivalent to a call to adapt() without any additional
>     arguments.  However, not all adapters should be created in the
>     context of a declaration -- some should be created more
>     explicitly.  We propose a mechanism where an adapter factory can
>     register itself as not suitable for the declaration syntax.

I'm considering a retraction of this proposal, given that adaptation
appears to be so subtle and fraught with controversies and pitfalls;
but more particularly given the possible requirement (which someone
added in a response to a blog) that it should be possible to remove or
ignore type declarations without changing the meaning of programs that
run correctly (except for code that catches TypeError).

>   - adapt( , intrinsic_only = False) will enable both sorts of adapters,

That's one ugly keyword parameter.  If we really need this, I'd like
to see something that defaults to False and can be switched on by
passing mumble=True on the call.

But if we could have only one kind that would be much more attractive.
Adaptation looks like it is going to fail the KISS test.

>     There was discussion as to how to get back to the original
>     object from an adapter.  Is this in scope of PEP 246?

Seems too complex.  Again, KISS.

>     Sticky adapters, that is, ones where there is only one instance
>     per adaptee, are a common use case.  Should the registry of PEP 246
>     provide this feature?

Ditto.  If you really need this, __adapt__ and __conform__ could use a
cache.
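
(Such a cache could be as simple as this sketch -- names hypothetical, not
part of PEP 246:

```python
import weakref

# One adapter instance per adaptee ("sticky" adaptation); weak keys let
# the cache entry disappear together with the adaptee.
_adapter_cache = weakref.WeakKeyDictionary()

def sticky_adapt(obj, factory):
    try:
        return _adapter_cache[obj]
    except KeyError:
        adapter = _adapter_cache[obj] = factory(obj)
        return adapter
```
)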

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From gvanrossum at gmail.com  Mon Jan 17 07:12:37 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon Jan 17 07:12:40 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
Message-ID: <ca471dc20501162212446e63b5@mail.gmail.com>

https://sourceforge.net/tracker/index.php?func=detail&aid=1103689&group_id=5470&atid=305470

Here's a patch that gets rid of unbound methods, as
discussed here before. A function's __get__ method
now returns the function unchanged when called without
an instance, instead of returning an unbound method object.
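
The change can be sketched like this (this is, incidentally, how Python
eventually came to behave; shown purely as an illustration of the patch):

```python
class C:
    def foo(self):
        return "hello"

# Class-level access yields the plain function rather than an
# "unbound method" wrapper ...
assert type(C.foo).__name__ == 'function'

# ... so it can be called with any suitable first argument, not just
# C instances -- there is no longer an im_class type check.
class D:
    pass

assert C.foo(D()) == "hello"
```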

I couldn't remove support for unbound methods
completely, since they were used by the built-in
exceptions. (We can get rid of that use once we convert
to new-style exceptions.)

For backward compatibility, functions now have
read-only im_self and im_func attributes; im_self is
always None, im_func is always the function itself.
(These should issue warnings, but I haven't added that
yet.)

The test suite passes. (I have only tried "make test"
on a Linux box.)

What do people think? (My main motivation for this, as stated before,
is that it adds complexity without much benefit.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From glyph at divmod.com  Mon Jan 17 07:49:07 2005
From: glyph at divmod.com (Glyph Lefkowitz)
Date: Mon Jan 17 07:44:31 2005
Subject: [Python-Dev] PEP 246: let's reset
In-Reply-To: <5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com>
References: <E530176C-6797-11D9-ADA4-000A95EFAE9E@aleax.it>
	<r01050400-1037-E1D04844674711D98C3B003065D5E7E4@10.0.0.23>
	<82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
	<5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com>
	<E530176C-6797-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com>
Message-ID: <1105944547.30052.21.camel@localhost>

On Sun, 2005-01-16 at 13:00 -0500, Phillip J. Eby wrote:

> """One type is the "extender", ...

> By contrast, an "independent adapter" ...

I really like the way this part of the PEP is sounding, since it really
captures two almost, but not quite, completely different use-cases, the
confusion between which generated all the discussion here in the first
place.  The terminology seems a bit cumbersome though.

I'd like to propose that an "extender" be called a "transformer", since
it provides a transformation for an underlying object - it changes the
shape of the underlying object so it will fit somewhere else, without
creating a new object.  Similarly, the cumbersome "independent adapter"
might be called a "converter", since it converts A into B, where B is
some new kind of thing.

Although the words are almost synonyms, their implications seem to match
up with what's trying to be communicated here.  A "transformer" is
generally used in the electrical sense - it is a device which changes
voltage, and only voltage.  It takes in one flavor of current and
produces one, and exactly one other.  Used in the electrical sense, a
"converter" is far more general, since it has no technical meaning that
I'm aware of - it might change anything about the current.  However,
other things are also called converters, such as currency converters,
which take one kind of currency and produce another, separate currency.
Similar to "independent adapters", this conversion is dependent on a
moment in time for the conversion - after the conversion, each currency
may gain or lose value relative to the other.

If nobody likes this idea, it would seem a bit more symmetric to have
"dependent" and "independent" adapters, rather than "extenders" and
"independent adapters".  As it is I'm left wondering what the concept of
dependency in an adapter is.

From glyph at divmod.com  Mon Jan 17 07:56:59 2005
From: glyph at divmod.com (Glyph Lefkowitz)
Date: Mon Jan 17 07:52:23 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <ca471dc20501162212446e63b5@mail.gmail.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
Message-ID: <1105945019.30052.26.camel@localhost>

On Sun, 2005-01-16 at 22:12 -0800, Guido van Rossum wrote:

> What do people think? (My main motivation for this, as stated before,
> is that it adds complexity without much benefit.)

> ***************
> *** 331,339 ****
>   def test_im_class():
>       class C:
>           def foo(self): pass
> -     verify(C.foo.im_class is C)

^ Without this, as JP Calderone pointed out earlier, you can't serialize
unbound methods.  I wouldn't mind that so much, but you can't tell that
they're any different from regular functions until you're
*de*-serializing them.

In general I like the patch, but what is the rationale for removing
im_class from functions defined within classes?




From tjreedy at udel.edu  Mon Jan 17 08:39:46 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon Jan 17 08:39:55 2005
Subject: [Python-Dev] Re: Getting rid of unbound methods: patch available
References: <ca471dc20501162212446e63b5@mail.gmail.com>
Message-ID: <csfq42$mue$1@sea.gmane.org>


"Guido van Rossum" <gvanrossum@gmail.com> wrote in message 
news:ca471dc20501162212446e63b5@mail.gmail.com...
> What do people think? (My main motivation for this, as stated before,
> is that it adds complexity without much benefit.)

From the viewpoint of learning and explaining Python, this is a plus.
I never understood why functions were wrapped as unbounds until this 
proposal was put forth and discussed.

Terry J. Reedy



From ncoghlan at iinet.net.au  Mon Jan 17 11:01:27 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Mon Jan 17 11:01:31 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <ca471dc20501162212446e63b5@mail.gmail.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
Message-ID: <41EB8CF7.9010002@iinet.net.au>

Guido van Rossum wrote:
> What do people think? (My main motivation for this, as stated before,
> is that it adds complexity without much benefit.)

I'm in favour, since it removes the "an unbound method is almost like a bare 
function, only not quite as useful" distinction. It would allow things like 
str.join(sep, seq) to work correctly for a Unicode separator. It also allows 
'borrowing' of method implementations without inheritance.
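
The borrowing case can be sketched as follows (once class-level access
yields a plain function, it binds against whatever object supplies the
right attributes):

```python
class Greeter:
    def greet(self):
        return "hi, %s" % self.name

class Borrower:
    name = "borrower"
    greet = Greeter.greet   # borrow the implementation, no inheritance

# With unbound methods gone, Greeter.greet is a plain function, so the
# borrowed method accepts a Borrower instance as 'self' without any
# im_class check getting in the way.
assert Borrower().greet() == "hi, borrower"
```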

I'm a little concerned about the modification to pyclbr_input.py, though (since 
it presumably worked before the patch). Was the input file tweaked before or 
after the test itself was fixed? (I'll probably get around to trying out the 
patch myself, but that will be on Linux as well, so I doubt my results will 
differ from yours).

The other question is the pickling example - an unbound method currently stores 
meaningful data in im_class, whereas a standard function doesn't have that 
association. Any code which makes use of im_class on unbound methods (even 
without involving pickling) is going to have trouble with the change. (Someone 
else will need to provide a real-life use case though, since I certainly don't 
have one).

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From arigo at tunes.org  Mon Jan 17 11:52:19 2005
From: arigo at tunes.org (Armin Rigo)
Date: Mon Jan 17 12:04:03 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <fb6fbf560501141620eff6d85@mail.gmail.com>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
Message-ID: <20050117105219.GA12763@vicky.ecs.soton.ac.uk>

Hi,

On Fri, Jan 14, 2005 at 07:20:31PM -0500, Jim Jewett wrote:
> The base of the Exception hierarchy happens to be a classic class.
> But why are they "required" to be classic?

For reference, PyPy doesn't have old-style classes at all so far, so we had to
come up with something about exceptions.  After some feedback from python-dev
it appears that the following scheme works reasonably well.  Actually it's
surprising how few problems we actually encountered by removing the
old-/new-style distinction (particularly when compared with the extremely
obscure workarounds we had to go through in PyPy itself, e.g. precisely
because we wanted exceptions that are member of some (new-style) class
hierarchy).

Because a bit of Python code tells more than long and verbose explanations,
here it is:

def app_normalize_exception(etype, value, tb):
    """Normalize an (exc_type, exc_value) pair:
    exc_value will be an exception instance and exc_type its class.
    """
    # mistakes here usually show up as infinite recursion, which is fun.
    while isinstance(etype, tuple):
        etype = etype[0]
    if isinstance(etype, type):
        if not isinstance(value, etype):
            if value is None:
                # raise Type: we assume we have to instantiate Type
                value = etype()
            elif isinstance(value, tuple):
                # raise Type, Tuple: assume Tuple contains the constructor
                #                    args
                value = etype(*value)
            else:
                # raise Type, X: assume X is the constructor argument
                value = etype(value)
        # raise Type, Instance: let etype be the exact type of value
        etype = value.__class__
    elif type(etype) is str:
        # XXX warn -- deprecated
        if value is not None and type(value) is not str:
            raise TypeError("string exceptions can only have a string value")
    else:
        # raise X: we assume that X is an already-built instance
        if value is not None:
            raise TypeError("instance exception may not have a separate"
                            " value")
        value = etype
        etype = value.__class__
        # for the sake of language consistency we should not allow
        # things like 'raise 1', but it's probably fine (i.e.
        # not ambiguous) to allow them in the explicit form 'raise int, 1'
        if not hasattr(value, '__dict__') and not hasattr(value, '__slots__'):
            raise TypeError("raising built-in objects can be ambiguous, "
                            "use 'raise type, value' instead")
    return etype, value, tb


Armin
From ncoghlan at iinet.net.au  Mon Jan 17 12:49:42 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Mon Jan 17 12:49:46 2005
Subject: [Python-Dev] PEP 246: let's reset
In-Reply-To: <ca471dc20501160915778b4eca@mail.gmail.com>
References: <r01050400-1037-E1D04844674711D98C3B003065D5E7E4@10.0.0.23>	<82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>	<5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com>	<E530176C-6797-11D9-ADA4-000A95EFAE9E@aleax.it>
	<ca471dc20501160915778b4eca@mail.gmail.com>
Message-ID: <41EBA656.8070409@iinet.net.au>

Guido van Rossum wrote:
> Typechecking can be trivially defined in terms of adaptation:
> 
> def typecheck(x, T):
>    y = adapt(x, T)
>    if y is x:
>        return y
>    raise TypeError("...")

Assuming the type error displayed contains information on T, the caller can then 
trivially correct the type error by invoking adapt(arg, T) at the call point 
(assuming the argument actually *is* adaptable to the desired protocol). The 
code inside the function still gets to assume the supplied object has the 
correct type - the only difference is that if adaptation is actually needed, the 
onus is on the caller to provide it explicitly (and they will get a specific 
error telling them so).

This strikes me as quite an elegant solution.
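
A toy, self-contained version of the pattern (the registry-based adapt()
here is hypothetical, standing in for PEP 246's real machinery):

```python
_registry = {}

def register_adapter(src, protocol, factory):
    _registry[(src, protocol)] = factory

def adapt(obj, protocol):
    if isinstance(obj, protocol):
        return obj                       # identity adaptation
    factory = _registry.get((type(obj), protocol))
    if factory is None:
        raise TypeError("cannot adapt %r to %s"
                        % (obj, protocol.__name__))
    return factory(obj)

def typecheck(x, T):
    # As in Guido's snippet above: only identity adaptation passes;
    # for anything else the caller gets a TypeError and must invoke
    # adapt() explicitly at the call point.
    y = adapt(x, T)
    if y is x:
        return y
    raise TypeError("%r is not a %s; adapt it explicitly"
                    % (x, T.__name__))
```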

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From mal at egenix.com  Mon Jan 17 13:11:19 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon Jan 17 13:11:24 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <41EB8CF7.9010002@iinet.net.au>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<41EB8CF7.9010002@iinet.net.au>
Message-ID: <41EBAB67.2080907@egenix.com>

Nick Coghlan wrote:
> Guido van Rossum wrote:
> 
>> What do people think? (My main motivation for this, as stated before,
>> is that it adds complexity without much benefit.)
> 
> 
> I'm in favour, since it removes the "an unbound method is almost like a 
> bare function, only not quite as useful" distinction. It would allow 
> things like str.join(sep, seq) to work correctly for a Unicode 
> separator. 

This won't work. Strings and Unicode are two different types,
not subclasses of one another.

> It also allows 'borrowing' of method implementations without 
> inheritance.
 >
> I'm a little concerned about the modification to pyclbr_input.py, though 
> (since it presumably worked before the patch). Was the input file 
> tweaked before or after the test itself was fixed? (I'll probably get 
> around to trying out the patch myself, but that will be on Linux as 
> well, so I doubt my results will differ from yours).
> 
> The other question is the pickling example - an unbound method currently 
> stores meaningful data in im_class, whereas a standard function doesn't 
> have that association. Any code which makes use of im_class on unbound 
> methods (even without involving pickling)is going to have trouble with 
> the change. (Someone else will need to provide a real-life use case 
> though, since I certainly don't have one).

I don't think there's much to worry about. At the C level,
bound and unbound methods are the same type. The only
difference is that bound methods have the object
attribute im_self set to an instance object, while
unbound methods have it set to NULL.

Given that the two are already the same type, I don't
really see much benefit from dropping the printing of
"unbound" in case im_self is NULL... perhaps I'm missing
something.

As for real life examples: basemethod() in mxTools uses
.im_class to figure the right base method to use (contrary
to super(), basemethod() also works for old-style classes).
basemethod() is in turn used in quite a few applications
to deal with overriding methods in mixin classes.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 10 2005)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From anthony at interlink.com.au  Mon Jan 17 14:45:09 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon Jan 17 14:44:57 2005
Subject: [Python-Dev] Re: how to test behavior wrt an extension type?
In-Reply-To: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <200501180045.10439.anthony@interlink.com.au>

On Sunday 16 January 2005 20:38, Alex Martelli wrote:
> Problem: to write unit tests showing that the current copy.py
> misbehaves with a classic extension type, I need a classic extension
> type which defines __copy__ and __deepcopy__ just like /F's
> cElementTree does.  So, I made one: a small trycopy.c and accompanying
> setup.py whose only purpose in life is checking that instances of a
> classic type get copied correctly, both shallowly and deeply.  But now
> -- where do I commit this extension type, so that the unit tests in
> test_copy.py can do their job...?

> I do not know what the recommended practice is for this kind of issue,
> so, I'm asking for guidance (and specifically asking Anthony since my
> case deals with 2.3 and 2.4 maintenance and he's release manager for
> both, but, of course, everybody's welcome to help!).  Surely this can't
> be the first case in which a bug got triggered only by a certain
> behavior in an extension type, but I couldn't find precedents.  Ideas,
> suggestions, ...?

Beats me - worst comes to worst, I guess we ship the unittest code 
there with a try/except around the ImportError on the new 'copytest'
module, and the test skips if it's not built. Then we don't build it by
default, but if someone wants to build it and check it, they can. I don't
like this much, but I can't think of a better alternative. Shipping a new
extension module just for this unittest seems like a bad idea.

Anthony

-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From FBatista at uniFON.com.ar  Mon Jan 17 14:48:10 2005
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Mon Jan 17 14:51:46 2005
Subject: [Python-Dev] Deprecating old bugs
Message-ID: <A128D751272CD411BC9200508BC2194D053C7EB0@escpl.tcp.com.ar>

As I discussed in this list, in the "Policy about old Python versions"
thread at 8-Nov-2004, I started verifying the old bugs.

Here are the results for 2.1.*. Perhaps this should be put in an
informational PEP.

When I verified each bug, I filled in the following fields:

- Group: the bug's group at verifying time.
- Bug #: the bug number
- Verified: is the date when I checked the bug.
- Action: is what I did then.

If the bug survived the verification, the next two fields are applicable (if
not, I put a dash, the idea is to keep this info easily parseable):

- Final: is the action taken by someone who eliminated the bug from that
category (closed, moved to Py2.4, etc).
- By: is the someone who did the final action.


Group:    2.1.1
Bug #:    1020605
Verified: 08-Nov-2004
Action:   Closed: Invalid. Was a Mailman issue, not a Python one.
Final:    -
By:       -

Group:    2.1.2
Bug #:    771429
Verified: 08-Nov-2004
Action:   Deprecation alerted. I can not try it, don't have that context.
Final:    Closed: Won't fix.
By:       facundobatista

Group:    2.1.2
Bug #:    629345
Verified: 08-Nov-2004
Action:   Deprecation alerted. Can't discern if it's really a bug or not.
Final:    Closed: Won't fix.
By:       facundobatista

Group:    2.1.2
Bug #:    589149
Verified: 08-Nov-2004
Action:   Closed: Fixed. The problem is solved from Python 2.3a1, as the
submitter posted.
Final:    -
By:       -


I included here only 2.1.* because there were only four, so it's a good
trial. If you think I should change the format or add more information,
please let me know ASAP.

The next chapter of this story will cover 2.2 bugs.

Regards,

.    Facundo

Bitácora De Vuelo: http://www.taniquetil.com.ar/plog
PyAr - Python Argentina: http://pyar.decode.com.ar/

From aleax at aleax.it  Mon Jan 17 15:03:46 2005
From: aleax at aleax.it (Alex Martelli)
Date: Mon Jan 17 15:03:54 2005
Subject: [Python-Dev] Re: how to test behavior wrt an extension type?
In-Reply-To: <200501180045.10439.anthony@interlink.com.au>
References: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it>
	<200501180045.10439.anthony@interlink.com.au>
Message-ID: <9AC0ACCE-6890-11D9-9DED-000A95EFAE9E@aleax.it>


On 2005 Jan 17, at 14:45, Anthony Baxter wrote:
    ...
>> both, but, of course, everybody's welcome to help!).  Surely this can't
>> be the first case in which a bug got triggered only by a certain
>> behavior in an extension type, but I couldn't find precedents.  Ideas,
>> suggestions, ...?
>
> Beats me - worst comes to worst, I guess we ship the unittest code
> there with a try/except around the ImportError on the new 'copytest'
> module, and the test skips if it's not built. Then we don't build it by
> default, but if someone wants to build it and check it, they can. I don't
> like this much, but I can't think of a better alternative. Shipping a new
> extension module just for this unittest seems like a bad idea.

Agreed that this issue doesn't warrant shipping a new extension
module -- however, in the patch (to the 2.3 maintenance branch) which I
uploaded (and assigned to you), I followed the effbot's suggestion and
added the type needed for testing to the already existing "extension
module for testing purposes", namely Modules/_testcapi.c.  I don't
think it can do any harm there, and it lets test/test_copy.py do all of
its testing blissfully well.  I haven't even made the compilation of
the part of Modules/_testcapi.c which holds the new type conditional
upon anything, because I don't think that having it there
unconditionally can possibly break anything anyway... _testcapi IS only
used for testing, after all...!
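[Editor's note: a minimal, runnable sketch of the skip-on-ImportError pattern Anthony describes above; the test case and its contents are hypothetical, standing in for the real test_copy checks.]

```python
import unittest

# Guard the import of the testing-only extension module; if it isn't
# built, the tests below are skipped rather than erroring out.
try:
    import _testcapi
    HAVE_TESTCAPI = True
except ImportError:
    HAVE_TESTCAPI = False

class ExtensionCopyTest(unittest.TestCase):
    @unittest.skipUnless(HAVE_TESTCAPI, "_testcapi extension not built")
    def test_copy_extension_type(self):
        # The real test would exercise copy.copy() on a type defined
        # in the extension module; this placeholder only shows the shape.
        self.assertTrue(HAVE_TESTCAPI)
```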


Alex

From mwh at python.net  Mon Jan 17 15:06:39 2005
From: mwh at python.net (Michael Hudson)
Date: Mon Jan 17 15:06:41 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <ca471dc205011517576655bc17@mail.gmail.com> (Guido van Rossum's
	message of "Sat, 15 Jan 2005 17:57:53 -0800")
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
	<ca471dc205011517576655bc17@mail.gmail.com>
Message-ID: <2mmzv8cfps.fsf@starship.python.net>

Guido van Rossum <gvanrossum@gmail.com> writes:

> To be honest, I don't recall the exact reasons why this wasn't fixed
> in 2.2; I believe it has something to do with the problem of
> distinguishing between string and class exception, and between the
> various forms of raise statements.

A few months back I hacked an attempt to make all exceptions
new-style.  It's not especially hard, but it's tedious.  There's lots
of code (more than I expected, anyway) to change and my attempt ended
up being pretty messy.  I suspect allowing both old- and new-style
classes would be no harder, but even more tedious and messy.

It would still be worth doing, IMHO.

Cheers,
mwh

-- 
 <cube> If you are anal, and you love to be right all the time, C++
   gives you a multitude of mostly unimportant details to fret about
   so you can feel good about yourself for getting them "right", 
   while missing the big picture entirely       -- from Twisted.Quotes
From anthony at interlink.com.au  Mon Jan 17 14:41:01 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon Jan 17 15:30:26 2005
Subject: [Python-Dev] 2.3.5 delayed til next week
Message-ID: <200501180041.03195.anthony@interlink.com.au>

As I'd kinda feared, my return to work has left me completely
buried this week, and so I'm going to have to push 2.3.5 until
next week. Thomas and Fred: does one of the days in the
range 25-27 January suit you? The 26th is a public holiday here,
and so that's the day that's most likely for me...

Anthony
-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From gvanrossum at gmail.com  Mon Jan 17 16:27:33 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon Jan 17 16:27:36 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <20050117105219.GA12763@vicky.ecs.soton.ac.uk>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
	<20050117105219.GA12763@vicky.ecs.soton.ac.uk>
Message-ID: <ca471dc205011707277b386ec8@mail.gmail.com>

[Armin]
> For reference, PyPy doesn't have old-style classes at all so far, so we had to
> come up with something about exceptions.  After some feedback from python-dev
> it appears that the following scheme works reasonably well.  Actually it's
> surprising how few problems we actually encountered by removing the
> old-/new-style distinction (particularly when compared with the extremely
> obscure workarounds we had to go through in PyPy itself, e.g. precisely
> because we wanted exceptions that are members of some (new-style) class
> hierarchy).
> 
> Because a bit of Python code tells more than long and verbose explanations,
> here it is:
> 
> def app_normalize_exception(etype, value, tb):
[...]
>     elif type(etype) is str:
>         # XXX warn -- deprecated
>         if value is not None and type(value) is not str:
>             raise TypeError("string exceptions can only have a string value")

That is stricter than classic Python, though -- classic Python allows
the value to be anything (and you get the value back unadorned in the
"except 's', x:" clause).
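[Editor's note: a rough, runnable sketch of the check Armin quoted; the function name is hypothetical. Per Guido's point above, classic Python was looser and accepted any object as the value.]

```python
def normalize_string_exception(etype, value):
    # PyPy's stricter rule, per the quoted snippet: a string exception
    # may only carry a string (or None) as its value.  Classic Python
    # accepted any object here.
    if not isinstance(etype, str):
        raise TypeError("not a string exception")
    if value is not None and not isinstance(value, str):
        raise TypeError("string exceptions can only have a string value")
    return etype, value
```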

[Michael]
> It would still be worth doing, IMHO.

Then let's do it. Care to resurrect your patch? (And yes, classic
classes should also be allowed for b/w compatibility.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From pje at telecommunity.com  Mon Jan 17 16:39:30 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 17 16:38:10 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch
  available
In-Reply-To: <ca471dc20501162212446e63b5@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050117103315.02f70a10@mail.telecommunity.com>

At 10:12 PM 1/16/05 -0800, Guido van Rossum wrote:
>I couldn't remove support for unbound methods
>completely, since they were used by the built-in
>exceptions. (We can get rid of that use once we convert
>to new-style exceptions.)

Will it still be possible to create an unbound method with 
new.instancemethod?  (I know the patch doesn't change this, I mean, is it 
planned to remove the facility from the instancemethod type?)

From gvanrossum at gmail.com  Mon Jan 17 16:43:18 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon Jan 17 16:43:21 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <1105945019.30052.26.camel@localhost>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<1105945019.30052.26.camel@localhost>
Message-ID: <ca471dc2050117074347baa60d@mail.gmail.com>

[Guido]
> >   def test_im_class():
> >       class C:
> >           def foo(self): pass
> > -     verify(C.foo.im_class is C)

[Glyph]
> ^ Without this, as JP Calderone pointed out earlier, you can't serialize
> unbound methods.  I wouldn't mind that so much, but you can't tell that
> they're any different from regular functions until you're
> *de*-serializing them.

Note that you can't pickle unbound methods anyway unless you write
specific support code to do that; it's not supported by pickle
itself.

I think that use case is weak. If you really need to pickle an
individual unbound method, it's less work to create a global helper
function and pickle that than to write the additional pickling
support for pickling unbound methods.
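[Editor's note: the workaround Guido suggests might look like this sketch (class and helper names hypothetical); a module-level function pickles by reference, so it needs no extra support code.]

```python
import pickle

class C:
    def greet(self):
        return "hello"

# Module-level helper standing in for the unbound method C.greet;
# plain functions at module scope are picklable by reference
# (pickle records the name, not the code).
def greet_helper(obj):
    return C.greet(obj)

data = pickle.dumps(greet_helper)
restored = pickle.loads(data)
```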

> In general I like the patch, but what is the rationale for removing
> im_class from functions defined within classes?

The information isn't easily available to the function. I could go
around and change the parser to make this info available, but that
would require changes in many places currently untouched by the patch.

[Nick]
> I'm a little concerned about the modification to pyclbr_input.py, though (since
> it presumably worked before the patch). Was the input file tweaked before or
> after the test itself was fixed? (I'll probably get around to trying out the
> patch myself, but that will be on Linux as well, so I doubt my results will
> differ from yours).

It is just a work-around for stupidity in the test code, which tries
to filter out cases like "om = Other.om" because the pyclbr code
doesn't consider these. pyclbr.py hasn't changed, and still doesn't
consider these (since it parses the source code); but the clever test
in the test code no longer works.

> The other question is the pickling example - an unbound method currently stores
> meaningful data in im_class, whereas a standard function doesn't have that
> association. Any code which makes use of im_class on unbound methods (even
> without involving pickling) is going to have trouble with the change. (Someone
> else will need to provide a real-life use case though, since I certainly don't
> have one).

Apart from the tests that were testing the behavior of im_class, I
found only a single piece of code in the standard library that used
im_class of an unbound method object (the clever test in the pyclbr
test). Uses of im_self and im_func were more widespread. Given the
level of cleverness in the pyclbr test (and the fact that I wrote it
myself) I'm not worried about widespread use of im_class on unbound
methods.
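[Editor's note: for reference, the behaviour the patch introduces is what Python 3 later shipped: accessing a method through the class yields a plain function with no im_class, while bound methods keep their instance. A quick illustration under Python 3 semantics:]

```python
class C:
    def foo(self):
        return 42

# With unbound methods gone, C.foo is just a function ...
assert type(C.foo).__name__ == 'function'
assert not hasattr(C.foo, 'im_class')

# ... while bound methods still remember their instance
# (im_self survives as __self__).
c = C()
assert c.foo.__self__ is c
assert c.foo() == 42
```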

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From pje at telecommunity.com  Mon Jan 17 16:45:26 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 17 16:44:02 2005
Subject: [Python-Dev] PEP 246: let's reset
In-Reply-To: <1105944547.30052.21.camel@localhost>
References: <5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com>
	<E530176C-6797-11D9-ADA4-000A95EFAE9E@aleax.it>
	<r01050400-1037-E1D04844674711D98C3B003065D5E7E4@10.0.0.23>
	<82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
	<5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com>
	<E530176C-6797-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050117104114.02f79060@mail.telecommunity.com>

At 01:49 AM 1/17/05 -0500, Glyph Lefkowitz wrote:
>On Sun, 2005-01-16 at 13:00 -0500, Phillip J. Eby wrote:
>
> > """One type is the "extender", ...
>
> > By contrast, an "independent adapter" ...
>
>I really like the way this part of the PEP is sounding, since it really
>captures two almost, but not quite, completely different use-cases, the
>confusion between which generated all the discussion here in the first
>place.  The terminology seems a bit cumbersome though.
>
>I'd like to propose that an "extender" be called a "transformer", since
>it provides a transformation for an underlying object - it changes the
>shape of the underlying object so it will fit somewhere else, without
>creating a new object.  Similarly, the cumbersome "independent adapter"
>might be called a "converter", since it converts A into B, where B is
>some new kind of thing.

Heh.  As long as you're going to continue the electrical metaphor, why not 
just call them transformers and appliances?  Appliances "convert" 
electricity into useful non-electricity things, and it's obvious that you 
can have more than one, they're independent objects, etc.  Whereas a 
transformer or converter would be something you use in order to be able to 
change the electricity itself.

Calling views and iterators "appliances" might be a little weird at first, 
but it fits.  (At one point, I thought about calling them "accessories".)


>If nobody likes this idea, it would seem a bit more symmetric to have
>"dependent" and "independent" adapters, rather than "extenders" and
>"independent adapters".  As it is I'm left wondering what the concept of
>dependency in an adapter is.

It's that independent adapters each have state independent from other 
independent adapters of the same type for the same object.  (vs. extenders 
having shared state amongst themselves, even if you have more than one)
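[Editor's note: a small illustration of the distinction Phillip draws, using iterators as "independent adapters": each one keeps its own state over the same underlying object.]

```python
data = [1, 2, 3]

# Two independent adapters over the same object: each iterator
# carries its own position, unaffected by the other.
it1 = iter(data)
it2 = iter(data)

first_from_it1 = next(it1)
second_from_it1 = next(it1)
first_from_it2 = next(it2)   # still starts at the beginning
```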

From gvanrossum at gmail.com  Mon Jan 17 16:45:52 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon Jan 17 16:45:55 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <5.1.1.6.0.20050117103315.02f70a10@mail.telecommunity.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<5.1.1.6.0.20050117103315.02f70a10@mail.telecommunity.com>
Message-ID: <ca471dc205011707456158486@mail.gmail.com>

> Will it still be possible to create an unbound method with
> new.instancemethod?  (I know the patch doesn't change this, I mean, is it
> planned to remove the facility from the instancemethod type?)

I was hoping to be able to get rid of this as soon as the built-in
exceptions code no longer depends on it.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From theller at python.net  Mon Jan 17 16:50:03 2005
From: theller at python.net (Thomas Heller)
Date: Mon Jan 17 16:48:34 2005
Subject: [Python-Dev] Re: 2.3.5 delayed til next week
In-Reply-To: <200501180041.03195.anthony@interlink.com.au> (Anthony Baxter's
	message of "Tue, 18 Jan 2005 00:41:01 +1100")
References: <200501180041.03195.anthony@interlink.com.au>
Message-ID: <y8es6ono.fsf@python.net>

Anthony Baxter <anthony@interlink.com.au> writes:

> As I'd kinda feared, my return to work has left me completely
> buried this week, and so I'm going to have to push 2.3.5 until
> next week. Thomas and Fred: does one of the days in the
> range 25-27 January suit you? The 26th is a public holiday here,
> and so that's the day that's most likely for me...
>
25-27 January are all ok for me.  Will there be a lot of backports, or
are they already in place?  If they are already there, I can build the
installer as soon as Fred has built the html docs.

Thomas

From arigo at tunes.org  Mon Jan 17 16:49:07 2005
From: arigo at tunes.org (Armin Rigo)
Date: Mon Jan 17 17:00:48 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <ca471dc205011707277b386ec8@mail.gmail.com>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
	<20050117105219.GA12763@vicky.ecs.soton.ac.uk>
	<ca471dc205011707277b386ec8@mail.gmail.com>
Message-ID: <20050117154907.GA7853@vicky.ecs.soton.ac.uk>

Hi Guido,

On Mon, Jan 17, 2005 at 07:27:33AM -0800, Guido van Rossum wrote:
> That is stricter than classic Python though -- it allows the value to
> be anything (and you get the value back unadorned in the except 's',
> x: clause).

Thanks for the note!


Armin
From gjc at inescporto.pt  Mon Jan 17 16:05:08 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Mon Jan 17 17:02:04 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <41EA9196.1020709@xs4all.nl>
References: <41EA9196.1020709@xs4all.nl>
Message-ID: <1105974308.17513.1.camel@localhost>

  If someone could take a look at:

[ 1069624 ] incomplete support for AF_PACKET in socketmodule.c

  I have to ship my own patched copy of the socket module because of
this... :|

On Sun, 2005-01-16 at 17:08 +0100, Irmen de Jong wrote:
> Hello
> I've looked at one bug and a bunch of patches and
> added a comment to them:
> 
> (bug) [ 1102649 ] pickle files should be opened in binary mode
> Added a comment about a possible different solution
> 
> [ 946207 ] Non-blocking Socket Server
> Useless, what are the mixins for? Recommend close
> 
> [ 756021 ] Allow socket.inet_aton('255.255.255.255') on Windows
> Looks good but added suggestion about when to test for special case
> 
> [ 740827 ] add urldecode() method to urllib
> I think it's better to group these things into urlparse
> 
> [ 579435 ] Shadow Password Support Module
> Would be nice to have, I recently just couldn't do the user
> authentication that I wanted: based on the users' unix passwords
> 
> [ 1093468 ] socket leak in SocketServer
> Trivial and looks harmless, but don't the sockets
> get garbage collected once the request is done?
> 
> [ 1049151 ] adding bool support to xdrlib.py
> Simple patch and 2.4 is out now, so...
> 
> 
> 
> It would be nice if somebody could have a look at my
> own patches or help me a bit with them:
> 
> [ 1102879 ] Fix for 926423: socket timeouts + Ctrl-C don't play nice
> [ 1103213 ] Adding the missing socket.recvall() method
> [ 1103350 ] send/recv SEGMENT_SIZE should be used more in socketmodule
> [ 1062014 ] fix for 764437 AF_UNIX socket special linux socket names
> [ 1062060 ] fix for 1016880 urllib.urlretrieve silently truncates dwnld
> 
> Some of them come from the last Python Bug Day, see
> http://www.python.org/moin/PythonBugDayStatus
> 
> 
> Thank you !
> 
> Regards,
> 
> --Irmen de Jong
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/gjc%40inescporto.pt
-- 
Gustavo J. A. M. Carneiro
<gjc@inescporto.pt> <gustavo@users.sourceforge.net>
The universe is always one step beyond logic.
From just at letterror.com  Mon Jan 17 17:06:04 2005
From: just at letterror.com (Just van Rossum)
Date: Mon Jan 17 17:06:12 2005
Subject: [Python-Dev] PEP 246: let's reset
In-Reply-To: <5.1.1.6.0.20050117104114.02f79060@mail.telecommunity.com>
Message-ID: <r01050400-1037-B1CCB42268A111D98C3B003065D5E7E4@[10.0.0.23]>

Phillip J. Eby wrote:

> Heh.  As long as you're going to continue the electrical metaphor,
> why not just call them transformers and appliances? [ ... ]

Next we'll see Appliance-Oriented Programming ;-)

Just
From mwh at python.net  Mon Jan 17 17:06:40 2005
From: mwh at python.net (Michael Hudson)
Date: Mon Jan 17 17:06:44 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <ca471dc205011707277b386ec8@mail.gmail.com> (Guido van Rossum's
	message of "Mon, 17 Jan 2005 07:27:33 -0800")
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
	<20050117105219.GA12763@vicky.ecs.soton.ac.uk>
	<ca471dc205011707277b386ec8@mail.gmail.com>
Message-ID: <2mbrboca5r.fsf@starship.python.net>

Guido van Rossum <gvanrossum@gmail.com> writes:

> [Michael]
>> It would still be worth doing, IMHO.
>
> Then let's do it. Care to resurrect your patch? (And yes, classic
> classes should also be allowed for b/w compatibility.)

I found it and uploaded it here:

    http://starship.python.net/crew/mwh/new-style-exception-hacking.diff

The change to type_str was the sort of unexpected change I was talking
about.

TBH, I'm not sure it's really worth working from my patch, a more
sensible course would be to just do the work again, but paying a bit
more attention to getting a maintainable result.

Questions:

a) Is Exception to be new-style?

b) Somewhat but not entirely independently, would demanding that all
   new-style exceptions inherit from Exception be reasonable?

Cheers,
mwh


-- 
  ZAPHOD:  You know what I'm thinking?
    FORD:  No.
  ZAPHOD:  Neither do I.  Frightening isn't it?
                   -- The Hitch-Hikers Guide to the Galaxy, Episode 11
From fdrake at acm.org  Mon Jan 17 17:09:57 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon Jan 17 17:10:04 2005
Subject: [Python-Dev] Re: 2.3.5 delayed til next week
In-Reply-To: <200501180041.03195.anthony@interlink.com.au>
References: <200501180041.03195.anthony@interlink.com.au>
Message-ID: <200501171109.04797.fdrake@acm.org>

On Monday 17 January 2005 08:41, Anthony Baxter wrote:
 > As I'd kinda feared, my return to work has left me completely
 > buried this week, and so I'm going to have to push 2.3.5 until
 > next week. Thomas and Fred: does one of the days in the
 > range 25-27 January suit you? The 26th is a public holiday here,
 > and so that's the day that's most likely for me...

Sounds good to me.  Anything in that range is equally doable.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From pje at telecommunity.com  Mon Jan 17 17:35:53 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 17 17:34:31 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <2mbrboca5r.fsf@starship.python.net>
References: <ca471dc205011707277b386ec8@mail.gmail.com>
	<fb6fbf560501141620eff6d85@mail.gmail.com>
	<20050117105219.GA12763@vicky.ecs.soton.ac.uk>
	<ca471dc205011707277b386ec8@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com>

At 04:06 PM 1/17/05 +0000, Michael Hudson wrote:
>a) Is Exception to be new-style?

Probably not in 2.5; Martin and others have suggested that this could 
introduce instability for users' existing exception classes.


>b) Somewhat but not entirely independently, would demanding that all
>    new-style exceptions inherit from Exception be reasonable?

Yes.  Right now you can't have a new-style exception at all, so it would be 
quite reasonable to require new ones to inherit from Exception.

From gvanrossum at gmail.com  Mon Jan 17 19:16:45 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon Jan 17 19:16:49 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
	<20050117105219.GA12763@vicky.ecs.soton.ac.uk>
	<ca471dc205011707277b386ec8@mail.gmail.com>
	<2mbrboca5r.fsf@starship.python.net>
	<5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com>
Message-ID: <ca471dc2050117101654e0116f@mail.gmail.com>

On Mon, 17 Jan 2005 11:35:53 -0500, Phillip J. Eby
<pje@telecommunity.com> wrote:
> At 04:06 PM 1/17/05 +0000, Michael Hudson wrote:
> >a) Is Exception to be new-style?
> 
> Probably not in 2.5; Martin and others have suggested that this could
> introduce instability for users' existing exception classes.

Really? I thought that was eventually decided to be a very small amount of code.

> >b) Somewhat but not entirely independently, would demanding that all
> >    new-style exceptions inherit from Exception be reasonable?
> 
> Yes.  Right now you can't have a new-style exception at all, so it would be
> quite reasonable to require new ones to inherit from Exception.

That would be much more reasonable if Exception itself was a new-style
class. As long as it isn't, you'd have to declare new-style classes
like this:

class MyError(Exception, object):
    ...

which is ugly.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From gvanrossum at gmail.com  Mon Jan 17 19:21:13 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon Jan 17 19:21:17 2005
Subject: [Python-Dev] PEP 246: let's reset
In-Reply-To: <5.1.1.6.0.20050117104114.02f79060@mail.telecommunity.com>
References: <r01050400-1037-E1D04844674711D98C3B003065D5E7E4@10.0.0.23>
	<82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
	<5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com>
	<E530176C-6797-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com>
	<1105944547.30052.21.camel@localhost>
	<5.1.1.6.0.20050117104114.02f79060@mail.telecommunity.com>
Message-ID: <ca471dc2050117102164aaecdf@mail.gmail.com>

> Heh.  As long as you're going to continue the electrical metaphor, why not
> just call them transformers and appliances?

Please don't. Transformer is commonly used in all sorts of contexts.
But appliances applies mostly to kitchenware and the occasional
marketing term for cheap computers.

The electrical metaphor is cute, but doesn't cut it IMO. Adapter,
converter and transformer all sound to me like they imply an "as a"
relationship rather than "has a". The "has a" kind feels more like a
power tool to me.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From kbk at shore.net  Mon Jan 17 19:32:23 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Mon Jan 17 19:32:44 2005
Subject: [Python-Dev] Re: 2.3.5 delayed til next week
In-Reply-To: <y8es6ono.fsf@python.net> (Thomas Heller's message of "Mon, 17
	Jan 2005 16:50:03 +0100")
References: <200501180041.03195.anthony@interlink.com.au>
	<y8es6ono.fsf@python.net>
Message-ID: <87d5w3vrd4.fsf@hydra.bayview.thirdcreek.com>

Thomas Heller <theller@python.net> writes:

> 25-27 January are all ok for me.  Will there be a lot of backports, or
> are they already in place?  If they are already there, I can build the
> installer as soon as Fred has built the html docs.

I've got a couple, I'll get them in by tomorrow.
-- 
KBK
From pje at telecommunity.com  Mon Jan 17 19:34:30 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 17 19:33:08 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <ca471dc2050117101654e0116f@mail.gmail.com>
References: <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com>
	<fb6fbf560501141620eff6d85@mail.gmail.com>
	<20050117105219.GA12763@vicky.ecs.soton.ac.uk>
	<ca471dc205011707277b386ec8@mail.gmail.com>
	<2mbrboca5r.fsf@starship.python.net>
	<5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050117133028.030bf570@mail.telecommunity.com>

At 10:16 AM 1/17/05 -0800, Guido van Rossum wrote:
>On Mon, 17 Jan 2005 11:35:53 -0500, Phillip J. Eby
><pje@telecommunity.com> wrote:
> > At 04:06 PM 1/17/05 +0000, Michael Hudson wrote:
> > >a) Is Exception to be new-style?
> >
> > Probably not in 2.5; Martin and others have suggested that this could
> > introduce instability for users' existing exception classes.
>
>Really? I thought that was eventually decided to be a very small amount of 
>code.

Guess I missed that part of the thread in the ongoing flood of PEP 246 
stuff.  :)


>That would be much more reasonable if Exception itself was a new-style
>class. As long as it isn't, you'd have to declare new-style classes
>like this:
>
>class MyError(Exception, object):
>     ...
>
>which is ugly.

I was thinking the use case was that you were having to add 'Exception', 
not that you were adding 'object'.  The two times in the past that I wanted 
to make a new-style class an exception, I *first* made it a new-style 
class, and *then* tried to make it an exception.  I believe the OP on this 
thread described the same thing.

But whatever; as long as it's *possible*, I don't care much how it's done, 
and I can't think of anything in my code that would break from making 
Exception new-style.

From pje at telecommunity.com  Mon Jan 17 19:42:54 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 17 19:41:31 2005
Subject: [Python-Dev] PEP 246: let's reset
In-Reply-To: <ca471dc2050117102164aaecdf@mail.gmail.com>
References: <5.1.1.6.0.20050117104114.02f79060@mail.telecommunity.com>
	<r01050400-1037-E1D04844674711D98C3B003065D5E7E4@10.0.0.23>
	<82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
	<5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com>
	<E530176C-6797-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com>
	<1105944547.30052.21.camel@localhost>
	<5.1.1.6.0.20050117104114.02f79060@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050117133747.02c66ec0@mail.telecommunity.com>

At 10:21 AM 1/17/05 -0800, Guido van Rossum wrote:
> > Heh.  As long as you're going to continue the electrical metaphor, why not
> > just call them transformers and appliances?
>
>Please don't. Transformer is commonly used in all sorts of contexts.
>But appliances applies mostly to kitchenware and the occasional
>marketing term for cheap computers.
>
>The electrical metaphor is cute, but doesn't cut it IMO. Adapter,
>converter and transformer all sound to me like they imply an "as a"
>relationship rather than "has a". The "has a" kind feels more like a
>power tool to me.

By the way, another use case for type declarations supporting dynamic 
"as-a" adapters...

Chandler's data model has a notion of "kinds" that a single object can be, 
like Email, Appointment, etc.  A single object can be of multiple kinds, 
sort of like per-instance multiple-inheritance.  Which means that passing 
the same object to routines taking different types would "do the right 
thing" with such an object if they adapted to the desired kind, and if such 
adaptation removed the existing kind-adapter and replaced it with the 
destination kind-adapter.  So, there's an underlying object that just 
represents the identity, and then everything else is "as-a" adaptation.

From gvanrossum at gmail.com  Mon Jan 17 19:44:54 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon Jan 17 19:44:57 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <5.1.1.6.0.20050117133028.030bf570@mail.telecommunity.com>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
	<20050117105219.GA12763@vicky.ecs.soton.ac.uk>
	<ca471dc205011707277b386ec8@mail.gmail.com>
	<2mbrboca5r.fsf@starship.python.net>
	<5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com>
	<ca471dc2050117101654e0116f@mail.gmail.com>
	<5.1.1.6.0.20050117133028.030bf570@mail.telecommunity.com>
Message-ID: <ca471dc20501171044478db623@mail.gmail.com>

> >That would be much more reasonable if Exception itself was a new-style
> >class. As long as it isn't, you'd have to declare new-style classes
> >like this:
> >
> >class MyError(Exception, object):
> >     ...
> >
> >which is ugly.
> 
> I was thinking the use case was that you were having to add 'Exception',
> not that you were adding 'object'.  The two times in the past that I wanted
> to make a new-style class an exception, I *first* made it a new-style
> class, and *then* tried to make it an exception.  I believe the OP on this
> thread described the same thing.
> 
> But whatever; as long as it's *possible*, I don't care much how it's done,
> and I can't think of anything in my code that would break from making
> Exception new-style.

Well, right now you would only want to make an exception a new style
class if you had a very specific use case for wanting the new style
class. But once we allow new-style exceptions *and* require them to
inherit from Exception, we pretty much send the message "if you're not
using new-style exceptions derived from Exception your code is out of
date" and that means it should be as simple as possible to make code
conform. And that means IMO making Exception a new style class.
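[Editor's note: as it happens, this is how things settled -- Exception did become a new-style class in Python 2.5, so the plain single-base spelling works and every exception instance is also an object:]

```python
# With Exception itself new-style, no explicit 'object' base is needed.
class MyError(Exception):
    pass

try:
    raise MyError("boom")
except MyError as e:
    caught = e

assert isinstance(caught, Exception)
assert isinstance(caught, object)   # new-style all the way down
```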

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From martin at v.loewis.de  Mon Jan 17 23:12:25 2005
From: martin at v.loewis.de (Martin v. Löwis)
Date: Mon Jan 17 23:12:26 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <1105974308.17513.1.camel@localhost>
References: <41EA9196.1020709@xs4all.nl> <1105974308.17513.1.camel@localhost>
Message-ID: <41EC3849.8040503@v.loewis.de>

Gustavo J. A. M. Carneiro wrote:
>   If someone could take a look at:
> 
> [ 1069624 ] incomplete support for AF_PACKET in socketmodule.c


The rule applies: five reviews, with results posted to python-dev,
and I will review your patch.

Regards,
Martin
From martin at v.loewis.de  Mon Jan 17 23:14:54 2005
From: martin at v.loewis.de (Martin v. Löwis)
Date: Mon Jan 17 23:14:55 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <ca471dc2050117101654e0116f@mail.gmail.com>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>	<20050117105219.GA12763@vicky.ecs.soton.ac.uk>	<ca471dc205011707277b386ec8@mail.gmail.com>	<2mbrboca5r.fsf@starship.python.net>	<5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com>
	<ca471dc2050117101654e0116f@mail.gmail.com>
Message-ID: <41EC38DE.8080603@v.loewis.de>

Guido van Rossum wrote:
>>>a) Is Exception to be new-style?
>>
>>Probably not in 2.5; Martin and others have suggested that this could
>>introduce instability for users' existing exception classes.
> 
> 
> Really? I thought that was eventually decided to be a very small amount of code.

I still think that only an experiment could decide: somebody should
come up with a patch that does that, and we will see what breaks.

I still have the *feeling* that this has significant impact, but
I could not pin-point this to any specific problem I anticipate.

Regards,
Martin
From gjc at inescporto.pt  Mon Jan 17 23:27:52 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Mon Jan 17 23:28:02 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <41EC3849.8040503@v.loewis.de>
References: <41EA9196.1020709@xs4all.nl>
	<1105974308.17513.1.camel@localhost>  <41EC3849.8040503@v.loewis.de>
Message-ID: <1106000872.5931.6.camel@emperor>

On Mon, 2005-01-17 at 23:12 +0100, "Martin v. Löwis" wrote:
> Gustavo J. A. M. Carneiro wrote:
> >   If someone could take a look at:
> > 
> > [ 1069624 ] incomplete support for AF_PACKET in socketmodule.c
> 
> 
> The rule applies: five reviews, with results posted to python-dev,
> and I will review your patch.

  Oh... sorry, I didn't know about any rules.

/me hides in shame.

-- 
Gustavo J. A. M. Carneiro
<gjc@inescporto.pt> <gustavo@users.sourceforge.net>
The universe is always one step beyond logic

From martin at v.loewis.de  Mon Jan 17 23:46:44 2005
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon Jan 17 23:46:47 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <1106000872.5931.6.camel@emperor>
References: <41EA9196.1020709@xs4all.nl>	 <1105974308.17513.1.camel@localhost>
	<41EC3849.8040503@v.loewis.de> <1106000872.5931.6.camel@emperor>
Message-ID: <41EC4054.6000908@v.loewis.de>

Gustavo J. A. M. Carneiro wrote:
>   Oh... sorry, I didn't know about any rules.

My apologies - I had announced this (personal) rule
a few times, so I thought everybody on python-dev knew.
If you really want to push a patch, you
can do so by doing your own share of work, namely by
reviewing others' patches. If you don't, someone
will apply your patch when he finds the time to do so.
So if you can wait, it might be best to wait a few
months (this won't go into 2.4 patch releases, anyway).

I think Brett Cannon now also follows this rule; it
falls short in practice, though, because (almost)
nobody wants to push his patch badly enough to
put in the work of reviewing other patches.

Regards,
Martin
From mal at egenix.com  Mon Jan 17 23:58:34 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon Jan 17 23:58:37 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <ca471dc2050117074347baa60d@mail.gmail.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>	<1105945019.30052.26.camel@localhost>
	<ca471dc2050117074347baa60d@mail.gmail.com>
Message-ID: <41EC431A.90204@egenix.com>

Guido van Rossum wrote:
> Apart from the tests that were testing the behavior of im_class, I
> found only a single piece of code in the standard library that used
> im_class of an unbound method object (the clever test in the pyclbr
> test). Uses of im_self and im_func were more widespread. Given the
> level of cleverness in the pyclbr test (and the fact that I wrote it
> myself) I'm not worried about widespread use of im_class on unbound
> methods.

I guess this depends on how you define widespread use. I'm using
this feature a lot via the basemethod() function in mxTools for
calling the base method of an overridden method in mixin classes
(basemethod() predates super() and unlike the latter works for
old-style classes).

What I don't understand in your reasoning is that you are talking
about making an unbound method look more like a function. Unbound
methods and bound methods are objects of the same type -
the method object. By turning an unbound method into a function
type, you break code that tests for MethodType in Python
or does a PyMethod_Check() at C level.

If you want to make methods look more like functions,
the method object should become a subclass of the function
object (function + added im_* attributes).

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 10 2005)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From glyph at divmod.com  Tue Jan 18 00:33:34 2005
From: glyph at divmod.com (Glyph Lefkowitz)
Date: Tue Jan 18 00:28:49 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <ca471dc2050117074347baa60d@mail.gmail.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<1105945019.30052.26.camel@localhost>
	<ca471dc2050117074347baa60d@mail.gmail.com>
Message-ID: <1106004814.30052.80.camel@localhost>

On Mon, 2005-01-17 at 07:43 -0800, Guido van Rossum wrote:

> Note that you can't pickle unbound methods anyway unless you write
> specific support code to do that; it's not supported by pickle
> itself.

It's supported by Twisted.  Alternatively, replace "pickle" with "python
object serializer of my design" - I am concerned both about useful
information being removed, and about specific features of Pickle.

Twisted's .tap files have always pushed the limits of pickle.  I don't
remember why my users wanted this specific feature - the code in
question is almost 3 years old - but when you think of a pickle as a
self-contained universe of running Python objects, any plausible reason
why one might want a reference to an unbound method in code becomes a
reason to want to serialize one.

The only time I've used it myself was to pickle attributes of
interfaces, which I no longer need to do since zope.interface has its
own object types for that, so it's not really _that_ important to me.
On the other hand, if PJE's "monkey typing" PEP is accepted, there will
probably be lots more reasons to serialize unbound methods, for
descriptive purposes.

> I think that use case is weak.

It's not the strongest use-case in the world, but is the impetus to
remove unbound method objects from Python that much stronger?  I like
the fact that it's simpler, but it's a small amount of extra simplicity,
it doesn't seem to enable any new use-cases, and it breaks the potential
for serialization.

In general, pickle handles other esoteric, uncommon use-cases pretty
well:

        >>> x = []
        >>> y = (x,)
        >>> x.append(y)
        >>> import cPickle
        >>> cPickle.dumps(x)
        '(lp1\n(g1\ntp2\na.'
        >>> x
        [([...],)]
        
since when you need 'em, you really need 'em.

Method objects were previously unsupported, which is fine because
they're pretty uncommon.  Not only would this patch continue to not
support them, though, it makes the problem impossible to fix in
3rd-party code.  By removing the unbound method type, it becomes an
issue that has to be fixed in the standard library, assuming that
3rd-party code will not be able to change the way functions are
pickled and unpickled in cPickle in python2.5.

Ironically, I think that this use case is also going to become more
common if the patch goes in, because then it is going to be possible to
"borrow" functionality without going around a method's back to grab its
im_func.

> If you really have the need to pickle an individual unbound method,
> it's less work to create a global helper function and pickle that,
> than to write the additional pickling support for pickling unbound
> methods.

This isn't true if you've already got the code written, which I do ;-).


From glyph at divmod.com  Tue Jan 18 00:35:23 2005
From: glyph at divmod.com (Glyph Lefkowitz)
Date: Tue Jan 18 00:30:36 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <41EC431A.90204@egenix.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<1105945019.30052.26.camel@localhost>
	<ca471dc2050117074347baa60d@mail.gmail.com> <41EC431A.90204@egenix.com>
Message-ID: <1106004923.30052.82.camel@localhost>

On Mon, 2005-01-17 at 23:58 +0100, M.-A. Lemburg wrote:

> If you want to make methods look more like functions,
> the method object should become a subclass of the function
> object (function + added im_* attributes).

I think this suggestion would fix my serialization problem as well...
but does it actually buy enough extra simplicity to make it worthwhile?

From barry at python.org  Tue Jan 18 01:29:35 2005
From: barry at python.org (Barry Warsaw)
Date: Tue Jan 18 01:29:42 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <41EC431A.90204@egenix.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<1105945019.30052.26.camel@localhost>
	<ca471dc2050117074347baa60d@mail.gmail.com> <41EC431A.90204@egenix.com>
Message-ID: <1106008175.20172.115.camel@geddy.wooz.org>

On Mon, 2005-01-17 at 17:58, M.-A. Lemburg wrote:

> If you want to make methods look more like functions,
> the method object should become a subclass of the function
> object (function + added im_* attributes).

I have no personal use cases, but it does make me vaguely uncomfortable
to lose im_class.  Isn't it possible to preserve this attribute?

-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20050117/ac818a02/attachment.pgp
From bob at redivi.com  Tue Jan 18 02:07:13 2005
From: bob at redivi.com (Bob Ippolito)
Date: Tue Jan 18 02:07:18 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <1106004814.30052.80.camel@localhost>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<1105945019.30052.26.camel@localhost>
	<ca471dc2050117074347baa60d@mail.gmail.com>
	<1106004814.30052.80.camel@localhost>
Message-ID: <49A0D307-68ED-11D9-A13E-000A95BA5446@redivi.com>


On Jan 17, 2005, at 18:33, Glyph Lefkowitz wrote:

> It's not the strongest use-case in the world, but is the impetus to
> remove unbound method objects from Python that much stronger?  I like
> the fact that it's simpler, but it's a small amount of extra 
> simplicity,
> it doesn't seem to enable any new use-cases, and it breaks the 
> potential
> for serialization.

Well, it lets you meaningfully do:

class Foo:
     def someMethod(self):
         pass

class Bar:
     someMethod = Foo.someMethod

Where now you have to do:

class Bar:
     someMethod = Foo.someMethod.im_func

I'm not sure how useful this actually is, though.
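For what it's worth, once ``C.f`` is a plain function the borrowed method binds against whatever class it lands in; a small sketch of the behavior Bob describes (hypothetical classes, shown as it works in a Python without unbound methods):

```python
class Foo:
    def someMethod(self):
        # report the class of the instance this was called on
        return type(self).__name__

class Bar:
    someMethod = Foo.someMethod   # no .im_func dance needed

# The borrowed function binds to Bar instances like any other method.
assert Bar().someMethod() == "Bar"
assert Foo().someMethod() == "Foo"
```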

-bob

From gvanrossum at gmail.com  Tue Jan 18 06:15:42 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan 18 06:15:48 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <41EC431A.90204@egenix.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<1105945019.30052.26.camel@localhost>
	<ca471dc2050117074347baa60d@mail.gmail.com>
	<41EC431A.90204@egenix.com>
Message-ID: <ca471dc20501172115275a99c9@mail.gmail.com>

[Guido]
> > Apart from the tests that were testing the behavior of im_class, I
> > found only a single piece of code in the standard library that used
> > im_class of an unbound method object (the clever test in the pyclbr
> > test). Uses of im_self and im_func were more widespread. Given the
> > level of cleverness in the pyclbr test (and the fact that I wrote it
> > myself) I'm not worried about widespread use of im_class on unbound
> > methods.

[Marc-Andre]
> I guess this depends on how you define widespread use. I'm using
> this feature a lot via the basemethod() function in mxTools for
> calling the base method of an overridden method in mixin classes
> (basemethod() predates super() and unlike the latter works for
> old-style classes).

I'm not sure I understand how basemethod is supposed to work; I can't
find docs for it using Google (only three hits for the query mxTools
basemethod). How does it depend on im_class?

> What I don't understand in your reasoning is that you are talking
> about making an unbound method look more like a function.

That's a strange interpretation. I'm getting rid of the unbound method
object altogether.

> Unbound
> methods and bound methods are objects of the same type -
> the method object.

Yeah I know that. :-)

And it is one of the problems -- the two uses are quite distinct and
yet it's the same object, which is confusing.

> By turning an unbound method into a function
> type, you break code that tests for MethodType in Python
> or does a PyMethod_Check() at C level.

My expectation is that there is very little code like that. Almost
all the code that I found doing that in the core Python code (none in
C BTW) was in the test suite.

> If you want to make methods look more like functions,
> the method object should become a subclass of the function
> object (function + added im_* attributes).

Can't do that, since the (un)bound method object supports binding
other callables besides functions.

[Glyph]
> On the other hand, if PJE's "monkey typing" PEP is accepted, there will
> probably be lots more reasons to serialize unbound methods, for
> descriptive purposes.

Let's cross that bridge when we get to it.

> It's not the strongest use-case in the world, but is the impetus to
> remove unbound method objects from Python that much stronger?

Perhaps not, but we have to strive for simplicity whenever we can, to
counteract the inevitable growth in complexity of the language
elsewhere.

> I like
> the fact that it's simpler, but it's a small amount of extra simplicity,
> it doesn't seem to enable any new use-cases,

I think it does. You will be able to get a method out of a class and
put it into another unrelated class. Previously, you would have to use
__dict__ or im_func to do that.

Also, I've always liked the explanation of method calls that

    C().f()

is the same as

    C.f(C())

and to illustrate this it would be nice to say "look, C.f is just a function".
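That explanation becomes directly checkable once the wrapper is gone; a minimal sketch of the equivalence (as it behaves in a Python without unbound methods; the class is hypothetical):

```python
class C:
    def f(self):
        # return something identifiable so both call forms can be compared
        return "f called on %s" % type(self).__name__

# Retrieving f from the class yields a plain function, not a wrapper...
assert type(C.f).__name__ == "function"
# ...so the two call forms really are the same call:
assert C().f() == C.f(C()) == "f called on C"
```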

> and it breaks the potential for serialization.

For which you seem to have no use yourself. The fact that you support
it doesn't prove that it's used -- large software packages tend to
accrete lots of unused features over time, because it's safer to keep
it in than to remove it.

This is a trend I'd like to buck with Python. There's lots of dead
code in Python's own standard library, and one day it will bite the
dust.

[Barry]
> I have no personal use cases, but it does make me vaguely uncomfortable
> to lose im_class.  Isn't it possible to preserve this attribute?

That vague discomfort is called FUD until proven otherwise. :-)

Keeping im_class would be tricky -- the information isn't easily
available when the function is defined, and adding it would require
changing unrelated code that the patch so far didn't have to get near.
Also, it would not be compatible -- the unbound method sets im_class
to whichever class was used to retrieve the attribute, not the class
in which the function was defined.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From gvanrossum at gmail.com  Tue Jan 18 06:17:44 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan 18 06:17:48 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <41EC38DE.8080603@v.loewis.de>
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
	<20050117105219.GA12763@vicky.ecs.soton.ac.uk>
	<ca471dc205011707277b386ec8@mail.gmail.com>
	<2mbrboca5r.fsf@starship.python.net>
	<5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com>
	<ca471dc2050117101654e0116f@mail.gmail.com>
	<41EC38DE.8080603@v.loewis.de>
Message-ID: <ca471dc2050117211778acdd6a@mail.gmail.com>

> I still think that only an experiment could decide: somebody should
> come up with a patch that does that, and we will see what breaks.
> 
> I still have the *feeling* that this has significant impact, but
> I could not pin-point this to any specific problem I anticipate.

This sounds like a good approach. We should do this now in 2.5, and as
alpha and beta testing progresses we can decide whether to roll it
back or what kind of backwards compatibility to provide.

(Most exceptions are very short classes with very limited behavior, so
I expect that in the large majority of cases it won't matter. The
question is of course how small the remaining minority is.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From tdelaney at avaya.com  Tue Jan 18 06:56:11 2005
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Tue Jan 18 06:56:17 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE721231@au3010avexu1.global.avaya.com>

Guido van Rossum wrote:

> Keeping im_class would be tricky -- the information isn't easily
> available when the function is defined, and adding it would require
> changing unrelated code that the patch so far didn't have to get near.
> Also, it would not be compatible -- the unbound method sets im_class
> to whichever class was used to retrieve the attribute, not the class
> in which the function was defined.

I actually do have a use case for im_class, but not in its current
incarnation. It would be useful if im_class was set (permanently) to the
class in which the function was defined.

My use case is my autosuper recipe. Currently I have to trawl through
the MRO, comparing code objects to find out which class I'm currently
in. Most annoyingly, I have to trawl *beyond* where I first find the
function, in case it's actually come from a base class (otherwise
infinite recursion can result ;)

If im_class were set to the class where the function was defined, I could
definitely avoid the second part of the trawling (not sure about the
first yet, since I need to get at the function object).
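The trawl Tim describes can be sketched like this (``defining_class`` is a hypothetical helper; the comparison is by code object, as in the recipe, and methods retrieved from a class are shown as plain functions):

```python
def defining_class(cls, func):
    """Walk the MRO, comparing code objects, to find the class that
    actually defines func (hypothetical helper sketching the trawl)."""
    for klass in cls.__mro__:
        candidate = klass.__dict__.get(func.__name__)
        code = getattr(candidate, "__code__", None)
        if code is not None and code is func.__code__:
            return klass
    return None

class Base:
    def method(self):
        return "Base"

class Sub(Base):
    def method(self):
        return "Sub"

class Leaf(Sub):
    pass   # inherits Sub.method -- the trawl must stop at Sub, not Leaf

assert defining_class(Leaf, Leaf.method) is Sub
assert defining_class(Leaf, Base.method) is Base
```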

Cheers.

Tim Delaney
From ncoghlan at iinet.net.au  Tue Jan 18 09:37:05 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Tue Jan 18 09:37:11 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <41EBAB67.2080907@egenix.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<41EB8CF7.9010002@iinet.net.au> <41EBAB67.2080907@egenix.com>
Message-ID: <41ECCAB1.5080303@iinet.net.au>

M.-A. Lemburg wrote:
> Nick Coghlan wrote:
> 
>> Guido van Rossum wrote:
>>
>>> What do people think? (My main motivation for this, as stated before,
>>> is that it adds complexity without much benefit.)
>>
>>
>>
>> I'm in favour, since it removes the "an unbound method is almost like 
>> a bare function, only not quite as useful" distinction. It would allow 
>> things like str.join(sep, seq) to work correctly for a Unicode separator. 
> 
> 
> This won't work. Strings and Unicode are two different types,
> not subclasses of one another.

My comment was based on misremembering how str.join actually works. It 
automatically flips to Unicode when it finds a Unicode string in the sequence - 
however, it doesn't do that for the separator, since that should already have 
been determined to be a string by the method lookup machinery.

However, looking at the code for string_join suggests another possible issue 
with removing unbound methods. The function doesn't check the type of the first 
argument - it just assumes it is a PyStringObject.

PyString_Join adds the typecheck that is normally performed by the method 
wrapper when str.join is invoked from Python.

The issue is that, if the unbound method wrapper is removed, 
str.join(some_unicode_str, seq) will lead to PyString_AS_STRING being invoked on 
a PyUnicode object. Ditto for getting the arguments out of order, as in 
str.join(seq, separator). At the moment, the unbound method wrapper takes care 
of raising a TypeError in both of these cases. Without it, we get an unsafe 
PyString macro being applied to an arbitrary type.

I wonder how many other C methods make the same assumption, and skip type 
checking on the 'self' argument? It certainly seems to be the standard approach 
in stringobject.c.
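The concern can be exercised from the Python side; whichever layer performs it (method wrapper or method descriptor), *something* has to reject a mismatched 'self' before unsafe macros run. A sketch, with modern str/bytes standing in for the str/unicode pair under discussion:

```python
# Called conventionally, 'self' has the right type:
assert str.join("-", ["a", "b"]) == "a-b"

# Called with the wrong 'self' (bytes instead of str), a type check
# must fire before any PyString_AS_STRING-style macro is applied:
try:
    str.join(b"-", ["a", "b"])
except TypeError:
    rejected = True
else:
    rejected = False
assert rejected
```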

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From mal at egenix.com  Tue Jan 18 10:28:57 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue Jan 18 10:29:00 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <ca471dc20501172115275a99c9@mail.gmail.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>	
	<1105945019.30052.26.camel@localhost>	
	<ca471dc2050117074347baa60d@mail.gmail.com>	
	<41EC431A.90204@egenix.com>
	<ca471dc20501172115275a99c9@mail.gmail.com>
Message-ID: <41ECD6D9.9000001@egenix.com>

Guido van Rossum wrote:
> [Guido]
> 
>>>Apart from the tests that were testing the behavior of im_class, I
>>>found only a single piece of code in the standard library that used
>>>im_class of an unbound method object (the clever test in the pyclbr
>>>test). Uses of im_self and im_func were more widespread. Given the
>>>level of cleverness in the pyclbr test (and the fact that I wrote it
>>>myself) I'm not worried about widespread use of im_class on unbound
>>>methods.
> 
> 
> [Marc-Andre]
> 
>>I guess this depends on how you define widespread use. I'm using
>>this feature a lot via the basemethod() function in mxTools for
>>calling the base method of an overridden method in mixin classes
>>(basemethod() predates super() and unlike the latter works for
>>old-style classes).
> 
> 
> I'm not sure I understand how basemethod is supposed to work; I can't
> find docs for it using Google (only three hits for the query mxTools
> basemethod). How does it depend on im_class?

It uses im_class to find the class defining the (unbound) method:

def basemethod(object,method=None):

     """ Return the unbound method that is defined *after* method in the
         inheritance order of object with the same name as method
         (usually called base method or overridden method).

         object can be an instance, class or bound method. method, if
         given, may be a bound or unbound method. If it is not given,
object must be a bound method.

         Note: Unbound methods must be called with an instance as first
         argument.

         The function uses a cache to speed up processing. Changes done
         to the class structure after the first hit will not be noticed
         by the function.

     """
     ...

This is how it is used in mixin classes to call the base
method of the overridden method in the inheritance tree (of
old-style classes):

class RequestListboxMixin:

     def __init__(self,name,viewname,viewdb,context=None,use_clipboard=0,
                  size=None,width=None,monospaced=1,events=None):

         # Call base method
         mx.Tools.basemethod(self, RequestListboxMixin.__init__)\
                            (self,name,size,width,monospaced,None,events)

         ...

Without .im_class for the unbound method, basemethod would
cease to work since it uses this attribute to figure out
the class object defining the overriding method.
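The walk basemethod performs can be sketched as follows (a hypothetical simplification: the real mxTools version derives the starting class from im_class and caches results, which is exactly why losing im_class would break it -- here the class is passed explicitly instead):

```python
def basemethod_sketch(cls, name):
    """Return the next definition of `name` after cls in the MRO --
    i.e. the method that cls's own definition overrides."""
    mro = cls.__mro__
    start = mro.index(cls) + 1        # skip cls's own definition
    for base in mro[start:]:
        if name in base.__dict__:
            return base.__dict__[name]
    raise AttributeError(name)

class Mixin:
    def __init__(self):
        self.mixin_ready = True

class Widget(Mixin):
    def __init__(self):
        # call the base method, as in the RequestListboxMixin example
        basemethod_sketch(Widget, "__init__")(self)
        self.widget_ready = True

w = Widget()
assert w.mixin_ready and w.widget_ready
```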

I can send you the code if you don't have egenix-mx-base
installed somewhere (it's in mx/Tools/Tools.py).

>>What I don't understand in your reasoning is that you are talking
>>about making an unbound method look more like a function.
> 
> That's a strange interpretation. I'm getting rid of the unbound method
> object altogether.

Well, you do have to assign some other type to the object
that is returned by "myClass.myMethod" and as I understood
your proposal, the returned object should be of the
FunctionType. So from an application point of view, you
are changing the type of an object.

>>Unbound
>>methods and bound methods are objects of the same type -
>>the method object.
> 
> Yeah I know that. :-)
> 
> And it is one of the problems -- the two uses are quite distinct and
> yet it's the same object, which is confusing.

Hmm, I have a hard time seeing how you can get rid
of unbound methods while keeping bound methods - since
both are the same type :-)

>>By turning an unbound method into a function
>>type, you break code that tests for MethodType in Python
>>or does a PyMethod_Check() at C level.
> 
> 
> My expectation is that there is very little code like that. Almost
> all the code that I found doing that in the core Python code (none in
> C BTW) was in the test suite.

I'm using PyMethod_Check() in mxProxy to automatically
wrap methods of proxied object in order to prevent references
to the object class or the object itself to slip by the
proxy. Changing the type to function object and placing
the class information into a function attribute would break
this approach. Apart from that the type change (by itself)
would not affect the eGenix code base.

I would expect code in the following areas to make use
of the type check:
* language interface code (e.g. Java, .NET bridges)
* security code that tries to implement object access control
* RPC applications that use introspection to generate
   interface definitions (e.g. WSDL service definitions)
* debugging tools (e.g. IDEs)

Perhaps a few others could scan their code base as well?!

>>If you want to make methods look more like functions,
>>the method object should become a subclass of the function
>>object (function + added im_* attributes).
> 
> Can't do that, since the (un)bound method object supports binding
> other callables besides functions.

Is this feature used anywhere?
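It is at least possible: the bound-method machinery will bind any callable, not just a function. A minimal sketch (hypothetical classes, using types.MethodType to do the binding explicitly):

```python
import types

class Greeter:
    pass

class Hello:
    """A non-function callable to be bound as a method."""
    def __call__(self, owner):
        return "hello from %s" % type(owner).__name__

# MethodType accepts an arbitrary callable as the underlying "function".
bound = types.MethodType(Hello(), Greeter())
assert bound() == "hello from Greeter"
```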

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 10 2005)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From arigo at tunes.org  Tue Jan 18 13:59:14 2005
From: arigo at tunes.org (Armin Rigo)
Date: Tue Jan 18 14:11:03 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114174132.GA46344@prometheusresearch.com>
References: <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com>
	<20050114163900.GA21005@vicky.ecs.soton.ac.uk>
	<20050114174132.GA46344@prometheusresearch.com>
Message-ID: <20050118125914.GA28380@vicky.ecs.soton.ac.uk>

Hi Clark,

On Fri, Jan 14, 2005 at 12:41:32PM -0500, Clark C. Evans wrote:
> Imagine enhancing the stack-trace with additional information about
> what adaptations were made; 
> 
>     Traceback (most recent call last):
>        File "xxx", line 1, in foo
>          Adapting x to File
>        File "yyy", line 384, in bar
>          Adapting x to FileName
>        etc.

More thought should be devoted to this, because it would be very valuable.
There should also be a way to know why a given call to adapt() returned an
unexpected object even if it didn't crash.  Given the nature of the problem,
it seems not only "nice" but essential to have a good way to debug it.

> How can we express your thoughts so that they fit into a narrative
> describing how adapt() should and should not be used?

I'm attaching a longer, hopefully easier reformulation...


Armin
-------------- next part --------------

A view on adaptation
====================

Adaptation is a tool to help exchange data between two pieces of code; a very powerful tool, even.  But it is easy to misunderstand its aim, and unlike other features of a programming language, misusing adaptation will quickly lead into intricate debugging nightmares.  Here is the point of view on adaptation which I defend, and which I believe should be kept in mind.


Let's take an example.  You want to call a function in the Python standard library to do something interesting, like pickling (saving) a number of instances to a file with the ``pickle`` module.  You might remember that there is a function ``pickle.dump(obj, file)``, which saves the object ``obj`` to the file ``file``, and another function ``pickle.load(file)`` which reads back the object from ``file``.  (Adaptation doesn't help you to figure this out; you have to be at least a bit familiar with the standard library to know that this feature exists.)

Let's take the example of ``pickle.load(file)``.  Even if you remember about it, you might still have to look up the documentation if you don't remember exactly what kind of object ``file`` is supposed to be.  Is it an open file object, or a file name?  All you know is that ``file`` is meant to somehow "be", or "stand for", the file.  Now there are at least two commonly used ways to "stand for" a file: the file path as a string, or the file object directly.  Actually, it might even not be a file at all, but just a string containing the already-loaded binary data.  This gives a third alternative.

The point here is that the person who wrote the ``pickle.load(x)`` function also knew that the argument was supposed to "stand for" a source of binary data to read from, and he had to make a choice for one of the three common representations: file path, file object, or raw data in a string.  The "source of binary data" is what both the author of the function and you would easily agree on; the formal choice of representation is more arbitrary.  This is where adaptation is supposed to help.  With properly set up adaptation, you can pass to ``pickle.load()`` either a file name or a file object, or possibly anything else that "reasonably stands for" an input file, and it will just work.


But to understand it more fully, we need to look a bit closer.  Imagine yourself as the author of functions like ``pickle.load()`` and ``pickle.dump()``.  You decide if you want to use adaptation or not.  Adaptation should be used in this case, and ONLY in this kind of case: there is some generally agreed concept on what a particular object -- typically an argument of function -- should represent, but not on precisely HOW it should represent it.  If your function expects a "place to write the data to", it can typically be an open file or just a file name; in this case, the function would be defined like this::

    def dump_data_into(target):
        file = adapt(target, TargetAsFile)
        file.write('hello')

with ``TargetAsFile`` being suitably defined -- i.e. having a correct ``__adapt__()`` special method -- so that the adaptation will accept either a file or a string, and in the latter case open the named file for writing.
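A minimal sketch of what "suitably defined" could look like (the ``adapt()`` shown here is a toy version of the PEP 246 protocol, not a real implementation; names follow the text):

```python
import io

class TargetAsFile:
    """Protocol object: 'something standing for a place to write data',
    to be represented as a writable file object."""
    @staticmethod
    def __adapt__(obj):
        if hasattr(obj, "write"):         # already file-like
            return obj
        if isinstance(obj, str):          # a string is taken as a file name
            return open(obj, "w")
        return None                       # cannot adapt

def adapt(obj, protocol):
    """Toy adapt(): ask the protocol's __adapt__ hook, fail loudly."""
    result = protocol.__adapt__(obj)
    if result is None:
        raise TypeError("cannot adapt %r to %s" % (obj, protocol.__name__))
    return result

def dump_data_into(target):
    file = adapt(target, TargetAsFile)
    file.write("hello")

buf = io.StringIO()
dump_data_into(buf)          # a file-like object passes straight through
assert buf.getvalue() == "hello"
```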

Surely, you think that ``TargetAsFile`` is a strange name for an interface if you think about adaptation in terms of interfaces.  Well, for the purpose of this argument, don't.  Forget about interfaces.  This special object ``TargetAsFile`` means not one but two things at once: that the input argument ``target`` represents the place into which data should be written; and that the result ``file`` of the adaptation, as used within the function itself, must be more precisely a file object.

This two-level distinction is important to keep in mind, especially when adapting built-in objects like strings and files.  For example, the adaptation that would be used in ``pickle.load(source)`` is more difficult to get right, because there are two common ways that a string object can stand for a source of data: either as the name of a file, or as raw binary data.  It is not possible to distinguish between these two different uses of ``str`` automatically.  In other words, strings are very versatile and low-level objects which can have various meanings in various contexts, and sometimes these meanings even conflict in the same context!  More concretely, it is not possible to use adaptation to write a function ``pickle.load(source)`` which accepts either a file, a file name, or a raw binary string.  You have to make a choice.  For symmetry with the case of ``TargetAsFile``, a ``SourceAsFile`` would probably interpret a string as a file name, and the caller still has to explicitly turn a raw string into a file-like object -- by wrapping it in a ``StringIO()``.

However, it would be possible to extend our adapters to accept URLs, say, because it's possible to distinguish between a local file name and a URL.  Similarly, various other object types could unambiguously refer to, respectively, a "source" or "target" of data.


The essential point is: the criterion for deciding whether it is reasonable to add a new adaptation path is whether the object you are adapting "clearly stands" for the **high-level concept** that you are adapting to, and **not** for whatever resulting type or interface the adapted object should have.  It **makes no sense** to adapt a string to a file or a file-like object.  *Never define an adapter from the string type to the file type!!*  A string and a file are two low-level concepts that mean different things.  It only makes sense to adapt a string to a "source of data", which is then represented as a file.


This subtle distinction is essential when adapting built-in types.  In large frameworks, it is perhaps more common to adapt to interfaces or between classes specific to your framework.  These interfaces and classes merge both roles: one class is both a concrete object in the Python sense -- a type -- and a single embodied concept.  In this case, the difference between a concrete instance and the concept it stands for is not so important.  This is why we can often think about adaptation as creating an adapter object on top of an instance, to provide a different interface for the object.  If you adapt an instance to an interface ``I``, you really mean that there is a common concept behind the instance and ``I``, and you want to change from the representation given by the instance to the one given by ``I``.

I believe it is useful to keep in mind that adaptation is really about converting between different concrete representations ("str", "file") of a common abstract concept ("source of data").  You at least have to identify which abstract concept you want to adapt representations of, before you define your own adapters.  If you do, then properties like the transitivity of adaptation (i.e. automatically finding longer adaptation paths A -> B -> C when asked to adapt from A to C) become desirable, because the intermediate steps are merely changes in representation for the same abstract concept ("it's the same source of data all along").  If you don't, then transitivity becomes the Source Of All Nightmares :-)
From pje at telecommunity.com  Tue Jan 18 15:38:51 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 18 15:37:44 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050118125914.GA28380@vicky.ecs.soton.ac.uk>
References: <20050114174132.GA46344@prometheusresearch.com>
	<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it>
	<e443ad0e0501131343347d5cf5@mail.gmail.com>
	<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
	<20050114010307.GA51446@prometheusresearch.com>
	<ca471dc2050113175217585406@mail.gmail.com>
	<5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
	<ca471dc205011322205f4d28ec@mail.gmail.com>
	<5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com>
	<20050114163900.GA21005@vicky.ecs.soton.ac.uk>
	<20050114174132.GA46344@prometheusresearch.com>
Message-ID: <5.1.1.6.0.20050118093756.042e6020@mail.telecommunity.com>

At 12:59 PM 1/18/05 +0000, Armin Rigo wrote:
> > How can we express your thoughts so that they fit into a narrative
> > describing how adapt() should and should not be used?
>
>I'm attaching a longer, hopefully easier reformulation...

Well said!  You've explained my "interface per use case" theory much better 
than I ever have.

From walter at livinglogic.de  Tue Jan 18 16:05:42 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Tue Jan 18 16:05:50 2005
Subject: [Python-Dev] __str__ vs. __unicode__
Message-ID: <41ED25C6.80603@livinglogic.de>

__str__ and __unicode__ seem to behave differently. A __str__
override in a str subclass is used when calling str(), but a __unicode__
override in a unicode subclass is *not* used when calling unicode():

-------------------------------
class str2(str):
     def __str__(self):
         return "foo"

x = str2("bar")
print str(x)

class unicode2(unicode):
     def __unicode__(self):
         return u"foo"

x = unicode2(u"bar")
print unicode(x)
-------------------------------

This outputs:
foo
bar

IMHO this should be fixed so that __unicode__() is used in the
second case too.

Bye,
    Walter Dörwald

From jhylton at gmail.com  Tue Jan 18 17:06:58 2005
From: jhylton at gmail.com (Jeremy Hylton)
Date: Tue Jan 18 17:07:02 2005
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python ceval.c,
	2.420, 2.421
In-Reply-To: <E1CqviN-00019N-Tw@sc8-pr-cvs1.sourceforge.net>
References: <E1CqviN-00019N-Tw@sc8-pr-cvs1.sourceforge.net>
Message-ID: <e8bf7a53050118080662621485@mail.gmail.com>

On Tue, 18 Jan 2005 07:56:19 -0800, mwh@users.sourceforge.net
<mwh@users.sourceforge.net> wrote:
> Update of /cvsroot/python/python/dist/src/Python
> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv4034/Python
> 
> Modified Files:
>         ceval.c
> Log Message:
> Change the name of the macro used by --with-tsc builds to the less
> inscrutable READ_TIMESTAMP.

An obvious improvement.  Thanks!

Jeremy
From mwh at python.net  Tue Jan 18 18:00:45 2005
From: mwh at python.net (Michael Hudson)
Date: Tue Jan 18 18:00:46 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <41EC38DE.8080603@v.loewis.de> (
	=?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Mon,
	17 Jan 2005 23:14:54 +0100")
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
	<20050117105219.GA12763@vicky.ecs.soton.ac.uk>
	<ca471dc205011707277b386ec8@mail.gmail.com>
	<2mbrboca5r.fsf@starship.python.net>
	<5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com>
	<ca471dc2050117101654e0116f@mail.gmail.com>
	<41EC38DE.8080603@v.loewis.de>
Message-ID: <2my8eqbrk2.fsf@starship.python.net>

"Martin v. Löwis" <martin@v.loewis.de> writes:

> Guido van Rossum wrote:
>>>>a) Is Exception to be new-style?
>>>
>>>Probably not in 2.5; Martin and others have suggested that this could
>>>introduce instability for users' existing exception classes.
>> Really? I thought that was eventually decided to be a very small
>> amount of code.
>
> I still think that only an experiment could decide: somebody should
> come up with a patch that does that, and we will see what breaks.
>
> I still have the *feeling* that this has significant impact, but
> I could not pin-point this to any specific problem I anticipate.

Well, some code is certainly going to break such as this from
warnings.py:

    assert isinstance(category, types.ClassType), "category must be a class"

or this from traceback.py:

    if type(etype) == types.ClassType:
        stype = etype.__name__
    else:
        stype = etype

I hope to have a new patch (which makes PyExc_Exception new-style, but
allows arbitrary old-style classes as exceptions) "soon".  It may even
pass bits of "make test" :)
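
The failure mode of checks like the one quoted from warnings.py is that old-style classes are instances of ``types.ClassType`` while a new-style ``Exception`` is an instance of ``type``.  A small sketch (the helper name is made up, and it is written so it also runs on later Pythons, where ``types.ClassType`` no longer exists):

```python
import types

def looks_like_oldstyle_class(obj):
    # warnings.py's isinstance check, made tolerant of Pythons
    # that have no types.ClassType at all
    return isinstance(obj, getattr(types, "ClassType", ()))

# A new-style Exception is an instance of `type`, not ClassType,
# so the old `assert isinstance(category, types.ClassType)` breaks:
assert isinstance(Exception, type)
assert not looks_like_oldstyle_class(Exception)
```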

Cheers,
mwh

-- 
  SPIDER:  'Scuse me. [scuttles off]
  ZAPHOD:  One huge spider.
    FORD:  Polite though.
                   -- The Hitch-Hikers Guide to the Galaxy, Episode 11
From gvanrossum at gmail.com  Tue Jan 18 18:17:48 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan 18 18:17:51 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DE721231@au3010avexu1.global.avaya.com>
References: <338366A6D2E2CA4C9DAEAE652E12A1DE721231@au3010avexu1.global.avaya.com>
Message-ID: <ca471dc20501180917674964d0@mail.gmail.com>

[Timothy Delaney]
> If im_func were set to the class where the function was defined, I could
> definitely avoid the second part of the trawling (not sure about the
> first yet, since I need to get at the function object).

Instead of waiting for unbound methods to change their functionality,
just create a metaclass that sticks the attribute you want on the
function objects.
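
A sketch of what Guido suggests, in modern class syntax (in the Python 2 of the time this would use ``__metaclass__``; the ``im_defining_class`` attribute name is made up for illustration):

```python
import types

class RecordDefiningClass(type):
    """Metaclass that stamps each plain function in a class body
    with the class that defined it."""
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        for value in namespace.values():
            if isinstance(value, types.FunctionType):
                value.im_defining_class = cls   # hypothetical attribute
        return cls

class Base(metaclass=RecordDefiningClass):
    def method(self):
        return 42

class Derived(Base):
    pass

# The function remembers where it was defined, even when it is
# reached through a subclass:
assert Derived.method.im_defining_class is Base
```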

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From gvanrossum at gmail.com  Tue Jan 18 18:32:49 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan 18 18:32:55 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <41ECD6D9.9000001@egenix.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<1105945019.30052.26.camel@localhost>
	<ca471dc2050117074347baa60d@mail.gmail.com>
	<41EC431A.90204@egenix.com>
	<ca471dc20501172115275a99c9@mail.gmail.com>
	<41ECD6D9.9000001@egenix.com>
Message-ID: <ca471dc2050118093239bdad02@mail.gmail.com>

[me]
> > I'm not sure I understand how basemethod is supposed to work; I can't
> > find docs for it using Google (only three hits for the query mxTools
> > basemethod). How does it depend on im_class?

[Marc-Andre]
> It uses im_class to find the class defining the (unbound) method:
> 
> def basemethod(object,method=None):
> 
>      """ Return the unbound method that is defined *after* method in the
>          inheritance order of object with the same name as method
>          (usually called base method or overridden method).
> 
>          object can be an instance, class or bound method. method, if
>          given, may be a bound or unbound method. If it is not given,
>          object must be bound method.
> 
>          Note: Unbound methods must be called with an instance as first
>          argument.
> 
>          The function uses a cache to speed up processing. Changes done
>          to the class structure after the first hit will not be noticed
>          by the function.
> 
>      """
>      ...
> 
> This is how it is used in mixin classes to call the base
> method of the overridden method in the inheritance tree (of
> old-style classes):
> 
> class RequestListboxMixin:
> 
>      def __init__(self,name,viewname,viewdb,context=None,use_clipboard=0,
>                   size=None,width=None,monospaced=1,events=None):
> 
>          # Call base method
>          mx.Tools.basemethod(self, RequestListboxMixin.__init__)\
>                             (self,name,size,width,monospaced,None,events)
> 
>          ...
> 
> Without .im_class for the unbound method, basemethod would
> cease to work since it uses this attribute to figure out
> the class object defining the overriding method.

Well, you could always do what Timothy Delaney's autosuper recipe
does: crawl the class structure starting from object.__class__ until
you find the requested method. Since you're using a cache the extra
cost should be minimal.
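
For new-style classes that crawl can be as simple as walking the MRO (a sketch with a hypothetical name; the real mx.Tools basemethod also handles old-style classes and caches its results):

```python
def defining_class(cls, name):
    """Return the most derived class in cls.__mro__ whose own
    namespace defines `name`."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    raise AttributeError(name)

class A:
    def f(self):
        return "A.f"

class B(A):
    pass

# Found by crawling the class structure -- no im_class needed:
assert defining_class(B, "f") is A
```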

I realize that this requires you to issue a new release of mxTools to
support this, but you probably want to do one anyway to support other
2.5 features.

> Hmm, I have a hard time seeing how you can get rid
> off unbound methods while keeping bound methods - since
> both are the same type :-)

Easy. There is a lot of code in the instance method type specifically
to support the case where im_self is NULL. All that code can be
deleted (once built-in exceptions stop using it).

> I'm using PyMethod_Check() in mxProxy to automatically
> wrap methods of proxied object in order to prevent references
> to the object class or the object itself to slip by the
> proxy. Changing the type to function object and placing
> the class information into a function attribute would break
> this approach. Apart from that the type change (by itself)
> would not affect the eGenix code base.

Isn't mxProxy a weak referencing scheme? Is it still useful given
Python's own support for weak references?

> I would expect code in the following areas to make use
> of the type check:
> * language interface code (e.g. Java, .NET bridges)

Java doesn't have the concept of unbound methods, so I doubt it's
useful there. Remember that, as far as calling it goes, the unbound
method has no advantages over the function!

> * security code that tries to implement object access control

Security code should handle plain functions just as well as (un)bound
methods anyway.

> * RPC applications that use introspection to generate
>    interface definitions (e.g. WSDL service definitions)

Why would those care about unbound methods?

> * debugging tools (e.g. IDEs)

Hopefully those will use the filename + line number information in the
function object. Remember, by the time the function is called, the
(un)bound method object is unavailable.

> >>If you want to make methods look more like functions,
> >>the method object should become a subclass of the function
> >>object (function + added im_* attributes).
> >
> > Can't do that, since the (un)bound method object supports binding
> > other callables besides functions.
> 
> Is this feature used anywhere ?

Yes, by the built-in exception code. (It surprised me too; I think in
modern days it would have been done using a custom descriptor.)

BTW, decorators and other descriptors are one reason why approaches
that insist on im_class being there will have a diminishing value in
the future.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From mal at egenix.com  Tue Jan 18 18:38:34 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue Jan 18 18:38:36 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <41ED25C6.80603@livinglogic.de>
References: <41ED25C6.80603@livinglogic.de>
Message-ID: <41ED499A.1050206@egenix.com>

Walter Dörwald wrote:
> __str__ and __unicode__ seem to behave differently. A __str__
> override in a str subclass is used when calling str(), but a __unicode__
> override in a unicode subclass is *not* used when calling unicode():
> 
> -------------------------------
> class str2(str):
>     def __str__(self):
>         return "foo"
> 
> x = str2("bar")
> print str(x)
> 
> class unicode2(unicode):
>     def __unicode__(self):
>         return u"foo"
> 
> x = unicode2(u"bar")
> print unicode(x)
> -------------------------------
> 
> This outputs:
> foo
> bar
> 
> IMHO this should be fixed so that __unicode__() is used in the
> second case too.

If you drop the base class for unicode, this already works.

This code in object.c:PyObject_Unicode() is responsible for
the sub-class version not doing what you'd expect:

	if (PyUnicode_Check(v)) {
		/* For a Unicode subtype that's not a Unicode object,
		   return a true Unicode object with the same data. */
		return PyUnicode_FromUnicode(PyUnicode_AS_UNICODE(v),
					     PyUnicode_GET_SIZE(v));
	}

So the question is whether conversion of a Unicode sub-type
to a true Unicode object should honor __unicode__ or not.

The same question can be asked for many other types, e.g.
floats (and __float__), integers (and __int__), etc.

 >>> class float2(float):
...     def __float__(self):
...             return 3.141
...
 >>> float(float2(1.23))
1.23
 >>> class int2(int):
...     def __int__(self):
...             return 42
...
 >>> int(int2(123))
123

I think we need general consensus on what the strategy
should be: honor these special hooks in conversions
to base types or not?

Maybe the string case is the real problem ... :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 10 2005)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From mwh at python.net  Tue Jan 18 19:13:29 2005
From: mwh at python.net (Michael Hudson)
Date: Tue Jan 18 19:13:32 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <2my8eqbrk2.fsf@starship.python.net> (Michael Hudson's message
	of "Tue, 18 Jan 2005 17:00:45 +0000")
References: <fb6fbf560501141620eff6d85@mail.gmail.com>
	<20050117105219.GA12763@vicky.ecs.soton.ac.uk>
	<ca471dc205011707277b386ec8@mail.gmail.com>
	<2mbrboca5r.fsf@starship.python.net>
	<5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com>
	<ca471dc2050117101654e0116f@mail.gmail.com>
	<41EC38DE.8080603@v.loewis.de> <2my8eqbrk2.fsf@starship.python.net>
Message-ID: <2mu0pebo6u.fsf@starship.python.net>

Michael Hudson <mwh@python.net> writes:

> I hope to have a new patch (which makes PyExc_Exception new-style, but
> allows arbitrary old-style classes as exceptions) "soon".  It may even
> pass bits of "make test" :)

Done: http://www.python.org/sf/1104669

It passed 'make test' apart from failures I really don't think are my
fault.  I'll run "regrtest -uall" overnight...

Cheers,
mwh

-- 
[1] If you're lost in the woods, just bury some fibre in the ground
    carrying data. Fairly soon a JCB will be along to cut it for you
    - follow the JCB back to civilsation/hitch a lift.
                                               -- Simon Burr, cam.misc
From irmen at xs4all.nl  Tue Jan 18 20:32:43 2005
From: irmen at xs4all.nl (Irmen de Jong)
Date: Tue Jan 18 20:32:45 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <41EA9196.1020709@xs4all.nl>
References: <41EA9196.1020709@xs4all.nl>
Message-ID: <41ED645B.40709@xs4all.nl>

Irmen de Jong wrote:
> Hello
> I've looked at one bug and a bunch of patches and
> added a comment to them:
[...]

> [ 579435 ] Shadow Password Support Module
> Would be nice to have, I recently just couldn't do the user
> authentication that I wanted: based on the users' unix passwords

I'm almost done with completing this thing.
(including doc and unittest).
However:
1- I can't add new files to this tracker item.
    Should I open a new patch and refer to it?
2- As shadow passwords can only be retrieved when
    you are root, is a unit test module even useful?
3- Should the order of the chapters in the documentation
    be preserved? I'd rather add spwd below pwd, but
    this pushes the other unix modules "1 down"...

--Irmen

From martin at v.loewis.de  Tue Jan 18 23:17:46 2005
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue Jan 18 23:17:45 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <41ED645B.40709@xs4all.nl>
References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl>
Message-ID: <41ED8B0A.7050201@v.loewis.de>

Irmen de Jong wrote:
> 1- I can't add new files to this tracker item.
>    Should I open a new patch and refer to it?

Depends on whether you want tracker admin access (i.e.
become a SF python project member). If you do,
you could attach patches to bug reports not
written by you.

> 2- As shadow passwords can only be retrieved when
>    you are root, is a unit test module even useful?

Probably not. Alternatively, introduce a "root" resource,
and make that test depend on the presence of the root resource.

> 3- Should the order of the chapters in the documentation
>    be preserved? I'd rather add spwd below pwd, but
>    this pushes the other unix modules "1 down"...

You could make it a subsection (e.g. "spwd -- shadow passwords").
Not sure whether this would be supported by the processing
tools; if not, inserting the module in the middle might be
acceptable.

In any case, what is important is that the documentation is
added - it can always be rearranged later.

Regards,
Martin
From fdrake at acm.org  Tue Jan 18 23:23:34 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue Jan 18 23:23:47 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <41ED8B0A.7050201@v.loewis.de>
References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl>
	<41ED8B0A.7050201@v.loewis.de>
Message-ID: <200501181723.35070.fdrake@acm.org>

Irmen de Jong wrote:
 > 3- Should the order of the chapters in the documentation
 >    be preserved? I'd rather add spwd below pwd, but
 >    this pushes the other unix modules "1 down"...

On Tuesday 18 January 2005 17:17, Martin v. Löwis wrote:
 > You could make it a subsection (e.g. "spwd -- shadow passwords")
 > Not sure whether this would be supported by the processing
 > tools; if not, inserting the module in the middle might be
 > acceptable.

I see no reason not to insert it right after pwd module docs.  The order of 
the sections is not a backward compatibility concern.  :-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From tdelaney at avaya.com  Tue Jan 18 23:54:59 2005
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Tue Jan 18 23:55:25 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE025202BC@au3010avexu1.global.avaya.com>

Guido van Rossum wrote:

> [Timothy Delaney]
>> If im_func were set to the class where the function was defined, I
>> could definitely avoid the second part of the trawling (not sure
>> about the first yet, since I need to get at the function object).
> 
> Instead of waiting for unbound methods to change their functionality,
> just create a metaclass that sticks the attribute you want on the
> function objects.

Yep - that's one approach I've considered. I've also thought about
modifying the code objects, which would mean I could grab the base class
directly.

It's definitely not the most compelling use case in the world ;)

Tim Delaney
From irmen at xs4all.nl  Wed Jan 19 00:25:54 2005
From: irmen at xs4all.nl (Irmen de Jong)
Date: Wed Jan 19 00:25:56 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <41ED8B0A.7050201@v.loewis.de>
References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl>
	<41ED8B0A.7050201@v.loewis.de>
Message-ID: <41ED9B02.2040908@xs4all.nl>

Martin,

> Irmen de Jong wrote:
> 
>> 1- I can't add new files to this tracker item.
>>    Should I open a new patch and refer to it?
> 
> 
> Depends on whether you want tracker admin access (i.e.
> become a SF python project member). If you do,
> you could attach patches to bug reports not
> written by you.

That sounds very convenient, thanks.
Does the status of 'python project member' come with
certain expectations that must be complied with ? ;-)

>> 2- As shadow passwords can only be retrieved when
>>    you are root, is a unit test module even useful?
> 
> 
> Probably not. Alternatively, introduce a "root" resource,
> and make that test depend on the presence of the root resource.

I'm not sure what this "resource" is actually.
I have seen them pass on my screen when executing the
regression tests (resource "network" is not enabled, etc)
but never paid much attention to them.
Are they used to select optional parts of the test suite
that can only be run in certain conditions?


> In any case, what is important is that the documentation is
> added - it can always be rearranged later.

I've copied and adapted the "pwd" module chapter.

I'll try to have a complete patch ready tomorrow night.


Bye,
-Irmen.
From tim.peters at gmail.com  Wed Jan 19 01:03:17 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Wed Jan 19 01:03:24 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <41ED9B02.2040908@xs4all.nl>
References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl>
	<41ED8B0A.7050201@v.loewis.de> <41ED9B02.2040908@xs4all.nl>
Message-ID: <1f7befae0501181603732388f0@mail.gmail.com>

[Martin asks whether Irmen wants to be a tracker admin on SF]

[Irmen de Jong]
> That sounds very convenient, thanks.
> Does the status of 'python project member' come with
> certain expectations that must be complied with ? ;-)

If you're using Python, you're already required to comply with all of
Guido's demands; this would just make it more official.  Kinda like
the difference in sanctifying cohabitation with a marriage ceremony
<wink>.

OK, really, the minimum required of Python project members is that
they pay some attention to Python-Dev.

>>> 2- As shadow passwords can only be retrieved when
>>>    you are root, is a unit test module even useful?

>> Probably not. Alternatively, introduce a "root" resource,
>> and make that test depend on the presence of the root resource.
 
> I'm not sure what this "resource" is actually.
> I have seen them pass on my screen when executing the
> regression tests (resource "network" is not enabled, etc)
> but never paid much attention to them.
> Are they used to select optional parts of the test suite
> that can only be run in certain conditions?

That's right, where "the condition" is precisely that you tell
regrtest.py to enable a (one or more) named resource.  There's no
intelligence involved.  "Resource names" are arbitrary, and can be
passed to regrtest.py's -u argument.  See regrtest's docstring for
details.  For example, to run the tests that require the network
resource, pass "-u network".  Then it will run network tests,
regardless of whether a network is actually available.  Passing "-u
all" makes it try to run all tests.
From mdehoon at ims.u-tokyo.ac.jp  Wed Jan 19 04:03:30 2005
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Wed Jan 19 04:00:08 2005
Subject: [Python-Dev] Patch review [ 684500 ] extending readline
	functionality
Message-ID: <41EDCE02.3020505@ims.u-tokyo.ac.jp>

Patch review [ 684500 ] (extending readline functionality)

This patch is a duplicate of patch [ 675551 ] (extending readline 
functionality), which was first submitted against stable python version 2.2.2. 
After the resubmitted patch [ 684500 ] against Python 2.3a1 was accepted 
(Modules/readline.c revision 2.73 and Doc/lib/libreadline.tex revision 1.16), 
the original patch [ 675551 ] was closed but patch [ 684500 ] was not. I have 
added a comment to patch [ 684500 ] that it can be closed.

--Michiel.

-- 
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon
From abo at minkirri.apana.org.au  Wed Jan 19 06:16:09 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Wed Jan 19 06:16:47 2005
Subject: [Python-Dev] Strange segfault in Python threads and linux kernel 2.6
Message-ID: <1106111769.3822.52.camel@schizo>

G'day,

I've Cc'ed this to zope-coders as it might affect other Zope developers
and it had me stumped for ages. I couldn't find anything on it anywhere,
so I figured it would be good to get something into google :-).

We are developing a Zope2.7 application on Debian GNU/Linux that is
using fop to generate pdf's from xml-fo data. fop is a java thing, and
we are using popen2.Popen3(), non-blocking mode, and select loop to
write/read stdin/stdout/stderr. This was all working fine.

Then over the Christmas chaos, various things on my development system
were apt-get updated, and I noticed that java/fop had started
segfaulting. I tried running fop with the exact same input data from the
command line; it worked. I wrote a python script that invoked fop in
exactly the same way as we were invoking it inside zope; it worked. It
only segfaulted when invoked inside Zope.

I googled and tried everything... switched from j2re1.4 to kaffe, rolled
back to a previous version of python, re-built Zope, upgraded Zope from
2.7.2 to 2.7.4, nothing helped. Then I went back from a linux 2.6.8
kernel to a 2.4.27 kernel; it worked!

After googling around, I found references to recent attempts to resolve
some signal handling problems in Python threads. There was one post that
mentioned subtle differences between how Linux 2.4 and Linux 2.6 did
signals to threads.

So it seems this is a problem with Python threads and Linux kernel 2.6.
The attached program demonstrates that it has nothing to do with Zope.
Using it to run "fop-test /usr/bin/fop </dev/null" on a Debian box with
fop installed will show the segfault. Running the same thing on a
machine with 2.4 kernel will instead get the fop "usage" message. It is
not a generic fop/java problem with 2.6 because the commented
un-threaded line works fine. It doesn't segfault for every
command... "cat -" works OK, so something specific to java must be
contributing.

After searching the Python bugs, the closest I could find was #971213
<http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=971213>. Is this the same bug? Should I submit a new bug report? Is there any other way I can help resolve this?

BTW, built in file objects really could use better non-blocking
support... I've got a half-drafted PEP for it... anyone interested in
it?

-- 
Donovan Baarda <abo@minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-fop.py
Type: application/x-python
Size: 1685 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20050119/1f7dd103/test-fop-0001.bin
From walter at livinglogic.de  Wed Jan 19 10:40:46 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed Jan 19 10:40:49 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <41ED499A.1050206@egenix.com>
References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com>
Message-ID: <41EE2B1E.8030209@livinglogic.de>

M.-A. Lemburg wrote:

> Walter Dörwald wrote:
> 
>> __str__ and __unicode__ seem to behave differently. A __str__
>> override in a str subclass is used when calling str(), but a __unicode__
>> override in a unicode subclass is *not* used when calling unicode():
>>
>> [...]
> 
> If you drop the base class for unicode, this already works.

That's cheating! ;)

My use case is an XML DOM API: __unicode__() should extract the
character data from the DOM. For Text nodes this is the text,
for comments and processing instructions this is u"" etc. To
reduce memory footprint and to inherit all the unicode methods,
it would be good if Text, Comment and ProcessingInstruction could
be subclasses of unicode.

> This code in object.c:PyObject_Unicode() is responsible for
> the sub-class version not doing what you'd expect:
> 
>     if (PyUnicode_Check(v)) {
>         /* For a Unicode subtype that's not a Unicode object,
>            return a true Unicode object with the same data. */
>         return PyUnicode_FromUnicode(PyUnicode_AS_UNICODE(v),
>                          PyUnicode_GET_SIZE(v));
>     }
> 
> So the question is whether conversion of a Unicode sub-type
> to a true Unicode object should honor __unicode__ or not.
> 
> The same question can be asked for many other types, e.g.
> floats (and __float__), integers (and __int__), etc.
> 
>  >>> class float2(float):
> ...     def __float__(self):
> ...             return 3.141
> ...
>  >>> float(float2(1.23))
> 1.23
>  >>> class int2(int):
> ...     def __int__(self):
> ...             return 42
> ...
>  >>> int(int2(123))
> 123
> 
> I think we need general consensus on what the strategy
> should be: honor these special hooks in conversions
> to base types or not ?

I'd say, these hooks should be honored, because it gives
us more possibilities: If you want the original value,
simply don't implement the hook.
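
For comparison, this is the behavior being asked for, and ``str()`` already exhibits it -- shown here in modern Python, where ``str`` plays the role that ``unicode`` played then:

```python
class Text(str):
    """str subclass whose conversion hook returns different data,
    like the XML DOM use case: extract character data on str()."""
    def __str__(self):
        return "character data"

t = Text("raw")
assert str(t) == "character data"   # the hook IS honored for str()
assert t == "raw"                   # the instance keeps its own value
```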

> Maybe the string case is the real problem ... :-)

At least it seems that the string case is the exception.

So if we fix __str__ this would be a bugfix for 2.4.1.
If we fix the rest, this would be a new feature for 2.5.

Bye,
    Walter Dörwald
From bob at redivi.com  Wed Jan 19 11:10:36 2005
From: bob at redivi.com (Bob Ippolito)
Date: Wed Jan 19 11:10:42 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <41EE2B1E.8030209@livinglogic.de>
References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com>
	<41EE2B1E.8030209@livinglogic.de>
Message-ID: <5D1927D4-6A02-11D9-BB54-000A95BA5446@redivi.com>

On Jan 19, 2005, at 4:40, Walter Dörwald wrote:

> M.-A. Lemburg wrote:
>
>> Walter Dörwald wrote:
>>> __str__ and __unicode__ seem to behave differently. A __str__
>>> override in a str subclass is used when calling str(), but a __unicode__
>>> override in a unicode subclass is *not* used when calling unicode():
>>>
>>> [...]
>> If you drop the base class for unicode, this already works.
>
> That's cheating! ;)
>
> My use case is an XML DOM API: __unicode__() should extract the
> character data from the DOM. For Text nodes this is the text,
> for comments and processing instructions this is u"" etc. To
> reduce memory footprint and to inherit all the unicode methods,
> it would be good if Text, Comment and ProcessingInstruction could
> be subclasses of unicode.

It sounds like a really bad idea to have a class that supports both of 
these properties:
- unicode as a base class
- non-trivial result from unicode(foo)

Do you REALLY think this should be True?!
     isinstance(foo, unicode) and foo != unicode(foo)

Why don't you just call this "extract character data" method something 
other than __unicode__?  That way, you get the reduced memory footprint 
and convenience methods of unicode, with none of the craziness.
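
Bob's alternative can be sketched as follows; this is an illustrative design with invented names, written with str standing in for 2.x unicode:

```python
# Illustrative sketch of that design (names are invented; str stands in
# for 2.x unicode): nodes subclass str for its methods and footprint,
# but character-data extraction gets its own method, not __unicode__.
class Text(str):
    """A text node: its character data is the text itself."""
    def characters(self):
        return str(self)

class Comment(str):
    """A comment node: its character data is empty."""
    def characters(self):
        return ""

t = Text("hello")
assert t.upper() == "HELLO"                  # inherited str methods work
assert t.characters() == "hello"
assert Comment("a note").characters() == ""  # str(x) stays unsurprising
```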

-bob

From walter at livinglogic.de  Wed Jan 19 12:19:14 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed Jan 19 12:19:17 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <5D1927D4-6A02-11D9-BB54-000A95BA5446@redivi.com>
References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com>
	<41EE2B1E.8030209@livinglogic.de>
	<5D1927D4-6A02-11D9-BB54-000A95BA5446@redivi.com>
Message-ID: <41EE4232.9070409@livinglogic.de>

Bob Ippolito wrote:
> On Jan 19, 2005, at 4:40, Walter Dörwald wrote:
> 
>> [...]
>> That's cheating! ;)
>>
>> My use case is an XML DOM API: __unicode__() should extract the
>> character data from the DOM. For Text nodes this is the text,
>> for comments and processing instructions this is u"" etc. To
>> reduce memory footprint and to inherit all the unicode methods,
>> it would be good if Text, Comment and ProcessingInstruction could
>> be subclasses of unicode.
> 
> It sounds like a really bad idea to have a class that supports both of 
> these properties:
> - unicode as a base class
> - non-trivial result from unicode(foo)
> 
> Do you REALLY think this should be True?!
>     isinstance(foo, unicode) and foo != unicode(foo)
> 
> Why don't you just call this "extract character data" method something 
> other than __unicode__?

IMHO __unicode__ is the most natural and logical choice.
isinstance(foo, unicode) is just an implementation detail.

But you're right: the consequences of this can be a bit scary.

> That way, you get the reduced memory footprint 
> and convenience methods of unicode, with none of the craziness.

Without this craziness we wouldn't have discovered the problem. ;)
Whether this craziness gets implemented depends on the solution
to this problem.

Bye,
    Walter Dörwald
From aleax at aleax.it  Wed Jan 19 12:22:44 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 19 12:23:03 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <5D1927D4-6A02-11D9-BB54-000A95BA5446@redivi.com>
References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com>
	<41EE2B1E.8030209@livinglogic.de>
	<5D1927D4-6A02-11D9-BB54-000A95BA5446@redivi.com>
Message-ID: <708E5DA6-6A0C-11D9-9DED-000A95EFAE9E@aleax.it>


On 2005 Jan 19, at 11:10, Bob Ippolito wrote:

> Do you REALLY think this should be True?!
>     isinstance(foo, unicode) and foo != unicode(foo)

Hmmmm -- why not?  In the generic case, talking about some class B, it 
certainly violates no programming principle known to me that 
"isinstance(foo, B) and foo != B(foo)"; it seems a rather common case 
-- ``casting to the base class'' (in C++ terminology, I guess) ``slices 
off'' some parts of foo, and thus equality does not hold.  If this is 
specifically a bad idea for the specific case where B is unicode, OK, 
that's surely possible, but if so it seems it should be possible to 
explain this in terms of particular properties of type unicode.
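
A minimal illustration of the generic case described here, with hypothetical classes in modern syntax: converting to the base class "slices off" subclass state, so equality no longer holds.

```python
# Hypothetical classes showing the generic "slicing" case: B(foo) keeps
# only the base-class state, so foo != B(foo) even though
# isinstance(foo, B) is true.
class B:
    def __init__(self, x=0):
        # accept either a plain value or another B-like object
        self.x = getattr(x, "x", x)
    def __eq__(self, other):
        return isinstance(other, B) and self.x == other.x

class D(B):
    def __init__(self, x, extra):
        super().__init__(x)
        self.extra = extra          # subclass state that B(foo) drops
    def __eq__(self, other):
        return (super().__eq__(other)
                and self.extra == getattr(other, "extra", None))

foo = D(1, extra="sliced off")
assert isinstance(foo, B)
assert foo != B(foo)                # casting to the base lost 'extra'
```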


Alex

From mal at egenix.com  Wed Jan 19 12:27:29 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed Jan 19 12:27:32 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <ca471dc2050118093239bdad02@mail.gmail.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>	
	<1105945019.30052.26.camel@localhost>	
	<ca471dc2050117074347baa60d@mail.gmail.com>	
	<41EC431A.90204@egenix.com>	
	<ca471dc20501172115275a99c9@mail.gmail.com>	
	<41ECD6D9.9000001@egenix.com>
	<ca471dc2050118093239bdad02@mail.gmail.com>
Message-ID: <41EE4421.80303@egenix.com>

Guido van Rossum wrote:
> [me]
> 
>>>I'm not sure I understand how basemethod is supposed to work; I can't
>>>find docs for it using Google (only three hits for the query mxTools
>>>basemethod). How does it depend on im_class?
> 
> 
> [Marc-Andre]
> 
>>It uses im_class to find the class defining the (unbound) method:
>>
>>def basemethod(object,method=None):
>>
>>     """ Return the unbound method that is defined *after* method in the
>>         inheritance order of object with the same name as method
>>         (usually called base method or overridden method).
>>
>>         object can be an instance, class or bound method. method, if
>>         given, may be a bound or unbound method. If it is not given,
>>         object must be bound method.
>>
>>         Note: Unbound methods must be called with an instance as first
>>         argument.
>>
>>         The function uses a cache to speed up processing. Changes done
>>         to the class structure after the first hit will not be noticed
>>         by the function.
>>
>>     """
>>     ...
>>
>>This is how it is used in mixin classes to call the base
>>method of the overridden method in the inheritance tree (of
>>old-style classes):
>>
>>class RequestListboxMixin:
>>
>>     def __init__(self,name,viewname,viewdb,context=None,use_clipboard=0,
>>                  size=None,width=None,monospaced=1,events=None):
>>
>>         # Call base method
>>         mx.Tools.basemethod(self, RequestListboxMixin.__init__)\
>>                            (self,name,size,width,monospaced,None,events)
>>
>>         ...
>>
>>Without .im_class for the unbound method, basemethod would
>>cease to work since it uses this attribute to figure out
>>the class object defining the overriding method.
> 
> 
> Well, you could always do what Timothy Delaney's autosuper recipe
> does: crawl the class structure starting from object.__class__ until
> you find the requested method. Since you're using a cache the extra
> cost should be minimal.

That won't work, since basemethod() is intended for standard
classes (not new-style ones).
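
For new-style classes, the crawl Guido suggests might look roughly like this hypothetical re-implementation (not the mxTools code; as noted, classic classes have no __mro__ to walk):

```python
# Hypothetical MRO-crawl version of basemethod (new-style classes only).
def basemethod(obj, method):
    """Return the implementation that `method` overrides, found by
    scanning the MRO of obj's class past the class defining `method`."""
    name = method.__name__
    seen_definer = False
    for cls in type(obj).__mro__:
        impl = cls.__dict__.get(name)
        if impl is None:
            continue
        if seen_definer:
            return impl             # first definition after the definer
        if impl is method:
            seen_definer = True     # found the class that defines it
    raise AttributeError(name)

class Base:
    def greet(self):
        return "base"

class Mixin(Base):
    def greet(self):
        # call the overridden method without naming Base explicitly
        return "mixin+" + basemethod(self, Mixin.greet)(self)

assert Mixin().greet() == "mixin+base"
```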

> I realize that this requires you to issue a new release of mxTools to
> support this, but you probably want to do one anyway to support other
> 2.5 features.

A new release wouldn't be much trouble, but I don't see any
way to fix the basemethod() implementation without also requiring
a change to the function arguments.

Current usage is basemethod(instance, unbound_method). In order
for basemethod to still be able to find the right class and
method name, I'd have to change that to basemethod(instance,
unbound_method_or_function, class_defining_method).

This would require all users of mx.Tools.basemethod() to
update their code base. Users will probably not understand
why this change is necessary, since they'd have to write
the class twice:

mx.Tools.basemethod(self,
                     RequestListboxMixin.__init__,
                     RequestListboxMixin)\
                     (self,name,size,width,monospaced,None,events)

Dropping the unbound method basically loses expressiveness:
without extra help from function attributes or other descriptors,
it would no longer be possible to tell whether a function is to be
used as a method or as a function. The defining namespace of the method
would also not be available anymore.

>>Hmm, I have a hard time seeing how you can get rid
>>off unbound methods while keeping bound methods - since
>>both are the same type :-)
> 
> Easy. There is a lot of code in the instance method type specifically
> to support the case where im_self is NULL. All that code can be
> deleted (once built-in exceptions stop using it).

So this is not about removing a type, but about removing
extra code. You'd still keep bound methods as separate
type.

>>I'm using PyMethod_Check() in mxProxy to automatically
>>wrap methods of proxied object in order to prevent references
>>to the object class or the object itself to slip by the
>>proxy. Changing the type to function object and placing
>>the class information into a function attribute would break
>>this approach. Apart from that the type change (by itself)
>>would not affect the eGenix code base.
> 
> 
> Isn't mxProxy a weak referencing scheme? Is it still useful given
> Python's own support for weak references?

Sure. First of all, mxProxy is more than just a weak referencing
scheme (in fact, that was only an add-on feature).

mxProxy allows you to wrap any Python object in way that hides
the object from the rest of the Python interpreter, putting
access to the object under fine-grained and strict control.
This is the main application space for mxProxy.

The weak reference feature was added later on, to work around
problems with circular references. Unlike the Python weak
referencing scheme, mxProxy allows creating weak references
to all Python objects (not just the ones that support the
Python weak reference protocol).

>>I would expect code in the following areas to make use
>>of the type check:
>>* language interface code (e.g. Java, .NET bridges)
> 
> Java doesn't have the concept of unbound methods, so I doubt it's
> useful there. Remember that as far as how you call it, the unbound
> method has no advantages over the function!

True, but you know that it's a method and not just a function.
That can make a difference in how you implement the call.

>>* security code that tries to implement object access control
> 
> 
> Security code should handle plain functions just as well as (un)bound
> methods anyway.

It is very important for security code to know which attributes
are available on an object, e.g. an unbound method includes
the class object, which in turn has a reference to the module,
the builtins, etc.

Functions currently don't have this problem (but will once you add the
im_class attribute ;-).

>>* RPC applications that use introspection to generate
>>   interface definitions (e.g. WSDL service definitions)
> 
> 
> Why would those care about unbound methods?

I was thinking of iterating over the list of methods in
a class:

To find out which of the entries in the class
dict are methods, a tool would have to check whether
myClass.myMethod maps to an unbound method (e.g.
non-function callables are currently not wrapped as
unbound methods).

Example:

 >>> class C:
...     def test(self):
...             print 'hello'
...     test1 = dict
...
 >>> C.test1
<type 'dict'>
 >>> C.test
<unbound method C.test>
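
In the absence of unbound-method wrappers, the kind of check described above might be sketched with inspect (an illustrative, modern-Python sketch, not code from any actual RPC tool):

```python
# Illustrative sketch: distinguish real methods from other callables in
# a class dict without relying on unbound-method wrappers.
import inspect

class C:
    def test(self):
        print('hello')
    test1 = dict        # callable, but not a method defined in C

methods = [name for name, obj in vars(C).items()
           if inspect.isfunction(obj)]
assert methods == ['test']      # test1 is filtered out
```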

>>* debugging tools (e.g. IDEs)
> 
> Hopefully those will use the filename + line number information in the
> function object. Remember, by the time the function is called, the
> (un)bound method object is unavailable.

I was thinking more in terms of being able to tell whether
a function is a method or not. IDEs might want to help
the user by placing a "(self," right after she types the
name of an unbound method.

>>>>If you want to make methods look more like functions,
>>>>the method object should become a subclass of the function
>>>>object (function + added im_* attributes).
>>>
>>>Can't do that, since the (un)bound method object supports binding
>>>other callables besides functions.
>>
>>Is this feature used anywhere ?
> 
> Yes, by the built-in exception code. (It surprised me too; I think in
> modern days it would have been done using a custom descriptor.)
> 
> BTW, decorators and other descriptors are one reason why approaches
> that insist on im_class being there will have a diminishing value in
> the future.

True, as long as you put the information from im_class somewhere
else (where it's easily accessible). However, I wouldn't want
to start writing

@method
def funcname(self, arg0, arg1):
     return 42

just to tell Python that this particular function will only
be used as method ;-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 10 2005)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From mal at egenix.com  Wed Jan 19 12:42:15 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed Jan 19 12:42:17 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <41EE2B1E.8030209@livinglogic.de>
References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com>
	<41EE2B1E.8030209@livinglogic.de>
Message-ID: <41EE4797.6030105@egenix.com>

Walter Dörwald wrote:
> M.-A. Lemburg wrote:
>> So the question is whether conversion of a Unicode sub-type
>> to a true Unicode object should honor __unicode__ or not.
>>
>> The same question can be asked for many other types, e.g.
>> floats (and __float__), integers (and __int__), etc.
>>
>>  >>> class float2(float):
>> ...     def __float__(self):
>> ...             return 3.141
>> ...
>>  >>> float(float2(1.23))
>> 1.23
>>  >>> class int2(int):
>> ...     def __int__(self):
>> ...             return 42
>> ...
>>  >>> int(int2(123))
>> 123
>>
>> I think we need general consensus on what the strategy
>> should be: honor these special hooks in conversions
>> to base types or not ?
> 
> 
> I'd say, these hooks should be honored, because it gives
> us more possibilities: If you want the original value,
> simply don't implement the hook.
> 
>> Maybe the string case is the real problem ... :-)
> 
> 
> At least it seems that the string case is the exception.

Indeed.

> So if we fix __str__ this would be a bugfix for 2.4.1.
> If we fix the rest, this would be a new feature for 2.5.

I have a feeling that we're better off with the bug fix than
the new feature.

__str__ and __unicode__ as well as the other hooks were
specifically added for the type constructors to use.
However, these were added at a time where sub-classing
of types was not possible, so it's time now to reconsider
whether this functionality should be extended to sub-classes
as well.

-- 
Marc-Andre Lemburg
eGenix.com

From ncoghlan at iinet.net.au  Wed Jan 19 13:26:14 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Wed Jan 19 13:26:23 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <41EE4797.6030105@egenix.com>
References: <41ED25C6.80603@livinglogic.de>
	<41ED499A.1050206@egenix.com>	<41EE2B1E.8030209@livinglogic.de>
	<41EE4797.6030105@egenix.com>
Message-ID: <41EE51E6.3090708@iinet.net.au>

M.-A. Lemburg wrote:
>> So if we fix __str__ this would be a bugfix for 2.4.1.
>> If we fix the rest, this would be a new feature for 2.5.
> 
> 
> I have a feeling that we're better off with the bug fix than
> the new feature.
> 
> __str__ and __unicode__ as well as the other hooks were
> specifically added for the type constructors to use.
> However, these were added at a time where sub-classing
> of types was not possible, so it's time now to reconsider
> whether this functionality should be extended to sub-classes
> as well.

It seems oddly inconsistent though:

"""Define __str__ to determine what your class returns for str(x).

NOTE: This won't work if your class directly or indirectly inherits from str. If 
that is the case, you cannot alter the results of str(x)."""

At present, most of the type constructors need the caveat, whereas __str__ 
actually agrees with the simple explanation in the first line.

Going back to PyUnicode, PyObject_Unicode's handling of subclasses of builtins 
is decidedly odd:

Py> class C(str):
...   def __str__(self): return "I am a string!"
...   def __unicode__(self): return "I am not unicode!"
...
Py> c = C()
Py> str(c)
'I am a string!'
Py> unicode(c)
u''

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From mal at egenix.com  Wed Jan 19 13:50:04 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed Jan 19 13:50:07 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <41EE51E6.3090708@iinet.net.au>
References: <41ED25C6.80603@livinglogic.de>	<41ED499A.1050206@egenix.com>	<41EE2B1E.8030209@livinglogic.de>	<41EE4797.6030105@egenix.com>
	<41EE51E6.3090708@iinet.net.au>
Message-ID: <41EE577C.3010405@egenix.com>

Nick Coghlan wrote:
> M.-A. Lemburg wrote:
> 
>>> So if we fix __str__ this would be a bugfix for 2.4.1.
>>> If we fix the rest, this would be a new feature for 2.5.
>>
>>
>>
>> I have a feeling that we're better off with the bug fix than
>> the new feature.
>>
>> __str__ and __unicode__ as well as the other hooks were
>> specifically added for the type constructors to use.
>> However, these were added at a time where sub-classing
>> of types was not possible, so it's time now to reconsider
>> whether this functionality should be extended to sub-classes
>> as well.
> 
> 
> It seems oddly inconsistent though:
> 
> """Define __str__ to determine what your class returns for str(x).
> 
> NOTE: This won't work if your class directly or indirectly inherits from 
> str. If that is the case, you cannot alter the results of str(x)."""
> 
> At present, most of the type constructors need the caveat, whereas 
> __str__ actually agrees with the simple explanation in the first line.
> 
> Going back to PyUnicode, PyObject_Unicode's handling of subclasses of 
> builtins is decidedly odd:

Those APIs were all written long before there were sub-classes
of types.

> Py> class C(str):
> ...   def __str__(self): return "I am a string!"
> ...   def __unicode__(self): return "I am not unicode!"
> ...
> Py> c = C()
> Py> str(c)
> 'I am a string!'
> Py> unicode(c)
> u''

Ah, looks as if the function needs a general overhaul :-)

This section should do a PyString_CheckExact() instead:

	if (PyString_Check(v)) {
		Py_INCREF(v);
		res = v;
	}

But before we start hacking the function, we need a general
picture of what we think is right.

Note, BTW, that there is also a tp_str slot that serves
as hook. The overall solution to this apparent mess should
be consistent for all hooks (__str__, tp_str, __unicode__
and a future tp_unicode).

-- 
Marc-Andre Lemburg
eGenix.com

From ncoghlan at iinet.net.au  Wed Jan 19 14:27:43 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Wed Jan 19 14:27:49 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <41EE577C.3010405@egenix.com>
References: <41ED25C6.80603@livinglogic.de>	<41ED499A.1050206@egenix.com>	<41EE2B1E.8030209@livinglogic.de>	<41EE4797.6030105@egenix.com>
	<41EE51E6.3090708@iinet.net.au> <41EE577C.3010405@egenix.com>
Message-ID: <41EE604F.3000006@iinet.net.au>

M.-A. Lemburg wrote:
> Those APIs were all written long before there were sub-classes
> of types.

Understood. PyObject_Unicode certainly looked like an 'evolved' piece of code :)

> But before we start hacking the function, we need a general
> picture of what we think is right.

Aye.

> Note, BTW, that there is also a tp_str slot that serves
> as hook. The overall solution to this apparent mess should
> be consistent for all hooks (__str__, tp_str, __unicode__
> and a future tp_unicode).

I imagine many people are like me, with __str__ being the only one of these 
hooks they use frequently (Helping out with the Decimal implementation is the 
only time I can recall using the slots for the numeric types, and I rarely need 
to deal with Unicode).

Anyway, their heavy use suggests to me that __str__ and str() are likely to 
provide a good model for the desired behaviour - they're the ones that are 
likely to have been nudged in the most useful direction by bug reports and the like.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From mwh at python.net  Wed Jan 19 14:37:11 2005
From: mwh at python.net (Michael Hudson)
Date: Wed Jan 19 14:37:14 2005
Subject: [Python-Dev] Strange segfault in Python threads and linux
	kernel 2.6
In-Reply-To: <1106111769.3822.52.camel@schizo> (Donovan Baarda's message of
	"Wed, 19 Jan 2005 16:16:09 +1100")
References: <1106111769.3822.52.camel@schizo>
Message-ID: <2mpt01bkvs.fsf@starship.python.net>

Donovan Baarda <abo@minkirri.apana.org.au> writes:

> G'day,
>
> I've Cc'ed this to zope-coders as it might affect other Zope developers
> and it had me stumped for ages. I couldn't find anything on it anywhere,
> so I figured it would be good to get something into google :-).
>
> We are developing a Zope2.7 application on Debian GNU/Linux that is
> using fop to generate pdf's from xml-fo data. fop is a java thing, and
> we are using popen2.Popen3(), non-blocking mode, and select loop to
> write/read stdin/stdout/stderr. This was all working fine.
>
> Then over the Christmas chaos, various things on my development system
> were apt-get updated, and I noticed that java/fop had started
> segfaulting. I tried running fop with the exact same input data from the
> command line; it worked. I wrote a python script that invoked fop in
> exactly the same way as we were invoking it inside zope; it worked. It
> only segfaulted when invoked inside Zope.
>
> I googled and tried everything... switched from j2re1.4 to kaffe, rolled
> back to a previous version of python, re-built Zope, upgraded Zope from
> 2.7.2 to 2.7.4, nothing helped. Then I went back from a linux 2.6.8
> kernel to a 2.4.27 kernel; it worked!
>
> After googling around, I found references to recent attempts to resolve
> some signal handling problems in Python threads. There was one post that
> mentioned subtle differences between how Linux 2.4 and Linux 2.6 did
> signals to threads.

You've left out a very important piece of information: which version
of Python you are using.  I'm guessing 2.3.4.  Can you try 2.4?

> So it seems this is a problem with Python threads and Linux kernel 2.6.
> The attached program demonstrates that it has nothing to do with Zope.
> Using it to run "fop-test /usr/bin/fop </dev/null" on a Debian box with
> fop installed will show the segfault. Running the same thing on a
> machine with 2.4 kernel will instead get the fop "usage" message. It is
> not a generic fop/java problem with 2.6 because the commented
> un-threaded line works fine. It doesn't seem to segfault for any
> command... "cat -" works OK, so it must be something about java
> contributing.
>
> After searching the Python bugs, the closest I could find was
> #971213
> <http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=971213>. Is
> this the same bug? Should I submit a new bug report? Is there any
> other way I can help resolve this?

I'd be astonished if this is the same bug.

The main oddness about Python threads (before 2.3) is that they run
with all signals masked.  You could play with a C wrapper (call
setprocmask, then exec fop) to see if this is what is causing the
problem.  But please try 2.4.
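
A modern-Python analogue of the suggested wrapper, for illustration (an assumed helper, not from this thread; signal.pthread_sigmask needs Python 3.3+, which is why a C wrapper calling setprocmask was the option in 2005):

```python
# Assumed helper: clear the signal mask, then exec the target program so
# it starts with no signals blocked, e.g. run_with_clear_sigmask(["fop"]).
import os
import signal

def run_with_clear_sigmask(argv):
    """Unblock all signals, then replace this process with argv."""
    signal.pthread_sigmask(signal.SIG_SETMASK, set())
    os.execvp(argv[0], argv)
```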

> BTW, built in file objects really could use better non-blocking
> support... I've got a half-drafted PEP for it... anyone interested in
> it?

Err, this probably should be in a different mail :)

Cheers,
mwh

-- 
  If trees could scream, would we be so cavalier about cutting them
  down? We might, if they screamed all the time, for no good reason.
                                                        -- Jack Handey
From aahz at pythoncraft.com  Wed Jan 19 16:04:29 2005
From: aahz at pythoncraft.com (Aahz)
Date: Wed Jan 19 16:04:31 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <41EE2B1E.8030209@livinglogic.de>
References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com>
	<41EE2B1E.8030209@livinglogic.de>
Message-ID: <20050119150428.GB25472@panix.com>

On Wed, Jan 19, 2005, Walter Dörwald wrote:
> M.-A. Lemburg wrote:
>> 
>>Maybe the string case is the real problem ... :-)
> 
> At least it seems that the string case is the exception.
> So if we fix __str__ this would be a bugfix for 2.4.1.

Nope.  Unless you're claiming the __str__ behavior is new in 2.4?
(Haven't been following the thread closely.)
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing."  --Alan Perlis
From walter at livinglogic.de  Wed Jan 19 18:31:25 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed Jan 19 18:31:28 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <41EE604F.3000006@iinet.net.au>
References: <41ED25C6.80603@livinglogic.de>	<41ED499A.1050206@egenix.com>	<41EE2B1E.8030209@livinglogic.de>	<41EE4797.6030105@egenix.com>	<41EE51E6.3090708@iinet.net.au>
	<41EE577C.3010405@egenix.com> <41EE604F.3000006@iinet.net.au>
Message-ID: <41EE996D.6040601@livinglogic.de>

Nick Coghlan wrote:

> [...]
> I imagine many people are like me, with __str__ being the only one of 
> these hooks they use frequently (Helping out with the Decimal 
> implementation is the only time I can recall using the slots for the 
> numeric types, and I rarely need to deal with Unicode).
> 
> Anyway, their heavy use suggests to me that __str__ and str() are 
> likely to provide a good model for the desired behaviour - they're the 
> ones that are likely to have been nudged in the most useful direction by 
> bug reports and the like.

+1

__foo__ provides conversion to foo, no matter whether foo is among the
direct or indirect base classes.

Simply moving the PyUnicode_Check() call in PyObject_Unicode() after the
__unicode__ call (after the PyErr_Clear() call) will implement this (but 
does not fix Nick's bug). Running the test suite with this change
reveals no other problems.
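
The proposed ordering might be rendered in Python roughly as follows (a sketch only, not the C code; str stands in for 2.x unicode, and the names are illustrative):

```python
# Rough Python rendering of the proposed PyObject_Unicode ordering:
# consult __unicode__ first, and only use the "already the right type"
# shortcut for exact instances (CheckExact rather than Check).
def object_unicode(v):
    hook = getattr(type(v), '__unicode__', None)
    if hook is not None:
        return hook(v)           # subclass hook wins, even for subclasses
    if type(v) is str:           # exact type: return unchanged
        return v
    return str(v)                # fall back to the default conversion

class C(str):
    def __unicode__(self):
        return "I am not unicode!"

assert object_unicode(C()) == "I am not unicode!"   # hook now honored
assert object_unicode("plain") == "plain"
```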

Bye,
    Walter Dörwald
From bac at OCF.Berkeley.EDU  Wed Jan 19 22:43:01 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Wed Jan 19 22:43:31 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <41EC4054.6000908@v.loewis.de>
References: <41EA9196.1020709@xs4all.nl>	
	<1105974308.17513.1.camel@localhost>	<41EC3849.8040503@v.loewis.de>
	<1106000872.5931.6.camel@emperor> <41EC4054.6000908@v.loewis.de>
Message-ID: <41EED465.1020305@ocf.berkeley.edu>

Martin v. Löwis wrote:
> I think Brett Cannon now also follows this rule; it
> really falls short enough in practice because (almost)
> nobody really wants to push his patch badly enough to
> put some work into it to review other patches.
> 

Yes, I am trying to support the rule, but my schedule is nutty right now so my 
turn-around time is rather long at the moment.

-Brett
From stuart at stuartbishop.net  Thu Jan 20 00:06:33 2005
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Thu Jan 20 00:06:41 2005
Subject: [Python-Dev] Unix line endings required for PyRun* breaking
	embedded Python
Message-ID: <41EEE7F9.7000902@stuartbishop.net>

There is a discussion going on at the moment in postgresql-general about 
plpythonu (which allows you write stored procedures in Python) and line 
endings. The discussion starts here:

   http://archives.postgresql.org/pgsql-general/2005-01/msg00792.php

The problem appears to be that things are working as documented in PEP-278:

   There is no support for universal newlines in strings passed to
   eval() or exec.  It is envisioned that such strings always have the
   standard \n line feed, if the strings come from a file that file can
   be read with universal newlines.

So what happens is that if a Windows or Mac user tries to create a 
Python stored procedure, it will go through to the server with Windows 
line endings and the embedded Python interpreter will raise a syntax 
error for everything except single line functions.

I don't think it is possible for plpythonu to fix this by simply 
translating the line endings, as this would require significant 
knowledge of Python syntax to do correctly (triple quoted strings and 
character escaping I think).

The timing of this thread is very unfortunate, as PostgreSQL 8.0 is 
being released this weekend and the (hopefully) last release of the 2.3 
series next week :-(


-- 
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/
From fredrik at pythonware.com  Thu Jan 20 00:14:55 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Jan 20 00:14:50 2005
Subject: [Python-Dev] Re: Unix line endings required for PyRun*
	breakingembedded Python
References: <41EEE7F9.7000902@stuartbishop.net>
Message-ID: <csmpl1$v7f$1@sea.gmane.org>

Stuart Bishop wrote:

> I don't think it is possible for plpythonu to fix this by simply translating the line endings, as 
> this would require significant knowledge of Python syntax to do correctly (triple quoted strings 
> and character escaping I think).

of course it's possible: that's what the interpreter does when it loads
a script or module, after all...  or in other words,

print repr("""
""")

always prints "\n" (at least on Unix (\n) and Windows (\r\n)).
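
This can be checked directly: translating line endings before compiling gives the same result as reading the file with universal newlines (a minimal sketch using compile()):

```python
# A quick check of the claim: translating \r\n to \n before compiling
# matches what the tokenizer does when reading a universal-newlines
# file, even inside triple-quoted strings.
src = 'text = """a\r\nb"""\r\n'
ns = {}
exec(compile(src.replace('\r\n', '\n'), '<plpythonu>', 'exec'), ns)
assert ns['text'] == 'a\nb'     # the literal's CRLF became a plain \n
```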

</F> 



From aleax at aleax.it  Thu Jan 20 00:32:19 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan 20 00:32:25 2005
Subject: [Python-Dev] Re: Unix line endings required for PyRun*
	breakingembedded Python
In-Reply-To: <csmpl1$v7f$1@sea.gmane.org>
References: <41EEE7F9.7000902@stuartbishop.net> <csmpl1$v7f$1@sea.gmane.org>
Message-ID: <5CC7C030-6A72-11D9-9DED-000A95EFAE9E@aleax.it>


On 2005 Jan 20, at 00:14, Fredrik Lundh wrote:

> Stuart Bishop wrote:
>
>> I don't think it is possible for plpythonu to fix this by simply 
>> translating the line endings, as
>> this would require significant knowledge of Python syntax to do 
>> correctly (triple quoted strings
>> and character escaping I think).
>
> of course it's possible: that's what the interpreter does when it loads
> a script or module, after all...  or in other words,
>
> print repr("""
> """)
>
> always prints "\n" (at least on Unix (\n) and Windows (\r\n)).

Mac, too (but then, that IS Unix to all intents and purposes, nowadays).


Alex

From firemoth at gmail.com  Thu Jan 20 01:03:25 2005
From: firemoth at gmail.com (Timothy Fitz)
Date: Thu Jan 20 01:03:28 2005
Subject: [Python-Dev] Re: Zen of Python
In-Reply-To: <3e8ca5c8050119150358c71728@mail.gmail.com>
References: <972ec5bd050119111359e358f5@mail.gmail.com>
	<3e8ca5c8050119150358c71728@mail.gmail.com>
Message-ID: <972ec5bd05011916033242179@mail.gmail.com>

On Thu, 20 Jan 2005 09:03:30 +1000, Stephen Thorne
<stephen.thorne@gmail.com> wrote:
> "Flat is better than nested" has one foot in concise powerful
> programming, the other foot in optimisation.
> 
> foo.bar.baz.arr involves 4 hashtable lookups. arr is just one hashtable lookup.

I find it amazingly hard to believe that this is implying optimization
over functionality or clarity. There has to be another reason, yet I
can't think of any.
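Stephen's lookup count is easy to check empirically; here is a small timing sketch (the nested names are hypothetical stand-ins for foo.bar.baz.arr, and exact numbers will vary by machine):

```python
import timeit

# Hypothetical nested namespaces standing in for foo.bar.baz.arr.
class Namespace:
    pass

foo = Namespace()
foo.bar = Namespace()
foo.bar.baz = Namespace()
foo.bar.baz.arr = [1, 2, 3]

# Each dot in foo.bar.baz.arr is a separate attribute lookup at runtime.
nested = timeit.timeit("foo.bar.baz.arr", globals={"foo": foo}, number=100000)

# Binding the innermost object to a single name pays that cost once.
arr = foo.bar.baz.arr
flat = timeit.timeit("arr", globals={"arr": arr}, number=100000)

# On most machines the flat lookup is measurably cheaper.
print(nested, flat)
```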
From pje at telecommunity.com  Thu Jan 20 01:14:47 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 20 01:14:26 2005
Subject: [Python-Dev] Re: Zen of Python
In-Reply-To: <972ec5bd05011916033242179@mail.gmail.com>
References: <3e8ca5c8050119150358c71728@mail.gmail.com>
	<972ec5bd050119111359e358f5@mail.gmail.com>
	<3e8ca5c8050119150358c71728@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050119190842.032382a0@mail.telecommunity.com>

At 07:03 PM 1/19/05 -0500, Timothy Fitz wrote:
>On Thu, 20 Jan 2005 09:03:30 +1000, Stephen Thorne
><stephen.thorne@gmail.com> wrote:
> > "Flat is better than nested" has one foot in concise powerful
> > programming, the other foot in optimisation.
> >
> > foo.bar.baz.arr involves 4 hashtable lookups. arr is just one hashtable 
> lookup.
>
>I find it amazingly hard to believe that this is implying optimization
>over functionality or clarity. There has to be another reason, yet I
>can't think of any.

Actually, this is one of those rare cases where optimization and clarity go 
hand in hand.  Human brains just don't handle nesting that well.  It's easy 
to visualize two levels of nested structure, but three is a stretch unless 
you can abstract at least one of the layers.

For example, I can remember 'peak.binding.attributes' because the 'peak' is 
the same for all the packages in PEAK.  I can also handle 
'peak.binding.tests.test_foo' because 'tests' is also always the same.  But 
that's pretty much the limit of my mental stack, which is why PEAK's 
namespaces are organized so that APIs are normally accessed as 
'binding.doSomething' or 'naming.fooBar', instead of requiring people to 
type 'peak.binding.attributes.doSomething'.

Clearly Java developers have this brain-stack issue as well, in that you 
usually see Java imports set up to have a flat namespace within the given 
module... er, class.  You don't often see people creating 
org.apache.jakarta.foo.bar.Baz instances in their method bodies.

From bac at OCF.Berkeley.EDU  Thu Jan 20 02:12:59 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Jan 20 02:13:17 2005
Subject: [Python-Dev] python-dev Summary for 2004-12-01 through 2004-12-15
	[draft]
Message-ID: <41EF059B.5090908@ocf.berkeley.edu>

Uh, life has been busy.

Will probably send this one out this weekend some time so please get 
corrections in before then.

------------------------------------

=====================
Summary Announcements
=====================
PyCon_ 2005 is well underway.  The schedule is in the process of being 
finalized (just figuring out the order of the talks).  And there is still time 
to get the early-bird registration price of $175 ($125 for students) before it 
expires on January 28th.

Some day I will be all caught up with the Summaries...

.. _PyCon: http://www.pycon.org

=========
Summaries
=========
----------------------------------
PEPS: those existing and gestating
----------------------------------
[for emails on PEP updates, subscribe to python-checkins_ and choose the 'PEP' 
topic]
A proto-PEP covering the __source__ proposal from the `last summary`_ has been 
posted to python-dev.

`PEP 338`_ proposes how to modify the '-m' modifier so as to be able to execute 
modules contained within packages.

.. _python-checkins: http://mail.python.org/mailman/listinfo/python-checkins
.. _PEP 338: http://www.python.org/peps/pep-0338.html

Contributing threads:
   - `PEP: __source__ proposal <>`__
   - `PEP 338: Executing modules inside packages with '-m' <>`__


-------------------
Deprecating modules
-------------------
The xmllib module was deprecated but not listed in `PEP 4`_.  What does one do?
Well, this led to a long discussion on how to handle module deprecation.

With the 'warnings' module now in existence, PEP 4 seemed less important.
It was generally agreed that listing modules in PEP 4 was no longer needed.
It was also agreed that deleting deprecated modules was not necessary; deletion
breaks code, and disk space is cheap.

It seems that removing a module's documentation and adding a deprecation 
warning is all that is needed to properly deprecate it.  With the documentation 
gone, new programmers will not use the module since they won't know about it, 
and the warning will let existing users know that they should be using 
something else.

.. _PEP 4: http://www.python.org/peps/pep-0004.html

Contributing threads:
   - `Deprecated xmllib module <>`__
   - `Rewriting PEP4 <>`__

------------------------------------------
PR to fight the idea that Python is "slow"
------------------------------------------
An article_ in ACM TechNews that covered 2.4 had several mentions that Python 
was "slow" while justifying the slowness (whether it be flexibility or being 
fast enough).  Guido (rightfully) didn't love all of the "slow" mentions which 
I am sure we have all heard at some point or another.

The suggestions started to pour in on how to combat this.  The initial one was 
to have a native compiler.  The thinking was that if we compiled to a native 
executable, people would psychologically stop associating Python with being 
interpreted, which is assumed to be slow.  Some people didn't love this idea 
since a native compiler is not an easy thing.  Others suggested including Pyrex 
with CPython, but that idea didn't catch on (a maintenance issue, plus one might 
say Pyrex is not the most Pythonic solution).  In the end this didn't get 
anywhere beyond the idea of a SIG about the various bundling tools (py2app, 
py2exe, etc.).

The other idea was to just stop worrying about speed and move on to stomping 
out bugs and making Python functionally more useful.  With modules in the 
stdlib being rewritten in C for performance reasons, it was suggested that we 
ourselves project the perception that performance is important to us.  Several 
other people also suggested that we simply not present speed as a big deal in 
release notes and such.

This also tied into the idea that managers don't worry about speed so much as 
about being able to hire a bunch of Python programmers.  This led to the 
suggestion of also emphasizing that Python is very easy to learn, which makes 
the supply of programmers a moot point.  There are a good number of Python 
programmers already, though; Stephan Deibel had some rough calculations that 
put the number at about 750K Python developers worldwide (give or take; a rough 
middle point of two different calculations).

.. _article: http://gcn.com/vol1_no1/daily-updates/28026-1.html

Contributing threads:
   - `2.4 news reaches interesting places <>`__


===============
Skipped Threads
===============
- MS VC compiler versions
- Any reason why CPPFLAGS not used in compiling?
       Extension modules now compile with directories specified in the LDFLAGS 
and CPPFLAGS env vars
- adding key argument to min and max
       min and max now have a 'key' argument like list.sort
- Unicode in doctests
- SRE bug and notifications
- PyInt_FromLong returning NULL
- PyOS_InputHook enhancement proposal
- The other Py2.4 issue
- MinGW And The other Py2.4 issue
- Supporting Third Party Modules
- Python in education
From abo at minkirri.apana.org.au  Thu Jan 20 02:43:43 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Thu Jan 20 02:44:21 2005
Subject: [Python-Dev] Strange segfault in Python threads and linux
	kernel 2.6
In-Reply-To: <2mpt01bkvs.fsf@starship.python.net>
References: <1106111769.3822.52.camel@schizo>
	<2mpt01bkvs.fsf@starship.python.net>
Message-ID: <1106185423.3784.26.camel@schizo>

On Wed, 2005-01-19 at 13:37 +0000, Michael Hudson wrote:
> Donovan Baarda <abo@minkirri.apana.org.au> writes:
[...]
> You've left out a very important piece of information: which version
> of Python you are using.  I'm guessing 2.3.4.  Can you try 2.4?

Debian Python2.3 (2.3.4-18), Debian kernel-image-2.6.8-1-686 (2.6.8-10),
and Debian kernel-image-2.4.27-1-686 (2.4.27-6)

> I'd be astonished if this is the same bug.
> 
> The main oddness about python threads (before 2.3) is that they run
> with all signals masked.  You could play with a C wrapper (call
> setprocmask, then exec fop) to see if this is what is causing the
> problem.  But please try 2.4.

Python 2.4 does indeed fix the problem. Unfortunately we are using Zope
2.7.4, and I'm a bit wary of attempting to migrate it all from 2.3 to
2.4. Is there any way this "Fix" can be back-ported to 2.3?

Note that this problem is being triggered when using 
Popen3() in a thread. Popen3() simply uses os.fork() and os.execvp().
The segfault is occurring in the execvp'ed process. I'm sure there must
be plenty of cases where this could happen. I think most people manage
to avoid it because the processes they are popen'ing or exec'ing happen
to not use signals.

After testing a bit, it seems the fork() in Popen3 is not a contributing
factor. The problem occurs whenever os.execvp() is executed in a thread.
It looks like the exec'ed command inherits the masked signals from the
thread.

I'm not sure what the correct behaviour should be. The fact that it
works in python2.4 feels more like a byproduct of the thread mask change
than correct behaviour. To me it seems like execvp() should be setting
the signal mask back to defaults or at least the mask of the main
process before doing the exec.
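For what it's worth, the workaround described above can be sketched in Python, assuming a POSIX platform where the per-thread mask is accessible (signal.pthread_sigmask only appeared much later, in Python 3.3; on 2.3 you would need something like Michael's sigprocmask patch or an extension module):

```python
import os
import signal

def exec_with_clean_mask(argv):
    """Fork, reset the child's inherited signal mask, then exec argv.

    Sketch only: assumes a POSIX platform and Python 3.3+, where
    signal.pthread_sigmask exposes the mask that older Pythons
    manipulated behind the scenes.
    """
    pid = os.fork()
    if pid == 0:
        try:
            # The child inherits the calling thread's mask; reset it
            # so the exec'ed program starts with the default mask.
            signal.pthread_sigmask(signal.SIG_SETMASK, set())
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # reached only if the exec itself fails
    return os.waitpid(pid, 0)

pid, status = exec_with_clean_mask(["true"])
print(status)  # 0 when the command exits cleanly
```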

> > BTW, built in file objects really could use better non-blocking
> > support... I've got a half-drafted PEP for it... anyone interested in
> > it?
> 
> Err, this probably should be in a different mail :)

The verboseness of the attached test code because of this issue prompted
that comment... so vaguely related :-)

-- 
Donovan Baarda <abo@minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/

From skip at pobox.com  Thu Jan 20 02:42:03 2005
From: skip at pobox.com (Skip Montanaro)
Date: Thu Jan 20 03:00:09 2005
Subject: [Python-Dev] Unix line endings required for PyRun* breaking
	embedded Python
In-Reply-To: <41EEE7F9.7000902@stuartbishop.net>
References: <41EEE7F9.7000902@stuartbishop.net>
Message-ID: <16879.3179.793126.174549@montanaro.dyndns.org>


    Stuart> I don't think it is possible for plpythonu to fix this by simply
    Stuart> translating the line endings, as this would require significant
    Stuart> knowledge of Python syntax to do correctly (triple quoted
    Stuart> strings and character escaping I think).

I don't see why not.  If you treat the string as a file in text mode, I
think you'd replace all [\r\n]+ with \n, even if it was embedded in a
string:

    >>> s
    'from math import pi\r\n"""triple-quoted string embedding CR:\rrest of string"""\r\nprint 2*pi*7\r'
    >>> open("foo", "w").write(s)       
    >>> open("foo", "rU").read()
    'from math import pi\n"""triple-quoted string embedding CR:\nrest of string"""\nprint 2*pi*7\n'

Just re.sub("[\r\n]+", "\n", s) and I think you're good to go.

Skip
From skip at pobox.com  Thu Jan 20 02:47:02 2005
From: skip at pobox.com (Skip Montanaro)
Date: Thu Jan 20 03:00:15 2005
Subject: [Python-Dev] Re: Zen of Python
In-Reply-To: <5.1.1.6.0.20050119190842.032382a0@mail.telecommunity.com>
References: <3e8ca5c8050119150358c71728@mail.gmail.com>
	<972ec5bd050119111359e358f5@mail.gmail.com>
	<5.1.1.6.0.20050119190842.032382a0@mail.telecommunity.com>
Message-ID: <16879.3478.511912.271438@montanaro.dyndns.org>


    Phillip> Actually, this is one of those rare cases where optimization
    Phillip> and clarity go hand in hand.  Human brains just don't handle
    Phillip> nesting that well.  It's easy to visualize two levels of nested
    Phillip> structure, but three is a stretch unless you can abstract at
    Phillip> least one of the layers.

Also, if you think about nesting in a class/instance context, something like

    self.attr.foo.xyz()

says you are noodling around in the implementation details of self.attr (you
know it has a data attribute called "foo").  This provides for some very
tight coupling between the implementation of whatever self.attr is and your
code.  If there is a reason for you to get at whatever xyz() returns, it's
probably best to publish a method as part of the api for self.attr.

Skip
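Skip's point, sketched with hypothetical classes: publish a method on the outer object rather than letting callers reach through its attributes:

```python
class Engine:
    """Hypothetical implementation detail of Car."""
    def __init__(self):
        self.rpm = 0

    def start(self):
        self.rpm = 900

class Car:
    def __init__(self):
        self._engine = Engine()  # callers shouldn't reach in here

    def start(self):
        # Published API: clients write car.start(), not
        # car._engine.start(), so Engine's internals can change
        # without breaking them.
        self._engine.start()

    def is_running(self):
        return self._engine.rpm > 0

car = Car()
car.start()
print(car.is_running())  # -> True
```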
From stephen.thorne at gmail.com  Thu Jan 20 01:14:55 2005
From: stephen.thorne at gmail.com (Stephen Thorne)
Date: Thu Jan 20 07:47:19 2005
Subject: [Python-Dev] Re: Zen of Python
In-Reply-To: <972ec5bd05011916033242179@mail.gmail.com>
References: <972ec5bd050119111359e358f5@mail.gmail.com>
	<3e8ca5c8050119150358c71728@mail.gmail.com>
	<972ec5bd05011916033242179@mail.gmail.com>
Message-ID: <3e8ca5c805011916144ce90c27@mail.gmail.com>

On Wed, 19 Jan 2005 19:03:25 -0500, Timothy Fitz <firemoth@gmail.com> wrote:
> On Thu, 20 Jan 2005 09:03:30 +1000, Stephen Thorne
> <stephen.thorne@gmail.com> wrote:
> > "Flat is better than nested" has one foot in concise powerful
> > programming, the other foot in optimisation.
> >
> > foo.bar.baz.arr involves 4 hashtable lookups. arr is just one hashtable lookup.
> 
> I find it amazingly hard to believe that this is implying optimization
> over functionality or clarity. There has to be another reason, yet I
> can't think of any.

What I meant to say was, 'flat is better than nested' allows you to
write more concise code, while also writing faster code.

Stephen.
From aleax at aleax.it  Thu Jan 20 09:09:36 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan 20 09:09:42 2005
Subject: [Python-Dev] Re: Zen of Python
In-Reply-To: <16879.3478.511912.271438@montanaro.dyndns.org>
References: <3e8ca5c8050119150358c71728@mail.gmail.com>
	<972ec5bd050119111359e358f5@mail.gmail.com>
	<5.1.1.6.0.20050119190842.032382a0@mail.telecommunity.com>
	<16879.3478.511912.271438@montanaro.dyndns.org>
Message-ID: <9FDAA05E-6ABA-11D9-9DED-000A95EFAE9E@aleax.it>


On 2005 Jan 20, at 02:47, Skip Montanaro wrote:

>     Phillip> Actually, this is one of those rare cases where 
> optimization
>     Phillip> and clarity go hand in hand.  Human brains just don't 
> handle
>     Phillip> nesting that well.  It's easy to visualize two levels of 
> nested
>     Phillip> structure, but three is a stretch unless you can abstract 
> at
>     Phillip> least one of the layers.
>
> Also, if you think about nesting in a class/instance context, 
> something like
>
>     self.attr.foo.xyz()
>
> says you are noodling around in the implementation details of 
> self.attr (you
> know it has a data attribute called "foo").  This provides for some 
> very
> tight coupling between the implementation of whatever self.attr is and 
> your
> code.  If there is a reason for you to get at whatever xyz() returns, 
> it's
> probably best to publish a method as part of the api for self.attr.

Good point: this is also known as "Law of Demeter" and relevant 
summaries and links are for example at 
http://www.ccs.neu.edu/home/lieber/LoD.html .


Alex

From just at letterror.com  Thu Jan 20 09:48:41 2005
From: just at letterror.com (Just van Rossum)
Date: Thu Jan 20 09:48:48 2005
Subject: [Python-Dev] Unix line endings required for PyRun* breaking
	embedded Python
In-Reply-To: <16879.3179.793126.174549@montanaro.dyndns.org>
Message-ID: <r01050400-1037-17198DD56AC011D98C3B003065D5E7E4@[10.0.0.23]>

Skip Montanaro wrote:

> Just re.sub("[\r\n]+", "\n", s) and I think you're good to go.

I don't think that in general you want to fold multiple empty lines into
one. This would be my preferred regex:

    s = re.sub(r"\r\n?", "\n", s)

Catches both DOS and old-style Mac line endings. Alternatively, you can
use s.splitlines():

    s = "\n".join(s.splitlines()) + "\n"

This also makes sure the string ends with a \n, which may or may not be
a good thing, depending on your application.

Just
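The difference between the two substitutions shows up on input containing blank lines; a quick sketch:

```python
import re

s = "line1\r\n\r\nline2\rline3\n"  # DOS ending, blank line, old Mac, Unix

# The "[\r\n]+" pattern also collapses the blank line:
collapsed = re.sub(r"[\r\n]+", "\n", s)

# Mapping each \r\n or lone \r to a single \n preserves blank lines:
kept = re.sub(r"\r\n?", "\n", s)

# splitlines() gives the same result as the second pattern here:
joined = "\n".join(s.splitlines()) + "\n"

print(repr(collapsed))  # 'line1\nline2\nline3\n' -- blank line lost
print(repr(kept))       # 'line1\n\nline2\nline3\n' -- blank line kept
```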
From fredrik at pythonware.com  Thu Jan 20 09:57:50 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Jan 20 09:57:43 2005
Subject: [Python-Dev] Re: Unix line endings required for PyRun* breaking
	embedded Python
References: <16879.3179.793126.174549@montanaro.dyndns.org>
	<r01050400-1037-17198DD56AC011D98C3B003065D5E7E4@[10.0.0.23]>
Message-ID: <csnrq0$7gu$1@sea.gmane.org>

Just van Rossum wrote:

> I don't think that in general you want to fold multiple empty lines into
> one. This would be my preferred regex:
>
>    s = re.sub(r"\r\n?", "\n", s)
>
> Catches both DOS and old-style Mac line endings. Alternatively, you can
> use s.splitlines():
>
>    s = "\n".join(s.splitlines()) + "\n"
>
> This also makes sure the string ends with a \n, which may or may not be
> a good thing, depending on your application.

    s = s.replace("\r", "\n"["\n" in s:])

</F> 



From gvanrossum at gmail.com  Thu Jan 20 12:07:35 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Jan 20 12:07:41 2005
Subject: [Python-Dev] Updated Monkey Typing pre-PEP
In-Reply-To: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com>
References: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com>
Message-ID: <ca471dc2050120030716adfb0@mail.gmail.com>

[Phillip J. Eby]
> I've revised the draft today to simplify the terminology, discussing only
> two broad classes of adapters.  Since Clark's pending proposals for PEP 246
> align well with the concept of "extenders" vs. "independent adapters", I've
> refocused my PEP to focus exclusively on adding support for "extenders",
> since PEP 246 already provides everything needed for independent adapters.
> 
> The new draft is here:
> http://peak.telecommunity.com/DevCenter/MonkeyTyping

On the plane to the Amazon.com internal developers conference in
Seattle (a cool crowd BTW) I finally got to read this. I didn't see a
way to attach comments to Phillip's draft, so here's my response. (And
no, it hasn't affected my ideas about optional typing. :)

The Monkey Typing proposal is trying to do too much, I believe. There
are two or three separate problem, and I think it would be better to
deal with each separately.

The first problem is what I'd call incomplete duck typing. There is a
function that takes a sequence argument, and you have an object that
partially implements the sequence protocol. What do you do? In current
Python, you just pass the object and pray -- if the function only uses
the methods that your object implements, it works, otherwise you'll
get a relatively clean AttributeError (e.g. "Foo instance has no
attribute '__setitem__'").

Phillip worries that solving this with interfaces would cause a
proliferation of "partial sequence" interfaces representing the needs
of various libraries. Part of his proposal comes down to having a way
to declare that some class C implements some interface I, even if C
doesn't implement all of I's methods (as long as implements at least
one). I like having this ability, but I think this fits in the
existing proposals for declaring interface conformance: there's no
reason why C couldn't have a __conform__ method that claims it
conforms to I even if it doesn't implement all methods. Or if you
don't want to modify C, you can do the same thing using the external
adapter registry.

I'd also like to explore ways of creating partial interfaces on the
fly. For example, if we need only the read() and readlines() methods
of the file protocol, maybe we could declare that as follows::

  def foo(f: file['read', 'readlines']): ...

I find the quoting inelegant, so maybe this would be better::

  file[file.read, file.readlines]

Yet another  idea (which places a bigger burden on the typecheck()
function presumed by the type declaration notation, see my blog on
Artima.com) would be to just use a list of the needed methods::

  [file.read, file.readlines]

All this would work better if file weren't a concrete type but an interface.

Now on to the other problems Phillip is trying to solve with his
proposal. He says, sometimes there's a class that has the
functionality that you need, but it's packaged differently. I'm not
happy with his proposal for solving this by declaring various adapting
functions one at a time, and I'd much rather see this done without
adding new machinery or declarations: when you're using adaptation,
just write an adapter class and register it; without adaptation, you
can still write the adapter class and explicitly instantiate it.

I have to admit that I totally lost track of the proposal when it
started to talk about JetPacks. I believe that this is trying to deal
with stateful adapters. I hope that Phillip can write something up
about these separately from all the other issues, maybe then it's
clearer.

There's one other problem that Phillip tries to tackle in his
proposal: how to implement the "rich" version of an interface if all
you've got is a partial implementation (e.g. you might have readline()
but you need readlines()). I think this problem is worthy of a
solution, but I think the solution could be found, again, in a
traditional adapter class. Here's a sketch::

class RichFile:
    def __init__(self, ref):
        self.__ref = ref
        if not hasattr(ref, 'readlines'):
            # Other forms of this magic are conceivable
            self.readlines = self.__readlines
    def __readlines(self): # Ignoring the rarely used optional argument
        # It's tempting to use [line for line in self.__ref] here,
        # but that doesn't use readline()
        lines = []
        while True:
            line = self.__ref.readline()
            if not line:
                break
            lines.append(line)
        return lines
    def __getattr__(self, name):
        # Delegate all other attributes to the underlying object
        return getattr(self.__ref, name)

Phillip's proposal reduces the amount of boilerplate in this class
somewhat (mostly the constructor and the __getattr__() method), but
apart from that it doesn't really seem to do a lot except let you put
pieces of the adapter in different places, which doesn't strike me as
such a great idea.
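For concreteness, here is a self-contained rendering of that sketch (ReadlineOnly is a hypothetical stand-in for a partial file implementation), showing the adapter supplying readlines() on top of readline():

```python
class RichFile:
    def __init__(self, ref):
        self.__ref = ref
        if not hasattr(ref, 'readlines'):
            self.readlines = self.__readlines

    def __readlines(self):  # ignoring the rarely used optional argument
        lines = []
        while True:
            line = self.__ref.readline()
            if not line:
                break
            lines.append(line)
        return lines

    def __getattr__(self, name):
        # Delegate all other attributes to the underlying object.
        return getattr(self.__ref, name)

class ReadlineOnly:
    """Hypothetical object implementing readline() but not readlines()."""
    def __init__(self, lines):
        self._lines = list(lines)

    def readline(self):
        return self._lines.pop(0) if self._lines else ''

f = RichFile(ReadlineOnly(['a\n', 'b\n']))
print(f.readlines())  # -> ['a\n', 'b\n']
```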

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From marktrussell at btopenworld.com  Thu Jan 20 13:33:05 2005
From: marktrussell at btopenworld.com (Mark Russell)
Date: Thu Jan 20 13:33:08 2005
Subject: [Python-Dev] Updated Monkey Typing pre-PEP
In-Reply-To: <ca471dc2050120030716adfb0@mail.gmail.com>
References: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com>
	<ca471dc2050120030716adfb0@mail.gmail.com>
Message-ID: <1106224384.5347.6.camel@localhost>

On Thu, 2005-01-20 at 11:07, Guido van Rossum wrote:
> I'd also like to explore ways of creating partial interfaces on the
> fly. For example, if we need only the read() and readlines() methods
> of the file protocol, maybe we could declare that as follows::
> 
>   def foo(f: file['read', 'readlines']): ...
> 
> I find the quoting inelegant, so maybe this would be better::
> 
>   file[file.read, file.readlines]

Could you not just have a builtin which constructs an interface on the
fly, so you could write:

    def foo(f: interface(file.read, file.readlines)): ...

For commonly used subsets of course you'd do something like:

    IInputStream = interface(file.read, file.readlines)

    def foo(f: IInputStream): ...

I can't see that interface() would need much magic - I would guess you
could implement it in python with ordinary introspection.

Mark Russell
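Mark's guess seems right; a minimal introspective sketch (names like interface() and provided_by() are hypothetical, and io.IOBase stands in for the old file type):

```python
import io

def interface(*methods):
    """Hypothetical interface() factory: collects required method names."""
    names = tuple(m.__name__ for m in methods)

    class Interface:
        required = names

        @staticmethod
        def provided_by(obj):
            # Conformance here just means "has callable attributes
            # with the required names" -- ordinary introspection.
            return all(callable(getattr(obj, n, None)) for n in names)

    return Interface

IInputStream = interface(io.IOBase.readline, io.IOBase.readlines)

print(IInputStream.provided_by(io.StringIO("a\nb\n")))  # -> True
print(IInputStream.provided_by(42))                     # -> False
```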

From arigo at tunes.org  Thu Jan 20 13:38:26 2005
From: arigo at tunes.org (Armin Rigo)
Date: Thu Jan 20 13:50:19 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <41ECD6D9.9000001@egenix.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<1105945019.30052.26.camel@localhost>
	<ca471dc2050117074347baa60d@mail.gmail.com>
	<41EC431A.90204@egenix.com>
	<ca471dc20501172115275a99c9@mail.gmail.com>
	<41ECD6D9.9000001@egenix.com>
Message-ID: <20050120123826.GA30873@vicky.ecs.soton.ac.uk>

Hi,

Removing unbound methods also breaks the 'py' lib quite a bit.  The 'py.test'
framework handles function and bound/unbound method objects all over the
place, and uses introspection on them, as they are the objects defining the
tests to run.
  
It's nothing that can't be repaired, and in places the fix even looks nicer
than the original code, but I thought I'd point out that it suggests
large-scale breakage.  I'm expecting any code that relies on introspection to
break at least here or there.  My bet is that, even if the fixes are just a
couple of lines long, everyone will have to upgrade a number of their packages
when switching to Python 2.5 -- unheard of!

For reference, the issues I got with the py lib are described at
 http://codespeak.net/pipermail/py-dev/2005-January/000159.html


Armin
From skip at pobox.com  Thu Jan 20 13:51:41 2005
From: skip at pobox.com (Skip Montanaro)
Date: Thu Jan 20 14:00:16 2005
Subject: [Python-Dev] Re: Unix line endings required for PyRun* breaking
	embedded Python
In-Reply-To: <csnrq0$7gu$1@sea.gmane.org>
References: <16879.3179.793126.174549@montanaro.dyndns.org>
	<r01050400-1037-17198DD56AC011D98C3B003065D5E7E4@[10.0.0.23]>
	<csnrq0$7gu$1@sea.gmane.org>
Message-ID: <16879.43357.357745.467891@montanaro.dyndns.org>


    Fredrik> s = s.replace("\r", "\n"["\n" in s:])

This fails on admittedly weird strings that mix line endings:

    >>> s = "abc\rdef\r\n"
    >>> s = s.replace("\r", "\n"["\n" in s:])
    >>> s
    'abcdef\n'

where universal newline mode or Just's re.sub() gadget would work.

Skip
From skip at pobox.com  Thu Jan 20 13:44:00 2005
From: skip at pobox.com (Skip Montanaro)
Date: Thu Jan 20 14:00:18 2005
Subject: [Python-Dev] Unix line endings required for PyRun* breaking
	embedded Python
In-Reply-To: <r01050400-1037-17198DD56AC011D98C3B003065D5E7E4@[10.0.0.23]>
References: <16879.3179.793126.174549@montanaro.dyndns.org>
	<r01050400-1037-17198DD56AC011D98C3B003065D5E7E4@[10.0.0.23]>
Message-ID: <16879.42896.49304.693682@montanaro.dyndns.org>


    Just> Skip Montanaro wrote:
    >> Just re.sub("[\r\n]+", "\n", s) and I think you're good to go.

    Just> I don't think that in general you want to fold multiple empty
    Just> lines into one.

Whoops.  Yes.

Skip
From mdehoon at ims.u-tokyo.ac.jp  Thu Jan 20 14:18:52 2005
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Thu Jan 20 14:14:54 2005
Subject: [Python-Dev] Patch review [ 723201 ] PyArg_ParseTuple problem with
	'L' format
Message-ID: <41EFAFBC.8000203@ims.u-tokyo.ac.jp>

Patch review [ 723201 ] PyArg_ParseTuple problem with 'L' format

The PyArg_ParseTuple function (PyObject *args, char *format, ...) parses the
arguments args and stores them in the variables specified following the format
argument. If format=="i", indicating an integer, but the corresponding Python
object in args is not a Python int or long, a TypeError is thrown:

TypeError: an integer is required

For the "L" format, indicating a long long, instead a SystemError is thrown:

SystemError: Objects/longobject.c:788: bad argument to internal function

The submitted patch fixes this; however, I think it is not the best way to do it.
The original code (part of the convertsimple function in Python/getargs.c) is

	case 'L': {/* PY_LONG_LONG */
		PY_LONG_LONG *p = va_arg( *p_va, PY_LONG_LONG * );
		PY_LONG_LONG ival = PyLong_AsLongLong( arg );
		if( ival == (PY_LONG_LONG)-1 && PyErr_Occurred() ) {
			return converterr("long<L>", arg, msgbuf, bufsize);
		} else {
			*p = ival;
		}
		break;
	}

In the patch, a PyLong_Check and a PyInt_Check are added:

         case 'L': {/* PY_LONG_LONG */
                 PY_LONG_LONG *p = va_arg(*p_va, PY_LONG_LONG *);
                 PY_LONG_LONG ival;
		/* ********** patch starts here ********** */
                 if (!PyLong_Check(arg) && !PyInt_Check(arg))
                         return converterr("long<L>", arg, msgbuf, bufsize);
		/* ********** patch ends here ********** */
                 ival = PyLong_AsLongLong(arg);
                 if (ival == (PY_LONG_LONG)-1 && PyErr_Occurred()) {
                         return converterr("long<L>", arg, msgbuf, bufsize);
                 } else {
                         *p = ival;
                 }
                 break;
         }

However, the PyLong_AsLongLong function (in Objects/longobject.c) also contains
a call to PyLong_Check and PyInt_Check, so there should be no need for another
such check here:

PY_LONG_LONG
PyLong_AsLongLong(PyObject *vv)
{
         PY_LONG_LONG bytes;
         int one = 1;
         int res;

         if (vv == NULL) {
                 PyErr_BadInternalCall();
                 return -1;
         }
         if (!PyLong_Check(vv)) {
                 if (PyInt_Check(vv))
                         return (PY_LONG_LONG)PyInt_AsLong(vv);
                 PyErr_BadInternalCall();
                 return -1;
         }

A better solution would be to replace the PyErr_BadInternalCall() in the 
PyLong_AsLongLong function by
		PyErr_SetString(PyExc_TypeError, "an integer is required");
This would make it consistent with PyInt_AsLong in Objects/intobject.c:

long
PyInt_AsLong(register PyObject *op)
{
         PyNumberMethods *nb;
         PyIntObject *io;
         long val;

         if (op && PyInt_Check(op))
                 return PyInt_AS_LONG((PyIntObject*) op);

         if (op == NULL || (nb = op->ob_type->tp_as_number) == NULL ||
             nb->nb_int == NULL) {
                 PyErr_SetString(PyExc_TypeError, "an integer is required");
                 return -1;
         }

By the way, I noticed that a Python float is converted to an int (with a
deprecation warning), while trying to convert a Python float into a long long
int results in a TypeError. Also, I'm not sure about the function of the calls 
to converterr (in various places in the convertsimple function); none of the 
argument type errors seem to lead to the warning messages created by converterr.

--Michiel.

-- 
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon



From mwh at python.net  Thu Jan 20 15:12:37 2005
From: mwh at python.net (Michael Hudson)
Date: Thu Jan 20 15:12:40 2005
Subject: [Python-Dev] Strange segfault in Python threads and linux
	kernel 2.6
In-Reply-To: <1106185423.3784.26.camel@schizo> (Donovan Baarda's message of
	"Thu, 20 Jan 2005 12:43:43 +1100")
References: <1106111769.3822.52.camel@schizo>
	<2mpt01bkvs.fsf@starship.python.net> <1106185423.3784.26.camel@schizo>
Message-ID: <2mllaob356.fsf@starship.python.net>

Donovan Baarda <abo@minkirri.apana.org.au> writes:

> On Wed, 2005-01-19 at 13:37 +0000, Michael Hudson wrote:
>> Donovan Baarda <abo@minkirri.apana.org.au> writes:
> [...]
>> You've left out a very important piece of information: which version
>> of Python you are using.  I'm guessing 2.3.4.  Can you try 2.4?
>
> Debian Python2.3 (2.3.4-18), Debian kernel-image-2.6.8-1-686 (2.6.8-10),
> and Debian kernel-image-2.4.27-1-686 (2.4.27-6)
>
>> I'd be astonished if this is the same bug.
>> 
>> The main oddness about python threads (before 2.3) is that they run
>> with all signals masked.  You could play with a C wrapper (call
>> setprocmask, then exec fop) to see if this is what is causing the
>> problem.  But please try 2.4.
>
> Python 2.4 does indeed fix the problem. 

That's good to hear.

> Unfortunately we are using Zope 2.7.4, and I'm a bit wary of
> attempting to migrate it all from 2.3 to 2.4. 

That's not so good to hear, albeit unsurprising.

> Is there any way this "Fix" can be back-ported to 2.3?

Probably not.  It was quite invasive and a bit scary.  OTOH, it hasn't
been the cause of any bug reports yet, so it can't be all bad.

> Note that this problem is being triggered when using 
> Popen3() in a thread. Popen3() simply uses os.fork() and os.execvp().
> The segfault is occurring in the excecvp'ed process. I'm sure there must
> be plenty of cases where this could happen. I think most people manage
> to avoid it because the processes they are popen'ing or exec'ing happen
> to not use signals.

Indeed.

> After testing a bit, it seems the fork() in Popen3 is not a contributing
> factor. The problem occurs whenever os.execvp() is executed in a thread.
> It looks like the exec'ed command inherits the masked signals from the
> thread.

Yeah.  I could have told you that, sorry :)

> I'm not sure what the correct behaviour should be. The fact that it
> works in python2.4 feels more like a byproduct of the thread mask change
> than correct behaviour. 

Well, getting rid of the thread mask changes was one of the goals of
the change.

> To me it seems like execvp() should be setting the signal mask back
> to defaults or at least the mask of the main process before doing
> the exec.

Possibly.  I think the 2.4 change -- not fiddling the process mask at
all -- is the Right Thing, but that doesn't help 2.3 users.  This has
all been discussed before at some length, on python-dev and in various
bug reports on SF.

In your situation, I think the simplest thing you can do is dig out an
old patch of mine that exposes sigprocmask + co to Python and either
make a custom Python incorporating the patch and use that, or put the
code from the patch into an extension module.  Then before execing
fop, use the new code to set the signal mask to something sane.  Not
pretty, particularly, but it should work.

>> > BTW, built in file objects really could use better non-blocking
>> > support... I've got a half-drafted PEP for it... anyone interested in
>> > it?
>> 
>> Err, this probably should be in a different mail :)
>
> The verboseness of the attached test code because of this issue prompted
> that comment... so vaguely related :-)

Oh right :) Didn't actually read the test code, not having fop to
hand...

Cheers,
mwh

-- 
  The ability to quote is a serviceable substitute for wit.
                                                -- W. Somerset Maugham
From skip at pobox.com  Thu Jan 20 15:22:09 2005
From: skip at pobox.com (Skip Montanaro)
Date: Thu Jan 20 16:00:01 2005
Subject: [Python-Dev] ANN: Free Trac/Subversion hosting at
	Python-Hosting.com (fwd)
Message-ID: <16879.48785.395695.742124@montanaro.dyndns.org>


Thought I'd pass this along for people who don't read comp.lang.python.

Skip

-------------- next part --------------
An embedded message was scrubbed...
From: remi@cherrypy.org (Remi Delon)
Subject: ANN: Free Trac/Subversion hosting at Python-Hosting.com
Date: 19 Jan 2005 10:15:02 -0800
Size: 5405
Url: http://mail.python.org/pipermail/python-dev/attachments/20050120/b0ad5d8f/attachment-0001.mht
From pje at telecommunity.com  Thu Jan 20 17:02:38 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Jan 20 17:02:34 2005
Subject: [Python-Dev] Updated Monkey Typing pre-PEP
In-Reply-To: <ca471dc2050120030716adfb0@mail.gmail.com>
References: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com>
	<5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050120101741.0405fb30@mail.telecommunity.com>

At 03:07 AM 1/20/05 -0800, Guido van Rossum wrote:
>Phillip worries that solving this with interfaces would cause a
>proliferation of "partial sequence" interfaces representing the needs
>of various libraries. Part of his proposal comes down to having a way
>to declare that some class C implements some interface I, even if C
>doesn't implement all of I's methods (as long as it implements at least
>one). I like having this ability, but I think this fits in the
>existing proposals for declaring interface conformance: there's no
>reason why C couldn't have a __conform__ method that claims it
>conforms to I even if it doesn't implement all methods. Or if you
>don't want to modify C, you can do the same thing using the external
>adapter registry.

There are some additional things that it does in this area:

1. Avoids namespace collisions when an object has a method with the same 
name as one in an interface, but which doesn't do the same thing.  (A 
common criticism of duck typing by static typing advocates; i.e. how do you 
know that 'read()' has the same semantics as what this routine expects?)

2. Provides a way to say that you conform, without writing a custom 
__conform__ method

3. Syntax for declaring conformance is the same as for adaptation

4. Allows *external* (third-party) code to declare a type's conformance, 
which is important for integrating existing code with code with type 
declarations
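[For reference, the `__conform__` mechanism being discussed is the PEP 246 adaptation hook, which was never adopted into core Python. A rough, hedged sketch of how an `adapt()` built on it would let a class claim conformance without implementing every method — all names besides `adapt` and `__conform__` are illustrative:

```python
class I:
    """Marker class standing in for an interface."""

def adapt(obj, protocol):
    # PEP 246 step 1: ask the object's type whether it conforms
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        result = conform(obj, protocol)
        if result is not None:
            return result
    # PEP 246 step 2: instances of the protocol pass through as-is
    if isinstance(obj, protocol):
        return obj
    raise TypeError("can't adapt %r to %r" % (obj, protocol))

class Reader:
    """Implements only part of I, but claims conformance anyway."""
    def read(self):
        return "data"
    def __conform__(self, protocol):
        if protocol is I:
            return self
        return None
```

Here `adapt(Reader(), I)` returns the object itself even though `Reader` implements only one of I's methods, which is the behavior Guido describes above. -- ed.]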


>I'd also like to explore ways of creating partial interfaces on the
>fly. For example, if we need only the read() and readlines() methods
>of the file protocol, maybe we could declare that as follows::
>
>   def foo(f: file['read', 'readlines']): ...

FYI, this is similar to the suggestion from Samuele Pedroni that led to 
PyProtocols having a:

     protocols.protocolForType(file, ['read','readlines'])

capability that implements this idea.  However, the problem with 
implementing it by actually having distinct protocols is that declaring as 
few as seven methods results in 127 different protocol objects with 
conformance relationships to manage.

In practice, I've also personally never used this feature, and probably 
never would unless it had meaning for type declarations.  Also, your 
proposal as shown would be tedious for the declarer compared to just saying 
'file' and letting the chips fall where they may.
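[The 127 figure is just the number of non-empty subsets of seven methods, as a quick check confirms (the method names below are illustrative):

```python
from itertools import combinations

methods = ['read', 'readline', 'readlines', 'write',
           'writelines', 'seek', 'tell']          # seven methods
# every non-empty subset is a distinct partial protocol
subsets = [c for k in range(1, len(methods) + 1)
           for c in combinations(methods, k)]
assert len(subsets) == 2 ** 7 - 1 == 127
```

-- ed.]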


>Now on to the other problems Phillip is trying to solve with his
>proposal. He says, sometimes there's a class that has the
>functionality that you need, but it's packaged differently. I'm not
>happy with his proposal for solving this by declaring various adapting
>functions one at a time, and I'd much rather see this done without
>adding new machinery or declarations: when you're using adaptation,
>just write an adapter class and register it; without adaptation, you
>can still write the adapter class and explicitly instantiate it.

In the common case (at least for my code) an adapter class has only one or 
two methods, but the additional code and declarations needed to make it an 
adapter can increase the code size by 20-50%.  Using @like directly on an 
adapting method would result in a more compact expression in the common case.


>I have to admit that I totally lost track of the proposal when it
>started to talk about JetPacks. I believe that this is trying to deal
>with stateful adapters. I hope that Phillip can write something up
>about these separately from all the other issues, maybe then it's
>clearer.

Yes, it was for per-object ("as a") adapter state, rather than per-adapter 
("has a") state, however.  The PEP didn't try to tackle "has a" adapters at 
all.


>Phillip's proposal reduces the amount of boilerplate in this class
>somewhat (mostly the constructor and the __getattr__() method),

Actually, it wouldn't implement the __getattr__; a major point of the 
proposal is that when adapting to an interface, you get *only* the 
attributes from the interface, and of those only the ones that the adaptee 
has implementations for.  So, arbitrary __getattr__ doesn't pass down to 
the adapted item.


>  but
>apart from that it doesn't really seem to do a lot except let you put
>pieces of the adapter in different places, which doesn't strike me as
>such a great idea.

The use case for that is that you are writing a package which extends an 
interface IA to create interface IB, and there already exist numerous 
adapters to IA.  As long as IB's additional methods can be defined in terms 
of IA, then you can extend all of those adapters at one stroke.

In other words, external abstract operations are exactly equivalent to 
stateless, lossless, interface-to-interface adapters applied 
transitively.  But the point of the proposal was to avoid having to explain 
to somebody what all those terms mean, while making it easier to do such an 
adaptation correctly and succinctly.

One problem with using concrete adapter classes to full interfaces rather 
than partial interfaces is that it leads to situations like Alex's adapter 
diamond examples, because you end up with not-so-good adapters and few 
common adaptation targets.  The idea of operation conformance and 
interface-as-namespace is to make it easier to have fewer interfaces and 
therefore fewer adapter diamonds.  And, equally important, if you have only 
partial conformance you don't have to worry about claiming to have more 
information/ability than you actually have, which was the source of the 
problem in one class of Alex's examples.  If you substitute per-operation 
adapters in Alex's PersonName example, the issue disappears because there 
isn't an adapter claiming to supply a middle name that it doesn't have; 
that operation or attribute simply doesn't appear on the dynamic adapter 
class in that case.

By the way, this concept is also exactly equivalent to single-dispatched 
generic functions in a language like Dylan.  In Dylan, a protocol consists 
of a set of abstract generic functions, not unlike the no-op methods in a 
Python interface.  However, instead of adapting objects or declaring their 
conformance, you declare how those methods are implemented for a particular 
subject type, and that does not have to be in the class for the subject 
type, or in the class where the method is.  And when you invoke the 
operation, you do the moral equivalent of 'file.read(file_like_object, 
bytes)', rather than 'file_like_object.read(bytes)', and the right 
implementation is looked up by the concrete type of 'file_like_object'.

Of course, that's not a very Pythonic style, so the idea of this PEP was to 
swap it around so the type declaration of 'file' is automatically turning 
'filelike.read(bytes)' into 'file_interface.read(filelike,bytes)' internally.

Pickling and copying and such in the stdlib are already generic functions 
of this kind.  You have a dictionary of type->implementation for each of 
these operations.  The table is explicit and the lookup is explicit, and 
adaptation doesn't come into it, but this is basically the same as what 
you'd do in Dylan by having the moral equivalent of a 'picklable' protocol 
with a 'pickle(ob,stream)' generic function, and implementations declared 
elsewhere.  So, the concept of registering implementations of an operation 
in an interface for a given concrete type (that can happen from third-party 
code!) certainly isn't without precedent in Python.
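[The type->implementation dictionary pattern Phillip describes can be sketched as below. All names here are hypothetical; the real stdlib uses `copy_reg`-style dispatch tables for pickling and copying, but the shape is the same:

```python
# explicit registry mapping concrete type -> implementation
_pickle_registry = {}

def register(typ, impl):
    _pickle_registry[typ] = impl

def pickle_op(ob, stream):
    # explicit lookup by concrete type, as in 'pickle(ob, stream)'
    impl = _pickle_registry[type(ob)]
    return impl(ob, stream)

# third-party code can register implementations for existing types
register(list, lambda ob, s: s.append('list(%d)' % len(ob)))
register(dict, lambda ob, s: s.append('dict(%d)' % len(ob)))

out = []
pickle_op([1, 2, 3], out)
pickle_op({}, out)
# out is now ['list(3)', 'dict(0)']
```

-- ed.]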

Once you look at it through that lens, then you will see that everything in 
the proposal that doesn't deal with stateful adaptation is just a 
straightforward way to flip from 'operation(ob,...)' to 
'ob.operation(...)', where the original 'operation()' is a type-registered 
operation like 'pickle', but created automatically for existing operations 
like file.read.

So if it "does too much", it's only because that one concept of a 
type-dispatched function in Python provides for many possibilities.  :)

From martin at v.loewis.de  Thu Jan 20 17:22:53 2005
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Jan 20 17:22:49 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <41ED9B02.2040908@xs4all.nl>
References: <41EA9196.1020709@xs4all.nl>
	<41ED645B.40709@xs4all.nl>	<41ED8B0A.7050201@v.loewis.de>
	<41ED9B02.2040908@xs4all.nl>
Message-ID: <41EFDADD.10701@v.loewis.de>

Irmen de Jong wrote:
> That sounds very convenient, thanks.

Ok, welcome to the project! Please let me know whether
it "works".

> Does the status of 'python project member' come with
> certain expectations that must be complied with ? ;-)

There are a few conventions that are followed more
or less stringently. You should be aware of the
things in the developer FAQ,

http://www.python.org/dev/devfaq.html

Initially, "new" developers should follow a
"write-after-approval" procedure, i.e. they should not
commit anything until they got somebody's approval.
Later, we commit things which we feel confident about,
and post other things to SF.

For CVS, I'm following a few more conventions which
I think are not documented anywhere.
- Always add a CVS commit message
- Add an entry to Misc/NEWS, if there is a new feature,
   or if it is a bug fix for a maintenance branch
   (I personally don't list bugs fixed in the HEAD revision,
   but others apparently do)
- When committing configure.in, always remember to commit
   configure also (and pyconfig.h.in if it changed; remember
   to run autoheader)
- Always run the test suite before committing
- If you are committing a bug fix, consider to backport
   it to maintenance branches right away. If you don't
   backport it immediately, it likely won't appear in the
   next release. At the moment, backports to 2.4 are
   encouraged; backports to 2.3 are still possible for
   a few more days.
   If you chose not to backport for some reason, document
   that reason in the commit message. If you plan to
   backport, document that intention in the commit message
   (I usually say "Will backport to 2.x")
- In the commit message, always refer to the SF tracker
   id. In the tracker item, always refer to CVS version
   numbers. I use the script attached to extract those
   numbers from the CVS commit message, to paste them
   into the SF tracker.

I probably forgot to mention a few things; you'll notice
few enough :-)

HTH,
Martin
From tim.peters at gmail.com  Thu Jan 20 17:59:31 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Jan 20 17:59:34 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <41EFDADD.10701@v.loewis.de>
References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl>
	<41ED8B0A.7050201@v.loewis.de> <41ED9B02.2040908@xs4all.nl>
	<41EFDADD.10701@v.loewis.de>
Message-ID: <1f7befae050120085919bdfd2f@mail.gmail.com>

[Martin v. Löwis]
...
> - Add an entry to Misc/NEWS, if there is a new feature,
>   or if it is a bug fix for a maintenance branch
>   (I personally don't list bugs fixed in the HEAD revision,
>   but others apparently do)

You should.  In part this is to comply with license requirements: 
we're a derivative work from CNRI and BeOpen's Python releases, and
their licenses require that we include "a brief summary of the changes
made to Python".  That certainly includes changes made to repair bugs.

It's also extremely useful in practice to have a list of repaired bugs
in NEWS!  That saved me hours just yesterday, when trying to account
for a Zope3 test that fails under Python 2.4 but works under 2.3.4. 
2.4 NEWS pointed out that tuple hashing changed to close bug 942952,
which I can't imagine how I would have remembered otherwise.
From martin at v.loewis.de  Thu Jan 20 18:29:19 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Jan 20 18:29:15 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <1f7befae050120085919bdfd2f@mail.gmail.com>
References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl>	
	<41ED8B0A.7050201@v.loewis.de> <41ED9B02.2040908@xs4all.nl>	
	<41EFDADD.10701@v.loewis.de>
	<1f7befae050120085919bdfd2f@mail.gmail.com>
Message-ID: <41EFEA6F.7010600@v.loewis.de>

Tim Peters wrote:
> It's also extremely useful in practice to have a list of repaired bugs
> in NEWS!

I'm not convinced about that - it makes the NEWS file almost unreadable:
if every tiny change is listed, the noise is so high that it is very
hard to see what the important changes are.

Regards,
Martin
From tim.peters at gmail.com  Thu Jan 20 18:44:56 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Jan 20 18:44:59 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <41EFEA6F.7010600@v.loewis.de>
References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl>
	<41ED8B0A.7050201@v.loewis.de> <41ED9B02.2040908@xs4all.nl>
	<41EFDADD.10701@v.loewis.de>
	<1f7befae050120085919bdfd2f@mail.gmail.com>
	<41EFEA6F.7010600@v.loewis.de>
Message-ID: <1f7befae05012009444ad4c822@mail.gmail.com>

[Tim Peters]
>> It's also extremely useful in practice to have a list of repaired
>> bugs in NEWS!

[Martin v. Löwis]
> I'm not convinced about that - it makes the NEWS file almost
> unreadable: if every tiny change is listed, the noise is so high
> that it is very hard to see what the important changes are.

My experience disagrees, and I gave a specific example from just the
last day.  High-level coverage of the important bits is served (and
served well) by Andrew's "What's New in Python" doc.  (Although I'll
note that when I did releases, I tried to sort section contents in
NEWS, to put the more important items at the top.)

In any case, you snipped the other part here:  a brief summary of
changes is required by the licenses, and they don't distinguish
between changes due to features or bugs.  If, for example, we didn't
note that tuple hashing changed in NEWS, we would be required to note
that in some other file.  NEWS is the historical place for it, and
works fine for this purpose (according to me <wink>).
From irmen at xs4all.nl  Thu Jan 20 19:05:55 2005
From: irmen at xs4all.nl (Irmen de Jong)
Date: Thu Jan 20 19:05:56 2005
Subject: [Python-Dev] A short introduction
In-Reply-To: <41EFDADD.10701@v.loewis.de>
References: <41EA9196.1020709@xs4all.nl>
	<41ED645B.40709@xs4all.nl>	<41ED8B0A.7050201@v.loewis.de>
	<41ED9B02.2040908@xs4all.nl> <41EFDADD.10701@v.loewis.de>
Message-ID: <41EFF303.1060004@xs4all.nl>

Martin v. Löwis wrote:
> Irmen de Jong wrote:
> 
>> That sounds very convenient, thanks.
> 
> 
> Ok, welcome to the project! Please let me know whether
> it "works".

It looks like it works; I seem to be able to add a new
attachment to the spwd patch, which I will do shortly.

*

Now that I'm part of the developers group I feel obliged
to tell a little bit about myself.

I'm a guy from the Netherlands, 30 years old, currently
employed at a company designing and developing front-
and mid-office web based applications (mostly). We do
this in Java (j2ee).
I'm using Python where the job allows it, which is much
too little IMO :) and for my private stuff, or hobby if
you wish.

I've been introduced to Python in 1995-6, I think, wanted
it at home too, ported it to AmigaDOS (voila AmigaPython),
and got more and more involved ever since (starting
mostly as a lurker in comp.lang.python).
My interests are broad but there's two areas that I
particularly like: internet/networking and web/browsers.

Over the course of the past few years I developed Pyro,
which many of you will probably know (still doing small
improvements on that) and more recently, Snakelets and
Frog (my own web server and blog app).

My C/C++ skills are getting a bit rusty now because I do
almost all of my programming in Java or Python, but it's still
good enough to be able to contribute to (C)Python, I think.
My interest in contributing to Python itself was sparked by
the last bug day organized by Johannes Gijsbers. I hope to
be able to find time to contribute more often.

Well, that about sums it up, I think!


>> Does the status of 'python project member' come with
>> certain expectations that must be complied with ? ;-)
> 
> 
> There are a few conventions that are followed more
> or less stringently. You should be aware of the
> things in the developer FAQ,
> 
> http://www.python.org/dev/devfaq.html
> 
> Initially, "new" developers should follow a
> "write-after-approval" procedure, i.e. they should not
> commit anything until they got somebody's approval.
> Later, we commit things which we feel confident about,
> and post other things to SF.

I don't think I will be committing stuff any time soon.
But thanks for mentioning.


Bye

-Irmen de Jong
From martin at v.loewis.de  Thu Jan 20 19:07:23 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Jan 20 19:07:19 2005
Subject: [Python-Dev] a bunch of Patch reviews
In-Reply-To: <1f7befae05012009444ad4c822@mail.gmail.com>
References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl>	
	<41ED8B0A.7050201@v.loewis.de> <41ED9B02.2040908@xs4all.nl>	
	<41EFDADD.10701@v.loewis.de>	
	<1f7befae050120085919bdfd2f@mail.gmail.com>	
	<41EFEA6F.7010600@v.loewis.de>
	<1f7befae05012009444ad4c822@mail.gmail.com>
Message-ID: <41EFF35B.3020706@v.loewis.de>

Tim Peters wrote:
> My experience disagrees, and I gave a specific example from just the
> last day.  High-level coverage of the important bits is served (and
> served well) by Andrew's "What's New in Python" doc.  (Although I'll
> note that when I did releases, I tried to sort section contents in
> NEWS, to put the more important items at the top.)

I long ago stopped reading the NEWS file, because it is just too
much text. However, if it is desirable to list any change in the
NEWS file, I'm willing to comply.

Regards,
Martin
From python at rcn.com  Fri Jan 21 00:21:17 2005
From: python at rcn.com (Raymond Hettinger)
Date: Fri Jan 21 00:24:48 2005
Subject: [Python-Dev] Updated Monkey Typing pre-PEP
In-Reply-To: <ca471dc2050120030716adfb0@mail.gmail.com>
Message-ID: <001d01c4ff46$bec52040$793ec797@oemcomputer>

[Guido van Rossum]
> There's one other problem that Phillip tries to tackle in his
> proposal: how to implement the "rich" version of an interface if all
> you've got is a partial implementation (e.g. you might have readline()
> but you need readlines()). I think this problem is worthy of a
> solution, but I think the solution could be found, again, in a
> traditional adapter class. Here's a sketch::
> 
> class RichFile:
>     def __init__(self, ref):
>         self.__ref = ref
>         if not hasattr(ref, 'readlines'):
>             # Other forms of this magic are conceivable
>             self.readlines = self.__readlines
>     def __readlines(self):  # Ignoring the rarely used optional argument
>         # It's tempting to use [line for line in self.__ref] here,
>         # but that doesn't use readline()
>         lines = []
>         while True:
>             line = self.__ref.readline()
>             if not line:
>                 break
>             lines.append(line)
>         return lines
>     def __getattr__(self, name):
>         # Delegate all other attributes to the underlying object
>         return getattr(self.__ref, name)

Instead of a __getattr__ solution, I recommend subclassing from a mixin:

    class RichMap(SomePartialMapping, UserDict.DictMixin): pass

    class RichFile(SomePartialFileClass, Mixins.FileMixin): pass
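[A hedged sketch of the mixin approach Raymond suggests; the class names here are illustrative, not a real stdlib API. The mixin supplies the rich method in terms of a primitive the partial class already has, so no `__getattr__` delegation is needed:

```python
class FileReadMixin:
    """Supplies readlines() in terms of the subclass's readline()."""
    def readlines(self):
        lines = []
        while True:
            line = self.readline()
            if not line:
                break
            lines.append(line)
        return lines

class PartialFile:
    """Has readline() but not readlines()."""
    def __init__(self, lines):
        self._lines = list(lines)
    def readline(self):
        return self._lines.pop(0) if self._lines else ''

class RichFile(PartialFile, FileReadMixin):
    pass
```

`RichFile(['a\n', 'b\n']).readlines()` then yields `['a\n', 'b\n']`. -- ed.]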



Raymond

From abo at minkirri.apana.org.au  Fri Jan 21 00:56:00 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Fri Jan 21 00:56:39 2005
Subject: [Python-Dev] Strange segfault in Python threads and linux
	kernel 2.6
In-Reply-To: <2mllaob356.fsf@starship.python.net>
References: <1106111769.3822.52.camel@schizo>
	<2mpt01bkvs.fsf@starship.python.net> <1106185423.3784.26.camel@schizo>
	<2mllaob356.fsf@starship.python.net>
Message-ID: <1106265360.1537.24.camel@schizo>

On Thu, 2005-01-20 at 14:12 +0000, Michael Hudson wrote:
> Donovan Baarda <abo@minkirri.apana.org.au> writes:
> 
> > On Wed, 2005-01-19 at 13:37 +0000, Michael Hudson wrote:
> >> Donovan Baarda <abo@minkirri.apana.org.au> writes:
[...]
> >> The main oddness about python threads (before 2.3) is that they run
> >> with all signals masked.  You could play with a C wrapper (call
> >> setprocmask, then exec fop) to see if this is what is causing the
> >> problem.  But please try 2.4.
> >
> > Python 2.4 does indeed fix the problem. 
> 
> That's good to hear.
[...]

I still don't understand what Linux 2.4 vs Linux 2.6 had to do with it.
Reading the man pages for execve(), pthread_sigmask() and sigprocmask(),
I can see some ambiguities, but mostly only if you do things they warn
against (ie, use sigprocmask() instead of pthread_sigmask() in a
multi-threaded app).

The man page for execve() says that the new process will inherit the
"Process signal mask (see sigprocmask() )". This implies to me it will
inherit the mask from the main process, not the thread's signal mask.

It looks like Linux 2.4 uses the signal mask of the main thread or
process for the execve(), whereas Linux 2.6 uses the thread's signal
mask. Given that execve() replaces the whole process, including all
threads, I dunno if using the thread's mask is right. Could this be a
Linux 2.6 kernel bug?

> > I'm not sure what the correct behaviour should be. The fact that it
> > works in python2.4 feels more like a byproduct of the thread mask change
> > than correct behaviour. 
> 
> Well, getting rid of the thread mask changes was one of the goals of
> the change.

I gathered that... which kinda means the fact that it fixed execvp in
threads is a side effect...(though I also guess it fixed a lot of other
things like this too).

> > To me it seems like execvp() should be setting the signal mask back
> > to defaults or at least the mask of the main process before doing
> > the exec.
> 
> Possibly.  I think the 2.4 change -- not fiddling the process mask at
> all -- is the Right Thing, but that doesn't help 2.3 users.  This has
> all been discussed before at some length, on python-dev and in various
> bug reports on SF.

Would a simple bug-fix for 2.3 be to have os.execvp() set the mask to
something sane before executing C execvp()? Given that Python does not
have any visibility of the procmask...

This might be a good idea regardless as it will protect against this bug
resurfacing in the future if someone decides fiddling with the mask for
threads is a good idea again.
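[The fix being discussed — resetting the signal mask before the exec — can be sketched in modern Python, where `signal.pthread_sigmask` (Python 3.3+, Unix only) exposes exactly what Michael's old patch exposed. In the 2.3 era this required a C extension wrapping sigprocmask, as noted above; the wrapper name below is illustrative:

```python
import os
import signal

def execvp_with_clean_mask(file, args):
    # Reset the process signal mask to empty before exec'ing, so the
    # new program does not inherit blocked signals from a thread.
    signal.pthread_sigmask(signal.SIG_SETMASK, set())
    os.execvp(file, args)
```

-- ed.]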

> In your situation, I think the simplest thing you can do is dig out an
> old patch of mine that exposes sigprocmask + co to Python and either
> make a custom Python incorporating the patch and use that, or put the
> code from the patch into an extension module.  Then before execing
> fop, use the new code to set the signal mask to something sane.  Not
> pretty, particularly, but it should work.

The extension module that exposes sigprocmask() is probably best for
now...

-- 
Donovan Baarda <abo@minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/

From pycon at python.org  Fri Jan 21 01:06:05 2005
From: pycon at python.org (Steve Holden)
Date: Fri Jan 21 01:06:06 2005
Subject: [Python-Dev] PyCon Preliminary Program Announced!
Message-ID: <20050121000605.AAC3C1E400F@bag.python.org>


Dear Python Colleague:

You will be happy to know that the PyCon Program
Committee, after lengthy deliberations, has now
finalized the program for PyCon DC 2005. I can tell
you that the decision-making was very difficult, as
the standard of submissions was even higher than
last year.

You can see the preliminary program at

  http://www.python.org/pycon/2005/schedule.html

and it's obvious that this year's PyCon is going
to be even fuller than the last.

One innovation is that there will be activities of
a more social nature on the Wednesday (and perhaps
the Thursday) evening, as well as keynote speeches
from Guido and two other luminaries.

Remember that the early bird registration rates end
in just over a week, so hurry on down to

  http://www.python.org/pycon/2005/register.html

to be sure of your place in what will surely be the
premier Python event of the year.

As always, I would appreciate your help in getting
the word out. Please forward this message to your
favorite mailing lists and newsgroups to make sure
that everyone has a chance to join in the fun!


regards
Steve Holden
Chairman, PyCon DC 2005
-- 
PyCon DC 2005: The third Python Community Conference
http://www.pycon.org/   http://www.python.org/pycon/
The scoop on Python implementations and applications
From noamraph at gmail.com  Fri Jan 21 01:50:01 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Fri Jan 21 01:50:04 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <ca471dc20501162212446e63b5@mail.gmail.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
Message-ID: <b348a08505012016505c5224f6@mail.gmail.com>

Hello,

I would like to add here another small thing which I encountered this
week, and seems to follow the same logic as does Guido's proposal.

It's about staticmethods. I was writing a class, and its
pretty-printing method got a function for converting a value to a
string as an argument. I wanted to supply a default function. I
thought that it should be in the namespace of the class, since its
main use lies there. So I made it a staticmethod.

But - alas! After I declared the function a staticmethod, I couldn't
make it a default argument for the method, since there's nothing to do
with staticmethod instances.

The minor solution for this is to make staticmethod objects callable.
This would solve my problem. But I suggest a further step: I suggest
that if this is done, it would be nice if classname.staticmethodname
would return the classmethod instance, instead of the function itself.
I know that this seems to contradict Guido's proposal, since he
suggests returning the function instead of a strange object, and I
suggest returning a strange object instead of a function. But this is
not true; both follow the idea that class attributes should
be, when possible, the same objects that were created when defining
the class. This is more consistent with the behaviour of modules
(module attributes are the objects that were created when the code was
run), and is more consistent with the general convention, that running
A = B
causes
A == B
to be true. Currently, Class.func = staticmethod(func), and Class.func
= func, don't behave by this rule. If the suggestions are accepted,
both will.

I just think it's simpler and cleaner that way. Just making
staticmethods callable would solve my practical problem too.
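[A small illustration of the discrepancy Noam describes, using behavior as in modern CPython:

```python
def func():
    return 42

class Class:
    func = staticmethod(func)

# What was assigned in the class body is a staticmethod object...
assert isinstance(Class.__dict__['func'], staticmethod)
# ...but attribute access returns the plain function, so the
# "A = B implies A == B" rule does not hold for Class.func here.
assert Class.func is func
```

-- ed.]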

Noam Raphael
From gvanrossum at gmail.com  Fri Jan 21 05:20:24 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Jan 21 05:20:26 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <b348a08505012016505c5224f6@mail.gmail.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<b348a08505012016505c5224f6@mail.gmail.com>
Message-ID: <ca471dc20501202020215863da@mail.gmail.com>

> It's about staticmethods. I was writing a class, and its
> pretty-printing method got a function for converting a value to a
> string as an argument. I wanted to supply a default function. I
> thought that it should be in the namespace of the class, since its
> main use lies there. So I made it a staticmethod.
> 
> But - alas! After I declared the function a staticmethod, I couldn't
> make it a default argument for the method, since there's nothing to do
> with staticmethod instances.
> 
> The minor solution for this is to make staticmethod objects callable.
> This would solve my problem. But I suggest a further step: I suggest
> that if this is done, it would be nice if classname.staticmethodname
> would return the classmethod instance, instead of the function itself.
> I know that this seems to contradict Guido's proposal, since he
> suggests returning the function instead of a strange object, and I
> suggest returning a strange object instead of a function. But this is
> not true; both follow the idea that class attributes should
> be, when possible, the same objects that were created when defining
> the class. This is more consistent with the behaviour of modules
> (module attributes are the objects that were created when the code was
> run), and is more consistent with the general convention, that running
> A = B
> causes
> A == B
> to be true. Currently, Class.func = staticmethod(func), and Class.func
> = func, don't behave by this rule. If the suggestions are accepted,
> both will.

Well, given that attribute assignment can be overloaded, you can't
depend on that requirement all the time.

> I just think it's simpler and cleaner that way. Just making
> staticmethods callable would solve my practical problem too.

The use case is fairly uncommon (though not invalid!), and making
staticmethod callable would add more code without much benefit. I
recommend that you work around it by setting the default to None and
substituting the real default in the function.
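[Guido's suggested workaround, sketched with illustrative names:

```python
class Pretty:
    @staticmethod
    def default_convert(value):
        return repr(value)

    def format(self, value, convert=None):
        # Default the argument to None and substitute the real
        # default inside the function body.
        if convert is None:
            convert = self.default_convert
        return convert(value)
```

`Pretty().format(3)` then uses the staticmethod and returns `'3'`, while a caller can still pass `convert=str` or any other callable. -- ed.]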

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From gvanrossum at gmail.com  Fri Jan 21 05:21:20 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Jan 21 05:21:22 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <20050120123826.GA30873@vicky.ecs.soton.ac.uk>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<1105945019.30052.26.camel@localhost>
	<ca471dc2050117074347baa60d@mail.gmail.com>
	<41EC431A.90204@egenix.com>
	<ca471dc20501172115275a99c9@mail.gmail.com>
	<41ECD6D9.9000001@egenix.com>
	<20050120123826.GA30873@vicky.ecs.soton.ac.uk>
Message-ID: <ca471dc205012020211e89fd76@mail.gmail.com>

> Removing unbound methods also breaks the 'py' lib quite a bit.  The 'py.test'
> framework handles function and bound/unbound method objects all over the
> place, and uses introspection on them, as they are the objects defining the
> tests to run.

OK, I'm convinced. Taking away im_class is going to break too much
code. I hereby retract the patch.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From gvanrossum at gmail.com  Fri Jan 21 05:27:57 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Jan 21 05:28:03 2005
Subject: [Python-Dev] Updated Monkey Typing pre-PEP
In-Reply-To: <5.1.1.6.0.20050120101741.0405fb30@mail.telecommunity.com>
References: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com>
	<ca471dc2050120030716adfb0@mail.gmail.com>
	<5.1.1.6.0.20050120101741.0405fb30@mail.telecommunity.com>
Message-ID: <ca471dc2050120202762d3689b@mail.gmail.com>

Phillip, it looks like you're not going to give up. :) I really don't
want to accept your proposal into core Python, but I think you ought
to be able to implement everything you propose as part of PEAK (or
whatever other framework).

Therefore, rather than continuing to argue over the merits of your
proposal, I'd like to focus on what needs to be done so you can
implement it. The basic environment you can assume: an adaptation
module according to PEP 246, type declarations according to my latest
blog (configurable per module or per class by defining __typecheck__,
but defaulting to something conservative that either returns the
original object or raises an exception).

What do you need then?

[My plane is about to leave, gotta run!]

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From mdehoon at ims.u-tokyo.ac.jp  Fri Jan 21 06:38:50 2005
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Fri Jan 21 06:34:58 2005
Subject: [Python-Dev] Patch review [ 1093585 ] sanity check for readline
	remove/replace
Message-ID: <41F0956A.4010606@ims.u-tokyo.ac.jp>

Patch review [ 1093585 ] sanity check for readline remove/replace

The functions remove_history_item and replace_history_item in the readline
module respectively remove and replace an item in the history of commands. As
outlined in bug [ 1086603 ], both functions cause a segmentation fault if the
item index is negative. This is actually a bug in the corresponding functions in
readline, which return a NULL pointer if the item index is larger than the size
of the history, but do not check for the item index being negative. I sent a
patch to bug-readline@gnu.org, so this will probably be fixed in future versions
of readline. But for now, we need a workaround in Python.

The patched code checks if the item index is negative, and issues an error
message if so. I have run the test suite after applying this patch, and I
found no problems with it.

Note that there is one more way to fix this bug, which is to interpret negative
indices as counting from the end (as lists and strings do, for example). So 
remove_history_item(-1) removes the last item added to the history, etc. In that 
case, get_history_item should change as well. Right now get_history_item(-1) 
returns None, so the patch introduces a small (and probably insignificant) 
inconsistency: get_history_item(-1) returns None but remove_history_item(-1) 
raises an error.
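
The counting-from-the-end interpretation would amount to normalizing the
index the way sequences do; a hypothetical sketch in Python (the helper
name is invented, the real fix would of course live in the C module):

```python
def normalize_history_index(i, history_length):
    # interpret negative indices as counting from the end,
    # the way lists and strings do
    if i < 0:
        i += history_length
    if not 0 <= i < history_length:
        raise ValueError("history index out of range")
    return i
```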

--Michiel.


-- 
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon



From stuart at stuartbishop.net  Fri Jan 21 08:18:59 2005
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Fri Jan 21 08:19:12 2005
Subject: [Python-Dev] Unix line endings required for PyRun* breaking
	embedded Python
In-Reply-To: <r01050400-1037-17198DD56AC011D98C3B003065D5E7E4@[10.0.0.23]>
References: <r01050400-1037-17198DD56AC011D98C3B003065D5E7E4@[10.0.0.23]>
Message-ID: <41F0ACE3.8030002@stuartbishop.net>

Just van Rossum wrote:
> Skip Montanaro wrote:
> 
> 
>>Just re.sub("[\r\n]+", "\n", s) and I think you're good to go.
> 
> 
> I don't think that in general you want to fold multiple empty lines into
> one. This would be my preferred regex:
> 
>     s = re.sub(r"\r\n?", "\n", s)
> 
> Catches both DOS and old-style Mac line endings. Alternatively, you can
> use s.splitlines():
> 
>     s = "\n".join(s.splitlines()) + "\n"
> 
> This also makes sure the string ends with a \n, which may or may not be
> a good thing, depending on your application.
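
For reference, a quick illustration of how the quoted suggestions differ
on runs of blank lines:

```python
import re

s = "one\r\n\r\ntwo\rthree\n"

# Skip's pattern folds the blank line away:
assert re.sub("[\r\n]+", "\n", s) == "one\ntwo\nthree\n"

# Just's alternatives preserve it:
assert re.sub(r"\r\n?", "\n", s) == "one\n\ntwo\nthree\n"
assert "\n".join(s.splitlines()) + "\n" == "one\n\ntwo\nthree\n"
```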

Do people consider this a bug that should be fixed in Python 2.4.1 and 
Python 2.3.6 (if it ever exists), or is the responsibility for doing this 
transformation on the application that embeds Python?

-- 
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/
From fredrik at pythonware.com  Fri Jan 21 08:27:39 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri Jan 21 08:27:31 2005
Subject: [Python-Dev] Re: Unix line endings required for PyRun*
	breakingembedded Python
References: <r01050400-1037-17198DD56AC011D98C3B003065D5E7E4@[10.0.0.23]>
	<41F0ACE3.8030002@stuartbishop.net>
Message-ID: <csqast$6kj$1@sea.gmane.org>

Stuart Bishop wrote:

> Do people consider this a bug that should be fixed in Python 2.4.1 and Python 2.3.6 (if it ever 
> exists), or is the responsibility for doing this transformation on the application that embeds 
> Python?

the text you quoted is pretty clear on this:

    It is envisioned that such strings always have the
    standard \n line feed, if the strings come from a file that file can
    be read with universal newlines.

just add the fix, already  (you don't want plpythonu to depend on a future
release anyway)

</F>



From noamraph at gmail.com  Fri Jan 21 08:59:47 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Fri Jan 21 08:59:50 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <ca471dc20501202020215863da@mail.gmail.com>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<b348a08505012016505c5224f6@mail.gmail.com>
	<ca471dc20501202020215863da@mail.gmail.com>
Message-ID: <b348a085050120235937e868bf@mail.gmail.com>

> > and is more consistent with the general convention, that running
> > A = B
> > causes
> > A == B
> > to be true. Currently, Class.func = staticmethod(func), and Class.func
> > = func, don't behave by this rule. If the suggestions are accepted,
> > both will.
> 
> Well, given that attribute assignment can be overloaded, you can't
> depend on that requirement all the time.
> 
Yes, I know. For example, I don't know how you can make this work for
classmethods. (Although I suspect that if nested scopes included
classes, and there were a way to assign names to a different scope,
there would be no need for them. But I have no idea how this could
be done, so never mind.)

I just think of it as a very common convention, and I don't find the
exceptions "aesthetically pleasing". But of course, I accept practical
reasons for not making it that way.

> I recommend that you work around it by setting the default to None and
> substituting the real default in the function.

That's a good idea, I will probably use it. (I thought of a different
way: don't use decorators, and wrap the function in a staticmethod
after defining the function that uses it. But this is really ugly.)

Thanks for your reply,
Noam
From walter at livinglogic.de  Fri Jan 21 13:10:03 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri Jan 21 13:10:06 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <41EE4797.6030105@egenix.com>
References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com>
	<41EE2B1E.8030209@livinglogic.de> <41EE4797.6030105@egenix.com>
Message-ID: <41F0F11B.8000600@livinglogic.de>

M.-A. Lemburg wrote:

 > [...]
> __str__ and __unicode__ as well as the other hooks were
> specifically added for the type constructors to use.
> However, these were added at a time where sub-classing
> of types was not possible, so it's time now to reconsider
> whether this functionality should be extended to sub-classes
> as well.

So can we reach consensus on this, or do we need a
BDFL pronouncement?

Bye,
    Walter Dörwald
From Jack.Jansen at cwi.nl  Fri Jan 21 13:36:55 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Fri Jan 21 13:36:04 2005
Subject: [Python-Dev] Updated Monkey Typing pre-PEP
In-Reply-To: <ca471dc2050120030716adfb0@mail.gmail.com>
References: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com>
	<ca471dc2050120030716adfb0@mail.gmail.com>
Message-ID: <2268316A-6BA9-11D9-88A6-000A958D1666@cwi.nl>


On 20 Jan 2005, at 12:07, Guido van Rossum wrote:
> The first problem is what I'd call incomplete duck typing.

Confit de canard-typing?
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

From mwh at python.net  Fri Jan 21 13:46:41 2005
From: mwh at python.net (Michael Hudson)
Date: Fri Jan 21 13:46:43 2005
Subject: [Python-Dev] Strange segfault in Python threads and linux
	kernel 2.6
In-Reply-To: <1106265360.1537.24.camel@schizo> (Donovan Baarda's message of
	"Fri, 21 Jan 2005 10:56:00 +1100")
References: <1106111769.3822.52.camel@schizo>
	<2mpt01bkvs.fsf@starship.python.net> <1106185423.3784.26.camel@schizo>
	<2mllaob356.fsf@starship.python.net> <1106265360.1537.24.camel@schizo>
Message-ID: <2m651rar0u.fsf@starship.python.net>

Donovan Baarda <abo@minkirri.apana.org.au> writes:

> On Thu, 2005-01-20 at 14:12 +0000, Michael Hudson wrote:
>> Donovan Baarda <abo@minkirri.apana.org.au> writes:
>> 
>> > On Wed, 2005-01-19 at 13:37 +0000, Michael Hudson wrote:
>> >> Donovan Baarda <abo@minkirri.apana.org.au> writes:
> [...]
>> >> The main oddness about python threads (before 2.3) is that they run
>> >> with all signals masked.  You could play with a C wrapper (call
>> >> setprocmask, then exec fop) to see if this is what is causing the
>> >> problem.  But please try 2.4.
>> >
>> > Python 2.4 does indeed fix the problem. 
>> 
>> That's good to hear.
> [...]
>
> I still don't understand what Linux 2.4 vs Linux 2.6 had to do with
> it.

I have to admit I'm not that surprised that the behaviour appears
somewhat inexplicable.

As you probably know, linux 2.6 has a more-or-less entirely different
threads implementation (NPTL) than 2.4 (LinuxThreads) -- so changes in
behaviour aren't exactly surprising.  Whether they were intentional, a
good thing, etc, I have a careful lack of opinion :)

> Reading the man pages for execve(), pthread_sigmask() and sigprocmask(),
> I can see some ambiguities, but mostly only if you do things they warn
> against (ie, use sigprocmask() instead of pthread_sigmask() in a
> multi-threaded app).

Uh, I don't know how much I'd trust documentation in this situation.
Really.

Threads and signals are almost inherently incompatible, unfortunately.

> The man page for execve() says that the new process will inherit the
> "Process signal mask (see sigprocmask() )". This implies to me it will
> inherit the mask from the main process, not the thread's signal mask.

Um.  Maybe.  But this is the sort of thing I meant above -- if signals
are delivered to threads, not processes, what does the "Process signal
mask" mean?  The signal mask of the thread that executed main()?  I
guess you could argue that, but I don't know how much I'd bet on it.

> It looks like Linux 2.4 uses the signal mask of the main thread or
> process for the execve(), whereas Linux 2.6 uses the thread's signal
> mask.

I'm not sure that this is the case -- I'm reasonably sure I saw
problems caused by the signal masks before 2.6 was ever released.  But
I could be wrong.

> Given that execve() replaces the whole process, including all
> threads, I dunno if using the thread's mask is right. Could this be
> a Linux 2.6 kernel bug?

You could ask, certainly...

Although I've done a certain amount of battle with these problems, I
don't know what any published standards have to say about these things,
which is the only real criterion by which it could be called "a bug".

>> > I'm not sure what the correct behaviour should be. The fact that it
>> > works in python2.4 feels more like a byproduct of the thread mask change
>> > than correct behaviour. 
>> 
>> Well, getting rid of the thread mask changes was one of the goals of
>> the change.
>
> I gathered that... which kinda means the fact that it fixed execvp in
> threads is a side effect...(though I also guess it fixed a lot of other
> things like this too).

Um.  I meant "getting rid of the thread mask" was one of the goals
*because* it would fix the problems with execve and system() and
friends.

>> > To me it seems like execvp() should be setting the signal mask back
>> > to defaults or at least the mask of the main process before doing
>> > the exec.
>> 
>> Possibly.  I think the 2.4 change -- not fiddling the process mask at
>> all -- is the Right Thing, but that doesn't help 2.3 users.  This has
>> all been discussed before at some length, on python-dev and in various
>> bug reports on SF.
>
> Would a simple bug-fix for 2.3 be to have os.execvp() set the mask to
> something sane before executing C execvp()?

Perhaps.  I'm not sure I want to go fiddling there.  Maybe someone
else does.  system(3) presents a problem too, though, which is harder
to work around unless we want to implement it ourselves, in practice.

> Given that Python does not have any visibility of the procmask...
>
> This might be a good idea regardless as it will protect against this bug
> resurfacing in the future if someone decides fiddling with the mask for
> threads is a good idea again.

In the long run, everyone will use 2.4.  There are some other details
to the changes in 2.4 that have a slight chance of breaking programs
which is why I'm uneasy about putting them in 2.3.5 -- for a bug fix
release it's much much worse to break a program that was working than
to fail to fix one that wasn't.

>> In your situation, I think the simplest thing you can do is dig out an
>> old patch of mine that exposes sigprocmask + co to Python and either
>> make a custom Python incorporating the patch and use that, or put the
>> code from the patch into an extension module.  Then before execing
>> fop, use the new code to set the signal mask to something sane.  Not
>> pretty, particularly, but it should work.
>
> The extension module that exposes sigprocmask() is probably best for
> now...
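
For the record, what such a module would let you do, sketched in modern
terms (signal.pthread_sigmask only appeared much later, in Python 3.3,
so in 2.3 this really does require the patch or an extension module):

```python
import signal

def clear_signal_mask():
    # Unblock every signal in the calling thread and return the old
    # mask. Calling this just before os.execvp() keeps the exec'd
    # program from inheriting an all-blocked signal mask from a
    # (pre-2.4) Python thread.
    return signal.pthread_sigmask(signal.SIG_SETMASK, set())
```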

I hope it helps!

Cheers,
mwh

-- 
  <etrepum> Jokes around here tend to get followed by implementations.
                                                -- from Twisted.Quotes
From Jack.Jansen at cwi.nl  Fri Jan 21 13:44:22 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Fri Jan 21 13:48:32 2005
Subject: [Python-Dev] Unix line endings required for PyRun* breaking
	embedded Python
In-Reply-To: <41F0ACE3.8030002@stuartbishop.net>
References: <r01050400-1037-17198DD56AC011D98C3B003065D5E7E4@[10.0.0.23]>
	<41F0ACE3.8030002@stuartbishop.net>
Message-ID: <2C9A8DCC-6BAA-11D9-88A6-000A958D1666@cwi.nl>


On 21 Jan 2005, at 08:18, Stuart Bishop wrote:

> Just van Rossum wrote:
>> Skip Montanaro wrote:
>>> Just re.sub("[\r\n]+", "\n", s) and I think you're good to go.
>> I don't think that in general you want to fold multiple empty lines 
>> into
>> one. This would be my preferred regex:
>>     s = re.sub(r"\r\n?", "\n", s)
>> Catches both DOS and old-style Mac line endings. Alternatively, you 
>> can
>> use s.splitlines():
>>     s = "\n".join(s.splitlines()) + "\n"
>> This also makes sure the string ends with a \n, which may or may not 
>> be
>> a good thing, depending on your application.
>
> Do people consider this a bug that should be fixed in Python 2.4.1 and 
> Python 2.3.6 (if it ever exists), or is the responsibility for doing 
> this transformation on the application that embeds Python?

It could theoretically break something: a program that uses unix 
line-endings but embeds \r or \r\n in string data.

But this is rather theoretical; I don't think I'd have a problem with 
fixing this. The real problem is who will fix it, because the fix 
isn't going to be as trivial as the Python code posted here, I'm 
afraid...
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

From ncoghlan at iinet.net.au  Fri Jan 21 14:02:18 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Fri Jan 21 14:02:21 2005
Subject: [Python-Dev] PEP 246 - concrete assistance to developers of new
	adapter classes
Message-ID: <41F0FD5A.1000206@iinet.net.au>

Phillip's monkey-typing PEP (and his goal of making it easy to write well 
behaved adapters) got me wondering about the benefits of providing an 
adaptation.Adapter class that could be used to reduce the boilerplate required 
when developing new adapters. Inheriting from it wouldn't be *required* in any 
way - doing so would simply make it easier to write a good adapter by 
eliminating or simplifying some of the required code. Being written in Python, 
it could also serve as good documentation of recommended adapter behaviour.

For instance, it could by default preserve a reference to the original object 
and use that for any further adaptation requests:

class Adapter(object):
    def __init__(self, original):
        self.original = original

    def __conform__(self, protocol):
        return adapt(self.original, protocol)
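
To make the idea concrete, here is a toy sketch; the adapt() below is a
simplified stand-in for the PEP 246 machinery, not its real
implementation:

```python
def adapt(obj, protocol):
    # toy stand-in for PEP 246 adapt(): ask the object to conform,
    # then fall back to an isinstance check
    conform = getattr(obj, '__conform__', None)
    if conform is not None:
        result = conform(protocol)
        if result is not None:
            return result
    if isinstance(obj, protocol):
        return obj
    raise TypeError("can't adapt %r to %r" % (obj, protocol))

class Adapter(object):
    def __init__(self, original):
        self.original = original

    def __conform__(self, protocol):
        # further adaptation requests route back to the original object
        return adapt(self.original, protocol)

# asking an Adapter instance for a protocol the *original* satisfies
# hands back the original object:
s = "hello"
assert adapt(Adapter(s), str) is s
```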

Does anyone else (particularly those with PEAK and Zope interface experience) 
think such a class would be beneficial in encouraging good practices?

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From bob at redivi.com  Fri Jan 21 14:07:40 2005
From: bob at redivi.com (Bob Ippolito)
Date: Fri Jan 21 14:07:51 2005
Subject: [Python-Dev] Unix line endings required for PyRun* breaking
	embedded Python
In-Reply-To: <2C9A8DCC-6BAA-11D9-88A6-000A958D1666@cwi.nl>
References: <r01050400-1037-17198DD56AC011D98C3B003065D5E7E4@[10.0.0.23]>
	<41F0ACE3.8030002@stuartbishop.net>
	<2C9A8DCC-6BAA-11D9-88A6-000A958D1666@cwi.nl>
Message-ID: <6E57C958-6BAD-11D9-B12E-000A95BA5446@redivi.com>


On Jan 21, 2005, at 7:44, Jack Jansen wrote:

>
> On 21 Jan 2005, at 08:18, Stuart Bishop wrote:
>
>> Just van Rossum wrote:
>>> Skip Montanaro wrote:
>>>> Just re.sub("[\r\n]+", "\n", s) and I think you're good to go.
>>> I don't think that in general you want to fold multiple empty lines 
>>> into
>>> one. This would be my preferred regex:
>>>     s = re.sub(r"\r\n?", "\n", s)
>>> Catches both DOS and old-style Mac line endings. Alternatively, you 
>>> can
>>> use s.splitlines():
>>>     s = "\n".join(s.splitlines()) + "\n"
>>> This also makes sure the string ends with a \n, which may or may not 
>>> be
>>> a good thing, depending on your application.
>>
>> Do people consider this a bug that should be fixed in Python 2.4.1 
>> and Python 2.3.6 (if it ever exists), or is the responsibility for 
>> doing this transformation on the application that embeds Python?
>
> It could theoretically break something: a program that uses unix 
> line-endings but embeds \r or \r\n in string data.
>
> But this is rather theoretical, I don't think I'd have a problem with 
> fixing this. The real problem is: who will fix it, because the fix 
> isn't going to be as trivial as the Python code posted here, I'm 
> afraid...

Well, Python already does the right thing in Py_Main, but it does not 
do the right thing from the other places you can use to run code; 
surely it can't be that hard to fix if the code is already there?

-bob



From aleax at aleax.it  Fri Jan 21 14:29:04 2005
From: aleax at aleax.it (Alex Martelli)
Date: Fri Jan 21 14:29:12 2005
Subject: [Python-Dev] PEP 246 - concrete assistance to developers of new
	adapter classes
In-Reply-To: <41F0FD5A.1000206@iinet.net.au>
References: <41F0FD5A.1000206@iinet.net.au>
Message-ID: <6B31ACEE-6BB0-11D9-9DED-000A95EFAE9E@aleax.it>


On 2005 Jan 21, at 14:02, Nick Coghlan wrote:

> Phillip's monkey-typing PEP (and his goal of making it easy to write 
> well behaved adapters) got me wondering about the benefits of 
> providing an adaptation.Adapter class that could be used to reduce 
> the boilerplate required when developing new adapters. Inheriting 
> from it wouldn't be *required* in any way - doing so would simply make 
> it easier to write a good adapter by eliminating or simplifying some 
> of the required code. Being written in Python, it could also serve as 
> good documentation of recommended adapter behaviour.
>
> For instance, it could by default preserve a reference to the original 
> object and use that for any further adaptation requests:
>
> class Adapter(object):
>   def __init__(self, original):
>     self.original = original
>
>   def __conform__(self, protocol):
>     return adapt(self.original, protocol)
>
> Does anyone else (particularly those with PEAK and Zope interface 
> experience) think such a class would be beneficial in encouraging good 
> practices?

Yes, there was something just like that in Nevow (pre-move to the zope 
interfaces) and it sure didn't hurt.


Alex

From Jack.Jansen at cwi.nl  Sat Jan 22 21:50:09 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Sat Jan 22 21:49:24 2005
Subject: [Python-Dev] Unix line endings required for PyRun* breaking
	embedded Python
In-Reply-To: <6E57C958-6BAD-11D9-B12E-000A95BA5446@redivi.com>
References: <r01050400-1037-17198DD56AC011D98C3B003065D5E7E4@[10.0.0.23]>
	<41F0ACE3.8030002@stuartbishop.net>
	<2C9A8DCC-6BAA-11D9-88A6-000A958D1666@cwi.nl>
	<6E57C958-6BAD-11D9-B12E-000A95BA5446@redivi.com>
Message-ID: <341299DE-6CB7-11D9-82B4-000D934FF6B4@cwi.nl>


On 21-jan-05, at 14:07, Bob Ippolito wrote:

>
> On Jan 21, 2005, at 7:44, Jack Jansen wrote:
>
>>
>> On 21 Jan 2005, at 08:18, Stuart Bishop wrote:
>>
>>> Just van Rossum wrote:
>>>> Skip Montanaro wrote:
>>>>> Just re.sub("[\r\n]+", "\n", s) and I think you're good to go.
>>>> I don't think that in general you want to fold multiple empty lines 
>>>> into
>>>> one. This would be my preferred regex:
>>>>     s = re.sub(r"\r\n?", "\n", s)
>>>> Catches both DOS and old-style Mac line endings. Alternatively, you 
>>>> can
>>>> use s.splitlines():
>>>>     s = "\n".join(s.splitlines()) + "\n"
>>>> This also makes sure the string ends with a \n, which may or may 
>>>> not be
>>>> a good thing, depending on your application.
>>>
>>> Do people consider this a bug that should be fixed in Python 2.4.1 
>>> and Python 2.3.6 (if it ever exists), or is the responsibility for 
>>> doing this transformation on the application that embeds Python?
>>
>> It could theoretically break something: a program that uses unix 
>> line-endings but embeds \r or \r\n in string data.
>>
>> But this is rather theoretical, I don't think I'd have a problem with 
>> fixing this. The real problem is: who will fix it, because the fix 
>> isn't going to be as trivial as the Python code posted here, I'm 
>> afraid...
>
> Well, Python already does the right thing in Py_Main, but it does not 
> do the right thing from the other places you can use to run code, 
> surely it can't be that hard to fix if the code is already there?

IIRC the universal newline support is in the file I/O routines, which I 
assume aren't used when you execute Python code from a string.
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

From mal at egenix.com  Sun Jan 23 15:27:59 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun Jan 23 15:28:11 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <41F0F11B.8000600@livinglogic.de>
References: <41ED25C6.80603@livinglogic.de>
	<41ED499A.1050206@egenix.com>	<41EE2B1E.8030209@livinglogic.de>
	<41EE4797.6030105@egenix.com> <41F0F11B.8000600@livinglogic.de>
Message-ID: <41F3B46F.5040205@egenix.com>

Walter Dörwald wrote:
> M.-A. Lemburg wrote:
> 
>  > [...]
> 
>> __str__ and __unicode__ as well as the other hooks were
>> specifically added for the type constructors to use.
>> However, these were added at a time where sub-classing
>> of types was not possible, so it's time now to reconsider
>> whether this functionality should be extended to sub-classes
>> as well.
> 
> 
> So can we reach consensus on this, or do we need a
> BDFL pronouncement?

I don't have a clear picture of what the consensus currently
looks like :-)

If we're going for a solution that implements the hook
awareness for all __<typename>__ hooks, I'd be +1 on that.
If we only touch the __unicode__ case, we'd only be creating
yet another special case. I'd vote -0 on that.

Another solution would be to have all type constructors
ignore the __<typename>__ hooks (which were originally
added to provide classes with a way to mimic type behavior).

In general, I think we should try to get rid of special
cases and go for a clean solution (either way).

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 23 2005)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From nnorwitz at gmail.com  Sun Jan 23 19:39:42 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun Jan 23 19:39:45 2005
Subject: [Python-Dev] Speed up function calls
Message-ID: <ee2a432c050123103956a8114f@mail.gmail.com>

I added a patch to SF:  http://python.org/sf/1107887

I would like feedback on whether the approach is desirable.

The patch adds a new method type (flags) METH_ARGS that is used in
PyMethodDef. METH_ARGS means the min and max # of arguments are
specified in the PyMethodDef by adding 2 new fields. This information
can be used in ceval to call the method. No tuple packing/unpacking
is required since the C stack is used.

The benefits are:
 * faster function calls
 * simplify function call machinery by removing METH_NOARGS, METH_O,
and possibly METH_VARARGS
 * more introspection info for C functions (ie, min/max arg count)
(not implemented)

The drawbacks are:
 * the defn of the MethodDef (# args) is separate from the function defn
 * potentially more error prone to write C methods???

I've measured a 13-22% speed improvement (debug build on
Opteron) when doing simple tests like:

     ./python ./Lib/timeit.py -v 'pow(3, 5)'

I think the difference tends to be fairly constant at about .3 usec per loop.

Here's a portion of the patch to show the difference between conventions:

-builtin_filter(PyObject *self, PyObject *args)
+builtin_filter(PyObject *self, PyObject *func, PyObject *seq)
 {
-       PyObject *func, *seq, *result, *it, *arg;
+       PyObject *result, *it, *arg;
        int len;   /* guess for result list size */
        register int j;

-       if (!PyArg_UnpackTuple(args, "filter", 2, 2, &func, &seq))
-               return NULL;
-

# there are no other changes between METH_O and METH_ARGS
-       {"abs",         builtin_abs,        METH_O, abs_doc},
+       {"abs",         builtin_abs,        METH_ARGS, abs_doc, 1, 1},

-       {"filter",      builtin_filter,     METH_VARARGS, filter_doc},
+       {"filter",      builtin_filter,     METH_ARGS, filter_doc, 2, 2},

Neal
From fredrik at pythonware.com  Sun Jan 23 20:23:09 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun Jan 23 20:23:12 2005
Subject: [Python-Dev] Re: Speed up function calls
References: <ee2a432c050123103956a8114f@mail.gmail.com>
Message-ID: <ct0tif$su6$1@sea.gmane.org>

Neal Norwitz wrote:

> The patch adds a new method type (flags) METH_ARGS that is used in
> PyMethodDef. METH_ARGS means the min and max # of arguments are
> specified in the PyMethodDef by adding 2 new fields.

> * the defn of the MethodDef (# args) is separate from the function defn
> * potentially more error prone to write C methods???

"potentially"?  sounds like a recipe for disaster.  but the patch is nice, and more
speed never hurts.  maybe it's time to write that module fixup preprocessor thing
that Guido should have written some 15 years ago... ;-)

</F> 



From kbk at shore.net  Sun Jan 23 21:15:45 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Sun Jan 23 21:16:01 2005
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200501232015.j0NKFjhi001559@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  273 open ( +1) /  2746 closed ( +9) /  3019 total (+10)
Bugs    :  797 open ( +4) /  4789 closed (+12) /  5586 total (+16)
RFE     :  166 open ( +1) /   141 closed ( +0) /   307 total ( +1)

New / Reopened Patches
______________________

fix distutils.install.dump_dirs() with negated options  (2005-01-17)
CLOSED http://python.org/sf/1103844  opened by  Wummel

Add O_SHLOCK/O_EXLOCK to posix  (2005-01-17)
       http://python.org/sf/1103951  opened by  Skip Montanaro

setup.py --help and --help-commands altered.  (2005-01-17)
       http://python.org/sf/1104111  opened by  Titus Brown

new-style exceptions  (2005-01-18)
       http://python.org/sf/1104669  opened by  Michael Hudson

misc doc typos  (2005-01-18)
CLOSED http://python.org/sf/1104868  opened by  DSM

chr, ord, unichr documentation updates  (2004-10-31)
       http://python.org/sf/1057588  reopened by  mike_j_brown

Faster commonprefix in macpath, ntpath, etc.  (2005-01-19)
       http://python.org/sf/1105730  opened by  Jimmy Retzlaff

get rid of unbound methods (mostly)  (2005-01-17)
CLOSED http://python.org/sf/1103689  opened by  Guido van Rossum

Updated "Working on Cygwin" section  (2005-01-22)
       http://python.org/sf/1107221  opened by  Alan Green

Add Thread.isActive()  (2005-01-23)
       http://python.org/sf/1107656  opened by  Alan Green

Speed up function calls/can add more introspection info  (2005-01-23)
       http://python.org/sf/1107887  opened by  Neal Norwitz

Patches Closed
______________

fix distutils.install.dump_dirs() with negated options  (2005-01-17)
       http://python.org/sf/1103844  closed by  theller

ast-branch: fix for coredump from new import grammar  (2005-01-11)
       http://python.org/sf/1100563  closed by  kbk

Shadow Password Support Module  (2002-07-10)
       http://python.org/sf/579435  closed by  loewis

misc doc typos  (2005-01-18)
       http://python.org/sf/1104868  closed by  fdrake

extending readline functionality  (2003-02-11)
       http://python.org/sf/684500  closed by  fdrake

self.button.pack() in tkinter.tex example  (2005-01-03)
       http://python.org/sf/1094815  closed by  fdrake

Clean up discussion of new C thread idiom  (2004-09-20)
       http://python.org/sf/1031233  closed by  fdrake

Description of args to IMAP4.store() in imaplib  (2004-12-12)
       http://python.org/sf/1084092  closed by  fdrake

get rid of unbound methods (mostly)  (2005-01-17)
       http://python.org/sf/1103689  closed by  gvanrossum

New / Reopened Bugs
___________________

email.base64MIME.header_encode vs RFC 1522  (2005-01-17)
       http://python.org/sf/1103926  opened by  Ucho

wishlist: os.feed_urandom(input)  (2005-01-17)
       http://python.org/sf/1104021  opened by  Zooko O'Whielacronx

configure doesn't set up CFLAGS properly  (2005-01-17)
       http://python.org/sf/1104249  opened by  Bryan O'Sullivan

Bugs in _csv module - lineterminator  (2004-11-24)
       http://python.org/sf/1072404  reopened by  fresh

Wrong expression with \w+?  (2005-01-18)
CLOSED http://python.org/sf/1104608  opened by  rengel

Bug in String rstrip method  (2005-01-18)
CLOSED http://python.org/sf/1104923  opened by  Rick Coupland

Undocumented implicit strip() in split(None) string method  (2005-01-19)
       http://python.org/sf/1105286  opened by  YoHell

Warnings in Python.h with gcc 4.0.0  (2005-01-19)
       http://python.org/sf/1105699  opened by  Bob Ippolito

incorrect constant names in curses window objects page  (2005-01-19)
       http://python.org/sf/1105706  opened by  dcrosta

null source chars handled oddly  (2005-01-19)
       http://python.org/sf/1105770  opened by  Reginald B. Charney

bug with idle's stdout when executing load_source  (2005-01-20)
       http://python.org/sf/1105950  opened by  imperialfists

os.stat int/float oddity  (2005-01-20)
CLOSED http://python.org/sf/1105998  opened by  George Yoshida

README of 2.4 source download says 2.4a3  (2005-01-20)
       http://python.org/sf/1106057  opened by  Roger Erens

semaphore errors from Python 2.3.x on AIX 5.2  (2005-01-20)
       http://python.org/sf/1106262  opened by  The Written Word

slightly easier way to debug from the exception handler  (2005-01-20)
       http://python.org/sf/1106316  opened by  Leonardo Rochael Almeida

os.makedirs() ignores mode parameter  (2005-01-21)
       http://python.org/sf/1106572  opened by  Andreas Jung

split() takes no keyword arguments  (2005-01-21)
       http://python.org/sf/1106694  opened by  Vinz

os.pathsep is wrong on Mac OS X  (2005-01-22)
CLOSED http://python.org/sf/1107258  opened by  Mac-arena the Bored Zo

Bugs Closed
___________

--without-cxx flag of configure isn't documented.  (2003-03-12)
       http://python.org/sf/702147  closed by  bcannon

presentation typo in lib: 6.21.4.2 How callbacks are called  (2004-12-22)
       http://python.org/sf/1090139  closed by  gward

rfc822 Deprecated since release 2.3?  (2005-01-15)
       http://python.org/sf/1102469  closed by  anthonybaxter

codecs.open and iterators  (2003-03-20)
       http://python.org/sf/706595  closed by  doerwalter

Wrong expression with \w+?  (2005-01-18)
       http://python.org/sf/1104608  closed by  niemeyer

Wrong expression with \w+?  (2005-01-18)
       http://python.org/sf/1104608  closed by  effbot

Bug in String rstrip method  (2005-01-18)
       http://python.org/sf/1104923  closed by  tim_one

No documentation for zipimport module  (2003-12-03)
       http://python.org/sf/853800  closed by  fdrake

distutils/tests not installed  (2004-12-30)
       http://python.org/sf/1093173  closed by  fdrake

urllib2 doesn't handle urls without a scheme  (2005-01-07)
       http://python.org/sf/1097834  closed by  fdrake

vertical bar typeset horizontal in docs  (2004-08-13)
       http://python.org/sf/1008998  closed by  fdrake

write failure ignored in Py_Finalize()  (2004-11-27)
       http://python.org/sf/1074011  closed by  loewis

os.stat int/float oddity  (2005-01-20)
       http://python.org/sf/1105998  closed by  loewis

os.pathsep is wrong on Mac OS X  (2005-01-22)
       http://python.org/sf/1107258  closed by  bcannon

From pycon at python.org  Sun Jan 23 22:06:58 2005
From: pycon at python.org (Steve Holden)
Date: Sun Jan 23 22:06:59 2005
Subject: [Python-Dev] Microsoft to Provide PyCon Opening Keynote
Message-ID: <20050123210658.BFB1E1E400B@bag.python.org>


Dear Python Colleague:

The PyCon Program Committee is happy to announce that
the opening keynote speech, at 9:30 am on Wednesday,
March 23, will be:

    Python on the .NET Platform, by
    Jim Hugunin, Microsoft Corporation

Jim Hugunin is well-known in the Python world for his
pioneering work on JPython (now Jython), and more
recently for the IronPython .NET implementation of
Python.

Jim joined Microsoft's Common Language Runtime team
in August last year to continue his work on IronPython
and further improve the CLR's support for dynamic
languages like Python.

I look forward to hearing what Jim has to say, and
hope that you will join me and the rest of the Python
community at PyCon DC 2005, at George Washington
University from March 23-25, with a four-day sprint
starting on Saturday March 19.

Early bird registration rates are still available for
a few more days. Go to

    http://www.python.org/moin/PyConDC2005/Schedule

for the current schedule, and register at

    http://www.python.org/pycon/2005/


regards
Steve Holden
Chairman, PyCon DC 2005
-- 
PyCon DC 2005: The third Python Community Conference
http://www.pycon.org/   http://www.python.org/pycon/
The scoop on Python implementations and applications
From ejones at uwaterloo.ca  Sun Jan 23 23:19:22 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Sun Jan 23 23:19:01 2005
Subject: [Python-Dev] Improving the Python Memory Allocator
Message-ID: <D53FC231-6D8C-11D9-919A-0003938016AE@uwaterloo.ca>

This message is a follow up to a thread I started on python-dev back in 
October, archived here:

http://mail.python.org/pipermail/python-dev/2004-October/049480.html

Basically, the problem I am trying to solve is that the Python memory 
allocator never frees memory back to the operating system. I have 
attached a patch against obmalloc.c for discussion. The patch still has 
some rough edges and possibly some bugs, so I don't think it should be 
merged as is. However, I would appreciate any feedback on the chances 
for getting this implementation into the core. The rest of this message 
lists some disadvantages of this implementation, a description of the 
important changes, a benchmark, and my future plans if this change gets 
accepted.

The patch works for any version of Python that uses obmalloc.c (which 
includes Python 2.3 and 2.4), but I did my testing with Python 2.5 from 
CVS under Linux and Mac OS X. This version of the allocator will 
actually free memory. It has two disadvantages:

First, there is slightly more overhead with programs that allocate a 
lot of memory, release it, then reallocate it. The original allocator 
simply holds on to all the memory, allowing it to be efficiently 
reused. This allocator will call free(), so it also must call malloc() 
again when the memory is needed. I have a "worst case" benchmark which 
shows that this cost isn't too significant, but it could be a problem 
for some workloads. If it is, I have an idea for how to work around it.

Second, the previous allocator went out of its way to permit a module 
to call PyObject_Free while another thread is executing 
PyObject_Malloc. Apparently, this was a backwards compatibility hack 
for old Python modules which erroneously call these functions without 
holding the GIL. These modules will have to be fixed if this 
implementation is accepted into the core.


Summary of the changes:

- Add an "arena_object" structure for tracking pages that belong to 
each 256kB arena.
- Change the "arenas" array from an array of pointers to an array of 
arena_object structures.
- When freeing a page (a pool), it is placed on a free pool list for 
the arena it belongs to, instead of a global free pool list.
- When freeing a page, if the arena is completely unused, the arena is 
deallocated.
- When allocating a page, it is taken from the arena that is the most 
full. This gives arenas that are almost completely unused a chance to 
be freed.
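The allocation policy in the last bullet might be sketched like this (a toy Python model, not the C code; the function name is invented, and the field name mirrors obmalloc's nfreepools counter):

```python
# Toy model of the proposed pool-allocation policy: take the next
# free pool from the fullest arena (fewest free pools), so that
# nearly-empty arenas drain and can be returned to the OS.
def pick_arena(arenas):
    """arenas: list of dicts, each with an 'nfreepools' count."""
    usable = [a for a in arenas if a["nfreepools"] > 0]
    if not usable:
        return None  # no free pool anywhere: a new arena is needed
    return min(usable, key=lambda a: a["nfreepools"])
```

Keeping the arena list sorted by nfreepools, as the patch reportedly does, turns this scan into a constant-time pick.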


Benchmark:

The only benchmark I have performed at the moment is the worst case for 
this allocator: A program that allocates 1 000 000 Python objects which 
occupy nearly 200MB, frees them, reallocates them, then quits. I ran 
the program four times, and discarded the initial time. Here is the 
object:

class Obj:
    def __init__(self):
        self.dumb = "hello"
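The surrounding driver wasn't posted; a minimal reconstruction of the described workload (allocate roughly a million objects, free them, reallocate them) might look like:

```python
class Obj:
    def __init__(self):
        self.dumb = "hello"

def run(n=1000000):
    # allocate ~n objects, drop them, then allocate them again;
    # the patched allocator returns memory to the OS after the del
    objs = [Obj() for _ in range(n)]
    del objs
    objs = [Obj() for _ in range(n)]
    return len(objs)
```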

And here are the average execution times for this program:

Python 2.5:
real time: 16.304
user time: 16.016
system: 0.257

Python 2.5 + patch:
real time: 16.062
user time: 15.593
system: 0.450

As expected, the patched version spends nearly twice as much system 
time as the original version, because it calls free() and malloc() 
twice as many times. However, this difference is offset by the fact 
that the user-space execution time is actually *less* than the 
original version's. How is this possible? The likely cause is that 
the original version declared the arenas pointer "volatile" in order 
to work when Free and Malloc were called simultaneously. Since this 
version drops that guarantee, the pointer no longer needs to be 
volatile, which allows the value to be kept in a register instead of 
being read from memory on each operation.

Here are some graphs of the memory allocator behaviour running this 
benchmark.

Original: http://www.eng.uwaterloo.ca/~ejones/original.png
New: http://www.eng.uwaterloo.ca/~ejones/new.png


Future Plans:

- More detailed benchmarking.
- The "specialized" allocators for the basic types, such as ints, also 
need to free memory back to the system.
- Potentially the allocator should keep some amount of free memory 
around to improve the performance of programs that cyclically allocate 
and free large amounts of memory. This amount should be "self-tuned" to 
the application.

Thank you for your feedback,

Evan Jones

-------------- next part --------------
A non-text attachment was scrubbed...
Name: python-allocator.diff
Type: application/octet-stream
Size: 19080 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20050123/33ee017b/python-allocator-0001.obj
From python at rcn.com  Mon Jan 24 09:11:05 2005
From: python at rcn.com (Raymond Hettinger)
Date: Mon Jan 24 09:14:37 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <ee2a432c050123103956a8114f@mail.gmail.com>
Message-ID: <001d01c501ec$40da86e0$5822a044@oemcomputer>

[Neal Norwitz]
> I would like feedback on whether the approach is desirable.
> 
> The patch adds a new method type (flags) METH_ARGS that is used in
> PyMethodDef. METH_ARGS means the min and max # of arguments are
> specified in the PyMethodDef by adding 2 new fields. This information
> can be used in ceval to
> call the method. No tuple packing/unpacking is required since the C
> stack is used.
> 
> The benefits are:
>  * faster function calls
>  * simplify function call machinery by removing METH_NOARGS, METH_O,
> and possibly METH_VARARGS
>  * more introspection info for C functions (ie, min/max arg count)
> (not implemented)

An additional benefit would be improving the C-API by allowing C calls
without creating temporary argument tuples.  Also, some small degree of
introspection becomes possible when a method knows its own arity.

Replacing METH_O and METH_NOARGS seems straightforward, but
METH_VARARGS has much broader capabilities.  How would you handle the
simple case of "O|OO"?  How could you determine useful default values
(NULL, 0, -1, -909, etc.)?

If you solve the default value problem, then please also try to come up
with a better flag name than METH_ARGS which I find to be indistinct
from METH_VARARGS and also not very descriptive of its functionality.
Perhaps something like METH_UNPACKED would be an improvement.



> The drawbacks are:
>  * the defn of the MethodDef (# args) is separate from the function defn
>  * potentially more error prone to write C methods???

No worse than with METH_O or METH_NOARGS.



> I've measured between 13-22% speed improvement (debug build on
> Opteron) when doing simple tests like:
> 
>      ./python ./Lib/timeit.py -v 'pow(3, 5)'
> 
> I think the difference tends to be fairly constant at about .3 usec
> per loop.

If speed is the main advantage being sought, it would be worthwhile to
conduct more extensive timing tests with a variety of code and not using
a debug build.  Running test.test_decimal would be a useful overall
benchmark.

In theory, I don't see how you could improve on METH_O and METH_NOARGS.
The only saving is the time for the flag test (a predictable branch).
Offsetting that savings is the additional time for checking min/max args
and for constructing a C call with the appropriate number of args.  I
suspect there is no savings here and that the timings will get worse.

In all likelihood, the only real opportunity for savings is replacing
METH_VARARGS in cases that have already been sped up using
PyArg_UnpackTuple().  Those can be further improved by eliminating the time
to build and unpack the temporary argument tuple.

Even then, I don't see how to overcome the need to set useful default
values for optional object arguments.



Raymond Hettinger

From ncoghlan at iinet.net.au  Mon Jan 24 12:30:01 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Mon Jan 24 12:30:08 2005
Subject: [Python-Dev] Allowing slicing of iterators
Message-ID: <41F4DC39.9020603@iinet.net.au>

I just wrote a new C API function (PyItem_GetItem) that supports slicing for 
arbitrary iterators. A patch for current CVS is at http://www.python.org/sf/1108272

For simple indices it does the iteration manually, and for extended slices it 
returns an itertools.islice object.

As a trivial example, here's how to skip the head of a zero-numbered list:

   for i, item in enumerate("ABCDEF")[1:]:
     print i, item


Is this idea a non-starter, or should I spend my holiday on Wednesday finishing 
it off and writing the documentation and tests for it?

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From rodsenra at gpr.com.br  Mon Jan 24 13:00:10 2005
From: rodsenra at gpr.com.br (Rodrigo Dias Arruda Senra)
Date: Mon Jan 24 12:58:35 2005
Subject: [Python-Dev] Improving the Python Memory Allocator
In-Reply-To: <D53FC231-6D8C-11D9-919A-0003938016AE@uwaterloo.ca>
References: <D53FC231-6D8C-11D9-919A-0003938016AE@uwaterloo.ca>
Message-ID: <41F4E34A.1050101@gpr.com.br>

Evan Jones wrote:
> This message is a follow up to a thread I started on python-dev back in 
> October, archived here:

> First, there is slightly more overhead with programs that allocate a lot 
> of memory, release it, then reallocate it.

> Summary of the changes:
> 
> - When freeing a page, if the arena is completely unused, the arena is 
> deallocated.

Depending on the cost of arena allocation, it might help to define a 
lower threshold keeping a minimum of empty arena_objects permanently
available. Do you think this can bring any speedup?

cheers,
Senra

-- 
Rodrigo Senra
MSc Computer Engineer     rodsenra@gpr.com.br
GPr Sistemas Ltda       http://www.gpr.com.br

From ejones at uwaterloo.ca  Mon Jan 24 14:50:19 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Mon Jan 24 14:50:20 2005
Subject: [Python-Dev] Improving the Python Memory Allocator
In-Reply-To: <41F4E34A.1050101@gpr.com.br>
References: <D53FC231-6D8C-11D9-919A-0003938016AE@uwaterloo.ca>
	<41F4E34A.1050101@gpr.com.br>
Message-ID: <E2728F8E-6E0E-11D9-9C81-0003938016AE@uwaterloo.ca>

On Jan 24, 2005, at 7:00, Rodrigo Dias Arruda Senra wrote:
> Depending on the cost of arena allocation, it might help to define a 
> lower threshold keeping a minimum of empty arena_objects permanently 
> available. Do you think this can bring any speedup ?

Yes, I think it might. I have to do some more benchmarking first, to 
try and figure out how expensive the allocations are. This is one of my 
"future work" items to work on if this change gets accepted. I have not 
implemented it yet, because I don't want to have to merge one *massive* 
patch. My rough idea is to do something like this:

1. Keep track of the largest number of pages in use at one time.

2. Every N memory operations (or some other measurement of "time"), 
reset this value and calculate a moving average of the number of pages. 
This estimates the current memory requirements of the application.

3. If (used + free) > average, free arenas until freeing one more arena 
would make (used + free) < average.
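Step 3 can be sketched as follows (a toy model; the function name and the pages-per-arena figure are assumptions, not the patch's code):

```python
def arenas_to_release(used_pages, free_pages, avg_pages, pages_per_arena=64):
    """Count whole free arenas to release: keep freeing while the
    total (used + free) would still be at or above the moving
    average after giving one more arena back (step 3 above)."""
    released = 0
    while (free_pages >= pages_per_arena
           and used_pages + free_pages - pages_per_arena >= avg_pages):
        free_pages -= pages_per_arena
        released += 1
    return released
```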

This is better than a static scheme which says "keep X MB of free 
memory around" because it will self-tune to the application's 
requirements. If you have an application that needs lots of RAM, it 
will keep lots of RAM. If it has very low RAM usage, it will be more 
aggressive in reclaiming free space. The challenge is how to determine 
a good measurement of "time." Ideally, if the application was idle for 
a while, you would perform some housekeeping like this. Does Python's 
cyclic garbage collector currently do this? If so, I could hook this 
"management" stuff on to its calls to gc.collect()

Evan Jones

From rodsenra at gpr.com.br  Mon Jan 24 15:21:52 2005
From: rodsenra at gpr.com.br (Rodrigo Dias Arruda Senra)
Date: Mon Jan 24 15:20:16 2005
Subject: [Python-Dev] Improving the Python Memory Allocator
In-Reply-To: <E2728F8E-6E0E-11D9-9C81-0003938016AE@uwaterloo.ca>
References: <D53FC231-6D8C-11D9-919A-0003938016AE@uwaterloo.ca>
	<41F4E34A.1050101@gpr.com.br>
	<E2728F8E-6E0E-11D9-9C81-0003938016AE@uwaterloo.ca>
Message-ID: <41F50480.1070003@gpr.com.br>

[Evan Jones] :
--------------
> 2. Every N memory operations (or some other measurement of "time"), 
> reset this value and calculate a moving average of the number of pages. 
> This estimates the current memory requirements of the application.

> The challenge is how to determine a good measurement of "time."
> Ideally, if the application was idle for a while,
> you would perform some housekeeping like this. Does Python's cyclic 
> garbage collector currently do this? If so, I could hook this 
> "management" stuff on to its calls to gc.collect()

IMVHO, any measurement of "time" chosen would hurt performance of
non-memory-greedy applications. OTOH, it makes sense for the developers
of memory-greedy applications (they should be aware of it <wink>)
to call gc.collect() periodically. Therefore, *hooking* gc.collect()
sounds about right to me; let the janitoring pace be defined by those
who really care about it.

Looking forward to seeing this evolve,
Senra

-- 
Rodrigo Senra
MSc Computer Engineer     rodsenra@gpr.com.br
GPr Sistemas Ltda       http://www.gpr.com.br

From gvanrossum at gmail.com  Mon Jan 24 16:25:27 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon Jan 24 16:25:54 2005
Subject: [Python-Dev] Allowing slicing of iterators
In-Reply-To: <41F4DC39.9020603@iinet.net.au>
References: <41F4DC39.9020603@iinet.net.au>
Message-ID: <ca471dc205012407251dac200d@mail.gmail.com>

> I just wrote a new C API function (PyItem_GetItem) that supports slicing for
> arbitrary iterators. A patch for current CVS is at http://www.python.org/sf/1108272
> 
> For simple indices it does the iteration manually, and for extended slices it
> returns an itertools.islice object.
> 
> As a trivial example, here's how to skip the head of a zero-numbered list:
> 
>    for i, item in enumerate("ABCDEF")[1:]:
>      print i, item
> 
> Is this idea a non-starter, or should I spend my holiday on Wednesday finishing
> it off and writing the documentation and tests for it?

Since we already have the islice iterator, what's the point? It seems
to me that introducing this notation would mostly serve to confuse
users, since in most other places a slice produces an independent
*copy* of the data.
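For comparison, the existing spelling of Nick's example uses itertools.islice directly (Python 3 syntax here, collecting rather than printing):

```python
from itertools import islice

# skip the first enumerated pair without any new slicing syntax
result = [(i, item) for i, item in islice(enumerate("ABCDEF"), 1, None)]
# result runs from (1, 'B') through (5, 'F')
```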

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From FBatista at uniFON.com.ar  Mon Jan 24 16:34:25 2005
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Mon Jan 24 16:38:04 2005
Subject: [Python-Dev] Allowing slicing of iterators
Message-ID: <A128D751272CD411BC9200508BC2194D053C7F24@escpl.tcp.com.ar>

[Guido van Rossum]

#- > As a trivial example, here's how to skip the head of a 
#- zero-numbered list:
#- > 
#- >    for i, item in enumerate("ABCDEF")[1:]:
#- >      print i, item
#- > 
#- > Is this idea a non-starter, or should I spend my holiday 
#- on Wednesday finishing
#- > it off and writing the documentation and tests for it?
#- 
#- Since we already have the islice iterator, what's the point? It seems
#- to me that introducing this notation would mostly lead to confuse
#- users, since in most other places a slice produces an independent
#- *copy* of the data.

I think that breaking the common idiom...

  for e in something[:]:
      something.remove(e)

is a no-no...
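The idiom works precisely because the slice is an independent copy; a quick demonstration:

```python
# something[:] takes a snapshot, so the loop is unaffected while
# the original list shrinks underneath it
something = [1, 2, 3]
for e in something[:]:
    something.remove(e)
# something ends up empty; iterating the live list instead would
# skip elements as they shift left
```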

.    Facundo

Bitácora De Vuelo: http://www.taniquetil.com.ar/plog
PyAr - Python Argentina: http://pyar.decode.com.ar/


From martin at v.loewis.de  Mon Jan 24 23:36:24 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon Jan 24 23:36:23 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <ee2a432c050123103956a8114f@mail.gmail.com>
References: <ee2a432c050123103956a8114f@mail.gmail.com>
Message-ID: <41F57868.1010404@v.loewis.de>

Neal Norwitz wrote:
> I would like feedback on whether the approach is desirable.

I'm probably missing something really essential, but...

Where are the Py_DECREFs done for the function arguments?

Also, changing PyArg_ParseTuple is likely incorrect.
Currently, chr/unichr accepts float values; with your
change, I believe it won't anymore.

Apart from that, the change looks fine to me.

Regards,
Martin
From nnorwitz at gmail.com  Tue Jan 25 00:08:29 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue Jan 25 00:08:32 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <41F57868.1010404@v.loewis.de>
References: <ee2a432c050123103956a8114f@mail.gmail.com>
	<41F57868.1010404@v.loewis.de>
Message-ID: <ee2a432c0501241508620d618f@mail.gmail.com>

On Mon, 24 Jan 2005 23:36:24 +0100, "Martin v. Löwis"
<martin@v.loewis.de> wrote:
> Neal Norwitz wrote:
> > I would like feedback on whether the approach is desirable.
> 
> I'm probably missing something really essential, but...
> 
> Where are the Py_DECREFs done for the function arguments?

The original code path still handles the Py_DECREFs.
This is the while loop at the end of call_function().

I hope to refine the patch further in this area.

> Also, changing PyArg_ParseTuple is likely incorrect.
> Currently, chr/unichr expects float values; with your
> change, I believe it won't anymore.

You are correct there is an unintended change in behaviour:

    Python 2.5a0 (#51, Jan 23 2005, 18:54:53)
    >>> chr(5.3)
    '\x05'

    Python 2.3.4 (#1, Dec  7 2004, 12:24:19)
    >>> chr(5.3)
    __main__:1: DeprecationWarning: integer argument expected, got float
    '\x05'

This needs to be fixed.

Neal
From martin at v.loewis.de  Tue Jan 25 00:16:24 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue Jan 25 00:16:24 2005
Subject: [Python-Dev] Improving the Python Memory Allocator
In-Reply-To: <D53FC231-6D8C-11D9-919A-0003938016AE@uwaterloo.ca>
References: <D53FC231-6D8C-11D9-919A-0003938016AE@uwaterloo.ca>
Message-ID: <41F581C8.6070109@v.loewis.de>

Here my comments, from more general to more subtle:

- please don't post patches here; post them to SF
   You may ask for comments here after you posted them to SF.
- please follow Python coding style. In particular, don't write
     if ( available_arenas == NULL ) {
   but write
     if (available_arenas == NULL) {

> Second, the previous allocator went out of its way to permit a module to 
> call PyObject_Free while another thread is executing PyObject_Malloc. 
> Apparently, this was a backwards compatibility hack for old Python 
> modules which erroneously call these functions without holding the GIL. 
> These modules will have to be fixed if this implementation is accepted 
> into the core.

I'm not certain it is acceptable to make this assumption. Why is it
not possible to use the same approach that was previously used (i.e.
leak the arenas array)?

> - When allocating a page, it is taken from the arena that is the most 
> full. This gives arenas that are almost completely unused a chance to be 
> freed.

It would be helpful if that was documented in the data structures
somewhere. The fact that the nextarena list is sorted by nfreepools
is only mentioned in the place where this property is preserved;
it should be mentioned in the introductory comments as well.

Regards,
Martin
From martin at v.loewis.de  Tue Jan 25 00:30:44 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue Jan 25 00:30:44 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <ee2a432c0501241508620d618f@mail.gmail.com>
References: <ee2a432c050123103956a8114f@mail.gmail.com>	<41F57868.1010404@v.loewis.de>
	<ee2a432c0501241508620d618f@mail.gmail.com>
Message-ID: <41F58524.1020600@v.loewis.de>

Neal Norwitz wrote:
>>Where are the Py_DECREFs done for the function arguments?
> 
> 
> The original code path still handles the Py_DECREFs.
> This is the while loop at the end of call_function().

Can you please elaborate? For METH_O and METH_ARGS,
the arguments have already been popped off the stack,
and the "What does this do" loop only pops off the
function itself. So (without testing) methinks your
code currently leaks references.

Regards,
Martin
From nnorwitz at gmail.com  Tue Jan 25 00:37:04 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue Jan 25 00:37:07 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <001d01c501ec$40da86e0$5822a044@oemcomputer>
References: <ee2a432c050123103956a8114f@mail.gmail.com>
	<001d01c501ec$40da86e0$5822a044@oemcomputer>
Message-ID: <ee2a432c0501241537b3f1845@mail.gmail.com>

On Mon, 24 Jan 2005 03:11:05 -0500, Raymond Hettinger <python@rcn.com> wrote:
>
> Replacing METH_O and METH_NOARGS seems straight-forward, but
> METH_VARARGS has much broader capabilities.  How would you handle the
> simple case of "O|OO"?  How could you determine useful default values
> (NULL, 0, -1, -909, etc.)?

I have a new version of the patch that handles this condition.
I pass NULLs for non-existent optional parameters.  In your
case above, the arguments passed would be:

    (obj, NULL, NULL)

This is handled pretty cleanly in the callees, since it is
pretty common to initialize optional params to NULL.
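A pure-Python model of that convention (all names here are illustrative; in C the sentinel is an actual NULL pointer and the dispatch lives in ceval):

```python
NULL = object()  # stand-in for C's NULL

def call_unpacked(func, args, min_args, max_args):
    """Model of the proposed dispatch: check the arity declared in
    the PyMethodDef, then pad missing optional arguments with NULL
    instead of packing a tuple."""
    if not min_args <= len(args) <= max_args:
        raise TypeError("wrong number of arguments")
    return func(*args, *(NULL,) * (max_args - len(args)))

def my_get(d, key, default):
    # the callee applies its own default when the argument was omitted,
    # just as C callees initialize optional parameters to NULL
    if default is NULL:
        default = None
    return d[key] if key in d else default
```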

> If you solve the default value problem, then please also try to come up
> with a better flag name than METH_ARGS which I find to be indistinct
> from METH_VARARGS and also not very descriptive of its functionality.
> Perhaps something like METH_UNPACKED would be an improvement.

I agree METH_ARGS is a poor name.  UNPACKED is fine with me.
If I don't hear a better suggestion, I'll go with that.

> > The drawbacks are:
> >  * the defn of the MethodDef (# args) is separate from the function defn
> >  * potentially more error prone to write C methods???
> 
> No worse than with METH_O or METH_NOARGS.

I agree, plus the signature changes if METH_KEYWORDS is used.
I was interested if others viewed the change as better, worse,
or about the same.  I agree with /F that it could be a disaster
if it really is more error prone.  I don't view the change as
much different.  Do others view this as a real problem? 

> If speed is the main advantage being sought, it would be worthwhile to
> conduct more extensive timing tests with a variety of code and not using
> a debug build.  Running test.test_decimal would be a useful overall
> benchmark.

I was hoping others might try it out and see.  I don't have access
to Windows, Mac, or other arches.  I only have x86 and amd64.
It would also be interesting to test this on some real world code.

I have tried various builtin functions and methods and the gain
seems to be consistent across all of them.  I tried things like
dict.get, pow, isinstance.  Since the overhead is fairly constant,
I would expect functions with more arguments to have an even
better improvement.

> In theory, I don't see how you could improve on METH_O and METH_NOARGS.
> The only saving is the time for the flag test (a predictable branch).
> Offsetting that savings is the additional time for checking min/max args
> and for constructing a C call with the appropriate number of args.  I
> suspect there is no savings here and that the timings will get worse.

I tested a method I changed from METH_O to METH_ARGS and could
not measure a difference.  A benefit would be to consolidate METH_O,
METH_NOARGS, and METH_VARARGS into a single case.  This should
make code simpler all around (IMO).

> In all likelihood, the only real opportunity for savings is replacing
> METH_VARARGS in cases that have already been sped up using
> PyArg_UnpackTuple().  Those can be further improved by eliminating the time
> to build and unpack the temporary argument tuple.

Which this patch accomplishes.

> Even then, I don't see how to overcome the need to set useful default
> values for optional object arguments.

Take a look at the updated patch (#2).  I still think it's pretty clean and an
overall win.  But I'd really like to know what others think.  I also implemented
most (all?) of METH_O and METH_NOARGS plus many METH_VARARGS, so
benchmarkers can compare a difference with and without the patch.

Neal
From nnorwitz at gmail.com  Tue Jan 25 00:48:51 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue Jan 25 00:48:53 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <41F58524.1020600@v.loewis.de>
References: <ee2a432c050123103956a8114f@mail.gmail.com>
	<41F57868.1010404@v.loewis.de>
	<ee2a432c0501241508620d618f@mail.gmail.com>
	<41F58524.1020600@v.loewis.de>
Message-ID: <ee2a432c05012415485079b281@mail.gmail.com>

On Tue, 25 Jan 2005 00:30:44 +0100, "Martin v. Löwis"
<martin@v.loewis.de> wrote:
> Neal Norwitz wrote:
> >>Where are the Py_DECREFs done for the function arguments?
> >
> > The original code path still handles the Py_DECREFs.
> > This is the while loop at the end of call_function().
> 
> Can you please elaborate? 

I'll try.  Do you really trust me, given my first explanation was so poor? :-)

EXT_POP() modifies stack_pointer on the stack.  In call_function(), 
stack_pointer is PyObject ***.  But in new_fast_function(), stack_pointer
is only PyObject **.  So the modifications by EXT_POP to stack_pointer
(moving it down) are lost in new_fast_function().  So when it returns
to call_function(), the stack_pointer is still at the top of the stack.
The while loop pops off the arguments.
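A Python analogue of that by-value copy (illustrative only; the real code is C):

```python
# The callee gets a *copy* of the stack index, so its local "pop"
# never moves the caller's pointer -- the caller's cleanup loop
# still finds the arguments on the stack afterwards.
def fast_call(stack, sp, nargs):
    sp -= nargs                      # invisible to the caller
    return sum(stack[sp:sp + nargs])

stack = [None, None, 3, 4]           # two arguments on top
sp = 4
result = fast_call(stack, sp, 2)     # back here, sp is still 4
```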

If there was a ref leak, this scenario should demonstrate the refs increasing:

>>> isinstance(5, int)
True
[25363 refs]
>>> isinstance(5, int)
True
[25363 refs]
>>> isinstance(5, int)
True
[25363 refs]

The current code is not optimal.  new_fast_function() should take PyObject***
and it should also do the DECREF, but I had some bugs when I tried to get
that working, so I've deferred fixing that.  It ought to be fixed though.

HTH,
Neal
From martin at v.loewis.de  Tue Jan 25 01:00:34 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue Jan 25 01:00:34 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <ee2a432c05012415485079b281@mail.gmail.com>
References: <ee2a432c050123103956a8114f@mail.gmail.com>	
	<41F57868.1010404@v.loewis.de>	
	<ee2a432c0501241508620d618f@mail.gmail.com>	
	<41F58524.1020600@v.loewis.de>
	<ee2a432c05012415485079b281@mail.gmail.com>
Message-ID: <41F58C22.9080903@v.loewis.de>

Neal Norwitz wrote:
> EXT_POP() modifies stack_pointer on the stack.  In call_function(), 
> stack_pointer is PyObject ***.  But in new_fast_function(), stack_pointer
> is only PyObject **.  So the modifications by EXT_POP to stack_pointer
> (moving it down) are lost in new_fast_function().

Thanks - that is the detail I was missing.

Regards,
Martin
From ejones at uwaterloo.ca  Tue Jan 25 01:33:22 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Tue Jan 25 01:33:12 2005
Subject: [Python-Dev] Improving the Python Memory Allocator
In-Reply-To: <41F581C8.6070109@v.loewis.de>
References: <D53FC231-6D8C-11D9-919A-0003938016AE@uwaterloo.ca>
	<41F581C8.6070109@v.loewis.de>
Message-ID: <B7E9F05C-6E68-11D9-9C81-0003938016AE@uwaterloo.ca>

On Jan 24, 2005, at 18:16, Martin v. L?wis wrote:
> - please don't post patches here; post them to SF
>   You may ask for comments here after you posted them to SF.

Sure. This should be done even for patches which should absolutely not 
be committed?

> - please follow Python coding style. In particular, don't write
>     if ( available_arenas == NULL ) {
>   but write
>     if (available_arenas == NULL) {

Yikes! This is a "bad" habit of mine that puts me in the minority on 
coding style. Thank you for catching it.

>> Second, the previous allocator went out of its way to permit a module 
>> to call PyObject_Free while another thread is executing 
>> PyObject_Malloc. Apparently, this was a backwards compatibility hack 
>> for old Python modules which erroneously call these functions without 
>> holding the GIL. These modules will have to be fixed if this 
>> implementation is accepted into the core.
> I'm not certain it is acceptable to make this assumption. Why is it
> not possible to use the same approach that was previously used (i.e.
> leak the arenas array)?

This is definitely a very important point of discussion. The main 
problem is that leaking the "arenas" array is not sufficient to make 
the memory allocator thread safe. Back in October, Tim Peters suggested 
that it might be possible to make the breakage easily detectable:

http://mail.python.org/pipermail/python-dev/2004-October/049502.html

> If we changed PyMem_{Free, FREE, Del, DEL} to map to the system
> free(), all would be golden (except for broken old code mixing
> PyObject_ with PyMem_ calls).  If any such broken code still exists,
> that remapping would lead to dramatic failures, easy to reproduce; and
> old code broken in the other, infinitely more subtle way (calling
> PyMem_{Free, FREE, Del, DEL} when not holding the GIL) would continue
> to work fine.

I'll be honest, I only have a theoretical understanding of why this 
support is necessary, or why it is currently correct. For example, is 
it possible to call PyMem_Free from two threads simultaneously? Since 
the problem is that threads could call PyMem_Free without holding the 
GIL, it seems to me that it is possible. Shouldn't it also be 
supported? In the current memory allocator, I believe that situation 
can lead to inconsistent state. For example, see obmalloc.c:746, where 
it has been determined that a block needs to be put on the list of free 
blocks:

		*(block **)p = lastfree = pool->freeblock;
		pool->freeblock = (block *)p;

Imagine two threads are simultaneously freeing blocks that belong to 
the same pool. They both read the same value for pool->freeblock, and 
assign that same value to p. The changes to pool->freeblock will have 
some arbitrary ordering. The result? You have just leaked a block of 
memory.
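
The interleaving can be simulated deterministically in Python (a sketch
only; the dict-based blocks and names below stand in for obmalloc's real
structures):

```python
# Shared head of the pool's free list (pool->freeblock in obmalloc.c).
freeblock = None

def free_block(block, observed_head):
    # Each thread first reads pool->freeblock, then links its block in
    # front of what it saw and publishes the new head.  The second
    # writer silently discards the first writer's update.
    global freeblock
    block['next'] = observed_head
    freeblock = block

a = {'name': 'A', 'next': None}
b = {'name': 'B', 'next': None}

# Both threads read the head *before* either one writes it back:
head_seen_by_1 = freeblock
head_seen_by_2 = freeblock
free_block(a, head_seen_by_1)
free_block(b, head_seen_by_2)

# Walk the resulting free list: block A is no longer reachable.
reachable = []
node = freeblock
while node is not None:
    reachable.append(node['name'])
    node = node['next']
assert reachable == ['B']   # A has been leaked
```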

Basically, if a concurrent memory allocator is the requirement, then I 
think some other approach is necessary.

>> - When allocating a page, it is taken from the arena that is the most 
>> full. This gives arenas that are almost completely unused a chance to 
>> be freed.
> It would be helpful if that was documented in the data structures
> somewhere. The fact that the nextarena list is sorted by nfreepools
> is only mentioned in the place where this property is preserved;
> it should be mentioned in the introductory comments as well.

This is one of those rough edges I mentioned before. If there is some 
concensus that these changes should be accepted, then I will need to 
severely edit the comments at the beginning of obmalloc.c.

Thanks for your feedback,

Evan Jones

From steve at holdenweb.com  Tue Jan 25 00:24:21 2005
From: steve at holdenweb.com (Steve Holden)
Date: Tue Jan 25 02:34:54 2005
Subject: [Python-Dev] PyCon: The Spam Continues ;-)
Message-ID: <41F583A5.8020206@holdenweb.com>

Dear python-dev:

The current (as of even date) summary of my recent contributions to 
python-dev appears to be spam about PyCon.

Not being one to break habits, even not those of a lifetime sometimes, I 
spam you yet again to show you what a beautiful summary ActiveState have 
provided (I don't know whether this URL is cacheable or not):

<http://aspn.activestate.com/ASPN/Mail/Browse/ByAuthor/python-dev?author=cHljb25AcHl0aG9uLm9yZw-->

If I remember, Trent Lott (?) described at an IPC the SQL Server database 
that drives this system, and it was a great example of open source 
technology driving a proprietary (but I expect (?) relatively portable) 
repository.

Since I have your attention (and if I haven't then it really doesn't 
matter what I write hereafter, goodbye ...) I will also point out that 
the current top hit on Google for

	"Microsoft to Provide PyCon Opening Keynote"

is

	[Python-Dev] Microsoft to Provide PyCon Opening Keynote
by Steve Holden (you can repeat the search to see whether this assertion 
is true as you read this mail, and read the opening keynote announcement 
[I hope...]).

Space at PyCon is again enlarged, but it certainly isn't infinite. I'd 
love to see it filled in my third and last year as chair. The program 
committee have worked incredibly hard to make sure we all have to choose 
between far more technical content than a single individual can possibly 
take in on their own. They [disclaimer: I was program chair, but this 
should be kudos for the committee membership - without whom this 
conference would have failed in many dimensions] have succeeded so 
well that we all, I hope, have to agonize between two sumptuous but 
equidistant technical bales of hay.

Only by providing such rich choice can we ensure that an even broader 
community forms around Python, with free interchange between the 
technical communities of the proprietary and open source worlds, and 
equitable participation in the benefit.

Sorry I haven't made many CVS contributions lately. We really should be 
showcasing more Python technologies via www.python.org.

targeted-marketing-to-talented-professionals-ly y'rs  - steve
-- 
Steve Holden               http://www.holdenweb.com/
Python Web Programming  http://pydish.holdenweb.com/
Holden Web LLC      +1 703 861 4237  +1 800 494 3119

From steve at holdenweb.com  Tue Jan 25 00:40:04 2005
From: steve at holdenweb.com (Steve Holden)
Date: Tue Jan 25 02:34:57 2005
Subject: [Python-Dev] Re: [PyCON-Organizers] PyCon: The Spam Continues ;-)
In-Reply-To: <41F583A5.8020206@holdenweb.com>
References: <41F583A5.8020206@holdenweb.com>
Message-ID: <41F58754.1020607@holdenweb.com>

Steve Holden wrote:

[some things followed by]

> 
> If I remember Trent Lott (?) described at an IPC the SQL Server database 
> that drives this system, and it was a great example of open source 
> technology driving a proprietary (but I expect (?) relatively portable) 
> repository.
> 
Please forgive me for this not-so-talented Transatlantic confusion, 
since I mistook one famous name for another. I did of course mean
Trent Mick at ActiveState.  Apologies for the confusion.

regards
Steve
-- 
Steve Holden               http://www.holdenweb.com/
Python Web Programming  http://pydish.holdenweb.com/
Holden Web LLC      +1 703 861 4237  +1 800 494 3119

From DavidA at ActiveState.com  Tue Jan 25 02:43:16 2005
From: DavidA at ActiveState.com (David Ascher)
Date: Tue Jan 25 02:45:14 2005
Subject: [Python-Dev] Re: [PyCON-Organizers] PyCon: The Spam Continues ;-)
In-Reply-To: <41F583A5.8020206@holdenweb.com>
References: <41F583A5.8020206@holdenweb.com>
Message-ID: <41F5A434.1060905@ActiveState.com>

Steve Holden wrote:
> Dear python-dev:
> 
> The current (as of even date) summary of my recent contributions to 
> Python -dev appears to be spam about PyCon.
> 
> Not being one to break habits, even not those of a lifetime sometimes, I 
> spam you yet again to show you what a beautiful summary ActiveState have 
> provided (I don't know whether this URL is cacheable or not):
> 
> <http://aspn.activestate.com/ASPN/Mail/Browse/ByAuthor/python-dev?author=cHljb25AcHl0aG9uLm9yZw--> 

Yup, we try to make all our URLs portable and persistent.

> If I remember Trent Lott (?) 

Nah, that's a US politician.  T'was Trent Mick.

> described at an IPC the SQL Server database 
> that drives this system, and it was a great example of open source 
> technology driving a proprietary (but I expect (?) relatively portable) 
> repository.

Modulo some SQLServer features we're using.

> Since I have your attention (and if I haven't then it really doesn't 
> matter what I write hereafter, goodbye ...) I will also point out that 
> the current top hit on Google for
> 
>     "Microsoft to Provide PyCon Opening Keynote"

What a bizarre search.

(note that some of your To's and Cc's were pretty strange...)

--david

	
From tjreedy at udel.edu  Tue Jan 25 03:04:44 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue Jan 25 03:04:50 2005
Subject: [Python-Dev] Re: PyCon: The Spam Continues ;-)
References: <41F583A5.8020206@holdenweb.com>
Message-ID: <ct49fr$pp$1@sea.gmane.org>


<http://aspn.activestate.com/ASPN/Mail/Browse/ByAuthor/python-dev?author=cHljb25AcHl0aG9uLm9yZw-->

Huh?  I get a mostly blank page.  Perhaps there are no authors by that
name.

tjr


From anthony at interlink.com.au  Tue Jan 25 06:57:55 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue Jan 25 06:59:14 2005
Subject: [Python-Dev] Strange segfault in Python threads and linux kernel
	2.6
In-Reply-To: <1106185423.3784.26.camel@schizo>
References: <1106111769.3822.52.camel@schizo>
	<2mpt01bkvs.fsf@starship.python.net>
	<1106185423.3784.26.camel@schizo>
Message-ID: <200501251657.57682.anthony@interlink.com.au>

On Thursday 20 January 2005 12:43, Donovan Baarda wrote:
> On Wed, 2005-01-19 at 13:37 +0000, Michael Hudson wrote:
> > The main oddness about python threads (before 2.3) is that they run
> > with all signals masked.  You could play with a C wrapper (call
> > setprocmask, then exec fop) to see if this is what is causing the
> > problem.  But please try 2.4.
>
> Python 2.4 does indeed fix the problem. Unfortunately we are using Zope
> 2.7.4, and I'm a bit wary of attempting to migrate it all from 2.3 to
> 2.4. Is there any way this "fix" can be back-ported to 2.3?

It's extremely unlikely - I couldn't make myself comfortable with it
when attempting to figure out its backportedness. While the current
behaviour on 2.3.4 is broken in some cases, I fear very much that 
the new behaviour will break other (working) code - and this is 
something I try very hard to avoid in a bugfix release, particularly
in one that's probably the final one of a series.

Fundamentally, the answer is "don't do signals+threads, you will
get burned". For your application, you might want to instead try 
something where you write requests to a file in a spool directory, 
and have a python script that loops looking for requests, and 
generates responses. This is likely to be much simpler to debug 
and work with. 
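
A minimal sketch of that spool-directory approach (the file naming and
the handle() callback are invented for illustration):

```python
import os
import tempfile

def process_spool(spool_dir, handle):
    """Answer each '<name>.req' file in spool_dir with '<name>.resp'."""
    for name in sorted(os.listdir(spool_dir)):
        if not name.endswith('.req'):
            continue
        path = os.path.join(spool_dir, name)
        f = open(path)
        request = f.read()
        f.close()
        out = open(path[:-len('.req')] + '.resp', 'w')
        out.write(handle(request))
        out.close()
        os.remove(path)  # request consumed; response left for the client

# The daemon would call process_spool() in a sleep loop; one pass shown:
spool = tempfile.mkdtemp()
f = open(os.path.join(spool, 'job1.req'), 'w')
f.write('lookup foo')
f.close()
process_spool(spool, lambda request: 'result for ' + request)
```

The submitting side just writes a .req file and polls for the matching
.resp, so no signals or threads ever cross the process boundary.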



Anthony
-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From raymond.hettinger at verizon.net  Tue Jan 25 12:42:57 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue Jan 25 12:47:03 2005
Subject: [Python-Dev] Speed up function calls
Message-ID: <000e01c502d3$0458a340$18fccc97@oemcomputer>

> > In theory, I don't see how you could improve on METH_O and METH_NOARGS.
> > The only saving is the time for the flag test (a predictable branch).
> > Offsetting that savings is the additional time for checking min/max args
> > and for constructing a C call with the appropriate number of args.  I
> > suspect there is no savings here and that the timings will get worse.
> 
> I think I tested a method I changed from METH_O to METH_ARGS and could
> not measure a difference.  

Something is probably wrong with the measurements.  The new call does much more work than METH_O or METH_NOARGS.  Those two common and essential cases cannot be faster and are likely slower on at least some compilers and some machines.  If some timing shows differently, then it is likely a mirage (falling into an unsustainable local minimum).

The patch introduces range checks, an extra C function call, nine variable initializations, and two additional unpredictable branches (the case statements).  The only benefit (in terms of timing) is possibly saving a tuple allocation/deallocation.  That benefit only kicks in for METH_VARARGS and even then only when the tuple free list is empty.

I recommend not changing ANY of the METH_O and METH_NOARGS calls.  These are already close to optimal.



> A benefit would be to consolidate METH_O,
> METH_NOARGS, and METH_VARARGS into a single case.  This should
> make code simpler all around (IMO).

Will backwards compatibility allow those cases to be eliminated?  It would be a bummer if most existing extensions could not compile with Py2.5.  Also, METH_VARARGS will likely have to hang around unless a way can be found to handle more than nine arguments.

This patch appears to be taking on a life of its own and is being applied more broadly than is necessary or wise.  The patch is extensive and introduces a new C API that cannot be taken back later, so we ought to be careful with it.  

For the time being, try not to touch the existing METH_O and METH_NOARGS methods.  Focus on situations that do stand a chance of being improved (such as methods with a signature like "O|O").

That being said, I really like the concept.  I just worry that many of the stated benefits won't materialize:
* having to keep the old versions for backwards compatibility,
* being slower than METH_O and METH_NOARGS,
* not handling more than nine arguments,
* separating function signature info from the function itself,
* the time to initialize all the argument variables to NULL,
* somewhat unattractive case stmt code for building the c function call.



Raymond

From anthony at interlink.com.au  Tue Jan 25 14:05:02 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue Jan 25 14:05:14 2005
Subject: [Python-Dev] 2.3 BRANCH FREEZE imminent!
Message-ID: <200501260005.02705.anthony@interlink.com.au>

As those of you playing along at home with python-checkins would know, we're 
going to be cutting a 2.3.5c1 shortly (in about 12 hours time). 

Can people not in the set of the normal release team (you know the drill) 
please hold off on checkins to the branch from about 0000 UTC, 26th January 
(in about 12 hours time). After that, we'll have a one-week delay from release 
candidate until the final 2.3.5 - until then, please be ultra-conservative 
with checkins to the 2.3 branch (unless you're also volunteering to cut an 
emergency 2.3.6 <wink>).

Assuming nothing horrible goes wrong, this will be the final release of Python 
2.3.

The next bugfix release will be 2.4.1, in a couple of months. 

(As usual - any questions, comments or whatever, let me know via email,
or #python-dev on irc.freenode.net)

Anthony
-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From ncoghlan at iinet.net.au  Tue Jan 25 14:36:23 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Tue Jan 25 14:36:30 2005
Subject: [Python-Dev] Allowing slicing of iterators
In-Reply-To: <ca471dc205012407251dac200d@mail.gmail.com>
References: <41F4DC39.9020603@iinet.net.au>
	<ca471dc205012407251dac200d@mail.gmail.com>
Message-ID: <41F64B57.3040807@iinet.net.au>

Guido van Rossum wrote:
> Since we already have the islice iterator, what's the point?

I'd like to see iterators become as easy to work with as lists are. At the 
moment, anything that returns an iterator forces you to use the relatively 
cumbersome itertools.islice mechanism, rather than Python's native slice syntax.

In the example below (printing the first 3 items of a sequence), the fact that 
sorted() produces a new iterable list, while reversed() produces an iterator 
over the original list *should* be an irrelevant implementation detail from the 
programmer's point of view.

However, the fact that iterators aren't natively sliceable throws this detail in 
the programmer's face, and forces them to alter their code to deal with it. The 
conversion from list comprehensions to generator expressions results in similar 
irritation - the things just aren't as convenient, because the syntactic support 
isn't there.

Py> lst = "1 5 23 1234  57 89 2 1 54 7".split()
Py> lst
['1', '5', '23', '1234', '57', '89', '2', '1', '54', '7']

Py> for i in sorted(lst)[:3]:
...   print i
...
1
1
1234

Py> for i in reversed(lst)[:3]:
...   print i
...
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
TypeError: unsubscriptable object

Py> from itertools import islice
Py> for i in islice(reversed(lst), 3):
...   print i
...
7
54
1

>It seems
> to me that introducing this notation would mostly lead to confuse
> users, since in most other places a slice produces an independent
> *copy* of the data.

Well, certainly everything I can think of in the core that currently supports 
slicing produces a copy. Slicing on a numarray, however, gives you a view. The 
exact behaviour (view or copy) really depends on what is being sliced.

For iterators, I think Raymond's islice exemplifies the most natural slicing 
behaviour. Invoking itertools.tee() behind the scenes (to get copying semantics) 
would eliminate the iterator nature of the approach in many cases.

I don't think native slicing support will introduce any worse problems with the 
iterator/iterable distinction than already exist (e.g. a for loop consumes an 
iterator, but leaves an iterable unchanged. Similarly, slicing does not alter an 
iterable, but consumes an iterator).

Continuing the sorted() vs reversed() example from above:

Py> sortlst = sorted(lst)
Py> max(sortlst)
'89'
Py> len(sortlst)
10

Py> revlst = reversed(lst)
Py> max(revlst)
'89'
Py> len(revlst)
0
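
The proposed semantics can be sketched in pure Python (the real patch
hooks PyObject_GetItem in C; the wrapper class here is invented for
illustration):

```python
from itertools import islice

class SliceableIter(object):
    """Wrap an iterator so that it[:n] means islice(it, n)."""
    def __init__(self, iterable):
        self._it = iter(iterable)
    def __iter__(self):
        return self._it
    def __getitem__(self, index):
        if isinstance(index, slice):
            # islice rejects negative start/stop/step, matching the
            # consuming, forward-only nature of iterators.
            return islice(self._it, index.start, index.stop, index.step)
        # Integer index: skip 'index' items and return the next one.
        for item in islice(self._it, index, index + 1):
            return item
        raise IndexError(index)

lst = "1 5 23 1234 57 89 2 1 54 7".split()
assert list(SliceableIter(reversed(lst))[:3]) == ['7', '54', '1']
```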

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From ncoghlan at iinet.net.au  Tue Jan 25 14:41:32 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Tue Jan 25 14:41:40 2005
Subject: [Python-Dev] Allowing slicing of iterators
In-Reply-To: <A128D751272CD411BC9200508BC2194D053C7F24@escpl.tcp.com.ar>
References: <A128D751272CD411BC9200508BC2194D053C7F24@escpl.tcp.com.ar>
Message-ID: <41F64C8C.6090902@iinet.net.au>

Batista, Facundo wrote:
> I think that breaking the common idiom...
> 
>   for e in something[:]:
>       something.remove(e)
> 
> is a no-no...

The patch doesn't change existing behaviour - anything which is already 
sliceable (e.g. lists) goes through the existing __getitem__ or __getslice__ 
code paths.

All the patch adds is two additional checks (the first for an iterator, the 
second for an iterable) before PyObject_GetItem fails with the traditional 
"TypeError: unsubscriptable object".

Defining __getitem__ also allows any given iterator or iterable type to override 
the default slicing behaviour if they so choose.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From abo at minkirri.apana.org.au  Tue Jan 25 15:01:55 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Tue Jan 25 15:02:19 2005
Subject: [Python-Dev] Strange segfault in Python threads and linux kernel
	2.6
References: <1106111769.3822.52.camel@schizo>
	<2mpt01bkvs.fsf@starship.python.net>
	<1106185423.3784.26.camel@schizo>
	<200501251657.57682.anthony@interlink.com.au>
Message-ID: <004b01c502e6$6db77380$24ed0ccb@apana.org.au>

G'day,

From: "Anthony Baxter" <anthony@interlink.com.au>
> On Thursday 20 January 2005 12:43, Donovan Baarda wrote:
> > On Wed, 2005-01-19 at 13:37 +0000, Michael Hudson wrote:
> > > The main oddness about python threads (before 2.3) is that they run
> > > with all signals masked.  You could play with a C wrapper (call
> > > setprocmask, then exec fop) to see if this is what is causing the
> > > problem.  But please try 2.4.
> >
> > Python 2.4 does indeed fix the problem. Unfortunately we are using Zope
> > 2.7.4, and I'm a bit wary of attempting to migrate it all from 2.3 to
> > 2.4. Is there any way this "fix" can be back-ported to 2.3?
>
> It's extremely unlikely - I couldn't make myself comfortable with it
> when attempting to figure out its backportedness. While the current
> behaviour on 2.3.4 is broken in some cases, I fear very much that
> the new behaviour will break other (working) code - and this is
> something I try very hard to avoid in a bugfix release, particularly
> in one that's probably the final one of a series.
>
> Fundamentally, the answer is "don't do signals+threads, you will
> get burned". For your application, you might want to instead try

In this case it turns out to be "don't do exec() in a thread, because what
you exec can have all its signals masked". That turns out to be a hell of a
lot of things; popen, os.command, etc. They all only work OK in a threaded
application if what you are exec'ing doesn't use any signals.

> something where you write requests to a file in a spool directory,
> and have a python script that loops looking for requests, and
> generates responses. This is likely to be much simpler to debug
> and work with.

Hmm, interprocess communications; great fun :-) And no spawning the process
from within the zope application; it's gotta be a separate daemon.

Actually, I've noticed that zope often has a sorta zombie "which" process
which it spawns. I wonder if this is a stuck thread waiting for some
signal...

----------------------------------------------------------------
Donovan Baarda                http://minkirri.apana.org.au/~abo/
----------------------------------------------------------------

From anthony at interlink.com.au  Tue Jan 25 15:53:20 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue Jan 25 15:53:34 2005
Subject: [Python-Dev] Strange segfault in Python threads and linux kernel
	2.6
In-Reply-To: <004b01c502e6$6db77380$24ed0ccb@apana.org.au>
References: <1106111769.3822.52.camel@schizo>
	<200501251657.57682.anthony@interlink.com.au>
	<004b01c502e6$6db77380$24ed0ccb@apana.org.au>
Message-ID: <200501260153.21672.anthony@interlink.com.au>

On Wednesday 26 January 2005 01:01, Donovan Baarda wrote:
> In this case it turns out to be "don't do exec() in a thread, because what
> you exec can have all it's signals masked". That turns out to be a hell of
> a lot of things; popen, os.command, etc. They all only work OK in a
> threaded application if what you are exec'ing doesn't use any signals.

Yep. You just have to be aware of it. We do a bit of this at work, and we
either spool via a database table, or a directory full of spool files. 

> Actually, I've noticed that zope often has a sorta zombie "which" process
> which it spawns. I wonder if this is a stuck thread waiting for some
> signal...

Quite likely.

-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From python at rcn.com  Tue Jan 25 18:06:31 2005
From: python at rcn.com (Raymond Hettinger)
Date: Tue Jan 25 18:11:03 2005
Subject: [Python-Dev] state of 2.4 final release
In-Reply-To: <1f7befae04112919183144973b@mail.gmail.com>
Message-ID: <000001c50300$37ad1320$18fccc97@oemcomputer>

> [Anthony Baxter]
> > I didn't see any replies to the last post, so I'll ask again with a
> > better subject line - as I said last time, as far as I'm aware, I'm
> > not aware of anyone having done a fix for the issue Tim identified
> > ( http://www.python.org/sf/1069160 )
> >
> > So, my question is: Is this important enough to delay a 2.4 final
> > for?

[Tim]
> Not according to me; said before I'd be happy if everyone pretended I
> hadn't filed that report until a month after 2.4 final was released.

Any chance of this getting fixed before 2.4.1 goes out in February?


Raymond

From tim.peters at gmail.com  Tue Jan 25 18:24:58 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue Jan 25 18:25:01 2005
Subject: [Python-Dev] state of 2.4 final release
In-Reply-To: <000001c50300$37ad1320$18fccc97@oemcomputer>
References: <1f7befae04112919183144973b@mail.gmail.com>
	<000001c50300$37ad1320$18fccc97@oemcomputer>
Message-ID: <1f7befae05012509241b3164f3@mail.gmail.com>

[Anthony Baxter]
>>> I didn't see any replies to the last post, so I'll ask again with a
>>> better subject line - as I said last time, as far as I'm aware, I'm
>>> not aware of anyone having done a fix for the issue Tim identified
>>> ( http://www.python.org/sf/1069160 )
>>>
>>> So, my question is: Is this important enough to delay a 2.4 final
>>> for?

[Tim]
>> Not according to me; said before I'd be happy if everyone pretended I
>> hadn't filed that report until a month after 2.4 final was released.
 
[Raymond Hettinger]
> Any chance of this getting fixed before 2.4.1 goes out in February?

It probably won't be fixed by me.  It would be better if a Unix-head
volunteered to repair it, because the most likely kind of thread race
(explained in the bug report) has proven impossible to provoke on
Windows (short of carefully inserting sleeps into Python's C code) any
of the times this bug has been reported in the past (the same kind of
bug has appeared several times in different parts of Python's
threading code -- holding the GIL is not sufficient protection against
concurrent mutation of the tstate chain, for reasons explained in the
bug report).

A fix is very simple (also explained in the bug report) -- acquire the
damn mutex, don't trust to luck.
From DavidA at ActiveState.com  Tue Jan 25 18:29:59 2005
From: DavidA at ActiveState.com (David Ascher)
Date: Tue Jan 25 18:31:59 2005
Subject: [Python-Dev] Re: [PyCON-Organizers] PyCon: The Spam Continues ;-)
In-Reply-To: <41F67F43.4010403@holdenweb.com>
References: <41F583A5.8020206@holdenweb.com> <41F5A434.1060905@ActiveState.com>
	<41F67F43.4010403@holdenweb.com>
Message-ID: <41F68217.8000408@ActiveState.com>

Steve Holden wrote:

>> Modulo some SQLServer features we're using.
>>
> Well free-text indexing would be my first guess. Anything else of 
> interest? MySQL's free text indexing really sucks compared with SQL 
> Server's, which to my mind is a good justification for the Microsoft 
> product.

Freetext search is one of them, but there may be others (I think there 
are some stored procedures in some MS language).  I'm hardly a SQL 
expert, or an expert on our ASPN infrastructure.

--david
From fredrik at pythonware.com  Tue Jan 25 18:39:54 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue Jan 25 18:39:48 2005
Subject: [Python-Dev] Re: Allowing slicing of iterators
References: <41F4DC39.9020603@iinet.net.au>
	<ca471dc205012407251dac200d@mail.gmail.com>
Message-ID: <ct608t$jqf$1@sea.gmane.org>

Guido van Rossum wrote:

>> As a trivial example, here's how to skip the head of a zero-numbered list:
>>
>>    for i, item in enumerate("ABCDEF")[1:]:
>>      print i, item
>>
>> Is this idea a non-starter, or should I spend my holiday on Wednesday finishing
>> it off and writing the documentation and tests for it?
>
> Since we already have the islice iterator, what's the point?

readability?  I don't have to import seqtools to work with traditional
sequences, so why should I have to import itertools to be able to use
the goodies in there?  better leave that to the compiler.

</F> 



From bob at redivi.com  Tue Jan 25 19:13:24 2005
From: bob at redivi.com (Bob Ippolito)
Date: Tue Jan 25 19:13:30 2005
Subject: [Python-Dev] Re: [PyCON-Organizers] PyCon: The Spam Continues ; -)
In-Reply-To: <41F68217.8000408@ActiveState.com>
References: <41F583A5.8020206@holdenweb.com> <41F5A434.1060905@ActiveState.com>
	<41F67F43.4010403@holdenweb.com> <41F68217.8000408@ActiveState.com>
Message-ID: <CD667FA3-6EFC-11D9-A261-000A95BA5446@redivi.com>


On Jan 25, 2005, at 12:29, David Ascher wrote:

> Steve Holden wrote:
>
>>> Modulo some SQLServer features we're using.
>>>
>> Well free-text indexing would be my first guess. Anything else of 
>> interest? MySQL's free text indexing really sucks compared with SQL 
>> Server's, which to my mind is a good justification for the Microsoft 
>> product.
>
> Freetext search is one of them, but there may be others (I think there 
> are some stored procedures in some MS language).  I'm hardly a SQL 
> expert, or an expert on our ASPN infrastructure.

There is OpenFTS <http://openfts.sourceforge.net/> for PostgreSQL 
<http://postgresql.org>.  I'm not sure how it compares to SQL Server's 
or MySQL's, but it's been around a while so I expect it's pretty 
decent.

-bob

From gvanrossum at gmail.com  Tue Jan 25 19:30:39 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan 25 19:30:42 2005
Subject: [Python-Dev] Allowing slicing of iterators
In-Reply-To: <41F64B57.3040807@iinet.net.au>
References: <41F4DC39.9020603@iinet.net.au>
	<ca471dc205012407251dac200d@mail.gmail.com>
	<41F64B57.3040807@iinet.net.au>
Message-ID: <ca471dc20501251030384c40ee@mail.gmail.com>

[me]
> > Since we already have the islice iterator, what's the point?

[Nick]
> I'd like to see iterators become as easy to work with as lists are. At the
> moment, anything that returns an iterator forces you to use the relatively
> cumbersome itertools.islice mechanism, rather than Python's native slice syntax.

Sorry. Still -1.

I read your defense, and I'm not convinced. Even Fredrik's support
didn't convince me.

Iterators are for single sequential access. It's a feature that you
have to import itertools (or at least that you have to invoke its
special operations) -- iterators are not sequences and shouldn't be
confused with such.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From steven.bethard at gmail.com  Tue Jan 25 20:41:56 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Tue Jan 25 20:42:01 2005
Subject: [Python-Dev] Allowing slicing of iterators
In-Reply-To: <41F64B57.3040807@iinet.net.au>
References: <41F4DC39.9020603@iinet.net.au>
	<ca471dc205012407251dac200d@mail.gmail.com>
	<41F64B57.3040807@iinet.net.au>
Message-ID: <d11dcfba05012511413f700810@mail.gmail.com>

Nick Coghlan wrote:
> In the example below (printing the first 3 items of a sequence), the fact that
> sorted() produces a new iterable list, while reversed() produces an iterator
> over the original list *should* be an irrelevant implementation detail from the
> programmer's point of view.

You have to be aware on some level of whether or not you're using a
list when you use slice notation -- what would you do for iterators
when given a negative step index?  Presumably it would have to raise
an exception, where doing so with lists would not...
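The mismatch is easy to see with itertools.islice, which rejects the negative indices that list slicing happily accepts (a small illustrative sketch, not from the patch under discussion):

```python
from itertools import islice

lst = [1, 2, 3, 4, 5]

# Lists support negative steps:
assert lst[::-1] == [5, 4, 3, 2, 1]

# islice does not -- negative values raise ValueError,
# since an iterator cannot be walked backwards:
backwards_failed = False
try:
    islice(iter(lst), None, None, -1)
except ValueError:
    backwards_failed = True
assert backwards_failed
```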

Steve
-- 
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy
From jcarlson at uci.edu  Tue Jan 25 21:57:39 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue Jan 25 22:00:24 2005
Subject: [Python-Dev] Allowing slicing of iterators
In-Reply-To: <41F64B57.3040807@iinet.net.au>
References: <ca471dc205012407251dac200d@mail.gmail.com>
	<41F64B57.3040807@iinet.net.au>
Message-ID: <20050125095456.992C.JCARLSON@uci.edu>


Nick Coghlan <ncoghlan@iinet.net.au> wrote:
> Guido van Rossum wrote:
> > Since we already have the islice iterator, what's the point?
> 
> I'd like to see iterators become as easy to work with as lists are. At the 
> moment, anything that returns an iterator forces you to use the relatively 
> cumbersome itertools.islice mechanism, rather than Python's native slice syntax.


If you want to use full sequence slicing semantics, then make yourself a
list or tuple. I promise it will take less typing than itertools.islice()
(at least in the trivial case of list(iterable)). Using language syntax
to pretend that an arbitrary iterable is a list or tuple may well lead
to unexpected behavior, whether that behavior is data loss or a caching
of results.  Which behavior is desirable is generally application
specific, and I don't believe that Python should make that assumption
for the user or developer.
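The "data loss" mentioned above is easy to demonstrate: slicing an iterator consumes the items it skips and yields, which list slicing never does (illustrative sketch):

```python
from itertools import islice

it = iter(range(10))
first = list(islice(it, 3))
assert first == [0, 1, 2]
# The sliced items are gone from the iterator; only the rest remains:
assert list(it) == [3, 4, 5, 6, 7, 8, 9]

# A list is untouched by the same operation:
lst = list(range(10))
assert lst[:3] == [0, 1, 2]
assert lst == list(range(10))
```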

 - Josiah

From python at rcn.com  Tue Jan 25 22:05:37 2005
From: python at rcn.com (Raymond Hettinger)
Date: Tue Jan 25 22:10:32 2005
Subject: [Python-Dev] Allowing slicing of iterators
In-Reply-To: <ca471dc20501251030384c40ee@mail.gmail.com>
Message-ID: <000e01c50321$a20d1e60$18fccc97@oemcomputer>

> Iterators are for single sequential access. It's a feature that you
> have to import itertools (or at least that you have to invoke its
> special operations) -- iterators are not sequences and shouldn't be
> confused with such.

FWIW, someone (Bengt Richter perhaps) once suggested syntactic support
differentiated from sequences but less awkward than a call to
itertools.islice().

itertools.islice(someseq, lo, hi) would be rendered as someseq'[lo:hi].



Raymond

From python at rcn.com  Tue Jan 25 22:09:32 2005
From: python at rcn.com (Raymond Hettinger)
Date: Tue Jan 25 22:14:20 2005
Subject: [Python-Dev] state of 2.4 final release
In-Reply-To: <1f7befae05012509241b3164f3@mail.gmail.com>
Message-ID: <000f01c50322$2a91d960$18fccc97@oemcomputer>

[Anthony Baxter]
> >>> I'm
> >>> not aware of anyone having done a fix for the issue Tim identified
> >>> ( http://www.python.org/sf/1069160 )

[Raymond Hettinger]
> > Any chance of this getting fixed before 2.4.1 goes out in February?

[Timbot]
> It probably won't be fixed by me.  It would be better if a Unix-head
> volunteered to repair it, because the most likely kind of thread race
> (explained in the bug report) has proven impossible to provoke on
> Windows (short of carefully inserting sleeps into Python's C code) any
> of the times this bug has been reported in the past (the same kind of
> bug has appeared several times in different parts of Python's
> threading code -- holding the GIL is not sufficient protection against
> concurrent mutation of the tstate chain, for reasons explained in the
> bug report).
> 
> A fix is very simple (also explained in the bug report) -- acquire the
> damn mutex, don't trust to luck.

Hey Unix-heads.
Any takers?



Raymond

From steven.bethard at gmail.com  Tue Jan 25 22:20:24 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Tue Jan 25 22:20:32 2005
Subject: [Python-Dev] Allowing slicing of iterators
In-Reply-To: <000e01c50321$a20d1e60$18fccc97@oemcomputer>
References: <ca471dc20501251030384c40ee@mail.gmail.com>
	<000e01c50321$a20d1e60$18fccc97@oemcomputer>
Message-ID: <d11dcfba05012513202defd297@mail.gmail.com>

Raymond Hettinger <python@rcn.com> wrote:
> 
> FWIW, someone (Bengt Richter perhaps) once suggested syntactic support
> differentiated from sequences but less awkward than a call to
> itertools.islice().
> 
> itertools.islice(someseq, lo, hi) would be rendered as someseq'[lo:hi].

Just to make sure I'm reading this right, the difference between
sequence slicing and iterator slicing is a single-quote?  IMVHO,
that's pretty hard to read...

If we're really looking for a builtin, wouldn't it be better to go the
route of getattr/setattr and have something like getslice that could
operate on both lists and iterators?  Then
    getslice(lst, lo, hi)
would just be an alias for
    lst[lo:hi]
and
    getslice(itr, lo, hi)
would just be an alias for
    itertools.islice(itr, lo, hi)
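A minimal sketch of the getslice() builtin described above (hypothetical -- no such builtin exists; dispatching on TypeError is just one possible implementation strategy):

```python
from itertools import islice

def getslice(obj, lo, hi):
    """Hypothetical helper: slice sequences natively, iterators via islice."""
    try:
        return obj[lo:hi]
    except TypeError:
        # Object is not subscriptable; fall back to sequential slicing.
        return islice(obj, lo, hi)

assert getslice([0, 1, 2, 3, 4], 1, 3) == [1, 2]
assert list(getslice(iter(range(5)), 1, 3)) == [1, 2]
```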

Steve
-- 
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy
From jimjjewett at gmail.com  Tue Jan 25 22:50:27 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue Jan 25 22:50:30 2005
Subject: [Python-Dev] Deprecating modules (python-dev summary for early Dec,
	2004)
Message-ID: <fb6fbf560501251350610c0c5c@mail.gmail.com>

> It was also agreed that deleting deprecated modules was not needed; it breaks 
> code and disk space is cheap.

> It seems that no longer listing documentation and adding a deprecation warning 
> is what is needed to properly deprecate a module.  By no longer listing 
> documentation new programmers will not use the code since they won't know 
> about it.[*]    And adding the warning will let old users know that they should be 
> using  something else.

[* Unless they try to maintain old code.  Hopefully, they know to find the 
documentation at python.org.]

Would it make sense to add an attic (or even "deprecated") directory to
the end of sys.path, and move old modules there?  This would make the
search for non-deprecated modules a bit faster, and would make it easier
to verify that new code isn't depending (perhaps indirectly) on any
deprecated features.

New programmers may just browse the list of files for names that look
right.  They're more likely to take the first (possibly false) hit if
the list is long.  I'm not the only one who ended up using markupbase
for that reason.

Also note that some shouldn't-be-used modules don't (yet?) raise a
deprecation warning.  For instance, I'm pretty sure regex_syntax and
reconvert are both fairly useless without the deprecated regex module,
but they aren't deprecated on their own -- so they show up as tempting
choices in a list of library files.  (Though reconvert does something
other than I expected, based on the name.)  I understand not bothering
to repeat the deprecation for someone who is using them correctly, but
it would be nice to move them to an attic.

Bastion and rexec should probably also raise DeprecationWarning, if that
becomes the right way to mark them deprecated.  (They import fine; they
just don't work -- which could be interpreted as merely an "XXX not done
yet" comment.)

-jJ
From walter at livinglogic.de  Tue Jan 25 23:13:08 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Tue Jan 25 23:13:11 2005
Subject: [Python-Dev] __str__ vs. __unicode__
In-Reply-To: <41F3B46F.5040205@egenix.com>
References: <41ED25C6.80603@livinglogic.de>
	<41ED499A.1050206@egenix.com>	<41EE2B1E.8030209@livinglogic.de>
	<41EE4797.6030105@egenix.com> <41F0F11B.8000600@livinglogic.de>
	<41F3B46F.5040205@egenix.com>
Message-ID: <41F6C474.8030700@livinglogic.de>

M.-A. Lemburg wrote:
> Walter Dörwald wrote:
> 
>> M.-A. Lemburg wrote:
>>
>>  > [...]
>>
>>> __str__ and __unicode__ as well as the other hooks were
>>> specifically added for the type constructors to use.
>>> However, these were added at a time where sub-classing
>>> of types was not possible, so it's time now to reconsider
>>> whether this functionality should be extended to sub-classes
>>> as well.
>>
>> So can we reach consensus on this, or do we need a
>> BDFL pronouncement?
> 
> I don't have a clear picture of what the consensus currently
> looks like :-)
> 
> If we're going for a solution that implements the hook
> awareness for all __<typename>__ hooks, I'd be +1 on that.
> If we only touch the __unicode__ case, we'd only be creating
> yet another special case. I'd vote -0 on that.
 > [...]

Here's the patch that implements this for int/long/float/unicode:
http://www.python.org/sf/1109424

Note that complex already did the right thing.

For int/long/float this is implemented in the following way:
Converting an instance of a subclass to the base class is done
in the appropriate slot of the type (i.e. intobject.c::int_int()
etc.) instead of in PyNumber_Int()/PyNumber_Long()/PyNumber_Float().
It's still possible for a conversion method to return an instance
of a subclass of int/long/float.
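The conversion behavior in question can be illustrated from Python (a sketch of the intended semantics, not the patch itself; MyInt is a made-up subclass for demonstration):

```python
class MyInt(int):
    pass

x = MyInt(5)
# Converting an int subclass back to the base class should yield a
# plain int, not leak the subclass type through:
assert type(int(x)) is int
assert int(x) == 5
```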

Bye,
    Walter Dörwald
From skip at pobox.com  Tue Jan 25 23:21:34 2005
From: skip at pobox.com (Skip Montanaro)
Date: Tue Jan 25 23:58:23 2005
Subject: [Python-Dev] Deprecating modules (python-dev summary for early
	Dec, 2004)
In-Reply-To: <fb6fbf560501251350610c0c5c@mail.gmail.com>
References: <fb6fbf560501251350610c0c5c@mail.gmail.com>
Message-ID: <16886.50798.56069.314227@montanaro.dyndns.org>


    Jim> Would it make sense to add an attic (or even "deprecated")
    Jim> directory to the end of sys.path, and move old modules there?  This
    Jim> would make the search for non-deprecated modules a bit faster, and
    Jim> would make it easier to verify that new code isn't depending
    Jim> (perhaps indirectly) on any deprecated features.

That's what lib-old is for.  All people have to do is append it to sys.path
to get access to its contents:

    % python
    Python 2.5a0 (#72, Jan 20 2005, 20:14:27) 
    [GCC 3.3 20030304 (Apple Computer, Inc. build 1493)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import glob
    >>> for f in glob.glob("/Users/skip/local/lib/python2.5/lib-old/*.py"):
    ...   print f
    ... 
    /Users/skip/local/lib/python2.5/lib-old/addpack.py
    /Users/skip/local/lib/python2.5/lib-old/cmp.py
    /Users/skip/local/lib/python2.5/lib-old/cmpcache.py
    /Users/skip/local/lib/python2.5/lib-old/codehack.py
    /Users/skip/local/lib/python2.5/lib-old/dircmp.py
    /Users/skip/local/lib/python2.5/lib-old/dump.py
    /Users/skip/local/lib/python2.5/lib-old/find.py
    /Users/skip/local/lib/python2.5/lib-old/fmt.py
    /Users/skip/local/lib/python2.5/lib-old/grep.py
    /Users/skip/local/lib/python2.5/lib-old/lockfile.py
    /Users/skip/local/lib/python2.5/lib-old/newdir.py
    /Users/skip/local/lib/python2.5/lib-old/ni.py
    /Users/skip/local/lib/python2.5/lib-old/packmail.py
    /Users/skip/local/lib/python2.5/lib-old/Para.py
    /Users/skip/local/lib/python2.5/lib-old/poly.py
    /Users/skip/local/lib/python2.5/lib-old/rand.py
    /Users/skip/local/lib/python2.5/lib-old/statcache.py
    /Users/skip/local/lib/python2.5/lib-old/tb.py
    /Users/skip/local/lib/python2.5/lib-old/tzparse.py
    /Users/skip/local/lib/python2.5/lib-old/util.py
    /Users/skip/local/lib/python2.5/lib-old/whatsound.py
    /Users/skip/local/lib/python2.5/lib-old/whrandom.py
    /Users/skip/local/lib/python2.5/lib-old/zmod.py

That doesn't help for deprecated extension modules, but I think they are
much less frequently candidates for deprecation.
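Appending lib-old to sys.path as described above is one line (the path shown is illustrative; the exact install prefix and version vary by system):

```python
import os
import sys

# Substitute your own install prefix and Python version as needed.
lib_old = os.path.join(sys.prefix, "lib", "python2.5", "lib-old")
sys.path.append(lib_old)
assert sys.path[-1].endswith("lib-old")
```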

Skip
From jimjjewett at gmail.com  Wed Jan 26 00:26:58 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed Jan 26 00:27:01 2005
Subject: [Python-Dev] Deprecating modules (python-dev summary for early
	Dec, 2004)
In-Reply-To: <16886.50798.56069.314227@montanaro.dyndns.org>
References: <fb6fbf560501251350610c0c5c@mail.gmail.com>
	<16886.50798.56069.314227@montanaro.dyndns.org>
Message-ID: <fb6fbf5605012515262c07d395@mail.gmail.com>

On Tue, 25 Jan 2005 16:21:34 -0600, Skip Montanaro <skip@pobox.com> wrote:
> 
>     Jim> Would it make sense to add an attic (or even "deprecated")
>     Jim> directory to the end of sys.path, and move old modules there?  This
>     Jim> would make the search for non-deprecated modules a bit faster, and
>     Jim> would make it easier to verify that new code isn't depending
>     Jim> (perhaps indirectly) on any deprecated features.
 
> That's what lib-old is for.  All people have to do is append it to sys.path
> to get access to its contents:

That seems to be for "obsolete" modules.

Should deprecated modules be moved there as well?

I had proposed a middle ground, where they were moved to a separate directory,
but that directory was (by default) included on the search path.

Moving deprecated modules to lib-old (not on the search path at all) seems to 
risk breaking code.

-jJ
From nnorwitz at gmail.com  Wed Jan 26 04:35:42 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed Jan 26 04:35:45 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <000e01c502d3$0458a340$18fccc97@oemcomputer>
References: <000e01c502d3$0458a340$18fccc97@oemcomputer>
Message-ID: <ee2a432c050125193511085d8@mail.gmail.com>

On Tue, 25 Jan 2005 06:42:57 -0500, Raymond Hettinger
<raymond.hettinger@verizon.net> wrote:
> >
> > I think I tested a method I changed from METH_O to METH_ARGS and could
> > not measure a difference.
> 
> Something is probably wrong with the measurements.  The new call does much more work than METH_O or METH_NOARGS.  Those two common and essential cases cannot be faster and are likely slower on at least some compilers and some machines.  If some timing shows differently, then it is likely a mirage (falling into an unsustainable local minimum).

I tested w/chr() which Martin pointed out is broken in my patch.  I
just tested with len('') and got these results (again on opteron):

# without patch
neal@janus clean $ ./python ./Lib/timeit.py -v "len('')"
10 loops -> 8.11e-06 secs
100 loops -> 6.7e-05 secs
1000 loops -> 0.000635 secs
10000 loops -> 0.00733 secs
100000 loops -> 0.0634 secs
1000000 loops -> 0.652 secs
raw times: 0.654 0.652 0.654
1000000 loops, best of 3: 0.652 usec per loop
# with patch
neal@janus src $ ./python ./Lib/timeit.py -v "len('')"
10 loops -> 9.06e-06 secs
100 loops -> 7.01e-05 secs
1000 loops -> 0.000692 secs
10000 loops -> 0.00693 secs
100000 loops -> 0.0708 secs
1000000 loops -> 0.703 secs
raw times: 0.712 0.714 0.713
1000000 loops, best of 3: 0.712 usec per loop

So with the patch METH_O is .06 usec slower.

I'd like to discuss this later after I explain a bit more about the
direction I'm headed.  I agree that METH_O and METH_NOARGS are near
optimal wrt performance.  But if we could have one METH_UNPACKED
instead of 3 METH_*, I think that would be a win.

> > A beneift would be to consolidate METH_O,
> > METH_NOARGS, and METH_VARARGS into a single case.  This should
> > make code simpler all around (IMO).
> 
> Will backwards compatibility allow those cases to be eliminated?  It would be a bummer if most existing extensions could not compile with Py2.5.  Also, METH_VARARGS will likely have to hang around unless a way can be found to handle more than nine arguments.

Sorry, I meant eliminated w/3.0.  METH_O couldn't be eliminated, but
METH_NOARGS actually could since min/max args would be initialized
to 0, so #define METH_NOARGS METH_UNPACKED would work.
But I'm not proposing that, unless there is consensus that it's ok.

> This patch appears to be taking on a life of its own and is being applied more broadly than is necessary or wise.  The patch is extensive and introduces a new C API that cannot be taken back later, so we ought to be careful with it.

I agree we should be careful.  But it's all experimentation right now.
The reason to modify METH_O and METH_NOARGS is verify direction
and various effects.  It's not necessarily meant to be integrated.

> That being said, I really like the concept.  I just worry that many of the stated benefits won't materialize:
> * having to keep the old versions for backwards compatibility,
> * being slower than METH_O and METH_NOARGS,
> * not handling more than nine arguments,

There are very few functions I've found that take more than 2 arguments.
Should 9 be lower, higher?  I don't have a good feel.  From what I've
seen, 5 may be more reasonable as far as catching 90% of the cases.

> * separating function signature info from the function itself,

I haven't really seen any discussion on this point.  I think
Raymond pointed out this isn't really much different today
with METH_NOARGS and METH_KEYWORDS.  METH_O too
if you consider how the arg is used even though the signature
is still the same.

> * the time to initialize all the argument variables to NULL,

See below how this could be fixed.

> * somewhat unattractive case stmt code for building the c function call.

This is the python test coverage:
    http://coverage.livinglogic.de/coverage/web/selectEntry.do?template=2850&entryToSelect=182530

Note that VARARGS is over 3 times as likely as METH_O or METH_NOARGS. 
Plus we could get rid of a couple of if statements.

So far it seems there aren't any specific problems with the approach.
There are simply concerns.  I'm not sure it would be best to modify this
patch over many iterations and then make one huge checkin.  I also
don't want to lose the changes or the results.  Perhaps I should make
a branch for this work?  It's easy to abandon it or take only the
pieces we want if it should ever see the light of day.

----

Here's some thinking out loud.  Raymond mentioned about some of the
warts of the current patch.  In particular, all nine argument
variables are initialized each time and there's a switch on the number
of arguments.

Ultimately, I think we can speed things up more by having 9 different
op codes, ie, one for each # of arguments.  CALL_FUNCTION_0,
CALL_FUNCTION_1, ...
(9 is still arbitrary and subject to change)

Then we would have N little functions, each with the exact # of
parameters.  Each would still need a switch to call the C function
because there may be optional parameters.  Ultimately, it's possible
the code would be small enough to stick it into the eval_frame loop. 
Each of these steps would need to be tested, but that's a possible
longer term direction.

There would only be an if to check if it was a C function or not. 
Maybe we could even get rid of this by more fixup at import time.

Neal
From ncoghlan at iinet.net.au  Wed Jan 26 04:59:12 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Wed Jan 26 04:59:19 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <ee2a432c050125193511085d8@mail.gmail.com>
References: <000e01c502d3$0458a340$18fccc97@oemcomputer>
	<ee2a432c050125193511085d8@mail.gmail.com>
Message-ID: <41F71590.9090501@iinet.net.au>

Neal Norwitz wrote:
> So far it seems there aren't any specific problems with the approach.
> There are simply concerns.  I'm not sure it would be best to modify this
> patch over many iterations and then make one huge checkin.  I also
> don't want to lose the changes or the results.  Perhaps I should make
> a branch for this work?  It's easy to abandon it or take only the
> pieces we want if it should ever see the light of day.

A branch would seem the best way to allow other people to contribute to the 
experiment.

I'll also note that this mechanism should make it easier to write C functions 
which are easily used both from Python and as direct entries in a C API.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From ncoghlan at iinet.net.au  Wed Jan 26 05:40:53 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Wed Jan 26 05:41:00 2005
Subject: [Python-Dev] Allowing slicing of iterators
In-Reply-To: <d11dcfba05012513202defd297@mail.gmail.com>
References: <ca471dc20501251030384c40ee@mail.gmail.com>	<000e01c50321$a20d1e60$18fccc97@oemcomputer>
	<d11dcfba05012513202defd297@mail.gmail.com>
Message-ID: <41F71F55.8010104@iinet.net.au>

Steven Bethard wrote:
> If we're really looking for a builtin, wouldn't it be better to go the
> route of getattr/setattr and have something like getslice that could
> operate on both lists and iterators?

Such a builtin should probably be getitem() rather than getslice() (since 
getitem(iterable, slice(start, stop, step)) covers the getslice() case). 
However, I don't really see the point of this, since "from itertools import 
islice" is nearly as good as such a builtin.

More importantly, I don't see how this alters Guido's basic criticism that 
slicing a list and slicing an iterator represent fundamentally different 
concepts. (ie. if "itr[x]" is unacceptable, I don't see how changing the 
spelling to "getitem(itr, x)" could make it any more acceptable).

If slicing is taken as representing random access to a data structure (which 
seems to be Guido's view), then using it to represent sequential access to an 
item in or region of an iterator is not appropriate.

I'm not sure how compatible that viewpoint is with wanting Python 3k to be as 
heavily iterator based as 2.x is list based, but that's an issue for the future.

For myself, I don't attach such specific semantics to slicing (I see it as 
highly dependent on the type of object being sliced), and consider it obvious 
syntactic sugar for the itertools islice operation. As mentioned in my previous 
message, I also think the iterator/iterable distinction should be able to be 
ignored as much as possible, and the lack of syntactic support for working with 
iterators is the major factor that throws the distinction into a programmer's 
face. It currently makes the fact that some builtins return lists and others 
iterators somewhat inconvenient.

Those arguments have already failed to persuade Guido though, so I guess the 
idea is dead for the moment (unless/until someone comes up with a convincing 
argument that I haven't thought of).

Given Guido's lack of enthusiasm for *this* idea though, I'm not even going to 
venture into the realms of "+" on iterators defaulting to itertools.chain or "*" 
to itertools.repeat.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From skip at pobox.com  Wed Jan 26 05:30:53 2005
From: skip at pobox.com (Skip Montanaro)
Date: Wed Jan 26 05:59:59 2005
Subject: [Python-Dev] I think setup.py needs major rework
Message-ID: <16887.7421.746339.594221@montanaro.dyndns.org>

I just submitted a bug report for setup.py:

    http://python.org/sf/1109602

It begins:

    Python's setup.py has grown way out of control.  I'm trying to build and
    install Python 2.4.0 on a Solaris system with Tcl/Tk installed in a
    non-standard place and I can't figure out the incantation to tell
    setup.py to look where they are installed.

    ...

and ends:

    This might be an excellent sprint topic for PyCon.

Skip
From kbk at shore.net  Wed Jan 26 06:29:26 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed Jan 26 06:29:45 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <ee2a432c050125193511085d8@mail.gmail.com> (Neal Norwitz's
	message of "Tue, 25 Jan 2005 22:35:42 -0500")
References: <000e01c502d3$0458a340$18fccc97@oemcomputer>
	<ee2a432c050125193511085d8@mail.gmail.com>
Message-ID: <87is5kzrk9.fsf@hydra.bayview.thirdcreek.com>

Neal Norwitz <nnorwitz@gmail.com> writes:

>> * not handling more than nine arguments,
>
> There are very few functions I've found that take more than 2
> arguments.  Should 9 be lower, higher?  I don't have a good feel.
> From what I've seen, 5 may be more reasonable as far as catching 90%
> of the cases.

Five is probably conservative.

http://mail.python.org/pipermail/python-dev/2004-February/042847.html

-- 
KBK
From fdrake at acm.org  Wed Jan 26 07:18:46 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed Jan 26 07:18:55 2005
Subject: [Python-Dev] I think setup.py needs major rework
In-Reply-To: <16887.7421.746339.594221@montanaro.dyndns.org>
References: <16887.7421.746339.594221@montanaro.dyndns.org>
Message-ID: <200501260118.47192.fdrake@acm.org>

On Tuesday 25 January 2005 23:30, Skip Montanaro wrote:
 >     Python's setup.py has grown way out of control.  I'm trying to build
 > and install Python 2.4.0 on a Solaris system with Tcl/Tk installed in a
 > non-standard place and I can't figure out the incantation to tell setup.py
 > to look where they are installed.
...
 >     This might be an excellent sprint topic for PyCon.

Indeed it would be!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From ncoghlan at iinet.net.au  Wed Jan 26 09:06:36 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Wed Jan 26 09:06:43 2005
Subject: [Python-Dev] Allowing slicing of iterators
In-Reply-To: <ca471dc20501251030384c40ee@mail.gmail.com>
References: <41F4DC39.9020603@iinet.net.au>	
	<ca471dc205012407251dac200d@mail.gmail.com>	
	<41F64B57.3040807@iinet.net.au>
	<ca471dc20501251030384c40ee@mail.gmail.com>
Message-ID: <41F74F8C.5080100@iinet.net.au>

Guido van Rossum wrote:
> Iterators are for single sequential access. It's a feature that you
> have to import itertools (or at least that you have to invoke its
> special operations) -- iterators are not sequences and shouldn't be
> confused with such.
> 

I agree the semantic difference between an iterable and an iterator is 
important, but I am unclear on why that needs to translate to a syntactic 
difference for slicing, when it doesn't translate to such a difference for 
iteration (despite the *major* difference in the effect upon the object that is 
iterated over). Are the semantics of slicing really that much more exact than 
those for iteration?

Also, would it make a difference if the ability to extract an individual item 
from an iterator through subscripting was disallowed? (i.e. getting the second 
item of an iterator being spelt "itr[2:3].next()" instead of "itr[2]")
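The equivalent of the hypothetical "itr[2:3].next()" spelling with today's tools (a sketch; the builtin next() form is used here):

```python
from itertools import islice

it = iter("abcdef")
# "Get the item at index 2" from an iterator, expressed via islice:
third = next(islice(it, 2, 3))
assert third == "c"
# Note the side effect: items 0 through 2 have been consumed.
assert list(it) == ["d", "e", "f"]
```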

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan@email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net
From anthony at python.org  Wed Jan 26 09:51:55 2005
From: anthony at python.org (Anthony Baxter)
Date: Wed Jan 26 09:52:18 2005
Subject: [Python-Dev] RELEASED Python 2.3.5, release candidate 1
Message-ID: <200501261952.06150.anthony@python.org>


On behalf of the Python development team and the Python community, I'm
happy to announce the release of Python 2.3.5 (release candidate 1).

Python 2.3.5 is a bug-fix release. See the release notes at the website
(also available as Misc/NEWS in the source distribution) for details of
the bugs squished in this release.

Assuming no major problems crop up, a final release of Python 2.3.5 will
follow in about a week's time. 

Python 2.3.5 is the last release in the Python 2.3 series, and is being
released for those people who still need to use Python 2.3. Python 2.4
is a newer release, and should be preferred if possible. From here,
bugfix releases are switching to the Python 2.4 branch - a 2.4.1 will
follow 2.3.5 final.

For more information on Python 2.3.5, including download links for
various platforms, release notes, and known issues, please see:

    http://www.python.org/2.3.5

Highlights of this new release include:

  - Bug fixes. According to the release notes, more than 50 bugs 
    have been fixed, including a couple of bugs that could cause 
    Python to crash. 

Highlights of the previous major Python release (2.3) are available     
from the Python 2.3 page, at                                            

    http://www.python.org/2.3/highlights.html

Enjoy the new release,
Anthony

Anthony Baxter
anthony@python.org
Python Release Manager
(on behalf of the entire python-dev team)
From walter at livinglogic.de  Wed Jan 26 12:47:11 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed Jan 26 12:47:17 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <ee2a432c050125193511085d8@mail.gmail.com>
References: <000e01c502d3$0458a340$18fccc97@oemcomputer>
	<ee2a432c050125193511085d8@mail.gmail.com>
Message-ID: <41F7833F.90905@livinglogic.de>

Neal Norwitz wrote:

 > [...]
> This is the python test coverage:
>     http://coverage.livinglogic.de/coverage/web/selectEntry.do?template=2850&entryToSelect=182530

This link won't work because of session management. To get the
coverage info of ceval.c go to http://coverage.livinglogic.de,
click on the latest run, enter "ceval" in the "Filename" field,
click "Search" and click on the one line in the search result.

Bye,
    Walter Dörwald
From python at rcn.com  Wed Jan 26 15:47:41 2005
From: python at rcn.com (Raymond Hettinger)
Date: Wed Jan 26 15:53:31 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <ee2a432c050125193511085d8@mail.gmail.com>
Message-ID: <004a01c503b5$fd167b00$18fccc97@oemcomputer>

> I agree that METH_O and METH_NOARGS are near
> optimal wrt performance.  But if we could have one METH_UNPACKED
> instead of 3 METH_*, I think that would be a win.
 . . .
> Sorry, I meant eliminated w/3.0. 

So, leave METH_O and METH_NOARGS alone.  They can't be dropped until 3.0
and they can't be improved speedwise.



> > * not handling more than nine arguments,
> 
> There are very few functions I've found that take more than 2
arguments.

It's not a matter of how few; it's a matter of imposing a new, arbitrary
limit where none previously existed.  This is not a positive point for
the patch.



> Ultimately, I think we can speed things up more by having 9 different
> op codes, ie, one for each # of arguments.  CALL_FUNCTION_0,
> CALL_FUNCTION_1, ...
> (9 is still arbitrary and subject to change)

How is the compiler to know the arity of the target function?  If I call
pow(3,5), how would the compiler know that pow() can take an optional
third argument which would need to be initialized to NULL?
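The pow() example makes the compiler's problem concrete: at the call site only the argument count is visible, not the callee's full signature, so the optional third slot is invisible:

```python
# pow() takes an optional third (modulus) argument; the compiler
# emitting the opcode for pow(3, 5) cannot know that a third slot
# exists which would need filling with a default:
assert pow(3, 5) == 243
assert pow(3, 5, 7) == 5   # (3 ** 5) % 7
```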



> Then we would have N little functions, each with the exact # of
> parameters.  Each would still need a switch to call the C function
> because there may be optional parameters.  Ultimately, it's possible
> the code would be small enough to stick it into the eval_frame loop.
> Each of these steps would need to be tested, but that's a possible
> longer term direction.
 . . .
> There would only be an if to check if it was a C function or not.
> Maybe we could even get rid of this by more fixup at import time.

This is what I mean about the patch taking on a life of its own.  It's
an optimization patch that slows down METH_O and METH_NOARGS.  It's an
incremental change that throws away backwards compatibility.  It's a
simplification that introduces a bazillion new code paths.  It's a
simplification that can't be realized until 3.0.  It's a minor change
that entails new opcodes, compiler changes, and changes in all
extensions that have ever been written.

IOW, this patch has lost its focus (or innocence).  That can be
recovered by limiting the scope to improving the call time for methods
with signatures like "O|O".  That is an achievable goal that doesn't
impact backwards compatibility, doesn't negatively impact existing
near-optimal METH_O and METH_NOARGS code, doesn't mess with the
compiler, doesn't introduce new opcodes, doesn't alter import logic, and
doesn't muck-up existing extensions.



Raymond


"Until next week, keep your feet on the ground and keep reaching for the
stars." -- Casey Kasem

From fredrik at pythonware.com  Wed Jan 26 18:44:32 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Jan 26 18:44:24 2005
Subject: [Python-Dev] Re: Allowing slicing of iterators
References: <41F4DC39.9020603@iinet.net.au><ca471dc205012407251dac200d@mail.gmail.com><41F64B57.3040807@iinet.net.au>
	<ca471dc20501251030384c40ee@mail.gmail.com>
Message-ID: <ct8kti$1sr$1@sea.gmane.org>

Guido van Rossum wrote:
>> I'd like to see iterators become as easy to work with as lists are. At the
>> moment, anything that returns an iterator forces you to use the relatively
>> cumbersome itertools.islice mechanism, rather than Python's native slice syntax.
>
> Sorry. Still -1.

can we perhaps persuade you into changing that to a -0.1, so we can
continue playing with the idea?

> iterators are not sequences and shouldn't be confused with such.

the for-in statement disagrees with you, I think.  on the other hand, I'm not
sure I trust that statement any more; I'm really disappointed that it won't let
me write my loops as:

    for item in item for item in item for item in item for item in seq:
        ...

</F> 



From steve at holdenweb.com  Tue Jan 25 18:17:55 2005
From: steve at holdenweb.com (Steve Holden)
Date: Wed Jan 26 20:07:14 2005
Subject: [Python-Dev] Re: [PyCON-Organizers] PyCon: The Spam Continues ;-)
In-Reply-To: <41F5A434.1060905@ActiveState.com>
References: <41F583A5.8020206@holdenweb.com> <41F5A434.1060905@ActiveState.com>
Message-ID: <41F67F43.4010403@holdenweb.com>

David Ascher wrote:

> Steve Holden wrote:
> 
>> Dear python-dev:
>>
>> The current (as of even date) summary of my recent contributions to 
>> Python-dev appears to be spam about PyCon.
>>
>> Not being one to break habits, even not those of a lifetime sometimes, 
>> I spam you yet again to show you what a beautiful summary ActiveState 
>> have provided (I don't know whether this URL is cacheable or not):
>>
>> <http://aspn.activestate.com/ASPN/Mail/Browse/ByAuthor/python-dev?author=cHljb25AcHl0aG9uLm9yZw--> 
> 
> 
> 
> Yup, we try to make all our URLs portable and persistent.
> 
Good for you.

>> If I remember Trent Lott (?) 
> 
> 
> Nah, that's a US politician.  T'was Trent Mick.
> 
Indeed, and already corrected.

>> described at an IPC the SQL Server database that drives this system, 
>> and it was a great example of open source technology driving a 
>> proprietary (but I expect (?) relatively portable) repository.
> 
> 
> Modulo some SQLServer features we're using.
> 
Well free-text indexing would be my first guess. Anything else of 
interest? MySQL's free text indexing really sucks compared with SQL 
Server's, which to my mind is a good justification for the Microsoft 
product.

>> Since I have your attention (and if I haven't then it really doesn't 
>> matter what I write hereafter, goodbye ...) I will also point out that 
>> the current top hit on Google for
>>
>>     "Microsoft to Provide PyCon Opening Keynote"
> 
> 
> What a bizarre search.
> 
Oh, no, people run that search all the time :-)

It's actually hit #3 for "Microsoft PyCon" as well, so I guess that's 
not too bad.

> (note that some of your To's and Cc's were pretty strange...
> 
Hmm, yes, I cringed when I got the bounces. That information didn't 
belong there. If only there were a way to take emails back ...

regards
  Steve
-- 
Steve Holden               http://www.holdenweb.com/
Python Web Programming  http://pydish.holdenweb.com/
Holden Web LLC      +1 703 861 4237  +1 800 494 3119

From kbk at shore.net  Wed Jan 26 21:50:54 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed Jan 26 21:51:20 2005
Subject: [Python-Dev] I think setup.py needs major rework
In-Reply-To: <16887.7421.746339.594221@montanaro.dyndns.org> (Skip
	Montanaro's message of "Tue, 25 Jan 2005 22:30:53 -0600")
References: <16887.7421.746339.594221@montanaro.dyndns.org>
Message-ID: <87acqvzzgx.fsf@hydra.bayview.thirdcreek.com>

Skip Montanaro <skip@pobox.com> writes:

>     Python's setup.py has grown way out of control.  I'm trying to
>     build and install Python 2.4.0 on a Solaris system with Tcl/Tk
>     installed in a non-standard place and I can't figure out the
>     incantation to tell setup.py to look where they are installed.

This may be more due to the complexity of distutils than to setup.py
itself.  Special cases are special cases, after all, e.g. look at
Autotools.

setup.py is billed as "Autodetecting setup.py script for building
the Python extensions" but exactly how to override it without hacking
it isn't very well documented, when possible at all.

"Distributing Python Modules" helped me, but the reference section
is missing, so it's UTSL (Use The Source, Luke) from there.

So one improvement would be to better document overriding setup.py in
README.

Your solution may be as simple as adding to Makefile:342 (approx)

--include-dirs=xxxx --library-dirs=yyyy

where setup.py is called.  (distutils/command/build_ext.py)

*But* I suspect build() may not pass the options through to build_ext()!

So, a config file approach:

.../src/setup.cfg:
[build_ext]
include-dirs=xxxx
library-dirs=yyyy


In setup.py, PyBuildExt.build_extension() does most of the special
casing.  The last thing done is to call PyBuildExt.detect_tkinter()
which handles a bunch of platform incompatibilities. e.g.

 # OpenBSD and FreeBSD use Tcl/Tk library names like libtcl83.a, but
 # the include subdirs are named like .../include/tcl8.3.

If the previous ideas flub, you could hack your detect_tkinter() and
append your include and lib dirs to inc_dirs and lib_dirs at the
beginning of the method.

If all else fails, use Modules/Setup.dist to install Tcl/Tk?

Or maybe symlink your non-standard location?

-- 
KBK
From bac at OCF.Berkeley.EDU  Wed Jan 26 22:35:35 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Wed Jan 26 22:35:52 2005
Subject: [Python-Dev] I think setup.py needs major rework
In-Reply-To: <200501260118.47192.fdrake@acm.org>
References: <16887.7421.746339.594221@montanaro.dyndns.org>
	<200501260118.47192.fdrake@acm.org>
Message-ID: <41F80D27.1090100@ocf.berkeley.edu>

Fred L. Drake, Jr. wrote:
> On Tuesday 25 January 2005 23:30, Skip Montanaro wrote:
>  >     Python's setup.py has grown way out of control.  I'm trying to build
>  > and install Python 2.4.0 on a Solaris system with Tcl/Tk installed in a
>  > non-standard place and I can't figure out the incantation to tell setup.py
>  > to look where they are installed.
> ...
>  >     This might be an excellent sprint topic for PyCon.
> 
> Indeed it would be!
> 

... and now it is listed as one.  Started a section on the sprint wiki page for 
orphaned sprint topics with a subsection on core stuff.  Listed a rewrite of 
setup.py along with a rewrite of site.py and the usual bug/patch squashing.

URL is http://www.python.org/moin/PyConDC2005/Sprints for those not wanting to 
go hunting for it.

-Brett
From abo at minkirri.apana.org.au  Thu Jan 27 00:51:43 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Thu Jan 27 00:52:28 2005
Subject: [Python-Dev] Strange segfault in Python threads and linux
	kernel 2.6
In-Reply-To: <200501260153.21672.anthony@interlink.com.au>
References: <1106111769.3822.52.camel@schizo>
	<200501251657.57682.anthony@interlink.com.au>
	<004b01c502e6$6db77380$24ed0ccb@apana.org.au>
	<200501260153.21672.anthony@interlink.com.au>
Message-ID: <1106783503.3889.12.camel@schizo>

On Wed, 2005-01-26 at 01:53 +1100, Anthony Baxter wrote:
> On Wednesday 26 January 2005 01:01, Donovan Baarda wrote:
> > In this case it turns out to be "don't do exec() in a thread, because what
> > you exec can have all its signals masked". That turns out to be a hell of
> > a lot of things: popen, os.system, etc. They all only work OK in a
> > threaded application if what you are exec'ing doesn't use any signals.
> 
> Yep. You just have to be aware of it. We do a bit of this at work, and we
> either spool via a database table, or a directory full of spool files. 
> 
> > Actually, I've noticed that zope often has a sorta zombie "which" process
> > which it spawns. I wonder it this is a stuck thread waiting for some
> > signal...
> 
> Quite likely.

For the record, it seems that the java version also contributes. This
problem only occurs when you have the following combination:

Linux >=2.6
Python <=2.3
j2re1.4 =1.4.2.01-1 | kaffe 2:1.1.4xxx

If you use Linux 2.4, it goes away. If you use Python 2.4 it goes away.
If you use j2re1.4.1.01-1 it goes away.

For the problem to occur, all of the following must hold:

1) Linux uses the thread's sigmask instead of the main thread/process
sigmask for the exec'ed process (i.e., 2.6 does this, 2.4 doesn't).

2) Python needs to screw with the sigmask in threads (python 2.3 does,
python 2.4 doesn't).

3) The exec'ed process needs to rely on threads (j2re1.4 1.4.2.01-1
does, j2re1.4 1.4.1.01-1 doesn't).
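For the record, the per-thread mask behaviour in point 1) is easy to observe with today's Python 3 signal API (signal.pthread_sigmask did not exist in the 2.3 being discussed; Unix-only sketch):

```python
import signal
import threading

results = {}

def worker():
    # Block SIGUSR1 only in this thread.  On a 2.6 kernel, this per-thread
    # mask is what gets inherited by anything exec'ed from this thread.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    # Blocking nothing is a portable way to *query* the current mask.
    results['worker'] = signal.pthread_sigmask(signal.SIG_BLOCK, [])

t = threading.Thread(target=worker)
t.start()
t.join()
print(signal.SIGUSR1 in results['worker'])  # True: the mask is per-thread
```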

It is hard to find old Debian deb's of j2re1.4 (1.4.1.01-1), and when
you do, you will also need the now non-existent j2se-common 1.1 package.
I don't know if this qualifies as a potential bug against j2re1.4
1.4.2.01-1.

For now my solution is to roll back to the older j2re1.4.


-- 
Donovan Baarda <abo@minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/

From alan.green at gmail.com  Thu Jan 27 01:27:03 2005
From: alan.green at gmail.com (Alan Green)
Date: Thu Jan 27 01:27:06 2005
Subject: [Python-Dev] Patch review: [ 1100942 ] datetime.strptime
	constructor added
Message-ID: <c6fc3cde050126162732430a5e@mail.gmail.com>

I see a need for this patch - I've had to write
"datetime(*(time.strptime(date_string, format)[0:6]))" far too many
times.
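For the archives, here is that idiom side by side with the constructor the patch proposes (datetime.strptime did eventually land, in Python 2.5):

```python
import time
from datetime import datetime

date_string, fmt = "2005-01-27 01:27:03", "%Y-%m-%d %H:%M:%S"

# The verbose idiom referred to above: parse to a struct_time,
# then feed its first six fields to the datetime constructor.
dt_old = datetime(*(time.strptime(date_string, fmt)[0:6]))

# The proposed classmethod does the same in one call.
dt_new = datetime.strptime(date_string, fmt)

print(dt_old == dt_new)  # True
```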

I don't understand the C API well enough to check if
reference counts are handled properly, but otherwise the
implementation looks straightforward.

Documentation looks good and the test passes on my machine.

Two suggestions:

1. In the time module, the strptime() function's format
parameter is  optional. For consistency's sake, I'd expect
datetime.strptime()'s format parameter also to be optional.
(On the other hand, the default value for the format is not
very useful.)

2. Since strftime is supported by datetime.time,
datetime.date and datetime.datetime, I'd also expect
strptime to be supported by all three classes. Could you add
that now, or would it be better to do it as a separate patch?

Alan. 
-- 
Alan Green 
alan.green@cardboard.nu - http://cardboard.nu
From alan.green at gmail.com  Thu Jan 27 01:32:58 2005
From: alan.green at gmail.com (Alan Green)
Date: Thu Jan 27 01:33:03 2005
Subject: [Python-Dev] Patch review: [ 1094542 ] add Bunch type to
	collections module
Message-ID: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>

Steven Bethard is proposing a new collection class named Bunch. I had
a few suggestions which I attached as comments to the patch - but what
is really required is a bit more work on the draft PEP, and then
discussion on the python-dev mailing list.

http://sourceforge.net/tracker/?func=detail&aid=1100942&group_id=5470&atid=305470

Alan.
-- 
Alan Green 
alan.green@cardboard.nu - http://cardboard.nu
From steven.bethard at gmail.com  Thu Jan 27 01:40:06 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu Jan 27 01:40:08 2005
Subject: [Python-Dev] Patch review: [ 1094542 ] add Bunch type to
	collections module
In-Reply-To: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>
References: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>
Message-ID: <d11dcfba050126164067bef3a1@mail.gmail.com>

Alan Green <alan.green@gmail.com> wrote:
> Steven Bethard is proposing a new collection class named Bunch. I had
> a few suggestions which I attached as comments to the patch - but what
> is really required is a bit more work on the draft PEP, and then
> discussion on the python-dev mailing list.
> 
> http://sourceforge.net/tracker/?func=detail&aid=1100942&group_id=5470&atid=305470

I believe the correct tracker is:

http://sourceforge.net/tracker/index.php?func=detail&aid=1094542&group_id=5470&atid=305470

There was a substantial discussion about this on the python-list
before I put the PEP together:

http://mail.python.org/pipermail/python-list/2004-November/252621.html

I applied for a PEP number on 2 Jan 2005 and haven't heard back yet,
but the patch was posted so people could easily play around with it if
they liked.  My intentions were to post the PEP to python-dev as soon
as I got a PEP number for it.

Steve
-- 
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy
From alan.green at gmail.com  Thu Jan 27 02:44:08 2005
From: alan.green at gmail.com (Alan Green)
Date: Thu Jan 27 02:44:10 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types to
	__builtin__
Message-ID: <c6fc3cde050126174466c45c09@mail.gmail.com>

Last August, James Knight posted to python-dev, "There's a fair number
of classes that claim they are defined in __builtin__, but do not
actually appear there". There was a discussion and James submitted
this patch:

http://sourceforge.net/tracker/index.php?func=detail&aid=1009811&group_id=5470&atid=305470

The final result of the discussion is unclear. Guido declared himself
+0.5 on the concept, but nobody has reviewed the patch in detail yet.

The original email thread starts here: 

http://mail.python.org/pipermail/python-dev/2004-August/047477.html

The patch still applies, and test cases still run OK afterwards.

Now that 2.4 has been released it is perhaps a good time to discuss it
on python-dev again. If it isn't discussed, then the patch should be
closed due to lack of interest.

Alan.
-- 
Alan Green 
alan.green@cardboard.nu - http://cardboard.nu
From skip at pobox.com  Thu Jan 27 04:17:50 2005
From: skip at pobox.com (Skip Montanaro)
Date: Thu Jan 27 04:18:00 2005
Subject: [Python-Dev] Patch review: [ 1100942 ] datetime.strptime
	constructor added
In-Reply-To: <c6fc3cde050126162732430a5e@mail.gmail.com>
References: <c6fc3cde050126162732430a5e@mail.gmail.com>
Message-ID: <16888.23902.44511.8382@montanaro.dyndns.org>


    Alan> 1. In the time module, the strptime() function's format
    Alan> parameter is optional.  For consistency's sake, I'd expect
    Alan> datetime.strptime()'s format parameter also to be optional.  (On
    Alan> the other hand, the default value for the format is not very
    Alan> useful.)

Correct.  No need to propagate a mistake.

    Alan> 2. Since strftime is supported by datetime.time,
    Alan> datetime.date and datetime.datetime, I'd also expect strptime to
    Alan> be supported by all three classes.  Could you add that now, or
    Alan> would it be better to do it as a separate patch?

That can probably be done, but I'm not convinced strftime really belongs on
either date or time objects given the information those objects are missing:

    >>> t = datetime.datetime.now().time()
    >>> t.strftime("%Y-%m-%d")
    '1900-01-01'
    >>> d = datetime.datetime.now().date()
    >>> d.strftime("%H:%M:%S")
    '00:00:00'

I would be happy for strftime to only be available for datetime objects
(assuming there was a good way to get from time or date objects to datetime
objects short of extracting their individual attributes).  In any case,
going from datetime to date or time objects is trivial:

    >>> dt = datetime.datetime.now()
    >>> dt.time()
    datetime.time(21, 12, 18, 705365)

so parsing a string into a datetime object then splitting out date and time
objects seems reasonable.
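That flow, parse to a datetime and then split, is one call plus two trivial accessors (sketched here with datetime.strptime, the classmethod under review):

```python
from datetime import datetime

# Parse the full string once, then derive the date and time pieces.
dt = datetime.strptime("2005-01-26 21:12:18", "%Y-%m-%d %H:%M:%S")
print(dt.date())  # 2005-01-26
print(dt.time())  # 21:12:18
```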

Skip
From python at rcn.com  Thu Jan 27 05:55:41 2005
From: python at rcn.com (Raymond Hettinger)
Date: Thu Jan 27 05:59:18 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types
	to__builtin__
In-Reply-To: <c6fc3cde050126174466c45c09@mail.gmail.com>
Message-ID: <001f01c5042c$73efe1a0$b038fea9@oemcomputer>

> Last August, James Knight posted to python-dev, "There's a fair number
> of classes that claim they are defined in __builtin__, but do not
> actually appear there". There was a discussion and James submitted
> this patch:
> 
> http://sourceforge.net/tracker/index.php?func=detail&aid=1009811&group_id=5470&atid=305470

I'm -1 on adding these to __builtin__.  They are just distractors and
have almost no use in real Python programs.  Worse, if you do use them,
then you are likely to be programming badly -- we don't want to
encourage that.

Also, I take some of these, such as dictproxy and cell, to be
implementation details that are subject to change.  Adding them to
__builtin__ would unnecessarily immortalize them.



> The final result of the discussion is unclear. Guido declared himself
> +0.5 on the concept, but nobody has reviewed the patch in detail yet.

Even if Guido were suffering from time machine induced hallucinations
that day, he still knew better than to go a full +1.


Raymond

From martin at v.loewis.de  Thu Jan 27 07:20:48 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Jan 27 07:20:47 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing
	types	to__builtin__
In-Reply-To: <001f01c5042c$73efe1a0$b038fea9@oemcomputer>
References: <001f01c5042c$73efe1a0$b038fea9@oemcomputer>
Message-ID: <41F88840.7070105@v.loewis.de>

Raymond Hettinger wrote:
> I'm -1 on adding these to __builtin__.  They are just distractors and
> have almost no use in real Python programs.  Worse, if you do use them,
> then you are likely to be programming badly -- we don't want to
> encourage that.

I agree. Because of the BDFL pronouncement, I cannot reject the patch,
but I won't accept it, either. So it seems that this patch will have
to sit in the SF tracker until either Guido processes it, or it is
withdrawn.

Regards,
Martin
From foom at fuhm.net  Thu Jan 27 08:01:20 2005
From: foom at fuhm.net (James Y Knight)
Date: Thu Jan 27 08:01:43 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing
	types	to__builtin__
In-Reply-To: <41F88840.7070105@v.loewis.de>
References: <001f01c5042c$73efe1a0$b038fea9@oemcomputer>
	<41F88840.7070105@v.loewis.de>
Message-ID: <3F36B0D6-7031-11D9-B9BF-000A95A50FB2@fuhm.net>

On Jan 27, 2005, at 1:20 AM, Martin v. L?wis wrote:
> I agree. Because of the BDFL pronouncement, I cannot reject the patch,
> but I won't accept it, either. So it seems that this patch will have
> to sit in the SF tracker until either Guido processes it, or it is
> withdrawn.

If people want to restart this discussion, I'd like to start back with 
the following message, rather than simply accepting/rejecting the 
patch. From the two comments so far, it seems like it's not the patch 
that needs reviewing, but still the concept.

On August 10, 2004 12:17:14 PM EDT, I wrote:
> Sooo should (for 'generator' in objects that claim to be in
> __builtins__ but aren't),
> 1) 'generator' be added to __builtins__
> 2) 'generator' be added to types.py and its __module__ be set to 
> 'types'
> 3) 'generator' be added to <newmodule>.py and its __module__ be set to
> '<newmodule>' (and a name for the module chosen)

Basically, I'd like to see them be given a binding somewhere, and have 
their claimed module agree with that, but am not particular as to 
where. Option #2 seemed to be rejected last time, and option #1 was 
given approval, so that's what I wrote a patch for. It sounds like it's 
getting pretty strong "no" votes this time around, however. Therefore, 
I would like to suggest option #3, with <newmodule> being, say, 
'internals'.

James
From fperez.net at gmail.com  Thu Jan 27 09:07:06 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Thu Jan 27 09:20:55 2005
Subject: [Python-Dev] Re: Patch review: [ 1094542 ] add Bunch type to
	collections module
References: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>
	<d11dcfba050126164067bef3a1@mail.gmail.com>
Message-ID: <cta7fq$144$1@sea.gmane.org>

Hi all,

Steven Bethard wrote:

> Alan Green <alan.green@gmail.com> wrote:
>> Steven Bethard is proposing a new collection class named Bunch. I had
>> a few suggestions which I attached as comments to the patch - but what
>> is really required is a bit more work on the draft PEP, and then
>> discussion on the python-dev mailing list.
>> 
>> http://sourceforge.net/tracker/?func=detail&aid=1100942&group_id=5470&atid=305470
> 
> I believe the correct tracker is:
> 
> http://sourceforge.net/tracker/index.php?func=detail&aid=1094542&group_id=5470&atid=305470

A while back, when I started writing ipython, I had to write this same class (I
called it Struct), and I ended up building a fairly powerful one for handling
ipython's recursive configuration system robustly.

The design has some nasty problems which I'd change if I were doing this today
(I was just learning the language at the time).  But it also taught me a few
things about what one ends up needing from such a beast in complex situations.

I've posted the code here as plain text and syntax-highlighted html, in case
anyone is interested:

http://amath.colorado.edu/faculty/fperez/python/Struct.py
http://amath.colorado.edu/faculty/fperez/python/Struct.py.html

One glaring problem of my class is the blocking of dictionary method names as
attributes; this would have to be addressed differently.

But one thing which I really find necessary from a useful 'Bunch' class, is
the ability to access attributes via foo[name] (which requires implementing
__getitem__).  Named access is convenient when you _know_ the name you need
(foo.attr).  However, if the name of the attribute is held in a variable, IMHO 
foo[name] beats getattr(foo,name) in clarity and feels much more 'pythonic'.
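A minimal sketch of that idea (hypothetical names; not Fernando's actual Struct code):

```python
class Struct:
    """Attribute-style storage that also supports foo[name] lookup."""
    def __init__(self, **kw):
        self.__dict__.update(kw)
    def __getitem__(self, name):
        return self.__dict__[name]

s = Struct(color='red')
key = 'color'
print(s[key] == getattr(s, key))  # True: same value, two spellings
```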

Another useful feature of this Struct class is the 'merge' method.  While mine
is probably overly flexible and complex for the stdlib (though it is
incredibly useful in many situations), I'd really like dicts/Structs to have
another way of updating with a single method, which was non-destructive
(update automatically overwrites with the new data).  Currently this is best
done with a loop, but a 'merge' method which would work like 'update', but
without overwriting would be a great improvement, I think.

Finally, my values() method allows an optional keys argument, which I also
find very useful.  If this keys sequence is given, values are returned only
for those keys.  I don't know if anyone else would find such a feature useful,
but I do :).  It allows a kind of 'slicing' of dicts which can be really
convenient.
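That keyed values() 'slicing' could look like this (again a hypothetical sketch, not the posted Struct code):

```python
def values(d, keys=None):
    # With no keys argument, behave like dict.values(); with a keys
    # sequence, return only the values for those keys, in that order.
    if keys is None:
        return list(d.values())
    return [d[k] for k in keys]

d = {'x': 1, 'y': 2, 'z': 3}
print(values(d, ['z', 'x']))  # [3, 1]
```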

I understand that my Struct is much more of a dict/Bunch hybrid than what you
have in mind.  But in heavy usage, I quickly realized that at least having
__getitem__ implemented was an immediate need in many cases.

Finally, the Bunch class should have a way of returning its values easily as a
plain dictionary, for cases when you want to pass this data into a function
which expects a true dict.  Otherwise, it will 'lock' your information in.

I really would like to see such a class in the stdlib, as it's something that
pretty much everyone ends up rewriting.  I certainly don't claim my
implementation to be a good reference (it isn't).  But perhaps it can be
useful to the discussion as an example of a 'battle-tested' such class, flaws
and all.

I think the current pre-PEP version is a bit too limited to be generally
useful in complex, real-world situations.  It would be a good starting point
to subclass for more demanding situations, but IMHO it would be worth
considering a more powerful default class.

Regards,



From p.f.moore at gmail.com  Thu Jan 27 10:49:48 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu Jan 27 10:49:51 2005
Subject: [Python-Dev] PEP 309 (Was: Patch review: [ 1094542 ] add Bunch type
	to collections module)
In-Reply-To: <cta7fq$144$1@sea.gmane.org>
References: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>
	<d11dcfba050126164067bef3a1@mail.gmail.com>
	<cta7fq$144$1@sea.gmane.org>
Message-ID: <79990c6b05012701492440d0c0@mail.gmail.com>

On Thu, 27 Jan 2005 01:07:06 -0700, Fernando Perez <fperez.net@gmail.com> wrote:
> I really would like to see such a class in the stdlib, as it's something that
> pretty much everyone ends up rewriting.  I certainly don't claim my
> implementation to be a good reference (it isn't).  But perhaps it can be
> useful to the discussion as an example of a 'battle-tested' such class, flaws
> and all.

On the subject of  "things everyone ends up rewriting", what needs to
be done to restart discussion on PEP 309 (Partial Function
Application)? The PEP is marked "Accepted" and various patches exist:

941881 - C implementation
1006948 - Windows build changes
931010 - Unit Tests
931007 - Documentation
931005 - Reference implementation (in Python)

I get the impression that there are some outstanding tweaks required
to the Windows build, but I don't have VS.NET to check and/or update
the patch.

Does this just need a core developer to pick it up? I guess I'll go
off and do some patch/bug reviews...

Paul.
From gvanrossum at gmail.com  Thu Jan 27 16:48:07 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Jan 27 16:48:14 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types
	to__builtin__
In-Reply-To: <3F36B0D6-7031-11D9-B9BF-000A95A50FB2@fuhm.net>
References: <001f01c5042c$73efe1a0$b038fea9@oemcomputer>
	<41F88840.7070105@v.loewis.de>
	<3F36B0D6-7031-11D9-B9BF-000A95A50FB2@fuhm.net>
Message-ID: <ca471dc205012707481d0786eb@mail.gmail.com>

On Thu, 27 Jan 2005 02:01:20 -0500, James Y Knight <foom@fuhm.net> wrote:

> Basically, I'd like to see them be given a binding somewhere, and have
> their claimed module agree with that, but am not particular as to
> where. Option #2 seemed to be rejected last time, and option #1 was
> given approval, so that's what I wrote a patch for. It sounds like it's
> getting pretty strong "no" votes this time around, however. Therefore,
> I would like to suggest option #3, with <newmodule> being, say,
> 'internals'.

+1

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From python at rcn.com  Thu Jan 27 18:11:06 2005
From: python at rcn.com (Raymond Hettinger)
Date: Thu Jan 27 18:14:50 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing
	typesto__builtin__
In-Reply-To: <ca471dc205012707481d0786eb@mail.gmail.com>
Message-ID: <004301c50493$32e80fe0$433fc797@oemcomputer>

[James Y Knight]
> > Basically, I'd like to see them be given a binding somewhere, and have
> > their claimed module agree with that, but am not particular as to where.
> > Option #2 seemed to be rejected last time, and option #1 was given
> > approval, so that's what I wrote a patch for. It sounds like it's
> > getting pretty strong "no" votes this time around, however. Therefore,
> > I would like to suggest option #3, with <newmodule> being, say,
> > 'internals'.

[GvR]
> +1

That gives them a place to live and doesn't clutter __builtin__.
However, it should be named __internals__.

The next question is how to document it.  My preference is to be clear
that it is implementation specific (Jython won't have cell, PyCFunction,
and dictproxy types); that it is subject to change between versions (so
as not to prematurely immortalize design/implementation accidents); and
that they have only esoteric application (99.9% of programs won't need
them and should avoid them like the plague).

Calling it __internals__ will help emphasize that we are exposing parts
of the implementation that were consciously left semi-private or
undocumented.


Raymond  

From steven.bethard at gmail.com  Thu Jan 27 18:52:33 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu Jan 27 18:52:37 2005
Subject: [Python-Dev] Re: Patch review: [ 1094542 ] add Bunch type to
	collections module
In-Reply-To: <cta7fq$144$1@sea.gmane.org>
References: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>
	<d11dcfba050126164067bef3a1@mail.gmail.com>
	<cta7fq$144$1@sea.gmane.org>
Message-ID: <d11dcfba05012709524d5af4c5@mail.gmail.com>

Fernando Perez <fperez.net@gmail.com> wrote:
> > Alan Green <alan.green@gmail.com> wrote:
> >> Steven Bethard is proposing a new collection class named Bunch. I had
> >> a few suggestions which I attached as comments to the patch - but what
> >> is really required is a bit more work on the draft PEP, and then
> >> discussion on the python-dev mailing list.
>
> But one thing which I really find necessary from a useful 'Bunch' class, is
> the ability to access attributes via foo[name] (which requires implementing
> __getitem__).  Named access is convenient when you _know_ the name you need
> (foo.attr).  However, if the name of the attribute is held in a variable, IMHO
> foo[name] beats getattr(foo,name) in clarity and feels much more 'pythonic'.

My feeling about this is that if the name of the attribute is held in
a variable, you should be using a dict, not a Bunch/Struct.  If you
have a Bunch/Struct and decide you want a dict instead, you can just
use vars:

py> b = Bunch(a=1, b=2, c=3)
py> vars(b)
{'a': 1, 'c': 3, 'b': 2}
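For readers without the patch handy, a Bunch sufficient for the example above can be sketched in a few lines (the patch's actual implementation may differ in detail):

```python
class Bunch:
    """Minimal sketch: keyword arguments become instance attributes."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

b = Bunch(a=1, b=2, c=3)
print(b.a, b.b, b.c)                         # 1 2 3
print(vars(b) == {'a': 1, 'b': 2, 'c': 3})   # True
```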

> Another useful feature of this Struct class is the 'merge' method.
[snip]
> my values() method allows an optional keys argument, which I also
> find very useful.

Both of these features sound useful, but I don't see that they're
particularly more useful in the case of a Bunch/Struct than they are
for dict.  If dict gets such methods, then I'd be happy to add them to
Bunch/Struct, but for consistency's sake, I think at the moment I'd
prefer that people who want this functionality subclass Bunch/Struct
and add the methods themselves.

> I think the current pre-PEP version is a bit too limited to be generally
> useful in complex, real-world situtations.  It would be a good starting point
> to subclass for more demanding situations, but IMHO it would be worth
> considering a more powerful default class.

I'm probably not willing to budge much on adding dict-style methods --
if you want a dict, use a dict.  But if people think they're
necessary, there are a few methods from Struct that I wouldn't be too
upset if I had to add, e.g. clear, copy, etc.  But I'm going to need
more feedback before I make any changes like this.

Steve
-- 
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy
From martin at v.loewis.de  Thu Jan 27 23:41:30 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Jan 27 23:41:24 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing
	types	to__builtin__
In-Reply-To: <3F36B0D6-7031-11D9-B9BF-000A95A50FB2@fuhm.net>
References: <001f01c5042c$73efe1a0$b038fea9@oemcomputer>
	<41F88840.7070105@v.loewis.de>
	<3F36B0D6-7031-11D9-B9BF-000A95A50FB2@fuhm.net>
Message-ID: <41F96E1A.3020701@v.loewis.de>

James Y Knight wrote:
>> Sooo should (for 'generator' in objects that claim to be in
>> __builtins__ but aren't),
>> 1) 'generator' be added to __builtins__
>> 2) 'generator' be added to types.py and its __module__ be set to 'types'
>> 3) 'generator' be added to <newmodule>.py and its __module__ be set to
>> '<newmodule>' (and a name for the module chosen)

There are more alternatives:
4) the __module__ of these types could be absent
    (i.e. accessing __module__ could give an AttributeError)
5) the __module__ could be present and have a value of None
6) anything could be left as is. The __module__ value of these
    types might be somewhat confusing, but not enough so to
    justify changing it to any of the alternatives, which might
    also be confusing (each in their own way).

> Basically, I'd like to see them be given a binding somewhere, and have 
> their claimed module agree with that, but am not particular as to where. 

I think I cannot agree with this as a goal regardless of the consequences.

> Option #2 seemed to be rejected last time, and option #1 was given 
> approval, so that's what I wrote a patch for. It sounds like it's 
> getting pretty strong "no" votes this time around, however. Therefore, I 
> would like to suggest option #3, with <newmodule> being, say, 'internals'.

-1. 'internals' is not any better than 'sys', 'new', or 'types'. It
is worse, as new modules are confusing to users - one more thing they
have to learn.

Regards,
Martin
From python at rcn.com  Thu Jan 27 23:52:17 2005
From: python at rcn.com (Raymond Hettinger)
Date: Thu Jan 27 23:56:00 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing
	types	to__builtin__
In-Reply-To: <41F96E1A.3020701@v.loewis.de>
Message-ID: <006001c504c2$da9cc1c0$433fc797@oemcomputer>

> > Basically, I'd like to see them be given a binding somewhere, and have
> > their claimed module agree with that, but am not particular as to where.
> 
> I think I cannot agree with this as a goal regardless of the consequences.

Other than a vague feeling of completeness is there any reason this
needs to be done?  Is there anything useful that currently cannot be
expressed without this new module?  



Raymond

From martin at v.loewis.de  Fri Jan 28 00:24:51 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri Jan 28 00:24:46 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add
	missing	types	to__builtin__
In-Reply-To: <006001c504c2$da9cc1c0$433fc797@oemcomputer>
References: <006001c504c2$da9cc1c0$433fc797@oemcomputer>
Message-ID: <41F97843.8070902@v.loewis.de>

Raymond Hettinger wrote:
> Other than a vague feeling of completeness is there any reason this
> needs to be done?  Is there anything useful that currently cannot be
> expressed without this new module?  

That I wonder myself, too.

Regards,
Martin
From fperez.net at gmail.com  Fri Jan 28 02:16:24 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Fri Jan 28 02:16:32 2005
Subject: [Python-Dev] Re: Re: Patch review: [ 1094542 ] add Bunch type to
	collections module
References: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>
	<d11dcfba050126164067bef3a1@mail.gmail.com>
	<cta7fq$144$1@sea.gmane.org>
	<d11dcfba05012709524d5af4c5@mail.gmail.com>
Message-ID: <ctc3p8$ovu$1@sea.gmane.org>

Steven Bethard wrote:

> Fernando Perez <fperez.net@gmail.com> wrote:

> My feeling about this is that if the name of the attribute is held in
> a variable, you should be using a dict, not a Bunch/Struct.  If you
> have a Bunch/Struct and decide you want a dict instead, you can just
> use vars:
> 
> py> b = Bunch(a=1, b=2, c=3)
> py> vars(b)
> {'a': 1, 'c': 3, 'b': 2}

Well, the problem I see here is that often, you need to mix both kinds of
usage.  It's reasonable to have code for which Bunch is exactly what you need
in most places, but which also has a number of accesses via variables whose
values are resolved at runtime.  Granted, you can use getattr(bunch, varname), or
make an on-the-fly dict as you indicated above.  But since Bunch is above all
a convenience class for common idioms, I think supporting a common need is a
reasonable idea.  Again, just my opinion.
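For illustration (the Bunch here is a minimal stand-in I'm sketching, not
Steven's actual patch), the two access styles can coexist like this:

```python
class Bunch:
    """Minimal stand-in for the proposed collections.Bunch."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

b = Bunch(a=1, b=2, c=3)

# Static attribute access for the common case...
print(b.a)                # 1

# ...and getattr()/vars() when the name is only known at runtime.
name = 'c'
print(getattr(b, name))   # 3
print(vars(b))            # {'a': 1, 'b': 2, 'c': 3}
```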

>> Another useful feature of this Struct class is the 'merge' method.
> [snip]
>> my values() method allows an optional keys argument, which I also
>> find very useful.
> 
> Both of these features sound useful, but I don't see that they're
> particularly more useful in the case of a Bunch/Struct than they are
> for dict.  If dict gets such methods, then I'd be happy to add them to
> Bunch/Struct, but for consistency's sake, I think at the moment I'd
> prefer that people who want this functionality subclass Bunch/Struct
> and add the methods themselves.

It's very true that these are almost a request for a dict extension.  Frankly,
I'm too swamped to follow up with a pep/patch for it, though.  Pity, because
they can be really useful... Takers?

> I'm probably not willing to budge much on adding dict-style methods --
> if you want a dict, use a dict.  But if people think they're
> necessary, there are a few methods from Struct that I wouldn't be too
> upset if I had to add, e.g. clear, copy, etc.  But I'm going to need
> more feedback before I make any changes like this.

You already have update(), which by the way precludes a bunch storing an
'update' attribute.  My class suffers from the same problem, just with many
more names.  I've thought about this, and my favorite solution so far would be
to provide whichever dict-like methods end up implemented (update, merge (?),
etc) with a leading single underscore.  I simply don't see any other way to
cleanly distinguish between a bunch which holds an 'update' attribute and the
update method.  

I guess making them classmethods (or is it staticmethods? I don't use those so
I may be confusing terminology) might be a clean way out:

Bunch.update(mybunch, othermapping) -> modifies mybunch.

Less traditional OO syntax for bunches, but this would sidestep the potential
name conflicts.

Anyway, these are just some thoughts.  Feel free to take what you like.

Regards,

f

From steven.bethard at gmail.com  Fri Jan 28 02:25:57 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri Jan 28 02:26:01 2005
Subject: [Python-Dev] Re: Re: Patch review: [ 1094542 ] add Bunch type to
	collections module
In-Reply-To: <ctc3p8$ovu$1@sea.gmane.org>
References: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>
	<d11dcfba050126164067bef3a1@mail.gmail.com>
	<cta7fq$144$1@sea.gmane.org>
	<d11dcfba05012709524d5af4c5@mail.gmail.com>
	<ctc3p8$ovu$1@sea.gmane.org>
Message-ID: <d11dcfba05012717252035cea1@mail.gmail.com>

Fernando Perez wrote:
> Steven Bethard wrote:
> > I'm probably not willing to budge much on adding dict-style methods --
> > if you want a dict, use a dict.  But if people think they're
> > necessary, there are a few methods from Struct that I wouldn't be too
> > upset if I had to add, e.g. clear, copy, etc.  But I'm going to need
> > more feedback before I make any changes like this.
> 
> You already have update(), which by the way precludes a bunch storing an
> 'update' attribute.

Well, actually, you can have an update attribute, but then you have to
call update from the class instead of the instance:

py> from bunch import Bunch
py> b = Bunch(update=3)
py> b.update
3
py> b.update(Bunch(hi=4))
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
TypeError: 'int' object is not callable
py> Bunch.update(b, Bunch(hi=4))
py> b.hi
4

> Bunch.update(mybunch, othermapping) -> modifies mybunch.

Yup, that works currently.  As is normal for new-style classes
(AFAIK), the methods are stored in the class object, so assigning an
'update' attribute to an instance just hides the method in the class. 
You can still reach the method by invoking it from the class and
passing it an instance as the first argument.

Steve
-- 
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy
From fperez.net at gmail.com  Fri Jan 28 02:31:55 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Fri Jan 28 02:32:10 2005
Subject: [Python-Dev] Re: Re: Re: Patch review: [ 1094542 ] add Bunch type
	to collections module
References: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>
	<d11dcfba050126164067bef3a1@mail.gmail.com>
	<cta7fq$144$1@sea.gmane.org>
	<d11dcfba05012709524d5af4c5@mail.gmail.com>
	<ctc3p8$ovu$1@sea.gmane.org>
	<d11dcfba05012717252035cea1@mail.gmail.com>
Message-ID: <ctc4mb$qo7$1@sea.gmane.org>

Steven Bethard wrote:

> Fernando Perez wrote:
>> Steven Bethard wrote:
>> > I'm probably not willing to budge much on adding dict-style methods --
>> > if you want a dict, use a dict.  But if people think they're
>> > necessary, there are a few methods from Struct that I wouldn't be too
>> > upset if I had to add, e.g. clear, copy, etc.  But I'm going to need
>> > more feedback before I make any changes like this.
>> 
>> You already have update(), which by the way precludes a bunch storing an
>> 'update' attribute.
> 
> Well, actually, you can have an update attribute, but then you have to
> call update from the class instead of the instance:

[...]

Of course, you are right.

However, I think it would perhaps be best to advertise any methods of Bunch as
strictly classmethods from day 1.  Otherwise, you can have:

b = Bunch()
b.update(otherdict) -> otherdict happens to have an 'update' key

... more code

b.update(someotherdict) -> boom! update is not callable

If all Bunch methods are officially presented always as classmethods, users can
simply expect that all attributes of a bunch are meant to store data, without
any instance methods at all.

Regards,

f

From steven.bethard at gmail.com  Fri Jan 28 02:54:04 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri Jan 28 02:54:07 2005
Subject: [Python-Dev] Re: Re: Re: Patch review: [ 1094542 ] add Bunch type
	to collections module
In-Reply-To: <ctc4mb$qo7$1@sea.gmane.org>
References: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>
	<d11dcfba050126164067bef3a1@mail.gmail.com>
	<cta7fq$144$1@sea.gmane.org>
	<d11dcfba05012709524d5af4c5@mail.gmail.com>
	<ctc3p8$ovu$1@sea.gmane.org>
	<d11dcfba05012717252035cea1@mail.gmail.com>
	<ctc4mb$qo7$1@sea.gmane.org>
Message-ID: <d11dcfba050127175427033d5c@mail.gmail.com>

On Thu, 27 Jan 2005 18:31:55 -0700, Fernando Perez <fperez.net@gmail.com> wrote:
> However, I think it would perhaps be best to advertise any methods of Bunch as
> strictly classmethods from day 1.  Otherwise, you can have:
> 
> b = Bunch()
> b.update(otherdict) -> otherdict happens to have an 'update' key
> 
> ... more code
> 
> b.update(someotherdict) -> boom! update is not callable
> 
> If all Bunch methods are officially presented always as classmethods, users can
> simply expect that all attributes of a bunch are meant to store data, without
> any instance methods at all.

That sounds reasonable to me.  I'll fix update to be a staticmethod. 
If people want other methods, I'll make sure they're staticmethods
too.[1]

Steve

[1] In all the cases I can think of, staticmethod is sufficient -- the
methods don't need to access any attributes of the Bunch class.  If
anyone has a good reason to make them classmethods instead, let me
know...
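A rough sketch of what that could look like (hypothetical, not the actual
patch): with update() as a staticmethod invoked via the class, an instance
attribute named 'update' is plain data and shadows nothing callers rely on:

```python
class Bunch:
    """Illustrative sketch: methods are staticmethods called via the
    class, so every instance attribute is pure data."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    @staticmethod
    def update(bunch, other):
        # 'other' may be any mapping, or another Bunch
        bunch.__dict__.update(getattr(other, '__dict__', other))

b = Bunch(update=3)          # 'update' is just data here
Bunch.update(b, {'hi': 4})   # the method is always invoked via the class
print(b.update, b.hi)        # 3 4
```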

-- 
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy
From gvanrossum at gmail.com  Fri Jan 28 03:48:00 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Jan 28 03:48:04 2005
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/idlelib
	EditorWindow.py, 1.65, 1.66 NEWS.txt, 1.53,
	1.54 config-keys.def, 1.21, 1.22 configHandler.py, 1.37, 1.38
In-Reply-To: <E1CuJoB-0001PA-7E@sc8-pr-cvs1.sourceforge.net>
References: <E1CuJoB-0001PA-7E@sc8-pr-cvs1.sourceforge.net>
Message-ID: <ca471dc2050127184856bb0b58@mail.gmail.com>

Thanks!!!


On Thu, 27 Jan 2005 16:16:19 -0800, kbk@users.sourceforge.net
<kbk@users.sourceforge.net> wrote:
> Update of /cvsroot/python/python/dist/src/Lib/idlelib
> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv5316
> 
> Modified Files:
>         EditorWindow.py NEWS.txt config-keys.def configHandler.py
> Log Message:
> Add keybindings for del-word-left and del-word-right.
> 
> M EditorWindow.py
> M NEWS.txt
> M config-keys.def
> M configHandler.py
> 
> Index: EditorWindow.py
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Lib/idlelib/EditorWindow.py,v
> retrieving revision 1.65
> retrieving revision 1.66
> diff -u -d -r1.65 -r1.66
> --- EditorWindow.py     19 Jan 2005 00:22:54 -0000      1.65
> +++ EditorWindow.py     28 Jan 2005 00:16:15 -0000      1.66
> @@ -141,6 +141,8 @@
>          text.bind("<<change-indentwidth>>",self.change_indentwidth_event)
>          text.bind("<Left>", self.move_at_edge_if_selection(0))
>          text.bind("<Right>", self.move_at_edge_if_selection(1))
> +        text.bind("<<del-word-left>>", self.del_word_left)
> +        text.bind("<<del-word-right>>", self.del_word_right)
> 
>          if flist:
>              flist.inversedict[self] = key
> @@ -386,6 +388,14 @@
>                      pass
>          return move_at_edge
> 
> +    def del_word_left(self, event):
> +        self.text.event_generate('<Meta-Delete>')
> +        return "break"
> +
> +    def del_word_right(self, event):
> +        self.text.event_generate('<Meta-d>')
> +        return "break"
> +
>      def find_event(self, event):
>          SearchDialog.find(self.text)
>          return "break"
> 
> Index: NEWS.txt
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Lib/idlelib/NEWS.txt,v
> retrieving revision 1.53
> retrieving revision 1.54
> diff -u -d -r1.53 -r1.54
> --- NEWS.txt    19 Jan 2005 00:22:57 -0000      1.53
> +++ NEWS.txt    28 Jan 2005 00:16:16 -0000      1.54
> @@ -3,17 +3,24 @@
> 
>  *Release date: XX-XXX-2005*
> 
> +- Add keybindings for del-word-left and del-word-right.
> +
>  - Discourage using an indent width other than 8 when using tabs to indent
>    Python code.
> 
>  - Restore use of EditorWindow.set_indentation_params(), was dead code since
> -  Autoindent was merged into EditorWindow.
> +  Autoindent was merged into EditorWindow.  This allows IDLE to conform to the
> +  indentation width of a loaded file.  (But it still will not switch to tabs
> +  even if the file uses tabs.)  Any change in indent width is local to that
> +  window.
> 
>  - Add Tabnanny check before Run/F5, not just when Checking module.
> 
>  - If an extension can't be loaded, print warning and skip it instead of
>    erroring out.
> 
> +- Improve error handling when .idlerc can't be created (warn and exit).
> +
>  - The GUI was hanging if the shell window was closed while a raw_input()
>    was pending.  Restored the quit() of the readline() mainloop().
>    http://mail.python.org/pipermail/idle-dev/2004-December/002307.html
> 
> Index: config-keys.def
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Lib/idlelib/config-keys.def,v
> retrieving revision 1.21
> retrieving revision 1.22
> diff -u -d -r1.21 -r1.22
> --- config-keys.def     17 Aug 2004 08:01:19 -0000      1.21
> +++ config-keys.def     28 Jan 2005 00:16:16 -0000      1.22
> @@ -55,6 +55,8 @@
>  untabify-region=<Alt-Key-6> <Meta-Key-6>
>  toggle-tabs=<Alt-Key-t> <Meta-Key-t> <Alt-Key-T>
>  change-indentwidth=<Alt-Key-u> <Meta-Key-u> <Alt-Key-U>
> +del-word-left=<Control-Key-BackSpace>
> +del-word-right=<Control-Key-Delete>
> 
>  [IDLE Classic Unix]
>  copy=<Alt-Key-w> <Meta-Key-w>
> @@ -104,6 +106,8 @@
>  untabify-region=<Alt-Key-6>
>  toggle-tabs=<Alt-Key-t>
>  change-indentwidth=<Alt-Key-u>
> +del-word-left=<Alt-Key-BackSpace>
> +del-word-right=<Alt-Key-d>
> 
>  [IDLE Classic Mac]
>  copy=<Command-Key-c>
> @@ -153,3 +157,5 @@
>  untabify-region=<Control-Key-6>
>  toggle-tabs=<Control-Key-t>
>  change-indentwidth=<Control-Key-u>
> +del-word-left=<Control-Key-BackSpace>
> +del-word-right=<Control-Key-Delete>
> 
> Index: configHandler.py
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Lib/idlelib/configHandler.py,v
> retrieving revision 1.37
> retrieving revision 1.38
> diff -u -d -r1.37 -r1.38
> --- configHandler.py    13 Jan 2005 17:37:38 -0000      1.37
> +++ configHandler.py    28 Jan 2005 00:16:16 -0000      1.38
> @@ -579,7 +579,9 @@
>              '<<tabify-region>>': ['<Alt-Key-5>'],
>              '<<untabify-region>>': ['<Alt-Key-6>'],
>              '<<toggle-tabs>>': ['<Alt-Key-t>'],
> -            '<<change-indentwidth>>': ['<Alt-Key-u>']
> +            '<<change-indentwidth>>': ['<Alt-Key-u>'],
> +            '<<del-word-left>>': ['<Control-Key-BackSpace>'],
> +            '<<del-word-right>>': ['<Control-Key-Delete>']
>              }
>          if keySetName:
>              for event in keyBindings.keys():
> 
> _______________________________________________
> Python-checkins mailing list
> Python-checkins@python.org
> http://mail.python.org/mailman/listinfo/python-checkins
> 


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From jeff at taupro.com  Fri Jan 28 06:50:53 2005
From: jeff at taupro.com (Jeff Rush)
Date: Fri Jan 28 06:51:02 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add
	missing	types	to__builtin__
In-Reply-To: <41F97843.8070902@v.loewis.de>
References: <006001c504c2$da9cc1c0$433fc797@oemcomputer>
	<41F97843.8070902@v.loewis.de>
Message-ID: <1106891453.13830.67.camel@vault.timecastle.net>

On Thu, 2005-01-27 at 17:24, "Martin v. Löwis" wrote:
> Raymond Hettinger wrote:
> > Other than a vague feeling of completeness is there any reason this
> > needs to be done?  Is there anything useful that currently cannot be
> > expressed without this new module?  
> 
> That I wonder myself, too.

One reason is correct documentation.  If the code is rejected, there
should be a patch proposed to remove the erroneous documentation
references that indicate things are in __builtins__ when they are in
fact not.

If they are put into __builtins__, the documentation won't need
updating. ;-)

-Jeff Rush


From fperez.net at gmail.com  Fri Jan 28 10:06:17 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Fri Jan 28 10:06:56 2005
Subject: [Python-Dev] Re: Re: Re: Re: Patch review: [ 1094542 ] add Bunch
	type to collections module
References: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>
	<d11dcfba050126164067bef3a1@mail.gmail.com>
	<cta7fq$144$1@sea.gmane.org>
	<d11dcfba05012709524d5af4c5@mail.gmail.com>
	<ctc3p8$ovu$1@sea.gmane.org>
	<d11dcfba05012717252035cea1@mail.gmail.com>
	<ctc4mb$qo7$1@sea.gmane.org>
	<d11dcfba050127175427033d5c@mail.gmail.com>
Message-ID: <ctcvb5$iu0$1@sea.gmane.org>

Steven Bethard wrote:

> That sounds reasonable to me.  I'll fix update to be a staticmethod.
> If people want other methods, I'll make sure they're staticmethods
> too.[1]
> 
> Steve
> 
> [1] In all the cases I can think of, staticmethod is sufficient -- the
> methods don't need to access any attributes of the Bunch class.  If
> anyone has a good reason to make them classmethods instead, let me
> know...

Great.  I probably meant staticmethod.  I don't use either much, so I don't
really know the difference in the terminology.  For a long time I stuck to 2.1
features for ipython and my other codes, and I seem to recall those appeared in
2.2.  But you got what I meant :)

Cheers,

f

From ejones at uwaterloo.ca  Fri Jan 28 14:17:07 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Fri Jan 28 14:16:49 2005
Subject: [Python-Dev] Python Interpreter Thread Safety?
Message-ID: <E925038C-712E-11D9-9C81-0003938016AE@uwaterloo.ca>

Due to the issue of thread safety in the Python memory allocator, I 
have been wondering about thread safety in the rest of the Python 
interpreter. I understand that the interpreter is not thread safe, but 
I'm not sure that I have seen a discussion of all the areas where this 
is an issue. Here are the areas I know of:

1. The memory allocator.
2. Reference counts.
3. The cyclic garbage collector.
4. Current interpreter state is pointed to by a single shared pointer.
5. Many modules may not be thread safe (?).

Ignoring the issue of #5 for the moment, are there any other areas 
where this is a problem? I'm curious about how much work it would be to 
allow concurrent execution of Python code.

Evan Jones

Note: One of the reasons I am asking is that my memory allocator patch 
changes the current allocator from "sort of" thread safe to 
obviously unsafe. One way to eliminate this issue is to make the 
allocator completely thread safe, but that would require some fairly 
significant changes to avoid a major performance penalty. However, if 
it was one of the components that permitted the interpreter to go 
multi-threaded, then it would be worth it.

From steve at holdenweb.com  Fri Jan 28 19:24:03 2005
From: steve at holdenweb.com (Steve Holden)
Date: Fri Jan 28 19:29:27 2005
Subject: [Python-Dev] Re: [PyCon] Reg: Registration
In-Reply-To: <20050128182019.GB19054@panix.com>
References: <3e475f5b0501280929717ce3b3@mail.gmail.com>
	<20050128182019.GB19054@panix.com>
Message-ID: <41FA8343.8020007@holdenweb.com>

Aahz, writing as pycon@python.org, wrote:

> It's still January 28 here -- register now!  I don't know if we'll be
> able to extend the registration price beyond that.

Just in case anybody else might be wondering when the early bird 
registration deadline is, I've asked the registration team to allow the 
early bird price as long as it's January 28th somewhere in the world.

There have been rumors that Guido will not be attending PyCon this year. 
I am happy to scotch them by pointing out that Guido van Rossum's 
keynote address will be on its traditional Thursday morning. I look 
forward to joining you all to hear Guido speak on "The State of Python".

regards
  Steve
-- 
Steve Holden               http://www.holdenweb.com/
Python Web Programming  http://pydish.holdenweb.com/
Holden Web LLC      +1 703 861 4237  +1 800 494 3119

From greg.ewing at canterbury.ac.nz  Fri Jan 28 23:27:13 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri Jan 28 23:35:27 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
References: <ca471dc20501162212446e63b5@mail.gmail.com>
Message-ID: <41FABC41.3060406@canterbury.ac.nz>

Guido van Rossum wrote:
> Here's a patch that gets rid of unbound methods, as
> discussed here before. A function's __get__ method
> now returns the function unchanged when called without
> an instance, instead of returning an unbound method object.

I thought the main reason for existence of unbound
methods (for user-defined classes, at least) was so that
if you screw up a super call by forgetting to pass self,
or passing the wrong type of object, you get a more
helpful error message.

I remember a discussion about this some years ago, in
which you seemed to think the ability to produce this
message was important enough to justify the existence
of unbound methods, even though it meant you couldn't
easily have static methods (this was long before
staticmethod() was created).

Have you changed your mind about that?

Also, surely unbound methods will still have to exist
for C methods? Otherwise there will be nothing to ensure
that C code is getting the object type it expects for
self.

--
Greg



From gvanrossum at gmail.com  Fri Jan 28 23:45:00 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Jan 28 23:45:19 2005
Subject: [Python-Dev] Getting rid of unbound methods: patch available
In-Reply-To: <41FABC41.3060406@canterbury.ac.nz>
References: <ca471dc20501162212446e63b5@mail.gmail.com>
	<41FABC41.3060406@canterbury.ac.nz>
Message-ID: <ca471dc2050128144531276db7@mail.gmail.com>

[Guido]
> > Here's a patch that gets rid of unbound methods, as
> > discussed here before. A function's __get__ method
> > now returns the function unchanged when called without
> > an instance, instead of returning an unbound method object.

[Greg]
> I thought the main reason for existence of unbound
> methods (for user-defined classes, at least) was so that
> if you screw up a super call by forgetting to pass self,
> or passing the wrong type of object, you get a more
> helpful error message.

Yes, Tim reminded me of this too. But he said he could live without it. :-)

> I remember a discussion about this some years ago, in
> which you seemed to think the ability to produce this
> message was important enough to justify the existence
> of unbound methods, even though it meant you couldn't
> easily have static methods (this was long before
> staticmethod() was created).
> 
> Have you changed your mind about that?

After all those years, I think the convenience of the error message
doesn't warrant the added complexity of unbound methods.

> Also, surely unbound methods will still have to exist
> for C methods? Otherwise there will be nothing to ensure
> that C code is getting the object type it expects for
> self.

No, C methods have their own object type for that (which is logically
equivalent to an unbound method).

But there was a use case for unbound methods having to do with C
methods of classic classes, in the implementation of built-in
exceptions.

Anyway, it's all moot because I withdrew the patch, due to the large
amount of code that would break due to the missing im_class attribute
-- all fixable, but enough not to want to break it all when 2.5 comes
out. So I'm salting the idea up for 3.0.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From greg.ewing at canterbury.ac.nz  Fri Jan 28 23:46:12 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri Jan 28 23:54:25 2005
Subject: [Python-Dev] Let's get rid of unbound methods
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
Message-ID: <41FAC0B4.8080301@canterbury.ac.nz>

Josiah Carlson wrote:
> While it seems that super() is the 'modern paradigm' for this,
> I have been using base.method(self, ...) for years now, and have been
> quite happy with it.

I too would be very disappointed if base.method(self, ...)
became somehow deprecated. Cooperative super calls are a
different beast altogether and have different use cases.

In fact I'm having difficulty finding *any* use cases at
all for super() in my code. I thought I had found one
once, but on further reflection I changed my mind.

And I have found that the type checking of self provided
by unbound methods has caught a few bugs that would
probably have produced more mysterious symptoms otherwise.
But I can't say for sure whether they would have been
greatly more mysterious -- perhaps not.
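The idiom in question, sketched with illustrative names: an explicit base
call passes self by hand, and under the 2.x unbound-method machinery a
wrong-typed self raised a TypeError at the call site (the check Guido's
patch would remove):

```python
class Base:
    def __init__(self, name):
        self.name = name

class Sub(Base):
    def __init__(self, name, extra):
        # Explicit base call: no super(), self passed by hand.
        # With unbound methods, Base.__init__ rejected a non-Base self.
        Base.__init__(self, name)
        self.extra = extra

s = Sub('widget', 42)
print(s.name, s.extra)   # widget 42
```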

--
Greg


From greg.ewing at canterbury.ac.nz  Fri Jan 28 23:51:00 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri Jan 28 23:59:12 2005
Subject: [Python-Dev] Let's get rid of unbound methods
References: <ca471dc2050104102814be915b@mail.gmail.com>
	<1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
	<1f7befae050104180711743ebd@mail.gmail.com>
Message-ID: <41FAC1D4.6070106@canterbury.ac.nz>

Tim Peters wrote:
> I expect that's because he stopped working on Zope code, so actually
> thinks it's odd again to see a gazillion methods like:
> 
> class Registerer(my_base):
>     def register(*args, **kws):
>         my_base.register(*args, **kws)

I second that! My PyGUI code is *full* of __init__
methods like that, because of my convention for supplying
initial values of properties as keyword arguments.

--
Greg

From martin at v.loewis.de  Sat Jan 29 00:12:23 2005
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat Jan 29 00:12:16 2005
Subject: [Python-Dev] Patch review: [ 1009811 ]
	Add	missing	types	to__builtin__
In-Reply-To: <1106891453.13830.67.camel@vault.timecastle.net>
References: <006001c504c2$da9cc1c0$433fc797@oemcomputer>	<41F97843.8070902@v.loewis.de>
	<1106891453.13830.67.camel@vault.timecastle.net>
Message-ID: <41FAC6D7.6070704@v.loewis.de>

Jeff Rush wrote:
> If they are put into __builtins__, the documentation won't need
> updating. ;-)

In that case, I'd rather prefer to correct the documentation.

Regards,
Martin
From martin at v.loewis.de  Sat Jan 29 00:24:13 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat Jan 29 00:24:15 2005
Subject: [Python-Dev] Python Interpreter Thread Safety?
In-Reply-To: <E925038C-712E-11D9-9C81-0003938016AE@uwaterloo.ca>
References: <E925038C-712E-11D9-9C81-0003938016AE@uwaterloo.ca>
Message-ID: <41FAC99D.3070502@v.loewis.de>

Evan Jones wrote:
> Due to the issue of thread safety in the Python memory allocator, I have 
> been wondering about thread safety in the rest of the Python 
> interpreter. I understand that the interpreter is not thread safe, but 
> I'm not sure that I have seen a discussion of all the areas where this 
> is an issue. Here are the areas I know of:

This very much depends on the definition of "thread safe". If this means
"working correctly in the presence of threads", then your statement
is wrong - the interpreter *is* thread-safe. The global interpreter lock
(GIL) guarantees that only one thread at any time can execute critical
operations. Since all threads acquire the GIL before performing such
operations, the interpreter is thread-safe.
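A quick sketch of that guarantee (illustrative; the atomicity of individual
operations is a CPython implementation detail, not a language promise):
many threads mutating one list still leave it consistent, because each
list.append executes as a single C-level call under the GIL:

```python
import threading

shared = []

def worker():
    for i in range(1000):
        shared.append(i)   # one C-level call, performed under the GIL

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(shared))   # 10000: no appends were lost
```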

> 1. The memory allocator.
> 2. Reference counts.
> 3. The cyclic garbage collector.
> 4. Current interpreter state is pointed to by a single shared pointer.

This is all protected by the GIL.

> 5. Many modules may not be thread safe (?).

Modules often release the GIL through BEGIN_ALLOW_THREADS if they know
it would be safe for another thread to enter the Python interpreter.

> Ignoring the issue of #5 for the moment, are there any other areas where 
> this is a problem? I'm curious about how much work it would be to allow 
> concurrent execution of Python code.

Define "concurrent". Webster's offers

1. operating or occurring at the same time

Clearly, on a single-processor system, no two activities can execute
concurrently - the processor can do at most one activity at any point
in time.

Perhaps you are asking whether it would be possible to change the
current coarse-grained lock into a finer-grained lock (as working
without locks is not implementable). This is also known as "free
threading". There have been attempts to implement free threading, and
they have failed.

> Note: One of the reasons I am asking is that my memory allocator patch 
> changes the current allocator from "sort of" thread safe to 
> obviously unsafe.

The allocator is thread-safe in the presence of the GIL - you are
supposed to hold the GIL before entering the allocator. Due to some
unfortunate historical reasons, there is code which enters free()
without holding the GIL - and that is what the allocator specifically
deals with. Except for this single case, all callers of the allocator
are required to hold the GIL.

> However, if it 
> was one of the components that permitted the interpreter to go 
> multi-threaded, then it would be worth it.

Again, the interpreter supports multi-threading today. Removing
the GIL is more difficult, though - nearly any container object
(list, dictionary, etc) would have to change, plus the reference
counting (which would have to grow atomic increment/decrement).

Regards,
Martin
From ejones at uwaterloo.ca  Sat Jan 29 01:22:52 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Sat Jan 29 01:22:41 2005
Subject: [Python-Dev] Python Interpreter Thread Safety?
In-Reply-To: <41FAC99D.3070502@v.loewis.de>
References: <E925038C-712E-11D9-9C81-0003938016AE@uwaterloo.ca>
	<41FAC99D.3070502@v.loewis.de>
Message-ID: <EA040C9E-718B-11D9-8570-0003938016AE@uwaterloo.ca>

On Jan 28, 2005, at 18:24, Martin v. Löwis wrote:
>> 5. Many modules may not be thread safe (?).
> Modules often release the GIL through BEGIN_ALLOW_THREADS, if they know
> that would be safe if another thread would enter the Python 
> interpreter.

Right, I guess that the modules already have to deal with being 
reentrant and thread-safe, since Python threads could already cause 
issues.

>> Ignoring the issue of #5 for the moment, are there any other areas 
>> where this is a problem? I'm curious about how much work it would be 
>> to allow concurrent execution of Python code.
> Define "concurrent". Webster's offers

Sorry, I really meant *parallel* execution of Python code: Multiple 
threads simultaneously executing a Python program, potentially on 
different CPUs.

> There have been attempts to implement free threading, and
> they have failed.

What I was trying to ask with my last email was what are the trouble 
areas? There are probably many that I am unaware of, due to my 
unfamiliarity with the Python internals.

> Due to some
> unfortunate historical reasons, there is code which enters free()
> without holding the GIL - and that is what the allocator specifically
> deals with.

Right, but as said in a previous post, I'm not convinced that the 
current implementation is completely correct anyway.

> Again, the interpreter supports multi-threading today. Removing
> the GIL is more difficult, though - nearly any container object
> (list, dictionary, etc) would have to change, plus the reference
> counting (which would have to grow atomic increment/decrement).

Wouldn't it be up to the programmer to ensure that accesses to shared 
objects, like containers, are serialized? For example, with Java's 
collections, there are both synchronized and unsynchronized versions.
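A minimal Python rendering of that Java-style split, where serialization is the caller's job rather than the interpreter's (the `SynchronizedList` wrapper here is an invented name, purely for illustration):

```python
import threading

# Every access goes through one lock, the way
# java.util.Collections.synchronizedList wraps a plain ArrayList.
class SynchronizedList:
    def __init__(self):
        self._items = []
        self._lock = threading.Lock()

    def append(self, item):
        with self._lock:
            self._items.append(item)

    def snapshot(self):
        with self._lock:
            return list(self._items)

shared = SynchronizedList()

def worker():
    for i in range(1000):
        shared.append(i)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(shared.snapshot()))  # 4000
```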

Evan Jones

From martin at v.loewis.de  Sat Jan 29 01:44:09 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat Jan 29 01:44:01 2005
Subject: [Python-Dev] Python Interpreter Thread Safety?
In-Reply-To: <EA040C9E-718B-11D9-8570-0003938016AE@uwaterloo.ca>
References: <E925038C-712E-11D9-9C81-0003938016AE@uwaterloo.ca>	<41FAC99D.3070502@v.loewis.de>
	<EA040C9E-718B-11D9-8570-0003938016AE@uwaterloo.ca>
Message-ID: <41FADC59.3050609@v.loewis.de>

Evan Jones wrote:
> Sorry, I really meant *parallel* execution of Python code: Multiple 
> threads simultaneously executing a Python program, potentially on 
> different CPUs.

There cannot be parallel threads on a single CPU - for threads to
be truly parallel, you *must* have two CPUs, at a minimum.

Python threads can run truly parallel, as long as one of them
invokes BEGIN_ALLOW_THREADS.

> What I was trying to ask with my last email was what are the trouble 
> areas? There are probably many that I am unaware of, due to my 
> unfamiliarity with the Python internals.

I think nobody really remembers - ask Google for "Python free 
threading". Greg Stein did the patch, and the main problem apparently
was that the performance became unacceptable - apparently primarily
because of dictionary locking.

> Right, but as said in a previous post, I'm not convinced that the 
> current implementation is completely correct anyway.

Why do you think so? (I see in your previous post that you claim
it is not completely correct, but I don't see any proof).

> Wouldn't it be up to the programmer to ensure that accesses to shared 
> objects, like containers, are serialized? 

In a truly parallel Python, two arbitrary threads could access the
same container, and it would still work. If some containers cannot
be used simultaneously in multiple threads, this would be asking for
disaster.

> For example, with Java's 
> collections, there are both synchronized and unsynchronized versions.

I don't think this approach can apply to Python. Python users are
used to completely thread-safe containers, and lots of programs
would break if the containers suddenly threw exceptions.

Furthermore, the question is what kind of failure you'd expect
if an unsynchronized dictionary is used from multiple threads.
Apparently, Java guarantees something (e.g. that the interpreter
won't crash) but even this guarantee would be difficult to
make.

For example, for lists, the C API allows direct access to the pointers
in the list. If the elements of the list could change in-between, an
object in the list might go away after you got the pointer, but before
you had a chance to INCREF it. This would cause a crash shortly
afterwards. Even if that was changed to always return a new reference,
lots of code would break, as it would create large memory leaks
(code would have needed to decref the list items, but currently
doesn't - nor is it currently necessary).

Regards,
Martin
From abo at minkirri.apana.org.au  Sat Jan 29 01:44:21 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Sat Jan 29 01:44:34 2005
Subject: [Python-Dev] Python Interpreter Thread Safety?
In-Reply-To: <41FAC99D.3070502@v.loewis.de>
References: <E925038C-712E-11D9-9C81-0003938016AE@uwaterloo.ca>
	<41FAC99D.3070502@v.loewis.de>
Message-ID: <1106959461.4557.21.camel@localhost>

On Sat, 2005-01-29 at 00:24 +0100, "Martin v. Löwis" wrote:
> Evan Jones wrote:
[...]
> The allocator is thread-safe in the presence of the GIL - you are
> supposed to hold the GIL before entering the allocator. Due to some
> unfortunate historical reasons, there is code which enters free()
> without holding the GIL - and that is what the allocator specifically
> deals with. Except for this single case, all callers of the allocator
> are required to hold the GIL.

Just curious; is that "one case" a bug that needs fixing, or is there some
reason this case can't be changed to use the GIL? Surely making it
mandatory for all free() calls to hold the GIL is easier than making the
allocator deal with the one case where this isn't done.

I like the GIL :-) so much so I'd like to see it visible at the Python
level. Then you could write your own atomic methods in Python.
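A hedged sketch of what such Python-level atomic methods might look like, with an ordinary `threading.RLock` standing in for an exposed GIL (the `atomic` decorator and `Counter` class are invented for illustration):

```python
import functools
import threading

# One re-entrant module-level lock plays the role of the (hypothetically
# exposed) GIL: anything decorated @atomic runs without interleaving
# against any other @atomic method.
_gil = threading.RLock()

def atomic(func):
    """Run func with the stand-in 'GIL' held for its whole duration."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with _gil:
            return func(*args, **kwargs)
    return wrapper

class Counter:
    def __init__(self):
        self.value = 0

    @atomic
    def transfer(self, other, amount):
        # Both updates happen as one indivisible step.
        self.value -= amount
        other.value += amount

a, b = Counter(), Counter()
a.value = 10
a.transfer(b, 3)
print(a.value, b.value)  # 7 3
```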

BTW, if what Evan is hoping for concurrent threads running on different
processors in a multiprocessor system, then don't :-)

It's been a while since I looked at multiprocessor architectures, but I
believe threading's shared memory paradigm will always be hard to
distribute efficiently over multiple CPU's. If you want to run on
multiple processors, use processes, not threads.

-- 
Donovan Baarda <abo@minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/

From mcherm at mcherm.com  Sat Jan 29 02:14:59 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Sat Jan 29 02:15:15 2005
Subject: [Python-Dev] Python Interpreter Thread Safety?
Message-ID: <1106961299.41fae39341ec9@mcherm.com>

Martin v. Löwis writes:
> Due to some
> unfortunate historical reasons, there is code which enters free()
> without holding the GIL - and that is what the allocator specifically
> deals with. Except for this single case, all callers of the allocator
> are required to hold the GIL.

Donovan Baarda writes:
> Just curious; is that "one case" a bug that needs fixing, or is there some
> reason this case can't be changed to use the GIL? Surely making it
> mandatory for all free() calls to hold the GIL is easier than making the
> allocator deal with the one case where this isn't done.

What Martin is trying to say here is that it _IS_ mandatory to hold
the GIL when calling free(). However, there is some very old code in
existence (written by other people) which calls free() without holding
the GIL. We work very hard to provide backward compatibility, so we
are jumping through hoops to ensure that even this old code which is
violating the rules doesn't get broken.

-- Michael Chermside

From tim.peters at gmail.com  Sat Jan 29 02:27:25 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Sat Jan 29 02:27:28 2005
Subject: [Python-Dev] Python Interpreter Thread Safety?
In-Reply-To: <EA040C9E-718B-11D9-8570-0003938016AE@uwaterloo.ca>
References: <E925038C-712E-11D9-9C81-0003938016AE@uwaterloo.ca>
	<41FAC99D.3070502@v.loewis.de>
	<EA040C9E-718B-11D9-8570-0003938016AE@uwaterloo.ca>
Message-ID: <1f7befae05012817277bbc292b@mail.gmail.com>

...

[Evan Jones]
> What I was trying to ask with my last email was what are the trouble
> areas? There are probably many that I am unaware of, due to my
> unfamiliarity with the Python internals.

Google on "Python free threading".  That's not meant to be curt, it's
just meant to recognize that the task is daunting and has been
discussed often before.

[Martin v. Löwis]
>> Due to some unfortunate historical reasons, there is code which enters
>> free() without holding the GIL - and that is what the allocator specifically
>> deals with.

> Right, but as said in a previous post, I'm not convinced that the
> current implementation is completely correct anyway.

Sorry, I haven't had time for this.  From your earlier post:

> For example, is it possible to call PyMem_Free from two threads
> simultaneously?

Possible but not legal; undefined behavior if you try.  See the
"Thread State and the Global Interpreter Lock" section of the Python C
API manual.

    ... only the thread that has acquired the global interpreter lock may
        operate on Python objects or call Python/C API functions

There are only a handful of exceptions to the last part of that rule,
concerned with interpreter and thread startup and shutdown, and
they're explicitly listed in that section.  The memory-management
functions aren't among them.

In addition, it's not legal to call PyMem_Free regardless unless the
pointer passed to it was originally obtained from another function in
the PyMem_* family (that specifically excludes memory obtained from a
PyObject_* function).  In a release build, all of the PyMem_*
allocators resolve directly to the platform malloc or realloc, and all
PyMem_Free has to determine is that they *were* so allocated and thus
call the platform free() directly (which is presumably safe to call
without holding the GIL).  The hacks in PyObject_Free (== PyMem_Free)
are there solely so that question can be answered correctly in the
absence of holding the GIL.   "That question" == "does pymalloc
control the pointer passed to me, or does the system malloc?".

In return, that hack is there solely because in much earlier versions
of Python extension writers got into the horrible habit of allocating
object memory with PyObject_New but releasing it with PyMem_Free, and
because indeed Python didn't *have* a PyObject_Free function then. 
Other extension writers were just nuts, mixing PyMem_* calls with
direct calls to system free/malloc/realloc, and ignoring GIL issues
for all of those.  When pymalloc was new, we went to insane lengths to
avoid breaking that stuff, but enough is enough.

> Since the problem is that threads could call PyMem_Free without
> holding the GIL, it seems to be that it is possible.

Yes, but not specific to PyMem_Free.  It's clearly _possible_ to call
_any_ function from multiple threads without holding the GIL.

> Shouldn't it also be supported?

No. If what they want is the system malloc/realloc/free, that's what
they should call.

> In the current memory allocator, I believe that situation can lead to
> inconsistent state.

Certainly, but only if pymalloc controls the memory blocks.  If they
were actually obtained from the system malloc, the only part of
pymalloc that has to work correctly is the Py_ADDRESS_IN_RANGE()
macro.  When that returns false, the only other thing PyObject_Free()
does is call the system free() immediately, then return.  None of
pymalloc's data structures are involved, apart from the hacks ensuring
that the array of arena base addresses is safe to access despite potentially
concurrent mutation-by-appending.

> ...
> Basically, if a concurrent memory allocator is the requirement,

It isn't.  The attempt to _exploit_ the GIL by doing no internal
locking of its own is 100% deliberate in pymalloc -- it's a
significant speed win (albeit on some platforms more than others).

> then I think some other approach is necessary.

If it became necessary, that's what this section of obmalloc is for:

SIMPLELOCK_DECL(_malloc_lock)
#define LOCK()		SIMPLELOCK_LOCK(_malloc_lock)
#define UNLOCK()	SIMPLELOCK_UNLOCK(_malloc_lock)
#define LOCK_INIT()	SIMPLELOCK_INIT(_malloc_lock)
#define LOCK_FINI()	SIMPLELOCK_FINI(_malloc_lock)

You'll see that PyObject_Free() calls LOCK() and UNLOCK() at
appropriate places already, but they have empty expansions now.

Back to the present:

[Martin]
>> Again, the interpreter supports multi-threading today. Removing
>> the GIL is more difficult, though - nearly any container object
>> (list, dictionary, etc) would have to change, plus the reference
>> counting (which would have to grow atomic increment/decrement).

[Evan]
> Wouldn't it be up to the programmer to ensure that accesses to shared
> objects, like containers, are serialized? For example, with Java's
> collections, there are both synchronized and unsynchronized versions.

Enormous mounds of existing threaded Python code freely manipulates
lists and dicts without explicit locking now.  We can't break that --
and wouldn't want to.  Writing threaded code is especially easy (a
relative stmt, not absolute) in Python because of it.
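The kind of code Tim means can be sketched like this; under the GIL, `list.append` is effectively atomic, so the unlocked shared list still ends up with every element:

```python
import threading

# Threaded code mutating a shared list with no explicit lock - the idiom
# enormous mounds of existing Python code rely on.
shared = []

def worker():
    for i in range(10000):
        shared.append(i)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(shared))  # 40000: no appends lost
```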
From ejones at uwaterloo.ca  Sat Jan 29 03:15:33 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Sat Jan 29 03:15:45 2005
Subject: [Python-Dev] Python Interpreter Thread Safety?
In-Reply-To: <1f7befae05012817277bbc292b@mail.gmail.com>
References: <E925038C-712E-11D9-9C81-0003938016AE@uwaterloo.ca>
	<41FAC99D.3070502@v.loewis.de>
	<EA040C9E-718B-11D9-8570-0003938016AE@uwaterloo.ca>
	<1f7befae05012817277bbc292b@mail.gmail.com>
Message-ID: <A7A0136C-719B-11D9-8570-0003938016AE@uwaterloo.ca>

On Jan 28, 2005, at 20:27, Tim Peters wrote:
> The hacks in PyObject_Free (== PyMem_Free)
> are there solely so that question can be answered correctly in the
> absence of holding the GIL.   "That question" == "does pymalloc
> control the pointer passed to me, or does the system malloc?".

<Really loud sound of me smacking my hand on my forehead>

Ah! *Now* I get it. And yes, it will be possible to still support this 
in my patched version of the allocator. It just means that I have to 
leak the "arenas" array just like it did before, and then do some hard 
thinking about memory models and consistency to decide if the "arenas" 
pointer needs to be volatile.

> When pymalloc was new, we went to insane lengths to
> avoid breaking that stuff, but enough is enough.

So you don't think we need to bother supporting that any more?

> Back to the present:
>> Wouldn't it be up to the programmer to ensure that accesses to shared
>> objects, like containers, are serialized? For example, with Java's
>> collections, there are both synchronized and unsynchronized versions.
> Enormous mounds of existing threaded Python code freely manipulates
> lists and dicts without explicit locking now.  We can't break that --
> and wouldn't want to.  Writing threaded code is especially easy (a
> relative stmt, not absolute) in Python because of it.

Right, because currently Python switches threads on a granularity of 
opcodes, which gives you this serialization with the cost of never 
having parallel execution.

Evan Jones

From ejones at uwaterloo.ca  Sat Jan 29 03:17:54 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Sat Jan 29 03:17:50 2005
Subject: [Python-Dev] Python Interpreter Thread Safety?
In-Reply-To: <41FADC59.3050609@v.loewis.de>
References: <E925038C-712E-11D9-9C81-0003938016AE@uwaterloo.ca>	<41FAC99D.3070502@v.loewis.de>
	<EA040C9E-718B-11D9-8570-0003938016AE@uwaterloo.ca>
	<41FADC59.3050609@v.loewis.de>
Message-ID: <FC051810-719B-11D9-8570-0003938016AE@uwaterloo.ca>

On Jan 28, 2005, at 19:44, Martin v. Löwis wrote:
> Python threads can run truly parallel, as long as one of them
> invoke BEGIN_ALLOW_THREADS.

Except that they are really executing C code, not Python code.

> I think nobody really remembers - ask Google for "Python free 
> threading". Greg Stein did the patch, and the main problem apparently
> was that the performance became unacceptable - apparently primarily
> because of dictionary locking.

Thanks, I found the threads discussing it.

>> Right, but as said in a previous post, I'm not convinced that the 
>> current implementation is completely correct anyway.
> Why do you think so? (I see in your previous post that you claim
> it is not completely correct, but I don't see any proof).

There are a number of issues actually, but as Tim points out, only if the 
blocks are managed by PyMalloc. I had written a description of three of 
them here, but they are not relevant. If the issue is calling 
PyMem_Free with a pointer that was allocated with malloc() while 
PyMalloc is doing other stuff, then no problem: That is possible to 
support, but I'll have to think rather hard about some of the issues.

> For example, for lists, the C API allows direct access to the pointers
> in the list. If the elements of the list could change in-between, an
> object in the list might go away after you got the pointer, but before
> you had a chance to INCREF it. This would cause a crash shortly
> afterwards. Even if that was changed to always return a new reference,
> lots of code would break, as it would create large memory leaks
> (code would have needed to decref the list items, but currently
> doesn't - nor is it currently necessary).

Ah! Right. In Java, the collections are all actually written in Java, 
and run on the VM. Thus, when some concurrent weirdness happens, it 
just corrupts the application, not the VM. However, in Python, this 
could actually corrupt the interpreter itself, crashing the entire 
thing with a very ungraceful Segmentation Fault or something similar.

Evan Jones

From kbk at shore.net  Sat Jan 29 20:10:30 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Sat Jan 29 20:10:47 2005
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200501291910.j0TJAUhi007769@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  280 open ( +7) /  2747 closed ( +1) /  3027 total ( +8)
Bugs    :  803 open ( +6) /  4799 closed (+10) /  5602 total (+16)
RFE     :  167 open ( +1) /   141 closed ( +0) /   308 total ( +1)

New / Reopened Patches
______________________

tarfile.ExFileObject iterators  (2005-01-23)
       http://python.org/sf/1107973  opened by  Mitch Chapman

Allow slicing of any iterator by default  (2005-01-24)
       http://python.org/sf/1108272  opened by  Nick Coghlan

fix .split() separator doc, update .rsplit() docs  (2005-01-24)
CLOSED http://python.org/sf/1108303  opened by  Wummel

type conversion methods and subclasses  (2005-01-25)
       http://python.org/sf/1109424  opened by  Walter Dörwald

distutils dry-run breaks when attempting to bytecompile  (2005-01-26)
       http://python.org/sf/1109658  opened by  Anthony Baxter

patch for idlelib  (2005-01-26)
       http://python.org/sf/1110205  opened by  sowjanya

patch for gzip.GzipFile.flush()  (2005-01-26)
       http://python.org/sf/1110248  opened by  David Schnepper

HEAD/PUT/DELETE support for urllib2.py  (2005-01-28)
       http://python.org/sf/1111653  opened by  Terrel Shumway

Patches Closed
______________

fix .split() maxsplit doc, update .rsplit() docs  (2005-01-24)
       http://python.org/sf/1108303  closed by  rhettinger

New / Reopened Bugs
___________________

"\0" not listed as a valid escape in the lang reference  (2005-01-24)
CLOSED http://python.org/sf/1108060  opened by  Andrew Bennetts

broken link in tkinter docs  (2005-01-24)
       http://python.org/sf/1108490  opened by  Ilya Sandler

Cookie.py produces invalid code  (2005-01-25)
       http://python.org/sf/1108948  opened by  Simon Dahlbacka

idle freezes when run over ssh  (2005-01-25)
       http://python.org/sf/1108992  opened by  Mark Poolman

Time module missing from latest module index  (2005-01-25)
       http://python.org/sf/1109523  opened by  Skip Montanaro

Need some setup.py sanity  (2005-01-25)
       http://python.org/sf/1109602  opened by  Skip Montanaro

distutils argument parsing is bogus  (2005-01-26)
       http://python.org/sf/1109659  opened by  Anthony Baxter

bdist_wininst ignores build_lib from build command  (2005-01-26)
       http://python.org/sf/1109963  opened by  Anthony Tuininga

Cannot ./configure on FC3 with gcc 3.4.2  (2005-01-26)
CLOSED http://python.org/sf/1110007  opened by  Paul Watson

recursion core dumps  (2005-01-26)
       http://python.org/sf/1110055  opened by  Jacob Engelbrecht

gzip.GzipFile.flush() does not flush all internal buffers  (2005-01-26)
       http://python.org/sf/1110242  opened by  David Schnepper

os.environ.update doesn't work  (2005-01-27)
CLOSED http://python.org/sf/1110478  opened by  June Kim

list comprehension scope  (2005-01-27)
CLOSED http://python.org/sf/1110705  opened by  Simon Dahlbacka

RLock logging mispells "success"  (2005-01-27)
CLOSED http://python.org/sf/1110998  opened by  Matthew Bogosian

csv reader barfs encountering quote when quote_none is set  (2005-01-27)
       http://python.org/sf/1111100  opened by  washington irving

tkSimpleDialog broken on MacOS X (Aqua Tk)  (2005-01-27)
       http://python.org/sf/1111130  opened by  Russell Owen

Bugs Closed
___________

bug with idle's stdout when executing load_source  (2005-01-20)
       http://python.org/sf/1105950  closed by  kbk

"\0" not listed as a valid escape in the lang reference  (2005-01-23)
       http://python.org/sf/1108060  closed by  tim_one

Undocumented implicit strip() in split(None) string method  (2005-01-19)
       http://python.org/sf/1105286  closed by  rhettinger

split() takes no keyword arguments  (2005-01-21)
       http://python.org/sf/1106694  closed by  rhettinger

Cannot ./configure on FC3 with gcc 3.4.2  (2005-01-26)
       http://python.org/sf/1110007  closed by  loewis

os.environ.update doesn't work  (2005-01-27)
       http://python.org/sf/1110478  closed by  loewis

Scripts started with CGIHTTPServer: missing cgi environment  (2005-01-11)
       http://python.org/sf/1100235  closed by  loewis

list comprehension scope  (2005-01-27)
       http://python.org/sf/1110705  closed by  rhettinger

RLock logging mispells "success"  (2005-01-27)
       http://python.org/sf/1110998  closed by  bcannon

README of 2.4 source download says 2.4a3  (2005-01-20)
       http://python.org/sf/1106057  closed by  loewis

New / Reopened RFE
__________________

'attrmap' function, attrmap(x)['attname'] = x.attname  (2005-01-26)
       http://python.org/sf/1110010  opened by  Gregory Smith

From p.f.moore at gmail.com  Sat Jan 29 23:15:40 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Sat Jan 29 23:15:44 2005
Subject: [Python-Dev] Re: PEP 309 (Was: Patch review: [ 1094542 ] add Bunch
	type to collections module)
In-Reply-To: <79990c6b05012701492440d0c0@mail.gmail.com>
References: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>
	<d11dcfba050126164067bef3a1@mail.gmail.com>
	<cta7fq$144$1@sea.gmane.org>
	<79990c6b05012701492440d0c0@mail.gmail.com>
Message-ID: <79990c6b05012914156800e5bc@mail.gmail.com>

On Thu, 27 Jan 2005 09:49:48 +0000, Paul Moore <p.f.moore@gmail.com> wrote:
> On the subject of  "things everyone ends up rewriting", what needs to
> be done to restart discussion on PEP 309 (Partial Function
> Application)? The PEP is marked "Accepted" and various patches exist:
> 
> 941881 - C implementation
> 1006948 - Windows build changes
> 931010 - Unit Tests
> 931007 - Documentation
> 931005 - Reference implementation (in Python)
> 
> I get the impression that there are some outstanding tweaks required
> to the Windows build, but I don't have VS.NET to check and/or update
> the patch.
> 
> Does this just need a core developer to pick it up? I guess I'll go
> off and do some patch/bug reviews...

OK, I reviewed some bugs. Could I ask that someone review 941881
(Martin would be ideal, as he knows the Windows build process -
1006948 should probably be included as well).

If I'm being cheeky by effectively asking that a suite of patches be
reviewed in exchange for 5 bugs, then I'll review some more - I don't
have time now, unfortunately. I justify myself by claiming that the
suite of patches is in effect one big patch split into multiple
tracker items... :-)

Bugs reviewed:

1083306 - looks fine to me, I recommend applying. I've added a patch
for CVS HEAD.

1058960 - already fixed in CVS HEAD (rev 1.45) - can be closed.
Backport candidate?

1033422 - This is standard Windows behaviour, and should be retained.
I recommend closing "Won't Fix".

1016563 - The patch looks fine, I've added a patch against CVS HEAD.
The change was introduced in revision 1.32 from patch 527518. It looks
accidental. I can't reproduce the problem, but I can see that it could
be an issue. I recommend applying the patch.

977250 - Not a bug. I've given an explanation in the tracker item, and
would recommend closing "Won't Fix".

Also, while looking at patches I noticed 1077106. It doesn't apply to
me - I don't use Linux - but it looks like this may have simply been
forgotten. The last comment is in December, from Michael Hudson,
saying in effect "I'll commit this tomorrow". Michael?

Paul.
From martin at v.loewis.de  Sun Jan 30 00:54:12 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Jan 30 00:54:01 2005
Subject: [Python-Dev] Re: PEP 309 (Was: Patch review: [ 1094542 ] add
	Bunch	type to collections module)
In-Reply-To: <79990c6b05012914156800e5bc@mail.gmail.com>
References: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>	<d11dcfba050126164067bef3a1@mail.gmail.com>	<cta7fq$144$1@sea.gmane.org>	<79990c6b05012701492440d0c0@mail.gmail.com>
	<79990c6b05012914156800e5bc@mail.gmail.com>
Message-ID: <41FC2224.4000107@v.loewis.de>

Paul Moore wrote:
> OK, I reviewed some bugs. Could I ask that someone review 941881
> (Martin would be ideal, as he knows the Windows build process -
> 1006948 should probably be included as well).

Thanks for the reviews. I won't be available next week to look into
the PEP, but I promise to do so some time in February.

I've dealt with the easy reviews already:

> 1058960 - already fixed in CVS HEAD (rev 1.45) - can be closed.
> 1033422 - This is standard Windows behaviour, and should be retained.
> I recommend closing "Won't Fix".
> 977250 - Not a bug. I've given an explanation in the tracker item, and
> would recommend closing "Won't Fix".

I've closed all of them.

Regards,
Martin
From martin at v.loewis.de  Sun Jan 30 11:31:58 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Jan 30 11:31:46 2005
Subject: [Python-Dev] Improving the Python Memory Allocator
In-Reply-To: <B7E9F05C-6E68-11D9-9C81-0003938016AE@uwaterloo.ca>
References: <D53FC231-6D8C-11D9-919A-0003938016AE@uwaterloo.ca>
	<41F581C8.6070109@v.loewis.de>
	<B7E9F05C-6E68-11D9-9C81-0003938016AE@uwaterloo.ca>
Message-ID: <41FCB79E.4050605@v.loewis.de>

Evan Jones wrote:
> Sure. This should be done even for patches which should absolutely not 
> be committed?

Difficult question. I think the answer goes like this: "Patches that
should absolutely not be committed should not be published at all".
There are different shades of gray, of course - but people typically
dislike receiving patches through a mailing list.

OTOH, I'm guilty of committing a patch myself which was explicitly
marked as not-to-be-committed on SF, so I cannot really advise to
use SF in this case. Putting it on your own web server would be
best.

Regards,
Martin
From nnorwitz at gmail.com  Sun Jan 30 21:40:35 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun Jan 30 21:40:38 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <004a01c503b5$fd167b00$18fccc97@oemcomputer>
References: <ee2a432c050125193511085d8@mail.gmail.com>
	<004a01c503b5$fd167b00$18fccc97@oemcomputer>
Message-ID: <ee2a432c05013012404c265a99@mail.gmail.com>

On Wed, 26 Jan 2005 09:47:41 -0500, Raymond Hettinger <python@rcn.com> wrote:
> > I agree that METH_O and METH_NOARGS are near
> > optimal wrt to performance.  But if we could have one METH_UNPACKED
> > instead of 3 METH_*, I think that would be a win.
>  . . .
> > Sorry, I meant eliminated w/3.0.
> 
> So, leave METH_O and METH_NOARGS alone.  They can't be dropped until 3.0
> and they can't be improved speedwise.

I was just trying to point out possible directions.  I wasn't trying
to suggest that the patch as a whole should be integrated now.

> > Ultimately, I think we can speed things up more by having 9 different
> > op codes, ie, one for each # of arguments.  CALL_FUNCTION_0,
> > CALL_FUNCTION_1, ...
> > (9 is still arbitrary and subject to change)
> 
> How is the compiler to know the arity of the target function?  If I call
> pow(3,5), how would the compiler know that pow() can take an optional
> third argument which would be need to be initialized to NULL?

The compiler wouldn't know anything about pow().  It would only know
that 2 arguments are passed.  That would help get rid of the first
switch statement.
I need to think more about the NULL initialization.  I may have mixed
2 separate issues.
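The point about the compiler's view can be seen with the `dis` module: the bytecode records only the argument count at the call site, nothing about pow()'s optional third parameter (exact opcode names vary across CPython versions, so only the count is checked here):

```python
import dis

# Compile a call site and look at the CALL-family instruction the
# compiler emitted: its argument is the number of positional arguments
# at the call site, regardless of the callee's actual arity.
code = compile("pow(3, 5)", "<example>", "eval")
calls = [ins for ins in dis.get_instructions(code) if "CALL" in ins.opname]
print(calls[0].opname, calls[0].arg)
```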

> > Then we would have N little functions, each with the exact # of
> > parameters.  Each would still need a switch to call the C function
> > because there may be optional parameters.  Ultimately, it's possible
> > the code would be small enough to stick it into the eval_frame loop.
> > Each of these steps would need to be tested, but that's a possible
> > longer term direction.
>  . . .
> > There would only be an if to check if it was a C function or not.
> > Maybe we could even get rid of this by more fixup at import time.
> 
> This is what I mean about the patch taking on a life of its own.  It's
an optimization patch that slows down METH_O and METH_NOARGS.  It's an
incremental change that throws away backwards compatibility.  It's a
> simplification that introduces a bazillion new code paths.  It's a
> simplification that can't be realized until 3.0.  It's a minor change
> that entails new opcodes, compiler changes, and changes in all
> extensions that have ever been written.

I really didn't want to do this now (or necessarily in 2.5).  I was
just trying to provide insight into future direction.  This brings up
another discussion about working towards 3.0.  But I'll make a new
thread for that.

At this point, it seems there aren't many disagreements about the
general idea.  There is primarily a question about what is acceptable
now.  I will rework the patch based on Raymond's feedback and continue
to update the tracker.  Unless anyone disagrees, I don't see a reason
to continue the remainder of this discussion on py-dev.

Neal
From nnorwitz at gmail.com  Sun Jan 30 21:55:08 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun Jan 30 21:55:10 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function
	calls)
In-Reply-To: <004a01c503b5$fd167b00$18fccc97@oemcomputer>
References: <ee2a432c050125193511085d8@mail.gmail.com>
	<004a01c503b5$fd167b00$18fccc97@oemcomputer>
Message-ID: <ee2a432c050130125510c7938d@mail.gmail.com>

On Wed, 26 Jan 2005 09:47:41 -0500, Raymond Hettinger <python@rcn.com> wrote:
>
> This is what I mean about the patch taking on a life of its own.  It's
> an optimization patch that slows down METH_O and METH_NOARGS.  It's a
> incremental change that throws away backwards compatibility.  It's a
> simplification that introduces a bazillion new code paths.  It's a
> simplification that can't be realized until 3.0.

I've been thinking about how to move towards 3.0.  There are many
changes that are desirable and unlikely to occur prior to 3.0.  But if
we defer so many enhancements, the changes will be voluminous,
potentially difficult to manage, and possibly error prone.  There is a
risk that many small warts will not be fixed, only because they fell
through the cracks.

I thought about making a p3k branch in CVS.  It could be worked on
slowly and would be the implementation of PEP 3000.  However, if a
branch was created all changes would need to be forward ported to it
and it would need to be kept up to date.  I know I wouldn't have
enough time to maintain this.

The benefit is that people could test the portability of their
applications with 3.0 sooner rather than later.  They could see if the
switch to iterators created problems, or integer division, or
new-style exceptions, etc.  We could try to improve performance by
simplifying the architecture.  We could see how much of a problem it would be
to (re)move some builtins.

Any ideas how we could start to realize some benefits of Py3.0 before
it arrives?  I'm not sure if this is worth it, if it's premature, or
if there are other ways to achieve the goal of easing the transition for
users and simplifying developers' tasks (by spreading changes over a longer
period of time) and reducing the possibility of not fixing warts.

Neal
From t-meyer at ihug.co.nz  Sun Jan 30 23:05:57 2005
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Sun Jan 30 23:07:08 2005
Subject: [Python-Dev] Should Python's library modules be written to help the
	freeze tools?
In-Reply-To: <ECBA357DDED63B4995F5C1F5CBE5B1E801EC7549@its-xchg4.massey.ac.nz>
Message-ID: <ECBA357DDED63B4995F5C1F5CBE5B1E801DAFDC1@its-xchg4.massey.ac.nz>

The Python 2.4 Lib/bsddb/__init__.py contains this:

"""
# for backwards compatibility with python versions older than 2.3, the
# iterator interface is dynamically defined and added using a mixin
# class.  old python can't tokenize it due to the yield keyword.
if sys.version >= '2.3':
    exec """
import UserDict
from weakref import ref
class _iter_mixin(UserDict.DictMixin):
...
"""

Because the imports are inside an exec, modulefinder (e.g. when using bsddb
with a py2exe built application) does not realise that the imports are
required.  (The requirement can be manually specified, of course, if you
know that you need to do so).
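
A small sketch (using only the standard-library `modulefinder`, with throwaway file names) of why the exec hides the dependency: `ModuleFinder` analyzes the script's bytecode, and an import buried inside an exec'd string never shows up there.

```python
import modulefinder
import os
import tempfile

def found_modules(source):
    """Run ModuleFinder over a throwaway script and return module names."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(source)
        path = f.name
    try:
        finder = modulefinder.ModuleFinder()
        finder.run_script(path)
        return set(finder.modules)
    finally:
        os.remove(path)

# The import inside the exec'd string is just a string constant to the
# bytecode scanner, so it is invisible...
print("json" in found_modules('exec("import json")\n'))  # False
# ...while a top-level import is tracked as usual.
print("json" in found_modules('import json\n'))          # True
```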

I believe that changing the above code to:

"""
if sys.version >= '2.3':
    import UserDict
    from weakref import ref
    exec """
class _iter_mixin(UserDict.DictMixin):
"""

would still have the intended effect and would let modulefinder do its work.

The main question (to steal Thomas's words) is whether the library modules
should be written to help the freeze tools - if the answer is 'yes', then
I'll submit the above as a patch for 2.5.

Thanks!

=Tony.Meyer

From martin at v.loewis.de  Sun Jan 30 23:50:53 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Jan 30 23:50:39 2005
Subject: [Python-Dev] Should Python's library modules be written to help
	the	freeze tools?
In-Reply-To: <ECBA357DDED63B4995F5C1F5CBE5B1E801DAFDC1@its-xchg4.massey.ac.nz>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E801DAFDC1@its-xchg4.massey.ac.nz>
Message-ID: <41FD64CD.5050307@v.loewis.de>

Tony Meyer wrote:
> The main question (to steal Thomas's words) is whether the library modules
> should be written to help the freeze tools - if the answer is 'yes', then
> I'll submit the above as a patch for 2.5.

The answer to this question certainly is "yes, if possible". In this
specific case, I wonder whether the backwards compatibility is still
required in the first place. According to PEP 291, Greg Smith and
Barry Warsaw decide on this, so I think they would need to comment
first before any patch can be integrated. If they comment that 2.1
compatibility is still desirable, your patch would be fine (I guess);
if they say that the compatibility requirement can be dropped for 2.5,
I suggest that the entire exec statement is removed, along with the
conditional clause.

Regards,
Martin
From t-meyer at ihug.co.nz  Mon Jan 31 00:28:42 2005
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Mon Jan 31 00:28:46 2005
Subject: [Python-Dev] Should Python's library modules be written to help
	the freeze tools?
In-Reply-To: <ECBA357DDED63B4995F5C1F5CBE5B1E801EC7A6C@its-xchg4.massey.ac.nz>
Message-ID: <ECBA357DDED63B4995F5C1F5CBE5B1E801DAFDCB@its-xchg4.massey.ac.nz>

[Tony Meyer]
> The main question (to steal Thomas's words) is whether the 
> library modules should be written to help the freeze tools
> - if the answer is 'yes', then I'll submit the above as a
> patch for 2.5.

[Martin v. Löwis]
> The answer to this question certainly is "yes, if possible". In this
> specific case, I wonder whether the backwards compatibility is still
> required in the first place. According to PEP 291, Greg Smith and
> Barry Warsaw decide on this, so I think they would need to comment
> first before any patch can be integrated.
[...]

Thanks!  I've gone ahead and submitted a patch, in that case:

[ 1112812 ] Patch for Lib/bsddb/__init__.py to work with modulefinder
<http://sourceforge.net/tracker/index.php?func=detail&aid=1112812&group_id=5470&atid=305470>

I realise that neither of the people that need to look at this are part of
the '5 for 1' deal, so I need to wait for one of them to have time to look
at it (plenty of time left before 2.5 anyway) but I'll do 5 reviews for the
karma anyway, today or tomorrow.

=Tony.Meyer

From bob at redivi.com  Mon Jan 31 00:56:17 2005
From: bob at redivi.com (Bob Ippolito)
Date: Mon Jan 31 00:56:32 2005
Subject: [Python-Dev] Should Python's library modules be written to help
	the	freeze tools?
In-Reply-To: <41FD64CD.5050307@v.loewis.de>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E801DAFDC1@its-xchg4.massey.ac.nz>
	<41FD64CD.5050307@v.loewis.de>
Message-ID: <97B1F1B1-9004-4002-A2A2-11259D4EE007@redivi.com>

On Jan 30, 2005, at 5:50 PM, Martin v. Löwis wrote:


> Tony Meyer wrote:
>> The main question (to steal Thomas's words) is whether the library  
>> modules
>> should be written to help the freeze tools - if the answer is 'yes',  
>> then
>> I'll submit the above as a patch for 2.5.
>
> The answer to this question certainly is "yes, if possible". In this
> specific case, I wonder whether the backwards compatibility is still
> required in the first place. According to PEP 291, Greg Smith and
> Barry Warsaw decide on this, so I think they would need to comment
> first before any patch can be integrated. If they comment that 2.1
> compatibility is still desirable, your patch would be fine (I guess);
> if they say that the compatibility requirement can be dropped for 2.5,
> I suggest that the entire exec statement is removed, along with the
> conditional clause.

py2app <http://undefined.org/python/#py2app> handles this situation by  
using a much richer way to analyze module dependencies, in that it can  
use hooks (called "recipes") to trigger arbitrary behavior when the  
responsible recipe sees that a certain module is in the dependency  
graph.  This is actually necessary to do the right thing in the case of  
extensions and modules that are not friendly with bytecode analysis.   
Though there are not many of these in the standard library, a few  
common packages such as PIL have a real need for this.  Also, since  
modulegraph uses a graph data structure it is much better suited to  
pruning the dependency graph.  For example, pydoc imports Tkinter to  
support an obscure feature, but this is almost never desired in the  
context of an application freeze tool.  py2app ships with a recipe that  
automatically breaks the edge between pydoc and Tkinter  
<http://svn.red-bean.com/bob/py2app/trunk/src/py2app/recipes/docutils.py>, so if Tkinter is not explicitly included or used by  
anything else in the dependency graph, it is correctly excluded from  
the resultant application bundle.

In order to correctly cover the Python API, I needed to ALWAYS include:  
unicodedata, warnings, encodings, and weakref because they can be used  
by the implementation of Python itself without any "import" hints  
(which, if py2exe also did this, would've probably solved Tony's issue  
with bsddb).  Also, I did an analysis of the Python standard library
and discovered the following (hopefully rather complete) list of
implicit dependencies (from
<http://svn.red-bean.com/bob/py2app/trunk/src/modulegraph/find_modules.py>):

     {
         # imports done from builtin modules in C code (untrackable by  
modulegraph)
         "time":         ["_strptime"],
         "datetime":     ["time"],
         "MacOS":        ["macresource"],
         "cPickle":      ["copy_reg", "cStringIO"],
         "parser":       ["copy_reg"],
         "codecs":       ["encodings"],
         "cStringIO":    ["copy_reg"],
         "_sre":         ["copy", "string", "sre"],
         "zipimport":    ["zlib"],
         # mactoolboxglue can do a bunch more of these
         # that are far harder to predict, these should be tracked
         # manually for now.

         # this isn't C, but it uses __import__
         "anydbm":       ["dbhash", "gdbm", "dbm", "dumbdbm", "whichdb"],
     }
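
As a hedged illustration (the table below is a hand-copied subset of the list above, and `expand` is a hypothetical helper, not part of modulefinder or modulegraph), a freeze tool could fold such implicit edges into its statically-discovered module set like this:

```python
# A hand-copied subset of the implicit-dependency table above; the
# expand() helper is hypothetical, not part of any freeze tool.
IMPLICIT_DEPS = {
    "time":      ["_strptime"],
    "datetime":  ["time"],
    "cPickle":   ["copy_reg", "cStringIO"],
    "cStringIO": ["copy_reg"],
    "codecs":    ["encodings"],
}

def expand(found):
    """Return `found` plus every implicit dependency, transitively."""
    result = set(found)
    stack = list(found)
    while stack:
        for dep in IMPLICIT_DEPS.get(stack.pop(), ()):
            if dep not in result:
                result.add(dep)
                stack.append(dep)
    return result

print(sorted(expand({"cPickle", "datetime"})))
# ['_strptime', 'cPickle', 'cStringIO', 'copy_reg', 'datetime', 'time']
```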

I would like to write a PEP for modulegraph as a replacement for  
modulefinder at some point, but I can't budget the time for it right  
now.  The current implementation is also largely untested on other  
platforms.  I hear it has been used by a Twisted developer to create  
Windows NT services using py2exe (augmenting py2exe's rather simple  
dependency resolution mechanism), however I'm not sure if that is in  
public svn or not.  If the authors of the other freeze tools are  
interested, they can feel free to use modulegraph from py2app -- it is  
cross-platform code under MIT license, but I can dual-license if  
necessary (however I think it should be compatible with cx_freeze,  
py2exe, and Python itself).  The API is purposefully different than  
modulefinder, but it is close enough such that most of the work  
involved is just removing unnecessary kludges.

-bob

From bac at OCF.Berkeley.EDU  Mon Jan 31 03:29:25 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Mon Jan 31 03:29:33 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function
	calls)
In-Reply-To: <ee2a432c050130125510c7938d@mail.gmail.com>
References: <ee2a432c050125193511085d8@mail.gmail.com>	<004a01c503b5$fd167b00$18fccc97@oemcomputer>
	<ee2a432c050130125510c7938d@mail.gmail.com>
Message-ID: <41FD9805.1090309@ocf.berkeley.edu>

Neal Norwitz wrote:
> On Wed, 26 Jan 2005 09:47:41 -0500, Raymond Hettinger <python@rcn.com> wrote:
> 
[SNIP]
> Any ideas how we could start to realize some benefits of Py3.0 before
> it arrives?  I'm not sure if this is worth it, if it's premature, or
> if there are other ways to achieve the goal of easing transition for
> users and simplifying developers' tasks (by spreading over a longer
> period of time) and reducing the possibility of not fixing warts.
> 

The way I always imagined Python 3.0 would come about would be through preview 
releases.  Once the final 2.x version was released and went into maintenance 
we would start developing Python 3.0.  During that development, when a major 
semantic change was checked in and seemed to work we could do a quick preview 
release for people to use to see if the new features up to that preview release 
would break their code.

Any other way, though, through concurrent development, seems painful.  As you 
mentioned, Neal, branches require merges eventually and that can be painful.  I 
suspect people will just have to put up with a longer dev time for Python 3.0.
That longer dev time might actually be a good thing in the end.  It would 
enable us to really develop a very stable 2.x version of Python that we all 
know will be in use for quite some time by old code.

-Brett
From python at rcn.com  Mon Jan 31 04:26:47 2005
From: python at rcn.com (Raymond Hettinger)
Date: Mon Jan 31 04:30:27 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function
	calls)
In-Reply-To: <ee2a432c050130125510c7938d@mail.gmail.com>
Message-ID: <000d01c50744$b2395700$fe26a044@oemcomputer>

Neal Norwitz 
> I thought about making a p3k branch in CVS

I had hoped for the core of p3k to be built from scratch so that even the
most pervasive and fundamental implementation choices would be open for
discussion:

* Possibly write in C++.
* Possibly replace bytecode with Forth style threaded code.
* Possibly toss ref counting in favor of some kind of GC.
* Consider ways to leverage multiple processor environments.
* Consider alternative ways to implement exception handling (long jumps,
signals, etc.)
* Look at alternate ways of building, passing, and parsing function
arguments.
* Use b-trees instead of dictionaries (just kidding).


Raymond


From skip at pobox.com  Mon Jan 31 06:00:21 2005
From: skip at pobox.com (Skip Montanaro)
Date: Mon Jan 31 06:00:27 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function
	calls)
In-Reply-To: <000d01c50744$b2395700$fe26a044@oemcomputer>
References: <ee2a432c050130125510c7938d@mail.gmail.com>
	<000d01c50744$b2395700$fe26a044@oemcomputer>
Message-ID: <16893.47973.850283.413462@montanaro.dyndns.org>


    Raymond> I had hoped for the core of p3k to be built from scratch ...

Then we should just create a new CVS module for it (or go whole hog and try
a new revision control system altogether - svn, darcs, arch, whatever).

Skip
From gvanrossum at gmail.com  Mon Jan 31 06:17:15 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon Jan 31 06:17:19 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function
	calls)
In-Reply-To: <000d01c50744$b2395700$fe26a044@oemcomputer>
References: <ee2a432c050130125510c7938d@mail.gmail.com>
	<000d01c50744$b2395700$fe26a044@oemcomputer>
Message-ID: <ca471dc2050130211744a1b76f@mail.gmail.com>

> I had hoped for the core of p3k to be built from scratch [...]

Stop right there.

I used to think that was a good idea too, and was hoping to do exactly
that (after retirement :). However, the more I think about it, the
more I believe it would be throwing away too much valuable work.

Please read this article by Joel Spolsky (if you're not yet in the
habit of reading "Joel on Software", you're missing something):
http://joelonsoftware.com/articles/fog0000000069.html

Then tell me if you still want to start over. I expect that if we do
piecemeal replacement of modules rather than starting from scratch
we'll be more productive sooner with less effort. After all, the
Python 3000 effort shouldn't be as pervasive as the Perl 6 design --
we're not redesigning the language from scratch, we're just tweaking
(albeit allowing backwards incompatibilities).

> * Possibly write in C++.
> * Possibly replace bytecode with Forth style threaded code.
> * Possibly toss ref counting in favor of some kind of GC.
> * Consider ways to leverage multiple processor environments.
> * Consider alternative ways to implement exception handling (long jumps,
> signals, etc.)
> * Look at alternate ways of building, passing, and parsing function
> arguments.
> * Use b-trees instead of dictionaries (just kidding).

The "just kidding" applies to the whole list, right? None of these
strike me as good ideas, except for improvements to function argument
passing.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From tameyer at ihug.co.nz  Mon Jan 31 01:48:35 2005
From: tameyer at ihug.co.nz (Tony Meyer)
Date: Mon Jan 31 07:55:58 2005
Subject: [Python-Dev] Bug tracker reviews
Message-ID: <ECBA357DDED63B4995F5C1F5CBE5B1E801DAFDCE@its-xchg4.massey.ac.nz>

As promised, here are five bug reviews with recommendations.  If they help 

[ 1112812 ] Patch for Lib/bsddb/__init__.py to work with modulefinder
<http://sourceforge.net/tracker/index.php?func=detail&aid=1112812&group_id=5470&atid=305470>

get reviewed, then that'd be great.  Otherwise I'll just take the good karma
and run :)

-----

[ 531205 ] Bugs in rfc822.parseaddr()
<http://sourceforge.net/tracker/index.php?func=detail&aid=531205&group_id=5470&atid=105470>

What to do when an email address contains spaces, when RFC2822 says it
can't.  At the moment the spaces are stripped.  Recommend closing "Won't
Fix", for reasons outlined in the tracker by Tim Roberts.

[ 768419 ] Subtle bug in os.path.realpath on Cygwin
<http://sourceforge.net/tracker/index.php?func=detail&aid=768419&group_id=5470&atid=105470>

Agree with Sjoerd that this is a Cygwin bug rather than a Python one (and no
response from OP for a very long time).  Recommend closing "Won't Fix".

[ 803413 ] uu.decode prints to stderr
<http://sourceforge.net/tracker/index.php?func=detail&aid=803413&group_id=5470&atid=105470>

The question is whether it is ok for library modules to print to stderr if a
recoverable error occurs.  Looking at other modules, it seems uncommon, but
ok, so recommend closing "Won't fix", but making the suggested documentation
change.
(Alternatively, change from printing to stderr to using warnings.warn, which
would be a simple change and possibly more correct, although giving the same
result).
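
A sketch of that alternative (the `decode_chunk` function below is hypothetical, standing in for uu.decode's recoverable-error path, not the real module code): a warning can be silenced, recorded, or turned into an error by the caller, unlike a bare print to stderr.

```python
import warnings

def decode_chunk(data):
    # Hypothetical stand-in for uu.decode's recoverable-error path:
    # report the problem through the warnings machinery instead of
    # writing directly to sys.stderr.
    if not data.endswith(b"\n"):
        warnings.warn("truncated line in input", RuntimeWarning)
    return data.rstrip(b"\n")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    decode_chunk(b"abc")      # no trailing newline -> one warning recorded
print(len(caught))  # 1
```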

[ 989333 ] Empty curses module is loaded in win32
<http://sourceforge.net/tracker/index.php?func=detail&aid=989333&group_id=5470&atid=105470>

Importing curses loads an empty module instead of raising ImportError on
win32.  I cannot duplicate this: recommend closing as "Invalid".

[ 1090076 ] Defaults in ConfigParser.get overrides section values
<http://sourceforge.net/tracker/index.php?func=detail&aid=1090076&group_id=5470&atid=105470>

Behaviour of ConfigParser doesn't match the documentation.  The included
patch for ConfigParser does fix the problem, but might break existing code.
A decision needs to be made as to which is the desired behaviour, and the tracker
can be closed either "Won't Fix" or "Fixed" (and the fix applied for 2.5 and
2.4.1).
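
For reference, the documented lookup order is vars, then the section, then the defaults, so a section value should win over a constructor default. A minimal sketch (in the modern `configparser` spelling, which is an assumption relative to the 2.4-era module):

```python
import configparser

# Constructor defaults populate the DEFAULT section; a real section
# value is expected to shadow them -- the behaviour the bug report
# says is violated.
cp = configparser.ConfigParser({"key": "from_default"})
cp.read_string("[sec]\nkey = from_section\n[empty]\n")
print(cp.get("sec", "key"))    # from_section: the section value wins
print(cp.get("empty", "key"))  # from_default: the default is only a fallback
```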

=Tony.Meyer

From barry at python.org  Mon Jan 31 14:12:38 2005
From: barry at python.org (Barry Warsaw)
Date: Mon Jan 31 14:12:47 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up
	function calls)
In-Reply-To: <ca471dc2050130211744a1b76f@mail.gmail.com>
References: <ee2a432c050130125510c7938d@mail.gmail.com>
	<000d01c50744$b2395700$fe26a044@oemcomputer>
	<ca471dc2050130211744a1b76f@mail.gmail.com>
Message-ID: <1107177157.14649.125.camel@presto.wooz.org>

On Mon, 2005-01-31 at 00:17, Guido van Rossum wrote:
> > I had hoped for the core of p3k to be built from scratch [...]
> 
> Stop right there.

Phew!
-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20050131/5f6e756b/attachment.pgp
From barry at python.org  Mon Jan 31 14:13:32 2005
From: barry at python.org (Barry Warsaw)
Date: Mon Jan 31 14:13:34 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up
	function calls)
In-Reply-To: <16893.47973.850283.413462@montanaro.dyndns.org>
References: <ee2a432c050130125510c7938d@mail.gmail.com>
	<000d01c50744$b2395700$fe26a044@oemcomputer>
	<16893.47973.850283.413462@montanaro.dyndns.org>
Message-ID: <1107177212.14651.127.camel@presto.wooz.org>

On Mon, 2005-01-31 at 00:00, Skip Montanaro wrote:
>     Raymond> I had hoped for the core of p3k to be built from scratch ...
> 
> Then we should just create a new CVS module for it (or go whole hog and try
> a new revision control system altogether - svn, darcs, arch, whatever).

I've heard rumors that SF was going to be making svn available.  Anybody
know more about that?  I'd be +1 on moving from cvs to svn.

-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20050131/cb8fb3f5/attachment.pgp
From ejones at uwaterloo.ca  Mon Jan 31 16:43:53 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Mon Jan 31 16:43:41 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function
	calls)
In-Reply-To: <ca471dc2050130211744a1b76f@mail.gmail.com>
References: <ee2a432c050130125510c7938d@mail.gmail.com>
	<000d01c50744$b2395700$fe26a044@oemcomputer>
	<ca471dc2050130211744a1b76f@mail.gmail.com>
Message-ID: <E8D78F45-739E-11D9-8570-0003938016AE@uwaterloo.ca>

On Jan 31, 2005, at 0:17, Guido van Rossum wrote:
> The "just kidding" applies to the whole list, right? None of these
> strike me as good ideas, except for improvements to function argument
> passing.

Really? You see no advantage to moving to garbage collection, nor 
allowing Python to leverage multiple processor environments? I'd be 
curious to hear your reasons why not.

My knowledge about garbage collection is weak, but I have read a little 
bit of Hans Boehm's work on garbage collection. For example, his 
"Memory Allocation Myths and Half Truths" presentation 
(http://www.hpl.hp.com/personal/Hans_Boehm/gc/myths.ps) is quite 
interesting. On page 25 he examines reference counting. The biggest 
disadvantage mentioned is that simple pointer assignments end up 
becoming "increment ref count" operations as well, which can "involve 
at least 4 potential memory references." The next page has a 
micro-benchmark that shows reference counting performing very poorly. 
Not to mention that Python has a garbage collector *anyway,* so 
wouldn't it make sense to get rid of the reference counting?
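
The per-assignment cost Boehm describes is observable from Python itself; a small sketch using the standard `sys.getrefcount` (which reports one extra reference for its own argument):

```python
import sys

# Every name binding below is an "increment ref count" operation in
# CPython's C implementation; getrefcount lets us watch it happen.
obj = object()
base = sys.getrefcount(obj)      # includes the temporary argument reference
alias = obj                      # plain assignment -> INCREF
assert sys.getrefcount(obj) == base + 1
del alias                        # -> DECREF
assert sys.getrefcount(obj) == base
```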

My only argument for making Python capable of leveraging multiple 
processor environments is that multithreading seems to be where the big 
performance increases will be in the next few years. I am currently 
using Python for some relatively large simulations, so performance is 
important to me.

Evan Jones

From bob at redivi.com  Mon Jan 31 17:04:51 2005
From: bob at redivi.com (Bob Ippolito)
Date: Mon Jan 31 17:04:57 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function
	calls)
In-Reply-To: <E8D78F45-739E-11D9-8570-0003938016AE@uwaterloo.ca>
References: <ee2a432c050130125510c7938d@mail.gmail.com>
	<000d01c50744$b2395700$fe26a044@oemcomputer>
	<ca471dc2050130211744a1b76f@mail.gmail.com>
	<E8D78F45-739E-11D9-8570-0003938016AE@uwaterloo.ca>
Message-ID: <f0480f69f325f0dd1af7695c993d73ca@redivi.com>


On Jan 31, 2005, at 10:43, Evan Jones wrote:

> On Jan 31, 2005, at 0:17, Guido van Rossum wrote:
>> The "just kidding" applies to the whole list, right? None of these
>> strike me as good ideas, except for improvements to function argument
>> passing.
>
> Really? You see no advantage to moving to garbage collection, nor 
> allowing Python to leverage multiple processor environments? I'd be 
> curious to hear your reasons why not.
>
> My knowledge about garbage collection is weak, but I have read a 
> little bit of Hans Boehm's work on garbage collection. For example, 
> his "Memory Allocation Myths and Half Truths" presentation 
> (http://www.hpl.hp.com/personal/Hans_Boehm/gc/myths.ps) is quite 
> interesting. On page 25 he examines reference counting. The biggest 
> disadvantage mentioned is that simple pointer assignments end up 
> becoming "increment ref count" operations as well, which can "involve 
> at least 4 potential memory references." The next page has a 
> micro-benchmark that shows reference counting performing very poorly. 
> Not to mention that Python has a garbage collector *anyway,* so 
> wouldn't it make sense to get rid of the reference counting?
>
> My only argument for making Python capable of leveraging multiple 
> processor environments is that multithreading seems to be where the 
> big performance increases will be in the next few years. I am 
> currently using Python for some relatively large simulations, so 
> performance is important to me.

Wouldn't it be nicer to have a facility that let you send messages 
between processes and manage concurrency properly instead?  You'll need 
most of this anyway to do multithreading sanely, and the benefit to the 
multiple process model is that you can scale to multiple machines, not 
just processors.  For brokering data between processes on the same 
machine, you can use mapped memory if you can't afford to copy it 
around, which gives you basically all the benefits of threads with 
fewer pitfalls.

-bob

From fredrik at pythonware.com  Mon Jan 31 17:09:02 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon Jan 31 17:10:22 2005
Subject: [Python-Dev] Re: Moving towards Python 3.0 (was Re: Speed up
	functioncalls)
References: <ee2a432c050130125510c7938d@mail.gmail.com><000d01c50744$b2395700$fe26a044@oemcomputer><ca471dc2050130211744a1b76f@mail.gmail.com><E8D78F45-739E-11D9-8570-0003938016AE@uwaterloo.ca>
	<f0480f69f325f0dd1af7695c993d73ca@redivi.com>
Message-ID: <ctll64$q1e$1@sea.gmane.org>

Bob Ippolito wrote:

> Wouldn't it be nicer to have a facility that let you send messages between processes and manage 
> concurrency properly instead?  You'll need most of this anyway to do multithreading sanely, and 
> the benefit to the multiple process model is that you can scale to multiple machines, not just 
> processors.

yes, please!

> For brokering data between processes on the same machine, you can use
> mapped memory if you can't afford to copy it around

this mechanism should be reasonably hidden, of course, at least for "normal
use".

</F> 



From mcherm at mcherm.com  Mon Jan 31 17:51:05 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Mon Jan 31 17:51:09 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up
	functioncalls)
Message-ID: <1107190265.41fe61f9caab8@mcherm.com>

Evan Jones writes:
> My knowledge about garbage collection is weak, but I have read a little
> bit of Hans Boehm's work on garbage collection. [...] The biggest
> disadvantage mentioned is that simple pointer assignments end up
> becoming "increment ref count" operations as well...

Hans Boehm certainly has some excellent points. I believe a little
searching through the Python dev archives will reveal that attempts
have been made in the past to use his GC tools with CPython, and that
the results have been disappointing. That may be because other parts
of CPython are optimized for reference counting, or it may be just
because this stuff is so bloody difficult!

However, remember that changing away from reference counting is a change
to the semantics of CPython. Right now, people can (and often do) assume
that objects which don't participate in a reference loop are collected
as soon as they go out of scope. They write code that depends on
this... idioms like:

    >>> text_of_file = open(file_name, 'r').read()

Perhaps such idioms aren't a good practice (they'd fail in Jython or
in IronPython), but they ARE common. So we shouldn't stop using
reference counting unless we can demonstrate that the alternative is
clearly better. Of course, we'd also need to devise a way for extensions
to cooperate (which is a problem Jython, at least, doesn't face).

So it's NOT an obvious call, and so far numerous attempts to review
other GC strategies have failed. I wouldn't be so quick to dismiss
reference counting.
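
A sketch of the refcount-independent spelling of that idiom (`file_name` is hypothetical; a temporary file stands in for it here). The cleanup is explicit, so it behaves the same under any collector:

```python
import os
import tempfile

# file_name is a hypothetical path; a temporary file stands in here.
fd, file_name = tempfile.mkstemp()
os.close(fd)

f = open(file_name, 'r')
try:
    text_of_file = f.read()
finally:
    f.close()   # runs promptly under any GC strategy, CPython or not

print(f.closed)  # True
```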

> My only argument for making Python capable of leveraging multiple
> processor environments is that multithreading seems to be where the big
> performance increases will be in the next few years. I am currently
> using Python for some relatively large simulations, so performance is
> important to me.

CPython CAN leverage such environments, and it IS used that way.
However, this requires using multiple Python processes and inter-process
communication of some sort (there are lots of choices, take your pick).
It's a technique which is more trouble for the programmer, but in my
experience usually has less likelihood of containing subtle parallel
processing bugs. Sure, it'd be great if Python threads could make use
of separate CPUs, but if the cost of that were that Python dictionaries
performed as poorly as a Java Hashtable or synchronized HashMap, then it
wouldn't be worth the cost. There's a reason why Java moved away from
Hashtable (the threadsafe data structure) to HashMap (not threadsafe).

Perhaps the REAL solution is just a really good IPC library that makes
it easier to write programs that launch "threads" as separate processes
and communicate with them. No change to the internals, just a new
library to encourage people to use the technique that already works.
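
Today such a library might be sketched with the standard `multiprocessing` module (a later stdlib addition, used here purely to illustrate the processes-plus-messages model; the function names are made up):

```python
import multiprocessing as mp

def worker(inbox, outbox):
    # A "thread" that is really a process: data crosses only as messages.
    for item in iter(inbox.get, None):
        outbox.put(item * item)
    outbox.put(None)  # echo the sentinel back when done

def square_all(values):
    """Send values to a worker process; collect the squared results."""
    inbox, outbox = mp.Queue(), mp.Queue()
    p = mp.Process(target=worker, args=(inbox, outbox))
    p.start()
    for v in values:
        inbox.put(v)
    inbox.put(None)                         # sentinel: no more work
    results = list(iter(outbox.get, None))  # single worker, so FIFO order
    p.join()
    return results

if __name__ == "__main__":
    print(square_all([1, 2, 3]))  # [1, 4, 9]
```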

-- Michael Chermside

From skip at pobox.com  Mon Jan 31 18:02:27 2005
From: skip at pobox.com (Skip Montanaro)
Date: Mon Jan 31 18:02:59 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up
	functioncalls)
In-Reply-To: <1107190265.41fe61f9caab8@mcherm.com>
References: <1107190265.41fe61f9caab8@mcherm.com>
Message-ID: <16894.25763.467209.961676@montanaro.dyndns.org>


    Michael> CPython CAN leverage such environments, and it IS used that
    Michael> way.  However, this requires using multiple Python processes
    Michael> and inter-process communication of some sort (there are lots of
    Michael> choices, take your pick).  It's a technique which is more
    Michael> trouble for the programmer, but in my experience usually has
    Michael> less likelihood of containing subtle parallel processing
    Michael> bugs.

In my experience, when people suggest that "threads are easier than ipc", it
means that their code is sprinkled with "subtle parallel processing bugs".

    Michael> Perhaps the REAL solution is just a really good IPC library
    Michael> that makes it easier to write programs that launch "threads" as
    Michael> separate processes and communicate with them. 

Tuple space, anyone?

Skip
From mwh at python.net  Mon Jan 31 18:20:46 2005
From: mwh at python.net (Michael Hudson)
Date: Mon Jan 31 18:20:49 2005
Subject: [Python-Dev] Re: PEP 309
In-Reply-To: <79990c6b05012914156800e5bc@mail.gmail.com> (Paul Moore's
	message of "Sat, 29 Jan 2005 22:15:40 +0000")
References: <c6fc3cde05012616322cb5f3e4@mail.gmail.com>
	<d11dcfba050126164067bef3a1@mail.gmail.com>
	<cta7fq$144$1@sea.gmane.org>
	<79990c6b05012701492440d0c0@mail.gmail.com>
	<79990c6b05012914156800e5bc@mail.gmail.com>
Message-ID: <2mr7k18qhd.fsf@starship.python.net>

Paul Moore <p.f.moore@gmail.com> writes:

> Also, while looking at patches I noticed 1077106. It doesn't apply to
> me - I don't use Linux - but it looks like this may have simply been
forgotten. The last comment is in December from Michael Hudson,
> saying in effect "I'll commit this tomorrow". Michael?

Argh.  Committed.

Cheers,
mwh

-- 
  LINTILLA:  You could take some evening classes.
    ARTHUR:  What, here?
  LINTILLA:  Yes, I've got a bottle of them.  Little pink ones.
                   -- The Hitch-Hikers Guide to the Galaxy, Episode 12
From glyph at divmod.com  Mon Jan 31 20:08:24 2005
From: glyph at divmod.com (Glyph Lefkowitz)
Date: Mon Jan 31 20:07:50 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up
	functioncalls)
In-Reply-To: <1107190265.41fe61f9caab8@mcherm.com>
References: <1107190265.41fe61f9caab8@mcherm.com>
Message-ID: <1107198504.4185.5.camel@localhost>

On Mon, 2005-01-31 at 08:51 -0800, Michael Chermside wrote:

> However, remember that changing away from reference counting is a change
> to the semantics of CPython. Right now, people can (and often do) assume
> that objects which don't participate in a reference loop are collected
> as soon as they go out of scope. They write code that depends on
> this... idioms like:
> 
>     >>> text_of_file = open(file_name, 'r').read()
> 
> Perhaps such idioms aren't a good practice (they'd fail in Jython or
> in IronPython), but they ARE common. So we shouldn't stop using
> reference counting unless we can demonstrate that the alternative is
> clearly better. Of course, we'd also need to devise a way for extensions
> to cooperate (which is a problem Jython, at least, doesn't face).

I agree that the issue is highly subtle, but this reason strikes me as
kind of bogus.  The problem here is not that the semantics are really
different, but that Python doesn't treat file descriptors as an
allocatable resource, and therefore doesn't trigger the GC when they are
exhausted.

As it stands, this idiom works most of the time, and if an EMFILE errno
triggered the GC, it would always work.

Obviously this would be difficult to implement pervasively, but maybe it
should be a guideline for alternative implementations to follow so as
not to fall into situations where tricks like this one, which are
perfectly valid both semantically and in regular python, would fail due
to an interaction with the OS...?
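
A sketch of that guideline applied at the library level (the `robust_open` name and wrapper are hypothetical; CPython itself does not do this): on descriptor exhaustion, run the collector and retry once.

```python
import errno
import gc

def robust_open(path, mode="r"):
    """Open a file; if descriptors are exhausted (EMFILE), let the
    collector reclaim unclosed file objects and retry once."""
    try:
        return open(path, mode)
    except OSError as e:
        if e.errno != errno.EMFILE:
            raise
        gc.collect()        # may close files held only by garbage cycles
        return open(path, mode)
```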


From martin at v.loewis.de  Mon Jan 31 20:21:10 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon Jan 31 20:20:53 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function
	calls)
In-Reply-To: <E8D78F45-739E-11D9-8570-0003938016AE@uwaterloo.ca>
References: <ee2a432c050130125510c7938d@mail.gmail.com>	<000d01c50744$b2395700$fe26a044@oemcomputer>	<ca471dc2050130211744a1b76f@mail.gmail.com>
	<E8D78F45-739E-11D9-8570-0003938016AE@uwaterloo.ca>
Message-ID: <41FE8526.4000907@v.loewis.de>

Evan Jones wrote:
> The next page has a 
> micro-benchmark that shows reference counting performing very poorly. 
> Not to mention that Python has a garbage collector *anyway,* so wouldn't 
> it make sense to get rid of the reference counting?

It's not clear what these numbers exactly mean, but I don't believe
them. With the Python GIL, the increments/decrements don't have to
be atomic, which already helps in a multiprocessor system (as you
don't need a buslock). The actual costs of GC occur when a
collection happens - and it should always be possible to construct
cases where the collection needs longer, because it has to look
at so much memory.

I like reference counting because of its predictability. I
deliberately do

data = open(filename).read()

without having to worry about closing the file - just because
reference counting does it for me. I guess a lot of code will
break when you drop refcounting - perhaps unless an fopen
failure will trigger a GC.
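For comparison, a self-contained version of that idiom next to the explicit
form (written for modern Python; the scratch file exists only to make the
snippet runnable):

```python
import tempfile

# Create a scratch file so the snippet is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello")
    filename = tmp.name

# The refcount-dependent idiom: the anonymous file object is closed as
# soon as its reference count drops to zero.
data = open(filename).read()

# The explicit form, correct under any collection strategy.
with open(filename) as f:
    data2 = f.read()
```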

Regards,
Martin
From martin at v.loewis.de  Mon Jan 31 20:23:18 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon Jan 31 20:23:02 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up	function
	calls)
In-Reply-To: <1107177212.14651.127.camel@presto.wooz.org>
References: <ee2a432c050130125510c7938d@mail.gmail.com>	<000d01c50744$b2395700$fe26a044@oemcomputer>	<16893.47973.850283.413462@montanaro.dyndns.org>
	<1107177212.14651.127.camel@presto.wooz.org>
Message-ID: <41FE85A6.10903@v.loewis.de>

Barry Warsaw wrote:
> I've heard rumors that SF was going to be making svn available.  Anybody
> know more about that?  I'd be +1 on moving from cvs to svn.

It was on their "things we do in 2005" list. 2005 isn't over yet...
I wouldn't be surprised if it gets moved to their "things we do in 2006"
list in November (just predicting from past history, without any
insight).

Regards,
Martin
From mwh at python.net  Mon Jan 31 20:40:48 2005
From: mwh at python.net (Michael Hudson)
Date: Mon Jan 31 20:40:49 2005
Subject: [Python-Dev] Re: Moving towards Python 3.0
In-Reply-To: <E8D78F45-739E-11D9-8570-0003938016AE@uwaterloo.ca> (Evan
	Jones's message of "Mon, 31 Jan 2005 10:43:53 -0500")
References: <ee2a432c050130125510c7938d@mail.gmail.com>
	<000d01c50744$b2395700$fe26a044@oemcomputer>
	<ca471dc2050130211744a1b76f@mail.gmail.com>
	<E8D78F45-739E-11D9-8570-0003938016AE@uwaterloo.ca>
Message-ID: <2mmzup8jzz.fsf@starship.python.net>

Evan Jones <ejones@uwaterloo.ca> writes:

> On Jan 31, 2005, at 0:17, Guido van Rossum wrote:
>> The "just kidding" applies to the whole list, right? None of these
>> strike me as good ideas, except for improvements to function argument
>> passing.
>
> Really? You see no advantage to moving to garbage collection, nor
> allowing Python to leverage multiple processor environments? I'd be
> curious to hear your reasons why not.

Obviously, if one could wave a wand and make it so, we would.  The
argument is about whether the cost (in backwards compatibility,
portability, uniprocessor performance, developer time, etc.) outweighs
the benefit.

> My knowledge about garbage collection is weak, but I have read a
> little bit of Hans Boehm's work on garbage collection. For example,
> his "Memory Allocation Myths and Half Truths" presentation
> (http://www.hpl.hp.com/personal/Hans_Boehm/gc/myths.ps) is quite
> interesting. On page 25 he examines reference counting. The biggest
> disadvantage mentioned is that simple pointer assignments end up
> becoming "increment ref count" operations as well, which can "involve
> at least 4 potential memory references." The next page has a
> micro-benchmark that shows reference counting performing very
> poorly.
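The per-assignment cost Boehm describes can be glimpsed from Python itself
(a rough illustration, not a benchmark; note that ``sys.getrefcount``
reports one extra temporary reference from its own argument):

```python
import sys

x = ["some", "object"]
before = sys.getrefcount(x)
y = x                      # a plain assignment also increments x's refcount
after = sys.getrefcount(x)
print(after - before)      # -> 1
```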

Given the current implementation's *extreme* malloc-happiness, I posit
that it would be more-or-less impossible to make any form of
non-copying garbage collector go faster for Python than refcounting.
I may be wrong, but I don't think so, and I have actually thought about
this a little bit :)

The "non-copying" bit is important for backwards compatibility of C
extensions (unless there's something I don't know).

> Not to mention that Python has a garbage collector *anyway,* so
> wouldn't it make sense to get rid of the reference counting?

Here you're confused.  Python's cycle collector depends utterly on
reference counting.

(And what is it with this "let's ditch refcounting and use a garbage
collector" thing that people always wheel out?  Refcounting *is* a
form of garbage collection by most reasonable definitions, esp. when
you add Python's cycle collector).

> My only argument for making Python capable of leveraging multiple
> processor environments is that multithreading seems to be where the
> big performance increases will be in the next few years. I am
> currently using Python for some relatively large simulations, so
> performance is important to me.

I'm sure you're tired of hearing it, but I think processes are your
friend...

Cheers,
mwh

-- 
  It is time-consuming to produce high-quality software. However,
  that should not alone be a reason to give up the high standards
  of Python development.              -- Martin von Loewis, python-dev
From apocalypznow at gmail.com  Mon Jan 31 09:15:58 2005
From: apocalypznow at gmail.com (apocalypznow)
Date: Mon Jan 31 21:10:20 2005
Subject: [Python-Dev] linux executable - how?
Message-ID: <ctkpbu$70c$1@sea.gmane.org>

How can I take my Python scripts and create a Linux executable out of them 
(to be distributed without having to also distribute Python)?

From aahz at pythoncraft.com  Mon Jan 31 22:11:16 2005
From: aahz at pythoncraft.com (Aahz)
Date: Mon Jan 31 22:11:20 2005
Subject: [Python-Dev] linux executable - how?
In-Reply-To: <ctkpbu$70c$1@sea.gmane.org>
References: <ctkpbu$70c$1@sea.gmane.org>
Message-ID: <20050131211116.GA7518@panix.com>

On Mon, Jan 31, 2005, apocalypznow wrote:
>
> How can I take my Python scripts and create a Linux executable out of them 
> (to be distributed without having to also distribute Python)?

python-dev is for discussion of patches and bugs to Python itself.
Please post your question on comp.lang.python.  Thanks!
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Given that C++ has pointers and typecasts, it's really hard to have a serious 
conversation about type safety with a C++ programmer and keep a straight face.
It's kind of like having a guy who juggles chainsaws wearing body armor 
arguing with a guy who juggles rubber chickens wearing a T-shirt about who's 
in more danger."  --Roy Smith, c.l.py, 2004.05.23
From bac at OCF.Berkeley.EDU  Mon Jan 31 23:02:20 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Mon Jan 31 23:02:37 2005
Subject: [Python-Dev] python-dev Summary for 2004-12-16 through 2004-12-31
	[draft]
Message-ID: <41FEAAEC.5080805@ocf.berkeley.edu>

Nice and short summary this time.  Plan to send this off Wednesday or Thursday 
so get corrections in before then.

------------------------------

=====================
Summary Announcements
=====================
You can still `register <http://www.python.org/pycon/2005/register.html>`__ for 
`PyCon`_.  The `schedule of talks`_ is now online.  Jim Hugunin is lined up to 
be the keynote speaker on the first day with Guido being the keynote on 
Thursday.  Once again PyCon looks like it is going to be great.

On a different note, as I am sure you are all aware, I am still about a month 
behind in summaries.  School this quarter has just turned out to be hectic for 
me.  I think it is lack of motivation thanks to having finished my 14 doctoral 
applications just a little over a week ago (and no, that number is not a typo). 
  For the first time in my life I am going to come up with a very regimented 
study schedule that will hopefully let me fit in weekly Python time so that I 
can catch up on summaries.

And this summary is short not because I just wanted to finish it: 2.4 was 
released just before the period this summary covers, so most of the traffic 
was about bug fixes discovered after the release.

.. _PyCon: http://www.pycon.org/
.. _schedule of talks: http://www.python.org/pycon/2005/schedule.html


=======
Summary
=======
-------------
PEP movements
-------------
I introduced a `proto-PEP 
<http://mail.python.org/pipermail/python-dev/2005-January/050753.html>`__ to 
the list on how one can go about changing CPython's bytecode.  It will need 
rewriting once the AST branch is merged into HEAD on CVS.  Plus I need to get a 
PEP number assigned to me.  =)

Contributing threads:
   - `proto-pep: How to change Python's bytecode <>`__

------------------------------------
Handling versioning within a package
------------------------------------
The suggestion of extending import syntax to support explicit version 
importation came up.  The idea was to have something along the lines of 
``import foo version 2, 4`` so that packages could contain several versions 
of themselves and there would be an easy way to specify which version was 
desired.

The idea didn't fly, though.  The main objection was that import-as support 
was all you really needed: ``import foo_2_4 as foo``.  And if you have a ton 
of references to a specific package and don't want to burden yourself with 
explicit imports, you can always do ``import foo_2_4; sys.modules["foo"] = 
foo_2_4`` in a single place before code starts executing.  That can even be 
hidden away by creating a foo.py file that does the above for you.
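A runnable sketch of the ``sys.modules`` trick described above.  ``foo_2_4``
is a hypothetical versioned package name; a stand-in module is created here
so the example is self-contained:

```python
import sys
import types

# Stand-in for a real versioned package named foo_2_4.
foo_2_4 = types.ModuleType("foo_2_4")
foo_2_4.VERSION = (2, 4)
sys.modules["foo_2_4"] = foo_2_4

# The single-place alias: after this, "import foo" anywhere yields foo_2_4,
# because import consults sys.modules before searching for the module.
sys.modules["foo"] = sys.modules["foo_2_4"]

import foo
print(foo.VERSION)  # -> (2, 4)
```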

You can also look at how wxPython handles it at 
http://wiki.wxpython.org/index.cgi/MultiVersionInstalls .

Contributing threads:
   - `Re: [Pythonmac-SIG] The versioning question... <>`__


===============
Skipped Threads
===============
- Problems compiling Python 2.3.3 on Solaris 10 with gcc 3.4.1
- 2.4 news reaches interesting places
      see `last summary`_ for coverage of this thread
- RE: [Python-checkins] python/dist/src/Modules posixmodule.c, 2.300.8.10, 
2.300.8.11
- mmap feature or bug?
- Re: [Python-checkins]	python/dist/src/Pythonmarshal.c, 1.79, 1.80
- Latex problem when trying to build documentation
- Patches: 1 for the price of 10.
- Python for Series 60 released
- Website documentation - link to descriptor information
- Build extensions for windows python 2.4 what are the compiler rules?
- Re: [Python-checkins] python/dist/src setup.py, 1.208, 1.209
- Zipfile needs?
     fake 32-bit unsigned int overflow with ``x = x & 0xFFFFFFFFL`` and signed 
ints with the additional ``if x & 0x80000000L: x -= 0x100000000L`` .
- Re: [Python-checkins] python/dist/src/Mac/OSX	fixapplepython23.py, 1.1, 1.2
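The zipfile overflow trick from the list above, written out for modern
Python (no ``L`` suffix on literals):

```python
def u32(x):
    # Emulate 32-bit unsigned wraparound: keep only the low 32 bits.
    return x & 0xFFFFFFFF

def s32(x):
    # Same, then reinterpret the sign bit (two's complement).
    x &= 0xFFFFFFFF
    if x & 0x80000000:
        x -= 0x100000000
    return x

print(u32(0xFFFFFFFF + 1))  # -> 0
print(s32(0xFFFFFFFF))      # -> -1
```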
From binkertn at umich.edu  Mon Jan 31 21:16:47 2005
From: binkertn at umich.edu (Nathan Binkert)
Date: Mon Jan 31 23:05:39 2005
Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function
	calls)
In-Reply-To: <f0480f69f325f0dd1af7695c993d73ca@redivi.com>
References: <ee2a432c050130125510c7938d@mail.gmail.com>
	<000d01c50744$b2395700$fe26a044@oemcomputer>
	<ca471dc2050130211744a1b76f@mail.gmail.com>
	<E8D78F45-739E-11D9-8570-0003938016AE@uwaterloo.ca>
	<f0480f69f325f0dd1af7695c993d73ca@redivi.com>
Message-ID: <Pine.LNX.4.58.0501311513040.10672@smtp.eecs.umich.edu>

> Wouldn't it be nicer to have a facility that let you send messages
> between processes and manage concurrency properly instead?  You'll need
> most of this anyway to do multithreading sanely, and the benefit to the
> multiple process model is that you can scale to multiple machines, not
> just processors.  For brokering data between processes on the same
> machine, you can use mapped memory if you can't afford to copy it
> around, which gives you basically all the benefits of threads with
> fewer pitfalls.

I don't think this is an answered problem.  There are plenty of
researchers on both sides of this fence.  It has not been proven at all
that threads are a bad model.

http://capriccio.cs.berkeley.edu/pubs/threads-hotos-2003.pdf or even
http://www.python.org/~jeremy/weblog/030912.html
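The message-passing model from the quoted text can be sketched with today's
``multiprocessing`` module (which did not exist at the time): each process
owns its own data and communicates over an explicit channel instead of
sharing state.

```python
from multiprocessing import Pipe, Process

def worker(conn):
    # Receive one request, reply with its square, and exit.
    n = conn.recv()
    conn.send(n * n)
    conn.close()

def square_in_subprocess(n):
    # One-shot request/response over a pipe to a child process.
    parent, child = Pipe()
    p = Process(target=worker, args=(child,))
    p.start()
    parent.send(n)
    result = parent.recv()
    p.join()
    return result

if __name__ == "__main__":
    print(square_in_subprocess(7))  # -> 49
```

The same pattern scales across machines by swapping the pipe for a socket,
which is the benefit the quoted text claims for the process model.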