From rrr at ronadam.com  Thu Nov  1 16:58:32 2007
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 01 Nov 2007 10:58:32 -0500
Subject: [Python-ideas] str(<int>, base=<int>) as complement to int(<str>,
 base=<int>)
In-Reply-To: <47289EA3.7090005@cheimes.de>
References: <fg9jrd$nua$1@ger.gmane.org>
	<d11dcfba0710310738q5401b280i18e6e37977baf4ff@mail.gmail.com>
	<47289996.9000304@ronadam.com> <47289EA3.7090005@cheimes.de>
Message-ID: <4729F7A8.4090803@ronadam.com>



Christian Heimes wrote:
>> Or should it be a function in the math or string module?
> 
> Why do you want to hide the function somewhere instead of putting the
> functionality in an obvious place. In Python 3000 the str() builtin has
> two optional arguments:
> 
>    str(s, [encoding, [errors]])
> 
> Isn't base 2 or base 16 just another kind of encoding? IMHO the
> integers 2, 8 or 16 can be treated as a form of encoding just as
> "ascii" or "latin-1".
> 
> Christian


See Guido's reply about it not being a str() constructor.

Since int types don't have non-special methods, it can't be an int method.

I don't think it's needed often enough to justify making it a global 
builtin function.

That leaves putting it in either the string or math module.


I don't think of it as hiding.  I think of it as grouping, which makes it 
easier to find rather than harder to find.

Cheers,
    Ron


From adam at atlas.st  Thu Nov  1 18:39:58 2007
From: adam at atlas.st (Adam Atlas)
Date: Thu, 1 Nov 2007 13:39:58 -0400
Subject: [Python-ideas] str(<int>, base=<int>) as complement to int(<str>,
	base=<int>)
In-Reply-To: <fg9jrd$nua$1@ger.gmane.org>
References: <fg9jrd$nua$1@ger.gmane.org>
Message-ID: <203E0116-42D5-4059-9659-D5A6527F4E3C@atlas.st>


On 31 Oct 2007, at 06:02, Christian Heimes wrote:
> I know it's not a killer feature but it feels right to have a
> complement. How do you like the idea?
>
> Christian

How about extending the int type's (and other numeric types', perhaps)  
implementation of __format__ (for py3k -- PEP 3101) so that it can  
take an optional format specifier component indicating the base?
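
A minimal sketch of what such a format specifier could look like, assuming the integer presentation types 'b', 'o' and 'x' that later Python releases accept in format(). These cover only bases 2, 8 and 16 (plus 'd' for 10); an arbitrary-base specifier is not assumed here.

```python
# Integer presentation types in a format spec select the output base.
n = 255
as_binary = format(n, "b")    # base 2
as_octal = format(n, "o")     # base 8
as_hex = format(n, "x")       # base 16
padded = "{0:08b}".format(5)  # base 2, zero-padded to width 8

print(as_binary, as_octal, as_hex, padded)
```

The same spec works through str.format(), so width and padding come along without extra work.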





From guido at python.org  Thu Nov  1 19:52:25 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 1 Nov 2007 11:52:25 -0700
Subject: [Python-ideas] str(<int>, base=<int>) as complement to int(<str>,
	base=<int>)
In-Reply-To: <203E0116-42D5-4059-9659-D5A6527F4E3C@atlas.st>
References: <fg9jrd$nua$1@ger.gmane.org>
	<203E0116-42D5-4059-9659-D5A6527F4E3C@atlas.st>
Message-ID: <ca471dc20711011152m492b3187hc6213c174e168e89@mail.gmail.com>

We go over this about once a year. The conclusion is always the same:
there isn't enough use for bases other than 2, 8, 10, 16 to bother
including anything, and these are already covered by bin(), oct(),
str() and hex(). (bin() is in 3.0 and to be backported to 2.6.)
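
As a rough illustration of the point, the four built-ins already round-trip through int(<str>, base=<int>). The to_base() helper below is hypothetical (it is not in the standard library) and only sketches what a caller who needs some other base might write.

```python
import string

DIGITS = string.digits + string.ascii_lowercase  # digit symbols for bases 2..36


def to_base(n, base):
    """Hypothetical helper, not a built-in: render the int n in the given base."""
    if not 2 <= base <= 36:
        raise ValueError("base must be in 2..36")
    if n == 0:
        return "0"
    sign, n = ("-" if n < 0 else ""), abs(n)
    out = []
    while n:
        n, r = divmod(n, base)
        out.append(DIGITS[r])
    return sign + "".join(reversed(out))


n = 2007
# Each built-in pairs with int(text, base); the 0b/0o/0x prefixes are accepted.
round_trips = [int(text, base) == n
               for text, base in [(bin(n), 2), (oct(n), 8), (str(n), 10), (hex(n), 16)]]
print(round_trips, to_base(n, 7))
```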

On 11/1/07, Adam Atlas <adam at atlas.st> wrote:
>
> On 31 Oct 2007, at 06:02, Christian Heimes wrote:
> > I know it's not a killer feature but it feels right to have a
> > complement. How do you like the idea?
> >
> > Christian
>
> How about extending the int type's (and other numeric types', perhaps)
> implementation of __format__ (for py3k -- PEP 3101) so that it can
> take an optional format specifier component indicating the base?
>
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From jimjjewett at gmail.com  Thu Nov  1 20:13:02 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 1 Nov 2007 15:13:02 -0400
Subject: [Python-ideas] str(<int>, base=<int>) as complement to int(<str>,
	base=<int>)
In-Reply-To: <ca471dc20711011152m492b3187hc6213c174e168e89@mail.gmail.com>
References: <fg9jrd$nua$1@ger.gmane.org>
	<203E0116-42D5-4059-9659-D5A6527F4E3C@atlas.st>
	<ca471dc20711011152m492b3187hc6213c174e168e89@mail.gmail.com>
Message-ID: <fb6fbf560711011213h37121b65pd3c586ad49bfcf8a@mail.gmail.com>

On 11/1/07, Guido van Rossum <guido at python.org> wrote:
> We go over this about once a year. The conclusion is always the same:
> there isn't enough use for bases other than 2, 8, 10, 16 to bother
> including anything, and these are already covered by bin(), oct(),
> str() and hex(). (bin() is in 3.0 and to be backported to 2.6.)

Of course, if part of the deal were dropping bin, oct, and hex, that
might be a good trade.  But it may already be too late even for Py3.

-jJ


From greg.ewing at canterbury.ac.nz  Thu Nov  1 23:48:21 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 02 Nov 2007 11:48:21 +1300
Subject: [Python-ideas] str(<int>, base=<int>) as complement to int(<str>,
 base=<int>)
In-Reply-To: <4729F7A8.4090803@ronadam.com>
References: <fg9jrd$nua$1@ger.gmane.org>
	<d11dcfba0710310738q5401b280i18e6e37977baf4ff@mail.gmail.com>
	<47289996.9000304@ronadam.com> <47289EA3.7090005@cheimes.de>
	<4729F7A8.4090803@ronadam.com>
Message-ID: <472A57B5.8010102@canterbury.ac.nz>

Ron Adam wrote:
> 
> That leaves putting it in either the string or math module.

I don't think it belongs in the math module, because
that's supposed to correspond 1-1 with what's in the
C math library.

--
Greg


From bborcic at gmail.com  Fri Nov  9 15:39:59 2007
From: bborcic at gmail.com (Boris Borcic)
Date: Fri, 09 Nov 2007 15:39:59 +0100
Subject: [Python-ideas] x )= f   as shorthand for   x=f(x)
Message-ID: <fh1rhm$ui$1@ger.gmane.org>


Title says it all. Got used to += et al. My mind often expects augmented 
assignment syntax to exist uniformly for whatever transform.

If I am not mistaken, python syntax doesn't permit augmented assignment 
operators to sit between parens so that )= wouldn't risk confusing quick 
machine- or eye-scans to match parens.

Cheers, BB



From jimjjewett at gmail.com  Fri Nov  9 16:20:11 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 9 Nov 2007 10:20:11 -0500
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <fh1rhm$ui$1@ger.gmane.org>
References: <fh1rhm$ui$1@ger.gmane.org>
Message-ID: <fb6fbf560711090720m293abcb4l30654967365bcef8@mail.gmail.com>

On 11/9/07, Boris Borcic <bborcic at gmail.com> wrote:

> Title says it all. Got used to += et al. My mind often expects
> augmented assignment syntax to exist uniformly for whatever
> transform.

Agreed.

Whether it is worth the costs is a different question.  I'm not sure
it is, and I'm sure it isn't with this particular syntax.

> If I am not mistaken, python syntax doesn't permit augmented
> assignment operators to sit between parens so that )= wouldn't
> risk confusing quick machine- or eye-scans to match parens.

There are plenty of tools (and plenty of eyes, including mine) that
don't use the full ruleset.

A parenthesis inside a string has no syntactic meaning.  In practice,
it still messes up some syntax colorings.

    (1, 2, """3, 4)

""", 5)

I don't think there is any reason to encourage the use of unmatched
parentheses for any purpose.
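
A small check of the tuple example above, assuming only the standard ast module: the parser treats the parenthesis inside the triple-quoted string as ordinary text, even though a naive paren matcher would not.

```python
import ast

# The tuple from the example, written as source text.
source = '(1, 2, """3, 4)\n\n""", 5)'
value = ast.literal_eval(source)

# Four elements; the third is a string that happens to contain ")".
print(len(value), repr(value[2]))
```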

-jJ


From fredrik.johansson at gmail.com  Fri Nov  9 16:24:22 2007
From: fredrik.johansson at gmail.com (Fredrik Johansson)
Date: Fri, 9 Nov 2007 16:24:22 +0100
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <fh1rhm$ui$1@ger.gmane.org>
References: <fh1rhm$ui$1@ger.gmane.org>
Message-ID: <3d0cebfb0711090724p5fecb5c5pc23d44db8a4f0c84@mail.gmail.com>

On Nov 9, 2007 3:39 PM, Boris Borcic <bborcic at gmail.com> wrote:
>
> Title says it all. Got used to += et al. My mind often expects augmented
> assignment syntax to exist uniformly for whatever transform.
>
> If I am not mistaken, python syntax doesn't permit augmented assignment
> operators to sit between parens so that )= wouldn't risk confusing quick
> machine- or eye-scans to match parens.

Would the statement

( x )= f

represent the ordinary assignment x=f or would it become a syntax error?

Fredrik


From eduardo.padoan at gmail.com  Fri Nov  9 16:22:30 2007
From: eduardo.padoan at gmail.com (Eduardo O. Padoan)
Date: Fri, 9 Nov 2007 13:22:30 -0200
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <fh1rhm$ui$1@ger.gmane.org>
References: <fh1rhm$ui$1@ger.gmane.org>
Message-ID: <dea92f560711090722w4684133en649681906fec274d@mail.gmail.com>

On Nov 9, 2007 12:39 PM, Boris Borcic <bborcic at gmail.com> wrote:
>
> Title says it all. Got used to += et al. My mind often expects augmented
> assignment syntax to exist uniformly for whatever transform.
>
> If I am not mistaken, python syntax doesn't permit augmented assignment
> operators to sit between parens so that )= wouldn't risk confusing quick
> machine- or eye-scans to match parens.
>

Bizarre syntax. Close-parens should close something. Also, all it saves
is 1 char.



-- 
http://www.advogato.org/person/eopadoan/
Bookmarks: http://del.icio.us/edcrypt


From bborcic at gmail.com  Fri Nov  9 18:24:49 2007
From: bborcic at gmail.com (Boris Borcic)
Date: Fri, 09 Nov 2007 18:24:49 +0100
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <3d0cebfb0711090724p5fecb5c5pc23d44db8a4f0c84@mail.gmail.com>
References: <fh1rhm$ui$1@ger.gmane.org>
	<3d0cebfb0711090724p5fecb5c5pc23d44db8a4f0c84@mail.gmail.com>
Message-ID: <fh256p$45r$1@ger.gmane.org>

Fredrik Johansson wrote:
> On Nov 9, 2007 3:39 PM, Boris Borcic <bborcic at gmail.com> wrote:
>> Title says it all. Got used to += et al. My mind often expects augmented
>> assignment syntax to exist uniformly for whatever transform.
>>
>> If I am not mistaken, python syntax doesn't permit augmented assignment
>> operators to sit between parens so that )= wouldn't risk confusing quick
>> machine- or eye-scans to match parens.
> 
> Would the statement
> 
> ( x )= f
> 
> represent the ordinary assignment x=f or would it become a syntax error?

Ah, indeed I almost itemized my remark about current python syntax with a (1), 
to add a "(2) makes closing parens before an augmented assignment (part of) a 
superfluous construct".

I'd make it a syntax error, to answer your question. I'd be interested in 
examples out of the "wild".

Cheers, BB

> 
> Fredrik



From bborcic at gmail.com  Fri Nov  9 18:29:43 2007
From: bborcic at gmail.com (Boris Borcic)
Date: Fri, 09 Nov 2007 18:29:43 +0100
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <fb6fbf560711090720m293abcb4l30654967365bcef8@mail.gmail.com>
References: <fh1rhm$ui$1@ger.gmane.org>
	<fb6fbf560711090720m293abcb4l30654967365bcef8@mail.gmail.com>
Message-ID: <fh25fv$53p$1@ger.gmane.org>

Jim Jewett wrote:
> On 11/9/07, Boris Borcic <bborcic at gmail.com> wrote:
> 
>> Title says it all. Got used to += et al. My mind often expects
>> augmented assignment syntax to exist uniformly for whatever
>> transform.
> 
> Agreed.
> 
> Whether it is worth the costs is a different question.  I'm not sure
> it is, and I'm sure it isn't with this particular syntax.
> 
>> If I am not mistaken, python syntax doesn't permit augmented
>> assignment operators to sit between parens so that )= wouldn't
>> risk confusing quick machine- or eye-scans to match parens.
> 
> There are plenty of tools (and plenty of eyes, including mine) that
> don't use the full ruleset.
> 
> A parenthesis inside a string has no syntactic meaning.  In practice,
> it still messes up some syntax colorings.
> 
>     (1, 2, """3, 4)
> 
> """, 5)

Point was, in a syntactically correct program, the proposed operator can not 
occur /at all/ inside the span of an opened parenthesis, so this type of 
confusion isn't possible.

BB
> 
> I don't think there is any reason to encourage the use of unmatched
> parentheses for any purpose.
> 
> -jJ



From bborcic at gmail.com  Fri Nov  9 18:51:35 2007
From: bborcic at gmail.com (Boris Borcic)
Date: Fri, 09 Nov 2007 18:51:35 +0100
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <3d0cebfb0711090724p5fecb5c5pc23d44db8a4f0c84@mail.gmail.com>
References: <fh1rhm$ui$1@ger.gmane.org>
	<3d0cebfb0711090724p5fecb5c5pc23d44db8a4f0c84@mail.gmail.com>
Message-ID: <fh26p0$9f7$1@ger.gmane.org>

Fredrik Johansson wrote:
> On Nov 9, 2007 3:39 PM, Boris Borcic <bborcic at gmail.com> wrote:
>> Title says it all. Got used to += et al. My mind often expects augmented
>> assignment syntax to exist uniformly for whatever transform.
>>
>> If I am not mistaken, python syntax doesn't permit augmented assignment
>> operators to sit between parens so that )= wouldn't risk confusing quick
>> machine- or eye-scans to match parens.
> 
> Would the statement
> 
> ( x )= f
> 
> represent the ordinary assignment x=f or would it become a syntax error?

Ah, and what about

(x,y)=f

- more likely to already exist in the wild, isn't it ?

Well, if ')=' was an augmented assignment operator,
I'd say

(x,y)=f

should parse as a destructuring assignment as it already does
while

(x)=f

should become a syntax error. I admit it's debatable, of course. I think a case 
could be made in terms of lookahead tokens in favor of that solution (all other 
things equal).
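
For reference, a short sketch using the standard ast module shows how current Python parses these forms: both parenthesized targets are accepted today, while the proposed operator is a syntax error. The helper name target_kind is made up for this example.

```python
import ast


def target_kind(source):
    """Return the AST node type name of the first assignment target."""
    stmt = ast.parse(source).body[0]
    return type(stmt.targets[0]).__name__


kinds = {src: target_kind(src) for src in ["(x, y) = f", "(x) = f", "( x )= f"]}

try:
    ast.parse("x )= f")
    proposed_parses = True
except SyntaxError:
    proposed_parses = False

print(kinds, proposed_parses)
```

So making (x)=f an error would change the meaning of code that is currently valid.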

Cheers, BB



From bborcic at gmail.com  Fri Nov  9 19:00:18 2007
From: bborcic at gmail.com (Boris Borcic)
Date: Fri, 09 Nov 2007 19:00:18 +0100
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <dea92f560711090722w4684133en649681906fec274d@mail.gmail.com>
References: <fh1rhm$ui$1@ger.gmane.org>
	<dea92f560711090722w4684133en649681906fec274d@mail.gmail.com>
Message-ID: <fh279b$baa$1@ger.gmane.org>

Eduardo O. Padoan wrote:
> On Nov 9, 2007 12:39 PM, Boris Borcic <bborcic at gmail.com> wrote:
>> Title says it all. Got used to += et al. My mind often expects augmented
>> assignment syntax to exist uniformly for whatever transform.
>>
>> If I am not mistaken, python syntax doesn't permit augmented assignment
>> operators to sit between parens so that )= wouldn't risk confusing quick
>> machine- or eye-scans to match parens.
>>
> 
> Bizarre syntax. Close-parens should close something. Also, all it saves
> is 1 char.

The typical motivating use case is the same as for other augmented assignments.

Just as

a[<complex_expression>] += n

saves both the typing and the computation of a <complex_expression>
over

a[<complex_expression>] = a[<complex_expression>] + n

and a temporary variable assignment over

temp = <complex_expression>
a[temp]=a[temp]+n

so would

a[<complex_expression>] )= f

save over

a[<complex_expression>] = f(a[<complex_expression>])

etc. More than "just 1 char", anyway.
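
Without new syntax, the same saving can be sketched with a small helper. The name update and the expensive_key function are hypothetical, chosen only for this illustration; the point is that the subscript expression is evaluated once.

```python
def update(container, key, func):
    """Hypothetical helper: container[key] = func(container[key]).

    The caller evaluates the key expression once and passes the result in.
    """
    container[key] = func(container[key])
    return container[key]


calls = []


def expensive_key():
    """Stand-in for <complex_expression>; records each evaluation."""
    calls.append(1)
    return "total"


a = {"total": 41}
update(a, expensive_key(), lambda v: v + 1)
print(a, len(calls))
```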

BB





From steven.bethard at gmail.com  Fri Nov  9 19:07:42 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri, 9 Nov 2007 11:07:42 -0700
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <fh1rhm$ui$1@ger.gmane.org>
References: <fh1rhm$ui$1@ger.gmane.org>
Message-ID: <d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>

On Nov 9, 2007 7:39 AM, Boris Borcic <bborcic at gmail.com> wrote:
> Title says it all. Got used to += et al. My mind often expects augmented
> assignment syntax to exist uniformly for whatever transform.

I'm not really a Guido channeler, but I'd guess this has about a 0%
chance of ever making it into Python.

Function calls in Python are indicated by () following the function
name.  Your proposal puts the parentheses (or one of them) *before*
the function name. Breaking the consistency here seems like an
*extremely* bad idea.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From george.sakkis at gmail.com  Fri Nov  9 19:16:26 2007
From: george.sakkis at gmail.com (George Sakkis)
Date: Fri, 9 Nov 2007 13:16:26 -0500
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>
References: <fh1rhm$ui$1@ger.gmane.org>
	<d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>
Message-ID: <91ad5bf80711091016s732f163asf0ac533e7e7841ac@mail.gmail.com>

On Nov 9, 2007 7:39 AM, Boris Borcic <bborcic at gmail.com> wrote:

> Title says it all. Got used to += et al. My mind often expects augmented
> assignment syntax to exist uniformly for whatever transform.

And the "most inane proposal in python-ideas" award goes to... ;-)


From bborcic at gmail.com  Fri Nov  9 19:33:00 2007
From: bborcic at gmail.com (Boris Borcic)
Date: Fri, 09 Nov 2007 19:33:00 +0100
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>
References: <fh1rhm$ui$1@ger.gmane.org>
	<d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>
Message-ID: <fh296l$hqi$1@ger.gmane.org>

Steven Bethard wrote:
> On Nov 9, 2007 7:39 AM, Boris Borcic <bborcic at gmail.com> wrote:
>> Title says it all. Got used to += et al. My mind often expects augmented
>> assignment syntax to exist uniformly for whatever transform.
> 
> I'm not really a Guido channeler, but I'd guess this has about a 0%
> chance of ever making it into Python.
> 
> Function calls in Python are indicated by () following the function
> name.  Your proposal puts the parentheses (or one of them) *before*
> the function name. Breaking the consistency here seems like an
> *extremely* bad idea.


I contend that   x )= f   captures some perfume of the invariant you mention, 
although I admit there is no comparably simple formula for the relaxed invariant 
(if indeed it exists).

Note that current python syntax requires any ) to follow a ( that it balances,
so that's not one but two rules broken in coordination.

(-1)*(-1)==(+1)-ly yours,

Boris Borcic
--
What happened to our chief humorist and python zen master, BTW ?



From guido at python.org  Fri Nov  9 19:40:28 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 9 Nov 2007 10:40:28 -0800
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <fh296l$hqi$1@ger.gmane.org>
References: <fh1rhm$ui$1@ger.gmane.org>
	<d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>
	<fh296l$hqi$1@ger.gmane.org>
Message-ID: <ca471dc20711091040id31bed9o1731b8a787f4b582@mail.gmail.com>

Boris, give it up. That syntax is never going to fly. If you have to
ask why, you're just not cut out to be a language designer.

On Nov 9, 2007 10:33 AM, Boris Borcic <bborcic at gmail.com> wrote:
> Steven Bethard wrote:
> > On Nov 9, 2007 7:39 AM, Boris Borcic <bborcic at gmail.com> wrote:
> >> Title says it all. Got used to += et al. My mind often expects augmented
> >> assignment syntax to exist uniformly for whatever transform.
> >
> > I'm not really a Guido channeler, but I'd guess this has about a 0%
> > chance of ever making it into Python.
> >
> > Function calls in Python are indicated by () following the function
> > name.  Your proposal puts the parentheses (or one of them) *before*
> > the function name. Breaking the consistency here seems like an
> > *extremely* bad idea.
>
>
> I contend that   x )= f   captures some perfume of the invariant you mention,
> although I admit there is no comparably simple formula for the relaxed invariant
> (if indeed it exists).
>
> Note that current python syntax requires any ) to follow a ( that it balances,
> so that's not one but two rules broken in coordination.
>
> (-1)*(-1)==(+1)-ly yours,
>
> Boris Borcic
> --
> What happened to our chief humorist and python zen master, BTW ?



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From bborcic at gmail.com  Fri Nov  9 19:54:03 2007
From: bborcic at gmail.com (Boris Borcic)
Date: Fri, 09 Nov 2007 19:54:03 +0100
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <91ad5bf80711091016s732f163asf0ac533e7e7841ac@mail.gmail.com>
References: <fh1rhm$ui$1@ger.gmane.org>	<d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>
	<91ad5bf80711091016s732f163asf0ac533e7e7841ac@mail.gmail.com>
Message-ID: <fh2ae4$ltl$1@ger.gmane.org>

George Sakkis wrote:
> On Nov 9, 2007 7:39 AM, Boris Borcic <bborcic at gmail.com> wrote:
> 
>> Title says it all. Got used to += et al. My mind often expects augmented
>> assignment syntax to exist uniformly for whatever transform.
> 
> And the "most inane proposal in python-ideas" award goes to... ;-)

[Is the name of "sarkkasm" (sarkkism ?) the one obvious way to make your remark 
escape appropriate qualification as...]

But please be precise, are you saying

(1) that it is inane to suggest that <code> x=f(x) </code> has enough in common 
with say <code> x=x%n </code> that special syntax paralleling the latter's 
shorthand <code> x%=n </code> could or would make sense for the former ?

(2) that the proposed choice of special syntax is "most inane".

In case you mean only (2), please back your claim with some facts, by proposing 
"less inane" special syntax.

Cheers, BB



From adam at atlas.st  Fri Nov  9 20:24:46 2007
From: adam at atlas.st (Adam Atlas)
Date: Fri, 9 Nov 2007 14:24:46 -0500
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <91ad5bf80711091016s732f163asf0ac533e7e7841ac@mail.gmail.com>
References: <fh1rhm$ui$1@ger.gmane.org>
	<d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>
	<91ad5bf80711091016s732f163asf0ac533e7e7841ac@mail.gmail.com>
Message-ID: <38D4AA4E-2708-4E4F-BBCC-381B62F2961B@atlas.st>


On 9 Nov 2007, at 13:16, George Sakkis wrote:

> On Nov 9, 2007 7:39 AM, Boris Borcic <bborcic at gmail.com> wrote:
>
>> Title says it all. Got used to += et al. My mind often expects  
>> augmented
>> assignment syntax to exist uniformly for whatever transform.
>
> And the "most inane proposal in python-ideas" award goes to... ;-)

I can top that. Instead of "x )= f", I propose one of the following:

-   x $?$%$?666= f
-   x =^_^= f
-   x ?= f
-   x 8======D f

From bborcic at gmail.com  Fri Nov  9 20:37:29 2007
From: bborcic at gmail.com (Boris Borcic)
Date: Fri, 09 Nov 2007 20:37:29 +0100
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <ca471dc20711091040id31bed9o1731b8a787f4b582@mail.gmail.com>
References: <fh1rhm$ui$1@ger.gmane.org>	<d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>	<fh296l$hqi$1@ger.gmane.org>
	<ca471dc20711091040id31bed9o1731b8a787f4b582@mail.gmail.com>
Message-ID: <fh2cvi$uo8$1@ger.gmane.org>

Guido van Rossum wrote:
> Boris, give it up. That syntax is never going to fly. If you have to
> ask why, you're just not cut out to be a language designer.

Guido,

I did not intend to pose as a language designer. I just bumped for the nth time 
on a corner of the language and came up with the closest approximation to a 
solution I could invent, expecting the (actual and potential) language designers 
of the forum to find a better solution if any can be dreamed up. Maybe I was 
mistaken about this newsgroup's purpose, but imho playing the devil's advocate 
is a perfectly honorable manner to push ideas (as opposed to designs).

I must admit I wasn't expecting the discussion to rely so quickly on involving 
my character. In conclusion, I guess I'm warranted to take this to mean "we can 
dream up no appropriate syntax".

Regards,

Boris
---
PS, FYI: a notation born from letting parens live independent lives,
and one that did indeed fly: http://en.wikipedia.org/wiki/Bra-ket_notation


> 
> On Nov 9, 2007 10:33 AM, Boris Borcic <bborcic at gmail.com> wrote:
>> Steven Bethard wrote:
>>> On Nov 9, 2007 7:39 AM, Boris Borcic <bborcic at gmail.com> wrote:
>>>> Title says it all. Got used to += et al. My mind often expects augmented
>>>> assignment syntax to exist uniformly for whatever transform.
>>> I'm not really a Guido channeler, but I'd guess this has about a 0%
>>> chance of ever making it into Python.
>>>
>>> Function calls in Python are indicated by () following the function
>>> name.  Your proposal puts the parentheses (or one of them) *before*
>>> the function name. Breaking the consistency here seems like an
>>> *extremely* bad idea.
>>
>> I contend that   x )= f   captures some perfume of the invariant you mention,
>> although I admit there is no comparably simple formula for the relaxed invariant
>> (if indeed it exists).
>>
>> Note that current python syntax requires any ) to follow a ( that it balances,
>> so that's not one but two rules broken in coordination.
>>
>> (-1)*(-1)==(+1)-ly yours,
>>
>> Boris Borcic
>> --
>> What happened to our chief humorist and python zen master, BTW ?
> 
> 
> 



From bborcic at gmail.com  Fri Nov  9 20:42:35 2007
From: bborcic at gmail.com (Boris Borcic)
Date: Fri, 09 Nov 2007 20:42:35 +0100
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <38D4AA4E-2708-4E4F-BBCC-381B62F2961B@atlas.st>
References: <fh1rhm$ui$1@ger.gmane.org>	<d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>	<91ad5bf80711091016s732f163asf0ac533e7e7841ac@mail.gmail.com>
	<38D4AA4E-2708-4E4F-BBCC-381B62F2961B@atlas.st>
Message-ID: <fh2d94$uo8$2@ger.gmane.org>

Adam Atlas wrote:

>> And the "most inane proposal in python-ideas" award goes to... ;-)
> 
> I can top that. Instead of "x )= f", I propose one of the following:
> 
> -   x $?$%$?666= f
> -   x =^_^= f
> -   x ?= f
> -   x 8======D f

That's self-contradictory, or "most" doesn't denote a superlative.




From bwinton at latte.ca  Fri Nov  9 21:05:32 2007
From: bwinton at latte.ca (Blake Winton)
Date: Fri, 09 Nov 2007 15:05:32 -0500
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <fh2cvi$uo8$1@ger.gmane.org>
References: <fh1rhm$ui$1@ger.gmane.org>	<d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>	<fh296l$hqi$1@ger.gmane.org>	<ca471dc20711091040id31bed9o1731b8a787f4b582@mail.gmail.com>
	<fh2cvi$uo8$1@ger.gmane.org>
Message-ID: <4734BD8C.6050600@latte.ca>

Some people wrote:
 >>>> Function calls in Python are indicated by () following the function name.
 >>> I contend that "x )= f" captures some perfume of the invariant you mention,

But not enough of it.  A syntax of "x ()= f" would seem to have more chance of 
being accepted.  But I would still give it no more than 0.1% chance, based on 
the potential confusion between it and "x() = f"...

> In conclusion, I guess I'm warranted to take this to mean "we can 
> dream up no appropriate syntax".

If I were you, I would take it more as "that suggestion is too Functional (or 
perhaps just too confusing) for Python."  (If you're looking for a language that 
has filed all the corners off, might I suggest Scheme.  No, seriously, I'm not 
making a parenthesis joke here.)

Later,
Blake.



From tjreedy at udel.edu  Fri Nov  9 21:11:16 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 9 Nov 2007 15:11:16 -0500
Subject: [Python-ideas] x )= f   as shorthand for   x=f(x)
References: <fh1rhm$ui$1@ger.gmane.org>
Message-ID: <fh2et1$phn$1@ger.gmane.org>


"Boris Borcic" <bborcic at gmail.com> wrote in 
message news:fh1rhm$ui$1 at ger.gmane.org...
|
| Title says it all. Got used to += et al. My mind often expects augmented
| assignment syntax to exist uniformly for whatever transform.

I think the analogy can be improved.

x += y # abbreviates
x = x + y # which could have been defined to have been written
x = +(x,y) # and which usually *is* equivalent to x = type(x).__add__(x,y)

Hence by analogy, I would rewrite
x = f(x,y) # as
x f= y # ;-)

Making the obvious generalization to n params, and specializing to one, 
gives

x f=
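
The equivalences in the analogy can be checked with the standard operator module. This is only a sketch of existing behaviour: "x f= y" itself is not valid Python, so the last lines spell it out as x = f(x, y) with max standing in for f.

```python
import operator

x, y = 3, 4
# x + y, the function form, and the explicit method call all agree for ints.
same_sum = (x + y) == operator.add(x, y) == type(x).__add__(x, y)

# For mutable types, += may update in place instead of rebinding.
items = [1, 2]
alias = items
items += [3]  # list.__iadd__ extends the existing list

# "x f= y" written out by hand, with f = max.
x = max(x, y)
print(same_sum, alias, x)
```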

tjr





From jimjjewett at gmail.com  Fri Nov  9 22:12:55 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 9 Nov 2007 16:12:55 -0500
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <fh2cvi$uo8$1@ger.gmane.org>
References: <fh1rhm$ui$1@ger.gmane.org>
	<d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>
	<fh296l$hqi$1@ger.gmane.org>
	<ca471dc20711091040id31bed9o1731b8a787f4b582@mail.gmail.com>
	<fh2cvi$uo8$1@ger.gmane.org>
Message-ID: <fb6fbf560711091312qae43ac1w5ec231cb1c02c145@mail.gmail.com>

Boris,

I'm posting this publicly because you aren't the first to feel this
way, so I think an answer should be archived.

On 11/9/07, Boris Borcic <bborcic at gmail.com> wrote:
> Guido van Rossum wrote:
> > Boris, give it up. That syntax is never going to fly. If you have to
> > ask why, you're just not cut out to be a language designer.

> I did not intend to pose as a language designer.

Suggesting a change in python is acting (in a small way) as a language designer.

> came up with the closest approximation to a solution I could invent,

Which is fine.

The catch is that no one -- not even Guido -- gets everything right
the first time.

There is a natural desire to just tweak the proposal to work, or even
to explain why things are already OK.  For a good proposal, you need
to do this to make it great.  Unfortunately, that turns out to be
running in circles for the proposals that -- like most proposals --
turn out to be dead ends.

So you need to be willing to step back and figure out

    (1)  How important the problem really is.
    (2)  How expensive the proposed solutions really are.

> I must admit I wasn't expecting the discussion to rely so quickly on
> involving my character.

I don't think that was anyone's intent.

I suspect you were thinking of lines like:

> > That syntax is never going to fly. If you have to
> > ask why, you're just not cut out to be a language designer.

These don't mean you're bad person; they just mean that you don't yet
know how to answer those two questions the same way Guido (for
example) would.

> In conclusion, I guess I'm warranted to take this to mean "we can
> dream up no appropriate syntax".

Yes, but there is also a question about whether to do it at all.  Remember that

    x = f(x)

is one step of reduce -- and reduce is something Guido wants to take
back out of the language because, in practice, it is too confusing.
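
To make the connection concrete, here is a minimal sketch assuming functools.reduce (where reduce ended up in Python 3): repeating x = f(x) over a list of functions is a fold. The steps list is invented for the example.

```python
from functools import reduce

# A made-up pipeline of one-argument transforms.
steps = [str.strip, str.lower, lambda s: s.replace(" ", "_")]

# Explicit loop: one x = f(x) per step.
x = "  Hello World "
for f in steps:
    x = f(x)

# The same computation expressed as a single reduce call.
same = reduce(lambda acc, f: f(acc), steps, "  Hello World ")
print(x, same)
```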

(a)  Is this operation frequent enough to be worth a syntactic
shortcut?  Would it actually make the code easier to read?
(b)  Is the sort of code that uses this operation something that
should be encouraged?  Or is making it hard a *good* thing that steers
people towards other idioms?

> PS, FYI: a notation born from letting parens live independent lives,
> and one that did indeed fly: http://en.wikipedia.org/wiki/Bra-ket_notation

The question isn't whether it is possible, but whether it is worth the
cost.  The costs are different for physics and for a generic
programming language -- and different still for Python in particular.

-jJ


From g.brandl at gmx.net  Fri Nov  9 23:40:49 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 09 Nov 2007 23:40:49 +0100
Subject: [Python-ideas] x )= f   as shorthand for   x=f(x)
In-Reply-To: <fh2et1$phn$1@ger.gmane.org>
References: <fh1rhm$ui$1@ger.gmane.org> <fh2et1$phn$1@ger.gmane.org>
Message-ID: <fh2nlh$jjf$1@ger.gmane.org>

Terry Reedy schrieb:
> "Boris Borcic" <bborcic at gmail.com> wrote in 
> message news:fh1rhm$ui$1 at ger.gmane.org...
> |
> | Title says it all. Got used to += et al. My mind often expects augmented
> | assignment syntax to exist uniformly for whatever transform.
> 
> I think the analogy can be improved.
> 
> x += y # abbreviates
> x = x + y # which could have been defined to have been written
> x = +(x,y) # and which usually *is* equivalent to x = type(x).__add__(x,y)

Hah, I have the solution!

x ?= f

unicode-ly yours,
Georg


-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From ntoronto at cs.byu.edu  Fri Nov  9 23:51:26 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Fri, 09 Nov 2007 15:51:26 -0700
Subject: [Python-ideas] Raw strings return compiled regexps
Message-ID: <4734E46E.2050709@cs.byu.edu>

It seems like every time somebody has issues with raw strings, the 
canonical answer is "don't use them for that, use them for regular 
expressions".

What if they just returned regular expression objects? As in

   r'<some long exp>'.match('<my string>')

That would guarantee they didn't get abused for anything else. It would 
break a lot of code, too. :)

Quick question, if someone has the time: is there any way to test 
equivalence of regular expressions? If we had intersection and an 
emptiness test (both of which are easy in the theoretical construct, but 
harder to do in practice), it'd be easy. I may be able to fake 
intersection using lookahead and such, but there's no emptiness test 
that I know of.
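
For what it's worth, a bounded check is easy to sketch even without a real
emptiness test. The helper below (agree_up_to is a made-up name, and the
alphabet and length limit are arbitrary assumptions) compares two patterns on
every string up to a given length. It can only turn up a counterexample, never
prove equivalence, and it relies on fullmatch(), which is newer than this
thread:

```python
import itertools
import re


def agree_up_to(pattern_a, pattern_b, alphabet="ab", max_len=6):
    """Return the first short string on which the two patterns disagree,
    or None if they agree on every string up to max_len characters."""
    a = re.compile(pattern_a)
    b = re.compile(pattern_b)
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            s = "".join(chars)
            # Compare whole-string acceptance, as in the theoretical construct.
            if bool(a.fullmatch(s)) != bool(b.fullmatch(s)):
                return s
    return None
```

A None result only means "no difference found within the bound", so this is a
testing aid rather than the equivalence test asked about.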

Thanks in advance,
Neil


From stephen at xemacs.org  Sat Nov 10 00:26:21 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 10 Nov 2007 08:26:21 +0900
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <fh2cvi$uo8$1@ger.gmane.org>
References: <fh1rhm$ui$1@ger.gmane.org>
	<d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>
	<fh296l$hqi$1@ger.gmane.org>
	<ca471dc20711091040id31bed9o1731b8a787f4b582@mail.gmail.com>
	<fh2cvi$uo8$1@ger.gmane.org>
Message-ID: <87d4ujm25e.fsf@uwakimon.sk.tsukuba.ac.jp>

Boris Borcic writes:

 > I must admit I wasn't expecting the discussion to rely so quickly
 > on involving my character.

Some people have natural talent for a particular kind of design, some
don't.  If one doesn't, it's no big deal, s/he still can contribute,
even to design---but coming up with original ideas is likely to waste
her/his time and that of others.  (I don't say it's impossible to
develop it as a skill, but it would take real work.)

Why not take Guido's comment literally, "*if* you don't have it," and
think about the "litmus test" he described?  (Ie, think about why this
proposal is unattractive.)  Of course, there is an implication that
you *don't* have it, but it will be better all around if you ignore
that implication, and leave it an open question as long as you want to
contribute in this way.

 > In conclusion, I guess I'm warranted to take this to mean "we can
 > dream up no appropriate syntax".

I wouldn't say "impossible".  However, the senior developers who have
spoken up clearly think that your proposal (a) is not an improvement
over x = f(x) in most use cases (and IMO often would be worse, because
x += y expresses accumulation of y, while x = y expresses replacement)
and (b) seems to have very few, if any, appropriate use cases.  So
"why bother?" is the message.

 > PS,FYI : a notation borne from letting parens live independent lives,
 > and indeed could fly http://en.wikipedia.org/wiki/Bra-ket_notation

As I understand it, the bra-ket notation arose in physics because both
the bra part and the ket part make sense as operators, but only in the
lefthand role for the bra, and righthand role for the ket.  So they
don't really live independent lives, any more than the dx and the dy
do in conventional calculus.

However, in your syntax you do (c) lose the kind of implied symmetry
that the bra-ket and infinitesimal notations have.  You could "fix"
that by using the notation "apply-and-assign" x ()= f, but that
syntax already has a meaning in python, and runs even more forcefully
into STeVe's criticism that parens are a postfix operator, not infix.

Note that I myself can come up with criticisms like (a), (b), and (c)
but to the best of my knowledge I've never invented any useful
syntax.<wink>

I-always-wanted-to-be-a-language-designer-too-ly y'rs,



From lists at cheimes.de  Sat Nov 10 02:37:53 2007
From: lists at cheimes.de (Christian Heimes)
Date: Sat, 10 Nov 2007 02:37:53 +0100
Subject: [Python-ideas] x )= f   as shorthand for   x=f(x)
In-Reply-To: <fh2nlh$jjf$1@ger.gmane.org>
References: <fh1rhm$ui$1@ger.gmane.org> <fh2et1$phn$1@ger.gmane.org>
	<fh2nlh$jjf$1@ger.gmane.org>
Message-ID: <fh321g$erp$1@ger.gmane.org>

Georg Brandl wrote:
 > Hah, I have the solution!
> 
> x ?= f
> 
> unicode-ly yours,

Georg has even written a Python Enhancement Proposal about the topic:
http://www.python.org/dev/peps/pep-3117/ It should be hard to get the
idea into it ...

*just kidding*

Christian



From greg at krypto.org  Sat Nov 10 08:04:40 2007
From: greg at krypto.org (Gregory P. Smith)
Date: Fri, 9 Nov 2007 23:04:40 -0800
Subject: [Python-ideas] Raw strings return compiled regexps
In-Reply-To: <4734E46E.2050709@cs.byu.edu>
References: <4734E46E.2050709@cs.byu.edu>
Message-ID: <52dc1c820711092304y2d88c403q26e1d4e8a1cbf4fd@mail.gmail.com>

Interesting idea.  Rather than breaking a lot of code you could have it be a
subclass of string that also adds the regular expression object methods.
Trivial to prototype such a type:

import re
class rstr(str):
  def __init__(self, x):
    str.__init__(self, x)
    self.__re = None
  def match(self, *args, **kwargs):
    if not self.__re:
      self.__re = re.compile(self)
    return self.__re.match(*args, **kwargs)
  def search(self, *args, **kwargs):
    if not self.__re:
      self.__re = re.compile(self)
    return self.__re.search(*args, **kwargs)
  def set_re_flags(self, flags):
    if self.__re:
      raise RuntimeError('flags may only be set once, before the first use '
                         'as a regular expression.')
    self.__re = re.compile(self, flags)
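
A variant of the same prototype, as a sketch only: as far as I know, newer
Pythons reject the extra argument passed to str.__init__, so this version does
its setup in __new__ instead. The class name regex_str and the flags keyword
are assumptions made for illustration, not part of the prototype above.

```python
import re


class regex_str(str):
    """A str that compiles itself as a regular expression on first use."""

    def __new__(cls, value, flags=0):
        self = super().__new__(cls, value)
        self._flags = flags      # flags are fixed at construction time
        self._compiled = None    # filled in lazily by _regex()
        return self

    def _regex(self):
        if self._compiled is None:
            self._compiled = re.compile(self, self._flags)
        return self._compiled

    def match(self, *args, **kwargs):
        return self._regex().match(*args, **kwargs)

    def search(self, *args, **kwargs):
        return self._regex().search(*args, **kwargs)
```

Instances still behave as ordinary strings (upper(), slicing and so on), so
existing code that treats raw strings as plain text would keep working.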


Regardless, count me as +0 on the concept.  It seems neat but also smells
fishy.

-gps

On 11/9/07, Neil Toronto <ntoronto at cs.byu.edu> wrote:
>
> It seems like every time somebody has issues with raw strings, the
> canonical answer is "don't use them for that, use them for regular
> expressions".
>
> What if they just returned regular expression objects? As in
>
>    r'<some long exp>'.match('<my string>')
>
> That would guarantee they didn't get abused for anything else. It would
> break a lot of code, too. :)
>
> Quick question, if someone has the time: is there any way to test
> equivalence of regular expressions? If we had intersection and an
> emptiness test (both of which are easy in the theoretical construct, but
> harder to do in practice), it'd be easy. I may be able to fake
> intersection using lookahead and such, but there's no emptiness test
> that I know of.
>
> Thanks in advance,
> Neil
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From lists at cheimes.de  Sun Nov 11 21:43:27 2007
From: lists at cheimes.de (Christian Heimes)
Date: Sun, 11 Nov 2007 21:43:27 +0100
Subject: [Python-ideas] Enable tab completion for interactive sessions
Message-ID: <fh7phf$tbh$1@ger.gmane.org>

Hello fellow Pythonistas!

Python has a very useful feature a lot of people don't know about. It's
tab completion for the interactive shell.
http://docs.python.org/lib/module-rlcompleter.html

Tab completion is very useful for introspection and quick tests in an
interactive shell. rlcompleter isn't enabled by default - and it
shouldn't be. But I'd like to add a cmd line option and an env var to load
and enable the rlcompleter in an interactive session.

The right place to enable the feature is in

Modules/main.c:460
if ((Py_InspectFlag || (command == NULL && filename == NULL && module ==
NULL)) && isatty(fileno(stdin))) {

    code run when -i is given or neither command nor filename nor module
is set and stdin is an interactive terminal.

}

At the moment the code block loads just the readline module. I'd also
like to load the rlcompleter module and invoke
readline.parse_and_bind("tab: complete") there.

Options:
(1) always enable tab completion for interactive shells w/o a command,
module and filename.
(2) only enable rlcompleter when the -i flag or PYTHONINTERACTIVE env
var is set.
(3) add a new command flag and env var to enable the completer when (1)
or (2) is true

Christian



From adam at atlas.st  Mon Nov 12 05:52:43 2007
From: adam at atlas.st (Adam Atlas)
Date: Sun, 11 Nov 2007 23:52:43 -0500
Subject: [Python-ideas] Pause (sort of a 'deep yield'?)
Message-ID: <ED43715F-2994-4BBD-A5BD-2E8ECE7AA384@atlas.st>

Generator-based coroutines are great, but I've thought of some  
interesting cases where it would help to be able to sort of yield to  
an outer scope (beyond the parent scope) while being able to resume.  
I'm thinking this would make the most sense as a kind of exception,  
with an added "resume" method which would resume execution at the  
point at which the exception was raised. (They'd also have a throw()  
method for continuing execution but raising an exception, and a  
close() method, as with generators in Python >= 2.5.)

Here's an example to demonstrate what I'm talking about:

def a():
     print 'blah'
     p = pause 7 # like using `yield` as an expression
               # but it raises "PauseException" (or whatever)
     print p
     return (p, 123)

def b():
     return a()

try:
     print b()
except PauseException, e:
     print e.value
     e.resume(3)

#prints:
#  blah
#  7
#  3
#  (3, 123)

Normally you'd subclass PauseException so you can catch specific known  
instances of pausing in your application. If no outer scope can handle  
a pause, then the program should exit as with any other exception.
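
For comparison, here is roughly how the same toy example looks with plain
generators, as a sketch assuming a Python new enough to have `yield from`
(which postdates this thread). The difference the proposal is after shows up in
b(): every intermediate frame has to cooperate explicitly, whereas with `pause`
b() would stay an ordinary function. The name run() is only for illustration.

```python
def a():
    print('blah')
    p = yield 7           # suspend here, handing 7 to whoever drives us
    print(p)
    return (p, 123)


def b():
    # The intermediate frame must delegate explicitly.
    return (yield from a())


def run():
    gen = b()
    value = next(gen)     # runs until the yield inside a()
    print(value)
    try:
        gen.send(3)       # resume a() with p = 3
    except StopIteration as stop:
        print(stop.value)
        return stop.value
```

Calling run() prints the same four lines as the `pause` version above.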

For more practical use cases, I'm mainly thinking about asynchronous  
programming, things like Twisted; I see a lot of interesting  
possibilities there. But here's a simpler example... Suppose we have  
WSGI 2.0, and, as expected, it is rid of start_response() and the  
resulting write() callable. And suppose we want to write an adaptor to  
allow WSGI 1.0 applications to be used as WSGI 2.0 applications. We  
want to do this by creating a write() which pauses and sends the value  
to an outer wrapper which interleaves any write()en output with the  
WSGI 1.0 app's returned app_iter into a single generator. It would go  
something like this:

class StartRespPause (PauseException): pass
class WritePause (PauseException): pass
class wsgi_adaptor (object):
     def __init__(self, app):
         self.app = app

     def _write(self, data):
         pause WritePause(data)
         # Interrupts this frame and returns control to the first outer
         # frame that catches WritePause.

         # If the `pause` statement/expression is given a PauseException
         # instance, it raises that; if it is given a PauseException
         # subclass, it raises that with None; if it gets another value
         # `v`, it raises PauseException(v).

     def _start_response(self, status, response_headers, exc_info=None):
         # [...irrelevant exc_info handling stuff here...]
         pause StartRespPause((status, response_headers))
         return self._write

     def _app_iter(self, environ):
         try:
             for v in self.app(environ, self._start_response):
                 yield v
         except WritePause, e:
             yield e.value
             e.resume()
             # This part of the syntax is perhaps a little troublesome --
             # the body of a `try` block might cause multiple pauses, so
             # an `except` block catching a PauseException subclass has
             # the possibility of running multiple times. This is the
             # correct behaviour, but it is somewhat counterintuitive
             # given the huge precedent for at most one `except` block to
             # execute, once, for a given `try` block. Perhaps there could
             # be some syntax other than `except`, but of course we'd
             # rather keep the number of reserved words down.

     def __call__(self, environ):
         # [...whatever other bridging is needed...]
         try:
             app_iter = self._app_iter(environ)
         except StartRespPause, e:
             status, response_headers = e.value
             e.resume()
         return (status, response_headers, app_iter)

Thinking about environments like Twisted, it seems to me that this  
could make Deferreds/callbacks [almost?] entirely unnecessary. PEP 342  
(Coroutines via Enhanced Generators) speaks of using "a simple  
co-routine scheduler or 'trampoline function' [which] would let  
coroutines 'call' each other without blocking -- a tremendous boon for  
asynchronous applications", but I think pauses would simplify this  
even further; it would allow these matters to be mostly invisible  
outside the innermost potentially blocking functions. Basically, it  
"would let coroutines 'call' each other without blocking", but now  
without the quotes around the word 'call'. :)

The PEP gives the simple example of "data = (yield  
nonblocking_read(my_socket, nbytes))", but with pauses, we could  
forget about yields -- we'd be able to program almost exactly as with  
traditional blocking operations. "data = read(my_socket, nbytes)".  
Only potentially blocking functions would have to be concerned with  
pausing; read() would pause to an outer scheduler/trampoline/  
Twisted-type reactor, which, when data was available, would resume the paused  
read() function (giving it the data similarly to generator.send()),  
which would then return the value to the calling function exactly as a  
synchronous function would.


From guido at python.org  Mon Nov 12 17:51:40 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 12 Nov 2007 08:51:40 -0800
Subject: [Python-ideas] Enable tab completion for interactive sessions
In-Reply-To: <fh7phf$tbh$1@ger.gmane.org>
References: <fh7phf$tbh$1@ger.gmane.org>
Message-ID: <ca471dc20711120851q36c86bdfg154b2f9d5bf94dba@mail.gmail.com>

You can already enable this by copying those few lines into your
$PYTHONSTARTUP file. People who are truly into completion should be
using iPython anyway. :-)
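
For reference, a startup file along these lines might look like the sketch
below. The function name enable_tab_completion is made up for illustration,
and the libedit branch is an assumption about readline builds that are backed
by libedit rather than GNU readline.

```python
def enable_tab_completion():
    """Bind Tab to completion if readline is available; return True on success."""
    try:
        import readline
        import rlcompleter  # noqa: F401 -- importing installs the completer
    except ImportError:
        return False  # e.g. a build without readline support
    if "libedit" in (readline.__doc__ or ""):
        # Assumption: libedit-backed builds use a different binding syntax.
        readline.parse_and_bind("bind ^I rl_complete")
    else:
        readline.parse_and_bind("tab: complete")
    return True


enable_tab_completion()
```

Saving this to a file and pointing $PYTHONSTARTUP at it gives completion in
every interactive session without touching Modules/main.c.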

On Nov 11, 2007 12:43 PM, Christian Heimes <lists at cheimes.de> wrote:
> Hello fellow Pythonistas!
>
> Python has a very useful feature a lot of people don't know about. It's
> tab completion for the interactive shell.
> http://docs.python.org/lib/module-rlcompleter.html
>
> Tab completion is very useful for introspection and quick tests in an
> interactive shell. rlcompleter isn't enabled by default - and it
> shouldn't be. But I'd like to add a cmd line option and an env var to load
> and enable the rlcompleter in an interactive session.
>
> The right place to enable the feature is in
>
> Modules/main.c:460
> if ((Py_InspectFlag || (command == NULL && filename == NULL && module ==
> NULL)) && isatty(fileno(stdin))) {
>
>     code run when -i is given or neither command nor filename nor module
> is set and stdin is an interactive terminal.
>
> }
>
> At the moment the code block loads just the readline module. I'd also
> like to load the rlcompleter module and invoke
> readline.parse_and_bind("tab: complete") there.
>
> Options:
> (1) always enable tab completion for interactive shells w/o a command,
> module and filename.
> (2) only enable rlcompleter when the -i flag or PYTHONINTERACTIVE env
> var is set.
> (3) add a new command flag and env var to enable the completer when (1)
> or (2) is true
>
> Christian



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From arno at marooned.org.uk  Mon Nov 12 20:03:16 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Mon, 12 Nov 2007 19:03:16 +0000
Subject: [Python-ideas] Pause (sort of a 'deep yield'?)
In-Reply-To: <ED43715F-2994-4BBD-A5BD-2E8ECE7AA384@atlas.st>
References: <ED43715F-2994-4BBD-A5BD-2E8ECE7AA384@atlas.st>
Message-ID: <B3DF9A63-A6D8-4AF5-8D5F-969548AD92D3@marooned.org.uk>


On 12 Nov 2007, at 04:52, Adam Atlas wrote:

> Generator-based coroutines are great, but I've thought of some
> interesting cases where it would help to be able to sort of yield to
> an outer scope (beyond the parent scope) while being able to resume.
> I'm thinking this would make the most sense as a kind of exception,
> with an added "resume" method which would resume execution at the
> point at which the exception was raised. (They'd also have a throw()
> method for continuing execution but raising an exception, and a
> close() method, as with generators in Python >= 2.5.)
>
> Here's an example to demonstrate what I'm talking about:
>
> def a():
>    print 'blah'
>    p = pause 7 # like using `yield` as an expression
>              # but it raises "PauseException" (or whatever)
>    print p
>    return (p, 123)
>
> def b():
>    return a()
>
> try:
>    print b()
> except PauseException, e:
>    print e.value
>    e.resume(3)
>
> #prints:
> #  blah
> #  7
> #  3
> #  (3, 123)


It seems to me it has the full power of call/cc & co.
It would allow turning the clock back to any previous state of an  
execution stack (unless I misunderstand what you mean by 'pause').   
Here is a simple example:

def getstate():
    pause
    return

try:
    getstate()
except PauseException, here:
    pass
# code line 1
# code line 2
...
here.resume() # This line takes us back to code line 1

So the whole stack should be saved each time a pause happens (unless a  
stackless approach is adopted).

-- 
Arnaud




From ntoronto at cs.byu.edu  Tue Nov 13 08:23:40 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Tue, 13 Nov 2007 00:23:40 -0700
Subject: [Python-ideas] Required to call superclass __init__
Message-ID: <473950FC.10202@cs.byu.edu>

I'm not talking about having the runtime call the superclass __init__ 
for you, as I am aware of the arguments over it and I am against it 
myself. I'm talking about checking whether it's been called within a 
subclass's own __init__.

There are many kinds of objects with such complex underpinnings or 
initialization that leaving out the call to superclass __init__ would be 
disastrous. There are two situations I can think of where enforcing its 
invocation could be useful: a corporate environment and a teaching 
environment. (I've done the former and I'm working in the latter.)

If someone forgets to call a superclass __init__, problems may not show 
up until much later. Even if they do show up immediately, it's almost 
never obvious what the real problem is, especially to someone who is new 
to programming or is working on someone else's code.

I've got a working prototype metaclass and class instance 
(require_super) and decorator (super_required). Decorating a 
require_super method with @super_required will require any subclass 
override to call its superclass method, or it throws a TypeError upon 
exiting the subclass method. Here's how it works on the __init__ problem:

class A(require_super):
     @super_required
     def __init__(self):
         pass

a = A()  # No problem


class B(A):
     def __init__(self):
         super(B, self).__init__()

b = B()  # No problem


class C(B):
     def __init__(self):
         pass  # this could be a problem

c = C()  # TypeError: C.__init__: no super call


class D(C):
     def __init__(self):
         super(D, self).__init__()

d = D()  # TypeError: C.__init__: no super call


As long as A.__init__ is eventually called, it doesn't raise a TypeError.

There's not much magic involved (as metaclasses go), just explicit and 
implicit method wrappers, and no crufty-looking magic words in the 
subclasses. Not calling the superclass method results in immediate 
runtime feedback. I've tested this on a medium-small, real-life 
single-inheritance hierarchy and it seems to work just fine. (I *think* 
it should work with multiple inheritance.)
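
To make the idea concrete without the full prototype, here is a much simpler
sketch that covers __init__ only. It is not the require_super/super_required
implementation: it uses a metaclass __call__ that checks a flag after
construction, and the names RequireBaseInit and _base_init_ran are invented
for this example.

```python
class RequireBaseInit(type):
    """Metaclass that verifies the root __init__ ran during construction."""

    def __call__(cls, *args, **kwargs):
        obj = super().__call__(*args, **kwargs)
        # The root class sets this flag; if it is missing, some override
        # in the chain never called up to it.
        if not getattr(obj, "_base_init_ran", False):
            raise TypeError("%s.__init__: no super call" % cls.__name__)
        return obj


class A(metaclass=RequireBaseInit):
    def __init__(self):
        self._base_init_ran = True


class B(A):
    def __init__(self):
        super().__init__()


class C(B):
    def __init__(self):
        pass  # forgets the super call
```

Here B() succeeds and C() raises TypeError. Unlike the prototype, this version
names the class being instantiated rather than the class whose override skipped
the call, and it cannot be applied to arbitrary methods.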

Two questions:

1. Is the original problem (missed superclass method calls) big enough 
to warrant language, runtime, or library support for a similar solution?

2. Does anybody but me think this is a great idea?

Neil



From phd at phd.pp.ru  Tue Nov 13 10:12:46 2007
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Tue, 13 Nov 2007 12:12:46 +0300
Subject: [Python-ideas] Required to call superclass __init__
In-Reply-To: <473950FC.10202@cs.byu.edu>
References: <473950FC.10202@cs.byu.edu>
Message-ID: <20071113091246.GC15166@phd.pp.ru>

On Tue, Nov 13, 2007 at 12:23:40AM -0700, Neil Toronto wrote:
> I've got a working prototype metaclass and class instance 
> (require_super) and decorator (super_required).

   Chicken and egg problem, in my eyes. If the user is clever enough to use
the class and the decorator isn't she clever enough to call inherited
__init__?

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.


From jimjjewett at gmail.com  Tue Nov 13 15:36:41 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 13 Nov 2007 09:36:41 -0500
Subject: [Python-ideas] Required to call superclass __init__
In-Reply-To: <20071113091246.GC15166@phd.pp.ru>
References: <473950FC.10202@cs.byu.edu> <20071113091246.GC15166@phd.pp.ru>
Message-ID: <fb6fbf560711130636i2d4abb80gd038c20127bbc222@mail.gmail.com>

On 11/13/07, Oleg Broytmann <phd at phd.pp.ru> wrote:
> On Tue, Nov 13, 2007 at 12:23:40AM -0700, Neil Toronto wrote:
> > I've got a working prototype metaclass and class instance
> > (require_super) and decorator (super_required).

Is this restricted to __init__ (and __new__?) or could it be used on any method?

Is there (and should there be?) a way around it, by catching the
TypeError?  By creating a decoy object to call super on?

>    Chicken and egg problem, in my eyes. If the user is clever enough to use
> the class and the decorator isn't she clever enough to call inherited
> __init__?

It may not be the same user.

A library or framework writer would create the base class and use the
decorator to (somewhat) ensure that subclasses meet the full interface
requirements.

A subclass writer should call the super.__init__ because it is in the
API, but Neil's metaclass makes it easier to debug if they forget.

-jJ


From ntoronto at cs.byu.edu  Tue Nov 13 17:09:31 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Tue, 13 Nov 2007 09:09:31 -0700
Subject: [Python-ideas] Required to call superclass __init__
In-Reply-To: <fb6fbf560711130636i2d4abb80gd038c20127bbc222@mail.gmail.com>
References: <473950FC.10202@cs.byu.edu> <20071113091246.GC15166@phd.pp.ru>
	<fb6fbf560711130636i2d4abb80gd038c20127bbc222@mail.gmail.com>
Message-ID: <4739CC3B.1090205@cs.byu.edu>

Jim Jewett wrote:
> On 11/13/07, Oleg Broytmann <phd at phd.pp.ru> wrote:
>> On Tue, Nov 13, 2007 at 12:23:40AM -0700, Neil Toronto wrote:
>>> I've got a working prototype metaclass and class instance
>>> (require_super) and decorator (super_required).
> 
> Is this restricted to __init__ (and __new__?) or could it be used on any method?

It can be used on any method.

> Is there (and should there be?) a way around it, by catching the
> TypeError?  By creating a decoy object to call super on?

Definitely should be, and I made one because I plan on using this 
myself. :) Currently, you can set self.<method>_super = True (or 
self.__<method>__super = True) instead of doing the superclass method 
call. (Yes, it currently litters the class instance with flags, but 
that's an implementation detail.) If you're not going to call the 
superclass method, you need to state that explicitly.

class C(B):
      def __init__(self):
          self.__init__super = True

c = C()  # No problem


I've fiddled with the idea of having a redecoration with @super_required 
remove the requirement from the current method but place it back on 
future overrides. Maybe a @super_not_required could remove it completely.

>>    Chicken and egg problem, in my eyes. If the user is clever enough to use
>> the class and the decorator isn't she clever enough to call inherited
>> __init__?
> 
> It may not be the same user.
> 
> A library or framework writer would create the base class and use the
> decorator to (somewhat) ensure that subclasses meet the full interface
> requirements.
> 
> A subclass writer should call the super.__init__ because it is in the
> API, but Neil's metaclass makes it easier to debug if they forget.

Exactly so.

Neil


From mark at qtrac.eu  Wed Nov 14 09:07:16 2007
From: mark at qtrac.eu (Mark Summerfield)
Date: Wed, 14 Nov 2007 08:07:16 +0000
Subject: [Python-ideas] python3: subtle change to new input()
Message-ID: <200711140807.16677.mark@qtrac.eu>

Hi,

In Python 3, input() returns an empty string in two situations: blank
lines and EOF. Here's a little program that uses it:

    print("enter numbers one per line; blank line to quit")
    count = 0
    total = 0
    while True:
        line = input()
        if not line: # EOF or blank line
            break
        n = int(line)
        total += n
        count += 1
    print("count =", count, "total =", total)

If input() returned None on EOF you could write this:

    print("enter numbers one per line; EOF (^D or ^Z) to quit")
    count = 0
    total = 0
    while True:
        line = input()
        if line is None: # EOF
            break
        elif not line: # Blank line
            continue
        n = int(line)
        total += n
        count += 1
    print("count =", count, "total =", total)

The advantage of this second approach is that you can accept blank
lines, which is often more convenient if using < on the command line to
read stdin. Furthermore, if you replaced input() with the None-returning
one in the first example, it would work just the same as before. So I
think that returning None on EOF gives a subtle improvement without
breaking much.

-- 
Mark Summerfield, Qtrac Ltd., www.qtrac.eu



From ntoronto at cs.byu.edu  Wed Nov 14 11:38:05 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Wed, 14 Nov 2007 03:38:05 -0700
Subject: [Python-ideas] Required to call superclass __init__
In-Reply-To: <4739CC3B.1090205@cs.byu.edu>
References: <473950FC.10202@cs.byu.edu>
	<20071113091246.GC15166@phd.pp.ru>	<fb6fbf560711130636i2d4abb80gd038c20127bbc222@mail.gmail.com>
	<4739CC3B.1090205@cs.byu.edu>
Message-ID: <473AD00D.2070502@cs.byu.edu>

Since I know you're all dying to see the code... ;)

This works for instance methods, classmethods, staticmethods (if cls or 
self is the first parameter, as in __new__), and probably most decorated 
methods.

Current Issues I Can Think Of:
  - @classmethod overrides that don't redecorate with @classmethod always
    raise TypeError (maybe not such a bad thing)
  - exceptions can cause entries in the __supercalls__ set to accumulate
    unboundedly (can be fixed)

Anyway, here's a concrete implementation. Is the "missed super call" 
problem big or annoying enough to warrant having language, runtime, or 
library support for a similar solution?



import threading
import types


class _supercall_set(object):
     def __init__(self, *args, **kwargs):
         # Removing threading.local() should make this faster
         # but not thread-safe
         self.loc = threading.local()
         self.loc.s = set(*args, **kwargs)

     def add(self, key): self.loc.s.add(key)
     def discard(self, key): self.loc.s.discard(key)
     def __contains__(self, key): return self.loc.s.__contains__(key)
     def __repr__(self): return self.loc.s.__repr__()


def _unwrap_rewrap(func):
     '''For supported types (classmethod, staticmethod, function),
     returns the actual function and a function to re-wrap it, if
     necessary. Raises TypeError if func's type isn't supported.'''

     if isinstance(func, classmethod):
         return func.__get__(func).im_func, type(func)
     elif isinstance(func, staticmethod):
         return func.__get__(func), type(func)
     elif isinstance(func, types.FunctionType):
         return func, lambda func: func

     raise TypeError('unsupported type %s' % type(func))


def super_required(func):
     '''
     Marks a method as requiring subclass overrides to call it, either
     directly or via a super() call. Works with all undecorated
     methods, classmethods, staticmethods (fragile: only if 'cls' or
     'self' is the first parameter) including __new__, and probably
     most other decorated methods. Correct operation is guaranteed only
     when the method is in a subclass of require_super.

     If a super_required override has a superclass method that is also
     super_required, the override will not be required to call the
     superclass method, either directly or via a super() call.

     The superclass call requirement can be cancelled for a method and
     methods of the same name in all future subclasses using the
     super_not_required decorator.

     The implementation should be as thread-safe as the classes it's
     used in. Recursion should work as long as the last, innermost
     call calls the superclass method. (It's usually best to avoid it.)
     This is not robust to method injection, but then again, what is?

     Examples:

         class A(require_super):
             @super_required
             def __init__(self): pass

         class B(A):
             def __init__(self): pass

         b = B()  # TypeError: B.__init__: no super call
                  # B.__init__ needs a super(B, self).__init__()

         class C(require_super):
             @super_required
             @classmethod      # order of decorators doesn't matter
             def clsmeth(cls): pass

         class D(C):
             @classmethod
             def clsmeth(cls): pass

         d = D()
         d.clsmeth()  # TypeError: D.clsmeth: no super call
                      # D.clsmeth needs a super(D, cls).clsmeth()
     '''

     func, rewrap = _unwrap_rewrap(func)
     name = func.func_name

     def super_wrapper(self_or_cls, *args, **kwargs):
         retval = func(self_or_cls, *args, **kwargs)
         # Flag that the super call happened
         self_or_cls.__supercalls__.discard((id(self_or_cls), name))
         return retval

     super_wrapper.func_name = func.func_name
     super_wrapper.func_doc = func.func_doc
     super_wrapper.__super_required__ = True  # Pass it down

     return rewrap(super_wrapper)


def super_not_required(func):
     '''Marks a method as no longer requiring subclass overrides to
     call it. This is only meaningful for methods in subclasses of
     require_super.'''

     func.__super_required__ = False
     return func


def _get_sub_wrapper(func, class_name, method_name):
     '''Returns a wrapper function that:
     1. Adds key to __supercalls__
     2. Calls the wrapped function
     3. Checks for key in __supercalls__ - if there, raises TypeError'''

     def sub_wrapper(self_or_cls, *args, **kwargs):
         key = (id(self_or_cls), method_name)
         self_or_cls.__supercalls__.add(key)

         retval = func(self_or_cls, *args, **kwargs)

         if key not in self_or_cls.__supercalls__:
             return retval

         self_or_cls.__supercalls__.discard(key)

         raise TypeError("%s.%s: no super call" %
                 (class_name, method_name))

     sub_wrapper.func_name = func.func_name
     sub_wrapper.func_doc = func.func_doc
     sub_wrapper.__super_required__ = True  # Pass it down

     return sub_wrapper


class _require_super_meta(type):
     def __new__(typ, cls_name, bases, dct):
         # Search through all attributes
         for method_name, func in dct.items():
             try:
                 func, rewrap = _unwrap_rewrap(func)
             except TypeError:
                 continue  # unsupported type

             if hasattr(func, '__super_required__'):
                 continue  # decorated - don't wrap it again

             # See if a base class's method is __super_required__
             for base in bases:
                 try:
                     if getattr(base, method_name).__super_required__:
                         break
                 except AttributeError:
                     pass  # not there or no __super_required__
             else:
                 continue  # outer loop

             # Wrap up the function
             newfunc = _get_sub_wrapper(func, cls_name, method_name)
             dct[method_name] = rewrap(newfunc)

         return type.__new__(typ, cls_name, bases, dct)


class require_super(object):
     '''Inheriting from require_super makes super_required and
     super_not_required decorators work.'''

     __metaclass__ = _require_super_meta

     # This will be visible to classes and instances
     __supercalls__ = _supercall_set()




From g.brandl at gmx.net  Wed Nov 14 14:43:03 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 14 Nov 2007 14:43:03 +0100
Subject: [Python-ideas] python3: subtle change to new input()
In-Reply-To: <200711140807.16677.mark@qtrac.eu>
References: <200711140807.16677.mark@qtrac.eu>
Message-ID: <fhettl$sc5$1@ger.gmane.org>

Mark Summerfield schrieb:
> Hi,
> 
> In Python 3, input() returns an empty string in two situations: blank
> lines and EOF.

Could this be a platform issue? Here, on Linux, input() raises EOFError
on EOF.

Georg



From mark at qtrac.eu  Wed Nov 14 15:00:39 2007
From: mark at qtrac.eu (Mark Summerfield)
Date: Wed, 14 Nov 2007 14:00:39 +0000
Subject: [Python-ideas] python3: subtle change to new input()
In-Reply-To: <fhettl$sc5$1@ger.gmane.org>
References: <200711140807.16677.mark@qtrac.eu> <fhettl$sc5$1@ger.gmane.org>
Message-ID: <200711141400.39876.mark@qtrac.eu>

On 2007-11-14, Georg Brandl wrote:
> Mark Summerfield schrieb:
> > Hi,
> >
> > In Python 3, input() returns an empty string in two situations: blank
> > lines and EOF.
>
> Could this be a platform issue? Here, on Linux, input() raises EOFError
> on EOF.

Sorry, you're quite right...

-- 
Mark Summerfield, Qtrac Ltd., www.qtrac.eu



From guido at python.org  Wed Nov 14 15:28:34 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 14 Nov 2007 06:28:34 -0800
Subject: [Python-ideas] python3: subtle change to new input()
In-Reply-To: <200711141400.39876.mark@qtrac.eu>
References: <200711140807.16677.mark@qtrac.eu> <fhettl$sc5$1@ger.gmane.org>
	<200711141400.39876.mark@qtrac.eu>
Message-ID: <ca471dc20711140628n2a73739di1e4382caa6f28691@mail.gmail.com>

On Nov 14, 2007 6:00 AM, Mark Summerfield <mark at qtrac.eu> wrote:
> On 2007-11-14, Georg Brandl wrote:
> > Mark Summerfield schrieb:
> > > Hi,
> > >
> > > In Python 3, input() returns an empty string in two situations: blank
> > > lines and EOF.
> >
> > Could this be a platform issue? Here, on Linux, input() raises EOFError
> > on EOF.
>
> Sorry, you're quite right...

Mark, did it return "" on your platform? Then please file a bug. I
can't quite tell if that's the case or if you simply misread the docs.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From jimjjewett at gmail.com  Wed Nov 14 15:35:09 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 14 Nov 2007 09:35:09 -0500
Subject: [Python-ideas] Required to call superclass __init__
In-Reply-To: <473AD00D.2070502@cs.byu.edu>
References: <473950FC.10202@cs.byu.edu> <20071113091246.GC15166@phd.pp.ru>
	<fb6fbf560711130636i2d4abb80gd038c20127bbc222@mail.gmail.com>
	<4739CC3B.1090205@cs.byu.edu> <473AD00D.2070502@cs.byu.edu>
Message-ID: <fb6fbf560711140635u46b0f016g260f3185e89d4107@mail.gmail.com>

On 11/14/07, Neil Toronto <ntoronto at cs.byu.edu> wrote:

> Current Issues I Can Think Of:

> ... Is the "missed super call"
> problem big or annoying enough to warrant having language,
> runtime, or library support for a similar solution?

recipe, yes.

Library or more?  I'm not sure -- and I don't think this is ready yet.
 It feels too complicated, as though there may still be plenty of
simplifications that should happen before it gets frozen.  I don't yet
see what those simplifications should actually be, but maybe someone
else will if you publish and wait long enough.

-jJ


From lists at cheimes.de  Wed Nov 14 16:57:46 2007
From: lists at cheimes.de (Christian Heimes)
Date: Wed, 14 Nov 2007 16:57:46 +0100
Subject: [Python-ideas] python3: subtle change to new input()
In-Reply-To: <fhettl$sc5$1@ger.gmane.org>
References: <200711140807.16677.mark@qtrac.eu> <fhettl$sc5$1@ger.gmane.org>
Message-ID: <473B1AFA.7080900@cheimes.de>

Georg Brandl wrote:
> Mark Summerfield schrieb:
>> Hi,
>>
>> In Python 3, input() returns an empty string in two situations: blank
>> lines and EOF.
> 
> Could this be a platform issue? Here, on Linux, input() raises EOFError
> on EOF.

I think it's more likely a subtle difference between platforms:

On Linux
>>> r = input() <CTRL+D>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
EOFError
>>> r = input() <CTRL+Z>
[1]+  Stopped                 ./python
$ fg 1
./python
>>>

On Windows
>>> r = input() <CTRL+D>
^D <ENTER>
>>> r
'\x04'
>>> r = input() <CTRL+Z>
^Z <ENTER>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
EOFError

Christian



From mark at qtrac.eu  Wed Nov 14 17:37:56 2007
From: mark at qtrac.eu (Mark Summerfield)
Date: Wed, 14 Nov 2007 16:37:56 +0000
Subject: [Python-ideas] python3: subtle change to new input()
In-Reply-To: <ca471dc20711140628n2a73739di1e4382caa6f28691@mail.gmail.com>
References: <200711140807.16677.mark@qtrac.eu>
	<200711141400.39876.mark@qtrac.eu>
	<ca471dc20711140628n2a73739di1e4382caa6f28691@mail.gmail.com>
Message-ID: <200711141637.56229.mark@qtrac.eu>

On 2007-11-14, Guido van Rossum wrote:
> On Nov 14, 2007 6:00 AM, Mark Summerfield <mark at qtrac.eu> wrote:
> > On 2007-11-14, Georg Brandl wrote:
> > > Mark Summerfield schrieb:
> > > > Hi,
> > > >
> > > > In Python 3, input() returns an empty string in two situations: blank
> > > > lines and EOF.
> > >
> > > Could this be a platform issue? Here, on Linux, input() raises EOFError
> > > on EOF.
> >
> > Sorry, you're quite right...
>
> Mark, did it return "" on your platform? Then please file a bug. I
> can't quite tell if that's the case or if you simply misread the docs.

It isn't a Python 3 bug. I confused myself with my tests. Sorry!

And the docs are perfectly okay... well, apart from "stripping a
trailing newline". On Unices that's fine but I don't know if Windows
consoles actually send \r\n or whatever, in which case, assuming input()
does the right cross-platform thing, maybe "stripping the trailing line
termination character(s)" would be more accurate.

(What went wrong: My little program worked fine when I used it
interactively. But then I ran it using a file of data redirected from
stdin, that didn't produce an EOFError. But the reason was that my test
file had a blank line in it, so the program correctly broke out of the
while loop at that point and stopped reading, so never reached EOF. Once
I removed the blank line the program correctly terminated with an
unhandled EOFError.)
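
A minimal sketch of the kind of loop described above, assuming the program
stops on either a blank line or end of input. The names read_lines and
fake_input are hypothetical, not taken from the original program; fake_input
only stands in for input() so the two cases can be tried without a console.

```python
def read_lines(read_line=input):
    """Collect lines until a blank line or end of input.

    Returns (lines, reason), where reason is "blank" or "eof".
    """
    lines = []
    while True:
        try:
            line = read_line()
        except EOFError:
            # End of input: input() raises EOFError rather than returning "".
            return lines, "eof"
        if line == "":
            # A genuinely empty line, e.g. a blank line in a redirected file.
            return lines, "blank"
        lines.append(line)


def fake_input(data):
    """Hypothetical stand-in for input() that replays a list of lines."""
    pending = iter(data)

    def read_line():
        try:
            return next(pending)
        except StopIteration:
            # Mimic input() at end of file.
            raise EOFError from None

    return read_line


# A blank line in the data ends the loop before EOF is ever reached.
print(read_lines(fake_input(["a", "", "b"])))
# Without the blank line, the loop runs until EOFError.
print(read_lines(fake_input(["a", "b"])))
```

Returning the reason alongside the lines makes the two stopping conditions
distinguishable, which is the distinction that was easy to miss in the test
with redirected stdin.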

-- 
Mark Summerfield, Qtrac Ltd., www.qtrac.eu



From bborcic at gmail.com  Wed Nov 14 17:55:23 2007
From: bborcic at gmail.com (Boris Borcic)
Date: Wed, 14 Nov 2007 17:55:23 +0100
Subject: [Python-ideas] x )= f as shorthand for x=f(x)
In-Reply-To: <fb6fbf560711091312qae43ac1w5ec231cb1c02c145@mail.gmail.com>
References: <fh1rhm$ui$1@ger.gmane.org>	<d11dcfba0711091007ydd51549vac327f4c3e07dd8@mail.gmail.com>	<fh296l$hqi$1@ger.gmane.org>	<ca471dc20711091040id31bed9o1731b8a787f4b582@mail.gmail.com>	<fh2cvi$uo8$1@ger.gmane.org>
	<fb6fbf560711091312qae43ac1w5ec231cb1c02c145@mail.gmail.com>
Message-ID: <fhf9bt$bq6$1@ger.gmane.org>

Jim Jewett wrote:
> Boris,
> 
> I'm posting this publicly because you aren't the first to feel this
> way, so I think an answer should be archived. [...]

Ah, thanks for caring, Jim. And for your nice explanations.

Stephen J. Turnbull wrote:
[...]
 >
 > Why not take Guido's comment literally, "*if* you don't have it," and
 > think about the "litmus test" he described?  (Ie, think about why this
 > proposal is unattractive.)

It's like a judge silencing an advocate by saying "It's no, and if you can't 
plead the other side's view now that it's over, this means you don't have what 
it takes to be judge". Now the competent advocate is deferential to the judge 
and in general won't dream he could replace the judge any more than he would 
ignore any judge's simple demand for silence. But he will nevertheless recognize 
that the test the judge proposes is one by which to recognize a competent 
advocate foremost, and a competent judge only subsidiarily if ever.

IOW, deciding given pros and cons isn't the same as listing them. And if courts 
tend to distribute the role of listing the pros, that of listing the cons, and 
that of deciding, to three distinct persons or parties, it's not without good 
reasons, imo. And...  I've a "good" enough personal history of driving myself 
into undecidable dilemmas, thanks.

The above is what I was first tempted to reply in short to Guido, but felt it 
was rather OT, so I settled on a shortcut. 'nough said.

[...]
 > I-always-wanted-to-be-a-language-designer-too-ly y'rs,

But-I-never-really-did-ly y'rs,

Boris





From lists at cheimes.de  Wed Nov 14 18:29:52 2007
From: lists at cheimes.de (Christian Heimes)
Date: Wed, 14 Nov 2007 18:29:52 +0100
Subject: [Python-ideas] python3: subtle change to new input()
In-Reply-To: <200711141637.56229.mark@qtrac.eu>
References: <200711140807.16677.mark@qtrac.eu>	<200711141400.39876.mark@qtrac.eu>	<ca471dc20711140628n2a73739di1e4382caa6f28691@mail.gmail.com>
	<200711141637.56229.mark@qtrac.eu>
Message-ID: <fhfbag$i6t$1@ger.gmane.org>

Mark Summerfield wrote:
> And the docs are perfectly okay... well, apart from "stripping a
> trailing newline". On Unices that's fine but I don't know if Windows
> consoles actually send \r\n or whatever, in which case, assuming input()
> does the right cross-platform thing, maybe "stripping the trailing line
> termination character(s)" would be more accurate.

Microsoft's stdio lib uses \n as the newline for stdin, stdout and
stderr. Does that answer your question?

Christian



From rhamph at gmail.com  Wed Nov 14 18:49:20 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Wed, 14 Nov 2007 10:49:20 -0700
Subject: [Python-ideas] cmp and sorting non-symmetric types
In-Reply-To: <aac2c7cb0711131051k30972562i75aff0c77e5f0fad@mail.gmail.com>
References: <aac2c7cb0711131051k30972562i75aff0c77e5f0fad@mail.gmail.com>
Message-ID: <aac2c7cb0711140949g77266723sfdee210e53aafbc7@mail.gmail.com>

(ugh, this was supposed to go to python-ideas, not python-list.  No
wonder I got no responses to this email!)

(I've had trouble getting response for collaboration on a PEP.
Perhaps I'm the only interested party?)

Although py3k raises an exception for completely unsortable types, it
continues to silently do the wrong thing for non-symmetric types that
overload comparison operators with special meanings.

>>> a = set([1])
>>> b = set([2, 5])
>>> c = set([1, 2])
>>> sorted([a, c, b])
[{1}, {1, 2}, {2, 5}]
>>> sorted([a, b, c])
[{1}, {2, 5}, {1, 2}]

To solve this I propose a revived cmp (as per the previous thread[1]),
which is the preferred path for orderings.  The rich comparison
operators will be simple wrappers for cmp() (ensuring an exception is
raised if they're not merely comparing for equality.)

Thus, set would need 7 methods defined (6 rich comparisons plus
__cmp__, although it could skip __eq__ and __ne__), whereas nearly all
other types (int, list, etc) need only __cmp__.

Code which uses <= to compare sets would be assumed to want subset
operations.  Generic containers should use cmp() exclusively.
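
A small sketch of why the sorted() results above depend on input order,
assuming Python 3 set semantics where < means "proper subset". The helper
is_totally_ordered is a hypothetical name used only for illustration; it is
not part of the proposal.

```python
from itertools import combinations


def is_totally_ordered(items):
    """Return True if every pair compares as exactly one of <, ==, >."""
    for x, y in combinations(items, 2):
        # bool is an int subclass, so the sum counts how many relations hold.
        if (x < y) + (x == y) + (x > y) != 1:
            return False
    return True


a, b, c = {1}, {2, 5}, {1, 2}

# a is a proper subset of c, so this pair is ordered.
print(a < c)
# b and c overlap without either containing the other: no relation holds.
print(b < c, c < b, b == c)

print(is_totally_ordered([1, 2, 3]))
print(is_totally_ordered([a, b, c]))
```

Because b and c are incomparable, a sort that relies on < has no consistent
answer for where they belong relative to each other, which is the silent
misbehaviour the example session shows.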


[1] http://mail.python.org/pipermail/python-3000/2007-October/011072.html

--
Adam Olsen, aka Rhamphoryncus


From guido at python.org  Wed Nov 14 18:54:50 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 14 Nov 2007 09:54:50 -0800
Subject: [Python-ideas] cmp and sorting non-symmetric types
In-Reply-To: <aac2c7cb0711140949g77266723sfdee210e53aafbc7@mail.gmail.com>
References: <aac2c7cb0711131051k30972562i75aff0c77e5f0fad@mail.gmail.com>
	<aac2c7cb0711140949g77266723sfdee210e53aafbc7@mail.gmail.com>
Message-ID: <ca471dc20711140954k6f60ba92w2686fe4c3f467d4f@mail.gmail.com>

Are you sure you're solving a real problem?

On Nov 14, 2007 9:49 AM, Adam Olsen <rhamph at gmail.com> wrote:
> (ugh, this was supposed to go to python-ideas, not python-list.  No
> wonder I got no responses to this email!)
>
> (I've had trouble getting response for collaboration on a PEP.
> Perhaps I'm the only interested party?)
>
> Although py3k raises an exception for completely unsortable types, it
> continues to silently do the wrong thing for non-symmetric types that
> overload comparison operators with special meanings.
>
> >>> a = set([1])
> >>> b = set([2, 5])
> >>> c = set([1, 2])
> >>> sorted([a, c, b])
> [{1}, {1, 2}, {2, 5}]
> >>> sorted([a, b, c])
> [{1}, {2, 5}, {1, 2}]
>
> To solve this I propose a revived cmp (as per the previous thread[1]),
> which is the preferred path for orderings.  The rich comparison
> operators will be simple wrappers for cmp() (ensuring an exception is
> raised if they're not merely comparing for equality.)
>
> Thus, set would need 7 methods defined (6 rich comparisons plus
> __cmp__, although it could skip __eq__ and __ne__), whereas nearly all
> other types (int, list, etc) need only __cmp__.
>
> Code which uses <= to compare sets would be assumed to want subset
> operations.  Generic containers should use cmp() exclusively.
>
>
> [1] http://mail.python.org/pipermail/python-3000/2007-October/011072.html
>
> --
> Adam Olsen, aka Rhamphoryncus
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From theller at ctypes.org  Wed Nov 14 19:29:52 2007
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 14 Nov 2007 19:29:52 +0100
Subject: [Python-ideas] Make obj[] valid syntax?
Message-ID: <fhfer0$vml$1@ger.gmane.org>

I'm not sure if this is a good idea or not, but - hey - this
is python.ideas ;-)

The following statements currently raise a SyntaxError:

  obj[] = something
  x = obj[]

I propose to make these statements valid syntax.
'obj[]' should behave like 'obj[()]' does:
Call __getitem__ or __setitem__ with an empty tuple.

My use case is in a COM library (comtypes).

Some COM properties require one or more arguments; this is
not a problem since one could write
  obj.prop[1, 2, 3]

Sometimes, however, arguments are optional.  Unfortunately
one has to write
  obj.prop[()]
to pass an empty tuple to __getitem__ or __setitem__,
which looks strange imo.
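
To make the current behaviour concrete, here is a minimal sketch of what
__getitem__ and __setitem__ receive today. The Recorder class is hypothetical
and has nothing to do with comtypes itself; it only echoes the key it is
given.

```python
class Recorder:
    """Hypothetical class that exposes the key passed to item access."""

    def __init__(self):
        self.store = {}

    def __getitem__(self, key):
        # Return the key unchanged so the caller can inspect it.
        return key

    def __setitem__(self, key, value):
        self.store[key] = value


obj = Recorder()

print(obj[1, 2, 3])    # the three arguments arrive as one tuple
print(obj[(1, 2, 3)])  # explicit parentheses give the same key
print(obj[1])          # a single argument is not wrapped in a tuple
print(obj[()])         # the empty tuple has to be spelled out

obj[()] = "foo"
print(obj.store)
```

The proposal would let obj[] stand for the obj[()] case, for both reading and
assignment.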

Comments?

Thomas



From phd at phd.pp.ru  Wed Nov 14 19:34:40 2007
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Wed, 14 Nov 2007 21:34:40 +0300
Subject: [Python-ideas] Make obj[] valid syntax?
In-Reply-To: <fhfer0$vml$1@ger.gmane.org>
References: <fhfer0$vml$1@ger.gmane.org>
Message-ID: <20071114183440.GC30836@phd.pp.ru>

On Wed, Nov 14, 2007 at 07:29:52PM +0100, Thomas Heller wrote:
> 'obj[]' should behave like 'obj[()]' does:

   I remember it was discussed and rejected a year or two ago. Still -1
from me. Explicit [()] is better than implicit [].

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.


From guido at python.org  Wed Nov 14 19:35:16 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 14 Nov 2007 10:35:16 -0800
Subject: [Python-ideas] Make obj[] valid syntax?
In-Reply-To: <fhfer0$vml$1@ger.gmane.org>
References: <fhfer0$vml$1@ger.gmane.org>
Message-ID: <ca471dc20711141035n14d6720ey538a412f5c2dd4fe@mail.gmail.com>

Why can't you use call syntax, i.e. obj.prop(1, 2, 3)?

On Nov 14, 2007 10:29 AM, Thomas Heller <theller at ctypes.org> wrote:
> I'm not sure if this is a good idea or not, but - hey - this
> is python.ideas ;-)
>
> The following statements currently raise a SyntaxError:
>
>   obj[] = something
>   x = obj[]
>
> I propose to make these statements valid syntax.
> 'obj[]' should behave like 'obj[()]' does:
> Call __getitem__ or __setitem__ with an empty tuple.
>
> My use case is in a COM library (comtypes).
>
> Some COM properties require one or more arguments; this is
> not a problem since one could write
>   obj.prop[1, 2, 3]
>
> Sometimes, however, arguments are optional.  Unfortunately
> one has to write
>   obj.prop[()]
> to pass an empty tuple to __getitem__ or __setitem__,
> which looks strange imo.
>
> Comments?
>
> Thomas
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From theller at ctypes.org  Wed Nov 14 19:37:48 2007
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 14 Nov 2007 19:37:48 +0100
Subject: [Python-ideas] Make obj[] valid syntax?
In-Reply-To: <ca471dc20711141035n14d6720ey538a412f5c2dd4fe@mail.gmail.com>
References: <fhfer0$vml$1@ger.gmane.org>
	<ca471dc20711141035n14d6720ey538a412f5c2dd4fe@mail.gmail.com>
Message-ID: <fhff9s$tu$1@ger.gmane.org>

Guido van Rossum schrieb:
> Why can't you use call syntax, i.e. obj.prop(1, 2, 3)?

Because I cannot set the property in this way:

obj.prop(1, 2, 3) = "foo"

Of course I know that obj.set_prop(1, 2, 3, "foo") would work.



From theller at ctypes.org  Wed Nov 14 19:38:34 2007
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 14 Nov 2007 19:38:34 +0100
Subject: [Python-ideas] Make obj[] valid syntax?
In-Reply-To: <20071114183440.GC30836@phd.pp.ru>
References: <fhfer0$vml$1@ger.gmane.org> <20071114183440.GC30836@phd.pp.ru>
Message-ID: <fhffba$tu$2@ger.gmane.org>

Oleg Broytmann schrieb:
> On Wed, Nov 14, 2007 at 07:29:52PM +0100, Thomas Heller wrote:
>> 'obj[]' should behave like 'obj[()]' does:
> 
>    I remember it was discussed and rejected a year or two ago. Still -1
> from me. Explicit [()] is better than implicit [].

However:   obj[(1, 2, 3)] is the same as obj[1, 2, 3]



From guido at python.org  Wed Nov 14 19:45:06 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 14 Nov 2007 10:45:06 -0800
Subject: [Python-ideas] Make obj[] valid syntax?
In-Reply-To: <fhff9s$tu$1@ger.gmane.org>
References: <fhfer0$vml$1@ger.gmane.org>
	<ca471dc20711141035n14d6720ey538a412f5c2dd4fe@mail.gmail.com>
	<fhff9s$tu$1@ger.gmane.org>
Message-ID: <ca471dc20711141045x16204286ma31b7a747a5b403e@mail.gmail.com>

On Nov 14, 2007 10:37 AM, Thomas Heller <theller at ctypes.org> wrote:
> Guido van Rossum schrieb:
> > Why can't you use call syntax, i.e. obj.prop(1, 2, 3)?
>
> Because I cannot set the property in this way:
>
> obj.prop(1, 2, 3) = "foo"
>
> Of course I know that obj.set_prop(1, 2, 3, "foo") would work.

And can't you arrange for obj.prop = "foo" to work as well as
obj.prop[1,2,3] = "foo"?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From rhamph at gmail.com  Wed Nov 14 19:47:56 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Wed, 14 Nov 2007 11:47:56 -0700
Subject: [Python-ideas] cmp and sorting non-symmetric types
In-Reply-To: <ca471dc20711140954k6f60ba92w2686fe4c3f467d4f@mail.gmail.com>
References: <aac2c7cb0711131051k30972562i75aff0c77e5f0fad@mail.gmail.com>
	<aac2c7cb0711140949g77266723sfdee210e53aafbc7@mail.gmail.com>
	<ca471dc20711140954k6f60ba92w2686fe4c3f467d4f@mail.gmail.com>
Message-ID: <aac2c7cb0711141047g2bde3b24xfd07e34cec6c9e80@mail.gmail.com>

On Nov 14, 2007 10:54 AM, Guido van Rossum <guido at python.org> wrote:
> Are you sure you're solving a real problem?

I see it as part of a problem we've already decided to solve, by
making types with no reasonable ordering raise TypeError.


> On Nov 14, 2007 9:49 AM, Adam Olsen <rhamph at gmail.com> wrote:
> > (ugh, this was supposed to go to python-ideas, not python-list.  No
> > wonder I got no responses to this email!)
> >
> > (I've had trouble getting response for collaboration on a PEP.
> > Perhaps I'm the only interested party?)
> >
> > Although py3k raises an exception for completely unsortable types, it
> > continues to silently do the wrong thing for non-symmetric types that
> > overload comparison operators with special meanings.
> >
> > >>> a = set([1])
> > >>> b = set([2, 5])
> > >>> c = set([1, 2])
> > >>> sorted([a, c, b])
> > [{1}, {1, 2}, {2, 5}]
> > >>> sorted([a, b, c])
> > [{1}, {2, 5}, {1, 2}]
> >
> > To solve this I propose a revived cmp (as per the previous thread[1]),
> > which is the preferred path for orderings.  The rich comparison
> > operators will be simple wrappers for cmp() (ensuring an exception is
> > raised if they're not merely comparing for equality.)
> >
> > Thus, set would need 7 methods defined (6 rich comparisons plus
> > __cmp__, although it could skip __eq__ and __ne__), whereas nearly all
> > other types (int, list, etc) need only __cmp__.
> >
> > Code which uses <= to compare sets would be assumed to want subset
> > operations.  Generic containers should use cmp() exclusively.
> >
> >
> > [1] http://mail.python.org/pipermail/python-3000/2007-October/011072.html
> >
> > --
> > Adam Olsen, aka Rhamphoryncus
>
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>



-- 
Adam Olsen, aka Rhamphoryncus


From theller at ctypes.org  Wed Nov 14 19:59:15 2007
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 14 Nov 2007 19:59:15 +0100
Subject: [Python-ideas] Make obj[] valid syntax?
In-Reply-To: <ca471dc20711141045x16204286ma31b7a747a5b403e@mail.gmail.com>
References: <fhfer0$vml$1@ger.gmane.org>	<ca471dc20711141035n14d6720ey538a412f5c2dd4fe@mail.gmail.com>	<fhff9s$tu$1@ger.gmane.org>
	<ca471dc20711141045x16204286ma31b7a747a5b403e@mail.gmail.com>
Message-ID: <fhfgi3$61c$1@ger.gmane.org>

Guido van Rossum schrieb:
> On Nov 14, 2007 10:37 AM, Thomas Heller <theller at ctypes.org> wrote:
>> Guido van Rossum schrieb:
>> > Why can't you use call syntax, i.e. obj.prop(1, 2, 3)?
>>
>> Because I cannot set the property in this way:
>>
>> obj.prop(1, 2, 3) = "foo"
>>
>> Of course I know that obj.set_prop(1, 2, 3, "foo") would work.
> 
> And can't you arrange for obj.prop = "foo" to work as well as
> obj.prop[1,2,3] = "foo"?
> 

Sure, but this requires using [] for setting and () for getting the property.



From phd at phd.pp.ru  Wed Nov 14 20:00:57 2007
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Wed, 14 Nov 2007 22:00:57 +0300
Subject: [Python-ideas] Make obj[] valid syntax?
In-Reply-To: <fhffba$tu$2@ger.gmane.org>
References: <fhfer0$vml$1@ger.gmane.org> <20071114183440.GC30836@phd.pp.ru>
	<fhffba$tu$2@ger.gmane.org>
Message-ID: <20071114190057.GA32728@phd.pp.ru>

On Wed, Nov 14, 2007 at 07:38:34PM +0100, Thomas Heller wrote:
> Oleg Broytmann schrieb:
> > On Wed, Nov 14, 2007 at 07:29:52PM +0100, Thomas Heller wrote:
> >> 'obj[]' should behave like 'obj[()]' does:
> > 
> >    I remember it was discussed and rejected a year or two ago. Still -1
> > from me. Explicit [()] is better than implicit [].
> 
> However:   obj[(1, 2, 3)] is the same as obj[1, 2, 3]

   1, 2, 3 is a tuple, and () is a tuple, should there be a syntax for an
empty tuple without parentheses?

   Thomas, there were many arguments in the previous discussion. This one
was there, too. But finally the proposal was rejected.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.


From guido at python.org  Wed Nov 14 19:54:31 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 14 Nov 2007 10:54:31 -0800
Subject: [Python-ideas] cmp and sorting non-symmetric types
In-Reply-To: <aac2c7cb0711141047g2bde3b24xfd07e34cec6c9e80@mail.gmail.com>
References: <aac2c7cb0711131051k30972562i75aff0c77e5f0fad@mail.gmail.com>
	<aac2c7cb0711140949g77266723sfdee210e53aafbc7@mail.gmail.com>
	<ca471dc20711140954k6f60ba92w2686fe4c3f467d4f@mail.gmail.com>
	<aac2c7cb0711141047g2bde3b24xfd07e34cec6c9e80@mail.gmail.com>
Message-ID: <ca471dc20711141054h6ba49d29ra73a31fb2771f775@mail.gmail.com>

On Nov 14, 2007 10:47 AM, Adam Olsen <rhamph at gmail.com> wrote:
> On Nov 14, 2007 10:54 AM, Guido van Rossum <guido at python.org> wrote:
> > Are you sure you're solving a real problem?
>
> I see it as part of a problem we've already decided to solve, by
> making types with no reasonable ordering raise TypeError.

I think we're reaching the land of diminishing returns though.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at python.org  Wed Nov 14 19:55:39 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 14 Nov 2007 10:55:39 -0800
Subject: [Python-ideas] Make obj[] valid syntax?
In-Reply-To: <fhffba$tu$2@ger.gmane.org>
References: <fhfer0$vml$1@ger.gmane.org> <20071114183440.GC30836@phd.pp.ru>
	<fhffba$tu$2@ger.gmane.org>
Message-ID: <ca471dc20711141055x129291e3uf34c975303816eb2@mail.gmail.com>

On Nov 14, 2007 10:38 AM, Thomas Heller <theller at ctypes.org> wrote:
> Oleg Broytmann schrieb:
> > On Wed, Nov 14, 2007 at 07:29:52PM +0100, Thomas Heller wrote:
> >> 'obj[]' should behave like 'obj[()]' does:
> >
> >    I remember it was discussed and rejected a year or two ago. Still -1
> > from me. Explicit [()] is better than implicit [].
>
> However:   obj[(1, 2, 3)] is the same as obj[1, 2, 3]

So what?

x =

is not equivalent to

x = ()

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From rhamph at gmail.com  Wed Nov 14 20:03:19 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Wed, 14 Nov 2007 12:03:19 -0700
Subject: [Python-ideas] cmp and sorting non-symmetric types
In-Reply-To: <ca471dc20711141054h6ba49d29ra73a31fb2771f775@mail.gmail.com>
References: <aac2c7cb0711131051k30972562i75aff0c77e5f0fad@mail.gmail.com>
	<aac2c7cb0711140949g77266723sfdee210e53aafbc7@mail.gmail.com>
	<ca471dc20711140954k6f60ba92w2686fe4c3f467d4f@mail.gmail.com>
	<aac2c7cb0711141047g2bde3b24xfd07e34cec6c9e80@mail.gmail.com>
	<ca471dc20711141054h6ba49d29ra73a31fb2771f775@mail.gmail.com>
Message-ID: <aac2c7cb0711141103r2943106dwa0f7715cafaa8d1c@mail.gmail.com>

On Nov 14, 2007 11:54 AM, Guido van Rossum <guido at python.org> wrote:
> On Nov 14, 2007 10:47 AM, Adam Olsen <rhamph at gmail.com> wrote:
> > On Nov 14, 2007 10:54 AM, Guido van Rossum <guido at python.org> wrote:
> > > Are you sure you're solving a real problem?
> >
> > I see it as part of a problem we've already decided to solve, by
> > making types with no reasonable ordering raise TypeError.
>
> I think we're reaching the land of diminishing returns though.

Aye.  If we don't want to readd __cmp__ for other reasons then it's
not worthwhile.  If we do readd __cmp__ then it's basically free.

So the real question is if there's enough support behind __cmp__...
which I kind of doubt at this point.

-- 
Adam Olsen, aka Rhamphoryncus


From guido at python.org  Wed Nov 14 20:18:14 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 14 Nov 2007 11:18:14 -0800
Subject: [Python-ideas] cmp and sorting non-symmetric types
In-Reply-To: <aac2c7cb0711141103r2943106dwa0f7715cafaa8d1c@mail.gmail.com>
References: <aac2c7cb0711131051k30972562i75aff0c77e5f0fad@mail.gmail.com>
	<aac2c7cb0711140949g77266723sfdee210e53aafbc7@mail.gmail.com>
	<ca471dc20711140954k6f60ba92w2686fe4c3f467d4f@mail.gmail.com>
	<aac2c7cb0711141047g2bde3b24xfd07e34cec6c9e80@mail.gmail.com>
	<ca471dc20711141054h6ba49d29ra73a31fb2771f775@mail.gmail.com>
	<aac2c7cb0711141103r2943106dwa0f7715cafaa8d1c@mail.gmail.com>
Message-ID: <ca471dc20711141118k4db7298ewa52fca0b9b5b63f7@mail.gmail.com>

On Nov 14, 2007 11:03 AM, Adam Olsen <rhamph at gmail.com> wrote:
> On Nov 14, 2007 11:54 AM, Guido van Rossum <guido at python.org> wrote:
> > On Nov 14, 2007 10:47 AM, Adam Olsen <rhamph at gmail.com> wrote:
> > > On Nov 14, 2007 10:54 AM, Guido van Rossum <guido at python.org> wrote:
> > > > Are you sure you're solving a real problem?
> > >
> > > I see it as part of a problem we've already decided to solve, by
> > > making types with no reasonable ordering raise TypeError.
> >
> > I think we're reaching the land of diminishing returns though.
>
> Aye.  If we don't want to readd __cmp__ for other reasons then it's
> not worthwhile.  If we do readd __cmp__ then it's basically free.

That depends -- while __cmp__ may be faster to compare lists or
tuples, __lt__ is faster when comparing ints or strings.

> So the real question is if there's enough support behind __cmp__..
> which I kind of doubt at this point.

If nobody volunteers to help write a PEP at this point, I will have to
agree with that conclusion.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From theller at ctypes.org  Wed Nov 14 20:22:48 2007
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 14 Nov 2007 20:22:48 +0100
Subject: [Python-ideas] Make obj[] valid syntax?
In-Reply-To: <ca471dc20711141055x129291e3uf34c975303816eb2@mail.gmail.com>
References: <fhfer0$vml$1@ger.gmane.org>
	<20071114183440.GC30836@phd.pp.ru>	<fhffba$tu$2@ger.gmane.org>
	<ca471dc20711141055x129291e3uf34c975303816eb2@mail.gmail.com>
Message-ID: <fhfhu8$bc6$1@ger.gmane.org>

Guido van Rossum schrieb:
> So what?
> 
> x =
> 
> is not equivalent to
> 
> x = ()

I won't argue this with you ;-)

>> Oleg Broytmann schrieb:
>>> On Wed, Nov 14, 2007 at 07:29:52PM +0100, Thomas Heller wrote:
>>>> 'obj[]' should behave like 'obj[()]' does:
>>>
>>>    I remember it was discussed and rejected a year or two ago. Still -1
>>> from me. Explicit [()] is better than implicit [].
>>
>> However:   obj[(1, 2, 3)] is the same as obj[1, 2, 3]
>
>    1, 2, 3 is a tuple, and () is a tuple, should there be a syntax for an
> empty tuple without parentheses?
>
>    Thomas, there were many arguments in the previous discussion. This one
> was there, too. But finally the proposal was rejected.

I see that my proposal probably won't fly.  This encourages me to describe
my full wish, just for fun:

It would be nice if I could have positional AND keyword arguments for __getitem__
and __setitem__, so that I could write code like this (COM has named parameters also):

  x = obj.prop[1, 2, lcid=0]
  x = obj.prop[]

  obj.prop[1, 2, lcid=0] = "foo"
  obj.prop[] = "foo"

or even

  x = obj.prop[1, 2, lcid=0]
  x = obj.prop[]
  x = obj.prop # same as previous line (now how would THAT work?)

  obj.prop[1, 2, lcid=0] = "foo"
  obj.prop[] = "foo"
  obj.prop = "foo" # same as previous line

I retract my proposal.

VB-ly, yours

Thomas



From tjreedy at udel.edu  Wed Nov 14 22:43:44 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 14 Nov 2007 16:43:44 -0500
Subject: [Python-ideas] python3: subtle change to new input()
References: <200711140807.16677.mark@qtrac.eu> <fhettl$sc5$1@ger.gmane.org>
	<473B1AFA.7080900@cheimes.de>
Message-ID: <fhfq6f$8pb$1@ger.gmane.org>


"Christian Heimes" <lists at cheimes.de> wrote in 
message news:473B1AFA.7080900 at cheimes.de...
| I think it's more likely a subtle difference between platforms:
|
| On Linux
| >>> r = input() <CTRL+D>
| Traceback (most recent call last):
|  File "<stdin>", line 1, in <module>
| EOFError
| >>> r = input() <CTRL+Z>
| [1]+  Stopped                 ./python
| $ fg 1
| ./python
| >>>
|
| On Windows
| >>> r = input() <CTRL+D>
| ^D <ENTER>
| >>> r
| '\x04'
| >>> r = input() <CTRL+Z>
| ^Z <ENTER>
| Traceback (most recent call last):
|  File "<stdin>", line 1, in <module>
| EOFError

1. Would it be sensibly possible to equalize the behavior?  (Your def of 
'sensibly'.)
a. ^D and ^Z both raise EOF on all systems.
b. Only ^D on all systems
c. ^D on all systems and ^Z also on Windows.

Would it be a good idea?

For many current Windows users, Python will be the only contact with an 
imitation-DOS console window and the need for EOF input, so strict 
imitation of old, semi-obsolete DOS mode behavior seems not necessary.

tjr





From tjreedy at udel.edu  Wed Nov 14 23:04:41 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 14 Nov 2007 17:04:41 -0500
Subject: [Python-ideas] Make obj[] valid syntax?
References: <fhfer0$vml$1@ger.gmane.org>
	<20071114183440.GC30836@phd.pp.ru><fhffba$tu$2@ger.gmane.org>
	<20071114190057.GA32728@phd.pp.ru>
Message-ID: <fhfrdp$d1d$1@ger.gmane.org>


"Oleg Broytmann" <phd at phd.pp.ru> wrote in 
message news:20071114190057.GA32728 at phd.pp.ru...
|   1, 2, 3 is a tuple, and () is a tuple, should there be a syntax for an
| empty tuple without parentheses?

The only thing I have thought of is a bare comma, but I like that even less 
than the () exception and expect most would agree ;-)






From steven.bethard at gmail.com  Wed Nov 14 23:05:44 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed, 14 Nov 2007 15:05:44 -0700
Subject: [Python-ideas] python3: subtle change to new input()
In-Reply-To: <fhfq6f$8pb$1@ger.gmane.org>
References: <200711140807.16677.mark@qtrac.eu> <fhettl$sc5$1@ger.gmane.org>
	<473B1AFA.7080900@cheimes.de> <fhfq6f$8pb$1@ger.gmane.org>
Message-ID: <d11dcfba0711141405j58abdeey9ecdf746e280ef7b@mail.gmail.com>

On Nov 14, 2007 2:43 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> 1. Would it be sensibly possible to equalize the behavior?  (Your def of
> 'sensibly'.)
> a. ^D and ^Z both raise EOF on all systems.
> b. Only ^D on all systems
> c. ^D on all systems and ^Z also on Windows.
>
> Would it be a good idea?
>
> For many current Windows users, Python will be the only contact with an
> imitation-DOS console window and the need for EOF input, so strict
> imitation of old, semi-obsolete DOS mode behavior seems not necessary.

There's already a single way of spelling this on both systems: quit()

$python
Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> quit()
$

$python
Python 2.5.1 (r251:54863, Nov 12 2007, 09:59:19)
[GCC 3.4.6 20060404 (Red Hat 3.4.6-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> quit()
$

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From guido at python.org  Wed Nov 14 23:15:34 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 14 Nov 2007 14:15:34 -0800
Subject: [Python-ideas] python3: subtle change to new input()
In-Reply-To: <fhfq6f$8pb$1@ger.gmane.org>
References: <200711140807.16677.mark@qtrac.eu> <fhettl$sc5$1@ger.gmane.org>
	<473B1AFA.7080900@cheimes.de> <fhfq6f$8pb$1@ger.gmane.org>
Message-ID: <ca471dc20711141415u47a9f637yb5abe4cd34591835@mail.gmail.com>

On Nov 14, 2007 1:43 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> For many current Windows users, Python will be the only contact with an
> imitation-DOS console window and the need for EOF input, so strict
> imitation of old, semi-obsolete DOS mode behavior seems not necessary.

We're not doing any of the imitation. On both Linux and Windows we're
getting whatever the OS provides. Note that on Linux you can change
the EOF character using the stty command. I wouldn't be surprised if
there was a way to change this setting in Windows too. But I'd be
opposed to Python messing with it -- while some users may never have
seen a DOS box before, others use them all the time, and Python should
work out of the box for the latter too. Those afraid of DOS boxes
should probably use IDLE anyway.
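
A minimal sketch of code that relies on whatever the OS provides: it
treats EOFError as the end-of-input signal and never checks which key
produced it. The helper name read_line is hypothetical, and the sketch
assumes Python 3's input().

```python
def read_line(prompt=""):
    """Return one line of input, or None once end of file is reported."""
    try:
        return input(prompt)
    except EOFError:
        # Raised for ^D on a typical Unix terminal and for ^Z <ENTER>
        # in a Windows console; the caller does not need to know which.
        return None
```

A loop such as "while (line := read_line()) is not None" then behaves
the same way on both platforms.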

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From greg.ewing at canterbury.ac.nz  Thu Nov 15 02:41:33 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 15 Nov 2007 14:41:33 +1300
Subject: [Python-ideas] cmp and sorting non-symmetric types
In-Reply-To: <aac2c7cb0711140949g77266723sfdee210e53aafbc7@mail.gmail.com>
References: <aac2c7cb0711131051k30972562i75aff0c77e5f0fad@mail.gmail.com>
	<aac2c7cb0711140949g77266723sfdee210e53aafbc7@mail.gmail.com>
Message-ID: <473BA3CD.2050305@canterbury.ac.nz>

Adam Olsen wrote:
> Thus, set would need 7 methods defined (6 rich comparisons plus
> __cmp__, although it could skip __eq__ and __ne__)

With the 4-valued __cmp__ that I proposed, it would
only need __cmp__, I think.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+


From luke.stebbing at gmail.com  Thu Nov 15 03:01:50 2007
From: luke.stebbing at gmail.com (Luke Stebbing)
Date: Wed, 14 Nov 2007 18:01:50 -0800
Subject: [Python-ideas] cmp and sorting non-symmetric types
In-Reply-To: <473BA3CD.2050305@canterbury.ac.nz>
References: <aac2c7cb0711131051k30972562i75aff0c77e5f0fad@mail.gmail.com>
	<aac2c7cb0711140949g77266723sfdee210e53aafbc7@mail.gmail.com>
	<473BA3CD.2050305@canterbury.ac.nz>
Message-ID: <dcb1979a0711141801wfc9ead2j59faabb3c06f15bf@mail.gmail.com>

On 11/14/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Adam Olsen wrote:
> > Thus, set would need 7 methods defined (6 rich comparisons plus
> > __cmp__, although it could skip __eq__ and __ne__)
>
> With the 4-valued __cmp__ that I proposed, it would
> only need __cmp__, I think.

set only needs 4 values, but other types need more. See PEP 207,
Proposed Resolutions, #3:
http://www.python.org/dev/peps/pep-0207/

IMO, such things should not use comparison operators, but I think I'm
in the minority.

Luke


From greg.ewing at canterbury.ac.nz  Thu Nov 15 03:01:52 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 15 Nov 2007 15:01:52 +1300
Subject: [Python-ideas] python3: subtle change to new input()
In-Reply-To: <fhfq6f$8pb$1@ger.gmane.org>
References: <200711140807.16677.mark@qtrac.eu> <fhettl$sc5$1@ger.gmane.org>
	<473B1AFA.7080900@cheimes.de> <fhfq6f$8pb$1@ger.gmane.org>
Message-ID: <473BA890.4080904@canterbury.ac.nz>

Terry Reedy wrote:
> strict 
> imitation of old, semi-obsolete DOS mode behavior

...which I think was already obsolete when it was inherited
from CP/M...

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+


From greg.ewing at canterbury.ac.nz  Thu Nov 15 03:21:37 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 15 Nov 2007 15:21:37 +1300
Subject: [Python-ideas] cmp and sorting non-symmetric types
In-Reply-To: <dcb1979a0711141801wfc9ead2j59faabb3c06f15bf@mail.gmail.com>
References: <aac2c7cb0711131051k30972562i75aff0c77e5f0fad@mail.gmail.com>
	<aac2c7cb0711140949g77266723sfdee210e53aafbc7@mail.gmail.com>
	<473BA3CD.2050305@canterbury.ac.nz>
	<dcb1979a0711141801wfc9ead2j59faabb3c06f15bf@mail.gmail.com>
Message-ID: <473BAD31.4080205@canterbury.ac.nz>

Luke Stebbing wrote:
> set only needs 4 values, but other types need more.

A type can always override the 6 separate methods if
it needs to. I'm not proposing to replace these, only
to provide a simpler alternative that covers most use
cases.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+


From greg.ewing at canterbury.ac.nz  Thu Nov 15 00:55:23 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 15 Nov 2007 12:55:23 +1300
Subject: [Python-ideas] Raw strings return compiled regexps
In-Reply-To: <4734E46E.2050709@cs.byu.edu>
References: <4734E46E.2050709@cs.byu.edu>
Message-ID: <473B8AEB.8050000@canterbury.ac.nz>

Neil Toronto wrote:

> What if they just returned regular expression objects?

That would force the re module to be part of the core,
which would not be a good thing.

Also, raw strings are good for more than just regexps.
The fact that there are a few things that they're *not*
good for doesn't mean they should be restricted to
regexps.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+


From scott+python-ideas at scottdial.com  Thu Nov 15 09:14:25 2007
From: scott+python-ideas at scottdial.com (Scott Dial)
Date: Thu, 15 Nov 2007 03:14:25 -0500
Subject: [Python-ideas] Required to call superclass __init__
In-Reply-To: <fb6fbf560711140635u46b0f016g260f3185e89d4107@mail.gmail.com>
References: <473950FC.10202@cs.byu.edu>
	<20071113091246.GC15166@phd.pp.ru>	<fb6fbf560711130636i2d4abb80gd038c20127bbc222@mail.gmail.com>	<4739CC3B.1090205@cs.byu.edu>
	<473AD00D.2070502@cs.byu.edu>
	<fb6fbf560711140635u46b0f016g260f3185e89d4107@mail.gmail.com>
Message-ID: <473BFFE1.2040804@scottdial.com>

Jim Jewett wrote:
> I don't yet
> see what those simplifications should actually be, but maybe someone
> else will if you publish and wait long enough.
> 

The first thing I noticed was that the naming scheme is confusing.
Of required_super and super_required, neither indicates to me which
is the function decorator and which is the base class.
Furthermore, I don't see why required_super (the base class) needs a
distinct name. Perhaps I am being a bit too clever, but couldn't we just
overload the __new__ method of the base class?

def _super_required(func):
     ...

class super_required(object):
     ...
     def __new__(cls, *func):
         if len(func) > 0:
             return _super_required(*func)
         return object.__new__(cls)

Leaving your example now being spelled as:

class A(super_required):
     @super_required
     def __init__(self):
         pass

I can't think of a case where the base class would ever be passed
arguments, so this seems OK and rids us of the naming oddities.
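
A runnable sketch of the dispatch described above, written for
Python 3. The body of _super_required is a hypothetical stand-in (it
only tags the function), since the real decorator logic is elided in
the message.

```python
def _super_required(func):
    # Hypothetical stand-in for the real decorator: just tag the function.
    func.super_required = True
    return func


class super_required(object):
    def __new__(cls, *func):
        if len(func) > 0:
            # Called as a decorator: super_required(some_function).
            return _super_required(*func)
        # Called with no arguments: ordinary instance creation.
        return object.__new__(cls)


class A(super_required):
    @super_required
    def __init__(self):
        pass


a = A()
```

One caveat with this sketch: a subclass whose constructor takes
positional arguments, e.g. B(1), would be routed into the decorator
branch, so the "never passed arguments" assumption has to hold for
every subclass.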

-Scott

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu


From bborcic at gmail.com  Thu Nov 15 11:48:13 2007
From: bborcic at gmail.com (Boris Borcic)
Date: Thu, 15 Nov 2007 11:48:13 +0100
Subject: [Python-ideas] x @f   as shorthand for   x=f(x)
In-Reply-To: <fh2et1$phn$1@ger.gmane.org>
References: <fh1rhm$ui$1@ger.gmane.org> <fh2et1$phn$1@ger.gmane.org>
Message-ID: <fhh874$3is$1@ger.gmane.org>

Terry Reedy wrote:
> Making the obvious generalization to n params, and specializing to one, 
> gives
> 
> x f=

Been there, saw that :) But hadn't seen 'target decorators', eg

@f
x

Possibly shorthandable as

x @f

Cheers, BB



From luke.stebbing at gmail.com  Thu Nov 15 12:18:20 2007
From: luke.stebbing at gmail.com (Luke Stebbing)
Date: Thu, 15 Nov 2007 03:18:20 -0800
Subject: [Python-ideas] x @f as shorthand for x=f(x)
In-Reply-To: <fhh874$3is$1@ger.gmane.org>
References: <fh1rhm$ui$1@ger.gmane.org> <fh2et1$phn$1@ger.gmane.org>
	<fhh874$3is$1@ger.gmane.org>
Message-ID: <dcb1979a0711150318w27d862faj85f714b263f2ce89@mail.gmail.com>

On 11/15/07, Boris Borcic <bborcic at gmail.com> wrote:
> Possibly shorthandable as
>
> x @f

Hey, I can actually read that one. It sounds like "x, apply f".

Luke


From ntoronto at cs.byu.edu  Sat Nov 17 20:27:47 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Sat, 17 Nov 2007 12:27:47 -0700
Subject: [Python-ideas] Optional extra globals dict for function objects
Message-ID: <473F40B3.9030804@cs.byu.edu>

I set out trying to redo the 3.0 autosuper metaclass in 2.5 without 
bytecode hacking and ran into a problem: a function's func_globals isn't 
polymorphic. That is, the interpreter uses PyDict_* calls to access it, 
and in one case (LOAD_GLOBAL), actually inlines PyDict_GetItem manually. 
If it weren't for this, I could have easily done 3.0 super without 
bytecode hacking, by making a custom dict that allows another dict to 
shadow it, and putting the new super object in the shadowing dict.

I know it's for performance, and that if func_globals were made 
polymorphic, it'd bring the pystone benchmark to its knees, begging for 
a quick and merciful death. That's not what I'm proposing.

I propose adding a read-only attribute func_extra_globals to the 
function object, default NULL. In the interpreter loop, global lookups 
try func_extra_globals first if it's not NULL. It's accessed using 
PyObject_* functions.
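
A pure-Python model of the lookup order this would give, as a sketch
only. The function name lookup_global is hypothetical, and falling
back to the ordinary globals and then the builtins when a name is
missing from the extra mapping is an assumption; the proposal text
does not spell that out.

```python
import builtins


def lookup_global(name, extra_globals, module_globals):
    """Model: extra globals first, then module globals, then builtins."""
    if extra_globals is not None:
        # Accessed through the generic mapping protocol, so any
        # mapping type could be used here, not only a real dict.
        try:
            return extra_globals[name]
        except KeyError:
            pass
    if name in module_globals:
        return module_globals[name]
    try:
        return getattr(builtins, name)
    except AttributeError:
        raise NameError("name %r is not defined" % name) from None
```

When extra_globals is None, the model does exactly what a normal
global lookup does, which is the "near zero impact" case.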

Here are the reasons I think this is a good idea:

- It should have near zero impact on performance in the general case 
because NULL checks are quick. There would be another attribute in the 
frame object (f_extra_globals), almost always NULL.

- Language enhancement prototypes that currently use bytecode hacking 
could be accomplished with a method wrapper and a func_extra_globals 
dict. The prototypes could be pure Python, and thus more general, less 
brittle, and easier to get right. Hacking closures is nasty business.

- I'm sure lots of other stuff that I can't think of, where it'd be nice 
to dynamically add information to a method or function that can be 
accessed as a variable. Pure-Python function preambles whose results can 
be seen by the original function would be pretty sweet.

- Because func_extra_globals would be read-only and default NULL, it'd 
almost always be obvious when it's getting messed with. A 
wrapper/decorator or a metaclass, and a call to types.FunctionType() 
would signal that.

- func_globals would almost never have to be overridden: for most 
purposes (besides security), shadowing it is actually better, as it 
leaves the function's module fully accessible.
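
For comparison, a sketch of what can be done today with
types.FunctionType(): rebuild the function over a copy of its globals
with extra names merged in. The helper name with_extra_globals is
hypothetical, and the spelling is Python 3 (__globals__ and __code__
rather than func_globals and func_code). Unlike the proposed
shadowing, the copy is a snapshot, so later changes to the module's
globals are not seen by the rebuilt function.

```python
import types

GREETING = "hello"


def greet():
    return GREETING


def with_extra_globals(func, extra):
    """Return a copy of func whose globals are shadowed by extra."""
    merged = dict(func.__globals__)  # snapshot of the module globals
    merged.update(extra)             # extra names win over module names
    return types.FunctionType(
        func.__code__, merged, func.__name__,
        func.__defaults__, func.__closure__)


shadowed = with_extra_globals(greet, {"GREETING": "shadowed"})
```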

Anybody else think it's awesome? :) How about opinions of major suckage?

If it helps acceptance, I'd be willing to make a patch for this. It 
looks pretty straightforward.

Neil


From brett at python.org  Sat Nov 17 21:46:39 2007
From: brett at python.org (Brett Cannon)
Date: Sat, 17 Nov 2007 12:46:39 -0800
Subject: [Python-ideas] Optional extra globals dict for function objects
In-Reply-To: <473F40B3.9030804@cs.byu.edu>
References: <473F40B3.9030804@cs.byu.edu>
Message-ID: <bbaeab100711171246o617bde3cvd7eebc6611eaec12@mail.gmail.com>

On Nov 17, 2007 11:27 AM, Neil Toronto <ntoronto at cs.byu.edu> wrote:
> I set out trying to redo the 3.0 autosuper metaclass in 2.5 without
> bytecode hacking and ran into a problem: a function's func_globals isn't
> polymorphic. That is, the interpreter uses PyDict_* calls to access it,
> and in one case (LOAD_GLOBAL), actually inlines PyDict_GetItem manually.
> If it weren't for this, I could have easily done 3.0 super without
> bytecode hacking, by making a custom dict that allows another dict to
> shadow it, and putting the new super object in the shadowing dict.
>
> I know it's for performance, and that if func_globals were made
> polymorphic, it'd bring the pystone benchmark to its knees, begging for
> a quick and merciful death. That's not what I'm proposing.
>
> I propose adding a read-only attribute func_extra_globals to the
> function object, default NULL. In the interpreter loop, global lookups
> try func_extra_globals first if it's not NULL. It's accessed using
> PyObject_* functions.
>

My initial response is "eww".  I say this as I don't want to
complicate the scoping rules any more than they are.  This adds yet
another place to check for things.  While it might not be a nasty
performance hit (although you neglect to say what happens if something
is not found in func_extra_globals; do you check func_globals as well?
 That will be a performance penalty), it does complicate semantics slightly.

> Here are the reasons I think this is a good idea:
>
> - It should have near zero impact on performance in the general case
> because NULL checks are quick. There would be another attribute in the
> frame object (f_extra_globals), almost always NULL.
>

That is only true if you skip a func_globals check if the
func_extra_globals check doesn't happen.

> - Language enhancement prototypes that currently use bytecode hacking
> could be accomplished with a method wrapper and a func_extra_globals
> dict. The prototypes could be pure Python, and thus more general, less
> brittle, and easier to get right. Hacking closures is nasty business.

Which are what?  The auto-super example is not exactly common.

>
> - I'm sure lots of other stuff that I can't think of, where it'd be nice
> to dynamically add information to a method or function that can be
> accessed as a variable. Pure-Python function preambles whose results can
> be seen by the original function would be pretty sweet.

Basing an idea on unknown potential is not a good reason to add
something to the language.  I don't think the Air Force needs to
protect against flying pigs just because there is the possibility
someone might genetically engineer some to carry nuclear bombs.  =)

>
> - Because func_extra_globals would be read-only and default NULL, it'd
> almost always be obvious when it's getting messed with. A
> wrapper/decorator or a metaclass, and a call to types.FunctionType()
> would signal that.

Read-only?  Then how are you supposed to set this?  Do you want to
introduce something like __build_class__ for functions and methods?
Requiring the use of types.FunctionType() will be a pain and dilute
the usefulness.

>
> - func_globals would almost never have to be overridden: for most
> purposes (besides security), shadowing it is actually better, as it
> leaves the function's module fully accessible.
>

If that's the case why worry about func_extra_globals?  =)  It solves
95% of the uses you might have (and I suspect 94% of the uses are "I
don't need to muck with func_globals").

> Anybody else think it's awesome? :) How about opinions of major suckage?
>

I'm -1 on the idea personally.

> If it helps acceptance, I'd be willing to make a patch for this. It
> looks pretty straightforward.

It always helps acceptance, it's just a question of whether it will
push it over the edge into actually being accepted.

-Brett


From ntoronto at cs.byu.edu  Mon Nov 19 20:00:04 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Mon, 19 Nov 2007 12:00:04 -0700
Subject: [Python-ideas] Explicit self argument, implicit super argument
Message-ID: <4741DD34.1070106@cs.byu.edu>

(Disclaimer: I have no issue with "self." and "super." attribute access, 
which is what most people think of when they think "implicit self".)

While showing a coworker a bytecode hack I made this weekend - it allows 
insertion of arbitrary function parameters into an already-existing 
function - he asked for a use case. I gave him this:

class A(object):
     # ...
     def method(x, y):
         self.x = x
         super.method(y)


where 'method' is replaced by this method wrapper via metaclass or 
decorator:

def method_wrapper(self, *args, **kwargs):
     return hacked_method(self, super(cls, self), *args, **kwargs)


These hackish details aren't important, the resulting "A.method" is.

It occurred to me that explicit self and implicit super is semantically 
inconsistent. Here's Python 3000's version of the above (please compare):

class A(object):
     def method(self, x, y):
         self.x = x
         super.method(y)


Why have a magic "super" local but not a magic "self" local? From a 
*general usage* standpoint, the only reason I can think of (which is not 
necessarily the only one, which is why I'm asking) is that a person 
might want to change the name of "self", like so:

class AddLike(object):
     # ...
     def __add__(a, b):
         # return something
     def __radd__(b, a):
         # return something


But reverse binary special methods are the only case where it's not 
extremely bad form. Okay, two reasons for explicit self: backward 
compatibility, but 2to3 would make it a non-issue.

 From an *implementation standpoint*, making self implicit - a cell 
variable like super, for example - would wreak havoc with the current 
bound/unbound method distinction, but I'm not so sure that's a bad thing.

Neil


From guido at python.org  Mon Nov 19 20:11:21 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 19 Nov 2007 11:11:21 -0800
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <4741DD34.1070106@cs.byu.edu>
References: <4741DD34.1070106@cs.byu.edu>
Message-ID: <ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>

The reason for explicit self in method definition signatures is
semantic consistency. If you write

class C:
  def foo(self, x, y): ...

This really *is* the same as writing

class C:
  pass

def foo(self, x, y): ...
C.foo = foo

And of course it works the other way as well: you really *can* invoke
foo with an explicit argument for self as follows:

class D(C):
  ...

C.foo(D(), 1, 2)

IOW it's not an implementation hack -- it is a semantic device.
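
The equivalence can be checked directly. This is a small sketch in
Python 3 syntax; the method body is made up for illustration.

```python
class C:
    pass


def foo(self, x, y):
    # A plain function; nothing about it is specific to C.
    return (type(self).__name__, x + y)


C.foo = foo  # same effect as defining foo inside the class body


class D(C):
    pass


d = D()
bound_result = d.foo(1, 2)        # self is passed implicitly
explicit_result = C.foo(d, 1, 2)  # self is passed explicitly
```

Both calls run the same function object with the same arguments.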

--Guido

On Nov 19, 2007 11:00 AM, Neil Toronto <ntoronto at cs.byu.edu> wrote:
> (Disclaimer: I have no issue with "self." and "super." attribute access,
> which is what most people think of when they think "implicit self".)
>
> While showing a coworker a bytecode hack I made this weekend - it allows
> insertion of arbitrary function parameters into an already-existing
> function - he asked for a use case. I gave him this:
>
> class A(object):
>      # ...
>      def method(x, y):
>          self.x = x
>          super.method(y)
>
>
> where 'method' is replaced by this method wrapper via metaclass or
> decorator:
>
> def method_wrapper(self, *args, **kwargs):
>      return hacked_method(self, super(cls, self), *args, **kwargs)
>
>
> These hackish details aren't important, the resulting "A.method" is.
>
> It occurred to me that explicit self and implicit super is semantically
> inconsistent. Here's Python 3000's version of the above (please compare):
>
> class A(object):
>      def method(self, x, y):
>          self.x = x
>          super.method(y)
>
>
> Why have a magic "super" local but not a magic "self" local? From a
> *general usage* standpoint, the only reason I can think of (which is not
> necessarily the only one, which is why I'm asking) is that a person
> might want to change the name of "self", like so:
>
> class AddLike(object):
>      # ...
>      def __add__(a, b):
>          # return something
>      def __radd__(b, a):
>          # return something
>
>
> But reverse binary special methods are the only case where it's not
> extremely bad form. Okay, two reasons for explicit self: backward
> compatibility, but 2to3 would make it a non-issue.
>
>  From an *implementation standpoint*, making self implicit - a cell
> variable like super, for example - would wreak havoc with the current
> bound/unbound method distinction, but I'm not so sure that's a bad thing.
>
> Neil



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From luke.stebbing at gmail.com  Mon Nov 19 21:20:28 2007
From: luke.stebbing at gmail.com (Luke Stebbing)
Date: Mon, 19 Nov 2007 12:20:28 -0800
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>
References: <4741DD34.1070106@cs.byu.edu>
	<ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>
Message-ID: <dcb1979a0711191220h5918542ch9d73cece1b08c04c@mail.gmail.com>

On 11/19/07, Guido van Rossum <guido at python.org> wrote:
> The reason for explicit self in method definition signatures is
> semantic consistency. If you write
>
> class C:
>   def foo(self, x, y): ...
>
> This really *is* the same as writing
>
> class C:
>   pass
>
> def foo(self, x, y): ...
> C.foo = foo

What about an instancemethod decorator?

@instancemethod(C)
def foo(x, y): ...

> And of course it works the other way as well: you really *can* invoke
> foo with an explicit argument for self as follows:
>
> class D(C):
>   ...
>
> C.foo(D(), 1, 2)

Couldn't __builtin__.__super__ be used? It would look pretty weird if
you invoked a method higher up the MRO, though.


I find that these cases come up rarely in my code, while I forget the
'self' argument much more frequently, but YMMV.

Luke


From ntoronto at cs.byu.edu  Mon Nov 19 21:42:16 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Mon, 19 Nov 2007 13:42:16 -0700
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>
References: <4741DD34.1070106@cs.byu.edu>
	<ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>
Message-ID: <4741F528.6000400@cs.byu.edu>

Guido van Rossum wrote:
> The reason for explicit self in method definition signatures is
> semantic consistency. If you write
> 
> class C:
>   def foo(self, x, y): ...
> 
> This really *is* the same as writing
> 
> class C:
>   pass
> 
> def foo(self, x, y): ...
> C.foo = foo
> 
> And of course it works the other way as well: you really *can* invoke
> foo with an explicit argument for self as follows:
> 
> class D(C):
>   ...
> 
> C.foo(D(), 1, 2)
> 
> IOW it's not an implementation hack -- it is a semantic device.

Ah, thanks, that helps. (I'll be able to sleep tonight. :D) This
semantic device, of course, would really suck if applied to "super":

d = D()
C.foo(d, super(C, d), 1, 2)  # strange and hideous


which is a great reason that the new "super" is implicit.

(Before I continue, please understand that I'm not arguing for a 
language change. Responses to my last two ideas have shown me that I 
need to thoroughly understand why things are as they are right now while 
considering a change, and long before advocating one. It also goes over 
better with the language designers.)

Now, correct me if I'm wrong, but it seems there are only two use cases 
for DistantParentOfD.method(D_instance, ...):

1. The Good Case: you know the "next-method" as determined by the MRO 
isn't the right one to call. Multiple inheritance can twist you into 
this sort of behavior, though if it does, your design likely needs 
reconsideration.

2. The Evil Case: you know the override method as defined by D isn't the 
one you want for your extra-special D instance. This should be possible 
but never encouraged.

Because the runtime enforces isinstance(D_instance, D), everything else 
can be handled with D_instance.method(...) or self.method() or 
super.method(). We know that #1 and #2 above are the uncommon cases, 
which is why the new "super", which covers the common ones, doesn't 
cover those.

Is it right to say that the explicit "self" parameter only exists to 
enable those two uncommon cases?

Of course, if self were implicit, there would still need to be a way to 
spell DistantParentOfD.method(D_instance, ...). Being the uncommon case, 
maybe it shouldn't have a nice spelling:

as_parent(C, D_instance).method(...)
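
A rough sketch of how such a helper could be written with explicit
self, in Python 3. as_parent is the hypothetical name used above; the
classes and method bodies are made up for illustration.

```python
class as_parent(object):
    """Proxy that resolves attributes starting at cls instead of type(instance)."""

    def __init__(self, cls, instance):
        if not isinstance(instance, cls):
            raise TypeError("instance must be an instance of cls")
        self._cls = cls
        self._instance = instance

    def __getattr__(self, name):
        # Walk the MRO of the chosen parent and bind the first match
        # to the wrapped instance via the descriptor protocol.
        for klass in self._cls.__mro__:
            if name in vars(klass):
                attr = vars(klass)[name]
                if hasattr(attr, "__get__"):
                    return attr.__get__(self._instance, type(self._instance))
                return attr
        raise AttributeError(name)


class C(object):
    def method(self):
        return "C.method"


class D(C):
    def method(self):
        return "D.method"


D_instance = D()
```

Here as_parent(C, D_instance).method() runs C's version even though
D overrides it.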


Trying-to-understandingly-yours,

Neil



From arno at marooned.org.uk  Mon Nov 19 22:54:50 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Mon, 19 Nov 2007 21:54:50 +0000
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <4741F528.6000400@cs.byu.edu>
References: <4741DD34.1070106@cs.byu.edu>
	<ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>
	<4741F528.6000400@cs.byu.edu>
Message-ID: <FDFCAF5A-FEF5-4859-ACDA-94692AB91972@marooned.org.uk>


On 19 Nov 2007, at 20:42, Neil Toronto wrote:
>
[...]
> Now, correct me if I'm wrong, but it seems there are only two use  
> cases
> for DistantParentOfD.method(D_instance, ...):
>
> 1. The Good Case: you know the "next-method" as determined by the MRO
> isn't the right one to call. Multiple inheritance can twist you into
> this sort of behavior, though if it does, your design likely needs
> reconsideration.
>
> 2. The Evil Case: you know the override method as defined by D isn't  
> the
> one you want for your extra-special D instance. This should be  
> possible
> but never encouraged.
>
> Because the runtime enforces isinstance(D_instance, D), everything  
> else
> can be handled with D_instance.method(...) or self.method() or
> super.method(). We know that #1 and #2 above are the uncommon cases,
> which is why the new "super", which covers the common ones, doesn't
> cover those.
>
> Is it right to say that the explicit "self" parameter only exists to
> enable those two uncommon cases?


Self being explicit makes it less selfish :)
To illustrate, I like that you can do:

class Foo(str):
    def mybar(self):
        class Bar(str):
            def madeby(me):
                return "I am %s and I was made by %s" % (me, self)
        return Bar

 >>> foo=Foo("foo")
 >>> bar=foo.mybar()
 >>> Bar=foo.mybar()
 >>> bar=Bar("bar")
 >>> print bar.madeby()
I am bar and I was made by foo


This depends on 'self' being explicit and is not related to super.
I didn't know about implicit super, it's probably great but my initial
reaction is that I don't like it :(

Why not:

class Foo:
    @with_super
    def bar(super, self, x, y):
        super.bar(x, y)
        ...
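
A sketch of one way such a decorator could be implemented, assuming
Python 3.6 or later for __set_name__. with_super is the hypothetical
name from the example above; Base and the method bodies are made up
for illustration.

```python
import functools


class with_super(object):
    """Hypothetical descriptor: passes a super object as the first argument."""

    def __init__(self, func):
        self.func = func
        self.owner = None

    def __set_name__(self, owner, name):
        # Called when the owning class is created; records the class
        # in which the method was defined.
        self.owner = owner

    def __get__(self, instance, objtype=None):
        if instance is None:
            return self
        parent = super(self.owner, instance)
        return functools.partial(self.func, parent, instance)


class Base(object):
    def bar(self, x, y):
        return x + y


class Foo(Base):
    @with_super
    def bar(super, self, x, y):
        # Inside this function, "super" is the explicit first parameter.
        return super.bar(x, y) * 2
```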

-- 
Arnaud




From ntoronto at cs.byu.edu  Mon Nov 19 23:03:56 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Mon, 19 Nov 2007 15:03:56 -0700
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <FDFCAF5A-FEF5-4859-ACDA-94692AB91972@marooned.org.uk>
References: <4741DD34.1070106@cs.byu.edu>	<ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>	<4741F528.6000400@cs.byu.edu>
	<FDFCAF5A-FEF5-4859-ACDA-94692AB91972@marooned.org.uk>
Message-ID: <4742084C.6070008@cs.byu.edu>

Arnaud Delobelle wrote:
> Self being explicit makes it less selfish :)
> To illustrate, I like that you can do:
> 
> class Foo(str):
>     def mybar(self):
>         class Bar(str):
>             def madeby(me):
>                 return "I am %s and I was made by %s" % (me, self)
>         return Bar
> 
>  >>> foo=Foo("foo")
>  >>> #bar=foo.mybar()  # typo
>  >>> Bar=foo.mybar()
>  >>> bar=Bar("bar")
>  >>> print bar.madeby()
> I am bar and I was made by foo

Ah, I see. If self were passed implicitly, you would need to make a 
Bar.__init__ that received and stored the outer self. I think I'd call 
this a third uncommon case. Outside functional idioms, common is usually 
flat.

> This depends on 'self' being explicit and is not related to super.
> I didn't know about implicit super, it's probably great but my initial
> reaction is that I don't like it :(
> 
> Why not:
> 
> class Foo:
>     @with_super
>     def bar(super, self, x, y):
>         super.bar(x, y)
>         ...

Probably because it's way too common to require a decorator for it. 
Users would have to make "always use @with_super" into a coding habit. 
(Sort of like "self" actually.) It'd also be yet another thing to keep 
in mind while reading code: did this method use @with_super or not?

Neil


From greg.ewing at canterbury.ac.nz  Tue Nov 20 01:33:37 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 20 Nov 2007 13:33:37 +1300
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <4741F528.6000400@cs.byu.edu>
References: <4741DD34.1070106@cs.byu.edu>
	<ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>
	<4741F528.6000400@cs.byu.edu>
Message-ID: <47422B61.50703@canterbury.ac.nz>

Neil Toronto wrote:
> Because the runtime enforces isinstance(D_instance, D), everything else 
> can be handled with D_instance.method(...) or self.method() or 
> super.method().

But super() is not a general replacement for explicit inherited
method calls. It's only appropriate in special, quite restricted
circumstances.

--
Greg


From luke.stebbing at gmail.com  Tue Nov 20 02:27:30 2007
From: luke.stebbing at gmail.com (Luke Stebbing)
Date: Mon, 19 Nov 2007 17:27:30 -0800
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <FDFCAF5A-FEF5-4859-ACDA-94692AB91972@marooned.org.uk>
References: <4741DD34.1070106@cs.byu.edu>
	<ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>
	<4741F528.6000400@cs.byu.edu>
	<FDFCAF5A-FEF5-4859-ACDA-94692AB91972@marooned.org.uk>
Message-ID: <dcb1979a0711191727j7dac67b8j40a6ccc64961d4eb@mail.gmail.com>

On 11/19/07, Arnaud Delobelle <arno at marooned.org.uk> wrote:
> Self being explicit makes it less selfish :)
> To illustrate, I like that you can do:
>
> class Foo(str):
>     def mybar(self):
>         class Bar(str):
>             def madeby(me):
>                 return "I am %s and I was made by %s" % (me, self)
>         return Bar
>

How about:

class Foo(str):
    def mybar():
        outer = self
        class Bar(str):
            def madeby():
                return "I am %s and I was made by %s" % (self, outer)
        return Bar


Luke


From greg.ewing at canterbury.ac.nz  Tue Nov 20 02:23:40 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 20 Nov 2007 14:23:40 +1300
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <4741DD34.1070106@cs.byu.edu>
References: <4741DD34.1070106@cs.byu.edu>
Message-ID: <4742371C.1070706@canterbury.ac.nz>

Neil Toronto wrote:

> class A(object):
>      def method(self, x, y):
>          self.x = x
>          super.method(y)

Is that really how it's going to be? What if self isn't
called 'self'?

I would rather see

           super.method(self, y)

>  From an *implementation standpoint*, making self implicit - a cell 
> variable like super, for example - would wreak havoc with the current 
> bound/unbound method distinction, but I'm not so sure that's a bad thing.

What happens to explicit inherited method calls? If they
become impossible or awkward, it's very definitely a bad
thing.

--
Greg


From luke.stebbing at gmail.com  Tue Nov 20 02:50:08 2007
From: luke.stebbing at gmail.com (Luke Stebbing)
Date: Mon, 19 Nov 2007 17:50:08 -0800
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <4742371C.1070706@canterbury.ac.nz>
References: <4741DD34.1070106@cs.byu.edu> <4742371C.1070706@canterbury.ac.nz>
Message-ID: <dcb1979a0711191750qe356698h9a285bb147f7b8de@mail.gmail.com>

On 11/19/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Neil Toronto wrote:
>
> > class A(object):
> >      def method(self, x, y):
> >          self.x = x
> >          super.method(y)
>
> Is that really how it's going to be? What if self isn't
> called 'self'?
>
> I would rather see
>
>            super.method(self, y)

PEP 3135 specifies that the first argument of the method is used,
regardless of name:
http://www.python.org/dev/peps/pep-3135/#specification
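
A minimal sketch of that rule, for illustration only. It uses the
super().method(...) call spelling that Python 3.0 eventually shipped,
not the super.method(...) form quoted above, and the class names and
the parameter name "this" are made up:

```python
class Base(object):
    def method(self, y):
        return ("Base.method", y)


class A(Base):
    # The first parameter is deliberately not called "self".
    def method(this, x, y):
        this.x = x
        # Zero-argument super() binds to the first positional
        # parameter of the method, whatever it is named.
        return super().method(y)


a = A()
result = a.method(1, 2)
```

Here result comes from Base.method even though no name "self" exists
anywhere in A.method.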

Luke


From ntoronto at cs.byu.edu  Tue Nov 20 03:37:09 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Mon, 19 Nov 2007 19:37:09 -0700
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <dcb1979a0711191727j7dac67b8j40a6ccc64961d4eb@mail.gmail.com>
References: <4741DD34.1070106@cs.byu.edu>	<ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>	<4741F528.6000400@cs.byu.edu>	<FDFCAF5A-FEF5-4859-ACDA-94692AB91972@marooned.org.uk>
	<dcb1979a0711191727j7dac67b8j40a6ccc64961d4eb@mail.gmail.com>
Message-ID: <47424855.7020904@cs.byu.edu>

Luke Stebbing wrote:
> On 11/19/07, Arnaud Delobelle <arno at marooned.org.uk> wrote:
>> Self being explicit makes it less selfish :)
>> To illustrate, I like that you can do:
>>
>> class Foo(str):
>>     def mybar(self):
>>         class Bar(str):
>>             def madeby(me):
>>                 return "I am %s and I was made by %s" % (me, self)
>>         return Bar
>>
> 
> How about:
> 
> class Foo(str):
>     def mybar():
>         outer = self
>         class Bar(str):
>             def madeby():
>                 return "I am %s and I was made by %s" % (self, outer)
>         return Bar

Good point. I actually like this better, since it forces the outer scope 
self to have a different name, removing a source of confusion. Back down 
to two uncommon use cases so far, then.
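
For comparison, a sketch of the same renaming under today's explicit
self. No extra assignment is needed, because the outer method can name
its first parameter directly; "outer" is only an illustrative choice,
not something proposed in this thread:

```python
class Foo(str):
    def mybar(outer):
        # "outer" is the Foo instance; the nested method keeps the
        # conventional name "self" for the Bar instance.
        class Bar(str):
            def madeby(self):
                return "I am %s and I was made by %s" % (self, outer)
        return Bar


foo = Foo("foo")
Bar = foo.mybar()
bar = Bar("bar")
message = bar.madeby()
```

The nested method reaches the outer instance through an ordinary
closure, so the two names cannot be confused.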

Neil


From arno at marooned.org.uk  Tue Nov 20 08:05:51 2007
From: arno at marooned.org.uk (Arnaud Delobelle)
Date: Tue, 20 Nov 2007 07:05:51 -0000 (GMT)
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <dcb1979a0711191727j7dac67b8j40a6ccc64961d4eb@mail.gmail.com>
References: <4741DD34.1070106@cs.byu.edu>
	<ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>
	<4741F528.6000400@cs.byu.edu>
	<FDFCAF5A-FEF5-4859-ACDA-94692AB91972@marooned.org.uk>
	<dcb1979a0711191727j7dac67b8j40a6ccc64961d4eb@mail.gmail.com>
Message-ID: <51974.82.46.172.40.1195542351.squirrel@www.marooned.org.uk>


On Tue, November 20, 2007 1:27 am, Luke Stebbing wrote:
> On 11/19/07, Arnaud Delobelle <arno at marooned.org.uk> wrote:
>> Self being explicit makes it less selfish :)
>> To illustrate, I like that you can do:
>>
>> class Foo(str):
>>     def mybar(self):
>>         class Bar(str):
>>             def madeby(me):
>>                 return "I am %s and I was made by %s" % (me, self)
>>         return Bar
>>
>
> How about:
>
> class Foo(str):
>     def mybar():
>         outer = self
>         class Bar(str):
>             def madeby():
>                 return "I am %s and I was made by %s" % (self, outer)
>         return Bar
>
>
> Luke
>

I suppose, though it's a waste of a cell IMHO ;)

-- 
Arnaud




From ntoronto at cs.byu.edu  Tue Nov 20 09:50:47 2007
From: ntoronto at cs.byu.edu (ntoronto at cs.byu.edu)
Date: Tue, 20 Nov 2007 01:50:47 -0700 (MST)
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <47422B61.50703@canterbury.ac.nz>
References: <4741DD34.1070106@cs.byu.edu>
	<ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>
	<4741F528.6000400@cs.byu.edu> <47422B61.50703@canterbury.ac.nz>
Message-ID: <33030.10.7.75.26.1195548647.squirrel@mail.cs.byu.edu>

> Neil Toronto wrote:
>> Because the runtime enforces isinstance(D_instance, D), everything else
>> can be handled with D_instance.method(...) or self.method() or
>> super.method().
>
> But super() is not a general replacement for explicit inherited
> method calls. It's only appropriate in special, quite restricted
> circumstances.

Exactly. There are two common method-calling cases, and an uncommon one.
In order of expected number of occurrences, with #3 being quite low:

1. self.method(...)

2. super.method(...)

3. DistantParent.method(self, ...) (either to get out of the MRO or
because you're feeling evil - two use cases for it)

If self were only implicitly available, #3 would need a new spelling, as
you say. That's not hard to do, and I've already suggested
as_parent(DistantParent, self).method(...) as an alternate spelling for
the uncommon cases.
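
A rough sketch of the three cases, plus one way an as_parent helper
might look. The helper body is purely an assumption (only its call
spelling appears above), the class names are invented, and super() is
written in the form Python 3.0 ended up with:

```python
import functools


class DistantParent(object):
    def method(self):
        return ["DistantParent"]


class Parent(DistantParent):
    def method(self):
        return ["Parent"] + super().method()


class Child(Parent):
    def method(self):
        return ["Child"] + super().method()    # case 2: next in the MRO

    def via_self(self):
        return self.method()                   # case 1: normal dispatch

    def skip_parent(self):
        return DistantParent.method(self)      # case 3: explicit class


def as_parent(cls, obj):
    """Hypothetical helper: a view of obj whose methods come from cls."""
    class _View(object):
        def __getattr__(self, name):
            # Fetch the function from cls and pre-bind obj as its
            # first argument.
            return functools.partial(getattr(cls, name), obj)
    return _View()


c = Child()
chain = c.via_self()
skipped = c.skip_parent()
respelled = as_parent(DistantParent, c).method()
```

With these definitions, skipped and respelled are the same list, which
is the sense in which #3 only needs a new spelling.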

That's not to say I'm advocating such a thing for Python 3.0 - just
showing that it's possible to cover the current known use cases. Actually,
I suspect there aren't any more use cases, as all correct ways of calling
the method (those that don't raise an exception) are covered, and implicit
self would still be as accessible from anywhere as explicit self is.

Would saving six keystrokes per method, reducing noise in every method
header, and removing the need for a habit (always including self in the
parameter list) be enough to justify a change? I'm going to guess either
"no" or "not right now". If I were doing it from scratch, I'd make self
and super into keywords, and change method binding to return a function
with them included in the locals somehow.

Neil



From jimjjewett at gmail.com  Tue Nov 20 16:15:56 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 20 Nov 2007 10:15:56 -0500
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <33030.10.7.75.26.1195548647.squirrel@mail.cs.byu.edu>
References: <4741DD34.1070106@cs.byu.edu>
	<ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>
	<4741F528.6000400@cs.byu.edu> <47422B61.50703@canterbury.ac.nz>
	<33030.10.7.75.26.1195548647.squirrel@mail.cs.byu.edu>
Message-ID: <fb6fbf560711200715w251f6332se406b64e7c6e7756@mail.gmail.com>

On 11/20/07, ntoronto at cs.byu.edu <ntoronto at cs.byu.edu> wrote:
> > Neil Toronto wrote:
> >> Because the runtime enforces isinstance(D_instance, D), everything else
> >> can be handled with D_instance.method(...) or self.method() or
> >> super.method().

> > But super() is not a general replacement for explicit inherited
> > method calls. It's only appropriate in special, quite restricted
> > circumstances.

I would say it is almost always appropriate.  The times it fails are when

(1)  You want to change the name of the method.  Fair enough -- but
you can usually forward to self.othername

(2)  You want to change the arguments of the method.  Changing the
signature is generally a bad idea, though it is tolerable for
constructors.

(3)  You're explicitly managing the order of super-calls (==> fragile,
and the inheritance is already a problem)

(4)  Backwards compatibility with some other class that uses explicit
class names instead of super.

Number 4 is pretty common still, but it is just a backwards
compatibility hack that makes code more fragile.

> Would saving six keystrokes per method, reducing noise in every method
> header, and removing the need for a habit (always including self in the
> parameter list) be enough to justify a change? I'm going to guess either
> "no" or "not right now". If I were doing it from scratch, I'd make self
> and super into keywords, and change method binding to return a function
> with them included in the locals somehow.

Agreed.  The fact that method parameter lists look different at the
definition and call sites is an annoying wart, but ... too late to
change.

-jJ


From jimjjewett at gmail.com  Tue Nov 20 20:30:06 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 20 Nov 2007 14:30:06 -0500
Subject: [Python-ideas] Optional extra globals dict for function objects
In-Reply-To: <473F40B3.9030804@cs.byu.edu>
References: <473F40B3.9030804@cs.byu.edu>
Message-ID: <fb6fbf560711201130v25e0c375s3475deb428798c8f@mail.gmail.com>

On 11/17/07, Neil Toronto <ntoronto at cs.byu.edu> wrote:
> I set out trying to redo the 3.0 autosuper metaclass
> in 2.5 without bytecode hacking and ran into a problem:
>      a function's func_globals isn't polymorphic.
> That is, the interpreter uses PyDict_* calls to access it,
> and in one case (LOAD_GLOBAL), actually inlines
> PyDict_GetItem manually.

(1)  Is this just one of the "this must be a real dict, not just any
mapping" limits, or is there something else I'm missing?

(2)  Isn't the func_globals already (a read-only reference to) the
module's __dict__?  So is this really about changing the promise of
the module type, instead of just about func_globals?

Note that weakening the module.__dict__ promise to only meeting the
dict API would make it easier to implement the various
speed-up-globals suggestions.  And to be honest, I think that assuming
a UserDict.DictMixin wouldn't be that bad.  How often is a module's
dict used for anything time-critical except get (and maybe set,
delete, iterate)?

> If it weren't for this, I could have easily done 3.0 super
> without bytecode hacking, by making a custom dict that
> allows another dict to shadow it, and putting the new
> super object in the shadowing dict.

...

> I propose adding a read-only attribute func_extra_globals
> to the function object, default NULL. In the interpreter loop,
> global lookups try func_extra_globals first if it's not NULL.

Would this really be a global dict though, or just a closure inserted
between the func and the normal globals?

Is the real problem that you can't change which variables are in a
closure (rather than fully global) after the function is compiled?

-jJ


From ntoronto at cs.byu.edu  Tue Nov 20 21:44:51 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Tue, 20 Nov 2007 13:44:51 -0700
Subject: [Python-ideas] Optional extra globals dict for function objects
In-Reply-To: <fb6fbf560711201130v25e0c375s3475deb428798c8f@mail.gmail.com>
References: <473F40B3.9030804@cs.byu.edu>
	<fb6fbf560711201130v25e0c375s3475deb428798c8f@mail.gmail.com>
Message-ID: <47434743.4010305@cs.byu.edu>

Jim Jewett wrote:
> On 11/17/07, Neil Toronto <ntoronto at cs.byu.edu> wrote:
>> I set out trying to redo the 3.0 autosuper metaclass
>> in 2.5 without bytecode hacking and ran into a problem:
>>      a function's func_globals isn't polymorphic.
>> That is, the interpreter uses PyDict_* calls to access it,
>> and in one case (LOAD_GLOBAL), actually inlines
>> PyDict_GetItem manually.
> 
> (1)  Is this just one of the "this must be a real dict, not just any
> mapping" limits, or is there something else I'm missing?

That's all it is, yes.

> (2)  Isn't the func_globals already (a read-only reference to) the
> module's __dict__?  So is this really about changing the promise of
> the module type, instead of just about func_globals?

My original question was about extending (with an optional dictionary) 
the behavior of a function with regard to its func_globals. Because of 
speed concerns, I didn't suggest weakening the type constraint to allow 
just anything that meets the dict API.

> Note that weakening the module.__dict__ promise to only meeting the
> dict API would make it easier to implement the various
> speed-up-globals suggestions.

By "implement" do you mean proof-of-concept, final, or both? At least 
for proof-of-concept, I totally agree. And thanks for the use case 
(which sort of applies to my original flawed idea), my lack of which 
Brett has raked me over the coals for. :) (But it didn't hurt much!)

> And to be honest, I think that assuming
> a UserDict.DictMixin wouldn't be that bad.  How often is a module's
> dict used for anything time-critical except get (and maybe set,
> delete, iterate)?

I doubt that delete and iterate are common enough that they'd have to be 
regarded as time-critical. Maybe set - maybe. It hardly happens 
(especially compared to get), and when it does, it's almost never in a 
time-critical inner loop.

DictMixin is currently pure Python. That's a speed concern that wouldn't 
be *too* hard to address, I suppose.

>> I propose adding a read-only attribute func_extra_globals
>> to the function object, default NULL. In the interpreter loop,
>> global lookups try func_extra_globals first if it's not NULL.
> 
> Would this really be a global dict though, or just a closure inserted
> between the func and the normal globals?

Basically a customizable closure, yeah.

> Is the real problem that you can't change which variables are in a
> closure (rather than fully global) after the function is compiled?

Really, that's it. That's why I made the silly bytecode hack to insert 
function parameters, which actually works better than augmenting a 
function's globals with a polymorphic dict.

Assuming func_globals is a DictMixin is intriguing, though.

Neil


From greg.ewing at canterbury.ac.nz  Wed Nov 21 00:48:36 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 21 Nov 2007 12:48:36 +1300
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <33030.10.7.75.26.1195548647.squirrel@mail.cs.byu.edu>
References: <4741DD34.1070106@cs.byu.edu>
	<ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>
	<4741F528.6000400@cs.byu.edu> <47422B61.50703@canterbury.ac.nz>
	<33030.10.7.75.26.1195548647.squirrel@mail.cs.byu.edu>
Message-ID: <47437254.3070505@canterbury.ac.nz>

ntoronto at cs.byu.edu wrote:
> There are two common method-calling cases, and an uncommon one.
> In order of expected number of occurrences, with #3 being quite low:
> 
> 1. self.method(...)
> 
> 2. super.method(...)
> 
> 3. DistantParent.method(self, ...)

You're still missing an important case. I would rank
them as

1. self.method(...)

2. DirectParent.method(self, ...)

3. super.method(...)

4. DistantParent.method(self, ...)

Anything that made number 2 difficult would be unacceptable.

--
Greg


From greg.ewing at canterbury.ac.nz  Wed Nov 21 01:43:20 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 21 Nov 2007 13:43:20 +1300
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <fb6fbf560711200715w251f6332se406b64e7c6e7756@mail.gmail.com>
References: <4741DD34.1070106@cs.byu.edu>
	<ca471dc20711191111j13802ff1w640dce23636d095a@mail.gmail.com>
	<4741F528.6000400@cs.byu.edu> <47422B61.50703@canterbury.ac.nz>
	<33030.10.7.75.26.1195548647.squirrel@mail.cs.byu.edu>
	<fb6fbf560711200715w251f6332se406b64e7c6e7756@mail.gmail.com>
Message-ID: <47437F28.2040405@canterbury.ac.nz>

Jim Jewett wrote:

> I would say it is almost always appropriate.  The times it fails are when

(5) Someone multiply-inherits from your class, and you end
up calling one of their methods instead of yours, when neither
your method nor their method is expecting this to happen.

Plus various other problems. There's a good discussion of
the issues here:

http://fuhm.net/super-harmful/
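
A small sketch of case (5), with invented class names and the super()
spelling of released Python 3. Yours is written as if super() meant
Root, but once someone else combines it with Theirs, the next class in
the MRO is Theirs:

```python
class Root(object):
    def describe(self):
        return ["Root"]


class Yours(Root):
    def describe(self):
        # The author of Yours expects this to reach Root.describe.
        return ["Yours"] + super().describe()


class Theirs(Root):
    def describe(self):
        return ["Theirs"] + super().describe()


class Both(Yours, Theirs):
    # Written by a third party; neither Yours nor Theirs knows about it.
    pass


calls = Both().describe()
mro_names = [cls.__name__ for cls in Both.__mro__]
```

The call made inside Yours.describe lands in Theirs.describe, because
super() follows the MRO of the instance's class, not the base class
list of Yours.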

--
Greg


From aligrudi at gmail.com  Wed Nov 21 05:54:46 2007
From: aligrudi at gmail.com (Ali Gholami Rudi)
Date: Wed, 21 Nov 2007 08:24:46 +0330
Subject: [Python-ideas] Explicit self argument, implicit super argument
In-Reply-To: <4741DD34.1070106@cs.byu.edu>
References: <4741DD34.1070106@cs.byu.edu>
Message-ID: <20071121045446.GA2695@oojibishe>

On Mon, Nov 19, 2007 at 12:00:04PM -0700, Neil Toronto wrote:
> (Disclaimer: I have no issue with "self." and "super." attribute access, 
> which is what most people think of when they think "implicit self".)

I don't feel easy about the new super either (maybe from a different
perspective than Neil's).  Why should self be passed to methods using
a parameter while super uses magic (something like a global name
that holds different objects in different places)?

Instead of making self implicit, I'd like super to use less magic.  I
would much prefer super(self).foo(*args).  Some magic for finding the
surrounding class might be needed, but at least we wouldn't use the
first parameter of a method implicitly.  (I don't see this in the
alternative proposals section of :PEP:`3135`.)  It can be made
backward compatible, too.  I have not read the new super discussions,
so sorry if this has already been discussed.

-- Ali

From ntoronto at cs.byu.edu  Thu Nov 22 16:40:49 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Thu, 22 Nov 2007 08:40:49 -0700
Subject: [Python-ideas] Fast global cacheless lookup
Message-ID: <4745A301.5090201@cs.byu.edu>

I have a hack coded up against r59068 in which LOAD_GLOBAL is even 
faster than LOAD_FAST. It'll be the same with STORE_GLOBAL and the 
*_NAME opcodes after I'm done with it, and it should be fully 
transparent to Python code. (That is, you can go ahead and swap out 
__builtins__ and crazy junk like that and everything should work as it 
did before.) Regression tests all pass, except test_gc on functions - 
I've got a refcount bug somewhere.

Here's the microbenchmark I've been using to test LOAD_GLOBAL and LOAD_FAST:

import timeit
import dis

def test_local_get():
     x = 0
     x; x; x; #... and 397 more of them

if __name__ == '__main__':
     print dis.dis(test_local_get.func_code)
     print timeit.Timer('test_local_get()',
             'from locals_test import test_local_get').timeit()


The globals test puts 'x' in module scope, and the builtins test changes 
'x' to 'len' and doesn't assign it to 0.

Output right now:

r59068 locals: 15.57 sec
myhack locals: 15.61 sec (increase is probably insignificant or random)

r59068 globals: 23.61 sec
myhack globals: 15.14 sec (!)

r59068 builtins: 28.08 sec
myhack builtins: 15.26 sec (!!)

Of course, it's no good if it slows everything else way the heck down. 
So 10 rounds of pybench says:

r59068: mean 8.92, std 0.05
myhack: mean 8.99, std 0.04

 From what I see in pybench, globals access is severely underrepresented 
compared to real programs, so those numbers aren't representative of the 
possible difference in real-life performance.

Jim Jewett gave me the idea here:

http://mail.python.org/pipermail/python-ideas/2007-November/001207.html

"Note that weakening the module.__dict__ promise to only meeting the 
dict API would make it easier to implement the various speed-up-globals 
suggestions."

I didn't exactly do that, but it did get me thinking. The other 
proposals for speeding up globals access seemed to do their darndest to 
leave PyDictObject alone and ended up hideously complicated because of 
it. Here's the main idea for this one: What if a frame could maintain an 
array of pointers right into a dictionary's entry table? A global lookup 
would then consist of a couple of pointer dereferences, and any value 
change would show up immediately to the frame.

There was a dangerous dangling pointer problem inherent in that, so I 
formalized an update mechanism using an observer pattern.

Here's how it works. Arbitrary objects can register themselves with a 
dictionary as "entry observers". The dictionary keeps track of all the 
registered observers, and for certain events, makes a call to each one 
to tell them that something has changed. The entry observers get 
pointers to entries via PyDict_GetEntry, which is just like 
PyDict_GetItem, except it returns a PyDictEntry * right from the 
dictionary's entry table.

The dict notifies its observers on delitem, pop, popitem, resize and 
clear. Nothing else is necessary - nothing else will change the address 
of or invalidate an entry. There are very, very few changes in 
PyDictObject. In the general case, the pointer to the list of observers 
is NULL, and the only additional slowdown is when delitem, pop, popitem, 
resize and clear check that and move on - but those aren't called often.

So get, set, iter, contains, etc., are all exactly as fast as they were 
before. The biggest performance hit is when a highly-observed dict like 
__builtin__.__dict__ resizes, but that's rare enough to not worry about.
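
To make the notification rules concrete, here is a Python-level
stand-in, not the C patch itself. The class and method names are
invented, notifying after the change (rather than before) is an
assumption, and resize has no Python-level equivalent, so it is left
out:

```python
class ObservedDict(dict):
    """Toy dict that tells observers when an entry may become invalid."""

    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        # None plays the role of the NULL observer-list pointer.
        self._observers = None

    def register(self, observer):
        if self._observers is None:
            self._observers = []
        self._observers.append(observer)

    def _notify(self, event, key=None):
        # The common case: no observers, one cheap check, move on.
        if self._observers is not None:
            for observer in self._observers:
                observer(event, key)

    def __delitem__(self, key):
        dict.__delitem__(self, key)
        self._notify("delitem", key)

    def pop(self, key, *default):
        existed = key in self
        value = dict.pop(self, key, *default)
        if existed:
            self._notify("pop", key)
        return value

    def popitem(self):
        key, value = dict.popitem(self)
        self._notify("popitem", key)
        return key, value

    def clear(self):
        dict.clear(self)
        self._notify("clear")


events = []
d = ObservedDict(x=1, y=2)
d.register(lambda event, key: events.append((event, key)))
d["z"] = 3     # set: untouched code path, no notification
d["x"]         # get: untouched code path, no notification
del d["x"]
d.pop("y")
d.clear()
```

Only the three removing operations leave a trace in events; get and
set go through the unmodified dict code.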

To speed up globals access, an auxiliary object to functions and frames 
registers itself as an observer to func_globals and __builtins__. It 
makes an array of PyDictEntry pointers corresponding to 
func_code.co_names. PyEval_EvalFrameEx indexes that array first for 
global values, and updates it if there's one it couldn't find when the 
function was created.
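
A loose illustration of a table that runs parallel to co_names. It
uses the Python 3 attribute names (__code__ and __globals__ in place
of func_code and func_globals), the helper name is invented, and it
stores resolved values rather than PyDictEntry pointers, which Python
code cannot express:

```python
import builtins

x = 10


def uses_globals():
    return len(str(x))


def build_lookup_table(func):
    """Resolve each name in co_names once: globals first, then builtins."""
    table = []
    for name in func.__code__.co_names:
        if name in func.__globals__:
            table.append((name, "globals", func.__globals__[name]))
        elif hasattr(builtins, name):
            table.append((name, "builtins", getattr(builtins, name)))
        else:
            # Not found yet; the real hack would fill this in later.
            table.append((name, None, None))
    return table


table = build_lookup_table(uses_globals)
where = dict((name, scope) for name, scope, _ in table)
```

Unlike this snapshot, an array of entry pointers would see later
value changes without being rebuilt. Note also that co_names holds
attribute names as well as global names, so a real table would have
some unused slots.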

That's pretty much it. There are corner cases I still have to address, 
like what happens if someone replaces or deletes __builtins__, but it 
should be fairly easy to monitor that.

I'd love to hear your comments, everyone. I've glossed over a lot of 
implementation details, but I've tried to make the main ideas clear.

Neil


From guido at python.org  Thu Nov 22 16:46:21 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 22 Nov 2007 07:46:21 -0800
Subject: [Python-ideas] Fast global cacheless lookup
In-Reply-To: <4745A301.5090201@cs.byu.edu>
References: <4745A301.5090201@cs.byu.edu>
Message-ID: <ca471dc20711220746m65cc83cfm3e930236fc3e152c@mail.gmail.com>

Cool! Are you willing to show the code yet (bugs and all)?

Personally, I'm not sure that it's worth doing this for STORE_GLOBAL
(which should be rarely used in properly written code).

Some questions:

- what's the space & time impact for a dict with no watchers?

- does this do anything for builtins?

- could this be made to work for instance variables?

- what about exec(src, ns) where ns is a mapping but not a dict?

--Guido




-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From ntoronto at cs.byu.edu  Thu Nov 22 17:42:14 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Thu, 22 Nov 2007 09:42:14 -0700
Subject: [Python-ideas] Fast global cacheless lookup
In-Reply-To: <ca471dc20711220746m65cc83cfm3e930236fc3e152c@mail.gmail.com>
References: <4745A301.5090201@cs.byu.edu>
	<ca471dc20711220746m65cc83cfm3e930236fc3e152c@mail.gmail.com>
Message-ID: <4745B166.3040904@cs.byu.edu>

Guido van Rossum wrote:
> Cool! Are you willing to show the code yet (bugs and all)?

Sure! I stayed up all night doing it and today is Thanksgiving, so I'll 
probably not get to it for a little while. (I know making a patch 
shouldn't take long, but I've never done it before.) Should I post the 
patch here or somewhere else?

> Some questions:
> 
> - what's the space & time impact for a dict with no watchers?

I think it's almost negligible.

Space: There are four bytes extra on every dict for a pointer to the 
observer list. It may actually be zero or eight or more depending on 
alignment and malloc block size - I haven't looked.

Time: On dicts with no observers, dealloc, delitem, pop, popitem, clear, 
and resize pass through an "if (mp->ma_entryobs_list != NULL)". 
PyDict_New sets mp->ma_entryobs_list to NULL. Nothing else is affected.

> - does this do anything for builtins?

Right now it does well enough to get them quickly, but setting or 
deleting them elsewhere won't show up yet in the frame. And it doesn't 
handle the case where __builtins__ is replaced. Handling that will take 
some thought, but it shouldn't affect performance much.

Anyway, that part will work properly when I'm done.

> - could this be made to work for instance variables?

If my brain were thinking in straight lines, maybe I'd come up with 
something. :) I've got this fuzzy idea that it just might work. The hard 
part may be distinguishing LOAD_ATTR applied to self from LOAD_ATTR 
applied to something else. Hmm...

Something to digest while I'm digesting the Real Other White Meat. :)

> - what about exec(src, ns) where ns is a mapping but not a dict?

Good question - I don't know, but I think it should work, at least as 
well as it did before. If there's no observer attached to a frame, it'll 
default to its previous behavior. Another hmm...

Thanks for the prompt reply.

Neil

P.S. By the way, I'm very pleased with how clean and workable the 
codebase is. I actually cheered at the lack of warnings. My wife 
probably thinks I'm nuts. :D



From guido at python.org  Thu Nov 22 18:07:58 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 22 Nov 2007 09:07:58 -0800
Subject: [Python-ideas] Fast global cacheless lookup
In-Reply-To: <4745B166.3040904@cs.byu.edu>
References: <4745A301.5090201@cs.byu.edu>
	<ca471dc20711220746m65cc83cfm3e930236fc3e152c@mail.gmail.com>
	<4745B166.3040904@cs.byu.edu>
Message-ID: <ca471dc20711220907h1a19be67o49f9944290b91f1d@mail.gmail.com>

[Quick] The best way to post a patch is to put it in the bug tracker
at bugs.python.org, and post a link to the issue here. The best way to
create a patch is svn diff, assuming you started out with an anonymous
svn checkout (see python.org/dev/) and not just with a distro
download.

Looking forward to it!

--Guido

On Nov 22, 2007 8:42 AM, Neil Toronto <ntoronto at cs.byu.edu> wrote:
> Guido van Rossum wrote:
> > Cool! Are you willing to show the code yet (bugs and all)?
>
> Sure! I stayed up all night doing it and today is Thanksgiving, so I'll
> probably not get to it for a little while. (I know making a patch
> shouldn't take long, but I've never done it before.) Should I post the
> patch here or somewhere else?
>
> > Some questions:
> >
> > - what's the space & time impact for a dict with no watchers?
>
> I think it's almost negligible.
>
> Space: There are four bytes extra on every dict for a pointer to the
> observer list. It may actually be zero or eight or more depending on
> alignment and malloc block size - I haven't looked.
>
> Time: On dicts with no observers, dealloc, delitem, pop, popitem, clear,
> and resize pass through an "if (mp->ma_entryobs_list != NULL)".
> PyDict_New sets mp->ma_entryobs_list to NULL. Nothing else is affected.
>
> > - does this do anything for builtins?
>
> It does right now well enough to get them quickly, but setting or
> deleting them elsewhere won't show up yet in the frame. And it doesn't
> handle the case where __builtins__ is replaced. That'll take a little
> doing, but just mentally - it shouldn't affect performance much.
>
> Anyway, that part will work properly when I'm done.
>
> > - could this be made to work for instance variables?
>
> If my brain were thinking in straight lines, maybe I'd come up with
> something. :) I've got this fuzzy idea that it just might work. The hard
> part may be distinguishing LOAD_ATTR applied to self from LOAD_ATTR
> applied to something else. Hmm...
>
> Something to digest while I'm digesting the Real Other White Meat. :)
>
> > - what about exec(src, ns) where ns is a mapping but not a dict?
>
> Good question - I don't know, but I think it should work, at least as
> well as it did before. If there's no observer attached to a frame, it'll
> default to its previous behavior. Another hmm...
>
> Thanks for the prompt reply.
>
> Neil
>
> P.S. By the way, I'm very pleased with how clean and workable the
> codebase is. I actually cheered at the lack of warnings. My wife
> probably thinks I'm nuts. :D
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From ntoronto at cs.byu.edu  Thu Nov 22 18:16:17 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Thu, 22 Nov 2007 10:16:17 -0700
Subject: [Python-ideas] Fast global cacheless lookup
In-Reply-To: <ca471dc20711220907h1a19be67o49f9944290b91f1d@mail.gmail.com>
References: <4745A301.5090201@cs.byu.edu>	
	<ca471dc20711220746m65cc83cfm3e930236fc3e152c@mail.gmail.com>	
	<4745B166.3040904@cs.byu.edu>
	<ca471dc20711220907h1a19be67o49f9944290b91f1d@mail.gmail.com>
Message-ID: <4745B961.20408@cs.byu.edu>

Guido van Rossum wrote:
> [Quick] The best way to post a patch is to put it in the bug tracker
> at bugs.python.org, and post a link to the issue here. The best way to
> create a patch is svn diff, assuming you started out with an anonymous
> svn checkout (see python.org/dev/) and not just with a distro
> download.

Figures. I couldn't get it through svn because my university has a 
transparent proxy that doesn't like REPORT requests. Is there any chance 
of getting https enabled at svn.python.org sometime so I don't have to 
stick to snapshots?

Anyway, I'll get to this sometime after I get to bed.

> Looking forward to it!

Yes sir! :)

Neil



From facundobatista at gmail.com  Thu Nov 22 18:26:05 2007
From: facundobatista at gmail.com (Facundo Batista)
Date: Thu, 22 Nov 2007 14:26:05 -0300
Subject: [Python-ideas] Fast global cacheless lookup
In-Reply-To: <4745B961.20408@cs.byu.edu>
References: <4745A301.5090201@cs.byu.edu>
	<ca471dc20711220746m65cc83cfm3e930236fc3e152c@mail.gmail.com>
	<4745B166.3040904@cs.byu.edu>
	<ca471dc20711220907h1a19be67o49f9944290b91f1d@mail.gmail.com>
	<4745B961.20408@cs.byu.edu>
Message-ID: <e04bdf310711220926jfdea37fv6acaf544eb4bfdd6@mail.gmail.com>

2007/11/22, Neil Toronto <ntoronto at cs.byu.edu>:

> Figures. I couldn't get it through svn because my university has a
> transparent proxy that doesn't like REPORT requests. Is there any chance
> of getting https enabled at svn.python.org sometime so I don't have to
> stick to snapshots?

I have the same problem in the virtualized Ubuntu at work, but I work
around it like this:

1. I create a dynamic tunnel with SSH to my machine at home.

2. I run svn through tsocks, which sends the SVN traffic through the
tunnel (the tunnel acts as a SOCKS proxy).

Here is a detailed how-to, but in Spanish:

    http://www.taniquetil.com.ar/plog/post/1/303

If you can access another machine with SSH, feel free to contact me
directly if you need help setting up something like this.

Regards,

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/


From gnewsg at gmail.com  Thu Nov 22 22:04:05 2007
From: gnewsg at gmail.com (Giampaolo Rodola')
Date: Thu, 22 Nov 2007 13:04:05 -0800 (PST)
Subject: [Python-ideas] os.listdir iteration support
Message-ID: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>

Hi to all,
I would find it very useful to have a version of os.listdir that
returns a generator.
If a directory has many files, say 20,000, it could take a long time
to get all of them with os.listdir, and this could be a problem in
asynchronous environments (e.g. asynchronous servers).

The only solution that comes to mind in such a case is using a
thread/fork, or having a non-blocking version of listdir() that returns
an iterator.

What do you think about that?


From eyal.lotem at gmail.com  Thu Nov 22 23:07:10 2007
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Fri, 23 Nov 2007 00:07:10 +0200
Subject: [Python-ideas] Fast global cacheless lookup
In-Reply-To: <ca471dc20711220907h1a19be67o49f9944290b91f1d@mail.gmail.com>
References: <4745A301.5090201@cs.byu.edu>
	<ca471dc20711220746m65cc83cfm3e930236fc3e152c@mail.gmail.com>
	<4745B166.3040904@cs.byu.edu>
	<ca471dc20711220907h1a19be67o49f9944290b91f1d@mail.gmail.com>
Message-ID: <b64f365b0711221407l7564b507p7359c227866fd230@mail.gmail.com>

Hey, I had a very similar idea and implementation back in June (that
also passed all regression tests):
http://mail.python.org/pipermail/python-ideas/2007-June/000902.html

When I read Neil's mail I almost thought it was my old mail :-)

Unfortunately, when I posted my optimization, it pretty much got
ignored. Maybe I have not worded it properly.

The main difference between our implementations, if I understand
Neil's explanation correctly, is that you use direct ptrs into the
dict and notify the ptr holders of relocations.

I used a different method, where you call a new PyDict_ExportKey
method and it creates a mediating element.  The mediating element has
a fixed position so it can be dereferenced directly.  Direct access to
it is just as fast, but it may slightly affect dict performance.

I think a hybrid approach similar to Neil's, but with a mediating
object to represent the access to the dict and do the observing for
its user could be nicer (hell, Neil might already be doing this).

P.S: I also had a more ambitious plan, after eliminating
globals/builtins dict lookups, to use mro caches more aggressively
with this optimization and type-specialization on code objects, to
also eliminate class-side dict lookups. The user can also eliminate
instance-side dict lookups via __slots__ - effectively allowing the
conversion of virtually _all_ namespace dict lookups in pure Python
code to be direct memory dereferences, isn't that exciting? :-)

Eyal

On Nov 22, 2007 7:07 PM, Guido van Rossum <guido at python.org> wrote:
> [Quick] The best way to post a patch is to put it in the bug tracker
> at bugs.python.org, and post a link to the issue here. The best way to
> create a patch is svn diff, assuming you started out with an anonymous
> svn checkout (see python.org/dev/) and not just with a distro
> download.
>
> Looking forward to it!
>
> --Guido


From tjreedy at udel.edu  Fri Nov 23 00:11:46 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 22 Nov 2007 18:11:46 -0500
Subject: [Python-ideas] Fast global cacheless lookup
References: <4745A301.5090201@cs.byu.edu><ca471dc20711220746m65cc83cfm3e930236fc3e152c@mail.gmail.com><4745B166.3040904@cs.byu.edu><ca471dc20711220907h1a19be67o49f9944290b91f1d@mail.gmail.com>
	<b64f365b0711221407l7564b507p7359c227866fd230@mail.gmail.com>
Message-ID: <fi52bh$nho$1@ger.gmane.org>


"Eyal Lotem" <eyal.lotem at gmail.com> wrote in 
message news:b64f365b0711221407l7564b507p7359c227866fd230 at mail.gmail.com...
| Hey, I had a very similar idea and implementation back in June (that
| also passed all regression tests):
| http://mail.python.org/pipermail/python-ideas/2007-June/000902.html
|
| When I read Neil's mail I almost thought it was my old mail :-)
|
| Unfortunately, when I posted my optimization, it pretty much got
| ignored. Maybe I have not worded it properly.

Rereading your post, I think Neil's was a bit clearer, partly because it 
had more details.  In particular, I see no mention of ...

| I used a different method, where you call a new PyDict_ExportKey
| method and it creates a mediating element.  The mediating element has
| a fixed position so it can be dereferenced directly.

(which I do not quite get, actually, but that is probably just me.)

More important, I think, is timing.  Last June, the focus was on defining 
what 3.0 would consist of.  Now that that is mostly done, and the result 
found to be slower than 2.5, I think more attention is available for speed 
issues.

It will be great if the two of you can come up with a clean lookup speedup 
that avoids any showstopper issues.  This issue has been rumbling around 
'in the basement' for several years.

Terry Jan Reedy





From tjreedy at udel.edu  Fri Nov 23 00:25:06 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 22 Nov 2007 18:25:06 -0500
Subject: [Python-ideas] os.listdir iteration support
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>
Message-ID: <fi534h$pii$1@ger.gmane.org>


"Giampaolo Rodola'" <gnewsg at gmail.com> wrote 
in message 
news:d827975f-7c1e-471e-bac1-8d55262ab122 at d27g2000prf.googlegroups.com...

| I would find very useful having a version of os.listdir returning a
| generator.

If there are no technical issues in the way, such a replacement (rather 
than addition) would be in line with other list -> iterator replacements in 
3.0 (range, dict.items, etc).  A list could then be obtained with 
list(os.listdir(path)).

tjr






From greg.ewing at canterbury.ac.nz  Fri Nov 23 00:33:15 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 23 Nov 2007 12:33:15 +1300
Subject: [Python-ideas] Fast global cacheless lookup
In-Reply-To: <4745B166.3040904@cs.byu.edu>
References: <4745A301.5090201@cs.byu.edu>
	<ca471dc20711220746m65cc83cfm3e930236fc3e152c@mail.gmail.com>
	<4745B166.3040904@cs.byu.edu>
Message-ID: <474611BB.70608@canterbury.ac.nz>

Neil Toronto wrote:
> The hard 
> part may be distinguishing LOAD_ATTR applied to self from LOAD_ATTR 
> applied to something else.

Why would you *want* to distinguish that? A decent attribute
lookup acceleration mechanism should work for attributes of
any object, not just self. Think method calls, which are
probably even more common than accesses to globals.

--
Greg


From guido at python.org  Fri Nov 23 02:40:45 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 22 Nov 2007 17:40:45 -0800
Subject: [Python-ideas] os.listdir iteration support
In-Reply-To: <fi534h$pii$1@ger.gmane.org>
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>
	<fi534h$pii$1@ger.gmane.org>
Message-ID: <ca471dc20711221740n7bdc56eeoa64b083b50cd09dd@mail.gmail.com>

On Nov 22, 2007 3:25 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> "Giampaolo Rodola'" <gnewsg at gmail.com> wrote
> > I would find very useful having a version of os.listdir returning a
> > generator.
>
> If there are no technical issues in the way, such a replacement (rather
> than addition) would be in line with other list -> iterator replacements in
> 3.0 (range, dict,items, etc).  A list could then be obtained with
> list(os.listdir).

But how common is this use case really?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz at pythoncraft.com  Fri Nov 23 05:59:02 2007
From: aahz at pythoncraft.com (Aahz)
Date: Thu, 22 Nov 2007 20:59:02 -0800
Subject: [Python-ideas] os.listdir iteration support
In-Reply-To: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>
Message-ID: <20071123045902.GA4136@panix.com>

On Thu, Nov 22, 2007, Giampaolo Rodola' wrote:
>
> I would find very useful having a version of os.listdir returning a
> generator.  If a directory has many files, say 20,000, it could take
> a long time getting all of them with os.listdir and this could be a
> problem in asynchronous environments (e.g. asynchronous servers).
>
> The only solution which comes to my mind in such case is using a
> thread/fork or having a non-blocking version of listdir() returning an
> iterator.
>
> What do you think about that?

-1

The problem is that reading a directory requires an open file handle;
given a generator context, there's no clear mechanism for determining
when to close the handle.  Because the list needs to be created in the
first place, why bother with a generator?
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Typing is cheap.  Thinking is expensive."  --Roy Smith


From adam at atlas.st  Fri Nov 23 06:54:48 2007
From: adam at atlas.st (Adam Atlas)
Date: Fri, 23 Nov 2007 00:54:48 -0500
Subject: [Python-ideas] os.listdir iteration support
In-Reply-To: <20071123045902.GA4136@panix.com>
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>
	<20071123045902.GA4136@panix.com>
Message-ID: <DF8BD948-5E96-4303-A7D8-CDF696147A62@atlas.st>


On 22 Nov 2007, at 23:59, Aahz wrote:
> The problem is that reading a directory requires an open file handle;
> given a generator context, there's no clear mechanism for determining
> when to close the handle.

Whenever the generator is __del__ed, or whenever the iteration  
completes, whichever comes first?

> Because the list needs to be created in the first place

How so?



From greg.ewing at canterbury.ac.nz  Fri Nov 23 08:01:43 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 23 Nov 2007 20:01:43 +1300
Subject: [Python-ideas] os.listdir iteration support
In-Reply-To: <DF8BD948-5E96-4303-A7D8-CDF696147A62@atlas.st>
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>
	<20071123045902.GA4136@panix.com>
	<DF8BD948-5E96-4303-A7D8-CDF696147A62@atlas.st>
Message-ID: <47467AD7.8070702@canterbury.ac.nz>

Adam Atlas wrote:
> On 22 Nov 2007, at 23:59, Aahz wrote:
> 
>>The problem is that reading a directory requires an open file handle;
>>given a generator context, there's no clear mechanism for determining
>>when to close the handle.
> 
> Whenever the generator is __del__ed, or whenever the iteration  
> completes, whichever comes first?

Maybe what we really want is the functionality of
the C opendir and readdir functions exposed in the os
module. Then we could have an explicit method for
closing the file handle.

--
Greg



From ntoronto at cs.byu.edu  Fri Nov 23 08:18:37 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Fri, 23 Nov 2007 00:18:37 -0700
Subject: [Python-ideas] os.listdir iteration support
In-Reply-To: <DF8BD948-5E96-4303-A7D8-CDF696147A62@atlas.st>
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>	<20071123045902.GA4136@panix.com>
	<DF8BD948-5E96-4303-A7D8-CDF696147A62@atlas.st>
Message-ID: <47467ECD.1090406@cs.byu.edu>

Adam Atlas wrote:
> On 22 Nov 2007, at 23:59, Aahz wrote:
>> Because the list needs to be created in the first place
> 
> How so?

It doesn't, actually. On Windows, os.listdir uses FindFirstFile and 
FindNextFile, on OS/2 it's DosFindFirst and DosFindNext, and on 
everything else it's POSIX opendir and readdir. All of these are 
incremental, so a generator is the most natural way to expose the 
underlying API.

That's just a set of facts and a single opinion. Past that I personally 
have no preference.

Neil


From ntoronto at cs.byu.edu  Fri Nov 23 09:26:19 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Fri, 23 Nov 2007 01:26:19 -0700
Subject: [Python-ideas] Fast global cacheless lookup
In-Reply-To: <b64f365b0711221407l7564b507p7359c227866fd230@mail.gmail.com>
References: <4745A301.5090201@cs.byu.edu>	
	<ca471dc20711220746m65cc83cfm3e930236fc3e152c@mail.gmail.com>	
	<4745B166.3040904@cs.byu.edu>	
	<ca471dc20711220907h1a19be67o49f9944290b91f1d@mail.gmail.com>
	<b64f365b0711221407l7564b507p7359c227866fd230@mail.gmail.com>
Message-ID: <47468EAB.905@cs.byu.edu>

Eyal Lotem wrote:
> Hey, I had a very similar idea and implementation back in June (that
> also passed all regression tests):
> http://mail.python.org/pipermail/python-ideas/2007-June/000902.html
> 
> When I read Neil's mail I almost thought it was my old mail :-)
> 
> Unfortunately, when I posted my optimization, it pretty much got
> ignored. Maybe I have not worded it properly.
> 
> The main difference between our implementations, if I understand
> Neil's explanation correctly, is that you use direct ptrs into the
> dict and notify the ptr holders of relocations.
> 
> I used a different method, where you call a new PyDict_ExportKey
> method and it creates a mediating element.  The mediating element has
> a fixed position so it can be dereferenced directly.  Direct access to
> it is just as fast, but it may be slightly affecting dict performance.

Nicely done. :)

> I think a hybrid approach similar to Neil's, but with a mediating
> object to represent the access to the dict and do the observing for
> its user could be nicer (hell, Neil might already be doing this).

I am, actually. I originally had the observer be the function object 
itself, but that presented problems with generators, which create a 
frame object from a function and then dump the function. I had assumed 
that a frame would never outlast the function object it was created from 
and ended up with dangling pointers. D'oh!

Anyway, it's correct now and the details are well-abstracted. The 
mediating object is called PyFastGlobalsAdapter. ("Adapter" because it 
allows you to getitem/setitem a dict like you do a list - using an 
index.) It gets bunted about among functions, frames, and eval code. It 
basically has the following members:

     PyObject *globals;      /* PyDictObject only */
     PyObject *names;        /* From func_code->co_names */
     PyDictEntry **entries;  /* Struct pointers into globals entries */

(I've omitted the details for builtins because I haven't got them 
totally worked out yet. I'll probably have a PyObject *builtins and a 
PyDictEntry *builtins_entry pointing at the globals dict so I can detect 
when __builtins__ is replaced within globals.)

On init, it registers itself with the globals dict and starts keeping 
track of pointers to dict entries in "entries". "entries" is the same 
length as "names". Getting the value globals[names[i]] is done by just 
referencing entries[i]->me_value. It's very quick. :)

There's a PyFastGlobals_GetItem(PyObject *fg, int index) that does it 
for you and also does the necessary bookkeeping to update the 
PyDictEntry pointers when there's a miss. (entries[i] == NULL; happens 
when a key is anticipated but not in the dict at first.)

I agree that the dict observer interface + an adapter is a good way to 
go. The dict part should be flexible, lean and fast (like dicts 
themselves), and the simple observer interface does just that. The 
adapter keeps it all correct and refcount-y, and provides a convenient 
way to get values by index.

Would it be worth it to expose dict adapters as a Python object? Then 
Python code could do this kind of crazy stuff:

     d = {<some dict>}
     a = dictadapter(d, ('keys', 'i', 'want', 'fast', 'access', 'to'))
     a[0] == d['keys'] and a[1] == d['i']  #..., etc. => True

That could make it a lot easier to experiment with fast cacheless dict 
lookups in other contexts. The problem is, I have no idea what those 
contexts might be, at least within Python code. :)
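
To make the intended semantics concrete, here is a minimal pure-Python
stand-in for the hypothetical dictadapter above. It is a behavioural
sketch only: the class name DictAdapter is an assumption, and it still
performs an ordinary hash lookup on every access, whereas the C adapter
would follow a stored PyDictEntry pointer.

```python
class DictAdapter:
    """Index-based access to a fixed tuple of keys in a dict."""

    def __init__(self, d, names):
        self._d = d
        self._names = tuple(names)

    def __len__(self):
        return len(self._names)

    def __getitem__(self, index):
        # Index -> key -> value; raises KeyError if the key is absent.
        return self._d[self._names[index]]

    def __setitem__(self, index, value):
        self._d[self._names[index]] = value


d = {"keys": 1, "i": 2, "want": 3}
a = DictAdapter(d, ("keys", "i", "want"))
a[1] = 20   # same effect as d["i"] = 20
print(a[0] == d["keys"] and a[1] == d["i"])
```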

> P.S: I also had a more ambitious plan, after eliminating
> globals/builtins dict lookups, to use mro caches more aggressively
> with this optimization and type-specialization on code objects, to
> also eliminate class-side dict lookups. The user can also eliminate
> instance-side dict lookupts via __slots__ - effectively allowing the
> conversion of virtually _all_ namespace dict lookups in pure Python
> code to be direct memory dereferences, isn't that exciting? :-)

Extremely! There's no reason it couldn't be done. But I'm not exactly 
sure what you mean (seeing the idea only from the context of my own 
hacks), so could you elaborate a bit? :D  When you said "mro" I 
immediately thought of this adapter layout:

     PyList of MRO class dicts
     PyTuple of common attribute names (or list of tuples)
     PyDictEntry * array per MRO class

But where would the name index come from? The co_names tuples are 
different for each method.

Neil


From ntoronto at cs.byu.edu  Fri Nov 23 09:42:51 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Fri, 23 Nov 2007 01:42:51 -0700
Subject: [Python-ideas] Fast global cacheless lookup
In-Reply-To: <474611BB.70608@canterbury.ac.nz>
References: <4745A301.5090201@cs.byu.edu>	<ca471dc20711220746m65cc83cfm3e930236fc3e152c@mail.gmail.com>	<4745B166.3040904@cs.byu.edu>
	<474611BB.70608@canterbury.ac.nz>
Message-ID: <4746928B.60101@cs.byu.edu>

Greg Ewing wrote:
> Neil Toronto wrote:
>> The hard 
>> part may be distinguishing LOAD_ATTR applied to self from LOAD_ATTR 
>> applied to something else.
> 
> Why would you *want* to distinguish that? A decent attribute
> lookup acceleration mechanism should work for attributes of
> any object, not just self. Think method calls, which are
> probably even more common than accesses to globals.

Now that's a durned good point. My cute little hack can be used anywhere 
you have a mostly-static dict (or at least one that grows infrequently) 
and a tuple of keys for which you want to repeatedly get or set values. 
As long as lookups start as tuple indexes (like indexes into co_names 
and such), things go fast.

I'm still a bit fuzzy about how it would be used with LOAD_ATTR. Let's 
restrict it to just accelerating self.<attr> lookups for now. The oparg 
to LOAD_ATTR and STORE_ATTR is the co_names index, so co_names is again 
the tuple of keys. But it seems like you'd need an adapter (see previous 
reply to Eyal for terminology) for each pair of (self, method). Is there 
a better way?
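
The co_names relationship mentioned above can be inspected from Python
itself. This is a small sketch on a made-up class, Point; the exact
encoding of the LOAD_ATTR oparg has varied between CPython versions, so
only the contents of co_names are relied on here, and the disassembly
is printed just for reading.

```python
import dis


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm1(self):
        return abs(self.x) + abs(self.y)


# Attribute names used by the method are stored once in co_names;
# the attribute opcodes refer to them by position.
names = Point.norm1.__code__.co_names
print(names)
dis.dis(Point.norm1)
```

Note that each method has its own co_names tuple, which is why an
adapter keyed on co_names indexes would be per-method.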

Neil


From g.brandl at gmx.net  Fri Nov 23 09:06:43 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 23 Nov 2007 09:06:43 +0100
Subject: [Python-ideas] os.listdir iteration support
In-Reply-To: <47467AD7.8070702@canterbury.ac.nz>
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>	<20071123045902.GA4136@panix.com>	<DF8BD948-5E96-4303-A7D8-CDF696147A62@atlas.st>
	<47467AD7.8070702@canterbury.ac.nz>
Message-ID: <fi657n$2k4$1@ger.gmane.org>

Greg Ewing schrieb:
> Adam Atlas wrote:
>> On 22 Nov 2007, at 23:59, Aahz wrote:
>> 
>>>The problem is that reading a directory requires an open file handle;
>>>given a generator context, there's no clear mechanism for determining
>>>when to close the handle.
>> 
>> Whenever the generator is __del__ed, or whenever the iteration  
>> completes, whichever comes first?
> 
> Maybe what we really want is the functionality of
> the C opendir and readdir functions exposed in the os
> module. Then we could have an explicit method for
> closing the file handle.

What about an os.iterdir() generator which uses opendir/readdir as proposed?
The generator's close() could also call closedir(), and you could have a
warning in the docs about making sure to have it closed at some point.
One could even use an enclosing with closing(os.iterdir()) as d: block.
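
A sketch of how such a generator could look, assuming a modern Python 3
in which os.scandir is available (added in 3.5, with close() in 3.6).
The function name iterdir is the hypothetical one from this proposal,
not an existing os function, and os.scandir stands in for direct
opendir/readdir calls.

```python
import os
from contextlib import closing


def iterdir(path="."):
    """Yield the entry names in path one at a time."""
    scanner = os.scandir(path)   # opened lazily, on the first next()
    try:
        for entry in scanner:
            yield entry.name
    finally:
        # Runs when iteration completes, or when the generator's
        # close() is called (directly or by garbage collection).
        scanner.close()


# Stopping early: closing() calls the generator's close() on exit,
# which triggers the finally block and releases the directory handle.
with closing(iterdir(".")) as names:
    for name in names:
        if name.endswith(".py"):
            break
```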

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From greg.ewing at canterbury.ac.nz  Fri Nov 23 10:30:57 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 23 Nov 2007 22:30:57 +1300
Subject: [Python-ideas] Fast global cacheless lookup
In-Reply-To: <4746928B.60101@cs.byu.edu>
References: <4745A301.5090201@cs.byu.edu>
	<ca471dc20711220746m65cc83cfm3e930236fc3e152c@mail.gmail.com>
	<4745B166.3040904@cs.byu.edu> <474611BB.70608@canterbury.ac.nz>
	<4746928B.60101@cs.byu.edu>
Message-ID: <47469DD1.3060004@canterbury.ac.nz>

Neil Toronto wrote:
> But it seems like you'd need an adapter (see previous 
> reply to Eyal for terminology) for each pair of (self, method). Is there 
> a better way?

I started writing down some ideas for this, but then I
realised that it doesn't really extend to attribute
lookup in general. The reason is that only some kinds of
attribute have their values stored in dict entries --
mainly just instance variables of user-defined class
instances. Bound methods, attributes of built-in objects,
etc., would be left out.

I think the way to approach this is to have a global
cache which is essentially a dictionary mapping (obj, name)
pairs to some object that knows how to set or get the
attribute value as directly as possible. While this
wouldn't eliminate dict lookups entirely, in the case
of a cache hit it would just be a single lookup instead
of potentially many.

Some of the ideas behind your adapter might be carried
over, such as the idea of callbacks triggered by changes to
the underlying objects to help keep the cache up to date.
But there would probably have to be a variety of such
callback mechanisms for use by different kinds of objects.
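
A toy model of such a cache, as a sketch under stated assumptions: the
names AttrCache, load and invalidate are invented here, keys use the
object's identity, and the cache keeps its objects alive (a real design
would need weak references or similar). Only plain instance variables
get a direct accessor; everything else falls back to a full getattr.

```python
class AttrCache:
    """Global-style cache: (object identity, name) -> accessor."""

    def __init__(self):
        self._entries = {}   # (id(obj), name) -> (obj, getter)
        self.hits = 0
        self.misses = 0

    def load(self, obj, name):
        key = (id(obj), name)
        entry = self._entries.get(key)
        if entry is None:
            self.misses += 1
            # Keep obj in the entry so its id cannot be reused
            # while the entry is cached.
            entry = (obj, self._make_getter(obj, name))
            self._entries[key] = entry
        else:
            self.hits += 1
        return entry[1]()

    @staticmethod
    def _make_getter(obj, name):
        inst_dict = getattr(obj, "__dict__", None)
        if (isinstance(inst_dict, dict) and name in inst_dict
                and not hasattr(type(obj), name)):
            # Plain instance variable: read the instance dict directly.
            return lambda: inst_dict[name]
        # Methods, descriptors, built-in attributes: full lookup.
        return lambda: getattr(obj, name)

    def invalidate(self, obj, name):
        """Callback to run when obj (or its class) changes shape."""
        self._entries.pop((id(obj), name), None)


class Account:
    def __init__(self):
        self.balance = 10

    def describe(self):
        return "account"


cache = AttrCache()
acct = Account()
print(cache.load(acct, "balance"))     # miss, builds an accessor
acct.balance = 25
print(cache.load(acct, "balance"))     # hit, sees the new value
print(cache.load(acct, "describe")())  # bound method via fallback
```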

--
Greg



From greg.ewing at canterbury.ac.nz  Fri Nov 23 12:11:35 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 24 Nov 2007 00:11:35 +1300
Subject: [Python-ideas] os.listdir iteration support
In-Reply-To: <fi657n$2k4$1@ger.gmane.org>
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>
	<20071123045902.GA4136@panix.com>
	<DF8BD948-5E96-4303-A7D8-CDF696147A62@atlas.st>
	<47467AD7.8070702@canterbury.ac.nz> <fi657n$2k4$1@ger.gmane.org>
Message-ID: <4746B567.2020806@canterbury.ac.nz>

Georg Brandl wrote:
> What about an os.iterdir() generator which uses opendir/readdir as proposed?

I was feeling in the mood for a diversion, so I whipped up
a Pyrex prototype of an opendir() object that can be used
either as a file-like object or an iterator.

Here's the docstring:

   """opendir(pathname) --> an open directory object

   Opens a directory and provides incremental access to
   the filenames it contains. May be used as a file-like
   object or as an iterator.

   When used as a file-like object, each call to read()
   returns one filename, or an empty string when the end
   of the directory is reached. The close() method should
   be called when finished with the directory.

   The close() method should also be called when used as
   an iterator and iteration is stopped prematurely. If
   iteration proceeds to completion, the directory is
   closed automatically."""
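
For readers without Pyrex, the behaviour in this docstring can be
approximated in pure Python. This is a sketch, not the attached
prototype: the class name OpenDir is an assumption, it is built on
os.scandir (Python 3.6+ for close()), it omits tell/seek, and unlike
raw readdir it never returns the "." and ".." entries.

```python
import os


class OpenDir:
    """Incremental directory reader: file-like read() plus iteration."""

    def __init__(self, pathname):
        self._scanner = os.scandir(pathname)

    def read(self):
        """Return the next filename, or "" at the end or after close()."""
        if self._scanner is None:
            return ""
        entry = next(self._scanner, None)
        if entry is None:
            self.close()   # end of directory: release the handle
            return ""
        return entry.name

    def close(self):
        """Release the directory handle; safe to call more than once."""
        if self._scanner is not None:
            self._scanner.close()
            self._scanner = None

    def __iter__(self):
        return self

    def __next__(self):
        name = self.read()
        if not name:
            raise StopIteration
        return name


d = OpenDir(".")
for name in d:
    print("   ", name)
d.close()   # harmless here; needed if the loop had stopped early
```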

Source, setup.py and a brief test attached.

--
Greg
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: opendir.pyx
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20071124/8a5a0a90/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: setup.py
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20071124/8a5a0a90/attachment-0001.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test.py
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20071124/8a5a0a90/attachment-0002.ksh>

From gnewsg at gmail.com  Fri Nov 23 15:06:01 2007
From: gnewsg at gmail.com (Giampaolo Rodola')
Date: Fri, 23 Nov 2007 06:06:01 -0800 (PST)
Subject: [Python-ideas] os.listdir iteration support
In-Reply-To: <ca471dc20711221740n7bdc56eeoa64b083b50cd09dd@mail.gmail.com>
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>
	<fi534h$pii$1@ger.gmane.org>
	<ca471dc20711221740n7bdc56eeoa64b083b50cd09dd@mail.gmail.com>
Message-ID: <85d8d06e-6287-4dbf-9f2b-89bf4dfe662b@w28g2000hsf.googlegroups.com>

IMHO, not so unusual.
The first examples that come to mind are HTTP and FTP servers, which
commonly have to list the contents of local directories.
FTP servers, in particular, have to do that VERY often.

On 23 Nov, 02:40, "Guido van Rossum" <gu... at python.org> wrote:
> On Nov 22, 2007 3:25 PM, Terry Reedy <tjre... at udel.edu> wrote:
>
> > "Giampaolo Rodola'" <gne... at gmail.com> wrote
> > > I would find very useful having a version of os.listdir returning a
> > > generator.
>
> > If there are no technical issues in the way, such a replacement (rather
> > than addition) would be in line with other list -> iterator replacements in
> > 3.0 (range, dict.items, etc).  A list could then be obtained with
> > list(os.listdir).
>
> But how common is this use case really?
>
> --
> --Guido van Rossum (home page:http://www.python.org/~guido/)
> _______________________________________________
> Python-ideas mailing list
> Python-id... at python.org
> http://mail.python.org/mailman/listinfo/python-ideas


From gnewsg at gmail.com  Fri Nov 23 15:12:30 2007
From: gnewsg at gmail.com (Giampaolo Rodola')
Date: Fri, 23 Nov 2007 06:12:30 -0800 (PST)
Subject: [Python-ideas] os.listdir iteration support
In-Reply-To: <4746B567.2020806@canterbury.ac.nz>
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>
	<20071123045902.GA4136@panix.com>
	<DF8BD948-5E96-4303-A7D8-CDF696147A62@atlas.st> 
	<47467AD7.8070702@canterbury.ac.nz> <fi657n$2k4$1@ger.gmane.org> 
	<4746B567.2020806@canterbury.ac.nz>
Message-ID: <e942890c-cebc-40a9-8bfd-13da5c4a2e35@w40g2000hsb.googlegroups.com>

On 23 Nov, 12:11, Greg Ewing <greg.ew... at canterbury.ac.nz> wrote:
> Georg Brandl wrote:

> from opendir import opendir
>
> print "READ"
> d = opendir(".")
> while 1:
>         name = d.read()
>         if not name:
>                 break
>         print "   ", name
> print "EOF"
>
> print "ITERATE"
> d = opendir(".")
> for name in d:
>         print "   ", name
> print "STOP"
>
> print "TELL/SEEK"
> d = opendir(".")
> for i in range(3):
>         name = d.read()
>         print "   ", name
> pos = d.tell()
> for i in range(3):
>         name = d.read()
>         print "   ", name
> d.seek(pos)
> while 1:
>         name = d.read()
>         if not name:
>                 break
>         print "   ", name
> print "EOF"

This is exactly the usage I was talking about.


From aahz at pythoncraft.com  Fri Nov 23 15:39:39 2007
From: aahz at pythoncraft.com (Aahz)
Date: Fri, 23 Nov 2007 06:39:39 -0800
Subject: [Python-ideas] os.listdir iteration support
In-Reply-To: <DF8BD948-5E96-4303-A7D8-CDF696147A62@atlas.st>
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>
	<20071123045902.GA4136@panix.com>
	<DF8BD948-5E96-4303-A7D8-CDF696147A62@atlas.st>
Message-ID: <20071123143939.GA28219@panix.com>

On Fri, Nov 23, 2007, Adam Atlas wrote:
> On 22 Nov 2007, at 23:59, Aahz wrote:
>> 
>> The problem is that reading a directory requires an open file handle;
>> given a generator context, there's no clear mechanism for determining
>> when to close the handle.
> 
> Whenever the generator is __del__ed, or whenever the iteration  
> completes, whichever comes first?

Enh.  That is not reliable without work, and getting it reliable is a
waste of work.  The proposed idea for adding an opendir() function is
workable, but it still doesn't solve the need for closing the handle
within listdir().

No matter what, changing the semantics of listdir() to leave a handle
lying around is going to cause problems for some people.

>> Because the list needs to be created in the first place
> 
> How so?

If you're going to ask a question, it would be nice to leave the entire
original context in place, especially given that it's not a particularly
long chunk of text.

Anyway, the Windows case aside, if you don't have a reliable close()
mechanism, you need to slurp the whole thing into a list in one swell
foop so that you can just close the handle.  Even in the Windows case,
you need a handle, and I don't know what the consequences are of leaving
it lying around.
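
To make the "reliable close()" point concrete, here is a minimal sketch
of a lazy listing whose handle is released deterministically. It assumes
a much later Python than the one under discussion (os.scandir() with a
close() method, i.e. 3.6 or newer); iter_dir and first_entry are
hypothetical names used only for illustration, not a proposed API.

```python
import contextlib
import os


def iter_dir(path):
    """Yield entry names lazily; the finally block releases the handle."""
    handle = os.scandir(path)  # holds an open directory handle
    try:
        for entry in handle:
            yield entry.name
    finally:
        # Runs when iteration completes or when the generator is closed.
        handle.close()


def first_entry(path):
    """Stop after one name; closing() closes the generator on exit."""
    with contextlib.closing(iter_dir(path)) as names:
        return next(names, None)
```

The caller still has to opt in to the with block; a plain "for" loop
that breaks early leaves cleanup to garbage collection, which is the
unreliability described above.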
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Typing is cheap.  Thinking is expensive."  --Roy Smith


From guido at python.org  Fri Nov 23 21:23:37 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 23 Nov 2007 12:23:37 -0800
Subject: [Python-ideas] os.listdir iteration support
In-Reply-To: <85d8d06e-6287-4dbf-9f2b-89bf4dfe662b@w28g2000hsf.googlegroups.com>
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>
	<fi534h$pii$1@ger.gmane.org>
	<ca471dc20711221740n7bdc56eeoa64b083b50cd09dd@mail.gmail.com>
	<85d8d06e-6287-4dbf-9f2b-89bf4dfe662b@w28g2000hsf.googlegroups.com>
Message-ID: <ca471dc20711231223o74242cd7ybeba2ca7c6e02cc1@mail.gmail.com>

But how many FTP servers are written in Python *and* have directories
with 20,000 files in them?

--Guido

On Nov 23, 2007 6:06 AM, Giampaolo Rodola' <gnewsg at gmail.com> wrote:
> imho, not so unusual.
> First examples which come to my mind are HTTP and FTP servers which
> commonly have to list the content of local directories.
> FTP servers, in particular, have to do that VERY often.
>
> On 23 Nov, 02:40, "Guido van Rossum" <gu... at python.org> wrote:
> > On Nov 22, 2007 3:25 PM, Terry Reedy <tjre... at udel.edu> wrote:
> >
> > > "Giampaolo Rodola'" <gne... at gmail.com> wrote
> > > > I would find very useful having a version of os.listdir returning a
> > > > generator.
> >
> > > If there are no technical issues in the way, such a replacement (rather
> > > than addition) would be in line with other list -> iterator replacements in
> > > 3.0 (range, dict.items, etc).  A list could then be obtained with
> > > list(os.listdir).
> >
> > But how common is this use case really?
> >
> > --
> > --Guido van Rossum (home page:http://www.python.org/~guido/)



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From gnewsg at gmail.com  Fri Nov 23 22:26:40 2007
From: gnewsg at gmail.com (Giampaolo Rodola')
Date: Fri, 23 Nov 2007 13:26:40 -0800 (PST)
Subject: [Python-ideas] os.listdir iteration support
In-Reply-To: <ca471dc20711231223o74242cd7ybeba2ca7c6e02cc1@mail.gmail.com>
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>
	<fi534h$pii$1@ger.gmane.org>
	<ca471dc20711221740n7bdc56eeoa64b083b50cd09dd@mail.gmail.com> 
	<85d8d06e-6287-4dbf-9f2b-89bf4dfe662b@w28g2000hsf.googlegroups.com> 
	<ca471dc20711231223o74242cd7ybeba2ca7c6e02cc1@mail.gmail.com>
Message-ID: <e3f46730-005f-4aae-a7a0-2a0eb86ae7f3@o42g2000hsc.googlegroups.com>

On 23 Nov, 21:23, "Guido van Rossum" <gu... at python.org> wrote:
> But how many FTP servers are written in Python *and* have directories
> with 20,000 files in them?
>
> --Guido

I honestly don't know.
It's certainly a rather specific use case, but it is one of the tasks
that takes the most time on an FTP server. 20,000 is probably an
exaggerated hypothetical, so I ran a simple test with a more realistic
scenario.
On Windows, a very crowded directory is C:\windows\system32. Currently
the C:\windows\system32 of my Windows XP workstation contains 2201
files.
I ran the code below, which is how an FTP server should properly
respond to a "LIST" command issued by a client.
It took 1.70300006866 seconds to complete the first time and
0.266000032425 seconds the second time.
I don't know whether such a specific use case could justify adding
generator support for listdir to the stdlib, but something like
Greg Ewing's opendir module could have saved a lot of time in this
specific case.


-- Giampaolo


import os, stat, time
from tarfile import filemode
try:
    import pwd, grp
except ImportError:
    pwd = grp = None


def format_list(directory):
    """Return a directory listing emulating "/bin/ls -lA" UNIX
    command output.

    This is how output appears to client:
    -rw-rw-rw-   1 owner   group    7045120 Sep 02  3:47 music.mp3
    drwxrwxrwx   1 owner   group          0 Aug 31 18:50 e-books
    -rw-rw-rw-   1 owner   group        380 Sep 02  3:40 module.py
    """
    listing = os.listdir(directory)

    result = []
    for basename in listing:
        file = os.path.join(directory, basename)

        # if the file is a broken symlink, use lstat to get stat for
        # the link
        try:
            stat_result = os.stat(file)
        except (OSError,AttributeError):
            stat_result = os.lstat(file)

        perms = filemode(stat_result.st_mode)  # permissions

        nlinks = stat_result.st_nlink   # number of links to inode
        if not nlinks:  # non-posix system, let's use a bogus value
            nlinks = 1

        if pwd and grp:
            # get user and group name, else just use the raw uid/gid
            try:
                uname = pwd.getpwuid(stat_result.st_uid).pw_name
            except KeyError:
                uname = stat_result.st_uid
            try:
                gname = grp.getgrgid(stat_result.st_gid).gr_name
            except KeyError:
                gname = stat_result.st_gid
        else:
            # on non-posix systems pwd and grp are unavailable, so
            # fall back to bogus default values for owner and group
            uname = "owner"
            gname = "group"

        size = stat_result.st_size  # file size

        # stat.st_mtime could fail (-1) if file's last modification
        # time is too old, in that case we return local time as last
        # modification time.
        try:
            mtime = time.strftime("%b %d %H:%M",
                                  time.localtime(stat_result.st_mtime))
        except ValueError:
            mtime = time.strftime("%b %d %H:%M")

        # if the file is a symlink, resolve it, e.g.
        # "symlink -> real_file"
        if stat.S_ISLNK(stat_result.st_mode):
            basename = basename + " -> " + os.readlink(file)

        # formatting is matched with proftpd ls output
        result.append("%s %3s %-8s %-8s %8s %s %s\r\n" %(
            perms, nlinks, uname, gname, size, mtime, basename))

    return ''.join(result)

if __name__ == '__main__':
    before = time.time()
    format_list(r'C:\windows\system32')
    print time.time() - before




From ntoronto at cs.byu.edu  Sat Nov 24 13:41:56 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Sat, 24 Nov 2007 05:41:56 -0700
Subject: [Python-ideas] __builtins__ behavior and... the FUTURE!
Message-ID: <47481C14.90009@cs.byu.edu>

I'd post this on Python-dev, but it has more to do with the future of 
Python, and it directly impacts the fairly-well-received Python-idea I'm 
working on right now.

The current behavior has persisted since revision 9877, nine years ago:

http://svn.python.org/view?rev=9877&view=rev

"Vladimir Marangozov' performance hack: copy f_builtins from ancestor
if the globals are the same."

A variant of the behavior has persisted since the age of the dinosaurs, 
as far as I can tell - or at least ever since Python had stack frames.

Here's how the globals/builtins lookup is currently presented as working:

     1. If 'name' is in globals, return globals['name']
     2. Return globals['__builtins__']['name']

Glossing over a lot of details, here's how it *actually* worked before 
the performance hack:

     0. A code object gets executed, which creates a stack frame. It
        sets frame.builtins = globals['__builtins__'].
     While executing the code:
     1. If 'name' is in globals, return globals['name'].
     2. Otherwise return frame.builtins['name'].

A problem example, which is still a problem today:

     __builtins__ = {'len': lambda x: 1}
     print len([1, 2, 3])
     # prints:
     #   '3' when run as a script
     #   '1' in interactive mode

If running as a script or part of an import, the module's frame caches 
builtins, so it doesn't matter that it gets reassigned. When 'len' is 
looked up for the print statement, it's looked up in the cached version. 
But in interactive mode, each statement is executed in its own frame, so 
it doesn't have this problem.
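
The caching described in steps 0-2 can be mimicked with a toy model. This
is only a sketch: the Frame class below is a made-up stand-in with plain
dicts, not CPython's real frame structure, and the names module_globals,
stale and fresh are chosen for illustration.

```python
class Frame:
    """Toy stand-in for a stack frame (not CPython's real structure)."""

    def __init__(self, globals_dict):
        self.globals = globals_dict
        # Step 0: builtins is read once, when the frame is created.
        self.builtins = globals_dict["__builtins__"]

    def lookup(self, name):
        # Step 1: globals first.
        if name in self.globals:
            return self.globals[name]
        # Step 2: otherwise fall back to the cached builtins.
        return self.builtins[name]


module_globals = {"__builtins__": {"len": len}}

# A script runs in a single frame, created before the reassignment.
script_frame = Frame(module_globals)
module_globals["__builtins__"] = {"len": lambda x: 1}

# The old frame still sees the cached builtins; a new frame (as in
# interactive mode, one per statement) picks up the replacement.
stale = script_frame.lookup("len")([1, 2, 3])
fresh = Frame(module_globals).lookup("len")([1, 2, 3])
```

Here stale comes from the original len and fresh from the replacement,
which matches the script versus interactive difference in the example.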

Well, at least module *functions* will run in their own frames, so 
they'll see the new builtins, right? But here's how it works now, after 
the performance hack:

     0. A code object gets executed, which creates a stack frame.
        a. If the stack frame has a parent (think "call site") and
          the parent has the same globals, it sets
          frame.builtins = parent.builtins.
        b. Otherwise it sets frame.builtins = globals['__builtins__'].
     While executing the code:
     1. If 'name' is in globals, return globals['name'].
     2. Otherwise return frame.builtins['name'].

A problem example:

     __builtins__ = {'len': lambda x: 1}
     def f(): print len([1, 2, 3])
     f()
     # prints:
     #   '3' when run as a script
     #   '1' in interactive mode


At the call site "f()", frame.builtins is the original, cached builtins. 
Before the hack, f()'s frame would have recalculated and re-cached it. 
After the hack, f()'s frame inherits the cached version. But this only 
happens in a script, which runs its code in a single frame. If you try 
this in interactive mode, you'll get correct behavior.
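
The inheritance rule in step 0a can be added to the same kind of toy
model. Again this is a hedged sketch with invented names (Frame, call_f,
other_globals), using plain dicts in place of real frames and modules.

```python
class Frame:
    """Toy frame that inherits builtins from a parent with equal globals."""

    def __init__(self, globals_dict, parent=None):
        self.globals = globals_dict
        if parent is not None and parent.globals is globals_dict:
            # Step 0a: same globals as the call site, so reuse its builtins.
            self.builtins = parent.builtins
        else:
            # Step 0b: different (or no) parent globals, so recalculate.
            self.builtins = globals_dict["__builtins__"]

    def lookup(self, name):
        if name in self.globals:
            return self.globals[name]
        return self.builtins[name]


module_globals = {"__builtins__": {"len": len}}
script_frame = Frame(module_globals)
module_globals["__builtins__"] = {"len": lambda x: 1}


def call_f(caller):
    """Model calling f(), which lives in module_globals, from `caller`."""
    callee = Frame(module_globals, parent=caller)
    return callee.lookup("len")([1, 2, 3])


# Called from the script's own frame: the stale builtins are inherited.
inside = call_f(script_frame)

# Called from a frame whose globals differ: builtins are recalculated.
other_globals = {"__builtins__": {"len": len}}
outside = call_f(Frame(other_globals))
```

In this model the same function gives different answers depending only
on who calls it.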

If function calls stay within a module, builtins is effectively frozen 
at the value it had when the module started execution. But if outside 
modules call those same functions, builtins will have its new value! 
That could be bad:

     import my_extra_special_builtins as __builtins__

     <define extra-special library functions that use new builtins>

     def run_tests_on_extra_special_functions():
         <tests, etc.>

     if __name__ == '__main__':
         run_tests_on_extra_special_functions()

The special library functions work, but the tests don't. The special 
builtins module only shows up when functions are called from outside 
modules (where the call sites have different globals) and the functions' 
frames are forced to recalculate builtins rather than inheriting it. 
Here are some ways around the problem:

     1. Put all the tests in a different module.
     2. Use a unit testing framework, which will call the module
        functions from outside the module.
     3. Call functions using exec with custom globals.
     4. Replace functions using types.FunctionType with custom globals.

#3 and #4 are decidedly unlikely. :) #1 is generally discouraged (AFAIK) 
if not annoying, and #2 is encouraged.
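
For completeness, workaround #4 might look roughly like the sketch below.
It is written for a modern Python 3 rather than the 2.x under discussion,
and count_items, custom_globals and patched are hypothetical names. The
rebuilt function has different globals from its caller, so its builtins
come from the '__builtins__' entry of the custom globals.

```python
import types


def count_items():
    return len([1, 2, 3])


# Custom globals carrying a replacement builtins mapping.
custom_globals = {"__builtins__": {"len": lambda x: 1}}

# Same code object, different globals (and therefore different builtins).
patched = types.FunctionType(count_items.__code__, custom_globals, "patched")
```

Calling patched() resolves len through the custom mapping, while
count_items() keeps using the real builtin.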

In the last thread on __builtins__ vs. __builtin__, back in March, it 
seemed that Guido was open to new ideas for Python 3.0 on the subject. 
Well, keeping in mind this strange behavior and the length of time it's 
gone on, here's my recommendation:

     Kill __builtins__. Take it out of the module dict. Let LOAD_GLOBAL
     look in "builtins" (currently "__builtin__") for names after it
     checks globals. If modules want to hack at builtins, they can
     import it. But they hack it globally or not at all.

I honestly can't think of a use case you can handle by replacing a 
module's __builtins__ that can't be handled without. If there is one, 
nobody actually does it, because we would have heard them screaming in 
agony and banging their heads against the walls from thousands of miles 
away by now. You just can't do it reliably as of February 1998.

The regression test suite doesn't even touch things like this. It only 
goes as far as injecting stuff into __builtin__.

Finally, on to my practical problem.

I'm working on the fast globals stuff, which is how I got onto this 
subject in the first place. Here are a few of my options:

     1. I can make __builtins__ work like it was always supposed to, at
        the cost of decreased performance and extra complexity. It would
        still be much faster than it is now, though.
     2. Status quo: I can make __builtins__ work like it does now. I
        think I can do this, anyway. It's actually more complex than #1,
        and very likely slower. I would rather not take this route.
     3. For a given function, I can freeze __builtins__ at the value it
        was at when the function was defined.
     4. I can make it work like I suggested for Python 3.0, but make
        __builtin__ automatically available to modules as __builtins__.

With or without it, I should be posting my patch for fast globals soon. 
No, don't look at me like that. I'm serious!

Wondering-what-to-do-ly,
Neil


From greg.ewing at canterbury.ac.nz  Sun Nov 25 00:09:02 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 25 Nov 2007 12:09:02 +1300
Subject: [Python-ideas] __builtins__ behavior and... the FUTURE!
In-Reply-To: <47481C14.90009@cs.byu.edu>
References: <47481C14.90009@cs.byu.edu>
Message-ID: <4748AF0E.6090401@canterbury.ac.nz>

Neil Toronto wrote:
>      Kill __builtins__. Take it out of the module dict. Let LOAD_GLOBAL
>      look in "builtins" (currently "__builtin__") for names after it
>      checks globals.

What about things like running code sandboxed with a
restricted set of builtins?

--
Greg


From jimjjewett at gmail.com  Sun Nov 25 00:22:04 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sat, 24 Nov 2007 18:22:04 -0500
Subject: [Python-ideas] Fast global cacheless lookup
In-Reply-To: <4745A301.5090201@cs.byu.edu>
References: <4745A301.5090201@cs.byu.edu>
Message-ID: <fb6fbf560711241522p13c5636ctaa73fd37e890b140@mail.gmail.com>

On 11/22/07, Neil Toronto <ntoronto at cs.byu.edu> wrote:

> ... What if a frame could maintain an
> array of pointers right into a dictionary's entry table?

> The dict notifies its observers on delitem, pop, popitem, resize and
> clear. Nothing else is necessary - nothing else will change the address
> of or invalidate an entry.

I think this isn't quite true, because of DUMMY entries.

Insert key1.
Insert key2 that wants the same slot.
Register an observer that cares about key2 but not key1.

Delete key1.   The key1 entry is replaced with DUMMY, but the entry
for key2 is not affected.

Look up key2 (by some other code which hasn't already taken this
shortcut) and the lookdict function (as a side effect) moves key2 to
the better location that key1 no longer occupies.  As described, I
think this breaks your cache.

Of course you can get around this by just not moving things without a
resize, but that is likely to be horrible for the (non-namespace?)
dictionaries that do see frequent deletions.

Another way around it is to also notify the observers whenever
lookdict moves an entry; I'm not sure how that would affect normal
lookup performance.

A more radical change is to stop exposing the internal structure at
all.  For example, a typical namespace might instead be represented as
an array of values, plus a dict mapping names to indices.  The cost
would be an extra pointer for each key ever in the dictionary (since
you wouldn't reuse positional slots), and the savings would be that
most lookups could just grab namespace[i] without having to even check
that they got the right key, let alone following a trail of collision
resolutions.
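
A rough sketch of that last idea, under the assumption that positions are
handed out once and never reused. IndexedNamespace and its method names
are invented for illustration; nothing like this exists in CPython.

```python
_UNSET = object()  # marks a position whose name is currently unbound


class IndexedNamespace:
    """Values live in a list; a dict maps each name to a fixed position."""

    def __init__(self):
        self._index = {}    # name -> position (never changes once assigned)
        self._values = []   # position -> value

    def slot(self, name):
        """Return the permanent position for name, allocating it if needed."""
        if name not in self._index:
            self._values.append(_UNSET)
            self._index[name] = len(self._values) - 1
        return self._index[name]

    def set(self, name, value):
        self._values[self.slot(name)] = value

    def delete(self, name):
        # The position is kept (that is the extra pointer per key ever used).
        self._values[self.slot(name)] = _UNSET

    def get_by_slot(self, position):
        """Fast path: no hashing, no key comparison, no collision chain."""
        value = self._values[position]
        if value is _UNSET:
            raise NameError("unbound slot %d" % position)
        return value
```

Code that has resolved a name once can keep the integer from slot() and
use get_by_slot() afterwards, even across a delete followed by a rebind.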

> To speed up globals access, an auxiliary object to functions and frames
> registers itself as an observer to func_globals and __builtins__.

Note that func_globals probably *will* be updated again in the future,
if only to register this very function with its module.  You could
wait to "seal" a namespace until you think all its names are known, or
you could adapt the timestamp solution suggested in
http://bugs.python.org/issue1616125

-jJ


From aahz at pythoncraft.com  Sun Nov 25 01:29:17 2007
From: aahz at pythoncraft.com (Aahz)
Date: Sat, 24 Nov 2007 16:29:17 -0800
Subject: [Python-ideas] os.listdir iteration support
In-Reply-To: <e3f46730-005f-4aae-a7a0-2a0eb86ae7f3@o42g2000hsc.googlegroups.com>
References: <d827975f-7c1e-471e-bac1-8d55262ab122@d27g2000prf.googlegroups.com>
	<fi534h$pii$1@ger.gmane.org>
	<ca471dc20711221740n7bdc56eeoa64b083b50cd09dd@mail.gmail.com>
	<85d8d06e-6287-4dbf-9f2b-89bf4dfe662b@w28g2000hsf.googlegroups.com>
	<ca471dc20711231223o74242cd7ybeba2ca7c6e02cc1@mail.gmail.com>
	<e3f46730-005f-4aae-a7a0-2a0eb86ae7f3@o42g2000hsc.googlegroups.com>
Message-ID: <20071125002916.GA12966@panix.com>

On Fri, Nov 23, 2007, Giampaolo Rodola' wrote:
>
> Surely it's a rather specific use case, but it is one of the tasks
> which takes the longest amount of time on an FTP server. 20,000 is
> probably an exaggerated hypothetical situation, so I did a simple test
> with a more realistic scenario.
> On windows a very crowded directory is C:\windows\system32. Currently
> the C:\windows\system32 of my Windows XP workstation contains 2201
> files.
> I tried to run the code below which is how an FTP server should
> properly respond to a "LIST" command issued by client.
> It took 1.70300006866 seconds to complete the first time and
> 0.266000032425 the second one.

Your code calls os.stat() on each file.  I know from past experience
that os.stat() is *extremely* expensive.  Because os.listdir() runs at C
speed, it only gets slow when run against hundreds of thousands of
entries.

(One directory on a work server has over 200K entries, and it takes
os.listdir() about twenty seconds.  I believe that if we switched from
ext3 to something more appropriate that would get reduced.)
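
One way to check where the time goes is to time the listing and the
per-file stat calls separately. This is a minimal sketch for a modern
Python 3; time_listing is a hypothetical helper, and the numbers it
returns will depend heavily on the filesystem and on caching.

```python
import os
import time


def time_listing(path):
    """Return (entry count, seconds in listdir, seconds in per-entry stat)."""
    start = time.perf_counter()
    names = os.listdir(path)
    listed = time.perf_counter()
    for name in names:
        try:
            os.lstat(os.path.join(path, name))
        except OSError:
            pass  # entry vanished or is unreadable; ignore it for timing
    done = time.perf_counter()
    return len(names), listed - start, done - listed
```

Running it twice on the same directory should also show the effect of
the OS cache that the first and second timings in the quoted test hint at.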

> I don't know if such specific use case could justify a listdir
> generators support to have into the stdlib but having something like
> Greg Ewing's opendirs module could have saved a lot of time in this
> specific case.

Doubtful.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Typing is cheap.  Thinking is expensive."  --Roy Smith


From jimjjewett at gmail.com  Sun Nov 25 02:21:37 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sat, 24 Nov 2007 20:21:37 -0500
Subject: [Python-ideas] __builtins__ behavior and... the FUTURE!
In-Reply-To: <47481C14.90009@cs.byu.edu>
References: <47481C14.90009@cs.byu.edu>
Message-ID: <fb6fbf560711241721g368e8275xa9c07045512200be@mail.gmail.com>

On 11/24/07, Neil Toronto <ntoronto at cs.byu.edu> wrote:

[I'm summarizing and paraphrasing]

 If a name isn't in globals, python looks in
     globals['__builtins__']['name']
 Unfortunately, it may use a stale cached value for
     globals['__builtins__']
...

> Well, keeping in mind this strange behavior and the length
> of time it's gone on, here's my recommendation:

>      Kill __builtins__. Take it out of the module dict. Let LOAD_GLOBAL
>      look in "builtins" (currently "__builtin__") for names after it
>      checks globals. If modules want to hack at builtins, they can
>      import it. But they hack it globally or not at all.

As Greg pointed out, this isn't so good for sandboxes.

But as long as you're changing dicts to be better namespaces, why not
go a step farther?  Instead of using a magic key name (some spelling
variant of builtin), make the fallback part of the dict itself.  For
example:

Use a defaultdict and set the __missing__ method to the builtin's __getitem__.

Then neither python nor the frame need to worry about tracking the
builtin namespace, but the fallback can be reset (even on a
per-function basis) by simply replacing the fallback method.
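
A minimal sketch of what I mean, written for a modern Python 3 where the
module is spelled "builtins". It uses a plain dict subclass with
__missing__ rather than collections.defaultdict, and FallbackNamespace is
a made-up name. It only shows the mapping behaviour; whether the
interpreter's global lookup would honour __missing__ is a separate
question that this sketch does not answer.

```python
import builtins


class FallbackNamespace(dict):
    """A namespace dict that consults a fallback mapping on a miss."""

    def __init__(self, *args, fallback=None, **kwargs):
        super().__init__(*args, **kwargs)
        # Default fallback: the real builtins namespace.
        self.fallback = vars(builtins) if fallback is None else fallback

    def __missing__(self, key):
        # Called by dict.__getitem__ only when key is not in the dict.
        return self.fallback[key]


ns = FallbackNamespace(x=1)
real_len = ns["len"]                      # not in ns, found in the fallback

ns.fallback = {"len": lambda seq: 1}      # swap the fallback, sandbox-style
fake_len = ns["len"]
```

Replacing ns.fallback changes what unresolved names map to without
touching the entries stored in the namespace itself.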

-jJ


From ntoronto at cs.byu.edu  Sun Nov 25 03:18:25 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Sat, 24 Nov 2007 19:18:25 -0700
Subject: [Python-ideas] Fast global cacheless lookup
In-Reply-To: <fb6fbf560711241522p13c5636ctaa73fd37e890b140@mail.gmail.com>
References: <4745A301.5090201@cs.byu.edu>
	<fb6fbf560711241522p13c5636ctaa73fd37e890b140@mail.gmail.com>
Message-ID: <4748DB71.6000601@cs.byu.edu>

Jim Jewett wrote:
> On 11/22/07, Neil Toronto <ntoronto at cs.byu.edu> wrote:
> 
>> ... What if a frame could maintain an
>> array of pointers right into a dictionary's entry table?
> 
>> The dict notifies its observers on delitem, pop, popitem, resize and
>> clear. Nothing else is necessary - nothing else will change the address
>> of or invalidate an entry.
> 
> I think this isn't quite true, because of DUMMY entries.
> 
> Insert key1.
> Insert key2 that wants the same slot.
> Register an observer that cares about key2 but not key1.
> 
> Delete key1.   The key1 entry is replaced with DUMMY, but the entry
> for key2 is not affected.
> 
> Look up key2 (by some other code which hasn't already taken this
> shortcut) and the lookdict function (as a side effect) moves key2 to
> the better location that key1 no longer occupies.  As described, I
> think this breaks your cache.

Good grief old chap, you freaked me out.

Turns out it all still works. Whether the lookdict functions used to 
move entries around I don't know, but they don't now. It's probably 
because deletions are so rare compared to other operations that it's not 
worth the extra logic in those tight little loops.

Mind if I keep rambling, just to make sure I've got it right? :)

It's the dummy entries that make lookup work at all. The lookdict 
functions use them as flags so that they know to keep skipping around the 
table looking for an open entry or an entry with the right key. It's 
basically: "If ep->me_key != key or ep->me_key == dummy, I need to keep 
trying different ep's. If I reach an empty ep, return the first dummy I 
found or that ep if I didn't find one. If I reach an ep with the right 
key, return that."

I wasn't completely satisfied by static analysis, so I traced the case 
you brought up through both lookdict and lookdict_string. Here it is:

Assume hash(key1) == hash(key2).
Assume (without loss of generality) that for this hash, entries are 
traversed in order 0, 1, 2...

Insert key1:

0: key1
1: NULL
2: ...

Insert key2:

0: key1
1: key2
2: ...

Delete key1:

0: dummy
1: key2
2: ...

Look up key2 (trace):

("freeslot" keeps track of the first dummy found on the traversal; is 
NULL if none found)

start:
   ep = 0
   freeslot = 0  [ep->me_key == dummy]
loop:
   ep = 1
   return ep     [ep->me_key == key2]

That last step of the trace is where the entry-moving code would have
gone, something like this:

if (freeslot != NULL) {
     /* refcount-neutral */
     *freeslot = *ep;
     ep->me_key = dummy;
     ep->me_value = NULL;
     return freeslot;
}
else
     return ep;

It might be a speed improvement if you assume that the key is very 
likely to be looked up again. But it's extra complexity in a 
speed-critical code path and you never know whether you lengthened the 
traversal for other lookups.

As long as it's a wash in the end, it might as well be left alone, at 
least for the fast globals. :D
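
The key1/key2 trace above can be replayed with a toy table. This is only
a sketch: ToyTable is an invented class in which every key probes slots
0, 1, 2... in order (the forced-collision assumption from the trace), not
the real probing sequence used by dictobject.c.

```python
EMPTY = None
DUMMY = object()  # marks a deleted entry so probing continues past it


class ToyTable:
    """Open-addressing table where every key probes slots 0, 1, 2, ..."""

    def __init__(self, size=8):
        self.slots = [EMPTY] * size

    def find(self, key):
        """Return the slot holding key, else the slot an insert should use."""
        freeslot = None
        for i, entry in enumerate(self.slots):
            if entry is EMPTY:
                return i if freeslot is None else freeslot
            if entry is DUMMY:
                if freeslot is None:
                    freeslot = i      # remember the first dummy seen
            elif entry[0] == key:
                return i              # found: returned in place, never moved
        if freeslot is None:
            raise RuntimeError("table full")
        return freeslot

    def insert(self, key, value):
        self.slots[self.find(key)] = (key, value)

    def delete(self, key):
        i = self.find(key)
        entry = self.slots[i]
        if entry is EMPTY or entry is DUMMY:
            raise KeyError(key)
        self.slots[i] = DUMMY


table = ToyTable()
table.insert("key1", 1)
table.insert("key2", 2)
table.delete("key1")
cached_slot = table.find("key2")   # an observer could cache this index
```

Repeated lookups of key2 keep returning the same slot, so a cached
pointer stays valid until a delete of key2 itself or a resize.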

Neil



From guido at python.org  Mon Nov 26 18:40:59 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 26 Nov 2007 09:40:59 -0800
Subject: [Python-ideas] Fast global cacheless lookup
In-Reply-To: <4748DB71.6000601@cs.byu.edu>
References: <4745A301.5090201@cs.byu.edu>
	<fb6fbf560711241522p13c5636ctaa73fd37e890b140@mail.gmail.com>
	<4748DB71.6000601@cs.byu.edu>
Message-ID: <ca471dc20711260940t222ccc23s2b00d8a8122493d@mail.gmail.com>

On Nov 24, 2007 6:18 PM, Neil Toronto <ntoronto at cs.byu.edu> wrote:
> Jim Jewett wrote:
> > I think this isn't quite true, because of DUMMY entries.
> >
> > Insert key1.
> > Insert key2 that wants the same slot.
> > Register an observer that cares about key2 but not key1.
> >
> > Delete key1.   The key1 entry is replaced with DUMMY, but the entry
> > for key2 is not affected.
> >
> > Look up key2 (by some other code which hasn't already taken this
> > shortcut) and the lookdict function (as a side effect) moves key2 to
> > the better location that key1 no longer occupies.  As described, I
> > think this breaks your cache.
>
> Good grief old chap, you freaked me out.
>
> Turns out it all still works. Whether the lookdict functions used to
> move entries around I don't know, but now it doesn't. It's probably
> because deletions are so rare compared to other operations that it's not
> worth the extra logic in those tight little loops.

I don't know where Jim gets his information, but I don't recall that
just looking up a key has ever moved entries around. You'd have to
delete and re-add it to get it moved. (Or you'd have to hit the
"rehash everything to a larger hash table" of course.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at python.org  Mon Nov 26 18:46:44 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 26 Nov 2007 09:46:44 -0800
Subject: [Python-ideas] __builtins__ behavior and... the FUTURE!
In-Reply-To: <47481C14.90009@cs.byu.edu>
References: <47481C14.90009@cs.byu.edu>
Message-ID: <ca471dc20711260946k673bd106m82ca5935e9672ed4@mail.gmail.com>

The semantics of __builtins__ are an implementation detail used for
sandboxing, and assignment to __builtins__ is not supported. Alas, I
can't quite figure out what you're after; your post doesn't start with
a clear problem statement, so I'm not even sure if this is helpful
information. I just hope to discourage you from trying to change the
semantics of __builtins__. In 3.0, __builtins__ may well be renamed.

--Guido

On Nov 24, 2007 4:41 AM, Neil Toronto <ntoronto at cs.byu.edu> wrote:
> I'd post this on Python-dev, but it has more to do with the future of
> Python, and it directly impacts the fairly-well-received Python-idea I'm
> working on right now.
>
> The current behavior has persisted since revision 9877, nine years ago:
>
> http://svn.python.org/view?rev=9877&view=rev
>
> "Vladimir Marangozov' performance hack: copy f_builtins from ancestor
> if the globals are the same."
>
> A variant of the behavior has persisted since the age of the dinosaurs,
> as far as I can tell - or at least ever since Python had stack frames.
>
> Here's how the globals/builtins lookup is currently presented as working:
>
>      1. If 'name' is in globals, return globals['name']
>      2. Return globals['__builtins__']['name']
>
> Glossing over a lot of details, here's how it *actually* worked before
> the performance hack:
>
>      0. A code object gets executed, which creates a stack frame. It
>         sets frame.builtins = globals['__builtins__'].
>      While executing the code:
>      1. If 'name' is in globals, return globals['name'].
>      2. Otherwise return frame.builtins['name'].
>
> A problem example, which is still a problem today:
>
>      __builtins__ = {'len': lambda x: 1}
>      print len([1, 2, 3])
>      # prints:
>      #   '3' when run as a script
>      #   '1' in interactive mode
>
> If running as a script or part of an import, the module's frame caches
> builtins, so it doesn't matter that it gets reassigned. When 'len' is
> looked up for the print statement, it's looked up in the cached version.
> But in interactive mode, each statement is executed in its own frame, so
> it doesn't have this problem.
>
> Well, at least module *functions* will run in their own frames, so
> they'll see the new builtins, right? But here's how it works now, after
> the performance hack:
>
>      0. A code object gets executed, which creates a stack frame.
>         a. If the stack frame has a parent (think "call site") and
>           the parent has the same globals, it sets
>           frame.builtins = parent.builtins.
>         b. Otherwise it sets frame.builtins = globals['__builtins__'].
>      While executing the code:
>      1. If 'name' is in globals, return globals['name'].
>      2. Otherwise return frame.builtins['name'].
>
> A problem example:
>
>      __builtins__ = {'len': lambda x: 1}
>      def f(): print len([1, 2, 3])
>      f()
>      # prints:
>      #   '3' when run as a script
>      #   '1' in interactive mode
>
>
> At the call site "f()", frame.builtins is the original, cached builtins.
> Before the hack, f()'s frame would have recalculated and re-cached it.
> After the hack, f()'s frame inherits the cached version. But this only
> happens in a script, which runs its code in a single frame. If you try
> this in interactive mode, you'll get correct behavior.
>
> If function calls stay within a module, builtins is effectively frozen
> at the value it had when the module started execution. But if outside
> modules call those same functions, builtins will have its new value!
> That could be bad:
>
>      import my_extra_special_builtins as __builtins__
>
>      <define extra-special library functions that use new builtins>
>
>      def run_tests_on_extra_special_functions():
>          <tests, etc.>
>
>      if __name__ == '__main__':
>          run_tests_on_extra_special_functions()
>
> The special library functions work, but the tests don't. The special
> builtins module only shows up when functions are called from outside
> modules (where the call sites have different globals) and the functions'
> frames are forced to recalculate builtins rather than inheriting it.
> Here are some ways around the problem:
>
>      1. Put all the tests in a different module.
>      2. Use a unit testing framework, which will call the module
>         functions from outside the module.
>      3. Call functions using exec with custom globals.
>      4. Replace functions using types.FunctionType with custom globals.
>
> #3 and #4 are decidedly unlikely. :) #1 is generally discouraged (AFAIK)
> if not annoying, and #2 is encouraged.
>
> In the last thread on __builtins__ vs. __builtin__, back in March, it
> seemed that Guido was open to new ideas for Python 3.0 on the subject.
> Well, keeping in mind this strange behavior and the length of time it's
> gone on, here's my recommendation:
>
>      Kill __builtins__. Take it out of the module dict. Let LOAD_GLOBAL
>      look in "builtins" (currently "__builtin__") for names after it
>      checks globals. If modules want to hack at builtins, they can
>      import it. But they hack it globally or not at all.
>
> I honestly can't think of a use case you can handle by replacing a
> module's __builtins__ that can't be handled without. If there is one,
> nobody actually does it, because we would have heard them screaming in
> agony and banging their heads against the walls from thousands of miles
> away by now. You just can't do it reliably as of February 1998.
>
> The regression test suite doesn't even touch things like this. It only
> goes as far as injecting stuff into __builtin__.
>
> Finally, on to my practical problem.
>
> I'm working on the fast globals stuff, which is how I got onto this
> subject in the first place. Here are a few of my options:
>
>      1. I can make __builtins__ work like it was always supposed to, at
>         the cost of decreased performance and extra complexity. It would
>         still be much faster than it is now, though.
>      2. Status quo: I can make __builtins__ work like it does now. I
>         think I can do this, anyway. It's actually more complex than #1,
>         and very likely slower. I would rather not take this route.
>      3. For a given function, I can freeze __builtins__ at the value it
>         was at when the function was defined.
>      4. I can make it work like I suggested for Python 3.0, but make
>         __builtin__ automatically available to modules as __builtins__.
>
> With or without it, I should be posting my patch for fast globals soon.
> No, don't look at me like that. I'm serious!
>
> Wondering-what-to-do-ly,
> Neil
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From jimjjewett at gmail.com  Mon Nov 26 19:43:03 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 26 Nov 2007 13:43:03 -0500
Subject: [Python-ideas] Fwd:  Fast global cacheless lookup
In-Reply-To: <fb6fbf560711251434p59b42a02s79dcdbf942860ae1@mail.gmail.com>
References: <4745A301.5090201@cs.byu.edu>
	<fb6fbf560711241522p13c5636ctaa73fd37e890b140@mail.gmail.com>
	<4748DB71.6000601@cs.byu.edu>
	<fb6fbf560711251434p59b42a02s79dcdbf942860ae1@mail.gmail.com>
Message-ID: <fb6fbf560711261043nc8becd9nfd8a518ab1d09893@mail.gmail.com>

gaah ... this should have been sent to the list for archiving.

The summary is that my memory was wrong, and items are *not* jostled
back to "better" locations.

-jJ


From ntoronto at cs.byu.edu  Mon Nov 26 21:50:53 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Mon, 26 Nov 2007 13:50:53 -0700
Subject: [Python-ideas] __builtins__ behavior and... the FUTURE!
In-Reply-To: <ca471dc20711260946k673bd106m82ca5935e9672ed4@mail.gmail.com>
References: <47481C14.90009@cs.byu.edu>
	<ca471dc20711260946k673bd106m82ca5935e9672ed4@mail.gmail.com>
Message-ID: <474B31AD.9060004@cs.byu.edu>

Guido van Rossum wrote:
> The semantics of __builtins__ are an implementation detail used for
> sandboxing, and assignment to __builtins__ is not supported. Alas, I
> can't quite figure out what you're after; your post doesn't start with
> a clear problem statement, so I'm not even sure if this is helpful
> information. I just hope to discourage you from trying to change the
> semantics of __builtins__. In 3.0, __builtins__ may well be renamed.

Sorry - it was very early in the morning when I did my analysis, so I 
wasn't as clear as I could have been. I had two points:

1. A suggestion for future builtins, which is probably the wrong thing 
to do. Please disregard this.

2. A question about which semantics fast globals should support, and how 
different they can be from the current semantics and still be acceptable.

I have two problems with the current semantics:

1. They seem very wrong to me, even for an implementation detail. Python 
developers rely on function behavior being invariant to the call site. 
(As much as Python developers could be said to rely on any invariance, 
anyway.)

2. Implementing the current semantics with fast globals seems 
unnecessary. It no longer helps performance (it hurts it a tiny bit), 
and the code that does it reads like a pasted-on hack.

I've since discovered that it wouldn't be much slower. Here are some 
times for one of my "builtins get" benchmarks:

Current builtins:                    3.11 sec
Fast builtins, immediate semantics:  1.81 sec
Fast builtins, current or pre-1998:  1.64 sec (+ epsilon for hack)
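
To put those three timings in relative terms, the arithmetic can be
sketched as below. The numbers are the ones quoted above; the
dictionary labels and variable names are only illustrative.

```python
# Timings copied from the table above, in seconds.
timings = {
    "current": 3.11,
    "fast_immediate": 1.81,
    "fast_pre_1998": 1.64,
}

baseline = timings["current"]

# Speedup factor of each variant relative to the current implementation.
speedups = {label: baseline / seconds for label, seconds in timings.items()}

# Extra cost of immediate semantics relative to pre-1998 semantics.
immediate_penalty = timings["fast_immediate"] / timings["fast_pre_1998"] - 1.0

for label, factor in speedups.items():
    print(f"{label}: {factor:.2f}x")
print(f"immediate vs. pre-1998: {immediate_penalty:.1%} slower")
```

So both fast variants come out somewhat under twice as fast on this
one benchmark, and the gap between them is roughly ten percent.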

"Immediate" semantics (which I find most correct) are a little slower 
because it has to check whether __builtins__ has changed every time a 
globals lookup fails, before it does a builtins lookup. In "pre-1998" 
semantics, a change of __builtins__ is checked only with a new stack frame.
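
The difference between the two lookup styles can be sketched with a
toy model. This is not CPython's frame code and not the fast globals
patch; the Frame class, its attributes, and demo() are hypothetical
names made up for illustration.

```python
class Frame:
    """Toy stand-in for an interpreter frame (hypothetical, not CPython's)."""

    def __init__(self, globals_dict, immediate):
        self.globals = globals_dict
        self.immediate = immediate
        # "Pre-1998" style: read __builtins__ once, when the frame is created.
        self.cached_builtins = globals_dict["__builtins__"]

    def lookup(self, name):
        # Globals are always consulted first.
        if name in self.globals:
            return self.globals[name]
        if self.immediate:
            # "Immediate" style: re-read __builtins__ after every failed
            # globals lookup, so a reassignment is seen right away.
            return self.globals["__builtins__"][name]
        # Otherwise use whatever was cached at frame creation.
        return self.cached_builtins[name]


def demo(immediate):
    module_globals = {"__builtins__": {"len": len}}
    frame = Frame(module_globals, immediate)
    # Reassign __builtins__ while the frame is already running.
    module_globals["__builtins__"] = {"len": lambda seq: 1}
    return frame.lookup("len")([1, 2, 3])


print("immediate:", demo(immediate=True))
print("pre-1998: ", demo(immediate=False))
```

In this model the immediate variant pays for one extra dictionary
read per builtins lookup, which is the kind of small overhead the
timings above suggest.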

Besides those results, fast globals reduces function call overhead by 
10%. I haven't measured what effect the hack has on that.

Personally, I like fast globals with pre-1998 semantics best, though 
there's still a difference in meaning between script and interactive 
mode. I can do it that way, the current way, or the immediate way. Or I 
could make current vs. pre-1998 selectable by macro. Do you have a 
preference?

I swear, though, I'm nearly ready to post a patch. :)

Neil


From guido at python.org  Mon Nov 26 22:40:00 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 26 Nov 2007 13:40:00 -0800
Subject: [Python-ideas] __builtins__ behavior and... the FUTURE!
In-Reply-To: <474B31AD.9060004@cs.byu.edu>
References: <47481C14.90009@cs.byu.edu>
	<ca471dc20711260946k673bd106m82ca5935e9672ed4@mail.gmail.com>
	<474B31AD.9060004@cs.byu.edu>
Message-ID: <ca471dc20711261340ra3547asb0704c0e8f6c4e14@mail.gmail.com>

On Nov 26, 2007 12:50 PM, Neil Toronto <ntoronto at cs.byu.edu> wrote:
> [...] A question about which semantics fast globals should support, and how
> different they can be from the current semantics and still be acceptable.
>
> I have two problems with the current semantics:
>
> 1. They seem very wrong to me, even for an implementation detail. Python
> developers rely on function behavior being invariant to the call site.
> (As much as Python developers could be said to rely on any invariance,
> anyway.)

Please assume I didn't read your initial post. "Very wrong" is a
strong stance. Care to explain what's wrong and why? Without more info
I'm not sure I understand what you're saying about call site
invariance.

> 2. Implementing the current semantics with fast globals seems
> unnecessary. It no longer helps performance (it hurts it a tiny bit),
> and the code that does it reads like a pasted-on hack.

Please provide full context (I'm also behind on the fast globals
thread). What exactly do you mean by "the current semantics"? And
what's the problem with implementing it with fast globals?

> I've since discovered that it wouldn't be much slower. Here are some
> times for one of my "builtins get" benchmarks:
>
> Current builtins:                    3.11 sec
> Fast builtins, immediate semantics:  1.81 sec
> Fast builtins, current or pre-1998:  1.64 sec (+ epsilon for hack)

Where's the benchmark source code?

> "Immediate" semantics (which I find most correct)

Even though I already told you not to care?

> are a little slower
> because it has to check whether __builtins__ has changed every time a
> globals lookup fails, before it does a builtins lookup. In "pre-1998"
> semantics, a change of __builtins__ is checked only with a new stack frame.
>
> Besides those results, fast globals reduces function call overhead by
> 10%. I haven't measured what effect the hack has on that.
>
> Personally, I like fast globals with pre-1998 semantics best, though
> there's still a difference in meaning between script and interactive
> mode. I can do it that way, the current way, or the immediate way. Or I
> could make current vs. pre-1998 selectable by macro. Do you have a
> preference?

Given that *nobody* should assign to __builtins__ in their current
globals, *ever*, I'm fine with pre-1998 semantics if it's fastest.

> I swear, though, I'm nearly ready to post a patch. :)

Please consider posting it before replying to this post.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From greg.ewing at canterbury.ac.nz  Tue Nov 27 00:21:15 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 27 Nov 2007 12:21:15 +1300
Subject: [Python-ideas] __builtins__ behavior and... the FUTURE!
In-Reply-To: <ca471dc20711260946k673bd106m82ca5935e9672ed4@mail.gmail.com>
References: <47481C14.90009@cs.byu.edu>
	<ca471dc20711260946k673bd106m82ca5935e9672ed4@mail.gmail.com>
Message-ID: <474B54EB.9090700@canterbury.ac.nz>

Guido van Rossum wrote:
> The semantics of __builtins__ are an implementation detail used for
> sandboxing, and assignment to __builtins__ is not supported.

Perhaps in 3.0 there could be an additional argument to
eval and exec for supplying a builtin namespace? Then
sandboxing code wouldn't have to make assumptions about
the implementation, and the way would be open for
optimising it in any way we wanted.
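
For comparison, here is a minimal sketch of how sandboxing code
supplies builtins today: by placing a "__builtins__" key in the
globals dict passed to exec. The helper name run_sandboxed is
hypothetical, and this is an illustration of the mechanism only, not
a secure sandbox.

```python
def run_sandboxed(source, allowed_builtins):
    """Run source so that only allowed_builtins are visible as builtins.

    Hypothetical helper for illustration. It relies on exec() using the
    "__builtins__" entry of the globals dict for builtin name lookups.
    """
    sandbox_globals = {"__builtins__": dict(allowed_builtins)}
    exec(source, sandbox_globals)
    # Drop the builtins entry so only names defined by the code remain.
    sandbox_globals.pop("__builtins__", None)
    return sandbox_globals


# The executed code sees the replacement len, not the real one.
result = run_sandboxed("n = len([1, 2, 3])", {"len": lambda seq: 1})
print(result)

# Names that were not supplied are simply undefined inside the sandbox.
try:
    run_sandboxed("open('somefile')", {})
except NameError as exc:
    print("blocked:", exc)
```

A separate builtins argument, as suggested above, would make this
dependence on the "__builtins__" key unnecessary.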

--
Greg


From guido at python.org  Tue Nov 27 00:29:10 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 26 Nov 2007 15:29:10 -0800
Subject: [Python-ideas] __builtins__ behavior and... the FUTURE!
In-Reply-To: <474B54EB.9090700@canterbury.ac.nz>
References: <47481C14.90009@cs.byu.edu>
	<ca471dc20711260946k673bd106m82ca5935e9672ed4@mail.gmail.com>
	<474B54EB.9090700@canterbury.ac.nz>
Message-ID: <ca471dc20711261529w12491fecv30f755b056f6ae8c@mail.gmail.com>

On Nov 26, 2007 3:21 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
> > The semantics of __builtins__ are an implementation detail used for
> > sandboxing, and assignment to __builtins__ is not supported.
>
> Perhaps in 3.0 there could be an additional argument to
> eval and exec for supplying a builtin namespace? Then
> sandboxing code wouldn't have to make assumptions about
> the implementation, and the way would be open for
> optimising it in any way we wanted.

Good idea. If only I hadn't made a mistake in the signature design...
It's kind of awkward to have it be exec(code, globals, locals,
builtins), but I'm afraid that changing it to exec(code, locals,
globals, builtins) would break too much code in the transition (2to3
notwithstanding).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From mark at qtrac.eu  Tue Nov 27 09:31:04 2007
From: mark at qtrac.eu (Mark Summerfield)
Date: Tue, 27 Nov 2007 08:31:04 +0000
Subject: [Python-ideas] P3k __builtins__ identifiers -> warning
Message-ID: <200711270831.04321.mark@qtrac.eu>

Here is a nice little Python 3 program, test.py:

    import string
    buffer = string.ascii_letters
    bytes = []
    sum = 0
    for chr in buffer:
        int = ord(chr)
        if 32 <= int < 127:
            bytes.append(chr)
            sum += 1
    str = "".join(bytes)
    print(sum, str)

If run as:

    python30a -W all test.py

It produces the expected output:

    52 abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

But unfortunately it uses as identifiers: buffer, bytes, chr, int, sum,
and str. None of these are keywords so none of them provokes a
SyntaxError. In fact there are over 130 such identifiers;
print(dir(__builtins__)) to see them.
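
A slightly more portable way to count these names is sketched below.
It uses the standard builtins and keyword modules rather than
__builtins__; the exact counts vary between Python versions, so none
are shown here.

```python
import builtins
import keyword

# Public builtin names, skipping dunder and underscore-prefixed entries.
builtin_names = [name for name in dir(builtins) if not name.startswith("_")]

# Reserved words, which cannot be used as identifiers at all.
keywords = list(keyword.kwlist)

# Names that are both keywords and builtins are counted only once.
all_names = set(builtin_names) | set(keywords)

print(len(builtin_names), "public builtin names")
print(len(keywords), "keywords")
print(len(all_names), "distinct names in total")
```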

I think many newcomers to Python will find it difficult to remember 160
identifiers (keywords + __builtins__) and since some of them have
appealing names (esp. buffer, bytes, min, max, and sum), they may make
use of them without realising that this could cause them problems later
on.

My python-idea is that if python is run with -W all then it should
report uses of __builtins__ as identifiers.
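
A rough sketch of such a check, written as a standalone script with
the ast module rather than as an interpreter warning, might look like
this. The function name shadowed_builtins is hypothetical, and the
check only covers plain assignment targets (including for-loop
targets and augmented assignment), not def, class, import, or
argument names.

```python
import ast
import builtins

BUILTIN_NAMES = frozenset(dir(builtins))


def shadowed_builtins(source):
    """Return the sorted builtin names that source assigns to."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        # A Name node in Store context is an assignment target.
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            if node.id in BUILTIN_NAMES:
                found.add(node.id)
    return sorted(found)


SAMPLE = '''
import string
buffer = string.ascii_letters
bytes = []
sum = 0
for chr in buffer:
    int = ord(chr)
    if 32 <= int < 127:
        bytes.append(chr)
        sum += 1
str = "".join(bytes)
print(sum, str)
'''

print(shadowed_builtins(SAMPLE))
```

On a Python 3 release where buffer is no longer a builtin, that name
is not reported; the other five are.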

-- 
Mark Summerfield, Qtrac Ltd., www.qtrac.eu




From guido at python.org  Tue Nov 27 19:43:47 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 27 Nov 2007 10:43:47 -0800
Subject: [Python-ideas] P3k __builtins__ identifiers -> warning
In-Reply-To: <200711270831.04321.mark@qtrac.eu>
References: <200711270831.04321.mark@qtrac.eu>
Message-ID: <ca471dc20711271043t225c7efbndbe31c5974d4881f@mail.gmail.com>

IMO this is a task for tools like pylint or pychecker (both of which
flag this).

Also, it's controversial -- especially since you're unlikely to want
to use a builtin whose name you can't remember. :-) The builtins were
not made keywords for a reason.

--Guido

On Nov 27, 2007 12:31 AM, Mark Summerfield <mark at qtrac.eu> wrote:
> Here is a nice little Python 3 program, test.py:
>
>     import string
>     buffer = string.ascii_letters
>     bytes = []
>     sum = 0
>     for chr in buffer:
>         int = ord(chr)
>         if 32 <= int < 127:
>             bytes.append(chr)
>             sum += 1
>     str = "".join(bytes)
>     print(sum, str)
>
> If run as:
>
>     python30a -W all test.py
>
> It produces the expected output:
>
>     52 abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
>
> But unfortunately it uses as identifiers: buffer, bytes, chr, int, sum,
> and str. None of these are keywords so none of them provokes a
> SyntaxError. In fact there are over 130 such identifiers;
> print(dir(__builtins__)) to see them.
>
> I think many newcomers to Python will find it difficult to remember 160
> identifiers (keywords + __builtins__) and since some of them have
> appealing names (esp. buffer, bytes, min, max, and sum), they may make
> use of them without realising that this could cause them problems later
> on.
>
> My python-idea is that if python is run with -W all then it should
> report uses of __builtins__ as identifiers.
>
> --
> Mark Summerfield, Qtrac Ltd., www.qtrac.eu



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From scott+python-ideas at scottdial.com  Wed Nov 28 07:11:24 2007
From: scott+python-ideas at scottdial.com (Scott Dial)
Date: Wed, 28 Nov 2007 01:11:24 -0500
Subject: [Python-ideas] P3k __builtins__ identifiers -> warning
In-Reply-To: <200711270831.04321.mark@qtrac.eu>
References: <200711270831.04321.mark@qtrac.eu>
Message-ID: <474D068C.1000508@scottdial.com>

Mark Summerfield wrote:
> My python-idea is that if python is run with -W all then it should
> report uses of __builtins__ as identifiers.

This could never work, as the stdlib violates this rule and would trigger 
a large number of these warnings. And given the controversy about it, I 
doubt anyone is that interested in patching the stdlib to avoid these names.

As far as I am concerned, I don't really see the point in avoiding these 
names. As you say, several of them are very attractive and just because 
there is a built-in with that name doesn't always deter me from using 
it. The scoping rules of python are fairly simple, so it is not 
difficult to keep track of the shadowing. And it's pretty easy to 
recover a built-in by retrieving the object from the __builtins__ 
module, though not very obvious to newcomers.
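
As a small illustration of that recovery, the sketch below shadows
len inside a function and still reaches the original. It assumes
Python 3, where the supported spelling is the builtins module;
count_items is a made-up name.

```python
import builtins


def count_items(items):
    # The local name len shadows the builtin inside this function only.
    len = 0
    for _ in items:
        len += 1
    # The original function is still reachable through the builtins module.
    return len, builtins.len(items)


print(count_items("abc"))
```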

-Scott

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu