From rrr at ronadam.com  Sun Apr  1 08:14:10 2007
From: rrr at ronadam.com (Ron Adam)
Date: Sun, 01 Apr 2007 01:14:10 -0500
Subject: [Python-ideas] Thread IPC idea: Quipe? Sockqueue? (Re: Python
 and Concurrency)
In-Reply-To: <fb6fbf560703301249r7b3352f5j492803cf65c6d36@mail.gmail.com>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org>	
	<460B23FE.2010309@canterbury.ac.nz>	
	<fb6fbf560703290629q3f7b7b59wf168b78f8867473e@mail.gmail.com>	
	<460C0E69.9060007@ronadam.com>
	<fb6fbf560703301249r7b3352f5j492803cf65c6d36@mail.gmail.com>
Message-ID: <460F4DB2.7030600@ronadam.com>

Jim Jewett wrote:
 > On 3/29/07, Ron Adam <rrr at ronadam.com> wrote:
 >> Jim Jewett wrote:
 >> > What we really need is a Task object that treats shared memory
 >> > (perhaps with a small list of specified exceptions) as immutable.
 >
 >> * A 'better' task object for easily creating tasks.
 >>       + We have a threading object now. (Needs improving.)
 >
 > But the task isn't in any way restricted.  Brett's security sandbox
 > might be a useful starting point, if it is fast enough.  Otherwise,
 > we'll probably need to stick with microthreading to get things small
 > enough to contain.
 >
 >> * Shared memory -
 >>       + Prevent names from being rebound
 >>       + Prevent objects from being altered
 >
 > I had thought of the names as being part of a shared dictionary.  (Of
 > course, immutable dictionaries aren't really available out-of-the-box
 > now, and I'm not sure I would trust the supposed immutability of
 > anything that wasn't proxied.)

Not all that different. The immutable dictionary would be an implementation 
detail of a locked name space, I think.
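
To make the locked/frozen distinction concrete, here is a minimal sketch in
present-day Python 3. The LockedNamespace class and its lock() method are
hypothetical names invented for illustration, not an existing API. It only
covers "locked" (names can't be rebound); the objects the names point to are
not frozen.

```python
class LockedNamespace:
    """Hypothetical namespace whose names cannot be rebound once locked."""

    def __init__(self, **names):
        # Bypass our own __setattr__ while the flag itself is created.
        object.__setattr__(self, "_locked", False)
        for name, value in names.items():
            setattr(self, name, value)

    def lock(self):
        object.__setattr__(self, "_locked", True)

    def __setattr__(self, name, value):
        if self._locked:
            raise AttributeError("namespace is locked; cannot rebind %r" % name)
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if self._locked:
            raise AttributeError("namespace is locked; cannot delete %r" % name)
        object.__delattr__(self, name)


shared = LockedNamespace(limit=10, items=[1, 2])
shared.lock()

try:
    shared.limit = 20            # rebinding a locked name fails
except AttributeError:
    rebind_failed = True
else:
    rebind_failed = False

shared.items.append(3)           # locked is not frozen: the list can still change
```

The last line is the reason "frozen" has to be a separate property of the
object rather than of the name.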

I'm wondering if there might be a way to have an inheritance by container 
relationship, where certain characteristics can be acquired from parent 
containers.  Not exactly the same as class inheritance.


 >>      frozen: object can't be altered while frozen
 >>      locked: name can't be rebound to another object
 >
 >
 >>      3. Pass mutable "deep" copies back and forth.
 >>            ? Works now. (but not for all objects?)
 >
 > Well, anything that can be deep-copied -- unless you also want the
 > mutations to be collected back into a single location.

It would need to be able to make a round trip.
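
A rough sketch of what "round trip" means in practice, using only the
standard copy and pickle modules. The task_input data and the can_round_trip
helper are made up for illustration.

```python
import copy
import pickle
import threading

task_input = {"name": "job-1", "values": [1, 2, 3], "nested": {"done": False}}

# A deep copy is independent of the original, so the task may mutate it.
local_copy = copy.deepcopy(task_input)
local_copy["values"].append(4)

# Pickling and unpickling is roughly what crossing a process boundary costs.
round_tripped = pickle.loads(pickle.dumps(local_copy))


def can_round_trip(obj):
    """Return True if obj survives pickling and unpickling."""
    try:
        pickle.loads(pickle.dumps(obj))
    except Exception:
        return False
    return True


plain_data_ok = can_round_trip(task_input)
lock_ok = can_round_trip(threading.Lock())   # locks cannot be pickled
```

Plain data makes the trip; objects tied to interpreter or OS state (locks,
open sockets and the like) do not, which is the "not for all objects" caveat.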


 >>      4. Pass frozen mutable objects.
 >>            - Needs freezable/unfreezable mutable objects.
 >>              (Not the same as making an immutable copy.)
 >
 > And there is where it starts to fall apart.  Though if you look at the
 > pypy dict and interpreter optimizations, they have started to deal
 > with it through versioning types.

I didn't find anything about "versioning" at these links. Did I miss it?

 > http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#multi-dicts
 > http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#id23

_Ron




From ntoronto at cs.byu.edu  Mon Apr  2 06:08:02 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Sun, 01 Apr 2007 22:08:02 -0600
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <4602354F.1010003@acm.org>
References: <4602354F.1010003@acm.org>
Message-ID: <461081A2.50507@cs.byu.edu>

Talin wrote:
> One thing that is important to understand is that I'm not talking about 
> "automatic parallelization" where the compiler automatically figures out 
> what parts can be parallelized. That would require so much change to the 
> Python language as to make it no longer Python.
>   
...


> I am not even necessarily talking about changing the Python language 
> (although certainly the internals of the interpreter will need to be 
> changed.) The same language can be used to describe the same kinds of 
> problems and operations, but the implications of those language elements 
> will change. This is analogous to the fact that these massively 
> multicore CPUs 10 years from now will most likely be using the same 
> instruction sets as today - but that does not mean that programming as a 
> discipline will anything like what it is now.
>   

I'm not convinced you wouldn't have to change Python. After dorking 
around online for years, I've *finally* found someone who put into 
math-talk my biggest problem with current programming paradigms and how 
they relate to concurrency:

http://alarmingdevelopment.org/index.php?p=5

I don't agree with everything in the post, but this part I do:

"Most programming languages adopt a control flow model of execution, 
mirroring the hardware, in which there is an execution pointer flowing 
through the program. The primary reason for this is to permit 
side-effects to be ordered by the programmer. The problem is that 
interdependencies between side-effects are naturally a partial order, 
while control flow establishes a total (linear) order. This means that 
the actual design exists only in the programmer's mind. It is up to the 
programmer to mentally compile (by a topological sort) these implicit 
dependencies into a total order expressed in a control flow. Whenever 
the program's control flow is to be changed, the implicit 
interdependencies encoded into it must be remembered or guessed at in 
order to maintain them. Obviously the language should allow the partial 
order to be explicitly specified, and a compiler should take care of 
working out an execution schedule for it."

There's an elephant-in-the-living-room UI problem, here: how would one 
go about extracting a partial order from a programmer? A text editor is 
fine for a total order, but I can't think of how I'd use one non-messily 
to define a partial order. How about a Gantt chart for a partial order, 
or some other kind of dependency diagram? How would you make it as easy 
to use as a text editor? The funny thing is, once you solve this 
problem, it may even be *easier* to program this way, because rather 
than maintaining the partial order in your head (or inferring it from a 
total order in the code), it'd be right in front of you.
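
Even in a plain text editor, one low-tech way to write a partial order down
is a mapping from each step to the steps it waits for. The sketch below is
only an illustration: the task names and the parallel_stages helper are
invented, and it assumes every dependency also appears as a key.

```python
def parallel_stages(depends_on):
    """Group tasks into stages; tasks within one stage are independent.

    depends_on maps each task name to the set of tasks it must wait for.
    Every dependency must itself be a key of the mapping.
    """
    remaining = {task: set(deps) for task, deps in depends_on.items()}
    stages = []
    while remaining:
        ready = sorted(task for task, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle among: %s" % sorted(remaining))
        stages.append(ready)
        for task in ready:
            del remaining[task]
        for deps in remaining.values():
            deps.difference_update(ready)
    return stages


# Hypothetical side effects and their real dependencies (a partial order).
depends_on = {
    "open_log": set(),
    "load_config": set(),
    "connect_db": {"load_config"},
    "write_report": {"connect_db", "open_log"},
}

stages = parallel_stages(depends_on)
# [['load_config', 'open_log'], ['connect_db'], ['write_report']]
```

Everything inside one stage could run concurrently; flattening the stages
gives one of the many total orders a programmer would otherwise pick by hand.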

There's no reason a program with partial flow control couldn't have very 
Python-like syntax. After reading this, though, which formalized what 
I've long felt is the biggest problem with concurrent programming, I'd 
have to say it'd definitely not be Python itself.

For the record, I disagree strongly with the "let's keep concurrency in 
the libraries" idea. I want to program in *Python*, dangit. Or at least 
something that feels a lot like it.

Neil



From jcarlson at uci.edu  Mon Apr  2 08:27:34 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 01 Apr 2007 23:27:34 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <461081A2.50507@cs.byu.edu>
References: <4602354F.1010003@acm.org> <461081A2.50507@cs.byu.edu>
Message-ID: <20070401223816.049E.JCARLSON@uci.edu>


Neil Toronto <ntoronto at cs.byu.edu> wrote:

(I'm going to rearrange your post so that my reply flows a bit better)

> There's no reason a program with partial flow control couldn't have very 
> Python-like syntax. After reading this, though, which formalized what 
> I've long felt is the biggest problem with concurrent programming, I'd 
> have to say it'd definitely not be Python itself.

It depends on what operations one wants to support.  "Apply this
function to all of this data" is easy.

To say things like 'line x depends on line x-2, and line x+1 depends on
line x-1, and line x+2 depends on line x and x+1', certainly that is not
easy.  But I question the purpose of being able to offer up that kind of
information (in Python specifically). Presumably it is so that those
tasks that don't depend on each other could be executed in parallel; but
unless you have a method by which parallel execution is fast (or at
least faster than just doing it in series), it's not terribly useful
(especially if those operations are data structure manipulations that
need to be propagated back to the 'main process').


> There's an elephant-in-the-living-room UI problem, here: how would one 
> go about extracting a partial order from a programmer? A text editor is 
> fine for a total order, but I can't think of how I'd use one non-messily 
> to define a partial order. How about a Gantt chart for a partial order, 
> or some other kind of dependency diagram? How would you make it as easy 
> to use as a text editor? The funny thing is, once you solve this 
> problem, it may even be *easier* to program this way, because rather 
> than maintaining the partial order in your head (or inferring it from a 
> total order in the code), it'd be right in front of you.

Generally, the standard way of defining a partial order is via
dependency graph.  Unfortunately, breaking blocks of code into a
dependency graph (or partial-order control-flow) tends to make the code
hard to understand.  I know there are various tools that use this
particular kind of method, but those that I have seen leave much to be
desired. Alternatively, there is a huge amount of R&D that has gone into
C/C++ compilers to extract this information automatically from source
code, and even more on the hardware end of things to automatically
extract this information from machine code as it executes. Unfortunately,
due to Python's dynamic nature, even something as simple as 'i += 0' can
lead to all sorts of underlying system changes, and we may not be able
to reliably extract this information (though PyPy with the LLVM backend
may offer opportunities here).
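
A small illustration of why 'i += 0' is not necessarily a no-op. The Noisy
class is a made-up example; the point is only that __iadd__ can run arbitrary
code, so a compiler cannot assume the statement is free of side effects.

```python
class Noisy:
    """Hypothetical object whose += does more than arithmetic."""

    log = []

    def __init__(self, value):
        self.value = value

    def __iadd__(self, other):
        # Any side effect is allowed here: I/O, global mutation, and so on.
        Noisy.log.append(("iadd", self.value, other))
        # The result may even be a different object, rebinding the name.
        return Noisy(self.value + other)


i = Noisy(5)
before = i
i += 0
```

After this runs, i has the same value but is a different object than before,
and Noisy.log has grown by one entry.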


> For the record, I disagree strongly with the "let's keep concurrency in 
> the libraries" idea. I want to program in *Python*, dangit. Or at least 
> something that feels a lot like it.

And leaving concurrency in a library allows Python to stay Python.  For
certain tasks, one merely needs parallel variants of currently existing
Python functions/operations.  Take Google's MapReduce [1], which applies
a function to a large number of data elements in parallel, then combines
the results of those computations.  While it is not universal, it can do
certain operations quickly.  Other tasks merely require the execution of
*some function* while *some other function* is executing.  Free
threading and various other ways of allowing concurrent thread execution
have been offered, but the more I read about the Processing package, the more
I like it.
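
As a toy sketch of the map-then-combine shape (not Google's implementation,
and not the Processing package's API), here is a word count written against
the concurrent.futures module of current Python 3. The count_words and merge
helpers are made up for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_words(line):
    """Map step: count the words in one chunk of input."""
    return Counter(line.split())


def merge(left, right):
    """Reduce step: combine two partial counts."""
    return left + right


lines = ["spam eggs spam", "eggs ham", "spam"]

# Apply the same function to every element, possibly concurrently.
with ThreadPoolExecutor(max_workers=3) as pool:
    partial_counts = list(pool.map(count_words, lines))

# Combine the partial results into one answer.
totals = reduce(merge, partial_counts, Counter())
```

Threads are used here only to keep the sketch short. For CPU-bound work in
CPython a process pool would be the likelier choice, at the cost of pickling
the data in both directions.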

These options don't offer a solution to what you seem to be wanting: an
easy definition of a partial order on code to be executed in Python.
However, without language-level support for something like...

    exec lines in *block in *parallel:
        i += 1
        j += fcn(foo)
        bar = fcn2(bar)

...I don't see how it is possible.  Then again, I'm not sure I
completely agree with Mr. Edwards or yourself in that being able to
state partial ordering will offer improvements over the status quo. 
Then again, I tend to not worry about the blocks of 3-4 lines that
aren't dependent on one another, as much as the entire function suite
returning what I intended it to.

 - Josiah

[1] http://labs.google.com/papers/mapreduce.html



From rrr at ronadam.com  Mon Apr  2 11:01:59 2007
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 02 Apr 2007 04:01:59 -0500
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <461081A2.50507@cs.byu.edu>
References: <4602354F.1010003@acm.org> <461081A2.50507@cs.byu.edu>
Message-ID: <4610C687.7090603@ronadam.com>

Neil Toronto wrote:

> There's an elephant-in-the-living-room UI problem, here: how would one 
> go about extracting a partial order from a programmer?

Beat him or her with a stick?  Just kidding of course. ;-)

I think what you mean is how can we make it easier for a programmer to 
express their intention.  One way is to provide a rich set of alternatives 
in extension modules and letting them choose.  The ones that work will 
bubble up to the top, and the hard to manage and maintain choices will 
either be improved or forgotten.


> A text editor is fine for a total order, but I can't think of how I'd 
> use one non-messily to define a partial order. How about a Gantt chart
> for a partial order, or some other kind of dependency diagram? How would
> you make it as easy to use as a text editor? The funny thing is, once
> you solve this problem, it may even be *easier* to program this way,
> because rather than maintaining the partial order in your head (or
> inferring it from a total order in the code), it'd be right in front of
> you.

Well, you wouldn't want to interleave several tasks' instructions together 
in any fixed (or otherwise) way.  That definitely would not be anything I 
would want to maintain.

Being able to define serial blocks of code to execute in a parallel fashion 
can make some things easier to express, because it gives you another way 
to group related code together instead of having to split up related code, or 
combine unrelated code, because of ordering dependencies.

But addressing your partial order concerns, most likely you will have 
parallel structures communicating to one another with no apparent 
predefined order.  The ordering could be completely dependent on the data 
they get and send to each other, and completely dependent on outside events.

Think of tasks that can open multiple communication channels to other tasks 
as needed.  What order would these be executed in?  Who knows! <shrug>  And 
you may not need to know.
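
A minimal sketch of that situation with the standard threading and queue
modules. The worker function and the task names are invented for the example.
Which worker handles which item, and in what order results arrive, is left to
the scheduler, and the final answer does not depend on it.

```python
import queue
import threading


def worker(name, inbox, outbox):
    """Square incoming numbers until a None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            break
        outbox.put((name, item * item))


inbox, outbox = queue.Queue(), queue.Queue()
workers = [
    threading.Thread(target=worker, args=("task-%d" % n, inbox, outbox))
    for n in range(3)
]
for t in workers:
    t.start()

for value in range(6):
    inbox.put(value)
for _ in workers:
    inbox.put(None)              # one sentinel per worker
for t in workers:
    t.join()

# All workers have finished, so draining the queue is safe here.
results = []
while not outbox.empty():
    results.append(outbox.get())

squares = sorted(square for _name, square in results)
```

The order of results varies from run to run; only the sorted squares are
stable.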

It may be that a partial-order execution order could be considered a subset 
of indeterminate execution order.


> There's no reason a program with partial flow control couldn't have very 
> Python-like syntax. After reading this, though, which formalized what 
> I've long felt is the biggest problem with concurrent programming, I'd 
> have to say it'd definitely not be Python itself.

I think it would still be python.  From what I see Python will continue to 
be improved on for quite some time.  But these ideas must prove themselves 
to be pythonic before they get put in python.

(My spell checker picked out polyphonic for pythonic... PolyPython?)

> For the record, I disagree strongly with the "let's keep concurrency in 
> the libraries" idea. I want to program in *Python*, dangit. Or at least 
> something that feels a lot like it.

My guess is that (if/when this is ever added) it will most likely be a 
combination of some basic supporting enhancements to names and objects, so 
that they can work with task libraries better, along with one or more 
task/multi-processing libraries.

If it turns out that the use of some of these ideas becomes both frequent 
and common, then syntax similar to the 'with' statement might find 
support.  But all of this is still quite a ways off unless some (yet to be 
identified) overwhelming need pushes it forward.

Just my two cents,
   _Ron



From jason.orendorff at gmail.com  Mon Apr  2 15:07:58 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Mon, 2 Apr 2007 09:07:58 -0400
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <461081A2.50507@cs.byu.edu>
References: <4602354F.1010003@acm.org> <461081A2.50507@cs.byu.edu>
Message-ID: <bb8868b90704020607i70731cabi7aba631b13def132@mail.gmail.com>

On 4/2/07, Neil Toronto <ntoronto at cs.byu.edu> wrote:
> I don't agree with everything in the post, but this part I do:
>
> [...] It is up to the
> programmer to mentally compile (by a topological sort) these implicit
> dependencies into a total order expressed in a control flow.

The fancy phrase "topological sort" here obscures that this
"compilation" is something humans are good at.  We do it all the
time. We make plans and carry them out.

-j


From jimjjewett at gmail.com  Mon Apr  2 18:07:30 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 2 Apr 2007 12:07:30 -0400
Subject: [Python-ideas] Thread IPC idea: Quipe? Sockqueue? (Re: Python
	and Concurrency)
In-Reply-To: <460F4DB2.7030600@ronadam.com>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org>
	<460B23FE.2010309@canterbury.ac.nz>
	<fb6fbf560703290629q3f7b7b59wf168b78f8867473e@mail.gmail.com>
	<460C0E69.9060007@ronadam.com>
	<fb6fbf560703301249r7b3352f5j492803cf65c6d36@mail.gmail.com>
	<460F4DB2.7030600@ronadam.com>
Message-ID: <fb6fbf560704020907k2ef0edcdre34b630d4a4a13c7@mail.gmail.com>

On 4/1/07, Ron Adam <rrr at ronadam.com> wrote:
> Jim Jewett wrote:

>  > And there is where it starts to fall apart.  Though if you look at the
>  > pypy dict and interpreter optimizations, they have started to deal
>  > with it through versioning types.
>
> I didn't find anything about "versioning" at these links. Did I miss it?

> http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#multi-dicts

>  > http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#id23

Sorry; I wasn't pointing to enough of the document.

http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html

discussed versions under method caching, just above the Interpreter
Optimizations section.
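
The general idea of a version tag can be sketched as below. This is not
PyPy's code, just an illustration under my own assumptions: VersionedDict and
CachedLookup are hypothetical names, and only item assignment and deletion
bump the version (update(), pop() and friends are ignored to keep it short).

```python
class VersionedDict(dict):
    """Hypothetical dict that bumps a version counter when it is mutated."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


class CachedLookup:
    """Reuse one lookup result for as long as the dict's version is unchanged."""

    def __init__(self, mapping, key):
        self.mapping, self.key = mapping, key
        self.cached_version = None
        self.cached_value = None
        self.misses = 0

    def get(self):
        if self.cached_version != self.mapping.version:
            self.cached_value = self.mapping[self.key]
            self.cached_version = self.mapping.version
            self.misses += 1
        return self.cached_value


methods = VersionedDict(greet=lambda: "hello")
lookup = CachedLookup(methods, "greet")
first = lookup.get()()                  # miss: fills the cache
second = lookup.get()()                 # hit: version unchanged
methods["greet"] = lambda: "goodbye"    # mutation bumps the version
third = lookup.get()()                  # miss again, sees the new value
```

A reader holding a version number can tell cheaply whether the object changed
underneath it, which is a weaker but cheaper guarantee than freezing.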

-jJ


From eoghan at qatano.com  Wed Apr 11 11:38:43 2007
From: eoghan at qatano.com (Eoghan Murray)
Date: Wed, 11 Apr 2007 10:38:43 +0100
Subject: [Python-ideas] Implicit String Concatenation
Message-ID: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>

I heard the call for the P3K PEP April deadline, so I thought I better get
this sent off!

When I was first exposed to Python, I was delighted that I could do the
following:
>>> "Hello" ' world'
    'Hello world'
This turned to confusion when I tried:
>>> domain = " world"
>>> "hello" domain
Syntax Error ... Invalid Syntax

My proposal for Python3K is to allow string-concatenation via juxtaposition
between string-literals, string-variables and expressions that evaluate to
strings.
Juxtaposition has some precedent in Python (the example above) and also in
the awk programming language.

If anyone agrees that this is a good idea, then I'd be happy to write up a
PEP explaining why I think that implicit string concatenation is better than
overloading the plus operator (which this proposal wouldn't deprecate) and
more elegant than template strings or string interpolation.

Eoghan

From thobes at gmail.com  Wed Apr 11 13:01:04 2007
From: thobes at gmail.com (Tobias Ivarsson)
Date: Wed, 11 Apr 2007 13:01:04 +0200
Subject: [Python-ideas] from __future__ import function_annotations
Message-ID: <9997d5e60704110401k7333a276j3104f69ed0c8ca5a@mail.gmail.com>

I am curious about the plans for introducing function
annotations (PEP 3107). I could not find any information about this in the
PEP, nor when I searched the mail archives.
The way I see it, this feature would be worth introducing as
early as possible, since I believe there are quite a few tools that
could benefit from it today.
For example, I could see Jython using function annotations to declare
methods that are supposed to be accessible from Java code. This is done via
annotations in the docstring today, and would be a lot clearer using
function annotations.
Jython could implement this use of function annotations without Python
supporting it, but that would make the code incompatible between Python and
Jython, which would be highly unfortunate.
Therefore I propose that Python add support for function annotations in
version 2.6 via
from __future__ import function_annotations
This would make the change as compatible as for example @decorators or the
with-statement.

/Tobias

From g.brandl at gmx.net  Wed Apr 11 16:21:47 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 11 Apr 2007 16:21:47 +0200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
Message-ID: <eviqtr$fas$1@sea.gmane.org>

Eoghan Murray schrieb:
> I heard the call for the P3K PEP April deadline, so I thought I better 
> get this sent off!
> 
> When I was first exposed to Python, I was delighted that I could do the 
> following;
> >>> "Hello" ' world'
>     'Hello world'
> This turned to confusion when I tried;
> >>> domain = " world"
> >>> "hello" domain
> Syntax Error ... Invalid Syntax
> 
> My proposal for Python3K is to allow string-concatenation via 
> juxtaposition between string-literals, string-variables and expressions 
> that evaluate to strings.
> Juxtaposition has some precedence in Python (the example above) and also 
> in the awk programming language.

No, please! The concatenation of string literals is done in the parser.
Your proposal would move that to runtime and introduce a "whitespace operator".
How would you spell that? How would you overload it? etc.
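
The difference can be seen from Python itself. This is a small sketch,
assuming Python 3.8 or later (where literals parse to ast.Constant);
is_valid_expression is a helper made up for the example.

```python
import ast

# Adjacent literals are merged before any code runs: the parsed expression
# is a single constant, not a concatenation operation.
tree = ast.parse('"Hello" \' world\'', mode="eval")
literal_node = tree.body


def is_valid_expression(source):
    """Return True if source compiles as an expression (nothing is executed)."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


juxtaposed_name_ok = is_valid_expression('"hello" domain')
explicit_plus_ok = is_valid_expression('"hello" + domain')
```

The literal pair never reaches runtime as two strings, whereas a name next to
a literal is rejected before anything is evaluated.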

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From jcarlson at uci.edu  Wed Apr 11 17:00:20 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 11 Apr 2007 08:00:20 -0700
Subject: [Python-ideas] from __future__ import function_annotations
In-Reply-To: <9997d5e60704110401k7333a276j3104f69ed0c8ca5a@mail.gmail.com>
References: <9997d5e60704110401k7333a276j3104f69ed0c8ca5a@mail.gmail.com>
Message-ID: <20070411075823.62AB.JCARLSON@uci.edu>


"Tobias Ivarsson" <thobes at gmail.com> wrote:
> I am just curiously wondering about the plans for introducing function
> annotations (PEP 3107). I could not find any information about this in the
> PEP, neither when I searched the mail archives.

Function annotations are a Python 3.0 feature.  From what I understand,
they have a potential implementation, tests, and documentation.  You
just have to wait until the Alpha comes out.

 - Josiah



From collinw at gmail.com  Wed Apr 11 17:01:54 2007
From: collinw at gmail.com (Collin Winter)
Date: Wed, 11 Apr 2007 08:01:54 -0700
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <eviqtr$fas$1@sea.gmane.org>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<eviqtr$fas$1@sea.gmane.org>
Message-ID: <43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>

On 4/11/07, Georg Brandl <g.brandl at gmx.net> wrote:
> Eoghan Murray schrieb:
[snip]
> > My proposal for Python3K is to allow string-concatenation via
> > juxtaposition between string-literals, string-variables and expressions
> > that evaluate to strings.
> > Juxtaposition has some precedence in Python (the example above) and also
> > in the awk programming language.
>
> No, please! The concatenation of string literals is done in the parser.
> Your proposal would move that to runtime and introduce a "whitespace operator".
> How would you spell that? How would you overload it? etc.

A single-width whitespace operator would just be confusing since PEP
3117 will be using zero-width spaces for the None typedef : )

Collin


From collinw at gmail.com  Wed Apr 11 17:11:02 2007
From: collinw at gmail.com (Collin Winter)
Date: Wed, 11 Apr 2007 08:11:02 -0700
Subject: [Python-ideas] from __future__ import function_annotations
In-Reply-To: <20070411075823.62AB.JCARLSON@uci.edu>
References: <9997d5e60704110401k7333a276j3104f69ed0c8ca5a@mail.gmail.com>
	<20070411075823.62AB.JCARLSON@uci.edu>
Message-ID: <43aa6ff70704110811s36e5d707l634f80783ba7fcf1@mail.gmail.com>

On 4/11/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> "Tobias Ivarsson" <thobes at gmail.com> wrote:
> > I am just curiously wondering about the plans for introducing function
> > annotations (PEP 3107). I could not find any information about this in the
> > PEP, neither when I searched the mail archives.
>
> Function annotations are a Python 3.0 feature.  From what I understand,
> they have a potential implementation, tests, and documentation.  You
> just have to wait until the Alpha comes out.

Backporting annotations to 2.x was discussed in March
(http://mail.python.org/pipermail/python-3000/2007-March/006107.html)
to generally positive reception. The only possible hiccup would be
that annotations wouldn't be supported for tuple parameters, since
tuple params won't survive in 3.0 anyway.

Collin Winter


From tony at PageDNA.com  Wed Apr 11 17:24:19 2007
From: tony at PageDNA.com (Tony Lownds)
Date: Wed, 11 Apr 2007 08:24:19 -0700
Subject: [Python-ideas] from __future__ import function_annotations
In-Reply-To: <9997d5e60704110401k7333a276j3104f69ed0c8ca5a@mail.gmail.com>
References: <9997d5e60704110401k7333a276j3104f69ed0c8ca5a@mail.gmail.com>
Message-ID: <133B3E94-4E72-4E59-8D21-2056316C210D@PageDNA.com>


On Apr 11, 2007, at 4:01 AM, Tobias Ivarsson wrote:

> I am just curiously wondering about the plans for introducing  
> function annotations (PEP 3107). I could not find any information  
> about this in the PEP, neither when I searched the mail archives.
> The way I see it this feature could be quite interesting to  
> introduce as early as possible since I believe that there are quite  
> a few tools that could benefit from this today.
> I could for example see Jython using function annotations for  
> declaring methods that are supposed to be accessible from java  
> code. This is done via annotations in the doc string today, and  
> would be a lot clearer using function annotations.
> Jython could implement this use of function annotations without  
> python supporting it, but that would make the code incompatible  
> between python and Jython, which would be highly unfortunate.
> Therefore i propose that python adds support for function  
> annotations in version 2.6 via
> from __future__ import function_annotations
> This would make the change as compatible as for example @decorators  
> or the with-statement.
>

Function annotations PEP is accepted and code has been checked in to  
p3yk. I don't think there would be much support for
the syntax in 2.6, but I could be wrong. A more palatable  
compatibility strategy may be to introduce a decorator that sets
function.__annotations__, so that these two function definitions  
would have equivalent annotations:

 >>> @annotate(int, int, returns=int)
... def gcd1(m, n):
...   etc

 >>> def gcd2(m: int, n: int) -> int:
...   etc

It's easier for a decorator to be compatible with run-time semantics,  
and more likely to avoid syntax questions, than embedding
the annotations in docstrings. Source code conversion (2to3) could  
change the @annotate decorator form to the in-line function
form. Tools could be written to use either annotation set.
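
A rough sketch of how such a decorator might look, written in Python 3 syntax
so that both forms can sit side by side. The name annotate and its signature
are hypothetical, as is the choice to map positional types onto parameters in
declaration order.

```python
import inspect

_MISSING = object()   # sentinel: distinguishes "no return type" from None


def annotate(*arg_types, returns=_MISSING):
    """Hypothetical decorator: positional types map onto parameters in order."""
    def decorator(func):
        names = list(inspect.signature(func).parameters)
        if len(arg_types) > len(names):
            raise TypeError("more annotations than parameters")
        annotations = dict(zip(names, arg_types))
        if returns is not _MISSING:
            annotations["return"] = returns
        func.__annotations__ = annotations
        return func
    return decorator


@annotate(int, int, returns=int)
def gcd1(m, n):
    while n:
        m, n = n, m % n
    return m


def gcd2(m: int, n: int) -> int:
    while n:
        m, n = n, m % n
    return m


same = gcd1.__annotations__ == gcd2.__annotations__
```

With this sketch, a tool that reads __annotations__ sees the same mapping
from either definition.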

What do y'all think?

-Tony



From g.brandl at gmx.net  Wed Apr 11 18:10:23 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 11 Apr 2007 18:10:23 +0200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>	<eviqtr$fas$1@sea.gmane.org>
	<43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
Message-ID: <evj19f$e41$2@sea.gmane.org>

Collin Winter schrieb:
> On 4/11/07, Georg Brandl <g.brandl at gmx.net> wrote:
>> Eoghan Murray schrieb:
> [snip]
>> > My proposal for Python3K is to allow string-concatenation via
>> > juxtaposition between string-literals, string-variables and expressions
>> > that evaluate to strings.
>> > Juxtaposition has some precedence in Python (the example above) and also
>> > in the awk programming language.
>>
>> No, please! The concatenation of string literals is done in the parser.
>> Your proposal would move that to runtime and introduce a "whitespace operator".
>> How would you spell that? How would you overload it? etc.
> 
> A single-width whitespace operator would just be confusing since PEP
> 3117 will be using zero-width spaces for the None typedef : )

Thinking in that direction, NO-BREAK SPACE would be a perfect choice for
an operator!

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From collinw at gmail.com  Wed Apr 11 18:16:02 2007
From: collinw at gmail.com (Collin Winter)
Date: Wed, 11 Apr 2007 09:16:02 -0700
Subject: [Python-ideas] from __future__ import function_annotations
In-Reply-To: <133B3E94-4E72-4E59-8D21-2056316C210D@PageDNA.com>
References: <9997d5e60704110401k7333a276j3104f69ed0c8ca5a@mail.gmail.com>
	<133B3E94-4E72-4E59-8D21-2056316C210D@PageDNA.com>
Message-ID: <43aa6ff70704110916odd7df89waf95566f9ea9d19@mail.gmail.com>

On 4/11/07, Tony Lownds <tony at pagedna.com> wrote:
> Function annotations PEP is accepted and code has been checked in to
> p3yk. I don't think there would be much support for
> the syntax in 2.6, but I could be wrong. A more palatable
> compatibility strategy may be to introduce a decorator that sets
> function.__annotations__, so that these two function definitions
> would have equivalent annotations:
>
>  >>> @annotate(int, int, returns=int)
> ... def gcd1(m, n):
> ...   etc
>
>  >>> def gcd2(m: int, n: int) -> int:
> ...   etc
>
> It's easier for a decorator to be compatible with run-time semantics,
> and more likely to avoid syntax questions, than embedding
> the annotattions in docstrings. Source code conversion (2to3) could
> change the @annotate decorator form to the in-line function
> form. Tools could be written to use either annotation set.
>
> What do y'all think?

Speaking only to the part about 2to3, that sort of conversion would be
a pain in the ass to write. Even if the @annotate decorator were
keyword-args only (allowing positional args complicates the
implementation more than you would expect), it would still probably be
quicker/easier/more accurate just to port the 3.0 annotations
implementation to 2.6.

Collin Winter


From eoghan at qatano.com  Wed Apr 11 18:23:46 2007
From: eoghan at qatano.com (Eoghan Murray)
Date: Wed, 11 Apr 2007 17:23:46 +0100
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<eviqtr$fas$1@sea.gmane.org>
	<43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
Message-ID: <ddb3f080704110923s4792fb4cn3dc7163cab604818@mail.gmail.com>

Hi guys,

Thanks for your replies:


On 4/11/07, Georg Brandl <g.brandl at gmx.net> wrote:
> [snip]
>
> No, please! The concatenation of string literals is done in the parser.
> Your proposal would move that to runtime [snip...]

An implementation detail?

> [...snip] and introduce a "whitespace operator".
> How would you spell that? How would you overload it? etc.

This is exactly what I'm proposing.  You could spell it __juxta__  short for
juxtaposition or __concat__, and overload it as usual :-)

On 11/04/07, Collin Winter <collinw at gmail.com> wrote:
> A single-width whitespace operator would just be confusing since PEP
> 3117 will be using zero-width spaces for the None typedef : )

3117 looks cool, but it is in draft stages so needn't factor.

Anyone with any positive reactions?

Eoghan

From jcarlson at uci.edu  Wed Apr 11 18:51:14 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 11 Apr 2007 09:51:14 -0700
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <ddb3f080704110923s4792fb4cn3dc7163cab604818@mail.gmail.com>
References: <43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
	<ddb3f080704110923s4792fb4cn3dc7163cab604818@mail.gmail.com>
Message-ID: <20070411095003.62B1.JCARLSON@uci.edu>


"Eoghan Murray" <eoghan at qatano.com> wrote:
> Anyone with any positive reactions?

Sorry, only negative from me.

 - Josiah



From tony at pagedna.com  Wed Apr 11 19:32:24 2007
From: tony at pagedna.com (Tony Lownds)
Date: Wed, 11 Apr 2007 10:32:24 -0700
Subject: [Python-ideas] from __future__ import function_annotations
In-Reply-To: <43aa6ff70704110916odd7df89waf95566f9ea9d19@mail.gmail.com>
References: <9997d5e60704110401k7333a276j3104f69ed0c8ca5a@mail.gmail.com>
	<133B3E94-4E72-4E59-8D21-2056316C210D@PageDNA.com>
	<43aa6ff70704110916odd7df89waf95566f9ea9d19@mail.gmail.com>
Message-ID: <5745BD18-FFEB-49B2-8E4E-FAC11516F599@pagedna.com>


On Apr 11, 2007, at 9:16 AM, Collin Winter wrote:
> Speaking only to the part about 2to3, that sort of conversion would be
> a pain in the ass to write. Even if the @annotate decorator were
> keyword-args only (allowing positional args complicates the
> implementation more than you would expect), it would still probably be
> quicker/easier/more accurate just to port the 3.0 annotations
> implementation to 2.6.

Ok. +1 backporting the syntax from me, FWIW

-Tony


From jason.orendorff at gmail.com  Wed Apr 11 20:01:59 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Wed, 11 Apr 2007 14:01:59 -0400
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <ddb3f080704110923s4792fb4cn3dc7163cab604818@mail.gmail.com>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<eviqtr$fas$1@sea.gmane.org>
	<43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
	<ddb3f080704110923s4792fb4cn3dc7163cab604818@mail.gmail.com>
Message-ID: <bb8868b90704111101n5351ebdamd588d5838f7763c6@mail.gmail.com>

On 4/11/07, Eoghan Murray <eoghan at qatano.com> wrote:
> This is exactly what I'm proposing.  You could spell it __juxta__  short for
> juxtaposition or __concat__, and overload it as usual :-)

And if __juxta__ is not defined, it should fall back first on
__call__, then __mul__, then __add__.  If it binds right-to-left, you
could write things like

  from math import *
  print (2 sin x + cos x)

We might as well make newlines an operator at the same time.  There's
precedent for this in Haskell, and good synergy--adding the STM monad
to Python would solve the GIL problem.  You could spell that operator
__bind__ or just __>>=__, take your pick.

And I think Guido already committed to ripping out the @decorator
syntax in Py3k in favor of comment overloading, via __rem__().

Just kidding, of course...

> Anyone with any positive reactions?

Eoghan, thanks for taking the time to write.  I don't think anyone
likes the idea, though.  It causes many grammatical problems:  should
a[0] parse as a.__getitem__(0) or a.__juxta__([0])?  What about
(foo)(bar)?  And while "sin x" would of course mean sin.__juxta__(x),
"sin -x" would parse as "sin - x", or sin.__sub__(x).

A few extra + signs are a small price to pay.

-j


From g.brandl at gmx.net  Wed Apr 11 20:11:51 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 11 Apr 2007 20:11:51 +0200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <ddb3f080704110923s4792fb4cn3dc7163cab604818@mail.gmail.com>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>	<eviqtr$fas$1@sea.gmane.org>	<43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
	<ddb3f080704110923s4792fb4cn3dc7163cab604818@mail.gmail.com>
Message-ID: <evj8d7$c58$1@sea.gmane.org>

Eoghan Murray wrote:
> Hi guys,
> 
> Thanks for your replies:
> 
> 
>     On 4/11/07, Georg Brandl <g.brandl at gmx.net> wrote:
>     [snip]
> 
>      No, please! The concatenation of string literals is done in the parser.
> 
>      Your proposal would move that to runtime [snip...]
> 
> 
>  An implementation detail?

A rather involved "detail".

>      [...snip] and introduce a "whitespace operator".
>      How would you spell that? How would you overload it? etc.
> 
> 
> This is exactly what I'm proposing.  You could spell it __juxta__  short 
> for juxtaposition or __concat__, and overload it as usual :-)
> 
>     A single-width whitespace operator would just be confusing since PEP
>     3117 will be using zero-width spaces for the None typedef : )
> 
> 
> 3117 looks cool, but it is in draft stages so needn't factor.

This is a joke, isn't it? You're a bit late...

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From adam at atlas.st  Wed Apr 11 20:39:00 2007
From: adam at atlas.st (Adam Atlas)
Date: Wed, 11 Apr 2007 14:39:00 -0400
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<eviqtr$fas$1@sea.gmane.org>
	<43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
Message-ID: <7B3BA968-4228-49E2-A590-D84E34550EF4@atlas.st>


On 11 Apr 2007, at 11.01, Collin Winter wrote:

> On 4/11/07, Georg Brandl <g.brandl at gmx.net> wrote:
>> No, please! The concatenation of string literals is done in the  
>> parser.
>> Your proposal would move that to runtime and introduce a  
>> "whitespace operator".
>> How would you spell that? How would you overload it? etc.
>
> A single-width whitespace operator would just be confusing since PEP
> 3117 will be using zero-width spaces for the None typedef : )

I propose we use the ASCII character 0x07 (BEL) as the concatenation  
operator. It's invisible, so your code still looks nice and clean,  
but you know it's there because your text editor will beep at you  
every time you pass it. :)

(Speaking of PEP 3117, I will fight it to the death unless the  
typedef for Exception is changed to Unicode character 2620 (SKULL AND  
CROSSBONES) or 2623 (BIOHAZARD SIGN). Brilliant choice for frozenset,  
though. No longer need I wonder why the Unicode Consortium saw fit to  
include a snowman character!)


From jimjjewett at gmail.com  Wed Apr 11 22:15:54 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 11 Apr 2007 16:15:54 -0400
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
Message-ID: <fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>

On 4/11/07, Eoghan Murray <eoghan at qatano.com> wrote:
> When I was first exposed to Python, I was delighted that I could do the
> following;
> >>> "Hello" ' world'
>     'Hello world'
> This turned to confusion when I tried;
> >>> domain = " world"
> >>> "hello" domain
> Syntax Error ... Invalid Syntax

I would support a proposal to remove the implicit concatenation entirely.

I suspect it would be shot down for backwards compatibility (even in
Py3K), but from a readability standpoint ...

I have never seen a string concatenation that would look worse
because of a "+".

I *have* seen some bugs where a comma was forgotten, and two arguments
got invisibly jammed together.  That's a pain to debug in C; in python
with default values, the interpreter may not even gripe sensibly.
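
To make the default-value case concrete, here is a minimal sketch.  The
build() function and its flags parameter are hypothetical names made up
for illustration; they are not from any real library.  With the comma
forgotten, the two literals merge into one argument and the default
quietly fills the gap, so nothing complains:

```python
def build(source, flags="-O2"):
    """Hypothetical helper: return the arguments it actually received."""
    return source, flags

# Intended call: two separate arguments.
intended = build("main.c", "-g")

# Comma forgotten: the adjacent literals are joined into a single
# argument, and flags silently keeps its default value.
mistaken = build("main.c" "-g")

print(intended)  # ('main.c', '-g')
print(mistaken)  # ('main.c-g', '-O2')
```

No exception is raised in the second call; the error only shows up
later, wherever the merged string is used.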

-jJ


From jason.orendorff at gmail.com  Wed Apr 11 23:03:10 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Wed, 11 Apr 2007 17:03:10 -0400
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
Message-ID: <bb8868b90704111403o5e1bd006r6ed417cda4268573@mail.gmail.com>

On 4/11/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> I have never seen a string concatenation that would look worse
> because of a "+".
>
> I *have* seen some bugs where a comma was forgotten, and two arguments
> got invisibly jammed together.  That's a pain to debug in C; in python
> with default values, the interpreter may not even gripe sensibly.

Oh.  I just realized this happens a lot out here.  Where I work, we
use scons, and each SConscript has a long list of filenames:

sourceFiles = [
    'foo.c',
    'bar.c',
    #...many lines omitted...
    'q1000x.c']

It's a common mistake to leave off a comma, and then scons complains
that it can't find 'foo.cbar.c'.  This is pretty bewildering behavior
even if you *are* a Python programmer, and not everyone here is.
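
A small sketch of that failure mode, reusing the filenames above (the
variable name source_files is just for illustration).  Dropping one
comma shortens the list by one entry and produces a filename that
exists nowhere:

```python
source_files = [
    'foo.c'      # <- comma forgotten here
    'bar.c',
    'q1000x.c',
]

print(source_files)       # ['foo.cbar.c', 'q1000x.c']
print(len(source_files))  # 2
```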

-j


From g.brandl at gmx.net  Wed Apr 11 23:06:54 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 11 Apr 2007 23:06:54 +0200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <bb8868b90704111403o5e1bd006r6ed417cda4268573@mail.gmail.com>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>	<fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
	<bb8868b90704111403o5e1bd006r6ed417cda4268573@mail.gmail.com>
Message-ID: <evjile$ahq$1@sea.gmane.org>

Jason Orendorff wrote:
> On 4/11/07, Jim Jewett <jimjjewett at gmail.com> wrote:
>> I have never seen a string concatenation that would look worse
>> because of a "+".
>>
>> I *have* seen some bugs where a comma was forgotten, and two arguments
>> got invisibly jammed together.  That's a pain to debug in C; in python
>> with default values, the interpreter may not even gripe sensibly.
> 
> Oh.  I just realized this happens a lot out here.  Where I work, we
> use scons, and each SConscript has a long list of filenames:
> 
> sourceFiles = [
>     'foo.c',
>     'bar.c',
>     #...many lines omitted...
>     'q1000x.c']
> 
> It's a common mistake to leave off a comma, and then scons complains
> that it can't find 'foo.cbar.c'.  This is pretty bewildering behavior
> even if you *are* a Python programmer, and not everyone here is.

I think that convinces me to support the removal.

Georg





From adam at atlas.st  Wed Apr 11 23:09:22 2007
From: adam at atlas.st (Adam Atlas)
Date: Wed, 11 Apr 2007 17:09:22 -0400
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
Message-ID: <BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>


On 11 Apr 2007, at 16.15, Jim Jewett wrote:
> I would support a proposal to remove the implicit concatenation  
> entirely.

I'd agree with that. The parser can probably just do the same  
optimization automatically if it gets [string literal] "+" [string  
literal]. (Or does it already?)
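
One way to check this on whatever interpreter is at hand is to look at
the constants stored in a compiled function.  This is only a sketch:
__code__.co_consts is a CPython implementation detail, and the function
names are made up for the example.

```python
def explicit():
    return "Hello" + " world"

def implicit():
    return "Hello" " world"

# Both spellings produce the same string at run time.
print(explicit() == implicit())  # True

# The adjacent-literal form is joined before any bytecode exists, so the
# combined string is already a constant of the function.
print("Hello world" in implicit.__code__.co_consts)  # True

# Whether the "+" form is folded too depends on the implementation and
# version, so no result is claimed here.
print("Hello world" in explicit.__code__.co_consts)
```

If the last line prints True, the compiler has done the same
optimization for the explicit "+" form.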

Meanwhile, on a similar subject, I have a... strange idea. I'm not  
sure how easy/hard it would be to parse or how necessary it is, but  
it's just a thought.

Currently, you can do multiline strings a couple of ways:
x = '''foo
bar
baz'''
x = 'foo' \
     'bar' \
     'baz'

Neither of these seems ideal. Triple-quoting is decent, but it can get
ugly if you're using it in an indented block (as you most often will  
be), since the following lines are considered to start right after  
the newline, not after the containing block's indentation level. But  
changing it to the latter behaviour has been discussed before, if I  
remember correctly, and that didn't seem popular. That's  
understandable; the current triple-quote multiline behaviour makes  
sense from a logical point of view, it just doesn't look as nice to  
have text suddenly fall down to 0 indentation and then jump back to  
the original indentation level when the quote is over. So anyway,  
what I'm proposing is the following:

x = 'foo
     'bar
     'baz'

In other words, if you start a ' or "-quoted string, and don't close  
it at the end of the line, you can continue it on the next line. It  
would be generally equivalent to appending \n, closing the quote, and  
preceding the physical newline with a backslash. (And inserting a  
plus sign, if we take Jim's proposal into account.) Not closing a  
quote and doing something else on the next line (i.e. not starting it  
with a quote character after any whitespace) would still be a syntax  
error.
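
For comparison, here is the closest spelling that already works,
assuming the "append \n and close the quote" reading described above.
The variable names are illustrative only.  Parentheses stand in for the
trailing backslashes, and adjacent literals are joined by the compiler:

```python
# What the proposed syntax would be shorthand for.
proposed_equivalent = ('foo\n'
                       'bar\n'
                       'baz')

# The triple-quoted form with the same contents.
triple_quoted = '''foo
bar
baz'''

print(proposed_equivalent == triple_quoted)  # True
```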

This style has a precedent in the multi-paragraph quoting style of
English: if you end a paragraph without closing a quote, then you
continue it by starting the next one with a quote, and you can
continue like that until you do have an end-quote.

I think it would improve readability/writability for when you need to
include multiline text blocks or code blocks. Having to put \n"+\
at the end of each line really breaks up the flow, whether of a
block of human or computer language text. And having subsequent lines
fall to 0 indentation (if you choose to use triple-quotes) breaks up
the flow of the surrounding Python code. This seems like a good
solution, especially since it has precedent in written English. Any
thoughts?


From jcarlson at uci.edu  Thu Apr 12 00:19:29 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 11 Apr 2007 15:19:29 -0700
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>
References: <fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
	<BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>
Message-ID: <20070411151507.62BA.JCARLSON@uci.edu>


Adam Atlas <adam at atlas.st> wrote:
[snip]
> Currently, you can do multiline strings a couple of ways:
> x = '''foo
> bar
> baz'''
> x = 'foo' \
>      'bar' \
>      'baz'
[snip]
> x = 'foo
>      'bar
>      'baz'

That's a horrible idea.  It's even worse than the 'space implies
concatenation' suggestion made earlier.

If you want to get rid of indentation in the case of...

    x = '''foo
           bar
           baz'''

use textwrap.dedent and friends.
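
A minimal sketch of how textwrap.dedent behaves (the function names
are made up for the example).  One caveat: dedent() removes only the
whitespace common to every line, so a first line that sits right
after the opening quotes, as in the snippet above, leaves nothing to
strip.  Starting the text on the following line avoids that:

```python
import textwrap

def indented_block():
    # The backslash after the opening quotes suppresses the first
    # newline, so every line of text shares the same 8-space margin.
    text = """\
        foo
        bar
          baz"""
    return textwrap.dedent(text)

def first_line_flush():
    # The first line has no indentation, so the common margin is empty
    # and dedent() returns the text unchanged.
    text = """foo
        bar
        baz"""
    return text, textwrap.dedent(text)

print(repr(indented_block()))  # 'foo\nbar\n  baz'

original, dedented = first_line_flush()
print(original == dedented)    # True
```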

 - Josiah



From greg.ewing at canterbury.ac.nz  Thu Apr 12 00:48:36 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 12 Apr 2007 10:48:36 +1200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <eviqtr$fas$1@sea.gmane.org>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<eviqtr$fas$1@sea.gmane.org>
Message-ID: <461D65C4.4070108@canterbury.ac.nz>

Georg Brandl wrote:

> Your proposal would move that to runtime and introduce a "whitespace operator".
> How would you spell that? How would you overload it? etc.

Using the ____() method, obviously. :-)

But seriously, there is no way this is going to
fly. Python is not Perl or awk (or SNOBOL).

--
Greg


From tjreedy at udel.edu  Wed Apr 11 23:54:11 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 11 Apr 2007 17:54:11 -0400
Subject: [Python-ideas] Implicit String Concatenation
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com><fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
	<BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>
Message-ID: <evjldr$k7h$1@sea.gmane.org>


"Adam Atlas" <adam at atlas.st> wrote in message 
news:BCDF57D3-8555-4230-8ABD-8419A41A8E1C at atlas.st...
|
| On 11 Apr 2007, at 16.15, Jim Jewett wrote:
| > I would support a proposal to remove the implicit concatenation
| > entirely.

Raymond H. is proposing this for Py3.

| I'd agree with that. The parser can probably just do the same
| optimization automatically if it gets [string literal] "+" [string
| literal]. (Or does it already?)

He says it does (not sure which version he meant).


| what I'm proposing is the following:
|
| x = 'foo
|     'bar
|     'baz'

-1 Looks ugly to me ;-)

tjr





From jan.kanis at phil.uu.nl  Thu Apr 12 11:53:58 2007
From: jan.kanis at phil.uu.nl (Jan Kanis)
Date: Thu, 12 Apr 2007 11:53:58 +0200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <evjldr$k7h$1@sea.gmane.org>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
	<BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>
	<evjldr$k7h$1@sea.gmane.org>
Message-ID: <op.tqn0f8pad64u53@jan-lenovo>

On Wed, 11 Apr 2007 23:54:11 +0200, Terry Reedy <tjreedy at udel.edu> wrote:
> | what I'm proposing is the following:
> |
> | x = 'foo
> |     'bar
> |     'baz'
>
> -1 Looks ugly to me ;-)

Indeed, I don't really like this syntax. I would like it if there were a
way to spell 'multiline string with indentation chopped off'. The easiest
way (syntax-wise) would be to just have triple quotes do that, but that's
gonna give backward compat problems.

Jan


From eoghan at qatano.com  Thu Apr 12 13:34:18 2007
From: eoghan at qatano.com (Eoghan Murray)
Date: Thu, 12 Apr 2007 12:34:18 +0100
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <7B3BA968-4228-49E2-A590-D84E34550EF4@atlas.st>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<eviqtr$fas$1@sea.gmane.org>
	<43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
	<7B3BA968-4228-49E2-A590-D84E34550EF4@atlas.st>
Message-ID: <ddb3f080704120434m21ae0195kf7269ef4f7e1e829@mail.gmail.com>

On 11/04/07, Adam Atlas <adam at atlas.st> wrote:
>
> On 11 Apr 2007, at 11.01, Collin Winter wrote:
>
> > On 4/11/07, Georg Brandl <g.brandl at gmx.net> wrote:
> >> No, please! The concatenation of string literals is done in the
> >> parser.
> >> Your proposal would move that to runtime and introduce a
> >> "whitespace operator".
> >> How would you spell that? How would you overload it? etc.
> >
> > A single-width whitespace operator would just be confusing since PEP
> > 3117 will be using zero-width spaces for the None typedef : )
>
> I propose we use the ASCII character 0x07 (BEL) as the concatenation
> operator. It's invisible, so your code still looks nice and clean,
> but you know it's there because your text editor will beep at you
> every time you pass it. :)
>

LOL, I'll reply to the funniest put-down!

The rationale for this is that Python should have one definitive way
of concatenating strings.
I dislike '+' as a string concatenation operator as I think
overloading the meaning of '+' for non-numbers is ugly, and I dislike
'%s' string formatting as it perpetuates perhaps obscure C syntax, as
well as shunting the variables to the end of the line - hard for a
human to parse.

Given that __juxta__ isn't going to fly, +1 for complete removal of
implicit string concatenation in Py3k

Eoghan


From adam at atlas.st  Thu Apr 12 16:11:40 2007
From: adam at atlas.st (Adam Atlas)
Date: Thu, 12 Apr 2007 10:11:40 -0400
Subject: [Python-ideas] Regular expression algorithms
Message-ID: <C6567D95-7DA4-4BBF-A4CC-4959E2C2563F@atlas.st>

Has anyone seen this article?
http://swtch.com/~rsc/regexp/regexp1.html

Are its criticisms of Python's regex algorithm accurate? If so, might  
it be possible to revise Python's `re` module to use this sort of  
algorithm? I noticed it says that this approach doesn't work if the  
pattern contains backreferences, but maybe the module could at least  
sort of self-optimize by switching to this method when no backrefs  
are used.


From ntoronto at cs.byu.edu  Thu Apr 12 17:39:47 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Thu, 12 Apr 2007 09:39:47 -0600
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <op.tqn0f8pad64u53@jan-lenovo>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>	<fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>	<BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>	<evjldr$k7h$1@sea.gmane.org>
	<op.tqn0f8pad64u53@jan-lenovo>
Message-ID: <461E52C3.8030907@cs.byu.edu>

Jan Kanis wrote:
> On Wed, 11 Apr 2007 23:54:11 +0200, Terry Reedy <tjreedy at udel.edu> wrote:
>   
>> | what I'm proposing is the following:
>> |
>> | x = 'foo
>> |     'bar
>> |     'baz'
>>
>> -1 Looks ugly to me ;-)
>>     
>
> Indeed, I don't really like this syntax. I would like it if there were a
> way to spell 'multiline string with indentation chopped off'. The easiest
> way (syntax-wise) would be to just have triple quotes do that, but that's
> gonna give backward compat problems.
>   

These cases would be fine:

        a = """Some text.
    Some more text."""

    def f(x):
        """Translates x into Hungarian.
        Does it quite badly."""
        pass

This wouldn't:

    a = """Some text.
        Some intentionally indented text."""

How often do people rely on those tabs or spaces being preserved?

Neil



From jcarlson at uci.edu  Thu Apr 12 18:04:57 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 12 Apr 2007 09:04:57 -0700
Subject: [Python-ideas] Regular expression algorithms
In-Reply-To: <C6567D95-7DA4-4BBF-A4CC-4959E2C2563F@atlas.st>
References: <C6567D95-7DA4-4BBF-A4CC-4959E2C2563F@atlas.st>
Message-ID: <20070412084641.62C6.JCARLSON@uci.edu>


Adam Atlas <adam at atlas.st> wrote:
> 
> Has anyone seen this article?
> http://swtch.com/~rsc/regexp/regexp1.html

Yes, it has been posted in the sourceforge tracker as a feature request,
in python-dev, and now here.

> Are its criticisms of Python's regex algorithm accurate?

In the worst case, yes, Python's regular expression engine runs in O(2^n)
time (where n is the length of the string you are searching).  However, as
stated in the SourceForge entry and in other places, one has to write a
fairly useless regular expression to get it into the O(2^n) running time.
For many cases, Python's regular expression engine is quite competitive
with the Thompson NFA.
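
For anyone who wants to see the worst case, here is a small sketch
using the pattern family from the article: "a?" repeated n times
followed by "a" repeated n times, matched against a string of n a's.
The helper names are made up for the example, and no timings are
claimed, since they depend on the machine and the Python version.

```python
import re
import time

def pathological(n):
    """Build the article's test case: a?{n}a{n} against 'a' * n."""
    pattern = re.compile("a?" * n + "a" * n)
    return pattern, "a" * n

def time_match(n):
    """Return (matched, elapsed_seconds) for the size-n test case."""
    pattern, text = pathological(n)
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start

# Each run matches; watch how the elapsed time grows with n.
for n in (10, 14, 18):
    matched, seconds = time_match(n)
    print(n, matched, round(seconds, 4))
```

The match always succeeds, but only after the engine has tried and
abandoned a large number of ways to assign the optional a's.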


> If so, might  
> it be possible to revise Python's `re` module to use this sort of  
> algorithm? I noticed it says that this approach doesn't work if the  
> pattern contains backreferences, but maybe the module could at least  
> sort of self-optimize by switching to this method when no backrefs  
> are used.

It is possible, but only if someone takes the time to offer a patch.
One thing to remember is that, as stated in the documentation for
Python's re module, certain operators are "greedy"; that is, a* will
pick up as many a's as it possibly can, whereas a base Thompson NFA
will move on to the next state as early as possible, making a* with
Thompson analogous to a*? in the Python (and others') regular expression
engine. Yes, the Thompson NFA can be modified to also be greedy, but
that is a particular characteristic of Python's engine that a Thompson
NFA based engine will have to emulate (along with groups, named
references, etc., which are a PITA for non-recursive engines).
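
The greedy and non-greedy behaviours of the re module mentioned above,
as a short illustration (the variable names are arbitrary):

```python
import re

text = "aaab"

# Greedy: a* takes as many a's as it can.
greedy = re.match(r"a*", text).group()

# Non-greedy: a*? takes as few as it can, here none at all.
lazy = re.match(r"a*?", text).group()

print(repr(greedy))  # 'aaa'
print(repr(lazy))    # ''
```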


 - Josiah



From jimjjewett at gmail.com  Thu Apr 12 18:11:01 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 12 Apr 2007 12:11:01 -0400
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <461E52C3.8030907@cs.byu.edu>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
	<BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>
	<evjldr$k7h$1@sea.gmane.org> <op.tqn0f8pad64u53@jan-lenovo>
	<461E52C3.8030907@cs.byu.edu>
Message-ID: <fb6fbf560704120911t3086b8e1ue3edf84c5fbf2064@mail.gmail.com>

On 4/12/07, Neil Toronto <ntoronto at cs.byu.edu> wrote:
> Jan Kanis wrote:
> > On Wed, 11 Apr 2007 23:54:11 +0200, Terry Reedy <tjreedy at udel.edu> wrote:

> > Indeed, I don't really like this syntax. I would like it if there were
> > a way to spell 'multiline string with indentation chopped off'.

Most of the time, the extra indents are OK.  And if they aren't, it is
usually OK to start the string with a blank line.  (So everything is
aligned to the left, at least.)

Would textwrap.dedent do what you wanted (if it were added to __all__)?
Should it have a mode to skip the first line?
Should TextWrapper expose it somehow?  (My thought would be
to optionally call it from within _munge_whitespace.)
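
As a sketch of what a "skip the first line" mode looks like, current
Python versions ship inspect.cleandoc, which strips the first line
separately and dedents the rest by their common margin.  It is shown
here only for comparison, not as the mode being proposed; the sample
text is adapted from the examples in this thread.

```python
import inspect
import textwrap

text = """Some text.
    Some more text.
        Some intentionally indented text."""

# dedent() sees a first line with no margin, so nothing is removed.
print(textwrap.dedent(text) == text)  # True

# cleandoc() ignores the first line when computing the margin, then
# removes the 4 spaces shared by the remaining lines.
cleaned = inspect.cleandoc(text)
print(repr(cleaned))
# 'Some text.\nSome more text.\n    Some intentionally indented text.'
```

The relative indentation of the last line survives, which is the part
people would most plausibly rely on.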

>     a = """Some text.
>         Some intentionally indented text."""

> How often do people rely on those tabs or spaces being preserved?

For doctests, mainly, so a consistent change would be OK ...  but
triple quoted strings are supposed to be almost exactly WYSIWYG.

-jJ


From rrr at ronadam.com  Thu Apr 12 20:10:52 2007
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 12 Apr 2007 13:10:52 -0500
Subject: [Python-ideas] Thread IPC idea: Quipe? Sockqueue? (Re: Python
 and Concurrency)
In-Reply-To: <fb6fbf560704020907k2ef0edcdre34b630d4a4a13c7@mail.gmail.com>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org>	
	<460B23FE.2010309@canterbury.ac.nz>	
	<fb6fbf560703290629q3f7b7b59wf168b78f8867473e@mail.gmail.com>	
	<460C0E69.9060007@ronadam.com>	
	<fb6fbf560703301249r7b3352f5j492803cf65c6d36@mail.gmail.com>	
	<460F4DB2.7030600@ronadam.com>
	<fb6fbf560704020907k2ef0edcdre34b630d4a4a13c7@mail.gmail.com>
Message-ID: <461E762C.6040100@ronadam.com>

Jim Jewett wrote:
> On 4/1/07, Ron Adam <rrr at ronadam.com> wrote:
>> Jim Jewett wrote:
> 
>>  > And there is where it starts to fall apart.  Though if you look at the
>>  > pypy dict and interpreter optimizations, they have started to deal
>>  > with it through versioning types.
>>
>> I didn't find anything about "versioning" at these links. Did I miss it?
> 
>> http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#multi-dicts 
>>
> 
>>  > 
>> http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#id23 
>>
> 
> Sorry; I wasn't pointing to enough of the document.
> 
> http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html
> 
> discussed versions under method caching, just above the Interpreter
> Optimizations section.

Thanks, Found it.   (Been busy with other things.)

It may also depend on what abstraction level is desired.  A high level of 
abstraction would hide all of this under the covers and it would be done 
transparently.  A lower level would provide the tools needed to do it with, 
but also have the property of "if it hurts, don't do that."

Ron




From greg.ewing at canterbury.ac.nz  Fri Apr 13 01:27:44 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 13 Apr 2007 11:27:44 +1200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
	<BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>
Message-ID: <461EC070.9060802@canterbury.ac.nz>

For Py3k, how about changing the definition of triple
quoted strings so that indentation is stripped up
to the level of the line where the string began?

In other words, apply an implicit dedent() to it
in the parser.

--
Greg


From adam at atlas.st  Fri Apr 13 01:39:05 2007
From: adam at atlas.st (Adam Atlas)
Date: Thu, 12 Apr 2007 19:39:05 -0400
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <461EC070.9060802@canterbury.ac.nz>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
	<BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>
	<461EC070.9060802@canterbury.ac.nz>
Message-ID: <C0A793E6-16CD-4F68-A1F7-43D7C43E6DC6@atlas.st>


On 12 Apr 2007, at 19.27, Greg Ewing wrote:
> For Py3k, how about changing the definition of triple
> quoted strings so that indentation is stripped up
> to the level of the line where the string began?

I'm almost positive that's been discussed before. Can anyone find a  
link?


From greg.ewing at canterbury.ac.nz  Fri Apr 13 02:12:32 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 13 Apr 2007 12:12:32 +1200
Subject: [Python-ideas] Regular expression algorithms
In-Reply-To: <20070412084641.62C6.JCARLSON@uci.edu>
References: <C6567D95-7DA4-4BBF-A4CC-4959E2C2563F@atlas.st>
	<20070412084641.62C6.JCARLSON@uci.edu>
Message-ID: <461ECAF0.1010408@canterbury.ac.nz>

Josiah Carlson wrote:
> a base Thompson NFA
> will move on to the next state as early as possible, making a* with
> Thompson analogous to a*? in the Python

Are you sure that's an inherent characteristic of
a Thompson NFA?

As I understood it, using a Thompson NFA is no
different from building an NFA and converting it
to a DFA, except it does the conversion lazily.

And when using a DFA, whether it matches greedily
or not depends on how you drive it. If you stop
as soon as you reach the first accepting state,
it's non-greedy; if you keep going until you
can't go any further, it's greedy.
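
A toy illustration of that last point, under stated assumptions: the
automaton below is a hand-built, epsilon-free NFA for (ab|a)*, and all
names are invented for the example.  The simulation advances a set of
states in lockstep (which is the lazy subset construction) and records
every prefix length at which an accepting state is live.  Greedy
versus non-greedy is then just a choice of which recorded length to
report:

```python
def accepting_prefix_lengths(transitions, start, accepting, text):
    """Advance a set of NFA states in lockstep over text.

    Returns the length of every prefix after which at least one
    accepting state is in the current set.
    """
    current = {start}
    lengths = [0] if start in accepting else []
    for position, char in enumerate(text, start=1):
        current = {
            target
            for state in current
            for target in transitions.get((state, char), ())
        }
        if not current:
            break  # no live states: nothing longer can match
        if current & accepting:
            lengths.append(position)
    return lengths

# Hand-built NFA for (ab|a)*.  State 0 is both start and accepting;
# state 1 means "saw the a of a possible ab".
TRANSITIONS = {(0, "a"): {0, 1}, (1, "b"): {0}}
ACCEPTING = {0}

text = "aababc"
lengths = accepting_prefix_lengths(TRANSITIONS, 0, ACCEPTING, text)

shortest = text[:lengths[0]]   # stop at the first accepting state
longest = text[:lengths[-1]]   # keep going until no state survives

print(lengths)         # [0, 1, 2, 3, 4, 5]
print(repr(shortest))  # ''
print(repr(longest))   # 'aabab'
```

The same pass over the input yields both answers, and the work per
character is bounded by the number of NFA states.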

--
Greg


From greg.ewing at canterbury.ac.nz  Fri Apr 13 02:16:42 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 13 Apr 2007 12:16:42 +1200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <fb6fbf560704120911t3086b8e1ue3edf84c5fbf2064@mail.gmail.com>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
	<BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>
	<evjldr$k7h$1@sea.gmane.org>
	<op.tqn0f8pad64u53@jan-lenovo> <461E52C3.8030907@cs.byu.edu>
	<fb6fbf560704120911t3086b8e1ue3edf84c5fbf2064@mail.gmail.com>
Message-ID: <461ECBEA.2050001@canterbury.ac.nz>

Jim Jewett wrote:

> For doctests, mainly, so a consistent change would be OK ...  but
> triple quoted strings are supposed to be almost exactly WYSIWYG.

But they're *not* WYSIWYG, according to what you
naturally "see" when looking at the code.

Not sure about anyone else, but what I see is
some lines of text that happen to be indented
because they're part of a code block. I don't
see the indentation as being an intended part
of the string.

Does anyone have a use case where they *need*
the indentation to be preserved? (As opposed
to just not caring whether it's there or not.)

--
Greg


From greg.ewing at canterbury.ac.nz  Fri Apr 13 02:46:49 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 13 Apr 2007 12:46:49 +1200
Subject: [Python-ideas] Ideas towards GIL removal
Message-ID: <461ED2F9.9020407@canterbury.ac.nz>

I've been thinking about some ideas for reducing the
amount of refcount adjustment that needs to be done,
with a view to making GIL removal easier.

1) Permanent objects

In a typical Python program there are many objects
that are created at the beginning and exist for the
life of the program -- classes, functions, literals,
etc. Refcounting these is a waste of effort, since
they're never going to go away.

So perhaps there could be a way of marking such
objects as "permanent" or "immortal". Any refcount
operation on a permanent object would be a no-op,
so no locking would be needed. This would also have
the benefit of eliminating any need to write to the
object's memory at all when it's only being read.
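
A toy model of idea 1, written in Python purely to pin down the
intended behaviour.  ToyObject, incref and decref are invented names;
a real implementation would live in the C object header.  The
header_writes counter stands in for writes to the object's memory:

```python
class ToyObject:
    """Stand-in for an object header with a refcount field."""

    def __init__(self, permanent=False):
        self.permanent = permanent
        self.refcount = 1
        self.header_writes = 0  # how often the header was modified

def incref(obj):
    if obj.permanent:
        return  # no-op: no write, so no lock would be needed
    obj.refcount += 1
    obj.header_writes += 1

def decref(obj):
    """Return True when the object has become garbage."""
    if obj.permanent:
        return False  # permanent objects never die
    obj.refcount -= 1
    obj.header_writes += 1
    return obj.refcount == 0

immortal = ToyObject(permanent=True)
temporary = ToyObject()
for obj in (immortal, temporary):
    incref(obj)
    decref(obj)

print(immortal.header_writes)   # 0
print(temporary.header_writes)  # 2
```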

2) Objects owned by a thread

Python code creates and destroys temporary objects
at a high rate -- stack frames, argument tuples,
intermediate results, etc. If the code is executed
by a thread, those objects are rarely if ever seen
outside of that thread. It would be beneficial if
refcount operations on such objects could be carried
out by the thread that created them without locking.

To achieve this, two extra fields could be added
to the object header: an "owning thread id" and a
"local reference count". (The existing refcount
field will be called the "global reference count"
in what follows.)

An object created by a thread has its owning thread
id set to that thread. When adjusting an object's
refcount, if the current thread is the object's owning
thread, the local refcount is updated without locking.
If the object has no owning thread, or belongs to
a different thread, the object is locked and the
global refcount is updated.

The object is considered garbage only when both
refcounts drop to zero. Thus, after a decref, both
refcounts would need to be checked to see if they
are zero. When decrementing the local refcount and
it reaches zero, the global refcount can be checked
without locking, since a zero will never be written
to it until it truly has zero non-local references
remaining.
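
A minimal sketch of the scheme as a toy Python model, under stated
assumptions: the field names are invented, a per-object threading.Lock
stands in for "the object is locked", and a new object is assumed to
start with a local count of 1 and a global count of 0 (the description
above does not fix the starting values):

```python
import threading


class Header:
    """Toy object header with the two extra fields (names are assumptions)."""

    def __init__(self):
        self.owner = threading.get_ident()  # "owning thread id"
        self.local_refs = 1    # only ever touched by the owning thread
        self.global_refs = 0   # touched by other threads, under the lock
        self.lock = threading.Lock()
        self.freed = False


def incref(obj):
    if obj.owner == threading.get_ident():
        obj.local_refs += 1            # owner: no locking
    else:
        with obj.lock:                 # non-owner: lock, use global count
            obj.global_refs += 1


def decref(obj):
    if obj.owner == threading.get_ident():
        obj.local_refs -= 1
        # The global count is read here without taking the lock.
        if obj.local_refs == 0 and obj.global_refs == 0:
            obj.freed = True
    else:
        with obj.lock:
            obj.global_refs -= 1
            if obj.global_refs == 0 and obj.local_refs == 0:
                obj.freed = True


obj = Header()  # created, and therefore owned, by the main thread


def borrower():
    incref(obj)  # goes through the locked global count
    decref(obj)


worker = threading.Thread(target=borrower)
worker.start()
worker.join()
still_alive = not obj.freed  # the owner still holds its local reference
decref(obj)                  # owner drops the last reference
```

This only models the bookkeeping; it says nothing about whether the
unlocked read is safe on real hardware.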

I suspect that these two strategies together would
eliminate a very large proportion of refcount-related
activities requiring locking, perhaps to the point
where those remaining are infrequent enough to make
GIL removal practical.

--
Greg


From brett at python.org  Fri Apr 13 04:15:28 2007
From: brett at python.org (Brett Cannon)
Date: Thu, 12 Apr 2007 19:15:28 -0700
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <461ED2F9.9020407@canterbury.ac.nz>
References: <461ED2F9.9020407@canterbury.ac.nz>
Message-ID: <bbaeab100704121915m595af385jf27ec74194101b3e@mail.gmail.com>

On 4/12/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>
> I've been thinking about some ideas for reducing the
> amount of refcount adjustment that needs to be done,
> with a view to making GIL removal easier.
>
> 1) Permanent objects
>
> In a typical Python program there are many objects
> that are created at the beginning and exist for the
> life of the program -- classes, functions, literals,
> etc. Refcounting these is a waste of effort, since
> they're never going to go away.



In practice this is true, but it is obviously not technically true.  You could
delete a class if you really wanted to.  But obviously it rarely happens.


> So perhaps there could be a way of marking such
> objects as "permanent" or "immortal". Any refcount
> operation on a permanent object would be a no-op,
> so no locking would be needed. This would also have
> the benefit of eliminating any need to write to the
> object's memory at all when it's only being read.
>
> 2) Objects owned by a thread
>
> Python code creates and destroys temporary objects
> at a high rate -- stack frames, argument tuples,
> intermediate results, etc. If the code is executed
> by a thread, those objects are rarely if ever seen
> outside of that thread. It would be beneficial if
> refcount operations on such objects could be carried
> out by the thread that created them without locking.
>
> To achieve this, two extra fields could be added
> to the object header: an "owning thread id" and a
> "local reference count". (The existing refcount
> field will be called the "global reference count"
> in what follows.)
>
> An object created by a thread has its owning thread
> id set to that thread. When adjusting an object's
> refcount, if the current thread is the object's owning
> thread, the local refcount is updated without locking.
> If the object has no owning thread, or belongs to
> a different thread, the object is locked and the
> global refcount is updated.
>
> The object is considered garbage only when both
> refcounts drop to zero. Thus, after a decref, both
> refcounts would need to be checked to see if they
> are zero. When decrementing the local refcount and
> it reaches zero, the global refcount can be checked
> without locking, since a zero will never be written
> to it until it truly has zero non-local references
> remaining.
>
> I suspect that these two strategies together would
> eliminate a very large proportion of refcount-related
> activities requiring locking, perhaps to the point
> where those remaining are infrequent enough to make
> GIL removal practical.
>
>

I wonder what the overhead is going to be.  Every INCREF or DECREF would
have to check whether the object is immortal or thread-owned, which means
at least an 'if' check, if not more, so I'd like to know how big the
performance hit is.

And for the second idea, adding two more fields to every object might be
considered expensive by some in terms of memory.

Also, how would this scenario be handled: object foo is created in thread A
(does it have a local-thread refcount of 1, a global of 1, or are both 1?),
is passed to thread B, and then DECREF'ed in thread B as the object is no
longer needed by anyone.  If the local-thread refcount is 1 then this would
not work as it would fail with the global refcount already at 0.  But if
objects start with a global refcount of 1 but a local refcount of 0 and it
is DECREF'ed locally then wouldn't that fail for the same reason?  I guess I
am wondering how refcounts would be handled when objects cross between
threads.

-Brett

From george.sakkis at gmail.com  Fri Apr 13 06:39:41 2007
From: george.sakkis at gmail.com (George Sakkis)
Date: Fri, 13 Apr 2007 00:39:41 -0400
Subject: [Python-ideas] iter() on steroids
Message-ID: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>

I proposed an (admittedly more controversial) version of this a few
months back at the py3k list and the reaction was unexpectedly (IMO)
negative or indifferent, so I'm wondering if things have changed a bit
since.

The proposal is to make the builtin iter() return an object with
an API that consists of (most) functions currently at itertools. In
addition to saving one "from itertools import chain,islice,..." line
in every other module I write these days, an extra bonus of the OO
interface is that islice can be replaced with slice syntax and chain
with '+' (and/or perhaps the "pipe" character '|'). As a (deliberately
involved) example, consider this:

# A composite iterator over two files specified as follows:
# - each yielded line is right stripped.
# - the first 3 lines of the first file are yielded.
# - the first line of the second file is skipped and its next 4 lines
#   are yielded
# - empty lines (after the right stripping) are filtered out.
# - the remaining lines are enumerated.

f1,f2 = [iter(open(f)).map(str.rstrip) for f in 'foo.txt','bar.txt']
for i,line in (f1[:3] + f2[1:5]).filter(None).enumerate():
	print i,line

The equivalent itertools version is left as an exercise to the reader.
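
For comparison, one possible itertools spelling is sketched below. It is
written in Python 3 syntax so that it runs today (in Python 2 the lazy
versions would be itertools.imap and itertools.ifilter), and it uses
in-memory StringIO objects as stand-ins for foo.txt and bar.txt; the
function name "composite" is made up for the example:

```python
from io import StringIO
from itertools import chain, islice


def composite(file1, file2):
    """Enumerate the non-empty, right-stripped lines described above."""
    f1 = map(str.rstrip, file1)
    f2 = map(str.rstrip, file2)
    # First 3 lines of file1, then lines 1..4 (0-based) of file2.
    lines = chain(islice(f1, 3), islice(f2, 1, 5))
    return enumerate(filter(None, lines))


foo = StringIO("a\n\nb\nc\n")                # stand-in for foo.txt
bar = StringIO("skip\nd\n  \ne\nf\ng\n")     # stand-in for bar.txt
result = list(composite(foo, bar))
for i, line in result:
    print(i, line)
```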

This is actually backwards compatible and could even go in 2.x if
accepted, but I'm focusing on py3K here.

Comments ?

George


PS: FYI, a proof of concept implementation is posted as a recipe at:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/498272


From jcarlson at uci.edu  Fri Apr 13 06:57:05 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 12 Apr 2007 21:57:05 -0700
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
References: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
Message-ID: <20070412215622.62E9.JCARLSON@uci.edu>


"George Sakkis" <george.sakkis at gmail.com> wrote:
> 
> I proposed an (admittedly more controversial) version of this a few
> months back at the py3k list and the reaction was unexpectedly (IMO)
> negative or indifferent, so I'm wondering if things have changed a bit
> since.
[snip]
> Comments ?

Still -1.

 - Josiah



From cvrebert at gmail.com  Fri Apr 13 07:16:15 2007
From: cvrebert at gmail.com (Chris Rebert)
Date: Thu, 12 Apr 2007 22:16:15 -0700
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
References: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
Message-ID: <461F121F.2020208@gmail.com>

George Sakkis wrote:
> I proposed an (admittedly more controversial) version of this a few
> months back at the py3k list and the reaction was unexpectedly (IMO)
> negative or indifferent, so I'm wondering if things have changed a bit
> since.
> 
> The proposal is to make the builtin iter() return an object with
> an API that consists of (most) functions currently at itertools. In
> addition to saving one "from itertools import chain,islice,..." line
> in every other module I write these days, an extra bonus of the OO
> interface is that islice can be replaced with slice syntax and chain
> with '+' (and/or perhaps the "pipe" character '|'). As a (deliberately
> involved) example, consider this:
[snipped]
> Comments ?

+0 on your proposal
I just don't see itertools being used often enough to justify your 
change, but I can see the utility for those instances where it is used 
heavily.

+1 on adding your Iter class (or something similar) to itertools
Less controversial and just as succinct/convenient as your proposal 
(Iter() vs iter()), save another line for the requisite import.
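
To show what such a wrapper might look like, here is a minimal sketch in
Python 3. It is not the code from the recipe; the method set and the
decision to support only slices in __getitem__ are assumptions made for
brevity:

```python
from itertools import chain, islice


class Iter:
    """Hypothetical iterator wrapper; a sketch, not the recipe's code."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __getitem__(self, index):
        # Slice syntax in place of islice(); negative indices are not
        # supported because islice() rejects them.
        if isinstance(index, slice):
            return Iter(islice(self._it, index.start, index.stop, index.step))
        raise TypeError("only slices are supported in this sketch")

    def __add__(self, other):
        # '+' in place of chain()
        return Iter(chain(self._it, other))

    def map(self, func):
        return Iter(map(func, self._it))

    def filter(self, predicate):
        return Iter(filter(predicate, self._it))

    def enumerate(self, start=0):
        return Iter(enumerate(self._it, start))


pairs = list((Iter("abcdef")[:3] + Iter("uvwxyz")[1:5]).enumerate())
```

Because it lives outside the builtin iter(), it costs one import but
needs no change to the language.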

- Chris Rebert


From talin at acm.org  Fri Apr 13 08:51:11 2007
From: talin at acm.org (Talin)
Date: Thu, 12 Apr 2007 23:51:11 -0700
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <461ED2F9.9020407@canterbury.ac.nz>
References: <461ED2F9.9020407@canterbury.ac.nz>
Message-ID: <461F285F.4060003@acm.org>

Greg Ewing wrote:
> I've been thinking about some ideas for reducing the
> amount of refcount adjustment that needs to be done,
> with a view to making GIL removal easier.

(omitted)

I'm thinking along similar lines, but my approach is to eliminate 
refcounting entirely. (Note that the incref and decref macros could 
still exist for backwards compatibility, but would do nothing.)

Garbage collection for concurrent systems is an active area of research, 
however it appears that many of the research systems out there have 
settled on a few basic design parameters. Most of them use a copying 
collector for the "young" generation (with a separate heap per thread, 
for exactly the reasons you suggest), and a shared mark-and-sweep heap 
store for the older generation that uses a traditional free list.

For my own amusement and curiosity, I've been playing around with 
implementing such a collector, using a heap allocator design that's 
loosely based on the one from dlmalloc, which is an open source malloc 
implementation with very good overall performance. The idea is to build 
a stand-alone garbage collection library, similar to the popular Boehm 
collector, but parallel by design and intended for "cooperative" 
language interpreters rather than uncooperative languages such as C and C++.

Of course, this is purely a hobby-level effort, and I admit that I 
really have no clue as to what I am doing here - the real point of the 
exercise is to educate myself about the problem space, not to actually 
produce a working library, although that might be a possible side effect 
(equally likely is that I'll never finish it.) I doubt that an untutored 
amateur such as myself can actually create a robust, efficient parallel 
implementation, given how hard such programming actually is, and how 
inexperienced I am. But it's interesting and enjoyable to work on, and 
that's the only reason I need. (And also produces some interesting side 
effects - I wasn't happy with the various graphical front-ends to gdb, 
so I took a couple of days off and wrote one using wxPython :)

Now, all that being said, even if such a GC library were to exist, that 
is a long way from removal of the GIL, although it is a necessary step. 
For example, take the case of a dictionary in which more than one thread 
is inserting values. Clearly, that will require a lock or some other 
mechanism to prevent corruption of the hash table as it is updated. I 
think we want to avoid the Java situation where every object has its own 
lock. Instead, we'd have to require the user to provide a lock around 
that insertion operation. But what about dictionaries that the user 
isn't aware of, such as class methods and module contents? In a world 
without a GIL, what kind of steps need to be taken to ensure that shared
data structures can be updated without creating chaos?
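
The "user provides a lock" option for an ordinary dictionary would look
roughly like the sketch below (Python 3 syntax; the names are invented
for the example). With today's GIL the hash table itself is already
protected, so this only illustrates the pattern that user code would
need in a GIL-free interpreter:

```python
import threading

shared = {}
shared_lock = threading.Lock()  # supplied by the user, not by the dict


def insert_many(start, count):
    for key in range(start, start + count):
        with shared_lock:       # guard every insertion explicitly
            shared[key] = key * 2


threads = [
    threading.Thread(target=insert_many, args=(n * 1000, 1000))
    for n in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

It does nothing for the dictionaries the user never sees, which is the
open question.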

-- Talin



From jcarlson at uci.edu  Fri Apr 13 09:34:33 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 13 Apr 2007 00:34:33 -0700
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <bbaeab100704121915m595af385jf27ec74194101b3e@mail.gmail.com>
References: <461ED2F9.9020407@canterbury.ac.nz>
	<bbaeab100704121915m595af385jf27ec74194101b3e@mail.gmail.com>
Message-ID: <20070412224427.62F5.JCARLSON@uci.edu>


"Brett Cannon" <brett at python.org> wrote:
> On 4/12/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> >
> > I've been thinking about some ideas for reducing the
> > amount of refcount adjustment that needs to be done,
> > with a view to making GIL removal easier.
> >
> > 1) Permanent objects
> >
> > In a typical Python program there are many objects
> > that are created at the beginning and exist for the
> > life of the program -- classes, functions, literals,
> > etc. Refcounting these is a waste of effort, since
> > they're never going to go away.
> 
> 
> 
> In practice this is true, but it is obviously not technically true.  You could
> delete a class if you really wanted to.  But obviously it rarely happens.
> 
> 
> > So perhaps there could be a way of marking such
> > objects as "permanent" or "immortal". Any refcount
> > operation on a permanent object would be a no-op,
> > so no locking would be needed. This would also have
> > the benefit of eliminating any need to write to the
> > object's memory at all when it's only being read.
> >
> > 2) Objects owned by a thread
> >
> > Python code creates and destroys temporary objects
> > at a high rate -- stack frames, argument tuples,
> > intermediate results, etc. If the code is executed
> > by a thread, those objects are rarely if ever seen
> > outside of that thread. It would be beneficial if
> > refcount operations on such objects could be carried
> > out by the thread that created them without locking.
> >
> > To achieve this, two extra fields could be added
> > to the object header: an "owning thread id" and a
> > "local reference count". (The existing refcount
> > field will be called the "global reference count"
> > in what follows.)
> >
> > An object created by a thread has its owning thread
> > id set to that thread. When adjusting an object's
> > refcount, if the current thread is the object's owning
> > thread, the local refcount is updated without locking.
> > If the object has no owning thread, or belongs to
> > a different thread, the object is locked and the
> > global refcount is updated.
> >
> > The object is considered garbage only when both
> > refcounts drop to zero. Thus, after a decref, both
> > refcounts would need to be checked to see if they
> > are zero. When decrementing the local refcount and
> > it reaches zero, the global refcount can be checked
> > without locking, since a zero will never be written
> > to it until it truly has zero non-local references
> > remaining.
> >
> > I suspect that these two strategies together would
> > eliminate a very large proportion of refcount-related
> > activities requiring locking, perhaps to the point
> > where those remaining are infrequent enough to make
> > GIL removal practical.
> >
> >
> 
> I wonder what the overhead is going to be.  Every INCREF or DECREF would
> have to check whether the object is immortal or thread-owned, which means
> at least an 'if' check, if not more, so I'd like to know how big the
> performance hit is.

The real question is whether the wasteful parallel if branches will be
faster or slower than the locking non-parallel increments.


> And for the second idea, adding two more fields to every object might be
> considered expensive by some in terms of memory.

In the worst case, it would double the size of an object (technically, a
minimal Python instance can consist of a refcount and a type pointer). 
In the case of an integer, it would increase its space use by 2/3.
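
The arithmetic, spelled out as a small sketch. It assumes that each new
field takes one machine word (the field sizes were never specified), that
a minimal object is 2 words, and that an integer object is 3 words
(refcount, type pointer, value):

```python
from fractions import Fraction

EXTRA_WORDS = 2  # owning thread id + local refcount, one word each (assumed)


def relative_growth(words_before, extra=EXTRA_WORDS):
    """Fraction by which an object grows when the extra fields are added."""
    return Fraction(extra, words_before)


minimal = relative_growth(2)  # refcount + type pointer
integer = relative_growth(3)  # refcount + type pointer + value
print(minimal, integer)
```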


> Also, how would this scenario be handled: object foo is created in thread A
> (does it have a local-thread refcount of 1, a global of 1, or are both 1?),

I would say global 0, thread 1.


> is passed to thread B, and then DECREF'ed in thread B as the object is no
> longer needed by anyone.  If the local-thread refcount is 1 then this would
> not work as it would fail with the global refcount already at 0.  But if

If the object is still being used in thread A, its thread refcount
should be at least 1.  If thread B decrefs the global refcount, and it
becomes 0, then it can check the thread refcount and notice it is
nonzero and not deallocate, or if it notices that it *is* zero, then
since it already has the GIL (necessary to have decrefed the global
refcount), it can pass the object to the deallocator.

Now, if thread B is using the object, and the object's thread refcount
drops to zero, and thread B passes the object to thread C, then thread C
is free to use the thread refcount.
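
To make the scenario concrete, here is a single-threaded toy trace of
it. Thread names are plain strings, the class and function names are
invented for the sketch, and the non-owner branch is where real code
would have to hold the GIL or a lock:

```python
class Obj:
    """Toy object: starts with thread (local) count 1, global count 0."""

    def __init__(self, owner):
        self.owner = owner
        self.local_refs = 1
        self.global_refs = 0
        self.freed = False


def incref(obj, thread):
    if thread == obj.owner:
        obj.local_refs += 1
    else:
        obj.global_refs += 1   # real code: under the GIL/lock


def decref(obj, thread):
    if thread == obj.owner:
        obj.local_refs -= 1
    else:
        obj.global_refs -= 1   # real code: under the GIL/lock
    if obj.local_refs == 0 and obj.global_refs == 0:
        obj.freed = True       # both counts are zero: deallocate


foo = Obj(owner="A")           # created in thread A: global 0, thread 1
incref(foo, "B")               # handed to thread B: global 1, thread 1
decref(foo, "A")               # A is done with it: global 1, thread 0
alive_after_a = not foo.freed  # B's reference keeps it alive
decref(foo, "B")               # B is done with it: both zero
```

The object survives A's decref because the global count is nonzero, and
it is B's decref that finally deallocates it.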


It takes a bit of work, but the system could be made to work. However, I
am fairly certain that though it would remove the need to have the GIL
during some object reference passing, specifically for objects whose
whole lifetime is within a single thread, the larger definitions
necessary for increfs and decrefs would put more pressure on processor
cache, and regardless of locking, cache coherency requirements could
ruin performance when two threads were running on different processors
(due to cache line alignment).

I ran a microbenchmark, but all it seemed to tell me was that dealing
with the GIL is slow in multiple threads; I didn't get conclusive
results either way (positive or negative).


 - Josiah



From jcarlson at uci.edu  Fri Apr 13 09:50:11 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 13 Apr 2007 00:50:11 -0700
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <461ECBEA.2050001@canterbury.ac.nz>
References: <fb6fbf560704120911t3086b8e1ue3edf84c5fbf2064@mail.gmail.com>
	<461ECBEA.2050001@canterbury.ac.nz>
Message-ID: <20070412193258.62E6.JCARLSON@uci.edu>


Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Does anyone have a use case where they *need*
> the indentation to be preserved? (As opposed
> to just not caring whether it's there or not.)

Not personally.  I think that telling people to use textwrap.dedent() is
sufficient.  Generally I'm -.5 on the change.
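
For reference, the textwrap.dedent() idiom looks like this (the usage()
function and its text are just an example):

```python
import textwrap


def usage():
    # The backslash after the opening quotes suppresses the first newline;
    # dedent() then strips the common leading whitespace.
    text = """\
        Usage: prog [options]
          -h  show help
        """
    return textwrap.dedent(text)


message = usage()
print(message, end="")
```

Indentation relative to the least-indented line is kept, so the "-h"
line stays indented by two spaces.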

 - Josiah



From ivan at selidor.net  Fri Apr 13 09:43:01 2007
From: ivan at selidor.net (Ivan Vilata i Balaguer)
Date: Fri, 13 Apr 2007 09:43:01 +0200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <461EC070.9060802@canterbury.ac.nz>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
	<BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>
	<461EC070.9060802@canterbury.ac.nz>
Message-ID: <20070413074301.GJ16507@tardis.terramar.selidor.net>

Greg Ewing (el 2007-04-13 a les 11:27:44 +1200) va dir::

> For Py3k, how about changing the definition of triple
> quoted strings so that indentation is stripped up
> to the level of the line where the string began?
> 
> In other words, apply an implicit dedent() to it
> in the parser.

I'd rather make it explicit by using some string prefix a la 'r' or 'u',
'i', for instance:

>>> normal_string = """
...   foo
...     bar \
...   baz
... """
>>> print repr(normal_string)
'\n  foo\n    bar   baz\n'
>>> indented_string1 = i"""
...   foo
...     bar \
...   baz
... """
>>> print repr(indented_string1)
'foo\n  bar baz\n'
>>> indented_string2 = i"""foo
...     bar \
...   baz
... """
>>> print repr(indented_string2)
'foo\n    bar   baz\n'

As you see, strings marked with 'i' are dedented to the outer non-blank
character, and their first empty line is ignored.  I haven't thought this
through much, so some questions come to my mind:

* Is it really OK to remove the first empty line?
* How would this interact with an 'r' prefix?  Should initial space be
  kept then?  (This would effectively disable 'i'.)
* Should leading space in a line after a continuation backslash really
  be removed?

Of course the proposal can be made a lot better with some insight.  What
do you think of the basic idea?
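
The basic behaviour can be approximated today with a plain function, as
in the sketch below. The helper name "i" is hypothetical and merely
mirrors the proposed prefix. One known difference: because the helper
sees the string only after parsing, it cannot strip the leading space
that follows a continuation backslash, so "bar   baz" keeps its three
spaces:

```python
import textwrap


def i(text):
    """Hypothetical helper approximating the proposed i'' prefix."""
    if text.startswith("\n"):
        text = text[1:]           # ignore the first, empty line
    return textwrap.dedent(text)  # strip the common indentation


indented_string1 = i("""
  foo
    bar \
  baz
""")

indented_string2 = i("""foo
    bar \
  baz
""")
```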

::

  Ivan Vilata i Balaguer   @ Welcome to the European Banana Republic! @
  http://www.selidor.net/  @     http://www.nosoftwarepatents.com/    @

From g.brandl at gmx.net  Fri Apr 13 10:40:17 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 13 Apr 2007 10:40:17 +0200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <20070412193258.62E6.JCARLSON@uci.edu>
References: <fb6fbf560704120911t3086b8e1ue3edf84c5fbf2064@mail.gmail.com>	<461ECBEA.2050001@canterbury.ac.nz>
	<20070412193258.62E6.JCARLSON@uci.edu>
Message-ID: <evnflh$v5f$1@sea.gmane.org>

Josiah Carlson schrieb:
> Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>> Does anyone have a use case where they *need*
>> the indentation to be preserved? (As opposed
>> to just not caring whether it's there or not.)
> 
> Not personally.  I think that telling people to use textwrap.dedent() is
> sufficient.  Generally I'm -.5 on the change.

I've already suggested at one time that a dedent() method be added to strings,
which would make it more obvious, but what is one import...

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



From taleinat at gmail.com  Fri Apr 13 13:18:32 2007
From: taleinat at gmail.com (Tal Einat)
Date: Fri, 13 Apr 2007 14:18:32 +0300
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <evnflh$v5f$1@sea.gmane.org>
References: <fb6fbf560704120911t3086b8e1ue3edf84c5fbf2064@mail.gmail.com>
	<461ECBEA.2050001@canterbury.ac.nz>
	<20070412193258.62E6.JCARLSON@uci.edu> <evnflh$v5f$1@sea.gmane.org>
Message-ID: <7afdee2f0704130418q3628c57bp9d59677c22b7065b@mail.gmail.com>

Georg Brandl wrote:
>
>
> I've already suggested at one time that a dedent() method be added to
> strings,
> which would make it more obvious, but what is one import...


I'm not sure this is the way to go. IMO string methods should be generic
manipulations on strings, and personally I find indenting/dedenting
multi-line strings doesn't fit in. For me, a stdlib function is just fine.


Ivan Vilata i Balaguer wrote:

> I'd rather make it explicit by using some string prefix a la 'r' or 'u',
> 'i', for instance:


This could be a reasonable solution, but it has some downsides:
* It's less readable than a well named function
* It's harder to understand for a newbie - a function/method has a
docstring, this would have to be looked up in the docs
* It's easy to miss while reading code - one small letter making a big
difference
* It paves the road for making more such string prefixes, and then we'd have
to memorize all of them... or consult the docs often

-1 from me.

From erez27 at gmail.com  Fri Apr 13 13:39:45 2007
From: erez27 at gmail.com (Erez Sh.)
Date: Fri, 13 Apr 2007 13:39:45 +0200
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
References: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
Message-ID: <3ee1180704130439r7c5657fdq6dd1689c202a01b5@mail.gmail.com>

I think it's a great idea. In a language where everything is meant to be
object-oriented and easy-to-use, it's funny that iterators (which are
becoming increasingly popular) have such a bumpy interface.
We wouldn't want to do "from numtools import nadd,nmul ;  nadd(1,2)". The
claim that iterators aren't "being used often enough to justify your change"
is very disturbing considering that in future pythons most default functions
will return iterators if possible (such as keys()/items()/values() of dict).
They WILL be used more than files, and it would be foolish to force the user
to do "from filetools import seek, tell, readlines".
The specifics of your proposal may need a little improvement here and there,
but the idea itself, IMHO, is very good and pythonic.

+1


On 4/13/07, George Sakkis <george.sakkis at gmail.com> wrote:
>
> I proposed an (admittedly more controversial) version of this a few
> months back at the py3k list and the reaction was unexpectedly (IMO)
> negative or indifferent, so I'm wondering if things have changed a bit
> since.
>
> The proposal is to make the builtin iter() return an object with
> an API that consists of (most) functions currently at itertools. In
> addition to saving one "from itertools import chain,islice,..." line
> in every other module I write these days, an extra bonus of the OO
> interface is that islice can be replaced with slice syntax and chain
> with '+' (and/or perhaps the "pipe" character '|'). As a (deliberately
> involved) example, consider this:
>
> # A composite iterator over two files specified as follows:
> # - each yielded line is right stripped.
> # - the first 3 lines of the first file are yielded.
> # - the first line of the second file is skipped and its next 4 lines
> #   are yielded
> # - empty lines (after the right stripping) are filtered out.
> # - the remaining lines are enumerated.
>
> f1,f2 = [iter(open(f)).map(str.rstrip) for f in 'foo.txt','bar.txt']
> for i,line in (f1[:3] + f2[1:5]).filter(None).enumerate():
>         print i,line
>
> The equivalent itertools version is left as an exercise to the reader.
>
> This is actually backwards compatible and could even go in 2.x if
> accepted, but I'm focusing on py3K here.
>
> Comments ?
>
> George
>
>
> PS: FYI, a proof of concept implementation is posted as a recipe at:
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/498272

From jason.orendorff at gmail.com  Fri Apr 13 16:54:37 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Fri, 13 Apr 2007 10:54:37 -0400
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
References: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
Message-ID: <bb8868b90704130754q1ca36755pa81e78e0c6d2c757@mail.gmail.com>

On 4/13/07, George Sakkis <george.sakkis at gmail.com> wrote:
> f1,f2 = [iter(open(f)).map(str.rstrip) for f in 'foo.txt','bar.txt']
> for i,line in (f1[:3] + f2[1:5]).filter(None).enumerate():
>         print i,line

George, you've got to pick a better example next time.  This one
is terrifying.  :)

-j


From gsakkis at rutgers.edu  Fri Apr 13 17:32:41 2007
From: gsakkis at rutgers.edu (George Sakkis)
Date: Fri, 13 Apr 2007 11:32:41 -0400
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <bb8868b90704130754q1ca36755pa81e78e0c6d2c757@mail.gmail.com>
References: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
	<bb8868b90704130754q1ca36755pa81e78e0c6d2c757@mail.gmail.com>
Message-ID: <91ad5bf80704130832y2641a50aj8adb23915e6f4c05@mail.gmail.com>

On 4/13/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:

> On 4/13/07, George Sakkis <george.sakkis at gmail.com> wrote:
> > f1,f2 = [iter(open(f)).map(str.rstrip) for f in 'foo.txt','bar.txt']
> > for i,line in (f1[:3] + f2[1:5]).filter(None).enumerate():
> >         print i,line
>
> George, you've got to pick a better example next time.  This one
> is terrifying.  :)

I know, but the equivalent using itertools is at least as terrifying :-)

George


From george.sakkis at gmail.com  Fri Apr 13 17:49:22 2007
From: george.sakkis at gmail.com (George Sakkis)
Date: Fri, 13 Apr 2007 11:49:22 -0400
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <461F121F.2020208@gmail.com>
References: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
	<461F121F.2020208@gmail.com>
Message-ID: <91ad5bf80704130849y5117321ev475acf240b4f086f@mail.gmail.com>

On 4/13/07, Chris Rebert <cvrebert at gmail.com> wrote:

> +0 on your proposal
> I just don't see itertools being used often enough to justify your
> change, but I can see the utility for those instances where it is used
> heavily.

In some sense it's a chicken-and-egg problem. My guess is that one
reason itertools are not used as much as they could/should is that
they are "hidden away" in a module, which makes one think twice
before importing it (let alone newbies who don't even know it
exists). As a single data point, I'm a big fan of itertools and still
I'm often too lazy to import it to use, say, izip() only once; I just go
with zip() instead.

George


From jan.kanis at phil.uu.nl  Fri Apr 13 18:51:18 2007
From: jan.kanis at phil.uu.nl (Jan Kanis)
Date: Fri, 13 Apr 2007 18:51:18 +0200
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <461F285F.4060003@acm.org>
References: <461ED2F9.9020407@canterbury.ac.nz> <461F285F.4060003@acm.org>
Message-ID: <op.tqqefshzd64u53@jan-lenovo>

On Fri, 13 Apr 2007 08:51:11 +0200, Talin <talin at acm.org> wrote:

> Now, all that being said, even if such a GC library were to exist, that
> is a long way from removal of the GIL, although it is a necessary step.
> For example, take the case of a dictionary in which more than one thread
> is inserting values. Clearly, that will require a lock or some other
> mechanism to prevent corruption of the hash table as it is updated. I
> think we want to avoid the Java situation where every object has its own
> lock. Instead, we'd have to require the user to provide a lock around
> that insertion operation. But what about dictionaries that the user
> isn't aware of, such as class methods and module contents? In a world
> without a GIL, what kind of steps need to be taken to ensure that shared
> data structures can be updated without creating chaos?


In the case of hashtables, a nonblocking variant could perhaps be an  
option. There was a nice article on reddit some time ago:  
http://blogs.azulsystems.com/cliff/2007/03/a_nonblocking_h.html , the guy  
claims that it's competitive in speed to non-lock protected (so thread  
unsafe) implementations. Nonblocking algorithms don't exist for all data  
structures, but perhaps they exist for most ones that are used in the  
python interpreter?

- Jan


From steven.bethard at gmail.com  Fri Apr 13 19:07:22 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri, 13 Apr 2007 11:07:22 -0600
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <bb8868b90704130754q1ca36755pa81e78e0c6d2c757@mail.gmail.com>
References: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
	<bb8868b90704130754q1ca36755pa81e78e0c6d2c757@mail.gmail.com>
Message-ID: <d11dcfba0704131007i1854773dic23045d1795fb6dc@mail.gmail.com>

On 4/13/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> On 4/13/07, George Sakkis <george.sakkis at gmail.com> wrote:
> > f1,f2 = [iter(open(f)).map(str.rstrip) for f in 'foo.txt','bar.txt']
> > for i,line in (f1[:3] + f2[1:5]).filter(None).enumerate():
> >         print i,line
>
> George, you've got to pick a better example next time.  This one
> is terrifying.  :)

Yeah, that's pretty awful. ;-)  Maybe a more reasonable example::

    # skip the first line
    for line in iter(fileobj)[1:]:
        ....

where currently you'd write::

    # skip the first line
    fileobj.next()
    for line in fileobj:
        ...
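
For comparison, the slice spelling can be approximated today with
itertools.islice(). This is only a minimal sketch: an in-memory
io.StringIO stands in for the open file, and the name "body" is made up
for the illustration.

```python
import io
from itertools import islice

# Stand-in for an open file; any iterable of lines behaves the same way.
fileobj = io.StringIO("header\nfirst\nsecond\n")

# islice(iterable, 1, None) lazily skips the first item and yields the rest.
body = [line.rstrip("\n") for line in islice(fileobj, 1, None)]
print(body)
```

The call is longer than a [1:] slice, but it needs no change to the
iterator protocol.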

I'm floating around -0.5 on this one. The itertools functions I use
most are chain(), izip(), and count(). None of these are particularly
natural as a method of a single iterator object.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From tjreedy at udel.edu  Fri Apr 13 20:33:13 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 13 Apr 2007 14:33:13 -0400
Subject: [Python-ideas] iter() on steroids
References: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com><461F121F.2020208@gmail.com>
	<91ad5bf80704130849y5117321ev475acf240b4f086f@mail.gmail.com>
Message-ID: <evoid0$kj3$1@sea.gmane.org>


"George Sakkis" <george.sakkis at gmail.com> 
wrote in message 
news:91ad5bf80704130849y5117321ev475acf240b4f086f at mail.gmail.com...
| In some sense it's a chicken-and-egg problem. My guess is that one
| reason itertools are not used as much as they could/should is that
| they are "hidden away" in a module, which makes one think twice
| before importing it (let alone newbies that don't even know of its
| existence). As a single data point, I'm a big fan of itertools and still
| I'm often too lazy to import it to use, say, izip() only once; I just go
| with zip() instead.

I personally think there are too many builtins.  So I would like some 
pushed to modules, which means more import statements.  Oh, dear.

If you have trouble writing 'from itertools import izip' or 'import 
itertools as it', then I guess it is hard to promote a module. 
Nonetheless, I think perhaps you should write your own based on iter and 
itertools.  And put it up on PyPI if it works at least for you.

Terry Jan Reedy





From gsakkis at rutgers.edu  Fri Apr 13 22:00:39 2007
From: gsakkis at rutgers.edu (George Sakkis)
Date: Fri, 13 Apr 2007 16:00:39 -0400
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <evoid0$kj3$1@sea.gmane.org>
References: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
	<461F121F.2020208@gmail.com>
	<91ad5bf80704130849y5117321ev475acf240b4f086f@mail.gmail.com>
	<evoid0$kj3$1@sea.gmane.org>
Message-ID: <91ad5bf80704131300h359faeeeqbf585e83fdea6b8a@mail.gmail.com>

On 4/13/07, Terry Reedy <tjreedy at udel.edu> wrote:
>
> "George Sakkis" <george.sakkis at gmail.com>
> wrote in message
> news:91ad5bf80704130849y5117321ev475acf240b4f086f at mail.gmail.com...
> | In some sense it's a chicken-and-egg problem. My guess is that one
> | reason itertools are not used as much as they could/should is that
> | they are "hidden away" in a module, which makes one think twice
> | before importing it (let alone newbies that don't even know of its
> | existence). As a single data point, I'm a big fan of itertools and still
> | I'm often too lazy to import it to use, say, izip() only once; I just go
> | with zip() instead.
>
> I personally think there are too many builtins.

I agree, that's why I don't suggest a new builtin but adding features
to an existing one. In fact, this can even *reduce* the builtins since
map(), zip() and enumerate() could be removed from the builtin
namespace.

George


From jcarlson at uci.edu  Fri Apr 13 23:09:16 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 13 Apr 2007 14:09:16 -0700
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <91ad5bf80704131300h359faeeeqbf585e83fdea6b8a@mail.gmail.com>
References: <evoid0$kj3$1@sea.gmane.org>
	<91ad5bf80704131300h359faeeeqbf585e83fdea6b8a@mail.gmail.com>
Message-ID: <20070413140557.62FE.JCARLSON@uci.edu>


"George Sakkis" <gsakkis at rutgers.edu> wrote:
> On 4/13/07, Terry Reedy <tjreedy at udel.edu> wrote:
> > I personally think there are too many builtins.
> 
> I agree, that's why I don't suggest a new builtin but adding features
> to an existing one. In fact, this can even *reduce* the builtins since
> map(), zip() and enumerate() could be removed from the builtin
> namespace.

Map was already going to be removed because it can be replaced by...
    [f(x) for x in y]

Filter was already going to be removed because it can be replaced by...
    [x for x in y if f(x)]

I can't remember if anything was going to happen to zip or any of the
other functional programming functions.
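
To make the two replacements concrete, here is a small sketch. The
helper functions square() and is_even() and the sample list are invented
for the example; list() is wrapped around map() and filter() so the
comparison also holds on versions where they return iterators.

```python
def square(n):
    return n * n


def is_even(n):
    return n % 2 == 0


numbers = [1, 2, 3, 4, 5, 6]

# map(f, y) versus [f(x) for x in y]
squares_map = list(map(square, numbers))
squares_comp = [square(n) for n in numbers]

# filter(f, y) versus [x for x in y if f(x)]
evens_filter = list(filter(is_even, numbers))
evens_comp = [n for n in numbers if is_even(n)]

print(squares_map == squares_comp, evens_filter == evens_comp)
```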

 - Josiah



From brett at python.org  Sat Apr 14 00:45:54 2007
From: brett at python.org (Brett Cannon)
Date: Fri, 13 Apr 2007 15:45:54 -0700
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <20070413140557.62FE.JCARLSON@uci.edu>
References: <evoid0$kj3$1@sea.gmane.org>
	<91ad5bf80704131300h359faeeeqbf585e83fdea6b8a@mail.gmail.com>
	<20070413140557.62FE.JCARLSON@uci.edu>
Message-ID: <bbaeab100704131545m424170b9l42ef892e2a54ed88@mail.gmail.com>

On 4/13/07, Josiah Carlson <jcarlson at uci.edu> wrote:
>
>
> "George Sakkis" <gsakkis at rutgers.edu> wrote:
> > On 4/13/07, Terry Reedy <tjreedy at udel.edu> wrote:
> > > I personally think there are too many builtins.
> >
> > I agree, that's why I don't suggest a new builtin but adding features
> > to an existing one. In fact, this can even *reduce* the builtins since
> > map(), zip() and enumerate() could be removed from the builtin
> > namespace.
>
> Map was already going to be removed because it can be replaced by...
>     [f(x) for x in y]
>
> Filter was already going to be removed because it can be replaced by...
>     [x for x in y if f(x)]
>
> I can't remember if anything was going to happen to zip or any of the
> other functional programming functions.



zip is staying but replaced underneath the covers with itertools.izip.

-Brett

From greg.ewing at canterbury.ac.nz  Sat Apr 14 02:51:04 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 14 Apr 2007 12:51:04 +1200
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <bbaeab100704121915m595af385jf27ec74194101b3e@mail.gmail.com>
References: <461ED2F9.9020407@canterbury.ac.nz>
	<bbaeab100704121915m595af385jf27ec74194101b3e@mail.gmail.com>
Message-ID: <46202578.9080507@canterbury.ac.nz>

Brett Cannon wrote:

> In reality this is true, but obviously not technically true.  You could 
> delete a class if you really wanted to.  But obviously it rarely happens.

And if it does, the worst that will happen is that
the original version will hang around, tying up a
small amount of memory.

> I wonder what the overhead is going to be.  Having to check, for every
> INCREF or DECREF, whether an object is immortal or thread-owned is going
> to incur at least an 'if' check, if not more.

Clearly, there will be a small increase in overhead.
But it may be worth it if it avoids the need for
a rather expensive lock/unlock. It was pointed out
earlier that, even using the special instructions
provided by some processors, this can take a great
many times longer than a normal memory access or
two.

> And for the second idea, adding two more fields to every object might be 
> considered expensive by some in terms of memory.

Again, it's a tradeoff. If it enables removal of
the GIL and massive threading on upcoming multi-
core CPUs, it might be considered worth the cost.
> 
> Also, how would this scenario be handled: object foo is created in 
> thread A ...  is passed to thread B, and then DECREF'ed in thread B as
> the object is no longer needed by anyone.

I'll have to think about that. If a thread gives
away a reference to another thread, it really
needs to be a global reference rather than a
local one. The tricky part is telling when this
happens.

> But if objects start with a global refcount of 1 but a 
> local refcount of 0 and it is DECREF'ed locally then wouldn't that fail 
> for the same reason?

That one's easier -- if the local refcount is 0
on a decref, you need to lock the object and
decrement the global refcount.
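
A toy model of that rule, written in Python purely for illustration (the
real thing would live in the C macros). The class and function names are
hypothetical, and the sketch deliberately leaves out the question of when
the object is actually deallocated.

```python
import threading


class SharedObject:
    """Toy object with an owner-local and a shared (global) refcount."""

    def __init__(self):
        self.owner = threading.get_ident()  # thread that created the object
        self.local_refs = 0                 # touched only by the owner, no lock
        self.global_refs = 1                # shared, guarded by self.lock
        self.lock = threading.Lock()


def incref(obj):
    if threading.get_ident() == obj.owner:
        obj.local_refs += 1                 # fast path: no locking
    else:
        with obj.lock:
            obj.global_refs += 1


def decref(obj):
    if threading.get_ident() == obj.owner and obj.local_refs > 0:
        obj.local_refs -= 1                 # fast path: no locking
        return
    # Local count is already 0 (or we are not the owner): lock the object
    # and decrement the global count instead.
    with obj.lock:
        obj.global_refs -= 1


obj = SharedObject()
incref(obj)   # owner: local 0 -> 1
decref(obj)   # owner: local 1 -> 0
decref(obj)   # local is 0, so this takes the lock: global 1 -> 0
print(obj.local_refs, obj.global_refs)
```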

--
Greg


From greg.ewing at canterbury.ac.nz  Sat Apr 14 03:26:14 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 14 Apr 2007 13:26:14 +1200
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <461F285F.4060003@acm.org>
References: <461ED2F9.9020407@canterbury.ac.nz> <461F285F.4060003@acm.org>
Message-ID: <46202DB6.9000802@canterbury.ac.nz>

Talin wrote:

> I'm thinking along similar lines, but my approach is to eliminate 
> refcounting entirely.

That's a possibility, although refcounting does have
some nice properties -- it's cache-friendly, and it's
usually fairly easy to get it to work with other
libraries that have their own scheme for managing
memory and don't know about Python's one.

> For example, take the case of a dictionary in which more than one thread 
> is inserting values. .. I 
> think we want to avoid the Java situation where every object has its own 
> lock.

Having to lock dictionaries mightn't be so bad, as
long as it can be done using special instructions.
It's still a much larger-grained locking unit than
an incref or decref.

But I'm wondering whether the problem might get
solved for us from the hardware end if we wait
long enough. If we start seeing massively-multicore
CPUs, I expect there will be a lot of pressure to
come up with more efficient ways of doing fine-
grained locking in order to make effective use
of them. Maybe a special lump of high-speed
multi-port memory used just for locks, with
surrounding hardware designed for using it as
such.

--
Greg


From greg.ewing at canterbury.ac.nz  Sat Apr 14 03:51:49 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 14 Apr 2007 13:51:49 +1200
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <20070412224427.62F5.JCARLSON@uci.edu>
References: <461ED2F9.9020407@canterbury.ac.nz>
	<bbaeab100704121915m595af385jf27ec74194101b3e@mail.gmail.com>
	<20070412224427.62F5.JCARLSON@uci.edu>
Message-ID: <462033B5.7030706@canterbury.ac.nz>

Josiah Carlson wrote:
> If thread B decrefs the global refcount, and it
> becomes 0, then it can check the thread refcount and notice it is
> nonzero and not deallocate, or if it notices that it *is* zero, then
> since it already has the GIL (necessary to have decrefed the global
> refcount), it can pass the object to the deallocator.

The problem with that is the owning thread needs to be
able to manipulate the local refcount without holding
any kind of lock. That's the whole point of it.

--
Greg


From greg.ewing at canterbury.ac.nz  Sat Apr 14 03:51:57 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 14 Apr 2007 13:51:57 +1200
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
References: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
Message-ID: <462033BD.8040506@canterbury.ac.nz>

George Sakkis wrote:
> The proposal is to make the builtin iter() return an object with
> an API that consists of (most) functions currently at itertools.

The problem with this kind of thing is that it
becomes an arbitrary choice what is included as
a method. Anything not included in that choice
is left out in the cold and has to be applied
as a function anyway.

If there were a certain set of iterator algebra
functions that were *very* frequently used,
there could be an argument for making methods
of them. But I think you're overestimating how
much the itertools functions are used. Some
people may make heavy use of them, but they're
not used much in general.

If you happen to be a heavy user, there's
nothing stopping you from creating your own
version of iter() that returns an object with
all the methods you want. Let's keep the
standard iterator objects clean and simple.
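
Something along these lines would do, as a rough sketch rather than a
finished design. The class name Iter and the particular choice of
methods are arbitrary (which is rather the point), and slicing supports
only what itertools.islice() accepts.

```python
import itertools


class Iter:
    """Hypothetical wrapper exposing a few itertools functions as methods."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def map(self, func):
        return Iter(map(func, self._it))

    def filter(self, pred=None):
        return Iter(filter(pred, self._it))

    def enumerate(self, start=0):
        return Iter(enumerate(self._it, start))

    def __getitem__(self, index):
        # Only slices are supported; they are applied lazily via islice().
        if not isinstance(index, slice):
            raise TypeError("Iter supports slicing only")
        return Iter(itertools.islice(self._it, index.start, index.stop, index.step))

    def __add__(self, other):
        # Concatenate two iterables lazily.
        return Iter(itertools.chain(self._it, other))


lines = ["a  ", "", "b ", "c", "d"]
result = list(Iter(lines).map(str.rstrip)[1:].filter(None).enumerate())
print(result)
```

Anything not given a method here (izip, count, and so on) still has to be
called as a plain function.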

--
Greg


From greg.ewing at canterbury.ac.nz  Sat Apr 14 03:55:51 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 14 Apr 2007 13:55:51 +1200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <20070412193258.62E6.JCARLSON@uci.edu>
References: <fb6fbf560704120911t3086b8e1ue3edf84c5fbf2064@mail.gmail.com>
	<461ECBEA.2050001@canterbury.ac.nz>
	<20070412193258.62E6.JCARLSON@uci.edu>
Message-ID: <462034A7.2070603@canterbury.ac.nz>

Josiah Carlson wrote:

>>Does anyone have a use case where they *need*
>>the indentation to be preserved?

> Not personally.  I think that telling people to
> use textwrap.dedent() is sufficient.

But it seems crazy to make people do this all
the time, when there's no reason not to do
it automatically in the first place.
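
For reference, this is the manual step being discussed. A minimal sketch;
the function name usage() and the help text are invented for the example.

```python
import textwrap


def usage():
    # The literal is indented to match the surrounding code, so every
    # line carries eight leading spaces that are not part of the message.
    text = """\
        Usage: prog [options]
          -h  show this help"""
    # dedent() strips the whitespace common to all lines.
    return textwrap.dedent(text)


print(usage())
```

The relative indentation of the second line survives; only the common
margin is removed.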

--
Greg



From greg.ewing at canterbury.ac.nz  Sat Apr 14 04:05:09 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 14 Apr 2007 14:05:09 +1200
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <3ee1180704130439r7c5657fdq6dd1689c202a01b5@mail.gmail.com>
References: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
	<3ee1180704130439r7c5657fdq6dd1689c202a01b5@mail.gmail.com>
Message-ID: <462036D5.9080606@canterbury.ac.nz>

Erez Sh. wrote:
> The claim that iterators aren't "being used often enough to justify your 
> change" is very disturbing 

The claim wasn't that iterators are used infrequently,
but that the functions in the itertools module are.

The vast majority of the time, the only thing people
do with iterators is iterate over them.

> considering that in future pythons most 
> default functions will return iterators if possible (such as 
> keys()/items()/values() of dict).

This is wrong -- they won't return iterators, they'll
return *views* that can be indexed and otherwise used
as sequences or mappings. So this has nothing to do
with the proposal at hand.
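
A small sketch of the difference, assuming an interpreter where
dict.keys() returns a view (Python 3). The variable names are only for
illustration.

```python
d = {"a": 1, "b": 2}
keys = d.keys()

# A view can be iterated more than once, unlike a one-shot iterator.
first_pass = sorted(keys)
second_pass = sorted(keys)

# It also tracks later changes to the dictionary.
d["c"] = 3
after_insert = sorted(keys)

# Key views support set operations.
common = keys & {"a", "c", "z"}

print(first_pass, second_pass, after_insert, sorted(common))
```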

--
Greg


From greg.ewing at canterbury.ac.nz  Sat Apr 14 04:21:30 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 14 Apr 2007 14:21:30 +1200
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <91ad5bf80704130849y5117321ev475acf240b4f086f@mail.gmail.com>
References: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
	<461F121F.2020208@gmail.com>
	<91ad5bf80704130849y5117321ev475acf240b4f086f@mail.gmail.com>
Message-ID: <46203AAA.9000802@canterbury.ac.nz>

George Sakkis wrote:

> I'm often too lazy to import it to use, say, izip() only once; I just go
> with zip() instead.

I think that range() and zip() are going to return
views or iterators in Py3k, so you won't be needing
izip() any more.

--
Greg


From greg.ewing at canterbury.ac.nz  Sat Apr 14 04:24:26 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 14 Apr 2007 14:24:26 +1200
Subject: [Python-ideas] iter() on steroids
In-Reply-To: <evoid0$kj3$1@sea.gmane.org>
References: <91ad5bf80704122139h5bb5ec4bmfacd1dc7d685b4eb@mail.gmail.com>
	<461F121F.2020208@gmail.com>
	<91ad5bf80704130849y5117321ev475acf240b4f086f@mail.gmail.com>
	<evoid0$kj3$1@sea.gmane.org>
Message-ID: <46203B5A.9080307@canterbury.ac.nz>

Terry Reedy wrote:
> So I would like some 
> pushed to modules, which means more import statements.

A middle ground would be to move them into modules,
but have the modules pre-imported into the builtin
namespace.
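
The effect can be imitated by hand, which gives a feel for what it would
look like. This is only a sketch: it patches the builtins module at run
time, which is not something to do in real code, and it assumes
Python 3's module name "builtins".

```python
import builtins
import itertools

# Hypothetical: expose the module itself under a builtin name, so that
# no per-file import statement is needed afterwards.
builtins.itertools = itertools

# Code run in a fresh namespace, with no import of its own, now finds
# the module through the builtin namespace.
ns = {}
exec("result = list(itertools.islice(itertools.count(), 3))", ns)
print(ns["result"])
```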

--
Greg


From rrr at ronadam.com  Sat Apr 14 04:54:44 2007
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 13 Apr 2007 21:54:44 -0500
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <462034A7.2070603@canterbury.ac.nz>
References: <fb6fbf560704120911t3086b8e1ue3edf84c5fbf2064@mail.gmail.com>	<461ECBEA.2050001@canterbury.ac.nz>	<20070412193258.62E6.JCARLSON@uci.edu>
	<462034A7.2070603@canterbury.ac.nz>
Message-ID: <46204274.5000907@ronadam.com>

Greg Ewing wrote:
> Josiah Carlson wrote:
> 
>>> Does anyone have a use case where they *need*
>>> the indentation to be preserved?
> 
>> Not personally.  I think that telling people to
>  > use textwrap.dedent() is sufficient.
> 
> But it seems crazy to make people do this all
> the time, when there's no reason not to do
> it automatically in the first place.

Reminds me of ...

     http://www.artima.com/weblogs/viewpost.jsp?thread=101968


Note that the optional implementation of this has already been put in 
Python 2.5 just as it said it would be.


How about using indenting along with implicit string endings?

    def foo(...):
        ```
            Just
            another
            foo.

        message = ```
            This is a multi-
            line string +
            implicit right stripping.

        print message


Just kidding of course.  The back-quotes will never be approved.  ;-)


I don't know what would be the best solution because just about anything I 
can think of has some sort of side effects in some situations.  Maybe if 
line based editors are ever completely replaced with folding graphic 
editors it will no longer be a problem because all our multi-line strings 
can have nice borders around them.

Cheers,
    Ron







From jcarlson at uci.edu  Sat Apr 14 09:21:54 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 14 Apr 2007 00:21:54 -0700
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <462033B5.7030706@canterbury.ac.nz>
References: <20070412224427.62F5.JCARLSON@uci.edu>
	<462033B5.7030706@canterbury.ac.nz>
Message-ID: <20070414001918.6301.JCARLSON@uci.edu>


Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Josiah Carlson wrote:
> > If thread B decrefs the global refcount, and it
> > becomes 0, then it can check the thread refcount and notice it is
> > nonzero and not deallocate, or if it notices that it *is* zero, then
> > since it already has the GIL (necessary to have decrefed the global
> > refcount), it can pass the object to the deallocator.
> 
> The problem with that is the owning thread needs to be
> able to manipulate the local refcount without holding
> any kind of lock. That's the whole point of it.

Certainly, but thread B isn't the owning thread; thread A is. When
thread A decrefs its thread count to zero, it acquires the GIL and checks
the global refcount to find out whether someone else is responsible for
the object's deallocation (global refcount > 0) or thread A is (global
refcount == 0).

 - Josiah



From taleinat at gmail.com  Sat Apr 14 18:34:32 2007
From: taleinat at gmail.com (Tal Einat)
Date: Sat, 14 Apr 2007 19:34:32 +0300
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<fb6fbf560704111315u7b006d14t48d669dcbf21ee6b@mail.gmail.com>
	<BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st>
Message-ID: <7afdee2f0704140934v410a4f67hb2dec4f8cc682ba@mail.gmail.com>

On 4/12/07, Adam Atlas <adam at atlas.st> wrote:
>
>
> Meanwhile, on a similar subject, I have a... strange idea. I'm not
> sure how easy/hard it would be to parse or how necessary it is, but
> it's just a thought.


[snip]

> So anyway, what I'm proposing is the following:
>
> x = 'foo
>      'bar
>      'baz'
>
> Any thoughts?


-1 on such new syntax.

What I usually do is:
message = ("yada yada\n"
           "more yada yada\n"
           "even more yada.")

This works a lot like what you suggest, but with Python's current syntax. If
implicit string concatenation were removed, I'd just add a plus sign at the
end of each line.

This is also a possibility:
message = "\n".join([
    "yada yada",
    "more yada yada",
    "even more yada."])

The latter would work even better with the removal of implicit string
concatenation, since forgetting a comma would cause a syntax error instead
of skipping a newline.
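
To show the failure mode under the current rules, here is a short
sketch. The list names are made up for the example.

```python
parts_ok = [
    "yada yada",
    "more yada yada",
    "even more yada.",
]

# The comma after the first item is missing, so the two adjacent
# literals are silently joined into a single list element.
parts_bug = [
    "yada yada"
    "more yada yada",
    "even more yada.",
]

message_ok = "\n".join(parts_ok)
message_bug = "\n".join(parts_bug)

print(len(parts_ok), len(parts_bug))
print(repr(message_bug))
```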

- Tal

From lists at janc.be  Sun Apr 15 04:46:55 2007
From: lists at janc.be (Jan Claeys)
Date: Sun, 15 Apr 2007 04:46:55 +0200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <ddb3f080704120434m21ae0195kf7269ef4f7e1e829@mail.gmail.com>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<eviqtr$fas$1@sea.gmane.org>
	<43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
	<7B3BA968-4228-49E2-A590-D84E34550EF4@atlas.st>
	<ddb3f080704120434m21ae0195kf7269ef4f7e1e829@mail.gmail.com>
Message-ID: <1176605216.28153.112.camel@localhost>

On Thursday 12-04-2007 at 12:34 [timezone +0100], Eoghan Murray wrote:
> I dislike '+' as a string concatenation operator as I think
> overloading the meaning of '+' for non-numbers is ugly, 

D uses '~' as a string concatenation operator...


-- 
Jan Claeys



From adam at atlas.st  Sun Apr 15 05:07:56 2007
From: adam at atlas.st (Adam Atlas)
Date: Sat, 14 Apr 2007 23:07:56 -0400
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <1176605216.28153.112.camel@localhost>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<eviqtr$fas$1@sea.gmane.org>
	<43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
	<7B3BA968-4228-49E2-A590-D84E34550EF4@atlas.st>
	<ddb3f080704120434m21ae0195kf7269ef4f7e1e829@mail.gmail.com>
	<1176605216.28153.112.camel@localhost>
Message-ID: <2F132947-A3F1-4622-91D0-F10BDFC229E2@atlas.st>


On 14 Apr 2007, at 22.46, Jan Claeys wrote:
> D uses '~' as a string concatenation operator...

Eh... I like D, but that would be confusing in Python, since it  
already uses ~ as a unary operator that means something totally  
different.
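
For anyone who hasn't run into it, a quick sketch of the existing
meaning, using a few arbitrary sample integers:

```python
# For integers, ~x is bitwise inversion, defined as -(x + 1).
values = [0, 5, -1]
inverted = [~v for v in values]
print(inverted)
```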


From greg.ewing at canterbury.ac.nz  Sun Apr 15 13:37:15 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 15 Apr 2007 23:37:15 +1200
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <20070414001918.6301.JCARLSON@uci.edu>
References: <20070412224427.62F5.JCARLSON@uci.edu>
	<462033B5.7030706@canterbury.ac.nz>
	<20070414001918.6301.JCARLSON@uci.edu>
Message-ID: <46220E6B.5060208@canterbury.ac.nz>

Josiah Carlson wrote:

> Certainly, but thread B isn't the owning thread, thread A was the owning
> thread, and by virtue of decrefing its thread count to zero, acquiring
> the GIL, and checking the global refcount to make sure that either
> someone else is responsible for its deallocation (global refcount > 0),
> or that thread A is responsible for its deallocation (global refcount ==
> 0).

Thread B holding the GIL doesn't help, because the
local refcount is not covered by the GIL. Thread A
must be able to assume it has total ownership of the
local refcount, otherwise there's no benefit in
the scheme.

--
Greg


From talin at acm.org  Sun Apr 15 19:12:04 2007
From: talin at acm.org (Talin)
Date: Sun, 15 Apr 2007 10:12:04 -0700
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <461ED2F9.9020407@canterbury.ac.nz>
References: <461ED2F9.9020407@canterbury.ac.nz>
Message-ID: <46225CE4.4040207@acm.org>

Greg Ewing wrote:
> 2) Objects owned by a thread
> 
> Python code creates and destroys temporary objects
> at a high rate -- stack frames, argument tuples,
> intermediate results, etc. If the code is executed
> by a thread, those objects are rarely if ever seen
> outside of that thread. It would be beneficial if
> refcount operations on such objects could be carried
> out by the thread that created them without locking.

While we are on the topic of reference counting, I'd like to direct your 
attention to Recycler, an IBM research project:

"The Recycler is a concurrent multiprocessor garbage collector with 
extremely low pause times (maximum of 6 milliseconds over eight 
benchmarks) while remaining competitive with the best throughput- 
oriented collectors in end-to-end execution times. This paper describes 
the overall architecture of the Recycler, including its use of reference 
counting and concurrent cycle collection, and presents extensive 
measurements of the system comparing it to a parallel, stop-the-world 
mark-and-sweep collector."

There are a bunch of research papers describing Recycler which can be 
found at the following link:

http://www.research.ibm.com/people/d/dfb/publications.html

I'd start with the papers entitled "Java without the Coffee Breaks: A 
Non-intrusive Multiprocessor Garbage Collector" and "Concurrent Cycle 
Collection in Reference Counted Systems".

Let me describe a bit about how the Recycler works and how it relates to 
what you've proposed.

The basic idea is that for each thread, there is a set of thread local 
data (TLD) that contains a pair of "refcount buffers", one buffer for 
increfs and one buffer for decrefs. Each refcount buffer is a flat array 
of pointers which starts empty and gradually fills up.

The "incref" operation does not actually touch the reference count field 
of the object. Instead, an "incref" appends a pointer to the object to 
the end of the incref buffer for that thread. Similarly, a decref 
operation appends a pointer to the object to the decref buffer. Since 
the refcount buffers are thread-local, there is no need for locking or 
synchronization.

When one of the buffers gets full, both buffers are swapped out for new 
ones, and the old buffers are placed on a queue which is processed by 
the collector thread. The collector thread is the only thread which is 
allowed to actually touch the reference counts of the individual 
objects, and it's the only thread which is allowed to delete objects.

Processing the buffers is relatively simple: First, the incref buffer is 
processed. The collector thread scans through each pointer in the 
buffer, and increments the refcount of each object. Then the decref 
buffer is processed in a similar way, decrementing the refcount.

However, it also needs to process the buffers for the other threads 
before the memory can be reclaimed. Recycler defines a series of 
"epochs" (i.e. intervals between collections). Within a refcount buffer, 
each epoch is represented as a contiguous range of values within the 
array -- all of the increfs and decrefs which occurred during that 
epoch. An auxiliary array records the high water mark for each epoch.

Using this information, the collector thread is able to process only 
those increfs and decrefs for all threads which occurred before the 
current epoch. This does mean that objects whose refcount falls to zero 
during the current epoch will not be deleted until the next collection 
cycle.
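
The buffer mechanism can be modelled in a few lines of Python. This is
my own simplified sketch, not code from the papers: the class names are
hypothetical, strings stand in for object pointers, the "buffer full"
condition is replaced by an explicit retire() call, and the epoch
bookkeeping is collapsed into a single collect() pass over whatever has
been queued.

```python
from collections import defaultdict


class ThreadState:
    """Per-thread data (the TLD): one buffer for increfs, one for decrefs."""

    def __init__(self):
        self.increfs = []
        self.decrefs = []


class Collector:
    """Only the collector touches the real counts or frees objects."""

    def __init__(self):
        self.refcount = defaultdict(int)
        self.queue = []    # retired (increfs, decrefs) buffer pairs
        self.freed = []

    def retire(self, ts):
        # Swap both buffers out for fresh ones and queue the old pair.
        self.queue.append((ts.increfs, ts.decrefs))
        ts.increfs, ts.decrefs = [], []

    def collect(self):
        pending, self.queue = self.queue, []
        touched = set()
        # Apply every queued incref before any decref, so a count never
        # reaches zero because of processing order alone.
        for incs, _ in pending:
            for obj in incs:
                self.refcount[obj] += 1
        for _, decs in pending:
            for obj in decs:
                self.refcount[obj] -= 1
                touched.add(obj)
        for obj in touched:
            if self.refcount[obj] == 0:
                del self.refcount[obj]
                self.freed.append(obj)


def incref(ts, obj):
    ts.increfs.append(obj)   # no lock: the buffer is thread-local


def decref(ts, obj):
    ts.decrefs.append(obj)


a, b = ThreadState(), ThreadState()
collector = Collector()

incref(a, "x")               # thread A holds a reference
incref(b, "x")               # thread B takes its own
decref(a, "x")               # thread A lets go
collector.retire(a)
collector.retire(b)
collector.collect()
count_after_first = collector.refcount["x"]

decref(b, "x")               # thread B lets go as well
collector.retire(b)
collector.collect()
print(count_after_first, collector.freed)
```

Note that "x" is not freed at the moment B drops its reference, only when
the collector next runs.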

Recycler also handles cyclic garbage via cycle detection, which is 
described in the paper. It does not use a "mark and sweep" type of 
algorithm, but is instead able to detect cycles locally without scanning 
the entire heap.

Thus, the Recycler's use of refcount buffers achieves what you were 
trying to achieve, which is refcounting without locking. However, it 
does require access to thread-local data for each incref / decref 
operation. The performance of this scheme will greatly depend on how 
quickly the code can get access to thread local data. The fastest 
possible method would be to dedicate a register, but that's infeasible 
on most systems. Another idea is for large functions to look up the TLD 
and stuff it in a local variable at the beginning of the function. For 
older source code the existing, backwards-compatible incref and decref 
macros could each individually get access to the TLD, but these would be 
slower than the more optimized methods in which the TLD was supplied as 
a parameter.

-- Talin


From jcarlson at uci.edu  Sun Apr 15 20:16:53 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 15 Apr 2007 11:16:53 -0700
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <46220E6B.5060208@canterbury.ac.nz>
References: <20070414001918.6301.JCARLSON@uci.edu>
	<46220E6B.5060208@canterbury.ac.nz>
Message-ID: <20070415111128.6307.JCARLSON@uci.edu>


Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Josiah Carlson wrote:
> 
> > Certainly, but thread B isn't the owning thread, thread A was the owning
> > thread, and by virtue of decrefing its thread count to zero, acquiring
> > the GIL, and checking the global refcount to make sure that either
> > someone else is responsible for its deallocation (global refcount > 0),
> > or that thread A is responsible for its deallocation (global refcount ==
> > 0).
> 
> Thread B holding the GIL doesn't help, because the
> local refcount is not covered by the GIL. Thread A
> must be able to assume it has total ownership of the
> local refcount, otherwise there's no benefit in
> the scheme.

I seem to not be explaining myself well enough.  What you describe is
precisely what I described earlier.  I don't believe we have a
disagreement about the execution semantics of threads on an object with
a local thread count.

I was only mentioning A acquiring the GIL if/when it becomes finished
with the object, to determine if the object could be sent to the
standard Python deallocation routines, and/or if thread A should send it
(as opposed to thread B, in the case where thread B was passed the object
and was using it beyond the time that A did).


 - Josiah



From brett at python.org  Sun Apr 15 21:52:33 2007
From: brett at python.org (Brett Cannon)
Date: Sun, 15 Apr 2007 12:52:33 -0700
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <46225CE4.4040207@acm.org>
References: <461ED2F9.9020407@canterbury.ac.nz> <46225CE4.4040207@acm.org>
Message-ID: <bbaeab100704151252u3ef30786qb45c2ea20f6f9269@mail.gmail.com>

On 4/15/07, Talin <talin at acm.org> wrote:
>
> Greg Ewing wrote:
> > 2) Objects owned by a thread
> >
> > Python code creates and destroys temporary objects
> > at a high rate -- stack frames, argument tuples,
> > intermediate results, etc. If the code is executed
> > by a thread, those objects are rarely if ever seen
> > outside of that thread. It would be beneficial if
> > refcount operations on such objects could be carried
> > out by the thread that created them without locking.
>
> While we are on the topic of reference counting, I'd like to direct your
> attention to Recycler, an IBM research project:
>
> "The Recycler is a concurrent multiprocessor garbage collector with
> extremely low pause times (maximum of 6 milliseconds over eight
> benchmarks) while remaining competitive with the best throughput-
> oriented collectors in end-to-end execution times. This paper describes
> the overall architecture of the Recycler, including its use of reference
> counting and concurrent cycle collection, and presents extensive
> measurements of the system comparing it to a parallel, stop-the-world
> mark-and-sweep collector."
>
> There are a bunch of research papers describing Recycler which can be
> found at the following link:
>
> http://www.research.ibm.com/people/d/dfb/publications.html
>
> I'd start with the papers entitled "Java without the Coffee Breaks: A
> Non-intrusive Multiprocessor Garbage Collector" and "Concurrent Cycle
> Collection in Reference Counted Systems".
>
> Let me describe a bit about how the Recycler works and how it relates to
> what you've proposed.
>
> The basic idea is that for each thread, there is a set of thread local
> data (TLD) that contains a pair of "refcount buffers", one buffer for
> increfs and one buffer for decrefs. Each refcount buffer is a flat array
> of pointers which starts empty and gradually fills up.
>
> The "incref" operation does not actually touch the reference count field
> of the object. Instead, an "incref" appends a pointer to the object to
> the end of the incref buffer for that thread. Similarly, a decref
> operation appends a pointer to the object to the decref buffer. Since
> the refcount buffers are thread-local, there is no need for locking or
> synchronization.
>
> When one of the buffers gets full, both buffers are swapped out for new
> ones, and the old buffers are placed on a queue which is processed by
> the collector thread. The collector thread is the only thread which is
> allowed to actually touch the reference counts of the individual
> objects, and it's the only thread which is allowed to delete objects.
>
> Processing the buffers is relatively simple: First, the incref buffer is
> processed. The collector thread scans through each pointer in the
> buffer, and increments the refcount of each object. Then the decref
> buffer is processed in a similar way, decrementing the refcount.
>
> However, it also needs to process the buffers for the other threads
> before the memory can be reclaimed. Recycler defines a series of
> "epochs" (i.e. intervals between collections). Within a refcount buffer,
> each epoch is represented as a contiguous range of values within the
> array -- all of the increfs and decrefs which occurred during that
> epoch. An auxiliary array records the high water mark for each epoch.
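
The buffering scheme described above can be sketched roughly as follows.
This is a simplified, single-process illustration and not the Recycler's
actual implementation: the names (Obj, ThreadLocalData, collect,
BUFFER_CAPACITY) are made up for the example, epochs are left out, and the
"threads" are simulated sequentially so the result is deterministic.

```python
import queue

BUFFER_CAPACITY = 4  # hypothetical size, kept tiny for the demo


class Obj:
    """Stand-in for a heap object; only the collector writes refcount."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0


class ThreadLocalData:
    """One per mutator thread: a pair of append-only refcount buffers."""

    def __init__(self, pending):
        self.increfs = []
        self.decrefs = []
        self._pending = pending  # queue drained by the collector

    def incref(self, obj):
        self.increfs.append(obj)  # the object's refcount field is not touched
        self._swap_if_full()

    def decref(self, obj):
        self.decrefs.append(obj)
        self._swap_if_full()

    def _swap_if_full(self):
        if max(len(self.increfs), len(self.decrefs)) >= BUFFER_CAPACITY:
            self.flush()

    def flush(self):
        """Hand both buffers to the collector and start fresh ones."""
        self._pending.put((self.increfs, self.decrefs))
        self.increfs, self.decrefs = [], []


def collect(pending):
    """Apply every queued incref, then every decref; return dead objects."""
    batches = []
    while True:
        try:
            batches.append(pending.get_nowait())
        except queue.Empty:
            break
    for increfs, _ in batches:
        for obj in increfs:
            obj.refcount += 1
    for _, decrefs in batches:
        for obj in decrefs:
            obj.refcount -= 1
    dead = []
    for _, decrefs in batches:
        for obj in decrefs:
            if obj.refcount == 0 and obj not in dead:
                dead.append(obj)
    return dead


pending = queue.Queue()
thread_a, thread_b = ThreadLocalData(pending), ThreadLocalData(pending)
shared, temp = Obj("shared"), Obj("temp")

thread_a.incref(shared)
thread_b.incref(shared)
thread_a.incref(temp)
thread_a.decref(temp)
thread_b.decref(shared)
counts_before = (shared.refcount, temp.refcount)  # nothing applied yet

thread_a.flush()
thread_b.flush()
dead = collect(pending)  # only "temp" ends up with a zero count
```

Applying all increfs from every thread before any decref is what keeps an
object alive when one thread's decref is queued ahead of another thread's
incref for the same object.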



Huh, interesting idea.  I downloaded the papers and plan to give them a
read.  The one issue I can see is that the epochs, together with buffering
refcount changes instead of applying them right away, mean a delay before
memory is reclaimed.  And I know some people love the (mostly) instantaneous
garbage collection when the refcount should be at 0.

Anyway, I will give the paper a read before I make any more ignorant
statements about the design.  =)

-Brett

From greg.ewing at canterbury.ac.nz  Mon Apr 16 01:50:12 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 16 Apr 2007 11:50:12 +1200
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <20070415111128.6307.JCARLSON@uci.edu>
References: <20070414001918.6301.JCARLSON@uci.edu>
	<46220E6B.5060208@canterbury.ac.nz>
	<20070415111128.6307.JCARLSON@uci.edu>
Message-ID: <4622BA34.7050103@canterbury.ac.nz>

Josiah Carlson wrote:

> I was only mentioning A acquiring the GIL if/when it becomes finished
> with the object, to determine if the object could be sent to the
> standard Python deallocation routines

Oh, yes, that part is fine. The problem is what happens
if thread A stuffs a reference into another object that
lives beyond A's interest in matters. Then another
thread can see an object that still has local ref
counts, even though the owning thread no longer cares
about it and is never going to get rid of those local
refcounts itself. I haven't thought of a non-expensive
way of fixing that yet.

--
Greg


From greg.ewing at canterbury.ac.nz  Mon Apr 16 01:52:52 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 16 Apr 2007 11:52:52 +1200
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <bbaeab100704151252u3ef30786qb45c2ea20f6f9269@mail.gmail.com>
References: <461ED2F9.9020407@canterbury.ac.nz> <46225CE4.4040207@acm.org>
	<bbaeab100704151252u3ef30786qb45c2ea20f6f9269@mail.gmail.com>
Message-ID: <4622BAD4.2020706@canterbury.ac.nz>

Brett Cannon wrote:
> And I know some people love the (mostly) instantaneous 
> garbage collection when the refcount should be at 0.

Yeah, and even a 6 millisecond pause could be too long
in some applications, such as high frame rate animation.
6 milliseconds is a *long* time for today's GHz processors.

--
Greg


From jimjjewett at gmail.com  Mon Apr 16 18:52:04 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 16 Apr 2007 12:52:04 -0400
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <461ED2F9.9020407@canterbury.ac.nz>
References: <461ED2F9.9020407@canterbury.ac.nz>
Message-ID: <fb6fbf560704160952r523cf34dlf6c75980e9c36ca5@mail.gmail.com>

On 4/12/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> I've been thinking about some ideas for reducing the
> amount of refcount adjustment that needs to be done,
> with a view to making GIL removal easier.
>
> 1) Permanent objects

I have some vague memory (but couldn't find the references) that
someone tried and it was too expensive.  INCREF and DECREF on
something in the header of an object you need anyhow were just too
small to beat once you added any logic.  (That said, the
experiment was pretty old, and the results may have changed.)


> 2) Objects owned by a thread

[Create an owner-refcount separate from the global count]

Some distributed systems already take advantage of the fact that the
actual count is irrelevant.  They use weights, so that other stores
don't need to be updated until the (local) weight hits zero.
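
One reading of "weights" here is classic weighted reference counting.  A
minimal sketch, assuming that is what is meant; Target and WeightedRef are
hypothetical names, and a real system would add an indirection cell instead
of raising once a weight can no longer be split:

```python
class Target:
    """The shared object; it only hears from a reference when one is dropped."""

    def __init__(self, initial_weight=64):
        self.total_weight = initial_weight
        self.alive = True

    def release(self, weight):
        self.total_weight -= weight
        if self.total_weight == 0:
            self.alive = False  # no outstanding weight, so no references


class WeightedRef:
    """A reference that carries part of the target's total weight."""

    def __init__(self, target, weight):
        self.target = target
        self.weight = weight

    def copy(self):
        # Copying splits the local weight; the target is not contacted.
        if self.weight < 2:
            raise RuntimeError("weight exhausted; cannot split further")
        half = self.weight // 2
        self.weight -= half
        return WeightedRef(self.target, half)

    def drop(self):
        # Only dropping a reference sends an update to the target.
        self.target.release(self.weight)
        self.weight = 0


target = Target()
first = WeightedRef(target, target.total_weight)
second = first.copy()
weight_after_copy = target.total_weight  # unchanged by the copy

first.drop()
alive_after_one_drop = target.alive
second.drop()
alive_after_both_drops = target.alive
```

The point of the scheme is that copy() is purely local, which is the
property that matters when the target lives in another store or thread.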

While it would be reasonable for a thread to only INCREF once, and
then keep its internal refcount elsewhere ... it is really hard to
beat "(add 1 to/subtract 1 from) an int already at a known location in
cache."

Also note that Mark Miller and Ping Yee
    http://www.eros-os.org/pipermail/e-lang/1999-May/002590.html
suggested a way to mark objects as "expensive" (==> release as soon as
possible).

Combining this, today's Python looks only at an object's size when
determining which memory pool to use.  There might be some value in
also categorizing types based on their instances' typical memory usage.
Examples:

(1)  Default pool, like today.

(2)  Permanent Pool:  Expected to be small and permanent.  Maybe skip
the refcount entirely?  Or at least ignore it going to zero, so you
don't need to lock for updates?

(3)  Thread-local.  Has an "external refcount" field that would
normally be zero.

(4)  Expensive:  Going to a rooted GC won't release it fast enough.

-jJ


From jimjjewett at gmail.com  Mon Apr 16 19:01:50 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 16 Apr 2007 13:01:50 -0400
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <462034A7.2070603@canterbury.ac.nz>
References: <fb6fbf560704120911t3086b8e1ue3edf84c5fbf2064@mail.gmail.com>
	<461ECBEA.2050001@canterbury.ac.nz>
	<20070412193258.62E6.JCARLSON@uci.edu>
	<462034A7.2070603@canterbury.ac.nz>
Message-ID: <fb6fbf560704161001u1f5a64d6l73c1c60d4b68455f@mail.gmail.com>

On 4/13/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Josiah Carlson wrote:

> >>Does anyone have a use case where they *need*
> >>the indentation to be preserved?

> > Not personally.  I think that telling people to
>  > use textwrap.dedent() is sufficient.

> But it seems crazy to make people do this all
> the time, when there's no reason not to do
> it automatically in the first place.

The textwrap methods (including a proposed dedent) might make useful
string methods.  Short of that


(1)  Where does this preservation actually hurt?

    def f(self, arg1):
        """My DocString ...

        And I continue here -- which really is what I want.
        """

I use docstrings online -- and I typically do want them indented like the code.

(2)  Should literals (or at least strings, or at least docstrings) be
decoratable?  Anywhere but a docstring, you could just call the
function, but ... I suppose it serves the same meta-value as the
proposed i(nternational) or t(emplate) strings.

    def f(...):
        ....
        @dedent
        """ ...
        ...
        """

-jJ


From greg.ewing at canterbury.ac.nz  Tue Apr 17 02:14:00 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 17 Apr 2007 12:14:00 +1200
Subject: [Python-ideas] Ideas towards GIL removal
In-Reply-To: <fb6fbf560704160952r523cf34dlf6c75980e9c36ca5@mail.gmail.com>
References: <461ED2F9.9020407@canterbury.ac.nz>
	<fb6fbf560704160952r523cf34dlf6c75980e9c36ca5@mail.gmail.com>
Message-ID: <46241148.2040200@canterbury.ac.nz>

Jim Jewett wrote:

> I have some vague memory (but couldn't find the references) that
> someone tried and it was too expensive.

Too expensive compared to what? The question isn't
whether it's more expensive than the current scheme,
but whether it helps when there's no GIL and you
have to lock the object to update the refcount.

--
Greg


From greg.ewing at canterbury.ac.nz  Tue Apr 17 02:19:24 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 17 Apr 2007 12:19:24 +1200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <fb6fbf560704161001u1f5a64d6l73c1c60d4b68455f@mail.gmail.com>
References: <fb6fbf560704120911t3086b8e1ue3edf84c5fbf2064@mail.gmail.com>
	<461ECBEA.2050001@canterbury.ac.nz>
	<20070412193258.62E6.JCARLSON@uci.edu>
	<462034A7.2070603@canterbury.ac.nz>
	<fb6fbf560704161001u1f5a64d6l73c1c60d4b68455f@mail.gmail.com>
Message-ID: <4624128C.7010204@canterbury.ac.nz>

Jim Jewett wrote:

> (1)  Where does this preservation actually hurt?

It hurts because it places a burden on everyone every
time they use a triple quoted string to do something
about the indentation which is unwanted 99.999% of
the time.
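
For illustration, this is the kind of cleanup being talked about.  A small
sketch using only standard library calls (textwrap.dedent and
inspect.cleandoc); build_message and f are made-up names for the example:

```python
import inspect
import textwrap


def build_message():
    # The literal keeps the 8 spaces of source indentation on every line.
    return """\
        Dear user,
            indented detail
        Regards"""


def f(self, arg1):
    """My DocString ...

    And I continue here -- which really is what I want.
    """


raw = build_message()
# dedent() removes only the whitespace common to all lines, so the extra
# indentation of the middle line survives.
dedented = textwrap.dedent(raw)
# cleandoc() does the docstring flavour of the same job: it ignores the
# first line when computing the margin and trims blank edge lines.
cleaned_doc = inspect.cleandoc(f.__doc__)
```

Every caller that wants the text without the source indentation has to
remember one of these calls today.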

> I use docstrings online -- and I typically do want them indented like 
> the code.

I don't understand what you mean by that. Can you
give an example where an auto-dedented docstring
would give an undesirable result?

--
Greg


From rrr at ronadam.com  Tue Apr 17 09:50:49 2007
From: rrr at ronadam.com (Ron Adam)
Date: Tue, 17 Apr 2007 02:50:49 -0500
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <4624128C.7010204@canterbury.ac.nz>
References: <fb6fbf560704120911t3086b8e1ue3edf84c5fbf2064@mail.gmail.com>	<461ECBEA.2050001@canterbury.ac.nz>	<20070412193258.62E6.JCARLSON@uci.edu>	<462034A7.2070603@canterbury.ac.nz>	<fb6fbf560704161001u1f5a64d6l73c1c60d4b68455f@mail.gmail.com>
	<4624128C.7010204@canterbury.ac.nz>
Message-ID: <46247C59.9090305@ronadam.com>

Greg Ewing wrote:
> Jim Jewett wrote:
> 
>> (1)  Where does this preservation actually hurt?
> 
> It hurts because it places a burden on everyone every
> time they use a triple quoted string to do something
> about the indentation which is unwanted 99.999% of
> the time.
> 
>> I use docstrings online -- and I typically do want them indented like 
>> the code.
> 
> I don't understand what you mean by that. Can you
> give an example where an auto-dedented docstring
> would give an undesirable result?

You didn't specify doc strings earlier, just triple quoted strings in general.

I don't think it would be a problem for only doc strings.  It could probably 
be done at compile time too.  It's not really that different than the -OO 
option to remove them.

Dedenting triple quoted strings in general would cause some problems in 
(python 2.x) with existing gui interfaces that use triple quoted strings to 
define their text.

Cheers,
    Ron










From theller at ctypes.org  Wed Apr 18 21:02:24 2007
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 18 Apr 2007 21:02:24 +0200
Subject: [Python-ideas] Command line options
Message-ID: <f05q01$t75$1@sea.gmane.org>

Sometimes I think it would be great if it were possible to have standard
Python command line options that would allow

- initialize and configure the logging module
- specify requirements for pkg_resources (for eggs installed with --multi-version)

All this would avoid having to change logging options or requirements in the
script, or having to implement a command line parser for this stuff in every script.
The idea is to call python in this way:

  python --require foo==dev --logging level=DEBUG myscript.py



I have not been able to implement something like this in sitecustomize.py,
because this module is executed when sys.argv is not yet available.

Another possible way to implement this would probably be to set environment vars
and parse those in sitecustomize.py, you would have to call

  env option1=foo option2=bar python script.py

then; unfortunately Windows does not have an 'env' utility.
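
A rough sketch of what the environment variable variant could look like
inside sitecustomize.py.  The variable name PYTHON_LOGGING_LEVEL and the
function name are assumptions made up for this example, not existing
Python options:

```python
import logging
import os


def configure_from_environ(environ=os.environ):
    """Set the root logger level from a (hypothetical) environment variable."""
    level_name = environ.get("PYTHON_LOGGING_LEVEL")  # made-up variable name
    if not level_name:
        return None
    # Map e.g. "debug" to logging.DEBUG; ignore anything that is not a level.
    level = getattr(logging, level_name.upper(), None)
    if not isinstance(level, int):
        return None
    logging.getLogger().setLevel(level)
    return level


applied = configure_from_environ({"PYTHON_LOGGING_LEVEL": "debug"})
skipped = configure_from_environ({})
```

Passing the mapping in as a parameter keeps the function easy to try out
without touching the real environment.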

Does this sound like a useful idea?

Thomas



From taleinat at gmail.com  Wed Apr 18 22:09:40 2007
From: taleinat at gmail.com (Tal Einat)
Date: Wed, 18 Apr 2007 23:09:40 +0300
Subject: [Python-ideas] Command line options
In-Reply-To: <f05q01$t75$1@sea.gmane.org>
References: <f05q01$t75$1@sea.gmane.org>
Message-ID: <7afdee2f0704181309lf9abb28we0cc8c01f861334d@mail.gmail.com>

On 4/18/07, Thomas Heller <theller at ctypes.org> wrote:
>
> Sometimes I think it would be great if it were possible to have standard
> Python command line options that would allow
>
> - initialize and configure the logging module
> - specify requirements for pkg_resources (for eggs installed with
> --multi-version)
>
> All this would avoid having to change logging options or requirements in
> the
> script, or having to implement a command line parser for this stuff in
> every script.
> The idea is to call python in this way:
>
>   python --require foo==dev --logging level=DEBUG myscript.py
>
>
>
> I have not been able to implement something like this in sitecustomize.py,
> because this module is executed when sys.argv is not yet available.
>
> Another possible way to implement this would probably be to set
> environment vars
> and parse those in sitecustomize.py, you would have to call
>
>   env option1=foo option2=bar python script.py
>
> then; unfortunately Windows does not have an 'env' utility.
>
> Does this sound like a useful idea?
>
> Thomas


-1 on this. IMHO these options are too specific to be part of Python's
standard command line options. And using environment variables as a
workaround... would cause all sorts of problems, like the one you mentioned.

If these are options you often use for different Python scripts, you could
create a generic Python script runner which would parse these options,
initialize whatever is required (logging, packages, etc.) and finally
execfile the given Python script. For example:

script_runner.py --require foo==dev --logging level=DEBUG myscript.py
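
A minimal sketch of such a runner, under a few assumptions: it is written
for a current Python, so runpy.run_path stands in for execfile; the option
names simply mirror the command line above; and --require is only honoured
where setuptools' pkg_resources is installed.

```python
import argparse
import logging
import runpy
import sys


def parse_args(argv):
    parser = argparse.ArgumentParser(prog="script_runner.py")
    parser.add_argument("--require", action="append", default=[],
                        metavar="REQUIREMENT")
    parser.add_argument("--logging", action="append", default=[],
                        metavar="KEY=VALUE")
    parser.add_argument("script")
    parser.add_argument("script_args", nargs=argparse.REMAINDER)
    return parser.parse_args(argv)


def apply_requirements(requirements):
    if not requirements:
        return
    import pkg_resources  # assumption: setuptools' pkg_resources is available
    for requirement in requirements:
        pkg_resources.require(requirement)


def configure_logging(pairs):
    # Only "level=NAME" is understood in this sketch.
    options = dict(pair.split("=", 1) for pair in pairs)
    level = getattr(logging, options.get("level", "WARNING").upper(), None)
    if not isinstance(level, int):
        raise SystemExit(f"unknown logging level: {options.get('level')!r}")
    logging.basicConfig()
    logging.getLogger().setLevel(level)


def main(argv=None):
    args = parse_args(sys.argv[1:] if argv is None else argv)
    apply_requirements(args.require)
    configure_logging(args.logging)
    saved_argv = sys.argv
    sys.argv = [args.script] + args.script_args  # what the script expects
    try:
        # Runs the file as if it were the main module; returns its globals.
        return runpy.run_path(args.script, run_name="__main__")
    finally:
        sys.argv = saved_argv
```

Calling main() at the bottom of script_runner.py would make the command
line shown above work as written.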

- Tal

From greg.ewing at canterbury.ac.nz  Thu Apr 19 00:02:50 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 19 Apr 2007 10:02:50 +1200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <46247C59.9090305@ronadam.com>
References: <fb6fbf560704120911t3086b8e1ue3edf84c5fbf2064@mail.gmail.com>
	<461ECBEA.2050001@canterbury.ac.nz>
	<20070412193258.62E6.JCARLSON@uci.edu>
	<462034A7.2070603@canterbury.ac.nz>
	<fb6fbf560704161001u1f5a64d6l73c1c60d4b68455f@mail.gmail.com>
	<4624128C.7010204@canterbury.ac.nz> <46247C59.9090305@ronadam.com>
Message-ID: <4626958A.4000809@canterbury.ac.nz>

Ron Adam wrote:

>> Can you
>> give an example where an auto-dedented docstring
>> would give an undesirable result?
> 
> You didn't specify doc strings earlier, Just triple quoted strings in 
> general.

Triple quoted strings in general is what I had in mind.
I was replying to something that seemed to imply that
it would cause trouble with docstrings, without being
very clear about what the trouble was.

> Dedenting triple quoted strings in general would cause some problems in 
> (python 2.x) with existing gui interfaces that use triple quoted strings 
> to define their text.

I conjecture that in all such cases, the existing code
is already dedenting the string itself. I still haven't
seen a real case where a piece of code actually needs
the extra indentation.

--
Greg


From lists at janc.be  Thu Apr 19 00:43:28 2007
From: lists at janc.be (Jan Claeys)
Date: Thu, 19 Apr 2007 00:43:28 +0200
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <2F132947-A3F1-4622-91D0-F10BDFC229E2@atlas.st>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<eviqtr$fas$1@sea.gmane.org>
	<43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
	<7B3BA968-4228-49E2-A590-D84E34550EF4@atlas.st>
	<ddb3f080704120434m21ae0195kf7269ef4f7e1e829@mail.gmail.com>
	<1176605216.28153.112.camel@localhost>
	<2F132947-A3F1-4622-91D0-F10BDFC229E2@atlas.st>
Message-ID: <1176936208.28153.258.camel@localhost>

On Saturday 14-04-2007 at 23:07 [timezone -0400], Adam
Atlas wrote:
> On 14 Apr 2007, at 22.46, Jan Claeys wrote:
> > D uses '~' as a string concatenation operator...
> 
> Eh... I like D, but that would be confusing in Python, since it  
> already uses ~ as a unary operator that means something totally  
> different. 

Python uses '+', '*', ':', '.', etc. for multiple different purposes
already, and at least the '+' case is more confusing sometimes than '~'
would ever be...


-- 
Jan Claeys



From adam at atlas.st  Thu Apr 19 01:01:10 2007
From: adam at atlas.st (Adam Atlas)
Date: Wed, 18 Apr 2007 19:01:10 -0400
Subject: [Python-ideas] Implicit String Concatenation
In-Reply-To: <1176936208.28153.258.camel@localhost>
References: <ddb3f080704110238q2fde4a17j4635956ed369ad48@mail.gmail.com>
	<eviqtr$fas$1@sea.gmane.org>
	<43aa6ff70704110801w2dd29e96j35d087cbee7a384f@mail.gmail.com>
	<7B3BA968-4228-49E2-A590-D84E34550EF4@atlas.st>
	<ddb3f080704120434m21ae0195kf7269ef4f7e1e829@mail.gmail.com>
	<1176605216.28153.112.camel@localhost>
	<2F132947-A3F1-4622-91D0-F10BDFC229E2@atlas.st>
	<1176936208.28153.258.camel@localhost>
Message-ID: <6FB44457-52D8-4BF9-BDAE-45FE7FC64FA9@atlas.st>


On 18 Apr 2007, at 18.43, Jan Claeys wrote:
> On Saturday 14-04-2007 at 23:07 [timezone -0400], Adam
> Atlas wrote:
>> On 14 Apr 2007, at 22.46, Jan Claeys wrote:
>>> D uses '~' as a string concatenation operator...
>>
>> Eh... I like D, but that would be confusing in Python, since it
>> already uses ~ as a unary operator that means something totally
>> different.
>
> Python uses '+', '*', ':', '.', etc. for multiple different purposes
> already, and at least the '+' case is more confusing sometimes than  
> '~'
> would ever be...

Heh, yeah, I actually realized immediately after I sent that email  
that the exact same thing could be said about +. But I don't know...  
even if + might be confused with an arithmetic operator sometimes,  
it's what people are used to, and I think it makes sense intuitively.  
'Plus', in a very abstract sense, suggests 'put two things together',  
whether with numbers or strings or anything else for which we have a  
concept of 'putting together'. ~ doesn't have that advantage. If a  
programmer coming from pretty much any language sees "foo"+"bar",  
they're probably going to be able to guess that it's concatenation.  
If they see "foo"~"bar", it is really not immediately clear what it's  
doing.


From thobes at gmail.com  Thu Apr 19 19:02:28 2007
From: thobes at gmail.com (Tobias Ivarsson)
Date: Thu, 19 Apr 2007 19:02:28 +0200
Subject: [Python-ideas] Fwd:  Implicit String Concatenation
In-Reply-To: <9997d5e60704191001q3ac384f0lb82106556dfa5ff2@mail.gmail.com>
References: <fb6fbf560704120911t3086b8e1ue3edf84c5fbf2064@mail.gmail.com>
	<461ECBEA.2050001@canterbury.ac.nz>
	<20070412193258.62E6.JCARLSON@uci.edu>
	<462034A7.2070603@canterbury.ac.nz>
	<fb6fbf560704161001u1f5a64d6l73c1c60d4b68455f@mail.gmail.com>
	<9997d5e60704191001q3ac384f0lb82106556dfa5ff2@mail.gmail.com>
Message-ID: <9997d5e60704191002l40b056fcg428c6a8420f58340@mail.gmail.com>

On 4/16/07, Jim Jewett <jimjjewett at gmail.com> wrote:
>
> On 4/13/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > Josiah Carlson wrote:
>
> > >>Does anyone have a use case where they *need*
> > >>the indentation to be preserved?
>
> > > Not personally.  I think that telling people to
> >  > use textwrap.dedent() is sufficient.
>
> > But it seems crazy to make people do this all
> > the time, when there's no reason not to do
> > it automatically in the first place.
>
> The textwrap methods (including a proposed dedent) might make useful
> string methods.  Short of that
>
>
> (1)  Where does this preservation actually hurt?
>
>     def f(self, arg1):
>         """My DocString ...
>
>         And I continue here -- which really is what I want.
>         """
>
> I use docstrings online -- and I typically do want them indented like the
> code.
>
> (2)  Should literals (or at least strings, or at least docstrings) be
> decoratable?  Anywhere but a docstring, you could just call the
> function, but ... I suppose it serves the same meta-value as the
> proposed i(nternational) or t(emplate) strings.
>
>     def f(...):
>         ....
>         @dedent
>         """ ...
>         ...
>         """



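If docstrings are the problem you can always use a function decorator for it:

from textwrap import dedent

def dedentdoc(func):
    # Strip the common leading whitespace; skip functions without a docstring.
    if func.__doc__:
        func.__doc__ = dedent(func.__doc__)
    return func

@dedentdoc
def f(*args):
    """
    Long and indented docstring.
        extra indented
    unindented, phew"""
    pass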

/Tobias


From jcarlson at uci.edu  Thu Apr 19 19:47:32 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 19 Apr 2007 10:47:32 -0700
Subject: [Python-ideas] Fwd:  Implicit String Concatenation
In-Reply-To: <9997d5e60704191002l40b056fcg428c6a8420f58340@mail.gmail.com>
References: <9997d5e60704191001q3ac384f0lb82106556dfa5ff2@mail.gmail.com>
	<9997d5e60704191002l40b056fcg428c6a8420f58340@mail.gmail.com>
Message-ID: <20070419104540.6359.JCARLSON@uci.edu>


"Tobias Ivarsson" <thobes at gmail.com> wrote:
> If docstrings are the problem you can always use a function decorator for it:

That wasn't the question.  Greg was asking "when is dedenting a
docstring *not* the right solution?"  We all understand and know that
any string can be manually dedented, the question is whether automatic
dedenting of all triple-quoted strings should be done.

 - Josiah



From brett at python.org  Fri Apr 20 05:38:42 2007
From: brett at python.org (Brett Cannon)
Date: Thu, 19 Apr 2007 20:38:42 -0700
Subject: [Python-ideas] PEP for executing a module in a package containing
	relative imports
Message-ID: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>

Some of you might remember a discussion that took place on this list
about not being able to execute a script contained in a package that
used relative imports (read the PEP if you don't quite get what I am
talking about).  The PEP below proposes a solution (along with a
counter-solution).

Let me know what you think.  I especially want to hear which proposal
people prefer; the one in the PEP or the one in the Open Issues
section.  Plus I wouldn't mind suggestions on a title for this PEP.
=)

-------------------------------------------
PEP: XXX
Title: XXX
Version: $Revision: 52916 $
Last-Modified: $Date: 2006-12-04 11:59:42 -0800 (Mon, 04 Dec 2006) $
Author: Brett Cannon
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: XXX-Apr-2007

Abstract
========

Because of how name resolution works for relative imports in a world
where PEP 328 is implemented, the ability to execute modules within a
package ceases being possible.  This failing stems from the fact that
the module being executed as the "main" module replaces its
``__name__`` attribute with ``"__main__"`` instead of leaving it as
the actual, absolute name of the module.  This breaks import's ability
to resolve relative imports from the main module into absolute names.

In order to resolve this issue, this PEP proposes to change how a
module is delineated as the module that is being executed as the main
module.  By leaving the ``__name__`` attribute in a module alone and
setting a module attribute named ``__main__`` to a true value for the
main module (and thus false in all others), proper relative name
resolution can occur while still having a clear way for a module to
know if it is being executed as the main module.


The Problem
===========

With the introduction of PEP 328, relative imports became dependent on
the ``__name__`` attribute of the module performing the import.  This
is because the dots in a relative import are used to strip away
parts of the calling module's name to calculate where in the package
hierarchy a relative import should fall (prior to PEP 328 relative
imports could fail and would fall back on absolute imports which had a
chance of succeeding).

For instance, consider the import ``from .. import spam`` made from the
``bacon.ham.beans`` module (``bacon.ham.beans`` is not a package
itself, i.e., does not define ``__path__``).  Name resolution of the
relative import takes the caller's name (``bacon.ham.beans``), splits
on dots, and then slices off the last n parts based on the level
(which is 2).  In this example both ``ham`` and ``beans`` are dropped
and ``spam`` is joined with what is left (``bacon``).  This leads to
the proper import of the module ``bacon.spam``.
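
The steps above can be sketched in a few lines.  This is an illustration of
the described algorithm only, not the code in ``importlib``; the function
name ``resolve_relative`` and the ``is_package`` flag are assumptions made
for the example.

```python
def resolve_relative(name, caller, level, is_package=False):
    """Turn a relative import into an absolute module name.

    ``caller`` is the importing module's ``__name__`` and ``level`` is the
    number of leading dots in the import statement.
    """
    if level < 1:
        raise ValueError("level must be >= 1 for a relative import")
    parts = caller.split(".")
    # A package's own name already names the directory, so one dot strips
    # nothing; a plain module has to drop its own name first.
    strip = level - 1 if is_package else level
    if strip >= len(parts):
        raise ImportError("relative import in non-package: %r" % caller)
    base = parts[:len(parts) - strip]
    return ".".join(base + [name]) if name else ".".join(base)


from_beans = resolve_relative("spam", "bacon.ham.beans", 2)  # 'bacon.spam'
from_package = resolve_relative("spam", "bacon", 1, is_package=True)

try:
    resolve_relative("spam", "__main__", 1)
except ImportError as exc:
    main_error = str(exc)  # no dots in '__main__', so nothing is left
```

The last call shows why a module whose ``__name__`` is ``'__main__'``
cannot be used as the starting point for the calculation.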

This reliance on the ``__name__`` attribute of a module when handling
relative imports becomes an issue when executing a script within a
package.  Because the executing script's ``__name__`` is set to
``'__main__'``, import cannot resolve any relative imports.  This leads to an
``ImportError`` if you try to execute a script in a package that uses
any relative import.

For example, assume we have a package named ``bacon`` with an
``__init__.py`` file containing::

  from . import spam

Also create a module named ``spam`` within the ``bacon`` package (it
can be an empty file).  Now if you try to execute the ``bacon``
package (either through ``python bacon/__init__.py`` or
``python -m bacon``) you will get an ``ImportError`` about trying to
do a relative import from within a non-package.  Obviously the import
is valid, but because of the setting of ``__name__`` to ``'__main__'``
import thinks that ``bacon/__init__.py`` is not in a package since no
dots exist in ``__name__``.  To see how the algorithm works, see
``importlib.Import._resolve_name()`` in the sandbox [#importlib]_.
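
The failure can be reproduced with a throwaway package.  The following is a
small sketch that assumes a current Python 3 interpreter; the helper name is
made up, and the exact wording of the error message differs between Python
versions, so only the exception type is relied upon.

```python
import os
import subprocess
import sys
import tempfile


def run_package_init_as_script():
    """Build a temporary ``bacon`` package and execute its __init__.py."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = os.path.join(root, "bacon")
        os.mkdir(package_dir)
        with open(os.path.join(package_dir, "__init__.py"), "w") as fh:
            fh.write("from . import spam\n")
        with open(os.path.join(package_dir, "spam.py"), "w") as fh:
            fh.write("")  # an empty module is enough
        # Equivalent to ``python bacon/__init__.py`` from the text above.
        return subprocess.run(
            [sys.executable, os.path.join(package_dir, "__init__.py")],
            capture_output=True,
            text=True,
        )


result = run_package_init_as_script()
failed = result.returncode != 0
saw_import_error = "ImportError" in result.stderr
```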

Currently a work-around is to remove all relative imports in the
module being executed and make them absolute.  This is unfortunate,
though, as one should not be required to use a specific type of
resource in order to make a module in a package be able to be
executed.


The Solution
============

The solution to the problem is to not change the value of ``__name__``
in modules.  But there still needs to be a way to let executing code
know it is being executed as a script.  This is handled with a new
module attribute named ``__main__``.

When a module is being executed as a script, ``__main__`` will be set
to a true value.  For all other modules, ``__main__`` will be set to a
false value.  This changes the current idiom of::

  if __name__ == '__main__':
      ...

to::

  if __main__:
      ...

The current idiom is not as obvious and could cause confusion for new
programmers.  The proposed idiom, though, does not require explaining
why ``__name__`` is set as it is.

With the proposed solution the convenience of finding out what module
is being executed by examining ``sys.modules['__main__']`` is lost.
To make up for this, the ``sys`` module will gain the ``main``
attribute.  It will contain a string of the name of the module that is
considered the executing module.

A competing solution is discussed in `Open Issues`_.


Transition Plan
===============

Using this solution will not work directly in Python 2.6.  Code is
dependent upon the semantics of having ``__name__`` set to
``'__main__'``.  There is also the issue of pre-existing global
variables in a module named ``__main__``.  To deal with these issues,
a two-step solution is needed.

First, a Py3K deprecation warning will be raised during AST generation
when a global variable named ``__main__`` is defined.  This will help
with the detection of code that would reset the value of ``__main__``
for a module.  Without adding a warning when a global variable is
injected into a module, though, it is not fool-proof.  But this
solution should cover the vast majority of variable rebinding
problems.

Second, 2to3 [#2to3]_ will gain a rule to transform the current
``if __name__ == '__main__': ...`` idiom to the new one.  While it will
not help with code that checks ``__name__`` outside of the idiom, that
specific line of code makes up a large proportion of code that ever
looks for ``__name__`` set to ``'__main__'``.


Open Issues
===========

A counter-proposal to introducing the ``__main__`` attribute on
modules was to introduce a built-in with the same name.  The value of
the built-in would be the name of the module being executed (just like
the proposed ``sys.main``).  This would lead to a new idiom of::

  if __name__ == __main__:
      ...

The perk of this idiom over the one proposed earlier is that the
general semantics does not differ greatly from the current idiom.

The drawback is that the syntactic difference is subtle; the dropping
of quotes around "__main__".  Some believe that for existing Python
programmers bugs will be introduced where the quotation marks will be
put on by accident.  But one could argue that the bug would be
discovered quickly through testing as it is a very shallow bug.

The other pro of this proposal over the earlier one is the alleviation
of requiring import code to have to set the value of ``__main__``.  By
making it a built-in variable import does not have to care about
``__main__`` as executing the code itself will pick up the built-in
``__main__`` itself.  This simplifies the implementation of the proposal
as it only requires setting a built-in instead of changing import to
set an attribute on every module that has exactly one module have a
different value (much like the current implementation has to do to set
``__name__`` in one module to ``'__main__'``).


References
==========

.. [#2to3]  2to3 tool
    (http://svn.python.org/view/sandbox/trunk/2to3/) [ViewVC]

.. [#importlib] importlib
    (http://svn.python.org/view/sandbox/trunk/import_in_py/importlib.py?view=markup)
    [ViewVC]



Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From selliott4 at austin.rr.com  Fri Apr 20 07:30:44 2007
From: selliott4 at austin.rr.com (Steven Elliott)
Date: Fri, 20 Apr 2007 00:30:44 -0500
Subject: [Python-ideas] [Python-Dev] Making builtins more efficient
In-Reply-To: <45DCF124.6040101@canterbury.ac.nz>
References: <1141879691.11091.78.camel@grey> <440FF9CB.5030407@gmail.com>
	<79990c6b0603090400h25dd2c7ev3d5c379f6529f3c2@mail.gmail.com>
	<1141915806.11091.127.camel@grey> <4410BE69.3080004@canterbury.ac.nz>
	<1171984065.22648.47.camel@grey>  <45DCF124.6040101@canterbury.ac.nz>
Message-ID: <1177047044.16345.79.camel@grey>

Thanks for forwarding this.  It took me a while to catch on to the
thread being moved here from python-dev.

On Thu, 2007-02-22 at 14:25 +1300, Greg Ewing wrote:
> Steven Elliott wrote:
> 
> > What I have in mind may be close to what you are suggesting above.
> 
> My idea is somewhat more uniform and general than that.
> 
> For the module dict, you use a special mapping type that
> allows selected items to be accessed by an index as well
> as a name. The set of such names is determined when the
> module's code is compiled -- it's simply the names used
> in that module to refer to globals or builtins.

That sounds like an interesting idea.  How does it differ from PEP 280?:
    http://www.python.org/dev/peps/pep-0280
(assuming PEP 280 isn't what you are describing).  If there is some
place I can read more about this idea, that would be great.

> The first time a given builtin is referenced in the module,
> it will be unbound in the module dict, so it is looked up
> in the usual way and then written into the module dict,
> so it can subsequently be retrieved by index.
> 
> The advantages of this scheme over yours are that it speeds
> access to module-level names as well as builtins, and it
> doesn't require the compiler to have knowledge of a
> predefined set of names.

How does your scheme speed access to module-level names?  Are they
referred to by index?  With your idea would module level names only be
referred to by index internally in the module as global variables
(LOAD_GLOBAL)?  It seems like referring to attributes inside the module
from outside the module (LOAD_ATTR) would require something like a
versioning scheme where the compiler knows in advance what version
the module is so that it can get the index right.

Again, if your idea is PEP 280 then my questions in the above paragraph
are answered.

> It does entail a slight semantic change, as changes made
> to a builtin won't be seen by a module that has already
> used that builtin for the first time. But from what Guido
> has said before, it seems he is willing to accept a change
> like that if it will help.

I think slight semantic changes like that are worth it if they buy
greater performance, but I understand the importance of backward
compatibility as well.

After exploring some of the different ideas for making
globals/builtins/attributes more efficient it seems to me that there is
an overall tradeoff: how much complexity is justified to avoid hash
table lookups or extra levels of indirection?

For example, PEP 280 avoids a hash table lookup but adds a level of
indirection (the cells point to the value), which is a net performance
gain.  My idea (my last big email) avoids a level of indirection, but
only for a predefined set of names.  And it requires new opcodes.  And
the compiler has to be aware of the predefined set of names.

I have a question about PEP 280, but maybe I'll ask it here since it
seems relevant.

One of the elegant things about the way Python compiles code is that,
for the most part, each function can be compiled independently without
concern for context.  For example, the co_names in the code object for a
function has only the names for that function.  So the co_names for a
given function does not depend on anything outside of that function.

What if PEP 280's proposed co_globals, which currently only has
globals referenced by that function (similar to co_names), was instead a
pointer to a shared array of globals that was common for the entire
module (one to one with the module's dict)?

I think doing so could avoid a level of indirection, but at the cost of
forcing the compiler to keep track of the indexes of all globals so that
it could get the index right (the index being the "<i>" in the proposed
LOAD_GLOBAL_CELL).  The celldict might then have indexes into
co_globals instead of cells.  So the cost would be making the compiling
of functions less independent.
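
A rough model of the shared-array variant, to make the tradeoff concrete.
This is only a sketch under my own assumptions: ``ModuleGlobals``,
``slot_for`` and ``store`` are made-up names, not anything from PEP 280,
and a Python list stands in for the C array of globals.

```python
class ModuleGlobals:
    """Hypothetical per-module table: name -> slot index, plus shared slots."""

    def __init__(self):
        self.index = {}   # name -> slot number, assigned at "compile time"
        self.slots = []   # one shared array for the whole module

    def slot_for(self, name):
        # The compiler would have to call this for every global it sees,
        # in every function of the module, so that indexes stay consistent.
        if name not in self.index:
            self.index[name] = len(self.slots)
            self.slots.append(None)
        return self.index[name]

    def store(self, name, value):
        self.slots[self.slot_for(name)] = value


g = ModuleGlobals()
i_x = g.slot_for("x")   # while compiling a function that reads x
i_y = g.slot_for("y")   # while compiling another function that reads y
g.store("x", 10)
g.store("y", 20)

# At run time a function would index the shared array directly,
# with no hash lookup and no intermediate cell object.
value_of_x = g.slots[i_x]
value_of_y = g.slots[i_y]
```

The point of the sketch is that ``slot_for`` needs module-wide state, which
is exactly the loss of independence between function compilations.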

-- 
-----------------------------------------------------------------------
|          Steven Elliott          |      selliott4 at austin.rr.com     |
-----------------------------------------------------------------------




From steven.bethard at gmail.com  Fri Apr 20 07:56:09 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu, 19 Apr 2007 23:56:09 -0600
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
Message-ID: <d11dcfba0704192256p74018fcey4d2c7827f7696f88@mail.gmail.com>

On 4/19/07, Brett Cannon <brett at python.org> wrote:
> Let me know what you think.  I especially want to hear which proposal
> people prefer; the one in the PEP or the one in the Open Issues
> section.  Plus I wouldn't mind suggestions on a title for this PEP.

As you've probably already guessed, I prefer the::

    if __main__:

version. I don't think I've ever used sys.modules['__main__'].

> Transition Plan
> ===============
>
> Using this solution will not work directly in Python 2.6.  Code is
> dependent upon the semantics of having ``__name__`` set to
> ``'__main__'``.  There is also the issue of pre-existing global
> variables in a module named ``__main__``.

Could you explain a bit why __main__ couldn't be inserted into modules
before the module is actually executed? E.g. something like::

    >>> module_text = '''\
    ... __main__ = 'foo'
    ... print __main__
    ... '''
    >>> import new
    >>> mod = new.module('mod')
    >>> mod.__main__ = True
    >>> exec module_text in mod.__dict__
    foo
    >>> mod.__main__
    'foo'

I would have thought that if Python inserted __main__ before any of
the module contents got exec'd, it would be backwards compatible
because any use of __main__ would just overwrite the default one.
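
For anyone trying this on Python 3, where the ``new`` module and the
``exec ... in`` statement are gone, a rough equivalent of the session
above is sketched below.  ``types.ModuleType`` and the ``exec()`` function
stand in for ``new.module`` and the statement form; ``result`` is a
made-up name used to capture the value instead of printing it.

```python
import types

module_text = """\
__main__ = 'foo'
result = __main__
"""

mod = types.ModuleType("mod")
mod.__main__ = True               # value pre-inserted before execution
exec(module_text, mod.__dict__)   # the module body rebinds __main__

final_value = mod.__main__        # the module's own assignment wins
```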

Steve
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From jcarlson at uci.edu  Fri Apr 20 09:16:49 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 20 Apr 2007 00:16:49 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
Message-ID: <20070419235504.6374.JCARLSON@uci.edu>


"Brett Cannon" <brett at python.org> wrote:
> 
> Some of you might remember a discussion that took place on this list
> about not being able to execute a script contained in a package that
> used relative imports (read the PEP if you don't quite get what I am
> talking about).  The PEP below proposes a solution (along with a
> counter-solution).
> 
> Let me know what you think.  I especially want to hear which proposal
> people prefer; the one in the PEP or the one in the Open Issues
> section.  Plus I wouldn't mind suggestions on a title for this PEP.
> =)

About all I can come up with is "Fixing relative imports".


>   if __name__ == '__main__':
>       ...
> 
> to::
> 
>   if __main__:
>       ...

According to your PEP, the point of the above is so that __name__ can
become something descriptive, so that relative imports can do their
thing as per PEP 328 semantics.  However, both of your proposals seek to
offer a value for __main__ (either as a builtin or module global).

While others will probably disagree with me, I'm going to go with your
'open issues' proposal of ...

>   if __name__ == __main__:
>       ...

As you say, errors arising from the 'subtle' removal of quotes will be
quickly discovered (without a 2to3 conversion), and with a 2to3
conversion can be automatically converted.  In 2.6, it could result in a
warning or exception, depending on how Python 2.6 is run and/or what
__future__ statements are used.  It also doesn't rely on sticking yet
another value in a module's globals (which makes it easier for 3rd
parties to handle module loading by hand), while still making __main__
accessible.

For people who had previously been using sys.modules['__main__'], they
can instead use sys.modules[__main__] to get the same effect, which your
initial proposal does not allow.

 - Josiah



From lists at cheimes.de  Fri Apr 20 15:43:21 2007
From: lists at cheimes.de (Christian Heimes)
Date: Fri, 20 Apr 2007 15:43:21 +0200
Subject: [Python-ideas] PEP for executing a module in a package
 containing relative imports
In-Reply-To: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
Message-ID: <f0ag3v$egs$1@sea.gmane.org>

Brett Cannon schrieb:
> When a module is being executed as a script, ``__main__`` will be set
> to a true value.  For all other modules, ``__main__`` will be set to a
> false value.  This changes the current idiom of::
> 
>   if __name__ == '__main__':
>       ...
> 
> to::
> 
>   if __main__:
>       ...
> 
> The current idiom is not as obvious and could cause confusion for new
> programmers.  The proposed idiom, though, does not require explaining
> why ``__name__`` is set as it is.
> 
> With the proposed solution the convenience of finding out what module
> is being executed by examining ``sys.modules['__main__']`` is lost.
> To make up for this, the ``sys`` module will gain the ``main``
> attribute.  It will contain a string of the name of the module that is
> considered the executing module.

What about

    import sys
    if __name__ == sys.main:
        ...

You won't have to introduce a new global module var __main__ and it's
easy to understand for newbies and experienced developers. The code is
only executed when the name of the current module is equal to the
executed main module (sys.main).
IMO it's much less PIT...B than introducing __main__.

Christian



From jimjjewett at gmail.com  Fri Apr 20 17:18:51 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 20 Apr 2007 11:18:51 -0400
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
Message-ID: <fb6fbf560704200818w73ef5c68sb7c71b1743a0e789@mail.gmail.com>

On 4/19/07, Brett Cannon <brett at python.org> wrote:

> ...  By leaving the ``__name__`` attribute in a module alone and
> setting a module attribute named ``__main__`` to a true value for the
> main module (and thus false in all others) ...

Part of me says that you are already proposing the right answer, as
these alternatives are just a little too hackish.  Still, they are
good enough that they should be listed in the PEP, even if only as
rejected alternatives.

(1)  You could add a builtin __main__ that is false.  The real main
module would mask it, but no other code would need to change.

Con:  Another builtin, and this one wouldn't even make sense as an
independent object.

(2)  You could special-case the import to use __file__ instead of
__name__ when __name__ == "__main__"

Con:  may be more fragile.

(3)  You could set __name__ to (an instance of) a funky string
subclass that overrides __eq__.

Con:  may be hard to find exactly the *right* behavior.  Examples:
What should str(name) do?  Maybe __main__ should be the primary value,
and split should be overridden?
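
To show what (3) might look like, and why the right behavior is hard to
pin down, here is a minimal sketch.  ``MainName`` is a hypothetical class
name, and the choice to keep the real dotted name as the primary value is
only one of the possible answers to the questions above.

```python
class MainName(str):
    """Hypothetical __name__ value: the real dotted name, but == '__main__'."""

    def __eq__(self, other):
        # Compare with plain str semantics to avoid recursing into ourselves.
        if isinstance(other, str) and str.__eq__(other, "__main__"):
            return True
        return str.__eq__(self, other)

    # Defining __eq__ disables inherited hashing, so restore it explicitly.
    __hash__ = str.__hash__


name = MainName("bacon.ham.beans")
is_main = (name == "__main__")        # the old idiom keeps working
parent = name.rsplit(".", 2)[0]       # relative-import style slicing works too
as_text = str(name)                   # plain str with the dotted name
```

One visible wart: ``name == "__main__"`` is true while
``hash(name) != hash("__main__")`` in general, so dictionary lookups such as
``sys.modules[name]`` and equality tests disagree about what the name is.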

-jJ


From grosser.meister.morti at gmx.net  Fri Apr 20 17:41:19 2007
From: grosser.meister.morti at gmx.net (=?ISO-8859-15?Q?Mathias_Panzenb=F6ck?=)
Date: Fri, 20 Apr 2007 17:41:19 +0200
Subject: [Python-ideas] ordered dict
Message-ID: <4628DF1F.3060803@gmx.net>

Some kind of ordered dictionary would be nice to have in the
standard library. e.g. an AVL tree or something like that.
It would be nice so we can do things like this:

for value in tree[:end_key]:
	do_something_with(value)

del tree[:end_key]


An alternative would be just to sort the keys of a dict but
that's O( n log n ) for each sort. Depending on what's the more
often occurring case (lookup, insert, get key-range, etc.)
another kind of dict object would make sense.

What do you think?


	-panzi


From jcarlson at uci.edu  Fri Apr 20 18:23:54 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 20 Apr 2007 09:23:54 -0700
Subject: [Python-ideas] ordered dict
In-Reply-To: <4628DF1F.3060803@gmx.net>
References: <4628DF1F.3060803@gmx.net>
Message-ID: <20070420091127.637D.JCARLSON@uci.edu>


Mathias Panzenböck <grosser.meister.morti at gmx.net> wrote:
> 
> Some kind of ordered dictionary would be nice to have in the
> standard library. e.g. an AVL tree or something like that.
> It would be nice so we can do things like this:
> 
> for value in tree[:end_key]:
> 	do_something_with(value)
> 
> del tree[:end_key]
> 
> 
> An alternative would be just to sort the keys of a dict but
> that's O( n log n ) for each sort. Depending on what's the more
> often occurring case (lookup, insert, get key-range, etc.)
> another kind of dict object would make sense.
> 
> What do you think?

This has been brought up many times.  The general consensus has been
'you don't get what you think you get'.

    >>> u'a' < 'b' < () < u'a'
    True

That is to say, there isn't a total ordering on objects that would make
sense as a sorted key,value dictionary.  In Python 3.0, objects that
don't make sense to compare won't be comparable, so list.sort() and/or
an AVL tree may make sense again.
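
As a small illustration of the Python 3 behavior mentioned above: mixed
types with no sensible ordering raise ``TypeError`` instead of comparing
arbitrarily.  The variable name ``mixed_sortable`` is just for this sketch.

```python
try:
    sorted(["b", (), "a"])      # str and tuple have no defined ordering
    mixed_sortable = True
except TypeError:
    mixed_sortable = False      # Python 3 refuses the comparison
```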

However, the problem with choosing to use an AVL (Red-Black, 2-3, etc.)
tree is deciding semantics.  Do you allow duplicate keys?  Do you allow
insertion and removal by position?  Do you allow the fetching of the
key/value at position X?  Do you allow the fetching of the position for
key X?  Do you allow insertion before/after (bisect_left, bisect_right
equivalents)?  Etcetera.

In many cases, using a sorted list gets you what you want, is almost as
fast, and has the benefit of using less memory.
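
A minimal sketch of the sorted-list approach using the standard ``bisect``
module, covering the two operations from the original request
(``tree[:end_key]`` and ``del tree[:end_key]``).  The helper names
``insert``, ``values_before`` and ``delete_before`` are made up for this
example, and keys are assumed to be mutually comparable and unique.

```python
import bisect

keys = []      # kept in sorted order
values = {}    # key -> value

def insert(key, value):
    if key not in values:
        bisect.insort(keys, key)   # O(log n) search, O(n) element shift
    values[key] = value

def values_before(end_key):
    """Values for all keys < end_key, in key order."""
    stop = bisect.bisect_left(keys, end_key)
    return [values[k] for k in keys[:stop]]

def delete_before(end_key):
    """Remove all entries whose key is < end_key."""
    stop = bisect.bisect_left(keys, end_key)
    for k in keys[:stop]:
        del values[k]
    del keys[:stop]

for k, v in [(5, "e"), (1, "a"), (3, "c")]:
    insert(k, v)

first = values_before(4)
delete_before(4)
```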


 - Josiah



From brett at python.org  Fri Apr 20 19:09:55 2007
From: brett at python.org (Brett Cannon)
Date: Fri, 20 Apr 2007 10:09:55 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <d11dcfba0704192256p74018fcey4d2c7827f7696f88@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<d11dcfba0704192256p74018fcey4d2c7827f7696f88@mail.gmail.com>
Message-ID: <bbaeab100704201009je396733n3261cd7683f2ed58@mail.gmail.com>

On 4/19/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 4/19/07, Brett Cannon <brett at python.org> wrote:
> > Let me know what you think.  I especially want to hear which proposal
> > people prefer; the one in the PEP or the one in the Open Issues
> > section.  Plus I wouldn't mind suggestions on a title for this PEP.
>
> As you've probably already guessed, I prefer the::
>
>     if __main__:
>
> version. I don't think I've ever used sys.modules['__main__'].
>

Yeah, I figured you would.  =)

> > Transition Plan
> > ===============
> >
> > Using this solution will not work directly in Python 2.6.  Code is
> > dependent upon the semantics of having ``__name__`` set to
> > ``'__main__'``.  There is also the issue of pre-existing global
> > variables in a module named ``__main__``.
>
> Could you explain a bit why __main__ couldn't be inserted into modules
> before the module is actually executed? E.g. something like::
>
>     >>> module_text = '''\
>     ... __main__ = 'foo'
>     ... print __main__
>     ... '''
>     >>> import new
>     >>> mod = new.module('mod')
>     >>> mod.__main__ = True
>     >>> exec module_text in mod.__dict__
>     foo
>     >>> mod.__main__
>     'foo'
>
> I would have thought that if Python inserted __main__ before any of
> the module contents got exec'd, it would be backwards compatible
> because any use of __main__ would just overwrite the default one.

That's right, and that is the problem.  That would mean if __main__
was false but then overwritten by a function or something, it suddenly
became true.  It isn't a problem in terms of whether the code will
run, but whether the expected semantics will occur.
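
A minimal sketch of that failure mode, written for Python 3 with
``types.ModuleType`` standing in for the interpreter's module setup.  The
module name ``helper`` and the function body are made up for illustration.

```python
import types

mod = types.ModuleType("helper")
mod.__main__ = False    # pre-inserted: this is not the main module

# Existing code that happens to define something called __main__.
exec("def __main__():\n    pass\n", mod.__dict__)

# Function objects are truthy, so "if __main__:" would now run.
looks_like_main = bool(mod.__main__)
```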

-Brett


From brett at python.org  Fri Apr 20 19:11:43 2007
From: brett at python.org (Brett Cannon)
Date: Fri, 20 Apr 2007 10:11:43 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <20070419235504.6374.JCARLSON@uci.edu>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<20070419235504.6374.JCARLSON@uci.edu>
Message-ID: <bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>

On 4/20/07, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "Brett Cannon" <brett at python.org> wrote:
> >
> > Some of you might remember a discussion that took place on this list
> > about not being able to execute a script contained in a package that
> > used relative imports (read the PEP if you don't quite get what I am
> > talking about).  The PEP below proposes a solution (along with a
> > counter-solution).
> >
> > Let me know what you think.  I especially want to hear which proposal
> > people prefer; the one in the PEP or the one in the Open Issues
> > section.  Plus I wouldn't mind suggestions on a title for this PEP.
> > =)
>
> About all I can come up with is "Fixing relative imports".
>
>
> >   if __name__ == '__main__':
> >       ...
> >
> > to::
> >
> >   if __main__:
> >       ...
>
> According to your PEP, the point of the above is so that __name__ can
> become something descriptive, so that relative imports can do their
> thing as per PEP 328 semantics.  However, both of your proposals seek to
> offer a value for __main__ (either as a builtin or module global).
>
> While others will probably disagree with me, I'm going to go with your
> 'open issues' proposal of ...
>
> >   if __name__ == __main__:
> >       ...
>

Woohoo!  It's my preference, but that's just because I think it will
be easier to implement.


> As you say, errors arising from the 'subtle' removal of quotes will be
> quickly discovered (without a 2to3 conversion), and with a 2to3
> conversion can be automatically converted.  In 2.6, it could result in a
> warning or exception, depending on how Python 2.6 is run and/or what
> __future__ statements are used.  It also doesn't rely on sticking yet
> another value in a module's globals (which makes it easier for 3rd
> parties to handle module loading by hand), while still making __main__
> accessible.
>

That's a good point.

> For people who had previously been using sys.modules['__main__'], they
> can instead use sys.modules[__main__] to get the same effect, which your
> initial proposal does not allow.

Yep.

-Brett


From brett at python.org  Fri Apr 20 19:15:43 2007
From: brett at python.org (Brett Cannon)
Date: Fri, 20 Apr 2007 10:15:43 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <f0ag3v$egs$1@sea.gmane.org>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<f0ag3v$egs$1@sea.gmane.org>
Message-ID: <bbaeab100704201015k121e240fje731b75950377a41@mail.gmail.com>

On 4/20/07, Christian Heimes <lists at cheimes.de> wrote:
> Brett Cannon schrieb:
> > When a module is being executed as a script, ``__main__`` will be set
> > to a true value.  For all other modules, ``__main__`` will be set to a
> > false value.  This changes the current idiom of::
> >
> >   if __name__ == '__main__':
> >       ...
> >
> > to::
> >
> >   if __main__:
> >       ...
> >
> > The current idiom is not as obvious and could cause confusion for new
> > programmers.  The proposed idiom, though, does not require explaining
> > why ``__name__`` is set as it is.
> >
> > With the proposed solution the convenience of finding out what module
> > is being executed by examining ``sys.modules['__main__']`` is lost.
> > To make up for this, the ``sys`` module will gain the ``main``
> > attribute.  It will contain a string of the name of the module that is
> > considered the executing module.
>
> What about
>
>     import sys
>     if __name__ == sys.main:
>         ...
>
> You won't have to introduce a new global module var __main__ and it's
> easy to understand for newbies and experienced developers. The code is
> only executed when the name of the current module is equal to the
> executed main module (sys.main).
> IMO it's much less PIT...B than introducing __main__.
>

True, but it does introduce an import for a module that may never be
used if the module is not being executed.  That kind of sucks for
minor performance reasons.

But what do other people think?

-Brett


From brett at python.org  Fri Apr 20 19:16:48 2007
From: brett at python.org (Brett Cannon)
Date: Fri, 20 Apr 2007 10:16:48 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <fb6fbf560704200818w73ef5c68sb7c71b1743a0e789@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<fb6fbf560704200818w73ef5c68sb7c71b1743a0e789@mail.gmail.com>
Message-ID: <bbaeab100704201016kf695c70i48afec3c5c76ad02@mail.gmail.com>

On 4/20/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 4/19/07, Brett Cannon <brett at python.org> wrote:
>
> > ...  By leaving the ``__name__`` attribute in a module alone and
> > setting a module attribute named ``__main__`` to a true value for the
> > main module (and thus false in all others) ...
>
> Part of me says that you are already proposing the right answer, as
> these alternatives are just a little too hackish.  Still, they are
> good enough that they should be listed in the PEP, even if only as
> rejected alternatives.
>
> (1)  You could add a builtin __main__ that is false.  The real main
> module would mask it, but no other code would need to change.
>
> Con:  Another builtin, and this one wouldn't even make sense as an
> independent object.
>
> (2)  You could special-case the import to use __file__ instead of
> __name__ when __name__ == "__main__"
>
> Con:  may be more fragile.
>
> (3)  You could set __name__ to (an instance of) a funky string
> subclass that overrides __eq__.
>
> Con:  may be hard to find exactly the *right* behavior.  Examples:
> What should str(name) do?  Maybe __main__ should be the primary value,
> and split should be overridden?
>

Yeah, I don't like any of them.  =)  I will add them to the PEP in a
Rejected Ideas section.

-Brett


From brett at python.org  Fri Apr 20 19:22:28 2007
From: brett at python.org (Brett Cannon)
Date: Fri, 20 Apr 2007 10:22:28 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
Message-ID: <bbaeab100704201022j2b8a2035k41424bd4b01e9065@mail.gmail.com>

I realized two things that I didn't mention in the PEP.

One is that Python will have to infer the proper package name for a
module being executed.  Currently Python only knows the name of a
module because you asked for something and it tries to find a module
that fits that request.  But what is being proposed here has to figure
out what you would have asked for in order for the import to happen.
So I need to spell out the algorithm that will need to be used to
figure out ``python bacon/__init__.py`` is the bacon package.  Using
the '-m' option solves this as the name is given as an argument.

Maybe this should only be expected to work with the -m option?  Would
simplify things, but it does restrict the usefulness overall (but not
entirely as you would still gain a new feature).

The other issue is what to do if the module being executed is above
the current directory where Python is executing from (e.g., ``python
../spam.py``).  You can't infer the name for that module if the parent
directory is not on sys.path.  Setting the name to "__main__" might
need to stay for instances where the module being executed cannot have
its name inferred.  This is another argument to only support '-m'
with this.
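
One possible shape for the inference step, as a sketch only.  It assumes
the script path exists and that a directory counts as a package exactly
when it contains an ``__init__.py``; ``infer_module_name`` is a
hypothetical helper, and it deliberately ignores the ``sys.path`` question
raised above.

```python
from pathlib import Path

def infer_module_name(path):
    """Guess a dotted module name by walking up through package dirs."""
    path = Path(path).resolve()
    # Running a package's __init__.py means the package itself is the module.
    parts = [] if path.name == "__init__.py" else [path.stem]
    directory = path.parent
    while (directory / "__init__.py").is_file():
        parts.append(directory.name)
        directory = directory.parent
    return ".".join(reversed(parts))
```

With a layout of ``bacon/__init__.py`` and ``bacon/ham/beans.py`` (plus
``bacon/ham/__init__.py``), this would give ``bacon`` for the first file
and ``bacon.ham.beans`` for the second.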

-Brett




On 4/19/07, Brett Cannon <brett at python.org> wrote:
> Some of you might remember a discussion that took place on this list
> about not being able to execute a script contained in a package that
> used relative imports (read the PEP if you don't quite get what I am
> talking about).  The PEP below proposes a solution (along with a
> counter-solution).
>
> Let me know what you think.  I especially want to hear which proposal
> people prefer; the one in the PEP or the one in the Open Issues
> section.  Plus I wouldn't mind suggestions on a title for this PEP.
> =)
>
> -------------------------------------------
> PEP: XXX
> Title: XXX
> Version: $Revision: 52916 $
> Last-Modified: $Date: 2006-12-04 11:59:42 -0800 (Mon, 04 Dec 2006) $
> Author: Brett Cannon
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: XXX-Apr-2007
>
> Abstract
> ========
>
> Because of how name resolution works for relative imports in a world
> where PEP 328 is implemented, the ability to execute modules within a
> package ceases being possible.  This failing stems from the fact that
> the module being executed as the "main" module replaces its
> ``__name__`` attribute with ``"__main__"`` instead of leaving it as
> the actual, absolute name of the module.  This breaks import's ability
> to resolve relative imports from the main module into absolute names.
>
> In order to resolve this issue, this PEP proposes to change how a
> module is delineated as the module that is being executed as the main
> module.  By leaving the ``__name__`` attribute in a module alone and
> setting a module attribute named ``__main__`` to a true value for the
> main module (and thus false in all others), proper relative name
> resolution can occur while still having a clear way for a module to
> know if it is being executed as the main module.
>
>
> The Problem
> ===========
>
> With the introduction of PEP 328, relative imports became dependent on
> the ``__name__`` attribute of the module performing the import.  This
> is because the dots in a relative import are used to strip away
> parts of the calling module's name to calculate where in the package
> hierarchy a relative import should fall (prior to PEP 328 relative
> imports could fail and would fall back on absolute imports which had a
> chance of succeeding).
>
> For instance, consider the import ``from .. import spam`` made from the
> ``bacon.ham.beans`` module (``bacon.ham.beans`` is not a package
> itself, i.e., does not define ``__path__``).  Name resolution of the
> relative import takes the caller's name (``bacon.ham.beans``), splits
> on dots, and then slices off the last n parts based on the level
> (which is 2).  In this example both ``ham`` and ``beans`` are dropped
> and ``spam`` is joined with what is left (``bacon``).  This leads to
> the proper import of the module ``bacon.spam``.
>
> This reliance on the ``__name__`` attribute of a module when handling
> relative imports becomes an issue with executing a script within a
> package.  Because the executing script's ``__name__`` is set to
> ``'__main__'``, import cannot resolve any relative imports.  This leads
> to an ``ImportError`` if you try to execute a script in a package that
> uses any relative import.
>
> For example, assume we have a package named ``bacon`` with an
> ``__init__.py`` file containing::
>
>   from . import spam
>
> Also create a module named ``spam`` within the ``bacon`` package (it
> can be an empty file).  Now if you try to execute the ``bacon``
> package (either through ``python bacon/__init__.py`` or
> ``python -m bacon``) you will get an ``ImportError`` about trying to
> do a relative import from within a non-package.  Obviously the import
> is valid, but because of the setting of ``__name__`` to ``'__main__'``
> import thinks that ``bacon/__init__.py`` is not in a package since no
> dots exist in ``__name__``.  To see how the algorithm works, see
> ``importlib.Import._resolve_name()`` in the sandbox [#importlib]_.
>
> Currently a work-around is to remove all relative imports in the
> module being executed and make them absolute.  This is unfortunate,
> though, as one should not be required to use a specific type of
> resource in order to make a module in a package be able to be
> executed.
>
>
> The Solution
> ============
>
> The solution to the problem is to not change the value of ``__name__``
> in modules.  But there still needs to be a way to let executing code
> know it is being executed as a script.  This is handled with a new
> module attribute named ``__main__``.
>
> When a module is being executed as a script, ``__main__`` will be set
> to a true value.  For all other modules, ``__main__`` will be set to a
> false value.  This changes the current idiom of::
>
>   if __name__ == '__main__':
>       ...
>
> to::
>
>   if __main__:
>       ...
>
> The current idiom is not as obvious and could cause confusion for new
> programmers.  The proposed idiom, though, does not require explaining
> why ``__name__`` is set as it is.
>
> With the proposed solution the convenience of finding out what module
> is being executed by examining ``sys.modules['__main__']`` is lost.
> To make up for this, the ``sys`` module will gain the ``main``
> attribute.  It will contain a string of the name of the module that is
> considered the executing module.
>
> A competing solution is discussed in `Open Issues`_.
>
>
> Transition Plan
> ===============
>
> Using this solution will not work directly in Python 2.6.  Code is
> dependent upon the semantics of having ``__name__`` set to
> ``'__main__'``.  There is also the issue of pre-existing global
> variables in a module named ``__main__``.  To deal with these issues,
> a two-step solution is needed.
>
> First, a Py3K deprecation warning will be raised during AST generation
> when a global variable named ``__main__`` is defined.  This will help
> with the detection of code that would reset the value of ``__main__``
> for a module.  Without adding a warning when a global variable is
> injected into a module, though, it is not fool-proof.  But this
> solution should cover the vast majority of variable rebinding
> problems.
>
> Second, 2to3 [#2to3]_ will gain a rule to transform the current ``if
> __name__ == '__main__': ...`` idiom to the new one.  While it will not
> help with code that checks ``__name__`` outside of the idiom, that
> specific line of code makes up a large proportion of code that ever
> looks for ``__name__`` set to ``'__main__'``.
>
>
> Open Issues
> ===========
>
> A counter-proposal to introducing the ``__main__`` attribute on
> modules was to introduce a built-in with the same name.  The value of
> the built-in would be the name of the module being executed (just like
> the proposed ``sys.main``).  This would lead to a new idiom of::
>
>   if __name__ == __main__:
>       ...
>
> The perk of this idiom over the one proposed earlier is that the
> general semantics does not differ greatly from the current idiom.
>
> The drawback is that the syntactic difference is subtle; the dropping
> of quotes around "__main__".  Some believe that for existing Python
> programmers bugs will be introduced where the quotation marks will be
> put on by accident.  But one could argue that the bug would be
> discovered quickly through testing as it is a very shallow bug.
>
> The other pro of this proposal over the earlier one is the alleviation
> of requiring import code to have to set the value of ``__main__``.  By
> making it a built-in variable import does not have to care about
> ``__main__`` as executing the code itself will pick up the built-in
> ``__main__`` itself.  This simplifies the implementation of the proposal
> as it only requires setting a built-in instead of changing import to
> set an attribute on every module, with exactly one module having a
> different value (much like the current implementation has to do to set
> ``__name__`` in one module to ``'__main__'``).
>
>
> References
> ==========
>
> .. [#2to3]  2to3 tool
>     (http://svn.python.org/view/sandbox/trunk/2to3/) [ViewVC]
>
> .. [#importlib] importlib
>     (http://svn.python.org/view/sandbox/trunk/import_in_py/importlib.py?view=markup)
>     [ViewVC]
>
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
>
>
> ..
>    Local Variables:
>    mode: indented-text
>    indent-tabs-mode: nil
>    sentence-end-double-space: t
>    fill-column: 70
>    coding: utf-8
>    End:
>
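
The name resolution described in "The Problem" section of the quoted PEP
can be sketched roughly as below.  This is not the sandbox
``importlib`` code; ``resolve_name`` is a hypothetical function that
follows the prose for the case where the caller is a plain module (not a
package), and the error message is illustrative.

```python
def resolve_name(name, caller_name, level):
    """Turn a relative import into an absolute dotted name.

    caller_name is the importing module's __name__; level is the number
    of leading dots in the import statement.
    """
    parts = caller_name.split(".")
    if level >= len(parts):
        # Nothing would be left to join onto, e.g. caller_name == "__main__".
        raise ImportError("attempted relative import in non-package")
    base = parts[:-level]          # slice off the last `level` components
    return ".".join(base + [name])


resolved = resolve_name("spam", "bacon.ham.beans", 2)

try:
    resolve_name("spam", "__main__", 1)
    main_resolves = True
except ImportError:
    main_resolves = False
```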


From lists at cheimes.de  Fri Apr 20 19:32:45 2007
From: lists at cheimes.de (Christian Heimes)
Date: Fri, 20 Apr 2007 19:32:45 +0200
Subject: [Python-ideas] PEP for executing a module in a package
 containing relative imports
In-Reply-To: <bbaeab100704201015k121e240fje731b75950377a41@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>	<f0ag3v$egs$1@sea.gmane.org>
	<bbaeab100704201015k121e240fje731b75950377a41@mail.gmail.com>
Message-ID: <f0atfu$rsl$1@sea.gmane.org>

Brett Cannon schrieb:
> True, but it does introduce an import for a module that may never be
> used if the module is not being executed.  That kind of sucks for
> minor performance reasons.

Yeah but sys is used by a lot of modules. Probably 95%+ of executable
modules are either using sys directly to access sys.argv or os which
imports sys. Also sys is a builtin module which is imported ridiculously
fast. I assume that the speed penalty for scripts that don't use sys is
minor.
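
A quick way to check both claims on a given interpreter, as a sketch; the
variable names are made up.  On CPython ``sys`` is compiled into the
interpreter, and a repeated ``import sys`` is served from the
``sys.modules`` cache.

```python
import sys

sys_is_builtin = "sys" in sys.builtin_module_names   # compiled in, no file I/O
already_cached = "sys" in sys.modules                # later imports hit the cache
```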

In my humble opinion it sucks less to force the import of a core module
that is already used by most modules than to tie up valuable developer
time with the __main__ approach. I think it's a Pythonic solution as well. :)

Christian



From jcarlson at uci.edu  Fri Apr 20 19:38:37 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 20 Apr 2007 10:38:37 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704201022j2b8a2035k41424bd4b01e9065@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<bbaeab100704201022j2b8a2035k41424bd4b01e9065@mail.gmail.com>
Message-ID: <20070420103528.6380.JCARLSON@uci.edu>


"Brett Cannon" <brett at python.org> wrote:
> I realized two things that I didn't mention in the PEP.
> 
> One is that Python will have to infer the proper package name for a
> module being executed.  Currently Python only knows the name of a
> module because you asked for something and it tries to find a module
> that fits that request.  But what is being proposed here has to figure
> out what you would have asked for in order for the import to happen.
> So I need to spell out the algorithm that will need to be used to
> figure out ``python bacon/__init__.py`` is the bacon package.  Using
> the '-m' option solves this as the name is given as an argument.

There's also the rub that if you 'run' the module in /a/b/c/d/e/f.py,
but all a-e are packages, the "proper" semantics may state that you need
to import a/__init__.py, a/b/__init__.py, etc., prior to the execution
of f.py .

Of course the only way that you would know that is if you checked the
paths .../e/, .../d/, etc.

The PEP should probably be changed to state the order of imports in a
case similar to this, and whether or not it bothers to check ancestor
paths for package information.
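
One possible reading of the ancestor check, as a rough sketch only: walk
upward from the file while each directory contains an __init__.py, then
import the packages from the outermost inward.  The helper names
infer_module_name and import_order are hypothetical, and the PEP may
well settle on a different algorithm.

```python
import os
import tempfile


def infer_module_name(path):
    """Guess the dotted name for `path` by walking up through every
    ancestor directory that contains an __init__.py."""
    directory, filename = os.path.split(os.path.abspath(path))
    stem = os.path.splitext(filename)[0]
    # bacon/__init__.py names the package itself, not a submodule.
    parts = [] if stem == "__init__" else [stem]
    while os.path.isfile(os.path.join(directory, "__init__.py")):
        directory, package = os.path.split(directory)
        if not package:  # reached the filesystem root
            break
        parts.append(package)
    parts.reverse()
    return ".".join(parts)


def import_order(dotted_name):
    """List the names that would be imported, outermost package first."""
    pieces = dotted_name.split(".")
    return [".".join(pieces[:i]) for i in range(1, len(pieces) + 1)]


# Build a throwaway tree a/b/f.py where both a and b are packages.
with tempfile.TemporaryDirectory() as root:
    pkg = os.path.join(root, "a", "b")
    os.makedirs(pkg)
    for d in (os.path.join(root, "a"), pkg):
        open(os.path.join(d, "__init__.py"), "w").close()
    target = os.path.join(pkg, "f.py")
    open(target, "w").close()
    name = infer_module_name(target)
    order = import_order(name)

print(name, order)
```

The sketch stops at the first directory without an __init__.py, which is
one answer to the "does it check ancestor paths" question; stopping
earlier or later are both conceivable.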


 - Josiah



From brett at python.org  Fri Apr 20 19:46:46 2007
From: brett at python.org (Brett Cannon)
Date: Fri, 20 Apr 2007 10:46:46 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <20070420103528.6380.JCARLSON@uci.edu>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<bbaeab100704201022j2b8a2035k41424bd4b01e9065@mail.gmail.com>
	<20070420103528.6380.JCARLSON@uci.edu>
Message-ID: <bbaeab100704201046l5c75512bp3f7fcfe851c14efa@mail.gmail.com>

On 4/20/07, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "Brett Cannon" <brett at python.org> wrote:
> > I realized two things that I didn't mention in the PEP.
> >
> > One is that Python will have to infer the proper package name for a
> > module being executed.  Currently Python only knows the name of a
> > module because you asked for something and it tries to find a module
> > that fits that request.  But what is being proposed here has to figure
> > out what you would have asked for in order for the import to happen.
> > So I need to spell out the algorithm that will need to be used to
> > figure out ``python bacon/__init__.py`` is the bacon package.  Using
> > the '-m' option solves this as the name is given as an argument.
>
> There's also the rub that if you 'run' the module in /a/b/c/d/e/f.py,
> but all a-e are packages, the "proper" semantics may state that you need
> to import a/__init__.py, a/b/__init__.py, etc., prior to the execution
> of f.py .
>
> Of course the only way that you would know that is if you checked the
> paths .../e/, .../d/, etc.
>
> The PEP should probably be changed to state the order of imports in a
> case similar to this, and whether or not it bothers to check ancestor
> paths for package information.
>

Good point.  It's one of the ways my import implementation differs
from the current one as I just import the parent up to the requested
module while the current implementation throws an exception.
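
For comparison, a small sketch of the "import the parents first"
behaviour using a stdlib package on a current Python 3.  The helper name
parents_loaded is made up, and this only observes sys.modules; it does
not say anything about how the implementation discussed here is written.

```python
import importlib
import sys


def parents_loaded(dotted_name):
    """Import `dotted_name`, then report which of its ancestor
    packages ended up registered in sys.modules."""
    importlib.import_module(dotted_name)
    pieces = dotted_name.split(".")
    ancestors = [".".join(pieces[:i]) for i in range(1, len(pieces))]
    return {name: name in sys.modules for name in ancestors}


# Importing the leaf module pulls in email and email.mime as well.
print(parents_loaded("email.mime.text"))
```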

-Brett


From adam at atlas.st  Fri Apr 20 19:48:36 2007
From: adam at atlas.st (Adam Atlas)
Date: Fri, 20 Apr 2007 13:48:36 -0400
Subject: [Python-ideas] PEP for executing a module in a package
 containing relative imports
In-Reply-To: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
Message-ID: <727CDD37-8C12-490F-93ED-BDBFF5F0E3D3@atlas.st>


On 19 Apr 2007, at 23.38, Brett Cannon wrote:
> Open Issues
> ===========
>
> A counter-proposal to introducing the ``__main__`` attribute on
> modules was to introduce a built-in with the same name.  The value of
> the built-in would be the name of the module being executed (just like
> the proposed ``sys.main``).  This would lead to a new idiom of::
>
>   if __name__ == __main__:
>       ...

I like that one. But one thing I've always thought would be handy is  
a builtin (maybe __this__?) pointing to the current module object  
itself (instead of its name). Any chance of that happening? In that  
case, __main__ could globally point to the main module instead of its  
name. The idiom would then be "if __this__ is __main__: ...". I think
that reads pretty well: "If this is [the] main [module, then ...]."
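
Something close to this can be approximated today without new builtins,
as a sketch: sys.modules maps names to module objects, and the script
being run is registered there under '__main__'.  The helper names
this_module and is_main_module are hypothetical stand-ins for the
proposed __this__ and __main__.

```python
import sys


def this_module(name):
    """Return the module object registered under `name`.
    Callers would pass their own __name__."""
    return sys.modules[name]


def is_main_module(name):
    """Identity test in the spirit of 'if __this__ is __main__'."""
    return this_module(name) is sys.modules["__main__"]
```

Inside a module this would be used as ``if is_main_module(__name__):``,
which is of course wordier than the proposed spelling.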


From tjreedy at udel.edu  Fri Apr 20 20:13:44 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 20 Apr 2007 14:13:44 -0400
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
Message-ID: <f0avsc$4o5$1@sea.gmane.org>


"Brett Cannon" <brett at python.org> wrote in 
message news:bbaeab100704192038v110b053eqfdcf49f613302f8 at mail.gmail.com...
| Let me know what you think.  I especially want to hear which proposal
| people prefer; the one in the PEP or the one in the Open Issues section.

This PEP has two proposals, which I think should be better separated.

1. Leave __name__ alone (without the '__main__' hack) so that relative 
imports work when executing scripts within packages.  My comment here is 
that I am fuzzy on the difference between __name__ and __file__ and why we 
would then need both.

2. Fix the 'main' self-knowledge problem introduced by fix 1.  The 
'counter-proposal' is only an alternative to this second proposal, as it 
agrees with the first.  I had the same idea as Christian as a third 
alternative, but as a user would prefer the simplest invocation possible. 
I agree with Jim that multiple alternatives should be listed.

I think the '__main__' hack was both elegant and a wart, and agree that we 
should seriously consider a pair of coupled fixes.

|  Plus I wouldn't mind suggestions on a title for this PEP. =)

Package scripts, relative imports, and main identification.

Terry Jan Reedy





From grosser.meister.morti at gmx.net  Fri Apr 20 20:31:50 2007
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Fri, 20 Apr 2007 20:31:50 +0200
Subject: [Python-ideas] ordered dict
In-Reply-To: <20070420091127.637D.JCARLSON@uci.edu>
References: <4628DF1F.3060803@gmx.net> <20070420091127.637D.JCARLSON@uci.edu>
Message-ID: <46290716.6080504@gmx.net>

Josiah Carlson wrote:
> 
> However, the problem with choosing to use an AVL (Red-Black, 2-3, etc.)
> tree is deciding semantics.  Do you allow duplicate keys? 

Does dict? no. so no.

> Do you allow
> insertion and removal by position?

Does dict? no. so no.

> Do you allow the fetching of the
> key/value at position X?

Does dict? no. so no.

> Do you allow the fetching of the position for
> key X?

Does dict? no. so no.

> Insertion before/after (bisect_left, bisect_right equivalents).
> Etcetera.
> 

Why should all this be relevant? It just has to be some kind of
relation between a key and a value, and the keys should be accessible
in a sorted way (and you should not have to sort them every time).
So it would be possible to slice such a container.

> In many cases, using a sorted list gets you what you want, is almost as
> fast, and has the benefit of using less memory.
> 

AFAIK a doubly linked list uses the same amount of memory as a
(very) simple AVL tree:

struct tree_node {
	struct tree_node *left;
	struct tree_node *right;
	void *data;
};

struct list_node {
	struct list_node *prev;
	struct list_node *next;
	void *data;
};

> 
>  - Josiah
> 




From tjreedy at udel.edu  Fri Apr 20 20:34:58 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 20 Apr 2007 14:34:58 -0400
Subject: [Python-ideas] ordered dict
References: <4628DF1F.3060803@gmx.net>
Message-ID: <f0b146$94h$1@sea.gmane.org>


"Mathias Panzenböck" <grosser.meister.morti at gmx.net> 
wrote in message news:4628DF1F.3060803 at gmx.net...
| Some kind of ordered dictionary would be nice to have in the
| standard library.

This has come up frequently, with 'ordered' having two quite different 
meanings.

1. Order of entry into the dictionary (for use with class definitions, for 
instance, though don't ask me why!).  When a given key is entered just once, 
this is relatively easy: just append to a subsidiary list.  I believe this 
is being at least considered for 3.0.

2. Order in the sorting or collation sense, which I presume you mean.  To 
reduce confusion, call this a sorted dictionary, as others have done.

Regardless, this has the problem that potential keys are not always 
comparable.  This will become worse when most cross-type comparisons are 
disallowed in 3.0.  So perhaps the __init__ method should require a tuple 
of allowed key types.

| e.g. an AVL tree or something like that.
...
| An alternative would be just to sort the keys of a dict but
| that's O( n log n ) for each sort. Depending on what's the more
| often occurring case (lookup, insert, get key-range, etc.)
| another kind of dict object would make sense.
|
| What do you think?

If not already present in PyPI, someone could code an implementation and 
add it there.  When such has been tested and achieved enough usage, then it 
might be proposed for addition to the collections module.
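
The "subsidiary list" idea from point 1 can be sketched as follows.
This is only an illustration of the mechanism, with a made-up class name
EntryOrderDict; it is not the implementation under consideration for
3.0, and deletion here is a deliberately naive O(n).

```python
class EntryOrderDict:
    """Mapping that remembers the order in which keys were first entered."""

    def __init__(self):
        self._data = {}
        self._order = []  # the subsidiary list of keys

    def __setitem__(self, key, value):
        # Only the first entry of a key affects the recorded order.
        if key not in self._data:
            self._order.append(key)
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]
        self._order.remove(key)  # linear scan of the key list

    def keys(self):
        return list(self._order)

    def items(self):
        return [(key, self._data[key]) for key in self._order]
```

Re-assigning an existing key keeps its original position, which is the
"entered just once" case described above; what should happen on
re-entry after deletion is a separate semantic choice.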

Terry Jan Reedy





From jcarlson at uci.edu  Fri Apr 20 21:38:20 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 20 Apr 2007 12:38:20 -0700
Subject: [Python-ideas] ordered dict
In-Reply-To: <46290716.6080504@gmx.net>
References: <20070420091127.637D.JCARLSON@uci.edu> <46290716.6080504@gmx.net>
Message-ID: <20070420121603.6387.JCARLSON@uci.edu>


Mathias Panzenböck <grosser.meister.morti at gmx.net> wrote:
> 
> Josiah Carlson wrote:
> > 
> > However, the problem with choosing to use an AVL (Red-Black, 2-3, etc.)
> > tree is deciding semantics.  Do you allow duplicate keys? 
> 
> Does dict? no. so no.
> 
> > Do you allow
> > insertion and removal by position?
> 
> Does dict? no. so no.
> 
> > Do you allow the fetching of the
> > key/value at position X?
> 
> Does dict? no. so no.
> 
> > Do you allow the fetching of the position for
> > key X?
> 
> Does dict? no. so no.
> 
> > Insertion before/after (bisect_left, bisect_right equivalents).
> > Etcetera.
> > 
> 
> Why should all this be relevant?

Very few use-cases of trees involve an ordered key/value dictionary. In
90% of the cases where I have needed (and implemented) trees, the
use-case was one of the following: sorted keys (but no values), no keys
but fast insertion of values based on position, sorted keys indexed by
position or key with (and without) values, etc.

Please also understand that the semantics of Python's dictionary is a
function of its implementation as an open-addressed hash table.  It's
useful for 95% of use-cases, but among the remaining 5% (which includes
the use-case you have in mind for the structure), there is a huge
variety of just as significant uses that shouldn't be discounted.


> > In many cases, using a sorted list gets you what you want, is almost as
> > fast, and has the benefit of using less memory.
> > 
> 
> AFAIK a doubly linked list uses the same amount of memory as a
> (very) simple AVL tree:

Python lists aren't linked lists.  If you didn't know this, then you
don't know enough about the underlying implementation to make comments
about what should or should not be available in the base language.

 - Josiah



From talin at acm.org  Fri Apr 20 21:50:39 2007
From: talin at acm.org (Talin)
Date: Fri, 20 Apr 2007 12:50:39 -0700
Subject: [Python-ideas] ordered dict
In-Reply-To: <20070420091127.637D.JCARLSON@uci.edu>
References: <4628DF1F.3060803@gmx.net> <20070420091127.637D.JCARLSON@uci.edu>
Message-ID: <4629198F.70200@acm.org>

Josiah Carlson wrote:
> Mathias Panzenböck <grosser.meister.morti at gmx.net> wrote:
>> Some kind of ordered dictionary would be nice to have in the
>> standard library. e.g. an AVL tree or something like that.
>> It would be nice so we can do things like that:
>>
>> for value in tree[:end_key]:
>> 	do_something_with(value)
>>
>> del tree[:end_key]
>>
>>
>> An alternative would be just to sort the keys of a dict but
>> that's O( n log n ) for each sort. Depending on what's the more
>> often occurring case (lookup, insert, get key-range, etc.)
>> another kind of dict object would make sense.
>>
>> What do you think?
> 
> This has been brought up many times.  The general consensus has been
> 'you don't get what you think you get'.
> 
>     >>> u'a' < 'b' < () < u'a'
>     True
> 
> That is to say, there isn't a total ordering on objects that would make
> sense as a sorted key,value dictionary.  In Python 3.0, objects that
> don't make sense to compare won't be comparable, so list.sort() and/or
> an AVL tree may make sense again.
> 
> However, the problem with choosing to use an AVL (Red-Black, 2-3, etc.)
> tree is deciding semantics.  Do you allow duplicate keys?  Do you allow
> insertion and removal by position?  Do you allow the fetching of the
> key/value at position X?  Do you allow the fetching of the position for
> key X?  Insertion before/after (bisect_left, bisect_right equivalents).
> Etcetera.

I generally agree. I also think that the term "ordered dictionary" ought 
to be avoided.

On the one hand, I have no particular objection to someone creating an 
implementation of RB trees, B+-trees, PATRICIA radix trees and so on - 
in fact, these might be very useful things to have as standard 
collection classes.

However, 'dict' has a whole set of semantic baggage that goes along with 
it that may or may not apply to these other container types; and 
similarly, these other container types may have operations and semantics 
that don't correspond to the standard Python dictionary. One expects to 
be able to do certain things with an RB-tree that are either disallowed 
or very inefficient with a regular dict, and the converse is true as 
well. You give a number of examples such as fetching the position for a 
given key.

So my feeling is - let trees be trees, and dicts be dicts, and don't 
attempt to conflate the two. Otherwise, you end up with what I like to 
call the "overfactoring" anti-pattern - that is, attempting to generalize 
and unify two disparate systems that have different purposes and design 
intents into a single uniform interface.

-- Talin


From steven.bethard at gmail.com  Sat Apr 21 00:08:16 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri, 20 Apr 2007 16:08:16 -0600
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704201009je396733n3261cd7683f2ed58@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<d11dcfba0704192256p74018fcey4d2c7827f7696f88@mail.gmail.com>
	<bbaeab100704201009je396733n3261cd7683f2ed58@mail.gmail.com>
Message-ID: <d11dcfba0704201508x7139a1a4ma932e2b363631222@mail.gmail.com>

On 4/20/07, Brett Cannon <brett at python.org> wrote:
> On 4/19/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> > On 4/19/07, Brett Cannon <brett at python.org> wrote:
> > > Transition Plan
> > > ===============
> > >
> > > Using this solution will not work directly in Python 2.6.  Code is
> > > dependent upon the semantics of having ``__name__`` set to
> > > ``'__main__'``.  There is also the issue of pre-existing global
> > > variables in a module named ``__main__``.
> >
> > Could you explain a bit why __main__ couldn't be inserted into modules
> > before the module is actually executed? E.g. something like::
> >
> >     >>> module_text = '''\
> >     ... __main__ = 'foo'
> >     ... print __main__
> >     ... '''
> >     >>> import new
> >     >>> mod = new.module('mod')
> >     >>> mod.__main__ = True
> >     >>> exec module_text in mod.__dict__
> >     foo
> >     >>> mod.__main__
> >     'foo'
> >
> > I would have thought that if Python inserted __main__ before any of
> > the module contents got exec'd, it would be backwards compatible
> > because any use of __main__ would just overwrite the default one.
>
> That's right, and that is the problem.  That would mean if __main__
> was false but then overwritten by a function or something, it suddenly
> became true.  It isn't a problem in terms of whether the code will
> run, but whether the expected semantics will occur.

Sure, but I don't see how it's much different from anyone who writes::

    list = [foo, bar, baz]

and then later wonders why::

    list(obj)

gives a ``TypeError: 'list' object is not callable``.

If someone doesn't understand that the __main__ they defined at the
beginning of a module is going to be the same __main__ they use at the
end of the module, they're going to need to go do some reading about
how name binding works in Python anyway.
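
The quoted session above uses Python 2 syntax (the new module and the
exec statement).  A rough Python 3 equivalent, assuming nothing beyond
the stdlib types module, shows the same thing: a pre-seeded __main__ is
just an ordinary global that the module body is free to rebind.  The
helper name run_with_preset_main is made up for the illustration.

```python
import types

MODULE_TEXT = "__main__ = 'foo'\nseen = __main__\n"


def run_with_preset_main(source, preset):
    """Create a module, seed __main__ before execution, then run the
    source in the module's namespace."""
    mod = types.ModuleType("mod")
    mod.__main__ = preset
    exec(source, mod.__dict__)
    return mod


mod = run_with_preset_main(MODULE_TEXT, True)
# The module's own assignment wins over the pre-seeded value.
print(mod.seen, mod.__main__)
```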

Of course, I definitely think it would be valuable to have a Py3K
deprecation warning to help users identify when they've made a silly
mistake like this.

(Note that the counter-proposal has the same problem, so this needs to
be resolved regardless of which approach gets taken.)

I'd really like there to be a way to write Python 3.0 compatible code
in Python 2.6 without having to run through 2to3. I think it's clear
that __main__ can be defined (at module-level or in the builtins)
without introducing any backwards compatibility problems right? Anyone
that doesn't want to use the Python 3.0 idiom can still write ``if
__name__ == '__main__'`` and it will continue to work in Python 2.X.
And anyone who does want to use the Python 3.0 idiom is probably using
the Py3K flag anyway, so if they make a stupid mistake, it'll get
caught pretty quickly.

Steve
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From steven.bethard at gmail.com  Sat Apr 21 00:14:44 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri, 20 Apr 2007 16:14:44 -0600
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <f0ag3v$egs$1@sea.gmane.org>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<f0ag3v$egs$1@sea.gmane.org>
Message-ID: <d11dcfba0704201514u71a6d27dg331737dbb9024a76@mail.gmail.com>

On 4/20/07, Christian Heimes <lists at cheimes.de> wrote:
> What about
>
>     import sys
>     if __name__ == sys.main:
>         ...
>
> You won't have to introduce a new global module var __main__ and it's
> easy to understand for newbies and experienced developers. The code is
> only executed when the name of the current module is equal to the
> executed main module (sys.main).

But you have to understand a few things to understand why this works.
You have to know that __name__ is the name of the module, and that if
you want to find out the name of the main module, you need to look at
sys.main.  With the idiom::

    if __main__:

all you need to know is that the main module has __main__ set to true.

> IMO it's much less PIT...B than introducing __main__.

Could you elaborate? Do you think it would be hard to introduce
another module-level attribute (like we already do for __name__)? Or
do you think that the code would be hard to maintain? Or something
else...?

Steve
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From aahz at pythoncraft.com  Sat Apr 21 00:30:19 2007
From: aahz at pythoncraft.com (Aahz)
Date: Fri, 20 Apr 2007 15:30:19 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704201015k121e240fje731b75950377a41@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<f0ag3v$egs$1@sea.gmane.org>
	<bbaeab100704201015k121e240fje731b75950377a41@mail.gmail.com>
Message-ID: <20070420223019.GA12929@panix.com>

On Fri, Apr 20, 2007, Brett Cannon wrote:
> On 4/20/07, Christian Heimes <lists at cheimes.de> wrote:
>> 
>> What about
>>
>>     import sys
>>     if __name__ == sys.main:
>>         ...
>>
>> You won't have to introduce a new global module var __main__ and it's
>> easy to understand for newbies and experienced developers. The code is
>> only executed when the name of the current module is equal to the
>> executed main module (sys.main).
>> IMO it's much less PIT...B than introducing __main__.
> 
> True, but it does introduce an import for a module that may never be
> used if the module is not being executed.  That kind of sucks for
> minor performance reasons.
> 
> But what do other people think?

Looks good to me!  sys is essentially guaranteed to be imported, so
you're only wasting a few cycles to bring it into the module namespace.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

Help a hearing-impaired person: http://rule6.info/hearing.html


From brett at python.org  Sat Apr 21 03:35:42 2007
From: brett at python.org (Brett Cannon)
Date: Fri, 20 Apr 2007 18:35:42 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <d11dcfba0704201508x7139a1a4ma932e2b363631222@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<d11dcfba0704192256p74018fcey4d2c7827f7696f88@mail.gmail.com>
	<bbaeab100704201009je396733n3261cd7683f2ed58@mail.gmail.com>
	<d11dcfba0704201508x7139a1a4ma932e2b363631222@mail.gmail.com>
Message-ID: <bbaeab100704201835vab6e94sfb7e2fa0cbf03a5@mail.gmail.com>

On 4/20/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 4/20/07, Brett Cannon <brett at python.org> wrote:
> > On 4/19/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> > > On 4/19/07, Brett Cannon <brett at python.org> wrote:
> > > > Transition Plan
> > > > ===============
> > > >
> > > > Using this solution will not work directly in Python 2.6.  Code is
> > > > dependent upon the semantics of having ``__name__`` set to
> > > > ``'__main__'``.  There is also the issue of pre-existing global
> > > > variables in a module named ``__main__``.
> > >
> > > Could you explain a bit why __main__ couldn't be inserted into modules
> > > before the module is actually executed? E.g. something like::
> > >
> > >     >>> module_text = '''\
> > >     ... __main__ = 'foo'
> > >     ... print __main__
> > >     ... '''
> > >     >>> import new
> > >     >>> mod = new.module('mod')
> > >     >>> mod.__main__ = True
> > >     >>> exec module_text in mod.__dict__
> > >     foo
> > >     >>> mod.__main__
> > >     'foo'
> > >
> > > I would have thought that if Python inserted __main__ before any of
> > > the module contents got exec'd, it would be backwards compatible
> > > because any use of __main__ would just overwrite the default one.
> >
> > That's right, and that is the problem.  That would mean if __main__
> > was false but then overwritten by a function or something, it suddenly
> > became true.  It isn't a problem in terms of whether the code will
> > run, but whether the expected semantics will occur.
>
> Sure, but I don't see how it's much different from anyone who writes::
>
>     list = [foo, bar, baz]
>
> and then later wonders why::
>
>     list(obj)
>
> gives a ``TypeError: 'list' object is not callable``.
>

Exactly.  It's just that 'list' was known about when the code was
written while __main__ was not.

> If someone doesn't understand that the __main__ they defined at the
> beginning of a module is going to be the same __main__ they use at the
> end of the module, they're going to need to go do some reading about
> how name binding works in Python anyway.
>
> Of course, I definitely think it would be valuable to have a Py3K
> deprecation warning to help users identify when they've made a silly
> mistake like this.
>
> (Note that the counter-proposal has the same problem, so this needs to
> be resolved regardless of which approach gets taken.)
>

Yep.

> I'd really like there to be a way to write Python 3.0 compatible code
> in Python 2.6 without having to run through 2to3. I think it's clear
> that __main__ can be defined (at module-level or in the builtins)
> without introducing any backwards compatibility problems right? Anyone
> that doesn't want to use the Python 3.0 idiom can still write ``if
> __name__ == '__main__'`` and it will continue to work in Python 2.X.
> And anyone who does want to use the Python 3.0 idiom is probably using
> the Py3K flag anyway, so if they make a stupid mistake, it'll get
> caught pretty quickly.

Exactly.  Python 2.6 will still have __name__ set to '__main__', but
also have __main__ set.  Python 3.0 will not change __name__ at all.
This is why the PEP is a Py3K PEP and not a 2.6 PEP.

-Brett


From grosser.meister.morti at gmx.net  Sat Apr 21 03:37:52 2007
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Sat, 21 Apr 2007 03:37:52 +0200
Subject: [Python-ideas] ordered dict
In-Reply-To: <4628DF1F.3060803@gmx.net>
References: <4628DF1F.3060803@gmx.net>
Message-ID: <46296AF0.7050608@gmx.net>

Ok, now. Forget all I said. Just a short question:
When you have to store values associated with keys, and the
keys have to be accessible in a sorted manner, what container
type would you use? What data structure would you implement?
(I just thought an AVL tree would have been a good choice.)

Thanks,
panzi


From jcarlson at uci.edu  Sat Apr 21 04:46:17 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 20 Apr 2007 19:46:17 -0700
Subject: [Python-ideas] ordered dict
In-Reply-To: <46296AF0.7050608@gmx.net>
References: <4628DF1F.3060803@gmx.net> <46296AF0.7050608@gmx.net>
Message-ID: <20070420193620.6399.JCARLSON@uci.edu>


Mathias Panzenböck <grosser.meister.morti at gmx.net> wrote:
> Ok, now. Forget all I said. Just a short question:
> When you have to store values associated with keys, and the
> keys have to be accessible in a sorted manner, what container
> type would you use? What data structure would you implement?
> (I just thought an AVL tree would have been a good choice.)

If you want to use only things that are available in base Python, use a
list and the bisect module.

If you need O(log n) insertion and removal, then an AVL/Red-Black/2-3
tree with the semantics you described would also work. (I think there is
both an AVL and Red-Black tree implementation in the Python package
index [1])

If you only need to concern yourself with ordering every once in a while,
then x = dct.items();x.sort() works reasonably well.

Sometimes a "pair heap" can get you what you are looking for [2].


Data structure choices are tricky.  It is usually better to describe the
problem and one's approach (why you choose to use a particular algorithm
and structure), rather than strictly asking "where can I find data
structure X".
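
As a sketch of the "list and the bisect module" suggestion: keep the
keys in a sorted list, keep the values in a plain dict, and use bisect
to find positions.  The class name SortedKeyMap and the method
items_below are hypothetical.  Keys are assumed to be hashable and
mutually comparable, and insertion is O(n) because of the list shift.

```python
import bisect


class SortedKeyMap:
    """Key/value store whose keys are always available in sorted order."""

    def __init__(self):
        self._keys = []    # sorted list of keys
        self._values = {}  # key -> value

    def __setitem__(self, key, value):
        if key not in self._values:
            # O(log n) search, O(n) shift of the underlying list.
            bisect.insort(self._keys, key)
        self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]

    def __delitem__(self, key):
        del self._values[key]
        del self._keys[bisect.bisect_left(self._keys, key)]

    def keys(self):
        return list(self._keys)

    def items_below(self, end_key):
        """Return (key, value) pairs for all keys < end_key, in order."""
        stop = bisect.bisect_left(self._keys, end_key)
        return [(k, self._values[k]) for k in self._keys[:stop]]
```

items_below plays the role of the ``tree[:end_key]`` slice from the
original request.  Whether this is fast enough depends on how often
insertion happens relative to lookup and range queries, which is the
reason for describing the actual problem first.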


 - Josiah

[1] http://www.python.org/pypi/
[2] http://mail.python.org/pipermail/python-dev/2006-November/069845.html



From bjourne at gmail.com  Sat Apr 21 06:28:18 2007
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Sat, 21 Apr 2007 04:28:18 +0000
Subject: [Python-ideas] ordered dict
In-Reply-To: <f0b146$94h$1@sea.gmane.org>
References: <4628DF1F.3060803@gmx.net> <f0b146$94h$1@sea.gmane.org>
Message-ID: <740c3aec0704202128g6537c5bfv94c0f60a5d883d76@mail.gmail.com>

On 4/20/07, Terry Reedy <tjreedy at udel.edu> wrote:
> 2. Order in the sorting or collation sense, which I presume you mean.  To
> reduce confusion, call this a sorted dictionary, as others have done.
>
> Regardless, this has the problem that potential keys are not always
> comparable.  This will become worse when most cross-type comparisons are
> disallowed in 3.0.  So perhaps the __init__ method should require a tuple
> of allowed key types.

>>> l = [(), "moo", 123, []]
>>> l.sort()
>>> l
[123, [], 'moo', ()]

If it is not a problem for lists it is not a problem for ordered dictionaries.

> If not already present in PyPI, someone could code an implementation and
> add it there.  When such has been tested and achieved enough usage, then it
> might be proposed for addition to the collections module.

And is that how the ordered dict implementation currently being
considered for Python 3.0 got into Python?

I find it amusing that over the years people have argued against
having an ordered dict in Python. But now, when one is considered,
only THAT version with THOSE semantics, is good. The rest should go to
PyPI.


-- 
mvh Björn


From steven.bethard at gmail.com  Sat Apr 21 07:19:54 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri, 20 Apr 2007 23:19:54 -0600
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704201835vab6e94sfb7e2fa0cbf03a5@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<d11dcfba0704192256p74018fcey4d2c7827f7696f88@mail.gmail.com>
	<bbaeab100704201009je396733n3261cd7683f2ed58@mail.gmail.com>
	<d11dcfba0704201508x7139a1a4ma932e2b363631222@mail.gmail.com>
	<bbaeab100704201835vab6e94sfb7e2fa0cbf03a5@mail.gmail.com>
Message-ID: <d11dcfba0704202219h60d16fd9sc99bbaa031e43ad6@mail.gmail.com>

On 4/20/07, Brett Cannon <brett at python.org> wrote:
> Exactly.  Python 2.6 will still have __name__ set to '__main__', but
> also have __main__ set.  Python 3.0 will not change __name__ at all.

That should be Python 3.0 will not change __main__ at all, right?
Because __name__ is going to change from being "__main__" in the main
module to being the actual module name in Python 3.0, right?

Assuming that's right, I think it was unclear to me that you wanted to
add __main__ to Python 2.x. Probably changing:
    First, a Py3K deprecation warning will be raised...
to:
    First, each module will gain a __main__ attribute and a Py3K
    deprecation warning will be raised...
would make the intent clearer.


Steve
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From brett at python.org  Sat Apr 21 08:18:20 2007
From: brett at python.org (Brett Cannon)
Date: Fri, 20 Apr 2007 23:18:20 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <d11dcfba0704202219h60d16fd9sc99bbaa031e43ad6@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<d11dcfba0704192256p74018fcey4d2c7827f7696f88@mail.gmail.com>
	<bbaeab100704201009je396733n3261cd7683f2ed58@mail.gmail.com>
	<d11dcfba0704201508x7139a1a4ma932e2b363631222@mail.gmail.com>
	<bbaeab100704201835vab6e94sfb7e2fa0cbf03a5@mail.gmail.com>
	<d11dcfba0704202219h60d16fd9sc99bbaa031e43ad6@mail.gmail.com>
Message-ID: <bbaeab100704202318o6f03fta77b11b5522b4164@mail.gmail.com>

On 4/20/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 4/20/07, Brett Cannon <brett at python.org> wrote:
> > Exactly.  Python 2.6 will still have __name__ set to '__main__', but
> > also have __main__ set.  Python 3.0 will not change __name__ at all.
>
> That should be Python 3.0 will not change __main__ at all, right?
> Because __name__ is going to change from being "__main__" in the main
> module to being the actual module name in Python 3.0, right?

Yes.

>
> Assuming that's right, I think it was unclear to me that you wanted to
> add __main__ to Python 2.x. Probably changing:
>     First, a Py3K deprecation warning will be raised...
> to:
>     First, each module will gain a __main__ attribute and a Py3K
>     deprecation warning will be raised...
> would make the intent clearer.
>

Yes, __main__ will be defined in 2.6 and a warning raised if someone
defines __main__ later on in the module.

-Brett


From jcarlson at uci.edu  Sat Apr 21 09:10:03 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 21 Apr 2007 00:10:03 -0700
Subject: [Python-ideas] ordered dict
In-Reply-To: <740c3aec0704202128g6537c5bfv94c0f60a5d883d76@mail.gmail.com>
References: <f0b146$94h$1@sea.gmane.org>
	<740c3aec0704202128g6537c5bfv94c0f60a5d883d76@mail.gmail.com>
Message-ID: <20070421000102.63A5.JCARLSON@uci.edu>


"BJörn Lindqvist" <bjourne at gmail.com> wrote:
> 
> On 4/20/07, Terry Reedy <tjreedy at udel.edu> wrote:
> > 2. Order in the sorting or collation sense, which I presume you mean.  To
> > reduce confusion, call this a sorted dictionary, as others have done.
> >
> > Regardless, this has the problem that potential keys are not always
> > comparable.  This will become worse when most cross-type comparisons are
> > disallowed in 3.0.  So perhaps the __init__ method should require a tuple
> > of allowed key types.
> 
> >>> l = [(), "moo", 123, []]
> >>> l.sort()
> >>> l
> [123, [], 'moo', ()]
> 
> If it is not a problem for lists it is not a problem for ordered dictionaries.

It's about a total ordering.  Without a total ordering, you won't
necessarily be able to *find* an object even if it is in there.

>>> import random
>>> a = ['b', (), u'a']
>>> a.sort()
>>> a
['b', (), u'a']
>>> random.shuffle(a)
>>> a.sort()
>>> a
[u'a', 'b', ()]


Also, in 3.0, objects will only be orderable if they are of compatible
type.  str and tuple are not compatible, so will raise an exception when
something like "" < () is performed.
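
For reference, a minimal sketch of the behaviour described above as it
appears under a Python 3 interpreter.  The helper name try_sort is made up
for illustration; it only wraps the builtin sorted() so the failure can be
inspected instead of aborting the script.

```python
def try_sort(items):
    """Return the sorted list, or the TypeError raised by an unorderable pair."""
    try:
        return sorted(items)
    except TypeError as exc:
        return exc

# Keys of one type still sort as usual.
assert try_sort(["b", "a"]) == ["a", "b"]

# str and tuple cannot be ordered against each other, so the sort fails
# outright instead of producing an order that depends on the input order.
assert isinstance(try_sort(["b", (), "a"]), TypeError)
```

A sorted dictionary built on such comparisons would hit the same error on
insertion, which is why the key types have to be restricted one way or
another.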


> > If not already present in PyPI, someone could code an implementation and
> > add it there.  When such has been tested and achieved enough usage, then it
> > might be proposed for addition to the collections module.
> 
> And that is how the currently considered for Python 3.0 ordered dict
> implementation got into Python?
> 
> I find it amusing that over the years people have argued against
> having an ordered dict in Python. But now, when one is considered,
> only THAT version with THOSE semantics, is good. The rest should go to
> PyPI.

No, the "ordered dict" that is making its way into Python 3.0 is
specifically ordered based on insertion order, and is to make more
reasonable database interfaces like...

class Person(db.table):
    firstname = str
    ...

Its implementation is also a very simple variant of a dictionary, which
isn't the case with any tree implementation.


Further, because there are *so many* possible behaviors for a dictionary
ordered by keys implemented as a tree, picking one (or even a small set
of them) is guaranteed to raise comments of "can't we have one that does
X too?"


 - Josiah



From lists at cheimes.de  Sat Apr 21 16:25:57 2007
From: lists at cheimes.de (Christian Heimes)
Date: Sat, 21 Apr 2007 16:25:57 +0200
Subject: [Python-ideas] PEP for executing a module in a package
 containing relative imports
In-Reply-To: <d11dcfba0704201514u71a6d27dg331737dbb9024a76@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>	<f0ag3v$egs$1@sea.gmane.org>
	<d11dcfba0704201514u71a6d27dg331737dbb9024a76@mail.gmail.com>
Message-ID: <f0d6tm$5n2$1@sea.gmane.org>

Steven Bethard wrote:
> But you have to understand a few things to understand why this works.
> You have to know that __name__ is the name of the module, and that if
> you want to find out the name of the main module, you need to look at
> sys.main.  With the idiom::
> 
>     if __main__:
> 
> all you need to know is that the main module has __main__ set to true.
> 
>> IMO it's much less PIT...B than introducing __main__.
> 
> Could you elaborate? Do you think it would be hard to introduce
> another module-level attribute (like we already do for __name__)? Or
> do you think that the code would be hard to maintain? Or something
> else...?

This is just my humble opinion. I'm new to Python core development.

Well, in my opinion a new module-level variable like __main__ isn't worth 
adding when it is just a boolean flag. With the proposed addition of sys.main 
the same information is available with just a few more characters to type. 
If I recall correctly, Python is trying to get rid of global variables in 
Python 3000.

I don't think it's hard to add - even for me although I know less about 
the Python core. I'm more worried about the side effect when people have 
already used __main__ as a function. The problem is in 2to3.

If you'd like to introduce __main__, why not implement 
http://www.python.org/dev/peps/pep-0299 ? It proposes a __main__(*argv*) 
module-level function that replaces the "if __name__ == '__main__'" 
idiom. The __main__ function follows the example of other programming 
languages like C, C# and Java.

I'm aware of the fact that the PEP was rejected but I think it's worth 
discussing it again.

Christian



From tjreedy at udel.edu  Sat Apr 21 16:59:46 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 21 Apr 2007 10:59:46 -0400
Subject: [Python-ideas] ordered dict
References: <4628DF1F.3060803@gmx.net> <f0b146$94h$1@sea.gmane.org>
	<740c3aec0704202128g6537c5bfv94c0f60a5d883d76@mail.gmail.com>
Message-ID: <f0d8sl$c0h$1@sea.gmane.org>


"BJörn Lindqvist" <bjourne at gmail.com> wrote 
in message 
news:740c3aec0704202128g6537c5bfv94c0f60a5d883d76 at mail.gmail.com...
>On 4/20/07, Terry Reedy <tjreedy at udel.edu> wrote:
>> 2. Order in the sorting or collation sense, which I presume you mean. 
>> To
>> reduce confusion, call this a sorted dictionary, as others have done.

>> Regardless, this has the problem that potential keys are not always 
>> comparable.

Current example:
>>> [1, 1j].sort()

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in -toplevel-
    [1, 1j].sort()
TypeError: no ordering relation is defined for complex numbers

>>  This will become worse when most cross-type comparisons are
>> disallowed in 3.0.

> >>> l = [(), "moo", 123, []]
> >>> l.sort()
> >>> l
> [123, [], 'moo', ()]

Py 3.0 will raise an exception here as these will all be incomparable.

> If it is not a problem for lists it is not a problem for ordered 
> dictionaries.

But it *is* currently a problem for lists that will become much more 
extensive in the future, so it *is* currently a problem for sorted dicts 
that will be much more of a problem in the future.  Hence, sorted dicts 
will have to be restricted to one type or one group of truly comparable 
types.

Terry Jan Reedy





From bjourne at gmail.com  Sat Apr 21 17:17:07 2007
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Sat, 21 Apr 2007 15:17:07 +0000
Subject: [Python-ideas] ordered dict
In-Reply-To: <f0d8sl$c0h$1@sea.gmane.org>
References: <4628DF1F.3060803@gmx.net> <f0b146$94h$1@sea.gmane.org>
	<740c3aec0704202128g6537c5bfv94c0f60a5d883d76@mail.gmail.com>
	<f0d8sl$c0h$1@sea.gmane.org>
Message-ID: <740c3aec0704210817m3e11a9e5lb3325523e2490348@mail.gmail.com>

On 4/21/07, Terry Reedy <tjreedy at udel.edu> wrote:
> But it *is* currently a problem for lists that will become much more
> extensive in the future, so it *is* currently a problem for sorted dicts
> that will be much more of a problem in the future.  Hence, sorted dicts
> will have to be restricted to one type or one group of truly comparable
> types.

Alternatively, you could require a comparator function to be specified
at creation time.
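
A minimal sketch of that idea, under stated assumptions: the class name
SortedDict and its methods are made up for illustration, and a key function
is used here in place of a cmp-style comparator.  Keys must still be
hashable, so the list from the earlier example is left out.

```python
import bisect

class SortedDict:
    """Toy mapping that keeps keys ordered by a sort key chosen at creation."""

    def __init__(self, key=lambda k: k):
        self._key = key
        self._sort_keys = []   # sort keys, kept in ascending order
        self._keys = []        # original keys, parallel to _sort_keys
        self._data = {}

    def __setitem__(self, k, value):
        if k not in self._data:
            sk = self._key(k)
            i = bisect.bisect_left(self._sort_keys, sk)
            self._sort_keys.insert(i, sk)
            self._keys.insert(i, k)
        self._data[k] = value

    def __getitem__(self, k):
        return self._data[k]

    def keys(self):
        return list(self._keys)

# Order mixed-type keys by (type name, repr) so every pair is comparable.
d = SortedDict(key=lambda k: (type(k).__name__, repr(k)))
for k in [(), "moo", 123]:
    d[k] = None
print(d.keys())
```

Whether the ordering chosen by such a function is meaningful is a separate
question; the sketch only shows that the container itself can stay agnostic.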

-- 
mvh Björn


From brett at python.org  Sat Apr 21 19:54:07 2007
From: brett at python.org (Brett Cannon)
Date: Sat, 21 Apr 2007 10:54:07 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <f0d6tm$5n2$1@sea.gmane.org>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<f0ag3v$egs$1@sea.gmane.org>
	<d11dcfba0704201514u71a6d27dg331737dbb9024a76@mail.gmail.com>
	<f0d6tm$5n2$1@sea.gmane.org>
Message-ID: <bbaeab100704211054g4052e9cdob1bc85957a9c6f12@mail.gmail.com>

On 4/21/07, Christian Heimes <lists at cheimes.de> wrote:
[SNIP]

> If you like to introduce __main__ why not implement
> http://www.python.org/dev/peps/pep-0299 ? It proposes a __main__(*argv*)
> module level function that replaced the "if __name__ == '__main__'"
> idiom. The __main__ function follows the example of other programming
> languages like C, C# and Java.

Because I don't like the solution and thus didn't want to do the
footwork for it.  =)

-Brett


From steven.bethard at gmail.com  Sat Apr 21 20:12:57 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 21 Apr 2007 12:12:57 -0600
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <f0d6tm$5n2$1@sea.gmane.org>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<f0ag3v$egs$1@sea.gmane.org>
	<d11dcfba0704201514u71a6d27dg331737dbb9024a76@mail.gmail.com>
	<f0d6tm$5n2$1@sea.gmane.org>
Message-ID: <d11dcfba0704211112o64fa988cnd4b4243bce829152@mail.gmail.com>

On 4/21/07, Christian Heimes <lists at cheimes.de> wrote:
> If you like to introduce __main__ why not implement
> http://www.python.org/dev/peps/pep-0299 ? It proposes a __main__(*argv*)
> module level function that replaced the "if __name__ == '__main__'"
> idiom. The __main__ function follows the example of other programming
> languages like C, C# and Java.

I don't like the __main__ function signature. There are lots of
options, like optparse and argparse_ that are much better than
manually parsing sys.argv as the PEP 299 signature would suggest. And
if there's nothing to be passed to the function, why make it a
function at all? Personally, I thought one of the pluses of the
current status quo (as well as what Brett is proposing here) is that
it *didn't* follow in the (misplaced IMHO) footsteps of languages like
C and Java. I think we're probably best letting dead PEPs lie.

.. _argparse: http://argparse.python-hosting.com/

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From jcarlson at uci.edu  Sat Apr 21 20:29:44 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 21 Apr 2007 11:29:44 -0700
Subject: [Python-ideas] ordered dict
In-Reply-To: <740c3aec0704210817m3e11a9e5lb3325523e2490348@mail.gmail.com>
References: <f0d8sl$c0h$1@sea.gmane.org>
	<740c3aec0704210817m3e11a9e5lb3325523e2490348@mail.gmail.com>
Message-ID: <20070421112051.63AB.JCARLSON@uci.edu>


"BJörn Lindqvist" <bjourne at gmail.com> wrote:
> On 4/21/07, Terry Reedy <tjreedy at udel.edu> wrote:
> > But it *is* currently a problem for lists that will become much more
> > extensive in the future, so it *is* currently a problem for sorted dicts
> > that will be much more of a problem in the future.  Hence, sorted dicts
> > will have to be restricted to one type or one group of truly comparable
> > types.
> 
> Alternatively, you could require a comparator function to be specified
> at creation time.

You could, but that would imply a total ordering on elements that Python
itself is removing because it doesn't make any sense.  Including a list
of 'acceptable' classes as Terry has suggested would work, but would
generally be superfluous.  The moment a user first added an object to
the sorted dictionary is the moment the type of objects that can be
inserted is easily limited (hello Abstract Base Classes PEP!)


 - Josiah



From jh at improva.dk  Sat Apr 21 22:27:36 2007
From: jh at improva.dk (Jacob Holm)
Date: Sat, 21 Apr 2007 16:27:36 -0400
Subject: [Python-ideas] PEP for executing a module in a
 package	containing relative imports
In-Reply-To: <d11dcfba0704211112o64fa988cnd4b4243bce829152@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>	<f0ag3v$egs$1@sea.gmane.org>	<d11dcfba0704201514u71a6d27dg331737dbb9024a76@mail.gmail.com>	<f0d6tm$5n2$1@sea.gmane.org>
	<d11dcfba0704211112o64fa988cnd4b4243bce829152@mail.gmail.com>
Message-ID: <462A73B8.4080406@improva.dk>

Steven Bethard wrote:
> On 4/21/07, Christian Heimes <lists at cheimes.de> wrote:
>   
>> If you like to introduce __main__ why not implement
>> http://www.python.org/dev/peps/pep-0299 ? It proposes a __main__(*argv*)
>> module level function that replaced the "if __name__ == '__main__'"
>> idiom. The __main__ function follows the example of other programming
>> languages like C, C# and Java.
>>     
>
> I don't like the __main__ function signature. There are lots of
> options, like optparse and argparse_ that are much better than
> manually parsing sys.argv as the PEP 299 signature would suggest.

I agree that optparse and argparse are better ways to parse a command 
line than using sys.argv directly, but nothing in PEP299 would prevent 
you from using them.
In fact, I am pretty sure that with a suitable decorator on __main__ you 
could make their use even simpler.

>  And
> if there's nothing to be passed to the function, why make it a
> function at all? 
Because you may want to call it from somewhere else, possibly with 
different arguments?

> Personally, I thought one of the pluses of the
> current status quo (as well as what Brett is proposing here) is that
> it *didn't* follow in the (misplaced IMHO) footsteps of languages like
> C and Java. I think we're probably best letting dead PEPs lie.
>   

I find it very sad that PEP299 did in fact die, because I think it is a 
much cleaner solution than the proposal that started this thread. 

That said, I would like to see a way to remove the __name__=='__main__' 
weirdness.  I am +1 on resurrecting PEP299, but also +1 on adding a 
"sys.main" that could be used in a new "if __name__ == sys.main".  I am -1 
on adding a builtin/global __main__ as proposed, because that would 
clash with my own PEP299-like use of that name.

Jacob

-- 
Jacob Holm
CTO
Improva ApS



From jcarlson at uci.edu  Sat Apr 21 23:09:23 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 21 Apr 2007 14:09:23 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
Message-ID: <20070421135948.63AF.JCARLSON@uci.edu>


After reading other posts in the thread, I'm going to put my support
into the sys.main variant.  It has all of the benefits of the builtin __name__
== __main__, with none of the drawbacks (no builtin!), and only a slight
annoyance of 'import sys', which is more or less free.
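
To make the sys.main variant concrete, here is a minimal sketch.  It assumes
sys.main would hold the name of the module being run as the script, as
proposed in this thread; no released Python defines that attribute, so the
helper (is_main, a made-up name) falls back to the existing "__main__"
convention when it is missing.

```python
import sys

def is_main(module_name):
    """True if module_name names the module being executed as the script."""
    # sys.main is hypothetical (the attribute proposed in this thread);
    # without it, the main module is the one named "__main__".
    main_name = getattr(sys, "main", "__main__")
    return module_name == main_name

# Intended use at the bottom of a module:
#     if is_main(__name__):
#         run_the_script()
```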


 - Josiah



From jimjjewett at gmail.com  Sun Apr 22 00:03:03 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sat, 21 Apr 2007 18:03:03 -0400
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704201835vab6e94sfb7e2fa0cbf03a5@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<d11dcfba0704192256p74018fcey4d2c7827f7696f88@mail.gmail.com>
	<bbaeab100704201009je396733n3261cd7683f2ed58@mail.gmail.com>
	<d11dcfba0704201508x7139a1a4ma932e2b363631222@mail.gmail.com>
	<bbaeab100704201835vab6e94sfb7e2fa0cbf03a5@mail.gmail.com>
Message-ID: <fb6fbf560704211503t3a23f131o23e8c4599e4a74cf@mail.gmail.com>

On 4/20/07, Brett Cannon <brett at python.org> wrote:
> On 4/20/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> > On 4/20/07, Brett Cannon <brett at python.org> wrote:
> > > On 4/19/07, Steven Bethard <steven.bethard at gmail.com> wrote:

> > > > I would have thought that if Python inserted __main__ before any of
> > > > the module contents got exec'd, it would be backwards compatible
> > > > because any use of __main__ would just overwrite the default one.

> > > That's right, and that is the problem.  That would mean if __main__
> > > was false but then overwritten by a function or something, it suddenly
> > > became true.  It isn't a problem in terms of whether the code will
> > > run, but whether the expected semantics will occur.

If the code is still using a __main__ variable of its own, then
presumably it isn't using the new meaning of __main__, and isn't
affected by the unexpected semantics.

Or are you concerned that some code *outside* a module could check to
see whether that module is __main__?

> > Sure, but I don't see how it's much different from anyone who writes::

> >     list = [foo, bar, baz]

> > and then later wonders why::

> >     list(obj)

> > gives a ``TypeError: 'list' object is not callable``.


> Exactly.  It's just that 'list' was known about when the code was
> written while __main__ was not.

In that case, the module itself isn't using (and doesn't care) about
the new __main__ semantics.  Code external to the module can't rely on
either (list or __main__) being unchanged, even today.
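
The shadowing case above, as a minimal runnable sketch (Python 3 syntax; the
function name shadowed is made up for illustration):

```python
def shadowed():
    list = [1, 2, 3]      # rebinding the name hides the builtin in this scope
    try:
        list("abc")       # now calls the list *object*, not the type
    except TypeError as exc:
        return str(exc)

print(shadowed())
```

Only code that uses the rebound name after the assignment is affected; the
builtin itself is untouched everywhere else.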

> > I'd really like there to be a way to write Python 3.0 compatible code
> > in Python 2.6 without having to run through 2to3.

To me, this is a fairly important requirement that I fear is sometimes
being forgotten.

2to3 isn't really a one-time translation unless you stop supporting
2.x after running it.

-jJ


From rrr at ronadam.com  Sun Apr 22 00:50:29 2007
From: rrr at ronadam.com (Ron Adam)
Date: Sat, 21 Apr 2007 17:50:29 -0500
Subject: [Python-ideas] PEP for executing a module in a
 package	containing relative imports
In-Reply-To: <462A73B8.4080406@improva.dk>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>	<f0ag3v$egs$1@sea.gmane.org>	<d11dcfba0704201514u71a6d27dg331737dbb9024a76@mail.gmail.com>	<f0d6tm$5n2$1@sea.gmane.org>	<d11dcfba0704211112o64fa988cnd4b4243bce829152@mail.gmail.com>
	<462A73B8.4080406@improva.dk>
Message-ID: <462A9535.3010600@ronadam.com>



Jacob Holm wrote:
> Steven Bethard wrote:
>> On 4/21/07, Christian Heimes <lists at cheimes.de> wrote:
>>   
>>> If you like to introduce __main__ why not implement
>>> http://www.python.org/dev/peps/pep-0299 ? It proposes a __main__(*argv*)
>>> module level function that replaced the "if __name__ == '__main__'"
>>> idiom. The __main__ function follows the example of other programming
>>> languages like C, C# and Java.
>>>     
>> I don't like the __main__ function signature. There are lots of
>> options, like optparse and argparse_ that are much better than
>> manually parsing sys.argv as the PEP 299 signature would suggest.
> 
> I agree that optparse and argparse are better ways to parse a command 
> line than using sys.argv directly, but nothing in PEP299 would prevent 
> you from using them.
> In fact, I am pretty sure that with a suitable decorator on __main__ you 
> could make their use even simpler.
> 
>>  And
>> if there's nothing to be passed to the function, why make it a
>> function at all? 
> Because you may want to call it from somewhere else, possibly with 
> different arguments?
> 
>> Personally, I thought one of the pluses of the
>> current status quo (as well as what Brett is proposing here) is that
>> it *didn't* follow in the (misplaced IMHO) footsteps of languages like
>> C and Java. I think we're probably best letting dead PEPs lie.
>>   
> 
> I find it very sad that PEP299 did in fact die, because I think it is 
> a much cleaner solution than the proposal that started this thread. 
> 
> That said, I would like to see a way to remove the __name__=='__main__' 
> weirdness.  I am +1 on resurrecting PEP299, but also +1 on adding a 
> "sys.main" that could be used in a new "if __name__ == sys.main".  I am -1 
> on adding a builtin/global __main__ as proposed, because that would 
> clash with my own PEP299-like use of that name.

I had at one time (about 4 years ago) thought it was a bit strange.  But 
that was only for a very short while.

Python differs from other languages in a very important way.  Python 
*always* starts at the top of the file and works its way down until it falls 
off the bottom.  What it does in between the top and the bottom is entirely 
up to you.  It's very dynamic.

Other languages *compile* all the code first without executing any of it. 
Then you are required to tell the compiler where the program will 
start, which is why you need to define a main() function.

In Python, letting control fall off the bottom in order to start again at 
some place in the middle doesn't make much sense. It's already started, so 
you don't need to do that.

Cheers,
    Ron




From brett at python.org  Sun Apr 22 01:49:47 2007
From: brett at python.org (Brett Cannon)
Date: Sat, 21 Apr 2007 16:49:47 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <20070421135948.63AF.JCARLSON@uci.edu>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
Message-ID: <bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>

On 4/21/07, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> After reading other posts in the thread, I'm going to put my support
> into the sys.main variant.  It has all of the benefits of the builtin __name__
> == __main__, with none of the drawbacks (no builtin!), and only a slight
> annoyance of 'import sys', which is more or less free.
>

Yeah, I am starting to like it as well.  Steven and Jim, what do you think?

-Brett


From jh at improva.dk  Sun Apr 22 02:04:04 2007
From: jh at improva.dk (Jacob Holm)
Date: Sat, 21 Apr 2007 20:04:04 -0400
Subject: [Python-ideas] PEP for executing a module in a
 package	containing relative imports
In-Reply-To: <462A9535.3010600@ronadam.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>	<f0ag3v$egs$1@sea.gmane.org>	<d11dcfba0704201514u71a6d27dg331737dbb9024a76@mail.gmail.com>	<f0d6tm$5n2$1@sea.gmane.org>	<d11dcfba0704211112o64fa988cnd4b4243bce829152@mail.gmail.com>
	<462A73B8.4080406@improva.dk> <462A9535.3010600@ronadam.com>
Message-ID: <462AA674.1020803@improva.dk>

Ron Adam wrote:
>
> Jacob Holm wrote:
>> I find it very sad that PEP299 did in fact die, because I think it is 
>> a much cleaner solution than the proposal that started this thread.
>> That said, I would like to see a way to remove the 
>> __name__=='__main__' weirdness. I am +1 on resurrecting PEP299, but 
>> also +1 on adding a "sys.main" that could be used in a new "if 
>> __name__ == sys.main". I am -1 on adding a builtin/global __main__ as 
>> proposed, because that would clash with my own PEP299-like use of 
>> that name.
>
> I had at one time (about 4 years ago) thought it was a bit strange. 
> But that was only for a very short while.
>
To clarify: By "weirdness" here I meant the fact that the name of a 
module changes when it is used as the main module.

> Python differs from other languages in a very important way. Python 
> *always* starts at the top of the file and works its way down until it 
> falls off the bottom. What it does in between the top and the bottom 
> is entirely up to you. It's very dynamic.
>
> Other languages *compile* all the code first without executing any of 
> it. Then you are required to tell the compiler where the program 
> will start, which is why you need to define a main() function.
>
I know all that.

> In Python, letting control fall off the bottom in order to start again 
> at some place in the middle doesn't make much sense. It's already 
> started, so you don't need to do that.

There are a number of reasons to want to use a function for the main 
part of the code, instead of putting it in an "if" at the end of the 
module. Two simple ones are:

Keeping the module namespace clean.

The ability to call the function from other code, most likely with 
different args.

Since I am usually writing such a function anyway, I would prefer not to 
have to write the "if" boilerplate at the bottom in order to get it 
called. Oh, and automatically calling a __main__ function if it exists, 
does not prevent people who like the current "if" approach from using 
that. It would just make *my* life that tiny bit easier.
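
A minimal sketch of the pattern described above, as it has to be written
today.  The function name main, the --name option and the greeting are
made up for illustration; argparse is used only because it ships with
current Python versions.

```python
import argparse

def main(argv=None):
    """Entry point that other code can call with explicit arguments."""
    parser = argparse.ArgumentParser(description="demo entry point")
    parser.add_argument("--name", default="world")
    args = parser.parse_args(argv)   # argv=None means "use sys.argv[1:]"
    return "hello, %s" % args.name

# The boilerplate that an automatically called __main__ function would replace.
if __name__ == "__main__":
    print(main())
```

Everything the script needs stays local to main(), so the module namespace
holds only the function itself, and a caller can run main(["--name", "x"])
without touching sys.argv.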

Therefore I would like to keep that door open by *not* adding the 
proposed __main__ variable at this point. Fortunately, the people that 
matter here seem to think avoiding the extra variable is a good idea 
(although for different reasons).

Jacob

-- 
Jacob Holm
CTO
Improva ApS



From ironfroggy at gmail.com  Sun Apr 22 06:14:59 2007
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sun, 22 Apr 2007 00:14:59 -0400
Subject: [Python-ideas] partial with skipped arguments
Message-ID: <76fd5acf0704212114y22db7080je41e378c6ceaa994@mail.gmail.com>

I often wish you could bind to arguments in a partial out of order,
skipping some positionals. The solution I came up with is a singleton
object located as an attribute of the partial function itself and used
like this:

from functools import partial

def foo(a, b):
    return a / b
pf = partial(foo, partial.skip, 2)  # partial.skip is the proposed singleton
assert pf(1.0) == 0.5
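
One way the proposed behaviour could be prototyped in pure Python, as a
minimal sketch.  SKIP and partial_skip are made-up names standing in for the
proposed partial.skip attribute; this is not how functools.partial itself is
implemented.

```python
import functools

SKIP = object()  # hypothetical sentinel standing in for the proposed partial.skip

def partial_skip(func, *bound, **kwbound):
    """Like functools.partial, but SKIP marks positional slots to fill later."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        remaining = iter(args)
        filled = []
        for value in bound:
            if value is SKIP:
                try:
                    filled.append(next(remaining))   # take the next call-time arg
                except StopIteration:
                    raise TypeError("missing argument for skipped position") from None
            else:
                filled.append(value)
        filled.extend(remaining)                      # leftover args go at the end
        merged = dict(kwbound)
        merged.update(kwargs)
        return func(*filled, **merged)
    return wrapper

def foo(a, b):
    return a / b

pf = partial_skip(foo, SKIP, 2)
assert pf(1.0) == 0.5
```

Because the slots are matched by position, the same wrapper also works for
callables that do not accept keyword arguments.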

-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/


From tjreedy at udel.edu  Sun Apr 22 06:57:06 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 22 Apr 2007 00:57:06 -0400
Subject: [Python-ideas] ordered dict
References: <f0d8sl$c0h$1@sea.gmane.org><740c3aec0704210817m3e11a9e5lb3325523e2490348@mail.gmail.com>
	<20070421112051.63AB.JCARLSON@uci.edu>
Message-ID: <f0epul$479$1@sea.gmane.org>


"Josiah Carlson" <jcarlson at uci.edu> wrote in message 
news:20070421112051.63AB.JCARLSON at uci.edu...

>  Including a list of 'acceptable' classes as Terry has suggested would 
> work, but would
> generally be superfluous.

I realized that later.  The main use would be to improve the error message, 
or allow introspection ("Sdict, what can I put in you?").

tjr





From steven.bethard at gmail.com  Sun Apr 22 07:10:16 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 21 Apr 2007 23:10:16 -0600
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
Message-ID: <d11dcfba0704212210l44155aaek8fe05ad534d5f635@mail.gmail.com>

On 4/21/07, Brett Cannon <brett at python.org> wrote:
> On 4/21/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> >
> > After reading other posts in the thread, I'm going to put my support
> > into the sys.main variant.  It has all of the benefits of the builtin __name__
> > == __main__, with none of the drawbacks (no builtin!), and only a slight
> > annoyance of 'import sys', which is more or less free.
>
> Yeah, I am starting to like it as well.  Steven and Jim, what do you think?

Note that the one benefit the sys.main-only variant doesn't have is
the lower cognitive load of just having to know about __main__,
instead of having to know about __name__, import and sys.main.

That said, since the PEP as it stands introduces a sys.main anyway, we
might as well start with that.  People can then play around with it
and see if we need to introduce a __main__ module attribute or builtin
as well.

Steve
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From steven.bethard at gmail.com  Sun Apr 22 07:14:01 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 21 Apr 2007 23:14:01 -0600
Subject: [Python-ideas] partial with skipped arguments
In-Reply-To: <76fd5acf0704212114y22db7080je41e378c6ceaa994@mail.gmail.com>
References: <76fd5acf0704212114y22db7080je41e378c6ceaa994@mail.gmail.com>
Message-ID: <d11dcfba0704212214n691838e5v95edb6e758a5a1db@mail.gmail.com>

On 4/21/07, Calvin Spealman <ironfroggy at gmail.com> wrote:
> I often wish you could bind to arguments in a partial out of order,
> skipping some positionals. The solution I came up with is a singleton
> object located as an attribute of the partial function itself and used
> like this:
>
> def foo(a, b):
>     return a / b
> pf = partial(foo, partial.skip, 2)
> assert pf(1.0) == 0.5

The other way I've seen this proposed is as::

    rpartial(foo, 2)

In this particular situation, you could also just write::

    partial(foo, b=2)

I think the presence of keyword argument support is why rpartial
wasn't added originally.
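
For comparison, a minimal sketch of both spellings.  rpartial is not part of
functools; the helper below is a made-up illustration of the idea, shown
next to the keyword form that functools.partial already supports.

```python
import functools

def rpartial(func, *bound):
    """Hypothetical helper: bind arguments at the right-hand end of the call."""
    def wrapper(*args, **kwargs):
        return func(*(args + bound), **kwargs)
    return wrapper

def foo(a, b):
    return a / b

assert rpartial(foo, 2)(1.0) == 0.5                # binds b by position
assert functools.partial(foo, b=2)(1.0) == 0.5     # binds b by name
```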

Steve
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From ironfroggy at gmail.com  Sun Apr 22 14:11:55 2007
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sun, 22 Apr 2007 08:11:55 -0400
Subject: [Python-ideas] partial with skipped arguments
In-Reply-To: <d11dcfba0704212214n691838e5v95edb6e758a5a1db@mail.gmail.com>
References: <76fd5acf0704212114y22db7080je41e378c6ceaa994@mail.gmail.com>
	<d11dcfba0704212214n691838e5v95edb6e758a5a1db@mail.gmail.com>
Message-ID: <76fd5acf0704220511wb7e5026odd3daf21f68b396b@mail.gmail.com>

On 4/22/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 4/21/07, Calvin Spealman <ironfroggy at gmail.com> wrote:
> > I often wish you could bind to arguments in a partial out of order,
> > skipping some positionals. The solution I came up with is a singleton
> > object located as an attribute of the partial function itself and used
> > like this:
> >
> > def foo(a, b):
> >     return a / b
> > pf = partial(foo, partial.skip, 2)
> > assert pf(1.0) == 0.5
>
> The other way I've seen this proposed is as::
>
>     rpartial(foo, 2)
>
> In this particular situation, you could also just write::
>
>     partial(foo, b=2)
>
> I think the presence of keyword argument support is why rpartial
> wasn't added originally.
>
> Steve
> --
> I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
> tiny blip on the distant coast of sanity.
>         --- Bucky Katt, Get Fuzzy

Relying on the names of positional arguments is not always a good idea,
of course. Also, it doesn't work at all with builtin (and extension?)
functions. The design is a little different, but I like it. Also, the
rpartial idea just creates multiple names for essentially the same
thing and still doesn't allow for skipping middle arguments or specifying
only middle arguments, etc. I'd like to write a patch, if it would be
considered.
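
A small sketch of the builtin limitation mentioned above, using divmod as
the example (in current CPython its parameters are positional-only; the
variable names here are made up):

```python
import functools

halve = functools.partial(divmod, y=2)   # binding succeeds; the call fails
try:
    halve(7)
except TypeError as exc:
    result = exc                          # divmod() rejects keyword arguments

assert isinstance(result, TypeError)
```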

-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/


From steven.bethard at gmail.com  Sun Apr 22 17:15:58 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 22 Apr 2007 09:15:58 -0600
Subject: [Python-ideas] partial with skipped arguments
In-Reply-To: <76fd5acf0704220511wb7e5026odd3daf21f68b396b@mail.gmail.com>
References: <76fd5acf0704212114y22db7080je41e378c6ceaa994@mail.gmail.com>
	<d11dcfba0704212214n691838e5v95edb6e758a5a1db@mail.gmail.com>
	<76fd5acf0704220511wb7e5026odd3daf21f68b396b@mail.gmail.com>
Message-ID: <d11dcfba0704220815i638f2580sf29464491e611c88@mail.gmail.com>

On 4/22/07, Calvin Spealman <ironfroggy at gmail.com> wrote:
> On 4/22/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> > On 4/21/07, Calvin Spealman <ironfroggy at gmail.com> wrote:
> > > I often wish you could bind to arguments in a partial out of order,
> > > skipping some positionals. The solution I came up with is a singleton
> > > object located as an attribute of the partial function itself and used
> > > like this:
> > >
> > > def foo(a, b):
> > >     return a / b
> > > pf = partial(foo, partial.skip, 2)
> > > assert pf(1.0) == 0.5
> >
> > The other way I've seen this proposed is as::
> >
> >     rpartial(foo, 2)
> >
> > In this particular situation, you could also just write::
> >
> >     partial(foo, b=2)
>
> Relying on the names of position arguments is not always a good idea,
> of course. Also, it doesn't work at all with builtin (and extension?)
> functions. The design is a little different, but I like it. Also, the
> rpartial idea just creates multiple names for essentially the same
> thing and still doesn't allow for skipping middle arguments or specify
> only middle arguments, etc. I'd like to write a patch, if it would be
> considered.

Well, I can pretty much guarantee you'll get the two responses above,
so if you post a patch, make sure you let python-dev know that you've
already considered these options and don't see them as satisfactory.
Your best bet of convincing people is probably to find a few
real-world use cases and post the corresponding code.

Steve
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From collinw at gmail.com  Sun Apr 22 18:14:01 2007
From: collinw at gmail.com (Collin Winter)
Date: Sun, 22 Apr 2007 09:14:01 -0700
Subject: [Python-ideas] partial with skipped arguments
In-Reply-To: <76fd5acf0704212114y22db7080je41e378c6ceaa994@mail.gmail.com>
References: <76fd5acf0704212114y22db7080je41e378c6ceaa994@mail.gmail.com>
Message-ID: <43aa6ff70704220914l489de151ta648c485155446f8@mail.gmail.com>

On 4/21/07, Calvin Spealman <ironfroggy at gmail.com> wrote:
> I often wish you could bind to arguments in a partial out of order,
> skipping some positionals. The solution I came up with is a singleton
> object located as an attribute of the partial function itself and used
> like this:
>
> def foo(a, b):
>     return a / b
> pf = partial(foo, partial.skip, 2)
> assert pf(1.0) == 0.5

In Python 2.5.0:

>>> import functools
>>> def f(a, b):
...     return a + b
...
>>> p = functools.partial(f, b=9)
>>> p
<functools.partial object at 0xb7d66194>
>>> p(3)
12
>>>

Is this what you're looking for?

Collin Winter


From jimjjewett at gmail.com  Sun Apr 22 19:11:52 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sun, 22 Apr 2007 13:11:52 -0400
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
Message-ID: <fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>

On 4/21/07, Brett Cannon <brett at python.org> wrote:
> On 4/21/07, Josiah Carlson <jcarlson at uci.edu> wrote:

> > After reading other posts in the thread, I'm going to put my support
> > into the sys.main variant.  It has all of the benefits of the builtin __name__
> > == __main__, with none of the drawbacks (no builtin!), and only a slight
> > annoyance of 'import sys', which is more or less free.

> Yeah, I am starting to like it as well.  Steven and Jim, what do you think?

Better than adding a builtin.

I'm not sure I like the idea of another semi-random object in sys
either, though.

(1)  One of the motivations was importing.  It looks like __file__
already has sufficient information.  I understand that relying on it
(or on __package__?) seems a bit hacky, but is it really worse than
adding something?

(2)  Is there a reason the main module can't appear in sys.modules
twice, once under the alias "__main__"?

# Equivalent to today
if __name__ == sys.modules["__main__"].__name__:

# Better than today
if __name__ is sys.modules["__main__"].__name__:

# What I would like (pending PEP I hope to write tonight)
if __this_module__ is sys.modules["__main__"]:
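
A small sketch of the object-identity comparison, written so it runs
today. There is no __this_module__ yet, so the hypothetical helper
is_main_module looks the module up in sys.modules by name instead;
that lookup is an assumption of the sketch, not part of any proposal.

```python
import sys


def is_main_module(module_name):
    """Return True if the module registered under module_name is the
    very same object as the one registered as "__main__"."""
    module = sys.modules.get(module_name)
    return module is not None and module is sys.modules.get("__main__")
```

Inside a module this would be called as is_main_module(__name__),
which compares module objects rather than name strings.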

-jJ


From jcarlson at uci.edu  Sun Apr 22 19:39:57 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 22 Apr 2007 10:39:57 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
References: <bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
Message-ID: <20070422102818.63B5.JCARLSON@uci.edu>


"Jim Jewett" <jimjjewett at gmail.com> wrote:
> 
> On 4/21/07, Brett Cannon <brett at python.org> wrote:
> > On 4/21/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> 
> > > After reading other posts in the thread, I'm going to put my support
> > > into the sys.main variant.  It has all of the benefits of the builtin __name__
> > > == __main__, with none of the drawbacks (no builtin!), and only a slight
> > > annoyance of 'import sys', which is more or less free.
> 
> > Yeah, I am starting to like it as well.  Steven and Jim, what do you think?
> 
> Better than adding a builtin.
> 
> I'm not sure I like the idea of another semi-random object in sys
> either, though.
> 
> (1)  One of the motivations was importing.  It looks like __file__
> already has sufficient information.  I understand that relying on it
> (or on __package__?) seems a bit hacky, but is it really worse than
> adding something?
> 
> (2)  Is there a reason the main module can't appear in sys.modules
> twice, once under the alias "__main__"?

While it is unlikely, there may be cleanup issues when the process is
ending.


> # Equivalent to today
> if __name__ == sys.modules["__main__"].__name__:
> 
> # Better than today
> if __name__ is sys.modules["__main__"].__name__:

The above two should be equivalent unless the importer has a bad habit.


> # What I would like (pending PEP I hope to write tonight)
> if __this_module__ is sys.modules["__main__"]:

While I would also very much like the ability to access *this module*,
I don't believe that this necessarily precludes the use of a proper
package.module naming scheme for all __name__ values.


 - Josiah



From steven.bethard at gmail.com  Sun Apr 22 19:42:38 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 22 Apr 2007 11:42:38 -0600
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
Message-ID: <d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>

On 4/22/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> # Equivalent to today
> if __name__ == sys.modules["__main__"].__name__:
>
> # Better than today
> if __name__ is sys.modules["__main__"].__name__:
>
> # What I would like (pending PEP I hope to write tonight)
> if __this_module__ is sys.modules["__main__"]:

Is it just me, or are the proposals starting to look more and more like::

    public static void main(String args[])

I think this PEP now needs to explicitly state that keeping the "am I
the main module?" idiom as simple as possible is *not* a goal. Because
everything I've seen (except for the original proposals in the PEP)
is substantially more complicated than the current::

    if __name__ == '__main__':

I guess I don't understand why we wouldn't be willing to put up with a
new module attribute or builtin to minimize the boilerplate in pretty
much every Python application out there.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From lists at cheimes.de  Sun Apr 22 20:15:43 2007
From: lists at cheimes.de (Christian Heimes)
Date: Sun, 22 Apr 2007 20:15:43 +0200
Subject: [Python-ideas] PEP for executing a module in a package
 containing relative imports
In-Reply-To: <d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>	<20070421135948.63AF.JCARLSON@uci.edu>	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
	<d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>
Message-ID: <f0g8ok$5kp$1@sea.gmane.org>

Steven Bethard wrote:
> I think this PEP now needs to explicitly state that keeping the "am I
> the main module?" idiom as simple as possible is *not* a goal. Because
> everything I've seen (except for the original proposals in the PEP)
> are substantially more complicated than the current::
> 
>     if __name__ == '__main__':
> 

I'm proposing the following changes:

* sys.main is added which contains the dotted name of the main script.
   This allows code like:

     if __name__ == sys.main:
         ...

     main_module = sys.modules[sys.main]

* __name__ is never mangled and always contains the dotted name of
   the current module. It's not set to '__main__' any more. You can
   get the current module object with

     this_module = sys.modules[__name__]

* I'm against sys.modules['__main__] = main_module because it may
   cause ugly side effects with reload. The same functionality is
   available with sys.modules[sys.main]. The Zen of Python says that
   there should be one, and preferably only one, obvious way to do it.

 > I guess I don't understand why we wouldn't be willing to put up with a
 > new module attribute or builtin to minimize the boilerplate in pretty
 > much every Python application out there.

Why bother with the second prize when you can win the first prize? In my 
opinion a __main__() function makes life easier than a __main__ module 
level variable. It's also my opinion that the main code should be in a 
function and not in the body of the module. I consider it good style 
because the code is unit testable (is this a word? *g*) and callable 
from another module while code in the body is not accessible from unit 
tests and other scripts.

I know that some people are against __main__(argv) but I have good 
reasons to propose the argv syntax. Although argv is available via 
sys.argv I like to see it as an argument for __main__() for the same 
reasons I like to see __main__. It makes unit testing and calls from 
another module possible. Without the argv argument it is harder to 
change the arguments in unit tests.
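
The testability argument can be sketched with today's idiom. The
function name main, the argv=None default and the "--" convention are
assumptions for illustration; the proposal itself is about a
__main__(argv) function called by the interpreter.

```python
import sys


def main(argv=None):
    """Split argv into option-like and positional arguments.

    Falling back to sys.argv only when no argv is passed lets a test or
    another module supply its own argument list.
    """
    if argv is None:
        argv = sys.argv
    options = [arg for arg in argv[1:] if arg.startswith("--")]
    positional = [arg for arg in argv[1:] if not arg.startswith("--")]
    return options, positional


if __name__ == "__main__":
    print(main())
```

A unit test can then call main(["mainscript.py", "--no-spam", "eggs"])
directly, without patching sys.argv.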

Now for some syntactic sugar and a dream of mine:

@argumentdecorator(MyOptionParserClass)
def __main__(egg, spam=5):
    pass

The argumentdecorator function takes some kind of option parser class 
that is used to parse argv. This would allow nice code like

__main__(('mainscript.py', '--eggs', '5', '--no-spam'))
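
One way such a decorator might look, as a rough sketch only. It
assumes the standard argparse module as the option parser and takes a
parser factory rather than a parser class; build_parser, the --eggs
and --no-spam options, and the name main are all made up for the
example.

```python
import argparse
import functools


def argumentdecorator(parser_factory):
    """Hypothetical decorator: parse argv[1:] with the parser returned by
    parser_factory() and pass the parsed options as keyword arguments."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(argv):
            namespace = parser_factory().parse_args(list(argv[1:]))
            return func(**vars(namespace))
        return wrapper
    return decorate


def build_parser():
    parser = argparse.ArgumentParser(prog="mainscript.py")
    parser.add_argument("--eggs", type=int, default=0)
    # --no-spam switches the 'spam' option off; it defaults to True.
    parser.add_argument("--no-spam", dest="spam", action="store_false")
    return parser


@argumentdecorator(build_parser)
def main(eggs, spam=True):
    return eggs, spam
```

Note that each command-line token is a separate element of argv here,
as it would be in sys.argv, so the option and its value are passed as
'--eggs', '5'.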

Christian



From steven.bethard at gmail.com  Sun Apr 22 20:32:03 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 22 Apr 2007 12:32:03 -0600
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <f0g8ok$5kp$1@sea.gmane.org>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
	<d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>
	<f0g8ok$5kp$1@sea.gmane.org>
Message-ID: <d11dcfba0704221132v184b66ddoe2bf6ca87c23c2a0@mail.gmail.com>

On 4/22/07, Christian Heimes <lists at cheimes.de> wrote:
> Steven Bethard wrote:
> > I think this PEP now needs to explicitly state that keeping the "am I
> > the main module?" idiom as simple as possible is *not* a goal. Because
> > everything I've seen (except for the original proposals in the PEP)
> > are substantially more complicated than the current::
> >
> >     if __name__ == '__main__':
> >
>
> I'm proposing the following changes:
>
> * sys.main is added which contains the dotted name of the main script.
>    This allows code like:
>
>      if __name__ == sys.main:
>          ...

Note that this really requires the code::

    import sys
    if __name__ == sys.main:

The import statement matters to me because 77% of my modules that use
the __main__ idiom *don't* import sys. Hence, for those modules, this
new idiom introduces more boilerplate.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From brett at python.org  Sun Apr 22 20:39:01 2007
From: brett at python.org (Brett Cannon)
Date: Sun, 22 Apr 2007 11:39:01 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
Message-ID: <bbaeab100704221139u7a6794e4n29fc7e6224eb3769@mail.gmail.com>

On 4/22/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 4/21/07, Brett Cannon <brett at python.org> wrote:
> > On 4/21/07, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> > > After reading other posts in the thread, I'm going to put my support
> > > into the sys.main variant.  It has all of the benefits of the builtin __name__
> > > == __main__, with none of the drawbacks (no builtin!), and only a slight
> > > annoyance of 'import sys', which is more or less free.
>
> > Yeah, I am starting to like it as well.  Steven and Jim, what do you think?
>
> Better than adding a builtin.
>
> I'm not sure I like the idea of another semi-random object in sys
> either, though.
>
> (1)  One of the motivations was importing.  It looks like __file__
> already has sufficient information.  I understand that relying on it
> (or on __package__?) seems a bit hacky, but is it really worse than
> adding something?
>

Yes, because you have no guarantee __file__ will in any way be unique
or even defined (look at 'sys').  It's up to the loader to set
__file__ and it can do whatever it wants.  This doesn't happen with
__name__ since it is rather clear what that should be no matter where
the module was loaded from (unless it was a Python file specified at
the command line in some random directory).
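
A quick way to see the 'sys' case, as a sketch: module_origin is a
hypothetical helper, and the result for sys reflects CPython, where sys
is built into the interpreter and has no __file__.

```python
import json
import sys


def module_origin(module):
    """Return (__name__, __file__), using None when the loader set no __file__."""
    return module.__name__, getattr(module, "__file__", None)


builtin_origin = module_origin(sys)    # no file behind a built-in module
source_origin = module_origin(json)    # an ordinary module loaded from a file
```

__name__ is present in both cases, while __file__ is only there when
the loader chose to set it.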

-Brett


From brett at python.org  Sun Apr 22 20:44:56 2007
From: brett at python.org (Brett Cannon)
Date: Sun, 22 Apr 2007 11:44:56 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <f0g8ok$5kp$1@sea.gmane.org>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
	<d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>
	<f0g8ok$5kp$1@sea.gmane.org>
Message-ID: <bbaeab100704221144m1cfbc29ag86e8bdfa155187a4@mail.gmail.com>

On 4/22/07, Christian Heimes <lists at cheimes.de> wrote:
> Steven Bethard wrote:
> > I think this PEP now needs to explicitly state that keeping the "am I
> > the main module?" idiom as simple as possible is *not* a goal. Because
> > everything I've seen (except for the original proposals in the PEP)
> > are substantially more complicated than the current::
> >
> >     if __name__ == '__main__':
> >
>
> I'm proposing the following changes:
>
> * sys.main is added which contains the dotted name of the main script.
>    This allows code like:
>
>      if __name__ == sys.main:
>          ...
>
>      main_module = sys.modules[sys.main]
>
> * __name__ is never mangled and contains always the dotted name of
>    the current module. It's not set to '__main__' any more.

That can't be true.  If I am in the directory /spam but I execute the
file /bacon/code.py, what is the name of /bacon/code.py supposed to
be?  It makes absolutely no sense unless sys.path happens to have
either / or /bacon.  This is why I wondered out loud if setting
whatever attribute is chosen to something other than __main__ should
only be done with '-m', as that keeps it simple and clear instead of
having to try to reverse-engineer a file's __name__ attribute.
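
The reverse-engineering problem can be sketched as follows. dotted_name
is a hypothetical helper that only does path arithmetic on POSIX-style
paths; it does not check for __init__.py files, so it is looser than a
real import system would be.

```python
from pathlib import PurePosixPath


def dotted_name(file_path, search_path):
    """Guess a dotted module name for file_path from the first entry of
    search_path that contains it; return None if no entry does."""
    path = PurePosixPath(file_path).with_suffix("")
    for entry in search_path:
        try:
            relative = path.relative_to(PurePosixPath(entry))
        except ValueError:
            continue  # file_path is not under this entry
        return ".".join(relative.parts)
    return None
```

With a search path of ["/spam"] there is no answer for /bacon/code.py
at all, and with "/" versus "/bacon" on the path the same file gets two
different names ("bacon.code" and "code").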

>You can
>    get the current module object with
>
>      this_module = sys.modules[__name__]
>
> * I'm against sys.modules['__main__] = main_module because it may
>    cause ugly side effects with reload.

I assume that key is a string?  There is a single quote that is not closed off.

> The same functionality is
>    available with sys.modules[sys.main]. The Zen Of Python says that
>    there should be one and only one obvious way.
>
>  > I guess I don't understand why we wouldn't be willing to put up with a
>  > new module attribute or builtin to minimize the boilerplate in pretty
>  > much every Python application out there.
>
> Why bother with the second price when you can win the first prize? In my
> opinion a __main__() function makes live easier than a __main__ module
> level variable. It's also my opinion that the main code should be in a
> function and not in the body of the module. I consider it good style
> because the code is unit testable (is this a word? *g*) and callable
> from another module while code in the body is not accessable from unit
> tests and other scripts.

People can stop wishing for this.  I am not going to be writing a PEP
supporting this.  I don't like it; never have.  I like how Python
handles things currently in terms of relying on how modules are
executed linearly.

I am totally fine if people propose a competing PEP or try to
resurrect PEP 299, but I am not going to be the person who does that
leg work.

-Brett


From lists at cheimes.de  Sun Apr 22 20:54:57 2007
From: lists at cheimes.de (Christian Heimes)
Date: Sun, 22 Apr 2007 20:54:57 +0200
Subject: [Python-ideas] PEP for executing a module in a package
 containing relative imports
In-Reply-To: <bbaeab100704221144m1cfbc29ag86e8bdfa155187a4@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>	<20070421135948.63AF.JCARLSON@uci.edu>	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>	<d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>	<f0g8ok$5kp$1@sea.gmane.org>
	<bbaeab100704221144m1cfbc29ag86e8bdfa155187a4@mail.gmail.com>
Message-ID: <f0gb25$2n1$1@sea.gmane.org>

Brett Cannon wrote:
>> * __name__ is never mangled and contains always the dotted name of
>>    the current module. It's not set to '__main__' any more.
> 
> That can't be true.  If I am in the directory /spam but I execute the
> file /bacon/code.py, what is the name of /bacon/code.py supposed to
> be?  It makes absolutely no sense unless sys.path happens to have
> either / or /bacon.  This is why I wondered out loud if setting
> whatever attribute that is chosen not to __main__ should only be done
> with '-m' as that keeps it simple and clear instead of having to try
> to reverse-engineer a file's __name__ attribute.

I hadn't thought of that issue. :(

>> * I'm against sys.modules['__main__] = main_module because it may
>>    cause ugly side effects with reload.
> 
> I assume that key is a string?  There is a single quote that is not closed off.

Yes, it's a typo. It should say sys.modules['__main__'].

> I am totally fine if people propose a competing PEP or try to
> resurrect PEP 299, but I am not going to be the person who does that
> leg work.

Understood! :)

Christian



From lists at cheimes.de  Sun Apr 22 21:00:28 2007
From: lists at cheimes.de (Christian Heimes)
Date: Sun, 22 Apr 2007 21:00:28 +0200
Subject: [Python-ideas] PEP for executing a module in a package
 containing relative imports
In-Reply-To: <462AA674.1020803@improva.dk>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>	<f0ag3v$egs$1@sea.gmane.org>	<d11dcfba0704201514u71a6d27dg331737dbb9024a76@mail.gmail.com>	<f0d6tm$5n2$1@sea.gmane.org>	<d11dcfba0704211112o64fa988cnd4b4243bce829152@mail.gmail.com>	<462A73B8.4080406@improva.dk>
	<462A9535.3010600@ronadam.com> <462AA674.1020803@improva.dk>
Message-ID: <f0gbch$4ep$1@sea.gmane.org>

Jacob Holm wrote:
> Therefore I would like to keep that door open by *not* adding the 
> proposed __main__ variable at this point. Fortunately, the people that 
> matter here seem to think avoiding the extra variable is a good idea 
> (although for different reasons).

+1 from me

Christian



From jimjjewett at gmail.com  Sun Apr 22 21:26:43 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sun, 22 Apr 2007 15:26:43 -0400
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <d11dcfba0704221132v184b66ddoe2bf6ca87c23c2a0@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
	<d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>
	<f0g8ok$5kp$1@sea.gmane.org>
	<d11dcfba0704221132v184b66ddoe2bf6ca87c23c2a0@mail.gmail.com>
Message-ID: <fb6fbf560704221226m17e9288fs3157cbe2cb217b58@mail.gmail.com>

On 4/22/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 4/22/07, Christian Heimes <lists at cheimes.de> wrote:

> > I'm proposing the following changes:

> > * sys.main is added which contains the dotted name of the main
> >    script.   This allows code like:

> >      if __name__ == sys.main:

> Note that this really requires the code::

>     import sys
>     if __name__ == sys.main:

As long as we're in python-ideas, I'll throw out the radical
suggestion of auto-importing sys into builtins, the way os autoimports
path.
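
What this would feel like can be approximated today, purely as an
illustration: the assignment below does by hand what the interpreter
would do at startup under this suggestion, and the exec'd snippet
stands in for a module that never imports sys.

```python
import builtins
import sys

# Publish the sys module as a builtin name (the hypothetical startup step).
builtins.sys = sys

# Code with no 'import sys' of its own; the name is found in builtins.
source = "platform = sys.platform"
namespace = {}
exec(source, namespace)
```

After this, namespace["platform"] holds sys.platform even though the
snippet never imported anything.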

-jJ


From steven.bethard at gmail.com  Sun Apr 22 22:56:09 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 22 Apr 2007 14:56:09 -0600
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <fb6fbf560704221226m17e9288fs3157cbe2cb217b58@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
	<d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>
	<f0g8ok$5kp$1@sea.gmane.org>
	<d11dcfba0704221132v184b66ddoe2bf6ca87c23c2a0@mail.gmail.com>
	<fb6fbf560704221226m17e9288fs3157cbe2cb217b58@mail.gmail.com>
Message-ID: <d11dcfba0704221356j2f62ab63u69bf3a79cb27d3ad@mail.gmail.com>

On 4/22/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 4/22/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> > On 4/22/07, Christian Heimes <lists at cheimes.de> wrote:
>
> > > I'm proposing the following changes:
>
> > > * sys.main is added which contains the dotted name of the main
> > >    script.   This allows code like:
>
> > >      if __name__ == sys.main:
>
> > Note that this really requires the code::
>
> >     import sys
> >     if __name__ == sys.main:
>
> As long as we're in python-ideas, I'll throw out the radical
> suggestion of auto-importing sys into builtins, the way os autoimports
> path.

While that would address my concern, I wonder if adding sys to the
builtins is really any better than adding __main__ to the builtins.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From lists at cheimes.de  Sun Apr 22 23:08:57 2007
From: lists at cheimes.de (Christian Heimes)
Date: Sun, 22 Apr 2007 23:08:57 +0200
Subject: [Python-ideas] PEP for executing a module in a package
 containing relative imports
In-Reply-To: <d11dcfba0704221356j2f62ab63u69bf3a79cb27d3ad@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>	<20070421135948.63AF.JCARLSON@uci.edu>	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>	<d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>	<f0g8ok$5kp$1@sea.gmane.org>	<d11dcfba0704221132v184b66ddoe2bf6ca87c23c2a0@mail.gmail.com>	<fb6fbf560704221226m17e9288fs3157cbe2cb217b58@mail.gmail.com>
	<d11dcfba0704221356j2f62ab63u69bf3a79cb27d3ad@mail.gmail.com>
Message-ID: <f0gita$r44$1@sea.gmane.org>

Steven Bethard wrote:
> While that would address my concern, I wonder if adding sys to the
> builtins is really any better than adding __main__ to the builtins.

If I understand the proposal right then __main__ won't be a builtin. 
Each module would get a new global variable __main__ which is set either 
to True or False.

Also I consider sys kinda reserved for the sys module while the __main__ 
global var approach would reserve a new name that I would like to see 
used for something else.

+0.25 for sys in builtins

Christian



From greg.ewing at canterbury.ac.nz  Mon Apr 23 00:18:22 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 23 Apr 2007 10:18:22 +1200
Subject: [Python-ideas] PEP for executing a module in a
 package	containing relative imports
In-Reply-To: <d11dcfba0704211112o64fa988cnd4b4243bce829152@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<f0ag3v$egs$1@sea.gmane.org>
	<d11dcfba0704201514u71a6d27dg331737dbb9024a76@mail.gmail.com>
	<f0d6tm$5n2$1@sea.gmane.org>
	<d11dcfba0704211112o64fa988cnd4b4243bce829152@mail.gmail.com>
Message-ID: <462BDF2E.1070004@canterbury.ac.nz>

Steven Bethard wrote:
> if there's nothing to be passed to the function, why make it a
> function at all?

I don't usually like to put big lumps of init code
at the module level, because it pollutes the module
namespace with local variables. So I typically end
up with

   def main():
     ...
     ...
     ...

   if __name__ == "__main__":
     main()

So I'd be quite happy if I could just define a
function called __main__() and be done with it. I
don't understand why there's so much opposition
to that idea.

--
Greg


From george.sakkis at gmail.com  Mon Apr 23 00:49:49 2007
From: george.sakkis at gmail.com (George Sakkis)
Date: Sun, 22 Apr 2007 18:49:49 -0400
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <462BDF2E.1070004@canterbury.ac.nz>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<f0ag3v$egs$1@sea.gmane.org>
	<d11dcfba0704201514u71a6d27dg331737dbb9024a76@mail.gmail.com>
	<f0d6tm$5n2$1@sea.gmane.org>
	<d11dcfba0704211112o64fa988cnd4b4243bce829152@mail.gmail.com>
	<462BDF2E.1070004@canterbury.ac.nz>
Message-ID: <91ad5bf80704221549g105a61f8p2dca945e1895d2db@mail.gmail.com>

On 4/22/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:

> Steven Bethard wrote:
> > if there's nothing to be passed to the function, why make it a
> > function at all?
>
> I don't usually like to put big lumps of init code
> at the module level, because it pollutes the module
> namespace with local variables. So I typically end
> up with
>
>    def main():
>      ...
>      ...
>      ...
>
>    if __name__ == "__main__":
>      main()
>
> So I'd be quite happy if I could just define a
> function called __main__() and be done with. I
> don't understand why there's so much opposition
> to that idea.

+1. Although I may start out at the module level, that's typically the
idiom I use eventually for any non-trivial (e.g. more than 1-2 lines)
main*.

George


* Only exception is if the module consists essentially of main(), i.e.
a small standalone script without classes, functions, etc.


From aahz at pythoncraft.com  Mon Apr 23 01:09:26 2007
From: aahz at pythoncraft.com (Aahz)
Date: Sun, 22 Apr 2007 16:09:26 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <d11dcfba0704212210l44155aaek8fe05ad534d5f635@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
	<d11dcfba0704212210l44155aaek8fe05ad534d5f635@mail.gmail.com>
Message-ID: <20070422230926.GB7208@panix.com>

On Sat, Apr 21, 2007, Steven Bethard wrote:
>
> Note that the one benefit the sys.main-only variant doesn't have is
> the lower cognitive load of just having to know about __main__,
> instead of having to know about __name__, import and sys.main.

From my POV that is indeed a lower cognitive load because all I need to
remember is to look in the docs for the sys module -- everything else is
there.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"...string iteration isn't about treating strings as sequences of strings, 
it's about treating strings as sequences of characters.  The fact that
characters are also strings is the reason we have problems, but characters 
are strings for other good reasons."  --Aahz


From aahz at pythoncraft.com  Mon Apr 23 01:11:38 2007
From: aahz at pythoncraft.com (Aahz)
Date: Sun, 22 Apr 2007 16:11:38 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <d11dcfba0704221132v184b66ddoe2bf6ca87c23c2a0@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
	<d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>
	<f0g8ok$5kp$1@sea.gmane.org>
	<d11dcfba0704221132v184b66ddoe2bf6ca87c23c2a0@mail.gmail.com>
Message-ID: <20070422231138.GC7208@panix.com>

On Sun, Apr 22, 2007, Steven Bethard wrote:
> On 4/22/07, Christian Heimes <lists at cheimes.de> wrote:
>>
>> I'm proposing the following changes:
>>
>> * sys.main is added which contains the dotted name of the main script.
>>    This allows code like:
>>
>>      if __name__ == sys.main:
>>          ...
> 
> Note that this really requires the code::
> 
>     import sys
>     if __name__ == sys.main:
> 
> The import statement matters to me because 77% of my modules that use
> the __main__ idiom *don't* import sys. Hence, for those modules, this
> new idiom introduces more boilerplate.

Does this follow the axiom that 83% of all statistics are made up on the
spot?  ;-)  Seriously, if I'm writing a script that requires __main__,
chances are excellent that it already includes sys (because it's
probably a command-line script that's graduating to module status).
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"...string iteration isn't about treating strings as sequences of strings, 
it's about treating strings as sequences of characters.  The fact that
characters are also strings is the reason we have problems, but characters 
are strings for other good reasons."  --Aahz


From rrr at ronadam.com  Mon Apr 23 01:50:17 2007
From: rrr at ronadam.com (Ron Adam)
Date: Sun, 22 Apr 2007 18:50:17 -0500
Subject: [Python-ideas] PEP for executing a module in a
 package	containing relative imports
In-Reply-To: <fb6fbf560704221226m17e9288fs3157cbe2cb217b58@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>	<20070421135948.63AF.JCARLSON@uci.edu>	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>	<d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>	<f0g8ok$5kp$1@sea.gmane.org>	<d11dcfba0704221132v184b66ddoe2bf6ca87c23c2a0@mail.gmail.com>
	<fb6fbf560704221226m17e9288fs3157cbe2cb217b58@mail.gmail.com>
Message-ID: <462BF4B9.40403@ronadam.com>

Jim Jewett wrote:

> As long as we're in python-ideas, I'll throw out the radical
> suggestion of auto-importing sys into builtins, the way os autoimports
> path.

+1

I thought that this was discussed before and had gotten general approval.



Also it makes sense to me to have additional entries in sys to identify the 
starting main, and the root package modules.  I also like the idea of 
having a way to say this_module.

if __module__ is sys.__main__:
    ...

Notice names aren't used this way, which is generally how you would compare 
any object in Python.  You wouldn't try to get its name and then compare 
that to the name of another object.
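
A rough sketch of that comparison using only what exists today. Neither 
__module__ (as a reference to the module) nor sys.__main__ exists, so 
current_module and is_main_module below are hypothetical helper names; the 
only real pieces are sys.modules and its "__main__" entry.

```python
import sys


def current_module(name):
    """Return the module object registered under `name` (pass __name__)."""
    return sys.modules[name]


def is_main_module(module):
    """Compare module objects by identity instead of comparing their names."""
    return module is sys.modules.get("__main__")
```

A module would write is_main_module(current_module(__name__)), which is 
clearly wordier than the proposed spelling; the point is only that the test 
is on objects, not on strings.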

Ron


From ntoronto at cs.byu.edu  Mon Apr 23 01:51:26 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Sun, 22 Apr 2007 17:51:26 -0600
Subject: [Python-ideas] PEP for executing a module in a
 package	containing relative imports
In-Reply-To: <d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>
References: <20070419235504.6374.JCARLSON@uci.edu>	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>	<20070421135948.63AF.JCARLSON@uci.edu>	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
	<d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>
Message-ID: <462BF4FE.1050103@cs.byu.edu>

Steven Bethard wrote:
> On 4/22/07, Jim Jewett <jimjjewett at gmail.com> wrote:
>   
>> # Equivalent to today
>> if __name__ == sys.modules["__main__"].__name__:
>>
>> # Better than today
>> if __name__ is sys.modules["__main__"].__name__:
>>
>> # What I would like (pending PEP I hope to write tonight)
>> if __this_module__ is sys.modules["__main__"]:
>>     
>
> Is it just me, or are the proposals starting to look more and more like::
>
>     public static void main(String args[])
>
> I think this PEP now needs to explicitly state that keeping the "am I
> the main module?" idiom as simple as possible is *not* a goal. Because
> everything I've seen (except for the original proposals in the PEP)
> are substantially more complicated than the current::
>
>     if __name__ == '__main__':
>
> I guess I don't understand why we wouldn't be willing to put up with a
> new module attribute or builtin to minimize the boilerplate in pretty
> much every Python application out there.
>   

Agreed - it's getting horrid. As Pythonic as they think this is, they're 
completely forgetting the newb.

So let's look at it from his point of view. Say I'm a Python newb. I've 
written some modules and some executable Python scripts and I'm somewhat 
comfy with the language. (Of course, it only took me about two hours to 
get comfy - this is Python, after all.) I now want to write either:

1) A module that runs unit tests when it is run as a script, but not 
when it's just imported; or
2) A script that can be imported as a module when I need a few of its 
functions. (I should really split them into another module, but this is 
a use case.)

Now I have to import sys? Never seen that one... okay. Imported. Now, 
what's this Greek I have to write to test whether the script is the main 
script? How am I supposed to remember this? This is worse than fork()!

On the other hand, IMNSHO, either of the following two is just about 
perfect in terms of understandability and parsimony:

    def __main__():  # we really don't need args here
        # stuff

    if __main__:
        # stuff


Chances are, the first will be very familiar, but refreshing that it's 
just a plain old, gibberish-free function. Both are easier than what 
we've got currently. (IMO, the first is better, because 1) the code can 
be put anywhere in the module; 2) it automatically doesn't pollute the 
global namespace; and 3) it's less boilerplate for complex modules and 
no more boilerplate for simple ones.)
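
For what it's worth, the first spelling can be approximated today with a 
small helper. This is only a sketch: run_main is a made-up name, and the 
interpreter does nothing of the sort on its own.

```python
def run_main(namespace):
    """Call the module's __main__() function, but only when run as a script.

    `namespace` is expected to be the module's globals() dictionary.
    """
    main = namespace.get("__main__")
    if namespace.get("__name__") == "__main__" and callable(main):
        return main()
    return None
```

A module would define __main__() anywhere it likes and end with 
run_main(globals()). That is still one line of boilerplate, so it 
illustrates the idea rather than delivering it.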

FWIW, I don't see a problem with a sys.modules['__main__'] - it would 
even occasionally be useful - but nobody should be *required* to use an 
abomination like that for what's clearly a newbie task: determining 
whether a module is run as a script.

Neil



From adam at atlas.st  Mon Apr 23 02:00:44 2007
From: adam at atlas.st (Adam Atlas)
Date: Sun, 22 Apr 2007 20:00:44 -0400
Subject: [Python-ideas] PEP for executing a module in a
 package	containing relative imports
In-Reply-To: <462BF4B9.40403@ronadam.com>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
	<d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>
	<f0g8ok$5kp$1@sea.gmane.org>
	<d11dcfba0704221132v184b66ddoe2bf6ca87c23c2a0@mail.gmail.com>
	<fb6fbf560704221226m17e9288fs3157cbe2cb217b58@mail.gmail.com>
	<462BF4B9.40403@ronadam.com>
Message-ID: <EA902AA4-A363-4967-823F-357E41AC397E@atlas.st>


On 22 Apr 2007, at 19.50, Ron Adam wrote:
> I also like the idea of
> having a way to say this_module.
>
> if __module__ is sys.__main__:
>     ...

Agreed... I suggested something like that a couple of days ago  
(except assuming __main__ would be a builtin global instead of in  
sys). I proposed __this__ as the name for accessing the current  
module. Mainly because I like the Englishlike way it reads: "if  
__this__ is __main__". 'If this is main' -- couldn't be simpler.

Though I'd also be fine with sys.__main__ or sys.main (I'd prefer the  
latter). I would support having sys be an automatic global.


From lists at cheimes.de  Mon Apr 23 02:35:14 2007
From: lists at cheimes.de (Christian Heimes)
Date: Mon, 23 Apr 2007 02:35:14 +0200
Subject: [Python-ideas] PEP for executing a module in a package
 containing relative imports
In-Reply-To: <462BF4FE.1050103@cs.byu.edu>
References: <20070419235504.6374.JCARLSON@uci.edu>	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>	<20070421135948.63AF.JCARLSON@uci.edu>	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>	<d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>
	<462BF4FE.1050103@cs.byu.edu>
Message-ID: <f0gv06$nls$1@sea.gmane.org>

Neil Toronto wrote:
> On the other hand, IMNSHO, either of the following two are just about 
> perfect in terms of understandability, and parsimony:
> 
>     def __main__():  # we really don't need args here
>         # stuff

I think __main__(*argv) has some benefits over __main__(). It allows you 
to call the function with different arguments from another script or a 
unit test.

def __main__(argv=None):
     if argv is None:
         argv = sys.argv # has the same effect but it is ugly

> FWIW, I don't see a problem with a sys.modules['__main__'] - it would 
> even occasionally be useful - but nobody should be *required* to use an 
> abomination like that for what's clearly a newbie task: determining 
> whether a module is run as a script.

I see the problem in having the same module under two names in 
sys.modules. It may lead to issues (reload?). Also it is not necessary 
to get the main module if we store the dotted name in sys.main. So 
sys.modules[sys.main] would return the main module.

Christian



From steven.bethard at gmail.com  Mon Apr 23 03:18:51 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 22 Apr 2007 19:18:51 -0600
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <20070422231138.GC7208@panix.com>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
	<fb6fbf560704221011i333cc553ge287421c2bf781f2@mail.gmail.com>
	<d11dcfba0704221042g588fab6sd3a0b831275695fe@mail.gmail.com>
	<f0g8ok$5kp$1@sea.gmane.org>
	<d11dcfba0704221132v184b66ddoe2bf6ca87c23c2a0@mail.gmail.com>
	<20070422231138.GC7208@panix.com>
Message-ID: <d11dcfba0704221818r1c6d501fyaf8d4e5366b84629@mail.gmail.com>

On 4/22/07, Aahz <aahz at pythoncraft.com> wrote:
> On Sun, Apr 22, 2007, Steven Bethard wrote:
> > On 4/22/07, Christian Heimes <lists at cheimes.de> wrote:
> >>
> >> I'm proposing the following changes:
> >>
> >> * sys.main is added which contains the dotted name of the main script.
> >>    This allows code like:
> >>
> >>      if __name__ == sys.main:
> >>          ...
> >
> > Note that this really requires the code::
> >
> >     import sys
> >     if __name__ == sys.main:
> >
> > The import statement matters to me because 77% of my modules that use
> > the __main__ idiom *don't* import sys. Hence, for those modules, this
> > new idiom introduces more boilerplate.
>
> Does this follow the axiom that 83% of all statistics are made up on the
> spot?  ;-)  Seriously, if I'm writing a script that requires __main__,
> chances are excellent that it already includes sys (because it's
> probably a command-line script that's graduating to module status).

No, I actually went and counted in my local repository. There are two
main reasons why that's true:
(1) Most unittest modules just run unittest.main(), so no import of sys.
(2) Most other modules use optparse or argparse, so no import of sys.
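
A count like this can be reproduced with a short script. The sketch below
is one possible way to do it and is not necessarily how the numbers above
were obtained; uses_main_idiom and imports_sys are hypothetical helper
names built on the standard ast module, and only the common
`__name__ == "__main__"` operand order is recognised.

```python
import ast


def uses_main_idiom(source):
    """True if the source contains an `if __name__ == "__main__":` test."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.If) and isinstance(node.test, ast.Compare):
            left = node.test.left
            right = node.test.comparators[0]
            if (isinstance(left, ast.Name) and left.id == "__name__"
                    and isinstance(right, ast.Constant)
                    and right.value == "__main__"):
                return True
    return False


def imports_sys(source):
    """True if the source has `import sys` or `from sys import ...`."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            if any(alias.name == "sys" for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom) and node.module == "sys":
            return True
    return False
```

Running both functions over every file in a repository and counting the
files where the first is true and the second is false gives the
percentage in question.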

Steve
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From steven.bethard at gmail.com  Mon Apr 23 03:24:43 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 22 Apr 2007 19:24:43 -0600
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <462BDF2E.1070004@canterbury.ac.nz>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<f0ag3v$egs$1@sea.gmane.org>
	<d11dcfba0704201514u71a6d27dg331737dbb9024a76@mail.gmail.com>
	<f0d6tm$5n2$1@sea.gmane.org>
	<d11dcfba0704211112o64fa988cnd4b4243bce829152@mail.gmail.com>
	<462BDF2E.1070004@canterbury.ac.nz>
Message-ID: <d11dcfba0704221824o3e90f72dgfc6a0cca064a7f26@mail.gmail.com>

On 4/22/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Steven Bethard wrote:
> > if there's nothing to be passed to the function, why make it a
> > function at all?
>
> I don't usually like to put big lumps of init code
> at the module level, because it pollutes the module
> namespace with local variables. So I typically end
> up with
>
>    def main():
>      ...
>      ...
>      ...
>
>    if __name__ == "__main__":
>      main()
>
> So I'd be quite happy if I could just define a
> function called __main__() and be done with. I
> don't understand why there's so much opposition
> to that idea.

I guess I'm just the odd one out here in that I parse my arguments
before passing them to module-level functions. So my code normally
looks like::

    if __name__ == '__main__':
        ... a few lines of argument parsing code ...
        some_function_name(args.foo, args.bar, args.baz)

That is, I do the argument parsing at the module level, and then call
the module functions with more meaningful arguments than sys.argv.
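
A fuller sketch of that layout, under some assumptions: the option names
and the body of some_function_name are invented for illustration, and the
parsing is wrapped in a helper (parse_arguments, a hypothetical name) only
so that it can be exercised without a real command line.

```python
import argparse


def some_function_name(foo, bar, baz):
    """Stand-in for the real work; it only ever sees already-parsed values."""
    return "%s:%d:%s" % (foo, bar, baz)


def parse_arguments(argv=None):
    """Parse argv (or sys.argv[1:] when argv is None) into a namespace."""
    parser = argparse.ArgumentParser(description="hypothetical example script")
    parser.add_argument("--foo", default="spam")
    parser.add_argument("--bar", type=int, default=0)
    parser.add_argument("--baz", action="store_true")
    return parser.parse_args(argv)


# At module level the script would then do:
#     if __name__ == '__main__':
#         args = parse_arguments()
#         some_function_name(args.foo, args.bar, args.baz)
```

Note that nothing here needs an explicit import of sys; argparse reads the
command line itself.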

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From steven.bethard at gmail.com  Mon Apr 23 03:28:19 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 22 Apr 2007 19:28:19 -0600
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <20070422230926.GB7208@panix.com>
References: <20070419235504.6374.JCARLSON@uci.edu>
	<bbaeab100704201011w5afc877q4ea34d2d5d92e180@mail.gmail.com>
	<20070421135948.63AF.JCARLSON@uci.edu>
	<bbaeab100704211649h1869707ei36ebc5d98780b3ba@mail.gmail.com>
	<d11dcfba0704212210l44155aaek8fe05ad534d5f635@mail.gmail.com>
	<20070422230926.GB7208@panix.com>
Message-ID: <d11dcfba0704221828y39f4938aob026873706f3b85b@mail.gmail.com>

On 4/22/07, Aahz <aahz at pythoncraft.com> wrote:
> On Sat, Apr 21, 2007, Steven Bethard wrote:
> >
> > Note that the one benefit the sys.main-only variant doesn't have is
> > the lower cognitive load of just having to know about __main__,
> > instead of having to know about __name__, import and sys.main.
>
> From my POV that is indeed a lower cognitive load because all I need to
> remember is to look in the docs for the sys module -- everything else is
> there.

As a newbie, you need to remember to look up at least two things:
__name__ and sys.main.  As compared to having to look up just __main__.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From brett at python.org  Mon Apr 23 03:46:39 2007
From: brett at python.org (Brett Cannon)
Date: Sun, 22 Apr 2007 18:46:39 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
Message-ID: <bbaeab100704221846w41803110t5ced56029c9fe30b@mail.gmail.com>

I revised the PEP to use the sys.main idea and sent it off to
python-3000.  If you care to participate in the discussion please move
it over there.

Thanks to everyone who contributed to the discussion.  I really
appreciate the help!

-Brett


From steven.bethard at gmail.com  Mon Apr 23 04:30:18 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 22 Apr 2007 20:30:18 -0600
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <bbaeab100704221846w41803110t5ced56029c9fe30b@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<bbaeab100704221846w41803110t5ced56029c9fe30b@mail.gmail.com>
Message-ID: <d11dcfba0704221930v36781401o3108d9e821be635b@mail.gmail.com>

On 4/22/07, Brett Cannon <brett at python.org> wrote:
> I revised the PEP to use the sys.main idea and sent it off to
> python-3000.

Just wanted to say thanks Brett for putting the time into this!

Steve
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy


From jimjjewett at gmail.com  Mon Apr 23 05:05:14 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sun, 22 Apr 2007 23:05:14 -0400
Subject: [Python-ideas] PEP 30xx: Access to Module/Class/Function Currently
	Being Defined (this)
Message-ID: <fb6fbf560704222005le4798a4j5daa5e71e644f069@mail.gmail.com>

(Please note that several groups were Cc'd.  For now, please limit
followups to python-3000.  This would *probably* be backported to 2.6,
but that wouldn't be decided until the implementation strategy was
settled.)

PEP: 30XX
Title: Access to Module/Class/Function Currently Being Defined (this)
Version: $Revision$
Last-Modified: $Date$
Author: Jim J. Jewett <jimjjewett at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 22-Apr-2007
Python-Version: 3.0
Post-History: 22-Apr-2007


Abstract

    It is common to need a reference to the current module, class,
    or function, but there is currently no entirely correct way to
    do this.  This PEP proposes adding the keywords __module__,
    __class__, and __function__.


Rationale

    Many modules export various functions, classes, and other objects,
    but will perform additional activities (such as running unit tests)
    when run as a script.  The current idiom is to test whether the
    module's name has been set to a magic value.

        if __name__ == "__main__": ...

    More complicated introspection requires a module to (attempt to)
    import itself.  If importing the expected name actually produces
    a different module, there is no good workaround.

    Proposal:  Add a __module__ keyword which refers to the module
    currently being defined (executed).  (But see open issues.)

        if __module__ is sys.main: ...   # assuming PEP 3020, Cannon


    Class methods are passed the current instance; from this they can
    determine self.__class__ (or cls, for classmethods).  Unfortunately,
    this reference is to the object's actual class, which may be a
    subclass of the defining class.  The current workaround is to repeat
    the name of the class, and assume that the name will not be rebound.

        class C(B):
            def meth(self):
                super(C, self).meth() # Hope C is never rebound.

        class D(C):
            def meth(self):
                super(C, self).meth() # ?!? issubclass(D,C), so it "works"

    Proposal:  Add a __class__ keyword which refers to the class currently
    being defined (executed).  (But see open issues.)

        class C(B):
            def meth(self):
                super(__class__, self).meth()
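
    The failure mode in class D can be made visible with methods that
    record which implementations ran.  This is an illustrative sketch
    only.  The __class__ spelling used in class E assumes an interpreter
    in which a method can refer to its defining class by that name
    (Python 3 behaves this way); that is an assumption about the runtime
    used to run the example, not a statement about this draft.

```python
class B:
    def meth(self):
        return ["B"]


class C(B):
    def meth(self):
        return ["C"] + super(C, self).meth()


class D(C):
    def meth(self):
        # Copy-and-paste slip: naming C starts the search *after* C,
        # so C.meth is silently skipped.
        return ["D"] + super(C, self).meth()


class E(C):
    def meth(self):
        # No class name to get wrong: __class__ is the class being defined.
        return ["E"] + super(__class__, self).meth()


skipped = D().meth()    # C.meth never runs
complete = E().meth()   # every class in the chain runs
```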

    Note that super calls may be further simplified by PEP 30XX, Jewett.
    The __class__ (or __this_class__) attribute came up in attempts to
    simplify the explanation and/or implementation of that PEP, but was
    separated out as an independent decision.

    Note that __class__ (or __this_class__) is not quite the same as the
    __thisclass__ property on bound super objects.  The existing
    super.__thisclass__ property refers to the class from which the Method
    Resolution Order search begins.  In the above class D, it would refer to
    (the current reference of name) C.
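
    A small check of the existing attribute, as a sketch (the class
    bodies are placeholders):

```python
class B:
    def meth(self):
        return "B"


class C(B):
    pass


class D(C):
    pass


instance = D()
bound_super = super(C, instance)

# __thisclass__ is the class named in the super() call (where the MRO
# search starts), not the class of the instance.
this_class = bound_super.__thisclass__
instance_class = bound_super.__self_class__
```

    Here this_class is C and instance_class is D, which is why the
    existing property cannot stand in for "the class currently being
    defined".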


    Functions (including methods) often want access to themselves,
    usually for a private storage location.  While there are several
    workarounds, all have their drawbacks.

        def counter(_total=[0]):   # _total shouldn't really appear in the
            _total[0] += 1         # signature at all; the list wrapping and
            return _total[0]       # [0] unwrapping obscure the code

        @annotate(total=0)
        def counter():
            counter.total += 1    # Assume name counter is never rebound
            return counter.total

        class _wrap(object):  # class exists only to provide storage
            __total=0
            def f(self):
                self.__total += 1
                return self.__total
        accum=_wrap().f       # set module attribute to a bound method

    Proposal:  Add a __function__ keyword which refers to the function
    (or method) currently being defined (executed).  (But see open issues.)

        @annotate(total=0)
        def counter():
            __function__.total += 1    # Always refers to this function obj
            return __function__.total
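
    The drawback of the name-based workaround can be reproduced today.
    In this sketch, annotate is a hypothetical decorator written out for
    illustration (no such decorator ships with Python); it simply sets
    attributes on the function.

```python
def annotate(**attrs):
    """Hypothetical decorator: attach each keyword as a function attribute."""
    def decorator(func):
        for name, value in attrs.items():
            setattr(func, name, value)
        return func
    return decorator


@annotate(total=0)
def counter():
    counter.total += 1       # looks up the global name "counter" on every call
    return counter.total


first = counter()
second = counter()

alias = counter
counter = None               # rebinding the name breaks the function body

try:
    alias()
    failure = None
except AttributeError as exc:
    failure = exc            # None has no attribute "total"
```

    A __function__ reference would keep alias() working, because it
    would not depend on the global name at all.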


Backwards Compatibility

    While a user could be using these names already, __anything__ names
    are explicitly reserved to the interpreter.  It is therefore acceptable
    to introduce special meaning to these names within a single feature
    release.


Implementation

    Ideally, these names would be keywords treated specially by the bytecode
    compiler.

    Guido has suggested [1] using a cell variable filled in by the metaclass.

    Michele Simionato has provided a prototype using bytecode hacks [2].


Open Issues

    - Are __module__, __class__, and __function__ the right names?
      In particular, should the names include the word "this", either as
      __this_module__, __this_class__, and __this_function__, (format
      discussed on the python-3000 and python-ideas lists) or as
      __thismodule__, __thisclass__, and __thisfunction__ (inspired by,
      but conflicting with, current usage of super.__thisclass__).

    - Are all three keywords needed, or should this enhancement be limited
      to a subset of the objects?  Should methods be treated separately from
      other functions?


References

    [1] Fixing super anyone?  Guido van Rossum
        http://mail.python.org/pipermail/python-3000/2007-April/006671.html

    [2] Descriptor/Decorator challenge,  Michele Simionato
        http://groups.google.com/group/comp.lang.python/browse_frm/thread/a6010c7494871bb1/62a2da68961caeb6?lnk=gst&q=simionato+challenge&rnum=1&hl=en#62a2da68961caeb6


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:


From adam at atlas.st  Mon Apr 23 05:22:36 2007
From: adam at atlas.st (Adam Atlas)
Date: Sun, 22 Apr 2007 23:22:36 -0400
Subject: [Python-ideas] Object adaptation and interfaces and so forth
Message-ID: <0F15F5F9-5104-4D02-A708-EC3FFDE4047F@atlas.st>

(Not exactly an idea post, but I don't want to bother python-dev or  
python-3000 with this.)

PEP 246 was rejected a year or so ago, and Guido's rejection note  
stated "Something much better is about to happen; it's too early to  
say exactly what, but it's not going to resemble the proposal in this  
PEP." Does anyone know if anything has gone on with this concept  
since then? It seems like it has a lot of really interesting  
potential, although I do see why PEP 246's specific proposal was  
rejected. It's just the "Something much better is about to happen"  
that got me curious -- is it happening yet? :)


From jimjjewett at gmail.com  Mon Apr 23 05:33:57 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sun, 22 Apr 2007 23:33:57 -0400
Subject: [Python-ideas] Object adaptation and interfaces and so forth
In-Reply-To: <0F15F5F9-5104-4D02-A708-EC3FFDE4047F@atlas.st>
References: <0F15F5F9-5104-4D02-A708-EC3FFDE4047F@atlas.st>
Message-ID: <fb6fbf560704222033q71a885ffuf17f922f0a6728ca@mail.gmail.com>

I think the rejection was referring to parameter annotations.

def f(arg1:int, arg2:"woot the bounding main"): ...

In combination with decorations, this can provide adaptation.
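
A minimal sketch of what that combination could look like, assuming the
simplest possible policy: every annotation that is callable is used as a
converter for its argument.  adapt_arguments is a hypothetical name, and
this is one reading of "adaptation", not the PEP 246 protocol.

```python
import functools
import inspect


def adapt_arguments(func):
    """Hypothetical decorator: pass each argument through its annotation."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in list(bound.arguments.items()):
            annotation = signature.parameters[name].annotation
            # Skip unannotated parameters and non-callable annotations.
            if annotation is not inspect.Parameter.empty and callable(annotation):
                bound.arguments[name] = annotation(value)
        return func(*bound.args, **bound.kwargs)

    return wrapper


@adapt_arguments
def add(a: int, b: float, label="sum"):
    return label, a + b
```

Calling add("2", "1.5") converts the strings before the body runs, while
the unannotated label is passed through untouched.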

-jJ


On 4/22/07, Adam Atlas <adam at atlas.st> wrote:
> (Not exactly an idea post, but I don't want to bother python-dev or
> python-3000 with this.)
>
> PEP 246 was rejected a year or so ago, and Guido's rejection note
> stated "Something much better is about to happen; it's too early to
> say exactly what, but it's not going to resemble the proposal in this
> PEP." Does anyone know if anything has gone on with this concept
> since then? It seems like it has a lot of really interesting
> potential, although I do see why PEP 246's specific proposal was
> rejected. It's just the "Something much better is about to happen"
> that got me curious -- is it happening yet? :)


From talin at acm.org  Mon Apr 23 05:45:29 2007
From: talin at acm.org (Talin)
Date: Sun, 22 Apr 2007 20:45:29 -0700
Subject: [Python-ideas] Object adaptation and interfaces and so forth
In-Reply-To: <0F15F5F9-5104-4D02-A708-EC3FFDE4047F@atlas.st>
References: <0F15F5F9-5104-4D02-A708-EC3FFDE4047F@atlas.st>
Message-ID: <462C2BD9.5060004@acm.org>

Adam Atlas wrote:
> (Not exactly an idea post, but I don't want to bother python-dev or  
> python-3000 with this.)
> 
> PEP 246 was rejected a year or so ago, and Guido's rejection note  
> stated "Something much better is about to happen; it's too early to  
> say exactly what, but it's not going to resemble the proposal in this  
> PEP." Does anyone know if anything has gone on with this concept  
> since then? It seems like it has a lot of really interesting  
> potential, although I do see why PEP 246's specific proposal was  
> rejected. It's just the "Something much better is about to happen"  
> that got me curious -- is it happening yet? :)

Reminds me of that scene from 2010:

   Dave Bowman: You see, something's going to happen. You must leave.
   Heywood Floyd: What? What's going to happen?
   Dave Bowman: Something wonderful.
   Heywood Floyd: What?
   Dave Bowman: I understand how you feel. You see, it's all very clear
     to me now. The whole thing. It's wonderful.

The answer to your question is "yes", although it's happening in very 
small stages. Specifically, Python 3000's argument decorators and 
abstract base classes are laying the groundwork for an adaptation system 
via generic functions.

Argument decorators make declaring generic functions much less 
cumbersome than was previously possible. And abstract base classes give 
the generic functions something to work on that is more general than 
merely working on concrete types - they provide a way to reason about 
types in a duck-typing world.

What happens next is that there will be various 3rd party 
implementations of generic function dispatch which will be based on 
those two things. Phillip J. Eby has already stated that he is 
interested in creating a kind of reference implementation that 
incorporates most of the interesting features, however his need not be 
the only one.

These generic function dispatchers, working off of both concrete and 
abstract types, can be used to implement object adaptation in various 
ways. (If anyone wants to supply some concrete examples here, please be 
my guest.)
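
One possible concrete example, offered as a sketch: it uses 
functools.singledispatch and the collections.abc classes from later 
standard library releases as a stand-in for the third-party dispatchers 
described above, so it shows the shape of the idea rather than any of 
those implementations. The function name as_pairs is invented.

```python
from collections.abc import Mapping, Sequence
from functools import singledispatch


@singledispatch
def as_pairs(obj):
    """Adapt an object to a list of (key, value) pairs."""
    raise TypeError("cannot adapt %r to pairs" % type(obj).__name__)


@as_pairs.register(Mapping)
def _(obj):
    # Anything that satisfies the Mapping ABC, not just dict.
    return sorted(obj.items())


@as_pairs.register(Sequence)
def _(obj):
    # Anything that satisfies the Sequence ABC: list, tuple, str, ...
    return list(enumerate(obj))
```

The dispatch is on abstract types, so a class that merely registers 
itself as a Mapping or Sequence is adapted without as_pairs knowing 
about it.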

-- Talin



From brett at python.org  Mon Apr 23 05:58:02 2007
From: brett at python.org (Brett Cannon)
Date: Sun, 22 Apr 2007 20:58:02 -0700
Subject: [Python-ideas] PEP for executing a module in a package
	containing relative imports
In-Reply-To: <d11dcfba0704221930v36781401o3108d9e821be635b@mail.gmail.com>
References: <bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com>
	<bbaeab100704221846w41803110t5ced56029c9fe30b@mail.gmail.com>
	<d11dcfba0704221930v36781401o3108d9e821be635b@mail.gmail.com>
Message-ID: <bbaeab100704222058n129c423t21bf6748b5256592@mail.gmail.com>

On 4/22/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 4/22/07, Brett Cannon <brett at python.org> wrote:
> > I revised the PEP to use the sys.main idea and sent it off to
> > python-3000.
>
> Just wanted to say thanks Brett for putting the time into this!
>

Welcome.  I am just glad I got the email off literally 15 minutes or
so before my laptop died.  So if the hard drive is gone at least I
have the latest version still.  =)


From ironfroggy at gmail.com  Mon Apr 23 17:44:48 2007
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Mon, 23 Apr 2007 11:44:48 -0400
Subject: [Python-ideas] partial with skipped arguments
In-Reply-To: <43aa6ff70704220914l489de151ta648c485155446f8@mail.gmail.com>
References: <76fd5acf0704212114y22db7080je41e378c6ceaa994@mail.gmail.com>
	<43aa6ff70704220914l489de151ta648c485155446f8@mail.gmail.com>
Message-ID: <76fd5acf0704230844t39ed2f47oe0df07d6e2915cf1@mail.gmail.com>

On 4/22/07, Collin Winter <collinw at gmail.com> wrote:
> On 4/21/07, Calvin Spealman <ironfroggy at gmail.com> wrote:
> > I often wish you could bind to arguments in a partial out of order,
> > skipping some positionals. The solution I came up with is a singleton
> > object located as an attribute of the partial function itself and used
> > like this:
> >
> > def foo(a, b):
> >     return a / b
> > pf = partial(foo, partial.skip, 2)
> > assert pf(1.0) == 0.5
>
> In Python 2.5.0:
>
> >>> import functools
> >>> def f(a, b):
> ...     return a + b
> ...
> >>> p = functools.partial(f, b=9)
> >>> p
> <functools.partial object at 0xb7d66194>
> >>> p(3)
> 12
> >>>
>
> Is this what you're looking for?
>
> Collin Winter
>

More or less, but that poses two problems that I mentioned previously:
1) Relying on the names of positional arguments does not feel right.
2) Builtin and extension functions don't work with that because you
can't pass positionals to them by name.

Besides, it's a good exercise for me to finally get into any
moderately real hacking of CPython. I'm working on the patch right
now, one way or the other.
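
In the meantime, here is a pure-Python sketch of the behaviour I have in
mind. The names SKIP and partial_skip are placeholders for illustration;
functools.partial has no skip attribute, and the eventual patch may look
quite different.

```python
SKIP = object()   # placeholder sentinel marking a position to fill in later


def partial_skip(func, *bound):
    """Like a positional-only partial, but SKIP slots are filled at call time."""
    def wrapper(*args, **kwargs):
        remaining = list(args)
        merged = []
        for slot in bound:
            if slot is SKIP:
                if not remaining:
                    raise TypeError("missing argument for a skipped position")
                merged.append(remaining.pop(0))
            else:
                merged.append(slot)
        # Any leftover call-time arguments go after the bound ones.
        return func(*merged, *remaining, **kwargs)
    return wrapper


def foo(a, b):
    return a / b


pf = partial_skip(foo, SKIP, 2)
third_of = partial_skip(divmod, SKIP, 3)   # divmod takes no keyword arguments
```

The divmod case is the one that keyword binding cannot handle.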

-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/


From ironfroggy at gmail.com  Tue Apr 24 16:04:25 2007
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Tue, 24 Apr 2007 10:04:25 -0400
Subject: [Python-ideas] Removing instancemethod in favor of partial?
Message-ID: <76fd5acf0704240704m1b38d39fq189bd6e8f1f511ed@mail.gmail.com>

Hey, why not? They do basically the same thing, except instancemethod
allows only a single argument. Why not allow class and instance
methods to be wrapped with a partial instead of their own type? We can
rip out 300 lines of C code supporting instance method, at least. The
only thorn is the im_class attribute, but few seem to even use it (few
meaning just Twisted, according to Google Code Search). Anyway, I
figure we don't really need it anyway because if some of the proposals
for a way to reliably get the current function, class, or module go
through, perhaps we'll have a reference to the class from the function
itself. What does anyone think?
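
To show what I mean by "basically the same thing", here is a sketch
(Greeter is a throwaway class; the attribute spellings are the Python 3
ones, where a bound method exposes __self__ and __func__):

```python
import functools


class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        return "%s, %s" % (greeting, self.name)


g = Greeter("world")

bound = g.greet                                  # ordinary bound method
emulated = functools.partial(Greeter.greet, g)   # instance bound by partial

same_result = bound("hello") == emulated("hello")
same_instance = bound.__self__ is emulated.args[0]
```

What the partial does not carry is any record of the class the method
was looked up on, which is the im_class thorn mentioned above.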

-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/


From grosser.meister.morti at gmx.net  Tue Apr 24 18:32:24 2007
From: grosser.meister.morti at gmx.net (=?ISO-8859-15?Q?Mathias_Panzenb=F6ck?=)
Date: Tue, 24 Apr 2007 18:32:24 +0200
Subject: [Python-ideas] Sandbox?
Message-ID: <462E3118.1040600@gmx.net>

Are there any plans on a sandbox for python 3.0?
Just wondering.


	-panzi


From brett at python.org  Tue Apr 24 19:54:39 2007
From: brett at python.org (Brett Cannon)
Date: Tue, 24 Apr 2007 10:54:39 -0700
Subject: [Python-ideas] Removing instancemethod in favor of partial?
In-Reply-To: <76fd5acf0704240704m1b38d39fq189bd6e8f1f511ed@mail.gmail.com>
References: <76fd5acf0704240704m1b38d39fq189bd6e8f1f511ed@mail.gmail.com>
Message-ID: <bbaeab100704241054v30324505qaa5580fe1499ad96@mail.gmail.com>

On 4/24/07, Calvin Spealman <ironfroggy at gmail.com> wrote:
> Hey, why not? They do basically the same thing, except instancemethod
> allows only a single argument. Why not allow class and instance
> methods to be wrapped with a partial instead of their own type? We can
> rip out 300 lines of C code supporting instance method, at least. The
> only thorn is the im_class attribute, but few seem to even use it (few
> meaning just Twisted, according to Google Code Search). Anyway, I
> figure we don't really need it anyway because if some of the proposals
> for a way to reliably get the current function, class, or module go
> through, perhaps we'll have a reference to the class from the function
> itself. What does anyone think?
>

Huh, interesting idea.  In theory it seems fine.  I would want to know
if any performance penalty exists from this first, though.

-Brett


From brett at python.org  Tue Apr 24 19:56:37 2007
From: brett at python.org (Brett Cannon)
Date: Tue, 24 Apr 2007 10:56:37 -0700
Subject: [Python-ideas] Sandbox?
In-Reply-To: <462E3118.1040600@gmx.net>
References: <462E3118.1040600@gmx.net>
Message-ID: <bbaeab100704241056u2a08b5d5r5f479bb15eb79158@mail.gmail.com>

On 4/24/07, Mathias Panzenböck <grosser.meister.morti at gmx.net> wrote:
> Are there any plans on a sandbox for python 3.0?
> Just wondering.

Specifically no.  I guess my security work is the closest thing to a
sandbox in the pipeline
(http://sayspy.blogspot.com/2007/04/python-security-paper-online.html).
 But I don't know if it is going to make it into Python 3 or not.

-Brett

From talin at acm.org  Thu Apr 26 07:59:44 2007
From: talin at acm.org (Talin)
Date: Wed, 25 Apr 2007 22:59:44 -0700
Subject: [Python-ideas] The case against static type checking,
	in detail (long)
Message-ID: <46303FD0.1020504@acm.org>

(This is a fragment of an email that I sent to Guido earlier, I mention 
this here so that Guido can skip reading it. Of course, I recognize that 
most people here already know all this - but since it relates to the 
recent discussion about the value of type checking, I'd like to post it 
here as a kind of "manifesto" of why Python is the way it is.)

Statically typed languages such as C++ and Java require up-front 
declaration of everything. It is the nature of such languages that there 
is a lot of cross-checking in the compiler between the declaration of a 
thing and its use. The idea is to prevent programmer errors by ensuring 
internal consistency.

However, we find in practice that much of the programmer's effort is 
spent in maintaining this cross-checking structure. To use a building 
analogy, a statically-typed program is like a truss structure, where 
there's a complex web of cross-braces, and a force applied at any given 
point is spread out over the whole structure. Each time such a program 
is modified, the programmer must partially dismantle and then 
re-assemble the existing structure. This takes time.

It also clutters the code. Reading the source code of a program written 
in a statically typed language reveals that a substantial part of the 
text serves only to support the compile-time checking of definitions, or 
provides visual redundancy to aid the programmer in connecting two 
aspects of the program which are defined far apart.

An example of what I mean is the use of variable type declarations - 
even in a statically typed language, it would be fairly easy for the 
compiler to automatically infer most variable types if the language were 
designed that way; the fact that the programmer is required to manually 
specify these types serves as an additional consistency check on the code.

However, time spent serving the needs of these consistency checks is 
time away from actually serving the functional purpose of the code. 
Programmers in Python, on the other hand, not only need not worry about 
type declarations, they also spend much less time worrying about 
converting from one type to another in order to meet the constraints of 
a particular API.

This is one of the reasons why I can generally write Python code about 4 
times as fast as the C++ equivalent.

(Understand that this is coming from someone who loves working in C++ 
and Java and has used them daily for the last 15 years. At the same 
time, however, I also enjoy programming in Python and I recognize that 
each language has its strengths.)

There is also the question of how much static typing helps improve 
program reliability.

In statically typed languages, there are two kinds of ways that types 
are used. In languages such as C and Pascal, the type declarations serve 
primarily as a consistency check. However, in C++ template 
metaprogramming, and in languages like Haskell, there is a second use 
for types, which is to provide a kind of type calculus or type 
inferencing, which gives additional expressive power to the language. 
C++ templates can act as powerful code generators, allowing the 
programmer to program in ever higher levels of abstraction and express 
the basic ideas even more succinctly and clearly than before.

In a rapid-prototyping environment, the second use of types can be a 
major productivity win; however, I would argue that the first use of 
types, consistency checking, is less beneficial, and is often more of a 
distraction to the programmer than a help. Yes, static type checking 
does detect some errors; but it also causes errors by making the code 
larger and more wordy, so that the programmer cannot hold large 
portions of the program in their mind all at once, which can lead to 
errors in overall design. It means the programmer spends more time 
thinking about the behavior of individual variables and less about the 
algorithm as a whole.

At this point, I want to talk about a related matter, another 
fundamental design aspect of Python which I call "decriminalization of 
minor errors".

An example of this is illustrated by the recent discussion over string 
slicing. As you know, when you attempt to index a string with a slice 
object that extends outside of the bounds of the string, the range is 
silently truncated.
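
A minimal sketch of the slicing behavior described above, as it works in current Python (the string and variable names are only illustrative). Note the contrast with plain indexing, which is still an error:

```python
s = "python"

# A slice that extends past the end of the string is silently truncated.
clipped = s[2:100]
empty = s[10:20]

# Indexing a single position out of range is not decriminalized.
try:
    s[10]
except IndexError as exc:
    index_error = exc

print(repr(clipped), repr(empty), type(index_error).__name__)
```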

Some argued that Python should be more strict, and report an error when 
this occurs - but instead, it was reaffirmed that the current behavior 
is correct. I would agree that this current behavior is the more 
Pythonic, and is part of a general pattern, which I shall attempt to 
describe:

To "decriminalize" an error means to find some alternative 
interpretation of the programmer's intent that produces a non-error 
result. That is, given a syntactical construct, and a choice of several 
interpretations of what that construct should mean, attempt to pick an 
interpretation that, when executed, does not produce an error.

In the design of the Python language, it is a regular practice to 
decriminalize minor errors, provided that the alternative interpretation 
can meet some fairly strict criteria: That it is useful, intuitive, 
reasonable, and pedagogically sound.

Note that this is a much more conservative rule than that used by 
languages such as Rexx, Javascript, and Perl, languages which make 
"heroic efforts" to bend the interpretation of an operation to a 
non-error result. Python does not do this.

Nor is decriminalizing errors the same as ignoring errors. Errors 
are still, and should be, enforced vigorously and reported. The 
distinction is that decriminalizing an error results in code that 
produces a useful, logical result, whereas ignoring errors results in 
code that produces garbage or nothing. Decriminalization comes about 
when we broaden our definitions of what is the correct result of a given 
operation.

A couple of other examples of decriminalization:

1) there are languages in which the only valid argument for an 'if' 
statement is a boolean. Attempts to say "if 0" are errors. In Python we 
relax that rule, allowing any type to be used as the argument to an 
if-statement. We do this by having a broader interpretation of what it 
means to test an object for 'trueness', and allow 'trueness' to be 
implied by 'non-emptiness'.

2) Duck-typing is a decriminalization of the error that polymorphic 
types are required to inherit from a common interface. It also 
decriminalizes "missing methods", as long as those methods are never 
called. Again, this is due to having a broader interpretation of 
'polymorphism'.
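
Both examples can be sketched in a few lines. The class and function names here (Duck, Robot, describe, chorus) are made up for illustration and are not from any library:

```python
def describe(container):
    # Any object can be the argument of 'if'; empty containers and
    # zero count as false, everything else here counts as true.
    if container:
        return "non-empty"
    return "empty"


class Duck:
    def speak(self):
        return "Quack"


class Robot:
    # Shares no base class with Duck and has no other Duck-like methods.
    def speak(self):
        return "Beep"


def chorus(things):
    # Only 'speak' is ever called, so that is all the objects need.
    return [thing.speak() for thing in things]


results = [describe([]), describe("abc"), describe(0), describe({"k": 1})]
voices = chorus([Duck(), Robot()])
print(results, voices)
```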

(In fact, this aspect of Python is so fundamental, that I think that it 
deserves its own acronym alongside TOOWTDI and others, but I can't think 
of a short, pithy description of it. Maybe IOANEIR - "Interpret 
operations as non-errors if reasonable.")

Both dynamic typing and decriminalization serve the same end - telling 
the programmer "don't sweat the small stuff". Both are very helpful and 
powerful, because they allow programmers to spend much less time 
worrying about minor error cases, things that would have to be checked 
for in C++. Python code is simply more *concise* than the C++ 
equivalent, yet it achieves this without being terse and cryptic, 
because the text of a Python program more closely embodies the "essence" 
of an algorithm, uncluttered by other agendas.

The price we pay for this, of course, is that sometimes errors show up 
much later (like, after ship) than they would have otherwise. But unit 
testing can catch a lot of the same errors.

And in many cases, the seriousness of such errors depends on what we 
mean by "ship". It's one thing to discover a fatal error after you've 
pressed thousands of CDs and shipped them all over the world; it's a 
much different matter if the program has the ability to automatically 
update itself, or is downloaded from some kind of subscription model 
such as a package manager.

In many environments, it is far more important to get something done 
quickly and validate the general concept, than it is to ensure that the 
code is 100% correct. In other words, if it would take you 6 months to 
write it in a statically typed language, but only 2 months to write it 
in a dynamic language - well, that's 4 extra months you have to write 
unit tests and make sure it's right! And in the meantime, you can have 
real users banging on the code and making sure of something that is far 
more important, which is whether what you wrote is the right thing at all.

-- Talin



From terry at jon.es  Thu Apr 26 02:52:16 2007
From: terry at jon.es (Terry Jones)
Date: Thu, 26 Apr 2007 02:52:16 +0200
Subject: [Python-ideas] Minor suggestion for unittest
Message-ID: <17967.63424.413682.569882@terry-jones-computer.local>

There's a simple change that could be made to unittest that would make it
easier to automate some forms of testing.

I want to be able to dynamically add tests to an instance of a class
derived from unittest.TestCase. There are occasions when I don't want to
write my tests upfront in a Python file. E.g., given a bunch of
test/expectedResult data sitting around (below in a variable named
MyTestData), it would be nice to be able to do the following (untested
here, but I did it earlier for real and it works fine):

    import unittest

    class Test(unittest.TestCase):
        def runTest(self): pass

    suite = unittest.TestSuite()

    for testFunc, expectedResult in MyTestData:
        newTestFuncName = 'dynamic-test-' + testFunc.__name__
        test = Test()
        # Bind the loop values as defaults so that each tester keeps its
        # own instance, function and expected result.
        def tester(test=test, testFunc=testFunc,
                   expectedResult=expectedResult):
            test.assertEqual(testFunc(), expectedResult)
        setattr(test, newTestFuncName, tester)
        # Set the class instance up so that it will be the one run.
        test.__init__(newTestFuncName)  # ugh!
        suite.addTest(test)

    suite.run(unittest.TestResult())

The explicit call to __init__ (marked ugh!) is ugly, dangerous, etc. You
could also say test._testMethodName = newTestFuncName (and set
_testMethodDoc too), but that's also ugly.


This would all be very simple though if instead of starting out like:

    class TestCase:
        def __init__(self, methodName='runTest'):
            try:
                self._testMethodName = methodName
                testMethod = getattr(self, methodName)
                self._testMethodDoc = testMethod.__doc__
            except AttributeError:
                raise ValueError, "no such test method in %s: %s" % \
                      (self.__class__, methodName)

unittest.TestCase started out like this:

    class TestCase:
        def __init__(self, methodName='runTest'):
            self.setTestMethod(methodName)

        def setTestMethod(self, methodName):
            try:
                self._testMethodName = methodName
                testMethod = getattr(self, methodName)
                self._testMethodDoc = testMethod.__doc__
            except AttributeError:
                raise ValueError, "no such test method in %s: %s" % \
                      (self.__class__, methodName)


That would allow people to create an instance of their Test class, add a
method to it using setattr, and then use setTestMethod to set the method
to be run.
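
A rough sketch of how that usage could look, written as a subclass so that it runs against today's unittest. The names DynamicTestCase and make_test and the sample functions are hypothetical, and the helper pokes at the private attributes _testMethodName and _testMethodDoc, which is exactly the kind of thing a real setTestMethod in unittest would make unnecessary:

```python
import unittest


class DynamicTestCase(unittest.TestCase):
    def runTest(self):
        pass

    def setTestMethod(self, methodName):
        # Hypothetical helper mirroring the proposal; not part of unittest.
        try:
            testMethod = getattr(self, methodName)
        except AttributeError:
            raise ValueError("no such test method in %s: %s"
                             % (self.__class__, methodName))
        self._testMethodName = methodName
        self._testMethodDoc = testMethod.__doc__


def make_test(func, expected):
    # Build one test instance whose test method is added at run time.
    test = DynamicTestCase()
    name = "dynamic_test_" + func.__name__
    setattr(test, name, lambda: test.assertEqual(func(), expected))
    test.setTestMethod(name)
    return test


def one():
    return 1


def two():
    return 2


suite = unittest.TestSuite()
# The second expectation is wrong on purpose, to show a failure.
for func, expected in [(one, 1), (two, 3)]:
    suite.addTest(make_test(func, expected))

result = unittest.TestResult()
suite.run(result)
print(result.testsRun, len(result.failures), len(result.errors))
```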

A further improvement would be to have _testMethodName be None or left
undefined (and accessed via __getattr__) for as long as possible rather
than being set to runTest (and looked up with getattr) immediately. That
would allow the removal of the do-nothing runTest method in the above. No
old code need be broken as runTest would still be the default. You'd just
have a chance to get in there earlier so it never saw the light of day.

Programmers like to automate things, especially testing. These changes
don't break any existing code but they allow additional test automation.
Of course you _could_ achieve the above by writing out a brand new temp.py
file, running it, and so on, but that's not very Pythonic, is a bunch more
work, needs cleanup (temp.py needs to go away), etc.

I have some further thoughts about how to make this a bit more flexible,
but I'll save those for later, supposing there's any interest in the above.

Terry Jones


From bjourne at gmail.com  Thu Apr 26 15:22:47 2007
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Thu, 26 Apr 2007 06:22:47 -0700
Subject: [Python-ideas] ordered dict
In-Reply-To: <20070421112051.63AB.JCARLSON@uci.edu>
References: <f0d8sl$c0h$1@sea.gmane.org>
	<740c3aec0704210817m3e11a9e5lb3325523e2490348@mail.gmail.com>
	<20070421112051.63AB.JCARLSON@uci.edu>
Message-ID: <740c3aec0704260622m46dfd3aav56025918408e3935@mail.gmail.com>

On 4/21/07, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "BJ?rn Lindqvist" <bjourne at gmail.com> wrote:
> > On 4/21/07, Terry Reedy <tjreedy at udel.edu> wrote:
> > > But it *is* currently a problem for lists that will become much more
> > > extensive in the future, so it *is* currently a problem for sorted dicts
> > > that will be much more of a problem in the future.  Hence, sorted dicts
> > > will have to be restricted to one type or one group of truly comparable
> > > types.
> >
> > Alternatively, you could require a comparator function to be specified
> > at creation time.
>
> You could, but that would imply a total ordering on elements that Python
> itself is removing because it doesn't make any sense.  Including a list
> of 'acceptable' classes as Terry has suggested would work, but would
> generally be superfluous.  The moment a user first added an object to
> the sorted dictionary is the moment the type of objects that can be
> inserted is easily limited (hello Abstract Base Classes PEP!)

Where did the "we are all consenting adults here" mantra go? Java
doesn't imply any total order on elements either, yet it manages to
fit a TreeMap class that does not artificially limit the kind of items
you can put in it. Yes, you can "screw up" by overriding the hashCode
and equals methods of the items you put in it. Java, in this case,
doesn't try to enforce correctness on the language level, instead it
documents the contract the programmer is supposed to follow.
m1.equals(m2) should imply that m1.hashCode() == m2.hashCode().

Python suffers from the same "problem":

class Obj:
    def __eq__(self, o):
        return 0

o1 = Obj()
o2 = Obj()
L = [o1, o2]
assert L.index(o2) == 1

Similar fuck ups are possible when using dicts. In practice this is
not a problem. An ordered dict doesn't need any more safeguards than
Python's already existing data structures. Using the natural order of
its items is just fine and when you need something more fancy,
override the __eq__ method or give the collection's sort method a
comparator function argument.


-- 
mvh Bj?rn


From lists at cheimes.de  Thu Apr 26 17:34:18 2007
From: lists at cheimes.de (Christian Heimes)
Date: Thu, 26 Apr 2007 17:34:18 +0200
Subject: [Python-ideas] The case against static type checking,
	in detail (long)
In-Reply-To: <46303FD0.1020504@acm.org>
References: <46303FD0.1020504@acm.org>
Message-ID: <f0qgpr$pd2$1@sea.gmane.org>

Wow, that's a good posting. Can you put it on a website so I can show it 
to friends when they argue that dynamic typing sucks? :]

Christian



From collinw at gmail.com  Thu Apr 26 17:45:01 2007
From: collinw at gmail.com (Collin Winter)
Date: Thu, 26 Apr 2007 08:45:01 -0700
Subject: [Python-ideas] Minor suggestion for unittest
In-Reply-To: <17967.63424.413682.569882@terry-jones-computer.local>
References: <17967.63424.413682.569882@terry-jones-computer.local>
Message-ID: <43aa6ff70704260845x3f5f6ef0hde320f5513ce7fde@mail.gmail.com>

On 4/25/07, Terry Jones <terry at jon.es> wrote:
>     import unittest
>
>     class Test(unittest.TestCase):
>         def runTest(): pass
>
>     suite = unittest.TestSuite()
>
>     for testFunc, expectedResult in MyTestData:
>         newTestFuncName = 'dynamic-test-' + testFunc.__name__
>         def tester():
>             self.assertEqual(testFunc(), expectedResult)
>         test = Test()
>         setattr(test, newTestFuncName, tester)
>         # Set the class instance up so that it will be the one run.
>         test.__init__(newTestFuncName)  # ugh!
>         suite.addTest(test)
>
>     suite.run()

It sounds like what you're looking for is FunctionTestCase
(http://docs.python.org/lib/unittest-contents.html). Using that, your
loop above becomes something like

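for testFunc, expectedResult in MyTestData:
    # Bind the loop values as defaults so each tester keeps its own pair.
    def tester(testFunc=testFunc, expectedResult=expectedResult):
        assert testFunc() == expectedResult
    suite.addTest(unittest.FunctionTestCase(tester))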
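
For completeness, a self-contained sketch of the same idea that can be run as-is. MyTestData here is made-up sample data, and make_tester is a hypothetical helper that gives each test its own closure rather than sharing the loop variables:

```python
import unittest


def one():
    return 1


def two():
    return 2


# Hypothetical test data; the second expectation is wrong on purpose.
MyTestData = [(one, 1), (two, 3)]


def make_tester(testFunc, expectedResult):
    # A plain function with no 'self'; FunctionTestCase reports a failed
    # assert statement as a test failure.
    def tester():
        assert testFunc() == expectedResult, \
            "%s() != %r" % (testFunc.__name__, expectedResult)
    return tester


suite = unittest.TestSuite()
for testFunc, expectedResult in MyTestData:
    suite.addTest(unittest.FunctionTestCase(make_tester(testFunc,
                                                        expectedResult)))

result = unittest.TestResult()
suite.run(result)
print(result.testsRun, len(result.failures))
```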

Collin Winter


From phd at phd.pp.ru  Thu Apr 26 17:47:24 2007
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Thu, 26 Apr 2007 19:47:24 +0400
Subject: [Python-ideas] The case against static type checking,
	in detail (long)
In-Reply-To: <f0qgpr$pd2$1@sea.gmane.org>
References: <46303FD0.1020504@acm.org> <f0qgpr$pd2$1@sea.gmane.org>
Message-ID: <20070426154724.GE13988@phd.pp.ru>

On Thu, Apr 26, 2007 at 05:34:18PM +0200, Christian Heimes wrote:
> Wow that's a good posting. Can you put it on a website so I can show it 
> to friends when they argue about dynamic typing sucks? :]

   At least it is in the mailing list archive:

http://mail.python.org/pipermail/python-ideas/2007-April/000552.html

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.


From ldlandis at gmail.com  Thu Apr 26 18:00:16 2007
From: ldlandis at gmail.com (LD 'Gus' Landis)
Date: Thu, 26 Apr 2007 11:00:16 -0500
Subject: [Python-ideas] What would you call such an object (was: ordered
	dict)
Message-ID: <a1ddf57e0704260900t729d8992o47a2d51553248113@mail.gmail.com>

Hi,

  I am wondering if in y'alls opinion the following is "just a dictionary" or
  is it a different kind of object that has some of the characteristics of a
  dictionary, and has order.

  If y'all think that it is "just a dictionary", then how does one override the
  notion of a "hash" for the key in a dictionary and make it some other
  ordered structure (e.g. a B-Tree, AVL, etc).  (Please no flame toss to
  some other list -- this is a "use" of an ordered "ordered dict")

  I don't know what such a critter would be called (in Python).  It has the
  name "array" in the language where it is central, but I don't want to go into that.

  The object has the following characteristics:
  - It is indexed by keys which are immutable (like dicts)
  - Each key has a single value (like dicts)
  - The keys are ordered (usually a B-Tree underneath)
  - The keys are "sorted" yielding a hierarchy such that (using Python tuples
    as an example and pseudo Python):
    object = {
     (0,): "value of node",
     (0,"name") : "name of node",
     (0,"name",1): "some data about name",
     (1,): "value of another node",
     (1,2,3): "some data value",
     (2,): 2,
     (2,2,"somekey",1): 32,
     (3,): 28,
     ("abc",1,2): 14
     }
  - Introspection of the object allows walking the keys by hierarchy,
    using the above:
     key = object.order(None)     -> 0
     key = object.order(key)       -> 1
     key = object.order(key)       -> 2
     key = object.order(key)       -> 3
     key = object.order(key)       -> "abc"
     key = object.order(key)       -> None
    The first key is fetched when None is the initial key (or the last
    key if the modifier is -1).  Supplying a modifier of -1 in the call
    (1, the default, is forward; -1 is reverse) traverses the keys in
    the reverse order from that shown above.
  - Introspection of the key results in:
     hasdata = object.data(key)
     =0  no subkeys, no data for 'key' (in the above, (39) would have
         no subkeys and no data)
     =1  no subkeys, has data for 'key' (in the above, (3) has no
         subkeys, but has data)
     =10 has subkeys, no data for 'key' (in the above, (2,2) has
         subkeys but no data)
     =11 has subkeys, has data for 'key' (in the above, (2) has
         subkeys and has data)
  - Introspection of object can yield "depth first" keys
     key = object.query(None)  -> (0,)
     key = object.query(key)    -> (0,"name")
     key = object.query(key)    -> (0,"name",1)
     key = object.query(key)    -> (1,)
     key = object.query(key)    -> (1,2,3)
     key = object.query(key)    -> (2,)
     key = object.query(key)    -> (2,2,"somekey",1)
     key = object.query(key)    -> (3,)
     key = object.query(key)    -> ("abc",1,2)
     key = object.query(key)    -> None
    Like object.order(), object.query() has the same "reverse" (using
    -1) option to walk the keys in a reverse order.
  - Having an iterator over order/query:
    for key in object.ordered([start[, end]]):
    for key in object.queryed([start[, end]]): (spelling?? other alternative)
  - Set/get of
     object[(0,"name")] = "new name of node"
     print object[(0,"name")]
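
To make the characteristics above concrete, here is a rough sketch of such an object using a plain dict plus a sorted key list. Everything in it is an assumption made for illustration: the class name HierDict, the rule that numbers sort before strings at each level, and the use of bisect in place of a real B-Tree. It leaves out the reverse modifier and the iterators.

```python
import bisect


def _rank(key):
    # Assumed collation: at each level, numbers sort before strings.
    return tuple((1, part) if isinstance(part, str) else (0, part)
                 for part in key)


class HierDict:
    """Hypothetical sketch of a dict whose tuple keys are kept sorted."""

    def __init__(self, items=None):
        self._data = {}
        self._ranks = []    # sorted ranks, used only for comparisons
        self._keys = []     # original keys, parallel to self._ranks
        for key, value in (items or {}).items():
            self[key] = value

    def __setitem__(self, key, value):
        if key not in self._data:
            rank = _rank(key)
            i = bisect.bisect_left(self._ranks, rank)
            self._ranks.insert(i, rank)
            self._keys.insert(i, key)
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def query(self, key=None):
        # Next full key after 'key' in depth-first order, or None.
        if key is None:
            i = 0
        else:
            i = bisect.bisect_right(self._ranks, _rank(key))
        return self._keys[i] if i < len(self._keys) else None

    def order(self, key=None):
        # Next distinct first-level subscript after 'key', or None.
        for full in self._keys:
            if key is None or _rank(full[:1]) > _rank((key,)):
                return full[0]
        return None

    def data(self, key):
        # Returns 0, 1, 10 or 11 with the meanings listed above.
        n = len(key)
        has_sub = any(len(k) > n and k[:n] == key for k in self._keys)
        return (10 if has_sub else 0) + (1 if key in self._data else 0)


tree = HierDict({
    (0,): "value of node",
    (0, "name"): "name of node",
    (0, "name", 1): "some data about name",
    (1,): "value of another node",
    (1, 2, 3): "some data value",
    (2,): 2,
    (2, 2, "somekey", 1): 32,
    (3,): 28,
    ("abc", 1, 2): 14,
})
tree[(0, "name")] = "new name of node"

# Walk the first-level subscripts, then every key depth first.
tops = []
key = tree.order(None)
while key is not None:
    tops.append(key)
    key = tree.order(key)

walk = []
key = tree.query(None)
while key is not None:
    walk.append(key)
    key = tree.query(key)

print(tops)
print(walk)
```

Ranking each tuple element by kind before value is what lets 3 and "abc" share one ordering without ever comparing an int to a string directly.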

Cheers,
  --ldl

-- 

LD Landis - N0YRQ - de la tierra del encanto
3960 Schooner Loop, Las Cruces, NM 88012
651/340-4007  N32 21'48.28" W106 46'5.80"
"If a thing is worth doing, it is worth doing badly." - GK Chesterton.

An interpretation: For things worth doing: Doing them, even if badly,
is better than doing nothing perfectly (on them).


From jcarlson at uci.edu  Thu Apr 26 18:17:03 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 26 Apr 2007 09:17:03 -0700
Subject: [Python-ideas] ordered dict
In-Reply-To: <740c3aec0704260622m46dfd3aav56025918408e3935@mail.gmail.com>
References: <20070421112051.63AB.JCARLSON@uci.edu>
	<740c3aec0704260622m46dfd3aav56025918408e3935@mail.gmail.com>
Message-ID: <20070426090200.6427.JCARLSON@uci.edu>


"BJ?rn Lindqvist" <bjourne at gmail.com> wrote:
> On 4/21/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> >
> > "BJ?rn Lindqvist" <bjourne at gmail.com> wrote:
> > > On 4/21/07, Terry Reedy <tjreedy at udel.edu> wrote:
> > > > But it *is* currently a problem for lists that will become much more
> > > > extensive in the future, so it *is* currently a problem for sorted dicts
> > > > that will be much more of a problem in the future.  Hence, sorted dicts
> > > > will have to be restricted to one type or one group of truly comparable
> > > > types.
> > >
> > > Alternatively, you could require a comparator function to be specified
> > > at creation time.
> >
> > You could, but that would imply a total ordering on elements that Python
> > itself is removing because it doesn't make any sense.  Including a list
> > of 'acceptable' classes as Terry has suggested would work, but would
> > generally be superfluous.  The moment a user first added an object to
> > the sorted dictionary is the moment the type of objects that can be
> > inserted is easily limited (hello Abstract Base Classes PEP!)
> 
> Where did the "we are all consenting adults here" mantra go? Java
> doesn't imply any total order on elements either, yet it manages to
> fit a TreeMap class that does not artificially limit the kind of items
> you can put in it. Yes, you can "screw up" by overriding the hashCode
> and equals methods of the items you put in it. Java, in this case,
> doesn't try to enforce correctness on the language level, instead it
> documents the contract the programmer is supposed to follow.
> m1.equals(m2) should imply that m1.hashCode() == m2.hashCode().

At no point has there been discussion over removing the ability for
types which don't have a total ordering to be placed into a dictionary 
(hash table).  a = {1:2, None:6, 'hello':0} will always work.

The only thing that anyone has talked about is the removal of >, >=, <,
<= for types that make no sense to compare.  Like unicode and tuple, or
int and tuple, or int and list, etc.  The removal of a "total ordering"
does not imply that 5 != 'hello' will somehow start failing, it means
that 5 < 'hello' will begin to raise an exception because it doesn't
make any sense.
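
A short sketch of the distinction, as it behaves in Python 3 (variable names are only illustrative): hashing and equality across unrelated types keep working, while ordering comparisons between them raise TypeError.

```python
# Mixed key types still hash and compare for equality just fine.
a = {1: 2, None: 6, 'hello': 0}
not_equal = (5 != 'hello')

# Ordering comparisons between unrelated types raise an exception.
try:
    5 < 'hello'
except TypeError as exc:
    ordering_error = exc

# The same applies to anything built on ordering, such as sorting.
try:
    sorted([3, 'abc', (1, 2)])
except TypeError as exc:
    sort_error = exc

print(a[None], not_equal, type(ordering_error).__name__,
      type(sort_error).__name__)
```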


> Similar fuck ups are possible when using dicts. In practice this is
> not a problem. An ordered dict doesn't need any more safeguards than
> Python's already existing data structures. Using the natural order of
> its items are just fine and when you need something more fancy,
> override the __eq__ method or give the collections sort method a
> comparator function argument.

Except this series of posts is about a "sorted dict", with a key,value
mapping in which the equivalent .items() are sorted() as an ordering
(rather than more or less dependent on hash value as in a standard
dictionary).  But as I, and others have stated before, which you should
read once again because you don't seem to get it:

THE EXISTENCE OF A TOTAL ORDERING ON VALUES IN PYTHON TODAY IS A LIE. 
IN FUTURE PYTHONS WE ARE REMOVING THE LIE BECAUSE IT DOESN'T HELP ANYONE. 
IF YOU DON'T LIKE IT; TOUGH COOKIES.  STANDARD PYTHON DICTIONARIES
WILL WORK THE WAY THEY ALWAYS HAVE.  ONLY PEOPLE WHO BELIEVE THAT
INCOMPATIBLE TYPES SHOULD BE ORDERED IN A PARTICULAR WAY IN THINGS LIKE
lst.sort() WILL BE AFFECTED.

If you want an actual reference, please see PEP 3100 which says,
"Comparisons other than == and != between disparate types will raise an
exception unless explicitly supported by the type"
... and references:
    http://mail.python.org/pipermail/python-dev/2004-June/045111.html

If you don't understand this, please ask again without profanity or
accusing the Python developers of removing the "consenting adults"
requirement.  Python is getting smarter.  Maybe you just don't
understand why this is the case.


 - Josiah



From jcarlson at uci.edu  Thu Apr 26 18:22:27 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 26 Apr 2007 09:22:27 -0700
Subject: [Python-ideas] What would you call such an object (was: ordered
	dict)
In-Reply-To: <a1ddf57e0704260900t729d8992o47a2d51553248113@mail.gmail.com>
References: <a1ddf57e0704260900t729d8992o47a2d51553248113@mail.gmail.com>
Message-ID: <20070426091746.642A.JCARLSON@uci.edu>


"LD 'Gus' Landis" <ldlandis at gmail.com> wrote:
> Hi,
> 
>   I am wondering if in y'alls opinion the following is "just a dictionary" or
>   is it a different kind of object that has some of the characteristics of a
>   dictionary, and has order.
> 
>   If y'all think that it is "just a dictionary", then how does one override the
>   notion of a "hash" for the key in a dictionary and make it some other
>   ordered structure (e.g. a B-Tree, AVL, etc).  (Please no flame toss to
>   some other list -- this is a "use" of an ordered "ordered dict")

This is easily implemented as a variant of a trie, in which rather than
choosing a new sub-node based on different characters in a string, you
choose a new sub-node based on different values in a tuple.  There is
one small problem with the structure as you have described it; in order
to be able to choose a (sorted) ordering on the portion of a key as you
show here...

>      key = object.order(key)       -> 3
>      key = object.order(key)       -> "abc"

...it won't make any sense in future Pythons.  3 < "abc" will raise an
exception.


 - Josiah



From tony at PageDNA.com  Thu Apr 26 21:39:39 2007
From: tony at PageDNA.com (Tony Lownds)
Date: Thu, 26 Apr 2007 12:39:39 -0700
Subject: [Python-ideas] The case against static type checking,
	in detail (long)
In-Reply-To: <46303FD0.1020504@acm.org>
References: <46303FD0.1020504@acm.org>
Message-ID: <0DC60652-60EA-4C25-9EF5-89B767EB94FE@PageDNA.com>


On Apr 25, 2007, at 10:59 PM, Talin wrote:
> However, we find in practice that much of the programmer's effort is
> spent in maintaining this cross-checking structure.

How does this effort compare to writing all of the equivalent type
tests?  Static type checking subsumes the need to write tests to ensure
that for every operation, the inputs are valid types, and the result
type will be a valid type.

> To use a building
> analogy, a statically-typed program is like a truss structure, where
> there's a complex web of cross-braces, and a force applied at any  
> given
> point is spread out over the whole structure. Each time such a program
> is modified, the programmer must partially dismantle and then
> re-assemble the existing structure.

However, the work to ensure the re-assembled structure is completely
valid is shifted from human inspection and possibly incomplete tests to
static analysis.  It's like having a computer check all of the cross
brace connections.  When the modifications are small, dismantle/reassembly
costs can be dominated by the checking costs.

> Yes, static type checking
> does detect some errors; But it also causes errors by making the code
> larger and more wordy, because that the programmer cannot hold large
> portions of the program in their mind all at once, which can lead to
> errors in overall design. It means the programmer spends more time
> thinking about the behavior of individual variables and less about the
> algorithm as a whole.

That's like saying stairs should not have rails because thinking about
where to put your hand gets in the way of thinking about where to put
your feet!

Proposals for static type checking in Python have long included the
concept of optional type checking, where programs without declarations
continue to run.  So clearly the desire not to clutter or force work on
a type-declaration-averse programmer is already taken as a requirement.

-Tony



From adam at atlas.st  Thu Apr 26 21:59:46 2007
From: adam at atlas.st (Adam Atlas)
Date: Thu, 26 Apr 2007 15:59:46 -0400
Subject: [Python-ideas] Python package files
Message-ID: <92B05426-D7F6-47A2-B3E5-344421964B15@atlas.st>

I think it would be useful for Python to accept imports of standalone  
files representing entire packages, maybe with the extension .pyp. A  
package file would basically be a ZIP file, so it would follow fairly  
easily from the current zipimport mechanism... its top-level  
directory would be the contents of a package named by the outer ZIP  
file. In other words, suppose we have a ZIP file called  
"package.pyp", and at its top level, it contains "__init__.py" and  
"blah.py". Anywhere this can be located, it would be equivalent to a  
physical directory called "package" containing those two files. So  
you can simply do "import package" as usual, regardless of whether  
it's a directory or a .pyp.

A while ago I wrote a program called Squisher that does this (it  
takes a ZIP file and turns it into an importable .pyc file), but it's  
a huge hack. The hackishness mainly comes from my desire to not  
require users of Squished packages to install Squisher itself; so  
each module basically has to bootstrap itself, adding its own import  
hook and then adding its own path to sys.path and shuffling around a  
couple of things in sys.modules. All that could be avoided if this  
were a core feature; I expect a straightforward import hook would  
suffice.

As PEP 302 says, "Distributing lots of source or pyc files around is  
not always appropriate, so there is a frequent desire to package all  
needed modules in a single file." It's very useful to be able to  
download a single file, plop it into a directory, and immediately be  
able to import it like any .py or .pyc file. Eggs are nice, but  
having to manually add them to sys.path or install them system-wide  
with setuptools is not always ideal.
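
For contrast, a minimal sketch of what the existing zipimport mechanism needs today: the package directory has to sit inside the archive, and the archive itself has to be added to sys.path by hand. The names bundle.zip and demo_pkg are made up for this example.

```python
import importlib
import os
import sys
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "bundle.zip")

# Today the package directory must appear inside the archive...
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_pkg/__init__.py", "NAME = 'demo_pkg'\n")
    zf.writestr("demo_pkg/blah.py",
                "def hello():\n    return 'hello from blah'\n")

# ...and the archive itself must be put on sys.path explicitly.
sys.path.insert(0, archive)
importlib.invalidate_caches()

import demo_pkg.blah

greeting = demo_pkg.blah.hello()
origin = demo_pkg.__file__
print(greeting, origin)
```

Under the proposal, the two files would instead sit at the top level of a file named after the package, and dropping that file into any directory already on sys.path would be enough.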


From brett at python.org  Thu Apr 26 23:29:09 2007
From: brett at python.org (Brett Cannon)
Date: Thu, 26 Apr 2007 14:29:09 -0700
Subject: [Python-ideas] Python package files
In-Reply-To: <92B05426-D7F6-47A2-B3E5-344421964B15@atlas.st>
References: <92B05426-D7F6-47A2-B3E5-344421964B15@atlas.st>
Message-ID: <bbaeab100704261429l50f8d117ud803e972e9310fd2@mail.gmail.com>

On 4/26/07, Adam Atlas <adam at atlas.st> wrote:
> I think it would be useful for Python to accept imports of standalone
> files representing entire packages, maybe with the extension .pyp. A
> package file would basically be a ZIP file, so it would follow fairly
> easily from the current zipimport mechanism... its top-level
> directory would be the contents of a package named by the outer ZIP
> file. In other words, suppose we have a ZIP file called
> "package.pyp", and at its top level, it contains "__init__.py" and
> "blah.py". Anywhere this can be located, it would be equivalent to a
> physical directory called "package" containing those two files. So
> you can simply do "import package" as usual, regardless of whether
> it's a directory or a .pyp.
>

So basically zipimport, but instead of putting the zip file on
sys.path, the zip file exists in a directory on sys.path and the file
name acts as the top-level package name?  I like the idea; it would be
nice for stuff to just work by dropping it into some common place,
without having to muck with the import settings.

-Brett


From greg.ewing at canterbury.ac.nz  Fri Apr 27 03:32:08 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 27 Apr 2007 13:32:08 +1200
Subject: [Python-ideas] The case against static type checking,
 in detail (long)
In-Reply-To: <0DC60652-60EA-4C25-9EF5-89B767EB94FE@PageDNA.com>
References: <46303FD0.1020504@acm.org>
	<0DC60652-60EA-4C25-9EF5-89B767EB94FE@PageDNA.com>
Message-ID: <46315298.7020803@canterbury.ac.nz>

Tony Lownds wrote:

> That's like saying stairs should not have rails because thinking about
> where to put your hand gets in the way of thinking about where to put
> your feet!

Instead of rails, Python stairs have bouncy cushions
along the sides and at the bottom to catch you gently
if you happen to fall, rather than burden you with
having to hold on every time you use the stairs,
even though on most occasions you don't fall.

Also it provides a lot more escalators.

--
Greg


From guido at python.org  Fri Apr 27 03:47:26 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 26 Apr 2007 18:47:26 -0700
Subject: [Python-ideas] The case against static type checking,
	in detail (long)
In-Reply-To: <46315298.7020803@canterbury.ac.nz>
References: <46303FD0.1020504@acm.org>
	<0DC60652-60EA-4C25-9EF5-89B767EB94FE@PageDNA.com>
	<46315298.7020803@canterbury.ac.nz>
Message-ID: <ca471dc20704261847j6934519rc21b1e70ad2582ae@mail.gmail.com>

On 4/26/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Tony Lownds wrote:
>
> > That's like saying stairs should not have rails because thinking about
> > where to put your hand gets in the way of thinking about where to put
> > your feet!
>
> Instead of rails, Python stairs have bouncy cushions
> along the sides and at the bottom to catch you gently
> if you happen to fall, rather than burden you with
> having to hold on every time you use the stairs,
> even though on most occasions you don't fall.
>
> Also it provides a lot more escalators.

But beware of the rotating knives!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From collinw at gmail.com  Sun Apr 29 06:44:29 2007
From: collinw at gmail.com (Collin Winter)
Date: Sat, 28 Apr 2007 21:44:29 -0700
Subject: [Python-ideas] Minor suggestion for unittest
In-Reply-To: <17971.21160.287099.857675@terry-jones-computer.local>
References: <17967.63424.413682.569882@terry-jones-computer.local>
	<43aa6ff70704260845x3f5f6ef0hde320f5513ce7fde@mail.gmail.com>
	<17971.21160.287099.857675@terry-jones-computer.local>
Message-ID: <43aa6ff70704282144hd6ad9d9q7036253fd9afe211@mail.gmail.com>

On 4/28/07, Terry Jones <terry at jon.es> wrote:
> | It sounds like what you're looking for is FunctionTestCase
> | (http://docs.python.org/lib/unittest-contents.html). Using that, your
> | loop above becomes something like
> |
> | for testFunc, expectedResult in MyTestData:
> |          def tester():
> |              self.assertEqual(testFunc(), expectedResult)
> |          suite.addTest(FunctionTestCase(tester))
>
> I had read about FunctionTestCase but it didn't seem to be what I was
> looking for - though it's the closest. FunctionTestCase is intended to
> allow people to easily bring a set of pre-existing tests under the umbrella
> of unittest. It overrides setUp and tearDown, and doesn't result in the
> test being a first-class test like those you get when you write tests for
> unittest from scratch (using TestCase directly, or something you write
> based on it).

I'm not sure why you think the tests produced by FunctionTestCase are
somehow second-class citizens: unittest treats all test objects
equally, so long as they conform to the expected API. If the objection
to using FunctionTestCase is that the test names don't conform to the
same pattern as the statically-defined tests, that's easily solved
with a subclass.

> I want to dynamically (i.e. at run time) add functions that are treated
> equally with those that are added statically in python code. That could be
> really simple (and I can hack around it to achieve it), but the current
> method unittest uses to set its self._testMethodName prevents me from doing
> this in a nice way (because TestCase.__init__ immediately does a hasattr to
> look for the named method, and fails if it's absent).

If FunctionTestCase is undesirable, you might look at creating your
own TestSuite subclass. The API is pretty simple, and that would give
you all the control in the world over pulling in tests dynamically.

Hope that helps,
Collin Winter


From terry at jon.es  Sat Apr 28 15:56:56 2007
From: terry at jon.es (Terry Jones)
Date: Sat, 28 Apr 2007 15:56:56 +0200
Subject: [Python-ideas] Minor suggestion for unittest
In-Reply-To: Your message at 08:45:01 on Thursday, 26 April 2007
References: <17967.63424.413682.569882@terry-jones-computer.local>
	<43aa6ff70704260845x3f5f6ef0hde320f5513ce7fde@mail.gmail.com>
Message-ID: <17971.21160.287099.857675@terry-jones-computer.local>

Hi Collin

Thanks for the reply.

| It sounds like what you're looking for is FunctionTestCase
| (http://docs.python.org/lib/unittest-contents.html). Using that, your
| loop above becomes something like
| 
| for testFunc, expectedResult in MyTestData:
|          def tester():
|              self.assertEqual(testFunc(), expectedResult)
|          suite.addTest(FunctionTestCase(tester))

I had read about FunctionTestCase but it didn't seem to be what I was
looking for - though it's the closest. FunctionTestCase is intended to
allow people to easily bring a set of pre-existing tests under the umbrella
of unittest. It overrides setUp and tearDown, and doesn't result in the
test being a first-class test like those you get when you write tests for
unittest from scratch (using TestCase directly, or something you write
based on it).

I want to dynamically (i.e. at run time) add functions that are treated
equally with those that are added statically in python code. That could be
really simple (and I can hack around it to achieve it), but the current
method unittest uses to set its self._testMethodName prevents me from doing
this in a nice way (because TestCase.__init__ immediately does a hasattr to
look for the named method, and fails if it's absent).
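
One workaround, sketched here under assumptions (the class DataTests, the
helper _make_method and the test data are hypothetical): attach the
generated functions to the TestCase class itself before any instance is
created. By the time TestCase.__init__ looks up the method name, the
attribute already exists, so the lookup succeeds and the loader treats the
generated methods like statically written ones.

```python
import unittest

# Hypothetical test data: (callable, expected result) pairs.
MY_TEST_DATA = [
    (lambda: 1 + 1, 2),
    (lambda: "a" * 3, "aaa"),
]


class DataTests(unittest.TestCase):
    """Test methods are attached below, at run time."""


def _make_method(test_func, expected):
    # A factory function binds this pair for the generated method.
    def method(self):
        self.assertEqual(test_func(), expected)
    return method


for index, (test_func, expected) in enumerate(MY_TEST_DATA):
    setattr(DataTests, "test_data_%d" % index, _make_method(test_func, expected))

# The loader discovers the generated methods by their "test" prefix.
suite = unittest.TestLoader().loadTestsFromTestCase(DataTests)
result = unittest.TestResult()
suite.run(result)
print(result.testsRun, result.wasSuccessful())
```

This only helps when the set of tests is known before the TestCase
instances are built; it does not change how TestCase.__init__ behaves.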

I wonder if I'm being clear... it's pretty simple, but my explanation may
not be so good.

Regards,
Terry


From rrr at ronadam.com  Mon Apr 30 13:09:08 2007
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 30 Apr 2007 06:09:08 -0500
Subject: [Python-ideas] Python package files
In-Reply-To: <bbaeab100704261429l50f8d117ud803e972e9310fd2@mail.gmail.com>
References: <92B05426-D7F6-47A2-B3E5-344421964B15@atlas.st>
	<bbaeab100704261429l50f8d117ud803e972e9310fd2@mail.gmail.com>
Message-ID: <4635CE54.7070403@ronadam.com>

Brett Cannon wrote:
> On 4/26/07, Adam Atlas <adam at atlas.st> wrote:
>> I think it would be useful for Python to accept imports of standalone
>> files representing entire packages, maybe with the extension .pyp. A
>> package file would basically be a ZIP file, so it would follow fairly
>> easily from the current zipimport mechanism... its top-level
>> directory would be the contents of a package named by the outer ZIP
>> file. In other words, suppose we have a ZIP file called
>> "package.pyp", and at its top level, it contains "__init__.py" and
>> "blah.py". Anywhere this can be located, it would be equivalent to a
>> physical directory called "package" containing those two files. So
>> you can simply do "import package" as usual, regardless of whether
>> it's a directory or a .pyp.
>>
> 
> So basically zipimport, but instead of putting the zip file on
> sys.path, the zip file exists in a directory on sys.path and the file
> name acts as the top-level package name?  I like the idea: making
> stuff just work by dropping it into some common place, without
> having to muck with the import settings, would be nice.

I like that too.  + 1

I really dislike scattering a project's files around.

And conversely, I really dislike combining files from different sources.

Ron






From lists at cheimes.de  Mon Apr 30 16:22:31 2007
From: lists at cheimes.de (Christian Heimes)
Date: Mon, 30 Apr 2007 16:22:31 +0200
Subject: [Python-ideas] Python package files
In-Reply-To: <92B05426-D7F6-47A2-B3E5-344421964B15@atlas.st>
References: <92B05426-D7F6-47A2-B3E5-344421964B15@atlas.st>
Message-ID: <f14u38$r20$1@sea.gmane.org>

Adam Atlas wrote:
> I think it would be useful for Python to accept imports of standalone  
> files representing entire packages, maybe with the extension .pyp. A  
> package file would basically be a ZIP file, so it would follow fairly  
> easily from the current zipimport mechanism... its top-level  
> directory would be the contents of a package named by the outer ZIP  
> file. In other words, suppose we have a ZIP file called  
> "package.pyp", and at its top level, it contains "__init__.py" and  
> "blah.py". Anywhere this can be located, it would be equivalent to a  
> physical directory called "package" containing those two files. So  
> you can simply do "import package" as usual, regardless of whether  
> it's a directory or a .pyp.

What are the benefits of your proposal over the already established 
Python eggs? As far as I understand your proposal it's not much 
different to eggs. In fact eggs + setuptools support more features like 
dependencies, multiversion installation and many more.

Christian



From adam at atlas.st  Mon Apr 30 17:26:05 2007
From: adam at atlas.st (Adam Atlas)
Date: Mon, 30 Apr 2007 11:26:05 -0400
Subject: [Python-ideas] Python package files
In-Reply-To: <f14u38$r20$1@sea.gmane.org>
References: <92B05426-D7F6-47A2-B3E5-344421964B15@atlas.st>
	<f14u38$r20$1@sea.gmane.org>
Message-ID: <9C20E3C1-F11E-4313-987E-E2842E75731B@atlas.st>


On 30 Apr 2007, at 10.22, Christian Heimes wrote:
> Adam Atlas wrote:
>> I think it would be useful for Python to accept imports of standalone
>> files representing entire packages, maybe with the extension .pyp. A
>> package file would basically be a ZIP file, so it would follow fairly
>> easily from the current zipimport mechanism... its top-level
>> directory would be the contents of a package named by the outer ZIP
>> file. In other words, suppose we have a ZIP file called
>> "package.pyp", and at its top level, it contains "__init__.py" and
>> "blah.py". Anywhere this can be located, it would be equivalent to a
>> physical directory called "package" containing those two files. So
>> you can simply do "import package" as usual, regardless of whether
>> it's a directory or a .pyp.
>
> What are the benefits of your proposal over the already established
> Python eggs? As far as I understand your proposal it's not much
> different to eggs. In fact eggs + setuptools support more features
> like dependencies, multiversion installation and many more.

Python eggs use zipimport, which allows them to be elements of
sys.path. Then, modules inside them can be imported as usual. My  
proposal is to make .pyp ZIP files importable themselves. You import  
a .pyp just like a package directory, instead of having to add an egg  
to sys.path and then import modules contained in it. It's convenient.
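
To make the idea concrete, here is a rough sketch of how it could be
prototyped with an import hook. It is not an existing Python feature:
PypFinder and the demo package name pypdemo are hypothetical, it is written
against the importlib API of current Python 3, and it only handles
"__init__.py" plus flat modules at the top level of the archive (no nested
subpackages, no .pyc files).

```python
import importlib.abc
import importlib.util
import os
import sys
import tempfile
import zipfile


class PypFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical hook: <dir on sys.path>/<name>.pyp acts as package <name>."""

    def find_spec(self, fullname, path=None, target=None):
        top, _, rest = fullname.partition(".")
        for entry in sys.path:
            archive = os.path.join(entry or os.curdir, top + ".pyp")
            if not zipfile.is_zipfile(archive):
                continue
            # The archive's top level holds the package contents directly.
            member = rest + ".py" if rest else "__init__.py"
            with zipfile.ZipFile(archive) as zf:
                if member not in zf.namelist():
                    return None
            spec = importlib.util.spec_from_loader(
                fullname, self,
                origin=os.path.join(archive, member),
                is_package=not rest,
            )
            spec.loader_state = (archive, member)
            return spec
        return None

    def exec_module(self, module):
        archive, member = module.__spec__.loader_state
        with zipfile.ZipFile(archive) as zf:
            source = zf.read(member)
        exec(compile(source, module.__spec__.origin, "exec"), module.__dict__)


# Demo: drop a .pyp into a directory that is on sys.path, then import it.
workdir = tempfile.mkdtemp()
with zipfile.ZipFile(os.path.join(workdir, "pypdemo.pyp"), "w") as zf:
    zf.writestr("__init__.py", "NAME = 'pypdemo'\n")
    zf.writestr("blah.py", "VALUE = 42\n")

sys.path.insert(0, workdir)
sys.meta_path.append(PypFinder())  # consulted only after the normal finders

import pypdemo
import pypdemo.blah

print(pypdemo.NAME, pypdemo.blah.VALUE)
```

Appending the finder to sys.meta_path means an ordinary directory or .py
file with the same name still wins, which matches the "regardless of
whether it's a directory or a .pyp" behaviour only loosely; a real design
would have to settle that precedence.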

It is true that eggs do have many benefits for production use, but  
often while developing something, or using a package that you don't  
expect to use outside one project, or just trying out a package that  
you're not sure you'll use, it's simpler to be able to just drop a  
file into your project directory instead of having to `sudo  
easy_install` it system-wide. Zero-installation is nice.

Though since setuptools is set to be included in Python 2.6 (right?),  
maybe it could take advantage of those benefits -- perhaps .pyps  
could optionally include an EGG-INFO directory, and there could be a  
simple tool to transform those .pyps into eggs and vice versa. That  
way you can use whichever way is the most practical at the time, but  
be able to easily switch to the other if need be.


From fdrake at acm.org  Mon Apr 30 20:10:14 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 30 Apr 2007 14:10:14 -0400
Subject: [Python-ideas] Python package files
In-Reply-To: <9C20E3C1-F11E-4313-987E-E2842E75731B@atlas.st>
References: <92B05426-D7F6-47A2-B3E5-344421964B15@atlas.st>
	<f14u38$r20$1@sea.gmane.org>
	<9C20E3C1-F11E-4313-987E-E2842E75731B@atlas.st>
Message-ID: <200704301410.15350.fdrake@acm.org>

On Monday 30 April 2007, Adam Atlas wrote:
 > It is true that eggs do have many benefits for production use, but
 > often while developing something, or using a package that you don't
 > expect to use outside one project, or just trying out a package that
 > you're not sure you'll use, it's simpler to be able to just drop a
 > file into your project directory instead of having to `sudo
 > easy_install` it system-wide. Zero-installation is nice.

-1 on adding yet-another-ZIP-thing.

Python eggs aren't always convenient, but they're easy enough to work with, 
and good tools to work with egg-based installations are appearing.  Having 
another way to do this, especially something that will be turned into eggs 
for deployment, seems like a distraction.  Differences between development 
environments and production environments lead to bugs, not ease-of-use.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>